BST227: Introduction to Statistical Genetics

BST227: Introduction to Statistical Genetics Lecture 11: Heritability from summary statistics & epigenetic enrichments Guest Lecturer: Caleb Lareau

Success of GWAS EBI Human GWAS Catalog

As of this morning EBI Human GWAS Catalog

Questions of the Post-GWAS Era Can we identify other traits that share a similar genetic basis for the specific phenotype? Can we identify the cell types most important for disease (e.g. schizophrenia) and other traits (e.g. height) where variants are acting?

Tackling the big post-GWAS questions Khan Academy; NIH Roadmap Website

Overview. Part I: Omnigenic Model; Part II: LD Score Regression; <break>; Part III: Epigenetic enrichment of GWAS; Part IV: Improving precision of epigenetic enrichments (~ 1 hour, ~ 20 minutes)

Part I: Omnigenic Model

Questions: How many genes are important in a Mendelian disease (e.g. Sickle-Cell Disease)? How many genes are important in a non-Mendelian disease (e.g. schizophrenia)? How many genes are important in height?

Inflated summary statistics PGC 2014 Nature

Remove all green regions (+/- 1 Mb) PGC 2014 Nature

After removing all GWAS-hits PGC 2014 Nature

Omnigenic model Boyle et al. 2017 Cell

Contrasting Models: Polygenic vs. Omnigenic. Low et al 2010 PLoS One; PGC 2014 Nature

Question: If the omnigenic model is true, which chromosome should have the most heritability?

Omnigenic model validation Shi et al., 2016 AJHG

Part II: LD Score Regression

LD Score Regression can: 1. Accurately distinguish polygenicity from confounding 2. Estimate heritability from summary statistics 3. Identify traits that share a genetic basis. You need to discuss all of these in your project, so ask questions!

Omnigenic association vs. confounding. [Figure, simulated data: panels labeled by inflation and by confounding (No / Yes).] Bulik-Sullivan 2015 Nature Genetics

Definitions A standard model for GWAS is: (recall: need standardization) Heritability can be defined: Heritability of a category C is: Finucane 2014 AJHG
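The equations on this slide were images and did not survive transcription. A hedged reconstruction follows, assuming the standard setup used in the LD score regression literature, with phenotype and genotypes standardized to mean 0 and variance 1. The symbols $y_i$, $X_{ij}$, $\beta_j$, and $\epsilon_i$ are assumed notation, not taken from the slide.

```latex
% GWAS model for individual i over SNPs j (standardized y and X)
y_i = \sum_{j} X_{ij}\,\beta_j + \epsilon_i

% Heritability: total variance explained by all SNPs
h^2 = \sum_{j} \beta_j^{2}

% Heritability of a category C: the same sum restricted to SNPs in C
h^2(C) = \sum_{j \in C} \beta_j^{2}
```

Standardization is what lets each $\beta_j^2$ be read directly as a share of phenotypic variance.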

Polygenicity Polygenicity causes more chi-square statistic inflation in high LD regions than in low LD regions Finucane 2014 AJHG

Toy Illustration of the Genome Bulik-Sullivan 2015

Simulating a polygenic trait Bulik-Sullivan 2015

High-level overview 1. Separate the genome into bins 2. Compute the mean chi-squared statistic per bin 3. Compute the mean LD score per bin 4. Perform a regression of 2 & 3
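The four steps above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the `ldsc` software: it assumes the bins are formed by sorting SNPs on LD score, uses unweighted least squares, and the function name `binned_ld_score_regression` and the toy data are hypothetical.

```python
import numpy as np

def binned_ld_score_regression(ld_scores, chi2, n_bins=50):
    """Regress mean chi-squared on mean LD score across LD-score bins."""
    ld_scores = np.asarray(ld_scores, dtype=float)
    chi2 = np.asarray(chi2, dtype=float)
    # Step 1: sort SNPs by LD score and split them into roughly equal bins
    # (assumption: bins are defined by LD score, one dot per bin).
    order = np.argsort(ld_scores)
    bins = np.array_split(order, n_bins)
    # Step 2: mean chi-squared statistic per bin.
    mean_chi2 = np.array([chi2[b].mean() for b in bins])
    # Step 3: mean LD score per bin.
    mean_ld = np.array([ld_scores[b].mean() for b in bins])
    # Step 4: ordinary least-squares line through the bin means.
    slope, intercept = np.polyfit(mean_ld, mean_chi2, deg=1)
    return slope, intercept, mean_ld, mean_chi2

# Noiseless toy data (hypothetical): chi2 rises linearly with LD score.
rng = np.random.default_rng(0)
toy_ld = rng.uniform(1, 200, size=5000)
toy_chi2 = 1.0 + 0.02 * toy_ld
slope, intercept, _, _ = binned_ld_score_regression(toy_ld, toy_chi2)
print(slope, intercept)
```

On real summary statistics the per-bin means are noisy, and production tools regress on individual SNPs with weights; the binned version is mainly useful for plotting and intuition.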

LD Bins

LD Score. Let C be the bin of the genome of interest. Defined on this slide: the LD Score for SNP j and the χ² statistic for SNP j (copy on board). Traylor et al. 2014 PLoS Genetics
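The two formulas on this slide are missing from the transcript. A hedged reconstruction, assuming the usual definitions from the LD score regression literature ($r_{jk}$, $N$, and $\hat{\beta}_j$ are assumed notation, not taken from the slide):

```latex
% LD score of SNP j with respect to bin C:
% sum of squared correlations between SNP j and every SNP k in C
\ell(j, C) = \sum_{k \in C} r_{jk}^{2}

% Chi-squared association statistic of SNP j, for sample size N and
% marginal effect estimate \hat{\beta}_j on standardized data
\chi^2_j = N\,\hat{\beta}_j^{2}
```

Taking $C$ to be all SNPs gives the ordinary (unstratified) LD score $\ell_j$.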

LD Score Regression: each bin is a dot; the intercept is important. Bulik-Sullivan et al 2015 Nature Genetics

pause, review last slides if needed

Confounding (Population Stratification) Bulik-Sullivan et al 2015 Nature Genetics

No Confounding (Omnigenic) Bulik-Sullivan et al 2015 Nature Genetics

Intercept matters

Real GWAS PGC 2014 Nature

Bulik-Sullivan et al 2015 Nature Genetics

LD Score Regression can: 1. Accurately distinguish polygenicity from confounding 2. Estimate heritability from summary statistics 3. Identify traits that share a genetic basis

LD Score Regression Slope: the slope is proportional to the heritability (write on the board)
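The board derivation is not in the transcript. As a hedged reminder, the regression equation from Bulik-Sullivan et al. 2015 can be written as below, where $N$ is the GWAS sample size, $M$ the number of SNPs, $\ell_j$ the LD score of SNP $j$, and $a$ the contribution of confounding (symbols assumed here, not taken from the slide):

```latex
% Expected chi-squared statistic of SNP j given its LD score
E\left[\chi^2_j \mid \ell_j\right] = \frac{N h^2}{M}\,\ell_j + N a + 1

% The slope therefore identifies the heritability
\text{slope} = \frac{N h^2}{M}
\quad\Longrightarrow\quad
h^2 = \text{slope} \times \frac{M}{N}
```

Confounding enters only through the intercept $N a + 1$, which is why the intercept separates confounding from polygenicity.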

Recall Lecture 9: the heritability estimation covered there requires genotypes!

Key point: LD Score regression can compute heritability using summary statistics Why might this be important?

From LD Hub ldsc.broadinstitute.org

LD Score Regression can: 1. Accurately distinguish polygenicity from confounding 2. Estimate heritability from summary statistics 3. Identify traits that share a genetic basis

Pleiotropy Pleiotropy := the production by a single gene (or genes!) of two or more apparently unrelated phenotypes or traits.

Single Trait Bulik-Sullivan 2015

Two Traits Bulik-Sullivan 2015

Pleiotropy using LD Score: $Z_{1j}$ and $Z_{2j}$ are the z statistics of a single SNP j for two different traits. Bulik-Sullivan et al., 2015 Nature Genetics
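The equation that accompanies this slide is missing. A hedged reconstruction of the cross-trait regression from Bulik-Sullivan et al. 2015 follows, with assumed notation: $N_1$, $N_2$ are the two study sample sizes, $N_s$ the number of overlapping samples, $\rho_g$ the genetic covariance, $\rho$ the phenotypic correlation among overlapping samples, $M$ the number of SNPs, and $\ell_j$ the LD score.

```latex
% Expected product of z statistics for SNP j across two traits
E\left[Z_{1j} Z_{2j}\right]
  = \frac{\sqrt{N_1 N_2}\,\rho_g}{M}\,\ell_j
  + \frac{\rho\, N_s}{\sqrt{N_1 N_2}}

% Genetic correlation: genetic covariance scaled by both heritabilities
r_g = \frac{\rho_g}{\sqrt{h_1^2\, h_2^2}}
```

As in the single-trait case, the slope carries the genetic signal, and sample overlap affects only the intercept.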

Genetic Correlations. Panels: Cor ≈ 0; Cor ≈ 0.5. Bulik-Sullivan 2015

Many traits share a genetic basis! Bulik-Sullivan et al., 2015 Nature Genetics

LD Score isn't alone Bulik-Sullivan et al 2015 Nature Genetics

<break>

Part III: Epigenetic enrichment of GWAS

Epigenetics ENCODE Project Consortium 2012 Nature

What makes cells so different? NIH Roadmap Website

Epigenetic plots Buenrostro et al 2013 Nature Methods

Meyer and Liu 2014 Nature Reviews Genetics

Roadmap Project Roadmap Consortium 2015 Nature

Finding causal tissues for GWAS. Intersecting GWAS with epigenetic annotations can find causal variants. Intersecting GWAS with epigenetics can also find important tissues.

Finding important tissue ENCODE Consortium 2012 Nature

Where is schizophrenia risk important? Boyle et al. 2017 Cell

Stratified LD Score Regression Regular LD Score Regression: Stratified LD Score Regression (sldsc): Finucane 2014 AJHG
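Both equations on this slide were lost in transcription. A hedged reconstruction following Finucane et al. 2015, with assumed notation: $N$ is the sample size, $M$ the number of SNPs, $a$ the confounding term, $\ell_j$ the LD score, $\ell(j, C)$ the LD score of SNP $j$ with respect to category $C$, and $\tau_C$ the per-SNP heritability contribution of category $C$.

```latex
% Regular LD score regression: one slope for the whole genome
E\left[\chi^2_j\right] = \frac{N h^2}{M}\,\ell_j + N a + 1

% Stratified LD score regression: one coefficient per annotation category
E\left[\chi^2_j\right] = N \sum_{C} \tau_C\,\ell(j, C) + N a + 1
```

The stratified form becomes a multiple regression of $\chi^2_j$ on the category-specific LD scores, so categories with large $\tau_C$ are those enriched for heritability.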

Stratifying the genome ENCODE Consortium 2012 Nature

Where is heritability localized? Finucane et al 2015 Nature Genetics

What cell types are important? Finucane et al 2015 Nature Genetics

LD Score Regression can: 1. Accurately distinguish polygenicity from confounding 2. Estimate heritability from summary statistics 3. Identify traits that share a genetic basis 4. Identify cell types important for traits. You need to discuss all of these in your project, so ask questions!

Part IV: Improving precision of epigenetic enrichments

In collaboration with: Jacob Ulirsch (Harvard BBS Program); Martin Aryee, PhD (Massachusetts General Hospital); Erik Bao (Harvard Medical School); Jason Buenrostro, PhD (Broad Institute); Vijay Sankaran, MD, PhD (Boston Children's Hospital)

LD Score Regression gets us in the right zip code Finucane et al 2017 Nature Genetics

Accessibility peaks are not the same!

Main Question: Can we develop a methodology that accurately identifies the causal tissue for GWAS traits? Can we apply this approach to single cells?

Human Hematopoiesis

New method: gchromVAR 1. Use quantitative genetic information about the core gene associations 2. Use quantitative epigenetic information about chromatin locations
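One way to picture how the two quantitative ingredients combine is a weighted accessibility score compared against a permutation null. The sketch below is an illustrative assumption only and is not the gchromVAR implementation: the per-peak genetic weights, the depth normalization, the permutation null, the function name `weighted_enrichment`, and the toy data are all hypothetical.

```python
import numpy as np

def weighted_enrichment(weights, counts, n_perm=1000, seed=0):
    """Z-score of genetically weighted accessibility per cell type.

    weights: (n_peaks,) quantitative genetic weight per peak (hypothetical).
    counts:  (n_peaks, n_cell_types) accessibility counts per peak.
    """
    weights = np.asarray(weights, dtype=float)
    counts = np.asarray(counts, dtype=float)
    # Convert counts to per-cell-type proportions so depth does not dominate.
    props = counts / counts.sum(axis=0, keepdims=True)
    # Observed score: genetic weight times accessibility, summed over peaks.
    observed = weights @ props
    # Null scores: shuffle which peaks carry the genetic weights.
    rng = np.random.default_rng(seed)
    null = np.array([rng.permutation(weights) @ props for _ in range(n_perm)])
    return (observed - null.mean(axis=0)) / null.std(axis=0)

# Toy data (hypothetical): 100 peaks, 2 cell types; the first 10 peaks carry
# the genetic signal and are highly accessible only in cell type 0.
rng = np.random.default_rng(1)
n_peaks = 100
toy_weights = np.zeros(n_peaks)
toy_weights[:10] = 1.0
toy_counts = rng.poisson(5, size=(n_peaks, 2)).astype(float)
toy_counts[:10, 0] = 50.0
z = weighted_enrichment(toy_weights, toy_counts)
print(z)
```

In this toy setup the first cell type should score far higher than the second, because its accessible peaks coincide with the weighted peaks. A real analysis would also need to control for technical covariates, which this sketch ignores.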

Human hematopoietic traits are heritable (h²)

sldsc vs. gchromVAR: reticulocyte count (-log10 p-value)

New method: gchromVAR 1. Use quantitative genetic information about the core gene associations 2. Use quantitative epigenetic information about chromatin locations

gchromVAR Results

Can we apply gchromVAR to single cells?

Single Cell ATAC ~2,200 cells assayed

scATAC + gchromVAR

Pseudotime

Platelet count single cell GWAS Enrichment

Ongoing efforts: pinpoint the precise cell types and stages of development where GWAS seems to matter most for a trait. Our approach, gchromVAR, is more sensitive at distinguishing enrichments in closely related cell types.

More information: EPI511, offered Spring of 2019. Supplemental reading is on the course webpage. Homework 5 and final projects will require running and interpreting LD Score Regression.

Thanks! caleblareau@g.harvard.edu