QTL Studies- Past, Present and Future. David Evans

Similar documents
Thinking big: Finding more and (more) genes influencing glycaemic and anthropometric traits. Mark McCarthy, Oxford

Introduction to linkage and family based designs to study the genetic epidemiology of complex traits. Harold Snieder

Introduction to the Genetics of Complex Disease

CS2220 Introduction to Computational Biology

Introduction to Genetics and Genomics

Mining the Human Phenome Using Allelic Scores That Index Biological Intermediates

GWAS of HCC Proposed Statistical Approach Mendelian Randomization and Mediation Analysis. Chris Amos Manal Hassan Lewis Roberts Donghui Li

Quantitative genetics: traits controlled by alleles at many loci

Over the past year, the capacity to perform

Genetics and Pharmacogenetics in Human Complex Disorders (Example of Bipolar Disorder)

Effects of environment and genetic interactions on chronic metabolic diseases

An Introduction to Quantitative Genetics I. Heather A Lawson Advanced Genetics Spring2018

MULTIFACTORIAL DISEASES. MG L-10 July 7 th 2014

Genes, Genes, Everywhere! The key to a cure for IBD

BST227 Introduction to Statistical Genetics. Lecture 4: Introduction to linkage and association analysis

Large-scale identity-by-descent mapping discovers rare haplotypes of large effect. Suyash Shringarpure 23andMe, Inc. ASHG 2017

Genetics of Pediatric Inflammatory Bowel Disease

Dan Koller, Ph.D. Medical and Molecular Genetics

Results. Introduction

During the hyperinsulinemic-euglycemic clamp [1], a priming dose of human insulin (Novolin,

A genome-wide association study identifies vitiligo

Genetics of common disorders with complex inheritance Bettina Blaumeiser MD PhD

Mendelian Randomization

FTO Polymorphisms Are Associated with Obesity But Not with Diabetes in East Asian Populations: A Meta analysis

Overview of the Synthetic Derivative

Additional Disclosure

Genetics and Genomics in Medicine Chapter 8 Questions

Our Stage 1 genotype scan was performed using Illumina Human1 Beadarrays, which have a

Topic (Final-03): Immunologic Tolerance and Autoimmunity-Part II

Imaging Genetics: Heritability, Linkage & Association

An expanded view of complex traits: from polygenic to omnigenic

BST227: Introduction to Statistical Genetics

White Paper Guidelines on Vetting Genetic Associations

Genetic Meta-Analysis and Mendelian Randomization

Special Report. Genome-wide association studies and musculoskeletal diseases. Future Rheumatology

Does prenatal alcohol exposure affect neurodevelopment? Attempts to give causal answers

Editorial Type 2 Diabetes and More Gene Panel: A Predictive Genomics Approach for a Polygenic Disease

Heritability and genetic correlations explained by common SNPs for MetS traits. Shashaank Vattikuti, Juen Guo and Carson Chow LBM/NIDDK

Genome-wide association studies (case/control and family-based) Heather J. Cordell, Institute of Genetic Medicine Newcastle University, UK

Cordoba 01/02/2008. Slides Professor Pierre LEFEBVRE

ARTICLE Genetic Risk Factors for Type 2 Diabetes: A Trans-Regulatory Genetic Architecture?

Transferability of Type 2 Diabetes Implicated Loci in Multi-Ethnic Cohorts from Southeast Asia

Heritability enrichment of differentially expressed genes. Hilary Finucane PGC Statistical Analysis Call January 26, 2016

Overlap of disease susceptibility loci for rheumatoid arthritis and juvenile idiopathic arthritis

The genetic architecture of type 2 diabetes appears

MRC Integrative Epidemiology Unit (IEU) at the University of Bristol. George Davey Smith

Genetic association analysis incorporating intermediate phenotypes information for complex diseases

Genome-wide Association Analysis Applied to Asthma-Susceptibility Gene. McCaw, Z., Wu, W., Hsiao, S., McKhann, A., Tracy, S.

IS IT GENETIC? How do genes, environment and chance interact to specify a complex trait such as intelligence?

Multifactorial Inheritance. Prof. Dr. Nedime Serakinci

Multifactorial Inheritance

Summary & general discussion

Supplemental Table 1 Age and gender-specific cut-points used for MHO.

Genetics and the Path Towards Targeted Therapies in Systemic Lupus

On Missing Data and Genotyping Errors in Association Studies

DOES THE BRCAX GENE EXIST? FUTURE OUTLOOK

Supplementary Online Content

FTO gene variants are strongly associated with type 2 diabetes in South Asian Indians

SNPrints: Defining SNP signatures for prediction of onset in complex diseases

The genetics of complex traits Amazing progress (much by ppl in this room)

Genome- Wide Association Studies of Human Growth Traits

Example HLA-B and abacavir. Roujeau 2014

Tutorial on Genome-Wide Association Studies

For more information about how to cite these materials visit

Missing Heritablility How to Analyze Your Own Genome Fall 2013

Genomic approach for drug target discovery and validation

New Enhancements: GWAS Workflows with SVS

26 th International Workshop on Methodology for Human Genomic Studies: the Advanced course

Common variants in WFS1 confer risk of type 2 diabetes

The Genetic Epidemiology of Rheumatoid Arthritis. Lindsey A. Criswell AURA meeting, 2016

SUPPLEMENTARY DATA. 1. Characteristics of individual studies

Components of heritability in an Icelandic cohort

Metabolomics for Characterizing the Human Exposome: The need for a unified and high-throughput way to ascertain environmental exposures

GENOME-WIDE ASSOCIATION STUDIES

Nature Genetics: doi: /ng Supplementary Figure 1

Lecture 20. Disease Genetics

Introduction of Genome wide Complex Trait Analysis (GCTA) Presenter: Yue Ming Chen Location: Stat Gen Workshop Date: 6/7/2013

Supplementary Online Content

Nature Genetics: doi: /ng Supplementary Figure 1

Supplementary Figure 1: Attenuation of association signals after conditioning for the lead SNP. a) attenuation of association signal at the 9p22.

Replication of 13 genome-wide association (GWA)-validated risk variants for type 2 diabetes in Pakistani populations

Supplementary Online Content

Mendelian & Complex Traits. Quantitative Imaging Genomics. Genetics Terminology 2. Genetics Terminology 1. Human Genome. Genetics Terminology 3

Assessing Accuracy of Genotype Imputation in American Indians

Human population sub-structure and genetic association studies

Genetic predisposition to obesity leads to increased risk of type 2 diabetes

Role of Genomics in Selection of Beef Cattle for Healthfulness Characteristics

The Risk of Anti-selection in Protection Business from Advances in Statistical Genetics

Genetics of COPD Prof. Ian P Hall

EXAMINATION OF THE CONTRIBUTION OF DIFFERENCES IN DISEASE ARCHITECTURE AND EPISTASIS TO TYPE 2 DIABETES RISK IN AFRICAN AMERICANS JACOB MILES KEATON

Use and Interpreta,on of LD Score Regression. Brendan Bulik- Sullivan PGC Stat Analysis Call

Supplementary Appendix

Best (but oft-forgotten) practices: the design, analysis, and interpretation of Mendelian randomization studies 1

MOLECULAR EPIDEMIOLOGY Afiono Agung Prasetyo Faculty of Medicine Sebelas Maret University Indonesia

What Do We Know About Individual Variability and Its

Division of Public Health Sciences, Department of Biostatistics, Wake Forest University Health Sciences, Winston-Salem, NC, USA; 2

Genome-wide association study identifies variants in TMPRSS6 associated with hemoglobin levels.

Nutritional Epidemiology in the Genomic Age. Saroja Voruganti, PhD Assistant Professor Department of Nutrition and UNC Nutrition Research Institute

Complex Multifactorial Genetic Diseases

Transcription:

QTL Studies Past, Present and Future David Evans

Genetic studies of complex diseases have not met anticipated success Glazier et al, Science (2002) 298:23452349

Korstanje & Pagan (2002) Nat Genet Korstanje & Pagan (2002) Nat Genet

Reasons for Failure? BUT Not much success in mapping complex diseases / traits! Individual environment Linkage Marker Linkage Gene 1 disequilibrium Linkage Association Complex Phenotype Mode of inheritance Gene 2 Gene 3 Common environment Polygenic background Weiss & Terwilliger (2000) Nat Genet

LD Patterns and Allelic Association Type 1 diabetes and Insulin VNTR Alzheimers and ApoE4 Bennett & Todd, Ann Rev Genet, 1996 Pattern of LD unpredictable Extent of common genetic variation unknown Roses, Nature 2000

Genomewide Association? Risch & Merikangas, Science 1996

Multiple Rare Variant Hypothesis? GWA assumes that common variants underlie common diseases

Enabling Genomewide Association Studies HAPlotype MAP High throughput genotyping Large cohorts ALSPAC

Wellcome Trust Case Control Consortium

Wellcome Trust CaseControl Consortium GenomeWide Association Across Major Human Diseases DESIGN Collaboration amongst 26 UK disease investigators 2000 cases each from 9 diseases 1000 cases from 4 diseases GENOTYPING Affymetrix 500k SNPs Illumina Human NS_12 SNP chip CASES 1. Type 1 Diabetes 2. Type 2 Diabetes 3. Crohn s Disease 4. Coronary Heart Disease 5. Hypertension 6. Bipolar Disorder 7. Rheumatoid Arthritis 8. Malaria 9. Tuberculosis 10. Ankylosing Spondylitis 11. Grave s Disease 12.Breast Cancer 13.Multiple Sclerosis CONTROLS 1. UK Controls A (1,500 1958 BC) 2. UK Controls B (1,500 NBS) 3. Gambian controls (2000)

Ankylosing Spondylitis Autoimmune arthritis resulting in fusion of vertebrae Prevalence of 0.4% in Caucasians. More common in men. Often associated with psoriasis, IBD and uveitis Ed Sullivan, Mike Atherton

Ankylosing Spondylitis GWAS WTCCC (2007) Nat Genet IL23R ARTS1

Successes MSMB JAZF1 CTBP2 HNF1B 10q21 PTPN2 FAM92B NKX23 IFIH1 CTLA4 KCNJ11 ERBB3 TCF2 WFS1 TCF7L2 Common genes = common etiology? Some in gene deserts Large relative risks does not = success 11q13 NCF4 CD25 SLC30A8 Drug targets 2q35 LSP1 3des. NUDT11 PHOX2B IRGM PTPN22 KIAAA 035D PPARG HHEX MAPKI3 SLC22A3 BSN IL2RA IGF2BP2 TNRC9 KLK3 ATG16L1 18p11 CDKAL1 2q36 8q24 LMTK2 8q24 5p13 12q24 FTO 6q251 IL21 IL23R FGFR2 8q24 SMAD7 NOD2 IL23R INS 9p21 FTO GCKR 9p21 IL2 PTPN22 ARTS GALP LOXL1 ORMDL Breast ca Prostate ca Colorectal ca Crohn s Dis IBD T1D T2D Obesity Triglycerides CAD Coeliac Dis Rheumatoid A Ankylosing S Alzheimer Dis Glaucoma Asthma

What About Quantitative Traits? 1 Gene 3 Genotypes 3 Phenotypes 2 Genes 9 Genotypes 5 Phenotypes 3 Genes 27 Genotypes 7 Phenotypes 4 Genes 81 Genotypes 9 Phenotypes 3 2 1 0 3 2 1 0 Quantitative genetics theory suggests that quantitative traits are the result of many variants of small effect Unselected samples Central Limit Theorem Normal Distribution The corollary is that very large sample sizes will be needed to detect these variants in UNSELECTED samples 7 6 5 4 3 2 1 0 20 15 10 5 0

FTSO WTCCC T2D Scan FTO FTO produces a moderate signal in WTCCC T2D scan But, no signal in an American T2D scan? American cases and controls matched on BMI

Replication is critical!!! FTO

Height The Archetypal Polygenic Dizygotic Twins Trait Monozygotic Twins Twin, family and adoption studies suggest that, within a population, 90% of variation in height is due to genetic variation Twins separated at birth Borjeson, Acta Paed, 1976

GWA of Height A 1914 Cases (WTCCC T2D) B 4892 Cases (DGI) C 6788 Cases (WTCCC HT) D 8668 Cases (WTCCC CAD) E 12228 Cases (EPIC) F 13665 Cases (WTCCC UKBS) Weedon et al. (in press) Nat Genet Large numbers are needed to detect QTLs!!! Significant results Other loci? Collaboration is the name of the game!!!

A: 1,900 B: 5,000 C: 7,200 D: 9,100 E: 12,600 F: 14,000 Some real hits sit in the bottom of the distribution Weedon et al. (in press) Nat Genet Some hits initially look interesting but then go away

Hedgehog signaling, cell cycle, and extracellular matrix genes overrepresented Candidate gene Monogenic Knockout mouse Details* ZBTB38 Transcription factor. CDK6 Yes Involved in the control of the cell cycle. HMGA2 Yes Yes Chromatin architectural factors GDF5 Yes Yes Involved in bone formation LCORL May act as transcription activator LOC387103 Not known EFEMP1 Yes Extracellular matrix C6orf106 Not known PTCH1 Yes Yes Hedgehog signalling SPAG17 Not known SOCS2 Yes Regulates cytokine signal transduction HHIP Hedgehog signaling ZNF678 Transcription factor DLEU7 Not known SCMH1 Yes Polycomb protein ADAMTSL3 Extracellular matrix IHH Yes Yes Hedgehog signaling ANAPC13 Cell cycle ACAN Yes Yes Extracellular matrix DYM Yes Not known Weedon et al. (in press) Nat Genet

The combined impact of the 20 SNPS with a P < 5 x 10 7 The 20 SNPs explain only ~3% of the variation of height Lots more genes to find but extremely large numbers needed Weedon et al. (in press) Nat Genet

Height Linkage Regions Perola et al, Plos Genetics, 2007; data available at http://www.genomeutwin.org; Weedon et al.; unpublished data

Perola et al, Plos Genetics, 2007; data available at http://www.genomeutwin.org; Weedon et al.; unpublished data

What s Going On? Loci identified by GWAs don t have linkage peaks over them Linkage analysis lacks power? Areas identified by linkage don t have significant assocation hits over them Type I error? Power? BUT what if linkage analysis and association analysis identify different types of loci?

What next? Genomewide Sequencing Functional Studies Animal models Other ethnic groups Epigenetics Fine mapping Initial Genome Wide Scans More genes Transcriptomics Mendelian Randomisation Genomic Profiling CNVs

Distribution of MAFs in HapMap Genomewide panels and HapMap biased towards common variants Common variants don t tag rare variants well

Complex Disease Tree??????????????? High hanging fruit??? ARTS1 TCF7L2 FTO IL23R PPARG Low hanging fruit WFS1 TCF2 HLAB27 SEMA5A Rotten Fruit

Methods of gene hunting rare, monogenic (linkage) Effect Size? common, complex (association) Frequency

Genomewide Sequencing Sequence individuals genomes Will identify rare variants But will we have enough power? Evans et al. (2008) EJHG

Genomic Profiling The idea of using genetic information to inform diagnosis Predictive testing in the case of monogenic diseases has been used for years (1300+ tests available) (e.g. Phenylketonuria) Not possible in complex diseases as effects of an individual variant is so small BUT if we consider several predisposing genetic and environmental factors, can we predict disease?

Genomic Profiling (from Janssens et al. 2004 AJHG) => Give up and go home? (from Yang et al. 2003 AJHG)

Ankylosing Spondylitis 1 0.9 0.8 POSTERIOR PROBABILITY OF DISEASE 0.7 0.6 0.5 0.4 0.3 D+/B27+ D+/B27 D+/B27+,ARTS1+,IL23+ D+/B27/ARTS1/IL23R 0.2 0.1 0 0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1 PRIOR DISEASE PROBABILITY (Brown & Evans, in prep) Prevalence of B27+, ARTS1+,IL23R+ is 2.4% Prevalence of B27, ARTS1, IL23R is 19%

Using Genetics to Inform Classical Epidemiology

Observational Studies Fanciful claims often made from observational studies In a casecontrol study, a group of diseased individuals are recruited (Cases); A group of individuals without disease are gathered (Controls); Both groups are then measured retrospectively on an exposure of interest; A test of association is performed Example: Obesity (Exposure) and Coronary Heart Disease (Outcome) CHD Control 200 50 Obesity Yes No Odds of obesity in cases: 200/100 = 2 100 250 Odds of obesity in controls: 50/250 = 0.2 Odds Ratio: 2/0.2 = 10

Confounding Classic limitations to observational science Reverse Causation Bias

Randomized Control Trials Individuals free from exposure and disease Randomization Treatment Group Control Group Measure Outcome Measure Outcome Randomization controls for confounding Reverse causation impossible Gold standard for assessing causality

Mendelian Randomization RCTs not always ethical or possible Fortunately nature has provided us with a natural randomized control trial! Mendel s law of independent assortment states that inheritance of a trait is independent (randomized) with respect to other traits Therefore individuals are randomly assigned to three groups based on their genotype (AA, Aa, aa) independent of outcome Assessing the relationship between genotype, environmental risk factor and disease informs us on causality

Mendelian Randomization Genetic Variant (FTO) Confounding Variables (smoking, diet etc.) β FTOObesity Modifiable Environmental Exposure (Obesity) β ObesityCHD Disease (Coronary Heart Disease) If obesity causes CHD then the relationship between FTO and CHD should be estimated by the product of β FTOObesity and β ObesityCHD

Mendelian Randomization Genetic Variant (FTO) Confounding Variables (smoking, diet etc.) β FTOObesity Modifiable Environmental Exposure (Obesity) β ObesityCHD Disease (Coronary Heart Disease) If CHD causes obesity then β FTOCHD should be zero.

Mendelian Randomization Genetic Variant (FTO) Confounding Variables (smoking, diet etc.) β FTOObesity Modifiable Environmental Exposure (Obesity) Disease (Coronary Heart Disease) If the relationship between Obesity and CHD is purely correlational (i.e. due to confounding) then β FTOCHD should be 0

Mendelian Randomization Genetic Variant (FTO) Confounding Variables (smoking, diet etc.) Modifiable Environmental Exposure (Obesity) Disease (Coronary Heart Disease) Genotype is associated with the environmental exposure of interest Genotype is NOT associated with confounders Genotype is only related to its outcome via its association with the modifiable environmental exposure

Mendelian Randomization Mendelian Randomization is a way of using a genetic variant(s) to make causal inferences about (modifiable) environmental risk factors for disease and health related outcomes Environmental exposures (e.g. Obesity) can be modified! Genetic factors cannot (at least for the moment ) Still a relatively new approach that has problems (i.e. finding genetic proxies for environmental exposures multiple instruments?) but a LOT of scope for development

Could SEM be used to enhance MR? Genetic Variant (FTO) Confounding Variables (smoking, diet etc.) Modifiable Environmental Exposure (Obesity) Disease (Coronary Heart Disease) Genetic Variant (MC4R) Genetic Variant (?)