Mendelian Randomization

Drawback with observational studies
[Diagram: risk factor X → outcome Y. With unobserved confounders C affecting both X and Y, the X → Y arrow becomes uncertain (X ?→ Y).]

The power of genetics
[Diagram: genetic variant G (instrumental variable, IV) → intermediate phenotype X (risk factor) → outcome Y, with unobserved confounders C affecting both X and Y.]

Mendelian Randomization
Basic principle: genetic variants that mirror the biological effects of a modifiable environmental exposure and alter disease risk should be associated with disease risk to the extent predicted by their influence on exposure to the risk factor.
The random allocation of genetic variants from parents to offspring means that these variants will generally be unrelated to other factors that affect the outcome. Furthermore, associations between the genotype and the outcome will not be affected by reverse causation, because disease does not affect genotype.
Ebrahim & Davey Smith, Hum Genet 2008; Davey Smith & Ebrahim, Int J Epi 2004

Possible effects of C-reactive protein (CRP) on cardiovascular (CV) events. Expected outcome from a hypothetical randomized clinical trial of a selective CRP-lowering intervention, and from a Mendelian randomization analysis, if CRP were causal in developing CV events. Hingorani & Humphries, Lancet 2005

Three key assumptions in MR analysis
1. G (a SNP or a combination of multiple SNPs) is robustly associated with X (the risk factor).
2. G is unrelated to any confounders C that can bias the relationship between G and Y (the outcome). In other words, there are no common causes of G and Y (e.g. population stratification).
3. G is related to Y only through its association with X (i.e. no pleiotropy).

Assumption 1: G is robustly associated with X
Under certain conditions, the relative bias of the instrumental variable (IV) estimate is ~1/F. A weak IV has been defined as having F < 10, where
$$F = \frac{R^2 (n - k - 1)}{k (1 - R^2)}$$
and R^2 is the variance in X explained by the IV(s), n is the sample size and k is the number of IVs.
Weak IVs can lead to biased effect estimates (in the direction of the observed X-Y association) in the presence of confounding of the X-Y relationship.
Pierce, IJE 2011
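As a rough illustration of the F < 10 rule, the sketch below computes the first-stage F statistic from R^2, n and k, assuming the standard expression F = R^2 (n - k - 1) / (k (1 - R^2)). The function name `first_stage_f` and the example numbers are hypothetical and do not come from the slides.

```python
def first_stage_f(r2: float, n: int, k: int) -> float:
    """First-stage F statistic, assuming F = R^2 (n - k - 1) / (k (1 - R^2)).

    r2: proportion of variance in X explained by the instrument(s)
    n:  sample size
    k:  number of instruments
    """
    if not 0.0 <= r2 < 1.0:
        raise ValueError("r2 must be in [0, 1)")
    if k < 1 or n - k - 1 <= 0:
        raise ValueError("need k >= 1 and n > k + 1")
    return r2 * (n - k - 1) / (k * (1.0 - r2))


# Hypothetical example: one instrument explaining 1% of the variance in X.
f_small = first_stage_f(r2=0.01, n=500, k=1)    # about 5, below the F < 10 threshold
f_large = first_stage_f(r2=0.01, n=5000, k=1)   # about 50

# Approximate relative bias of the IV estimate, using the ~1/F rule of thumb.
relative_bias_small = 1.0 / f_small
relative_bias_large = 1.0 / f_large
```

With the same R^2, the instrument counts as weak at n = 500 but not at n = 5000, which is one reason MR studies need large samples.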

Assumption 2: No confounding
G is independent of factors (measured and unmeasured) that confound the X-Y relation.
Because G is randomized at conception, it is independent of non-genetic confounders and is not modified by the course of disease. The main concern here is population stratification, i.e. ancestry being related to both G and Y.
If you have individual-level data, you can test for this (e.g. with principal components analysis, PCA).

Assumption 3: No pleiotropy
This assumption is the trickiest. It assumes that G is associated with Y only via X, so that the association between G and Y is fully mediated by X and not by any unmeasured factor(s). It needs to hold for SNPs in LD with G too.
[Diagram: G → X → Y, with confounders C.]

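Scenarios invalidating assumption 3
[Diagrams: panels labelled "LD" (G1 in linkage disequilibrium with a second variant G2) and "Pleiotropy" (a variant linked to a second risk factor X2 as well as X1), each with confounders C.]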

Haycock et al, Am J Clin Nutr 2016

Individual-level data in one sample
Access to SNPs, risk factor, and outcome for all participants.
The causal effect of X on Y can be estimated using two-stage least-squares (2SLS) regression:
1. X = a + γG
2. Y = c + βX̂, where X̂ is the genetically predicted exposure level from (1)
The causal estimate is given by β.
Can be implemented in R using the ivpack package.
Weak instruments cause bias towards the observed confounded association.
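The two stages can be sketched on simulated data. This is a minimal illustration in Python with NumPy, not the ivpack implementation; the simulated effect sizes, the allele frequency and the helper `ols` are assumptions made only for the example. A confounder C is built into both X and Y so that ordinary regression is biased while the 2SLS estimate is not.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 50_000
true_beta = 0.3  # assumed causal effect of X on Y (simulation only)

g = rng.binomial(2, 0.3, size=n).astype(float)  # genotype: 0, 1 or 2 effect alleles
c = rng.normal(size=n)                          # unobserved confounder
x = 0.5 * g + c + rng.normal(size=n)            # risk factor
y = true_beta * x + c + rng.normal(size=n)      # outcome


def ols(design: np.ndarray, response: np.ndarray) -> np.ndarray:
    """Least-squares coefficients for response ~ design."""
    coef, *_ = np.linalg.lstsq(design, response, rcond=None)
    return coef


ones = np.ones(n)

# Stage 1: X = a + gamma * G, then form the genetically predicted exposure.
a_hat, gamma_hat = ols(np.column_stack([ones, g]), x)
x_hat = a_hat + gamma_hat * g

# Stage 2: Y = c + beta * X_hat; the slope is the 2SLS causal estimate.
_, beta_2sls = ols(np.column_stack([ones, x_hat]), y)

# Conventional regression of Y on X, for comparison (confounded by C).
_, beta_ols = ols(np.column_stack([ones, x]), y)

print(f"2SLS estimate: {beta_2sls:.3f}, OLS estimate: {beta_ols:.3f}")
```

Running the second stage by hand gives the right point estimate, but its naive standard error is not valid, because it ignores the uncertainty in X̂. Dedicated IV routines correct for this.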

Summary data from two samples
The G-X and the G-Y associations are estimated in two different samples.
Assumes no overlap between the samples and that the two populations are similar (ethnicity, age, sex, etc.).
Here, bias due to weak IVs will be towards the null.
Note: the G-X and G-Y associations need to be coded using the same effect allele.
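The effect-allele note can be illustrated with a small harmonisation sketch. The data layout (a dict from SNP id to effect allele, other allele and beta), the function name `harmonise` and the SNP ids are all hypothetical. The sketch deliberately ignores strand flips and palindromic SNPs, which real pipelines also have to handle.

```python
def harmonise(exposure: dict, outcome: dict) -> dict:
    """Align G-Y betas to the effect allele used for the G-X betas.

    Both inputs map SNP id -> (effect_allele, other_allele, beta).
    Returns SNP id -> (beta_gx, beta_gy), both expressed per copy of the
    exposure effect allele. SNPs with inconsistent alleles are dropped.
    """
    aligned = {}
    for snp, (ea_x, oa_x, beta_x) in exposure.items():
        if snp not in outcome:
            continue
        ea_y, oa_y, beta_y = outcome[snp]
        if (ea_y, oa_y) == (ea_x, oa_x):
            aligned[snp] = (beta_x, beta_y)    # already on the same allele
        elif (ea_y, oa_y) == (oa_x, ea_x):
            aligned[snp] = (beta_x, -beta_y)   # alleles swapped: flip the sign
        # any other combination is ambiguous here and is left out
    return aligned


# Hypothetical summary statistics.
gx = {"snp1": ("A", "G", 0.10), "snp2": ("C", "T", 0.20), "snp3": ("A", "C", 0.05)}
gy = {"snp1": ("A", "G", 0.03), "snp2": ("T", "C", -0.06), "snp3": ("G", "T", 0.01)}

harmonised = harmonise(gx, gy)
```

In this example snp2 has its G-Y beta flipped from -0.06 to 0.06, and snp3 is dropped because its alleles do not match.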

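Summary data from two samples
$$\hat{\beta} = \frac{\sum_k \beta_{1k}\,\beta_{2k}\,\sigma_{\beta_{2k}}^{-2}}{\sum_k \beta_{1k}^{2}\,\sigma_{\beta_{2k}}^{-2}}$$
$$\mathrm{se}(\hat{\beta}) = \sqrt{\frac{1}{\sum_k \beta_{1k}^{2}\,\sigma_{\beta_{2k}}^{-2}}}$$
$\beta_{1k}$ is the mean change in X per allele for SNP k, $\beta_{2k}$ is the mean change in Y per allele for SNP k, and $\sigma_{\beta_{2k}}^{-2}$ is the inverse variance for the G-Y association.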
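A minimal sketch of this summary-data estimator follows. It assumes the inverse-variance weighted form, in which each SNP contributes its G-X effect times its G-Y effect, weighted by the inverse variance of the G-Y effect. The function name `ivw_estimate` and the input numbers are hypothetical.

```python
import math


def ivw_estimate(beta_gx, beta_gy, se_gy):
    """Inverse-variance weighted MR estimate from per-SNP summary data.

    beta_gx: per-allele effects of each SNP on X
    beta_gy: per-allele effects of each SNP on Y (same effect alleles)
    se_gy:   standard errors of the G-Y effects
    Returns (estimate, standard_error).
    """
    if not (len(beta_gx) == len(beta_gy) == len(se_gy)) or not beta_gx:
        raise ValueError("inputs must be non-empty and of equal length")
    weights = [1.0 / se**2 for se in se_gy]  # inverse variance of each G-Y effect
    numerator = sum(bx * by * w for bx, by, w in zip(beta_gx, beta_gy, weights))
    denominator = sum(bx**2 * w for bx, w in zip(beta_gx, weights))
    return numerator / denominator, math.sqrt(1.0 / denominator)


# Hypothetical two-SNP example in which each SNP implies a ratio of 0.5.
estimate, std_err = ivw_estimate(
    beta_gx=[0.1, 0.2],
    beta_gy=[0.05, 0.1],
    se_gy=[0.01, 0.02],
)
```

Here the numerator is 100 and the denominator is 200, so the estimate is 0.5 and the standard error is sqrt(1/200), about 0.071.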

Meta-analysis of associations between height and risk of breast cancer in prospective cohort studies. Ben Zhang et al. JNCI J Natl Cancer Inst 2015;107:djv219

Association between BMI and cancer risk was assessed for 22 cancers in 5.24 million individuals (166,996 cancer cases). Bhaskaran et al, Lancet 2014

Childhood body fatness is inversely associated with breast cancer risk
[Figure: breast cancer risk (axis from 0.4 to 1.2) across the categories 1, 1.5-2, 2.5-3, 3.5-4, 4.5-5, 5.5-6 and >6.]
Baer et al, AJE 2010

Expansion to other cancer types within GAME-ON

Cancer type          Cases     Controls   GWAS studies
Breast
  All                15,569    18,204     11
  ER-negative         4,760    13,248      8
Colorectal            5,100     4,831      6
Lung (a)
  All                12,527    17,285      6
  Adenocarcinoma      3,804    16,289      6
  Squamous            3,546    16,434      6
Ovarian (a)
  All                 4,369     9,123      3
  Clear-cell            356     9,123      3
  Endometrioid          715     9,123      3
  Serous              2,556     9,123      3
Prostate
  All                14,160    12,712      6
  Aggressive          4,446    12,724      6
Total                51,725    62,155

Gao et al, IJE 2016

Childhood body fatness (9 SNPs)
[Figure; p-values shown on the plot: p=0.05, p=0.05, p=5.6x10^-5, p=3.7x10^-4.]
Gao et al, IJE 2016

Adult BMI (77 SNPs)
[Figure; p-values shown on the plot: p=0.02, p=0.0005, p=0.002, p=3.4x10^-7, p=0.002, p=0.003.]
Gao et al, IJE 2016

Bidirectional MR analysis
Approach to overcome reverse causation.
IVs for both X1 and X2 are used to assess the causal association in both directions:
1. Is G1 associated with X2?
2. Is G2 associated with X1?
(Also confirm that G1 is associated with X1 and that G2 is associated with X2.)
[Diagram: G1 → X1 and G2 → X2, with X1 and X2 linked and confounders C.]

BMI and CRP: what causes what?
There is a consistent observed association between high BMI and high CRP levels.
[Figure legend: light grey points represent a scatter plot of the correlation between circulating CRP and residual BMI. Grey areas represent 95% confidence regions around IV estimates. The black area represents the 95% confidence region around simple linear regression estimates.]
Timpson et al, Int J Obesity 2011

These data suggest that the observed association between circulating CRP and measured BMI is likely to be driven by BMI, with CRP being a marker of elevated adiposity. Timpson et al, Int J Obesity 2011

Drawbacks with MR analysis
Large sample sizes are needed! Because genetic effects on risk factors are typically small, MR estimates have much wider confidence intervals than conventional epidemiological estimates.
Make sure that the three key assumptions hold!