Supplementary Appendix

This appendix has been provided by the authors to give readers additional information about their work.

Supplement to: Ference BA, Robinson JG, Brook RD, et al. Variation in PCSK9 and HMGCR and risk of cardiovascular disease and diabetes. N Engl J Med 2016;375:2144-53. DOI: 10.1056/NEJMoa1604304

Table of Contents

Supplemental Methods
Data Sources and Acknowledgements
Figure S1: Design of the Naturally Randomized Trials
Figure S2: Effect of PCSK9 genetic score on the risk of cardiovascular events
Figure S3: Log-linear association between lower LDL-C mediated by PCSK9 polymorphisms and the risk of coronary death or MI
Figure S4: Effect of PCSK9 genetic score on coronary death and MI in selected subgroups
Figure S5: Comparison of the effect of PCSK9 and HMGCR genetic scores on the risk of various cardiovascular outcomes adjusted per 10 mg/dl lower LDL-C
Figure S6: Effect of PCSK9 genetic score on risk of cardiovascular disease in up to 62,240 case and 127,299 control subjects enrolled in the CARDIoGRAMplusC4D consortium studies adjusted per 10 mg/dl lower LDL-C
Figure S7: Effect of HMGCR genetic score on risk of cardiovascular disease in up to 62,240 case and 127,299 control subjects enrolled in the CARDIoGRAMplusC4D consortium studies adjusted per 10 mg/dl lower LDL-C
Figure S8: Comparison of effect of PCSK9 and HMGCR genetic scores on risk of cardiovascular events in up to 62,240 case and 127,299 control subjects enrolled in the CARDIoGRAMplusC4D consortium studies adjusted per 10 mg/dl lower LDL-C
Figure S9: PCSK9 genetic score MR-Egger Evaluation of Pleiotropic Effects
Figure S10: HMGCR genetic score MR-Egger Evaluation of Pleiotropic Effects
Figure S11: Comparison of LDLR, PCSK9, HMGCR and NPC1L1 genetic scores on risk of diabetes
Figure S12: Effect of PCSK9 genetic score on risk of diabetes in up to 86,197 subjects enrolled in the DIAGRAM consortium studies
Figure S13: Effect of HMGCR genetic score on risk of diabetes in up to 181,111 subjects enrolled in DIAGRAM consortium and other studies
Figure S14: Comparison of effect of PCSK9 and HMGCR genetic scores on risk of diabetes in up to 181,111 subjects enrolled in DIAGRAM consortium and other studies
Figure S15: Comparison of effect of unweighted PCSK9 and HMGCR scores on the risk of cardiovascular events and diabetes
Table S1: Included studies and genotyping platforms
Table S2: Consortia studies used for external validation
Table S3: Baseline characteristics of study sample participants
Table S4: PCSK9 polymorphisms included in genetic score and their association with LDL-C in the Global Lipids Genetics Consortium
Table S5: Linkage disequilibrium matrix for polymorphisms included in the PCSK9 genetic score
Table S6: HMGCR polymorphisms included in genetic score and their association with LDL-C in the Global Lipids Genetics Consortium
Table S7: Linkage disequilibrium matrix for polymorphisms included in the HMGCR genetic score
Table S8: Association between PCSK9 and HMGCR genetic scores and cardiometabolic outcomes in consortia data per 10 mg/dl lower LDL-C
Supplemental References

Supplement to: PCSK9 Polymorphisms, HMGCR Polymorphisms, Protection from Cardiovascular Disease, and Risk of Diabetes

Brian A. Ference, MD, MPhil, MSc, Jennifer G. Robinson, MD, MPH, Robert D. Brook, MD, Alberico L. Catapano, MSc, M. John Chapman, PhD, David R. Neff, DO, Szilard Voros, MD, Robert P. Giugliano, MD, SM, George Davey Smith, MD, DSc, Sergio Fazio, MD, PhD, Marc S. Sabatine, MD, MPH

Supplemental Methods:

I. Principle of Mendelian randomization

Observational epidemiologic studies measure the association between an exposure and an outcome. However, these studies are vulnerable to confounding, reverse causation and other forms of bias. As a result, epidemiologic studies do not provide compelling evidence that an observed association is causal. Mendelian randomization studies are designed to introduce a randomization scheme into an observational study specifically to assess whether an observed association between an exposure and an outcome is likely to be causal.

The concept of Mendelian randomization is based on the random allocation of genetic variants from parents to offspring as described in Mendel's law of independent assortment. These studies use genetic polymorphisms that are associated with an exposure of interest as an instrument to randomly allocate study participants to higher or lower levels of the exposure under study. [1] Because allocation to genetic variation in levels of the exposure is random, this study design should be less susceptible to confounding. In addition, because allocation of the polymorphism occurs at conception, this study design should not be vulnerable to reverse causation.

The results of a Mendelian randomization study can be interpreted as follows. If a polymorphism is associated with an exposure of interest that is observationally associated with the outcome under study, then the observed association between the exposure and outcome is likely to be causal if the polymorphism is also associated with the outcome. If not, then the observed association between the exposure and outcome is likely to be an artefact of confounding, reverse causation or other study bias.

Perhaps the most intuitive way to explain the potential clinical relevance of a Mendelian randomization study is by way of analogy with a randomized trial. Indeed, Mendelian randomization studies have been called nature's randomized trials. [2,3] There are many polymorphisms that are associated with plasma LDL-C levels, including polymorphisms in the PCSK9 gene. [4] For example, the C allele of the rs11206510 polymorphism in the PCSK9 gene is associated with lower LDL-C. This polymorphism, like all other polymorphisms, is inherited approximately randomly at the time of conception in the process sometimes referred to as Mendelian randomization. Therefore, inheriting an LDL-C lowering C allele at this polymorphism in the PCSK9 gene is analogous to being randomly allocated to a PCSK9 inhibitor, while inheriting the other allele is analogous to being randomly allocated to usual care. If this polymorphism is associated only with LDL-C, and not with any other biomarkers or other pleiotropic effects, then the only difference between persons with and without one or more C alleles at this PCSK9 polymorphism will be their plasma LDL-C level. If allocation is indeed random, then measuring the association between this PCSK9 polymorphism and the risk of cardiovascular disease should provide a naturally randomized and unconfounded estimate of the causal effect of lower LDL-C mediated by polymorphisms in the PCSK9 gene on the risk of cardiovascular disease in a manner analogous to a long-term randomized trial.

As in a randomized trial, one can evaluate the success of the Mendelian randomization scheme by comparing a table of baseline characteristics among persons with and without the LDL-C lowering allele. If the baseline characteristics are well balanced, then allocation must have been random and one can reasonably assume that all known and unknown confounders between the exposure and outcomes of interest are equally distributed between the groups being compared.

Ironically, Mendelian randomization studies have very little to do with genetics. They are not designed to identify persons at higher or lower risk of disease based on genotype (or genetic scores). Instead, Mendelian randomization studies are designed to evaluate the causal association between an exposure and the risk of disease. The genetic polymorphisms in a Mendelian randomization study are merely convenient instruments that are used to randomly allocate persons to higher or lower levels of the nongenetic exposure under study.

It is important to note that Mendelian randomization studies have several limitations.
Despite presumed random allocation of genetic polymorphisms according to Mendel's law of independent assortment, this study design is still vulnerable to confounding by population structure, pleiotropy and linkage disequilibrium. Confounding by population structure can be addressed by performing studies within ethnically homogeneous study populations. Confounding by pleiotropy can be addressed by selecting polymorphisms that are only associated with the exposure of interest, but not with other exposures that are known to be causally associated with the outcome under study. Both confounding by pleiotropy and confounding by linkage disequilibrium can be assessed by measuring the effect of the genetic instrument on known potential confounding exposures.

Another important limitation of Mendelian randomization studies is the problem of weak instruments. If a polymorphism does not have a quantitatively large enough effect on the exposure of interest, it will not be a useful instrument to reliably assess the causal effect of that exposure on an outcome. The problem of weak instruments can be addressed by combining multiple polymorphisms that are associated with the exposure of interest into a genetic score to create an instrument that has a quantitatively much larger effect on the exposure of interest. [1]

II. Constructing the genetic scores

To create a genetic instrument with the largest possible effect on LDL-C, we constructed genetic scores. We combined multiple independently inherited polymorphisms in the PCSK9 gene to create a PCSK9 genetic score, and combined multiple independently inherited polymorphisms in the HMGCR gene to create an HMGCR genetic score. These genetic scores are instruments that reflect the combined effect of the polymorphisms included in either score, respectively, on circulating LDL-C levels. As a result, each score has a much larger effect on plasma LDL-C than any individual polymorphism included in the score.

We selected polymorphisms for inclusion in each genetic score according to the following protocol. First, we identified all polymorphisms within 100 kb of the target gene (PCSK9 or HMGCR). We ranked each polymorphism by its p-value for the association with LDL-C as reported by the Global Lipids Genetics Consortium (GLGC). [4] We then iteratively selected for inclusion (in order of decreasing magnitude of association with LDL-C) all polymorphisms that satisfied both of two criteria: 1) a p-value for association with LDL-C of < 5 × 10⁻⁸; and 2) low linkage disequilibrium with all other polymorphisms included in the score (defined as r² < 0.2 for all comparisons with all other SNPs included in the score). [5] We iteratively confirmed that each polymorphism added to a score had an independent effect on LDL-C and contributed additional information to the score using forward stepwise regression.

We defined the exposure allele for each polymorphism as the allele associated with lower LDL-C as reported in the GLGC. To calculate the PCSK9 and HMGCR genetic score for each participant, we multiplied the number of exposure alleles that a person inherited at each polymorphism included in either score by the effect of that polymorphism on LDL-C measured in mg/dl as estimated in the GLGC. We then summed these values to create a weighted genetic score for each participant. For sensitivity analyses, we also constructed unweighted PCSK9 and HMGCR genetic scores for each participant by summing the number of exposure alleles that a person inherited at each polymorphism included in either score, respectively. If a polymorphism included in the PCSK9 or HMGCR genetic scores was not included on the genotyping platform(s) used in a particular study, we selected the closest proxy polymorphism that was genotyped on at least one of the genotyping platforms available for that study.

III. Allocation into exposure groups

Because all polymorphisms included in either genetic score are inherited approximately randomly at the time of conception in a process sometimes referred to as Mendelian randomization, [1] and because each polymorphism is inherited approximately independently of the other polymorphisms included in the score by virtue of low linkage disequilibrium, the number of exposure alleles that a person inherits in either score should also be random. We dichotomized each genetic score as having a value above or below the median score for participants in the population under study. Because the number of exposure alleles that a person inherits in either score should be random, dichotomizing the genetic score as above and below the median should therefore randomly allocate the population under study into two approximately equal-sized groups.

We chose to dichotomize the genetic scores in the primary analysis, and to use the dichotomized score as an instrument to randomly allocate study participants into two approximately equal-sized groups, for two reasons. First, using either genetic score to randomly allocate the population into two approximately equal-sized groups permitted us to directly estimate the separate and combined effect of polymorphisms that mimic the effect of PCSK9 inhibitors and statins using a 2x2 factorial study design. Second, randomly allocating study participants into two approximately equal-sized groups would give our study the same structure as a randomized trial.
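
The variant selection, score construction and median split described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' code: the variant names, p-values, r² values and per-allele weights are hypothetical placeholders, missing r² pairs are assumed to be in linkage equilibrium, the weights are taken as the magnitude of LDL-C lowering per exposure allele, and participants whose score equals the median are assigned to the lower group.

```python
from statistics import median


def select_variants(p_values, r2, p_max=5e-8, r2_max=0.2):
    """Greedy selection: strongest LDL-C association first, skipping any
    variant in linkage disequilibrium (r2 >= r2_max) with one already chosen.
    `r2` maps frozenset({a, b}) to r-squared; missing pairs are assumed 0."""
    chosen = []
    for name, p in sorted(p_values.items(), key=lambda item: item[1]):
        if p >= p_max:
            break  # remaining variants are not genome-wide significant
        if all(r2.get(frozenset((name, other)), 0.0) < r2_max for other in chosen):
            chosen.append(name)
    return chosen


def weighted_score(allele_counts, weights):
    """Sum over variants of (exposure alleles inherited) x (LDL-C effect, mg/dl)."""
    return sum(allele_counts[snp] * w for snp, w in weights.items())


def unweighted_score(allele_counts, weights):
    """Plain count of exposure alleles, as used in the sensitivity analyses."""
    return sum(allele_counts[snp] for snp in weights)


def above_median(scores):
    """True where a participant's score exceeds the sample median (ties: False)."""
    cut = median(scores)
    return [s > cut for s in scores]


def factorial_group(pcsk9_high, hmgcr_high):
    """Label the 2x2 factorial cell from the two dichotomized scores."""
    if pcsk9_high and hmgcr_high:
        return "both"
    if pcsk9_high:
        return "PCSK9"
    if hmgcr_high:
        return "HMGCR"
    return "reference"


# Hypothetical inputs, for illustration only.
P_VALUES = {"snp_a": 1e-20, "snp_b": 1e-12, "snp_c": 1e-9, "snp_d": 1e-5}
R2 = {frozenset(("snp_a", "snp_b")): 0.5}
WEIGHTS = {"snp_a": 3.0, "snp_c": 1.5}  # mg/dl lower LDL-C per exposure allele
```

Here `select_variants(P_VALUES, R2)` keeps `snp_a` and `snp_c`, drops `snp_b` for being in linkage disequilibrium with `snp_a`, and drops `snp_d` for missing the significance threshold. Applying `above_median` separately to the PCSK9 and HMGCR scores and passing the two flags to `factorial_group` yields the four cells of the factorial design.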
We designed our Mendelian randomization study to have the same structure as a randomized trial to facilitate clarity of presentation and ease of interpretation for a clinical audience.

To evaluate dose-response, we allocated participants into four groups based on the quartile value of their PCSK9 and HMGCR genetic scores, respectively. Because the number of exposure alleles that a person inherits in either score should be random, allocation to each quartile score should also be random.

To conduct the 2x2 factorial analyses, study participants were first randomly allocated into two groups based on whether their PCSK9 genetic score was above or below the median value. Subjects in either of these two groups were then randomly allocated into two further groups based on whether their HMGCR genetic score was above or below the median value. Because all polymorphisms included in either score are inherited approximately randomly and approximately independently of each other due to low linkage disequilibrium, and because PCSK9 and HMGCR polymorphisms are located on different chromosomes and therefore inherited independently of each other, this process should randomly allocate the study population into 4 approximately equal-sized groups: the reference group (analogous to a placebo group), a group with lower LDL-C mediated by polymorphisms in the HMGCR gene (analogous to treatment with a statin), a group with lower LDL-C mediated by polymorphisms in the PCSK9 gene (analogous to treatment with a PCSK9 inhibitor), and a group with lower LDL-C mediated by the combined effect of polymorphisms in both the HMGCR and PCSK9 genes (analogous to treatment with the combination of a statin and a PCSK9 inhibitor). [6]

The success of the naturally random allocation scheme was assessed by comparing baseline characteristics among persons in each of the groups being compared. Continuous variables were compared using a t-test, dichotomous (and ordinal) variables were compared using a chi-square test, and non-normally distributed variables were compared using non-parametric rank tests or empirical resampling.

IV. Harmonized definition of cardiovascular outcome events

As part of a larger project, we first harmonized the definition of all cardiovascular-related outcome variables in each of the 14 studies listed in Table S1.
We then re-coded individual-level data for each study participant as necessary to satisfy the harmonized variable definitions to the extent possible, as described below. In general, and where possible, we only included the outcomes of coronary heart disease death (as adjudicated by the individual studies); definite myocardial infarction (excluding silent MI, possible MI, probable MI, ECG-detected prior MI and resuscitated cardiac arrest); coronary revascularization (defined as angioplasty, percutaneous coronary intervention or coronary artery bypass grafting); and stroke (using ischemic stroke only in studies that sub-divided stroke by type, specifically excluding haemorrhagic stroke, embolic stroke or unknown stroke type). We created new reconciled study outcome variables in each data set using the definitions described above.

The primary cardiovascular outcome for our study was a composite of the first occurrence of coronary death or MI. We used both prevalent and incident cases of MI in the primary composite to meet the definition of first occurrence (understanding that all coronary deaths during follow-up were necessarily incident events) in the cohort studies. Therefore, the primary cardiovascular outcome is a composite of prevalent MI or the first occurrence of incident MI or coronary death.

The key secondary composite cardiovascular outcomes included a) major coronary events (MCE) defined as the first occurrence of coronary death, MI or coronary revascularization; b) major vascular events (MVE) defined as the first occurrence of coronary death, MI, coronary revascularization or stroke; and c) the first occurrence of coronary death, MI, or stroke. Therefore, MCE is a composite of prevalent MI or prevalent coronary revascularization; or the first occurrence of incident MI, incident coronary revascularization or coronary death. (Coronary revascularization was defined as coronary angioplasty, percutaneous coronary intervention, or coronary artery bypass graft surgery.) Similarly, MVE is a composite of prevalent MI, prevalent coronary revascularization or prevalent stroke; or the first occurrence of incident MI, incident coronary revascularization, incident stroke or coronary death.
Finally, the outcome of coronary death, MI, or stroke is therefore a composite of the first occurrence of either prevalent MI or prevalent stroke; or the first occurrence of incident MI, incident stroke or coronary death.

Tertiary cardiovascular outcomes included the individual components of the composite cardiovascular outcomes: coronary death; MI; stroke; and coronary revascularization.
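
The composite definitions above reduce to logical ORs over four component flags. The sketch below is a hypothetical encoding for illustration, not the harmonization code used in the study: the `Events` class and its field names are assumptions, and each flag is taken to be true if the participant had a prevalent event or a first incident event of that type.

```python
from dataclasses import dataclass


@dataclass
class Events:
    """Per-participant component flags (hypothetical field names)."""
    coronary_death: bool = False
    mi: bool = False
    revascularization: bool = False
    stroke: bool = False


def composite_outcomes(e: Events) -> dict:
    """Derive the primary and key secondary composites from the components."""
    return {
        # Primary outcome: coronary death or MI.
        "primary": e.coronary_death or e.mi,
        # Major coronary events: primary components plus revascularization.
        "mce": e.coronary_death or e.mi or e.revascularization,
        # Major vascular events: MCE components plus stroke.
        "mve": e.coronary_death or e.mi or e.revascularization or e.stroke,
        # Coronary death, MI, or stroke (no revascularization).
        "death_mi_stroke": e.coronary_death or e.mi or e.stroke,
    }
```

A participant whose only event is a coronary revascularization therefore counts toward MCE and MVE, but not toward the primary outcome or the composite of coronary death, MI, or stroke.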

Coronary death included only incident events by definition. Myocardial infarction included prevalent MI or the first occurrence of an incident MI (recurrent events were not included). Stroke included prevalent stroke or the first occurrence of an incident stroke (recurrent events were not included). Coronary revascularization included either prevalent or the first occurrence of an incident coronary revascularization (recurrent events were not included).

We did not recode the case definition for the case-control studies. In the six Myocardial Infarction Genetics (MIGEN) Consortium case-control studies, all cases were MI. [7] Therefore, these cases were included in the primary composite outcome. In the Wellcome Trust Case-Control Consortium (WTCCC) study, case subjects had a history of either myocardial infarction or coronary revascularization before the age of 66 years, and a family history of coronary artery disease. [8] Of the 1,926 WTCCC case subjects, 1,377 had MI (71.5%) and the remaining 549 had coronary revascularization (202 PCI; 347 CABG) as the case ascertainment event. Because the vast majority of events in the WTCCC were MI, and to avoid excluding the 1,377 cases of MI, all 1,926 WTCCC CAD cases were included in the composite primary cardiovascular event outcome.

V. Analytic methods

The association between each dichotomized weighted genetic LDL-C score and plasma LDL-C level was evaluated using linear regression, and the association with the cardiovascular outcomes and diabetes was evaluated using logistic regression (for combined prevalent and incident outcomes) or proportional hazards models (for incident events). All analyses were adjusted for age and gender. All analyses were conducted separately in each of the 14 studies listed in Table S1, and then combined using a fixed-effects inverse variance-weighted meta-analysis to produce summary estimates of effect.
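
For readers less familiar with the pooling step, a fixed-effects inverse variance-weighted meta-analysis can be sketched as follows. This is the standard textbook formula rather than the authors' code; the function name is hypothetical, and the inputs are assumed to be per-study log odds ratios (or regression coefficients) with their standard errors.

```python
import math


def fixed_effects_ivw(estimates, std_errors):
    """Pool per-study estimates, weighting each by 1 / SE^2.

    Returns the pooled estimate and its standard error."""
    weights = [1.0 / se ** 2 for se in std_errors]
    total_weight = sum(weights)
    pooled = sum(w * b for w, b in zip(weights, estimates)) / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    return pooled, pooled_se


def to_odds_ratio(log_or, se, z=1.96):
    """Convert a pooled log odds ratio to an odds ratio with a 95% interval."""
    return math.exp(log_or), math.exp(log_or - z * se), math.exp(log_or + z * se)
```

Studies with smaller standard errors receive proportionally more weight, so with two equally precise studies the pooled estimate is simply their average.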
Within each study population, all analyses were conducted separately among each included ethnic group to minimize the potential for confounding by population stratification bias before being combined to produce the overall summary estimate of effect.

In the main analyses, we combined prevalent and incident cardiovascular outcome events and cases of diabetes. The rationale for using both prevalent and incident events in the main analyses is four-fold. First, we assumed that all events occurred incident to the genetic exposure. Second, as mentioned above, the primary and secondary composite outcome definitions included the first occurrence of a component event. Therefore, persons with a prevalent event at the time of enrolment into one of the prospective cohort studies would have satisfied the outcome definition. Third, combining prevalent and incident events in genetic analyses is a common strategy to maximize the number of events and therefore to maximize power. Fourth, the primary analysis combined data from both prospective cohort studies and case-control studies, and therefore excluding prevalent cases from the cohort studies would be incongruent.

We chose not to adjust for the use of lipid-lowering therapy for three reasons. First, we recognize that the use of lipid-lowering therapy has the potential to bias our effect estimates toward the null when measuring the effect of either genetic score on both LDL-C and the risk of cardiovascular events and diabetes. We were willing to accept this potential bias toward the null to adopt the most conservative analysis strategy possible. Second, we wanted to perform the same analyses using the same methods in all 14 of the included studies to avoid introducing any potential bias. Because we did not have data on lipid-lowering therapy for participants in the case-control studies, we did not want to analyze the case-control and cohort studies differently by adjusting for lipid-lowering therapy only in the cohort studies. Third, we were willing to accept that our point estimates of effect may be biased toward the null because estimating the precise magnitude of these point estimates of effect was not the primary goal of the study.
Instead, the primary goal of the study was to compare the relative magnitude of the point estimates of effect for the PCSK9 and HMGCR genetic scores on the risk of both cardiovascular events and diabetes in order to make inferences about the potential relative magnitude of the benefit of treatment with a PCSK9 inhibitor as compared to a statin.

VI. External Validation Analyses

To provide external validation, we compared the effect of lower LDL-C on the risk of cardiovascular events mediated by the HMGCR and PCSK9 genetic scores in up to 62,240 case and 127,299 control subjects enrolled in the CARDIoGRAMplusC4D consortia studies, [9,10] and in up to 86,196 Caucasian subjects (22,669 cases of diabetes) enrolled in the DIAGRAM consortia studies (Supplemental Table S2). [11]

As has been previously demonstrated, for a set of genetic markers with small effect size and in linkage equilibrium with each other, regression on a genetic risk score can be reconstructed from regressions on the individual polymorphisms without further access to individual-level data. [12] This is accomplished by weighting the association between each exposure allele and the risk of the outcome of interest by the effect size of the exposure allele on the modifiable exposure of interest, and then combining these weighted effect estimates to produce an overall weighted summary estimate of effect.

To calculate the effect of the PCSK9 and HMGCR genetic scores on the risk of CHD using the available summary data, we looked up the association between each polymorphism included in either genetic score, respectively, and the risk of coronary heart disease (CHD) as reported by the CARDIoGRAMplusC4D consortium (www.cardiogramplusc4d.org). [9,10] We adjusted the reported CHD effect size (and the corresponding standard error) by the effect of that polymorphism on LDL-C (measured in mg/dl) as reported by the GLGC using the usual ratio of effect estimates method. [4] We then combined the adjusted effect estimates in a fixed-effects inverse variance-weighted meta-analysis to produce PCSK9 and HMGCR genetic scores that represent a summary estimate of the effect of each unit lower LDL-C on the risk of coronary events mediated by the combined effect of the polymorphisms included in either genetic score, respectively.

In sensitivity analyses, we also calculated the effect of lower LDL-C mediated by the combined effect of the polymorphisms included in either genetic score, respectively, using linear regression analysis forced to pass through the origin (by leaving out the constant term in the regression equation). These two methods provide numerically and computationally equivalent results. We also performed the same linear regression analyses that retained the constant term in the regression equation.
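
The equivalence between the two summary-data methods can be checked numerically. The sketch below is illustrative only and rests on several assumptions: the function names are hypothetical, the standard error of each ratio is approximated as the outcome standard error divided by the absolute LDL-C effect (ignoring uncertainty in the LDL-C effect), and the regression is weighted by the inverse variance of each polymorphism's association with the outcome.

```python
def ivw_of_ratios(ldl_effects, outcome_effects, outcome_ses):
    """Inverse variance-weighted average of per-variant ratio estimates."""
    ratios = [by / bx for bx, by in zip(ldl_effects, outcome_effects)]
    ratio_ses = [se / abs(bx) for bx, se in zip(ldl_effects, outcome_ses)]
    weights = [1.0 / se ** 2 for se in ratio_ses]
    return sum(w * r for w, r in zip(weights, ratios)) / sum(weights)


def slope_through_origin(ldl_effects, outcome_effects, outcome_ses):
    """Weighted regression of outcome effects on LDL-C effects, no constant."""
    w = [1.0 / se ** 2 for se in outcome_ses]
    num = sum(wi * bx * by for wi, bx, by in zip(w, ldl_effects, outcome_effects))
    den = sum(wi * bx * bx for wi, bx in zip(w, ldl_effects))
    return num / den


def fit_with_constant(ldl_effects, outcome_effects, outcome_ses):
    """Same weighted regression but retaining the constant term.

    Returns (constant, slope)."""
    w = [1.0 / se ** 2 for se in outcome_ses]
    sw = sum(w)
    mean_x = sum(wi * bx for wi, bx in zip(w, ldl_effects)) / sw
    mean_y = sum(wi * by for wi, by in zip(w, outcome_effects)) / sw
    sxy = sum(wi * (bx - mean_x) * (by - mean_y)
              for wi, bx, by in zip(w, ldl_effects, outcome_effects))
    sxx = sum(wi * (bx - mean_x) ** 2 for wi, bx in zip(w, ldl_effects))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope
```

Under these assumptions both `ivw_of_ratios` and `slope_through_origin` reduce algebraically to the sum of w·bx·by divided by the sum of w·bx², with w = 1/SE², which is why the two approaches agree. When every outcome effect is exactly proportional to the corresponding LDL-C effect, `fit_with_constant` returns a constant of zero and the same slope.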
In these MR-Egger analyses, the constant term represents an estimate of the total pleiotropic effect of the genetic score on the risk of CHD not mediated by LDL-C. 13 To evaluate the effect of the PCSK9 and HMGCR genetic scores on the risk of diabetes using the available summary data, we looked-up the association between each polymorphism included in either genetic score, respectively, and the risk of diabetes as reported by the DIAbetes Genetics Replication And Metaanalysis (DIAGRAM) consortium (www.diagram-consortium.org). 11 described above using diabetes as the outcome definition. We then repeated the procedure 11
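The summary-data calculation described in this section can be sketched in a few lines of Python. The sketch computes per-polymorphism ratio estimates, combines them in a fixed-effects inverse variance-weighted meta-analysis, and checks that a weighted regression through the origin returns the same slope. It then refits the regression with the constant term retained, which plays the role of the MR-Egger intercept. The function names and the three input tuples are hypothetical and are not values from the study. The first-order standard error (the standard error of the outcome association divided by the LDL-C effect) is an assumption about how the ratio method was applied.

```python
import math

# Hypothetical summary data, one tuple per polymorphism:
# (effect on LDL-C in mg/dl, log odds ratio for CHD, SE of the log odds ratio).
# Illustrative numbers only; they are not taken from the study.
SNPS = [(-2.0, -0.035, 0.010), (-3.5, -0.060, 0.012), (-1.2, -0.020, 0.015)]


def ratio_estimates(snps):
    """Per-polymorphism log OR per mg/dl change in LDL-C, with first-order SE."""
    return [(by / bx, se / abs(bx)) for bx, by, se in snps]


def ivw_fixed_effects(estimates):
    """Fixed-effects inverse variance-weighted summary of (estimate, SE) pairs."""
    weights = [1.0 / se ** 2 for _, se in estimates]
    total = sum(weights)
    pooled = sum(w * est for w, (est, _) in zip(weights, estimates)) / total
    return pooled, math.sqrt(1.0 / total)


def slope_through_origin(snps):
    """Weighted regression of log OR on LDL-C effect with no constant term."""
    num = sum(bx * by / se ** 2 for bx, by, se in snps)
    den = sum(bx ** 2 / se ** 2 for bx, _, se in snps)
    return num / den


def egger_fit(snps):
    """Weighted regression retaining the constant term: (intercept, slope)."""
    w = [1.0 / se ** 2 for _, _, se in snps]
    sw = sum(w)
    x_bar = sum(wi * bx for wi, (bx, _, _) in zip(w, snps)) / sw
    y_bar = sum(wi * by for wi, (_, by, _) in zip(w, snps)) / sw
    sxy = sum(wi * (bx - x_bar) * (by - y_bar) for wi, (bx, by, _) in zip(w, snps))
    sxx = sum(wi * (bx - x_bar) ** 2 for wi, (bx, _, _) in zip(w, snps))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


pooled, pooled_se = ivw_fixed_effects(ratio_estimates(SNPS))
origin_slope = slope_through_origin(SNPS)
intercept, egger_slope = egger_fit(SNPS)
```

Here `pooled` and `origin_slope` agree to floating-point precision, which is the equivalence the text refers to. A full MR-Egger analysis would also orient every polymorphism to a consistent exposure allele before fitting; the hypothetical inputs above are already oriented that way.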

VII. Calculating Standardized Effect Estimates

To directly compare the effect of each genetic score or individual polymorphism on various outcomes measured per unit change in LDL-C, we adjusted the effect estimate for a standard decrement of 10 mg/dl in LDL-C using the usual ratio of effect estimates method. The same standardized effect estimate methods were applied using the individual participant data in the primary analyses, and using the summary-level data in external validation analyses.

Specifically, for dichotomous outcome variables, we multiplied the natural logarithm of the odds ratio (and the corresponding standard error) for the association with coronary events or diabetes, respectively, by the effect of that genetic score (or polymorphism) on LDL-C measured in units of 10 mg/dl. We then exponentiated this value to produce the adjusted odds ratio (and 95% confidence interval) for the association between that genetic score (or polymorphism) and the risk of cardiovascular events or diabetes per 10 mg/dl lower LDL-C. For continuous outcome variables (e.g. fasting glucose), we multiplied the effect estimate in measured units (and the corresponding standard error) by the effect of the genetic score (or polymorphism) on LDL-C measured in units of 10 mg/dl, to produce an adjusted estimate of the effect of the genetic score (or polymorphism) on the continuous outcome variable per 10 mg/dl lower LDL-C.

In sensitivity analyses, the standardized effect estimates were also calculated using linear regression analysis forced to pass through the origin (by leaving out the constant term in the regression equation) and coding the effect of each genetic score (or polymorphism) on LDL-C in units of 10 mg/dl (rather than mg/dl). These two methods provide numerically and computationally equivalent results.
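A minimal sketch of this standardization follows, reading the "ratio of effect estimates" as dividing the log odds ratio (and its standard error) by the LDL-C effect expressed in units of 10 mg/dl. That reading of the arithmetic, the function names, and the example inputs are assumptions made for illustration only.

```python
import math

Z_95 = 1.96  # normal quantile for a two-sided 95% confidence interval


def standardize_odds_ratio(log_or, se, ldl_lowering_mgdl):
    """Return the OR and 95% CI per 10 mg/dl lower LDL-C.

    log_or and se describe the association of a score (or polymorphism) with
    a dichotomous outcome; ldl_lowering_mgdl is the size of the associated
    LDL-C reduction in mg/dl (a positive number).
    """
    units = ldl_lowering_mgdl / 10.0          # LDL-C effect in 10 mg/dl units
    scaled, scaled_se = log_or / units, se / units
    ci = (math.exp(scaled - Z_95 * scaled_se), math.exp(scaled + Z_95 * scaled_se))
    return math.exp(scaled), ci


def standardize_continuous(effect, se, ldl_lowering_mgdl):
    """Same scaling for a continuous outcome; no exponentiation is needed."""
    units = ldl_lowering_mgdl / 10.0
    return effect / units, se / units


# Hypothetical example: an OR of 0.90 associated with 5 mg/dl lower LDL-C
# corresponds to 0.90 squared, i.e. 0.81, per 10 mg/dl lower LDL-C.
or_per_10, ci_per_10 = standardize_odds_ratio(math.log(0.90), 0.02, 5.0)
```

Because the log odds ratio and its standard error are divided by the same constant, the z-statistic and p-value of the association are unchanged by the standardization.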

Data Sources and Acknowledgements

Atherosclerosis Risk in Communities Study (ARIC) 14

dbGaP dataset reference: The datasets used for the analyses described in this manuscript were obtained from dbGaP at http://www.ncbi.nlm.nih.gov/sites/entrez?db=gap through dbGaP Study Accession: phs000280.v2.p1

The research reported in this article was supported by contract numbers HHSN268201100006C, HHSN268201100007C, HHSN268201100008C, HHSN268201100009C, HHSN268201100010C, HHSN268201100011C, and HHSN268201100012C; all from the National Heart, Lung, and Blood Institute; National Institutes of Health; Bethesda, MD, USA. A full list of principal ARIC investigators and institutions can be found at https://www2.cscc.unc.edu/aric/. This manuscript was not prepared in collaboration with ARIC investigators and does not necessarily reflect the opinions or views of ARIC, or the NHLBI.

Candidate Gene Association Resource (CARe): Support for the genotyping through the CARe Study was provided by NHLBI Contract N01-HC-65226.

GENEVA (Gene-Environment Association Studies): Support for the genotyping through the GENEVA Study was provided by the NIH GEI U01HG004438, U01HG04424, and HHSN268200782096C.

Cardiovascular Health Study (CHS) 15

dbGaP dataset reference: The datasets used for the analyses described in this manuscript were obtained from dbGaP at http://www.ncbi.nlm.nih.gov/sites/entrez?db=gap through dbGaP Study Accession: phs000287.v4.p1

The research reported in this article was supported by contract numbers N01-HC-85079, N01-HC-85080, N01-HC-85081, N01-HC-85082, N01-HC-85083, N01-HC-85084, N01-HC-85085, N01-HC-85086, N01-HC-35129, N01 HC-15103, N01 HC-55222, N01-HC-75150, N01-HC-45133, N01-HC-85239 and HHSN268201200036C; grant numbers U01 HL080295 from the National Heart, Lung, and Blood Institute and R01 AG-023629 from the National Institute on Aging, with additional contribution from the National Institute of Neurological Disorders and Stroke. A full list of principal CHS investigators and institutions can be found at http://www.chsnhlbi.org/pi.htm. This manuscript was not prepared in collaboration with CHS investigators and does not necessarily reflect the opinions or views of CHS, or the NHLBI.

CHS Candidate gene Association Resource (CARe): Support for the genotyping through the CARe Study was provided by NHLBI Contract N01-HC-65226. Support for the Cardiovascular Health Study Whole Genome Study was provided by NHLBI grant HL087652. Additional support for infrastructure was provided by HL105756 and additional genotyping among the African-American cohort was supported in part by HL085251. DNA handling and genotyping at Cedars-Sinai Medical Center was supported in part by National Center for Research Resources grant UL1RR033176, now at the National Center for Advancing Translational Technologies CTSI grant UL1TR000124; in addition to the National Institute of Diabetes and Digestive and Kidney Diseases grant DK063491 to the Southern California Diabetes Endocrinology Research Center.

The Framingham Heart Study (FHS) 16,17

dbGaP dataset reference: The datasets used for the analyses described in this manuscript were obtained from dbGaP at http://www.ncbi.nlm.nih.gov/sites/entrez?db=gap through dbGaP Study Accession: phs000007.v23.p8

The Framingham Heart Study is conducted and supported by the National Heart, Lung, and Blood Institute (NHLBI) in collaboration with Boston University (Contract No. N01-HC-25195). This manuscript was not prepared in collaboration with investigators of the Framingham Heart Study and does not necessarily reflect the opinions or views of the Framingham Heart Study, Boston University, or NHLBI.

FHS SNP Health Association Resource (SHARe): Funding for SHARe Affymetrix genotyping was provided by NHLBI Contract N02-HL-64278. SHARe Illumina genotyping was provided under an agreement between Illumina and Boston University.

Candidate gene Association Resource (CARe): Funding for CARe genotyping was provided by NHLBI Contract N01-HC-65226.

Multi-Ethnic Study of Atherosclerosis (MESA) 18

dbGaP dataset reference: The datasets used for the analyses described in this manuscript were obtained from dbGaP at http://www.ncbi.nlm.nih.gov/sites/entrez?db=gap through dbGaP Study Accession: phs000209.v12.p3

MESA and the MESA SHARe project are conducted and supported by the National Heart, Lung, and Blood Institute (NHLBI) in collaboration with MESA investigators. Support for MESA is provided by contracts N01-HC-95159, N01-HC-95160, N01-HC-95161, N01-HC-95162, N01-HC-95163, N01-HC-95164, N01-HC-95165, N01-HC-95166, N01-HC-95167, N01-HC-95168, N01-HC-95169 and CTSA UL1-RR-024156.

MESA SNP Health Association Resource (SHARe): Funding for SHARe genotyping was provided by NHLBI Contract N02-HL-64278. Genotyping was performed at Affymetrix (Santa Clara, California, USA) and the Broad Institute of Harvard and MIT (Boston, Massachusetts, USA) using the Affymetrix Genome-Wide Human SNP Array 6.0.

Candidate gene Association Resource (CARe): The MESA CARe data used for the analyses described in this manuscript were obtained through dbGaP (accession numbers). Funding for CARe genotyping was provided by NHLBI Contract N01-HC-65226.

Coronary Artery Risk Development in Young Adults (CARDIA) 19

dbGaP dataset reference: The datasets used for the analyses described in this manuscript were obtained from dbGaP at http://www.ncbi.nlm.nih.gov/sites/entrez?db=gap through dbGaP Study Accession: phs000285.v3.p2

The research reported in this article was supported by contract numbers HHSN268201100006C, HHSN268201100007C, HHSN268201100008C, HHSN268201100009C, HHSN268201100010C, HHSN268201100011C, and HHSN268201100012C; all from the National Heart, Lung, and Blood Institute; National Institutes of Health; Bethesda, MD, USA. A full list of principal CARDIA investigators and institutions can be found at http://www.cardia.dopm.uab.edu/. This manuscript was not prepared in collaboration with CARDIA investigators and does not necessarily reflect the opinions or views of CARDIA, or the NHLBI.

Candidate Gene Association Resource (CARe): Support for the genotyping through the CARe Study was provided by NHLBI Contract N01-HC-65226.

GENEVA (Gene-Environment Association Studies): Support for the genotyping through the GENEVA Study was provided by the NIH GEI U01HG004438, U01HG04424, and HHSN268200782096C.

Women's Health Initiative (WHI) 20

dbGaP dataset reference: The datasets used for the analyses described in this manuscript were obtained from dbGaP at http://www.ncbi.nlm.nih.gov/sites/entrez?db=gap through dbGaP Study Accession: phs000200.v9.p3

The WHI program is funded by the National Heart, Lung, and Blood Institute, National Institutes of Health, U.S. Department of Health and Human Services through contracts N01WH22110, 24152, 32100-2, 32105-6, 32108-9, 32111-13, 32115, 32118-32119, 32122, 42107-26, 42129-32, and 44221. This manuscript was not prepared in collaboration with investigators of the WHI, has not been reviewed and/or approved by the Women's Health Initiative (WHI), and does not necessarily reflect the opinions of the WHI investigators or the NHLBI.

PAGE: WHI PAGE is funded through the NHGRI Population Architecture Using Genomics and Epidemiology (PAGE) network (Grant Number U01 HG004790). Assistance with phenotype harmonization, SNP selection, data cleaning, meta-analyses, data management and dissemination, and general study coordination, was provided by the PAGE Coordinating Center (U01HG004801-01).

GARNET: Funding support for WHI GARNET was provided through the NHGRI Genomics and Randomized Trials Network (GARNET) (Grant Number U01 HG005152). Assistance with phenotype harmonization and genotype cleaning, as well as with general study coordination, was provided by the GARNET Coordinating Center (U01 HG005157). Assistance with data cleaning was provided by the National Center for Biotechnology Information. Funding support for genotyping, which was performed at the Broad Institute of MIT and Harvard, was provided by the NIH Genes, Environment and Health Initiative [GEI] (U01 HG004424).

SHARe: Funding for WHI SNP Health Association Resource (SHARe) genotyping was provided by NHLBI Contract N02-HL-64278.

Myocardial Infarction Genetics Consortium (MIGen) 7

dbGaP dataset reference: The datasets used for the analyses described in this manuscript were obtained from dbGaP at http://www.ncbi.nlm.nih.gov/sites/entrez?db=gap through dbGaP Study Accession: phs000294.v1.p1

Funding Source: R01 HL087676. National Institutes of Health, Bethesda, MD, USA

Wellcome Trust Case Control Consortium (WTCCC) 8

The principal funder of this project was the Wellcome Trust. Case collections were funded by: Arthritis Research Campaign, BDA Research, British Heart Foundation, British Hypertension Society, Diabetes UK, Glaxo-Smith Kline Research and Development, Juvenile Diabetes Research Foundation, National Association for Colitis and Crohn's disease, SHERT (The Scottish Hospitals Endowment Research Trust), St Bartholomew's and The Royal London Charitable Foundation, UK Medical Research Council, UK NHS R&D and the Wellcome Trust.

Global Lipids Genetics Consortium (GLGC) 4

Data on coronary artery disease / myocardial infarction have been contributed by Global Lipids Genetics Consortium investigators and have been downloaded from: www.sph.umich.edu/csg/abecasis/public/lipids2013/

CARDIoGRAMplusC4D Consortium 9,10

External validation study data on coronary artery disease / myocardial infarction have been contributed by CARDIoGRAMplusC4D investigators and have been downloaded from: www.cardiogramplusc4d.org

DIAbetes Genetics Replication And Meta-analysis (DIAGRAM) Consortium 11

External validation study data on diabetes have been contributed by DIAGRAM investigators and have been downloaded from: diagram-consortium.org/downloads.html

Meta-Analyses of Glucose and Insulin-related traits Consortium (MAGIC) 21-23

External validation data on glycaemic traits have been contributed by MAGIC investigators and have been downloaded from: www.magicinvestigators.org

The Genetic Investigation of ANthropometric Traits (GIANT) consortium 24-26

External validation data on anthropomorphic traits have been contributed by GIANT investigators and have been downloaded from: www.broadinstitute.org/collaboration/giant/index.php/giant_consortium_data_files

Additional Data Sources:

Data on SNP annotation, proxy search and estimates of pairwise linkage disequilibrium metrics were obtained from SNAP: SNP Annotation and Proxy Search Version 2.2 5 https://www.broadinstitute.org/mpg/snap/index.php

SNAP development is funded, in part, by the NHLBI CARe (Candidate Gene Resource) grant (N01-HC-65226) and by NHLBI's Framingham Heart Study (N01-HC-25195).

Data on genotyping platform specific SNP identification was obtained from NCBI dbSNP Human Build 141: http://www.ncbi.nlm.nih.gov/snp/

Figure S1: Design of the Naturally Randomized Trials

Panels: A. PCSK9 genetic score; B. HMGCR genetic score; C. 2x2 factorial analysis

Legend: A. We dichotomized the PCSK9 genetic score, and used this instrument to randomly allocate subjects into two approximately equal sized groups based on whether their PCSK9 genetic score was above or below the median so that our study would have the same structure as a randomized trial. B. Similarly, we dichotomized the HMGCR genetic score, and used this instrument to randomly allocate subjects into two approximately equal sized groups based on whether their HMGCR genetic score was above or below the median so that this part of the study would also have the same structure as a randomized trial. C. To conduct the 2x2 factorial analysis comparing the separate and combined effects of PCSK9 and HMGCR genetic scores, we first randomly allocated all subjects into two groups based on whether their HMGCR genetic score was above or below the median value and then randomly allocated subjects in these two groups into two further groups based on whether their PCSK9 genetic score was above or below the median value. This process had the effect of randomly allocating all subjects into one of 4 groups: the reference group (analogous to a placebo group), a group with lower LDL-C mediated by polymorphisms in the PCSK9 gene (analogous to treatment with a PCSK9 inhibitor), a group with lower LDL-C mediated by polymorphisms in the HMGCR gene (analogous to treatment with a statin), and a group with lower LDL-C mediated by the combined effect of polymorphisms in both genes (analogous to treatment with combination PCSK9 inhibitor and statin).
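The allocation described in the Figure S1 legend can be sketched as follows. This is an illustrative reading only: it assumes that a higher score means more LDL-C lowering alleles, that both medians are taken over the whole sample, and that subjects exactly at the median fall in the lower group. The function name, group labels, and example scores are hypothetical.

```python
from statistics import median

GROUP_LABELS = {
    (False, False): "reference",
    (True, False): "PCSK9 score above median",
    (False, True): "HMGCR score above median",
    (True, True): "both scores above median",
}


def allocate_2x2(scores):
    """Assign each subject to one of four groups.

    scores maps a subject id to a (pcsk9_score, hmgcr_score) pair.
    """
    pcsk9_median = median(p for p, _ in scores.values())
    hmgcr_median = median(h for _, h in scores.values())
    return {
        subject: GROUP_LABELS[(p > pcsk9_median, h > hmgcr_median)]
        for subject, (p, h) in scores.items()
    }


# Hypothetical scores for four subjects; both medians are 0.5 here.
example_scores = {"a": (0.1, 0.1), "b": (0.9, 0.2), "c": (0.2, 0.9), "d": (0.8, 0.8)}
groups = allocate_2x2(example_scores)
```

Because the two dichotomized scores are treated as independent allocations, each subject contributes to both the PCSK9 and the HMGCR comparison, as in a conventional 2x2 factorial trial.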

Figure S2: Effect of PCSK9 genetic score on the risk of cardiovascular events Legend: Boxes represent point estimates of effect. Lines represent 95% confidence intervals (CI). Please see text for details.

Figure S3: Log-linear association between lower LDL-C mediated by PCSK9 polymorphisms and the risk of coronary death or MI Legend: Plot of LDL-C effect size by proportional risk reduction. Boxes represent proportional risk reduction (1 - OR) for each exposure allele or genetic score plotted against the absolute magnitude of lower LDL-C associated with that allele or score, respectively. Vertical lines represent one SE above and below the point estimate of proportional risk reduction. SNPs and scores are plotted in order of increasing absolute magnitude of exposure to lower LDL-C. The line (which is forced to pass through the origin) represents the increase in proportional risk reduction of coronary death or MI per unit exposure to lower LDL-C (plotted on the log scale).

Figure S4: Effect of PCSK9 genetic score on coronary death and MI in selected subgroups Legend: Boxes represent point estimates of effect. Lines represent 95% confidence intervals (CI). Please see text for details.

Figure S5: Comparison of the effect of PCSK9 and HMGCR genetic scores on the risk of various cardiovascular outcomes adjusted per 10 mg/dl lower LDL-C Legend: Boxes represent point estimates of effect. Lines represent 95% confidence intervals (CI). Please see text for details.

Figure S6: Effect of PCSK9 genetic score on risk of cardiovascular disease in up to 62,240 case and 127,299 control subjects enrolled in the CARDIoGRAMplusC4D consortium studies adjusted per 10 mg/dl lower LDL-C

OR CHD (per 10 mg/dl lower LDL-C): 0.843 (0.805-0.882), p = 1.6 x 10^-14

Figure S7: Effect of HMGCR genetic score on risk of cardiovascular disease in up to 62,240 case and 127,299 control subjects enrolled in the CARDIoGRAMplusC4D consortium studies adjusted per 10 mg/dl lower LDL-C

OR CHD (per 10 mg/dl lower LDL-C): 0.844 (0.806-0.885), p = 3.2 x 10^-14

Figure S8: Comparison of effect of PCSK9 and HMGCR genetic scores on risk of cardiovascular events in up to 62,240 case and 127,299 control subjects enrolled in the CARDIoGRAMplusC4D consortium studies adjusted per 10 mg/dl lower LDL-C

Panels: A. Comparison of PCSK9 and HMGCR scores; B. Comparison of multiple genetic scores and polymorphisms involving lower LDL-C mediated through the common final pathway of the LDL receptor

Figure S9: PCSK9 genetic score MR-Egger Evaluation of Pleiotropic Effects in up to 62,240 case and 127,299 control subjects enrolled in the CARDIoGRAMplusC4D consortium studies Legend: The constant term in these regression analyses is a measure of the effect of the PCSK9 genetic score on the risk of coronary death or MI not related to its effect on the biomarker in the regression equation; and is therefore an estimate of the total pleiotropic effect of the PCSK9 genetic score not related to lower LDL-C.

Figure S10: HMGCR genetic score MR-Egger Evaluation of Pleiotropic Effects in up to 62,240 case and 127,299 control subjects enrolled in the CARDIoGRAMplusC4D consortium studies Legend: The constant term in these regression analyses is a measure of the effect of the HMGCR genetic score on the risk of coronary death or MI not related to its effect on the biomarker in the regression equation; and is therefore an estimate of the total pleiotropic effect of the HMGCR genetic score not related to lower LDL-C.

Figure S11: Comparison of LDLR, PCSK9, HMGCR and NPC1L1 genetic scores on risk of diabetes Legend: The LDL receptor (LDLR) genetic score includes 5 polymorphisms (rs6511720, rs8110695, rs1122608, rs688 and rs7188); and the NPC1L1 genetic score includes 5 polymorphisms (rs217386, rs2073547, rs7791240, rs10234070 and rs2300414). 6 The effect of each score on the risk of diabetes (combined prevalent or incident cases) was estimated using the individual participant data. Polymorphisms in the LDLR gene and genetic scores that mimic the effect of PCSK9 inhibitors, statins (HMGCR) and ezetimibe (NPC1L1), each of which lower LDL-C via a common final pathway involving the LDLR, appear to have very similar effects on the risk of diabetes per unit lower LDL-C.

Figure S12: Effect of PCSK9 genetic score on risk of diabetes in up to 86,197 subjects enrolled in the DIAGRAM consortium studies

OR DM (per 10 mg/dl lower LDL-C): 1.085 (1.023-1.150), p = 0.006

Figure S13: Effect of HMGCR genetic score on risk of diabetes in up to 181,111 subjects enrolled in DIAGRAM consortium and other studies

SNP | Sample Size (n) | OR (95% CI)
rs12916 | 151,329 | 1.09 (1.00, 1.18)
rs17238484 | 181,111 | 1.16 (1.03, 1.31)
rs5909 | 54,640 | 0.90 (0.71, 1.14)
rs2303152 | 73,794 | 0.86 (0.63, 1.16)
rs10066707 | 80,650 | 1.06 (0.91, 1.24)
Overall (I-squared = 31.7%, p = 0.210) | | 1.08 (1.02, 1.14)

OR DM (per 10 mg/dl lower LDL-C): 1.078 (1.016-1.144), p = 0.013

NB: data for rs12916 and rs17238484 include data from the DIAGRAM consortium supplemented by non-overlapping data from additional studies as described in the study by Swerdlow, et al (supplementary Table 10). 27
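As a rough check on how the overall row of Figure S13 relates to the per-SNP rows, the sketch below pools the printed odds ratios in a fixed-effects inverse variance-weighted meta-analysis and computes the I-squared heterogeneity statistic. Standard errors are recovered from the printed 95% confidence limits, assuming they are symmetric on the log scale. The inputs are rounded to two decimals, so the output can only approximate the published values, and the function name is hypothetical.

```python
import math

# Per-SNP odds ratios and 95% confidence limits as printed in Figure S13.
FIGURE_S13 = {
    "rs12916": (1.09, 1.00, 1.18),
    "rs17238484": (1.16, 1.03, 1.31),
    "rs5909": (0.90, 0.71, 1.14),
    "rs2303152": (0.86, 0.63, 1.16),
    "rs10066707": (1.06, 0.91, 1.24),
}


def pool_fixed_effects(rows):
    """Return (pooled OR, I-squared in percent) from (OR, lower, upper) rows."""
    log_ors, weights = [], []
    for odds_ratio, lower, upper in rows.values():
        # Width of a 95% CI on the log scale is 2 * 1.96 standard errors.
        se = (math.log(upper) - math.log(lower)) / (2 * 1.96)
        log_ors.append(math.log(odds_ratio))
        weights.append(1.0 / se ** 2)
    pooled = sum(w * b for w, b in zip(weights, log_ors)) / sum(weights)
    # Cochran's Q and the I-squared statistic derived from it.
    q = sum(w * (b - pooled) ** 2 for w, b in zip(weights, log_ors))
    df = len(log_ors) - 1
    i_squared = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    return math.exp(pooled), i_squared


pooled_or, i_squared = pool_fixed_effects(FIGURE_S13)
```

With these rounded inputs the pooled odds ratio lands close to the reported overall estimate of 1.08, and I-squared close to the reported 31.7%.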

Figure S14: Comparison of effect of PCSK9 and HMGCR genetic scores on risk of diabetes in up to 181,111 subjects enrolled in DIAGRAM consortium and other studies

Figure S15: Comparison of effect of unweighted PCSK9 and HMGCR scores on the risk of cardiovascular events and diabetes

Panels: A. Risk of coronary death or MI; B. Risk of Diabetes

Legend: For each study participant, unweighted PCSK9 and HMGCR genetic scores, respectively, were calculated by summing the number of LDL-C lowering alleles that person inherited at each polymorphism included in either score (without weighting by each polymorphism's effect on LDL-C).
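The difference between an unweighted and a weighted score can be shown in a short sketch. The allele counts and per-allele LDL-C effects below are hypothetical placeholders, not values from the study, and the function names are chosen for illustration only.

```python
def unweighted_score(allele_counts):
    """Sum of LDL-C lowering alleles (0, 1 or 2 per polymorphism)."""
    return sum(allele_counts.values())


def weighted_score(allele_counts, ldl_effect_per_allele):
    """Sum of allele counts, each weighted by that allele's effect on LDL-C."""
    return sum(
        count * ldl_effect_per_allele[snp] for snp, count in allele_counts.items()
    )


# Hypothetical participant: number of LDL-C lowering alleles at three
# polymorphisms, with hypothetical per-allele effects in mg/dl.
counts = {"rs12916": 2, "rs17238484": 1, "rs5909": 0}
effects = {"rs12916": 2.5, "rs17238484": 2.0, "rs5909": 1.5}

plain = unweighted_score(counts)
weighted = weighted_score(counts, effects)
```

The unweighted score treats every allele as equally important, so it can serve as a check that results do not depend on the particular LDL-C weights used.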

Table S1: Included studies and genotyping platforms Study Total No. Subjects Total No. Primary CV Events (Incident Cases) Total No. Cases of Diabetes (Incident Cases) Follow-up (Years) Included Genetic Sub-studies Genotyping Platforms Atherosclerosis Risk in Communities Study (ARIC) 15,676 1,590 (1,063) 2,794 (1,311) 16 phs000090 GENEVA_ARIC phs000557 ARIC_CARe Cardiovascular Health Study (CHS) 5,592 1,462 (964) 1318 (709) 14 phs000377 CARe Cardiovascular Health Study phs000226 STAMPEED: Cardiovascular Health Study The Framingham Heart Study (FHS): phs000342 Framingham SHARe phs000282 Framingham CARe Original Cohort Offspring Cohort Multi-Ethnic Study of Atherosclerosis (MESA) 5,209 5,124 416 (416) 407 (393) 464 (460) 486 (439) 8,296 206 (206) 1439 (736) 10 phs000420 MESA SHARe phs000283 MESA CARe Coronary Artery Risk Development in Young Adults (CARDIA) 3,622 21 (21) 99 (99) 15 phs000309 GENEVA_CARDIA phs000613 CARDIA_CARe Women's Health Initiative (WHI) 53,318 1,406 (917) 4035 (2541) 16 phs000386 WHI SHARe phs000315 WHI GARNET phs000227 PAGE WHI Myocardial Infarction Genetics Consortium (MIGen) ATVB FINRISK HARPS MALMO MGH-PCOD Wellcome Trust Case Control Consortium (WTCCC) REGICOR 3,361 339 1,064 185 464 629 1,693 167 505 86 204 312 N/A 54 32 Case-control Case-control Case-control Case-control Case-control Case-control phs000294 STAMPEED: Myocardial Infarction Genetics Consortium (MIGen) 5,002 1926 N/A Case-control 1958 British Birth Cohort controls (1504) UK National Blood Service controls (1500) Coronary Artery Disease (CAD) cases (1998) Affymetrix AFFY_6.0 Illumina CVDSNP55v1_A Illumina CVDSNP55v1_A Illumina HumanOmni1-Quad_v1-0_B Affymetrix HuGeneFocused 50K_Affy Affymetrix 500K Set (Mapping250K_Nsp and Mapping250K_Sty Arrays) Illumina CVDSNP55v1_A Affymetrix AFFY_6.0 Illumina CVDSNP55v1_A Affymetrix AFFY_6.0 Illumina CVDSNP55v1_A Affymetrix AFFY_6.0 Illumina HumanOmni1-Quad_v1-0_B Illumina Cardio- Metabo_Chip_11395247_A Affymetrix AFFY_6.0 Affymetrix 500K 
(Mapping250K_Nsp and Mapping250K_Sty Arrays)

Table S2: Consortia studies used for external validation

Trait | Consortium | No. Studies Included | No. Participants | No. Cases | Ethnicities Included | Link to the Data
Lipids | Global Lipids Genetics Consortium (GLGC) | 60 | 188,577 | | European | http://csg.sph.umich.edu/abecasis/public/lipids2013/
Coronary Heart Disease | CARDIoGRAMplusC4D Consortium (2013) Metabochip meta-analysis | 48 | 194,427 | 63,746 | European | http://www.cardiogramplusc4d.org/data-downloads/
Coronary Heart Disease | CARDIoGRAMplusC4D Consortium (2015) 1000 Genomes-based GWAS | 48 | 184,305 | 60,801 | Mixed | http://www.cardiogramplusc4d.org/data-downloads/
Type II Diabetes | DIAbetes Genetics Replication And Meta-analysis (DIAGRAM) Consortium | 38 | 94,742 | 22,669 | European | http://diagram-consortium.org/downloads.html
Metabolic Traits | Meta-Analyses of Glucose and Insulin-related traits Consortium (MAGIC) | 56 | 46,368 | | European | http://www.magicinvestigators.org/downloads
Anthropomorphic Traits | Genetic Investigation of ANthropometric Traits (GIANT) consortium | 125 | 339,224 | | European | http://www.broadinstitute.org/collaboration/giant/index.php/giant_consortium_data_files

Table S3: Baseline characteristics of study sample participants

Baseline Characteristic | Mean (SD or IQR)
Sample Size | 112,772
No. Included Studies | 14
Age (years) | 59.9 (±6.5)
Women (%) | 58.2
LDL-C (mg/dl) | 129.9 (±32.0)
HDL-C (mg/dl) | 52.3 (±15.4)
Triglycerides (mg/dl)* | 117.0 (85-162)
Total cholesterol (mg/dl) | 207.8 (±36.8)
Non-HDL-C (mg/dl) | 155.3 (±37.6)
Systolic blood pressure (mmHg) | 127.0 (±18.7)
Diastolic blood pressure (mmHg) | 75.2 (±10.2)
Weight (lbs) | 169.2 (±33.1)
Body mass index (kg/m²) | 27.7 (±5.2)
Prevalent diabetes (%) | 5.7
Prevalent cardiovascular disease (%) | 1.9
Ever smoker (%) | 54.3

Legend: Prevalent Diabetes and Prevalent Cardiovascular Disease refer to the percentage of participants with a history of diabetes (Prevalent Diabetes), or a history of myocardial infarction, stroke or coronary revascularization (Prevalent Cardiovascular Disease) at the time of enrollment into the included prospective cohort studies. *Triglycerides are given as median (inter-quartile range); all other continuous variables are given as mean (± standard deviation). Dichotomous variables are given as percentages. Values in the table represent weighted mean values of the baseline characteristics for the entire study sample, after combining study specific estimates in an inverse variance-weighted meta-analysis.