Comparing heritability estimates for twin studies
Mary Ellen Koran, Tricia Thornton-Wells, Bennett Landman
January 20, 2014
Outline
  Motivation
  Software for performing heritability analysis
  Simulations
  Twin Study (SOLAR & OpenMX)
  Family Study (SOLAR)
  Live Demo
  Conclusion
January 20, 2014, IIGC 2014 SOLAR Workshop
Heritability ($h^2$) crash course
  A first question to ask before embarking on a genetic study, BEFORE GENOTYPING: is the trait heritable?
  $h^2$ = degree of genetic association
  As $h^2 \to 1$, a trait is increasingly heritable
Cortical gray matter density is heritable
Cortical area thickness is heritable
Resting state fMRI connectivity is heritable
White matter microstructure is heritable
We are just getting started
  Software for estimating heritability is freely available and runs on commodity hardware.
  Heritabilities can be estimated using diverse family structures (not just twins).
  We are just beginning to understand the role of brain phenotypes with substantial degrees of heritability.
  Missing heritability
Choosing study designs
  Twin studies: $h^2 = 2(r_{MZ} - r_{DZ})$
  Family studies: variance components method
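As a rough illustration of the twin-study estimate, the sketch below applies Falconer's formula, $h^2 = 2(r_{MZ} - r_{DZ})$, to paired twin measurements. The function name `falconer_h2` and the toy data (within-pair correlations of 0.6 for MZ and 0.3 for DZ) are hypothetical and chosen only for illustration; they are not taken from the workshop simulations.

```python
import numpy as np

def falconer_h2(mz_pairs, dz_pairs):
    """Falconer estimate h2 = 2 * (r_MZ - r_DZ).

    mz_pairs, dz_pairs: arrays of shape (n_pairs, 2), one row per twin pair.
    """
    mz = np.asarray(mz_pairs, dtype=float)
    dz = np.asarray(dz_pairs, dtype=float)
    r_mz = np.corrcoef(mz[:, 0], mz[:, 1])[0, 1]  # within-pair correlation, MZ
    r_dz = np.corrcoef(dz[:, 0], dz[:, 1])[0, 1]  # within-pair correlation, DZ
    return 2.0 * (r_mz - r_dz)

# Toy data (assumed values): bivariate normal pairs with known correlations.
rng = np.random.default_rng(0)
mz = rng.multivariate_normal([0, 0], [[1.0, 0.6], [0.6, 1.0]], size=5000)
dz = rng.multivariate_normal([0, 0], [[1.0, 0.3], [0.3, 1.0]], size=5000)
h2_hat = falconer_h2(mz, dz)
print(f"Falconer h2 estimate: {h2_hat:.3f}")
```

With these toy correlations the estimate should land near 2 * (0.6 - 0.3) = 0.6, up to sampling noise.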
Estimating $h^2$ with variance components
  Family studies:
  $\sigma_T^2 = \sigma_G^2 + \sigma_E^2$ (trait variation = genetic + environment)
  $\sigma_G^2 = \sigma_a^2 + \sigma_d^2$ (genetic variation = additive + dominant)
  $\sigma_E^2 = \sigma_c^2 + \sigma_e^2$ (environmental variation = familial/household + random/individual)
  ACE model commonly used (dominance term omitted): $\sigma_T^2 = \sigma_a^2 + \sigma_c^2 + \sigma_e^2$
To wrap up: ACE model
  A = $\sigma_a^2 / (\sigma_a^2 + \sigma_c^2 + \sigma_e^2)$: additive genetics
  C = $\sigma_c^2 / (\sigma_a^2 + \sigma_c^2 + \sigma_e^2)$: common environmental effects
  E = $\sigma_e^2 / (\sigma_a^2 + \sigma_c^2 + \sigma_e^2)$: individual environmental effects or error
  $h^2 = A = \sigma_a^2 / \sigma_T^2$
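The ACE proportions are simple ratios of variance components. The sketch below computes them for a hypothetical set of components; the function name `ace_proportions` and the example values are assumptions for illustration only.

```python
def ace_proportions(var_a, var_c, var_e):
    """Return (A, C, E): each variance component divided by the total variance."""
    total = var_a + var_c + var_e
    if total <= 0:
        raise ValueError("total variance must be positive")
    return var_a / total, var_c / total, var_e / total

# Hypothetical variance components: additive = 2, common env. = 1, individual env. = 1
A, C, E = ace_proportions(2.0, 1.0, 1.0)
print(A, C, E)  # h2 equals A under the ACE model
```

By construction A + C + E = 1, so fixing any two proportions determines the third.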
Estimating Heritability
  SOLAR: Sequential Oligogenic Linkage Analysis Routines
    http://solar.txbiomedgenetics.org/
    Almasy, Blangero. Multipoint quantitative-trait linkage analysis in general pedigrees. Am J Hum Genet. 1998.
    Tcl-based
  OpenMx: structural equation modeling
    http://openmx.psyc.virginia.edu/
    Boker, Neale, et al. OpenMx: An Open Source Extended Structural Equation Modeling Framework. Psychometrika. 2011.
    R-based
SOLAR
  [landmaba@localhost twins]$ solar
  SOLAR version 7.3.2 (Experimental), last updated on December 31, 2013
  Copyright (c) 1995-2013 Texas Biomedical Research Institute
  Enter help for help, exit to exit, doc to browse documentation.
  solar> load pedigree ped2.ped
  solar> load phenotypes gen1.csv
  solar> model new
  solar> trait kindgen1
  solar> covar age
  solar> outdir /data/solar_mek/
  solar> polygenic
  Unloading current pedigree data...
  Loading pedigree data from the file /data/solar_mek/10_18_2013_openmx_solar/twins/pair2/ped2.ped...
  /data/solar_mek/10_18_2013_openmx_solar/twins/pair2/a_0/a_0_c_0/gen1.csv: ID age kindgen1 FAMID
  **********************************************************************
  * Maximize sporadic model *
Inputs to SOLAR
  Pedigree File
  Phenotype File
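To make the two inputs concrete, the sketch below builds a tiny comma-delimited pedigree file and a matching phenotype file as strings. The phenotype columns (ID, age, kindgen1, FAMID) follow the header shown in the SOLAR session in this deck. The pedigree field names (famid, id, fa, mo, sex, mztwin), the M/F sex coding, and the use of a shared `mztwin` tag to mark MZ pairs are assumptions based on common SOLAR usage and should be checked against the SOLAR documentation. The helper names and all values are hypothetical.

```python
import csv
import io

# Assumed SOLAR pedigree field names; verify against the SOLAR documentation.
PED_FIELDS = ["famid", "id", "fa", "mo", "sex", "mztwin"]
PHEN_FIELDS = ["ID", "age", "kindgen1", "FAMID"]  # header seen in the SOLAR session

def twin_family_rows(famid, mz):
    """Two founder parents plus one twin pair; MZ pairs share a non-blank tag."""
    fa, mo, t1, t2 = (f"{famid}0{k}" for k in range(1, 5))
    tag = famid if mz else ""
    return [
        dict(famid=famid, id=fa, fa="", mo="", sex="M", mztwin=""),
        dict(famid=famid, id=mo, fa="", mo="", sex="F", mztwin=""),
        dict(famid=famid, id=t1, fa=fa, mo=mo, sex="F", mztwin=tag),
        dict(famid=famid, id=t2, fa=fa, mo=mo, sex="F", mztwin=tag),
    ]

def to_csv(rows, fields):
    """Serialize a list of dicts to CSV text with a header row."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=fields, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()

# One MZ family and one DZ family (hypothetical IDs).
ped_rows = twin_family_rows("1", mz=True) + twin_family_rows("2", mz=False)
ped_csv = to_csv(ped_rows, PED_FIELDS)

# Phenotypes only for the twins (rows that list a father); values are placeholders.
phen_rows = [
    dict(ID=r["id"], age=30, kindgen1=0.0, FAMID=r["famid"])
    for r in ped_rows if r["fa"]
]
phen_csv = to_csv(phen_rows, PHEN_FIELDS)
print(ped_csv)
print(phen_csv)
```

Writing to strings keeps the sketch free of side effects; in practice the two texts would be saved to files such as the `ped2.ped` and `gen1.csv` loaded in the session.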
Output of SOLAR
  Pedigree: /data/solar_mek/livedemo/twins4/pair20/ped20.ped
  Phenotypes: /data/solar_mek/livedemo/twins4/pair20/a_50/a_50_c_0/gen10.csv
  Trait: kindgen10    Individuals: 80
  H2r is 0.4861110  p = 0.0016505  (Significant)
  H2r Std. Error: 0.1341583
  Proportion of Variance Due to All Final Covariates Is 0.0247895
  Loglikelihoods and chi's are in /data/solar_mek/livedemo/twins4/pair20/a_50/a_50_c_0/gen10out//polygenic.logs.out
  Best model is named poly and null0
  Final models are named poly, spor, nocovar
  Residual Kurtosis is 0.4459, within normal range
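When many simulated phenotype files are analyzed, the heritability estimate has to be pulled out of each text report. The sketch below extracts H2r, its p-value, and its standard error with regular expressions. The sample text copies the three values from the output shown on this slide; the exact whitespace of the real report is an assumption, so the patterns accept any run of spaces. The function name `parse_polygenic` is hypothetical.

```python
import re

# Lines copied from the slide; real files may differ in spacing (assumption).
SAMPLE = """H2r is 0.4861110  p = 0.0016505  (Significant)
H2r Std. Error:  0.1341583
"""

def parse_polygenic(text):
    """Extract H2r, p-value, and standard error from polygenic report text."""
    h2 = re.search(r"H2r is\s+([0-9.]+)\s+p = ([0-9.eE+-]+)", text)
    se = re.search(r"H2r Std\. Error:\s+([0-9.]+)", text)
    if h2 is None:
        raise ValueError("no H2r line found")
    return {
        "h2r": float(h2.group(1)),
        "p": float(h2.group(2)),
        "se": float(se.group(1)) if se else None,  # SE line may be absent
    }

result = parse_polygenic(SAMPLE)
print(result)
```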
OpenMX
Inputs to OpenMX
  Pedigree coded in SEM
  Phenotype variables
Outputs of OpenMX
  Results stored into R variables
Cross Platform Comparison
  SOLAR
    Closed source/black box
    Easily extendible to complex families
    Easier to specify complicated pedigrees
    Single-threaded execution
    Complete list of command descriptions
    Multi-platform (Tcl-based)
  OpenMX
    Open source
    Difficult to extend to complex families
    Requires user input for starting values of A, C, E
    Fails to converge more often than SOLAR
    Integrates with R's cluster processing
    Well managed, active forum, lots of help documents on a wiki
Why use simulation?
  Debug
  Performance assessment
  Method comparison
Why simulate?: Debug
  How do we know if the software is properly
    installed (documentation),
    configured (system/platform),
    used (user error)?
  Isolate errors, crashes, convergence issues
  Freely sharable design
  Reproducible results without large data exchange
  Commoditize heritability analysis
Why simulate?: Performance
  New studies operate on edge cases that may not have been anticipated by the designers.
    Are there coding bugs?
    Are there theoretical bugs?
    Are crashes due to poorly formatted data?
  Confirm that a generative model for the study is well understood.
  Produce empirical power curves for study planning.
  Assess estimator performance.
Alternative for Power Estimation I: h2power
  Purpose: Perform heritability power calculations
  This command performs a power calculation for the currently loaded pedigree, with the following default assumptions:
  (1) the trait to be studied is either quantitative or dichotomous (e.g. affected/unaffected)
  (2) the trait to be studied is influenced by additive genetics
  (3) all pedigree members will be phenotyped for the trait to be studied (unless the -data option is used to exclude those individuals who will not have phenotypic data; see the description of this option below)
  Also see power.
Alternative for Power Estimation II
  Theoretical power models are available using spectral properties of the pedigree structure.
  John Blangero, et al. Chapter One - A Kernel of Truth: Statistical Advances in Polygenic Variance Component Models for Complex Human Pedigrees. Advances in Genetics, Volume 81, 2013, 1-31.
Why simulate?: Comparison
  Simulations provide scalable access to many (virtual) subjects with known genetic association structure.
  Measure accuracy (bias) of heritability estimators against ground truth.
  Measure precision (variance) of heritability estimators.
  Mean Squared Error = Bias$^2$ + Variance.
  Quantify absolute and relative accuracy.
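The bias, variance, and mean squared error of an estimator can be computed directly from repeated estimates once the simulated (true) heritability is known. The sketch below is a minimal illustration; the function name `estimator_summary` and the four example estimates are hypothetical, not results from the workshop simulations.

```python
import numpy as np

def estimator_summary(estimates, truth):
    """Return (bias, variance, mse) of repeated estimates against a known truth."""
    est = np.asarray(estimates, dtype=float)
    bias = est.mean() - truth
    variance = est.var()  # population variance (ddof=0), so MSE = bias^2 + variance exactly
    mse = np.mean((est - truth) ** 2)
    return bias, variance, mse

# Hypothetical heritability estimates from four simulated data sets, true h2 = 0.5
bias, variance, mse = estimator_summary([0.45, 0.55, 0.40, 0.52], truth=0.5)
print(bias, variance, mse)
```

Using the population variance (rather than the sample variance with `ddof=1`) is a deliberate choice here, since it makes the decomposition MSE = Bias$^2$ + Variance hold exactly for the observed estimates.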
Modeling family structures
  MZ and DZ Twins
  Nuclear Families (Quartets)
  Grand-Nuclear Families (Octets)
Quantitative Trait Simulation
  Pedigrees created: 8 to 600 subjects
  A simulated: 0 to 0.95
  C simulated: 0, 0.3, 0.5, 0.7
  100 phenotype files created
  $y = X\beta + N(0,\ 2\Phi A + \gamma C + I E)$
    y = simulated phenotype
    X = simulated covariate (age)
    β = arbitrary coefficient (.005)
    N = noise dependent on A, C, E (specified by user), Φ, and γ
    Φ = kinship coefficient, i.e. half the expected fraction of genome shared between subjects (MZ = .5, DZ = .25, parent-child = .25)
    γ = assumed common environment between relationship pairs (DZ, MZ, siblings = 1)
    I = identity matrix
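The generative model $y = X\beta + N(0, 2\Phi A + \gamma C + I E)$ can be sketched for twin pairs as follows. This is a minimal illustration under stated assumptions, not the code used for the workshop: the function names, the age range of 20 to 60, the choice E = 1 - A - C (total variance fixed at 1), and the reading of the last covariance term as identity times E are all assumptions. The values β = .005 and Φ = .5 (MZ) or .25 (DZ) come from the slide.

```python
import numpy as np

def pair_covariance(phi, a, c, e, gamma=1.0):
    """2x2 covariance 2*Phi*A + gamma*C + I*E for one twin pair."""
    kin = np.array([[0.5, phi], [phi, 0.5]])         # self-kinship 0.5, pair kinship phi
    shared = np.array([[1.0, gamma], [gamma, 1.0]])  # common-environment sharing
    return 2.0 * kin * a + shared * c + np.eye(2) * e

def simulate_pairs(n_pairs, phi, a, c, beta=0.005, seed=0):
    """Draw phenotypes for n_pairs twin pairs; returns (y, age), each (n_pairs, 2)."""
    e = 1.0 - a - c  # assumption: variance components sum to 1
    if e < 0:
        raise ValueError("a + c must not exceed 1")
    rng = np.random.default_rng(seed)
    age = np.repeat(rng.uniform(20, 60, size=(n_pairs, 1)), 2, axis=1)  # twins share age
    noise = rng.multivariate_normal(np.zeros(2), pair_covariance(phi, a, c, e), size=n_pairs)
    return beta * age + noise, age

y_mz, age_mz = simulate_pairs(2000, phi=0.5, a=0.5, c=0.0)   # MZ pairs
y_dz, age_dz = simulate_pairs(2000, phi=0.25, a=0.5, c=0.0, seed=1)  # DZ pairs
print(np.corrcoef(y_mz.T)[0, 1], np.corrcoef(y_dz.T)[0, 1])
```

With A = 0.5 and C = 0, the within-pair covariance is A + C = 0.5 for MZ pairs and 0.5*A + C = 0.25 for DZ pairs, so the printed correlations should fall near those values.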
Example Pedigree Simulation
  Simulating family structures and phenotypes
  MZ twins
            Twin 1       Twin 2
    Twin 1  A + C + E    A + C
    Twin 2  A + C        A + C + E
  DZ twins
            Twin 1       Twin 2
    Twin 1  A + C + E    .5*A + C
    Twin 2  .5*A + C     A + C + E
Twin Study: Cross Platform Comparison
Cross Platform Comparison
  Simulated quantitative phenotype with the ACE model and a covariate of simulated age in twins
  Pedigrees created: 8 to 600 subjects, MZ : DZ = 1:1
  A simulated: 0 to 0.95
  C simulated: 0, 0.3, 0.5, 0.7
  100 phenotype files created
Family Study of Heritability Estimates with SOLAR
  [Figure: heritability (A) estimates; axes: simulated heritability (A) %, number of subjects]
  Pedigrees created: 8 to 600 subjects
  A simulated: 0 to 0.95
  C simulated: 0, 0.3, 0.5, 0.7
  100 phenotype files created
Cross Platform Comparison-Twin Study, C = 0
  [Figure: bias in heritability estimates (A) versus simulated heritability (A) %]
Cross Platform Comparison-Twin Study, C = 0
  [Figure: abs(Bias OpenMX) - abs(Bias SOLAR) versus heritability (A) %; blue = OpenMX more biased, red = SOLAR more biased]
Cross Platform Comparison-Twin Study, C = 0
  [Figure: heritability (A) estimates; panels: bias in A, variance in A; x-axis: simulated heritability (A) %]
Cross Platform Comparison-Twin Study, C = 0.30
  [Figure: heritability (A) estimates; panels: bias in A, variance in A; x-axis: simulated heritability (A) %]
Cross Platform Comparison-Twin Study, C = 0.50
  [Figure: heritability (A) estimates; panels: bias in A, variance in A; x-axis: simulated heritability (A) %]
Cross Platform Comparison-Twin Study, C = 0.70
  [Figure: heritability (A) estimates; panels: bias in A, variance in A; x-axis: simulated heritability (A) %]
Cross Platform Comparison-Twin Study
  [Figure: abs(Bias OpenMX) - abs(Bias SOLAR) versus simulated heritability (A) %; panels: C = 0, C = 0.30, C = 0.50, C = 0.70; blue = OpenMX more biased, red = SOLAR more biased]
Family Study of Heritability Estimates with SOLAR
Family Study of Heritability Estimates with SOLAR, C = 0
  [Figure: heritability (A) estimates; panels: bias in A, variance in A; x-axis: simulated heritability (A) %]
Family Study of Heritability Estimates with SOLAR, C = 0.30
  [Figure: heritability (A) estimates; panels: bias in A, variance in A; x-axis: simulated heritability (A) %]
Family Study of Heritability Estimates with SOLAR, C = 0.50
  [Figure: heritability (A) estimates; panels: bias in A, variance in A; x-axis: simulated heritability (A) %]
Family Study of Heritability Estimates with SOLAR, C = 0.70
  [Figure: heritability (A) estimates; panels: bias in A, variance in A; x-axis: simulated heritability (A) %]
Heritability Estimates with Family Data
  If you don't know the common environment contribution (C):
    nuclear, grand-nuclear family study in SOLAR
    twin study in OpenMX
  If you know C = 0:
    twin family study in SOLAR
  If you know C > 0:
    nuclear, grand-nuclear family study in SOLAR
    twin family study in OpenMX
Platform
  Scientific Linux 6.4, 64-bit OS
  SOLAR 7.3.2 (Experimental), last updated on December 31, 2013
  R 2.15.3 **NOT 3.X.X**
  OpenMX 1.3.10.12
Simulation Run Times
  Single core 1.6 GHz; Scientific Linux 6.4 with kernel 2.6.32; MacBook Pro with VMWare Fusion 7
  Simulation               OpenMX (including simulations)   SOLAR
  AE Twins                 155 min                          266 min
  ACE Twins                242 min                          382 min
  Nuclear Families         -                                612 min
  Grand Nuclear Families   -                                1803 min
Does any of this matter?
  $h^2$: 0.52 ± 0.11
Cross Platform Comparison-Twin Study, C = 0
  [Figure: bias in heritability estimates (A) versus simulated heritability (A) %]
Cross Platform Comparison-Twin Study, C = 0
  [Figure: abs(Bias OpenMX) - abs(Bias SOLAR) versus heritability (A) %; blue = OpenMX more biased, red = SOLAR more biased]
Side by Side Comparison
  [Figure: SOLAR and OpenMX side by side; scale ranges 0 to 0.8 and 0 to 0.01]
Commonality of the common effect
  Phenotype                                  A        C          Reference
  White matter tracts (whole brain FA)       ~0.5     Low        Kochunov, et al., Neuroimage, 2010
  Cortical thickness                         ~0.7     <0.10      Kremen, et al., Neuroimage, 2010
  Brain volume                               ~0.7     <0.05      Kremen, et al., Neuroimage, 2010
  White matter (neonatal regional metrics)   ~0-0.8   ~0.2-0.4   Geng, et al., Twin Res Hum Genet, 2012
e-science Objectives
  Provide a freely available platform for learning
    Virtual Machine available on NITRC
  Quantify differences between OpenMX and SOLAR
    Studies both AE and ACE models
  Illustrate integration with other analysis environments
    Currently, we are using R.
    In continuing work, integrate SOLAR's NIfTI support with pipeline platforms
Thank you.
  Neda Jahanshad, Peter Kochunov, Tom Nichols, Paul Thompson, John Blangero, David C. Glahn, and the ENIGMA DTI working group.
  MASI Lab, Fall 2013
  NIH/NIBIB R01 EB015611