Introduction of Genome-wide Complex Trait Analysis (GCTA)
Presenter: Yue Ming Chen
Location: Stat Gen Workshop
Date: 6/7/2013



Outline
- Brief review of quantitative genetics
- Overview of GCTA
  - Ideas
  - Main functions
- Practical: Estimating total heritability

Quantitative Genetics
- Quantitative traits
  - Phenotypes with continuous variation
  - Polygenic effects: the product of two or more genes and their environment
  - Do not follow patterns of Mendelian inheritance
- Quantitative trait locus (QTL)
  - Underlies quantitative traits
  - Many QTLs can be associated with a single trait
- Purpose of quantitative genetics
  - To study how quantitative traits are determined by genetic factors and their interaction with environmental factors

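Phenotype-Genotype Relationship
- Define the model: $P = G + E$
- Variance due to environment: $\sigma_E^2 = E[(P - E[P \mid G])^2]$
- The phenotypic variance:

$$
\begin{aligned}
\sigma_P^2 &= E[(P - E[P])^2] \\
           &= E[(P - E[P \mid G] + E[P \mid G] - E[P])^2] \\
           &= E[(P - E[P \mid G])^2] + E[(E[P \mid G] - E[P])^2] \\
           &= \sigma_E^2 + \sigma_G^2
\end{aligned}
$$

  where $\sigma_G^2 = E[(E[P \mid G] - E[P])^2]$ is the variance due to genotype. The cross term vanishes because $P - E[P \mid G]$ has mean zero given $G$.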
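The decomposition on this slide can be checked numerically. The sketch below is illustrative only: it assumes a discrete genotype, so that E[P | G] can be estimated by the mean phenotype within each genotype class, and it uses made-up toy data. The names `decompose_variance`, `genotypes` and `phenotypes` are hypothetical and are not part of GCTA.

```python
import random
from statistics import fmean


def decompose_variance(genotypes, phenotypes):
    """Split the phenotypic variance into genetic and environmental parts.

    Population (divide-by-n) variances are used so that the identity
    var_p == var_g + var_e holds up to floating-point rounding.
    """
    n = len(phenotypes)
    grand_mean = fmean(phenotypes)

    # Estimate E[P | G] by the mean phenotype within each genotype class.
    by_genotype = {}
    for g, p in zip(genotypes, phenotypes):
        by_genotype.setdefault(g, []).append(p)
    cond_mean = {g: fmean(values) for g, values in by_genotype.items()}

    var_p = sum((p - grand_mean) ** 2 for p in phenotypes) / n
    var_e = sum((p - cond_mean[g]) ** 2
                for g, p in zip(genotypes, phenotypes)) / n
    var_g = sum((cond_mean[g] - grand_mean) ** 2 for g in genotypes) / n
    return var_p, var_g, var_e


# Toy data (made up): genotype coded 0/1/2, phenotype = genotype effect + noise.
rng = random.Random(1)
genotypes = [rng.choice([0, 1, 2]) for _ in range(1000)]
phenotypes = [1.5 * g + rng.gauss(0.0, 1.0) for g in genotypes]

var_p, var_g, var_e = decompose_variance(genotypes, phenotypes)
print(var_p, var_g + var_e)  # the two values agree up to rounding
```

The within-class deviations sum to zero in every genotype class, which is why the cross term drops out of the sample decomposition as well.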

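Simple Linear Regression for a Quantitative Trait

$$ y_j = x_{ij} a_i + e_j $$

- $y_j$: phenotypic value of individual j
- $x_{ij}$: genotype of individual j at SNP i
- $a_i$: allele substitution effect of SNP i
- $e_j$: residual effect, follows $N(0, \sigma_e^2)$

Genotype coding:

$$
x_{ij} =
\begin{cases}
0 & \text{if } bb \\
1 & \text{if } Bb \\
2 & \text{if } BB
\end{cases}
$$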
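As a minimal sketch of how the allele substitution effect could be estimated for one SNP, the code below applies the 0/1/2 genotype coding and fits an ordinary least-squares line. The intercept is an assumption of this sketch (the slide does not show one), the data are made up and noise-free, and `fit_allele_effect` is a hypothetical helper, not a PLINK or GCTA function.

```python
# Genotype coding from the slide: number of copies of allele B.
CODING = {"bb": 0, "Bb": 1, "BB": 2}


def fit_allele_effect(x, y):
    """Least-squares fit of y on x; returns (intercept, slope).

    The slope plays the role of the allele substitution effect a_i.
    Fitting an intercept is an assumption made for this illustration.
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("genotype is monomorphic; slope is undefined")
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


# Made-up, noise-free example: each copy of B raises the phenotype by 0.5.
labels = ["bb", "Bb", "BB", "Bb", "bb", "BB"]
x = [CODING[label] for label in labels]
y = [10.0 + 0.5 * xi for xi in x]

intercept, a_hat = fit_allele_effect(x, y)
print(intercept, a_hat)
```

With real data the residual term makes the estimate noisy, and a GWAS repeats this fit once per SNP.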

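Heritability
- An alternative expression: $P_i = G_i + E_i + \varepsilon_i$
- Assume that G, E and ε are independent
- Variance of P: $\sigma_P^2 = \sigma_G^2 + \sigma_E^2 + \sigma_\varepsilon^2$
- Dividing both sides by $\sigma_P^2$: $1 = \sigma_G^2/\sigma_P^2 + \sigma_E^2/\sigma_P^2 + \sigma_\varepsilon^2/\sigma_P^2 = h^2 + e^2 + \sigma_\varepsilon^2/\sigma_P^2$
- Heritability: $h^2 = \sigma_G^2 / \sigma_P^2$
- Heritability is a population concept in that it deals with variation
- Heritability does not imply causation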
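The step of dividing both sides by the phenotypic variance can be illustrated with a few lines of code. This is only a sketch with made-up variance components; `variance_shares` is a hypothetical name, and independence of G, E and ε is assumed, as on the slide.

```python
def variance_shares(var_g, var_e, var_eps):
    """Return each component's share of the phenotypic variance.

    Assumes G, E and epsilon are independent, so their variances add.
    """
    var_p = var_g + var_e + var_eps
    if var_p <= 0:
        raise ValueError("phenotypic variance must be positive")
    return {
        "h2": var_g / var_p,         # heritability
        "e2": var_e / var_p,         # environmental share
        "residual": var_eps / var_p  # share left to the residual term
    }


# Made-up components: genetic 2.0, environmental 1.0, residual 1.0.
shares = variance_shares(2.0, 1.0, 1.0)
print(shares)
```

Because the shares are ratios of variances in a given population, the same trait can have different heritability in populations with different environmental variance.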

Missing heritability
- Study-confirmed SNPs explain only a small fraction of the heritability. Why?
- A study [1] suggests hidden rather than missing heritability
  - Many SNPs with small effects (infinitesimal model)
- GCTA [2] implements a method for estimating the proportion of phenotypic variance explained by genome-wide or chromosome-wide SNPs for complex traits

1. Yang et al., 2010, Nature Genetics
2. Yang et al., 2011, AJHG

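Statistical Framework of GCTA
- Fit the effects of all the SNPs as random effects in a mixed linear model:
  $y = X\beta + Wu + \varepsilon$, with $\operatorname{var}(y) = V = WW'\sigma_u^2 + I\sigma_\varepsilon^2$ and $u \sim N(0, I\sigma_u^2)$
- Define the variance explained by all SNPs: $\sigma_g^2 = N\sigma_u^2$
- Equivalent expression: $y = X\beta + g + \varepsilon$, with $V = A\sigma_g^2 + I\sigma_\varepsilon^2$
- $A = WW'/N$: the genetic relationship matrix (GRM) between individuals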
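To make the definition A = WW'/N concrete, here is a small pure-Python sketch. It assumes that W holds genotypes standardized by allele frequency, w_ij = (x_ij - 2p_i) / sqrt(2p_i(1 - p_i)), which the slide does not spell out, and that allele frequencies are estimated from the sample itself. The function name is hypothetical; this is not GCTA's implementation, which also handles missing genotypes and large data efficiently.

```python
from math import sqrt


def genetic_relationship_matrix(genotypes):
    """Compute A = W W' / N from a genotype table.

    genotypes[j][i] is the allele count (0, 1 or 2) of individual j at SNP i.
    Monomorphic SNPs are skipped because they cannot be standardized.
    """
    n_ind = len(genotypes)
    n_snp = len(genotypes[0])

    # Build the standardized genotype columns of W, one list per SNP.
    columns = []
    for i in range(n_snp):
        counts = [row[i] for row in genotypes]
        p = sum(counts) / (2 * n_ind)  # sample allele frequency
        if p <= 0.0 or p >= 1.0:
            continue
        sd = sqrt(2 * p * (1 - p))
        columns.append([(c - 2 * p) / sd for c in counts])

    n_used = len(columns)
    if n_used == 0:
        raise ValueError("no polymorphic SNPs available")

    # A[j][k] averages the products of standardized genotypes over SNPs.
    return [[sum(col[j] * col[k] for col in columns) / n_used
             for k in range(n_ind)]
            for j in range(n_ind)]


# Toy genotypes (made up): 3 individuals, 2 SNPs.
toy = [[0, 2],
       [1, 1],
       [2, 0]]
A = genetic_relationship_matrix(toy)
for row in A:
    print(row)
```

In a sample of unrelated individuals the off-diagonal entries of A are expected to be close to zero, which is what the QC step of removing close relatives aims for.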

Main Functionalities
- Estimate the genetic relationship from genome-wide SNPs
- Estimate the inbreeding coefficient from genome-wide SNPs
- Estimate the variance explained by SNPs on a single chromosome or the whole genome by REML
- Estimate the LD structure encompassing a list of target SNPs
- Simulate GWAS data based upon the observed genotype data
- Predict the genome-wide additive genetic effects for individual subjects and for individual SNPs

QC Cautions
- Remove close relatives
  - To minimize any confounding of shared environment with the GRM
- Control for ethnic principal components (PCs)
  - To minimize confounding of ethnicity with the GRM

Adapted from 2013 IBG slides of Keller and de Candia

Practical: Estimating total heritability

Workflow
- Data QC, use PLINK

    plink --noweb --bfile CDWTCCC --geno 0.0 --maf 0.05 --hwe 1e-3 --chr 10 --mind 0.0 --thin 0.1 --make-bed --out ../chr10_thin10/cdwtccc_chr10_thin10

  Options:
    --geno 0.0    keep SNPs with a 100% genotyping rate
    --maf 0.05    keep SNPs with a minor allele frequency of at least 5%
    --hwe 1e-3    keep SNPs with an HWE test p-value (in controls) of p > 0.001
    --chr 10      keep SNPs on chromosome 10
    --mind 0.0    exclude individuals with a genotyping rate of less than 100%
    --thin 0.1    keep a random 10% of SNPs

  Log:
    After frequency and genotyping pruning, there are 151 SNPs
    After filtering, 1748 cases, 2938 controls and 0 missing
    After filtering, 2126 males, 2560 females, and 0 of unspecified sex

  Write the SNP list file:

    plink --noweb --bfile CDWTCCC_chr10_thin10 --write-snplist

- Simulate a quantitative trait with a heritability of 0.5, use GCTA

    gcta64 --bfile CDWTCCC_chr10_thin10 --simu-qt --simu-causal-loci plink.snplist --simu-hsq 0.5 --out test

  Log:
    Simulation parameters:
    Number of simulation replicate(s) = 1 (Default = 1)
    Heritability = 0.5 (Default = 0.1)
    Simulated QTL effect(s) have been saved in [test.par].
    Simulating GWAS based on the real genotyped data with 1 replicate(s)...
    Simulated phenotypes of 4686 individuals have been saved in [test.phen].

  If the effect sizes are not specified in the file plink.snplist, they will be generated from a standard normal distribution.
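The idea behind simulating a trait with a target heritability can be sketched in a few lines. The sketch assumes that SNP effects are drawn from a standard normal distribution (as the slide notes for unspecified effect sizes) and that the residual variance is set to var(g)(1/h² - 1), so that var(g)/(var(g) + var(e)) equals the requested h². That scaling and the function name are assumptions of this illustration, not GCTA source code, and the genotypes are made-up random numbers standing in for standardized genotypes.

```python
import random
from statistics import pvariance


def simulate_quantitative_trait(std_genotypes, h2, seed=0):
    """Simulate phenotypes y_j = sum_i w_ij * u_i + e_j.

    std_genotypes[j][i] is the standardized genotype of individual j
    at causal SNP i. Returns (phenotypes, effects, var_g, var_e).
    """
    if not 0.0 < h2 <= 1.0:
        raise ValueError("h2 must be in (0, 1]")
    rng = random.Random(seed)
    n_snp = len(std_genotypes[0])

    # Causal effects from a standard normal distribution.
    effects = [rng.gauss(0.0, 1.0) for _ in range(n_snp)]
    genetic = [sum(w * u for w, u in zip(row, effects))
               for row in std_genotypes]

    # Scale the residual variance so the target heritability is met.
    var_g = pvariance(genetic)
    var_e = var_g * (1.0 / h2 - 1.0)
    phenotypes = [g + rng.gauss(0.0, var_e ** 0.5) for g in genetic]
    return phenotypes, effects, var_g, var_e


# Made-up stand-in for standardized genotypes: 200 individuals, 20 SNPs.
data_rng = random.Random(42)
std_genotypes = [[data_rng.gauss(0.0, 1.0) for _ in range(20)]
                 for _ in range(200)]

phenotypes, effects, var_g, var_e = simulate_quantitative_trait(
    std_genotypes, h2=0.5, seed=7)
print(var_g / (var_g + var_e))  # target heritability used for the scaling
```

The realized ratio of sample variances in any one replicate will differ slightly from the target because the residuals are random draws.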

Practical: Estimating total heritability

Workflow (cont'd)
- Estimate the GRM from all the SNPs

    gcta64 --bfile CDWTCCC_chr10_thin10 --make-grm --out test

- Estimate the phenotypic variance explained by the SNPs using the REML method

    gcta64 --reml --grm test --pheno test.phen --out test

The summary result of REML analysis

    Performing REML analysis ... (NOTE: may take hours depends on sample size).
    4686 observations, 1 fixed effect(s), and 2 variance component(s)(including residual variance).
    Calculating prior values by EM-REML ...
    Prior values updated from EM-REML: 779.63 768.955
    Running AI-REML algorithm ...
    Iter.  logL        V(G)       V(e)
    1      -18868.80   776.0151   767.48973
    2      -18868.76   769.64350  764.3893
    3      -18868.73   769.71337  764.33605
    4      -18868.73   769.71303  764.33616
    Log-likelihood ratio converged.

    Calculating the logLikelihood for the reduced model ...
    (variance component 1 is dropped from the model)
    Calculating prior values by EM-REML ...
    Prior values updated from EM-REML: 1565.8500
    Running AI-REML algorithm ...
    Iter.  logL        V(e)
    1      -19577.74   1565.8500
    Log-likelihood ratio converged.

    Summary result of REML analysis:
    Source    Variance     SE
    V(G)      769.713025   41.7114
    V(e)      764.336155   19.516501
    Vp        1534.049180  41.70414
    V(G)/Vp   0.501753     0.01678

    Covariance/Variance/Correlation Matrix:
    1740.65
    -191.155  380.894
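The summary table can be sanity-checked by hand: the phenotypic variance is the sum of the two variance components, and the SNP heritability estimate is their ratio. The short sketch below takes V(e) and Vp as printed in the summary and derives V(G) as their difference; the variable names are illustrative only.

```python
# Values copied from the REML summary (V(e) and Vp as printed).
v_e = 764.336155
v_p = 1534.049180

# Vp = V(G) + V(e), so the genetic component follows by subtraction.
v_g = v_p - v_e

# Proportion of phenotypic variance explained by the SNPs.
h2_snp = v_g / v_p

print(round(v_g, 6), round(h2_snp, 6))
```

The ratio is close to the heritability of 0.5 that was used to simulate the trait, which is the expected outcome when the causal SNPs are the same SNPs used to build the GRM.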