Genes: Mendelian Inheritance

Lecture 1: Mendelian Inheritance
Jurg Ott

Gregor Mendel, monk in a monastery in Brünn (now Brno in the Czech Republic)
Breeding experiments with the garden pea: flower color and seed shape (phenotypes) are determined by factors (now called genes) that are passed through generations.
He formulated two laws of inheritance that he thought were generally valid.

Mendel's Laws
First Law, Segregation of Characteristics: Of a pair of characteristics (e.g. blue and brown eye color), only one can be represented in a gamete, even though there are two genes in ordinary cells.
Second Law, Independent Assortment: For two characteristics, the genes are inherited independently. Today we make use of deviations from this law for statistical gene mapping.

Mendel's paper
Mendel GJ (1866) Versuche über Pflanzen-Hybriden. Verh Naturforsch Ver Brünn 4:3-47
Ironically, when Mendel's paper was published in 1866, it had little impact. It wasn't until the early 20th century that the significance of his ideas was realized.

Mendelian Inheritance
Trait due to a single gene.
Huntington disease (dominant). Mapped 1983 by Gusella et al.
Cystic fibrosis (recessive). Mapped 1985 by Lap-Chee Tsui et al.
LIPED computer program, Ott 1974.
[Figure: example pedigrees for dominant inheritance (genotype labels N/N, N/N) and recessive inheritance (genotype labels D/D, N/N)]

Familial Hypercholesterolemia
Schrott et al. (1972) Annals of Internal Medicine 76, 711-720
Alaska kindred with many affected individuals. Cholesterol level > 95th percentile of normal = affected.
Early analysis with the LIPED program showed mild evidence of linkage to the C3 polymorphism (Ott et al., 1974). Later confirmed by others.
This demonstrated the existence of a disease gene in the vicinity of C3 (chr. 19).
Work by Joe Goldstein and Michael Brown (Nobel Prize in 1985) identified the disease as a defect in the LDL receptor, located on chromosome 19.
Drugs (statins) have since been developed for lowering cholesterol levels.
[Figure: Alaska kindred with familial hypercholesterolemia]
X-Linked Inheritance
Female genotypes: as for autosomal genes.
Male genotypes: N/y and D/y (hemizygous).
[Figure: pedigree with XX (female) and XY (male) individuals]
Examples (usually recessive; mutation-selection!): hemophilia, red/green color blindness, Duchenne muscular dystrophy.

Genotype and Phenotype
Genotype = set of 2 alleles at a locus (gene) in an individual. Examples: A/G (marker alleles), N/D (disease alleles).
Haplotype = set of alleles, one each at different loci, inherited from one parent (on the same chromosome).
Diplotype = set of genotypes (genotype pattern).
Phenotype = what you see, the expression of this genotype. Examples: A/G (marker), affected (disease).

Relation between Genotype and Phenotype

Dominant, A > N
              Genotype
Phenotype     N/N   A/N   A/A
unaffected    1     0     0
affected      0     1     1

Recessive
              Genotype
Phenotype     N/N   A/N   A/A
unaffected    1     1     0
affected      0     0     1

Table entries = penetrances. Usually, only 1 line needed (affected).
Penetrance = conditional probability of phenotype given genotype.
For diseases: penetrance = probability of being affected given genotype.

ABO Blood Types
3 alleles: A, B, 0

              Genotype
Phenotype     A/A   A/B   A/0   B/B   B/0   0/0
A             1     0     1     0     0     0
B             0     0     0     1     1     0
AB            0     1     0     0     0     0
0             0     0     0     0     0     1

Conditioning on blood type: bottom sums = 1.

Hardy-Weinberg Equilibrium, HWE
p = frequency of A allele

                 Parent 2
Parent 1         A ($p$)       T ($1-p$)
A ($p$)          $p^2$         $p(1-p)$
T ($1-p$)        $(1-p)p$      $(1-p)^2$

Genotype      AA        AT            TT
Frequency     $p^2$     $2p(1-p)$     $(1-p)^2$

Independently formulated ~100 years ago (1908) by Hardy (mathematician) and Weinberg (physician).
Earlier, people thought that dominant diseases had to increase in frequency.
HWE holds for a large population in the absence of mutation, selection, etc.
Small populations: genetic drift (random walks).
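The Hardy-Weinberg table above can be reproduced in a few lines. The following is a minimal sketch in Python. The function name and the allele frequency p = 0.3 are illustrative assumptions, not values from the lecture.

```python
def hwe_genotype_frequencies(p):
    """Return HWE frequencies of genotypes AA, AT, TT.

    p is the population frequency of the A allele; 1 - p is that of T.
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be between 0 and 1")
    q = 1.0 - p
    # The heterozygote AT arises two ways (A from parent 1 or from parent 2).
    return {"AA": p * p, "AT": 2.0 * p * q, "TT": q * q}


# Hypothetical allele frequency, chosen only for illustration.
freqs = hwe_genotype_frequencies(0.3)
for genotype, freq in freqs.items():
    print(f"{genotype}: {freq:.4f}")
print(f"sum: {sum(freqs.values()):.4f}")
```

The three frequencies always sum to 1, since $p^2 + 2p(1-p) + (1-p)^2 = (p + (1-p))^2$.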
Generalized Mendelian Inheritance

Genotype      NN            DN            DD
Frequency*    $(1-p)^2$     $2p(1-p)$     $p^2$
Penetrance    $f_1$         $f_2$         $f_3$

* HWE assumed; p = population frequency of D allele
Prevalence = $(1-p)^2 f_1 + 2p(1-p) f_2 + p^2 f_3$

Penetrance: Cystic fibrosis
p = frequency of disease alleles = 0.025

Genotype      NN        DN        DD
Frequency     0.9506    0.0488    0.0006
Penetrance    0         0         1

Incidence = prevalence at birth = 0.0006 = 1/1600
Carrier frequency = 0.0488 ≈ 1/20

Age-dependent penetrance
Huntington disease

Age class     0-15    16-30   31-45   46-60   61+
Penetrance    0.02    0.33    0.58    0.71    0.94

[Figure: penetrance (up to 100%) as a function of age at onset]
Penetrance = proportion of susceptible individuals affected by a given age.

Torsion Dystonia
Median age of onset: 10 years
Penetrance at high age: 30%

Familial Breast Cancer, BRCA1
Newman et al. (1988) PNAS 85, 3044
Easton et al. (1993) Am J Hum Genet 52, 678

              P(affected by given age)
Age group     dd         Dd       DD
<30           0.00009    0.008    0.008
30-39         0.00146    0.083    0.083
40-49         0.0083     0.269    0.269
50-59         0.021      0.469    0.469
60-69         0.039      0.616    0.616
70-79         0.061      0.724    0.724
80+           0.082      0.801    0.801

[Figure: Breast Cancer Penetrances. Penetrance (0 to 1) by age group (<30 to 80+), plotted separately for genetic and non-genetic cases]

Cystic fibrosis
3 mating types
t = tested CF mutations, covering 80% of mutations
r = remaining mutations, 20%
Parents: each t/n or r/n
Counselee = unaffected child, negative for tested mutations. Carrier?
No genetic marker information.

               Mother
Father         t/n (0.8)    r/n (0.2)
t/n (0.8)      0.64         0.16
r/n (0.2)      0.16         0.04
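The counseling question set up above (is the unaffected, test-negative child a carrier?) can be worked through by enumerating parental mating types and the four equally likely transmissions. The sketch below assumes, as the pedigree labels suggest, that both parents are carriers and that the test detects every t mutation. The function and variable names are illustrative only.

```python
from fractions import Fraction
from itertools import product

# Assumption: each parent is a carrier, t/n with probability 0.8 (tested
# mutation) or r/n with probability 0.2 (remaining, untested mutation).
carrier_allele_prob = {"t": Fraction(8, 10), "r": Fraction(2, 10)}


def counselee_carrier_risk(allele_prob):
    """P(carrier | unaffected and negative for tested mutations)."""
    carrier = Fraction(0)     # P(unaffected, test-negative, carrier)
    consistent = Fraction(0)  # P(unaffected, test-negative)
    for (fa, p_fa), (mo, p_mo) in product(allele_prob.items(), repeat=2):
        mating_prob = p_fa * p_mo
        # Each parent transmits the mutant or the normal allele with prob. 1/2,
        # so each of the four child genotypes has conditional probability 1/4.
        for child in product((fa, "n"), (mo, "n")):
            joint = mating_prob * Fraction(1, 4)
            n_mutant = sum(allele != "n" for allele in child)
            if n_mutant == 2 or "t" in child:
                continue  # affected, or positive for a tested mutation
            consistent += joint
            if n_mutant == 1:
                carrier += joint
    return carrier / consistent


risk = counselee_carrier_risk(carrier_allele_prob)
print(risk, f"{float(risk):.1%}")
```

Using exact fractions avoids rounding, so the result can be compared directly with a hand calculation over the mating-type table.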
Cystic fibrosis: Calculations

Mating type    Probability    Counselee's genotype (each with conditional probability ¼)
t/n × t/n      0.64           t/t 0.16    t/n 0.16    t/n 0.16    n/n 0.16
t/n × r/n      0.32           t/r 0.08    t/n 0.08    r/n 0.08    n/n 0.08
r/n × r/n      0.04           r/r 0.01    r/n 0.01    r/n 0.01    n/n 0.01

An unaffected counselee who is negative for tested mutations must be r/n (carrier) or n/n.
Risk: $\frac{8+1+1}{8+1+1+16+8+1} = \frac{10}{35} = \frac{2}{7} \approx 29\%$

Heritability
Linear model for phenotype: $x = g + c + e$
Heritability = $\mathrm{Var}(g)/\mathrm{Var}(x)$
Gene-environment interactions:
  CCR5: no effect of mutation without infection
  Sickle cell anemia: heterozygote advantage in malaria
  Pima Indians: obesity, thrifty gene hypothesis
Measure degree of genetic influence by how consistently a trait runs in families.

Distribution of CCR5Δ32 in Europe
Limborska et al. (2002) Hum Hered 53, 49-54

Framingham Study
http://www.nhlbi.nih.gov/about/framingham/policies/pagetwelve.htm

Blood Pressure Variable                           Families    Subjects    Heritability
Systolic Blood Pressure, adjusted for age         238         2067        0.323 ± 0.043
Systolic Blood Pressure, adjusted for age, BMI    238         2064        0.339 ± 0.043

Lipid Variable                 Families    Subjects    Heritability
Total Cholesterol, adjusted    1366        4527        0.462 ± 0.034
HDL Cholesterol, adjusted      1366        4527        0.433 ± 0.034
Log Lp(a), adjusted            902         1832        0.805 ± 0.064
Log TG, adjusted               1366        4527        0.396 ± 0.033
TC / HDL Ratio, adjusted       1366        4527        0.410 ± 0.032
TG / HDL Ratio, adjusted       1366        4527        0.332 ± 0.031

Twin Concordance Rates
Complex diseases
Plomin et al. (1994) Science 264, 1734

Risch's Lambda
Risch (1990) Am J Hum Genet 46, 222-228
Risk, $R_r$ = Prob(relative of type r has trait, given index case has trait)
Risk ratio, $\lambda_r = R_r / R_{\text{unrelated}} = R_r / K$, where K = population prevalence
Most common: $\lambda_s$ = risk ratio for a sib
CF: $\lambda_s = \tfrac{1}{4} / 0.0006 \approx 417$
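The sib risk ratio for cystic fibrosis combines the prevalence formula from the Generalized Mendelian Inheritance slide with Risch's definition. The sketch below uses the CF values given in the lecture (p = 0.025, penetrances 0, 0, 1, sib recurrence risk ¼). The function names are illustrative assumptions.

```python
def prevalence(p, penetrances):
    """Population prevalence K under HWE.

    p is the frequency of the disease allele D; penetrances = (f1, f2, f3)
    for genotypes NN, DN, DD.
    """
    f1, f2, f3 = penetrances
    q = 1.0 - p
    return q * q * f1 + 2.0 * p * q * f2 + p * p * f3


def risk_ratio(recurrence_risk, population_prevalence):
    """Risch's lambda: risk to a relative divided by population prevalence."""
    return recurrence_risk / population_prevalence


# Cystic fibrosis: fully penetrant recessive, disease allele frequency 0.025.
K = prevalence(0.025, (0.0, 0.0, 1.0))
lambda_s_exact = risk_ratio(0.25, K)         # unrounded K = 0.000625
lambda_s_rounded = risk_ratio(0.25, 0.0006)  # K rounded as on the slide

print(f"K = {K:.6f}")
print(f"lambda_s (unrounded K) = {lambda_s_exact:.0f}")
print(f"lambda_s (K = 0.0006)  = {lambda_s_rounded:.0f}")
```

The slide's value of 417 follows from the rounded prevalence 0.0006; with the unrounded prevalence 0.000625 the ratio is 400. Either way, the sib risk ratio for a rare recessive disease is very large.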
Sib risk ratios for obesity
Price and Lee (2001) Hum Hered 51, 35-40
Risk ratios are higher when proband and sibling have high BMI, so severe obesity is more heritable than mild obesity.

Penetrance ~ Risk Ratio
Ott J (1994) Choice of genetic models for linkage analysis of psychiatric traits, in: Genetic approaches to mental disorders. E. S. Gershon and C. R. Cloninger (eds). Washington, DC, American Psychiatric Press: 63-75
Let A = affected with disease, G = risk genotype, g = non-risk genotype.
Epidemiology: $P(A \mid G)$ = disease risk of gene carriers, $P(A \mid g)$ = disease risk of non-carriers
Linkage analysis: $P(A \mid G)$ = penetrance for genetic cases, $P(A \mid g)$ = penetrance for phenocopies
$R = P(A \mid G) / P(A \mid g)$ = risk (penetrance) ratio
$R = 1$ is equivalent to phenotype unknown.

Quantitative Traits (QTLs)
Hartl & Clark (1997) Principles of Population Genetics
Hallmark of mendelian inheritance: mixture of distributions / bimodality.
[Figure: two component distributions, labeled "NN, DN" and "DD"]
Purely dominant trait: mean phenotype elevated or reduced.
Examples: cholesterol level, bone mineral density (osteoporosis).

Transformations
Many QTLs are not normally distributed (lower limit of 0).
Suitable family of power transformations: $x = (y^{\lambda} - 1)/\lambda + \lambda$
  $\lambda = 1$ for no transformation
  $\lambda = 0$ for log-transformation
  $\lambda = 1/2$ for square root transformation
y = original data, x = transformed (normalized) data

Analyzing Mixture of Distributions
Ott J (1979) Hum Genet 51, 79-91
http://www.jurgott.org/linkage/util.htm
Thode et al. (1988) Biometrics 44, 1195-1201
Use NOCOM (or other suitable) program to estimate mixture parameters.
Example with IRI (insulin resistance index; low values are indicative of disease):
[Figure: histogram of LISI; vertical axes: count and proportion per bar]

Related Individuals
Apply the NOCOM program as if individuals were unrelated.
Use the resulting parameter estimates as input to the ILINK program (LINKAGE package) to obtain proper ML estimates.
This is relatively cumbersome.
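The power transformation family from the Transformations slide can be sketched as follows. This is a minimal illustration and not the NOCOM implementation. The function name and sample values are assumptions, and the case λ = 0 is handled as the limit log(y).

```python
import math


def power_transform(y, lam):
    """Transform y by x = (y**lam - 1)/lam + lam.

    For lam = 0 the limit of the formula, log(y), is used.
    Assumes y > 0 (the trait has a lower limit of 0).
    """
    if y <= 0:
        raise ValueError("y must be positive")
    if lam == 0:
        return math.log(y)
    return (y ** lam - 1.0) / lam + lam


# Hypothetical trait values, chosen only for illustration.
values = (0.5, 1.0, 4.0)
for lam in (1.0, 0.5, 0.0):
    transformed = [round(power_transform(y, lam), 4) for y in values]
    print(f"lambda = {lam}: {transformed}")
```

With λ = 1 the formula returns y unchanged, which is why the extra additive term λ appears in the definition. With λ = ½ the result is a shifted and scaled square root, 2√y − 1.5.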
The Polygenic Threshold Model
Hartl & Clark 1997
Liability = underlying QTL
T = threshold for disease
$B_p$ = population prevalence
$\mu$ = mean liability, $\mu_s = f(T)/B_p$ = mean liability of affecteds
Lower panel: liability distribution of offspring with one parent affected; $B_o$ = proportion of affected offspring

Expression Level as Phenotype
Watts et al. (2002) Am J Hum Genet 71, 791-800
Ataxia telangiectasia (AT) = recessive trait
Heterozygotes are prone to other diseases.
Compare expression levels of 2880 genes on each of 10 cases (heterozygous for AT) and 10 controls (no AT allele).
Identified genes are likely to interact with the AT gene.

Genes Influencing Variability of Gene Expression
Morley ... Cheung (2004) Nature 430, 743-747
Used microarrays to measure gene expression levels.
Genome-wide linkage analysis for expression levels (= QTL) of 3,554 genes in 14 large families.
For ~1,000 expression phenotypes, significant linkage to specific chromosomal regions.
These regions harbor determinants for variation in human gene expression.
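The quantities of the polygenic threshold model can be illustrated numerically. The sketch below assumes that liability follows a standard normal distribution (mean 0, variance 1) and that f is the normal density; the lecture states neither explicitly. The prevalence of 1% and the function name are hypothetical.

```python
from statistics import NormalDist


def threshold_model(prevalence):
    """Return (T, mu_s) for a standard normal liability.

    T is the liability threshold such that P(liability > T) = prevalence.
    mu_s = f(T) / prevalence is the mean liability of affected individuals,
    with f the standard normal density (an assumption of this sketch).
    """
    if not 0.0 < prevalence < 1.0:
        raise ValueError("prevalence must be strictly between 0 and 1")
    liability = NormalDist(mu=0.0, sigma=1.0)
    T = liability.inv_cdf(1.0 - prevalence)
    mu_s = liability.pdf(T) / prevalence
    return T, mu_s


# Hypothetical population prevalence of 1%.
T, mu_s = threshold_model(0.01)
print(f"T = {T:.3f}, mean liability of affecteds = {mu_s:.3f}")
```

The mean liability of affected individuals always lies above the threshold, because only the upper tail of the distribution beyond T is affected.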