Welcome to the Genetic Code: An Overview of Basic Genetics. October 24, 2016, 12:00 pm to 3:00 pm
Course Schedule. 12:00 pm to 2:00 pm: Principles of Mendelian Genetics; Introduction to Genetics of Complex Disease. Jeremiah Scharf, M.D., Ph.D., Principal Research Faculty Member, Psychiatric and Neurodevelopmental Genetics Unit, Departments of Neurology and Psychiatry, Center for Human Genetic Research, MGH. 2:00 pm to 3:00 pm: Questions and Answers
Winter and Spring 2017 Upcoming Course Offerings: Responsible Conduct of Omics Research; Genomic and Precision Medicine; Next Generation Sequencing: An Introduction and Clinical Applications; Epigenetics Nanocourse; Bedside to Bench: Recognizing and studying patients that lead to new discoveries. Please visit our course catalog at learn.partners.org for more information.
Omics Consultations. The DCR's Omics Unit provides consultation for the MGH clinical research community. Assistance is provided in genetic study design and execution, human subject protection, career advice, and/or identification of particular resources. Requests are triaged by the Omics Unit and assigned to specific consultants depending on expertise and availability. To request a consult, visit the Division of Clinical Research site: http://www.massgeneral.org/research/dcr/default.aspx
GOAL: To provide a basic overview of genetic terms and concepts. Lecture 1: DNA and RNA structure and function, mutation, Mendelian inheritance, linkage, epigenetics. Lecture 2: DNA polymorphisms, SNPs, genetic association, GWAS, population variability, rare vs. common disease, brief overview of next-generation sequencing approaches. Questions and Answers (formal and audience participation).
Principles of Mendelian Genetics. Jeremiah M. Scharf, M.D., Ph.D. Genetics and Genomics Unit, MGH Division of Clinical Research; Departments of Neurology and Psychiatry; Psychiatric and Neurodevelopmental Genetics Unit, Center for Human Genetic Research, Massachusetts General Hospital
Genetic Information. Gene: the basic unit of genetic information. Genes determine inherited characteristics. A gene is a conceptual unit, defined by a function; the start/end of each gene is not always clear. Genome: the entire collection of genetic information in an organism. Chromosomes: storage units of genes. DNA: the chemical structure (DeoxyriboNucleic Acid) that contains genes. Image from http://www.accessexcellence.com/ab/gg/
Human Genome. Human cells contain 46 chromosomes: 2 sex chromosomes (X, Y), XY in males and XX in females, plus 22 pairs of chromosomes named autosomes. Relative size: Chr 1, 249 million bp (Mb) of DNA; Chr 21, 48 Mb. [Figure: karyotype]
Genetics vs. Genomics. Genetics is the study of heredity (inherited traits), typically at the scale of one (or a few) genes. Genomics is the study of the entire genome, typically at the scale of all genes working together: a systems-level approach (chromosomes, rearrangements, copy number variability, amino acid changes, small insertions/deletions, single DNA base changes, i.e., the sum of all genetic variation contributing to a phenotype).
DNA Structure
Adenine, Guanine (purines); Cytosine, Thymine, Uracil (pyrimidines). Base pairing between antiparallel strands: 5' TCGA 3' paired with 3' AGCT 5'
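The pairing shown on this slide (A with T, G with C, strands running in opposite directions) can be sketched in a few lines of Python. This is only an illustration; the function names `reverse_complement` and `transcribe` are made up for this example and are not from the lecture.

```python
# Watson-Crick pairing for DNA: A pairs with T, G pairs with C.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}

def reverse_complement(strand: str) -> str:
    """Return the partner strand, also written 5' to 3'."""
    return "".join(COMPLEMENT[base] for base in reversed(strand.upper()))

def transcribe(coding_strand: str) -> str:
    """RNA copy of the coding strand: uracil (U) replaces thymine (T)."""
    return coding_strand.upper().replace("T", "U")

# The slide's example, TCGA, happens to be its own reverse complement.
print(reverse_complement("TCGA"))
print(reverse_complement("ATGC"))
print(transcribe("ATGC"))
```

The slide's 3' AGCT 5' strand, read in the 5' to 3' direction, is again TCGA, which is why the first call returns its own input.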
Central Dogma of Genetics: DNA -> RNA -> Protein. DNA to RNA is transcription; RNA to protein is translation.
DNA is transcribed into Pre-mRNA
mRNA Splicing. Pre-mRNA contains introns and exons. Each intron begins with GT and ends with AG (exon-GT...AG-exon). Exons contain instructions for making protein. mRNA splicing removes introns.
Alternative Splicing Generates Diversity. Splicing mutations can lead to disease.
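As a toy illustration of intron removal, the sketch below writes exons in uppercase and introns in lowercase (a common notational convention, assumed here) and joins the exons after checking that each intron has the canonical gt...ag ends. Real splice-site recognition is far more involved; the function name `splice` and the example sequence are hypothetical.

```python
import re

def splice(pre_mrna: str) -> str:
    """Join exons (uppercase) after removing introns (lowercase).

    Raises ValueError if an intron lacks the canonical gt...ag ends.
    Sequences use the DNA alphabet, as on the slide.
    """
    for intron in re.findall(r"[a-z]+", pre_mrna):
        if not (intron.startswith("gt") and intron.endswith("ag")):
            raise ValueError(f"non-canonical intron: {intron}")
    # Dropping every lowercase run leaves the exons joined in order.
    return re.sub(r"[a-z]+", "", pre_mrna)

# Two exons separated by one intron (made-up sequence).
print(splice("ATGGCCgtaagtcccagGAATAA"))
```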
Building Proteins: Translation. mRNA is transported out of the nucleus and is translated into protein according to the genetic code. Translation begins at AUG (start methionine) and ends at a stop codon.
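The rule stated here (start at AUG, read three bases at a time, stop at a stop codon) can be sketched as follows. The codon table is the standard genetic code, built compactly from the conventional U, C, A, G ordering; the single-letter amino acid codes, the function name `translate`, and the example sequence are choices made for this illustration.

```python
from itertools import product

BASES = "UCAG"
# Standard genetic code in UCAG order; "*" marks a stop codon.
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

def translate(mrna: str) -> str:
    """Translate from the first AUG up to (not including) the first stop codon."""
    mrna = mrna.upper()
    start = mrna.find("AUG")
    if start == -1:
        return ""  # no start codon, nothing is translated
    protein = []
    for i in range(start, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)

# Codons read: AUG GCC GAA UAA -> Met, Ala, Glu, stop
print(translate("GGAUGGCCGAAUAAUU"))
```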
Genetic Code
Genetic Variation or Polymorphism
Single nucleotide polymorphisms (SNPs): 1 every few hundred bp
  TGCATTGCGTAGGC
  TGCATTCCGTAGGC
Short indels (= insertion/deletion): 1 every few thousand bp (kb)
  TGCATT---TAGGC
  TGCATTCCGTAGGC
Microsatellite (STR) repeat number: 1 every few kb
  TGCTCATCATCATCAGC
  TGCTCATCA------GC
Copy Number Variation (CNV)
- Thousands mapped
- Range in size from 100s of bp to 5 million bp (5 Mb)
- Deletions, duplications, inversions
- Vast majority of these are normal variation
- We all have ~100-150 CNVs in our genome
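The aligned sequence pairs on this slide can be compared programmatically. The sketch below walks two equal-length aligned sequences and reports single-base differences and gap runs. Which sequence counts as the reference is an arbitrary assumption made here, positions are 0-based, and the function name `describe_differences` is hypothetical.

```python
def describe_differences(ref: str, alt: str) -> list:
    """Compare two aligned, equal-length sequences; '-' marks a gap.

    Returns (kind, start, length) tuples with 0-based start positions.
    """
    if len(ref) != len(alt):
        raise ValueError("sequences must be aligned to the same length")
    events = []
    i = 0
    while i < len(ref):
        if ref[i] == alt[i]:
            i += 1
        elif "-" in (ref[i], alt[i]):
            start = i
            gap_in_alt = alt[i] == "-"
            gapped = alt if gap_in_alt else ref
            while i < len(ref) and gapped[i] == "-":
                i += 1
            kind = "deletion" if gap_in_alt else "insertion"
            events.append((kind, start, i - start))
        else:
            events.append(("SNP", i, 1))
            i += 1
    return events

print(describe_differences("TGCATTCCGTAGGC", "TGCATTGCGTAGGC"))        # one SNP
print(describe_differences("TGCATTCCGTAGGC", "TGCATT---TAGGC"))        # 3 bp deletion
print(describe_differences("TGCTCATCATCATCAGC", "TGCTCATCA------GC"))  # two TCA repeats fewer
```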
Mutation. Classically refers to any change in the DNA sequence (so all polymorphisms are mutations), but we typically refer to a DNA change as a mutation when it is not normal variation. Easiest to conceptualize in the context of amino-acid-changing variation, but we will see that this is just the tip of the iceberg.
Missense mutations (change an amino acid in the encoded protein). Warning! Not all amino-acid-altering mutations are pathogenic! Question: Is there a way to predict which missense mutations are deleterious? Answer: Sort of. Many prediction algorithms exist (PolyPhen-2, SIFT, CADD), though nothing's perfect. PolyPhen-2: Adzhubei et al., Nature Methods 2010; Curr Prot Hum Genet 2013
Nonsense mutations (aka Loss of Function (LoF) mutations). Question: Aren't all nonsense mutations deleterious? Answer: Not necessarily. Some genes are tolerant to Loss-of-Function (LoF) mutations. Every one of us has 150-200 LoF variants in a heterozygous state.
Insertions and deletions (aka indel mutations). Insertions of 3n nucleotides = in-frame insertion. Insertions that are not multiples of 3 nucleotides = out-of-frame or frameshift mutation. Can also create a frameshift mutation by destroying a canonical splice site (exon-GT...AG-exon).
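The multiple-of-three rule can be made concrete with a short sketch. The helper names `indel_effect` and `codons` and the example sequences are made up for illustration.

```python
def indel_effect(length_bp: int) -> str:
    """Classify a coding-sequence insertion or deletion by its length."""
    if length_bp <= 0:
        raise ValueError("length must be a positive number of base pairs")
    return "in-frame" if length_bp % 3 == 0 else "frameshift"

def codons(sequence: str) -> list:
    """Split a coding sequence into consecutive complete triplets."""
    return [sequence[i:i + 3] for i in range(0, len(sequence) - 2, 3)]

original = "ATGGCCGAATAA"
one_base_inserted = "ATGGACCGAATAA"    # extra A after position 4
three_bases_inserted = "ATGGCCAAAGAATAA"  # extra AAA codon

print(indel_effect(1), codons(one_base_inserted))     # every downstream codon changes
print(indel_effect(3), codons(three_bases_inserted))  # downstream codons are preserved
```

With the one-base insertion every triplet after the insertion point is different from the original, whereas the three-base insertion adds one codon and leaves the rest of the reading frame intact.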
Copy Number Variation (CNV): 16p13.11. Deletion (red): previously associated with Intellectual Disability (ID), seizures, autism. Duplication (blue): previously associated with ID, schizophrenia, epilepsy, autism, ADHD. UCSC Genome Browser: https://genome.ucsc.edu/. Lauren McGrath; McGrath et al., JAACAP 2014
Repeat expansion disorders: Huntington Disease; Myotonic Dystrophy; Fragile X Syndrome; FXTAS; Spinocerebellar Ataxias; FTD/ALS
Nomenclature of Genetic Variation. Locus: location of a gene/marker on the chromosome (plural = loci). Allele: one variant form of a gene/marker at a particular locus. Haplotype: set of alleles along a chromosome, e.g., A1-B2. Locus 1 possible alleles: A1, A2. Locus 2 possible alleles: B1, B2, B3.
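To make the haplotype idea concrete, the following sketch enumerates every haplotype that can be formed from the two loci on this slide by picking one allele per locus. It is a minimal illustration; the variable and function names are made up.

```python
from itertools import product

# Alleles per locus, taken from the slide.
LOCI = {
    "Locus1": ["A1", "A2"],
    "Locus2": ["B1", "B2", "B3"],
}

def possible_haplotypes(loci: dict) -> list:
    """One allele from each locus, joined in locus order."""
    return ["-".join(combo) for combo in product(*loci.values())]

haplotypes = possible_haplotypes(LOCI)
print(len(haplotypes), haplotypes)  # 2 x 3 = 6 combinations, including A1-B2
```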
Genotype vs. Phenotype. Genotype: the genetic makeup of an organism; DNA that is inherited. Phenotype: observed characteristics of an organism. Example: ABO blood typing. Gene: ABO transferase. Genotype: AO. Phenotype: blood type A.
Dominant vs. Recessive. A dominant allele is expressed even if it is paired with a recessive allele. A recessive allele is only visible when paired with another recessive allele.
Autosomal Recessive. The disease appears in male and female children of unaffected parents, e.g., cystic fibrosis.
Autosomal Dominant. Affected males and females appear in each generation of the pedigree. Affected mothers and fathers transmit the phenotype to both sons and daughters, e.g., Huntington disease.
X-linked Recessive. Many more males than females show the disorder. All the daughters of an affected male are carriers. None of the sons of an affected male show the disorder or are carriers, e.g., hemophilia, Duchenne Muscular Dystrophy.
X-linked Dominant. Affected males pass the disorder to all daughters but to none of their sons. Affected heterozygous females married to unaffected males pass the condition to half their sons and daughters, e.g., fragile X syndrome.
Codominant Inheritance. Two different versions (alleles) of a gene can be expressed, and each version makes a slightly different protein. Both alleles influence the genetic trait or determine the characteristics of the genetic condition, e.g., ABO locus.
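The autosomal recessive and autosomal dominant patterns can be checked with a small Punnett-square sketch. It assumes a single gene with two alleles, written "A" and "a"; which allele causes disease is stated in the comments for each example. The function name `cross` is hypothetical.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

def cross(parent1: str, parent2: str) -> dict:
    """Offspring genotype probabilities for a one-gene cross.

    Each parent is a two-letter genotype such as "Aa"; each parent
    passes on one of its two alleles with equal probability.
    """
    counts = Counter("".join(sorted(pair)) for pair in product(parent1, parent2))
    total = sum(counts.values())
    return {genotype: Fraction(n, total) for genotype, n in counts.items()}

# Recessive disease allele "a": two unaffected carriers (Aa x Aa).
print(cross("Aa", "Aa"))  # a quarter of children are aa (affected)

# Dominant disease allele "A": affected heterozygote x unaffected (Aa x aa).
print(cross("Aa", "aa"))  # half of children are Aa (affected)
```

Because the genes are autosomal, the same probabilities apply to sons and daughters, which matches the pedigree descriptions.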
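The X-linked recessive statements can be verified the same way. In this sketch "a" is the recessive disease allele and "A" the normal allele on the X chromosome; sons receive the father's Y and one maternal X, daughters receive the father's X and one maternal X. The function names are made up for illustration.

```python
def x_linked_cross(mother: str, father_x: str):
    """Return (daughters, sons) genotypes for an X-linked gene.

    mother: her two X alleles, e.g. "Aa"; father_x: his single X allele.
    """
    daughters = ["".join(sorted(m + father_x)) for m in mother]
    sons = [m + "Y" for m in mother]
    return daughters, sons

def status(child: str) -> str:
    """Classify a child under X-linked recessive inheritance."""
    if child.endswith("Y"):
        return "affected" if child[0] == "a" else "unaffected"
    return ("unaffected", "carrier", "affected")[child.count("a")]

# Affected father, non-carrier mother.
daughters, sons = x_linked_cross("AA", "a")
print([status(d) for d in daughters])  # all daughters are carriers
print([status(s) for s in sons])       # no son is affected

# Unaffected father, carrier mother.
daughters, sons = x_linked_cross("Aa", "A")
print([status(s) for s in sons])       # half of sons are affected
```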
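The ABO locus illustrates codominance and recessiveness together: A and B are codominant with each other, and O is recessive to both. A minimal sketch of the genotype-to-phenotype mapping follows; the function name `abo_blood_type` is hypothetical.

```python
def abo_blood_type(genotype: str) -> str:
    """Map a two-allele ABO genotype (e.g. "AO") to a blood type."""
    genotype = genotype.upper()
    if len(genotype) != 2 or not set(genotype) <= set("ABO"):
        raise ValueError("genotype must be two alleles from A, B, O")
    # O contributes no antigen; A and B are both expressed when present.
    antigens = sorted(set(genotype) - {"O"})
    return "".join(antigens) or "O"

for g in ("AA", "AO", "BO", "AB", "OO"):
    print(g, "->", abo_blood_type(g))
```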
Mitochondrial DNA. Maternally inherited; 16 kb; 37 genes; oxidative phosphorylation; 1000+ nuclear-encoded proteins transported to mitochondria.
Mitochondrial Inheritance. This type of inheritance applies to genes in mitochondrial DNA. Mitochondrial disorders can appear in every generation of a family and can affect both males and females, but fathers do not pass mitochondrial traits to their children, e.g., Leber's hereditary optic neuropathy (LHON).
Penetrance. The proportion of individuals carrying a particular variation of a gene (mutation or allele) that also express an associated trait (phenotype). If the penetrance of a disease is 95%, then 95% of the people with the mutation will express the trait, and 5% will not. Penetrance refers to the presence or absence of a trait (phenotype, disease). Complete penetrance vs. reduced penetrance. Will come back to this in the Q&A session.
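The 95% example works out as simple arithmetic. The sketch below estimates penetrance from counts and computes how many carriers are expected to show or not show the trait; the function names and the cohort size of 200 are made up for illustration.

```python
def penetrance(carriers_with_trait: int, carriers_total: int) -> float:
    """Proportion of mutation carriers who express the trait."""
    if carriers_total <= 0:
        raise ValueError("need at least one carrier")
    return carriers_with_trait / carriers_total

def expected_counts(carriers_total: int, pen: float):
    """Expected numbers of (expressing, non-expressing) carriers."""
    expressing = carriers_total * pen
    return expressing, carriers_total - expressing

p = penetrance(95, 100)
print(p)                        # 95 of 100 carriers express the trait
print(expected_counts(200, p))  # roughly 190 expressing, 10 not
```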
Variable Expressivity. Not the same as penetrance! Variation in phenotype in individuals with the same genotype. The same mutation can lead to mild disease in some individuals and severe disease in others: this is variable expressivity. Will come back to this in the Q&A session. Hint: difference between Genetics and Genomics.
Mendel's Laws. The law of segregation: alleles separate when gametes are formed. The law of independent assortment: two or more pairs of alleles segregate independently when gametes are formed. When pairs of alleles violate Mendel's second law, they are linked. Mendelian genetics or Mendelian disease = single-gene disorders.
Linkage is a measure of distance. [Figure: labels "1", "3", "No Linkage"]
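One common way to quantify linkage, not spelled out on the slide, is the recombination fraction: the share of offspring in which two loci have recombined. Values near 0 indicate tightly linked loci that lie close together, and values near 0.5 indicate no linkage (independent assortment). The sketch below uses made-up counts, and the 0.45 cutoff for calling loci unlinked is an arbitrary choice for illustration, not a statistical test.

```python
def recombination_fraction(recombinants: int, total_offspring: int) -> float:
    """Fraction of offspring that are recombinant for two loci."""
    if total_offspring <= 0 or not 0 <= recombinants <= total_offspring:
        raise ValueError("counts must satisfy 0 <= recombinants <= total, total > 0")
    return recombinants / total_offspring

def describe(theta: float, unlinked_cutoff: float = 0.45) -> str:
    """Rough verbal label; the cutoff is an illustrative assumption."""
    return "no evidence of linkage" if theta >= unlinked_cutoff else "linked"

for recombinants in (2, 20, 50):
    theta = recombination_fraction(recombinants, 100)
    print(recombinants, theta, describe(theta))
```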
How do we find Mendelian genes? Genetic linkage analysis: collect families, genotype (microsatellites, SNPs), and look for linkage of a DNA marker to the trait or disease of interest. Positional cloning ("old school").
Exome Sequencing. Capture and sequence all coding regions of the genome (exons): ~1% of the genome, 30 Mb of DNA. Early successes: very rare disorders; only need 2-3 patients for rare recessive disease; other types of disease are more complicated.
Exome Sequencing for Mendelian Disorders. Major breakthrough for rare Mendelian disorders. Works best for recessive traits. Relies heavily on successful filtering of variants: ultra-rare (unique) causal mutations (not in public databases**); well-annotated genes (correct prediction about loss-of-function**); little or no genetic heterogeneity across individuals. For common/complex traits, exome sequencing is much harder! **Caution in next few slides. Ng et al., Nature Genetics 2010
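The filtering logic described on this slide can be sketched for the simplest case, a rare recessive disorder with no genetic heterogeneity. Everything in the example is hypothetical: the `Variant` fields, the gene names, the list of consequences treated as potentially damaging, and the default of requiring absence from public databases. The sketch also ignores compound heterozygotes, which a real analysis would need to handle.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Variant:
    patient: str
    gene: str
    consequence: str           # e.g. "nonsense", "missense", "synonymous"
    database_frequency: float  # allele frequency in a public database; 0.0 if absent
    homozygous: bool

# Consequences treated as potentially damaging (an assumption for this sketch).
DAMAGING = {"nonsense", "frameshift", "splice_site", "missense"}

def candidate_genes(variants, max_frequency: float = 0.0) -> list:
    """Genes with a rare, damaging, homozygous variant in every patient."""
    patients = {v.patient for v in variants}
    hits = {}
    for v in variants:
        if (v.consequence in DAMAGING
                and v.database_frequency <= max_frequency
                and v.homozygous):
            hits.setdefault(v.gene, set()).add(v.patient)
    return sorted(gene for gene, seen in hits.items() if seen == patients)

variants = [
    Variant("P1", "GENE1", "nonsense", 0.0, True),
    Variant("P2", "GENE1", "frameshift", 0.0, True),
    Variant("P1", "GENE2", "missense", 0.12, True),     # too common
    Variant("P2", "GENE3", "synonymous", 0.0, True),    # not damaging
    Variant("P1", "GENE4", "nonsense", 0.0, False),     # heterozygous only
]
print(candidate_genes(variants))
```

Each filter mirrors one of the slide's assumptions, which is also why the approach breaks down when the causal variant is present in public databases, is mis-annotated, or differs in gene across patients.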
Despite remarkable advances in genomic sequencing technology, we still struggle to interpret the functional consequences of sequence changes.
Loss-of-function (LoF) mutations occur frequently in every human genome. Systematic analysis of 185 complete genomes and ~700 whole exomes confirms that every human has ~100 heterozygous LoF mutations. Which ones are relevant for disease? MacArthur & Tyler-Smith, Hum. Mol. Genetics, 2010; MacArthur et al., Science 2012
Exome Aggregation Consortium (ExAC). Leveraging population-level data to predict variant pathogenicity by assessing tolerance to functional variation. pLI: Probability of LoF Intolerance. Monkol Lek, Konrad Karczewski, Eric Minikel, Kaitlin Samocha, Mark Daly, Daniel MacArthur. Lek et al., Nature 2016
3 billion base pairs of DNA. "Beads on a string": nucleosome = DNA and histone proteins. DNA is packaged into chromosomes.
Epigenetics. Changes in phenotype or gene expression caused by mechanisms other than a change in the DNA sequence. Prader-Willi vs. Angelman: 15q deletion; no change in DNA sequence. If the deletion is inherited from dad: Prader-Willi (PW). If the deletion is inherited from mom: Angelman (AS). Methylation patterns differ in the region depending on maternal or paternal inheritance. First imprinting disease in humans.
Non-coding RNAs: microRNAs; long non-coding RNAs (lincRNA/lncRNA); small nucleolar RNAs (snoRNA). Esteller, Nature Reviews Genetics, 2011
Questions? Brief Break!