Introduction to genetic variation. He Zhang Bioinformatics Core Facility 6/22/2016

Outline
- Basic concepts of genetic variation
- Genetic variation in human populations
- Variation and genetic disorders
- Databases and resources

Human genetic variation
- Genetic variation is the set of genetic differences both within and among populations.
- No two humans, including monozygotic twins, are genetically identical.
- On average, in terms of DNA sequence, any two humans are more than 99% similar.

Primary sources of genetic variation
- Random mutations are the ultimate source of genetic variation.
  - DNA fails to copy accurately.
  - Mutations are induced by chemicals or radiation.
- Crossing over and random segregation during meiosis can result in the production of new alleles or new combinations of alleles.

Hereditary mutations (inherited mutations)
- Hereditary mutations are inherited from a parent and are usually present throughout a person's life in virtually every cell in the body.
- These mutations are also called germline mutations because they are present in the parent's egg or sperm cells, which are also called germ cells.

Acquired mutations
- Acquired mutations may occur relatively early in development or at any later time throughout the lifespan, and generally affect fewer cells.
- These changes can be caused by environmental factors such as ultraviolet radiation from the sun, or can occur if a mistake is made as DNA copies itself during cell division.

Acquired mutations
- Acquired mutations in somatic cells (cells other than sperm and egg cells) cannot be passed on to the next generation.
(Donald Freed et al., 2014)

Acquired mutations can be inherited in some cases
- Acquired mutations may occur at an early stage of development and affect both germ cells and somatic cells.
- In some cases, an acquired mutation occurs in a person's egg or sperm cell but is not present in any of the person's other cells.
- In other cases, the mutation occurs in the fertilized egg shortly after the egg and sperm cells unite.
(Donald Freed et al., 2014)

Mosaic mutations
- Acquired mutations that happen in a single cell during embryonic development can lead to a situation called mosaicism.
- These genetic changes are not present in a parent's egg or sperm cells, or in the fertilized egg, but happen a bit later, when the embryo includes several cells.
- As all the cells divide during growth and development, cells that arise from the cell with the altered gene will have the mutation, while other cells will not.

De novo mutations
- De novo mutations are operationally defined as genotypes observed in a child but not in either parent.
- They may originate in a parental germ cell or postzygotically.
(Donald Freed et al., 2014)

Mutation rate in human genome
- The overall error rate of DNA polymerase is 10^-8 per base pair.
- Repair enzymes fix 99% of these lesions, for an overall error rate of 10^-10 per bp.
- The mutation rate in some somatic cells can reach 1.06 x 10^-6 per bp (David Araten et al., 2005).
- ~40 germline de novo mutations arise per generation (Donald Conrad et al., 2011).
- ~1,500 non-germline de novo mutations were found per person (Donald Conrad et al., 2011).
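The relationship between the two polymerase rates on this slide is simple arithmetic, and a short sketch makes it explicit. The variable names are illustrative and not from the slides; the numbers are the ones quoted above.

```python
import math

# Figures quoted on the slide; the variable names are illustrative.
polymerase_error_rate = 1e-8   # errors per base pair copied
repair_fraction = 0.99         # share of errors fixed by repair enzymes

# Errors that escape repair: raw rate times the unrepaired fraction.
net_error_rate = polymerase_error_rate * (1 - repair_fraction)

print(f"{net_error_rate:.1e} errors per bp")
```

Removing 99% of the errors leaves 1% of them, which is why the rate drops by two orders of magnitude.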

Types of genetic variation
- Single Nucleotide Polymorphism (SNP)
- Insertion/Deletion (Indel)
- Copy Number Variation (CNV)
- Rearrangement
  - Inversion
  - Translocation
  - Segmental duplication
- Numerical variation
  - Polyploidy
  - Aneuploidy

Single Nucleotide Polymorphism (SNP)
- A SNP is a variation at a specific position in the genome in which a single nucleotide is exchanged for another.
- Transitions: replacement of a purine base with another purine, or of a pyrimidine with another pyrimidine (A <-> G, C <-> T).
- Transversions: replacement of a purine with a pyrimidine or vice versa (A <-> C, A <-> T, C <-> G, G <-> T).
- ts/tv ~ 2 for the whole genome; ts/tv ~ 3 for the whole exome.
(Figure legend: α = transition, β = transversion)
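The transition/transversion rule above can be sketched in a few lines. This is a minimal illustration, not a tool from the slides: the function names and the toy list of substitutions are hypothetical.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def classify_snp(ref: str, alt: str) -> str:
    """Return 'transition' or 'transversion' for a single-base substitution."""
    ref, alt = ref.upper(), alt.upper()
    bases = PURINES | PYRIMIDINES
    if ref not in bases or alt not in bases or ref == alt:
        raise ValueError(f"not a valid SNP: {ref}>{alt}")
    # A transition stays within one chemical class (purine or pyrimidine).
    same_class = {ref, alt} <= PURINES or {ref, alt} <= PYRIMIDINES
    return "transition" if same_class else "transversion"


def ts_tv_ratio(snps):
    """Compute the transition/transversion ratio for (ref, alt) pairs."""
    kinds = [classify_snp(ref, alt) for ref, alt in snps]
    ts = kinds.count("transition")
    tv = kinds.count("transversion")
    if tv == 0:
        raise ValueError("no transversions; ratio is undefined")
    return ts / tv


# Hypothetical toy call set, not real data: 4 transitions, 2 transversions.
example_snps = [("A", "G"), ("C", "T"), ("G", "A"), ("T", "C"),
                ("A", "C"), ("G", "T")]
print(ts_tv_ratio(example_snps))
```

In practice the ratio is computed over all SNP calls in a sample and compared with the expected values quoted above as a rough quality check.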

Effects of SNPs in coding sequence
- Silent mutation: changes a codon but does not affect the protein sequence. Different codons can still lead to different protein expression levels.
- Missense mutation: changes a codon so that it encodes a different amino acid.
- Nonsense mutation: converts an amino acid codon into a termination codon. The protein is shortened because the premature stop codon interrupts its normal code.
- Read-through mutation: changes a stop codon to a sense codon.
- Splice site mutation: results in one or more introns remaining in the mature mRNA and may lead to the production of abnormal proteins.
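The first four categories above depend only on how the codon translates, so they can be sketched with the standard genetic code. This is a minimal illustration under that assumption; the function name is hypothetical, and splice site effects are not covered because they cannot be read from a codon alone.

```python
from itertools import product

# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ...
# ('*' marks a stop codon).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def classify_codon_change(ref_codon: str, alt_codon: str) -> str:
    """Classify a coding substitution by its effect on translation."""
    ref_aa = CODON_TABLE[ref_codon.upper()]
    alt_aa = CODON_TABLE[alt_codon.upper()]
    if ref_aa == alt_aa:
        return "silent"        # same amino acid (or stop stays stop)
    if alt_aa == "*":
        return "nonsense"      # amino acid codon becomes a stop codon
    if ref_aa == "*":
        return "read-through"  # stop codon becomes a sense codon
    return "missense"          # different amino acid


print(classify_codon_change("GAA", "GAG"))  # Glu -> Glu
print(classify_codon_change("GAG", "GTG"))  # Glu -> Val
print(classify_codon_change("CAG", "TAG"))  # Gln -> stop
print(classify_codon_change("TAA", "CAA"))  # stop -> Gln
```

The codons are written as DNA (T rather than U) to match how variants are usually reported.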

Insertion/Deletion (Indel)
- Insertions add one or more extra nucleotides into the DNA.
- Deletions remove one or more nucleotides from the DNA.
- Indels are usually caused by transposable elements, or by errors during replication of repeating elements.

Effect of indels in coding sequence
- Reading-frame shift: occurs when the number of nucleotides inserted or deleted in a coding sequence is not divisible by three. The message in the gene is then no longer correctly parsed.
- Insertion or deletion of one or more amino acids
- Altered splicing of the mRNA
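The divisible-by-three rule can be checked directly from the reference and alternate alleles. The sketch below is illustrative only; the function name and the example alleles are hypothetical.

```python
def indel_effect(ref_allele: str, alt_allele: str) -> str:
    """Classify a coding indel by whether it preserves the reading frame."""
    length_change = len(alt_allele) - len(ref_allele)
    if length_change == 0:
        raise ValueError("alleles have equal length; not an indel")
    kind = "insertion" if length_change > 0 else "deletion"
    # Codons are three bases long, so only multiples of three keep the frame.
    if length_change % 3 == 0:
        return f"in-frame {kind}"
    return f"frameshift {kind}"


print(indel_effect("A", "ATTT"))  # three bases inserted
print(indel_effect("ACG", "A"))   # two bases deleted
```

An in-frame indel adds or removes whole amino acids, while a frameshift changes every codon downstream of the variant.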

Inversion
- An inversion is a chromosome rearrangement in which a segment of a chromosome is reversed end to end.
- Inversions do not change the overall amount of genetic material.

Effect of inversions (figures from An Introduction to Genetic Analysis, 7th edition)

Translocation
- A translocation is a chromosome abnormality caused by rearrangement of parts between nonhomologous chromosomes.
(Braude P et al., 2002)

Effect of translocations
- Balanced translocation: an even exchange of material with no genetic information extra or missing, and ideally full functionality.
- Unbalanced translocation: the exchange of chromosome material is unequal, resulting in extra or missing genes.

Unbalanced translocation

Copy Number Variation (CNV)
- A copy-number variation (CNV) is a difference in the genome caused by deletion or duplication of large regions of DNA on a chromosome.
- Duplications lead to multiple copies of the affected chromosomal regions, increasing the dosage of the genes located within them.
- Deletions of large chromosomal regions lead to loss of the genes within those regions.
- Recent research indicates that approximately two thirds of the entire human genome is composed of repeats, and 4.8-9.5% of the human genome can be classified as copy number variations (Mehdi Zarrei et al., 2015).

Polyploidy and aneuploidy
- Polyploidy refers to a numerical change in a whole set of chromosomes.
  - Polyploidy occurs in humans in the form of triploidy, with 69 chromosomes, and tetraploidy, with 92 chromosomes.
- Aneuploidy refers to a numerical change in part of the chromosome set.
  - Counts of 45 or 47 chromosomes are common aneuploidies found in humans.
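The distinction between the two terms comes down to whether the chromosome count is a whole multiple of the haploid set of 23. The sketch below is a deliberate simplification with hypothetical names: it looks only at the total count, so it cannot tell which chromosome is extra or missing.

```python
HAPLOID_SET = 23  # chromosomes in one human haploid set


def classify_chromosome_count(count: int) -> str:
    """Label a human chromosome count by whole-set multiples of 23."""
    if count <= 0:
        raise ValueError("chromosome count must be positive")
    sets, remainder = divmod(count, HAPLOID_SET)
    if remainder == 0:
        names = {1: "haploid", 2: "diploid", 3: "triploid", 4: "tetraploid"}
        return names.get(sets, f"{sets}-ploid")
    # Not a whole number of sets: part of the chromosome set has changed.
    return "aneuploid"


for n in (46, 69, 92, 45, 47):
    print(n, classify_chromosome_count(n))
```

A count of 46 is reported as diploid here even though a real karyotype with 46 chromosomes can still carry structural changes.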

Genetic variations in human populations

Human reference genome
- The human genome is the complete set of nucleic acid sequences for humans (Homo sapiens).
- Haploid human genome:
  - 22 autosomes
  - X chromosome
  - Y chromosome

Human reference genome
- The human reference genome does not correspond to any actual human individual.
- The Genome Reference Consortium human genome build 37 (GRCh37) is a mosaic haploid genome derived from 13 anonymous volunteers.
  - One male accounts for 66% of the total.
- The latest human reference genome (GRCh38) integrated whole-genome sequencing data from other projects to improve completeness, but still has gaps covering ~5% of the genome.

How many variants in human genomes
- A typical genome differs from the reference human genome at 4.1 million to 5.0 million sites.
- More than 99.9% of variants are SNPs and short indels.
- A typical genome contains 2,100 to 2,500 structural variants, affecting ~20 million bases of sequence.

                 AFR     AMR     EAS     EUR     SAS
Samples          661     347     504     503     489
Mean coverage    8.2     7.6     7.7     7.4     8
SNPs             4.31M   3.64M   3.55M   3.53M   3.60M
Indels           625k    557k    546k    546k    556k
Large deletions  1.1k    949     940     939     947
CNVs             170     153     158     157     165
Inversions       12      9       10      9       11

(The 1000 Genomes Project Consortium, 2015)

Loss of function variants in human genome
- Human genomes typically contain ~100 genuine loss-of-function (LoF) variants, with ~20 genes completely inactivated.

Genetic diversity in different populations (figure panels: Europe, East Asian, South Asian, America, Africa)

Modern humans originated in Africa (L. Luca Cavalli-Sforza & Marcus W. Feldman, 2003)

Bottleneck effects during migrations reduce the diversity of human genetic variation (Michael C. Campbell and Sarah A. Tishkoff, 2009)
- Effective population size (Albert Tenesa et al., 2009):
  - Non-African populations: 3,100
  - African population: 7,500

Genetic variation exists between populations
- The founder effect and past small population size (which increases the likelihood of genetic drift) may have had an important influence on neutral differences between populations.
- Natural selection may confer an adaptive advantage on individuals in a specific environment if an allele provides a competitive advantage.
- Genetic drift causes some neutral mutations to become fixed or to disappear at random in a population.
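Random fixation or loss of a neutral allele can be illustrated with a small simulation. The sketch below uses the Wright-Fisher model, which the slides do not name; it is swapped in here as a common textbook idealization, and the function name, population size, and seed are all hypothetical choices.

```python
import random


def simulate_drift(pop_size, start_freq, generations, seed=0):
    """Track a neutral allele's frequency under pure random sampling."""
    rng = random.Random(seed)
    n_copies = 2 * pop_size  # diploid population: two copies per individual
    freq = start_freq
    trajectory = [freq]
    for _ in range(generations):
        # Each allele copy in the next generation is drawn at random
        # from the current generation (binomial sampling).
        count = sum(rng.random() < freq for _ in range(n_copies))
        freq = count / n_copies
        trajectory.append(freq)
        if freq in (0.0, 1.0):  # allele lost or fixed; nothing more changes
            break
    return trajectory


small = simulate_drift(pop_size=20, start_freq=0.5, generations=2000)
print("generations simulated:", len(small) - 1)
print("final frequency:", small[-1])
```

No selection is involved: the allele ends at 0 or 1 by chance alone. Running the same function with a larger `pop_size` typically takes many more generations to reach either end, which is the sense in which small populations are more exposed to drift.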

Genes mirror geography within Europe (Nature. 2008 Nov 6; 456(7218): 98-101)

Variations and genetic disorders

Genetic variants and health
- Most of the variants in the human genome do not affect health.
- A typical human genome contains ~100 loss-of-function (LoF) variants, with ~20 genes completely inactivated (Daniel MacArthur et al., 2012).
- LoF variants found in healthy individuals fall into several overlapping categories:
  - Severe recessive disease alleles in the heterozygous state
  - Alleles that are less deleterious but nonetheless have an impact on phenotype and disease risk
  - Benign LoF variation in redundant genes
  - Genuine variants that do not seriously disrupt gene function

Genetic disorder
- A genetic disorder is a disease caused in whole or in part by a change in the DNA sequence away from the normal sequence.
- Genetic disorders can be caused by:
  - a mutation in one gene (monogenic disorder)
  - mutations in multiple genes (multifactorial inheritance disorder)
  - a combination of gene mutations and environmental factors
  - damage to chromosomes (changes in the number or structure of entire chromosomes, the structures that carry genes)

Monogenic disorders
- Monogenic disorders (single-gene disorders, Mendelian disorders) are caused by mutations in a single gene.
- These are usually rare diseases.
- The mutation may be present on one or both chromosomes.
- Over 4,000 human diseases are caused by single-gene defects, for example:
  - Sickle cell disease
  - Cystic fibrosis

Multifactorial inheritance disorders
- Multifactorial inheritance disorders are caused by a combination of variations in different genes, often acting together with environmental factors.
- The effect of each variant or gene is usually small.
- Many common diseases, including cardiovascular disease, diabetes, and most cancers, are examples of such disorders.

Chromosome disorders
- Chromosome disorders are caused by an excess or deficiency of the genes that are located on chromosomes, or by structural changes within chromosomes.
- Down syndrome is caused by an extra copy of chromosome 21 (called trisomy 21).
- Prader-Willi syndrome is caused by the absence or non-expression of a group of genes on chromosome 15.

Genetic mapping in human diseases
- Genetic mapping is the localization of genes underlying phenotypes on the basis of correlation with DNA variation.
- Methods:
  - Linkage analysis
  - Association study

Linkage analysis
- Genetic linkage analysis is a statistical method used to associate the functionality of genes with their location on chromosomes.
- It is based on the observation that genes that reside physically close together on a chromosome tend to remain linked during meiosis.
- If a disease is often passed to offspring along with specific marker genes, it can be concluded that the gene(s) responsible for the disease are located close to these markers.
- A pedigree is required for linkage analysis.

Linkage analysis

Association study
- Genetic association studies test for a correlation between disease status and genetic variation.
- Case-control studies investigate whether the allele frequency differs significantly between the cases and their matched control group.
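One common way to compare allele frequencies between cases and controls is a Pearson chi-square test on a 2x2 table of allele counts. The slides do not specify a test, so this is an assumed choice, and the function name and the counts in the example are hypothetical.

```python
import math


def allelic_chi_square(case_alt, case_ref, control_alt, control_ref):
    """Pearson chi-square test (1 degree of freedom) on allele counts."""
    table = [[case_alt, case_ref], [control_alt, control_ref]]
    row_sums = [sum(row) for row in table]
    col_sums = [table[0][j] + table[1][j] for j in range(2)]
    total = sum(row_sums)
    if 0 in row_sums or 0 in col_sums:
        raise ValueError("every row and column needs at least one allele")
    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            # Count expected if allele frequency were equal in both groups.
            expected = row_sums[i] * col_sums[j] / total
            chi2 += (table[i][j] - expected) ** 2 / expected
    # Upper tail of chi-square with 1 df: P(X > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Hypothetical counts: 100 cases and 100 controls, two alleles each.
chi2, p = allelic_chi_square(case_alt=60, case_ref=140,
                             control_alt=40, control_ref=160)
print(f"chi2 = {chi2:.3f}, p = {p:.4f}")
```

A genome-wide study repeats a test like this at every variant, so the significance threshold has to be corrected for the number of tests.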

Population-based design
- Cases and controls are unrelated.
- Easier to collect.
- Susceptible to population stratification bias.

Family-based design
- Cases and controls are related: parents, sibs, etc.
- Commonly used design: case-parent trios.
- Not susceptible to population stratification bias.
- Not easy to collect.
- Not appropriate for late-onset diseases.
(Pedigree figure legend: female, male, disease-affected, healthy)

Databases and resources

NCBI dbSNP and dbVar
- The Single Nucleotide Polymorphism database (dbSNP) is a public-domain archive for a broad collection of simple genetic polymorphisms.
- dbVar is NCBI's database of genomic structural variation (SV).

1000 Genomes Project
- The goal of the 1000 Genomes Project was to find most genetic variants with frequencies of at least 1% in the populations studied.
- The project analyzed 2,504 individuals from 26 populations using a combination of low-coverage whole-genome sequencing, deep exome sequencing, and dense microarray genotyping.

Exome Aggregation Consortium
- The Exome Aggregation Consortium (ExAC) is a coalition of investigators seeking to aggregate and harmonize exome sequencing data from a variety of large-scale sequencing projects.
- The data set spans 60,706 unrelated individuals.

OMIM
- Online Mendelian Inheritance in Man (OMIM) is a comprehensive, authoritative compendium of human genes and genetic phenotypes that is freely available and updated daily.
- http://omim.org

Class of phenotype                               Phenotype   Gene*
Single gene disorders and traits                 4,728       3,182
Susceptibility to complex disease or infection   700         499
"Nondiseases"                                    141         111
Somatic cell genetic disease                     202         115

NCBI ClinVar
- ClinVar is a public archive of reports of the relationships among human variations and phenotypes, with supporting evidence.