Global variation in copy number in the human genome

Similar documents
Genomic structural variation

Global variation in copy number in the human genome

LTA Analysis of HapMap Genotype Data

Structural Variation and Medical Genomics

Supplementary note: Comparison of deletion variants identified in this study and four earlier studies

0.1% variance attributed to scattered single base-pair changes SNPs

Structural Variants and Susceptibility to Common Human Disorders Dr. Xavier Estivill

Identification of regions with common copy-number variations using SNP array

GENOME-WIDE ASSOCIATION STUDIES

Challenges of CGH array testing in children with developmental delay. Dr Sally Davies 17 th September 2014

SNP Array NOTE: THIS IS A SAMPLE REPORT AND MAY NOT REFLECT ACTUAL PATIENT DATA. FORMAT AND/OR CONTENT MAY BE UPDATED PERIODICALLY.

Practical challenges that copy number variation and whole genome sequencing create for genetic diagnostic labs

CHROMOSOMAL MICROARRAY (CGH+SNP)

Using GWAS Data to Identify Copy Number Variants Contributing to Common Complex Diseases

Dan Koller, Ph.D. Medical and Molecular Genetics

CNV Detection and Interpretation in Genomic Data

Introduction to genetic variation. He Zhang Bioinformatics Core Facility 6/22/2016

Multiple Copy Number Variations in a Patient with Developmental Delay ASCLS- March 31, 2016

SNP Array NOTE: THIS IS A SAMPLE REPORT AND MAY NOT REFLECT ACTUAL PATIENT DATA. FORMAT AND/OR CONTENT MAY BE UPDATED PERIODICALLY.

Generating Spontaneous Copy Number Variants (CNVs) Jennifer Freeman Assistant Professor of Toxicology School of Health Sciences Purdue University

Statistical Analysis of Single Nucleotide Polymorphism Microarrays in Cancer Studies

22q11.2 DELETION SYNDROME. Anna Mª Cueto González Clinical Geneticist Programa de Medicina Molecular y Genética Hospital Vall d Hebrón (Barcelona)

Applications of Chromosomal Microarray Analysis (CMA) in pre- and postnatal Diagnostic: advantages, limitations and concerns

Genome-wide copy-number calling (CNAs not CNVs!) Dr Geoff Macintyre

THE PENNSYLVANIA STATE UNIVERSITY SCHREYER HONORS COLLEGE DEPARTMENT OF BIOCHEMISTRY AND MOLECULAR BIOLOGY

Understanding DNA Copy Number Data

November 9, Johns Hopkins School of Medicine, Baltimore, MD,

Analysis of CGH and SNP arrays for the detection of chromosomal aberrations in single cells

CNV detection. Introduction and detection in NGS data. G. Demidov 1,2. NGSchool2016. Centre for Genomic Regulation. CNV detection. G.

American Psychiatric Nurses Association

CURRENT GENETIC TESTING TOOLS IN NEONATAL MEDICINE. Dr. Bahar Naghavi

Introduction to the Genetics of Complex Disease

Nature Genetics: doi: /ng Supplementary Figure 1. PCA for ancestry in SNV data.

Supplementary Information. Supplementary Figures

Multiplex target enrichment using DNA indexing for ultra-high throughput variant detection

Approach to Mental Retardation and Developmental Delay. SR Ghaffari MSc MD PhD

Cytogenetics 101: Clinical Research and Molecular Genetic Technologies

CS2220 Introduction to Computational Biology

Figure S2. Distribution of acgh probes on all ten chromosomes of the RIL M0022

Nature Biotechnology: doi: /nbt.1904

Optimizing Copy Number Variation Analysis Using Genome-wide Short Sequence Oligonucleotide Arrays

Integrated detection and population-genetic analysis. of SNPs and copy number variation

Implementation of the DDD/ClinGen OGT (CytoSure v3) Microarray

Genome-Wide Analysis of Copy Number Variations in Normal Population Identified by SNP Arrays

Population Genetics of Structural Variation Speaker Dr. Don Conrad

BIOLOGY - CLUTCH CH.15 - CHROMOSOMAL THEORY OF INHERITANCE

Integrated detection and population-genetic analysis of SNPs and copy number variation

Integrated detection and population-genetic analysis of SNPs and copy number variation

Copy Number Variations and Association Mapping Advanced Topics in Computa8onal Genomics

Dr Rick Tearle Senior Applications Specialist, EMEA Complete Genomics Complete Genomics, Inc.

UNIVERSITI TEKNOLOGI MARA COPY NUMBER VARIATIONS OF ORANG ASLI (NEGRITO) FROM PENINSULAR MALAYSIA

SUPPLEMENTARY INFORMATION

Associating Copy Number and SNP Variation with Human Disease. Autism Segmental duplication Neurobehavioral, includes social disability

CRISPR/Cas9 Enrichment and Long-read WGS for Structural Variant Discovery

DOES THE BRCAX GENE EXIST? FUTURE OUTLOOK

MRC-Holland MLPA. Description version 29; 31 July 2015

Genetics and Genomics in Medicine Chapter 8 Questions

Clinical evaluation of microarray data

Supplementary Figure 1: Attenuation of association signals after conditioning for the lead SNP. a) attenuation of association signal at the 9p22.

Prenatal Diagnosis: Are There Microarrays in Your Future?

Investigating rare diseases with Agilent NGS solutions

Association mapping (qualitative) Association scan, quantitative. Office hours Wednesday 3-4pm 304A Stanley Hall. Association scan, qualitative

The Foundations of Personalized Medicine

Introduction to LOH and Allele Specific Copy Number User Forum

Supplementary Figures

Large-scale identity-by-descent mapping discovers rare haplotypes of large effect. Suyash Shringarpure 23andMe, Inc. ASHG 2017

STATISTICAL METHODS FOR THE DETECTION AND ANALYSES OF STRUCTURAL VARIANTS IN THE HUMAN GENOME. Shu Mei, Teo

BST227 Introduction to Statistical Genetics. Lecture 4: Introduction to linkage and association analysis

5/2/18. After this class students should be able to: Stephanie Moon, Ph.D. - GWAS. How do we distinguish Mendelian from non-mendelian traits?

Agilent s Copy Number Variation (CNV) Portfolio

Genomics 101 (2013) Contents lists available at SciVerse ScienceDirect. Genomics. journal homepage:

Chromosomes, Mapping, and the Meiosis-Inheritance Connection. Chapter 13

Tutorial on Genome-Wide Association Studies

Breast and ovarian cancer in Serbia: the importance of mutation detection in hereditary predisposition genes using NGS

Introduction. 8 These authors contributed equally to this work

Analysis with SureCall 2.1

MRC-Holland MLPA. Description version 18; 09 September 2015

Exam #2 BSC Fall. NAME_Key correct answers in BOLD FORM A

Shape-based retrieval of CNV regions in read coverage data. Sangkyun Hong and Jeehee Yoon*

MODULE NO.14: Y-Chromosome Testing

Genetics and Genomics in Medicine Chapter 6 Questions

Detection of copy number variations in PCR-enriched targeted sequencing data

Case Report Persistent Mosaicism for 12p Duplication/Triplication Chromosome Structural Abnormality in Peripheral Blood

Quantitative genetics: traits controlled by alleles at many loci

Mosaic loss of chromosome Y in peripheral blood is associated with shorter survival and higher risk of cancer

MEDICAL GENOMICS LABORATORY. Next-Gen Sequencing and Deletion/Duplication Analysis of NF1 Only (NF1-NG)

A gene is a sequence of DNA that resides at a particular site on a chromosome the locus (plural loci). Genetic linkage of genes on a single

Identification of both copy number variation-type and constant-type core elements in a large segmental duplication region of the mouse genome

Relationship between genomic features and distributions of RS1 and RS3 rearrangements in breast cancer genomes.

On Missing Data and Genotyping Errors in Association Studies

Nature Genetics: doi: /ng Supplementary Figure 1

DETECTING HIGHLY DIFFERENTIATED COPY-NUMBER VARIANTS FROM POOLED POPULATION SEQUENCING

Whole-genome detection of disease-associated deletions or excess homozygosity in a case control study of rheumatoid arthritis

Identifying Mutations Responsible for Rare Disorders Using New Technologies

MRC-Holland MLPA. Description version 30; 06 June 2017

The Role of Host: Genetic Variation

Clinical Interpretation of Cancer Genomes

Human Genetics 542 Winter 2018 Syllabus

The Chromosomal Basis of Inheritance

Analysis of Massively Parallel Sequencing Data Application of Illumina Sequencing to the Genetics of Human Cancers

Transcription:

Global variation in copy number in the human genome Redon et. al. Nature 444:444-454 (2006) 12.03.2007 Tarmo Puurand

Study 270 individuals (HapMap collection) Affymetrix 500K Whole Genome TilePath (WGTP) 1447 CNVRs, 360 Mbp, 12% of the genome sum(length(cnvrs)) > sum(snps) per genome

CNVs Deletions Insertions Duplications Complex multi-site variants In this study over 1 kb in coparision with a reference genome

Technology platforms Copparative analysis of: 1. Affymetrix GeneChip Human Mapping 500K, 474642 SNPs 2. Hybridisation with Whole Genome TilePath (WGTP) array, 26574 large-inserting clones, 93,7% euchromatic DNA

Quality control Repeated experiments- 82 individuals on the WGTP and 15 individuals on 500K Assessed by standard deviation among log 2 ratios of autosomal probes (after normalisation and filtering for cell-line artefacts) To train threshold parameters- 203 CNVs based by NA10851 and NA15510 DNA- Epstein-Barr-virus transformed lymphoblastoid cell lines. Karyotype all 268 cell lines, 30 of these with abnormalities. Most common chr9, chr12 and chrx trisomies, mosaic trisomy of chr12 Experiments triplication for 10 individuals

A genome-wide map of CNVs CNV detection per experiment was 70 and 24 (WGTP and 500K respectively) WGTP detects in both, test and reference 500K detect only in single genome Median size 228 kb (WGTP) and 81 kb Detection of 913 CNVs in WGTP platform and 980 CNVs in 500K, in total 1447 CNVs

Gaps 164 out of 345 gaps in the built 35 assembly flanked or overlapped by CNVs

Classes of CNVs

CNV formations 24% of 1447 CNVs are associated with segmental duplications 1. rearrangemets generated by non-allelic homologous recombination 2. not all annotated seqmental duplications are fixed in humans, but are CNVs 121 (500K) and 223 (WGTP) CNVs contains 100 bp similarities either end

Genomic impact of CNV

LD around CNVs Indirect methods to identify causative variants, such as cosegregation of linked markers in families and genetic association with markers in linkage disequilibrium with the causative variant, are considered to be blind to the nature of the underlying mutation. This raises the question of whether SNP-based whole-genome association studies have the same power to detect disease-related CNVs as for disease-related SNPs. Recent studies of linkage disequilibrium around CNVs have produced conflicting evidence as to the degree to which CNVs are tagged by neighbouring SNPs. Comparing the proportion of variants tagged by a neighbouring SNP with an arbitrary threshold of r2.0.8 shows that whereas 75 80% of Phase I SNPs in non- African populations were tagged, only 51% of CNVs were tagged in the same populations. We considered three explanations for these observations of lowerapparent linkage disequilibrium around CNVs than SNPs.

LD around CNVs First, some duplications might represent transposition events that would generate linkage disequilibrium around the (unknown) acceptor locus but not the donor locus. One of the genotyped CNVs is known to be a duplicative transposition, but evidence from de novo pathogenic duplications strongly suggests a preference for tandem, rather than dispersed, duplications, regardless of whether the duplication is caused by non-allelic homologous recombination. Second, some CNVs might undergo recurrent mutations or reversions, especially tandem duplications which are mechanistically prone to unequal crossing over, causing reversions back to a single copy. However, duplications were not in lower linkage disequilibrium with flanking SNPs than were deletions. Finally, we considered that CNVs might occur preferentially in genomic regions with lower densities of SNP genotypes in HapMap Phase I. We found that CNVs are enriched within segmentally duplicated regions of the genome, in which thereis a paucity of genotyped SNPs owing to technical difficulties.

Patterns of linkage disequilibrium between CNVs and SNPs

Population clustering from CNV genotypes. A range of polymorphisms, including SNPs, microsatellites and Alu insertion variants, has been used to investigate population structure. To demonstrate the utility of copy number variation genotypes for population genetic inference we performed population clustering on 67 genotyped biallelic CNVs.

Population differentiation for copy number variation

Discussion CNV assessment should now become standard in the design of all studies of the genetic basis of phenotypic variation, including disease susceptibility. Similarly important will be CNV annotation in all future genome assemblies. Our analysis of linkage disequilibrium between CNVs and SNPs gives us limited optimism that CNVs influencing risk to complex disease will be detected by such approaches. The tag SNPs that we have identified for specific CNVs can be used as proxies for these CNVs. Moreover, CNV-specific genotyping assays can be developed for CNVs for which tag SNPs are not readily identifiable but whose proximity to candidate genes warrants further characterization. Extrapolation based on existing data suggests that smaller deletions (<20 kb) are much more frequent than larger deletions (>20 kb), and the same may be true for duplications. CNV calls have been released at the Database of Genomic Variants (http://projects.tcag.ca/variation/) integrated with all other CNV data.