Detection of copy number variations in PCR-enriched targeted sequencing data

German Demidov
Parseq Lab; Saint-Petersburg University of Russian Academy of Sciences; current: Center for Genomic Regulation (CRG)
german.demidov@crg.eu
July 29, 2016

Overview
1. Problem Formulation: Neonatal Screening; Data
2. Methods: Counting of coverages; Quality Control; Unsupervised Algorithm; Supervised Algorithm
3. Results: Validation; Results
4. Open Questions

Neonatal Screening
Cystic fibrosis, phenylketonuria, galactosemia and others. We are interested only in Mendelian disorders. They are rare and treatable if diagnosed early; otherwise they cause irreversible damage. (Image from progenity.com)

Neonatal Screening for CF, PKU, GALT
Pipeline (CF):
1. Immunoreactive Trypsin Test
2. Immunoreactive Trypsin Test 2
3. Sweat test
4. Panel for approx. 10 common mutations
Time, sensitivity?
Alternative:
1. Immunoreactive Trypsin Test
2. Immunoreactive Trypsin Test 2
3. NGS for more than 300 mutations for 3 disorders
4. Sweat test

Problem
CNVs can be as short as one exon plus small intronic regions. From 1% up to 5% of samples have CNVs (for particular disorders). Alternatives? FISH, MLPA, qPCR, SNVs?
The goal
Detect germline CNVs in multiplex-PCR-enriched amplicon sequencing data using only coverages.

Multiplex PCR (image source: http://rosalind.info)

Multiplex PCR (image source: unpublished, Bushmanova et al., bioinformaticsinstitute.ru)

Description of Data
Multiplex PCR (primers divided into 2 pools) + Ion Torrent sequencing. 128 amplicons covering 3 genes and several intronic regions. One run: 48 samples. Average coverage ranges from 10 to 1200 reads per amplicon (samples from dried blood spots).

Counting of coverages: 2 pools
Because we know that the primers were divided into 2 pools that generate non-overlapping amplicons (within each pool), we can count coverages more efficiently.
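A minimal sketch of how the non-overlap property could be exploited, assuming reads have already been split by pool: within one pool each read can belong to at most one amplicon, so a binary search over sorted amplicon starts is enough. The function name, the midpoint rule and the coordinates are illustrative assumptions, not the pipeline used in the talk.

```python
from bisect import bisect_right
from collections import Counter

def count_coverage(amplicons, read_intervals):
    """Count reads per amplicon within ONE primer pool.

    amplicons: list of (name, start, end); assumed non-overlapping (one pool).
    read_intervals: list of (start, end) for mapped reads on the same contig.
    A read is assigned to the amplicon containing its midpoint (an assumption).
    """
    amplicons = sorted(amplicons, key=lambda a: a[1])
    starts = [a[1] for a in amplicons]
    counts = Counter({a[0]: 0 for a in amplicons})
    for r_start, r_end in read_intervals:
        mid = (r_start + r_end) // 2
        i = bisect_right(starts, mid) - 1  # last amplicon starting at or before mid
        if i >= 0 and amplicons[i][1] <= mid <= amplicons[i][2]:
            counts[amplicons[i][0]] += 1
    return dict(counts)

# Hypothetical pool with two amplicons and four reads (one off-target).
pool_1 = [("amp_A", 100, 250), ("amp_B", 400, 560)]
reads = [(105, 245), (110, 250), (405, 555), (300, 350)]
coverage = count_coverage(pool_1, reads)
```

Running the same function once per pool keeps the non-overlap assumption valid even when amplicons from different pools overlap each other.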

Counting of coverages: chimeric sequences
We have found that:
1. A substantial fraction (from 1 to 5 percent) of reads have unexpected soft-clipped parts.
2. Using BLAST, we found that these parts mostly come from the targeted regions. We realign them.
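As an illustration of the first step, soft-clipped parts can be pulled out of an alignment using its CIGAR string, where `S` marks soft-clipped bases. This is only a sketch under assumptions: the helper name and the `min_len` threshold are hypothetical, and the realignment step itself is not shown.

```python
import re

CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")

def soft_clipped_parts(sequence, cigar, min_len=10):
    """Return soft-clipped fragments of a read that are at least min_len bases long."""
    ops = [(int(n), op) for n, op in CIGAR_OP.findall(cigar)]
    # Hard-clipped bases (H) are absent from the stored sequence, so ignore them.
    ops = [(n, op) for n, op in ops if op != "H"]
    parts = []
    if ops and ops[0][1] == "S" and ops[0][0] >= min_len:
        parts.append(sequence[:ops[0][0]])    # clip at the start of the read
    if len(ops) > 1 and ops[-1][1] == "S" and ops[-1][0] >= min_len:
        parts.append(sequence[-ops[-1][0]:])  # clip at the end of the read
    return parts

# Hypothetical read: 12 soft-clipped bases followed by 30 aligned bases.
read = "ACGTACGTACGT" + "T" * 30
clips = soft_clipped_parts(read, "12S30M")
```

The extracted fragments could then be searched against the target regions (the slides mention BLAST) before deciding how to realign them.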

Quality Control
Samples arrive at the lab from other labs or hospitals. In some samples the DNA was poorly extracted; some samples have CNVs. We should not mix these two categories. We developed an algorithm that filters out poorly extracted samples before the analysis. We have 3 genes and can assume that only one of them contains a CNV. One of the 3 genes may fail QC.

General Idea
Typical approaches: there are several sources of variation in coverage. We can normalise for GC content, length, etc.
Our alternative: amplicon-based sequencing has many sources of variation that cannot be inferred. Some amplicons in a panel should show similar efficiency. We can use clusters of correlated amplicons for normalisation.
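One way the cluster idea could look in code, as a sketch only: find amplicons whose coverage across samples correlates strongly with a target amplicon, then divide the target coverage by the cluster median in each sample. The Pearson threshold `min_r`, the median-based normalisation and all data values are assumptions for illustration, not the method from the talk.

```python
from math import sqrt
from statistics import mean, median

def pearson(x, y):
    """Pearson correlation of two equally long coverage vectors."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def correlated_cluster(target, coverages, min_r=0.9):
    """Names of amplicons whose coverage across samples correlates with the target."""
    return [name for name, values in coverages.items()
            if name != target and pearson(coverages[target], values) >= min_r]

def normalise(target, coverages, cluster):
    """Divide target coverage in each sample by the cluster's median coverage."""
    n_samples = len(coverages[target])
    return [coverages[target][s] / median(coverages[c][s] for c in cluster)
            for s in range(n_samples)]

# Hypothetical coverages: one list per amplicon, one value per sample.
coverages = {
    "amp_1": [100, 200, 300, 400],
    "amp_2": [110, 190, 310, 390],
    "amp_3": [400, 100, 350, 120],
}
cluster = correlated_cluster("amp_1", coverages)
ratios = normalise("amp_1", coverages, cluster)
```

With this toy data only `amp_2` tracks `amp_1`, so the ratios stay close to 1; a sample carrying a deletion in `amp_1` but not in its cluster would be expected to show a clearly lower ratio.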

Unsupervised Algorithm
Figure: Regression and prediction intervals (right figure source: novayagazeta.ru)
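The slide names only regression and prediction intervals, so the following is a hedged sketch of that general technique rather than the algorithm itself: fit a line relating the coverage of one amplicon to a correlated amplicon across samples, and flag a sample whose coverage falls outside the prediction interval. It uses a normal quantile instead of a Student t quantile (an approximation), and the function names, the `level` parameter and the data are hypothetical.

```python
from math import sqrt
from statistics import NormalDist, mean

def fit_line(x, y):
    """Ordinary least squares fit of y = intercept + slope * x."""
    mx, my = mean(x), mean(y)
    sxx = sum((a - mx) ** 2 for a in x)
    slope = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sxx
    intercept = my - slope * mx
    residuals = [b - (intercept + slope * a) for a, b in zip(x, y)]
    sigma = sqrt(sum(r * r for r in residuals) / (len(x) - 2))
    return slope, intercept, sigma, mx, sxx

def outside_prediction_interval(x_new, y_new, x, y, level=0.99):
    """True if (x_new, y_new) lies outside the prediction interval of the fit."""
    slope, intercept, sigma, mx, sxx = fit_line(x, y)
    z = NormalDist().inv_cdf(0.5 + level / 2)  # normal approximation to the t quantile
    half_width = z * sigma * sqrt(1 + 1 / len(x) + (x_new - mx) ** 2 / sxx)
    predicted = intercept + slope * x_new
    return abs(y_new - predicted) > half_width

# Hypothetical coverages of two correlated amplicons in six control samples.
ref = [100, 150, 200, 250, 300, 350]
tgt = [102, 148, 205, 247, 303, 349]
suspect = outside_prediction_interval(200, 100, ref, tgt)  # about half the expected coverage
typical = outside_prediction_interval(200, 201, ref, tgt)
```

With few control samples the t quantile would give a wider, more conservative interval than the normal quantile used here.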

Supervised Algorithm (image source: dzone.com)
We can use the output of the unsupervised algorithm or a pre-defined control dataset. Previously we tried to detect whether single amplicons show the presence of a CNV; now we want to detect CNV sites. Idea: use the Mahalanobis distance and classify each exon.

Supervised Algorithm
Data whitening: normalise coverages within the clusters of correlated amplicons and calculate the Mahalanobis distance, which should (in theory) be χ²-distributed.
Three models: having H_0 = M_Normal, we can construct H_a = (M_HetDel, M_HetDup).
Three questions:
1. Can each data point from the region be generated by H_a?
2. If so, is it highly probable that it was generated by H_0?
3. If so, is H_a the most probable explanation for the region?
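To make the Mahalanobis step concrete, here is a simplified sketch for an exon covered by two amplicons. It assumes normalised coverages centred at 1.0 (normal), 0.5 (heterozygous deletion) and 1.5 (heterozygous duplication) with one shared covariance matrix, and it simply picks the nearest model; this is an illustration, not the three-question procedure from the slide. All names and numbers are hypothetical.

```python
from math import exp

def mahalanobis_sq_2d(point, mean, cov):
    """Squared Mahalanobis distance for two amplicons (2x2 covariance)."""
    (a, b), (c, d) = cov
    det = a * d - b * c
    dx, dy = point[0] - mean[0], point[1] - mean[1]
    # (dx, dy) * inverse(cov) * (dx, dy)^T with the 2x2 inverse written out.
    return (d * dx * dx - (b + c) * dx * dy + a * dy * dy) / det

def chi2_sf_2dof(d2):
    """Survival function of a chi-square distribution with 2 degrees of freedom."""
    return exp(-d2 / 2)

# Assumed model centres for normalised coverage of the two amplicons.
MODELS = {"Normal": (1.0, 1.0), "HetDel": (0.5, 0.5), "HetDup": (1.5, 1.5)}

def classify(point, cov):
    """Return the nearest model by squared Mahalanobis distance, plus all distances."""
    d2 = {name: mahalanobis_sq_2d(point, m, cov) for name, m in MODELS.items()}
    return min(d2, key=d2.get), d2

# Hypothetical covariance estimated from control samples, and one test exon.
cov = [[0.01, 0.005], [0.005, 0.01]]
label, d2 = classify((0.52, 0.48), cov)
p_normal = chi2_sf_2dof(d2["Normal"])  # how plausible the point is under M_Normal
```

For exons with more than two amplicons the same idea applies with a general matrix inverse and a chi-square distribution whose degrees of freedom equal the number of amplicons.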

Validation
More than 500 samples, more than 1000 sequencing results. 16 de novo discovered variants. One of them was novel (PAHdele4). 810 samples were negative for each algorithm.

              Unsupervised   Supervised
Sensitivity   90.36%         90.36%
Specificity   94.97%         94.62%

Figures: Unsupervised algorithm; Supervised algorithm
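For reference, sensitivity and specificity are computed from confusion-matrix counts as below. The slides do not give the underlying counts, so the numbers in this sketch are purely hypothetical and do not reproduce the reported percentages.

```python
def sensitivity(true_pos, false_neg):
    """Fraction of samples with a CNV that the algorithm reports as positive."""
    return true_pos / (true_pos + false_neg)

def specificity(true_neg, false_pos):
    """Fraction of samples without a CNV that the algorithm reports as negative."""
    return true_neg / (true_neg + false_pos)

# Hypothetical counts, for illustration only.
sens = sensitivity(true_pos=9, false_neg=1)
spec = specificity(true_neg=95, false_pos=5)
```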