Copy Number Variation Detection. Alex Mawla, UCD Genome Center Bioinformatics Core. Tuesday, June 16, 2015

Today's Goals
Understand the application and capabilities of targeted sequencing and CNV calling in a clinical or research setting.
In today's exercise, you will:
  Align raw FASTQ reads using Bowtie2
  Generate depth of coverage files from BAM files using targetdepth (a Samtools depth wrapper)
  Learn to set parameters and work with input files (bait probes, regions of interest) for the targeted CNV calling tool, PanelDoC
  Run PanelDoC using an R script on a subset of data and interpret the results

What are CNVs?
  Copy number variations (CNVs) are alterations in the genome that result in either normal or abnormal variation in the number of copies of a gene or region.
  The field has mostly been limited to exome regions of interest, but as whole genome sequencing costs decrease, non-coding copy number variations are starting to be considered.

What are CNVs? (cont'd)
  Structural variations (unbalanced):
    Duplication (gain)
    Deletion (loss)
  Example, starting from the segments AB CDE FG:
    Deletion: AB FG
    Duplication: AB CDE CDE FG
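The AB CDE FG example above can be sketched in a few lines of Python. This is only an illustration of what "gain" and "loss" mean at the level of segments; the function names are hypothetical and are not part of any CNV tool.

```python
def delete_segment(chromosome, segment):
    """Return a copy of the chromosome with `segment` removed (a loss)."""
    return [s for s in chromosome if s != segment]


def duplicate_segment(chromosome, segment):
    """Return a copy of the chromosome with `segment` repeated in tandem (a gain)."""
    result = []
    for s in chromosome:
        result.append(s)
        if s == segment:
            result.append(s)  # second, tandem copy of the segment
    return result


reference = ["AB", "CDE", "FG"]
deletion = delete_segment(reference, "CDE")
duplication = duplicate_segment(reference, "CDE")
print(" ".join(deletion))     # AB FG
print(" ".join(duplication))  # AB CDE CDE FG
```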

Mechanisms
Different causes:
  Homologous recombination can result in tandem repetition and deletion of a gene
  Non-homologous repair can result in deletions
[Figure: Non-allelic homologous recombination. Homologous recombination at an incorrect locus, illustrated with the phrases "The Fast Car" and "The Brown Rat", produces a tandem duplication and a deletion.]
[Figure: Non-homologous repair (e.g. NHEJ). A DNA break in "Horses Eat Oats Cats Chase Mice" is repaired with the intervening words deleted, leaving "Horses Eat Mice" (deletion).]

Consequences
Consequences include dosage effect, gain of function, or loss of function.
[Figure panels: Dosage Effect (loss of Protein B; 2X Protein B); New Alleles (gain of function: Protein A, fusion protein).]

Why care about CNVs?
CNVs have been identified as causes of many human diseases:
  Autism (Marshall et al, 2012)
  Cancer (Walsh et al, 2010)
  Crohn's Disease (Schaschl et al, 2009)
  Down Syndrome (Ramachandran et al, 2014)
  HIV susceptibility (Hollox et al, 2014)
  Idiopathic Learning Disability (Morrow et al, 2010)
  Lupus (Chen et al, 2014)
  Schizophrenia (Suleyman et al, 2015)

Why care about CNVs? (cont'd)
  Example: GSTM1 gene, which encodes a glutathione S-transferase
    What happens during a null deletion (0 copies)?
    Individuals with a low number of copies are more susceptible to certain cancers, but also have a better clinical outcome (Mahimkar et al., 2012)
  Example: CCL3L1 gene, which codes for a protein that binds to the CCR5 receptor on immune cells
    Individuals with a higher number of copies are less susceptible to HIV infection (Urban et al., 2009)

Different Sequencing Methods
  Whole Genome Sequencing
    Most expensive
    Sequences coding and non-coding regions of the genome
    Less used in clinical and research settings, but now emerging
  Exome Sequencing
    Moderately expensive
    Sequences all coding regions of the genome
    About 180,000 exons, roughly 1% of the entire human genome
    Often used in clinical and research settings
  Targeted Sequencing
    Cheapest
    Sequences specific regions of interest, usually within exomes
    Growing use in clinical and research settings

What is Targeted Sequencing?
  Targeted sequencing is the sequencing of specific regions of the genome, usually within the exome, to look at particular areas of interest
  Orders of magnitude cheaper than exome or whole genome sequencing
  Two primary methods:
    Bait-probe hybridization capture: uses overlapping modified bait probe sequences to capture specific genes of interest
    Amplicon-based: uses primers

Bait Probe Hybridization Capture
  Modified bait probes are designed to capture specific regions of interest from the cDNA library
  Baits are usually designed to tile, to ensure strong capture of all nucleotides in the region of interest
[Figure: cRNA baits (3x tiling) across an area of interest that contains a repeat.]

Different CNV Tools
  Whole Genome Sequencing Tools: SegSeq, cnvHMM, CNVnator
  Exome Sequencing Tools: CoNIFER, XHMM, ExomeCNV
  Targeted Sequencing Tools: SVseq, SVDetect, PanelDoC
Reference: Zhao et al, 2013

Why PanelDoC?
  Designed to work well with bait-probe hybridization capture
  Works with exome sequencing as well
  Can be used on blood samples or tumor populations
  Sensitive to small CNVs (31 bp)
  Very good with larger CNV (>1 kb) detection
  Verified against breast cancer study data
  Future updates:
    A complementary hidden Markov model will be added to improve robustness
    Functionality to handle amplicon-based capture will also be added

PanelDoC Background
  Developed by Dr. Alex Nord at the University of Washington in 2011 (Nord et al, 2011)
  Verified by evaluating 94 patients for CNVs related to breast and ovarian cancer
    Correctly identified all known mutations
    Identified 10 CNVs (5 gains and 5 losses) in four cancer-related genes
    Confirmed a 31 bp homozygous deletion in BRCA2
    Detected 200 bp CNVs with 87% sensitivity and 100 bp CNVs with 80% sensitivity at high signal-to-noise ratios

Methods
  Pre-PanelDoC:
    Map reads to genome
    Generate coverage
  Normalize coverage:
    Sample-specific effect
    GC content bias
    Bait probe tiling
  Call CNVs with sliding window approach
[Figure: analysis workflow. Raw coverage calculation for each targeted base; coverage normalization and correction for capture bias; ratio generation for test subject vs. lane median (shown in A); CNV calling via sliding window analysis of ratio data (shown in A); confirmation of CNV calls with signature of partially-mapped reads (shown in B); exact breakpoint mapping using partially-mapped reads (shown in C). Panel labels: targeted regions, aligned reads, ratio, deletion call, reference sequence.]

Methods (cont'd)
  Bait-probe hybridization capture is used to capture regions of interest at a minimum depth of 50x-100x

Methods (cont'd)
  Reads can be mapped with aligners such as:
    BWA
    Bowtie/Bowtie2
  Coverage files can be generated using a tool in the Samtools suite:
    Samtools depth
    targetdepth is a wrapper that automates this process for each region of interest
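To make the coverage step concrete, here is a minimal Python sketch that reads the text output of `samtools depth` (tab-separated chromosome, 1-based position, depth) and averages depth over one region of interest. It is not the targetdepth wrapper, whose internals are not described here; the function names and the example numbers are made up for illustration.

```python
def parse_depth(lines):
    """Parse `samtools depth` output lines into {(chrom, pos): depth}."""
    depth = {}
    for line in lines:
        line = line.strip()
        if not line:
            continue
        chrom, pos, count = line.split("\t")[:3]
        depth[(chrom, int(pos))] = int(count)
    return depth


def mean_depth(depth, chrom, start, end):
    """Mean depth over the inclusive 1-based interval [start, end].

    By default `samtools depth` omits positions with zero coverage,
    so positions missing from the input are counted as depth 0.
    """
    total = sum(depth.get((chrom, p), 0) for p in range(start, end + 1))
    return total / (end - start + 1)


# Illustrative (invented) depth lines; position 41196339 is missing, i.e. uncovered.
example = [
    "chr17\t41196337\t120",
    "chr17\t41196338\t118",
    "chr17\t41196340\t90",
]
depth = parse_depth(example)
region_mean = mean_depth(depth, "chr17", 41196337, 41196340)
print(region_mean)
```

Counting missing positions as zero matters for targeted panels: a gap caused by a repeat would otherwise inflate the regional mean.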

PanelDoC Methods
  Raw depth of coverage is noisy:
    Large variations in coverage
    Gaps where repeats are present
  Processing steps necessary for CNV analysis:
    1. Coverage normalization and correction
    2. Analyze relative coverage vs. expected diploid coverage
[Figure: Raw coverage example (chr17:41196337-41206337). x-axis: genomic position, 41196000 to 41206000; y-axis: coverage, 0 to 700.]

PanelDoC Methods (cont'd)
  Raw coverage is normalized against the median coverage per base position across all samples
[Figure panels: raw coverage counts from alignment; normalization to median coverage.]
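One way to picture this normalization is the sketch below. It assumes two steps that the slide does not spell out: each sample is first scaled by its own median depth (to remove the sample-specific effect of sequencing depth), and then each base is divided by the median of the scaled values across all samples. PanelDoC's actual implementation may differ; the function name and data are hypothetical.

```python
from statistics import median


def normalize_to_median(coverage):
    """coverage: {sample: [depth per base]}, same positions in the same order.

    Returns {sample: [ratio per base]} where a ratio near 1.0 means the
    sample looks like the typical sample at that base.
    """
    # Step 1 (assumption): remove the sample-specific effect by scaling
    # each sample by its own median depth. Assumes that median is > 0.
    scaled = {s: [d / median(depths) for d in depths]
              for s, depths in coverage.items()}

    # Step 2: median of the scaled coverage at each base across samples.
    n_positions = len(next(iter(scaled.values())))
    position_median = [median(vals[i] for vals in scaled.values())
                       for i in range(n_positions)]

    # Step 3: ratio of each sample to the cross-sample median.
    return {s: [v / m if m > 0 else float("nan")
                for v, m in zip(vals, position_median)]
            for s, vals in scaled.items()}


coverage = {
    "s1": [100, 200, 100],
    "s2": [50, 100, 50],    # same profile as s1 at half the sequencing depth
    "s3": [100, 200, 50],   # last base has half the expected coverage
}
ratios = normalize_to_median(coverage)
print(ratios["s3"])
```

In this toy data s1 and s2 come out flat at 1.0 despite their different sequencing depths, while the last base of s3 drops to 0.5.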

PanelDoC Methods (cont'd)
  Normalized coverage is corrected for GC content bias and bait probe tiling

PanelDoC Methods (cont'd)
  Bait probe tiling normalization: overlap will lead to skewed depth of coverage values
[Figure: cRNA baits (3x tiling) across an area of interest that contains a repeat.]
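The effect of overlap can be made concrete by counting how many baits cover each base. The sketch below only computes that tiling depth from a list of bait intervals; how PanelDoC uses such a value in its correction is not described here, so treat this as an illustration with hypothetical names and coordinates.

```python
def tiling_depth(baits, start, end):
    """Number of baits covering each base in the inclusive interval [start, end].

    baits: list of (bait_start, bait_end) tuples, inclusive coordinates.
    """
    return [sum(1 for s, e in baits if s <= p <= e)
            for p in range(start, end + 1)]


# Three overlapping baits (invented coordinates): bases in the overlaps
# are targeted twice, bases at the edges only once.
baits = [(1, 4), (3, 6), (5, 8)]
depth_per_base = tiling_depth(baits, 1, 8)
print(depth_per_base)  # [1, 1, 2, 2, 2, 2, 1, 1]
```

Bases covered by more baits tend to be captured more efficiently, which is why tiling depth is a plausible covariate to correct for before comparing coverage between positions.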

PanelDoC Methods (cont'd)
  GC content and capture bias correction
[Figure panels: raw coverage counts from alignment; normalization to median coverage; correction for GC and capture bias.]
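A common, simple way to correct for GC bias is to group positions by the GC fraction of their surrounding sequence and divide each value by the median of its group. The sketch below shows that idea only; it is an assumption for illustration and not necessarily the correction PanelDoC applies. The function names, bin count, and data are hypothetical.

```python
from statistics import median


def gc_fraction(seq):
    """Fraction of G and C bases in a sequence (0.0 for an empty sequence)."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq) if seq else 0.0


def gc_correct(values, gc, n_bins=10):
    """Divide each value by the median value of its GC bin.

    values: normalized coverage per position
    gc:     GC fraction (0.0 to 1.0) of the window around each position
    """
    bins = [min(int(g * n_bins), n_bins - 1) for g in gc]
    by_bin = {}
    for b, v in zip(bins, values):
        by_bin.setdefault(b, []).append(v)
    bin_median = {b: median(vs) for b, vs in by_bin.items()}
    return [v / bin_median[b] if bin_median[b] > 0 else float("nan")
            for b, v in zip(bins, values)]


# Invented example: low-GC positions are under-covered, high-GC over-covered.
values = [0.8, 0.8, 1.2, 1.2]
gc = [0.25, 0.25, 0.65, 0.65]
corrected = gc_correct(values, gc)
print(corrected)
```

After correction the systematic difference between the low-GC and high-GC positions disappears, so remaining deviations are more likely to reflect real copy number changes.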

PanelDoC Methods (cont'd)
[Figure panels: raw coverage counts from alignment; normalization to median coverage; correction for GC content and capture bias; relative ratio calculation and QC.]

PanelDoC Methods (cont'd)
  CNVs are called using a sliding window algorithm
[Figure panel: CNV calling.]
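A sliding window caller can be sketched as follows. In a diploid region the expected ratio is 1.0, a heterozygous deletion gives about 0.5, and one extra copy gives about 1.5. The window size and the thresholds (0.6 and 1.4) are illustrative assumptions, not PanelDoC's defaults, and the function name is hypothetical.

```python
def call_cnvs(ratios, window=3, gain=1.4, loss=0.6):
    """Call gains and losses from per-base coverage ratios.

    A window is flagged when its mean ratio is >= gain or <= loss;
    adjacent flagged bases of the same kind are merged into one call.
    Returns (start, end, kind) tuples with 0-based inclusive indices.
    """
    flags = [None] * len(ratios)
    for i in range(len(ratios) - window + 1):
        mean = sum(ratios[i:i + window]) / window
        kind = "gain" if mean >= gain else "loss" if mean <= loss else None
        if kind:
            for j in range(i, i + window):
                flags[j] = kind

    calls = []
    start = None
    # The trailing None closes a call that runs to the end of the data.
    for i, kind in enumerate(flags + [None]):
        if start is not None and kind != flags[start]:
            calls.append((start, i - 1, flags[start]))
            start = None
        if kind is not None and start is None:
            start = i
    return calls


ratios = [1.0, 1.0, 0.5, 0.5, 0.5, 1.0, 1.0, 1.5, 1.5, 1.5, 1.0]
calls = call_cnvs(ratios)
print(calls)  # [(2, 4, 'loss'), (7, 9, 'gain')]
```

Averaging over a window trades resolution for robustness: single noisy bases do not trigger a call, but events shorter than the window may be missed or have blurred boundaries.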

PanelDoC Application: Breast Cancer

Methods: Review
  Pre-PanelDoC:
    Map reads to genome
    Generate coverage
  Normalize coverage:
    Sample-specific effect
    GC content bias
    Bait probe tiling
  Call CNVs with sliding window approach
[Figure: analysis workflow, repeated from the Methods slide.]

Exercise Goals
In today's exercise, you will:
  Align raw FASTQ reads using Bowtie2
  Generate depth of coverage from BAM files using targetdepth
  Learn to set parameters and work with input files (bait probes, regions of interest) in PanelDoC
  Run PanelDoC through an R script on a subset of data and interpret the results
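The alignment and coverage steps of the exercise can be sketched as a list of commands. The Python below only builds the command lines and prints them; it does not run anything. It assumes single-end reads, Bowtie2 and Samtools 1.3 or later on the PATH, and uses plain `samtools depth` in place of the targetdepth wrapper. All file names are hypothetical.

```python
def build_pipeline(fastq, index_prefix, regions_bed, sample):
    """Return argv lists for alignment, sorting, indexing and depth calculation."""
    sam = f"{sample}.sam"
    bam = f"{sample}.sorted.bam"
    return [
        # Align single-end reads against a prebuilt Bowtie2 index.
        ["bowtie2", "-x", index_prefix, "-U", fastq, "-S", sam],
        # Coordinate-sort the alignments into a BAM file.
        ["samtools", "sort", "-o", bam, sam],
        ["samtools", "index", bam],
        # Per-base depth restricted to the regions of interest (BED file);
        # the result is written to standard output.
        ["samtools", "depth", "-b", regions_bed, bam],
    ]


commands = build_pipeline("sample1.fastq", "hg19_index", "regions.bed", "sample1")
for argv in commands:
    print(" ".join(argv))
```

Each list can be passed to `subprocess.run(argv, check=True)` to execute it; for the last command, redirect standard output to a file to keep the depth table.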

Acknowledgements
  Dr. Alex Nord
  UCD Genome Center Bioinformatics Core Team:
    Dr. Monica Britton
    Dr. Joseph Fass
    Dr. Nikhil Joshi
    Dr. Ian Korf

Questions?