CRISPR/Cas9 Enrichment and Long-read WGS for Structural Variant Discovery

Similar documents
Iso-Seq Method Updates and Target Enrichment Without Amplification for SMRT Sequencing

Binding Calculator Parameters

Analysis of Massively Parallel Sequencing Data Application of Illumina Sequencing to the Genetics of Human Cancers

SVIM: Structural variant identification with long reads DAVID HELLER MAX PLANCK INSTITUTE FOR MOLECULAR GENETICS, BERLIN JUNE 2O18, SMRT LEIDEN

Advance Your Genomic Research Using Targeted Resequencing with SeqCap EZ Library

AVENIO ctdna Analysis Kits The complete NGS liquid biopsy solution EMPOWER YOUR LAB

AVENIO family of NGS oncology assays ctdna and Tumor Tissue Analysis Kits

Detection of low-frequent mitochondrial DNA variants using SMRT sequencing

Multiplex target enrichment using DNA indexing for ultra-high throughput variant detection

Breast and ovarian cancer in Serbia: the importance of mutation detection in hereditary predisposition genes using NGS

Illuminating the genetics of complex human diseases

MEDICAL GENOMICS LABORATORY. Next-Gen Sequencing and Deletion/Duplication Analysis of NF1 Only (NF1-NG)

Global variation in copy number in the human genome

LEIDEN, THE NETHERLANDS

Eukaryotic Gene Regulation

Fluxion Biosciences and Swift Biosciences Somatic variant detection from liquid biopsy samples using targeted NGS

Characterisation of structural variation in breast. cancer genomes using paired-end sequencing on. the Illumina Genome Analyser

Abstract. Optimization strategy of Copy Number Variant calling using Multiplicom solutions APPLICATION NOTE. Introduction

Genomic structural variation

Next Generation Sequencing as a tool for breakpoint analysis in rearrangements of the globin-gene clusters

No mutations were identified.

Accel-Amplicon Panels

Comprehensive Genome and Transcriptome Structural Analysis of a Breast Cancer Cell Line using PacBio Long Read Sequencing

Mutation Detection and CNV Analysis for Illumina Sequencing data from HaloPlex Target Enrichment Panels using NextGENe Software for Clinical Research

Performance Characteristics BRCA MASTR Plus Dx

Introduction to Systems Biology of Cancer Lecture 2

ORIGINAL RESEARCH ARTICLE American College of Medical Genetics and Genomics

Calling DNA variants SNVs, CNVs, and SVs. Steve Laurie Variant Effect Predictor Training Course Prague, 6 th November 2017

BWA alignment to reference transcriptome and genome. Convert transcriptome mappings back to genome space

Using the Bravo Liquid-Handling System for Next Generation Sequencing Sample Prep

NEXT GENERATION SEQUENCING. R. Piazza (MD, PhD) Dept. of Medicine and Surgery, University of Milano-Bicocca

MMB (MGPG) Non traditional Inheritance Epigenetics. A.Turco

Fragile X Syndrome. Genetics, Epigenetics & the Role of Unprogrammed Events in the expression of a Phenotype

Dr Rick Tearle Senior Applications Specialist, EMEA Complete Genomics Complete Genomics, Inc.

Variant Classification. Author: Mike Thiesen, Golden Helix, Inc.

P. Tang ( 鄧致剛 ); PJ Huang ( 黄栢榕 ) g( ); g ( ) Bioinformatics Center, Chang Gung University.

Genetic Testing and Analysis. (858) MRN: Specimen: Saliva Received: 07/26/2016 GENETIC ANALYSIS REPORT

NGS in tissue and liquid biopsy

DNA-seq Bioinformatics Analysis: Copy Number Variation

Investigating rare diseases with Agilent NGS solutions

HLA and new technologies. Vicky Van Sandt

Epigenetics. Jenny van Dongen Vrije Universiteit (VU) Amsterdam Boulder, Friday march 10, 2017

Computer Science, Biology, and Biomedical Informatics (CoSBBI) Outline. Molecular Biology of Cancer AND. Goals/Expectations. David Boone 7/1/2015

Supplementary Figure 1. Schematic diagram of o2n-seq. Double-stranded DNA was sheared, end-repaired, and underwent A-tailing by standard protocols.

Introduction to genetic variation. He Zhang Bioinformatics Core Facility 6/22/2016

Nature Biotechnology: doi: /nbt.1904

Supplementary Figures

Supplemental Information For: The genetics of splicing in neuroblastoma

RNA SEQUENCING AND DATA ANALYSIS

Deep-Sequencing of HIV-1

CITATION FILE CONTENT/FORMAT

Ambient temperature regulated flowering time

Structural Variation and Medical Genomics

Rapid genomic screening of embryos using nanopore sequencing

genomics for systems biology / ISB2020 RNA sequencing (RNA-seq)

SNP Array NOTE: THIS IS A SAMPLE REPORT AND MAY NOT REFLECT ACTUAL PATIENT DATA. FORMAT AND/OR CONTENT MAY BE UPDATED PERIODICALLY.

Evaluation of MIA FORA NGS HLA test and software. Lisa Creary, PhD Department of Pathology Stanford Blood Center Research & Development Group

2/10/2016. Evaluation of MIA FORA NGS HLA test and software. Disclosure. NGS-HLA typing requirements for the Stanford Blood Center

Fragile X full mutation expansions are inhibited by one or more AGG interruptions in premutation carriers

Cytogenetics 101: Clinical Research and Molecular Genetic Technologies

Not IN Our Genes - A Different Kind of Inheritance.! Christopher Phiel, Ph.D. University of Colorado Denver Mini-STEM School February 4, 2014

Global assessment of genomic variation in cattle by genome resequencing and high-throughput genotyping

CPT Codes for Pharmacogenomic Tests

SUPPLEMENTARY INFORMATION. Intron retention is a widespread mechanism of tumor suppressor inactivation.

A complete next-generation sequencing workfl ow for circulating cell-free DNA isolation and analysis

Fragile X One gene, three very different disorders for which Genetic Technology is essential. Significance of Fragile X. Significance of Fragile X

NGS in neurodegenerative disorders - our experience

Victor Guryev. European Research Institute for the Biology of Ageing

Introduction of an NGS gene panel into the Haemato-Oncology MPN service

Please Silence Your Cell Phones. Thank You

Population Screening for Fragile X Syndrome

Computational Analysis of UHT Sequences Histone modifications, CAGE, RNA-Seq

Cell-free tumor DNA for cancer monitoring

ACE ImmunoID Biomarker Discovery Solutions ACE ImmunoID Platform for Tumor Immunogenomics

PSSV User Manual (V2.1)

Introduction to LOH and Allele Specific Copy Number User Forum

Introduction to Genetics

PSSV User Manual (V1.0)

Single-strand DNA library preparation improves sequencing of formalin-fixed and paraffin-embedded (FFPE) cancer DNA

iplex genotyping IDH1 and IDH2 assays utilized the following primer sets (forward and reverse primers along with extension primers).

Proteogenomic analysis of alternative splicing: the search for novel biomarkers for colorectal cancer Gosia Komor

HST.161 Molecular Biology and Genetics in Modern Medicine Fall 2007

SUPPLEMENTARY INFORMATION

Viral genome sequencing: applications to clinical management and public health. Professor Judy Breuer

TP53 mutational profile in CLL : A retrospective study of the FILO group.

Supplementary Figure 1: Features of IGLL5 Mutations in CLL: a) Representative IGV screenshot of first

Genome. Institute. GenomeVIP: A Genomics Analysis Pipeline for Cloud Computing with Germline and Somatic Calling on Amazon s Cloud. R. Jay Mashl.

Inference of Isoforms from Short Sequence Reads

Studying The Role Of DNA Mismatch Repair In Brain Cancer Malignancy

SUPPLEMENTARY INFORMATION

SNP Array NOTE: THIS IS A SAMPLE REPORT AND MAY NOT REFLECT ACTUAL PATIENT DATA. FORMAT AND/OR CONTENT MAY BE UPDATED PERIODICALLY.

Lecture 20. Disease Genetics

TOWARDS ACCURATE GERMLINE AND SOMATIC INDEL DISCOVERY WITH MICRO-ASSEMBLY. Giuseppe Narzisi, PhD Bioinformatics Scientist

Single-Cell Sequencing in Cancer. Peter A. Sims, Columbia University G4500: Cellular & Molecular Biology of Cancer October 22, 2018

Circular RNAs (circrnas) act a stable mirna sponges

Non-Mendelian inheritance

CNV Detection and Interpretation in Genomic Data

Nature Genetics: doi: /ng Supplementary Figure 1. Rates of different mutation types in CRC.

AD (Leave blank) TITLE: Genomic Characterization of Brain Metastasis in Non-Small Cell Lung Cancer Patients

Reducing INDEL calling errors in whole genome and exome sequencing data.

Transcription:

CRISPR/Cas9 Enrichment and Long-read WGS for Structural Variant Discovery PacBio CoLab Session October 20, 2017 For Research Use Only. Not for use in diagnostics procedures. Copyright 2017 by Pacific Biosciences of California, Inc. All rights reserved.

PACBIO SMRT SEQUENCING Long Reads average 10 to 15 kb High Consensus Accuracy random errors produce QV50 consensus Uniform, Unbiased Coverage no GC% or sequence complexity bias Sequel System Epigenetic Characterization simultaneous detection of DNA methylation

APPLICATIONS OF SMRT SEQUENCING De novo genome assembly Full isoform sequencing Epigenetic characterization Minor variant discovery Structural variant discovery Sequel System Targeted sequencing

APPLICATIONS OF SMRT SEQUENCING De novo genome assembly Full isoform sequencing Epigenetic characterization Minor variant discovery Structural variant discovery Sequel System Targeted sequencing

cumulative percent VARIATION IN A HUMAN GENOME HG00733 SNVs 100% indels 50 bp structural variants (SVs) 80% 60% 60% 40% base pairs 20% count 0.5% 0% 1 10 100 1,000 10,000 100,000 variant size (bp) Chaisson et al. (2017) biorxiv. doi:10.1101/193144.

TYPES OF STRUCTURAL VARIATION deletion insertion duplication inversion translocation repeat expansion

STRUCTURAL VARIANTS AND DISEASE Schizophrenia Carney complex repeat expansion disorders Poor drug metabolism Breast & ovarian cancer Neurofibromatosis Chronic myeloid leukemia http://www.pacb.com/wp-content/uploads/structural-variation-infographic.pdf Richards et al. (2013) Front Mol Neurosci. doi:10.3389/fnmol.2013.00025.

TECHNOLOGY TO DETECT STRUCTURAL VARIANTS Deletions Insertions Missed PacBio Illumina repeats + large insertions Bionano variants 1.5 kb 0 5,000 10,000 15,000 20,000 25,000 structural variants Chaisson et al. (2017) biorxiv. doi:10.1101/193144.

A move forward to full-spectrum SV detection... will increase the diagnostic yield in patients with genetic disease, SV-mediated mutation, and repeat expansions. Chaisson et al. (2017) biorxiv. doi:10.1101/193144.

PacBio Long-Read WGS for Structural Variant Discovery Targeted Enrichment without Amplification and SMRT Sequencing of Repeat-Expansion Disease Causative Genomic Regions

PacBio Long-Read WGS for Structural Variant Discovery Targeted Enrichment without Amplification and SMRT Sequencing of Repeat-Expansion Disease Causative Genomic Regions

FOR MORE INFORMATION PACB.COM/SV

WGS FOR STRUCTURAL VARIANT DISCOVERY Structural Variants Small Variants Sequencing PacBio Sequel System Short-read NGS Read Mapping NGMLR BWA Variant Calling pbsv GATK Visualization IGV 2.4 IGV

WGS FOR STRUCTURAL VARIANT DISCOVERY Structural Variants Small Variants Sequencing PacBio Sequel System Short-read NGS Read Mapping NGMLR BWA Variant Calling pbsv GATK Visualization IGV 2.4 IGV

SEQUENCING Library Preparation 5 µg DNA 20 kb shear + damage repair SMRTbell adapter ligation 15 kb size selection

SEQUENCING Sequencing polymerase binding Sequel System (5 Gb per SMRT Cell)

WGS FOR STRUCTURAL VARIANT DISCOVERY Structural Variants Small Variants Sequencing PacBio Sequel System Short-read NGS Read Mapping NGMLR BWA Variant Calling pbsv GATK Visualization IGV 2.4 IGV

penalty READ MAPPING reference biological indel affine (BWA) convex (NGMLR) PacBio read gap size BWA NGMLR Sedlazeck et al. (2017) biorxiv. doi:10.1101/169557.

WGS FOR STRUCTURAL VARIANT DISCOVERY Structural Variants Small Variants Sequencing PacBio Sequel System Short-read NGS Read Mapping NGMLR BWA Variant Calling pbsv GATK Visualization IGV 2.4 IGV

VARIANT CALLING FIND SV SIGNATURES CLUSTER SV SIGNATURES FILTER SUMMARIZE INTO SV GENOTYPE CIGAR D & I 50 bp nearby with similar sequence 2 and 20% reads support consensus of supporting reads supporting reads / covering reads http://pacb.com/sv

VARIANT CALLING FIND SV SIGNATURES CLUSTER SV SIGNATURES FILTER SUMMARIZE INTO SV GENOTYPE CIGAR D & I 50 bp nearby with similar sequence 2 and 20% reads support consensus of supporting reads supporting reads / covering reads http://pacb.com/sv

VARIANT CALLING FIND SV SIGNATURES CLUSTER SV SIGNATURES FILTER SUMMARIZE INTO SV GENOTYPE CIGAR D & I 50 bp nearby with similar sequence 2 and 20% reads support consensus of supporting reads supporting reads / covering reads http://pacb.com/sv

VARIANT CALLING FIND SV SIGNATURES CIGAR D & I 50 bp CLUSTER SV SIGNATURES nearby with similar sequence FILTER 2 and 20% reads support 1 of 10 4 of 10 SUMMARIZE INTO SV consensus of supporting reads GENOTYPE supporting reads / covering reads http://pacb.com/sv

VARIANT CALLING FIND SV SIGNATURES CIGAR D & I 50 bp CLUSTER SV SIGNATURES nearby with similar sequence FILTER 2 and 20% reads support 1 of 10 4 of 10 SUMMARIZE INTO SV consensus of supporting reads 329 bp deletion GENOTYPE supporting reads / covering reads http://pacb.com/sv

VARIANT CALLING FIND SV SIGNATURES CIGAR D & I 50 bp CLUSTER SV SIGNATURES nearby with similar sequence FILTER 2 and 20% reads support 1 of 10 4 of 10 SUMMARIZE INTO SV consensus of supporting reads 329 bp deletion GENOTYPE supporting reads / covering reads heterozygous (4 of 10) http://pacb.com/sv

VARIANT CALLING SMRT Analysis SMRT Analysis chr1 904490 ACGCGGCCGCCTCC TCCTCCGAACGTGGCCTCCTCCGAACGCGGCCGCCTCCTCCTCCGAACGCGGCCGCCTCCTCCTCCGA A PASS IMPRECISE;SVTY PE=DEL;END=904587;SVLEN=-97;SVANN=TANDEM GT:AD:DP 0/1:9:15 SMRT Analysis SMRT Analysis

WGS FOR STRUCTURAL VARIANT DISCOVERY Structural Variants Small Variants Sequencing PacBio Sequel System Short-read NGS Read Mapping NGMLR BWA Variant Calling pbsv GATK Visualization IGV 2.4 IGV

VISUALIZATION insertion deletion Robinson et al. (2011) Nature Biotechnology. doi:10.1038/nbt.1754.

WGS FOR STRUCTURAL VARIANT DISCOVERY Structural Variants Small Variants Sequencing PacBio Sequel System Short-read NGS Read Mapping NGMLR BWA Variant Calling pbsv GATK Visualization IGV 2.4 IGV

% SVs detected HOW MUCH TO SEQUENCE? 100% 90% 80% 70% 60% 50% 40% Het Hom Human HG00733 Sequel System 211 Gb (70-fold) 30% 20% short read 10% 0% 0 10 20 30 40 50 Coverage 5- to 10-fold optimal tradeoff of cost vs. performance disease gene discovery; population characterization 30- to 40-fold saturate discovery de novo variant discovery

CLINICAL CASE HISTORY 7 yrs 10 yrs left atrial myxoma resection, atrial repair testicular mass, right orchiectomy 13 yrs pituitary tumor 16 yrs 18 yrs 19 yrs recurrence of myxomata, resection, adrenal microadenoma recurrence of ventricular myxomata, resection, VT ACTH-independent Cushing s disease, thyroid nodules genetics suggests Carney complex PRKAR1A testing negative 21 yrs transphenoidal resection of pituitary present (26 yrs) recurrence of myxomata, consideration for heart transplant short-read whole genome sequencing negative Merker et al. (2017) Genetics in Medicine. doi:10.1038/gim.2017.86.

% SVs detected EVALUATING STRUCTURAL VARIANTS Deletions Insertions Initial call set 6,971 6,821 100% 90% 8-fold 80% 70% 60% 50% 40% 30% 20% 10% Het Hom Merker et al. (2017) Genetics in Medicine. doi:10.1038/gim.2017.86. 0% 0 10 20 30 40 50 Coverage

% SVs detected EVALUATING STRUCTURAL VARIANTS Deletions Insertions Initial call set 6,971 6,821 Not in segdup 5,893 6,254 100% 8-fold Not in NA12878 healthy control Overlaps RefSeq coding exon 2,476 3,171 39 16 90% 80% 70% 60% 50% 40% Het Hom Gene linked to some disease in OMIM 3 3 30% 20% 10% Merker et al. (2017) Genetics in Medicine. doi:10.1038/gim.2017.86. 0% 0 10 20 30 40 50 Coverage

HETEROZYGOUS 2.2 KB DELETION IN PRKAR1A PacBio discovery Sanger confirmation Merker et al. (2017) Genetics in Medicine. doi:10.1038/gim.2017.86.

PacBio Long-Read WGS for Structural Variant Discovery Targeted Enrichment without Amplification and SMRT Sequencing of Repeat-Expansion Disease Causative Genomic Regions

REPEAT EXPANSION DISEASES

CRISPR/CAS9 SYSTEM Bacterial Adaptive Immunity RNA-guided DNA Endonuclease Some in vivo applications: - Gene silencing - Homology-directed repair - Transient gene silencing or transcriptional repression - Transient activation of endogenous genes - Transgenic animals and embryonic stem cells

PCR-FREE TARGET ENRICHMENT VIA CAS9

COVERAGE ACROSS THE GENOME 1 SMRT Cell (PacBio RS II)

HUNTINGTON S DISEASE (HD) - Autosomal dominant neurodegenerative genetic disorder - Caused by an expansion of a CAG triplet repeat stretch in the Huntingtin (HTT) gene - polyglutamine tract

CAG REPEAT COUNTS

CAG REPEAT COUNTS IN HD PATIENTS Widening repeat number distribution at the mutated allele is biological Obtained roughly equal number of sequenced molecules for normal and mutated alleles Samples obtained from Vanessa Wheeler (Harvard Medical School)

FRAGILE X SYNDROME - Most common heritable form of cognitive impairment - Caused by expansion of a CGG trinucleotide repeat in the 5 UTR of the FMR1 gene fraxa.org

AGG INTERRUPTIONS REDUCE THE CHANCES OF PRE- TO FULL MUTATION TRANSMISSION CGG CGG CGG CGG AGG CGG Difference in risk is greatest near 75-80 CGG repeats Having full sequence information is medically relevant 80% 60% 15% Yrigollen et al. (2012) Genet Med Maternal CGG repeat number 2 CGG CGG CGG CGG AGG CGG CGG CGG CGG CGG CGG CGG CGG CGG AGG CGG 1 CGG CGG CGG CGG AGG CGG CGG CGG CGG CGG CGG CGG CGG CGG CGG CGG 0 CGG CGG CGG CGG CGG CGG CGG CGG CGG CGG CGG CGG CGG CGG CGG CGG Yrigollen et al. (2012) Genet Med 14:729 736

SUBREAD COVERAGE ON THE SEQUEL SYSTEM 1 Sequel SMRT Cell 1M

MULTIPLEXED SAMPLES ON THE SEQUEL SYSTEM CAG Repeat Counts from 3 Controls and 3 HD Patients

CONCLUSION Amplification-free enrichment with CRISPR/Cas9 and SMRT Sequencing achieves the base-level resolution required to understand the underlying biology of repeat expansion disorders - Target any hard-to-amplify genomic region regardless of sequence context - Avoid PCR bias and PCR errors - Accurately sequence through long repetitive and low-complexity regions - Count repeats and identify sequence interruptions - Detect sample mosaicism

www.pacb.com For Research Use Only. Not for use in diagnostics procedures. Copyright 2017 by Pacific Biosciences of California, Inc. All rights reserved. Pacific Biosciences, the Pacific Biosciences logo, PacBio, SMRT, SMRTbell, Iso-Seq, and Sequel are trademarks of Pacific Biosciences. BluePippin and SageELF are trademarks of Sage Science. NGS-go and NGSengine are trademarks of GenDx. FEMTO Pulse and Fragment Analyzer are trademarks of Advanced Analytical Technologies. All other trademarks are the sole property of their respective owners.