Introduction

We are leveraging genome sequencing data from The Cancer Genome Atlas (TCGA) to more accurately define mutated and stable genes and dysregulated metabolic pathways in solid tumors. These efforts are motivated by the failure of currently available methods to correctly categorize many gene variants of unknown significance (VUS), despite their substantial potential to be pathogenic. Mutations in coding and non-coding regions (typically near exon/intron boundaries) can affect mRNA processing, which can result in aberrant splicing. The Shannon Pipeline, developed in our lab, uses information theory to detect and interpret these mRNA splicing mutations at high throughput.

The putative variants that result require empirical confirmation before in silico predictions can be translated into clinically relevant insights. Visually inspecting each variant against the corresponding mRNA sequencing (RNA-Seq) data becomes intractable, and an unbiased genome-wide statistical analysis is difficult to complete. These analyses are possible with the available DNA and companion mRNA sequencing data originating from the same patient sample.

OBJECTIVE: To develop a method that automatically validates putative DNA sequencing variants that alter mRNA splicing across multiple patient samples, using the corresponding RNA sequencing data.
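To make the information-theory idea concrete, the sketch below scores a splice donor site with an individual-information weight matrix and reports the change caused by a +1 G>A mutation. It is a minimal illustration and not the Shannon Pipeline's implementation: the toy site alignment, the pseudocount, the omission of the small-sample correction, and the function names weight_matrix and individual_information are all assumptions made for this example.

```python
import math

BASES = "ACGT"

def weight_matrix(sites, pseudocount=0.5):
    """Per-position weights 2 + log2(f(b, l)) from aligned sites.

    Assumption: a pseudocount replaces the small-sample correction
    and keeps unseen bases from producing log2(0).
    """
    matrix = []
    for pos in range(len(sites[0])):
        counts = {b: pseudocount for b in BASES}
        for site in sites:
            counts[site[pos]] += 1
        total = sum(counts.values())
        matrix.append({b: 2.0 + math.log2(counts[b] / total) for b in BASES})
    return matrix

def individual_information(seq, matrix):
    """Sum the weights of the bases observed in seq (bits)."""
    return sum(matrix[i][base] for i, base in enumerate(seq))

# Hypothetical toy alignment of donor sites: 3 exonic bases, then the intron.
toy_donor_sites = [
    "CAGGTAAGT", "AAGGTGAGT", "CAGGTAAGA",
    "GAGGTAAGC", "CAGGTGAGT", "TAGGTAAGG",
]
matrix = weight_matrix(toy_donor_sites)

wild_type = "CAGGTAAGT"
mutant = "CAGATAAGT"  # +1 G>A at index 3 (first intronic base)

delta = individual_information(mutant, matrix) - individual_information(wild_type, matrix)
print(f"Change in information: {delta:.2f} bits")
```

A negative change indicates that the mutant site is predicted to be weaker than the wild-type site, which is the kind of prediction that then needs confirmation with RNA-Seq data.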

Workflow diagram depicting the analyses undertaken by Veridical to analyze splicing mutations in TCGA breast carcinoma data. Items outlined in red comprise the input data for Veridical. TCGA input data is indicated by an orange shadow. The steps involving Perl programs have a purple shadow.

Using RNA Sequencing Data to Validate Splicing Mutations

Tumor: TCGA-A2-A0CQ. The transcriptome (RNA-Seq) is sequenced from RNA after splicing has occurred, so introns are not expected in the reads.
[Figure: G>A mutation. Wild type: expected (intron spliced out). Mutated: not expected (intron inclusion).]
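The read-abundance view of intron inclusion can be sketched as counting the aligned reads that overlap the intron next to the mutation. The example is a simplified assumption rather than Veridical's code: reads are modelled as contiguous half-open (start, end) intervals, the coordinates are invented, and count_intron_reads and min_overlap are hypothetical names.

```python
def count_intron_reads(reads, intron_start, intron_end, min_overlap=1):
    """Count reads overlapping the intron [intron_start, intron_end).

    reads: iterable of (start, end) half-open alignment intervals.
    min_overlap: minimum number of intronic bases a read must cover.
    """
    count = 0
    for start, end in reads:
        overlap = min(end, intron_end) - max(start, intron_start)
        if overlap >= min_overlap:
            count += 1
    return count

# Toy alignments around a hypothetical intron at [100, 200).
toy_reads = [
    (50, 90),    # exonic only
    (80, 130),   # crosses the exon/intron boundary
    (120, 170),  # entirely intronic
    (190, 240),  # crosses the intron/exon boundary
    (200, 250),  # exonic only
]

intron_reads = count_intron_reads(toy_reads, 100, 200)
print(intron_reads)
```

In a correctly spliced transcript this count is expected to be close to zero, so a high count in the sample carrying the mutation is the signal of interest.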

Veridical

Hypothesis-driven.
Statistically validates mutations throughout the entire exome using RNA sequencing data.
Can be used to validate mutations in any individual or disease.
Intron inclusion (read-abundance), exon skipping (junction-spanning), and cryptic splicing (junction-spanning).
Compared splicing consequences with > 500 tumour/normal controls.

Veridical: Intron Inclusion with Mutation

Tumor: TCGA-A2-A0CQ, G>A mutation, transcriptome (RNA-Seq), zoomed view.
Reads containing the +1 G>A splice site mutation show intron inclusion. This is not observed in tumours (n = 446) and normal breast tissue (n = 106) that do not have the mutation (p < 0.01).
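One simple way to compare a variant-carrying sample against a large control set is an empirical p-value: the fraction of controls whose intron-inclusion read count is at least as high as the variant sample's. The slides do not state which test Veridical applies, so this is a stand-in technique under that assumption; the control counts are synthetic and empirical_p_value is a hypothetical name.

```python
def empirical_p_value(variant_count, control_counts):
    """One-sided empirical p-value with the usual +1 correction.

    Counts controls whose value is >= the variant sample's value.
    """
    at_least = sum(1 for c in control_counts if c >= variant_count)
    return (at_least + 1) / (len(control_counts) + 1)

# Synthetic control counts for 446 tumours + 106 normals (values 0..4).
control_counts = [i % 5 for i in range(446 + 106)]

p = empirical_p_value(12, control_counts)
print(f"p = {p:.4f}")
```

With 552 controls the smallest attainable value is 1/553, so a control set of this size is large enough to resolve the p < 0.01 threshold quoted on the slide.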

Histogram and embedded Q-Q plots portraying the difference between untransformed and Yeo-Johnson (YJ) transformed data [1]. The plots depict intron inclusion for the inactivating mutation (chr12:83359523 G>A) within TMTC2. The arrowheads denote the number of reads in the variant-containing file, which is, in all cases, higher than that observed in the control samples (p < 0.01). Blue and red plot elements correspond to untransformed data, while yellow and purple correspond to YJ-transformed data. Dotted lines in the Q-Q plots pass through the first and third quartiles of a normal reference distribution. The top plot shows the distribution of read-abundance intron inclusion in normal samples, while the bottom shows the same for tumour samples.

[1] Yeo IK, Johnson RA: A new family of power transformations to improve normality or symmetry. Biometrika. 2000; 87(4): 954-959.
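The Yeo-Johnson transformation itself is short enough to write out. The sketch below applies it with a fixed lambda to toy control counts and then scores the variant sample with a normal upper-tail probability. Several parts are assumptions: lambda is normally estimated from the data rather than fixed, the z-score test is only one plausible use of the transformed values, and the counts and the name upper_tail_p are invented for illustration.

```python
import math
from statistics import mean, stdev

def yeo_johnson(y, lam):
    """Yeo-Johnson power transformation of a single value."""
    if y >= 0:
        if abs(lam) < 1e-12:
            return math.log1p(y)
        return ((y + 1.0) ** lam - 1.0) / lam
    if abs(lam - 2.0) < 1e-12:
        return -math.log1p(-y)
    return -(((1.0 - y) ** (2.0 - lam)) - 1.0) / (2.0 - lam)

def upper_tail_p(variant_count, control_counts, lam):
    """Normal upper-tail probability of the variant count after
    transforming it and the controls with the same lambda."""
    transformed = [yeo_johnson(c, lam) for c in control_counts]
    z = (yeo_johnson(variant_count, lam) - mean(transformed)) / stdev(transformed)
    return 0.5 * math.erfc(z / math.sqrt(2.0))

# Toy, right-skewed control read counts and a fixed lambda (assumption).
toy_controls = [0, 1, 1, 2, 2, 3, 3, 4, 5, 8]
p_value = upper_tail_p(40, toy_controls, lam=0.0)
print(f"p = {p_value:.2e}")
```

Read counts are typically right-skewed, which is why a transformation toward normality is useful before a normal-theory comparison is applied.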

Conclusions

Veridical provides a general, hypothesis-driven framework for the elucidation of validated variants.
It can be utilized irrespective of the underlying disease.
In silico variant prediction followed by validation with RNA sequencing data significantly reduces the set of variants for downstream analyses and can aid in formulating biological insights.
It provides a robust and high-throughput method to handle the vast quantity of putative variants.
When adequate expression data are available at the locus carrying the mutation, this approach reveals a comprehensive set of genes exhibiting mRNA splicing defects in complete genomes and exomes.
Known breast cancer genes may be mutated in a higher percentage of tumours than currently reported.
Splicing mutations are more prevalent than currently reported.