RNA sequencing of cancer reveals novel splicing alterations

Jeyanthy Eswaran, Anelia Horvath, Sucheta Godbole, Sirigiri Divijendra Reddy, Prakriti Mudvari, Kazufumi Ohshiro, Dinesh Cyanam, Sujit Nair, Suzanne A. W. Fuqua, Kornelia Polyak, Liliana D. Florea & Rakesh Kumar

Supplemental Table 1: NBS global sequencing statistics and read distribution

Normal breast sample global mRNA sequencing statistics

Global Statistics               NBS1       NBS2       NBS3
Total Number of reads           63636829   61418652   69648831
Unique Reads                    63199920   60961838   69301083
Aligned Reads                   58889160   56654064   64624628
Unique Exons                    156808     156716     156808
Total Exons                     248904     241317     248904
Transcripts (known and novel)   21718      21724      21939
Genes                           15562      15605      15498
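The read counts in Supplemental Table 1 can be summarized as an alignment rate per sample. The following is a minimal sketch, not part of the original analysis: the helper name `alignment_rate` is hypothetical, and the rate is assumed here to be aligned reads divided by total reads.

```python
# Read counts copied from Supplemental Table 1.
TOTAL_READS = {"NBS1": 63636829, "NBS2": 61418652, "NBS3": 69648831}
ALIGNED_READS = {"NBS1": 58889160, "NBS2": 56654064, "NBS3": 64624628}


def alignment_rate(aligned, total):
    """Return aligned / total (hypothetical helper, assumed definition)."""
    if total <= 0:
        raise ValueError("total read count must be positive")
    return aligned / total


# One rate per normal breast sample.
rates = {
    sample: alignment_rate(ALIGNED_READS[sample], TOTAL_READS[sample])
    for sample in TOTAL_READS
}

for sample, rate in rates.items():
    print(f"{sample}: {rate:.1%} of reads aligned")
```

Under this definition all three normal breast samples fall between 92% and 93% aligned reads.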

Supplemental Table 2: The number of cancer specific isoforms that align (nblast) with the human open reading frame database, human ORFeome 8.1 (http://horfdb.dfci.harvard.edu/).

Cancer group     Total cancer specific isoforms that     Cancer specific isoforms that align
                 express only in cancer subtype          with ORF above 90% identity
                 540                                     246
Non-             355                                     165
HER-2 Positive   588                                     254
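The two columns of Supplemental Table 2 can be related as a fraction of subtype-specific isoforms that align with an ORF. This is an illustrative sketch only: the variable names are hypothetical, and the first row is left unlabeled because the table does not carry a label for it.

```python
# Rows copied from Supplemental Table 2, in table order:
# (group label, isoforms expressed only in subtype, isoforms aligning with ORF at >90% identity).
# The first row has no label in the table, so an empty string is used here.
ROWS = [
    ("", 540, 246),
    ("Non-", 355, 165),
    ("HER-2 Positive", 588, 254),
]

# Fraction of cancer specific isoforms that align with an ORF, per row.
fractions = [aligned / total for _, total, aligned in ROWS]

for (label, total, aligned), fraction in zip(ROWS, fractions):
    name = label or "(unlabeled row)"
    print(f"{name}: {aligned}/{total} = {fraction:.1%}")
```

In each group somewhat under half of the subtype-specific isoforms align with an ORF above 90% identity.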

Supplemental Table 3: The number of genes involved in differential promoter usage and promoter switching events in breast cancers in comparison to NBS

Differential promoter usage and promoter switching events (PSE)

Differential promoter    Statistically significant   Number of primary   Genes that change coding
usage comparison         PSE genes                   transcripts         region due to PSE
 vs NBS                  138                         1697                75
Non- vs NBS              83                          832                 44
HER-2 vs NBS             178                         2690                152

Supplement Table 4: Determination of alternative splice events in individual breast cancer samples in comparison to NBS 1 (A), 2 (B) and 3 (C) using Multivariate Analysis of Transcript Splicing (MATS).

Supplement Table 5: Determination of alternative splice events found common in individual breast cancer type vs. normal breast sample 1, 2 and 3 using MATS.

Comparisons used to identify     Exon Skip   Alternative   Alternative   Retained
cancer specific events                       5' End        3' End        Intron
 vs. NBS                         2           0             0             24
Non- vs. NBS                     7           0             3             37
HER2-positive vs. NBS            5           1             4             20

Supplement Table 6A: Determination of alternative splice events in merged breast cancer samples in comparison to merged NBS using MATS

Merged Groups taken      Alternative   Alternative   Mutually         Intron      Exon skip
for MATS                 3 prime       5 prime       exclusive exon   retention
 vs. NBS                 405           446           1549             2038        2898
Non- vs. NBS             394           443           1573             2124        2811
HER2-positive vs. NBS    358           398           1387             2148        2027

Supplement Table 6B: Determination of switch-like events, i.e. specific alternative splice events that occur only in merged breast cancer samples or in merged NBS, using MATS

                                                  Number of Events
Event Type                Event                   Group A vs NBS   Group B vs NBS   Group C vs NBS
Exon Skip                 Exclusion in Cancer     2153             1357             1671
                          Inclusion in Cancer     22615            21453            21438
                          Exclusion in Normal     3900             4531             4421
                          Inclusion in Normal     2134             2853             3595
Alternative 3' end        Exclusion in Cancer     464              340              403
                          Inclusion in Cancer     1223             984              1138
                          Exclusion in Normal     554              610              596
                          Inclusion in Normal     204              283              252
Alternative 5' end        Exclusion in Cancer     245              197              252
                          Inclusion in Cancer     890              745              788
                          Exclusion in Normal     447              473              447
                          Inclusion in Normal     132              162              177
Mutually Exclusive Exon   Exclusion in Cancer     295              172              213
                          Inclusion in Cancer     262              183              212
                          Exclusion in Normal     375              428              420
                          Inclusion in Normal     355              401              386
Intron Retention          Inclusion in Cancer     843              679              797
                          Exclusion in Normal     2903             2942             2857
                          Inclusion in Normal     128              155              114
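The event counts in Supplement Table 6A can be compared across groups by totalling each row and finding the most frequent event type. The sketch below is an illustration under stated assumptions: the row keys "row 1" to "row 3" are hypothetical stand-ins for the three comparisons in table order, and the summary is not a statistic reported by MATS.

```python
# Event counts copied from Supplement Table 6A, one entry per comparison in table order.
EVENT_TYPES = [
    "Alternative 3 prime",
    "Alternative 5 prime",
    "Mutually exclusive exon",
    "Intron retention",
    "Exon skip",
]
COUNTS = {
    "row 1": [405, 446, 1549, 2038, 2898],
    "row 2": [394, 443, 1573, 2124, 2811],
    "row 3": [358, 398, 1387, 2148, 2027],
}

# Total number of events per comparison.
totals = {row: sum(values) for row, values in COUNTS.items()}

# Most frequent event type per comparison.
most_common = {
    row: EVENT_TYPES[values.index(max(values))] for row, values in COUNTS.items()
}

for row in COUNTS:
    share = max(COUNTS[row]) / totals[row]
    print(f"{row}: {totals[row]} events, most common: {most_common[row]} ({share:.1%})")
```

Exon skipping is the largest category in the first two comparisons, while intron retention is the largest in the third.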

Supplement Table 7: Annotation of splice events in individual breast cancer samples from, Non- and HER2-positive group and in comparison to normal breast samples using direct exon model comparison

Type of Splice Events

 Group Samples   TSS      TTS     SKIP_ON   SKIP_OFF   MSKIP_ON   MSKIP_OFF
1                100668   42294   159463    159463     37409      37409
2                99272    41917   160030    160030     37839      37839
3                101755   42739   161433    161433     37873      37873
4                99858    41870   159597    159597     37256      37256
5                99818    41961   160041    160041     37888      37888
6                99999    41896   160211    160211     37958      37958

Non- Group Samples   TSS     TTS     SKIP_ON   SKIP_OFF   MSKIP_ON   MSKIP_OFF
Non-1                97004   39179   159305    159305     37569      37569
Non-2                97378   39536   160042    160042     37689      37689
Non-3                97699   39581   159338    159338     37198      37198
Non-4                97783   39579   160122    160122     37728      37728
Non-5                97865   39757   159951    159951     37715      37715
Non-6                96762   38965   160556    160556     37633      37633

HER2-positive breast cancer Samples   TSS     TTS     SKIP_ON   SKIP_OFF   MSKIP_ON   MSKIP_OFF
HER2_1                                96832   38735   159225    159225     37447      37447
HER2_2                                96681   38707   159606    159606     37670      37670
HER2_3                                96456   38522   159677    159677     37674      37674
HER2_4                                97106   38771   159578    159578     37489      37489
HER2_5                                96061   38279   161160    161160     37882      37882

Supplement Table 8: Annotation of novel splice events that are common in individual breast cancers after eliminating all the splice events that occur in normal breast samples as well as in the reference human genome, hg19, using direct exon model comparison

Common splice events in breast cancer groups after eliminating events similar to hg19

                Type of Splice Events
Groups          TSS   TTS   SKIP_ON   SKIP_OFF   MSKIP_ON   MSKIP_OFF
                10    0     108       103        32         10
Non-            5     3     120       99         38         6
HER2-positive   40    1     148       111        31         7

Supplemental Figure 1: Venn diagram showing the overlapping transcripts that are similar to reference between A. all three breast cancer types and B. normal breast sample (NBS)

Supplemental Figure 2: CummeRbund plots of the expression level distribution for all genes that are considered from individual experimental conditions shown as the A. csdensity plot B. dendrogram

[Panel A: x-axis log10(fpkm), y-axis density; legend sample_name: NBT, Non_, HER2]

Supplemental Figure 3: Isoforms associated with statistically significant differentially spliced genes (p-value<0.05) identified through pairwise comparisons of vs NBS (A), non- vs NBS (B), and HER2-positive vs NBS (C).

Supplemental Figure 4: The distributions of A. genes B. primary transcripts and C. coding sequence FPKM across all four groups shown as csboxplot

[Panels A, B, C: x-axis sample_name (NBT, Non_, HER2), y-axis log10(fpkm)]

Supplemental Figure 5: FPKM bins of de novo reassembled transcripts from the cufflinks assembler that are classified as novel and reference-like using the cuffcompare program in A. B. Non- and C. HER2-positive D. Normal breast samples

[Axis labels: expression, transcripts]

Supplemental Figure 6: The relative abundances (FPKM) of top 20 statistically significant splice genes identified from the pair wise comparison between normal breast samples vs. (A), (B) Non- and (C) HER2-positive breast cancers

[Panels A, B, C: y-axis FPKM + 1 (log scale); legend sample_name: NBT, Non_, HER2]

Gene labels, in three sets of 20 as extracted:

GRIPAP1; P2RY10; PGK1; ENG; DOCK8; TMEM71; ASAP1; DSCC1,TAF2; LMO7,UCHL3; CDK8; GCN1L1; RBM19; UTP20; HSPA8,SNORD14C; SLC36A4; CTTN,PPFIA1; TEAD1; BTAF1; KIAA1274; EPHX1,SRP9

PGK1; MED12,NLGN3; ZER1; DPM2,FAM102A; NCBP1; DSCC1,TAF2; ENPP2; PARP2; VPS36; RPLP0; GCN1L1; PEBP1; LRRK2; FGD4; HSPA8,SNORD14C; NT5C2; PARD3; KIAA1217; MLLT10; CROCC

THOC2; NUP62CL,RBM41; DOCK11; ALG13; PGK1; ATP7A; ZMYND19; EPB41L4B; SPTAN1; VPS13A; KIAA1797; PRKDC; SORL1; CTTN,PPFIA1; MARK2; FRA10AC1; PARD3; MAPK8; RBM17; INTS3,SLC27A3

Supplemental Figure 7: The overlap of statistically significant splice genes identified from the pair wise comparison between (A) normal breast samples vs., Non- and HER2-positive breast cancers and (B) comparison among the cancer subtypes alone.

Supplemental Figure 8: A. The top panel shows the total number of differentially spliced genes associated with known and novel isoforms in (red), non- (green) and HER2-positive (blue) breast cancers. The bottom panel presents the total number of differentially spliced isoforms that are expressed only in, non- and HER2-positive breast cancers. B. Top 20 genes with an abundant novel splice isoform that is not expressed in NBT. The genes are sorted by the highest abundance (FPKM) for, Non- and HER2. The relative abundance of the novel isoform in the other two breast cancer subtypes is also shown (color coded). The exon models for the novel isoforms are shown in Supplemental file 11.

Supplemental Figure 9: Venn diagram showing the overlap among candidates identified from cuffdiff statistical test at the level of pre-mRNA (TSS), Splicing (Splice), Promoter usage (Promoter) and Coding sequence (CDS) between the normal breast samples vs. (A) Non- (B) and HER2-positive (C) breast cancers.

Supplemental Figure 10. Relative abundance of TFAP2A isoforms

Supplemental Figure 11: The pathway influenced by the differentially splicing genes

NBS vs. differentially splicing genes: Cell death, cellular function and maintenance, cell cycle (27 out of 35 molecules)

NBS vs. Non- differentially splicing genes: Post-translational modification, digestive system development and function, embryonic development (27 out of 35 molecules)

NBS vs. HER2-positive differentially splicing genes: Cell morphology, cellular function and maintenance, embryonic development (27 out of 35 molecules)

Supplemental Figure 12: The pathway influenced by the differentially expressing primary transcripts

NBS vs. differentially expressing primary transcript genes: Immunological disease, cell to cell signaling and interaction, cellular movement (30 out of 35 molecules)

NBS vs. Non- differentially expressing primary transcript genes: Tissue morphology, cell cycle, hair and skin development and function (29 out of 35 molecules)

NBS vs. HER2-positive differentially expressing primary transcript genes: Infectious disease, renal and urological disease, antimicrobial response (31 out of 35 molecules)

Supplemental Figure 13: The pathway influenced by the genes that are involved in differential promoter usage

NBS vs. differential promoter usage genes: Cellular development, cell to cell signaling and interaction, hematological system development and function (14 out of 35 molecules)

NBS vs. Non- differential promoter usage genes: Cell to cell signaling and interaction, connective tissue development and function, cancer (11 out of 35 molecules)

NBS vs. HER2-positive differential promoter usage genes: Tissue development, cell death, cell morphology (19 out of 35 molecules)

Supplemental Figure 14: Overlap between differential splice, primary transcripts, promoter usage and promoter switching that occur in, Non- and HER2-positive in comparison with NBS.

Gene labels shown in the panels, in the order extracted:

DYRK1A, MSI2, MLL5, FTO, LRBA, PHF16, ABCG1, GRIPAP1, SEC15L1, PHF16, ENO1, KIAA0556, AC0272.6, HSPA18, AC092135.1

FGD4, NCAPD2, KIAA0664, TIAA1217, SNHG7

ALDH1N1, CPSF7, TFAP2A, HSPA8, SNORD14, FCHO2

SDCCAG8, SORL1, SFPQ, ZC3H7a, DICER1, CASP10, MBD5, CSDA, FRYL, PPP1R12A, ASAP1, DDI2, GPATCH8, LTN1, TRAPPC10, DROSHA, DHX9, CCDC107, ATP11B, INPP4B, NF5B, IKBKB, FBXW7, TFAP2A, BRE, PHF14, RC3H2, HSPA8, MAMDC2, AC013272.3, FCHO2, TRAF3IP1

Supplemental Figure 15: Exon model of the, Non- and HER2-positive validated novel isoforms

 Novel Isoform: validated candidate PHLPP2
Non- Novel Isoform: validated candidate LARP1
HER2-positive Novel Isoform: validated candidate ADD3

Supplemental Figure 16: Predicted protein domain models of the validated novel hybrid isoforms, PHLPP2 (A), ADD3 (B) and LARP1 (C)

Supplemental Figure 17: Overlap between the mRNA sequencing based alternatively splicing genes and the genes identified from the comparative analysis of normal vs. ductal carcinoma in situ or invasive breast cancer

Microarray based Normal vs. DCIS resulting differentially splicing genes, compared with mRNA sequencing based Normal vs., Non- and HER2-positive breast cancer differentially splicing genes. Diagram labels: microarray, mRNA seq. Counts as extracted: 7086, 408, 434.

Microarray based Normal vs. IBC resulting differentially splicing genes, compared with mRNA sequencing based Normal vs., Non- and HER2-positive breast cancer differentially splicing genes. Diagram labels: microarray, mRNA seq. Counts as extracted: 7696, 374, 468.