Lecture 8 Understanding Transcription RNA-seq analysis. Foundations of Computational Systems Biology David K. Gifford

Size: px
Start display at page:

Download "Lecture 8 Understanding Transcription RNA-seq analysis. Foundations of Computational Systems Biology David K. Gifford"

Transcription

1 Lecture 8 Understanding Transcription RNA-seq analysis Foundations of Computational Systems Biology David K. Gifford 1

2 Lecture 8 RNA-seq Analysis RNA-seq principles How can we characterize mrna isoform expression using high-throughput sequencing? Differential expression and PCA What genes are differentially expressed, and how can we characterize expressed genes? Single cell RNA-seq What are the benefits and challenges of working with single cells for RNA-seq? 2

3 RNA-Seq characterizes RNA molecules export to cytoplasm nucleus A B C A C mrna High-throughput sequencing of RNAs at various stages of processing splicing A B C pre-mrna or ncrna transcription A B C Gene in genome Courtesy of Cole Trapnell. Used with permission. cytoplasm Slide courtesy Cole Trapnel 3

4 Pervasive tissue-specific regulation of alternative mrna isoforms. 4 Courtesy of Macmillan Publishers Limited. Used with permission. Source: Wang, Eric T., Rickard Sandberg, et al. "Alternative Isoform Regulation in Human Tissue Transcriptomes." Nature 456, no (2008): ET Wang et al. Nature 000, 1-7 (2008) doi: / nature07509

5 RNA-Seq: millions of short reads from fragmented mrna Extract RNA from cells/tissue + splice junctions! Courtesy of Macmillan Publishers Limited. Used with permission. Source: Pepke, Shirley, Barbara Wold, et al. "Computation for ChIP-seq and RNA-seq Studies." Nature Methods 6 (2009): S Pepke et. al. Nature Methods

6 Mapping RNA-seq reads to a reference genome reveals expression Sox2 6

7 RNA-seq reads map to exons and across exons Reads over exons Smug1 Junction reads (split between exons) 7

8 Two major approaches to RNA-seq analysis 1. Assemble reads into transcripts. Typical issues with coverage and correctness. 2. Map reads to reference genome and identify isoforms using constraints Goal is to quantify isoforms and determine significance of differential expression Common RNA-seq expression metrics are Reads per killobase per million reads (RPKM) or Fragments per killobase per million (FPKM) Short sequencing reads, randomly sampled from a transcript exon 1 exon 2 exon 3 8

9 Aligned reads reveal isoform possibilities A B C identify candidate exons via genomic mapping A B A C B C Generate possible pairings of exons A B A C B C Align reads to possible junctions Courtesy of Cole Trapnell. Used with permission. Slide courtesy Cole Trapnell 9

10 We can use mapped reads to learn the isoform mixture ψ" A D C B Isoform Fraction T 1 ψ 1" T 2 ψ 2" T 3 ψ 3" T 4 ψ 4" E Courtesy of Cole Trapnell. Used with permission. Slide courtesy Cole Trapnell 10

11 Detecting alternative splicing from mrna-seq data Isoforms Inclusion reads Common reads Common reads Exclusion reads Given a set of reads, estimate: = Distribution of isoforms 11

12 P(R i T=T j ) Excluded reads If a single ended read or read pair R i is structurally incompatible with transcript T j, then P(R = R i T = T j ) = 0 R i T j Intron in T j Courtesy of Cole Trapnell. Used with permission. Slide courtesy Cole Trapnell 12

13 P(R i T=T j ) Single end reads Cufflinks assumes that fragmentation is roughly uniform. The probability of observing a fragment starting at a specific position S i in a transcript of length l j is:" P(S = S i T = T j ) = 1 l j starting position in transcript, S i R i T j Transcript length l j Courtesy of Cole Trapnell. Used with permission. Slide courtesy Cole Trapnell 13

14 P(R i T=T j ) Paired end reads Assume our library fragments have a length distribution described by a probability density F. Thus, the probability of observing a particular paired alignment to a transcript:" P(R = R i T = T j ) = F(l j(r j )) l j Implied fragment length l j (R i ) R i T j Courtesy of Cole Trapnell. Used with permission. Slide courtesy Cole Trapnell 14

15 Estimating Isoform Expression Find expression abundances ψ 1,,ψ n for a set of isoforms T 1,,T n Observations are the set of reads R 1,,R m m P(R Ψ) = Ψ j P(R = R i T = T j ) i=0 j=0 L(Ψ R) P(R Ψ)P(Ψ) n Ψ = argmaxl(ψ R) Ψ Can estimate mrna expression of each isoform number of reads that map to a gene and ψ using total 15

16 Case study: myogenesis Transcript categories, by coverage Transcripts (%) match contained intra intron novel isoform repeat other Cufflinks identified 116,839 distinct transcribed fragments (transfrags) Nearly 70% of the reads in 14,241 matching transcripts Tracked 8,134 transfrags across all time points, 5,845 complete matches to UCSC/ Ensembl/VEGA Transcripts Tracked 643 new isoforms of known genes across all points Reads per bp Courtesy of Cole Trapnell. Used with permission. Slide courtesy Cole Trapnell 16

17 Case study: myogenesis Transcript categories, by coverage Transcripts (%) match contained intra intron novel isoform repeat other ~25% of transcripts have light sequence coverage, and are fragments of full transcripts Intronic reads, repeats, and other artifacts are numerous, but account for less than 5% of the assembled reads. Transcripts Reads per bp Courtesy of Cole Trapnell. Used with permission. Slide courtesy Cole Trapnell 17

18 Lecture 8 RNA-seq Analysis RNA-seq principles How can we characterize mrna isoform expression using high-throughput sequencing? Differential expression and PCA What genes are differentially expressed, and how can we characterize expressed genes? Single cell RNA-seq What are the benefits and challenges of working with single cells for RNA-seq? 18

19 19

20 20

21 21

22 22

23 23

24 Scaling RNA-seq data (DESeq) i gene or isoform j sample (experiment) m number of samples K ij number of counts for isoform i in experiment j s j sampling depth for experiment j (scale factor) s j = median i K ij ( v=1kiv m ) 1 m 24

25 Model for RNA-seq data (DESeq) i gene or isoform p condition j sample (experiment) p(j) condition of sample j m number of samples K ij number of counts for isoform i in experiment j q ip Average scaled expression for gene i condition p q ip = 1 # of replicates j in replicates K ij s j µ ij = q ip( j) s j σ ij 2 2 = µ + s j ij ( q ) v p ip( j) K ij ~ NB ( µ ij, σ ij 2 ) 25

26 2 2 σ ij = µ ij + s jvp(q ) ip( j) Orange Line DESeq Dashed Orange edger Purple - Poission Courtesy of the authors. License: CC-BY. Source: Anders, Simon, and Wolfgang Huber. "Differential Expression Analysis for Sequence Count Data." Genome Biology 11, no. 10 (2010): R

27 Significance of differential expression using test statistics Hypothesis H0 (null) Condition A and B identically express isoform i with random noise added Hypothesis H1 Condition A and B differentially express isoform Degrees of freedom (dof) is the number of free parameters in H1 minus the number of free parameters in H0; in this case degrees of freedom is 4 2 = 2 (H1 has an extra mean and variance). Likelihood ratio test defines a test statistic that follows the Chi Squared distribution T i = 2 log P( K ia H1)P( ib P ia ib K H1) K, K H 0 ( ) P(H 0) 1 ChiSquaredCDF (T i dof ) 27

28 28 Courtesy of the authors. License: CC-BY. Source: Anders, Simon, and Wolfgang Huber. "Differential Expression Analysis for Sequence Count Data." Genome Biology 11, no. 10 (2010): R106.

29 Hypergeometric test for overlap significance N total # of genes 1000 n1 - # of genes in set A 20 n2 - # of genes in set B 30 k - # of genes in both A and B 3 P( k) =! # " n1 k $! N n1 &# %" n2 k! N $ # & " n2 % $ & % ( ) = P(i) P x k min(n1,n2) i=k

30 30

31 31

32 32

33 33

34 34

35 35

36 Lecture 8 RNA-seq Analysis RNA-seq principles How can we characterize mrna isoform expression using high-throughput sequencing? Differential expression and PCA What genes are differentially expressed, and how can we characterize expressed genes? Single cell RNA-seq What are the benefits and challenges of working with single cells for RNA-seq? 36

37 37 Courtesy of Fluidigm Corporation. Used with permission.

38 Single-cell RNA-Seq of LPS-stimulated bone-marrow-derived dendritic cells reveals extensive transcriptome heterogeneity. AK Shalek et al. Nature 000, 1-5 (2012) doi: /nature Courtesy of Macmillan Publishers Limited. Used with permission. Source: Shalek, Alex K., Rahul Satija, et al. "Single-cell Transcriptomics Reveals Bimodality in Expression and Splicing in Immune Cells." Nature (2013).

39 Analysis of co-variation in single-cell mrna expression levels reveals distinct maturity states and an antiviral cell circuit. AK Shalek et al. Nature 000, 1-5 (2012) doi: /nature Courtesy of Macmillan Publishers Limited. Used with permission. Source: Shalek, Alex K., Rahul Satija, et al. "Single-cell Transcriptomics Reveals Bimodality in Expression and Splicing in Immune Cells." Nature (2013).

40 Analysis of co-variation in single-cell mrna expression levels reveals distinct maturity states and an antiviral cell circuit. AK Shalek et al. Nature 000, 1-5 (2012) doi: /nature Courtesy of Macmillan Publishers Limited. Used with permission. Source: Shalek, Alex K., Rahul Satija, et al. "Single-cell Transcriptomics Reveals Bimodality in Expression and Splicing in Immune Cells." Nature (2013).

41 RNA-seq library complexity can help qualify cells for analysis 41 Michal Grzadkowski Michal Grzadkowski. All rights reserved. This content is excluded from our Creative Commons license. For more information, see

42 RNA-seq library complexity can help qualify cells for analysis Michal Grzadkowski Michal Grzadkowski. All rights reserved. This content is excluded from our Creative Commons license. For more information, see 42

43 43 FIN

44 MIT OpenCourseWare J / J / J / 7.36J / 6.802J / 6.874J / HST.506J Foundations of Computational and Systems Biology Spring 2014 For information about citing these materials or our Terms of Use, visit:

genomics for systems biology / ISB2020 RNA sequencing (RNA-seq)

genomics for systems biology / ISB2020 RNA sequencing (RNA-seq) RNA sequencing (RNA-seq) Module Outline MO 13-Mar-2017 RNA sequencing: Introduction 1 WE 15-Mar-2017 RNA sequencing: Introduction 2 MO 20-Mar-2017 Paper: PMID 25954002: Human genomics. The human transcriptome

More information

Analysis of Massively Parallel Sequencing Data Application of Illumina Sequencing to the Genetics of Human Cancers

Analysis of Massively Parallel Sequencing Data Application of Illumina Sequencing to the Genetics of Human Cancers Analysis of Massively Parallel Sequencing Data Application of Illumina Sequencing to the Genetics of Human Cancers Gordon Blackshields Senior Bioinformatician Source BioScience 1 To Cancer Genetics Studies

More information

Inference of Isoforms from Short Sequence Reads

Inference of Isoforms from Short Sequence Reads Inference of Isoforms from Short Sequence Reads Tao Jiang Department of Computer Science and Engineering University of California, Riverside Tsinghua University Joint work with Jianxing Feng and Wei Li

More information

BIMM 143. RNA sequencing overview. Genome Informatics II. Barry Grant. Lecture In vivo. In vitro.

BIMM 143. RNA sequencing overview. Genome Informatics II. Barry Grant. Lecture In vivo. In vitro. RNA sequencing overview BIMM 143 Genome Informatics II Lecture 14 Barry Grant http://thegrantlab.org/bimm143 In vivo In vitro In silico ( control) Goal: RNA quantification, transcript discovery, variant

More information

RASA: Robust Alternative Splicing Analysis for Human Transcriptome Arrays

RASA: Robust Alternative Splicing Analysis for Human Transcriptome Arrays Supplementary Materials RASA: Robust Alternative Splicing Analysis for Human Transcriptome Arrays Junhee Seok 1*, Weihong Xu 2, Ronald W. Davis 2, Wenzhong Xiao 2,3* 1 School of Electrical Engineering,

More information

Introduction to Systems Biology of Cancer Lecture 2

Introduction to Systems Biology of Cancer Lecture 2 Introduction to Systems Biology of Cancer Lecture 2 Gustavo Stolovitzky IBM Research Icahn School of Medicine at Mt Sinai DREAM Challenges High throughput measurements: The age of omics Systems Biology

More information

Ambient temperature regulated flowering time

Ambient temperature regulated flowering time Ambient temperature regulated flowering time Applications of RNAseq RNA- seq course: The power of RNA-seq June 7 th, 2013; Richard Immink Overview Introduction: Biological research question/hypothesis

More information

Supplemental Figure S1. Expression of Cirbp mrna in mouse tissues and NIH3T3 cells.

Supplemental Figure S1. Expression of Cirbp mrna in mouse tissues and NIH3T3 cells. SUPPLEMENTAL FIGURE AND TABLE LEGENDS Supplemental Figure S1. Expression of Cirbp mrna in mouse tissues and NIH3T3 cells. A) Cirbp mrna expression levels in various mouse tissues collected around the clock

More information

Transcript reconstruction

Transcript reconstruction Transcript reconstruction Summary I Data types, file formats and utilities Annotation: Genomic regions Genes Peaks bedtools Alignment: Map reads BAM/SAM Samtools Aggregation: Summary files Wig (UCSC) TDF

More information

Supplementary Figures

Supplementary Figures Supplementary Figures Supplementary Figure 1. Heatmap of GO terms for differentially expressed genes. The terms were hierarchically clustered using the GO term enrichment beta. Darker red, higher positive

More information

RNA-seq Introduction

RNA-seq Introduction RNA-seq Introduction DNA is the same in all cells but which RNAs that is present is different in all cells There is a wide variety of different functional RNAs Which RNAs (and sometimes then translated

More information

MODULE 4: SPLICING. Removal of introns from messenger RNA by splicing

MODULE 4: SPLICING. Removal of introns from messenger RNA by splicing Last update: 05/10/2017 MODULE 4: SPLICING Lesson Plan: Title MEG LAAKSO Removal of introns from messenger RNA by splicing Objectives Identify splice donor and acceptor sites that are best supported by

More information

Computational Analysis of UHT Sequences Histone modifications, CAGE, RNA-Seq

Computational Analysis of UHT Sequences Histone modifications, CAGE, RNA-Seq Computational Analysis of UHT Sequences Histone modifications, CAGE, RNA-Seq Philipp Bucher Wednesday January 21, 2009 SIB graduate school course EPFL, Lausanne ChIP-seq against histone variants: Biological

More information

MODULE 3: TRANSCRIPTION PART II

MODULE 3: TRANSCRIPTION PART II MODULE 3: TRANSCRIPTION PART II Lesson Plan: Title S. CATHERINE SILVER KEY, CHIYEDZA SMALL Transcription Part II: What happens to the initial (premrna) transcript made by RNA pol II? Objectives Explain

More information

Transcriptome Analysis

Transcriptome Analysis Transcriptome Analysis Data Preprocessing Sample Preparation Illumina Sequencing Demultiplexing Raw FastQ Reference Genome (fasta) Reference Annotation (GTF) Reference Genome Analysis Tophat Accepted hits

More information

High-Throughput Sequencing Course

High-Throughput Sequencing Course High-Throughput Sequencing Course Introduction Biostatistics and Bioinformatics Summer 2017 From Raw Unaligned Reads To Aligned Reads To Counts Differential Expression Differential Expression 3 2 1 0 1

More information

RNA SEQUENCING AND DATA ANALYSIS

RNA SEQUENCING AND DATA ANALYSIS RNA SEQUENCING AND DATA ANALYSIS Length of mrna transcripts in the human genome 5,000 5,000 4,000 3,000 2,000 4,000 1,000 0 0 200 400 600 800 3,000 2,000 1,000 0 0 2,000 4,000 6,000 8,000 10,000 Length

More information

RNA- seq Introduc1on. Promises and pi7alls

RNA- seq Introduc1on. Promises and pi7alls RNA- seq Introduc1on Promises and pi7alls DNA is the same in all cells but which RNAs that is present is different in all cells There is a wide variety of different func1onal RNAs Which RNAs (and some1mes

More information

fl/+ KRas;Atg5 fl/+ KRas;Atg5 fl/fl KRas;Atg5 fl/fl KRas;Atg5 Supplementary Figure 1. Gene set enrichment analyses. (a) (b)

fl/+ KRas;Atg5 fl/+ KRas;Atg5 fl/fl KRas;Atg5 fl/fl KRas;Atg5 Supplementary Figure 1. Gene set enrichment analyses. (a) (b) KRas;At KRas;At KRas;At KRas;At a b Supplementary Figure 1. Gene set enrichment analyses. (a) GO gene sets (MSigDB v3. c5) enriched in KRas;Atg5 fl/+ as compared to KRas;Atg5 fl/fl tumors using gene set

More information

Lectures 13: High throughput sequencing: Beyond the genome. Spring 2017 March 28, 2017

Lectures 13: High throughput sequencing: Beyond the genome. Spring 2017 March 28, 2017 Lectures 13: High throughput sequencing: Beyond the genome Spring 2017 March 28, 2017 h@p://www.fejes.ca/2009/06/science- cartoons- 5- rna- seq.html Omics Transcriptome - the set of all mrnas present in

More information

Nature Genetics: doi: /ng Supplementary Figure 1. Assessment of sample purity and quality.

Nature Genetics: doi: /ng Supplementary Figure 1. Assessment of sample purity and quality. Supplementary Figure 1 Assessment of sample purity and quality. (a) Hematoxylin and eosin staining of formaldehyde-fixed, paraffin-embedded sections from a human testis biopsy collected concurrently with

More information

Transcriptome and isoform reconstruc1on with short reads. Tangled up in reads

Transcriptome and isoform reconstruc1on with short reads. Tangled up in reads Transcriptome and isoform reconstruc1on with short reads Tangled up in reads Topics of this lecture Mapping- based reconstruc1on methods Case study: The domes1c dog De- novo reconstruc1on method Trinity

More information

Exercises: Differential Methylation

Exercises: Differential Methylation Exercises: Differential Methylation Version 2018-04 Exercises: Differential Methylation 2 Licence This manual is 2014-18, Simon Andrews. This manual is distributed under the creative commons Attribution-Non-Commercial-Share

More information

Supplemental Data. Integrating omics and alternative splicing i reveals insights i into grape response to high temperature

Supplemental Data. Integrating omics and alternative splicing i reveals insights i into grape response to high temperature Supplemental Data Integrating omics and alternative splicing i reveals insights i into grape response to high temperature Jianfu Jiang 1, Xinna Liu 1, Guotian Liu, Chonghuih Liu*, Shaohuah Li*, and Lijun

More information

Methods: Biological Data

Methods: Biological Data Transcriptome analysis of short read Illumina RNA sequencing: investigating baseline variability in gene expression levels and splice variants among human brain and Lymphoblastoid samples Abstract Understanding

More information

STAT1 regulates microrna transcription in interferon γ stimulated HeLa cells

STAT1 regulates microrna transcription in interferon γ stimulated HeLa cells CAMDA 2009 October 5, 2009 STAT1 regulates microrna transcription in interferon γ stimulated HeLa cells Guohua Wang 1, Yadong Wang 1, Denan Zhang 1, Mingxiang Teng 1,2, Lang Li 2, and Yunlong Liu 2 Harbin

More information

Supplementary Figure S1. Gene expression analysis of epidermal marker genes and TP63.

Supplementary Figure S1. Gene expression analysis of epidermal marker genes and TP63. Supplementary Figure Legends Supplementary Figure S1. Gene expression analysis of epidermal marker genes and TP63. A. Screenshot of the UCSC genome browser from normalized RNAPII and RNA-seq ChIP-seq data

More information

MATS: a Bayesian framework for flexible detection of differential alternative splicing from RNA-Seq data

MATS: a Bayesian framework for flexible detection of differential alternative splicing from RNA-Seq data Nucleic Acids Research Advance Access published February 1, 2012 Nucleic Acids Research, 2012, 1 13 doi:10.1093/nar/gkr1291 MATS: a Bayesian framework for flexible detection of differential alternative

More information

Small RNAs and how to analyze them using sequencing

Small RNAs and how to analyze them using sequencing Small RNAs and how to analyze them using sequencing RNA-seq Course November 8th 2017 Marc Friedländer ComputaAonal RNA Biology Group SciLifeLab / Stockholm University Special thanks to Jakub Westholm for

More information

RNA-Seq Preparation Comparision Summary: Lexogen, Standard, NEB

RNA-Seq Preparation Comparision Summary: Lexogen, Standard, NEB RNA-Seq Preparation Comparision Summary: Lexogen, Standard, NEB CSF-NGS January 22, 214 Contents 1 Introduction 1 2 Experimental Details 1 3 Results And Discussion 1 3.1 ERCC spike ins............................................

More information

ChIP-seq data analysis

ChIP-seq data analysis ChIP-seq data analysis Harri Lähdesmäki Department of Computer Science Aalto University November 24, 2017 Contents Background ChIP-seq protocol ChIP-seq data analysis Transcriptional regulation Transcriptional

More information

Analyse de données de séquençage haut débit

Analyse de données de séquençage haut débit Analyse de données de séquençage haut débit Vincent Lacroix Laboratoire de Biométrie et Biologie Évolutive INRIA ERABLE 9ème journée ITS 21 & 22 novembre 2017 Lyon https://its.aviesan.fr Sequencing is

More information

Introduction. Introduction

Introduction. Introduction Introduction We are leveraging genome sequencing data from The Cancer Genome Atlas (TCGA) to more accurately define mutated and stable genes and dysregulated metabolic pathways in solid tumors. These efforts

More information

7SK ChIRP-seq is specifically RNA dependent and conserved between mice and humans.

7SK ChIRP-seq is specifically RNA dependent and conserved between mice and humans. Supplementary Figure 1 7SK ChIRP-seq is specifically RNA dependent and conserved between mice and humans. Regions targeted by the Even and Odd ChIRP probes mapped to a secondary structure model 56 of the

More information

ChIP-seq hands-on. Iros Barozzi, Campus IFOM-IEO (Milan) Saverio Minucci, Gioacchino Natoli Labs

ChIP-seq hands-on. Iros Barozzi, Campus IFOM-IEO (Milan) Saverio Minucci, Gioacchino Natoli Labs ChIP-seq hands-on Iros Barozzi, Campus IFOM-IEO (Milan) Saverio Minucci, Gioacchino Natoli Labs Main goals Becoming familiar with essential tools and formats Visualizing and contextualizing raw data Understand

More information

Single cell RNA Seq reveals dynamic paracrine control of cellular variation

Single cell RNA Seq reveals dynamic paracrine control of cellular variation Single cell RNA Seq reveals dynamic paracrine control of cellular variation The Harvard community has made this article openly available. Please share how this access benefits you. Your story matters Citation

More information

Transcriptional control in Eukaryotes: (chapter 13 pp276) Chromatin structure affects gene expression. Chromatin Array of nuc

Transcriptional control in Eukaryotes: (chapter 13 pp276) Chromatin structure affects gene expression. Chromatin Array of nuc Transcriptional control in Eukaryotes: (chapter 13 pp276) Chromatin structure affects gene expression Chromatin Array of nuc 1 Transcriptional control in Eukaryotes: Chromatin undergoes structural changes

More information

Computational aspects of ChIP-seq. John Marioni Research Group Leader European Bioinformatics Institute European Molecular Biology Laboratory

Computational aspects of ChIP-seq. John Marioni Research Group Leader European Bioinformatics Institute European Molecular Biology Laboratory Computational aspects of ChIP-seq John Marioni Research Group Leader European Bioinformatics Institute European Molecular Biology Laboratory ChIP-seq Using highthroughput sequencing to investigate DNA

More information

Polyomaviridae. Spring

Polyomaviridae. Spring Polyomaviridae Spring 2002 331 Antibody Prevalence for BK & JC Viruses Spring 2002 332 Polyoma Viruses General characteristics Papovaviridae: PA - papilloma; PO - polyoma; VA - vacuolating agent a. 45nm

More information

Hao D. H., Ma W. G., Sheng Y. L., Zhang J. B., Jin Y. F., Yang H. Q., Li Z. G., Wang S. S., GONG Ming*

Hao D. H., Ma W. G., Sheng Y. L., Zhang J. B., Jin Y. F., Yang H. Q., Li Z. G., Wang S. S., GONG Ming* Comparison of transcriptomes and gene expression profiles of two chilling- and drought-tolerant and intolerant Nicotiana tabacum varieties under low temperature and drought stress Hao D. H., Ma W. G.,

More information

Selective depletion of abundant RNAs to enable transcriptome analysis of lowinput and highly-degraded RNA from FFPE breast cancer samples

Selective depletion of abundant RNAs to enable transcriptome analysis of lowinput and highly-degraded RNA from FFPE breast cancer samples DNA CLONING DNA AMPLIFICATION & PCR EPIGENETICS RNA ANALYSIS Selective depletion of abundant RNAs to enable transcriptome analysis of lowinput and highly-degraded RNA from FFPE breast cancer samples LIBRARY

More information

Histone Modifications Are Associated with Transcript Isoform Diversity in Normal and Cancer Cells

Histone Modifications Are Associated with Transcript Isoform Diversity in Normal and Cancer Cells Histone Modifications Are Associated with Transcript Isoform Diversity in Normal and Cancer Cells Ondrej Podlaha 1, Subhajyoti De 2,3,4, Mithat Gonen 5, Franziska Michor 1 * 1 Department of Biostatistics

More information

Accessing and Using ENCODE Data Dr. Peggy J. Farnham

Accessing and Using ENCODE Data Dr. Peggy J. Farnham 1 William M Keck Professor of Biochemistry Keck School of Medicine University of Southern California How many human genes are encoded in our 3x10 9 bp? C. elegans (worm) 959 cells and 1x10 8 bp 20,000

More information

Iso-Seq Method Updates and Target Enrichment Without Amplification for SMRT Sequencing

Iso-Seq Method Updates and Target Enrichment Without Amplification for SMRT Sequencing Iso-Seq Method Updates and Target Enrichment Without Amplification for SMRT Sequencing PacBio Americas User Group Meeting Sample Prep Workshop June.27.2017 Tyson Clark, Ph.D. For Research Use Only. Not

More information

Analysis and design of RNA sequencing experiments for identifying isoform regulation

Analysis and design of RNA sequencing experiments for identifying isoform regulation ARTICLES Analysis and design of RNA sequencing experiments for identifying isoform regulation Yarden Katz 1,2, Eric T Wang 2,3, Edoardo M Airoldi 4 & Christopher B Burge 2,5 2010 Nature America, Inc. All

More information

Nature Structural & Molecular Biology: doi: /nsmb.2419

Nature Structural & Molecular Biology: doi: /nsmb.2419 Supplementary Figure 1 Mapped sequence reads and nucleosome occupancies. (a) Distribution of sequencing reads on the mouse reference genome for chromosome 14 as an example. The number of reads in a 1 Mb

More information

A Practical Guide to Integrative Genomics by RNA-seq and ChIP-seq Analysis

A Practical Guide to Integrative Genomics by RNA-seq and ChIP-seq Analysis A Practical Guide to Integrative Genomics by RNA-seq and ChIP-seq Analysis Jian Xu, Ph.D. Children s Research Institute, UTSW Introduction Outline Overview of genomic and next-gen sequencing technologies

More information

Genome-wide Association Studies (GWAS) Pasieka, Science Photo Library

Genome-wide Association Studies (GWAS) Pasieka, Science Photo Library Lecture 5 Genome-wide Association Studies (GWAS) Pasieka, Science Photo Library Chi-squared test to evaluate whether the odds ratio is different from 1. Corrected for multiple testing Source: wikipedia.org

More information

Pre-mRNA Secondary Structure Prediction Aids Splice Site Recognition

Pre-mRNA Secondary Structure Prediction Aids Splice Site Recognition Pre-mRNA Secondary Structure Prediction Aids Splice Site Recognition Donald J. Patterson, Ken Yasuhara, Walter L. Ruzzo January 3-7, 2002 Pacific Symposium on Biocomputing University of Washington Computational

More information

Small RNA-Seq and profiling

Small RNA-Seq and profiling Small RNA-Seq and profiling Y. Hoogstrate 1,2 1 Department of Bioinformatics & Department of Urology ErasmusMC, Rotterdam 2 CTMM Translational Research IT (TraIT) BioSB: 5th RNA-seq data analysis course,

More information

Package splicer. R topics documented: August 19, Version Date 2014/04/29

Package splicer. R topics documented: August 19, Version Date 2014/04/29 Package splicer August 19, 2014 Version 1.6.0 Date 2014/04/29 Title Classification of alternative splicing and prediction of coding potential from RNA-seq data. Author Johannes Waage ,

More information

Nature Structural & Molecular Biology: doi: /nsmb Supplementary Figure 1

Nature Structural & Molecular Biology: doi: /nsmb Supplementary Figure 1 Supplementary Figure 1 U1 inhibition causes a shift of RNA-seq reads from exons to introns. (a) Evidence for the high purity of 4-shU-labeled RNAs used for RNA-seq. HeLa cells transfected with control

More information

Daehwan Kim September 2018

Daehwan Kim September 2018 Daehwan Kim September 2018 Michael L. Rosenberg Assistant Professor, CPRIT Scholar (214) 645-1738 Lyda Hill Department of Bioinformatics infphilo@gmail.com University of Texas Southwestern Medical Center

More information

Peak-calling for ChIP-seq and ATAC-seq

Peak-calling for ChIP-seq and ATAC-seq Peak-calling for ChIP-seq and ATAC-seq Shamith Samarajiwa CRUK Autumn School in Bioinformatics 2017 University of Cambridge Overview Peak-calling: identify enriched (signal) regions in ChIP-seq or ATAC-seq

More information

Effects of UBL5 knockdown on cell cycle distribution and sister chromatid cohesion

Effects of UBL5 knockdown on cell cycle distribution and sister chromatid cohesion Supplementary Figure S1. Effects of UBL5 knockdown on cell cycle distribution and sister chromatid cohesion A. Representative examples of flow cytometry profiles of HeLa cells transfected with indicated

More information

RNA sequencing of cancer reveals novel splicing alterations

RNA sequencing of cancer reveals novel splicing alterations RNA sequencing of cancer reveals novel splicing alterations Jeyanthy Eswaran, Anelia Horvath, Sucheta Godbole, Sirigiri Divijendra Reddy, Prakriti Mudvari, Kazufumi Ohshiro, Dinesh Cyanam, Sujit Nair,

More information

Pseudogenes transcribed in breast invasive carcinoma show subtype-specific expression and cerna potential

Pseudogenes transcribed in breast invasive carcinoma show subtype-specific expression and cerna potential Welch et al. BMC Genomics (2015) 16:113 DOI 10.1186/s12864-015-1227-8 RESEARCH ARTICLE Open Access Pseudogenes transcribed in breast invasive carcinoma show subtype-specific expression and cerna potential

More information

Mechanism of splicing

Mechanism of splicing Outline of Splicing Mechanism of splicing Em visualization of precursor-spliced mrna in an R loop Kinetics of in vitro splicing Analysis of the splice lariat Lariat Branch Site Splice site sequence requirements

More information

RNA SEQUENCING AND DATA ANALYSIS

RNA SEQUENCING AND DATA ANALYSIS RNA SEQUENCING AND DATA ANALYSIS Download slides and package http://odin.mdacc.tmc.edu/~rverhaak/package.zip http://odin.mdacc.tmc.edu/~rverhaak/rna-seqlecture.zip Overview Introduction into the topic

More information

Simple, rapid, and reliable RNA sequencing

Simple, rapid, and reliable RNA sequencing Simple, rapid, and reliable RNA sequencing RNA sequencing applications RNA sequencing provides fundamental insights into how genomes are organized and regulated, giving us valuable information about the

More information

Alternative Splicing and Genomic Stability

Alternative Splicing and Genomic Stability Alternative Splicing and Genomic Stability Kevin Cahill cahill@unm.edu http://dna.phys.unm.edu/ Abstract In a cell that uses alternative splicing, the total length of all the exons is far less than in

More information

DeconRNASeq: A Statistical Framework for Deconvolution of Heterogeneous Tissue Samples Based on mrna-seq data

DeconRNASeq: A Statistical Framework for Deconvolution of Heterogeneous Tissue Samples Based on mrna-seq data DeconRNASeq: A Statistical Framework for Deconvolution of Heterogeneous Tissue Samples Based on mrna-seq data Ting Gong, Joseph D. Szustakowski April 30, 2018 1 Introduction Heterogeneous tissues are frequently

More information

Reliability of Ordination Analyses

Reliability of Ordination Analyses Reliability of Ordination Analyses Objectives: Discuss Reliability Define Consistency and Accuracy Discuss Validation Methods Opening Thoughts Inference Space: What is it? Inference space can be defined

More information

To test the possible source of the HBV infection outside the study family, we searched the Genbank

To test the possible source of the HBV infection outside the study family, we searched the Genbank Supplementary Discussion The source of hepatitis B virus infection To test the possible source of the HBV infection outside the study family, we searched the Genbank and HBV Database (http://hbvdb.ibcp.fr),

More information

DNA codes for RNA, which guides protein synthesis.

DNA codes for RNA, which guides protein synthesis. Section 3: DNA codes for RNA, which guides protein synthesis. K What I Know W What I Want to Find Out L What I Learned Vocabulary Review synthesis New RNA messenger RNA ribosomal RNA transfer RNA transcription

More information

ncounter Assay Automated Process Immobilize and align reporter for image collecting and barcode counting ncounter Prep Station

ncounter Assay Automated Process Immobilize and align reporter for image collecting and barcode counting ncounter Prep Station ncounter Assay ncounter Prep Station Automated Process Hybridize Reporter to RNA Remove excess reporters Bind reporter to surface Immobilize and align reporter Image surface Count codes Immobilize and

More information

Computational Modeling and Inference of Alternative Splicing Regulation.

Computational Modeling and Inference of Alternative Splicing Regulation. University of Miami Scholarly Repository Open Access Dissertations Electronic Theses and Dissertations 2013-04-17 Computational Modeling and Inference of Alternative Splicing Regulation. Ji Wen University

More information

Table S1. Total and mapped reads produced for each ChIP-seq sample

Table S1. Total and mapped reads produced for each ChIP-seq sample Tale S1. Total and mapped reads produced for each ChIP-seq sample Sample Total Reads Mapped Reads Col- H3K27me3 rep1 125662 1334323 (85.76%) Col- H3K27me3 rep2 9176437 7986731 (87.4%) atmi1a//c H3K27m3

More information

CONTRACTING ORGANIZATION: Johns Hopkins University, Baltimore, MD

CONTRACTING ORGANIZATION: Johns Hopkins University, Baltimore, MD AD Award Number: W81XWH-12-1-0480 TITLE: Molecular Characterization of Indolent Prostate Cancer PRINCIPAL INVESTIGATOR: Jun Luo, Ph.D. CONTRACTING ORGANIZATION: Johns Hopkins University, Baltimore, MD

More information

NEXT GENERATION SEQUENCING. R. Piazza (MD, PhD) Dept. of Medicine and Surgery, University of Milano-Bicocca

NEXT GENERATION SEQUENCING. R. Piazza (MD, PhD) Dept. of Medicine and Surgery, University of Milano-Bicocca NEXT GENERATION SEQUENCING R. Piazza (MD, PhD) Dept. of Medicine and Surgery, University of Milano-Bicocca SANGER SEQUENCING 5 3 3 5 + Capillary Electrophoresis DNA NEXT GENERATION SEQUENCING SOLEXA-ILLUMINA

More information

A Statistical Framework for Classification of Tumor Type from microrna Data

A Statistical Framework for Classification of Tumor Type from microrna Data DEGREE PROJECT IN MATHEMATICS, SECOND CYCLE, 30 CREDITS STOCKHOLM, SWEDEN 2016 A Statistical Framework for Classification of Tumor Type from microrna Data JOSEFINE RÖHSS KTH ROYAL INSTITUTE OF TECHNOLOGY

More information

epigenomix Epigenetic and gene transcription data normalization and integration with mixture models

epigenomix Epigenetic and gene transcription data normalization and integration with mixture models epigenomix Epigenetic and gene transcription data normalization and integration with mixture models Hans-Ulrich Klein, Martin Schäfer April 9, 2017 Contents 1 Introduction 2 2 Data preprocessing and normalization

More information

Last time we talked about the few steps in viral replication cycle and the un-coating stage:

Last time we talked about the few steps in viral replication cycle and the un-coating stage: Zeina Al-Momani Last time we talked about the few steps in viral replication cycle and the un-coating stage: Un-coating: is a general term for the events which occur after penetration, we talked about

More information

V23 Regular vs. alternative splicing

V23 Regular vs. alternative splicing Regular splicing V23 Regular vs. alternative splicing mechanistic steps recognition of splice sites Alternative splicing different mechanisms how frequent is alternative splicing? Effect of alternative

More information

Tutorial: RNA-Seq Analysis Part II: Non-Specific Matches and Expression Measures

Tutorial: RNA-Seq Analysis Part II: Non-Specific Matches and Expression Measures : RNA-Seq Analysis Part II: Non-Specific Matches and Expression Measures March 15, 2013 CLC bio Finlandsgade 10-12 8200 Aarhus N Denmark Telephone: +45 70 22 55 09 Fax: +45 70 22 55 19 www.clcbio.com support@clcbio.com

More information

Circular RNAs (circrnas) act a stable mirna sponges

Circular RNAs (circrnas) act a stable mirna sponges Circular RNAs (circrnas) act a stable mirna sponges cernas compete for mirnas Ancestal mrna (+3 UTR) Pseudogene RNA (+3 UTR homolgy region) The model holds true for all RNAs that share a mirna binding

More information

RNA-Seq guided gene therapy for vision loss. Michael H. Farkas

RNA-Seq guided gene therapy for vision loss. Michael H. Farkas RNA-Seq guided gene therapy for vision loss Michael H. Farkas The retina is a complex tissue Many cell types Neural retina vs. RPE Each highly dependent on the other Graw, Nature Reviews Genetics, 2003

More information

Supplementary Figure 1. Schematic diagram of o2n-seq. Double-stranded DNA was sheared, end-repaired, and underwent A-tailing by standard protocols.

Supplementary Figure 1. Schematic diagram of o2n-seq. Double-stranded DNA was sheared, end-repaired, and underwent A-tailing by standard protocols. Supplementary Figure 1. Schematic diagram of o2n-seq. Double-stranded DNA was sheared, end-repaired, and underwent A-tailing by standard protocols. A-tailed DNA was ligated to T-tailed dutp adapters, circularized

More information

BIOMEDICAL SCIENCES GRADUATE PROGRAM SPRING 2015

BIOMEDICAL SCIENCES GRADUATE PROGRAM SPRING 2015 THE OHIO STATE UNIVERSITY BIOMEDICAL SCIENCES GRADUATE PROGRAM SPRING 2015 Adam Suhy PhD Candidate Regulation of Cholesteryl Ester Transfer Protein and Expression of Transporters in the Blood Brain Barrier

More information

BWA alignment to reference transcriptome and genome. Convert transcriptome mappings back to genome space

BWA alignment to reference transcriptome and genome. Convert transcriptome mappings back to genome space Whole genome sequencing Whole exome sequencing BWA alignment to reference transcriptome and genome Convert transcriptome mappings back to genome space genomes Filter on MQ, distance, Cigar string Annotate

More information

Research Strategy: 1. Background and Significance

Research Strategy: 1. Background and Significance Research Strategy: 1. Background and Significance 1.1. Heterogeneity is a common feature of cancer. A better understanding of this heterogeneity may present therapeutic opportunities: Intratumor heterogeneity

More information

TITLE: Use of a Novel Embryonic Mammary Stem Cell Gene Signature to Improve Human Breast Cancer Diagnostics and Therapeutic Decision Making

TITLE: Use of a Novel Embryonic Mammary Stem Cell Gene Signature to Improve Human Breast Cancer Diagnostics and Therapeutic Decision Making AD Award Number: W81XWH-12-1-0106 TITLE: Use of a Novel Embryonic Mammary Stem Cell Gene Signature to Improve Human Breast Cancer Diagnostics and Therapeutic Decision Making PRINCIPAL INVESTIGATOR: CONTRACTING

More information

RNA Processing in Eukaryotes *

RNA Processing in Eukaryotes * OpenStax-CNX module: m44532 1 RNA Processing in Eukaryotes * OpenStax This work is produced by OpenStax-CNX and licensed under the Creative Commons Attribution License 3.0 By the end of this section, you

More information

Generation of antibody diversity October 18, Ram Savan

Generation of antibody diversity October 18, Ram Savan Generation of antibody diversity October 18, 2016 Ram Savan savanram@uw.edu 441 Lecture #10 Slide 1 of 30 Three lectures on antigen receptors Part 1 : Structural features of the BCR and TCR Janeway Chapter

More information

Integrated analysis of sequencing data

Integrated analysis of sequencing data Integrated analysis of sequencing data How to combine *-seq data M. Defrance, M. Thomas-Chollier, C. Herrmann, D. Puthier, J. van Helden *ChIP-seq, RNA-seq, MeDIP-seq, Transcription factor binding ChIP-seq

More information

Gene Expression: Details (Eukaryotes) Pre-mRNA Secondary Structure Prediction Aids Splice Site Recognition

Gene Expression: Details (Eukaryotes) Pre-mRNA Secondary Structure Prediction Aids Splice Site Recognition re-mrna rediction Aids plice ite Recognition Gene Expression: Details (Eukaryotes) DNA pre-mrna mrna rotein nucleus gene rotein Donald J. atterson, Ken Yasuhara, Walter L. Ruzzo January 3-7, 2002 DNA (chromosome)

More information

Bio 111 Study Guide Chapter 17 From Gene to Protein

Bio 111 Study Guide Chapter 17 From Gene to Protein Bio 111 Study Guide Chapter 17 From Gene to Protein BEFORE CLASS: Reading: Read the introduction on p. 333, skip the beginning of Concept 17.1 from p. 334 to the bottom of the first column on p. 336, and

More information

Computational Identification and Prediction of Tissue-Specific Alternative Splicing in H. Sapiens. Eric Van Nostrand CS229 Final Project

Computational Identification and Prediction of Tissue-Specific Alternative Splicing in H. Sapiens. Eric Van Nostrand CS229 Final Project Computational Identification and Prediction of Tissue-Specific Alternative Splicing in H. Sapiens. Eric Van Nostrand CS229 Final Project Introduction RNA splicing is a critical step in eukaryotic gene

More information

Mechanisms of alternative splicing regulation

Mechanisms of alternative splicing regulation Mechanisms of alternative splicing regulation The number of mechanisms that are known to be involved in splicing regulation approximates the number of splicing decisions that have been analyzed in detail.

More information

High-Resolution Expression Map of the Arabidopsis Root Reveals Alternative Splicing and lincrna Regulation

High-Resolution Expression Map of the Arabidopsis Root Reveals Alternative Splicing and lincrna Regulation Resource High-Resolution Expression Map of the Arabidopsis Root Reveals Alternative Splicing and lincrna Regulation Highlights d Cell type expression analyses characterize alt splicing and lincrnas Authors

More information

Supplemental Figure S1. Tertiles of FKBP5 promoter methylation and internal regulatory region

Supplemental Figure S1. Tertiles of FKBP5 promoter methylation and internal regulatory region Supplemental Figure S1. Tertiles of FKBP5 promoter methylation and internal regulatory region methylation in relation to PSS and fetal coupling. A, PSS values for participants whose placentas showed low,

More information

Variant Classification. Author: Mike Thiesen, Golden Helix, Inc.

Variant Classification. Author: Mike Thiesen, Golden Helix, Inc. Variant Classification Author: Mike Thiesen, Golden Helix, Inc. Overview Sequencing pipelines are able to identify rare variants not found in catalogs such as dbsnp. As a result, variants in these datasets

More information

SUPPLEMENTARY INFORMATION

SUPPLEMENTARY INFORMATION SUPPLEMENTARY INFORMATION doi:10.1038/nature19360 Supplementary Tables Supplementary Table 1. Number of monoclonal reads in each sample Sample Number of cells Total reads Aligned reads Monoclonal reads

More information

RNA-seq. Differential analysis

RNA-seq. Differential analysis RNA-seq Differential analysis Data transformations Count data transformations In order to test for differential expression, we operate on raw counts and use discrete distributions differential expression.

More information

Supplementary Figures and Tables

Supplementary Figures and Tables Supplementary Figures and Tables Supplementary Figure 1. Study design and sample collection. S.japonicum were harvested from C57 mice at 8 time points after infection. Total number of samples for RNA-Seq:

More information

Annotation of Chimp Chunk 2-10 Jerome M Molleston 5/4/2009

Annotation of Chimp Chunk 2-10 Jerome M Molleston 5/4/2009 Annotation of Chimp Chunk 2-10 Jerome M Molleston 5/4/2009 1 Abstract A stretch of chimpanzee DNA was annotated using tools including BLAST, BLAT, and Genscan. Analysis of Genscan predicted genes revealed

More information

Nature Immunology: doi: /ni Supplementary Figure 1. Characteristics of SEs in T reg and T conv cells.

Nature Immunology: doi: /ni Supplementary Figure 1. Characteristics of SEs in T reg and T conv cells. Supplementary Figure 1 Characteristics of SEs in T reg and T conv cells. (a) Patterns of indicated transcription factor-binding at SEs and surrounding regions in T reg and T conv cells. Average normalized

More information

a) List of KMTs targeted in the shrna screen. The official symbol, KMT designation,

a) List of KMTs targeted in the shrna screen. The official symbol, KMT designation, Supplementary Information Supplementary Figures Supplementary Figure 1. a) List of KMTs targeted in the shrna screen. The official symbol, KMT designation, gene ID and specifities are provided. Those highlighted

More information

Long non coding RNA in the pea aphid; iden3fica3on and compara3ve expression in sexual and asexual embryos

Long non coding RNA in the pea aphid; iden3fica3on and compara3ve expression in sexual and asexual embryos Long non coding RNA in the pea aphid; iden3fica3on and compara3ve expression in sexual and asexual embryos Fabrice Legeai, Thomas Derrien, Valen3n Wucher, Audrey David, Gael Le Trionnaire and Denis Tagu

More information

Reducing INDEL calling errors in whole genome and exome sequencing data.

Reducing INDEL calling errors in whole genome and exome sequencing data. Reducing INDEL calling errors in whole genome and exome sequencing data. Han Fang November 8, 2014 CSHL Biological Data Science Meeting Acknowledgments Lyon Lab Yiyang Wu Jason O Rawe Laura J Barron Max

More information