RNA- seq Introduc1on. Promises and pi7alls

Similar documents
RNA- seq Introduc1on. Promises and pi7alls

RNA-seq Introduction

genomics for systems biology / ISB2020 RNA sequencing (RNA-seq)

Transcriptome and isoform reconstruc1on with short reads. Tangled up in reads

Small RNAs and how to analyze them using sequencing

Ambient temperature regulated flowering time

Small RNAs and how to analyze them using sequencing

Transcript reconstruction

Breast cancer. Risk factors you cannot change include: Treatment Plan Selection. Inferring Transcriptional Module from Breast Cancer Profile Data

The RNA revolution: rewriting the fundamentals of genetics

Alternative RNA processing: Two examples of complex eukaryotic transcription units and the effect of mutations on expression of the encoded proteins.

Long non-coding RNAs

MicroRNA in Cancer Karen Dybkær 2013

Mining for disease- causing genes in the other 98% of the human genome Daniel Gautheret

STAT1 regulates microrna transcription in interferon γ stimulated HeLa cells

Epigenetic Principles and Mechanisms Underlying Nervous System Function in Health and Disease Mark F. Mehler MD, FAAN

Transcriptional control in Eukaryotes: (chapter 13 pp276) Chromatin structure affects gene expression. Chromatin Array of nuc

Analysis of Massively Parallel Sequencing Data Application of Illumina Sequencing to the Genetics of Human Cancers

Computational Analysis of UHT Sequences Histone modifications, CAGE, RNA-Seq

RESEARCHER S NAME: Làszlò Tora RESEARCHER S ORGANISATION: Institut de Génétique et de Biologie Moléculaire et Cellulaire (IGBMC)

EPIGENOMICS PROFILING SERVICES

Computer Science, Biology, and Biomedical Informatics (CoSBBI) Outline. Molecular Biology of Cancer AND. Goals/Expectations. David Boone 7/1/2015

RNA SEQUENCING AND DATA ANALYSIS

Analyse de données de séquençage haut débit

Deploying the full transcriptome using RNA sequencing. Jo Vandesompele, CSO and co-founder The Non-Coding Genome May 12, 2016, Leuven

Simple, rapid, and reliable RNA sequencing

DNA codes for RNA, which guides protein synthesis.

Where Splicing Joins Chromatin And Transcription. 9/11/2012 Dario Balestra

Selective depletion of abundant RNAs to enable transcriptome analysis of lowinput and highly-degraded RNA from FFPE breast cancer samples

Lecture 8 Understanding Transcription RNA-seq analysis. Foundations of Computational Systems Biology David K. Gifford

Prokaryotes and eukaryotes alter gene expression in response to their changing environment

Variant Classification. Author: Mike Thiesen, Golden Helix, Inc.

Transcription and RNA processing

Genetics. Instructor: Dr. Jihad Abdallah Transcription of DNA

Long non coding RNA in the pea aphid; iden3fica3on and compara3ve expression in sexual and asexual embryos

V16: involvement of micrornas in GRNs

Bi 8 Lecture 17. interference. Ellen Rothenberg 1 March 2016

Missing Heritablility How to Analyze Your Own Genome Fall 2013

Global Epigenetic and Transcriptional Trends among Two Rice Subspecies and Their Reciprocal Hybrids W

High AU content: a signature of upregulated mirna in cardiac diseases

mirna Dr. S Hosseini-Asl

Small RNA-Seq and profiling

Accessing and Using ENCODE Data Dr. Peggy J. Farnham

Introduction to Systems Biology of Cancer Lecture 2

Eukaryotic small RNA Small RNAseq data analysis for mirna identification

Human breast milk mirna, maternal probiotic supplementation and atopic dermatitis in offsrping

Raymond Auerbach PhD Candidate, Yale University Gerstein and Snyder Labs August 30, 2012

Set the stage: Genomics technology. Jos Kleinjans Dept of Toxicogenomics Maastricht University, the Netherlands

Nuno Barbosa-Morais. Senior Postdoctoral Fellow (Blencowe Lab) Recipient of a Marie Curie International Outgoing Fellowship

Transcription and RNA processing

Islamic University Faculty of Medicine

RASA: Robust Alternative Splicing Analysis for Human Transcriptome Arrays

Peak-calling for ChIP-seq and ATAC-seq

Transcriptome Analysis

Profiles of gene expression & diagnosis/prognosis of cancer. MCs in Advanced Genetics Ainoa Planas Riverola

Iso-Seq Method Updates and Target Enrichment Without Amplification for SMRT Sequencing

CRS4 Seminar series. Inferring the functional role of micrornas from gene expression data CRS4. Biomedicine. Bioinformatics. Paolo Uva July 11, 2012

The Blueprint of Life: DNA to Protein. What is genetics? DNA Structure 4/27/2011. Chapter 7

The Blueprint of Life: DNA to Protein

MicroRNA and Male Infertility: A Potential for Diagnosis

MicroRNAs, RNA Modifications, RNA Editing. Bora E. Baysal MD, PhD Oncology for Scientists Lecture Tue, Oct 17, 2017, 3:30 PM - 5:00 PM

Evolution of the unspliced transcriptome

Small RNA Sequencing. Project Workflow. Service Description. Sequencing Service Specification BGISEQ-500 SERVICE OVERVIEW SAMPLE PREPARATION

Morphogens: What are they and why should we care?

Genome-wide Association Studies (GWAS) Pasieka, Science Photo Library

Inference of Isoforms from Short Sequence Reads

Session 2: Biomarkers of epigenetic changes and their applicability to genetic toxicology

Sunday, October 26 th

Canadian Bioinforma1cs Workshops

Integration of high-throughput biological data

Early life metal exposure and epigene4cs: a transdisciplinary approach

PROGRAM RNA The Nineteenth Annual Meeting of the RNA Society Quebec City, Canada June 03 08, 2014

Molecular Biology (BIOL 4320) Exam #2 May 3, 2004

Personalised medicine: Past, present and future

Not IN Our Genes - A Different Kind of Inheritance.! Christopher Phiel, Ph.D. University of Colorado Denver Mini-STEM School February 4, 2014

Studying Alternative Splicing

7SK ChIRP-seq is specifically RNA dependent and conserved between mice and humans.

TRANSCRIPTION. DNA à mrna

Chapter 5: Global Analysis of the Effect of Densin. Knockout on Gene Transcription in the Brain

GWAS mega-analysis. Speakers: Michael Metzker, Douglas Blackwood, Andrew Feinberg, Nicholas Schork, Benjamin Pickard

micrornas (mirna) and Biomarkers

DNA-seq Bioinformatics Analysis: Copy Number Variation

MODULE 3: TRANSCRIPTION PART II

Multi-omics data integration colon cancer using proteogenomics approach

Supplemental Figure S1. Expression of Cirbp mrna in mouse tissues and NIH3T3 cells.

Marta Puerto Plasencia. microrna sponges

NGS in Cancer Pathology After the Microscope: From Nucleic Acid to Interpretation

Take-Home Final Exam: Mining Regulatory Modules from Gene Expression Data

L I F E S C I E N C E S

Figure mouse globin mrna PRECURSOR RNA hybridized to cloned gene (genomic). mouse globin MATURE mrna hybridized to cloned gene (genomic).

A Practical Guide to Integrative Genomics by RNA-seq and ChIP-seq Analysis

Regulation of Gene Expression in Eukaryotes

Molecular Biology (BIOL 4320) Exam #2 April 22, 2002

Alternative Splicing and Genomic Stability

Computational Identification and Prediction of Tissue-Specific Alternative Splicing in H. Sapiens. Eric Van Nostrand CS229 Final Project

Circular RNAs (circrnas) act a stable mirna sponges

PRECISION INSIGHTS. GPS Cancer. Molecular Insights You Can Rely On. Tumor-normal sequencing of DNA + RNA expression.

RNA and Protein Synthesis Guided Notes

RNA Processing in Eukaryotes *

Nature Structural & Molecular Biology: doi: /nsmb.2419

Transcription:

RNA- seq Introduc1on Promises and pi7alls

RNA gives informa1on on which genes that are expressed How DNA get transcribed to RNA (and some1mes then translated to proteins) varies between e. g. - Tissues - Cell types - Cell states - Individuals - Cells

RNA gives informa1on on which genes that are expressed How DNA get transcribed to RNA (and some1mes then translated to proteins) varies between e. g. - Tissues - Cell types - Cell states - Individuals

RNA gives informa1on on which genes that are expressed How DNA get transcribed to RNA (and some1mes then translated to proteins) varies between e. g. - Tissues - Cell types - Cell states - Individuals

House keeping RNAs rrnas, trnas, snornas, snrnas, SRP RNAs, cataly1c RNAs (RNAse E) Protein coding RNAs RNA flavors (pre sequencing era) (1 coding gene ~ 1 mrna) Regulatory RNAs Few rare examples

ENCODE, the Encyclopedia of DNA Elements, is a project funded by the Na1onal Human Genome Research Ins1tute to iden1fy all regions of transcrip1on, transcrip1on factor associa1on, chroma1n structure and histone modifica1on in the human genome sequence.

ENCyclopedia Of Dna Elements

Different kind of RNAs have different expression values Landscape of transcrip/on in human cells, S Djebali et al. Nature 2012

What defines RNA depends on how you look at it Coverage Variants Abundance House keeping RNAs mrnas Regulatory RNAs Novel intergenic None Adapted from Landscape of transcrip/on in human cells, S Djebali et al. Nature 2012

Defining func1onal DNA elements in the human genome Statement A priori, we should not expect the transcriptome to consist exclusively of func1onal RNAs. Why is that Zero tolerance for errant transcripts would come at high cost in the proofreading machinery needed to perfectly gate RNA polymerase and splicing ac1vi1es, or to instantly eliminate spurious transcripts. In general, sequences encoding RNAs transcribed by noisy transcrip1onal machinery are expected to be less constrained, which is consistent with data shown here for very low abundance RNA Consequence Thus, one should have high confidence that the subset of the genome with large signals for RNA or chroma1n signatures coupled with strong conserva1on is func1onal and will be supported by appropriate gene1c tests. In contrast, the larger propor1on of genome with reproducible but low biochemical signal strength and less evolu1onary conserva1on is challenging to parse between specific func1ons and biological noise.

This is of course not without an debate Variants Most Dark Matter Transcripts Are Associated With Known Genes Perspective Harm van Bakel 1, Corey Nislow 1,2, Benjamin J. Blencowe 1,2, Timothy R. Hughes 1,2 * 1 Banting and Best Department of Medical Research, University of Toronto, Toronto, Ontario, Canada, 2 Department of Molecular Genetics, University of Toronto, Toronto, Ontario, Canada The Reality of Pervasive Transcription Abundance Abstract Michael B. Clark 1, Paulo P. Amaral 1., Felix J. Schlesinger 2., Marcel E. Dinger 1, Ryan J. Taft 1, John L. Perspective A series Rinn of 3, reports Chrisover P. Ponting the last 4 few, Peter yearsf. have Stadler indicated 5, Kevin that av. much Morris larger 6, Antonin portion ofmorillon the mammalian 7, Joelgenome S. Rozowsky is 8, transcribed than can be accounted for by currently annotated genes, but the quantity and nature of these additional Mark B. Gerstein 8, Claes Wahlestedt 9, Yoshihide Hayashizaki 10, Piero Carninci 10, Thomas R. Gingeras 2 *, transcripts remains unclear. Here, we have used data from single- and paired-end RNA-Seq and tiling arrays to assess the quantity John ands. composition Mattick 1 * of transcripts in PolyA+ RNA from human and mouse tissues. Relative to tiling arrays, RNA-Seq Response to The Reality of Pervasive Transcription 1 Institute for Molecular Bioscience, University of Queensland, Brisbane, Queensland, Australia, 2 Watson School of Biological Sciences, Cold Spring Harbor Laboratory, Cold Spring Harbor, Harm Newvan York, United Bakel States, Corey of America, Nislow 3 Broad Institute,, Benjamin Cambridge, J. Massachusetts, BlencoweUnited 1,2, Timothy States of America, R. Hughes 4 MRC Functional * Genomics Unit, Department of 1Physiology, Banting and Anatomy Best Department and Genetics, of Medical University Research of Oxford, andoxford, TerrenceUnited Donnelly Kingdom, Centre5 for Department Cellular and of Computer Biomolecular Science, Research, University University of Leipzig, of Toronto, Leipzig, Toronto, Ontario, Germany, 6 Department Canada, 2of Department Molecular and of Molecular Experimental Genetics, Medicine, University Scripps of ResearchToronto, Institute, Ontario, La Jolla, Canada California, United States of America, 7 Institut Curie, UMR3244- Pavillon Trouillet Rossignol, Paris, France, 8 Computational Biology and Bioinformatics, Yale University, New Haven, Connecticut, United States of America, 9 University of Miami, Miami, Florida, United States of America, 10 Omics Science Center, RIKEN Yokohama Institute, Tsurumi-ku, Yokohama, Kanagawa, Japan Clark et al. criticize several aspects of tic transcripts greatly increases their emphasized the lack of abundant pervasive our study [1], and specifically challenge abundance [7,8]. transcription in our study. Clark et al. cite our assertion that the degree of pervasive We acknowledge that the phrase quoted papers that have previously documented transcription has previously been overstated. by Clark et al. in our Author Summary pervasive transcription, and point out that We disagree with much of their should have read stably transcribed, or several different approaches have been

Biochemical evidence not enough to iden1fy func1onal RNAs Defining functional DNA elements in the human genome Kellis M et al. PNAS 2014;111:6131-6138

One gene many different mrnas

RNA seq course

The RNA seq course From RNA seq to reads Mapping reads programs Transcriptome reconstruc1on using reference Transcriptome reconstruc1on without reference QC analysis srna analysis Differen1al expression analysis mrnas mirnas Genome annota1on using RNA and other sources Differen1al expression using mul1 variate analysis RNA long read analysis

From RNA to short reads

Promises and pi7alls Long reads Low throughput (- ) Complete transcripts (+) Only highly expressed genes (- - ) Expensive (- ) Low background noise (+) Easy downstream analysis (+) 10000 1000 Micro Arrays High throughput (+) Only known sequences (- ) Limited dynamic range (- ) Cheap (+) High background noise (- ) Not strand specific (- ) Well established downstream methods (+) RNAseq High throughput (+) Frac1ons of transcripts (- ) Full dynamic range (+- ) Unlimited dynamic range (+) Cheap (+) Low background noise (+) Strand specificity (+) Re- sequencing (+) Signal 100 10 1 1 10 100 1000 10000 100000 1000000 # trancripts/cell EST MicroArray RNAseq

How are RNA- seq data generated? Sampling process

RNA seq reads correspond directly to abundance of RNAs in the sample

RNA to reads RNA- > AAAAAAAA PolyA (mrna) RiboMinus (- rrna) Size <50 nt (mirna ).. enrichments - > library - > Size of fragment Strand specific 5 end specific 3 end specific.. reads - > Single end (1 read per fragment) Paired end (2 reads per fragment)

Transcriptome assembly using reference

Transcriptome assembly without reference

Mapping long reads to reference

Quality control - samples might not be what you think they are Experiments go wrong 30 samples with 5 steps from samples to reads has 150 poten1al steps for errors Error rate 1/100 with 5 steps suggest that one of every 20 samples the reads does not represent the sample Mixing samples 30 samples with 5 steps from samples to reads has ~24M poten1al mix ups of samples Error rate 1/ 100 with 5 steps suggest that one of every 20 sample is mislabeled Combine the two steps and approximately one of every 10 samples are wrong

RNA QC Read quality Mapping sta1s1cs Transcript quality Compare between samples

Differen1al expression analysis using univariate analysis Typically univariate analysis (one gene at a 1me) even though we know that genes are not independent

Gene set analysis and data integra1on SCP hidden by gene downregulation?

microrna analysis (Jakub) (Berezikov et al. Genome Research, 2011.)

Single cell RNA- seq analysis (Sandberg, Nature Methods 2014)

Short reads Long reads

Long reads Short reads Long reads