microrna analysis Merete Molton Worren Ståle Nygård

Similar documents
Micro-RNA web tools. Introduction. UBio Training Courses. mirnas, target prediction, biology. Gonzalo

Small RNA Sequencing. Project Workflow. Service Description. Sequencing Service Specification BGISEQ-500 SERVICE OVERVIEW SAMPLE PREPARATION

Obstacles and challenges in the analysis of microrna sequencing data

A Quick-Start Guide for rseqdiff

Eukaryotic small RNA Small RNAseq data analysis for mirna identification

Breast and ovarian cancer in Serbia: the importance of mutation detection in hereditary predisposition genes using NGS

Supplementary Figure 1

DNA Sequence Bioinformatics Analysis with the Galaxy Platform

Human breast milk mirna, maternal probiotic supplementation and atopic dermatitis in offsrping

RmiR package vignette

VirusDetect pipeline - virus detection with small RNA sequencing

A Practical Guide to Integrative Genomics by RNA-seq and ChIP-seq Analysis

Golden Helix s End-to-End Solution for Clinical Labs

Package TargetScoreData

Arabidopsis thaliana small RNA Sequencing. Report

a) List of KMTs targeted in the shrna screen. The official symbol, KMT designation,

Dr Rick Tearle Senior Applications Specialist, EMEA Complete Genomics Complete Genomics, Inc.

Analysis with SureCall 2.1

High AU content: a signature of upregulated mirna in cardiac diseases

Supplementary information for: Human micrornas co-silence in well-separated groups and have different essentialities

Metabolomic and Proteomics Solutions for Integrated Biology. Christine Miller Omics Market Manager ASMS 2015

RNA Secondary Structures: A Case Study on Viruses Bioinformatics Senior Project John Acampado Under the guidance of Dr. Jason Wang

P. Tang ( 鄧致剛 ); PJ Huang ( 黄栢榕 ) g( 鄧致剛 ); g ( 黄栢榕 ) Bioinformatics Center, Chang Gung University.

Patnaik SK, et al. MicroRNAs to accurately histotype NSCLC biopsies

ChIP-seq hands-on. Iros Barozzi, Campus IFOM-IEO (Milan) Saverio Minucci, Gioacchino Natoli Labs

A Statistical Framework for Classification of Tumor Type from microrna Data

Data mining with Ensembl Biomart. Stéphanie Le Gras

P. Tang ( 鄧致剛 ); PJ Huang ( 黄栢榕 ) g( ); g ( ) Bioinformatics Center, Chang Gung University.

Transcriptome Analysis

Methods 59 (2013) Contents lists available at SciVerse ScienceDirect. Methods. journal homepage:

omiras: MicroRNA regulation of gene expression

HBV. Next Generation Sequencing, data analysis and reporting. Presenter Leen-Jan van Doorn

STAT1 regulates microrna transcription in interferon γ stimulated HeLa cells

Synthetic microrna Reference Standards Genomics Research Group ABRF 2015

Deciphering the Role of micrornas in BRD4-NUT Fusion Gene Induced NUT Midline Carcinoma

Supplemental Figure 1. Small RNA size distribution from different soybean tissues.

Lombardi Cancer Center Georgetown University Washington, DC

Small RNA-Seq and profiling

Multi-omics data integration colon cancer using proteogenomics approach

CRS4 Seminar series. Inferring the functional role of micrornas from gene expression data CRS4. Biomedicine. Bioinformatics. Paolo Uva July 11, 2012

AVENIO family of NGS oncology assays ctdna and Tumor Tissue Analysis Kits

Zhao et al. BMC Bioinformatics (2017) 18:180 DOI /s

Profiling micrornas in lung tissue from pigs infected with Actinobacillus pleuropneumoniae

Annotation of Chimp Chunk 2-10 Jerome M Molleston 5/4/2009

Long non-coding RNAs

microrna PCR System (Exiqon), following the manufacturer s instructions. In brief, 10ng of

IPA Advanced Training Course

ChIP-seq data analysis

Simple, rapid, and reliable RNA sequencing

Chip Seq Peak Calling in Galaxy

PSSV User Manual (V1.0)

Deploying the full transcriptome using RNA sequencing. Jo Vandesompele, CSO and co-founder The Non-Coding Genome May 12, 2016, Leuven

SUPPLEMENTAL INFORMATION

developing new tools for diagnostics Join forces with IMGM Laboratories to make your mirna project a success

How To Use MiRSEA. Junwei Han. July 1, Overview 1. 2 Get the pathway-mirna correlation profile(pmset) and a weighting matrix 2

Profiles of gene expression & diagnosis/prognosis of cancer. MCs in Advanced Genetics Ainoa Planas Riverola

RASA: Robust Alternative Splicing Analysis for Human Transcriptome Arrays

genomics for systems biology / ISB2020 RNA sequencing (RNA-seq)

On the Reproducibility of TCGA Ovarian Cancer MicroRNA Profiles

Prediction of micrornas and their targets

Advance Your Genomic Research Using Targeted Resequencing with SeqCap EZ Library

Accurate detection for a wide range of mutation and editing sites of micrornas from small RNA high-throughput sequencing profiles

Protein Reports CPTAC Common Data Analysis Pipeline (CDAP)

Reliable Prediction of Viral RNA Structures

DNA-seq Bioinformatics Analysis: Copy Number Variation

Supplementary materials and methods.

New Enhancements: GWAS Workflows with SVS

The Value of Omics in Cardiovascular Research: Getting More Comfortable, More Frustrated or More Curious

Hands-On Ten The BRCA1 Gene and Protein

Analysis of Massively Parallel Sequencing Data Application of Illumina Sequencing to the Genetics of Human Cancers

ChromHMM Tutorial. Jason Ernst Assistant Professor University of California, Los Angeles

mirviewer: A multispecies microrna homologous viewer

I conducted an internship at the Texas Biomedical Research Institute in the

Deep Sequencing the MicroRNA Transcriptome in Colorectal Cancer

Bioinformatic analyses: methodology for allergen similarity search. Zoltán Divéki, Ana Gomes EFSA GMO Unit

High Throughput Sequence (HTS) data analysis. Lei Zhou

The Focused Exome service at Bristol Genetics Laboratory

sirna count per 50 kb small RNAs matching the direct strand Repeat length (bp) per 50 kb repeats in the chromosome

VARIANT PRIORIZATION AND ANALYSIS INCORPORATING PROBLEMATIC REGIONS OF THE GENOME ANIL PATWARDHAN

ncounter Assay Automated Process Immobilize and align reporter for image collecting and barcode counting ncounter Prep Station

MicroRNAs: novel regulators in skin research

7SK ChIRP-seq is specifically RNA dependent and conserved between mice and humans.

Medtech Training Guide

The Epigenome Tools 2: ChIP-Seq and Data Analysis

ncounter Assay Automated Process Capture & Reporter Probes Bind reporter to surface Remove excess reporters Hybridize CodeSet to RNA

Identification of mirnas in Eucalyptus globulus Plant by Computational Methods

Below, we included the point-to-point response to the comments of both reviewers.

Transcript reconstruction

Viral genome sequencing: applications to clinical management and public health. Professor Judy Breuer

Long non coding RNA in the pea aphid; iden3fica3on and compara3ve expression in sexual and asexual embryos

Supplementary Figure 1 IL-27 IL

PSSV User Manual (V2.1)

Gene-microRNA network module analysis for ovarian cancer

RNA interference induced hepatotoxicity results from loss of the first synthesized isoform of microrna-122 in mice

Package leukemiaseset

SUPPLEMENTAL DATA AGING, July 2014, Vol. 6 No. 7

Supplemental Figure S1. Tertiles of FKBP5 promoter methylation and internal regulatory region

The Role of micrornas in Pain. Phd Work of: Rehman Qureshi Advisor: Ahmet Sacan

RNA-seq Introduction

Screening for novel oncology biomarker panels using both DNA and protein microarrays. John Anson, PhD VP Biomarker Discovery

Transcription:

microrna analysis Merete Molton Worren Ståle Nygård Help personnel: Daniel Vodak

Background Dysregulation of mirna expression has been connected to progression and development of atherosclerosis The hypothesis: mirna expression profiles differ between baboons with high and low serum LDL-C A subset of these mirnas regulate genes are relevant to dyslipidemia and risk of atherosclerosis

Experiment Baboons divided into two groups based on their LDL-C response levels Two diets Chow (baseline diet) HCHF (High Cholesterol High Fat diet) Compared mirna expression profiles from liver biopsies collected before and after the HCHF challenge diet

Paper Karere GM, Glenn JP, Vandeberg JL, Cox LA. 2012 Differential microrna response to a high-cholesterol, high-fat diet in livers of low and high LDL-C baboons. BMC Genomics 13(1):32 http://www.ncbi.nlm.nih.gov/pubmed/22809019

Experiment setup 12 samples available from baboon liver biopsies 3 replicates from high-ldl, baseline (chow) diet 3 replicates from high-ldl, HCHF diet 3 replicates from low-ldl, baseline (chow) diet 3 replicates from low-ldl, HCHF diet We will work with the high-ldl samples only

QUESTIONS?

mirna analysis workflow Preparations Alignment Annotation Read count per mirna Normalization Differential expression analysis

Preparations, read files Starting point: sequence read files Practical info File format (.fastq,.fasta,.sff) Base quality scale used (Phred+33, Phred+64) Quality control Adapters

Preparations, reference Find sequences you will align to Resources: UCSC: http://hgdownload.cse.ucsc.edu/downloads.html NCBI: http://www.ncbi.nlm.nih.gov/guide/all/#downloads_ ENSEMBL: http://www.ensembl.org/info/data/ftp/index.html We will use sequences from mirbase that we will customize to our needs http://www.mirbase.org/

Preparations, mirna Collapse files for increased speed Check that reads/reference are both RNA or both DNA

QUESTIONS?

Next: Alignment

Alignment The sequence read is not very informative in itself. Alignment to a reference -> location of origin TAGCTTATCAGACTGATGTTGA

Alignment The sequence read is not very informative in itself. Alignment to a reference -> location of origin TAGCTTATCAGACTGATGTTGA TACACTGAGTCGACTGAGTGGCCTAGCTTATCAGACTGATGTTGAATTTATATAGGCGTGACGTA

Alignment The sequence read is not very informative in itself. Alignment to a reference -> location of origin TAGCTTATCAGACTGATGTTGA TACACTGAGTCGACTGAGTGGCCTAGCTTATCAGACTGATGTTGAATTTATATAGGCGTGACGTA mir-21-5p

What to align to Often several possible references The genome of the organism or a closely related organism Specific sequences like mrna or mirnas Databases such as Rfam filter out reads check what the unaligned reads are

Aligners Many different aligners Novoalign Bowtie/Bowtie2 BWA (Burrows-Wheeler Aligner) SOAPaligner/soap2 (Short Oligonucleotide Analysis Package)

Alignment challenges Sequencing errors True differences from reference Repeat regions Similar coding sequences

Alignment process Indexing An index is made of the genome Mapping based on index Finds several possible alignment location Final alignment Thorough search of possible mapping locations

Simplified example, k-mer=2 Genome : TTATATTTTATTAATTTTA Possible 2-mers Location positions TT 1,6,7,8,11,15,16,17 TA 2,4,9,12,18 AT 3,5,10,14 AA 13 Our read: ATT Start 2-mer AT

Options Different options for different aligners Reading reference manuals is a big part of doing bioinformatics

Some common options Level of similarity, genome/reads Output formats Options to control multiple mappers

QUESTIONS?

Next: Annotation

Annotation Aligned to sequence of interest The annotation is already in place Aligned to genome Either: coordinates for the features of interest are already known Or: part of project is to annotate the genome

Annotation Some genomes are better annotated than others Coordinates for specific genomic features can be found through several sites ENSEMBL http://www.ensembl.org/index.html UCSC Table browser http://genome.ucsc.edu/cgi-bin/hgtables mirbase http://www.mirbase.org/ftp.shtml

QUESTIONS?

Next: Read counts

Read counts How much do we have? Read counts is the starting point for finding expression and differential expression When counting, multiple mappers must be considered

Read count mirna Multiple mappers are common Size of reads Similarity of different mirnas This leads to some additional challenges when counting the reads of mirnas

Aligning mirnas to mirnas Reads\Reference hsa mirs all mirs hsa mirs (2578 reads) 2578/2494 84 mirs lost 2578/1669 909 mirs lost all mirs (30424 reads) 10111/9653 458 mirs lost 30424/15732 14692 mirs lost

Counting multiple mappers Disregard all the reads that maps to multiple locations Keeping one or more randomly Keeping them all

QUESTIONS?

Next: Normalization

Normalization The purpose of normalization is to remove technical bias while maintaining true biological signal.

Normalization Normalization is necessary because the true expression is not the only factor influencing the read counts Other factors may be Sample size Differences in extraction Differences during the sequencing procedure

Normalization methods for mirna No method for best practice yet Normalization by read counts Several other methods tried with better results TMM

QUESTIONS?

Next: Differential expression analysis

Differential expression analysis Differential expression analysis aims at finding the mirnas that are truly differentially expressed under the relevant conditions The challenge is to be able to find as many as possible of the true ones, while avoiding to call a mirna differentially expressed when the changes are random or due to other factors

Differential expression analysis FLU (subject1) HEALTHY (subject2) Is this a good way of checking what mirnas are differentially expressed when having the flu?

Differential expression analysis FLU (subject1) Male Arthritis Eats vegetables HEALTHY (subject2) Female Allergy Eats burgers and fries Is this a good way of checking what mirnas are differentially expressed when having the flu?

Biological replicates There will be a lot of mirnas that have different expression in two samples Some due to the condition you are investigating Some due to other factors The solution: biological replicates Average out unknown/unwanted differences Call differential expression only where it is due to the factor you are investigating

QUESTIONS?

Next: A little statistics with Ståle Nygård (Eivind Hovig s group)

QUESTIONS?

Next: Discussion of the results

Comparison Lifeportal analysis mirna FDR mir-206 2.9-2 mir-133b 1.5-17 mir-1-3p 2.5-17 mir-133a-3p 6.0-17 mir-95-3p 4.6-06 mir-499a-5p 0.0037 mir-1.5p ---- Command line mirna FDR mir-206 2.1-24 mir-133b 2.0-18 mir-1-3p 8.5-30 mir-133a-3p 6.1-20 mir-95-3p 0.041 mir-499a-5p 0.325 mir-1-5p 0.024

Comparison Our analysis mirna FDR mir-206 2.7-23 mir-1 2.8-19 mir-133a-3p 4.2-19 mir-133b 3.1-18 mir-95-3p 8.7-08 mir-499a-5p 5.7-05 mir-4454 0.016 mir-452-5p 0.021 Paper mirna FDR mir-29b 0.01 mir-222 0.01 mir-221 0.02 mir-1 0.35

Possible sources of difference From fastq files to read counts Differences in aligner/reference Different way of counting Statistics Different normalization Different statistical tests used

Learning points Different pipelines will give different results Very important to give good descriptions of exactly what you have done PCR validation is important