Prediction of microRNAs and their targets

Outline: Introduction; Brief history; miRNA biogenesis; Computational methods (mature and precursor miRNA prediction, miRNA target gene prediction); Summary

microRNAs? RNA can fold like proteins: it has primary, secondary and tertiary structure. The secondary hairpin structure is crucial to the processing of small RNAs.

microRNAs? ~22 nt noncoding small RNAs that affect mRNA stability and translation. lin-4 (Lee et al. 1993); let-7 (Reinhart et al. 2000).

95% junk, 5% proteins. From junk to Nobel Prize: Andrew Z. Fire & Craig C. Mello (2006).

The questions. Can we predict microRNA genes, either ab initio / de novo or by homology? Given a microRNA gene, can we find which genes it regulates, a.k.a. its targets?

Computational methods to identify miRNA genes: why? ~500 human miRNAs are known to date, and thousands across species. However, experimental identification of miRNAs is not easy: low expression and stability, tissue specificity, and an expensive, long cloning procedure. Predicting miRNAs from genomic sequences provides a valuable alternative or support to cloning.

How do we evaluate these predictions?
TP = true positives, TN = true negatives, FP = false positives, FN = false negatives
$\mathrm{TPR}\ (\text{sensitivity}) = \dfrac{TP}{TP + FN}$
$\mathrm{FPR}\ (1 - \text{specificity}) = \dfrac{FP}{FP + TN}$
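As a small worked example of these two rates, the sketch below computes them from confusion-matrix counts. The function name `rates` and the counts are hypothetical, chosen only so that the result matches a 50% sensitivity / 70% specificity operating point.

```python
def rates(tp, fp, tn, fn):
    """Return (sensitivity, specificity, fpr) from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)   # TPR = TP / (TP + FN)
    fpr = fp / (fp + tn)           # FPR = FP / (FP + TN)
    specificity = 1.0 - fpr        # specificity = TN / (FP + TN)
    return sensitivity, specificity, fpr

# Hypothetical counts: 100 real miRNAs, 100 non-miRNA hairpins.
sens, spec, fpr = rates(tp=50, fp=30, tn=70, fn=50)
print(f"sensitivity={sens:.2f} specificity={spec:.2f} FPR={fpr:.2f}")
```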

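miRNA prediction: initial methods. MiRscan: find conserved hairpin structures, with known miRNAs (50) as the training set. Each of seven features contributes a log-odds score, and the scores are summed:
$S_i(x_i) = \log_2 \dfrac{f_i(x_i)}{g_i(x_i)}, \qquad S = \sum_{i=1}^{7} S_i(x_i)$
Lim et al., Genes & Development 2003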
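A minimal sketch of a summed log-odds score of this kind follows. It assumes that f_i is the frequency of a feature value among the training miRNAs and g_i its frequency among background hairpins; the slide does not define them, so treat that reading as an assumption. The two binned features and their frequency tables are hypothetical (MiRscan itself uses seven features).

```python
import math

def log_odds_score(feature_values, f, g):
    """Sum log2(f_i(x_i) / g_i(x_i)) over all features i."""
    return sum(math.log2(f[i][x] / g[i][x])
               for i, x in enumerate(feature_values))

# Hypothetical frequency tables: f = training miRNAs, g = background hairpins.
f = [{"low": 0.2, "high": 0.8}, {"low": 0.25, "high": 0.75}]
g = [{"low": 0.8, "high": 0.2}, {"low": 0.5, "high": 0.5}]

# Feature 1 contributes log2(0.8/0.2) = 2, feature 2 contributes log2(0.25/0.5) = -1.
score = log_odds_score(["high", "low"], f, g)
print(score)
```

A positive total means the candidate looks more like the training set than like the background.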

[Figure] Blue: distribution of MiRscan scores of 35,697 sequences. Red: training set. Yellow and purple: verified by cloning or other evidence. 70% specificity and 50% sensitivity.

Comparative Genomics

HMM-based ProMiR

Results of ProMiR
ProMiR: 96% specificity, 73% sensitivity
MiRscan: 70% specificity, 50% sensitivity

BayesMiRNAfind: P(C | features)?
(1) Number of base pairs.
(2) Number of bulges.
(3) Number of loops.
(4) Number of asymmetric loops.
(5) Number of bulges of various lengths.
(6) Number of asymmetric features.
(7) Distance of the miRNA from foot and loop.
(8) Nucleotide sequence words with lengths 4 to 9, extracted from the candidate 21 nt sequence.
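One common way to obtain P(C | features) is a naive Bayes model, which treats the features as conditionally independent given the class. The sketch below illustrates that calculation only. The class names, the two binned features and all probabilities are hypothetical and are not taken from BayesMiRNAfind.

```python
import math

def naive_bayes_posterior(features, prior, likelihoods):
    """Return P(class | features), assuming features are independent given the class."""
    log_joint = {}
    for c in prior:
        total = math.log(prior[c])
        for name, value in features.items():
            total += math.log(likelihoods[c][name][value])
        log_joint[c] = total
    # Normalise in log space for numerical stability.
    top = max(log_joint.values())
    unnorm = {c: math.exp(v - top) for c, v in log_joint.items()}
    z = sum(unnorm.values())
    return {c: u / z for c, u in unnorm.items()}

# Hypothetical model with two binned structural features.
prior = {"miRNA": 0.5, "background": 0.5}
likelihoods = {
    "miRNA":      {"bulges": {"few": 0.8, "many": 0.2}, "loops": {"few": 0.6, "many": 0.4}},
    "background": {"bulges": {"few": 0.2, "many": 0.8}, "loops": {"few": 0.4, "many": 0.6}},
}
posterior = naive_bayes_posterior({"bulges": "few", "loops": "few"}, prior, likelihoods)
print(posterior)
```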

C. elegans (nematode worm) and mouse, with the same features (1) to (8).

MiPred: 93.21% specificity, 89.4% sensitivity
ProMiR: 96% specificity, 73% sensitivity
MiRscan: 70% specificity, 50% sensitivity

[Figure: training diagram with nodes labelled Node B and Node C]

Random Forest. N cases in the training set, M input variables.
1. Sample N cases at random, with replacement, from the original data. This sample is the training set for growing the tree.
2. At each node, m variables (m << M) are selected at random out of the M, and the best split on these m is used to split the node. The value of m is held constant while the forest is grown.
3. Each tree is grown to the largest extent possible. There is no pruning.
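The two sources of randomness described above (bootstrap sampling of cases and random selection of m candidate variables per node) can be sketched as follows. This is not a full tree learner: the split search is omitted, and the majority vote shown at the end is the standard way a forest combines its trees rather than something stated on the slide. All names and numbers are hypothetical.

```python
import random
from collections import Counter

def bootstrap_sample(cases, rng):
    """Draw N cases with replacement from the N original cases."""
    n = len(cases)
    return [cases[rng.randrange(n)] for _ in range(n)]

def candidate_features(num_features, m, rng):
    """Pick m distinct variables out of M as split candidates for one node."""
    return rng.sample(range(num_features), m)

def majority_vote(tree_predictions):
    """Combine per-tree class predictions into a single forest prediction."""
    return Counter(tree_predictions).most_common(1)[0][0]

rng = random.Random(0)                                  # fixed seed for repeatability
cases = [(f"hairpin_{i}", i % 2) for i in range(10)]    # hypothetical (id, label) pairs
sample = bootstrap_sample(cases, rng)                   # training set for one tree
features = candidate_features(num_features=34, m=5, rng=rng)
vote = majority_vote([1, 0, 1, 1, 0])                   # predictions from five trees
print(len(sample), features, vote)
```

Because sampling is with replacement, some cases appear several times in `sample` and others not at all.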

MiPred: Random Forest. Trained on an Rfam data set of 60 cloned miRNAs and a random negative set (250 putative miRNA hairpins) with a variety of features. 500 trees are constructed independently.

MFE: minimum free energy value. P: p-value of the MFE, evaluated against randomized sequences. Structure composition.

Rank   Feature   Mean decrease in accuracy (%)
1      P-value   15.80
2      MFE        5.48
3      C...       2.04
4      U          2.00
5      A          1.49
6      A...       0.83
7      G...       0.76
8      U..        0.43
9      G.         0.34
10     A..        0.31

Other strategies
RNA22: pattern based
miRank: random walks
RNAmicro: support vector machine
Reverse motif search with conservation analysis: PMID: 15735639 (Xie et al.) and others
miRNAminer: sequence similarity based

miRNA target prediction: predicting which genes are regulated by which miRNA(s). [Figure: mRNA with 5' UTR, protein-coding region (ORF) and 3' UTR]

Common strategy

Method: TargetScan
1. Use a 7 nt segment of the miRNA as the microRNA seed to find perfectly complementary motifs in the UTR regions.
2. Extend each seed match to find the best energy.
3. Assign a score, Z.
4. Rank: give each gene a rank (R_i) according to that species.
5. Repeat the above process.
6. Keep those genes for which Z_i > Z_c and R_i < R_c.
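Step 1 can be sketched as a plain string search. The sketch below assumes the seed is miRNA positions 2 to 8 (the slide only says "7 nt segment"), requires strict Watson-Crick pairing with no G:U wobble, and uses let-7a as the example miRNA. The UTR fragment is hypothetical, and the energy extension, scoring and ranking steps are not shown.

```python
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}

def seed_site(mirna, start=1, length=7):
    """Return the UTR motif (5'->3') perfectly complementary to the miRNA seed."""
    seed = mirna[start:start + length]   # assumed seed: positions 2-8 (0-based slice 1:8)
    return "".join(COMPLEMENT[base] for base in reversed(seed))

def find_seed_matches(mirna, utr):
    """Return 0-based UTR positions where the seed-complementary motif starts."""
    site = seed_site(mirna)
    positions = []
    i = utr.find(site)
    while i != -1:
        positions.append(i)
        i = utr.find(site, i + 1)
    return positions

let7a = "UGAGGUAGUAGGUUGUAUAGUU"
utr = "AAACUACCUCAAAGGCUACCUCUU"         # hypothetical 3' UTR fragment
site = seed_site(let7a)
matches = find_seed_matches(let7a, utr)
print(site, matches)
```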

TargetScan: signal-to-noise ratio. There is no large training set. Estimate of false positives.

RNAhybrid: problem with linker sequences in folding miRNA:target. Uses MFE (minimum free energy). Statistical model for random matches. [Figure panels: cross-hybridization; good]

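RNAhybrid MFE statistics. MFE is an optimized score Z; an extreme value distribution (EVD) is applicable to the max/min of independent random variables.
Normalized MFE (m, n: lengths of the target and miRNA sequences): $\mathrm{MFE}_{\mathrm{norm}} = \dfrac{\mathrm{MFE}}{\log(mn)}$
$P(Z \le t) = \exp\left(-e^{-(t - \varepsilon)/\theta}\right)$
$t = \varepsilon - \theta \log(-\log P)$
Regression analysis yields $\varepsilon$ and $\theta$. Statistical significance: $1 - P$.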
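The extreme value distribution on this slide can be evaluated directly once the location ε and scale θ are known. The sketch below implements the CDF P(Z ≤ t) = exp(−exp(−(t − ε)/θ)), its inverse, and the significance 1 − P. The parameter values are hypothetical stand-ins for a regression fit, and the sign convention for energies (whether they are negated so that stronger duplexes score higher) is left as an assumption.

```python
import math

def evd_cdf(t, eps, theta):
    """P(Z <= t) for an extreme value (Gumbel) distribution."""
    return math.exp(-math.exp(-(t - eps) / theta))

def evd_quantile(p, eps, theta):
    """Inverse of evd_cdf: t = eps - theta * log(-log(p))."""
    return eps - theta * math.log(-math.log(p))

def p_value(t, eps, theta):
    """Probability of a score larger than t arising by chance: 1 - P(Z <= t)."""
    return 1.0 - evd_cdf(t, eps, theta)

eps, theta = 1.5, 0.4        # hypothetical fitted location and scale
t = 3.0                      # hypothetical normalized score
p = evd_cdf(t, eps, theta)
print(p, p_value(t, eps, theta), evd_quantile(p, eps, theta))
```

Applying `evd_quantile` to the output of `evd_cdf` recovers the original score, which is a quick check that the two formulas agree.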

Resources (miRNA target prediction): Rajewsky, Nat Genet, 2006

Additional resources for miRNAs
miRBase: database of microRNAs
Ensembl database: UTRs, gene regions, etc.
UCSC Genome Browser: genomes, conservation
GEO (Gene Expression Omnibus): gene expression
http://en.wikipedia.org/wiki/list_of_rna_structure_prediction_software (list of RNA structure prediction programs)

miRNA target recognition
One target, multiple miRNAs (cooperative interaction).
One miRNA, multiple targets (multiplicity/promiscuity).
The 5' end sequence is important.
Structural accessibility is important.
Many targets are predicted; which ones are important remains a question.