Inferring condition-specific miRNA activity from matched miRNA and mRNA expression data


Junpeng Zhang 1, Thuc Duy Le 2, Lin Liu 2, Bing Liu 3, Jianfeng He 4, Gregory J Goodall 5 and Jiuyong Li 2,*

1 Faculty of Engineering, Dali University, Dali, CN.
2 School of Information Technology and Mathematical Sciences, University of South Australia, AU.
3 Children's Cancer Institute Australia, Randwick NSW 2301, AU.
4 Kunming University of Science and Technology, Kunming, CN.
5 Centre for Cancer Biology, SA Pathology, SA 5000, AU.

In this file, we provide the supplementary materials discussed in the Results sections.

1. Results

1.1. Overview of the validation

To assess the results obtained by applying our method to the [...] and [...] data, we conduct the following validations and analyses. From Khan et al. (2009), we obtain transfection experimental data for 7 different cancer cell types, involving more than 20 different miRNAs and 40 unique siRNAs. The miRNAs of either the [...] or the [...] dataset and the transfected miRNAs from Khan et al. (2009) have 18 unique miRNAs in common.

Firstly, we use the transfection experimental results to validate the identified significant miRNA-mRNA causal interactions, specifically the target genes involved in those interactions. The validation is done in comparison with the target genes predicted by MicroCosm, and with the target genes discovered by the non condition-specific analysis based on IDA, which does not consider the difference in the behavior of miRNAs across conditions. We also compare our method with five correlation methods (Pearson, Spearman, Kendall, Lasso and Elastic-net) in the number of validated targets. Secondly, function, pathway and literature based analyses are done for the identified active miRNAs.
To show the effectiveness of our method, we also compare the active miRNAs obtained using our method with those from five non-causal methods (Pearson, Spearman, Kendall, Lasso and Elastic-net) and five other existing approaches (DIANA-mirExTra, Sylamer, MIR, miReduce and cWords). Finally, we examine the patterns of the regulatory behavior of the identified active miRNAs and how they differ across conditions.

1.2. Validation in comparison with MicroCosm

The comparison result with the LFC cutoff set to 1.0 is shown in Figure S1. For the [...] dataset, for 8 out of the 11 miRNAs (i.e. excluding miR-141, miR-200a and miR-215), our method produces a higher rate of experimentally confirmed target genes than MicroCosm does. For the [...] dataset, for 10 out of the 11 miRNAs (i.e. excluding miR-215), our method outperforms MicroCosm in the rate of experimentally confirmed target genes.

[Figure S1: bar chart. y-axis: percentage of validated target genes. Series: proposed method with LFC cutoff of 1.0; MicroCosm with LFC cutoff of 1.0.]

Fig. S1. The percentage of confirmed target genes identified by using the proposed method and MicroCosm in the [...] and [...] dataset (LFC=1.0).

We also use a hypergeometric (HG) test to assess the significance of the validated miRNA-mRNA interactions. Let S be the number of possible target genes for n miRNAs in the dataset, K be the number of miRNA-mRNA interactions from transfection experiments for these miRNAs, N be the number of miRNA-mRNA interactions predicted by our method for these miRNAs, and x be the number of predicted miRNA-mRNA interactions validated by transfection experiments for these miRNAs. The p-value of the validation results, which is the probability that a random method achieves results at least as good as those of the proposed method, is calculated using the cumulative hypergeometric formula:

\[
p(X \geq x) = \sum_{i=x}^{N} \frac{\binom{K}{i}\binom{S n - K}{N - i}}{\binom{S n}{N}} \tag{1}
\]

We use the Matlab function hygepdf to compute the statistical significance level p(X ≥ x) of the validated miRNA-mRNA interactions. The validation of miRNA-mRNA interactions is statistically significant when p(X ≥ x) is less than a chosen threshold, e.g. 0.05.
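The cumulative hypergeometric p-value can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' Matlab code: the function name `hypergeom_upper_tail` and its parameter names are hypothetical, and the population size is taken to be the product S·n, which is how the formula is read here.

```python
from fractions import Fraction
from math import comb


def hypergeom_upper_tail(x, population, successes, draws):
    """Return P(X >= x) for a hypergeometric random variable X.

    population: number of candidate interactions (assumed to be S * n)
    successes:  interactions supported by transfection data (K)
    draws:      interactions predicted by the method (N)
    x:          predicted interactions that are also supported
    """
    if not 0 <= successes <= population or not 0 <= draws <= population:
        raise ValueError("K and N must lie between 0 and the population size")
    upper = min(draws, successes)
    total = comb(population, draws)
    # Sum exact integer counts first, then divide once to avoid rounding drift.
    favourable = sum(
        comb(successes, i) * comb(population - successes, draws - i)
        for i in range(max(x, 0), upper + 1)
    )
    return float(Fraction(favourable, total))


if __name__ == "__main__":
    # Values taken from the first row of Table S1 (LFC cutoff of 0.5).
    p = hypergeom_upper_tail(x=214, population=1126 * 11, successes=4152, draws=489)
    print(f"p(X >= 214) = {p:.4e}")
```

Exact integer arithmetic is used because the binomial coefficients involved are far too large for floating point; the single division at the end keeps the tail probability accurate even when it is very small.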

As shown in Table S1, the validated miRNA-mRNA interactions are all statistically significant (p-value < 0.05) in both the [...] and [...] datasets.

Table S1. Statistical significance level of validated miRNA-mRNA interactions for the 11 and 12 miRNAs in the [...] and [...] dataset, respectively.

Dataset        S      K1/K2       N     x1/x2     p1/p2
[...] (n=11)   1126   4152/1791   489   214/103   9.8512E-07/3.5097E-05
[...] (n=12)   1318   6361/2952   654   324/169   5.2288E-07/2.3328E-06

K1 and K2 are the numbers of miRNA-mRNA interactions from transfection experiments with LFC cutoffs of 0.5 and 1.0, respectively. x1 and x2 denote the numbers of validated miRNA-mRNA interactions with LFC cutoffs of 0.5 and 1.0, respectively. p1 and p2 are the corresponding p-values for LFC cutoffs of 0.5 and 1.0, respectively.

1.3. Validation in comparison with non condition-specific analysis

In Figure S2, the cutoff for the log2 fold change in the transfection experiments is set to 1.0. The comparison results indicate that considering the difference between the condition of interest and the other conditions can help improve the prediction of miRNA-mRNA regulatory relationships.

[Figure S2: bar chart. y-axis: #Validated targets. Series: condition-specific; non condition-specific. Groups: Top10-[...], Top20-[...], Top10-[...], Top20-[...]. LFC=1.0.]

p-values                 Top10-[...]   Top20-[...]   Top10-[...]   Top20-[...]
Condition-specific       3.2521E-04    9.0525E-05    0.0034        4.74E-04
Non condition-specific   0.0034        0.0011        0.0112        0.0014

Fig. S2. Comparison between condition-specific and non condition-specific analyses in the number of validated targets, with LFC=1.0. The p-values of the validated targets are calculated using the cumulative hypergeometric test.

1.4. Validation in comparison with five correlation methods

In Figure S3, the cutoff for the log2 fold change in the transfection experiments is set to 1.0. The comparison results show that our proposed method outperforms all five correlation methods (Pearson, Spearman, Kendall, Lasso and Elastic-net) in the number of validated targets in all cases (Top10-[...], Top20-[...], Top10-[...] and Top20-[...]).
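As a rough illustration of how a correlation baseline in this comparison could produce a Top-k target list and a validated-target count, the sketch below ranks mRNAs by their Pearson correlation with one miRNA. It is not the authors' pipeline: the function names, the gene labels and the choice to rank by absolute correlation are all assumptions made for the example.

```python
from statistics import mean


def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mx, my = mean(xs), mean(ys)
    sxy = sum((a - mx) * (b - my) for a, b in zip(xs, ys))
    sxx = sum((a - mx) ** 2 for a in xs)
    syy = sum((b - my) ** 2 for b in ys)
    if sxx == 0 or syy == 0:
        return 0.0  # constant profile: treated as uncorrelated (illustrative choice)
    return sxy / (sxx * syy) ** 0.5


def top_k_targets(mirna_profile, mrna_profiles, k):
    """Rank mRNAs by |r| with one miRNA (an assumption) and keep the k strongest."""
    scored = [(gene, pearson(mirna_profile, prof)) for gene, prof in mrna_profiles.items()]
    scored.sort(key=lambda item: abs(item[1]), reverse=True)
    return scored[:k]


def count_validated(ranked, confirmed_genes):
    """Count ranked genes that also appear in the experimentally confirmed set."""
    return sum(1 for gene, _ in ranked if gene in confirmed_genes)


if __name__ == "__main__":
    # Hypothetical toy profiles across four samples.
    mirna = [1.0, 2.0, 3.0, 4.0]
    mrnas = {
        "GENE_A": [4.0, 3.0, 2.0, 1.0],
        "GENE_B": [1.0, 1.5, 1.2, 1.4],
        "GENE_C": [2.0, 2.0, 2.0, 2.0],
    }
    ranked = top_k_targets(mirna, mrnas, k=2)
    print(ranked, count_validated(ranked, {"GENE_A", "GENE_C"}))
```

The validated count from such a list, together with the number of confirmed interactions, is what the cumulative hypergeometric test then scores.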

[Figure S3: bar chart. y-axis: #Validated targets. Series: proposed method, Pearson, Spearman, Kendall, Lasso, Elastic-net. Groups: Top10-[...], Top20-[...], Top10-[...], Top20-[...]. LFC=1.0.]

p-values      Proposed method   Pearson    Spearman     Kendall   Lasso    Elastic-net
Top10-[...]   3.2521E-04        0.0236     0.0068       0.0236    0.0682   0.1632
Top20-[...]   9.0525E-05        1.77E-04   3.3238E-04   0.0011    0.1001   0.0056
Top10-[...]   0.0034            0.2294     0.1667       0.0320    0.3891   0.1667
Top20-[...]   4.74E-04          0.03       0.0193       0.0279    0.2647   0.0279

Fig. S3. Comparison between the proposed method and five correlation methods in the number of confirmed targets, with LFC=1.0. The p-values of the validated targets are calculated using the cumulative hypergeometric test.

1.5. Correlations for finding active miRNAs

We also use the five correlation methods (Pearson, Spearman, Kendall, Lasso and Elastic-net) for finding active miRNAs. The R function used for the Pearson, Spearman and Kendall methods is cor (package stats) with parameter method = "pearson", "spearman" and "kendall", respectively. The stats package can be downloaded at http://www.r-project.org. The R function for Lasso and Elastic-net is glmnet (Friedman et al., 2010) with parameter alpha = 1 and 0.5, respectively.

1.6. Other existing methods for detecting active miRNAs

We compare our proposed method with five existing approaches, namely DIANA-mirExTra (Alexiou et al., 2010), Sylamer (van Dongen et al., 2008), MIR (Cheng and Li, 2008), miReduce (Sood et al., 2006) and cWords (Rasmussen et al., 2013).

DIANA-mirExTra is available at http://diana.cslab.ece.ntua.gr/hexamers/. It identifies overrepresented 6 nt long motifs (hexamers) in the 3'UTR sequences of deregulated genes. Its inputs are two lists: a list of differentially expressed mRNAs and a list of non-differentially expressed (background) mRNAs.
The Java graphical interface of Sylamer can be obtained from http://www.ebi.ac.uk/research/enright/software/sylamer. Its input is a list of differentially expressed mRNAs ranked by adjusted p-value from limma (Smyth, 2005).

The C++ program for MIR is available at http://www.dartmouth.edu/~chaocheng/software/infermirna/infermir.html. Two input files are needed: the differentially expressed mRNAs with log fold changes from limma, and the miRNA-gene binding affinity data based on the binding energy predicted by miRanda (Betel et al., 2008).

The Perl implementation of miReduce can be obtained at https://www.mdc-berlin.de/10615841/en/research/research_teams/systems_biology_of_gene_regulatory_elements/projects/mireduce. Its three inputs are the differentially expressed mRNAs with log fold changes from limma, the 3'UTR sequences of these differentially expressed mRNAs, and the miRNA sequences of the differentially expressed miRNAs.

cWords is available online at http://servers.binf.ku.dk/cwords/. Its input is a list of differentially expressed mRNAs ranked by adjusted p-value from limma.

For Sylamer, miReduce and cWords, we combine the results for seed regions of 6 nt to 8 nt as their final results. For all methods, we use a p-value cutoff of 0.05 to detect active miRNAs among the differentially expressed miRNAs in the [...] and [...] datasets.

References

Alexiou,P. et al. (2010) The DIANA-mirExTra web server: from gene expression data to microRNA function, PLoS One, 5, e9171.
Betel,D. et al. (2008) The microRNA.org resource: targets and expression, Nucleic Acids Res., 36, D1-D153.
Cheng,C. and Li,L.M. (2008) Inferring microRNA activities by combining gene expression with microRNA target prediction, PLoS One, 3, e1989.
Friedman,J. et al. (2010) Regularization Paths for Generalized Linear Models via Coordinate Descent, J. Stat. Softw., 33, 1-22.
Khan,A.A. et al. (2009) Transfection of small RNAs globally perturbs gene regulation by endogenous microRNAs, Nat. Biotechnol., 27, 5-.
Rasmussen,S.H. et al. (2013) cWords - systematic microRNA regulatory motif discovery from mRNA expression data, Silence, 4, 2.
Sood,P. et al. (2006) Cell-type-specific signatures of microRNA on target mRNA expression, Proc. Natl. Acad. Sci. USA, 103, 2746-2751.
Smyth,G.K. (2005) Limma: Linear Models for Microarray Data. Bioinformatics and Computational Biology Solutions using R and Bioconductor, Springer, New York, 397-420.
van Dongen,S. et al. (2008) Detecting microRNA binding and siRNA off-target effects from expression data, Nat. Methods, 5, 1023-1025.