Package AbsFilterGSEA

September 21, 2017

Type Package
Title Improved False Positive Control of Gene-Permuting GSEA with Absolute Filtering
Version 1.5.1
Author Sora Yoon <yoonsora@unist.ac.kr>
Maintainer Sora Yoon <yoonsora@unist.ac.kr>
Description Gene-set enrichment analysis (GSEA) is widely used to assess the enrichment of differential signal in a pre-defined gene set without using a cutoff threshold for differential expression. The significance of enrichment is evaluated by a sample-permutation or a gene-permutation method. Although the sample-permutation approach is highly recommended for its good false positive control, the gene-permuting method must be used when the number of samples is small. However, such gene-permuting GSEA (or preranked GSEA) generates many false positive gene sets as the inter-gene correlation within each gene set increases. These false positives can be reduced by filtering with the one-tailed absolute GSEA results. This package provides a function that performs the gene-permuting GSEA calculation with or without the absolute filtering. Without filtering, users can perform the (original) two-tailed GSEA or the one-tailed absolute GSEA.
License GPL-2
LazyData TRUE
RoxygenNote 6.0.1
Depends
LinkingTo Rcpp, RcppArmadillo
Imports Rcpp, Biobase, stats, DESeq, limma
NeedsCompilation yes
Repository CRAN
Date/Publication 2017-09-21 13:39:52 UTC

R topics documented:
    example
    GenePermGSEA

example    Normalized RNA-seq count data

Description

This is a toy example of an RNA-seq raw read count table. It contains 5000 genes and 6 samples (three in the case group and three in the control group).

Usage

data("example")

Format

A data frame with 5000 observations on the following 6 variables.

groupa1    a numeric vector of RNA-seq counts for case sample 1.
groupa2    a numeric vector of RNA-seq counts for case sample 2.
groupa3    a numeric vector of RNA-seq counts for case sample 3.
groupb1    a numeric vector of RNA-seq counts for control sample 1.
groupb2    a numeric vector of RNA-seq counts for control sample 2.
groupb3    a numeric vector of RNA-seq counts for control sample 3.

Details

This read count dataset was simulated from the negative binomial distribution. Mean and dispersion parameters were estimated from the TCGA KIRC RNA-seq dataset. Normalization was done using the edgeR package. Gene sets geneset_41 to geneset_45 are up-regulated and gene sets geneset_46 to geneset_50 are down-regulated.

Source

Cancer Genome Atlas Research Network. Comprehensive molecular characterization of clear cell renal cell carcinoma. Nature 2013;499(7456):43-49.

References

Chen, Y., et al. edgeR: differential expression analysis of digital gene expression data. User's Guide. 2015.

Examples

data(example)
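The Details for the example dataset say the counts were simulated from a negative binomial distribution. The sketch below shows one way such a table could be generated in base R. It is not the authors' simulation code: the gamma-distributed means, the common dispersion of 0.1 and the gene names are assumptions chosen only for illustration, and no TCGA-derived parameters are used.

```r
# Illustrative negative binomial count simulation (assumed parameters,
# not the values used to build the packaged "example" dataset).
set.seed(1)

n_genes   <- 5000
n_samples <- 6

# Assumed per-gene mean counts and a single assumed dispersion value.
gene_mean  <- rgamma(n_genes, shape = 2, rate = 0.02)
dispersion <- 0.1

# rnbinom() with mu/size: variance = mu + dispersion * mu^2.
# The matrix is filled column by column, so the gene means are
# repeated once per sample.
counts <- matrix(
  rnbinom(n_genes * n_samples,
          mu   = rep(gene_mean, times = n_samples),
          size = 1 / dispersion),
  nrow = n_genes, ncol = n_samples,
  dimnames = list(
    paste0("gene", seq_len(n_genes)),   # hypothetical gene names
    c("groupa1", "groupa2", "groupa3", "groupb1", "groupb2", "groupb3")
  )
)

toy <- as.data.frame(counts)
dim(toy)
head(toy, 3)
```

Because every gene shares the same mean across the six columns, this toy table contains no differential expression; shifting the means of selected genes in the case columns would be needed to mimic up- or down-regulated gene sets.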

GenePermGSEA    Gene permuting GSEA with or without filtering by absolute GSEA

Description

Gene-set enrichment analysis (GSEA) is widely used to assess the enrichment of differential signal in a pre-defined gene set without using a cutoff threshold for differential expression. The significance of enrichment is evaluated by a sample-permutation or a gene-permutation method. Although the sample-permutation approach is highly recommended for its good false positive control, the gene-permuting method must be used when the number of samples is small. However, such gene-permuting GSEA (or preranked GSEA) generates many false positive gene sets as the inter-gene correlation within each gene set increases. These false positives can be reduced by filtering with the one-tailed absolute GSEA results. This package provides a function that performs the gene-permuting GSEA calculation with or without the absolute filtering. Without filtering, users can perform the (original) two-tailed GSEA or the one-tailed absolute GSEA.

Usage

GenePermGSEA(countMatrix, GeneScoreType, idxcase, idxcontrol, GenesetFile,
    normalization, mingenesetsize = 10, maxgenesetsize = 300, q = 1,
    nperm = 1000, absolutegenescore = FALSE, GSEAtype = "absfilter",
    FDR = 0.05, FDRfilter = 0.05, mincount = 3)

Arguments

countMatrix       Normalized RNA-seq read count matrix.
GeneScoreType     Type of gene score. Possible values are "moderated_t", "snr", "FC" (log fold change score) or "RANKSUM" (zero centered).
idxcase           Indices of the case samples in the count matrix, e.g., 1:3
idxcontrol        Indices of the control samples in the count matrix, e.g., 4:6
GenesetFile       File path of the gene set file. A typical GMT file or a similar tab-delimited file can be used, e.g., "C:/geneset.gmt"
normalization     Type 'DESeq' if the input matrix is composed of raw read counts; the raw count data will then be normalized using the DESeq method. Type 'AlreadyNormalized' if the input matrix is already normalized.
mingenesetsize    Minimum size of gene set allowed. Gene sets smaller than this value are filtered out from the analysis. Default = 10
maxgenesetsize    Maximum size of gene set allowed. Gene sets larger than this value are filtered out from the analysis. Default = 300
q                 Weight exponent for the gene score. For example, if q=0, only the rank of the gene score is reflected in calculating the gene set score (preranked GSEA). If q=1, the gene score itself is used. If q=2, the square of the gene score is used.
nperm             The number of gene permutations. Default = 1000.

absolutegenescore Boolean. Whether to take the absolute value of the gene score (TRUE) or not (FALSE).
GSEAtype          Type of GSEA. Possible values are "absolute", "original" or "absfilter": "absolute" for the one-tailed absolute GSEA, "original" for the original two-tailed GSEA, and "absfilter" for the original GSEA filtered by the results of the one-tailed absolute GSEA.
FDR               FDR cutoff for the original or absolute GSEA. Default = 0.05
FDRfilter         FDR cutoff for the one-tailed absolute GSEA used for absolute filtering (only used when GSEAtype is "absfilter"). Default = 0.05
mincount          Minimum median count of a gene to be included in the analysis. It is used for gene filtering to exclude genes with small read counts. Default = 0

Details

Typical usage is

GenePermGSEA(countMatrix = countmatrix, GeneScoreType = "moderated_t", idxcase = 1:3,
    idxcontrol = 4:6, GenesetFile = "geneset.txt", GSEAtype = "absfilter")

Value

GSEA result table sorted by FDR Q-value.

Source

Nam, D. Effect of the absolute statistic on gene-sampling gene-set analysis methods. Stat Methods Med Res 2015.

Subramanian, A., et al. Gene set enrichment analysis: A knowledge-based approach for interpreting genome-wide expression profiles. P Natl Acad Sci USA 2005;102(43):15545-15550.

Li, J. and Tibshirani, R. Finding consistent patterns: A nonparametric approach for identifying differential expression in RNA-Seq data. Statistical Methods in Medical Research 2013;22(5):519-536.

References

Nam, D. Effect of the absolute statistic on gene-sampling gene-set analysis methods. Stat Methods Med Res 2015.

Subramanian, A., et al. Gene set enrichment analysis: A knowledge-based approach for interpreting genome-wide expression profiles. P Natl Acad Sci USA 2005;102(43):15545-15550.

Li, J. and Tibshirani, R. Finding consistent patterns: A nonparametric approach for identifying differential expression in RNA-Seq data. Statistical Methods in Medical Research 2013;22(5):519-536.

Anders, S. and Huber, W. Differential expression analysis for sequence count data. Genome Biology 2010;11:R106.

Examples

data(example)

# Create a gene set file and save it to your local directory.
# Note that you can use your own gene set file (tab-delimited, like a *.gmt file from MSigDB).
# Here we generate a toy gene set file to show the structure of such a file.
# It consists of 50 gene sets, each containing 100 genes.

# Each line of the file is one gene set: its name followed by its genes, tab-separated.
# append = TRUE adds to an existing file, so delete any old geneset.txt before re-running.
for(i in 1:50)
{
    GenesetName = paste("geneset", i, sep = "_")
    Genes = paste("gene", (i*100-99):(i*100), sep = "", collapse = '\t')
    GenesetLine = paste(GenesetName, Genes, sep = '\t')
    write(GenesetLine, file = "geneset.txt", append = TRUE, ncolumns = 1)
}

# Run Gene-permuting GSEA
RES = GenePermGSEA(countMatrix = example, GeneScoreType = "moderated_t", idxcase = 1:3,
    idxcontrol = 4:6, GenesetFile = 'geneset.txt', normalization = 'DESeq',
    GSEAtype = "absfilter")
RES
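To make the roles of q, nperm and the "absfilter" option more concrete, the following base R sketch computes a weighted enrichment score in the style of Subramanian et al. (2005), estimates its significance by permuting gene labels, and keeps only gene sets that pass both the two-tailed and the one-tailed absolute test. It is a simplified illustration and not the package's implementation: the function names enrichment_score and gene_perm_pvalue are hypothetical, the two-tailed p-value compares absolute scores rather than same-sign null scores, and Benjamini-Hochberg adjustment from p.adjust stands in for the FDR Q-values that GenePermGSEA reports.

```r
# Weighted running-sum enrichment score.
# scores: named numeric vector of gene scores; set: character vector of gene names.
enrichment_score <- function(scores, set, q = 1) {
  s      <- scores[order(scores, decreasing = TRUE)]
  in_set <- names(s) %in% set
  n_hit  <- sum(in_set)
  if (n_hit == 0 || n_hit == length(s)) return(NA_real_)
  w    <- abs(s)^q                       # q = 0: every hit has weight 1 (rank only)
  hit  <- cumsum(ifelse(in_set, w, 0)) / sum(w[in_set])
  miss <- cumsum(!in_set) / (length(s) - n_hit)
  dev  <- hit - miss
  dev[[which.max(abs(dev))]]             # largest deviation from zero, sign kept
}

# Gene-permutation p-value: shuffle gene labels nperm times.
# one_tailed = TRUE mimics the absolute GSEA (absolute gene scores, upper tail only).
gene_perm_pvalue <- function(scores, set, q = 1, nperm = 1000, one_tailed = FALSE) {
  if (one_tailed) scores <- abs(scores)
  obs  <- enrichment_score(scores, set, q)
  null <- replicate(nperm, {
    permuted <- scores
    names(permuted) <- sample(names(scores))
    enrichment_score(permuted, set, q)
  })
  extreme <- if (one_tailed) sum(null >= obs) else sum(abs(null) >= abs(obs))
  list(ES = obs, p = (extreme + 1) / (nperm + 1))
}

# Toy data (assumed): 1000 genes, one shifted set and one unshifted set.
set.seed(42)
scores <- setNames(rnorm(1000), paste0("gene", 1:1000))
scores[1:50] <- scores[1:50] + 2
sets <- list(up = paste0("gene", 1:50), null = paste0("gene", 501:550))

p_original <- sapply(sets, function(g) gene_perm_pvalue(scores, g, nperm = 200)$p)
p_absolute <- sapply(sets, function(g)
  gene_perm_pvalue(scores, g, nperm = 200, one_tailed = TRUE)$p)

# Absolute filtering: a set is reported only if it passes both cutoffs.
keep <- p.adjust(p_original, "BH") <= 0.05 & p.adjust(p_absolute, "BH") <= 0.05
keep
```

The two cutoffs in the last step play the roles that FDR and FDRfilter play in GenePermGSEA: the first applies to the original test, the second to the one-tailed absolute test used as the filter.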

Index

Topic datasets
    example

example
GenePermGSEA