Package xseq. R topics documented: September 11, 2015

Similar documents
Package citccmst. February 19, 2015

Package inctools. January 3, 2018

Package ORIClust. February 19, 2015

Package ordibreadth. R topics documented: December 4, Type Package Title Ordinated Diet Breadth Version 1.0 Date Author James Fordyce

Package p3state.msm. R topics documented: February 20, 2015

Package AbsFilterGSEA

Package propoverlap. R topics documented: February 20, Type Package

Package ega. March 21, 2017

Package appnn. R topics documented: July 12, 2015

Package HAP.ROR. R topics documented: February 19, 2015

Package frailtyhl. February 19, 2015

Package speff2trial. February 20, 2015

Package MethPed. September 1, 2018

Package nopaco. R topics documented: April 13, Type Package Title Non-Parametric Concordance Coefficient Version 1.0.

Package QualInt. February 19, 2015

Package GANPAdata. February 19, 2015

Genetic alterations of histone lysine methyltransferases and their significance in breast cancer

Package diggitdata. April 11, 2019

Package Actigraphy. R topics documented: January 15, Type Package Title Actigraphy Data Analysis Version 1.3.

Package inote. June 8, 2017

Package prognosticroc

Package BUScorrect. September 16, 2018

Package cssam. February 19, 2015

Package CompareTests

Package mdsdt. March 12, 2016

Package pollen. October 7, 2018

Package hei. August 17, 2017

Nature Genetics: doi: /ng Supplementary Figure 1. Somatic coding mutations identified by WES/WGS for 83 ATL cases.

Session 4 Rebecca Poulos

Package longroc. November 20, 2017

Package CLL. April 19, 2018

Package MSstatsTMT. February 26, Title Protein Significance Analysis in shotgun mass spectrometry-based

Package sbpiper. April 20, 2018

Package singlecase. April 19, 2009

Package TFEA.ChIP. R topics documented: January 28, 2018

Package ClusteredMutations

Biostatistical modelling in genomics for clinical cancer studies

Session 4 Rebecca Poulos

Nature Medicine: doi: /nm.3967

The Loss of Heterozygosity (LOH) Algorithm in Genotyping Console 2.0

Package cancer. July 10, 2018

Package wally. May 25, Type Package

Package DeconRNASeq. November 19, 2017

ivlewbel: Uses heteroscedasticity to estimate mismeasured and endogenous regressor models

Cancer Informatics Lecture

PSSV User Manual (V1.0)

Package clust.bin.pair

Package EpiEstim. R topics documented: February 19, Version Date 2013/03/08

Package msgbsr. January 17, 2019

Package pepdat. R topics documented: March 7, 2015

Package preference. May 8, 2018

Nature Medicine: doi: /nm.4439

Package influence.me

Package StepReg. November 3, 2017

Package noia. February 20, 2015

Package PELVIS. R topics documented: December 18, Type Package

Variant Classification. Author: Mike Thiesen, Golden Helix, Inc.

Package medicalrisk. August 29, 2016

PSSV User Manual (V2.1)

Package seqdesign. April 27, Version 1.1 Date

Package CopulaREMADA

Exploring TCGA Pan-Cancer Data at the UCSC Cancer Genomics Browser

Package mediation. September 18, 2009

Package splicer. R topics documented: August 19, Version Date 2014/04/29

Package CTT. February 4, 2018

Package CorMut. January 21, 2019

Hands-On Ten The BRCA1 Gene and Protein

Package ICCbin. November 14, 2017

Package RNAsmc. December 11, 2018

TADA: Analyzing De Novo, Transmission and Case-Control Sequencing Data

A Quick-Start Guide for rseqdiff

DNA-seq Bioinformatics Analysis: Copy Number Variation

Package PROcess. R topics documented: June 15, Title Ciphergen SELDI-TOF Processing Version Date Author Xiaochun Li

Package AIMS. June 29, 2018

Package CopulaREMADA

bivariate analysis: The statistical analysis of the relationship between two variables.

Identifying Thyroid Carcinoma Subtypes and Outcomes through Gene Expression Data Kun-Hsing Yu, Wei Wang, Chung-Yu Wang

Predicting Kidney Cancer Survival from Genomic Data

Molecular And Genetic Properties Of Breast Cancer Associated With Local Immune Activity

Inferring Biological Meaning from Cap Analysis Gene Expression Data

Package bionetdata. R topics documented: February 19, Type Package Title Biological and chemical data networks Version 1.0.

Introduction to LOH and Allele Specific Copy Number User Forum

Package xmeta. August 16, 2017

Item Response Theory for Polytomous Items Rachael Smyth

# Assessment of gene expression levels between several cell group types is a common application of the unsupervised technique.

Nature Genetics: doi: /ng Supplementary Figure 1. SEER data for male and female cancer incidence from

Metabolomic Data Analysis with MetaboAnalyst

Package leukemiaseset

Package TargetScoreData

Nature Methods: doi: /nmeth.3115

Package flowtype. R topics documented: July 18, Type Package. Title Phenotyping Flow Cytometry Assays. Version

SubLasso:a feature selection and classification R package with a. fixed feature subset

Supplementary Figure 1: Features of IGLL5 Mutations in CLL: a) Representative IGV screenshot of first

ncounter Assay Automated Process Immobilize and align reporter for image collecting and barcode counting ncounter Prep Station

A Versatile Algorithm for Finding Patterns in Large Cancer Cell Line Data Sets

Package CancerMutationAnalysis

Human Cancer Genome Project. Bioinformatics/Genomics of Cancer:

PROcess. June 1, Align peaks for a specified precision. A user specified precision of peak position.

Transcription:

Package xseq September 11, 2015 Title Assessing Functional Impact on Gene Expression of Mutations in Cancer Version 0.2.1 Date 2015-08-25 Author Jiarui Ding, Sohrab Shah Maintainer Jiarui Ding <jiaruid@cs.ubc.ca> Depends R (>= 3.1.0) Imports e1071 (>= 1.6-4), gptk (>= 1.08), impute (>= 1.38.1), preprocesscore (>= 1.26.1), RColorBrewer (>= 1.1-2), sfsmisc (>= 1.0-27) Suggests knitr A hierarchical Bayesian approach to assess functional impact of mutations on gene expression in cancer. Given a patient-gene matrix encoding the presence/absence of a mutation, a patient-gene expression matrix encoding continuous value expression data, and a graph structure encoding whether two genes are known to be functionally related, xseq outputs: a) the probability that a recurrently mutated gene g influences gene expression across the population of patients; and b) the probability that an individual mutation in gene g in an individual patient m influences expression within that patient. License GPL (>= 2) LazyLoad ture VignetteBuilder knitr LazyData TRUE NeedsCompilation yes Repository CRAN Date/Publication 2015-09-11 08:04:31 R topics documented: cna.call........................................... 2 cna.logr........................................... 3 1

2 cna.call ConvertXseqOutput..................................... 3 EstimateExpression..................................... 4 expr............................................. 4 FilterNetwork........................................ 5 GetExpressionDistribution................................. 6 ImputeKnn......................................... 7 InferXseqPosterior..................................... 7 InitXseqModel....................................... 8 LearnXseqParameter.................................... 9 mut............................................. 9 net.............................................. 10 NormExpr.......................................... 10 PlotRegulationHeatmap................................... 11 QuantileNorm........................................ 12 SetXseqPrior........................................ 12 Index 14 cna.call TCGA AML SNP6.0 GISTIC copy number alteration calls Format Source A dataset containing part of the The Cancer Genome Atlas acute myeloid leukemia Affymetrix SNP6.0 array copy number alteration calls from GISTIC cna.call A matrix containing the GISTIC copy number calls of 454 genes in 197 patients: Row names are patient identifiers Colume names are official HGNC gene symbols Each element of the matrix is coded: -2, homozygous deletions -1, hemizygous deletions 0, neutral 1, gain 2, amplifications https://www.synapse.org/#!synapse:syn300013

cna.logr 3 cna.logr TCGA AML SNP6.0 copy number alteration data A dataset containing part of the The Cancer Genome Atlas acute myeloid leukemia Affymetrix SNP6.0 array copy number alteration log2 ratios cna.logr Format A matrix containing the copy number log2 ratios of 454 genes in 197 patients: Row names are patient identifiers Colume names are official HGNC gene symbols Source https://www.synapse.org/#!synapse:syn300013 ConvertXseqOutput Convert xseq output to a data.frame Convert xseq output to a data.frame ConvertXseqOutput(posterior) posterior The posterior probabilities of mutations and mutated genes, output from InferXseqPosterior A data.frame with sample, gene, probability of individual mutations and the probabilities of individual mutated genes

4 expr EstimateExpression A mixture modelling approach to estiamte whether a gene is expressed in a study given RNA-seq gene expression data A mixture modelling approach to estiamte whether a gene is expressed in a study given RNA-seq gene expression data EstimateExpression(expr, show.plot = FALSE, loglik = TRUE, xlab = "Expression", ylab = "Density",...) expr show.plot loglik xlab ylab A matrix of RNA-seq gene expression values where each row corresponds to a patient and each column is a gene. Typically the expression of each gene is the log2 transformed RSEM value. Logical, specifying whether to plot results Logical, whether plot the log-likelihoods xlab of the plot ylab of the plot... for plotting A weight vector representing whether individual genes are expressed in the study Examples data(expr) weight = EstimateExpression(expr) expr TCGA AML SNP6.0 gene expression data A dataset containing part of the The Cancer Genome Atlas acute myeloid leukemia RNA-seq gene expression data expr

FilterNetwork 5 Format A matrix containing the expression of 454 genes in 197 patients: Row names are patient identifiers Colume names are official HGNC gene symbols Source https://www.synapse.org/#!synapse:syn300013 FilterNetwork Filter network Filter network FilterNetwork(net, weight, min.weight = 0.8, min.conn.strength = 0.4, min.num.conn = 5, max.num.conn = 50, remove.self.connection = TRUE, debug = FALSE) net weight List, a gene interaction network The weights of genes, could from the function EstimateExpression min.weight Filter the connected genes with weights less than min.weight min.conn.strength The minimum gene connection strength min.num.conn The minimum number of connections required for a gene to be considered for trans-analysis max.num.conn Only keep the top max.conn genes remove.self.connection Logical, whether removing self-connections or not debug The filtered network Examples data(net) net.filt = FilterNetwork(net) Logical, specifying whether debug information should be printed

6 GetExpressionDistribution GetExpressionDistribution Get the conditional distributions for a set of genes Get the conditional distributions for a set of genes GetExpressionDistribution(expr, mut = NULL, cna.call = NULL, gene = NULL, type = "student", show.plot = FALSE) expr mut cna.call gene type show.plot A matrix of gene expression values where each row corresponds to a patient and each column is a gene. A data.frame of mutations. The data.frame should have three columns of characters: sample, hgnc_symbol, and variant_type. The variant_type column cat be either "HOMD", "HLAMP", "MISSENSE", "NONSENSE", "FRAMESHIFT", "INFRAME", "SPLICE", "NONSTOP", "STARTGAINED", "SYNONYMOUS", "OTHER", "FUSION", "COMPLEX". A matrix containing the copy number calls, where each element is coded: -2, homozygous deletions -1, hemizygous deletions 0, neutral 1, gain 2, amplifications A character vector of official HGNC gene names Character, either Gaussian ("gauss") or Student ("student") Logical, specifying whether to plot the fitted model A list containg the fitted expression distributions

ImputeKnn 7 ImputeKnn Impute missing values (NAs) using K-nearest neighbour averaging Impute missing values (NAs) using K-nearest neighbour averaging ImputeKnn(X, ratio = 0.5,...) X A matrix of real values where each row corresponds to a patient and each column is a gene. ratio The rows (columns) with more than (default 50%) of missing values are removed... to be passed to impute.knn A matrix without NAs Examples data(expr) expr.norm = ImputeKnn(expr) InferXseqPosterior Learn xseq parameters given an initialized model Learn xseq parameters given an initialized model InferXseqPosterior(model, constraint, debug = FALSE) model An xseq model. constraint A list of constraints on θ G F. debug Logical, specifying whether debug information should be printed The posterior probabilities of latent variables in xseq

8 InitXseqModel InitXseqModel The datastructure to store the xseq models The datastructure to store the xseq models InitXseqModel(mut, expr, net, expr.dis, prior, cpd, gene, p.h, weight, cis = FALSE, debug = FALSE) mut expr net expr.dis prior cpd gene p.h weight cis debug A data.frame of mutations. The data.frame should have three columns of characters: sample, hgnc_symbol, and variant_type. The variant_type column cat be either "HOMD", "HLAMP", "MISSENSE", "NONSENSE", "FRAMESHIFT", "INFRAME", "SPLICE", "NONSTOP", "STARTGAINED", "SYNONYMOUS", "OTHER", "FUSION", "COMPLEX". A matrix of gene expression values where each row corresponds to a patient and each column is a gene A list of gene interaction networks The fitted gene expression distributions, output from GetExpressionDistribution The prior for xseq, output from SetXseqPrior A list of conditional probability tables for xseq, output from SetXseqPrior A character vector of gene names, default to all the genes with mutations The down-regulation probability list of each gene connected to a mutated gene, typically from running LearnXseqParameter on a discovery dataset The weight list of each gene connected to a mutated gene, typically from running LearnXseqParameter on a discovery dataset Logical, cis or trans analysis Logical, whether to output debug information A xseq model

LearnXseqParameter 9 LearnXseqParameter Learn xseq parameters given an initialized model Learn xseq parameters given an initialized model LearnXseqParameter(model, constraint, iter.max = 20, threshold = 1e-05, cis = FALSE, debug = FALSE, show.plot = TRUE) model An xseq model constraint A list of constraints on θ G F. iter.max Maximum number of iterations in learning xseq parameters threshold The threshold to stop learning paramters cis Logical, cis-analysis or trans-analysis debug Logical, specifying whether debug information should be printed show.plot Logical, specifying whether to plot the Log-Likelihoods A list including the learned xseq model mut TCGA AML somatic mutation data A dataset containing the The Cancer Genome Atlas acute myeloid leukemia somatic mutation data mut Format A data frame with 2311 rows and 12 variables: sample. character, patient identifier hgnc_symbol. character, official HGNC gene symbols variant_type. character, mutation type, can be either "HOMD", "HLAMP", "MISSENSE", "NONSENSE", "FRAMESHIFT", "INFRAME", "SPLICE", "NONSTOP", "STARTGAINED", "SYNONYMOUS", "OTHER", "FUSION", "COMPLEX"...

10 NormExpr Source https://www.synapse.org/#!synapse:syn1729383 net A networks containing gene associations A list of gene interactions net Format A list with 16 elements: List names are official HGNC gene symbols Each element of the list is a vector, and the name of the vector is an official HGNC gene symbol. Each element of the vector is a real number between 0 and 1, representing the assocation strongth between two genes. The names of the elements of the vector are also official HGNC gene symbols. NormExpr Remove the cis-effects of copy number alterations on gene expression Remove the cis-effects of copy number alterations on gene expression NormExpr(expr, cna.logr, gene, type = "gp", debug = FALSE, show.plot = FALSE, show.norm = TRUE) expr cna.logr gene A matrix of gene expression values where each row corresponds to a patient and each column is a gene. A matrix of copy number alterations log2 ratio where each row corresponds to a patient and each column is a gene. A character vector of gene HGNC symbols, default for all genes with both gene expression and copy number log2 ratio data

PlotRegulationHeatmap 11 type debug show.plot show.norm A character, either Gaussian process regression ("gp") or suppor vector machine regression ("svm") Logical, specifying whether debug information should be printed Logical, specifying whether to plot the orginal expression and the normalized expression for a gene Logical, specifying whether to plot the express of a gene after normalization, only used when show.plot = TRUE. The normalized expression matrix Examples data(cna.logr, expr) expr.norm = NormExpr(cna.logr, expr, gene="pten") PlotRegulationHeatmap Heatmap showing the connected genes dysregulation probabilities Heatmap showing the connected genes dysregulation probabilities PlotRegulationHeatmap(gene, posterior, mut, subtype = NULL, main = "in_cancer",...) gene posterior mut subtype main A character vector of gene names The xseq posteriors, output of InferXseqPosterior or LearnXseqParameter A data.frame of mutations. The data.frame should have at least three columns of characters: sample, hgnc_symbol, and variant_type. The variant_type column cat be either "HOMD", "HLAMP", "MISSENSE", "NONSENSE", "FRAMESHIFT", "INFRAME", "SPLICE", "NONSTOP", "STARTGAINED", "SYNONYMOUS", "OTHER", "FUSION", "COMPLEX". A vector representing a character of each patient, e.g., subtype The heatmap title... Other parameters passed to heatmap.2

12 SetXseqPrior QuantileNorm Quantile normalize a matrix Quantile normalize a matrix QuantileNorm(X) X A matrix of real values where each row corresponds to a patient and each column is a gene The normalized matrix of X Examples data(expr) expr.quantile = QuantileNorm(expr) SetXseqPrior Set model paramerter priors Set model paramerter priors SetXseqPrior(mut, expr.dis, net, regulation.direction = TRUE, cis = TRUE, mut.type = "loss",...) mut expr.dis net A data.frame of mutations. The data.frame should have three columns of characters: sample, hgnc_symbol, and variant_type. The variant_type column cat be either "HOMD", "HLAMP", "MISSENSE", "NONSENSE", "FRAMESHIFT", "INFRAME", "SPLICE", "NONSTOP", "STARTGAINED", "SYNONYMOUS", "OTHER", "FUSION", "COMPLEX". A list, the outputs from calling GetExpressionDistribution A list of gene interactions

SetXseqPrior 13 regulation.direction Logical, whether considering the directionality, i.e., up-regulation or downregulatin of genes, only used when cis=false. cis mut.type Logical, cis analysis or trans analysis Character, only used when cis = TRUE, and can be either loss, gain or both... Reserved for extension

Index Topic datasets cna.call, 2 cna.logr, 3 expr, 4 mut, 9 net, 10 cna.call, 2 cna.logr, 3 ConvertXseqOutput, 3 EstimateExpression, 4 expr, 4 FilterNetwork, 5 GetExpressionDistribution, 6 ImputeKnn, 7 InferXseqPosterior, 7 InitXseqModel, 8 LearnXseqParameter, 9 mut, 9 net, 10 NormExpr, 10 PlotRegulationHeatmap, 11 QuantileNorm, 12 SetXseqPrior, 12 14