Quality assessment of TCGA Agilent gene expression data for ovary cancer


Nianxiang Zhang & Keith A. Baggerly
Dept of Bioinformatics and Computational Biology
MD Anderson Cancer Center
Oct 1, 2010

AgiExpQC.Rnw

Contents

1 Executive Summary
  1.1 Introduction
  1.2 Methods
      Data
      Statistical methods
  1.3 Results and Conclusions
2 Data Analysis
  2.1 Data consistency across versions
  2.2 Level 1 to Level 2 data
  2.3 Level 2 to Level 3 data
      Load level 2/3 data
      Sample Labeling consistency of Level 2 and Level 3 data
  2.4 Batch effects in Level 2 and Level 3 data
3 Appendix
  3.1 File Location
  3.2 SessionInfo

List of Figures

1 Correlation of level1 and level 2 data
2 Probes in the G4502A_07_2 and G4502A_07_3 platforms and in level 2 data
3 Consistency of level 2 and level 3 data by correlation. Level 2 data are summarized to gene level by taking the mean of probes that belong to the same gene. Then pairwise correlations between summarized level 2 and level 3 data are calculated. Red color represents correlation coefficient > 0.9
4 Batch effects in Level 2 Agilent gene expression data. An average across all probes for each sample is calculated using level 2 data. The average gene expression level for samples is shown by batch
5 Batch effects in Level 3 Agilent gene expression data. An average across all genes for each sample is calculated using level 3 data. The average gene expression level for samples is shown by batch
6 The top probes with batch effects in level 2 data
7 The top probes with batch effects in level 3 data

1 Executive Summary

1.1 Introduction

We are interested in assessing the quality of TCGA data, including Agilent gene expression data. We would like to examine the consistency of the data across levels. We also want to assess batch effects in Level 2 and Level 3 data.

1.2 Methods

Data

We use the MD Anderson local copy of the TCGA Agilent gene expression data at //gcgserv.mdanderson.org/tcga-PUBLIC/tcga/tumor/ov/cgcc/unc.edu for QC assessment of Level 1 data. We use the consolidated level 2 and level 3 data from the TCGA data portal, located at mdadqsfs02/workspace/nzhangtcgadata/ovarian/expression-Genes.

Statistical methods

We use the limma package in R to assess Level 1 data. We summarize Level 2 data to the gene level by taking the mean expression of the probes located on the same gene. We calculate the Pearson correlation coefficient to assess the consistency of Level 2 and Level 3 data.

1.3 Results and Conclusions

By checking random samples, we found the data to be consistent across versions for levels 1 to 3. We found that 67 probes in the level 2 data do not exist in the level 1 data. We do not know how the level 2 data were obtained. They are highly correlated with the LogRatio data processed by the Feature Extraction (FE) software, but they are not taken directly from FE. We did not identify any mislabeling problems. We do see batch effects in the data: the average expression level across all genes differs between samples from different batches. The situation is similar for level 2 and level 3 data.

2 Data Analysis

2.1 Data consistency across versions

In order to perform QC assessment of the data, we need to figure out the data storage structure and retrieve the proper files.
We define the directories for the two platforms.

> datapath2 <- "//gcgserv.mdanderson.org/tcga-public/tcga/tcga-stage/anonsite/tcga/tumor/ov/cgcc/unc.edu
> datapath3 <- "//gcgserv.mdanderson.org/tcga-public/tcga/tcga-stage/anonsite/tcga/tumor/ov/cgcc/unc.edu

We write 2 functions to get the directory names and the file names.

> getdir <- function(dp, ...) {
+     Dirs <- list.files(dp, ...)
+     Dirs <- Dirs[-grep("gz", Dirs)]
+     return(Dirs)
+ }
> extract.datafilename <- function(dpath, level = 1, ...) {
+     allfile <- list.files(dpath, ...)
+     data.file <- allfile[grep("us", allfile)]
+     data.file <- sort(data.file)
+     name23 <- grep("level", data.file)
+     level1 <- data.file[0 - name23]
+     level2 <- data.file[grep("level2", data.file)]
+     level3 <- data.file[grep("level3", data.file)]
+     switch(level, `1` = return(level1), `2` = return(level2),
+         `3` = return(level3))
+ }

We check Level 1 data first. We examine the samples from different versions to make sure the data file names are the same in different versions.

> temp.ver <- getdir(datapath2, full.names = T)
> identical(extract.datafilename(temp.ver[1]), extract.datafilename(temp.ver[2]))
> temp1 <- extract.datafilename(temp.ver[1], full.names = T)
> temp2 <- extract.datafilename(temp.ver[2], full.names = T)

We choose 3 random samples from the different versions and load the level 1 data, which are from the Feature Extraction software. They are identical.

> temp.ind <- sample(1:length(temp1), 3)
> choosensamp <- extract.datafilename(temp.ver[1])[temp.ind]
> RG1 <- read.maimages(files = temp1[temp.ind], source = "agilent",
+     other.columns = list("ControlType", "LogRatio", "gProcessedSignal",
+         "rProcessedSignal"))
> RG2 <- read.maimages(files = temp2[temp.ind], source = "agilent",
+     other.columns = list("ControlType", "LogRatio", "gProcessedSignal",
+         "rProcessedSignal"))
> colnames(RG1) <- colnames(RG2) <- NULL
> identical(RG1$R, RG2$R)
> identical(RG1$other, RG2$other)

We also check whether the Level 2 and Level 3 data are consistent across different versions. The level 2 and 3 data that we checked for the two versions are identical.

> temp.ver <- getdir(datapath2, full.names = T)
> identical(extract.datafilename(temp.ver[1]), extract.datafilename(temp.ver[2]))
> for (level in 2:3) {
+     temp1 <- extract.datafilename(temp.ver[1], level = level,
+         full.names = T)
+     temp2 <- extract.datafilename(temp.ver[2], level = level,
+         full.names = T)
+     date1 <- date()
+     for (ii in 1:length(temp1)) {
+         templ1 <- read.table(file = temp1[ii], skip = 2, fill = T)
+         templ2 <- read.table(file = temp2[ii], skip = 2, fill = T)
+         colnames(templ1) <- colnames(templ2) <- NULL
+         if (!identical(templ1, templ2))
+             cat(paste(temp1[ii], "is different from \n", temp2[ii],
+                 "\n"))
+         else cat("ok \n")
+     }
+ }

2.2 Level 1 to Level 2 data

We do not know how level 2 data were obtained from Level 1 data. We compute the mean of the processed LogRatio data in level 1 for each probe and check its correlation with the level 2 data. The correlation between the LogRatio mean and the level 2 data is shown in Figure 1.

> NC1 <- RG1$genes[RG1$other$ControlType[, 1] == 0, ]
> length(unique(NC1$ProbeName))
> table(table(NC1$ProbeName))
> temp.ver <- getdir(datapath2, full.names = T)
> temp.level1.file <- extract.datafilename(temp.ver[1], level = 1,
+     full.names = T)
> temp.rg <- read.maimages(files = temp.level1.file[1], source = "agilent",
+     other.columns = list("ControlType", "LogRatio", "gProcessedSignal",
+         "rProcessedSignal"))
> temp.level2.file <- extract.datafilename(temp.ver[1], level = 2,
+     full.names = T)
> chosensamp <- extract.datafilename(temp.ver[1])[1]
> temp <- substr(chosensamp, 1, 30)
> chosen.level2.file <- unlist(lapply(temp, function(x) temp.level2.file[grep(x,
+     temp.level2.file)]))
> temp <- read.table(chosen.level2.file[1], header = F, skip = 2,
+     fill = T)
> matchlevel1data <- temp.rg[match(temp[, 1], temp.rg$genes$ProbeName),
+     ]
> replevel1data <- temp.rg[duplicated(temp.rg$genes$ProbeName),
+     ]
> replevel1data <- temp.rg[match(temp[, 1], replevel1data$genes$ProbeName),
+     ]
> identical(matchlevel1data$other$LogRatio, temp[, 2])
> temp.lrmean <- tapply(temp.rg$other$LogRatio[, 1], INDEX = temp.rg$genes$ProbeName,
+     mean, na.rm = T)
> temp.lrmean.match <- temp.lrmean[match(as.vector(temp[, 1]),
+     names(table(temp.rg$genes$ProbeName)))]
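The probe-averaging comparison above can be tried on toy data. The following is a minimal sketch in base R, not the report's actual pipeline: the data frames `level1` and `level2` and all values in them are made up for illustration, standing in for the Feature Extraction LogRatio column and the level 2 file.

```r
# Toy level 1 table: some probes are spotted more than once (hypothetical values).
level1 <- data.frame(
    ProbeName = c("p1", "p1", "p2", "p3", "p3", "p3", "p4"),
    LogRatio  = c(0.9, 1.1, -0.5, 2.0, 2.2, 1.8, 0.3),
    stringsAsFactors = FALSE
)

# Toy level 2 table: one value per probe, listed in a different order.
level2 <- data.frame(
    ProbeName = c("p3", "p1", "p4", "p2"),
    value     = c(2.05, 0.98, 0.31, -0.52),
    stringsAsFactors = FALSE
)

# Average the replicate spots of each probe.
lr_mean <- tapply(level1$LogRatio, INDEX = level1$ProbeName,
                  FUN = mean, na.rm = TRUE)

# Line the averages up with the level 2 probe order.
lr_mean_match <- as.vector(lr_mean[match(level2$ProbeName, names(lr_mean))])

# Pearson correlation between averaged level 1 LogRatio and level 2 values.
r <- cor(lr_mean_match, level2$value, use = "pairwise.complete.obs")
```

A correlation close to, but not exactly, 1 would be consistent with the level 2 values being derived from the LogRatio data by some additional processing step rather than copied directly.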

Figure 1: Correlation of level1 and level 2 data.

> pdf("level1level2corr.pdf")
> plot(temp.lrmean.match, temp[, 2], xlab = "Mean LogRatio Level1",
+     ylab = "Level 2", pch = ".")
> invisible(dev.off())
> cor(temp.lrmean.match, temp[, 2], use = "pairwise.complete.obs")

We also check the level 1 data for the other platform, AgilentG4502A_07_3.

> temp.ver <- getdir(datapath3, full.names = T)
> identical(extract.datafilename(temp.ver[2]), extract.datafilename(temp.ver[3]))
> temp3 <- extract.datafilename(temp.ver[2], full.names = T)
> temp4 <- extract.datafilename(temp.ver[3], full.names = T)

We choose a sample from the AgilentG4502A_07_3 platform and load the level 1 data, which are from the Feature Extraction software. The level 1 data from the different versions are consistent.

> sampleid <- substr(extract.datafilename(temp.ver[2])[1], 1, 30)
> RG3 <- read.maimages(files = temp3[grep(sampleid, temp3)], source = "agilent",
+     other.columns = list("ControlType", "LogRatio", "gProcessedSignal",
+         "rProcessedSignal"))
> RG4 <- read.maimages(files = temp4[grep(sampleid, temp4)], source = "agilent",
+     other.columns = list("ControlType", "LogRatio", "gProcessedSignal",
+         "rProcessedSignal"))
> colnames(RG3) <- colnames(RG4) <- NULL
> identical(RG3$R, RG4$R)
> identical(RG3$other, RG4$other)

We remove the control probes and count the remaining probes.

> NC3 <- RG3$genes[RG3$other$ControlType[, 1] == 0, ]
> length(unique(NC3$ProbeName))

The results show that the level 1 data are consistent across versions for both platforms. Since the two platforms have different probe sets, we compare the probes in the level 1 data of the two platforms with those in the Level 2 data.

> identical(Level2data2[, 1], Level2data3[, 1])

The level 2 data for the two platforms have the same set of probes. However, 67 probes in the level 2 data are not in the level 1 data (Figure 2).

> allprobe <- unlist(unique(c(NC1$ProbeName, NC3$ProbeName, as.vector(Level2data2[,
+     1]))))
> temp.venn <- matrix(0, length(allprobe), 3)
> colnames(temp.venn) <- c("G4502A_07_2", "G4502A_07_3", "Level2_2")
> temp.venn[, 1] <- allprobe %in% NC1$ProbeName
> temp.venn[, 2] <- allprobe %in% NC3$ProbeName
> temp.venn[, 3] <- allprobe %in% Level2data2[, 1]
> pdf("probevenn.pdf")
> vennDiagram(temp.venn, circle.col = c("red", "blue", "green"),
+     lwd = 3)
> dev.off()

2.3 Level 2 to Level 3 data

Load level 2/3 data

Now we use the consolidated level 2 and level 3 data we just downloaded to do further analysis. We convert the Agilent level 2/3 data into matrix form.

> datadir <- c("../../../Expression-Genes/UNC AgilentG4502A_07_2",
+     "../../../Expression-Genes/UNC AgilentG4502A_07_3")
> if (exists("Level2data")) rm(Level2data)
> Agifile <- paste(datadir, "/Level_2/", c("unc.edu AgilentG4502A_07_2 log2_lowess_normalized.txt",
+     "unc.edu AgilentG4502A_07_3 log2_lowess_normalized.txt"),
+     sep = "")
> temp.data <- NULL

Figure 2: Probes in the G4502A_07_2 and G4502A_07_3 platforms and in level 2 data.
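The set comparison behind Figure 2 amounts to building a 0/1 membership matrix with one row per probe and one column per probe list. Below is a minimal base R sketch with made-up probe names; `probes_a`, `probes_b`, and `probes_l2` are hypothetical stand-ins for the two platforms' level 1 probe lists and the level 2 probe list.

```r
# Hypothetical probe lists for two platforms and for the level 2 data.
probes_a  <- c("p1", "p2", "p3", "p4")
probes_b  <- c("p2", "p3", "p4", "p5")
probes_l2 <- c("p2", "p3", "p6")

# Union of all probe names seen anywhere.
all_probes <- unique(c(probes_a, probes_b, probes_l2))

# One row per probe, one 0/1 column per set (adding 0 turns TRUE/FALSE into 1/0).
membership <- cbind(
    platform_a = all_probes %in% probes_a,
    platform_b = all_probes %in% probes_b,
    level2     = all_probes %in% probes_l2
) + 0
rownames(membership) <- all_probes

# Probes present in level 2 but absent from both level 1 probe lists.
in_level1 <- rowSums(membership[, c("platform_a", "platform_b")]) > 0
level2_only <- all_probes[membership[, "level2"] == 1 & !in_level1]
```

A matrix of this shape is what the report passes to limma's `vennDiagram`; listing `level2_only` directly is a quick way to see which probes make up the unexplained region of the diagram.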

> for (j in 1:length(Agifile)) {
+     s.name <- read.delim(file = Agifile[j], sep = "\t", header = F,
+         nrow = 1, stringsAsFactors = F, row.names = 1)
+     temp.raw <- read.delim(file = Agifile[j], sep = "\t", header = F,
+         skip = 2, stringsAsFactors = F, row.names = 1)
+     colnames(temp.raw) <- t(s.name)
+     if (!exists("Level2data"))
+         Level2data <- temp.raw
+     else {
+         stopifnot(identical(rownames(Level2data), rownames(temp.raw)))
+         Level2data <- cbind(Level2data, temp.raw)
+     }
+ }
> rm(Agifile)
> rm(list = ls(pattern = "temp"))
> for (i in 1:ncol(Level2data)) Level2data[, i] <- as.numeric(Level2data[,
+     i])
> save(Level2data, file = file.path("rdataobjects", "AgilentOVLevel2Data.Rda"))
> Agifile <- paste(datadir, "/Level_3/", c("unc.edu AgilentG4502A_07_2 gene_expression_analysis_1.txt",
+     "unc.edu AgilentG4502A_07_3 gene_expression_analysis_1.txt"),
+     sep = "")
> if (exists("Level3data")) rm(Level3data)
> for (j in 1:length(Agifile)) {
+     temp.raw <- read.delim(file = Agifile[j], sep = "\t", header = T,
+         stringsAsFactors = F)
+     temp <- matrix(as.numeric(temp.raw[, 3]), ncol = length(table(temp.raw[,
+         1])), nrow = length(table(temp.raw[, 2])), dimnames = list(unique(temp.raw[,
+         2]), unique(temp.raw[, 1])))
+     temp.raw[is.na(as.numeric(temp.raw[, 3])), ]
+     if (!exists("Level3data"))
+         Level3data <- temp
+     else {
+         stopifnot(identical(rownames(Level3data), rownames(temp)))
+         Level3data <- cbind(Level3data, temp)
+     }
+ }
> rm(Agifile)
> rm(list = ls(pattern = "temp"))
> save(Level3data, file = file.path("rdataobjects", "AgilentOVLevel3Data.Rda"))

We make sure the Level 2 and Level 3 data cover the same set of samples.

> all(colnames(Level2data) %in% colnames(Level3data))
> all(colnames(Level3data) %in% colnames(Level2data))

We reorder the columns of the Level 3 data so that the level 2 and level 3 data have the same sample order.

> Level3data <- Level3data[, colnames(Level2data)]

We get the inclusion/exclusion sample list and retain only the included samples.
We also remove one cell line sample without batch assignment.
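The filtering step described here can be sketched on toy data as follows. This is only an illustration in base R: the data frame `sample_info`, the vector `include_list`, and the sample IDs are all hypothetical, standing in for the sample annotation and the include/exclude spreadsheet used in the report.

```r
# Hypothetical sample annotation; an NA batch marks a sample without batch assignment.
sample_info <- data.frame(
    sample.id = c("S-01", "S-02", "S-03", "S-04"),
    batch     = c(9, 9, NA, 11),
    stringsAsFactors = FALSE
)

# Hypothetical list of samples marked "Include".
include_list <- c("S-01", "S-03", "S-04")

# Toy expression matrix: 3 probes by 4 samples, columns in annotation order.
expr <- matrix(1:12, nrow = 3,
               dimnames = list(paste0("probe", 1:3), sample_info$sample.id))

# Keep samples that have a batch and appear on the include list.
keep <- !is.na(sample_info$batch) & sample_info$sample.id %in% include_list

final_info <- sample_info[keep, ]
expr_kept  <- expr[, keep, drop = FALSE]
```

Applying the same logical index to the annotation and to every data matrix keeps their columns aligned, which the later correlation and batch comparisons rely on.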

> source("~/project/weinsteintcga062509/tcgafunctions.r")
> level.si <- getsi(Level2data, batchpath = "../../Effect")
> require(gdata)
> inex <- read.xls(xls = "/workspace/nzhangtcgadata/ovarian/analysis/tcga_ovarianuseandexcludelist.xls",
+     sheet = 1)
> temp.sample.in <- as.vector(inex$sample.id[inex$include.exclude ==
+     "Include"])
> keep.ind <- !is.na(level.si$batch) & paste("tcga", level.si$siteid,
+     level.si$patientid, sep = "-") %in% temp.sample.in
> final.si <- level.si[keep.ind, ]
> Level2data <- Level2data[, keep.ind]
> Level3data <- Level3data[, keep.ind]
> save(list = c("Level3data", "Level2data", "final.si"), file = file.path("rdataobjects",
+     "AgilentOVData.Rda"))

Sample Labeling consistency of Level 2 and Level 3 data

Because we do not have the annotation file for the customized array, we use the HGUG4112a annotation instead, which covers some of the probes. The number of mapped probes and the number of Level 3 genes covered by this annotation are computed below. We keep only the level 2 probes that can be mapped to genes present in the Level 3 data.

> require(hgug4112a.db)
> symbol <- unlist(mget(as.vector(rownames(Level2data)), envir = hgug4112aSYMBOL,
+     ifnotfound = NA))
> sum(!is.na(symbol))
> sum(rownames(Level3data) %in% symbol)
> mappedgene <- intersect(rownames(Level3data), symbol)
> Level2map <- Level2data[symbol %in% mappedgene, order(final.si$batch)]
> symbolmap <- symbol[symbol %in% mappedgene]
> Level3map <- Level3data[rownames(Level3data) %in% mappedgene,
+     order(final.si$batch)]
> save(list = c("Level3map", "Level2map", "final.si", "symbolmap"),
+     file = file.path("rdataobjects", "AgilentOVDataMapped.Rda"))

Now we take the mean of the probes for each gene to summarize the level 2 data.

> Level2Sum <- apply(as.matrix(Level2map), 2, function(x) tapply(x,
+     INDEX = symbolmap, mean, na.rm = T))
> Level2Sum <- Level2Sum[rownames(Level3map), ]

Now we calculate the correlation between the two data sets.
We expect the summarized level 2 data to be highly correlated with the level 3 data of the same sample. We use a threshold of 0.9 to display the pairwise correlations (Figure 3). No mislabeling is found, since all the high correlations appear on the diagonal.

> level23cor <- matrix(0, ncol(Level3map), ncol(Level3map))
> for (i in 1:ncol(Level3map)) {
+     for (j in 1:ncol(Level3map)) {
+         level23cor[i, j] <- cor(Level3map[, i], Level2Sum[, j],
+             use = "pairwise.complete.obs")
+     }
+ }
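To see how this check would flag a labeling error, here is a minimal simulation in base R. It is a sketch under stated assumptions, not the report's analysis: all data are simulated, two sample columns are swapped on purpose, and the names `level3`, `level2`, and `probe_gene` are hypothetical. It also uses the fact that `cor()` applied to two matrices returns all column-by-column correlations in one call, in place of the double loop.

```r
set.seed(42)
n_genes   <- 100
n_samples <- 5
genes   <- paste0("g", seq_len(n_genes))
samples <- paste0("s", seq_len(n_samples))

# Simulated "level 3" gene-level matrix.
level3 <- matrix(rnorm(n_genes * n_samples), nrow = n_genes,
                 dimnames = list(genes, samples))

# Simulated "level 2" probe-level matrix: two noisy probes per gene.
probe_gene <- rep(genes, each = 2)
level2 <- level3[probe_gene, ] + rnorm(2 * n_genes * n_samples, sd = 0.1)
rownames(level2) <- paste0("probe", seq_along(probe_gene))

# Swap the data of two samples to mimic a labeling error.
level2[, c(2, 3)] <- level2[, c(3, 2)]

# Summarize probes to genes by the mean, one sample (column) at a time.
level2sum <- apply(level2, 2, function(x)
    tapply(x, INDEX = probe_gene, FUN = mean, na.rm = TRUE))
level2sum <- level2sum[rownames(level3), ]

# All pairwise correlations: rows are level 3 samples, columns are level 2 samples.
cmat <- cor(level3, level2sum, use = "pairwise.complete.obs")

# For each level 3 sample, find the best-matching level 2 column.
best    <- unname(apply(cmat, 1, which.max))
suspect <- which(best != seq_len(n_samples))
```

In this simulation the high correlations for samples 2 and 3 fall off the diagonal, which is the pattern a thresholded heatmap like Figure 3 would reveal if labels had been mixed up.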

> pdf("corrlevel2and3.pdf")
> heatmap((level23cor > 0.9) + 0, Colv = NA, Rowv = NA, xlab = "Level2 Average",
+     ylab = "Level 3", col = c("grey", "red"))
> dev.off()

2.4 Batch effects in Level 2 and Level 3 data

We assess the batch effects in level 2 and level 3 data. We calculate the mean across all probes/genes for level 2 and level 3 data. The plots of the cross-gene mean are shown in Figures 4 and 5.

> genemeanlevel2 <- apply(as.matrix(Level2map), 2, mean, na.rm = T)
> genemeanlevel3 <- apply(as.matrix(Level3map), 2, mean, na.rm = T)
> pdf("level2.pdf")
> temp <- boxplot(genemeanlevel2 ~ final.si$batch, xlab = "",
+     main = "Agilent Level2 data", cex = 0.7)
> points(y = genemeanlevel2, x = jitter(rep(1:13, temp$n)), cex = 0.7)
> abline(v = 0: , col = "brown")
> dev.off()
> pdf("level3.pdf")
> temp <- boxplot(genemeanlevel3 ~ final.si$batch, xlab = "",
+     main = "Agilent Level3 data", cex = 0.7)
> points(y = genemeanlevel3, x = jitter(rep(1:13, temp$n)), cex = 0.7)
> abline(v = 0: , col = "brown")
> dev.off()

We pick some extreme genes to see how large the batch effects can be. The top probes differentially expressed by batch are shown in Figures 6 and 7.

> res <- MultiLinearModel(Y ~ batch, clindata = final.si[, ], arraydata = Level2map)
> mad2 <- apply(Level2map, 1, mad, na.rm = T)
> top6 <- mad2[order(res@p.values, -mad2)][1:6]
> pdf("level2batcheffecttop.pdf", height = 8, pointsize = 9)
> par(mfrow = c(3, 2))
> for (i in 1:6) {
+     genedata <- t(Level2map[names(top6)[i], ])
+     temp <- boxplot(genedata ~ final.si$batch, xlab = "",
+         main = names(top6)[i], cex = 0.7)
+     points(y = genedata, x = jitter(rep(1:13, temp$n)), cex = 0.7)
+     abline(v = 0: , col = "brown")
+ }
> dev.off()
> res3 <- MultiLinearModel(Y ~ batch, clindata = final.si[, ],
+     arraydata = Level3map)
> mad3 <- apply(Level3map, 1, mad, na.rm = T)
> top6.3 <- mad3[order(res3@p.values, -mad3)][1:6]
> pdf("level3batcheffecttop.pdf", height = 8, pointsize = 9)
> par(mfrow = c(3, 2))

Figure 3: Consistency of level 2 and level 3 data by correlation. Level 2 data are summarized to gene level by taking the mean of probes that belong to the same gene. Then pairwise correlations between summarized level 2 and level 3 data are calculated. Red color represents correlation coefficient > 0.9.

Figure 4: Batch effects in Level 2 Agilent gene expression data. An average across all probes for each sample is calculated using level 2 data. The average gene expression level for samples is shown by batch.

Figure 5: Batch effects in Level 3 Agilent gene expression data. An average across all genes for each sample is calculated using level 3 data. The average gene expression level for samples is shown by batch.
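The two batch summaries used in this section (per-sample means across genes, and a per-gene test for a batch term) can be illustrated on simulated data. The sketch below uses base R only and substitutes an ordinary one-way ANOVA via `lm` and `anova` for the ClassComparison `MultiLinearModel` call used in the report, so it is an approximation of the idea rather than the same procedure. All data, batch labels, and the shifted gene are made up.

```r
set.seed(7)
batch   <- factor(rep(c("b1", "b2", "b3"), each = 6))
n_genes <- 50

# Simulated expression matrix with no batch effect ...
expr <- matrix(rnorm(n_genes * length(batch)), nrow = n_genes,
               dimnames = list(paste0("gene", 1:n_genes), NULL))
# ... except gene1, which is shifted upward in batch b3.
expr["gene1", batch == "b3"] <- expr["gene1", batch == "b3"] + 5

# Per-sample mean across genes, then averaged within each batch
# (the quantity plotted by batch in Figures 4 and 5).
sample_mean   <- colMeans(expr, na.rm = TRUE)
mean_by_batch <- tapply(sample_mean, batch, mean)

# Per-gene one-way ANOVA p-value for the batch term.
batch_p <- apply(expr, 1, function(y) anova(lm(y ~ batch))[["Pr(>F)"]][1])

# Gene with the strongest evidence of a batch effect.
top_gene <- names(batch_p)[order(batch_p)][1]
```

The report ranks probes by p-value and breaks ties by median absolute deviation; the sketch ranks by p-value alone, which is enough to recover the deliberately shifted gene here.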

> for (i in 1:6) {
+     genedata <- Level3map[names(top6.3)[i], ]
+     temp <- boxplot(genedata ~ factor(final.si$batch), xlab = "",
+         main = names(top6.3)[i], cex = 0.7)
+     points(y = genedata, x = jitter(rep(1:13, temp$n)), cex = 0.7)
+     abline(v = 0: , col = "brown")
+ }
> dev.off()

3 Appendix

3.1 File Location

> getwd()
[1] "/workspace/nzhangtcgadata/ovarian/analysis/baggerlyqc/agilentqc"

3.2 SessionInfo

> sessionInfo()
R version ( ) i686-pc-linux-gnu

locale:
 [1] LC_CTYPE=en_US       LC_NUMERIC=C         LC_TIME=en_US
 [4] LC_COLLATE=en_US     LC_MONETARY=C        LC_MESSAGES=en_US
 [7] LC_PAPER=en_US       LC_NAME=C            LC_ADDRESS=C
[10] LC_TELEPHONE=C       LC_MEASUREMENT=en_US LC_IDENTIFICATION=C

attached base packages:
[1] splines   stats     graphics  grDevices utils     datasets  methods
[8] base

other attached packages:
[1] ClassComparison_ Biobase_2.6.1    PreProcess_
[4] oompaBase_       limma_3.2.3

Figure 6: The top probes with batch effects in level 2 data (panels include probe A_23_P62741).

Figure 7: The top probes with batch effects in level 3 data (panels: EHF, F13A, IGSF, FGL, SLC6A, HIGD1B).


Content. Basic Statistics and Data Analysis for Health Researchers from Foreign Countries. Research question. Example Newly diagnosed Type 2 Diabetes Content Quantifying association between continuous variables. Basic Statistics and Data Analysis for Health Researchers from Foreign Countries Volkert Siersma siersma@sund.ku.dk The Research Unit for General

More information

R documentation. of GSCA/man/GSCA-package.Rd etc. June 8, GSCA-package. LungCancer metadi... 3 plotmnw... 5 plotnw... 6 singledc...

R documentation. of GSCA/man/GSCA-package.Rd etc. June 8, GSCA-package. LungCancer metadi... 3 plotmnw... 5 plotnw... 6 singledc... R topics documented: R documentation of GSCA/man/GSCA-package.Rd etc. June 8, 2009 GSCA-package....................................... 1 LungCancer3........................................ 2 metadi...........................................

More information

Variant Classification. Author: Mike Thiesen, Golden Helix, Inc.

Variant Classification. Author: Mike Thiesen, Golden Helix, Inc. Variant Classification Author: Mike Thiesen, Golden Helix, Inc. Overview Sequencing pipelines are able to identify rare variants not found in catalogs such as dbsnp. As a result, variants in these datasets

More information

Estimation of the Area-Under-the-Curve of Mycophenolic Acid using population pharmacokinetic and multi-linear regression models simultaneously.

Estimation of the Area-Under-the-Curve of Mycophenolic Acid using population pharmacokinetic and multi-linear regression models simultaneously. Estimation of the Area-Under-the-Curve of Mycophenolic Acid using population pharmacokinetic and multi-linear regression models simultaneously. Michał J. Figurski & Leslie M. Shaw Biomarker Research Laboratory

More information

Stat 13, Lab 11-12, Correlation and Regression Analysis

Stat 13, Lab 11-12, Correlation and Regression Analysis Stat 13, Lab 11-12, Correlation and Regression Analysis Part I: Before Class Objective: This lab will give you practice exploring the relationship between two variables by using correlation, linear regression

More information

Cancer Informatics Lecture

Cancer Informatics Lecture Cancer Informatics Lecture Mayo-UIUC Computational Genomics Course June 22, 2018 Krishna Rani Kalari Ph.D. Associate Professor 2017 MFMER 3702274-1 Outline The Cancer Genome Atlas (TCGA) Genomic Data Commons

More information

Module 3: Pathway and Drug Development

Module 3: Pathway and Drug Development Module 3: Pathway and Drug Development Table of Contents 1.1 Getting Started... 6 1.2 Identifying a Dasatinib sensitive cancer signature... 7 1.2.1 Identifying and validating a Dasatinib Signature... 7

More information

Background Information. Instructions. Problem Statement. HOMEWORK INSTRUCTIONS Homework #2 HIV Statistics Problem

Background Information. Instructions. Problem Statement. HOMEWORK INSTRUCTIONS Homework #2 HIV Statistics Problem Background Information HOMEWORK INSTRUCTIONS The scourge of HIV/AIDS has had an extraordinary impact on the entire world. The spread of the disease has been closely tracked since the discovery of the HIV

More information

On the Reproducibility of TCGA Ovarian Cancer MicroRNA Profiles

On the Reproducibility of TCGA Ovarian Cancer MicroRNA Profiles On the Reproducibility of TCGA Ovarian Cancer MicroRNA Profiles Ying-Wooi Wan 1,2,4, Claire M. Mach 2,3, Genevera I. Allen 1,7,8, Matthew L. Anderson 2,4,5 *, Zhandong Liu 1,5,6,7 * 1 Departments of Pediatrics

More information

SUPPLEMENTARY FIGURES: Supplementary Figure 1

SUPPLEMENTARY FIGURES: Supplementary Figure 1 SUPPLEMENTARY FIGURES: Supplementary Figure 1 Supplementary Figure 1. Glioblastoma 5hmC quantified by paired BS and oxbs treated DNA hybridized to Infinium DNA methylation arrays. Workflow depicts analytic

More information

Package DeconRNASeq. November 19, 2017

Package DeconRNASeq. November 19, 2017 Type Package Package DeconRNASeq November 19, 2017 Title Deconvolution of Heterogeneous Tissue Samples for mrna-seq data Version 1.20.0 Date 2013-01-22 Author Ting Gong Joseph D. Szustakowski

More information

Data Exploration and Visualization

Data Exploration and Visualization Data Exploration and Visualization Bu eğitim sunumları İstanbul Kalkınma Ajansı nın 2016 yılı Yenilikçi ve Yaratıcı İstanbul Mali Destek Programı kapsamında yürütülmekte olan TR10/16/YNY/0036 no lu İstanbul

More information

TCGA. The Cancer Genome Atlas

TCGA. The Cancer Genome Atlas TCGA The Cancer Genome Atlas TCGA: History and Goal History: Started in 2005 by the National Cancer Institute (NCI) and the National Human Genome Research Institute (NHGRI) with $110 Million to catalogue

More information

Package xseq. R topics documented: September 11, 2015

Package xseq. R topics documented: September 11, 2015 Package xseq September 11, 2015 Title Assessing Functional Impact on Gene Expression of Mutations in Cancer Version 0.2.1 Date 2015-08-25 Author Jiarui Ding, Sohrab Shah Maintainer Jiarui Ding

More information

Package leukemiaseset

Package leukemiaseset Package leukemiaseset August 14, 2018 Type Package Title Leukemia's microarray gene expression data (expressionset). Version 1.16.0 Date 2013-03-20 Author Sara Aibar, Celia Fontanillo and Javier De Las

More information

NERVE ACTION POTENTIAL SIMULATION version 2013 John Cornell

NERVE ACTION POTENTIAL SIMULATION version 2013 John Cornell NERVE ACTION POTENTIAL SIMULATION version 2013 John Cornell http://www.jccornell.net In 1963 Alan Hodgkin and Andrew Huxley received the Nobel Prize in Physiology and Medicine for their work on the mechanism

More information

Assignment 5: Integrative epigenomics analysis

Assignment 5: Integrative epigenomics analysis Assignment 5: Integrative epigenomics analysis Due date: Friday, 2/24 10am. Note: no late assignments will be accepted. Introduction CpG islands (CGIs) are important regulatory regions in the genome. What

More information

CHAPTER 1 COMMUNITY PHARMACY M.ASHOKKUMAR DEPT OF PHARMACY PRACTICE SRM COLLEGE OF PHARMACY SRM UNIVERSITY

CHAPTER 1 COMMUNITY PHARMACY M.ASHOKKUMAR DEPT OF PHARMACY PRACTICE SRM COLLEGE OF PHARMACY SRM UNIVERSITY CHAPTER 1 COMMUNITY PHARMACY M.ASHOKKUMAR DEPT OF PHARMACY PRACTICE SRM COLLEGE OF PHARMACY SRM UNIVERSITY COMMUNITY PHARMACY OPERATIONS Technician Duties Related to Dispensing Over-the-Counter Drugs and

More information

Package cssam. February 19, 2015

Package cssam. February 19, 2015 Type Package Package cssam February 19, 2015 Title cssam - cell-specific Significance Analysis of Microarrays Version 1.2.4 Date 2011-10-08 Author Shai Shen-Orr, Rob Tibshirani, Narasimhan Balasubramanian,

More information

GridMAT-MD: A Grid-based Membrane Analysis Tool for use with Molecular Dynamics

GridMAT-MD: A Grid-based Membrane Analysis Tool for use with Molecular Dynamics GridMAT-MD: A Grid-based Membrane Analysis Tool for use with Molecular Dynamics William J. Allen, Justin A. Lemkul, and David R. Bevan Department of Biochemistry, Virginia Tech User s Guide Version 1.0.2

More information

MULTIPLE LINEAR REGRESSION 24.1 INTRODUCTION AND OBJECTIVES OBJECTIVES

MULTIPLE LINEAR REGRESSION 24.1 INTRODUCTION AND OBJECTIVES OBJECTIVES 24 MULTIPLE LINEAR REGRESSION 24.1 INTRODUCTION AND OBJECTIVES In the previous chapter, simple linear regression was used when you have one independent variable and one dependent variable. This chapter

More information

Micro-RNA web tools. Introduction. UBio Training Courses. mirnas, target prediction, biology. Gonzalo

Micro-RNA web tools. Introduction. UBio Training Courses. mirnas, target prediction, biology. Gonzalo Micro-RNA web tools UBio Training Courses Gonzalo Gómez//ggomez@cnio.es Introduction mirnas, target prediction, biology Experimental data Network Filtering Pathway interpretation mirs-pathways network

More information

dataset1 <- read.delim("c:\dca_example_dataset1.txt", header = TRUE, sep = "\t") attach(dataset1)

dataset1 <- read.delim(c:\dca_example_dataset1.txt, header = TRUE, sep = \t) attach(dataset1) Worked examples of decision curve analysis using R A note about R versions The R script files to implement decision curve analysis were developed using R version 2.3.1, and were tested last using R version

More information

Package Actigraphy. R topics documented: January 15, Type Package Title Actigraphy Data Analysis Version 1.3.

Package Actigraphy. R topics documented: January 15, Type Package Title Actigraphy Data Analysis Version 1.3. Type Package Title Actigraphy Data Analysis Version 1.3.2 Date 2016-01-14 Package Actigraphy January 15, 2016 Author William Shannon, Tao Li, Hong Xian, Jia Wang, Elena Deych, Carlos Gonzalez Maintainer

More information

Supervised analysis of MS images using Cardinal

Supervised analysis of MS images using Cardinal Supervised analsis of MS images using Cardinal Klie A. Bemis and April Harr November, 28 Contents Introduction.............................. 2 Analsis of a renal cell carcinoma (RCC) dataset.... 2 2. Pre-processing..........................

More information

Extracting progression models for TCGA MSI/MSS colorectal tumors from the COADREAD project with the TRONCO package

Extracting progression models for TCGA MSI/MSS colorectal tumors from the COADREAD project with the TRONCO package Extracting progression models for TCGA MSI/MSS colorectal tumors from the COADREAD project with the TRONCO package Giulio Caravagna, Luca De Sano, Daniele Ramazzotti, Alex Graudenzi, Giancarlo Mauri, Marco

More information

Supplementary Data. Correlation analysis. Importance of normalizing indices before applying SPCA

Supplementary Data. Correlation analysis. Importance of normalizing indices before applying SPCA Supplementary Data Correlation analysis The correlation matrix R of the m = 25 GV indices calculated for each dataset is reported below (Tables S1 S3). R is an m m symmetric matrix, whose entries r ij

More information

Analysis of gene expression in blood before diagnosis of ovarian cancer

Analysis of gene expression in blood before diagnosis of ovarian cancer Analysis of gene expression in blood before diagnosis of ovarian cancer Different statistical methods Note no. Authors SAMBA/10/16 Marit Holden and Lars Holden Date March 2016 Norsk Regnesentral Norsk

More information

Package AIMS. June 29, 2018

Package AIMS. June 29, 2018 Type Package Package AIMS June 29, 2018 Title AIMS : Absolute Assignment of Breast Cancer Intrinsic Molecular Subtype Version 1.12.0 Date 2014-06-25 Description This package contains the AIMS implementation.

More information

MyWindFit Member Analytics Portal

MyWindFit Member Analytics Portal Member Analytics Portal The member section of is a private and personalized cloud database that contains the detailed records of all your activity. It also contains an analysis package that enables you

More information

Hands-On Ten The BRCA1 Gene and Protein

Hands-On Ten The BRCA1 Gene and Protein Hands-On Ten The BRCA1 Gene and Protein Objective: To review transcription, translation, reading frames, mutations, and reading files from GenBank, and to review some of the bioinformatics tools, such

More information

Name: Date: Period: Human Traits Genetics Activity

Name: Date: Period: Human Traits Genetics Activity Name: Date: Period: Human Traits Genetics Activity The following are considered by many to be single-gene traits, which mean that there are two alleles (versions of a gene) for a trait. It is important

More information

Cerebral Cortex. Edmund T. Rolls. Principles of Operation. Presubiculum. Subiculum F S D. Neocortex. PHG & Perirhinal. CA1 Fornix CA3 S D

Cerebral Cortex. Edmund T. Rolls. Principles of Operation. Presubiculum. Subiculum F S D. Neocortex. PHG & Perirhinal. CA1 Fornix CA3 S D Cerebral Cortex Principles of Operation Edmund T. Rolls F S D Neocortex S D PHG & Perirhinal 2 3 5 pp Ento rhinal DG Subiculum Presubiculum mf CA3 CA1 Fornix Appendix 4 Simulation software for neuronal

More information

Weighted Gene Co-expression Network Analysis (WGCNA) R Tutorial, Part C Summary The data and biological implications are described in

Weighted Gene Co-expression Network Analysis (WGCNA) R Tutorial, Part C Summary The data and biological implications are described in Weighted Gene Co-expression Network Analysis (WGCNA) R Tutorial, Part C Breast Cancer Microarray Data. Steve Horvath, Paul Mischel Correspondence: shorvath@mednet.ucla.edu, http://www.ph.ucla.edu/biostat/people/horvath.htm

More information

Data mining with Ensembl Biomart. Stéphanie Le Gras

Data mining with Ensembl Biomart. Stéphanie Le Gras Data mining with Ensembl Biomart Stéphanie Le Gras (slegras@igbmc.fr) Guidelines Genome data Genome browsers Getting access to genomic data: Ensembl/BioMart 2 Genome Sequencing Example: Human genome 2000:

More information

Package ega. March 21, 2017

Package ega. March 21, 2017 Title Error Grid Analysis Version 2.0.0 Package ega March 21, 2017 Maintainer Daniel Schmolze Functions for assigning Clarke or Parkes (Consensus) error grid zones to blood glucose values,

More information

New Enhancements: GWAS Workflows with SVS

New Enhancements: GWAS Workflows with SVS New Enhancements: GWAS Workflows with SVS August 9 th, 2017 Gabe Rudy VP Product & Engineering 20 most promising Biotech Technology Providers Top 10 Analytics Solution Providers Hype Cycle for Life sciences

More information

[ APPLICATION NOTE ] High Sensitivity Intact Monoclonal Antibody (mab) HRMS Quantification APPLICATION BENEFITS INTRODUCTION WATERS SOLUTIONS KEYWORDS

[ APPLICATION NOTE ] High Sensitivity Intact Monoclonal Antibody (mab) HRMS Quantification APPLICATION BENEFITS INTRODUCTION WATERS SOLUTIONS KEYWORDS Yun Wang Alelyunas, Henry Shion, Mark Wrona Waters Corporation, Milford, MA, USA APPLICATION BENEFITS mab LC-MS method which enables users to achieve highly sensitive bioanalysis of intact trastuzumab

More information

IMPaLA tutorial.

IMPaLA tutorial. IMPaLA tutorial http://impala.molgen.mpg.de/ 1. Introduction IMPaLA is a web tool, developed for integrated pathway analysis of metabolomics data alongside gene expression or protein abundance data. It

More information

Unit 1 Outline Science Practices. Part 1 - The Scientific Method. Screencasts found at: sciencepeek.com. 1. List the steps of the scientific method.

Unit 1 Outline Science Practices. Part 1 - The Scientific Method. Screencasts found at: sciencepeek.com. 1. List the steps of the scientific method. Screencasts found at: sciencepeek.com Part 1 - The Scientific Method 1. List the steps of the scientific method. 2. What is an observation? Give an example. Quantitative or Qualitative Data? 35 grams?

More information

Benchmark Dose Modeling Cancer Models. Allen Davis, MSPH Jeff Gift, Ph.D. Jay Zhao, Ph.D. National Center for Environmental Assessment, U.S.

Benchmark Dose Modeling Cancer Models. Allen Davis, MSPH Jeff Gift, Ph.D. Jay Zhao, Ph.D. National Center for Environmental Assessment, U.S. Benchmark Dose Modeling Cancer Models Allen Davis, MSPH Jeff Gift, Ph.D. Jay Zhao, Ph.D. National Center for Environmental Assessment, U.S. EPA Disclaimer The views expressed in this presentation are those

More information

Gene Expression Analysis Web Forum. Jonathan Gerstenhaber Field Application Specialist

Gene Expression Analysis Web Forum. Jonathan Gerstenhaber Field Application Specialist Gene Expression Analysis Web Forum Jonathan Gerstenhaber Field Application Specialist Our plan today: Import Preliminary Analysis Statistical Analysis Additional Analysis Downstream Analysis 2 Copyright

More information

Lesson 3 Profex Graphical User Interface for BGMN and Fullprof

Lesson 3 Profex Graphical User Interface for BGMN and Fullprof Lesson 3 Profex Graphical User Interface for BGMN and Fullprof Nicola Döbelin RMS Foundation, Bettlach, Switzerland March 01 02, 2016, Freiberg, Germany Background Information Developer: License: Founded

More information

Nature Medicine: doi: /nm.3967

Nature Medicine: doi: /nm.3967 Supplementary Figure 1. Network clustering. (a) Clustering performance as a function of inflation factor. The grey curve shows the median weighted Silhouette widths for varying inflation factors (f [1.6,

More information

One-Way Independent ANOVA

One-Way Independent ANOVA One-Way Independent ANOVA Analysis of Variance (ANOVA) is a common and robust statistical test that you can use to compare the mean scores collected from different conditions or groups in an experiment.

More information

VIEW AS Fit Page! PRESS PgDn to advance slides!

VIEW AS Fit Page! PRESS PgDn to advance slides! VIEW AS Fit Page! PRESS PgDn to advance slides! UNDERSTAND REALIZE CHANGE WHY??? CHANGE THE PROCESSES OF YOUR BUSINESS CONNECTING the DOTS Customer Focus (W s) Customer Focused Metrics Customer Focused

More information

Bioinformatics Laboratory Exercise

Bioinformatics Laboratory Exercise Bioinformatics Laboratory Exercise Biology is in the midst of the genomics revolution, the application of robotic technology to generate huge amounts of molecular biology data. Genomics has led to an explosion

More information

Item Response Theory for Polytomous Items Rachael Smyth

Item Response Theory for Polytomous Items Rachael Smyth Item Response Theory for Polytomous Items Rachael Smyth Introduction This lab discusses the use of Item Response Theory (or IRT) for polytomous items. Item response theory focuses specifically on the items

More information

Package AbsFilterGSEA

Package AbsFilterGSEA Type Package Package AbsFilterGSEA September 21, 2017 Title Improved False Positive Control of Gene-Permuting GSEA with Absolute Filtering Version 1.5.1 Author Sora Yoon Maintainer

More information

Two-Way Independent ANOVA

Two-Way Independent ANOVA Two-Way Independent ANOVA Analysis of Variance (ANOVA) a common and robust statistical test that you can use to compare the mean scores collected from different conditions or groups in an experiment. There

More information

Figure S2. Distribution of acgh probes on all ten chromosomes of the RIL M0022

Figure S2. Distribution of acgh probes on all ten chromosomes of the RIL M0022 96 APPENDIX B. Supporting Information for chapter 4 "changes in genome content generated via segregation of non-allelic homologs" Figure S1. Potential de novo CNV probes and sizes of apparently de novo

More information

How to compute a semantic similarity threshold. Charles Bettembourg, Christian Diot, Olivier Dameron

How to compute a semantic similarity threshold. Charles Bettembourg, Christian Diot, Olivier Dameron How to compute a semantic similarity threshold Charles Bettembourg, Christian Diot, Olivier Dameron Abstract The analysis of gene annotations related to Gene Ontology plays an important role in the interpretation

More information

Package wally. May 25, Type Package

Package wally. May 25, Type Package Type Package Package wally May 25, 2017 Title The Wally Calibration Plot for Risk Prediction Models Version 1.0.9 Date 2017-04-28 Author Paul F Blanche , Thomas A. Gerds

More information

A Quick-Start Guide for rseqdiff

A Quick-Start Guide for rseqdiff A Quick-Start Guide for rseqdiff Yang Shi (email: shyboy@umich.edu) and Hui Jiang (email: jianghui@umich.edu) 09/05/2013 Introduction rseqdiff is an R package that can detect differential gene and isoform

More information

Package prognosticroc

Package prognosticroc Type Package Package prognosticroc February 20, 2015 Title Prognostic ROC curves for evaluating the predictive capacity of a binary test Version 0.7 Date 2013-11-27 Author Y. Foucher

More information

CSDplotter user guide Klas H. Pettersen

CSDplotter user guide Klas H. Pettersen CSDplotter user guide Klas H. Pettersen [CSDplotter user guide] [0.1.1] [version: 23/05-2006] 1 Table of Contents Copyright...3 Feedback... 3 Overview... 3 Downloading and installation...3 Pre-processing

More information

J. A. Mayfield et al. FIGURE S1. Methionine Salvage. Methylthioadenosine. Methionine. AdoMet. Folate Biosynthesis. Methylation SAH.

J. A. Mayfield et al. FIGURE S1. Methionine Salvage. Methylthioadenosine. Methionine. AdoMet. Folate Biosynthesis. Methylation SAH. FIGURE S1 Methionine Salvage Methionine Methylthioadenosine AdoMet Folate Biosynthesis Methylation SAH Homocysteine Homocystine CBS Cystathionine Cysteine Glutathione Figure S1 Biochemical pathway of relevant

More information