Low Abundance Genes. DNA Microarrays. Log Transformation? Basic Idea. Mining for Low-abundance Transcripts in Microarray Data. Exploratory Methods
Transcription
Mining for Low-abundance Transcripts in Microarray Data
Yi Lin (1), Samuel T. Nadler (2), Alan D. Attie (2), Brian S. Yandell (1,3)
(1) Statistics, (2) Biochemistry, (3) Horticulture, University of Wisconsin-Madison
September 2000

Basic Idea
- DNA microarrays present tremendous opportunities to understand complex processes
- Transcription factors and receptors are expressed at low levels and missed by most other methods
- Analyze low expression genes with normal scores
- Robustly adapt to changing variability across average expression levels
- Basis for clustering and other exploratory methods
- Data-driven p-value sensitive to low abundance & changes in variability with gene expression

DNA Microarrays
- 1-1, items per chip
  - cDNA, oligonucleotides, proteins
  - dozens of technologies, changing rapidly
- expose live tissue from organism
  - mRNA to cDNA, hybridize with chip
  - read expression signal as intensity (fluorescence)
- design considerations
  - compare conditions across or within chips
  - worry about image capture of signal

Low Abundance Genes
- background adjustment
  - remove local geography
  - comparing within and between chips
  - negative values after adjustment
- low abundance genes
  - virtually absent in one condition
  - could be important genes: transcription factors, receptors
  - large measurement variability
  - early technology (bleeding edge)
- prevalence across genes on a chip
  - -2% per chip
  - 1-5% across multiple conditions

Log Transformation?
- tremendous scale range in mean intensities
  - 1-1 fold common
- concentrations of chemicals (pH)
- fold changes have intuitive appeal
- looks pretty good in practice
- want transformed data to be roughly normal
  - easy to test if no difference across conditions
- looking for genes that are outliers
  - beyond edge of bell shaped curve
  - provide formal or informal thresholds

Exploratory Methods
- Clustering methods (Eisen 1998, Golub 1999)
- Self-organizing maps (Tamayo 1999)
  - search for genes with similar changes across conditions
  - do not determine significance of changes in expression
  - require extensive pre-filtering to eliminate low intensity, modest fold changes
  - may detect patterns unrelated to fold change
- comparison of discrimination methods (Dudoit 2000)
Confirmatory Methods
- ratio-based decisions for 2 conditions (Chen 1997)
  - constant variance of ratio on log scale, use normality
- anova (Kerr 2000, Dudoit 2000)
  - handles multiple conditions in anova model
  - constant variance on log scale, use normality
- Bayesian inference (Newton 2000, Tsodikov 2000)
  - Gamma-Gamma model
  - variance proportional to squared intensity
- error model (Roberts 2000, Hughes 2000)
  - variance proportional to squared intensity
  - transform to log scale, use normality

Normal Scores Procedure (flow chart)
0. acquire data Q, B
1. adjust for background: A = Q - B
2. rank order genes: R = rank(A)/(n+1)
3. normal scores: N = qnorm(R)
4. contrast conditions: Y = N_1 - N_2
5. mean intensity: X = mean(N)
6. center & spread: estimated from Y (contrast) against X (mean)
7. standardize: Z = (Y - center)/spread

Normal Scores Procedure
- adjusted expression A = Q - B
- rank order R = rank(A)/(n+1)
- normal scores N = qnorm(R)
- average intensity X = (N_1 + N_2)/2
- difference Y = N_1 - N_2
- variance Var(Y | X) ≈ σ^2(X)
- standardization S = [Y - µ(X)]/σ(X)

Motivate Normal Scores
- natural transformation to normality is log(A)
- background intensity B = bd + ω_B
  - measured with error ω
  - attenuation d may depend on condition
- gene measurement Q = [α exp(g+h+ε) + b]d + ω_Q
  - gene signal g
  - degree of hybridization h
  - intrinsic noise ε (variance may depend on g)
  - attenuation α (depends on thickness of sample, etc.)
- subtract background: A = Q - B
  - adjusted measurements A = dα exp(g+h+ε) + δ
  - symmetric measurement error δ = ω_Q - ω_B

Motivate Normal Scores (cont.)
- adjusted measurements: A = Q - B = dαG + δ
- log expression level: log(G) = g + h + ε
  - gene signal g confounded with hybridization h
  - unless hybridization h is independent of condition
- G is observed if
  - no measurement error δ
  - no dye or array effect (no attenuation dα)
  - no background intensity
- natural under this model to consider N = log(G)
  - normal scores almost as good

Robust Center & Spread
- genes sorted based on X
  - partitioned into many (about 4) slices
  - containing roughly the same number of genes
- slices summarized by median and MAD
  - MAD = median absolute deviation
  - robust to outliers (e.g. changing genes)
- MAD ~ same distribution across X up to scale
  - MAD_i = σ_i Z_i, Z_i ~ Z, i = 1,...,4
  - log(MAD_i) = log(σ_i) + log(Z_i)
- median ~ same idea
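The normal scores steps and the slice-wise median/MAD summary above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names (`normal_scores`, `standardize`), the `n_slices` default, and the 1.4826 factor (which makes the MAD comparable to a standard deviation under normality) are choices made here, and the per-slice median and MAD are used directly in place of the smoothed curves the slides describe.

```python
from statistics import NormalDist, median


def normal_scores(adjusted):
    """Map adjusted values A = Q - B to N = qnorm(rank(A) / (n + 1)).

    Negative adjusted values are fine: only the ranks are used.
    Ties receive their average rank.
    """
    n = len(adjusted)
    order = sorted(range(n), key=lambda i: adjusted[i])
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        while j + 1 < n and adjusted[order[j + 1]] == adjusted[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # 1-based average rank of the tied run
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    qnorm = NormalDist().inv_cdf
    return [qnorm(r / (n + 1)) for r in ranks]


def standardize(n1, n2, n_slices=10):
    """Return (X, Y, S) for two conditions' normal scores.

    X = (N1 + N2) / 2, Y = N1 - N2, and S = (Y - center) / spread, where
    center and spread are the median and scaled MAD of Y within slices of
    genes sorted by X. n_slices is an illustrative default, not a value
    taken from the slides.
    """
    x = [(a + b) / 2 for a, b in zip(n1, n2)]
    y = [a - b for a, b in zip(n1, n2)]
    n = len(x)
    order = sorted(range(n), key=lambda i: x[i])
    s = [0.0] * n
    for k in range(n_slices):
        idx = order[k * n // n_slices:(k + 1) * n // n_slices]
        if not idx:
            continue
        center = median(y[i] for i in idx)
        spread = 1.4826 * median(abs(y[i] - center) for i in idx)
        for i in idx:
            s[i] = (y[i] - center) / spread if spread > 0 else 0.0
    return x, y, s


# Tiny usage example with made-up adjusted intensities (one is negative).
example = normal_scores([3.0, -1.0, 2.0])
```

Because the scores depend only on ranks, a gene with a negative background-adjusted value still receives a usable score, which is the situation a log transformation cannot handle.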
Robust Center & Spread
- MAD ~ same distribution across X up to scale
  - log(MAD_i) = log(σ_i) + log(Z_i), i = 1,...,4
- regress log(MAD_i) on X_i with smoothing splines
  - smoothing parameter tuned automatically
  - generalized cross validation (Wahba 1990)
- globally rescale anti-log of smooth curve
  - Var(Y | X) ≈ σ^2(X)
- can force σ^2(X) to be decreasing
- similar idea for median
  - E(Y | X) ≈ µ(X)

Motivation for Spread
- log expression level: log(G) = g + h + ε
  - hybridization h negligible or same across conditions
  - intrinsic noise ε may depend on gene signal g
- compare two conditions 1 and 2
  - Y = N_1 - N_2 ≈ log(G_1) - log(G_2) = g_1 - g_2 + ε_1 - ε_2
- no differential expression: g_1 = g_2 = g
  - Var(Y | g) = σ^2(g)
- g ≈ X suggests conditioning on X instead, but:
  - Var(Y | X) not exactly σ^2(X)
  - cannot be determined without further assumptions

Simulation of Spread Recovery
- 1, genes: log expression Normal(4,2)
- 5% altered genes: add Normal(0,2)
- no measurement error, attenuation
- estimate robust spread
[Figure: Robust Spread Estimation. Panels: "Estimate of Spread" (spread against mean expression) and "Q-Q Plot" (sample quantiles).]

Bonferroni-corrected p-values
- standardized normal scores
  - S = [Y - µ(X)]/σ(X) ~ Normal(0,1)?
  - genes with differential expression more dispersed
- Šidák (Zidak) version of Bonferroni correction
  - p = 1 - (1 - p_1)^n
  - 13,000 genes with an overall level p = 0.05
  - each gene should be tested at level 1.95*10^-6
  - differential expression if |S| > 4.62
  - differential expression if |Y - µ(X)| > 4.62 σ(X)
- too conservative? weight by X?
  - Dudoit (2000)

Simulation Study
- simulations with two conditions
  - 1, genes
  - g_1, g_2 ~ Normal(4,2) for nonchanging genes
- 5% with differential expression
  - g_c ~ Normal(4,2) + Normal(3 - rank(X)/(n+1), 1/2)
  - up- or down-regulated with probability 1/2
- intrinsic noise ε ~ Normal(0, .5)
- attenuation α = 1
- measurement error
  - variance =, 1, 2, 5, 1, 2
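The per-gene threshold on the "Bonferroni-corrected p-values" slide can be checked numerically. The sketch below inverts p = 1 - (1 - p_1)^n for p_1 and converts it to a two-sided normal cutoff. The function names are illustrative, and the two-sided reading (half of p_1 in each tail) is an assumption made here because the slide tests |S|.

```python
from statistics import NormalDist


def sidak_per_gene_level(overall_level, n_genes):
    """Solve p = 1 - (1 - p1)^n for the per-gene level p1."""
    return 1.0 - (1.0 - overall_level) ** (1.0 / n_genes)


def two_sided_cutoff(per_gene_level):
    """Standard-normal cutoff c such that P(|S| > c) equals per_gene_level."""
    return NormalDist().inv_cdf(1.0 - per_gene_level / 2.0)


p1 = sidak_per_gene_level(0.05, 13000)
cutoff = two_sided_cutoff(p1)
print(f"per-gene level: {p1:.3g}, per tail: {p1 / 2:.3g}, cutoff: {cutoff:.2f}")
```

With 13,000 genes and an overall level of 0.05, the per-tail level comes out on the order of 2*10^-6 and the cutoff lands close to the 4.62 quoted on the slide; small differences would be expected from the exact gene count and rounding.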
Comparison of Methods
- differential expression for two conditions
- Newton (2000) J Comp Biol
  - Gamma-Gamma-Bernoulli model
  - Bayesian odds of differential expression
- Chen (1997)
  - constant ratio of expressions
  - underlying log-normality
- normal scores (Lin 2000)
  - some (unknown) transformation to normality
  - robust, smooth estimate of spread & center
  - Bonferroni-style p-values

[Figure: Capturing Changed Genes. Panels: "No Noise" and "Noise .2" (number of changed genes captured against fold change); "No Noise" and "Noise 1" (success in capture against average intensity).]

Comparison with E. coli Data
- E. coli with IPTG-b
  - 4,000+ genes (whole genome)
  - IPTG-b known to affect only a few genes
  - ~15 genes at low abundance, including key genes
- Newton (2000) J Comp Biol
  - Bayesian odds of differential expression
[Figure: panels "Raw IPTG-b" and "Normal Scores IPTG-b"; Cy3/Cy5 ratio against mean intensity.]
Diabetes & Obesity Study
- 13,000+ gene fragments (11,000+ genes)
  - oligonucleotides, Affymetrix gene chips
  - mean(PM) - mean(NM) adjusted expression levels
- six conditions in 2x3 factorial
  - lean vs. obese
  - B6, F1, BTBR mouse genotype
- adipose tissue
  - influences whole-body fuel partitioning
  - might be aberrant in obese and/or diabetic subjects
- Nadler (2000) PNAS

Low Abundance Obesity Genes
- low mean expression on at least 1 of 6 conditions
- negative adjusted values
  - ignored by clustering routines
- transcription factors
  - I-κB modulates transcription - inflammatory processes
  - RXR nuclear hormone receptor - forms heterodimers with several nuclear hormone receptors
- regulation proteins
  - protein kinase A
  - glycogen synthase kinase-3
- roughly 1 genes
  - 9 new since Nadler (2000) PNAS

Table 1. Genes Differentially Expressed with Obesity as Identified by the Normal Scores Procedure
[Columns: Accession #, Description, Obese/Lean, Lean/Obese, Average, p-value. Only the first two columns survive in this transcript; some accession numbers are incomplete.]

Accession #   Description
aa68251       Mus musculus Dp1l1 mRNA for polyposis locus protein 1-like 1 (TB2 protein-like)
x93999        M.musculus Gal beta-1,3-GalNAc-specific GalNAc alpha-2,6-sialyltransferase gene
w45955        No significant homologies
U16297        Mus musculus cytochrome B561 (mcyt)
X72862*       M.musculus gene for beta-3-adrenergic receptor
X12521        Mouse mRNA for transition protein 1 TP
aa            No significant homologies
AA62269       Mus musculus protamine
AA            Mus musculus transition protein 1 (Tnp1)
X63349        M.musculus tyrosinase-related protein
AA11671       Mus musculus serum response factor (Srf)
AA15229       Mus musculus hepsin (Hpn)
aa26714       No significant homologies
aa48365       Mus musculus mRNA for SRp25 nuclear protein
j5663         Mouse vas deferens androgen related protein (MVDP)
w49331        No significant homologies
J2935         Mouse cAMP-dependent protein kinase type II regulatory subunit
aa195         No significant homologies
m3181         Mouse 2',3'-cyclic-nucleotide 3'-phosphodiesterase
AA33381       No significant homologies
aa            No significant homologies
u2883         Mus musculus Balb/c mammary-derived growth inhibitor (MDGI)
AA            No significant homologies
D14883        Mouse C33/R2/IA
W36455        Mus musculus adipsin (Adn)
m2751         Mus musculus protamine
X8796         M.musculus brevican
AA48974       No significant homologies
ET63455       Mus musculus serum amyloid A-4 protein (Saa4)
X6526         M.musculus mRNA for GTP-binding protein
AA            Similar to Rat S6 kinase
ET6236        Calcium-dependent phospholipase A2 precursor
w64688        No significant homologies
AA3587        No significant homologies
AA            Mus musculus hippocalcin-like 1 (Hpcal1)
w1922         No significant homologies
w48392        similar
X7518         Id4 helix-loop-helix protein
AA            Sequence withdrawn
X99143        M.musculus hair keratin, mhb
AA            similar to Human ras-related protein rab
W1926         Mus musculus mRNA for ubiquitin like protein
AA13862       Similar to Human hqp
J4179         Mouse chromatin nonhistone high mobility group protein (HMG-I(Y))
X355          Mouse gene exon 2 for serum amyloid A (SAA) 3 protein
U957          Mus musculus p21 (Waf1)
AA2399        Mus musculus dUTPase
u7746         Mus musculus anaphylatoxin C3a receptor
x66224        M.musculus retinoid X receptor-beta
AA            Mus musculus protein tyrosine phosphatase (Ptpn16)
W85163        Mus musculus migration inhibitory factor

[Figure: Low Abundance Genes for Obesity. Obese vs. Lean against Average Intensity for Obesity.]

Microarray ANOVAs
- Kerr (2000)
- gene by condition interaction
  - N_ijk = gene_i + condition_j + gene*condition_ij + rep error_ijk
- conditions organized in factorial design
  - experimental units may be whole or part of array
- genes are random effects
  - focus on outliers (BLUPs), not variance components
  - gene*condition_ij = differential expression
  - allow variance to depend on gene_i main effect
- replication to improve precision, catch gross errors
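To make the gene*condition term in the ANOVA model above concrete, the following sketch computes interaction estimates from a balanced table of cell means by removing the grand mean and the gene and condition main effects. It is a fixed-effects simplification chosen here for illustration: the slides treat genes as random effects and look at BLUPs, which this does not do, and the function name `interaction_effects` is hypothetical.

```python
from statistics import mean


def interaction_effects(cell_means):
    """Estimate gene*condition_ij from a balanced genes x conditions table.

    cell_means[i][j] is the mean normal score of gene i under condition j
    (replicates already averaged). The estimate is
    cell mean - grand mean - gene effect - condition effect.
    """
    n_cond = len(cell_means[0])
    grand = mean(v for row in cell_means for v in row)
    gene_eff = [mean(row) - grand for row in cell_means]
    cond_eff = [mean(row[j] for row in cell_means) - grand
                for j in range(n_cond)]
    return [
        [row[j] - grand - gene_eff[i] - cond_eff[j] for j in range(n_cond)]
        for i, row in enumerate(cell_means)
    ]


# Gene 0 and gene 1 respond additively to the condition; a purely additive
# table gives zero interaction everywhere.
additive = interaction_effects([[1.0, 2.0], [3.0, 4.0]])

# Raising one cell makes the interaction terms non-zero.
shifted = interaction_effects([[1.0, 2.0], [3.0, 6.0]])
```

In this simplified view, a gene whose interaction terms are large relative to the spread at its intensity level is a candidate for differential expression; each row and column of the interaction table sums to zero by construction.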
Microarray Random Effects
- variance component for non-changing genes
  - robust estimate of MS(G*C) using smoothed MAD
  - rescale normal score response N by spread σ(X)
  - look for differential expression or use clustering methods
- variance component for replication
  - robust estimate of MSE using smoothed MAD
  - look for outliers = gross errors
[Figure: Obesity Genotype Main Effects. Panels: "Obesity Dominance" and "Obesity Additive".]

Microarray QTLs
- condition may be genotype
  - whole organism or pattern of genes
  - genotype may be inferred rather than known exactly
  - N_ijk = gene_i + QTL_j + gene*QTL_ij + individual_ijk
- QTL genotype depends on flanking markers
  - mixture model across possible QTL genotypes
- single vs. multiple QTL
  - single QTL may influence numerous genes
  - epistasis = inter-genic interaction
  - modification of biochemical pathway(s)
More informationUsing Bayesian Networks to Analyze Expression Data. Xu Siwei, s Muhammad Ali Faisal, s Tejal Joshi, s
Using Bayesian Networks to Analyze Expression Data Xu Siwei, s0789023 Muhammad Ali Faisal, s0677834 Tejal Joshi, s0677858 Outline Introduction Bayesian Networks Equivalence Classes Applying to Expression
More informationNormal Distribution. Many variables are nearly normal, but none are exactly normal Not perfect, but still useful for a variety of problems.
Review Probability: likelihood of an event Each possible outcome can be assigned a probability If we plotted the probabilities they would follow some type a distribution Modeling the distribution is important
More informationResearch Methods 1 Handouts, Graham Hole,COGS - version 1.0, September 2000: Page 1:
Research Methods 1 Handouts, Graham Hole,COGS - version 10, September 000: Page 1: T-TESTS: When to use a t-test: The simplest experimental design is to have two conditions: an "experimental" condition
More informationSawtooth Software. MaxDiff Analysis: Simple Counting, Individual-Level Logit, and HB RESEARCH PAPER SERIES. Bryan Orme, Sawtooth Software, Inc.
Sawtooth Software RESEARCH PAPER SERIES MaxDiff Analysis: Simple Counting, Individual-Level Logit, and HB Bryan Orme, Sawtooth Software, Inc. Copyright 009, Sawtooth Software, Inc. 530 W. Fir St. Sequim,
More informationBayesian Prediction Tree Models
Bayesian Prediction Tree Models Statistical Prediction Tree Modelling for Clinico-Genomics Clinical gene expression data - expression signatures, profiling Tree models for predictive sub-typing Combining
More informationAbstract. Optimization strategy of Copy Number Variant calling using Multiplicom solutions APPLICATION NOTE. Introduction
Optimization strategy of Copy Number Variant calling using Multiplicom solutions Michael Vyverman, PhD; Laura Standaert, PhD and Wouter Bossuyt, PhD Abstract Copy number variations (CNVs) represent a significant
More informationMBA 605 Business Analytics Don Conant, PhD. GETTING TO THE STANDARD NORMAL DISTRIBUTION
MBA 605 Business Analytics Don Conant, PhD. GETTING TO THE STANDARD NORMAL DISTRIBUTION Variables In the social sciences data are the observed and/or measured characteristics of individuals and groups
More informationQuantitative genetics: traits controlled by alleles at many loci
Quantitative genetics: traits controlled by alleles at many loci Human phenotypic adaptations and diseases commonly involve the effects of many genes, each will small effect Quantitative genetics allows
More informationProblem 1) Match the terms to their definitions. Every term is used exactly once. (In the real midterm, there are fewer terms).
Problem 1) Match the terms to their definitions. Every term is used exactly once. (In the real midterm, there are fewer terms). 1. Bayesian Information Criterion 2. Cross-Validation 3. Robust 4. Imputation
More informationMemorial Sloan-Kettering Cancer Center
Memorial Sloan-Kettering Cancer Center Memorial Sloan-Kettering Cancer Center, Dept. of Epidemiology & Biostatistics Working Paper Series Year 2007 Paper 14 On Comparing the Clustering of Regression Models
More informationHormones and Signal Transduction. Dr. Kevin Ahern
Dr. Kevin Ahern Signaling Outline Signaling Outline Background Signaling Outline Background Membranes Signaling Outline Background Membranes Hormones & Receptors Signaling Outline Background Membranes
More informationMulti-Channel Film Dosimetry and Gamma Map Verification
Multi-Channel Film Dosimetry and Gamma Map Verification Micke A International Specialty Products Ashland proprietary technology, patents pending Single Channel Film Dosimetry Calibration Curve X=R R ave
More informationSupplementary Figures
Supplementary Figures Supplementary Fig 1. Comparison of sub-samples on the first two principal components of genetic variation. TheBritishsampleisplottedwithredpoints.The sub-samples of the diverse sample
More informationChapter 1: Explaining Behavior
Chapter 1: Explaining Behavior GOAL OF SCIENCE is to generate explanations for various puzzling natural phenomenon. - Generate general laws of behavior (psychology) RESEARCH: principle method for acquiring
More informationBusiness Statistics Probability
Business Statistics The following was provided by Dr. Suzanne Delaney, and is a comprehensive review of Business Statistics. The workshop instructor will provide relevant examples during the Skills Assessment
More informationSix Sigma Glossary Lean 6 Society
Six Sigma Glossary Lean 6 Society ABSCISSA ACCEPTANCE REGION ALPHA RISK ALTERNATIVE HYPOTHESIS ASSIGNABLE CAUSE ASSIGNABLE VARIATIONS The horizontal axis of a graph The region of values for which the null
More informationOn the purpose of testing:
Why Evaluation & Assessment is Important Feedback to students Feedback to teachers Information to parents Information for selection and certification Information for accountability Incentives to increase
More informationT. R. Golub, D. K. Slonim & Others 1999
T. R. Golub, D. K. Slonim & Others 1999 Big Picture in 1999 The Need for Cancer Classification Cancer classification very important for advances in cancer treatment. Cancers of Identical grade can have
More informationMembrane associated receptor transfers the information. Second messengers relay information
Membrane associated receptor transfers the information Most signals are polar and large Few of the signals are nonpolar Receptors are intrinsic membrane proteins Extracellular and intracellular domains
More informationTheta sequences are essential for internally generated hippocampal firing fields.
Theta sequences are essential for internally generated hippocampal firing fields. Yingxue Wang, Sandro Romani, Brian Lustig, Anthony Leonardo, Eva Pastalkova Supplementary Materials Supplementary Modeling
More informationSupplementary Materials for
www.sciencesignaling.org/cgi/content/full/8/364/ra18/dc1 Supplementary Materials for The tyrosine phosphatase (Pez) inhibits metastasis by altering protein trafficking Leila Belle, Naveid Ali, Ana Lonic,
More informationSmall-area estimation of mental illness prevalence for schools
Small-area estimation of mental illness prevalence for schools Fan Li 1 Alan Zaslavsky 2 1 Department of Statistical Science Duke University 2 Department of Health Care Policy Harvard Medical School March
More informationChapter 4 DESIGN OF EXPERIMENTS
Chapter 4 DESIGN OF EXPERIMENTS 4.1 STRATEGY OF EXPERIMENTATION Experimentation is an integral part of any human investigation, be it engineering, agriculture, medicine or industry. An experiment can be
More information