Practical Experience in the Analysis of Gene Expression Data
Workshop Biometrical Analysis of Molecular Markers, Heidelberg, 2001
Practical Experience in the Analysis of Gene Expression Data from Two Data Sets concerning ALL in Children and Patients with Nodules in the Thyroid Glands
S. Kropf, Otto von Guericke University, Magdeburg
O. Kuß, U. Hattenhorst, S. Burdach, Martin Luther University, Halle
M. Eszlinger, K. Krohn, University of Leipzig
Contents
1. Introduction
2. Methods applied
3. Data sets
4. Results
5. Summary
Kropf et al., Heidelberg, November
1. Introduction
One of the basic questions in the analysis of gene expression data is the detection of genes that are more or less strongly expressed in some cells than in others, e.g.:
- patients with some disease vs. healthy persons (two-sample problem)
- affected tissue vs. unaffected tissue in the same persons (one-sample problem)
Statistical concern: the problem of multiple testing, since many hundreds or thousands of genes are considered in parallel.
In the literature and in our own proposals: methods to control the experimentwise (familywise) type I error.
Our own experience was based on small samples of low-dimensional arrays (macro arrays with 588 genes, each spotted twice).
Question: Does that also work with high-dimensional arrays?
In more detail:
- Is it statistically feasible to keep the experimentwise error rate? (Can we find significant variables, or is the claim too high to be fulfilled in higher dimensions?)
- Is it technically feasible? On a standard PC? With standard statistical software (SAS, SPSS)?
- Are there alternative techniques or claims?
- What about data transformations?
2. Methods applied
Methods to control the experimentwise type I error:
a) Bonferroni(-Holm) method
α adjustment: α / (number of genes − i), i = 0, 1, ..., per test
- easy to implement
- small effect of the Holm correction
- extremely small effective type I errors
- robustness questionable in the extreme tails
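The adjustment above can be sketched in a few lines of Python. This is only an illustration, not the authors' implementation: the function names and the p-values are hypothetical, and the unadjusted p-values are assumed to be already available from the per-gene tests.

```python
def bonferroni_reject(p_values, alpha=0.05):
    """Plain Bonferroni: every test is carried out at level alpha / m."""
    m = len(p_values)
    return [p <= alpha / m for p in p_values]


def holm_reject(p_values, alpha=0.05):
    """Holm's step-down variant: the i-th smallest p-value (i = 0, 1, ...)
    is compared with alpha / (m - i); testing stops at the first failure."""
    m = len(p_values)
    order = sorted(range(m), key=lambda j: p_values[j])
    reject = [False] * m
    for i, j in enumerate(order):
        if p_values[j] <= alpha / (m - i):
            reject[j] = True
        else:
            break
    return reject


# Hypothetical unadjusted p-values for five genes.
p_example = [0.001, 0.04, 0.012, 0.03, 0.5]
bonferroni_result = bonferroni_reject(p_example)
holm_result = holm_reject(p_example)
```

With these made-up values Holm rejects one hypothesis more than Bonferroni. With thousands of genes the divisor m − i hardly changes over the first few steps, which is consistent with the small effect of the Holm correction noted on the slide.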
b) Permutation procedure of Westfall and Young (Westfall and Young, 1993; Dudoit et al., 2000)
The basic tests (here t tests) are first carried out with the original data, then repeatedly with permuted samples.
In the permutation process each variable is not compared with itself. Instead, the t value of the most significant original variable is compared with that of the most significant variable in the b-th permutation sample; the t value of the second original variable (ordered by significance) is compared with the maximum over all variables except the one that was originally most significant; and so on.
Comments:
- The procedure is distribution-free because of the permutation principle.
- It uses a parametric test as its basic element, so its power depends on the distribution.
- The procedure is implemented in the SAS procedure PROC MULTTEST for the comparison of independent samples.
- It is recommended by the working group of Terry Speed (University of California, Berkeley).
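The comparison of successive maxima described for procedure b) might look as follows in Python. This is a rough sketch under stated assumptions, not PROC MULTTEST: the function names and the data are hypothetical, a pooled-variance t statistic is assumed as the basic test, random rather than complete permutations are used, and every group is assumed to have non-zero variance.

```python
import random
from statistics import mean, variance


def pooled_t(x, y):
    """Two-sample t statistic with pooled variance."""
    nx, ny = len(x), len(y)
    pooled = ((nx - 1) * variance(x) + (ny - 1) * variance(y)) / (nx + ny - 2)
    return (mean(x) - mean(y)) / (pooled * (1 / nx + 1 / ny)) ** 0.5


def abs_t_all(rows, labels):
    """|t| for every gene; rows[g] holds the values of gene g for all subjects."""
    result = []
    for row in rows:
        x = [v for v, lab in zip(row, labels) if lab == 0]
        y = [v for v, lab in zip(row, labels) if lab == 1]
        result.append(abs(pooled_t(x, y)))
    return result


def maxt_adjusted_p(rows, labels, n_perm=1000, seed=0):
    """Step-down maxT adjusted p-values from random label permutations."""
    rng = random.Random(seed)
    t_obs = abs_t_all(rows, labels)
    # Genes ordered from most to least significant in the original data.
    order = sorted(range(len(rows)), key=lambda g: -t_obs[g])
    counts = [0] * len(rows)
    for _ in range(n_perm):
        perm = labels[:]
        rng.shuffle(perm)
        t_perm = abs_t_all(rows, perm)
        # Successive maxima: the k-th ordered gene is compared with the
        # maximum over all genes that were not more significant than it.
        running = 0.0
        for k in reversed(range(len(order))):
            running = max(running, t_perm[order[k]])
            if running >= t_obs[order[k]]:
                counts[order[k]] += 1
    adjusted = [c / n_perm for c in counts]
    # Enforce monotonicity along the original order of significance.
    for k in range(1, len(order)):
        adjusted[order[k]] = max(adjusted[order[k]], adjusted[order[k - 1]])
    return adjusted


# Hypothetical data: 3 genes, 5 + 5 subjects; only gene 0 differs clearly.
labels_example = [0] * 5 + [1] * 5
rows_example = [
    [1.0, 1.1, 0.9, 1.2, 0.8, 5.0, 5.1, 4.9, 5.2, 4.8],
    [2.0, 3.1, 2.5, 2.9, 2.2, 2.4, 3.0, 2.1, 2.8, 2.6],
    [0.5, 0.7, 0.6, 0.9, 0.4, 0.8, 0.3, 0.65, 0.55, 0.75],
]
adjusted_example = maxt_adjusted_p(rows_example, labels_example)
```

The nested loops make the cost proportional to the number of permutations times the number of genes, which fits the processing times reported on the results slides.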
c) Parametric sequential procedure (Kropf, 2000), two-sample case
Based on the assumption of equal variances in all variables (genes).
1. Sort all variables by decreasing variance in the total sample.
2. Carry out unadjusted two-sample t tests in the order given by step 1 until the first non-significant test.
Controls the experimentwise type I error if all variables are normally distributed with equal variance in both groups. The additional assumption of equal variances across all variables is not necessary for the type I error, but it is needed for good power.
The one-sample version is similar (sort by decreasing mean squared difference from 0).
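The two steps above can be sketched as follows. This is a minimal illustration, not the SPSS matrix program used by the authors: the function names and data are hypothetical, and the critical value of the t distribution is passed in by the caller (2.306 is the usual two-sided 5 % table value for 8 degrees of freedom, i.e. 5 + 5 subjects).

```python
from statistics import mean, variance


def pooled_t(x, y):
    """Two-sample t statistic with pooled variance."""
    nx, ny = len(x), len(y)
    pooled = ((nx - 1) * variance(x) + (ny - 1) * variance(y)) / (nx + ny - 2)
    return (mean(x) - mean(y)) / (pooled * (1 / nx + 1 / ny)) ** 0.5


def sequential_t_procedure(rows, labels, t_crit):
    """Return the indices of genes declared significant, in test order."""
    # Step 1: order genes by decreasing variance of the total sample,
    # i.e. ignoring the group labels.
    order = sorted(range(len(rows)), key=lambda g: -variance(rows[g]))
    significant = []
    # Step 2: unadjusted t tests in that order until the first failure.
    for g in order:
        x = [v for v, lab in zip(rows[g], labels) if lab == 0]
        y = [v for v, lab in zip(rows[g], labels) if lab == 1]
        if abs(pooled_t(x, y)) > t_crit:
            significant.append(g)
        else:
            break
    return significant


# Hypothetical data: 4 genes, 5 + 5 subjects.
labels_example = [0] * 5 + [1] * 5
rows_example = [
    [1.0, 1.1, 0.9, 1.2, 0.8, 5.0, 5.1, 4.9, 5.2, 4.8],            # large variance, clear shift
    [2.0, 3.1, 2.5, 2.9, 2.2, 2.4, 3.0, 2.1, 2.8, 2.6],            # no shift
    [0.5, 0.7, 0.6, 0.9, 0.4, 0.8, 0.3, 0.65, 0.55, 0.75],         # no shift
    [0.10, 0.11, 0.09, 0.12, 0.08, 0.20, 0.21, 0.19, 0.22, 0.18],  # small variance, clear shift
]
sequential_result = sequential_t_procedure(rows_example, labels_example, t_crit=2.306)
```

In this made-up example the last gene has a clear group difference but the smallest total variance, so it is never reached once the second gene fails. That is the loss of power to expect when the variances differ strongly between genes.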
d) Nonparametric sequential procedure, two-sample case
Also dependent on the assumption of equal variance (shape, interquartile range).
1. Sort the variables by decreasing interquartile range of the total sample.
2. Carry out Mann-Whitney U tests in that order until the first non-significant test.
Utilizes the independence of rank and order statistics. Equal variance of the variables is important for power, not for the type I error; to that extent the procedure depends on the distribution.
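A possible sketch of the nonparametric variant, again with hypothetical names and data. Two assumptions go beyond the slide: the interquartile range is taken from Python's `statistics.quantiles` with its default method, and the Mann-Whitney test is evaluated exactly by enumerating all group assignments of the (mid)ranks, which is only practical for small samples such as 5 + 5.

```python
from itertools import combinations
from statistics import quantiles


def midranks(values):
    """Ranks 1..N, with tied values sharing the average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return ranks


def exact_rank_sum_p(row, labels):
    """Two-sided exact p-value of the rank-sum (Mann-Whitney) statistic."""
    ranks = midranks(row)
    n1 = labels.count(0)
    expected = n1 * (len(row) + 1) / 2
    observed = abs(sum(r for r, lab in zip(ranks, labels) if lab == 0) - expected)
    hits = total = 0
    for subset in combinations(ranks, n1):
        total += 1
        if abs(sum(subset) - expected) >= observed - 1e-9:
            hits += 1
    return hits / total


def sequential_rank_procedure(rows, labels, alpha=0.05):
    """Return the indices of genes declared significant, in test order."""
    def iqr(row):
        q1, _, q3 = quantiles(row, n=4)
        return q3 - q1

    # Step 1: order genes by decreasing interquartile range of the total sample.
    order = sorted(range(len(rows)), key=lambda g: -iqr(rows[g]))
    significant = []
    # Step 2: rank tests in that order until the first non-significant one.
    for g in order:
        if exact_rank_sum_p(rows[g], labels) <= alpha:
            significant.append(g)
        else:
            break
    return significant


# Hypothetical data: 3 genes, 5 + 5 subjects; only gene 0 differs clearly.
labels_example = [0] * 5 + [1] * 5
rows_example = [
    [1.0, 1.1, 0.9, 1.2, 0.8, 5.0, 5.1, 4.9, 5.2, 4.8],
    [2.0, 3.1, 2.5, 2.9, 2.2, 2.4, 3.0, 2.1, 2.8, 2.6],
    [0.5, 0.7, 0.6, 0.9, 0.4, 0.8, 0.3, 0.65, 0.55, 0.75],
]
rank_result = sequential_rank_procedure(rows_example, labels_example)
```

With 5 + 5 subjects the smallest attainable two-sided p-value is 2/252, so a Bonferroni-type correction over thousands of genes could never reach significance with a rank test, whereas the sequential procedure tests each gene at the unadjusted level.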
3. Data sets
A: acute lymphatic leukemia (ALL) in children (U. Hattenhorst, S. Burdach, O. Kuß, Halle)
- 11 children with ALL vs. 10 healthy persons,
- special Affymetrix (R) chip with genes,
- selection of genes after dropping all rows with an empty description field or with a description starting with EST,
- skewed distributions (across genes or patients),
- also negative expression values, due to the background elimination and normalization process.
[Figure: ALL patients: distribution of expression levels across genes, shown as percentiles for each patient (ALL01, ALL02, ALL15, ALL19, ALL22, ALL25, ALL28, ALL29, ALL30, ALL31, ALL32)]
Two versions of a transformation:
1. logarithms. Problem: negative values, therefore shifted: ln(x + 200)
2. cubic root
The variances of the expression values are not really equal for different genes in either version. Genes with small variances will be suppressed in the sequential procedure.
[Table: selection of seven genes with logarithmic transformation; mean and stddev per gene for the groups controls and ALL]
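Both transformations are easy to write down. The following Python sketch uses hypothetical function names; the shift of 200 is taken from the slide, while treating the cubic root of a negative value as sign-preserving is an assumption, since the slides do not say how negative values were handled there.

```python
import math


def shifted_log(x, shift=200.0):
    """ln(x + shift); only defined for x > -shift."""
    return math.log(x + shift)


def signed_cube_root(x):
    """Cube root that keeps the sign, so negative values stay negative."""
    return math.copysign(abs(x) ** (1.0 / 3.0), x)


# Hypothetical expression values, including negative ones.
raw_values = [-150.0, -8.0, 0.0, 27.0, 1000.0]
log_values = [shifted_log(v) for v in raw_values]
root_values = [signed_cube_root(v) for v in raw_values]
```

The shifted logarithm fails for values at or below −200, so the shift has to be chosen with the smallest observed value in mind; the cube root has no such restriction.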
B: patients with nodules in the thyroid glands (Krohn, Eszlinger, Leipzig)
- 5 patients with hot nodules
- 5 patients with cold nodules
- from each patient, tissue from the nodule and from the unaffected surrounding
- genes per tissue
- two versions of expression levels provided by the Affymetrix software:
  o LogAvg (pre-processing at logarithmic level)
  o AvgDiff (usual version; problems with negative values and skewed distributions): cubic root transformation
Selection of 10 genes (logarithmic values)
[Table: var001 to var010; mean and stddev per gene for the groups hot nodules and cold nodules]
Variances not too different!
The same genes with cubic root transformation
[Table: VAR001 to VAR010; mean and stddev per gene for the groups hot nodules and cold nodules]
Variances more heterogeneous!
4. Results
A: ALL data
Full set of genes (p = , n1 = 10, n2 = 11, α = 0.05):
Parametric sequential procedure: matrix language of SPSS, a few minutes of processing time on a Pentium III PC, no special problems.
[Table: # local sign., # Bonf. sign. and # sequ. sign. for the logarithmic and the cubic root transformation; the cubic root entry is annotated "(of the 11 above)"]
Westfall/Young: SAS can treat variables at most.
Reduced set of genes (p = 9,805), all versions based on logarithmic values:
# local sign.:
# Bonferroni sign.: 7
# parametric sequ.: 10 (2 minutes)
# nonparametric sequ.: 7 1)
# Westfall/Young: 12 (30 minutes)
1) with standard SPSS procedures; borderline load with large pivot tables
Nonparametric sequential procedure for restricted ALL data
[Table: SPSS output for genes G(1) to G(12): Mann-Whitney U, two-sided significance and Monte Carlo significance; case listing of G(1) by group (BM, TS, CB, ALL)]
B: hot and cold nodules in thyroid glands (p = )
a) both types of nodules together vs. surrounding (n = 5 + 5); one-sample problem, Westfall/Young not available in SAS
Columns: logarithmic data (Affymetrix), cubic root
# local sign.
# Bonf.
# parametric sequ.: 4, 4 (1 in common)
# nonparametric sequ.: 6
b) hot vs. cold nodules (n1 = n2 = 5)
A lot of local significances (1,465 with ln, with root, both more than 10 % of genes), but nothing at the familywise error level (sequential, Holm, Westfall-Young).
Again the question: is the claim of a familywise error level too hard?
Alternative claim: false discovery rate (Benjamini/Hochberg, 1995): the expected proportion of falsely rejected null hypotheses among all rejected hypotheses.
Recently discussed for array analyses (Tusher, Tibshirani, Chu, 2001; Dudoit et al., ISCB, 2001).
Advantage: the claim is not higher in high-dimensional arrays than in low-dimensional ones.
Problems:
1. In a situation with many non-null genes we can also hide a lot of falsely detected genes.
2. Can it be controlled properly? Benjamini and Hochberg's proposal treats only uncorrelated tests.
SAS help text: PROC MULTTEST: FDR Option
The FDR option requests adjusted p-values using the method of Benjamini and Hochberg. These p-values do not control the familywise error rate, but they do control the false discovery rate in some cases.
ALL data (ln): genes, local sign., 7 Holm, 10/7 sequ., 12 Westfall/Young, FDR 398
hot/cold nodules (ln): genes, local, 0 familywise, 3 FDR
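For orientation, Benjamini-Hochberg adjusted p-values can be computed as in the following Python sketch. It illustrates the step-up rule only and is not the SAS implementation; the function name and the p-values are hypothetical.

```python
def bh_adjust(p_values):
    """Benjamini-Hochberg adjusted p-values.

    The p-value of rank r (1 = smallest) among m tests is scaled to
    p * m / r; a running minimum taken from the largest p-value downwards
    keeps the adjusted values monotone.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda j: p_values[j])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):
        j = order[rank - 1]
        running_min = min(running_min, p_values[j] * m / rank)
        adjusted[j] = running_min
    return adjusted


# Hypothetical unadjusted p-values for five genes.
p_example = [0.001, 0.045, 0.012, 0.02, 0.5]
fdr_example = bh_adjust(p_example)
n_fdr_significant = sum(a <= 0.05 for a in fdr_example)
```

For these made-up values three genes have an adjusted p-value of at most 0.05, whereas Holm's procedure at α = 0.05 would stop after two. The same pattern, on a much larger scale, shows in the counts reported above.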
5. Summary
- The experimentwise error can be controlled even in extremely high dimensions, but usually only a few significant variables will be found, depending on the procedure and the transformation.
- Alternative way: FDR seems sensible. We should give several versions of significance.
- The combination of a standard PC and standard statistical software is on the borderline with such high dimensions. There are hard and soft restrictions.
- Raw data should be transformed, but the specific way may depend on the type of array and other factors.
References:
Benjamini, Y., Hochberg, Y. (1995). Controlling the false discovery rate: a practical and powerful approach to multiple testing. J. R. Statist. Soc. B 57.
Dudoit, S., Yang, Y.H., Callow, M.J., Speed, T.P. (2000). Statistical methods for identifying differentially expressed genes in replicated cDNA microarray experiments. Technical Report # 578, Stanford University School of Medicine.
Holm, S. (1979). A simple sequentially rejective multiple test procedure. Scand. J. Statist. 6.
Kropf, S. (2000). Hochdimensionale multivariate Verfahren in der medizinischen Statistik. Shaker Verlag, Aachen.
Tusher, V.G., Tibshirani, R., Chu, G. (2001). Significance analysis of microarrays applied to the ionizing radiation response. PNAS 98.
Westfall, P.H., Young, S.S. (1993). Resampling-based multiple testing. John Wiley & Sons, New York.
Research Methods 1 Handouts, Graham Hole,COGS - version 10, September 000: Page 1: T-TESTS: When to use a t-test: The simplest experimental design is to have two conditions: an "experimental" condition
More informationBayesian Confidence Intervals for Means and Variances of Lognormal and Bivariate Lognormal Distributions
Bayesian Confidence Intervals for Means and Variances of Lognormal and Bivariate Lognormal Distributions J. Harvey a,b, & A.J. van der Merwe b a Centre for Statistical Consultation Department of Statistics
More informationStatistics for EES Factorial analysis of variance
Statistics for EES Factorial analysis of variance Dirk Metzler http://evol.bio.lmu.de/_statgen 1. July 2013 1 ANOVA and F-Test 2 Pairwise comparisons and multiple testing 3 Non-parametric: The Kruskal-Wallis
More informationSampling for Impact Evaluation. Maria Jones 24 June 2015 ieconnect Impact Evaluation Workshop Rio de Janeiro, Brazil June 22-25, 2015
Sampling for Impact Evaluation Maria Jones 24 June 2015 ieconnect Impact Evaluation Workshop Rio de Janeiro, Brazil June 22-25, 2015 How many hours do you expect to sleep tonight? A. 2 or less B. 3 C.
More informationMOST: detecting cancer differential gene expression
Biostatistics (2008), 9, 3, pp. 411 418 doi:10.1093/biostatistics/kxm042 Advance Access publication on November 29, 2007 MOST: detecting cancer differential gene expression HENG LIAN Division of Mathematical
More informationReadings Assumed knowledge
3 N = 59 EDUCAT 59 TEACHG 59 CAMP US 59 SOCIAL Analysis of Variance 95% CI Lecture 9 Survey Research & Design in Psychology James Neill, 2012 Readings Assumed knowledge Howell (2010): Ch3 The Normal Distribution
More informationComparing multiple proportions
Comparing multiple proportions February 24, 2017 psych10.stanford.edu Announcements / Action Items Practice and assessment problem sets will be posted today, might be after 5 PM Reminder of OH switch today
More informationStatistical Analysis of Single Nucleotide Polymorphism Microarrays in Cancer Studies
Statistical Analysis of Single Nucleotide Polymorphism Microarrays in Cancer Studies Stanford Biostatistics Workshop Pierre Neuvial with Henrik Bengtsson and Terry Speed Department of Statistics, UC Berkeley
More informationComparison of Gene Set Analysis with Various Score Transformations to Test the Significance of Sets of Genes
Comparison of Gene Set Analysis with Various Score Transformations to Test the Significance of Sets of Genes Ivan Arreola and Dr. David Han Department of Management of Science and Statistics, University
More informationApplication of the concept of False Discovery Rate on predicted cancer outcome with microarrays
Mathematical Statistics Stockholm University Application of the concept of False Discovery Rate on predicted cancer outcome with microarrays Sally Salih Examensarbete 2006:1 Postal address: Mathematical
More informationResearch and Evaluation Methodology Program, School of Human Development and Organizational Studies in Education, University of Florida
Vol. 2 (1), pp. 22-39, Jan, 2015 http://www.ijate.net e-issn: 2148-7456 IJATE A Comparison of Logistic Regression Models for Dif Detection in Polytomous Items: The Effect of Small Sample Sizes and Non-Normality
More informationn Outline final paper, add to outline as research progresses n Update literature review periodically (check citeseer)
Project Dilemmas How do I know when I m done? How do I know what I ve accomplished? clearly define focus/goal from beginning design a search method that handles plateaus improve some ML method s robustness
More informationDescribe what is meant by a placebo Contrast the double-blind procedure with the single-blind procedure Review the structure for organizing a memo
Please note the page numbers listed for the Lind book may vary by a page or two depending on which version of the textbook you have. Readings: Lind 1 11 (with emphasis on chapters 10, 11) Please note chapter
More information7 Statistical Issues that Researchers Shouldn t Worry (So Much) About
7 Statistical Issues that Researchers Shouldn t Worry (So Much) About By Karen Grace-Martin Founder & President About the Author Karen Grace-Martin is the founder and president of The Analysis Factor.
More informationIntegrative Biology 200A PRINCIPLES OF PHYLOGENETICS Spring 2012
Integrative Biology 200A PRINCIPLES OF PHYLOGENETICS Spring 2012 University of California, Berkeley Kipling Will- 1 March Data/Hypothesis Exploration and Support Measures I. Overview. -- Many would agree
More informationUnit 1 Exploring and Understanding Data
Unit 1 Exploring and Understanding Data Area Principle Bar Chart Boxplot Conditional Distribution Dotplot Empirical Rule Five Number Summary Frequency Distribution Frequency Polygon Histogram Interquartile
More informationMS&E 226: Small Data
MS&E 226: Small Data Lecture 10: Introduction to inference (v2) Ramesh Johari ramesh.johari@stanford.edu 1 / 17 What is inference? 2 / 17 Where did our data come from? Recall our sample is: Y, the vector
More informationProfile Analysis. Intro and Assumptions Psy 524 Andrew Ainsworth
Profile Analysis Intro and Assumptions Psy 524 Andrew Ainsworth Profile Analysis Profile analysis is the repeated measures extension of MANOVA where a set of DVs are commensurate (on the same scale). Profile
More informationOne-Way ANOVAs t-test two statistically significant Type I error alpha null hypothesis dependant variable Independent variable three levels;
1 One-Way ANOVAs We have already discussed the t-test. The t-test is used for comparing the means of two groups to determine if there is a statistically significant difference between them. The t-test
More informationResearch Manual STATISTICAL ANALYSIS SECTION. By: Curtis Lauterbach 3/7/13
Research Manual STATISTICAL ANALYSIS SECTION By: Curtis Lauterbach 3/7/13 TABLE OF CONTENTS INTRODUCTION 1 STATISTICAL ANALYSIS 1 Overview 1 Dependent Variable 1 Independent Variable 1 Interval 1 Ratio
More informationLearning Objectives 9/9/2013. Hypothesis Testing. Conflicts of Interest. Descriptive statistics: Numerical methods Measures of Central Tendency
Conflicts of Interest I have no conflict of interest to disclose Biostatistics Kevin M. Sowinski, Pharm.D., FCCP Last-Chance Ambulatory Care Webinar Thursday, September 5, 2013 Learning Objectives For
More information