Immunological Data Processing & Analysis


Hongmei Yang (CBIM at URMC)
Center for Biodefense Immune Modeling
Department of Biostatistics and Computational Biology
University of Rochester
June 12, 2012

Outline
1. Immunological Data Analysis
   - Examples
   - Exploratory Data Analysis
     - Graphic Methods
     - Description of Data: Summary Statistics
   - Basic Statistical Methods
     - Univariate & Bivariate Analysis
     - Multivariate Analysis
2. Immunological Data Processing
   - Elispot Data Processing
   - Elisa & Luminex Data Processing
   - Hemagglutination Data Processing
3. Appendix

Example I: H1N1 Vaccine Trial

The H1N1 vaccine trial studies the correlations among immune responses to vaccination. It is an open-label, prospective evaluation in healthy adults in three age groups (≤32, 60-69 and 70+), representing a spectrum of prior exposure to H1N1 viruses.

Immune outcomes include:
- HAI titers against 2009 pH1N1 (day 0, day 28, fold increase)
- HAI titers against H1N1 Brisbane (day 0, day 28, fold increase)
- IgA and IgG ELISPOTs against pH1N1 on day 7
- Percentages from FACS analysis of B cell subsets

Example II: B Cell Response of DR1 and B10 Mice to NC and PR8 Infection

Scientific questions:
- Do DR1 and B10 mice respond differently to NC infection?
- Do DR1 and B10 mice respond differently to PR8 infection?

About the experiments:
- 2 groups of mice, of strains DR1 and B10 respectively
- 3 types of infection imposed on each group: MOCK, NC and PR8
- Concentrations of 30 cytokines or chemokines were measured, representing the B cell response after infection
- 5 replicates for each strain-infection combination (30 planned), but only 28 observations are available

Exploratory Data Analysis: Graphic Methods
- Bar charts
- Histograms
- Box plots
- Violin plots: a combination of a box plot and a kernel density plot
- Scatter plots and scatter plot matrices

Exploratory Data Analysis: Bar Plot
[Figure: bar plot of HAI at day 0 (d0) and fold increase at day 28 (F.d28) by age group (≤32, 60-69, 70+).]

Exploratory Data Analysis: Histogram
[Figure: density histograms of H1N1.IgG (million) and of log10(H1N1.IgG).]

Exploratory Data Analysis: Box Plot with Mean, Median and Geometric Mean
[Figure: box plot of H1N1.IgG (million) by age group (≤32, 60-69, 70+), with the arithmetic mean, median and geometric mean marked.]

Exploratory Data Analysis: Violin Plot
A combination of a box plot and a kernel density plot.
[Figure: violin plot by age group (≤32, 60-69, 70+), with the arithmetic mean, median and geometric mean marked.]

Exploratory Data Analysis: Scatter Plot Matrices
[Figure: scatter plot matrix of the H1N1 vaccine study variables log2.a.d0, log2.f.a, log2.b.d0, log10.iga, log10.igg, x27hi38hi and x38hi138.]

Pairwise correlations printed in the panels (r, p):

|           | log2.f.a    | log2.b.d0  | log10.iga  | log10.igg   | x27hi38hi   | x38hi138    |
|-----------|-------------|------------|------------|-------------|-------------|-------------|
| log2.a.d0 | 0.42, <0.01 | 0.18, 0.17 | 0, 0.99    | 0.12, 0.37  | 0.25, 0.08  | 0.22, 0.11  |
| log2.f.a  |             | 0.04, 0.76 | 0.2, 0.15  | 0.64, <0.01 | 0.53, <0.01 | 0.56, <0.01 |
| log2.b.d0 |             |            | 0.14, 0.31 | 0.2, 0.14   | 0.01, 0.92  | 0.11, 0.45  |
| log10.iga |             |            |            | 0.53, <0.01 | 0.49, <0.01 | 0.52, <0.01 |
| log10.igg |             |            |            |             | 0.73, <0.01 | 0.8, <0.01  |
| x27hi38hi |             |            |            |             |             | 0.96, <0.01 |

Exploratory Data Analysis: Data Assumptions & Transformation
- Scale of measurement: categorical, ordinal or continuous?
- If continuous, is it normally distributed?
- If continuous but not normally distributed, the data can often be transformed to be approximately normal:
  - Log transformation, $Y' = \log(Y + 1)$: the most popular in immunological data analysis
  - Square root transformation, $Y' = \sqrt{Y + 0.5}$: suits Poisson-distributed data, but is less popular
  - Arcsine square root transformation, $p' = \arcsin\sqrt{p}$: suits outcomes expressed as percentages (with $p$ the proportion between 0 and 1)
  - Inverse hyperbolic sine transformation, $Y' = \operatorname{arcsinh}(Y) = \log(Y + \sqrt{Y^2 + 1})$: suits data with many zeroes, e.g., microbiome data, which often contain many zeroes for absent bacterial species
- The inverse hyperbolic sine transformation is defined whether the variable is positive, negative or zero.
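The four transformations can be sketched in a few lines. This is a minimal illustration using only the Python standard library; the function names and the input values are hypothetical and not taken from the study.

```python
import math

def log_plus_one(y):
    """Log transformation log(Y + 1); the +1 keeps zero counts finite."""
    return math.log(y + 1.0)

def sqrt_plus_half(y):
    """Square root transformation sqrt(Y + 0.5), often used for counts."""
    return math.sqrt(y + 0.5)

def arcsine_sqrt(p):
    """Arcsine square root transformation of a proportion p in [0, 1]."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be a proportion between 0 and 1")
    return math.asin(math.sqrt(p))

def inverse_hyperbolic_sine(y):
    """arcsinh(Y) = log(Y + sqrt(Y^2 + 1)); defined for any real Y."""
    return math.asinh(y)

# Hypothetical inputs, chosen to show the behaviour at zero and below zero.
for y in (-3.0, 0.0, 3.0, 1000.0):
    print(y, inverse_hyperbolic_sine(y))
print(log_plus_one(0.0), sqrt_plus_half(0.0), arcsine_sqrt(0.25))
```

Note that a percentage such as 25% has to be passed to `arcsine_sqrt` as the proportion 0.25.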

Exploratory Data Analysis: Measure of Central Location
- Typically the mean, median and geometric mean. The geometric mean $G$ is
  \[ G = \sqrt[n]{x_1 x_2 \cdots x_n}, \qquad \ln G = \frac{1}{n}\sum_{i=1}^{n} \ln x_i \]
- The sample mean is very sensitive to outliers; the median and geometric mean are much less affected.
- When the distribution of the data is not symmetric, the mean and median will differ.
- The geometric mean never exceeds the arithmetic mean.
- The reported measure of central location should be compatible with the test used for statistical analysis. For example:
  - t test: mean
  - Wilcoxon rank sum test: median
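A small sketch of the three measures, assuming strictly positive data; the values below are hypothetical and only meant to show how a single large observation moves the mean.

```python
import math
import statistics

def geometric_mean(xs):
    """exp of the mean of ln(x); requires strictly positive values."""
    if any(x <= 0 for x in xs):
        raise ValueError("geometric mean needs positive values")
    return math.exp(sum(math.log(x) for x in xs) / len(xs))

# Hypothetical right-skewed values with one very large observation.
values = [10.0, 100.0, 1000.0, 100000.0]

print("mean          ", statistics.mean(values))
print("median        ", statistics.median(values))
print("geometric mean", geometric_mean(values))
```

Here the arithmetic mean (25277.5) is pulled far above the median (550) and the geometric mean (about 562).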

Exploratory Data Analysis: IgG Plots
[Figure: density histograms of H1N1.IgG (million) and of log10(H1N1.IgG).]

Exploratory Data Analysis: IgG Homogeneity among 3 Age Groups (≤32, 60-69 and 70+)

|          | Kruskal-Wallis | One-way ANOVA: IgG | One-way ANOVA: log10 IgG |
|----------|----------------|--------------------|--------------------------|
| skewness |                | 6.09               | 0.056                    |
| kurtosis |                | 38.72              | 0.057                    |
| p value  | 1.99e-05       | 0.0213             | 1.33e-05                 |

Figure: IgG Homogeneity among 3 Age Groups

Exploratory Data Analysis: Plasmablast Plots
[Figure: density histograms of the plasmablast percentage in four panels: Plasmablast (untransformed), Log, Arcsine Square Root and Inverse Hyperbolic Sine.]

Exploratory Data Analysis: Plasmablast Homogeneity among 3 Age Groups (≤32, 60-69 and 70+)

|          | Kruskal-Wallis | One-way ANOVA: f* | One-way ANOVA: log f* | One-way ANOVA: arcsin √p | One-way ANOVA: arcsinh(p) |
|----------|----------------|-------------------|-----------------------|--------------------------|---------------------------|
| skewness |                | 3.47              | 0.235                 | 1.685                    | 0.225                     |
| kurtosis |                | 14.83             | 0.127                 | 4.437                    | 0.338                     |
| p value  | 0.0018         | 0.0036            | 0.0006                | 0.0006                   | 0.0005                    |

* f = 100p

Figure: Plasmablast Homogeneity among 3 Age Groups
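Skewness and kurtosis before and after a transformation can be checked along the following lines. This sketch assumes the simple moment-based definitions (kurtosis reported as excess over 3); the slides do not say which definition was used, and the data are hypothetical.

```python
import math

def central_moment(xs, k):
    """k-th central moment with divisor n."""
    m = sum(xs) / len(xs)
    return sum((x - m) ** k for x in xs) / len(xs)

def skewness(xs):
    """Moment-based skewness m3 / m2^(3/2); 0 for symmetric data."""
    return central_moment(xs, 3) / central_moment(xs, 2) ** 1.5

def excess_kurtosis(xs):
    """Moment-based kurtosis minus 3, so a normal distribution gives about 0."""
    return central_moment(xs, 4) / central_moment(xs, 2) ** 2 - 3.0

# Hypothetical right-skewed values and their log10 transform.
raw = [1.0, 10.0, 100.0, 1000.0, 10000.0]
logged = [math.log10(x) for x in raw]

print("raw:   ", skewness(raw), excess_kurtosis(raw))
print("log10: ", skewness(logged), excess_kurtosis(logged))
```

For these values the raw data are strongly right-skewed, while the log10 values are evenly spaced and therefore have zero skewness.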

Exploratory Data Analysis: Measure of Spread
The measure of spread should match the measure of central location:
- Sample standard error of the mean, if the mean is used as central location
- Sample interquartile range, if the median is used as central location
- Sample standard error of the mean of the log-transformed data, if the geometric mean is used as central location
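The three spread measures can be sketched as follows. The quartile convention is an assumption: `statistics.quantiles` (Python 3.8+) uses its default "exclusive" method here, and other software may give slightly different quartiles.

```python
import math
import statistics

def standard_error(xs):
    """Sample standard deviation divided by sqrt(n)."""
    return statistics.stdev(xs) / math.sqrt(len(xs))

def interquartile_range(xs):
    """Third quartile minus first quartile (default 'exclusive' method)."""
    q1, _, q3 = statistics.quantiles(xs, n=4)
    return q3 - q1

def log_scale_standard_error(xs):
    """Standard error of the mean of ln(x); pairs with the geometric mean."""
    return standard_error([math.log(x) for x in xs])

# Hypothetical positive measurements.
values = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
print(standard_error(values))
print(interquartile_range(values))
print(log_scale_standard_error(values))
```

The log-scale standard error describes uncertainty on the log scale; it is usually reported by exponentiating the limits of an interval built around the log of the geometric mean.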

Basic Statistical Methods: Correlation Analysis
- Pearson correlation coefficient: indicates the strength of a linear relationship between two variables, and is influenced by
  - outliers
  - unequal variances
  - non-normality
  - nonlinearity
- Spearman's rank correlation coefficient: a non-parametric measure of statistical dependence between two variables; it assesses how well the relationship between the two variables can be described by a monotonic function
- Kendall's tau: a measure of correlation between two ordinal-level variables
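The difference between the Pearson and Spearman coefficients shows up on a monotonic but nonlinear relationship. The sketch below computes Spearman's coefficient as the Pearson correlation of average ranks; the data are hypothetical.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def average_ranks(values):
    """Ranks starting at 1; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2.0 + 1.0
        i = j + 1
    return ranks

def spearman(xs, ys):
    """Spearman's rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(xs), average_ranks(ys))

# Hypothetical monotonic but nonlinear relationship.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [math.exp(v) for v in x]
print("Pearson ", pearson(x, y))
print("Spearman", spearman(x, y))
```

Because y increases strictly with x, the Spearman coefficient is 1, while the Pearson coefficient is below 1 because the relationship is not linear.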

Basic Statistical Methods: Standard Statistical Testing Methods (independent samples vs. paired samples)
- Independent samples: two separate sets of i.i.d. samples
- Paired samples: a sample of matched pairs of similar units, or one group of units that has been tested twice
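Why the distinction matters can be seen by computing both t statistics on the same numbers. This is a sketch with hypothetical day 0 and day 28 values for four subjects; only the test statistics are computed, not p-values, and the two-sample version assumes equal variances (pooled).

```python
import math
import statistics

def two_sample_t(x, y):
    """Pooled-variance t statistic, treating x and y as independent samples."""
    nx, ny = len(x), len(y)
    pooled_var = ((nx - 1) * statistics.variance(x)
                  + (ny - 1) * statistics.variance(y)) / (nx + ny - 2)
    return (statistics.mean(x) - statistics.mean(y)) / math.sqrt(
        pooled_var * (1.0 / nx + 1.0 / ny))

def paired_t(x, y):
    """Paired t statistic, computed on the within-pair differences."""
    diffs = [a - b for a, b in zip(x, y)]
    return statistics.mean(diffs) / (statistics.stdev(diffs) / math.sqrt(len(diffs)))

# Hypothetical measurements on the same four subjects at two time points.
day0 = [10.0, 20.0, 30.0, 40.0]
day28 = [12.0, 22.0, 33.0, 43.0]

print("independent-samples t:", two_sample_t(day28, day0))
print("paired t:             ", paired_t(day28, day0))
```

The small, consistent within-subject increase is swamped by between-subject variation when the samples are treated as independent, but gives a large statistic once the pairing is used.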

Basic Statistical Methods: Standard Statistical Testing Methods (different types of data)

| Type of data | No. of samples being compared | Relationship between samples | Underlying distribution of all samples | Potential statistical test |
|--------------|-------------------------------|------------------------------|----------------------------------------|----------------------------|
| Binary | 1 | Not applicable | Binary | One-sample binomial test |
| Binary | 2 | Independent | Binary | Chi-square test, Fisher's exact test |
| Binary | >2 | Independent | Binary | Chi-square test |
| Binary | 2 | Paired | Binary | McNemar's test |
| Binary | >2 | Related | Binary | Cochran's Q test (an extension of McNemar's test) |
| Continuous | 1 | Not applicable | Normal | One-sample t test for the mean, one-sample chi-square test for the variance |
| Continuous | 1 | Not applicable | Nonnormal | One-sample Wilcoxon signed-rank test, one-sample sign test |
| Continuous | 2 | Independent | Normal | Two-sample t test for means, two-sample F test for variances |
| Continuous | 2 | Independent | Nonnormal | Wilcoxon rank sum test (also called the Mann-Whitney U test; requires identical spread) for medians, Ansari-Bradley test for spread |
| Continuous | 2 | Independent | Nonnormal | Kolmogorov-Smirnov test for overall difference. When shape and spread are the same, the Wilcoxon rank sum test is more powerful; when median and shape are the same, the Ansari-Bradley test is more powerful. |
| Continuous | 2 | Paired | Normal | Paired t test |
| Continuous | 2 | Paired | Nonnormal | Wilcoxon signed-rank test, sign test (the sign test is more robust to outliers and to data from a heavy-tailed distribution, but less powerful otherwise) |
| Continuous | >2 | Independent | Normal | One-way ANOVA for means, Bartlett's test for homogeneity of variances |
| Continuous | >2 | Independent | Nonnormal | Kruskal-Wallis test (an extension of the Wilcoxon rank sum test) |
| Continuous | >2 | Related | Nonnormal | Friedman rank sum test: an extension of the sign test, and the non-parametric analogue of repeated-measures ANOVA |

Basic Statistical Methods: Multivariate Analysis
- Multiple regression (linear, non-linear, logistic, ...)
- MANOVA: simultaneous comparison of several endpoints (e.g., cytokines) instead of repeated application of ANOVA
- MANCOVA: an extension of MANOVA that additionally controls for the effect of another continuous variable (a confounder), e.g., comparing cytokines across groups simultaneously while adjusting for age

Basic Statistical Methods: Multiple Regression Example
Example: Is age a confounder of the relationship between HAI at day 0 and HAI at day 28 in the H1N1 vaccine trial?
A confounding variable is an extraneous variable in a statistical model that correlates with both the dependent variable and the independent variable.
[Figure: scatter plot of HAI fold increase at D28 (log) against HAI at D0 (log), with ρ = 0.42 and p value = 0.001; and HAI at D0 (p value = 0.011) and HAI fold increase at D28 (p value < 0.0001) by age group (≤32, 60-69, 70+).]

Basic Statistical Methods: Multiple Regression Example
Model:
\[ \log_2(\text{HAI.F.D28}) = \beta_1 \log_2(\text{HAI.D0}) + \beta_2\,\text{Age} \]

Table: Estimates from the linear model fit

| Coefficient | Estimate | Std. error | $\Pr(> \lvert t \rvert)$ |
|-------------|----------|------------|--------------------------|
| β1          | 0.35     | 0.127      | 0.009                    |
| β21: ≤30    | 7.06     | 0.603      | <2e-16                   |
| β22: 60-69  | 2.74     | 0.624      | 5.14e-05                 |
| β23: 70+    | 2.36     | 0.599      | 0.0002                   |

Basic Statistical Methods: Data Reduction
- Principal component analysis (PCA): finds summary variables, called principal components, that contain most of the information in the original data
- Cluster analysis: groups individuals so that subjects in the same cluster have similar profiles of the parameters under study
- Linear discriminant analysis: derives linear combinations of the independent variables that best discriminate between the two outcome groups

Basic Statistical Methods: MANOVA on Example II
Possible statistical approaches:
- ANOVA on each cytokine: separate ANOVAs are generally less powerful
- MANOVA: uses information from all variables simultaneously
Given the large number of dependent variables and the small number of observations, the following cytokines are selected for demonstration: IL7, IL12p70, VEGF, IL10.

Basic Statistical Methods: MANOVA on Example II

Table: Hypothesis testing by ANOVA and MANOVA (p values)

|                                    | NC: B10-DR1 | PR8: B10-DR1 |
|------------------------------------|-------------|--------------|
| IL7                                | 0.234       | 0.251        |
| IL12p70                            | 0.461       | 0.04         |
| VEGF                               | 0.327       | 0.238        |
| IL10                               | 0.005       | 0.769        |
| MANOVA: IL7, IL12p70, VEGF, IL10   | 0.05        | 0.019        |

Elispot Data Processing: Examples
[Figure: IgG spot counts against dilution factor for four subjects (id = 9, 25, 48 and 133).]

Elispot Data Processing: Problems
- How to determine cell numbers from the spot counts yielded by Elispot assays?
- How to deal with outliers?

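Elispot Data Processing: Methods We Developed
- Least Squares: \[ \hat{\beta}_{LS} = \frac{\sum_{k=1}^{K} X_k Y_k}{\sum_{k=1}^{K} X_k^2} \]
- Robust Least Squares: \[ \hat{\beta}_{RLS} = \arg\min_{\beta} \sum_{k=1}^{K} \rho\!\left(\frac{Y_k - X_k \beta}{\sigma}\right) \]
- Mean Approach: \[ \hat{\beta}_{ME} = \frac{1}{K}\sum_{k=1}^{K} \frac{Y_k}{X_k} \]
- Median Approach: \[ \hat{\beta}_{MD} = \operatorname{median}\left\{\frac{Y_k}{X_k}\right\}, \quad k = 1, \ldots, K \]
- Poisson Approach: \[ \hat{\beta}_{POI} = \frac{\sum_{k=1}^{K} Y_k}{\sum_{k=1}^{K} X_k} \]
- Robust Poisson Approach: \[ \hat{\beta}_{RPOI} = \arg\min_{\beta} \sum_{k=1}^{K} \rho\!\left(\frac{Y_k / X_k - \beta}{\sigma}\right) \]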
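The four closed-form estimators can be sketched as below; the two robust variants are left out because the loss function ρ and the scale σ are not specified on the slide. The reading of the symbols is an assumption: X_k is taken as the number of cells plated at dilution level k and Y_k as the spot count observed there. The data are hypothetical.

```python
import statistics

def least_squares(x, y):
    """Regression through the origin: sum(X*Y) / sum(X^2)."""
    return sum(a * b for a, b in zip(x, y)) / sum(a * a for a in x)

def mean_approach(x, y):
    """Average of the per-level ratios Y_k / X_k."""
    return statistics.mean(b / a for a, b in zip(x, y))

def median_approach(x, y):
    """Median of the per-level ratios Y_k / X_k."""
    return statistics.median(b / a for a, b in zip(x, y))

def poisson_approach(x, y):
    """Total spots over total cells: sum(Y) / sum(X)."""
    return sum(y) / sum(x)

# Hypothetical two-fold dilution series: cells plated and spots counted.
cells = [1000.0, 500.0, 250.0, 125.0]
spots = [20.0, 10.0, 5.0, 3.0]

for name, fn in [("LS", least_squares), ("ME", mean_approach),
                 ("MD", median_approach), ("POI", poisson_approach)]:
    print(name, fn(cells, spots))
```

If Y_k is modelled as Poisson with mean βX_k, the ratio of totals used in `poisson_approach` is the maximum likelihood estimate of β, which is one way to read the name of that estimator.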

Elispot Data Processing: Simulations Without Staining Error on Top Dilution Levels
[Figure: panels IgG: ARE, IgG: PSAE, HA: ARE and HA: PSAE, comparing the MD, ME, LS, RLS, POI and RPOI estimators over settings from 1000 to 20000 (IgG) and from 20 to 400 (HA).]
Figure: ARE, average relative error; PSAE, proportion of runs, among 500, in which the method yields the estimate with the smallest absolute error.

Elispot Data Processing: Simulations With Staining Error on Top Dilution Levels
[Figure: panels IgG: ARE, IgG: PSAE, HA: ARE and HA: PSAE, comparing the MD, ME, LS, RLS, POI and RPOI estimators over settings from 1000 to 20000 (IgG) and from 20 to 400 (HA).]
Figure: ARE, average relative error; PSAE, proportion of runs, among 500, in which the method yields the estimate with the smallest absolute error.

Elispot Data Processing: Recommendation
The simulation study shows:
- The Poisson approach performs best regardless of probable staining errors associated with the less diluted samples.
In practice, we recommend:
- Remove probable staining errors associated with the less diluted samples
- Use the Poisson approach to estimate cell counts from ELISPOT assays

Elisa & Luminex Data Processing: Standard Approach
Two-step approach for statistical processing:
1. Fit a four-parameter logistic curve to the standard data:
   \[ y = \frac{a - d}{1 + (x/c)^{b}} + d + \varepsilon \]
   where y = optical density, x = concentration, a = maximum response, d = minimum response, c = ED50 and b = slope-like parameter.
2. Calibrate the unknown concentrations from the standard curve:
   - Estimates for diluted samples are scaled back to the original scale.
   - The scaled values are averaged to obtain an estimated concentration for the unknown sample.
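The second step can be sketched as follows, assuming the four parameters have already been fitted. The parameter values and readings are hypothetical, and "scaling back" is taken to mean multiplying the calibrated concentration by the dilution factor.

```python
def four_pl(x, a, d, c, b):
    """Mean response of the four-parameter logistic curve at concentration x > 0."""
    return (a - d) / (1.0 + (x / c) ** b) + d

def inverse_four_pl(y, a, d, c, b):
    """Concentration whose mean response is y; y must lie strictly between d and a."""
    if not min(a, d) < y < max(a, d):
        raise ValueError("response outside the range of the standard curve")
    return c * ((a - y) / (y - d)) ** (1.0 / b)

def sample_concentration(readings, a, d, c, b):
    """Average of back-scaled estimates; readings are (response, dilution factor) pairs."""
    scaled = [inverse_four_pl(y, a, d, c, b) * dilution for y, dilution in readings]
    return sum(scaled) / len(scaled)

# Hypothetical fitted parameters; b < 0 makes the response rise with concentration.
params = (3.0, 0.1, 10.0, -1.2)  # a, d, c, b

# Hypothetical sample at concentration 2000, read after 100-fold and 200-fold dilution.
readings = [(four_pl(20.0, *params), 100.0), (four_pl(10.0, *params), 200.0)]
print(sample_concentration(readings, *params))
```

Responses at or beyond the fitted asymptotes cannot be inverted, which is why `inverse_four_pl` rejects them instead of returning a value.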

Elisa & Luminex Data Processing: Numerical Problems
Nonlinear maximization:
- Nonlinear maximization requires specification of initial parameter values.
- The choice of initial values may influence convergence of the estimation algorithm, in the worst case yielding no convergence.
Initial values can be obtained by linearization:
- Initial value for a: maximum y
- Initial value for d: minimum y
- Transform the model to
  \[ \log\frac{a - y}{y - d} = b\{\log x - \log c\} \]
  Initial values for b and c can then be obtained by linear regression.
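A sketch of the linearization follows. One detail is an assumption rather than part of the slide: with a set exactly to the maximum y and d to the minimum y, the log ratio is undefined at those two points, so the defaults below widen the range by a small margin. The standard data are hypothetical.

```python
import math

def initial_values(xs, ys, a0=None, d0=None, margin=0.05):
    """Starting values (a, d, c, b) for a 4PL fit via the linearized model."""
    span = max(ys) - min(ys)
    if a0 is None:
        a0 = max(ys) + margin * span  # assumption: pad so the log stays finite
    if d0 is None:
        d0 = min(ys) - margin * span
    # log((a - y) / (y - d)) = b * log(x) - b * log(c)
    pts = [(math.log(x), math.log((a0 - y) / (y - d0)))
           for x, y in zip(xs, ys) if x > 0 and d0 < y < a0]
    if len(pts) < 2:
        raise ValueError("need at least two usable standard points")
    mu = sum(u for u, _ in pts) / len(pts)
    mv = sum(v for _, v in pts) / len(pts)
    slope = (sum((u - mu) * (v - mv) for u, v in pts)
             / sum((u - mu) ** 2 for u, _ in pts))
    intercept = mv - slope * mu
    b0 = slope
    c0 = math.exp(-intercept / b0)  # intercept = -b * log(c)
    return a0, d0, c0, b0

def four_pl(x, a, d, c, b):
    """Noise-free 4PL response, used here only to make example data."""
    return (a - d) / (1.0 + (x / c) ** b) + d

# Hypothetical standard curve generated from a = 3, d = 0.1, c = 10, b = -1.2.
xs = [1.0, 3.0, 10.0, 30.0, 100.0]
ys = [four_pl(x, 3.0, 0.1, 10.0, -1.2) for x in xs]
print(initial_values(xs, ys))
```

The returned values are only starting points for the nonlinear fit; with padded asymptotes they will not equal the true parameters even for noise-free data.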

Elisa & Luminex Data Processing: Examples of Fitted Curves
[Figure: fitted standard curves, optical density against concentration, in four panels: Plate14 T13 IgG Total 500K 1M; Plate17 T13 IgG Total 50K 100K; Plate20 T13 IgG Total 10M 50M; Plate21 T13 IgG Total 2.5M 5M. Legend: standard; test: Outlier; test: Normal. One panel is annotated "Below lower limit".]

Elisa & Luminex Data Processing: Examples of Estimated Concentrations
[Figure: estimated concentrations of IgG Total for T1307C 03 by plate and dilution factor (17*50000, 17*1e+05, 14*5e+05, 14*1e+06, 21*2500000, 21*5e+06, 20*1e+07), shown as calibrated and in the original scale.]

Hemagglutination Data Processing: Examples of Raw Data
[Figure: positive wells (%) across dilutions 1 to 6 for samples 10053L, 10072L, 10073L and 10078L.]

Hemagglutination Data Processing: Classical Approach
Reed-Muench method: a linear interpolation with the formulas
\[ I = \frac{(\%\ \text{infected at dilution immediately above}\ 50\%) - 50\%}{(\%\ \text{infected at dilution immediately above}\ 50\%) - (\%\ \text{infected at dilution immediately below}\ 50\%)} \]
\[ \mathrm{ED}_{50} = 10^{\,\log(\text{total dilution immediately above}\ 50\%)\; -\; I \cdot d} \]
where
- I = interpolated value of the 50% endpoint
- d = log of the dilution factor (i.e., the difference between the log dilution intervals)
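The interpolation can be sketched as below. It assumes base-10 logs, that the I·d term is subtracted from the log dilution (the usual form of the method), and that the two bracketing percentages have already been identified; the numbers are hypothetical.

```python
def reed_muench_ed50(log10_dilution_above, pct_above, pct_below, log10_dilution_factor):
    """50% endpoint by linear interpolation between the two bracketing dilutions.

    log10_dilution_above:  log10 of the dilution with response just above 50%
    pct_above, pct_below:  % positive just above and just below 50%
    log10_dilution_factor: log10 of the fold change between dilution steps
    """
    if not pct_below < 50.0 <= pct_above:
        raise ValueError("the two percentages must bracket 50%")
    interpolated = (pct_above - 50.0) / (pct_above - pct_below)
    return 10.0 ** (log10_dilution_above - interpolated * log10_dilution_factor)

# Hypothetical ten-fold series: 75% positive at 10^-3 and 25% positive at 10^-4.
print(reed_muench_ed50(-3.0, 75.0, 25.0, 1.0))
```

With these numbers I = 0.5, so the endpoint falls halfway between the two dilutions on the log scale, at 10^-3.5.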

Hemagglutination Data Processing: Drawbacks of the Classical Approach
- Uses information from only two points around the potential titer: loss of information
- Assumes a linear dose-response relationship: open to question
- Inefficient in both precision and accuracy

Hemagglutination Data Processing: Better Approach
Four-parameter logistic (4PL) regression:
\[ y = \frac{a - d}{1 + (x/c)^{b}} + d + \varepsilon \]
where y = response, x = concentration, a = maximum response, d = minimum response, c = ED50 and b = slope-like parameter.
The response could be:
- % positive responses
- Arcsine square root transformation of % positive responses
Numerical problems can be dealt with as described for the Elisa & Luminex standard curves.

Hemagglutination Data Processing: Examples of Fitted Curves
[Figure: positive wells (%) across dilutions 1 to 6 for samples 10053L, 10072L, 10073L and 10078L, with the RM, 4PL and 4PL(AST) results overlaid.]

BLIS: a Platform Combining Data Management and Statistical Processing & Analysis
Bio-Lab Informatics Server (BLIS):
- Standardize R code for statistical processing of data from Elispot, Elisa and Hemagglutination assays.
- Incorporate the R code into automated routines within our customized BLIS application.
- Immunologists can use the system for data management, visual exploration and statistical processing, and to generate up-to-date reports.

Reference
Yang, H., Topham, D.J., Holden-Wiltse, J. and Wu, H. (2012). Statistical Estimation & Inference of Cell Counts from ELISPOT Limiting Dilution Assays. Journal of Biopharmaceutical Statistics. Accepted.

Acknowledgement
- NIAID: UR-CBIM (HHSN272201000055C)
- NIAID: UR-NYICE (HHSN266200700008C)
- David Topham's Lab
- Martin Zand's Lab
- Tim Mosmann's Lab
- Andrea Sant's Lab