Analysis of left-censored multiplex immunoassay data: A unified approach


1 / 41 Analysis of left-censored multiplex immunoassay data: A unified approach
Elizabeth G. Hill, Medical University of South Carolina
Elizabeth H. Slate, Florida State University
FSU Department of Statistics Seminar Series, 20 September 2013

2 / 41 Talk Outline
- Multiplex immunoassays
- Problem statement
- Bayesian hierarchical model
- Simulation study
- Application
- Discussion

3 / 41 An immunoassay is...
- A chemical test used to measure the concentration of a molecule in solution
- Target molecule (often a protein) is called an analyte
- Analyte is captured by an analyte-specific antibody
- Measurable label (e.g. intensity of fluorescent signal) facilitates quantitation
[Diagram labels: Label, Analyte, Antibody]

4 / 41 Conventional uses
- Detect presence (absence) of a protein
  - Pregnancy test
  - Steroid use
  - Goal: Is analyte X detectable in this subject?
- Measure the concentration of a protein
  - Blood panel
  - Viral load (HIV+ patients)
  - Goal: What is the concentration of analyte X for this subject?
- Emphasis is on analyte measurement for an individual

5 / 41 Multiplex immunoassay
- Simultaneous quantitation of a panel of analytes
- Commonly used as a biomarker discovery tool
- Emphasis on group-level comparisons
- Goal
  - Is analyte X associated with disease status?
  - Does analyte X predict disease status?

6 / 41 Multiplex immunoassay
- All analytes accommodated on a single 96-well plate
- k-plex, k = 5-30 (typically)
- Multiple plates used to facilitate large sample sizes
- Each well detects all k analytes
  - Standards: serially diluted known analyte concentrations
  - Blanks: no analyte
  - Patient samples: unknown analyte concentration

7 / 41 Workflow
For every analyte on each plate:
1. Place 8-10 duplicate standard concentrations
2. Place duplicate blanks
3. Use remaining wells for duplicate patient samples (serum, urine, saliva)
4. Measure signal intensity (response)
5. Calculate average response for blanks
6. Correct standard and patient responses by subtracting the average blank response
7. Fit a standard curve to paired known standard concentrations and background-corrected (BgC) responses
8. Average the patient BgC responses
9. Back-fit average BgC patient response values to obtain estimates of analyte concentrations
10. Conduct group-level inference
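Steps 5, 6, and 8 of the workflow are simple arithmetic and can be sketched as follows. This is a minimal illustration, not the software used by the authors; the function names and the intensity values are hypothetical.

```python
from statistics import mean


def background_correct(responses, blank_responses):
    """Steps 5-6: subtract the average blank response from each raw response."""
    blank_mean = mean(blank_responses)
    return [r - blank_mean for r in responses]


def average_duplicates(bgc_responses):
    """Step 8: average the BgC responses from one sample's replicate wells."""
    return mean(bgc_responses)


# Hypothetical raw intensities for one analyte on one plate.
blanks = [10.0, 14.0]          # duplicate blank wells
patient_wells = [52.0, 48.0]   # duplicate wells for one patient sample

bgc = background_correct(patient_wells, blanks)
patient_bgc_mean = average_duplicates(bgc)
```

The averaged BgC value is what step 9 back-fits through the standard curve.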

8 / 41 5-Parameter Logistic Model
Standard curve

$$f(x_{ijk} \mid \beta_{jk}) = d_{jk} + \frac{a_{jk} - d_{jk}}{\left[1 + \left(\dfrac{x_{ijk}}{c_{jk}}\right)^{b_{jk}}\right]^{g_{jk}}}$$

where $x_{ijk}$ is the $i$th known concentration for analyte $j$, plate $k$, and $\beta_{jk} = (a_{jk}, b_{jk}, c_{jk}, d_{jk}, g_{jk})$.
- $a_{jk}$: upper asymptote
- $d_{jk}$: lower asymptote
- $b_{jk}$: transition rate between asymptotes
- $c_{jk}$: concentration at inflection point
- $g_{jk}$: asymmetry parameter
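The 5-PL curve and its algebraic inverse (the back-fit in step 9 of the workflow) can be sketched as below. The parameter values are hypothetical, chosen only so that the curve increases with concentration (which, in this parameterization, requires a negative $b$). The `back_fit` helper and its `None` return for responses outside the asymptotes are illustrative assumptions, not the behavior of any particular assay software.

```python
def five_pl(x, a, b, c, d, g):
    """5-PL mean response at concentration x > 0."""
    return d + (a - d) / (1.0 + (x / c) ** b) ** g


def back_fit(y, a, b, c, d, g):
    """Invert the 5-PL curve: concentration whose mean response is y.

    Returns None when y is not strictly between the asymptotes d and a,
    because no concentration maps to such a response.
    """
    if not (d < y < a):
        return None
    ratio = (a - d) / (y - d)            # equals [1 + (x/c)^b]^g
    return c * (ratio ** (1.0 / g) - 1.0) ** (1.0 / b)


# Hypothetical parameters: upper asymptote a, lower asymptote d, b < 0.
params = dict(a=30000.0, b=-1.2, c=500.0, d=20.0, g=0.8)

response = five_pl(100.0, **params)
recovered = back_fit(response, **params)      # close to 100.0
out_of_range = back_fit(15.0, **params)       # below lower asymptote -> None
```

A BgC response that falls at or below the fitted lower asymptote has no inverse, which is one way an out-of-range result can arise.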

9 / 41 Fitting the 5-PL standard curve
- $y_{ijkl}$ = observed BgC response corresponding to the $l$th replicate of standard concentration $x_{ijk}$

$$y_{ijkl} \sim \mathrm{Normal}\big(f(x_{ijk} \mid \beta_{jk}),\ \mathrm{Var}(y_{ijkl})\big)$$

- Variance systematically related to mean
  - Noise in signal detectors is proportional to response magnitude
  - Kinetics associated with antibody binding
- Power-of-the-mean variance function

$$\mathrm{Var}(y_{ijkl}) = \sigma^2_{jk}\, f(x_{ijk} \mid \beta_{jk})^{2\theta}$$

- Curve fitting achieved using generalized least squares (GLS)
- GLS = weighted least squares with weights estimated by $\mathrm{Var}(y_{ijkl})^{-1}$
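The weighting step of GLS can be sketched as follows: given current fitted means, compute power-of-the-mean variances, invert them to get weights, and evaluate the weighted sum of squares that a curve fitter would minimize. This is only the objective, under assumed values of $\sigma^2$ and $\theta$; a full GLS fit would alternate between updating the curve parameters and recomputing the weights. All names and numbers are hypothetical.

```python
def pom_variance(mean_response, sigma2, theta):
    """Power-of-the-mean variance: sigma2 * mean^(2 * theta)."""
    return sigma2 * mean_response ** (2.0 * theta)


def gls_weights(fitted_means, sigma2, theta):
    """Weights are the reciprocals of the estimated variances."""
    return [1.0 / pom_variance(m, sigma2, theta) for m in fitted_means]


def weighted_sse(observed, fitted_means, weights):
    """Weighted sum of squared residuals minimized at each GLS iteration."""
    return sum(w * (y - m) ** 2
               for y, m, w in zip(observed, fitted_means, weights))


# Hypothetical fitted means and observed BgC responses for two standards.
fitted = [100.0, 1000.0]
observed = [110.0, 1100.0]

weights = gls_weights(fitted, sigma2=0.01, theta=1.0)
objective = weighted_sse(observed, fitted, weights)
```

With $\theta = 1$ the two residuals, each 10% of its mean, contribute equally to the objective even though their absolute sizes differ tenfold.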

10 / 41 Standard curve

11 / 41 Patient analyte concentration estimation

12 / 41 Sounds straightforward until you look at the data

Well      IL-1b    IL-2     IL-4     IL-5
B7,B8     *0.28    23.41    0.42     1.74
B9,B10    *0.61    51.61    1.07     3.94
B11,B12   OOR <    5.58     OOR <    OOR <
C1,C2     1.04     58.38    1.18     5.59
C3,C4     *0.79    63.17    1.66     5.69
C5,C6     *0.27    25.23    0.38     2.32
C7,C8     OOR <    8.46     OOR <    *0.45
C9,C10    *0.25    47.58    0.96     3.87
C11,C12   OOR <    OOR <    OOR <    OOR <
D1,D2     *0.19    27.14    0.52     2.17
D3,D4     OOR <    22.98    0.6      1.95
D5,D6     *0.48    46.6     0.95     4.44
D7,D8     10.07    8.21     OOR <    5.16
D9,D10    *0.71    50.53    1.24     3.8

13 / 41 Starred concentrations
- The estimated patient concentration lies below the smallest standard concentration
- The estimate is therefore extrapolated from the standard curve

14 / 41 OOR <
- Horizontal asymptote?
- Back-fit fails: concentration can't be estimated

15 / 41 What causes this problem?
- In the biomarker discovery context
  - Analytes' expected concentrations are uncertain
  - Results in poorly tuned dose-response curves
- Pervasive problem: the percent of analytes with extrapolated or out-of-range concentrations can range from 0% to more than 90%
- Occurs less frequently at the high end of the curve
- How can we analyze the data and address the investigator's research hypotheses?

16 / 41 Simple solution?
(Starred and OOR < values replaced by 0)

Well      IL-1b    IL-2     IL-4     IL-5
B7,B8     0        23.41    0.42     1.74
B9,B10    0        51.61    1.07     3.94
B11,B12   0        5.58     0        0
C1,C2     1.04     58.38    1.18     5.59
C3,C4     0        63.17    1.66     5.69
C5,C6     0        25.23    0.38     2.32
C7,C8     0        8.46     0        0
C9,C10    0        47.58    0.96     3.87
C11,C12   0        0        0        0
D1,D2     0        27.14    0.52     2.17
D3,D4     0        22.98    0.6      1.95
D5,D6     0        46.6     0.95     4.44
D7,D8     10.07    8.21     0        5.16
D9,D10    0        50.53    1.24     3.8

17 / 41 Simple solution?
(Same table as the previous slide, overlaid with the consequences of zero imputation)
- Induces bias
- Underestimates variability
- Inflates type I error

18 / 41 Other ideas?
- Use established methods for analysis of left-censored data
  - ML with left-censored values contributing the CDF at the censoring limit to the likelihood
  - Harrell's C-index: AUC equivalent for censored time-to-event data
- Where to censor?
  - Conventional approach: censor at LOD or LOQ
  - No established assay detection/quantitation limits
  - Would need to be determined for each analyte and plate
  - Previous efforts resulted in high rates of censored data, not an agreeable solution to the scientific investigator
- Shouldn't we account for uncertainty in curve and background estimation?
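For reference, the standard left-censored likelihood can be sketched as below. It assumes, purely for illustration, that log concentrations are normally distributed: a fully observed value contributes the density, and a value left-censored at a limit $c$ contributes $P(X \le c)$, the CDF at $c$. The function names are hypothetical, and this is the conventional approach the slide mentions, not the hierarchical model proposed in the talk.

```python
import math


def normal_logpdf(x, mu, sd):
    """Log density of Normal(mu, sd^2) at x."""
    z = (x - mu) / sd
    return -0.5 * z * z - math.log(sd) - 0.5 * math.log(2.0 * math.pi)


def normal_logcdf(x, mu, sd):
    """Log of P(X <= x) for X ~ Normal(mu, sd^2)."""
    return math.log(0.5 * (1.0 + math.erf((x - mu) / (sd * math.sqrt(2.0)))))


def censored_loglik(values, censored, mu, sd):
    """Log-likelihood with left-censoring.

    values[i] is the observed log concentration, or the censoring
    limit when censored[i] is True.
    """
    total = 0.0
    for v, is_censored in zip(values, censored):
        if is_censored:
            total += normal_logcdf(v, mu, sd)   # P(X <= limit)
        else:
            total += normal_logpdf(v, mu, sd)   # density at observed value
    return total


# One observed value and one value censored at the same limit (hypothetical).
loglik = censored_loglik([0.0, 0.0], [False, True], mu=0.0, sd=1.0)
```

Maximizing this function over `mu` and `sd` gives the censored-data ML estimates, but it still requires a censoring limit for each analyte and plate.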

19 / 41 A unified approach
Incorporate background estimation, standard curve fitting, and patient analyte concentration estimation into a single Bayesian hierarchical model

1. Background
$\delta_{jk}$ = observed background for analyte $j$, plate $k$

$$\log(\delta_{jk}) \sim \mathrm{Normal}(\mu_{jk}, \phi)$$

with

$$E(\delta_{jk}) = e^{\mu_{jk}}, \qquad \mathrm{Var}(\delta_{jk}) = \lambda_{jk} = e^{2\mu_{jk}}\, e^{\phi}\, (e^{\phi} - 1)$$

20 / 41 Background mean

21 / 41 Background distribution

22 / 41 Standard curve
2. Standard curve
$y_{ijkl}$ = observed (not BgC) response corresponding to the $l$th replicate of $x_{ijk}$, the $i$th known concentration for analyte $j$, plate $k$

$$y_{ijkl} \mid \mu_{jk} \sim \mathrm{Normal}\big(f(x_{ijk} \mid \beta_{jk}) + e^{\mu_{jk}},\ \tau_{ijk}\big)$$

where

$$\tau_{ijk} = \sigma^2_{jk}\, f(x_{ijk} \mid \beta_{jk})^{2\theta}$$

23 / 41 Patient data
3. Patient analyte concentrations
$\nu_{hj}$ = patient $h$'s true (latent) concentration of analyte $j$

$$\log(\nu_{hj}) \sim \mathrm{Normal}(\gamma_{0j} + \gamma_{1j} D_h,\ \eta_j)$$

where

$$D_h = \begin{cases} 1, & \text{patient } h \text{ is disease positive} \\ 0, & \text{patient } h \text{ is disease negative.} \end{cases}$$

$e^{\gamma_{1j}}$ = fold-change comparing Ds+ to Ds- patients
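To see why $e^{\gamma_{1j}}$ is a fold-change: $\gamma_{1j}$ is the difference in mean log concentration between disease-positive and disease-negative patients, so exponentiating gives a ratio of geometric means. The sketch below computes that ratio from fully observed concentrations. It is a plain plug-in calculation with hypothetical values, not the posterior inference used in the talk, where the concentrations are latent.

```python
import math
from statistics import mean


def fold_change(disease_conc, control_conc):
    """exp(gamma_1): ratio of geometric mean concentrations, Ds+ over Ds-."""
    log_disease = [math.log(v) for v in disease_conc]
    log_control = [math.log(v) for v in control_conc]
    gamma1 = mean(log_disease) - mean(log_control)
    return math.exp(gamma1)


# Hypothetical concentrations: geometric means are 4 (disease) and 2 (control).
fc = fold_change([2.0, 8.0], [1.0, 4.0])
```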

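24 / 41 Patient data
4. Patient response values
$z_{hjkl}$ = observed (not BgC) response corresponding to the $l$th replicate of patient $h$'s true concentration for analyte $j$, plate $k$

$$z_{hjkl} \mid \mu_{jk} \sim \mathrm{Normal}\big(f(\nu_{hj} \mid \beta_{jk}) + e^{\mu_{jk}},\ \xi_{hjk}\big)$$

where

$$\xi_{hjk} = \sigma^2_{jk}\, f(\nu_{hj} \mid \beta_{jk})^{2\theta}\, \mathbf{1}(\nu_{hj} \ge x_{1jk}) + \lambda_{jk}\, \mathbf{1}(\nu_{hj} < x_{1jk})$$

and $x_{1jk}$ is the smallest standard concentration for analyte $j$, plate $k$.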
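The switch in the patient response variance can be sketched as follows: at or above the smallest standard concentration the power-of-the-mean variance applies, and below it the background variance $\lambda_{jk}$ takes over. The function name, argument layout, and parameter values are hypothetical; the 5-PL parameters are ordered $(a, b, c, d, g)$ as in $\beta_{jk}$.

```python
import math


def five_pl(x, a, b, c, d, g):
    """5-PL mean response at concentration x > 0."""
    return d + (a - d) / (1.0 + (x / c) ** b) ** g


def patient_response_moments(nu, x_min, beta, mu, sigma2, theta, lam):
    """Mean and variance of a raw (not BgC) patient response.

    nu    : latent analyte concentration
    x_min : smallest standard concentration on the plate
    lam   : background variance, used when nu < x_min
    """
    signal = five_pl(nu, *beta)
    mean_response = signal + math.exp(mu)      # curve plus background mean
    if nu >= x_min:
        variance = sigma2 * signal ** (2.0 * theta)
    else:
        variance = lam
    return mean_response, variance


# Hypothetical values for one analyte on one plate.
beta = (30000.0, -1.2, 500.0, 20.0, 0.8)
settings = dict(x_min=3.2, beta=beta, mu=3.0, sigma2=0.01, theta=1.0, lam=25.0)

above = patient_response_moments(100.0, **settings)
below = patient_response_moments(1.0, **settings)
```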

25 / 41 Simulation study - experimental design
- Three plates
- Ten analytes per plate (A - J)
- Ten duplicate standards for each analyte
- Two blanks per plate
- Duplicate samples for 15 disease and 15 control subjects per plate (total sample size = 45 disease and 45 control)

26 / 41 Simulation study - data description
- $\mathrm{Prob}(\nu_{hj} < x_{1jk} \mid \text{control}) = 0.2$ for all $j, k$
- $e^{\gamma_{1j}} = 1, 1.5, 2, 4,$ and $6$, respectively, for analytes A/F, B/G, C/H, D/I, and E/J
- For control subjects, SD(log conc) = 1 for all analytes
- For disease subjects, SD(log conc) = 1 for analytes A - E
- For disease subjects, SD(log conc) = 1.25 for analytes F - J
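One way to satisfy the 20% condition for controls is to solve for the control log-mean $\gamma_{0j}$: since $\log \nu_{hj}$ is normal with SD 1 in controls, $\gamma_{0j} = \log x_{1jk} - \Phi^{-1}(0.2) \cdot \mathrm{SD}$. The slides do not say how the simulation was calibrated, so the sketch below is an assumption, and the smallest standard concentration used is hypothetical.

```python
import math
from statistics import NormalDist


def control_intercept(x_min, sd_log, prob_below):
    """Log-mean gamma_0 such that P(nu < x_min) = prob_below in controls."""
    z = NormalDist().inv_cdf(prob_below)   # standard normal quantile
    return math.log(x_min) - z * sd_log


x_min = 3.2                                 # hypothetical smallest standard
gamma0 = control_intercept(x_min, sd_log=1.0, prob_below=0.2)

# Check: probability that a control's log concentration falls below log(x_min).
check = NormalDist(mu=gamma0, sigma=1.0).cdf(math.log(x_min))
```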

27 / 41 Simulation study - inference
- 25 independent simulations in WinBUGS
- Burn-in = 50K iterations with posterior inference chain length of 10K
- Constructed FC and AUC from posterior estimates of patient analyte concentrations
- Compared type I error, power, coverage, bias, and MSE to results using the standard workflow
  - Near-zero imputation for low extrapolated or out-of-range concentrations
  - Linear regression for FC estimation and inference: $H_0: \mathrm{FC} = 1$ vs $H_1: \mathrm{FC} \neq 1$
  - Empirical estimate of AUC with inference based on corresponding 95% CI: $H_0: \mathrm{AUC} = 0.5$ vs $H_1: \mathrm{AUC} > 0.5$
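The empirical AUC used in the standard workflow can be sketched as the proportion of disease-control pairs in which the disease subject has the higher concentration, with ties counted as one half (the Mann-Whitney form). The function name and data are hypothetical, and the confidence-interval step is omitted.

```python
def empirical_auc(disease, control):
    """Mann-Whitney estimate of P(disease value > control value)."""
    wins = 0.0
    for d in disease:
        for c in control:
            if d > c:
                wins += 1.0
            elif d == c:
                wins += 0.5        # ties count as half a win
    return wins / (len(disease) * len(control))


# Perfect separation, then a case where imputed zeros create ties.
auc_separated = empirical_auc([3.0, 4.0], [1.0, 2.0])
auc_with_ties = empirical_auc([0.0, 5.0], [0.0, 0.0])
```

In the second call the imputed zeros tie with each other, so half of the comparisons carry no ordering information.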

28 / 41 Results
[Figure: Analyte C, simulation 17. Axes: log(predicted conc) and log(true conc). Series: HM, near-zero imputation.]

29 / 41 Fold change results

30 / 41 Fold change results

31 / 41 Fold change results

32 / 41 Fold change results

33 / 41 AUC results

34 / 41 AUC results

35 / 41 AUC results

36 / 41 Head and neck cancer application
- Hypothesis: HPV16 infection affects tumor immunity, resulting in differences between patients with HPV-associated and non-associated head and neck cancer tumors.
- Serum samples from 15 HPV+, 18 HPV-, and 19 control patients
- Multiplex analysis of 9 cytokines and chemokines
  - Two plates
  - Eight standards per analyte and plate
  - Two background wells per analyte per plate
  - All patient samples measured in duplicate
- Percent of estimated concentrations below minimum standard (based on standard workflow) ranges from 0% to 41%
- FC and AUC estimated using hierarchical model

37 / 41 HNCa results
- IL6 was 3.4 times higher in HPV- samples relative to control (FC = 3.4, 95% CI = 1.6 to 7.9).
- Cytokines significantly predictive of HPV-associated and non-associated HNCa, based on AUC posterior inference.

AUC (95% CI)
Analyte   HPV-:Ctl            HPV-:HPV+           HPV+:Ctl
GCSF      0.70 (0.56, 0.83)   0.54 (0.39, 0.68)   0.66 (0.51, 0.79)
IL4       0.58 (0.49, 0.67)   0.64 (0.54, 0.73)   0.45 (0.33, 0.56)
IL6       0.79 (0.75, 0.83)   0.67 (0.63, 0.71)   0.65 (0.59, 0.70)
IL12      0.61 (0.54, 0.68)   0.60 (0.53, 0.69)   0.51 (0.41, 0.60)
MCP1      0.66 (0.61, 0.70)   0.59 (0.54, 0.64)   0.54 (0.48, 0.60)
MIP1b     0.52 (0.48, 0.56)   0.46 (0.42, 0.51)   0.55 (0.51, 0.60)
TNFa      0.62 (0.55, 0.68)   0.58 (0.50, 0.66)   0.55 (0.47, 0.62)

38 / 41 Summary
- What we've learned so far
  - If the goal is biomarker discovery, near-zero imputation is a bad idea
  - FC estimation and inference have good accuracy
  - AUC estimation and inference seem slightly conservative
- What's left
  - Additional simulations
  - Compare to censored data approaches

39 / 41 Future directions
- Multivariate distribution for analyte concentrations
- Design considerations: number of background replicates, number of plates (sample size)
- Flexible dose-response curve fitting, e.g. splines
- Hopefully the next grant!

40 / 41 Acknowledgements NIDCR/NIH R03 DE021775-01A1

41 / 41 Thank you!