Kelvin Chan Feb 10, 2015

Similar documents
A COMPARISON OF IMPUTATION METHODS FOR MISSING DATA IN A MULTI-CENTER RANDOMIZED CLINICAL TRIAL: THE IMPACT STUDY

Appendix 1. Sensitivity analysis for ACQ: missing value analysis by multiple imputation

Advanced Bayesian Models for the Social Sciences

SCHIATTINO ET AL. Biol Res 38, 2005, Disciplinary Program of Immunology, ICBM, Faculty of Medicine, University of Chile. 3

S Imputation of Categorical Missing Data: A comparison of Multivariate Normal and. Multinomial Methods. Holmes Finch.

Bayesian Inference Bayes Laplace

Combining Risks from Several Tumors Using Markov Chain Monte Carlo

Bayesian Estimation of a Meta-analysis model using Gibbs sampler

Advanced Bayesian Models for the Social Sciences. TA: Elizabeth Menninga (University of North Carolina, Chapel Hill)

Bias in regression coefficient estimates when assumptions for handling missing data are violated: a simulation study

Multilevel IRT for group-level diagnosis. Chanho Park Daniel M. Bolt. University of Wisconsin-Madison

Bayesian Logistic Regression Modelling via Markov Chain Monte Carlo Algorithm

Valuing health using visual analogue scales and rank data: does the visual analogue scale contain cardinal information?

Statistical Audit. Summary. Conceptual and. framework. MICHAELA SAISANA and ANDREA SALTELLI European Commission Joint Research Centre (Ispra, Italy)

Applications with Bayesian Approach

Mediation Analysis With Principal Stratification

Analyzing data from educational surveys: a comparison of HLM and Multilevel IRT. Amin Mousavi

Ordinal Data Modeling

Method Comparison for Interrater Reliability of an Image Processing Technique in Epilepsy Subjects

An Introduction to Multiple Imputation for Missing Items in Complex Surveys

Methods for Computing Missing Item Response in Psychometric Scale Construction

Bayesian Statistics Estimation of a Single Mean and Variance MCMC Diagnostics and Missing Data

Running Head: BAYESIAN MEDIATION WITH MISSING DATA 1. A Bayesian Approach for Estimating Mediation Effects with Missing Data. Craig K.

Using Test Databases to Evaluate Record Linkage Models and Train Linkage Practitioners

Bayesian Model Averaging for Propensity Score Analysis

Bayesian Joint Modelling of Benefit and Risk in Drug Development

Accommodating informative dropout and death: a joint modelling approach for longitudinal and semicompeting risks data

A Brief Introduction to Bayesian Statistics

Inferring relationships between health and fertility in Norwegian Red cows using recursive models

How is the most severe health state being valued by the general population?

WinBUGS : part 1. Bruno Boulanger Jonathan Jaeger Astrid Jullion Philippe Lambert. Gabriele, living with rheumatoid arthritis

ST440/550: Applied Bayesian Statistics. (10) Frequentist Properties of Bayesian Methods

Section on Survey Research Methods JSM 2009

A non-linear beta-binomial regression model for mapping EORTC QLQ- C30 to the EQ-5D-3L in lung cancer patients: a comparison with existing approaches

Missing Data and Imputation

Citation for published version (APA): Ebbes, P. (2004). Latent instrumental variables: a new approach to solve for endogeneity s.n.

An Introduction to Bayesian Statistics

SLAUGHTER PIG MARKETING MANAGEMENT: UTILIZATION OF HIGHLY BIASED HERD SPECIFIC DATA. Henrik Kure

Logistic Regression with Missing Data: A Comparison of Handling Methods, and Effects of Percent Missing Values

Introduction to Bayesian Analysis 1

A Bayesian Measurement Model of Political Support for Endorsement Experiments, with Application to the Militant Groups in Pakistan

Catherine A. Welch 1*, Séverine Sabia 1,2, Eric Brunner 1, Mika Kivimäki 1 and Martin J. Shipley 1

Hakan Demirtas, Anup Amatya, Oksana Pugach, John Cursio, Fei Shi, David Morton and Beyza Doganay 1. INTRODUCTION

Hierarchical Bayesian Modeling of Individual Differences in Texture Discrimination

George B. Ploubidis. The role of sensitivity analysis in the estimation of causal pathways from observational data. Improving health worldwide

Comparison of Value Set Based on DCE and/or TTO Data: Scoring for EQ-5D-5L Health States in Japan

Bayesian Mediation Analysis

Statistical Analysis with Missing Data. Second Edition

Detection of Unknown Confounders. by Bayesian Confirmatory Factor Analysis

An application of a pattern-mixture model with multiple imputation for the analysis of longitudinal trials with protocol deviations

Adjusting for mode of administration effect in surveys using mailed questionnaire and telephone interview data

Bayesian Joint Modelling of Longitudinal and Survival Data of HIV/AIDS Patients: A Case Study at Bale Robe General Hospital, Ethiopia

Bayesian Analysis of Between-Group Differences in Variance Components in Hierarchical Generalized Linear Models

LIHS Mini Master Class

Statistical data preparation: management of missing values and outliers

Bayesian and Frequentist Approaches

Master thesis Department of Statistics

A Comparison of Methods of Estimating Subscale Scores for Mixed-Format Tests

Jinhui Ma 1,2,3, Parminder Raina 1,2, Joseph Beyene 1 and Lehana Thabane 1,3,4,5*

Statistical Tolerance Regions: Theory, Applications and Computation

Introduction. Patrick Breheny. January 10. The meaning of probability The Bayesian approach Preview of MCMC methods

Introduction to Survival Analysis Procedures (Chapter)

2014 Modern Modeling Methods (M 3 ) Conference, UCONN

Decomposition of the Genotypic Value

Georgetown University ECON-616, Fall Macroeconometrics. URL: Office Hours: by appointment

TRIPODS Workshop: Models & Machine Learning for Causal I. & Decision Making

Statistical Analysis Using Machine Learning Approach for Multiple Imputation of Missing Data

On Test Scores (Part 2) How to Properly Use Test Scores in Secondary Analyses. Structural Equation Modeling Lecture #12 April 29, 2015

Hierarchical Bayesian Methods for Estimation of Parameters in a Longitudinal HIV Dynamic System

SESUG Paper SD

MS&E 226: Small Data

Analysis of left-censored multiplex immunoassay data: A unified approach

Using Discrete Choice Experiments with duration to model EQ-5D-5L health state preferences: Testing experimental design strategies

Data Analysis Using Regression and Multilevel/Hierarchical Models

Missing data in medical research is

Response to Comment on Cognitive Science in the field: Does exercising core mathematical concepts improve school readiness?

Bayesian Hierarchical Models for Fitting Dose-Response Relationships

Sequential nonparametric regression multiple imputations. Irina Bondarenko and Trivellore Raghunathan

The matching effect of intra-class correlation (ICC) on the estimation of contextual effect: A Bayesian approach of multilevel modeling

15D: Strengths, weaknesses and future development

Vision as Bayesian inference: analysis by synthesis?

Analyzing diastolic and systolic blood pressure individually or jointly?

BayesRandomForest: An R

No Large Differences Among Centers in a Multi-Center Neurosurgical Clinical Trial

Econometrics II - Time Series Analysis

Accuracy of Range Restriction Correction with Multiple Imputation in Small and Moderate Samples: A Simulation Study

Regional variation in health state preferences Does it exist, and does it matter?

Multiple trait model combining random regressions for daily feed intake with single measured performance traits of growing pigs

ethnicity recording in primary care

STATISTICAL INFERENCE 1 Richard A. Johnson Professor Emeritus Department of Statistics University of Wisconsin

A test of quantitative genetic theory using Drosophila effects of inbreeding and rate of inbreeding on heritabilities and variance components #

(Girton College) Shadows of the Mind: Using Discrete Decision Tasks to Extract Mental Representations.

A Bayesian approach for constructing genetic maps when markers are miscoded

Running head: VARIABILITY AS A PREDICTOR 1. Variability as a Predictor: A Bayesian Variability Model for Small Samples and Few.

How do we combine two treatment arm trials with multiple arms trials in IPD metaanalysis? An Illustration with College Drinking Interventions

Russian Journal of Agricultural and Socio-Economic Sciences, 3(15)

Outline. Hierarchical Hidden Markov Models for HIV-Transmission Behavior Outcomes. Motivation. Why Hidden Markov Model? Why Hidden Markov Model?

Score Tests of Normality in Bivariate Probit Models

Modelling heterogeneity variances in multiple treatment comparison meta-analysis Are informative priors the better solution?

BAYESIAN HYPOTHESIS TESTING WITH SPSS AMOS

Transcription:

Underestimation of Variance of Predicted Mean Health Utilities Derived from Multi- Attribute Utility Instruments: The Use of Multiple Imputation as a Potential Solution. Kelvin Chan Feb 10, 2015

Outline Multi-attribute utility instruments (MAUI) Scoring algorithms and prediction errors Problems with variance underestimation Influence on minimal clinically important differences Solution using multiple imputation Proof-of-concept using the US EQ-5D-3L 2

Multi-attribute utility instruments MAUIs include EQ-5D, SF-6D, HUI-3 Valuation studies conducted on sample of general population Developed using regression models Indirect measure of health utility, recommended in the UK, Australia and Canada 3

MAUI Example EQ-5D 4 Figure 1. EQ-5D example from http://diabetesclinicevaluation.weebly.com/uploads/9/5/6/7/9567609/6029985.jpg?633

MAUI Valuation Studies Scoring algorithms are country-specific and are developed in a single valuation study. In a valuation study, respondents are asked to assign utilities to health states using time trade-off (TTO), discrete choice or standard gamble tasks 5

6 MAUI Valuation Studies (TTO)

7 Time Trade-Off (TTO)

EQ-5D scoring algorithm TTO i = 1 disutility i disutility = Xβ Xβ = int+ β1( MO) + β2( SC) + β3( UA) + β4( PD) + β5( AD) ± Xβ ( miscell.) 8

Problems with variance underestimation Subject to prediction errors ± 0.0754 for EQ-5D ± 0.1655 for SF-6D 1 Influence on minimal clinically important differences 0.05 0.08 for EQ-5D 2 0.01 0.09 for SF-6D 3 9 1 Kharroubi S, O'Hagan A, Brazier JE. Estimating utilities from individual health preference data: A nonparametric bayesian method. Applied Statistics. 2005;54(5):879-895. 2 Le QA, Doctor JN, Zoellner LA, Feeny NC. Minimal clinically important differences for the EQ-5D and QWB-SA in post-traumatic stress disorder (PTSD): Results from a doubly randomized preference trial (DRPT). Health Qual Life Outcomes. 2013;11:59-7525-11-59. 3 Walters SJ, Brazier JE. Comparison of the minimally important difference for two health state utility measures: EQ-5D and SF-6D. Qual Life Res. 2005;14(6):1523-1532.

Problems with variance underestimation No method to account for variance of the estimated predicted mean health state utilities E.g. subjects with the same health states will always map to the same health utility without variation Cost-effectiveness analyses based on MAUI do not capture parameter uncertainty in quality-adjusted life years 10

Solutions 1. Full Bayesian analysis Using posterior predictive distributions of health states using original study data 4 Challenges: Requires implementation by the original authors Lack of raw data from valuation studies 11 4 Pullenayegum EM, Chan KKW, Feng X. EQ-5D health utilities are estimated subject to considerable uncertainty. Submitted under review.

Solutions 2. Multiple imputation Approximation to Bayesian treatment of parameter uncertainty, used to handle missing data 5,6 E.g. true mean utilities of each health state Replaces missing value with a set of plausible values that represent the uncertainty about the right value to impute 5 Rubin DB. Multiple imputation for non-response in surveys. John Wiley & Sons; 1987. 6 Schafer JL. Multiple imputation: A primer. Stat Methods Med Res. 1999;8(1):3-15. 12

Multiple Imputation MCMC method: The imputation I-step: draw values of Y i(mis) from a conditional distribution of Y i(mis) given Yi (obs) The posterior P-step: simulates the posterior population mean vector and covariance matrix from the complete sample estimates 13

Multiple Imputation Three phases: 1. Missing data are filled in m times to generate m complete data sets 2. The m complete data sets are analyzed by using standard procedures 3. The results from the m complete data sets are combined for the inference 14

Multiple Imputation And B be the between-imputation variance 15

Multiple Imputation The total variance is: 1 T = U + ( 1+ ) B m 16

17 Multiple imputation method.

Solutions Multiple Imputation Sets are drawn from posterior predictive distributions only once If made public, sets can be used in future studies Removes requirement for full Bayesian analysis each time an MAUI is used 18

Aims Using the US EQ-5D valuation study, Demonstrate that multiple imputation can correct underestimation of variance of mean health utilities 19

Methods US EQ-5D valuation study 7 : N = 3,773, 42 health states 2 dummy independent variables Uses D1 model Using data from above study, create Bayesian mixed effect model 20 7 Shaw JW, Johnson JA, Coons SJ. US valuation of the EQ-5D health states: Development and testing of the D1 valuation model. Med Care. 2005;43(3):203-220.

Bayesian mixed effect model Let i = respondents (3,773), j = health states (N = 42), X β = linear predictor of the US D1 model, which included the 2 dummy variables for each dimension (i.e. a total of 10 dummy th variables for 5 dimensions), I3, I2 and I2-squared, µ = time-trade-off utility of the i respondent and the th j health state. µ = 1 ( fixed + random ) ij fixed ij random = ij X ij β = u i ij + δ j ij where u i represents the random effects for respondents ( ui ~ N(0, τ responents ) ), and δ j represents the random effects for health states ( δ j ~ N (0, τ states ) ). ij 21

Bayesian mixed effect model Take posterior predictive joint distributions of the mean utility to Capture parameter uncertainty Perform multiple imputations (imputed sets drawn randomly from the Gibbs sampler) 22

Full Bayesian Analysis Analysis conducted using US EQ-5D study, divided into 2 sets Derivation set: N = 3,273 Application set: N = 500 Derivation set used D1 model, which was fitted to Bayesian mixed effect model Obtained posterior predictive distribution of the mean utility attached to each health state Applied to application set to compute mean and variance 23

Multiple imputation Multiple imputation with Monte Carlo Markov chain models Performed using derivation set Randomly drawn multiple imputed sets applied to application dataset Mean, variance and standard error across imputed sets calculated using Rubin s rule 10 10 Rubin DB. Multiple imputation for non-response in surveys. John Wiley & Sons; 1987. 24

25 Methods to illustrate that multiple imputation can be used to correct for the underestimation of variance of mean health utilities of the sample.

Results Comparisons of the sample mean and sample standard error of the mean health utility of the application set (N = 500) based on (i) the regression coefficients (i.e. scoring algorithm), (ii) the full Bayesian model s posterior predictive distribution and multiple imputation Traditional Method (based on coefficients) Multiple Imputation Full Bayesian Model Mean 0.873 0.876 0.873 Variance 5.30 10-5 1.23 10-4 1.19 10-4 SE 7.28 10-3 1.11 10-2 1.09 10-2 26

Results 95% of confidence intervals (CI)/credible regions (CR) of sample mean utility of the application set 27

Discussion Multiple imputation provides middle ground Researchers do not have to learn Bayesian methods Variance and standard error reflect appropriate degree of parameter uncertainty 28

Limitations Need original publishers of MAUI studies to apply this imputation method Some imputed value sets appear to be logically inconsistent Did not take population weights of US D1 model into account 29

Conclusions MI is a potential method to account for the underestimation of variance of predicted health utilities 30

Acknowledgement Eleanor Pullenayegum Andy Willan Wendy Lou 31

Simulation Study Used 100 simulated sets sampled from the 42 health states in the US EQ-5D valuation study Full Bayesian analysis convergence confirmed using methods by Raftery and Lewis 8 and Heidelberg and Welch 9 8 Raftery, A.E. and Lewis, S.M. (1992). One long run with diagnostics: Implementation strategies for Markov chain Monte Carlo. Statistical Science, 7, 493-497. 9 Heidelberger P and Welch PD. A spectral method for confidence interval generation and run length control in simulations. Comm. ACM. 24, 233-245 (1981) 32

Simulation Study Using the posterior predictive distributions of each of the 42 health states, 10 samples drawn randomly to construct 95% CI for mean utility 9 d.f. Compare the 100 CIs generated from simulation with those of the original study 33

Results Simulation study: >95% coverage of 95% CI 38/42 health states covered Table 1. Parameter estimates of the coefficients and 95% credible regions (CR) of the Bayesian mixed effects model using the dataset of US EQ-5D-3L valuation study. 34 D1 model parameter Coefficients 95% CR of coefficients MO2 0.157 0.120 0.197 MO3 0.526 0.468 0.586 SC2 0.178 0.146 0.211 SC3 0.443 0.391 0.496 UA2 0.136 0.093 0.183 UA3 0.351 0.309 0.394 PD2 0.174 0.141 0.208 PD3 0.503 0.460 0.547 AD2 0.159 0.124 0.193 AD3 0.413 0.372 0.454 D1-0.139-0.178-0.105 I2-squared 0.008 0.000 0.015 I3-0.107-0.168-0.047 I3-squared -0.009-0.019 0.000