A COMPARISON OF IMPUTATION METHODS FOR MISSING DATA IN A MULTI-CENTER RANDOMIZED CLINICAL TRIAL: THE IMPACT STUDY

Lingqi Tang 1, Thomas R. Belin 2, and Juwon Song 2
1 Center for Health Services Research, UCLA Neuropsychiatric Institute, Los Angeles, CA
2 Department of Biostatistics, UCLA School of Public Health, Los Angeles, CA
lqtang@ucla.edu

KEY WORDS: Multiple Imputation, Hot-deck Imputation, Mental Health

ABSTRACT: IMPACT is a multi-center randomized controlled trial of a disease management program for late life depression. Like many longitudinal clinical trials, this study faces problems of item nonresponse, unit-level nonresponse and dropout. In this paper, we compare two approaches to handling incomplete data. The first approach is based on hot-deck multiple imputation of missing responses, using a modified predictive mean matching method for item nonresponse (Bell, 1999) and the approximate Bayesian bootstrap for unit nonresponse (Lavori, Dawson and Shera 1995). In the second approach, we apply multiple imputation based on the multivariate normal model using SAS PROC MI software. The two methods, as well as complete-case analysis, are compared in a simulation study. Overall, the hot-deck multiple imputation performed well, with good coverage rates for Monte Carlo means and for tests of intervention effects. For dichotomous variables, multiple imputation under the multivariate normal model had lower coverage for three variables, which were derived from a highly skewed variable. Complete-case analysis, on the other hand, had low coverage of the Monte Carlo means for 47% of the variables, but good coverage of the intervention effects, because the biases in the intervention and control groups were in the same direction and cancelled.

1. THE IMPACT STUDY AND MISSING DATA

Project IMPACT (Improving Mood: Promoting Access to Collaborative Treatment for Late Life Depression) is a multi-center study to test the effectiveness of a new disease management model for late life depression. A total of 1,801 patients aged 60 and older with major depression or dysthymic disorder were enrolled in the study from 8 organizations. In each clinic, participants were randomly assigned to either the new treatment model or usual care. Usual care patients could use any primary care or specialty mental health care services available to them as usual. Intervention participants had access, for up to 12 months, to a depression care manager who was supervised by a psychiatrist and a primary care expert, in addition to the usual care component. After 12 months, all study participants continued with their regular primary care provider in care as usual. Telephone surveys were conducted over a two-year period to assess the effects of the new depression treatment model on various health and economic outcomes, as compared to the usual care delivery models in place at each site.

This paper uses IMPACT data from baseline and the 3 and 6 month follow-ups. Item-level missing rates are below 2% for most variables. The unit response rates to the 3 and 6-month follow-up assessments were 90.2% and 87.2% respectively. Missing rates in the control group are higher than in the intervention group (10.1% vs. 8.3% at Month-3 and 12.6% vs. 10.7% at Month-6), although these differences are not statistically significant. The missing data pattern is not monotone: some patients (2.6%) missed the Month-3 assessment but participated at Month-6 (Table 1). A total of 1,521 participants completed both follow-ups, 783 in the intervention group and 738 in the care-as-usual group.

Table 1. Unit Response Pattern Over Waves
Columns: Month; Response status; Overall (N=1801); Intervention (N=906); Control (N=895)
Month 3 rows: Responded, Missing, Dropout, Death
Month 6 rows: Responded, Missing, Dropout, Death (23*, 9*, 14*)
*: cumulative number including month 3
(The remaining cell values were not preserved in this transcription.)

To investigate whether unit-nonresponse behavior differs between the intervention and control groups, we tested the bivariate association between the outcome of response (coded 1 if response and 0 if nonresponse) and each independent variable in a large set of predictors. The independent variables included in the bivariate analyses were demographics, clinical characteristics and prior treatments at baseline. For the Month-6 analyses, we also included clinical characteristics and depression treatment variables at 3 months. A number of covariates are associated with nonresponse at 3 and 6 months in both the intervention and the control group, although the strength and statistical significance of the associations vary somewhat.

At 3 months, higher rates of nonresponse are associated with greater cognitive impairment at baseline in both groups. At 6 months, higher rates of nonresponse are associated with education level, functional impairment at baseline and quality of life at 3 months in both groups. There are, however, a number of covariates that are significantly associated with nonresponse in one group but not the other. For example, the nonresponse rates differ significantly across the organizations at both follow-ups in the intervention group but not in the control group. A baseline treatment preference for counseling rather than antidepressant medication is associated with lower nonresponse rates in the intervention group but not in the control group. This could be explained by the fact that the intervention offers free counseling to all participants. Nonresponders in the intervention group, but not in the control group, are more likely to state a treatment preference for "neither counseling nor psychotherapy". This might be explained by the fact that intervention patients are actively encouraged to try one of these two treatments, while this is less likely to be the case for the usual care controls. Similarly, nonresponders to both follow-ups in the intervention group have significantly lower rates of prior antidepressant use than responders. This is not the case in the usual care group. It might reflect the fact that in the intervention group, where patients were actively encouraged to try treatments such as antidepressants, those who do not wish to take such treatments are more likely to drop out than in the usual care group, where there is less such pressure. These findings indicate that unit nonresponse behavior in the IMPACT study differs between the intervention group and the usual care group, and that this behavior changes over time.

We chose mixed-effects models to conduct intent-to-treat analyses.
In the mixed-effects models, we specified the within-subject covariance structure as unstructured to account for the within-subject correlation over time. Mixed-effects models handle cases with incomplete follow-up, and they are known to provide unbiased estimates when the missing data are missing at random (Littell et al., 1996). However, this holds only conditional on the variables included in the model. If there are variables that are related to the missing data mechanism but not included in the mixed-effects model, the model may lead to biased estimates. In addition, such programs are designed to handle missing outcomes and, like many other regression programs, typically drop subjects from the analysis when any explanatory variable is missing.

2. IMPUTATION PROCEDURES

In this paper, we considered two multiple imputation methods to handle missing data. The first is referred to as the hot-deck multiple imputation procedure, since it uses a modified predictive mean matching method for item nonresponse (Bell, 1999; Little, 1988) and the approximate Bayesian bootstrap for unit nonresponse (Lavori, Dawson and Shera 1995). In the second method, we apply multiple imputation under a multivariate normal model using the missing data procedure (PROC MI) in SAS software.

HOT-DECK MULTIPLE IMPUTATION

For the hot-deck multiple imputation procedure, we dealt with the item-level missing data and unit-level missing data sequentially. We first impute the baseline data 5 times. We then conduct unit-level imputation for the 3 month follow-up data using information from the imputed baseline. The item-level imputation at the 3 month follow-up is then performed after the unit-level imputation. For the 6 month follow-up data, we conduct the unit-level imputation using information from the 5 imputed baseline and 3 month data sets, followed by the item-level imputation.
These sequential procedures continue through the end of the study, resulting in five imputed data sets at each follow-up.

The hot-deck imputation for item-level missing data modifies the predictive mean matching method (Bell, 1999). This method involves two steps: (1) forming imputation classes; and (2) drawing imputations at random from the observed data within each class. In order to impute variable Y, for instance, we select 20 variables, X1-X20, as predictors. In step 1, each missing value of Y is predicted using a multiple regression of Y on all of the independent variables that were observed for that case. Suppose that Respondent A was missing Y and X15 to X20. Respondent A's prediction would be based on a multiple linear regression fitted to people with complete data for Y and X1 to X14 (without regard to which of X15 to X20 were observed). Predicted values are then computed for everyone with complete data on X1 to X14 (everyone in the regression, Respondent A, and all others with the same missing data pattern as Respondent A). All cases are then sorted into equal-sized cells based on these predicted values. In step 2, a value of Y is imputed for Respondent A by choosing an observed value of Y at random (with replacement) from the cell into which Respondent A fell. To reflect the uncertainty of the donor cells, we created bootstrap weights before each hot-deck imputation. We then used the bootstrap weights in the weighted multiple regression as well as in the selection of donors. In each follow-up month, the imputed data sets differ only in the bootstrap weights and the seed of the random number generator used in the hot-deck imputation.

The hot-deck imputation for unit-level missing data is based on the approximate Bayesian bootstrap method, stratifying by propensity scores of response (Lavori et al., 1995). This method consists of two steps: (1) forming imputation cells by stratifying on propensity scores; (2) imputing within each cell using the approximate Bayesian bootstrap. Let $T$ be the number of follow-up waves and $z_t$ be the indicator for response to the $t$-th wave (coded 1 if response and 0 if nonresponse), $t = 1, 2, \ldots, T$. In step 1, we model the propensity of response to the $t$-th wave through baseline and prior waves:

$e_1(X_0) = \Pr(z_1 = 1 \mid X_0)$,
$\vdots$
$e_t(X_0, Y_1, \ldots, Y_{t-1}) = \Pr(z_t = 1 \mid X_0, Y_1, \ldots, Y_{t-1})$,

where $X_0$ is the baseline covariate vector and $Y_t$ is the outcome vector at the $t$-th wave. Given the stratification based on the quartiles of the propensity scores among the nonrespondents, in step 2 we use the approximate Bayesian bootstrap algorithm (Lavori, Dawson and Shera 1995) to select donors:
(a) Sample $n_{obs}$ (the number of respondents to the wave) values at random with replacement from the observed responses in an imputation cell. This forms a potential set of observed responses.
(b) Sample $n_{miss}$ (the number of nonrespondents to the wave) values with replacement from the potential observed sample from (a). The sampled $n_{miss}$ cases are the donors for the imputation.
(c) Replace the missing data with the observed data from the donors determined in (b).

MULTIPLE IMPUTATION BASED ON A MULTIVARIATE NORMAL MODEL

Multiple imputation based on the multivariate normal model assumes that the complete data follow a multivariate normal distribution (Schafer 1997). Here we chose the SAS MI procedure to generate five multiply imputed data sets.
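To make the normal-model assumption concrete, the following is a minimal sketch of the imputation step for one incomplete record when the mean vector and covariance matrix are treated as given. It is not the SAS MI procedure: data augmentation also redraws the parameters at every iteration, which is omitted here, and the function name `draw_missing_given_observed` and the use of `NaN` to mark missing entries are illustrative assumptions.

```python
import numpy as np

def draw_missing_given_observed(row, mean, cov, rng):
    """Draw the missing entries of `row` (marked NaN) from their conditional
    normal distribution given the observed entries, for fixed mean and cov."""
    row = np.asarray(row, dtype=float)
    mean = np.asarray(mean, dtype=float)
    cov = np.asarray(cov, dtype=float)
    mis = np.isnan(row)
    obs = ~mis
    completed = row.copy()
    if not mis.any():
        return completed
    if obs.any():
        s_oo = cov[np.ix_(obs, obs)]
        s_mo = cov[np.ix_(mis, obs)]
        # Regression coefficients of the missing block on the observed block.
        coef = np.linalg.solve(s_oo, s_mo.T).T
        cond_mean = mean[mis] + coef @ (row[obs] - mean[obs])
        cond_cov = cov[np.ix_(mis, mis)] - coef @ s_mo.T
    else:
        # Nothing observed: fall back to the marginal distribution.
        cond_mean = mean[mis]
        cond_cov = cov[np.ix_(mis, mis)]
    # Symmetrize to guard against tiny floating-point asymmetries.
    cond_cov = (cond_cov + cond_cov.T) / 2.0
    completed[mis] = rng.multivariate_normal(cond_mean, cond_cov)
    return completed

if __name__ == "__main__":
    rng = np.random.default_rng(2024)
    mean = np.array([0.0, 0.0])
    cov = np.array([[1.0, 0.8], [0.8, 1.0]])
    print(draw_missing_given_observed([1.0, np.nan], mean, cov, rng))
```

In this toy case the missing second component is drawn from a normal distribution with mean 0.8 and variance 0.36, so repeated calls give different plausible values rather than a single predicted value.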
The IMPACT data contain both continuous and categorical variables. Even though categorical variables do not satisfy the normality assumption, multiple imputation may be robust when the amount of missing information is not large (Schafer, 1997). The percentage of missing values in each variable was quite low in the IMPACT study, and we used dummy codes for categorical variables. Several variables measured as counts had highly skewed, non-normal distributions. A log transformation was applied to those variables, and imputed values were transformed back to the original scale. Imputation under the normal distribution produces imputed values on a continuous scale, and we rounded those values to the possible range of each variable.

The IMPACT study measures outcome variables repeatedly over time. However, because there were only three measurement times (baseline, 3 months, and 6 months), we used a cross-sectional imputation model instead of a longitudinal one. In the imputation model, we treated each longitudinally measured variable as a set of separate variables. For example, the mean SCL-20 depression score was measured at baseline, three months, and six months and was treated as three different variables in the model. In some sense, this model may be more general than a longitudinal model, because it allows for all possible correlations among the variables.

3. APPLICATION

This section describes the implementation of the two imputation procedures and presents numeric results from the data analyses.

HOT-DECK MULTIPLE IMPUTATION

To conduct the unit-level imputation, we first fit multiple logistic regressions, stratified by intervention arm, to estimate the propensity scores of response. We started with a large set of independent variables to be considered for a logistic regression on the outcome of response (coded 1 if response and 0 if nonresponse).
In order to control for multiple comparisons, we entered into the modeling procedure only those variables whose bivariate association with response was marginally significant (P-value < 0.1). The final model included the predictors that were significant (P-value < .05) in at least one of the 5 multiply imputed data sets for either the intervention group or the care-as-usual group. We also forced in gender and two design variables: recruitment method (screening or referral) and site (7 dummy variables for 8 sites). To form the imputation cells, we used intervention status and gender as the primary stratification variables, because men and women may have different characteristics. Within each intervention-by-gender group, 4 imputation cells were formed based on the quartiles of the estimated propensity scores $e_t(X_0, Y_1, \ldots, Y_{t-1})$ among the nonresponse group (see Table 2). Averaged across the 5 imputed data sets, the propensity scores differ by 2 percentage points between the intervention group and the care-as-usual group (mean (SD) = .90 (.04) vs. .92 (.05) at Month-3, and .87 (.08) vs. .89 (.08) at Month-6). Table 2 shows the distribution of the possible matches in each imputation cell, stratified by intervention status, gender and the quartiles of the propensity scores.

Table 2. Imputation Cells Stratified by Gender and Propensity Scores: Possible Matches
Columns: Group; Sex; Quartile group; Month 3 (missing wave, observed); Month 6 (missing wave, observed)
Rows: UC (M, F); IV (M, F); All
(Cell values were not preserved in this transcription.)

MULTIPLE IMPUTATION BASED ON A MULTIVARIATE NORMAL MODEL

The imputation model based on the multivariate normal model contained 91 variables: an indicator variable for intervention status, 35 baseline characteristics, 20 outcome variables at baseline, 17 outcome variables at the Month-3 follow-up, and 18 outcome variables at the Month-6 follow-up. All variables had missing rates below 2%. The outcome variables at each time point comprised 8-9 primary outcome variables of main interest as well as 7-11 secondary outcome variables for later analysis. Five of the outcome variables of main interest were derived from three original variables. The model contained those three original variables, and the five outcome variables were derived from them after imputation. Two outcome variables (dysthymia and current employment status) were measured only at baseline, and major depression (SCID) was not measured at the 3 month follow-up, resulting in a different number of outcome variables at each time point. Some outcome variables were composite measures built from many item variables. Because a much larger model containing every item variable had problems with convergence of the EM algorithm, we did not consider item-level imputation here.

The convergence of data augmentation was checked with time-series plots and autocorrelation plots. The first thousand iterations were discarded as a burn-in period. We used a single chain, and the 5 imputed data sets were drawn one thousand iterations apart.
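The post-processing described earlier for the normal model (log-transforming skewed count variables before imputation, transforming imputed values back, and rounding to the possible range of each variable) can be sketched as follows. This is only an illustration under stated assumptions: the paper does not say how zero counts were handled, so `log(1 + x)` is assumed here, and the function names and the example bounds are hypothetical.

```python
import numpy as np

def to_normal_scale(counts):
    """Transform skewed counts before imputation.
    Assumption: log(1 + x), so that zero counts stay finite."""
    return np.log1p(np.asarray(counts, dtype=float))

def back_to_count_scale(imputed, lower, upper):
    """Back-transform imputed values, round to whole numbers and
    clip them to the admissible range [lower, upper]."""
    raw = np.expm1(np.asarray(imputed, dtype=float))
    return np.clip(np.rint(raw), lower, upper).astype(int)

def round_to_binary(imputed):
    """Round a continuous imputation of a 0/1 dummy variable to 0 or 1."""
    return np.clip(np.rint(np.asarray(imputed, dtype=float)), 0, 1).astype(int)

if __name__ == "__main__":
    # Hypothetical imputed values on the log scale for a visit count in 0..30.
    print(back_to_count_scale([-0.4, 1.2, 5.0], lower=0, upper=30))
    print(round_to_binary([-0.2, 0.4, 0.7, 1.3]))
```

Rounding of this kind keeps imputed values admissible, but for a highly skewed variable with many zeros it can still shift the distribution of imputed values relative to the observed ones.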
Imputation of categorical variables could generate unreasonable imputed values. For example, imputing a missing ethnicity might assign two ethnicities to one participant. In that case, the imputed values were treated as probabilities and the ethnicity variables were randomly redrawn until only one ethnicity was chosen. By Month-3 and Month-6, 11 and 12 participants, respectively, were deceased, and their values should not be imputed. Since the MI procedure did not allow us to choose who should not be imputed, those participants' values were deleted after imputation. That is, the final imputed data were missing for those participants on follow-up measures after their death.

RESULTS

We conducted outcome analyses using the data sets imputed by the two imputation procedures. Dependent variables in IMPACT included 9 binary and 3 interval-valued variables: self-reported use of antidepressants or psychotherapy in the last 3 months, potentially effective use of antidepressants or psychotherapy, SCL-20 depression scores, rates of remission of depression (defined by an SCL-20 depression score < 0.5), rates of treatment response (defined as a 50% or greater drop in SCL-20 depression score from baseline), the proportion of subjects who met diagnostic criteria for major depression on the SCID, health-related functional impairment, and quality of life at 3 and 6 months. For each dependent variable, we conducted an intent-to-treat analysis of repeated measures. We fitted mixed-effects regression models (or mixed-effects logistic regression for dichotomous variables) using baseline, 3 month and 6 month follow-up data, with regression adjustment for recruitment method (screening or referral) and participating study organization. In the mixed-effects models, we specified the within-subject covariance structure as unstructured to account for the within-subject correlation over time.
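Analyses of multiply imputed data sets are usually combined with the rules of Rubin (1987), which appears in the reference list. The paper does not spell out its combining step, so the sketch below is an assumption about the standard approach rather than a description of the authors' code. It pools one estimate across imputations and returns a simple large-sample version of the fraction of missing information, (1 + 1/m)B/T; software often reports a version with a degrees-of-freedom adjustment that differs slightly. The function name `pool_rubin` is hypothetical.

```python
import numpy as np

def pool_rubin(estimates, variances):
    """Combine m point estimates and their squared standard errors.

    Returns (pooled estimate, pooled standard error, lambda), where lambda is
    the share of total variance attributable to between-imputation variation.
    """
    q = np.asarray(estimates, dtype=float)
    u = np.asarray(variances, dtype=float)
    m = q.size
    q_bar = q.mean()                 # pooled point estimate
    w = u.mean()                     # within-imputation variance
    b = q.var(ddof=1)                # between-imputation variance
    t = w + (1.0 + 1.0 / m) * b      # total variance
    lam = (1.0 + 1.0 / m) * b / t    # simple fraction of missing information
    return q_bar, np.sqrt(t), lam

if __name__ == "__main__":
    # Hypothetical estimates and variances from five imputed data sets.
    est, se, lam = pool_rubin([1.0, 2.0, 3.0, 4.0, 5.0], [1.0] * 5)
    print(est, se, 100 * lam)
```

With these made-up inputs the within-imputation variance is 1, the between-imputation variance is 2.5, and the total variance is 1 + 1.2 × 2.5 = 4, which gives a pooled standard error of 2 and 100λ = 75.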
Table 3 presents the imputation effects on unadjusted analyses for 3 key outcome variables (DEPTX = any depression treatment, MAJDEP = major depression, SCL = SCL-20 depression score). Point estimates from the two imputation methods were compared to the complete-data analysis. SE0 stands for the standard error from the complete-data analysis, and λ is the estimated fraction of missing information. Both methods produced similar point estimates. On average, the hot-deck multiple imputation produced higher standard errors, larger λ and slightly less significant intervention effects. Among the 12 dependent variables, the average difference in t-statistics is less than [value missing in this transcription].

Table 3. Imputation Effects for Key Outcome Variables from Unadjusted Analyses
Column groups: MI hot-deck; MI multivariate normal
Columns within each group: difference in point estimates; SE/SE0; 100λ
Rows: DEPTX03*, DEPTX06**, MAJDEP06**, SCL03*, SCL06**, each for All, UC and I
* Represents 3-month follow-up, ** Represents 6-month follow-up.
(Cell values were not preserved in this transcription.)

4. SIMULATION

To compare bias and coverage rates for the two imputation methods in the IMPACT study, we performed a Monte Carlo simulation study. The group of 1521 participants who completed both follow-ups was taken to be the population for this simulation study (intervention group = 783, care-as-usual group = 738). For each intervention arm and follow-up wave, simple random samples of n = 250 were drawn without replacement. After a sample was drawn, the following missing data mechanism was imposed on each person. The mechanism mimicked a MAR mechanism, using the coefficients estimated from all participants in the multiple logistic regressions for the propensity scores of response and adjusting the intercept so that the missingness rates matched those observed in the IMPACT data. The procedure was conducted separately by intervention and usual care group as well as by follow-up month. The missingness rates matched the true data, with 8% and 11% for the intervention group at months 3 and 6, and 10% and 11% for the care-as-usual group. After imposing a pattern of unit missingness, the two imputation methods were applied, and point estimates (the group means) and intervention effects were calculated for each dependent variable. The estimated variances were adjusted by the finite population correction factor because the sampling fraction is large in the simulation. The procedure was carried out 1000 times.
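The step of imposing MAR missingness with a calibrated intercept can be sketched as follows. This is a minimal illustration, not the authors' program: the covariates and slope coefficients are made up, the response model is a plain logistic regression, and the intercept is found by bisection so that the average response probability in the sample equals a target rate (0.92 here, matching the 8% missingness quoted for the intervention group at month 3). All function names are hypothetical.

```python
import numpy as np

def expit(z):
    """Inverse logit."""
    return 1.0 / (1.0 + np.exp(-z))

def calibrate_intercept(linear_predictor, target_rate, lo=-20.0, hi=20.0, tol=1e-10):
    """Find the intercept a such that mean(expit(a + lp)) equals target_rate.
    The mean response probability is increasing in a, so bisection works."""
    for _ in range(200):
        mid = (lo + hi) / 2.0
        if expit(mid + linear_predictor).mean() < target_rate:
            lo = mid
        else:
            hi = mid
        if hi - lo < tol:
            break
    return (lo + hi) / 2.0

def impose_mar(x, slopes, target_rate, rng):
    """Return a boolean response indicator and the response probabilities.
    Missingness depends only on the fully observed covariates x (MAR)."""
    lp = np.asarray(x, dtype=float) @ np.asarray(slopes, dtype=float)
    intercept = calibrate_intercept(lp, target_rate)
    prob = expit(intercept + lp)
    responded = rng.random(prob.shape[0]) < prob
    return responded, prob

if __name__ == "__main__":
    rng = np.random.default_rng(7)
    x = rng.normal(size=(250, 2))            # hypothetical covariates, n = 250
    responded, prob = impose_mar(x, [0.5, -0.3], 0.92, rng)
    print(prob.mean(), responded.mean())
```

Calibrating the intercept fixes the expected missingness rate while leaving the estimated slopes, and therefore the dependence of missingness on the covariates, unchanged. The realized rate in any one sample still varies around the target.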
Tables 4 and 5 display the simulation results for point estimation and intervention effects for 3 key outcome variables. The tables report the Monte Carlo mean of the estimated bias, the actual coverage rate (cvg.), i.e., the percentage of the 1000 95% intervals that covered the true estimand, and the length of the 95% confidence interval (lngh). Overall, the hot-deck multiple imputation performed well, with good coverage rates. Multiple imputation under the multivariate normal model showed good coverage for all continuous variables. For dichotomous variables, it had lower coverage for three variables that were derived from a highly skewed variable, the number of all mental health therapy visits. Imputed values of this variable tended to be positive more often than the observed ones: the percentage of zeros was 40.9% among imputed cases and 66.7% among observed cases. Because the distribution of this variable is very skewed, the mean is much higher than the median, and the normal model may end up imputing values higher than those observed. Similar patterns appeared in other simulated data, with large bias in this variable and in the variables derived from it. Complete-case analysis, on the other hand, showed lower coverage of the Monte Carlo means for 47% of the variables, but good coverage of the intervention effects, because the biases in the intervention and control groups were in the same direction and cancelled.
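The three summaries reported per estimand (bias, coverage and interval length), together with the finite population correction mentioned in the simulation setup, can be computed along the following lines. This is a sketch under assumptions: intervals are taken as estimate ± 1.96 standard errors, whereas intervals for multiply imputed data are normally based on a t reference distribution, and the function names are hypothetical.

```python
import numpy as np

def fpc_variance(variance, n, population_size):
    """Apply the finite population correction (1 - n/N) to a variance
    estimate obtained from a simple random sample without replacement."""
    return variance * (1.0 - n / population_size)

def summarize_replicates(estimates, std_errors, truth, z=1.96):
    """Return (bias, coverage in percent, mean interval length)."""
    est = np.asarray(estimates, dtype=float)
    se = np.asarray(std_errors, dtype=float)
    lower = est - z * se
    upper = est + z * se
    bias = est.mean() - truth
    coverage = 100.0 * np.mean((lower <= truth) & (truth <= upper))
    length = np.mean(upper - lower)
    return bias, coverage, length

if __name__ == "__main__":
    # Sampling n = 250 from the 783 intervention completers is a large fraction.
    print(fpc_variance(1.0, 250, 783))
    # Hypothetical replicate estimates with a true value of 1.0.
    print(summarize_replicates([0.9, 1.0, 1.1, 2.0], [0.1] * 4, truth=1.0))
```

For an intervention effect, the same summaries are applied to the difference between the two group estimates, which is why a bias of equal sign and size in both arms leaves the coverage of the difference largely intact.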

Table 4. Point estimates for key outcome variables
Column groups: Complete-case analysis (bias, cvg., lngh); Hot-deck (bias, cvg., lngh, 100λ); Multivariate normal (bias, cvg., lngh, 100λ)
Rows: DEPTX03, DEPTX06, MAJDEP06, SCL03, SCL06, each for All, UC and IV
(Cell values were not preserved in this transcription.)

Table 5. Intervention effects for key outcome variables
Column groups: Complete-case analysis (bias, cvg., lngh); Hot-deck (bias, cvg., lngh, 100λ); Multivariate normal (bias, cvg., lngh, 100λ)
Rows: DEPTX, DEPTX, MAJDEP, SCL, SCL
(Cell values were not preserved in this transcription.)

REFERENCES

Bell, R. (1999). Presentation at Depression PORT Methods Workshop (I). RAND, Santa Monica, CA.
Lavori, P., Dawson, R., and Shera, D. (1995). A multiple imputation strategy for clinical trials with truncation of patient data. Statistics in Medicine, 14.
Littell, R. C., Milliken, G. A., Stroup, W. W., and Wolfinger, R. D. (1996). SAS System for Mixed Models. SAS Institute, Inc.
Little, R. J. (1988). Missing data adjustments in large surveys. Journal of Business and Economic Statistics, 6.
Rubin, D. B. (1987). Multiple Imputation for Nonresponse in Surveys. New York: J. Wiley & Sons.
Schafer, J. L. (1997). Analysis of Incomplete Multivariate Data. Chapman & Hall.


More information

Mediation Analysis With Principal Stratification

Mediation Analysis With Principal Stratification University of Pennsylvania ScholarlyCommons Statistics Papers Wharton Faculty Research 3-30-009 Mediation Analysis With Principal Stratification Robert Gallop Dylan S. Small University of Pennsylvania

More information

Bayesian Logistic Regression Modelling via Markov Chain Monte Carlo Algorithm

Bayesian Logistic Regression Modelling via Markov Chain Monte Carlo Algorithm Journal of Social and Development Sciences Vol. 4, No. 4, pp. 93-97, Apr 203 (ISSN 222-52) Bayesian Logistic Regression Modelling via Markov Chain Monte Carlo Algorithm Henry De-Graft Acquah University

More information

Master thesis Department of Statistics

Master thesis Department of Statistics Master thesis Department of Statistics Masteruppsats, Statistiska institutionen Missing Data in the Swedish National Patients Register: Multiple Imputation by Fully Conditional Specification Jesper Hörnblad

More information

Impact and adjustment of selection bias. in the assessment of measurement equivalence

Impact and adjustment of selection bias. in the assessment of measurement equivalence Impact and adjustment of selection bias in the assessment of measurement equivalence Thomas Klausch, Joop Hox,& Barry Schouten Working Paper, Utrecht, December 2012 Corresponding author: Thomas Klausch,

More information

Complier Average Causal Effect (CACE)

Complier Average Causal Effect (CACE) Complier Average Causal Effect (CACE) Booil Jo Stanford University Methodological Advancement Meeting Innovative Directions in Estimating Impact Office of Planning, Research & Evaluation Administration

More information
