Comparing Multiple Imputation to Single Imputation in the Presence of Large Design Effects: A Case Study and Some Theory

1 Comparing Multiple Imputation to Single Imputation in the Presence of Large Design Effects: A Case Study and Some Theory
Nathaniel Schenker, Deputy Director, National Center for Health Statistics* (and a former colleague of Rod Little's at UCLA)
Symposium in Celebration of Rod Little's 65th Birthday, University of Michigan, Ann Arbor, MI, October 31, 2015
* The findings and opinions in this presentation are those of the speaker and do not necessarily represent the views of the National Center for Health Statistics, the Centers for Disease Control and Prevention, or the U.S. government.

2 Outline
- Empirical results based on the 2008 National Ambulatory Medical Care Survey (Lewis et al. 2014)
- Theoretical results based on a one-way, normal, random-effects model (He et al. forthcoming)
- Discussion of a few limitations and areas for future research
- An aside: Documenting one of Rod's less publicized talents

3 National Ambulatory Medical Care Survey (NAMCS)
- Administered by NCHS since 1973
- Objective: Collect and disseminate nationally representative data on office-based physician care
- Multistage design
  1. Single or grouped counties
  2. Physician practices (stratified by specialty)
  3. Patient visits during selected week
- In 2008, 30,000 visits in sample
- 1997 OMB standards for classifying race/ethnicity data

4 Missing Data on Race/Ethnicity in NAMCS (from Lewis et al. 2014)
[Figure: item nonresponse rate (%) by year]

5 Exploring Multiple Imputation for Missing Race Data in 2008 NAMCS
- NCHS research team developed imputation model
- Predictors: age, sex, urban/rural, physician specialty, reason for visit, log(time spent with physician), sample weights, ZIP-code-level proportions non-Hispanic white and non-Hispanic black from 2000 census
- Variable choices based on previously used cell-based procedure, advice from subject-matter experts, and desire to reflect sample design
- Used sequential regression multivariate imputation (Raghunathan et al. 2001) as implemented in IVEware
- Created five sets of imputations (D = 5)

6 Estimands Considered in Lewis et al. (2014)
- Race distributions (non-Hispanic white, non-Hispanic black, other)
  - Overall
  - By four regions
  - By five age groups
  - By diabetes status (yes, no)
- For each estimand, estimated ratio of standard errors of estimates: multiple imputation/single imputation
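The MI/SI standard error ratio described above can be sketched with Rubin's combining rules. This is a minimal illustration, not the procedure used in Lewis et al. (2014): the function names and input numbers are hypothetical, and the single-imputation variance is assumed to equal the average within-imputation variance.

```python
import math


def rubin_combine(estimates, variances):
    """Combine D completed-data estimates and variances with Rubin's rules."""
    d = len(estimates)
    if d < 2 or d != len(variances):
        raise ValueError("need at least two imputations and matching lengths")
    q_bar = sum(estimates) / d                               # MI point estimate
    u_bar = sum(variances) / d                               # within-imputation variance
    b = sum((q - q_bar) ** 2 for q in estimates) / (d - 1)   # between-imputation variance
    t = u_bar + (1 + 1 / d) * b                              # total MI variance
    return q_bar, u_bar, b, t


def se_ratio(estimates, variances):
    """SE ratio MI/SI, with the SI variance taken as u_bar (an assumption)."""
    _, u_bar, _, t = rubin_combine(estimates, variances)
    return math.sqrt(t / u_bar)


# Hypothetical proportions and design-based variances from D = 5 imputations.
est = [0.70, 0.71, 0.69, 0.72, 0.70]
var = [0.0004] * 5
print(f"SE ratio (MI/SI): {se_ratio(est, var):.3f}")
```

A ratio close to 1 means the between-imputation component adds little to the design-based variance.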

7 SE Ratios by Missingness Levels (from Lewis et al. 2014)
[Figure: estimated standard error ratio (MI/SI) versus percent of observations missing]

8 SE Ratios by Missingness Levels
- Levels of ratios roughly consistent with those for 2000 NAMCS based on bootstrap re-imputation (Li et al. 2004)
- Ratios mostly rather low
- Why are ratios seemingly not related to missingness levels?

9 SE Ratios by Estimated Design Effects (from Lewis et al. 2014)
[Figure: estimated standard error ratio (MI/SI) versus estimated design effect]

10 SE Ratios by Estimated Design Effects
- Strong inverse relationship
- Ratios < 1.04 when DEFFs > 10
- Consistent with simulation result in Reiter et al. (2006), who reasoned as follows:
  The complex design makes the within-imputation variance a dominant factor relative to the between-imputation variance. That is, the fraction of missing information due to missing data is relatively small when compared to the effect of clustering.

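11 Increase in Estimated Variance Attributable to Missing Data versus Complex Sample Design
- Increase attributable to both factors: $[\bar{U}_{(SRS)} \times \mathrm{DEFF}] + (1 + D^{-1})B - \bar{U}_{(SRS)}$
- Proportion attributable to missing data: $(1 + D^{-1})B$ divided by the increase attributable to both factors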
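A sketch of this decomposition, assuming the total MI variance follows Rubin's rule, T = U + (1 + 1/D)B, with the within-imputation variance U written as the SRS variance times DEFF. The function name and the input values are hypothetical.

```python
def variance_increase(u_srs, deff, b, d):
    """Return (increase over the SRS variance, share of it due to missing data).

    u_srs : within-imputation variance under simple random sampling
    deff  : estimated design effect
    b     : between-imputation variance
    d     : number of imputations
    """
    missing_part = (1 + 1 / d) * b                 # added by missing data
    increase = u_srs * deff + missing_part - u_srs # added by design and missing data
    if increase <= 0:
        raise ValueError("no increase over the SRS variance to apportion")
    return increase, missing_part / increase


# Hypothetical inputs: a large design effect dwarfs the missing-data component.
inc, share = variance_increase(u_srs=1.0, deff=10.0, b=0.5, d=5)
print(f"increase = {inc:.2f}, share due to missing data = {share:.1%}")
```

With DEFF fixed at 1 the whole increase comes from missing data, so the share is 1.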

12 Percents Attributable to Missing Data, by Estimated DEFFs (from Lewis et al. 2014)

13 Value of SE Ratio if DEFF Were Equal to 1?
[Figure: four panels plotting SE ratio against DEFF, each with a lowess smoother; recoverable bandwidths are 0.7 and 0.5]

14 Value of SE Ratio if DEFF Were Equal to 1?
- Lowess smoother analysis suggests SE ratio of 1.08 to 1.1
- Implies fraction of missing information of 14% to 17%
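The step from SE ratio to fraction of missing information can be checked numerically. The sketch assumes the squared SE ratio equals total variance over within-imputation variance, so that FMI is approximately 1 - 1/ratio^2. That relation is an assumption consistent with the figures quoted above, not a formula stated on the slide, and the function name is hypothetical.

```python
def fmi_from_se_ratio(ratio):
    """Approximate FMI implied by an MI/SI standard error ratio.

    Assumes ratio**2 = T / U, so FMI = (T - U) / T = 1 - 1 / ratio**2.
    """
    if ratio < 1:
        raise ValueError("an MI/SI SE ratio below 1 implies a negative FMI")
    return 1 - 1 / ratio ** 2


for r in (1.08, 1.10):
    print(f"{r:.2f} -> {100 * fmi_from_se_ratio(r):.1f}%")
# 1.08 -> 14.3%
# 1.10 -> 17.4%
```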

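15 Some Theory for Single-Stage Cluster Sampling (He et al. forthcoming)
- Simple random sample of m out of M clusters, each containing n elements
- Model-based representation: For $i = 1, \ldots, m$, $j = 1, \ldots, n$,
  $y_{ij} = \mu + \alpha_i + \varepsilon_{ij}$, with $\alpha_i \sim N(0, \tau^2)$ and $\varepsilon_{ij} \sim N(0, \sigma^2)$ independent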

16 Some Theory for Single-Stage Cluster Sampling
- Estimand: $\mu$
- If data were complete, would have $\mathrm{DEFF}_{com} = 1 + (n - 1)\rho$, where $\rho = \tau^2 / (\tau^2 + \sigma^2)$
- With missing data, assuming MCAR, and r observations per cluster, $\mathrm{DEFF}_{obs} = 1 + (r - 1)\rho$
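The two design effects can be computed directly from the variance components. This is a small sketch with hypothetical function names and illustrative values; it only restates DEFF = 1 + (k - 1) * rho for k = n (complete data) and k = r (observed data).

```python
def intraclass_correlation(tau2, sigma2):
    """rho = tau^2 / (tau^2 + sigma^2): share of variance between clusters."""
    return tau2 / (tau2 + sigma2)


def design_effect(k, rho):
    """DEFF for clusters of k elements: 1 + (k - 1) * rho."""
    return 1 + (k - 1) * rho


# Illustrative values: between-cluster variance 1, within-cluster variance 9.
rho = intraclass_correlation(tau2=1.0, sigma2=9.0)
deff_com = design_effect(30, rho)  # n = 30 elements per cluster, complete data
deff_obs = design_effect(20, rho)  # r = 20 observed per cluster under MCAR
print(f"rho = {rho:.2f}, DEFF_com = {deff_com:.1f}, DEFF_obs = {deff_obs:.1f}")
```

Because r is no larger than n, DEFF_obs never exceeds DEFF_com: missing data thin out each cluster and weaken the clustering penalty.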

17 Some Theory for Single-Stage Cluster Sampling
- Approximations for multiple imputation ($D \to \infty$) with missingness rate $P_{mis}$:
  $\mathrm{FMI} \approx P_{mis} / \mathrm{DEFF}_{obs}$ and $\mathrm{FMI} \approx P_{mis} / [(1 - P_{mis})\,\mathrm{DEFF}_{com} + P_{mis}]$
- Derivations assume that $\mathrm{DEFF}_{obs} \ll r$ and $\mathrm{DEFF}_{com} \ll n$; if assumption violated, formulas can be used as simple upper bounds
- If $\rho = 0$, then approximations imply $\mathrm{FMI} \approx P_{mis}$
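The two FMI approximations, P_mis / DEFF_obs and P_mis / [(1 - P_mis) * DEFF_com + P_mis], can be compared numerically. The sketch below uses hypothetical function names and illustrative inputs, takes r = n * (1 - P_mis) as the observed cluster size, and converts FMI to a predicted SE ratio through sqrt(1 / (1 - FMI)), which is an assumption consistent with the 1.08 to 1.1 and 14% to 17% figures quoted earlier.

```python
import math


def fmi_from_deff_obs(p_mis, deff_obs):
    """FMI approximation based on the observed-data design effect."""
    return p_mis / deff_obs


def fmi_from_deff_com(p_mis, deff_com):
    """FMI approximation based on the complete-data design effect."""
    return p_mis / ((1 - p_mis) * deff_com + p_mis)


def predicted_se_ratio(fmi):
    """Predicted MI/SI SE ratio, assuming ratio**2 = 1 / (1 - FMI)."""
    return math.sqrt(1 / (1 - fmi))


# Illustrative inputs: 30% missing, n = 30 per cluster, rho = 0.1.
p_mis, n, rho = 0.3, 30, 0.1
r = n * (1 - p_mis)                      # assumed observed cluster size
deff_com = 1 + (n - 1) * rho
deff_obs = 1 + (r - 1) * rho
for label, fmi in (("DEFF_obs", fmi_from_deff_obs(p_mis, deff_obs)),
                   ("DEFF_com", fmi_from_deff_com(p_mis, deff_com))):
    print(f"{label}: FMI = {fmi:.3f}, SE ratio = {predicted_se_ratio(fmi):.3f}")
```

Setting rho to 0 makes both design effects equal to 1, and both functions then return P_mis, matching the last bullet above.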

18 SE Ratios Predicted Using FMI Approximation Based on $\mathrm{DEFF}_{com}$ (from He et al. forthcoming)

19 How Well Do Approximations Predict Results for 2008 NAMCS? (from He et al. forthcoming)
[Figure: two panels, one for the approximation based on $\mathrm{DEFF}_{com}$ and one for the approximation based on $\mathrm{DEFF}_{obs}$]

20 Discussion
Case study of 2008 NAMCS
- Considered coarse domains; often finer domains imply smaller DEFFs
- In 2009, awareness among field representatives increased; nonresponse on race decreased
- Beginning in 2012, no clustering by counties; PSUs are physician offices
- Would be useful to study impacts
Theoretical results
- Can be thought of as extension of Rubin and Schenker (1986)
- Would be useful to go beyond MCAR
- Other factors influence DEFFs; e.g., weights, multiple stages of clustering

21 References
He, Y., Shimizu, I., Schappert, S., Xu, J., Beresovsky, V., Khan, D., Valverde, R., and Schenker, N. (forthcoming), "A Note on the Effect of Data Clustering on the Multiple Imputation Variance Estimator: An Addendum to Taylor et al. (2014)," to appear in the Journal of Official Statistics.
Lewis, T., Goldberg, E., Schenker, N., Beresovsky, V., Schappert, S., Decker, S., Sonnenfeld, N., and Shimizu, I. (2014), "The Relative Impacts of Design Effects and Multiple Imputation on Variance Estimates: A Case Study with the 2008 National Ambulatory Medical Care Survey," Journal of Official Statistics, 30, 147-161.
Li, Y., Lynch, C., Shimizu, I., and Kaufman, S. (2004), "Imputation Variance Estimation by Bootstrap Method for the National Ambulatory Medical Care Survey," American Statistical Association Proceedings of the Survey Research Methods Section.
Raghunathan, T., Lepkowski, J., Van Hoewyk, J., and Solenberger, P. (2001), "A Multivariate Technique for Multiply Imputing Missing Values Using a Sequence of Regression Models," Survey Methodology, 27.
Reiter, J., Raghunathan, T., and Kinney, S. (2006), "The Importance of Modeling the Sampling Design in Multiple Imputation for Missing Survey Data," Survey Methodology, 32.
Rubin, D.B., and Schenker, N. (1986), "Multiple Imputation for Interval Estimation from Simple Random Samples with Ignorable Nonresponse," Journal of the American Statistical Association, 81.


29 HAPPY BIRTHDAY, ROD!


More information

Trends in Emergency Department Visits for Ischemic Stroke and Transient Ischemic Attack: United States,

Trends in Emergency Department Visits for Ischemic Stroke and Transient Ischemic Attack: United States, Trends in Emergency Department Visits for Ischemic Stroke and Transient Ischemic Attack: United States, 2001 2011 Anjali Talwalkar, M.D., M.P.H.; and Sayeedha Uddin, M.D., M.P.H. Key findings Data from

More information

In this module I provide a few illustrations of options within lavaan for handling various situations.

In this module I provide a few illustrations of options within lavaan for handling various situations. In this module I provide a few illustrations of options within lavaan for handling various situations. An appropriate citation for this material is Yves Rosseel (2012). lavaan: An R Package for Structural

More information

Impact of Methods of Scoring Omitted Responses on Achievement Gaps

Impact of Methods of Scoring Omitted Responses on Achievement Gaps Impact of Methods of Scoring Omitted Responses on Achievement Gaps Dr. Nathaniel J. S. Brown (nathaniel.js.brown@bc.edu)! Educational Research, Evaluation, and Measurement, Boston College! Dr. Dubravka

More information

Practice of Epidemiology. Strategies for Multiple Imputation in Longitudinal Studies

Practice of Epidemiology. Strategies for Multiple Imputation in Longitudinal Studies American Journal of Epidemiology ª The Author 2010. Published by Oxford University Press on behalf of the Johns Hopkins Bloomberg School of Public Health. All rights reserved. For permissions, please e-mail:

More information

Incorporating the sampling design in weighting adjustments for panel attrition

Incorporating the sampling design in weighting adjustments for panel attrition Research Article Received XXXX (www.interscience.wiley.com) DOI: 10.1002/sim.0000 Incorporating the sampling design in weighting adjustments for panel attrition Qixuan Chen a, Andrew Gelman b, Melissa

More information

USING THE CENSUS 2000/2001 SUPPLEMENTARY SURVEY AS A SAMPLING FRAME FOR THE NATIONAL EPIDEMIOLOGICAL SURVEY ON ALCOHOL AND RELATED CONDITIONS

USING THE CENSUS 2000/2001 SUPPLEMENTARY SURVEY AS A SAMPLING FRAME FOR THE NATIONAL EPIDEMIOLOGICAL SURVEY ON ALCOHOL AND RELATED CONDITIONS USING THE CENSUS 2000/2001 SUPPLEMENTARY SURVEY AS A SAMPLING FRAME FOR THE NATIONAL EPIDEMIOLOGICAL SURVEY ON ALCOHOL AND RELATED CONDITIONS Marie Stetser, Jana Shepherd, and Thomas F. Moore 1 U.S. Census

More information

Chapter 5: Producing Data

Chapter 5: Producing Data Chapter 5: Producing Data Key Vocabulary: observational study vs. experiment confounded variables population vs. sample sampling vs. census sample design voluntary response sampling convenience sampling

More information

Bayesian Statistics Estimation of a Single Mean and Variance MCMC Diagnostics and Missing Data

Bayesian Statistics Estimation of a Single Mean and Variance MCMC Diagnostics and Missing Data Bayesian Statistics Estimation of a Single Mean and Variance MCMC Diagnostics and Missing Data Michael Anderson, PhD Hélène Carabin, DVM, PhD Department of Biostatistics and Epidemiology The University

More information

Help! Statistics! Missing data. An introduction

Help! Statistics! Missing data. An introduction Help! Statistics! Missing data. An introduction Sacha la Bastide-van Gemert Medical Statistics and Decision Making Department of Epidemiology UMCG Help! Statistics! Lunch time lectures What? Frequently

More information

Methods for treating bias in ISTAT mixed mode social surveys

Methods for treating bias in ISTAT mixed mode social surveys Methods for treating bias in ISTAT mixed mode social surveys C. De Vitiis, A. Guandalini, F. Inglese and M.D. Terribili ITACOSM 2017 Bologna, 16th June 2017 Summary 1. The Mixed Mode in ISTAT social surveys

More information

Model development including interactions with multiple imputed data

Model development including interactions with multiple imputed data Hendry et al. BMC Medical Research Methodology 2014, 14:136 RESEARCH ARTICLE Open Access Model development including interactions with multiple imputed data Gillian M Hendry 1*, Rajen N Naidoo 2, Temesgen

More information

Reducing Decision Errors in the Paired Comparison of the Diagnostic Accuracy of Continuous Screening Tests

Reducing Decision Errors in the Paired Comparison of the Diagnostic Accuracy of Continuous Screening Tests Reducing Decision Errors in the Paired Comparison of the Diagnostic Accuracy of Continuous Screening Tests Brandy M. Ringham, 1 Todd A. Alonzo, 2 John T. Brinton, 1 Aarti Munjal, 1 Keith E. Muller, 3 Deborah

More information

Vocabulary. Bias. Blinding. Block. Cluster sample

Vocabulary. Bias. Blinding. Block. Cluster sample Bias Blinding Block Census Cluster sample Confounding Control group Convenience sample Designs Experiment Experimental units Factor Level Any systematic failure of a sampling method to represent its population

More information

Some General Guidelines for Choosing Missing Data Handling Methods in Educational Research

Some General Guidelines for Choosing Missing Data Handling Methods in Educational Research Journal of Modern Applied Statistical Methods Volume 13 Issue 2 Article 3 11-2014 Some General Guidelines for Choosing Missing Data Handling Methods in Educational Research Jehanzeb R. Cheema University

More information

Multiple imputation for handling missing outcome data when estimating the relative risk

Multiple imputation for handling missing outcome data when estimating the relative risk Sullivan et al. BMC Medical Research Methodology (2017) 17:134 DOI 10.1186/s12874-017-0414-5 RESEARCH ARTICLE Open Access Multiple imputation for handling missing outcome data when estimating the relative

More information

The Relative Performance of Full Information Maximum Likelihood Estimation for Missing Data in Structural Equation Models

The Relative Performance of Full Information Maximum Likelihood Estimation for Missing Data in Structural Equation Models University of Nebraska - Lincoln DigitalCommons@University of Nebraska - Lincoln Educational Psychology Papers and Publications Educational Psychology, Department of 7-1-2001 The Relative Performance of

More information

AMELIA II: A Package for Missing Data

AMELIA II: A Package for Missing Data AMELIA II: A Package for Missing Data James Honaker Gary King Matthew Blackwell July 24, 2009 I want to convince you of three things. I want to convince you of three things. 1 Missing data is a problem

More information

Confounding by indication developments in matching, and instrumental variable methods. Richard Grieve London School of Hygiene and Tropical Medicine

Confounding by indication developments in matching, and instrumental variable methods. Richard Grieve London School of Hygiene and Tropical Medicine Confounding by indication developments in matching, and instrumental variable methods Richard Grieve London School of Hygiene and Tropical Medicine 1 Outline 1. Causal inference and confounding 2. Genetic

More information

Missing data in clinical trials: making the best of what we haven t got.

Missing data in clinical trials: making the best of what we haven t got. Missing data in clinical trials: making the best of what we haven t got. Royal Statistical Society Professional Statisticians Forum Presentation by Michael O Kelly, Senior Statistical Director, IQVIA Copyright

More information

Sampling Weights, Model Misspecification and Informative Sampling: A Simulation Study

Sampling Weights, Model Misspecification and Informative Sampling: A Simulation Study Sampling Weights, Model Misspecification and Informative Sampling: A Simulation Study Marianne (Marnie) Bertolet Department of Statistics Carnegie Mellon University Abstract Linear mixed-effects (LME)

More information

SPRING GROVE AREA SCHOOL DISTRICT. Course Description. Instructional Strategies, Learning Practices, Activities, and Experiences.

SPRING GROVE AREA SCHOOL DISTRICT. Course Description. Instructional Strategies, Learning Practices, Activities, and Experiences. SPRING GROVE AREA SCHOOL DISTRICT PLANNED COURSE OVERVIEW Course Title: Basic Introductory Statistics Grade Level(s): 11-12 Units of Credit: 1 Classification: Elective Length of Course: 30 cycles Periods

More information

Modern Strategies to Handle Missing Data: A Showcase of Research on Foster Children

Modern Strategies to Handle Missing Data: A Showcase of Research on Foster Children Modern Strategies to Handle Missing Data: A Showcase of Research on Foster Children Anouk Goemans, MSc PhD student Leiden University The Netherlands Email: a.goemans@fsw.leidenuniv.nl Modern Strategies

More information

Within-Household Selection in Mail Surveys: Explicit Questions Are Better Than Cover Letter Instructions

Within-Household Selection in Mail Surveys: Explicit Questions Are Better Than Cover Letter Instructions University of Nebraska - Lincoln DigitalCommons@University of Nebraska - Lincoln Sociology Department, Faculty Publications Sociology, Department of 2017 Within-Household Selection in Mail Surveys: Explicit

More information

National Ambulatory Medical Care Survey: 1997 Summary

National Ambulatory Medical Care Survey: 1997 Summary Number 305 + May 20, 1999 From Vital and Health Statistics of the CENTERS FOR DISEASE CONTROL AND PREVENTION/National Center for Health Statistics National Ambulatory Medical Care Survey: 1997 Summary

More information

Propensity Score Methods with Multilevel Data. March 19, 2014

Propensity Score Methods with Multilevel Data. March 19, 2014 Propensity Score Methods with Multilevel Data March 19, 2014 Multilevel data Data in medical care, health policy research and many other fields are often multilevel. Subjects are grouped in natural clusters,

More information

SISCR Module 7 Part I: Introduction Basic Concepts for Binary Biomarkers (Classifiers) and Continuous Biomarkers

SISCR Module 7 Part I: Introduction Basic Concepts for Binary Biomarkers (Classifiers) and Continuous Biomarkers SISCR Module 7 Part I: Introduction Basic Concepts for Binary Biomarkers (Classifiers) and Continuous Biomarkers Kathleen Kerr, Ph.D. Associate Professor Department of Biostatistics University of Washington

More information

Practical Statistical Reasoning in Clinical Trials

Practical Statistical Reasoning in Clinical Trials Seminar Series to Health Scientists on Statistical Concepts 2011-2012 Practical Statistical Reasoning in Clinical Trials Paul Wakim, PhD Center for the National Institute on Drug Abuse 10 January 2012

More information

Incorporating the sampling design in weighting adjustments for panel attrition

Incorporating the sampling design in weighting adjustments for panel attrition Research Article Statistics Received XXXX (www.interscience.wiley.com) DOI: 10.1002/sim.0000 Incorporating the sampling design in weighting adjustments for panel attrition Qixuan Chen a, Andrew Gelman

More information

Linking Errors in Trend Estimation in Large-Scale Surveys: A Case Study

Linking Errors in Trend Estimation in Large-Scale Surveys: A Case Study Research Report Linking Errors in Trend Estimation in Large-Scale Surveys: A Case Study Xueli Xu Matthias von Davier April 2010 ETS RR-10-10 Listening. Learning. Leading. Linking Errors in Trend Estimation

More information

THE GOOD, THE BAD, & THE UGLY: WHAT WE KNOW TODAY ABOUT LCA WITH DISTAL OUTCOMES. Bethany C. Bray, Ph.D.

THE GOOD, THE BAD, & THE UGLY: WHAT WE KNOW TODAY ABOUT LCA WITH DISTAL OUTCOMES. Bethany C. Bray, Ph.D. THE GOOD, THE BAD, & THE UGLY: WHAT WE KNOW TODAY ABOUT LCA WITH DISTAL OUTCOMES Bethany C. Bray, Ph.D. bcbray@psu.edu WHAT ARE WE HERE TO TALK ABOUT TODAY? Behavioral scientists increasingly are using

More information