AMELIA II: A Package for Missing Data

Size: px
Start display at page:

Download "AMELIA II: A Package for Missing Data"

Transcription

1 AMELIA II: A Package for Missing Data James Honaker Gary King Matthew Blackwell July 24, 2009

2 I want to convince you of three things.

3 I want to convince you of three things. 1 Missing data is a problem for statistical analysis.

4 I want to convince you of three things. 1 Missing data is a problem for statistical analysis. 2 Multiple imputation is a method that drastically improves the analysis of incomplete data.

5 I want to convince you of three things. 1 Missing data is a problem for statistical analysis. 2 Multiple imputation is a method that drastically improves the analysis of incomplete data. 3 Our software, Amelia, is a simple yet powerful way to implement this method.

6 the problem: missing data a solution our approach

7 year country GDP infl trade population Burkina Faso Burkina Faso Burkina Faso Burkina Faso Burkina Faso Burkina Faso

8 year country GDP infl trade population Burkina Faso Burkina Faso Burkina Faso 393 NA NA Burkina Faso Burkina Faso 435 NA Burkina Faso

9 > NA + 34 [1] NA

10 year country GDP infl trade population Burkina Faso Burkina Faso Burkina Faso 393 NA NA Burkina Faso Burkina Faso 435 NA Burkina Faso

11 Listwise Deletion year country GDP infl trade population Burkina Faso Burkina Faso Burkina Faso 393 NA NA Burkina Faso Burkina Faso 435 NA Burkina Faso

12 Listwise Deletion year country GDP infl trade population Burkina Faso Burkina Faso Burkina Faso Burkina Faso

13 Listwise Deletion Solves <the problem>?

14 Listwise Deletion Solves <the problem>? Yes.

15 Listwise Deletion Solves <the problem>? Yes. Creates new problems?

16 Listwise Deletion Solves <the problem>? Yes. Creates new problems? Yes.

17 New Problems

18 New Problems BIAS The cases you throw out are systematically different than the ones that you leave in.

19 New Problems BIAS The cases you throw out are systematically different than the ones that you leave in. INEFFICIENCY Tossing out observed information with the missing values.

20 Imputation year country GDP infl trade population Burkina Faso Burkina Faso Burkina Faso 393 NA NA Burkina Faso Burkina Faso 435 NA Burkina Faso

21 Imputation year country GDP infl trade population Burkina Faso Burkina Faso Burkina Faso 393 x a x b Burkina Faso Burkina Faso 435 x c Burkina Faso

22 Mean Imputation year country GDP infl trade population Burkina Faso Burkina Faso Burkina Faso 393 Xin Xtrade Burkina Faso Burkina Faso 435 Xin Burkina Faso

23 Mean Imputation Solves <the problem>?

24 Mean Imputation Solves <the problem>? Yes.

25 Mean Imputation Solves <the problem>? Yes. Creates new problems?

26 Mean Imputation Solves <the problem>? Yes. Creates new problems? Yes.

27 New Problems

28 New Problems BIAS Ignores correlations between variables.

29 New Problems BIAS Ignores correlations between variables. OVERCONFIDENCE Treating imputations as observed data.

30 The problem, revised How do we fill in the data in an way that both preserves the relationships in the observed data and incorporates the uncertainty of imputation?

31 the problem a solution: multiple imputation our approach

32 The Multiple Imputation Scheme

33 The Multiple Imputation Scheme incomplete data

34 The Multiple Imputation Scheme incomplete data imputation imputed datasets

35 The Multiple Imputation Scheme incomplete data imputation imputed datasets analysis separate results

36 The Multiple Imputation Scheme incomplete data imputation imputed datasets analysis separate results combination final results

37 Multiple Imputation year country GDP infl trade population Burkina Faso Burkina Faso Burkina Faso 393 x a x b Burkina Faso Burkina Faso 435 x c Burkina Faso

38 Multiple Imputation year country GDP infl trade population Burkina Faso Burkina Faso Burkina Faso 393 x a x b Burkina Faso Burkina Faso 435 x c Burkina Faso year country GDP infl trade population Burkina Faso Burkina Faso Burkina Faso 393 x a x b Burkina Faso Burkina Faso 435 x c Burkina Faso

39 Multiple Imputation year country GDP infl trade population Burkina Faso Burkina Faso Burkina Faso 393 x a x b Burkina Faso Burkina Faso 435 x c Burkina Faso year country GDP infl trade population Burkina Faso Burkina Faso Burkina Faso 393 x a x b Burkina Faso Burkina Faso 435 x c Burkina Faso year country GDP infl trade population Burkina Faso Burkina Faso Burkina Faso 393 x a x b Burkina Faso Burkina Faso 435 x c Burkina Faso

40 Multiple Imputation year country GDP infl trade population Burkina Faso Burkina Faso Burkina Faso 393 x a x b Burkina Faso Burkina Faso 435 x c Burkina Faso year country GDP infl trade population Burkina Faso Burkina Faso Burkina Faso 393 x a x b Burkina Faso Burkina Faso 435 x c Burkina Faso year country GDP infl trade population Burkina Faso Burkina Faso Burkina Faso 393 x a x b Burkina Faso Burkina Faso 435 x c Burkina Faso year country GDP infl trade population Burkina Faso Burkina Faso Burkina Faso 393 x a x b Burkina Faso Burkina Faso 435 x c Burkina Faso

41 Multiple Imputation

42 Multiple Imputation REGRESSION To preserve the relationships in the data.

43 Multiple Imputation REGRESSION To preserve the relationships in the data. SIMULATION To reflect the uncertainty of our imputation.

44 How to impute y = X β + ε REGRESSION

45 How to impute X mis i = X obs i β + ε REGRESSION

46 How to impute X mis i = X obs i β + ε REGRESSION β N(β, var( β)) SIMULATION ε N(0, σ 2 X mis )

47 How to impute X mis i = X obs i β + ε EM β N(β, var( β)) SIMULATION ε N(0, σ 2 X mis )

48 How to impute X mis i = X obs i β + ε EM β N(β, var( β)) BOOTSTRAP ε N(0, σ 2 X mis )

49 the problem a solution our approach: Amelia features diagnostics

50 The Amelia Scheme

51 The Amelia Scheme incomplete data

52 The Amelia Scheme incomplete data bootstrap bootstrapped data

53 The Amelia Scheme incomplete data bootstrap bootstrapped data EM imputed datasets

54 The Amelia Scheme incomplete data bootstrap bootstrapped data EM imputed datasets analysis

55 The Amelia Scheme incomplete data bootstrap bootstrapped data EM imputed datasets analysis combination final results

56 the problem a solution our approach: Amelia features diagnostics

57 Simplicity a.out <- amelia(data)

58 A GUI

59 Transformations GDP log(gdp) Frequency Frequency a.out <- amelia(africa, logs = "gdp")

60 Polynomials of Time year country GDP infl trade population Burkina Faso Burkina Faso Burkina Faso 393 NA NA Burkina Faso Burkina Faso 435 NA Burkina Faso f(t) = t + t 2 + t 3

61 Polynomials of Time year country GDP infl trade population Burkina Faso Burkina Faso Burkina Faso 393 NA NA Burkina Faso Burkina Faso 435 NA Burkina Faso f(t) = t + t 2 + t 3 data yesterday imputation tomorrow

62 Easily passed to other platforms for analysis ## Pass to Zelig library(zelig) a.out <- amelia(africa) z.out <- zelig(infl ~ gdp, data = a.out$imputations, model = ls) ## Write to Stata files write.amelia(a.out, stem = "outdata", format = "dta")

63 Error Checking > a.out <- amelia(africa) Amelia Error Code: 37 The variable(s) country are "factors". You may have wanted to set this as a ID variable to remove it from the imputation model or as an ordinal or nominal variable to be imputed. Please set it as either and try again.

64 the problem a solution our approach: Amelia features diagnostics

65 Missingness Maps Burkina Faso Missingness Map Missing Observed Burundi Cameroon Congo Senegal Zambia trade gdp_pc population civlib infl country year > missmap(africa, tsvar = "year", csvar = "country")

66 Overimputation Observed versus Imputed Values Imputed values Observed values > overimpute(a.out, var = trade)

67 Time-Series Cross-Sectional Plots Burundi trade Time > tscsplot(a.out, var = "trade", cs = "Burundi")

68 the problem: missing data a solution: multiple imputation our approach: Amelia

69 thank you. Learn more about Amelia:

Advanced Handling of Missing Data

Advanced Handling of Missing Data Advanced Handling of Missing Data One-day Workshop Nicole Janz ssrmcta@hermes.cam.ac.uk 2 Goals Discuss types of missingness Know advantages & disadvantages of missing data methods Learn multiple imputation

More information

Selected Topics in Biostatistics Seminar Series. Missing Data. Sponsored by: Center For Clinical Investigation and Cleveland CTSC

Selected Topics in Biostatistics Seminar Series. Missing Data. Sponsored by: Center For Clinical Investigation and Cleveland CTSC Selected Topics in Biostatistics Seminar Series Missing Data Sponsored by: Center For Clinical Investigation and Cleveland CTSC Brian Schmotzer, MS Biostatistician, CCI Statistical Sciences Core brian.schmotzer@case.edu

More information

Analysis of TB prevalence surveys

Analysis of TB prevalence surveys Workshop and training course on TB prevalence surveys with a focus on field operations Analysis of TB prevalence surveys Day 8 Thursday, 4 August 2011 Phnom Penh Babis Sismanidis with acknowledgements

More information

Appendix 1. Sensitivity analysis for ACQ: missing value analysis by multiple imputation

Appendix 1. Sensitivity analysis for ACQ: missing value analysis by multiple imputation Appendix 1 Sensitivity analysis for ACQ: missing value analysis by multiple imputation A sensitivity analysis was carried out on the primary outcome measure (ACQ) using multiple imputation (MI). MI is

More information

Technical Whitepaper

Technical Whitepaper Technical Whitepaper July, 2001 Prorating Scale Scores Consequential analysis using scales from: BDI (Beck Depression Inventory) NAS (Novaco Anger Scales) STAXI (State-Trait Anxiety Inventory) PIP (Psychotic

More information

ethnicity recording in primary care

ethnicity recording in primary care ethnicity recording in primary care Multiple imputation of missing data in ethnicity recording using The Health Improvement Network database Tra Pham 1 PhD Supervisors: Dr Irene Petersen 1, Prof James

More information

Logistic Regression with Missing Data: A Comparison of Handling Methods, and Effects of Percent Missing Values

Logistic Regression with Missing Data: A Comparison of Handling Methods, and Effects of Percent Missing Values Logistic Regression with Missing Data: A Comparison of Handling Methods, and Effects of Percent Missing Values Sutthipong Meeyai School of Transportation Engineering, Suranaree University of Technology,

More information

Education, Literacy & Health Outcomes Findings

Education, Literacy & Health Outcomes Findings 2014/ED/EFA/MRT/PI/05 Background paper prepared for the Education for All Global Monitoring Report 2013/4 Teaching and learning: Achieving quality for all Education, Literacy & Health Outcomes Findings

More information

Missing Data and Imputation

Missing Data and Imputation Missing Data and Imputation Barnali Das NAACCR Webinar May 2016 Outline Basic concepts Missing data mechanisms Methods used to handle missing data 1 What are missing data? General term: data we intended

More information

S Imputation of Categorical Missing Data: A comparison of Multivariate Normal and. Multinomial Methods. Holmes Finch.

S Imputation of Categorical Missing Data: A comparison of Multivariate Normal and. Multinomial Methods. Holmes Finch. S05-2008 Imputation of Categorical Missing Data: A comparison of Multivariate Normal and Abstract Multinomial Methods Holmes Finch Matt Margraf Ball State University Procedures for the imputation of missing

More information

A COMPARISON OF IMPUTATION METHODS FOR MISSING DATA IN A MULTI-CENTER RANDOMIZED CLINICAL TRIAL: THE IMPACT STUDY

A COMPARISON OF IMPUTATION METHODS FOR MISSING DATA IN A MULTI-CENTER RANDOMIZED CLINICAL TRIAL: THE IMPACT STUDY A COMPARISON OF IMPUTATION METHODS FOR MISSING DATA IN A MULTI-CENTER RANDOMIZED CLINICAL TRIAL: THE IMPACT STUDY Lingqi Tang 1, Thomas R. Belin 2, and Juwon Song 2 1 Center for Health Services Research,

More information

A Unified Approach to Measurement Error and Missing Data: Overview and Applications

A Unified Approach to Measurement Error and Missing Data: Overview and Applications A Unified Approach to Measurement Error and Missing Data: Overview and Applications Matthew Blackwell James Honaker Gary King April 10, 2015 Abstract Although social scientists devote considerable effort

More information

Catherine A. Welch 1*, Séverine Sabia 1,2, Eric Brunner 1, Mika Kivimäki 1 and Martin J. Shipley 1

Catherine A. Welch 1*, Séverine Sabia 1,2, Eric Brunner 1, Mika Kivimäki 1 and Martin J. Shipley 1 Welch et al. BMC Medical Research Methodology (2018) 18:89 https://doi.org/10.1186/s12874-018-0548-0 RESEARCH ARTICLE Open Access Does pattern mixture modelling reduce bias due to informative attrition

More information

The analysis of tuberculosis prevalence surveys. Babis Sismanidis with acknowledgements to Sian Floyd Harare, 30 November 2010

The analysis of tuberculosis prevalence surveys. Babis Sismanidis with acknowledgements to Sian Floyd Harare, 30 November 2010 The analysis of tuberculosis prevalence surveys Babis Sismanidis with acknowledgements to Sian Floyd Harare, 30 November 2010 Background Prevalence = TB cases / Number of eligible participants (95% CI

More information

Module 14: Missing Data Concepts

Module 14: Missing Data Concepts Module 14: Missing Data Concepts Jonathan Bartlett & James Carpenter London School of Hygiene & Tropical Medicine Supported by ESRC grant RES 189-25-0103 and MRC grant G0900724 Pre-requisites Module 3

More information

Week 10 Hour 1. Shapiro-Wilks Test (from last time) Cross-Validation. Week 10 Hour 2 Missing Data. Stat 302 Notes. Week 10, Hour 2, Page 1 / 32

Week 10 Hour 1. Shapiro-Wilks Test (from last time) Cross-Validation. Week 10 Hour 2 Missing Data. Stat 302 Notes. Week 10, Hour 2, Page 1 / 32 Week 10 Hour 1 Shapiro-Wilks Test (from last time) Cross-Validation Week 10 Hour 2 Missing Data Stat 302 Notes. Week 10, Hour 2, Page 1 / 32 Cross-Validation in the Wild It s often more important to describe

More information

STAT/SOC/CSSS 221 Statistical Concepts and Methods for the Social Sciences. Introduction to Mulitple Regression

STAT/SOC/CSSS 221 Statistical Concepts and Methods for the Social Sciences. Introduction to Mulitple Regression STAT/SOC/CSSS 1 Statistical Concepts and Methods for the Social Sciences Introduction to Mulitple Regression Christopher Adolph Department of Political Science and Center for Statistics and the Social

More information

The Effectiveness of Captopril

The Effectiveness of Captopril Lab 7 The Effectiveness of Captopril In the United States, pharmaceutical manufacturers go through a very rigorous process in order to get their drugs approved for sale. This process is designed to determine

More information

Section on Survey Research Methods JSM 2009

Section on Survey Research Methods JSM 2009 Missing Data and Complex Samples: The Impact of Listwise Deletion vs. Subpopulation Analysis on Statistical Bias and Hypothesis Test Results when Data are MCAR and MAR Bethany A. Bell, Jeffrey D. Kromrey

More information

Multiple Imputation For Missing Data: What Is It And How Can I Use It?

Multiple Imputation For Missing Data: What Is It And How Can I Use It? Multiple Imputation For Missing Data: What Is It And How Can I Use It? Jeffrey C. Wayman, Ph.D. Center for Social Organization of Schools Johns Hopkins University jwayman@csos.jhu.edu www.csos.jhu.edu

More information

Applied Medical. Statistics Using SAS. Geoff Der. Brian S. Everitt. CRC Press. Taylor Si Francis Croup. Taylor & Francis Croup, an informa business

Applied Medical. Statistics Using SAS. Geoff Der. Brian S. Everitt. CRC Press. Taylor Si Francis Croup. Taylor & Francis Croup, an informa business Applied Medical Statistics Using SAS Geoff Der Brian S. Everitt CRC Press Taylor Si Francis Croup Boca Raton London New York CRC Press is an imprint of the Taylor & Francis Croup, an informa business A

More information

Jinhui Ma 1,2,3, Parminder Raina 1,2, Joseph Beyene 1 and Lehana Thabane 1,3,4,5*

Jinhui Ma 1,2,3, Parminder Raina 1,2, Joseph Beyene 1 and Lehana Thabane 1,3,4,5* Ma et al. BMC Medical Research Methodology 2013, 13:9 RESEARCH ARTICLE Open Access Comparison of population-averaged and cluster-specific models for the analysis of cluster randomized trials with missing

More information

Master thesis Department of Statistics

Master thesis Department of Statistics Master thesis Department of Statistics Masteruppsats, Statistiska institutionen Missing Data in the Swedish National Patients Register: Multiple Imputation by Fully Conditional Specification Jesper Hörnblad

More information

Bias in regression coefficient estimates when assumptions for handling missing data are violated: a simulation study

Bias in regression coefficient estimates when assumptions for handling missing data are violated: a simulation study STATISTICAL METHODS Epidemiology Biostatistics and Public Health - 2016, Volume 13, Number 1 Bias in regression coefficient estimates when assumptions for handling missing data are violated: a simulation

More information

Sequential nonparametric regression multiple imputations. Irina Bondarenko and Trivellore Raghunathan

Sequential nonparametric regression multiple imputations. Irina Bondarenko and Trivellore Raghunathan Sequential nonparametric regression multiple imputations Irina Bondarenko and Trivellore Raghunathan Department of Biostatistics, University of Michigan Ann Arbor, MI 48105 Abstract Multiple imputation,

More information

MISSING DATA AND PARAMETERS ESTIMATES IN MULTIDIMENSIONAL ITEM RESPONSE MODELS. Federico Andreis, Pier Alda Ferrari *

MISSING DATA AND PARAMETERS ESTIMATES IN MULTIDIMENSIONAL ITEM RESPONSE MODELS. Federico Andreis, Pier Alda Ferrari * Electronic Journal of Applied Statistical Analysis EJASA (2012), Electron. J. App. Stat. Anal., Vol. 5, Issue 3, 431 437 e-issn 2070-5948, DOI 10.1285/i20705948v5n3p431 2012 Università del Salento http://siba-ese.unile.it/index.php/ejasa/index

More information

Missing data imputation: focusing on single imputation

Missing data imputation: focusing on single imputation Big-data Clinical Trial Column Page 1 of 8 Missing data imputation: focusing on single imputation Zhongheng Zhang Department of Critical Care Medicine, Jinhua Municipal Central Hospital, Jinhua Hospital

More information

J2.6 Imputation of missing data with nonlinear relationships

J2.6 Imputation of missing data with nonlinear relationships Sixth Conference on Artificial Intelligence Applications to Environmental Science 88th AMS Annual Meeting, New Orleans, LA 20-24 January 2008 J2.6 Imputation of missing with nonlinear relationships Michael

More information

Exploring the Impact of Missing Data in Multiple Regression

Exploring the Impact of Missing Data in Multiple Regression Exploring the Impact of Missing Data in Multiple Regression Michael G Kenward London School of Hygiene and Tropical Medicine 28th May 2015 1. Introduction In this note we are concerned with the conduct

More information

Multiple Regression Analysis

Multiple Regression Analysis Multiple Regression Analysis Basic Concept: Extend the simple regression model to include additional explanatory variables: Y = β 0 + β1x1 + β2x2 +... + βp-1xp + ε p = (number of independent variables

More information

Validity and reliability of measurements

Validity and reliability of measurements Validity and reliability of measurements 2 Validity and reliability of measurements 4 5 Components in a dataset Why bother (examples from research) What is reliability? What is validity? How should I treat

More information

Technical Track Session IV Instrumental Variables

Technical Track Session IV Instrumental Variables Impact Evaluation Technical Track Session IV Instrumental Variables Christel Vermeersch Beijing, China, 2009 Human Development Human Network Development Network Middle East and North Africa Region World

More information

Food Production and Violent Conflict in Sub-Saharan Africa. - Supplementary Information -

Food Production and Violent Conflict in Sub-Saharan Africa. - Supplementary Information - Food Production and Violent Conflict in Sub-Saharan Africa - Supplementary Information - Halvard Buhaug a,b,1, Tor A. Benaminsen c,a, Espen Sjaastad c & Ole Magnus Theisen b,a a Peace Research Institute

More information

Comparing Multiple Imputation to Single Imputation in the Presence of Large Design Effects: A Case Study and Some Theory

Comparing Multiple Imputation to Single Imputation in the Presence of Large Design Effects: A Case Study and Some Theory Comparing Multiple Imputation to Single Imputation in the Presence of Large Design Effects: A Case Study and Some Theory Nathaniel Schenker Deputy Director, National Center for Health Statistics* (and

More information

1. Objective: analyzing CD4 counts data using GEE marginal model and random effects model. Demonstrate the analysis using SAS and STATA.

1. Objective: analyzing CD4 counts data using GEE marginal model and random effects model. Demonstrate the analysis using SAS and STATA. LDA lab Feb, 6 th, 2002 1 1. Objective: analyzing CD4 counts data using GEE marginal model and random effects model. Demonstrate the analysis using SAS and STATA. 2. Scientific question: estimate the average

More information

Jane T. Bertrand and Margaret Farrell-Ross. Supported by a grant from the Bill and Melinda Gates Foundation (#OPP )

Jane T. Bertrand and Margaret Farrell-Ross. Supported by a grant from the Bill and Melinda Gates Foundation (#OPP ) Jane T. Bertrand and Margaret Farrell-Ross T l Tulane S School h l off Public P bli Health H l h and d Tropical T i lm Medicine di i Supported by a grant from the Bill and Melinda Gates Foundation (#OPP1017071)

More information

In this module I provide a few illustrations of options within lavaan for handling various situations.

In this module I provide a few illustrations of options within lavaan for handling various situations. In this module I provide a few illustrations of options within lavaan for handling various situations. An appropriate citation for this material is Yves Rosseel (2012). lavaan: An R Package for Structural

More information

Multiple imputation for multivariate missing-data. problems: a data analyst s perspective. Joseph L. Schafer and Maren K. Olsen

Multiple imputation for multivariate missing-data. problems: a data analyst s perspective. Joseph L. Schafer and Maren K. Olsen Multiple imputation for multivariate missing-data problems: a data analyst s perspective Joseph L. Schafer and Maren K. Olsen The Pennsylvania State University March 9, 1998 1 Abstract Analyses of multivariate

More information

SUPPLEMENTARY APPENDICES FOR ONLINE PUBLICATION. Supplement to: All-Cause Mortality Reductions from. Measles Catch-Up Campaigns in Africa

SUPPLEMENTARY APPENDICES FOR ONLINE PUBLICATION. Supplement to: All-Cause Mortality Reductions from. Measles Catch-Up Campaigns in Africa SUPPLEMENTARY APPENDICES FOR ONLINE PUBLICATION Supplement to: All-Cause Mortality Reductions from Measles Catch-Up Campaigns in Africa (by Ariel BenYishay and Keith Kranker) 1 APPENDIX A: DATA DHS survey

More information

Friday, August 25 th

Friday, August 25 th Friday, August 25 th Happy Friday!! As you come in, please: Student s choice on seats choose wisely or you may be moved! Front table pick up a Unit Three Overview and read it FYI Our Early Release Schedule

More information

Outline. Introduction GitHub and installation Worked example Stata wishes Discussion. mrrobust: a Stata package for MR-Egger regression type analyses

Outline. Introduction GitHub and installation Worked example Stata wishes Discussion. mrrobust: a Stata package for MR-Egger regression type analyses mrrobust: a Stata package for MR-Egger regression type analyses London Stata User Group Meeting 2017 8 th September 2017 Tom Palmer Wesley Spiller Neil Davies Outline Introduction GitHub and installation

More information

Scaling Up Nutrition Action for Africa

Scaling Up Nutrition Action for Africa Scaling Up Nutrition Action for Africa Where are we and what challenges are need to be addressed to accelerate malnutrition? Lawrence Haddad Global Alliance for Improved Nutrition Why should African political

More information

bivariate analysis: The statistical analysis of the relationship between two variables.

bivariate analysis: The statistical analysis of the relationship between two variables. bivariate analysis: The statistical analysis of the relationship between two variables. cell frequency: The number of cases in a cell of a cross-tabulation (contingency table). chi-square (χ 2 ) test for

More information

Missing data. Patrick Breheny. April 23. Introduction Missing response data Missing covariate data

Missing data. Patrick Breheny. April 23. Introduction Missing response data Missing covariate data Missing data Patrick Breheny April 3 Patrick Breheny BST 71: Bayesian Modeling in Biostatistics 1/39 Our final topic for the semester is missing data Missing data is very common in practice, and can occur

More information

The Relative Performance of Full Information Maximum Likelihood Estimation for Missing Data in Structural Equation Models

The Relative Performance of Full Information Maximum Likelihood Estimation for Missing Data in Structural Equation Models University of Nebraska - Lincoln DigitalCommons@University of Nebraska - Lincoln Educational Psychology Papers and Publications Educational Psychology, Department of 7-1-2001 The Relative Performance of

More information

Kelvin Chan Feb 10, 2015

Kelvin Chan Feb 10, 2015 Underestimation of Variance of Predicted Mean Health Utilities Derived from Multi- Attribute Utility Instruments: The Use of Multiple Imputation as a Potential Solution. Kelvin Chan Feb 10, 2015 Outline

More information

Software Cost Estimation with Incomplete Data *

Software Cost Estimation with Incomplete Data * National Research Council Canada Institute for Information Technology Conseil national de recherches Canada Institut de technologie de l'information Software Cost Estimation with Incomplete Data * Strike,

More information

Supplemental Appendix for Beyond Ricardo: The Link Between Intraindustry. Timothy M. Peterson Oklahoma State University

Supplemental Appendix for Beyond Ricardo: The Link Between Intraindustry. Timothy M. Peterson Oklahoma State University Supplemental Appendix for Beyond Ricardo: The Link Between Intraindustry Trade and Peace Timothy M. Peterson Oklahoma State University Cameron G. Thies University of Iowa A-1 This supplemental appendix

More information

Bayesian melding for estimating uncertainty in national HIV prevalence estimates

Bayesian melding for estimating uncertainty in national HIV prevalence estimates Bayesian melding for estimating uncertainty in national HIV prevalence estimates Leontine Alkema, Adrian E. Raftery and Tim Brown 1 Working Paper no. 82 Center for Statistics and the Social Sciences University

More information

Demographic Transitions, Solidarity Networks and Inequality Among African Children: The Case of Child Survival? Vongai Kandiwa

Demographic Transitions, Solidarity Networks and Inequality Among African Children: The Case of Child Survival? Vongai Kandiwa Demographic Transitions, Solidarity Networks and Inequality Among African Children: The Case of Child Survival? Vongai Kandiwa PhD Candidate, Development Sociology and Demography Cornell University, 435

More information

On average, about half the respondents to surveys

On average, about half the respondents to surveys American Political Science Review Vol. 95, No. 1 March 2001 Analyzing Incomplete Political Science Data: An Alternative Algorithm for Multiple Imputation GARY KING Harvard University JAMES HONAKER Harvard

More information

Dan Lizotte, Lacey Gunter, Eric Laber, Susan Murphy Department of Statistics - University of Michigan

Dan Lizotte, Lacey Gunter, Eric Laber, Susan Murphy Department of Statistics - University of Michigan REINFORCEMENT LEARNING FOR CLINICAL TRIAL DATA Dan Lizotte, Lacey Gunter, Eric Laber, Susan Murphy Department of Statistics - University of Michigan it s not what you think I say: Reinforcement Learning

More information

Help! Statistics! Missing data. An introduction

Help! Statistics! Missing data. An introduction Help! Statistics! Missing data. An introduction Sacha la Bastide-van Gemert Medical Statistics and Decision Making Department of Epidemiology UMCG Help! Statistics! Lunch time lectures What? Frequently

More information

Validity and reliability of measurements

Validity and reliability of measurements Validity and reliability of measurements 2 3 Request: Intention to treat Intention to treat and per protocol dealing with cross-overs (ref Hulley 2013) For example: Patients who did not take/get the medication

More information

PEER REVIEW HISTORY ARTICLE DETAILS VERSION 1 - REVIEW. Ball State University

PEER REVIEW HISTORY ARTICLE DETAILS VERSION 1 - REVIEW. Ball State University PEER REVIEW HISTORY BMJ Open publishes all reviews undertaken for accepted manuscripts. Reviewers are asked to complete a checklist review form (see an example) and are provided with free text boxes to

More information

Best Practice in Handling Cases of Missing or Incomplete Values in Data Analysis: A Guide against Eliminating Other Important Data

Best Practice in Handling Cases of Missing or Incomplete Values in Data Analysis: A Guide against Eliminating Other Important Data Best Practice in Handling Cases of Missing or Incomplete Values in Data Analysis: A Guide against Eliminating Other Important Data Sub-theme: Improving Test Development Procedures to Improve Validity Dibu

More information

4. STATA output of the analysis

4. STATA output of the analysis Biostatistics(1.55) 1. Objective: analyzing epileptic seizures data using GEE marginal model in STATA.. Scientific question: Determine whether the treatment reduces the rate of epileptic seizures. 3. Dataset:

More information

Dr. Kelly Bradley Final Exam Summer {2 points} Name

Dr. Kelly Bradley Final Exam Summer {2 points} Name {2 points} Name You MUST work alone no tutors; no help from classmates. Email me or see me with questions. You will receive a score of 0 if this rule is violated. This exam is being scored out of 00 points.

More information

educational assessment and educational measurement

educational assessment and educational measurement EDUCATIONAL ASSESSMENT AND EDUCATIONAL MEASUREMENT research line 5 educational assessment and educational measurement EDUCATIONAL ASSESSMENT AND EDUCATIONAL MEASUREMENT 98 1 Educational Assessment 100

More information

Spectrum. Quick Start Tutorial

Spectrum. Quick Start Tutorial Spectrum Quick Start Tutorial March 2005 Table of Contents Introduction... 2 What you will learn... 4 Basic Steps in Using Spectrum... 4 Step 1. Installing Spectrum... 4 Step 2. Changing the language in

More information

PAA 2010 (Dallas) Session 73: Family Planning, Reproductive Health and Fertility in Africa

PAA 2010 (Dallas) Session 73: Family Planning, Reproductive Health and Fertility in Africa Reconstructing Fertility Trends in Sub-Saharan Africa by Combining Multiple Surveys Affected by Data Quality Problems Bruno Schoumaker 1, Université catholique de Louvain Draft (preliminary results) April

More information

Comparison of methods for imputing limited-range variables: a simulation study

Comparison of methods for imputing limited-range variables: a simulation study Rodwell et al. BMC Medical Research Methodology 2014, 14:57 RESEARCH ARTICLE Open Access Comparison of methods for imputing limited-range variables: a simulation study Laura Rodwell 1,2*, Katherine J Lee

More information

Bayesian hierarchical modelling

Bayesian hierarchical modelling Bayesian hierarchical modelling Matthew Schofield Department of Mathematics and Statistics, University of Otago Bayesian hierarchical modelling Slide 1 What is a statistical model? A statistical model:

More information

Multiple Treatments on the Same Experimental Unit. Lukas Meier (most material based on lecture notes and slides from H.R. Roth)

Multiple Treatments on the Same Experimental Unit. Lukas Meier (most material based on lecture notes and slides from H.R. Roth) Multiple Treatments on the Same Experimental Unit Lukas Meier (most material based on lecture notes and slides from H.R. Roth) Introduction We learned that blocking is a very helpful technique to reduce

More information

Bayesian melding for estimating uncertainty in national HIV prevalence estimates

Bayesian melding for estimating uncertainty in national HIV prevalence estimates Bayesian melding for estimating uncertainty in national HIV prevalence estimates L Alkema, 1 A E Raftery, 1 T Brown 2 Supplement 1 University of Washington, Center for Statistics and the Social Sciences,

More information

SESUG Paper SD

SESUG Paper SD SESUG Paper SD-106-2017 Missing Data and Complex Sample Surveys Using SAS : The Impact of Listwise Deletion vs. Multiple Imputation Methods on Point and Interval Estimates when Data are MCAR, MAR, and

More information

Benchmark Dose Modeling Cancer Models. Allen Davis, MSPH Jeff Gift, Ph.D. Jay Zhao, Ph.D. National Center for Environmental Assessment, U.S.

Benchmark Dose Modeling Cancer Models. Allen Davis, MSPH Jeff Gift, Ph.D. Jay Zhao, Ph.D. National Center for Environmental Assessment, U.S. Benchmark Dose Modeling Cancer Models Allen Davis, MSPH Jeff Gift, Ph.D. Jay Zhao, Ph.D. National Center for Environmental Assessment, U.S. EPA Disclaimer The views expressed in this presentation are those

More information

Hakan Demirtas, Anup Amatya, Oksana Pugach, John Cursio, Fei Shi, David Morton and Beyza Doganay 1. INTRODUCTION

Hakan Demirtas, Anup Amatya, Oksana Pugach, John Cursio, Fei Shi, David Morton and Beyza Doganay 1. INTRODUCTION Statistics and Its Interface Volume 2 (2009) 449 456 Accuracy versus convenience: A simulation-based comparison of two continuous imputation models for incomplete ordinal longitudinal clinical trials data

More information

Adjusting for mode of administration effect in surveys using mailed questionnaire and telephone interview data

Adjusting for mode of administration effect in surveys using mailed questionnaire and telephone interview data Adjusting for mode of administration effect in surveys using mailed questionnaire and telephone interview data Karl Bang Christensen National Institute of Occupational Health, Denmark Helene Feveille National

More information

Machine Learning to Inform Breast Cancer Post-Recovery Surveillance

Machine Learning to Inform Breast Cancer Post-Recovery Surveillance Machine Learning to Inform Breast Cancer Post-Recovery Surveillance Final Project Report CS 229 Autumn 2017 Category: Life Sciences Maxwell Allman (mallman) Lin Fan (linfan) Jamie Kang (kangjh) 1 Introduction

More information

Treatment effect estimates adjusted for small-study effects via a limit meta-analysis

Treatment effect estimates adjusted for small-study effects via a limit meta-analysis Treatment effect estimates adjusted for small-study effects via a limit meta-analysis Gerta Rücker 1, James Carpenter 12, Guido Schwarzer 1 1 Institute of Medical Biometry and Medical Informatics, University

More information

On Missing Data and Genotyping Errors in Association Studies

On Missing Data and Genotyping Errors in Association Studies On Missing Data and Genotyping Errors in Association Studies Department of Biostatistics Johns Hopkins Bloomberg School of Public Health May 16, 2008 Specific Aims of our R01 1 Develop and evaluate new

More information

Inverse Probability of Censoring Weighting for Selective Crossover in Oncology Clinical Trials.

Inverse Probability of Censoring Weighting for Selective Crossover in Oncology Clinical Trials. Paper SP02 Inverse Probability of Censoring Weighting for Selective Crossover in Oncology Clinical Trials. José Luis Jiménez-Moro (PharmaMar, Madrid, Spain) Javier Gómez (PharmaMar, Madrid, Spain) ABSTRACT

More information

MEASUREMENT OF SKILLED PERFORMANCE

MEASUREMENT OF SKILLED PERFORMANCE MEASUREMENT OF SKILLED PERFORMANCE Name: Score: Part I: Introduction The most common tasks used for evaluating performance in motor behavior may be placed into three categories: time, response magnitude,

More information

Effect of National Immunizations Days on Immunization Coverage, Child Morbidity and Mortality: Evidence from Regression Discontinuity Design

Effect of National Immunizations Days on Immunization Coverage, Child Morbidity and Mortality: Evidence from Regression Discontinuity Design Effect of National Immunizations Days on Immunization Coverage, Child Morbidity and Mortality: Evidence from Regression Discontinuity Design Patrick Opoku Asuming and Stephane Helleringer EXTENDED ABSTRACT

More information

The prevention and handling of the missing data

The prevention and handling of the missing data Review Article Korean J Anesthesiol 2013 May 64(5): 402-406 http://dx.doi.org/10.4097/kjae.2013.64.5.402 The prevention and handling of the missing data Department of Anesthesiology and Pain Medicine,

More information

Carrying out an Empirical Project

Carrying out an Empirical Project Carrying out an Empirical Project Empirical Analysis & Style Hint Special program: Pre-training 1 Carrying out an Empirical Project 1. Posing a Question 2. Literature Review 3. Data Collection 4. Econometric

More information

G , G , G MHRN

G , G , G MHRN Estimation of treatment-effects from randomized controlled trials in the presence non-compliance with randomized treatment allocation Graham Dunn University of Manchester Research funded by: MRC Methodology

More information

Using response time data to inform the coding of omitted responses

Using response time data to inform the coding of omitted responses Psychological Test and Assessment Modeling, Volume 58, 2016 (4), 671-701 Using response time data to inform the coding of omitted responses Jonathan P. Weeks 1, Matthias von Davier & Kentaro Yamamoto Abstract

More information

African Gender and Development Index

African Gender and Development Index African Gender and Development Index EXPERTS MEETING ON METHODOLOGIES FOR REGIONAL INTEGRATION INDEX 26 SEPTEMBER 2018 ADDIS ABABA, ETHIOPIA Outline 1. The African Gender and Development Index 2. Gender

More information

Advanced IPD meta-analysis methods for observational studies

Advanced IPD meta-analysis methods for observational studies Advanced IPD meta-analysis methods for observational studies Simon Thompson University of Cambridge, UK Part 4 IBC Victoria, July 2016 1 Outline of talk Usual measures of association (e.g. hazard ratios)

More information

Designing and Analyzing RCTs. David L. Streiner, Ph.D.

Designing and Analyzing RCTs. David L. Streiner, Ph.D. Designing and Analyzing RCTs David L. Streiner, Ph.D. Emeritus Professor, Department of Psychiatry & Behavioural Neurosciences, McMaster University Emeritus Professor, Department of Clinical Epidemiology

More information

Dichotomizing partial compliance and increased participant burden in factorial designs: the performance of four noncompliance methods

Dichotomizing partial compliance and increased participant burden in factorial designs: the performance of four noncompliance methods Merrill and McClure Trials (2015) 16:523 DOI 1186/s13063-015-1044-z TRIALS RESEARCH Open Access Dichotomizing partial compliance and increased participant burden in factorial designs: the performance of

More information

Bayesian Projection of Life Expectancy Accounting for the HIV/AIDS Epidemic

Bayesian Projection of Life Expectancy Accounting for the HIV/AIDS Epidemic Bayesian Projection of Accounting for the HIV/AIDS Epidemic arxiv:1608.07330v2 [stat.ap] 27 Sep 2016 1 Introduction Jessica Godwin and Adrian E. Raftery University of Washington August 25, 2016 Probabilistic

More information

Inclusive Strategy with Confirmatory Factor Analysis, Multiple Imputation, and. All Incomplete Variables. Jin Eun Yoo, Brian French, Susan Maller

Inclusive Strategy with Confirmatory Factor Analysis, Multiple Imputation, and. All Incomplete Variables. Jin Eun Yoo, Brian French, Susan Maller Inclusive strategy with CFA/MI 1 Running head: CFA AND MULTIPLE IMPUTATION Inclusive Strategy with Confirmatory Factor Analysis, Multiple Imputation, and All Incomplete Variables Jin Eun Yoo, Brian French,

More information

Analyzing diastolic and systolic blood pressure individually or jointly?

Analyzing diastolic and systolic blood pressure individually or jointly? Analyzing diastolic and systolic blood pressure individually or jointly? Chenglin Ye a, Gary Foster a, Lisa Dolovich b, Lehana Thabane a,c a. Department of Clinical Epidemiology and Biostatistics, McMaster

More information

Socioeconomic inequalities in HIV/AIDS prevalence in sub-saharan African countries: evidence from the Demographic Health Surveys

Socioeconomic inequalities in HIV/AIDS prevalence in sub-saharan African countries: evidence from the Demographic Health Surveys Hajizadeh et al. International Journal for Equity in Health 2014, 13:18 RESEARCH Open Access Socioeconomic inequalities in HIV/AIDS prevalence in sub-saharan African countries: evidence from the Demographic

More information

Diagnostic Procurement The Clinton Foundation HIV/AIDS Initiative

Diagnostic Procurement The Clinton Foundation HIV/AIDS Initiative Diagnostic Procurement The Clinton Foundation HIV/AIDS Initiative 28 October 2008 Geneva 1 CHAI Work in Diagnostics - Context Diagnostic Test Agreements - In 2002 when CHAI was formed, the price of many

More information

Problem 1) Match the terms to their definitions. Every term is used exactly once. (In the real midterm, there are fewer terms).

Problem 1) Match the terms to their definitions. Every term is used exactly once. (In the real midterm, there are fewer terms). Problem 1) Match the terms to their definitions. Every term is used exactly once. (In the real midterm, there are fewer terms). 1. Bayesian Information Criterion 2. Cross-Validation 3. Robust 4. Imputation

More information

STATISTICAL METHODS FOR DIAGNOSTIC TESTING: AN ILLUSTRATION USING A NEW METHOD FOR CANCER DETECTION XIN SUN. PhD, Kansas State University, 2012

STATISTICAL METHODS FOR DIAGNOSTIC TESTING: AN ILLUSTRATION USING A NEW METHOD FOR CANCER DETECTION XIN SUN. PhD, Kansas State University, 2012 STATISTICAL METHODS FOR DIAGNOSTIC TESTING: AN ILLUSTRATION USING A NEW METHOD FOR CANCER DETECTION by XIN SUN PhD, Kansas State University, 2012 A THESIS Submitted in partial fulfillment of the requirements

More information

What to do with missing data in clinical registry analysis?

What to do with missing data in clinical registry analysis? Melbourne 2011; Registry Special Interest Group What to do with missing data in clinical registry analysis? Rory Wolfe Acknowledgements: James Carpenter, Gerard O Reilly Department of Epidemiology & Preventive

More information

Gender, Poverty, and Health in Sub-Saharan Africa: A Framework for Analysis

Gender, Poverty, and Health in Sub-Saharan Africa: A Framework for Analysis Gender, Poverty, and Health in Sub-Saharan Africa: A Framework for Analysis Pathways to Improved Health Outcomes Health outcomes Households/ Communities Household behaviors & risk factors Community factors

More information

EXPLANATION OF INDICATORS CHOSEN FOR THE 2017 ANNUAL SUN MOVEMENT PROGRESS REPORT

EXPLANATION OF INDICATORS CHOSEN FOR THE 2017 ANNUAL SUN MOVEMENT PROGRESS REPORT UNICEF / Zar Mon Annexes EXPLANATION OF INDICATORS CHOSEN FOR THE 2017 ANNUAL SUN MOVEMENT PROGRESS REPORT This report includes nine nutrition statistics, as per the 2017 Global Nutrition Report. These

More information

Development Through a Gender Lens: Implications for Poverty Reduction and Energy Sector Strategies

Development Through a Gender Lens: Implications for Poverty Reduction and Energy Sector Strategies Development Through a Gender Lens: Implications for Poverty Reduction and Energy Sector Strategies C. Mark Blackden Office of the Sector Director, PREM, Africa Region Biomass Energy Workshop February 26,

More information

Missing Data and Institutional Research

Missing Data and Institutional Research A version of this paper appears in Umbach, Paul D. (Ed.) (2005). Survey research. Emerging issues. New directions for institutional research #127. (Chapter 3, pp. 33-50). San Francisco: Jossey-Bass. Missing

More information

n Outline final paper, add to outline as research progresses n Update literature review periodically (check citeseer)

n Outline final paper, add to outline as research progresses n Update literature review periodically (check citeseer) Project Dilemmas How do I know when I m done? How do I know what I ve accomplished? clearly define focus/goal from beginning design a search method that handles plateaus improve some ML method s robustness

More information

( A JICA-IRRI-PhilRice Initiative) Presented by Noel Magor, Head Unit Impact Acceleration and Training Center, IRRI

( A JICA-IRRI-PhilRice Initiative) Presented by Noel Magor, Head Unit Impact Acceleration and Training Center, IRRI ( A JICA-IRRI-PhilRice Initiative) Presented by Noel Magor, Head Unit Impact Acceleration and Training Center, IRRI Long partnership between; IRRI & PhilRice IRRI & JICA PhilRice & JICA The Season Long

More information

Title. Description. Remarks. Motivating example. intro substantive Introduction to multiple-imputation analysis

Title. Description. Remarks. Motivating example. intro substantive Introduction to multiple-imputation analysis Title intro substantive Introduction to multiple-imputation analysis Description Missing data arise frequently. Various procedures have been suggested in the literature over the last several decades to

More information