Advanced Handling of Missing Data
Transcription
1 Advanced Handling of Missing Data: One-day Workshop, Nicole Janz
2 Goals
- Discuss types of missingness
- Know the advantages and disadvantages of missing data methods
- Learn multiple imputation
- Practical: diagnose, visualize and handle missing data in R
3 Steps in the research process
1. Identify patterns of missingness for each variable.
2. Why are data missing? Could this bias your sample?
3. How do other scholars in your field handle missingness?
4. Decide on a method to handle missingness for your particular variables.
5. Robustness: try different missing data methods, run your analysis, and compare the results.
4 Proportions of missingness per variable in a table
[Table: columns variable, nmiss, n, propmiss; rows country, year, UN_FDI_flow, US_fdi_electrical, US_fdi_machinery, US_fdi_transport, US_fdi_mining, US_fdi_services, US_fdi_petrol, US_fdi_utilities; cell values not recoverable]
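A table like the one on this slide can be sketched in a few lines. The workshop lab uses R; the Python below is only an illustration, and the toy values, the function name `missingness_table`, and the reading of `n` as the total number of rows are assumptions, not taken from the slides.

```python
from typing import Dict, List, Optional, Tuple


def missingness_table(data: Dict[str, List[Optional[object]]]) -> List[Tuple[str, int, int, float]]:
    """Return (variable, nmiss, n, propmiss) rows, highest proportion first.

    Assumption: n is the total number of rows, and None marks a missing value.
    """
    rows = []
    for name, values in data.items():
        n = len(values)
        nmiss = sum(v is None for v in values)
        rows.append((name, nmiss, n, nmiss / n if n else 0.0))
    # sorted() is stable, so variables with equal proportions keep their input order.
    return sorted(rows, key=lambda row: row[3], reverse=True)


# Hypothetical toy data that reuses a few variable names from the slide.
toy = {
    "country": ["A", "A", "B", "B"],
    "year": [1999, 2000, 1999, 2000],
    "US_fdi_mining": [0.4, None, None, None],
    "US_fdi_petrol": [None, 1.1, None, 0.9],
}

table = missingness_table(toy)
for variable, nmiss, n, propmiss in table:
    print(f"{variable:<15} {nmiss:>5} {n:>3} {propmiss:>8.2f}")
```

Sorting by proportion makes the variables most affected by missingness appear at the top, which is usually where diagnosis starts.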
5 Proportions of missingness per variable in a graph
[Figure: proportion of missingness per variable. Variables shown: Petrol/GDP, Mining/GDP, Other FDI/GDP, Deposit./GDP, Finance/GDP, US FDI/GDP, Wh.Trade/GDP, Food/GDP, Chemical/GDP, Metal/GDP, Transp./GDP, Machinery/GDP, Mosley Law, Mosley Prac., Mosley Labor, Electr./GDP, PTS, Democracy, CIRI Women, CIRI Phys., CIRI Emp., CIRI Worker, Trade, GDP p. capita, Population, Conflict, Fariss, Life exp., Inf.mort]
6 Time series: number of years with existing data
7 Heatmap per country-year and variable (yellow = missing)
8 Why are my data missing?
- Due to social/natural processes: school graduation, dropout, death
- A country does not exist anymore, e.g. the GDR
- The statistics office reclassified variables
- Intentional non-disclosure
- Skip patterns in surveys, e.g. only married respondents are asked certain follow-up questions
- Respondent refusal, e.g. income
9 Why are my data missing?
[Table: columns variable, nmiss, n, propmiss; rows US_fdi_mining, US_fdi_petrol, US_fdi_utilities; cell values not recoverable]
- Mining FDI is available until 1999
- Petrol FDI is available from 2000
- Utilities FDI is a new category that was introduced after
10 Three types of missingness
1. MCAR - Missing Completely at Random
2. MAR - Missing at Random
3. MNAR - Missing Not at Random
11 MCAR: Missing Completely at Random
Whether a value of y is missing depends neither on x nor on y. The probability of missingness is the same for all units.
- A survey respondent decides whether to answer the earnings question by rolling a die and refuses to answer if a 6 shows up.
- Some survey questions are asked only of a simple random sample of the original sample.
What to do: if data are missing completely at random, throwing out cases with missing data does not bias your inferences -> do listwise deletion, then run the analysis.
12 MAR: Missing at Random
The probability that a variable is missing depends only on observed data, not on the missing data itself or on unobserved data.
- If sex, race, education, and age are recorded for all the people in the survey, then earnings is MAR if the probability of nonresponse depends only on these variables.
- If men are more likely to tell you their weight than women, and we record gender, then weight is MAR.
What to do? Some say listwise deletion is fine, but only if the regression controls for all variables that affect the probability of missingness. More common: use multiple imputation (MI), because listwise deletion introduces bias.
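The difference between the die-roll example (MCAR) and the weight example (MAR) can be made concrete with a small simulation. This is a sketch under stated assumptions: the group means, standard deviation, and response probabilities are invented for illustration, and the lab itself works in R rather than Python.

```python
import random
from statistics import mean

rng = random.Random(2024)  # fixed seed so the sketch is reproducible

# Hypothetical population: gender is always recorded, weight may go missing.
people = []
for _ in range(20000):
    is_man = rng.random() < 0.5
    weight = rng.gauss(85.0 if is_man else 70.0, 10.0)  # assumed group means
    people.append((is_man, weight))

true_mean = mean(w for _, w in people)

# MCAR: every respondent rolls a die and refuses to answer on a 6.
mcar_observed = [w for _, w in people if rng.randint(1, 6) != 6]

# MAR: the chance of answering depends only on the recorded gender
# (assumed response rates: 0.9 for men, 0.5 for women).
mar_observed = [(m, w) for m, w in people if rng.random() < (0.9 if m else 0.5)]

mcar_mean = mean(mcar_observed)
mar_mean = mean(w for _, w in mar_observed)

# Within a gender, the observed cases are still representative under MAR.
men_true = mean(w for m, w in people if m)
men_mar = mean(w for m, w in mar_observed if m)

print(f"true mean             {true_mean:.2f}")
print(f"complete cases, MCAR  {mcar_mean:.2f}")
print(f"complete cases, MAR   {mar_mean:.2f}")
print(f"men only: true {men_true:.2f}, observed under MAR {men_mar:.2f}")
```

Under these assumptions the MCAR complete-case mean stays close to the true mean, while the MAR complete-case mean drifts toward the group that responds more often. Conditioning on gender removes the drift, which is the intuition behind controlling for (or imputing from) the variables that drive missingness.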
13 MNAR: Missing Not at Random (non-ignorable missingness)
Missingness depends at least in part on unobserved factors. Special case: missingness depends on the variable that is itself missing.
- People with college degrees are less likely to reveal their earnings, and we don't have education data for all respondents.
- If a particular treatment causes discomfort, a patient is more likely to drop out of the study. We don't have a measure of discomfort for all patients.
- Respondents with high income are less likely to report their income.
14 MNAR: Missing Not at Random (non-ignorable missingness)
What to do? This is the most problematic case, because potential lurking variables are often unobserved.
- Use MI based on auxiliary, external data, e.g. estimate race based on Census data associated with the address of the respondent.
- Try to include as many predictors as possible in the model to bring MNAR closer to MAR.
15 How to distinguish between MNAR and MAR?
- Think about your variables and use your substantive scientific knowledge of the data and your field.
- Can you collect more data that explain missingness, or is it very likely that they will remain unobserved?
- What does the literature say about predictors of that particular missing variable?
16 How to distinguish between MAR and MCAR?
Again, think about the data. Some indication (but no definitive answer) can be gained from two tests.
1) Little's test for MCAR (Little 1988)
- Maximum likelihood chi-square test for missing completely at random.
- H0 is that the data are MCAR.
- If the p-value for Little's MCAR test is not significant, the data may be assumed to be MCAR and missingness is ignorable (do listwise deletion).
- Software: mcartest in Stata; EM option in SPSS; in R see lab.
17 How to distinguish between MAR and MCAR?
2) Dummy variable approach for MCAR
- Create a dummy variable for whether a variable is missing: 1 = missing, 0 = observed.
- Run t-tests (continuous variables) and chi-square tests (categorical variables) between this dummy and the other variables to see whether missingness is related to the values of other variables.
- Tests that return a significant result indicate MAR rather than MCAR (-> use multiple imputation).
- Software: MVA option in SPSS; in R see lab.
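The dummy variable approach can be sketched as follows. The data, the helper names `missingness_dummy` and `welch_t`, and the choice of Welch's unequal-variance t-test are assumptions made for illustration; the slides only say "t-tests" and point to SPSS and R for the actual procedure.

```python
import math
from statistics import mean, variance


def missingness_dummy(values):
    """1 = missing, 0 = observed (None marks a missing value)."""
    return [1 if v is None else 0 for v in values]


def welch_t(a, b):
    """Welch's t statistic and approximate degrees of freedom for two samples."""
    va, vb = variance(a) / len(a), variance(b) / len(b)
    t = (mean(a) - mean(b)) / math.sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df


# Hypothetical survey: income has gaps, age is fully observed.
income = [None, 30, 42, None, 35, None, 28, 50, None, 33]
age = [61, 34, 45, 58, 39, 66, 29, 48, 63, 36]

miss = missingness_dummy(income)
age_if_missing = [a for a, m in zip(age, miss) if m == 1]
age_if_observed = [a for a, m in zip(age, miss) if m == 0]

t, df = welch_t(age_if_missing, age_if_observed)
print(f"t = {t:.2f} on about {df:.1f} degrees of freedom")
```

A large absolute t value (compared against a t table, or converted to a p-value with a statistics package) suggests that missingness in income is related to age, which points to MAR rather than MCAR. The sketch stops at the test statistic because the Python standard library has no t distribution.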
18 Ad-hoc methods
Listwise deletion (complete case analysis)
- Done automatically in regression by most software, or by hand; assumes MCAR
- If data are MAR or MNAR, it introduces a biased sample
- Reduces sample size
Pairwise deletion (available case analysis)
- Different aspects of a problem are studied with different subsets of the data
- Results between subsets are not consistent or comparable
- If the non-respondents differ systematically from the respondents, this will bias the available-case summaries
- Potential omitted variable bias if a complete variable is excluded because of its high missingness
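The two deletion strategies differ in which rows survive, which a toy example makes visible. The rows and variable names below are hypothetical, and the helper names are invented for this sketch.

```python
from itertools import combinations

# Hypothetical country-year rows; None marks a missing value.
rows = [
    {"fdi": 1.2, "gdp": 10.0, "democracy": 7},
    {"fdi": None, "gdp": 12.5, "democracy": 8},
    {"fdi": 0.8, "gdp": None, "democracy": 3},
    {"fdi": 2.1, "gdp": 20.0, "democracy": None},
    {"fdi": 1.5, "gdp": 15.0, "democracy": 6},
]


def listwise(rows):
    """Keep only rows in which every variable is observed."""
    return [r for r in rows if all(v is not None for v in r.values())]


def pairwise_counts(rows):
    """Number of usable rows for each pair of variables."""
    names = list(rows[0])
    return {
        (a, b): sum(r[a] is not None and r[b] is not None for r in rows)
        for a, b in combinations(names, 2)
    }


complete = listwise(rows)
counts = pairwise_counts(rows)

print(f"listwise deletion keeps {len(complete)} of {len(rows)} rows")
for pair, n in counts.items():
    print(f"pairwise {pair}: {n} rows")
```

Listwise deletion keeps one common subset for every analysis, at the cost of sample size. Pairwise deletion keeps more rows per pair, but each pair is computed on a different subset, which is why results across subsets are hard to compare.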
19 Ad-hoc methods
Last value carried forward
- Replace missing outcome values with the pre-treatment measure
- Would lead to underestimates of the true treatment effect
- Ignores changes over time
Mean imputation
- The easiest way to impute is to replace each NA with the mean
- Distorts the distribution of the variable, e.g. underestimates the standard deviation
- Ignores changes over time
Filling in values manually based on case-based knowledge from other sources
- Time-consuming
- Prone to measurement error
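Both shortcuts are easy to write down, which also makes their side effects easy to see. This is a minimal sketch with invented scores; `mean_impute` and `locf` are hypothetical helper names.

```python
from statistics import mean, stdev


def mean_impute(values):
    """Replace every missing value (None) with the mean of the observed values."""
    observed = [v for v in values if v is not None]
    m = mean(observed)
    return [m if v is None else v for v in values]


def locf(values):
    """Carry the last observed value forward; leading gaps stay None."""
    out, last = [], None
    for v in values:
        if v is not None:
            last = v
        out.append(last)
    return out


# Hypothetical repeated measurements with gaps.
scores = [10.0, None, 14.0, None, None, 20.0, 12.0]

observed = [v for v in scores if v is not None]
imputed = mean_impute(scores)
carried = locf(scores)

print(f"sd of observed values: {stdev(observed):.2f}")
print(f"sd after mean imputation: {stdev(imputed):.2f}")
print(f"last value carried forward: {carried}")
```

Every mean-imputed value sits exactly at the centre of the distribution, so the standard deviation shrinks. Carrying the last value forward freezes the series at its previous level, which is how it hides change over time.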
20 Single imputation
- Impute missing values with the predicted values from a regression.
- The error for these cases becomes zero. However, random errors are a feature of the real world, and a variable treated with single imputation will be fundamentally different from the other variables.
- This leads to overconfidence in our models and biases the coefficients upwards.
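The point about the error becoming zero can be checked directly: values imputed from a fitted line lie exactly on that line. The sketch below uses a hand-rolled least-squares fit on invented data; `fit_line` is a hypothetical helper, not part of any package named in the slides.

```python
from statistics import mean


def fit_line(x, y):
    """Ordinary least squares for y = intercept + slope * x."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return my - slope * mx, slope


# Hypothetical data: y is missing for two cases.
x = [1, 2, 3, 4, 5, 6]
y = [2.1, 3.9, None, 8.2, None, 11.8]

pairs = [(xi, yi) for xi, yi in zip(x, y) if yi is not None]
intercept, slope = fit_line([p[0] for p in pairs], [p[1] for p in pairs])

# Single (deterministic) regression imputation: fill gaps with predictions.
y_imputed = [yi if yi is not None else intercept + slope * xi for xi, yi in zip(x, y)]

residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y_imputed)]
for xi, yi, r, orig in zip(x, y_imputed, residuals, y):
    tag = "imputed" if orig is None else "observed"
    print(f"x={xi} y={yi:.2f} residual={r:+.3f} ({tag})")
```

The observed cases scatter around the line, while the imputed cases have a residual of zero. A model refitted on the completed data therefore sees less noise than really exists, which is the source of the overconfidence described on the slide.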
21 Multiple Imputation Techniques
- Multiple imputation (MI) is also based on the idea of using predicted values, but it builds in mechanisms to incorporate uncertainty about the predicted values.
- MI imputes values for each missing data point, but it does so n times (usually 5). It then creates n (5) completed data sets.
- The observed values remain the same, but the imputed values vary across these 5 data sets, reflecting uncertainty. MI is much closer to reality when calculating new values.
- MI is a good alternative to listwise deletion because the main assumption is that data are MAR, meaning that some other variables in the data set may (and should) explain why an observation is missing.
22 Multiple Imputation Techniques
For details on the expectation-maximization (EM) algorithm, see King et al. (2001).
[Figure]
23 Combination of results
- Run each analysis (e.g. regression) on all 5 imputed data sets.
- Collect all 5 coefficients and standard errors (and other measures of interest), and combine them into one estimate according to Rubin's rules (Rubin 1987):
- Estimates: average of the individual estimates
- Standard error: combine the between-imputation variance and the within-imputation variance
See King et al. (2001).
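The combination step can be written out for a single coefficient. This sketch follows the standard form of Rubin's rules (total variance = within-imputation variance plus (1 + 1/m) times the between-imputation variance); the five estimates and standard errors are invented, and `pool_rubin` is a hypothetical helper name.

```python
import math


def pool_rubin(estimates, std_errors):
    """Pool one coefficient across m imputed data sets with Rubin's rules.

    Returns the pooled estimate and its pooled standard error.
    """
    m = len(estimates)
    q_bar = sum(estimates) / m                                   # average estimate
    within = sum(se ** 2 for se in std_errors) / m               # within-imputation variance
    between = sum((q - q_bar) ** 2 for q in estimates) / (m - 1) # between-imputation variance
    total = within + (1 + 1 / m) * between
    return q_bar, math.sqrt(total)


# Hypothetical results of the same regression run on 5 imputed data sets.
coefs = [0.50, 0.55, 0.45, 0.52, 0.48]
ses = [0.10, 0.10, 0.10, 0.10, 0.10]

pooled_coef, pooled_se = pool_rubin(coefs, ses)
print(f"pooled estimate {pooled_coef:.3f}, pooled standard error {pooled_se:.4f}")
```

The pooled standard error is larger than any single within-imputation standard error whenever the estimates differ across data sets. That extra term is how MI carries the uncertainty about the imputed values into the final result.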
24 Multiple Imputation Software
- {Amelia} in R (by Gary King and collaborators)
- {mi} in R (by Andrew Gelman and collaborators)
- {mice} in R (by Stef van Buuren and collaborators)
- SPSS (Analyze > Multiple Imputation)
- Stata (mi estimate)
25 Social Sciences Research Methods Centre Lab
26 Summarizing and Visualizing Missingness in R
- % of missingness per variable and subsets of variables
- Graphical display
- Using Amelia for diagnosis of missingness
27 MCAR patterns?
1) BaylorEdPsych (Little's test to diagnose MCAR) BaylorEdPsych.pdf
2) Creating a dummy variable for missingness (0/1), then running correlations among variables
28 Ad-hoc measures in R
1) Listwise deletion, pairwise deletion
2) Carry last value forward
3) Mean imputation
4) Manually recoding particular variables
5) Replace NAs with predicted values from regression
29 Example 1, adapted from Schlomer et al. (2010)
- 60 clients under the age of 21 at a large university counseling center were referred for counseling by the dean of students due to underage drinking violations.
- The counseling center randomly assigned the students to one of two treatment programs (independent variable: Group), one of which uses the harm reduction approach, while the other is based on a 12-step model.
- Participants' self-efficacy for sobriety was measured before (covariate) and after (dependent variable) the counseling.
- 7 variations of the DV: DV with no missing values; DV with 10%, 20%, and 50% MCAR; and DV with 10%, 20%, and 50% MAR.
30 Example 1, adapted from Schlomer et al. (2010)
Goal: using different missing data handling techniques, compare the bias in estimates of the mean, standard deviation, regression coefficient, and standard error when the DV has 20% missing at random against the case where the DV has 0% missing.
Step 1: Calculate M, SD, B, and SE with DV0Miss
Step 2: Create the target data set with DV20MAR
31 Example 1, adapted from Schlomer et al. (2010)
Describe missing patterns
- Summarize and visualize missingness
- Little's (1988) MCAR test
- Dummy code missingness
Ad-hoc methods
- Delete listwise or pairwise
- Carry last value forward
- Substitute with mean
- Recode manually
- Predict from regression
Multiple imputation
- Amelia II
32 Multiple Imputation with Amelia II
How to run an imputation in R, including diagnostics:
- run Amelia on a data set
- saving an imputed data set
- combining several data sets into an amelia object
- how to deal with ordinal, nominal, and natural log data
- time series cross-section
- lags and leads
- overimputation
- time series plots
33 Reproducibility
- Set the seed (!!!) for yourself and others
- When you re-run Amelia after diagnostics and want to make changes, it's best to re-use exactly what you had, with minimal changes
- Work in R, not the GUI version
- Keep your R script well commented; make a note of sessionInfo(), especially the Amelia and R versions used
34 Reproducibility
On 12/4/2012 5:40 AM, Nicole Janz wrote:
Dear, I'm a PhD student at Cambridge University, and I work on foreign investment and labor standards. I read your with great interest. I was wondering if you could make the imputation R code available to me? I am asking this because I am using Amelia as well, and I would like to try and replicate your imputation with the same specifications.
Reply:
Hi Nicole - Thanks for the note. Unfortunately, we did this in AmeliaView, so we don't have R code available (I assume you've found the replication data and Stata code on my website).
35 More practical tips
- Set the seed!
- Include any variable in the analysis model in your imputation model.
- Maybe use auxiliary variables if they make sense.
- Include variables in the form in which they enter the model (lags, logs, leads, transformations).
- Don't impute things that don't make sense!
- Don't impute decades of missing data.
- Check diagnostics.
36 Literature and tutorials
- Amelia mailing list
- Tutorial for three MI software packages by Thomas Leeper
- Missing Values Analysis & Data Imputation
- James Honaker and Gary King, What to do About Missing Values in Time Series Cross-Section Data, American Journal of Political Science, Vol. 54, No. 2 (April 2010): Pp
- Gary King, James Honaker, Anne Joseph, and Kenneth Scheve, Analyzing Incomplete Political Science Data: An Alternative Algorithm for Multiple Imputation, American Political Science Review, Vol. 95, No. 1 (March 2001): Pp
37 Literature and tutorials
- Andrew Gelman and Jennifer Hill, Data Analysis Using Regression and Multilevel/Hierarchical Models, Chapter 25: Missing-data imputation. Cambridge University Press, Cambridge (2006).
- Much Ado About Nothing: A Comparison of Missing Data Methods and Software to Fit Incomplete Data Regression Models
- Allison, Paul D. Missing Data. Sage University Papers Series on Quantitative Applications in the Social Sciences. Thousand Oaks: Sage.
- Enders, Craig. Applied Missing Data Analysis. Guilford Press: New York.
- Little, Roderick J., and Donald Rubin. Statistical Analysis with Missing Data. John Wiley & Sons, Inc: Hoboken.
- Schafer, Joseph L., and John W. Graham. Missing Data: Our View of the State of the Art. Psychological Methods.
38 Thank you! Nicole Janz
Logistic Regression with Missing Data: A Comparison of Handling Methods, and Effects of Percent Missing Values
Logistic Regression with Missing Data: A Comparison of Handling Methods, and Effects of Percent Missing Values Sutthipong Meeyai School of Transportation Engineering, Suranaree University of Technology,
More informationSelected Topics in Biostatistics Seminar Series. Missing Data. Sponsored by: Center For Clinical Investigation and Cleveland CTSC
Selected Topics in Biostatistics Seminar Series Missing Data Sponsored by: Center For Clinical Investigation and Cleveland CTSC Brian Schmotzer, MS Biostatistician, CCI Statistical Sciences Core brian.schmotzer@case.edu
More informationHelp! Statistics! Missing data. An introduction
Help! Statistics! Missing data. An introduction Sacha la Bastide-van Gemert Medical Statistics and Decision Making Department of Epidemiology UMCG Help! Statistics! Lunch time lectures What? Frequently
More informationMissing Data and Imputation
Missing Data and Imputation Barnali Das NAACCR Webinar May 2016 Outline Basic concepts Missing data mechanisms Methods used to handle missing data 1 What are missing data? General term: data we intended
More informationAMELIA II: A Package for Missing Data
AMELIA II: A Package for Missing Data James Honaker Gary King Matthew Blackwell July 24, 2009 I want to convince you of three things. I want to convince you of three things. 1 Missing data is a problem
More informationModern Strategies to Handle Missing Data: A Showcase of Research on Foster Children
Modern Strategies to Handle Missing Data: A Showcase of Research on Foster Children Anouk Goemans, MSc PhD student Leiden University The Netherlands Email: a.goemans@fsw.leidenuniv.nl Modern Strategies
More informationValidity and reliability of measurements
Validity and reliability of measurements 2 3 Request: Intention to treat Intention to treat and per protocol dealing with cross-overs (ref Hulley 2013) For example: Patients who did not take/get the medication
More informationA COMPARISON OF IMPUTATION METHODS FOR MISSING DATA IN A MULTI-CENTER RANDOMIZED CLINICAL TRIAL: THE IMPACT STUDY
A COMPARISON OF IMPUTATION METHODS FOR MISSING DATA IN A MULTI-CENTER RANDOMIZED CLINICAL TRIAL: THE IMPACT STUDY Lingqi Tang 1, Thomas R. Belin 2, and Juwon Song 2 1 Center for Health Services Research,
More informationValidity and reliability of measurements
Validity and reliability of measurements 2 Validity and reliability of measurements 4 5 Components in a dataset Why bother (examples from research) What is reliability? What is validity? How should I treat
More informationModule 14: Missing Data Concepts
Module 14: Missing Data Concepts Jonathan Bartlett & James Carpenter London School of Hygiene & Tropical Medicine Supported by ESRC grant RES 189-25-0103 and MRC grant G0900724 Pre-requisites Module 3
More informationAppendix 1. Sensitivity analysis for ACQ: missing value analysis by multiple imputation
Appendix 1 Sensitivity analysis for ACQ: missing value analysis by multiple imputation A sensitivity analysis was carried out on the primary outcome measure (ACQ) using multiple imputation (MI). MI is
More informationExploring the Impact of Missing Data in Multiple Regression
Exploring the Impact of Missing Data in Multiple Regression Michael G Kenward London School of Hygiene and Tropical Medicine 28th May 2015 1. Introduction In this note we are concerned with the conduct
More informationBest Practice in Handling Cases of Missing or Incomplete Values in Data Analysis: A Guide against Eliminating Other Important Data
Best Practice in Handling Cases of Missing or Incomplete Values in Data Analysis: A Guide against Eliminating Other Important Data Sub-theme: Improving Test Development Procedures to Improve Validity Dibu
More informationMissing Data and Institutional Research
A version of this paper appears in Umbach, Paul D. (Ed.) (2005). Survey research. Emerging issues. New directions for institutional research #127. (Chapter 3, pp. 33-50). San Francisco: Jossey-Bass. Missing
More informationPropensity Score Methods for Estimating Causality in the Absence of Random Assignment: Applications for Child Care Policy Research
2012 CCPRC Meeting Methodology Presession Workshop October 23, 2012, 2:00-5:00 p.m. Propensity Score Methods for Estimating Causality in the Absence of Random Assignment: Applications for Child Care Policy
More informationThe prevention and handling of the missing data
Review Article Korean J Anesthesiol 2013 May 64(5): 402-406 http://dx.doi.org/10.4097/kjae.2013.64.5.402 The prevention and handling of the missing data Department of Anesthesiology and Pain Medicine,
More informationS Imputation of Categorical Missing Data: A comparison of Multivariate Normal and. Multinomial Methods. Holmes Finch.
S05-2008 Imputation of Categorical Missing Data: A comparison of Multivariate Normal and Abstract Multinomial Methods Holmes Finch Matt Margraf Ball State University Procedures for the imputation of missing
More informationMultiple Imputation For Missing Data: What Is It And How Can I Use It?
Multiple Imputation For Missing Data: What Is It And How Can I Use It? Jeffrey C. Wayman, Ph.D. Center for Social Organization of Schools Johns Hopkins University jwayman@csos.jhu.edu www.csos.jhu.edu
More informationInclusive Strategy with Confirmatory Factor Analysis, Multiple Imputation, and. All Incomplete Variables. Jin Eun Yoo, Brian French, Susan Maller
Inclusive strategy with CFA/MI 1 Running head: CFA AND MULTIPLE IMPUTATION Inclusive Strategy with Confirmatory Factor Analysis, Multiple Imputation, and All Incomplete Variables Jin Eun Yoo, Brian French,
More informationBias in regression coefficient estimates when assumptions for handling missing data are violated: a simulation study
STATISTICAL METHODS Epidemiology Biostatistics and Public Health - 2016, Volume 13, Number 1 Bias in regression coefficient estimates when assumptions for handling missing data are violated: a simulation
More informationMaster thesis Department of Statistics
Master thesis Department of Statistics Masteruppsats, Statistiska institutionen Missing Data in the Swedish National Patients Register: Multiple Imputation by Fully Conditional Specification Jesper Hörnblad
More informationAnalysis of TB prevalence surveys
Workshop and training course on TB prevalence surveys with a focus on field operations Analysis of TB prevalence surveys Day 8 Thursday, 4 August 2011 Phnom Penh Babis Sismanidis with acknowledgements
More informationThe Relative Performance of Full Information Maximum Likelihood Estimation for Missing Data in Structural Equation Models
University of Nebraska - Lincoln DigitalCommons@University of Nebraska - Lincoln Educational Psychology Papers and Publications Educational Psychology, Department of 7-1-2001 The Relative Performance of
More informationDealing with Missing Data: A comparative exploration of approaches utilizing the Integrated City Sustainability Database
Dealing with Missing Data: A comparative exploration of approaches utilizing the Integrated City Sustainability Database Cali Curley, Rachel M. Krause, Richard Feiock, Christopher V. Hawkins Abstract:
More informationMethods for Computing Missing Item Response in Psychometric Scale Construction
American Journal of Biostatistics Original Research Paper Methods for Computing Missing Item Response in Psychometric Scale Construction Ohidul Islam Siddiqui Institute of Statistical Research and Training
More informationSection on Survey Research Methods JSM 2009
Missing Data and Complex Samples: The Impact of Listwise Deletion vs. Subpopulation Analysis on Statistical Bias and Hypothesis Test Results when Data are MCAR and MAR Bethany A. Bell, Jeffrey D. Kromrey
More informationMissing by Design: Planned Missing-Data Designs in Social Science
Research & Methods ISSN 1234-9224 Vol. 20 (1, 2011): 81 105 Institute of Philosophy and Sociology Polish Academy of Sciences, Warsaw www.ifi span.waw.pl e-mail: publish@ifi span.waw.pl Missing by Design:
More informationAccuracy of Range Restriction Correction with Multiple Imputation in Small and Moderate Samples: A Simulation Study
A peer-reviewed electronic journal. Copyright is retained by the first or sole author, who grants right of first publication to Practical Assessment, Research & Evaluation. Permission is granted to distribute
More informationIn this module I provide a few illustrations of options within lavaan for handling various situations.
In this module I provide a few illustrations of options within lavaan for handling various situations. An appropriate citation for this material is Yves Rosseel (2012). lavaan: An R Package for Structural
More informationMissing Data: Our View of the State of the Art
Psychological Methods Copyright 2002 by the American Psychological Association, Inc. 2002, Vol. 7, No. 2, 147 177 1082-989X/02/$5.00 DOI: 10.1037//1082-989X.7.2.147 Missing Data: Our View of the State
More informationAn Introduction to Multiple Imputation for Missing Items in Complex Surveys
An Introduction to Multiple Imputation for Missing Items in Complex Surveys October 17, 2014 Joe Schafer Center for Statistical Research and Methodology (CSRM) United States Census Bureau Views expressed
More informationPreliminary Report on Simple Statistical Tests (t-tests and bivariate correlations)
Preliminary Report on Simple Statistical Tests (t-tests and bivariate correlations) After receiving my comments on the preliminary reports of your datasets, the next step for the groups is to complete
More informationMISSING DATA AND PARAMETERS ESTIMATES IN MULTIDIMENSIONAL ITEM RESPONSE MODELS. Federico Andreis, Pier Alda Ferrari *
Electronic Journal of Applied Statistical Analysis EJASA (2012), Electron. J. App. Stat. Anal., Vol. 5, Issue 3, 431 437 e-issn 2070-5948, DOI 10.1285/i20705948v5n3p431 2012 Università del Salento http://siba-ese.unile.it/index.php/ejasa/index
More informationMissing data in clinical trials: making the best of what we haven t got.
Missing data in clinical trials: making the best of what we haven t got. Royal Statistical Society Professional Statisticians Forum Presentation by Michael O Kelly, Senior Statistical Director, IQVIA Copyright
More informationCOMMITTEE FOR PROPRIETARY MEDICINAL PRODUCTS (CPMP) POINTS TO CONSIDER ON MISSING DATA
The European Agency for the Evaluation of Medicinal Products Evaluation of Medicines for Human Use London, 15 November 2001 CPMP/EWP/1776/99 COMMITTEE FOR PROPRIETARY MEDICINAL PRODUCTS (CPMP) POINTS TO
More informationMaintenance of weight loss and behaviour. dietary intervention: 1 year follow up
Institute of Psychological Sciences FACULTY OF MEDICINE AND HEALTH Maintenance of weight loss and behaviour change Dropouts following and a 12 Missing week healthy Data eating dietary intervention: 1 year
More informationStatistical data preparation: management of missing values and outliers
KJA Korean Journal of Anesthesiology Statistical Round pissn 2005-6419 eissn 2005-7563 Statistical data preparation: management of missing values and outliers Sang Kyu Kwak 1 and Jong Hae Kim 2 Departments
More informationbivariate analysis: The statistical analysis of the relationship between two variables.
bivariate analysis: The statistical analysis of the relationship between two variables. cell frequency: The number of cases in a cell of a cross-tabulation (contingency table). chi-square (χ 2 ) test for
More informationAdjusting for mode of administration effect in surveys using mailed questionnaire and telephone interview data
Adjusting for mode of administration effect in surveys using mailed questionnaire and telephone interview data Karl Bang Christensen National Institute of Occupational Health, Denmark Helene Feveille National
More informationData Analysis Using Regression and Multilevel/Hierarchical Models
Data Analysis Using Regression and Multilevel/Hierarchical Models ANDREW GELMAN Columbia University JENNIFER HILL Columbia University CAMBRIDGE UNIVERSITY PRESS Contents List of examples V a 9 e xv " Preface
More informationWeek 10 Hour 1. Shapiro-Wilks Test (from last time) Cross-Validation. Week 10 Hour 2 Missing Data. Stat 302 Notes. Week 10, Hour 2, Page 1 / 32
Week 10 Hour 1 Shapiro-Wilks Test (from last time) Cross-Validation Week 10 Hour 2 Missing Data Stat 302 Notes. Week 10, Hour 2, Page 1 / 32 Cross-Validation in the Wild It s often more important to describe
More informationChapter 3 Missing data in a multi-item questionnaire are best handled by multiple imputation at the item score level
Chapter 3 Missing data in a multi-item questionnaire are best handled by multiple imputation at the item score level Published: Eekhout, I., de Vet, H.C.W., Twisk, J.W.R., Brand, J.P.L., de Boer, M.R.,
More informationStrategies for handling missing data in randomised trials
Strategies for handling missing data in randomised trials NIHR statistical meeting London, 13th February 2012 Ian White MRC Biostatistics Unit, Cambridge, UK Plan 1. Why do missing data matter? 2. Popular
More informationWELCOME! Lecture 11 Thommy Perlinger
Quantitative Methods II WELCOME! Lecture 11 Thommy Perlinger Regression based on violated assumptions If any of the assumptions are violated, potential inaccuracies may be present in the estimated regression
More informationGraphical Representation of Missing Data Problems
TECHNICAL REPORT R-448 January 2015 Structural Equation Modeling: A Multidisciplinary Journal, 22: 631 642, 2015 Copyright Taylor & Francis Group, LLC ISSN: 1070-5511 print / 1532-8007 online DOI: 10.1080/10705511.2014.937378
More informationCatherine A. Welch 1*, Séverine Sabia 1,2, Eric Brunner 1, Mika Kivimäki 1 and Martin J. Shipley 1
Welch et al. BMC Medical Research Methodology (2018) 18:89 https://doi.org/10.1186/s12874-018-0548-0 RESEARCH ARTICLE Open Access Does pattern mixture modelling reduce bias due to informative attrition
More informationRegression Analysis II
Regression Analysis II Lee D. Walker University of South Carolina e-mail: walker23@gwm.sc.edu COURSE OVERVIEW This course focuses on the theory, practice, and application of linear regression. As Agresti
More informationPolitical Science 15, Winter 2014 Final Review
Political Science 15, Winter 2014 Final Review The major topics covered in class are listed below. You should also take a look at the readings listed on the class website. Studying Politics Scientifically
More informationPEER REVIEW HISTORY ARTICLE DETAILS VERSION 1 - REVIEW. Ball State University
PEER REVIEW HISTORY BMJ Open publishes all reviews undertaken for accepted manuscripts. Reviewers are asked to complete a checklist review form (see an example) and are provided with free text boxes to
More informationBayesian approaches to handling missing data: Practical Exercises
Bayesian approaches to handling missing data: Practical Exercises 1 Practical A Thanks to James Carpenter and Jonathan Bartlett who developed the exercise on which this practical is based (funded by ESRC).
More informationDesign and Analysis Plan Quantitative Synthesis of Federally-Funded Teen Pregnancy Prevention Programs HHS Contract #HHSP I 5/2/2016
Design and Analysis Plan Quantitative Synthesis of Federally-Funded Teen Pregnancy Prevention Programs HHS Contract #HHSP233201500069I 5/2/2016 Overview The goal of the meta-analysis is to assess the effects
More informationMEASURES OF ASSOCIATION AND REGRESSION
DEPARTMENT OF POLITICAL SCIENCE AND INTERNATIONAL RELATIONS Posc/Uapp 816 MEASURES OF ASSOCIATION AND REGRESSION I. AGENDA: A. Measures of association B. Two variable regression C. Reading: 1. Start Agresti
More informationMissing data imputation: focusing on single imputation
Big-data Clinical Trial Column Page 1 of 8 Missing data imputation: focusing on single imputation Zhongheng Zhang Department of Critical Care Medicine, Jinhua Municipal Central Hospital, Jinhua Hospital
More informationPractical Statistical Reasoning in Clinical Trials
Seminar Series to Health Scientists on Statistical Concepts 2011-2012 Practical Statistical Reasoning in Clinical Trials Paul Wakim, PhD Center for the National Institute on Drug Abuse 10 January 2012
More informationEstimands, Missing Data and Sensitivity Analysis: some overview remarks. Roderick Little
Estimands, Missing Data and Sensitivity Analysis: some overview remarks Roderick Little NRC Panel s Charge To prepare a report with recommendations that would be useful for USFDA's development of guidance
More informationPredictive Models for Making Patient Screening Decisions
Predictive Models for Making Patient Screening Decisions MICHAEL HAHSLER 1, VISHAL AHUJA 1, MICHAEL BOWEN 2, AND FARZAD KAMALZADEH 1 1 Southern Methodist University, 2 UT Southwestern Medical Center and
More informationASSESSING THE EFFECTS OF MISSING DATA. John D. Hutcheson, Jr. and James E. Prather, Georgia State University
ASSESSING THE EFFECTS OF MISSING DATA John D. Hutcheson, Jr. and James E. Prather, Georgia State University Problems resulting from incomplete data occur in almost every type of research, but survey research
More informationRegression Methods in Biostatistics: Linear, Logistic, Survival, and Repeated Measures Models, 2nd Ed.
Eric Vittinghoff, David V. Glidden, Stephen C. Shiboski, and Charles E. McCulloch Division of Biostatistics Department of Epidemiology and Biostatistics University of California, San Francisco Regression
More informationSESUG Paper SD
SESUG Paper SD-106-2017 Missing Data and Complex Sample Surveys Using SAS : The Impact of Listwise Deletion vs. Multiple Imputation Methods on Point and Interval Estimates when Data are MCAR, MAR, and
More informationTitle. Description. Remarks. Motivating example. intro substantive Introduction to multiple-imputation analysis
Title intro substantive Introduction to multiple-imputation analysis Description Missing data arise frequently. Various procedures have been suggested in the literature over the last several decades to
More informationSequential nonparametric regression multiple imputations. Irina Bondarenko and Trivellore Raghunathan
Sequential nonparametric regression multiple imputations Irina Bondarenko and Trivellore Raghunathan Department of Biostatistics, University of Michigan Ann Arbor, MI 48105 Abstract Multiple imputation,
More informationLecture (chapter 1): Introduction
Lecture (chapter 1): Introduction Ernesto F. L. Amaral January 17, 2018 Advanced Methods of Social Research (SOCI 420) Source: Healey, Joseph F. 2015. Statistics: A Tool for Social Research. Stamford:
More informationBefore we get started:
Before we get started: http://arievaluation.org/projects-3/ AEA 2018 R-Commander 1 Antonio Olmos Kai Schramm Priyalathta Govindasamy Antonio.Olmos@du.edu AntonioOlmos@aumhc.org AEA 2018 R-Commander 2 Plan
More informationRunning head: SELECTION OF AUXILIARY VARIABLES 1. Selection of auxiliary variables in missing data problems: Not all auxiliary variables are
Running head: SELECTION OF AUXILIARY VARIABLES 1 Selection of auxiliary variables in missing data problems: Not all auxiliary variables are created equal Felix Thoemmes Cornell University Norman Rose University
What to do with missing data in clinical registry analysis?
Melbourne 2011; Registry Special Interest Group What to do with missing data in clinical registry analysis? Rory Wolfe Acknowledgements: James Carpenter, Gerard O'Reilly Department of Epidemiology & Preventive
Some General Guidelines for Choosing Missing Data Handling Methods in Educational Research
Journal of Modern Applied Statistical Methods Volume 13 Issue 2 Article 3 11-2014 Some General Guidelines for Choosing Missing Data Handling Methods in Educational Research Jehanzeb R. Cheema University
Cross-Lagged Panel Analysis
Cross-Lagged Panel Analysis Michael W. Kearney Cross-lagged panel analysis is an analytical strategy used to describe reciprocal relationships, or directional influences, between variables over time. Cross-lagged
Manuscript Presentation: Writing up APIM Results
Manuscript Presentation: Writing up APIM Results Example Articles Distinguishable Dyads Chung, M. L., Moser, D. K., Lennie, T. A., & Rayens, M. (2009). The effects of depressive symptoms and anxiety on
A Strategy for Handling Missing Data in the Longitudinal Study of Young People in England (LSYPE)
Research Report DCSF-RW086 A Strategy for Handling Missing Data in the Longitudinal Study of Young People in England (LSYPE) Andrea Piesse and Graham Kalton Westat Research Report No DCSF-RW086 A Strategy
Designing and Analyzing RCTs. David L. Streiner, Ph.D.
Designing and Analyzing RCTs David L. Streiner, Ph.D. Emeritus Professor, Department of Psychiatry & Behavioural Neurosciences, McMaster University Emeritus Professor, Department of Clinical Epidemiology
Clinical trials with incomplete daily diary data
Clinical trials with incomplete daily diary data N. Thomas 1, O. Harel 2, and R. Little 3 1 Pfizer Inc 2 University of Connecticut 3 University of Michigan BASS, 2015 Thomas, Harel, Little (Pfizer) Clinical
The analysis of tuberculosis prevalence surveys. Babis Sismanidis with acknowledgements to Sian Floyd Harare, 30 November 2010
The analysis of tuberculosis prevalence surveys Babis Sismanidis with acknowledgements to Sian Floyd Harare, 30 November 2010 Background Prevalence = TB cases / Number of eligible participants (95% CI
Supplemental Appendix for Beyond Ricardo: The Link Between Intraindustry. Timothy M. Peterson Oklahoma State University
Supplemental Appendix for Beyond Ricardo: The Link Between Intraindustry Trade and Peace Timothy M. Peterson Oklahoma State University Cameron G. Thies University of Iowa A-1 This supplemental appendix
Problem Set 3 ECN Econometrics Professor Oscar Jorda. Name. ESSAY. Write your answer in the space provided.
Problem Set 3 ECN 140 - Econometrics Professor Oscar Jorda Name ESSAY. Write your answer in the space provided. 1) Sir Francis Galton, a cousin of Charles Darwin, examined the relationship between the height
MODEL SELECTION STRATEGIES. Tony Panzarella
MODEL SELECTION STRATEGIES Tony Panzarella Lab Course March 20, 2014 2 Preamble Although focus will be on time-to-event data the same principles apply to other outcome data Lab Course March 20, 2014 3
Biostatistics. Donna Kritz-Silverstein, Ph.D. Professor Department of Family & Preventive Medicine University of California, San Diego
Biostatistics Donna Kritz-Silverstein, Ph.D. Professor Department of Family & Preventive Medicine University of California, San Diego (858) 534-1818 dsilverstein@ucsd.edu Introduction Overview of statistical
Missing data and multiple imputation in clinical epidemiological research
Clinical Epidemiology open access to scientific and medical research Open Access Full Text Article Missing data and multiple imputation in clinical epidemiological research METHODOLOGY Alma B Pedersen
Applied Medical. Statistics Using SAS. Geoff Der. Brian S. Everitt. CRC Press. Taylor & Francis Group. Taylor & Francis Group, an informa business
Applied Medical Statistics Using SAS Geoff Der Brian S. Everitt CRC Press Taylor & Francis Group Boca Raton London New York CRC Press is an imprint of the Taylor & Francis Group, an informa business A
Chapter Eight: Multivariate Analysis
Chapter Eight: Multivariate Analysis Up until now, we have covered univariate ( one variable ) analysis and bivariate ( two variables ) analysis. We can also measure the simultaneous effects of two or
On Missing Data and Genotyping Errors in Association Studies
On Missing Data and Genotyping Errors in Association Studies Department of Biostatistics Johns Hopkins Bloomberg School of Public Health May 16, 2008 Specific Aims of our R01 1 Develop and evaluate new
On average, about half the respondents to surveys
American Political Science Review Vol. 95, No. 1 March 2001 Analyzing Incomplete Political Science Data: An Alternative Algorithm for Multiple Imputation GARY KING Harvard University JAMES HONAKER Harvard
MISSING DATA IN FAMILY RESEARCH: EXAMINING DIFFERENT LEVELS OF MISSINGNESS
MISSING DATA IN FAMILY RESEARCH: EXAMINING DIFFERENT LEVELS OF MISSINGNESS SEMIRA TAGLIABUE CATHOLIC UNIVERSITY OF BRESCIA SILVIA DONATO CATHOLIC UNIVERSITY OF MILANO Family research is influenced by the
Missing Data in Homicide Research
Cleveland State University EngagedScholarship@CSU Sociology & Criminology Faculty Publications Sociology & Criminology Department 8-2004 Missing Data in Homicide Research Marc Riedel Southeastern Louisiana
Introduction to Econometrics
Global edition Introduction to Econometrics Updated Third edition James H. Stock Mark W. Watson MyEconLab of Practice Provides the Power Optimize your study time with MyEconLab, the online assessment and
Discovering Statistics Using SPSS
Discovering Statistics Using SPSS, SECOND EDITION (and sex, drugs and rock 'n' roll) ANDY FIELD Publications London · Thousand Oaks · New Delhi CONTENTS Preface How To Use This Book Acknowledgements
Profile Analysis. Intro and Assumptions Psy 524 Andrew Ainsworth
Profile Analysis Intro and Assumptions Psy 524 Andrew Ainsworth Profile Analysis Profile analysis is the repeated measures extension of MANOVA where a set of DVs are commensurate (on the same scale). Profile
Business Statistics Probability
Business Statistics The following was provided by Dr. Suzanne Delaney, and is a comprehensive review of Business Statistics. The workshop instructor will provide relevant examples during the Skills Assessment
11/18/2013. Correlational Research. Correlational Designs. Why Use a Correlational Design? CORRELATIONAL RESEARCH STUDIES
Correlational Research Correlational Designs Correlational research is used to describe the relationship between two or more naturally occurring variables. Is age related to political conservativism? Are
Chapter 34 Detecting Artifacts in Panel Studies by Latent Class Analysis
352 Chapter 34 Detecting Artifacts in Panel Studies by Latent Class Analysis Herbert Matschinger and Matthias C. Angermeyer Department of Psychiatry, University of Leipzig 1. Introduction Measuring change
1 Introduction. st0020. The Stata Journal (2002) 2, Number 3, pp
The Stata Journal (2002) 2, Number 3, pp. 280–289 Comparative assessment of three common algorithms for estimating the variance of the area under the nonparametric receiver operating characteristic curve
You must answer question 1.
Research Methods and Statistics Specialty Area Exam October 28, 2015 Part I: Statistics Committee: Richard Williams (Chair), Elizabeth McClintock, Sarah Mustillo You must answer question 1. 1. Suppose
Abstract. Introduction A SIMULATION STUDY OF ESTIMATORS FOR RATES OF CHANGES IN LONGITUDINAL STUDIES WITH ATTRITION
A SIMULATION STUDY OF ESTIMATORS FOR RATES OF CHANGES IN LONGITUDINAL STUDIES WITH ATTRITION Fong Wang, Genentech Inc. Mary Lange, Immunex Corp. Abstract Many longitudinal studies and clinical trials are
Comparison of imputation and modelling methods in the analysis of a physical activity trial with missing outcomes
IJE vol.34 no.1 International Epidemiological Association 2004; all rights reserved. International Journal of Epidemiology 2005;34:89–99 Advance Access publication 27 August 2004 doi:10.1093/ije/dyh297
STATISTICS & PROBABILITY
STATISTICS & PROBABILITY LAWRENCE HIGH SCHOOL STATISTICS & PROBABILITY CURRICULUM MAP 2015-2016 Quarter 1 Unit 1 Collecting Data and Drawing Conclusions Unit 2 Summarizing Data Quarter 2 Unit 3 Randomness
Missing Data in Longitudinal Studies: Strategies for Bayesian Modeling, Sensitivity Analysis, and Causal Inference
COURSE: Missing Data in Longitudinal Studies: Strategies for Bayesian Modeling, Sensitivity Analysis, and Causal Inference Mike Daniels (Department of Statistics, University of Florida) 20-21 October 2011
Chapter 1: Explaining Behavior
Chapter 1: Explaining Behavior GOAL OF SCIENCE is to generate explanations for various puzzling natural phenomena. - Generate general laws of behavior (psychology) RESEARCH: principal method for acquiring
What is Multilevel Modelling Vs Fixed Effects. Will Cook Social Statistics
What is Multilevel Modelling Vs Fixed Effects Will Cook Social Statistics Intro Multilevel models are commonly employed in the social sciences with data that is hierarchically structured Estimated effects
Chapter 1 Review Questions
Chapter 1 Review Questions 1.1 Why is the standard economic model a good thing, and why is it a bad thing, in trying to understand economic behavior? A good economic model is simple and yet gives useful