What to do with missing data in clinical registry analysis?


Melbourne 2011; Registry Special Interest Group
Rory Wolfe
Acknowledgements: James Carpenter, Gerard O'Reilly
Department of Epidemiology & Preventive Medicine

Missing data

At analysis
Too late? Ideally data collection is designed in a way to minimise missing data, e.g. ANZDATA
  Not always possible to anticipate how data will be used, e.g. predicting survival forward from 3 months on dialysis
Too difficult? Complex methods, major assumptions...
Ignore the problem? Complete case analysis

What is a reasonable assumption?
MCAR (missing completely at random): assume values are missing completely at random, i.e. patients with missing values are a random sample
MAR (missing at random): assume missingness depends only on observed characteristics, e.g. age, sex, disease severity
MNAR (missing not at random): assume values are missing precisely because of their unobserved value
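The three assumptions can be made concrete with a small simulation. The sketch below is illustrative only: the variable names, the age/SBP relationship and the missingness probabilities are all invented for the example, not taken from any registry. It compares the true mean of a simulated systolic blood pressure with the complete-case mean under each mechanism.

```python
import random
import statistics


def simulate(mechanism, n=20000, seed=1):
    """Return (true mean SBP, complete-case mean SBP) under one mechanism.

    All numbers here are made up for illustration.
    """
    rng = random.Random(seed)
    all_sbp, observed_sbp = [], []
    for _ in range(n):
        age = rng.uniform(20, 90)
        sbp = 100 + 0.5 * age + rng.gauss(0, 15)  # SBP rises with age
        if mechanism == "MCAR":
            p_missing = 0.3                        # unrelated to anything
        elif mechanism == "MAR":
            p_missing = 0.5 if age > 55 else 0.1   # depends on observed age
        elif mechanism == "MNAR":
            p_missing = 0.5 if sbp > 130 else 0.1  # depends on SBP itself
        else:
            raise ValueError("unknown mechanism: " + mechanism)
        all_sbp.append(sbp)
        if rng.random() >= p_missing:
            observed_sbp.append(sbp)
    return statistics.mean(all_sbp), statistics.mean(observed_sbp)


for name in ("MCAR", "MAR", "MNAR"):
    truth, complete_case = simulate(name)
    print(name, round(truth, 1), round(complete_case, 1))
```

In this setup the complete-case mean is close to the truth only under MCAR. Under MAR the unadjusted mean is off because older patients are under-represented, but an analysis that conditions on age could recover it; under MNAR no adjustment using observed variables alone is guaranteed to do so.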

No universal method
The mechanism that gives rise to missing values is usually unknown
  Possible to investigate in detail a random sample of missing data to obtain true values?
Extra assumptions are required for analysis
Different assumptions give different results
The validity of the assumptions cannot be determined from the data at hand
Assessing the sensitivity of conclusions to the assumptions made about the missingness mechanism should play a central role.
Adapted from Carpenter

Methods for dealing with missing data
Complete case analysis (CC): deletion of cases with missing observations
  Possibly biased. Reduced precision due to ignoring observed data in the people excluded.
Single imputation (SI): insert a reasonable value
  Mean imputation
  Median imputation
  Last observation carried forward
  Missing as normal?
  Almost certainly biased. Over-estimates precision due to making up data.
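The claim that single imputation over-estimates precision can be seen directly with mean imputation. The following is a minimal sketch on simulated data (the distribution and the 40% missingness rate are arbitrary choices for illustration): filling every gap with the observed mean shrinks the sample standard deviation while inflating the apparent sample size.

```python
import random
import statistics


def mean_impute(values):
    """Replace each None with the mean of the observed values."""
    observed = [v for v in values if v is not None]
    fill = statistics.mean(observed)
    return [fill if v is None else v for v in values]


rng = random.Random(0)
complete = [rng.gauss(120, 20) for _ in range(1000)]
# Knock out roughly 40% of values completely at random (illustrative only).
with_gaps = [None if rng.random() < 0.4 else v for v in complete]

imputed = mean_impute(with_gaps)
sd_observed = statistics.stdev([v for v in with_gaps if v is not None])
sd_imputed = statistics.stdev(imputed)
print(round(sd_observed, 1), round(sd_imputed, 1))
```

Any standard error computed from the imputed column uses both the smaller standard deviation and the full n, so it is too small on both counts.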

Multiple imputation (MI)
Uses observed values to generate a range of plausible values, based on existing associations between variables
m complete datasets created, each dataset analysed
m sets of results, combined using Rubin's rules
Improved precision due to use of all observed data
Possibly biased: assumes MAR given the model used for imputation (CC assumes MAR given the covariates in the model used for analysis)
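The combining step can be sketched in a few lines. This is a minimal illustration of Rubin's rules for a single scalar estimate, assuming each of the m analyses has already produced a point estimate and its squared standard error; the function name is ours, and the degrees-of-freedom correction used for confidence intervals is left out.

```python
import math
import statistics


def pool_rubin(estimates, variances):
    """Pool m point estimates and their variances with Rubin's rules.

    Returns (pooled estimate, pooled standard error). Needs m >= 2.
    """
    m = len(estimates)
    q_bar = statistics.mean(estimates)       # pooled point estimate
    within = statistics.mean(variances)      # average within-imputation variance
    between = statistics.variance(estimates) # between-imputation variance (m - 1 divisor)
    total = within + (1 + 1 / m) * between
    return q_bar, math.sqrt(total)


# Example: three imputed datasets gave these estimates and squared SEs.
estimate, se = pool_rubin([1.0, 2.0, 3.0], [0.5, 0.5, 0.5])
print(estimate, round(se, 3))
```

The between-imputation term is what single imputation throws away: it carries the extra uncertainty due to not knowing the missing values.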

Multiple imputation
Early days of multiple imputation suggested that imputation could be a separate process from analysis
Now more widely accepted that imputation should be specific to a project / defined set of analyses
Choice of imputation model is critical
  Include any variables in the analysis model
  Including the outcome of study
  Sampling weights etc.
  Care with survival outcomes (time and censoring status): using an estimate of the hazard is recommended
  Interactions if necessary

Inverse probability weighting
  Weights determined from a regression model for missingness
  Current work looks at a combined MI and inverse probability weighting approach
Full maximum likelihood
Fully Bayesian approach
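As a rough illustration of inverse probability weighting, the sketch below replaces the regression model for missingness with the simplest possible stand-in: the observed proportion within each stratum of a single categorical covariate. That substitution, the function name and the toy data are assumptions made for the example. Each observed value is weighted by one over its estimated probability of being observed.

```python
from collections import defaultdict


def ipw_mean(records):
    """Inverse-probability-weighted mean of observed values.

    records: list of (stratum, value) pairs, with value None when missing.
    P(observed | stratum) is estimated by the observed fraction in that
    stratum, a stand-in for a fitted missingness regression model.
    """
    totals = defaultdict(int)
    seen = defaultdict(int)
    for stratum, value in records:
        totals[stratum] += 1
        if value is not None:
            seen[stratum] += 1
    numerator = denominator = 0.0
    for stratum, value in records:
        if value is not None:
            weight = totals[stratum] / seen[stratum]  # 1 / P(observed)
            numerator += weight * value
            denominator += weight
    return numerator / denominator


# Toy data: older patients have higher values and more missingness.
records = ([("young", 100.0)] * 9 + [("young", None)]
           + [("old", 140.0)] * 5 + [("old", None)] * 5)
print(ipw_mean(records))
```

Here the complete-case mean is about 114 because the older stratum is under-represented among observed values, while the weighted mean restores each stratum to its share of the full sample.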

Rory & Roman's research at DEPM
One common use of registry data is the development and/or validation of risk prediction models / prognostic tools
What approach to take to missing data?
Multiple imputation shown to have advantages under MAR assumption in model development
Evidence base incomplete
Little work done for model validation
We are undertaking statistical simulations to assess MI v CC

Hierarchical / longitudinal data
Guidelines now exist for imputation strategies with longitudinal data; may be relevant to registries
Imputation in hierarchical data
  E.g. ASCTS cardiac surgery cases by hospital
  UK Renal Registry exploration of widely differing rates of missing data by reporting centre

Conclusions
Ignoring missing data is not an option
Any method of analysis makes assumptions; the only issue is whether those assumptions are explicitly stated or not
Sensitivity analysis is vital but time-consuming and never-ending (MNAR unlimited)
Prevention is always better than cure: do what is possible to collect data in the first place

Newgard & Haukoos; AEMJ 2010
"Appropriately handling missing data is not a panacea for all potential problems that confront the measurement of quality. Attention must also be focused on issues such as case ascertainment, small per-hospital patient numbers, unmeasured confounders, and risk adjustment modeling strategies that fully account for differences in case mix between hospitals. However, with appropriate attention to missing data during the development and implementation of quality initiatives, we might ameliorate one potential threat. Missing data present an example where poor choices in data collection and oversimplified approaches to data analyses can threaten a well-intentioned health policy initiative. With informed merging of statistical methodology and health policy, we might better understand the devilish details and actually use them to improve the care of our patients and our health care system."

Missing in Action: A Case Study of the Application of Methods for Dealing with Missing Data to Trauma System Benchmarking
O'Reilly G, Jolley D, Cameron P, Gabbe B
Department of Epidemiology and Preventive Medicine
Acad. Emerg. Med. 2010;17:1-8
Trauma2010 Melbourne

Background
Trauma registry data are usually incomplete
Various methods for dealing with missing data have been used, some of which lead to biased results
There is no standardisation of the approach to missing data across trauma registries

Objective
This study examined the effect of using selected methods for handling missing data on a recognised trauma outcome measure.

Methods (1)
Data from the Victorian State Trauma Registry (VSTR) were used for the period July 2003 to June 2008.
Three methods for handling missing data were investigated:
1. Complete case analysis (CC)
2. Single imputation (SI): 2 approaches
3. Multiple imputation (MI): 5 different models (combination of variables)

Methods (2)
Following each method (for dealing with missing data):
TRISS analysis was performed
The W-score was derived

Glasgow Coma Scale

GCS motor response
6  moves limb to command
5  localizes to painful stimulus
4  withdraws from painful stimulus
3  abnormal flexion response
2  abnormal extension response
1  no motor response

GCS verbal response
5  oriented
4  confused
3  inappropriate words
2  incomprehensible
1  no verbal response

GCS eye response
4  spontaneous
3  open to speech
2  open to pain
1  none

GCS total score = motor + verbal + eye
Range: 3 (worst) to 15 (best)
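The total-score rule is simple enough to write down, and doing so shows why GCS is so often missing: the total cannot be formed unless all three components were recorded. The sketch below is illustrative; treating the total as missing (None) when any component is missing is our assumption about how a registry might code it, not something stated in the slides.

```python
def gcs_total(motor, verbal, eye):
    """Sum the three GCS components, or return None if any is missing.

    Valid ranges: motor 1-6, verbal 1-5, eye 1-4.
    """
    components = ((motor, 6, "motor"), (verbal, 5, "verbal"), (eye, 4, "eye"))
    if any(score is None for score, _, _ in components):
        return None  # assumption: an incomplete assessment gives a missing total
    for score, upper, name in components:
        if not 1 <= score <= upper:
            raise ValueError(f"{name} score {score} outside 1-{upper}")
    return motor + verbal + eye


print(gcs_total(6, 5, 4))     # best possible score
print(gcs_total(1, 1, 1))     # worst possible score
print(gcs_total(5, None, 3))  # verbal response not recorded
```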

Revised Trauma Score (RTS)
Points scoring system, 0 (worst) to 12 (best), combining GCS, systolic blood pressure and respiratory rate
Field RTS: to predict death/survival for blunt & penetrating injury

Abbreviated Injury Scale (AIS)
Points score for each injury
0 = no injury
1 = minor
2 = moderate
3 = severe (non life-threatening)
4 = severe (life-threatening, survival probable)
5 = critical (survival uncertain)
6 = fatal injury for anatomic area
E.g. abdominal: muscle ache = 1, organ contusion = 3, spine fracture with neurologic signs = 4

Injury Severity Score (ISS)
Most severe injury to each body region: head & neck, chest, abdominal, extremity, general
Points score for injury using AIS
ISS = sum of squared AIS scores from the three most severely injured regions
Range 0 (best) to 75 (worst)
75 is the automatic score if AIS = 6 (fatal) for any region
To retrospectively assess treatment in terms of patient mortality
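The scoring rule above can be sketched as a short function. The dictionary layout and the region labels in the example are our own choices for illustration; the input is assumed to hold the highest AIS score already determined for each body region.

```python
def injury_severity_score(region_ais):
    """Compute ISS from a mapping of body region -> highest AIS (0-6)."""
    scores = list(region_ais.values())
    for score in scores:
        if not 0 <= score <= 6:
            raise ValueError(f"AIS score {score} outside 0-6")
    if 6 in scores:
        return 75  # automatic maximum when any region has AIS = 6
    top_three = sorted(scores, reverse=True)[:3]
    return sum(score ** 2 for score in top_three)


patient = {"head & neck": 4, "chest": 3, "abdominal": 2, "extremity": 1}
print(injury_severity_score(patient))  # 4^2 + 3^2 + 2^2 = 29
```

Because only the three worst regions count and 5 squared is 25, the ordinary maximum is also 75, which is why an AIS of 6 is mapped to that value.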

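TRISS
To retrospectively assess treatment in terms of patient mortality
Survival related to age, ISS (anatomical injuries and their severity) and RTS (hence respiratory rate, SBP, GCS)
Logistic regression-based model:

$$P(\text{survival}) = \frac{1}{1 + e^{-\left[a + b\,(\mathrm{RTS}) - c\,(\mathrm{ISS}) - d\,(\text{age} > 55\ \text{yr})\right]}}$$

where (age > 55 yr) is an indicator equal to 1 for patients older than 55 years and 0 otherwise. With b, c and d positive, the predicted survival probability rises with RTS and falls with ISS and with older age.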
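A minimal sketch of the logistic form follows, assuming the exponent carries a leading minus sign so that a higher RTS raises predicted survival and a higher ISS or older age lowers it. The coefficient values in the example are placeholders chosen only to give plausible-looking output; they are not taken from the slides, and published TRISS coefficients should be used in any real analysis.

```python
import math


def triss_survival(rts, iss, age_over_55, a, b, c, d):
    """Predicted probability of survival from the TRISS logistic form.

    b, c and d are taken as positive, so the linear predictor is
    a + b*RTS - c*ISS - d*[age > 55].
    """
    linear = a + b * rts - c * iss - d * (1 if age_over_55 else 0)
    return 1.0 / (1.0 + math.exp(-linear))


# Placeholder coefficients, for illustration only (NOT published values).
COEF = {"a": -0.45, "b": 0.81, "c": 0.08, "d": 1.74}

younger = triss_survival(rts=7.0, iss=25, age_over_55=False, **COEF)
older = triss_survival(rts=7.0, iss=25, age_over_55=True, **COEF)
print(round(younger, 3), round(older, 3))
```

Because every input to this function must be present, a patient missing any one of GCS, SBP, respiratory rate, ISS or age has no TRISS prediction at all, which is how a modest rate of missingness per variable becomes a large rate of excluded cases.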

TRISS: the W-score
W-score: the absolute difference between observed and expected number of deaths per 100 patients.
W = (O - E) / (N/100)
N = number of patients included
O = observed number of deaths
E = expected number of deaths, obtained by summing over all patients their probability of death, i.e. one minus the TRISS-predicted probability of survival
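The W-score as defined here can be computed directly from each patient's outcome and TRISS-predicted survival probability. The function and argument names below are ours; note that with this deaths-based definition a negative W means fewer deaths than expected.

```python
def w_score(died, p_survival):
    """W = (O - E) / (N / 100), using deaths as in the definition above.

    died: list of booleans, True if the patient died.
    p_survival: list of TRISS-predicted survival probabilities, same order.
    """
    if len(died) != len(p_survival) or not died:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(died)
    observed_deaths = sum(1 for d in died if d)
    expected_deaths = sum(1.0 - p for p in p_survival)
    return (observed_deaths - expected_deaths) / (n / 100)


# Four patients: one death observed, 0.8 deaths expected.
print(w_score([True, False, False, False], [0.5, 0.9, 0.9, 0.9]))
```

Only patients with a complete set of TRISS variables contribute to N, O and E, so the choice of missing-data method changes all three terms at once.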

Results
There were 10,180 cases
2,398 (23.6%) were missing at least one TRISS variable observation (at primary hospital)
The variable with the greatest proportion of missing observations was GCS

Missing data in 10,180 patients

TRISS variable                                                  Missing   Missing %
Age                                                                   1        0.01
ISS                                                                  31        0.3
SBP (primary hospital)                                              612        6
SBP (primary hospital + prehospital)                                498        5
SBP (primary hospital + prehospital + definitive hospital)           75        1
RR (primary hospital)                                             1,449       14
RR (primary hospital + prehospital)                                 630        6
RR (primary hospital + prehospital + definitive hospital)           221        2
GCS (primary hospital)                                            2,030       20
GCS (primary hospital + prehospital)                                765        8
GCS (primary hospital + prehospital + definitive hospital)          448        4

Single imputation methods
1. Missing SBP, RR, or GCS imputed using the corresponding prehospital value if available. If unavailable, the patient was omitted from analysis
2. Missing SBP, RR, or GCS imputed using the corresponding prehospital value. If the prehospital value was also missing, the corresponding definitive hospital observation (in the case of hospital transfer) was used. If unavailable, the patient was omitted from analysis
3. Median and mean imputation also explored
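The first two single imputation approaches amount to a fallback chain over measurement sites. The sketch below assumes the physiological variables with prehospital counterparts are SBP, RR and GCS (as in the missing-data table), and uses hypothetical field names such as `sbp_primary` and `sbp_prehospital`; neither the names nor the record layout come from the registry.

```python
def single_impute(patients, use_definitive=False, variables=("sbp", "rr", "gcs")):
    """Fill each variable from the first site with a recorded value.

    Order tried: primary hospital, prehospital, then (only if
    use_definitive is True) definitive hospital. Patients with any
    variable still missing are omitted, as in a complete case analysis.
    """
    kept = []
    for patient in patients:
        filled = {}
        for var in variables:
            sources = [patient.get(var + "_primary"),
                       patient.get(var + "_prehospital")]
            if use_definitive:
                sources.append(patient.get(var + "_definitive"))
            filled[var] = next((v for v in sources if v is not None), None)
        if all(value is not None for value in filled.values()):
            kept.append(filled)
    return kept


patients = [
    {"sbp_primary": 120, "rr_primary": 16, "gcs_primary": 15},
    {"sbp_primary": None, "sbp_prehospital": 100, "rr_primary": 20, "gcs_primary": 14},
    {"sbp_primary": 110, "rr_primary": 18, "gcs_primary": None, "gcs_definitive": 9},
]
print(len(single_impute(patients)))                       # approach 1
print(len(single_impute(patients, use_definitive=True)))  # approach 2
```

Substituting a value measured at a different time treats the patient's physiology as unchanged between sites, which is the kind of unstated assumption the slides warn about.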

Variables used for the imputation model in each MI method (in addition to mortality outcome)

MI method 1
  Primary hospital TRISS variables
MI method 2
  Primary hospital TRISS variables
  Pre-hospital SBP, RR, GCS
MI method 3
  Primary hospital TRISS variables
  Pre-hospital SBP, RR, GCS
  Transfer to primary hospital by ambulance
MI method 4
  Primary hospital TRISS variables
  Pre-hospital SBP, RR, GCS
  Transfer to primary hospital by ambulance
  Transfer from primary hospital to definitive hospital
MI method 5
  Primary hospital TRISS variables
  Pre-hospital SBP, RR, GCS
  Transfer to primary hospital by ambulance
  Transfer from primary hospital to definitive hospital
  Legitimate primary hospital GCS*

[Figure: W-score and 95% confidence interval for each method for dealing with missing data]
Median and mean imputation gave more extreme results than SI 1 and 2, but in the opposite direction!

Discussion
Complete case analysis (CC): survival benefit
Single imputation (SI): survival benefit of greater magnitude than CC
Multiple imputation (MI): survival benefit when care taken with variables used in imputation model
Demonstrates disparity in results and need for sensitivity analyses; did they go far enough?
TRISS W-score not necessarily the perfect benchmarking tool

Conclusion
Using the TRISS-derived W-score as the outcome measure, different methods for managing missing data can result in a positive or negative assessment of trauma care performance.
Where analyses are used for benchmarking hospitals and/or systems, the impact can be considerable.
It is important that validated standardised approaches to handling missing data are universally adopted and reported.
If MI is to be the chosen method, it must be adapted to the specific analysis, with variables appropriately included and reported.
Studies comparing trauma outcomes must state:
  The number of cases with missing data
  The approach used to address missing data