RISK PREDICTION MODEL: PENALIZED REGRESSIONS

Inspired by: How to develop a more accurate risk prediction model when there are few events. Menelaos Pavlou, Gareth Ambler, Shaun R Seaman, Oliver Guttmann, Perry Elliott, Michael King, Rumana Z Omar. BMJ 2015;351:h3868
Journal Club, January 2015
Pawin Numthavaj, M.D.
Section for Clinical Epidemiology and Biostatistics, Faculty of Medicine Ramathibodi Hospital

RISK PREDICTION MODEL
A statistical model that uses predictors to predict a health outcome

USUAL RISK PREDICTION MODEL DEVELOPMENT
1. Base model development on patients in one group
2. Obtain outcome and predictor data
3. Create a mathematical model that predicts the outcome
4. Test the performance of the model

MODEL PERFORMANCE
1. Discrimination: the model's ability to discriminate between low-risk and high-risk patients
2. Calibration: agreement between observed outcomes and predictions

1. DISCRIMINATION
Ability to distinguish low-risk from high-risk patients
Area under the ROC curve: model-predicted outcome versus actual outcome across different cut-off points of predicted risk
Concordance (C) statistic: probability that a randomly selected subject with the outcome has a higher predicted probability of the outcome than a randomly selected subject without the outcome
Interpretation: 0.7-0.8 acceptable, 0.8-0.9 excellent, 0.9-1.0 outstanding

C-STATISTIC
$C = \frac{\text{concordant pairs} + (0.5 \times \text{tied pairs})}{\text{all pairs}}$
Example: $C = \frac{6 + (0.5 \times 3)}{12} \approx 0.62$
Giovanni Tripepi et al. Nephrol. Dial. Transplant. 2010;25:1399-1401
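The pair counting behind the C-statistic can be sketched in a few lines. This is a minimal illustration, not code from the cited paper: the function name `c_statistic` and the predicted risks are hypothetical, chosen only so that 6 of the 12 event/non-event pairs are concordant and 3 are tied, as in the example.

```python
from itertools import product


def c_statistic(risk_events, risk_nonevents):
    """Share of (event, non-event) pairs ranked correctly; ties count as 0.5."""
    concordant = 0
    ties = 0
    for r_event, r_nonevent in product(risk_events, risk_nonevents):
        if r_event > r_nonevent:
            concordant += 1
        elif r_event == r_nonevent:
            ties += 1
    all_pairs = len(risk_events) * len(risk_nonevents)
    return (concordant + 0.5 * ties) / all_pairs


# Hypothetical predicted risks: 3 patients with the outcome, 4 without.
events = [0.2, 0.4, 0.6]
nonevents = [0.1, 0.2, 0.4, 0.6]

print(c_statistic(events, nonevents))  # (6 + 0.5 * 3) / 12 = 0.625
```

The nested comparison is quadratic in the number of patients, which is fine for illustration but slow for large cohorts.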

2. CALIBRATION
Measure of how close predicted probabilities are to the observed rates of the positive outcome
Ex: Is a predicted 70% chance observed as 70% in the actual data?
Commonly used technique: Hosmer-Lemeshow chi-square test
  Partition the data into groups
  Compare the average predicted probability with the outcome prevalence in each group using a chi-square statistic

HOSMER-LEMESHOW TEST
Decile of estimated probability of death   Sum of predicted deaths   Sum of observed deaths
 1                                         10.1                       5
 2                                         11.0                       6
 3                                         10.4                       5
 4                                         11.1                       7
 5                                         11.4                      12
 6                                          9.0                      11
 7                                         15.0                      13
 8                                         13.0                      18
 9                                         14.5                      16
10                                         19.6                      19
Giovanni Tripepi et al. Nephrol. Dial. Transplant. 2010;25:1402-1405

HL test, using the decile table above:
$\chi^2 = \sum \frac{(\text{observed} - \text{estimated})^2}{\text{estimated}} = \frac{(5 - 10.1)^2}{10.1} + \frac{(6 - 11.0)^2}{11.0} + \frac{(5 - 10.4)^2}{10.4} + \dots + \frac{(19 - 19.6)^2}{19.6} = 12$
A chi-square of 12 with n - 2 = 8 degrees of freedom (n = 10 groups) gives p = 0.15
The number of deaths predicted by the model does not differ significantly from the observed deaths
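The arithmetic of this example can be checked with a short script. This sketch follows the slide's formula, which sums over deaths only. The helper names are illustrative, and the p-value helper uses a closed form that holds only for an even number of degrees of freedom, which is enough for the 8 used here.

```python
import math

# Decile sums copied from the Hosmer-Lemeshow table above.
predicted = [10.1, 11.0, 10.4, 11.1, 11.4, 9.0, 15.0, 13.0, 14.5, 19.6]
observed = [5, 6, 5, 7, 12, 11, 13, 18, 16, 19]


def hl_chi_square(observed, predicted):
    """Sum of (observed - estimated)^2 / estimated over the risk groups."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, predicted))


def chi2_upper_tail_even_df(x, df):
    """P(chi-square with df degrees of freedom > x); closed form for even df only."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("closed form requires a positive, even df")
    half = x / 2
    return math.exp(-half) * sum(half ** k / math.factorial(k) for k in range(df // 2))


chi2 = hl_chi_square(observed, predicted)
p_value = chi2_upper_tail_even_df(chi2, df=len(observed) - 2)
print(round(chi2, 1), round(p_value, 2))  # 12.0 0.15
```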

TYPICAL TECHNIQUES FOR MODEL VALIDATION
Internal validation: bootstrapping methods
External validation: use patient data not used for model development

EXAMPLE (BOX 1)
Outcome: mechanical failure of heart valve (Y/N)
Predictors:
  sex (score of 1 = female)
  age (years)
  body surface area (BSA; m²)
  whether the replacement valve came from a batch with fractures (score of 1 = valve came from batch with fractures)

RISK MODEL: LOGISTIC REGRESSION MODEL
$\text{Patient's risk of heart valve failure} = \frac{e^{\text{risk score}}}{1 + e^{\text{risk score}}}$
$\text{Patient's risk score} = \text{intercept} + (b_{sex} \times \text{sex}) + (b_{age} \times \text{age}) + (b_{BSA} \times \text{BSA}) + (b_{fracture} \times \text{fracture})$
Regression coefficients (b) can be obtained using various methods: standard logistic regression, ridge, or lasso

$b_{sex} = -0.193$, $b_{age} = -0.0497$, $b_{BSA} = 1.344$, $b_{fracture} = 1.261$, $\text{intercept} = -4.25$
The risk score for a 40-year-old female patient with a body surface area of 1.7 m² and an artificial valve from a batch with fractures would then be calculated as:
$\text{risk score} = -4.25 + (-0.193 \times 1) + (-0.0497 \times 40) + (1.344 \times 1.7) + (1.261 \times 1) = -2.89$
where the multipliers are female sex (1), age in years (40), BSA in m² (1.7), and fracture present in batch (1).
Therefore, her predicted risk would be:
$\frac{\exp(-2.89)}{1 + \exp(-2.89)} = 5.3\%$
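The worked example can be reproduced as a short script. This is a sketch using the Box 1 coefficients with negative signs on the intercept, sex, and age terms, which is the combination that reproduces the stated risk score of -2.89 and risk of 5.3%. The dictionary keys and function names are illustrative.

```python
import math

# Box 1 coefficients; the keys are illustrative names for the four predictors.
INTERCEPT = -4.25
COEFFICIENTS = {"sex": -0.193, "age": -0.0497, "bsa": 1.344, "fracture": 1.261}


def risk_score(patient):
    """Linear predictor: intercept plus the sum of coefficient * predictor value."""
    return INTERCEPT + sum(COEFFICIENTS[name] * patient[name] for name in COEFFICIENTS)


def predicted_risk(score):
    """Logistic transformation of the risk score into a probability."""
    return math.exp(score) / (1 + math.exp(score))


# 40-year-old woman, BSA 1.7 m2, valve from a batch with fractures.
patient = {"sex": 1, "age": 40, "bsa": 1.7, "fracture": 1}
score = risk_score(patient)
risk = predicted_risk(score)
print(f"{score:.2f} {risk:.1%}")  # -2.89 5.3%
```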

BOOTSTRAP VALIDATION
Use when no external cohort is available
Bootstrap dataset: an imitation of the original dataset, constructed by random sampling (with replacement) of patients from the original dataset
Typically, a large number of bootstrap datasets (ex: 200) is created
The model is fitted to each bootstrap dataset, and the estimated coefficients are used to obtain predictions for the patients in the original dataset
These predictions are used to calculate the calibration slope for the fitted model
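The resampling step alone can be sketched as follows. This covers only the construction of the bootstrap datasets. Fitting the model to each one and computing the calibration slope are left out, and the function name, the seed, and the patient identifiers are assumptions made for illustration.

```python
import random


def bootstrap_datasets(patients, n_boot=200, seed=0):
    """Draw n_boot datasets, each the size of the original, sampling with replacement."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    n = len(patients)
    return [rng.choices(patients, k=n) for _ in range(n_boot)]


# Hypothetical patient identifiers standing in for rows of the original dataset.
original = list(range(50))
datasets = bootstrap_datasets(original, n_boot=200, seed=1)
print(len(datasets), len(datasets[0]))  # 200 50
```

Because sampling is with replacement, some patients appear several times in a given bootstrap dataset and others not at all.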

SOMETIMES, THERE ARE FEW EVENTS COMPARED TO THE NUMBER OF PREDICTORS
Examples:
  Structural failure of mechanical heart valves
  Sudden cardiac death in patients with hypertrophic cardiomyopathy
Predictions from such models often perform less well in a new patient group

WHY?
The fitted model captures not only the association between outcome and predictors but also random variation (noise) in the development dataset
Model overfitting:
  Underestimates the probability of the event in low-risk patients
  Overestimates the probability of the event in high-risk patients

SAMPLE SIZE REQUIRED FOR RISK PREDICTION MODEL
Rule of thumb: events per variable (EPV) ratio
$\text{EPV} = \frac{\text{number of events}}{\text{number of regression coefficients}}$
An EPV of at least 10 is needed to avoid overfitting

EXAMPLE
60 events for a model with 6 regression coefficients (EPV = 60/6 = 10)
Outcome: CV death
Predictors: structural heart disease, age, sex, HT, family history of CVD, DM
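The EPV rule of thumb is simple arithmetic and can be written as a quick check. The function names are illustrative, and the threshold of 10 is the rule of thumb stated above rather than a hard statistical limit.

```python
def events_per_variable(n_events, n_coefficients):
    """EPV = number of events / number of regression coefficients."""
    return n_events / n_coefficients


def below_rule_of_thumb(n_events, n_coefficients, threshold=10):
    """True when the EPV falls below the rule-of-thumb threshold."""
    return events_per_variable(n_events, n_coefficients) < threshold


print(events_per_variable(60, 6))   # 10.0
print(below_rule_of_thumb(60, 6))   # False
print(below_rule_of_thumb(56, 10))  # True
```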

WHEN EVENTS ARE RARE
An EPV of 10 may be difficult to achieve

PROBLEM OF RARE OUTCOME
Models with few events compared to the number of predictors often underperform when applied to new patients
Model overfitting:
  Underestimates the probability of the event in low-risk patients
  Overestimates the probability of the event in high-risk patients

COMMON STRATEGIES
1. Univariable screening: only include significant predictors in the model
2. Stepwise model selection (ex: backwards elimination)
Drawback: the process may not be stable; small changes in the data or in the predictor selection process could lead to different predictors being included in the final model

ANOTHER WAY TO ALLEVIATE MODEL OVERFITTING
Shrinkage methods: methods that shrink the regression coefficients towards zero
This moves poorly calibrated predicted risks towards the average risk

SIMPLEST SHRINKAGE METHOD
Shrink all coefficients by a common factor (ex: by 20%)
However, this approach does not perform well if the EPV is very low

PENALIZED REGRESSION
A flexible shrinkage approach that is effective when the EPV is low (<10)
Process:
1. Specify the form of the risk model (ex: logistic/Cox)
2. Fit the data to estimate the coefficients in a standard logistic/Cox model
3. The range of predicted risks is too wide as a result of overfitting
4. Shrink the regression coefficients towards zero by placing a constraint on their values (penalization)
Penalized coefficient estimates are typically smaller than those of standard regression

SEVERAL FORMS OF PENALIZED REGRESSION
Ridge
Lasso
Derivations of ridge and lasso: elastic net, smoothly clipped absolute deviation, adaptive lasso, etc.
Software: R (penalized package), SPSS, Stata (rxridge, firthlogit, overfit)

RIDGE REGRESSION
Fit the model under the constraint that the sum of squared regression coefficients does not exceed a particular threshold
The coefficients are penalized using the formula:
$l(\beta) - \lambda \sum_{j=1}^{p} \beta_j^2$
where $l(\beta)$ is the log-likelihood and $p$ is the number of predictors
$\lambda$: scalar chosen by the investigator to control the amount of shrinkage
$\lambda = 0$ results in the standard regression model

The threshold is chosen to maximize the model's predictive ability using cross-validation:
  The dataset is split into k groups
  The model is fitted to (k-1) groups and validated on the omitted group
  This is repeated k times, each time omitting a different group
Ex: 10-fold cross-validation
  Split the dataset into 10 subsets
  Subset j is omitted, then the penalized model is fitted to the other nine subsets
  Calculate predictions for all patients, calculate predictive ability, and compare with the full model
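The splitting step of k-fold cross-validation can be sketched like this. It only produces the training and validation indices. Fitting the penalized model for each candidate threshold and scoring it on the omitted fold are not shown, and the function name and seed are illustrative.

```python
import random


def k_fold_splits(n_patients, k=10, seed=0):
    """Yield (training, validation) index lists, omitting a different fold each time."""
    indices = list(range(n_patients))
    random.Random(seed).shuffle(indices)  # random allocation of patients to folds
    folds = [indices[j::k] for j in range(k)]
    for j in range(k):
        validation = folds[j]
        training = [i for m, fold in enumerate(folds) if m != j for i in fold]
        yield training, validation


splits = list(k_fold_splits(100, k=10))
print(len(splits), len(splits[0][0]), len(splits[0][1]))  # 10 90 10
```

Each patient lands in exactly one validation fold, so every patient receives one out-of-sample prediction.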

LASSO REGRESSION
Least Absolute Shrinkage and Selection Operator
Similar to ridge, but constrains the sum of the absolute values of the regression coefficients:
$l(\beta) - \lambda \sum_{j=1}^{p} |\beta_j|$
Lasso can effectively exclude predictors from the final model by shrinking their coefficients to 0
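The two penalized objectives differ only in the penalty term, which a small sketch makes concrete. The log-likelihood value and the λ used here are hypothetical. The coefficients are the Box 1 values with the intercept left out, on the common assumption that the intercept is not penalized.

```python
def ridge_objective(log_likelihood, coefficients, lam):
    """Penalized log-likelihood with a squared (ridge) penalty."""
    return log_likelihood - lam * sum(b ** 2 for b in coefficients)


def lasso_objective(log_likelihood, coefficients, lam):
    """Penalized log-likelihood with an absolute-value (lasso) penalty."""
    return log_likelihood - lam * sum(abs(b) for b in coefficients)


# Box 1 coefficients (sex, age, BSA, fracture); the intercept is assumed unpenalized.
coefficients = [-0.193, -0.0497, 1.344, 1.261]
log_likelihood = -200.0  # hypothetical value, for illustration only
lam = 2.0                # hypothetical amount of shrinkage

print(ridge_objective(log_likelihood, coefficients, lam))
print(lasso_objective(log_likelihood, coefficients, lam))
# With lam = 0 both objectives reduce to the unpenalized log-likelihood.
```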

RIDGE OR LASSO?
In health research, a set of prespecified predictors is often available, so ridge regression is usually the preferred option
Lasso: when a simpler model with fewer predictors is preferred (ex: to save time/resources by collecting less information on patients)

DETECTION OF MODEL OVERFITTING
Assessment of model calibration:
  Internal validation
  External validation
Divide patients into risk groups according to predicted risk
Compare the proportion of patients who had the event with the average predicted risk in each group
  Graph (calibration plot)
  Table (and Hosmer-Lemeshow goodness-of-fit test)

DEGREE OF OVERFITTING
Quantified by a simple regression model: outcomes in the validation data are regressed on their predicted risk scores using logistic regression
Well-calibrated model: estimated slope (calibration slope) close to 1
Overfitted model: slope <1 (low risks are underestimated, high risks are overestimated)
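A calibration slope can be estimated with a two-parameter logistic regression of the outcome on the risk score. The sketch below fits it by Newton-Raphson in plain Python. That fitting method is a choice made for this illustration, not something the slides specify, and the validation data are hypothetical: the model predicts risks of 10% and 90%, while the observed proportions are 25% and 75%.

```python
import math


def calibration_fit(outcomes, risk_scores, n_iter=25):
    """Logistic regression of outcome on risk score; returns (intercept, slope)."""
    a, b = 0.0, 0.0
    for _ in range(n_iter):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for y, x in zip(outcomes, risk_scores):
            p = 1 / (1 + math.exp(-(a + b * x)))
            w = p * (1 - p)
            g0 += y - p          # score for the intercept
            g1 += (y - p) * x    # score for the slope
            h00 += w             # information matrix entries
            h01 += w * x
            h11 += w * x * x
        det = h00 * h11 - h01 * h01
        a += (h11 * g0 - h01 * g1) / det  # Newton step: inverse information times score
        b += (h00 * g1 - h01 * g0) / det
    return a, b


# Hypothetical validation data: risk scores of +/- 2*ln(3), i.e. predicted risks 90% and 10%.
high, low = 2 * math.log(3), -2 * math.log(3)
risk_scores = [high] * 4 + [low] * 4
outcomes = [1, 1, 1, 0] + [1, 0, 0, 0]  # observed: 3 of 4 events, then 1 of 4

intercept, slope = calibration_fit(outcomes, risk_scores)
print(round(intercept, 3), round(slope, 3))
```

In this constructed example the slope comes out at 0.5, well below 1: the low-risk group's risk is underestimated (10% predicted, 25% observed) and the high-risk group's is overestimated, which is the overfitting pattern described above.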

EXAMPLE 1: MECHANICAL HEART VALVE FAILURE
Data on 3118 patients with a mechanical heart valve
Outcome: failure of the artificial valve (56 events)
Predictors: age, sex, BSA, fractures in the batch of the valve (Y/N), year of valve manufacture (<1981/>1981), valve size (10 coefficients)
EPV = 56/10 = 5.6
Models compared: standard, ridge, and lasso regression

Regression coefficient estimates (values in parentheses after the ridge and lasso estimates: % shrinkage relative to standard regression)
Predictors                         Descriptive statistics   Standard regression   Ridge regression   Lasso regression
Intercept                                                   7.80                  5.97 (23)          6.65 (15)
Sex (female)                       1337 (43)                0.24                  0.14 (41)          0.16 (34)
Age (years)                        54.1 (10.8)              0.052                 0.047 (11)         0.050 (4)
Body surface area (m²)             1.6 (0.3)                1.98                  1.52 (24)          1.75 (12)
Aortic size 23, 27, 29, 31 mm                               1.43                  0.36 (75)          0.61 (68)
Mitral size 23-27 mm                                        1.3                   0.22 (84)          0.43 (67)
Mitral size 29 mm                                           1.95                  0.80 (59)          1.13 (42)
Mitral size 31 mm                                           2.62                  1.38 (47)          1.77 (33)
Mitral size 33 mm                                           2.58                  1.41 (45)          1.73 (33)
Fracture in batch (yes)                                     0.59                  0.69 (-17)         0.64 (-9)
Date of manufacture (after 1981)                            1.38                  1.02 (26)          1.22 (12)

FIG 1: DISTRIBUTION OF PREDICTED RISK SCORES ESTIMATED USING STANDARD, RIDGE, AND LASSO REGRESSION Menelaos Pavlou et al. BMJ 2015;351:bmj.h3868

FIG 2: OBSERVED PROPORTIONS VERSUS AVERAGE PREDICTED RISK OF THE EVENT (USING STANDARD, RIDGE AND LASSO REGRESSION).

EXAMPLE 2: SUDDEN CARDIAC DEATH IN HYPERTROPHIC CARDIOMYOPATHY
Data on 1000 patients
Outcome: sudden cardiac death within 10 years from diagnosis (42 events)
Continuous predictors: age, max LV wall thickness, fractional shortening, LA diameter, peak LV outflow tract gradient
Binary predictors: gender, family history of SCD, non-sustained VT, severity of HF by NYHA, unexplained syncope
EPV = 4.2
Model externally validated using data from different centers (2405 patients, 106 events)

COEFFICIENT TABLE
Regression coefficient estimates
Predictors                        Standard regression   Ridge regression   Lasso regression
Age (years)                       -0.024                -0.015             -0.015
Max wall thickness (mm)            0.043                 0.038              0.039
Fractional shortening (mm)         0.002                 0.003              0
LA diameter (mm)                   0.042                 0.028              0.027
Peak LVOT gradient (mmHg)          0.009                 0.007              0.007
Sudden cardiac death in family     0.60                  0.43               0.42
Non-sustained VT                   0.30                  0.19               0.03
Syncope                            0.93                  0.71               0.74
Sex (male)                        -0.14                 -0.07               0
NYHA class III/IV                 -0.24                 -0.07               0

CONCLUSION
When the number of events is low compared to the number of predictors in a risk model, standard regression may produce an overfitted risk model
Common methods such as stepwise selection and univariable screening are problematic and should be avoided
It is recommended that the use of penalized regression methods be explored
Other methods, such as incorporating existing evidence (from published risk models, meta-analysis, and expert opinion), could be better in some scenarios

TAKE HOME MESSAGE
Beware of prediction models with $\text{EPV} = \frac{\text{number of events}}{\text{number of regression coefficients}} < 10$
Standard models are usually overfitted when EPV <10: they underestimate risk in low-risk patients and overestimate risk in high-risk patients
Penalizing the coefficients using penalized regression methods such as ridge and lasso is a possible solution to this problem

http://www.ceb-rama.org THANK YOU
