RISK PREDICTION MODEL: PENALIZED REGRESSIONS


1 RISK PREDICTION MODEL: PENALIZED REGRESSIONS Inspired from: "How to develop a more accurate risk prediction model when there are few events", Menelaos Pavlou, Gareth Ambler, Shaun R Seaman, Oliver Guttmann, Perry Elliott, Michael King, Rumana Z Omar. BMJ 2015;351:h3868. Journal Club, January. Pawin Numthavaj, M.D., Section for Clinical Epidemiology and Biostatistics, Faculty of Medicine Ramathibodi Hospital

2 RISK PREDICTION MODEL A statistical model that uses predictors to predict a health outcome

3 USUAL RISK PREDICTION MODEL DEVELOPMENT 1. Model development based on patients in one group 2. Obtaining outcome and predictor data 3. Create a mathematical model of prediction of outcome 4. Test the performance of model

4 MODEL PERFORMANCE 1. Discrimination: the model's ability to discriminate between low and high risk 2. Calibration: agreement between real observed outcomes and predictions

5 1. DISCRIMINATION Ability to distinguish low risk from high risk patients. Measured by the area under the ROC curve of model-predicted outcome vs actual outcome across different cut-off points of predicted risk. Concordance (C) statistic: the probability that a randomly selected subject with the outcome will have a higher predicted probability of the outcome than a randomly selected subject without the outcome. C values are graded as acceptable, excellent, or outstanding

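6 C-STATISTICS
$C = \dfrac{\text{concordant pairs} + (0.5 \times \text{ties})}{\text{all pairs}}$
Example numerator: 6 + (0.5 × 3) = 7.5 (6 concordant pairs, 3 ties), divided by the number of all pairs. Giovanni Tripepi et al. Nephrol. Dial. Transplant. 2010;25: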
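The concordance formula above can be sketched in a few lines of Python. This is only an illustration: the function name `c_statistic` and the five predicted risks and outcomes are hypothetical, not data from Tripepi et al.

```python
from itertools import product


def c_statistic(predicted, outcome):
    """C = (concordant + 0.5 * ties) / all pairs, over every (event, non-event) pair."""
    events = [p for p, y in zip(predicted, outcome) if y == 1]
    non_events = [p for p, y in zip(predicted, outcome) if y == 0]
    concordant = ties = 0
    for p_event, p_non_event in product(events, non_events):
        if p_event > p_non_event:
            concordant += 1      # subject with the outcome got the higher risk
        elif p_event == p_non_event:
            ties += 1            # tied predictions count for half
    all_pairs = len(events) * len(non_events)
    return (concordant + 0.5 * ties) / all_pairs


# Hypothetical predicted risks and observed outcomes (1 = event, 0 = no event)
predicted = [0.9, 0.7, 0.4, 0.4, 0.2]
outcome = [1, 1, 0, 1, 0]
c = c_statistic(predicted, outcome)
print(round(c, 3))
```

With these made-up values there are 6 pairs, of which 5 are concordant and 1 is tied.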

7 2. CALIBRATION Measure of how close predicted probabilities are to observed rates of the positive outcome. Ex: is a predicted 70% chance observed as 70% in the actual data? Commonly used technique: Hosmer and Lemeshow chi-square. Partition the data into groups, then compare the average predicted probability with the outcome prevalence in each group using a chi-square test

8 HOSMER-LEMESHOW TEST Table columns: deciles of estimated probability of death; sum of predicted deaths; sum of observed deaths. Giovanni Tripepi et al. Nephrol. Dial. Transplant. 2010;25:

9 HL test, computed over the deciles from the sum of predicted deaths and the sum of observed deaths:
$\chi^2 = \sum \left[ \dfrac{(\text{observed} - \text{estimated})^2}{\text{estimated}} \right] = 12$
A chi-square of 12 with n-2 = 8 degrees of freedom (n = 10 deciles) gives p=0.15: the proportion of deaths predicted by the model does not significantly differ from the observed deaths
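As a rough check of the p value quoted above, the sketch below computes the chi-square sum in the simplified form shown on the slide and the upper-tail probability for 8 degrees of freedom. The function names are hypothetical, the closed-form tail probability is valid only for an even number of degrees of freedom, and the two-group input is made up for illustration.

```python
import math


def hosmer_lemeshow_chi2(observed, estimated):
    """Sum of (observed - estimated)^2 / estimated over the risk groups."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, estimated))


def chi2_upper_tail_even_df(x, df):
    """P(chi-square > x); this closed form holds only when df is even."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("closed form requires a positive, even df")
    half = x / 2.0
    return math.exp(-half) * sum(half ** k / math.factorial(k) for k in range(df // 2))


# Chi-square of 12 with 8 degrees of freedom, as on the slide
p_value = chi2_upper_tail_even_df(12.0, 8)
print(round(p_value, 2))

# Made-up two-group example of the sum itself
toy_chi2 = hosmer_lemeshow_chi2(observed=[3, 5], estimated=[4, 4])
```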

10 TYPICAL TECHNIQUES FOR MODEL VALIDATION Internal validation: bootstrapping methods. External validation: use patient data not used for model development


11 EXAMPLE (BOX1) Outcome: Mechanical failure of heart valve (Y/N) Predictors: sex (score of 1=female) age (years) body surface area (BSA; m2) whether a replacement valve came from a batch with fractures (score of 1=valve came from batch with fractures)

12 RISK MODEL: LOGISTIC REGRESSION MODEL
Patient's risk of heart valve failure = $\dfrac{e^{\text{patient's risk score}}}{1 + e^{\text{patient's risk score}}}$
Patient's risk score = intercept + ($b_{sex}$ × sex) + ($b_{age}$ × age) + ($b_{BSA}$ × BSA) + ($b_{fracture}$ × fracture)
Regression coefficients (b) can be obtained using various methods: standard logistic regression, ridge, or lasso

13 $b_{sex}$ = …, $b_{age}$ = …, $b_{BSA}$ = …, $b_{fracture}$ = …, Intercept = -4.25. The risk score for a 40 year old female patient with a body surface area of 1.7 m² and an artificial valve from a batch with fractures would then be calculated as: -4.25 + ($b_{sex}$ × 1 (female sex)) + ($b_{age}$ × 40 (age; years)) + ($b_{BSA}$ × 1.7 (BSA in m²)) + ($b_{fracture}$ × 1 (fracture present in batch)) = -2.89. Therefore, her predicted risk would be: exp(-2.89) / (1 + exp(-2.89)) = 5.3%
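The last step of this worked example, turning a risk score into a predicted risk, can be reproduced with a short Python sketch. The function name `predicted_risk` is hypothetical; the score is taken as -2.89, the value that reproduces the 5.3% quoted above.

```python
import math


def predicted_risk(risk_score):
    """Logistic transformation: exp(score) / (1 + exp(score))."""
    return math.exp(risk_score) / (1.0 + math.exp(risk_score))


# Risk score of the 40 year old female patient in the example
risk = predicted_risk(-2.89)
print(f"{risk:.1%}")  # prints 5.3%
```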

14 BOOTSTRAP VALIDATION Use when no external cohort is available. Bootstrap dataset: an imitation of the original dataset, constructed by random sampling of patients (with replacement) from the original dataset. Typically, a large number of bootstrap datasets (ex: 200) is created. The model is fitted to each bootstrap dataset, and the estimated coefficients are used to obtain predictions for the patients in the original dataset. These predictions are used to calculate the calibration slope for the fitted model
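The bootstrap procedure described above might look roughly like the following. Everything here is an assumption made for illustration: the data are simulated, the model has a single predictor, and `fit_logistic` is a hypothetical helper that fits a logistic regression by Newton-Raphson using only the standard library. The calibration slope is obtained by regressing the original outcomes on the bootstrap model's risk score.

```python
import math
import random


def fit_logistic(x, y, iterations=15):
    """Fit logit P(y=1) = a + b*x by Newton-Raphson; returns (a, b)."""
    a = b = 0.0
    for _ in range(iterations):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for xi, yi in zip(x, y):
            p = 1.0 / (1.0 + math.exp(-(a + b * xi)))
            w = p * (1.0 - p)
            g0 += yi - p              # gradient with respect to a
            g1 += (yi - p) * xi       # gradient with respect to b
            h00 += w                  # entries of the 2x2 information matrix
            h01 += w * xi
            h11 += w * xi * xi
        det = h00 * h11 - h01 * h01
        a += (h11 * g0 - h01 * g1) / det
        b += (h00 * g1 - h01 * g0) / det
    return a, b


# Simulated "original dataset" (hypothetical): one predictor, binary outcome
rng = random.Random(2015)
n = 300
x = [rng.gauss(0.0, 1.0) for _ in range(n)]
y = [1 if rng.random() < 1.0 / (1.0 + math.exp(-(-1.0 + xi))) else 0 for xi in x]

slopes = []
for _ in range(200):
    # Bootstrap dataset: sample patients with replacement
    idx = [rng.randrange(n) for _ in range(n)]
    a, b = fit_logistic([x[i] for i in idx], [y[i] for i in idx])
    # Risk scores for the ORIGINAL patients from the bootstrap model
    score = [a + b * xi for xi in x]
    # Calibration slope: original outcomes regressed on that risk score
    _, slope = fit_logistic(score, y)
    slopes.append(slope)

mean_slope = sum(slopes) / len(slopes)
print(round(mean_slope, 2))
```

With one predictor and many events the average slope should sit near 1; a model with many predictors and few events would be expected to give a slope below 1.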

15 SOMETIMES, THERE ARE FEW EVENTS COMPARED TO NUMBER OF PREDICTORS Examples: structural failure of mechanical heart valves; sudden cardiac death in patients with hypertrophic cardiomyopathy. Predictions from such a model often perform less well in a new patient group

16 WHY? The fitted model captures not only the association between outcome and predictors but also random variation (noise) in the development dataset. This is model overfitting: the model underestimates the probability of the event in low risk patients and overestimates it in high risk patients

17 SAMPLE SIZE REQUIRED FOR RISK PREDICTION MODEL Rule of thumb: events per variable (EPV) ratio
$\text{EPV} = \dfrac{\text{number of events}}{\text{number of regression coefficients}}$
An EPV of 10 is needed to avoid overfitting

18 EXAMPLE 60 events for a model with 6 regression coefficients (EPV = 60/6 = 10). Outcome: CV death. Predictors: structural heart disease, age, sex, HT, family history of CVD, DM
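The EPV arithmetic is simple enough to check in Python. The helper name `events_per_variable` is hypothetical; the event and coefficient counts are the ones quoted in this presentation (60 and 6 here, 56 and 10 for the heart valve example, and an EPV of 4.2 from 42 events for the sudden cardiac death example).

```python
def events_per_variable(n_events, n_coefficients):
    """EPV = number of events / number of regression coefficients."""
    return n_events / n_coefficients


examples = {
    "CV death example": (60, 6),
    "Mechanical heart valve failure": (56, 10),
    "Sudden cardiac death in HCM": (42, 10),
}

for name, (events, coefficients) in examples.items():
    epv = events_per_variable(events, coefficients)
    flag = "meets the rule of thumb" if epv >= 10 else "below 10, overfitting is a concern"
    print(f"{name}: EPV = {epv:.1f} ({flag})")
```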

19 WHEN EVENTS ARE RARE EPV of 10 may be difficult to achieve

20 PROBLEM OF RARE OUTCOME Models with few events compared to the number of predictors often underperform when applied to new patients. Model overfitting: underestimates the probability of the event in low risk patients and overestimates it in high risk patients

21 COMMON STRATEGIES 1. Univariable screening: only include significant predictors in the model 2. Stepwise model selection, ex: backwards elimination. Drawback: the process may not be stable. Small changes in the data or in the predictor selection process could lead to different predictors being included in the final model

22 ANOTHER WAY TO ALLEVIATE MODEL OVERFITTING Shrinkage methods: methods that tend to shrink the regression coefficients towards zero, moving poorly calibrated predicted risks towards the average risk

23 SIMPLEST SHRINKAGE METHOD Shrink all coefficients by a common factor, ex: -20%. However, this approach does not perform well if the EPV is very low

24 PENALIZED REGRESSION A flexible shrinkage approach that is effective when the EPV is low (<10). Process: 1. Specify the form of the risk model (ex: logistic/Cox) 2. Fit the data to estimate the coefficients in a standard logistic/Cox model 3. The range of predicted risks is too wide as a result of overfitting 4. Shrink the regression coefficients toward zero by placing a constraint on their values (penalization). Penalized coefficient estimates are typically smaller than those of standard regression

25 SEVERAL FORMS OF PENALIZED REGRESSION Ridge; Lasso; derivations of ridge and lasso: elastic net, smoothly clipped absolute deviation, adaptive lasso, etc. Software: R (package penalized), SPSS, Stata (rxridge, firthlogit, overfit)

26 RIDGE REGRESSION Fit the model under the constraint that the sum of squared regression coefficients does not exceed a particular threshold. Penalizes the coefficients using the formula:
$\ell(\beta) - \lambda \sum_{j=1}^{p} \beta_j^2$
λ: scalar chosen by the investigator to control the amount of shrinkage; λ = 0 results in the standard regression model

27 The threshold is chosen to maximize the model's predictive ability using cross validation: the dataset is split into k groups; the model is fitted to (k-1) groups and validated on the omitted group; this is repeated k times, each time omitting a different group. Ex: 10-fold cross validation. Split the dataset into 10 subsets; subset j is omitted, then the penalized model is fitted to the other nine subsets. Calculate predictions for all patients, calculate predictive ability, and compare with the full model
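The splitting step of k-fold cross validation can be sketched as below. This is a minimal illustration, assuming patients are identified by index; the generator name `k_fold_indices` and the toy size of 20 patients are hypothetical. Choosing the threshold would mean repeating this loop for each candidate value, fitting the penalized model on `train` and scoring its predictions on `omitted`.

```python
import random


def k_fold_indices(n, k, seed=0):
    """Yield (train, omitted) index lists; each group is omitted exactly once."""
    indices = list(range(n))
    random.Random(seed).shuffle(indices)          # random allocation to groups
    folds = [indices[i::k] for i in range(k)]     # k roughly equal groups
    for j in range(k):
        omitted = folds[j]
        train = [i for m, fold in enumerate(folds) if m != j for i in fold]
        yield train, omitted


# 10-fold cross validation on a toy dataset of 20 patients
splits = list(k_fold_indices(n=20, k=10))
for train, omitted in splits:
    # Here: fit the penalized model on `train`, predict for `omitted`
    pass
```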

28 LASSO REGRESSION Least Absolute Shrinkage and Selection Operator. Similar to ridge, but constrains the sum of absolute values of the regression coefficients:
$\ell(\beta) - \lambda \sum_{j=1}^{p} |\beta_j|$
Lasso can effectively exclude predictors from the final model by shrinking their coefficients to 0
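To make the two penalties concrete, the sketch below fits a logistic model by plain (proximal) gradient ascent with either a ridge or a lasso penalty. It is only an illustration under stated assumptions: the data are simulated, `fit_penalized_logistic` is a hypothetical helper rather than the algorithm used by any of the packages named earlier, the intercept is left unpenalized, and λ = 5 is an arbitrary value rather than one chosen by cross validation.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def soft_threshold(value, threshold):
    """Proximal step for the lasso penalty: move toward 0, stopping at 0."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def fit_penalized_logistic(X, y, lam=0.0, penalty="ridge", iterations=500):
    """Maximize l(beta) - lam * penalty(beta); returns (intercept, coefficients)."""
    p = len(X[0])
    intercept, beta = 0.0, [0.0] * p
    # Conservative step size from an upper bound on the curvature
    step = 1.0 / (0.25 * sum(1.0 + sum(v * v for v in row) for row in X) + 2.0 * lam)
    for _ in range(iterations):
        g0, g = 0.0, [0.0] * p
        for row, yi in zip(X, y):
            resid = yi - sigmoid(intercept + sum(b * v for b, v in zip(beta, row)))
            g0 += resid
            for j, v in enumerate(row):
                g[j] += resid * v
        intercept += step * g0
        if penalty == "ridge":   # gradient of -lam * sum(beta_j^2) is -2*lam*beta_j
            beta = [b + step * (gj - 2.0 * lam * b) for b, gj in zip(beta, g)]
        else:                    # lasso: gradient step, then soft-thresholding
            beta = [soft_threshold(b + step * gj, step * lam) for b, gj in zip(beta, g)]
    return intercept, beta


# Simulated data (hypothetical): three predictors, the third is pure noise
rng = random.Random(7)
X = [[rng.gauss(0.0, 1.0) for _ in range(3)] for _ in range(150)]
true_beta = [1.0, 0.5, 0.0]
y = [1 if rng.random() < sigmoid(-2.0 + sum(b * v for b, v in zip(true_beta, row))) else 0
     for row in X]

_, standard = fit_penalized_logistic(X, y, lam=0.0)                 # lambda = 0
_, ridge = fit_penalized_logistic(X, y, lam=5.0, penalty="ridge")
_, lasso = fit_penalized_logistic(X, y, lam=5.0, penalty="lasso")
for name, coefs in [("standard", standard), ("ridge", ridge), ("lasso", lasso)]:
    print(name, [round(b, 2) for b in coefs])
```

Setting `lam=0.0` reproduces the standard fit, and both penalized fits should show coefficients pulled toward zero. Only the lasso update can set a coefficient to exactly 0.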

29 RIDGE OR LASSO? In health research, a set of prespecified predictors is often available, so ridge regression is usually the preferred option. Lasso: use if a simpler model with fewer predictors is preferred (ex: to save time/resources by collecting less information on patients)

30 DETECTION OF MODEL OVERFITTING Assessment of model calibration, by internal validation or external validation. Divide patients into risk groups according to predicted risk, then compare the proportion of patients who had the event with the average predicted risk in each group. Present as a graph (calibration plot) or a table (with Hosmer-Lemeshow GoF)

31 DEGREE OF OVERFITTING Quantified by a simple regression model: outcomes in the validation data are regressed, using logistic regression, on their predicted risk score. Well-calibrated model: estimated slope (calibration slope) close to 1. Overfitted model: slope <1 (low risks are underestimated, high risks are overestimated)

32 EXAMPLE 1: MECHANICAL HEART VALVE FAILURE Data on 3118 patients with a mechanical heart valve. Outcome: failure of the artificial valve (56 events). Predictors: age, sex, BSA, fractures in the batch of the valve (Y/N), year of valve manufacture (<1981/>1981), valve size (10 coefficients). EPV = 56/10 = 5.6. Models compared: standard, ridge, and lasso regression

33 Columns: Predictors; Descriptive statistics; Regression coefficient estimates (Standard regression, Ridge regression, Lasso regression)
Intercept: (23) 6.65 (15)
Sex (female): 1337 (43) (41) 0.16 (34)
Age (years): 54.1 (10.8) (11) (4)
Body surface area (m²): 1.6 (0.3) (24) 1.75 (12)
Aortic size 23, 27, 29, 31 mm: (75) 0.61 (68)
Mitral size mm: (84) 0.43 (67)
Mitral size 29 mm: (59) 1.13 (42)
Mitral size 31 mm: (47) 1.77 (33)
Mitral size 33 mm: (45) 1.73 (33)
Fracture in batch (yes): (17) 0.64 (9)
Date of manufacture (after 1981): (26) 1.22 (12)

34 FIG 1: DISTRIBUTION OF PREDICTED RISK SCORES ESTIMATED USING STANDARD, RIDGE, AND LASSO REGRESSION Menelaos Pavlou et al. BMJ 2015;351:bmj.h3868

35 FIG 2: OBSERVED PROPORTIONS VERSUS AVERAGE PREDICTED RISK OF THE EVENT (USING STANDARD, RIDGE AND LASSO REGRESSION).

36 EXAMPLE 2: SUDDEN CARDIAC DEATH IN HYPERTROPHIC CARDIOMYOPATHY Data on 1000 patients. Outcome: sudden cardiac death within 10 years from diagnosis (42 events). Predictors: age, max LV wall thickness, fractional shortening, LA diameter, peak LV outflow tract gradient (continuous); and gender, family history of SCD, non-sustained VT, severity of HF by NYHA, unexplained syncope (binary). EPV = 4.2. The model was externally validated using data from different centers (2405 patients, 106 events)

37 COEFFICIENT TABLE Columns: Predictors; Regression coefficient estimates (Standard regression, Ridge regression, Lasso regression). Rows: Age (years); Max wall thickness (mm); Fractional shortening (mm); LA diameter (mm); Peak LVOT gradient (mmHg); Sudden cardiac death in family; Non-sustained VT; Syncope; Sex (male); NYHA class III/IV

38 FIGURE

39 CONCLUSION When the number of events is low compared to the number of predictors in a risk model, standard regression may produce an overfitted risk model. Common methods such as stepwise selection and univariable screening are problematic and should be avoided. It is recommended that the use of penalized regression methods be explored. Other methods, such as incorporating existing evidence (from published risk models, meta-analyses, and expert opinion), could be better in some scenarios

40 TAKE HOME MESSAGE Beware of prediction models with $\text{EPV} = \dfrac{\text{number of events}}{\text{number of regression coefficients}} < 10$. A standard model is usually overfitted when EPV<10: it underestimates risk in low risk patients and overestimates risk in high risk patients. Penalizing the coefficients using penalized regression methods such as ridge and lasso is a possible solution to this problem

41 THANK YOU


42 NEXT JOURNAL CLUB REMINDER: "Factors influencing recruitment to research: qualitative study of the experiences and perceptions of research teams" by Threechada Boonchan. Friday 19th February, 13:00-14:30, Room 905; lunch from 12:00 noon
