It's hard to predict!
- Bethany Parks
- 5 years ago
Transcription
Statistical Methods for Prediction
Steven Goodman, MD, PhD
With thanks to: Ciprian M. Crainiceanu, Associate Professor, Department of Biostatistics, JHSPH

It's hard to predict!
People with no future: Marilyn Monroe and Elvis Presley.
Useless or impossible technologies: telephones, light bulbs, radio, TV, rockets, atomic bombs, X-rays, space flight, portable computers.
Barack Obama's NCAA predictions.

Risk Prediction Models

What do we want risk prediction models to do?
Randolph et al. (1998, Crit Care Med): Prediction tools only inform decisions if they help us differentiate between those patients with higher risks of encountering the outcome from those patients with lower risks.
Moons and Harrell (2005, Academic Radiology): Ultimately the patient and his physician want to know his risk of disease.
Cook (2007, Circulation): Most important for clinical risk prediction is whether a model can more accurately stratify individuals into higher or lower risk categories of clinical importance.
These descriptions involve: discriminatory accuracy; calibration; predictive accuracy.
Goal of these models
NOT just to distinguish between diseased and non-diseased patients, or those who will and won't prevent disease.
To improve the overall health outcomes (or reduce cost or suffering) in the group to which the model is applied.
To do more good than harm.

Differences from explanatory or etiologic epi
Diagram for predictive model/marker studies: Subject, Test setting, Test, Result, Action, Consequences.
Prediction is also concerned with individual classification, NOT statistical distinction of group averages.

Steps of Statistical Prediction Modeling
Design
Calibration, Discrimination
Validation
Assessment
Incremental value
Reclassification
Many ways of constructing prediction models
Statistical modeling, regression, regression trees, random forests, classification trees, clustering, data mining, neural networks, machine learning, support vector machines.
It (almost) doesn't matter how you build your model if you validate it properly!

Examples of prediction
Framingham risk score / CHD: sex, age, total cholesterol, HDL, BP, diabetes, smoking.
Gail risk model / breast cancer: personal history, age, age at menarche, age at first live birth, family history, number of biopsies, race.
Polls: election, market preferences, etc.
Markets: pay per prediction.

Figure: regression line. β0 = 1.35, β1 = 1.66, se(β1) = 0.75, p-value = 0.06, r = 0.62, R² = 0.38.
Figure: regression line and confidence interval (same fit).
Regression line, confidence interval, prediction interval
Figure panels; L(CI) and L(PI) are the lengths of the confidence interval and the prediction interval.
Panel 1: β0 = 1.35, β1 = 1.66, se(β1) = 0.75, p-value = 0.06, r = 0.62, R² = 0.38, L(CI) = 3.62, L(PI) = 9.03.
Panel 2: β0 = 0.71, β1 = 1.87, se(β1) = 0.24, p-value < …, r = 0.49, R² = 0.24, L(CI) = 0.75, L(PI) = 7.72.
Panel 3: β0 = 0.88, β1 = 1.53, se(β1) = 0.08, p-value < …, r = 0.42, R² = 0.18, L(CI) = 0.23, L(PI) = …
Length of confidence and prediction intervals

Lessons
Length of CI: uncertainty about the population average response, not uncertainty about the predicted value.
Length of PI: uncertainty about the individual response (outcome).
Statistical significance is not prediction relevance.
Length of CI << Length of PI.

Statistical significance and prediction
They are different.
Statistical significance is a weak surrogate for goodness of prediction.
Statistical significance can be used to screen for potential confounders.
Covariates that are not significant probably do not have much predictive power.

Prediction in binary regression
Outcome is 0-1.
Examples: non-diseased / diseased; alive / dead; failure / success (procedure).
Type of binary regression: logistic (log odds of success/failure).
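The contrast between the two interval lengths can be illustrated with a small simulation. The sketch below is only an illustration, not the analysis behind the figures: the data are simulated, the names `interval_lengths` and `simulate` are hypothetical, and a normal critical value of 1.96 is assumed in place of the exact t quantile.

```python
import math
import random


def interval_lengths(x, y, x0, crit=1.96):
    """Return (CI length, PI length) at x0 for a simple linear regression.

    crit=1.96 is a normal approximation (an assumption); exact intervals
    would use a t quantile with n - 2 degrees of freedom.
    """
    n = len(x)
    xbar = sum(x) / n
    ybar = sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    b0 = ybar - b1 * xbar
    rss = sum((yi - b0 - b1 * xi) ** 2 for xi, yi in zip(x, y))
    s2 = rss / (n - 2)  # residual variance estimate
    leverage = 1.0 / n + (x0 - xbar) ** 2 / sxx
    se_mean = math.sqrt(s2 * leverage)          # uncertainty about the average response
    se_pred = math.sqrt(s2 * (1.0 + leverage))  # uncertainty about one individual's response
    return 2 * crit * se_mean, 2 * crit * se_pred


def simulate(n, seed):
    """Simulated data (assumed model): y = 1 + 1.5 x + noise with SD 2."""
    rng = random.Random(seed)
    x = [rng.uniform(0, 4) for _ in range(n)]
    y = [1.0 + 1.5 * xi + rng.gauss(0, 2.0) for xi in x]
    return x, y


ci_small, pi_small = interval_lengths(*simulate(20, seed=1), x0=2.0)
ci_large, pi_large = interval_lengths(*simulate(2000, seed=1), x0=2.0)
print(f"n=20:   L(CI)={ci_small:.2f}  L(PI)={pi_small:.2f}")
print(f"n=2000: L(CI)={ci_large:.2f}  L(PI)={pi_large:.2f}")
```

With more data the CI shrinks toward zero, while the PI stays close to 2 × 1.96 × the residual standard deviation. A highly significant slope therefore does not imply a tight individual prediction.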
Sensitivity and specificity of binary prediction rules
Sensitivity: P(prediction = 1 | outcome = 1), estimated as Sens. = TP / (TP + FN).
Specificity: P(prediction = 0 | outcome = 0), estimated as Spec. = TN / (TN + FP).
Estimators depend on the threshold.

Figure: sensitivity and specificity curves (red = specificity).
Figure: Receiver Operating Characteristic (ROC) curve.
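As a minimal sketch of how these estimators depend on the threshold, the code below classifies subjects as predicted-positive when their risk is at or above a cutoff. The function name `sens_spec`, the toy risks, and the "at or above" convention are assumptions made for illustration.

```python
def sens_spec(risks, outcomes, threshold):
    """Return (sensitivity, specificity) for the rule: predict 1 if risk >= threshold."""
    tp = fn = tn = fp = 0
    for risk, outcome in zip(risks, outcomes):
        predicted = 1 if risk >= threshold else 0
        if outcome == 1 and predicted == 1:
            tp += 1
        elif outcome == 1:
            fn += 1
        elif predicted == 0:
            tn += 1
        else:
            fp += 1
    return tp / (tp + fn), tn / (tn + fp)


# Toy data (hypothetical): predicted risks and observed 0/1 outcomes.
risks = [0.1, 0.2, 0.3, 0.4, 0.6, 0.7, 0.8, 0.9]
outcomes = [0, 0, 1, 0, 0, 1, 1, 1]

for cutoff in (0.25, 0.5):
    sens, spec = sens_spec(risks, outcomes, cutoff)
    print(f"threshold={cutoff}: sensitivity={sens:.2f}, specificity={spec:.2f}")
```

Lowering the threshold raises sensitivity at the cost of specificity. Sweeping the threshold over all values traces out the ROC curve.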
Area under the ROC curve (AUC)
Probability that, given two subjects, one who will develop an event and one who will not, the model will assign a higher probability of an event to the former.
One of the main criteria for assessing discrimination accuracy.
AUC = 0.68 (in the example).

Steps of Statistical Prediction Modeling
Design
Calibration
Validation
Replication
Extrapolation
Refinement and Adaptation

Calibration, aka Clinical validity
Calibration: how well the observed outcomes agree with the predicted outcomes.
Most model fitting strategies select the model that is best calibrated to the data.
In statistics, calibration is called model fit.
In clinical prediction, the ability to predict prognosis is called clinical validity. The ability to predict response to therapy is called either clinical utility or predictive ability (as opposed to prognostic ability).
Measures of Calibration
Goodness-of-fit statistics, e.g., the Hosmer-Lemeshow statistic, comparing observed and expected outcomes within quantiles.
Ad hoc comparisons of observed and expected outcomes.

Example: SUPPORT Study
Develop a model to predict risk of death for seriously ill hospitalized adults, to assist physicians in clinical decision making.
Cox regression model fit using a prospective study of 4301 hospitalized adults with at least 1 of 9 illnesses and expected 6-month mortality of 50%.
Predictors: disease category, severity of acute disease as measured by physiologic abnormalities, evaluation of the patient's long-term health status, age, comorbid conditions, number of days hospitalized before study entry; collected 3 days after study entry.
Validation in 4028 independent patients.

Calibration of SUPPORT Model
From Knaus et al. The SUPPORT Prognostic Model: Objective Estimates of Survival for Seriously Ill Hospitalized Adults. Ann Int Med 122:

Example: Calibration of Model Predicting Survival in Cirrhotic Patients
Shows agreement between observed (step-function) and predicted (smooth function) survival, based on Cox regression model.
From Guardiola et al. External validation of a prognostic model for predicting survival of cirrhotic patients with refractory ascites. Am J Gastroenterol 97:
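The comparison of observed and expected outcomes within quantiles can be sketched for a binary outcome as follows. This is a simplified illustration, not the SUPPORT analysis (which used Cox regression): the function names are hypothetical, the four-subject dataset is made up, and the statistic is computed without a p-value. It assumes the mean predicted risk in every group lies strictly between 0 and 1.

```python
def calibration_groups(risks, outcomes, groups=10):
    """Sort subjects by predicted risk, split into groups, and return
    a list of (group size, observed events, expected events)."""
    pairs = sorted(zip(risks, outcomes))
    n = len(pairs)
    table = []
    for g in range(groups):
        chunk = pairs[g * n // groups:(g + 1) * n // groups]
        if not chunk:
            continue
        observed = sum(outcome for _, outcome in chunk)
        expected = sum(risk for risk, _ in chunk)
        table.append((len(chunk), observed, expected))
    return table


def hosmer_lemeshow_statistic(table):
    """Sum over groups of (O - E)^2 / (n * mean_risk * (1 - mean_risk))."""
    stat = 0.0
    for size, observed, expected in table:
        mean_risk = expected / size
        stat += (observed - expected) ** 2 / (size * mean_risk * (1.0 - mean_risk))
    return stat


# Toy data (hypothetical): two low-risk subjects without events,
# two high-risk subjects with events.
toy_risks = [0.2, 0.2, 0.8, 0.8]
toy_outcomes = [0, 0, 1, 1]
toy_table = calibration_groups(toy_risks, toy_outcomes, groups=2)
for size, observed, expected in toy_table:
    print(f"n={size}  observed={observed}  expected={expected:.1f}")
print("Hosmer-Lemeshow-type statistic:", round(hosmer_lemeshow_statistic(toy_table), 3))
```

Large gaps between the observed and expected columns indicate poor calibration in that risk range, which a single summary statistic can hide.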
Discrimination
How well does the risk prediction model separate the two groups?
Want separated risk distributions for the two groups.

Discrimination vs. Calibration
Suppose the risk of death in 5 years is 50%:
a model that assigns a risk of 50% to the entire population is perfectly calibrated, but has no discrimination;
a model that assigns all cases 11% risk and all controls 10% risk discriminates perfectly but is poorly calibrated.
Models typically cannot be perfect in both.
We always want a model that is well calibrated, but discrimination (and predictive accuracy) relate more directly to the clinical utility of the model.

Measures of Discrimination
TPR/FPR, ROC curve; time-dependent versions for time-varying outcomes.
C-index (concordance statistic):
For binary outcomes, C-index = area under the ROC curve (AUC) = probability that a case has a higher risk score than a control.
For time-varying binary outcomes, C-index = probability that the person with the earlier event has the higher risk score.
Very popular, but little clinical relevance.
Misclassification rate.
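The two hypothetical models in the discrimination vs. calibration example can be checked numerically. The sketch below computes the C-index for a binary outcome as the proportion of case-control pairs in which the case has the higher score (ties count one half), and uses the mean predicted risk as a crude calibration check. The function name `c_index` and the group sizes of 50 are assumptions for illustration.

```python
def c_index(case_scores, control_scores):
    """Probability that a random case scores higher than a random control (ties = 1/2)."""
    wins = 0.0
    for case in case_scores:
        for control in control_scores:
            if case > control:
                wins += 1.0
            elif case == control:
                wins += 0.5
    return wins / (len(case_scores) * len(control_scores))


# Assumed population: 50 cases and 50 controls, so the observed risk is 50%.
n_cases, n_controls = 50, 50
observed_risk = n_cases / (n_cases + n_controls)

# Model A: everyone is assigned a risk of 50%.
auc_a = c_index([0.5] * n_cases, [0.5] * n_controls)
mean_a = 0.5

# Model B: cases are assigned 11%, controls 10%.
auc_b = c_index([0.11] * n_cases, [0.10] * n_controls)
mean_b = (0.11 * n_cases + 0.10 * n_controls) / (n_cases + n_controls)

print(f"observed risk = {observed_risk:.3f}")
print(f"Model A: AUC = {auc_a:.2f}, mean predicted risk = {mean_a:.3f}")
print(f"Model B: AUC = {auc_b:.2f}, mean predicted risk = {mean_b:.3f}")
```

Model A matches the observed risk but ranks no better than chance. Model B ranks every case above every control, yet its predicted risks are far below the observed 50%.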
Example: Gail Model for Breast Cancer Risk
Discriminatory accuracy assessed by Rockhill et al. (2001, JNCI).
82,109 white women aged … from the Nurses' Health Study, …
Focus on invasive breast cancer within the 5-year period.
From Rockhill et al. Validation of the Gail et al. model of breast cancer risk prediction and implications for chemoprevention. JNCI 93:

ROCs for constant ORs and logistic regression
From Pepe et al. Limitations of the Odds Ratio in Gauging the Performance of a Diagnostic, Prognostic, or Screening Marker. AJE, 2004.
FIGURE 2. Probability distributions of a marker, X, in cases (solid curves) and controls (dashed curves) consistent with the logistic model logit P(D = 1 | X) = α + βX. It has been assumed that X has a mean of 0 and a standard deviation of 0.5 in controls so that a unit increase represents the difference between the 84th and 16th percentiles of X in controls. The marker is normally distributed, with the same variance in cases. The odds ratio (OR) per unit increase in X is shown.
VALIDATION
Avoiding Overoptimism When Calculating Model Accuracy

Internal vs. External Validation
Internal validation: evaluate the model by carefully using the current data.
To evaluate the accuracy of the model in the current setting.
Our focus today.
External validation: use a different dataset, e.g., a different geographic location, different study population, different time period.
To determine generalizability.

What are you validating?
The whole model fitting process, including the mathematical form of the model?
The predictors in the model?
The coefficients of the predictors in the model?
Whatever you are validating, you need to freeze that in the validation/test set. But be very careful about what you are not validating.

The Simplest Approach: Training/Test Split
Randomly divide the data into two parts.
Use one part to fit the model, the other to validate it.
E.g., 2/3 training; 1/3 test.
Training/Test Split, cont'd
Useful when: fitting the model is very computationally intensive or involves decisions which cannot be automated; the dataset is large.
Downside: inefficient use of data (we'd like to use all the data to fit the model).

K-Fold Cross-Validation
Randomly split the data into K parts, e.g., 6 parts: Train, Train, Test, Train, Train, Train.
For the kth part (3rd part above), fit the model to the K-1 other parts, and use the model to predict outcomes on the kth part.
Do this for k = 1, 2, ..., K, each time obtaining a new model and filling in predicted values for part of the data.
Calculate model accuracy using these predicted values.
Note that each observation has one predicted value, obtained from a model which was fit without that observation.

Summary
Using the same data to fit and validate the risk prediction model leads to an overly optimistic estimate of model accuracy.
Most common approach to correcting for overoptimism: training/test split.
More sophisticated approaches: cross-validation and bootstrapping, which use the data more efficiently.
Key issue is determining which model to use in practice.
A prediction rule does not merit the name without proper validation.

Aspects of Model Performance
Calibration: how well does the model fit the data (bias).
Discrimination: how well does the model discriminate between the two groups of individuals (ordering).
Incremental value: how much does it improve on what we already knew?
Clinical utility: how does use of the model impact clinical outcomes?
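The K-fold cross-validation procedure described in this section can be sketched in a few lines. The version below is a minimal illustration under stated assumptions: the model is a least-squares line, the data are simulated, and the names `k_fold_predictions`, `fit_line`, and `predict_line` are hypothetical. Any fitting procedure could be passed in instead.

```python
import random


def fit_line(xs, ys):
    """Least-squares fit of y = intercept + slope * x."""
    n = len(xs)
    xbar = sum(xs) / n
    ybar = sum(ys) / n
    sxx = sum((x - xbar) ** 2 for x in xs)
    sxy = sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return ybar - slope * xbar, slope


def predict_line(model, x):
    intercept, slope = model
    return intercept + slope * x


def k_fold_predictions(xs, ys, k, fit, predict, seed=0):
    """Return one out-of-fold prediction per observation."""
    indices = list(range(len(xs)))
    random.Random(seed).shuffle(indices)          # random split into K parts
    folds = [indices[i::k] for i in range(k)]
    predictions = [None] * len(xs)
    for test_fold in folds:
        held_out = set(test_fold)
        train = [i for i in indices if i not in held_out]
        model = fit([xs[i] for i in train], [ys[i] for i in train])
        for i in test_fold:                       # model never saw observation i
            predictions[i] = predict(model, xs[i])
    return predictions


def mean_squared_error(ys, predictions):
    return sum((y - p) ** 2 for y, p in zip(ys, predictions)) / len(ys)


# Simulated data (assumption): y = 1 + 2x + noise.
rng = random.Random(42)
xs = [rng.uniform(0, 10) for _ in range(30)]
ys = [1.0 + 2.0 * x + rng.gauss(0, 3.0) for x in xs]

full_model = fit_line(xs, ys)
apparent = mean_squared_error(ys, [predict_line(full_model, x) for x in xs])
cross_validated = mean_squared_error(ys, k_fold_predictions(xs, ys, 6, fit_line, predict_line))
print(f"apparent MSE = {apparent:.2f}, 6-fold cross-validated MSE = {cross_validated:.2f}")
```

The apparent error is computed on the same data used for fitting, so it is the optimistic figure. The cross-validated error uses only predictions from models fit without the observation being predicted.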
Incremental Value
The increase in classification accuracy when additional information is added to the model.
NOT: the magnitude of the new coefficient.
NOT: the statistical significance of the new coefficient.
Easy to find examples of factors that are strongly associated, but do not impact classification accuracy.

Example: Predicting Pancreatic Cancer Risk
Incremental Value of CA-125
Solid line: ROC curve for CA-19-9. Dashed line: ROC curve for risk model.
From Pepe et al. Limitations of the odds ratio for gauging the performance of a diagnostic, prognostic, or screening marker. Am J Epidemiol 159:
Incremental Value of CA-125, cont'd
ROC curves show improvement in classification accuracy with CA-125.
Change in risk distribution not shown in this figure (see next 2 examples).

Example: Utility of CRP in Predicting Cardiovascular Risk
Cook et al. (2006, Ann Int Med).
Develop cardiovascular risk prediction model, with and without CRP.
Non-diabetic women in the Women's Health Study, a nationwide cohort of women 45 years and older, free of cancer and CVD at entry.
Followed annually for development of CVD (average 10 years).

CRP Example, cont'd
Cox regression model with covariates: age, CRP, HDL, total cholesterol, SBP, antihypertensive use, current smoking, measured at baseline.

The Net Benefit of CRP
Table: model without CRP versus model with CRP, cross-classified by risk category (< 5%, 5% to 10%, 10% to 20%, > 20%, Total).
Discriminatory Accuracy
If risk > 20% suggests intervention:
Without CRP: 55/725 (7.6%) of cases identified; 154/26202 (0.6%) of controls falsely identified.
With CRP: 57/725 (7.9%) of cases identified; 162/26202 (0.6%) of controls falsely identified.

Discriminatory Accuracy
If risk > 10% suggests intervention:
Without CRP: 146/725 (20.1%) of cases identified; 866/26202 (3.3%) of controls falsely identified.
With CRP: 171/725 (23.6%) of cases identified; 944/26202 (3.6%) of controls falsely identified.

Summary: Evaluating Incremental Value
The improvement in classification when additional information is added to the model.
NOT: the magnitude of the new coefficient.
NOT: the statistical significance of the new coefficient.
NOT: the improvement in health status with the new model.

How Much Do SNPs Improve Models to Predict Breast Cancer Risk?
Gail model slides provided by Mitchell H. Gail, Biostatistics Branch, Division of Cancer Epidemiology and Genetics.
Breast Cancer Risk Assessment Tool (BCRAT)
The NCI's BCRAT or Gail Model 2.
Risk factors in BCRAT: age; age at first live birth; age at menarche; number of mother/sisters with breast cancer; number of previous benign breast biopsies and whether atypical hyperplasia is present on any.
Well calibrated.
Discriminatory accuracy modest.

BRCA 1 and 2
BRCA1 and 2 (Breast cancer 1 and 2) are human tumor suppressor genes.
13.2% (132 out of 1,000) of women in the general population will develop breast cancer.
36 to 85% (360 to 850 out of 1,000) of women with an altered BRCA1 or BRCA2 gene develop breast cancer.
Gene mutations are rare: in the population, …%; in cases, 1-2%.
Thus, one needs to look for genes that are predictive of breast cancer but have higher allele frequencies.

SNPs Associated with Breast Cancer
Table: Location, Disease Allele Frequency, Odds Ratio per Allele, Reference. Rows: FGFR, TNRC9 (or TOX3), MAP3K, LSP, CASP, q, q.
Easton et al., Nature 2007;447:
Cox et al., Nature Genetics 2007;39:
Stacey et al., Nature Genetics 2007;39:

ROC-type Plots
Figure: Prob(r > t) in cases versus Prob(r > t) in general population, for BCRAT + 7, BCRAT, 7, and geometric mean.
Conclusions
Very modest public health improvements from BCRATplus7 for:
Discriminatory accuracy (AUC) (4.1%)
Deciding whether to take tamoxifen (0.1% or 0.8%)
Deciding to have mammogram (0.8% or 0.1%)
Allocating scarce mammogram resources (5.5%)
Reclassification versus BCRAT useful for individuals if BCRATplus7 is well calibrated.
BCRATplus7 needs to be validated in independent cohort data on individuals.

Conclusions (continued)
Usefulness of SNPs depends on the application, validity of model, and costs.
To achieve high discriminatory accuracy (AUC = 0.8) would require hundreds of SNPs, optimistically.

Evaluating the added predictive ability of a new marker: Reclassification
Slides based on the paper: Pencina et al., Evaluating the added predictive ability of a new marker: From area under the ROC curve to reclassification and beyond. Statistics in Medicine.

Net reclassification improvement (NRI)
Same setting as for reclassification tables.
Classify predicted risks using two prediction models.
Calculate the net reclassification improvement (NRI).
Ideas: consider subjects who develop and do not develop events separately.
For event subjects: any upward movement in categories improves classification; any downward movement in categories worsens classification.
For non-event subjects: any upward movement in categories worsens classification; any downward movement in categories improves classification.
NRI definition
$\mathrm{NRI} = [P(\text{up} \mid D=1) - P(\text{down} \mid D=1)] - [P(\text{up} \mid D=0) - P(\text{down} \mid D=0)]$, where D is the disease event.
Estimators: each probability is estimated by the corresponding observed proportion.

Example
Upward moves in events: 15 + 14 = 29. Downward moves in events: 4 + 3 = 7.
Contribution to NRI from events = (29 - 7) / 183 = 0.12.
Upward moves in non-events: 173. Downward moves in non-events: 174.
Contribution to NRI from non-events = (174 - 173) / 3081 ≈ 0.0003.
NRI = (29 - 7) / 183 - (173 - 174) / 3081 ≈ 0.12.

Issues related to NRI
NRI depends heavily on reclassification in events.
NRI tends to depend less on reclassification in non-events.
Same importance for up and down classification.
Same importance for events and non-events.
Same importance for 1 or 2 category jumps.
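The NRI arithmetic can be reproduced directly from the reclassification counts. In the sketch below the function name is hypothetical, and the non-event counts (173 upward, 174 downward, out of 3081) are taken as read from the slide text, which is an assumption because that part of the transcription is damaged.

```python
def net_reclassification_improvement(up_events, down_events, n_events,
                                     up_nonevents, down_nonevents, n_nonevents):
    """Return (event contribution, non-event contribution, NRI).

    Events: moving up is an improvement. Non-events: moving down is an improvement.
    """
    event_part = (up_events - down_events) / n_events
    nonevent_part = (down_nonevents - up_nonevents) / n_nonevents
    return event_part, nonevent_part, event_part + nonevent_part


# Counts from the example; the non-event counts are assumed as noted above.
event_part, nonevent_part, nri = net_reclassification_improvement(
    up_events=29, down_events=7, n_events=183,
    up_nonevents=173, down_nonevents=174, n_nonevents=3081,
)
print(f"events: {event_part:.4f}, non-events: {nonevent_part:.4f}, NRI: {nri:.4f}")
```

Almost all of the NRI here comes from the 183 event subjects, which is the first issue listed: a small group can dominate the index.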
Summary of prediction modeling
Predicting the outcomes of individuals is hard.
It is hard for genetic predictors or biomarkers to add materially to predictions based on clinical factors.
It is even harder for prediction models to make a difference in patient outcomes.
Validate, validate, validate!
Development, validation and application of risk prediction models
Development, validation and application of risk prediction models G. Colditz, E. Liu, M. Olsen, & others (Ying Liu, TA) 3/28/2012 Risk Prediction Models 1 Goals Through examples, class discussion, and
More informationSISCR Module 7 Part I: Introduction Basic Concepts for Binary Biomarkers (Classifiers) and Continuous Biomarkers
SISCR Module 7 Part I: Introduction Basic Concepts for Binary Biomarkers (Classifiers) and Continuous Biomarkers Kathleen Kerr, Ph.D. Associate Professor Department of Biostatistics University of Washington
More informationOutline of Part III. SISCR 2016, Module 7, Part III. SISCR Module 7 Part III: Comparing Two Risk Models
SISCR Module 7 Part III: Comparing Two Risk Models Kathleen Kerr, Ph.D. Associate Professor Department of Biostatistics University of Washington Outline of Part III 1. How to compare two risk models 2.
More informationModule Overview. What is a Marker? Part 1 Overview
SISCR Module 7 Part I: Introduction Basic Concepts for Binary Classification Tools and Continuous Biomarkers Kathleen Kerr, Ph.D. Associate Professor Department of Biostatistics University of Washington
More informationDiscrimination and Reclassification in Statistics and Study Design AACC/ASN 30 th Beckman Conference
Discrimination and Reclassification in Statistics and Study Design AACC/ASN 30 th Beckman Conference Michael J. Pencina, PhD Duke Clinical Research Institute Duke University Department of Biostatistics
More informationSISCR Module 4 Part III: Comparing Two Risk Models. Kathleen Kerr, Ph.D. Associate Professor Department of Biostatistics University of Washington
SISCR Module 4 Part III: Comparing Two Risk Models Kathleen Kerr, Ph.D. Associate Professor Department of Biostatistics University of Washington Outline of Part III 1. How to compare two risk models 2.
More informationNet Reclassification Risk: a graph to clarify the potential prognostic utility of new markers
Net Reclassification Risk: a graph to clarify the potential prognostic utility of new markers Ewout Steyerberg Professor of Medical Decision Making Dept of Public Health, Erasmus MC Birmingham July, 2013
More informationGenetic risk prediction for CHD: will we ever get there or are we already there?
Genetic risk prediction for CHD: will we ever get there or are we already there? Themistocles (Tim) Assimes, MD PhD Assistant Professor of Medicine Stanford University School of Medicine WHI Investigators
More informationRisk modeling for Breast-Specific outcomes, CVD risk, and overall mortality in Alliance Clinical Trials of Breast Cancer
Risk modeling for Breast-Specific outcomes, CVD risk, and overall mortality in Alliance Clinical Trials of Breast Cancer Mary Beth Terry, PhD Department of Epidemiology Mailman School of Public Health
More informationComputer Models for Medical Diagnosis and Prognostication
Computer Models for Medical Diagnosis and Prognostication Lucila Ohno-Machado, MD, PhD Division of Biomedical Informatics Clinical pattern recognition and predictive models Evaluation of binary classifiers
More informationAssessment of performance and decision curve analysis
Assessment of performance and decision curve analysis Ewout Steyerberg, Andrew Vickers Dept of Public Health, Erasmus MC, Rotterdam, the Netherlands Dept of Epidemiology and Biostatistics, Memorial Sloan-Kettering
More informationAssessment of Clinical Validity of a Breast Cancer Risk Model Combining Genetic and Clinical Information
DOI: 0.09/jnci/djq88 The Author 00. Published by Oxford University Press. Advance Access publication on October 8, 00. This is an Open Access article distributed under the terms of the Creative Com mons
More informationWhite Paper Estimating Complex Phenotype Prevalence Using Predictive Models
White Paper 23-12 Estimating Complex Phenotype Prevalence Using Predictive Models Authors: Nicholas A. Furlotte Aaron Kleinman Robin Smith David Hinds Created: September 25 th, 2015 September 25th, 2015
More informationSystematic Reviews and meta-analyses of Diagnostic Test Accuracy. Mariska Leeflang
Systematic Reviews and meta-analyses of Diagnostic Test Accuracy Mariska Leeflang m.m.leeflang@amc.uva.nl This presentation 1. Introduction: accuracy? 2. QUADAS-2 exercise 3. Meta-analysis of diagnostic
More informationCritical reading of diagnostic imaging studies. Lecture Goals. Constantine Gatsonis, PhD. Brown University
Critical reading of diagnostic imaging studies Constantine Gatsonis Center for Statistical Sciences Brown University Annual Meeting Lecture Goals 1. Review diagnostic imaging evaluation goals and endpoints.
More informationKnowledge Discovery and Data Mining. Testing. Performance Measures. Notes. Lecture 15 - ROC, AUC & Lift. Tom Kelsey. Notes
Knowledge Discovery and Data Mining Lecture 15 - ROC, AUC & Lift Tom Kelsey School of Computer Science University of St Andrews http://tom.home.cs.st-andrews.ac.uk twk@st-andrews.ac.uk Tom Kelsey ID5059-17-AUC
More information1 Introduction. st0020. The Stata Journal (2002) 2, Number 3, pp
The Stata Journal (22) 2, Number 3, pp. 28 289 Comparative assessment of three common algorithms for estimating the variance of the area under the nonparametric receiver operating characteristic curve
More informationStatistical modelling for thoracic surgery using a nomogram based on logistic regression
Statistics Corner Statistical modelling for thoracic surgery using a nomogram based on logistic regression Run-Zhong Liu 1, Ze-Rui Zhao 2, Calvin S. H. Ng 2 1 Department of Medical Statistics and Epidemiology,
More informationBiomarker adaptive designs in clinical trials
Review Article Biomarker adaptive designs in clinical trials James J. Chen 1, Tzu-Pin Lu 1,2, Dung-Tsa Chen 3, Sue-Jane Wang 4 1 Division of Bioinformatics and Biostatistics, National Center for Toxicological
More informationControlling Bias & Confounding
Controlling Bias & Confounding Chihaya Koriyama August 5 th, 2015 QUESTIONS FOR BIAS Key concepts Bias Should be minimized at the designing stage. Random errors We can do nothing at Is the nature the of
More informationSystematic reviews of prognostic studies 3 meta-analytical approaches in systematic reviews of prognostic studies
Systematic reviews of prognostic studies 3 meta-analytical approaches in systematic reviews of prognostic studies Thomas PA Debray, Karel GM Moons for the Cochrane Prognosis Review Methods Group Conflict
More informationCVD risk assessment using risk scores in primary and secondary prevention
CVD risk assessment using risk scores in primary and secondary prevention Raul D. Santos MD, PhD Heart Institute-InCor University of Sao Paulo Brazil Disclosure Honoraria for consulting and speaker activities
More informationBiases in clinical research. Seungho Ryu, MD, PhD Kanguk Samsung Hospital, Sungkyunkwan University
Biases in clinical research Seungho Ryu, MD, PhD Kanguk Samsung Hospital, Sungkyunkwan University Learning objectives Describe the threats to causal inferences in clinical studies Understand the role of
More informationGlucose tolerance status was defined as a binary trait: 0 for NGT subjects, and 1 for IFG/IGT
ESM Methods: Modeling the OGTT Curve Glucose tolerance status was defined as a binary trait: 0 for NGT subjects, and for IFG/IGT subjects. Peak-wise classifications were based on the number of incline
More informationModel-free machine learning methods for personalized breast cancer risk prediction -SWISS PROMPT
Model-free machine learning methods for personalized breast cancer risk prediction -SWISS PROMPT Chang Ming, 22.11.2017 University of Basel Swiss Public Health Conference 2017 Breast Cancer & personalized
More informationGraphical assessment of internal and external calibration of logistic regression models by using loess smoothers
Tutorial in Biostatistics Received 21 November 2012, Accepted 17 July 2013 Published online 23 August 2013 in Wiley Online Library (wileyonlinelibrary.com) DOI: 10.1002/sim.5941 Graphical assessment of
More informationAdvanced IPD meta-analysis methods for observational studies
Advanced IPD meta-analysis methods for observational studies Simon Thompson University of Cambridge, UK Part 4 IBC Victoria, July 2016 1 Outline of talk Usual measures of association (e.g. hazard ratios)
More informationFrom single studies to an EBM based assessment some central issues
From single studies to an EBM based assessment some central issues Doug Altman Centre for Statistics in Medicine, Oxford, UK Prognosis Prognosis commonly relates to the probability or risk of an individual
More informationSensitivity, specicity, ROC
Sensitivity, specicity, ROC Thomas Alexander Gerds Department of Biostatistics, University of Copenhagen 1 / 53 Epilog: disease prevalence The prevalence is the proportion of cases in the population today.
More informationRISK PREDICTION MODEL: PENALIZED REGRESSIONS
RISK PREDICTION MODEL: PENALIZED REGRESSIONS Inspired from: How to develop a more accurate risk prediction model when there are few events Menelaos Pavlou, Gareth Ambler, Shaun R Seaman, Oliver Guttmann,
More informationTemplate 1 for summarising studies addressing prognostic questions
Template 1 for summarising studies addressing prognostic questions Instructions to fill the table: When no element can be added under one or more heading, include the mention: O Not applicable when an
More informationDepartment of Epidemiology, Rollins School of Public Health, Emory University, Atlanta GA, USA.
A More Intuitive Interpretation of the Area Under the ROC Curve A. Cecile J.W. Janssens, PhD Department of Epidemiology, Rollins School of Public Health, Emory University, Atlanta GA, USA. Corresponding
More informationChapter 13 Estimating the Modified Odds Ratio
Chapter 13 Estimating the Modified Odds Ratio Modified odds ratio vis-à-vis modified mean difference To a large extent, this chapter replicates the content of Chapter 10 (Estimating the modified mean difference),
More informationAbstract: Heart failure research suggests that multiple biomarkers could be combined
Title: Development and evaluation of multi-marker risk scores for clinical prognosis Authors: Benjamin French, Paramita Saha-Chaudhuri, Bonnie Ky, Thomas P Cappola, Patrick J Heagerty Benjamin French Department
More informationVarious performance measures in Binary classification An Overview of ROC study
Various performance measures in Binary classification An Overview of ROC study Suresh Babu. Nellore Department of Statistics, S.V. University, Tirupati, India E-mail: sureshbabu.nellore@gmail.com Abstract
More informationQUANTIFYING THE IMPACT OF DIFFERENT APPROACHES FOR HANDLING CONTINUOUS PREDICTORS ON THE PERFORMANCE OF A PROGNOSTIC MODEL
QUANTIFYING THE IMPACT OF DIFFERENT APPROACHES FOR HANDLING CONTINUOUS PREDICTORS ON THE PERFORMANCE OF A PROGNOSTIC MODEL Gary Collins, Emmanuel Ogundimu, Jonathan Cook, Yannick Le Manach, Doug Altman
More informationLecture Outline. Biost 590: Statistical Consulting. Stages of Scientific Studies. Scientific Method
Biost 590: Statistical Consulting Statistical Classification of Scientific Studies; Approach to Consulting Lecture Outline Statistical Classification of Scientific Studies Statistical Tasks Approach to
More informationStatistics, Probability and Diagnostic Medicine
Statistics, Probability and Diagnostic Medicine Jennifer Le-Rademacher, PhD Sponsored by the Clinical and Translational Science Institute (CTSI) and the Department of Population Health / Division of Biostatistics
More informationSurvival Prediction Models for Estimating the Benefit of Post-Operative Radiation Therapy for Gallbladder Cancer and Lung Cancer
Survival Prediction Models for Estimating the Benefit of Post-Operative Radiation Therapy for Gallbladder Cancer and Lung Cancer Jayashree Kalpathy-Cramer PhD 1, William Hersh, MD 1, Jong Song Kim, PhD
More informationMODEL SELECTION STRATEGIES. Tony Panzarella
MODEL SELECTION STRATEGIES Tony Panzarella Lab Course March 20, 2014 2 Preamble Although focus will be on time-to-event data the same principles apply to other outcome data Lab Course March 20, 2014 3
More informationPart [1.0] Introduction to Development and Evaluation of Dynamic Predictions
Part [1.0] Introduction to Development and Evaluation of Dynamic Predictions A Bansal & PJ Heagerty Department of Biostatistics University of Washington 1 Biomarkers The Instructor(s) Patrick Heagerty
More informationSTATISTICAL METHODS FOR DIAGNOSTIC TESTING: AN ILLUSTRATION USING A NEW METHOD FOR CANCER DETECTION XIN SUN. PhD, Kansas State University, 2012
STATISTICAL METHODS FOR DIAGNOSTIC TESTING: AN ILLUSTRATION USING A NEW METHOD FOR CANCER DETECTION by XIN SUN PhD, Kansas State University, 2012 A THESIS Submitted in partial fulfillment of the requirements
More informationSubclinical atherosclerosis in CVD: Risk stratification & management Raul Santos, MD
Subclinical atherosclerosis in CVD: Risk stratification & management Raul Santos, MD Sao Paulo Medical School Sao Paolo, Brazil Subclinical atherosclerosis in CVD risk: Stratification & management Prof.
More informationQuantifying the added value of new biomarkers: how and how not
Cook Diagnostic and Prognostic Research (2018) 2:14 https://doi.org/10.1186/s41512-018-0037-2 Diagnostic and Prognostic Research COMMENTARY Quantifying the added value of new biomarkers: how and how not
More informationNipple Aspirate Fluid Cytology and the Gail Model for Breast Cancer Risk Assessment in a Screening Population
324 Cancer Epidemiology, Biomarkers & Prevention Nipple Aspirate Fluid Cytology and the Gail Model for Breast Cancer Risk Assessment in a Screening Population Jeffrey A. Tice, 1 Rei Miike, 2 Kelly Adduci,
More informationA SAS Macro to Compute Added Predictive Ability of New Markers in Logistic Regression ABSTRACT INTRODUCTION AUC
A SAS Macro to Compute Added Predictive Ability of New Markers in Logistic Regression Kevin F Kennedy, St. Luke s Hospital-Mid America Heart Institute, Kansas City, MO Michael J Pencina, Dept. of Biostatistics,
More informationSelected Topics in Biostatistics Seminar Series. Prediction Modeling. Sponsored by: Center For Clinical Investigation and Cleveland CTSC
Selected Topics in Biostatistics Seminar Series Prediction Modeling Sponsored by: Center For Clinical Investigation and Cleveland CTSC Denise Babineau, PhD Director, CCI Statistical Sciences Core Co-Director,
More information2011 ASCP Annual Meeting
Diagnostic Accuracy Martin Kroll, MD Professor of Pathology and Laboratory Medicine Boston University School of Medicine Chief, Laboratory Medicine Boston Medical Center Disclosure Roche Abbott Course
More informationAn Improved Algorithm To Predict Recurrence Of Breast Cancer
An Improved Algorithm To Predict Recurrence Of Breast Cancer Umang Agrawal 1, Ass. Prof. Ishan K Rajani 2 1 M.E Computer Engineer, Silver Oak College of Engineering & Technology, Gujarat, India. 2 Assistant
An Improved Patient-Specific Mortality Risk Prediction in ICU in a Random Forest Classification Framework
An Improved Patient-Specific Mortality Risk Prediction in ICU in a Random Forest Classification Framework Soumya GHOSE, Jhimli MITRA 1, Sankalp KHANNA 1 and Jason DOWLING 1 1. The Australian e-health and
The index of prediction accuracy: an intuitive measure useful for evaluating risk prediction models
Kattan and Gerds Diagnostic and Prognostic Research (2018) 2:7 https://doi.org/10.1186/s41512-018-0029-2 Diagnostic and Prognostic Research METHODOLOGY Open Access The index of prediction accuracy: an
7/17/2013. Evaluation of Diagnostic Tests July 22, 2013 Introduction to Clinical Research: A Two week Intensive Course
Evaluation of Diagnostic Tests July 22, 2013 Introduction to Clinical Research: A Two week Intensive Course David W. Dowdy, MD, PhD Department of Epidemiology Johns Hopkins Bloomberg School of Public Health
Lecture Outline Biost 517 Applied Biostatistics I. Statistical Goals of Studies Role of Statistical Inference
Lecture Outline Biost 517 Applied Biostatistics I Scott S. Emerson, M.D., Ph.D. Professor of Biostatistics University of Washington Statistical Inference Role of Statistical Inference Hierarchy of Experimental
Systematic reviews of prediction modeling studies: planning, critical appraisal and data collection
Systematic reviews of prediction modeling studies: planning, critical appraisal and data collection Karel GM Moons, Lotty Hooft, Hans Reitsma, Thomas Debray Dutch Cochrane Center Julius Center for Health
Diagnostic screening. Department of Statistics, University of South Carolina. Stat 506: Introduction to Experimental Design
Diagnostic screening Department of Statistics, University of South Carolina Stat 506: Introduction to Experimental Design 1 / 27 Ties together several things we've discussed already... The consideration
Review. Imagine the following table being obtained as a random. Decision Test Diseased Not Diseased Positive TP FP Negative FN TN
Outline 1. Review sensitivity and specificity 2. Define an ROC curve 3. Define AUC 4. Non-parametric tests for whether or not the test is informative 5. Introduce the binormal ROC model 6. Discuss non-parametric
CARDIOVASCULAR RISK ASSESSMENT ADDITION OF CHRONIC KIDNEY DISEASE AND RACE TO THE FRAMINGHAM EQUATION PAUL E. DRAWZ, MD, MHS
CARDIOVASCULAR RISK ASSESSMENT ADDITION OF CHRONIC KIDNEY DISEASE AND RACE TO THE FRAMINGHAM EQUATION by PAUL E. DRAWZ, MD, MHS Submitted in partial fulfillment of the requirements for the degree of Master
Zheng Yao Sr. Statistical Programmer
ROC CURVE ANALYSIS USING SAS Zheng Yao Sr. Statistical Programmer Outline Background Examples: Accuracy assessment Compare ROC curves Cut-off point selection Summary 2 Outline Background Examples: Accuracy
The Potential of Genes and Other Markers to Inform about Risk
Research Article The Potential of Genes and Other Markers to Inform about Risk Cancer Epidemiology, Biomarkers & Prevention Margaret S. Pepe 1,2, Jessie W. Gu 1,2, and Daryl E. Morris 1,2 Abstract Background:
Meta-analysis of external validation studies
Meta-analysis of external validation studies Thomas Debray, PhD Julius Center for Health Sciences and Primary Care, University Medical Center Utrecht, The Netherlands Cochrane Netherlands, Utrecht, The
Selection and Combination of Markers for Prediction
Selection and Combination of Markers for Prediction NACC Data and Methods Meeting September, 2010 Baojiang Chen, PhD Sarah Monsell, MS Xiao-Hua Andrew Zhou, PhD Overview 1. Research motivation 2. Describe
Introduction to Meta-analysis of Accuracy Data
Introduction to Meta-analysis of Accuracy Data Hans Reitsma MD, PhD Dept. of Clinical Epidemiology, Biostatistics & Bioinformatics Academic Medical Center - Amsterdam Continental European Support Unit
Predictive Models for Healthcare Analytics
Predictive Models for Healthcare Analytics A Case on Retrospective Clinical Study Mengling Mornin Feng mfeng@mit.edu mornin@gmail.com 1 Learning Objectives After the lecture, students should be able to:
Introduction to diagnostic accuracy meta-analysis. Yemisi Takwoingi October 2015
Introduction to diagnostic accuracy meta-analysis Yemisi Takwoingi October 2015 Learning objectives To appreciate the concept underlying DTA meta-analytic approaches To know the Moses-Littenberg SROC method
How to Develop, Validate, and Compare Clinical Prediction Models Involving Radiological Parameters: Study Design and Statistical Methods
Review Article Experimental and Others http://dx.doi.org/10.3348/kjr.2016.17.3.339 pissn 1229-6929 eissn 2005-8330 Korean J Radiol 2016;17(3):339-350 How to Develop, Validate, and Compare Clinical Prediction
INTRODUCTION TO MACHINE LEARNING. Decision tree learning
INTRODUCTION TO MACHINE LEARNING Decision tree learning Task of classification Automatically assign class to observations with features Observation: vector of features, with a class Automatically assign
Feature selection methods for early predictive biomarker discovery using untargeted metabolomic data
Feature selection methods for early predictive biomarker discovery using untargeted metabolomic data Dhouha Grissa, Mélanie Pétéra, Marion Brandolini, Amedeo Napoli, Blandine Comte and Estelle Pujos-Guillot
AN INDEPENDENT VALIDATION OF QRISK ON THE THIN DATABASE
AN INDEPENDENT VALIDATION OF QRISK ON THE THIN DATABASE Dr Gary S. Collins Professor Douglas G. Altman Centre for Statistics in Medicine University of Oxford TABLE OF CONTENTS LIST OF TABLES... 3 LIST
Sanjay P. Zodpey Clinical Epidemiology Unit, Department of Preventive and Social Medicine, Government Medical College, Nagpur, Maharashtra, India.
Research Methodology Sample size and power analysis in medical research Sanjay P. Zodpey Clinical Epidemiology Unit, Department of Preventive and Social Medicine, Government Medical College, Nagpur, Maharashtra,
Individual Participant Data (IPD) Meta-analysis of prediction modelling studies
Individual Participant Data (IPD) Meta-analysis of prediction modelling studies Thomas Debray, PhD Julius Center for Health Sciences and Primary Care Utrecht, The Netherlands March 7, 2016 Prediction
Evidence Based Medicine
Course Goals Goals 1. Understand basic concepts of evidence based medicine (EBM) and how EBM facilitates optimal patient care. 2. Develop a basic understanding of how clinical research studies are designed
High-sensitivity Troponin T Predicts Recurrent Cardiovascular Events in Patients with Stable Coronary Heart Disease: KAROLA Study 8 Year FU
ESC Congress 2011 Paris, France, August 27-31 KAROLA Session: Prevention: Are biomarkers worth their money? Abstract # 84698 High-sensitivity Troponin T Predicts Recurrent Cardiovascular Events in Patients
Example - Birdkeeping and Lung Cancer - Interpretation. Lecture 20 - Sensitivity, Specificity, and Decisions. What do the numbers not mean...
Odds Ratios Example - Birdkeeping and Lung Cancer - Interpretation Lecture 20 - Sensitivity, Specificity, and Decisions Sta102 / BME102 Colin Rundel April 16, 2014 Estimate Std. Error z value Pr(>|z|)
BREAST CANCER EPIDEMIOLOGY MODEL:
BREAST CANCER EPIDEMIOLOGY MODEL: Calibrating Simulations via Optimization Michael C. Ferris, Geng Deng, Dennis G. Fryback, Vipat Kuruchittham University of Wisconsin 1 University of Wisconsin Breast Cancer
Considering depression as a risk marker for incident coronary disease
Considering depression as a risk marker for incident coronary disease Dr Adrienne O'Neil Senior Research Fellow Melbourne School of Population & Global Health The University of Melbourne & Visiting Fellow
Chapter 17 Sensitivity Analysis and Model Validation
Chapter 17 Sensitivity Analysis and Model Validation Justin D. Salciccioli, Yves Crutain, Matthieu Komorowski and Dominic C. Marshall Learning Objectives Appreciate that all models possess inherent limitations
Breast density: imaging, risks and recommendations
Breast density: imaging, risks and recommendations Maureen Baxter, MD Radiologist Director of Ruth J. Spear Breast Center Providence St. Vincent Medical Center Alison Conlin, MD/MPH Medical Oncologist
Addressing error in laboratory biomarker studies
Addressing error in laboratory biomarker studies Elizabeth Selvin, PhD, MPH Associate Professor of Epidemiology and Medicine Co-Director, Biomarkers and Diagnostic Testing Translational Research Community
An Introduction to Epidemiology
An Introduction to Epidemiology Wei Liu, MPH Biostatistics Core Pennington Biomedical Research Center Baton Rouge, LA Last edited: January, 14 th, 2014 TABLE OF CONTENTS Introduction
Testing Statistical Models to Improve Screening of Lung Cancer
Testing Statistical Models to Improve Screening of Lung Cancer 1 Elliot Burghardt: University of Iowa Daren Kuwaye: University of Hawai'i at Mānoa Iowa Summer Institute in Biostatistics - University of
Prediction Model For Risk Of Breast Cancer Considering Interaction Between The Risk Factors
INTERNATIONAL JOURNAL OF SCIENTIFIC & TECHNOLOGY RESEARCH VOLUME, ISSUE 0, SEPTEMBER 01 ISSN 81 Prediction Model For Risk Of Breast Cancer Considering Interaction Between The Risk Factors Nabila Al Balushi
Mammographic density and risk of breast cancer by tumor characteristics: a case-control
Krishnan et al. BMC Cancer (2017) 17:859 DOI 10.1186/s12885-017-3871-7 RESEARCH ARTICLE Mammographic density and risk of breast cancer by tumor characteristics: a case-control study Open Access Kavitha
Copyright 2007 IEEE. Reprinted from 4th IEEE International Symposium on Biomedical Imaging: From Nano to Macro, April 2007.
Copyright 2007 IEEE. Reprinted from 4th IEEE International Symposium on Biomedical Imaging: From Nano to Macro, April 2007. This material is posted here with permission of the IEEE. Such permission of the
BMI 541/699 Lecture 16
BMI 541/699 Lecture 16 Where we are: 1. Introduction and Experimental Design 2. Exploratory Data Analysis 3. Probability 4. T-based methods for continuous variables 5. Proportions & contingency tables -
A simple screening score for diabetes for the Korean population
2012 International Conference on Diabetes and Metabolism S7. Epidemiologic issues on diabetes Friday 9 November, 2012 A simple screening score for diabetes for the Korean population Dae Jung Kim, MD Associate
Roadmap for Developing and Validating Therapeutically Relevant Genomic Classifiers. Richard Simon, J Clin Oncol 23:
Roadmap for Developing and Validating Therapeutically Relevant Genomic Classifiers. Richard Simon, J Clin Oncol 23:7332-7341 Presented by Deming Mi 7/25/2006 Major reasons for few prognostic factors to
Behavioral Data Mining. Lecture 4 Measurement
Behavioral Data Mining Lecture 4 Measurement Outline Hypothesis testing Parametric statistical tests Non-parametric tests Precision-Recall plots ROC plots Hardware update Icluster machines are ready for
BIOSTATISTICAL METHODS
BIOSTATISTICAL METHODS FOR TRANSLATIONAL & CLINICAL RESEARCH PROPENSITY SCORE Confounding Definition: A situation in which the effect or association between an exposure (a predictor or risk factor) and
Developing a Prediction Rule
Developing a Prediction Rule Henry Glick Epi 55 January 26, 2 Pre-test Probability of Disease An important anchor for developing management strategies for patients Can be adjusted to account for additional
Predicting Kidney Cancer Survival from Genomic Data
Predicting Kidney Cancer Survival from Genomic Data Christopher Sauer, Rishi Bedi, Duc Nguyen, Benedikt Bünz Abstract Cancers are on par with heart disease as the leading cause for mortality in the United
CP Statistics Sem 1 Final Exam Review
Name: _ Period: ID: A CP Statistics Sem 1 Final Exam Review Multiple Choice Identify the choice that best completes the statement or answers the question. 1. A particularly common question in the study
Predicting Breast Cancer Survivability Rates
Predicting Breast Cancer Survivability Rates For data collected from Saudi Arabia Registries Ghofran Othoum 1 and Wadee Al-Halabi 2 1 Computer Science, Effat University, Jeddah, Saudi Arabia 2 Computer
Predicting Breast Cancer Survival Using Treatment and Patient Factors
Predicting Breast Cancer Survival Using Treatment and Patient Factors William Chen wchen808@stanford.edu Henry Wang hwang9@stanford.edu 1. Introduction Breast cancer is the leading type of cancer in women
Designing Studies of Diagnostic Imaging
Designing Studies of Diagnostic Imaging Chaya S. Moskowitz, PhD With thanks to Nancy Obuchowski Outline What is study design? Building blocks of imaging studies Strategies to improve study efficiency What
Gene-Environment Interactions
Gene-Environment Interactions What is gene-environment interaction? "A different effect of an environmental exposure on disease risk in persons with different genotypes," or, alternatively, "a different
EPI 200C Final, June 4 th, 2009 This exam includes 24 questions.
Greenland/Arah, Epi 200C Sp 2000 1 of 6 EPI 200C Final, June 4 th, 2009 This exam includes 24 questions. INSTRUCTIONS: Write all answers on the answer sheets supplied; PRINT YOUR NAME and STUDENT ID NUMBER
Applying Machine Learning Methods in Medical Research Studies
Applying Machine Learning Methods in Medical Research Studies Daniel Stahl Department of Biostatistics and Health Informatics Psychiatry, Psychology & Neuroscience (IoPPN), King's College London daniel.r.stahl@kcl.ac.uk
Biases in clinical research. Seungho Ryu, MD, PhD Kanguk Samsung Hospital, Sungkyunkwan University
Biases in clinical research Seungho Ryu, MD, PhD Kanguk Samsung Hospital, Sungkyunkwan University Learning objectives Understand goal of measurement and definition of accuracy Describe the threats to causal
Evidence Based Medicine Prof P Rheeder Clinical Epidemiology. Module 2: Applying EBM to Diagnosis
Evidence Based Medicine Prof P Rheeder Clinical Epidemiology Module 2: Applying EBM to Diagnosis Content 1. Phases of diagnostic research 2. Developing a new test for lung cancer 3. Thresholds 4. Critical
More information