Statistical methods for the meta-analysis of full ROC curves

Similar documents
Statistical methods for the meta-analysis of full ROC curves

Meta-analysis of diagnostic test accuracy studies with multiple & missing thresholds

Introduction to diagnostic accuracy meta-analysis. Yemisi Takwoingi October 2015

Empirical assessment of univariate and bivariate meta-analyses for comparing the accuracy of diagnostic tests

Supplementary Online Content

Systematic Reviews and meta-analyses of Diagnostic Test Accuracy. Mariska Leeflang

Bayesian meta-analysis of Papanicolaou smear accuracy

Review. Imagine the following table being obtained as a random. Decision Test Diseased Not Diseased Positive TP FP Negative FN TN

Derivative-Free Optimization for Hyper-Parameter Tuning in Machine Learning Problems

Treatment effect estimates adjusted for small-study effects via a limit meta-analysis

Cochrane Handbook for Systematic Reviews of Diagnostic Test Accuracy

MIDAS RETOUCH REGARDING DIAGNOSTIC ACCURACY META-ANALYSIS

SYSTEMATIC REVIEWS OF TEST ACCURACY STUDIES

Introduction to Meta-analysis of Accuracy Data

Multivariate Mixed-Effects Meta-Analysis of Paired-Comparison Studies of Diagnostic Test Accuracy

Reporting and methods in systematic reviews of comparative accuracy

Meta-analyses evaluating diagnostic test accuracy

Multivariate Mixed-Effects Meta-Analysis of Paired-Comparison Studies of Diagnostic Test Accuracy

Bayesian Meta-Analysis of the Accuracy of a Test for Tuberculous Pleuritis in the Absence of a Gold Standard Reference

Summarising and validating test accuracy results across multiple studies for use in clinical practice

CopulaDTA: An R package for copula-based bivariate beta-binomial models for diagnostic test accuracy studies in Bayesian framework

Estimation of Area under the ROC Curve Using Exponential and Weibull Distributions

Meta-analysis of Diagnostic Test Accuracy Studies

Reducing Decision Errors in the Paired Comparison of the Diagnostic Accuracy of Continuous Screening Tests

Methods Research Report. An Empirical Assessment of Bivariate Methods for Meta-Analysis of Test Accuracy

Meta-analysis using RevMan. Yemisi Takwoingi October 2015

Performance Assessment for Radiologists Interpreting Screening Mammography

Biometrical Journal 52 (2010) 61, zzz zzz / DOI: /bimj

An empirical comparison of methods to meta-analyze individual patient data of diagnostic accuracy

Package CopulaREMADA

Meta-analysis of diagnostic accuracy studies. Mariska Leeflang (with thanks to Yemisi Takwoingi, Jon Deeks and Hans Reitsma)

An Empirical Assessment of Bivariate Methods for Meta-analysis of Test Accuracy

Package CopulaREMADA

EVIDENCE-BASED GUIDELINE DEVELOPMENT FOR DIAGNOSTIC QUESTIONS

Diagnostic accuracy in the presence of an imperfect reference standard: challenges in evaluating latent class models specifications (a Campylobacter

State of the Art in Clinical & Anatomic Pathology. Meta-analysis of Clinical Studies of Diagnostic Tests

METHODS FOR DETECTING CERVICAL CANCER

Landmarking, immortal time bias and. Dynamic prediction

Bayesian Joint Modelling of Benefit and Risk in Drug Development

Meta-analysis of diagnostic research. Karen R Steingart, MD, MPH Chennai, 15 December Overview

A Bayesian approach to sample size determination for studies designed to evaluate continuous medical tests

Module Overview. What is a Marker? Part 1 Overview

Bivariate analysis of sensitivity and specificity produces informative summary measures in diagnostic reviews

Critical reading of diagnostic imaging studies. Lecture Goals. Constantine Gatsonis, PhD. Brown University

Part [1.0] Introduction to Development and Evaluation of Dynamic Predictions

Chapter 10. Screening for Disease

SISCR Module 7 Part I: Introduction Basic Concepts for Binary Biomarkers (Classifiers) and Continuous Biomarkers

Multivariate meta-analysis using individual participant data

Lecture Outline. Biost 590: Statistical Consulting. Stages of Scientific Studies. Scientific Method

Where a licence is displayed above, please note the terms and conditions of the licence govern your use of this document.

Research and Evaluation Methodology Program, School of Human Development and Organizational Studies in Education, University of Florida

Small-area estimation of mental illness prevalence for schools

1 Introduction. st0020. The Stata Journal (2002) 2, Number 3, pp

Heritability. The Extended Liability-Threshold Model. Polygenic model for continuous trait. Polygenic model for continuous trait.

Zheng Yao Sr. Statistical Programmer

Individualized Treatment Effects Using a Non-parametric Bayesian Approach

Performance of rapid influenza H1N1 diagnostic tests: a meta-analysis

Explaining heterogeneity

Sensitivity, Specificity, and Relatives

Risk-prediction modelling in cancer with multiple genomic data sets: a Bayesian variable selection approach

Assessing variability in results in systematic reviews of diagnostic studies

Statistics, Probability and Diagnostic Medicine

Knowledge Discovery and Data Mining. Testing. Performance Measures. Notes. Lecture 15 - ROC, AUC & Lift. Tom Kelsey. Notes

Department of Epidemiology, Rollins School of Public Health, Emory University, Atlanta GA, USA.

VU Biostatistics and Experimental Design PLA.216

Introduction to ROC analysis

Design for Targeted Therapies: Statistical Considerations

The cross sectional study design. Population and pre-test. Probability (participants). Index test. Target condition. Reference Standard

Various performance measures in Binary classification An Overview of ROC study

Case Studies in Bayesian Augmented Control Design. Nathan Enas Ji Lin Eli Lilly and Company

Choice of axis, tests for funnel plot asymmetry, and methods to adjust for publication bias

Bayesian methods in health economics

Analysis of left-censored multiplex immunoassay data: A unified approach

Application of Bayesian Probabilistic Approach on Ground Motion Attenuation Relations. Rong-Rong Xu. Master of Science in Civil Engineering

Meta-Analysis Methods used in Radiology Journals

EVALUATION AND COMPUTATION OF DIAGNOSTIC TESTS: A SIMPLE ALTERNATIVE

Computer Models for Medical Diagnosis and Prognostication

ROC Curves (Old Version)

The Statistical Analysis of Failure Time Data

Assessing the diagnostic accuracy of a sequence of tests

Systematic reviews of diagnostic test accuracy studies

From Biostatistics Using JMP: A Practical Guide. Full book available for purchase here. Chapter 1: Introduction... 1

A Bayesian Perspective on Unmeasured Confounding in Large Administrative Databases

Comparing treatments evaluated in studies forming disconnected networks of evidence: A review of methods

Hayden Smith, PhD, MPH /\ v._

Estimating heritability for cause specific mortality based on twin studies

Modelling Spatially Correlated Survival Data for Individuals with Multiple Cancers

Assessment of performance and decision curve analysis

MEDICAL SCORING FOR BREAST CANCER RECURRENCE. Nurul Husna bt Jamian UiTM (Perak), Tapah Campus

Biostatistics II

Meta-analysis of two studies in the presence of heterogeneity with applications in rare diseases

Bayesian growth mixture models to distinguish hemoglobin value trajectories in blood donors

Bayesian Meta-analysis of Diagnostic Tests

Economic Evaluations and Diagnostic Testing: Sanghera, Sabina; Orlando, Rosa; Roberts, Tracy

Statistical Tolerance Regions: Theory, Applications and Computation

MLE #8. Econ 674. Purdue University. Justin L. Tobias (Purdue) MLE #8 1 / 20

Using dynamic prediction to inform the optimal intervention time for an abdominal aortic aneurysm screening programme

Meta-Analysis of Diagnostic Accuracy with mada

Modelling prognostic capabilities of tumor size: application to colorectal cancer

Transcription:

Statistical methods for the meta-analysis of full ROC curves Oliver Kuss (joint work with Stefan Hirt and Annika Hoyer) German Diabetes Center, Leibniz Institute for Diabetes Research at Heinrich Heine University Düsseldorf, Institute for Biometry and Epidemiology May 4, 2016 1 / 35 Kuss/ Hoyer May 4, 2016

Outline Introduction Bivariate time-to-event model for interval-censored data Discussion 2 / 35 Kuss/ Hoyer May 4, 2016

Introduction

Diagnostic study Gold Standard D Diagnostic Test T + - + TP FP - FN TN TP: True Positive, FP: False Positive, FN: False Negative, TN: True Negative, N = TP + FP + FN + TN Sensitivity: Se = P(T + D + ) = Specificity: Sp = P(T D ) = TP TP+FN TN FP+TN 3 / 35 Kuss/ Hoyer May 4, 2016

Diagnostic study Gold Standard D Diagnostic Test T + - + 392 1130-267 5015 Sensitivity: Se = P(T + D + ) = 392 392+267 = 59% Specificity: Sp = P(T D ) = 5015 1130+5015 = 82% 4 / 35 Kuss/ Hoyer May 4, 2016

Meta-analysis Meta-analysis is a quantitative, systematic summary of a collection of separate studies for the purpose of obtaining information that cannot be derived from any of the studies alone (Boissel et al., 1988) 5 / 35 Kuss/ Hoyer May 4, 2016

Meta-analysis of diagnostic studies Meta-analysis for intervention studies is well established today In contrast, meta-analysis of diagnostic studies has been a vivid research area in recent years Reason: Increased complexity of diagnostic trials with their bivariate outcome of sensitivity and specificity 6 / 35 Kuss/ Hoyer May 4, 2016

Meta-analysis of diagnostic studies Study Sensitivity Specificity 1 392/659 5015/6145 2 207/252 2704/5865......... Se MA =... Sp MA =... generally negatively correlated 7 / 35 Kuss/ Hoyer May 4, 2016

Meta-analysis of diagnostic studies Additional challenge: Single studies report a full ROC curve with several pairs of sensitivity and specificity, each one for a different threshold Still greater challenge: Values and numbers of thresholds can vary between single studies 8 / 35 Kuss/ Hoyer May 4, 2016

Meta-analysis of diagnostic studies - An example Study Threshold Sensitivity Specificity 1 5.0 596/659 2062/6145 1 5.5 392/659 5015/6145 1 6.0 137/659 5999/6145 2 5.3 207/252 2704/5865 2 5.4 196/252 3308/5865............ 9 / 35 Kuss/ Hoyer May 4, 2016

Meta-analysis of diagnostic studies - An example 10 / 35 Kuss/ Hoyer May 4, 2016

Meta-analysis of diagnostic studies - An example Population-based screening for type 2 diabetes mellitus Two systematic reviews (Kodama, 2013 and Bennett, 2007) report on 38 single studies (one pair of sensitivity and specificity from each study) to assess HbA1c as diagnostic marker However: Intensified search yields 124 pairs of sensitivity and specificity for 26 different HbA1c thresholds 11 / 35 Kuss/ Hoyer May 4, 2016

Meta-analysis of diagnostic studies - An example 1.0 0.8 Sensitivity 0.6 0.4 0.2 0.0 0.0 0.2 0.4 0.6 0.8 1.0 1-Specificity 12 / 35 Kuss/ Hoyer May 4, 2016

Meta-analysis of diagnostic studies - An example An analysis using only the original 38 pairs discards more than 70% of the available observations Even worse: Using only the original 38 pairs would also ignore the different HbA1c values that were used as thresholds 13 / 35 Kuss/ Hoyer May 4, 2016

Meta-analysis of diagnostic studies - The paradox of the standard SROC A summary ROC curve from the original 38 pairs explicitly ignores the ROC information from the single studies and...... is in principle unidentifiable [Rücker/Schumacher, 2010]... cannot be interpreted as a kind of average curve or a curve typical for the study-specific ROC curves. It can have a shape that is very different from the study-specific shapes. [Arends et al., 2008] 14 / 35 Kuss/ Hoyer May 4, 2016

Meta-analysis of diagnostic studies - The paradox of the standard SROC [Chu/Guo, 2009] 15 / 35 Kuss/ Hoyer May 4, 2016

Meta-analysis of diagnostic studies - Recommendations... although several of these methods allow for different test thresholds to be used across the primary studies, none have been used to incorporate threshold values explicitly, a notable limitation. (Sutton, 2008) If more than one threshold is reported per study, this has to be taken into account in the quantitative analyses. (Trikalinos, 2012) 16 / 35 Kuss/ Hoyer May 4, 2016

Meta-analysis of diagnostic studies - Current approaches using the full information from each study Good news: Methods for the meta-analysis of full ROC curves have been proposed [Kester, 2000; Dukic, 2003; Poon, 2004; Bipat, 2007; Hamza, 2009; Putter, 2010; Martinez-Camblor, 2014; Riley, 2014]. 17 / 35 Kuss/ Hoyer May 4, 2016

Meta-analysis of diagnostic studies - Current approaches using the full information from each study Not so good news: Each of these methods has at least one of the following disadvantages: The method constitutes a two-step approach and estimation uncertainty from the first step is ignored in the second step The number of thresholds has to be identical across all studies The concrete values of the thresholds are ignored The method assumes a fixed-effect model The method is not applicable with extreme values 18 / 35 Kuss/ Hoyer May 4, 2016

Meta-analysis of diagnostic studies - A boring solution Extend current standard model (Bivariate logistic regression model with random effects) to a meta-regression by adding a covariate for the threshold (under review at Statistical Methods in Medical Research ) 19 / 35 Kuss/ Hoyer May 4, 2016

Bivariate time-to-event model for interval-censored data

ROC from Sato et al. 1.0 0.8 5.0 Sensitivity 0.6 0.4 5.5 0.2 6.0 0.0 0.0 0.2 0.4 0.6 0.8 1.0 1-Specificity 20 / 35 Kuss/ Hoyer May 4, 2016

ROC from Sato et al. 1.0 0.8 596 63 659 0 Sensitivity 0.6 0.4 392 267 137 522 0.2 0 0.0 0.0 6590.2 0.4 0.6 0.8 1.0 1-Specificity 21 / 35 Kuss/ Hoyer May 4, 2016

Life-table estimator from Sato et al. 1.0 63/659 0.8 5.0 267/659 Sensitivity 0.6 5.5 0.4 0.2 6.0 0 1 2 3 4 5 6 HbA1c 22 / 35 Kuss/ Hoyer May 4, 2016

Fundamental insight An ROC curve can be considered a bivariate time-to-event model for interval-censored data! 23 / 35 Kuss/ Hoyer May 4, 2016

Bivariate time-to-event model for interval-censored data - Notation Given K observations from a continuous variable Y (in our case: HbA1c) as lying in intervals (yk L, y k R] with k = yk R y k L Y has density f (y, µ, θ) with corresponding cdf F(y, µ, θ); µ is a location and φ a scale parameter In general, three types of censoring can occur: Type of Censoring Interval Contribution to LogL Left yk L = 0, y k R F(y k R ; µ, φ) Interval yk L 0, y k R [f (y k; µ, φ) k ] Right yk L 0, y k R = [1 F(y k L ; µ, φ)] 24 / 35 Kuss/ Hoyer May 4, 2016

Bivariate time-to-event model for interval-censored data - Densities We consider three densities for Y : Weibull f (y; µ, φ) = φy φ 1 exp( (y/µ) φ ) µ φ Log-normal f (y; µ, φ) = exp( [log(y) µ]2 /(2φ)) y 2πφ Log-logistic f (y; µ, φ) = (See Lindsey, 1998 for more densities) πexp( π[log(y) µ]/( 3φ)) y 3φ(1+exp( π[log(y) µ]/( 3φ))) 25 / 35 Kuss/ Hoyer May 4, 2016

Bivariate time-to-event model for interval-censored data - Final model After a log transformation, the diagnostic test values Y 0 and Y 1 for non-diseased and diseased probands are allowed to vary between studies and are linked by a bivariate normal distribution with correlation ρ. log(y 0 ) = b 0 + ɛ 0 + u 0i, (1) log(y 1 ) = b 1 + ɛ 1 + u 1i, (2) and ( u0i u 1i ) N [( ) 0, 0 ( σ 2 0 ρσ 0 σ 1 ρσ 0 σ 1 σ 2 1 )]. (3) 26 / 35 Kuss/ Hoyer May 4, 2016

Parameter estimation The following parameters have to be estimated: b 0, b 1, ɛ 0, ɛ 1, σ0 2, σ2 1, and ρ. Estimation method: Gaussian quadrature (SAS PROC NLMIXED) We are not interested in the model parameters, but in the summary ROC curve That is, predict sensitivity and specificity at given thresholds by the BLUP principle (PREDICT statement, NLMIXED) 27 / 35 Kuss/ Hoyer May 4, 2016

Results for the HbA1c example 1.0 0.8 Sensitivity 0.6 0.4 0.2 0.0 0.0 0.2 0.4 0.6 0.8 1.0 1-Specificity Weibull Log-normal Log-logistic 28 / 35 Kuss/ Hoyer May 4, 2016

Results for the HbA1c example - Estimated densitites 29 / 35 Kuss/ Hoyer May 4, 2016

Discussion

Summary We proposed an approach for the meta-analysis of full ROC curves that use the information for all thresholds. It avoids the problems of previous methods and comes with the following advantages: Constitutes a one-step approach Allows varying numbers and varying values of thresholds Explicitly models the value of the diagnostic test Allows heterogeneity of sensitivity and specificity across studies Accounts for correlation within studies 30 / 35 Kuss/ Hoyer May 4, 2016

Summary Ensures proper weighting of studies sample size Works well in simulations Can be generalized to compare two (or even more) diagnostic tests to a common gold standard 31 / 35 Kuss/ Hoyer May 4, 2016

Limitations Distributional assumptions for the values of the diagnostic test (Idea: Relax this by using a piecewise constant function) Robustness 32 / 35 Kuss/ Hoyer May 4, 2016

Thank you! View from the IBE on Rheinisches Braunkohlerevier 33 / 35 Kuss/ Hoyer May 4, 2016

Bibliography Bennett CM, Guo M, Dharmage SC. HbA1c as a screening tool for detection of Type 2 diabetes: a systematic review. Diabetic Medicine. 2007;24:333-343. Kodama S, Horikawa C, Fujihara K, Hirasawa R, Yachi Y, Yoshizawa S, Tanaka S, Sone Y, Shimano H, Iida KT, Saito K, Sone H. Use of high-normal levels of haemoglobin A 1 C and fasting plasma glucose for diabetes screening and for prediction: a meta-analysis. Diabetes/ Metabolism Research and Reviews. 2013;29(8):680-692. Rücker G, Schumacher M. Summary ROC curve based on a weighted Youden index for selecting an optimal cutpoint in meta-analysis of diagnostic accuracy. Statistics in Medicine. 2010; 29(30):3069-3078. Arends LR, Hamza TH, van Houwelingen JC, Heijenbrok-Kal MH, Hunink MG, Stijnen T. Bivariate random effects meta-analysis of ROC curves. Medical Decision Making. 2008;28(5):621-38. Chu H, Guo H. Letter to the editor. Biostatistics. 2009;10(1):201-203. Kester AD, Buntinx F. Meta-analysis of ROC curves. Medical Decision Making. 2000;20(4):430-9. Dukic V, Gatsonis C. Meta-analysis of diagnostic test accuracy assessment studies with varying number of thresholds. Biometrics. 2003;59:936-46. Poon WY. A latent normal distribution model for analysing ordinal responses with applications in meta-analysis. Statistics in Medicine. 2004;23(14):2155-72. 34 / 35 Kuss/ Hoyer May 4, 2016

Bibliography Bipat S, Zwinderman AH, Bossuyt PM, Stoker J. Multivariate random-effects approach: for meta-analysis of cancer staging studies. Academic Radiology. 2007;14(8):974-84. Hamza TH, Arends LR, van Houwelingen HC, Stijnen T. Multivariate random effects meta-analysis of diagnostic tests with multiple thresholds. BMC Medical Research Methodology. 2009;9:73. Putter H, Fiocco M, Stijnen T. Meta-analysis of diagnostic test accuracy studies with multiple thresholds using survival methods. Biometrical Journal. 2010;52(1):95-110. Martinez-Camblor P. Fully non-parametric receiver operating characteristic curve estimation for random-effects meta-analysis. Statistical Methods in Medical Research. 2014 May 28. Riley RD, Takwoingi Y, Trikalinos T, Guha A, Biswas A, Ensor J, Morris RK, Deeks JJ. Meta-Analysis of Test Accuracy Studies with Multiple and Missing Thresholds: A Multivariate-Normal Model. Journal of Biometrics & Biostatistics. 2014;5:196. Zhang J, Yang Q, Zhang JA, Chen Y, Qin Q, Yang J, Yan N. The value of HbA1c for diagnosing type 2 diabetes in high-risk population. Journal of Xian Jiaotong University (Medical Sciences). 2010;31(4):512514. Lindsey JK. A study of interval censoring in parametric regression models. Lifetime Data Analysis. 1998;4(4):329-54. Gong Q, Fang L. Comparison of different parametric proportional hazards models for interval-censored data: a simulation study. Contemporary Clinical Trials. 2013;36(1):276-83. 35 / 35 Kuss/ Hoyer May 4, 2016