Meta-analysis of diagnostic test accuracy studies with multiple & missing thresholds
Richard D. Riley
School of Health and Population Sciences, & School of Mathematics, University of Birmingham
Collaborators: Apratim Guha, Atanu Biswas, Yemisi Takwoingi, Joie Ensor, R. Katie Morris, Jonathan J. Deeks
Rationale
SENSITIVITY vs. SPECIFICITY
[Figure: probability density functions (pdf) of the diagnostic variable, D, for Group 0 (Healthy) and Group 1 (Diseased); a threshold, T, splits D into Test - and Test +, marking the TN and TP regions]

          Group 1 (Diseased)   Group 0 (Healthy)
Test +    TP                   FP
Test -    FN                   TN

Sensitivity = number of true positives / total with disease
Specificity = number of true negatives / total without disease
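As a minimal sketch of the two definitions, the snippet below computes sensitivity and specificity from a single 2x2 table. The counts are those reported for the Al Ragib study at threshold 0.13 in the example later in these slides; the function names are illustrative only.

```python
def sensitivity(tp, fn):
    """True positives divided by the total with disease."""
    return tp / (tp + fn)


def specificity(tn, fp):
    """True negatives divided by the total without disease."""
    return tn / (tn + fp)


# 2x2 table for the Al Ragib study at threshold 0.13
tp, fp, fn, tn = 35, 51, 4, 95

sens = sensitivity(tp, fn)
spec = specificity(tn, fp)
print(f"sensitivity = {sens:.3f}, specificity = {spec:.3f}")
```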
Each study may report more than one 2x2 table: one for each different threshold value considered
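TRACING OUT THE RECEIVER OPERATING CHARACTERISTIC (ROC) CURVE
[Figure: ROC curve of sensitivity (0 to 1) against 1 - specificity (0 to 1), with the "Lower threshold" end at high sensitivity and the "Higher threshold" end at low sensitivity]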
Clinical question
- Pre-eclampsia is a major cause of maternal and perinatal morbidity and mortality, and occurs in 2-8% of all pregnancies
- The diagnosis of pre-eclampsia is determined by the presence of elevated blood pressure combined with significant proteinuria
- The gold standard for detection of significant proteinuria is the 24-hour urine collection, but this is time consuming and inconvenient
- Alternative, quicker, yet accurate diagnostic tests are required
Data
- Systematic review identified 13 studies evaluating spot protein:creatinine ratio (PCR) for detecting significant proteinuria in pregnancy
- 23 different thresholds reported, with a minimum of one and a maximum of seven studies per threshold
- Threshold values ranged from 0.13 to 0.50
- Five studies provided a 2x2 table for just one threshold
- The other eight studies reported results for each of multiple thresholds, up to a maximum of nine thresholds in any study
How to do meta-analysis?
(1) Current standard approaches: Choose one threshold per study
- Standard approach for meta-analysis
- Still allows summary ROC curves to be produced
- But the summary meta-analysis result is not for a particular threshold, and threshold-specific results cannot be identified from the SROC curve
(1) Current standard approaches: Separate meta-analysis at each threshold
- Use ALL thresholds in each study
- Take each threshold separately
- Apply standard meta-analysis to each threshold, e.g. bivariate random effects meta-analysis model
  - models binomial nature of the data
  - obtains average sensitivity & specificity for each threshold
  - allows for between-study heterogeneity
  - accounts for any between-study correlation in sensitivity and specificity
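For reference, one common way to write the bivariate random effects model at a single threshold is sketched below, in the binomial form described by Chu and Cole (2006) in the reference list. The notation is an assumption introduced here and does not appear on the slides: study i has n_{1i} diseased and n_{0i} healthy patients, mu_1 and mu_0 are the average logit-sensitivity and logit-specificity, tau_1^2 and tau_0^2 are the between-study variances, and rho is the between-study correlation.

```latex
\begin{align*}
TP_i &\sim \text{Binomial}(n_{1i}, \text{sens}_i), \qquad
TN_i \sim \text{Binomial}(n_{0i}, \text{spec}_i) \\
\begin{pmatrix} \text{logit}(\text{sens}_i) \\ \text{logit}(\text{spec}_i) \end{pmatrix}
&\sim N\!\left(
\begin{pmatrix} \mu_1 \\ \mu_0 \end{pmatrix},
\begin{pmatrix} \tau_1^2 & \rho\,\tau_1\tau_0 \\ \rho\,\tau_1\tau_0 & \tau_0^2 \end{pmatrix}
\right)
\end{align*}
```

The summary sensitivity and specificity for that threshold are then obtained by back-transforming the estimates of mu_1 and mu_0 from the logit scale.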
Separate meta-analysis at each threshold
[Figure: summary sensitivity (0.5 to 1) against specificity (1 to 0.5), one point per threshold value (0.13 to 0.50)]
However... throwing away information?
- A separate meta-analysis for each threshold omits any studies that do not report the threshold of interest
- But such omitted studies may contain related information from other thresholds that are available
- In particular... recall the ROC curve
  - Test accuracy results from neighbouring thresholds are similar
  - Results for a missing threshold are bounded between any pair of higher and lower thresholds that are available
- Statistically, we would like to utilise such related information (correlation)
(2) Sophisticated approaches: (i) Multinomial regression with random effects
- Hamza et al. method
(2) Sophisticated approaches: (ii) Poisson survival model with random effects
Compared to Hamza:
- the multinomial distribution of the counts is replaced by a Poisson distribution of the number of events,
- the normal distribution of the random effects describing between-study variation is replaced by a multivariate gamma distribution.
(2) Sophisticated approaches: (iii) Many others
- Tosteson AN, Begg CB: A general regression methodology for ROC curve estimation. Medical Decision Making 1988, 8:204-15.
- Kester ADM, Buntinx F: Meta-analysis of ROC curves. Medical Decision Making 2000, 20:430-439.
- Poon WY: A latent normal distribution model for analysing ordinal responses with applications in meta-analysis. Statistics in Medicine 2004, 23:2155-2172.
- Bipat S, Zwinderman AH, Bossuyt PMM, Stoker J: Multivariate Random-Effects Approach: For Meta-Analysis of Cancer Staging Studies. Acad Radiol 2007, 14:974-984.
- Dukic V, Gatsonis C: Meta-analysis of Diagnostic test accuracy assessment studies with varying number of thresholds. Biometrics 2003, 59:936-946.
Typically not used / difficult to fit with missing data / do not utilise established multivariate methods / do not handle studies with 1 threshold
(3) A practical sensitivity analysis: imputation approach
Aim: Evaluate the impact of missing thresholds while still enabling established Cochrane methods
- We have missing data that is bounded
- Results for a missing threshold are bounded between any pair of higher and lower thresholds that are available
Step 1: Impute missing threshold results
- On the logit scale, assume:
  - a linear increase in specificity as the threshold increases
  - a linear decrease in sensitivity as the threshold increases
- Convert back to a 2x2 table for the missing threshold
Step 2: Apply a standard meta-analysis at each threshold separately
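Step 1 can be sketched as linear interpolation on the logit scale between the two nearest observed thresholds. The sketch below uses the Al Ragib study from the example in these slides, which reports 2x2 tables at thresholds 0.13 and 0.18 for 39 patients with high and 146 with normal proteinuria. The function and variable names are illustrative assumptions, and the sketch assumes observed sensitivity and specificity lie strictly between 0 and 1 (otherwise the logit is undefined).

```python
import math


def logit(p):
    return math.log(p / (1 - p))


def expit(x):
    return 1 / (1 + math.exp(-x))


def impute_2x2(x, lower, upper, n_diseased, n_healthy):
    """Impute a 2x2 table at threshold x lying between two observed thresholds.

    lower and upper are (threshold, sensitivity, specificity) tuples.
    Sensitivity and specificity are interpolated linearly on the logit
    scale, then converted back to (possibly fractional) cell counts.
    """
    (x0, sens0, spec0), (x1, sens1, spec1) = lower, upper
    w = (x - x0) / (x1 - x0)  # position of x between the two thresholds
    sens = expit(logit(sens0) + w * (logit(sens1) - logit(sens0)))
    spec = expit(logit(spec0) + w * (logit(spec1) - logit(spec0)))
    tp = sens * n_diseased
    tn = spec * n_healthy
    return {"TP": tp, "FP": n_healthy - tn, "FN": n_diseased - tp, "TN": tn}


# Observed results for the Al Ragib study at thresholds 0.13 and 0.18
lower = (0.13, 35 / 39, 95 / 146)
upper = (0.18, 33 / 39, 104 / 146)

table = impute_2x2(0.14, lower, upper, n_diseased=39, n_healthy=146)
print({cell: round(count, 1) for cell, count in table.items()})
```

To one decimal place, the imputed counts at threshold 0.14 should agree with the row shown for that threshold in the Al Ragib example table.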
Example: Al Ragib study
[Figure: logit-sensitivity against logit-specificity for the Al Ragib study, points labelled by threshold value (0.13 to 0.49), distinguishing imputed from observed results]
Example: Al Ragib study

Threshold ID, t   Threshold value, x   Imputed?   TP     FP     FN    TN      Total   High proteinuria   Normal proteinuria
1                 0.13                 No         35     51     4     95      185     39                 146
2                 0.14                 Yes        34.7   49.1   4.3   96.9
3                 0.15                 Yes        34.3   47.3   4.7   98.7
4                 0.16                 Yes        33.9   45.5   5.1   100.5
5                 0.17                 Yes        33.5   43.7   5.5   102.3
6                 0.18                 No         33     42     6     104
- Using this method, an additional 50 threshold results were imputed
- Vastly more information is available for meta-analysis at each threshold
- Key question: are the original conclusions (before imputation) robust?
Separate versus imputation approach
[Figure: summary sensitivity (0.5 to 1) against specificity (1 to 0.5) at each threshold value (0.13 to 0.50), comparing results with no imputation and with imputation]
Imputation reveals weaker clinical conclusion
The imputation approach generally:
- reveals LOWER summary test accuracy
- reveals increased heterogeneity
Original: Suggests the PCR test is not suitable (on its own) for diagnosing proteinuria
Imputation approach: pros and cons
Advantages
- Utilises more information from neighbouring thresholds
- Allows standard bivariate meta-analysis at each threshold
- Incorporates studies with just one threshold
- Reveals potential over-optimism in published results (due to selective reporting of thresholds: see Rifai et al.)
Disadvantages
- Single imputation used; multiple imputation now essential
- Uncertainty of the imputed value needs to be acknowledged (view as a sensitivity analysis, focus on impact on summary estimates)
(4) Multivariate-normal approach
- Rather than imputation, consider modelling correlation
- Extend the bivariate-normal model of Reitsma et al. (2005)
Step 1: Obtain for each study
- logit-sensitivity and logit-specificity estimates at each threshold
- their variance-covariance matrix (variances and correlation)
Step 2: Apply a multivariate-normal meta-analysis model
- Assume logit estimates follow an approximate multivariate normal distribution
- Jointly synthesises all thresholds simultaneously
- Correlation borrows strength across thresholds
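Part of Step 1 can be sketched as follows: a logit estimate and its approximate variance from the two cells of a 2x2 table, using the standard large-sample variance of a log-odds (the sum of the reciprocal cell counts). The function name is illustrative, and adding 0.5 to both cells only when one of them is zero is an assumed choice of continuity correction, not one specified on the slides. The within-study correlations between thresholds, which Step 1 also requires, are not computed here.

```python
import math


def logit_with_variance(events, non_events, correction=0.5):
    """Logit of events / (events + non_events) and its approximate variance.

    The continuity correction is added to both cells only when one of
    them is zero (an assumption made for this sketch).
    """
    if events == 0 or non_events == 0:
        events += correction
        non_events += correction
    estimate = math.log(events / non_events)
    variance = 1 / events + 1 / non_events
    return estimate, variance


# Al Ragib study at threshold 0.13
tp, fp, fn, tn = 35, 51, 4, 95
logit_sens, var_sens = logit_with_variance(tp, fn)
logit_spec, var_spec = logit_with_variance(tn, fp)
print(f"logit-sensitivity = {logit_sens:.3f} (variance {var_sens:.3f})")
print(f"logit-specificity = {logit_spec:.3f} (variance {var_spec:.3f})")
```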
Univariate versus multivariate approach
[Figure: summary sensitivity (0.5 to 1) against specificity (1 to 0.5) at each threshold value (0.13 to 0.50), comparing multivariate and univariate results]
Multivariate-normal approach: pros & cons
Advantages
- Can lead to improved precision of estimates
- Reduces selective reporting of thresholds
Disadvantages
- Laborious to fit
- Between-study covariance matrix poorly estimated (correlations often estimated as +1 or -1)
- Uses approximate normal sampling distribution
- Requires continuity correction if there are zero cells
Final thing: ordering
- Sensitivity and specificity should be ordered across thresholds
  - As the threshold increases, sensitivity decreases and specificity increases
- Sophisticated methods in part (2) ensure this, but this perhaps also causes convergence problems given missing data
- Simpler approaches do not ensure ordering across thresholds
  - e.g. the summary sensitivity estimate increases from 0.82 to 0.94 when the threshold increases from 0.28 to 0.3
  - This is due to the imbalance in the number of studies per threshold
- Order by meta-regression of meta-analysis results versus threshold
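A simple way to see where the ordering breaks down is to scan the summary estimates for neighbouring thresholds that run in the wrong direction. The sketch below only detects violations; it does not implement the meta-regression used to impose the ordering. The function name is illustrative, and the two data points are the summary sensitivities quoted above.

```python
def ordering_violations(thresholds, estimates, should_increase):
    """Return pairs of neighbouring thresholds whose estimates are mis-ordered.

    should_increase=True is for specificity, which should rise with the
    threshold; should_increase=False is for sensitivity, which should fall.
    """
    pairs = sorted(zip(thresholds, estimates))
    violations = []
    for (x0, e0), (x1, e1) in zip(pairs, pairs[1:]):
        wrong = e1 < e0 if should_increase else e1 > e0
        if wrong:
            violations.append((x0, x1))
    return violations


# Summary sensitivity at thresholds 0.28 and 0.30, as quoted on the slide
violations = ordering_violations([0.28, 0.30], [0.82, 0.94], should_increase=False)
print(violations)  # [(0.28, 0.3)]
```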
Constrain after multi-normal meta-analysis
[Figure: summary sensitivity (0.5 to 1) against specificity (1 to 0.5) at each threshold value, comparing unconstrained and constrained summary results]
The constrained results are ordered as the threshold value increases from 0.13 to 0.5
So the main message is...
- For continuous tests, ideally want to produce meta-analysis results for each threshold & an interpretable SROC curve
- If possible, move away from ROC curves where particular thresholds are not identifiable
- Difficult given missing thresholds across studies
- This is a non-trivial problem
- Therefore it is usually ignored; we encourage researchers to consider it
Conclusions: dealing with multiple thresholds
- Sophisticated methods are available but may not be possible
- Propose multivariate-normal or imputation approaches
  - Enable more familiar methods
  - Assess if summary results are robust to the missing thresholds
  - Potentially reduce selective reporting
- Currently extending to multiple imputation
- Future work: model parametric distribution of raw data
THANK YOU!
Selected references
- Reitsma JB, Glas AS, Rutjes AW, Scholten RJ, Bossuyt PM, Zwinderman AH. Bivariate analysis of sensitivity and specificity produces informative summary measures in diagnostic reviews. J Clin Epidemiol 2005; 58: 982-990.
- Chu H, Cole SR. Bivariate meta-analysis of sensitivity and specificity with sparse data: a generalized linear mixed model approach. J Clin Epidemiol 2006; 59: 1331-1332; author reply 1332-1333.
- Hamza TH, Arends LR, van Houwelingen HC, Stijnen T. Multivariate random effects meta-analysis of diagnostic tests with multiple thresholds. BMC Medical Research Methodology 2009; 9: 73.
- Riley RD et al. Meta-analysis of diagnostic test studies with multiple and missing thresholds: multivariate-normal & imputation approaches (to be submitted).
- Rifai N, Altman DG, Bossuyt PM. Reporting bias in diagnostic and prognostic studies: time for action. Clin Chem 2008; 54: 1101-1103.