Bayesian meta-analysis of Papanicolaou smear accuracy


Gynecologic Oncology 107 (2007) S133–S137
www.elsevier.com/locate/ygyno
doi:10.1016/j.ygyno.2007.08.080

Bayesian meta-analysis of Papanicolaou smear accuracy

Xiuyu Cong a, Dennis D. Cox b, Scott B. Cantor c,*

a Biometrics and Data Management, Boehringer Ingelheim Pharmaceuticals, Inc., Ridgefield, Connecticut, USA
b Department of Statistics, Rice University, Houston, Texas, USA
c Section of Health Services Research, Unit 447, Department of Biostatistics, The University of Texas M.D. Anderson Cancer Center, 1515 Holcombe Blvd., Houston, Texas 77030-4009, USA

Received 24 August 2007

Financial support for this study was provided by grant number CA82710 from the National Cancer Institute. The funding agreement ensured the authors' independence in designing the study, interpreting the data, and writing and publishing the report. This research was presented, in part, at the 26th Annual Meeting of the Society for Medical Decision Making in Atlanta, Georgia, October 2004.
* Corresponding author. Fax: +1 713 563 4243. E-mail address: sbcantor@mdanderson.org (S.B. Cantor).

Abstract

Objective. To perform a Bayesian analysis of data from a previous meta-analysis of Papanicolaou (Pap) smear accuracy (Fahey et al. Am J Epidemiol 1995;141:680–689) and compare the results.

Methods. We considered two Bayesian models for the same data set used in the Fahey et al. study. Model I was a beta-binomial model that treated the numbers of true positives and false positives as independent binomial random variables with probability parameters β (sensitivity) and α (one minus specificity), respectively. We assumed that β and α are independent, each following a beta distribution whose parameters were given exponential priors. Model II considered sensitivity and specificity jointly through a bivariate normal distribution on the logits of the sensitivity and specificity. We performed sensitivity analyses to examine the effect of prior selection on the parameter estimates.

Results. We compared the estimates of average sensitivity and specificity from the Bayesian models with those from Fahey et al.'s summary receiver operating characteristic (SROC) approach. Model I produced results similar to those of the SROC approach. Model II produced point estimates higher than those of the SROC approach, although the credible intervals overlapped and were wider. Sensitivity analysis showed that the Bayesian models are somewhat sensitive to the variance of the prior distribution, but their point estimates are more robust than those of the SROC approach.

Conclusions. The Bayesian approach has advantages over the SROC approach in that it accounts for between-study variation and allows estimation of the sensitivity and specificity for a particular trial while taking into consideration the results of other trials, i.e., borrowing strength from other trials.

© 2007 Elsevier Inc. All rights reserved.

Keywords: Meta-analysis; Sensitivity; Specificity; Bayesian model; Papanicolaou smear; Cervical cancer

Introduction

The Papanicolaou (Pap) smear, which involves the collection, preparation, and examination of exfoliated cervical cells, is a medical procedure commonly used to screen for and diagnose cervical cancer. The Pap smear is believed to be a key strategy in the early detection and prevention of cervical cancer, and its accuracy has been the subject of many clinical trials and research papers. However, the results and conclusions of these trials and papers differ greatly. Fahey et al. [1] noted that among the studies included in their meta-analysis, the estimates of sensitivity and specificity ranged from 0.14 to 0.97 and from 0.11 to 0.99, respectively. In addition, study quality and other characteristics, including sample size and patients' ages and races, also varied among the studies. These disparities warrant a systematic review of these studies. Here, we conduct such a systematic review through meta-analysis, a statistical technique that quantitatively pools the results of multiple independent studies to draw inferences from those studies.

The summary receiver operating characteristic (SROC) method, introduced by Littenberg and Moses [2,3], has been widely used by authors of published meta-analyses that examine diagnostic test accuracy. However, the SROC method has some inherent limitations that could affect its validity and usefulness. First, the SROC method is not based on a formal probability model; in fact, Littenberg and Moses call it a data-analytic approach [2]. Second, there are difficulties in dealing with any zeroes that appear in the cells of the 2 × 2 table used in diagnostic test evaluation, i.e., true positives (TP), false positives (FP), true negatives (TN), and false negatives (FN). The continuity correction method, i.e., adding one half to each count in calculating the true positive rate and the false positive rate, introduces non-negligible downward bias in the estimated SROC curve [2,3,4]. An additional problem with the SROC method is that its confidence intervals rely on large-sample approximations, and the appropriate sample size for a meta-analysis is the number of studies, which may not be large enough for the approximation to be accurate.

In this paper we consider a Bayesian approach for assessing the diagnostic test accuracy of multiple studies. We propose two Bayesian models, a beta-binomial model and a normal logit model, and compare the results with those of the SROC approach.

Methods

Study sources

We used the same primary studies as the meta-analysis by Fahey et al. [1]. All primary studies were published between January 1984 and March 1992, inclusive, and were identified by a MEDLINE search. In addition, manual searches of relevant journals, reference lists of retrieved articles, and communication with other researchers identified additional sources of published data. Three content areas were used for the literature search: cervical cancer/disease, tests for cervical abnormality, and test evaluation. Studies were excluded if they were not published in English, were review articles, or did not contain original data. Also excluded were articles that did not use Pap smears, did not use histology as the reference standard, or did not report sufficient data for estimation of operating characteristics. There were 58 cross-sectional studies identified (Fahey et al. [1], Appendix 1, p. 687) that evaluated the Pap smear, used histology as the reference standard, and used a threshold of cervical intraepithelial neoplasia grade 1 (CIN1) or grade 2 (CIN2) or equivalent. In 31 of these studies, the Pap smear was used as a follow-up to prior abnormal tests; in the remaining 27 studies, the Pap smear was used for screening purposes.

Model I: beta-binomial model

In the beta-binomial model, the number of true positives is a binomially distributed random variable whose probability parameter is the true study-specific sensitivity (β): TP ~ bin(TP + FN, β). Similarly, the number of false positives is binomially distributed with the study-specific probability parameter α, which is equal to one minus specificity: FP ~ bin(FP + TN, α). The parameters β and α are assumed to follow independent beta distributions with different parameters: β ~ beta(a1, b1) and α ~ beta(a2, b2). We place independent exponential prior distributions on a1, b1, a2, and b2. The particular parameters for all models can be found in Table 1. The goal is to choose prior values such that the prior means are close to the sample mean but have rather large variances, and hence are vague priors.

Table 1
Prior parameters used in the Bayesian analyses

Model I: beta-binomial model    Model II: normal logit model
a1 ~ exp(5)                     θ ~ N(0, 1000)
b1 ~ exp(3)                     τ² ~ inverse gamma(0.33, 0.25)
a2 ~ exp(3)                     μ ~ N(0, 1000)
b2 ~ exp(5)                     σ² ~ inverse gamma(0.33, 0.25)
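
The paper reports fitting both models in WinBUGS 1.4.1 but does not reproduce the model code. A minimal sketch of how Model I might be written in the BUGS language is given below; the variable names (nstud, tp, fp, npos, nneg) are illustrative rather than taken from the paper, and exp(5), exp(3) are read here as exponential means (the paper does not state whether exp(·) denotes a mean or a rate), so the dexp rates below are their reciprocals.

# Sketch of Model I (beta-binomial), runnable in WinBUGS/OpenBUGS/JAGS.
# Data for study i: tp[i] true positives out of npos[i] = TP + FN diseased
# subjects, and fp[i] false positives out of nneg[i] = FP + TN non-diseased.
model {
  for (i in 1:nstud) {
    tp[i] ~ dbin(beta[i], npos[i])     # study-specific sensitivity beta[i]
    fp[i] ~ dbin(alpha[i], nneg[i])    # study-specific 1 - specificity alpha[i]
    beta[i]  ~ dbeta(a1, b1)
    alpha[i] ~ dbeta(a2, b2)
  }
  # Exponential hyperpriors from Table 1, interpreted as means 5, 3, 3, 5
  a1 ~ dexp(0.2)
  b1 ~ dexp(0.333)
  a2 ~ dexp(0.333)
  b2 ~ dexp(0.2)
  # One way to summarize average accuracy across studies (not necessarily
  # the exact summary reported in Table 2)
  mean.sens <- a1 / (a1 + b1)
  mean.spec <- 1 - a2 / (a2 + b2)
}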
Model II: normal logit model

The beta-binomial model (Model I) is relatively simple, but by modeling TP and FP independently it ignores possible correlation between sensitivity and specificity. Model II, the normal logit model, introduces correlation through the log odds ratio and is specified as follows:

TP ~ bin(TP + FN, β)
FP ~ bin(FP + TN, α)
δ = logit(β) − logit(α)
logit(α) ~ N(μ, σ²)
δ ~ N(θ, τ²)

Here, logit(α) = log(α/(1 − α)), and the parameter δ denotes the log odds ratio for a given study. A preliminary analysis showed that it is reasonable to assume that δ follows a normal distribution N(θ, τ²) with mean θ and variance τ². The correlation is modeled through δ: because logit(β) = logit(α) + δ, with logit(α) and δ independent, the correlation between logit(β) and logit(α) is σ/(σ² + τ²)^(1/2). The parameters of the sampling model are μ, σ², θ, and τ². For mathematical simplicity, we assume independent normal priors for the mean parameters and inverse gamma priors for the variance parameters.
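
A corresponding BUGS-language sketch of Model II follows, under the same illustrative variable names. BUGS parameterizes the normal distribution by its precision, so N(0, 1000) becomes dnorm(0, 0.001), and an inverse gamma(0.33, 0.25) prior on a variance is expressed as a gamma(0.33, 0.25) prior on the corresponding precision (assuming the usual shape and scale parameterization of the inverse gamma).

# Sketch of Model II (normal logit), runnable in WinBUGS/OpenBUGS/JAGS.
model {
  for (i in 1:nstud) {
    tp[i] ~ dbin(beta[i], npos[i])
    fp[i] ~ dbin(alpha[i], nneg[i])
    logit(alpha[i]) <- la[i]               # logit(1 - specificity)
    logit(beta[i])  <- la[i] + delta[i]    # logit(sensitivity); delta = log odds ratio
    la[i]    ~ dnorm(mu, prec.sigma)       # logit(alpha) ~ N(mu, sigma^2)
    delta[i] ~ dnorm(theta, prec.tau)      # delta ~ N(theta, tau^2)
  }
  mu    ~ dnorm(0, 0.001)                  # N(0, 1000)
  theta ~ dnorm(0, 0.001)                  # N(0, 1000)
  prec.sigma ~ dgamma(0.33, 0.25)          # sigma^2 ~ inverse gamma(0.33, 0.25)
  prec.tau   ~ dgamma(0.33, 0.25)          # tau^2   ~ inverse gamma(0.33, 0.25)
  sigma2 <- 1 / prec.sigma
  tau2   <- 1 / prec.tau
  rho <- sqrt(sigma2 / (sigma2 + tau2))    # correlation between logit(beta) and logit(alpha)
}

In the paper both models were run for 15,000 iterations with the first 5000 discarded as burn-in; a sketch like this would be run the same way, monitoring nodes such as mu, theta, sigma2, tau2, and rho.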

Model fitting

We fit each of the two models described in Table 1 in WinBUGS 1.4.1 [8] to the data sets consisting of the two subgroups (i.e., the 31 studies that concerned tests for follow-up purposes and the 27 studies that concerned tests for screening purposes). We ran 15,000 WinBUGS iterations for each model and report summary statistics after discarding the first 5000 burn-in samples.

Sensitivity analysis

It is important to conduct a sensitivity analysis to determine how the choice of prior distributions affects the performance of a Bayesian model. To this end, we chose representative priors that imply large, intermediate, or small between-study heterogeneity and compared the performances of the models using these different priors. If the performances differ very little, then we would conclude that there is robustness against the choice of prior distributions.

Results

Table 2 shows the Bayesian posterior estimates, using the specified prior parameters, of mean sensitivity and specificity for the 31 follow-up and 27 screening studies, together with 95% credible intervals based on posterior estimates of the standard errors. The estimates from the Bayesian models are clearly comparable. Model I has approximately the same estimates and interval widths as the SROC model of Fahey et al. [1]. Model II yields higher point estimates for sensitivity and specificity, and its 95% credible intervals are wider than the 95% confidence intervals determined in the SROC analysis by Fahey et al. [1]. In addition, the interval estimates for specificity are shifted, consistent with the higher point estimates.

Table 2
Bayesian posterior estimates of mean sensitivity and specificity

Model     Description     Sensitivity                            Specificity
Model I   Beta-binomial   0.58 (0.49, 0.67)  0.65 (0.57, 0.72)   0.70 (0.62, 0.77)  0.68 (0.60, 0.75)
Model II  Normal logit    0.60 (0.45, 0.74)  0.69 (0.58, 0.79)   0.76 (0.66, 0.84)  0.73 (0.63, 0.81)
SROC      Summary ROC     0.58 (0.49, 0.67)  0.66 (0.58, 0.73)   0.69 (0.62, 0.77)  0.66 (0.58, 0.73)

Entries are means followed by 95% credible intervals for Bayesian Models I and II and 95% confidence intervals for the SROC model (Fahey et al. [1]). Within each column, the two values correspond to the two study subgroups described in the text. Prior distributions for Models I and II are specified in Table 1.

Fig. 1 shows the posterior estimates of sensitivity and specificity for each individual study based on the two Bayesian models. In comparison with the estimates directly from each study, these values take into consideration all other studies and are drawn toward the mean. The smaller the sample size of an individual study, the closer the Bayesian estimates are to the mean.

Fig. 1. Estimated sensitivity and specificity for individual studies. In each panel, circles are the estimates from the observed data, triangles are from Model I, and crosses are from Model II. Solid vertical lines are the means estimated from the observed data; dotted lines are the means from Model I and dashed lines the means from Model II. Both Bayesian models used the prior parameters specified in Table 1.

We performed sensitivity analyses to examine the effect of prior distribution selection on the parameter estimates. Tables 3 and 4 show the results of these sensitivity analyses for Models I and II, respectively. Three sets of prior distributions were analyzed for each model. The estimated mean sensitivity and specificity for each model under the different priors are very close to each other.

Table 3
Sensitivity analysis for Model I

Prior parameter values                                Sensitivity                            Specificity
a1 ~ exp(1); b1 ~ exp(1); a2 ~ exp(1); b2 ~ exp(1)    0.58 (0.48, 0.67)  0.64 (0.56, 0.72)   0.69 (0.60, 0.77)  0.67 (0.59, 0.75)
a1 ~ exp(3); b1 ~ exp(5); a2 ~ exp(5); b2 ~ exp(3)    0.58 (0.49, 0.67)  0.65 (0.57, 0.72)   0.70 (0.62, 0.77)  0.68 (0.60, 0.75)
a1 ~ exp(1); b1 ~ exp(2); a2 ~ exp(2); b2 ~ exp(1)    0.58 (0.48, 0.67)  0.64 (0.56, 0.72)   0.69 (0.60, 0.77)  0.67 (0.59, 0.74)

Table 4
Sensitivity analysis for Model II

Prior parameter values                                                             Sensitivity                            Specificity
θ ~ N(0, 1000); μ ~ N(0, 1000); τ² ~ IG(0.333, 0.25); σ² ~ IG(0.333, 0.25)         0.60 (0.45, 0.74)  0.69 (0.58, 0.79)   0.76 (0.66, 0.84)  0.73 (0.64, 0.81)
θ ~ N(3.0, 1000); μ ~ N(−2.2, 1000); τ² ~ IG(0.333, 0.75); σ² ~ IG(0.333, 0.75)    0.60 (0.45, 0.74)  0.69 (0.58, 0.80)   0.76 (0.66, 0.84)  0.73 (0.63, 0.81)
θ ~ N(3.0, 10); μ ~ N(−2.2, 1.0); τ² ~ IG(0.333, 0.25); σ² ~ IG(0.333, 0.25)       0.60 (0.46, 0.74)  0.69 (0.57, 0.79)   0.74 (0.68, 0.85)  0.74 (0.65, 0.82)

Note. IG = inverse gamma.

Discussion

Many authors of meta-analyses that examine diagnostic test accuracy have used the summary receiver operating characteristic (SROC) approach [1,5,6], but confusion remains as to how to apply and interpret the resulting curves. Fahey et al. [1] and Sutton et al. [6] claimed that the advantage of the SROC curve in this framework is that it can better account for the possibility that different studies used different test thresholds, but neither source clearly discussed how this effect is accounted for. A non-significant slope was obtained in fitting the SROC curve; hence both sources concluded that the test threshold did not affect the test accuracy.

The Model II used here does provide some flexibility in dealing with varying thresholds. If the threshold for a positive test result is increased, then both the true positive rate and the false positive rate will go down, so δ will not change much whereas α will decrease. Thus, if our posterior estimate for τ² is small but the estimate for σ² is large, this would indicate that much of the variability may be due to varying thresholds. Model I does not lend itself to such explicit modeling of varying thresholds, but it is flexible enough to accommodate such variation.

We proposed Bayesian models as alternatives to the SROC approach in the meta-analysis of diagnostic test accuracy. Although the models here were applied to Pap smear studies, they can be used with similar outcomes for other diagnostic tests. With the proper selection of priors, Bayesian models can account for between-study heterogeneity other than that arising from varying thresholds for a positive test. Bayesian models provide posterior estimates for each individual study while taking into consideration the results of other studies, or borrowing strength from other studies. We believe that Bayesian models provide a more realistic and flexible framework in which to combine studies and integrate information in meta-analysis. In our analysis, the two Bayesian model realizations gave similar results in terms of estimation of the mean sensitivity and specificity.
For the follow-up group, all these models estimated a mean sensitivity and specificity smaller than the weighted means from Mitchell et al. [5] (0.57–0.67 vs. 0.75, and 0.66–0.68 vs. 0.73, respectively). It is also interesting that the screening group had a lower estimated sensitivity but a higher specificity than the follow-up group. This may be because of the different test thresholds used. When screening healthy individuals, doctors tend to require higher levels of abnormality before calling a case diseased, hence lower sensitivity but higher specificity. When performing follow-up tests for patients with prior abnormal Pap smear results, however, doctors are more likely to use a less stringent threshold of abnormality for calling a test result positive. These differences may also reflect fundamental differences in the populations: it is natural to expect that, in the screening group, there are more clear negatives and more borderline positives, hence higher true negative and false negative rates.

Although the two Bayesian models gave similar mean estimates, the normal logit model makes more intuitive sense because sensitivity and specificity are correlated. Furthermore, this model is better than the beta-binomial model because it allows more between-study heterogeneity. In our analysis, the normal logit model produced higher point estimates for both sensitivity and specificity; further study will determine whether this is a true reflection of reality.

We recognize that these estimates should be viewed with caution. As pointed out by Fahey et al. [1], estimates of the positivity rate of Pap smear screening are approximately 5–10%. However, the best operating characteristics as calculated above (sensitivity of 0.60 and specificity of 0.76) would yield a positivity rate no less than 24%, as the short calculation below shows. Thus, the true specificity of the Pap smear is probably in the upper end of the confidence or credible interval, i.e., closer to 0.90, which is consistent with estimates used in published decision-analytic models for cervical cancer screening [7]. It is clear that the data used in this analysis are somewhat inconsistent with this positivity rate, which is based on very large data sets. A further advantage of the Bayesian approach is that it can incorporate incomplete information, such as a positivity rate alone or the results of follow-up tests applied only to the subset of patients who have a positive Pap smear.
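
One way to see the 24% bound, using only the point estimates quoted above: writing p for the prevalence of disease at or above the study threshold among the women tested, the expected positivity rate is

positivity rate = 0.60 p + (1 − 0.76)(1 − p) = 0.24 + 0.36 p ≥ 0.24 for any p in [0, 1],

so the expected positivity rate cannot fall below 24% regardless of prevalence, compared with observed screening positivity rates of roughly 5–10%.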

Conflict of interest statement

We declare that we have no conflict of interest.

Acknowledgment

The authors wish to thank Margaret Newell for editorial contributions that enhanced the quality of the manuscript.

References

[1] Fahey MT, Irwig L, Macaskill P. Meta-analysis of Pap test accuracy. Am J Epidemiol 1995;141:680–9.
[2] Moses LE, Shapiro DE, Littenberg B. Combining independent studies of a diagnostic test into a summary ROC curve: data-analytic approaches and some additional considerations. Stat Med 1993;12:1293–316.
[3] Littenberg B, Moses LE. Estimating diagnostic accuracy from multiple conflicting reports: a new meta-analytic method. Med Decis Making 1993;13:313–21.
[4] Mitchell MD. Validation of the summary ROC for diagnostic test meta-analysis: a Monte Carlo simulation. Acad Radiol 2003;10:25–31.
[5] Mitchell MF, Cantor SB, Ramanujam N, Tortolero-Luna G, Richards-Kortum R. Fluorescence spectroscopy for diagnosis of squamous intraepithelial lesions of the cervix. Obstet Gynecol 1999;93:462–70.
[6] Sutton AJ, Abrams KR, Jones DR, Sheldon TA, Song F. Methods for meta-analysis in medical research. Chichester: John Wiley and Sons; 2000.
[7] Mandelblatt JS, Lawrence WF, Womack SM, Jacobson D, Yi B, Yi-ting H, Gold K, Barter J, Shah K. Benefits and costs of using HPV testing to screen for cervical cancer. JAMA 2002;287:2372–81.
[8] Spiegelhalter DJ, Thomas A, Best NG. WinBUGS Version 1.2 user manual. Cambridge: MRC Biostatistics Unit; 1999.