How many Cases Are Missed When Screening Human Populations for Disease?

Similar documents
MEA DISCUSSION PAPERS

The Pretest! Pretest! Pretest! Assignment (Example 2)

Treatment effect estimates adjusted for small-study effects via a limit meta-analysis

Chapter 13 Estimating the Modified Odds Ratio

ANOVA in SPSS (Practical)

South Australian Research and Development Institute. Positive lot sampling for E. coli O157

Sampling Weights, Model Misspecification and Informative Sampling: A Simulation Study

One-Way Independent ANOVA

Hierarchical Bayesian Modeling of Individual Differences in Texture Discrimination

Introduction. We can make a prediction about Y i based on X i by setting a threshold value T, and predicting Y i = 1 when X i > T.

JSM Survey Research Methods Section

Minimizing Uncertainty in Property Casualty Loss Reserve Estimates Chris G. Gross, ACAS, MAAA

Bayesian meta-analysis of Papanicolaou smear accuracy

Citation for published version (APA): Ebbes, P. (2004). Latent instrumental variables: a new approach to solve for endogeneity s.n.

Chapter 5: Field experimental designs in agriculture

Investigating the robustness of the nonparametric Levene test with more than two groups

Statistical Models for Censored Point Processes with Cure Rates

Modeling of epidemic spreading with white Gaussian noise

A microsimulation study of the benefits and costs of screening for colorectal cancer Christopher Eric Stevenson

A re-randomisation design for clinical trials

Nonparametric DIF. Bruno D. Zumbo and Petronilla M. Witarsa University of British Columbia

The Loss of Heterozygosity (LOH) Algorithm in Genotyping Console 2.0

Running head: NESTED FACTOR ANALYTIC MODEL COMPARISON 1. John M. Clark III. Pearson. Author Note

Bowel cancer screening the facts. Bowel cancer screening: the facts

MCAS Equating Research Report: An Investigation of FCIP-1, FCIP-2, and Stocking and. Lord Equating Methods 1,2

Day 11: Measures of Association and ANOVA

Two-Way Independent ANOVA

Logistic regression. Department of Statistics, University of South Carolina. Stat 205: Elementary Statistics for the Biological and Life Sciences

Some alternatives for Inhomogeneous Poisson Point Processes for presence only data

MULTIPLE LINEAR REGRESSION 24.1 INTRODUCTION AND OBJECTIVES OBJECTIVES

Describe what is meant by a placebo Contrast the double-blind procedure with the single-blind procedure Review the structure for organizing a memo

Fundamental Clinical Trial Design

6. Unusual and Influential Data

Chapter 12: Introduction to Analysis of Variance

Title: A robustness study of parametric and non-parametric tests in Model-Based Multifactor Dimensionality Reduction for epistasis detection

RAG Rating Indicator Values

Meta-analysis of two studies in the presence of heterogeneity with applications in rare diseases

1 Introduction. st0020. The Stata Journal (2002) 2, Number 3, pp

Binary Diagnostic Tests Two Independent Samples

Poisson regression. Dae-Jin Lee Basque Center for Applied Mathematics.

12/31/2016. PSY 512: Advanced Statistics for Psychological and Behavioral Research 2

For general queries, contact

Six Sigma Glossary Lean 6 Society

Application of Multinomial-Dirichlet Conjugate in MCMC Estimation : A Breast Cancer Study

How to describe bivariate data

Appendix III Individual-level analysis

National Bowel Screening Programme. Quick Guide

Simultaneous Equation and Instrumental Variable Models for Sexiness and Power/Status

Mantel-Haenszel Procedures for Detecting Differential Item Functioning

A review of statistical methods in the analysis of data arising from observer reliability studies (Part 11) *

Sample size and power calculations in Mendelian randomization with a single instrumental variable and a binary outcome

BMI 541/699 Lecture 16

LAB ASSIGNMENT 4 INFERENCES FOR NUMERICAL DATA. Comparison of Cancer Survival*

Instrumental Variables Estimation: An Introduction

Method Comparison for Interrater Reliability of an Image Processing Technique in Epilepsy Subjects

Score Tests of Normality in Bivariate Probit Models

BIOL 458 BIOMETRY Lab 7 Multi-Factor ANOVA

Lecture 21. RNA-seq: Advanced analysis

Confidence Intervals On Subsets May Be Misleading

Bayesian Confidence Intervals for Means and Variances of Lognormal and Bivariate Lognormal Distributions

MLE #8. Econ 674. Purdue University. Justin L. Tobias (Purdue) MLE #8 1 / 20

Performance of Median and Least Squares Regression for Slightly Skewed Data

Network Science: Principles and Applications

The MISCAN-COLON simulation model for the evaluation of colorectal cancer screening

Introduction to Meta-analysis of Accuracy Data

Bowel Cancer Prevention and Screening. Harriet Wynne, Cancer Council Victoria

STATISTICAL MODELING OF THE INCIDENCE OF BREAST CANCER IN NWFP, PAKISTAN

While correlation analysis helps

Impact of Response Variability on Pareto Front Optimization

Part I. Boolean modelling exercises

Bayesian hierarchical modelling

Comparing disease screening tests when true disease status is ascertained only for screen positives

Bayesian and Frequentist Approaches

SUPPLEMENTARY INFORMATION

Optimising health information to reduce inequalities

Chapter 1: Exploring Data

UNIVERSITY OF TORONTO SCARBOROUGH Department of Computer and Mathematical Sciences Midterm Test February 2016

Econometric Game 2012: infants birthweight?

Estimating drug effects in the presence of placebo response: Causal inference using growth mixture modeling

The Effect of Guessing on Item Reliability

PSYCHOLOGY 300B (A01)

Business Statistics Probability

Lab 4 (M13) Objective: This lab will give you more practice exploring the shape of data, and in particular in breaking the data into two groups.

MMI 409 Spring 2009 Final Examination Gordon Bleil. 1. Is there a difference in depression as a function of group and drug?

Supplemental Figure S1. Expression of Cirbp mrna in mouse tissues and NIH3T3 cells.

Answer all three questions. All questions carry equal marks.

Bayesian approaches to handling missing data: Practical Exercises

Today: Binomial response variable with an explanatory variable on an ordinal (rank) scale.

Meta-analysis to compare combination therapies: A case study in kidney transplantation

Method Comparison Report Semi-Annual 1/5/2018

Stat Wk 9: Hypothesis Tests and Analysis

Measuring association in contingency tables

Lec 02: Estimation & Hypothesis Testing in Animal Ecology

Exam 4 Review Exercises

Analysis of Hearing Loss Data using Correlated Data Analysis Techniques

1. You want to find out what factors predict achievement in English. Develop a model that

Analysis of Environmental Data Conceptual Foundations: En viro n m e n tal Data

Prediction of Malignant and Benign Tumor using Machine Learning

Multiple Regression Models

Transcription:

School of Mathematical and Physical Sciences Department of Mathematics and Statistics Preprint MPS-2011-04 25 March 2011 How many Cases Are Missed When Screening Human Populations for Disease? by Dankmar Böhning and Heinz Holling

How Many Cases Are Missed When Screening Human Populations for Disease? Dankmar Böhning Department of Mathematics and Statistics School of Mathematical and Physical Sciences University of Reading, Whiteknights Reading, RG6 6BX, UK email: d.a.w.bohning@reading.ac.uk Heinz Holling Statistics and Quantitative Methods Faculty of Psychology and Sports Science University of Münster, Münster, Germany email: holling@psy.uni-muenster.de March 25, 2011 1

Abstract Human populations are frequently screened for specific diseases with the aim to detect the disease early when it is easier to treat and cure. However, only for those at risk (at risk being defined as having a positive screening test result) it is meaningful to verify the disease status. As a result a large portion of the screened population remains unverified in their disease status. We investigate techniques to estimate the amount of disease present in the unverified population. In particular, we consider a situation in which a specific screening test (here for the presence of bowel cancer) is applied several times and we focus on the count of times the test has been positive for each subject. A method is suggested to estimate the number of individuals with cancer but having a negative screening test result at all times. The suggested technique is based upon a simple loglinear model for the ratios of neighboring probabilities of the number of times the test has been positive. The technique is applied to publicly available data of a screening study on bowel cancer in Sydney and it is demonstrated that the method provides realistic estimates of the number of missed cancer cases and performs superior to other available techniques. Some key words: capture-recapture, screening with partial verification of disease status, ratio estimator, zero truncated model 2

1 The problem Human populations are frequently screened for specific diseases with the aim to detect the disease early when it is easier to treat and cure. An eample is screening for bowel cancer. Bowel cancer can develop without any early warning signs and can grow on the inside wall of the bowel for several years before spreading to other parts of the body. Often very small amounts of blood leak from these growths and pass into the bowel motion before any symptoms are noticed. A test called Faecal Occult Blood Test (FOBT) can detect these small amounts of blood in the bowel motion. The FOBT looks for blood in the bowel motion, but not for bowel cancer itself. Screening for bowel cancer using a FOBT is a simple non-invasive process that can be done in the privacy of your own home. No screening test is 100% accurate, in fact, a single application of the kit test might have low sensitivity. However, it is thought that a repeated replication of the diagnostic test over a number of days will help to identify most cases of cancer. Over several years, from 1984 onwards about 50000 subjects were screened for bowel cancer at the St Vincent s Hospital in Sydney (Australia). The relevant references are Lloyd and Frommer (2004a, 2004b, 2008). The screening test comprises a sequence of 6 binary diagnostic tests which all are self-administered on 6 successive days. Each records the absence or presence of blood in faeces. If participants in the screening programme have their true disease status determined it is said the have been verified. In the case of this screening study it was done by physical eamination, sigmoidoscopy and colonoscopy. Out of eactly 49927 participants, 46553 tested negatively on all si tests and were not further assessed with the implicit diagnosis that they were cancer-free. In other words, these 46553 participants remained unverified. Out of the other 3374 subjects who tested positively at least once, 3106 were eamined and their true disease status determined with one of the following outcomes: healthy, polyps, cancer. 268 subjects who tested positively were lost to the study (Lloyd and Frommer 2008). The results are publicly available and presented here as Table 1. Table 1: Screening of 49927 subjects in Sydney for bowel cancer with partial verification of disease status (Lloyd and Frommer 2008) status 0 1 2 3 4 5 6 healthy? 1123 264 103 35 25 17 polyps? 772 245 108 72 45 69 cancer? 46 27 26 33 39 57 total 46553 1941 536 237 140 109 143 3

2 Conventional binomial model, diagnosing and modelling binomial heterogeneity We are interested in determining the distribution of number of positive tests per subject. Assuming that there is a homogeneous probability θ that a single of the applied diagnostic tests is positive and that the 6 tests are applied independently, then number of positive tests X per subject follows the binomial distribution ( ) n p = P (X = ) = θ (1 θ) n (1) where n = 6 is the number of tests per subject. Let us consider ratios ( n ) p +1 +1 θ +1 (1 θ) n 1 = ) = n θ p θ (1 θ) n + 1 1 θ, leading to ( n p +1 r = a = + 1 p +1 = θ p n p 1 θ, (2) where a = ( + 1)/(n ), showing that the ratio r is constant with varying p count. It is straightforward to estimate r = a +1 p by ˆr = a f +1 /N f /N = a f +1 f where f is the frequency of count and N = f 0 + f 1 +... + f n. The graph f ˆr = a +1 f can be used as a diagnostic device for the binomial and is called the ratio plot. If the ratio plot shows a pattern of a horizontal line, it can be taken as indicative for the presence of a binomial distribution. This is demonstrated in Figure 1 for simulated data from a binomial with trial size parameter n = 6. The ratio plot shows clear evidence for a binomial distribution. If we apply the concept of the ratio plot to the Sydney screening data of Table 1, we see that there is no evidence of a horizontal line (see Figure 2). Instead, we observe in the ratio plot a monotone pattern arising. Indeed, if θ is distributed with arbitrary density g(θ), then p = 1 0 ( ) n θ (1 θ) n g(θ)dθ (3) has the monotonicity property (Böhning, Baksh, Lerdsruwansri, Gallagher 2011) a 0 p 1 p 0 a 1 p 2 p 1 a 2 p 3 p 2..., and, hence, the ratio plot is must be monotone increasing. 4

1.0 ratio plot for binomial with n=6 and 50000 replications 0.8 true value of event parameter 0.4 0.6 θ 0.4 0.4 0.2 0.0 0 1 2 3 4 5 Figure 1: Ratio plot for 50000 simulated binomial counts with event parameter θ = 0.4 and trials size paramet n = 6 5

a) b) c) d) Figure 2: Ratio plot for Sydney screening study: a) unclassified, b)healthy only (red squares), c) polyps only (red squares), d) cancer only (red squares) 6

The form of the monotonicity in the pattern of the ratio plot in Figure 2 suggests to consider a more eplicit modelling of log ˆr in its dependence from. As it turns out, the model log ˆr = α + β log( + 1) + ɛ (4) arises as a simple, quite reasonable model with only two parameters involved, namely α and β. We can find estimates of α and β using weighted regression estimates where the weights are found as the inverses of 1/f +1 +1/f (Böhning 2008). Figure 3 shows that the log-linear model (4) not only fits well for the unclassified data, but also for the partially classified data. Note that it is one of the advantages of the approach that the model can be fitted for the unclassified data as well as for the zero-truncated classified data (see Figure 3 a) d). To provide a more formal assessment of the model we consider the fitted values log ˆr = ˆα + ˆβ log( + 1), for = 0, 1,..., n 1. Given ˆr 0,..., ˆr n 1, we need to determine ˆp 0,..., ˆp n. This can be accomplished as follows. By definition, the fitted probabilities have to satisfy the recursion ˆp +1 = (ˆr /a )ˆp for = 0,..., n 1 as well as n =0 ˆp = 1, so that n 1 1 = ˆp 0 +... + ˆp n = p 0 [1 + ˆr 0/a 0 + (ˆr 0/a 0 )(ˆr 1/a 1 ) +... + ˆr /a ], =1 which leads to ˆp 0 = 1 + n 1 i=0 1 i. (5) =0 ˆr /a Table 2: Observed and fitted frequencies of the unclassified Sydney screening data with various models fitted: log-linear (4), binomial model (1), and betabinomial model 0 1 2 3 4 5 6 χ 2 binomial 44236.6 5164.6 251.2 6.5 0.1 0.0 0.0 > 10 9 beta-bin. 46842.8 1403.9 633.2 363.0 221.5 131.1 63.6 398.93 log-linear 46418.7 1968.5 443.1 235.7 202.8 211.2 178.9 96.37 observed 46553 1941 536 237 140 109 143-7

The remaining fitted probabilities can be determined using the recursion ˆp +1 = (ˆr /a )ˆp for = 0,..., n 1. The fitted frequencies for the unclassified Sydney screening data, ˆf = ˆp N are found in row 4 of Table 2. The fit is good with a χ 2 = n (f ˆf ) 2 =0 = 96.37 by df = 7 1 2 = 4 degrees of freedom. For ˆf comparison, we have also included the fitted frequencies under a binomial model (1) which is evidently entirely unsatisfactory. An improved fit can be found by applying the beta-binomial model with g(θ) in (3) provided as the beta-density g(θ) = Γ(α + β) Γ(α)Γ(β) θα 1 (1 θ) β 1 where α and β are unknown parameters and Γ(t) = y t 1 ep( y)dy is the 0 Gamma function. Note that the beta-binomial is frequently used since the marginal can easily be worked out to be 1 ( ) ( ) n n Γ(α + β) p = θ (1 θ) n Γ( + α)γ(n + β) g(θ)dθ =. Γ(α)Γ(β) Γ(n + α + β) 0 The fit of the beta-binomial with χ 2 = 398.93 is evidently a lot better than the fit for the binomial, but clearly inferior to the fit of χ 2 = 96.37 provided by the log-linear model. Hence, we will focus in the following on the log-linear model. The panels b), c) and d) in Figure 3 show that the log-linear model is also suitable for the partially classified populations with cancer, with polyps, and those that are healthy. 3 Predicting the number of cases missed We are now considering fitting the log-linear model (4) to the partially classified data as being healthy having polyps having cancer. For each of the three populations we fit the log-linear model log ˆr = α + β log( + 1) + ɛ for = 1,..., n 1. Note that model is suitable for both situations, untruncated and zero-truncated count data as we have here. Clearly, we are interested in the predicted value ˆr 0 = ep(ˆα) which motivates the estimator ˆf 0 = a 0 f 1 ep( ˆα), (6) 8

since ˆr 0 = a 0 ˆf1 / ˆf 0, and replacing ˆf 1 by the observed frequency f 1 provided the estimate in (6). The specific estimates for the three populations is given in Table 3. As a standard comparative estimator we provide Chao s lower bound estimate which is achieved as ˆf 0,C = n 1 n f 2 1 2f 2, which provides an asymptotic lower bound for E(f 0 ). This lower bound is given in brackets in column 2 of table 3. Typically, Chao s lower bound provides only 10% of the size of the log-linear estimator. Still, also our estimates are lower bounds since 15937+10638+332=26907 do not match the marginal frequency 46553. The shape of the graphs in Figure 3 suggest that we likely underestimate the frequency of zero-counts in the healthy population and in the population with polyps, not so much the in the population with cancer. The estimate for the frequency of missed cancer cases seems realistic. Values for the sensitivity of an individual application of a FOBT ranges between q = 0.1 0.3. This leads to a probability for the diagnostic procedure finding at least one positive result in n successive applications of 1 (1 q) n. This would mean to epect the frequency of missed subjects with cancer to range between 258 487. We estimate it with 332 which is well in this range. Table 3: screening of 49927 subjects in Sydney for bowel cancer with partial verification of disease status: predicted frequencies of zero-count test results for each of the three partially classified populations are in bold (second column) status 0 1 2 3 4 5 6 healthy 15937 (1990) 1123 264 103 35 25 17 polyps 10638 (1014) 772 245 108 72 45 69 cancer 332 (33) 46 27 26 33 39 57 total 46553 1941 536 237 140 109 143 4 Prediction for a subset of fully classified data Lloyd and Frommer (2004) present also a table of cancer patients with a repeated application of the diagnostic procedure. In fact, a subset of 125 of the positive patients with confirmed cancer agreed to repeat the procedure 4 to 10 days after the primary test which we call the secondary data. Hence for all test results including the test-negatives the disease status is known (Table 4). It might be 9

a) b) c) d) Figure 3: Ratio plot with fitted log-linear model for Sydney screening study: a) unclassified with fitted line, b)healthy only (red squares) with fitted model (solid red line), c) polyps only (red squares) with fitted model (solid red line), d) cancer only (red squares) with fitted model (solid red line); for b)-d) the black dots indicate the unclassified data 10

3 2 secondary data on people with cancer Variable log(ratio) FITS_scondary log(ratio) 1 0-1 -2 point not used for fitting the model -3 0 1 2 3 4 5 Figure 4: Ratio plot with fitted log-linear model for secondary Sydney screening data with confirmed cancer interesting to see how the log-linear estiamte (??) performs when the known frequency f 0 is ignored. A model fit ignoring f 0 is presented in Figure 4 which appears to be reasonable. The corresponding estimate from (6) is ˆf 0 = 21 which compares favorably with the observed value of f 0 = 25. Chao s estimate is poor with ˆf 0,C = 2. Table 4: Distribution of count of test-positives for a repeated diagnostic testing of 125 subjects with cancer number of positive tests 0 1 2 3 4 5 6 f 25 8 12 16 21 12 31 11

References [1] Böhning, D. (2008). A simple variance formula for population size estimators by conditioning. Statistical Methodology 5, 410 423. [2] Bunge, J. and Fitzpatrick, M. (1993). Estimating the Number of Species: a Review. Journal of the American Statistical Association 88, 364 373. [3] Chao, A. (1987). Estimating the Population Size for Capture-Recapture data with Unequal Catchability. Biometrics 43, 783 791. [4] Dorazio, R.M. and Royle, J.A. (2005). Miture models for estimating the size of a closed population when capture rates vary among individuals. Biometrics 59, 351 364. [5] Lloyd, C.J. and Frommer (2004). Estimating the false negative fraction for a multiple screening test for bowel cancer when negatives are not verified. Austr. N.Z.J. Stat. 46, 531-542. [6] Lloyd, C.J. and Frommer (2004). Regression based estimation of the false negative fraction when multiple negatives are unverified. J. Roy. Statist. Soc. Ser. C 53, 619-631. [7] Lloyd, C.J. and Frommer (2008). An application of multinomial logistic regression to estimating performance of a multiple-screening test with incomplete verification. J. Roy. Statist. Soc. Ser. C 57, 89-102. 12