Statistical Hocus Pocus? Assessing the Accuracy of a Diagnostic Screening Test When You Don't Even Know Who Has the Disease


Statistical Hocus Pocus? Assessing the Accuracy of a Diagnostic Screening Test When You Don't Even Know Who Has the Disease
Michelle Norris
Dept. of Mathematics and Statistics, California State University, Sacramento
September 21, 2012

Outline
- Goals and Challenges of Diagnostic Screening
- Hui and Walter Solution to No Gold Standard Problem
- Longitudinal Diagnostic Screening
- Comparing Bayesian and Frequentist Statistics
- Johne's Disease Model and Findings

Diagnostic Screening
- Screening humans and animals for a multitude of diseases is common practice in modern medicine:
  - throat culture for strep throat
  - ELISA for HIV
  - tissue biopsy for cancer
- Unfortunately, many tests are imperfect.
- Statistical methods exist to quantify the accuracy of a screening test.

Types of Tests
Raw test results may be:
- binary, e.g. cancerous cells are present = 1 / not present = 0
- discrete, e.g. colony count in a bacterial culture
- continuous, e.g. optical density of a serology test
For a binary test, performance is defined using conditional probability.

A Bit on Conditional Probability
A method for adjusting a probability when additional information is available about the outcome.

                  Major
          Engineering   Math   Tot
Male           14         6     20
Female          1         9     10
Tot            15        15     30

Randomly select a student from this class. What is the probability the student is
1. male?
2. male, given you know the student is an engineer, written P(Male | Engr)?
3. an engineer, given you know the student is male, i.e. P(Engr | Male)?
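The three questions above can be answered by restricting the table to the rows or columns named in the condition. The following is a minimal Python sketch of that counting argument; the helper name `prob` and the dictionary layout are illustrative choices, not part of the original slides.

```python
from fractions import Fraction

# Counts from the class table: keys are (sex, major).
counts = {
    ("Male", "Engineering"): 14,
    ("Male", "Math"): 6,
    ("Female", "Engineering"): 1,
    ("Female", "Math"): 9,
}

def prob(event, given=lambda sex, major: True):
    """P(event | given): count within the cells that satisfy `given`."""
    denom = sum(n for (s, m), n in counts.items() if given(s, m))
    numer = sum(n for (s, m), n in counts.items() if given(s, m) and event(s, m))
    return Fraction(numer, denom)

def is_male(sex, major):
    return sex == "Male"

def is_engr(sex, major):
    return major == "Engineering"

p_male = prob(is_male)                       # 20 of 30 students
p_male_given_engr = prob(is_male, is_engr)   # 14 of 15 engineers
p_engr_given_male = prob(is_engr, is_male)   # 14 of 20 males

print(p_male, p_male_given_engr, p_engr_given_male)
```

Note that P(Male | Engr) and P(Engr | Male) differ because the conditioning changes the denominator: 15 engineers in one case, 20 males in the other.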

Measures of Diagnostic Test Performance
- Sensitivity = probability a diseased person tests positive = P(+ | D)
- Specificity = probability an undiseased person tests negative = P(- | No D)
- π = proportion of the population having the disease

The Easy Case
- The easiest way to estimate the sensitivity and specificity is to administer the test to subjects whose disease status is known.
- Sensitivity (Se) is estimated by the sample proportion of positive tests among the diseased subjects.
- Specificity (Sp) is estimated by the sample proportion of negative tests among the non-diseased subjects.
- Stat 1 methods can typically be used to obtain confidence intervals for Se and Sp:

  $\hat{p} \pm z_{\alpha/2} \sqrt{\hat{p}(1-\hat{p})/n}$

  where $\hat{p}$ is the proportion having the characteristic of interest in the sample and $n$ is the sample size.
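As an illustration of the interval above, here is a short Python sketch that computes it for Se and Sp when disease status is known. The counts (45 of 50 diseased subjects testing positive, 190 of 200 non-diseased subjects testing negative) are hypothetical, and the function name `wald_interval` is an illustrative choice.

```python
import math
from statistics import NormalDist

def wald_interval(successes, n, conf=0.95):
    """Return (p_hat, lower, upper) for p_hat +/- z * sqrt(p_hat(1 - p_hat)/n)."""
    p_hat = successes / n
    z = NormalDist().inv_cdf(1 - (1 - conf) / 2)   # z_{alpha/2}
    half_width = z * math.sqrt(p_hat * (1 - p_hat) / n)
    # Clip to [0, 1] because a proportion cannot leave that range.
    return p_hat, max(0.0, p_hat - half_width), min(1.0, p_hat + half_width)

# Hypothetical gold-standard data (not from the slides).
se_hat, se_lo, se_hi = wald_interval(45, 50)     # positives among diseased
sp_hat, sp_lo, sp_hi = wald_interval(190, 200)   # negatives among non-diseased

print(f"Se = {se_hat:.3f} ({se_lo:.3f}, {se_hi:.3f})")
print(f"Sp = {sp_hat:.3f} ({sp_lo:.3f}, {sp_hi:.3f})")
```

This interval relies on a normal approximation, so it can behave poorly when the sample is small or the proportion is close to 0 or 1.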

[Figure legend] Pink = person has disease; white = person does not have disease; + = screening test is positive; - = screening test is negative.

No Gold Standard Case
- No gold standard (NGS) data occur when there is no perfect test, so the true disease status of subjects is unknown. How can Se, Sp, and π be estimated?
- The first breakthrough for NGS data came in 1980, when Hui and Walter used 2 independent tests on 2 populations.

Hui and Walter Solution to NGS Case
- Need two tests and two populations (could be males/females).
- For example, suppose we group the people in this room by gender.
- We test each person with a serology test and a bacterial culture test for strep throat.
- We don't know the true disease status of anyone.

Name    Serology   Culture   Group
John       +          +        1
Jane       -          +        2
Susan      +          -        2

- Summarize the data with $n_{1++}$ = number in group 1 testing + on both tests, and similarly $n_{1+-}$, $n_{1-+}$, $n_{1--}$, $n_{2++}$, $n_{2+-}$, $n_{2-+}$, $n_{2--}$.

Hui and Walter, 1980
- With 2 tests in 2 populations, we can estimate Se and Sp for both tests and the prevalences in both populations using maximum likelihood: $\{Se_1, Se_2, Sp_1, Sp_2, \pi_1, \pi_2\}$, WITHOUT KNOWING ANYONE'S TRUE DISEASE STATUS.
- A few assumptions:
  - The tests are independent conditional on disease status.
  - The prevalences of the two populations are different.
  - Se and Sp of both tests are the same in both populations.

The Data
We are able to estimate the 6 parameters since we have 6 independent pieces of data (3 free cell counts in each 2x2 table, once the table total is fixed).

Pop 1              Test 2
                 +      -     Tot
Test 1   +      14      4      18
         -       9    528     537
         Tot    23    532     555

Pop 2              Test 2
                 +      -     Tot
Test 1   +     887     31     918
         -      37    367     404
         Tot   924    398    1322

Let $n_{gij}$ be the number in group g having test 1 outcome i and test 2 outcome j. So $n_{1+-} = 4$.
Getting estimates of $\{Se_1, Se_2, Sp_1, Sp_2, \pi_1, \pi_2\}$ is now like solving a system of 6 equations in 6 unknowns.
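One way to obtain the maximum likelihood estimates from the two tables above is the EM algorithm, which treats each subject's disease status as missing data. The sketch below is an illustration under the three Hui and Walter assumptions, not the method used in the 1980 paper. The starting values and the function names `joint_probs` and `hui_walter_em` are assumptions made for this example.

```python
# Observed counts by population: keys are (test 1 result, test 2 result), 1 = positive.
COUNTS = [
    {(1, 1): 14, (1, 0): 4, (0, 1): 9, (0, 0): 528},     # population 1
    {(1, 1): 887, (1, 0): 31, (0, 1): 37, (0, 0): 367},  # population 2
]

def joint_probs(prev, se, sp, results):
    """P(results and diseased), P(results and not diseased), assuming the
    tests are independent conditional on disease status."""
    p_dis, p_not = prev, 1.0 - prev
    for t, r in enumerate(results):
        p_dis *= se[t] if r else 1.0 - se[t]
        p_not *= (1.0 - sp[t]) if r else sp[t]
    return p_dis, p_not

def hui_walter_em(counts, n_iter=2000):
    # Starting values are arbitrary assumptions. Starting with Se, Sp > 0.5
    # steers EM away from the mirror-image solution (Se -> 1 - Sp, etc.).
    prev, se, sp = [0.2, 0.6], [0.9, 0.9], [0.9, 0.9]
    totals = [sum(table.values()) for table in counts]
    for _ in range(n_iter):
        dis_by_pop = [0.0] * len(counts)  # expected number diseased per population
        dis_pos = [0.0, 0.0]              # expected diseased with test t positive
        not_neg = [0.0, 0.0]              # expected non-diseased with test t negative
        for g, table in enumerate(counts):
            for results, n in table.items():
                p_dis, p_not = joint_probs(prev[g], se, sp, results)
                w = p_dis / (p_dis + p_not)   # E-step: P(diseased | results)
                dis_by_pop[g] += n * w
                for t, r in enumerate(results):
                    dis_pos[t] += n * w * r
                    not_neg[t] += n * (1.0 - w) * (1 - r)
        # M-step: update each parameter as a weighted proportion.
        n_dis = sum(dis_by_pop)
        n_not = sum(totals) - n_dis
        prev = [d / total for d, total in zip(dis_by_pop, totals)]
        se = [x / n_dis for x in dis_pos]
        sp = [x / n_not for x in not_neg]
    return {"prev": prev, "se": se, "sp": sp}

fit = hui_walter_em(COUNTS)
for name, values in fit.items():
    print(name, [round(v, 3) for v in values])
```

The E-step computes, for each cell of each table, the probability that a subject with those two test results is diseased; the M-step then re-estimates prevalence, Se, and Sp as if those probabilities were fractional disease labels.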

The Data
- Current research considers NGS data where subjects are screened repeatedly over time, i.e. longitudinal data.
- Longitudinal methods have been applied to HIV, diagnosing ovarian cancer, and modeling cognition in dementia patients,
- and to diagnosing Johne's Disease in cows.

Johne's Disease
- No cure
- Significant economic losses due to reduced milk production
- No symptoms for roughly one year
- Early detection prevents disease from spreading
- Semi-annual screening of 365 cows with two imperfect tests administered at each test time: a serology test (continuous) and a fecal culture test (binary)

Johne's Disease Data
Goal is to correctly classify cows as diseased or not using these data (Norris, Johnson, and Gardner, 2009).
[Figure: four panels (Cow 182, Cow 82, Cow 52, Cow 208) plotting log serology against time in years, with points marked S and F.]

Bayesian versus Frequentist Data Analysis
- We used Bayesian methods to analyze the longitudinal fecal and serology tests for cows.
- Frequentists make inferences based on the data only.
- Bayesians use both the data collected AND so-called prior information from another independent source (previous study, expert, etc.):
  - data and prior are combined in a probabilistically coherent manner using Bayes' Theorem to obtain the posterior distribution
  - the mean of the posterior distribution is often taken as the estimate of the parameter; it may have a nice formula

Bayes' Theorem

$f(\mu \mid \text{data}) = \dfrac{f(\text{data} \mid \mu)\, g(\mu)}{P(\text{data})} = \dfrac{f(x \mid \mu)\, g(\mu)}{P(x)}$

- $g(\mu)$, the prior, reflects the probabilities associated with different values of the parameter before the data are seen.
- $f(x \mid \mu)$ represents the information about the parameter $\mu$ that is contained in the data.
- The posterior distribution, $f(\mu \mid x)$, represents the updated probability distribution of $\mu$ once the prior has been combined with the information in the data.
- Once the posterior distribution of $\mu$ is obtained, we often use its mean to estimate $\mu$.
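To make the prior-times-likelihood recipe concrete, the following Python sketch evaluates Bayes' Theorem on a grid for a proportion. Everything specific here is an assumption for illustration: the data (7 successes in 10 trials), the binomial likelihood, the flat prior, and the function name `grid_posterior`.

```python
from math import comb

def grid_posterior(x, n, prior, m=1001):
    """Posterior weights over a grid of m candidate values of the proportion.

    posterior is proportional to likelihood * prior; dividing by the sum
    plays the role of the denominator P(x) in Bayes' Theorem.
    """
    grid = [k / (m - 1) for k in range(m)]
    unnorm = [comb(n, x) * p**x * (1 - p) ** (n - x) * prior(p) for p in grid]
    total = sum(unnorm)
    return grid, [u / total for u in unnorm]

# Hypothetical data: 7 successes in 10 trials, flat prior g(p) = 1.
grid, post = grid_posterior(7, 10, prior=lambda p: 1.0)

# Posterior mean, the usual Bayesian point estimate.
post_mean = sum(p * w for p, w in zip(grid, post))
print(round(post_mean, 4))
```

With a flat prior the exact posterior is Beta(8, 4), whose mean is 8/12, so the grid result can be checked against that value.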

Example of Bayesian and Frequentist Analysis
The problem (from Samaniego and Reneau, 1994):
- The population consists of the first words of the 758 pages in a particular edition of W. Somerset Maugham's book Of Human Bondage.
- The task is to estimate p, the true proportion of words that have 6 or more letters.
- The data will consist of the number of words having 6 or more letters from 10 randomly selected pages (n = sample size = 10).
- The frequentist estimate of p is

  $\hat{p} = \dfrac{\text{number of words having 6 or more letters in the sample}}{10}$

Example of Bayesian and Frequentist Analysis
For a Bayesian analysis, you may construct a prior as follows:
- First take your best guess at the proportion of words having 6 or more letters; call it $p^*$. Suppose my $p^* = 0.40$.
- Now consider how much weight, $\alpha \in [0, 1]$, you want to put on the data, i.e. on $\hat{p}$.
- Estimate p by $\alpha \hat{p} + (1 - \alpha) p^*$.
- I choose $\alpha = 0.75$.
- If the data yielded $\hat{p} = 5/10$, then the Bayes estimator is $\alpha \hat{p} + (1 - \alpha) p^* = 0.75(0.5) + 0.25(0.4) = 0.475$.
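The weighted estimate above can be reproduced in a few lines of Python. The sketch also shows one way to read the guess and the weight as a prior distribution. If the prior is taken to be a conjugate Beta(a, b) distribution (an assumption; the slides do not name the prior family), then the posterior mean (a + x)/(a + b + n) equals the weighted average with $p^* = a/(a+b)$ and $\alpha = n/(a+b+n)$. The function names are illustrative.

```python
def bayes_estimate(p_hat, p_star, alpha):
    """Weighted average of the data estimate and the prior guess."""
    return alpha * p_hat + (1 - alpha) * p_star

def equivalent_beta_prior(p_star, alpha, n):
    """Beta(a, b) prior whose posterior mean matches the weighted average.

    Assumes a conjugate Beta prior and 0 < alpha <= 1.
    """
    strength = n * (1 - alpha) / alpha   # a + b, the prior "sample size"
    return p_star * strength, (1 - p_star) * strength

n, x = 10, 5
estimate = bayes_estimate(x / n, p_star=0.40, alpha=0.75)

a, b = equivalent_beta_prior(p_star=0.40, alpha=0.75, n=n)
posterior_mean = (a + x) / (a + b + n)

print(round(estimate, 3), round(posterior_mean, 3))
```

Under this reading, choosing $\alpha = 0.75$ with n = 10 amounts to a prior worth about 3.3 observations, so a smaller $\alpha$ corresponds to a more insistent prior.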

Example of Bayesian and Frequentist Analysis
- Samaniego and Reneau had 99 Stat 1 students each formulate his/her own prior using this procedure.
- The scatterplot (not reproduced here) shows their results.
- They compared how each student's Bayesian estimator would perform against $\hat{p}$, the frequentist estimator.
- The Bayesian estimator outperformed the frequentist estimator with 88 of the 99 priors.
- Priors that failed to yield better estimates had $p^*$ far off and heavy weight on $p^*$: wrong and stubborn.
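The "wrong and stubborn" pattern can be seen in a short calculation. This sketch assumes performance is measured by mean squared error at the true p and that the sample count is treated as binomial; both are assumptions, since the slides do not state the comparison criterion.

```latex
\begin{align*}
\mathrm{MSE}(\hat{p}) &= \frac{p(1-p)}{n}, \\
\mathrm{MSE}\bigl(\alpha\hat{p} + (1-\alpha)p^*\bigr)
  &= \alpha^2\,\frac{p(1-p)}{n} + (1-\alpha)^2 (p^* - p)^2 .
\end{align*}
% For alpha < 1, the weighted estimator has the smaller MSE exactly when
\[
(p^* - p)^2 \;<\; \frac{1+\alpha}{1-\alpha}\cdot\frac{p(1-p)}{n}.
\]
```

The right-hand side shrinks as $\alpha$ decreases, so the weighted estimator loses only when the guess $p^*$ is far from p and little weight is placed on the data.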

Models by Infection State
- The model accounts for the lag between infection with bacteria and antibody production; this lag will be estimated.
- Different models are defined for each infection state:
  - State 1: No infection during the entire study
  - State 2: Infection occurs late; no serology reaction occurs during the study
  - State 3: Infection occurs early enough for a serology reaction to occur during the study
- Assuming infection state is known, we assume serology and fecal culture are independent.
- Infection state is unknown and is inferred from the data.

Models by Infection State
Serology model for State 3 shown below (figure not reproduced); the fecal culture model is the same as in State 2.

Parameter Estimates
Putting priors on all parameters and using Bayesian methods, we obtain the following parameter estimates (total of 2199 parameters and latent variables):

                          95% Probability Interval
Parameter    Post. Mean     Lower      Upper
β_0            -1.741      -1.761     -1.721
σ_β0            0.067       0.052      0.087
τ_e            55.9        42.7       63.4
Se_F            0.57        0.52       0.63
Sp_F            0.976       0.955      0.990
q_1             0.48        0.41       0.55
q_2             0.25        0.19       0.32
q_3             0.26        0.22       0.32
lag             1.60        1.32       1.85

Table: Parameter Estimates for Johne's Disease Data (Semiparametric Model)

Conclusions about Slopes
- Each cow in State 3 is permitted to have its own slope for the serology reaction.
- It is typical to assume these slopes are draws from a normal distribution.
- We didn't make this assumption and instead estimated the distribution of slopes.
[Figure: estimated density of the slopes; vertical axis labeled dens.sum, ranging from 0.0 to 0.8.]

Infection over Time
Because the time of infection is estimated, we can study how infection spreads through the herd over time.
[Figure: cumulative number infected (N=365), vertical axis from 0 to 150, horizontal axis from 0 to 20.]

Further Research Needed
- More flexible serology trajectories
- Allow tests to be dependent