Bayesian Adjustments for Misclassified Data. Lawrence Joseph

Bayesian Adjustments for Misclassified Data
Lawrence Joseph
Marcel Behr, Patrick Bélisle, Sasha Bernatsky, Nandini Dendukuri, Theresa Gyorkos, Martin Ladouceur, Elham Rahme, Kevin Schwartzman, Allison Scott
Department of Epidemiology and Biostatistics, McGill University

Outline
- Statistical trends in medical journals
- Misclassification in analysis of diagnostic testing data
- User friendly software
- Misclassification in administrative database studies
- Extension to more complex situations (correlations, continuous data)
- Conclusion

Trends in Medical Journals
- Recognition of a good idea: optional, but if you analyze data this way, reviewers generally will say it is a good idea
- Commonplace: if you do not analyze data this way, reviewers generally will ask you to do it
- What is considered a good idea, or commonplace, changes over time.

Current Trends in Medical Journals (2007)
1. Bayesian analysis is recognized as a good idea but not required
2. Adjustment for measurement error is also recognized as a good idea but not required
3. Non-identifiability issues often mean that 2 implies 1
4. Both ideas are increasing in popularity

Articles encouraging use of Bayesian methods in medicine
- Malakoff D. Bayes offers a new way to make sense of numbers. Science 1999;286:1460-1464. [Science review about the rise of Bayesian analysis]
- Dunson D. Practical advantages of Bayesian analysis of epidemiologic data. American Journal of Epidemiology 2001;153:1222-1226.
- Goodman S. Of P-values and Bayes: A modest proposal. Epidemiology 2001;12:295-7. [Encourages use of Bayes Factors rather than p-values]
- Greenland S. Multiple-bias modelling for analysis of observational data. JRSSA 2005;168:267-306. [Encourages use of Bayesian methods for bias adjustments and measurement error/misclassification]

Quote from Greenland 2005
"Conventional analytic results do not reflect any source of uncertainty other than random error, and as a result readers must rely on informal judgments regarding the effect of possible biases. When standard errors are small these judgments often fail to capture sources of uncertainty and their interactions adequately. Multiple-bias models provide alternatives.... Typically, the bias parameters in the model are not identified by the analysis data and so the results depend completely on priors for those parameters. A Bayesian analysis is then natural..."

Another quote from Greenland 2005
Conventional analyses can be characterized as:
(a) Employ frequentist statistical methods based on assumptions which may be grossly violated and are not testable with the data under analysis:
  (i) the study exposure is randomized within levels of controlled covariates
  (ii) selection, participation and missing data are random
  (iii) there is no measurement error
(b) Address possible violations of assumptions with speculative discussions. If they like the results, researchers argue that the biases are inconsequential. If they dislike the results they focus on possible biases.

Example: Diagnostic testing for Strongyloides infection

                  Stool Examination
                    +       -     Total
Serology    +      38      87      125
            -       2      35       37
Total              40     122      162

- Prevalence 25%? Prevalence 75%?
- Sensitivity? Specificity?
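The two candidate prevalences on this slide appear to come from treating each test in turn as if it were error-free. A minimal sketch of that arithmetic, using only the counts in the table (the variable names are illustrative):

```python
# Cross-classified counts from the slide, keyed by (serology, stool exam).
counts = {("+", "+"): 38, ("+", "-"): 87, ("-", "+"): 2, ("-", "-"): 35}
n = sum(counts.values())

# Marginal numbers of positives for each test.
stool_pos = counts[("+", "+")] + counts[("-", "+")]
serology_pos = counts[("+", "+")] + counts[("+", "-")]

# Naive prevalence: proportion positive, assuming the test makes no errors.
naive_prev_stool = stool_pos / n
naive_prev_serology = serology_pos / n

print(f"stool examination: {stool_pos}/{n} = {naive_prev_stool:.1%}")
print(f"serology:          {serology_pos}/{n} = {naive_prev_serology:.1%}")
```

The same 162 subjects give roughly a quarter or roughly three quarters infected depending on which test is trusted, which is why sensitivity and specificity have to be estimated along with prevalence.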

Problem: Non-identifiability
- Table has 3 degrees of freedom
- There are five unknown parameters (prev + sens and spec from each test)
- Thus we have non-identifiability: There is an infinite number of solutions (estimates of the five parameters) that fit the data equally well
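To make the counting concrete, the cell probabilities can be written out under one common modelling assumption, namely that the two tests are conditionally independent given true infection status. The notation is not from the slides: \pi is prevalence, and S_j and C_j are the sensitivity and specificity of test j (1 = stool examination, 2 = serology).

```latex
\begin{align*}
P(T_1 = +,\, T_2 = +) &= \pi S_1 S_2 + (1-\pi)(1-C_1)(1-C_2) \\
P(T_1 = +,\, T_2 = -) &= \pi S_1 (1-S_2) + (1-\pi)(1-C_1)\, C_2 \\
P(T_1 = -,\, T_2 = +) &= \pi (1-S_1) S_2 + (1-\pi)\, C_1 (1-C_2) \\
P(T_1 = -,\, T_2 = -) &= \pi (1-S_1)(1-S_2) + (1-\pi)\, C_1 C_2
\end{align*}
```

The four probabilities sum to one, so the table supplies only three independent equations for the five unknowns (\pi, S_1, C_1, S_2, C_2).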

Frequentist Solution
- With 3 df, can only estimate 3 parameters at a time
- Solution: Pick any two parameters, fix their values, and use the data to estimate the other three
- Problems:
  - Which 2 to pick as known?
  - What if known values are inaccurate?
  - Even if exactly correct values are selected for the known parameters, confidence intervals are too narrow, as uncertainty in the constrained values is ignored.
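The sensitivity of this approach to the fixed values can be sketched with the Strongyloides table. The sketch below assumes conditionally independent tests, fixes the stool examination specificity at 1, and tries two hypothetical values for the serology specificity (0.5 and 0.7, chosen purely for illustration). Under those assumptions the remaining three parameters have a closed-form solution.

```python
# Observed counts keyed by (stool result, serology result); 1 = positive.
counts = {(1, 1): 38, (1, 0): 2, (0, 1): 87, (0, 0): 35}
n = sum(counts.values())

def cell_probs(prev, sens1, spec1, sens2, spec2):
    """Cell probabilities for two conditionally independent tests (an assumption)."""
    probs = {}
    for t1 in (1, 0):
        for t2 in (1, 0):
            infected = prev * (sens1 if t1 else 1 - sens1) * (sens2 if t2 else 1 - sens2)
            healthy = (1 - prev) * (1 - spec1 if t1 else spec1) * (1 - spec2 if t2 else spec2)
            probs[(t1, t2)] = infected + healthy
    return probs

def solve_fixing_specificities(spec2):
    """Fix stool specificity at 1 and serology specificity at spec2,
    then solve exactly for prevalence and the two sensitivities."""
    p_stool_pos = (counts[(1, 1)] + counts[(1, 0)]) / n
    p_sero_pos = (counts[(1, 1)] + counts[(0, 1)]) / n
    # With perfect stool specificity, every stool-positive subject is infected,
    # so serology sensitivity is the serology-positive share among them.
    sens2 = counts[(1, 1)] / (counts[(1, 1)] + counts[(1, 0)])
    prev = (p_sero_pos - (1 - spec2)) / (sens2 - (1 - spec2))
    sens1 = p_stool_pos / prev
    return {"prev": prev, "sens1": sens1, "spec1": 1.0, "sens2": sens2, "spec2": spec2}

# Two hypothetical choices for the "known" serology specificity.
solutions = [solve_fixing_specificities(s) for s in (0.5, 0.7)]
for sol in solutions:
    fitted = cell_probs(**sol)
    max_gap = max(abs(fitted[c] - counts[c] / n) for c in counts)
    print({k: round(v, 3) for k, v in sol.items()}, f"max misfit = {max_gap:.1e}")
```

Both parameter sets reproduce the observed table essentially exactly, yet they disagree on prevalence and stool sensitivity, so the data alone cannot say which set of fixed values is right.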

Bayesian Solution
- Treat all five parameters as equal
- Place a prior distribution over each parameter
- At least two priors must be informative, to get around the identifiability problem
- Given the priors and the likelihood, get posteriors for all parameters via Bayes' theorem
- Contains the frequentist solution as a special case, with point priors on two parameters and uniform priors on the other three... a very unrealistic solution when looked at in this way
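One way to carry out these steps is a Gibbs sampler with latent true-infection counts, sketched below for the Strongyloides table. It assumes conditionally independent tests and Beta priors. The Beta parameters are illustrative assumptions picked to roughly match the prior summaries reported on the results slide; they are not given in the talk. This is a sketch of the general technique, not the BayesDiagnosticTests implementation.

```python
import random
import statistics

# Observed counts keyed by (stool result, serology result); 1 = positive.
counts = {(1, 1): 38, (1, 0): 2, (0, 1): 87, (0, 0): 35}

# Beta(a, b) priors. ASSUMPTION: illustrative values, not taken from the slides.
priors = {
    "prev": (1.0, 1.0),                       # uniform on prevalence
    "sens": [(4.44, 13.31), (21.96, 5.49)],   # stool, serology
    "spec": [(71.25, 3.75), (4.1, 1.76)],     # stool, serology
}

def gibbs(counts, priors, n_iter=3000, burn_in=500, seed=1):
    rng = random.Random(seed)
    n = sum(counts.values())
    # Start every parameter at its prior mean.
    prev = priors["prev"][0] / sum(priors["prev"])
    sens = [a / (a + b) for a, b in priors["sens"]]
    spec = [a / (a + b) for a, b in priors["spec"]]
    draws = {"prev": [], "sens": [], "spec": []}
    for it in range(n_iter):
        # Step 1: impute how many subjects in each cell are truly infected.
        infected = {}
        for cell, count in counts.items():
            p_inf, p_not = prev, 1 - prev
            for t, se, sp in zip(cell, sens, spec):
                p_inf *= se if t else 1 - se
                p_not *= 1 - sp if t else sp
            w = p_inf / (p_inf + p_not)
            infected[cell] = sum(rng.random() < w for _ in range(count))
        # Step 2: draw each parameter from its conjugate Beta full conditional.
        total_inf = sum(infected.values())
        a, b = priors["prev"]
        prev = rng.betavariate(a + total_inf, b + n - total_inf)
        for j in range(2):
            tp = sum(infected[c] for c in counts if c[j])   # true positives
            fn = total_inf - tp                             # false negatives
            fp = sum(counts[c] - infected[c] for c in counts if c[j])
            tn = (n - total_inf) - fp
            a, b = priors["sens"][j]
            sens[j] = rng.betavariate(a + tp, b + fn)
            a, b = priors["spec"][j]
            spec[j] = rng.betavariate(a + tn, b + fp)
        if it >= burn_in:
            draws["prev"].append(prev)
            draws["sens"].append(tuple(sens))
            draws["spec"].append(tuple(spec))
    return draws

draws = gibbs(counts, priors)
prev_median = statistics.median(draws["prev"])
print(f"posterior median prevalence: {prev_median:.2f}")
```

Replacing two of the priors with point masses and the rest with uniform priors would mimic the frequentist solution described above, which is the sense in which it is a special case.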

Bayesian Solution - Need For User Friendly Software
- Thousands of articles discussing new statistical methods are published every year
- Only a very small percentage of these find use in real applications
- An even smaller proportion of new methods find use by researchers other than the developers
- Reasons:
  - Too difficult for most non-statisticians to understand
  - Lack of recognition for applied work (not here!)
  - Even if understandable, lack of time to program
- Useful to provide user friendly software

Example: BayesDiagnosticTests Software
- Windows-based exe file
- Input data + priors through user friendly fill-in-the-blanks windows
- Runs a Gibbs sampler using WinBUGS by itself
- When ready (typically a few minutes) pops up posterior distributions + graphs
- Extensive manual, free help via email
- Let's look at the results from the Strongyloides example

Results: Strongyloides Example
(interval shown beneath each estimate)

                                     Stool Examination     Serology
                          Prev       Sens       Spec       Sens       Spec
Prior information         0.50       0.24       0.95       0.81       0.72
                          0.03-0.98  0.07-0.47  0.89-0.99  0.63-0.92  0.31-0.96
Stool examination alone   0.74       0.30       0.95
                          0.41-0.98  0.21-0.47  0.88-0.99
Serology alone            0.80                             0.83       0.58
                          0.23-0.99                        0.73-0.92  0.22-0.94
Both tests combined       0.76       0.31       0.96       0.89       0.67
                          0.52-0.91  0.22-0.44  0.91-0.99  0.80-0.95  0.36-0.95

Application to Administrative Database Research
- Primary data collection can be expensive
- Researchers are increasingly using information collected in administrative databases (e.g. RAMQ)
- Such databases typically contain substantial proportions of misclassification errors (e.g. diagnoses).

Example: Prevalence of OA in 65+ from RAMQ data
Three imperfect clues about OA are available:
- ICD-9 diagnostic code for OA
- At least one prescription for acetaminophen or an NSAID, but not methotrexate or plaquenil
- Received an injection common in OA, an arthroplasty, or a tibial osteotomy

Methods
- Similar to case with two tests, but can use data from one, two, or all three tests (7 combinations in all)
- Priors not needed if all three tests are used
- Idea is to compare results from a variety of models, to check robustness of prevalence estimates
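The "7 combinations" and the remark about priors both follow from simple counting, sketched below. It assumes conditionally independent tests, so that k tests give 2^k - 1 degrees of freedom and 2k + 1 unknown parameters (prevalence plus a sensitivity and a specificity per test). The short test labels are illustrative.

```python
from itertools import combinations

tests = ["physician diagnosis", "medication", "medical acts"]

# Every non-empty subset of the three tests: 3 singles + 3 pairs + 1 triple.
subsets = [c for k in range(1, len(tests) + 1) for c in combinations(tests, k)]

summary = []
for subset in subsets:
    k = len(subset)
    df = 2 ** k - 1        # free cell probabilities in a 2^k table
    params = 2 * k + 1     # prevalence + sensitivity and specificity per test
    summary.append((subset, df, params))
    status = "identifiable without priors" if df >= params else "needs informative priors"
    print(f"{' + '.join(subset):50s} df={df} params={params}  {status}")
```

Only the three-test model has as many degrees of freedom as parameters, which matches the statement that informative priors are not needed in that case.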

OA data from RAMQ database

Test 1            Test 2        Test 3          Number of
Phys. Diagnosis   Medication    Medical Acts    individuals observed
+                 +             +                11,816
+                 +             -                57,222
+                 -             +                 3,320
+                 -             -                25,651
-                 +             +                 9,610
-                 +             -               260,923
-                 -             +                 5,002
-                 -             -               595,415

Test 1 +ve = 11,816 + 57,222 + 3,320 + 25,651 = 98,009
N = 968,959

Results with no Misclassification Adjustment
- Naive estimate from physician diagnosis, ignoring misclassification error, is 10.1%, 95% Credible Interval (CrI) 10.1-10.2
- Very narrow interval, but accounts only for uncertainty due to random variation, not for extra variability due to misclassification
- How much confidence can we place in this seemingly very accurate estimate?
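The naive figures can be checked from the counts alone. The sketch below assumes a uniform Beta(1, 1) prior on prevalence and approximates the resulting Beta posterior by a normal distribution, which should be adequate at this sample size. Both choices are assumptions made for illustration, not a description of how the slide's interval was computed.

```python
from math import sqrt
from statistics import NormalDist

positives = 98_009      # physician-diagnosis positives (Test 1 +ve)
n = 968_959             # individuals in the database

# Posterior under a uniform prior is Beta(a, b).
a = positives + 1
b = n - positives + 1
post_mean = a / (a + b)
post_sd = sqrt(a * b / ((a + b) ** 2 * (a + b + 1)))

# Normal approximation to the Beta posterior for the 95% interval.
approx = NormalDist(post_mean, post_sd)
lower, upper = approx.inv_cdf(0.025), approx.inv_cdf(0.975)

print(f"naive prevalence: {post_mean:.1%}")
print(f"95% interval:     {lower:.2%} to {upper:.2%}")
```

The interval is narrow only because nearly a million records drive the random error towards zero; nothing in this calculation reflects diagnoses that are wrong.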

Results with Misclassification Adjustment
(values in %; interval shown beneath each estimate)

                             Prev       Sens 1     Sens 2     Sens 3     Spec 1     Spec 2     Spec 3
Prior distribution           50.0       75.0       75.0       25.0       95.0       60.0       95.0
                             0.0-100    70.0-80.0  70.0-80.0  20.0-30.0  90.0-100   55.0-65.0  90.0-100

One Test
Physician diagnosis alone    11.5       72.8                             95.4
                             4.5-14.2   63.2-82.4                        92.8-99.9
Prescribed medication alone  9.5                   72.3                             71.1
                             3.3-22.0              62.5-81.9                        66.3-75.6
Medical procedure alone      10.6                             22.1                             99.2
                             5.2-18.6                         14.7-33.8                        97.9-99.9

Two Tests
Combination of physician     11.8       75.1       76.1                  98.9       70.5
diagnosis and prescribed     8.6-14.8   68.4-81.4  73.9-78.4             98.1-99.1  70.1-71.3
medication
Combination of physician     9.8        74.1                  23.5       96.7                  99.2
diagnosis and                6.4-13.7   63.2-83.2             15.9-31.1  94.3-99.6             98.6-99.3
medical acts
Combination of               10.0                  77.6       24.0                  70.6       99.6
prescribed medication        6.8-16.4              72.2-83.3  18.2-38.2             68.2-72.5  99.3-99.9
and medical acts

Three Tests
3 tests using                14.8       58.2       78.3       18.2       98.1       72.4       99.5
non-informative priors       14.5-15.1  57.0-59.0  77.6-79.0  17.8-18.5  98.0-98.3  72.3-72.6  99.5-99.5
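For the three-test row, a rough cross-check is possible without any sampler. The sketch below fits the same latent class structure by maximum likelihood using the EM algorithm. This is a swapped-in technique, not the Gibbs sampler used in the talk, and it assumes conditionally independent tests. With non-informative priors and nearly a million records, the maximum likelihood estimates would be expected to land near the posterior summaries, but that is an expectation rather than a guarantee. Starting values are taken from the prior medians in the table above.

```python
# Counts keyed by (physician diagnosis, medication, medical acts); 1 = positive.
counts = {
    (1, 1, 1): 11816, (1, 1, 0): 57222, (1, 0, 1): 3320, (1, 0, 0): 25651,
    (0, 1, 1): 9610, (0, 1, 0): 260923, (0, 0, 1): 5002, (0, 0, 0): 595415,
}
n = sum(counts.values())

def class_probs(cell, prev, sens, spec):
    """Joint probability of a result pattern with each true disease status."""
    with_oa, without_oa = prev, 1 - prev
    for t, se, sp in zip(cell, sens, spec):
        with_oa *= se if t else 1 - se
        without_oa *= 1 - sp if t else sp
    return with_oa, without_oa

def em(counts, prev, sens, spec, n_iter=20000):
    for _ in range(n_iter):
        # E-step: expected number of true OA cases in each cell.
        exp_oa = {}
        for cell, count in counts.items():
            p_oa, p_no = class_probs(cell, prev, sens, spec)
            exp_oa[cell] = count * p_oa / (p_oa + p_no)
        # M-step: re-estimate parameters as weighted proportions.
        total_oa = sum(exp_oa.values())
        prev = total_oa / n
        sens = [sum(exp_oa[c] for c in counts if c[j]) / total_oa
                for j in range(3)]
        spec = [sum(counts[c] - exp_oa[c] for c in counts if not c[j]) / (n - total_oa)
                for j in range(3)]
    return prev, sens, spec

# Starting values: prior medians from the table (as proportions).
prev, sens, spec = em(counts, 0.5, [0.75, 0.75, 0.25], [0.95, 0.60, 0.95])

# How well the fitted model reproduces the observed cell proportions.
fitted = {c: sum(class_probs(c, prev, sens, spec)) for c in counts}
max_gap = max(abs(fitted[c] - counts[c] / n) for c in counts)

print(f"prevalence: {prev:.1%}")
print("sensitivities:", [f"{s:.1%}" for s in sens])
print("specificities:", [f"{s:.1%}" for s in spec])
print(f"largest misfit in any cell probability: {max_gap:.1e}")
```

EM returns point estimates only, so it says nothing about interval width; the credible intervals in the table require the full posterior.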

Comparison of Estimates
- Usual estimate: 10.1% (10.1-10.2), width = 0.1
- Adjusted estimate (3 tests): 14.8% (14.5-15.1), width = 0.6
- Bias of roughly 50%, CrI width grows by a factor of 6
- Other plausible estimates range from 3.3% to 22%
- Must admit true answer not really known
- All estimates depend on unverifiable assumptions
- Problems carry over to regressions based on risk factors for OA, etc.

Extensions
- Different numbers of tests
- Correlations among dichotomous tests
- Continuous diagnostic test results (parametric + non-parametric)
- Combinations of continuous and dichotomous tests
- Correlations among continuous tests
- Hierarchical models for diagnostic test data

Conclusions
- Important to consider measurement/misclassification error in epidemiology (coming trend?)
- For real impact: not enough to develop methods; need user friendly software
- Beware of unadjusted results from administrative database research

References
- Joseph L, Gyorkos T, Coupal L. Bayesian estimation of disease prevalence and the parameters of diagnostic tests in the absence of a gold standard. American Journal of Epidemiology 1995;141(3):263-272. (Dichotomous tests)
- Rahme E, Joseph L, Gyorkos T. Bayesian sample size determination for estimating binomial parameters from data subject to misclassification. Applied Statistics 2000;49(1):119-128. (Design of studies with misclassification)
- Dendukuri N, Joseph L. Bayesian approaches to modeling the conditional dependence between multiple diagnostic tests. Biometrics 2001;57(1):208-217. (Dependent tests)
- Bernatsky S, Joseph L, Bélisle P, Boivin J, Rajan R, Moore A, Clarke A. A Bayesian hierarchical model for estimating the properties of cancer ascertainment methods in cohort studies. Statistics in Medicine 2005;24:2365-2379. (Hierarchical model)
- Carabin H, Marshall C, Joseph L, Riley S, Olveda R, McGarvey S. Estimating and modelling the dynamics of the intensity of infection with Schistosoma japonicum in villagers of Leyte, Philippines. Part I: A Bayesian cumulative logit model. American Journal of Tropical Medicine and Hygiene 2005;72(6):745-753. (Logistic regression with misclassification)
- Ladouceur M, Rahme E, Pineau C, Joseph L. Robustness of prevalence estimates derived from misclassified data from administrative databases. Biometrics 2007;63:272-279. (Application to Administrative Data)
- Scott A, Joseph L, Bélisle P, Behr M, Schwartzman K. Bayesian estimation of tuberculosis clustering rates from DNA sequence data. Statistics in Medicine 2007 (to appear). (Continuous tests)