Evidence-based Imaging: Critically Appraising Studies of Diagnostic Tests


Aine Marie Kelly, MD

Critically Appraising Studies of Diagnostic Tests
Aine Marie Kelly, B.A., M.B. B.Ch. B.A.O., M.S., M.R.C.P.I., F.R.C.R.
No financial disclosures

Evidence Based Radiology
- Integrates the best available research evidence with clinical expertise and patient values
Sackett et al. Evidence Based Medicine, how to practice and teach EBM. Elsevier Churchill Livingstone 2005.

How do we practice Evidence Based Medicine?
- Formulate a clinical question
- Identify medical literature
- Critically appraise the medical literature
- Summarize the evidence
- Apply the evidence to derive the appropriate clinical action
Sackett et al. Evidence Based Medicine, how to practice and teach EBM. Elsevier Churchill Livingstone 2005.

Critical Appraisal: Diagnostic Literature
- Grade the literature
- Technology assessment in radiology
- Assess materials and methods for validity and bias
- Assess the statistical strength of the results
- Specific questions for imaging tests

Levels of Evidence for Diagnostic Tests
- Level 1 (ideal): systematic review (SR) of RCTs, RCT of appropriate size, validating cohort study, uniform good reference standard, appropriate population
- Level 2 (strong): SR of cohort studies, exploratory cohort study, good reference standard, selected population
- Level 3 (moderate): SR of case-control studies, outcomes research, case-control trials, non-consistent reference standard
- Level 4 (weak): case series, poor or non-independent reference standard
- Level 5 (very weak): clinical evidence, descriptive studies or reports of expert consensus committees
The Centre for Evidence Based Medicine, Oxford University, England. http://cebm.net/levels_of_evidence.asp#levels

Grading Evidence
- A: consistent level 1 studies
- B: consistent level 2 or 3 studies
- C: level 4 studies
- D: level 5 studies, or inconsistent or inconclusive studies of any level
Remedios D, McCoubrie P. Making the best use of clinical radiology services: A new approach to referral guidelines. Clin Radiol 2007; 62(10): 919-20.

Technology Assessment
- Level 1 = Technical efficacy
- Level 2 = Diagnostic accuracy efficacy
- Level 3 = Diagnostic thinking efficacy
- Level 4 = Therapeutic efficacy
- Level 5 = Patient outcome efficacy
- Level 6 = Societal efficacy
Thornbury JR. Acad Radiol 6 (1999), pp. S58-S65. Fryback and Thornbury. Med Decis Making 11 (1991), pp. 88-94. Mackenzie and Dixon. Clin Radiol 50 (1995), pp. 513-518.

Imaging Effectiveness Hierarchy
- Can the modality produce the image? Noise, resolution (line pairs), MTF, grey scale, sharpness
- Yield of abnormal or normal cases in a series: sensitivity, specificity, PPV, NPV, ROC height and area
- Number of cases in which the modality was useful in making a diagnosis: pre- and post-test probability, likelihood ratios
- Number of times the modality was helpful in clinical decision making: altered or avoided treatments
- Percentage of patients improved with the test, or change in quality-adjusted life expectancy: expected value of information, cost-effectiveness per QALY
- Cost-benefit analysis, cost-effectiveness analysis from society's viewpoint
Thornbury JR. Acad Radiol 6 (1999); S58-S65. Fryback and Thornbury. Med Decis Making 11 (1991); 88-94. Mackenzie and Dixon. Clin Radiol 50 (1995); 513-518.

Materials and Methods: Assess for Validity
- Was there an independent, blind comparison with a reference (gold) standard of diagnosis?
- Was the diagnostic test evaluated in an appropriate spectrum of patients (like those in whom it would be used in practice)?
- Was the reference standard applied regardless of the diagnostic test result?
- Was the test (or cluster of tests) validated in a second, independent group of patients?
Dodd JD et al. Evidence-based radiology: how to quickly assess the validity and strength of publications in the diagnostic radiology literature. Eur Radiol. 2004 May;14(5):915-22.

Materials and Methods: Assess for Bias
- Is the study original?
- How were the patients recruited?
- Was recruitment bias avoided?
- Inclusion and exclusion criteria?
- Were the subjects studied in real-life circumstances?
- What was the sample size?

Materials and Methods: Assess for Bias
- What exactly did they do?
- What outcome is measured and why?
- Was systematic bias avoided or minimized?
- Was assessment blind?
- Was test review/diagnosis review bias avoided?
- Has workup/verification bias been avoided?
- Has spectrum bias been avoided?

Additional Issues: Diagnostic Tests
- Is this test potentially relevant to my practice?
- Did this validation study include an appropriate spectrum of participants?
- Was the test shown to be reproducible both within and between observers?
- Has a sensible normal range been derived from these results?
- Has the test been placed in the context of other potential tests in the diagnostic sequence for the condition?

Specific Questions that apply to Imaging Tests
- Has the imaging modality been described in sufficient detail to reproduce it in your department?
- Have the imaging tests being evaluated and the gold standard been performed to the same standard of excellence?
- Have generations of technology development within the same modality been adequately considered in the study design and discussion?
- Has radiation exposure been considered?
- Were images reviewed on hard copy or a monitor?
- Were images reviewed by a radiologist of sufficient experience?
Dodd JD et al. Evidence-based radiology: how to quickly assess the validity and strength of publications in the diagnostic radiology literature. Eur Radiol. 2004 May;14(5):915-22.

Results: Assessment of Statistical Strength
- Sensitivity
- Specificity
- Confidence intervals
- Positive predictive value
- Negative predictive value
- Likelihood ratios

Sensitivity and Specificity
- Sensitivity = the proportion of patients with the diagnosis who have a positive test
- Specificity = the proportion of patients without the diagnosis who have a negative test
- Independent of disease prevalence
- Diagnostic threshold = level of abnormality above which the test is considered positive and below which the test is considered negative

[Figure: "Ideal situation, no analysis required". "Disease absent" and "disease present" groups shown along an axis of increasing abnormality, with the threshold marked.]
[Figure: "But in reality". "Disease absent" and "disease present" groups shown along an axis of increasing abnormality, with the threshold marked.]
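The effect of the diagnostic threshold can be sketched in a few lines of Python. This is only an illustration: the abnormality scores are invented, the function name `sens_spec_at_threshold` is hypothetical, and a measurement is assumed to count as positive when it is at or above the threshold.

```python
def sens_spec_at_threshold(diseased, healthy, threshold):
    """Return (sensitivity, specificity) when values >= threshold are called positive."""
    true_pos = sum(1 for x in diseased if x >= threshold)
    false_neg = len(diseased) - true_pos
    true_neg = sum(1 for x in healthy if x < threshold)
    false_pos = len(healthy) - true_neg
    sensitivity = true_pos / (true_pos + false_neg)
    specificity = true_neg / (true_neg + false_pos)
    return sensitivity, specificity


# Invented abnormality scores, for illustration only (not from any study).
diseased = [4, 5, 6, 6, 7, 8, 9, 9]
healthy = [1, 2, 2, 3, 3, 4, 5, 6]

for threshold in (3, 5, 7):
    sens, spec = sens_spec_at_threshold(diseased, healthy, threshold)
    print(f"threshold {threshold}: sensitivity {sens:.3f}, specificity {spec:.3f}")
```

Because the two invented groups overlap, raising the threshold lowers sensitivity and raises specificity; no single cut-off separates them perfectly.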

Sensitivity and Specificity: 2 x 2 table

                     Reference or gold standard
Index test           +                      -                      Total
+                    True positive (A)      False positive (B)     A+B
-                    False negative (C)     True negative (D)      C+D
Total                A+C                    B+D

Sensitivity = true positives / (true positives + false negatives) = A / (A+C)
Specificity = true negatives / (true negatives + false positives) = D / (B+D)

SpPIn and SnNOut
- If a test has high sensitivity (Sn), a negative (N) result effectively rules out (Out) the diagnosis (SnNOut)
- If a test has high specificity (Sp), a positive (P) result effectively rules in (In) the diagnosis (SpPIn)

Confidence Intervals
- Sensitivity and specificity are point estimates
- Confidence intervals provide a measure of how closely these estimates approximate the truth
- Sample size influences the confidence interval
Dodd JD et al. Evidence-based radiology: how to quickly assess the validity and strength of publications in the diagnostic radiology literature. Eur Radiol. 2004 May;14(5):915-22.

Positive Predictive Value
- Positive predictive value = the proportion of patients with a positive test who have the disease
- PPV = (sensitivity x prevalence) / [sensitivity x prevalence + (1 - specificity) x (1 - prevalence)]
- PPV depends on prevalence of disease

Positive Predictive Value: 2 x 2 table
Positive predictive value = true positives / all positives = A / (A+B)
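As a rough sketch of how these quantities come out of the 2 x 2 table, the Python below computes sensitivity, specificity and PPV from cells A to D together with a 95% confidence interval for each. The slides do not say which interval method to use; the Wilson score interval is an assumption made here, and the counts and function names are hypothetical.

```python
import math


def wilson_interval(successes, total, z=1.96):
    """Wilson score interval for a proportion; z = 1.96 gives roughly 95% coverage."""
    if total <= 0:
        raise ValueError("total must be positive")
    p = successes / total
    denom = 1 + z**2 / total
    centre = (p + z**2 / (2 * total)) / denom
    half_width = z * math.sqrt(p * (1 - p) / total + z**2 / (4 * total**2)) / denom
    return centre - half_width, centre + half_width


def two_by_two(a, b, c, d):
    """a = true positives, b = false positives, c = false negatives, d = true negatives."""
    return {
        "sensitivity": (a / (a + c), wilson_interval(a, a + c)),
        "specificity": (d / (b + d), wilson_interval(d, b + d)),
        "ppv": (a / (a + b), wilson_interval(a, a + b)),
    }


# Hypothetical counts, chosen only to illustrate the arithmetic.
results = two_by_two(a=90, b=20, c=10, d=180)
for name, (estimate, (low, high)) in results.items():
    print(f"{name}: {estimate:.3f} (95% CI {low:.3f} to {high:.3f})")
```

Calling `wilson_interval(9, 10)` and `wilson_interval(90, 100)` gives intervals around the same point estimate of 0.9, with the smaller sample producing the wider interval, which is the sense in which sample size influences the confidence interval.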

Negative Predictive Value
- Negative predictive value = the proportion of patients with a negative test who do not have the disease
- NPV = [specificity x (1 - prevalence)] / [specificity x (1 - prevalence) + (1 - sensitivity) x prevalence]
- NPV depends on prevalence of the disease

Negative Predictive Value: 2 x 2 table
With false negatives in cell C and true negatives in cell D of the 2 x 2 table:
Negative predictive value = true negatives / all negatives = D / (C+D)

Predictive Values
Stein PD et al. MDCT for Acute PE. NEJM 2006;354(22):2717-27.

Likelihood Ratios
- The ratio of two probabilities
- Probability of a positive result in patients with disease divided by probability of a positive result in patients without disease
OR
- Probability of a negative result in patients with disease divided by probability of a negative result in patients without disease

Likelihood Ratios
- Calculated from sensitivity and specificity
- LR of a positive result = sensitivity / (1 - specificity)
- LR of a negative result = (1 - sensitivity) / specificity

Likelihood Ratios
- LRs range from 0 to infinity
- LR of 0 = excludes disease
- LR of infinity = confirms disease
- LR of 1 = test has no discriminating power
- LR of > 10 = strongly positive
- LR of < 0.1 = strongly negative
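The dependence of predictive values on prevalence, and the independence of likelihood ratios from it, can be checked numerically. The following is a minimal sketch using the formulas above; the sensitivity and specificity of 0.90 and the three prevalences are invented for illustration, and the function names are hypothetical.

```python
import math


def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) from sensitivity, specificity and prevalence."""
    ppv = (sensitivity * prevalence) / (
        sensitivity * prevalence + (1 - specificity) * (1 - prevalence)
    )
    npv = (specificity * (1 - prevalence)) / (
        specificity * (1 - prevalence) + (1 - sensitivity) * prevalence
    )
    return ppv, npv


def likelihood_ratios(sensitivity, specificity):
    """Return (LR+, LR-). A zero denominator is reported as infinity;
    degenerate 0/0 combinations are not treated specially in this sketch."""
    for value in (sensitivity, specificity):
        if not 0.0 <= value <= 1.0:
            raise ValueError("sensitivity and specificity must lie in [0, 1]")
    lr_pos = math.inf if specificity == 1.0 else sensitivity / (1.0 - specificity)
    lr_neg = math.inf if specificity == 0.0 else (1.0 - sensitivity) / specificity
    return lr_pos, lr_neg


# Invented test characteristics, for illustration only.
sens, spec = 0.90, 0.90
lr_pos, lr_neg = likelihood_ratios(sens, spec)
print(f"LR+ = {lr_pos:.2f}, LR- = {lr_neg:.2f}")

for prevalence in (0.05, 0.25, 0.50):
    ppv, npv = predictive_values(sens, spec, prevalence)
    print(f"prevalence {prevalence:.2f}: PPV {ppv:.3f}, NPV {npv:.3f}")
```

With the same test, PPV rises and NPV falls as prevalence increases, while LR+ and LR- stay fixed because they are built only from sensitivity and specificity.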

Fagan's Nomogram (Likelihood Ratio Nomogram): CTPA in PIOPED II
- Prevalence (pre-test probability) = 23.3%
- LR+ = 19.6
- LR- = 0.18
Stein PD et al. MDCT for Acute PE. NEJM 2006;354(22):2717-27.

How do we estimate the pre-test probability for a population or individual?
[Figure: probability scale from 0% to 100%, marked with an exclusion threshold and an action threshold.]
- 0%: very unlikely to have the disease
- 25%: probably does not have it
- 50%: don't know
- 75%: probably does have it
- 100%: very likely to have the disease
In practice, pre-test probability does not have to be expressed numerically.

Graph of Conditional Probabilities
- GCP = graph of pre-test versus post-test probability
- Ranges from 0 to 1 (or 100%) for the positive or negative LR of the test
- Web-based programs: input prevalence (pre-test probability), sensitivity and specificity
MacEneaney PM, Malone DE. The meaning of diagnostic test results: A spreadsheet for swift data analysis. Clin Radiol 2000;55:227-235. Also at www.evidencebasedradiology.net

[Figure: Graph of Conditional Probabilities, weak test. Post-test probability of disease (0 to 1) against pre-test probability of disease (0 to 1), with curves for test positive and test negative.]
[Figure: Graph of Conditional Probabilities, strong test. Post-test probability of disease (0 to 1) against pre-test probability of disease (0 to 1), with curves for test positive and test negative.]
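What the nomogram and the graph of conditional probabilities do graphically can be sketched in code: convert the pre-test probability to odds, multiply by the likelihood ratio, and convert back. The pre-test probability and likelihood ratios are the PIOPED II figures quoted on the slide; the function names are hypothetical, and the curve helper is only an approximation of what the cited spreadsheet plots.

```python
def post_test_probability(pre_test_probability, likelihood_ratio):
    """Bayes' rule in odds form: post-test odds = pre-test odds x LR."""
    if not 0.0 <= pre_test_probability <= 1.0:
        raise ValueError("pre-test probability must lie in [0, 1]")
    if likelihood_ratio < 0:
        raise ValueError("likelihood ratio must be non-negative")
    if pre_test_probability == 1.0:
        return 1.0
    pre_odds = pre_test_probability / (1.0 - pre_test_probability)
    post_odds = pre_odds * likelihood_ratio
    return post_odds / (1.0 + post_odds)


def conditional_probability_curve(likelihood_ratio, steps=10):
    """Points (pre-test, post-test) for one curve of the graph of conditional probabilities."""
    return [
        (i / steps, post_test_probability(i / steps, likelihood_ratio))
        for i in range(steps + 1)
    ]


# Values quoted on the slide for CTPA in PIOPED II.
pre_test = 0.233
lr_positive = 19.6
lr_negative = 0.18

after_positive = post_test_probability(pre_test, lr_positive)
after_negative = post_test_probability(pre_test, lr_negative)
print(f"post-test probability after a positive test: {after_positive:.3f}")
print(f"post-test probability after a negative test: {after_negative:.3f}")
```

With these inputs a positive result moves the probability from 23.3% to about 86%, and a negative result moves it to about 5%. A likelihood ratio of 1 leaves the probability unchanged, so its curve is the diagonal of the graph.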

Graph of Conditional Probabilities: MDCT in PIOPED II
[Figure: post-test probability of disease (0 to 1) against pre-test probability of disease (0 to 1), with curves for test positive and test negative.]
Stein PD et al. MDCT for Acute PE. NEJM 2006;354(22):2717-27.

Critical Appraisal Guides
- STAndards for the Reporting of Diagnostic accuracy studies (STARD)
- CONsolidated Standards Of Reporting Trials (CONSORT)
- STrengthening the Reporting of OBservational studies in Epidemiology (STROBE)
- Standards for QUality Improvement Reporting Excellence (SQUIRE)
- Transparent Reporting of Evaluations with Non-randomized Designs (TREND)

STARD Guidelines
Bossuyt PM et al. Towards complete and accurate reporting of studies of diagnostic accuracy: The STARD Initiative. Radiology 2003 Jan;226(1):24-8.
1. Identify the article as a study of diagnostic accuracy (recommend MeSH heading 'sensitivity and specificity').
2. State the research questions or study aims, such as estimating diagnostic accuracy or comparing accuracy between tests or across participant groups.
3. The study population: the inclusion and exclusion criteria, setting and locations where data were collected.
4. Participant recruitment: was recruitment based on presenting symptoms, results from previous tests, or the fact that the participants had received the index tests or the reference standard?
5. Participant sampling: was the study population a consecutive series of participants defined by the selection criteria in items 3 and 4? If not, specify how participants were further selected.
6. Data collection: was data collection planned before the index test and reference standard were performed (prospective study) or after (retrospective study)?
7. The reference standard and its rationale.
8. Technical specifications of material and methods involved, including how and when measurements were taken, and/or cite references for index tests and reference standard.
9. Definition of and rationale for the units, cut-offs and/or categories of the results of the index tests and the reference standard.
10. The number, training and expertise of the persons executing and reading the index tests and the reference standard.
11. Whether or not the readers of the index tests and reference standard were blind (masked) to the results of the other test, and describe any other clinical information available to the readers.
12. Methods for calculating or comparing measures of diagnostic accuracy, and the statistical methods used to quantify uncertainty (e.g. 95% confidence intervals).
13. Methods for calculating test reproducibility, if done.
14. When the study was performed, including beginning and end dates of recruitment.
15. Clinical and demographic characteristics of the study population (at least information on age, gender, spectrum of presenting symptoms).
16. The number of participants satisfying the criteria for inclusion who did or did not undergo the index tests and/or the reference standard; describe why participants failed to undergo either test (a flow diagram is strongly recommended).
17. Time interval between the index tests and the reference standard, and any treatment administered in between.
18. Distribution of severity of disease (define criteria) in those with the target condition; other diagnoses in participants without the target condition.
19. A cross tabulation of the results of the index tests (including indeterminate and missing results) by the results of the reference standard; for continuous results, the distribution of the test results by the results of the reference standard.
20. Any adverse events from performing the index tests or the reference standard.
21. Estimates of diagnostic accuracy and measures of statistical uncertainty (e.g. 95% confidence intervals).
22. How indeterminate results, missing data and outliers of the index tests were handled.
23. Estimates of variability of diagnostic accuracy between subgroups of participants, readers or centers, if done.
24. Estimates of test reproducibility, if done.
25. Discuss the clinical applicability of the study findings.

References
- Sharon E. Straus, W. Scott Richardson, Paul Glasziou and R. Brian Haynes. Evidence Based Medicine. How to practice and teach EBM. Third Edition. Elsevier Churchill Livingstone 2005.
- Gordon Guyatt MD and Drummond Rennie MD. Users' Guides to the Medical Literature. Essentials of Evidence-Based Clinical Practice. AMA Press 2002.
- Trisha Greenhalgh. How to Read a Paper. The basics of Evidence Based Medicine. Third Edition. Blackwell Publishing 2006.
- L. Santiago Medina and C. Craig Blackmore. Evidence Based Imaging. Springer 2006.