Understanding Diagnostic Research: Outline of Topics

University of Freiburg (Albert-Ludwigs-Universität Freiburg)
Freiräume für wissenschaftliche Weiterbildung

Werner Vach, Veronika Reiser, Izabela Kolankowska

This project is funded by the Federal Ministry of Education and Research (Bundesministerium für Bildung und Forschung) and by the European Social Fund of the European Union.

In this document we describe the structure of the script of the planned module Understanding Diagnostic Research, as well as the basic points to be considered in each section of the script.

1. The ideal story of developing a new diagnostic test
- idea: something can help us to find out something about the disease status of a patient
- developing a measurement/test instrument: stable and reliable assay / good instrument to assess symptoms etc.
- proof of principle: comparing definitely diseased with definitely disease-free subjects
- assessing accuracy in a clinical population of interest (needs gold standard)
- studying impact: are clinicians really using the information as expected?
- if consequences are unclear: studying benefit in randomised trials
- predicting the expected benefit for society / HTA
- decision to include in routine care
- post-introduction investigation
- biomarkers: slight difference, as proof of principle / accuracy can be studied retrospectively

2. Identifying the clinical situation: For whom do we want to know what?
- clinical target situation: where should patients meet the test?
- consequence of information: treatment decisions, further diagnostic procedures, other management decisions?
- clinical definition of target population
- setting to be applied (GP, specialized university department, developing countries...)
- typical situations: screening, high-risk group screening, primary diagnosis, differential diagnosis, staging, response evaluation, monitoring

3. What is a diagnostic test?
- binary rule
- everything giving information, from a single question on a symptom to a high-tech image or molecular marker
- can be wrong!
- information on error probability essential for interpretation and communication
- to be carefully distinguished from gold standard: telling the truth

4. Interpreting and communicating the results of a diagnostic test: predictive values
- Patient with abdominal pain in a GP practice: What can I tell the patient if the ultrasound image shows a shadow?
- Look at all your patients with the same symptoms and the same test result: How many of them were diseased?
- Tell the patient!
- Use it yourself to decide on treatment or further diagnostic steps

5. Describing the quality of a diagnostic test: sensitivity and specificity
- How good is a test at catching the subjects it should catch and at not catching the subjects it should not catch?
- sensitivity and specificity from your patient data
- sensitivity and specificity from a study (simple accuracy studies)
- gold standard vs. reference standard
- sensitivity and specificity can be population dependent
- population-based vs. case-control studies
- statistics: the need for confidence intervals

6. The relation between predictive values and sensitivity/specificity
- Bayes formula: computing predictive values if sensitivity, specificity, and prevalence are known
- dependence of predictive values on the prevalence

7. Sequential application of diagnostic tests
- typical way today: sequential application in dependence on previous results, until there is sufficient evidence for presence or absence
- transformation of pre-test probabilities to post-test probabilities

8. Further tools to communicate the value of a diagnostic test
- likelihood ratio, diagnostic odds ratio, number needed to diagnose

9. Comparison of diagnostic tests: conceptual issues
- Simple question: Which of two tests is the better one? Difficult to answer if one has the better sensitivity and the other the better specificity.
- Balancing sensitivity against specificity: What are the consequences of false positive and false negative decisions? What is more important?
- considering weighted averages: formal approaches
- role of prevalence
- difficulty of agreeing on weights: ranges of weights
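
As an illustration of topics 6 to 8, the following minimal Python sketch computes predictive values from sensitivity, specificity, and prevalence via the Bayes formula, and transforms a pre-test probability into a post-test probability using a likelihood ratio. The function names and all numerical values are hypothetical and chosen only for illustration; the sequential step additionally assumes that the two tests are conditionally independent given the disease status, which need not hold in practice.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) from the Bayes formula."""
    true_pos = sensitivity * prevalence
    false_pos = (1 - specificity) * (1 - prevalence)
    true_neg = specificity * (1 - prevalence)
    false_neg = (1 - sensitivity) * prevalence
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


def likelihood_ratios(sensitivity, specificity):
    """Return (positive LR, negative LR)."""
    lr_pos = sensitivity / (1 - specificity)
    lr_neg = (1 - sensitivity) / specificity
    return lr_pos, lr_neg


def post_test_probability(pre_test_probability, likelihood_ratio):
    """Convert a pre-test probability to a post-test probability via odds."""
    pre_odds = pre_test_probability / (1 - pre_test_probability)
    post_odds = pre_odds * likelihood_ratio
    return post_odds / (1 + post_odds)


# Hypothetical test characteristics, for illustration only.
sens, spec = 0.90, 0.80

# Dependence of the predictive values on the prevalence.
for prev in (0.01, 0.10, 0.50):
    ppv, npv = predictive_values(sens, spec, prev)
    print(f"prevalence={prev:.2f}  PPV={ppv:.3f}  NPV={npv:.3f}")

# Sequential application: two positive results in a row, assuming the
# tests are conditionally independent given the disease status.
lr_pos, lr_neg = likelihood_ratios(sens, spec)
after_first = post_test_probability(0.10, lr_pos)
after_second = post_test_probability(after_first, lr_pos)
print(f"after first positive:  {after_first:.3f}")
print(f"after second positive: {after_second:.3f}")
```

With these hypothetical values, the post-test probability after one positive result coincides with the positive predictive value at the same prevalence (one third at a prevalence of 0.10), which shows that the Bayes formula and the likelihood ratio approach are two views of the same computation.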

10. Comparison of diagnostic tests: statistical issues of (paired) accuracy studies
- confidence intervals for difference in sensitivity, specificity, or predictive values
- p-values (McNemar test)
- confidence intervals for weighted differences
- comparison if gold standard unknown for double negatives

11. Impact of imperfect reference standards
- bias: direction difficult
- comparative studies: balancing out?

12. How to plan, conduct and report an accuracy study
- population
- blinding
- sample size considerations

13. Sources of bias in accuracy studies
- spectrum bias
- verification bias
- work-up bias

14. Why diagnostic tests have to be binary
- the problem of evaluating a diagnostic test with the category "indefinite"

15. Evaluation of continuous diagnostic markers
- the idea behind ROC curves
- comparison of ROC curves
- AUC, CIs and p-values
- the non-existence of optimal cut points
- pragmatic ways to determine cut points (sens, spec, prevalence match)
- variants of ROC curves

16. Combining diagnostic markers or tests
- basic approach: estimating P(D=1 | X) and regarding the resulting score as a continuous marker
- standard: logistic regression
- the need for cross-validation of AUCs, sensitivity, specificity
- trees
- modern methods: random forests, Lasso etc.

17. Assessing the clinical benefit by randomised trials
- idea of benefit studies
- patient-relevant outcomes
- designs
- sample size considerations

18. Assessing the clinical benefit by decision modelling
- idea
- why difficult
- role prior to RCTs

19. Establishing diagnostic procedures as gate keepers
- the idea and role of "enrichment designs"

20. Establishing diagnostic procedures as predictive factors
- the idea of "interaction" designs

21. Determining factors with influence on accuracy
- regression models for sensitivity/specificity/AUC

22. Meta-analyses in diagnostic research

23. Outlook: Future directions in diagnostic research

Appendix I: Statistical methods to be used in diagnostic research
- confidence intervals for proportions
- a derivation of the Bayes formula
- confidence intervals for computed predictive values
- confidence intervals for the difference of proportions (paired and unpaired)
- CIs for AUCs, p-values for comparing AUCs, CIs for difference in AUCs (paired and unpaired)

Appendix II: Formal systems of phases in diagnostic research

Contact: Werner Vach, Center for Medical Biometry and Medical Informatics, University of Freiburg, Stefan-Meier-Str. 26, 79104 Freiburg, Germany, email: wv@imbi.uni-freiburg.de

Freiburg, March 2014