Diagrammatic Reasoning Meets Medical Risk Communication

Similar documents
Confirmation Bias. this entry appeared in pp of in M. Kattan (Ed.), The Encyclopedia of Medical Decision Making.

Pace yourself: Improving time-saving judgments when increasing activity speed

Elaboration for p. 20:

Simple Tools for Understanding Risks: From Innumeracy to Insight

The Importance of Numeracy for Decision Making in Health. Wendy Nelson National Cancer Institute August 14, 2012

Communicating Risk & Benefit to Health Care Providers & Patients

Student Performance Q&A:

Statistical Literacy in Obstetricians and Gynecologists

Issues and Challenges in the Era of Shared Decision-making: Explaining Risk and Uncertainty

Screening policy across Europe has evolved with regard to its goal: Participation informed participation informed decision for or against

Lesson Building Two-Way Tables to Calculate Probability

Our goal in this section is to explain a few more concepts about experiments. Don t be concerned with the details.

Six of One, Half Dozen of the Other Expanding and Contracting Numerical Dimensions Produces Preference Reversals

Research Methods in Forest Sciences: Learning Diary. Yoko Lu December Research process

Problem Set and Review Questions 2

Bayes Theorem Application: Estimating Outcomes in Terms of Probability

Readings: Textbook readings: OpenStax - Chapters 1 11 Online readings: Appendix D, E & F Plous Chapters 10, 11, 12 and 14

had non-continuous enrolment in Medicare Part A or Part B during the year following initial admission;

I. Introduction and Data Collection B. Sampling. 1. Bias. In this section Bias Random Sampling Sampling Error

5.3: Associations in Categorical Variables

Medication Risk/Benefit Communication: What role does disease severity play? Payal Patel. Honors Thesis. Pharmacy Department

Magnitude and accuracy differences between judgements of remembering and forgetting

Virtual Mentor American Medical Association Journal of Ethics April 2009, Volume 11, Number 4:

Gage R&R. Variation. Allow us to explain with a simple diagram.

Handout 11: Understanding Probabilities Associated with Medical Screening Tests STAT 100 Spring 2016

Plans for Surveillance of Acute Myocardial Infarction in users of Oral Anti Diabetes Drugs

Describe what is meant by a placebo Contrast the double-blind procedure with the single-blind procedure Review the structure for organizing a memo

NATIONAL INSTITUTE FOR HEALTH AND CARE EXCELLENCE

Cancer Research UK Tesco Charity of the Year Spotting the signs of cancer For men

Is a picture worth a thousand words? The interaction of visual display and attribute representation in attenuating framing bias

What s Time Got to Do with It? Inattention to Duration in Interpretation of Survival Graphs

Theoretical aspects of clinical reasoning and decision-making And their application to clinical teaching

Large Research Studies

Over the last 3 decades, advances in the understanding of

Introduction to Preference and Decision Making

Plain Numbers. Presenting numbers in ways that engage your members and patients. October 2015

Making comparisons. Previous sessions looked at how to describe a single group of subjects However, we are often interested in comparing two groups

MEA DISCUSSION PAPERS

Measures of Clinical Significance

Describe what is meant by a placebo Contrast the double-blind procedure with the single-blind procedure Review the structure for organizing a memo

Variability. After reading this chapter, you should be able to do the following:

Marilyn M. Schapira, MD, MPH University of Pennsylvania

observational studies Descriptive studies

An experimental investigation of consistency of explanation and graph representation

Communicating and tailoring risks and benefit information

Running Head: TRUST INACCURATE INFORMANTS 1. In the Absence of Conflicting Testimony Young Children Trust Inaccurate Informants

Review for Final Exam Math 20

How Does Analysis of Competing Hypotheses (ACH) Improve Intelligence Analysis?

Supplemental Material. Experiment 1: Face Matching with Unlimited Exposure Time

Problem #1 Neurological signs and symptoms of ciguatera poisoning as the start of treatment and 2.5 hours after treatment with mannitol.

MADIT-RIT: Simple programming change averts most inappropriate ICD therapy

Getting ready for propensity score methods: Designing non-experimental studies and selecting comparison groups

The Good Judgment Project: A Large Scale Test of Different Methods of Combining Expert Predictions

MBios 478: Systems Biology and Bayesian Networks, 27 [Dr. Wyrick] Slide #1. Lecture 27: Systems Biology and Bayesian Networks

Introductory Statistics Day 7. Conditional Probability

Best on the Left or on the Right in a Likert Scale

CHAPTER 15: DATA PRESENTATION

Lecture Notes Module 2

Overview: Part I. December 3, Basics Sources of data Sample surveys Experiments

Psychological. Influences on Personal Probability. Chapter 17. Copyright 2005 Brooks/Cole, a division of Thomson Learning, Inc.

Changing Public Behavior Levers of Change

(a) 50% of the shows have a rating greater than: impossible to tell

Biostatistics 2 - Correlation and Risk

Section 6: Analysing Relationships Between Variables

Implantable Cardioverter-Defibrillators (ICD)

SURVEY TOPIC INVOLVEMENT AND NONRESPONSE BIAS 1

Lecture Slides. Elementary Statistics Eleventh Edition. by Mario F. Triola. and the Triola Statistics Series 1.1-1

Designs. February 17, 2010 Pedro Wolf

The Epidemic Model 1. Problem 1a: The Basic Model

Statistics as a Tool. A set of tools for collecting, organizing, presenting and analyzing numerical facts or observations.

2016 Children and young people s inpatient and day case survey

Item Writing Guide for the National Board for Certification of Hospice and Palliative Nurses

Overview. Survey Methods & Design in Psychology. Readings. Significance Testing. Significance Testing. The Logic of Significance Testing

Hae Won KIM. KIM Reproductive Health (2015) 12:91 DOI /s x

Goal for Today. Making a Better Test: Using New Multiple Choice Question Varieties with Confidence

PRINCIPLES OF STATISTICS

Implantable Cardioverter-Defibrillators (ICD)

the standard deviation (SD) is a measure of how much dispersion exists from the mean SD = square root (variance)

Math 140 Introductory Statistics

Bugbears or Legitimate Threats? (Social) Scientists Criticisms of Machine Learning. Sendhil Mullainathan Harvard University

9 research designs likely for PSYC 2100

MULTIPLE CHOICE. Choose the one alternative that best completes the statement or answers the question.

Underestimating Correlation from Scatterplots

Welcome to this series focused on sources of bias in epidemiologic studies. In this first module, I will provide a general overview of bias.

AP STATISTICS 2008 SCORING GUIDELINES (Form B)

Why We Need Shared Decision Making. Jack Fowler Senior Scientific Advisor

ExperimentalPhysiology

More honoured in the breach

Factors influencing the perception of side effect risk information

Sheila Barron Statistics Outreach Center 2/8/2011

LSP 121. LSP 121 Math and Tech Literacy II. Topics. Risk Analysis. Risk and Error Types. Greg Brewster, DePaul University Page 1

Assessment of methodological quality and QUADAS-2

This article is the second in a series in which I

17/10/2012. Could a persistent cough be whooping cough? Epidemiology and Statistics Module Lecture 3. Sandra Eldridge

Evaluation Models STUDIES OF DIAGNOSTIC EFFICIENCY

Unit 1 Exploring and Understanding Data

Wendy J. Reece (INEEL) Leroy J. Matthews (ISU) Linda K. Burggraff (ISU)

The Use of Narrative Evidence and Explicit Likelihood by Decisionmakers Varying in Numeracy

Experimental Research. Types of Group Comparison Research. Types of Group Comparison Research. Stephen E. Brock, Ph.D.

The Efficacy of the Back School

Transcription:

Diagrammatic Reasoning Meets Medical Risk Communication Angela Brunstein (angela@brunstein.net) Department of S.A.P.E. - Psychology, American University in Cairo, AUC Ave, PO Box 74, New Cairo 11835, Egypt Joerg Brunstein (joerg@brunstein.net) Department of S.A.P.E. - Psychology, American University in Cairo, AUC Ave, PO Box 74, New Cairo 11835, Egypt Ali Marzuk (AM12334@rcsi-mub.com) Royal College of Surgeons in Ireland Bahrain, PO Box 15503, Adliya, Bahrain Abstract Informed consent for medical procedures requires that patients understand risks associated with diagnostic and treatment options. Similar to performance for diagrammatic reasoning and system dynamics, patients, physicians and medical students are reported to perform poorly on understanding medical risk-related information. At the same time, different presentation formats seem to support different kind of conclusions across domains. In this research, we investigated different formats of presenting risk information related to a treatment scenario with 22 medical and 50 nonmedical students. As expected, medical students performed better than non-medical students for all versions of the problem, while non-medical students could partially compensate missing medical knowledge with displays that reduce complexity and allow reducing cognitive load. This implies that it is possible to support patients decision making, but also highlights the need to educate patients on potential risks and benefits. Keywords: medical decision making, diagrammatic reasoning, risk evaluation Applying Diagrammatic Reasoning to Medical Risk Communication Diagrammatic reasoning and understanding complex system has been demonstrated to be difficult for several task domains (e.g., Cronin, Gonzalez, & Sterman, 2009). Performance for these kinds of tasks is better if information can be directly read out from diagrams compared to inferring it (e.g., Larkin & Simon, 1987). The other way around, specific presentation formats seem to support different kinds of conclusions. For example, for judging a car s fuel efficiency, presentation of gallons per mile is more promising than presentation of miles per gallon (Larrick & Soll, 2008). For the medical domain, understanding information related to diagnostic and treatment choices is essential to informed consent, evidence-based medicine and doctorpatient shared decision-making. As for understanding complex systems in general, there is evidence that both, physicians and patients have difficulties understanding risk related information (Mazur & Hickam, 1993; Windish, Huot, & Green, 2007). Similar to the diagrammatic reasoning literature, patients and undergraduates performance varies between different presentation formats (Gigerenzer et al., 2007; Shapira, Nattinger, & Mc Auliffe, 2006). Also medical students perform better for displays on accumulation problems than non-medical students for some, but not all medical scenarios (Brunstein, Gonzalez, & Kanter, 2010). In this study, we aimed to combine the lessons learned from diagrammatic reasoning for understanding a treatment scenario on ventricular fibrillation with medical students and non-medical undergraduates. For decision whether or not to undergo surgery to get an implantable cardioverter defibrillator (ICD) after surviving a heart attack, patients need to evaluate the risk of having another heart attack (i.e., severity of disease) and how likely an ICD can save their life during that heart attack (i.e., effectiveness of treatment). In the medical literature, common measures for those values come from clinical trials with number of patients surviving versus dying in treatment, in this case ICD, versus control groups, in this case heart medication only. For physicians, critical values impacting treatment decisions are absolute and relative risk reduction as estimated proportion of patients surviving due to treatment and number needed to treat as estimates of the number of patients who are exposed to potential side effects for saving one patient s life. For illustrating these measures, several formats of information presentation are used in the medical literature and on patient information leaflets or websites. These include tables, frequency arrays or bar graphs with numbers or proportions of patients dying versus surviving in different conditions. As for the miles per gallon example, patients perform better for understanding outcomes of clinical trials when presented in terms of frequencies and not in terms of probabilities (Gigerenzer et al., 2007). Unfortunately, the decision whether or not undergo surgery to receive an ICD requires both, understanding the research results and estimating the probability of success for themselves. Also, when presented with several treatment options, comparing several pairs of 100 smileys each can become very confusing. Therefore, it is not evident which kind of display might help patients best to make informed decisions. For each of these three domains, participants display a tendency for bias or errors. For the miles per gallon illusion (Larrik & Soll, 2008), participants tend to assume a linear instead curvilinear relationship and, therefore, they do not appreciate the increase of efficiency by replacing least 572

efficient cars. For health information, participants tend to neglect base rates or to confuse variables for calculations and, therefore, do not understand the value of screening or underestimate the risks associated with treatment. Different kinds of visualization do not change the concepts or required calculations, but they do change the likelihood of error in participants responses. Miles per gallons come with decimals instead naturals and with constant base rates. This makes the comparison much easier. Frequencies instead conditional probabilities have the base rates already integrated and require one step less for processing. At the same time, naturals are more convenient for calculations and comparisons than proportions with varying base rates. Applying these considerations to medical risk communication, all presentation formats are isomorph and display the same information. All of them allow comparison. At the same time, each of them invites for different strategies: Tables explicitly display number of patients, invite to calculate, but require numeracy and statistical literacy to get the correct number and to understand that result. It is challenging to visualize patients in different conditions from numbers. And patient numbers need to be converted to estimate personal odds of surviving or dying. Bar graphs illustrate proportions of patients and are therefore closest to personal risk. They invite to estimate, but they can be problematic given reported difficulties with probabilities (e.g., Gigerenzer et al., 2008). As for tables, it is difficult to imagine number of patients in different conditions. Frequency arrays as favored by Gigerenzer and colleagues (2008) work with naturals and do not require considering base rates. They illustrate numbers of patients and are intuitive to understand. As for tables, number of patients need to be converted to estimate personal risk. Similarly, separate or integrated displays are associated with different advantages and disadvantages. Separate displays can be directly mapped onto treatment conditions, but require comparing and integrating displays to derive conclusions on treatment effectiveness. Integrated displays highlight the difference and take processing away from participants, but make it more difficult to read the display. This holds especially for the part of patients that have survived in treatment condition, while the corresponding number of patients had died in control condition. 1. Based on the diagrammatic reasoning and system dynamics literature, we expected that medical students should outperform non-medical students due to their greater knowledge associated with medical risk communication and treatment options for that disease. 2. Given that background knowledge, we expected little impact of presentation formats on medical students performance for evaluating outcomes of clinical trials. 3. In contrast, we expected, differences in performance for different presentation formats for non-medical students. Because tables, arrays and bar graphs are differently suited to support performance on proportion versus number of patients, but also vary in complexity and in how intuitive they are to interpret, we had no specific hypotheses which presentation should be best for this task. Participants Method Fifty non-medical students (26 female, 24 male), mean age 21 years (SD =3) participated for course credit in this study and were randomly assigned to one of five presentation formats for a scenario on treating ventricular fibrillation with ICD (10 per condition). Forty medical students (19 female, 16 male, 18 dropped before demographic information) mean age 21 years (SD = 2) agreed to participate in this study. Due to high dropout, only 22 participants (4-5 per condition) completed the scenario. All participants who quitted before completing the survey, did so before performing the risk information scenario at the consent page or at the demographics page. Data from these participants were excluded from analysis. Design This study implemented a 5 (versions of visualization) x 2 (medical knowledge) between participants design for answering 4 questions associated with severity of disease and effectiveness of treatment. Materials Participants were presented with scenario on 5-year survival of patients with ventricular fibrillation from a clinical trial (based on Moss et al., 1996). According the description, one hundred patients had an ICD implanted in addition to traditional heart medication (treatment condition). The remaining 100 patients received heart medication only (control condition). This was the topic of a group project for first year medical students at RCSI. Therefore, medical students were familiar with the topic, but not with the specific data. Data were presented in one of five presentation formats: a table (see Figure 1a) or frequency arrays for number of patients dying versus surviving in treatment and control conditions (see Figure 1d and 1e) or bar graphs on proportions patients surviving versus dying (see Figure 1b and 1c). In addition for arrays and bar graphs, we either presented a pair of individual displays of patients surviving versus dying per condition or a combined display for both conditions. Participants answered four questions on simplified versions of risk reduction and number needed to treat (NNT), on severity of disease and estimated effectiveness of treatment. These questions are relevant for patients, for example, with ventricular fibrillation, for making informed decisions on their treatment. The first question asked for a number, the remaining questions were true/false statements: 573

Figure 1: Displays for patients surviving versus dying in treatment condition (ICD) and control condition (heart medication only) as (a) table, (b) integrated bar graph, (c) separate bar graphs, (d) integrated frequency array, and (e) separate frequency arrays. Table 1: Average performance (and standard deviations) for medical students (med; N = 22) and non-medical students (non-med; N =40) for different presentation formats (maximum score was 4). Table 1 Array 2 Arrays 1 Bar 2 Bars Total med 3.0 (1.0) 3.2 (0.8) 3.7 (0.6) 3.0 (0) 3.3 (0.5) 3.2 (0.3) non-med 1.6 (0.7) 2.5 (0.7) 2.0 (0.7) 2.1 (0.9) 1.7 (0.5) 2.0 (0.4) The first question asked for the difference of survivors between both conditions as a proxy for treatment effectiveness and as a component of absolute and relative risk reduction. Correct answer is 20. 574

The second question asked whether more patients die than survive as a proxy for the severity of the disease or for the necessity to treat. Correct answer is false. The third question asked whether more patients die with implant than without. The correct answer is false. These 3 questions are needed to calculate the number needed to treat to save one patient s life (NNT). For this scenario, 1 of 5 patients survives due to treatment and the remaining 4 of 5 suffer from potential side effects of treatment without benefit. One of five will die in both conditions and 3 of 5 will live in both conditions. The correct answer is true. Because all four questions ask for aspects that are relevant for the treatment decision and because of the small number of participants among medical students, we report accumulated scores below. For patients, the next question would be whether they want to have an ICD implanted. Procedure The study was conducted as an online experiment on surveymonkey.com. After providing informed consent and demographic information, participants answered the four questions on the treatment scenario. Total time was about 10 to 15 min. The IRB/ethics boards of the American University in Cairo and the Royal College of Surgeons in Ireland Bahrain had approved this research. Results As expected, medical students performed better than nonmedical students, F (1, 62) = 47.12, p <.001, η2 =.43. Given the low number of participants for the group of medical students, we analyzed the effect of versions on participants performance for the four questions separately for groups. For medical students, presentation format had no impact on performance, F (4, 17) = 0.64, η2 =.13. In tendency, medical students performed better for separate displays than for integrated displays (Bonferoni, all p s >.10). However, presentation format impacted performance of non-medical students, F (4, 45) = 2.96, p <.05, η2 =.21. Non-medical students performed better for the array than 2 bars or table (p s <.01) and in tendency better for integrated displays than for separate displays (p s >.10, see Table 1). This indicates that displays that reduce complexity and have potential for visual imagery better support non-medical students performance for evaluating medical risk related information. Discussion Understanding medical risk communication is essential for informed consent for treatment choices, evidence-based medicine and physician-patient shared decision-making. At the same time, literature on medical decision-making indicates that patients performance is far from perfect. This is matched by reported difficulties for diagrammatic reasoning and system dynamics. For our study, medical students performed better for evaluating treatment options than non-medical students. However, not all medical students scored 4 of 4 questions correctly. As for diagrammatic reasoning literature, in our study different formats were differently supportive for evaluating medical risk information in non-medical students. Potentially, knowledge of the task domain and presentation formats that match the required task can help with understanding risk information. For our task that required understanding of research results in terms of frequencies and estimating the patient s own chances of success in terms of probability, there is not one single format of diagram that serves all aspects of the task best. Therefore, it seems that displays that are intuitive (arrays of patient numbers) and displays that allow reducing cognitive load (a combined bar graph on patient proportions) seem to foster non-medical students performance. In contrast, medical students tended to profit from separate displays that can serve as external memory when calculating the statistical values. This means we should not leave it to the doctor to choose the presentation format for the patient because what is best for the expert does not match what is best for the layperson. For our study, dropout rates for medical students (18 of 40 dropped) were very different from non-medical students dropout rate. If eliminating the corresponding proportion of low-performing non-medical students, the effect of presentation format becomes weaker, but does not disappear completely. In addition, even the best performing group of non-medical students performs worse than any of the groups of medical students. This indicates that presentation format can promote performance for medical risk evaluation, but cannot replace domain knowledge for understanding implications of illustrated data. Therefore, when supporting patients treatment choices, we will need to educate them on potential risks and benefits in addition to provide intuitive displays. Acknowledgements This research was made possible by the Royal College in Ireland Bahrain s generous support. The data collection on medical students at RCSI Bahrain was conducted when Dr. Angela Brunstein was a Senior Lecturer in Psychology at RCSI Bahrain. The contents of this report are solely the responsibility of the authors and do not necessarily represent the official views of RCSI Bahrain. References Brunstein, A, Gonzalez, C, & Kanter, S. (2010). Effects of domain experience in the stock-flow failure. System Dynamics Review, 26, 347 354. Cronin, M., Gonzalez, C., & Sterman, J.D. (2009). Why don t well-educated adults understand accumulation? A challenge to researchers, educators, and citizens. Organizational Behavior and Human Decision Processes, 108, 116 130. 575

Gigerenzer, G., Gaissmaier, W., Kurz-Milcke, E., Schwartz, L.M., & Woloshin, S. (2007). Helping doctors and patients make sense of health statistics. Psychological Science in the Public Interest, 8, 53-96. Larkin, J.H., & Simon, H.A. (1987).Why a diagram is (sometimes) worth ten thousand words. Cognitive Science, 11, 65-99. Larrik, R.P., & Soll, J.B. (2008). The MPG illusion. Science, 320, 1593-94. Mazur, D.J, & Hickam, D.H. (1993). Patients and physicians interpretations of graphic data displays. Medical Decision Making, 13, 59 63. Moss, A.J., Jackson Hall, W., Cannon, D.S., Daubert, J.P., Higgins, S.L., et al. (1996). Improved survival with an implanted defibrillator in patients with coronary disease at high risk for ventricular arrhythmia. New England Journal of Medicine, 335, 1933-1940. Shapira, M.M., Nattinger, A.B., McAuliffe, T.L. (2006). The influence of graphical format on breast cancer risk communication. Journal of Health Communication, 11, 569-82. Windish, D.M., Huot, S.J., & Green, M.L. (2007). Medicine residents understanding of the biostatistics and results in the medical literature. JAMA, 298, 1010 1022. 576