Review: Conditional Probability. Using tests to improve decisions: Cutting scores & base rates

Conditional probabilities arise when the probability of one thing [A] depends on the probability of something else [B]. In such cases, we want to factor in the probability of B before we worry about A. This amounts to focusing on the elements that are likely to be picked out by both A and B.

Three ways

We can consider three ways to solve conditional probability questions (all exactly equivalent):
a.) Common sense
b.) Probability tables
c.) Bayes' Theorem

a.) Common sense

5 males, one wears a dress; 5 females, 4 wear dresses. What is the probability that you wear a dress, given that you are female?

First: We want to know how many people are both dress wearers and females = P(A and B) = 4/10.

Second: We want to know what proportion of all women are accounted for by the dress-wearing females = Dress-wearing females / Females = P(Female and dress-wearing) / P(Female) = (4/10) / (5/10) = 4/5.

b.) Probability Tables

What is the probability that you sometimes wear a dress, given that you are female?

          Dress   No dress
Male        1        4
Female      4        1

JUST IGNORE ALL THE MALES! Of the 5 females, 4 wear dresses, so the answer is again 4/5.
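The common-sense and table methods can be checked with a few lines of Python. This is only a sketch: the counts in the `counts` dictionary are an assumption chosen to match the dress example (10 people, 5 of each sex), and the variable names are made up for illustration.

```python
from fractions import Fraction

# Assumed counts for the dress example: rows are sex, columns are dress / no dress.
counts = {
    "male": {"dress": 1, "no_dress": 4},
    "female": {"dress": 4, "no_dress": 1},
}

total = sum(sum(row.values()) for row in counts.values())

# Common sense / definition: P(dress | female) = P(female and dress) / P(female)
p_female_and_dress = Fraction(counts["female"]["dress"], total)
p_female = Fraction(sum(counts["female"].values()), total)
by_definition = p_female_and_dress / p_female

# Table method: ignore the male row and look only at the female row.
by_table = Fraction(counts["female"]["dress"], sum(counts["female"].values()))

print(by_definition, by_table)
```

Both routes give the same fraction, which is the point of calling the methods equivalent.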

c.) Bayes' Theorem

What is the probability that you sometimes wear a dress, given that you are female?

P(A|B) = P(B|A) P(A) / P(B)

Proof: By definition,
(1.) P(A|B) = P(A and B) / P(B)
(2.) P(B|A) = P(A and B) / P(A)
(3.) P(B|A) P(A) = P(A and B) [Multiply (2.) by P(A)]
(4.) P(B|A) P(A) = P(A|B) P(B) [Substitute (1.) in (3.)]
(5.) P(B|A) P(A) / P(B) = P(A|B) [Divide by P(B)]

P(Dress-wearing|Female) = P(Female|Dress-wearing) P(Dress-wearing) / P(Female) = (4/5 * 5/10) / (5/10) = 4/5

Why use ^$#*! Bayes' Theorem?

Bayes' Theorem is not intended to confuse, but to simplify: you can use it to get the probability relation between any two cells in the 2x2 table. It can also be generalized to more complex situations. However, in this class we won't go outside of 2x2 conditional probability tables: so just draw a picture or think it through if you prefer!

What is a cutting score or cutting line? How shall we evaluate how good any given test is?

What is a cutting score or cutting line?

In many tests we have criteria: if a subject scores above score X, they are likely to be Y [a genius, a moron, a prospect, likely to die in six months]. X is a cutting score.

Note that this is a conditional probability: P(diagnosis|test result).

Note also that in this case the probability of X [test result] is not "given by God": we test designers are free to change the cutting score as we like. In doing so, we can change P(diagnosis|test result).

As an example, think of the probability that a person is a genius (defined, let's say, as IQ > 130) given that they got an IQ score of 128, on the one hand, or 110, on the other.
Assume the standard error for IQ is 10 points. Then there is a fair chance that a person who got 128 has an IQ above 130, but a very small (but non-zero) chance that a person who got 110 has an IQ above 130.

If we used 110 as a cutting score for genius, we'd be wrong a lot: P(diagnosis|test result) is very low.
If we used 128 as a cutting score for genius, we'd be wrong less often: P(diagnosis|test result) is higher.

What we want is some principled way of deciding what a cutting score is for any particular purpose. Clearly, our choice of cutting score will depend on that purpose.

When we are diagnosing a brain tumour, we want to be wrong almost never if the person does have a brain tumour AND we don't care too much if we make a false positive.

When we are trying to identify criminals, we might be more worried about minimizing false positives (we could ruin a life if we say someone is a criminal when they are not) and willing to pay the price by letting some real criminals go free (increase our false negative rate).
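The IQ example can be put into rough numbers. The sketch below assumes a genius threshold of 130, observed scores of 128 and 110, and a standard error of 10 points, and it makes the simplifying assumption that the true score is normally distributed around the observed score (it ignores regression toward the mean). The function name is hypothetical.

```python
import math

def prob_true_score_above(observed, threshold, standard_error):
    """P(true score > threshold), treating the true score as Normal(observed, standard_error)."""
    z = (threshold - observed) / standard_error
    # Upper-tail probability of the standard normal via the complementary error function.
    return 0.5 * math.erfc(z / math.sqrt(2))

p_128 = prob_true_score_above(128, 130, 10)
p_110 = prob_true_score_above(110, 130, 10)
print(round(p_128, 2), round(p_110, 2))
```

Under these assumptions the first probability is a little over 0.4 and the second is about 0.02: a fair chance versus a very small but non-zero one.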

False negative: Incorrectly undiagnosed.
False positive: Incorrectly diagnosed.

[Figure: two panels. Low false negative rate, high false positive rate: "Rewarding incompetence". High false negative rate, low false positive rate: "Ignoring competence".]

How shall we evaluate how good a test is?

Three things need to be taken into account:
i.) The size of the correlation between test scores and criterion. The higher the correlation, the narrower the scatterplot (i.e. the ellipse) and the smaller the error rates.
ii.) The base rate
iii.) The cutting score

What is the relation between these two measures?

The relation between base rate and cutting score

Example from Meehl:
Group A: 415 well-adjusted soldiers
Group B: 89 mal-adjusted soldiers
A scale diagnosed 55% of Group B, and only 19% of Group A, so the authors advocated its use.

Example: Assume N = 10,000.
500 are bad. 55% (275) are classified as bad.
9500 are good. 81% (7695) are not classified as bad.
(7695 + 275)/10000 = 79.7% are correctly classified.

Why should this bother us? We could have correctly classified 95% without using a test!

Let's use Bayes' Theorem: Is "bad" bad?

P(Bad|Diagnosed) = P(Diagnosed|Bad) P(Bad) / P(Diagnosed)

P(Diagnosed|Bad)   0.55    Given
P(Bad)             0.05    Assumed
P(Diagnosed)       0.208   = (0.55 * 0.05) + (0.19 * 0.95)
P(Bad|Diagnosed)   0.13    = P(Diagnosed|Bad) P(Bad) / P(Diagnosed)

When we take base rates into account, an identification of a person as bad actually has only a 13% chance of being correct, not a 55% chance as claimed.

Let's use Bayes' Theorem: Is "not bad" good?

P(Good|Not Diagnosed) = P(Not Diagnosed|Good) P(Good) / P(Not Diagnosed)

P(Not Diagnosed|Good)   0.81     Given
P(Good)                 0.95     Assumed
P(Not Diagnosed)        0.7920   = (0.45 * 0.05) + (0.81 * 0.95)
P(Good|Not Diagnosed)   0.97     By Bayes' Theorem

When we take base rates into account, a failure to identify a person as bad has a 97% chance of being correct, but remember that we were already 95% sure before we bothered to do the calculation!

The relation between base rate and cutting score, II

Let's do the math! A certain Rorschach configuration is seen in 8.1% of schizophrenics, and 0% of nonschizophrenics. The authors claim this is clinically useful: Is it really?
P(Schizo|Rorschach) = P(Rorschach|Schizo) P(Schizo) / P(Rorschach)

P(Rorschach|Schizo)   0.081       The empirical finding
P(Schizo)             0.0085      Known base rate for schizophrenia
P(Rorschach)          0.0006885   = (0.0085 * 0.081)
P(Schizo|Rorschach)   1.00        = P(Rorschach|Schizo) P(Schizo) / P(Rorschach)

Although the sign is certain in this case, it is so rare itself and applies to a group with such a rare base rate that it is P(Rorschach) that is worrying: This information would be diagnostically helpful in only 7 cases out of 10,000! = it is clinically useless
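Both worked examples (the soldiers and the Rorschach sign) are the same Bayes calculation with different inputs. The sketch below assumes the rates used in the examples (true positive 0.55, false positive 0.19, base rate 0.05 for the soldiers; true positive 0.081, false positive 0, base rate 0.0085 for the Rorschach sign). The function names are illustrative, not from the lecture.

```python
def p_condition_given_positive(true_pos_rate, false_pos_rate, base_rate):
    """P(condition | positive test) by Bayes' Theorem."""
    p_positive = true_pos_rate * base_rate + false_pos_rate * (1 - base_rate)
    return true_pos_rate * base_rate / p_positive

def p_no_condition_given_negative(true_pos_rate, false_pos_rate, base_rate):
    """P(no condition | negative test) by Bayes' Theorem."""
    true_neg = (1 - false_pos_rate) * (1 - base_rate)
    false_neg = (1 - true_pos_rate) * base_rate
    return true_neg / (true_neg + false_neg)

def overall_accuracy(true_pos_rate, false_pos_rate, base_rate):
    """Share of all cases the test classifies correctly."""
    return true_pos_rate * base_rate + (1 - false_pos_rate) * (1 - base_rate)

# Soldiers example (assumed rates: 55% hits, 19% false alarms, 5% base rate)
soldier_bad = p_condition_given_positive(0.55, 0.19, 0.05)
soldier_good = p_no_condition_given_negative(0.55, 0.19, 0.05)
soldier_accuracy = overall_accuracy(0.55, 0.19, 0.05)

# Rorschach example (assumed rates: 8.1% hits, no false alarms, 0.85% base rate)
rorschach = p_condition_given_positive(0.081, 0.0, 0.0085)
p_sign = 0.081 * 0.0085  # how often the sign appears at all

print(round(soldier_bad, 2), round(soldier_good, 2), round(soldier_accuracy, 3))
print(round(rorschach, 2), p_sign)
```

The last value shows why a perfectly certain sign can still be close to useless: it turns up in fewer than 7 people per 10,000.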

What can we do?: Rule 1

In order for a positive diagnostic assertion to be more likely true than false, the ratio of positive to negative base rates in the examined population must exceed the ratio of the test's false positive rate to its true positive rate:

Base rate of positives / Base rate of negatives > False positive rate of test / True positive rate of test

Example: Rule 1

A cutting score identifies 80% of brain-damaged patients. 15% of non-damaged patients also exceed that cut-off. What base rates can justify the use of such a test?

.15 (false positive) / .80 (true positive) = 0.19

The ratio of brain-damaged to non-brain-damaged patients in the population under consideration must be equal to or greater than .19, or about 1 in 5.

The easiest case: Equal base rates (Rule 2)

Iff base rates are equal, then the probability that a positive diagnosis is correct is the ratio of the true positive rate to the sum of the true and false positive rates. Another way of saying this more simply is: equal base rates render Bayes' Theorem unnecessary.

Example: Equal base rates (Rule 2)

Two kinds of cancers occur equally often. A test diagnoses Type B with 68% accuracy, but is at chance for Type A. You get a positive test result. What is the probability you have Type B cancer?

For once life is simple. The probability is 68%: 0.68 / (0.68 + 0.32) = 0.68

Example 2: Equal base rates (Rule 2)

A test picks out 75% of people who will continue in school (true positives) but also 40% of those who will not (false positives). It is claimed that about half of all students in the population drop out of school. How far off can that claim be without the test being useless?
The probability that a positive diagnosis is correct with an equal split is the ratio of the true positive rate to the sum of the true and false positive rates: 0.75 / (0.75 + 0.40) = 0.65

So the test gets about 65% right. If less than 35% of the students actually do drop out, the test will not do better than base rates.

When can a test help? (Rule 3)

A test result can only help if the base rate of the more numerous class (here, positive) is less than the ratio of the true negative rate to the sum of the true and false negative rates.

That is: If it is a matter of fact that (say) only 10% of students drop out, then there is no use giving this test: it can't beat the 90% odds you have of being correct before you bothered to give the test.
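Rules 1 and 2 each reduce to a single ratio, which the sketch below computes for the brain-damage, cancer, and school examples. It assumes the rates 0.80 / 0.15, 0.68 / 0.32, and 0.75 / 0.40 for those examples; the function names are hypothetical. The conversion from a base-rate ratio to a minimum base rate is an added step, not something stated on the slides.

```python
def rule1_min_ratio(true_pos_rate, false_pos_rate):
    """Rule 1: the ratio of positive to negative base rates must exceed this value."""
    return false_pos_rate / true_pos_rate

def rule2_equal_base_rates(true_pos_rate, false_pos_rate):
    """Rule 2: P(positive diagnosis is correct) when both classes are equally common."""
    return true_pos_rate / (true_pos_rate + false_pos_rate)

# Rule 1, brain-damage example
ratio = rule1_min_ratio(0.80, 0.15)
# Convert the required ratio p / (1 - p) into the minimum base rate p itself.
min_base_rate = ratio / (1 + ratio)

# Rule 2, cancer and school examples
cancer = rule2_equal_base_rates(0.68, 0.32)
school = rule2_equal_base_rates(0.75, 0.40)

print(round(ratio, 2), round(min_base_rate, 2), round(cancer, 2), round(school, 2))
```

A required ratio of about 0.19 corresponds to a base rate of roughly 16% brain-damaged patients in the examined population.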

When can a test help? (Rule 3)

A test of maladjustment correctly classifies 85% of maladjusted girls, and misidentifies only 15% of adjusted girls. What base rates are needed to support these ratios? (Assume, reasonably, that there are more adjusted than maladjusted girls.)

The ratio of the true negative rate to the sum of the true and false negative rates = 0.85 [true negative] / (0.85 [true negative] + 0.15 [false negative]) = 0.85. The test can only help if less than 85% of girls are well-adjusted.

What does this have to do with cutting lines?

The proportion of people selected (diagnosed, chosen) from a sample is called the selection ratio. When positive/negative base rates are not equal, there is a (fairly brutal) trade-off between the accuracy (error rate) of a diagnosis or prediction, and the size of the selection ratio.

The brutal trade-off

If you want to be very sure you are right, you can speak of only a very small proportion of the sample (and you need a very large sample to get the cut-off points!). If you want to say something about everyone, then you must be prepared to be uncertain about your cut-off points, and wrong very often.

In short: you can be certain about a few people, or uncertain about a lot of people: take your pick!

[Figure: two panels. False negative: Incorrectly unselected. False positive: Incorrectly selected. Low false negative rate, high false positive rate: "Rewarding incompetence". High false negative rate, low false positive rate: "Ignoring competence".]
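The maladjustment example for Rule 3 can be checked two ways: with the ratio as the rule states it, and by directly comparing the test's overall accuracy with the accuracy of always guessing the majority class. The sketch assumes a true positive rate of 0.85 and a false positive rate of 0.15 (so true negative 0.85, false negative 0.15); the function names are made up for illustration. The two checks agree here because this test is equally accurate for both groups; that agreement is not claimed for tests in general.

```python
def rule3_threshold(true_neg_rate, false_neg_rate):
    """Rule 3 as stated: the majority base rate must be below this ratio."""
    return true_neg_rate / (true_neg_rate + false_neg_rate)

def beats_base_rate(sensitivity, specificity, majority_negative_share):
    """Does the test classify more cases correctly than always guessing 'negative'?"""
    positive_share = 1 - majority_negative_share
    test_accuracy = sensitivity * positive_share + specificity * majority_negative_share
    return test_accuracy > majority_negative_share

threshold = rule3_threshold(0.85, 0.15)

# Try a population that is 80% well-adjusted and one that is 90% well-adjusted.
helps_at_80 = beats_base_rate(0.85, 0.85, 0.80)
helps_at_90 = beats_base_rate(0.85, 0.85, 0.90)

print(round(threshold, 2), helps_at_80, helps_at_90)
```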

Sensitivity & Specificity

The sensitivity of a test = The probability of having a positive test result when the disease is present = P(Result|Disease) = True positive rate

The specificity of a test = The probability of having a negative test result when the disease is absent = P(~Result|~Disease) = True negative rate

[Figure: four quadrants. False negative: Incorrectly unselected. True negative: Correctly unselected. True positive: Correctly selected. False positive: Incorrectly selected. SPECIFICITY and SENSITIVITY are marked.]

What to do?

1.) Obviously, sometimes we can be satisfied with a small improvement on true negative base rates and with a large false positive rate. As we have said, we don't mind mistaking 90 brain tumours in order not to miss 20.

2.) Successive hurdles: Take a chance, allow errors, and give the expensive, time-consuming, but accurate tests to those who are selected out from a first pass of a less expensive, less time-consuming, and less accurate test. Repeat as necessary...

3.) Sometimes we can find sub-populations with less extreme base rates than in the world at large. If our referrals are well-screened, we can have more confidence in base rates that are less onerous (= closer to being equal) than they would be in the world at large.

4.) Sometimes "so what?" is the right thing to say. Since testing with any accuracy is so difficult to do well, we should not bother to give tests that don't lead to real changes in therapy or other treatment. If you can identify therapy candidates with 70% accuracy, so what? Will you then ignore or refuse to treat those who don't make the cut? If not, don't waste time and effort giving the test.

Gather base rate information.
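The successive-hurdles idea can be sketched as Bayes' Theorem applied twice: the result of a cheap first-pass test becomes the base rate for the expensive follow-up. All the numbers below are hypothetical (a 5% base rate, a screen with sensitivity 0.80 and specificity 0.85, a follow-up with sensitivity and specificity of 0.95), and the sketch assumes the two tests err independently given the true disease status.

```python
def update(prior, sensitivity, specificity):
    """P(disease | positive result), given a prior and the test's sensitivity and specificity."""
    true_pos = sensitivity * prior
    false_pos = (1 - specificity) * (1 - prior)
    return true_pos / (true_pos + false_pos)

base_rate = 0.05  # hypothetical base rate in the referred population

# First hurdle: cheap, less accurate screen (hypothetical rates).
after_screen = update(base_rate, sensitivity=0.80, specificity=0.85)

# Second hurdle: expensive, more accurate test given only to screen positives.
after_followup = update(after_screen, sensitivity=0.95, specificity=0.95)

print(round(after_screen, 2), round(after_followup, 2))
```

With these made-up figures the first pass moves the probability from 5% to roughly 22%, and the follow-up moves it to roughly 84%: the screen's main job is to produce a sub-population with a less extreme base rate.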