Bayes Theorem Application: Estimating Outcomes in Terms of Probability

Similar documents
Bayes theorem describes the probability of an event, based on prior knowledge of conditions that might be related to the event.

MS&E 226: Small Data

MITOCW conditional_probability

5.3: Associations in Categorical Variables

Psychological. Influences on Personal Probability. Chapter 17. Copyright 2005 Brooks/Cole, a division of Thomson Learning, Inc.

The Human Side of Science: I ll Take That Bet! Balancing Risk and Benefit. Uncertainty, Risk and Probability: Fundamental Definitions and Concepts

Lesson 87 Bayes Theorem

Representativeness heuristics

Chapter 7: Descriptive Statistics

Student Performance Q&A:

CPS331 Lecture: Coping with Uncertainty; Discussion of Dreyfus Reading

Understanding Probability. From Randomness to Probability/ Probability Rules!

Intro to Hypothesis Testing Exercises

Introduction. Patrick Breheny. January 10. The meaning of probability The Bayesian approach Preview of MCMC methods

BAYESIAN HYPOTHESIS TESTING WITH SPSS AMOS

Sheila Barron Statistics Outreach Center 2/8/2011

Probability II. Patrick Breheny. February 15. Advanced rules Summary

Lecture 3: Bayesian Networks 1

OCW Epidemiology and Biostatistics, 2010 David Tybor, MS, MPH and Kenneth Chui, PhD Tufts University School of Medicine October 27, 2010

QUESTIONS ANSWERED BY

Supplementary notes for lecture 8: Computational modeling of cognitive development

STAT 200. Guided Exercise 4

Why Is It That Men Can t Say What They Mean, Or Do What They Say? - An In Depth Explanation

Project: Conditional Probability (the sequel)

Handout 11: Understanding Probabilities Associated with Medical Screening Tests STAT 100 Spring 2016

Risk Aversion in Games of Chance

Villarreal Rm. 170 Handout (4.3)/(4.4) - 1 Designing Experiments I

Conduct an Experiment to Investigate a Situation

Barriers to concussion reporting. Qualitative Study of Barriers to Concussive Symptom Reporting in High School Athletics

Progress Monitoring Handouts 1

Dare to believe it! Live to prove it! JOHN VON ACHEN

Chapter 2 ANNA AXLER. Ann, circa Ann and Ira Marks -33-

Planning for Physical

Statistics. Dr. Carmen Bruni. October 12th, Centre for Education in Mathematics and Computing University of Waterloo

Teaching Family and Friends in Your Community

Bayesian Analysis by Simulation

The Next 32 Days. By James FitzGerald

Statistical Techniques. Masoud Mansoury and Anas Abulfaraj

RAL HEAL TH IMPROVEMENT CENTER

Letter to the teachers

The 6 Vital Keys to Turn Visualization Into Manifestation

Math HL Chapter 12 Probability

Graphic Organizers. Compare/Contrast. 1. Different. 2. Different. Alike

Probability and Sample space

EVALUATING AND IMPROVING MULTIPLE CHOICE QUESTIONS

I. Introduction and Data Collection B. Sampling. 1. Bias. In this section Bias Random Sampling Sampling Error

Building Friendships: Avoid Discounting

DAY 2 RESULTS WORKSHOP 7 KEYS TO C HANGING A NYTHING IN Y OUR LIFE TODAY!

Statistics for Psychology

Mentoring. Awards. Debbie Thie Mentor Chair Person Serena Dr. Largo, FL

Refresh. The science of sleep for optimal performance and well being. Sleep and Exams: Strange Bedfellows

Jack Grave All rights reserved. Page 1

"PRINCIPLES OF PHYLOGENETICS: ECOLOGY AND EVOLUTION"

UNLOCKING VALUE WITH DATA SCIENCE BAYES APPROACH: MAKING DATA WORK HARDER

How to use the Lafayette ESS Report to obtain a probability of deception or truth-telling

Scientific Thinking Handbook

UNDERSTANDING CAPACITY & DECISION-MAKING VIDEO TRANSCRIPT

USING OBSERVATIONS AND INFERENCES IN SCIENCE

Representation and Analysis of Medical Decision Problems with Influence. Diagrams

Unraveling Recent Cervical Cancer Screening Updates and the Impact on Your Practice

When Intuition. Differs from Relative Frequency. Chapter 18. Copyright 2005 Brooks/Cole, a division of Thomson Learning, Inc.

MBios 478: Systems Biology and Bayesian Networks, 27 [Dr. Wyrick] Slide #1. Lecture 27: Systems Biology and Bayesian Networks

Motivational Interviewing in Healthcare. Presented by: Christy Dauner, OTR

Meeting a Kid with Autism

Making comparisons. Previous sessions looked at how to describe a single group of subjects However, we are often interested in comparing two groups

Biostatistics Lecture April 28, 2001 Nate Ritchey, Ph.D. Chair, Department of Mathematics and Statistics Youngstown State University

Artificial Intelligence Programming Probability

Appendix: Instructions for Treatment Index B (Human Opponents, With Recommendations)

Sexual Feelings. Having sexual feelings is not a choice, but what you do with your feelings is a choice. Let s take a look at this poster.

The Logic of Data Analysis Using Statistical Techniques M. E. Swisher, 2016

Human intuition is remarkably accurate and free from error.

Overview. Meeting Length 90 minutes. Senses [Meeting 1]

This week s issue: UNIT Word Generation. conceive unethical benefit detect rationalize

OCW Epidemiology and Biostatistics, 2010 Michael D. Kneeland, MD November 18, 2010 SCREENING. Learning Objectives for this session:

Attention and Concentration Problems Following Traumatic Brain Injury. Patient Information Booklet. Talis Consulting Limited

Reasoning with Uncertainty. Reasoning with Uncertainty. Bayes Rule. Often, we want to reason from observable information to unobservable information

Sequential Decision Making

2 INSTRUCTOR GUIDELINES

Confidence Intervals. Chapter 10

Stay Married with the FIT Technique Go from Pissed off to Peaceful in Three Simple Steps!

Anthony Robbins' book on success

Nutrition Response Testing SM New Patient Orientation Welcome. If you are like most people who come to us for help, then most likely: You have one or

The 3 Things NOT To Do When You Quit Smoking

Chapter 8 Estimating with Confidence

3 CONCEPTUAL FOUNDATIONS OF STATISTICS

Whatever You Do, DON'T Make These 5 Weight Loss MISTAKES

Choosing Life: Empowerment, Action, Results! CLEAR Menu Sessions. Health Care 3: Partnering In My Care and Treatment

15.301/310, Managerial Psychology Prof. Dan Ariely Recitation 8: T test and ANOVA

Handout on Perfect Bayesian Equilibrium

Statisticians deal with groups of numbers. They often find it helpful to use

UNIT 4 ALGEBRA II TEMPLATE CREATED BY REGION 1 ESA UNIT 4

Illusion of control is all about the relationship between the conscious and the sub-conscious mind.

Acceptance and Commitment Therapy (ACT) for Physical Health Conditions

Chapter 8. Learning Objectives 9/10/2012. Research Principles and Evidence Based Practice

Module 4. Relating to the person with challenging behaviours or unmet needs: Personal histories, life journeys and memories

The Lens Model and Linear Models of Judgment

Chapter 12. The One- Sample

Probability: Psychological Influences and Flawed Intuitive Judgments

Autism, my sibling, and me

Your Brain is Like a Computer

Transcription:

Bayes Theorem Application: Estimating Outcomes in Terms of Probability The better the estimates, the better the outcomes. It s true in engineering and in just about everything else. Decisions and judgments are invariably made with less available information than we would like. No project would ever be completed if we had to wait for all of the data. Even then, there is always uncertainty in every measurement, test and every data set. Thus, the understanding can at best be an approximation of the situation. It is important to remember that approximations are a useful guide, but not reality. There are truly no 100% sure things there is only overconfidence, occasionally followed by misfortune There are techniques to improve the quality of such decisions in the face of uncertainty, but the principles are not generally taught. Here, the starting point here is introducing Bayes Theorem to quantitatively estimate probability of events based on prior experience. The insights from this can be applied qualitatively to make better intuitive estimates. Let s start with the weather forecast. In the morning, there may be a forecast of 30% chance of rain. The decision to take an umbrella with you or not is made with this information. This forecast becomes the prior condition. Later in the day, the forecast may change, say to 70%. This is new information, more reliable, and you take advantage of the update to take your umbrella. Data Bayes Theorem Revised Probability (or Posterior Probability) Prior Probability This schematic above illustrates how Bayes Theorem works. There is some prior information about the likelihood of an event occurring, some new event that changes the situation and a computation of a new probability that the original event will occur. There is no certainty the event will occur, just a better estimate that it will occur. This is often termed conditional probability, since the conditions affecting the original event have changed. In this sense, the path toward understanding is an iterative process and Bayes Theorem is the computational engine that drives it. More generally the approach can be quite valuable qualitatively. Making an initial estimate pushes you to examine the available information or evaluate the data more carefully. There is more on the line when one owns an estimate. This estimate becomes the prior probability. As more information becomes available, a revised, estimate can then be made. On the other hand, estimates made in a vacuum are wildly subjective. If there is no estimate at all, one has to accept at face value whatever result comes their way. This section provides the tools to avoid either of these undesirable situations.

Methods to Transform Data into Information (Rev0) Bayes Theorem Data without context in can be misleading. Graphical analysis was one method to do this. There are also statistical methods to do this. The example below shows the need for this type of analysis. After the example, statistical methods based on conditional probability: The first part of this section introduces Bayes Theorem, which mathematically uses prior probability combines it with new information to arrive at a revised probability. The most well-known example, and the one we start with, is determining the probability of a woman actually having cancer after receiving a positive result in an initial screening. Most people, including physicians, are way off on this estimate. Venn Diagram (http://oscarbonilla.com/2009/05/visualizing-bayes-theorem/) A woman in her 40 s participates in a routine screening for breast cancer. She has a positive result. What is the probability that the woman has breast cancer? Prior Probability: It is known that 1% of women in this age group have breast cancer. The test does not give perfect results: True Positive: Of the women with breast cancer, 80% will get a positive test result. False Positive: Of the women without breast cancer, 9.6% will get a positive test result There is no action taken when a negative test result is reported Prior Probability The filled circle A represents the 1% of women with cancer. The larger circle represents all women in the sample group. New Information about the test: Modify the Venn Diagram for the True Positive and False Positive Information. Add the women who do not have breast cancer but got a positive result on the mammogram test (Circle B). Circle B covers 80% of Circle A (True Positive) and 9.6% of the larger circle outside of A (False Positive Comparing the areas of Circle A and Circle B, it is clear that only a small percentage of the women who got a positive result actually have breast cancer. The exact percentage can be calculated using Bayes Theorem. Most doctors guessed that the answer to the question was around 80%, which is clearly impossible looking at the diagram! Visualizing the diagram can help us apply Bayes theorem:

Quantitative Analysis of the Same Cancer Test Scenario A woman in her 40 s participates in a routine screening for breast cancer. She has a positive result. What is the probability that the woman has breast cancer? It is known that 1% of women in this age group have breast cancer The test does not give perfect results Of the women with breast cancer, 80% will get a positive test result. (20% negative result.) Of the women without breast cancer, 9.6% will get a positive result. (90.4% negative result) An important point: Only the women that have a positive test have any concerns. This is bolded in the table below. If a woman has a negative test, there is no follow-up for the next year. In a table, all of the information can be shown. It is important to distinguish between actually having cancer and having a positive test for cancer. The table shows the false positives and negatives. Cancer No Cancer Prior Information 1% 99% Test Positive 80% (True) 9.6% (False) Test Negative 20% 90.4% (No further action for a negative test result) When the woman receives notice of a positive result, she is in the top row of the table, but the result could be a true or false positive. Calculate the chances of a true positive: 1% chance she has cancer x 80% chance that test identified it = 0.008 Calculate the chances of a false positive: 99% chance she does not have cancer x 9.6% chance that test identified it = 0.095 In table form: Cancer (1%) No Cancer (99%) Test Positive True Pos: 1% * 80% = 0.008 False Pos: 99% * 9.6% = 0.095 Test Negative -------- ------- Calculate he probability of actually having cancer if she gets a positive test: Probability = Has cancer/all possibilities if she received a positive test = (Has Cancer)/(Has Cancer + Does not have cnacer = 0.008/(0.008 + 0.095) = 0.008/(0.103) = 7.8% The probability that the woman has cancer is 7.8% (not 80%!) (Remember this when someone you know gets a positive screening result)

Bayes Theorem Same Cancer Test Scenario (Simplified Notation: N. Silver) Revised Probability = (xy)/[xy + z(1-x)] Prior Probability x 1% of women in this age group have cancer (historical data) New Events True Result y 80% of women with cancer will get a positive result False Positive z 9.6% of women without cancer will get a positive result Revised (Posterior) Probability Calculated by Bayes formula below Revised Probability = (xy)/[xy + z(1-x)] = (0.01)(0.8)/(0.01)(0.80) + [0.096(1.00-0.01)] Revised Probability =.078 = 7.8% The probability that the woman has cancer has increased from the prior value of 1% to the revised value of 8.6% as a consequence of the new test information. Comparison of Notation for Cancer Test Scenario A: Actually Has Cancer ~ A: Actually not having cancer; Test Positive X Test Negative ~X (Silver s Notation) Revised Probability = (xy)/[xy + z(1-x)] Prior Probability x Pr(A) Chance of having cancer (1%). 1% of women in this age group have cancer (historical data) New Events True Positive y Pr(X A) Chance of a positive test (X) given that you had cancer (A). 80% of women with cancer will get a positive result This is the chance of a true positive, 80% in our case False Positive z Pr(X ~A) Chance of a positive test (X) given that you didn t have cancer (~A). 9.6% of women without cancer will get a positive result Revised (Posterior) Probability This is a false positive, 9.6% in our case. 1-x Pr(~A) Chance of not having cancer (99%). Pr(A X) Chance of having cancer (A) given a positive test (X). This is what we want to know: In our case it was 7.8%. Calculated by Bayes formula below Revised Probability = (xy)/[xy + z(1-x)] = (0.01)(0.8)/(0.01)(0.80) + [0.096(1.00-0.01)] = 0.078 = 7.8% Pr(A X) = Pr(X A) Pr(A) / [Pr(X A) Pr(A) + Pr(~A) Pr(X ~A)] = (0.01)(0.8)/(0.01)(0.80) + [0.096(1.00-0.01)] = 0.078 =7.8%

A similar but simpler class problem: (http://www.biocodershub.net/community/understanding-bayesian-statistics/) If a person has cancer, it will always detect it, 100% of the time. If the person, it will usually correctly call this result, 90% of the time. However in 10% of these cases, it will incorrectly report they do have cancer. Looking at this in a table: Test detects cancer Test doesn t detect cancer Patient has cancer 100% (true positive) 0% (false negative) Patient doesn t have cancer 10% (false positive) Now let s assume that 1% of people actually have cancer. You go in for a test and unfortunately it reports you have cancer. Statistically, do you actually have cancer? Use full notation. Key Points of Bayes Theorem (refer to examples above) Tests are not the event. Tests are flawed. Tests give us test probabilities, not the real probabilities. False positives skew results. We have a cancer test, separate from the event of actually having cancer. Tests detect things that don t exist (false positive), and miss things that do exist (false negative). People often consider the test results directly, without considering the errors in the tests. False positives are needed to reduce the number of cancer cases missed by the test. Missing a true cancer case is a much worse outcome than a false positive. Bayes theorem gives you the actual probability of an event given the measured test probabilities. Comparison of two types of statistics: Conventional statistics (Frequentist), inferences are based solely on the sampling distribution of the statistic. Here the statistic is the sample mean, standard deviation, and ranges. Emphasizes more accurately maximum likelihood, looks for the model with the highest probability of producing the data. Bayesian statistics Looks for the model that is most likely to have produced the data, or best explained by the data. The probability depends upon prior conditions or knowledge. It is based on the degree of belief based on prior knowledge that can be changed by new events. (Conditional Probability). Bayesian statistical analysis is the direction that analysis is going. It is a complex subject, but the insights from these examples can really help you understand situations in a different way. (It is still not taught in many college statistics courses.)

Bayes Theorem Problems 1. You recently returned from Mexico and go to see a doctor about a fever. The doctor decides to give you a test for swine flu. For the purposes of this exercise, it is known that 1 in 500 people who visited Mexico came down with swine flu. The test is 99% accurate, in the sense that the probability of a false positive is 1%. The probability of a false negative is 0. You get a positive result. What is the probability that you have swine flu? 2. Sam can come to school either by walking or taking the bus. Sam wakes up late one day. He decides that if he walks quickly, there is a 50% chance that he will be late for Engineering class. If he takes the M14, the probability of being late is only 20%. Sam is late one day, and the teacher wishes to estimate the probability that Sam walked to school. Since he does not know which mode of transportation that Sam usually takes, he gives a prior probability of ½ to each of the two possibilities. Draw a Venn Diagram which qualitatively describes this situation. What is the teacher s probability estimate that Sam walked to school? 3. You have an old car. For experience you know that when the car makes a loud clicking sound, there s a 75% chance it s going to break down. You also know that if it s more than 80 degrees Fahrenheit outside, the car only has a 15% chance of breaking down. If it s more than 80 degrees outside and the car is making a loud clicking sound, what is the probability that the car is going to break down? 4. Suppose that two competing factories, Manhattan and Queens produce jelly beans. The Manhattan factory starts shipping jars containing 70% sweet and 30% sour jelly beans. The Queens factory is shipping jars containing 20% sweet and 80% sour jelly beans. Manhattan is the larger factory, shipping 60% of the jelly beans compared to 40% from Queens. You eat one jelly bean and find that the taste is sweet. What is the probability that the jelly bean was produced in Manhattan? 5.Sam can come to school either by walking, taking the bus or using his skateboard. Sam wakes up late one day. He decides that if he walks quickly, there is a 50% chance that he will be late for Engineering class. If he takes the M14, the probability of being late is only 20%. The probability of being late using the skateboard is 30% Sam is late one day, and the teacher wishes to estimate the probability that Sam walked to school. Since he does not know which mode of transportation that Sam usually takes, he gives equal prior probability to each of the possibilities. What is the probability that Sam walked to school. Compare this problem to the earlier problem 2. 6. The three stooges, Curley, Larry, and Moe, have finally run out of luck and pulled one prank too many. One of them (selected at random) is to be executed. Moe, ever the smart stooge, begs the jailer to tell him the name of one of others who is not to be executed. The jailer flips a coin and gives Moe the name. What are the new probabilities of execution for the remaining Moe and the other unnamed stooge? 7. Marie is getting married tomorrow, at an outdoor ceremony in the desert. In recent years, it has rained only 5 days each year. Unfortunately, the weatherman has predicted rain for tomorrow. When it actually rains, the weatherman correctly forecasts rain 90% of the time. When it doesn't rain, he incorrectly forecasts rain 10% of the time. What is the probability that it will rain on the day of Marie's wedding? Specify prior probability and the new event as well as the probability of rain on the wedding day.