Reasoning with Uncertainty Reasoning with Uncertainty Often, we want to reason from observable information to unobservable information We want to calculate how our prior beliefs change given new available evidence Bayes rule tells us how to optimally reason with uncertainty. Do people reason like Bayes rule? Bayes Rule Prior probability Evidence Posterior Probability Bayes rule tells us how the available evidence should alter our belief in something being true 1
Difficulties in Reasoning with Uncertainty Problems reasoning with probabilities frequencies are easier to understand Problems understanding Conditional probability doctors need to calculate the probability of disease given the observed symptoms: P( disease symptoms ) Sometimes P( symptoms disease ) is used incorrectly when reasoning about the likelihood of a disease Why is this wrong? The base rate is important To get P( disease symptom ), you need to know about P( symptom disease ) and also the base rate -- prevalence of the disease before you have seen patient More intuitive example: what is the probability of being tall given you are player in the NBA? what is the probability of being a player in the NBA given that you are tall? P( NBA player tall ) P( tall NBA player ) Reasoning with base rates Suppose there is a disease that affects 1 out of 100 people There is a diagnostic test with the following properties: If the person has the disease, the test will be positive 98% of the time if the person does not have the disease, the test will be positive 1% of the time A person tests positive, what is the probability that this person has the disease? Frequent answer =.98 Correct answer.50 2
Are we really that bad in judging probabilities? According to some researchers (e.g., Gigerenzer), it matters how the information is presented and processed. Processing frequencies is more intuitive than probabilities (even it leads to the same outcome). A counting heuristic (in tree form) 10,000 people 100 have disease 9,900 do not 98 test positive 2 test negative 99 test positive 9801 test negative P( disease test positive ) = 98 / ( 98 + 99 ).50 The same thing in words... Let s take 10,000 people. On average, 100 out of 10,000 actually have the disease and 98 of those will test positive (98% true positive rate) Among the 9,900 who do not have the disease, the test will falsely identify 1% as having it. 1% of 9,900 = 99 On average, out of 10,000 people: 98 test positive and they have the disease 99 test positive and they do not have the disease. Therefore, a positive test outcome implies a 98/(98+99) 50% chance of having the disease 3
Change the example What now if the disease affects only 1 out of 10,000 people? Assume same diagnosticity of test (98% true positive rate, 1% false positive rate) A person tests positive, what now is the probability that this person has the disease? A counting heuristic (in tree form) 1,000,000 people 100 have disease 999,900 do not have the disease 98 test positive 2 test negative 9999 test positive 989901 test negative P( disease test positive ) = 98 / ( 98 + 9999 ) =.0097 (smaller than 1%) Bayes Rule The previous example essentially is a simple way to apply Bayes rule: P ( positive disease) P( disease) P( disease positive) = P ( positive disease) P( disease) + P ( positive not disease) P( not disease) P( positive disease ) =.98 P( positive not disease ) =.01 P( disease ) =.0001 P( disease positive ) =.0097 4
Normative Model Bayes rule tells you how you should reason with probabilities it is a prescriptive (i.e., normative) model But do people reason like Bayes? In certain circumstances, the base rates are neglected base rate neglect The Taxi Problem: version 1 A witness sees a crime involving a taxi in Carborough. The witness says that the taxi is blue. It is known from previous research that witnesses are correct 80% of the time when making such statements. What is the probability that a blue taxi was involved in the crime? The Taxi Problem: version 2 A witness sees a crime involving a taxi in Carborough. The witness says that the taxi is blue. It is known from previous research that witnesses are correct 80% of the time when making such statements. The police also know that 15% of the taxis in Carborough are blue, the other 85% being green. What is the probability that a blue taxi was involved in the crime? 5
Base Rate Neglect: The Taxi Problem Failure to take prior probabilities (i.e., base rates) into account In the taxi story, the addition of: The police also know that 15% of the taxis in Carborough are blue, the other 85% being green. has little influence on rated probability Base Rate Neglect (2) Kahneman & Tversky (1973). group A: 70 engineers and 30 lawyers group B: 30 engineers and 70 lawyers What is probability of picking an engineer in group A and B? Subjects can do this Provide some evidence Jack is a 45 year-old man. He is married and has four children. He is generally conservative, careful, and ambitious. He shows no interest in political and social issues and spends most of his free time on his many hobbies, which include home carpentry, sailing, and mathematical puzzles What now is probability Jack is an engineer? Estimates for both group A and group B was P =.9 6
Heuristics and Biases Tversky & Kahneman propose that people often do not follow rules of probability Instead, decision making may be based on heuristics Lower cognitive load but may lead to systematic errors and biases Example heuristics representativeness availability All the families having exactly six children in a particular city were surveyed. In 72 of the families, the exact order of the births of boys and girls was: G B G B B G What is your estimate of the number of families surveyed in which the exact order of births was: B G B B B B Answer: a) < 72 b) 72 c) >72 Representativeness Heuristic The sequence G B G B B G is seen as A) more representative of all possible birth sequences. B) better reflecting the random process of B/G 7
A coin is flipped. What is a more likely sequence? A) H T H T T H B) H H H H H H A) #H = 3 and #T = 3 (in some order) B) #H = 6 Gambler s fallacy: wins are perceived to be more likely after a string of losses Does the hot hand phenomenon exist? Most basketball coaches/players/fans refer to players having a Hot hand or being in a Hot zone and show Streaky shooting However, making a shot after just making three shots is pretty much as likely as after just missing three shots not much statistical evidence that basketball players switch between a state of hot hand and cold hand (Gilovich, Vallone, & Tversky, 1985) 8
Availability Heuristic Are there more words in the English language that begin with the letter V or that have V as their third letter? What about the letter R, K, L, and N? (Tversky & Kahneman, 1973) Linda is 31 years old, single, outspoken, and very bright. She majored in philosophy. As a student, she was deeply concerned with issues of discrimination and social justice, and also participated in anti-nuclear demonstrations. Rate the likelihood that the following statements about Linda are true: a) Linda is active in the feminist movement b) Linda is a bank teller c) Linda is a bank teller and is active in the feminist movement Rating C as more likely than B and A is a Conjunction Fallacy What to make of these results? One interpretation of Tversky & Kahneman s findings: people do not use proper probabilistic reasoning people use arbitrary mechanisms/ heuristics with no apparent rationale However, Gigerenzer and Todd show in their Fast and Frugal Heuristics research program that heuristics can often be very effective 9
Which city has a larger population? A) San Diego B) San Antonio 66% accuracy with University of Chicago undergraduates. However, 100% accuracy with German students. San Diego was recognized as American cities by 78% of German students. San Antonio: 4% With lack of information, use recognition heuristic (Goldstein & Gigerenzer, 2002) How to pick a stock Problem: what stocks to invest in? Solution 1 optimizing : Gather lots of info about many companies Process with sophisticated tools and choose Solution 2 the recognition heuristic: Purchase stocks from recognized companies (slide from Peter Todd) Paying for the name. (slide from Peter Todd) 10
Picking Stocks with Recognition Heuristic Borges et al. (1999) can ignorance beat the stock market? 180 German lay-people recognition of German stocks 6 month return on DAX 30: Dec 1996 Jun 1997 Market Index +34% Recognition Rate > 90% +47% Recognition Rate < 10% +13% Note: this result has not replicated in other studies (e.g., Boyd, 2001; Rakow, 2002) -- don t rush to use this heuristic on your own money! 11