Pooling Subjective Confidence Intervals

Similar documents
Describe what is meant by a placebo Contrast the double-blind procedure with the single-blind procedure Review the structure for organizing a memo

Readings: Textbook readings: OpenStax - Chapters 1 11 Online readings: Appendix D, E & F Plous Chapters 10, 11, 12 and 14

Describe what is meant by a placebo Contrast the double-blind procedure with the single-blind procedure Review the structure for organizing a memo

Chapter 23. Inference About Means. Copyright 2010 Pearson Education, Inc.

Statistics is the science of collecting, organizing, presenting, analyzing, and interpreting data to assist in making effective decisions

Statistics is the science of collecting, organizing, presenting, analyzing, and interpreting data to assist in making effective decisions

Still important ideas

Chapter 7: Descriptive Statistics

OCW Epidemiology and Biostatistics, 2010 David Tybor, MS, MPH and Kenneth Chui, PhD Tufts University School of Medicine October 27, 2010

Describe what is meant by a placebo Contrast the double-blind procedure with the single-blind procedure Review the structure for organizing a memo

Patrick Breheny. January 28

Still important ideas

Business Statistics Probability

Readings: Textbook readings: OpenStax - Chapters 1 13 (emphasis on Chapter 12) Online readings: Appendix D, E & F

Risk Aversion in Games of Chance

Political Science 15, Winter 2014 Final Review

Regressions usually happen at the following ages 4 months, 8 months, 12 months, 18 months, 2 years.

Validity and Quantitative Research. What is Validity? What is Validity Cont. RCS /16/04

Quality Digest Daily, March 3, 2014 Manuscript 266. Statistics and SPC. Two things sharing a common name can still be different. Donald J.

Enumerative and Analytic Studies. Description versus prediction

Disclosing medical errors to patients: Recent developments and future directions

Statistics for Psychology

Readings: Textbook readings: OpenStax - Chapters 1 4 Online readings: Appendix D, E & F Online readings: Plous - Chapters 1, 5, 6, 13

Analysis of Environmental Data Conceptual Foundations: En viro n m e n tal Data

V. Gathering and Exploring Data

Probability Models for Sampling

1 The conceptual underpinnings of statistical power

Chapter 12. The One- Sample

Handout 16: Opinion Polls, Sampling, and Margin of Error

Experimental Research in HCI. Alma Leora Culén University of Oslo, Department of Informatics, Design

Instrumental Variables Estimation: An Introduction

Cost-Utility Analysis (CUA), part II

Chapter 1. Dysfunctional Behavioral Cycles

Why Coaching Clients Give Up

Sleeping Beauty is told the following:

Statistics: Interpreting Data and Making Predictions. Interpreting Data 1/50

Checking the counterarguments confirms that publication bias contaminated studies relating social class and unethical behavior

Critical Conversations

Economics 2010a. Fall Lecture 11. Edward L. Glaeser

Here s a list of the Behavioral Economics Principles included in this card deck

8 people got 100 (I rounded down to 100 for those who went over) Two had perfect exams

The Pretest! Pretest! Pretest! Assignment (Example 2)

Chapter 8 Estimating with Confidence

CHAPTER 15: DATA PRESENTATION

Information on ADHD for Children, Question and Answer - long version

Anxiety. Top ten fears. Glossophobia fear of speaking in public or of trying to speak

Psychological. Influences on Personal Probability. Chapter 17. Copyright 2005 Brooks/Cole, a division of Thomson Learning, Inc.

Introduction to Behavioral Economics Like the subject matter of behavioral economics, this course is divided into two parts:

Chapter 1 Review Questions

Taste of MI: The Listener. Taste of MI: The Speaker 10/30/2015. What is Motivational Interviewing? (A Beginning Definition) What s it for?

Endogeneity is a fancy word for a simple problem. So fancy, in fact, that the Microsoft Word spell-checker does not recognize it.

Two-sample Categorical data: Measuring association

Barriers to concussion reporting. Qualitative Study of Barriers to Concussive Symptom Reporting in High School Athletics

Physiological Mechanisms of Lucid Dreaming. Stephen LaBerge Sleep Research Center Stanford University

VALIDITY OF QUANTITATIVE RESEARCH

The Interactive Effects of Ethical Norms and Subordinate Recommendations on Accounting Decisions

Exploration and Exploitation in Reinforcement Learning

Phenomenal content. PHIL April 15, 2012

Reliability and Validity

Behavioral Game Theory

CPS331 Lecture: Coping with Uncertainty; Discussion of Dreyfus Reading

How do we identify a good healthcare provider? - Patient Characteristics - Clinical Expertise - Current best research evidence

Recording Transcript Wendy Down Shift #9 Practice Time August 2018

Estimation. Preliminary: the Normal distribution

The Evolution of Cooperation: The Genetic Algorithm Applied to Three Normal- Form Games

CHAPTER 2. MEASURING AND DESCRIBING VARIABLES

Views of autistic adults on assessment in the early years

CSC2130: Empirical Research Methods for Software Engineering

RAG Rating Indicator Values

Chapter 11. Experimental Design: One-Way Independent Samples Design

Rational Choice Theory I: The Foundations of the Theory

Cognitive Changes Workshop Outcomes

Managerial Decision Making: Session 6

(Visual) Attention. October 3, PSY Visual Attention 1

Relationships. Between Measurements Variables. Chapter 10. Copyright 2005 Brooks/Cole, a division of Thomson Learning, Inc.

Statisticians deal with groups of numbers. They often find it helpful to use

Reliability, validity, and all that jazz

15.301/310, Managerial Psychology Prof. Dan Ariely Recitation 8: T test and ANOVA

Chapter 19. Confidence Intervals for Proportions. Copyright 2010 Pearson Education, Inc.

Editorial. From the Editors: Perspectives on Turnaround Time

Measuring and Assessing Study Quality

First Problem Set: Answers, Discussion and Background

Outlier Analysis. Lijun Zhang

Attention and Concentration Problems Following Traumatic Brain Injury. Patient Information Booklet. Talis Consulting Limited

ROADMAP FREEDOM FROM STUTTERING. Copyright 2017 Life Quality, Inc. All Rights Reserved

Unit 1 Exploring and Understanding Data

INSTRUCTIONS FOR THE AD8 DEMENTIA SCREENING INTERVIEW (10/22/2015) (ADS, VERSION 1, 4/29/2015)

The HPV Data Is In What Do the Newest Updates in Screening Mean For Your Patients?

Selecting the Best Form of Jury Research for Your Case & Budget

Step 2 Challenging negative thoughts "Weeding"

Level 2 l Upper intermediate

FAQ: Heuristics, Biases, and Alternatives

Jack Grave All rights reserved. Page 1

Abdul Latif Jameel Poverty Action Lab Executive Training: Evaluating Social Programs Spring 2009

Creative Commons Attribution-NonCommercial-Share Alike License

Unit 2: Probability and distributions Lecture 3: Normal distribution

Section 3.2 Least-Squares Regression

Cognitive testing. Quality assurance of the survey. Ăirts Briăis Luxembourg, 13 May, 2009

Confusion in Hospital Patients. Dr Nicola Lovett, Geratology Consultant OUH

Further Properties of the Priority Rule

Transcription:

Spring, 1999 1 Administrative Things Pooling Subjective Confidence Intervals Assignment 7 due Friday You should consider only two indices, the S&P and the Nikkei. Sorry for causing the confusion. Reading from Decision Traps for this week. Start on Axelrod for next week s classes (first few chapters). Thanks for the comments received on the mid-term evaluations. Today s Topics Understanding the properties of subjective confidence intervals Signaling and the trade-off of coverage versus length Modeling the random variation in your intervals; Cauchy errors Improving the quality of your intervals Pooling subjective confidence intervals Contrast with normal-based intervals Review from Last Time Introduced with a questionnaire soliciting your subjective intervals Subjective intervals, regardless of the nominal coverage, tend to cover about 40-50% of the time. Is this an indication of over-confidence or signaling knowledge? - Signal knowledge of an area by giving a very short interval. - Studies have found that your intervals would have to be about 15 times longer to be 95% intervals! Would you offer such a wide interval?

Spring, 1999 2 Follow-up questionnaire: What are your preferences for intervals? What do you look for in an interval? Comparison of Subjective Intervals to Scientific Intervals Coverage probabilities of subjective intervals Generally less than 50% regardless of setting and audience. - Description of the task (e.g., emphasis on the 95% coverage aspect) does not seem to affect the experimental results. - Training/practice does not seem to raise the average coverage; still hovers down in the 50% range. (Avoiding the trick of guaranteed performance on 19/20, with the other deliberately missed.) - Experts also tend to have 50% coverage, just shorter intervals. Domain knowledge leads to shorter intervals, but similar coverage. Science is not much better: confidence intervals for the speed of light Are 95% statistical intervals really any better, particularly those used to extrapolate time series (like economic forecasts)? Once we account for the uncertainty in the choice of model and the guesses that went into the selection of structure and data sources, the prediction intervals from statistical models are often not much better than subjective intervals. Perhaps it s just that the process is much more formalized. Why should scientific intervals have the same sort of coverage as a subjective confidence interval? - Systematic error versus random, idealized error. - Decompose error in judgement as a combination of the true value with added random noise as well as a random bias term, something like Estimate = µ + (bias) + (random error) The bias term is random and may have average zero, but has quite large variance. Bias term arises, for example, from cultural or functional limitations.

Spring, 1999 3 Models for the Coverage of Subjective Intervals To learn how to best use these intervals in decision making, we first need to look at the nature of the errors made with subjective intervals. Properties of the normality-based intervals used with data Normal intervals have form of Estimate ± 2 SE(estimate) If we look at a plot of the ratio center / length of a collection of normal intervals which were built in the idealized circumstances, we would end up with a collection of normal random values. Aside: Even though the length is not the SE, it s a multiple of this and so we still get a normal ratio. The exact distribution in idealized cases is known to be something called Student s t, but that s pretty close to normal unless you have a very small sample. If you look at a lot of normal-type intervals, you can find a hidden normal random variable underneath. This feature of a typical data-based confidence interval is the basis for our construction of the simulated versions of the information sources in the first part of the course. What are the analogous properties of subjective intervals? Pick a questionnaire item and record your standardized errors Z = (Center of interval Truth) / (Length of interval) How does the distribution of the class values for Z compare to the normal (using the normal quantile plot to check for normality)? We ll do an example from your results and show the answers in class. Given these results, how should we use subjective confidence intervals to generate samples for our information combining, simulation methods? Empirical results from literature (e.g. Yaniv and Foster 1997, J of Behavioral Decision Making) suggest a Cauchy model for the variation Plots of sample from the Cauchy distribution show many more outliers.

Spring, 1999 4 Cauchy variation is much more likely to produce wild, outlying values. The following plot shows a histogram of 2,000 standard Cauchy values, after trimming off the most extreme (±600) so we can see the histogram. Very thick tails in the distribution as shown in the quantile plot..001.01.05.10.25.50.75.90.95.99.999 100 0-100 -4-3 -2-1 0 1 2 3 4 Cauchy version of the empirical rule Empirical rule for the Cauchy is that 50% of probability lies within +1 to 1 for a standard cauchy. Here are the quantiles (percentiles) for our sample of Quantiles maximum minimum 2000 from the standard Cauchy. 100.0% 99.5% 97.5% 90.0% 10.0% 2.5% 0.5% 0.0% 645.22 46.52 16.15 3.20 1.06-0.02-1.01-3.27-13.58-75.55-740.61 Note that the two s (which between them include 50% of the data) are at 1 and +1, as claimed. In comparison, the probability between the 1 and +1 for a standard normal random variable is higher, at about 2/3. Simulating the information in a subjective interval

Spring, 1999 5 Will want to pool the information from several subjective intervals as well as combine these with normal theory, data-driven intervals. Suppose that your subjective interval for costs in $M for a new project is 3.5 to 6.5. How can you simulate a sample that goes with this subjective interval? (1) Treat the subjective interval as a 50% coverage Cauchy interval. (2) Convert the length of the interval into a scale for the Cauchy. Since the length is 3, that suggests a scale of 1.5 $M for the Cauchy. (3) Simulate a sample which is consistent with this subjective interval in JMP using the formula 5 + 1.5?cauchy where?cauchy is obtained from the set of random functions in the JMP calculator (in the same group with the normal). (4) Check your formula by looking at the s of the sample. They should be close to your endpoints. 6.4 4.9 3.4 Improving the Information in Your Intervals Host of important social psychology results when it comes to thinking about how to make better decisions. Curiously, this same literature is a source of predictions about human/computer interactions as well. Premise Better intervals = better assessment of the uncertainty and the possibilities. Get better intervals by avoiding some common pitfalls. More likely to avoid the pitfalls if you know what they are. Decision Traps gives many more examples of these with nice examples from the business world. Being aware of the possibilities

Spring, 1999 6 Questionnaire #3 Decision trees are an important a methodology for exploring the range of possibilities in a given problem or situation. Lack of context often colors our assessment of probabilities. Eg. What is the probability of a flood wiping out half of the population of California? Other types of biases in judgment Availability bias occurs when we use the available information that is convenient to get your hands on, but not always the most informative. - Egocentric (Who does the cleaning?) - Gambling - Bus arrivals (Which one comes first?) Accepting anecdotal evidence rather than making a more careful study of the problem. Pretty closely related to availability bias. - Which kills more: pigs or sharks? - CD players in airplanes - Smoking while buckled up Separating the random from systematic. This category of biases occurs when we forget Bonferroni and are fooled by the presence of false patterns. - Streaks in sports - Rewards versus punishment in manufacturing and education Anchoring our value in one problem to a number in an apparently unrelated context. - What s your area code? - In what year was Helen of Troy born? Functional bias in our view and attitudes to a problem - Silo mentality

Spring, 1999 7 - Rational? Source of rewards (like here at Wharton) Many other examples (with good business stories) in Russo and Schoemaker. Also see the references that they have included. Pooling Subjective Information Importance of brainstorming Generation of independent ideas is very hard to come by. How many dots were on that poster? Which is the longest line shown on the screen? Source of bias noted previously in subjective intervals, group-think. Arriving at consensus too quickly often leads to the wrong interval. Role for consultants Independent, outside source of information Watch out for strategic issues. In a one-time arrangement, the consultant may be rewarded for a short interval, even though it turns out to be wrong. He/she is often paid up front for knowledge that is only tested later. The rules are different in a continuing arrangement, since the consultant will have to come back and live with the estimate. Manipulating several independent subjective intervals Simulation procedure works exactly as before. (1) Generate a large simulated sample for each interval. Remember that the subjective intervals have about 50% coverage (unless you have some reason to believe otherwise, such as from tracking the source). (2) Use the subset selection method to pool them. Since the data are not normal, we can no longer use the slick regression trick.

Spring, 1999 8 Note that subjective intervals combine in a very different manner from how normal intervals combine, particularly when there are some gaps. Examples of Pooling Subjective Information Two similar sources Estimates of project costs are [3,5] for source A and [4,8] for source 2. Treat as two independent estimates Will have to work much harder to use the histogram to find those points that are close to zero. The easiest, most intuitive approach is to rescale the axis in the histogram, and finally pull out a small fraction of points near zero. Also, pay attention to the value you use for the increment option. It controls the sort of bins that will appear in the histogram. 2000 1000 0-1000 After rescaling, you can start to see the ones near the origin. With the range [-10, 10], I finally got about 100 that were close to zero. For these, the 5.4557 4.6034 3.8780 summaries of Source A and Source B give an interval of about [3.8, 5.4] (again, at 50% coverage. Stay away from the tails! They are too unstable even with the 3,000 simulated here.) Summary, from [3,5] and [4,8] we moved to [3.8,5.4]. As before, the addition of the longer interval somewhat improved (shortened) the better, more narrow initial interval from Source A, and shifted it to the right. Two contradictory, non-overlapping sources 5.3941 4.4694 3.8333 Recall the normal distribution examples. Two non-overlapping sources produce a net interval in the middle which is incompatible with either one.

Spring, 1999 9 - To understand this better, make it easy and pretend each source is an interval based on a sample of, say, 10 observations. The sample mean from one sample is very different from the sample mean in the other. If we were to average all 20 values together, we d get a pooled mean right in the middle. Important aside normal 50% intervals To get a normal 50% interval, the endpoints are determined as follows. From basic calculations, P(-.675 N(0,1).675) = 0.5. Hence, the length of a 50% interval represents 1.35 SDs. In general, to get a normal sample that represents the 50% interval [10,20], say, note that as with 95% intervals this interval implies that the mean of the normals is 15. To get the SD, the length of 10 = 1.35 SDs, so the associated SD is 7.41. Contradictory Cauchy sources For this example, suppose we have a Source A ([3,5] as before) but now a Source C (in place of B, to keep things simple). The interval for the new source is [12,16]. What happens when we combine these? With such incompatible sources, using 7500 simulated for A and C I got an interval with about 100 in the range. I found the Brush tool useful in this case for finding some observations within about +/-1 of zero. Here are the resulting intervals. 13.187 7.712 4.299 13.155 7.970 4.271 Starting from [3,5] (length 2) and [12,16] (length 4), we get an interval something like [4,13] with length 9. Compared to the normal intervals, Cauchy intervals handle incompatibility in a much different way, giving a very wide range that captures some of the bias variation. The variation/length always went down when pooling the normal sources. Next Time Introduction to game theory (tit-for-tat).