LSP 121 Math and Tech Literacy II. Stats Wrapup, Intro to Probability. Topics: Binning, Stats & Probability.


LSP 121 Math and Tech Literacy II
Stats Wrapup, Intro to Probability
Greg Brewster, DePaul University

Topics
Statistics Wrapup
- Binning variables
- Statistics that deceive
Intro to Probability
- Events
- Sample spaces
- Empirical probabilities
- Theoretical probabilities
- Subjective probabilities

Binning
Grouping values (and their frequencies) into bins so that meaningful data analysis is possible.
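As an illustration of binning, here is a minimal Python sketch (not the PASW procedure from the slides). The score list, the bin width of 10, and the helper name `bin_label` are all assumptions made for the example.

```python
from collections import Counter

def bin_label(score, width=10):
    """Return the label of the bin a score falls into, e.g. 73 -> '70-79'."""
    low = (score // width) * width
    return f"{low}-{low + width - 1}"

# Hypothetical exam scores in which every student got a different score.
scores = [52, 58, 61, 67, 68, 73, 74, 75, 79, 81, 84, 88, 90, 93, 97]

# Without binning, every value has frequency 1, so a histogram would be flat.
raw_counts = Counter(scores)

# With binning, the frequencies show where the scores cluster.
binned_counts = Counter(bin_label(s) for s in scores)

for label in sorted(binned_counts):
    print(label, "#" * binned_counts[label])
```

Each printed row is one bin, with one `#` per score in that bin, which is a rough text version of the histogram the slides describe.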

Why is it valuable?
If 100 students took an exam and everyone received a different score, it would be very difficult to analyze the results without binning. A histogram of the frequencies would be flat.

How does PASW support binning?
- PASW provides a Visual Binning transform.
- It allows you to create a new variable that groups values from an old variable.
- The old variable must be of type Scale.
- Then create a histogram with the binned variable.

Statistics Can Deceive
People who present statistical results often have an agenda. They want their results to prove a certain point. For this reason, they may present them in a way that favors their outcome.

Generalizing Sample Results
Statistics are often taken from a small sample and then generalized to a larger group.
Example: someone takes a survey of 100 DePaul students. Then they assume that these results reflect the opinions of all 23,000 DePaul students.

In order to generalize sample results to a larger group, you should:
- Make sure that the sample group is chosen randomly from the larger group in an unbiased way.
- Calculate a margin of error that specifies how much the larger group's results might differ from the smaller sample's results. The larger the sample group, the smaller the margin of error.

Example: "Our survey shows that 57% of customers prefer Brand A, while only 43% prefer Brand B." (Margin of error: +/- 8%)
Do these results prove anything? Not really. It could be that 57% - 8% = 49% prefer Brand A, while 43% + 8% = 51% prefer Brand B in the larger population.
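The Brand A / Brand B reasoning above can be sketched in Python. The slides do not say how the margin of error was calculated. The formula in `margin_of_error` is the common 95% approximation for a proportion from a simple random sample, and it is an assumption here, as are all function names.

```python
import math

def margin_of_error(p, n, z=1.96):
    """Approximate 95% margin of error for a sample proportion p with sample size n.

    Assumption: simple random sample and the usual normal approximation.
    """
    return z * math.sqrt(p * (1 - p) / n)

def interval(percent, moe):
    """Range of plausible population values, in percentage points."""
    return (percent - moe, percent + moe)

def overlaps(a, b):
    """True if two (low, high) intervals share at least one value."""
    return a[0] <= b[1] and b[0] <= a[1]

# Survey figures from the example: 57% vs 43%, margin of error +/- 8 points.
brand_a = interval(57, 8)
brand_b = interval(43, 8)
print(brand_a, brand_b, overlaps(brand_a, brand_b))

# A larger sample gives a smaller margin of error.
for n in (100, 1000, 10000):
    print(n, round(100 * margin_of_error(0.5, n), 1), "percentage points")
```

Because the two intervals overlap, the survey alone cannot show which brand is preferred in the larger population.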

Some Typical Problems
- Descriptive statistics
  - Biased choice of sample
  - Presentation of only the results you want
  - Grouping (e.g. Simpson's Paradox)
- Generalizing sample results to a larger group
  - Non-random sampling
  - Sample size
  - Margin of error may not be shown
- Incorrect choice of statistic

Simpson's Paradox
- From a data sampling viewpoint, the larger the data set, the better.
- Simpson's Paradox demonstrates that a great deal of care has to be taken when combining smaller data sets into a larger one.
- Sometimes the conclusion from the larger data set is the opposite of the conclusions from the smaller data sets.

Example: Simpson's Paradox
Baseball batting statistics for two players:

            First Half   Second Half   Total Season
Player A    .400         .250          .264
Player B    .350         .200          .336

How could Player A beat Player B in both halves individually, but then have a lower total season batting average?
If someone wants to deceive you, they could just publish the First Half and Second Half results and leave out the Total Season results.

Example Continued
We weren't told how many at bats each player had:

            First Half      Second Half     Total Season
Player A    4/10 (.400)     25/100 (.250)   29/110 (.264)
Player B    35/100 (.350)   2/10 (.200)     37/110 (.336)

Player A's dismal second half and Player B's great first half had higher weights than the other two values.

To avoid this problem
- Be very careful when you combine data subsets into a larger set.
- Be an informed consumer of statistics.
- Be skeptical.

Ask questions (courtesy of Dr. Jost)
- Who says so?
- How do they know?
- What's missing?
- Did somebody change the subject?
- Does it make sense?
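The arithmetic behind the batting example can be checked with a short Python sketch. The hits and at bats are taken from the table above. The variable and function names are illustrative only.

```python
from fractions import Fraction

# (hits, at bats) for the first and second half, from the batting table.
records = {
    "Player A": [(4, 10), (25, 100)],
    "Player B": [(35, 100), (2, 10)],
}

def half_averages(halves):
    """Batting average for each half separately."""
    return [Fraction(hits, at_bats) for hits, at_bats in halves]

def season_average(halves):
    """Season average: total hits over total at bats, not the mean of the halves."""
    total_hits = sum(hits for hits, _ in halves)
    total_at_bats = sum(at_bats for _, at_bats in halves)
    return Fraction(total_hits, total_at_bats)

for player, halves in records.items():
    per_half = [round(float(avg), 3) for avg in half_averages(halves)]
    print(player, per_half, round(float(season_average(halves)), 3))
```

The season average weights each half by its number of at bats, which is why Player A's 100 at bats at .250 dominate the 10 at bats at .400.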

Probability
How likely is it that something will happen?

Expressing Probability
Event: something that occurs.
For example: roll of dice, flip of coin, weather forecast, election.

Outcome: the result of an event. It has a value of interest to us.
For example: value on rolled die is 6, heads, rain, Bob Smith is treasurer.

Sample Space = All Possible Outcomes
All possible combinations of outcomes.
- Example: flip one coin. Sample space: T, H
- Example: roll a die. Sample space: 1, 2, 3, 4, 5, 6
- Example: flip two coins. Sample space: HH, HT, TH, TT
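The sample spaces listed above can also be generated rather than written out by hand. This is a small Python sketch using the standard library's `itertools.product`. The variable names are illustrative.

```python
from itertools import product

coin = ["H", "T"]
die = [1, 2, 3, 4, 5, 6]

# One event: the sample space is just the list of possible outcomes.
one_coin = list(coin)
one_die = list(die)

# Two coin flips: every combination of the first and the second outcome.
two_coins = ["".join(pair) for pair in product(coin, repeat=2)]

print(one_coin)
print(one_die)
print(two_coins)
```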

Size of Sample Space = Number of Possible Outcomes
If there are M possible outcomes for one event and N possible outcomes for a second event, the total number of possible outcomes (size of the sample space) for the two events combined is M x N.
How many outcomes are possible when you roll two dice?

Sample Space / Possible Outcomes (cont.)
- A restaurant menu offers two choices for an appetizer, five choices for a main course, and three choices for a dessert. How many different possible three-course meals are there?
- A college offers 12 natural science classes, 15 social science classes, 10 English classes, and 8 fine arts classes. How many possible four-class combinations are there?

Expressing Probability
- As a proportion: 0.0 to 1.0
- As a percentage: 0% to 100%
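The M x N counting rule described above extends to more than two events by multiplying all the counts together. The Python sketch below applies it to the three counting questions. For the college question it assumes that a four-class combination means one class from each of the four areas, which the slides do not state explicitly.

```python
import math
from itertools import product

def sample_space_size(*choices_per_event):
    """Multiply the number of possible outcomes of each event."""
    return math.prod(choices_per_event)

two_dice = sample_space_size(6, 6)            # two six-sided dice
meals = sample_space_size(2, 5, 3)            # appetizer, main course, dessert
schedules = sample_space_size(12, 15, 10, 8)  # assumes one class per area

# Cross-check the dice answer by listing every (first die, second die) pair.
enumerated_dice = len(list(product(range(1, 7), repeat=2)))

print(two_dice, enumerated_dice, meals, schedules)
```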

Probability of an Event Occurring
P(A) = probability of A occurring = proportion of the possible event(s) in which a particular outcome (A) occurs.
For example, rolling a die (event) has 6 possible outcomes (1, 2, 3, 4, 5, 6), so the sample space size is 6. The probability of any one of those outcomes is 1/6. For example, P(die roll = 3) = 1/6.

Probability of an Event Not Occurring
P(not A) = 1 - P(A)
For example, if the probability of rolling a 3 with one die is 1/6, then the probability of NOT rolling a 3 with one die is 1 - 1/6 = 5/6.

Types of Probability

Three Basic Types of Probability
- Theoretical, or a priori
- Empirical
- Subjective

Theoretical, or a Priori, Probability
Theoretical, or a priori, probability is based on situations in which all outcomes are known to be equally likely. Probabilities can be calculated before the event.
Examples: coin toss, dice roll, draw a card, spin a roulette wheel.

Theoretical Probability
P(A) = (number of ways A can occur) / (total number of outcomes, i.e. sample space size)
e.g.
- Probability of a head landing in a coin toss: 1/2
- Probability of rolling a 7 using two dice:
- Probability that a family of 3 will have two boys and one girl:
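One way to work the two open examples above is to list the sample space and count the outcomes in which the event occurs. The Python sketch below does that under stated assumptions: "a family of 3" is read as a family with three children, and each child is assumed equally likely to be a boy or a girl, independently of the others. The function name is illustrative.

```python
from fractions import Fraction
from itertools import product

def theoretical_probability(sample_space, is_event):
    """(number of ways the event can occur) / (sample space size).

    Only valid when all outcomes are equally likely.
    """
    favorable = sum(1 for outcome in sample_space if is_event(outcome))
    return Fraction(favorable, len(sample_space))

# Two dice: 36 equally likely (first, second) pairs.
two_dice = list(product(range(1, 7), repeat=2))
p_seven = theoretical_probability(two_dice, lambda roll: sum(roll) == 7)

# Three children: 8 equally likely boy/girl sequences (assumption, see above).
three_children = list(product("BG", repeat=3))
p_two_boys_one_girl = theoretical_probability(
    three_children, lambda kids: kids.count("B") == 2
)

# Complement rule: P(not A) = 1 - P(A).
p_not_seven = 1 - p_seven

print(p_seven, p_two_boys_one_girl, p_not_seven)
```

Counting sequences rather than totals matters here: BBG, BGB and GBB are three separate outcomes, all with two boys and one girl.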

Empirical Probability
Empirical probability is based on the results of observations or experiments. It is used to predict the probability of future events based on how often they happened in the past.

Example: Records indicate that a river has crested above flood level just four times in the past 2000 years. What is the empirical probability that the river will crest above flood level this year?
4/2000 = 1/500 = 0.002

Comparing Theoretical and Empirical Probabilities
The theoretical probability of a coin flip resulting in heads = 0.5.
But your actual results when flipping a coin (empirical probability) may not be exactly the same.

Law of Large Numbers
The theoretical probability of tossing a coin and landing tails is 0.5. But what if you toss it 5 times and you get HHHHH?
The Law of Large Numbers says that if you toss the coin a very large number of times, then the empirical probability will approach the theoretical probability.
http://bcs.whfreeman.com/ips4e/cat_010/applets/expectedvalue.html

Gambler's Fallacy
You are playing craps in Vegas. You have had a string of bad luck. You figure that since your luck has been so bad, it has to balance out and turn good.
Bad assumption! Each event is independent of the others and has nothing to do with the previous run, especially in the short run. It takes a very large number of tries before the Law of Large Numbers kicks in.
This is called the Gambler's Fallacy.
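Both ideas above can be explored with a simulated fair coin. The Python sketch below is an illustration under assumptions, not the applet linked in the slides. The seed, the flip counts, and the function names are arbitrary choices.

```python
import random

def empirical_tails(num_flips, rng):
    """Fraction of simulated fair-coin flips that land tails."""
    tails = sum(1 for _ in range(num_flips) if rng.random() < 0.5)
    return tails / num_flips

def tails_after_heads_streak(num_flips, streak, rng):
    """Fraction of tails among flips that come right after `streak` heads in a row."""
    flips = [rng.random() < 0.5 for _ in range(num_flips)]  # True means tails
    followers = [
        flips[i]
        for i in range(streak, num_flips)
        if not any(flips[i - streak:i])  # the previous `streak` flips were all heads
    ]
    return sum(followers) / len(followers)

rng = random.Random(121)  # fixed seed so that repeated runs give the same numbers

# Law of Large Numbers: the empirical probability settles near 0.5 as flips increase.
for n in (5, 100, 10_000, 100_000):
    print(n, empirical_tails(n, rng))

# Gambler's Fallacy: a run of heads does not make tails more likely on the next flip.
print(tails_after_heads_streak(200_000, 5, rng))
```

With only 5 flips the tails fraction can be far from 0.5, while the fraction after a streak of five heads stays close to 0.5 because each flip is independent.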

Subjective Probability
Subjective (personal) probability uses personal judgment or intuition.
For example: if you go to college today, you will be more successful in the future.
Often this probability is not quantified with any number.