Probability Models for Sampling


Chapter 18
May 24, 2013
Sampling Variability in One Act
Probability Histogram for $\hat{p}$

Act 1
A health study is based on a representative cross section of 6,672 Americans age 18 to 79. A sociologist wants to interview these people, but she only has money to interview 100 of them. To avoid bias, she is going to draw the sample at random. The following is her conversation with a statistician.
Soc: It seems like a lot of work to write all 6,672 names on separate tickets, put them in a box, and draw out 100 at random.
Stat: The computer has a random number generator. It picks a number at random from 1 to 6,672. The person with that code number goes into the sample. Then it picks a second number at random, different from the first. That's the second person to go into the sample. It keeps going like this until it gets 100 people.
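The statistician's procedure can be sketched in a few lines of Python. This is only an illustration, not the program used in the lecture: the function name `draw_sample` and the seed value are assumptions made for the example. `random.sample` draws without replacement, so no code number can appear twice.

```python
import random

POPULATION_SIZE = 6672  # people in the health study
SAMPLE_SIZE = 100       # interviews the sociologist can afford


def draw_sample(rng):
    """Return SAMPLE_SIZE distinct code numbers from 1..POPULATION_SIZE."""
    return rng.sample(range(1, POPULATION_SIZE + 1), SAMPLE_SIZE)


rng = random.Random(18)  # arbitrary seed, chosen only so the run is repeatable
sample = draw_sample(rng)
print(sorted(sample)[:10])  # first few code numbers in the sample
```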

Act 1
Stat: I drew a sample to show you. Look, this sample has 51 men and 49 women. That's pretty close.
Soc: Something isn't right. There are 3,091 men and 3,581 women in the survey: 46% men. I should have only 46 men in my sample.

Act 1
Stat: Not true. Remember, the people in the sample are drawn at random. Just by the luck of the draw, you could get too many men or too few. I had the computer take a lot of samples for you, 250 in all. The number of men ranged from a low of 34 to a high of 58. Only 17 samples out of the lot had exactly 46 men. Here's a histogram.
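A rough re-creation of the statistician's experiment is sketched below. The seed and the helper name `count_men` are assumptions for illustration, so the low, the high, and the number of samples with exactly 46 men will not match the figures quoted in the dialogue; only the general pattern of spread around 46 should be similar.

```python
import random

N_MEN, N_WOMEN = 3091, 3581
population = [1] * N_MEN + [0] * N_WOMEN  # 1 = man, 0 = woman


def count_men(rng, n):
    """Draw n people without replacement and count the men."""
    return sum(rng.sample(population, n))


rng = random.Random(2013)  # arbitrary seed for repeatability
counts = [count_men(rng, 100) for _ in range(250)]

print("low:", min(counts), "high:", max(counts))
print("samples with exactly 46 men:", counts.count(46))
```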

Act 1
Soc: So what stops the number from being exactly 46?
Stat: Chance variability. Each time the computer chooses a person for the sample, it gets either a man or a woman. So the number of men either stays the same or goes up by one. The chances are 46 to 54 each time.
Soc: What happens if we increase the size of the sample? Won't it come out more like the population?
Stat: Right. Suppose we multiply the sample size by four, to 400. I got the computer to draw another 250 samples, this time with 400 people in each sample. With some of these samples, the percentage of men is below 46%; with the others it is above. The low is 39% and the high is 54%. Here's a histogram.

Act 1 Stat: Multiplying the sample size by four cuts the likely size of chance error in the % by a factor of two.
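The factor of two can be checked numerically with the square-root formula for the S.D. of a sample proportion given in section 2.2 of these notes. The sketch below assumes that formula (exact for draws with replacement); the function name `sd_of_percentage` is made up for the example.

```python
import math


def sd_of_percentage(p, n):
    """Likely size of the chance error in the sample percentage,
    using sqrt(p(1 - p) / n) and converting to percentage points."""
    return 100 * math.sqrt(p * (1 - p) / n)


p = 3091 / 6672  # fraction of men in the population, about 46%
sd_100 = sd_of_percentage(p, 100)
sd_400 = sd_of_percentage(p, 400)

print("n = 100:", round(sd_100, 2), "percentage points")
print("n = 400:", round(sd_400, 2), "percentage points")
print("ratio:", round(sd_100 / sd_400, 2))
```

Because n sits under a square root, multiplying n by four divides the S.D. by the square root of four, which is two.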

Act 1
Soc: Can you explain what chance error is?
Stat: Sure, here is an equation:
$\hat{p} = p + \text{chance error}$
Of course the chance error in $\hat{p}$ will be different from sample to sample.
Soc: So if I let you draw a sample with this random number business, can you tell me exactly how big the chance error will be in $\hat{p}$ for my sample?
Stat: Not exactly, but I can tell you its likely, or typical, size.
Soc: O.K., good. But wait, there is a point I missed earlier.

Act 1
Soc: How can you have 250 different samples with 100 people each? I mean, 250 × 100 = 25,000 and we only started with 6,672 people.
Stat: Ah. The samples are all different, but they have some people in common. Look at the sketch. The inside of the circle is like the 6,672 people and each shaded strip is a sample. The strips are different, but they overlap.

2.0 What do we expect $\hat{p}$ to be?
The sociologist took a sample of size 100 from a population of 3,091 men and 3,581 women. The box can be represented as:
3,091 tickets marked 1 and 3,581 tickets marked 0, or equivalently 46% 1's and 54% 0's.
100 draws are made without replacement, and $\hat{p}$, the proportion of men in the sample, is calculated as:
$\hat{p} = (\text{sum of 1's drawn}) / 100$
On each draw, how likely is she to get a male? How many men should we expect in a sample of size 100?

2.1 What is the S.D. of $\hat{p}$?
Defn: The S.D. of $\hat{p}$ is
$\text{S.D. of } \hat{p} = \dfrac{\text{S.D. of numbers in box}}{\sqrt{100}}$
What is the standard deviation of the numbers in the box? What is the likely size of the chance error in $\hat{p}$?
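One way to explore these questions is to build the box explicitly and compute its S.D. directly. The sketch below does that and compares the result with the shortcut $\sqrt{p(1-p)}$, a standard fact about 0-1 boxes that is brought in here for illustration. The variable names are invented for the example.

```python
import math
import statistics

# The box: one ticket per person, 1 for a man and 0 for a woman.
box = [1] * 3091 + [0] * 3581

p = sum(box) / len(box)            # fraction of 1's in the box
sd_box = statistics.pstdev(box)    # S.D. of the numbers in the box
shortcut = math.sqrt(p * (1 - p))  # shortcut for a 0-1 box

n = 100
expected_men = n * p               # expected number of men in the sample
sd_p_hat = sd_box / math.sqrt(n)   # likely size of the chance error in p-hat

print("chance of a man on one draw:", round(p, 4))
print("expected number of men:", round(expected_men, 1))
print("S.D. of box:", round(sd_box, 4), "shortcut:", round(shortcut, 4))
print("S.D. of p-hat:", round(sd_p_hat, 4))
```

The S.D. of $\hat{p}$ comes out near 0.05, or about 5 percentage points.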

2.2 Expected Value and S.D. of $\hat{p}$
In summary:
The expected value of $\hat{p}$ is $p$.
The S.D. of $\hat{p}$ is $\sqrt{\dfrac{p(1-p)}{n}}$.
These formulas are exact when drawing with replacement. They are good approximations when drawing without replacement, provided the number of draws is small relative to the number of tickets in the box.

2.3 Example A university has 25,000 students of whom 10,000 are older than 25. The registrar draws a S.R.S. of 400 students. Find the expected value and the S.D. of the proportion of students in the sample who are older than 25.
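A possible way to work this example numerically is sketched below, using the formulas from section 2.2. The finite population correction factor $\sqrt{(N-n)/(N-1)}$ does not appear in these slides; it is a standard result included only to show how little drawing without replacement changes the answer when 400 is small relative to 25,000.

```python
import math

N = 25_000       # students at the university
older = 10_000   # students older than 25
n = 400          # size of the simple random sample

p = older / N
expected_value = p                    # expected value of p-hat
sd_p_hat = math.sqrt(p * (1 - p) / n) # with-replacement formula

# Assumption: standard correction factor for draws without replacement.
correction = math.sqrt((N - n) / (N - 1))
sd_corrected = correction * sd_p_hat

print("expected value:", expected_value)
print("S.D. (with replacement formula):", round(sd_p_hat, 4))
print("correction factor:", round(correction, 4))
print("S.D. (corrected):", round(sd_corrected, 4))
```

The expected value is 40% and the S.D. is roughly 2.4 to 2.5 percentage points either way.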

3.0 Using the Normal Curve
Recall the Central Limit Theorem: Suppose an experiment consists of drawing n tickets (at random, with replacement) from a box of numbered tickets and the outcome is the sum (or average, count, proportion) of the numbers drawn. Then the probability histogram of the sum (or average, count, proportion) will converge to the normal curve as n gets bigger.
Therefore the probability histogram of $\hat{p}$ will converge to (get closer and closer to) a normal curve as n gets bigger. The mean of this curve is $p$ and the S.D. of this curve is $\sqrt{\dfrac{p(1-p)}{n}}$.

3.1 Example
In a certain town, the telephone company has 100,000 subscribers. It plans to take a S.R.S. of 400 for a market research study. According to the Census data, 20% of the company's subscribers earn more than $50,000 a year. What is the chance that between 18% and 22% of the people in the sample will earn more than $50,000 a year?
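The normal approximation for this example might be computed as follows. The sketch assumes the with-replacement S.D. formula is adequate (400 draws from 100,000 tickets) and ignores any continuity correction. The helper `normal_cdf` is written for the example with `math.erf` and is not a library function.

```python
import math


def normal_cdf(z):
    """Area under the standard normal curve to the left of z."""
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))


p = 0.20   # fraction of subscribers earning more than $50,000
n = 400    # sample size

sd_p_hat = math.sqrt(p * (1 - p) / n)  # S.D. of p-hat

# Convert the endpoints 18% and 22% to standard units.
z_low = (0.18 - p) / sd_p_hat
z_high = (0.22 - p) / sd_p_hat

chance = normal_cdf(z_high) - normal_cdf(z_low)
print("S.D. of p-hat:", round(sd_p_hat, 4))
print("standard units:", round(z_low, 2), "to", round(z_high, 2))
print("approximate chance:", round(chance, 3))
```

The interval runs from one S.D. below the expected value to one S.D. above it, so the chance is about 68%.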

3.2 Example
In 1965, the U.S. Supreme Court decided the case of Swain v. Alabama. Swain, a black man, was convicted in Talladega County of raping a white woman and sentenced to death. The case was appealed on the grounds that there were no blacks on the jury. At that time in Talladega County there were 16,000 men, of whom 26% were black. If 100 people were chosen at random from this population, what is the chance that 8 or fewer would be black? What do you conclude?
Aside: The appeal was denied on the grounds that there were 8 blacks on a panel of 100 from which the jury was selected.
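The chance asked for here can be estimated in two ways, sketched below. Both treat the 100 draws as if made with replacement, an assumption that seems reasonable because 100 is small relative to 16,000. The first sums binomial probabilities exactly; the second uses the normal curve without a continuity correction. The variable names are illustrative.

```python
import math

n = 100      # panel size
p = 0.26     # fraction of the 16,000 men who were black
cutoff = 8   # "8 or fewer"

# Exact binomial chance of 8 or fewer, assuming draws with replacement.
exact_chance = sum(
    math.comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(cutoff + 1)
)

# Normal approximation for the count.
expected = n * p                  # expected number, 26
sd = math.sqrt(n * p * (1 - p))   # S.D. of the count
z = (cutoff - expected) / sd      # 8 in standard units
normal_chance = 0.5 * (1 + math.erf(z / math.sqrt(2)))

print("expected count:", expected, "S.D.:", round(sd, 2))
print("8 in standard units:", round(z, 2))
print("exact binomial chance:", exact_chance)
print("normal approximation:", normal_chance)
```

Under either calculation, 8 is more than four S.D.s below the expected count, so the chance is very small.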