Chapter 8 Estimating with Confidence. Lesson 2: Estimating a Population Proportion


Conditions for Estimating p
These are the conditions you are expected to check before calculating a confidence interval:
1. Random: The data come from a well-designed random sample or randomized experiment.
2. 10%: When sampling without replacement, check that the sample size n is no more than 10% of the population size N.
3. Large Counts: Both n(p-hat) and n(1 - p-hat) are at least 10.

Example 1: Check that the conditions for constructing a confidence interval for p are met.
a. Glenn wonders what proportion of the students at his college believe that tuition is too high. He interviews an SRS of 50 of the 2400 students at his college. Thirty-eight of those interviewed think tuition is too high.
Random? Yes: the data come from an SRS.
10%? Yes: n = 50 is less than 10% of the population of 2400 students at his college (10% of 2400 is 240).
Large Counts? Yes: n(p-hat) = 38 and n(1 - p-hat) = 12 are both at least 10.
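The two numerical checks in part (a) can be sketched in a few lines of Python. This is only an illustration: the function name `check_conditions` and its arguments are made up for this sketch, and the Random condition cannot be verified by arithmetic, so it is not included.

```python
def check_conditions(successes, n, population_size=None):
    """Return the Large Counts and 10% checks as a dict of booleans.

    Hypothetical helper for illustration; not part of any library.
    """
    failures = n - successes  # same as n * (1 - p-hat)
    checks = {"large_counts": successes >= 10 and failures >= 10}
    # The 10% condition only applies when sampling without replacement
    # from a finite population whose size is known.
    if population_size is not None:
        checks["ten_percent"] = n <= 0.10 * population_size
    return checks


# Glenn's survey: 38 of 50 students sampled from 2400.
glenn = check_conditions(successes=38, n=50, population_size=2400)
print(glenn)
```

Both entries come back True for Glenn's data, matching the checks above.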

Example 1: Check that the conditions for constructing a confidence interval for p are met.
b. The small round holes you often see in sea shells were drilled by other sea creatures, who ate the former dwellers of the shells. Whelks often drill into mussels, but this behavior appears to be more or less common in different locations. Researchers collected whelk eggs from the coast of Oregon, raised the whelks in the laboratory, then put each whelk in a container with some delicious mussels. Only of 98 whelks drilled into a mussel. The researchers want to estimate the proportion p of Oregon whelks that will spontaneously drill into mussels.
Random? Maybe: we do not know whether the eggs were a random sample.
10%? Yes: the sample size (98) is less than 10% of all Oregon whelks.
Large Counts? No: n(p-hat) is less than 10.

What happens if one of the conditions is violated?
Random: If the data do not come from a random sample or a randomized experiment, there is no point in doing inference, because this violation limits our ability to draw any conclusions about the population.
Large Counts: When the Large Counts condition is violated, the actual capture rate will be lower than the stated confidence level.
10%: If the 10% condition (sampling without replacement) is violated, the confidence intervals are longer than they need to be, so the actual capture rate is greater than the stated confidence level.

General formula for constructing a confidence interval for an unknown population proportion p
statistic ± (critical value) × (standard deviation of statistic)
Statistic: The sample proportion p-hat is the statistic we use to estimate p.
Critical value: z* is the value that marks off the central area C under the standard Normal curve. 90%: z* = 1.645. 95%: z* = 1.96. 99%: z* = 2.576.
Standard deviation of the statistic: Because we do not know the value of p, we replace p with p-hat in the formula for the standard deviation of the sampling distribution of p-hat. Though the formula is otherwise the same, we call the result the standard error (SE). It describes how close the sample proportion p-hat will typically be to the population proportion p.
Formula: $\hat{p} \pm z^* \sqrt{\dfrac{\hat{p}(1-\hat{p})}{n}}$

How do I find the critical values?
1. Draw a picture of a Normal distribution. Label the area within the confidence interval. Identify the area in each individual tail.
2. From your diagram, identify the total area to the left of the positive z-value.
3. Use your table: find the z that corresponds to this total area to the left of the positive z.

Example 2: Find the critical value z* for an 80% confidence interval. Assume that the Large Counts condition is met.
The central area is 0.80, so each tail has area 0.10 and the area to the left of z* is 0.90. z* = 1.28

Example 3: Find the critical value z* for a 96% confidence interval. Assume that the Large Counts condition is met.
The central area is 0.96, so each tail has area 0.02 and the area to the left of z* is 0.98. z* = 2.05
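If software is available instead of a table, the same three steps can be sketched with Python's standard library. The helper name `z_star` is an assumption made for this sketch; `NormalDist().inv_cdf` is the standard-library inverse of the standard Normal cumulative distribution.

```python
from statistics import NormalDist


def z_star(confidence_level):
    """Critical value z* for a given confidence level (e.g. 0.95)."""
    tail_area = (1 - confidence_level) / 2   # step 1: area in each tail
    area_to_left = 1 - tail_area             # step 2: area left of +z*
    return NormalDist().inv_cdf(area_to_left)  # step 3: "table lookup"


for level in (0.80, 0.90, 0.95, 0.96, 0.99):
    print(f"{level:.0%}: z* = {z_star(level):.3f}")
```

Rounded to two decimal places, the 80% and 96% values agree with the table answers in Examples 2 and 3.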

Confidence Intervals in a 4-Step Process: Statistics Problems Demand Consistency
1. State: What parameter do you want to estimate, and at what confidence level?
2. Plan: Identify the appropriate inference method and check conditions.
3. Do: If the conditions are met, perform calculations.
4. Conclude: Interpret your interval in the context of the problem.

Example 4: In her first-grade social studies class, Jordan learned that 70% of Earth's surface was covered in water. She wondered if this was really true and asked her dad for help. To investigate, he tossed an inflatable globe to her 50 times, being careful to spin the globe each time. When she caught it, he recorded where her right finger was pointing. In 50 tosses, her finger was pointing to water 33 times. Construct and interpret a 95% confidence interval for the proportion of Earth's surface that is covered in water.

Example 4 Solutions
State: We want to estimate p = the true proportion of Earth's surface that is covered in water, with 95% confidence.
Plan: Use a one-sample z interval for p if the conditions are met.
Random? Yes: the globe was spun on each toss, so the 50 locations can be treated as a random sample of points on Earth's surface.
10%? No need to check: the sampling was done with replacement.
Large Counts? n(p-hat) = 33 and n(1 - p-hat) = 17 are both at least 10.
Do: p-hat = 33/50 = 0.66, so the interval is $0.66 \pm 1.96 \sqrt{\dfrac{0.66(0.34)}{50}} = 0.66 \pm 0.131$, that is, 0.529 to 0.791.
Conclude: We are 95% confident that the interval from 0.529 to 0.791 captures the true proportion of Earth's surface that is covered in water. This is consistent with the claim that 70% of Earth's surface is covered in water, because 0.70 is one of the plausible values in the interval.
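The "Do" step can be reproduced with a short Python sketch. The function name `one_proportion_z_interval` is made up for this illustration, and the critical value 1.96 is passed in by hand, as it would be read from a table.

```python
from math import sqrt


def one_proportion_z_interval(successes, n, z_star):
    """Return (lower, upper) for p-hat ± z* × standard error."""
    p_hat = successes / n
    standard_error = sqrt(p_hat * (1 - p_hat) / n)
    margin_of_error = z_star * standard_error
    return p_hat - margin_of_error, p_hat + margin_of_error


# Jordan's globe tosses: 33 "water" results in 50 tosses, 95% confidence.
low, high = one_proportion_z_interval(successes=33, n=50, z_star=1.96)
print(f"{low:.3f} to {high:.3f}")
```

The printed endpoints round to 0.529 and 0.791, and 0.70 lies between them.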

Finally: Sample size. What determines how big a sample to use?
The margin of error (ME) you are willing to accept determines the minimum sample size you will need. The ME involves the sample proportion of successes, p-hat, which is unknown before the sample is taken, so you need a guess for p-hat. There are two options:
1. Use a guess for p-hat based on past experience or a previous study.
2. Use p-hat = 0.5 as the guess. The ME is largest at this value, so this choice gives a conservative estimate of the sample size.

Sample size for desired margin of error
To determine the sample size n that will yield a C% confidence interval for a population proportion p with a maximum margin of error ME, solve the following inequality for n:
$z^* \sqrt{\dfrac{\hat{p}(1-\hat{p})}{n}} \le ME$
where p-hat is a guessed value for the sample proportion. The margin of error will always be less than or equal to ME if you use a p-hat value of 0.5.

Example 5: Suppose that you want to estimate p, the true proportion of students at your school who have a tattoo, with 95% confidence and a margin of error of no more than 0.10. How large a sample is needed?

Example 5 Solution
Identify variables:
p-hat: we don't know, so use 0.5
z*: 1.96
ME: 0.10
Solve for n: $1.96 \sqrt{\dfrac{0.5(0.5)}{n}} \le 0.10$ gives $n \ge \left(\dfrac{1.96}{0.10}\right)^2 (0.5)(0.5) = 96.04$, so round up to n = 97.
Sentence: We need to survey at least 97 students to estimate the true proportion of students with a tattoo with 95% confidence and a margin of error of at most 0.10.
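The algebra in this solution can be checked with a small Python sketch. The helper name `minimum_sample_size` is an assumption for this illustration; it solves the margin-of-error inequality for n and always rounds up, since rounding down would let the margin of error exceed the target.

```python
from math import ceil


def minimum_sample_size(margin_of_error, z_star, p_guess=0.5):
    """Smallest n with z* × sqrt(p(1 - p) / n) <= margin_of_error."""
    n_exact = (z_star / margin_of_error) ** 2 * p_guess * (1 - p_guess)
    return ceil(n_exact)  # always round up to the next whole person


# Tattoo survey: ME of at most 0.10 at 95% confidence, conservative guess 0.5.
print(minimum_sample_size(margin_of_error=0.10, z_star=1.96))
```

Trying a guess other than 0.5 (for example `p_guess=0.3`) gives a smaller n, which is why 0.5 is called the conservative choice.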

What is my mystery μ?
What do we know about the population distribution?
Normally distributed
σ = 20
N =
μ = ?

Where to start? Rarely do we know/find out what the mean of a population is. But what would you use to estimate what you think μ is? A point estimator is a statistic that provides an estimate of a population parameter. The value of that statistic from a sample is called a point estimate.

For example
Parameter       | Point estimator (statistic)   | Point estimate
μ               | x-bar                         | The calculated sample mean
σ               | s                             | The standard deviation of the sample
p               | p-hat                         | The sample proportion of successes
Any parameter   | The corresponding statistic   |

Example 1: In each of the following settings, determine the point estimator you would use and calculate the value of the point estimate.
a. The makers of a new golf ball want to estimate the median distance the new balls will travel when hit by a mechanical driver. They select a random sample of 10 balls and measure the distance each ball travels after being hit by the mechanical driver. Here are the distances (in yards):
285 286 284 285 282 284 287 290 288 285
Point estimator: The sample median, to estimate the population median.
Point estimate: The sample median is 285 yards.

Example 1: In each of the following settings, determine the point estimator you would use and calculate the value of the point estimate.
b. The golf ball manufacturer would also like to investigate the variability of the distance traveled by the golf balls by estimating the interquartile range.
Point estimator: The sample IQR, to estimate the population IQR.
Point estimate: IQR = Q3 - Q1 = 287 - 284 = 3 yards.
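The point estimates in parts (a) and (b) can be reproduced with a short Python sketch. It assumes the "median of each half" rule for quartiles, which matches the answer above; other software may use a different quartile rule and report a slightly different IQR. The helper name `quartiles` is made up for this illustration.

```python
from statistics import median

# Distances (in yards) from the golf ball sample.
distances = [285, 286, 284, 285, 282, 284, 287, 290, 288, 285]


def quartiles(values):
    """Q1 and Q3 as the medians of the lower and upper halves."""
    ordered = sorted(values)
    half = len(ordered) // 2
    lower = ordered[:half]
    upper = ordered[-half:]  # leaves out the middle value when n is odd
    return median(lower), median(upper)


q1, q3 = quartiles(distances)
print("median:", median(distances))
print("IQR:", q3 - q1)
```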

Example 1: In each of the following settings, determine the point estimator you would use and calculate the value of the point estimate.
c. The math department wants to know what proportion of its students own a graphing calculator, so they take a random sample of 100 students and find that 28 own a graphing calculator.
Point estimator: The sample proportion p-hat, to estimate the population proportion p.
Point estimate: p-hat = 28/100 = 0.28

Back to the mystery μ
Population distribution: Normally distributed, σ = 20, μ = ?
Sampling distribution of x-bar: n = 16, point estimator for μ: x-bar, standard deviation $\sigma_{\bar{x}} = \sigma/\sqrt{n} = 20/\sqrt{16} = 5$
The question is, how would the sample mean x-bar vary if we took many SRSs of size 16 from this same population?

Definitions
Confidence interval: A C% confidence interval gives an interval of plausible values for a parameter. The interval is calculated from the data and has the form point estimate ± margin of error.
Margin of error: The difference between the point estimate and the true parameter value will be less than the margin of error in C% of all samples.
Confidence level, C: The confidence level C gives the overall success rate of the method for calculating the confidence interval. That is, in C% of all possible samples, the method would yield an interval that captures the true parameter value.
How to interpret a confidence interval: To interpret a C% confidence interval for an unknown parameter, say, "We are C% confident that the interval from _____ to _____ captures the [parameter in context]."

Interpreting Confidence Intervals with Caution
Rule #1: The confidence level tells how likely it is that the method will produce an interval that captures the population parameter if we use it many times. It is the overall capture rate. It does not tell us the chance that one particular interval captures the parameter. The interval itself provides a set of plausible values for the parameter.

Interpreting Confidence Intervals with Caution
Rule #2: The confidence level is NOT the probability that the parameter has been captured by a particular interval. Before the interval is calculated, we have a 95% chance (for example) of getting a sample mean that is within about 2 standard deviations of μ (standard deviations of the sampling distribution of x-bar), which would lead to a confidence interval that captures μ. After the confidence interval is constructed, it either does or does not contain μ, which corresponds to a probability of 100% (the interval contains μ) or 0% (the interval does not contain μ).

Interpreting Confidence Intervals with Caution
Rule #3: When interpreting a confidence interval, make it clear that you are estimating a parameter (a value that describes the population), not a statistic (a value that describes the sample).
Yes: Based on the sample, we believe that the population mean is somewhere between _____ and _____.
Not so much: We are 95% confident that the interval from _____ to _____ contains the sample proportion.

Interpreting Confidence Intervals with Caution
Rule #4: Talk in the future tense, not in the past. The interval describes what the whole population would say, not only what the people in the sample already said.
No: We are 95% confident that the interval from _____ to _____ captures the true proportion of US adults who said _____.
Yes: We are 95% confident that the interval from _____ to _____ captures the true proportion of US adults who would say _____.

Example 2 A large company is concerned that many of its employees are in poor physical condition, which can result in decreased productivity. To determine how many steps each employee takes per day, on average, the company provides a pedometer to 50 randomly selected employees to use for one 24-hour period. After collecting the data, the company statistician reports a 95% confidence interval of 4547 steps to 8473 steps. a. Interpret the confidence interval. b. What is the point estimate that was used to create the interval? What is the margin of error? c. Recent guidelines suggest that people aim for 10,000 steps per day. Is there convincing evidence that the employees of this company are not meeting the guideline, on average? Explain.

Example 2 Solutions
a. Interpret the confidence interval.
We are 95% confident that the interval from 4547 to 8473 captures the true mean number of steps taken per day for employees at this company.
b. What is the point estimate that was used to create the interval? What is the margin of error?
Point estimate: (4547 + 8473)/2 = 6510 steps (the midpoint of the interval). Margin of error: 8473 - 6510 = 1963 steps.
c. Recent guidelines suggest that people aim for 10,000 steps per day. Is there convincing evidence that the employees of this company are not meeting the guideline, on average? Explain.
There is convincing evidence that the employees are not meeting the guideline, on average, because all of the values in the interval are less than 10,000 steps.
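Part (b) works backward from the endpoints of an interval of the form point estimate ± margin of error. A minimal Python sketch of that arithmetic follows; the function name `estimate_and_margin` is made up for this illustration.

```python
def estimate_and_margin(lower, upper):
    """Recover the point estimate and margin of error from an interval."""
    point_estimate = (lower + upper) / 2   # midpoint of the interval
    margin_of_error = (upper - lower) / 2  # half the interval's width
    return point_estimate, margin_of_error


# The pedometer interval: 4547 steps to 8473 steps.
print(estimate_and_margin(4547, 8473))
```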

Let's explore confidence intervals

Example 3: How much does the fat content of Brand X hot dogs vary? To find out, researchers measured the fat content (in grams) of a random sample of 10 Brand X hot dogs. A 95% confidence interval for the population standard deviation σ is 2.84 to 7.55 grams.
a. Interpret the confidence interval.
b. Interpret the confidence level.
c. True or false: the interval from 2.84 to 7.55 has a 95% chance of containing the actual population standard deviation σ. Justify.

Example 3 Solutions
a. Interpret the confidence interval.
We are 95% confident that the interval from 2.84 to 7.55 grams captures the population standard deviation of the fat content of Brand X hot dogs.
b. Interpret the confidence level.
If we took many random samples of 10 hot dogs and built an interval from each one, about 95% of those confidence intervals would capture the true standard deviation of the fat content of Brand X hot dogs.
c. True or false: the interval from 2.84 to 7.55 has a 95% chance of containing the actual population standard deviation σ. Justify.
False: the interval either does or does not contain the population standard deviation (a probability of 1 or 0, respectively).

Exploring Confidence Intervals Play around with the app, and be ready to summarize: 1. Explain how changing the confidence level affects the confidence interval. 2. Explain how changing the sample size affects the length of the confidence interval. 3. Does increasing the sample size increase the capture rate (percent hit)?

My two cents solution
1. Explain how changing the confidence level affects the confidence interval.
Increasing the confidence level widens the confidence interval. Our interval of plausible values for the parameter depends on the level we choose: the wider the interval, the less precise the estimate, but the more likely it is that the true parameter will be captured.
2. Explain how changing the sample size affects the length of the confidence interval.
The larger the sample size, the shorter the interval, which means a more precise estimate of the parameter.
3. Does increasing the sample size increase the capture rate (percent hit)?
No. The sample size does not affect the capture rate. Increasing the sample size does NOT make us more confident; it just makes for a more precise estimate.
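For readers without the app, the third answer can be explored with a small simulation sketch in Python. Everything here is an assumption made for illustration: the population mean of 100 is an arbitrary stand-in for the unknown μ, σ = 20 and n = 16 echo the mystery-mean slides, σ is treated as known, and the function name `capture_rate` is hypothetical.

```python
import random
from math import sqrt
from statistics import NormalDist, fmean


def capture_rate(mu, sigma, n, confidence_level, repetitions=5000, seed=1):
    """Fraction of simulated intervals x-bar ± z*·σ/√n that contain mu."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    z_star = NormalDist().inv_cdf(1 - (1 - confidence_level) / 2)
    margin_of_error = z_star * sigma / sqrt(n)
    hits = 0
    for _ in range(repetitions):
        sample_mean = fmean(rng.gauss(mu, sigma) for _ in range(n))
        # The interval captures mu exactly when x-bar is within ME of mu.
        if abs(sample_mean - mu) <= margin_of_error:
            hits += 1
    return hits / repetitions


for n in (16, 64):
    print(n, capture_rate(mu=100, sigma=20, n=n, confidence_level=0.95))
```

Both sample sizes should give a capture rate close to 0.95; only the interval length changes with n.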

Calculating a confidence interval
Generally, the confidence interval for estimating a population parameter has the form
statistic ± (critical value) × (standard deviation of statistic)
The critical value is the number of standard deviations that makes the interval wide enough to have the stated capture rate. The product of the critical value and the standard deviation of the statistic is the margin of error.

Margin of error
The margin of error depends on
1. The critical value: greater confidence requires a larger critical value.
2. The standard deviation of the statistic: this depends on the sample size n. Larger samples give more precise estimates, which means less variability in the statistic.
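Both effects can be seen numerically in a brief Python sketch. It assumes a sample mean with known σ = 20, as in the mystery-mean slides, so the standard deviation of the statistic is σ/√n; the function name `margin_of_error` is made up for this illustration.

```python
from math import sqrt
from statistics import NormalDist


def margin_of_error(sigma, n, confidence_level):
    """Critical value times the standard deviation of x-bar."""
    z_star = NormalDist().inv_cdf(1 - (1 - confidence_level) / 2)
    return z_star * sigma / sqrt(n)


for level in (0.90, 0.95, 0.99):
    for n in (16, 64):
        me = margin_of_error(20, n, level)
        print(f"{level:.0%}, n = {n}: ME = {me:.2f}")
```

Within each confidence level, quadrupling n from 16 to 64 cuts the margin of error in half; within each sample size, raising the confidence level makes it larger.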