Chapter 8 Estimating with Confidence. Lesson 2: Estimating a Population Proportion



What proportion of the beads are yellow? In your groups, you will find a 95% confidence interval for the true proportion of yellow beads. Remember: you will be using the data from your own sample, and since every group gets a different sample statistic, we have to apply the ideas of sampling distributions. What are those again?

What are the big ideas about sampling distributions for proportions? SOCS: Shape: (skip O) Center: Spread:

So what proportion of beads are yellow?
1. Each group will take a simple random sample of beads. Separate the beads into two groups: those that are yellow and those that are not.
2. Determine the point estimate p̂ for the unknown population proportion p of yellow beads in the container.
3. In your groups, find a 95% confidence interval for the true proportion p of yellow beads. Hint: does the number 95% sound familiar?
4. Compare the results with the other groups.

Conditions for Estimating p
These are the conditions you are expected to check before calculating a confidence interval:
1. Random: The data come from a well-designed random sample or randomized experiment.
2. Large Counts: Both n·p̂ and n(1 − p̂) are at least 10. What does this ensure?

Example 1: Check that the conditions for constructing a confidence interval for p are met. a. Glenn wonders what proportion of the students at his college believe that tuition is too high. He interviews an SRS of 50 of the 2400 students at his college. Thirty-eight of those interviewed think tuition is too high. Random? Yes, SRS. Large Counts? Yes: n·p̂ = 38 and n(1 − p̂) = 12 are both at least 10.

Example 1 (continued): b. The small round holes you often see in sea shells were drilled by other sea creatures, who ate the former dwellers of the shells. Whelks often drill into mussels, but this behavior appears to be more or less common in different locations. Researchers collected whelk eggs from the coast of Oregon, raised the whelks in the laboratory, then put each whelk in a container with some delicious mussels. Only 9 of 98 whelks drilled into a mussel. The researchers want to estimate the proportion p of Oregon whelks that will spontaneously drill into mussels. Random? Maybe: we do not know if the eggs were a random sample. Large Counts? No: n·p̂ = 9 is less than 10.
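The Large Counts check in Example 1 amounts to counting successes and failures, since n·p̂ is the number of successes and n(1 − p̂) is the number of failures. Here is a minimal Python sketch; the function name `large_counts_ok` is made up for illustration and is not part of the lesson.

```python
def large_counts_ok(successes: int, n: int, threshold: int = 10) -> bool:
    """Return True when both n*p-hat and n*(1 - p-hat) are at least `threshold`."""
    failures = n - successes  # n*(1 - p-hat) is simply the count of failures
    return successes >= threshold and failures >= threshold


tuition_ok = large_counts_ok(38, 50)  # Example 1a: 38 successes, 12 failures
whelks_ok = large_counts_ok(9, 98)    # Example 1b: 9 successes, 89 failures
print("Tuition survey passes Large Counts:", tuition_ok)
print("Whelk study passes Large Counts:", whelks_ok)
```

This checks only the Large Counts condition. The Random condition depends on how the data were collected and cannot be checked from the counts.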

What happens if one of the conditions is violated? If the data do not come from an SRS or a randomized experiment, then there's no point in inference, as this violation limits our ability to form any conclusions about the population. When the Large Counts condition is violated, the actual capture rate will typically be lower than the stated confidence level.
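A small simulation can illustrate the claim about capture rates. The sketch below is only illustrative: the population proportions, sample sizes, repetition count, and seed are arbitrary choices, not values from the lesson. It builds many 95% intervals of the form p̂ ± z*·SE and counts how often they capture the true p.

```python
import random
from statistics import NormalDist


def capture_rate(p: float, n: int, conf: float = 0.95,
                 reps: int = 5000, seed: int = 1) -> float:
    """Fraction of simulated intervals p-hat +/- z* * SE that contain the true p."""
    rng = random.Random(seed)
    z_star = NormalDist().inv_cdf((1 + conf) / 2)
    hits = 0
    for _ in range(reps):
        successes = sum(rng.random() < p for _ in range(n))
        p_hat = successes / n
        margin = z_star * (p_hat * (1 - p_hat) / n) ** 0.5
        if p_hat - margin <= p <= p_hat + margin:
            hits += 1
    return hits / reps


# Assumed settings: n*p = 2 (Large Counts fails) versus n*p = 100 (Large Counts holds).
small_counts_rate = capture_rate(p=0.10, n=20)
large_counts_rate = capture_rate(p=0.50, n=200)
print(f"Large Counts violated: capture rate about {small_counts_rate:.3f}")
print(f"Large Counts met:      capture rate about {large_counts_rate:.3f}")
```

With settings like these, the first capture rate should come out noticeably below 95%, while the second should sit close to it.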

General formula for constructing a confidence interval for an unknown population proportion p:
statistic ± (critical value) · (standard deviation of statistic)
Statistic: the sample proportion p̂ is the statistic we use to estimate p.
Critical value: the z* value that marks off the central area equal to the confidence level. 90%: z* = 1.645; 95%: z* = 1.96; 99%: z* = 2.576.
Standard deviation of statistic: also known here as the standard error (SE) of p̂, an estimate of the standard deviation of the sampling distribution of p̂. Formula:
$$SE_{\hat{p}} = \sqrt{\frac{\hat{p}(1-\hat{p})}{n}}$$
The standard error of p̂ estimates how much p̂ typically varies from p.

How do I find the critical values? 1. Draw a picture of a normal distribution. Label the area within the confidence interval. Identify the area in the individual tails. 2. From your diagram, identify the amount of area to the left of the positive z-value. 3. Use your table: Find the z that corresponds to this amount of total area (to the left of the positive z)

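Example 2: Find the critical value z* for an 80% confidence interval. Assume that the Large Counts condition is met. An 80% central area leaves 10% in each tail, so the area to the left of z* is 0.90, which gives z* = 1.28.

Example 3: Find the critical value z* for a 96% confidence interval. Assume that the Large Counts condition is met. A 96% central area leaves 2% in each tail, so the area to the left of z* is 0.98, which gives z* = 2.05.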
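The table lookup described above can be cross-checked in software. The sketch below uses `NormalDist().inv_cdf` from Python's standard `statistics` module. The helper name `critical_value` is a made-up label for illustration.

```python
from statistics import NormalDist


def critical_value(conf: float) -> float:
    """z* such that the central area under the standard normal curve equals `conf`."""
    area_left = (1 + conf) / 2  # central area plus the lower tail
    return NormalDist().inv_cdf(area_left)


for conf in (0.80, 0.90, 0.95, 0.96, 0.99):
    print(f"{conf:.0%} confidence: z* = {critical_value(conf):.3f}")
```

Rounded to two decimal places, the 80% and 96% results agree with the table answers in Examples 2 and 3.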

How to calculate a confidence interval for p
When the Random and Large Counts conditions are met, a C% confidence interval for the population proportion p is
point estimate ± margin of error:
$$\hat{p} \pm z^* \sqrt{\frac{\hat{p}(1-\hat{p})}{n}}$$
where z* is the critical value for the standard normal curve with C% of its area between −z* and +z*. Your interval will look like: ___ < p < ___

Example 4 According to a recent study by the Annenberg Foundation, only 36% of adults in the United States could name all three branches of government. This was based on a survey given to a random sample of 1416 adults. 1. Show that the conditions for calculating a confidence interval for a proportion are satisfied. 2. Calculate a 99% confidence interval for the proportion of all U.S. adults who could name all three branches of government. 3. Interpret the interval from Question 2.

Confidence Intervals in a 4 Step Process: Statistics Problems Demand Consistency 1. State: What parameter do you want to estimate, and at what confidence level? 2. Plan: Identify the appropriate inference method: check conditions. 3. Do: If the conditions are met, perform calculations. 4. Conclude: Interpret your interval in the context of the problem.

Example 4 In her first-grade social studies class, Jordan learned that 70% of Earth's surface was covered in water. She wondered if this was really true and asked her dad for help. To investigate, he tossed an inflatable globe to her 50 times, being careful to spin the globe each time. When she caught it, he recorded where her right finger was pointing. In 50 tosses, her finger was pointing to water 33 times. Construct and interpret a 95% confidence interval for the proportion of Earth's surface that is covered in water.

Example 4 Solutions
State: We want to estimate p = the true proportion of Earth's surface that is covered in water with 95% confidence.
Plan: Use a one-sample z interval for p if the conditions are met. Random? Yes. 10%? Don't need to check: there was replacement. Large Counts? Both n·p̂ = 33 and n(1 − p̂) = 17 are at least 10.
Do: p̂ = 33/50 = 0.66, so the interval is 0.66 ± 1.96·√(0.66·0.34/50) = 0.66 ± 0.131, or 0.529 ≤ p ≤ 0.791.
Conclude: We are 95% confident that the interval from 0.529 to 0.791 captures the true proportion of Earth's surface that is covered in water. This is consistent with the claim that 70% of Earth's surface is covered in water, because 0.70 is one of the plausible values in the interval.
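The "Do" step can be reproduced in a few lines of Python. This is a sketch under the assumption that the conditions have already been checked. The function name `one_sample_z_interval` is a label chosen here, not a library routine.

```python
from math import sqrt
from statistics import NormalDist


def one_sample_z_interval(successes: int, n: int, conf: float = 0.95):
    """Return (low, high) for p-hat +/- z* * sqrt(p-hat * (1 - p-hat) / n)."""
    p_hat = successes / n
    z_star = NormalDist().inv_cdf((1 + conf) / 2)
    standard_error = sqrt(p_hat * (1 - p_hat) / n)
    margin = z_star * standard_error
    return p_hat - margin, p_hat + margin


# Globe example: 33 "water" results in 50 tosses, 95% confidence.
low, high = one_sample_z_interval(33, 50, conf=0.95)
print(f"{low:.3f} to {high:.3f}")
```

Rounded to three decimals, the endpoints match the interval reported in the solution (0.529 to 0.791).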

Finally: Sample size: what determines how big a sample to use? The margin of error (ME) you are willing to accept, together with the confidence level, determines the minimum sample size you'll need. The ME involves the sample proportion of successes, p̂, which you do not know before sampling. You have two options: use a guess for p̂ based on past experience or a previous study, or use p̂ = 0.5 as the guess. The ME is largest at p̂ = 0.5, so this choice gives a conservative (safe) sample size.

Sample size for desired margin of error
To determine the sample size n that will yield a C% confidence interval for a population proportion p with a maximum margin of error ME, solve the following inequality for n:
$$z^* \sqrt{\frac{\hat{p}(1-\hat{p})}{n}} \le ME$$
where p̂ is a guessed value for the sample proportion. The margin of error will always be less than or equal to ME if you use a p̂ value of 0.5.

Example 5 Suppose that you want to estimate p, the true proportion of students at your school who have a tattoo, with 95% confidence and a margin of error of no more than 0.10. How large a sample is needed?

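Example 5 Solution
Identify variables: p̂: we don't know, so use 0.5. z*: 1.96. ME: 0.10.
Solve for n using your algebra skills:
$$1.96\sqrt{\frac{0.5(1-0.5)}{n}} \le 0.10 \quad\Rightarrow\quad n \ge \left(\frac{1.96}{0.10}\right)^2 (0.5)(0.5) = 96.04$$
Always round up, so n = 97.
Sentence: We need to survey at least 97 students to estimate the true proportion of students with a tattoo with 95% confidence and a margin of error of at most 0.10.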
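The same algebra can be wrapped in a small helper. This is a sketch only: the name `min_sample_size` and the default guess of 0.5 are choices made here for illustration.

```python
from math import ceil
from statistics import NormalDist


def min_sample_size(me: float, conf: float = 0.95, p_guess: float = 0.5) -> int:
    """Smallest n with z* * sqrt(p_guess * (1 - p_guess) / n) <= me."""
    z_star = NormalDist().inv_cdf((1 + conf) / 2)
    n_exact = (z_star / me) ** 2 * p_guess * (1 - p_guess)
    return ceil(n_exact)  # always round up to the next whole person


n_needed = min_sample_size(me=0.10, conf=0.95)
print("Students to survey:", n_needed)
```

Using a guess other than 0.5 gives a smaller required n, which is why 0.5 is the conservative choice.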

What is my mystery μ? What do we know about the population distribution? Normally distributed, σ = 20, N = ___, μ = ?

Where to start? Rarely do we know/find out what the mean of a population is. But what would you use to estimate what you think μ is? A point estimator is a statistic that provides an estimate of a population parameter. The value of that statistic from a sample is called a point estimate.

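For example, each parameter is estimated by the corresponding statistic:
Parameter | Point estimator | Point estimate
μ | x̄ | The calculated sample mean
σ | s | The standard deviation of the sample
p | p̂ | The proportion of successes in the sample
Any parameter | The corresponding statistic | The value of that statistic from the sample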

Example 1: In each of the following settings, determine the point estimator you would use and calculate the value of the point estimate. a. The makers of a new golf ball want to estimate the median distance the new balls will travel when hit by a mechanical driver. They select a random sample of 10 balls and measure the distance each ball travels after being hit by the mechanical driver. Here are the distances (in yards): 285 286 284 285 282 284 287 290 288 285 Point Estimator: The sample median to estimate the population median Point Estimate: The sample median is 285 yards

Example 1 (continued): b. The golf ball manufacturer would also like to investigate the variability of the distance traveled by the golf balls by estimating the interquartile range. Point Estimator: The sample IQR as a point estimator for the population IQR. Point Estimate: 3 yards (Q3 − Q1 = 287 − 284).

Example 1 (continued): c. The math department wants to know what proportion of its students own a graphing calculator, so they take a random sample of 100 students and find that 28 own a graphing calculator. Point Estimator: p̂ as a point estimator for the population proportion p. Point Estimate: p̂ = 28/100 = 0.28.
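The three point estimates in Example 1 can be checked with a short script. The quartile rule used here (Q1 and Q3 are the medians of the lower and upper halves of the sorted data) is assumed to be the one the slides intend. Other software may use a different quartile convention and report a slightly different IQR.

```python
from statistics import median

# Distances (in yards) from Example 1a.
distances = [285, 286, 284, 285, 282, 284, 287, 290, 288, 285]


def iqr_by_halves(values):
    """IQR with Q1 and Q3 taken as medians of the lower and upper halves."""
    data = sorted(values)
    half = len(data) // 2
    lower, upper = data[:half], data[-half:]  # the middle value is skipped when n is odd
    return median(upper) - median(lower)


sample_median = median(distances)    # point estimate of the population median
sample_iqr = iqr_by_halves(distances)  # point estimate of the population IQR
p_hat = 28 / 100                     # point estimate of the population proportion
print(sample_median, sample_iqr, p_hat)
```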

Back to the mystery μ
Population distribution: Normally distributed, σ = 20, μ = ?
Sampling distribution: n = 16, point estimator for μ = ___, σ_x̄ = ___
The question is, how would the sample mean x̄ vary if we took many SRSs of size 16 from this same population?
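One way to explore that question is by simulation. In the sketch below, the value `MU = 240` is a made-up stand-in for the unknown mean, and the repetition count and seed are arbitrary. Only σ = 20 and n = 16 come from the slide. Theory says the standard deviation of x̄ is σ/√n = 20/√16 = 5, and the simulated value should land near that.

```python
import random
from statistics import mean, stdev

MU = 240     # hypothetical population mean, assumed only for this simulation
SIGMA = 20   # population standard deviation from the slide
N = 16       # sample size from the slide

rng = random.Random(2)
# Take many SRSs of size 16 and record each sample mean.
sample_means = [
    mean(rng.gauss(MU, SIGMA) for _ in range(N))
    for _ in range(5000)
]

print(f"Mean of the sample means: {mean(sample_means):.2f}")
print(f"Standard deviation of the sample means: {stdev(sample_means):.2f}")
```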

Definitions
Confidence Interval: A C% confidence interval gives an interval of plausible values for a parameter. The interval is calculated from the data and has the form point estimate ± margin of error.
Margin of Error: The difference between the point estimate and the true parameter value will be less than the margin of error in C% of all samples.
Confidence Level, C: The confidence level C gives the overall success rate of the method for calculating the confidence interval. That is, in C% of all possible samples, the method would yield an interval that captures the true parameter value.
How to interpret a confidence interval: To interpret a C% confidence interval for an unknown parameter, say, "We are C% confident that the interval from ___ to ___ captures the [parameter in context]."

Interpreting Confidence Intervals with Caution Rule #1: The confidence level tells how likely it is that the method captures the population parameter if we use it many times. It is the overall capture rate. It does not tell us the chance that one particular interval captures the parameter. The interval itself provides a set of plausible values for the parameter.

Interpreting Confidence Intervals with Caution Rule #2: The confidence level is NOT the probability that a particular interval has captured the parameter. Before the interval is calculated, we have a 95% chance (for example) of getting a sample mean that is within about 2 standard deviations of the sampling distribution (2σ_x̄) of μ, which would lead to a confidence interval that captures μ. After the confidence interval is constructed, it either does or does not contain μ, which corresponds to a probability of 100% (the interval contains μ) or 0% (the interval does not contain μ).

Interpreting Confidence Intervals with Caution Rule #3: When interpreting a confidence interval, make it clear that you are estimating a parameter (a value describing the population), not a statistic (a value describing the sample). Yes: Based on the sample, we believe that the population mean is somewhere between ___ and ___. Not so much: We are 95% confident that the interval from ___ to ___ contains the sample proportion.

Interpreting Confidence Intervals with Caution Rule #4: Talk in the future tense, not in the past. No: We are 95% confident that the interval from ___ to ___ captures the true proportion of US adults who said ... Yes: We are 95% confident that the interval from ___ to ___ captures the true proportion of US adults who would say ...

Example 2 A large company is concerned that many of its employees are in poor physical condition, which can result in decreased productivity. To determine how many steps each employee takes per day, on average, the company provides a pedometer to 50 randomly selected employees to use for one 24-hour period. After collecting the data, the company statistician reports a 95% confidence interval of 4547 steps to 8473 steps. a. Interpret the confidence interval. b. What is the point estimate that was used to create the interval? What is the margin of error? c. Recent guidelines suggest that people aim for 10,000 steps per day. Is there convincing evidence that the employees of this company are not meeting the guideline, on average? Explain.

Example 2 Solutions
a. Interpret the confidence interval. We are 95% confident that the interval from 4547 to 8473 captures the true mean number of steps taken per day for employees at this company.
b. What is the point estimate that was used to create the interval? What is the margin of error? Point estimate: 6510 steps (the midpoint of the interval). Margin of error: 1963 steps (the distance from the midpoint to either endpoint).
c. Recent guidelines suggest that people aim for 10,000 steps per day. Is there convincing evidence that the employees of this company are not meeting the guideline, on average? Explain. There is convincing evidence that the employees are not meeting the guideline because all of the values in the interval are less than 10,000 steps.
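Part b works because an interval of the form point estimate ± margin of error is symmetric about the point estimate. Here is a short sketch of that arithmetic; the helper name `estimate_and_margin` is invented for illustration.

```python
def estimate_and_margin(low: float, high: float):
    """Recover the point estimate and margin of error from interval endpoints."""
    point_estimate = (low + high) / 2    # midpoint of the interval
    margin_of_error = (high - low) / 2   # half the interval's width
    return point_estimate, margin_of_error


steps_estimate, steps_margin = estimate_and_margin(4547, 8473)
print(steps_estimate, steps_margin)
```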

Let's explore confidence intervals

Example 3 How much does the fat content of Brand X hot dogs vary? To find out, researchers measured the fat content (in grams) of a random sample of 10 Brand X hot dogs. A 95% confidence interval for the population standard deviation σ is 2.84 to 7.55 a. Interpret the confidence interval. b. Interpret the confidence level. c. True or false: the interval from 2.84 to 7.55 has a 95% chance of containing the actual population standard deviation σ. Justify.

Example 3 Solutions a. Interpret the confidence interval. We are 95% confident that the interval from 2.84 to 7.55 g captures the population standard deviation of the fat content of Brand X hot dogs. b. Interpret the confidence level. Over the course of many repetitions, about 95% of all the confidence intervals would capture the true standard deviation of fat content of Brand X hot dogs. c. True or false: the interval from 2.84 to 7.55 has a 95% chance of containing the actual population standard deviation σ. Justify. False: the interval either does or does not contain the population standard deviation (a probability of 1 or 0, respectively).

Exploring Confidence Intervals Play around with the app, and be ready to summarize: 1. Explain how changing the confidence level affects the confidence interval. 2. Explain how changing the sample size affects the length of the confidence interval. 3. Does increasing the sample size increase the capture rate (percent hit)?

My two cents solution
1. Explain how changing the confidence level affects the confidence interval. Increasing the confidence level widens the confidence interval. Our interval of plausible values for the parameter depends on our confidence level: the wider the interval, the less precise the estimate, but the more likely the method is to capture the true parameter.
2. Explain how changing the sample size affects the length of the confidence interval. The larger the sample size, the shorter the interval, and so the more precise the estimate of the parameter.
3. Does increasing the sample size increase the capture rate (percent hit)? The sample size does not affect the capture rate. Increasing the sample size does NOT make us more confident; it just makes for a more precise estimate.
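If the app is not available, a rough stand-in can be written in Python. The sketch below makes several assumptions that are not in the lesson: a Normal population with a hypothetical mean of 100, a known σ of 20, intervals of the form x̄ ± z*·σ/√n, and arbitrary repetition counts and seed. It compares the capture rate and interval width for two sample sizes.

```python
import random
from math import sqrt
from statistics import NormalDist, mean


def simulate(n: int, conf: float = 0.95, mu: float = 100.0,
             sigma: float = 20.0, reps: int = 4000, seed: int = 3):
    """Return (capture rate, interval width) for x-bar +/- z* * sigma / sqrt(n)."""
    rng = random.Random(seed)
    z_star = NormalDist().inv_cdf((1 + conf) / 2)
    margin = z_star * sigma / sqrt(n)
    hits = 0
    for _ in range(reps):
        x_bar = mean(rng.gauss(mu, sigma) for _ in range(n))
        if x_bar - margin <= mu <= x_bar + margin:
            hits += 1
    return hits / reps, 2 * margin


rate_16, width_16 = simulate(n=16)
rate_64, width_64 = simulate(n=64)
print(f"n = 16: capture rate {rate_16:.3f}, width {width_16:.2f}")
print(f"n = 64: capture rate {rate_64:.3f}, width {width_64:.2f}")
```

Both capture rates should stay near 95%, while quadrupling the sample size cuts the interval width in half.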

Calculating a confidence interval
Generally, the confidence interval for estimating a population parameter has the form
statistic ± (critical value) · (standard deviation of statistic)
The critical value is the number of standard deviations that makes the interval wide enough to have the stated capture rate. The product of the critical value and the standard deviation of the statistic is the margin of error.

Margin of error
The margin of error depends on:
1. The critical value: greater confidence requires a larger critical value.
2. The standard deviation of the statistic: this depends on the sample size n. Larger samples give more precise estimates, which means less variability in the statistic.