Chapter 8 Estimating with Confidence

Similar documents
Chapter 8: Estimating with Confidence

CHAPTER 8 Estimating with Confidence

CHAPTER 8 Estimating with Confidence

Chapter 8 Estimating with Confidence. Lesson 2: Estimating a Population Proportion

The following command was executed on their calculator: mean(randnorm(m,20,16))

Chapter 8 Estimating with Confidence. Lesson 2: Estimating a Population Proportion

10.1 Estimating with Confidence. Chapter 10 Introduction to Inference

9. Interpret a Confidence level: "To say that we are 95% confident is shorthand for..

Chapter 8: Estimating with Confidence

Chapter 19. Confidence Intervals for Proportions. Copyright 2010, 2007, 2004 Pearson Education, Inc.

***SECTION 10.1*** Confidence Intervals: The Basics

Chapter 19. Confidence Intervals for Proportions. Copyright 2010 Pearson Education, Inc.

OCW Epidemiology and Biostatistics, 2010 David Tybor, MS, MPH and Kenneth Chui, PhD Tufts University School of Medicine October 27, 2010

Chapter 23. Inference About Means. Copyright 2010 Pearson Education, Inc.

Module 28 - Estimating a Population Mean (1 of 3)

Statistical Inference

Confidence Intervals. Chapter 10

Cover Page Homework #9

THE DIVERSITY OF SAMPLES FROM THE SAME POPULATION

A point estimate is a single value that has been calculated from sample data to estimate the unknown population parameter. s Sample Standard Deviation

Quantitative Literacy: Thinking Between the Lines

Statistics for Psychology

Lecture 12A: Chapter 9, Section 1 Inference for Categorical Variable: Confidence Intervals

Optimization and Experimentation. The rest of the story

4) Compare the scores: a score of 88 on a test with a mean of 79 and a score of 78 on a test with a mean of 70.

The Confidence Interval. Finally, we can start making decisions!

Inferential Statistics: An Introduction. What We Will Cover in This Section. General Model. Population. Sample

A point estimate is a single value that has been calculated from sample data to estimate the unknown population parameter. s Sample Standard Deviation

1. Find the appropriate value for constructing a confidence interval in each of the following settings:

Chapter 13 Summary Experiments and Observational Studies

Sheila Barron Statistics Outreach Center 2/8/2011

USING STATCRUNCH TO CONSTRUCT CONFIDENCE INTERVALS and CALCULATE SAMPLE SIZE

Chapter 13. Experiments and Observational Studies. Copyright 2012, 2008, 2005 Pearson Education, Inc.

Applied Statistical Analysis EDUC 6050 Week 4

Probability Models for Sampling

Table of Contents. Introduction. 1. Diverse Weighing scale models. 2. What to look for while buying a weighing scale. 3. Digital scale buying tips

Should I Continue Getting Mammograms? -For Women Age 85 or older-

Chapter 12. The One- Sample

Chapter 1: Exploring Data

Statisticians deal with groups of numbers. They often find it helpful to use

First Problem Set: Answers, Discussion and Background

SAMPLING AND SAMPLE SIZE

MAT Mathematics in Today's World

Chapter 3. Producing Data

CHAPTER OBJECTIVES - STUDENTS SHOULD BE ABLE TO:

Statistical inference provides methods for drawing conclusions about a population from sample data.

Algebra 2 P Experimental Design 11 5 Margin of Error

Objectives. Quantifying the quality of hypothesis tests. Type I and II errors. Power of a test. Cautions about significance tests

Handout 16: Opinion Polls, Sampling, and Margin of Error

Population. Sample. AP Statistics Notes for Chapter 1 Section 1.0 Making Sense of Data. Statistics: Data Analysis:

Section 3.2 Least-Squares Regression

Chapter 7: Descriptive Statistics

News English.com Ready-to-use ESL / EFL Lessons

Making comparisons. Previous sessions looked at how to describe a single group of subjects However, we are often interested in comparing two groups

Villarreal Rm. 170 Handout (4.3)/(4.4) - 1 Designing Experiments I

PSYCHOLOGY 300B (A01) One-sample t test. n = d = ρ 1 ρ 0 δ = d (n 1) d

ADENIYI MOFOLUWAKE MPH APPLIED EPIDEMIOLOGY WEEK 5 CASE STUDY ASSIGNMENT APRIL

Comparing Means among Two (or More) Independent Populations. John McGready Johns Hopkins University

6 Relationships between

MATH 183 Test 2 Review Problems

DOCTOR: The last time I saw you and your 6-year old son Julio was about 2 months ago?

In this chapter we discuss validity issues for quantitative research and for qualitative research.

Simple Sensitivity Analyses for Matched Samples Thomas E. Love, Ph.D. ASA Course Atlanta Georgia

Unit 5. Thinking Statistically

AP Statistics TOPIC A - Unit 2 MULTIPLE CHOICE

Reflection Questions for Math 58B

ACTIVITY 10 A Little Tacky!

STAGES OF ADDICTION. Materials Needed: Stages of Addiction cards, Stages of Addiction handout.

CHAPTER 5: PRODUCING DATA

Pneumococcal Vaccines: Questions and Answers

Quick Read Series. Information for people with seizure disorders

Collecting Data Example: Does aspirin prevent heart attacks?

Thinking about Inference

UNDERSTANDING ANOVA A HOW-TO GUIDE. University of Arizona Military REACH

SECTION I Number of Questions 15 Percent of Total Grade 50

Daily Agenda. Honors Statistics. 1. Check homework C4#9. 4. Discuss 4.3 concepts. Finish 4.2 concepts. March 28, 2017

How Confident Are Yo u?

- Decide on an estimator for the parameter. - Calculate distribution of estimator; usually involves unknown parameter

News English.com Ready-to-use ESL / EFL Lessons

Standard Deviation and Standard Error Tutorial. This is significantly important. Get your AP Equations and Formulas sheet

CHAPTER 2 NATURAL SELECTION AND REPRODUCTION

NEXT. Top 12 Myths about OEE

Name AP Statistics UNIT 1 Summer Work Section II: Notes Analyzing Categorical Data

THIS PROBLEM HAS BEEN SOLVED BY USING THE CALCULATOR. A 90% CONFIDENCE INTERVAL IS ALSO SHOWN. ALL QUESTIONS ARE LISTED BELOW THE RESULTS.

30-DAY FOLLOW-UP SURVEY (TELEPHONE TRANSCRIPT)

Comparison of Estimates From An Address-Based Mail Survey And A RDD Telephone Survey

Chapter 1 - Sampling and Experimental Design

Study Guide for the Final Exam

Gathering. Useful Data. Chapter 3. Copyright 2004 Brooks/Cole, a division of Thomson Learning, Inc.

Unraveling Recent Cervical Cancer Screening Updates and the Impact on Your Practice

Section 9.2b Tests about a Population Proportion

Variability. After reading this chapter, you should be able to do the following:

Market Research on Caffeinated Products

2 INSTRUCTOR GUIDELINES

Math HL Chapter 12 Probability

15.301/310, Managerial Psychology Prof. Dan Ariely Recitation 8: T test and ANOVA

An Introduction to Bayesian Statistics

Statistical Methods Exam I Review

Abdul Latif Jameel Poverty Action Lab Executive Training: Evaluating Social Programs Spring 2009

Stop Smoking Pre Session Guide. Helping YOU Successfully Stop Smoking For Good

Transcription:

Chapter 8 Estimating with Confidence Introduction Our goal in many statistical settings is to use a sample statistic to estimate a population parameter. In Chapter 4, we learned if we randomly select the sample, we should be able to generalize our results to the population of interest. In Chapter 7, we learned that different samples yield different results for our estimate. Statistical inference uses the language of probability to express the strength of our conclusions by taking chance variation due to random selection or random assignment into account. In this chapter, we ll learn one method of statistical inference confidence intervals so we may estimate the value of a parameter from a sample statistic. As we do so, we ll learn not only how to construct a confidence interval, but also how to report probabilities that would describe what would happen if we used the inference method many times. Activity: The Mystery Mean Section 8.1 Confidence Intervals: The Basics If you had to give one number to estimate an unknown population parameter, what would it be? If you were estimating a population mean µ, you would probably use. If you were estimating a population proportion p, you might use. In both cases, you would be providing a point estimate of the parameter of interest. DEFINITION: Point estimator and point estimate A point estimator is a statistic that provides an estimate of a population parameter. The value of that statistic from a sample is called a point estimate. Ideally, a point estimate is our best guess at the value of an unknown parameter. We learned in Chapter 7 that an ideal point estimator will have no bias and low variability. Since variability is almost always present when calculating statistics from different samples, we must extend our thinking about estimating parameters to include an acknowledgement that repeated sampling could yield different results. Example From Batteries to Smoking (Point estimators) PROBLEM: In each of the following settings, determine the point estimator you would use and calculate the value of the point estimate. (a) Quality control inspectors want to estimate the mean lifetime µ of the AA batteries produced in an hour at a factory. They select a random sample of 30 batteries during each hour of production and then drain them under conditions that mimic normal use. Here are the lifetimes (in hours) of the batteries from one sample: 16.91 18.83 17.58 15.84 17.42 17.65 16.63 16.84 15.63 16.37 15.80 15.93 15.81 17.45 16.85 16.33 16.22 16.59 17.13 17.10 16.96 16.40 17.35 16.37 15.98 16.52 17.04 17.07 15.73 16.74

(b) What proportion p of U.S. high school students smoke? The 2007 Youth Risk Behavioral Survey questioned a random sample of 14.041 students in grades 9 to 12. Of these, 2808 said they had smoked cigarettes at least one day in the past month. (c) The quality control inspectors in part (a) want to investigate the variability in batter lifetimes by estimating the population variance σ 2. The Idea of a Confidence Interval Recall the Mystery Mean Activity. Is the value of the population mean µ exactly 240.79? Probably not. However, since the sample mean is 240.79, we could guess that µ is somewhere around 240.79. How close to 240.79 is µ likely to be? To answer this question, we must ask another: How would the sample mean vary if we took many SRSs of size 16 from the population? To estimate the Mystery mean µ, we can use = 240.79 as a point estimate. We don t expect µ to be exactly equal to so we need to say how accurate we think our estimate is. In repeated samples, the values of follow a Normal distribution with mean and standard deviation 5. The 68-95-99.7 Rule tells us that in 95% of all samples of size 16, will be within 10 (two standard deviations) of µ. If is within 10 points of µ, then µ is within 10 points of. Therefore, the interval from 10 to + 10 will capture µ in about 95% of all samples of size 16.

If we estimate that µ lies somewhere in the interval 230.79 to 250.79, we d be calculating an interval using a method that captures the true µ in about 95% of all possible samples of this size. The big idea: The sampling distribution of tells us how close to µ the sample mean is likely to be. All confidence intervals we construct will have a form similar to this: estimate ± margin of error We usually choose a confidence level of 90% or higher because we want to be quite sure of our conclusions. The most common confidence level is 95%. DEFINITION: Confidence interval, margin of error, confidence level A confidence interval for a parameter has two parts: - An interval calculated from the data, which has the form estimate ± margin of error The margin of error tells how close the estimate tends to be to the unknown parameter in repeated random sampling. - A confidence level C, which gives the overall success rate of the method for calculating the confidence interval. That is, in C% of all possible samples, the method would yield an interval that captures the true parameter value. Interpreting Confidence Levels and Confidence Intervals The confidence level is the overall capture rate if the method is used many times. Starting with the population, imagine taking many SRSs of 16 observations. The sample mean will vary from sample to sample, but when we use the method estimate ± margin of error to get an interval based on each sample, 95% of these intervals capture the unknown population mean µ. Interpreting Confidence Level and Confidence Intervals Confidence level: To say that we are 95% confident is shorthand for 95% of all possible samples of a given size from this population will result in an interval that captures the unknown parameter. Confidence interval: To interpret a C% confidence interval for an unknown parameter, say, We are C% confident that the interval from to captures the actual value of the [population parameter in context]. The confidence level tells us how likely it is that the method we are using will produce an interval that captures the population parameter if we use it many times. The confidence level does not tell us the chance that a particular confidence interval captures the population parameter.

Instead, the confidence interval gives us a set of plausible values for the parameter. We interpret confidence levels and confidence intervals in much the same way whether we are estimating a population mean, proportion, or some other parameter. Example Do You Use Twitter? (Interpreting a confidence interval and a confidence level) In late 2009, the Pew Internet and American Life Project asked a random sample of 2253 U.S. adults, Do you ever use Twitter or another service to share updates about yourself or to see updates about others? Of the sample, 19% said Yes. According to Pew, the resulting 95% confidence interval is (0.167, 0.213). PROBLEM: Interpret the confidence interval and the confidence level. CHECK YOUR UNDERSTANDING How much does the fat content of Brand X hot dogs vary? To find out, researchers measured the fat content (in grams) of a random sample of 10 Brand X hot dogs. A 95% confidence interval for the population standard deviation σ is 2.84 to 7.55. 1. Interpret the confidence interval. 2. Interpret the confidence level. 3. True or False: The interval from 2.84 to 7.55 has a 95% chance of containing the actual population standard deviation σ. Justify your answer. Constructing a Confidence Interval Why settle for 95% confidence when estimating a parameter? The price we pay for greater confidence is a wider interval. When we calculated a 95% confidence interval for the mystery mean µ, we started with estimate ± margin of error Our estimate came from the sample statistic. Since the sampling distribution of is Normal, about 95% of the values of will lie within 2 standard deviations ( ) of the mystery mean µ. That is, our interval could be written as: 240.79 ± 2(5) = ± This leads to a more general formula for confidence intervals: statistic ± (critical value) (standard deviation of statistic) Calculating a Confidence Interval

The confidence interval for estimating a population parameter has the form statistic ± (critical value) (standard deviation of statistic) where the statistic we use is the point estimator for the parameter. Properties of Confidence Intervals: The margin of error is the (critical value) (standard deviation of statistic) The user chooses the confidence level, and the margin of error follows from this choice. The critical value depends on the confidence level and the sampling distribution of the statistic. Greater confidence requires a larger critical value The standard deviation of the statistic depends on the sample size n The margin of error gets smaller when: The confidence level decreases The sample size n increases Using Confidence Intervals Before calculating a confidence interval for µ or p there are three important conditions that you should check. Conditions for Constructing a Confidence Interval 1) Random: The data should come from a well-designed random sample or randomized experiment. 2) Normal: The sampling distribution of the statistic is approximately Normal. For means: The sampling distribution is exactly Normal if the population distribution is Normal. When the population distribution is not Normal, then the central limit theorem tells us the sampling distribution will be approximately Normal if n is sufficiently large (n 30). For proportions: We can use the Normal approximation to the sampling distribution as long as np 10 and n(1 p) 10. 3) Independent: Individual observations are independent. When sampling without replacement, the sample size n should be no more than 10% of the population size N (the 10% condition) to use our formula for the standard deviation of the statistic.