Chapter 17 Inference about a Population Mean

Conditions for inference

Previously, when making inferences about the population mean µ, we assumed the following simple conditions:

(1) Our data (observations) are a simple random sample (SRS) of size n from the population of interest.
(2) The variable we measure has an exactly normal distribution with parameters µ and σ.
(3) The population standard deviation σ is known.

We then constructed a confidence interval for the population mean µ based on the normal distribution (one-sample z statistic):

$$z = \frac{\bar{x} - \mu}{\sigma / \sqrt{n}}$$

This holds approximately for large samples even if assumption (2) is not satisfied. Why?

Issue: In a more realistic setting, assumption (3) is not satisfied, i.e., the standard deviation σ is unknown. So what can we do to handle real-life problems?

We replace the population standard deviation σ by its estimate, the sample standard deviation s.

When σ is known, the standard deviation of the sample mean x̄ is

$$\frac{\sigma}{\sqrt{n}}$$

When σ is unknown, we estimate the standard deviation of x̄ by

$$\frac{s}{\sqrt{n}}$$

(This quantity is called the standard error of the sample mean x̄.)

We get the one-sample t statistic:

$$t = \frac{\bar{x} - \mu}{s / \sqrt{n}}$$

When making inferences about the population mean µ with unknown σ, we use the one-sample t statistic. (Note that we still need assumptions 1 and 2.) But the one-sample t statistic doesn't have a normal distribution; it has a t-distribution with n − 1 degrees of freedom.
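
As a minimal sketch of the quantities just defined, the standard error and the one-sample t statistic can be computed with the Python standard library. The function name `one_sample_t` and the five data values are illustrative assumptions, not part of the notes.

```python
import math
import statistics


def one_sample_t(data, mu0):
    """Return (sample mean, standard error, t statistic) for a hypothesized mean mu0."""
    n = len(data)
    xbar = statistics.mean(data)
    s = statistics.stdev(data)   # sample standard deviation (divides by n - 1)
    se = s / math.sqrt(n)        # standard error of the sample mean
    t = (xbar - mu0) / se        # one-sample t statistic, n - 1 degrees of freedom
    return xbar, se, t


# Hypothetical sample, made up purely for illustration.
xbar, se, t = one_sample_t([1, 2, 3, 4, 5], mu0=2)
print(f"mean = {xbar}, SE = {se:.4f}, t = {t:.4f}")
```

The only change from the z statistic is that s replaces σ in the denominator, which is why the reference distribution changes as well.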

The t-distributions

We specify a particular t-distribution by giving its degrees of freedom (d.f.).

Notation: $t_k$ represents the t-distribution with k d.f.

How does the t-distribution compare with the standard normal distribution?

Similarities: both are symmetric about 0, single-peaked, and bell-shaped.

Difference: the t-distribution has more spread (heavier tails) than the standard normal distribution.

As the d.f. k increases, the $t_k$ distribution approaches the Normal(0,1) distribution.
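
The convergence toward the standard normal can be checked numerically. The sketch below uses only the standard library: it integrates the t density with Simpson's rule and finds the upper 0.025 critical value by bisection. The helper names (`t_pdf`, `t_cdf`, `t_upper_critical`), the step count, and the search bracket [0, 100] are assumptions chosen for illustration; a statistics package would normally be used instead.

```python
import math


def t_pdf(x, k):
    """Density of the t-distribution with k degrees of freedom."""
    log_c = math.lgamma((k + 1) / 2) - math.lgamma(k / 2) - 0.5 * math.log(k * math.pi)
    return math.exp(log_c - (k + 1) / 2 * math.log1p(x * x / k))


def t_cdf(x, k, steps=1000):
    """P(T <= x) for x >= 0: one half (by symmetry) plus Simpson's rule on [0, x]."""
    if x == 0:
        return 0.5
    h = x / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, k) + t_pdf(x, k)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, k)
    return 0.5 + total * h / 3


def t_upper_critical(p, k):
    """Value t* whose upper-tail probability is p, found by bisection on [0, 100]."""
    lo, hi = 0.0, 100.0
    for _ in range(50):
        mid = (lo + hi) / 2
        if 1 - t_cdf(mid, k) > p:
            lo = mid   # tail area still too large, so t* lies to the right
        else:
            hi = mid
    return (lo + hi) / 2


# Upper 0.025 critical values shrink toward the normal value (about 1.96) as k grows.
for k in (2, 5, 30, 1000):
    print(k, round(t_upper_critical(0.025, k), 3))
```

For small k the critical value is well above 1.96, reflecting the heavier tails; by k = 1000 the two are nearly indistinguishable.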

Confidence Intervals for a Population Mean (when standard deviation σ is unknown)

Confidence interval for µ when σ is unknown (t-CI)

A level C confidence interval for µ is given by

$$\bar{x} \pm t^* \frac{s}{\sqrt{n}}$$

where t* is the upper (1 − C)/2 critical value for the $t_{n-1}$ distribution, i.e., the area to the right of t* under the $t_{n-1}$ density curve is (1 − C)/2.

Ex: What critical value t* from Table C would you use to make a CI for the population mean in each of the following situations?

a) A 95% CI based on n = 10 observations.
b) A 90% CI based on n = 26 observations.
c) An 80% CI from a sample of size 7.
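
As a worked check of the lookup procedure, the degrees of freedom and upper-tail areas for the three situations are laid out below. The critical values are the usual t-table entries quoted to three decimals; they are supplied here for illustration and should be confirmed against Table C.

```latex
\begin{array}{llll}
\text{(a)} & C = 0.95,\ n = 10: & \text{d.f.} = 9,\ \ (1 - C)/2 = 0.025, & t^* \approx 2.262 \\
\text{(b)} & C = 0.90,\ n = 26: & \text{d.f.} = 25,\ (1 - C)/2 = 0.05,  & t^* \approx 1.708 \\
\text{(c)} & C = 0.80,\ n = 7:  & \text{d.f.} = 6,\ \ (1 - C)/2 = 0.10,  & t^* \approx 1.440
\end{array}
```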

Ex: Suppose JC-Penney wishes to know the average income of households in the Dallas area before deciding to open another store here. A random sample of 21 households is taken, and the income of these sampled households averages $45,000 with a standard deviation of $15,000.

(a) Give a 90% confidence interval for the unknown average income of households in the Dallas area.
(b) Is there evidence at the 10% level that the average income of households in the Dallas area is different from $48,000? Use the four-step process.
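
The arithmetic for this example can be sketched as follows. The summary numbers come from the example; the critical value 1.725 (upper 0.05 point of the t-distribution with 20 d.f.) is a standard table entry entered by hand, and treating part (b) as a two-sided test is an assumption of this sketch.

```python
import math

n = 21            # sample size
xbar = 45_000.0   # sample mean income
s = 15_000.0      # sample standard deviation
mu0 = 48_000.0    # hypothesized mean for part (b)
T_STAR = 1.725    # assumed table value: upper 0.05 critical value, 20 d.f.

se = s / math.sqrt(n)                # standard error of the sample mean
margin = T_STAR * se                 # margin of error for the 90% interval
ci = (xbar - margin, xbar + margin)  # part (a)

t_stat = (xbar - mu0) / se           # part (b): one-sample t statistic
reject = abs(t_stat) > T_STAR        # two-sided test at the 10% level

print(f"90% CI: ({ci[0]:.0f}, {ci[1]:.0f})")
print(f"t = {t_stat:.3f}, reject H0 at 10% level: {reject}")
```

Because the 90% interval and the two-sided 10% test use the same critical value, the test rejects exactly when the hypothesized mean falls outside the interval.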

Matched Pairs t Procedures

As we mentioned in Chapter 9, comparative studies are more convincing than single-sample investigations. For that reason, one-sample inference is less common than comparative inference.

In a matched pairs design, subjects are matched in pairs and each treatment is given to one subject in each pair. The experimenter can toss a coin to assign the two treatments to the two subjects in each pair.

Example 1. Suppose a college placement center wants to estimate µ, the difference in mean starting salaries for men and women graduates who seek jobs through the center. If it independently samples men and women, the starting salaries may vary because of their different college majors and differences in grade point averages. To eliminate these sources of variability, the placement center could match male and female job-seekers according to their majors and GPAs. Then the differences between the starting salaries of each pair in the sample could be used to make an inference about µ.

Example 2. Suppose you wish to estimate the difference in mean absorption rate into the bloodstream for two drugs that relieve pain. If you independently sample people, the absorption rates might vary because of age, weight, sex, etc. It may be possible to obtain two measurements on the same person. First, we administer one of the two drugs and record the time until absorption. After a sufficient amount of time, the other drug is administered and a second measurement of absorption time is obtained. The differences between the measurements for each person in the sample could then be used to estimate µ.

Another situation calling for matched pairs is before-and-after observations on the same subjects.

Example 3. Suppose you wish to estimate the difference in mean blood pressure before and after taking a drug. We obtain the first measurement before a patient starts taking the drug and the second measurement after the patient has been taking the drug for a sufficient amount of time. The differences between the measurements for each person in the sample could then be used to estimate µ.

If the samples are matched pairs, find the difference between the responses within each pair, then apply one-sample t procedures to those differences of observed responses.
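
The coin-toss assignment within pairs can be sketched in a few lines. The function name `assign_within_pairs`, the treatment labels "A" and "B", and the subject identifiers are hypothetical; the seed is fixed only so the illustration is repeatable.

```python
import random


def assign_within_pairs(pairs, treatments=("A", "B"), seed=None):
    """Randomly assign the two treatments to the two subjects in each matched pair."""
    rng = random.Random(seed)
    assignment = []
    for first, second in pairs:
        if rng.random() < 0.5:   # "heads": first subject gets the first treatment
            assignment.append({first: treatments[0], second: treatments[1]})
        else:                    # "tails": the order is swapped
            assignment.append({first: treatments[1], second: treatments[0]})
    return assignment


# Hypothetical matched pairs of subject identifiers.
pairs = [("s1", "s2"), ("s3", "s4"), ("s5", "s6")]
for pair_assignment in assign_within_pairs(pairs, seed=1):
    print(pair_assignment)
```

Every pair receives both treatments, so the within-pair difference always compares one treatment with the other.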

Example. An experiment is conducted to compare the starting salaries of male and female college graduates who find jobs. Pairs are formed by choosing a male and a female with the same major and similar GPA. Suppose a random sample of 10 pairs is formed in this manner and the starting annual salary of each person is recorded. Let µ₁ be the mean starting salary for males and let µ₂ be the mean starting salary for females.

Pair   Male (in $)   Female (in $)   Difference (male − female)
1      29300         28800            500
2      41500         41600           -100
3      40400         39800            600
4      38500         38500              0
5      43500         42600            900
6      37800         38000           -200
7      69500         69200            300
8      41200         40100           1100
9      38400         38200            200
10     59200         58500            700

(a) Compute a 95% confidence interval for the mean difference µ = µ₁ − µ₂.

First find the sample average x̄ and the sample standard deviation s of the paired differences. The 95% paired-difference CI for µ = µ₁ − µ₂ is then x̄ ± t* · s/√n, with n = 10 pairs and t* taken from the t-distribution with 9 d.f.
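
A minimal sketch of the paired computation, using the salaries from the table. The critical value 2.262 (upper 0.025 point of the t-distribution with 9 d.f.) is a standard table entry entered by hand, and the variable names are illustrative.

```python
import math
import statistics

male = [29300, 41500, 40400, 38500, 43500, 37800, 69500, 41200, 38400, 59200]
female = [28800, 41600, 39800, 38500, 42600, 38000, 69200, 40100, 38200, 58500]

# Reduce the paired data to one sample of differences (male minus female).
diffs = [m - f for m, f in zip(male, female)]

n = len(diffs)
dbar = statistics.mean(diffs)   # sample average of the differences
s_d = statistics.stdev(diffs)   # sample standard deviation of the differences
se = s_d / math.sqrt(n)         # standard error of the mean difference

T_STAR = 2.262                  # assumed table value: upper 0.025 critical value, 9 d.f.
margin = T_STAR * se
ci = (dbar - margin, dbar + margin)

t_stat = dbar / se              # t statistic for testing a mean difference of 0

print(f"mean difference = {dbar}, s = {s_d:.2f}")
print(f"95% CI: ({ci[0]:.1f}, {ci[1]:.1f}), t = {t_stat:.3f}")
```

Once the differences are formed, nothing in the calculation refers to the two original samples; this is the one-sample t procedure applied to a single column of numbers.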

(b) Is there evidence at the 5% level that the male starting salary is significantly different from the female starting salary? Use the four-step process.

Robustness of t procedures

A confidence interval is called robust if the confidence level does not change very much when the conditions for use of the procedure are violated.

The t confidence interval is exact when the distribution of the population is exactly normal. However, no real data are exactly normal. The usefulness of the t procedures in practice therefore depends on how strongly they are affected by lack of normality.

Here are some practical guidelines for inference on population means:

***Always make a plot to check for skewness and outliers before using the t procedures for small samples.***

Using the t procedures

Except in the case of small samples, the condition that the data are an SRS from the population of interest is more important than the condition that the population distribution is normal.

Sample size less than 15: Use t procedures if the data appear close to normal (roughly symmetric, single peak, no outliers). If the data are clearly skewed or if outliers are present, do not use t procedures.

Sample size at least 15: The t procedures can be used except in the presence of outliers or strong skewness.

Large samples: The t procedures can be used even for clearly skewed distributions when the sample size is large, say n ≥ 40.
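
The sample-size guidelines can be written down as a small decision rule. This is only a sketch: the function name, the three shape labels, and the idea of passing in a judgment made from a plot are all assumptions. The guidelines do not say what to do about outliers in large samples, so the cautious choice made here (still advising against t procedures) is likewise an assumption.

```python
def t_procedures_ok(n, shape, outliers):
    """Rough encoding of the sample-size guidelines for t procedures.

    n        -- sample size
    shape    -- "near_normal", "moderately_skewed" or "strongly_skewed",
                as judged from a plot of the data (hypothetical labels)
    outliers -- True if the plot shows outliers
    """
    if shape not in ("near_normal", "moderately_skewed", "strongly_skewed"):
        raise ValueError(f"unknown shape: {shape!r}")
    if outliers:
        # Stated in the guidelines for n < 40; for n >= 40 this is an assumption.
        return False
    if n < 15:
        return shape == "near_normal"
    if n < 40:
        return shape != "strongly_skewed"
    return True  # large samples: usable even for clearly skewed data


print(t_procedures_ok(10, "moderately_skewed", False))
print(t_procedures_ok(20, "moderately_skewed", False))
print(t_procedures_ok(50, "strongly_skewed", False))
```

The rule cannot replace looking at the data; it only records which combinations of sample size and shape the guidelines treat as acceptable.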