Unit 2: Probability and distributions Lecture 3: Normal distribution

Similar documents
Normal Distribution. Many variables are nearly normal, but none are exactly normal Not perfect, but still useful for a variety of problems.

Quantitative Literacy: Thinking Between the Lines

Chapter 23. Inference About Means. Copyright 2010 Pearson Education, Inc.

AP Statistics TOPIC A - Unit 2 MULTIPLE CHOICE

Applied Statistical Analysis EDUC 6050 Week 4

Module 28 - Estimating a Population Mean (1 of 3)

Statistics for Psychology

12.1 Inference for Linear Regression. Introduction

Normal Random Variables

Observational studies; descriptive statistics

Chapter 7: Descriptive Statistics

Statistics: Interpreting Data and Making Predictions. Interpreting Data 1/50

Chapter 1: Introduction to Statistics

AP Statistics. Semester One Review Part 1 Chapters 1-5

UNIVERSITY OF TORONTO SCARBOROUGH Department of Computer and Mathematical Sciences Midterm Test February 2016

Appendix B Statistical Methods

AP Psych - Stat 1 Name Period Date. MULTIPLE CHOICE. Choose the one alternative that best completes the statement or answers the question.

Probability and Statistics. Chapter 1

STAT 200. Guided Exercise 4

Classical Psychophysical Methods (cont.)

Standard Deviation and Standard Error Tutorial. This is significantly important. Get your AP Equations and Formulas sheet

Still important ideas

Unit 1 Exploring and Understanding Data

Welcome to OSA Training Statistics Part II

Introduction to Statistical Data Analysis I

Business Statistics Probability

STA Module 9 Confidence Intervals for One Population Mean

Lesson 9 Presentation and Display of Quantitative Data

CHAPTER 3 DATA ANALYSIS: DESCRIBING DATA

AP Psych - Stat 2 Name Period Date. MULTIPLE CHOICE. Choose the one alternative that best completes the statement or answers the question.

Data, frequencies, and distributions. Martin Bland. Types of data. Types of data. Clinical Biostatistics

Chapter 2: The Normal Distributions

Chapter 1: Exploring Data

Readings: Textbook readings: OpenStax - Chapters 1 13 (emphasis on Chapter 12) Online readings: Appendix D, E & F

CHAPTER 3 Describing Relationships

The normal curve and standardisation. Percentiles, z-scores

Normal Distribution: Homework *

STOR 155 Section 2 Midterm Exam 1 (9/29/09)

Homework Exercises for PSYC 3330: Statistics for the Behavioral Sciences

2.4.1 STA-O Assessment 2

Part 1. For each of the following questions fill-in the blanks. Each question is worth 2 points.

CHAPTER ONE CORRELATION

Describe what is meant by a placebo Contrast the double-blind procedure with the single-blind procedure Review the structure for organizing a memo

Regression. Lelys Bravo de Guenni. April 24th, 2015

111, section 8.6 Applications of the Normal Distribution

Reminders/Comments. Thanks for the quick feedback I ll try to put HW up on Saturday and I ll you

STATISTICS AND RESEARCH DESIGN

Results & Statistics: Description and Correlation. I. Scales of Measurement A Review

MULTIPLE CHOICE. Choose the one alternative that best completes the statement or answers the question.

Lesson 1: Distributions and Their Shapes

3 CONCEPTUAL FOUNDATIONS OF STATISTICS

Previously, when making inferences about the population mean,, we were assuming the following simple conditions:

LOTS of NEW stuff right away 2. The book has calculator commands 3. About 90% of technology by week 5

Statistical Methods and Reasoning for the Clinical Sciences

Standard Scores. Richard S. Balkin, Ph.D., LPC-S, NCC

Biostatistics. Donna Kritz-Silverstein, Ph.D. Professor Department of Family & Preventive Medicine University of California, San Diego

Students will understand the definition of mean, median, mode and standard deviation and be able to calculate these functions with given set of

10/4/2007 MATH 171 Name: Dr. Lunsford Test Points Possible

MBA 605 Business Analytics Don Conant, PhD. GETTING TO THE STANDARD NORMAL DISTRIBUTION

Chapter 20: Test Administration and Interpretation

Chapter Three in-class Exercises. MULTIPLE CHOICE. Choose the one alternative that best completes the statement or answers the question.

Objectives. Quantifying the quality of hypothesis tests. Type I and II errors. Power of a test. Cautions about significance tests

Medical Statistics 1. Basic Concepts Farhad Pishgar. Defining the data. Alive after 6 months?

Summarizing Data. (Ch 1.1, 1.3, , 2.4.3, 2.5)

Estimation. Preliminary: the Normal distribution

Conditional Distributions and the Bivariate Normal Distribution. James H. Steiger

4.3 Measures of Variation

Still important ideas

STAT 113: PAIRED SAMPLES (MEAN OF DIFFERENCES)

M 140 Test 1 A Name SHOW YOUR WORK FOR FULL CREDIT! Problem Max. Points Your Points Total 60

(a) 50% of the shows have a rating greater than: impossible to tell

On the purpose of testing:

V. Gathering and Exploring Data

Readings: Textbook readings: OpenStax - Chapters 1 11 Online readings: Appendix D, E & F Plous Chapters 10, 11, 12 and 14

Stats 95. Statistical analysis without compelling presentation is annoying at best and catastrophic at worst. From raw numbers to meaningful pictures

Statistics is the science of collecting, organizing, presenting, analyzing, and interpreting data to assist in making effective decisions

OCW Epidemiology and Biostatistics, 2010 David Tybor, MS, MPH and Kenneth Chui, PhD Tufts University School of Medicine October 27, 2010

Describe what is meant by a placebo Contrast the double-blind procedure with the single-blind procedure Review the structure for organizing a memo

M 140 Test 1 A Name (1 point) SHOW YOUR WORK FOR FULL CREDIT! Problem Max. Points Your Points Total 75

CHAPTER 2. MEASURING AND DESCRIBING VARIABLES

Describe what is meant by a placebo Contrast the double-blind procedure with the single-blind procedure Review the structure for organizing a memo

Example The median earnings of the 28 male students is the average of the 14th and 15th, or 3+3

Pooling Subjective Confidence Intervals

Statistics is the science of collecting, organizing, presenting, analyzing, and interpreting data to assist in making effective decisions

PRINTABLE VERSION. Quiz 1. True or False: The amount of rainfall in your state last month is an example of continuous data.

Math 1680 Class Notes. Chapters: 1, 2, 3, 4, 5, 6

Basic Statistics 01. Describing Data. Special Program: Pre-training 1

Statistical Techniques. Masoud Mansoury and Anas Abulfaraj

STT 200 Test 1 Green Give your answer in the scantron provided. Each question is worth 2 points.

STT315 Chapter 2: Methods for Describing Sets of Data - Part 2

Statistics is a broad mathematical discipline dealing with

AP Stats Review for Midterm

Identify two variables. Classify them as explanatory or response and quantitative or explanatory.

MATH 227 CP 8 SHORT ANSWER. Write the word or phrase that best completes each statement or answers the question.

How Faithful is the Old Faithful? The Practice of Statistics, 5 th Edition 1

Empirical Rule ( rule) applies ONLY to Normal Distribution (modeled by so called bell curve)

Frequency distributions

Unit 7 Comparisons and Relationships

Sheila Barron Statistics Outreach Center 2/8/2011

Practice First Midterm Exam

Transcription:

Unit 2: Probability and distributions Lecture 3: Normal distribution Statistics 101 Thomas Leininger May 23, 2013

Announcements 1 Announcements 2 Normal distribution Normal distribution model 68-95-99.7 Rule Standardizing with Z scores Calculating percentiles Recap 3 Evaluating the normal approximation 4 Examples (time permitting) Normal probability and quality control Finding cutoff points Statistics 101

Announcements Announcements Problem Set #2 due tomorrow Quiz #1 tomorrow Statistics 101 (Thomas Leininger) U2 - L3: Normal distribution May 23, 2013 2 / 30

1 Announcements 2 Normal distribution Normal distribution model 68-95-99.7 Rule Standardizing with Z scores Calculating percentiles Recap 3 Evaluating the normal approximation 4 Examples (time permitting) Normal probability and quality control Finding cutoff points Statistics 101

Normal distribution Unimodal and symmetric, bell shaped curve Most variables are nearly normal, but none are exactly normal Denoted as N(µ, σ) Normal with mean µ and standard deviation σ Statistics 101 (Thomas Leininger) U2 - L3: Normal distribution May 23, 2013 3 / 30

Heights of males http:// blog.okcupid.com/ index.php/ the-biggest-lies-in-online-dating/ Statistics 101 (Thomas Leininger) U2 - L3: Normal distribution May 23, 2013 4 / 30

Heights of males The male heights on OkCupid very nearly follow the expected normal distribution except the whole thing is shifted to the right of where it should be. Almost universally guys like to add a couple inches. You can also see a more subtle vanity at work: starting at roughly 5 8, the top of the dotted curve tilts even further rightward. This means that guys as they get closer to six feet round up a bit more than usual, stretching for that coveted psychological benchmark. http:// blog.okcupid.com/ index.php/ the-biggest-lies-in-online-dating/ Statistics 101 (Thomas Leininger) U2 - L3: Normal distribution May 23, 2013 4 / 30

Heights of females http:// blog.okcupid.com/ index.php/ the-biggest-lies-in-online-dating/ Statistics 101 (Thomas Leininger) U2 - L3: Normal distribution May 23, 2013 5 / 30

Heights of females When we looked into the data for women, we were surprised to see height exaggeration was just as widespread, though without the lurch towards a benchmark height. http:// blog.okcupid.com/ index.php/ the-biggest-lies-in-online-dating/ Statistics 101 (Thomas Leininger) U2 - L3: Normal distribution May 23, 2013 5 / 30

Normal distribution model 1 Announcements 2 Normal distribution Normal distribution model 68-95-99.7 Rule Standardizing with Z scores Calculating percentiles Recap 3 Evaluating the normal approximation 4 Examples (time permitting) Normal probability and quality control Finding cutoff points Statistics 101

Normal distribution model Normal distributions with different parameters µ: mean, σ: standard deviation N(µ = 0, σ = 1) N(µ = 19, σ = 4) -3-2 -1 0 1 2 3 7 11 15 19 23 27 31 0 10 20 30 Statistics 101 (Thomas Leininger) U2 - L3: Normal distribution May 23, 2013 6 / 30

68-95-99.7 Rule 1 Announcements 2 Normal distribution Normal distribution model 68-95-99.7 Rule Standardizing with Z scores Calculating percentiles Recap 3 Evaluating the normal approximation 4 Examples (time permitting) Normal probability and quality control Finding cutoff points Statistics 101

68-95-99.7 Rule 68-95-99.7 Rule For nearly normally distributed data, about 68% falls within 1 SD of the mean, about 95% falls within 2 SD of the mean, about 99.7% falls within 3 SD of the mean. It is possible for observations to fall 4, 5, or more standard deviations away from the mean, but these occurrences are very rare if the data are nearly normal. 68% 95% 99.7% µ 3σ µ 2σ µ σ µ µ + σ µ + 2σ µ + 3σ Statistics 101 (Thomas Leininger) U2 - L3: Normal distribution May 23, 2013 7 / 30

68-95-99.7 Rule Describing variability using the 68-95-99.7 Rule SAT scores are distributed nearly normally with mean 1500 and standard deviation 300. Statistics 101 (Thomas Leininger) U2 - L3: Normal distribution May 23, 2013 8 / 30

68-95-99.7 Rule Describing variability using the 68-95-99.7 Rule SAT scores are distributed nearly normally with mean 1500 and standard deviation 300. 68% of students score between 1200 and 1800 on the SAT. 95% of students score between 900 and 2100 on the SAT. 99.7% of students score between 600 and 2400 on the SAT. 68% 95% 99.7% 600 900 1200 1500 1800 2100 2400 Statistics 101 (Thomas Leininger) U2 - L3: Normal distribution May 23, 2013 8 / 30

68-95-99.7 Rule Number of hours of sleep on school nights We can approximate this with a normal distribution (a bit of a stretch here, but it seems to hold in larger samples). 0 10 30 50 70 mean = 6.88 sd = 0.94 4 5 6 7 8 9 10 Statistics 101 (Thomas Leininger) U2 - L3: Normal distribution May 23, 2013 9 / 30

68-95-99.7 Rule Number of hours of sleep on school nights We can approximate this with a normal distribution (a bit of a stretch here, but it seems to hold in larger samples). 0.0 0.2 0.4 0.6 0.8 4 5 6 7 8 9 10 Statistics 101 (Thomas Leininger) U2 - L3: Normal distribution May 23, 2013 9 / 30

68-95-99.7 Rule Number of hours of sleep on school nights We can approximate this with a normal distribution (a bit of a stretch here, but it seems to hold in larger samples). 0 10 30 50 70 75 % 4 5 6 7 8 9 10 Statistics 101 (Thomas Leininger) U2 - L3: Normal distribution May 23, 2013 9 / 30

68-95-99.7 Rule Number of hours of sleep on school nights We can approximate this with a normal distribution (a bit of a stretch here, but it seems to hold in larger samples). 0 10 30 50 70 95 % 75 % 4 5 6 7 8 9 10 Statistics 101 (Thomas Leininger) U2 - L3: Normal distribution May 23, 2013 9 / 30

68-95-99.7 Rule Number of hours of sleep on school nights We can approximate this with a normal distribution (a bit of a stretch here, but it seems to hold in larger samples). 0 10 30 50 70 98 % 95 % 75 % 4 5 6 7 8 9 10 Statistics 101 (Thomas Leininger) U2 - L3: Normal distribution May 23, 2013 9 / 30

Standardizing with Z scores 1 Announcements 2 Normal distribution Normal distribution model 68-95-99.7 Rule Standardizing with Z scores Calculating percentiles Recap 3 Evaluating the normal approximation 4 Examples (time permitting) Normal probability and quality control Finding cutoff points Statistics 101

Standardizing with Z scores SAT scores are distributed nearly normally with mean 1500 and standard deviation 300. ACT scores are distributed nearly normally with mean 21 and standard deviation 5. A college admissions officer wants to determine which of the two applicants scored better on their standardized test with respect to the other test takers: Pam, who earned an 1800 on her SAT, or Jim, who scored a 24 on his ACT? Pam Jim 600 900 1200 1500 1800 2100 2400 6 11 16 21 26 31 36 Statistics 101 (Thomas Leininger) U2 - L3: Normal distribution May 23, 2013 10 / 30

Standardizing with Z scores Standardizing with Z scores Since we cannot just compare these two raw scores, we instead compare how many standard deviations beyond the mean each observation is. Pam s score is 1800 1500 300 = 1 standard deviation above the mean. Jim s score is 24 21 5 = 0.6 standard deviations above the mean. Jim Pam 2 1 0 1 2 Statistics 101 (Thomas Leininger) U2 - L3: Normal distribution May 23, 2013 11 / 30

Standardizing with Z scores Standardizing with Z scores (cont.) These are called standardized scores, or Z scores. Z score of an observation is the number of standard deviations it falls above or below the mean. Z scores Z = observation mean SD Z scores are defined for distributions of any shape, but only when the distribution is normal can we use Z scores to calculate percentiles. Observations that are more than 2 SD away from the mean ( Z > 2) are usually considered unusual. Statistics 101 (Thomas Leininger) U2 - L3: Normal distribution May 23, 2013 12 / 30

Standardizing with Z scores Percentiles Percentile is the percentage of observations that fall below a given data point. Graphically, percentile is the area below the probability distribution curve to the left of that observation. 600 900 1200 1500 1800 2100 2400 Statistics 101 (Thomas Leininger) U2 - L3: Normal distribution May 23, 2013 13 / 30

Standardizing with Z scores Approximately what percent of students score below 1800 on the SAT? (Hint: Use the 68-95-99.7% rule. The mean is 1500 and the SD is 300.) 600 900 1200 1500 1800 2100 2400 Statistics 101 (Thomas Leininger) U2 - L3: Normal distribution May 23, 2013 14 / 30

Standardizing with Z scores Approximately what percent of students score below 1800 on the SAT? (Hint: Use the 68-95-99.7% rule. The mean is 1500 and the SD is 300.) 100 68 = 32% 32/2 = 16% 68 + 16 = 84% Statistics 101 (Thomas Leininger) U2 - L3: Normal distribution May 23, 2013 14 / 30

Standardizing with Z scores Jim or Pam? So who had a higher score Jim or Pam? Pam got an 1800 on the SAT (mean 1500, SD 300). Jim got a 24 on the ACT (mean 21, SD 5). Statistics 101 (Thomas Leininger) U2 - L3: Normal distribution May 23, 2013 15 / 30

Standardizing with Z scores Jim or Pam? So who had a higher score Jim or Pam? Pam got an 1800 on the SAT (mean 1500, SD 300). Jim got a 24 on the ACT (mean 21, SD 5). 1800 1500 Pam: Z Pam = 300 Percentile: 84% = 1.0 Statistics 101 (Thomas Leininger) U2 - L3: Normal distribution May 23, 2013 15 / 30

Standardizing with Z scores Jim or Pam? So who had a higher score Jim or Pam? Pam got an 1800 on the SAT (mean 1500, SD 300). Jim got a 24 on the ACT (mean 21, SD 5). 1800 1500 Pam: Z Pam = 300 Percentile: 84% = 1.0 24 21 Jim: Z Jim = = 0.6 5 Percentile: 73% Statistics 101 (Thomas Leininger) U2 - L3: Normal distribution May 23, 2013 15 / 30

Standardizing with Z scores Jim or Pam? So who had a higher score Jim or Pam? Pam got an 1800 on the SAT (mean 1500, SD 300). Jim got a 24 on the ACT (mean 21, SD 5). 1800 1500 Pam: Z Pam = 300 Percentile: 84% = 1.0 24 21 Jim: Z Jim = = 0.6 5 Percentile: 73% http:// www.halpertbeesly.com/ images/ gallery/ 10.jpg Statistics 101 (Thomas Leininger) U2 - L3: Normal distribution May 23, 2013 15 / 30

Calculating percentiles 1 Announcements 2 Normal distribution Normal distribution model 68-95-99.7 Rule Standardizing with Z scores Calculating percentiles Recap 3 Evaluating the normal approximation 4 Examples (time permitting) Normal probability and quality control Finding cutoff points Statistics 101

Calculating percentiles Calculating percentiles - using computation There are many ways to compute percentiles/areas under the curve: R: > pnorm(1800, mean = 1500, sd = 300) [1] 0.8413447 Applet: http:// www.socr.ucla.edu/ htmls/ SOCR Distributions.html Statistics 101 (Thomas Leininger) U2 - L3: Normal distribution May 23, 2013 16 / 30

Calculating percentiles Calculating percentiles - using tables Second decimal place of Z Z 0.00 0.01 0.02 0.03 0.04 0.05 0.06 0.07 0.08 0.09 0.0 0.5000 0.5040 0.5080 0.5120 0.5160 0.5199 0.5239 0.5279 0.5319 0.5359 0.1 0.5398 0.5438 0.5478 0.5517 0.5557 0.5596 0.5636 0.5675 0.5714 0.5753 0.2 0.5793 0.5832 0.5871 0.5910 0.5948 0.5987 0.6026 0.6064 0.6103 0.6141 0.3 0.6179 0.6217 0.6255 0.6293 0.6331 0.6368 0.6406 0.6443 0.6480 0.6517 0.4 0.6554 0.6591 0.6628 0.6664 0.6700 0.6736 0.6772 0.6808 0.6844 0.6879 0.5 0.6915 0.6950 0.6985 0.7019 0.7054 0.7088 0.7123 0.7157 0.7190 0.7224 0.6 0.7257 0.7291 0.7324 0.7357 0.7389 0.7422 0.7454 0.7486 0.7517 0.7549 0.7 0.7580 0.7611 0.7642 0.7673 0.7704 0.7734 0.7764 0.7794 0.7823 0.7852 0.8 0.7881 0.7910 0.7939 0.7967 0.7995 0.8023 0.8051 0.8078 0.8106 0.8133 0.9 0.8159 0.8186 0.8212 0.8238 0.8264 0.8289 0.8315 0.8340 0.8365 0.8389 1.0 0.8413 0.8438 0.8461 0.8485 0.8508 0.8531 0.8554 0.8577 0.8599 0.8621 1.1 0.8643 0.8665 0.8686 0.8708 0.8729 0.8749 0.8770 0.8790 0.8810 0.8830 1.2 0.8849 0.8869 0.8888 0.8907 0.8925 0.8944 0.8962 0.8980 0.8997 0.9015 You ll find a similar table in Appendix B in the back of the book. Statistics 101 (Thomas Leininger) U2 - L3: Normal distribution May 23, 2013 17 / 30

Recap 1 Announcements 2 Normal distribution Normal distribution model 68-95-99.7 Rule Standardizing with Z scores Calculating percentiles Recap 3 Evaluating the normal approximation 4 Examples (time permitting) Normal probability and quality control Finding cutoff points Statistics 101

Recap Question Which of the following is false? (a) Majority of Z scores in a right skewed distribution are negative. (b) In skewed distributions the Z score of the mean might be different than 0. (c) For a normal distribution, IQR is less than 2 SD. (d) Z scores are helpful for determining how unusual a data point is compared to the rest of the data in the distribution. Statistics 101 (Thomas Leininger) U2 - L3: Normal distribution May 23, 2013 18 / 30

Recap Question Which of the following is false? (a) Majority of Z scores in a right skewed distribution are negative. (b) In skewed distributions the Z score of the mean might be different than 0. (c) For a normal distribution, IQR is less than 2 SD. (d) Z scores are helpful for determining how unusual a data point is compared to the rest of the data in the distribution. Statistics 101 (Thomas Leininger) U2 - L3: Normal distribution May 23, 2013 18 / 30

Evaluating the normal approximation 1 Announcements 2 Normal distribution Normal distribution model 68-95-99.7 Rule Standardizing with Z scores Calculating percentiles Recap 3 Evaluating the normal approximation 4 Examples (time permitting) Normal probability and quality control Finding cutoff points Statistics 101

Evaluating the normal approximation Normal probability plot A histogram and normal probability plot of a sample of 100 male heights. Male heights (inches) 60 65 70 75 80 Theoretical Quantiles male heights (in.) 2 1 0 1 2 65 70 75 Statistics 101 (Thomas Leininger) U2 - L3: Normal distribution May 23, 2013 19 / 30

Evaluating the normal approximation Anatomy of a normal probability plot Data are plotted on the y-axis of a normal probability plot, and theoretical quantiles (following a normal distribution) on the x-axis. If there is a one-to-one relationship between the data and the theoretical quantiles, then the data follow a nearly normal distribution. Since a one-to-one relationship would appear as a straight line on a scatter plot, the closer the points are to a perfect straight line, the more confident we can be that the data follow the normal model. Constructing a normal probability plot requires calculating percentiles and corresponding z-scores for each observation, which is tedious. Therefore we generally rely on software when making these plots. Statistics 101 (Thomas Leininger) U2 - L3: Normal distribution May 23, 2013 20 / 30

Evaluating the normal approximation Below is a histogram and normal probability plot for the NBA heights from the 2008-2009 season. Do these data appear to follow a normal distribution? Height (inches) 70 75 80 85 90 Theoretical quantiles 3 2 1 0 1 2 3 70 75 80 85 90 Statistics 101 (Thomas Leininger) U2 - L3: Normal distribution May 23, 2013 21 / 30

Evaluating the normal approximation Below is a histogram and normal probability plot for the NBA heights from the 2008-2009 season. Do these data appear to follow a normal distribution? Height (inches) 70 75 80 85 90 Theoretical quantiles 3 2 1 0 1 2 3 70 75 80 85 90 Why do the points on the normal probability have jumps? Statistics 101 (Thomas Leininger) U2 - L3: Normal distribution May 23, 2013 21 / 30

Evaluating the normal approximation Construct a normal probability plot for the data set given below and determine if the data follow an approximately normal distribution. 3.46, 4.02, 5.09, 2.33, 6.47 Observation i 1 2 3 4 5 x i 2.33 3.46 4.02 5.09 6.47 Percentile = i n+1 0.17 0.33 0.50 0.67 0.83 Corrsponding Z i -0.95 0 0.44 0.95 Since the points on the normal probability plot seem to follow a straight line we can say that the distribution is nearly normal. Statistics 101 (Thomas Leininger) U2 - L3: Normal distribution May 23, 2013 22 / 30

Evaluating the normal approximation Construct a normal probability plot for the data set given below and determine if the data follow an approximately normal distribution. 3.46, 4.02, 5.09, 2.33, 6.47 Observation i 1 2 3 4 5 x i 2.33 3.46 4.02 5.09 6.47 Percentile = i n+1 0.17 0.33 0.50 0.67 0.83 Corrsponding Z i -0.95-0.44 0 0.44 0.95 Since the points on the normal probability plot seem to follow a straight line we can say that the distribution is nearly normal. Statistics 101 (Thomas Leininger) U2 - L3: Normal distribution May 23, 2013 22 / 30

Evaluating the normal approximation Normal probability plot and skewness Right Skew - If the plotted points appear to bend up and to the left of the normal line that indicates a long tail to the right. Left Skew - If the plotted points bend down and to the right of the normal line that indicates a long tail to the left. Short Tails - An S shaped-curve indicates shorter than normal tails, i.e. narrower than expected. Long Tails - A curve which starts below the normal line, bends to follow it, and ends above it indicates long tails. That is, you are seeing more variance than you would expect in a normal distribution, i.e. wider than expected. Statistics 101 (Thomas Leininger) U2 - L3: Normal distribution May 23, 2013 23 / 30

Examples (time permitting) 1 Announcements 2 Normal distribution Normal distribution model 68-95-99.7 Rule Standardizing with Z scores Calculating percentiles Recap 3 Evaluating the normal approximation 4 Examples (time permitting) Normal probability and quality control Finding cutoff points Statistics 101

Examples (time permitting) Normal probability and quality control 1 Announcements 2 Normal distribution Normal distribution model 68-95-99.7 Rule Standardizing with Z scores Calculating percentiles Recap 3 Evaluating the normal approximation 4 Examples (time permitting) Normal probability and quality control Finding cutoff points Statistics 101

Examples (time permitting) Normal probability and quality control Six sigma The term six sigma process comes from the notion that if one has six standard deviations between the process mean and the nearest specification limit, as shown in the graph, practically no items will fail to meet specifications. http:// en.wikipedia.org/ wiki/ Six Sigma Statistics 101 (Thomas Leininger) U2 - L3: Normal distribution May 23, 2013 24 / 30

Examples (time permitting) Normal probability and quality control Question At Heinz ketchup factory the amounts which go into bottles of ketchup are supposed to be normally distributed with mean 36 oz. and standard deviation 0.11 oz. Once every 30 minutes a bottle is selected from the production line, and its contents are noted precisely. If the amount of ketchup in the bottle is below 35.8 oz. or above 36.2 oz., then the bottle fails the quality control inspection. What percent of bottles have fewer than 35.8 ounces of ketchup? (a) less than 0.15% (b) between 0.15% and 2.5% (c) between 2.5% and 16% (d) between 16% and 50% Statistics 101 (Thomas Leininger) U2 - L3: Normal distribution May 23, 2013 25 / 30

Examples (time permitting) Normal probability and quality control Question At Heinz ketchup factory the amounts which go into bottles of ketchup are supposed to be normally distributed with mean 36 oz. and standard deviation 0.11 oz. Once every 30 minutes a bottle is selected from the production line, and its contents are noted precisely. If the amount of ketchup in the bottle is below 35.8 oz. or above 36.2 oz., then the bottle fails the quality control inspection. What percent of bottles have fewer than 35.8 ounces of ketchup? (a) less than 0.15% Let X = amount of ketchup in a bottle: X N(µ = 36, σ = 0.11) (b) between 0.15% and 2.5% (c) between 2.5% and 16% (d) between 16% and 50% 35.8 36 Z = 35.8 36 0.11 = 1.82 Statistics 101 (Thomas Leininger) U2 - L3: Normal distribution May 23, 2013 25 / 30

Examples (time permitting) Normal probability and quality control Finding the exact probability - using the Z table Second decimal place of Z 0.09 0.08 0.07 0.06 0.05 0.04 0.03 0.02 0.01 0.00 Z 0.0014 0.0014 0.0015 0.0015 0.0016 0.0016 0.0017 0.0018 0.0018 0.0019 2.9 0.0019 0.0020 0.0021 0.0021 0.0022 0.0023 0.0023 0.0024 0.0025 0.0026 2.8 0.0026 0.0027 0.0028 0.0029 0.0030 0.0031 0.0032 0.0033 0.0034 0.0035 2.7 0.0036 0.0037 0.0038 0.0039 0.0040 0.0041 0.0043 0.0044 0.0045 0.0047 2.6 0.0048 0.0049 0.0051 0.0052 0.0054 0.0055 0.0057 0.0059 0.0060 0.0062 2.5 0.0064 0.0066 0.0068 0.0069 0.0071 0.0073 0.0075 0.0078 0.0080 0.0082 2.4 0.0084 0.0087 0.0089 0.0091 0.0094 0.0096 0.0099 0.0102 0.0104 0.0107 2.3 0.0110 0.0113 0.0116 0.0119 0.0122 0.0125 0.0129 0.0132 0.0136 0.0139 2.2 0.0143 0.0146 0.0150 0.0154 0.0158 0.0162 0.0166 0.0170 0.0174 0.0179 2.1 0.0183 0.0188 0.0192 0.0197 0.0202 0.0207 0.0212 0.0217 0.0222 0.0228 2.0 0.0233 0.0239 0.0244 0.0250 0.0256 0.0262 0.0268 0.0274 0.0281 0.0287 1.9 0.0294 0.0301 0.0307 0.0314 0.0322 0.0329 0.0336 0.0344 0.0351 0.0359 1.8 0.0367 0.0375 0.0384 0.0392 0.0401 0.0409 0.0418 0.0427 0.0436 0.0446 1.7 0.0455 0.0465 0.0475 0.0485 0.0495 0.0505 0.0516 0.0526 0.0537 0.0548 1.6 0.0559 0.0571 0.0582 0.0594 0.0606 0.0618 0.0630 0.0643 0.0655 0.0668 1.5 Statistics 101 (Thomas Leininger) U2 - L3: Normal distribution May 23, 2013 26 / 30

Examples (time permitting) Normal probability and quality control Finding the exact probability - using R > pnorm(-1.82, mean = 0, sd = 1) [1] 0.0344 Statistics 101 (Thomas Leininger) U2 - L3: Normal distribution May 23, 2013 27 / 30

Examples (time permitting) Normal probability and quality control Finding the exact probability - using R > pnorm(-1.82, mean = 0, sd = 1) [1] 0.0344 OR Statistics 101 (Thomas Leininger) U2 - L3: Normal distribution May 23, 2013 27 / 30

Examples (time permitting) Normal probability and quality control Finding the exact probability - using R > pnorm(-1.82, mean = 0, sd = 1) [1] 0.0344 OR > pnorm(35.8, mean = 36, sd = 0.11) [1] 0.0345 Statistics 101 (Thomas Leininger) U2 - L3: Normal distribution May 23, 2013 27 / 30

Examples (time permitting) Normal probability and quality control Question At Heinz ketchup factory the amounts which go into bottles of ketchup are supposed to be normally distributed with mean 36 oz. and standard deviation 0.11 oz. Once every 30 minutes a bottle is selected from the production line, and its contents are noted precisely. If the amount of the bottle goes below 35.8 oz. or above 36.2 oz., then the bottle fails the quality control inspection. What percent of bottles pass the quality control inspection? (a) 1.82% (b) 3.44% (c) 6.88% (d) 93.12% (e) 96.56% Statistics 101 (Thomas Leininger) U2 - L3: Normal distribution May 23, 2013 28 / 30

Examples (time permitting) Normal probability and quality control P(35.8 < X < 36.2) =? = 35.8 36 36.2 Statistics 101 (Thomas Leininger) U2 - L3: Normal distribution May 23, 2013 28 / 30

Examples (time permitting) Normal probability and quality control P(35.8 < X < 36.2) =? = - 35.8 36 36.2 36 36.2 Statistics 101 (Thomas Leininger) U2 - L3: Normal distribution May 23, 2013 28 / 30

Examples (time permitting) Normal probability and quality control P(35.8 < X < 36.2) =? = - 35.8 36 36.2 36 36.2 35.8 36 Statistics 101 (Thomas Leininger) U2 - L3: Normal distribution May 23, 2013 28 / 30

Examples (time permitting) Normal probability and quality control P(35.8 < X < 36.2) =? = - 35.8 36 36.2 36 36.2 35.8 36 Z 35.8 = 35.8 36 = 1.82 0.11 Z 36.2 = 36.2 36 = 1.82 0.11 P(35.8 < X < 36.2) = P( 1.82 < Z < 1.82) = 0.9656 0.0344 = 0.9312 Statistics 101 (Thomas Leininger) U2 - L3: Normal distribution May 23, 2013 28 / 30

Examples (time permitting) Finding cutoff points 1 Announcements 2 Normal distribution Normal distribution model 68-95-99.7 Rule Standardizing with Z scores Calculating percentiles Recap 3 Evaluating the normal approximation 4 Examples (time permitting) Normal probability and quality control Finding cutoff points Statistics 101

Examples (time permitting) Finding cutoff points Body temperatures of healthy humans are distributed nearly normally with mean 98.2 F and standard deviation 0.73 F. What is the cutoff for the lowest 3% of human body temperatures? Statistics 101 (Thomas Leininger) U2 - L3: Normal distribution May 23, 2013 29 / 30

Examples (time permitting) Finding cutoff points Body temperatures of healthy humans are distributed nearly normally with mean 98.2 F and standard deviation 0.73 F. What is the cutoff for the lowest 3% of human body temperatures? 0.03? 98.2 Statistics 101 (Thomas Leininger) U2 - L3: Normal distribution May 23, 2013 29 / 30

Examples (time permitting) Finding cutoff points Body temperatures of healthy humans are distributed nearly normally with mean 98.2 F and standard deviation 0.73 F. What is the cutoff for the lowest 3% of human body temperatures? 0.03 0.09 0.08 0.07 0.06 0.05 Z 0.0233 0.0239 0.0244 0.0250 0.0256 1.9 0.0294 0.0301 0.0307 0.0314 0.0322 1.8 0.0367 0.0375 0.0384 0.0392 0.0401 1.7? 98.2 Statistics 101 (Thomas Leininger) U2 - L3: Normal distribution May 23, 2013 29 / 30

Examples (time permitting) Finding cutoff points Body temperatures of healthy humans are distributed nearly normally with mean 98.2 F and standard deviation 0.73 F. What is the cutoff for the lowest 3% of human body temperatures? 0.03 0.09 0.08 0.07 0.06 0.05 Z 0.0233 0.0239 0.0244 0.0250 0.0256 1.9 0.0294 0.0301 0.0307 0.0314 0.0322 1.8 0.0367 0.0375 0.0384 0.0392 0.0401 1.7? 98.2 P(X < x) = 0.03 P(Z < -1.88) = 0.03 obs mean Z = x 98.2 = 1.88 SD 0.73 x = ( 1.88 0.73) + 98.2 = 96.8 Mackowiak, Wasserman, and Levine (1992), A Critical Appraisal of 98.6 Degrees F, the Upper Limit of the Normal Body Temperature, and Other Legacies of Carl Reinhold August Wunderlick. Statistics 101 (Thomas Leininger) U2 - L3: Normal distribution May 23, 2013 29 / 30

Examples (time permitting) Finding cutoff points Question Body temperatures of healthy humans are distributed nearly normally with mean 98.2 F and standard deviation 0.73 F. What is the cutoff for the highest 10% of human body temperatures? (a) 99.1 (b) 97.3 (c) 99.4 (d) 99.6 Statistics 101 (Thomas Leininger) U2 - L3: Normal distribution May 23, 2013 30 / 30

Examples (time permitting) Finding cutoff points Z 0.05 0.06 0.07 0.08 0.09 0.90 98.2? 0.10 1.0 0.8531 0.8554 0.8577 0.8599 0.8621 1.1 0.8749 0.8770 0.8790 0.8810 0.8830 1.2 0.8944 0.8962 0.8980 0.8997 0.9015 1.3 0.9115 0.9131 0.9147 0.9162 0.9177 Statistics 101 (Thomas Leininger) U2 - L3: Normal distribution May 23, 2013 30 / 30

Examples (time permitting) Finding cutoff points Z 0.05 0.06 0.07 0.08 0.09 0.90 98.2? 0.10 1.0 0.8531 0.8554 0.8577 0.8599 0.8621 1.1 0.8749 0.8770 0.8790 0.8810 0.8830 1.2 0.8944 0.8962 0.8980 0.8997 0.9015 1.3 0.9115 0.9131 0.9147 0.9162 0.9177 P(X > x) = 0.10 P(Z < 1.28) = 0.90 obs mean Z = x 98.2 = 1.28 SD 0.73 x = (1.28 0.73) + 98.2 = 99.1 Statistics 101 (Thomas Leininger) U2 - L3: Normal distribution May 23, 2013 30 / 30