Objectives. Types of Statistical Inference. Statistical Inference. Chapter 19 Confidence intervals: Estimating with confidence

Chapter 19 Confidence intervals: The basics

Objectives
Confidence intervals: the basics
Estimating with confidence
Confidence intervals for the proportion or mean
How confidence intervals behave
Choosing the sample size

Statistical Inference
Statistical inference provides methods for drawing conclusions about a population from sample data.

Types of Statistical Inference
Confidence intervals, for estimating the value of a population parameter.
Tests of significance, which assess the evidence for a claim about a population.
Both types of inference are based on the sampling distributions of statistics.
Both report probabilities that state what would happen if we used the inference method many times.
When you use statistical inference, you are acting as if the data are a random sample or come from a randomized experiment.

Estimating with confidence
Although the sample mean, x̄, is a unique number for any particular sample, if you pick a different sample, you will probably get a different sample mean. In fact, you could get many different values for the sample mean, and virtually none of them would actually equal the true population mean, µ.

What does a confidence level really mean?
In repeated samples of the same size, the confidence intervals created will catch the true parameter value (p) that percentage of the time.

What does a confidence level really mean?
Whenever we create a confidence interval, we write a one-sentence interpretation: "Based on our sample, we are 95% confident that the true percentage (or proportion) of (context) is between a% and b%."

Confidence interval
A level C confidence interval for a parameter has two parts:
An interval calculated from the data, usually of the form estimate ± margin of error.
A confidence level C, which gives the probability that the interval will capture the true parameter value in repeated samples, or the success rate for the method.

If the population is normally distributed N(µ, σ), so is the sampling distribution of the sample mean, N(µ, σ/√n). The sampling distribution is narrower than the population distribution, by a factor of √n. Thus, the estimates gained from our samples are always relatively close to the population parameter µ.
[Figure: distribution of sample means of n subjects compared with the population distribution of individual subjects x.]

68-95-99.7 Rule
In 95% of all samples, the mean score for the sample will be within two standard deviations (of the sampling distribution) of the population mean score µ. So the mean x̄ of 500 SAT Math scores will be within 9 points of µ in 95% of all samples.
To say that x̄ ± 9 is a 95% confidence interval for the population mean µ is to say that in repeated trials, 95% of these intervals capture µ.
We are 95% confident that the unknown mean SAT Math score for all California high school seniors lies between 452 and 470.

95% of all sample means will be within roughly 2 standard deviations (2σ/√n) of the (unknown) population parameter µ. Because distances are symmetrical, this implies that the population parameter µ must be within roughly 2 standard deviations of the sample average x̄, in 95% of all samples. This reasoning is the essence of statistical inference.
[Figure: sampling distribution of x̄. Red dot: mean value of an individual sample.]
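The repeated-sampling reading of a confidence level can be checked with a small simulation. The Python sketch below is illustrative only: the population values mu = 461 and sigma = 100, the seed, and the function name coverage_rate are assumptions made for the illustration, not values stated in the slides. It uses 1.96, the 95% critical value of the standard normal distribution, in place of the "roughly 2" of the 68-95-99.7 rule.

```python
import random
from statistics import fmean

def coverage_rate(mu, sigma, n, z_star, trials, seed=0):
    """Fraction of simulated intervals x_bar +/- z*·sigma/sqrt(n) that contain mu."""
    rng = random.Random(seed)              # fixed seed so the run is repeatable
    margin = z_star * sigma / n ** 0.5     # margin of error, sigma assumed known
    hits = 0
    for _ in range(trials):
        # Draw one SRS of size n from a normal population N(mu, sigma).
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        x_bar = fmean(sample)
        # Count the sample as a success if its interval captures mu.
        if x_bar - margin <= mu <= x_bar + margin:
            hits += 1
    return hits / trials

# Hypothetical population: mu = 461 and sigma = 100 are assumed values.
rate = coverage_rate(mu=461, sigma=100, n=500, z_star=1.96, trials=1000)
print(f"Share of intervals capturing mu: {rate:.3f}")
```

The printed share should land close to 0.95, although it varies a little with the seed and the number of trials.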

The weight of single eggs of the brown variety is normally distributed N(65 g, 5 g). Think of a carton of 12 brown eggs as an SRS of size 12.
What is the distribution of the sample means x̄? Normal (mean µ, standard deviation σ/√n) = N(65 g, 1.44 g).
Find the middle 95% of the sample means distribution: roughly ± 2 standard deviations from the mean, or 65 g ± 2.88 g.

You buy a carton of 12 white eggs instead. The box weighs 770 g. The average egg weight from that SRS is thus x̄ = 770 g / 12 ≈ 64.2 g.
Knowing that the standard deviation of egg weight is 5 g, what can you infer about the mean µ of the white egg population? We can be 95% confident that the population mean µ is roughly within ± 2σ/√n of x̄, or 64.2 g ± 2.88 g.

The important z* values
Find the z* for a 90% C.I., a 95% C.I., and a 99% C.I. Summarize your results in a simple table. For the standard normal distribution N(0, 1), use invNorm(p, 0, 1).

Confidence level    z*
90%                 1.645
95%                 1.960
99%                 2.576

Confidence Interval for a Population Mean
Conditions for constructing a confidence interval for µ. The construction of a confidence interval for a population mean is appropriate when:
the data come from an SRS from the population of interest, and
the sampling distribution of x̄ is approximately normal.

How do we find specific z* values?
We can use a table of z values (Table A). For a particular confidence level C, the appropriate z* value is listed just above it. Ex. For a 98% confidence level, z* = 2.326.
We can use software. In Excel, =NORMINV(probability,mean,standard_dev) gives z for a given cumulative probability. Since we want the middle C probability, the probability we require is (1 - C)/2, the area in the lower tail.
Example: For a 98% confidence level, =NORMINV(.01,0,1) = -2.32635 (= -z*, so z* = 2.326).

Constructing a level C confidence interval
Catch the central probability C under a normal curve: go out z* standard deviations on either side of the mean.

Interpreting a confidence interval for a mean
A confidence interval can be expressed as x̄ ± z*·σ_estimate, where σ_estimate is the standard deviation of the estimate (σ/√n for a sample mean).
z*·σ_estimate is called the margin of error.
Two endpoints of an interval: µ is possibly within (x̄ - z*·σ_estimate) to (x̄ + z*·σ_estimate).
A confidence level C (in %) indicates the success rate of the method that produces the interval. It represents the area under the normal curve within ± z* of the center of the curve.
[Figure: normal curve with central area C between -z* and z*.]
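The table lookup and the spreadsheet call can also be reproduced in Python. The sketch below relies on statistics.NormalDist from the standard library. The helper names z_star and mean_confidence_interval are hypothetical, introduced only for this illustration, and the population standard deviation is assumed known, as in the egg example.

```python
from statistics import NormalDist

def z_star(confidence):
    """Critical value z*: the area between -z* and z* under N(0, 1) is `confidence`."""
    tail = (1 - confidence) / 2              # area left in each tail
    return NormalDist().inv_cdf(1 - tail)    # upper cut-off, so the result is positive

def mean_confidence_interval(x_bar, sigma, n, confidence):
    """Interval x_bar +/- z*·sigma/sqrt(n), assuming sigma is known."""
    margin = z_star(confidence) * sigma / n ** 0.5
    return x_bar - margin, x_bar + margin

# Critical values for the usual confidence levels.
for level in (0.90, 0.95, 0.98, 0.99):
    print(f"{level:.0%}  z* = {z_star(level):.3f}")

# White egg carton: 12 eggs weighing 770 g in total, sigma = 5 g.
low, high = mean_confidence_interval(x_bar=770 / 12, sigma=5, n=12, confidence=0.95)
print(f"95% CI for mu: {low:.2f} g to {high:.2f} g")
```

Because the sketch uses z* = 1.960 instead of the rounded value 2, its margin of error comes out at about 2.83 g, slightly smaller than the 2.88 g obtained with the 68-95-99.7 rule.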

Confidence interval
The confidence interval is a range of values with an associated probability or confidence level C. The probability quantifies the chance that the interval contains the true population parameter.
A confidence interval can be expressed as:
Mean ± m, where m is called the margin of error: µ within x̄ ± m. Example: 120 ± 6.
Two endpoints of an interval: µ within (x̄ - m) to (x̄ + m). Ex. 114 to 126.
A confidence level C (in %) indicates the probability that µ falls within the interval. It represents the area under the normal curve within ± m of the center of the curve.

x̄ ± 4.2 is a 95% confidence interval for the population parameter µ. This equation says that in 95% of the cases, the actual value of µ will be within 4.2 units of the value of x̄.

Implications
We don't need to take a lot of random samples to rebuild the sampling distribution and find µ at its center. All we need is one SRS of size n, relying on the properties of the sample means distribution to infer the population mean µ.

Reworded
With 95% confidence, we can say that µ should be within roughly 2 standard deviations (2σ/√n) of our sample mean x̄. In 95% of all possible samples of this size n, µ will indeed fall in our confidence interval. In only 5% of samples would x̄ be farther from µ.

Review: standardizing the normal curve using z
z = (x̄ - µ) / (σ/√n)
Here, we work with the sampling distribution, and σ/√n is its standard deviation (spread). Remember that σ is the standard deviation of the original population.
[Figure: N(64.5, 2.5) and N(µ, σ/√n) standardized to N(0, 1); horizontal axis: standardized height (no units).]

Varying confidence levels
Confidence intervals contain the population mean µ in C% of samples. Different areas under the curve give different confidence levels C.

Practical use of z: z*
z* is related to the chosen confidence level C. C is the area under the standard normal curve between -z* and z*.
The confidence interval is thus: x̄ ± z*σ/√n.
Example: For an 80% confidence level C, 80% of the normal curve's area is contained in the interval.
[Figure: standard normal curve with central area C between -z* and z*.]
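The step from the standardized sample mean to the interval can be written out explicitly. The derivation below is a sketch that assumes the sampling distribution of the sample mean is exactly N(µ, σ/√n) and that σ is known; the symbols follow the slides.

```latex
\begin{align*}
z = \frac{\bar{x}-\mu}{\sigma/\sqrt{n}} &\sim N(0,1) \\
P\left(-z^* \le \frac{\bar{x}-\mu}{\sigma/\sqrt{n}} \le z^*\right) &= C \\
P\left(-z^*\frac{\sigma}{\sqrt{n}} \le \bar{x}-\mu \le z^*\frac{\sigma}{\sqrt{n}}\right) &= C \\
P\left(\bar{x}-z^*\frac{\sigma}{\sqrt{n}} \le \mu \le \bar{x}+z^*\frac{\sigma}{\sqrt{n}}\right) &= C
\end{align*}
```

For the 80% example, the central area C = 0.80 leaves 0.10 in each tail, which gives z* ≈ 1.282 and the interval x̄ ± 1.282σ/√n.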

Link between confidence level and margin of error
The confidence level C determines the value of z* (in Table C). The margin of error also depends on z*: m = z*σ/√n.
Higher confidence C implies a larger margin of error m (thus less precision in our estimates).
A lower confidence level C produces a smaller margin of error m (thus better precision in our estimates).
[Figure: normal curve with central area C between -z* and z*.]

Different confidence intervals for the same set of measurements
Density of bacteria in solution: Measurement equipment has standard deviation σ = 1*10^6 bacteria/ml fluid.
3 measurements: 24, 29, and 31*10^6 bacteria/ml fluid. Mean: x̄ = 28*10^6 bacteria/ml. Find the 96% and 70% CI.
96% confidence interval for the true density, z* = 2.054, and write
x̄ ± z*σ/√n = 28 ± 2.054(1/√3) = 28 ± 1.19*10^6 bacteria/ml
70% confidence interval for the true density, z* = 1.036, and write
x̄ ± z*σ/√n = 28 ± 1.036(1/√3) = 28 ± 0.60*10^6 bacteria/ml

Sample size and experimental design
You may need a certain margin of error (e.g., drug trial, manufacturing specs). In many cases, the population variability (σ) is fixed, but we can choose the number of measurements (n). So plan ahead what sample size to use to achieve that margin of error:
m = z*σ/√n, so n = (z*σ/m)^2
Remember, though, that sample size is not always stretchable at will. There are typically costs and constraints associated with large samples. The best approach is to use the smallest sample size that can give you useful results.

What sample size for a given margin of error?
Density of bacteria in solution: Measurement equipment has standard deviation σ = 1*10^6 bacteria/ml fluid.
How many measurements should you make to obtain a margin of error of at most 0.5*10^6 bacteria/ml with a confidence level of 90%?
For a 90% confidence interval, z* = 1.645, so n = (z*σ/m)^2 = (1.645 * 1/0.5)^2 = 3.29^2 ≈ 10.82.
Using only 10 measurements will not be enough to ensure that m is no more than 0.5*10^6. Therefore, we need at least 11 measurements.

Impact of sample size
The spread in the sampling distribution of the mean is a function of the number of individuals per sample. The larger the sample size, the smaller the standard deviation (spread) of the sample mean distribution. But the spread only decreases at a rate equal to √n: the standard error is σ/√n, so quadrupling the sample size only halves it.
[Figure: standard error plotted against sample size n.]
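The bacteria calculations can be reproduced with a few lines of Python. This is a minimal sketch that assumes a known σ and normally distributed measurement error; the helper names z_star, margin_of_error, and required_sample_size are hypothetical and exist only for this illustration. All quantities are in units of 10^6 bacteria per unit volume of fluid, as in the example.

```python
from math import ceil, sqrt
from statistics import NormalDist, fmean

def z_star(confidence):
    """Critical value z* for a central area `confidence` under N(0, 1)."""
    return NormalDist().inv_cdf(1 - (1 - confidence) / 2)

def margin_of_error(sigma, n, confidence):
    """m = z*·sigma/sqrt(n), with sigma assumed known."""
    return z_star(confidence) * sigma / sqrt(n)

def required_sample_size(sigma, margin, confidence):
    """Smallest whole n with z*·sigma/sqrt(n) <= margin, i.e. n >= (z*·sigma/margin)^2."""
    return ceil((z_star(confidence) * sigma / margin) ** 2)

measurements = [24, 29, 31]   # the three density measurements from the example
sigma = 1.0                   # standard deviation of the measurement equipment

x_bar = fmean(measurements)
m96 = margin_of_error(sigma, len(measurements), 0.96)
m70 = margin_of_error(sigma, len(measurements), 0.70)
n_needed = required_sample_size(sigma, margin=0.5, confidence=0.90)

print(f"96% CI: {x_bar:.0f} +/- {m96:.2f}")
print(f"70% CI: {x_bar:.0f} +/- {m70:.2f}")
print(f"Measurements needed for m <= 0.5 at 90% confidence: {n_needed}")
```

Rounding n up rather than to the nearest integer matters here: rounding down would give a margin of error slightly larger than the target.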