Intro to Scientific Analysis (BIO 100)

THE t-test

Let's Start With an Example

When conducting experiments, we would like to know whether an experimental treatment had an effect on some variable. As a simple but instructive example, suppose we want to know whether a new formulation of fertilizer increases plant growth over that of an old fertilizer formula. To test this, we might measure the growth response (let's say height) of two sets of plants, each of which is grown on one of the two fertilizers. Let's imagine that we grow 10 plants on the old fertilizer and 10 plants on the new fertilizer; the height of each individual plant and the mean for each fertilizer are given in the table below.

Plant Height (m)
Old Fertilizer    New Fertilizer
0.64.04 0.8.64.76 0.77.34.3.7.8.66.3.49 3.07.8.6.9..3
Mean: .4          Mean: .93
SD: 0.            SD: 0.63

As you can see, the calculated mean height of plants grown on the new fertilizer was greater than that of plants grown on the old fertilizer. But wait a minute! Before we jump to the conclusion that the new fertilizer is better than the old, let's take a closer look at the data that give rise to these mean plant heights.

If you look closely at the data, you'll notice that the data are variable. There's variation in plant height within each of the fertilizer treatments, represented by the standard deviation (SD), and there's also variation in plant height between the two treatments. For example, one of the plants grown on the old fertilizer grew quite tall and reached .3 m. In fact, this is taller than seven out of ten of the plants grown on the new fertilizer and taller than the mean height of all plants grown on the new fertilizer!

This raises an important question. How can we claim that the new fertilizer is in fact better if for some plants it is and for some plants it's not? Of course, a few of the plants grown on the new fertilizer were taller than those grown on the old fertilizer, but not all. Correspondingly, many of the plants grown on the old fertilizer were shorter than those grown on the new fertilizer, but not all. We expect some variation. But how much variation is too much for us to consider that there was a significant positive effect of the new fertilizer on plant growth?
Let's look at these data another way. Figure 1 shows the plant heights as points on a graph.

Fig. 1. Individual height (m) measurements of plants grown in old and new fertilizer. The solid lines represent mean height for each treatment (n = 10 for each treatment).

Plotted in this way, you can see that the individual plant heights overlap between the old and new fertilizers; this is due to random variation in plant height within each treatment. Although the calculated means for the old and new fertilizers are different numbers, our conclusion about whether or not there was an effect of the fertilizer treatment on plant growth overall depends on how much variability there is in the data. The more variable our data, the less confident we can be that the means reflect a meaningful difference.

To drive this point home, let's examine a dataset of plant heights on the same fertilizer treatments and with the same means. But this time the data are less variable. Figure 2 shows this new dataset as heights of individual plants grown on the old and new fertilizers. As a measure of variability, let's use the standard deviation (SD). For the old fertilizer, SD = 0.9; for the new fertilizer, SD = 0..

Fig. 2. Individual height (m) measurements of plants grown in old or new fertilizer. Data for each treatment have the same mean as in Fig. 1, but are less variable (SD of old fertilizer = 0.9; SD of new fertilizer = 0.).

If you had a choice between using the data in Figure 1 or Figure 2 to determine whether the old or new fertilizer differed in their effect on plant height growth, which data would you have the most confidence in? Because the data in Figure 2 are less variable than the data in Figure 1, the means we calculated in Figure 2 are actually more precise than those in Figure 1. As a result, we are more confident that the means in Figure 2 differ from one another than we are confident that the means in Figure 1 differ. Thus, the variability of our data is what is truly critical when making conclusions about whether or not real differences actually exist between our populations of interest.

As scientists who are tasked with being objective when making such conclusions, this is where we turn to statistical approaches. As it turns out, the means in Figure 1 do not differ statistically from one another when they are compared using an objective statistical test, whereas the means in Figure 2 do differ significantly.
If you were to conclude that the fertilizers differ simply because the calculated means were different in the Figure 1 data, then you would have made an incorrect conclusion. Statistics minimize the risk of making this type of mistake.

The t-test

The t-test, or Student's t-test, is a statistical test that allows us to compare two sample means. It is called a t-test because we calculate a test statistic called a t-value. The t-value is calculated based on the difference between the two means but takes account of the variation in the data. If the difference between two means is large, then it is likely that the two means are different. However, as described in the fertilizer example above, we must also consider the variability in the data. If variation in the data is low, then it is more likely that any difference in the means is not due to chance alone but to a factor that is causing the means to differ. The size (or magnitude) of the t-value is indicative of how different our sample means are with respect to the variance in the data. A large t-value indicates that the sample means are significantly different, whereas a small t-value indicates no significant difference between the means.
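To see how variability alone changes the t-value while the means stay fixed, here is a minimal Python sketch. The heights are made-up numbers chosen for easy arithmetic (they are not the fertilizer data), the helper names `t_value` and `spread_around` are illustrative, and the sketch assumes the equal-variance form of the t statistic with a pooled variance.

```python
from math import sqrt
from statistics import mean, variance


def t_value(a, b):
    """Two-sample t statistic, assuming both groups share a common variance."""
    n1, n2 = len(a), len(b)
    # Pooled variance: sample variances weighted by their degrees of freedom.
    pooled_var = ((n1 - 1) * variance(a) + (n2 - 1) * variance(b)) / (n1 + n2 - 2)
    return (mean(a) - mean(b)) / sqrt(pooled_var * (1 / n1 + 1 / n2))


def spread_around(center, scale):
    """Five hypothetical heights whose mean is `center`; `scale` sets the spread."""
    deviations = [-1.0, -0.5, 0.0, 0.5, 1.0]  # sum to zero, so the mean is fixed
    return [center + scale * d for d in deviations]


# Same two means (2.0 m vs 1.5 m) in both comparisons; only the spread differs.
t_noisy = t_value(spread_around(2.0, 1.0), spread_around(1.5, 1.0))
t_tight = t_value(spread_around(2.0, 0.2), spread_around(1.5, 0.2))

print(f"high variability: t = {t_noisy:.2f}")  # t = 1.00
print(f"low variability:  t = {t_tight:.2f}")  # t = 5.00
```

The difference between the means is 0.5 m in both cases, yet the t-value is five times larger when the data are five times less variable.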

As an example, let's compare the density of a marsh grass, called Spartina, between two different marshes. The t-test provides an unbiased way of deciding whether any observed difference in the mean density of Spartina between the two marshes is real or simply due to chance. As with all statistical tests, the t-test tests the null hypothesis. In this example, the null hypothesis is: Mean density of Spartina does not differ between marshes. In other words, our null hypothesis states that the means are equal (i.e., $\bar{x}_1 = \bar{x}_2$).

The t-test is based on the t-distribution, which is a distribution that gives the probability of getting a particular t-value for a particular significance level (α) and degrees of freedom (df). Because the t-value reflects the magnitude of the difference between the means, t = 0 if the means are identical (a rare occurrence). Since the t-distribution is based on the null hypothesis (i.e., that the means do not differ), the t-distribution has a mean of zero. The greater the difference between the means (assuming low variance), the higher the t-value will be. The higher the t-value, the less probable it is that you could get a t-value that high if the null hypothesis is true (i.e., means are the same).

The t-distribution was developed to take account of the fact that sample size (n) is small in most practical applications and, therefore, requires a different distribution than the normal distribution. Indeed, the t-distribution is simply a modified form of the normal distribution, which if you remember has certain properties that allow us to assign probabilities of occurrence for our data. For example, data points that fall farther away from the center (i.e., the mean) of a normal curve are less likely (less probable) to occur than those that land closer to the center of the curve. This also applies to the t-distribution. The t-test relies on this property when it compares two means.

To say that two means are statistically different, we use the 5% significance level (P < 0.05). That is, the difference between the means must be great enough such that it is improbable that we would get such a large difference between the means if, in fact, they were the same.
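The link between a t-value and its probability can be sketched numerically. The Python example below is one possible approach, not the method used in the course: it integrates the standard density of Student's t-distribution with Simpson's rule to estimate the two-tailed probability of a t-value at least as extreme as the one observed. The function names `t_pdf` and `two_tailed_p` are illustrative.

```python
from math import gamma, pi, sqrt


def t_pdf(t, df):
    """Density of Student's t-distribution with df degrees of freedom."""
    coeff = gamma((df + 1) / 2) / (sqrt(df * pi) * gamma(df / 2))
    return coeff * (1 + t * t / df) ** (-(df + 1) / 2)


def two_tailed_p(t, df, steps=2000):
    """P(|T| >= |t|) when the null hypothesis is true.

    Integrates the density from 0 to |t| with Simpson's rule (steps must be
    even), doubles it by symmetry, and subtracts from 1. Intended for modest
    df (such as 1 to 100); gamma() overflows for very large df.
    """
    upper = abs(t)
    if upper == 0:
        return 1.0
    h = upper / steps
    total = t_pdf(0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area_zero_to_t = total * h / 3
    return 1 - 2 * area_zero_to_t


# With 8 degrees of freedom, a t-value of 2.306 leaves about 5% in the tails.
print(f"{two_tailed_p(2.306, 8):.3f}")  # 0.050
```

Larger t-values give smaller tail probabilities, which is the property the test relies on when it compares two means.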
Said another way, using the 5% significance level, there is a 5% chance that we would get a t-value outside of the 95% level if the means were the same. The t-distribution and its 5% (2 x 2.5%) probability regions are shown at right.

[Figure: the t-distribution, centered on 0, with a 2.5% probability region in each tail.]

The assumptions of the t-test are:
1) the data are distributed normally, that is, the frequency distribution of the data forms a normal (bell-shaped) curve;
2) the variances of the two samples being compared are approximately equal; and
3) the samples are independent.

In a t-test, the t-value is calculated based on the difference in the sample means ($\bar{x}_1$ and $\bar{x}_2$) and the standard error of the difference between sample means ($s_{\bar{x}_1 - \bar{x}_2}$):

$$t = \frac{\bar{x}_1 - \bar{x}_2}{s_{\bar{x}_1 - \bar{x}_2}}$$

It turns out that:

$$s_{\bar{x}_1 - \bar{x}_2}^2 = s_{\bar{x}_1}^2 + s_{\bar{x}_2}^2$$

This is because there is a mathematical relationship that equates the squared standard error of the difference in the means with the sum of the squared standard errors of the two sample means, each of which is the sample variance divided by the sample size ($s_{\bar{x}_i}^2 = s_i^2 / n_i$). Substituting this for $s_{\bar{x}_1 - \bar{x}_2}$ in the equation for t, we get:

$$t = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}}$$

where the $s_i^2$ values are the variances of the individual variables. Assuming the two variances are approximately equal, this equation can be rearranged as follows:

$$t = \frac{\bar{x}_1 - \bar{x}_2}{s \sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}}$$

where $s$ equals the pooled standard deviation of both samples, calculated from:

$$s^2 = \frac{\left[\sum_{i=1}^{n_1} x_{1i}^2 - \dfrac{\left(\sum_{i=1}^{n_1} x_{1i}\right)^2}{n_1}\right] + \left[\sum_{i=1}^{n_2} x_{2i}^2 - \dfrac{\left(\sum_{i=1}^{n_2} x_{2i}\right)^2}{n_2}\right]}{n_1 + n_2 - 2}$$

For any particular t-test, the calculated t-value will lie somewhere within the t-distribution. The t-value we get will have a probability associated with it. That probability is a measure of how likely it would be for us to get a t-value as large or larger than we did by chance alone. Therefore, if our t-value is large (toward the tails of the curve), a value that extreme would rarely arise if the means were equal. By contrast, if t is small (toward the center of the curve, that is, near zero), such a value is quite likely to arise under the null hypothesis.

By convention, we set the probability cut-off point for the t-value at 0.05 (or 5%) on the curve. This cut-off value of t is called the critical value. The critical t-value for any particular t-test can be looked up in a t-table (called "Critical values of the t distribution"; included below). To determine the critical t-value you need to know the significance level (α) (in this case 5% or 0.05) and the degrees of freedom (df). The df for a t-test is: df = ($n_1 + n_2 - 2$).

If the t-value calculated from a t-test is larger (in absolute value) than the critical value, then we reject the null hypothesis. By contrast, if the t-value calculated from a t-test is smaller than the critical value, then we fail to reject the null hypothesis.
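The step from separate sample variances to a single pooled standard deviation can be sketched as a short derivation. It assumes the usual definition of the sample variance, $s_k^2 = \sum_i (x_{ki} - \bar{x}_k)^2 / (n_k - 1)$, which the text does not spell out, and it uses $SS_k$ (sum of squares for sample $k$) as shorthand introduced here for illustration.

```latex
\begin{align*}
SS_k &= \sum_{i=1}^{n_k} (x_{ki} - \bar{x}_k)^2
      = \sum_{i=1}^{n_k} x_{ki}^2 - \frac{\left(\sum_{i=1}^{n_k} x_{ki}\right)^2}{n_k}
      = (n_k - 1)\, s_k^2, \qquad k = 1, 2 \\[6pt]
s^2 &= \frac{SS_1 + SS_2}{n_1 + n_2 - 2}
     = \frac{(n_1 - 1)\, s_1^2 + (n_2 - 1)\, s_2^2}{n_1 + n_2 - 2} \\[6pt]
t &= \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\dfrac{s^2}{n_1} + \dfrac{s^2}{n_2}}}
   = \frac{\bar{x}_1 - \bar{x}_2}{s \sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}}
\end{align*}
```

In words: when the two variances are assumed equal, each $s_k^2$ is replaced by one shared estimate $s^2$, a weighted average of the two sample variances, and $s$ then factors out of the square root.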

Example of a t-test

For the Spartina example above, let's say that we collect plant counts from five m plots in each of the two marshes. We get the following data:

Plot #      Marsh A     Marsh B
1           20.7        14.3
2           21.0        14.4
3           20.5        11.8
4           18.8        11.6
5           18.6        14.2

Mean        19.92       13.26
Σx          99.6        66.3
Σx²         1989.14     887.29

Using our calculation equation for the pooled standard deviation ($s$):

$$s^2 = \frac{\left[1989.14 - \dfrac{(99.6)^2}{5}\right] + \left[887.29 - \dfrac{(66.3)^2}{5}\right]}{5 + 5 - 2} = \frac{5.108 + 8.152}{8} = 1.6575$$

Therefore $s = 1.287$ and:

$$t = \frac{19.92 - 13.26}{1.287 \sqrt{\dfrac{1}{5} + \dfrac{1}{5}}}$$

which gives:

$$t = \frac{6.66}{1.287 \sqrt{0.4}} = \frac{6.66}{0.814} = 8.18$$

The critical t-value at the 0.05 significance level with 8 degrees of freedom (df) is 2.306 (i.e., $t_{(0.05,\,8)} = 2.306$). Because our calculated t-value (8.18) is higher than the critical t-value at the 0.05 significance level, there is less than a 5% (P < 0.05) chance that we could get a t-value as large or larger than we observed if the null hypothesis is true. Therefore, we conclude that the mean density of Spartina in marsh A is significantly different from its density in marsh B.

When evaluating statistical significance, you must always report the significance level used and the P-value in your results.
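The arithmetic in this example can be checked with a few lines of Python. This is a minimal sketch that follows the pooled standard deviation equation step by step, using the ten plot counts from the table above; the variable and function names are illustrative, and the critical value 2.306 is taken from the t-table rather than computed.

```python
from math import sqrt

marsh_a = [20.7, 21.0, 20.5, 18.8, 18.6]
marsh_b = [14.3, 14.4, 11.8, 11.6, 14.2]


def sum_of_squares(values):
    """Sum of squared deviations, computed as sum(x^2) - (sum(x))^2 / n."""
    n = len(values)
    return sum(x * x for x in values) - sum(values) ** 2 / n


n1, n2 = len(marsh_a), len(marsh_b)
df = n1 + n2 - 2

ss_a = sum_of_squares(marsh_a)
ss_b = sum_of_squares(marsh_b)
pooled_sd = sqrt((ss_a + ss_b) / df)

mean_diff = sum(marsh_a) / n1 - sum(marsh_b) / n2
t = mean_diff / (pooled_sd * sqrt(1 / n1 + 1 / n2))

CRITICAL_T = 2.306  # t-table value for alpha = 0.05 (2-tailed), df = 8

print(f"SS_A = {ss_a:.3f}, SS_B = {ss_b:.3f}, pooled SD = {pooled_sd:.3f}")
print(f"t = {t:.2f}, df = {df}, reject null: {abs(t) > CRITICAL_T}")
# SS_A = 5.108, SS_B = 8.152, pooled SD = 1.287
# t = 8.18, df = 8, reject null: True
```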

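CRITICAL VALUES OF THE t DISTRIBUTION

      Significance level (α, 2-tailed)        Significance level (α, 2-tailed)
 df      0.1     0.05     0.01          df      0.1     0.05     0.01
  1    6.314   12.706   63.657          51    1.675    2.008    2.676
  2    2.920    4.303    9.925          52    1.675    2.007    2.674
  3    2.353    3.182    5.841          53    1.674    2.006    2.672
  4    2.132    2.776    4.604          54    1.674    2.005    2.670
  5    2.015    2.571    4.032          55    1.673    2.004    2.668
  6    1.943    2.447    3.707          56    1.673    2.003    2.667
  7    1.895    2.365    3.499          57    1.672    2.002    2.665
  8    1.860    2.306    3.355          58    1.672    2.002    2.663
  9    1.833    2.262    3.250          59    1.671    2.001    2.662
 10    1.812    2.228    3.169          60    1.671    2.000    2.660
 11    1.796    2.201    3.106          61    1.670    2.000    2.659
 12    1.782    2.179    3.055          62    1.670    1.999    2.657
 13    1.771    2.160    3.012          63    1.669    1.998    2.656
 14    1.761    2.145    2.977          64    1.669    1.998    2.655
 15    1.753    2.131    2.947          65    1.669    1.997    2.654
 16    1.746    2.120    2.921          66    1.668    1.997    2.652
 17    1.740    2.110    2.898          67    1.668    1.996    2.651
 18    1.734    2.101    2.878          68    1.668    1.995    2.650
 19    1.729    2.093    2.861          69    1.667    1.995    2.649
 20    1.725    2.086    2.845          70    1.667    1.994    2.648
 21    1.721    2.080    2.831          71    1.667    1.994    2.647
 22    1.717    2.074    2.819          72    1.666    1.993    2.646
 23    1.714    2.069    2.807          73    1.666    1.993    2.645
 24    1.711    2.064    2.797          74    1.666    1.993    2.644
 25    1.708    2.060    2.787          75    1.665    1.992    2.643
 26    1.706    2.056    2.779          76    1.665    1.992    2.642
 27    1.703    2.052    2.771          77    1.665    1.991    2.641
 28    1.701    2.048    2.763          78    1.665    1.991    2.640
 29    1.699    2.045    2.756          79    1.664    1.990    2.640
 30    1.697    2.042    2.750          80    1.664    1.990    2.639
 31    1.696    2.040    2.744          81    1.664    1.990    2.638
 32    1.694    2.037    2.738          82    1.664    1.989    2.637
 33    1.692    2.035    2.733          83    1.663    1.989    2.636
 34    1.691    2.032    2.728          84    1.663    1.989    2.636
 35    1.690    2.030    2.724          85    1.663    1.988    2.635
 36    1.688    2.028    2.719          86    1.663    1.988    2.634
 37    1.687    2.026    2.715          87    1.663    1.988    2.634
 38    1.686    2.024    2.712          88    1.662    1.987    2.633
 39    1.685    2.023    2.708          89    1.662    1.987    2.632
 40    1.684    2.021    2.704          90    1.662    1.987    2.632
 41    1.683    2.020    2.701          91    1.662    1.986    2.631
 42    1.682    2.018    2.698          92    1.662    1.986    2.630
 43    1.681    2.017    2.695          93    1.661    1.986    2.630
 44    1.680    2.015    2.692          94    1.661    1.986    2.629
 45    1.679    2.014    2.690          95    1.661    1.985    2.629
 46    1.679    2.013    2.687          96    1.661    1.985    2.628
 47    1.678    2.012    2.685          97    1.661    1.985    2.627
 48    1.677    2.011    2.682          98    1.661    1.984    2.627
 49    1.677    2.010    2.680          99    1.660    1.984    2.626
 50    1.676    2.009    2.678         100    1.660    1.984    2.626
                                         ∞    1.645    1.960    2.576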
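As the degrees of freedom grow, the critical values in the table settle toward 1.645, 1.960 and 2.576, which are the corresponding cut-offs of the standard normal distribution. The Python sketch below illustrates that connection using `statistics.NormalDist` from the standard library (Python 3.8 or later); the variable names are illustrative, and the check covers only this large-sample limit, not the rest of the table.

```python
from statistics import NormalDist

standard_normal = NormalDist()  # mean 0, standard deviation 1

# For a 2-tailed test, alpha is split evenly between the two tails,
# so the cut-off is the (1 - alpha / 2) quantile.
critical_values = {
    alpha: standard_normal.inv_cdf(1 - alpha / 2)
    for alpha in (0.1, 0.05, 0.01)
}

for alpha, critical in critical_values.items():
    print(f"alpha = {alpha}: critical value = {critical:.3f}")
# alpha = 0.1: critical value = 1.645
# alpha = 0.05: critical value = 1.960
# alpha = 0.01: critical value = 2.576
```

For small samples the t-table values are noticeably larger than these (2.306 rather than 1.960 at df = 8), which is why the t-distribution is used instead of the normal distribution when n is small.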