Intro to Scientific Analysis (BIO 100): The t-test


THE t-test

Let's Start With an Example

When conducting experiments, we would like to know whether an experimental treatment had an effect on some variable. As a simple but instructive example, suppose we want to know whether a new formulation of fertilizer increases plant growth over that of an old fertilizer formula. To test this, we might measure the growth response (let's say height) of two sets of plants, each of which is grown on one of the two fertilizers. Let's imagine that we grow 10 plants on the old fertilizer and 10 plants on the new fertilizer; the height of each individual plant and the mean for each fertilizer are given in the table below.

Plant Height (m)
Old Fertilizer New Fertilizer
0.64.04 0.8.64.76 0.77.34.3.7.8.66.3.49 3.07.8.6.9..3
Mean: .4 Mean: .93
SD: 0. SD: 0.63

As you can see, the calculated mean height of plants grown on the new fertilizer was greater than that of plants grown on the old fertilizer. But wait a minute! Before we jump to the conclusion that the new fertilizer is better than the old, let's take a closer look at the data that give rise to these mean plant heights. If you look closely at the data, you'll notice that the data are variable. There's variation in plant height within each of the fertilizer treatments, represented by the standard deviation (SD), and there's also variation in plant height between the two treatments. For example, one of the plants grown on the old fertilizer grew quite tall and reached .3 m. In fact, this is taller than seven out of ten of the plants grown on the new fertilizer and taller than the mean height of all plants grown on the new fertilizer!

This raises an important question. How can we claim that the new fertilizer is in fact better if for some plants it is and for some plants it's not? Of course, a few of the plants grown on the new fertilizer were taller than those grown on the old fertilizer, but not all. Correspondingly, many of the plants grown on the old fertilizer were shorter than those grown on the new fertilizer, but not all. We expect some variation. But how much variation is too much for us to consider that there was a significant positive effect of the new fertilizer on plant growth?
Let's look at these data another way. Figure 1 shows the plant heights above as points on a graph.

Fig. 1. Individual height (m) measurements of plants grown in old and new fertilizer. The solid lines represent mean height for each treatment (n = 10 for each treatment).

Plotted in this way, you can see that the individual plant heights overlap between the old and new fertilizer; this is due to random variation in plant height within each treatment. Although the calculated values of the means for the old and new fertilizer are different numbers, our conclusion about whether or not there was an effect of the fertilizer treatment on plant growth overall depends on how much variability there is in the data. The more variable our data, the less confident we can be that the means reflect a meaningful difference.

To drive this point home, let's examine a dataset of plant heights on the same fertilizer treatments and with the same means. But this time the data are less variable. Figure 2 shows this new dataset as heights of individual plants grown on the old and new fertilizer. As a measure of variability, let's use the standard deviation (SD). For the old fertilizer, SD = 0.9; for the new fertilizer, SD = 0.

Fig. 2. Individual height (m) measurements of plants grown in old or new fertilizer. Data for each treatment have the same means as in Fig. 1, but are less variable (SD of old fertilizer = 0.9; SD of new fertilizer = 0.).

If you had a choice between using the data in Figure 1 or Figure 2 to determine whether the old or new fertilizer differed in their effect on plant height growth, which data would you have the most confidence in? Because the data in Figure 2 are less variable than the data in Figure 1, the means we calculated in Figure 2 are actually more precise than those in Figure 1. As a result, we are more confident that the means in Figure 2 differ from one another than we are confident that the means in Figure 1 differ. Thus, the variability of our data is what is truly critical when making conclusions about whether or not real differences actually exist between our populations of interest. As scientists who are tasked with being objective when making such conclusions, this is where we turn to statistical approaches. As it turns out, the means in Figure 1 do not differ statistically from one another when they are compared using an objective statistical test, whereas the means in Figure 2 do differ significantly.
If you were to conclude that the fertilizers differ simply because the calculated means were different in the Figure 1 data, then you would have made an incorrect conclusion. Statistics minimize the risk of making this type of mistake.

The t-test

The t-test, or Student's t-test, is a statistical test that allows us to compare two sample means. It is called a t-test because we calculate a test statistic called a t-value. The t-value is calculated based on the difference between the two means but takes account of the variation in the data. If the difference between two means is large, then it is likely that the two means are different. However, as described in the fertilizer example above, we must also consider the variability in the data. If variation in the data is low, then it is more likely that any difference in the means is not due to chance alone but to a factor that is causing the means to differ. The size (or magnitude) of the t-value is indicative of how different our sample means are with respect to the variance in the data. A large t-value indicates that the sample means are significantly different, whereas a small t-value indicates no significant difference between the means.
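To see how variability changes the t-value while the difference in means stays fixed, here is a minimal Python sketch. The heights are made-up illustrative values, not the data plotted in the figures, and `pooled_t` is a hypothetical helper name. It assumes the two groups have roughly equal variances.

```python
from math import sqrt
from statistics import mean, variance


def pooled_t(a, b):
    """Two-sample t-value using a pooled variance (assumes equal variances)."""
    n1, n2 = len(a), len(b)
    # Pooled variance: sample variances weighted by their degrees of freedom.
    pooled_var = ((n1 - 1) * variance(a) + (n2 - 1) * variance(b)) / (n1 + n2 - 2)
    return (mean(a) - mean(b)) / sqrt(pooled_var * (1 / n1 + 1 / n2))


# Hypothetical heights (m): both pairs of groups have means of 1.5 and 1.9.
old_wide = [0.9, 1.2, 1.5, 1.8, 2.1]
new_wide = [1.3, 1.6, 1.9, 2.2, 2.5]
old_tight = [1.40, 1.45, 1.50, 1.55, 1.60]
new_tight = [1.80, 1.85, 1.90, 1.95, 2.00]

t_wide = pooled_t(new_wide, old_wide)     # same mean difference, large spread
t_tight = pooled_t(new_tight, old_tight)  # same mean difference, small spread

print(f"variable data:      t = {t_wide:.2f}")
print(f"less variable data: t = {t_tight:.2f}")
```

With these made-up numbers the mean difference is 0.4 m in both cases, yet the less variable data give a t-value about six times larger.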

As an example, let's compare the density of a marsh grass, called Spartina, between two different marshes. The t-test provides an unbiased way of deciding whether any observed difference in the mean density of Spartina between the two marshes is real or simply due to chance. As with all statistical tests, the t-test tests the null hypothesis. In this example, the null hypothesis is: Mean density of Spartina does not differ between marshes. In other words, our null hypothesis states that the means are equal (i.e., $\bar{x}_1 = \bar{x}_2$).

The t-test is based on the t-distribution, which is a distribution that gives the probability of getting a particular t-value for a particular significance level (α) and degrees of freedom (df). Because the t-value reflects the magnitude of the difference between the means, t = 0 if the means are identical (a rare occurrence). Since the t-distribution is based on the null hypothesis (i.e., that the means do not differ), the t-distribution has a mean of zero. The greater the difference between the means (assuming low variance), the higher the t-value will be. The higher the t-value, the less probable it is that you could get a t-value that high if the null hypothesis is true (i.e., means are the same).

The t-distribution was developed to take account of the fact that sample size (n) is small in most practical applications and, therefore, requires a different distribution than the normal distribution. Indeed, the t-distribution is simply a modified form of the normal distribution, which if you remember has certain properties that allow us to assign probabilities of occurrence for our data. For example, data points that fall farther away from the center (i.e., the mean) of a normal curve are less likely (less probable) to occur than those that land closer to the center of the curve. This also applies to the t-distribution. The t-test relies on this property when it compares two means. To say that two means are statistically different, we use the 5% significance level (P < 0.05). That is, the difference between the means must be great enough such that it is improbable that we would get such a large difference between the means if, in fact, they were the same.
Said another way, using the 5% significance level, there is a 5% chance that we would get a t-value beyond the 5% cut-off if the means were the same. The t-distribution and its 5% (2 x 2.5%) probability regions are shown at right.

[Figure: the t-distribution, centered on 0, with a 2.5% probability region in each tail.]

The assumptions of the t-test are: 1) the data are distributed normally; that is, the frequency distribution of the data forms a normal (bell-shaped) curve; 2) the variances of the two samples being compared are approximately equal; and 3) the samples are independent.

In a t-test, the t-value is calculated based on the difference in the sample means ($\bar{x}_1$ and $\bar{x}_2$) and the standard error of the difference between sample means ($s_{\bar{x}_1 - \bar{x}_2}$):

$$t = \frac{\bar{x}_1 - \bar{x}_2}{s_{\bar{x}_1 - \bar{x}_2}}$$

It turns out that:

$$s_{\bar{x}_1 - \bar{x}_2} = \sqrt{s_{\bar{x}_1}^2 + s_{\bar{x}_2}^2}$$

This is because there is a mathematical relationship that equates the squared standard error of the difference in the means with the sum of the squared standard errors of the two means. Substituting this for $s_{\bar{x}_1 - \bar{x}_2}$ in the equation for t, we get:

$$t = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}}$$

where the $s_i^2$ values are the variances of the individual variables. This equation can be rearranged as follows:

$$t = \frac{\bar{x}_1 - \bar{x}_2}{s_p \sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}}$$

where $s_p$ equals the pooled standard deviation of both samples:

$$s_p = \sqrt{\frac{\left(\sum_{i=1}^{n_1} x_{1i}^2 - \dfrac{\left(\sum_{i=1}^{n_1} x_{1i}\right)^2}{n_1}\right) + \left(\sum_{i=1}^{n_2} x_{2i}^2 - \dfrac{\left(\sum_{i=1}^{n_2} x_{2i}\right)^2}{n_2}\right)}{n_1 + n_2 - 2}}$$

For any particular t-test, the calculated t-value will lie somewhere within the t-distribution. The t-value we get will have a probability associated with it. That probability is a measure of how likely it would be for us to get a t-value as large or larger than we did by chance alone. Therefore, if our t-value is large (towards the tails of the curve), then such a value would be improbable if the means were equal. By contrast, if t is small (towards the center of the curve, that is, zero), then such a value is quite probable under the null hypothesis, and we have no grounds to reject it.

By convention, we set the probability cut-off point for the t-value at 0.05 (or 5%) on the curve. This cut-off value of t is called the critical value. The critical t-value for any particular t-test can be looked up in a t-table (called "Critical values of the t distribution"; included below). To determine the critical t-value you need to know the significance level (α) (in this case 5% or 0.05) and the degrees of freedom (df). The df for a t-test is: df = ($n_1 + n_2 - 2$). If the t-value calculated from a t-test is larger than the critical value, then we reject the null hypothesis. By contrast, if the t-value calculated from a t-test is smaller than the critical value, then we fail to reject the null hypothesis.
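The rearrangement of the t equation into its pooled form can be made explicit. As a sketch, assuming (as the t-test does) that both samples share a common variance, each $s_i^2$ is replaced by the pooled variance $s_p^2$, which can then be factored out of the square root:

$$\sqrt{\frac{s_p^2}{n_1} + \frac{s_p^2}{n_2}} = s_p \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}, \qquad s_p^2 = \frac{SS_1 + SS_2}{n_1 + n_2 - 2}$$

Here $SS_j = \sum_i x_{ji}^2 - \left(\sum_i x_{ji}\right)^2 / n_j$ is shorthand (introduced only for this note) for the sum of squared deviations of sample $j$ from its own mean, so $s_p^2$ is the two sample variances averaged with weights $n_1 - 1$ and $n_2 - 1$.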

Example of a t-test

For the Spartina example above, let's say that we collect plant counts from five m plots in each of the two marshes. We get the following data:

Plot #     Marsh A     Marsh B
1          20.7        14.3
2          21.0        14.4
3          20.5        11.8
4          18.8        11.6
5          18.6        14.2
Mean       19.92       13.26
Σx         99.6        66.3
Σx²        1989.14     887.29

Using our calculation equation for the pooled standard deviation ($s_p$):

$$s_p = \sqrt{\frac{\left(1989.14 - \dfrac{(99.6)^2}{5}\right) + \left(887.29 - \dfrac{(66.3)^2}{5}\right)}{5 + 5 - 2}} = \sqrt{\frac{5.108 + 8.152}{8}} = 1.287$$

Therefore:

$$t = \frac{19.92 - 13.26}{1.287 \sqrt{\dfrac{1}{5} + \dfrac{1}{5}}} = \frac{6.66}{1.287\,(\sqrt{0.4})} = \frac{6.66}{0.814} = 8.18$$

The critical t-value at the 0.05 significance level with 8 degrees of freedom (df) is 2.306 (i.e., $t_{(0.05,\,8)}$ = 2.306). Because our calculated t-value (8.18) is higher than the critical t-value at the 0.05 significance level, there is less than a 5% (P < 0.05) chance that we could get a t-value as large or larger than we observed if the null hypothesis is true. Therefore, we conclude that the mean density of Spartina in marsh A is significantly different from its density in marsh B. When evaluating statistical significance, you must always report the significance level used and the P-value in your results.
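The hand calculation above can be checked with a short Python sketch. It assumes the plot counts are 20.7, 21.0, 20.5, 18.8, 18.6 for marsh A and 14.3, 14.4, 11.8, 11.6, 14.2 for marsh B, values that agree with the reported sums (Σx = 99.6 and 66.3). The helper name `sum_of_squares` is hypothetical.

```python
from math import sqrt

# Plot counts assumed for the two marshes (consistent with the reported sums).
marsh_a = [20.7, 21.0, 20.5, 18.8, 18.6]
marsh_b = [14.3, 14.4, 11.8, 11.6, 14.2]


def sum_of_squares(x):
    """Corrected sum of squares: sum(x^2) - (sum(x))^2 / n."""
    return sum(v * v for v in x) - sum(x) ** 2 / len(x)


n1, n2 = len(marsh_a), len(marsh_b)
df = n1 + n2 - 2

# Pooled standard deviation from the computational formula.
s_p = sqrt((sum_of_squares(marsh_a) + sum_of_squares(marsh_b)) / df)

# t-value: difference in means over its pooled standard error.
mean_a = sum(marsh_a) / n1
mean_b = sum(marsh_b) / n2
t = (mean_a - mean_b) / (s_p * sqrt(1 / n1 + 1 / n2))

critical_t = 2.306  # two-tailed, alpha = 0.05, df = 8 (from the t-table)
print(f"s_p = {s_p:.3f}, t = {t:.2f}, df = {df}")
print("reject null hypothesis" if t > critical_t else "fail to reject null hypothesis")
```

Working from the sums Σx and Σx² mirrors the hand formula; computing each sample variance directly would give the same pooled value.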

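CRITICAL VALUES OF THE t DISTRIBUTION

       Significance level (α, 2-tailed)        Significance level (α, 2-tailed)
df     0.1      0.05     0.01          df      0.1      0.05     0.01
1      6.314    12.706   63.657        51      1.675    2.008    2.676
2      2.920    4.303    9.925         52      1.675    2.007    2.674
3      2.353    3.182    5.841         53      1.674    2.006    2.672
4      2.132    2.776    4.604         54      1.674    2.005    2.670
5      2.015    2.571    4.032         55      1.673    2.004    2.668
6      1.943    2.447    3.707         56      1.673    2.003    2.667
7      1.895    2.365    3.499         57      1.672    2.002    2.665
8      1.860    2.306    3.355         58      1.672    2.002    2.663
9      1.833    2.262    3.250         59      1.671    2.001    2.662
10     1.812    2.228    3.169         60      1.671    2.000    2.660
11     1.796    2.201    3.106         61      1.670    2.000    2.659
12     1.782    2.179    3.055         62      1.670    1.999    2.657
13     1.771    2.160    3.012         63      1.669    1.998    2.656
14     1.761    2.145    2.977         64      1.669    1.998    2.655
15     1.753    2.131    2.947         65      1.669    1.997    2.654
16     1.746    2.120    2.921         66      1.668    1.997    2.652
17     1.740    2.110    2.898         67      1.668    1.996    2.651
18     1.734    2.101    2.878         68      1.668    1.995    2.650
19     1.729    2.093    2.861         69      1.667    1.995    2.649
20     1.725    2.086    2.845         70      1.667    1.994    2.648
21     1.721    2.080    2.831         71      1.667    1.994    2.647
22     1.717    2.074    2.819         72      1.666    1.993    2.646
23     1.714    2.069    2.807         73      1.666    1.993    2.645
24     1.711    2.064    2.797         74      1.666    1.993    2.644
25     1.708    2.060    2.787         75      1.665    1.992    2.643
26     1.706    2.056    2.779         76      1.665    1.992    2.642
27     1.703    2.052    2.771         77      1.665    1.991    2.641
28     1.701    2.048    2.763         78      1.665    1.991    2.640
29     1.699    2.045    2.756         79      1.664    1.990    2.640
30     1.697    2.042    2.750         80      1.664    1.990    2.639
31     1.696    2.040    2.744         81      1.664    1.990    2.638
32     1.694    2.037    2.738         82      1.664    1.989    2.637
33     1.692    2.035    2.733         83      1.663    1.989    2.636
34     1.691    2.032    2.728         84      1.663    1.989    2.636
35     1.690    2.030    2.724         85      1.663    1.988    2.635
36     1.688    2.028    2.719         86      1.663    1.988    2.634
37     1.687    2.026    2.715         87      1.663    1.988    2.634
38     1.686    2.024    2.712         88      1.662    1.987    2.633
39     1.685    2.023    2.708         89      1.662    1.987    2.632
40     1.684    2.021    2.704         90      1.662    1.987    2.632
41     1.683    2.020    2.701         91      1.662    1.986    2.631
42     1.682    2.018    2.698         92      1.662    1.986    2.630
43     1.681    2.017    2.695         93      1.661    1.986    2.630
44     1.680    2.015    2.692         94      1.661    1.986    2.629
45     1.679    2.014    2.690         95      1.661    1.985    2.629
46     1.679    2.013    2.687         96      1.661    1.985    2.628
47     1.678    2.012    2.685         97      1.661    1.985    2.627
48     1.677    2.011    2.682         98      1.661    1.984    2.627
49     1.677    2.010    2.680         99      1.660    1.984    2.626
50     1.676    2.009    2.678         100     1.660    1.984    2.626
                                       ∞       1.645    1.960    2.576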
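The table lookup and the decision rule can be sketched in Python as follows. Only a few rows of the table are copied in, and the names `CRITICAL_T` and `is_significant` are hypothetical. Taking the absolute value of t is an assumption added here, so that a negative t-value (second mean larger than the first) is treated the same way in a two-tailed test.

```python
# Excerpt of the two-tailed table: df -> {significance level: critical t}.
CRITICAL_T = {
    4: {0.1: 2.132, 0.05: 2.776, 0.01: 4.604},
    8: {0.1: 1.860, 0.05: 2.306, 0.01: 3.355},
    18: {0.1: 1.734, 0.05: 2.101, 0.01: 2.878},
}


def is_significant(t_value, df, alpha=0.05):
    """Return True if |t| exceeds the tabulated critical value (reject the null)."""
    # Raises KeyError for a df or alpha that is not in the excerpt above.
    return abs(t_value) > CRITICAL_T[df][alpha]


print(is_significant(8.18, df=8))              # large t: reject the null hypothesis
print(is_significant(1.33, df=8))              # small t: fail to reject
print(is_significant(2.50, df=8, alpha=0.01))  # significant at 0.05 but not at 0.01
```

A stricter significance level raises the critical value, so the same t-value can be significant at 0.05 and not at 0.01.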