The t-test: Answers the question: is the difference between the two conditions in my experiment "real" or due to chance?

Research Methods 1 Handouts, Graham Hole,COGS - version 1.0, September 2000: Page 1:


Two versions of the t-test:

(a) Dependent-means t-test ("matched-pairs" or "one-sample" t-test). The same subjects do both experimental conditions, e.g., with two conditions A and B, half the subjects do A then B; the rest do B then A. (Subjects are randomly allocated to one order or the other.)

(b) Independent-means t-test ("two-sample" t-test). Different subjects do each experimental condition, e.g., with two conditions A and B, half the subjects do A; the rest do B. (Subjects are randomly allocated to A or B.)

Both types of t-test have one independent variable (I.V.), with two levels (the two different conditions of our experiment). There is one dependent variable (D.V.): the thing we actually measure.

Examples:

Effects of alcohol on reaction-time (RT) performance. The I.V. is "alcohol consumption", with two levels: drunk and sober. The D.V. is RT. Use a repeated-measures (dependent-means) t-test: measure each subject's RT twice, once while drunk and once while sober.

Effects of personality type on a memory test. The I.V. is "personality type", with two levels: introversion and extraversion. The D.V. is memory test score. Use an independent-measures (independent-means) t-test: measure each subject's memory score once, then compare introverts and extraverts.
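The random allocation of subjects to one order or the other in the dependent-means design can be sketched in a few lines of Python. This is only an illustration: the function name `allocate_orders`, the ten subject IDs and the fixed seed are assumptions made for the example, not part of the handout.

```python
import random


def allocate_orders(subject_ids, seed=None):
    """Randomly split subjects into two equal-sized order groups.

    Returns a dict mapping each subject ID to "A then B" or "B then A".
    (Hypothetical helper, written for this illustration only.)
    """
    rng = random.Random(seed)
    shuffled = list(subject_ids)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    orders = {}
    for position, subject in enumerate(shuffled):
        # The first half of the shuffled list does A first, the rest do B first.
        orders[subject] = "A then B" if position < half else "B then A"
    return orders


# Ten hypothetical subjects, numbered 1 to 10.
allocation = allocate_orders(range(1, 11), seed=0)
for subject in sorted(allocation):
    print(subject, allocation[subject])
```

For the independent-means design the same shuffle could be used, with the two halves assigned to condition A or condition B instead of to an order.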

Rationale behind the t-test:

Experiment on the effects of alcohol on RT: measure RT for subjects when drunk, and when sober.

Null hypothesis: alcohol has no effect on RT. Any difference between the drunk sample mean and the sober sample mean is due to sampling variation: the drunk and sober scores are samples from the same population (sober RTs).

If the difference between the sober and drunk means is large, we might prefer to believe that alcohol has affected RT: the difference is not due to sampling variation, but arose because the drunk and sober scores are samples from two different populations, the population of sober RTs and the population of drunk RTs.

How large is large? The t-test enables us to decide.

Both types of t-test are similar in principle to the z-score:

$$t = \frac{\text{observed difference between sample means} - \text{predicted difference between sample means}}{\text{measure of the extent to which pairs of sample means might differ}}$$

The predicted difference between sample means is usually that there will be no difference at all.

The t-distribution becomes progressively more like the normal distribution as sample size (n) increases.
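To get a feel for how much two sample means can differ through sampling variation alone, the null hypothesis can be simulated. The sketch below draws pairs of samples from one and the same population and records the difference between their means. The population values (normal, mean 600 ms, standard deviation 100 ms), the sample size of 10 and the function name are all assumptions chosen for illustration.

```python
import random
import statistics


def simulate_null_differences(pop_mean=600.0, pop_sd=100.0, n=10,
                              repetitions=2000, seed=1):
    """Differences between the means of two samples drawn from ONE population.

    All parameter values are hypothetical and serve only as an illustration.
    """
    rng = random.Random(seed)
    differences = []
    for _ in range(repetitions):
        sample_1 = [rng.gauss(pop_mean, pop_sd) for _ in range(n)]
        sample_2 = [rng.gauss(pop_mean, pop_sd) for _ in range(n)]
        differences.append(statistics.mean(sample_1) - statistics.mean(sample_2))
    return differences


diffs = simulate_null_differences()
print("mean difference:", round(statistics.mean(diffs), 1))
print("SD of the differences:", round(statistics.stdev(diffs), 1))
print("largest absolute difference:", round(max(abs(d) for d in diffs), 1))
```

The differences cluster around zero, but individual pairs of samples can differ by a fair amount even though the population is the same. The denominator of t is an estimate of exactly this spread.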

1. We have two sample means, which differ.

2. The null hypothesis is that the two samples come from the same population; if so, ideally the sample means should be identical.

   RT for drunk sample: 800 ms
   RT for sober sample: 300 ms
   Difference: 500 ms

   Sober sample RT really reflects sober population RT: say, 600 ms
   Drunk sample RT also really reflects sober population RT: 600 ms
   Difference: 0 ms

   The difference between 500 ms and 0 ms is due to chance (sampling variation).

3. The alternative hypothesis is that our experimental manipulation has affected our subjects. The two samples (drunk and sober) are samples from different populations with different means. If so, the samples might well have different means (e.g. the sober sample mean of 300 ms might reflect a sober population mean of 300 ms; the drunk sample mean of 800 ms might reflect a drunk population mean of 800 ms).

A big difference between our two sample means therefore suggests that either:

(a) the two sample means are poor reflections of the mean of the single population that they are supposed to represent (i.e., our samples are atypical ones),

OR

(b) the two sample means are actually from two different parent populations, and our initial assumption that the samples both come from the same population is wrong.

The bigger the difference between our two sample means, the less plausible (a) becomes, and the more likely it is that (b) is true.

Repeated Measures t-test, step-by-step:

Does Prozac affect driving ability? Ten subjects have their driving performance tested twice on a sheep farm: test A after they have taken Prozac (experimental condition); test B while they are drug-free (control condition). Each subject thus provides two scores (one for each condition). Five do A then B, five do B then A.

Number of sheep hit during a 30-minute driving test:

Subject   Test A Score   Test B Score   Difference, D
   1           28             25               3
   2           26             27              -1
   3           33             28               5
   4           30             31              -1
   5           32             29               3
   6           30             30               0
   7           31             32              -1
   8           18             21              -3
   9           22             25              -3
  10           24             20               4
Mean           27.4           26.8          ΣD = 6

$$t = \frac{\bar{D} - \mu_{D(\text{hypothesised})}}{S_{\bar{D}}}$$

where:

$\bar{D}$ is the average difference between scores in our two samples (should be close to zero if there is no difference between the two conditions);

$\mu_{D(\text{hypothesised})}$ is the predicted average difference between scores in our two samples (usually zero, since we assume the two samples don't differ);

$S_{\bar{D}}$ is the estimated standard error of the mean difference (a measure of how much the mean difference might vary from one occasion to the next).

1. Add up the differences:

$$\Sigma D = 6$$

2. Find the mean difference:

$$\bar{D} = \frac{\Sigma D}{N} = \frac{6}{10} = 0.6$$

3. Estimate of the population standard deviation (the standard deviation of the differences):

$$S_D = \sqrt{\frac{\Sigma (D - \bar{D})^2}{n - 1}}$$

4. Estimate of the population standard error (the standard error of the differences between two sample means):

$$S_{\bar{D}} = \frac{S_D}{\sqrt{n}}$$
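Steps 1 to 4 can be checked with a short Python sketch using the sheep data from the table. The variable names are chosen for this illustration; only the standard library is used.

```python
import math

# Number of sheep hit by each of the ten subjects (from the table above).
test_a = [28, 26, 33, 30, 32, 30, 31, 18, 22, 24]  # after Prozac
test_b = [25, 27, 28, 31, 29, 30, 32, 21, 25, 20]  # drug-free

# Difference score D for each subject (Test A minus Test B).
differences = [a - b for a, b in zip(test_a, test_b)]
n = len(differences)

# Step 1: add up the differences.
sum_d = sum(differences)

# Step 2: mean difference.
mean_d = sum_d / n

# Step 3: standard deviation of the differences (n - 1 in the denominator).
sum_sq = sum((d - mean_d) ** 2 for d in differences)
sd_d = math.sqrt(sum_sq / (n - 1))

# Step 4: estimated standard error of the mean difference.
se_d = sd_d / math.sqrt(n)

print("sum of D:", sum_d)
print("mean D:", round(mean_d, 4))
print("S_D:", round(sd_d, 4))
print("S_D-bar:", round(se_d, 4))
```

The standard deviation and standard error obtained this way should agree with the "Std. Deviation" and "Std. Error Mean" of the paired differences in the SPSS output (2.9136 and .9214).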

5. Hypothesised difference between the sample means. Our null hypothesis is usually that there is no difference between the two sample means (in statistical terms, that they have come from two identical populations):

$$\mu_{D(\text{hypothesised})} = 0$$

6. Work out t:

$$t = \frac{0.6 - 0}{0.92} = 0.65$$

7. "Degrees of freedom" (d.f.) are the number of subjects minus one:

$$\text{d.f.} = n - 1 = 10 - 1 = 9$$

8. Find the critical value of t from a table (at the back of many statistics books; also on my website).

(a) Two-tailed test: if we are predicting a difference between tests A and B, find the critical value of t for a "two-tailed" test. With 9 d.f., critical value = 2.26.

(b) One-tailed test: if we are predicting that A is bigger than B, or A is smaller than B, find the critical value of t for a "one-tailed" test. For 9 d.f., critical value = 1.83.

Critical values of t (two-tailed):

[Figure: t-distribution with the critical values -2.26 and +2.26 marked; the obtained t of 0.65 lies between them.]

If the obtained t is greater than or equal to the critical t-value, "reject the null hypothesis": the difference between our sample means is probably too large to have arisen by chance.

Here, obtained t = 0.65. This is less than 2.26. There was no significant difference between performance on the two tests; the observed difference is so small, it probably arose by chance.

Conclusion: Prozac does not significantly affect driving ability.

Results of analysis using Excel (Tools > Data Analysis > t-test: Paired Two Sample for Means):

[Excel output table not recoverable.]
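Steps 5 to 8 can be sketched in Python as follows. The function name `paired_t` is an assumption made for this example, and the critical values are simply the table values for 9 d.f. quoted above rather than values computed by the code.

```python
import math
import statistics

test_a = [28, 26, 33, 30, 32, 30, 31, 18, 22, 24]  # after Prozac
test_b = [25, 27, 28, 31, 29, 30, 32, 21, 25, 20]  # drug-free


def paired_t(scores_1, scores_2, hypothesised_mean_diff=0.0):
    """Return (t, d.f.) for a dependent-means t-test on paired scores."""
    differences = [x - y for x, y in zip(scores_1, scores_2)]
    n = len(differences)
    mean_d = statistics.mean(differences)
    # statistics.stdev uses n - 1 in the denominator, as in step 3.
    se_d = statistics.stdev(differences) / math.sqrt(n)
    t = (mean_d - hypothesised_mean_diff) / se_d
    return t, n - 1


t, df = paired_t(test_a, test_b)

# Critical values for 9 d.f. as quoted in the handout (from a t table).
CRITICAL_TWO_TAILED = 2.26
CRITICAL_ONE_TAILED = 1.83

# A two-tailed test compares the size of t, ignoring its sign.
significant_two_tailed = abs(t) >= CRITICAL_TWO_TAILED

print("t =", round(t, 3), "with", df, "d.f.")
print("reject the null hypothesis (two-tailed)?", significant_two_tailed)
```

With these data the obtained t is well below the two-tailed critical value, so the null hypothesis is not rejected, in line with the conclusion above.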

Results of analysis using SPSS (Analyze > Compare Means > Paired samples t-test):

Paired Samples Statistics

Pair 1     Mean      N    Std. Deviation   Std. Error Mean
Test A     27.4000   10   4.8351           1.5290
Test B     26.8000   10   4.0497           1.2806

Paired Samples Correlations

Pair 1               N    Correlation   Sig.
Test A & Test B      10   .799          .006

Paired Samples Test (Paired Differences)

Pair 1: Test A - Test B
Mean                                   .6000
Std. Deviation                         2.9136
Std. Error Mean                        .9214
95% Confidence Interval, Lower         -1.4842
95% Confidence Interval, Upper         2.6842
t                                      .651
df                                     9
Sig. (2-tailed)                        .531
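The remaining figures in the SPSS output (the correlation, the 95% confidence interval and the two-tailed significance) can be approximated with the standard library alone. This is a sketch, not a description of how SPSS works internally: the p-value is obtained by numerically integrating the t density with Simpson's rule, the critical value 2.262 is the table value for 9 d.f., and the function names are assumptions made for this example.

```python
import math

test_a = [28, 26, 33, 30, 32, 30, 31, 18, 22, 24]
test_b = [25, 27, 28, 31, 29, 30, 32, 21, 25, 20]
n = len(test_a)
df = n - 1


def mean(values):
    return sum(values) / len(values)


def sample_sd(values):
    m = mean(values)
    return math.sqrt(sum((v - m) ** 2 for v in values) / (len(values) - 1))


def t_pdf(x, dof):
    """Density of Student's t distribution with dof degrees of freedom."""
    coeff = math.gamma((dof + 1) / 2) / (
        math.sqrt(dof * math.pi) * math.gamma(dof / 2))
    return coeff * (1 + x * x / dof) ** (-(dof + 1) / 2)


def two_tailed_p(t_value, dof, intervals=1000):
    """Two-tailed p-value, integrating the density from 0 to |t| (Simpson)."""
    upper = abs(t_value)
    if upper == 0:
        return 1.0
    h = upper / intervals  # intervals must be even for Simpson's rule
    total = t_pdf(0.0, dof) + t_pdf(upper, dof)
    for i in range(1, intervals):
        total += (4 if i % 2 == 1 else 2) * t_pdf(i * h, dof)
    central_area = total * h / 3  # P(0 < T < |t|)
    return 1 - 2 * central_area


differences = [a - b for a, b in zip(test_a, test_b)]
mean_d = mean(differences)
se_d = sample_sd(differences) / math.sqrt(n)
t = mean_d / se_d
p = two_tailed_p(t, df)

# 95% confidence interval, using the tabled critical value for 9 d.f.
T_CRITICAL = 2.262
ci_lower = mean_d - T_CRITICAL * se_d
ci_upper = mean_d + T_CRITICAL * se_d

# Pearson correlation between the two sets of scores.
mean_a, mean_b = mean(test_a), mean(test_b)
cov = sum((a - mean_a) * (b - mean_b)
          for a, b in zip(test_a, test_b)) / (n - 1)
r = cov / (sample_sd(test_a) * sample_sd(test_b))

print("t =", round(t, 3), " p (2-tailed) =", round(p, 3))
print("95% CI:", round(ci_lower, 4), "to", round(ci_upper, 4))
print("correlation:", round(r, 3))
```

Because the confidence interval includes zero and the two-tailed significance is well above .05, both lead to the same decision as the comparison of t with the critical value.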