One-Way ANOVAs

We have already discussed the t-test. The t-test is used for comparing the means of two groups to determine if there is a statistically significant difference between them. The t-test allows us to gauge the risk that we are making a Type I error (rejecting the null when it is true). If the probability (p value) of obtaining a t-value as large as the one you did by chance alone is less than alpha (generally set at .05), then we reject the null hypothesis and conclude that there is a difference between conditions, with at least 95% confidence that the difference between the two conditions is not due simply to chance.

Take a detour with me. Imagine your instructor stated that she could obtain a 6 on the throw of a single die and then she did it! Would that be impressive? What would the probability of doing that be? Now suppose that she said she could obtain at least one 6 on the throw of a pair of dice. Is this more or less impressive than using a single die? What is the probability that she could get a 6 with two dice? What if she said that she could get at least one 6 by throwing 3 dice? What is the probability now?

We have the same problem when we want to make more than one comparison of condition means within a study. For example, we may want to include 3 conditions in our study: A, B and C. I am interested in looking at the difference between the means of conditions A and B, between the means of conditions A and C, and between the means of conditions B and C. Each new comparison we make is like adding an additional die to the example above. If we set alpha at .05 we have a 5% chance of making a Type I error with every comparison we make. The probability of making at least one Type I error in this study is not 5%; it is, in fact, close to 15% (adding 5% for each of the three comparisons gives 15%; treating the comparisons as independent gives 1 - .95^3, or about 14.3%).

Let us continue with the example we have been using on Marital Status and Happiness Ratings. The dependent variable in this example is happiness ratings. Assume that I interviewed 20 Married, 20 Single and 20 Divorced persons from Grant County for this study and obtained their responses to the Happiness Questionnaire. My independent variable is Marital Status and it has three levels: single, married, and divorced.

Below are Descriptive Statistics for this study. They are presented in the same format that would be produced if you analyzed the results using a computer statistical analysis program called Statistical Package for the Social Sciences (SPSS). As we discuss ANOVAs you will need to learn to interpret results from tables presented in the SPSS format. The Descriptive Statistics table below contains the means, standard deviations and sample sizes (N) for the total sample as well as for each of the conditions.

Descriptive Statistics
Dependent Variable: Happiness rating

Marriage Status   Mean     Std. Deviation   N
Single            5.4000   1.6670           20
Married           6.6500   1.7252           20
Divorced          4.9500   1.7614           20
Total             5.6667   1.8381           60

Note: Even though the numbers in the table are not rounded that way, you should always round to two decimal places when you report them.

To determine if there are statistically significant differences among the mean happiness ratings for these three conditions, I could do three t-tests to compare all possible combinations of these groups (i.e., I could compare single to married, single to divorced, and married to divorced). Remember that for each t-test we have a 5% chance of making a Type I error. If I do three t-tests, I roughly triple that chance, leaving about a 15% chance of making a Type I error somewhere in the entire set of comparisons. If I found a significant difference for any or for all of these comparisons, I could not say that I was 95% confident that the difference is not due to chance. I could only be about 85% confident. In science, that is not good enough!

Fortunately, there is a method for comparing more than two means. The procedure is called an Analysis of Variance (ANOVA). When we have one independent variable (with 3 or more levels) we use a procedure called a One-way ANOVA. There are other variations of this test that can be used for factorial designs (designs with more than one independent variable). For example, if we did a study with two independent variables, we would use a Two-way ANOVA to analyze the results. What do you think they would call the analysis used for a study with five independent variables?

A One-way ANOVA can be used to compare any number of condition means and still maintain the probability of making a Type I error at 5%. An ANOVA is called an omnibus test because it looks at the amount the whole set of means differ from each other and determines if that pattern of differences is likely to have occurred less than 5% of the time by chance alone. If the entire set does not have a p value of less than .05, then no single comparison is treated as significant either.

Recall that when hypothesis testing we were deciding whether to reject or retain the null hypothesis. If we reject the null, we conclude that the scientific (alternative) hypothesis is the more tenable conclusion. The two hypotheses for the ANOVA are:

Scientific Hypothesis - There are differences between at least 2 of the groups.
Null Hypothesis - There are no differences among the groups.
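The dice detour and the arithmetic behind the "three t-tests" problem can be checked numerically. The sketch below is only an illustration: the helper name prob_at_least_one is made up for this example, and the Type I error figures assume the three comparisons are independent, which is a simplification.

```python
from fractions import Fraction


def prob_at_least_one(p_single, n_tries):
    """Chance that an event with per-try probability p_single
    happens at least once in n_tries independent tries."""
    return 1 - (1 - p_single) ** n_tries


# Dice detour: chance of at least one 6 when throwing 1, 2 or 3 dice.
dice = {n: prob_at_least_one(Fraction(1, 6), n) for n in (1, 2, 3)}
for n, p in dice.items():
    print(f"{n} dice: {p} (about {float(p):.3f})")

# Same arithmetic for Type I errors with alpha = .05 per comparison.
# Assumption: the comparisons are treated as independent.
alpha = 0.05
familywise = {k: prob_at_least_one(alpha, k) for k in (1, 2, 3)}
for k, p in familywise.items():
    print(f"{k} comparisons: {p:.4f}")
```

Each added die, like each added comparison, raises the chance of at least one "hit", which is why the error rate for the whole set of comparisons climbs well above 5%.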

The logic of this test is simple (the mathematics is more complex but the computer does it). Here is the basic idea.

Review: When we discussed measures of dispersion much earlier in the term, we talked about range, deviation scores (the score minus the mean), variance and standard deviations. The range is not a good estimate of the dispersion of scores because it is greatly influenced by extreme scores. Deviation scores, when summed, equal zero, so they too are useless as a measure of the dispersion of scores in a sample distribution. To get around this problem we square all the deviation scores (this makes them all positive numbers); then we can sum them and divide by the sample size to obtain the mean squared amount that scores in the distribution deviate from the mean. This mean squared deviation is called the variance of the distribution. Since most of us have difficulty thinking in squared amounts, it is generally more useful to think about the standard deviation of a distribution. The standard deviation is the square root of the variance and can be thought of as the mean amount the scores in the distribution vary from the distribution mean.

Why are we talking about variances instead of standard deviations? The problem with the standard deviation is that it is a square root. We cannot add or subtract square roots and get the square root of the sum or difference without squaring them first. For example, $\sqrt{9} + \sqrt{9} \neq \sqrt{18}$. Since variances are easier to work with mathematically, we use them for this analysis. Keep in mind, however, that a variance is just a measure of the dispersion of scores. The larger the variance, the more spread out the scores are in the distribution.

1) Using the Marital Status and Happiness study example, we start with the assumption that people in general differ from each other in happiness. I expect that married people show the same variability in Happiness that single people do and that divorced people do.
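As a numeric companion to the review of dispersion, here is a minimal sketch using a small set of made-up ratings (hypothetical numbers, not the study data). It follows the description in the text and divides by the sample size; statistical programs usually divide by N - 1 when estimating from a sample, so that version is shown as well.

```python
import math

# Hypothetical happiness ratings for one small group (not the study data).
scores = [3, 5, 6, 6, 8, 8]

n = len(scores)
mean = sum(scores) / n                         # 36 / 6 = 6.0
score_range = max(scores) - min(scores)        # driven entirely by the extremes

deviations = [x - mean for x in scores]        # each score minus the mean
sum_of_deviations = sum(deviations)            # always zero, so useless as a summary

sum_of_squares = sum(d ** 2 for d in deviations)
variance = sum_of_squares / n                  # mean squared deviation, as in the text
sample_variance = sum_of_squares / (n - 1)     # the N - 1 version most programs report
std_dev = math.sqrt(variance)                  # back in the original units

print(mean, score_range, sum_of_deviations, variance, round(std_dev, 3))

# Square roots do not add the way variances do.
print(math.sqrt(9) + math.sqrt(9), round(math.sqrt(18), 3))
```

The second print line shows the point about square roots: 3 + 3 is 6, while the square root of 18 is only about 4.24.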
In other words, I might expect that marriage shifts the Happiness ratings of the entire group, but does not affect how much variability in happiness there is within the group. If there are differences between my groups that are due to the independent variable (Marital Status), I expect the means of the groups to differ but not the variances.

We refer to the amount of variation that is associated, in general, with the dependent variable as Within Groups Variance. It is the amount that scores of individuals within the same condition would be expected to vary from the mean of their condition. It does not have anything to do with variation due to the level of the IV. It can be thought of as variance that is due to random variation between individuals, or what statisticians call Error. Having three groups (conditions), I have three estimates of this Within Groups Variance. Using the average of these three Within Groups Variance estimates gives a better estimate of the general variation of happiness in the population.

We make a second assumption: that we can use Within Groups Variance to estimate the amount of variation we would expect to find Between Groups if the independent variable has no effect (if the null is true). Therefore, we assume that Within Groups Variance gives a good estimate of Error variance. This estimate of error variance is called the Mean Squared Error (MSE).

That leaves just one last step: measuring the amount of variance between groups. The way this is computed is complex, but for this class we do not need to worry about that; SPSS will do the math for us. If we repeatedly measured samples of the same size drawn from the same population, over and over again, we would not expect to get exactly the same mean each time. (Remember the distribution of the means we discussed when we covered t-tests.) Similarly, if we sample three levels of our IV, even if they do not differ from each other on the DV, we would not expect to get exactly the same means for each condition. The means will vary simply due to random variation. However, they might also vary due to differences in the level of the IV. What is important is that you understand that Between Groups Variance is due to both error variation and variation due to the independent variable. Between Groups Variance is based on how much the condition means vary around the mean of the entire set of scores in the study.

Between Groups Variance = Random Variance + Variance Due to the Independent Variable

Assume for a moment that there is no effect of the independent variable. This means that the independent variable adds zero variance to what we would expect to find due to chance. If there are no differences between our groups, then we would expect our Within Groups Variance and our Between Groups Variance to be equal.
What would we expect the ratio of these two variance estimates to be if there were no effect of the IV?

$$
\frac{\text{Random Variance} + \text{Variance Due to the Independent Variable}}{\text{Random Variance}} = \frac{\text{Random Variance} + 0}{\text{Random Variance}} = 1
$$

We would expect to find a ratio of one. The amount that the ratio differs from 1 can be attributed to variation due to the IV. So, if the IV has an effect, the ratio of Between Groups Variance to Within Groups Variance will be greater than one (i.e., the numerator will be greater than the denominator). This is called an F ratio.

Because we are only using estimates of variances, we do not expect that the ratio will always be exactly one even when the IV does not have an effect. How much the F ratio needs to be above 1 for us to be 95% sure that there really is an effect of the independent variable can be determined using probability theory. Again (lucky us!) the probability of making a Type I error is calculated for us by the computer program. SPSS provides the following type of output.

Tests of Between-Subjects Effects
Dependent Variable: Happiness rating

Source           Sum of Squares   df   Mean Square   F       Sig.
Marital Status   31.033           2    15.517        5.255   .008
Error            168.300          57   2.953
Total            2126.000         60

F is the ratio of the Mean Square due to Marital Status (Between Groups Variance) to the Mean Squared Error (Within Groups Variance). Sig. (in the final column) is the p value: the probability of obtaining an F ratio this large by chance alone if the null is true, which is the risk of a Type I error we take if we reject the null. Because p is less than .05 we reject the null hypothesis and conclude that there is a significant difference between at least two of the means. We would report this result by stating that a One-way ANOVA determined that Happiness ratings significantly differ among Marital Status groups (F(2, 57) = 5.26, p = .008).

The numbers in the parentheses following the letter F are the degrees of freedom (df) associated with this analysis. The first is the degrees of freedom between groups and the second is the degrees of freedom within groups. They are related to the number of conditions in the study and the number of participants. They must be reported in APA reports.
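The F ratio, its degrees of freedom and the p value in the SPSS table can be reproduced from the Descriptive Statistics alone. The sketch below is an illustration rather than what SPSS does internally: it uses only the rounded means and standard deviations from the table, so the sums of squares agree to about two decimals, and the p value line relies on a shortcut formula that is valid only when the between-groups df is exactly 2.

```python
# Summary statistics copied from the Descriptive Statistics table.
groups = {
    "Single":   {"mean": 5.40, "sd": 1.6670, "n": 20},
    "Married":  {"mean": 6.65, "sd": 1.7252, "n": 20},
    "Divorced": {"mean": 4.95, "sd": 1.7614, "n": 20},
}

k = len(groups)                                   # number of conditions
n_total = sum(g["n"] for g in groups.values())    # number of participants
grand_mean = sum(g["n"] * g["mean"] for g in groups.values()) / n_total

# Between groups: how far each condition mean sits from the grand mean.
ss_between = sum(g["n"] * (g["mean"] - grand_mean) ** 2 for g in groups.values())
df_between = k - 1
ms_between = ss_between / df_between

# Within groups (error): pooled spread of scores around their own condition mean.
# Each SD in the table is based on n - 1, so (n - 1) * sd**2 recovers a sum of squares.
ss_within = sum((g["n"] - 1) * g["sd"] ** 2 for g in groups.values())
df_within = n_total - k
ms_within = ss_within / df_within                 # Mean Squared Error (MSE)

f_ratio = ms_between / ms_within

# Upper-tail probability of F. This closed form holds ONLY for 2 numerator df;
# other designs need an F table or a statistics library.
assert df_between == 2
p_value = (1 + 2 * f_ratio / df_within) ** (-df_within / 2)

print(f"F({df_between}, {df_within}) = {f_ratio:.2f}, p = {p_value:.3f}")
```

With equal group sizes, the MSE computed here is simply the average of the three group variances, which is the "average of the three Within Groups estimates" idea described earlier.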
The degrees of freedom are always reported in parentheses after the letter F, in the order Between Groups df, Within Groups df, separated by a comma.

A significant result for a One-way ANOVA allows us to reject the null hypothesis and conclude that the scientific hypothesis is the most tenable conclusion. (Remember, the scientific hypothesis is that there are differences between at least 2 of the groups.) The One-way ANOVA does not tell us which groups differ from each other. To determine that, we need to make individual comparisons. When the F ratio is significant, SPSS continues the analysis by running Post hoc (follow-up) Multiple Comparisons of the sets of means to determine which means are significantly different from each other. These are very much like doing t-tests between all combinations of the means.

Why can we do them now? Having found a significant F ratio from the One-way ANOVA, we know that the level of Type I error is limited to 5% for the entire set of group comparisons. Therefore, it is safe to go ahead and do the three separate comparisons. Since the pattern of differences we found was unlikely to occur (p < .05) if we selected three samples randomly from a population, we can be assured that any significant differences we find in the multiple comparisons are not due to having done multiple tests (like throwing the dice three times) but are actually due to the effects of the IV.

There are several ways of doing these post hoc Multiple Comparisons. I have had SPSS do a Least Significant Difference Test (LSD). Looking at the Multiple Comparisons table below, the difference between each set of means is listed in the third column, and the p value (Sig.) in the last column. The LSD multiple comparisons analysis determined that married people rate themselves as significantly happier than single people (p = .025) and than divorced people (p = .003), whereas single and divorced people do not differ on Happiness ratings.

Multiple Comparisons
Dependent Variable: Happiness rating
LSD

(I) Marital Status   (J) Marital Status   Mean Difference (I-J)   Std. Error   Sig.
Single               Married              -1.2500                 .5434        .025
                     Divorced               .4500                 .5434        .411
Married              Single                1.2500                 .5434        .025
                     Divorced              1.7000                 .5434        .003
Divorced             Single                -.4500                 .5434        .411
                     Married              -1.7000                 .5434        .003

If you were answering an exam question, I would be looking for the following. A statement about the One-way ANOVA: if the ANOVA is not significant, do not go on and interpret the LSDs. If the One-way ANOVA is significant, then you need to give a FULL interpretation of the LSD multiple comparisons. If there are three conditions you need to make 3 comparisons. If there are four conditions you need to make 6 comparisons. If there are five conditions you need to make 10 comparisons.
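The comparison counts and the entries of the Multiple Comparisons table can be sketched in a few lines. This is an illustration under stated assumptions: it treats each LSD comparison as an ordinary t-test that uses the ANOVA's error term (the Error row, 168.300 on 57 df) in place of a two-group error estimate, and the helper names n_pairwise and t_two_tailed_p are invented for this example. The p values come from numerically integrating the t distribution, so they are approximations.

```python
import math
from itertools import combinations


def n_pairwise(k):
    """Number of distinct pairs among k condition means: k(k - 1) / 2."""
    return k * (k - 1) // 2


def t_two_tailed_p(t, df, steps=2000):
    """Approximate two-tailed p value for Student's t.

    Integrates the t density from 0 to |t| with Simpson's rule
    (steps must be even) and doubles the remaining tail area.
    """
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))

    def pdf(x):
        return math.exp(log_c - (df + 1) / 2 * math.log(1 + x * x / df))

    upper = abs(t)
    h = upper / steps
    total = pdf(0.0) + pdf(upper)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * pdf(i * h)
    return 1 - 2 * (total * h / 3)


# How many comparisons a full interpretation needs for 3, 4 and 5 conditions.
print([n_pairwise(k) for k in (3, 4, 5)])

means = {"Single": 5.40, "Married": 6.65, "Divorced": 4.95}
n_per_group = 20
ms_error, df_error = 168.300 / 57, 57      # Error row of the ANOVA table

# Standard error of a difference between two condition means.
std_error = math.sqrt(ms_error * (1 / n_per_group + 1 / n_per_group))

results = {}
for a, b in combinations(means, 2):
    diff = means[a] - means[b]
    p = t_two_tailed_p(diff / std_error, df_error)
    results[(a, b)] = (diff, p)
    verdict = "significant" if p < 0.05 else "not significant"
    print(f"{a} vs {b}: difference = {diff:+.2f}, SE = {std_error:.4f}, "
          f"p = {p:.3f} ({verdict})")
```

Only three of the six rows in the SPSS table are distinct; the other three repeat the same comparisons with the sign of the difference reversed, which is why the loop visits each pair once.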

In our case: The One-way ANOVA was significant (F(2, 57) = 5.26, p = .008). The LSD multiple comparisons analysis determined that married people rate themselves as significantly happier (M = 6.65, s = 1.73) than single people (M = 5.40, s = 1.67; p = .025) and than divorced people (M = 4.95, s = 1.76; p = .003), whereas single and divorced people do not differ on Happiness ratings.

Within Subject Designs

When the design of the study is Within Subjects, the Post Hoc Multiple Comparisons would be paired t-tests instead of LSDs. We will see an example of this in concept checks. They are interpreted in the same manner, but are displayed in tables a little differently.