Analysis of Variance (ANOVA)
Program Transcript

DR. JENNIFER ANN MORROW: Welcome to Analysis of Variance. My name is Dr. Jennifer Ann Morrow. In today's demonstration, I'll review with you the definition of analysis of variance. We'll go over a couple of sample research questions. I'll show you the formulas for an analysis of variance. We'll talk about the assumptions for analysis of variance, as well as the post hoc tests. I will also go over with you how to calculate the effect size for an analysis of variance. And lastly, I'll go over two examples using analysis of variance.

Analysis of variance, or ANOVA, is a hypothesis testing procedure that is used to evaluate mean differences between two or more treatments, or populations. If a difference between the means is statistically reliable, or significant, the difference is expected, with a certain probability, to reappear if the study is replicated. A non-significant difference implies that you cannot rule out the possibility that mean differences that do exist in your sample data occurred by chance. For this demonstration, I will review for you one specific type of analysis of variance, a one-way between subjects analysis of variance.

Some examples of research questions that can be addressed using a one-way between subjects ANOVA are as follows. Question one. Is there a difference in self-esteem depending on year in school? Here, my independent variable would be year and my groups, or levels of my independent variable, would be first-year, sophomore, junior, and senior. And my dependent variable would be self-esteem. For my second question, do participants in the treatment groups recall more words? My independent variable here would be group and my levels, or groups of my independent variable, would be control, treatment 1, and treatment 2. And my dependent variable would be recall.
Again, remember that there are different participants in each of your groups, or levels of your independent variable, for both of these examples.

The basic formula for a one-way between subjects ANOVA is as follows. Your F-ratio is equal to your mean squared between divided by your mean squared within. To calculate your mean squared between, it is just your sum of squares between divided by your degrees of freedom between. And your mean squared within is your sum of squares within divided by your degrees of freedom within. For a one-way between subjects ANOVA, your degrees of freedom total is equal to n minus 1, or your sample size minus 1. Your degrees of freedom between is equal to k minus 1, or the number of groups or levels in your independent variable minus 1. And your degrees of freedom within is equal to n minus k, or your sample size minus the number of groups or levels in your independent variable.

2012 Laureate Education, Inc.

Here's an example of a one-way between subjects ANOVA source table. You should always put your results of an ANOVA into a table such as this one. So here you have a column for the source of variance: between, within, and total. You have a column for your sum of squares: your sum of squares between, your sum of squares within, and your sum of squares total. You have a column for your degrees of freedom: your degrees of freedom between, your degrees of freedom within, and your degrees of freedom total. You have a column for your mean squared. You have a mean squared between and a mean squared within. And you have a column for your F-ratio, which is your mean squared between divided by your mean squared within.

OK, let's recap. So far we've gone over the definition for a one-way between subjects ANOVA. We've gone over a couple of sample research questions that can be addressed using a one-way between subjects ANOVA. And we learned the formulas for conducting a one-way between subjects ANOVA. Now let's learn about one-way between subjects ANOVA in more detail.

There are three assumptions that must be addressed when conducting a one-way between subjects ANOVA. The first one is that the observations within each sample must be independent. None of your scores in each of your groups should be related to each other. Second, the populations from which the samples are selected must be normal.
And lastly, the populations from which the samples are selected must have equal variances. And this is known as homogeneity of variance.

When you're conducting an analysis of variance and you have an independent variable that has three or more groups, or three or more levels, you need to conduct additional analyses to ascertain where your group differences are. Post hoc tests are these additional analyses. These are additional hypothesis tests that are conducted after an ANOVA to determine exactly which mean differences are significant and which are not. Two of the many post hoc tests available to you are the Scheffe and the Tukey HSD test. The Scheffe test is a very conservative post hoc test. It compares all pairs of means. The Tukey HSD test is a more liberal test. It is less conservative than a Scheffe, and it compares all pairs of means as well. This is the most popular post hoc test.

You can also calculate an effect size for a one-way between subjects ANOVA. You can report the percentage of variance explained, known as r squared or eta squared. To calculate the percentage of variance explained, you take your sum of squares between and divide that by your sum of squares total.

Now let's go over a couple of examples for a one-way between subjects ANOVA. For the first example, we'll use a formula to calculate a one-way between subjects ANOVA. My research question is, do students drink alcohol in different amounts depending on where they live? Here, my independent variable is reside and my groups or levels are with parents, in a dorm, or in a Greek house. And my dependent variable is weekly alcohol use. So my null hypothesis is mu 1 equals mu 2 equals mu 3, or there are no differences among my three groups. And my alternative or research hypothesis is that mu 1, mu 2, and mu 3 are not all equal, or there is a difference among the three groups.

Now let's calculate the ANOVA. For this example, I have three groups of participants. Those that live with the parents and their weekly drinking is as follows: 0, 1, 3, 1, and 0. I have those students that live in a dorm and their weekly drinking is as follows: 1, 2, 2, 0, and 0. I have those that live in a Greek house and their weekly drinking is as follows: 4, 3, 6, 3, and 4. Each of these three groups has 5 participants, so my total n is equal to 15. My mean weekly drinking for those that live with the parents is equal to 1. My mean weekly drinking for those that live in a dorm is also equal to 1. And my mean weekly drinking for those that live in a Greek house is equal to 4.

I've also previously calculated the sum of squares. So my sum of squares for those that live with the parents is equal to 6.
My sum of squares for those that live in the dorm is equal to 4. And my sum of squares for those that live in a Greek house is equal to 6. I have also already calculated my sum of squares between, which in this case is equal to 30. And my sum of squares within, which is equal to 16. And my sum of squares total, which is equal to 46. And you can refer back to your text or to earlier slides in this presentation for the formulas for the sum of squares between, within, and total.

I now need to calculate my degrees of freedom. My degrees of freedom between is equal to k minus 1, or the number of groups or levels of my independent variable minus 1. Which in this case, is equal to 3 minus 1, which is 2. My degrees of freedom within is equal to n minus k, which is equal to the sample size minus the number of groups or levels in my independent variable. Which here, in this case, is equal to 15 minus 3, which is equal to 12. And my degrees of freedom total is equal to n minus 1, which here, in this case, is equal to 15 minus 1, which is 14.

I now need to find the critical value that I need to surpass in order to achieve significance. So if I choose an alpha level of 0.05 and a two-tailed test, I look in my table of critical F values, and I find that for 2 and 12 degrees of freedom and an alpha level of 0.05 two-tailed, the critical value that I need to surpass is 3.88.

So let's calculate your F-ratio. First, I need to calculate my mean squared between and my mean squared within. My mean squared between is equal to my sum of squares between divided by my degrees of freedom between. And here, that is equal to 30 divided by 2, which is equal to 15. My mean squared within is equal to the sum of squares within divided by the degrees of freedom within. And here, that is equal to 16 divided by 12. And that is equal to 1.33. So my F-ratio is equal to the mean squared between divided by the mean squared within. Or here, F is equal to 15 divided by 1.33. And in this case, using the unrounded mean squared within of 16 divided by 12, that is equal to 11.25.

So how do I write this up? I put F, then my between and then my within degrees of freedom: F with 2 and 12 degrees of freedom is equal to 11.25. Is this significant? Yes, it is, so I add p less than 0.05, two-tailed. How do I know that this is significant? Well, my F-ratio value of 11.25 surpasses my critical value of 3.88. So I know that it is significant. So how do I interpret this? I can say that there is a significant difference among my three groups (with parents, dorm, and Greek) on weekly drinking. I would need to conduct a post hoc test, such as a Tukey HSD test or a Scheffe test, in order to find out which groups are significantly different from each other.
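The hand calculation above can be sketched in Python. This is a minimal illustration rather than part of the original demonstration: the function name `one_way_anova` and the dictionary keys are assumptions chosen for this example, and only the three groups of weekly drinking scores come from the transcript.

```python
def one_way_anova(groups):
    """Compute source-table quantities for a one-way between subjects ANOVA.

    `groups` is a list of lists, one list of scores per group.
    (Hypothetical helper written for illustration only.)
    """
    k = len(groups)                          # number of groups (levels)
    n = sum(len(g) for g in groups)          # total sample size
    grand_mean = sum(sum(g) for g in groups) / n

    # Sum of squares between: group size times squared distance of
    # each group mean from the grand mean, summed over groups.
    ss_between = sum(
        len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups
    )
    # Sum of squares within: squared distance of each score from its
    # own group mean, summed over all scores.
    ss_within = sum(
        sum((x - sum(g) / len(g)) ** 2 for x in g) for g in groups
    )

    df_between = k - 1
    df_within = n - k
    ms_between = ss_between / df_between
    ms_within = ss_within / df_within

    return {
        "ss_between": ss_between,
        "ss_within": ss_within,
        "ss_total": ss_between + ss_within,
        "df_between": df_between,
        "df_within": df_within,
        "df_total": n - 1,
        "ms_between": ms_between,
        "ms_within": ms_within,
        "F": ms_between / ms_within,
    }


# Weekly drinking scores from the worked example.
with_parents = [0, 1, 3, 1, 0]
dorm = [1, 2, 2, 0, 0]
greek = [4, 3, 6, 3, 4]

result = one_way_anova([with_parents, dorm, greek])
for name, value in result.items():
    print(f"{name}: {value:.2f}")
```

Because the sketch divides by the unrounded mean squared within (16 divided by 12) rather than by the rounded value of 1.33, it gives an F-ratio of about 11.25. Looking up the critical value is left out here, since that step uses a printed table of critical F values in the demonstration.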
I can also calculate my effect size, my percentage of variance explained, which is equal to the sum of squares between divided by the sum of squares total. Which in this case, is equal to 30 divided by 46. So my percentage of variance explained is 0.65, or 65 percent.

Now let's calculate a one-way between subjects ANOVA using SPSS. Once you have SPSS open, you need to get the data set open that you want to analyze. Click on File. Click on Open. Click on Data and find the data set that you want to use. Once you have found the data set, click on the data set. And then click on Open. And make sure your data view window appears on your screen.
Now, to conduct a one-way between subjects ANOVA, just do the following. Click on Analyze. Click on General Linear Model. And click on Univariate and your analysis of variance dialog box will appear on your screen.

Here I want to look at, are there any ethnic differences in spirituality? So here, my dependent variable is spirituality. So if I go to my box here on the left and look for my dependent variable of spirituality, I click on my dependent variable. And then I click on the right arrow key here at the top to move that variable of spirituality into the dependent variable dialog box. So I click on that. And my independent variable is ethnicity, so I go back to the box on the left and find my independent variable. I click on Ethnicity. And then I click on the right arrow key to move my variable ethnicity into the fixed factors box, which is where the independent variable goes. So I click on that.

I also need to conduct a post hoc test because I have more than two groups in my independent variable of ethnicity. So I click on Post Hoc. And I click on my independent variable of ethnicity. And then I click on the right arrow key to move that variable into the post hoc test dialog box. As you can see, there are many options for post hoc tests. And I'm just going to choose one. And I'm going to choose the one that says Tukey because that stands for the Tukey HSD test. And then I'm going to click Continue.

I also want to make sure that I get my descriptives and effect size for this analysis. So I need to click on Options. Here I have many choices. And I just need to click on Descriptive Statistics and Estimates of Effect Size. And once I'm done with that, I click on Continue. And now I'm ready to conduct my analysis. Just click on OK.

And we'll scroll down in the output. As you can see here, SPSS is going to give you a table first giving your sample size for your groups.
So here I have the independent variable of ethnicity, and I have four groups or four levels: African American black, Asian, Caucasian, and other. I have 19 participants that are African American black, 34 that are Asian, 70 that are Caucasian, and 23 that are other.

If I scroll down in my output a little bit more, I have my means and standard deviations for my dependent variable of spirituality for each of my four groups. So here, for those that are African American black, my mean level of spirituality is 3.8163. My mean level for Asians is 3.9715. My mean level of spirituality for Caucasians is 3.8134. And for those that designated themselves as other, it is 3.7596.

So scroll down a little bit more in the output, and you now have your source table for your one-way between subjects ANOVA. So here you have your independent variable of ethnicity and you have 3 and 142 degrees of freedom. So your between degrees of freedom is 3. Your within degrees of freedom is 142. Your total degrees of freedom is 146. Your F-statistic or F-ratio is 1.048 and your significance level is 0.373. And your eta squared for this analysis is 0.022.

So how do I interpret this? For this example, I would have F with 3 and 142 degrees of freedom is equal to 1.05, ns (which stands for non-significant), two-tailed, saying that there are no significant differences among my four ethnic groups in spirituality.

Also, you can see in your output it did conduct a Tukey HSD test. In this case, it wasn't needed because there were no significant differences found in the overall F-test. But let's just look at the Tukey HSD test. What the Tukey HSD test does is it compares each of your groups to each other. If there were significant differences between your groups, you would see it here in the column that says significance, and there would be an asterisk marking which groups are significantly different from each other. But again, in this case, because our F-test, as you can see above, was non-significant, we didn't need to conduct a Tukey HSD test.

Let's recap. We went over the assumptions of a one-way between subjects ANOVA. We learned about the two different post hoc tests. We learned how to calculate the effect size for a one-way between subjects ANOVA. And we went over two examples, one using the formula and one using SPSS.

We have now come to the end of this demonstration. Don't forget to practice calculating a one-way between subjects ANOVA, both using the formula and SPSS on your own. Thank you, and have a great day.