Lesson 9: Two Factor ANOVAS - PDF Free Download

Published on Agron 513 (https://courses.agron.iastate.edu/agron513) Home > Lesson 9 Lesson 9: Two Factor ANOVAS Developed by: Ron Mowers, Marin Harbur, and Ken Moore Completion Time: 1 week Introduction In the previous unit we discussed how one could analyze for differences in treatments, each replicated on several plots, using an analysis of variance. Often there is more than one type of treatment we wish to use in an experiment. For example, we may want to know the effect of different plant populations on yield, but also how different levels of nitrogen application affect the yield. This unit will build on the principle you learned in the last unit and add another factor to have two factors in the ANOVA. Lesson 9: Required Reading Chapter 12.1-12.6 and 9.1-9.2 Clewer and Scarisbrick Objectives How to analyze a second factor in an ANOVA The linear additive model for a two-factor ANOVA The advantages of factorial experiments How to completely randomize an experiment

Factorial Experiments Factorial experiments involve more than one treatment factor. In the corn population experiment analyzed in Unit 8, we were interested only in the response of one hybrid. However, corn hybrids respond differently to changes in plant population. What if we wanted to compare the response of three different hybrids to population changes? What are our options in designing this experiment? Option 1 We could consider each hybrid in a separate experiment. For each hybrid, then, we would plant our plots using each of the three populations. We would need the following experimental units: hybrid A x 3 populations x 3 replications = 9 plots hybrid B x 3 populations x 3 replications = 9 plots hybrid C x 3 populations x 3 replications = 9 plots The three experiments would require 27 (9+9+9) total experimental units, in this case plots. The results from this experiment would tell us the response of each hybrid to population changes. Each experiment would have an ANOVA table (Table 9.1) similar to that used previously Table 9.1 ANOVA table for corn planted at three populations in northwest Iowa Population (plants/acre) Degrees of Freedom Population 2 Error 6 Total 8 Yet, since each hybrid test is conducted as a separate experiment, we could not compare the yields of each hybrid. In other words, we would know which population is most appropriate for each hybrid. But, we could not predict which hybrid would produce the greatest yield with any confidence. The experiments were simply not designed in a manner to allow this.

Option 2 We could develop an experimental arrangement where these two factors are combined into a factorial experiment. In this case, all combinations of population and hybrid would occur within the same experiment. The total number of treatments would be 9 (3 hybrids x 3 populations). In this case, with three replications, we would have: 3 hybrids x 3 populations x 3 replications = 27 plots However, this factorial experiment will produce more useful information than the first design. Specifically, this experiment will tell us whether the effect produced by changing plant population is the same regardless of hybrid, or if each hybrid reacts differently to population. If each hybrid reacts differently to population, we will say that there is an interaction between the hybrids and plant populations. To analyze the results from this experiment, the ANOVA must be expanded to include the additional effects: another main factor and its interaction with the original main factor (Table 9.2). Table 9.2 ANOVA table for three corn varieties planted at three populations in Northwest Iowa Source of Variation Degrees of Freedom Population 2 Hybrid 2 Interaction 4 Error 18 Total 26 The table is similar to the one used before, with the exception of these two new sources of variation. The degrees of freedom are calculated using the formulae in Table 9.3. Note that even though we have the same number of experimental units whether we evaluate the three hybrids separately (Table 9.1) or together (Table 9.2), our residual degrees of freedom and the sensitivity of our F-test increase dramatically with factorial design. Table 9.3 Degrees of freedom factorial experiment Source of Variation Degrees of Freedom Treatment A # of levels of treatment A - 1 Treatment B # of levels of treatmetn B - 1 Interaction (df for Treatment A) x (df for Treatment B) Error (or Residual) (# of levels of treatment A) x (# of levels of Treatment B) x (# of replications - 1) Total # of Experimental Units - 1 The ANOVA for factorial experiments can be completed using a "cookbook" method similar to that described in the previous unit. However, since most experiments of this type are analyzed using computer software, we will not continue the example here. Rather, we will use the computer to analyze a similar experiment using SAS to calculate the statistics we need. For further study refer to pp. 160-163 in the text.

Interaction Interaction is differential response of one factor at different levels of another. In addition to maximizing the efficiency of an experiment by combining the treatments into one set of plots, a two-factor ANOVA introduces another effect; the interaction between the treatments. As introduced in the ANOVA Table 9.2, there is an additional source of variation in the analysis: the interaction between our two main treatment factors. As is often realized, additional effects occur because of interaction between two treatments. Two factors can not interact at all, interact positively, or interact negatively. These can be depicted well by graphs. Let's consider two different varieties at three different levels of N. Example 1: No Interaction Here yield of the two varieties reacts similarly to N-rate. Thus, there is no interaction between N and variety. Notice that the lines are parallel so the effect of the variety is constant as N rate increases. Variety 1 is uniformly superior at each N level and higher levels of N result in improved yield of each variety. Fig. 9.1 Example 2: Positive interaction Yield of the two varieties differs based on level of N applied, but the variety 1 always yields better than variety 2. The response of each variety depends on the level of N applied. It is no longer constant as in the previous example. The type of interaction is sometimes referred to as a change in the magnitude of the response interaction. Variety 1 is still the highest yielding variety, but the magnitude of it's response to N depends on the amount of N applied. With this type of interaction, you can't evaluate the effect of N independent of variety because the response is different depending on which variety you are talking about. Whenever an interaction tests as significant in the ANOVA it is important to ignore the main factor means and focus on the interaction means.

Fig. 9.2 Example 3: Negative Interaction The last type of interaction occurs when the yield response is completely opposite based on the level of N. Note that the lines in the figure now cross each other. This changes the interpretation drastically because the best variety to grow now depends on the level of N that is applied. You would need to fertilize each variety differently based on the response. This is often called a crossover interaction and it is very important to pay attention when they occur because your recommendation about one factor depends heavily on the level of the other. Often when a crossover interaction occurs one or both main factors involved in the interaction tests as nonsignificant in the ANOVA. When this occurs, you should look at the results very carefully because failure to assess the interaction carefully can result in poor recommendations. Fig. 9.3 For further study refer to pp. 163-165 in the text

Linear Additive Model for Two-Factor ANOVA The linear model for a factorial is a simple extension of the linear model. The addition of factors and their possible interaction produce further complexity in the ANOVA. But the effects can be added just as the previous effects were to the linear model. The linear model for a two-factor ANOVA is: Equation 9.1 Notice that the only change in the linear model from equation 8.8 is that the treatment structure is modified. We now have sources for factor A, factor B and their interaction. The error structure is not changed; we still have only a single random error for the experiment. Factorial refers to the treatment structure, not the assignment of treatments to experimental units. For the corn population x hybrid experiment we can rewrite the model to indicate the true sources of variation as: Equation 9.2 The interactive activity Exercise 9.1: Analysis of Variance for a two- factor experiment can only be completed online. The interactive activity Study Question 9.1 can only be completed online. The interactive activity Study Question 9.2 can only be completed online.

The interactive activity Study Question 9.3 can only be completed online. The interactive activity Study Question 9.4 can only be completed online.

ANOVA and Experimental Design The experiment design determines the model and analysis. You will recall from Unit 1 that experimental design refers to the manner in which treatments are assigned to experimental units (plots). In this unit we have assumed that all treatments are applied at random to the entire set of experimental units used in the experiment. This is what is known as a Completely Random Design (CRD). In a CRD every experimental unit has the same chance of receiving any given treatment. We use a CRD when we expect the magnitude of the plot effect to be similar among all the plots used in the experiment. Statisticians refer to this condition as homogeneity, and it is assumed when we use a CRD. The CRD is a common design (Fig. 9.4); there are many cases where its use is appropriate. For example, in a growth chamber we might have 25 flats filled with sand, each planted with 30 wheat seeds, 5 reps of each of 5 varieties. However, whenever there are not enough plots with similar characteristics to accommodate all the treatments and replications in an experiment, alternative designs should be considered to improve the precision of the experiment. We will learn more about this as we study other designs in subsequent units. Fig. 9.4 There are many ways to randomly assign treatments to experimental units. The use of a table of random numbers is described in the text. This method involves listing all the treatments and their replications and assigning a random number from a table of random numbers to each one. This random order is then matched to a predetermined order of experimental units. A far easier approach is to use a computer to randomize treatments. You can use Excel or SAS to randomize an experiment. The interactive activity Exercise 9.2: Randomizing treatments for a completely random design can only be completed online. Once the randomization is completed and the experiment properly conducted according to plan, the error structure for the experiment is, at least partially, determined. For the CRD, for example, we do not have a blocking factor because treatments are assigned to experimental units completely at random. In Unit 11 we will introduce another type of design, the Randomized (Complete) Block Design, or RCBD, in which some restrictions are put on the randomization. For the RCBD, there is yet another recognizable source of variation in the ANOVA, that for blocks, as you will see in Unit 11.

Question and Answer 1. The most difficult concept for this lesson was the discussion of the linear model. Although this unit s discussion builds on the previous unit, I continue to be confused with the use of subscript I, J, and K and how they relate to the overall formula. We now have two treatment factors, for example hybrid and plant density, as well as reps, so we need three subscripts to describe them all. For Equation 9.2, i is the index for plant density, j indexes hybrids, and k indexes reps. 2. Why doesn't Excel sort the random number column correctly? To sort with EXCEL, paint all the columns that will accompany the sorted one, Data, Sort, choose the column to sort on and click OK to get the sort. 3. Page 168 of Clewer, last paragraph indicates that "blocking was not very successful in this experiment." I thought with CV 11.12% and CV 3.7% for blocks would be low and indicate a good test or low variation. The book did say 11% CV is good. This CV is based on error. It also said that the CV for blocks was low, indicating that not as much variation is accounted for by blocks. Since blocks don't eliminate much variation in this experiment, they were not really needed. We will see more details on this in Unit 11. 4. CRD assumes homogeneity, but how can we ever really assume that in field experiment settings, where so many factors are beyond control? What is the advantage assumed over blocking, that causes blocking not to be used automatically? Unless the field is very uniform, we usually do not assume a CRD, and in general, we do try blocking. Some experimental areas are very uniform, including a uniform field, greenhouse area, or growth chamber, and therefore we use a CRD. One advantage of a CRD over blocking is more df for error. 5. I don't understand significances of the factors in Ex.1. It looks to me like population (P > F=.15) and hybrid (P > F=.07) are more significant than pop/hybrid, which, at.03, is less than.05. But that's not what they say in the lesson. What am I mistaking? The P =.03 is the probability assuming the null hypothesis (no difference). Small P implies there aredifferences. The smaller the probability, the more significant. That is, we have lower probability that the treatments are the same just by chance. Consequently with P less than 0.05, they are called significantly different. 6. When conducting field research how often can you really utilize a completely randomized design? If this design assumes total homogeneity, I don t think it would be very often. Is this the reason we do some sort of blocking factor to improve some of the differences? This is similar to an earlier question. Most field experiments use blocking and the RCBD design. Growth chambers and greenhouses sometimes use blocking and other times use a completely randomized design (CRD), sometimes re-randomizing the pots by re-arranging every week. 7. What exactly is the error mean square telling us in this lesson? Error mean square is the measure of error variance for the experiment. It tells us how much random variation is present. If it is high, our experiment is poor. If it is low relative to the Trt MS, we can better detect Trt differences and probably have a good experiment.

8. Unclear: the Treatment SS and Error SS and their relationship to each other and the data. Are data good because SS are low or high and how do we determine this? When the error MS is low, this is good; there is not much unaccounted error in the experiment. High Trt MS generally implies treatments are different. If Trt SS is 70% of Total SS (corrected), Err SS is 30%. We then say R-squared is 70%, or 70% of the experiment variation is accounted for by treatments. 9. Can you explain the term treatment structure? Treatment structure refers to what treatments are present and how are they arranged. For example, we may have two planting dates for each of 10 hybrids. This would be a 2*10 factorial treatment structure. 10. If all of the sources are significantly different can you determine the level of significance between the sources? If there are two factors and their interaction, and all are significant, just concentrate on the interaction because there are different patterns of response in Factor A for different levels of Factor B. If interaction is not significant, look at main effects of Factor A or Factor B. For example, in an experiment with corn hybrids (Factor A) and nitrogen application (Factor B), if A*B is significant, we need to look at N-response separately for each hybrid. If A*B is not significant, we can report the overall N-response for all hybrids. 11. The least clear concept to me is the difference between utilizing the F-ratio and its critical values, and utilizing the Prob>F value to determine whether to reject or accept a null. Can either one be used, or is there one method that provides this information and I am just confusing the purposes of each? Either calculated F with the critical F from the table, or the Prob>F tell the same information. Prob>F is easier with computer programs and is more accurate. For example, not using the computer output, we might reject the null because calculated F > Table F for 0.05, but we wouldn't know that Prob>F is 0.026, which is more informative. 12. I didn't really understand what were the important numbers to get out of the ANOVA tables to analyze the differences between treatments. Most important are the Prob>F in the Effects Tests. Those tell if the interaction is important (when less than 0.05) and if main effects of Factor A or Factor B are important. 13. I get confused on the differences between the Completely Random Design experiments and experiments that are in a blocked fashion. Could you explain the differences? The distinction between the two is that block designs have a restriction on randomization and with a CRD there are none. You normally use a block design when you do not have enough uniform plots or other experimental units to accommodate all your treatments. In this case you can group your plots into more uniform groups or blocks. When you do this you are counting on the variation among your blocks being greater than variation among plots within a block. To make this work you have to restrict your randomization so that every treatment occurs in every block. You will be learning more about block designs in Lesson 11. 14. I'm still not clear as to the practical utility of the linear additive model. To me, it seems like little more than a way to visually represent all of the potential factors that are affecting your results. How is the LAM used for research purposes? Linear additive models are used to describe experimental designs and determine the appropriate analysis of data from an experiment. They have tremendous utility for people who conduct

research using designs more complex than the ones presented in this class. They are presented here to help you conceptualize how the ANOVA works. 15. The least clear part of the lesson is when I randomized the treatments in the assignment some of the same treatments ended up being right next to each other. Is this ok or should it be changed? I read in the book that it shouldn't be changed. This occurs as a natural consequence of the randomization process. Generally, we accept whatever randomization we get. However, there is a class of experimental designs which tries to achieve spatial balance and these eliminate such occurrences. 16. I translated the formula for the linear model for a two-factor ANOVA into words but I can't say I really understand it. The equation is throwing you off. It is really a simple concept. The model partitions effects among the sources of variation included in the model. Think of it as parsing the blame. 17. I don't quite understand the Linear Model equation and the significance of the subscripts i, j, k. The subscripts correspond to the levels of the factors in the model. For example, if Factor A is described as Ai, the subscript i refers to the possible levels of Factor A. A1 refers to the first level of Factor A. A2 refers to the second level of Factor A, and so on. 18. I do not understand how you can accurately determine the amount of error within an experiment without a lot more information than what an ANOVA table uses. You may be confused about the meaning of "error" in the context of an ANOVA. It is defined as the variation among plots treated the same and is considered an unbiased estimate of the variance of the population the plots represent. It represents all of the variation in your measured response that is not attributable to the other factors in your model so it gets blamed for all the unexplained variation in your experiment. 19. The concept that was the least clear to me was the linear model equation for the two-factor ANOVA. I guess it is hard for me to understand that just multiplying the two treatment effects gives the result for the two treatments interaction. I thought that the interaction would be much more complex than that. It is important to understand that the interaction is not the result of multiplying the two treatment effects. The interaction term in a linear additive model is just another linear term used to account for the situation when the main factors to not explain all the variation in the response. This is sometimes called nonadditivity, because when the interaction is significant, the observed response cannot be explained by addition of the main treatment factors alone. It the interaction is not significant, then the term is not needed to explain the response and its estimate is close to zero. 20. When we perform a two factor ANOVA in SAS what are the probabilities of the effect tests telling us? They represent the probability of the calculated test statistic (F) occurring at random. A low P (e.g., < 0.05) indicates that the test is significant and the factor has an effect on the measured response. 21. The item I am still a little confused by is how to completely read the ANOVA table, I understand what I need to be looking at to see the results the in the F ratio, but I am not exactly sure how this always relates, do we always look at the.05,or is there times that that number changes. I am still a little confused by how to read all of the results.

The F statistic tests the multisample hypothesis that all the treatment means of the levels tested within a treatment factor or an interaction are equal to zero. It is analogous to the t-test which is used to test pairwise hypotheses. The probability level used for either test depends on your assessment of the consequences of making a Type I error; i.e. saying a difference occurs when in reality there was none. The 0.05 P level is often used in agronomic experiments, which means most of the time we are comfortable with 1 in 20 chance of making an error. SAS and other statistics programs give you the actual probability of the F ratio occurring at random. As long as it is less than our stated P level we are comfortable declaring the test significant. 22. The concept in the unit that is the least clear to me is why it would be beneficial to choose option one in designing an experiment that considers each experiment as a separate experiment instead of combining them into a factorial experiment. I also don t really understand what the value of doing this type of experiment would be. Exactly, it makes more sense to use the factorial experimental design whenever there are enough similar experimental units to accommodate it. 23. I was surprised by the high level of statistical significance of what appeared to be at most a moderate interaction between rate and date in the homework problem. I would like to have a better feel for what underlies that level of statistical significance, and know more generally about the interrelationships of the direction and magnitude of measured interactions, precision of the experiment as reflected in EMS, and statistical significance of the interaction and main effects. I understand that you can t generalize about the magnitude of the main effects if the interaction is significant, but can you still indicate that the main effects had a statistically significant effect if so indicated by the F statistic and p? I think the very high level of significance / very low probability associated with the interaction is an indication of how strong the main effects were individually. So to a certain extent the main effects are integrated into the interaction. I'll admit to being a little concerned with the choice of words used to describe the interactions -- I would not have chosen "moderate" vs. "significant" or "severe". I prefer the terms "positive" and "negative". You can still present the main effect probabilities in your results when there is an interaction (I have even published some) -- but your discussion needs to focus on the interaction. 24. I still have trouble with all these degrees of freedom, I know how to calculate them but I still have trouble figuring out why they are needed and the concept behind them. Think of the the degrees of freedom as "averaging" our sums of squares. The more treatments or replications we have, the larger our sums of squares will be -- even if the variation in the population does not change. So our degrees of freedom allow us to average the sum of squares to to get the mean squares. 25. I don't understand what lsmeans truly is. Nor do I understand, in SAS, what slicing is and how to determine what to slice? For example should I have run lsmeans in SAS and, then, what should the code have been for slicing in SAS? I don't understand exactly what the lsmeans statement is, either! Well, I think I do, but to explain it is tricky. To understand it, you have to think back to the regression unit. After we drew the regression line through the data, there was an actual sample mean for each level of X, and there was the mean value predicted by the regression equation. That is what the least square mean (the lsmean) is, because the analysis of variance is another kind of regression (even though we don't normally discuss it in those terms.

Slicing is used to look at the means of each level of one factor at each level of a second factor. It is basically the same thing you did using the pivot table in the homework. 26. What is the difference and meaning of Type I and Type III in SAS? Type I and Type III anovas have the same value if the experiment is balanced, i.e. each treatment combination is represented the same number of times. When we end up with an unbalanced design, because a plot got the wrong treatment, or some piece of data got lost (yes, this does happen), then we use the Type III anova. 27. One thing I am unsure of is the use of SLICE in SAS. Is that just used to determine which of the treatments actually had a significant interaction? SLICE simply gives us the means for each level of Factor A separately for each level of Factor B. As you wrote, this allows us to view the nature of the interaction. 28. Using Ai for example, what is the effect of ith level of factor A? The effect of ith level of factor A is, well, exactly that. It is the amount by which the value of the experimental unit changes as a result of the level of treatment. For example, say we had nitrogen rates of 0, 100, and 200 lbs. We would have an effect on corn yield associated with each level of nitrogen. So there would be a variation (positive or negative) from the population mean (the mean of the whole experiment) associated with 0 lb N (i=0), and another one with 50 lb N (i=50), and so on. 29. The concept that is still unclear to me is the understanding of the interaction between the different factors in an experiment. I think I am getting confused between the significance in the main affects and and how to classify if there is no interaction, some interaction and significant interaction. I'll admit to being a little concerned with the choice of words used to describe the interactions -- I would not have chosen "moderate" vs. "significant" or "severe". I prefer the terms "positive" and "negative". The interaction is either significant or it isn't, and that is determined by the F-test and associated probability. In the absence of a significant interaction, we rely on the significance of the main effects to describe the effects of each factor. If the interaction is significant, however, then we cannot explain one factor independent of the value of the second factor at which the first factor is evaluated. For example, new corn hybrids are more efficient in using nitrogen than old hybrids. Therefore, if we tested an old and a new hybrid at high and low levels of N fertilization, we would see different results (e.g. a greater yield difference between the hybrids at the low level of N fertilization). If we were going to discuss the difference between the hybrids, we would have to mention the difference at each level of nitrogen. In a way, we always do this in experiments by describing in our methodology how the crops were managed. 30. I still struggle with the additive model. This is no different than my struggles with the previous additive models. I still don't understand when and where I would use them. Perhaps I wouldn't? I know it is supposed to be the outline of the ANOVA, but I still don't get why it is useful. As you mentioned, it is the outline of the ANOVA model. It describes how we partition the variation associated with an observation -- what variation we attribute to each factor, versus plot error. It allows us ultimately to determine which effects are significant.

In the GLM or ANOVA statement in SAS, the model statement is nothing more than the linear additive model, minus the population mean. The model tells SAS how to partition the variation and test the effects. Hopefully it will make more sense as we move into the Randomized Complete Block Design in Lesson 11.

Summary Factorial Experiments Interaction Involve more than one treatment factor. Allow exploration of combinations of factors, interaction Refers to the treatment structure, not how treatments are assigned to plots. Differential response of one factor at different levels of another. Linear Model Factorial Linear Model is simple extension for more detail on treatment factors and interaction. Experimental Design Determines model and ANOVA. CRD has treatments assigned to experimental units completely at random. Assignment 9.1 Follow the instructions to connect to the Agronomy Virtual Labs and complete the exercises found in hw09.doc. To complete this assignment you will need to make use of the hw09.xls data file, located in the same folder. Complete this assignment in Canvas. Be sure to check the course calendar for due dates. Lesson 9 Reflection 1. In your own words, write a short summary (< 150 words) for this lesson. 2. What is the most valuable concept you learned from the lesson? Why is this concept valuable to you? 3. What concepts in the lesson are still unclear/the least clear to you? Complete this assignment in Canvas. Be sure to check the course calendar for due dates.