
ANOVA and ANCOVA
A GLM Approach
Second Edition

ANDREW RUTHERFORD
Keele University
School of Psychology
Staffordshire, United Kingdom

WILEY
A JOHN WILEY & SONS, INC., PUBLICATION

Contents

Acknowledgments

1 An Introduction to General Linear Models: Regression, Analysis of Variance, and Analysis of Covariance
1.1 Regression, Analysis of Variance, and Analysis of Covariance
1.2 A Pocket History of Regression, ANOVA, and ANCOVA
1.3 An Outline of General Linear Models (GLMs)
1.3.1 Regression
1.3.2 Analysis of Variance
1.3.3 Analysis of Covariance
1.4 The "General" in GLM
1.5 The "Linear" in GLM
1.6 Least Squares Estimates
1.7 Fixed, Random, and Mixed Effects Analyses
1.8 The Benefits of a GLM Approach to ANOVA and ANCOVA
1.9 The GLM Presentation
1.10 Statistical Packages for Computers

2 Traditional and GLM Approaches to Independent Measures Single Factor ANOVA Designs
2.1 Independent Measures Designs
2.2 Balanced Data Designs
2.3 Factors and Independent Variables
2.4 An Outline of Traditional ANOVA for Single Factor Designs
2.5 Variance
2.6 Traditional ANOVA Calculations for Single Factor Designs
2.7 Confidence Intervals
2.8 GLM Approaches to Single Factor ANOVA
2.8.1 Experimental Design GLMs
2.8.2 Estimating Effects by Comparing Full and Reduced Experimental Design GLMs
2.8.3 Regression GLMs
2.8.4 Schemes for Coding Experimental Conditions
2.8.4.1 Dummy Coding
2.8.4.2 Why Only (p − 1) Variables Are Used to Represent All Experimental Conditions?
2.8.4.3 Effect Coding
2.8.5 Coding Scheme Solutions to the Overparameterization Problem
2.8.6 Cell Mean GLMs
2.8.7 Experimental Design Regression and Cell Mean GLMs

3 Comparing Experimental Condition Means, Multiple Hypothesis Testing, Type 1 Error, and a Basic Data Analysis Strategy
3.1 Introduction
3.2 Comparisons Between Experimental Condition Means
3.3 Linear Contrasts
3.4 Comparison Sum of Squares
3.5 Orthogonal Contrasts
3.6 Testing Multiple Hypotheses
3.6.1 Type 1 and Type 2 Errors
3.6.2 Type 1 Error Rate Inflation with Multiple Hypothesis Testing
3.6.3 Type 1 Error Rate Control and Analysis Power
3.6.4 Different Conceptions of Type 1 Error Rate
3.6.4.1 Testwise Type 1 Error Rate
3.6.4.2 Familywise Type 1 Error Rate
3.6.4.3 Experimentwise Type 1 Error Rate
3.6.4.4 False Discovery Rate
3.6.5 Identifying the "Family" in Familywise Type 1 Error Rate Control
3.6.6 Logical and Empirical Relations
3.6.6.1 Logical Relations
3.6.6.2 Empirical Relations
3.7 Planned and Unplanned Comparisons
3.7.1 Direct Assessment of Planned Comparisons
3.7.2 Contradictory Results with ANOVA Omnibus F-tests and Direct Planned Comparisons
3.8 A Basic Data Analysis Strategy
3.8.1 ANOVA First?
3.8.2 Strong and Weak Type 1 Error Control
3.8.3 Stepwise Tests
3.8.4 Test Power
3.9 The Three Basic Stages of Data Analysis
3.9.1 Stage 1
3.9.2 Stage 2
3.9.2.1 Rom's Test
3.9.2.2 Shaffer's R Test
3.9.2.3 Applying Shaffer's R Test After a Significant F-test
3.9.3 Stage 3
3.10 The Role of the Omnibus F-Test

4 Measures of Effect Size and Strength of Association, Power, and Sample Size
4.1 Introduction
4.2 Effect Size as a Standardized Mean Difference
4.3 Effect Size as Strength of Association (SOA)
4.3.1 SOA for Specific Comparisons
4.4 Small, Medium, and Large Effect Sizes
4.5 Effect Size in Related Measures Designs
4.6 Overview of Standardized Mean Difference and SOA Measures of Effect Size
4.7 Power
4.7.1 Influences on Power
4.7.2 Uses of Power Analysis
4.7.3 Determining the Sample Size Needed to Detect the Omnibus Effect
4.7.4 Determining the Sample Size Needed to Detect Specific Effects
4.7.5 Determining the Power Level of a Planned or Completed Study
4.7.6 The Fallacy of Observed Power

5 GLM Approaches to Independent Measures Factorial Designs
5.1 Factorial Designs
5.2 Factor Main Effects and Factor Interactions
5.2.1 Estimating Effects by Comparing Full and Reduced Experimental Design GLMs
5.3 Regression GLMs for Factorial ANOVA
5.4 Estimating Effects with Incremental Analysis
5.4.1 Incremental Regression Analysis
5.4.1.1 Step 1
5.4.1.2 Step 2
5.4.1.3 Step 3
5.5 Effect Size Estimation
5.5.1 SOA for Omnibus Main and Interaction Effects
5.5.1.1 Complete ω² for Main and Interaction Effects
5.5.1.2 Partial ω² for Main and Interaction Effects
5.5.2 Partial ω² for Specific Comparisons
5.6 Further Analyses
5.6.1 Main Effects: Encoding Instructions and Study Time
5.6.2 Interaction Effect: Encoding Instructions × Study Time
5.6.2.1 Simple Effects: Comparing the Three Levels of Factor B at a1, and at a2
5.6.2.2 Simple Effects: Comparing the Two Levels of Factor A at b1, at b2, and at b3
5.7 Power
5.7.1 Determining the Sample Size Needed to Detect Omnibus Main Effects and Interactions
5.7.2 Determining the Sample Size Needed to Detect Specific Effects

6 GLM Approaches to Related Measures Designs
6.1 Introduction
6.1.1 Randomized Block Designs
6.1.2 Matched Sample Designs
6.1.3 Repeated Measures Designs
6.2 Order Effect Controls in Repeated Measures Designs
6.2.1 Randomization
6.2.2 Counterbalancing
6.2.2.1 Crossover Designs
6.2.2.2 Latin Square Designs
6.3 The GLM Approach to Single Factor Repeated Measures Designs
6.4 Estimating Effects by Comparing Full and Reduced Repeated Measures Design GLMs
6.5 Regression GLMs for Single Factor Repeated Measures Designs
6.6 Effect Size Estimation
6.6.1 A Complete ω² SOA for the Omnibus Effect Comparable Across Repeated and Independent Measures Designs
6.6.2 A Partial ω² SOA for the Omnibus Effect Appropriate for Repeated Measures Designs
6.6.3 A Partial ω² SOA for Specific Comparisons Appropriate for Repeated Measures Designs
6.7 Further Analyses
6.8 Power
6.8.1 Determining the Sample Size Needed to Detect the Omnibus Effect
6.8.2 Determining the Sample Size Needed to Detect Specific Effects

7 The GLM Approach to Factorial Repeated Measures Designs
7.1 Factorial Related and Repeated Measures Designs
7.2 Fully Repeated Measures Factorial Designs
7.3 Estimating Effects by Comparing Full and Reduced Experimental Design GLMs
7.4 Regression GLMs for the Fully Repeated Measures Factorial ANOVA
7.5 Effect Size Estimation
7.5.1 A Complete ω² SOA for Main and Interaction Omnibus Effects Comparable Across Repeated Measures and Independent Designs
7.5.2 A Partial ω² SOA for the Main and Interaction Omnibus Effects Appropriate for Repeated Measures Designs
7.5.3 A Partial ω² SOA for Specific Comparisons Appropriate for Repeated Measures Designs
7.6 Further Analyses
7.6.1 Main Effects: Encoding Instructions and Study Time
7.6.2 Interaction Effect: Encoding Instructions × Study Time
7.6.2.1 Simple Effects: Comparison of Differences Between the Three Levels of Factor B (Study Time) at Each Level of Factor A (Encoding Instructions)
7.6.2.2 Simple Effects: Comparison of Differences Between the Two Levels of Factor A (Encoding Instructions) at Each Level of Factor B (Study Time)
7.7 Power

8 GLM Approaches to Factorial Mixed Measures Designs
8.1 Mixed Measures and Split-Plot Designs
8.2 Factorial Mixed Measures Designs
8.3 Estimating Effects by Comparing Full and Reduced Experimental Design GLMs
8.4 Regression GLM for the Two-Factor Mixed Measures ANOVA
8.5 Effect Size Estimation
8.6 Further Analyses
8.6.1 Main Effects: Independent Factor Encoding Instructions
8.6.2 Main Effects: Related Factor Study Time
8.6.3 Interaction Effect: Encoding Instructions × Study Time
8.6.3.1 Simple Effects: Comparing Differences Between the Three Levels of Factor B (Study Time) at Each Level of Factor A (Encoding Instructions)
8.6.3.2 Simple Effects: Comparing Differences Between the Two Levels of Factor A (Encoding Instructions) at Each Level of Factor B (Study Time)
8.7 Power

9 The GLM Approach to ANCOVA
9.1 The Nature of ANCOVA
9.2 Single Factor Independent Measures ANCOVA Designs
9.3 Estimating Effects by Comparing Full and Reduced ANCOVA GLMs
9.4 Regression GLMs for the Single Factor, Single-Covariate ANCOVA
9.5 Further Analyses
9.6 Effect Size Estimation
9.6.1 A Partial ω² SOA for the Omnibus Effect
9.6.2 A Partial ω² SOA for Specific Comparisons
9.7 Power
9.8 Other ANCOVA Designs
9.8.1 Single Factor and Fully Repeated Measures Factorial ANCOVA Designs
9.8.2 Mixed Measures Factorial ANCOVA

10 Assumptions Underlying ANOVA, Traditional ANCOVA, and GLMs
10.1 Introduction
10.2 ANOVA and GLM Assumptions
10.2.1 Independent Measures Designs
10.2.2 Related Measures
10.2.2.1 Assessing and Dealing with Sphericity Violations
10.2.3 Traditional ANCOVA
10.3 A Strategy for Checking GLM and Traditional ANCOVA Assumptions
10.4 Assumption Checks and Some Assumption Violation Consequences
10.4.1 Independent Measures ANOVA and ANCOVA Designs
10.4.1.1 Random Sampling
10.4.1.2 Independence
10.4.1.3 Normality
10.4.1.4 Homoscedasticity: Homogeneity of Variance
10.4.2 Traditional ANCOVA Designs
10.4.2.1 Covariate Independent of Experimental Conditions
10.4.2.2 Linear Regression
10.4.2.3 Homogeneous Regression
10.5 Should Assumptions be Checked?

11 Some Alternatives to Traditional ANCOVA
11.1 Alternatives to Traditional ANCOVA
11.2 The Heterogeneous Regression Problem
11.3 The Heterogeneous Regression ANCOVA GLM
11.4 Single Factor Independent Measures Heterogeneous Regression ANCOVA
11.5 Estimating Heterogeneous Regression ANCOVA Effects
11.6 Regression GLMs for Heterogeneous Regression ANCOVA
11.7 Covariate-Experimental Condition Relations
11.7.1 Adjustments Based on the General Covariate Mean
11.7.2 Multicolinearity
11.8 Other Alternatives
11.8.1 Stratification (Blocking)
11.8.2 Replacing the Experimental Conditions with the Covariate
11.9 The Role of Heterogeneous Regression ANCOVA

12 Multilevel Analysis for the Single Factor Repeated Measures Design
12.1 Introduction
12.2 Review of the Single Factor Repeated Measures Experimental Design GLM and ANOVA
12.3 The Multilevel Approach to the Single Factor Repeated Measures Experimental Design
12.4 Parameter Estimation in Multilevel Analysis
12.5 Applying Multilevel Models with Different Covariance Structures
12.5.1 Using SYSTAT to Apply the Multilevel GLM of the Repeated Measures Experimental Design GLM
12.5.1.1 The Linear Mixed Model
12.5.1.2 The Hierarchical Linear Mixed Model
12.5.2 Applying Alternative Multilevel GLMs to the Repeated Measures Data
12.6 Empirically Assessing Different Multilevel Models

Appendix A
Appendix B
Appendix C
References
Index