Study Guide #2: MULTIPLE REGRESSION in education

What is Multiple Regression?

When using multiple regression in education, researchers use the term independent variables to identify those variables that they think will influence some other dependent variable. If two variables are correlated, then knowing the score on one variable allows you to predict the score on the other variable. The stronger the correlation, the closer the scores fall to the regression line and therefore the more accurate the prediction. Multiple regression is simply an extension of this principle, in which we predict one variable on the basis of several other variables. Having more than one independent variable (or predictor variable) is useful when predicting human behaviour, which is what we do in education. Our actions, thoughts and emotions are all likely to be influenced by some combination of several factors. Using multiple regression we can test theories (or models) about precisely which set of variables is influencing our behaviour.

Conditions for Using Multiple Regression

To use multiple regression, make sure that there is a linear relationship between the independent variable and the dependent variable. In other words, the relationship follows a straight line.

The dependent variable should be a continuous variable, which means it should have scores such as 1, 20, 30 or 99 (such as scores on a mathematics test, or GPA, and so forth). If your dependent variable is categorical, such as 1 = low, 2 = average and 3 = high, then a different regression method called logistic regression should be used for categorical variables [which is not discussed here].

The independent variables should, as far as possible, also be continuous variables with scores such as 1, 20, 30 or 99. However, if you do have to use a categorical variable such as 1 = male and 2 = female, you have to create a dummy variable [which is not discussed here; a minimal coding sketch follows at the end of this section].

Multiple regression requires a large number of cases. How many is enough? You could use this guide: you should have 40 times as many subjects as independent variables, i.e. if you intend to use 2 independent variables as predictors, then you should have at least 2 x 40 = 80 subjects.
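The dummy-coding step mentioned above can be illustrated with a minimal pandas sketch. The data frame and column names (gender, math_score, female) are hypothetical, chosen only to mirror the 1 = male, 2 = female example in the text.

```python
import pandas as pd

# Hypothetical data; column names are illustrative, not from the study guide.
df = pd.DataFrame({
    "gender": [1, 2, 2, 1],          # 1 = male, 2 = female, as in the guide's example coding
    "math_score": [54.0, 61.0, 47.0, 52.0],
})

# Recode the 1/2 categorical predictor as a 0/1 dummy variable
# (0 = male, 1 = female) so it can enter a regression equation.
df["female"] = (df["gender"] == 2).astype(int)

print(df)
```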

When choosing an independent variable, you should select a variable that is likely to be correlated with the dependent variable. For example, you surely do not want to correlate head size with academic performance! The independent variable should also not be strongly correlated with the other independent variables. However, when dealing with human behaviour it is common for the independent variables to be correlated. The term multicollinearity refers to the situation where there is a high correlation between two or more independent variables (for example, attitudes, self-esteem, motivation, self-efficacy, academic performance, personality). What do you think will happen when there is a high correlation between independent variables?

Such high correlations cause problems when trying to draw inferences about the relative contribution of each independent variable to the dependent variable. Is it attitude or motivation that contributed to academic performance? [Fortunately, SPSS provides a method for checking for multicollinearity.]

Example: Let us look at a study in which a researcher is interested in finding out what factors influence mathematics achievement. The researcher collected 4 types of information.

The independent variables are:
o Reading ability
o Metacognitive ability
o Attitude towards mathematics

The dependent variable is:
o Mathematics achievement

Sample of the data (the full study included 200 students):

METACOGNITION Scale   READING Test   ATTITUDE Towards Science   MATHEMATICS Test
52.00                 54.00          57.00                      41.00
59.00                 52.00          61.00                      53.00
33.00                 59.00          31.00                      54.00
44.00                 33.00          56.00                      47.00
52.00                 44.00          61.00                      57.00
59.00                 63.00          61.00                      51.00
46.00                 47.00          61.00                      42.00
57.00                 44.00          36.00                      45.00
55.00                 50.00          36.00                      54.00
60.00                 34.00          51.00                      52.00
63.00                 63.00          51.00                      51.00
52.00                 57.00          61.00                      51.00
49.00                 60.00          61.00                      71.00
57.00                 57.00          71.00                      57.00
52.00                 73.00          46.00                      50.00
57.00                 54.00          56.00                      43.00
52.00                 45.00          55.00                      51.00
50.00                 42.00          45.00                      60.00

Correlations between the 4 Variables

                                       Read    Attitude   Metacognition   Mathematics
Read            Pearson Correlation    1.00    .621       .597            .662
                Sig. (2-tailed)        .       .000       .000            .000
Attitude        Pearson Correlation    .621    1.00       .605            .544
                Sig. (2-tailed)        .000    .          .000            .000
Metacognition   Pearson Correlation    .597    .605       1.00            .617
                Sig. (2-tailed)        .000    .000       .               .000
Mathematics     Pearson Correlation    .662    .544       .617            1.000
                Sig. (2-tailed)        .000    .000       .000            .

** Correlation is significant at the 0.01 level (2-tailed).
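A correlation matrix like the one above can also be reproduced outside SPSS. Below is a minimal pandas sketch; it uses only the first six rows of the data table above for illustration (so the numbers it prints will not match the full-sample correlations), and the column names are assumptions.

```python
import pandas as pd

# First six rows of the data table above, for illustration only;
# the full study used 200 students.
df = pd.DataFrame({
    "metacognition": [52.0, 59.0, 33.0, 44.0, 52.0, 59.0],
    "reading":       [54.0, 52.0, 59.0, 33.0, 44.0, 63.0],
    "attitude":      [57.0, 61.0, 31.0, 56.0, 61.0, 61.0],
    "mathematics":   [41.0, 53.0, 54.0, 47.0, 57.0, 51.0],
})

# Pairwise Pearson correlations, analogous to SPSS's Correlations table.
print(df.corr(method="pearson").round(3))
```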

Independent Variable #1 [Reading] + Independent Variable #2 [Metacognition] + Independent Variable #3 [Attitude] --> Dependent Variable [Mathematics Score]

The IV (READING) is correlated with the DV (Mathematics Score) at 0.66, which means that about 43.6% of the variance in Mathematics performance is shared with Reading.

The IV (METACOGNITION) is correlated with the DV (Mathematics Score) at 0.62, which means that about 38.4% of the variance in Mathematics performance is shared with Metacognition.

The IV (ATTITUDE) is correlated with the DV (Mathematics Score) at 0.54, which means that about 29.2% of the variance in Mathematics performance is shared with Attitude.

But in education and the social sciences these shares overlap: the likelihood of overlap between the independent variables is very high, i.e. there is multicollinearity.
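The percentages above come from squaring each correlation, since r squared is the proportion of shared variance:

```latex
\begin{align*}
r^2_{\text{reading}}        &= 0.66^2 \approx 0.436 && (\approx 43.6\%)\\
r^2_{\text{metacognition}}  &= 0.62^2 \approx 0.384 && (\approx 38.4\%)\\
r^2_{\text{attitude}}       &= 0.54^2 \approx 0.292 && (\approx 29.2\%)
\end{align*}
```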

METHODS OF MULTIPLE REGRESSION ANALYSIS

There are 3 types of methods used in multiple regression analysis:

o Standard model: all the independent variables enter the regression equation at once; this tests the relationship between the whole set of IV1, IV2, IV3 ... IVn and the DV.
o Hierarchical model: you determine which IV enters the regression equation first; this decision is based on the theory of the study.
o Stepwise model: the computer determines which of the IVs should enter the regression equation, and in what order (a minimal sketch of the stepwise idea follows below).
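The stepwise idea can be sketched in a few lines of Python with statsmodels. This is only an illustration of forward selection using the p value of each candidate predictor (for a single added variable this is equivalent to a probability-of-F-to-enter criterion); it omits the removal step that SPSS also applies, and the DataFrame and column names in the usage comment are assumptions, not the study's actual file.

```python
import pandas as pd
import statsmodels.api as sm

def forward_stepwise(X, y, p_enter=0.05):
    """Greedy forward selection: at each step add the candidate predictor
    with the smallest p value, as long as it is below p_enter."""
    selected = []
    remaining = list(X.columns)
    while remaining:
        pvals = {}
        for cand in remaining:
            model = sm.OLS(y, sm.add_constant(X[selected + [cand]])).fit()
            pvals[cand] = model.pvalues[cand]
        best = min(pvals, key=pvals.get)
        if pvals[best] >= p_enter:
            break          # no remaining candidate meets the entry criterion
        selected.append(best)
        remaining.remove(best)
    return selected

# Hypothetical usage, assuming a DataFrame `df` with the study's variables:
# predictors = forward_stepwise(df[["reading", "metacognition", "attitude"]],
#                               df["mathematics"])
```

The standard model corresponds to fitting one OLS model with all predictors at once, and the hierarchical model corresponds to fitting a sequence of nested models in an order you choose on theoretical grounds.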

Variables Entered/Removed (a)

Model   Variables Entered   Method
1       reading score       Stepwise (Criteria: Probability-of-F-to-enter <= .050, Probability-of-F-to-remove >= .100)
2       Metacognition       Stepwise (Criteria: Probability-of-F-to-enter <= .050, Probability-of-F-to-remove >= .100)

a  Dependent Variable: math score

This table tells us about the independent (predictor) variables and the method used. Here we see that only two independent variables were entered because we selected the Stepwise method.

Model Summary

Model   R         R Square   Adjusted R Square   Std. Error of the Estimate
1       .662(a)   .439       .436                7.0371
2       .718(b)   .515       .510                6.5553

a  Predictors: (Constant), reading score
b  Predictors: (Constant), reading score, metacognition

This table, called the Model Summary, is important. The Adjusted R Square tells us that reading score alone accounted for 43.6% of the variance in Mathematics score. The addition of metacognition explained a further 7.4% of the variance, bringing the total to 51.0%. (A worked check of the Adjusted R Square values appears below.)
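Adjusted R Square corrects R Square for the number of predictors (k) and the sample size (n = 200 here, from the ANOVA table's total df of 199). A short worked check against the table:

```latex
\begin{align*}
\bar{R}^2 &= 1 - (1 - R^2)\,\frac{n-1}{n-k-1}\\[4pt]
\text{Model 1: } \bar{R}^2 &= 1 - (1 - 0.439)\,\frac{199}{198} \approx 0.436\\
\text{Model 2: } \bar{R}^2 &= 1 - (1 - 0.515)\,\frac{199}{197} \approx 0.510
\end{align*}
```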

ANOVA Table (c)

Model                Sum of Squares   df    Mean Square   F       Sig.
1   Regression       7660.7           1     7660.7        154.6   .000(a)
    Residual         9805.0           198   49.520
    Total            17465.7          199
2   Regression       9000.27          2     4500.1        104.7   .000(b)
    Residual         8465.51          197   42.972
    Total            17465.7          199

a  Predictors: (Constant), reading score
b  Predictors: (Constant), reading score, metacognition
c  Dependent Variable: math score

The ANOVA table reports the F test for each of the two models. The model with reading score alone is statistically significant, F(1, 198) = 154.6, p < .0005. The model with reading score and metacognition combined is also significant, F(2, 197) = 104.7, p < .0005.
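Each F in the table is simply the regression mean square divided by the residual mean square; for example, for Model 2:

```latex
\[
F(2,\,197) = \frac{MS_{\text{regression}}}{MS_{\text{residual}}} = \frac{4500.1}{42.972} \approx 104.7
\]
```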

COEFFICIENTS (a)

                      Unstandardized Coefficients   Standardized Coefficients
Model                 B        Std. Error            Beta                        t      Sig.
1   (Constant)        21.0     2.58                                              8.12   .00
    reading score     .605     .049                  .662                        12.4   .00
2   (Constant)        12.8     2.82                                              4.55   .00
    reading score     .417     .056                  .456                        7.38   .00
    metacognition     .341     .061                  .345                        5.58   .00

a  Dependent Variable: math score

The standardized beta coefficients give a measure of the contribution of each variable to the model. A large absolute value indicates that a change of one standard deviation in that independent variable has a large effect (in standard deviation units) on the dependent variable. The t and Sig. (p) values give a rough indication of the impact of each independent variable: a big absolute t value and a small p value suggest that an independent variable is having a large impact on the dependent variable.
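From the Model 2 row of the table, the unstandardized coefficients give the prediction equation, and each standardized beta is the unstandardized B rescaled by the standard deviations of that predictor and of the outcome (the standard relation; the standard deviations themselves are not shown in the output):

```latex
\begin{align*}
\widehat{\text{Math}} &= 12.8 + 0.417\,(\text{Reading}) + 0.341\,(\text{Metacognition})\\[4pt]
\beta_j &= B_j \cdot \frac{s_{X_j}}{s_Y}
\end{align*}
```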

COEFFICIENTS table with COLLINEARITY data

                      Unstandardized Coefficients   Standardized Coefficients                   Collinearity Statistics
Model                 B        Std. Error            Beta                        t      Sig.    Tolerance   VIF
1   (Constant)        21.0     2.58                                              8.12   .00
    reading score     .605     .049                  .662                        12.4   .00     1.000       1.000
2   (Constant)        12.865   2.82                                              4.55   .00
    reading score     .417     .056                  .456                        7.38   .00     .644        1.553
    Metacognition     .341     .061                  .345                        5.58   .00     .644        1.553

a  Dependent Variable: math score

If you ask SPSS to check for collinearity, the results are shown in two additional columns of the Coefficients table (see above).

Tolerance tells us the amount of overlap between an independent variable and all the other independent variables. It varies between 0 and 1. The closer it is to 0, the stronger the relationship between this and the other independent variables. You should worry about variables that have low tolerance values (values < 0.10 are often considered an indication of collinearity).

The other indicator of collinearity is the Variance Inflation Factor (VIF). A VIF > 4 means that two or more independent variables are highly correlated, i.e. there is collinearity. Perhaps two variables are measuring the same thing, e.g. SES and family income, or father's education. (A minimal sketch of how Tolerance and VIF are computed follows below.)

COLLINEARITY:
o We want the IVs to be highly correlated with the DV.
o We do not want the IVs to be highly correlated with each other.
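Tolerance is 1 minus the R Square obtained when a predictor is regressed on all the other predictors, and VIF is its reciprocal, so 1 / 0.644 ≈ 1.553, matching the table. Below is a minimal sketch using statsmodels; the DataFrame and column names in the usage comment are assumptions.

```python
import pandas as pd
import statsmodels.api as sm
from statsmodels.stats.outliers_influence import variance_inflation_factor

def collinearity_table(X):
    """VIF and Tolerance for each predictor (Tolerance = 1 / VIF)."""
    Xc = sm.add_constant(X)  # include an intercept, as SPSS does
    rows = []
    for i, name in enumerate(Xc.columns):
        if name == "const":
            continue
        vif = variance_inflation_factor(Xc.values, i)
        rows.append({"predictor": name, "VIF": vif, "Tolerance": 1.0 / vif})
    return pd.DataFrame(rows)

# Hypothetical usage, assuming a DataFrame `df` with the study's predictors:
# print(collinearity_table(df[["reading", "metacognition", "attitude"]]))
```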

CONCLUSION

When reporting your results, you should report:

o The significance of the model, by citing the F and the associated p value: F(2, 197) = 104.7, p < 0.0005
o The adjusted R square: Adjusted R square = 0.510
o The beta regression weights:

Independent variable   Beta    p
Reading score          0.456   p < 0.0005
Metacognition          0.345   p < 0.0005

Attitude was not a significant predictor in this model.