Example of Interpreting and Applying a Multiple Regression Model

We'll use the same data set as for the bivariate correlation example -- the criterion is 1st year graduate grade point average and the predictors are the program they are in and the three GRE scores.

First we'll take a quick look at the simple correlations.

Correlations with 1st year graduate gpa (criterion variable)

                      Analytic     Quantitative   Verbal
                      subscore     subscore       subscore
                      of GRE       of GRE         of GRE       PROGRAM
Pearson Correlation   .643         .613           .277         -.186
Sig. (2-tailed)       .000         .000           .001          .028
N                     140          140            140           140

We can see that all four variables are correlated with the criterion, and all GRE correlations are positive. Since program is coded 1 = clinical and 2 = experimental, the negative correlation tells us that the clinical students have a higher mean on the criterion.

Analyze → Regression → Linear
  Move the criterion variable into the "Dependent" window
  Move all four predictor variables into the "Independent(s)" window

Syntax

  REGRESSION
    /STATISTICS COEFF OUTS R ANOVA
    /DEPENDENT ggpa
    /METHOD=ENTER grea greq grev program.
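The coding argument above (a negative r with 1 = clinical, 2 = experimental means the clinical group has the higher mean) can be illustrated with a small sketch. The six data points below are hypothetical toy values, not the handout's data, and `pearson_r` is a helper written for this illustration only.

```python
import math

def pearson_r(x, y):
    """Pearson correlation computed from deviation scores."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical toy data: program coded 1 = clinical, 2 = experimental
program = [1, 1, 1, 2, 2, 2]
gpa = [3.8, 3.6, 3.7, 3.2, 3.4, 3.3]

r = pearson_r(program, gpa)
mean_clinical = sum(g for p, g in zip(program, gpa) if p == 1) / program.count(1)
mean_experimental = sum(g for p, g in zip(program, gpa) if p == 2) / program.count(2)

# A negative r goes with the lower-coded group (clinical) having the higher mean
print(r < 0, mean_clinical > mean_experimental)
```

With a two-group predictor, the sign of r only reflects which group received the larger code, so reversing the coding would flip the sign without changing the group means.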

SPSS Output:

Model Summary

Model   R         R Square   Adjusted R Square   Std. Error of the Estimate
1       .758(a)   .575       .562                .39768

a. Predictors: (Constant), Verbal subscore of GRE, PROGRAM, Quantitative subscore of GRE, Analytic subscore of GRE

By the way, the "adjusted R²" is intended to "control for" overestimates of the population R² resulting from small samples, high collinearity or small subject/variable ratios. Its perceived utility varies greatly across research areas and time.

Also, the "Std. Error of the Estimate" (SEE) is the standard deviation of the residuals (gpa - gpa'). As R² increases the SEE will decrease (better fit, less estimation error). On average, our estimates of GGPA with this model will be wrong by .40, which is not a trivial amount given the scale of GGPA.

Does the model work?

Yep -- significant F-test of H0: R² = 0. If we had to compute it by hand, with k = 4 predictors and N = 140 cases, it would be

$$F = \frac{R^2 / k}{(1 - R^2) / (N - k - 1)} = \frac{.575 / 4}{(1 - .575) / 135} = 45.67$$

The tabled critical value is F(4, 120, .01) = 3.48 (using the tabled denominator df of 120 as a conservative stand-in for our 135). So, we would reject this H0: and decide to use the model, since it accounts for significantly more variance in the criterion variable than would be expected by chance.

ANOVA(b)

Model          Sum of Squares   df    Mean Square   F       Sig.
1 Regression   28.888           4     7.222         45.67   .000(a)
  Residual     21.351           135   .158
  Total        50.239           139

a. Predictors: (Constant), Verbal subscore of GRE, PROGRAM, Quantitative subscore of GRE, Analytic subscore of GRE
b. Dependent Variable: 1st year graduate gpa -- criterion variable

How well does the model work?

It accounts for about 58% of gpa variance.

Coefficients(a)

                                 Unstandardized             Standardized
                                 Coefficients               Coefficients
Model                            B            Std. Error    Beta           Sig.
1 (Constant)                     -1.215       .454                         .025
  PROGRAM                        -6.561E-02   .070          -.055          .348
  Analytic subscore of GRE        6.749E-03   .001           .549          .000
  Quantitative subscore of GRE    3.374E-03   .000           .456          .000
  Verbal subscore of GRE         -2.353E-03   .001          -.243          .001

a. Dependent Variable: 1st year graduate gpa -- criterion variable

Which variables contribute to the model?

Looking at the p-value of the t-test for each predictor, we can see that each of the GRE scales contributes to the model, but program does not. Once GRE scores are "taken into account" there is no longer a mean grade difference between the program groups. This highlights the difference between using a correlation to ask if there is a bivariate relationship between the criterion and a single predictor (ignoring all other predictors) and using a multiple regression to ask if that predictor is related to the criterion after controlling for all the other predictors in the model.
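The hand computations behind the Model Summary and ANOVA tables can be sketched in a few lines. This is a minimal illustration that assumes the usual textbook formulas for the F-test of R², adjusted R², and the standard error of the estimate. The function names are made up for this example, and the inputs are the rounded values printed in the output, so the results match SPSS only to rounding.

```python
import math

def f_from_r2(r2, n, k):
    """F statistic for H0: R² = 0 with k predictors and n cases."""
    return (r2 / k) / ((1 - r2) / (n - k - 1))

def adjusted_r2(r2, n, k):
    """Adjusted R² using the (n - 1) / (n - k - 1) correction."""
    return 1 - (1 - r2) * (n - 1) / (n - k - 1)

def std_error_of_estimate(ss_residual, n, k):
    """Square root of the residual mean square."""
    return math.sqrt(ss_residual / (n - k - 1))

N, K = 140, 4          # cases and predictors, from the output
R2 = 0.575             # R Square, from the Model Summary
SS_RESIDUAL = 21.351   # residual sum of squares, from the ANOVA table

f_stat = f_from_r2(R2, N, K)
adj_r2 = adjusted_r2(R2, N, K)
see = std_error_of_estimate(SS_RESIDUAL, N, K)

print(round(f_stat, 2), round(adj_r2, 3), round(see, 3))
```

Because R² = .575 is itself rounded, the hand-computed F differs from the printed 45.67 in the second decimal place; the adjusted R² and SEE agree with the output to three decimals.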

Take a look at the analytic subscale

The b weight tells us that each added point on the GREA increases the expected grade point by .0067. That doesn't seem like much, but consider that a GRE increase of 100 points leads to a GPA increase of about .67.

Take a look at the verbal subscale

This is a suppressor variable -- the sign of the multiple regression b and the simple r are different. By itself GREV is positively correlated with gpa, but in the model higher GREV scores predict smaller gpa (other variables held constant). Check out the Suppressors handout for more about these.

Example Write-up

Correlation and multiple regression analyses were conducted to examine the relationship between first year graduate GPA and various potential predictors. Table 1 summarizes the descriptive statistics and analysis results. As can be seen, each of the GRE scores is positively and significantly correlated with the criterion, indicating that those with higher scores on these variables tend to have higher 1st year GPAs. Program is negatively correlated with 1st year GPA (coded as 1=clinical and 2=experimental), indicating that the clinical students have a higher 1st year GPA. The multiple regression model with all four predictors produced R² = .575, F(4, 135) = 45.67, p < .001. As can be seen in Table 1, the Analytic and Quantitative GRE scales had significant positive regression weights, indicating students with higher scores on these scales were expected to have higher 1st year GPA, after controlling for the other variables in the model. The Verbal GRE scale had a significant negative weight (opposite in sign from its correlation with the criterion), indicating that after accounting for Analytic and Quantitative GRE scores, those students with higher Verbal scores were expected to have lower 1st year GPA (a suppressor effect). Program did not contribute to the multiple regression model.

Table 1  Summary statistics, correlations and results from the regression analysis

                                        correlation with   multiple regression weights
Variable           mean      std        1st year GPA       b            β
1st year GPA       3.319     .612
GREA               570.0     75.9        .643***            .0067***     .549
GREV               559.3     62.2        .277***           -.0024***    -.243
GREQ               578.5     82.0        .613***            .0034***     .456
Program^                                -.186*             -.0656       -.055
  clinical         55 (53.4%)
  experimental     48 (46.6%)

^ coded as 1=clinical and 2=experimental students
* p < .05   ** p < .01   *** p < .001
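The sign comparison used above to spot the suppressor can be written as a short check. This sketch applies only the sign-reversal criterion the handout describes (simple r and regression b of opposite sign), and other definitions of suppression exist. The r and b values are copied from the SPSS output; the dictionary layout and function names are assumptions made for this illustration.

```python
def sign(x):
    """Return -1, 0 or 1 according to the sign of x."""
    return (x > 0) - (x < 0)

# (simple r with 1st year GPA, unstandardized b) taken from the SPSS output
predictors = {
    "grea": (0.643, 6.749e-03),
    "greq": (0.613, 3.374e-03),
    "grev": (0.277, -2.353e-03),
    "program": (-0.186, -6.561e-02),
}

def sign_reversals(table):
    """List the predictors whose regression weight has the opposite sign of their simple r."""
    return [name for name, (r, b) in table.items() if sign(r) * sign(b) < 0]

flagged = sign_reversals(predictors)
print(flagged)  # ['grev']
```

Only the verbal subscale is flagged: program has a negative r and a negative b, so its sign is consistent even though its weight is not significant.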

Applying the multiple regression model

Now that we have a "working" model to predict 1st year graduate gpa, we might decide to apply it to the next year's applicants. So, we use the raw score model to compute our predicted scores:

gpa' = (.006749*grea) + (.003374*greq) + (-.002353*grev) + (-.06561*prog) - 1.215

Notice that all four predictors are in the model, even though prog isn't a significant/contributing predictor. If we wanted to use a model with just the three GRE predictors, we would have to rerun that model and use the resulting weights -- you can't just use some of the b-weights from a model! Notice also that the weight for prog is the B of -6.561E-02 from the Coefficients table, that is, -.06561.

  COMPUTE ggpa_pred = (.006749*grea) + (.003374*greq) + (-.002353*grev) + (-.06561*program) - 1.215.
  EXECUTE.

(The predicted score is named ggpa_pred because an SPSS variable name cannot contain an apostrophe, and prog is the variable called program in the REGRESSION syntax.) When we run this computation, a new variable is computed and placed in the rightmost column of the data set.

We might have computed these estimated GGPA values to help decide which students to admit to the program. When using these estimates, we need to consider four things carefully:

1. The model works better than chance, meaning that, on average, gpa' is expected to estimate GGPA better than if we just assigned each candidate the mean GGPA for the population represented by the sample (but some individuals may be better estimated by that mean than by gpa').
2. While an R² of .58 is usually grounds for much celebration, the model accounts for less than 60% of the variance, which is way less than 100%.
3. Related to this, the SEE tells us that, on average, our GGPA estimates will be off by .40.
4. The specific predicted GGPA estimate for each applicant depends not only upon the fit of the model, but also upon the specific predictors involved in the model. If we used a different model (even with the same R²) we might get different values and even a different ordering of the applicants.
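As a rough sketch of what the raw-score computation does outside SPSS, the function below applies the b weights from the Coefficients table (including PROGRAM B = -6.561E-02, that is, -.06561) to each applicant and then orders the applicants by predicted GGPA. The two applicants and their scores are hypothetical, invented only to show the mechanics, and `predict_ggpa` is a name chosen for this example.

```python
# b weights and constant copied from the SPSS Coefficients table
B_CONSTANT = -1.215
B_WEIGHTS = {
    "grea": 6.749e-03,
    "greq": 3.374e-03,
    "grev": -2.353e-03,
    "program": -6.561e-02,  # 1 = clinical, 2 = experimental
}

def predict_ggpa(applicant):
    """Apply the full raw-score model; every predictor in the model must be supplied."""
    return B_CONSTANT + sum(B_WEIGHTS[name] * applicant[name] for name in B_WEIGHTS)

# Hypothetical applicants (not from the handout's data set)
applicants = {
    "A": {"grea": 600, "greq": 600, "grev": 550, "program": 1},
    "B": {"grea": 550, "greq": 620, "grev": 600, "program": 2},
}

predictions = {name: predict_ggpa(scores) for name, scores in applicants.items()}
ranked = sorted(predictions, key=predictions.get, reverse=True)

for name in ranked:
    print(name, round(predictions[name], 2))
```

The function deliberately raises a KeyError if a predictor is missing, which mirrors the point above that a subset of the b-weights cannot be used without refitting the model.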

We could also use the standardized model to make the predictions. That model is

zgpa' = (.549*Zgrea) + (.456*Zgreq) + (-.243*Zgrev) + (-.055*Zprog)

In order to apply this model, we must have z-score versions of each variable. Perhaps the simplest way to do this in SPSS is via the Descriptives procedure.

Analyze → Descriptive Statistics → Descriptives
  Move the desired variables into the "Variables" window
  Check the box on the lower left, "Save standardized values as variables"

When you run this command, you will get the requested statistics, and new variables will be added to the spreadsheet. The name of each new variable will have a Z inserted at the beginning of the original variable name. We can then apply the standardized formula shown above to estimate the Z-score GGPA of each applicant.

Applying this formula in a COMPUTE statement will produce a new variable that estimates each applicant's GGPA, but on a standardized scale (mean = 0, std = 1), rather than on the scale of the population GGPA as estimated from the original modeling sample. The ggpa_pred and Zggpa_pred variables for each candidate are shown on the right.

All the caveats that apply to predicted raw scores apply to predicted Z-scores!
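The standardized model can be sketched the same way. The β weights below are the Beta values from the Coefficients table. The `z_score` helper and the example applicant are assumptions made for illustration, and the GREA mean and standard deviation are borrowed from Table 1 purely to show the conversion.

```python
# Standardized (Beta) weights copied from the SPSS Coefficients table
BETAS = {"grea": 0.549, "greq": 0.456, "grev": -0.243, "program": -0.055}

def z_score(value, mean, sd):
    """Convert a raw score to a z-score using a given mean and standard deviation."""
    return (value - mean) / sd

def predict_z_ggpa(z_values):
    """Apply the standardized model; there is no constant because every variable is centered."""
    return sum(BETAS[name] * z_values[name] for name in BETAS)

# Hypothetical applicant: one standard deviation above the GREA mean, average elsewhere
z_applicant = {
    "grea": z_score(645.9, 570.0, 75.9),
    "greq": 0.0,
    "grev": 0.0,
    "program": 0.0,
}

z_pred = predict_z_ggpa(z_applicant)
print(round(z_pred, 3))
```

An applicant who is average on every predictor gets a predicted z of 0, that is, the mean GGPA, and the prediction shifts by β standard deviations for each standard deviation of change in a predictor.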