Multiple Regression Using SPSS/PASW


The following sections have been adapted from Field (2009) Chapter 7. These sections have been edited down considerably and I suggest (especially if you're confused) that you read this chapter in its entirety. You will also need to read this chapter to help you interpret the output. If you're having problems there is plenty of support available: you can (1) email or see your seminar tutor, (2) post a message on the course bulletin board, or (3) drop into my office hour.

What is Correlational Research?

In correlational designs many variables are measured simultaneously but, unlike in an experiment, none of them is manipulated. When we use correlational designs we can't look for cause-effect relationships, because we haven't manipulated any of the variables and because all of the variables have been measured at the same point in time (if you're really bored, Field, 2009, Chapter 1 explains why experiments allow us to make causal inferences but correlational research does not). In psychology, the most common correlational research consists of the researcher administering several questionnaires that measure different aspects of behaviour to see which aspects of behaviour are related. Many of you will do this sort of research for your final year research project (so pay attention!).

This handout looks at such an example: data have been collected from several questionnaires relating to clinical psychology, and we will use these measures to predict social anxiety using multiple regression.

The Data

Anxiety disorders take on different shapes and forms, and each disorder is believed to be distinct and have unique causes. We can summarise the disorders and some popular theories as follows:

Social Anxiety: Social anxiety disorder is a marked and persistent fear of one or more social or performance situations in which the person is exposed to unfamiliar people or possible scrutiny by others. This anxiety leads to avoidance of these situations. People with social phobia are believed to feel elevated feelings of shame.
Obsessive Compulsive Disorder (OCD): OCD is characterised by the everyday intrusion into conscious thinking of intense, repetitive, personally abhorrent, absurd and alien thoughts (obsessions), leading to the endless repetition of specific acts or to the rehearsal of bizarre and irrational mental and behavioural rituals (compulsions).

Social anxiety and obsessive compulsive disorder are seen as distinct disorders with different causes. However, there are some similarities:

- They both involve some kind of attentional bias: attention to bodily sensations in social anxiety and attention to things that could have negative consequences in OCD.
- They both involve repetitive thinking styles: social phobics are believed to ruminate about social encounters after the event (known as post-event processing), and people with OCD have recurring intrusive thoughts and images.
- They both involve safety behaviours (i.e. trying to avoid the thing that makes you anxious).

This might lead us to think that, rather than being different disorders, they are just manifestations of the same core processes. One way to research this possibility would be to see whether social anxiety can be predicted from measures of other anxiety disorders. If social anxiety disorder and OCD are distinct, we should expect that measures of OCD will not predict social anxiety. However, if there are core processes underlying all anxiety disorders, then measures of OCD should predict social anxiety.¹

¹ Those of you interested in these disorders can download my old lecture notes on social anxiety (http://www.statisticshell.com/panic.pdf) and OCD (http://www.statisticshell.com/ocd.pdf).

Dr. Andy Field, 2008

The data for this handout are in the file SocialAnxietyRegression.sav, which can be downloaded from the course website. This file contains four variables:

- The Social Phobia and Anxiety Inventory (SPAI), which measures levels of social anxiety.
- The Interpretation of Intrusions Inventory (III), which measures the degree to which a person experiences intrusive thoughts like those found in OCD.
- The Obsessive Beliefs Questionnaire (OBQ), which measures the degree to which people experience obsessive beliefs like those found in OCD.
- The Test of Self-Conscious Affect (TOSCA), which measures shame.

Each of 134 people was administered all four questionnaires. You should note that each questionnaire has its own column and each row represents a different person (see Figure 1). The data in this handout are deliberately quite similar to the data you will be analysing for Autumn Portfolio assignment 2. The main difference is that we have different variables.

Figure 1: Data layout for multiple regression

What analysis will we do?

We are going to do a multiple regression analysis. Specifically, we're going to do a hierarchical multiple regression analysis. All this means is that we enter variables into the regression model in an order determined by past research and expectations. So, for your analysis, we will enter variables in so-called 'blocks':

- Block 1: the first block will contain any predictors that we expect to predict social anxiety. These variables should be entered using forced entry. In this example we have only one variable that we expect, theoretically, to predict social anxiety, and that is shame (measured by the TOSCA).
- Block 2: the second block will contain our exploratory predictor variables (the ones we don't necessarily expect to predict social anxiety). This block should contain the measures of OCD (OBQ and III) because these variables shouldn't predict social anxiety if social anxiety is indeed distinct from OCD. These variables should be entered using a stepwise method because we are exploring them (think back to your lecture).
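SPSS/PASW does all of this through dialog boxes, but it may help to see what the two blocks amount to computationally: fit the Block 1 model, fit the Block 2 model, and compare how much variance each explains. The sketch below is plain Python with made-up scores standing in for SPAI, TOSCA and OBQ (illustrative numbers only, not the SocialAnxietyRegression.sav data):

```python
# Sketch of a two-block hierarchy: least-squares fits via the normal equations.
# The data below are invented for illustration.

def ols_r2(y, X):
    """Fit y = b0 + b1*x1 + ... by least squares and return R^2.
    X is a list of predictor columns."""
    n = len(y)
    cols = [[1.0] * n] + [list(c) for c in X]   # prepend an intercept column
    k = len(cols)
    # Build the normal equations X'X b = X'y
    A = [[sum(cols[i][r] * cols[j][r] for r in range(n)) for j in range(k)]
         for i in range(k)]
    b = [sum(cols[i][r] * y[r] for r in range(n)) for i in range(k)]
    # Gaussian elimination with partial pivoting
    for i in range(k):
        p = max(range(i, k), key=lambda r: abs(A[r][i]))
        A[i], A[p] = A[p], A[i]
        b[i], b[p] = b[p], b[i]
        for r in range(i + 1, k):
            f = A[r][i] / A[i][i]
            for c in range(i, k):
                A[r][c] -= f * A[i][c]
            b[r] -= f * b[i]
    coef = [0.0] * k
    for i in range(k - 1, -1, -1):
        coef[i] = (b[i] - sum(A[i][j] * coef[j] for j in range(i + 1, k))) / A[i][i]
    yhat = [sum(coef[j] * cols[j][r] for j in range(k)) for r in range(n)]
    ybar = sum(y) / n
    ss_tot = sum((v - ybar) ** 2 for v in y)
    ss_res = sum((y[r] - yhat[r]) ** 2 for r in range(n))
    return 1 - ss_res / ss_tot

# Hypothetical scores standing in for the outcome and two predictors:
spai  = [120, 135, 110, 150, 140, 128, 160, 115, 145, 132]
tosca = [40, 48, 35, 55, 50, 44, 60, 37, 52, 45]
obq   = [20, 25, 18, 35, 30, 22, 38, 19, 33, 26]

r2_block1 = ols_r2(spai, [tosca])          # Block 1: forced entry of shame
r2_block2 = ols_r2(spai, [tosca, obq])     # Block 2: an OCD measure added
print(round(r2_block1, 3), round(r2_block2, 3), round(r2_block2 - r2_block1, 3))
```

The R² change between the two fits is exactly the "R square change" statistic SPSS/PASW reports for block 2; adding a predictor can never decrease R², which is why we judge the extra variance explained rather than the raw R².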

Doing Multiple Regression on SPSS/PASW

Specifying the First Block in Hierarchical Regression

Theory indicates that shame is a significant predictor of social phobia, and so this variable should be included in the model first. The exploratory variables (OBQ and III) should, therefore, be entered into the model after shame. This method is called hierarchical (the researcher decides in which order to enter variables into the model based on past research). To do a hierarchical regression in SPSS/PASW we enter the variables in blocks (each block representing one step in the hierarchy). To get to the main regression dialog box select Analyze, then Regression, then Linear. The main dialog box is shown in Figure 2.

Figure 2: Main dialog box for block 1 of the multiple regression

The main dialog box is fairly self-explanatory in that there is a space to specify the dependent variable (outcome) and a space to place one or more independent variables (predictor variables). As usual, the variables in the data editor are listed on the left-hand side of the box. Highlight the outcome variable (SPAI scores) in this list by clicking on it and then transfer it to the box labelled Dependent by clicking on the arrow button or dragging it across. We also need to specify the predictor variable for the first block. We decided that shame should be entered into the model first (because theory indicates that it is an important predictor), so highlight this variable in the list and transfer it to the box labelled Independent(s) by clicking on the arrow button or dragging it across. Underneath the Independent(s) box there is a drop-down menu for specifying the method of regression (see Field, 2009 or your lecture notes for multiple regression). You can select a different method of variable entry for each block using this menu, next to where it says Method. The default option is forced entry, and this is the option we want, but if you were carrying out more exploratory work you might decide to use one of the stepwise methods (forward, backward, stepwise or remove).
Specifying the Second Block in Hierarchical Regression

Having specified the first block in the hierarchy, we move on to the second. To tell the computer that you want to specify a new block of predictors you must click on Next. This process clears the Independent(s) box so that you can enter the new predictors (you should also note that above this box it now reads Block 2 of 2, indicating that you are in the second block of the two that you have so far specified). We decided that the second block would contain both of the new predictors, so you should click on obq and iii in the variables list and transfer them, one by one, to the Independent(s) box by clicking on the arrow button. The dialog box should now look like Figure 3. To move between blocks use the Previous and Next buttons (so, for example, to move back to block 1, click on Previous).

It is possible to select different methods of variable entry for different blocks in a hierarchy. So, although we specified forced entry for the first block, we could now specify a stepwise method for the second. Given that we have no previous research regarding the effects of OBQ and III on SPAI scores, we might be justified in requesting a stepwise method for this block (see your lecture notes and my textbook). For this analysis select a stepwise method for this second block.

Figure 3: Main dialog box for block 2 of the multiple regression

Statistics

In the main regression dialog box click on Statistics to open a dialog box for selecting various important options relating to the model (Figure 4). Most of these options relate to the parameters of the model; however, there are procedures available for checking the assumptions of no multicollinearity (Collinearity diagnostics) and independence of errors (Durbin-Watson). When you have selected the statistics you require (I recommend all but the covariance matrix as a general rule) click on Continue to return to the main dialog box.

Figure 4: Statistics dialog box for regression analysis

- Estimates: This option is selected by default because it gives us the estimated coefficients of the regression model (i.e. the estimated b values). Test statistics and their significance are produced for each regression coefficient: a t-test is used to see whether each b differs significantly from zero.²
- Confidence intervals: This option, if selected, produces confidence intervals for each of the unstandardized regression coefficients. Confidence intervals can be a very useful tool in assessing the likely value of the regression coefficients in the population.
- Model fit: This option is vital and so is selected by default. It provides not only a statistical test of the model's ability to predict the outcome variable (the F-test), but also the value of R (or multiple R), the corresponding R², and the adjusted R².
- R squared change: This option displays the change in R² resulting from the inclusion of a new predictor (or block of predictors). This measure is a useful way to assess the unique contribution of new predictors (or blocks) to explaining variance in the outcome.

² Remember that for simple regression the standardized value of b is the same as the correlation coefficient. When we test the significance of the Pearson correlation coefficient, we test the hypothesis that the coefficient is different from zero; it is, therefore, sensible to extend this logic to the regression coefficients.

- Descriptives: If selected, this option displays a table of the mean, standard deviation and number of observations of all of the variables included in the analysis. A correlation matrix is also displayed showing the correlation between all of the variables and the one-tailed probability for each correlation coefficient. This option is extremely useful because the correlation matrix can be used to assess whether predictors are highly correlated (which can be used to establish whether there is multicollinearity).
- Part and partial correlations: This option produces the zero-order correlation (the Pearson correlation) between each predictor and the outcome variable. It also produces the partial correlation between each predictor and the outcome, controlling for all other predictors in the model. Finally, it produces the part correlation (or semi-partial correlation) between each predictor and the outcome. This correlation represents the relationship between each predictor and the part of the outcome that is not explained by the other predictors in the model. As such, it measures the unique relationship between a predictor and the outcome.
- Collinearity diagnostics: This option is for obtaining collinearity statistics such as the VIF, tolerance, eigenvalues of the scaled, uncentred cross-products matrix, condition indexes and variance proportions (see section 7.6.2.4 of Field, 2009 and your lecture notes).
- Durbin-Watson: This option produces the Durbin-Watson test statistic, which tests for correlations between errors. Specifically, it tests whether adjacent residuals are correlated (remember that one of our assumptions of regression was that the residuals are independent). In short, this option is important for testing whether the assumption of independent errors is tenable. The test statistic can vary between 0 and 4, with a value of 2 meaning that the residuals are uncorrelated. A value greater than 2 indicates a negative correlation between adjacent residuals, whereas a value below 2 indicates a positive correlation.
The size of the Durbin-Watson statistic depends upon the number of predictors in the model and the number of observations. For accuracy, you should look up the exact acceptable values in Durbin and Watson's (1951) original paper. As a very conservative rule of thumb, Field (2009) suggests that values less than 1 or greater than 3 are definitely cause for concern; however, values closer to 2 may still be problematic depending on your sample and model.

- Casewise diagnostics: This option, if selected, lists the observed value of the outcome, the predicted value of the outcome, the difference between these values (the residual) and this difference standardized. Furthermore, it will list these values either for all cases, or just for cases for which the standardized residual is greater than 3 (when the ± sign is ignored). This criterion value of 3 can be changed, and I recommend changing it to 2 for reasons that will become apparent. A summary table of residual statistics indicating the minimum, maximum, mean and standard deviation of both the values predicted by the model and the residuals (see section 7.6 of Field, 2009) is also produced.

Regression Plots

Once you are back in the main dialog box, click on Plots to activate the regression plots dialog box shown in Figure 5. This dialog box provides the means to specify a number of graphs, which can help to establish the validity of some regression assumptions. Most of these plots involve various residual values, which are described in detail in Field (2009). On the left-hand side of the dialog box is a list of several variables:

- DEPENDNT (the outcome variable).
- *ZPRED (the standardized predicted values of the dependent variable based on the model). These values are standardized forms of the values predicted by the model.
- *ZRESID (the standardized residuals, or errors). These values are the standardized differences between the observed data and the values that the model predicts.
- *DRESID (the deleted residuals).
- *ADJPRED (the adjusted predicted values).
- *SRESID (the Studentized residual).
- *SDRESID (the Studentized deleted residual). This value is the deleted residual divided by its standard error.
The variables listed in this dialog box all come under the general heading of residuals, and are discussed in detail in my book (sorry for all of the self-referencing, but I'm trying to condense a 60-page chapter into a manageable handout!). For a basic analysis it is worth plotting *ZRESID (Y axis) against *ZPRED (X axis), because this plot is useful to determine whether the assumptions of random errors and homoscedasticity have been met. A plot of *SRESID (Y axis) against *ZPRED (X axis) will also show up any heteroscedasticity. Although often these two plots are virtually identical, the latter is more sensitive on a case-by-case basis. To create these plots simply select a variable from the list and transfer it to the space labelled either X or Y (which refer to the axes) by clicking the arrow button. When you have selected two variables for the first plot (as is the case in Figure 5) you can specify a new plot by clicking on Next. This process clears the spaces in which variables are specified. If you click on Next and would like to return to the plot that you last specified, then simply click on Previous.

You can also select the tick box labelled Produce all partial plots, which will produce scatterplots of the residuals of the outcome variable and each of the predictors when both variables are regressed separately on the remaining predictors. Regardless of whether the previous sentence made any sense to you, these plots have several important characteristics that make them worth inspecting. First, the gradient of the regression line between the two residual variables is equivalent to the coefficient of the predictor in the regression equation. As such, any obvious outliers on a partial plot represent cases that might have undue influence on a predictor's regression coefficient. Second, non-linear relationships between a predictor and the outcome variable are much more detectable using these plots. Finally, they are a useful way of detecting collinearity. For these reasons, I recommend requesting them.

Figure 5: Linear regression: plots dialog box

There are several options for plots of the standardized residuals. First, you can select a histogram of the standardized residuals (this is extremely useful for checking the assumption of normality of errors). Second, you can ask for a normal probability plot, which also provides information about whether the residuals in the model are normally distributed. When you have selected the options you require, click on Continue to take you back to the main regression dialog box.

Saving Regression Diagnostics

In this week's lecture we met two types of regression diagnostics: those that help us assess how well our model fits our sample, and those that help us detect cases that have a large influence on the model generated. In SPSS/PASW we can choose to save these diagnostic variables in the data editor (so SPSS/PASW will calculate them and then create new columns in the data editor in which the values are placed).

Click on Save in the main regression dialog box to activate the save new variables dialog box (see Figure 6).
Once this dialog box is active, it is a simple matter to tick the boxes next to the required statistics. Most of the available options are explained in Field (2005, section 7.7.4) and Figure 6 shows what I consider to be a fairly basic set of diagnostic statistics. Standardized (and Studentized) versions of these diagnostics are generally easier to interpret, and so I suggest selecting them in preference to the unstandardized versions. Once the regression has been run, SPSS/PASW creates a column in your data editor for each statistic requested, and it has a standard set of variable names to describe each one. After the name there will be a number that refers to the analysis that has been run. So, for the first regression run on a data set the variable names will be followed by a 1, if you carry out a second regression it will create a new set of variables with names followed by a 2, and so on. The names of the variables in the data editor are displayed below. When you have selected the diagnostics you require (by clicking in the appropriate boxes), click on Continue to return to the main regression dialog box.

Figure 6: Dialog box for regression diagnostics
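The influence statistics that SPSS/PASW saves are ordinary formulas, not black boxes. For a simple one-predictor regression the leverage of a case is h = 1/n + (x − x̄)²/Σ(x − x̄)², and Cook's distance combines the squared residual with that leverage. A minimal sketch in plain Python (the x and y values are invented; the last case is deliberately placed off the trend):

```python
# Leverage and Cook's distance for a simple regression, by hand.
# Illustrative data only; the last point pulls the fitted line.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [2.1, 2.9, 4.2, 4.8, 6.1, 9.0]

n = len(x)
xbar, ybar = sum(x) / n, sum(y) / n
sxx = sum((v - xbar) ** 2 for v in x)
b1 = sum((x[i] - xbar) * (y[i] - ybar) for i in range(n)) / sxx
b0 = ybar - b1 * xbar
resid = [y[i] - (b0 + b1 * x[i]) for i in range(n)]
mse = sum(e * e for e in resid) / (n - 2)   # residual mean square

p = 2  # parameters estimated: intercept + slope
leverage = [1 / n + (v - xbar) ** 2 / sxx for v in x]            # like lev_1
cooks = [(resid[i] ** 2 / (p * mse)) * (leverage[i] / (1 - leverage[i]) ** 2)
         for i in range(n)]                                      # like coo_1
print([round(h, 3) for h in leverage])
print([round(d, 3) for d in cooks])
```

Two properties worth noticing: the leverages always sum to the number of parameters (here 2), and the discrepant final case gets by far the largest Cook's distance, which is exactly the kind of case these saved diagnostics are meant to flag.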

pre_1: unstandardized predicted value; zpr_1: standardized predicted value; adj_1: adjusted predicted value; sep_1: standard error of predicted value; res_1: unstandardized residual; zre_1: standardized residual; sre_1: Studentized residual; dre_1: deleted residual; sdr_1: Studentized deleted residual; mah_1: Mahalanobis distance; coo_1: Cook's distance; lev_1: centred leverage value; sdb0_1: standardized DFBETA (intercept); sdb1_1: standardized DFBETA (predictor 1); sdb2_1: standardized DFBETA (predictor 2); sdf_1: standardized DFFIT; cov_1: covariance ratio.

A Brief Guide to Interpretation

Model Summary

The model summary contains two models. Model 1 refers to the first stage in the hierarchy when only TOSCA is used as a predictor. Model 2 refers to the final model (TOSCA, and OBQ and III if they end up being included).

- In the column labelled R are the values of the multiple correlation coefficient between the predictors and the outcome. When only TOSCA is used as a predictor, this is the simple correlation between SPAI and TOSCA (0.34).
- The next column gives us a value of R², which is a measure of how much of the variability in the outcome is accounted for by the predictors. For the first model its value is 0.116, which means that TOSCA accounts for 11.6% of the variation in social anxiety. However, for the final model (model 2), this value increases to 0.157, or 15.7% of the variance in SPAI. Therefore, whatever variables enter the model in block 2 account for an extra 4.1% (15.7 − 11.6) of the variance in SPAI scores (this is also the value in the column labelled R square change, but expressed as a percentage).
- The adjusted R² gives us some idea of how well our model generalizes, and ideally we would like its value to be the same as, or very close to, the value of R². In this example the difference for the final model is a fair bit (0.157 − 0.143 = 0.014, or 1.4%). This shrinkage means that if the model were derived from the population rather than a sample it would account for approximately 1.4% less variance in the outcome.
- Finally, if you requested the Durbin-Watson statistic it will be found in the last column. This statistic informs us about whether the assumption of independent errors is tenable. The closer to 2 the value is, the better, and for these data the value is 2.084, which is so close to 2 that the assumption has almost certainly been met.

ANOVA Table

The next part of the output contains an analysis of variance (ANOVA) that tests whether the model is significantly better at predicting the outcome than using the mean as a 'best guess'. Specifically, the F-ratio represents the ratio of the improvement in prediction that results from fitting the model (labelled 'Regression' in the table), relative to the inaccuracy that still exists in the model (labelled 'Residual' in the table). This table is again split into two sections: one for each model.

If the improvement due to fitting the regression model is much greater than the inaccuracy within the model, then the value of F will be greater than 1, and SPSS/PASW calculates the exact probability of obtaining the value of F by chance. For the initial model the F-ratio is 16.52, which is very unlikely to have happened by chance (p < .001). For the second model the value of F is 11.61, which is also highly significant (p < .001). We can interpret these results as meaning that the final model significantly improves our ability to predict the outcome variable.
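The model-summary and ANOVA numbers hang together arithmetically, and it is worth checking them once by hand. The sketch below does so in plain Python using the R² values reported above; note that n = 128 complete cases is an inference from the 125 residual degrees of freedom reported for the final model (the handout does not state n for the analysis explicitly):

```python
# Model-summary and ANOVA arithmetic from the reported values.
# n = 128 is inferred from the reported 125 residual df with k = 2 predictors.
n = 128
r2_1, r2_2 = 0.116, 0.157

r2_change = r2_2 - r2_1                              # the 'R square change'
adj_r2 = 1 - (1 - r2_2) * (n - 1) / (n - 2 - 1)      # adjusted R^2, final model

def f_from_r2(r2, n, k):
    """F-ratio for a model with k predictors: (R^2/k) / ((1-R^2)/(n-k-1))."""
    return (r2 / k) / ((1 - r2) / (n - k - 1))

f1 = f_from_r2(r2_1, n, 1)   # model 1; reported as 16.52
f2 = f_from_r2(r2_2, n, 2)   # model 2; reported as 11.61
print(round(r2_change, 3), round(adj_r2, 3), round(f1, 2), round(f2, 2))
```

The R² change comes out at .041 (the 4.1% above), the adjusted R² at about 0.1435 (the reported 0.143, up to rounding of the published R² values), and both F-ratios land within rounding error of the reported 16.52 and 11.61.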

Model Parameters

The next part of the output is concerned with the parameters of the model. The first step in our hierarchy included TOSCA and, although these parameters are interesting up to a point, we're more interested in the final model because this includes all predictors that make a significant contribution to predicting social anxiety. So, we'll look only at the lower half of the table (Model 2).

In multiple regression the model takes the form of an equation that contains a coefficient (b) for each predictor. The first part of the table gives us estimates for these b values, and these values indicate the individual contribution of each predictor to the model.

The b values tell us about the relationship between social anxiety and each predictor. If the value is positive we can tell that there is a positive relationship between the predictor and the outcome, whereas a negative coefficient represents a negative relationship. For these data both predictors have positive b values, indicating positive relationships. So, as shame (TOSCA) increases, social anxiety increases, and as obsessive beliefs increase so does social anxiety. The b values also tell us to what degree each predictor affects the outcome if the effects of all other predictors are held constant.

Each of these b values has an associated standard error indicating to what extent these values would vary across different samples, and these standard errors are used to determine whether or not the b value differs significantly from zero (using the t statistic that you came across last year). Therefore, if the t-test associated with a b value is significant (if the value in the column labelled Sig. is less than 0.05) then that predictor is making a significant contribution to the model. The smaller the value of Sig. (and the larger the value of t), the greater the contribution of that predictor. For this model, shame (TOSCA), t(125) = 3.16, p < .01, and obsessive beliefs, t(125) = 2.46, p < .05, are significant predictors of social anxiety. From the magnitude of the t statistics we can see that shame (TOSCA) had slightly more impact than obsessive beliefs.
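Both the t-tests and the confidence intervals here come from the same two numbers, b and SE(b). A quick check in Python using the final-model coefficients reported in this handout; 1.979 is an approximate two-tailed .05 critical t for 125 df taken from standard tables rather than from the handout, and the tiny discrepancies are rounding in the reported b and SE:

```python
# t = b / SE(b), and a 95% CI of b +/- t_crit * SE(b), for the final model.
b_tosca, se_tosca = 22.05, 6.98   # shame (TOSCA)
b_obq, se_obq = 6.92, 2.82        # obsessive beliefs (OBQ)
t_crit = 1.979                    # approx. two-tailed .05 critical t on 125 df

t_tosca = b_tosca / se_tosca      # reported as t(125) = 3.16
t_obq = b_obq / se_obq            # reported as t(125) = 2.46
ci_tosca = (b_tosca - t_crit * se_tosca, b_tosca + t_crit * se_tosca)
print(round(t_tosca, 2), round(t_obq, 2), [round(v, 2) for v in ci_tosca])
```

The interval for TOSCA (roughly 8.2 to 35.9) excludes zero, which is just another way of seeing that its t-test is significant at the .05 level.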
The b values and their significance are important statistics to look at; however, the standardized versions of the b values are easier to interpret because they are not dependent on the units of measurement of the variables. The standardized beta values are provided by SPSS/PASW, and they tell us the number of standard deviations that the outcome will change as a result of a one standard deviation change in the predictor. The standardized beta values (β) are all measured in standard deviation units and so are directly comparable: therefore, they provide a better insight into the 'importance' of a predictor in the model. The standardized beta value for shame (TOSCA) is 0.273, and for obsessive beliefs it is 0.213. This tells us that shame has slightly more impact in the model.

Excluded Variables

At each stage of a regression analysis SPSS/PASW provides a summary of any variables that have not yet been entered into the model. In a hierarchical model, this summary has details of the variables that have been specified to be entered in subsequent steps, and in stepwise regression this table contains summaries of the variables that SPSS/PASW is considering entering into the model. The summary gives an estimate of each predictor's b value if it was entered into the equation at this point and calculates a t-test for this value. In a stepwise regression, SPSS/PASW should enter the predictor with the highest t statistic and will continue entering predictors until there are none left with t statistics that have significance values less than 0.05. Therefore, the final model might not include all of the variables you asked SPSS/PASW to enter!

In this case it tells us that if the Interpretation of Intrusions Inventory (III) were entered into the model it would not have a significant impact on the model's ability to predict social anxiety, t = 0.049, p > .05. In fact the significance of this variable is almost 1, indicating it would have virtually no impact whatsoever (note also that its beta value is extremely close to zero!).

Assumptions

If you want to go beyond this basic analysis, you should also look at how reliable your model is. You'll be given a lecture about this, and Field (2009) goes through the main points in painful detail.
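A large part of checking reliability is looking at the residuals. As a flavour of two checks described earlier in this handout, the sketch below (plain Python, with invented residuals) standardizes residuals and flags any beyond the casewise criterion of 2, then computes the Durbin-Watson statistic from adjacent residuals:

```python
# Two residual checks by hand: standardized residuals (residual / residual SD)
# with the casewise criterion of 2, and the Durbin-Watson statistic.
# The residual values are made up for illustration.
import math

resid = [1.2, -0.8, 2.6, -1.4, 0.5, -2.9, 0.9, -0.3, 1.1, -1.0]
n = len(resid)
sd = math.sqrt(sum(e * e for e in resid) / (n - 1))
z = [e / sd for e in resid]
flagged = [i for i, v in enumerate(z) if abs(v) > 2]   # cases worth inspecting

# Durbin-Watson: sum of squared differences between adjacent residuals,
# divided by the sum of squared residuals; always between 0 and 4.
dw = (sum((resid[t] - resid[t - 1]) ** 2 for t in range(1, n))
      / sum(e * e for e in resid))
print(flagged, round(dw, 2))
```

For these invented residuals no case exceeds the criterion, but the Durbin-Watson value is well above 2 (the residuals alternate in sign), illustrating the negative-autocorrelation pattern described in the Statistics section above.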
Writing Up Multiple Regression Analysis

If you follow the American Psychological Association guidelines for reporting multiple regression, then the implication seems to be that tabulated results are the way forward. They also seem in favour of reporting, as a bare minimum, the standardized betas, their significance value and some general statistics about the model (such as the R²). If you do decide to do a table then the b values and their standard errors are also very useful, and personally I'd like to see the constant as well because then readers of your work can construct the full regression model if they need to. Also, if you've done a hierarchical regression you should report these values at each stage of the hierarchy. So, basically, you want to reproduce the table labelled Coefficients from the SPSS/PASW output and omit some of the non-essential information. For the example in this handout we might produce a table like this:

                     b      SE b     β
  Step 1
    Constant       54.37   28.62
    Shame (TOSCA)  27.45    6.75    .34***
  Step 2
    Constant       51.49   28.09
    Shame (TOSCA)  22.05    6.98    .27**
    OCD (OBQ)       6.92    2.82    .21*

  Note. R² = .12 for Step 1; ΔR² = .04 for Step 2 (ps < .05). *p < .05, **p < .01, ***p < .001.

See if you can look back through the SPSS/PASW output in this handout and work out from where the values came. Things to note are:

(1) I've rounded off to 2 decimal places throughout;
(2) For the standardized betas there is no zero before the decimal point (because these values cannot exceed 1), but for all other values less than one the zero is present;
(3) The significance of the variable is denoted by asterisks with a footnote to indicate the significance level being used (such as *p < .05, **p < .01, and ***p < .001);
(4) The R² for the initial model and the change in R² (denoted as ΔR²) for each subsequent step of the model are reported below the table.

Task 2

A fashion student was interested in factors that predicted the salaries of catwalk models. She collected data from 231 models. For each model she asked them their salary per day on days when they were working (salary), their age (age), and how many years they had worked as a model (years), and then got a panel of experts from modelling agencies to rate the attractiveness of each model as a percentage, with 100% being perfectly attractive (beauty). The data are in the file Supermodel.sav on the course website. Conduct a multiple regression to see which factors predict a model's salary.³

How much variance does the final model explain?

Your Answers:

Which variables significantly predict salary?

Your Answers:

Fill in the values for the following APA-format table of the results:

                       b      SE b     β
  Constant
  Age
  Years as a Model
  Attractiveness

  Note. R² =     . *p < .01, **p < .001.

³ Answers to this task are on the CD-ROM of Field (2005) in the file called Answers (Chapter 5).pdf.

Write out the regression equation for the final model.

Your Answers:

Are the residuals as you would expect for a good model?

Your Answers:

Is there evidence of normality of errors, homoscedasticity and no multicollinearity?

Your Answers:

Task 3

Complete the multiple choice questions for Chapter 7 on the companion website to Field (2009): http://www.uk.sagepub.com/field3e/mcq.htm. If you get any wrong, re-read this handout (or Field, 2009, Chapter 7) and do them again until you get them all correct.

Task 4 (Optional for Extra Practice)

Go back to the output for last week's task (does listening to heavy metal predict suicide risk?). Is the model valid (i.e. are all of the assumptions met)?

This handout contains material from:

Field, A. P. (2009). Discovering statistics using SPSS: and sex and drugs and rock 'n' roll (3rd edition). London: Sage.

This material is copyright Andy Field (2000, 2005, 2009).