Preliminary Report on Simple Statistical Tests (t-tests and bivariate correlations)


After receiving my comments on the preliminary reports of your datasets, the next step for the groups is to complete all data gathering, move everything to SPSS, STATA, or some other statistical software, conduct t-tests and bivariate correlations, and report the findings. You will have until Thursday, February 28, at 5 p.m. to upload these Preliminary Reports on Simple Statistical Tests to your group's common folder. Each report must be a SINGLE, clearly labeled PDF file.

Once you have finished filling in the missing data in your datasets and have converted them into a form that is readable by a statistical analysis program, your group will be set to do some preliminary tests. The report will reflect the following steps:

(1) Explain the logic of the core model(s) that your group has decided to test under the rubric of the chosen hypothesis.

(2) Analyze the differences in means between two groups on the variables of interest using independent-samples t-tests. For example, if you are evaluating to what extent SMEs and LMEs have different standards of living, you might take per capita GDP and, using a dummy variable to separate SMEs from LMEs, have SPSS compare the means of per capita GDP between these two cohorts. You may do as many t-tests as you wish. You need only report (a) the t-statistic and (b) its significance. Something like (-1.23, p<.001) would be fine; the numbers here only illustrate the format. This means that the difference in means is significant at the .001 level. If the statistic is not significant, just say so. You can still report the t-statistic and the p value (for example: (2.55, p=.78)).

(3) For each of your major DVs, evaluate its correlation with selected IVs. This may be done by running separate bivariate regressions for each or by running a correlation matrix. The latter can be done by running a multiple regression and looking only at the correlation matrix.

In SPSS, bivariate correlations are done with the Regression, Linear function under the Analyze drop-down menu. Once you have selected a single DV and a single IV, select the Statistics button and check three things: (1) confidence intervals, (2) descriptives, and (3) collinearity diagnostics. After pressing Continue, select the Plots button. Place ZPRED in the X box and ZRESID in the Y box. Check Histogram. Then run the regression.

In the Output window you will see the Pearson's R correlation statistic, which includes a coefficient of correlation and a significance statistic. Any significance value under .05 is statistically significant. (N is the number of cases.) Then note the R-square and the adjusted R-square. In the Coefficients box, note the unstandardized coefficient and the standardized coefficient (which, in a bivariate regression, is the same as the Pearson's R).
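For groups that want to sanity-check their SPSS output by hand, the two tests described above can be sketched in plain Python. This is only a minimal illustration, not a substitute for the SPSS procedure: the function names are hypothetical, the t-test assumes equal variances (the pooled form), the p-value is obtained by numerically integrating the t density, and the per capita GDP figures are made up.

```python
import math
from statistics import mean, variance


def t_pvalue_two_sided(t, df, steps=2000):
    """Two-sided p-value for a t-statistic with df degrees of freedom.

    Integrates the Student t density from 0 to |t| with Simpson's rule
    (steps must be even) and converts the area to a two-tailed p-value.
    """
    coef = math.exp(math.lgamma((df + 1) / 2) - math.lgamma(df / 2))
    coef /= math.sqrt(df * math.pi)

    def pdf(x):
        return coef * (1 + x * x / df) ** (-(df + 1) / 2)

    upper = abs(t)
    if upper == 0:
        return 1.0
    h = upper / steps
    total = pdf(0.0) + pdf(upper)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * pdf(i * h)
    return max(0.0, 1 - 2 * total * h / 3)


def independent_t_test(group_a, group_b):
    """Pooled-variance (equal variances assumed) independent-samples t-test."""
    n1, n2 = len(group_a), len(group_b)
    df = n1 + n2 - 2
    pooled = ((n1 - 1) * variance(group_a) + (n2 - 1) * variance(group_b)) / df
    t = (mean(group_a) - mean(group_b)) / math.sqrt(pooled * (1 / n1 + 1 / n2))
    return t, df, t_pvalue_two_sided(t, df)


def bivariate_regression(x, y):
    """Regress y (DV) on a single x (IV) and return the statistics to report."""
    n = len(x)
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    syy = sum((yi - my) ** 2 for yi in y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b = sxy / sxx                        # unstandardized coefficient
    a = my - b * mx                      # constant (Y intercept)
    r = sxy / math.sqrt(sxx * syy)       # Pearson's R = standardized coefficient
    r2 = r * r
    adj_r2 = 1 - (1 - r2) * (n - 1) / (n - 2)
    se_b = math.sqrt((syy - b * sxy) / (n - 2) / sxx)
    t_b = b / se_b
    return {"constant": a, "b": b, "std_error": se_b, "beta": r, "t": t_b,
            "p": t_pvalue_two_sided(t_b, n - 2),
            "r_square": r2, "adj_r_square": adj_r2, "n": n}


# Made-up per capita GDP figures (in thousands) for two cohorts.
sme_gdp = [31.0, 32.0, 33.0, 34.0, 35.0]
lme_gdp = [32.0, 33.0, 34.0, 35.0, 36.0]
t_stat, df, p = independent_t_test(sme_gdp, lme_gdp)
print(f"({t_stat:.2f}, p={p:.2f})")
```

The printed pair follows the (t-statistic, p value) reporting format requested in step (2); the dictionary returned by bivariate_regression holds the figures requested in step (3).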

I would like you to report for each of your bivariate tests, in a box, these statistics in the following form (### simply substitutes for the numbers you will have):

Table 1: Test 1 Linear Regression of DV in question

Variable    Unstandardized   Standard   Standardized   T      Significance
            Coefficient      Error      Coefficient
Constant    ###              ###                       ###    ###**
IV          ###              ###        ###            ###    ###

R-square    Adjusted R-square    N
###         ###                  ###

For all variables that are significant at the .05 or .01 levels you will note their significance in the box by making the entry bold (see the example of the Constant above, which is usually significant). You will indicate the level of significance by reporting the p value like so, below the box:

* p < .05
** p < .01

You will place a bold asterisk next to the significance statistic as demonstrated above. Please convert all unstandardized coefficients that SPSS reports in scientific notation to ordinary decimal notation. You may run as many bivariate tests as you believe necessary, but label each test with a title and a number as shown above.

What to Look For

At this stage, you are testing your variables to see how they affect the DV(s). If you get back an insignificant Pearson's R for a variable, then it may be statistically insignificant when you run it in a multivariate model. This assumption is not always true, but you can use it as a rule of thumb. You are also judging the correlation by the R-square (or adjusted R-square) to assess the goodness of fit of the regression line. If the R-square is low, you might have a nonlinear relationship or lurking variables. The latter is very likely in bivariate analysis since you are allowing other variables to vary (i.e., they are not being controlled). In these cases, look at the standardized scatterplots that are produced as part of the SPSS output. If you do not see a random distribution around the center (e.g., you see a curved pattern), then the relationship may be nonlinear. There may also be heteroskedasticity, especially if you see funnel or cone-shaped patterns in the distribution of the data points. You may also run a simple scatterplot from the Graphs drop-down menu in SPSS. If you see evidence of a nonlinear relationship, then report that in prose underneath the table of the test in question.

Another reason to look at the residuals, even if the R-square looks pretty good, is to check the pattern of outliers. In the standardized residual plot generated by SPSS as part of your output, check to see if there are many data points located outside of the -3 to 3 range. If there are, then these are probably outliers. You can click the chart in SPSS and label the cases so that you can identify which are the outliers. Look at those cases in the dataset and try to figure out why they are statistical outliers. Remember, you want a distribution in the simple scatterplot that is as linear as possible and a distribution in the standardized scatterplot that is random.

If in the simple scatterplot you think you are looking at an exponential relationship and not a linear one (and the IV is a socio-economic variable), then you might have to calculate the natural logarithm of the IV in question and run that instead as the IV in another bivariate test. In SPSS this is done with the Compute Variable function in the Transform drop-down menu. Put in your source variable and put in the command for the natural log (LN). (You can also easily do this in Excel by cutting and pasting your IV data into a worksheet, entering the formula =LN(A1) in the adjacent column (where A1 is the first data cell), then selecting that cell and all of the cells underneath it, selecting Fill from the Edit drop-down menu, and then Down.) Once you have the natural log of the variable, cut and paste that into SPSS. Run the regression using the natural log of the variable. That will give it a more even distribution. Again, look at the simple scatterplot and the residuals and see if the relationship looks more linear.

In the histogram, if you see a certain skew in the data to one side or the other, and perhaps a less than perfect bell-shaped curve, then report what you see. We are looking for a normal distribution in the data around zero, that is, a bell-shaped curve.

These are just preliminary tests of the data to see what kinds of relationships might exist. At the same time, you need to be thinking: What makes sense in the abstract? What should be affecting what? What would I expect to see when I run this bivariate test? Then you need to make decisions about which IVs seem to be the most robust predictors of the DV. Then, in the next step, you will specify your multivariate model.
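The two residual checks described above (logging a skewed IV, and looking for standardized residuals outside the -3 to 3 range) can be illustrated with a short sketch. It assumes standardized residuals are the raw residuals divided by the regression's standard error of estimate; the function names are hypothetical and the data are fabricated purely to show the mechanics.

```python
import math
from statistics import mean


def fit_line(x, y):
    """Return (constant, slope) of the least-squares line of y on x."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b = sxy / sxx
    return my - b * mx, b


def r_square(x, y):
    """Share of the variance in y explained by the fitted line."""
    a, b = fit_line(x, y)
    my = mean(y)
    sse = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))
    sst = sum((yi - my) ** 2 for yi in y)
    return 1 - sse / sst


def standardized_residuals(x, y):
    """Residuals divided by the standard error of estimate."""
    a, b = fit_line(x, y)
    resid = [yi - (a + b * xi) for xi, yi in zip(x, y)]
    s = math.sqrt(sum(e * e for e in resid) / (len(x) - 2))
    return [e / s for e in resid]


def flag_outliers(x, y, limit=3.0):
    """Indices of cases whose standardized residual lies outside +/- limit."""
    return [i for i, z in enumerate(standardized_residuals(x, y)) if abs(z) > limit]


# Fabricated IV whose effect on the DV is logarithmic rather than linear.
iv = [1, 2, 4, 8, 16, 32, 64]
dv = [2 + 3 * math.log(v) for v in iv]
ln_iv = [math.log(v) for v in iv]          # same transform as LN in SPSS or Excel
print("R-square, raw IV:", round(r_square(iv, dv), 3))
print("R-square, ln(IV):", round(r_square(ln_iv, dv), 3))

# Fabricated data with one case pushed far off the line.
x = list(range(30))
y = [float(v) for v in x]
y[15] += 50
print("Outlier cases:", flag_outliers(x, y))
```

Comparing the two R-square values shows why the logged IV is worth a second bivariate test; the flagged indices are the cases to look up in the dataset.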

Preliminary Report on Multivariate Regressions

Now that you have finished your first bivariate tests, it is time to specify your multiple regression model(s). Your group must decide (1) which DVs (if there are multiple choices) the model will attempt to explain; (2) which main IV(s) the group will build its argument around; and (3) which control variables will be run in the model(s). Once you have specified the model, lay it out using the standard regression equation:

Y = a + b_1*x_1 + b_2*x_2 + e

Here is an example:

Standard of Living (GDP per capita) = a + b_1(Union Density) + b_2(EPL) + b_3(Unemployment) ... + b_p*X_p ... + e

In this example, the model will regress a DV called standard of living (operationalized as GDP per capita) on three variables: Union Density, the EPL index, and Unemployment. The a represents the constant (the Y intercept). This is where the regression line intercepts the Y axis, that is, the value of the DV when all the IVs are 0. The b_p's are the regression coefficients. The e is the error term, the part of the DV that the IVs do not explain; it is used in calculating statistical significance. (You must use the same formula format but place the variable labels for your IVs and DVs in place of those listed above.) You may omit the + b_p*X_p, as this merely represents the fact that a regression equation may have more than the three IVs listed above. (NOTE: For further reference regarding multiple regression, see the document titled Multiple Regression Instructions, available on the common folder for the class.)

The expectation of this argument is that Union Density and EPL, which are higher for SMEs than for LMEs, will predict change in GDP per capita controlling for the level of unemployment. Your group will no doubt have more than three variables, but you need to specify the model as I did above and you must list the variable names so they may be understood, as I did above.
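To make the equation concrete, here is a minimal sketch of how the constant a and the coefficients b_1 to b_3 are estimated by ordinary least squares, solving the normal equations with plain Gaussian elimination. Everything in it is an assumption for illustration: the function names are hypothetical, the six cases are invented, and the DV is built from known coefficients with no error term so that the estimates can be checked by eye. SPSS or STATA remains the tool for the actual report.

```python
def solve_linear_system(matrix, rhs):
    """Solve matrix * x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    m = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("IVs are perfectly collinear; model cannot be estimated")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        known = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - known) / m[i][i]
    return x


def fit_ols(ivs, dv):
    """Return [a, b_1, ..., b_p] for dv = a + b_1*x_1 + ... + b_p*x_p + e."""
    rows = [[1.0] + [col[i] for col in ivs] for i in range(len(dv))]
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, dv)) for i in range(k)]
    return solve_linear_system(xtx, xty)


# Invented values for six cases (not real country data).
union_density = [10.0, 20.0, 30.0, 40.0, 50.0, 60.0]
epl = [1.0, 2.5, 2.0, 3.5, 3.0, 1.5]
unemployment = [5.0, 7.0, 4.0, 9.0, 6.0, 8.0]

# DV constructed from known coefficients, with the error term set to zero.
gdp_per_capita = [30 + 0.2 * u + 1.5 * e - 0.8 * m
                  for u, e, m in zip(union_density, epl, unemployment)]

coefficients = fit_ols([union_density, epl, unemployment], gdp_per_capita)
for label, value in zip(["a", "b_1 (Union Density)", "b_2 (EPL)",
                         "b_3 (Unemployment)"], coefficients):
    print(f"{label}: {value:.3f}")
```

Because the invented DV contains no error term, the fit simply recovers the coefficients used to build it; with real data the estimates come with standard errors, which is what the Significance column in the report table summarizes.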
The statistical results of your model will be reported using the same format you used for the bivariate analyses, with some minor modifications (i.e., I wish you to report the number of cases this time):

Table 1: Linear Regression of Standard of Living

Variable         Unstandardized   Standard   Standardized   T      Significance
                 Coefficient      Error      Coefficient
Constant         ###              ###                       ###    ###**
Union Density    ###              ###        ###            ###    ###
EPL Index        ###              ###        ###            ###    ###
Unemployment     ###              ###        ###            ###    ###

R-square    Adjusted R-square    N
###         ###                  ###

For all variables that are significant at the .05 or .01 levels you will note their significance in the box by making the entry bold (see the example of the Constant above, which is usually significant). You will indicate the level of significance by reporting the p value like so, below the box:

* - p < .05
** - p < .01

You will place a bold asterisk next to the significance statistic as demonstrated above. Please convert all unstandardized coefficients that SPSS reports in scientific notation to ordinary decimal notation.

Note: Variable names in your dataset may be shortened (e.g., UNIDEN, EPLI, UNEMP, etc.). But when you report the output of your regressions, I do not wish to see a cut and paste job of the SPSS or STATA output!!! I will ONLY accept a proper table with the full names of the variables written out as illustrated above. If you wish to present more than one multiple regression model, then you must do so in a single table that nests the models, one after the other. Please see me about this.

The above will constitute the final preliminary report of your group's data analysis prior to the final presentation. This report will be uploaded as a SINGLE PDF FILE by Monday, March 3, at 9 a.m.

Additional Considerations About Your Data

Groups must evaluate several diagnostics in their multiple regression output. Two in particular will be of some concern to us: (1) heteroskedasticity and (2) multicollinearity.

Heteroskedasticity

As in the bivariate correlations, you should plot the standardized residuals. In SPSS this is the ZRESID vs. ZPRED plot referred to above. The plot should show a random pattern, with no nonlinearity or heteroskedasticity. If you get a funnel shape or a cone, then there is a heteroskedasticity problem. See the professor.

Multicollinearity

There are many kinds of multicollinearity, but we will be concerned mostly with one: multicollinearity among the IVs. This inflates the standard errors, which makes assessment of the significance of any given IV unreliable. You can detect multicollinearity in a number of ways. First, look at the correlation matrix in a multiple regression output. Which independent variables have reasonably high Pearson's coefficients with other IVs? Scores > .60 might be a cause of concern. Second, generate tolerance and variance inflation factor diagnostics when you run the regression. These statistics are based on regressing each IV on all of the others. The following, which is taken from the document Multiple Regression Instructions, summarizes how to use tolerance and VIF statistics:

o Tolerance is 1 - R^2 for the regression of that independent variable on all the other independents, ignoring the dependent. There will be as many tolerance coefficients as there are independents. The higher the intercorrelation of the independents, the more the tolerance will approach zero. As a rule of thumb, if tolerance is less than .20, a problem with multicollinearity is indicated. In SPSS 13, select Analyze, Regression, Linear; click Statistics; check Collinearity diagnostics to get tolerance. When tolerance is close to 0 there is high multicollinearity of that variable with other independents and the b and beta coefficients will be unstable. The more the multicollinearity, the lower the tolerance and the larger the standard error of the regression coefficients. Tolerance is part of the denominator in the formula for calculating the confidence limits on the b (partial regression) coefficient.

o Variance-inflation factor (VIF) is simply the reciprocal of tolerance. Therefore, when VIF is high there is high multicollinearity and instability of the b and beta coefficients. VIF and tolerance are found in the SPSS output section on collinearity statistics. The table below shows the inflationary impact on the standard error of the regression coefficient (b) of the jth independent variable for various levels of multiple correlation (R_j), tolerance, and VIF (adapted from Fox, 1991: 12). In SPSS 13, select Analyze, Regression, Linear; click Statistics; check Collinearity diagnostics to get VIF. Note that in the "Impact on SE" column, 1.0 corresponds to no impact, 2.0 to doubling the standard error, etc.
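Since tolerance, VIF, and the impact on the standard error all follow from R_j, rows of this kind can be recomputed directly. The sketch below assumes that the impact on SE equals the square root of VIF, which agrees with the standard error doubling at VIF = 4; the function name is hypothetical and the R_j values are illustrative choices, not the rows of the Fox table.

```python
import math


def collinearity_row(r_j):
    """Return (tolerance, VIF, impact on SE of b) for a multiple correlation R_j."""
    tolerance = 1 - r_j ** 2        # 1 - R^2 from regressing IV j on the other IVs
    vif = 1 / tolerance             # VIF is the reciprocal of tolerance
    impact_on_se = math.sqrt(vif)   # factor by which the SE of b is inflated
    return tolerance, vif, impact_on_se


print(f"{'R_j':>6} {'Tolerance':>10} {'VIF':>6} {'Impact on SE':>13}")
for r_j in (0.0, 0.4, 0.6, 0.8, 0.9):   # illustrative values only
    tol, vif, impact = collinearity_row(r_j)
    print(f"{r_j:>6.2f} {tol:>10.2f} {vif:>6.2f} {impact:>13.2f}")
```

Running it for the R_j of each IV in your own model gives a quick check against the tolerance and VIF columns in the SPSS collinearity statistics.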

[Table, adapted from Fox (1991: 12). Columns: R_j, Tolerance, VIF, Impact on SE of b. The numeric rows are not preserved in this transcription.]

o Standard error is doubled when VIF is 4.0 and tolerance is .25, corresponding to R_j = .87. Therefore VIF >= 4 is an arbitrary but common cut-off criterion for deciding when a given independent variable displays "too much" multicollinearity: values above 4 suggest a multicollinearity problem. Some researchers use the more lenient cutoff of 5.0: if VIF >= 5, then multicollinearity is a problem.

Groups ought to discuss their findings with the professor prior to the presentation. Issues involving diagnostic results can be evaluated at that time.

The Oral Presentation

Your group will have specified time limits (8-10 minutes) to present your findings to the class during one of the two classroom sessions dedicated to the data analysis group presentations. You may set up your presentation using whatever techniques you would like, but it must include the following content:

(1) an explanation of your main argument, including its logic and its importance,
(2) an overview of how you gathered your data,
(3) a statement of what expectations you had prior to conducting your analysis,
(4) an explanation of how you did your bivariate tests on the data and how you specified your multiple regression model,
(5) a discussion of the results of your model using a graphic display (PowerPoint or MS Word are fine tools for this),
(6) conclusions, including suggested areas for further research.

Immediately following your presentation, the group members are to be peppered with insightful and keen questions from the audience. After a few minutes of cross-examination, your presentation will be over.


More information

Multiple Linear Regression Analysis

Multiple Linear Regression Analysis Revised July 2018 Multiple Linear Regression Analysis This set of notes shows how to use Stata in multiple regression analysis. It assumes that you have set Stata up on your computer (see the Getting Started

More information

Dan Byrd UC Office of the President

Dan Byrd UC Office of the President Dan Byrd UC Office of the President 1. OLS regression assumes that residuals (observed value- predicted value) are normally distributed and that each observation is independent from others and that the

More information

Statistics for Psychology

Statistics for Psychology Statistics for Psychology SIXTH EDITION CHAPTER 12 Prediction Prediction a major practical application of statistical methods: making predictions make informed (and precise) guesses about such things as

More information

Still important ideas

Still important ideas Readings: OpenStax - Chapters 1 13 & Appendix D & E (online) Plous Chapters 17 & 18 - Chapter 17: Social Influences - Chapter 18: Group Judgments and Decisions Still important ideas Contrast the measurement

More information

Stepwise method Modern Model Selection Methods Quantile-Quantile plot and tests for normality

Stepwise method Modern Model Selection Methods Quantile-Quantile plot and tests for normality Week 9 Hour 3 Stepwise method Modern Model Selection Methods Quantile-Quantile plot and tests for normality Stat 302 Notes. Week 9, Hour 3, Page 1 / 39 Stepwise Now that we've introduced interactions,

More information

Reveal Relationships in Categorical Data

Reveal Relationships in Categorical Data SPSS Categories 15.0 Specifications Reveal Relationships in Categorical Data Unleash the full potential of your data through perceptual mapping, optimal scaling, preference scaling, and dimension reduction

More information

Understandable Statistics

Understandable Statistics Understandable Statistics correlated to the Advanced Placement Program Course Description for Statistics Prepared for Alabama CC2 6/2003 2003 Understandable Statistics 2003 correlated to the Advanced Placement

More information

Problem Set 3 ECN Econometrics Professor Oscar Jorda. Name. ESSAY. Write your answer in the space provided.

Problem Set 3 ECN Econometrics Professor Oscar Jorda. Name. ESSAY. Write your answer in the space provided. Problem Set 3 ECN 140 - Econometrics Professor Oscar Jorda Name ESSAY. Write your answer in the space provided. 1) Sir Francis Galton, a cousin of James Darwin, examined the relationship between the height

More information

Section 6: Analysing Relationships Between Variables

Section 6: Analysing Relationships Between Variables 6. 1 Analysing Relationships Between Variables Section 6: Analysing Relationships Between Variables Choosing a Technique The Crosstabs Procedure The Chi Square Test The Means Procedure The Correlations

More information

ANOVA in SPSS (Practical)

ANOVA in SPSS (Practical) ANOVA in SPSS (Practical) Analysis of Variance practical In this practical we will investigate how we model the influence of a categorical predictor on a continuous response. Centre for Multilevel Modelling

More information

Chapter 4. More On Bivariate Data. More on Bivariate Data: 4.1: Transforming Relationships 4.2: Cautions about Correlation

Chapter 4. More On Bivariate Data. More on Bivariate Data: 4.1: Transforming Relationships 4.2: Cautions about Correlation Chapter 4 More On Bivariate Data Chapter 3 discussed methods for describing and summarizing bivariate data. However, the focus was on linear relationships. In this chapter, we are introduced to methods

More information

Describe what is meant by a placebo Contrast the double-blind procedure with the single-blind procedure Review the structure for organizing a memo

Describe what is meant by a placebo Contrast the double-blind procedure with the single-blind procedure Review the structure for organizing a memo Business Statistics The following was provided by Dr. Suzanne Delaney, and is a comprehensive review of Business Statistics. The workshop instructor will provide relevant examples during the Skills Assessment

More information

Chapter 14: More Powerful Statistical Methods

Chapter 14: More Powerful Statistical Methods Chapter 14: More Powerful Statistical Methods Most questions will be on correlation and regression analysis, but I would like you to know just basically what cluster analysis, factor analysis, and conjoint

More information

From Bivariate Through Multivariate Techniques

From Bivariate Through Multivariate Techniques A p p l i e d S T A T I S T I C S From Bivariate Through Multivariate Techniques R e b e c c a M. W a r n e r University of New Hampshire DAI HOC THAI NGUYEN TRUNG TAM HOC LIEU *)SAGE Publications '55'

More information

Small Group Presentations

Small Group Presentations Admin Assignment 1 due next Tuesday at 3pm in the Psychology course centre. Matrix Quiz during the first hour of next lecture. Assignment 2 due 13 May at 10am. I will upload and distribute these at the

More information

How to describe bivariate data

How to describe bivariate data Statistics Corner How to describe bivariate data Alessandro Bertani 1, Gioacchino Di Paola 2, Emanuele Russo 1, Fabio Tuzzolino 2 1 Department for the Treatment and Study of Cardiothoracic Diseases and

More information

This tutorial presentation is prepared by. Mohammad Ehsanul Karim

This tutorial presentation is prepared by. Mohammad Ehsanul Karim STATA: The Red tutorial STATA: The Red tutorial This tutorial presentation is prepared by Mohammad Ehsanul Karim ehsan.karim@gmail.com STATA: The Red tutorial This tutorial presentation is prepared by

More information

On the purpose of testing:

On the purpose of testing: Why Evaluation & Assessment is Important Feedback to students Feedback to teachers Information to parents Information for selection and certification Information for accountability Incentives to increase

More information

Chapter 3 CORRELATION AND REGRESSION

Chapter 3 CORRELATION AND REGRESSION CORRELATION AND REGRESSION TOPIC SLIDE Linear Regression Defined 2 Regression Equation 3 The Slope or b 4 The Y-Intercept or a 5 What Value of the Y-Variable Should be Predicted When r = 0? 7 The Regression

More information

Regression Discontinuity Analysis

Regression Discontinuity Analysis Regression Discontinuity Analysis A researcher wants to determine whether tutoring underachieving middle school students improves their math grades. Another wonders whether providing financial aid to low-income

More information

Stat 13, Lab 11-12, Correlation and Regression Analysis

Stat 13, Lab 11-12, Correlation and Regression Analysis Stat 13, Lab 11-12, Correlation and Regression Analysis Part I: Before Class Objective: This lab will give you practice exploring the relationship between two variables by using correlation, linear regression

More information

Using SPSS for Correlation

Using SPSS for Correlation Using SPSS for Correlation This tutorial will show you how to use SPSS version 12.0 to perform bivariate correlations. You will use SPSS to calculate Pearson's r. This tutorial assumes that you have: Downloaded

More information

Example of Interpreting and Applying a Multiple Regression Model

Example of Interpreting and Applying a Multiple Regression Model Example of Interpreting and Applying a Multiple Regression We'll use the same data set as for the bivariate correlation example -- the criterion is 1 st year graduate grade point average and the predictors

More information

Analysis and Interpretation of Data Part 1

Analysis and Interpretation of Data Part 1 Analysis and Interpretation of Data Part 1 DATA ANALYSIS: PRELIMINARY STEPS 1. Editing Field Edit Completeness Legibility Comprehensibility Consistency Uniformity Central Office Edit 2. Coding Specifying

More information

Pitfalls in Linear Regression Analysis

Pitfalls in Linear Regression Analysis Pitfalls in Linear Regression Analysis Due to the widespread availability of spreadsheet and statistical software for disposal, many of us do not really have a good understanding of how to use regression

More information

3 CONCEPTUAL FOUNDATIONS OF STATISTICS

3 CONCEPTUAL FOUNDATIONS OF STATISTICS 3 CONCEPTUAL FOUNDATIONS OF STATISTICS In this chapter, we examine the conceptual foundations of statistics. The goal is to give you an appreciation and conceptual understanding of some basic statistical

More information

Still important ideas

Still important ideas Readings: OpenStax - Chapters 1 11 + 13 & Appendix D & E (online) Plous - Chapters 2, 3, and 4 Chapter 2: Cognitive Dissonance, Chapter 3: Memory and Hindsight Bias, Chapter 4: Context Dependence Still

More information

Charts Worksheet using Excel Obesity Can a New Drug Help?

Charts Worksheet using Excel Obesity Can a New Drug Help? Worksheet using Excel 2000 Obesity Can a New Drug Help? Introduction Obesity is known to be a major health risk. The data here arise from a study which aimed to investigate whether or not a new drug, used

More information

Describe what is meant by a placebo Contrast the double-blind procedure with the single-blind procedure Review the structure for organizing a memo

Describe what is meant by a placebo Contrast the double-blind procedure with the single-blind procedure Review the structure for organizing a memo Please note the page numbers listed for the Lind book may vary by a page or two depending on which version of the textbook you have. Readings: Lind 1 11 (with emphasis on chapters 10, 11) Please note chapter

More information

Psy201 Module 3 Study and Assignment Guide. Using Excel to Calculate Descriptive and Inferential Statistics

Psy201 Module 3 Study and Assignment Guide. Using Excel to Calculate Descriptive and Inferential Statistics Psy201 Module 3 Study and Assignment Guide Using Excel to Calculate Descriptive and Inferential Statistics What is Excel? Excel is a spreadsheet program that allows one to enter numerical values or data

More information

Section 3 Correlation and Regression - Teachers Notes

Section 3 Correlation and Regression - Teachers Notes The data are from the paper: Exploring Relationships in Body Dimensions Grete Heinz and Louis J. Peterson San José State University Roger W. Johnson and Carter J. Kerk South Dakota School of Mines and

More information

AP STATISTICS 2010 SCORING GUIDELINES

AP STATISTICS 2010 SCORING GUIDELINES AP STATISTICS 2010 SCORING GUIDELINES Question 1 Intent of Question The primary goals of this question were to assess students ability to (1) apply terminology related to designing experiments; (2) construct

More information

List of Figures. List of Tables. Preface to the Second Edition. Preface to the First Edition

List of Figures. List of Tables. Preface to the Second Edition. Preface to the First Edition List of Figures List of Tables Preface to the Second Edition Preface to the First Edition xv xxv xxix xxxi 1 What Is R? 1 1.1 Introduction to R................................ 1 1.2 Downloading and Installing

More information

Effects of Nutrients on Shrimp Growth

Effects of Nutrients on Shrimp Growth Data Set 5: Effects of Nutrients on Shrimp Growth Statistical setting This Handout is an example of extreme collinearity of the independent variables, and of the methods used for diagnosing this problem.

More information

Psychology Research Process

Psychology Research Process Psychology Research Process Logical Processes Induction Observation/Association/Using Correlation Trying to assess, through observation of a large group/sample, what is associated with what? Examples:

More information

Before we get started:

Before we get started: Before we get started: http://arievaluation.org/projects-3/ AEA 2018 R-Commander 1 Antonio Olmos Kai Schramm Priyalathta Govindasamy Antonio.Olmos@du.edu AntonioOlmos@aumhc.org AEA 2018 R-Commander 2 Plan

More information

M15_BERE8380_12_SE_C15.6.qxd 2/21/11 8:21 PM Page Influence Analysis 1

M15_BERE8380_12_SE_C15.6.qxd 2/21/11 8:21 PM Page Influence Analysis 1 M15_BERE8380_12_SE_C15.6.qxd 2/21/11 8:21 PM Page 1 15.6 Influence Analysis FIGURE 15.16 Minitab worksheet containing computed values for the Studentized deleted residuals, the hat matrix elements, and

More information

Math 075 Activities and Worksheets Book 2:

Math 075 Activities and Worksheets Book 2: Math 075 Activities and Worksheets Book 2: Linear Regression Name: 1 Scatterplots Intro to Correlation Represent two numerical variables on a scatterplot and informally describe how the data points are

More information

TEACHING REGRESSION WITH SIMULATION. John H. Walker. Statistics Department California Polytechnic State University San Luis Obispo, CA 93407, U.S.A.

TEACHING REGRESSION WITH SIMULATION. John H. Walker. Statistics Department California Polytechnic State University San Luis Obispo, CA 93407, U.S.A. Proceedings of the 004 Winter Simulation Conference R G Ingalls, M D Rossetti, J S Smith, and B A Peters, eds TEACHING REGRESSION WITH SIMULATION John H Walker Statistics Department California Polytechnic

More information

Lab 8: Multiple Linear Regression

Lab 8: Multiple Linear Regression Lab 8: Multiple Linear Regression 1 Grading the Professor Many college courses conclude by giving students the opportunity to evaluate the course and the instructor anonymously. However, the use of these

More information

Readings: Textbook readings: OpenStax - Chapters 1 13 (emphasis on Chapter 12) Online readings: Appendix D, E & F

Readings: Textbook readings: OpenStax - Chapters 1 13 (emphasis on Chapter 12) Online readings: Appendix D, E & F Readings: Textbook readings: OpenStax - Chapters 1 13 (emphasis on Chapter 12) Online readings: Appendix D, E & F Plous Chapters 17 & 18 Chapter 17: Social Influences Chapter 18: Group Judgments and Decisions

More information

Hour 2: lm (regression), plot (scatterplots), cooks.distance and resid (diagnostics) Stat 302, Winter 2016 SFU, Week 3, Hour 1, Page 1

Hour 2: lm (regression), plot (scatterplots), cooks.distance and resid (diagnostics) Stat 302, Winter 2016 SFU, Week 3, Hour 1, Page 1 Agenda for Week 3, Hr 1 (Tuesday, Jan 19) Hour 1: - Installing R and inputting data. - Different tools for R: Notepad++ and RStudio. - Basic commands:?,??, mean(), sd(), t.test(), lm(), plot() - t.test()

More information

STAT 503X Case Study 1: Restaurant Tipping

STAT 503X Case Study 1: Restaurant Tipping STAT 503X Case Study 1: Restaurant Tipping 1 Description Food server s tips in restaurants may be influenced by many factors including the nature of the restaurant, size of the party, table locations in

More information

Intro to SPSS. Using SPSS through WebFAS

Intro to SPSS. Using SPSS through WebFAS Intro to SPSS Using SPSS through WebFAS http://www.yorku.ca/computing/students/labs/webfas/ Try it early (make sure it works from your computer) If you need help contact UIT Client Services Voice: 416-736-5800

More information

From Biostatistics Using JMP: A Practical Guide. Full book available for purchase here. Chapter 1: Introduction... 1

From Biostatistics Using JMP: A Practical Guide. Full book available for purchase here. Chapter 1: Introduction... 1 From Biostatistics Using JMP: A Practical Guide. Full book available for purchase here. Contents Dedication... iii Acknowledgments... xi About This Book... xiii About the Author... xvii Chapter 1: Introduction...

More information