Bivariate Correlations

Bivariate Correlations Brawijaya Professional Statistical Analysis BPSA MALANG Jl. Kertoasri 66 Malang (0341) 580342 081 753 3962

The Bivariate Correlations procedure computes the pairwise associations for a set of variables and displays the results in a matrix. It is useful for determining the strength and direction of the association between two scale or ordinal variables.

Using Correlations to Study the Association between Motor Vehicle Sales and Fuel Efficiency

In order to increase sales, motor vehicle design engineers want to focus their attention on aspects of the vehicle that are important to customers. For example, how important is fuel efficiency with respect to sales? One way to measure this is to compute the correlation between past sales and fuel efficiency.

Information concerning various makes of motor vehicles is collected in car_sales.sav. Use Bivariate Correlations to measure the importance of fuel efficiency to the salability of a motor vehicle.

Running the Analysis

To run a correlations analysis, from the menus choose: Analyze > Correlate > Bivariate...

Select Sales in thousands and Fuel efficiency as analysis variables. Click OK.

These selections produce a correlation matrix for Sales in thousands and Fuel efficiency.

Correlation Matrix

The Pearson correlation coefficient measures the linear association between two scale variables. The correlation reported in the table is, surprisingly, negative, although not significantly different from 0. Taken at face value, this suggests that designers should not focus their efforts on making cars more fuel efficient, because fuel efficiency has no appreciable effect on sales.

However, the Pearson correlation coefficient works best when the variables are approximately normally distributed and have no outliers. A scatterplot can reveal these possible problems.
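To see why outliers matter for the Pearson coefficient, it can help to compute it by hand outside SPSS. The following is a minimal Python sketch, not the SPSS implementation; the function name `pearson_r` and all data values are hypothetical and are not taken from car_sales.sav.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sum of cross-products and sums of squared deviations from the means
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical values for illustration only (NOT from car_sales.sav).
# The last vehicle has an unusually high fuel efficiency.
fuel_efficiency = [21.0, 24.0, 26.0, 27.0, 30.0, 45.0]
sales_thousands = [95.0, 40.0, 120.0, 60.0, 150.0, 20.0]

r_all = pearson_r(fuel_efficiency, sales_thousands)
r_trimmed = pearson_r(fuel_efficiency[:-1], sales_thousands[:-1])
print(f"r with the extreme case:    {r_all:.3f}")
print(f"r without the extreme case: {r_trimmed:.3f}")
```

In this made-up data set a single extreme case is enough to change the sign of the coefficient, which is the kind of problem a scatterplot is meant to expose.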

To produce a scatterplot of Sales in thousands by Fuel efficiency, from the menus choose: Graphs > Scatter...

Click Define.

Select Sales in thousands as the y variable and Fuel efficiency as the x variable. Click OK. The resulting scatterplot shows a point that is far to the right of the others.

To identify the point, activate the graph by double-clicking on it. Click the Point Identification tool. Select the point. It is identified as case 27.

The scatterplot shows that Sales in thousands has a skewed distribution. Moreover, there is an outlier in Fuel efficiency. By fixing these problems, you can improve the estimates of the correlation.

Improving Correlation Estimates

Since Sales in thousands is heavily right-skewed, try replacing it with Log-transformed sales in further analyses. The Data Editor shows that case 27, the Metro, is not representative of the vehicles that your design team is working on, so you can safely remove it from further analyses.

To remove the Metro from the correlation computations, from the menus choose: Data > Select Cases...

Select If condition is satisfied and click If. Type model ~= 'Metro' in the text box. Click Continue. Click OK in the Select Cases dialog box.

A new filter variable has been created, so that further computations use all cases except the Metro.
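The two data-preparation steps described here, excluding the Metro and working with log-transformed sales, can be sketched in Python as follows. This is an illustration under stated assumptions: the record layout, the field names (`model`, `sales`, `mpg`, `ln_sales`) and the values are hypothetical, and the natural logarithm is assumed because the text does not state which base the Log-transformed sales variable uses.

```python
import math

# Hypothetical records; field names and values are illustrative only
cars = [
    {"model": "Alpha", "sales": 16.9, "mpg": 28.0},
    {"model": "Metro", "sales": 21.9, "mpg": 45.0},
    {"model": "Beta", "sales": 230.9, "mpg": 25.0},
]

# Same logic as the Select Cases condition: model ~= 'Metro'
selected = [car for car in cars if car["model"] != "Metro"]

# Log-transform sales to pull in the long right tail
# (natural log is an assumption, see the note above)
for car in selected:
    car["ln_sales"] = math.log(car["sales"])

for car in selected:
    print(car["model"], round(car["ln_sales"], 3))
```

Unlike the SPSS filter, which keeps the excluded case in the data file and merely ignores it, this sketch builds a separate list of the selected cases.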

To analyze the filtered data, recall the Bivariate Correlations dialog box. Deselect Sales in thousands as an analysis variable. Select Log-transformed sales as an analysis variable. Click OK.

Correlations

After removing the outlier and looking at the log-transformed sales, the correlation is now positive but still not significantly different from 0.

However, the customer demographics for trucks and automobiles are different, and the reasons for buying a truck or a car may not be the same. It's worthwhile to look at another scatterplot, this time marking trucks and autos separately.

To produce a scatterplot of Log-transformed sales by Fuel efficiency, controlling for Vehicle type, recall the Simple Scatterplot dialog box. Deselect Sales in thousands and select Log-transformed sales as the y variable. Select Vehicle type as the variable to set markers by. Click OK.

The scatterplot shows that trucks and automobiles form distinctly different groups. By splitting the data file according to Vehicle type, you might get a more accurate view of the association.

Improving Correlation Estimates: Splitting the File

To split the data file according to Vehicle type, from the menus choose: Data > Split File...

Select Compare groups. Select Vehicle type as the variable on which groups should be based. Click OK.

To analyze the split file, recall the Bivariate Correlations dialog box. Click OK.

Correlations

Splitting the file on Vehicle type has made the relationship between sales and fuel efficiency much clearer. There is a significant and fairly strong positive correlation between sales and fuel efficiency for automobiles. For trucks, the correlation is positive but not significantly different from 0.

Reaching these conclusions has required some work and shown that correlation analysis using the Pearson correlation coefficient is not always straightforward. For comparison, see how you can avoid the difficulty of transforming variables by using nonparametric correlation measures.

Nonparametric Correlation Estimates

Spearman's rho and Kendall's tau-b measure the rank-order association between two scale or ordinal variables. They work regardless of the distributions of the variables.

To obtain an analysis using Spearman's rho, recall the Bivariate Correlations dialog box. Select Sales in thousands as an analysis variable. Deselect Pearson and select Spearman. Click OK.

Correlations

Spearman's rho is reported separately for automobiles and trucks. As with Pearson's correlation coefficient, the association between Log-transformed sales and Fuel efficiency is fairly strong. However, Spearman's rho reports the same correlation for the untransformed sales! This is because rho is based on rank orders, which are unchanged by log transformation. Moreover, outliers have less of an effect on Spearman's rho, so it's possible to save some time and effort by using it as a measure of association.

Summary

Using Bivariate Correlations, you produced a correlation matrix for Sales in thousands by Fuel efficiency and, surprisingly, found a negative correlation. Upon removing an outlier and using Log-transformed sales, the correlation became positive, although not significantly different from 0. However, you found that by computing the correlations separately for trucks and autos, there is a positive and statistically significant correlation between sales and fuel efficiency for automobiles.

Furthermore, you found similar results without the transformation using Spearman's rho, and perhaps are wondering why you should go through the effort of transforming variables when Spearman's rho is so convenient. The measures of rank order are handy for discovering whether there is any kind of association between two variables, but when they find an association it's a good idea to find a transformation that makes the relationship linear. This is because there are more predictive models available for linear relationships, and the linear models are generally easier to implement and interpret.

Related Procedures

The Bivariate Correlations procedure is useful for studying the pairwise associations for a set of scale or ordinal variables. If you have nominal variables, use the Crosstabs procedure to obtain measures of association. If you want to model the value of a scale variable based on its linear relationship to other variables, try the Linear Regression procedure. If you want to decompose the variation in your data to look for underlying patterns, try the Factor Analysis procedure.

See the following text for more information on correlations:

Hays, W. L. 1981. Statistics. New York: Holt, Rinehart, and Winston.