Lab 5a Exploring Correlation

The correlation coefficient measures how tightly the points on a scatterplot cluster around a line. In this lab we will examine scatterplots and correlation coefficients for many pairs of variables. We will look at data from the EPA evaluation of the fuel economy of 2013 model year cars (see www.fueleconomy.gov) and from a Statistics class survey.

AT THE COMPUTER

In this lab, we begin by practicing how to judge the value of a correlation from a scatterplot. We will start with a data set that deals with cars and gas mileage. The data set Cars2013 lists characteristics of car models from the 2013 model year. With the recent increases in fuel cost, many people are concerned with fuel mileage. Let's study the relationship of mileage with other variables. Correlations and scatterplots can help us understand relationships between variables for these cars.

1. We begin by examining engine displacement (liters) and the city fuel mileage. Engine displacement indicates the size of a vehicle's engine. In general, large or high-performance vehicles have larger engines. Remember that in a positive correlation, as one variable increases the other also increases. In a negative correlation, as one variable increases the other decreases. Do you think that the engine displacement of the car and the fuel mileage (miles per gallon) would be positively related, negatively related, or have a correlation near zero?

2. Next, think about the amount of luggage space (cubic feet) a car has. What type of relationship do you feel this variable would have with fuel mileage?

3. Next, let's consider the relationship between city gas mileage and highway gas mileage. What type of relationship do you feel these variables would have?

4. Finally, consider the relationship between the amount of passenger space of the car and the amount of luggage space. What type of relationship do you feel these variables would have?
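For readers who want to see what the software computes, the following is a minimal Python sketch of the usual Pearson correlation coefficient, which is assumed here to be the "correlation" that Data Desk and CrunchIt report. The function name pearson_r and the data values are hypothetical and made up for illustration; they are not taken from the Cars2013 file.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length (at least 2 points)")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-products and sums of squares of deviations from the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when a variable is constant")
    return sxy / math.sqrt(sxx * syy)

# Made-up illustrative values, NOT taken from the Cars2013 file.
size = [1.0, 2.0, 3.0, 4.0, 5.0]
rising = [2.0, 4.1, 5.9, 8.2, 9.8]    # tends to increase as size increases
falling = [9.8, 8.2, 5.9, 4.1, 2.0]   # tends to decrease as size increases

print(round(pearson_r(size, rising), 3))   # positive value: positive correlation
print(round(pearson_r(size, falling), 3))  # negative value: negative correlation
```

The sign of the result gives the direction of the relationship, and values closer to 1 or -1 indicate points that cluster more tightly around a line.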

Software Tip: Creating Scatterplots
In Data Desk, select the response variable of interest (this places a Y on the variable). To select the independent variable, hold the Shift key while selecting the variable (an X will be placed on the variable). Then choose Scatterplot under the Plot menu.
In CrunchIt, click Graphics>Scatterplot. Choose the variables of interest and put them in the Y and X boxes. If you'd like more help, watch the CrunchIt help video on Correlation.

Now let's see what the actual data indicate for these variables by making scatterplots of each pair of variables. Open the Cars2013 data file and make scatterplots of the pairs of variables we previously discussed.

5. How did your predictions compare with the actual scatterplots? Did you predict any positive correlations to be negative or vice versa? Mention any differences here.

6. Examine the scatterplots you have created.

a. Which of the correlations appears to be the strongest? Remember that a strong correlation is one in which the points are tightly packed near a straight line.

b. By looking at the scatterplots, what correlation would you expect for these variables? Make a guess rounded to one decimal place along with a direction (positive or negative). Write your guess in the appropriate space below.

Variables                              My correlation guess    Actual correlation
displacement and mpg:city              ____________________    __________________
space:luggage and mpg:city             ____________________    __________________
mpg:city and mpg:highway               ____________________    __________________
space:luggage and space:passenger      ____________________    __________________

c. Now calculate the actual correlations using software, and record them in the table provided. Which of your guesses was off by the most?

Software Tip: Calculating Correlation
To calculate correlations in Data Desk, use the hyperview triangle in the upper-left corner of the scatterplot you created for that pair of variables. Choose the Correlation option.
In CrunchIt, click Statistics>Correlation. Click to choose the variables of interest.

It is good practice to look at the scatterplot before calculating the correlation coefficient in order to see whether it is an appropriate measure of the strength of the association. For example, you should check whether the pattern of association between the two variables is linear and look for possible explanations for any outliers.

7a. Does there seem to be a nonlinear relationship between any of the pairs of variables you examined? Which ones?

7b. Look at the scatterplot of city mileage versus highway mileage for the cars in the data set. Try color-coding the points using some of the other variables, such as Drive Type and whether the car is a gas-electric hybrid. Explain what you learn from each picture. To add a color code, click on the variable that codes the colors and then use Modify>Colors>Add>by Group (in Data Desk), or use the Group by option in the CrunchIt Scatterplot dialog box.
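The advice above about checking the scatterplot for linearity before computing a correlation can be illustrated with a small Python sketch. The data are made up (a perfect U-shaped curve) and the helper pearson_r is a hypothetical name; neither comes from the lab's data files or software.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Made-up data: y is completely determined by x, but the pattern is a curve.
xs = [-3, -2, -1, 0, 1, 2, 3]
ys = [x ** 2 for x in xs]

r = pearson_r(xs, ys)
print(r)  # the correlation is zero even though x and y are strongly related
```

Because the correlation coefficient only measures how closely points follow a straight line, a strong curved pattern like this one can produce a value at or near zero. A scatterplot shows the curve immediately; the single number does not.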

How does changing the unit of measurement change the correlation between variables? We can explore this by converting the engine displacement to a different unit and re-examining its relationship with the mileage of the car. For the last decade engine displacement has been given in liters, but previously most American cars listed their engine displacement in cubic inches. How do the correlations change when we convert liters to cubic inches? We can find out by calculating a new variable that multiplies engine size in liters by 61 (there are approximately 61 cubic inches in a liter).

Software Tip: Creating a New Variable
In Data Desk, click Manip>Transform>New Derived Variable. Give the variable a name of your choice and click OK. A window will appear in which you should type the formula for the new variable. In Data Desk, be sure to put the variable name in single quotes, for example: 'displacement'*61.
In CrunchIt, click Insert>Evaluate Formula. In the formula box, type the name of the variable and the calculation you want. In CrunchIt, put the name of the variable in square brackets, for example: [Displacement]*61. The new variable will be inserted in the next column of the worksheet.

8. Create a scatterplot of your new variable and the city mileage variable. Compare this scatterplot with the scatterplot of displacement and city mileage you made earlier. How does the pattern of this scatterplot compare with the previous one?

9. Calculate the correlation between the new variable and city mileage. How does this compare with the correlation you found earlier between displacement in liters and city mileage? Explain.
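The same unit-change experiment can be sketched outside Data Desk or CrunchIt. The Python below builds the derived variable by multiplying liters by 61, as in the software tip, and computes both correlations for comparison. The displacement and mileage values are made up for illustration and are not from the Cars2013 file; pearson_r is a hypothetical helper name.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

CUBIC_INCHES_PER_LITER = 61  # approximate factor used in the lab

# Made-up illustrative values, NOT taken from the Cars2013 file.
displacement_liters = [1.6, 2.0, 2.5, 3.5, 5.0, 6.2]
mpg_city = [30, 26, 24, 19, 15, 13]

# Derived variable: the same engine sizes expressed in cubic inches.
displacement_cubic_inches = [d * CUBIC_INCHES_PER_LITER for d in displacement_liters]

r_liters = pearson_r(displacement_liters, mpg_city)
r_cubic_inches = pearson_r(displacement_cubic_inches, mpg_city)

print("r using liters:      ", round(r_liters, 4))
print("r using cubic inches:", round(r_cubic_inches, 4))
```

Running the sketch and comparing the two printed values mirrors what questions 8 and 9 ask you to do with the real data.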

Now let's switch to another data set that deals with data from a survey completed by students in your class. Open the data file called Class_Survey. We'll look at the variables height (the student's height in inches), mate height (the height of the student's "ideal mate"), year (of birth), age (in years), HS GPA (high school grade point average on a four-point scale), and OSU GPA (grade point average at Ohio State).

10. Before looking at the data, make a guess at the size of the correlation for each pair of variables listed below. Record your guess in the table below. Next, make a plot of each pair of variables from the Class_survey1 data file. Look at each plot and try to guess the value of the correlation. Record your guess in the table. Finally, use the software to find the actual value of the correlation between each pair of variables and record that value.

Variables                  My guess at the correlation    My guess after looking at the plot    Actual correlation
height and mate height     ___________________________    __________________________________    __________________
year and age               ___________________________    __________________________________    __________________
HS GPA and OSU GPA         ___________________________    __________________________________    __________________

11. Which of the three correlations in the previous question was the most difficult for you to guess? How did the three correlations differ from your expectations with respect to direction and/or strength?

12. Many people are surprised at the direction of the correlation between the student's height and the height of their ideal mate in this survey. Think of an explanation for this paradox and use the software to investigate your explanation. Show (sketch or cut-and-paste) the results below that you used to test your explanation.