Chapter 3 CORRELATION AND REGRESSION


CORRELATION AND REGRESSION

Topics
Linear Regression Defined
Regression Equation
The Slope or b
The Y-Intercept or a
What Value of the Y-Variable Should be Predicted When r = 0?
The Regression Line
The Point of Averages
Residuals
Extrapolation, Restricted Range, and Lurking Variables

Tutorials
Obtaining a linear regression analysis in Excel 2007

➊ The stronger the correlation, the more accurately one variable can be predicted from another variable
➋ By using the linear regression equation, we can predict scores for one variable (the Y-variable) from scores on a second variable (the X-variable)
  - The linear regression equation assumes the statistical relationship between two variables follows a straight line, known as the regression line

➊ The regression equation consists of four parts:
  - The predicted value for the Y-variable, or y'
  - The slope of the regression line, or b
  - The known value of the X-variable, or x
  - The value for the y-intercept, or a

$y' = b x_i + a$

$y' = b x_i + a$

➊ The slope of the regression line, or b:
  - Has the same sign (+ or -) as the correlation coefficient r
  - Is a function of the strength of the correlation and the ratio of the standard deviations of the Y and X variables

$b = r \cdot \frac{SD_y}{SD_x}$
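As a minimal sketch of the slope formula, the Python snippet below computes r and the two standard deviations for a small data set and then forms b = r · SDy / SDx. The heights are made up for illustration (they are not from the slides), and the helper name `pearson_r` is an assumption introduced here.

```python
from statistics import mean, stdev

# Hypothetical father (x) and son (y) heights in inches; not from the slides.
x = [65, 67, 68, 70, 72, 74]
y = [67, 68, 70, 71, 73, 74]

def pearson_r(xs, ys):
    """Pearson correlation computed from deviations about the means."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((p - mx) * (q - my) for p, q in zip(xs, ys))
    sxx = sum((p - mx) ** 2 for p in xs)
    syy = sum((q - my) ** 2 for q in ys)
    return sxy / (sxx * syy) ** 0.5

r = pearson_r(x, y)
b = r * stdev(y) / stdev(x)  # slope: b = r * SDy / SDx

print(f"r = {r:.3f}, b = {b:.3f}")
```

Because SDy and SDx are both positive, b always takes its sign from r, which is the first bullet above.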

$y' = b x_i + a$

➊ The value for the y-intercept, or a:
  - Is the point where the regression line crosses the y-axis
  - Is the predicted value of y when the x-variable equals zero
  - May sometimes be a strange value, but remember it's a predicted value

$a = \bar{Y} - b\bar{X}$

➊ The y-intercept equals the overall mean for the y-variable ($\bar{Y}$) minus the slope of the regression equation (b) times the overall mean for the x-variable ($\bar{X}$)
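The intercept formula can be sketched in a few lines of Python. The data below are hypothetical (hours studied and exam scores, invented for this example), and `predict` is an illustrative helper name, not something defined in the slides.

```python
from statistics import mean

# Hypothetical data (not from the slides): x = hours studied, y = exam score.
x = [1, 2, 3, 4, 5]
y = [52, 60, 61, 70, 77]

x_bar, y_bar = mean(x), mean(y)

# Least-squares slope, algebraically the same as b = r * SDy / SDx.
b = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sum(
    (xi - x_bar) ** 2 for xi in x
)

a = y_bar - b * x_bar  # y-intercept: mean of y minus slope times mean of x

def predict(x_value):
    """Predicted y' for a given x, using y' = b*x + a."""
    return b * x_value + a

print(f"b = {b:.2f}, a = {a:.2f}, predicted y at x = 0: {predict(0):.2f}")
```

Here the prediction at x = 0 is just a, which is why the intercept can look odd when x = 0 lies far outside the observed data.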

➊ If the correlation is zero, the slope is zero and the regression line is flat (i.e., horizontal)
➋ If b = 0, then the y-intercept formula simplifies to:
    $a = \bar{Y}$
  which means the regression equation simplifies to:
    $y' = \bar{Y}$
  Why?

➊ If there is no correlation between two variables, the best prediction for either variable is its mean
➋ On average, the mean is closer to all values in a distribution than any other score
  - In other words, if the mean is used to predict each score in a data set, the average error in prediction (measured as squared error) will be smaller than when some other score from the distribution is used
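A quick numerical check of this idea, as a sketch only: the scores below are invented, and error is measured as average squared error, which is the sense in which the mean is the best single prediction.

```python
# Hypothetical scores, invented for illustration.
scores = [61, 64, 66, 70, 71, 75, 84]

def mean_squared_error(values, guess):
    """Average squared error when every value is predicted as `guess`."""
    return sum((v - guess) ** 2 for v in values) / len(values)

mean_score = sum(scores) / len(scores)
error_using_mean = mean_squared_error(scores, mean_score)

# Compare against using each actual score as the single prediction.
errors_using_scores = {s: mean_squared_error(scores, s) for s in scores}

print(f"mean = {mean_score:.2f}, error using mean = {error_using_mean:.2f}")
for s, err in errors_using_scores.items():
    print(f"guess = {s}: error = {err:.2f}")
```

Every alternative guess gives a larger average squared error than the mean does, and the gap grows the further the guess sits from the mean.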

➊ What values make the regression line?
  - The values predicted by the regression equation create the regression line:
    $y' = b x_i + a$
  - These predicted points all fall on the regression line

The regression line:
➊ Represents a central point inside the points of a scatterplot
  - The points in a scatterplot can be thought of as regressing to this central point
➋ Is the best-fitting line and is also known as the line of least squares
  - Imagine the different angles at which you could draw a straight line through a scatterplot
  - The line that results in the smallest total squared vertical distance from all points is the regression line

Figure (Regression Equation): The blue line is the regression line. The points that make up this line are the predicted values from the regression equation.

➊ Every linear regression line passes through the point of averages
  - The point of averages is located at the intersection of the overall mean for the x-variable and the overall mean for the y-variable
➋ Points predicted closer to the point of averages are, on average, more accurate than points predicted farther away from it

Figure (Regression Equation): The black dot represents the point of averages, where the overall means for the x-variable (Father's Height, 69 inches) and the y-variable (Son's Height, 71.5 inches) intersect. This point is always found on a linear regression line.

➊ The regression line can be plotted using Excel; however, you can also plot this line using two points:
  - The point of averages
  - The y-intercept
➋ You can also plot the regression line by plugging values of the x-variable into the regression equation and solving for the predicted value of the y-variable
  - Remember, the regression line is made up of all the predicted values of the y-variable, or y'
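The two-point construction can be checked numerically. The sketch below uses hypothetical father and son heights (not the data behind the slides) and confirms that the line through the y-intercept and the point of averages has the same slope as the fitted line.

```python
from statistics import mean

# Hypothetical father (x) and son (y) heights in inches.
x = [64, 66, 68, 70, 72, 74]
y = [66, 68, 69, 70, 73, 74]

x_bar, y_bar = mean(x), mean(y)
b = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sum(
    (xi - x_bar) ** 2 for xi in x
)
a = y_bar - b * x_bar

# Two points that always lie on the regression line.
point_of_averages = (x_bar, y_bar)
y_intercept_point = (0, a)

# Slope of the straight line drawn through those two points.
slope_from_points = (point_of_averages[1] - y_intercept_point[1]) / (
    point_of_averages[0] - y_intercept_point[0]
)

# Plugging each observed x into the equation gives more points on the same line.
predicted = [b * xi + a for xi in x]

print(f"fitted slope = {b:.4f}, slope through the two points = {slope_from_points:.4f}")
```

One caveat on this sketch: if the mean of x happened to be exactly zero, the two points would coincide and a second predicted point would be needed instead.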

➊ The term residuals refers to the amount of error in prediction
  - In other words, the regression equation produces a predicted value for the y-variable
  - The difference between the real value of Y and the predicted value of Y is known as the error, or the residual
  - Excel can calculate the residual for each predicted score; to obtain the residuals by hand, the formula is:

Formula for Residuals: $\text{residual} = y - y'$

Figure (Regression Equation, with residuals labeled): The distance between each real point and the regression line is a residual, or error in prediction. The sum of the residuals is always equal to zero.
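To illustrate the zero-sum property, here is a short sketch on made-up data. It fits the least-squares line, computes each residual as y minus y', and sums them; the result is zero up to floating-point rounding.

```python
from statistics import mean

# Hypothetical data, invented for illustration.
x = [1, 2, 3, 4, 5, 6]
y = [3, 4, 8, 7, 11, 12]

x_bar, y_bar = mean(x), mean(y)
b = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sum(
    (xi - x_bar) ** 2 for xi in x
)
a = y_bar - b * x_bar

predicted = [b * xi + a for xi in x]
residuals = [yi - yp for yi, yp in zip(y, predicted)]  # residual = y - y'

for xi, yi, yp, res in zip(x, y, predicted, residuals):
    print(f"x = {xi}: actual = {yi}, predicted = {yp:.2f}, residual = {res:+.2f}")

print(f"sum of residuals = {sum(residuals):.10f}")
```

The positive and negative residuals cancel because the line is forced through the point of averages.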

➊ Residuals can help identify outliers
  - When a residual is very large, it may indicate an outlier
  - Outliers can increase or decrease the slope of the regression line
  - This means that outliers can also increase or decrease the correlation between two variables
  - Depending on the size of the outlier, a researcher may want to run the regression analysis with and without the outlier to see how much that score affects the results
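Running the analysis with and without a suspect point might look like the sketch below. The data are deliberately artificial (five points on a perfect line plus one invented outlier), and `slope_and_r` is a hypothetical helper name.

```python
from statistics import mean

def slope_and_r(xs, ys):
    """Return the least-squares slope and the Pearson correlation."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((p - mx) * (q - my) for p, q in zip(xs, ys))
    sxx = sum((p - mx) ** 2 for p in xs)
    syy = sum((q - my) ** 2 for q in ys)
    return sxy / sxx, sxy / (sxx * syy) ** 0.5

# Artificial data: the last point (6, 30) is the invented outlier.
x = [1, 2, 3, 4, 5, 6]
y = [2, 4, 6, 8, 10, 30]

slope_with, r_with = slope_and_r(x, y)
slope_without, r_without = slope_and_r(x[:-1], y[:-1])

print(f"with outlier:    slope = {slope_with:.3f}, r = {r_with:.3f}")
print(f"without outlier: slope = {slope_without:.3f}, r = {r_without:.3f}")
```

In this particular data set the outlier steepens the slope while weakening the correlation; other outliers can push either quantity in the opposite direction.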

➊ The regression equation attempts to predict the mean of the y-variable at each value of the x-variable. WHY?
  - Suppose you have three fathers who are each 74 inches tall (or 6'2")
  - Each of these fathers has a son who is a different height
  - The value of the x-variable entered into the regression equation will be the same for each of these three fathers
  - What value for the sons' heights should the equation try to predict?

Figure (Regression Equation): What height should be predicted for the three sons who each have a father who is 74 inches tall? The regression equation will try to predict the average height of the sons (y-variable) at each height of the fathers (x-variable).
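The three-fathers scenario can be sketched by grouping sons by father's height and averaging within each group. All heights below are hypothetical. A fitted straight line summarises these group means and will not necessarily hit each one exactly.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical (father height, son height) pairs in inches.
pairs = [(70, 69), (70, 72), (72, 71), (72, 74), (74, 72), (74, 74), (74, 76)]

sons_by_father = defaultdict(list)
for father, son in pairs:
    sons_by_father[father].append(son)

# One summary value per father height: the mean height of the sons in that group.
mean_son_height = {father: mean(sons) for father, sons in sons_by_father.items()}

for father, avg in sorted(mean_son_height.items()):
    print(f"fathers at {father} in: sons = {sons_by_father[father]}, mean = {avg}")
```

For the three fathers at 74 inches, the single value with the smallest average squared error across the three sons is the mean of the sons' heights.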

➊ What is meant by extrapolation?
  - Predicting values beyond the range of the data used to develop the regression equation
➋ What is meant by restricted (limited) range?
  - When the regression equation is based on a very narrow range of data compared to the true range of the data in the population
➌ What is meant by lurking variables?
  - Other variables that can account for the correlation between two variables
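One simple guard against accidental extrapolation is to flag any prediction whose x-value falls outside the observed range. The sketch below assumes a slope, intercept, and x-range that are purely hypothetical, and `predict_with_range_check` is an illustrative name.

```python
def predict_with_range_check(x_new, b, a, x_min, x_max):
    """Return (prediction, is_extrapolation) for y' = b*x + a."""
    prediction = b * x_new + a
    is_extrapolation = not (x_min <= x_new <= x_max)
    return prediction, is_extrapolation

# Hypothetical fitted values and observed x-range, chosen only for illustration.
b, a = 0.5, 36.0
x_min, x_max = 62, 76

inside = predict_with_range_check(70, b, a, x_min, x_max)
outside = predict_with_range_check(90, b, a, x_min, x_max)

print(f"x = 70: prediction = {inside[0]}, extrapolation = {inside[1]}")
print(f"x = 90: prediction = {outside[0]}, extrapolation = {outside[1]}")
```

The equation still returns a number for x = 90, but nothing in the data supports the straight-line assumption that far out.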

➊ The correlation coefficient can be obtained by hand from the slope using the following formula:

$r = b \cdot \frac{SD_x}{SD_y}$
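As a rough check that this rearrangement agrees with a direct calculation, the sketch below recovers r from the fitted slope and compares it with r computed from deviations about the means. The data are hypothetical.

```python
from statistics import mean, stdev

# Hypothetical data, invented for illustration.
x = [64, 66, 68, 70, 72, 74]
y = [66, 68, 69, 70, 73, 74]

x_bar, y_bar = mean(x), mean(y)
sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
sxx = sum((xi - x_bar) ** 2 for xi in x)
syy = sum((yi - y_bar) ** 2 for yi in y)

b = sxy / sxx                            # least-squares slope
r_from_slope = b * stdev(x) / stdev(y)   # r = b * SDx / SDy
r_direct = sxy / (sxx * syy) ** 0.5      # Pearson r computed directly

print(f"r from slope = {r_from_slope:.4f}, r direct = {r_direct:.4f}")
```

The two values match because this formula is the slope formula b = r · SDy / SDx solved for r.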

End of Chapter 3 Part 2