1 Simple and Multiple Linear Regression Assumptions


The assumptions for simple linear regression are in fact special cases of the assumptions for multiple linear regression:

Check:

1. What is external validity? Which assumption is critical for external validity?
2. What is internal validity? Which assumption is critical for internal validity?
3. What null hypothesis are we typically testing? Which assumption is critical for hypothesis testing?
4. What happens when a dummy variable for male is included in a regression alongside a variable for non-male? Which assumption is violated?

2 Omitted Variable Bias

i. Practice: Signing the bias

A classmate of mine is working on a project in Mexico looking at how homicide rates (per capita) are affected by changes in police financing (per capita). Presumably, giving police more resources with which to fight crime would lower the rate of homicides in a given area. Let us imagine that the population model of homicides looks like this, with an index of gang presence in a given district as an additional explanatory variable:

$$\text{homicide} = \beta_0 + \beta_1\,\text{police finance} + \beta_2\,\text{gangs} + u \qquad (1)$$

However, let's pretend that she didn't think to collect data on the prevalence of gangs in each district, so that the model she estimates is:

$$\text{homicide} = \beta_0 + \beta_1\,\text{police finance} + \tilde{u} \qquad (2)$$

If we think there's an important variable missing, as gangs is above, we can sign the bias we expect from leaving gangs out of the regression simply by determining the signs of two covariances:

1. Cov(homicide, gangs), or Cov(y, omitted variable)
2. Cov(police finance, gangs), or Cov(x, omitted variable)

Often we leave out the unimportant variables. An unimportant variable is one that we're not interested in, and one that will not induce bias (i.e. bias is zero) in our coefficients of interest if we leave it out.

The bias due to the omitted variable will be zero when: ______

On a problem set or exam, when you're trying to decide how omitted variable bias might be affecting your estimation of a parameter, it's easy to think of the following table:

| | $Cov(x, x_{ov}) > 0$ | $Cov(x, x_{ov}) < 0$ |
|---|---|---|
| $Cov(y, x_{ov}) > 0$ | Upward bias | Downward bias |
| $Cov(y, x_{ov}) < 0$ | Downward bias | Upward bias |

ii. OVB: Derivation/Calculation

SLR4 fails because of an omitted variable: $E[u \mid X] \neq 0$

Reviewing lecture from last Thursday:

Population model: $y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + u$

Model with omitted $x_2$ variable: $y = \beta_0 + \beta_1 x_1 + \tilde{u}$

Suppose $x_1$ is correlated with $x_2$ in the following way: $x_2 = \alpha + \rho x_1 + v$

Substituting this equation into the true population model we get:

$$y = \beta_0 + \beta_1 x_1 + \beta_2(\alpha + \rho x_1 + v) + u$$

There are extra terms! If we take the expectation of $\hat{\beta}_1$:

$E[\hat{\beta}_1] =$ ______

If $E[\hat{\beta}_1] \neq \beta_1$ then we say $\hat{\beta}_1$ is biased. What this means is that on average, our regression estimate is going to miss the true population parameter by ______.
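The expectation step in the derivation can be worked out by collecting terms in the substituted model. The lines below are a sketch of that algebra, assuming (beyond what the notes state explicitly) that both $u$ and $v$ have mean zero conditional on $x_1$, so that OLS on the model with $x_2$ omitted is unbiased for the collected coefficient on $x_1$.

```latex
\begin{align*}
y &= \beta_0 + \beta_1 x_1 + \beta_2(\alpha + \rho x_1 + v) + u \\
  &= (\beta_0 + \beta_2\alpha) + (\beta_1 + \beta_2\rho)\,x_1 + (\beta_2 v + u) \\[4pt]
E[\hat{\beta}_1] &= \beta_1 + \beta_2\rho \\
\text{Bias} &= E[\hat{\beta}_1] - \beta_1 = \beta_2\rho
\end{align*}
```

The sign of $\beta_2\rho$ is the product of the two signs, which is the pattern in the table: when the omitted variable moves with $y$ and with $x$ in the same direction the bias is upward, and when the directions differ the bias is downward.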

3 Clean and Dirty Variation

In class we heard the concept of clean and dirty variation mentioned. What does it mean and why do we care? This is a very intuitive lens through which to think about the variation of certain variables. The idea applies when an estimation is suffering from omitted variable bias. This is a problem because, as a result, we are not getting the true relationship between our explanatory variable of interest and our outcome, $\beta$.

In the example above we wanted to explain homicides with police funding. However, we had biased results because areas with higher police funding also had higher levels of gang presence. This correlation between police funding and gang presence is considered dirty variation in police funding. It is variation in police funding that cannot be disentangled from gang presence, and it biases our estimates. However, police funding and gang presence are not perfectly collinear. There is other variation in police funding that is not directly explained by gang presence, and vice versa. If we could isolate only the clean variation in police funding that is independent of gang presence, then we could end up getting an unbiased estimate of the relationship between homicides and police funding!

But this is impossible, right? Sort of. In the example from class, we actually had data for our omitted variable. Analogously, if we had data on gang presence we could be clever and use Stata to isolate only the clean variation in police funding, which would capture an unbiased estimate of $\beta$. To do this, we generate a new model that uses gang presence to predict police funding by regressing police funding on gang presence. This predicted police funding is the dirty portion of variation that we can't use. Instead we extract the clean variation in police funding that has nothing to do with gang presence. This clean variation is captured in the residuals of our model.

What does this do to our results? We saw that this new estimate from regressing homicides on these residuals is essentially the same as when we estimate the true population model and include the omitted variable directly in the equation! We now have $E[\hat{\beta}] = \beta$!

So why don't we just include this variable in the first place!?! Well... Good question. I'm glad you asked. Of course we would love to always be blessed with the complete set of data and variables so that we could directly estimate the true population model, but that is very rarely the case. This example was constructed to illustrate two things: 1) how we can focus on clean variation to get accurate estimates, and 2) how isolating only clean variation leads to an increase in our standard errors relative to our biased estimate. The latter point holds because we are now using less of the total variation in our police funding (our X variable).

The notion of clean and dirty variation will come back later in the course. Many impact evaluation methods are tools designed to help us isolate the clean variation of our variables of interest so that we can get unbiased estimates of $\beta$.
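The residual-based procedure described above can be sketched in a few lines. This is a minimal illustration in Python rather than Stata, run on simulated data: the variable names, the coefficients in the data-generating process, and the helper functions `ols_simple` and `ols_two` are all assumptions made for the example, not values or code from the class.

```python
import random


def ols_simple(x, y):
    """Return (intercept, slope) from regressing y on x with an intercept."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return my - slope * mx, slope


def ols_two(x1, x2, y):
    """Return (b1, b2): slopes from regressing y on x1 and x2 with an intercept."""
    n = len(y)
    m1, m2, my = sum(x1) / n, sum(x2) / n, sum(y) / n
    s11 = sum((a - m1) ** 2 for a in x1)
    s22 = sum((b - m2) ** 2 for b in x2)
    s12 = sum((a - m1) * (b - m2) for a, b in zip(x1, x2))
    s1y = sum((a - m1) * (c - my) for a, c in zip(x1, y))
    s2y = sum((b - m2) * (c - my) for b, c in zip(x2, y))
    det = s11 * s22 - s12 ** 2
    return (s22 * s1y - s12 * s2y) / det, (s11 * s2y - s12 * s1y) / det


# Hypothetical data: police funding rises with gang presence,
# and gang presence raises homicides.
random.seed(0)
n = 500
gangs = [random.gauss(0, 1) for _ in range(n)]
police = [0.8 * g + random.gauss(0, 1) for g in gangs]
homicide = [5.0 - 1.0 * p + 2.0 * g + random.gauss(0, 1)
            for p, g in zip(police, gangs)]

# Step 1: regress police funding on gang presence; the residuals are
# the "clean" variation in police funding.
a0, a1 = ols_simple(gangs, police)
clean = [p - (a0 + a1 * g) for p, g in zip(police, gangs)]

# Step 2: regress homicides on the clean variation only.
_, beta_clean = ols_simple(clean, homicide)

# Benchmarks: the full model with gangs included, and the naive model without it.
beta_full, _ = ols_two(police, gangs, homicide)
_, beta_naive = ols_simple(police, homicide)

print("naive (gangs omitted):", round(beta_naive, 4))
print("clean variation only: ", round(beta_clean, 4))
print("full model:           ", round(beta_full, 4))
```

In this sketch the coefficient from the residual regression matches the coefficient from the full model up to floating-point error, while the naive coefficient sits above both, which is the upward bias predicted when both covariances are positive.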

Variance of $\hat{\beta}$

Bringing this home (hopefully), let's remember our variance for $\hat{\beta}$:

$$Var(\hat{\beta}) = \frac{\hat{\sigma}^2}{SST_x\,(1 - R_x^2)}$$

Check:

1. What happened to n in this formula? Don't we still care about our sample size?
2. What would happen to $R_x^2$ if we add an additional variable into our regression that is highly correlated with X?
3. What happens to $\sigma^2$ if the newly added variable explains a lot of variation in Y?
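To get a feel for how each ingredient moves the variance, the formula can be evaluated directly. The function name `var_beta_hat` and the input numbers below are hypothetical, chosen only to make the comparisons easy to read.

```python
def var_beta_hat(sigma2_hat, sst_x, r2_x):
    """Var(beta_hat) = sigma2_hat / (SST_x * (1 - R_x^2))."""
    if not 0.0 <= r2_x < 1.0:
        raise ValueError("r2_x must lie in [0, 1)")
    return sigma2_hat / (sst_x * (1.0 - r2_x))


# Baseline: x is uncorrelated with the other regressors.
base = var_beta_hat(sigma2_hat=1.0, sst_x=100.0, r2_x=0.0)

# Same data, but x is now largely explained by another regressor.
collinear = var_beta_hat(sigma2_hat=1.0, sst_x=100.0, r2_x=0.9)

# Twice as much total variation in x (for example, from more observations).
more_variation = var_beta_hat(sigma2_hat=1.0, sst_x=200.0, r2_x=0.0)

# A lower error variance, holding everything else fixed.
less_noise = var_beta_hat(sigma2_hat=0.5, sst_x=100.0, r2_x=0.0)

print(base, collinear, more_variation, less_noise)
```

Comparing the four values against the baseline is one way to check your answers to the questions above.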

4 Example: OVB in Action

In this section, I use the wage data (WAGE1.dta) from your textbook to demonstrate the evils of omitted variable bias and show you that the OVB formula works. Let's pretend that this sample of 500 people is our whole population of interest, so that when we run our regressions, we are actually revealing the true parameters instead of just estimates. We're interested in the relationship between wages and gender, and our omitted variable will be tenure (how long the person has been at his/her job). Suppose our population model is:

(1) $\log(wage)_i = \beta_0 + \beta_1\,female_i + \beta_2\,tenure_i + u_i$

First let's look at the correlations between our variables and see if we can predict how omitting tenure will bias $\hat{\beta}_1$:

. corr lwage female tenure

[Stata output: correlation matrix of lwage, female, and tenure]

If we ran the regression:

(2) $\log(wage)_i = \tilde{\beta}_0 + \tilde{\beta}_1\,female_i + e_i$

...then the information above tells us that $\tilde{\beta}_1$ ______ $\beta_1$. Let's see if we were right. Imagine we ran the regressions in Stata (we did) and we get the below results for our two models:

(1) $\log(wage)_i =$ ______ $+$ ______ $female_i +$ ______ $tenure_i + u_i$

(2) $\log(wage)_i =$ ______ $+$ ______ $female_i + e_i$

From these results we now know that $\beta_1 =$ ______ and $\tilde{\beta}_1 =$ ______. This means that our BIAS is equal to: ______

There's one more parameter missing from our OVB formula. What regression do we have to run to find its value? The Stata results give us:

$tenure = \rho_0 + \rho_1\,female + v$

$tenure =$ ______ $+$ ______ $female + v$

Now we can plug all of our parameters into the bias formula to check that it in fact gives us the bias from leaving out tenure from our wage regression:

$$\tilde{\beta}_1 = E[\hat{\beta}_1] = \beta_1 + \beta_2\rho_1$$
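The check described in this section can be reproduced on any dataset, because the bias formula holds exactly within a sample. The sketch below uses simulated data in place of WAGE1.dta: the data-generating numbers, the variable names, and the helper functions `slope` and `slopes_two` are assumptions for illustration and are not the Stata results from the handout.

```python
import random


def slope(x, y):
    """Slope from regressing y on x with an intercept."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    return sxy / sxx


def slopes_two(x1, x2, y):
    """Slopes (b1, b2) from regressing y on x1 and x2 with an intercept."""
    n = len(y)
    m1, m2, my = sum(x1) / n, sum(x2) / n, sum(y) / n
    s11 = sum((a - m1) ** 2 for a in x1)
    s22 = sum((b - m2) ** 2 for b in x2)
    s12 = sum((a - m1) * (b - m2) for a, b in zip(x1, x2))
    s1y = sum((a - m1) * (c - my) for a, c in zip(x1, y))
    s2y = sum((b - m2) * (c - my) for b, c in zip(x2, y))
    det = s11 * s22 - s12 ** 2
    return (s22 * s1y - s12 * s2y) / det, (s11 * s2y - s12 * s1y) / det


# Hypothetical "population" of 500: women have lower tenure on average,
# and tenure raises log wages.
random.seed(1)
n = 500
female = [1 if random.random() < 0.5 else 0 for _ in range(n)]
tenure = [max(0.0, 6.0 - 3.0 * f + random.gauss(0, 4)) for f in female]
lwage = [1.5 - 0.3 * f + 0.04 * t + random.gauss(0, 0.5)
         for f, t in zip(female, tenure)]

# Model (1): tenure included.
beta1, beta2 = slopes_two(female, tenure, lwage)

# Model (2): tenure omitted.
beta1_short = slope(female, lwage)

# Auxiliary regression of the omitted variable on the included one.
rho1 = slope(female, tenure)

bias = beta1_short - beta1
print("bias from omitting tenure:", round(bias, 6))
print("beta2 * rho1:             ", round(beta2 * rho1, 6))
```

The two printed numbers agree up to floating-point error, which is the same check the section performs with the Stata output.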

5 OVB Intuition (For your own reference)

For further intuition on omitted variable bias, I like to think of an archer. When our MLR1-4 hold, the archer is aiming the arrow directly at the center of the target; if he/she misses, it's due to random fluctuations in the air that push the arrow around, or maybe imperfections in the arrow that send it a little off course. When MLR1-4 do not all hold, like when we have an omitted variable, the archer is no longer aiming at the center of the target. There are still puffs of air and feather imperfections that send the arrow off course, but the course wasn't even the right one to begin with! The arrow (which you should think of as our $\hat{\beta}$) misses the center of the target (which you should think of as our true $\beta$) systematically.

To demonstrate this, I did the following:

1. Take a random sample of 150 people out of the 500 that are in WAGE1.dta.
2. Estimate $\hat{\beta}_1$ using OLS, controlling for tenure, with these 150 people.
3. Estimate $\hat{\alpha}_1$ using OLS (NOT controlling for tenure) with these 150 people.
4. Repeat 6000 times.

At the end of all of the above, I end up with 6000 biased and 6000 unbiased estimates of $\beta_1$. I plotted the kernel density of the biased estimates alongside that of the unbiased estimates. You can see how the biased distribution is shifted to the left, indicating a downward bias!

Figure 1. Kernel densities for biased and unbiased estimates. (Horizontal axis: effect of female on ln(wage); vertical axis: density; series: alphahat_1, betahat_1.)
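The resampling steps listed above can be sketched as follows. This is an illustration on a simulated population of 500, not on WAGE1.dta, and it uses 1000 repetitions instead of 6000 to keep the run short; the data-generating numbers and the helper functions `slope` and `slopes_two` are assumptions made for the example.

```python
import random


def slope(x, y):
    """Slope from regressing y on x with an intercept."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    return sxy / sxx


def slopes_two(x1, x2, y):
    """Slopes (b1, b2) from regressing y on x1 and x2 with an intercept."""
    n = len(y)
    m1, m2, my = sum(x1) / n, sum(x2) / n, sum(y) / n
    s11 = sum((a - m1) ** 2 for a in x1)
    s22 = sum((b - m2) ** 2 for b in x2)
    s12 = sum((a - m1) * (b - m2) for a, b in zip(x1, x2))
    s1y = sum((a - m1) * (c - my) for a, c in zip(x1, y))
    s2y = sum((b - m2) * (c - my) for b, c in zip(x2, y))
    det = s11 * s22 - s12 ** 2
    return (s22 * s1y - s12 * s2y) / det, (s11 * s2y - s12 * s1y) / det


random.seed(42)

# Hypothetical population of 500 people: (female, tenure, lwage).
population = []
for _ in range(500):
    f = 1 if random.random() < 0.5 else 0
    t = max(0.0, 6.0 - 3.0 * f + random.gauss(0, 4))
    w = 1.5 - 0.3 * f + 0.04 * t + random.gauss(0, 0.5)
    population.append((f, t, w))

REPS = 1000  # the notes use 6000; fewer here to keep the sketch quick
alpha_hats, beta_hats = [], []
for _ in range(REPS):
    sample = random.sample(population, 150)  # 150 people, without replacement
    female = [row[0] for row in sample]
    tenure = [row[1] for row in sample]
    lwage = [row[2] for row in sample]
    beta_hats.append(slopes_two(female, tenure, lwage)[0])  # controls for tenure
    alpha_hats.append(slope(female, lwage))                 # omits tenure

mean_alpha = sum(alpha_hats) / REPS
mean_beta = sum(beta_hats) / REPS
print("mean alphahat_1 (tenure omitted): ", round(mean_alpha, 4))
print("mean betahat_1 (tenure included):", round(mean_beta, 4))
```

Both lists of estimates scatter around their own centers, but the center of the `alpha_hats` distribution lies below that of `beta_hats`, mirroring the leftward shift in Figure 1. Plotting the two lists with any density or histogram tool would give a picture analogous to the figure.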


More information

Chapter 11. Experimental Design: One-Way Independent Samples Design

Chapter 11. Experimental Design: One-Way Independent Samples Design 11-1 Chapter 11. Experimental Design: One-Way Independent Samples Design Advantages and Limitations Comparing Two Groups Comparing t Test to ANOVA Independent Samples t Test Independent Samples ANOVA Comparing

More information

Functionalist theories of content

Functionalist theories of content Functionalist theories of content PHIL 93507 April 22, 2012 Let s assume that there is a certain stable dependence relation between the physical internal states of subjects and the phenomenal characters

More information

Sampling for Impact Evaluation. Maria Jones 24 June 2015 ieconnect Impact Evaluation Workshop Rio de Janeiro, Brazil June 22-25, 2015

Sampling for Impact Evaluation. Maria Jones 24 June 2015 ieconnect Impact Evaluation Workshop Rio de Janeiro, Brazil June 22-25, 2015 Sampling for Impact Evaluation Maria Jones 24 June 2015 ieconnect Impact Evaluation Workshop Rio de Janeiro, Brazil June 22-25, 2015 How many hours do you expect to sleep tonight? A. 2 or less B. 3 C.

More information

1.4 - Linear Regression and MS Excel

1.4 - Linear Regression and MS Excel 1.4 - Linear Regression and MS Excel Regression is an analytic technique for determining the relationship between a dependent variable and an independent variable. When the two variables have a linear

More information

CHAPTER 2: TWO-VARIABLE REGRESSION ANALYSIS: SOME BASIC IDEAS

CHAPTER 2: TWO-VARIABLE REGRESSION ANALYSIS: SOME BASIC IDEAS CHAPTER 2: TWO-VARIABLE REGRESSION ANALYSIS: SOME BASIC IDEAS 2.1 It tells how the mean or average response of the sub-populations of Y varies with the fixed values of the explanatory variable (s). 2.2

More information

Options in HIV Prevention A Participant-Centered Counseling Approach

Options in HIV Prevention A Participant-Centered Counseling Approach Options in HIV Prevention A Participant-Centered Counseling Approach Options Counseling Flipchart, Version 3.0, 10 Oct 2017 Enrollment Visit Welcome and thank you! 3 HOPE Adherence Counseling CHOICE: Helping

More information

Why do Psychologists Perform Research?

Why do Psychologists Perform Research? PSY 102 1 PSY 102 Understanding and Thinking Critically About Psychological Research Thinking critically about research means knowing the right questions to ask to assess the validity or accuracy of a

More information

t-test for r Copyright 2000 Tom Malloy. All rights reserved

t-test for r Copyright 2000 Tom Malloy. All rights reserved t-test for r Copyright 2000 Tom Malloy. All rights reserved This is the text of the in-class lecture which accompanied the Authorware visual graphics on this topic. You may print this text out and use

More information

REVIEW PROBLEMS FOR FIRST EXAM

REVIEW PROBLEMS FOR FIRST EXAM M358K Sp 6 REVIEW PROBLEMS FOR FIRST EXAM Please Note: This review sheet is not intended to tell you what will or what will not be on the exam. However, most of these problems have appeared on or are very

More information

Multiple Regression Analysis

Multiple Regression Analysis Multiple Regression Analysis Basic Concept: Extend the simple regression model to include additional explanatory variables: Y = β 0 + β1x1 + β2x2 +... + βp-1xp + ε p = (number of independent variables

More information

Doing Quantitative Research 26E02900, 6 ECTS Lecture 6: Structural Equations Modeling. Olli-Pekka Kauppila Daria Kautto

Doing Quantitative Research 26E02900, 6 ECTS Lecture 6: Structural Equations Modeling. Olli-Pekka Kauppila Daria Kautto Doing Quantitative Research 26E02900, 6 ECTS Lecture 6: Structural Equations Modeling Olli-Pekka Kauppila Daria Kautto Session VI, September 20 2017 Learning objectives 1. Get familiar with the basic idea

More information

GUIDE 4: COUNSELING THE UNEMPLOYED

GUIDE 4: COUNSELING THE UNEMPLOYED GUIDE 4: COUNSELING THE UNEMPLOYED Addressing threats to experimental integrity This case study is based on Sample Attrition Bias in Randomized Experiments: A Tale of Two Surveys By Luc Behaghel, Bruno

More information

Introduction to Multilevel Models for Longitudinal and Repeated Measures Data

Introduction to Multilevel Models for Longitudinal and Repeated Measures Data Introduction to Multilevel Models for Longitudinal and Repeated Measures Data Today s Class: Features of longitudinal data Features of longitudinal models What can MLM do for you? What to expect in this

More information

Multiple Regression with Qualitative Information ECONOMETRICS (ECON 360) BEN VAN KAMMEN, PHD

Multiple Regression with Qualitative Information ECONOMETRICS (ECON 360) BEN VAN KAMMEN, PHD Multiple Regression with Qualitative Information ECONOMETRICS (ECON 360) BEN VAN KAMMEN, PHD Introduction There is a lot of (relevant) information in data about the elements observed that is not in quantitative

More information

Exam 2 PS 306, Spring 2004

Exam 2 PS 306, Spring 2004 Exam 2 PS 306, Spring 2004 1. Briefly define the term confound. Then, using a very explicit example of practice effects (maybe even with numbers?), illustrate why conducting a repeated measures experiment

More information

Online Appendix. Gregorio Caetano & Vikram Maheshri. Classifying Observed Variables as Detectable and Undetectable

Online Appendix. Gregorio Caetano & Vikram Maheshri. Classifying Observed Variables as Detectable and Undetectable Online Appendix Gregorio Caetano & Vikram Maheshri A Classifying Observed Variables as Detectable and Undetectable Confounders Here we provide details of how we implemented the exercise described in Appendix

More information

Business Statistics Probability

Business Statistics Probability Business Statistics The following was provided by Dr. Suzanne Delaney, and is a comprehensive review of Business Statistics. The workshop instructor will provide relevant examples during the Skills Assessment

More information

Statistical Techniques. Masoud Mansoury and Anas Abulfaraj

Statistical Techniques. Masoud Mansoury and Anas Abulfaraj Statistical Techniques Masoud Mansoury and Anas Abulfaraj What is Statistics? https://www.youtube.com/watch?v=lmmzj7599pw The definition of Statistics The practice or science of collecting and analyzing

More information

Study Guide #2: MULTIPLE REGRESSION in education

Study Guide #2: MULTIPLE REGRESSION in education Study Guide #2: MULTIPLE REGRESSION in education What is Multiple Regression? When using Multiple Regression in education, researchers use the term independent variables to identify those variables that

More information

Final Exam - section 2. Thursday, December hours, 30 minutes

Final Exam - section 2. Thursday, December hours, 30 minutes Econometrics, ECON312 San Francisco State University Michael Bar Fall 2011 Final Exam - section 2 Thursday, December 15 2 hours, 30 minutes Name: Instructions 1. This is closed book, closed notes exam.

More information

USING STATCRUNCH TO CONSTRUCT CONFIDENCE INTERVALS and CALCULATE SAMPLE SIZE

USING STATCRUNCH TO CONSTRUCT CONFIDENCE INTERVALS and CALCULATE SAMPLE SIZE USING STATCRUNCH TO CONSTRUCT CONFIDENCE INTERVALS and CALCULATE SAMPLE SIZE Using StatCrunch for confidence intervals (CI s) is super easy. As you can see in the assignments, I cover 9.2 before 9.1 because

More information

We re going to talk about a class of designs which generally are known as quasiexperiments. They re very important in evaluating educational programs

We re going to talk about a class of designs which generally are known as quasiexperiments. They re very important in evaluating educational programs We re going to talk about a class of designs which generally are known as quasiexperiments. They re very important in evaluating educational programs and policies because often we might not have the right

More information

Psychology Research Process

Psychology Research Process Psychology Research Process Logical Processes Induction Observation/Association/Using Correlation Trying to assess, through observation of a large group/sample, what is associated with what? Examples:

More information

How to Work with the Patterns That Sustain Depression

How to Work with the Patterns That Sustain Depression How to Work with the Patterns That Sustain Depression Module 5.2 - Transcript - pg. 1 How to Work with the Patterns That Sustain Depression How the Grieving Mind Fights Depression with Marsha Linehan,

More information

Clincial Biostatistics. Regression

Clincial Biostatistics. Regression Regression analyses Clincial Biostatistics Regression Regression is the rather strange name given to a set of methods for predicting one variable from another. The data shown in Table 1 and come from a

More information

Stat 13, Lab 11-12, Correlation and Regression Analysis

Stat 13, Lab 11-12, Correlation and Regression Analysis Stat 13, Lab 11-12, Correlation and Regression Analysis Part I: Before Class Objective: This lab will give you practice exploring the relationship between two variables by using correlation, linear regression

More information

Chapter 14: More Powerful Statistical Methods

Chapter 14: More Powerful Statistical Methods Chapter 14: More Powerful Statistical Methods Most questions will be on correlation and regression analysis, but I would like you to know just basically what cluster analysis, factor analysis, and conjoint

More information

Regression Discontinuity Analysis

Regression Discontinuity Analysis Regression Discontinuity Analysis A researcher wants to determine whether tutoring underachieving middle school students improves their math grades. Another wonders whether providing financial aid to low-income

More information