Technical Track Session IV Instrumental Variables

Impact Evaluation
Technical Track Session IV: Instrumental Variables
Christel Vermeersch
Beijing, China, 2009
Human Development Network, Middle East and North Africa Region, World Bank Institute, Spanish Impact Evaluation Fund

An example to start off with
Say we wish to evaluate a voluntary job training program.
- Any unemployed person is eligible.
- Some people choose to register ("Treatments").
- Other people choose not to register ("Comparisons").
Some simple (but not-so-good) ways to evaluate the program:
- Compare the before and after situation in the treatment group.
- Compare the situation of treatments and comparisons after the intervention.
- Compare the situation of treatments and comparisons before and after.

Voluntary job training program
Say we decide to compare the outcomes of those who participate with the outcomes of those who do not participate. A simple model to do this:
$y = \alpha + \beta_1 D + \beta_2 x + \varepsilon$
where
- $D = 1$ if the person participates in training
- $D = 0$ if the person does not participate in training
- $x$ = control variables (exogenous and observable)
Why is this not working properly? Two problems:
- Variables that we omit (for various reasons) but that are important
- The decision to participate in training is endogenous

Problem #1: Omitted Variables
Even in a well-thought-out model, we'll miss:
- "forgotten" characteristics: we didn't know they mattered
- characteristics that are too complicated to measure
Examples:
- Varying talent and levels of motivation
- Different levels of information
- Varying opportunity cost of participation
- Varying level of access to services
The full "correct" model is:
$y = \alpha + \gamma_1 D + \gamma_2 x + \gamma_3 M + \eta$
The model we use:
$y = \alpha + \beta_1 D + \beta_2 x + \varepsilon$

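Problem #2: Endogenous Decision to Participate
Participation is a decision variable, so it is endogenous (i.e. it depends on the participants themselves):
$y = \alpha + \beta_1 D + \beta_2 x + \varepsilon$
$D = \pi_1 + \pi_2 M + \xi$
Substituting the second equation into the first:
$y = \alpha + \beta_1 (\pi_1 + \pi_2 M + \xi) + \beta_2 x + \varepsilon$
$y = \alpha + \beta_1 \pi_1 + \beta_2 x + \beta_1 \pi_2 M + \beta_1 \xi + \varepsilon$
So in both cases we're missing the M term, which we need to estimate the correct model but which we cannot measure properly.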

The correct model is:
$y = \alpha + \gamma_1 D + \gamma_2 x + \gamma_3 M + \eta$
Simplified model:
$y = \alpha + \beta_1 D + \beta_2 x + \varepsilon$
Say we estimate the treatment effect $\gamma_1$ with $\hat{\beta}_{1,OLS}$.
- If M is correlated with D and we don't include M in the simplified model, then the estimator of the parameter on D will pick up part of the effect of M. This happens to the extent that M and D are correlated.
- Thus our OLS estimator $\hat{\beta}_{1,OLS}$ of the treatment effect $\gamma_1$ captures the effect of other characteristics (M) in addition to the treatment effect.
- This means that there is a difference between $E(\hat{\beta}_{1,OLS})$ and $\gamma_1$: the expected value of the OLS estimator is not $\gamma_1$, the real treatment effect.
- $\hat{\beta}_{1,OLS}$ is a biased estimator of the treatment effect $\gamma_1$.
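The bias described on this slide can be illustrated with a small simulation. This is only a sketch under assumed values: the true effects (2 for D, 3 for M), the way participation depends on M, and the sample size are all hypothetical, and the control variable x is left out for brevity.

```python
import random

def ols(x, y):
    """Return (intercept, slope) of a simple OLS regression of y on x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx
    return my - slope * mx, slope

random.seed(0)
n = 20000
gamma1, gamma3 = 2.0, 3.0  # assumed true effects of D and of the omitted M

# M (e.g. motivation) is unobserved and also drives participation D.
M = [random.gauss(0, 1) for _ in range(n)]
D = [1.0 if m + random.gauss(0, 1) > 0 else 0.0 for m in M]
y = [1.0 + gamma1 * d + gamma3 * m + random.gauss(0, 1) for d, m in zip(D, M)]

# Simplified model: regress y on D only, leaving M out.
_, beta1_ols = ols(D, y)
print(f"true effect: {gamma1}, OLS estimate without M: {beta1_ols:.2f}")
```

In this setup the OLS coefficient on D comes out well above the assumed true effect, because participants have systematically higher M than non-participants.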

The correct model is:
$y = \alpha + \gamma_1 D + \gamma_2 x + \gamma_3 M + \eta$
Simplified model:
$y = \alpha + \beta_1 D + \beta_2 x + \varepsilon$
$\hat{\beta}_{1,OLS}$ is a biased estimator of the treatment effect $\gamma_1$. Why did this happen?
One of the basic conditions for OLS to be BLUE was violated: the regressor D must be uncorrelated with the error term $\varepsilon$, and here it is not.
In other words:
- $E(\hat{\beta}_{1,OLS}) \neq \gamma_1$ (biased estimator)
- Even worse: $\mathrm{plim}(\hat{\beta}_{1,OLS}) \neq \gamma_1$ (inconsistent estimator)
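The size of the inconsistency can be written out with the standard omitted-variable bias formula. As a sketch, assume the error $\eta$ of the correct model is uncorrelated with D and x, and let $\lambda_1$ be the coefficient on D in the linear projection of M on D and x (the $\lambda$ and $\nu$ symbols are introduced here for illustration and do not appear in the slides):

```latex
y = \alpha + \gamma_1 D + \gamma_2 x + \gamma_3 M + \eta,
\qquad
M = \lambda_0 + \lambda_1 D + \lambda_2 x + \nu
\quad\Longrightarrow\quad
\operatorname{plim}\,\hat{\beta}_{1,OLS} = \gamma_1 + \gamma_3 \lambda_1
```

Under these assumptions the bias term $\gamma_3 \lambda_1$ vanishes only if M has no effect on y ($\gamma_3 = 0$) or M is unrelated to D once x is controlled for ($\lambda_1 = 0$).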

What can we do to solve this problem?
$y = \alpha + \gamma_1 D + \gamma_2 x + \gamma_3 M + \eta$
$y = \alpha + \beta_1 D + \beta_2 x + \varepsilon$
Try to "clean" the correlation between D and $\varepsilon$:
- Isolate the variation in D that is not correlated with $\varepsilon$ through the omitted variable M.
- We can do this using an instrumental variable (IV).

Basic idea behind IV
The basic problem is that $\mathrm{corr}(D, M) \neq 0$:
$y = \alpha + \gamma_1 D + \gamma_2 x + \gamma_3 M + \eta$
$y = \alpha + \beta_1 D + \beta_2 x + \varepsilon$
Find a variable Z that satisfies two conditions:
1. Z is correlated with D: $\mathrm{corr}(Z, D) \neq 0$. Z and D are correlated, or Z predicts part of D.
2. Z is not correlated with $\varepsilon$: $\mathrm{corr}(Z, \varepsilon) = 0$. By itself, Z has no influence on y. The only way it can influence y is because it influences D. All of the effect of Z on y passes through D.
Examples of Z in the case of the voluntary job training program?

Two-stage least squares (2SLS)
Remember the original model with endogenous D:
$y = \alpha + \beta_1 D + \beta_2 x + \varepsilon$
Step 1: Regress the endogenous variable D on the instrumental variable(s) Z and the other exogenous variables:
$D = \delta_0 + \delta_1 x + \theta_1 Z + \tau$
- Calculate the predicted value of D for each observation: $\hat{D}$
- Since Z and x are not correlated with $\varepsilon$, neither will be $\hat{D}$.
- You will need at least one instrumental variable for each potentially endogenous regressor.

Two-stage least squares (2SLS)
Step 2: Regress y on the predicted variable $\hat{D}$ and the other exogenous variables:
$y = \alpha + \beta_1 \hat{D} + \beta_2 x + \varepsilon$
- Note: the standard errors of the second-stage OLS need to be corrected because $\hat{D}$ is not a fixed regressor.
- In practice: use the Stata ivreg command (ivregress in newer versions), which does the two steps at once and reports correct standard errors.
- Intuition: by using Z for D, we cleaned D of its correlation with $\varepsilon$.
- It can be shown that (under certain conditions) IV yields a consistent estimator of $\gamma_1$ (large sample theory).
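The two steps can be sketched by hand on simulated data. Everything below is an illustrative assumption rather than part of the original slides: the data-generating process, the true effect of 2, and the omission of the control variable x. The standard errors of the second regression are not computed, since (as noted above) the naive ones would be wrong.

```python
import random

def ols(x, y):
    """Return (intercept, slope) of a simple OLS regression of y on x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx
    return my - slope * mx, slope

random.seed(1)
n = 50000
gamma1 = 2.0  # assumed true treatment effect

M = [random.gauss(0, 1) for _ in range(n)]  # omitted variable
Z = [random.gauss(0, 1) for _ in range(n)]  # instrument, independent of M
D = [1.0 if z + m + random.gauss(0, 1) > 0 else 0.0 for z, m in zip(Z, M)]
y = [1.0 + gamma1 * d + 3.0 * m + random.gauss(0, 1) for d, m in zip(D, M)]

# Naive OLS of y on D picks up part of the effect of M.
_, beta1_ols = ols(D, y)

# Step 1: regress D on Z and compute the predicted values D_hat.
a, b = ols(Z, D)
D_hat = [a + b * z for z in Z]

# Step 2: regress y on D_hat.
_, beta1_2sls = ols(D_hat, y)

print(f"OLS: {beta1_ols:.2f}  2SLS: {beta1_2sls:.2f}  true: {gamma1}")
```

In this setup the 2SLS estimate lands close to the assumed true effect while the OLS estimate does not, which is the consistency property the slide refers to.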

Uses of instrumental variables
- Simultaneity: X and Y cause each other → instrument X
- Omitted variables: X is picking up the effect of other variables which are omitted → instrument X with a variable that is not correlated with the omitted variable(s)
- Measurement error: X is not measured precisely → instrument X

Where do we find instrumental variables?
- Searching for one: hard!
- Creating one with information:
  - If everyone is eligible to participate in treatment,
  - but some have more information than others,
  - those who have more information will be more likely to participate.
  - So: provide additional information on a random basis.

Example 1: voluntary job training program
Population eligible for job training program → Random sample → Random assignment to one of two groups:

                              Standard Information    Standard + additional
                              Package only            Information Package
Monthly income 1 year later   700                     850
Take-up                       30 percent              90 percent

Question: what is the impact of the job training program?

Recall: monthly income one year later was 700 (30 percent take-up) in the group that received the Standard Information Package only, and 850 (90 percent take-up) in the group that received the Standard + additional Information Package.
Question: what is the impact of the job training program?
- It is the difference between the well-informed and the not-well-informed group,
- corrected for the differential take-up rate.
Practically: Impact = (850 - 700) / (0.90 - 0.30) = 250.
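The correction for differential take-up amounts to dividing the difference in average income between the two groups by the difference in their take-up rates. A minimal sketch of that arithmetic, using the numbers from the example (the function name `wald_estimate` and its parameter names are chosen here for illustration):

```python
def wald_estimate(y_encouraged, y_control, takeup_encouraged, takeup_control):
    """Difference in mean outcomes divided by difference in take-up rates."""
    takeup_gap = takeup_encouraged - takeup_control
    if takeup_gap == 0:
        raise ValueError("take-up rates are equal: the instrument has no first stage")
    return (y_encouraged - y_control) / takeup_gap

# Standard + additional package: income 850, take-up 90 percent.
# Standard package only:         income 700, take-up 30 percent.
impact = wald_estimate(850, 700, 0.90, 0.30)
print(f"estimated impact of training on monthly income: {impact:.0f}")
```

The guard against a zero take-up gap reflects the requirement that the instrument must actually change participation; without that, the ratio is undefined.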

Link back to the estimation formula
- Stage 1: Regress participation in training on a dummy for whether the person received the additional information package (linear model). Compute the predicted value of participation.
- Stage 2: Regress wages on the predicted value of participation.
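With a single binary instrument and no other controls, these two stages give the same number as the ratio of the income difference to the take-up difference. The sketch below checks this on simulated data. The data-generating process is hypothetical: an unobserved "motivation" that raises both take-up and income, a true effect of 250, and thresholds chosen so that take-up is about 30 and 90 percent.

```python
import random

def ols(x, y):
    """Return (intercept, slope) of a simple OLS regression of y on x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx
    return my - slope * mx, slope

def mean(values):
    return sum(values) / len(values)

random.seed(2)
true_effect = 250.0
Z, D, income = [], [], []
for _ in range(20000):
    z = 1.0 if random.random() < 0.5 else 0.0  # random extra information
    motivation = random.random()               # unobserved
    threshold = 0.1 if z == 1.0 else 0.7       # information lowers the bar
    d = 1.0 if motivation > threshold else 0.0
    Z.append(z)
    D.append(d)
    income.append(475.0 + true_effect * d + 300.0 * motivation + random.gauss(0, 50))

# Stage 1: participation on the information dummy, then predicted participation.
a, b = ols(Z, D)
D_hat = [a + b * z for z in Z]
# Stage 2: income on predicted participation.
_, tsls = ols(D_hat, income)

# Ratio of group differences (income gap over take-up gap).
income_gap = mean([y for y, z in zip(income, Z) if z == 1.0]) - mean([y for y, z in zip(income, Z) if z == 0.0])
takeup_gap = mean([d for d, z in zip(D, Z) if z == 1.0]) - mean([d for d, z in zip(D, Z) if z == 0.0])
wald = income_gap / takeup_gap

# Naive comparison of participants and non-participants, for contrast.
_, naive = ols(D, income)
print(f"2SLS: {tsls:.1f}  ratio: {wald:.1f}  naive: {naive:.1f}")
```

In this setup the naive comparison overstates the effect because participants are more motivated, while the two-stage estimate and the ratio agree with each other.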

Example 2: School autonomy in Nepal
Goal is to evaluate:
A. Autonomous school management by communities
B. School report cards
Data:
- You can include 1000 schools in the evaluation.
- Each community freely chooses to participate or not.
- School report cards are done by NGOs.
- Each community has exactly one school.
Task: design the implementation of the program so it can be evaluated, and propose a method of evaluation.

School autonomy in Nepal
Rows: instrumental variable for Intervention A (NGO visits the community to inform on procedures for transfer of the school to community management).
Columns: Intervention B (school report card intervention by NGO).

                    Report card: Yes   Report card: No   Total
NGO visit: Yes      300                300               600
NGO visit: No       200                200               400
Total               500                500               1000
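One way to produce the cell counts in this table is to randomize the NGO visit across all 1000 schools and then randomize the report card within each visit arm. The sketch below is an assumption about how such an assignment could be drawn, not the procedure actually used; school identifiers and the seed are hypothetical.

```python
import random
from collections import Counter

random.seed(42)
schools = list(range(1000))  # hypothetical school identifiers
random.shuffle(schools)      # random order, so any slice is a random subset

visit_arm, no_visit_arm = schools[:600], schools[600:]

# assignment[school] = (gets NGO visit, gets report card)
assignment = {}
for gets_visit, arm in ((True, visit_arm), (False, no_visit_arm)):
    half = len(arm) // 2
    for position, school in enumerate(arm):
        # The first half of each (already shuffled) arm receives report cards.
        assignment[school] = (gets_visit, position < half)

counts = Counter(assignment.values())
for (visit, card), count in sorted(counts.items(), reverse=True):
    print(f"NGO visit={visit!s:5}  report card={card!s:5}  schools={count}")
```

Because the visit is randomized, it can serve as an instrument for the community's own decision to take over school management, while the report card is itself randomly assigned and can be compared directly.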

Reminder and a word of caution
- $\mathrm{corr}(Z, \varepsilon) = 0$
  - If $\mathrm{corr}(Z, \varepsilon) \neq 0$: bad instrument. Problem!
  - Finding a good instrument is hard!
  - Use both theory and common sense to find one!
  - We can think of designs that yield good instruments.
- $\mathrm{corr}(Z, D) \neq 0$
  - "Weak instruments": the correlation between Z and D needs to be sufficiently strong.
  - If not, the bias stays large even for large sample sizes.
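Instrument strength is usually checked with the F-statistic of the first-stage regression of D on Z; a common rule of thumb treats values below about 10 as a warning sign. The sketch below computes that statistic for the single-instrument case without controls, on simulated data where the first-stage coefficients (1.0 and 0.01) are hypothetical. The second condition, $\mathrm{corr}(Z, \varepsilon) = 0$, cannot be checked this way because $\varepsilon$ is unobserved.

```python
import random

def first_stage_f(z, d):
    """F-statistic of a simple regression of d on z (one instrument, no controls)."""
    n = len(z)
    mz, md = sum(z) / n, sum(d) / n
    szz = sum((a - mz) ** 2 for a in z)
    sdd = sum((b - md) ** 2 for b in d)
    szd = sum((a - mz) * (b - md) for a, b in zip(z, d))
    r2 = szd ** 2 / (szz * sdd)       # R-squared of the first stage
    return r2 / (1.0 - r2) * (n - 2)  # F with 1 and n - 2 degrees of freedom

random.seed(3)
n = 2000
Z = [random.gauss(0, 1) for _ in range(n)]
noise = [random.gauss(0, 1) for _ in range(n)]

D_strong = [1.0 * z + e for z, e in zip(Z, noise)]  # Z clearly moves D
D_weak = [0.01 * z + e for z, e in zip(Z, noise)]   # Z barely moves D

f_strong = first_stage_f(Z, D_strong)
f_weak = first_stage_f(Z, D_weak)
print(f"first-stage F, strong instrument: {f_strong:.1f}")
print(f"first-stage F, weak instrument:   {f_weak:.1f}")
```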