Randomized Evaluations

Similar documents
Abdul Latif Jameel Poverty Action Lab Executive Training: Evaluating Social Programs Spring 2009

EXPERIMENTAL METHODS 1/15/18. Experimental Designs. Experiments Uncover Causation. Experiments examining behavior in a lab setting

SAMPLING AND SAMPLE SIZE

Public Policy & Evidence:

Problem Set 5 ECN 140 Econometrics Professor Oscar Jorda. DUE: June 6, Name

Lecture II: Difference in Difference and Regression Discontinuity

Introduction to Program Evaluation

Chapter 11. Experimental Design: One-Way Independent Samples Design

Propensity Score Methods for Estimating Causality in the Absence of Random Assignment: Applications for Child Care Policy Research

Sex and the Classroom: Can a Cash Transfer Program for Schooling decrease HIV infections?

SOME NOTES ON THE INTUITION BEHIND POPULAR ECONOMETRIC TECHNIQUES

An Update on BioMarin Clinical Research and Studies in the PKU Community

Randomization as a Tool for Development Economists. Esther Duflo Sendhil Mullainathan BREAD-BIRS Summer school

Why randomize? Rohini Pande Harvard University and J-PAL.

Methods of Randomization Lupe Bedoya. Development Impact Evaluation Field Coordinator Training Washington, DC April 22-25, 2013

Experimental Methods. Policy Track

Why do Psychologists Perform Research?

Methods for Addressing Selection Bias in Observational Studies

Schooling, Income, and HIV Risk: Experimental Evidence from a Cash Transfer Program.

EC Macroeconomic Theory Supplement 13 Professor Sanjay Chugh

Biostatistics and Design of Experiments Prof. Mukesh Doble Department of Biotechnology Indian Institute of Technology, Madras

Finding True Program Impacts Through Randomization

Class 1: Introduction, Causality, Self-selection Bias, Regression

The Logic of Causal Order Richard Williams, University of Notre Dame, Last revised February 15, 2015

How to Find the Poor: Field Experiments on Targeting. Abhijit Banerjee, MIT

ECON Microeconomics III

Establishing Causality Convincingly: Some Neat Tricks

WRITTEN PRELIMINARY Ph.D. EXAMINATION. Department of Applied Economics. January 17, Consumer Behavior and Household Economics.

Instrumental Variables I (cont.)

Boston Library Consortium Member Libraries

observational studies Descriptive studies

CBT+ Measures Cheat Sheet

CHAPTER I INTRODUCTION. This phenomenon happens in every society around the world. It becomes the great

t-test for r Copyright 2000 Tom Malloy. All rights reserved

[En français] For a pdf of the transcript click here.

Does Male Education Affect Fertility? Evidence from Mali

Difference-in-Differences

How to Work with the Patterns That Sustain Depression

Stat 13, Intro. to Statistical Methods for the Life and Health Sciences.

What is Down syndrome?

The Limits of Inference Without Theory

Sampling Controlled experiments Summary. Study design. Patrick Breheny. January 22. Patrick Breheny Introduction to Biostatistics (BIOS 4120) 1/34

Impact Evaluation Toolbox

Applied Econometrics for Development: Experiments II

TRANSLATING RESEARCH INTO ACTION

Impact Evaluation Methods: Why Randomize? Meghan Mahoney Policy Manager, J-PAL Global

How to Randomise? (Randomisation Design)

Glossary From Running Randomized Evaluations: A Practical Guide, by Rachel Glennerster and Kudzai Takavarasha

Handout: Instructions for 1-page proposal (including a sample)

Reading and maths skills at age 10 and earnings in later life: a brief analysis using the British Cohort Study

A NON-TECHNICAL INTRODUCTION TO REGRESSIONS. David Romer. University of California, Berkeley. January Copyright 2018 by David Romer

Quiz 4.1C AP Statistics Name:

Scientific Thinking Handbook

CASE STUDY 2: VOCATIONAL TRAINING FOR DISADVANTAGED YOUTH

Lecture Slides. Elementary Statistics Eleventh Edition. by Mario F. Triola. and the Triola Statistics Series 1.1-1

Study Design. Study design. Patrick Breheny. January 23. Patrick Breheny Introduction to Biostatistics (171:161) 1/34

BOOKLET ONE. Introduction to Behavioural Activation for Depression

Carrying out an Empirical Project

AP Psychology -- Chapter 02 Review Research Methods in Psychology

Correlation Ex.: Ex.: Causation: Ex.: Ex.: Ex.: Ex.: Randomized trials Treatment group Control group

Study on Gender in Physics

The Practice of Statistics 1 Week 2: Relationships and Data Collection

SELECTED FACTORS LEADING TO THE TRANSMISSION OF FEMALE GENITAL MUTILATION ACROSS GENERATIONS: QUANTITATIVE ANALYSIS FOR SIX AFRICAN COUNTRIES

Clever Hans the horse could do simple math and spell out the answers to simple questions. He wasn t always correct, but he was most of the time.

*Karle Laska s Sections: There is NO class Thursday or Friday! Have a great Valentine s Day weekend!

August 29, Introduction and Overview

Math 140 Introductory Statistics

UNIT. Experiments and the Common Cold. Biology. Unit Description. Unit Requirements

Self Esteem and Purchasing Behavior Part Two.

Finding the Poor: Field Experiments on Targeting Poverty Programs in in Indonesia

PLS 506 Mark T. Imperial, Ph.D. Lecture Notes: Reliability & Validity

Self-directed support

TRANSLATING RESEARCH INTO ACTION. Why randomize? Dan Levy. Harvard Kennedy School

Chapter Three: Sampling Methods

Applied Quantitative Methods II

WRITTEN ASSIGNMENT 1 (8%)

Using Vignettes to Measure the Quality of Health Care Jishnu Das (World Bank) Ken Leonard (Maryland)

Understanding Epidemics Section 2: HIV/AIDS

Technical Track Session IV Instrumental Variables

NUTRITION & HEALTH YAO PAN

15.301/310, Managerial Psychology Prof. Dan Ariely Lecture 2: The Validity of Intuitions

Analysis Plans in Economics

Chapter 1 Review Questions

Letter to the teachers

Chapter 11: Experiments and Observational Studies p 318

Chapter 7: Descriptive Statistics

Chapter Eight: Multivariate Analysis

Your Task: Find a ZIP code in Seattle where the crime rate is worse than you would expect and better than you would expect.

Scouter Support Training Participant Workbook

Introduction: Statistics, Data and Statistical Thinking Part II

Primary Objectives. Content Standards (CCSS) Mathematical Practices (CCMP) Materials

Causality and Statistical Learning

Sleep & Relaxation. Session 1 Understanding Insomnia Sleep improvement techniques Try a new technique

Measuring impact. William Parienté UC Louvain J PAL Europe. povertyactionlab.org

A Probability Puzzler. Statistics, Data and Statistical Thinking. A Probability Puzzler. A Probability Puzzler. Statistics.

EMPIRICAL STRATEGIES IN LABOUR ECONOMICS

Math 124: Module 3 and Module 4

Good Communication Starts at Home

Teaching Family and Friends in Your Community

Transcription:

Randomized Evaluations Introduction, Methodology, & Basic Econometrics using Mexico s Progresa program as a case study (with thanks to Clair Null, author of 2008 Notes) Sept. 15, 2009

Not All Correlations are Causal Relationships Far from it! There are several reasons why X & Y might be correlated: causality X Y

Not All Correlations are Causal Relationships Far from it! There are several reasons why X & Y might be correlated: causality X Y reverse causality Y X

Not All Correlations are Causal Relationships Far from it! There are several reasons why X & Y might be correlated: causality X Y reverse causality Y X simultaneity X Y and Y X

Not All Correlations are Causal Relationships Far from it! There are several reasons why X & Y might be correlated: causality X Y reverse causality Y X simultaneity X Y and Y X omitted variables / confounding Z X and Z Y

Not All Correlations are Causal Relationships Far from it! There are several reasons why X & Y might be correlated: causality X Y reverse causality Y X simultaneity X Y and Y X omitted variables / confounding Z X and Z Y spurious correlation

Practice Correlations Think about the following correlations and be skeptical of causality. Which of the other 4 types of correlations would you want to rule out before you would be confident that the relationships were causal? In the case of omitted variables, what are some Z factors that you re worried about? 1 reverse causality Y X 2 simultaneity X Y and Y X 3 omitted variables / confounding Z X and Z Y 4 spurious correlation Job applicants with names that are common among African Americans are less likely to get an interview. In a country with few women leaders, voters have low opinions of a woman s ability to lead. As the planet heats up, there are fewer and fewer pirates. The chronically ill are usually poor.

Job applicants with names that are common among African Americans are less likely to get an interview.

Job applicants with names that are common among African Americans are less likely to get an interview. For causality, would like to know that if the name was the only thing that changed, they d still be less likely to get an interview. Is it possible to design an experiment that would let us observe that?

Job applicants with names that are common among African Americans are less likely to get an interview. For causality, would like to know that if the name was the only thing that changed, they d still be less likely to get an interview. Is it possible to design an experiment that would let us observe that? Actually, yes, and it s remarkably easy. Two economists (Bertrand & Mullainathan) did it by sending out fake resumes. The results are disturbing - those with white-sounding names were 50% more likely to be called for an interview. Even worse, while the likelihood of getting an interview is increasing in a white applicant s credentials, experience & honors mattered much less for black applicants. This sort of discrimination could turn into a vicious cycle.

In a country with few women leaders, voters have low opinions of a woman s ability to lead.

In a country with few women leaders, voters have low opinions of a woman s ability to lead. For causality, would like to know if an exogenous change in the number of women leaders would lead to changes in opinions of a woman s ability to lead. Is it possible to design an experiment that would let us observe that?

In a country with few women leaders, voters have low opinions of a woman s ability to lead. India s policy of requiring that some leadership positions be filled by women gives us a natural experiment (Beaman et al.). The essential aspect of this policy (in terms of determining a causal relationship between exposure to female politicians and opinions of them), is that the local governments didn t get to choose whether or not they wanted their position to be reserved for a woman. The positions to be reserved were randomly assigned.

In a country with few women leaders, voters have low opinions of a woman s ability to lead. India s policy of requiring that some leadership positions be filled by women gives us a natural experiment (Beaman et al.). The essential aspect of this policy (in terms of determining a causal relationship between exposure to female politicians and opinions of them), is that the local governments didn t get to choose whether or not they wanted their position to be reserved for a woman. The positions to be reserved were randomly assigned. Perceptions of women leaders effectiveness did improve after randomly-assigned exposure. Almost twice as many women won unreserved positions in places where the position had been reserved for a woman in the prior two elections relative to places where the position had been reserved only once or never at all.

As the planet heats up, there are fewer and fewer pirates.

As the planet heats up, there are fewer and fewer pirates. The point here is that theory helps. If there s not some reasonable explanation, it s not very likely to be causal. (Be wary of unreasonable explanations.)

The chronically ill are usually poor.

The chronically ill are usually poor. Again, to explore causality in either direction, would want to see how health responds to an exogenous change in wealth (or vice versa). Probably going to need an experiment for this one. need to make some people rich and see if they get healthier than otherwise identical people or make some chronically ill people healthier and see if they get richer than otherwise identical chronically ill people In principle, a natural experiment could work (e.g. lottery winners? people who live near new medical facilities?), but a field experiment is a sure bet.

The Real Issue: Finding the Right Counterfactual To determine the causal effect of X on Y we want to observe what happens to Y when we change X without changing anything else (i.e. Z). In the resume experiment, this was easy to do, because the people didn t really exist. But what if we need to change X in real people s lives? (call these people the treatment group) To make the comparison we need someone else whose X didn t change but who was otherwise identical to the treatment group. (call these people the control group) We can t observe the same person both with and without the change in X (the treatment) the counterfactual is unobservable.

Randomization to the Rescue! With a large enough sample, by randomly assigning people to treatment and control groups we can make the two more or less identical (e.g. their X s & Z s should be the same on average). IMPORTANT: If people get to choose whether or not they want to be in the treatment group, then there s no way to make sure that the people in the treatment group are identical to the people in the control group. Even if they look the same in every other characteristic, the fact that the treatment people chose to be treatment people and the control people chose to be control people means that something about them was different (there s some piece of Z that we re not able to measure).

In Math... Page 5-8 of Duflo et al, using our beat-to-death textbook thing as the backstory... We want to know Y T i Yi C, this is impossible to observe.

In Math... Page 5-8 of Duflo et al, using our beat-to-death textbook thing as the backstory... We want to know Y T i Yi C, this is impossible to observe. We can however, estimate E[Yi T Yi C ] The simple easy-to-get, not-the-same-thing (why?) number is: D = E[Y T i treated Y C i control]

In Math... Page 5-8 of Duflo et al, using our beat-to-death textbook thing as the backstory... We want to know Y T i Yi C, this is impossible to observe. We can however, estimate E[Yi T Yi C ] The simple easy-to-get, not-the-same-thing (why?) number is: D = E[Y T i Yi T treated Y C i control] was a school that was observed with textbooks, so having textbooks and being treated are not identical. Add and subtract E[Yi C T ] E[Yi T T ] E[Yi C T ] E[Yi C C] + E[Yi C T ] = E[(Yi T Yi C ) T + E[Yi C T ] E[Yi C C]

More Math E[Yi T T ] E[Yi C T ] E[Yi C C] + E[Yi C T ] = E[(Yi T Yi C ) T + E[Yi C T ] E[Yi C C] First term is the treatment effect (i.e., what we want.) Second and third terms are the selection bias. It captures the difference in potential untreated outcomes between the treatment and the comparison schools; treatment schools may have had different test scores on average even if they had not been treated. Briefly, randomization solves this. Since the treatment has been randomly assigned, individuals assigned to the treatment and control groups differ in expectation only through their exposure to the treatment. Had neither received the treatment, their outcomes would have been in expectation the same. This implies that the selection bias,e[yi C T ] E[Yi C C], is equal to zero.

Add Z of Mystery For simplicity, suppose we re interested in the effect of X = 1 relative to X = 0 The outcome Y is a function of the treatment (X ) and some other characteristic (for simplicity let Z = 1 or 0) We write this relationship in mathematical notation as Y i = a + bx i + cz i + ε i where the i subscript refers to a specific person and the ε is a white noise error / disturbance term that averages out across the population What that says in words for the Progresa case study is: school attendance (Y ) depends on whether or not the child gets a scholarship (X ), the child s ability (Z), and whether or not the child woke up on the right side of the bed (ε)

Z of Mystery To measure the effect of X = 1 relative to X = 0, taking into account Z, we compare the expected values (averages) of Y conditional on the levels of X and Z E[Y X, Z] = E[a + bx + cz + ε X, Z] = E[a X, Z] + E[bX X, Z] + E[cZ X, Z] + E[ε X, Z] = a + be[x X, Z] + ce[z X, Z] + 0

Z of Mystery To measure the effect of X = 1 relative to X = 0, taking into account Z, we compare the expected values (averages) of Y conditional on the levels of X and Z E[Y X, Z] = E[a + bx + cz + ε X, Z] = E[a X, Z] + E[bX X, Z] + E[cZ X, Z] + E[ε X, Z] = a + be[x X, Z] + ce[z X, Z] + 0 So, for our 4 possible combinations of X and Z, we have E[Y X = 1, Z = 1] = a + b + c E[Y X = 1, Z = 0] = a + b E[Y X = 0, Z = 1] = a + c E[Y X = 0, Z = 0] = a and as expected, the effect of X holding Z constant is b (the effect of Z holding X constant is c)

Omitted Variable Bias What if we hadn t been able to control for ability in the relationship we were just discussing? In that case, we wouldn t be able to calculate E[Y X = 1, Z] E[Y X = 0, Z] Instead, all we d be able to calculate is E[Y X = 1] E[Y X = 0] Writing out the gory details, we d have E[Y X = 1] = ( a + b + ce[z X = 1] ) ( ) E[Y X = 0] a + 0 + ce[z X = 0] = }{{} b + c ( E[Z X = 1] E[Z X = 0] ) }{{} true effect OVB

Omitted Variable Bias E[Y X = 1] = }{{} b + c ( E[Z X = 1] E[Z X = 0] ) }{{} true effect OVB E[Y X = 0] In words, if average ability is different for the kids in the treatment & control groups, then we won t be able to separate the effect of the treatment from the effect that their different ability levels are having on their attendance rates Randomization assures us that E[Z X = 1] = E[Z X = 0] so that there is no omitted variable bias

OVB is What Ails Us OVB is the real problem. You can sometimes get clues about the sign/magnitude. Given a true model Y i = a + bx i + cz i + ε i, if you leave out Z, your estimate of b come out as b + c cov(x,z) var(x ) In a word, if there exists any variable Z that s correlated with your variable of interest X and your outcome variable Y, you re screwed. Some think that means the solution to Omitted Variable bias is to not omit anything. While that may help a little, there are always things to omit, so it s better if you can find an X that s uncorrelated (ie, random).

Proof of Randomization Can you prove that you randomized? (Can you prove that there exists no Z that s correlated with Y and your X?)

Proof of Randomization Can you prove that you randomized? (Can you prove that there exists no Z that s correlated with Y and your X?) That s a universal negative, so no, you can t prove it. But you can gather a whole bunch of Z s before the program and show they have the same average across treatment and control groups, and that might assuage some fear.

Program Design The goal is to keep kids in school. How can the government do it?

Program Design The goal is to keep kids in school. How can the government do it? Economics is all about incentives and constraints... What sort of incentives could the government provide to households? What sort of constraints could the government relax for households?

Alternatives Supply approaches Build schools near where kids live (reduce cost of getting to school) Increase school resources (teacher salaries, meals, etc.) Might not affect enrollment gap between poor & non-poor students

Alternatives Supply approaches Build schools near where kids live (reduce cost of getting to school) Increase school resources (teacher salaries, meals, etc.) Might not affect enrollment gap between poor & non-poor students Demand approach: conditional cash transfers (CCT) Subsidies targeted to poor ( means tested ) Compensate families for opportunity cost of child s labor school itself is free Minimize disincentive to work (conditioned only on pre-program income)

Program Implementation Initial census to determine eligibility status about 2/3 of households qualified Do we care at all about the non-eligible households? Monthly educational grants children enrolled in grades 3-9 conditional on 85% attendance rates (confirmed by teacher) increasing in grades and extra bonus for girls in secondary school 70-255 pesos girl in 9th grade 44% of day-laborer s monthly earnings

What is Y?

What is Y? School enrollment

What is Y? School enrollment inequality

What is Y? School enrollment inequality achievement (not mentioned in this paper)

What is Y? School enrollment inequality achievement (not mentioned in this paper) Child labor

What is Y? School enrollment inequality achievement (not mentioned in this paper) Child labor Fertility program initially for 3 years only, but if viewed as an entitlement, bigger effect among eligible mothers for teenage girls, staying in school could reduce fertility

What Z s are observable?

What Z s are observable? Community level school quality (proxied with # kids/teacher) distance to secondary school distance to urban labor market Household level farther away lower opportunity cost of time farther away less info about returns to schooling parent s education PRE-program poverty index (what problem with concurrent income level?) Child level age sex grades completed

What Z s are unobservable?

What Z s are unobservable? Community level cohesiveness local returns to schooling Household level parents preferences for schooling Child level child s preferences for schooling ability

Deep Thoughts on Randomization In general, the finer the level at which you randomize, the better (more observations to compare to one another) Is randomizing at the child level a good idea?

Deep Thoughts on Randomization In general, the finer the level at which you randomize, the better (more observations to compare to one another) Is randomizing at the child level a good idea? You can t ignore political consequences. The idea is that the researchers want to compare participants - but participants are also likely to compare themselves to one another in cases where the treatment is observable.

Deep Thoughts on Randomization In general, the finer the level at which you randomize, the better (more observations to compare to one another) Is randomizing at the child level a good idea? You can t ignore political consequences. The idea is that the researchers want to compare participants - but participants are also likely to compare themselves to one another in cases where the treatment is observable. Randomizing at the household level might solve some of these problems associated with child-level randomization, but is it the best solution? Ultimately, Progresa randomization was at the village level. Does this solve all the problems we might worry about? Localities phased-in: treatment got it starting in 1998, controls got it starting in 2000

Alternative Methods of Randomizing Randomized order of phase-in timing is essential (are you trying to measure long-term effects?) caution: does control group change behavior in expectation of eventual treatment?

Alternative Methods of Randomizing Randomized order of phase-in timing is essential (are you trying to measure long-term effects?) caution: does control group change behavior in expectation of eventual treatment? Lotteries for oversubscribed treatments caution: only identifies effect of treatment among those eligible for randomization

Alternative Methods of Randomizing Randomized order of phase-in timing is essential (are you trying to measure long-term effects?) caution: does control group change behavior in expectation of eventual treatment? Lotteries for oversubscribed treatments caution: only identifies effect of treatment among those eligible for randomization Within-group treat subgroups caution: is control group contaminated?

Alternative Methods of Randomizing Randomized order of phase-in timing is essential (are you trying to measure long-term effects?) caution: does control group change behavior in expectation of eventual treatment? Lotteries for oversubscribed treatments caution: only identifies effect of treatment among those eligible for randomization Within-group treat subgroups caution: is control group contaminated? Encouragement design treatment available to everyone, but take-up rate varies randomly assign some people encouragement to receive treatment analytically difficult (only changes probability of treatment)

Results - Enrollment Program increased schooling by about 2/3 of a year Is this a big effect?

Results - Enrollment Program increased schooling by about 2/3 of a year Is this a big effect? Depends on the base that this increase builds upon compared to average of 6.8, pretty big, actually Many kids would have gone to school anyway for those who changed enrollment status in response to program, effect is even bigger

Results - Enrollment Program increased schooling by about 2/3 of a year Is this a big effect? Depends on the base that this increase builds upon compared to average of 6.8, pretty big, actually Many kids would have gone to school anyway for those who changed enrollment status in response to program, effect is even bigger Reduced inequality between poor & non-poor kids

So, should we expand Progresa everywhere? Behrman & Todd estimate that increases in earning power from more education exceed costs of program by 40-110%! More cost effective than building schools, at least in rural Mexico (Parker & Coady) Intergenerational effect on educational attainment also important

So, should we expand Progresa everywhere? Behrman & Todd estimate that increases in earning power from more education exceed costs of program by 40-110%! More cost effective than building schools, at least in rural Mexico (Parker & Coady) Intergenerational effect on educational attainment also important But what about external validity?

Re-cap Correlation is not causation Goal of impact evaluation is to identify causal effects of X on Y Omitted variable bias is a big problem Randomization is the safest way to avoid OVB Progresa case study randomized at locality level (phased-in pilot) conditional cash transfers increased school attendance concern w/ any randomized evaluation is external validity Return