Measuring Impact. Program and Policy Evaluation with Observational Data. Daniel L. Millimet. Southern Methodist University.

Similar documents
Measuring Impact. Conceptual Issues in Program and Policy Evaluation. Daniel L. Millimet. Southern Methodist University.

Lecture II: Difference in Difference and Regression Discontinuity

Lecture II: Difference in Difference. Causality is difficult to Show from cross

Quasi-experimental analysis Notes for "Structural modelling".

1. INTRODUCTION. Lalonde estimates the impact of the National Supported Work (NSW) Demonstration, a labor

Pros. University of Chicago and NORC at the University of Chicago, USA, and IZA, Germany

Propensity Score Methods for Estimating Causality in the Absence of Random Assignment: Applications for Child Care Policy Research

Class 1: Introduction, Causality, Self-selection Bias, Regression

Instrumental Variables I (cont.)

EMPIRICAL STRATEGIES IN LABOUR ECONOMICS

Impact Evaluation Toolbox

Introduction to Program Evaluation

Instrumental Variables Estimation: An Introduction

Methods for Addressing Selection Bias in Observational Studies

Supplementary Web Appendix Transactional Sex as a Response to Risk in Western Kenya American Economic Journal: Applied Economics

Marno Verbeek Erasmus University, the Netherlands. Cons. Pros

ICPSR Causal Inference in the Social Sciences. Course Syllabus

CASE STUDY 2: VOCATIONAL TRAINING FOR DISADVANTAGED YOUTH

Dylan Small Department of Statistics, Wharton School, University of Pennsylvania. Based on joint work with Paul Rosenbaum

Quantitative Methods. Lonnie Berger. Research Training Policy Practice

Citation for published version (APA): Ebbes, P. (2004). Latent instrumental variables: a new approach to solve for endogeneity s.n.

Practical propensity score matching: a reply to Smith and Todd

Introduction to Applied Research in Economics Kamiljon T. Akramov, Ph.D. IFPRI, Washington, DC, USA

Version No. 7 Date: July Please send comments or suggestions on this glossary to

Objective: To describe a new approach to neighborhood effects studies based on residential mobility and demonstrate this approach in the context of

Causal Methods for Observational Data Amanda Stevenson, University of Texas at Austin Population Research Center, Austin, TX

1. Introduction Consider a government contemplating the implementation of a training (or other social assistance) program. The decision to implement t

Econometric analysis and counterfactual studies in the context of IA practices

The Limits of Inference Without Theory

Predicting the efficacy of future training programs using past experiences at other locations

EC352 Econometric Methods: Week 07

Cancer survivorship and labor market attachments: Evidence from MEPS data

Using register data to estimate causal effects of interventions: An ex post synthetic control-group approach

Vocabulary. Bias. Blinding. Block. Cluster sample

Methods of Randomization Lupe Bedoya. Development Impact Evaluation Field Coordinator Training Washington, DC April 22-25, 2013

Putting the Evidence in Evidence-Based Policy. Jeffrey Smith University of Michigan

Identifying Mechanisms behind Policy Interventions via Causal Mediation Analysis

Ec331: Research in Applied Economics Spring term, Panel Data: brief outlines

Recent advances in non-experimental comparison group designs

TRANSLATING RESEARCH INTO ACTION

A note on evaluating Supplemental Instruction

Methods of Reducing Bias in Time Series Designs: A Within Study Comparison

Arturo Gonzalez Public Policy Institute of California & IZA 500 Washington St, Suite 800, San Francisco,

Empirical Strategies

Propensity Score Methods with Multilevel Data. March 19, 2014

Carrying out an Empirical Project

Impact Evaluation Methods: Why Randomize? Meghan Mahoney Policy Manager, J-PAL Global

What is Multilevel Modelling Vs Fixed Effects. Will Cook Social Statistics

Causal Inference Course Syllabus

Public Policy & Evidence:

Multiple Mediation Analysis For General Models -with Application to Explore Racial Disparity in Breast Cancer Survival Analysis

Reassessing Quasi-Experiments: Policy Evaluation, Induction, and SUTVA. Tom Boesche

Difference-in-Differences

Introduction to Observational Studies. Jane Pinelis

BIOSTATISTICAL METHODS

TRANSLATING RESEARCH INTO ACTION. Why randomize? Dan Levy. Harvard Kennedy School

Keywords: causal channels, causal mechanisms, mediation analysis, direct and indirect effects. Indirect causal channel

Methodological requirements for realworld cost-effectiveness assessment

Regression Discontinuity Analysis

Hazardous or Not? Early Cannabis Use and the School to Work Transition of Young Men

Anna Spurlock, Peter Cappers, Ling Jin, Annika Todd, LBNL Patrick Baylis, University of California at Berkeley

Chapter 11. Experimental Design: One-Way Independent Samples Design

Can Quasi Experiments Yield Causal Inferences? Sample. Intervention 2/20/2012. Matthew L. Maciejewski, PhD Durham VA HSR&D and Duke University

Problems to go with Mastering Metrics Steve Pischke

Estimating population average treatment effects from experiments with noncompliance

Effects of propensity score overlap on the estimates of treatment effects. Yating Zheng & Laura Stapleton

Going It Alone? A Smartphone Study of Social Connectivity and Employment after Prison Naomi F. Sugie

Propensity Score Analysis Shenyang Guo, Ph.D.

The Dynamic Effects of Obesity on the Wages of Young Workers

Health, Mental Health and Labor Productivity: The Role of Self-Reporting Bias

CAUSAL EFFECT HETEROGENEITY IN OBSERVATIONAL DATA

Why randomize? Rohini Pande Harvard University and J-PAL.

THE ROLE OF UNEMPLOYMENT INSURANCE ON ALCOHOL USE AND ABUSE FOLLOWING JOB LOSS ROBERT LANTIS AND BRITTANY TEAHAN

The Economics of Obesity

Profile of DeKalb County

Empirical Methods in Economics. The Evaluation Problem

Propensity Score Matching with Limited Overlap. Abstract

Empirical Tools of Public Finance. 131 Undergraduate Public Economics Emmanuel Saez UC Berkeley

Constructing AFQT Scores that are Comparable Across the NLSY79 and the NLSY97. Joseph G. Altonji Prashant Bharadwaj Fabian Lange.

Establishing Causality Convincingly: Some Neat Tricks

Impact of Eliminating Medicaid Adult Dental Coverage in California on Emergency. Department Use for Dental Problems

Book review of Herbert I. Weisberg: Bias and Causation, Models and Judgment for Valid Comparisons Reviewed by Judea Pearl

Causal Inference in Observational Settings

Study Design. Svetlana Yampolskaya, Ph.D. Summer 2013

Correlation Ex.: Ex.: Causation: Ex.: Ex.: Ex.: Ex.: Randomized trials Treatment group Control group

Factor analysis of alcohol abuse and dependence symptom items in the 1988 National Health Interview survey

A NON-TECHNICAL INTRODUCTION TO REGRESSIONS. David Romer. University of California, Berkeley. January Copyright 2018 by David Romer

Challenges of Observational and Retrospective Studies

Education, Information, and Improved Health: Evidence from Breast Cancer Screening

Subjects Motivations

The Employment Retention and Advancement Project

DOING SOCIOLOGICAL RESEARCH C H A P T E R 3

Evaluating Social Programs Course: Evaluation Glossary (Sources: 3ie and The World Bank)

Regression Discontinuity Design

PREVENTIVE HEALTH BEHAVIORS AMONG THE ELDERLY

Identification with Models and Exogenous Data Variation

Unlike standard economics, BE is (most of the time) not based on rst principles, but on observed behavior of real people.

UNIT I SAMPLING AND EXPERIMENTATION: PLANNING AND CONDUCTING A STUDY (Chapter 4)

Transcription:

Measuring mpact Program and Policy Evaluation with Observational Data Daniel L. Millimet Southern Methodist University 23 May 2013 DL Millimet (SMU) Observational Data May 2013 1 / 23

ntroduction Measuring causal impacts is best understood using the Rubin Causal Model and the idea of potential outcomes Recall, the following notation: D = 0, 1 denotes treatment assignment for a given subject Y (0) denotes the outcome that would arise for the subject if D = 0 Y (1) denotes the outcome that would arise for the subject if D = 1 The subject-speci c treatment is given by τ = Y (1) Y (0) This causal e ect for a given subject is never observed because either Y (0) or Y (1) is always missing DL Millimet (SMU) Observational Data May 2013 2 / 23

Due to the problem of the missing counterfactual, any statement concerning the causal e ect of a treatment, D, must rely on a set of assumptions in order to estimate the missing counterfactual Randomization is one method to estimate the missing counterfactual Under the assumption of random assignment: The average outcome of the control group is a valid (unbiased) estimate of the missing Y (0) for subjects in the treatment group The average outcome of the treatment group is a valid (unbiased) estimate of the missing Y (1) for subjects in the control group The average di erence in outcome across the treatment and control groups is a valid (unbiased) estimate of the average treatment e ect Thus, randomization allows us to obtain valid estimates of the missing counterfactual by using other subjects assigned to a di erent treatment assignment This succeeds because randomization ensures that the treatment and control groups are identical in expectation in all respects except for the treatment DL Millimet (SMU) Observational Data May 2013 3 / 23

As mentioned earlier, randomization is not always feasible n such cases, researchers utilize survey data in an attempt to measure causal impacts of treatments Researchers may collect the data themselves or utilize data collected by others Census Current Population Survey (CPS) Survey of ncome and Program Participation (SPP) Panel Study of ncome Dynamics (PSD) National Longitudinal Surveys (NLS) National Center for Education Statistics (NCES) Survey data is referred to as observational data to denote the fact there is no randomization; subjects are simply observed in the real world DL Millimet (SMU) Observational Data May 2013 4 / 23

Question turns to how to obtain valid estimates of the missing counterfactual with observational data With observational data, there are two options for estimating the missing counterfactual 1 Cross-Sectional Approach: Use outcomes of other subjects with a di erent treatment assignment observed at the same point in time Example: End-of-year survey of students in DW with data on participation in SBP and school performance 2 Longitudinal Approach: Use outcomes of same subject with a di erent treatment assignment observed at a di erent point in time Example: Repeated end-of-year surveys of students in DW with data on participation in SBP and school performance Two approaches have di erent data requirements irst approach requires cross-sectional data Data on numerous subjects collected at a single point in time Second approach requires longitudinal or panel data Data on numerous subjects collected at at least two points in time DL Millimet (SMU) Observational Data May 2013 5 / 23

Main di culty under either approach is that participation in the treatment is not randomized nstead subjects self-select into the treatment and control groups Cross-sectional approach Outcomes of other subjects with a di erent treatment assignment may be a poor approximation of the outcome that a given subject would have experienced due to di erences between subjects in attributes that lead some to self-select into the treatment and others to self-select into the control Longitudinal approach Outcomes of the same subject at a di erent point in time under a di erent treatment assignment may be a poor approximation of the outcome a given subject would have experienced today due to di erences over time in attributes that lead the subject to self-select into the treatment at one time but not another DL Millimet (SMU) Observational Data May 2013 6 / 23

Cross-Sectional Approaches ntroduction Estimation of causal e ects based on cross-sectional approaches uses outcomes of other subjects to estimate missing counterfactuals Many statistical techniques are available based on di erent assumptions concerning how subjects self-select into treatment assignment 1 Random selection Subjects do not self-select on the basis of any individual attributes correlated with potential outcomes 2 Selection on observed attributes only Subjects do not self-select on the basis of any individual attributes correlated with potential outcomes but unobserved by the researcher Subjects may self-select on the basis of individual attributes correlated with potential outcomes and observed by the researcher 3 Selection on unobserved attributes Subjects may self-select on the basis of individual attributes correlated with potential outcomes and unobserved by the researcher DL Millimet (SMU) Observational Data May 2013 7 / 23

Example: Question: What is the causal e ect of SBP participation on child obesity? Random selection Children decide to participate based only on convenience (and factors a ecting convenience are unrelated to child health) Selection on observed attributes only Children decide to participate based only on convenience, family SES status, and other attributes included in the survey Selection on unobserved attributes Children decide to participate based on convenience, family SES status, other attributes included in the survey, and nutritional knowledge and/or other attributes not measured in the survey DL Millimet (SMU) Observational Data May 2013 8 / 23

Cross-Sectional Approaches Estimation Random Selection: Analogous to random experiments Randomization just arises naturally ) natural experiments Casual e ects estimated using the average di erence in outcomes across subjects in the treatment and control groups DL Millimet (SMU) Observational Data May 2013 9 / 23

Example: E ects of children on labor market outcomes Angrist & Evans (1998) assess the causal e ect of children on the labor supply of men and women Twins represent a random occurrence of an extra child ind that an extra child reduces female labor supply, but not male labor supply DL Millimet (SMU) Observational Data May 2013 10 / 23

Selection on Observed Attributes Only: Analogous to a random experiment conditional on the set of observed attributes that are correlated with treatment assignment and potential outcomes Thus, estimation methods control for this set of observables Often referred to as quasi-experimental methods Causal e ects estimated using the average di erence in outcomes after removing the confounding e ects of the set of observables Observed attributes, denoted by X, may be controlled for in many ways 1 Regression (Ordinary Least Squares) 2 Matching 3 Propensity Score Methods DL Millimet (SMU) Observational Data May 2013 11 / 23

Matching The missing counterfactual for a given subject is the average outcome of other subjects with di erent treatment assignment but identical attributes, X This yields an estimated treatment e ect for each subject Causal e ect estimated using the average subject-speci c treatment e ect Other e ects may be estimated by averaging over di erent subsets of subjects (e.g., race or gender groups) ntuition is easy to explain to policymakers, boards, others: conditional on X, data mimics a random experiment DL Millimet (SMU) Observational Data May 2013 12 / 23

Example: National Supported Work (NSW) Demonstration Project Randomized job training program conducted from 1975-79 Program gave hard-to-employ individuals jobs in a supportive, but performance-based, work environment for 12-18 months Eligibility based on ex-convicts, ex-drug addicts, long-term ADC recipients, and HS dropouts Among those eligible, subjects randomized into treatment or control (no job) groups Dehejia and Wahba (1999) re-evaluate the program using matching methods by taking the original data on the treatment group (and discarding the original controls group) and using survey data to form the new control group Matching is able to replicate the experimental results DL Millimet (SMU) Observational Data May 2013 13 / 23

Selection on Unobserved Attributes Only: Prior methods are not valid as conditioning on X is insu cient Need to control for, say, X and W, but W is not observed Estimation is very di cult Unfortunately, this situation is often confronted E ects of education: self-selection based on innate ability E ects of private schooling: self-selection based on ability, motivation E ects of training: self-selection based on motivation, reliability, work ethic E ects of SBP: self-selection based on neighborhood quality, home nutritional environment Estimation methods 1 nstrumental Variables 2 Regression Discontinuity DL Millimet (SMU) Observational Data May 2013 14 / 23

nstrumental Variables Most common technique used by empirical economists Requires data on an attribute of subjects, denoted by Z, that 1 A ects the decision to self-select into the treatment or control group 2 But is uncorrelated with potential outcomes 3 And, hence, the unobserved attributes that a ect the decisions of subjects to self-select into the treatment or control group and correlated with potential outcomes DL Millimet (SMU) Observational Data May 2013 15 / 23

Example: E ects of private schools Evans & Schwab (1995) assess the causal e ect of attending a Catholic HS on HS graduation and college enrollment Control group includes students attending public schools Use Catholic upbringing as an instrument ind that Catholic schools cause higher graduation and college enrollment rates DL Millimet (SMU) Observational Data May 2013 16 / 23

Regression Discontinuity Applicable to situations where eligibility for participation is on the basis on some observed attribute: a score, S, relative to cut-o value, S Subjects just above and below S are nearly identical in all respects except for eligibility The missing counterfactual for a given subject is the average outcome of other subjects on the other side of S This yields an estimated treatment e ect for each subject Causal e ect estimated using the average subject-speci c treatment e ect ntuition is easy to explain to policymakers, boards, others: conditional on being close to S, data mimics a random experiment DL Millimet (SMU) Observational Data May 2013 17 / 23

Example: E ects of Medicaid De La Mata (2012) assesses the causal e ect of Medicaid on take-up, private H coverage, child health Use a RD method to compare households just above and below income cuto s that determine eligibility ind that Medicaid crowds out private H, increases preventive care, and has no impact on child health (as measured by self-reported health, obesity status, and school absences) DL Millimet (SMU) Observational Data May 2013 18 / 23

Longitudinal Approaches Estimation of causal e ects based on longitudinal approaches uses outcomes of the same subject at di erent times to estimate missing counterfactuals Two main approaches 1 Di erences model 2 Di erences-in-di erences model DL Millimet (SMU) Observational Data May 2013 19 / 23

Di erences Model: Casual e ects estimated using the average di erence in outcomes over time across subjects during the period of treatment and period of control While the same subject is used to estimate the missing counterfactual, it is still an estimate since the two outcomes are not measured at the same time Approximation may be poor if attributes of the subject that are correlated with potential outcomes also change over time f the only relevant attributes that change over time are observed by the researcher, then this is a case of selection on observed attributes only f at least some relevant attributes that change over time are unobserved by the researcher, then this is a case of selection on unobserved attributes DL Millimet (SMU) Observational Data May 2013 20 / 23

Di erences-in-di erences Model: f at least some attributes that change over time and are correlated with potential outcomes are unobserved by the researcher, an alternative strategy is based on DiD Casual e ects estimated using the average di erence between the average change in outcomes over time across subjects whose treatment status changes and the average change across subjects who are never treated Requires the relevant unobserved attributes that change over time to be uncorrelated with treatment assignment DL Millimet (SMU) Observational Data May 2013 21 / 23

Example: E ects of the minimum wage Card & Krueger (1994) assess the causal e ect of minimum wage hike in NJ in 1992 Compare the change in employment at fast food restaurants in NJ to the change in employment at similar restaurants just across the PA border ind no adverse impact of the minimum wage hike DL Millimet (SMU) Observational Data May 2013 22 / 23

Conclusion Estimation of causal e ects using observational data must rely on assumptions that are often untestable Questions to ask: 1 What type of selection process is assumed by the researcher? 2 Are there possible unobserved confounders that may invalidate the estimation procedure? Other issues to remember 1 Causal e ects are de ned with respect to the outcome being assessed; other outcomes may yield di erent e ects of a program or policy 2 Many di erent types of causal e ects can be estimated when treatment e ects are heterogeneous Average e ects for di erent groups Distributional e ects 3 With observational data, measurement error is commonplace and often consequential (Millimet 2011) DL Millimet (SMU) Observational Data May 2013 23 / 23