EMPIRICAL STRATEGIES IN LABOUR ECONOMICS

Similar documents
Empirical Strategies

Empirical Strategies 2012: IV and RD Go to School

Empirical Strategies

Mastering Metrics: An Empirical Strategies Workshop. Master Joshway

Pros. University of Chicago and NORC at the University of Chicago, USA, and IZA, Germany

Practical propensity score matching: a reply to Smith and Todd

Class 1: Introduction, Causality, Self-selection Bias, Regression

Introduction to Applied Research in Economics Kamiljon T. Akramov, Ph.D. IFPRI, Washington, DC, USA

Marno Verbeek Erasmus University, the Netherlands. Cons. Pros

1. INTRODUCTION. Lalonde estimates the impact of the National Supported Work (NSW) Demonstration, a labor

Applied Quantitative Methods II

Effects of propensity score overlap on the estimates of treatment effects. Yating Zheng & Laura Stapleton

Carrying out an Empirical Project

Citation for published version (APA): Ebbes, P. (2004). Latent instrumental variables: a new approach to solve for endogeneity s.n.

ICPSR Causal Inference in the Social Sciences. Course Syllabus

The Limits of Inference Without Theory

What is: regression discontinuity design?

Randomization as a Tool for Development Economists. Esther Duflo Sendhil Mullainathan BREAD-BIRS Summer school

Instrumental Variables I (cont.)

Estimating average treatment effects from observational data using teffects

Propensity Score Analysis Shenyang Guo, Ph.D.

Syllabus.

Problem Set 5 ECN 140 Econometrics Professor Oscar Jorda. DUE: June 6, Name

EC352 Econometric Methods: Week 07

Establishing Causality Convincingly: Some Neat Tricks

Jake Bowers Wednesdays, 2-4pm 6648 Haven Hall ( ) CPS Phone is

Predicting the efficacy of future training programs using past experiences at other locations

1. Introduction Consider a government contemplating the implementation of a training (or other social assistance) program. The decision to implement t

A Guide to Quasi-Experimental Designs

Causal Inference Course Syllabus

Write your identification number on each paper and cover sheet (the number stated in the upper right hand corner on your exam cover).

Using register data to estimate causal effects of interventions: An ex post synthetic control-group approach

Quasi-experimental analysis Notes for "Structural modelling".

Ec331: Research in Applied Economics Spring term, Panel Data: brief outlines

Measuring Impact. Program and Policy Evaluation with Observational Data. Daniel L. Millimet. Southern Methodist University.

Causal Methods for Observational Data Amanda Stevenson, University of Texas at Austin Population Research Center, Austin, TX

Causal Validity Considerations for Including High Quality Non-Experimental Evidence in Systematic Reviews

Identification with Models and Exogenous Data Variation

A NON-TECHNICAL INTRODUCTION TO REGRESSIONS. David Romer. University of California, Berkeley. January Copyright 2018 by David Romer

Propensity Score Matching with Limited Overlap. Abstract

What Are We Weighting For?

MEA DISCUSSION PAPERS

Econometric Game 2012: infants birthweight?

Score Tests of Normality in Bivariate Probit Models

Introduction to Econometrics

Methods for Addressing Selection Bias in Observational Studies

Dylan Small Department of Statistics, Wharton School, University of Pennsylvania. Based on joint work with Paul Rosenbaum

Regression Discontinuity Suppose assignment into treated state depends on a continuously distributed observable variable

Putting the Evidence in Evidence-Based Policy. Jeffrey Smith University of Michigan

Key questions when starting an econometric project (Angrist & Pischke, 2009):

A note on evaluating Supplemental Instruction

NBER WORKING PAPER SERIES BETTER LATE THAN NOTHING: SOME COMMENTS ON DEATON (2009) AND HECKMAN AND URZUA (2009) Guido W. Imbens

Confounding by indication developments in matching, and instrumental variable methods. Richard Grieve London School of Hygiene and Tropical Medicine

Threats and Analysis. Shawn Cole. Harvard Business School

Group Work Instructions

Arturo Gonzalez Public Policy Institute of California & IZA 500 Washington St, Suite 800, San Francisco,

Reading and maths skills at age 10 and earnings in later life: a brief analysis using the British Cohort Study

Empirical Methods in Economics. The Evaluation Problem

Introduction to Applied Research in Economics

INTRODUCTION TO ECONOMETRICS (EC212)

Econometric Evaluation of Health Policies

Structural vs. Atheoretic Approaches to Econometrics

THE WAGE EFFECTS OF PERSONAL SMOKING

Causal Effect Heterogeneity

Modeling Selection Effects Draft 21 January 2005

Structural vs. Atheoretic Approaches to Econometrics

Web Appendix Index of Web Appendix

This article appeared in a journal published by Elsevier. The attached copy is furnished to the author for internal non-commercial research and

Lennart Hoogerheide 1,2,4 Joern H. Block 1,3,5 Roy Thurik 1,2,3,6,7

Cancer survivorship and labor market attachments: Evidence from MEPS data

I INTRODUCTION. Paper presented at the Fourteenth Annual Conference of the Irish Economic Association.

Threats and Analysis. Bruno Crépon J-PAL

Comparing Experimental and Matching Methods using a Large-Scale Field Experiment on Voter Mobilization

Do not copy, post, or distribute

Technical Track Session IV Instrumental Variables

Program Evaluations and Randomization. Lecture 5 HSE, Dagmara Celik Katreniak

Instrumental Variables Estimation: An Introduction

Introduction to Observational Studies. Jane Pinelis

Early Release from Prison and Recidivism: A Regression Discontinuity Approach *

NBER WORKING PAPER SERIES A CAUTIONARY NOTE ON USING INDUSTRY AFFILIATION TO PREDICT INCOME. Jörn-Steffen Pischke Hannes Schwandt

August 29, Introduction and Overview

Are Illegal Drugs Inferior Goods?

Session 1: Dealing with Endogeneity

Propensity Score Methods for Estimating Causality in the Absence of Random Assignment: Applications for Child Care Policy Research

Lecture II: Difference in Difference. Causality is difficult to Show from cross

LOCAL INSTRUMENTS, GLOBAL EXTRAPOLATION: EXTERNAL VALIDITY OF THE LABOR SUPPLY-FERTILITY LOCAL AVERAGE TREATMENT EFFECT

NBER TECHNICAL WORKING PAPER SERIES USING RAMDOMIZATION IN DEVELOPMENT ECONOMICS RESEARCH: A TOOLKIT. Esther Duflo Rachel Glennerster Michael Kremer

Estimating Heterogeneous Treatment Effects with Observational Data

Evaluating the Regression Discontinuity Design Using Experimental Data

The Dynamic Effects of Obesity on the Wages of Young Workers

Assessing Studies Based on Multiple Regression. Chapter 7. Michael Ash CPPA

The Prevalence of HIV in Botswana

Reassessing Quasi-Experiments: Policy Evaluation, Induction, and SUTVA. Tom Boesche

Gender and Generational Effects of Family Planning and Health Interventions: Learning from a Quasi- Social Experiment in Matlab,

Preliminary Draft. The Effect of Exercise on Earnings: Evidence from the NLSY

Cross-Lagged Panel Analysis

Rajeev Dehejia* Experimental and Non-Experimental Methods in Development Economics: A Porous Dialectic

DRAFT. Human capital and growth

Transcription:

EMPIRICAL STRATEGIES IN LABOUR ECONOMICS University of Minho J. Angrist NIPE Summer School June 2009 This course covers core econometric ideas and widely used empirical modeling strategies. The main theoretical ideas are illustrated with examples. Our text is Mostly Harmless Econometrics by Angrist and Pischke. The outline below is keyed to the MHE table of contents. In a short course, we have to pick and choose our topics, even more than usual. I plan to lecture on Chapters 3, 4, and 8 from MHE, emphasizing advanced regression topics, instrumental variables with heterogeneous potential outcomes, and clustering and related standard errors problems. This 3-day course consists of three 90 minute lectures per day. We ll break briefly in the morning for coffee and in the early afternoon for lunch (well, lunch for me, anyway). In between lectures, I d be happy to talk with you about your empirical projects. Feel free to interrupt me with questions in class I ll be asking you questions too! OUTLINE Chapters 1 and 2: Questions About Questions and The Experimental Ideal. This easy-to-read background material sets the stage for the more technical material to come. Please look at this on your own. Chapter 3: Making Regression Make Sense Lecture Note I: Regression and the CEF Regression approximates the Conditional Expectation Function (CEF) Review of large sample theory for OLS estimates Why regression is called regression and what regression-to-the-mean means Lecture Note II: Causal Regression (our main occupation); Regression vs. Matching Linking a regression model to a causal model; Conditional independence assumptions Omitted variables bias Bad control Matching to estimate the effect of treatment on the treated Theoretical comparison of regression and matching Extras (time-permitting) Bad control Even more on regression and matching Limited dependent variables and marginal effects 1

Lecture Note III: Training Programs and the Propensity Score The propensity score theorem The Lalonde/Dehejia-Wahba/Smith-Todd controversy Why I think the propensity score is useful but not special Chapter 4: Instrumental Variables in Action Lecture Note IVa: Constant-effects models IV and omitted variables bias: using 2SLS to estimate a long regression without controls The Wald estimator and grouped data Two-sample IV and related methods IV details (part 1) The bias of 2SLS Lecture Note IVb: Instrumental variables with heterogeneous potential outcomes Local average treatment effects; internal vs. external validity The compliers concept; identification of effects on the treated and ATE IV in randomized trials Counting and characterizing compliers Generalizing LATE Models with variable treatment intensity IV details (part 2; time-permitting) 2SLS mistakes Limited dependent variables reprise Chapter 8: Nonstandard standard errors issues The bias of robust standard errors Why robust s.e.s are biased A simple example Clustering and serial correlation in panels Clustering and the Moulton problem Serial correlation in panels and differences-in-differences models Fewer than 42 clusters. 2

EMPIRICAL STRATEGIES IN LABOUR ECONOMICS University of Minho J. Angrist NIPE Summer School June 2009 READINGS J.D. Angrist and J.S. Pischke, Mostly Harmless Econometrics: An Empiricists Companion, Princeton University Press, 2008. The reading list is keyed to the MHE table of contents. An asterisk denotes material in the reading packet distributed to students. Published journal articles should be available via JSTOR. 1-2. INTRODUCTION AND BACKGROUND MHE, Chapters 1-2. 3. REGRESSION 3.1 Regression and the CEF; Review of large-sample theory MHE, Section 3.1 * G. Chamberlain, Panel Data, Chapter 22 in The Handbook of Econometrics, Volume II, Amsterdam: North-Holland, 1983. 3.2 Regression and Causality MHE Section 3.2 * J. Angrist and A. Krueger, Empirical Strategies in Labor Economics, Chapter 23 in O. Ashenfelter and D. Card, eds., The Handbook of Labor Economics, Volume III, North Holland, 1999 (esp. Sections 2.1-2.2). 3.3 Regression and matching MHE Chapter 3.3.1 J. Angrist, "Estimating the Labor Market Impact of Voluntary Military Service Using Social Security Data on Military Applicants, Econometrica, March 1998. A. Abadie and G. Imbens, Large Sample Properties of Matching Estimators for Average Treatment Effects, Econometrica 74(1), 2006, 235-267. G. Imbens, Nonparametric Estimation of Average Treatment Effects under Exogeneity: A Review, The Review of Economics and Statistics, 86(1), 2004. 3

The propensity score MHE Section 3.3.2-3.3.3 O. Ashenfelter, Estimating the Effect of Training Programs on Earnings, The Review of Economics and Statistics 60 (1978), 47-57. O. Ashenfelter and D. Card, "Using the Longitudinal Structure of Earnings to Estimate the Effect of Training Programs on Earnings," The Review of Economics and Statistics 67 (1985):648-66. R. LaLonde, "Evaluating the Econometric Evaluations of Training Programs with Experimental Data," American Economic Review 76 (September 1986): 604-620. J. Heckman and J. Hotz, "Choosing Among Alternative Nonexperimental Methods for Estimating the Impact of Social programs: The Case of Manpower Training," JASA 84 (1989): 862-8. J. Heckman, H. Ichimura, and P. Todd, Matching as an Econometric Evaluation Estimator: Evidence from Evaluating a Job Training Programme, The Review of Economic Studies 64, October 1997. P. Rosenbaum and R. Rubin, Reducing Bias in Observational Studies Using Subclassification on the Propensity Score, JASA 79[387], September 1984, 516-524. Rosenbaum, P. R. And D. B. Rubin, 1983, The Central Role of the Propensity Score in Observational Studies for Causal Effects, Biometrika 70[1], April 1983, 41-55. R. Dehejia and S. Wahba, "Causal Effects in Nonexperimental Studies: Re-evaluating the Evaluation of Training Programs," JASA 94 (Sept. 1999). J. Smith and P. Todd, Does Matching Overcome LaLonde s Critique of Nonexperimental Estimators? Journal of Econometrics, 2005(1-2). J. Hahn, On the Role of the Propensity Score in Efficient Estimation of Average Treatment Effects, Econometrica 66, March 1998. J. Angrist and J. Hahn, When to Control for Covariates? Panel-Asymptotic Results for Estimates of Treatment Effects, Review of Economics and Statistics, February 2004. K. Hirano, G. Imbens, and G. Ridder, Efficient Estimation of Average Treatment Effects Using the Estimated Propensity Score, Econometrica 71(4), 2003. 4. INSTRUMENTAL VARIABLES 4.1 2SLS with constant effects; the Wald estimator, grouped data MHE Section 4.1 J. Angrist and A. Krueger, Instrumental Variables and the Search for Identification, Journal of Economic Perspectives, Fall 2001. J. Angrist, Grouped Data Estimation and Testing in Simple Labor Supply Models, Journal of Econometrics, February/March 1991. J. Angrist, "Lifetime Earnings and the Vietnam Era Draft Lottery: Evidence from Social Security Administrative Records," American Economic Review, June 1990. * J. Angrist and S. Chen, Long-Term Economic Consequences of Vietnam-Era Conscription: Schooling, Experience and Earnings, IZA Discussion Paper, August 2008. 4

4.3 Two-Sample IV and related estimators MHE Section 4.3 J. Angrist and A. Krueger, The Effect of Age at School Entry on Educational Attainment: An Application of Instrumental Variables with Moments from Two Samples, JASA 87 (June 1992). J. Angrist and A. Krueger, Split-Sample Instrumental Variables Estimates of the Returns to Schooling, JBES, April 1995. * Inoue, Atsushi and G. Solon, Two-Sample Instrumental Variables Estimators, manuscript, forthcoming in The Review of Economics and Statistics. 4.6.4 The bias of 2SLS MHE Section 4.6.4 J. Angrist, G. Imbens, and A. Krueger, Jackknife Instrumental Variables Estimation, Journal of Applied Econometrics 14(1), 57-67. Flores-Lagunes, Alfonso, Finite-Sample Evidence on IV Estimators with Weak Instruments, Journal of Applied Econometrics 22, 2007, 677-694 4.4 Instrumental variables with heterogeneous potential outcomes MHE Section 4.4 G. Imbens and J. Angrist, Identification and Estimation of Local Average Treatment Effects, Econometrica, March 1994. J. Angrist, G. Imbens, and D. Rubin, Identification of Causal Effects Using Instrumental Variables, with comments and rejoinder, JASA, 1996. J. Angrist and A. Krueger, "Does Compulsory Schooling Attendance Affect Schooling and Earnings?," Quarterly Journal of Economics 106, November 1991, 979-1014. J. Angrist, Treatment Effect Heterogeneity in Theory and Practice, The Economic Journal 114, March 2004, C52-C83. 4.5 Generalizing LATE MHE Section 4.5.3 variable treatment intensity J. Angrist and G. Imbens, Two-Stage Least Squares Estimation of Average Causal Effects in Models with Variable Treatment Intensity, JASA, June 1995. * J. Angrist, V. Lavy, and Analia Schlosser, Multiple Experiments for the Causal Link Between the Quantity and Quality of Children, MIT Working Paper 06-26, September 2006. 4.6 IV details Limited dependent variables reprise MHE Section 4.6.3 J. Angrist, Estimation of Limited Dependent Variable Models with Dummy Endogenous Variables: Simple Strategies for Empirical Practice, JBES 19(1), January 2001. 5

8. NON-STANDARD STANDARD ERRORS ISSUES MHE, Chapter 8 A. Chesher and I. Jewitt, The Bias of a Heteroskedasticity-Consistent Covariance Matrix Estimator, Econometrica 55, September 1987. Moulton, Brent. 1986. "Random Group Effects and the Precision of Regression Estimates", Journal of Econometrics, 32, pp. 385-397. Bertrand, Marianne, Esther Duflo, and Sendhil Mullainathan, "How Much Should We Trust Differences-in-Differences Estimates?," QJE 119 (February 2004), 249-275. 6

EMPIRICAL STRATEGIES IN LABOUR ECONOMICS University of Minho J. Angrist NIPE Summer School June 2009 PROBLEMS 1. Discuss the relationship between regression and matching, as described below: a. Suppose all covariates are discrete and you are trying to estimate a treatment effect conditional on covariates. Prove that if the regression model for covariates is saturated, then matching and regression estimates will estimate the same parameter (i.e., have the same plim) in either of the following two cases: (i) treatment effects are independent of covariates; (ii) treatment assignment is independent of covariates. b. Propose a weighted matching estimator that estimates the same thing as regression. c. Why might you prefer regression estimates over matching estimates, even if you are primarily interested in the effect of treatment on the treated? d. (extra credit) Calculate matching and regression estimates in the empirical application of your choice. Discuss the difference between the two estimates with the aid of a figure like the one used in Angrist (1998) for this purpose. 2. You are interested in estimating a regression of log wages, y i, on years of schooling, s i, while controlling for another variable related to schooling and earnings that we will call a i. Consider the following regression: y i = $ + Ds i + a i ( +, i (1) Assume that the regression coefficients $, D and ( are defined such that, i is uncorrelated with s i and a i. a. Suppose you estimate a bivariate regression of y i on s i instead. What is the plim of the coefficient on s i in terms of the parameters in equation (1)? When does the "short regression" estimate of D equal the "long regression" estimate? b. Why is the long regression more likely to have a causal interpretation? Or is it? 3. Consider using information on quarter of birth, Q i (= 1, 2, 3, 4), as an instrument for equation (1) when a i is unobserved. You are trying to use an instrument to get the long-regression D in a sample of men born (say) in 1930-39. a. What is the rationale for using Q i as an instrument? b. Show that using z i = 1[Q i =1] plus a constant as an instrument for a bivariate regression of y i on s i produces a "Wald estimate" of D based on comparisons by quarter of birth. Given the rationale in (a), is this estimator consistent for D in equation (1)? c. Suppose that the omitted variable of interest, a i, is still unobserved but we know that it is the age of i measured in quarters. What is the plim of the Wald estimator in this case? Can you sign the bias of the Wald estimator? 7

d. Suppose that instead of using z i, you use Q i itself as an instrument. Show that the resulting estimator is not consistent either (continuing to assume a i is omitted and equal to age in quarters). Can you use the two inconsistent estimators (Wald and IV using Q i ) to produce a consistent estimate of D? e. Now suppose that a i is observed and included in your model. Explain when and how you can consistently estimate D by 2SLS using 3 quarter of birth dummies, z 1i = 1[Q i =1], z 2i = 1[Q i =2], and z 3i = 1[Q i =3], plus a constant as the excluded instruments. f. As an alternative to 2SLS, consider using a dummy for middle quarters,: z im = I[Q i =2 or Q i =3], plus a constant as instruments for a bivariate regression of y i on s i. Show that this also produces a consistent estimate of D when Q i is uniformly distributed (still assuming that a i is age in quarters). Explain why this strategy works. On what basis might you choose between these alternative estimators? g. Suppose the equation of interest includes a quadratic function of age in quarters: y i = $ + Ds i + a i ( 0 + a i 2 ( 1 +, i (2) Explain why the "middle-quarters" estimator no longer works. Can you think of an estimator that does? 4. Construct an extract from the 1980 Census similar to the one used by Angrist and Krueger (1991). use this extract to compute and compare the estimates discussed in questions 2 and 3. 5. Discuss the link between causal effects and structural parameters in a Bivariate Probit model of the relationship between divorce and female labor force participation. The purpose of the model is to determine whether female employment strengthens a marriage or increases divorce. Organize your discussion as outlined below: a. Explain in words why the causal effect of employment on divorce is difficult to determine. Is the problem here primarily one of identification or estimation? Can you design an experiment to answer the question of interest? b. Write the potential outcomes and potential treatment assignments in your causal model in terms of latent indices with unobserved random errors in a structural model. c. What should the population be for this study? What does it mean for employment to be endogenous in the structural model? How about in the causal model? d. Show how to use the Probit structural parameters and distributional assumptions to calculate the population average treatment effect (ATE), the effect on the treated (ETT), and LATE. Which parameters are identified without distributional assumptions? e. Discuss the relationship between the three average causal effects, LATE, ATE, and ETT. Can you say which is likely to be largest and which is likely to be smallest? f. Compare OLS with Probit and IV with Bivariate Probit in the application of your choice (as in Angrist, 2001). 8