What is: regression discontinuity design?
Mike Brewer, University of Essex and Institute for Fiscal Studies
Part of Programme Evaluation for Policy Analysis (PEPA), a Node of the NCRM
Regression discontinuity design: overview
A regression discontinuity design (RDD) is a way of undertaking causal inference, usually of some policy intervention.
It can provide robust, convincing estimates of causal impacts under fairly weak conditions or minimal assumptions.
It was invented by psychologists, but labour economists are now realising how applicable it is.
The nature of the intervention will determine whether an RDD is appropriate. Even when it is, data demands are often great.
What is regression discontinuity?
A regression discontinuity design is appropriate where:
  a treatment/intervention/policy is given to individuals for whom some measured characteristic lies on one side of a cut-off (sharp RD), AND
  the characteristic cannot be perfectly manipulated by individuals.
(Sharp) regression discontinuity design
[Figure: treatment status D_i (0 or 1) plotted against the running variable X, with the cut-off at c.]
People above the cut-off are exposed to treatment.
People below the cut-off are not exposed to treatment, and some of them are very similar to some who are treated.
To estimate the impact of the treatment, we need a comparison group.
An ideal comparison group would have the same values of X but not be treated, but such people do not exist.
RDD: the principle
Compare the treated outcome for those just above the cut-off with the untreated outcome for those just below the cut-off.
This identifies the average treatment effect on subjects at the cut-off.
Why does this work? If the running variable cannot be perfectly manipulated, then individuals on either side of the cut-off should be very similar to each other in their observable and unobservable characteristics: it's as if treatment were randomly assigned.
Key assumption: nothing else jumps at the cut-off.
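The comparison described above can be sketched on simulated data. This is only an illustration under stated assumptions: the data-generating process, the true effect of 2.0, the bandwidth of 0.1 and the function names `simulate_sharp_rdd` and `mean_difference_at_cutoff` are all hypothetical and do not come from the slides.

```python
import random


def simulate_sharp_rdd(n=20000, cutoff=0.0, effect=2.0, seed=0):
    """Simulate (x, d, y) where everyone at or above the cut-off is treated.

    Hypothetical data-generating process, chosen only for illustration.
    """
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        x = rng.uniform(-1.0, 1.0)       # running variable
        d = 1 if x >= cutoff else 0      # sharp assignment rule
        y = 1.0 + 0.5 * x + effect * d + rng.gauss(0.0, 1.0)
        data.append((x, d, y))
    return data


def mean_difference_at_cutoff(data, cutoff=0.0, bandwidth=0.1):
    """Mean outcome just above the cut-off minus mean outcome just below it."""
    above = [y for x, _, y in data if cutoff <= x < cutoff + bandwidth]
    below = [y for x, _, y in data if cutoff - bandwidth <= x < cutoff]
    return sum(above) / len(above) - sum(below) / len(below)


data = simulate_sharp_rdd()
estimate = mean_difference_at_cutoff(data)
print(f"estimated jump at cut-off: {estimate:.2f} (true effect in simulation: 2.0)")
```

A plain difference in means picks up a little of the slope in X within the window, which is one reason the formal estimates model the running variable on each side of the cut-off.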
RDD: examples
Intervention of interest | Running variable | Unit of study | Outcomes of interest
Remedial measures | Exam results | School | Future exam results
Scholarship | Test scores | Individual | Drop-out rates, future exam results, earnings
Labour market or W2W policies | Age | Individual | Labour supply, earnings
Enrolment in school | Date of birth | Children (or parents) | Children's test scores or future earnings, parents' labour supply
Regulation, payroll taxes | Size | Firm | Labour demand, polluting behaviour
Being elected to public office | Vote share | Candidates for public office | Future income/wealth
See much longer list in Lee and Lemieux, 2010.
RDD: implementation
Graphical analysis
  Outcome vs running variable either side of cut-off
Example: link between entitlement to unemployment benefit (UB) and length of unemployment
[Figure omitted.] Source: Lalive (2007)
RDD: implementation
Graphical analysis
  Outcome vs running variable either side of cut-off
Formal estimate
  Parametric = OLS. Easy!
  Non-parametric means local linear regression
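As a rough illustration of the non-parametric route, the sketch below fits a kernel-weighted straight line on each side of the cut-off and takes the difference of the two fitted values at the cut-off. The triangular kernel, the bandwidth of 0.25, the noise-free example data and the function names are assumptions made for this example; the slides do not specify a kernel or a bandwidth rule.

```python
def fit_at_cutoff(points, cutoff, bandwidth):
    """Kernel-weighted linear fit of y on (x - cutoff); returns the fitted value at the cut-off."""
    sw = swu = swy = swuu = swuy = 0.0
    for x, y in points:
        u = x - cutoff
        w = max(0.0, 1.0 - abs(u) / bandwidth)  # triangular kernel weight (an assumption)
        sw += w
        swu += w * u
        swy += w * y
        swuu += w * u * u
        swuy += w * u * y
    # Closed-form weighted least squares for intercept and slope.
    slope = (sw * swuy - swu * swy) / (sw * swuu - swu ** 2)
    return (swy - slope * swu) / sw


def local_linear_rd(xs, ys, cutoff, bandwidth):
    """Jump at the cut-off: right-hand fit minus left-hand fit."""
    left = [(x, y) for x, y in zip(xs, ys) if x < cutoff]
    right = [(x, y) for x, y in zip(xs, ys) if x >= cutoff]
    return fit_at_cutoff(right, cutoff, bandwidth) - fit_at_cutoff(left, cutoff, bandwidth)


# Hypothetical noise-free data: a smooth curve in x plus a jump of 2.0 at x = 0.
xs = [i / 100 for i in range(-100, 101)]
ys = [1.0 + 0.5 * x + 0.3 * x * x + (2.0 if x >= 0 else 0.0) for x in xs]
jump = local_linear_rd(xs, ys, cutoff=0.0, bandwidth=0.25)
print(f"local linear estimate of the jump: {jump:.3f}")
```

Only observations within one bandwidth of the cut-off receive positive weight, which is why this approach tends to need a lot of data close to the cut-off.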
RDD: implementing in OLS
$$Y = \alpha + \tau \, I(X \ge c) + \beta_{l1} (X - c) \, I(X < c) + \beta_{r1} (X - c) \, I(X \ge c) + \beta_{l2} (X - c)^2 \, I(X < c) + \beta_{r2} (X - c)^2 \, I(X \ge c) + \gamma Z + \varepsilon$$
I(X \ge c) is an indicator for being on the right side of the cut-off, so its coefficient, \tau, measures how the outcome variable jumps at X = c.
The \beta terms allow the running variable, X, to affect outcomes according to a quadratic function whose slope changes at X = c.
Z denotes other covariates.
If X is discrete, then errors should be allowed to be clustered at the level of the running variable (Card and Lee, 2008).
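A minimal sketch of this specification follows: an indicator for being on the right side of the cut-off, linear and quadratic terms in X - c that differ on each side, and one other covariate Z. The simulated data, the coefficient values and the helper names `design_row` and `ols` are hypothetical. The small solver only stands in for a regression package, and the sketch returns point estimates only: it does not compute standard errors, clustered or otherwise.

```python
import random


def design_row(x, z, cutoff):
    """Regressors: constant, jump indicator, side-specific linear and quadratic terms, covariate."""
    above = 1.0 if x >= cutoff else 0.0
    u = x - cutoff
    return [1.0, above,
            u * (1 - above), u * above,
            u * u * (1 - above), u * u * above,
            z]


def ols(rows, ys):
    """Ordinary least squares via the normal equations (X'X) b = X'y."""
    k = len(rows[0])
    aug = [[sum(r[i] * r[j] for r in rows) for j in range(k)]
           + [sum(r[i] * y for r, y in zip(rows, ys))] for i in range(k)]
    # Gaussian elimination with partial pivoting on the augmented matrix.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, k):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, k + 1):
                aug[r][c] -= factor * aug[col][c]
    # Back substitution.
    beta = [0.0] * k
    for i in reversed(range(k)):
        tail = sum(aug[i][j] * beta[j] for j in range(i + 1, k))
        beta[i] = (aug[i][k] - tail) / aug[i][i]
    return beta


# Hypothetical simulated data with a true jump of 2.0 at the cut-off.
rng = random.Random(1)
cutoff = 0.0
rows, ys = [], []
for _ in range(2000):
    x = rng.uniform(-1.0, 1.0)
    z = rng.gauss(0.0, 1.0)
    treated = 1.0 if x >= cutoff else 0.0
    y = 1.0 + 2.0 * treated + 0.5 * x + 0.3 * x * x + 0.4 * z + rng.gauss(0.0, 0.5)
    rows.append(design_row(x, z, cutoff))
    ys.append(y)

coefficients = ols(rows, ys)
tau_hat = coefficients[1]  # coefficient on the indicator: the estimated jump at X = c
print(f"estimated jump at the cut-off: {tau_hat:.2f}")
```

Centring X at the cut-off is what makes the coefficient on the indicator equal to the jump at X = c rather than at X = 0.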
RDD: implementation
Graphical analysis
  Outcome vs running variable either side of cut-off
Formal estimate
  Parametric = OLS. Easy!
  Non-parametric means local linear regression
Sensitivities and robustness checks
RDD: checks
Something other than treatment might cause the jumps:
  Do pre-treatment variables or explanatory variables jump around the cut-off?
Individuals might manipulate the running variable:
  Is the density of the running variable smooth around the cut-off?
  Do pre-treatment variables or explanatory variables jump around the cut-off?
Distinguish between discontinuity and non-linearity:
  Are results robust to the inclusion of higher-order polynomials?
  Are results robust to changing the size of the window around the cut-off?
  Are there jumps when none are expected ("placebo RDDs")?
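Three of these checks can be sketched on simulated data: re-estimating the jump with different window sizes, estimating jumps at placebo cut-offs where none is expected, and comparing the number of observations just above and just below the cut-off. Everything here is a hypothetical illustration: the data, the window sizes, the placebo cut-offs and the function names are assumptions, and the simple count ratio is a crude stand-in for a density check, not McCrary's formal test.

```python
import random


def fit_at(points, x0):
    """Least-squares line through the (x, y) points, evaluated at x0."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    return mean_y + (sxy / sxx) * (x0 - mean_x)


def jump_at(data, cutoff, window):
    """Difference between linear fits just above and just below a (possibly placebo) cut-off."""
    left = [(x, y) for x, y in data if cutoff - window <= x < cutoff]
    right = [(x, y) for x, y in data if cutoff <= x <= cutoff + window]
    return fit_at(right, cutoff) - fit_at(left, cutoff)


def density_ratio(data, cutoff, window):
    """Observations just above the cut-off relative to just below (crude, not a formal test)."""
    below = sum(1 for x, _ in data if cutoff - window <= x < cutoff)
    above = sum(1 for x, _ in data if cutoff <= x <= cutoff + window)
    return above / below


# Hypothetical data: true jump of 2.0 at x = 0, no manipulation of x.
rng = random.Random(2)
data = []
for _ in range(5000):
    x = rng.uniform(-1.0, 1.0)
    y = 1.0 + 0.5 * x + (2.0 if x >= 0.0 else 0.0) + rng.gauss(0.0, 0.5)
    data.append((x, y))

window_estimates = {w: jump_at(data, 0.0, w) for w in (0.1, 0.2, 0.4)}
placebo_estimates = {c: jump_at(data, c, 0.2) for c in (-0.5, 0.5)}
ratio = density_ratio(data, 0.0, 0.2)

for w, est in window_estimates.items():
    print(f"window {w}: estimated jump {est:.2f}")
for c, est in placebo_estimates.items():
    print(f"placebo cut-off {c}: estimated jump {est:.2f}")
print(f"count just above / count just below: {ratio:.2f}")
```

In this simulation the estimates should be similar across windows, the placebo jumps should be close to zero and the count ratio close to one; large departures in real data would be a warning sign.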
A non-smooth density function
[Figure from McCrary, 2008.] The probability of a vote just being lost is a lot lower than that of it just being won.
From Mostly Harmless Econometrics
Variant: Fuzzy RDD
A fuzzy RDD is appropriate when the probability that someone is treated changes discontinuously when a characteristic crosses a cut-off.
For those close to the cut-off, being on the right side of the cut-off is a valid instrument (predicts treatment well, no direct impact on outcome).
Can then estimate the impact of treatment through 2SLS (change in outcome either side of cut-off divided by change in treatment either side of cut-off).
Technically, this requires a monotonicity assumption and then identifies a LATE: the impact of the treatment on compliers at the cut-off.
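The ratio described above (change in outcome divided by change in treatment either side of the cut-off) can be sketched as follows. The simulation is hypothetical: treatment probabilities of 0.2 below and 0.8 above the cut-off, a constant treatment effect of 2.0, a window of 0.25 and the function names are all assumptions made for illustration. Monotonicity holds here by construction, because crossing the cut-off can only raise the chance of treatment.

```python
import random


def fit_at(points, x0):
    """Least-squares line through the (x, value) points, evaluated at x0."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_v = sum(v for _, v in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxv = sum((x - mean_x) * (v - mean_v) for x, v in points)
    return mean_v + (sxv / sxx) * (x0 - mean_x)


def jump(pairs, cutoff, window):
    """Difference between linear fits just above and just below the cut-off."""
    left = [(x, v) for x, v in pairs if cutoff - window <= x < cutoff]
    right = [(x, v) for x, v in pairs if cutoff <= x <= cutoff + window]
    return fit_at(right, cutoff) - fit_at(left, cutoff)


# Hypothetical fuzzy design: crossing the cut-off raises the probability of
# treatment from 0.2 to 0.8, but does not determine treatment.
rng = random.Random(3)
cutoff, window = 0.0, 0.25
x_outcome, x_treatment = [], []
for _ in range(20000):
    x = rng.uniform(-1.0, 1.0)
    p_treated = 0.8 if x >= cutoff else 0.2
    d = 1.0 if rng.random() < p_treated else 0.0
    y = 1.0 + 0.5 * x + 2.0 * d + rng.gauss(0.0, 0.5)
    x_outcome.append((x, y))
    x_treatment.append((x, d))

jump_in_outcome = jump(x_outcome, cutoff, window)      # change in outcome at the cut-off
jump_in_treatment = jump(x_treatment, cutoff, window)  # change in fraction treated
late_hat = jump_in_outcome / jump_in_treatment

print(f"jump in outcome: {jump_in_outcome:.2f}")
print(f"jump in fraction treated: {jump_in_treatment:.2f}")
print(f"estimated effect on compliers at the cut-off: {late_hat:.2f}")
```

Dividing by the jump in the fraction treated scales up the jump in outcomes to account for the fact that only some individuals change treatment status at the cut-off.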
Fuzzy regression discontinuity design
[Figure: fraction treated, D_i (between 0 and 1), plotted against X, with the cut-off at c.]
Now treatment depends on whether X is bigger than the cut-off c, but this is not the only factor.
There is a jump in the fraction who are treated as we cross the cut-off, c.
RDD: assessment
RDDs can provide convincing causal estimates and can be easily implemented via OLS.
But they are not universally applicable:
  whether an RDD can be used depends entirely on the nature of the intervention;
  focusing on a small area around the cut-off often requires large amounts of data.
References and reading
Where it all came from:
  Thistlethwaite & Campbell, 1960, "Regression-Discontinuity Analysis: An Alternative to the Ex Post Facto Experiment", Journal of Educational Psychology, 51(6): 309-17
To find out more:
  Angrist & Pischke, Mostly Harmless Econometrics
  Lee and Lemieux, 2010, "Regression Discontinuity Designs in Economics", Journal of Economic Literature, 48(2): 281-355
  Journal of Econometrics, 2008, 142(2), esp. articles by Imbens & Lemieux, Lalive, Card & Lee, McCrary: http://www.sciencedirect.com/science/journal/03044076/142/2
  For economics examples, see citations in Lee and Lemieux
Some UK examples outside economics:
  Del Bono et al., "Health information and health outcomes: an application of the regression discontinuity design to the 1995 UK contraceptive pill scare case", ISER WP 2011-16
  Eggers and Hainmueller, "MPs for Sale? Returns to Office in Postwar British Politics", American Political Science Review, 103, pp 513-533