Establishing Causality Convincingly: Some Neat Tricks

Establishing Causality

In the last set of notes, I discussed how causality can be difficult to establish in a straightforward OLS context. If the assumptions for unbiasedness hold, you're OK. But generally x will be endogenous, meaning that the zero conditional mean assumption will fail (i.e. the assumptions for unbiasedness won't hold). Snow's study of how cholera spreads gave a nice example of how finding exogenous variation in x (i.e. variation uncorrelated with the error term) can lead to convincing findings on causality.

Variables (x) that are chosen by individuals are likely to be endogenous: whether I smoke, how much I exercise, whether I save a lot or a little. If we want to find the effect of such things on y, we have to worry about this endogeneity. Ideally, what we'd like is to "exogenize" x: use only the exogenous component of the variation in x to test for x's causal effect on y.

Isolating Exogenous Variation in x

Take something like an individual's level of education and its effect on the wage. The individual chooses how much education to get. Therefore x and u will be correlated if unobserved differences between individuals with high education and low education (e.g. work ethic) are contained in u. To some extent one's education is a function of personal choice (the endogenous component), but to some extent it's also a function of factors that are arguably random (the exogenous component). Two examples of such factors: the minimum school drop-out age in one's province, which affects how many years of education some people get; and how close one lives to a university, which affects the cost of attendance and therefore the decision to attend.
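
To see why the correlation between x and u matters, here is a minimal simulation sketch. Everything in it is an assumption made for illustration (the 0.08 "true" return to education, the coefficients on work ethic, the noise levels); it is not an estimate from real data. Because work ethic raises both education and the wage but is left out of the regression, the OLS slope lands above the true return.

```python
import random

def cov(a, b):
    """Sample covariance of two equal-length lists."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    return sum((ai - ma) * (bi - mb) for ai, bi in zip(a, b)) / (len(a) - 1)

def ols_slope(x, y):
    """Slope from a simple regression of y on x: Cov(x, y) / Var(x)."""
    return cov(x, y) / cov(x, x)

random.seed(0)
n = 20_000
true_return = 0.08  # assumed effect of one more year of education on log wage

# Work ethic is unobserved, so in the wage equation it ends up inside u.
work_ethic = [random.gauss(0, 1) for _ in range(n)]
educ = [12 + 1.0 * w + random.gauss(0, 2) for w in work_ethic]
log_wage = [1.5 + true_return * e + 0.2 * w + random.gauss(0, 0.3)
            for e, w in zip(educ, work_ethic)]

slope = ols_slope(educ, log_wage)
print(f"true return: {true_return:.3f}, OLS estimate: {slope:.3f}")
```

With these made-up numbers the OLS slope should settle near 0.12 rather than 0.08, since Cov(educ, u)/Var(educ) = 0.2/5 = 0.04.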

If we can isolate the exogenous variation in x, we can use that variation to identify the causal effect of x on y. Remember, the ideal randomized experiment to determine the effect of education on the wage is to randomly sprinkle people with different levels of education and compare their wages. Suppose instead that people are randomly sprinkled with various shocks to their decision-making that affect their choice of education. The variation in education that results from these random factors can be treated as random (thus exogenous). The rest of the variation in education should be treated as endogenous.

Example: Instrumental Variables

One technique that does this is called instrumental variables (IV) estimation. This has become very popular in empirical economics. The idea is to take an endogenous right-hand-side (RHS) variable like educ and find something that exogenously shifts it around. The intuition is essentially to regress educ on this exogenous shifter, predict values of educ using values of the exogenous shifter, and then substitute these predicted values of educ into the wage equation. The predicted values of educ vary much less than educ itself, because the endogenous component has been removed. This means the standard error of the coefficient on educ will go up. But the coefficient will be estimated using only exogenous variation in educ, which removes the bias due to the endogeneity of educ.
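
The two-step intuition above can be sketched as follows. This is an illustrative simulation under stated assumptions, not a recommended estimation routine: the variable `shifter`, the 0.08 true return, and all other coefficients are invented. Note also that running the second stage by hand like this gives the right coefficient but not the right standard errors; a packaged IV routine corrects them.

```python
import random

def simple_ols(x, y):
    """Return (intercept, slope) from a simple OLS regression of y on x."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    slope = sxy / sxx
    return my - slope * mx, slope

random.seed(1)
n = 20_000
true_return = 0.08  # assumed, for illustration only

work_ethic = [random.gauss(0, 1) for _ in range(n)]  # unobserved, part of u
shifter = [random.gauss(0, 1) for _ in range(n)]     # hypothetical exogenous shifter
educ = [12 + 1.0 * z + 1.0 * w + random.gauss(0, 1.5)
        for z, w in zip(shifter, work_ethic)]
log_wage = [1.5 + true_return * e + 0.2 * w + random.gauss(0, 0.3)
            for e, w in zip(educ, work_ethic)]

# Stage 1: regress educ on the shifter and form predicted educ.
a0, a1 = simple_ols(shifter, educ)
educ_hat = [a0 + a1 * z for z in shifter]

# Stage 2: regress the wage on predicted educ instead of actual educ.
_, two_stage_slope = simple_ols(educ_hat, log_wage)
_, ols_slope = simple_ols(educ, log_wage)

print(f"OLS: {ols_slope:.3f}  two-stage: {two_stage_slope:.3f}  true: {true_return}")
```

In this setup the OLS slope should be pulled upward by work ethic, while the two-stage slope should sit close to the assumed true return.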

The exogenous shifter is called an instrumental variable, or an instrument. The estimator we obtain is called the IV estimator (as opposed to the OLS estimator). Picking a good instrument is essential to obtaining reliable IV estimates. Suppose we are testing the effect of x on y, but we believe x is endogenous. We need an instrument, z, that is both: 1) correlated with x (the stronger the better); and 2) uncorrelated with u (this means z only affects y through its effect on x).

More formally, let's assume we want to estimate $y = \beta_0 + \beta_1 x + u$, but we have reason to believe that x is endogenous (correlated with u). If we estimate the model as is, then our estimate of the effect of x on y will be biased. Suppose that z affects x but has no effect (other than through x) on y. Then we can instrument for x with z.

Remember, our OLS estimator is an approximation (using data) of Cov(x, y)/Var(x). This means it uses all variation in x and all covariation between x and y, whether endogenous or exogenous. Now consider Cov(z, y), where z is an instrument for x. Intuitively, this captures the variation in y due to variation in x caused by variation in the exogenous instrument z (because the only way z affects y is through the variation it causes in x).

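Substituting the model into Cov(z, y):

$$\mathrm{Cov}(z, y) = \mathrm{Cov}(z, \beta_0 + \beta_1 x + u) = \beta_1 \mathrm{Cov}(z, x) + \mathrm{Cov}(z, u)$$

Note that Cov(z, u) = 0 by the assumption that z is exogenous, so

$$\beta_1 = \frac{\mathrm{Cov}(z, y)}{\mathrm{Cov}(z, x)}$$

We can get the sample analog of this:

$$\hat{\beta}_1^{IV} = \frac{\sum_{i=1}^{n} (z_i - \bar{z})(y_i - \bar{y})}{\sum_{i=1}^{n} (z_i - \bar{z})(x_i - \bar{x})}$$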
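
As a small worked check of the sample formula, the sketch below computes the IV and OLS slopes on four made-up observations (the numbers are invented purely so the arithmetic can be followed by hand). The function names are illustrative, not from any package.

```python
def iv_slope(z, x, y):
    """Sample IV estimator: sum((z - zbar)(y - ybar)) / sum((z - zbar)(x - xbar))."""
    n = len(z)
    zbar, xbar, ybar = sum(z) / n, sum(x) / n, sum(y) / n
    num = sum((zi - zbar) * (yi - ybar) for zi, yi in zip(z, y))
    den = sum((zi - zbar) * (xi - xbar) for zi, xi in zip(z, x))
    return num / den

def ols_slope(x, y):
    """OLS is the special case in which x serves as its own instrument."""
    return iv_slope(x, x, y)

# Made-up data: z is the instrument, x the endogenous regressor, y the outcome.
z = [0, 0, 1, 1]
x = [1, 2, 3, 4]
y = [2, 3, 7, 8]

beta_iv = iv_slope(z, x, y)   # numerator 5, denominator 2
beta_ols = ols_slope(x, y)    # numerator 11, denominator 5
print(beta_iv, beta_ols)
```

By hand: the z-deviations are ±0.5, so the numerator is 0.5(3 + 2 + 2 + 3) = 5 and the denominator is 0.5(1.5 + 0.5 + 0.5 + 1.5) = 2, giving an IV slope of 2.5 against an OLS slope of 11/5 = 2.2.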

This is very straightforward to estimate using a statistical package like EViews. Using IV, you've arguably done what Snow did by searching out a form of exogenous variation in water contamination levels. To put this back in the randomized trials context, you can think of z as causing random variation in x, which then allows you to treat that part of the variation in x as if it were truly random. Note that if z is correlated with u in the original equation, then you can get into serious trouble: the IV estimator can be more biased than the OLS estimator. Also note that because we're only using exogenous variation in x to identify the causal effect of x on y, the standard error on the coefficient on x will be much higher than it is with OLS (which uses all variation in x).

Some Famous Instruments

Think back to our wage equation and the effect of education, which is an important policy question. What are some instruments for education? Proximity to a 4-year university may affect an individual's decision whether or not to attend university; this location might be able to be treated as random. Parents' education: arguably this says something about the educational values you were exposed to growing up, and we might be able to view it as random. Here's a neat one: quarter of birth. Why?

Quarter of birth. In the US and Canada, students enter school according to a set formula (e.g. if you turn 5 by December 31st of a year, you start kindergarten that year). There are also minimum dropout ages (e.g. age 16 in many states/provinces). The interaction of these two policies causes a certain amount of variation in the years of education that people get. Example: if a kid turns 5 on December 31 this year, she starts school this year; if she turns 5 on January 1, she starts school next year. Yet both kids turn 16 at essentially the same time, so if both drop out at 16, one will have 1 more year of education than the other.
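
The arithmetic in the example can be checked with a short sketch. It assumes, hypothetically, that school starts on September 1 of the calendar year in which the child turns 5 and that each child leaves on her 16th birthday; the birth dates are made up.

```python
from datetime import date

def schooling_at_dropout(birth, entry_age=5, dropout_age=16):
    """Years in school for someone who leaves on their dropout-age birthday.

    Assumes school starts on September 1 of the calendar year in which the
    child reaches entry_age, and that the birthday is not February 29.
    """
    school_start = date(birth.year + entry_age, 9, 1)
    dropout_day = birth.replace(year=birth.year + dropout_age)
    return (dropout_day - school_start).days / 365.25

born_dec_31 = schooling_at_dropout(date(2000, 12, 31))
born_jan_1 = schooling_at_dropout(date(2001, 1, 1))
gap = born_dec_31 - born_jan_1

print(f"Born Dec 31: {born_dec_31:.2f} years, born Jan 1: {born_jan_1:.2f} years")
print(f"Gap: {gap:.2f} years")
```

The two children are born one day apart, yet under these assumptions the one born on December 31 has spent about one more year in school by her 16th birthday.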

Arguably one's quarter of birth should be completely random. Yet the quarter of birth causes some people to have more education than others. A source of exogenous variation! A very famous paper (Angrist and Krueger, 1991) exploited this idea to estimate the effect of an extra year of education. They actually find a higher return to education using this instrument than people had typically found with OLS (about a 10% increase in earnings).

Vietnam draft number. Following the Vietnam War, people became interested in the experience that Vietnam vets had in the labour market. There were lots of stories of stress and emotional problems among returning troops. Whether one served in Vietnam is arguably endogenous, because some chose to serve, and those who chose to serve may differ fundamentally from those who chose not to. Ideally, we want something that randomly forces some people to serve and some not to serve. Draft number! A source of exogenous variation in whether you served in Vietnam or not.
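
When the instrument is a yes/no variable like draft eligibility, the IV ratio Cov(z, y)/Cov(z, x) reduces to a ratio of two differences in group means (often called the Wald estimator). The sketch below illustrates this on eight invented observations; the earnings figures and the names `draft_eligible`, `served`, and `earnings` are hypothetical and are not taken from any study.

```python
def mean(values):
    return sum(values) / len(values)

def wald_estimate(z, x, y):
    """Outcome gap between z = 1 and z = 0, divided by the treatment gap."""
    y1 = mean([yi for zi, yi in zip(z, y) if zi == 1])
    y0 = mean([yi for zi, yi in zip(z, y) if zi == 0])
    x1 = mean([xi for zi, xi in zip(z, x) if zi == 1])
    x0 = mean([xi for zi, xi in zip(z, x) if zi == 0])
    return (y1 - y0) / (x1 - x0)

# Made-up data: 1 = draft-eligible number, 1 = served, earnings in arbitrary units.
draft_eligible = [1, 1, 1, 1, 0, 0, 0, 0]
served = [1, 1, 1, 0, 1, 0, 0, 0]
earnings = [10, 11, 9, 14, 10, 15, 14, 13]

effect = wald_estimate(draft_eligible, served, earnings)
print(effect)
```

Here eligibility raises the share who served from 0.25 to 0.75 and lowers mean earnings from 13 to 11, so the implied effect of serving is (11 - 13)/(0.75 - 0.25) = -4 in these invented units.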

Some more on IV estimation

There are lots of potential instruments out there; creative use of them tends to be rewarded with (potentially more) convincing findings and publications. But some cautionary notes are in order. IV estimates are biased in finite samples but consistent, and they tend to have high standard errors (making it difficult to reject the null). For both these reasons, large samples are needed. If your instrument is badly chosen, so that it is correlated with u, you can make things worse than OLS by doing IV estimation. Finally, IV estimation can only say something about the part of the sample in which the exogenous variation occurs. Quarter of birth (QOB) only causes variation among high school dropouts. Therefore the estimate we obtain using this instrument only gives the return to an extra year of education for someone on the margin of dropping out. The effect could be much different for non-marginal students. This is called the local average treatment effect (LATE) problem.
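
The warning about badly chosen instruments can be illustrated with a simulation sketch. All numbers are assumptions chosen for illustration: the instrument z is only weakly related to x (coefficient 0.1) and also leaks directly into u (coefficient 0.1), so the IV bias Cov(z, u)/Cov(z, x) is roughly 1, far larger than the OLS bias.

```python
import random

def cov(a, b):
    """Sample covariance of two equal-length lists."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    return sum((ai - ma) * (bi - mb) for ai, bi in zip(a, b)) / (len(a) - 1)

random.seed(2)
n = 50_000
beta = 1.0  # assumed true effect of x on y

z = [random.gauss(0, 1) for _ in range(n)]  # the (bad) instrument
w = [random.gauss(0, 1) for _ in range(n)]  # unobserved confounder
x = [0.1 * zi + wi + random.gauss(0, 1) for zi, wi in zip(z, w)]
# z enters u directly (the 0.1 * zi term), violating the exogeneity condition.
u = [0.2 * wi + 0.1 * zi + random.gauss(0, 0.3) for zi, wi in zip(z, w)]
y = [beta * xi + ui for xi, ui in zip(x, u)]

ols = cov(x, y) / cov(x, x)
iv = cov(z, y) / cov(z, x)

print(f"true: {beta:.2f}  OLS: {ols:.2f}  IV with invalid instrument: {iv:.2f}")
```

In this setup OLS should be off by roughly 0.1, while the IV estimate should be off by something close to 1, which is the sense in which IV can make things worse.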

Example: Regression Discontinuity Design (RD)

This is another nifty way to come up with plausibly exogenous variation in x. Suppose x is determined by a stochastic process that involves a discontinuity. Example: some states in the US have local budget referenda that determine school spending for individual districts. If the new budget is approved, spending rises to the proposed level. If it's rejected, spending falls to some state-mandated minimum. 50% plus 1 vote passes the budget; 50% minus 1 vote fails to pass the budget. Arguably the number of votes for the budget is partly randomly determined. The outcome is highly discontinuous: a small perturbation in the number of votes leads to a big change in the budget size.

Why is this useful? One could simply regress some student outcome, like the score on a standardized test, on the level of spending in each school district. The problem is that spending in a school district is likely correlated with unobservable determinants of a student's performance on a standardized test. Parents may care more about their child's education in high-spending districts than in low-spending districts, and we may not have a "care" variable to control for this. Hence the coefficient on spending would likely be biased (positively in this case). The RD approach lets us compare districts with close voting outcomes (say, all those that were 50-51% for the budget against those that were 49-50% for it).

Districts that vote 80% for may be different from those that vote 20% for in unobservable ways. However, districts that vote 51% for are arguably virtually identical to districts that vote 49% for. Yet there's a huge difference in spending outcomes for these districts. Arguably the difference in voting outcomes is driven by randomness (maybe there was a traffic jam in the second district that caused a few busy parents not to make it to the polls in time). If this is true, then the variation we get in spending between the 51% and 49% districts is completely exogenous.
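
A simulation sketch of this comparison follows. Every number is an assumption for illustration: spending of 10 or 8 (in $1,000 per pupil), a true effect of 2 test-score points per $1,000, and an unobserved parental "care" variable that rises with the vote share. The number of districts is far larger than any real state has, so that the narrow 49-51% window still contains plenty of observations.

```python
import random

random.seed(3)
n = 200_000            # unrealistically many districts; illustration only
true_effect = 2.0      # assumed score points per $1,000 of per-pupil spending
high, low = 10.0, 8.0  # assumed spending if the budget passes / fails

districts = []
for _ in range(n):
    share = random.uniform(0.2, 0.8)               # vote share for the budget
    care = 5 * (share - 0.5) + random.gauss(0, 1)  # unobserved parental "care"
    spend = high if share > 0.5 else low
    score = 50 + true_effect * spend + 3 * care + random.gauss(0, 5)
    districts.append((share, spend, score))

def mean(values):
    return sum(values) / len(values)

def gap_per_dollar(rows):
    """Mean score gap between passing and failing districts, per $1,000 of
    spending gap. With only two spending levels this equals the OLS slope."""
    passed = [score for share, spend, score in rows if share > 0.5]
    failed = [score for share, spend, score in rows if share <= 0.5]
    return (mean(passed) - mean(failed)) / (high - low)

naive = gap_per_dollar(districts)                       # all districts
close = [row for row in districts if 0.49 <= row[0] <= 0.51]
rd = gap_per_dollar(close)                              # close votes only

print(f"true: {true_effect}  naive: {naive:.2f}  close-vote comparison: {rd:.2f}")
```

Under these assumptions the all-district comparison should roughly double the true effect, because high-care districts are also the ones that pass budgets, while the close-vote comparison should land near 2.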

Therefore, we may be more inclined to trust the results we get with an RD approach than with a straight OLS regression of test score on spending. Another voting example is the 2000 US presidential election. Suppose you're interested in how different industries perform under different political parties. More specifically, how does having a Republican in the White House affect the stock price performance of drug companies or oil companies?

You could simply compare the stock price performance of an index of oil companies over various presidential administrations, and include a dummy for the party in the White House (basic time series analysis). However, there may be factors correlated with who is in the White House that affect the stock price of oil companies. Maybe people vote Republican when they're worried we're running short of oil. So the standard OLS approach would be suspect.

The 2000 election provided a great RD experiment. On the day of the election, no one knew with any confidence who was going to win. People were speculating that the result would probably depend on the weather in a few key states (i.e. the outcome was essentially going to be a coin toss: random!). The result was ultimately determined by a Supreme Court ruling.

People could essentially treat price changes in various stocks between Election Day and the Supreme Court ruling as a function of a coin toss on the presidential election outcome. The fact that the party in power shifted from Democrat to Republican could basically be attributed to randomness (the roughly 500-vote margin in Florida). Many studies were actually done exploiting this experiment.

Natural Experiments

These types of empirical approaches are generally referred to as natural experiments. The name points to the way in which they approximate randomized trials. Ethically, randomized trials couldn't be used in any of these studies. However, if researchers find ways in which nature imparts some randomness on x, they can exploit that randomness as if a randomized trial had been done. Often we can learn much more about the relationship between two variables if we employ this type of approach.

Unnatural Experiments in Economics

Randomized trials are sometimes used in economics. Lab experiments: interested students should consider Dr. Rondeau's occasionally offered experiments course (400 level), and Honours students (e.g. Shiva Shirazi-Kia) sometimes perform experiments as part of thesis research. Field experiments: Dr. Rondeau has studied fundraising letters (getting permission from charities to vary the wording of appeals to donors). There's a large literature on private contributions to public goods that considers the motivation of those who donate to charity. There are also studies of strategic behavior.

Bertrand and Mullainathan resume studies. The researchers sent job application cover letters and resumes out to employers who were advertising openings. They used a core set of resumes and a core set of cover letters, and varied the names of the applicants. Some names were chosen to sound black (e.g. Tyrone Jackson, Shanika Jefferson) and some to sound white (e.g. Dylan Peters, Ashley Hutchinson). They mailed resumes with different names to employers and found that black names got fewer callbacks for interviews than white names. This is considered a field experiment. Phil Oreopoulos at UBC/U Toronto has done similar experiments in the BC labor market looking at discrimination against immigrants (people with East Asian and South Asian names versus those with Caucasian names). Note that you can use econometrics to analyze data in studies like these, but you can often get by with simple techniques like OLS.
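
To illustrate why simple techniques suffice when the treatment is randomized, the sketch below regresses a callback dummy on a name dummy. The counts are hypothetical (1,000 resumes per group, 100 and 60 callbacks) and are not the figures from any actual study. With a single dummy regressor, the OLS intercept is the callback rate in the reference group and the slope is the difference in callback rates.

```python
def simple_ols(x, y):
    """Return (intercept, slope) from a simple OLS regression of y on x."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    slope = sxy / sxx
    return my - slope * mx, slope

# Hypothetical counts, invented for illustration only.
n_per_group = 1000
callbacks_white, callbacks_black = 100, 60

# black_name = 1 if the resume carried a black-sounding name, 0 otherwise.
black_name = [0] * n_per_group + [1] * n_per_group
callback = ([1] * callbacks_white + [0] * (n_per_group - callbacks_white)
            + [1] * callbacks_black + [0] * (n_per_group - callbacks_black))

intercept, slope = simple_ols(black_name, callback)
print(f"callback rate, white names: {intercept:.3f}")
print(f"difference for black names: {slope:.3f}")
```

Because the names were assigned at random, there is no endogeneity to correct for, and the slope (here 0.06 - 0.10 = -0.04) can be read directly as the effect of the name on the callback rate.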