Instrumental Variables Estimation: An Introduction



1 Instrumental Variables Estimation: An Introduction

Susan L. Ettner, Ph.D., Professor, Division of General Internal Medicine and Health Services Research, UCLA

2 The Problem

3 The Problem

Suppose you wish to compare the outcomes of patients who receive an intervention (or are enrolled in a program, or receive a particular treatment) with the outcomes of patients in usual care (or who do not receive the treatment).

Key point: Participation in the intervention or special program is not randomized. In fact, there is reason to believe that patients either self-selected or were selected (e.g., by providers) into the intervention based on particular attributes, not all of which are measured.

4 The Problem (cont'd)

Examples are real-world interventions or programs where it is not feasible to randomize. A similar problem arises if you have randomization but are interested in estimating the TOT (or "as-treated") effects instead of the ITT effects, e.g., when there is significant crossover due to nonadherence and contamination.

If the unobserved attributes determining whether patients end up in the intervention vs. usual care are also attributes that affect the outcome, then the indicator for treatment assignment is endogenous, leading to potential bias.

5 Definition of Endogeneity

An explanatory variable in a regression model is said to be endogenous when it is correlated with the error term in the regression, potentially leading to biased estimates of the causal effect of the predictor on the dependent variable.

Endogeneity can arise as a result of omitted variables (sometimes called "treatment selection"), reverse causality, simultaneity, measurement error, or autocorrelated errors. In this presentation, I focus on bias due to omitted variables, or treatment selection.

6 Omitted-Variables Bias, a.k.a. Treatment Selection Bias

7 Selection Bias and the Residual

Recall that in a regression, the residual (or "error") term captures the effects of all omitted and imperfectly measured variables.

In a linear regression, if the omitted variables are uncorrelated with your covariates (the so-called X's), then the only concern is that your model will have lower explanatory power and your estimates may be less precise. However, if the omitted variables are correlated with some of the X's, then those coefficient estimates may be biased.

8 Implications of Omitted Variables

If you can't measure a predictor that is likely to be correlated with your treatment or intervention indicator, there is potential selection bias.

Because the likely direction of the bias depends on the correlation of the omitted variable with both the regressor and the outcome, it is important to think about what you aren't controlling for in the model (as much as you do about what you are controlling for); a conceptual model helps here. Sometimes there are competing effects, making it difficult to anticipate the likely direction of bias.

9 Example #1(a)

Q: How does glycemic control compare for patients whose usual source of care (USC) is a diabetes specialist vs. those whose USC is a PCP?

If patients who see specialists are unobservably sicker, then the beneficial effects of specialty care on glycemic control may be understated (biased towards zero). Severity of illness is captured in the error term of the outcome regression and is correlated with seeing a specialist, so the specialty care dummy ends up serving as a proxy for being sicker. The true effect of specialty care on the outcome is offset by the effect of being a sicker patient.

10 Example #1(b)

Same question, different omitted variable.

If patients who see specialists are unobservably more motivated, then the beneficial effects of specialty care on glycemic control may be overstated (biased away from zero). Motivation is captured in the error term of the outcome regression and is correlated with seeing a specialist, so the coefficient on the specialty care dummy picks up both the true effect and the effect of proxying for being a more motivated patient.

11 Determining the Direction of Bias (simple case, no 2nd-order effects)

True model: $Y = \alpha X_1 + \beta X_2 + \varepsilon$

Model run: $Y = \varphi X_1 + \mu$

Think of running an auxiliary regression of $X_2$ on $X_1$: $X_2 = \theta + \rho X_1 + \eta$

Instead of $\alpha$, the estimated coefficient on $X_1$ will be $\alpha + \rho\beta$, which could be $> \alpha$ or $< \alpha$, depending on the signs of $\alpha$, $\rho$ and $\beta$.
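The step from the true model to $\alpha + \rho\beta$ can be spelled out by substitution. The derivation below is a sketch that assumes the short regression includes an intercept (or that the variables are demeaned) and that $\varepsilon$ is uncorrelated with $X_1$; neither assumption is stated on the slide.

```latex
\begin{align*}
Y &= \alpha X_1 + \beta X_2 + \varepsilon \\
  &= \alpha X_1 + \beta(\theta + \rho X_1 + \eta) + \varepsilon \\
  &= \beta\theta + (\alpha + \rho\beta) X_1 + (\beta\eta + \varepsilon)
\end{align*}
```

Because $\eta$ is uncorrelated with $X_1$ by construction of the auxiliary regression, the composite error $\beta\eta + \varepsilon$ is uncorrelated with $X_1$, so the coefficient $\varphi$ in the model that was run converges to $\alpha + \rho\beta$ rather than $\alpha$.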

12 Direction of Omitted-Variable Bias (Simple Case)

True model: $Y = \alpha X_1 + \beta X_2 + \varepsilon$, with $X_2$ not measured.

| | $\alpha > 0$, $\beta > 0$ | $\alpha > 0$, $\beta < 0$ | $\alpha < 0$, $\beta > 0$ | $\alpha < 0$, $\beta < 0$ |
|---|---|---|---|---|
| $X_1$ and $X_2$ are positively correlated ($\rho > 0$) | $\hat{\alpha}$ biased away from zero | $\hat{\alpha}$ biased toward zero | $\hat{\alpha}$ biased toward zero | $\hat{\alpha}$ biased away from zero |
| $X_1$ and $X_2$ are negatively correlated ($\rho < 0$) | $\hat{\alpha}$ biased toward zero | $\hat{\alpha}$ biased away from zero | $\hat{\alpha}$ biased away from zero | $\hat{\alpha}$ biased toward zero |
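The omitted-variable result can be checked with a small simulation. The sketch below is illustrative only: the function name `ovb_demo`, the parameter values, and the normally distributed data are assumptions, not part of the original slides.

```python
import numpy as np

def ovb_demo(alpha=1.0, beta=2.0, rho=0.5, n=200_000, seed=0):
    """Regress Y on X1 alone when X2 is omitted; compare with alpha + rho*beta."""
    rng = np.random.default_rng(seed)
    x1 = rng.normal(size=n)
    x2 = rho * x1 + rng.normal(size=n)               # auxiliary relation, slope rho
    y = alpha * x1 + beta * x2 + rng.normal(size=n)  # true model
    X = np.column_stack([np.ones(n), x1])            # short regression: X2 left out
    coef = np.linalg.lstsq(X, y, rcond=None)[0]
    return coef[1], alpha + rho * beta

estimate, predicted = ovb_demo()
print(f"estimated coefficient on X1: {estimate:.3f}, alpha + rho*beta: {predicted:.3f}")
```

With these hypothetical values ($\alpha > 0$, $\beta > 0$, $\rho > 0$) the estimate lands near 2 rather than 1, i.e., biased away from zero, matching the first cell of the table.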

13 Some Possible Solutions (But No Panacea)

14 Possible Methods for Addressing Bias Due to Treatment Selection

Propensity score methods: Designed to better exploit observed attributes and help prevent out-of-sample prediction. They do not necessarily help with bias due to unobservables (and in some cases may exacerbate it). Can use the R&R method to test sensitivity.

Treatment effects models: Require strong distributional assumptions.

15 Methods for Addressing Bias (cont'd)

Longitudinal data modeling: Quasi-experimental designs compare the change over time among intervention vs. comparison groups to net out heterogeneity. However, this still assumes that the secular time trends would be the same in the absence of the intervention. Treatment selection could lead to violation of this assumption, which is hard to check without having a long "pre" period. Also, you can only difference out attributes that are constant over time.

16 Instrumental Variables Methods

17 IV in Action

Even after adjusting for a fairly extensive set of baseline severity measures, depression treatment was associated with worse outcomes in the PIC study, and Medicaid insurance was associated with higher mortality in the HCSUS study.

Both of these results reversed themselves when IV methods were used.

18 Instrumental Variables (IV)

The idea behind IV is to achieve quasi-randomization using a variable (the "instrument") that has a direct impact on the endogenous regressor but only an indirect impact on the outcome.

In our original example, a valid IV would influence whether the patient was in the intervention vs. usual care, but would affect the outcome only through whether the patient got the intervention. Exogenous variation in the instrument allows us to isolate the causal effect of the endogenous regressor on the outcome.

19 A Simple Example of How IV Works

Q: Does seeing a diabetes specialist improve glycemic control?

Stylized Fact #1: Suppose that glycemic control is better among patients who have seen a diabetes specialist. But does this mean that seeing a diabetes specialist improves glycemic control, or that patients who are more motivated to keep their diabetes under control are more likely to seek out diabetes specialty care?

20 A Simple Example (cont'd)

Stylized Fact #2: Suppose that patients who live in areas with a high density of diabetes specialists (relative to PCPs) are more likely to have seen a diabetes specialist.

Stylized Fact #3: Finally, suppose that patients who live in areas with a high density of diabetes specialists have better glycemic control.

21 A Simple Example (cont'd)

Implication: If we can assume that the relative density of diabetes specialists in an area does not directly influence an individual patient's glycemic control, then the only way to explain the better glycemic control among patients living in areas with greater density of diabetes specialists is through whether the patient has seen a diabetes specialist.

22 A Simple Example (cont'd)

In this example, we "know":
High specialist density => patient sees specialist.
High specialist density => patient has better glycemic control, before controlling for whether the patient saw a specialist.

We assume:
High specialist density does not lead to better glycemic control after controlling for whether the patient saw a specialist.

This means:
Seeing a specialist => better glycemic control, i.e., seeing a specialist causally affects glycemic control.

23 A Simple Example (cont'd)

The causal impact of seeing a diabetes specialist on glycemic control is identified through our assumption that relative specialist density (the "instrument") (1) affects whether the patient sees a specialist and (2) does not directly affect the patient's glycemic control.

Note, however, that if either assumption fails, we cannot draw this causal inference.
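The logic of the simple example can be sketched numerically: with a two-valued instrument, the IV estimate is the difference in outcomes between high- and low-density areas divided by the difference in the share of patients seeing a specialist. Everything in the simulation below (variable names, coefficients, the unobserved `motivation` term, a true effect of 1.0) is a made-up illustration, not data from the presentation.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 200_000
motivation = rng.normal(size=n)             # unobserved confounder
high_density = rng.integers(0, 2, size=n)   # instrument: 1 = high specialist density

# Seeing a specialist depends on the instrument and on unobserved motivation.
latent = 0.8 * high_density + motivation + rng.normal(size=n)
sees_specialist = (latent > 0.5).astype(float)

true_effect = 1.0                           # assumed causal effect on glycemic control
control = true_effect * sees_specialist + 1.5 * motivation + rng.normal(size=n)

# Naive comparison: confounded by motivation.
naive = control[sees_specialist == 1].mean() - control[sees_specialist == 0].mean()

# IV (Wald) estimate: effect of density on outcome / effect of density on treatment.
first_stage = (sees_specialist[high_density == 1].mean()
               - sees_specialist[high_density == 0].mean())
reduced_form = control[high_density == 1].mean() - control[high_density == 0].mean()
iv_estimate = reduced_form / first_stage

print(f"naive: {naive:.2f}, IV: {iv_estimate:.2f}, truth: {true_effect:.2f}")
```

In this setup the naive comparison overstates the benefit because motivated patients both seek specialty care and have better control, while the ratio recovers a value close to the assumed true effect. The result depends entirely on the simulated instrument satisfying both assumptions.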

24 Two-Stage Least Squares (2SLS)

How do you get the IV estimate? The most common IV method is 2SLS, which is a special case when you are using a linear regression for your outcome equation.

First, estimate a "reduced-form" regression of the endogenous regressor on all exogenous variables in the system. Next, use the regression estimates to construct a predicted value for the endogenous regressor.

25 Two-Stage Least Squares (cont'd)

Substitute this predicted value for the actual value of the endogenous regressor in the outcome equation. Then estimate as you normally would.

Standard errors must be adjusted for the use of a predicted value or they will be incorrect, but standard IV commands in software programs will do that for you automatically.

26 Again, with Equations

Model: $Y = \alpha + \beta_1 X_1 + \beta_2 X_2 + \varepsilon$

Suppose that $X_2$ is endogenous to $Y$. An instrumental variable is one that:
i) is correlated with the endogenous variable $X_2$;
ii) is uncorrelated with the error term $\varepsilon$;
iii) should not enter the main equation (i.e., does not explain $Y$ after controlling for $X_2$).

27 Equations (cont'd)

Stage 1: Predict $X_2$ as a function of all other variables plus at least one IV (call it $Z$):

$X_2 = a + \gamma_1 X_1 + \gamma_2 Z + \nu$

Create predicted values of $X_2$; call them $X_{2p}$.

Stage 2: Predict $Y$ as a function of $X_{2p}$ and all other variables (but not $Z$):

$Y = \alpha + \beta_1 X_1 + \beta_2 X_{2p} + \varepsilon$

Note: Adjust the standard errors to account for the fact that $X_{2p}$ is predicted.
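The two stages above can be written out directly with least squares. The following is a minimal sketch on simulated data; the helper names `ols` and `two_stage_least_squares`, the coefficients, and the unobserved term `u` that makes $X_2$ endogenous are all assumptions chosen for illustration. It returns point estimates only and does not compute standard errors.

```python
import numpy as np

def ols(X, y):
    """Least-squares coefficients for y regressed on the columns of X."""
    return np.linalg.lstsq(X, y, rcond=None)[0]

def two_stage_least_squares(y, x1, x2, z):
    """2SLS point estimates (intercept, beta_1, beta_2) with one instrument z."""
    const = np.ones(len(y))
    # Stage 1: regress the endogenous X2 on all exogenous variables (X1 and Z).
    W = np.column_stack([const, x1, z])
    x2p = W @ ols(W, x2)                      # predicted values X_2p
    # Stage 2: regress Y on X1 and the predicted X2 (Z is excluded).
    return ols(np.column_stack([const, x1, x2p]), y)

# Simulated data: u is unobserved and enters both X2 and Y.
rng = np.random.default_rng(2)
n = 100_000
u = rng.normal(size=n)
x1 = rng.normal(size=n)
z = rng.normal(size=n)
x2 = 0.5 * x1 + 1.0 * z + u + rng.normal(size=n)
y = 1.0 + 2.0 * x1 + 3.0 * x2 + 2.0 * u + rng.normal(size=n)

beta_ols = ols(np.column_stack([np.ones(n), x1, x2]), y)
beta_2sls = two_stage_least_squares(y, x1, x2, z)
print("OLS  beta_2:", round(beta_ols[2], 2))
print("2SLS beta_2:", round(beta_2sls[2], 2))
```

In this simulated setup the true $\beta_2$ is 3.0; OLS is pulled upward by the omitted `u`, while the 2SLS estimate is close to 3.0 because `z` is constructed to satisfy conditions (i) to (iii).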

28 2SLS: Application

In our earlier example, we wanted to estimate the causal effect of seeing a diabetes specialist on glycemic control.

First, estimate the (linear) probability that the patient saw a diabetes specialist as a function of the relative density of diabetes specialists in the area and all predictors of glycemic control. This is the "first-stage" or "reduced-form" regression.

29 2SLS: Application (cont'd)

Second, create the predicted (linear) probability of seeing a diabetes specialist for each patient. Estimate glycemic control as a function of the predicted probability of seeing a specialist instead of the indicator for whether they actually saw one. This regression also controls for all other predictors of glycemic control (but not provider density). SEs need to be adjusted unless the software does it. This is the "second-stage" or "structural" regression.
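To illustrate the standard-error adjustment, the sketch below computes the usual homoskedastic 2SLS standard errors, in which the residuals are formed with the actual endogenous regressor, alongside the naive second-stage standard errors that use the predicted one. The function name `two_sls_with_se` and the simulated data are hypothetical; in practice a packaged IV routine would normally be used instead.

```python
import numpy as np

def two_sls_with_se(y, X_exog, x_endog, Z):
    """2SLS coefficients with naive and corrected (homoskedastic) standard errors."""
    n = len(y)
    const = np.ones(n)
    W = np.column_stack([const, X_exog, Z])               # all exogenous variables
    x_hat = W @ np.linalg.lstsq(W, x_endog, rcond=None)[0]
    X_hat = np.column_stack([const, X_exog, x_hat])       # second-stage design
    X_act = np.column_stack([const, X_exog, x_endog])     # same, with actual regressor
    b = np.linalg.lstsq(X_hat, y, rcond=None)[0]
    k = X_hat.shape[1]
    XtX_inv_diag = np.diag(np.linalg.inv(X_hat.T @ X_hat))
    resid_naive = y - X_hat @ b        # what a plain second-stage OLS would use
    resid_correct = y - X_act @ b      # residuals using the actual endogenous regressor
    se_naive = np.sqrt(XtX_inv_diag * (resid_naive @ resid_naive) / (n - k))
    se_correct = np.sqrt(XtX_inv_diag * (resid_correct @ resid_correct) / (n - k))
    return b, se_naive, se_correct

rng = np.random.default_rng(3)
n = 50_000
u = rng.normal(size=n)                                    # unobserved confounder
x1 = rng.normal(size=n)
z = rng.normal(size=n)                                    # instrument
x2 = 0.5 * x1 + z + u + rng.normal(size=n)                # endogenous regressor
y = 1.0 + 2.0 * x1 + 1.0 * x2 - 2.0 * u + rng.normal(size=n)

b, se_naive, se_correct = two_sls_with_se(y, x1, x2, z)
print("beta_2:", round(b[2], 3))
print("naive SE:", round(se_naive[2], 4), "corrected SE:", round(se_correct[2], 4))
```

In this particular simulated setup the naive standard error is smaller than the corrected one; with a different data-generating process the naive value can instead be too large, which is why the correction matters in either case.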

30 Intuition

By replacing the indicator for whether you actually saw a diabetes specialist with the predicted probability of seeing a specialist, you are breaking its correlation with the error term in the glycemic control regression, so you get an unbiased estimate of its causal effect on glycemic control.

The predicted values are just linear combinations of the exogenous variables, so by construction are not correlated with the error term.

31 Instruments and the Exclusion or Identifying Restriction

Instruments for seeing a diabetes specialist are variables included in the first- but not the second-stage regression.

To identify the effect of diabetes specialty care on glycemic control, we need at least one variable that affects specialty care, but does not directly influence glycemic control after controlling for specialty care and other covariates.

32 Intuition Behind Identification

To determine the effect of specialty care on glycemic control, we want to see how glycemic control changes when there is an exogenous shift in specialty care, holding everything else determining glycemic control constant.

Thus, we need to find something that will shift around specialty care without changing any of the other predictors of glycemic control.

33 Importance of Instrument

In our example, the instrument is the relative density of diabetes specialists in the area.

If provider density does not actually affect an individual patient's probability of seeing a specialist, or if provider density has a direct influence on glycemic control (perhaps because areas with a high density of diabetes specialists have more, or higher-quality, other resources for achieving glycemic control), then we can't separate out the independent causal effect of seeing a specialist.

34 Importance of Other Controls

Usually it is easier to meet, as well as test, the criterion that the instrument (specialist provider density) should predict the endogenous regressor (whether the patient saw a specialist).

Typically it is the exclusion restriction that fails (provider density shouldn't affect glycemic control after controlling for seeing a specialist). In addition, you can only test this assumption if you have more than one instrument. You are more likely to meet this assumption if you control well for other confounders, but then it can be harder to meet the first assumption.

35 Randomized Controlled Trials: An IV Application

Suppose many of the people randomized to the treatment group do not actually end up getting the treatment. An intent-to-treat design may understate the true effect of the treatment (say, on HRQL).

The random assignment can be used as the instrument for whether the person actually got the treatment in an as-treated analysis.

36 RCTs: An IV Application (cont'd)

Stage 1: Estimate whether the person got treatment as a function of treatment assignment and all other predictors of either treatment or HRQL.

Stage 2: Estimate HRQL as a function of the predicted probability of treatment, controlling for other determinants of HRQL.

Treatment assignment is assumed to influence HRQL only indirectly, through actual receipt of the treatment.

37 RCTs: An IV Application (cont'd)

The fundamental difference from the usual IV analyses in the literature is which assumption is most likely to fail.

Randomization is often excludable by definition (e.g., assignment to a treatment arm only affects the outcome if you actually get the treatment), although there are exceptions. However, if there is a lot of crossover, randomization may not be a strong predictor of treatment group.
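The RCT application can be sketched with a simulated trial in which adherence is imperfect and related to an unobserved characteristic. The variable names (`assigned`, `frailty`, `treated`, `hrql`), the uptake rule, and the assumed homogeneous treatment effect of 5 points are all hypothetical; with no other covariates, the two-stage procedure reduces to dividing the intent-to-treat difference by the difference in treatment uptake.

```python
import numpy as np

rng = np.random.default_rng(4)
n = 200_000
assigned = rng.integers(0, 2, size=n)      # random assignment (the instrument)
frailty = rng.normal(size=n)               # unobserved; lowers uptake and HRQL

# Actual receipt of treatment: more likely if assigned, less likely if frail.
# Some assigned patients do not adhere and some controls cross over.
latent = 1.6 * assigned - 0.8 * frailty + rng.normal(size=n)
treated = (latent > 0.8).astype(float)

true_effect = 5.0                          # assumed homogeneous effect on HRQL
hrql = true_effect * treated - 3.0 * frailty + rng.normal(size=n)

itt = hrql[assigned == 1].mean() - hrql[assigned == 0].mean()
as_treated = hrql[treated == 1].mean() - hrql[treated == 0].mean()
uptake_diff = treated[assigned == 1].mean() - treated[assigned == 0].mean()
iv = itt / uptake_diff                     # assignment used as the instrument

print(f"ITT: {itt:.2f}, naive as-treated: {as_treated:.2f}, IV: {iv:.2f}")
```

In this simulation the intent-to-treat difference understates the assumed effect, the naive as-treated comparison overstates it because less frail patients are more likely to take up treatment, and the IV estimate is close to 5.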

38 Interpretation

For which patients does this procedure yield an unbiased estimate of the effect of treatment on HRQL?

At a minimum, this estimate applies to the "marginal" people, i.e., those for whom a change in the instrument (treatment assignment) would change the endogenous regressor (treatment). If the treatment effect is (assumed to be) homogeneous, then (by definition) the estimate generalizes to the entire population.
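For a two-valued instrument and no other covariates, the "marginal people" interpretation can be stated compactly. The notation below is an assumption introduced for illustration: $Z$ is treatment assignment, $D$ is treatment actually received, $Y$ is HRQL, and $Y(1)$, $Y(0)$ are a person's potential outcomes with and without treatment. Under the IV assumptions, including monotonicity, the IV estimand is the average effect among those whose treatment status is changed by the instrument:

```latex
\[
\beta_{IV}
= \frac{E[Y \mid Z = 1] - E[Y \mid Z = 0]}{E[D \mid Z = 1] - E[D \mid Z = 0]}
= E\bigl[\,Y(1) - Y(0) \mid D \text{ changes with } Z\,\bigr]
\]
```

If $Y(1) - Y(0)$ is the same for everyone, the right-hand side equals the population average effect, which is the homogeneous case described above.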

39 Limitations of IV Analysis

It's difficult to find a valid instrument. Sometimes instruments are so weak that IV is actually more biased than OLS, even in quite large samples.

Even if your instrument is a strong predictor of treatment, it is often hard to argue that it's excludable from the second-stage outcome equation, again leading to bias. If you don't have a valid instrument, then you also don't have a valid test for endogeneity.

40 Limitations (cont'd)

Key question: Is the correlation of the IV with the endogenous regressor high relative to its correlation with the outcome? The greater the correlation of the IV with the outcome, the stronger its correlation with the endogenous regressor needs to be.

Crown et al. (2011) have a nice simulation paper showing that IV estimates have less error than OLS estimates only under circumstances close to ideal.
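The key question can be made concrete with a small simulation in which the instrument has a slight direct effect on the outcome. In large samples the IV bias is roughly that direct effect divided by the strength of the first stage, so a weak instrument magnifies even a small violation. The function `iv_and_ols`, the parameters `pi` (first-stage strength) and `delta` (direct effect of the instrument on the outcome), and all numeric values are assumptions for illustration; this is not a reproduction of the Crown et al. simulations.

```python
import numpy as np

def iv_and_ols(pi, delta, beta=1.0, n=500_000, seed=5):
    """Simple IV and OLS slope estimates when the exclusion restriction is violated."""
    rng = np.random.default_rng(seed)
    u = rng.normal(size=n)                     # unobserved confounder
    z = rng.normal(size=n)                     # instrument
    x = pi * z + u + rng.normal(size=n)        # endogenous regressor
    y = beta * x + delta * z + u + rng.normal(size=n)
    iv = np.cov(z, y)[0, 1] / np.cov(z, x)[0, 1]
    ols = np.cov(x, y)[0, 1] / np.var(x, ddof=1)
    return iv, ols

iv_strong, ols_strong = iv_and_ols(pi=1.0, delta=0.1)
iv_weak, ols_weak = iv_and_ols(pi=0.05, delta=0.1)
print(f"strong instrument: IV {iv_strong:.2f}, OLS {ols_strong:.2f} (truth 1.00)")
print(f"weak instrument:   IV {iv_weak:.2f}, OLS {ols_weak:.2f} (truth 1.00)")
```

With the strong instrument, the same small direct effect leaves IV closer to the truth than OLS; with the weak instrument, IV ends up further from the truth than OLS despite the large sample.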

41 Limitations (cont'd)

If you have multiple potential IVs, with at least one known (or assumed) to be valid, you can test the overidentifying restrictions on the model; if you have only one instrument, you can't test the exclusion restriction.

Even with a plausible instrument, IV estimates are less efficient than single-equation estimates (especially if your endogenous regressor is dichotomous), so you lose precision/power.

42 Future IV Topics

The other three IV assumptions!
IV when the outcome is nonlinear (two-stage residual inclusion).
Other IV models when the endogenous regressor is dichotomous.
IV with multiple endogenous regressors.
Testing endogeneity (Hausman or augmented regression test).
Testing the strength of the instrument (in their 1997 paper, Staiger & Stock recommend a partial F statistic threshold of 5).
Testing the exclusion restriction (Sargan, 1984).
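As a sketch of the instrument-strength check mentioned in the list above, the partial F statistic compares the first-stage regression with and without the instrument(s), holding the other covariates fixed. The helper `partial_f` and the simulated data below are assumptions for illustration; the threshold against which the statistic is judged is a separate question.

```python
import numpy as np

def partial_f(x_endog, X_exog, Z):
    """Partial F statistic for the instrument(s) Z in the first-stage regression."""
    n = len(x_endog)
    restricted = np.column_stack([np.ones(n), X_exog])   # covariates only
    unrestricted = np.column_stack([restricted, Z])      # covariates + instrument(s)

    def rss(X):
        resid = x_endog - X @ np.linalg.lstsq(X, x_endog, rcond=None)[0]
        return resid @ resid

    q = unrestricted.shape[1] - restricted.shape[1]      # number of instruments
    df = n - unrestricted.shape[1]                       # residual degrees of freedom
    return ((rss(restricted) - rss(unrestricted)) / q) / (rss(unrestricted) / df)

rng = np.random.default_rng(6)
n = 2_000
x1 = rng.normal(size=n)
z_strong = rng.normal(size=n)
z_weak = rng.normal(size=n)
x2 = 0.5 * x1 + 0.3 * z_strong + 0.02 * z_weak + rng.normal(size=n)

f_strong = partial_f(x2, x1, z_strong)
f_weak = partial_f(x2, x1, z_weak)
print(f"partial F, strong instrument: {f_strong:.1f}")
print(f"partial F, weak instrument:   {f_weak:.1f}")
```

With a single instrument, this statistic equals the square of the instrument's t statistic in the first-stage regression.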

43 Additional Slides

44 Does X Really Cause Y? (Dowd & Town)

Example of Omitted Variables Bias

[Figure: path diagram with nodes Type of Health Plan (X), Health Care Expenditures (Y), Unmeasured Consumer Desire for Broad Coverage, and Unmeasured Chronic Illness; path labels R, S, β_R, β_HP, β_S.]

45 Places to Look for Instruments (borrowed from Staiger)

Geography (distance, rivers, small area variation).
Legal/political institutions (laws, election dynamics).
Administrative/program rules (wage/staffing rules, reimbursement rules, eligibility rules, mandates).
Natural randomization (draft, birthdate, lottery, roommate assignment, weather).

46 IV Assumptions

Although we focus primarily on two of them, IV actually relies on 5 assumptions.

We illustrate these assumptions using the example of the effect of seeing a diabetes specialist on glycemic control, using the (relative) density of diabetes specialty providers in the patient's area of residence as the instrument.

47 Non-Zero Average Causal Effect

Formally: The instrument must predict the endogenous regressor, controlling for the other covariates.

Example: Controlling for the other covariates, provider density is a good predictor of whether the patient has seen a diabetes specialist.

48 Exclusion Restriction

Formally: The instrument has only a negligible direct influence on the outcome after controlling for the covariates, that is, the instrument is uncorrelated with the error term.

Example: Controlling for whether the patient has seen a diabetes specialist and the other covariates, provider density does not predict glycemic control.

49 Monotonicity

Formally: The instrument cannot increase the value of the endogenous regressor for some subjects, but decrease it for others.

Example: All respondents who would have seen a diabetes specialist if they lived in an area with low provider density would also have seen one if they lived in an area with high provider density.

50 Random Assignment

Formally: Subjects must be effectively randomized into the value for the IV within subgroups defined by the other covariates.

Example: Knowing a respondent's glycemic control does not yield any information about provider density in the respondent's area of residence.

51 Stable Unit Treatment Value Assumption

Formally: The outcome of one subject is not influenced by the value of the endogenous regressor for other subjects.

Example: The glycemic control of one subject is not influenced by whether other subjects see a diabetes specialist, and differences in the effectiveness of seeing a specialist are minor.
