Instrumental Variables Estimation: An Introduction

1 Instrumental Variables Estimation: An Introduction
Susan L. Ettner, Ph.D., Professor, Division of General Internal Medicine and Health Services Research, UCLA

2 The Problem

3 The Problem
Suppose you wish to compare the outcomes of patients who receive an intervention (or are enrolled in a program, or receive a particular treatment) with the outcomes of patients in usual care (or who do not receive the treatment).
Key point: Participation in the intervention or special program is not randomized; in fact, there is reason to believe that patients either self-selected or were selected (e.g., by providers) into the intervention based on particular attributes, not all of which are measured.

4 The Problem (cont'd)
Examples are real-world interventions or programs where it is not feasible to randomize.
A similar problem arises if you have randomization but are interested in estimating the treatment-on-the-treated (TOT, or "as-treated") effect instead of the intent-to-treat (ITT) effect, e.g., when there is significant crossover due to nonadherence and contamination.
If the unobserved attributes determining whether patients end up in the intervention vs. usual care are also attributes that affect the outcome, then the indicator for treatment assignment is endogenous, leading to potential bias.

5 Definition of Endogeneity
An explanatory variable in a regression model is said to be endogenous when it is correlated with the error term in the regression, potentially leading to biased estimates of the causal effect of the predictor on the dependent variable.
Endogeneity can arise as a result of omitted variables (sometimes called "treatment selection"), reverse causality, simultaneity, measurement error, or autocorrelated errors (when the model includes a lagged dependent variable).
In this presentation, I focus on bias due to omitted variables, or treatment selection.

6 Omitted-Variables Bias, a.k.a. Treatment Selection Bias

7 Selection Bias and the Residual
Recall that in a regression, the residual (or "error") term captures the effects of all omitted and imperfectly measured variables.
In a linear regression, if the omitted variables are uncorrelated with your covariates (the so-called X's), then the only concern is that your model will have lower explanatory power and your estimates may be less precise.
However, if the omitted variables are correlated with some of the X's, then the coefficient estimates on those X's may be biased.

8 Implications of Omitted Variables
If you can't measure a predictor likely to be correlated with your treatment or intervention indicator, there is potential selection bias.
Because the likely direction of the bias depends on the correlation of the omitted variable with both the regressor and the outcome, it is important to think about what you aren't controlling for in the model (as much as you do about what you are controlling for); a conceptual model helps here.
Sometimes there are competing effects, making it difficult to anticipate the likely direction of bias.

9 Example #1(a)
Q: How does glycemic control compare for patients whose usual source of care (USC) is a diabetes specialist vs. those whose USC is a PCP?
If patients who see specialists are unobservably sicker, then the beneficial effects of specialty care on glycemic control may be understated (biased toward zero).
Severity of illness is captured in the error term of the outcome regression and is correlated with seeing a specialist, so the specialty care dummy ends up serving as a proxy for being sicker.
The true effect of specialty care on the outcome is offset by the effect of being a sicker patient.

10 Example #1(b)
Same question, different omitted variable.
If patients who see specialists are unobservably more motivated, then the beneficial effects of specialty care on glycemic control may be overstated (biased away from zero).
Motivation is captured in the error term of the outcome regression and is correlated with seeing a specialist, so the coefficient on the specialty care dummy picks up both the true effect and the effect of proxying for being a more motivated patient.

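11 Determining the Direction of Bias (simple case, no 2nd-order effects)
True model: $Y = \alpha X_1 + \beta X_2 + \varepsilon$
Model run: $Y = \varphi X_1 + \mu$
Think of running an auxiliary regression of $X_2$ on $X_1$: $X_2 = \theta + \rho X_1 + \eta$
Instead of $\alpha$, the estimated coefficient on $X_1$ will be $\alpha + \rho\beta$, which is $> \alpha$ or $< \alpha$ depending on the signs of $\rho$ and $\beta$; whether that pushes the estimate toward or away from zero also depends on the sign of $\alpha$.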
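
The $\alpha + \rho\beta$ result can be seen by substituting the auxiliary regression into the true model. The short derivation below uses the slide's own symbols and assumes that the estimated model includes an intercept (which absorbs the constant $\beta\theta$) and that $\eta$ is uncorrelated with $X_1$, as it is by construction in a linear projection.

```latex
\begin{aligned}
Y &= \alpha X_1 + \beta X_2 + \varepsilon \\
  &= \alpha X_1 + \beta\,(\theta + \rho X_1 + \eta) + \varepsilon \\
  &= \beta\theta + (\alpha + \rho\beta)\, X_1 + (\beta\eta + \varepsilon)
\end{aligned}
```

The coefficient on $X_1$ in the model that omits $X_2$ therefore converges to $\alpha + \rho\beta$, so the omitted-variable bias is $\rho\beta$.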

12 Direction of Omitted-Variable Bias (Simple Case)
True model: $Y = \alpha X_1 + \beta X_2 + \varepsilon$, with $X_2$ not measured. Each cell gives the direction of the bias in $\hat{\alpha}$.

Correlation of X1 and X2 | α > 0, β > 0   | α > 0, β < 0   | α < 0, β > 0   | α < 0, β < 0
Positive (ρ > 0)         | away from zero | toward zero    | toward zero    | away from zero
Negative (ρ < 0)         | toward zero    | away from zero | away from zero | toward zero
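
A small simulation can make the direction-of-bias logic concrete. The sketch below is illustrative only: the coefficient values, sample size, and the helper name `ols` are assumptions, not part of the slides. It generates data from the true model and compares the regression that omits X2 with the one that includes it.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200_000
alpha, beta, rho = 1.0, 2.0, 0.5   # hypothetical values: alpha > 0, beta > 0, rho > 0

x1 = rng.normal(size=n)
x2 = rho * x1 + rng.normal(size=n)               # auxiliary relation: X2 = rho * X1 + eta
y = alpha * x1 + beta * x2 + rng.normal(size=n)  # true model

def ols(y, *cols):
    """Least-squares coefficients with an intercept in position 0."""
    X = np.column_stack([np.ones(len(y)), *cols])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef

short = ols(y, x1)[1]       # X2 omitted: should be close to alpha + rho * beta
long = ols(y, x1, x2)[1]    # X2 included: should be close to alpha

print(f"omitting X2: {short:.3f}  (alpha + rho*beta = {alpha + rho * beta:.3f})")
print(f"including X2: {long:.3f}  (alpha = {alpha:.3f})")
```

With all three parameters positive, the short regression overstates the coefficient, matching the "away from zero" cell of the table. Flipping the sign of `rho` or `beta` moves the estimate toward zero instead.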

13 Some Possible Solutions (But No Panacea)

14 Possible Methods for Addressing Bias Due to Treatment Selection
Propensity score methods: Designed to better exploit observed attributes and help prevent out-of-sample prediction. They do not necessarily help with bias due to unobservables (and in some cases may exacerbate it). Can use the R&R method to test sensitivity.
Treatment effects models: Require strong distributional assumptions.

15 Methods for Addressing Bias (cont'd)
Longitudinal data modeling
Quasi-experimental designs: compare change over time among intervention vs. comparison groups to net out heterogeneity.
However, this still assumes that the secular time trends would be the same in the absence of the intervention.
Treatment selection could lead to violation of this assumption, which is hard to check without a long "pre" period.
Also, you can only difference out attributes that are constant over time.
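
The idea of comparing changes over time to net out time-constant heterogeneity can be sketched as follows. This is a hypothetical simulation, not an analysis from the presentation: the variable names (`severity`, `trend`, `effect`) and all numeric values are assumptions, and the common-trend assumption holds by construction.

```python
import numpy as np

rng = np.random.default_rng(5)
n = 50_000
group = rng.integers(0, 2, size=n)               # 1 = intervention, 0 = comparison
severity = rng.normal(size=n) + 1.0 * group      # unobserved, constant over time, differs by group
trend, effect = 1.0, 2.0                         # common secular trend and true effect (assumed)

pre = 10 + 3 * severity + rng.normal(size=n)
post = 10 + 3 * severity + trend + effect * group + rng.normal(size=n)

# Cross-sectional comparison after the intervention: confounded by severity
naive = post[group == 1].mean() - post[group == 0].mean()

# Difference-in-differences: the time-constant severity term cancels within person
change = post - pre
did = change[group == 1].mean() - change[group == 0].mean()

print(f"post-period difference: {naive:.2f}")
print(f"difference-in-differences: {did:.2f} (true effect {effect})")
```

If the trend differed between groups, the difference-in-differences estimate would absorb that difference, which is the limitation the slide notes.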

16 Instrumental Variables Methods

17 IV in Action
Even after adjusting for a fairly extensive set of baseline severity measures:
Depression treatment was associated with worse outcomes in the PIC study.
Medicaid insurance was associated with higher mortality in the HCSUS study.
Both of these results reversed themselves when IV methods were used.

18 Instrumental Variables (IV)
The idea behind IV is to achieve quasi-randomization using a variable (the "instrument") that has a direct impact on the endogenous regressor but only an indirect impact on the outcome.
In our original example, a valid IV would influence whether the patient was in the intervention vs. usual care, but would affect the outcome only through whether the patient got the intervention.
Exogenous variation in the instrument allows us to isolate the causal effect of the endogenous regressor on the outcome.

19 A Simple Example of How IV Works
Q: Does seeing a diabetes specialist improve glycemic control?
Stylized Fact #1: Suppose that glycemic control is better among patients who have seen a diabetes specialist.
But does this mean that seeing a diabetes specialist improves glycemic control, or that patients who are more motivated to keep their diabetes under control are more likely to seek out diabetes specialty care?

20 A Simple Example (cont'd)
Stylized Fact #2: Suppose that patients who live in areas with a high density of diabetes specialists (relative to PCPs) are more likely to have seen a diabetes specialist.
Stylized Fact #3: Finally, suppose that patients who live in areas with a high density of diabetes specialists have better glycemic control.

21 A Simple Example (cont'd)
Implication: If we can assume that the relative density of diabetes specialists in an area does not directly influence an individual patient's glycemic control, then the only way to explain the better glycemic control among patients living in areas with a greater density of diabetes specialists is through whether the patient has seen a diabetes specialist.

22 A Simple Example (cont'd)
In this example, we "know":
High specialist density => patient sees specialist
High specialist density => patient has better glycemic control, before controlling for whether the patient saw a specialist
We assume:
High specialist density does NOT => patient has better glycemic control, after controlling for whether the patient saw a specialist
This means:
Seeing a specialist => better glycemic control, i.e., seeing a specialist causally affects glycemic control.

23 A Simple Example (cont'd)
The causal impact of seeing a diabetes specialist on glycemic control is identified through our assumption that relative specialist density (the "instrument"):
(1) affects whether the patient sees a specialist
(2) does not directly affect the patient's glycemic control
Note, however, that if either assumption fails, we cannot draw this causal inference.
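
The logic of the simple example can be written as a ratio. The notation below is an assumption introduced for illustration: $Z$ is specialist density, $X$ is whether the patient saw a specialist, $Y$ is glycemic control, and the outcome model is $Y = \alpha + \beta X + \varepsilon$ with a single instrument and no other covariates.

```latex
\operatorname{Cov}(Y, Z)
  = \beta \operatorname{Cov}(X, Z) + \operatorname{Cov}(\varepsilon, Z)
\quad\Longrightarrow\quad
\beta_{IV} = \frac{\operatorname{Cov}(Y, Z)}{\operatorname{Cov}(X, Z)}
\quad \text{when } \operatorname{Cov}(\varepsilon, Z) = 0 \text{ and } \operatorname{Cov}(X, Z) \neq 0 .
```

The numerator corresponds to Stylized Fact #3 (density and glycemic control), the denominator to Stylized Fact #2 (density and seeing a specialist), and the two conditions on the right are assumptions (2) and (1), respectively.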

24 Two-Stage Least Squares (2SLS)
How do you get the IV estimate?
The most common IV method is 2SLS, which is a special case when you are using a linear regression for your outcome equation.
First, estimate a reduced-form regression of the endogenous regressor on all exogenous variables in the system.
Next, use the regression estimates to construct a predicted value for the endogenous regressor.

25 Two-Stage Least Squares (cont'd)
Substitute this predicted value for the actual value of the endogenous regressor in the outcome equation.
Then estimate as you normally would.
Standard errors must be adjusted for the use of a predicted value or they will be incorrect, but standard IV commands in software programs will do that for you automatically.

26 Again, with Equations
Model: $Y = \alpha + \beta_1 X_1 + \beta_2 X_2 + \varepsilon$
Suppose that $X_2$ is endogenous to $Y$. An instrumental variable is one that:
i) is correlated with the endogenous variable $X_2$
ii) is uncorrelated with the error term $\varepsilon$
iii) should not enter the main equation (i.e., does not explain $Y$ after controlling for $X_2$)

27 Equations (cont'd)
Stage 1: Predict $X_2$ as a function of all other variables plus at least one IV (call it $Z$): $X_2 = a + \gamma_1 X_1 + \gamma_2 Z + \nu$
Create predicted values of $X_2$; call them $X_2^p$.
Stage 2: Predict $Y$ as a function of $X_2^p$ and all other variables (but not $Z$): $Y = \alpha + \beta_1 X_1 + \beta_2 X_2^p + \varepsilon$
Note: Adjust the standard errors to account for the fact that $X_2^p$ is predicted.
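
The two stages can be carried out by hand on simulated data to see what they do. The sketch below is a minimal illustration under assumed values: the unobserved confounder `u`, the coefficients, and the helper `ols` are hypothetical, and the point estimates are shown without standard errors.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 100_000
beta1, beta2 = 0.5, 1.0          # hypothetical true coefficients

u = rng.normal(size=n)           # unobserved confounder (makes X2 endogenous)
x1 = rng.normal(size=n)          # exogenous covariate X1
z = rng.normal(size=n)           # instrument Z: affects X2, not Y directly
x2 = 0.3 * x1 + 0.8 * z + u + rng.normal(size=n)
y = 2.0 + beta1 * x1 + beta2 * x2 - 1.5 * u + rng.normal(size=n)

def ols(y, *cols):
    """Least-squares coefficients with an intercept in position 0."""
    X = np.column_stack([np.ones(len(y)), *cols])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef

# Stage 1: regress X2 on X1 and Z, then form predicted values
g = ols(x2, x1, z)
x2_pred = g[0] + g[1] * x1 + g[2] * z

# Stage 2: regress Y on X1 and predicted X2 (Z itself is excluded)
b_2sls = ols(y, x1, x2_pred)

# Ordinary regression on the actual X2, for comparison
b_ols = ols(y, x1, x2)

print(f"OLS estimate of beta2:  {b_ols[2]:.3f}")
print(f"2SLS estimate of beta2: {b_2sls[2]:.3f} (true value {beta2})")
```

In this setup the confounder pushes the ordinary regression estimate well below the true value, while the two-stage estimate recovers it because `z` is valid by construction.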

28 2SLS: Application
In our earlier example, we wanted to estimate the causal effect of seeing a diabetes specialist on glycemic control.
First, estimate the (linear) probability that the patient saw a diabetes specialist as a function of the relative density of diabetes specialists in the area and all predictors of glycemic control.
This is the "first-stage" or "reduced-form" regression.

29 2SLS: Application (cont'd)
Second, create the predicted (linear) probability of seeing a diabetes specialist for each patient.
Estimate glycemic control as a function of the predicted probability of seeing a specialist instead of the indicator for whether the patient actually saw one.
This regression also controls for all other predictors of glycemic control (but not provider density).
SEs need to be adjusted unless the software does it.
This is the "second-stage" or "structural" regression.
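
The standard-error adjustment amounts to computing the residual variance from the actual endogenous regressor rather than its predicted value. The sketch below illustrates this on simulated data under a homoskedasticity assumption; the data-generating values and variable names are hypothetical, and packaged IV routines should be preferred in practice.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 50_000
u = rng.normal(size=n)                     # unobserved confounder
x1 = rng.normal(size=n)                    # exogenous covariate
z = rng.normal(size=n)                     # instrument
x2 = 0.3 * x1 + 0.8 * z + u + rng.normal(size=n)
y = 2.0 + 0.5 * x1 + 1.0 * x2 - 1.5 * u + rng.normal(size=n)

ones = np.ones(n)
X = np.column_stack([ones, x1, x2])        # regressors with the ACTUAL X2
W = np.column_stack([ones, x1, z])         # all exogenous variables, including Z

# Stage 1: predicted X2; Stage 2: regress Y on X1 and predicted X2
gamma, *_ = np.linalg.lstsq(W, x2, rcond=None)
X_hat = np.column_stack([ones, x1, W @ gamma])
beta, *_ = np.linalg.lstsq(X_hat, y, rcond=None)

k = X.shape[1]
XtX_inv = np.linalg.inv(X_hat.T @ X_hat)

# Unadjusted: residuals computed with the predicted X2 (what a plain second-stage OLS reports)
resid_naive = y - X_hat @ beta
se_naive = np.sqrt(resid_naive @ resid_naive / (n - k) * np.diag(XtX_inv))

# Adjusted: same coefficients, but residuals computed with the actual X2
resid_2sls = y - X @ beta
se_2sls = np.sqrt(resid_2sls @ resid_2sls / (n - k) * np.diag(XtX_inv))

print(f"beta2 = {beta[2]:.3f}, unadjusted SE = {se_naive[2]:.4f}, adjusted SE = {se_2sls[2]:.4f}")
```

In this particular design the unadjusted standard error comes out smaller than the adjusted one; the size and direction of the discrepancy depend on the data-generating process.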

30 Intuition
By replacing the indicator for whether you actually saw a diabetes specialist with the predicted probability of seeing a specialist, you are breaking its correlation with the error term in the glycemic control regression, so you get a consistent estimate of its causal effect on glycemic control.
The predicted values are just linear combinations of the exogenous variables, so, provided the instrument is valid, they are not correlated with the error term.

31 Instruments and the Exclusion or Identifying Restriction
Instruments for seeing a diabetes specialist are variables included in the 1st-stage but not the 2nd-stage regression.
To identify the effect of diabetes specialty care on glycemic control, we need at least one variable that affects specialty care but does not directly influence glycemic control after controlling for specialty care and other covariates.

32 Intuition Behind Identification
To determine the effect of specialty care on glycemic control, we want to see how glycemic control changes when there is an exogenous shift in specialty care, holding everything else determining glycemic control constant.
Thus, we need to find something that will shift around specialty care without changing any of the other predictors of glycemic control.

33 Importance of Instrument
In our example, the instrument is the relative density of diabetes specialists in the area.
If provider density does not actually affect an individual patient's probability of seeing a specialist, or if provider density has a direct influence on glycemic control (perhaps because areas with a high density of diabetes specialists have more, or higher-quality, other resources for achieving glycemic control), then we can't separate out the independent causal effect of seeing a specialist.

34 Importance of Other Controls
Usually it is easier to meet, as well as test, the criterion that the instrument (specialist provider density) should predict the endogenous regressor (whether the patient saw a specialist).
Typically it is the exclusion restriction that fails (provider density shouldn't affect glycemic control after controlling for seeing a specialist).
In addition, you can only test this assumption if you have more than one instrument.
You are more likely to meet this assumption if you control well for other confounders, but then it can be harder to meet the first assumption.

35 Randomized Controlled Trials: An IV Application
Suppose many of the people randomized to the treatment group do not actually end up getting the treatment.
An intent-to-treat design may understate the true effect of the treatment (say, on HRQL).
The random assignment can be used as the instrument for whether the person actually got the treatment in an "as-treated" analysis.

36 RCTs: An IV Application (cont'd)
Stage 1: Estimate whether the person got treatment as a function of treatment assignment and all other predictors of either treatment or HRQL.
Stage 2: Estimate HRQL as a function of the predicted probability of treatment, controlling for other determinants of HRQL.
Treatment assignment is assumed to influence HRQL only indirectly, through actual receipt of the treatment.

37 RCTs: An IV Application (cont'd)
The fundamental difference from the usual IV analyses in the literature is which assumption is most likely to fail.
Randomization is often excludable by definition (e.g., assignment to a treatment arm only affects the outcome if you actually get the treatment), although there are exceptions.
However, if there is a lot of crossover, randomization may not be a strong predictor of treatment group.
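
With a binary instrument and no covariates, the IV estimate reduces to the intent-to-treat difference divided by the difference in treatment receipt between arms. The simulation below is a hypothetical sketch of a trial with one-sided nonadherence: the variable names (`frailty`, `complier`, `hrql`), the effect size, and the uptake rule are all assumptions.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 200_000
assigned = rng.integers(0, 2, size=n)             # randomized assignment (the instrument)
frailty = rng.normal(size=n)                      # unobserved; affects both uptake and HRQL
complier = frailty + rng.normal(size=n) < 0.3     # would take the treatment if assigned to it
treated = (assigned == 1) & complier              # no one in the control arm gets treated
effect = 5.0                                      # assumed homogeneous treatment effect
hrql = 50 - 3 * frailty + effect * treated + rng.normal(size=n)

# Intent-to-treat: compare arms as randomized
itt = hrql[assigned == 1].mean() - hrql[assigned == 0].mean()

# First stage: effect of assignment on actually receiving treatment
first_stage = treated[assigned == 1].mean() - treated[assigned == 0].mean()

# IV (Wald) estimate: ITT scaled by the share whose treatment was changed by assignment
iv = itt / first_stage

# Naive as-treated comparison, ignoring who chose to take the treatment
as_treated = hrql[treated].mean() - hrql[~treated].mean()

print(f"ITT: {itt:.2f}, first stage: {first_stage:.2f}, IV: {iv:.2f}, naive as-treated: {as_treated:.2f}")
```

In this setup the ITT estimate understates the effect, the naive as-treated comparison overstates it because less frail people are more likely to take the treatment, and the IV estimate is close to the assumed effect.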

38 Interpretation
For which patients does this procedure yield an unbiased estimate of the effect of treatment on HRQL?
At a minimum, this estimate applies to the "marginal" people, i.e., those for whom a change in the instrument (treatment assignment) would change the endogenous regressor (treatment).
If the treatment effect is (assumed to be) homogeneous, then (by definition) the estimate generalizes to the entire population.

39 Limitations of IV Analysis
It's difficult to find a valid instrument.
Sometimes instruments are so weak that IV is actually more biased than OLS, even in quite large samples.
Even if your instrument is a strong predictor of treatment, it is often hard to argue that it's excludable from the second-stage outcome equation, again leading to bias.
If you don't have a valid instrument, then you also don't have a valid test for endogeneity.

40 Limitations (cont'd)
Key question: Is the correlation of the IV with the endogenous regressor high relative to its correlation with the outcome (apart from the pathway through the endogenous regressor)?
The greater this correlation of the IV with the outcome, the stronger its correlation with the endogenous regressor needs to be.
Crown et al. (2011) have a nice simulation paper showing that IV estimates have less error than OLS estimates only under circumstances close to ideal.

41 Limitations (cont'd)
If you have multiple potential IVs, with at least one known (or assumed) to be valid, you can test the overidentifying restrictions on the model; if you have only one instrument, you can't test the exclusion restriction.
Even with a plausible instrument, IV estimates are less efficient than single-equation estimates (especially if your endogenous regressor is dichotomous), so you lose precision/power.
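
One common form of the overidentification test regresses the 2SLS residuals on all instruments and uses n times the R-squared as the test statistic, which is compared with a chi-square distribution whose degrees of freedom equal the number of instruments minus the number of endogenous regressors. The sketch below is a simplified illustration with one endogenous regressor, two instruments, and no other covariates; the function names and all simulated values are hypothetical.

```python
import numpy as np

def sargan_statistic(y, x, Z):
    """n * R^2 from regressing 2SLS residuals on the instruments (with intercept)."""
    n = len(y)
    W = np.column_stack([np.ones(n), Z])
    x_hat = W @ np.linalg.lstsq(W, x, rcond=None)[0]            # first stage
    b = np.linalg.lstsq(np.column_stack([np.ones(n), x_hat]), y, rcond=None)[0]
    resid = y - b[0] - b[1] * x                                 # residuals use the actual x
    fitted = W @ np.linalg.lstsq(W, resid, rcond=None)[0]
    r2 = 1.0 - np.sum((resid - fitted) ** 2) / np.sum((resid - resid.mean()) ** 2)
    return n * r2

def simulate(direct_effect_z2, seed):
    """Two instruments; the second violates exclusion when direct_effect_z2 != 0."""
    rng = np.random.default_rng(seed)
    n = 20_000
    Z = rng.normal(size=(n, 2))
    u = rng.normal(size=n)                                      # unobserved confounder
    x = 0.7 * Z[:, 0] + 0.7 * Z[:, 1] + u + rng.normal(size=n)
    y = 1.0 + 1.0 * x - u + direct_effect_z2 * Z[:, 1] + rng.normal(size=n)
    return y, x, Z

stat_valid = sargan_statistic(*simulate(0.0, seed=6))      # both instruments excludable
stat_invalid = sargan_statistic(*simulate(0.5, seed=6))    # second instrument affects y directly

# With 2 instruments and 1 endogenous regressor, compare with chi-square(1); 5% critical value is 3.84
print(f"both valid: {stat_valid:.2f}   one invalid: {stat_invalid:.2f}")
```

The statistic is large only when the instruments disagree with one another, which is why the test requires at least one instrument to be valid and cannot be run with a single instrument.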

42 Future IV Topics
The other three IV assumptions!
IV when the outcome is nonlinear (two-stage residual inclusion)
Other IV models when the endogenous regressor is dichotomous
IV with multiple endogenous regressors
Testing endogeneity (Hausman or augmented regression test)
Testing the strength of the instrument (in their 1997 paper, Staiger & Stock recommend a partial F statistic of at least 5)
Testing the exclusion restriction (Sargan, 1984)

43 Additional Slides

44 Does X Really Cause Y? (Dowd & Town)
Example of Omitted Variables Bias
[Figure: path diagram relating Type of Health Plan (X) to Health Care Expenditures (Y), with two unmeasured variables, Consumer Desire for Broad Coverage and Chronic Illness; path labels R, S, β_R, β_HP, β_S.]

45 Places to Look for Instruments (borrowed from Staiger)
Geography (distance, rivers, small area variation)
Legal/political institutions (laws, election dynamics)
Administrative/program rules (wage/staffing rules, reimbursement rules, eligibility rules, mandates)
Natural randomization (draft, birthdate, lottery, roommate assignment, weather)

46 IV Assumptions
Although we focus primarily on two of them, IV actually relies on 5 assumptions.
We illustrate these assumptions using the example of the effect of seeing a diabetes specialist on glycemic control, using the (relative) density of diabetes specialty providers in the patient's area of residence as the instrument.

47 Non-Zero Average Causal Effect
Formally: The instrument must predict the endogenous regressor, controlling for the other covariates.
Example: Controlling for the other covariates, provider density is a good predictor of whether the patient has seen a diabetes specialist.
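
Whether the instrument predicts the endogenous regressor after controlling for the covariates can be checked with a partial F statistic from the first-stage regression, which compares the fit with and without the instrument. The sketch below uses simulated data; the variable names, coefficients, and the helper `rss` are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(4)
n = 5_000
x1 = rng.normal(size=n)                         # other covariate
z = rng.normal(size=n)                          # instrument (e.g., provider density)
x2 = 0.3 * x1 + 0.2 * z + rng.normal(size=n)    # endogenous regressor

def rss(y, X):
    """Residual sum of squares from a least-squares fit of y on X."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    r = y - X @ coef
    return r @ r

ones = np.ones(n)
X_restricted = np.column_stack([ones, x1])      # first stage WITHOUT the instrument
X_full = np.column_stack([ones, x1, z])         # first stage WITH the instrument

q = 1                                           # number of instruments being tested
rss_r = rss(x2, X_restricted)
rss_f = rss(x2, X_full)
partial_f = ((rss_r - rss_f) / q) / (rss_f / (n - X_full.shape[1]))

print(f"partial F statistic for the instrument: {partial_f:.1f}")
```

A small partial F signals a weak instrument even when the instrument's coefficient is statistically significant in a very large sample.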

48 Exclusion Restriction
Formally: The instrument has only a negligible direct influence on the outcome after controlling for the covariates, that is, the instrument is uncorrelated with the error term.
Example: Controlling for whether the patient has seen a diabetes specialist and the other covariates, provider density does not predict glycemic control.

49 Monotonicity
Formally: The instrument cannot increase the value of the endogenous regressor for some subjects but decrease it for others.
Example: All respondents who would have seen a diabetes specialist if they lived in an area with low provider density would also have seen one if they lived in an area with high provider density.

50 Random Assignment
Formally: Subjects must be effectively randomized into the value for the IV within subgroups defined by the other covariates.
Example: Knowing a respondent's glycemic control does not yield any information about provider density in the respondent's area of residence.

51 Stable Unit Treatment Value Assumption
Formally: The outcome of one subject is not influenced by the value of the endogenous regressor for other subjects.
Example: The glycemic control of one subject is not influenced by whether other subjects see a diabetes specialist, and differences in the effectiveness of seeing a specialist are minor.


More information

Marno Verbeek Erasmus University, the Netherlands. Cons. Pros

Marno Verbeek Erasmus University, the Netherlands. Cons. Pros Marno Verbeek Erasmus University, the Netherlands Using linear regression to establish empirical relationships Linear regression is a powerful tool for estimating the relationship between one variable

More information

Regression Discontinuity Design

Regression Discontinuity Design Regression Discontinuity Design Regression Discontinuity Design Units are assigned to conditions based on a cutoff score on a measured covariate, For example, employees who exceed a cutoff for absenteeism

More information

Glossary From Running Randomized Evaluations: A Practical Guide, by Rachel Glennerster and Kudzai Takavarasha

Glossary From Running Randomized Evaluations: A Practical Guide, by Rachel Glennerster and Kudzai Takavarasha Glossary From Running Randomized Evaluations: A Practical Guide, by Rachel Glennerster and Kudzai Takavarasha attrition: When data are missing because we are unable to measure the outcomes of some of the

More information

Ec331: Research in Applied Economics Spring term, Panel Data: brief outlines

Ec331: Research in Applied Economics Spring term, Panel Data: brief outlines Ec331: Research in Applied Economics Spring term, 2014 Panel Data: brief outlines Remaining structure Final Presentations (5%) Fridays, 9-10 in H3.45. 15 mins, 8 slides maximum Wk.6 Labour Supply - Wilfred

More information

A NON-TECHNICAL INTRODUCTION TO REGRESSIONS. David Romer. University of California, Berkeley. January Copyright 2018 by David Romer

A NON-TECHNICAL INTRODUCTION TO REGRESSIONS. David Romer. University of California, Berkeley. January Copyright 2018 by David Romer A NON-TECHNICAL INTRODUCTION TO REGRESSIONS David Romer University of California, Berkeley January 2018 Copyright 2018 by David Romer CONTENTS Preface ii I Introduction 1 II Ordinary Least Squares Regression

More information

Complier Average Causal Effect (CACE)

Complier Average Causal Effect (CACE) Complier Average Causal Effect (CACE) Booil Jo Stanford University Methodological Advancement Meeting Innovative Directions in Estimating Impact Office of Planning, Research & Evaluation Administration

More information

11/18/2013. Correlational Research. Correlational Designs. Why Use a Correlational Design? CORRELATIONAL RESEARCH STUDIES

11/18/2013. Correlational Research. Correlational Designs. Why Use a Correlational Design? CORRELATIONAL RESEARCH STUDIES Correlational Research Correlational Designs Correlational research is used to describe the relationship between two or more naturally occurring variables. Is age related to political conservativism? Are

More information

Econometric Game 2012: infants birthweight?

Econometric Game 2012: infants birthweight? Econometric Game 2012: How does maternal smoking during pregnancy affect infants birthweight? Case A April 18, 2012 1 Introduction Low birthweight is associated with adverse health related and economic

More information

Final Exam - section 2. Thursday, December hours, 30 minutes

Final Exam - section 2. Thursday, December hours, 30 minutes Econometrics, ECON312 San Francisco State University Michael Bar Fall 2011 Final Exam - section 2 Thursday, December 15 2 hours, 30 minutes Name: Instructions 1. This is closed book, closed notes exam.

More information

26:010:557 / 26:620:557 Social Science Research Methods

26:010:557 / 26:620:557 Social Science Research Methods 26:010:557 / 26:620:557 Social Science Research Methods Dr. Peter R. Gillett Associate Professor Department of Accounting & Information Systems Rutgers Business School Newark & New Brunswick 1 Overview

More information

Assessing Studies Based on Multiple Regression. Chapter 7. Michael Ash CPPA

Assessing Studies Based on Multiple Regression. Chapter 7. Michael Ash CPPA Assessing Studies Based on Multiple Regression Chapter 7 Michael Ash CPPA Assessing Regression Studies p.1/20 Course notes Last time External Validity Internal Validity Omitted Variable Bias Misspecified

More information

Recent developments for combining evidence within evidence streams: bias-adjusted meta-analysis

Recent developments for combining evidence within evidence streams: bias-adjusted meta-analysis EFSA/EBTC Colloquium, 25 October 2017 Recent developments for combining evidence within evidence streams: bias-adjusted meta-analysis Julian Higgins University of Bristol 1 Introduction to concepts Standard

More information

CASE STUDY 2: VOCATIONAL TRAINING FOR DISADVANTAGED YOUTH

CASE STUDY 2: VOCATIONAL TRAINING FOR DISADVANTAGED YOUTH CASE STUDY 2: VOCATIONAL TRAINING FOR DISADVANTAGED YOUTH Why Randomize? This case study is based on Training Disadvantaged Youth in Latin America: Evidence from a Randomized Trial by Orazio Attanasio,

More information

Regression Discontinuity Analysis

Regression Discontinuity Analysis Regression Discontinuity Analysis A researcher wants to determine whether tutoring underachieving middle school students improves their math grades. Another wonders whether providing financial aid to low-income

More information

Bayesian versus maximum likelihood estimation of treatment effects in bivariate probit instrumental variable models

Bayesian versus maximum likelihood estimation of treatment effects in bivariate probit instrumental variable models Bayesian versus maximum likelihood estimation of treatment effects in bivariate probit instrumental variable models Florian M. Hollenbach Department of Political Science Texas A&M University Jacob M. Montgomery

More information

Political Science 15, Winter 2014 Final Review

Political Science 15, Winter 2014 Final Review Political Science 15, Winter 2014 Final Review The major topics covered in class are listed below. You should also take a look at the readings listed on the class website. Studying Politics Scientifically

More information

PLS 506 Mark T. Imperial, Ph.D. Lecture Notes: Reliability & Validity

PLS 506 Mark T. Imperial, Ph.D. Lecture Notes: Reliability & Validity PLS 506 Mark T. Imperial, Ph.D. Lecture Notes: Reliability & Validity Measurement & Variables - Initial step is to conceptualize and clarify the concepts embedded in a hypothesis or research question with

More information

Supplement 2. Use of Directed Acyclic Graphs (DAGs)

Supplement 2. Use of Directed Acyclic Graphs (DAGs) Supplement 2. Use of Directed Acyclic Graphs (DAGs) Abstract This supplement describes how counterfactual theory is used to define causal effects and the conditions in which observed data can be used to

More information

INTERNAL VALIDITY, BIAS AND CONFOUNDING

INTERNAL VALIDITY, BIAS AND CONFOUNDING OCW Epidemiology and Biostatistics, 2010 J. Forrester, PhD Tufts University School of Medicine October 6, 2010 INTERNAL VALIDITY, BIAS AND CONFOUNDING Learning objectives for this session: 1) Understand

More information

(b) empirical power. IV: blinded IV: unblinded Regr: blinded Regr: unblinded α. empirical power

(b) empirical power. IV: blinded IV: unblinded Regr: blinded Regr: unblinded α. empirical power Supplementary Information for: Using instrumental variables to disentangle treatment and placebo effects in blinded and unblinded randomized clinical trials influenced by unmeasured confounders by Elias

More information

An Instrumental Variable Consistent Estimation Procedure to Overcome the Problem of Endogenous Variables in Multilevel Models

An Instrumental Variable Consistent Estimation Procedure to Overcome the Problem of Endogenous Variables in Multilevel Models An Instrumental Variable Consistent Estimation Procedure to Overcome the Problem of Endogenous Variables in Multilevel Models Neil H Spencer University of Hertfordshire Antony Fielding University of Birmingham

More information

This exam consists of three parts. Provide answers to ALL THREE sections.

This exam consists of three parts. Provide answers to ALL THREE sections. Empirical Analysis and Research Methodology Examination Yale University Department of Political Science January 2008 This exam consists of three parts. Provide answers to ALL THREE sections. Your answers

More information

Statistical methods for assessing treatment effects for observational studies.

Statistical methods for assessing treatment effects for observational studies. University of Louisville ThinkIR: The University of Louisville's Institutional Repository Electronic Theses and Dissertations 5-2014 Statistical methods for assessing treatment effects for observational

More information

Measuring Impact. Program and Policy Evaluation with Observational Data. Daniel L. Millimet. Southern Methodist University.

Measuring Impact. Program and Policy Evaluation with Observational Data. Daniel L. Millimet. Southern Methodist University. Measuring mpact Program and Policy Evaluation with Observational Data Daniel L. Millimet Southern Methodist University 23 May 2013 DL Millimet (SMU) Observational Data May 2013 1 / 23 ntroduction Measuring

More information

Problem set 2: understanding ordinary least squares regressions

Problem set 2: understanding ordinary least squares regressions Problem set 2: understanding ordinary least squares regressions September 12, 2013 1 Introduction This problem set is meant to accompany the undergraduate econometrics video series on youtube; covering

More information

The Dynamic Effects of Obesity on the Wages of Young Workers

The Dynamic Effects of Obesity on the Wages of Young Workers The Dynamic Effects of Obesity on the Wages of Young Workers Joshua C. Pinkston University of Louisville June, 2015 Contributions 1. Focus on more recent cohort, NLSY97. Obesity

More information

Multiple Regression Analysis

Multiple Regression Analysis Multiple Regression Analysis Basic Concept: Extend the simple regression model to include additional explanatory variables: Y = β 0 + β1x1 + β2x2 +... + βp-1xp + ε p = (number of independent variables

More information

MEA DISCUSSION PAPERS

MEA DISCUSSION PAPERS Inference Problems under a Special Form of Heteroskedasticity Helmut Farbmacher, Heinrich Kögel 03-2015 MEA DISCUSSION PAPERS mea Amalienstr. 33_D-80799 Munich_Phone+49 89 38602-355_Fax +49 89 38602-390_www.mea.mpisoc.mpg.de

More information

Evaluating Social Programs Course: Evaluation Glossary (Sources: 3ie and The World Bank)

Evaluating Social Programs Course: Evaluation Glossary (Sources: 3ie and The World Bank) Evaluating Social Programs Course: Evaluation Glossary (Sources: 3ie and The World Bank) Attribution The extent to which the observed change in outcome is the result of the intervention, having allowed

More information

ECON Microeconomics III

ECON Microeconomics III ECON 7130 - Microeconomics III Spring 2016 Notes for Lecture #5 Today: Difference-in-Differences (DD) Estimators Difference-in-Difference-in-Differences (DDD) Estimators (Triple Difference) Difference-in-Difference

More information

GUIDE 4: COUNSELING THE UNEMPLOYED

GUIDE 4: COUNSELING THE UNEMPLOYED GUIDE 4: COUNSELING THE UNEMPLOYED Addressing threats to experimental integrity This case study is based on Sample Attrition Bias in Randomized Experiments: A Tale of Two Surveys By Luc Behaghel, Bruno

More information

26:010:557 / 26:620:557 Social Science Research Methods

26:010:557 / 26:620:557 Social Science Research Methods 26:010:557 / 26:620:557 Social Science Research Methods Dr. Peter R. Gillett Associate Professor Department of Accounting & Information Systems Rutgers Business School Newark & New Brunswick 1 Overview

More information

Cross-Lagged Panel Analysis

Cross-Lagged Panel Analysis Cross-Lagged Panel Analysis Michael W. Kearney Cross-lagged panel analysis is an analytical strategy used to describe reciprocal relationships, or directional influences, between variables over time. Cross-lagged

More information

Write your identification number on each paper and cover sheet (the number stated in the upper right hand corner on your exam cover).

Write your identification number on each paper and cover sheet (the number stated in the upper right hand corner on your exam cover). STOCKHOLM UNIVERSITY Department of Economics Course name: Empirical methods 2 Course code: EC2402 Examiner: Per Pettersson-Lidbom Number of credits: 7,5 credits Date of exam: Sunday 21 February 2010 Examination

More information

Endogeneity is a fancy word for a simple problem. So fancy, in fact, that the Microsoft Word spell-checker does not recognize it.

Endogeneity is a fancy word for a simple problem. So fancy, in fact, that the Microsoft Word spell-checker does not recognize it. Jesper B Sørensen August 2012 Endogeneity is a fancy word for a simple problem. So fancy, in fact, that the Microsoft Word spell-checker does not recognize it. Technically, in a statistical model you have

More information

Quasi-experimental analysis Notes for "Structural modelling".

Quasi-experimental analysis Notes for Structural modelling. Quasi-experimental analysis Notes for "Structural modelling". Martin Browning Department of Economics, University of Oxford Revised, February 3 2012 1 Quasi-experimental analysis. 1.1 Modelling using quasi-experiments.

More information

EPI 200C Final, June 4 th, 2009 This exam includes 24 questions.

EPI 200C Final, June 4 th, 2009 This exam includes 24 questions. Greenland/Arah, Epi 200C Sp 2000 1 of 6 EPI 200C Final, June 4 th, 2009 This exam includes 24 questions. INSTRUCTIONS: Write all answers on the answer sheets supplied; PRINT YOUR NAME and STUDENT ID NUMBER

More information

Lecture 14: Adjusting for between- and within-cluster covariates in the analysis of clustered data May 14, 2009

Lecture 14: Adjusting for between- and within-cluster covariates in the analysis of clustered data May 14, 2009 Measurement, Design, and Analytic Techniques in Mental Health and Behavioral Sciences p. 1/3 Measurement, Design, and Analytic Techniques in Mental Health and Behavioral Sciences Lecture 14: Adjusting

More information

Version No. 7 Date: July Please send comments or suggestions on this glossary to

Version No. 7 Date: July Please send comments or suggestions on this glossary to Impact Evaluation Glossary Version No. 7 Date: July 2012 Please send comments or suggestions on this glossary to 3ie@3ieimpact.org. Recommended citation: 3ie (2012) 3ie impact evaluation glossary. International

More information

11/24/2017. Do not imply a cause-and-effect relationship

11/24/2017. Do not imply a cause-and-effect relationship Correlational research is used to describe the relationship between two or more naturally occurring variables. Is age related to political conservativism? Are highly extraverted people less afraid of rejection

More information

Dylan Small Department of Statistics, Wharton School, University of Pennsylvania. Based on joint work with Paul Rosenbaum

Dylan Small Department of Statistics, Wharton School, University of Pennsylvania. Based on joint work with Paul Rosenbaum Instrumental variables and their sensitivity to unobserved biases Dylan Small Department of Statistics, Wharton School, University of Pennsylvania Based on joint work with Paul Rosenbaum Overview Instrumental

More information

The Effects of Autocorrelated Noise and Biased HRF in fmri Analysis Error Rates

The Effects of Autocorrelated Noise and Biased HRF in fmri Analysis Error Rates The Effects of Autocorrelated Noise and Biased HRF in fmri Analysis Error Rates Ariana Anderson University of California, Los Angeles Departments of Psychiatry and Behavioral Sciences David Geffen School

More information

Key questions when starting an econometric project (Angrist & Pischke, 2009):

Key questions when starting an econometric project (Angrist & Pischke, 2009): Econometric & other impact assessment approaches to policy analysis Part 1 1 The problem of causality in policy analysis Internal vs. external validity Key questions when starting an econometric project

More information

Welcome to this series focused on sources of bias in epidemiologic studies. In this first module, I will provide a general overview of bias.

Welcome to this series focused on sources of bias in epidemiologic studies. In this first module, I will provide a general overview of bias. Welcome to this series focused on sources of bias in epidemiologic studies. In this first module, I will provide a general overview of bias. In the second module, we will focus on selection bias and in

More information

Measuring the Impacts of Teachers: Reply to Rothstein

Measuring the Impacts of Teachers: Reply to Rothstein Measuring the Impacts of Teachers: Reply to Rothstein Raj Chetty, Harvard University John Friedman, Brown University Jonah Rockoff, Columbia University February 2017 Abstract Using data from North Carolina,

More information

Problem Set 5 ECN 140 Econometrics Professor Oscar Jorda. DUE: June 6, Name

Problem Set 5 ECN 140 Econometrics Professor Oscar Jorda. DUE: June 6, Name Problem Set 5 ECN 140 Econometrics Professor Oscar Jorda DUE: June 6, 2006 Name 1) Earnings functions, whereby the log of earnings is regressed on years of education, years of on-the-job training, and

More information

A COMPARISON OF IMPUTATION METHODS FOR MISSING DATA IN A MULTI-CENTER RANDOMIZED CLINICAL TRIAL: THE IMPACT STUDY

A COMPARISON OF IMPUTATION METHODS FOR MISSING DATA IN A MULTI-CENTER RANDOMIZED CLINICAL TRIAL: THE IMPACT STUDY A COMPARISON OF IMPUTATION METHODS FOR MISSING DATA IN A MULTI-CENTER RANDOMIZED CLINICAL TRIAL: THE IMPACT STUDY Lingqi Tang 1, Thomas R. Belin 2, and Juwon Song 2 1 Center for Health Services Research,

More information

A re-randomisation design for clinical trials

A re-randomisation design for clinical trials Kahan et al. BMC Medical Research Methodology (2015) 15:96 DOI 10.1186/s12874-015-0082-2 RESEARCH ARTICLE Open Access A re-randomisation design for clinical trials Brennan C Kahan 1*, Andrew B Forbes 2,

More information

Module 14: Missing Data Concepts

Module 14: Missing Data Concepts Module 14: Missing Data Concepts Jonathan Bartlett & James Carpenter London School of Hygiene & Tropical Medicine Supported by ESRC grant RES 189-25-0103 and MRC grant G0900724 Pre-requisites Module 3

More information

Cancer survivorship and labor market attachments: Evidence from MEPS data

Cancer survivorship and labor market attachments: Evidence from MEPS data Cancer survivorship and labor market attachments: Evidence from 2008-2014 MEPS data University of Memphis, Department of Economics January 7, 2018 Presentation outline Motivation and previous literature

More information

Experiments. ESP178 Research Methods Dillon Fitch 1/26/16. Adapted from lecture by Professor Susan Handy

Experiments. ESP178 Research Methods Dillon Fitch 1/26/16. Adapted from lecture by Professor Susan Handy Experiments ESP178 Research Methods Dillon Fitch 1/26/16 Adapted from lecture by Professor Susan Handy Recap Causal Validity Criterion Association Non-spurious Time order Causal Mechanism Context Explanation

More information

Pooling Subjective Confidence Intervals

Pooling Subjective Confidence Intervals Spring, 1999 1 Administrative Things Pooling Subjective Confidence Intervals Assignment 7 due Friday You should consider only two indices, the S&P and the Nikkei. Sorry for causing the confusion. Reading

More information

Chapter 1: Explaining Behavior

Chapter 1: Explaining Behavior Chapter 1: Explaining Behavior GOAL OF SCIENCE is to generate explanations for various puzzling natural phenomenon. - Generate general laws of behavior (psychology) RESEARCH: principle method for acquiring

More information

INTRODUCTION TO ECONOMETRICS (EC212)

INTRODUCTION TO ECONOMETRICS (EC212) INTRODUCTION TO ECONOMETRICS (EC212) Course duration: 54 hours lecture and class time (Over three weeks) LSE Teaching Department: Department of Economics Lead Faculty (session two): Dr Taisuke Otsu and

More information

Genetic instrumental variable regression: Explaining socioeconomic and health outcomes in nonexperimental data

Genetic instrumental variable regression: Explaining socioeconomic and health outcomes in nonexperimental data Genetic instrumental variable regression: Explaining socioeconomic and health outcomes in nonexperimental data Thomas A. DiPrete a,1,2, Casper A. P. Burik b,1, and Philipp D. Koellinger b,1,2 a Department

More information

EXAMINING THE EDUCATION GRADIENT IN CHRONIC ILLNESS

EXAMINING THE EDUCATION GRADIENT IN CHRONIC ILLNESS EXAMINING THE EDUCATION GRADIENT IN CHRONIC ILLNESS PINKA CHATTERJI, HEESOO JOO, AND KAJAL LAHIRI Department of Economics, University at Albany: SUNY February 6, 2012 This research was supported by the

More information

DRAFT (Final) Concept Paper On choosing appropriate estimands and defining sensitivity analyses in confirmatory clinical trials

DRAFT (Final) Concept Paper On choosing appropriate estimands and defining sensitivity analyses in confirmatory clinical trials DRAFT (Final) Concept Paper On choosing appropriate estimands and defining sensitivity analyses in confirmatory clinical trials EFSPI Comments Page General Priority (H/M/L) Comment The concept to develop

More information

Dichotomizing partial compliance and increased participant burden in factorial designs: the performance of four noncompliance methods

Dichotomizing partial compliance and increased participant burden in factorial designs: the performance of four noncompliance methods Merrill and McClure Trials (2015) 16:523 DOI 1186/s13063-015-1044-z TRIALS RESEARCH Open Access Dichotomizing partial compliance and increased participant burden in factorial designs: the performance of

More information