This content downloaded from on Mon, 2 Sep :55:20 PM All use subject to JSTOR Terms and Conditions

Size: px
Start display at page:

Download "This content downloaded from on Mon, 2 Sep :55:20 PM All use subject to JSTOR Terms and Conditions"

Transcription

1 Causal Effects in Nonexperimental Studies: Reevaluating the Evaluation of Training Programs Author(s): Rajeev H. Dehejia and Sadek Wahba Source: Journal of the American Statistical Association, Vol. 94, No. 448 (Dec., 1999), pp Published by: American Statistical Association Stable URL: Accessed: 2/9/213 18:55 Your use of the JSTOR archive indicates your acceptance of the Terms & Conditions of Use, available at. JSTOR is a not-for-profit service that helps scholars, researchers, and students disver, use, and build upon a wide range of ntent in a trusted digital archive. We use information technology and tools to increase productivity and facilitate new forms of scholarship. For more information about JSTOR, please ntact support@jstor.org.. American Statistical Association is llaborating with JSTOR to digitize, preserve and extend access to Journal of the American Statistical Association.

2 Causal Effects in Nonexperimental Studies: Reevaluating the Evaluation of Training Programs Rajeev H. DEHEJIA and Sadek WAHBA This article uses propensity sre methods to estimate the treatment impact of the National Supported Work (NSW) Demonstration, a labor training program, on postintervention earnings. We use data from Lalonde's evaluation of nonexperimental methods that mbine the treated units from a randomized evaluation of the NSW with nonexperimental mparison units drawn from survey datasets. We apply propensity sre methods to this mposite dataset and demonstrate that, relative to the estimators that Lalonde evaluates, propensity sre estimates of the treatment impact are much closer to the experimental benchmark estimate. Propensity sre methods assume that the variables associated with assignmento treatment are observed (referred to as ignorable treatment assignment, or selection on observables). Even under this assumption, it is difficult to ntrol for differences between the treatment and mparison groups when they are dissimilar and when there are many preintervention variables. The estimated propensity sre (the probability of assignmento treatment, nditional on preintervention variables) summarizes the preintervention variables. This offers a diagnostic on the mparability of the treatment and mparison groups, because one has only to mpare the estimated propensity sre across the two groups. We discuss several methods (such as stratification and matching) that use the propensity sre to estimate the treatment impact. When the range of estimated propensity sres of the treatment and mparison groups overlap, these methods can estimate the treatment impact for the treatment group. A sensitivity analysis shows that our estimates are not sensitive to the specification of the estimated propensity sre, but are sensitive to the assumption of selection on observables. We nclude that when the treatment and mparison groups overlap, and when the variables determining assignment to treatment are observed, these methods provide a means to estimate the treatment impact. Even though propensity sre methods are not always applicable, they offer a diagnostic on the quality of nonexperimental mparison groups in terms of observable preintervention variables. KEY WORDS: Matching; Program evaluation; Propensity sre. 1. INTRODUCTION and Hotz 1989; Manski, Sandefur, McLanahan, and Powers This article discusses the estimation of treatment effects 1992). in observational studies. This issue has been the focus of In this article we apply propensity sre methods (Rosenmuch attention because randomized experiments cannot albaum and Rubin 1983) to Lalonde's dataset. The propenways be implemented and has been addressed inter alia by sity sre is defined as the probability of assignment to Lalonde (1986), whose data we use herein. Lalonde estitreatment, nditional on variates. Propensity sre methmated the impact of the National Supported Work (NSW) ods focus on the mparability of the treatment and non- Demonstration, a labor training program, on postintervenexperimental mparison groups in terms of preintervention inme levels. He used data from a randomized evaltion variables. Controlling for differences in preintervenuation of the program and examined the extent to which tion variables is difficult when the treatment and mparison groups are dissimilar and when there are many preintervennonexperimental estimators can replicate the unbiased extion variables. The estimated propensity sre, a single variperimental estimate of the treatment impact when applied able on the unit interval that summarizes the preintervento a mposite dataset of experimental treatment units and tion variables, can ntrol for differences between the treatnonexperimental mparison units. He ncluded that stanment and nonexperimental mparison groups. When we dard nonexperimental estimatorsuch as regression, fixedapply these methods to Lalonde's nonexperimental data for effects, and latent variable selection models are either ina range of propensity sre specifications and estimators, accurate relative to the experimental benchmark or sensiwe obtain estimates of the treatment impact that are much tive to the specification used in the regression. Lalonde's closer to the experimental treatment effec than Lalonde's results have been influential renewing the debate on exnonexperimental estimates. perimental versus nonexperimental evaluations (see Manski The assumption underlying this method is that assignand Garfinkel 1992) and in spurring a search for alternament to treatment is associated only with observable preintive estimators and specification tests (see, e.g., Heckman tervention variables, called the ignorable treatment assignment assumption or selection on observables (see Heckman Rajeev Dehejia is Assistant Professor, Department of Enomics and and Robb 1985; Holland 1986; Rubin 1974, 1977, 1978). SIPA, Columbia University, New York, NY 127 ( dehejia@ Although this is a strong assumption, we demonstrate that lumbia.edu). Sadek Wahba is Vice President, Morgan Stanley & Co. Inpropensity sre methods are an informative starting point, rporated, New York, NY 136 ( wahbas@ms.m). This work was partially supported by a grant from the Social Sciences and Hu- because they quickly reveal the extent of overlap in the manities Research Council of Canada (first author) and a World Bank treatment and mparison groups in terms of preinterven- Fellowship (send author). The authors gratefully acknowledge an asso- tion variables. ciate editor, two anonymous referees, Gary Chamberlain, Guido Imbens, and Donald Rubin, whose detailed mments and suggestions greatly improved the article. They thank Robert Lalonde for providing the data from his 1986 study and substantial help in recreating the original dataset, and? 1999 American Statistical Association also thank Joshua Angrist, George Cave, David Cutler, Lawrence Katz, Journal of the American Statistical Association Caroline Minter-Hoxby, and Jeffrey Smith. December 1999, Vol. 94, No. 448, Applications and Case Studies 153

3 154 Journal of the American Statistical Association, December 1999 The article is organized as follows. Section 2 reviews employed prior to randomization. Selection of this subset is Lalonde's data and reproduces his results. Section 3 identi- based only on preintervention variables (month of assignfies the treatment effect under the potential outmes causal ment and employment history). Assuming that the initial model and discusses estimation strategies for the treatment randomization was independent of preintervention varieffect. Section 4 applies our methods to Lalonde's dataset, ates, the subset retains a key property of the full experimenand Section 5 discusses the sensitivity of the results to the tal data: The treatment and ntrol groups have the same methodology. Section 6 ncludes the article. distribution of preintervention variables, although this distribution uld differ from the distribution of variates for 2. LALONDE'S RESULTS the larger sample. A difference in means remains an unbiased estimator of the average treatment impact for the 2.1 The Data reduced sample. The subset includes 185 treated and 26 The NSW Demonstration [Manpower Demonstration Re- ntrol observations. search Corporation (MDRC) 1983] was a federally and pri- We presen the preintervention characteristics of the origvately funded program implemented in the mid-197s to inal sample and of our subset in the first four rows of Table provide work experience for a period of 6-18 months to 1. Our subset differs from Lalonde's original sample, espeindividuals who had faced enomic and social problems cially in terms of 1975 earnings; this is a nsequence both prior to enrollment in the program. Those randomly se- of the hort phenomenon and of the fact that our sublected to join the program participated in various types of sample ntains more individuals who were unemployed work, such as restaurant and nstruction work. Informa- prior to program participation. The distribution of preintertion on preintervention variables (preintervention earnings vention variables is very similar across the treatment and as well as education, age, ethnicity, and marital status) was ntrol groups for each sample; none of the differences is obtained from initial surveys and Social Security Admin- significantly different from at a 5% level of significance, istration rerds. Both the treatment and ntrol groups with the exception of the indicator for "no degree". participated in follow-up interviews at specific intervals. Lalonde's nonexperimental estimates of the treatment ef- Lalonde (1986) offered a separate analysis of the male and fect are based on two distinct mparison groups: the Panel female participants. In this article we focus on the male par- Study of Inme Dynamics (PSID-1) and Westat's Matched ticipants, as estimates for this group were the most sensitive Current Population Survey-Social Security Administration to functional-form specification, as indicated by Lalonde. File (CPS-1). Table 1 presents the preintervention charac- Candidates eligible for the NSW were randomized into teristics of the mparison groups. It is evident that both the program between March 1975 and July One n- PSID-1 and CPS-1 differ dramatically from the treatment sequence of randomization over a 2-year period was that group in terms of age, marital status, ethnicity, and preinindividuals who joined early in the program had different tervention earnings; all of the mean differences are signifcharacteristics than those who entered later; this is referred icantly different from well beyond a 1% level of signifto as the "hort phenomenon" (MDRC 1983, p. 48). An- icance, except the indicator for "Hispanic". To bridge the other nsequence is that data from the NSW are delin- gap between the treatment and mparison groups in terms eated in terms of experimental time. Lalonde annualized of preintervention characteristics, Lalonde extracted subsets earnings data from the experiment because the nonexperi- from PSID-1 and CPS-1 (denoted PSID-2 and -3 and CPS-2 mental mparison groups that he used (discussed later) are and -3) that resemble the treatment group in terms of single delineated in calendar time. By limiting himself to those preintervention characteristics (such as age or employment assigned to treatment after December 1975, Lalonde en- status; see Table 1). Table 1 reveals that the subsets resured that retrospectivearnings information from the ex- main statistically substantially different from the treatment periment included calendar 1975 earnings, which he then group; the mean differences in age, ethnicity, marital status, used as preintervention earnings. By likewise limiting him- and earnings are smaller but remain statistically significant self to those who were no longer participating in the pro- at a 1% level. gram by January 1978, he ensured that the postintervention data included calendar 1978 earnings, which he took to be the outme of interest. Earnings data for both these 2.2 years Lalonde's Results are available for both nonexperimental mparison groups. Because our analysis in Section 4 uses a subset of This reduces the NSW sample to 297 treated observations Lalonde's original data and an additional variable (1974 and 425 ntrol observations for male participants. earnings), in Table 2 we reproduce Lalonde's results us- However, it is importanto look at several years of prein- ing his original data and variables (Table 2, panel A), and tervention earnings in determining the effect of job training then apply the same estimators to our subset of his data programs (Angrist 199, 1998; Ashenfelter 1978; Ashen- both without and with the additional variable (Table 2, panfelter and Card 1985; Card and Sullivan 1988). Thus we els B and C). We show that when his analysis is applied further limit ourselves to the subset of Lalonde's NSW data to the data and variables that we use, his basic nclusions for which 1974 earnings can be obtained: those individuals remain unchanged. In Section 5 we discuss the sensitivity who joined the program early enough for the retrospective of our propensity sre results to dropping the additional earnings information to include 1974, as well as those indi- earnings data. In his article, Lalonde nsidered linear reviduals who joined later but were known to have been un- gression, fixed-effects, and latent variable selection models

4 Dehejia and Wahba: Causal Effects in Nonexperimental Studies 155 Table 1. Sample Means of Characteristics for NSW and Comparison Samples No. of observations Age Education Black Hispanic No degree Married RE74 (U.S. $) RE75 (U.S. $) NSW/Lalonde:a Treated ,66 (.32) (.9) (.2) (.1) (.2) (.2) (236) Control ,26 (.32) (.8) (.2) (.2) (.2) (.2) (252) RE74 subset:b Treated ,96 1,532 (.35) (.1) (.2) (.1) (.2) (.2) (237) (156) Control ,17 1,267 (.34) (.8) (.2) (.2) (.2) (.2) (276) (151) Comparison groups:c PSID-1 2, ,429 19,63 [.78] [.23] [.3] [.1] [.4] [.3] [991] [1,2] PSID ,27 7,569 [1.] [.27] [.4] [.2] [.5] [.4] [853] [695] PSID ,566 2,611 [1.17] [.29] [.5] [.3] [.5] [.5] (686) [499] CPS-1 15, ,16 13,65 [.81] [.21] [.2] [.2] [.3] [.3] [75] [682] CPS-2 2, ,728 7,397 [.87] [.19] [.2] [.2] [.4] [.4] [667] [6] CPS ,619 2,467 [.87] [.23] [.3] [.3] [.4] [.4] [552] [288] NOTE: Standard errors are in parentheses. Standard error on difference in means with RE74 subset/treated is given in brackets. Age = age in years; Education = number of years of schooling; Black = 1 if black, otherwise; Hispanic = 1 if Hispanic, otherwise; No degree = 1 if no high school degree, otherwise; Married = 1 if married, otherwise; REx = earnings in calendar year 19x. a NSW sample as nstructed by Lalonde (1986). bthe subset of the Lalonde sample for which RE74 is available. c Definition of mparison groups (Lalonde 1986): PSID-1: All male household heads under age 55 who did not classify themselves as retired in PSID-2: Selects from PSID-1 all men who were not working when surveyed in the spring of PSID-3: Selects from PSID-2 all men who were not working in CPS-1: All CPS males under age 55. CPS-2: Selects from CPS-1 all males who were not working when surveyed in March CPS-3: Selects from CPS-2 all the unemployed males in 1976 whose inme in 1975 was below the poverty level. PSID1-3 and CPS-1 are identical to those used by Lalonde. CPS2-3 are similar to those used by Lalonde, but Lalonde's original subset uld not be recreated. of the treatment impact. Because our analysis focuses on panel B, is that the regression specifications and mparithe importance of preintervention variables, we focus on son groups fail to replicate the treatment impact. the first of these. Including 1974 earnings as an additional variable in the Table 2, panel A, reproduces the results of Lalonde (1986, regressions in Table 2, panel C does not alter Lalonde's basic message, although the estimates improve mpared Table 5). Comparing Panels A and B, we note that the treatto those in panel B. In lumns (1) and (3), many estimates ment effect, as estimated from the randomized experiment, remain negative, but less so than in panel B. In lumn (2) is higher in the latter ($1,794 mpared to $886). This rethe estimates for PSID-1 and CPS-1 are negative, but the flects differences in the mposition of the two samples, as estimates for the subsets improve. In lumns (4) and (5) discussed in the previous section: A higher treatment ef- the estimates are closer to the experimental benchmark than fect is obtained for those who joined the program earlier or in panel B, off by about $1, for PSID1-3 and CPS1-2 who were unemployed prior to program participation. The and by $4 for CPS-3. Overall, the results closest to the results in terms of the success of nonexperimental estimates experimental benchmark in Table 2 are for CPS-3, panel C. are qualitatively similar across the two samples. The sim- This raises a number of issues. The strategy of nsidering ple difference in means, reported in lumn (1), yields neg- subsets of the mparison group improves estimates of the ative treatment effects for the CPS and PSID mparison treatment effect relative to the benchmark. However, Table 1 reveals that significant differences remain between the groups in both samples (except PSID-3). The fixed-effectsmparison groups and the treatment group. These subsets type differencing estimator in the third lumn fares someare created based on one or two preintervention variables. what better, although many estimates are still negative or In Sections 3 and 4 we show that propensity sre methods deteriorate when we ntrol for variates in both panels. provide a systematic means of creating such subsets. The estimates in the fifth lumn are closest to the experimental estimate, nsistently closer than those in the sec- 3. IDENTIFYING AND ESTIMATING THE AVERAGE ond lumn, which do not ntrol for earnings in The TREATMENT EFFECT treatment effect is underestimated by about $1, for the 3.1 Identification CPS mparison groups and by $1,5 for the PSID groups. Let Yi, representhe value of the outme when unit i is Lalonde's nclusion from panel A, which also holds in exposed to regime 1 (called treatment), and let Yio represent

5 156 Journal of the American Statistical Association, December 1999 Q3 Lo -1 LO Lo 'I cm q q 'It CI) CC) CC) N N r, L O-) Lo C').) f: ui.) =3 CC) r' n C, C') N w m w 11- w (D C'i :3 (3) E2 CL C) CM E Lo C) (D I- C - =3 CZ CO ') CM C) C) 1 "- ; "t "t (O CI) ') ') C) Z CC) W N CO C"L b CL LL] 6 2LI i5 LO CM Lo CL :3 N LO O' C- C) =3 C) L6 L '2 (D C) C) - rl- (o Lo o C) (D =3 2 o Lo a w o C').) C) (D C) '('O F, N r' LO CO CQL 72 LL] 'r, (D ; W LO 11 - CI) C\j ') N rl- (D CM CM LO LO CI) (D CO, Q4) C? zi- z zti O- 7 7 C) C\j :;- _- a- LO ia C) i:::, CC) i:::, (Co') C 't to :3 r M Lo I'* Fi C, rl- ). rl- C) CM r ') "t ) (o JR). rl- w ui 4). CL E CM j:z: 'It LO - CM - - :;' C) Lo - w r'' r" LO CI) 't Ir - C) CI) C'q C'j -6 (D r- N N r".' CZ LU U =3 LU C3) m, - 6 ja a 2 CZ LO O- CO - r- CM gt r' LO ON "t a a m N :;: LO (D (D C) "t - R r- CI) (D U-) :3 Co C: ) ') (D I'- C') (D a) 6- cli E C) rl- CE2 - It (O ') CO 't 1, rlo I'-3 "t CI) L) cq (o CIZ LO to ) LO LO C\J r- C1 (o W "t 11- W ) LO C\j Q R -6 Oa" 4 E2 COO C: a) E =3 CM cl) CM ') LO i:::, (O C) (D C\i "t r- LO (D LO LO CM E r- ') CM ') E 23 :3 LO a) 16 CL Q)? o ztj E p ui w 14 a) C) CI) ') a) ro- Qp) E r o, izz, "t "t C) ro-) w M- m C'l N N Lo LQ N (q Lo C\L CI) oo (o r'- CC) -r- =3 C\j (o r-, ` Lo -o a) =3 Cr -o Lo? Sn a).3 - CO =3 E 'o E 2-6 ui C'z (O ) r'- q r- C\j (O ') "t Co Lo (o CO ) r- t w CC - =3 -o - (CD: 'D 5 'V5 C: c (n C\j S S -a 2 - Co. :L- :3 ) Lo (n m r'- C'l U') V 2 ) a) w -t 2 'Z.9 i5 6 -r- - -r-

6 Dehejia and Wahba: Causal Effects in Nonexperimental Studies 157 the value of the outme when unit i is exposed to regime (called ntrol). Only one of Yio or Yi1 can be observed for any unit, because one cannot observe the same unit under both treatment and ntrol. Let Ti be a treatment indicator (1 if exposed to treatment, otherwise). Then the observed outme for unit i is Yi = TiTYi + (1-Tj)Yio. The treatment effect for unit i is Ti = Yi-Yio. In an experimental setting where assignmento treatment is randomized, the treatment and ntrol groups are drawn from the same population. The average treatment effect for this population is T= E(Yi) - E(Yio). But randomization implies that {Yi, I Yio IL Ti} [using Dawid's (1979) notation, H represents independence], so that for j,1, and E(YijlTi = 1) = E(YijlTi = ) = E(YijTi = j) T E(Yi1 ITi 1) - E(Yio ITi = ) = E(YjjTi =1)-E(YijTi =?), which is readily estimated. In an observational study, the treatment and mparison -E(YilTi =,p(xi)) Ti = 1}, (2) groups are often drawn from different populations. In our assuming that the expectations are defined. The outer exapplication the treatment group is drawn from the popu- pectation is over the distribution of p(xi)iti = 1. lation of interest: welfare recipients eligible for the pro- One intuition for the propensity sre is that whereas in gram. The (nonexperimental) mparison group is drawn (1) we are trying to ndition on Xi (intuitively, to find obfrom a different population. (In our application both the servations with similar variates), in (2) we are trying to CPS and PSID are more representative of the general U.S. ndition just on the propensity sre, because the propopopulation.) Thus the treatment effect that we are trying sition implies that observations with the same propensity to identify is the average treatment effect for the treated sre have the same distribution of the full vector of population, variates, Xi. T T=l= E(Yi, ITi = 1) - E(Yio ITi = 1). 3.2 The Estimation Strategy Estimation is done in two steps. First, we estimate the This expression cannot be estimated directly, because Yio is propensity sre separately for each nonexperimental samnot observed for treated units. Assuming selection on ob- ple nsisting of the experimental treatment units and the servable variates, Xi-namely, {Yil, Yio HL Ti} Xi (Rubin specified set of mparison units (PSID1-3 or CPS 1-3). We 1974, 1977)-we obtain use a logistic probability model, but other standard models E (Yij I Xi, Ti = 1) = E (Yij I Xi, Ti = O) = E (Yi I Xi, Ti = j) for j =, 1. Conditional on the observables, Xi, there is yield similar results. One issue is what functional form of the preintervention variables to include in the logit. We rely on the following proposition. no systematic pretreatment difference between the groups Proposition 2 (Rosenbaum and Rubin 1983). If p(xi) assigned to treatment and ntrol. This allows us to identify is the propensity sre, then the treatment effect for the treated, Xi H Ti lp(xi) T1T=1 = EfE(YilXi,Ti = 1)-E(YilXi,Ti = O)jTi = 1}, (1) where the outer expectation is over the distribution of XilTi = 1, the distribution of preintervention variables in the treated population. In our application we have both an experimental ntrol group and a nonexperimental mparison group. Because the former is drawn from the population of interest along with the treated group, we enomize on notation and use Ti=1 to represen the entire group of interest and use i= to representhe nonexperimental group. Thus in (1) the expectation is over the distribution of Xi for the NSW population. One method for estimating the treatment effecthat stems from (1) is estimating E(Yi IXi, Ti = 1) and E(Yi IXi, Ti = ) as two nonparametric equations. This estimation strategy bemes difficult, however, if the variates, Xi, are high dimensional. The propensity sre theorem provides an intermediate step. Proposition ] (Rosenbaum and Rubin 1983). Let p(xi) be the probability of unit i having been assigned to treatment, defined as p(xi) _ Pr(Ti = 1JXi) = E(TilXi). Assume that < p(xi) < 1, for all Xi, and Pr(Tl, T2,... TNIX1,X2,... XN) = f.i=1 N P(Xi)Ti (1 p(xi))(1-ti) for the N units in the sample. Then {(Yi,Yio) II Ti} Xi =- {(Y 1,Yio) II Ti}lp(Xi). Corollary. If {(Yil, Yio) H Ti} IXi and the assumptions of Proposition 1 hold, then T T=l = EfE(YilTi = l,p(xi)) Proposition 2 asserts that, nditional on the propensity sre, the variates are independent of assignmento treatment, so that for observations with the same propensity sre, the distribution of variates should be the same across the treatment and mparison groups. Conditioning on the propensity sre, each individual has the same probability of assignmento treatment, as in a randomized experiment. We use this proposition to assess estimates of the propensity sre. For any given specification (we start by introducing the variates linearly), we group observations into strata defined on the estimated propensity sre and check

7 158 Journal of the American Statistical Association, December 1999 whether we succeed in balancing the variates within each ing nonparametric techniques. Hence we use a parametric stratum. We use tests for the statistical significance of dif- model for the propensity sre. This is preferable to apferences in the distribution of variates, focusing on first plying a parametric model directly to (1) because, as we and send moments (see Rosenbaum and Rubin 1984). If will see, the results are less sensitive to the logit specifithere are no significant differences between the two groups cation than regression models, such as those in Table 2. within each stratum, then we accept the specification. If Finally, depending on the estimator that one adopts (e.g., there are significant differences, then we add higher-order stratification), a precise estimate of the propensity sre is terms and interactions of the variates until this ndition not required. The process of validating the propensity sre is satisfied. In Section 5 we demonstrate that the results are estimate produces at least one partition structure that balnot sensitive to the selection of higher-order and interaction ances preintervention variates across the treatment and variables. mparison groups within each stratum, which, by (1), is In the send step, given the estimated propensity sre, all that is needed for an unbiased estimate of the treatment we need to estimate a univariate nonparametric regres- impact. sion, E(YjTi =T j,p(xi)), for j =,1. We focus on simple methods for obtaining a flexible functional form- 4. RESULTS USING THE PROPENSITY SCORE stratification and matching-but in principle one uld use Using the method outlined in the previous section, we any of the standard array of nonparametric techniques (see, separately estimate the propensity sre for each sample e.g., Hardle and Linton 1994; Heckman, Ichimura, and Todd of mparison units and treatment units. Figures 1 and ). present histograms of the estimated propensity sres for With stratification, observations are sorted from lowest to the treatment and PSID-1 and CPS-1 mparison groups. highest estimated propensity sre. We discard the mpar- Most of the mparison units (1,333 of a total of 2,49 ison units with an estimated propensity sre less than the PSID-1 units and 12,611 of 15,992 CPS-1 units) are disminimum (or greater than the maximum) estimated propen- carded because their estimated propensity sres are less sity sre for treated units. The strata, defined on the es- than the minimum for the treatment units. Even then, the timated propensity sre, are chosen so that the variates first bin (units with an estimated propensity sre of - within each stratum are balanced across the treatment and.5) ntains most of the remaining mparison units and mparison units. (We know that such strata exist from step few treatment units. An important difference between the 1.) Based on (2), within each stratum we take a difference figures is that Figure 1 has many bins in which the treatin means of the outme between the treatment and m- ment units greatly outnumber the mparison units. (Inparison groups, then weight these by the number of treated deed, for three bins there are no mparison units.) In nobservations in each stratum. We also nsider matching on trast, in Figure 2 for CPS- 1, each bin ntains at least a the propensity sre. Each treatment unit is matched with few mparison units. Overall, for PSID- 1 there are 98 replacemento the mparison unit with the closest propen- (more than half the total number) treated units with an estisity sre; the unmatched mparison units are discarded mated propensity sre in excess of.8, and only 7 mpar- (see Dehejia and Wahba 1998 for more details; also Heck- ison units, mpared to 35 treated and 7 mparison units man, Ichimura, Smith, and Todd 1998; Heckman, Ichimura, for CPS-1. and Todd 1998; Rubin 1979). Figures 1 and 2 illustrate the diagnostic value of the There are a number of reasons for preferring this two- propensity sre. They reveal that although the mparistep approach to direct estimation of (1). First, tackling (1) son groups are large relative to the treatment group, there directly with a nonparametric regression would enunter is limited overlap in terms of preintervention characteristhe curse of dimensionality as a problem in many datasets tics. Had there been no mparison units overlapping with such as ours that have a large number of variates. This a broad range of the treatment units, then it would not have would also occur when estimating the propensity sre us- been possible to estimate the average treatment effect on the 15 i 1 i l I l I l l o i j Ca 5Xm,lF,ln Estimated p(xi), 1333 mparison units discarded, first bin ntains 928 mparison units Figure 1. Histogram of the Estimated Propensity Sre for NSW Treated Units and PSID Comparison Units. The 1,333 PSID units whose estimated propensity sre is less than the minimum estimated propensity sre for the treatment group are discarded. The first bin ntains 928 PSID units. There is minimal overlap between the two groups. Three bins (.8-.85,.85-.9, and.9-.95) ntain no mparison units. There are 97 treated units with an estimated propensity sre greater than.8 and only 7 mparison units.

8 Dehejia and Wahba: Causal Effects in Nonexperimental Studies i i l I l l l I 15: 1{ o1 Ca 5 X Estimated p(xi), mparison units discarded, first bin ntains 2969 mparison units Figure 2. Histogram of the Estimated Propensity Sre for NSW Treated Units and CPS Comparison Units. The 12,611 CPS units whose estimated propensity sre is less than the minimum estimated propensity sre for the treatment group are discarded. The first bin ntains 2,969 CPS units. There is minimal overlap between the two groups, but the overlap is greater than in Figure 1; only one bin (.45-.5) ntains no mparison units, and there are 35 treated and 7 mparison units with an estimated propensity sre greater than.8. treatment group (although the treatment impact still uld number of treated observations within each stratum [Table be estimated in the range of overlap). With limited overlap, 3, lumn (4)]. An alternative is a within-block regression, we can proceed cautiously with estimation. Because in our again taking a weighted sum over the strata [Table 3, lapplication we have the benchmark experimental estimate, umn (5)]. When the variates are well balanced, such a we are able to evaluate the accuracy of the estimates. Even regression should have little effect, but it can help elimiin the absence of an experimental estimate, we show in Sec- nate the remaining within-block differences. Likewise for tion 5 that the use of multiple mparison groups provides matching, we can estimate a difference in means between the treatment and matched mparison groups for earnings another means of evaluating the estimates. in 1978 [lumn(7)], and also perform a regression of 1978 We use stratification and matching on the propensity earnings on variates [lumn(8)]. sre to group the treatment units with the small number Table 3 presents the results. For the PSID sample, the of mparison units whose estimated propensity sres are stratification estimate is $1,68 and the matching estigreater than the minimum-or less than the maximum- mate is $1,691, mpared to the benchmark randomizedpropensity sre for treatment units. We estimate the treat- experiment estimate of $1,794. The estimates from a difment effect by summing the within-stratum difference in ference in means and regression on the full sample are means between the treatment and mparison observations -$15,25 and $731. In lumns (5) and (8), ntrolling (of earnings in 1978), where the sum is weighted by the for variates has little impact on the stratification and Table 3. Estimated Training Effects for the NSW Male Participants Using Comparison Groups From PSID and CPS NSW treatment earnings less mparison group earnings, NSW earnings less nditional on the estimated propensity sre mparison group earnings Quadratic Stratifying on the sre Matching on the sre (1) (2) in sre' (4) (5) (6) (7) (8) Unadjusted Adjusteda (3) Unadjusted Adjusted Observationsc Unadjusted Adjustedd NSW 1,794 1,672 (633) (638) PSID-le -15, ,68 1,494 1,255 1,691 1,473 (1,154) (886) (1,389) (1,571) (1,581) (2,29) (89) PSID-21-3, ,22 2, ,455 1,48 (959) (1,28) (1,193) (1,768) (1,793) (2,33) (88) PSID-31 1, ,321 1, ,12 1,549 (899) (1,14) (1,383) (1,994) (2,2) (2,335) (826) CPS-1g -8, ,117 1,713 1,774 4,117 1,582 1,616 (712) (55) (747) (1,115) (1,152) (1,69) (751) CPS-29-3, ,543 1,622 1,493 1,788 1,563 (67) (658) (847) (1,461) (1,346) (1,25) (753) CPS , ,252 2, (657) (798) (951) (1,617) (2,82) (1,496) (776) a Least squares regression: RE78 on a nstant, a treatment indicator, age, age 2, education, no degree, black, Hispanic, RE74, RE75. bleast squares regression of RE78 on a quadratic on the estimated propensity sre and a treatment indicator, for observations used under stratification; see note (g). c Number of observations refers to the actual number of mparison and treatment units used for (3)-(5); namely, all treatment units and those mparison units whose estimated propensity sre is greater than the minimum, and less than the maximum, estimated propensity sre for the treatment group. dweighted least squares: treatment observations weighted as 1, and ntrol observations weighted by the number of times they are matched to a treatment observation [same variates as (a)]. Propensity sres are estimated using the logistic model, with specifications as follows: e PSID: Prb i= 1)= Fae Prob (T F(age, age, education, education, married, no degree, black, Hispanic, RE74, RE75, RE74, RE75, u74* black). f PI-adPD3Po =1=Fg,a,ndge,areblkHiai2,edai,eu PSID-2 and PSID-3: Prob (Ti 1) = = F(age, age,education, education,no degree, married, black, Hispanic, RE74, RE74, RE75, RE75, u74, u75). CPS-1, CPS-2, and CPS-3: Prob (Ti = 1) = F(age, age2, education, education2, no degree, married, black, Hispanic, RE74, RE75, u74, u75, education * RE74, age3)

9 16 Journal of the American Statistical Association, December 1999 matching estimates. Likewise for the CPS, the propensity- Column (3) in Table 3 illustrates the value of allowing sre-based estimates from the CPS-$1,713 and $1,582- both for a heterogeneous treatment effect and for a nonare much closer to the experimental benchmark than linear functional form in the propensity sre. The estimaestimates from the full mparison sample, -$8,498 tors in lumns (4)-(8) have both of these characteristics, and $972. whereas lumn (3) regresses 1978 earnings on a less non- We also nsider estimates from the subsets of the PSID linear function [quadratic, as opposed to the step function and CPS. In Table 2 the estimates tend to improve when in lumns (4) and (5)] of the estimated propensity sre applied to narrower subsets. However, the estimates still and a treatment indicator. The estimates are mparable range from -$3,822 to $1,326. In Table 3 the estimates do to those in lumn (2), where we regress the outme on not improve for the subsets, although the range of fluctua- all preintervention characteristics, and are farther from the tion is narrower, from $587 to $2,321. Tables 1 and 4 shed experimental benchmark than the estimates in lumns (4)- light on this. (8). This demonstrates the ability of the propensity sre to Table 1 presents the preintervention characteristics of summarize all preintervention variables, but underlines the the various mparison groups. We note that the subsets- importance of using the propensity sre in a sufficiently PSID-2 and -3 and CPS-2 and -3-although more closely nonlinear functional form. resembling the treatment group are still nsiderably differ- Finally, it must be noted that even though the estimates ent in a number of important dimensions, including ethnicpresented in Table 3 are closer to the experimental benchity, marital status, and especially earnings. Table 4 presents mark than those presented in Table 2, with the exception the characteristics of the matched subsamples from the of the adjusted matching estimator, their standard errors are mparison groups. The characteristics of the higher. In Table matched sub- 3, lumn (5), the standard errors are 1,152 and 1,581 for the CPS and PSID, mpared to 55 sets of CPS-1 and PSID-1 rrespond closely to the treatand 886 in Table 2, Panel C, lumn (5). This is because the ment group; none of the differences is statistically signifpropensity sre estimators use fewer observations. When icant. But as we create subsets of the mparison groups, stratifying on the propensity sre, we discard irrelevant the quality of the matches declines, most dramatically for ntrols, so that the strata may ntain as few as seven the PSID. PSID-2 and -3 earnings now increase from 1974 treated observations. However, the standard errors for the to 1975, whereas they decline for the treatment group. The adjusted matching estimator (751 and 89) are similar to training literature has identified the "dip" in earnings as an those in Table 2. important characteristic of participants in training programs By summarizing all of the variates in a single number, (see Ashenfelter 1974, 1978). The CPS subsamples retain the propensity sre method allows us to focus on the mthe dip, but 1974 earnings are substantially higher for the parability of the mparison group to the treatment group. matched subset of CPS-3 than for the treatment group. Hence it allows us to address the issues of functional form This illustrates one of the important features of propen- and treatment effect heterogeneity much more easily. sity sre methods, namely that creation of ad hoc subsamples from the nonexperimental mparison group is nei- 5. SENSITIVITY ANALYSIS ther necessary nor desirable; subsamples based on single preintervention characteristics may dispose of mparison 5.1 Sensitivity to the Specification of the units that still provide good overall mparisons with treat- Propensity Sre ment units. The propensity sre sorts out which mpari- The upper half of Table 5 demonstrates that the estimates son units are most relevant, nsidering all preintervention of the treatment impact are not particularly sensitive to the characteristics simultaneously, not just one characteristic at specification used for the propensity sre. Specifications 1 n time and 4 are the same as those in Table 3 (and hence they bal- Table 4. Sample Means of Characteristics for Matched Control Samples Matched No. of samples observations Age Education Black Hispanic No degree Married RE74 (U.S. $) RE75 (U.S.$) NSW ,96 1,532 MPSID ,794 1,126 [2.56] [.63] [.13] [.6] [.13] [.12] [1,46] [1,146] MPSID ,599 2,225 [2.63] [.83] [.14] [.8] [.16] [.16] [1,95] [1,228] MPSID ,386 1,863 [2.97] [.84] [.13] [.8] [.16] [.16] [1,68] [1,494] MCPS ,11 1,396 [1.25] [.32] [.6] [.4] [.7] [.6] [841] [563] MCPS ,758 1,24 [1.43] [.37] [.8] [.5] [.9].8 [896] [661] MCPS ,79 1,587 [1.68] [.48] [.9] [.6] [.1] [.9] [1,285] [76] NOTE: Standard error on the difference in means with NSW sample is given in brackets. MPSID1-3 and MCPS1-3 are the subsamples of PSID1-3 and CPS1-3 that are matched to the treatment group.

10 Dehejia and Wahba: Causal Effects in Nonexperimental Studies 161 Table 5. Sensitivity of Estimated Training Effects to Specification of the Propensity Sre NSW treatment earnings less mparison group earnings, NSW earnings less nditional on the estimated propensity sre mparison group earnings Quadratic Stratifying on the sre Matching on the sre (1) (2) in sre C (4) (5) (6) (7) (8) Comparison group Unadjusted Adjusteda (3) Unadjusted Adjusted Observationsd Unadjusted Adjustedb NSW 1,794 1,672 (633) (638) Dropping higher-order terms PSID-1: -15, ,68 1,254 1,255 1,691 1,54 Specification 1 (1,154) (866) (1,389) (1,571) (1,616) (2,29) (831) PSID-1: -15, ,524 1,775 1,533 2,281 2,291 Specification 2 (1,154) (863) (1,344) (1,527) (1,538) (1,732) (796) PSID-1: -15, ,185 1,237 1,155 1,373 1, Specification 3 (1,154) (863) (1,233) (1,144) (1,28) (1,72) (96) CPS-1: -8, ,117 1,713 1,774 4,117 1,582 1,616 Specification 4 (712) (547) (747) (1,115) (1,152) (1,69) (751) CPS-1: -8, ,248 1,452 1,454 6, Specification 5 (712) (546) (731) (632) (2,713) (1,7) (769) CPS-1: -8, ,241 1,299 1,95 6,17 1,13 1,471 Specification 6 (712) (546) (671) (547) (925) (877) (787) Dropping RE74 PSID-1: -15, ,23 1,284 1,727 1,34 Specification 7 (1,154) (88) (1,279) (1,41) (1,493) (1,447) (845) PSID-2: -3, Specification 8 (959) (1,4) (1,154) (1,472) (1,495) (1,848) (92) PSID-3: 1, , Specification 8 (899) (1,1) (1,261) (1,449) (1,493) (1,58) (938) CPS-1: -8, ,181 1,234 1,347 4,558 1, Specification 9 (712) (557) (698) (695) (683) (1,67) (786) CPS-2: -3, ,473 1,588 1,222 1,941 1,668 Specification 9 (67) (662) (731) (1,313) (1,39) (1,5) (755) CPS-3: ,348 1, ,97 1,12 Specification 9 (657) (87) (942) (1,61) (1,6) (1,366) (783) NOTE: Specification 1: Same as Table 3, note (c). Specification 2: Specification 1 without higher powers. Specification 3: Specification 2 without higher-order terms. Specification 4: Same as Table 3, note (e). Specification 5: Specification 4 without higher powers. Specification 6: Specification 5 without higher-order terms. Specification 7: Same as Table 3, note (c), with RE74 removed. Specification 8: Same as Table 3, note (d), with RE74 removed. Specification 9: Same as Table 3, note (e), with RE74 removed. a Least squares regression: RE78 on a nstant, a treatment indicator, age, education, no degree, black, Hispanic, RE74, RE75. bweighted least squares: treatment observations weighted as 1, and ntrol observations weighted by the number of times they are matched to a treatment observation [same variates as (a)]. c Least squares regression of RE78 on a quadratic on the estimated propensity sre and a treatment indicator, for observations used under stratification; see note (d). d Number of observations refers to the actual number of mparison and treatment units used for (3)-(5); namely, all treatment units and those mparison units whose estimated propensity sre is greater than the minimum, and less than the maximum, estimated propensity sre for the treatment group. ance the preintervention characteristics). In specifications outmes, Yi1 and Yio, are observed. This assumption led 2-3 and 5-6, we drop the squares and cubes of the - us to restrict Lalonde's data to the subset for which 2 years variates, and then the interactions and dummy variables. In (rather than 1 year) of preintervention earnings data is availspecifications 3 and 6, the logits simply use the variates able. In this section we nsider how our estimators would linearly. These estimates are farther from the experimen- fare in the absence of 2 years of preintervention earnings tal benchmark than those in Table 3, ranging from $835 to data. In the bottom part of Table 5, we reestimate the treat- $2,291, but they remain ncentrated mpared to the range ment impact without using 1974 earnings. For PSID1-3, of estimates from Table 2. Furthermore, for the alternative the stratification estimates (ranging from -$1,23 to $482) specifications, we are unable to find a partition structure are more variable than the regression estimates in lumn such that the preintervention characteristics are balanced (2) (ranging from -$265 to $297) and the estimates in Tawithin each stratum, which then nstitutes a well-defined ble 3, which use 1974 earnings (ranging from $1,494 to criterion for rejecting these alternative specifications. In- $2,321). The estimates from matching vary less than those deed, the specification search begins with a linear specifi- from stratification. Compared to the PSID estimates, the escation, then adds higher-order and interaction terms until timates from the CPS are closer to the experimental benchwithin-stratum balance is achieved. mark (ranging from $1,234 to $1,588 for stratification and from $861 to $1,941 for matching). They are also closer than the regression estimates in lumn (2). 5.2 Sensitivity to Selection on Observables The results clearly are sensitive to the set of preinter- One important assumption underlying propensity sre vention variables used, but the degree of sensitivity varies methods is that all of the variables that influence assign- with the mparison group. This illustrates the importance ment to treatment and that are rrelated with the potential of a sufficiently lengthy preintervention earnings history

11 162 Journal of the American Statistical Association, December 1999 for training programs. Table 5 also demonstrates the value rica, 66, of using multiple mparison groups. Even if we did not Ashenfelter,. (1974), "The Effect of Manpower Training on Earnings: Preliminary Results," in Proceedings of the Twenty-Seventh Annual Winknow the experimental estimate, the variation in estimates ter Meetings of the Industrial Relations Research Association, eds. J. between the CPS and PSID would raise the ncern that the Stern and B. Dennis, Madison, WI: Industrial Relations Research Assovariables that we observe (assuming that earnings in 1974 ciation. are not observed) do not ntrol fully for the differences (1978),"Estimating the Effects of Training Programs on Earnings," Review of Enomics and Statistics, 6, between the treatment and mparison groups. If all rele- Ashenfelter, O., and Card, D. (1985), "Using the Longitudinal Structure vant variables are observed, then the estimates from both of Earnings to Estimate the Effect of Training Programs," Review of groups should be similar (as they are in Table 3). When Enomics and Statistics, 67, an experimental benchmark is not available, multiple m- Card, D., and Sullivan, D. (1988), "Measuring the Effect of Subsidized Training Programs on Movements In and Out of Employment," Enoparison groups are valuable, because they can suggest the metrica, 56, existence of important unobservables. Rosenbaum (1987) Dawid, A. P. (1979), "Conditional Independence in Statistical Theory," has developed this idea in more detail. Journal of the Royal Statistical Society, Ser. B, 41, Dehejia, R. H., and Wahba, S. (1998), "Matching Methods for Estimat- 6. CONCLUSIONS ing Causal Effects in Non-Experimental Studies," Working Paper 6829, National Bureau of Enomic Research. In this article we have demonstrated how to estimate the Hardle, W., and Linton,. (1994),"Applied Nonparametric Regression," in treatment impact in an observational study using propen- Handbook of Enometrics, Vol. 4, eds. R. Engle and D. L. McFadden, Amsterdam: Elsevier, pp sity sre methods. Our ntribution is to demonstrate the Heckman, J., and Hotz, J. (1989), "Choosing Among Alternative Nonuse of propensity sre methods and to apply them in a experimental Methods for Estimating the Impact of Social Programs: ntext that allows us to assess their efficacy. Our results The Case of Manpower Training," Journal of the American Statistical show that the estimates of the training effect for Lalonde's Association, 84, Heckman, J., Ichimura, H., Smith, J., and Todd, P. (1998), "Characterizhybrid of an experimental and nonexperimental dataset are ing Selection Bias Using Experimental Data," Enomnetrica, 66, 117- close to the benchmark experimental estimate and are ro bust to the specification of the mparison group and to the Heckman, J., Ichimura, H., and Todd, P. (1997), "Matching As An Enofunctional form used to estimate the propensity sre. A metric Evaluation Estimator: Evidence from Evaluating a Job Training Program," Review of Enomic Studies, 64, researcher using this method would arrive at estimates of (1998), "Matching as an Enometric Evaluation Estimator," Rethe treatment impact ranging from $1,473 to $1,774, close view of Enomic Studies, 65, to the benchmark unbiased estimate from the experiment Heckman, J., and Robb, R. (1985), "Alternative Methods for Evaluating of $1,794. Furthermore, our methods succeed for a trans- the Impact of Interventions," in Longituidinal Analysis of Labor Market Data, Enometric Society Monograph No. 1, eds. J. Heckman and B. parent reason: They use only the subset of the mparison Singer, Cambridge, U.K.: Cambridge University Press, pp group that is mparable to the treatment group, and dis- Holland, P. W. (1986), "Statistics and Causal Inference," Journal of the card the mplement. Although Lalonde attempted to fol- American Statistical Association, 81, low this strategy in his nstruction of other mparison Lalonde, R. (1986), "Evaluating the Enometric Evaluations of Training Programs," American Enomic Review, 76, groups, his method relies on an informal selection based Manpower Demonstration Research Corporation (1983), Summaty and on preintervention variables. Our application illustrates that Findings of the National Supported Work Demonstr-ation, Cambridge, even among a large set of potential mparison units, very U.K.: Ballinger. Manski, C. F., and Garfinkel, I. (1992), "Introduction," in Evaluating Welfew may be relevant, and that even a few mparison units fare and Training Programs, eds. C. Manski and I. Garfinkel, Cambridge, may be sufficient to estimate the treatment impact. U.K.: Harvard University Press, pp The methods we suggest are not relevant in all situa- Manski, C. F., Sandefur, G., McLanahan, S., and Powers, D. (1992), "Altertions. There may be important unobservable variates, for native Estimates of the Effect of Family Structure During Adolescence on High School Graduation," Journal of the American Statistical Assowhich the propensity sre method cannot acunt. Howciation, 87, ever, rather than giving up, or relying on assumptions about Rosenbaum, P. (1987), "The Role of a Send Control Group in an Obthe unobserved variables, there is substantial reward in ex- servational Study," Statistical Science, 2, ploring firsthe information ntained in the variables that Rosenbaum, P., and Rubin, D. (1983), "The Central Role of the Propensity Sre in Observational Studies for Causal Effects," Biometrika, 7, 41- are observed. In this regard, propensity sre methods can 55. offer both a diagnostic on the quality of the mparison (1984), "Reducing Bias in Observational Studies Using the Subgroup and a means to estimate the treatment impact. classification on the Propensity Sre," Jouirnal of the American Statistical Association, 79, Rubin, D. (1974),"Estimating Causal Effects of Treatments in Randomized [Received October Revised May 1999.] and Non-Randomized Studies," Journal of Educational Psychology, 66, REFERENCES (1977), "Assignmento a Treatment Group on the Basis of a Covariate," Journal of Educational Statistics, 2, Angrist, J. (199), "Lifetime Earnings and the Vietnam Draft Lottery: Ev- (1978), "Bayesian Inference for Causal Effects: The Role of Ranidence From Social Security Administrative Rerds," American E- domization," The Annals of Statistics, 6, nomic Review, 8, (1979),"Using Multivariate Matched Sampling and Regression Ad- ( 1998), "Estimating the Labor Market Impact of Voluntary Military justmen to Control Bias in Observation Studies," Journal of the Amer- Service Using Social Security Data on Military Applicants," Enomet- ican Statistical Association, 74,

1. INTRODUCTION. Lalonde estimates the impact of the National Supported Work (NSW) Demonstration, a labor

1. INTRODUCTION. Lalonde estimates the impact of the National Supported Work (NSW) Demonstration, a labor 1. INTRODUCTION This paper discusses the estimation of treatment effects in observational studies. This issue, which is of great practical importance because randomized experiments cannot always be implemented,

More information

Practical propensity score matching: a reply to Smith and Todd

Practical propensity score matching: a reply to Smith and Todd Journal of Econometrics 125 (2005) 355 364 www.elsevier.com/locate/econbase Practical propensity score matching: a reply to Smith and Todd Rajeev Dehejia a,b, * a Department of Economics and SIPA, Columbia

More information

EMPIRICAL STRATEGIES IN LABOUR ECONOMICS

EMPIRICAL STRATEGIES IN LABOUR ECONOMICS EMPIRICAL STRATEGIES IN LABOUR ECONOMICS University of Minho J. Angrist NIPE Summer School June 2009 This course covers core econometric ideas and widely used empirical modeling strategies. The main theoretical

More information

Pros. University of Chicago and NORC at the University of Chicago, USA, and IZA, Germany

Pros. University of Chicago and NORC at the University of Chicago, USA, and IZA, Germany Dan A. Black University of Chicago and NORC at the University of Chicago, USA, and IZA, Germany Matching as a regression estimator Matching avoids making assumptions about the functional form of the regression

More information

Empirical Strategies

Empirical Strategies Empirical Strategies Joshua Angrist BGPE March 2012 These lectures cover many of the empirical modeling strategies discussed in Mostly Harmless Econometrics (MHE). The main theoretical ideas are illustrated

More information

Propensity Score Matching with Limited Overlap. Abstract

Propensity Score Matching with Limited Overlap. Abstract Propensity Score Matching with Limited Overlap Onur Baser Thomson-Medstat Abstract In this article, we have demostrated the application of two newly proposed estimators which accounts for lack of overlap

More information

1. Introduction Consider a government contemplating the implementation of a training (or other social assistance) program. The decision to implement t

1. Introduction Consider a government contemplating the implementation of a training (or other social assistance) program. The decision to implement t 1. Introduction Consider a government contemplating the implementation of a training (or other social assistance) program. The decision to implement the program depends on the assessment of its likely

More information

Measuring Impact. Program and Policy Evaluation with Observational Data. Daniel L. Millimet. Southern Methodist University.

Measuring Impact. Program and Policy Evaluation with Observational Data. Daniel L. Millimet. Southern Methodist University. Measuring mpact Program and Policy Evaluation with Observational Data Daniel L. Millimet Southern Methodist University 23 May 2013 DL Millimet (SMU) Observational Data May 2013 1 / 23 ntroduction Measuring

More information

Effects of propensity score overlap on the estimates of treatment effects. Yating Zheng & Laura Stapleton

Effects of propensity score overlap on the estimates of treatment effects. Yating Zheng & Laura Stapleton Effects of propensity score overlap on the estimates of treatment effects Yating Zheng & Laura Stapleton Introduction Recent years have seen remarkable development in estimating average treatment effects

More information

Introduction to Observational Studies. Jane Pinelis

Introduction to Observational Studies. Jane Pinelis Introduction to Observational Studies Jane Pinelis 22 March 2018 Outline Motivating example Observational studies vs. randomized experiments Observational studies: basics Some adjustment strategies Matching

More information

Class 1: Introduction, Causality, Self-selection Bias, Regression

Class 1: Introduction, Causality, Self-selection Bias, Regression Class 1: Introduction, Causality, Self-selection Bias, Regression Ricardo A Pasquini April 2011 Ricardo A Pasquini () April 2011 1 / 23 Introduction I Angrist s what should be the FAQs of a researcher:

More information

Predicting the efficacy of future training programs using past experiences at other locations

Predicting the efficacy of future training programs using past experiences at other locations Journal of Econometrics ] (]]]]) ]]] ]]] www.elsevier.com/locate/econbase Predicting the efficacy of future training programs using past experiences at other locations V. Joseph Hotz a, *, Guido W. Imbens

More information

Propensity Score Methods for Causal Inference with the PSMATCH Procedure

Propensity Score Methods for Causal Inference with the PSMATCH Procedure Paper SAS332-2017 Propensity Score Methods for Causal Inference with the PSMATCH Procedure Yang Yuan, Yiu-Fai Yung, and Maura Stokes, SAS Institute Inc. Abstract In a randomized study, subjects are randomly

More information

PubH 7405: REGRESSION ANALYSIS. Propensity Score

PubH 7405: REGRESSION ANALYSIS. Propensity Score PubH 7405: REGRESSION ANALYSIS Propensity Score INTRODUCTION: There is a growing interest in using observational (or nonrandomized) studies to estimate the effects of treatments on outcomes. In observational

More information

PharmaSUG Paper HA-04 Two Roads Diverged in a Narrow Dataset...When Coarsened Exact Matching is More Appropriate than Propensity Score Matching

PharmaSUG Paper HA-04 Two Roads Diverged in a Narrow Dataset...When Coarsened Exact Matching is More Appropriate than Propensity Score Matching PharmaSUG 207 - Paper HA-04 Two Roads Diverged in a Narrow Dataset...When Coarsened Exact Matching is More Appropriate than Propensity Score Matching Aran Canes, Cigna Corporation ABSTRACT Coarsened Exact

More information

Comparing Experimental and Matching Methods using a Large-Scale Field Experiment on Voter Mobilization

Comparing Experimental and Matching Methods using a Large-Scale Field Experiment on Voter Mobilization Comparing Experimental and Matching Methods using a Large-Scale Field Experiment on Voter Mobilization Kevin Arceneaux Alan S. Gerber Donald P. Green Yale University Institution for Social and Policy Studies

More information

TESTING FOR COVARIATE BALANCE IN COMPARATIVE STUDIES

TESTING FOR COVARIATE BALANCE IN COMPARATIVE STUDIES TESTING FOR COVARIATE BALANCE IN COMPARATIVE STUDIES by Yevgeniya N. Kleyman A dissertation submitted in partial fulfillment of the requirements for the degree of Doctor of Philosophy (Statistics) in The

More information

Complier Average Causal Effect (CACE)

Complier Average Causal Effect (CACE) Complier Average Causal Effect (CACE) Booil Jo Stanford University Methodological Advancement Meeting Innovative Directions in Estimating Impact Office of Planning, Research & Evaluation Administration

More information

The Prevalence of HIV in Botswana

The Prevalence of HIV in Botswana The Prevalence of HIV in Botswana James Levinsohn Yale University and NBER Justin McCrary University of California, Berkeley and NBER January 6, 2010 Abstract This paper implements five methods to correct

More information

BIOSTATISTICAL METHODS

BIOSTATISTICAL METHODS BIOSTATISTICAL METHODS FOR TRANSLATIONAL & CLINICAL RESEARCH PROPENSITY SCORE Confounding Definition: A situation in which the effect or association between an exposure (a predictor or risk factor) and

More information

Marno Verbeek Erasmus University, the Netherlands. Cons. Pros

Marno Verbeek Erasmus University, the Netherlands. Cons. Pros Marno Verbeek Erasmus University, the Netherlands Using linear regression to establish empirical relationships Linear regression is a powerful tool for estimating the relationship between one variable

More information

Propensity Score Methods for Estimating Causality in the Absence of Random Assignment: Applications for Child Care Policy Research

Propensity Score Methods for Estimating Causality in the Absence of Random Assignment: Applications for Child Care Policy Research 2012 CCPRC Meeting Methodology Presession Workshop October 23, 2012, 2:00-5:00 p.m. Propensity Score Methods for Estimating Causality in the Absence of Random Assignment: Applications for Child Care Policy

More information

Methods for Addressing Selection Bias in Observational Studies

Methods for Addressing Selection Bias in Observational Studies Methods for Addressing Selection Bias in Observational Studies Susan L. Ettner, Ph.D. Professor Division of General Internal Medicine and Health Services Research, UCLA What is Selection Bias? In the regression

More information

Randomization as a Tool for Development Economists. Esther Duflo Sendhil Mullainathan BREAD-BIRS Summer school

Randomization as a Tool for Development Economists. Esther Duflo Sendhil Mullainathan BREAD-BIRS Summer school Randomization as a Tool for Development Economists Esther Duflo Sendhil Mullainathan BREAD-BIRS Summer school Randomization as one solution Suppose you could do a Randomized evaluation of the microcredit

More information

Continuum of a Declining Trend in Correct Self- Perception of Body Weight among American Adults

Continuum of a Declining Trend in Correct Self- Perception of Body Weight among American Adults Georgia Southern University Digital Commons@Georgia Southern Georgia Southern University Research Symposium Apr 16th, 4:00 PM - 5:00 PM Continuum of a Declining Trend in Correct Self- Perception of Body

More information

Mostly Harmless Simulations? On the Internal Validity of Empirical Monte Carlo Studies

Mostly Harmless Simulations? On the Internal Validity of Empirical Monte Carlo Studies Mostly Harmless Simulations? On the Internal Validity of Empirical Monte Carlo Studies Arun Advani and Tymon Sªoczy«ski 13 November 2013 Background When interested in small-sample properties of estimators,

More information

ICPSR Causal Inference in the Social Sciences. Course Syllabus

ICPSR Causal Inference in the Social Sciences. Course Syllabus ICPSR 2012 Causal Inference in the Social Sciences Course Syllabus Instructors: Dominik Hangartner London School of Economics Marco Steenbergen University of Zurich Teaching Fellow: Ben Wilson London School

More information

Does AIDS Treatment Stimulate Negative Behavioral Response? A Field Experiment in South Africa

Does AIDS Treatment Stimulate Negative Behavioral Response? A Field Experiment in South Africa Does AIDS Treatment Stimulate Negative Behavioral Response? A Field Experiment in South Africa Plamen Nikolov Harvard University February 19, 2011 Plamen Nikolov (Harvard University) Behavioral Response

More information

Impact and adjustment of selection bias. in the assessment of measurement equivalence

Impact and adjustment of selection bias. in the assessment of measurement equivalence Impact and adjustment of selection bias in the assessment of measurement equivalence Thomas Klausch, Joop Hox,& Barry Schouten Working Paper, Utrecht, December 2012 Corresponding author: Thomas Klausch,

More information

Exploring the Impact of Missing Data in Multiple Regression

Exploring the Impact of Missing Data in Multiple Regression Exploring the Impact of Missing Data in Multiple Regression Michael G Kenward London School of Hygiene and Tropical Medicine 28th May 2015 1. Introduction In this note we are concerned with the conduct

More information

Unit 1 Exploring and Understanding Data

Unit 1 Exploring and Understanding Data Unit 1 Exploring and Understanding Data Area Principle Bar Chart Boxplot Conditional Distribution Dotplot Empirical Rule Five Number Summary Frequency Distribution Frequency Polygon Histogram Interquartile

More information

Estimating average treatment effects from observational data using teffects

Estimating average treatment effects from observational data using teffects Estimating average treatment effects from observational data using teffects David M. Drukker Director of Econometrics Stata 2013 Nordic and Baltic Stata Users Group meeting Karolinska Institutet September

More information

Causal Validity Considerations for Including High Quality Non-Experimental Evidence in Systematic Reviews

Causal Validity Considerations for Including High Quality Non-Experimental Evidence in Systematic Reviews Non-Experimental Evidence in Systematic Reviews OPRE REPORT #2018-63 DEKE, MATHEMATICA POLICY RESEARCH JUNE 2018 OVERVIEW Federally funded systematic reviews of research evidence play a central role in

More information

Discussion. Ralf T. Münnich Variance Estimation in the Presence of Nonresponse

Discussion. Ralf T. Münnich Variance Estimation in the Presence of Nonresponse Journal of Official Statistics, Vol. 23, No. 4, 2007, pp. 455 461 Discussion Ralf T. Münnich 1 1. Variance Estimation in the Presence of Nonresponse Professor Bjørnstad addresses a new approach to an extremely

More information

Using ASPES (Analysis of Symmetrically- Predicted Endogenous Subgroups) to understand variation in program impacts. Laura R. Peck.

Using ASPES (Analysis of Symmetrically- Predicted Endogenous Subgroups) to understand variation in program impacts. Laura R. Peck. Using ASPES (Analysis of Symmetrically- Predicted Endogenous Subgroups) to understand variation in program impacts Presented by: Laura R. Peck OPRE Methods Meeting on What Works Washington, DC September

More information

Confounding by indication developments in matching, and instrumental variable methods. Richard Grieve London School of Hygiene and Tropical Medicine

Confounding by indication developments in matching, and instrumental variable methods. Richard Grieve London School of Hygiene and Tropical Medicine Confounding by indication developments in matching, and instrumental variable methods Richard Grieve London School of Hygiene and Tropical Medicine 1 Outline 1. Causal inference and confounding 2. Genetic

More information

Version No. 7 Date: July Please send comments or suggestions on this glossary to

Version No. 7 Date: July Please send comments or suggestions on this glossary to Impact Evaluation Glossary Version No. 7 Date: July 2012 Please send comments or suggestions on this glossary to 3ie@3ieimpact.org. Recommended citation: 3ie (2012) 3ie impact evaluation glossary. International

More information

Arturo Gonzalez Public Policy Institute of California & IZA 500 Washington St, Suite 800, San Francisco,

Arturo Gonzalez Public Policy Institute of California & IZA 500 Washington St, Suite 800, San Francisco, WORKING PAPER #521 PRINCETON UNIVERSITY INDUSTRIAL RELATIONS SECTION JUNE 2007 Estimating the Effects of Length of Exposure to a Training Program: The Case of Job Corps 1 Alfonso Flores-Lagunes University

More information

Jake Bowers Wednesdays, 2-4pm 6648 Haven Hall ( ) CPS Phone is

Jake Bowers Wednesdays, 2-4pm 6648 Haven Hall ( ) CPS Phone is Political Science 688 Applied Bayesian and Robust Statistical Methods in Political Research Winter 2005 http://www.umich.edu/ jwbowers/ps688.html Class in 7603 Haven Hall 10-12 Friday Instructor: Office

More information

MEA DISCUSSION PAPERS

MEA DISCUSSION PAPERS Inference Problems under a Special Form of Heteroskedasticity Helmut Farbmacher, Heinrich Kögel 03-2015 MEA DISCUSSION PAPERS mea Amalienstr. 33_D-80799 Munich_Phone+49 89 38602-355_Fax +49 89 38602-390_www.mea.mpisoc.mpg.de

More information

Missing Data and Imputation

Missing Data and Imputation Missing Data and Imputation Barnali Das NAACCR Webinar May 2016 Outline Basic concepts Missing data mechanisms Methods used to handle missing data 1 What are missing data? General term: data we intended

More information

Identifying Peer Influence Effects in Observational Social Network Data: An Evaluation of Propensity Score Methods

Identifying Peer Influence Effects in Observational Social Network Data: An Evaluation of Propensity Score Methods Identifying Peer Influence Effects in Observational Social Network Data: An Evaluation of Propensity Score Methods Dean Eckles Department of Communication Stanford University dean@deaneckles.com Abstract

More information

Group Work Instructions

Group Work Instructions Evaluating Social Programs Executive Education at the Abdul Latif Jameel Poverty Action Lab June 18, 2007 June 22, 2007 Group Work Instructions You will be assigned to groups of 5-6 people. We will do

More information

The role of self-reporting bias in health, mental health and labor force participation: a descriptive analysis

The role of self-reporting bias in health, mental health and labor force participation: a descriptive analysis Empir Econ DOI 10.1007/s00181-010-0434-z The role of self-reporting bias in health, mental health and labor force participation: a descriptive analysis Justin Leroux John A. Rizzo Robin Sickles Received:

More information

Propensity Score. Overview:

Propensity Score. Overview: Propensity Score Overview: What do we use a propensity score for? How do we construct the propensity score? How do we implement propensity score es

More information

Propensity Score Analysis: Its rationale & potential for applied social/behavioral research. Bob Pruzek University at Albany

Propensity Score Analysis: Its rationale & potential for applied social/behavioral research. Bob Pruzek University at Albany Propensity Score Analysis: Its rationale & potential for applied social/behavioral research Bob Pruzek University at Albany Aims: First, to introduce key ideas that underpin propensity score (PS) methodology

More information

Glossary From Running Randomized Evaluations: A Practical Guide, by Rachel Glennerster and Kudzai Takavarasha

Glossary From Running Randomized Evaluations: A Practical Guide, by Rachel Glennerster and Kudzai Takavarasha Glossary From Running Randomized Evaluations: A Practical Guide, by Rachel Glennerster and Kudzai Takavarasha attrition: When data are missing because we are unable to measure the outcomes of some of the

More information

Statistical Techniques for Analyzing Data From Prevention Trials: Treatment of No-Shows Using Rubin's Causal Model

Statistical Techniques for Analyzing Data From Prevention Trials: Treatment of No-Shows Using Rubin's Causal Model Psychological Methods 1998, Vol. 3, No. 2,147-159 Copyright 1998 by the American Psychological Association, Inc. 1082-989X/98/$3.00 Statistical Techniques for Analyzing Data From Prevention Trials: Treatment

More information

TRACER STUDIES ASSESSMENTS AND EVALUATIONS

TRACER STUDIES ASSESSMENTS AND EVALUATIONS TRACER STUDIES ASSESSMENTS AND EVALUATIONS 1 INTRODUCTION This note introduces the reader to tracer studies. For the Let s Work initiative, tracer studies are proposed to track and record or evaluate the

More information

AP Statistics Exam Review: Strand 2: Sampling and Experimentation Date:

AP Statistics Exam Review: Strand 2: Sampling and Experimentation Date: AP Statistics NAME: Exam Review: Strand 2: Sampling and Experimentation Date: Block: II. Sampling and Experimentation: Planning and conducting a study (10%-15%) Data must be collected according to a well-developed

More information

WORKING PAPER SERIES

WORKING PAPER SERIES WORKING PAPER SERIES Evaluating School Vouchers: Evidence from a Within-Study Kaitlin P. Anderson Patrick J. Wolf, Ph.D. April 3, 2017 EDRE Working Paper 2017-10 The University of Arkansas, Department

More information

UN Handbook Ch. 7 'Managing sources of non-sampling error': recommendations on response rates

UN Handbook Ch. 7 'Managing sources of non-sampling error': recommendations on response rates JOINT EU/OECD WORKSHOP ON RECENT DEVELOPMENTS IN BUSINESS AND CONSUMER SURVEYS Methodological session II: Task Force & UN Handbook on conduct of surveys response rates, weighting and accuracy UN Handbook

More information

Working Paper: Designs of Empirical Evaluations of Non-Experimental Methods in Field Settings. Vivian C. Wong 1 & Peter M.

Working Paper: Designs of Empirical Evaluations of Non-Experimental Methods in Field Settings. Vivian C. Wong 1 & Peter M. EdPolicyWorks Working Paper: Designs of Empirical Evaluations of Non-Experimental Methods in Field Settings Vivian C. Wong 1 & Peter M. Steiner 2 Over the last three decades, a research design has emerged

More information

Propensity scores: what, why and why not?

Propensity scores: what, why and why not? Propensity scores: what, why and why not? Rhian Daniel, Cardiff University @statnav Joint workshop S3RI & Wessex Institute University of Southampton, 22nd March 2018 Rhian Daniel @statnav/propensity scores:

More information

Abstract Title Page Not included in page count. Title: Analyzing Empirical Evaluations of Non-experimental Methods in Field Settings

Abstract Title Page Not included in page count. Title: Analyzing Empirical Evaluations of Non-experimental Methods in Field Settings Abstract Title Page Not included in page count. Title: Analyzing Empirical Evaluations of Non-experimental Methods in Field Settings Authors and Affiliations: Peter M. Steiner, University of Wisconsin-Madison

More information

Citation for published version (APA): Ebbes, P. (2004). Latent instrumental variables: a new approach to solve for endogeneity s.n.

Citation for published version (APA): Ebbes, P. (2004). Latent instrumental variables: a new approach to solve for endogeneity s.n. University of Groningen Latent instrumental variables Ebbes, P. IMPORTANT NOTE: You are advised to consult the publisher's version (publisher's PDF) if you wish to cite from it. Please check the document

More information

A Potential Outcomes View of Value-Added Assessment in Education

A Potential Outcomes View of Value-Added Assessment in Education A Potential Outcomes View of Value-Added Assessment in Education Donald B. Rubin, Elizabeth A. Stuart, and Elaine L. Zanutto Invited discussion to appear in Journal of Educational and Behavioral Statistics

More information

Combining machine learning and matching techniques to improve causal inference in program evaluation

Combining machine learning and matching techniques to improve causal inference in program evaluation bs_bs_banner Journal of Evaluation in Clinical Practice ISSN1365-2753 Combining machine learning and matching techniques to improve causal inference in program evaluation Ariel Linden DrPH 1,2 and Paul

More information

PREVENTIVE HEALTH BEHAVIORS AMONG THE ELDERLY

PREVENTIVE HEALTH BEHAVIORS AMONG THE ELDERLY PREVENTIVE HEALTH BEHAVIORS AMONG THE ELDERLY by Padmaja Ayyagari Department of Economics Date: Duke University Approved: Frank A. Sloan, Supervisor Alessandro Tarozzi Curtis R. Taylor Ahmed Khwaja Dissertation

More information

Econometric analysis and counterfactual studies in the context of IA practices

Econometric analysis and counterfactual studies in the context of IA practices Econometric analysis and counterfactual studies in the context of IA practices Giulia Santangelo http://crie.jrc.ec.europa.eu Centre for Research on Impact Evaluation DG EMPL - DG JRC CRIE Centre for Research

More information

Modelling Research Productivity Using a Generalization of the Ordered Logistic Regression Model

Modelling Research Productivity Using a Generalization of the Ordered Logistic Regression Model Modelling Research Productivity Using a Generalization of the Ordered Logistic Regression Model Delia North Temesgen Zewotir Michael Murray Abstract In South Africa, the Department of Education allocates

More information

THE EFFECTS OF SELF AND PROXY RESPONSE STATUS ON THE REPORTING OF RACE AND ETHNICITY l

THE EFFECTS OF SELF AND PROXY RESPONSE STATUS ON THE REPORTING OF RACE AND ETHNICITY l THE EFFECTS OF SELF AND PROXY RESPONSE STATUS ON THE REPORTING OF RACE AND ETHNICITY l Brian A. Harris-Kojetin, Arbitron, and Nancy A. Mathiowetz, University of Maryland Brian Harris-Kojetin, The Arbitron

More information

NBER TECHNICAL WORKING PAPER SERIES USING RAMDOMIZATION IN DEVELOPMENT ECONOMICS RESEARCH: A TOOLKIT. Esther Duflo Rachel Glennerster Michael Kremer

NBER TECHNICAL WORKING PAPER SERIES USING RAMDOMIZATION IN DEVELOPMENT ECONOMICS RESEARCH: A TOOLKIT. Esther Duflo Rachel Glennerster Michael Kremer NBER TECHNICAL WORKING PAPER SERIES USING RAMDOMIZATION IN DEVELOPMENT ECONOMICS RESEARCH: A TOOLKIT Esther Duflo Rachel Glennerster Michael Kremer Technical Working Paper 333 http://www.nber.org/papers/t0333

More information

Rajeev Dehejia* Experimental and Non-Experimental Methods in Development Economics: A Porous Dialectic

Rajeev Dehejia* Experimental and Non-Experimental Methods in Development Economics: A Porous Dialectic JGD 2015; 6(1): 47 69 Rajeev Dehejia* Experimental and Non-Experimental Methods in Development Economics: A Porous Dialectic Open Access Abstract: This paper surveys six widely-used non-experimental methods

More information

Propensity Score Analysis Shenyang Guo, Ph.D.

Propensity Score Analysis Shenyang Guo, Ph.D. Propensity Score Analysis Shenyang Guo, Ph.D. Upcoming Seminar: April 7-8, 2017, Philadelphia, Pennsylvania Propensity Score Analysis 1. Overview 1.1 Observational studies and challenges 1.2 Why and when

More information

Propensity Score Analysis to compare effects of radiation and surgery on survival time of lung cancer patients from National Cancer Registry (SEER)

Propensity Score Analysis to compare effects of radiation and surgery on survival time of lung cancer patients from National Cancer Registry (SEER) Propensity Score Analysis to compare effects of radiation and surgery on survival time of lung cancer patients from National Cancer Registry (SEER) Yan Wu Advisor: Robert Pruzek Epidemiology and Biostatistics

More information

Supplement for: CD4 cell dynamics in untreated HIV-1 infection: overall rates, and effects of age, viral load, gender and calendar time.

Supplement for: CD4 cell dynamics in untreated HIV-1 infection: overall rates, and effects of age, viral load, gender and calendar time. Supplement for: CD4 cell dynamics in untreated HIV-1 infection: overall rates, and effects of age, viral load, gender and calendar time. Anne Cori* 1, Michael Pickles* 1, Ard van Sighem 2, Luuk Gras 2,

More information

Challenges of Observational and Retrospective Studies

Challenges of Observational and Retrospective Studies Challenges of Observational and Retrospective Studies Kyoungmi Kim, Ph.D. March 8, 2017 This seminar is jointly supported by the following NIH-funded centers: Background There are several methods in which

More information

Sequential nonparametric regression multiple imputations. Irina Bondarenko and Trivellore Raghunathan

Sequential nonparametric regression multiple imputations. Irina Bondarenko and Trivellore Raghunathan Sequential nonparametric regression multiple imputations Irina Bondarenko and Trivellore Raghunathan Department of Biostatistics, University of Michigan Ann Arbor, MI 48105 Abstract Multiple imputation,

More information

Overview of Perspectives on Causal Inference: Campbell and Rubin. Stephen G. West Arizona State University Freie Universität Berlin, Germany

Overview of Perspectives on Causal Inference: Campbell and Rubin. Stephen G. West Arizona State University Freie Universität Berlin, Germany Overview of Perspectives on Causal Inference: Campbell and Rubin Stephen G. West Arizona State University Freie Universität Berlin, Germany 1 Randomized Experiment (RE) Sir Ronald Fisher E(X Treatment

More information

Catherine A. Welch 1*, Séverine Sabia 1,2, Eric Brunner 1, Mika Kivimäki 1 and Martin J. Shipley 1

Catherine A. Welch 1*, Séverine Sabia 1,2, Eric Brunner 1, Mika Kivimäki 1 and Martin J. Shipley 1 Welch et al. BMC Medical Research Methodology (2018) 18:89 https://doi.org/10.1186/s12874-018-0548-0 RESEARCH ARTICLE Open Access Does pattern mixture modelling reduce bias due to informative attrition

More information

Meta-Analysis and Publication Bias: How Well Does the FAT-PET-PEESE Procedure Work?

Meta-Analysis and Publication Bias: How Well Does the FAT-PET-PEESE Procedure Work? Meta-Analysis and Publication Bias: How Well Does the FAT-PET-PEESE Procedure Work? Nazila Alinaghi W. Robert Reed Department of Economics and Finance, University of Canterbury Abstract: This study uses

More information

THE GOOD, THE BAD, & THE UGLY: WHAT WE KNOW TODAY ABOUT LCA WITH DISTAL OUTCOMES. Bethany C. Bray, Ph.D.

THE GOOD, THE BAD, & THE UGLY: WHAT WE KNOW TODAY ABOUT LCA WITH DISTAL OUTCOMES. Bethany C. Bray, Ph.D. THE GOOD, THE BAD, & THE UGLY: WHAT WE KNOW TODAY ABOUT LCA WITH DISTAL OUTCOMES Bethany C. Bray, Ph.D. bcbray@psu.edu WHAT ARE WE HERE TO TALK ABOUT TODAY? Behavioral scientists increasingly are using

More information

P E R S P E C T I V E S

P E R S P E C T I V E S PHOENIX CENTER FOR ADVANCED LEGAL & ECONOMIC PUBLIC POLICY STUDIES Revisiting Internet Use and Depression Among the Elderly George S. Ford, PhD June 7, 2013 Introduction Four years ago in a paper entitled

More information

Econometric Game 2012: infants birthweight?

Econometric Game 2012: infants birthweight? Econometric Game 2012: How does maternal smoking during pregnancy affect infants birthweight? Case A April 18, 2012 1 Introduction Low birthweight is associated with adverse health related and economic

More information

Mediation Analysis With Principal Stratification

Mediation Analysis With Principal Stratification University of Pennsylvania ScholarlyCommons Statistics Papers Wharton Faculty Research 3-30-009 Mediation Analysis With Principal Stratification Robert Gallop Dylan S. Small University of Pennsylvania

More information

Dichotomizing partial compliance and increased participant burden in factorial designs: the performance of four noncompliance methods

Dichotomizing partial compliance and increased participant burden in factorial designs: the performance of four noncompliance methods Merrill and McClure Trials (2015) 16:523 DOI 1186/s13063-015-1044-z TRIALS RESEARCH Open Access Dichotomizing partial compliance and increased participant burden in factorial designs: the performance of

More information

The Impact of Relative Standards on the Propensity to Disclose. Alessandro Acquisti, Leslie K. John, George Loewenstein WEB APPENDIX

The Impact of Relative Standards on the Propensity to Disclose. Alessandro Acquisti, Leslie K. John, George Loewenstein WEB APPENDIX The Impact of Relative Standards on the Propensity to Disclose Alessandro Acquisti, Leslie K. John, George Loewenstein WEB APPENDIX 2 Web Appendix A: Panel data estimation approach As noted in the main

More information

Each copy of any part of a JSTOR transmission must contain the same copyright notice that appears on the screen or printed page of such transmission.

Each copy of any part of a JSTOR transmission must contain the same copyright notice that appears on the screen or printed page of such transmission. Decision-Making under Uncertainty in the Setting of Environmental Health Regulations Author(s): James M. Robins, Philip J. Landrigan, Thomas G. Robins, Lawrence J. Fine Source: Journal of Public Health

More information

Bayesian graphical models for combining multiple data sources, with applications in environmental epidemiology

Bayesian graphical models for combining multiple data sources, with applications in environmental epidemiology Bayesian graphical models for combining multiple data sources, with applications in environmental epidemiology Sylvia Richardson 1 sylvia.richardson@imperial.co.uk Joint work with: Alexina Mason 1, Lawrence

More information

THE INFLUENCE OF UNCONSCIOUS NEEDS ON COLLEGE PROGRAM CHOICE

THE INFLUENCE OF UNCONSCIOUS NEEDS ON COLLEGE PROGRAM CHOICE University of Massachusetts Amherst ScholarWorks@UMass Amherst Tourism Travel and Research Association: Advancing Tourism Research Globally 2007 ttra International Conference THE INFLUENCE OF UNCONSCIOUS

More information

Which Comparison-Group ( Quasi-Experimental ) Study Designs Are Most Likely to Produce Valid Estimates of a Program s Impact?:

Which Comparison-Group ( Quasi-Experimental ) Study Designs Are Most Likely to Produce Valid Estimates of a Program s Impact?: Which Comparison-Group ( Quasi-Experimental ) Study Designs Are Most Likely to Produce Valid Estimates of a Program s Impact?: A Brief Overview and Sample Review Form February 2012 This publication was

More information

Causal Effect Heterogeneity

Causal Effect Heterogeneity Causal Effect Heterogeneity Jennie E. Brand Juli Simon-Thomas PWP-CCPR-2011-012 Latest Revised: March 30, 2012 California Center for Population Research On-Line Working Paper Series CAUSAL EFFECT HETEROGENEITY

More information

Methods of Reducing Bias in Time Series Designs: A Within Study Comparison

Methods of Reducing Bias in Time Series Designs: A Within Study Comparison Methods of Reducing Bias in Time Series Designs: A Within Study Comparison Kylie Anglin, University of Virginia Kate Miller-Bains, University of Virginia Vivian Wong, University of Virginia Coady Wing,

More information

Propensity score methods to adjust for confounding in assessing treatment effects: bias and precision

Propensity score methods to adjust for confounding in assessing treatment effects: bias and precision ISPUB.COM The Internet Journal of Epidemiology Volume 7 Number 2 Propensity score methods to adjust for confounding in assessing treatment effects: bias and precision Z Wang Abstract There is an increasing

More information

Chapter 13 Estimating the Modified Odds Ratio

Chapter 13 Estimating the Modified Odds Ratio Chapter 13 Estimating the Modified Odds Ratio Modified odds ratio vis-à-vis modified mean difference To a large extent, this chapter replicates the content of Chapter 10 (Estimating the modified mean difference),

More information

BAYESIAN ESTIMATORS OF THE LOCATION PARAMETER OF THE NORMAL DISTRIBUTION WITH UNKNOWN VARIANCE

BAYESIAN ESTIMATORS OF THE LOCATION PARAMETER OF THE NORMAL DISTRIBUTION WITH UNKNOWN VARIANCE BAYESIAN ESTIMATORS OF THE LOCATION PARAMETER OF THE NORMAL DISTRIBUTION WITH UNKNOWN VARIANCE Janet van Niekerk* 1 and Andriette Bekker 1 1 Department of Statistics, University of Pretoria, 0002, Pretoria,

More information

Weight Adjustment Methods using Multilevel Propensity Models and Random Forests

Weight Adjustment Methods using Multilevel Propensity Models and Random Forests Weight Adjustment Methods using Multilevel Propensity Models and Random Forests Ronaldo Iachan 1, Maria Prosviryakova 1, Kurt Peters 2, Lauren Restivo 1 1 ICF International, 530 Gaither Road Suite 500,

More information

Proof. Revised. Chapter 12 General and Specific Factors in Selection Modeling Introduction. Bengt Muthén

Proof. Revised. Chapter 12 General and Specific Factors in Selection Modeling Introduction. Bengt Muthén Chapter 12 General and Specific Factors in Selection Modeling Bengt Muthén Abstract This chapter shows how analysis of data on selective subgroups can be used to draw inference to the full, unselected

More information

Instrumental Variables I (cont.)

Instrumental Variables I (cont.) Review Instrumental Variables Observational Studies Cross Sectional Regressions Omitted Variables, Reverse causation Randomized Control Trials Difference in Difference Time invariant omitted variables

More information

Each copy of any part of a JSTOR transmission must contain the same copyright notice that appears on the screen or printed page of such transmission.

Each copy of any part of a JSTOR transmission must contain the same copyright notice that appears on the screen or printed page of such transmission. Identification of Causal Effects Using Instrumental Variables Author(s): Joshua D. Angrist, Guido W. Imbens, Donald B. Rubin Source: Journal of the American Statistical Association, Vol. 91, No. 434 (Jun.,

More information

Estimating Heterogeneous Treatment Effects with Observational Data

Estimating Heterogeneous Treatment Effects with Observational Data Estimating Heterogeneous Treatment Effects with Observational Data Yu Xie University of Michigan Jennie E. Brand University of California Los Angeles Ben Jann University of Bern, Switzerland Population

More information

KARUN ADUSUMILLI OFFICE ADDRESS, TELEPHONE & Department of Economics

KARUN ADUSUMILLI OFFICE ADDRESS, TELEPHONE &   Department of Economics LONDON SCHOOL OF ECONOMICS & POLITICAL SCIENCE Placement Officer: Professor Wouter Den Haan +44 (0)20 7955 7669 w.denhaan@lse.ac.uk Placement Assistant: Mr John Curtis +44 (0)20 7955 7545 j.curtis@lse.ac.uk

More information

A Comparison of Variance Estimates for Schools and Students Using Taylor Series and Replicate Weighting

A Comparison of Variance Estimates for Schools and Students Using Taylor Series and Replicate Weighting A Comparison of Variance Estimates for Schools and Students Using and Replicate Weighting Peter H. Siegel, James R. Chromy, Ellen Scheib RTI International Abstract Variance estimation is an important issue

More information

Lecture II: Difference in Difference and Regression Discontinuity

Lecture II: Difference in Difference and Regression Discontinuity Review Lecture II: Difference in Difference and Regression Discontinuity it From Lecture I Causality is difficult to Show from cross sectional observational studies What caused what? X caused Y, Y caused

More information

Chapter 11: Advanced Remedial Measures. Weighted Least Squares (WLS)

Chapter 11: Advanced Remedial Measures. Weighted Least Squares (WLS) Chapter : Advanced Remedial Measures Weighted Least Squares (WLS) When the error variance appears nonconstant, a transformation (of Y and/or X) is a quick remedy. But it may not solve the problem, or it

More information

Can Quasi Experiments Yield Causal Inferences? Sample. Intervention 2/20/2012. Matthew L. Maciejewski, PhD Durham VA HSR&D and Duke University

Can Quasi Experiments Yield Causal Inferences? Sample. Intervention 2/20/2012. Matthew L. Maciejewski, PhD Durham VA HSR&D and Duke University Can Quasi Experiments Yield Causal Inferences? Matthew L. Maciejewski, PhD Durham VA HSR&D and Duke University Sample Study 1 Study 2 Year Age Race SES Health status Intervention Study 1 Study 2 Intervention

More information

CHAPTER VI RESEARCH METHODOLOGY

CHAPTER VI RESEARCH METHODOLOGY CHAPTER VI RESEARCH METHODOLOGY 6.1 Research Design Research is an organized, systematic, data based, critical, objective, scientific inquiry or investigation into a specific problem, undertaken with the

More information

Identification of population average treatment effects using nonlinear instrumental variables estimators : another cautionary note

Identification of population average treatment effects using nonlinear instrumental variables estimators : another cautionary note University of Iowa Iowa Research Online Theses and Dissertations Fall 2014 Identification of population average treatment effects using nonlinear instrumental variables estimators : another cautionary

More information