Rules Versus Discretion in Social Programs: Empirical Evidence on Profiling In Employment and Training Programs


Rules Versus Discretion in Social Programs: Empirical Evidence on Profiling in Employment and Training Programs

Miana Plesca, University of Guelph
Jeff Smith, University of Maryland

This version: January 2005

We thank the CIBC Chair in Human Capital and Productivity and the Social Sciences and Humanities Research Council of Canada for research support. We are grateful to Audra Bowlus, Bo Honoré, Chris Robinson, Todd Stinebrickner and Tiemen Woutersen for helpful comments.

Table of Contents

Abstract
1. Introduction
2. Statistical Treatment Rules
   2.1 Profiling as an allocation mechanism
   2.2 The choice of a profiling variable
   2.3 Empirical evidence on profiling on program outcomes
   2.4 Empirical evidence on profiling on program impacts
3. The NJS Data
4. Estimation Methodology
   4.1 Profiling on largest predicted impacts
   4.2 Profiling specifications
   4.3 Profiling on small Y0
   4.4 Profiling at positive sites
   4.5 Profiling into JTPA from Application
   4.6 Profiling into JTPA from Eligibility
   4.7 Profiling into Alternative Treatment Streams within JTPA
5. Results from profiling into JTPA
   5.1 Profiling into JTPA on predicted program impacts
       Different covariate sets used in profiling
       Profiling based on positive predicted earnings impacts
       Profiling based on positive predicted employment impacts
   5.2 Profiling into JTPA on predicted non-treatment outcomes

       Profiling based on low predicted non-treatment earnings
       Profiling based on low predicted non-treatment employment
       Sensitivity to different cutoffs in outcome profiling
   5.3 Sensitivity to different proportions of splitting the estimating and validating samples
   5.4 Conclusions from profiling into JTPA
       Profiling performance
       Efficiency versus equity
       Specifications
6. Profiling into JTPA at positive sites
7. Profiling into JTPA from Application
8. Profiling into JTPA from Eligibility
9. Profiling into Alternative Treatment Streams within JTPA
   9.1 Profiling pooled adult males and females into JTPA treatment streams
       Profiling the pooled sample on predicted earnings impacts
       Profiling the pooled sample on predicted employment impacts
   9.2 Profiling separately adult males and adult females into JTPA treatment streams
       Profiling adult males into treatment streams on high treatment outcomes Y1
   9.3 Conclusions from profiling into treatment streams
10. Summary and Conclusions
References
Main result tables
Appendix 1: Results from profiling young males and young females
Appendix 2: Some sensitivity results

Abstract

A substantial body of empirical evidence indicates that public employment and training programs generally fail to achieve their stated goal of increasing the earnings and employment of those they serve. One possible way to improve the performance of re-employment and training programs is to use statistical models to profile potential participants either into or out of a program, or into alternative treatments within a program. Because program impacts are heterogeneous, profiling based on predicted program impacts can yield a more efficient allocation mechanism. Using experimental data from the National JTPA Study, we evaluate the potential improvement in program performance from profiling into the JTPA program. We find that statistical profiling improves program impacts for adult males and females, but not for young males and females. We find the same profiling impacts for program applicants as for JTPA participants, as well as for adult female eligible non-applicants; for adult male eligible non-applicants, profiling impacts are worse. In general, profiling on large predicted impacts is more efficient than profiling on low predicted non-participation outcomes. Program performance can also be improved by eliminating sites with negative mean impacts. We also provide evidence on the potential benefits of profiling into alternative treatment streams conditional on program participation. For adult females, profiling into treatment streams improves program performance in all treatment streams; this is not so for adult males. Caseworker assignment and random assignment yield similar predicted program impacts, which are less efficient than those from statistical profiling in the cases where statistical profiling is an efficiency-improving mechanism.
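The efficiency-oriented allocation rule evaluated in the paper (treat exactly those individuals with a positive predicted impact) can be illustrated in a few lines. This is a simulated sketch: the predicted impacts below are randomly generated stand-ins, not estimates from the NJS data.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical predicted impacts of treatment on 18-month earnings for
# 1,000 potential participants (illustrative numbers, not NJS estimates).
predicted_impact = rng.normal(loc=500, scale=2000, size=1000)

# Impact-based profiling rule: treat exactly those with a positive
# predicted gain; everyone else is profiled out of the program.
treat = predicted_impact > 0

# Screening out negative-impact individuals raises the mean impact
# among the treated relative to serving everyone.
mean_all = predicted_impact.mean()
mean_profiled = predicted_impact[treat].mean()
```

The gap between `mean_profiled` and `mean_all` is exactly the efficiency gain the paper measures when it compares profiled allocations to the program as implemented.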

1 Introduction

Profiling methods seek to improve program performance by using statistical decision rules to assign individuals to participate in a program or not, or to assign participants to particular treatment streams within a program. Any program has a social agenda, be it equity (serving those most in need) or efficiency (serving those who would benefit most from the program). Given the goal of the program, policymakers face the important choice of an allocation mechanism to assign participation. Statistical profiling is one possible treatment allocation mechanism. Other possibilities include deterministic rules, which base assignment on one or more observable characteristics (e.g. race in affirmative action programs), and caseworker discretion. Within program eligibility rules, caseworker assignment bases the allocation on participant characteristics, observed and unobserved (to the statistician), at the discretion of program staff.

In this paper we examine how the impact of a program can be improved by selecting program participants through statistical profiling mechanisms that exploit the heterogeneity in outcomes and impacts across individuals. In practice, governments have implemented profiling on outcome levels, assigning to treatment individuals with low outcomes in the absence of treatment. [Footnote 1: This is the case, for instance, with the U.S. Unemployment Insurance profiling system.] This kind of profiling serves an equity goal. If efficiency is the desired goal, then the allocation mechanism of choice ought to be statistical profiling based on predicted program impacts. The mechanism relies on predicting, for each individual, his or her gains in every state of a program (participation, non-participation, treatment streams within a program) and assigning the state that maximizes expected gains. Besides comparing program impacts from assignments given by statistical profiling

rules, we also investigate how the assignment into particular treatment streams within a program differs between statistical profiling and other assignment mechanisms such as random assignment or caseworker discretion. [Footnote 2: The evaluation of statistical profiling as an allocation mechanism and the evaluation of program impacts are two separate issues. A good allocation mechanism can only do the best it can when implemented on a badly designed program. Similarly, a bad allocation mechanism can misallocate participants and thus deteriorate the impacts of an otherwise attractive program.]

Even though profiling systems for social programs are fairly new in practice, there are some important examples. For instance, in the U.S., the Unemployment Insurance (UI) system in most states profiles individuals into mandatory reemployment services based on their predicted duration of UI benefit receipt. [Footnote 3: See, e.g., Dickinson, Decker and Kreutzer (1999) for a detailed description of UI profiling in the U.S.] Although it is sometimes referred to as a program that attempts to maximize the total gains of its participants, this allocation system in fact builds on equity concerns. While the UI profiling system is an example of a profiling system that allocates potential participants into a program or not, both the U.S. and Canada are also considering systems that will allocate persons to alternative treatments within a program. In the U.S., the Frontline Decision Support System (FDSS) will allocate persons to treatments funded under the Workforce Investment Act (WIA). In Canada, an earlier system called the Service and Outcome Measurement System (SOMS) was intended to allocate unemployed persons to various publicly funded employment and training programs. [Footnote 4: See Chapter 12 on FDSS by Eberts and O'Leary and Chapter 10 on SOMS by Colpitts in Targeting Employment Services, Eberts, O'Leary and Wandner, eds., 2002, Upjohn Institute.]

The goal of any program is to obtain the best results according to some social welfare criterion, be that efficiency or equity (or maximization of re-election probabilities). Conditional on having decided on the allocation mechanism, the choice of a target ("profiling") variable is essential to achieving the program goals. If the goal of the program is efficiency, then the program targets maximum impacts on the outcome of interest, e.g. earnings or employment. [Footnote 5: This is a partial equilibrium world where general equilibrium effects such as displacement are ignored.] If the goal of the program is equity, treatment is administered to those individuals identified as neediest, as in the case of the UI profiling system, where claimants with the highest predicted probabilities of exhausting UI benefits are profiled into treatment. [Footnote 6: It is rarely the case that the goals of efficiency and equity do not conflict. See Berger, Black, and Smith (2001).]

The choice of a profiling variable depends not only on the goal of the program, but also on the availability of data to be used in estimating the profiling model. Depending on the program under evaluation, profiling at different points in the participation decision may achieve different goals. For instance, if a program is running under capacity constraints, profiling the pool of applicants may increase program efficiency by making sure that the people who would benefit most from the program get enrolled. If the intention of a program is to provide services for the whole eligible population, not just for individuals self-selected into participation, then the profiling analysis should be carried out at the eligibility stage. Finally, profiling can further improve a program's performance by recommending to program participants the treatment stream that would bring them the highest expected gains (possibly conditional on capacity constraints).

In the empirical analysis we use data from an influential social experiment conducted in the U.S., the National JTPA Study (NJS) evaluation of the Job Training Partnership Act (JTPA) program. We first consider profiling individuals into or out of the program at various stages in the application and enrollment process: profiling from random

assignment, from application, and from eligibility. Using the experimental data, we estimate a heterogeneous program impact function based on a large set of individual characteristics. Under certain assumptions, the impact function allows us to forecast potential program gains for each individual. Based on this forecast, we profile individuals either into participation or into non-participation, and we re-compute program impacts under the resulting allocation. Likewise, we compute impact functions conditional on personal characteristics for each JTPA treatment stream, and we profile participants into the treatment that yields the highest expected gain for each participant. A comparison between the program gains under the current implementation and the average program impacts under profiling indicates the potential for using statistical profiling to improve program performance. We document under what circumstances profiling can improve the efficiency of the JTPA program by allocating treatment to those participants who would benefit most from it. Although economically relevant, the results do not always achieve statistical significance because of the small samples involved in the estimation, a problem that commonly plagues the program evaluation literature.

From a methodological standpoint, we bring innovations to the program evaluation literature. We introduce a profiling function that generates program impacts as a function of individuals' observed covariates X. We note that in small samples the program impact estimators may suffer from over-fitting bias. To avoid this bias, we introduce a procedure in which we randomly split the sample into an estimating sample and a validating (or holdout) sample. We use the observations in the estimating sample to generate, for individuals in the validating sample, the predicted impact functions on which we base the profiling allocations. We repeat this procedure in a bootstrap fashion enough

times (500) to account for sampling variation. We report average impact measures (as well as average standard errors) over the 500 repetitions.

The paper proceeds as follows. Section 2 discusses the theoretical grounds for statistical treatment rules, the choice of profiling as an allocation mechanism, and existing evidence on profiling. Section 3 describes the experimental data used in this exercise. Section 4 presents the methodological approach. Results for profiling into the program at the point of random assignment are discussed in section 5. Further profiling exercises are undertaken in section 6 (profiling at positive sites), section 7 (profiling from application) and section 8 (profiling from eligibility). Section 9 provides results from the analysis of profiling into treatment streams, and section 10 concludes.

2 Statistical Treatment Rules

2.1 Profiling as an allocation mechanism

As argued by Manski (1997), if a program were administered by a benevolent social planner, the allocation that maximizes a utilitarian social welfare function is the same as the allocation that maximizes the utility each individual derives from the program. Abstracting from unobserved individual heterogeneity pertaining to motivation or ability, the allocation of choice for individuals who have access to the same information set as the social planner is also the allocation based on profiling on program impacts. In other words, an individual will choose to participate or not in a program, or in a treatment stream within a program, depending on whether the net outcome from participation exceeds the net outcome from non-participation. The central motivation behind this argument is that program impacts are heterogeneous (Manski, 1997 and 2000). Even

conditional on some broad characteristics like belonging to a certain demographic group, program impacts vary across participants. The idea behind statistical profiling is to identify which personal characteristics are responsible for the heterogeneity in individual impacts, and in what manner, and to use this knowledge to predict individual responses and best assignments under different treatment scenarios.

Statistical profiling is an intermediate allocation strategy between caseworker discretion and a deterministic rule: less ad hoc than caseworker selection and providing a much finer allocation than a deterministic rule. Caseworker assignment is the strategy at work in most existing multiple-treatment programs. The (hopefully) benevolent caseworker has access to background information, the results of formal tests of aptitudes and interests, and possibly other variables related to the applicant's motivation and enthusiasm. Based on her available information, the caseworker makes the treatment recommendation she considers will achieve the program goals for the eligible applicant, subject to budgetary and administrative constraints. Caseworker assignment has the virtue that it allows idiosyncratic information about the client and about the institutional environment (information that may be difficult to include in a statistical decision system or to encode in a deterministic rule) to affect the allocation process. The downside of this allocation mechanism is that program outcomes will depend on subjective decisions by caseworkers, who may make mistakes or engage in cream-skimming to achieve performance standard goals. There is also considerable variation across caseworkers' decisions: caseworkers are not all equally well informed, nor do they all use

the same criteria in decision-making. [Footnote 7: See the evidence in Bell and Orr (2002).]

A deterministic rule places individuals in treatment based on observable characteristics, as in means-tested transfer programs, affirmative action programs, or a rule that all welfare participants with no children below an age cutoff must participate in employment-related activities. Deterministic rules have the virtue of simplicity and of equity, in the sense of treating observationally equivalent cases in the same way. Statistical profiling is an intermediate case between caseworker discretion and a deterministic rule. Observable characteristics are included in the profiling model to assign persons to a program or to treatments within a program, where the importance of each characteristic depends on its estimated relationship with the profiling variable. In practice, profiling results in a finer allocation that incorporates more information than most deterministic rules, at the cost of setting up and operating the profiling system. Relative to caseworker discretion, profiling gives up the use of idiosyncratic information. At the same time, profiling is likely to be less costly and to generate fewer concerns about unequal treatment across caseworkers.

At one extreme of statistical profiling is the random assignment mechanism, which allocates participants at random to a program or into treatment streams. Since random assignment equates the distributions of observables and unobservables between treatment and control groups in large samples, it is used in social experiments to allow unbiased estimation of program impacts by a simple mean difference in outcomes between the treatment and control groups. We use this allocation method in computing experimental program impacts, as well as a benchmark in the exercise of profiling participants into alternative treatment streams.

2.2 The choice of a profiling variable

The selection of the profiling variable depends on the goals of the allocation rule and on the available data (Berger, Black, and Smith, 2001). If the goal of the allocation is to assign the neediest persons to a program, then a profiling variable that correlates positively with need will be used, and individuals with high predicted values of the variable will be assigned to treatment. If efficiency is the goal of the program, then the logical profiling variable is the predicted net impact of the program, whose determinants can be estimated using experimental or non-experimental methods. [Footnote 8: See Heckman, LaLonde and Smith (1999) and Angrist and Krueger (1999) for extended discussions of experimental and non-experimental methods for estimating the impact of social programs.]

Profiling requires information on the observable characteristics in the profiling model for everyone who is to be profiled, as well as data on the profiling variable for the sample that will be used to estimate the profiling model. When the profiling variable is equity-related, such as expected duration of UI or welfare receipt, the profiling variable itself can typically be obtained from administrative data on earlier cohorts of participants. When the profiling variable is an expected impact of the program, the ideal is to have experimental data on the population to be profiled, so that the experimental impacts as a function of observable characteristics can be estimated without bias. In some cases, the available experimental data will not correspond exactly to the program as currently implemented (perhaps coming from another state) or to the population being profiled (perhaps because of changes in participation over the business cycle). Given a choice of profiling variable, the choice of which predictor variables to include in the model depends on the costs of obtaining the data and on how fine-grained an allocation mechanism is desired. Manski (1997, 2000) and Pepper (2000) discuss the issues in detail, and show that in some cases profiling based on poor data may do worse than simple deterministic rules.

2.3 Empirical evidence on profiling on program outcomes

The outcome from program participation, Y1, is the level of the variable of interest, be it income or employment, at the end of the program; it is identified by data on program participants. The outcome in the absence of the program, Y0, is the corresponding level in the absence of the program; it is identified by data on non-participants. Profiling on program outcomes, or levels, singles out and assigns to treatment individuals with the highest (or lowest) predicted levels of Y1 or Y0, while profiling on program impacts assigns to treatment individuals with the highest predicted gain from the program, Y1 − Y0. Because of equity concerns, most programs that implement profiling base it on low levels of Y0.

O'Leary, Decker and Wandner (2001) look at potential gains from profiling persons into the UI bonus treatment using data from the Washington and Pennsylvania UI reemployment bonus experiments. The experimental impact estimates in both cases were statistically insignificant. The idea is that profiling based on the probability of benefit exhaustion (the same profiling variable currently used in the UI profiling system) would exclude persons most likely to have a short spell even without the bonus, and could therefore improve the mean impact of the treatment. Their profiling results are presented as mean experimental impacts conditional on various cutoff levels of the probability of benefit exhaustion. [Footnote 9: The program outcomes considered are UI benefit receipt, bonus payments, and earnings.] They find that although profiling can increase the cost-effectiveness of the program, the paid UI benefits do not

steadily decline as the cutoff levels increase.

Using an ingenious quasi-experimental design, Black, Smith, Berger, and Noel (2003) examine the relationship between the impact of mandatory reemployment services and the probability of finding employment. Under the implementation of the UI profiling system in Kentucky, UI claimants receive a profiling score between 1 and 20 that represents their predicted duration of UI benefit receipt, from lowest to highest. Individuals are assigned to mandatory reemployment services each week in each local UI office in descending order of the profiling score until the slots for that office in that week are filled. Individuals with the marginal profiling score in each office each week (the score at which there are not enough slots to treat everyone) are randomly assigned. If the profiling system enhanced the efficiency of the allocation mechanism by directing the treatment to those who benefit most from it, the experimental impact estimates should increase with the score. The authors find no evidence that the individuals targeted by the program, namely claimants likely to have a long spell of UI receipt, benefit more from the program than those predicted to have shorter spells, and conclude that the goal of the UI profiling system is equity rather than efficiency. [Footnote 10: In fact, the largest point estimate ($1507, not statistically significant) occurs for claimants with a profiling score of 6, while the point estimates for claimants with profiling scores of 17 and 18 are negative (and not significant).]

An experimental evaluation of a statistical profiling mechanism used to allocate welfare recipients to three different treatment streams was designed at the request of the U.S. Department of Labor (DOL) (Eberts, 2001). The program under consideration was Michigan's implementation of a Work-First program for welfare recipients. The service offered, job search assistance, was uniform across all participants, but the three treatment providers differed in their approach to implementing the services. Participants were assigned a score: the probability of finding and keeping employment for more than 90 days after the treatment. The score, estimated from a previous cohort of program participants, controlled for education, demographics, and past experience with the program. Based on the score, one third of participants were assigned to each of the three service providers. At this point, half of the participants in each score group, the treatments, were sent to treatment at the assigned provider. The other half, the controls, were randomly assigned to one of the three providers. Results from this evaluation seem to indicate that, conditional on program participation, individuals have better outcomes from the treatment to which they were originally assigned. Nothing more can be said about program impacts. The grouping by score does not necessarily equate participants' unobservable characteristics (or observables, for that matter) within each group. [Footnote 11: The wide difference in outcomes between treatments and controls in the second score group, who end up with the same (preferred) provider, seems to confirm that there may be systematic differences between the treatment and control groups in score group two.] Moreover, if, however unlikely, program impacts were to turn out homogeneous across participants, then the allocation distribution would not even matter.

Finally, the quality of the predictions generated by a profiling model is essential in designing the profiling allocation. Policymakers who elect to use profiling as an allocation mechanism face two additional, interrelated choices: what variable to use as the profiling variable and what variables to include in the profiling model. The more covariates are available, the finer the allocation that can be achieved, but at the cost of collecting the extra variables. In the UI profiling system, states have varied widely in the types and number of additional variables that they have included beyond those suggested in the basic state model. Including additional covariates can change the ordering of individuals on the predicted profiling variable and can also yield better predictions. In a recent study requested by the U.S. DOL, Black et al. (2003) document that improving data quality, including enough conditioning variables and, to some extent, using a continuous profiling variable (fraction of UI benefits exhausted) are key ingredients for improving the predictive performance of UI profiling models.

2.4 Empirical evidence on profiling on program impacts

Dehejia (2000) evaluates four different allocation mechanisms using results from an experimental evaluation of the GAIN (Greater Avenues for Independence) program in Alameda County, California. The treatment group receives the GAIN program, consisting of regular AFDC (Aid to Families with Dependent Children) benefits plus employment and training services (mostly job search assistance and basic education), while the control group receives only the regular AFDC program. Deterministically assigning everyone to AFDC dominates assigning everyone to GAIN if impacts are assumed to be homogeneous, while the reverse is true for heterogeneous program impacts. If impacts are heterogeneous and individuals are profiled into GAIN based on expected impacts, then this allocation mechanism second-order stochastically dominates either of the two deterministic rules.

Lechner and Smith (2003) use Swiss administrative records to assess the performance of caseworkers in assigning unemployed individuals to one of eight possible treatment streams within the Swiss re-employment program. They adapt the popular matching estimator to the framework of multiple treatment streams. Their results indicate that, if caseworkers were to assign individuals to the treatment that would generate the largest

predicted impact, overall program impacts would increase by 14%. Conversely, if individuals were instead assigned to the treatment stream that suited them least, overall program performance would decrease by 15.8%. [Footnote 12: As is common in the literature, general equilibrium effects are ignored.] Their conclusion is that, while caseworkers do not perform very well, they are not harmful either. Caseworkers are not necessarily uninformed or incompetent, but they may face different objectives (they may favor equity concerns over efficiency) or external restrictions such as imposed participation requirements.

One more piece of evidence on profiling based on non-experimental predicted outcomes comes from Frölich (2001), who applies profiling to the choice of treatment for participants in Swedish worker rehabilitation programs. Recipients of temporary disability benefits are profiled into three different treatment streams: no rehabilitation, vocational rehabilitation, and non-vocational rehabilitation. By assigning persons to treatment based on non-experimental estimates of the impact of each treatment (as a function of observable characteristics), it is estimated that the re-employment rate would increase from 46%, its level under current program operation, to 56%. A comparison of the profiling assignment based on estimated impacts with the actual assignments of caseworkers reveals that caseworkers assign only 42% of participants to what is considered optimal under profiling.

To summarize, the literature indicates that profiling on predicted individual program impacts can exploit the heterogeneity in treatment responses across participants and result in a better allocation of participants to the program. In general, profiling on predicted program impacts serves efficiency goals, while profiling on predicted program outcomes serves equity goals, and there is usually a trade-off between the two. Using data from

the National JTPA Study, we add to the literature by examining to what extent and under what circumstances profiling on impacts is more efficient than profiling on outcomes. Moreover, we investigate the relative gains from profiling at different points in the participation decision process: eligibility, application, or assignment. Answers to such questions have substantial policy relevance.

Having good estimates of the predicted profiling variable is essential to obtaining the desired allocation mechanism. Even more important, good estimates of the heterogeneous impact profiling function are needed. Frölich (2001) aims to compute unbiased profiling function estimates by applying semiparametric techniques to non-experimental data. Nevertheless, the literature indicates that, although sample selection bias can be reduced by carefully applying non-experimental estimators, the bias is not completely removed when compared to unbiased experimental estimators. Our results regarding profiling into participation or non-participation will be more reliable, since our impact profiling function relies on experimental estimators.

3 The NJS Data

The data come from the National Job Training Partnership Act (JTPA) Study (NJS). JTPA was a U.S. federal training program targeted at disadvantaged Americans. The program started in 1982 as a substitute for the Comprehensive Employment and Training Act (CETA) and continued to operate until the late 1990s, when it was replaced by the Workforce Investment Act (WIA). JTPA provided disadvantaged youth and adults with training and re-employment services. Prior to random assignment, participants were recommended services that can be grouped into three service streams: classroom training in occupational skills (denoted "CT-OS" in what follows), job search assistance

and subsidized on-the-job training at private firms ("OJT"), and a mix of other services ("OTHER"), including basic education, possibly in combination with some CT-OS or OJT. Classroom training programs in North America tend to have a short average duration (typically just a few months) and aim to prepare their participants for entry-level positions in semi-skilled occupations. Subsidized on-the-job training provides an incentive for private firms to hire and train disadvantaged workers, though whether these workers actually get any more training than other newly hired workers is subject to debate. Job search assistance programs attempt to facilitate the matching process between firms and workers. This category includes services ranging from lectures on resume writing and how to give a good interview to the formal job matching services of the Employment Service. [Footnote 13: A caveat applies to our exercise of profiling into treatment streams. The three treatment streams are a bit artificial relative to how the program actually operates. A real-world profiling system for JTPA should focus on assignment to particular treatments, not groups of treatments or treatment streams.]

The JTPA program was subject to an experimental evaluation, the National JTPA Study (NJS), commissioned by the U.S. Department of Labor in 1986 to measure the impacts and costs of JTPA (e.g. Bloom et al. 1993). The evaluation took place at a non-random sample of 16 of the more than 600 JTPA training centers. Applicants over the period November 1987 to September 1989 were first allocated to a treatment stream based on caseworkers' decisions. Subsequently, the applicant group was split by random assignment into experimental treatment and control groups, in a proportion of two treatments to one control. Follow-up surveys examined the employment and earnings outcomes of the experimental treatment and control groups. We consider the moment of random assignment to the treatment or control group as the time-zero mark. The outcomes of interest are the

sum of earnings over all months up to month 18 after random assignment, and employment at month 18. The analysis is based on forecasting heterogeneous experimental program impacts as a function of personal characteristics. Since the parameters of the program impact function are computed using the randomly assigned experimental treatment and control participants, the results are free of sample selection bias. The outcomes of reference are earnings and employment in the 18 months following random assignment.

In profiling from random assignment and profiling into treatment streams we use information on the experimental treatments and controls available at the 16 non-randomly selected sites that agreed to participate in the NJS. In profiling from application we use additional data on non-experimental applicants (NEAs in what follows). These individuals initially applied for JTPA but dropped out somewhere between the first and second interviews, so they did not participate in the JTPA experiment. One reason (among many) that individuals may appear in the NEA sample but not in the experimental sample is that they may turn out to be ineligible for JTPA; as such, the NEAs are drawn from a different population than the ENPs. Other reasons NEAs may not follow up with participation are that they may not like their caseworker, they may have learned enough to conclude that JTPA services are not for them, or they may have received a job offer or gone to jail. Information on NEAs is available at only 12 of the NJS sites and was collected on a voluntary basis, at different points in time, by caseworkers. Finally, for profiling from eligibility we use additional data on eligible non-participants (ENPs in what follows). Information on ENPs is available at only 4 NJS sites.

The analysis is conducted separately for each of four demographic groups: adult males and adult females (ages 22 and older), and male and female out-of-school youth (ages 16 to 21). Youth were randomly assigned at only 15 of the 16 experimental sites. Information on sample composition by demographic characteristics is available in Table 1.

4 Estimation Methodology

A widely used parameter in the evaluation literature is the treatment on the treated (TT) parameter, the impact of a program on its participants. Treatment on the treated is the parameter of interest in many obvious instances, for example to evaluate how well a program performs for the people it was designed to serve, or to decide whether to continue the program as is or to shut it down. When experimental data are available, as is the case with the NJS, then under assumptions that rule out changes in the normal operation of a program due to experimentation, the simple mean difference between the outcomes of the treatment and control groups gives an unbiased estimate of the treatment on the treated parameter. These assumptions are not likely to hold in the JTPA experiment, due to treatment group dropout and control group substitution. Without accounting for dropout and substitution biases, the experimental estimator is an unbiased estimator of the effect of the intent to treat rather than of the treatment actually received. Although sometimes invoked in the evaluation literature, the common effect assumption, which states that all treatment participants experience the same program gain, does not hold in practice (e.g., Heckman, Smith and Clements 1997). 14

14 Note that evidence of heterogeneity in program impacts is implicit in any study that finds a change in the mean impact associated with profiling.

We

proceed by designing an impact function that predicts individual program gains based on observable individual characteristics. The idea behind the profiling experiment is straightforward: we exploit the random assignment design of the NJS to estimate, for all participants regardless of their assignment status, an unbiased individual impact function conditional on personal characteristics.

4.1 Profiling on largest predicted impacts

Let Y_0 denote the program outcome in the absence of treatment, Y_1 the post-treatment outcome, and let D be an indicator for program participation: D takes the value 1 if the experimental participant was assigned to the treatment group and 0 if s/he was assigned to the control group. Following Heckman, LaLonde and Smith (1999) we define the switching regression:

(1) Y = (1 - D) Y_0 + D Y_1

When D = 1 we observe Y_1; when D = 0 we observe Y_0. The impact of the program on its participants (the treatment-on-the-treated parameter) is Δ = E[Y_1 - Y_0 | X, D = 1].

The two equations describing the outcome in the two states, treatment and non-treatment, can be written in simplified linear form as:

Y_1 = X β_1 + u_1 for participation,
Y_0 = X β_0 + u_0 for non-participation.

Substituting the two outcome equations into (1) gives:

Y = X β_0 + D X (β_1 - β_0) + u_0 + D (u_1 - u_0),

or, denoting γ = β_1 - β_0 and ε = u_1 - u_0:

(2) Y = X β_0 + D X γ + u_0 + D ε

The treatment-on-the-treated impact can therefore be obtained as:

Δ(X) = E[Y_1 - Y_0 | X, D = 1] = E[Y | X, D = 1] - E[Y | X, D = 0]
Δ(X) = X γ + (E[u_0 | X, D = 1] - E[u_0 | X, D = 0]) + E[ε | X, D = 1]

Because of the random assignment design, E[u_0 | X, D = 0] = E[u_0 | X, D = 1], so the second term on the right-hand side disappears. The same is true of the last term:

E[ε | X, D = 1] = E[u_1 - u_0 | X, D = 1] = E[u_1 | X, D = 1] - E[u_0 | X, D = 1] = 0,

and thus Δ(X) = X γ. For each individual, estimating (2) generates a predicted individual treatment response:

(3) Δ_i(X_i) = X_i ˆγ

Averaging all individual predicted impacts produces the desired parameter of interest:

(4) Δ(X) = (1/N) Σ_i Δ_i(X_i) = (1/N) Σ_i X_i ˆγ

The allocation mechanism that profiles on predicted program impacts sorts all participants by the value of the heterogeneous program impact Δ_i(X_i) and then assigns to treatment only those participants with positive predicted impacts (or with predicted impacts above a given threshold value). The profiled impact is obtained by re-averaging Δ_j(X_j) over the profiled individuals j for whom Δ_j(X_j) > 0:

(5) Δ_profiling^bias = E[Δ_j(X_j) | Δ_j(X_j) > 0] = E[X_j ˆγ | X_j ˆγ > 0].
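The estimation in equations (2)-(5) can be sketched in a few lines. The sketch below is illustrative only: it uses simulated data in place of the NJS sample, and all names and parameter values (n, k, beta0, gamma, the standard-normal covariates) are assumptions of the sketch, not the paper's estimates.

```python
import numpy as np

# Illustrative data: random assignment D, heterogeneous impacts X @ gamma.
rng = np.random.default_rng(0)
n, k = 5000, 3
X = np.column_stack([np.ones(n), rng.normal(size=(n, k - 1))])
D = rng.integers(0, 2, size=n)            # experimental assignment indicator
beta0 = np.array([1.0, 0.5, -0.3])        # non-treatment outcome coefficients
gamma = np.array([0.8, 1.2, -0.6])        # heterogeneous impact coefficients
Y = X @ beta0 + D * (X @ gamma) + rng.normal(size=n)

# Equation (2): regress Y on X and the interaction D*X in one OLS step.
W = np.column_stack([X, D[:, None] * X])
coef, *_ = np.linalg.lstsq(W, Y, rcond=None)
gamma_hat = coef[k:]

# Equation (3): predicted individual impacts; (4): their average;
# (5): the re-averaged impact over the profiled (positive-impact) group.
impacts = X @ gamma_hat
overall_mean = impacts.mean()
profiled_mean = impacts[impacts > 0].mean()
```

Because the profiled group here is selected on the estimated impacts themselves, `profiled_mean` illustrates the in-sample rule of equation (5); the sample-splitting procedure described next is what removes the associated over-fitting bias.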

A caveat applies. As the "bias" superscript indicates, this methodology can be flawed if the sample size is too small. Although E[ε_j | X_j, D_j = 1] = 0, once we condition on X_j ˆγ > 0 it may be the case that E[ε_j | X_j, D_j = 1, X_j ˆγ > 0] ≠ 0 unless the sample size is large enough. Conditioning on X ˆγ > 0 truncates the conditional distribution of ε toward the values of ε most highly correlated with X ˆγ (an "over-fitting" bias). In this situation, averaging over the profiled observations induces an upward bias in the estimated profiling impact.

To avoid the over-fitting bias we implement the following procedure: we randomly split the sample in two, an estimating sample and a validating (or holdout) sample. 15 We use the observations in the estimating sample to generate predicted individual impact functions Δ_iE(X_i) for participants in the validating sample. 16 Sorting individuals based on Δ_iE(X_i) does not induce an inconsistency. We profile into treatment those participants from the uncontaminated validating sample who have positive predicted impacts, as given by the predictions from the estimating sample. Profiling impacts are then re-estimated on the profiled subsample, with the impact coefficient now denoted δ rather than γ:

(6) Y = X α + D X δ + u_0 + D ε

While it is not necessary to re-estimate the impact function (a simple mean difference between the outcomes of the experimental treatments and controls would also give an unbiased profiling impact), the re-estimated impact function can be used to profile eligible

15 In the forecasting literature the estimating sample is also referred to as the training sample, and the validating sample as the testing or holdout sample.
16 The E subscript indicates that the impacts are predicted from the estimating sample.

individuals who have never taken part in the JTPA experiment. 17 The unbiased profiled treatment impact is given by:

(7) Δ_profiling(X) = (1/K) Σ_k Δ_k(X_k) = (1/K) Σ_k X_k ˆδ

where k indexes the individuals in the validating sample for whom a positive predicted impact was projected from the estimating sample, Δ_kE(X_k) > 0. We replicate the random split into estimating and validating samples 500 times and report results averaged over the 500 repetitions.

17 Actually, we report both profiling impacts: one given by re-estimating the profiling function in the validating sample, the other as the mean difference between profiled treatment and control outcomes in the validating sample.

4.2 Profiling specifications

One possible concern with estimating the profiling impacts as described above is that, in order to achieve a fine enough allocation mechanism, the dimension of the X vector has to be very large. In turn, the precision of the impact function decreases as the size of X grows. A compromise is to incorporate the covariates X into the profiling function as a scalar linear combination of all X. We choose the propensity score, the probability that an individual participates in training conditional on X, as that scalar measure. Thus we implement a second allocation whereby the vector X is first converted into an index P(X), the probability of selecting into participation conditional on observed characteristics (that is, the propensity score). In this case, the equations that generate the profiling function Δ(P(X)) are:

Y = α + β P(X) + δ D + γ P(X) D

Δ_i = ˆδ + ˆγ ˆP(X_i) and Δ(P(X)) = ˆδ + ˆγ (1/N) Σ_i ˆP(X_i)

where P is the propensity score, P(X) = Pr(R = 1 | X), with R denoting selection into treatment: R = 1 for individuals who decide to participate in JTPA. Because of the experimental design of the NJS sample, both the experimental treatment (D = 1) and experimental control (D = 0) groups have R = 1. To estimate the probability of participation in JTPA we need a larger sample that also includes individuals who were eligible to participate but decided not to (eligible non-participants, or ENPs), for whom R = 0. Data on all experimentals, treatments and controls, are used for R = 1; for R = 0 we use all ENPs for whom data are available (collected at 4 of the 16 NJS sites). The propensity scores P come from a weighted logit, where population weights account for the fact that the ENPs are a much larger fraction of the eligible population (97%) than of our sample. Propensity scores are computed only once, before starting the 500 iterations on random samples. (Sensitivity results show that computing propensity scores separately for the estimating and validating samples within each repetition does not change the results.)

The two specifications described so far, one where the vector of covariates X enters linearly and one where it enters in its scalar form P(X), are described as specifications 1 and 2 in Table 2. To account for non-linearities in X and P(X) we implement hybrid specifications in which the covariates X still enter in index fashion, only instead of the propensity score itself we use quantiles of its distribution. These are specifications 3 and 4 and some of their variants described in Table 2. The main difference between specifications 3 and 4 is

that for the former the quantiles are computed from the distribution of propensity scores in the experimental sample, while for the latter they are computed from the distribution of propensity scores in the eligible population (including the ENPs). Specification 4 allows for less variation in the quantiles of the experimental treatments and controls, since the propensity scores for the experimental group tend to cluster in the upper tail of the population propensity score distribution. To allow for more variation in the quantiles of the experimental treatment and control groups, specification 4 therefore uses more quantiles (20) than specification 3, which uses only 10. We also implement a variant of specification 3, denoted 3*, in which 20 quantiles of the distribution of propensity scores in the experimental population are used in the profiling estimation. Because specifications 3, 3* and 4 combine the virtues of a parsimonious specification (smaller over-fitting bias and better out-of-sample forecasts) with allowance for non-linearities in the propensity score, we expect these specifications to give the best profiling results.

Table 2 mentions three more sets of profiling specifications: 2b-4b, 2c-4c, and 2d and 4d. These specifications are used at the stage of profiling from application or profiling from eligibility, exercises described further on. The main difference among them comes from different ways of including another segment of the eligible population, the non-experimental applicants (NEAs), in the propensity score analysis.

Since the conditioning variables X determine the impact function Δ(X) or Δ(P(X)), the choice of X is of crucial relevance for the performance of the profiling method. Faced with a trade-off between precision in the allocation mechanism and specification concerns, we selected the subset of covariates that best predicts the outcome variable using an automated backward-forward stepwise routine. For each of the four demographic groups,

these covariates are listed in Table 3. The pool of X variables from which the stepwise procedure chooses is guided by what the program evaluation literature has taught us about participation and outcomes. The backward selection routine removes from the pool of available variables, one by one, the variables that contribute least to R-squared, until the backward selection threshold (P = 0.20) is reached. Next, the forward selection routine adds to the variables already included in the model those with the highest partial correlations with the dependent variable, as long as the forward selection threshold (P = 0.15) has not been reached. In general, the forward selection routine rarely adds any variable to those selected by the backward stepwise routine. Similar results are obtained whether or not the pool of available variables is interacted with the treatment participation dummy. The automated procedure is quite robust to small variations in the selection thresholds.

The difference between specifications Stepwise I and Stepwise II is that the former considers variables grouped when they belong to the same category (for instance, all education variables are considered jointly by the stepwise routine), while the latter allows them to be split (the stepwise routine may pick only selected education categories). We also provide different sets of covariates X used in the sensitivity analysis. Finally, when implementing the specifications that use the propensity score we did not have as much latitude in choosing the covariates X, because we were limited by the availability of data for the ENPs. The same covariates were used for all four demographic groups; they are listed at the end of Table 3.
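The backward step of such a routine can be sketched as a simple p-value screen. This is a rough illustration only, on simulated data: the 0.20 removal threshold mirrors the text, but the variable names, the OLS-based p-values, and the normal approximation are assumptions of the sketch, not the authors' actual routine.

```python
import numpy as np
from math import erf, sqrt

# Simulated data: only columns 0 and 1 actually predict y; columns 2-5 are noise.
rng = np.random.default_rng(4)
n = 500
X = rng.normal(size=(n, 6))
y = 1.0 + 2.0 * X[:, 0] - 1.5 * X[:, 1] + rng.normal(size=n)

def p_values(cols):
    """Two-sided p-values (normal approximation) for the OLS slopes on X[:, cols]."""
    Xd = np.column_stack([np.ones(n), X[:, cols]])
    coef, *_ = np.linalg.lstsq(Xd, y, rcond=None)
    resid = y - Xd @ coef
    s2 = resid @ resid / (n - Xd.shape[1])
    se = np.sqrt(s2 * np.diag(np.linalg.inv(Xd.T @ Xd)))
    t = coef / se
    return np.array([2 * (1 - 0.5 * (1 + erf(abs(ti) / sqrt(2)))) for ti in t[1:]])

# Backward elimination: drop the least significant variable, one at a time,
# until every remaining p-value is at or below the 0.20 removal threshold.
keep = list(range(6))
while True:
    p = p_values(keep)
    worst = int(np.argmax(p))
    if p[worst] <= 0.20:
        break
    keep.pop(worst)
```

A forward step would then re-screen the dropped variables by partial correlation against a 0.15 entry threshold; as the text notes, in practice it rarely adds anything back.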

4.3 Profiling on small Y_0

A different allocation mechanism, based on notions of equity rather than efficiency, would assign people to treatment not based on their predicted gains from the program (an efficiency goal) but based on the low outcomes they would experience if not enrolled in the program (an equity goal). We implement an exercise that evaluates the impact gains for the program if individuals were selected for participation not on predicted program impacts but on low values of the outcome in the absence of treatment, Y_0. The methodology is similar to profiling on predicted impacts, only instead of assigning to treatment all individuals with positive predicted program impacts, we assign them in ascending order of their predicted non-participation outcome Y_0. In this situation no endogenous threshold emerges, so we must decide where to impose the cut-off for the profiled sample. We choose two thirds of the sample, but we also perform the analysis with half of the sample assigned to treatment, as well as with one third.

4.4 Profiling at positive sites

As an extension to individual profiling, we combine profiling on the largest predicted individual impacts with profiling of the sites participating in the program, eliminating from the analysis the sites with negative mean impacts. The profiling impacts are generated using the same algorithm as in the base profiling case, with the only difference that the set of sites with negative mean impacts is first identified from separate site regressions and eliminated from the analysis. Computing the experimental impact at positive mean


More information

Introduction to Observational Studies. Jane Pinelis

Introduction to Observational Studies. Jane Pinelis Introduction to Observational Studies Jane Pinelis 22 March 2018 Outline Motivating example Observational studies vs. randomized experiments Observational studies: basics Some adjustment strategies Matching

More information

Unit 1 Exploring and Understanding Data

Unit 1 Exploring and Understanding Data Unit 1 Exploring and Understanding Data Area Principle Bar Chart Boxplot Conditional Distribution Dotplot Empirical Rule Five Number Summary Frequency Distribution Frequency Polygon Histogram Interquartile

More information

Citation for published version (APA): Ebbes, P. (2004). Latent instrumental variables: a new approach to solve for endogeneity s.n.

Citation for published version (APA): Ebbes, P. (2004). Latent instrumental variables: a new approach to solve for endogeneity s.n. University of Groningen Latent instrumental variables Ebbes, P. IMPORTANT NOTE: You are advised to consult the publisher's version (publisher's PDF) if you wish to cite from it. Please check the document

More information

MEA DISCUSSION PAPERS

MEA DISCUSSION PAPERS Inference Problems under a Special Form of Heteroskedasticity Helmut Farbmacher, Heinrich Kögel 03-2015 MEA DISCUSSION PAPERS mea Amalienstr. 33_D-80799 Munich_Phone+49 89 38602-355_Fax +49 89 38602-390_www.mea.mpisoc.mpg.de

More information

Why randomize? Rohini Pande Harvard University and J-PAL.

Why randomize? Rohini Pande Harvard University and J-PAL. Why randomize? Rohini Pande Harvard University and J-PAL www.povertyactionlab.org Agenda I. Background on Program Evaluation II. What is a randomized experiment? III. Advantages and Limitations of Experiments

More information

6. Unusual and Influential Data

6. Unusual and Influential Data Sociology 740 John ox Lecture Notes 6. Unusual and Influential Data Copyright 2014 by John ox Unusual and Influential Data 1 1. Introduction I Linear statistical models make strong assumptions about the

More information

Helmut Farbmacher: Copayments for doctor visits in Germany and the probability of visiting a physician - Evidence from a natural experiment

Helmut Farbmacher: Copayments for doctor visits in Germany and the probability of visiting a physician - Evidence from a natural experiment Helmut Farbmacher: Copayments for doctor visits in Germany and the probability of visiting a physician - Evidence from a natural experiment Munich Discussion Paper No. 2009-10 Department of Economics University

More information

Recent advances in non-experimental comparison group designs

Recent advances in non-experimental comparison group designs Recent advances in non-experimental comparison group designs Elizabeth Stuart Johns Hopkins Bloomberg School of Public Health Department of Mental Health Department of Biostatistics Department of Health

More information

Research to Practice. What Are the Trends in Employment Outcomes of Youth with Autism: ? Alberto Migliore, John Butterworth, & Agnes Zalewska

Research to Practice. What Are the Trends in Employment Outcomes of Youth with Autism: ? Alberto Migliore, John Butterworth, & Agnes Zalewska Research to Practice Issue No. 53 2012 What Are the Trends in Employment Outcomes of Youth with Autism: 2006 2010? Alberto Migliore, John Butterworth, & Agnes Zalewska Introduction In recent years, the

More information

Fit to play but goalless: Labour market outcomes in a cohort of public sector ART patients in Free State province, South Africa

Fit to play but goalless: Labour market outcomes in a cohort of public sector ART patients in Free State province, South Africa Fit to play but goalless: Labour market outcomes in a cohort of public sector ART patients in Free State province, South Africa Frikkie Booysen Department of Economics / Centre for Health Systems Research

More information

Volume 36, Issue 3. David M McEvoy Appalachian State University

Volume 36, Issue 3. David M McEvoy Appalachian State University Volume 36, Issue 3 Loss Aversion and Student Achievement David M McEvoy Appalachian State University Abstract We conduct a field experiment to test if loss aversion behavior can be exploited to improve

More information

Section 4.1. Chapter 4. Classification into Groups: Discriminant Analysis. Introduction: Canonical Discriminant Analysis.

Section 4.1. Chapter 4. Classification into Groups: Discriminant Analysis. Introduction: Canonical Discriminant Analysis. Chapter 4 Classification into Groups: Discriminant Analysis Section 4.1 Introduction: Canonical Discriminant Analysis Understand the goals of discriminant Identify similarities between discriminant analysis

More information

Key questions when starting an econometric project (Angrist & Pischke, 2009):

Key questions when starting an econometric project (Angrist & Pischke, 2009): Econometric & other impact assessment approaches to policy analysis Part 1 1 The problem of causality in policy analysis Internal vs. external validity Key questions when starting an econometric project

More information

Econometric analysis and counterfactual studies in the context of IA practices

Econometric analysis and counterfactual studies in the context of IA practices Econometric analysis and counterfactual studies in the context of IA practices Giulia Santangelo http://crie.jrc.ec.europa.eu Centre for Research on Impact Evaluation DG EMPL - DG JRC CRIE Centre for Research

More information

Working When No One Is Watching: Motivation, Test Scores, and Economic Success

Working When No One Is Watching: Motivation, Test Scores, and Economic Success Working When No One Is Watching: Motivation, Test Scores, and Economic Success Carmit Segal Department of Economics, University of Zurich, Zurich 8006, Switzerland. carmit.segal@econ.uzh.ch This paper

More information

Assignment 4: True or Quasi-Experiment

Assignment 4: True or Quasi-Experiment Assignment 4: True or Quasi-Experiment Objectives: After completing this assignment, you will be able to Evaluate when you must use an experiment to answer a research question Develop statistical hypotheses

More information

Sawtooth Software. The Number of Levels Effect in Conjoint: Where Does It Come From and Can It Be Eliminated? RESEARCH PAPER SERIES

Sawtooth Software. The Number of Levels Effect in Conjoint: Where Does It Come From and Can It Be Eliminated? RESEARCH PAPER SERIES Sawtooth Software RESEARCH PAPER SERIES The Number of Levels Effect in Conjoint: Where Does It Come From and Can It Be Eliminated? Dick Wittink, Yale University Joel Huber, Duke University Peter Zandan,

More information

Keywords: causal channels, causal mechanisms, mediation analysis, direct and indirect effects. Indirect causal channel

Keywords: causal channels, causal mechanisms, mediation analysis, direct and indirect effects. Indirect causal channel Martin Huber University of Fribourg, Switzerland Disentangling policy effects into causal channels Splitting a policy intervention s effect into its causal channels can improve the quality of policy analysis

More information

Causality and Statistical Learning

Causality and Statistical Learning Department of Statistics and Department of Political Science, Columbia University 29 Sept 2012 1. Different questions, different approaches Forward causal inference: What might happen if we do X? Effects

More information

Placebo and Belief Effects: Optimal Design for Randomized Trials

Placebo and Belief Effects: Optimal Design for Randomized Trials Placebo and Belief Effects: Optimal Design for Randomized Trials Scott Ogawa & Ken Onishi 2 Department of Economics Northwestern University Abstract The mere possibility of receiving a placebo during a

More information

Identifying Endogenous Peer Effects in the Spread of Obesity. Abstract

Identifying Endogenous Peer Effects in the Spread of Obesity. Abstract Identifying Endogenous Peer Effects in the Spread of Obesity Timothy J. Halliday 1 Sally Kwak 2 University of Hawaii- Manoa October 2007 Abstract Recent research in the New England Journal of Medicine

More information

A NON-TECHNICAL INTRODUCTION TO REGRESSIONS. David Romer. University of California, Berkeley. January Copyright 2018 by David Romer

A NON-TECHNICAL INTRODUCTION TO REGRESSIONS. David Romer. University of California, Berkeley. January Copyright 2018 by David Romer A NON-TECHNICAL INTRODUCTION TO REGRESSIONS David Romer University of California, Berkeley January 2018 Copyright 2018 by David Romer CONTENTS Preface ii I Introduction 1 II Ordinary Least Squares Regression

More information

Cancer survivorship and labor market attachments: Evidence from MEPS data

Cancer survivorship and labor market attachments: Evidence from MEPS data Cancer survivorship and labor market attachments: Evidence from 2008-2014 MEPS data University of Memphis, Department of Economics January 7, 2018 Presentation outline Motivation and previous literature

More information

UNIVERSITY of PENNSYLVANIA CIS 520: Machine Learning Final, Fall 2014

UNIVERSITY of PENNSYLVANIA CIS 520: Machine Learning Final, Fall 2014 UNIVERSITY of PENNSYLVANIA CIS 520: Machine Learning Final, Fall 2014 Exam policy: This exam allows two one-page, two-sided cheat sheets (i.e. 4 sides); No other materials. Time: 2 hours. Be sure to write

More information

Practitioner s Guide To Stratified Random Sampling: Part 1

Practitioner s Guide To Stratified Random Sampling: Part 1 Practitioner s Guide To Stratified Random Sampling: Part 1 By Brian Kriegler November 30, 2018, 3:53 PM EST This is the first of two articles on stratified random sampling. In the first article, I discuss

More information

Causality and Statistical Learning

Causality and Statistical Learning Department of Statistics and Department of Political Science, Columbia University 27 Mar 2013 1. Different questions, different approaches Forward causal inference: What might happen if we do X? Effects

More information

SAMPLING AND SAMPLE SIZE

SAMPLING AND SAMPLE SIZE SAMPLING AND SAMPLE SIZE Andrew Zeitlin Georgetown University and IGC Rwanda With slides from Ben Olken and the World Bank s Development Impact Evaluation Initiative 2 Review We want to learn how a program

More information

Which Comparison-Group ( Quasi-Experimental ) Study Designs Are Most Likely to Produce Valid Estimates of a Program s Impact?:

Which Comparison-Group ( Quasi-Experimental ) Study Designs Are Most Likely to Produce Valid Estimates of a Program s Impact?: Which Comparison-Group ( Quasi-Experimental ) Study Designs Are Most Likely to Produce Valid Estimates of a Program s Impact?: A Brief Overview and Sample Review Form February 2012 This publication was

More information

Using ASPES (Analysis of Symmetrically- Predicted Endogenous Subgroups) to understand variation in program impacts. Laura R. Peck.

Using ASPES (Analysis of Symmetrically- Predicted Endogenous Subgroups) to understand variation in program impacts. Laura R. Peck. Using ASPES (Analysis of Symmetrically- Predicted Endogenous Subgroups) to understand variation in program impacts Presented by: Laura R. Peck OPRE Methods Meeting on What Works Washington, DC September

More information

Glossary From Running Randomized Evaluations: A Practical Guide, by Rachel Glennerster and Kudzai Takavarasha

Glossary From Running Randomized Evaluations: A Practical Guide, by Rachel Glennerster and Kudzai Takavarasha Glossary From Running Randomized Evaluations: A Practical Guide, by Rachel Glennerster and Kudzai Takavarasha attrition: When data are missing because we are unable to measure the outcomes of some of the

More information

PLANNING THE RESEARCH PROJECT

PLANNING THE RESEARCH PROJECT Van Der Velde / Guide to Business Research Methods First Proof 6.11.2003 4:53pm page 1 Part I PLANNING THE RESEARCH PROJECT Van Der Velde / Guide to Business Research Methods First Proof 6.11.2003 4:53pm

More information

Randomized Evaluations

Randomized Evaluations Randomized Evaluations Introduction, Methodology, & Basic Econometrics using Mexico s Progresa program as a case study (with thanks to Clair Null, author of 2008 Notes) Sept. 15, 2009 Not All Correlations

More information

Do children in private Schools learn more than in public Schools? Evidence from Mexico

Do children in private Schools learn more than in public Schools? Evidence from Mexico MPRA Munich Personal RePEc Archive Do children in private Schools learn more than in public Schools? Evidence from Mexico Florian Wendelspiess Chávez Juárez University of Geneva, Department of Economics

More information

The Regression-Discontinuity Design

The Regression-Discontinuity Design Page 1 of 10 Home» Design» Quasi-Experimental Design» The Regression-Discontinuity Design The regression-discontinuity design. What a terrible name! In everyday language both parts of the term have connotations

More information

Methodology for Non-Randomized Clinical Trials: Propensity Score Analysis Dan Conroy, Ph.D., inventiv Health, Burlington, MA

Methodology for Non-Randomized Clinical Trials: Propensity Score Analysis Dan Conroy, Ph.D., inventiv Health, Burlington, MA PharmaSUG 2014 - Paper SP08 Methodology for Non-Randomized Clinical Trials: Propensity Score Analysis Dan Conroy, Ph.D., inventiv Health, Burlington, MA ABSTRACT Randomized clinical trials serve as the

More information

Ec331: Research in Applied Economics Spring term, Panel Data: brief outlines

Ec331: Research in Applied Economics Spring term, Panel Data: brief outlines Ec331: Research in Applied Economics Spring term, 2014 Panel Data: brief outlines Remaining structure Final Presentations (5%) Fridays, 9-10 in H3.45. 15 mins, 8 slides maximum Wk.6 Labour Supply - Wilfred

More information

Module 14: Missing Data Concepts

Module 14: Missing Data Concepts Module 14: Missing Data Concepts Jonathan Bartlett & James Carpenter London School of Hygiene & Tropical Medicine Supported by ESRC grant RES 189-25-0103 and MRC grant G0900724 Pre-requisites Module 3

More information

SRDC Technical Paper Series How Random Must Random Assignment Be in Random Assignment Experiments?

SRDC Technical Paper Series How Random Must Random Assignment Be in Random Assignment Experiments? SRDC Technical Paper Series 03-01 How Random Must Random Assignment Be in Random Assignment Experiments? Paul Gustafson Department of Statistics University of British Columbia February 2003 SOCIAL RESEARCH

More information

The Logic of Causal Order Richard Williams, University of Notre Dame, https://www3.nd.edu/~rwilliam/ Last revised February 15, 2015

The Logic of Causal Order Richard Williams, University of Notre Dame, https://www3.nd.edu/~rwilliam/ Last revised February 15, 2015 The Logic of Causal Order Richard Williams, University of Notre Dame, https://www3.nd.edu/~rwilliam/ Last revised February 15, 2015 [NOTE: Toolbook files will be used when presenting this material] First,

More information

From Field Experiments to Program Implementation: Assessing the Potential Outcomes of an Experimental Intervention Program for Unemployed Persons 1

From Field Experiments to Program Implementation: Assessing the Potential Outcomes of an Experimental Intervention Program for Unemployed Persons 1 American Journal of Community Psychology, Vol. 19, No. 4, 1991 From Field Experiments to Program Implementation: Assessing the Potential Outcomes of an Experimental Intervention Program for Unemployed

More information

Measuring Impact. Program and Policy Evaluation with Observational Data. Daniel L. Millimet. Southern Methodist University.

Measuring Impact. Program and Policy Evaluation with Observational Data. Daniel L. Millimet. Southern Methodist University. Measuring mpact Program and Policy Evaluation with Observational Data Daniel L. Millimet Southern Methodist University 23 May 2013 DL Millimet (SMU) Observational Data May 2013 1 / 23 ntroduction Measuring

More information

CHAPTER 2: TWO-VARIABLE REGRESSION ANALYSIS: SOME BASIC IDEAS

CHAPTER 2: TWO-VARIABLE REGRESSION ANALYSIS: SOME BASIC IDEAS CHAPTER 2: TWO-VARIABLE REGRESSION ANALYSIS: SOME BASIC IDEAS 2.1 It tells how the mean or average response of the sub-populations of Y varies with the fixed values of the explanatory variable (s). 2.2

More information

MARK SCHEME for the May/June 2011 question paper for the guidance of teachers 9699 SOCIOLOGY

MARK SCHEME for the May/June 2011 question paper for the guidance of teachers 9699 SOCIOLOGY UNIVERSITY OF CAMBRIDGE INTERNATIONAL EXAMINATIONS GCE Advanced Subsidiary Level and GCE Advanced Level MARK SCHEME for the May/June 2011 question paper for the guidance of teachers 9699 SOCIOLOGY 9699/23

More information

Constructing AFQT Scores that are Comparable Across the NLSY79 and the NLSY97. Joseph G. Altonji Prashant Bharadwaj Fabian Lange.

Constructing AFQT Scores that are Comparable Across the NLSY79 and the NLSY97. Joseph G. Altonji Prashant Bharadwaj Fabian Lange. Constructing AFQT Scores that are Comparable Across the NLSY79 and the NLSY97 Introduction Joseph G. Altonji Prashant Bharadwaj Fabian Lange August 2009 Social and behavioral scientists routinely use and

More information

Evaluating the Regression Discontinuity Design Using Experimental Data

Evaluating the Regression Discontinuity Design Using Experimental Data Evaluating the Regression Discontinuity Design Using Experimental Data Dan Black University of Chicago and NORC danblack@uchicago.edu Jose Galdo McMaster University and IZA galdojo@mcmaster.ca Jeffrey

More information

Lecture 4: Research Approaches

Lecture 4: Research Approaches Lecture 4: Research Approaches Lecture Objectives Theories in research Research design approaches ú Experimental vs. non-experimental ú Cross-sectional and longitudinal ú Descriptive approaches How to

More information

UN Handbook Ch. 7 'Managing sources of non-sampling error': recommendations on response rates

UN Handbook Ch. 7 'Managing sources of non-sampling error': recommendations on response rates JOINT EU/OECD WORKSHOP ON RECENT DEVELOPMENTS IN BUSINESS AND CONSUMER SURVEYS Methodological session II: Task Force & UN Handbook on conduct of surveys response rates, weighting and accuracy UN Handbook

More information

Trends in Ohioans Health Status and Income

Trends in Ohioans Health Status and Income October 200 Trends in Ohioans Health Status and Income Since 2005, household incomes in Ohio have steadily declined. In 2005, 65% of Ohio adults were living in households with an annual income over 200%

More information

Meta-Analysis. Zifei Liu. Biological and Agricultural Engineering

Meta-Analysis. Zifei Liu. Biological and Agricultural Engineering Meta-Analysis Zifei Liu What is a meta-analysis; why perform a metaanalysis? How a meta-analysis work some basic concepts and principles Steps of Meta-analysis Cautions on meta-analysis 2 What is Meta-analysis

More information

Lecture II: Difference in Difference. Causality is difficult to Show from cross

Lecture II: Difference in Difference. Causality is difficult to Show from cross Review Lecture II: Regression Discontinuity and Difference in Difference From Lecture I Causality is difficult to Show from cross sectional observational studies What caused what? X caused Y, Y caused

More information