Using ASPES (Analysis of Symmetrically-Predicted Endogenous Subgroups) to understand variation in program impacts. Laura R. Peck.


Using ASPES (Analysis of Symmetrically-Predicted Endogenous Subgroups) to understand variation in program impacts
Presented by: Laura R. Peck
OPRE Methods Meeting on What Works, Washington, DC, September 3-4, 2014

Today's Agenda

- Motivating Challenge & Solutions
- Logic & Execution of ASPES
- Illustration: Supporting Healthy Marriage program
- Ideal Conditions for ASPES
- Conclusion

Abt Associates

Motivating Challenge

"Policy guidance requires more than just an estimate of the net effects of a program or policy; it is also necessary to understand the circumstances under which a program or policy has effects, and how and why it works." (OPRE Meeting Summary)

Relevant policy questions:
- What are the effects of participating in the intervention (not just being offered access)?
- How does variation in dosage affect program impacts?
- How does exposure to various levels of program quality influence program impacts?
- What is the effect of participating in component A (or B or C) of a multi-faceted intervention?

Possible Solutions & Limitations (that use the experimental design)

Instrumental Variables
  Analytic approach: works for no-shows; assumes the only pathway of randomization's effect is through participation
  Limitation: applicable only where we have two groups, and one experiences no effect of being offered treatment

Propensity Score Matching
  Analytic approach: predicts status in one arm to find matched counterparts in other arm
  Limitation: omitted variables in prediction create inconsistent impact estimates

Principal Stratification
  Analytic approach: uses status in each arm to predict and compare potential outcomes in other arm
  Limitation: applicable when subgroup is observed in both experimental arms
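
As a hedged illustration of the instrumental-variables entry above: when treatment-group no-shows are assumed to experience no effect of the offer, and no control-group members receive the program, the impact on participants equals the overall (intent-to-treat) impact divided by the participation rate. This is commonly called the no-show adjustment. The function name, argument names, and numbers below are hypothetical and are not taken from the presentation.

```python
def no_show_adjusted_impact(itt_impact: float, participation_rate: float) -> float:
    """Impact per participant, assuming no-shows experience zero impact.

    itt_impact: difference in mean outcomes between everyone assigned to
        treatment and everyone assigned to control.
    participation_rate: share of the treatment group that took up the offer.
    """
    if not 0.0 < participation_rate <= 1.0:
        raise ValueError("participation_rate must be in (0, 1]")
    # All of the average impact is attributed to those who participated.
    return itt_impact / participation_rate


# Made-up numbers: an overall impact of 2.0 with half the group participating.
print(no_show_adjusted_impact(itt_impact=2.0, participation_rate=0.5))
```

The adjustment only rescales the experimental estimate, which is why it depends entirely on the assumption that the offer has no effect on no-shows.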

Analysis of Symmetrically-Predicted Endogenous Subgroups (ASPES)

Logic: any group in one experimental arm has a counterpart in the other arm.

[Diagram: the treatment group and the control group, each divided into High, Medium, and Low subgroups]

Low, medium or high:
- Dosage exposure
- Quality experience
- Likelihood of program component take-up
- Risk of drop-out

ASPES Defined & Compared

Endogenous subgroups = post-assignment events, experiences
Symmetrically-predicted = leverages experimental design in identifying subgroups

As opposed to IV strategy (all impact among takers):
[Diagram: treatment group and control group, with segments labeled "no shows", "took up offer", and "took up offer"]

As opposed to (asymmetric) PSM strategy:

Treatment Group    Control Group
actual high        predicted high
actual low         predicted low

Execution of ASPES

Step 1: Use baseline (exogenous) characteristics to predict subgroup membership
  Leveraging the experimental design: use out-of-sample prediction (to ensure T and C subgroups are symmetric)

Step 2: Estimate impacts on predicted subgroups
  Achieving internal validity: the difference between T and C subgroup mean outcomes is an unbiased treatment effect (but for a blend of correctly and incorrectly predicted actuals)

Step 3: Convert estimated impacts for predicted subgroups into impacts for actual subgroups
  Achieving external validity: use assumptions to convert the experimental impact estimate
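
Steps 1 and 2 can be sketched roughly as follows. This is a minimal illustration under loud assumptions, not the presenter's implementation: a one-covariate threshold rule stands in for a real prediction model, a single random split of the treatment group stands in for whatever out-of-sample scheme a study would use, and every name and data value is hypothetical.

```python
import random
from statistics import mean


def fit_threshold(x_values, is_high):
    """Stand-in classifier: split one baseline covariate at the midpoint
    between the mean of actual-high cases and the mean of actual-low cases."""
    mean_high = mean([x for x, h in zip(x_values, is_high) if h])
    mean_low = mean([x for x, h in zip(x_values, is_high) if not h])
    return (mean_high + mean_low) / 2.0, mean_high >= mean_low


def predict_high(x_values, threshold, high_is_above):
    """Apply the fitted rule to any sample (treatment or control)."""
    return [(x >= threshold) == high_is_above for x in x_values]


def predicted_subgroup_impacts(treat, control, fit_share=0.5, seed=0):
    """treat: list of (baseline_x, actual_high, outcome) tuples.
    control: list of (baseline_x, outcome) tuples (subgroup status unobserved).
    Returns the T-minus-C mean outcome difference within each predicted subgroup."""
    rng = random.Random(seed)
    shuffled = treat[:]
    rng.shuffle(shuffled)
    n_fit = int(len(shuffled) * fit_share)
    fit_sample, holdout = shuffled[:n_fit], shuffled[n_fit:]

    # Step 1: fit on a random treatment subsample, using baseline data only,
    # then predict out of sample for held-out treatment cases and for all
    # control cases, so both arms are classified by the same rule.
    threshold, above = fit_threshold([r[0] for r in fit_sample],
                                     [r[1] for r in fit_sample])
    t_pred = predict_high([r[0] for r in holdout], threshold, above)
    c_pred = predict_high([r[0] for r in control], threshold, above)

    # Step 2: compare mean outcomes of symmetric predicted subgroups.
    impacts = {}
    for label, flag in (("high", True), ("low", False)):
        t_outcomes = [r[2] for r, p in zip(holdout, t_pred) if p == flag]
        c_outcomes = [r[1] for r, p in zip(control, c_pred) if p == flag]
        impacts[label] = mean(t_outcomes) - mean(c_outcomes)
    return impacts


# Synthetic data: only treatment cases with x > 0 are "high" and gain 2.0.
grid = [i / 10 for i in range(-10, 11)] * 20
treat_demo = [(x, x > 0, 2.0 if x > 0 else 0.0) for x in grid]
control_demo = [(x, 0.0) for x in grid]
demo_impacts = predicted_subgroup_impacts(treat_demo, control_demo)
print(demo_impacts)
```

The treatment cases used to fit the rule are excluded from the impact comparison, which is one simple way to keep the treatment-group prediction out of sample; the resulting impacts are for predicted subgroups and still need the Step 3 conversion.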

Execution of ASPES (cont.)

Step 3: Convert estimated impacts for predicted subgroups into impacts for actual subgroups

Consider that the impact on predicteds is a weighted sum of the impacts on actuals, where the weights involve correct prediction rates. Writing $I_k^{P}$ for the impact on predicted subgroup $k$, $I_k^{A}$ for the impact on actual subgroup $k$, and $p_k$ for the correct prediction rate of subgroup $k$:

\[ I_1^{P} = p_1 I_1^{A} + (1 - p_1) I_2^{A} \]
\[ I_2^{P} = p_2 I_2^{A} + (1 - p_2) I_1^{A} \]

Use homogeneity assumption (and algebra) to solve for the impacts on actuals:

\[ I_1^{A} = \frac{p_2 I_1^{P} - (1 - p_1) I_2^{P}}{p_1 + p_2 - 1} \]
\[ I_2^{A} = \frac{p_1 I_2^{P} - (1 - p_2) I_1^{P}}{p_1 + p_2 - 1} \]
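
The two-subgroup conversion above can be sketched in a few lines. The function names and numbers are hypothetical; the sketch assumes exactly two subgroups and treats the correct prediction rates p1 and p2 as known.

```python
def actual_to_predicted(i1_actual, i2_actual, p1, p2):
    """Impacts on predicted subgroups as weighted sums of impacts on actuals."""
    i1_pred = p1 * i1_actual + (1.0 - p1) * i2_actual
    i2_pred = p2 * i2_actual + (1.0 - p2) * i1_actual
    return i1_pred, i2_pred


def predicted_to_actual(i1_pred, i2_pred, p1, p2):
    """Invert the weighted sums to recover impacts on actual subgroups."""
    denom = p1 + p2 - 1.0
    if abs(denom) < 1e-12:
        # Prediction no better than chance: the system cannot be inverted.
        raise ValueError("p1 + p2 must differ from 1")
    i1_actual = (p2 * i1_pred - (1.0 - p1) * i2_pred) / denom
    i2_actual = (p1 * i2_pred - (1.0 - p2) * i1_pred) / denom
    return i1_actual, i2_actual


# Round trip with made-up values: start from actual impacts, blend them into
# predicted-subgroup impacts, then convert back.
pred = actual_to_predicted(1.0, 3.0, p1=0.7, p2=0.8)
print(pred)
print(predicted_to_actual(pred[0], pred[1], p1=0.7, p2=0.8))
```

The denominator shrinks toward zero as p1 + p2 approaches 1, so weak first-stage prediction magnifies both the converted impacts and their noise.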

Illustrative Example: SHM

What is the impact of the number of hours of SHM participation (which is endogenous) on couples' marital stability and relationship happiness?

We know Ts' dosage:
- 37% High (24+ hrs)
- 16% Low (0-6 hrs)
- 37% Med (7-23 hrs)

Illustrative Example: SHM (cont.)

- Select random subsamples of the treatment group from which to predict SHM dosage levels: low (0-6 hrs), medium (7-23 hrs) and high (24+ hrs)
- Using baseline characteristics, predict dosage (used multinomial logit):
  - Personal and couple characteristics: earnings, education, age, race, ethnicity, children, psychological measures, marital tenure, communication, satisfaction measures
  - Program characteristics: site dummies
- Use the resulting predicted dosage variable to symmetrically identify subgroups in the treatment and control groups, then compare groups' outcomes and convert

Illustrative Example: SHM (cont.)

Selected outcomes, by impact analyzed:

Impact Analyzed                Relationship Status: % married    Average Happiness level
Overall Study Sample            0.80                              0.15 ***
Predicted Low-Dosage Group     -0.40                              0.12
Predicted High-Dosage Group     2.20 ***                          0.15 ***
  between-group diffs           n.s.                              n.s.
Actual Low-Dosage Group        -5.10                             -0.07
Actual High-Dosage Group        6.10                              0.16
  between-group diffs           n.s.                              n.s.

[Diagram: predicted high, medium, and low subgroups are identified in both the treatment group and the control group, then converted to actual high, medium, and low subgroups]

Ideal Conditions for ASPES

- Baseline data
  - Standard: demographics, education, work/earnings history
  - "Unobservables": motivation, behaviors, preferences
- First stage prediction success
  - Better than chance?
- Predicted-to-actual conversion assumption credibility
- Sample size
  - OLS → IV requires greater sample
  - 1,000 for 0.30 ES within predicted subgroup
  - 12,500 for 0.30 ES within actual (sample, prediction, noise dependent)
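
One way to see why impacts on actual subgroups demand more sample is to look at the variance of the two-subgroup conversion. The derivation below is a sketch under assumptions not stated on the slide: $\hat{I}_1^{P}$ and $\hat{I}_2^{P}$ denote the estimated impacts on the two predicted subgroups and are treated as independent, $\hat{I}_1^{A}$ is the converted impact on actual subgroup 1, and the correct prediction rates $p_1, p_2$ are treated as known constants. The numeric cases are illustrative and are not meant to reproduce the sample sizes listed above.

```latex
% Converted estimator for actual subgroup 1 (two-subgroup case)
\hat{I}_1^{A} = \frac{p_2 \hat{I}_1^{P} - (1 - p_1)\hat{I}_2^{P}}{p_1 + p_2 - 1}

% Variance, with independent predicted-subgroup estimates and fixed p_1, p_2
\operatorname{Var}\bigl(\hat{I}_1^{A}\bigr)
  = \frac{p_2^{2}\operatorname{Var}\bigl(\hat{I}_1^{P}\bigr)
        + (1 - p_1)^{2}\operatorname{Var}\bigl(\hat{I}_2^{P}\bigr)}
         {(p_1 + p_2 - 1)^{2}}

% Special case p_1 = p_2 = p and equal variances \sigma^2
\operatorname{Var}\bigl(\hat{I}_1^{A}\bigr)
  = \frac{p^{2} + (1 - p)^{2}}{(2p - 1)^{2}}\,\sigma^{2}

% p = 0.7:  (0.49 + 0.09) / 0.16 = 3.625
% p = 0.6:  (0.36 + 0.16) / 0.04 = 13
```

Because required sample size scales with variance, modest drops in prediction accuracy can multiply the sample needed to detect a given effect size within actual subgroups.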

Conclusion

Assumptions
- Tradeoffs: exchange IV's exclusion restriction, for example, for ASPES's homogeneity assumption

Research Questions
- Participation, including potential effects on no-shows
  - MTO: what is the effect of using a voucher when offered?
- Treatment dosage or quality
  - HSIS: what generates greater impacts: two years, rather than one? Being in a better quality center?
- Multi-faceted treatment components/pathways
  - HPOG: what is it about the intervention that drives impacts? (experience of boot camp style orientation; participation in facilitated peer support groups; use of emergency assistance)

References

Angrist, Joshua D., Guido W. Imbens, & Donald B. Rubin. (1996). Identification of Causal Effects Using Instrumental Variables. Journal of the American Statistical Association, 91(434), 444-455. DOI: 10.2307/2291629

Bell, Stephen H. & Laura R. Peck. (2013). Using Symmetric Prediction of Endogenous Subgroups for Causal Inferences about Program Effects under Robust Assumptions: Part Two of a Method Note in Three Parts. American Journal of Evaluation, 34(3), 413-426. DOI: 10.1177/1098214013489338

Bloom, Howard S. (1984). Accounting for No-shows in Experimental Evaluation Designs. Evaluation Review, 8(2), 225-246. DOI: 10.1177/0193841X8400800205

Frangakis, Constantine E. & Donald B. Rubin. (2002). Principal Stratification in Causal Inference. Biometrics, 58(1), 21-29.

Harvill, Eleanor L., Laura R. Peck, & Stephen H. Bell. (2013). On Overfitting in Analysis of Symmetrically Predicted Endogenous Subgroups from Randomized Experimental Samples: Part Three of a Method Note in Three Parts. American Journal of Evaluation, 34(4), 545-556. DOI: 10.1177/1098214013503201

Moulton, Shawn, Laura R. Peck, & Keri-Nicole Dillman. (2014). Moving to Opportunity's Impact on Health and Well-being Among High Dosage Participants. Housing Policy Debate, 24(2), 415-446. DOI: 10.1080/10511482.2013.875051

Peck, Laura R. (2003). Subgroup Analysis in Social Experiments: Measuring Program Impacts Based on Post-Treatment Choice. American Journal of Evaluation, 24(2), 157-187. DOI: 10.1016/S1098-2140(03)00031-6

Peck, Laura R. (2007). What are the Effects of Welfare Sanction Policies? Or, Using Propensity Scores as a Subgroup Indicator to Learn More from Social Experiments. American Journal of Evaluation, 28(3), 256-274. DOI: 10.1177/1098214007304129

Peck, Laura R. (2013). On Analysis of Symmetrically-Predicted Endogenous Subgroups: Part One of a Method Note in Three Parts. American Journal of Evaluation, 34(2), 225-236. DOI: 10.1177/1098214013481666

Peck, Laura R. & Stephen H. Bell. (2014). The Role of Program Quality in Determining Head Start's Impact on Child Development. OPRE Report #2014-10, Washington, DC: Office of Planning, Research and Evaluation, Administration for Children and Families, U.S. Department of Health and Human Services.

Schochet, Peter Z., & John Burghardt. (2007). Using Propensity Scoring to Estimate Program-Related Subgroup Impacts in Experimental Program Evaluations. Evaluation Review, 31(2), 95-120.

For more information, please contact:
Laura R. Peck
Principal Scientist, Social & Economic Policy Division
T: 301.347.5537
E: Laura_Peck@abtassoc.com