Research Design: Beyond Randomized Control Trials
Jody Worley, Ph.D., College of Arts & Sciences, Human Relations


Introduction to the series
Day 1: Nonrandomized Designs
Day 2: Sampling Strategies
Day 3: Matching Techniques for Balanced Designs

Nonrandomized Designs
1. What are Randomized Control Trials (RCT), and why are they considered the gold standard in research design?
2. If the primary goal in quantitative research is to establish some degree of certainty about causal relationships, are there any viable alternatives to RCT?
3. Non-randomized designs are the only option when random assignment is not possible or not ethically appropriate. What are our options, then, among the various non-randomized design protocols? What are the critical considerations for each research design protocol? What are the advantages and disadvantages of each approach?

What are Randomized Control Trials (RCT), and why are they considered the gold standard?
- Strong control over extraneous variables
- Primary goal is understanding the average overall benefit and risk
- Efficacious and effective:
  Efficacy refers to the beneficial effects of a treatment under optimal conditions.
  Effectiveness refers to its effects under more real-world conditions.

Potential Reasons for Differences Between the Efficacy and Effectiveness of a Treatment
Study design
- Setting: standardized, highly controlled vs. naturalistic, unsystematic
- Outcomes may include less meaningful endpoints than desired in a naturalistic setting
Participant selection
- Strict inclusion/exclusion criteria vs. eligible, available participants
- Randomization addresses confounding vs. unknown, unmeasured factors
Treatment/program implementation
- Complex and multifaceted treatments or programs are challenging to implement
- Differences in the procedural experience of researchers/providers influence outcomes

Are there any viable alternatives to RCT? Or is any other approach to establishing causality just a dubious search for fool's gold?
- Models of Causality
- Definition of Effect
- Theory of Cause
- Causal Generalization
- Implications for Design and Analysis

Models of Causality
- Donald T. Campbell's model (Campbell Causal Model, CCM)
- Donald B. Rubin's model (Rubin Causal Model, RCM)
- Judea Pearl's model (Pearl Causal Model, PCM)

Campbell's Model
- Validity typology
- Threats to validity
- Implications for research design

Campbell's Validity Typology
- Statistical conclusion validity
- Internal validity
- Construct validity
- External validity

Threats to Validity
Plausible rival hypotheses, or alternative explanations. For example, one may infer that results from a nonrandomized experiment support the hypothesis that the treatment worked. It is possible to be wrong in many ways:
- History
- Regression
- Maturation
- Selection-treatment interaction

Definition of Effect: Knowledge of the counterfactual (Campbell)
A counterfactual is a condition that would occur if some part of the world were different than it really is. The effect of a cause is the difference between what did happen to the person who received the cause (the fact) and what would have happened to that person had they not received the cause (the counterfactual).
The Average Causal Effect (ACE) is a difference at the population level, ACE = E[δ] = E[Y₁] − E[Y₀]: the expected outcome when the treatment is received minus the expected outcome when the treatment is not administered.
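To make the notation concrete, here is a minimal simulation sketch in Python (all numbers are invented for illustration, not from the talk). It shows why the ACE is directly computable in a simulation, where both potential outcomes are visible, but only estimable in a real study, where each unit reveals just one of them:

```python
# Hypothetical simulation of the average causal effect (ACE) as a
# difference in potential outcomes. Effect sizes are illustrative.
import numpy as np

rng = np.random.default_rng(42)
n = 100_000

# Potential outcomes for every unit: Y0 (untreated) and Y1 (treated).
y0 = rng.normal(loc=50, scale=10, size=n)
y1 = y0 + rng.normal(loc=5, scale=2, size=n)   # individual effect delta_i

# ACE = E[delta] = E[Y1] - E[Y0]; computable here only because the
# simulation lets us see both potential outcomes for the same unit.
ace = (y1 - y0).mean()
print(f"ACE = {ace:.2f}")                       # ~5.0

# In a real study only one potential outcome per unit is observed
# (the counterfactual never is). Under random assignment, the
# difference in observed group means still estimates the ACE.
treated = rng.random(n) < 0.5
estimate = y1[treated].mean() - y0[~treated].mean()
print(f"Randomized estimate = {estimate:.2f}")  # close to the ACE
```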

Definition of Effect: Potential outcomes (Rubin)
Rubin notes that all potential outcomes could, in principle, be observed until treatment is assigned; after assignment, only some can be observed. By definition, counterfactuals can never be observed: it is not possible to observe a fact that does not exist. The RCM developed a propensity score logic for observational studies, grounded in what researchers know about regular (RCT) designs.
See: Rubin (2004, 2005); Morgan & Winship (2007); Winship & Morgan (1999)

Theory of Cause
- Clear operationalization of the treatment
- Full implementation of the treatment
- Causes must be manipulable, and cause includes whatever was manipulated
CCM aspires to be a theory of generalized inference; RCM has a narrower purpose, to define an effect clearly and precisely in single experiments. Bandwidth versus fidelity.

Causal Generalization
- External validity and the theory of construct validity
- Meta-analysis plays a key role
Neither RCM nor CCM has been particularly successful in generating applications of their ideas on causal generalization; priority is given to internal validity.

Implications for Design and Analysis
Internal validity is the priority concern.
- During the design stage (before data collection): minimize the number of rival hypotheses and plausible alternative explanations.
- After a study is completed: assess remaining threats to validity.
This is what led to the development of some of the more complex and rigorous nonrandomized designs for use when RCTs are not possible or feasible.

Non-randomized Designs
What to do when random assignment is not possible or not ethically appropriate:
- What are our options among the various non-randomized design protocols?
- What are the critical considerations for each research design protocol?
- What are the advantages and disadvantages of each approach?

Non-randomized Designs
What to do when random assignment is not possible or not ethically appropriate:
- Quasi-experimental: nonequivalent control group
- Regression discontinuity design
- Interrupted time-series design with control series
- Complex pattern matching and propensity scores

Quasi-Experimental Designs
A quasi-experimental design is one that looks a bit like an experimental design but lacks the key ingredient: random assignment.
- The nonequivalent control group design requires a pretest and posttest for a treated and a comparison group.
- The regression discontinuity design (RDD) is a quasi-experimental pretest-posttest design that estimates the causal effects of interventions by assigning the intervention according to a cutoff or threshold on an assignment variable: units above (or below) the cutoff receive the intervention. (A minimal sketch follows.)
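The following Python sketch illustrates the RDD idea on simulated data (the cutoff, effect size, and bandwidth are illustrative assumptions, not values from the workshop). The estimated treatment effect is the jump in the outcome at the threshold:

```python
# Minimal regression-discontinuity sketch on simulated data.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
n, cutoff, true_effect = 5_000, 0.0, 2.0

x = rng.uniform(-1, 1, n)                # assignment (running) variable
treated = (x >= cutoff).astype(float)    # intervention assigned by threshold
y = 1.5 * x + true_effect * treated + rng.normal(0, 1, n)

# Local linear regression within a bandwidth around the cutoff, with
# separate slopes on each side; the coefficient on `treated` estimates
# the discontinuity (the causal effect) at the threshold.
h = 0.25
win = np.abs(x - cutoff) <= h
X = np.column_stack([treated, x - cutoff, treated * (x - cutoff)])[win]
fit = sm.OLS(y[win], sm.add_constant(X)).fit()
print(f"Estimated effect at cutoff: {fit.params[1]:.2f}")  # ~2.0
```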

Interrupted Time-Series with a Control Series
Schema of a quasi-experiment using the ITS-CG design (O = observation, X = intervention):

Treatment Group:  O1 O2 O3 O4  X  O5 O6 O7 O8
Comparison Group: O1 O2 O3 O4  -  O5 O6 O7 O8
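One common way to analyze this schema is segmented regression on the stacked series, where the group-by-post-intervention interaction estimates the intervention's level change. Below is a Python sketch on simulated data matching the O1..O8 schema above (series lengths, noise, and the 3-point level shift are invented for illustration):

```python
# Sketch of an interrupted time-series analysis with a control series.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(1)
t = np.arange(8)                 # O1..O8; intervention X occurs after O4
post = (t >= 4).astype(float)

def series(level_shift):
    # Common baseline level and trend, plus a post-intervention shift.
    return 10 + 0.5 * t + level_shift * post + rng.normal(0, 0.3, len(t))

y_treat = series(3.0)            # treatment group: level shift after X
y_ctrl = series(0.0)             # comparison group: no intervention

# Stack both series; the group*post interaction is the ITS-with-control
# estimate of the intervention's level change.
y = np.concatenate([y_treat, y_ctrl])
group = np.concatenate([np.ones(8), np.zeros(8)])
tt, pp = np.tile(t, 2), np.tile(post, 2)
X = sm.add_constant(np.column_stack([group, tt, pp, group * pp]))
fit = sm.OLS(y, X).fit()
print(f"Estimated level change: {fit.params[4]:.2f}")  # ~3.0
```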

Matching Procedures (a bare-bones propensity score sketch follows this list)
- Metric approaches: Mahalanobis distance matching
- Nearest neighbor: propensity score matching
- Coarsened exact matching and caliper-based approaches
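As a concrete illustration of nearest-neighbor propensity score matching, here is a minimal Python sketch on simulated data (covariates, sample size, and the true effect of 2.0 are illustrative assumptions, not workshop material):

```python
# Bare-bones nearest-neighbor propensity score matching on simulated data.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(2)
n = 2_000
x = rng.normal(size=(n, 3))                       # observed covariates
p_treat = 1 / (1 + np.exp(-(x[:, 0] + 0.5 * x[:, 1])))
z = rng.random(n) < p_treat                       # selection into treatment
y = x @ np.array([1.0, 1.0, 0.5]) + 2.0 * z + rng.normal(0, 1, n)

# 1. Model the probability of treatment given the observed covariates.
ps = LogisticRegression().fit(x, z).predict_proba(x)[:, 1]

# 2. Match each treated unit to the control nearest on the score
#    (with replacement, no caliper, for simplicity).
treated_idx = np.where(z)[0]
control_idx = np.where(~z)[0]
matches = control_idx[
    np.abs(ps[control_idx][None, :] - ps[treated_idx][:, None]).argmin(axis=1)
]

# 3. Average treated-minus-matched-control difference estimates the
#    effect of treatment on the treated (ATT), assuming no unmeasured
#    confounding -- the key limitation relative to an RCT.
att = (y[treated_idx] - y[matches]).mean()
print(f"ATT estimate after matching: {att:.2f}")   # ~2.0
```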

Strengths and Weaknesses Relative to RCT
Strengths
- Large and diverse samples in real-world settings
- Opportunity for insight where multiple or complex phenomena occur simultaneously
- Relatively inexpensive
Weaknesses
- Confounding or selection bias of participants
- Difficult to compare nonequivalent groups
- Multiple methodological approaches are often available but may be inconsistently applied or reported

Considerations
When planning or interpreting the results of nonrandomized studies:
1. Does the study sample fit the hypothesis and the target population of interest?
2. Is the size of the sample adequate to answer the research question?
3. Is the study design conducive to addressing or answering the research question?
4. Is the treatment exposure determined accurately? Was exposure assessed before the outcome occurred? Can the duration of exposure be quantified and the treatment-response relationship be explained (e.g., was the exposure sufficient)?
5. Is the outcome measured accurately, and is it relevant for practice?
6. Are confounding factors measured accurately enough to control for confounding? Are there any known confounding variables that are not measured?
7. Is there any attrition or loss to follow-up?
8. Are the statistical methods and their assumptions suitable for the research question?
9. (If using secondary data) Do the data fit or address the research question?

So, if RCT is not an option, then what?
- Representative sample, and
- Matched comparisons
Strive for internal validity.

Any Questions?