EPSE 594: Meta-Analysis: Quantitative Research Synthesis


EPSE 594: Meta-Analysis: Quantitative Research Synthesis
Ed Kroc
University of British Columbia
ed.kroc@ubc.ca
March 28, 2019

Ed Kroc (UBC) EPSE 594 March 28, 2019 1 / 32

Last Time

- Publication bias
- Funnel plots, trim-and-fill procedures

Today

- Simpson's Paradox
- Psychometric considerations in meta-analysis

Funnel plots

A useful visual tool for diagnosing possible publication bias is a funnel plot:

- Plot each study's outcome effect size against its standard error.
- If the scatter of points forms a symmetric blob around the summary effect size, there is no evidence of significance bias.
- If the scatter of points trails off to the right (positive effect sizes) or to the left (negative effect sizes), we have possible evidence of significance bias.
- Note: it is typical to draw a triangle (funnel) around the scatterplot of points: the triangle is centred at the summary effect, with vertex angle defined by the 95% CI of the summary effect.
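The mechanism that produces funnel asymmetry can be sketched in a small simulation (all numbers here are illustrative assumptions, not from the lecture): simulate many studies of one fixed true effect, then discard the nonsignificant ones to mimic publication bias. The surviving (effect size, standard error) pairs are exactly what a funnel plot displays, and their mean is inflated.

```python
import numpy as np

rng = np.random.default_rng(594)

true_effect = 0.2
n_studies = 2000
n_per_study = rng.integers(20, 400, size=n_studies)  # varying sample sizes

se = 1.0 / np.sqrt(n_per_study)        # SE of a standardized mean difference
observed = rng.normal(true_effect, se)  # each study's observed effect size

z = observed / se
published = observed[np.abs(z) > 1.96]  # keep only "significant" studies

print(round(observed.mean(), 3))   # all studies: centred near the true 0.2
print(round(published.mean(), 3))  # surviving studies: inflated upward
```

Plotting `published` against its standard errors would show the classic one-sided trail of small, noisy, significant studies.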

Funnel plots

[Figure: funnel plot for Zheng et al. (2016); no evidence of PB.]

Funnel plots

[Figure: funnel plot for a hypothetical meta-analysis; possible evidence of PB.]

Asymmetry in funnel plots does not always imply PB

Notice that we have been careful to say that nonsymmetric plots show only possible evidence of publication bias. This is because:

- Under a random effects model, we should expect some variation in the true effect sizes.
- Moreover, true effect size is often correlated with sample size (and so with standard error). For example, when meta-analyzing well-designed RCTs, studies targeting smaller effect sizes will have larger sample sizes (to achieve reasonable power). Thus, we might expect studies with smaller standard errors to arise from studies estimating smaller true effect sizes.
- More generally, a moderating variable may explain the asymmetry in the funnel plot.

Asymmetry in funnel plots does not always imply PB

[Figure: funnel plot for a meta-analysis with no PB; skew explained by a moderator.]

Asymmetry in funnel plots does not always imply PB

When meta-analyzing well-designed studies, power will correlate with true effect size.

[Figure: simulated meta-analysis with a correlation of 0.6 between true effects and targeted effects (all studies at 80% power). No PB.]
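This no-PB mechanism can be sketched as follows (my own assumed setup, not the lecture's actual simulation): each study targets a different true effect, and studies chasing smaller effects run larger samples, as a power analysis would dictate. Nothing is suppressed, yet observed effects and standard errors end up correlated, i.e. the funnel is asymmetric.

```python
import numpy as np

rng = np.random.default_rng(2019)

n_studies = 1000
true_effects = rng.uniform(0.1, 0.8, size=n_studies)

# Approximate n for 80% power in a two-sided one-sample z-test of a
# standardized effect d: n = ((1.96 + 0.84) / d)^2
n_per_study = np.ceil(((1.96 + 0.84) / true_effects) ** 2).astype(int)

se = 1.0 / np.sqrt(n_per_study)
observed = rng.normal(true_effects, se)

# Asymmetry diagnostic: observed effects and SEs correlate strongly,
# even though every single study was "published".
asym = np.corrcoef(observed, se)[0, 1]
print(round(asym, 2))
```

Here the design choice (power analysis) alone generates the small-study effect a funnel plot would flag.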

Trim and fill procedures

If we see evidence of PB in the funnel plot, we may want to adjust for it. How? The most common procedure is trim and fill (assume a positive mean effect):

- Remove the study furthest to the right (biggest effect size).
- Compute the new summary effect.
- Repeat until the funnel plot is symmetric.
- Then, to ensure we don't artificially deflate uncertainty, add the removed studies back in, and also add their mirror images on the opposite side of the new summary effect.
- Now we have an unbiased estimate of the summary effect and a semi-reasonable estimate of its uncertainty, assuming the initial asymmetry actually reflects true PB.
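The steps above can be mirrored in a crude sketch. This is NOT Duval and Tweedie's actual trim-and-fill estimator (which uses rank-based statistics to choose how many studies to trim); the asymmetry check below (mean vs. median) is a stand-in invented for illustration only.

```python
def trim_and_fill_sketch(effects, tol=0.05):
    """Trim rightmost effects until the scatter is roughly symmetric
    about its mean, then fill in mirror images of the trimmed effects."""
    kept = sorted(effects)
    trimmed = []
    # Trim: drop the largest effect while the mean sits noticeably above
    # the median (a crude, illustrative asymmetry check).
    while len(kept) > 2:
        mean = sum(kept) / len(kept)
        median = kept[len(kept) // 2]
        if mean - median <= tol:
            break
        trimmed.append(kept.pop())      # remove the study furthest right
    center = sum(kept) / len(kept)      # new summary effect
    # Fill: restore the trimmed studies plus their mirror images about
    # the new centre, so uncertainty is not artificially deflated.
    filled = kept + trimmed + [2 * center - e for e in trimmed]
    return center, filled

# Asymmetric toy data: a cloud around 0.2 plus a few inflated outliers.
effects = [0.1, 0.15, 0.2, 0.2, 0.25, 0.3, 0.9, 1.1, 1.3]
center, filled = trim_and_fill_sketch(effects)
print(round(center, 3), round(sum(filled) / len(filled), 3))  # → 0.2 0.2
```

Note how the naive mean of `effects` (0.5) is pulled up by the outliers, while the trimmed centre lands at 0.2, and the mirrored fill leaves that centre unchanged while restoring the spread.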

Trim and fill procedures

[Figure: trim...]

Trim and fill procedures

[Figure: ...and fill]

Trim and fill procedures

Trim and fill is a nice technique, but it comes with major caveats:

- The technique assumes that asymmetry actually reflects true PB.
- The technique does not explicitly consider Type M error.
- The actual algorithm that does the trimming tends to perform poorly when there are too few studies, or too many aberrant studies.
- The fill algorithm relies on imputation to create the missing effect sizes; this comes with a host of other modelling assumptions that we will not be able to test in a meta-analysis. In particular, a good technical argument can be made that the fill procedure artificially deflates uncertainty quite badly; it can also severely distort the true mean effect size.
- You can use trim-and-fill to see whether your substantive conclusions change; if they do, you should attempt to find the source of the alleged PB and adjust for it directly.

Simpson's Paradox

Simpson's Paradox (also called the Simpson-Yule Paradox or Lord's Paradox) occurs when a trend present in an aggregate dataset disappears or reverses when the dataset is split into groups, or, more generally, when an omitted confounding variable is accounted for.

This has major implications for inference. It is particularly troublesome in the context of meta-analysis, where we are combining a bunch of group (study) effects into a single (composite) effect.

Simpson's Paradox: Ex. 1

[Figure]

Simpson's Paradox: Ex. 2

[Figure]

Simpson's Paradox: Berkeley admissions

In 1973, gender bias was alleged in grad school admissions at UC-Berkeley. A chi-squared test on the aggregate admissions table yields a p-value < 0.000001. So there is ostensible evidence of gender bias, but...
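The aggregate test can be reproduced from the figures usually cited for this example (Bickel et al., 1975): roughly 8442 male applicants with 44% admitted vs. 4321 female applicants with 35% admitted. The counts below are rounded from those percentages, an assumption on my part; the lecture's exact table may differ slightly.

```python
import math

men_admit, men_reject = 3714, 4728      # ~44% of 8442 admitted (assumed counts)
women_admit, women_reject = 1512, 2809  # ~35% of 4321 admitted (assumed counts)

table = [[men_admit, men_reject], [women_admit, women_reject]]
row = [sum(r) for r in table]
col = [men_admit + women_admit, men_reject + women_reject]
total = sum(row)

# Pearson chi-squared statistic for the 2x2 table (df = 1).
chi2 = sum(
    (table[i][j] - row[i] * col[j] / total) ** 2 / (row[i] * col[j] / total)
    for i in range(2)
    for j in range(2)
)
# Survival function of chi-squared with df = 1: P(X > x) = erfc(sqrt(x/2)).
p_value = math.erfc(math.sqrt(chi2 / 2))
print(round(chi2, 1), p_value < 1e-6)
```

The statistic is enormous for a single degree of freedom, which is exactly the "ostensible evidence" the slide refers to.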

Simpson's Paradox: Berkeley admissions

Broken down by department, a very different picture emerges:

[Figure: admission rates by department and gender]

Simpson's Paradox: Berkeley admissions

(1) There is no consistent evidence of gender bias; in fact, one could argue that a possible gender bias exists in favour of women applicants in Department A.

Simpson's Paradox: Berkeley admissions

(2) Women tended to apply to departments with higher overall rejection rates; this may reflect underlying societal gender biases at work, but not in the admissions process itself.

Simpson's Paradox: Kidney stone treatments

Two treatments for kidney stones: A = open surgery (invasive), B = laparoscopy (mildly invasive).

Large kidney stones constitute a more severe condition than small stones.

Simpson's Paradox: Kidney stone treatments

Ignoring severity (the confounding variable), Treatment B is more effective overall. Yet Treatment A is more effective at treating both mild and severe cases.

Simpson's Paradox: Kidney stone treatments

Why does this happen? Notice the cell counts: Groups 2 and 3 dominate. Thus, the combined estimates are driven by the proportions in Groups 2 and 3, and the Group 2 success rate is higher.
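The reversal is easy to verify numerically. The counts below are the classic Charig et al. (1986) figures this example is usually built on, an assumption on my part; the lecture's own table may differ.

```python
from fractions import Fraction

# (successes, total) per severity group.
A = {"small": (81, 87), "large": (192, 263)}   # open surgery
B = {"small": (234, 270), "large": (55, 80)}   # laparoscopy

def rate(successes, total):
    return Fraction(successes, total)  # exact success rate

# Within each severity group, Treatment A wins...
assert rate(*A["small"]) > rate(*B["small"])   # 93% vs 87%
assert rate(*A["large"]) > rate(*B["large"])   # 73% vs 69%

# ...yet aggregated over severity, Treatment B wins: Simpson's Paradox.
A_all = rate(81 + 192, 87 + 263)
B_all = rate(234 + 55, 270 + 80)
assert B_all > A_all
print(round(float(A_all), 2), round(float(B_all), 2))  # → 0.78 0.83
```

The driver is exactly the cell-count imbalance the slide points to: A is given mostly to the severe (large-stone) cases, B mostly to the mild ones.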

Simpson's Paradox as Ecological Fallacy

Simpson's Paradox is an example of a more general phenomenon known as the ecological fallacy. An ecological fallacy occurs when we use an inference at an ecological (aggregate) level to make claims about what happens at the individual (group) level.

Classic example: income positively correlates with tendency to vote Republican (USA). Thus, richer states tend to vote Republican more than poorer states... FALSE! Here, voting preference is affected by the overall wealth of the state even after controlling for individual wealth: self-perceived relative wealth?

Simpson's Paradox as Ecological Fallacy

In a meta-analysis, this is a potentially serious concern. Why? We are aggregating group (study) level effects to estimate a combined (ecological) effect. Thus, a positive aggregate association of treatment with condition may actually mask negative associations within each individual study.

Psychometric issues in meta-analysis

In psychometrics, we are often very concerned with issues of measurement, namely:

- Reliability: how variable, or imprecise, a measurement process is.
- Validity: how well (how accurately) the measurement captures the phenomenon it is trying to quantify.

Psychometric issues in meta-analysis

Classically, one proposes the following framework:

- Each subject (e.g. person) has a unique true value (score), T, of some particular phenomenon of interest.
- This true value cannot be measured directly; instead, we observe (measure) only a proxy for it: the observed score, X.
- The observed score may differ from the true score; thus we propose a generic measurement error model:

  X = T + E,

  where E denotes the measurement error.
- Usually, further assumptions are then imposed on the structure of the errors to more accurately model a real-life phenomenon and measurement process.

Psychometric issues in meta-analysis

In the context of meta-analysis, it is natural to ask how reliable or how valid measurements (observed effects) are for the actual phenomenon (true effects) they are trying to quantify.

Notice: this is not the same thing as sampling error. Sampling error occurs because our sample will not capture every relevant feature of the overall population. In contrast, measurement error speaks to how well our measurement process captures the phenomenon it is trying to quantify. E.g. one could have census-level data (no sampling error) that is still subject to substantial measurement error.

Psychometric issues in meta-analysis

In the context of meta-analysis, one may want study weights to explicitly account for the reliability or validity of a measurement from a particular study. For many reasons, you really only see this done with estimates of reliability.

In most (but not all) measurement error situations, the extra variance due to measurement error (i.e. imperfect reliability) will have an attenuating effect on model estimates; i.e. measurement error tends to cause our estimates to shrink towards the null.

However, if we could adjust our estimates before meta-analyzing them, then we could potentially remove at least some of this attenuation bias.
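Attenuation is easy to demonstrate in a quick simulation (the variances and effect below are assumed numbers for illustration): adding error E to the true score T shrinks the observed correlation with an outcome Y toward zero.

```python
import numpy as np

rng = np.random.default_rng(7)
n = 200_000

T = rng.normal(size=n)                  # true scores, Var(T) = 1
Y = 0.5 * T + rng.normal(size=n)        # outcome correlated with T
E = rng.normal(scale=1.0, size=n)       # measurement error, Var(E) = 1
X = T + E                               # observed scores: reliability = 1/2

r_true = np.corrcoef(T, Y)[0, 1]        # correlation free of measurement error
r_obs = np.corrcoef(X, Y)[0, 1]         # attenuated observed correlation
print(round(r_true, 2), round(r_obs, 2))
```

With reliability R = Var(T)/Var(X) = 1/2, the observed correlation lands at roughly √(1/2) ≈ 0.71 times the error-free one, the shrinkage toward the null described above.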

Psychometric issues in meta-analysis

First, we need to understand what the reliability of a measurement process is, and how to quantify it.

Formally, the reliability of a measurement X for a true score T is defined as

  R := Var(T) / Var(X) = 1 - Var(E) / Var(X).

Under the classical test theory measurement error model, if one has two parallel measurements X and X' for T, then reliability is also equal to

  R = ρ_XT²,

the squared correlation between X and T. (X and X' are parallel measurements for T if their variances are equal and their corresponding errors are uncorrelated.)
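A useful consequence of this definition is that the correlation between two parallel measurements of the same true score equals the reliability itself, which is why reliability can be estimated from repeated measurements at all. A simulated check (the variances are assumed numbers for illustration):

```python
import numpy as np

rng = np.random.default_rng(42)
n = 200_000

T = rng.normal(scale=2.0, size=n)        # true scores, Var(T) = 4
X1 = T + rng.normal(scale=1.0, size=n)   # Var(X) = 5, so R = 4/5
X2 = T + rng.normal(scale=1.0, size=n)   # parallel: same error variance,
                                         # errors uncorrelated with X1's

R_theory = 4 / 5
R_est = np.corrcoef(X1, X2)[0, 1]        # corr(X, X') estimates R
print(round(R_est, 2))                   # close to 0.8
```

The same identity R = ρ_XT² holds here: corr(X1, T)² is also approximately 0.8.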

Psychometric issues in meta-analysis

It can be shown that if ρ is a correlation (effect size) between variables, one of which is subject to this kind of classical measurement error, and if ρ_adj is the corrected correlation (free of measurement error), then the ratio ρ / ρ_adj is equal to the square root of the reliability of the measurement process.

Thus, to adjust for attenuation due to this kind of measurement error, we need a way to estimate reliability. There are many methods for this: Cronbach's α is the most common. Crucially, all these methods always yield underestimates of the actual (theoretical) reliability.

Psychometric issues in meta-analysis

With some (under)estimate a of the reliability in place, we can now adjust our observed effects for measurement error:

  r_adj = r / √a.

Similarly, we can adjust the corresponding variance of the observed effects via:

  Var(r_adj) = Var(r) / (√a)² = Var(r) / a.

We can now proceed with the meta-analysis as usual, but using these adjusted estimates of effect size and standard error.
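A worked example of the adjustment, with made-up numbers: an observed correlation r = 0.30 from an instrument with an estimated reliability a = 0.64 (e.g. a Cronbach's α value) and a hypothetical sampling variance of 0.01.

```python
a = 0.64        # estimated reliability (assumed)
r = 0.30        # observed correlation (assumed)
var_r = 0.01    # sampling variance of r (assumed)

r_adj = r / a ** 0.5      # 0.30 / 0.8 = 0.375, the disattenuated effect
var_r_adj = var_r / a     # variance scales by 1/a

print(round(r_adj, 3), round(var_r_adj, 6))
```

Note that the correction inflates both the effect size and its variance, so the adjusted estimate is larger but also less certain, as it should be.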