C2 Training: August 2010 Introduction to meta-analysis The Campbell Collaboration www.campbellcollaboration.org
Pooled effect sizes
- Average across studies
- Calculated using inverse variance weights
- Studies with more precise estimates (larger N, smaller SD) contribute more to the overall average than those with less precise estimates
- Choice between fixed and random effects models (and mixed models) rests on: 1. a priori expectations, 2. statistical tests
- Do we expect studies to estimate a single population parameter?
- If yes, use a fixed effect model and test the homogeneity assumption
- Usually use a random effects model or mixed model
Combining effect sizes across studies
- Compute effect sizes within each study
- Create a set of independent effect sizes
- Compute weighted mean and variance of effect sizes
- Compute 95% confidence interval for weighted mean effect size
- Test for the homogeneity of effect sizes
Create a set of independent effect sizes
- Likely to have multiple effect sizes per study
- Effect sizes within a study can be:
  - Multiple measures of study participants
  - Measures of independent groups of participants
What to do with multiple effect sizes per study?
- Use only independent groups in each analysis
- Use only one measure of the study participants in each analysis
- Use results within studies that are derived from independent groups of study participants
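The selection rule above can be sketched in a few lines of Python. This is only an illustration: the record fields (`study`, `sample`, `measure`, `es`) and the function name `select_independent` are hypothetical and not part of the training materials.

```python
def select_independent(records, measure):
    """Keep one effect size per independent sample for a single outcome measure.

    Each record is a dict with hypothetical keys:
    'study', 'sample' (participant group), 'measure', and 'es'.
    """
    chosen = {}
    for rec in records:
        if rec["measure"] != measure:
            continue  # only one measure of the participants per analysis
        key = (rec["study"], rec["sample"])
        if key in chosen:
            # The same participants would contribute twice: not independent.
            raise ValueError(f"duplicate effect size for sample {key}")
        chosen[key] = rec
    return list(chosen.values())


# Hypothetical data: study A reports two measures on one sample,
# study B reports one measure on two independent groups.
records = [
    {"study": "A", "sample": "all", "measure": "reading", "es": 0.3},
    {"study": "A", "sample": "all", "measure": "math", "es": 0.4},
    {"study": "B", "sample": "boys", "measure": "reading", "es": 0.2},
    {"study": "B", "sample": "girls", "measure": "reading", "es": 0.5},
]
reading_set = select_independent(records, "reading")
```

Raising an error on a duplicate sample, rather than silently keeping the first record, forces an explicit decision about which effect size to use.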
Compute weighted mean and variance of effect sizes
- Compute individual study effect sizes
- Correct effect sizes for any biases, e.g., Hedges' correction
- Compute study effect size variance
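As an illustration of these three steps for a standardized mean difference, the sketch below computes Cohen's d, applies the usual approximate small-sample correction factor, and returns the corrected effect size with its variance. The function name and arguments are assumptions chosen for this example, and the variance uses the common large-sample approximation rather than an exact formula.

```python
import math


def hedges_g(mean_t, mean_c, sd_t, sd_c, n_t, n_c):
    """Return (g, var_g): bias-corrected standardized mean difference and its variance."""
    df = n_t + n_c - 2
    # Pooled within-group standard deviation
    sd_pooled = math.sqrt(((n_t - 1) * sd_t**2 + (n_c - 1) * sd_c**2) / df)
    d = (mean_t - mean_c) / sd_pooled
    # Approximate small-sample correction factor (Hedges)
    j = 1 - 3 / (4 * df - 1)
    g = j * d
    # Large-sample approximation to the variance of d, rescaled for g
    var_d = (n_t + n_c) / (n_t * n_c) + d**2 / (2 * (n_t + n_c))
    var_g = j**2 * var_d
    return g, var_g


# Hypothetical study: treatment mean 10, control mean 8, both SDs 4, 20 per group
g, var_g = hedges_g(10, 8, 4, 4, 20, 20)
```

The correction shrinks d slightly toward zero, and the shrinkage becomes negligible as the sample sizes grow.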
p. 62 of MST pdf
Use weighted mean for effect sizes
- Use weighted means because each effect size has a different variance that depends on the study's sample size and effect size value
- Weight all analyses by the inverse of the effect size variance, or $w_i = 1 / v_i$, where $v_i$ is the variance of effect size $i$
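Weighted Mean Effect Size
$$\overline{ES} = \frac{\sum_{i=1}^{k} w_i \, ES_i}{\sum_{i=1}^{k} w_i}$$
where $w_i$ is the inverse variance weight for effect size $ES_i$ and k is the number of effect sizes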
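Standard error of weighted mean effect size
$$SE_{\overline{ES}} = \sqrt{\frac{1}{\sum_{i=1}^{k} w_i}}$$
where $w_i$ is the inverse variance weight for effect size $i$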
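Confidence Interval for Mean Effect Size
$$\overline{ES} \pm z_{1-\alpha/2} \, SE_{\overline{ES}}$$
where $SE_{\overline{ES}}$ is the standard error of the weighted mean effect size, α = significance level, z = critical value from the standard normal distribution (1.96 for a 95% interval)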
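The weighted mean, its standard error, and the confidence interval can be computed together. The following is a minimal fixed effect sketch using only the Python standard library; the function name `pooled_effect` and the example numbers are made up for illustration.

```python
import math
from statistics import NormalDist


def pooled_effect(effects, variances, alpha=0.05):
    """Inverse-variance weighted mean, its standard error, and a (1 - alpha) CI."""
    weights = [1.0 / v for v in variances]          # w_i = 1 / v_i
    sum_w = sum(weights)
    mean_es = sum(w * es for w, es in zip(weights, effects)) / sum_w
    se = math.sqrt(1.0 / sum_w)                     # SE of the weighted mean
    z = NormalDist().inv_cdf(1 - alpha / 2)         # about 1.96 when alpha = 0.05
    return mean_es, se, (mean_es - z * se, mean_es + z * se)


# Hypothetical effect sizes and variances from three studies
mean_es, se, (lower, upper) = pooled_effect([0.2, 0.5, 0.8], [0.04, 0.01, 0.04])
```

In this made-up example the middle study has one quarter of the variance of the other two, so it receives four times their weight.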
Forest plot of days incarcerated Mean ES (95% lower limit, 95% upper limit)
Test of Homogeneity Statistical test that addresses whether the k effect sizes that are averaged into a mean value all estimate the same population effect size In a homogeneous distribution, effect sizes differ from population mean only by sampling error
Homogeneity/heterogeneity tests
- Statistical tests of homogeneity of results across studies
- Are all studies estimating a common population parameter, with differences between studies due to chance (sampling error) alone?
- Homogeneity tests are performed under the fixed effect model
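Form of the homogeneity test
$$Q = \sum_{i=1}^{k} w_i \left( ES_i - \overline{ES} \right)^2$$
where $w_i$ is the inverse variance weight of effect size $ES_i$ and $\overline{ES}$ is the weighted mean effect size
When we reject the null hypothesis of homogeneity, the variability of the effect sizes is more than would be expected from sampling error alone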
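Computational form
$$Q = \sum_{i=1}^{k} w_i \, ES_i^2 - \frac{\left( \sum_{i=1}^{k} w_i \, ES_i \right)^2}{\sum_{i=1}^{k} w_i}$$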
To Test Homogeneity
- Compare Q to the (1 − α) critical value of the chi-square distribution with k − 1 degrees of freedom
- Significant Q = heterogeneity
- Non-significant Q = homogeneity
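A minimal sketch of the test follows. It computes Q in its computational form and, instead of looking up a critical value, compares the upper-tail chi-square probability with α, which leads to the same decision. The helper `chi2_sf` is a simple series approximation written for this example so that no external library is needed; both function names are assumptions, and a statistics package would normally supply the chi-square distribution.

```python
import math


def chi2_sf(x, df):
    """Upper-tail chi-square probability via the series for the lower incomplete gamma."""
    if x <= 0:
        return 1.0
    a, half_x = df / 2.0, x / 2.0
    term = 1.0 / a
    total = term
    n = 1
    while term > total * 1e-15:
        term *= half_x / (a + n)
        total += term
        n += 1
    lower = total * math.exp(-half_x + a * math.log(half_x) - math.lgamma(a))
    return max(0.0, 1.0 - lower)


def homogeneity_test(effects, variances, alpha=0.05):
    """Return (Q, df, p, heterogeneous) under the fixed effect model."""
    weights = [1.0 / v for v in variances]
    sum_w = sum(weights)
    sum_w_es = sum(w * es for w, es in zip(weights, effects))
    sum_w_es2 = sum(w * es**2 for w, es in zip(weights, effects))
    q = sum_w_es2 - sum_w_es**2 / sum_w   # computational form of Q
    df = len(effects) - 1
    p = chi2_sf(q, df)
    return q, df, p, p < alpha            # p < alpha is the same as Q > critical value


# Hypothetical effect sizes and variances from three studies
q, df, p, heterogeneous = homogeneity_test([0.2, 0.5, 0.8], [0.04, 0.01, 0.04])
```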
Preliminary analysis
- Graph effect sizes and 95% CI
- Compute overall mean effect size and 95% CI
- Compute homogeneity test
- Interpret findings
Forest plot of days incarcerated
Chi² is the Q test of homogeneity
p. 59 of MST pdf
Moderator models: ANOVA
- When we find that a group of studies is heterogeneous, we can explore whether moderator variables explain this variation
- When we have continuous moderators, we use regression models
- When we have categorical moderators, we use ANOVA
- Caution needed:
  - Moderator analysis is correlational
  - Moderators may be confounded within studies
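For a categorical moderator, one common fixed effect analog to ANOVA splits the total Q into a within-group part and a between-group part. The sketch below assumes that approach; the function names and the example data are hypothetical. Under these assumptions, Q between is referred to a chi-square distribution with (number of groups − 1) degrees of freedom.

```python
from collections import defaultdict


def _q_stats(effects, variances):
    """Return (Q, weighted mean) for one set of effect sizes."""
    weights = [1.0 / v for v in variances]
    sum_w = sum(weights)
    mean_es = sum(w * es for w, es in zip(weights, effects)) / sum_w
    q = sum(w * (es - mean_es) ** 2 for w, es in zip(weights, effects))
    return q, mean_es


def anova_analog(effects, variances, groups):
    """Partition total Q into between-group and within-group components."""
    by_group = defaultdict(list)
    for es, v, g in zip(effects, variances, groups):
        by_group[g].append((es, v))
    q_total, _ = _q_stats(effects, variances)
    q_within = 0.0
    group_means = {}
    for g, rows in by_group.items():
        q_g, mean_g = _q_stats([r[0] for r in rows], [r[1] for r in rows])
        q_within += q_g
        group_means[g] = mean_g
    return {
        "q_between": q_total - q_within,
        "df_between": len(by_group) - 1,
        "q_within": q_within,
        "df_within": len(effects) - len(by_group),
        "group_means": group_means,
    }


# Hypothetical data: four studies, two levels of a categorical moderator
result = anova_analog([0.1, 0.3, 0.6, 0.8], [0.04] * 4, ["a", "a", "b", "b"])
```

A large Q between suggests that the moderator is associated with effect size, subject to the cautions above about correlational evidence and confounding.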
Example from Sirin paper Sirin, S. R. (2005). Socioeconomic status and academic achievement: A meta-analytic review of research. Review of Educational Research, 75, 417-453.
Assessment and adjustments for bias
- Small sample bias: Hedges' g correction for SMD
- Range restriction: Hunter & Schmidt
- Missing data: sensitivity analysis, concerns about integrity of intent-to-treat analysis (missing cases), outcome reporting bias (missing data)
- Publication bias: missing studies (and missing data for available studies)
Assessing risk of publication bias
1. Funnel plots: plot study effect sizes by their standard errors; interocular analysis of funnel plots is unreliable
2. Trim and fill analysis (need ~10+ studies)
3. Statistical tests (Egger's test and others)
Do NOT use Failsafe N (Becker, 2005)
See Rothstein, Sutton & Borenstein (2005)
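As one illustration of a statistical test for funnel plot asymmetry, the sketch below follows a common formulation of Egger's regression: each standardized effect (ES / SE) is regressed on its precision (1 / SE), and an intercept far from zero suggests asymmetry. The function name is an assumption for this example. It returns the intercept, its standard error, and the t statistic; a p-value would come from a t distribution with k − 2 degrees of freedom, which is not computed here.

```python
import math


def egger_intercept(effects, std_errors):
    """Return (intercept, se_intercept, t) from regressing ES/SE on 1/SE."""
    y = [es / se for es, se in zip(effects, std_errors)]  # standardized effects
    x = [1.0 / se for se in std_errors]                   # precisions
    k = len(y)
    if k < 3:
        raise ValueError("need at least 3 studies")
    mean_x = sum(x) / k
    mean_y = sum(y) / k
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    resid_ss = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    s2 = resid_ss / (k - 2)                               # residual variance
    se_intercept = math.sqrt(s2 * (1.0 / k + mean_x**2 / sxx))
    t = intercept / se_intercept if se_intercept > 0 else float("nan")
    return intercept, se_intercept, t
```

With few studies the test has little power, so it is best read alongside the other checks in the list above rather than on its own.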