Estimation of effect sizes in the presence of publication bias: a comparison of meta-analysis methods. Hilde Augusteijn, M.A.L.M. van Assen, R. C. M. van Aert. APS, May 29, 2016
Today's presentation Estimation of effect sizes in the presence of publication bias: a comparison of meta-analysis methods: PET-PEESE (Stanley & Doucouliagos, 2013), P-uniform (van Assen, van Aert & Wicherts, 2015)
Key messages There is a lot of evidence of publication bias. Traditional meta-analysis methods cannot estimate effect sizes accurately when there is publication bias. P-uniform currently estimates effect sizes accurately only when there is no heterogeneity. PET-PEESE does not work when sample sizes are homogeneous. Methods need to be improved.
Overview presentation Publication bias Meta-analysis PET-PEESE P-uniform Simulation study Results Recommendations
Publication bias Impact on meta-analytical results: overestimated effect sizes; inaccurate estimates of heterogeneity (τ²)
Meta-analysis Objectives of meta-analysis Estimating... 1. True effect 2. Heterogeneity 3. Moderator effects
Meta-analysis methods Traditional methods Fixed effects meta-analysis Random effects meta-analysis New methods: PET-PEESE P-uniform
Meta-analysis: traditional methods Fixed effects: assumes one underlying effect size. Weighted average; weight: precision of the study, the inverse of the within-study variance (1/σi²). Random effects: assumes multiple true effect sizes. Weighted average; weight: precision and heterogeneity, the inverse of the within-study plus between-study variance (1/(σi² + τ²)).
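The two weighted averages on this slide can be sketched in a few lines. This is a minimal illustration, not the code used in the study: the function names are hypothetical, and the DerSimonian-Laird estimator of τ² is an assumption, since the slides do not say which heterogeneity estimator was used.

```python
def fixed_effect(effects, variances):
    """Inverse-variance weighted average: weight_i = 1 / sigma_i^2."""
    weights = [1.0 / v for v in variances]
    return sum(w * y for w, y in zip(weights, effects)) / sum(weights)


def dl_tau2(effects, variances):
    """Between-study variance tau^2 (DerSimonian-Laird, an assumed choice)."""
    weights = [1.0 / v for v in variances]
    pooled = fixed_effect(effects, variances)
    # Cochran's Q: weighted squared deviations from the fixed-effect estimate
    q = sum(w * (y - pooled) ** 2 for w, y in zip(weights, effects))
    c = sum(weights) - sum(w * w for w in weights) / sum(weights)
    # Truncate at zero: a variance cannot be negative
    return max(0.0, (q - (len(effects) - 1)) / c)


def random_effects(effects, variances):
    """Weighted average with weight_i = 1 / (sigma_i^2 + tau^2)."""
    tau2 = dl_tau2(effects, variances)
    weights = [1.0 / (v + tau2) for v in variances]
    return sum(w * y for w, y in zip(weights, effects)) / sum(weights)


# Made-up illustration data: three effect sizes and their sampling variances
effects = [0.2, 0.5, 0.8]
variances = [0.04, 0.01, 0.04]
fe_estimate = fixed_effect(effects, variances)
tau2 = dl_tau2(effects, variances)
re_estimate = random_effects(effects, variances)
```

Adding τ² to every study's variance makes the random-effects weights more equal across studies, so small studies count relatively more than under the fixed-effect model.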
Methods dealing with publication bias A lot of methods have been developed. So far, none of them estimates accurately under all circumstances
Methods dealing with publication bias New methods: PET-PEESE (Stanley & Doucouliagos, 2013), P-uniform (van Assen, van Aert & Wicherts, 2015) & P-curve (Simonsohn, Nelson & Simmons, 2014)
PET-PEESE When there is no publication bias, there is no relation between effect sizes and precision
PET-PEESE [Figure: PET and PEESE regression lines; panel label: No publication bias.] Plot of 300 randomly generated studies, true ES = 0.5, sample sizes ranging from 10 to 150.
PET-PEESE When there is no publication bias, there is no relation between effect sizes and precision. Publication bias: smaller studies report larger effect sizes.
PET-PEESE [Figure: PET and PEESE regression lines; panel label: No publication bias; x-axis: standard error.] Plot of 300 randomly generated studies, true ES = 0.5, sample sizes ranging from 10 to 150.
PET-PEESE When there is no publication bias, there is no relation between effect sizes and precision. Publication bias: smaller studies report larger effect sizes. PET-PEESE uses the intercept of a regression line to estimate the effect size.
PET-PEESE [Figure: PET and PEESE regression lines; panel label: No publication bias.] Plot of 300 randomly generated studies, true ES = 0.5, sample sizes ranging from 10 to 150.
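The regression idea behind PET-PEESE can be sketched as follows. PET regresses effect sizes on their standard errors, PEESE regresses them on the squared standard errors, and both use weights 1/SE²; the intercept (the predicted effect at SE = 0) is the estimate. This is a simplified sketch: the function names are hypothetical, and the rule for choosing between the two intercepts (PEESE when the PET intercept exceeds a critical value, here a one-sided normal approximation of 1.645) is an assumption rather than something stated on the slides.

```python
import math


def wls_line(x, y, weights):
    """Weighted least squares fit of y = b0 + b1 * x.

    Returns (intercept, slope, standard error of the intercept).
    """
    sw = sum(weights)
    swx = sum(w * xi for w, xi in zip(weights, x))
    swy = sum(w * yi for w, yi in zip(weights, y))
    swxx = sum(w * xi * xi for w, xi in zip(weights, x))
    swxy = sum(w * xi * yi for w, xi, yi in zip(weights, x, y))
    denom = sw * swxx - swx ** 2  # approaches zero when all x are (nearly) equal
    slope = (sw * swxy - swx * swy) / denom
    intercept = (swxx * swy - swx * swxy) / denom
    resid_ss = sum(w * (yi - intercept - slope * xi) ** 2
                   for w, xi, yi in zip(weights, x, y))
    resid_var = resid_ss / (len(x) - 2)
    se_intercept = math.sqrt(resid_var * swxx / denom)
    return intercept, slope, se_intercept


def pet_peese(effects, std_errors, t_crit=1.645):
    """Return (PET intercept, PEESE intercept, conditionally chosen estimate)."""
    weights = [1.0 / se ** 2 for se in std_errors]
    pet, _, pet_se = wls_line(std_errors, effects, weights)
    peese, _, _ = wls_line([se ** 2 for se in std_errors], effects, weights)
    # Assumed rule: use PEESE only if the PET intercept is significantly > 0
    t_pet = pet / pet_se if pet_se > 0 else math.copysign(math.inf, pet)
    chosen = peese if t_pet > t_crit else pet
    return pet, peese, chosen


# Made-up data in which effects rise linearly with the standard error
ses = [0.1, 0.15, 0.2, 0.25, 0.3]
effects = [0.5 + 2.0 * se for se in ses]
pet, peese, chosen = pet_peese(effects, ses)
```

When all studies have nearly the same sample size, the standard errors barely vary, `denom` approaches zero, and the intercept becomes very unstable, which is the problem the following slides illustrate.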
PET-PEESE: homogeneous sample sizes X-axis: Standard error: related to sample size
PET-PEESE: homogeneous sample sizes [Figure: panels with PET and PEESE regression lines; panel label: No publication bias.] Plots of 100 randomly generated studies, true ES = 0.5, with sample sizes ranging between 30 and 50.
P-uniform: the distribution of (conditional) p-values is uniform. Uses only statistically significant studies. Assumption: all significant studies are equally likely to be published. The distribution of (conditional) p-values is uniform under the true effect. Currently assumes a fixed effect: does not provide accurate estimates when there is heterogeneity.
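The logic of this slide can be illustrated with a simplified sketch. It assumes normally distributed effect estimates with known standard errors and a one-directional significance threshold of z = 1.96; the actual p-uniform method works with the exact distribution of the test statistic, so this is an approximation for illustration only. Function names and search bounds are hypothetical. For a candidate effect δ, each significant study gets a conditional p-value q, the probability of an effect at least as large as observed given that the study is significant. The estimate is the δ at which the q values behave as uniform, taken here as the point where the sum of -ln(q) equals the number of studies.

```python
import math


def normal_sf(z):
    """Upper-tail probability of the standard normal distribution."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))


def conditional_p(y, se, delta, z_crit=1.96):
    """P(Y >= y | Y significant) when Y ~ Normal(delta, se)."""
    tail = normal_sf((y - delta) / se)
    significant = normal_sf(z_crit - delta / se)
    return tail / significant


def neg_log_q_sum(effects, std_errors, delta):
    """Sum of -ln(q_i); its expected value is k if the q_i are uniform."""
    return sum(-math.log(conditional_p(y, se, delta))
               for y, se in zip(effects, std_errors))


def p_uniform_estimate(effects, std_errors, lower=-1.0, upper=3.0):
    """Bisection search for delta with sum(-ln q_i) = k (assumed bounds)."""
    k = len(effects)
    if not (neg_log_q_sum(effects, std_errors, lower) > k
            > neg_log_q_sum(effects, std_errors, upper)):
        raise ValueError("estimate is not bracketed by [lower, upper]")
    for _ in range(200):
        mid = (lower + upper) / 2.0
        # The sum decreases as delta increases, so move the matching bound
        if neg_log_q_sum(effects, std_errors, mid) > k:
            lower = mid
        else:
            upper = mid
    return (lower + upper) / 2.0


# Made-up data: five statistically significant studies (y / se > 1.96)
effects = [0.45, 0.60, 0.52, 0.70, 0.48]
std_errors = [0.20, 0.25, 0.20, 0.30, 0.22]
estimate = p_uniform_estimate(effects, std_errors)
```

Because every study is evaluated against the same δ, the sketch shares the fixed-effect assumption named on the slide.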
Simulation study
Simulation study Independent variables: True effect size (4 levels: Cohen's d of 0,, 0.5, ). Level of publication bias (4 levels: 0%, 75%, 95%, 100%). Sample size (2 levels: homogeneous & heterogeneous, average sample size = 100). Amount of heterogeneity (3 levels: τ² = 0, 1, 9; I² = 0%, 50%, 90% when N = 100). Number of studies (1 level: number of significant studies = 5). Total: 96 conditions, 10,000 iterations.
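A rough sketch of how one simulated meta-analysis under publication bias could be generated is shown below. It is not the authors' simulation code. The assumptions are: observed effects are drawn from a normal distribution with standard error sqrt(2/n) for n participants per group, the sample size is fixed (the homogeneous case), a study counts as significant when d/SE exceeds 1.96, and a nonsignificant study is published with probability 1 minus the publication-bias level. All function and parameter names are hypothetical. The first lines only reproduce the count of 96 conditions from the design.

```python
import math
import random

# Number of levels per independent variable, as listed on the slide
levels = {
    "true effect size": 4,
    "publication bias": 4,
    "sample size": 2,
    "heterogeneity": 3,
    "number of studies": 1,
}
n_conditions = math.prod(levels.values())  # 4 * 4 * 2 * 3 * 1


def simulate_published_studies(delta, pub_bias, tau=0.0, n_per_group=50,
                               n_significant=5, z_crit=1.96, seed=None):
    """Generate studies until n_significant significant ones are published.

    Returns a list of (observed effect, standard error) pairs.
    """
    rng = random.Random(seed)
    se = math.sqrt(2.0 / n_per_group)  # assumed approximation for Cohen's d
    published, found = [], 0
    while found < n_significant:
        theta = rng.gauss(delta, tau)  # study-specific true effect
        d = rng.gauss(theta, se)       # observed effect
        if d / se > z_crit:
            found += 1
            published.append((d, se))
        elif rng.random() >= pub_bias:
            # Nonsignificant studies survive with probability 1 - pub_bias
            published.append((d, se))
    return published


biased = simulate_published_studies(delta=0.5, pub_bias=1.0, seed=1)
unbiased = simulate_published_studies(delta=0.5, pub_bias=0.0, seed=1)
```

With `pub_bias=1.0` only significant studies are returned; with `pub_bias=0.0` every generated study is returned, so the list also contains the nonsignificant results.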
Results: expectations Performance of fixed- and random-effects meta-analysis suffers from publication bias. Performance of PET-PEESE suffers from homogeneous sample sizes. Performance of P-uniform suffers from heterogeneity in effect sizes.
Simulation study Dependent variables: Bias in estimated effect size; (Type I error rate); Power; Coverage; RMSE
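The dependent variables can be computed from the replications of one condition roughly as follows. This is a generic sketch with hypothetical function names; bias is taken here as the mean estimate minus the true effect, and the rejection rate is the Type I error rate when the true effect is zero and the power otherwise.

```python
import math


def bias(estimates, true_effect):
    """Mean estimate minus the true effect."""
    return sum(estimates) / len(estimates) - true_effect


def rmse(estimates, true_effect):
    """Root-mean-squared error of the estimates."""
    squared = [(e - true_effect) ** 2 for e in estimates]
    return math.sqrt(sum(squared) / len(squared))


def coverage(intervals, true_effect):
    """Proportion of confidence intervals that contain the true effect."""
    hits = sum(1 for low, high in intervals if low <= true_effect <= high)
    return hits / len(intervals)


def rejection_rate(p_values, alpha=0.05):
    """Proportion of replications in which the null hypothesis is rejected."""
    return sum(1 for p in p_values if p < alpha) / len(p_values)


# Made-up results from four replications with a true effect of 0.5
estimates = [0.4, 0.6, 0.5, 0.7]
intervals = [(0.3, 0.6), (0.55, 0.8), (0.1, 0.9), (0.5, 0.7)]
p_values = [0.01, 0.20, 0.04, 0.06]

estimate_bias = bias(estimates, 0.5)
estimate_rmse = rmse(estimates, 0.5)
interval_coverage = coverage(intervals, 0.5)
power = rejection_rate(p_values)
```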
Results: estimation of effect sizes Fixed & random effects: overestimate effect sizes when there is publication bias.
Results: estimation of effect sizes Fixed & random effects [Figure: median estimates of ES for FE and RE; x-axis: level of publication bias; y-axis: ES; one line per true effect size d.]
Results: estimation of effect sizes Fixed & random effects: overestimate effect sizes when there is publication bias. PET-PEESE: underestimates effect sizes when sample sizes are homogeneous.
Results: estimation of effect sizes [Figure: PET-PEESE median estimates of ES; panels: heterogeneous sample sizes, homogeneous sample sizes; x-axis: level of publication bias; y-axis: ES; one line per true effect size d.]
Results: estimation of effect sizes Fixed & random effects: overestimate effect sizes when there is publication bias. PET-PEESE: underestimates effect sizes when sample sizes are homogeneous. P-uniform: overestimates effect sizes when there is heterogeneity.
Results: estimation of effect sizes [Figure: P-uniform median estimates of ES; panels: no heterogeneity, medium amount of heterogeneity, large amount of heterogeneity; x-axis: level of publication bias; y-axis: ES; one line per true effect size d.]
Results: Power Ideal situation: power is at least .80. Fixed & random effects: power is almost always close to 0. PET-PEESE: when sample sizes are homogeneous, power is almost always equal to .05. P-uniform: power is too low when the true effect size is small. Power increases when there is heterogeneity, due to overestimation of the effect.
Results: Power [Figure: PET-PEESE power; panels: heterogeneous sample sizes, homogeneous sample sizes; x-axis: level of publication bias; y-axis: power; one line per true effect size d.]
Results: Coverage Ideal situation: 95% coverage FE/RE: coverage decreases when publication bias increases (up to 0% coverage) PET-PEESE: coverage 95% when sample sizes are homogeneous (large CI), lower when sample sizes are heterogeneous P-uniform: coverage = 95% when there is no heterogeneity in effect sizes, decreases when heterogeneity increases.
Results: RMSE Root-mean-squared-error: magnitude of error FE/RE: RMSE increases as publication bias increases PET-PEESE: Large RMSE values when sample sizes are homogeneous P-uniform: RMSE is only influenced by δ, not by heterogeneity and publication bias
Key messages There is a lot of evidence of publication bias. Traditional meta-analysis methods cannot estimate effect sizes accurately when there is publication bias. P-uniform currently estimates effect sizes accurately only when there is no heterogeneity. PET-PEESE does not work when sample sizes are homogeneous. Methods need to be improved.
Recommendations If you know there is no publication bias: use fixed/random effects. Do not use PET-PEESE if the sample sizes are homogeneous (or when there is a large amount of heterogeneity). Only use P-uniform if you know there is no or only a small amount of heterogeneity. Pay attention to new methods being published!
Questions/Remarks? H.E.M.Augusteijn@tilburguniversity.edu www.metaresearch.nl