Incorporating Within-Study Correlations in Multivariate Meta-analysis: Multilevel Versus Traditional Models


Alison J. O'Mara and Herbert W. Marsh
Department of Education, University of Oxford, UK

Abstract

The multilevel meta-analysis approach is compared with the traditional meta-analytic approaches, known as fixed effects and random effects models, for dealing with data with multiple outcomes. It is common for research to examine multiple outcomes (e.g., multidimensional constructs), and so it is important to establish whether multilevel models offer an improvement in modelling such datasets. A description and comparison of the underlying models is followed by the results of a simulation study comparing the three meta-analytic models under different statistical conditions. The results of the simulation study suggest that the maximum likelihood multilevel approach is in general superior to the fixed effects and random effects approaches, as it provides good estimates of the parameters and generally has better coverage. In particular, the fixed and random effects models are shown to have a higher risk of Type II error. Interestingly, the inclusion of the covariance between the two outcomes (the cornerstone of the multivariate multilevel approach; Kalaian & Raudenbush, 1996) did not seem to improve the estimation of the parameters beyond the regular multilevel model. In general, all models performed better when the number of studies included in the meta-analysis was larger. Implications for the inclusion of different types of within-study correlations are discussed.

Corresponding author: Alison O'Mara, alison.omara@education.ox.ac.uk, Department of Education, University of Oxford, 15 Norham Gardens, OXFORD, OX2 6PY

Acknowledgement: This research was supported by a grant from the United Kingdom Economic and Social Research Council (ESRC RES ).

Advances in knowledge come from the integration of a vast body of research (Schmidt, 1992). Broadly, meta-analysis can be defined as a quantitative review of the results of independent studies addressing a related research question (Normand, 1999). Meta-analysis seeks to systematically synthesise the results from multiple studies to facilitate comparison across studies (Rosenthal, 1987), with the goal of gaining a greater understanding of related research. Methodological approaches to meta-analysis are being developed faster than they can be incorporated into practice, and most of these developments have been made in the last two decades (Viechtbauer, 2005). Early meta-analyses were based on a fixed effects model (Hedges & Vevea, 1998), which assumes that all samples are drawn from the same population of studies, so that there is only one true (homogeneous) population effect (Cohn & Becker, 2003; Erez, Bloom, & Wells, 1996). Fixed effects meta-analyses were an improvement on traditional literature reviews because they allowed the researcher to model differences between studies, rather than just report whether results of various studies were statistically significant or not. However, there is a problem with fixed effects models: their results apply only to the actual data considered, so they cannot be generalised to other studies (Raudenbush, 1994). This is because fixed effects models assume that there is no systematic variability between studies (i.e., that they are homogeneous). In reality, this is rarely the case. Hafdahl (2007) noted that, "Given the ubiquity of between-studies variation in meta-analytic data and the theoretical and practical value of relating it to study features, models that incorporate this heterogeneity are indispensable" (p. 200). More recently, researchers have argued for a random effects model, in which variability between studies is assumed to reflect both sampling error and variability in the population of effects (Raudenbush, 1994). The inclusion of between-study variance estimates makes the findings from random effects analyses appropriate to generalize to other studies that were not included in the meta-analysis (Lipsey & Wilson, 2001). The procedure for the random effects model is similar to that for fixed effects models, except that a random error variance component is added to the variance associated with each effect size, which takes into account variability in the population effects (Lipsey & Wilson, 2001). Although random effects models represented an important extension of meta-analytic methodology, they still suffer from a potentially critical flaw. An important statistical assumption of both fixed and random effects models is that the outcomes are independent of each other.

This means that if a meta-analyst wishes to consider multiple outcomes from the same study, this assumption is violated (effect sizes from the same study are not independent and are therefore likely to be correlated). As such, traditional meta-analytic methods (fixed and random effects models) are not appropriate for dealing with multidimensional constructs consisting of multiple outcomes, as they use incorrect standard errors in the analyses. A promising new direction for dealing with this issue is the multilevel model approach to meta-analysis (Hox, 2002; Kalaian & Kasim, 2008a, 2008b; Kalaian & Raudenbush, 1996), outlined below. First, we will examine how meta-analysts using traditional methods have dealt with the multiple outcome issue in the past, to see why a new approach is necessary.

Approaches to Dealing with Multiple Outcomes in Traditional Meta-Analytic Models

As a result of the assumption of independence in fixed and random effects models, meta-analysts using these traditional methods are forced to consider only one outcome per study. This inevitably has important theoretical implications, as well as statistical ramifications (see Gleser & Olkin, 1994, for a more detailed discussion of the problem with dependent effect sizes).

Choose one outcome of interest. Many traditional meta-analyses focus on only the one outcome that is of most interest to the researcher. For instance, the researcher might only look at how an educational programme improves overall school grades, but not be interested in the effect of the programme on standardised achievement tests. This approach is appropriate for some research questions that are very specific and deal with a narrow range of outcomes. However, this strategy does not allow a detailed look at how the independent variables of interest (e.g., an educational programme) affect different outcomes (e.g., grades in mathematics and grades in English).

Averaging the effect sizes. A common approach when a study reports multiple indicators of the same construct (e.g., different measures of job satisfaction) is to establish an independent set of effect sizes by calculating the average of the effect sizes in the study. However, the dependent variables need to be almost perfectly correlated for this method to work, because the mean effect size gives an estimate that is lower than expected if the correlation is moderate or low (Rosenthal & Rubin, 1986). In addition to a high statistical correlation, the outcomes (or indicators) should be conceptually similar and on the same metric to make the results meaningful. Similarly, averaging distinct domains of self-concept (e.g., maths ability self-concept and verbal ability self-concept) may not make sense when forming a global, overall self-concept score, since these domains have been shown to be nearly uncorrelated and hence very distinct (see Marsh & O'Mara, 2008, for a review).

As such, the practice of averaging outcomes is usually counter-productive, as potentially valuable information is lost or the resulting score might even be meaningless. In particular, it is highly problematic for multidimensional constructs in which the multiple dimensions are specifically designed to measure different facets of the construct and are highly differentiated (e.g., multiple dimensions of self-concept, for which the average correlation among different self-concept domains is only about .20).

Separate meta-analyses on each outcome. If multiple outcomes are of interest to the researcher, then one approach is to conduct separate analyses on each outcome. However, this approach does not allow contrasts between outcomes, thereby restricting the questions that can be asked. Thus, for example, if the meta-analyst specifically wanted to compare the effects of a maths intervention on maths and verbal self-concepts, it would be important to consider both facets of self-concept in the same analysis. Therefore, this may not always make sense for the research question under consideration (Rosenthal & Rubin, 1986).

Shifting unit of analysis (Cooper, 1998). Cooper suggested that the outcomes of the included studies could be aggregated depending on the level of analysis of interest: the study or outcome level. At the study level, all effect sizes from within a study are aggregated to produce one outcome per study. For each moderator analysis, effect sizes are aggregated based upon the particular moderator (or predictor) variable, such that each study only includes one effect size per outcome on that particular variable. Although this strategic compromise does not eliminate the problem of independence, it minimizes violations of assumptions about the independence of effect sizes whilst preserving as much of the data as possible (Cooper, 1998). It is probably the most defensible way of dealing with multiple outcomes in fixed and random effects models in cases where the researcher is explicitly interested in comparing different outcomes. However, this too has problems. Within an aggregated dataset, the same problems as with averaging the outcomes arise. In addition, each set of analyses will be based on a different set of effect sizes, making interpretation of different analyses within the same meta-analysis difficult. Indeed, the original presentation of this strategy was a short suggestion presented by Cooper (1998) as an expedient compromise to an apparently serious limitation of traditional approaches to meta-analysis, but with no statistical or empirical justification. Despite being little more than a note in passing, Cooper's suggestion has been widely cited as a justification for the aggregation of effect sizes depending on the unit of analysis.

However, little research has evaluated whether this method is indeed an improvement on the alternatives. Although the present investigation does not evaluate the shifting unit of analysis approach, an alternative that is based on sound statistical reasoning is tested. In summary, there are no truly satisfactory ways of dealing with multiple outcomes in the traditional meta-analytic models when the researcher is explicitly interested in comparing and contrasting different outcomes. In response to this, researchers are now exploring multilevel modelling approaches to meta-analysis.

Approaches to Dealing With Multiple Outcomes in Multilevel Modelling

Multilevel (or hierarchical linear) models were applied to meta-analysis to account for the dependencies in multivariate meta-analytic datasets (Goldstein, 2003; Hox, 2002; Marsh, O'Mara, & Malmberg, 2008; Raudenbush & Bryk, 2002). The idea here is that multiple effect sizes from the same study can be included in the analyses, without averaging or aggregating and thereby losing information. Multilevel analyses (in general) are increasingly used in the social sciences for analysing hierarchically ordered data in which the assumption of independence is violated (Hox, 2002). In typical educational data, for example, students (level 1) are nested within classes (level 2). Students within the same class are usually more similar to each other than to students from different classes, violating the assumption of independence that underlies random sampling (Goldstein, 2003). Meta-analysis is a special type of multilevel data in which the multiple effect sizes from a given study (level 1) are nested within the studies included in the meta-analysis (level 2) (Hox & de Leeuw, 2003). Multilevel model meta-analyses are surprisingly uncommon, despite being first suggested by Raudenbush and Bryk in 1985. The lack of published meta-analyses utilizing this method does not stem from scepticism about its potential efficacy; rather, a distinct lack of prescriptive instructions on how to conduct a meta-analysis using multilevel modelling (Van den Noortgate & Onghena, 2003) has made researchers hesitant. Only in the last few years have more detailed methodological notes been published (e.g., Hox & de Leeuw, 2003; Van den Noortgate & Onghena, 2003; Hox, 2002; Raudenbush & Bryk, 2002).

Three (or more) level models. In addition to the issue of multiple outcomes, meta-analyses can violate the assumption of independence by having more than one treatment group or sample reported in the same study. It is not uncommon to have more than two levels in the data structure, as would be the case when there are multiple studies within the same publication, or multiple publications from the same research team.

A multilevel approach is more explicit in identifying the different levels, determining how much variance is explained at each level, and evaluating characteristics that are specific to different levels (and even interactions between effects that might occur at different levels). Although random effects models do include an estimate of the between-studies variance, they are not really able to incorporate true hierarchical data, particularly when there are more than two levels in the structure.

Known population correlation. More recently, the multilevel meta-analytic model has been extended to include multivariate outcomes (Kalaian & Raudenbush, 1996; Kalaian & Kasim, 2008a, 2008b). By accounting for this nested structure and including an estimate of the correlation between the outcomes within a study, it is proposed that the problems with dependencies within the dataset are minimised (Kalaian & Kasim, 2008a, 2008b). This claim is yet to undergo rigorous testing using simulated data, as previous work (e.g., Berkey, Anderson, & Hoaglin, 1996; Kalaian & Raudenbush, 1996; Kalaian & Kasim, 2008a, 2008b) has been based on real data. However, despite the intuitive appeal of multivariate multilevel model meta-analyses, in practice they can be difficult to implement. The primary reason for this is that the within-study correlations are often not known (see Riley, in press, for a review). Typically, the primary studies included in the meta-analysis do not report the correlations between the outcomes of interest, nor do they report the raw data so that a meta-analyst could calculate the correlation (Riley, Thompson, & Abrams, 2008). In a meta-analysis of Scholastic Aptitude Test (SAT) scores, Kalaian and Kasim (2008a; see also Kalaian & Raudenbush, 1996) did not have access to the within-study correlations, so they used the correlation between the two outcomes (maths and verbal achievement) published in the SAT manual. In this case, this approach is reasonable because all of the included studies reported the exact same outcomes using the exact same measure and based on similar samples. Thus, even though it is likely that there will be slight study-to-study variations in the actual sample correlation, it is reasonable to assume that the sample correlations would not differ much from the population correlation published in the test manual. This assumption becomes less realistic when (a) different studies use different instruments to measure the outcomes, (b) the population correlation between the outcomes is not known, or (c) the samples in the included studies differ from the population.

This means that in some cases, the estimate of the correlation is little better than an educated guess. In summary, whilst Kalaian and colleagues' research provides a reasonable compromise to this problem in the context of SAT scores, this situation does not generalize to other research. Hence, meta-analysts are left with the problem of dealing with the estimation of the correlation between multiple outcomes at level 1 (i.e., the level of the individual participant).

Unknown population correlation. An alternative is to use the known correlations from some studies to estimate the unknown correlations in others. For example, Berkey et al. (1996) used the correlations between outcomes in one clinical trial as the basis for the correlations in any other study that did not report the correlation. They conducted a sensitivity analysis in combination with this to show that they would have reached a similar conclusion had the correlations come from a different source. Using known correlations could also involve mean imputation. For instance, if a meta-analysis yielded 20 studies, and half of them reported the correlation between the outcomes, then one could use the average of the known correlations as a fixed value for the unknown correlations (as sketched at the end of this section). A reasonable number of known correlations (or access to the original raw data; see Riley, in press) is required for this to be appropriate. An alternative approach could be to use more sophisticated approaches to missing data, such as multiple imputation.

Riley et al. (2008) suggested an approach for bivariate random effects model meta-analysis (BRMA) when the within-study correlations are not known. This model allows for individual weightings for each included study, but includes only one overall correlation parameter, ρ. This overall correlation is a hybrid measure of the within-study and between-study correlations. This seems a promising approach. However, so far the model is confined to bivariate studies with complete data; it is not clear how to estimate the overall correlation with more than two outcomes, or when some of the studies include only one (or some) of the outcomes of interest. Adding more outcomes will make the modelling process very complex. Perhaps most importantly, this model assumes that the within-study correlations will be equal across all studies. Although this may be reasonable in many cases, it is not clear whether having different samples across the studies (which could potentially be from different populations) and, perhaps, different measures of the target constructs, will mean that the correlation between the outcomes is the same in every study.
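As a rough illustration of the mean-imputation strategy described above, the following sketch fills in missing within-study correlations with the average of the reported ones. The correlation values and the study count are hypothetical, chosen only to mirror the 20-study example in the text.

```python
import numpy as np

# Hypothetical within-study correlations for 20 studies; None = not reported.
reported = [0.31, None, 0.25, 0.40, None, 0.28, 0.35, None, 0.22, 0.30,
            None, 0.27, None, 0.33, 0.29, None, 0.36, None, 0.24, None]

known = [r for r in reported if r is not None]
mean_r = float(np.mean(known))  # average of the reported correlations

# Impute the mean as a fixed value wherever the correlation is unknown.
imputed = [r if r is not None else mean_r for r in reported]
print(f"Imputed r = {mean_r:.3f} for {reported.count(None)} studies")
```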

In summary, multilevel models are the most recent and sophisticated development in meta-analytic methodology. They have greatly expanded the number and types of research questions that can be appropriately addressed through meta-analysis. They promise to resolve the issue of non-independent outcomes within meta-analytic datasets, although this promise is undermined by the difficulty in acquiring or estimating the within-study correlations. This issue is yet to be fully evaluated.

Meta-analytic Approaches to Research Synthesis

In the following sections, the basic components of the analytical strategies in meta-analysis (i.e., effect sizes, variances, and covariances) are summarised. This is followed by an overview of the underlying models of the major techniques in the fixed effects (FE), random effects (RE), and multilevel (ML) approaches. For ease of reading, please note the following notation used in the present study. p denotes the number of outcome measures in each of the primary experimental studies. In this bivariate model, each of the primary studies under review has two outcome measures (p = 2), so that the total number of outcome measures across all k studies is equal to 2k (or pk, the number of outcomes multiplied by the number of studies). i denotes the study from which the data are drawn. k represents the number of studies included in the meta-analysis.

Effect Sizes, Variances, and Covariances

Effect sizes. An effect size is a standardised numerical outcome for comparing the studies. There are numerous ways of calculating an effect size (see Lipsey & Wilson, 2001). Here we will be examining the standardised mean difference effect size. In the present study, an effect size is defined as the difference between two groups (e.g., experimental and control groups) in standard deviation units. This is calculated as (Glass, 1976; Rosenthal, 1994; Lipsey & Wilson, 2001)

$$d_{ip} = \frac{\bar{Y}_{1ip} - \bar{Y}_{2ip}}{s_{pooled}} \qquad (1)$$

where $\bar{Y}_{1ip}$ and $\bar{Y}_{2ip}$ are the means of the outcome scores for group 1 and group 2, respectively, and $s_{pooled}$ is the pooled standard deviation (the square root of the pooled within-groups variance) from the ith study. $s_{pooled}$ is estimated as follows:

$$s_{pooled} = \sqrt{\frac{s_{1ip}^{2}(n_{1i} - 1) + s_{2ip}^{2}(n_{2i} - 1)}{n_{1i} + n_{2i} - 2}} \qquad (2)$$

where $s_{1ip}$ and $s_{2ip}$ are the estimated standard deviations on outcome measure p in study i for groups 1 and 2, respectively, and $n_{1i}$ and $n_{2i}$ are the sample sizes for groups 1 and 2 in study i, respectively.

Variances and weighting. The variance of the effect size in meta-analysis is the sampling error variance of the original study. It is the square of the standard error of the effect size. The standard error of the effect size is defined as the standard deviation of the sampling distribution (Lipsey & Wilson, 2001). The variance is based on the sample sizes for the two groups and the effect size, as seen below. The sampling variance ($v_{ip}$) of the effect size $d_{ip}$ for study i can be estimated as (Hedges, 1981)

$$v_{ip} = \frac{n_{1i} + n_{2i}}{n_{1i} n_{2i}} + \frac{d_{ip}^{2}}{2(n_{1i} + n_{2i})} \qquad (3)$$

where $n_{1i}$ and $n_{2i}$ are the sample sizes for groups 1 and 2 in study i, respectively, and $d_{ip}^{2}$ is the square of effect size p for study i. The variance is very important in meta-analysis, as it is used in weighting the effect sizes. Because different studies have different sample sizes and standard errors of the effect sizes, simply finding the average of the effect sizes may give unwarranted weight to studies with small samples or large sampling error. So, meta-analysts weight the effect sizes by the inverse of the variance, so that more weight is given to effect sizes that are considered to be more precise, such as those with a small standard error and those from studies with larger samples. The weighting formulas for the different models are given in the next section.

Covariances. It has been proposed that the covariance between any pair of the multiple effect sizes needs to be estimated to account for the correlation between outcomes (e.g., Kalaian & Raudenbush, 1996). The covariance between the effect sizes from each primary study can be estimated as follows (Kalaian & Kasim, 2008a):

$$\sigma_{ip,ip'} = r_{ip,ip'} \frac{n_{1i} + n_{2i}}{n_{1i} n_{2i}} + \frac{r_{ip,ip'}^{2} \, d_{ip} d_{ip'}}{2(n_{1i} + n_{2i})} \qquad (4)$$

where $r_{ip,ip'}$ is the correlation coefficient between the outcome measures in each of the primary studies, and $d_{ip}$ and $d_{ip'}$ represent the two effect sizes; in a bivariate model, the effect sizes would be denoted $d_{i1}$ and $d_{i2}$. The covariance term is only included in the ML model.
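To make Equations 1-4 concrete, here is a minimal numpy sketch of the same quantities. The function names and the illustrative inputs are our own, not from the original study.

```python
import numpy as np

def smd(m1, m2, sd1, sd2, n1, n2):
    """Standardised mean difference (Equations 1 and 2)."""
    s_pooled = np.sqrt((sd1**2 * (n1 - 1) + sd2**2 * (n2 - 1)) / (n1 + n2 - 2))
    return (m1 - m2) / s_pooled

def sampling_variance(d, n1, n2):
    """Sampling variance of d (Equation 3; Hedges, 1981)."""
    return (n1 + n2) / (n1 * n2) + d**2 / (2 * (n1 + n2))

def covariance(d1, d2, r, n1, n2):
    """Covariance of two effect sizes from the same study (Equation 4)."""
    return r * (n1 + n2) / (n1 * n2) + r**2 * d1 * d2 / (2 * (n1 + n2))

# One hypothetical study with two outcomes and a within-study correlation of .2.
d1 = smd(10.5, 9.0, 3.0, 3.2, 80, 100)
d2 = smd(22.0, 20.5, 5.0, 5.1, 80, 100)
print(sampling_variance(d1, 80, 100), covariance(d1, d2, 0.2, 80, 100))
```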

The Three Meta-Analytic Models (Fixed, Random, and Multilevel Models)

Fixed effects models. As noted earlier, FE models assume that all of the variance between the effect sizes is attributable to sampling error ($v_{ip}$), because the effects are assumed to all come from the same population (Hedges & Olkin, 1985). Thus, the formula for the observed effect size in an FE model is given by

$$d_i = \delta + e_i \qquad (5)$$

where $d_i$ is the observed effect size in study i, $\delta$ is the true population effect, and $e_i$ is the residual due to sampling variance in study i (equivalent to the square root of $v_{ip}$). In the FE model, a weighted average of the effect sizes is calculated. The weight is given by

$$w_{ipF} = \frac{1}{v_{ip}} \qquad (6)$$

where $v_{ip}$ is the value calculated using Equation (3). The overall mean effect size is given by

$$\bar{d}_p = \frac{\sum_i w_{ipF} \, d_{ip}}{\sum_i w_{ipF}} \qquad (7)$$

In addition to calculating the overall weighted mean effect size, a test of the homogeneity of the effect sizes is conducted. This indicates whether the effect sizes are similar to each other and are drawn from the same population of effect sizes. That is, it asks the question: are the differences among the effect sizes bigger than might be expected by chance? If the test is significant, it means that the data are heterogeneous, and so this assumption of the FE model is violated. The Q-statistic follows a chi-square distribution, and is calculated as

$$Q_p = \sum_i w_{ip} (d_{ip} - \bar{d}_p)^2 \qquad (8)$$

where $d_{ip}$ is the observed effect size p in study i, $\bar{d}_p$ is the weighted mean effect size for outcome p, and $w_{ip}$ is the weighting for effect size p in study i. If this value is significant when compared against a chi-square table, the meta-analyst should conduct RE analyses instead, but may also wish to model moderators of the effect sizes to determine the source(s) of the variance.
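The FE computations in Equations 6-8 reduce to inverse-variance weighting plus a chi-square homogeneity test. A minimal sketch, assuming d and v are arrays holding the effect sizes and sampling variances for one outcome across the k studies:

```python
import numpy as np
from scipy import stats

def fixed_effects(d, v):
    """Inverse-variance weighted mean and homogeneity test (Equations 6-8)."""
    w = 1.0 / v                                  # Equation 6: FE weights
    d_bar = np.sum(w * d) / np.sum(w)            # Equation 7: weighted mean
    se = np.sqrt(1.0 / np.sum(w))                # standard error of d_bar
    q = np.sum(w * (d - d_bar) ** 2)             # Equation 8: Q-statistic
    p_value = stats.chi2.sf(q, df=len(d) - 1)    # Q ~ chi-square with k - 1 df
    return d_bar, se, q, p_value
```

A significant Q (small p_value) signals heterogeneity and hence a move to the RE model.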

Random effects model. In addition to the sampling error, RE models also assume that there are some genuine between-study differences, which contribute to the variation in the effect sizes. The general formula for the RE model is

$$d_{ip} = \delta + u_{ip} + e_{ip} \qquad (9)$$

where $\delta$ is the mean true population effect size, $u_{ip}$ is the deviation of the true study effect size from the mean true effect size, and $e_{ip}$ is the residual due to sampling variance in study i. The RE model takes the between-study variability into account by including a random error variance component in the weighting. The random error variance component, $v_{\theta}$, is given by

$$v_{\theta p} = \frac{Q_p - (k - 1)}{\sum_i w_{ipF} - \dfrac{\sum_i w_{ipF}^{2}}{\sum_i w_{ipF}}} \qquad (10)$$

where $Q_p$ is the Q-statistic calculated to test the homogeneity of the model (Equation 8), k is the number of studies, and $w_{ipF}$ is the effect size weight calculated under the FE model (Equation 6). The random error variance component is then used in the calculation of the weighting:

$$w_{ipR} = \frac{1}{v_{ip} + v_{\theta p}} \qquad (11)$$

The weighted overall mean effect size is then calculated as in Equation (7), but with $w_{ipR}$ rather than $w_{ipF}$.

Multilevel model. Like the RE model, the ML model assumes that there is sampling error variance (i.e., within-study variance) and between-study variance. The formula for the observed effect size in a ML model is

$$d_{ip} = \gamma_0 + u_{ip} + e_{ip} \qquad (12)$$

where $d_{ip}$ is the observed effect size p in study i, $\gamma_0$ is the mean true population effect size, $u_{ip}$ is the deviation of the true study effect size p from the mean true effect size, and $e_{ip}$ is the residual due to sampling variance in study i. The ML model differs from the RE model in the way that the parameters are estimated. It uses an iterative process, so that the model is run repeatedly, improving the estimation each time, until the estimates fail to improve significantly (known as "convergence"). The process through which this occurs is detailed in Goldstein (2003), Hox (2002), and Raudenbush and Bryk (2002).
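Before turning to estimation details, here is a sketch of the RE computations (Equations 9-11), building on the fixed_effects helper above. Truncating the variance component at zero is a common convention not stated in the text.

```python
import numpy as np

def random_effects(d, v):
    """Method-of-moments RE estimate (Equations 9-11)."""
    w_f = 1.0 / v                                # FE weights (Equation 6)
    d_bar_f = np.sum(w_f * d) / np.sum(w_f)
    q = np.sum(w_f * (d - d_bar_f) ** 2)         # Q-statistic (Equation 8)
    k = len(d)
    # Equation 10: random error variance component, truncated at zero.
    v_theta = max(0.0, (q - (k - 1)) /
                  (np.sum(w_f) - np.sum(w_f**2) / np.sum(w_f)))
    w_r = 1.0 / (v + v_theta)                    # Equation 11: RE weights
    d_bar_r = np.sum(w_r * d) / np.sum(w_r)      # Equation 7 with RE weights
    return d_bar_r, np.sqrt(1.0 / np.sum(w_r)), v_theta
```

The ML model replaces this closed-form moment estimator with the iterative (restricted) maximum likelihood procedure described next.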

The estimation procedure appropriate in meta-analysis is called restricted maximum likelihood, which means that the variance components are estimated after removing the fixed effects from the model. This model is preferred particularly for meta-analysis (see Bryk & Raudenbush, 1992) because the significance of the between-study variance is critical in deciding whether all the studies report the same outcome, and the restricted maximum likelihood model is more accurate in estimating the variance components (Hox, 2002). The alternative estimation procedure, full information maximum likelihood, has also been found to produce downwardly biased estimates of the fixed parameters in meta-analysis (e.g., Turner, Omar, Yang, Goldstein, & Thompson, 2000; Van den Noortgate & Onghena, 2003).

ML models have been suggested to give more precise and less biased estimates of between-study variance than traditional techniques based on Hedges and Olkin's (1985) formula. This is particularly true for comparisons of standard errors in FE models (Bateman & Jones, 2003; Hox, 2002; Van den Noortgate & Onghena, 2003). Hox (2002) notes that, "in RE weighted regression, an estimate of the residual between-study variance must be provided by the meta-analyst before conducting the analysis, whereas multilevel modelling programmes use iterative maximum likelihood estimation, taking into account the information contained in the explanatory variables in the model" (p. 144). Since the estimation of standard errors is improved in multilevel modelling, the confidence intervals will be more precise, regardless of heterogeneity. However, like the RE model, ML models incorporate the random error variance component, and so the confidence intervals will be more similar between these two models than to the FE model. The difference between RE and ML models is likely to increase with greater standard errors (as proposed by Hox, in press).

The Present Investigation

Standard approaches to multilevel model meta-analyses (referred to here as the general ML model) have been cited as a logical way of representing meta-analytic data (e.g., Hox, 2002). As an extension of this model, the multivariate ML approach has been hailed as the best way to deal with multivariate outcomes (Kalaian & Kasim, 2008a), as it explicitly models the covariance between the multiple outcomes. Although conceptually this seems reasonable, the practicality of knowing (or estimating) the various within-study correlations is likely to be problematic.

The purpose of the present investigation is twofold. Firstly, we wish to test whether the general ML model is actually an improvement on the FE and RE models in terms of estimating the overall mean effect sizes and their standard errors. Secondly, we seek to evaluate the multivariate ML model against the general ML model, to determine whether the inclusion of the covariance between the multivariate outcomes improves the estimates beyond the general ML model (which does not include the covariance term in the model). These issues will be examined through a simulation study. The methods will be compared on the amount of bias in the estimates, the accuracy and variability of the estimates, and the frequency of Type I and Type II error under a host of conditions. It is hypothesised that the ML model will perform better than the FE and RE models across all conditions. It is also hypothesised that the multivariate ML model will produce better estimates than the general ML model when the covariance in the fitted model matches the population covariance. The precise predictions for each of the outcomes (bias, variability, and coverage) will be elaborated on at the end of the Method section.

Method

Simulation Design

Statistical simulation is a numerical technique of experimenting with random sampling from probability distributions. In a typical simulation study, a series of computer-generated datasets are analysed under a host of different (statistical) conditions. The purpose of simulation is generally to determine how a particular statistical model behaves under various conditions, and therefore whether its properties operate in a way that is appropriate for particular research applications. For example, it is possible to test whether having a larger number of studies in the meta-analysis improves the estimate of the overall mean effect size. Because the dataset is computer-generated, the true sampling distribution based on the population generating model specified by the researcher (referred to as the population distribution here) is known for the particular set of conditions. This allows researchers to test whether the model is able to accurately reflect that generated distribution (Castro & Gaviria, 2000). Simulation studies involve generating multiple independent simulated datasets. In the present research, 1,000 datasets were generated per condition, which is considered a sufficient number for guaranteeing the precision of the analysis (e.g., Castro & Gaviria, 2000). The effect sizes of interest were standardised mean differences (i.e., the difference between two groups on an outcome; described further below).

The datasets were moderately independent, in that the same set of 1,000 simulated independent datasets was used to compare the three statistical methods (ML, FE, and RE) for the same condition, but a different set of datasets was generated for each condition investigated. This situation can be viewed as analogous to "a matched pair design where the within sample variability is eliminated and therefore are sensitive to detecting any differences between methods" (Burton, Altman, Royston, & Holder, 2006, p. 4282). The simulation conditions were determined by six characteristics of the meta-analyses and their included studies (see Table 1). In total, 144 conditions (with 432 cells representing the three models) were tested.

<<insert Table 1 about here>>

Number of studies (k = 20 or 100). With an increased number of studies, there is more information from which to calculate the parameters of interest. Therefore, we would expect that the accuracy of the estimates would increase with increasing k, particularly for the RE (Raudenbush, 1994) and ML models, which assume between-study variance. Whether 20 or 100 studies are considered a small or large sample of studies, respectively, depends on the discipline. In the social sciences, 20 would represent a small-to-medium meta-analysis, whereas 100 would represent a medium-to-large sample (cf. the sample sizes of the 302 meta-analyses included in Lipsey & Wilson's 1993 meta-meta-analysis).

Mean group sizes (mean $n_1$ = 80 or 100). It is unlikely that most studies included in a meta-analysis have the same mean sample size for the two groups. For example, it is often common that a treatment group has a smaller number of participants than a control group, due to costs, accessibility, or other reasons. The sample sizes of the two groups ($n_1$ and $n_2$) are important, because they are used in the calculations of every aspect of the meta-analysis, including the weighted mean effect size, the sampling variance, and, critically, the covariance. Thus, it is important to test whether the condition $n_1 \neq n_2$ has a differential effect when the covariance term is included in the model. The numbers of participants in the two groups were generated on a Poisson distribution (following Castro & Gaviria, 2003) under two conditions: one in which the means of the n's were equal (mean $n_1$ = 100, mean $n_2$ = 100) and one in which the means of the n's for the two groups were not equal (mean $n_1$ = 80, mean $n_2$ = 100). The group sample sizes are equal within a study, so that the two effect sizes are based on the same sample sizes within a study. It is acknowledged that this may not necessarily be the case in real data situations.

Overall true effect size (mean $\delta_1$ = 0 or 0.5). The size of the two outcomes is crucial in calculating the covariance, so it was important to test whether an imbalance in the means of the overall true effect sizes impacted the parameter estimation. In one condition, the population means of the effect sizes were equal for the two outcomes (mean $\delta_1$ = 0.5 and mean $\delta_2$ = 0.5), and in the other condition, the means were unequal (mean $\delta_1$ = 0 and mean $\delta_2$ = 0.5). These values were chosen because they were both seen to be realistic and common overall mean effect sizes in the social sciences. An effect size of zero represents no difference between the two groups, which is the null hypothesis, whereas an effect size of .5 represents a .5 standard deviation higher score for group 1 (e.g., the treatment group) compared to group 2 (e.g., the control group).

Variance in true effect sizes ($\sigma_1^2$ = 0.1 or 0.2). The variance ($\sigma_1^2$) in the population of effects for each outcome will determine the spread of the effect sizes. In one condition, the population variances of the effect sizes were equal (mean $\sigma_1^2$ = 0.1 and mean $\sigma_2^2$ = 0.1), and in the other condition, the population variances were unequal (mean $\sigma_1^2$ = 0.2 and mean $\sigma_2^2$ = 0.1). This is important, as the variances will determine the between-study heterogeneity.

Correlation in the fitted (ML) model (r = 0, 0.2, or 0.8). The FE and RE models assume that the data are independent, and so do not include an estimate of the covariance between the effects. The ML approach to meta-analysis is proposed to yield better estimates of the data when information about the correlation between the outcomes is included in the model. As shown earlier, the ML model includes the covariance, which incorporates the correlation between the outcomes (see Equation 4). Each pair of effect sizes will have its own covariance that is distinct from study to study, even if the correlation is constant across studies, because of the way in which the covariance is calculated (Equation 4). It was important to test the model when the fitted within-study correlation was zero, as this is equivalent to a ML model that does not include the covariance estimate. It was also desirable to test whether the model could determine correct parameter estimates when the correlation between outcomes was low (r = .2) or high (r = .8). It is possible that the inclusion of the covariance in the fitted model (based on the correlation between the outcomes) only makes a substantial difference when the within-study correlations are high.

Covariance in true effect sizes ($\sigma_{12}$ = 0, 0.005, or 0.02). This refers to the covariance between the two outcomes in the simulated data, and therefore represents the population covariance. The correlations in the fitted model (r = 0, .2, or .8), when substituted into Equation 4 for the $\delta$'s and n's specified above, correspond roughly with covariances of 0, .005, and .02, respectively. For example, when a correlation of .2 is entered into Equation 4 with $\delta_1$ and $\delta_2$ equal to .5 and $n_1$ and $n_2$ equal to 80 or 100, the covariance will equal approximately .005. Therefore, the population data were generated such that the two outcomes had a covariance of 0, .005, or .02. This was so that the population covariances would correspond with the zero, low, and high correlations in the fitted model.

In summary, by selecting these six factors (Table 1), it was intended to determine whether certain features of the studies or the meta-analyses had an impact on whether the inclusion of the correlation in the model improved the parameter estimates for the ML model compared to the FE and RE models (which do not incorporate the within-study correlation). The conditions selected were considered to be representative of typical social science meta-analytic data.

Generating the Data

The simulation was performed using MLwiN 2.10 (Beta 10; Rasbash, Browne, Healy, Cameron, & Charlton, 2008). The data were generated using macros developed by the first author. The effect sizes were simulated by the programme based on the population conditions specified in Table 1. The variances to be fitted in the model were calculated based on the simulated effect sizes and the randomly generated sample sizes.

Analysing the Simulated Data

The variances, weights, and standard errors were calculated based on Hedges and Olkin (1985), as described in the previous section. The fitted model for the FE was that described in Hedges (1994). The fitted model for the RE was the method of moments approach described in Raudenbush (1994). Also see Hedges and Vevea (1998), Overton (1998), and Lipsey and Wilson (2001) for details and comparisons of both the FE and RE models used here. For the ML approach, the multivariate model described by Kalaian and Raudenbush (1996) and Kalaian and Kasim (2008a, 2008b) was used. In the ML model, restricted maximum likelihood estimates were calculated because full maximum likelihood estimates are downwardly biased (e.g., Turner, Omar, Yang, Goldstein, & Thompson, 2000; Van den Noortgate & Onghena, 2003). In simulation research, three of the primary methods for testing the ability of a model to estimate the true population parameters are coverage, bias, and accuracy. The coverage, bias, and precision were calculated in MLwiN 2.10 (Beta 10), and then imported into SPSS 15.0 for analysis.
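The original data-generating macros were written in MLwiN and are not reproduced in the text; the following is a hypothetical numpy translation of the generating model for one condition, shown only to make the design concrete. True bivariate effects are drawn with the stated means, variances, and covariance; group sizes come from Poisson distributions; and observed effect sizes add sampling error with the variance of Equation 3.

```python
import numpy as np

rng = np.random.default_rng(1)
k = 20                                    # number of studies in this condition
delta = np.array([0.0, 0.5])              # population mean effect sizes
Sigma = np.array([[0.1, 0.005],           # between-study (co)variances of the
                  [0.005, 0.1]])          # true effects; cov = .005 (r ~ .2)

# True bivariate effect for each study, then Poisson-distributed group sizes.
true_d = rng.multivariate_normal(delta, Sigma, size=k)
n1 = rng.poisson(80, size=k)              # unequal mean group sizes condition
n2 = rng.poisson(100, size=k)

# Observed effects: true effect plus sampling error (variance from Equation 3).
v = ((n1 + n2) / (n1 * n2))[:, None] + true_d**2 / (2 * (n1 + n2))[:, None]
obs_d = true_d + rng.normal(0.0, np.sqrt(v))

# Check on the text's arithmetic: the leading term of Equation 4 with r = .2,
# n1 = 80, n2 = 100 is .2 * 180 / 8000 = .0045, i.e. roughly .005.
print(0.2 * (80 + 100) / (80 * 100))
```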

Bias. Bias is the deviation of an estimate from the true population quantity, and indicates the accuracy of the estimator for the effect. The basic assessment of bias is the difference between the average estimate and the true value, and is given by

$$B = \bar{\hat{\beta}} - \beta \qquad (13)$$

where $\beta$ is the true value of the parameter of interest, $\bar{\hat{\beta}} = \sum_{i=1}^{T} \hat{\beta}_i / T$, T is the number of simulations performed, and $\hat{\beta}_i$ is the estimate of interest within each of the i = 1, ..., T simulations. The smaller the bias, the closer the estimate is to the true value and the more accurate the estimate.

RMSE. The mean square error (MSE) indicates the overall variability of the model. It is defined as the average squared difference between the estimate and the true value, and includes both bias and efficiency (Collins et al., 2001). MSE is given by

$$MSE = (\bar{\hat{\beta}} - \beta)^2 + \left( SE(\hat{\beta}) \right)^2 \qquad (14)$$

The smaller the MSE, the better, as it indicates greater accuracy of the estimates. It is common to report the square root of the MSE (referred to as root mean square error, or RMSE), because this is on the same scale as the parameter and makes it easier to interpret (Burton et al., 2006; Collins et al., 2001). Again, smaller RMSEs are preferred.

Empirical SD and the bias of the SE. Precision can be assessed by examining the empirical standard deviation (SD). The empirical SD is the SD of the parameter estimates across the 1,000 iterations. A small empirical SD means that the estimates of that approach are not very variable. In this respect, precision differs from the RMSE in that the RMSE is based on deviations between each estimate and the true population parameter, whilst the empirical SD is based on deviations between each estimate and the mean estimate. Thus, RMSE reflects a combination of accuracy and precision, whilst the empirical SD reflects precision without regard to accuracy. Evaluating the difference between the empirical standard deviation and the estimated standard error (the mean of the estimated standard errors for a given parameter estimate provided by the model, averaged across the different simulation samples) can also reveal whether the estimated standard error is biased. This occurs when the empirical standard deviation varies systematically from the corresponding estimated standard error.
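Given the estimates and their model-based standard errors across the T simulated datasets, these performance measures are one-liners. A minimal sketch, where est and se are hypothetical arrays of length T, and SE(β̂) in Equation 14 is taken as the empirical SD of the estimates:

```python
import numpy as np

def performance(est, se, beta_true):
    """Bias (Eq. 13), RMSE (root of Eq. 14), empirical SD, and bias of the SE."""
    bias = est.mean() - beta_true             # Equation 13
    emp_sd = est.std(ddof=1)                  # empirical SD of the estimates
    rmse = np.sqrt(bias**2 + emp_sd**2)       # square root of Equation 14
    se_bias = se.mean() - emp_sd              # mean model SE vs. empirical SD
    return bias, rmse, emp_sd, se_bias
```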

Coverage. The coverage of a confidence interval is the proportion of times that the true specified parameter value is included in the obtained confidence interval. The confidence intervals are obtained by

$$\hat{\beta}_i \pm Z_{1-\alpha/2} \, SE(\hat{\beta}_i) \qquad (15)$$

where $\hat{\beta}_i$ is the estimate of interest within each of the i = 1, ..., T simulations, $SE(\hat{\beta}_i)$ is the standard error of the estimate of interest within each simulation, and $Z_{1-\alpha/2}$ is the $1 - \alpha/2$ quantile of the standard normal distribution. For the 95% confidence interval, $Z_{1-\alpha/2}$ is 1.96. The coverage indicates in part the accuracy of the standard error, and should be approximately equal to the nominal coverage rate. For example, a 95% confidence interval would warrant a 95% coverage rate. Coverage is linked to Type I and Type II error. Type I error is the error of rejecting a correct null hypothesis; it occurs when a difference is observed when in truth there is not a difference. Type II error is the error of accepting a false null hypothesis, and occurs when a difference is not observed when there actually is a difference. For a nominal coverage rate of 95%, coverage rates greater than 95% suggest that the results are too conservative, leading to a loss of statistical power with higher than expected Type II errors. Coverage rates lower than 95% can lead to higher than expected Type I errors. Burton et al. (2006) suggested that, with 95% confidence intervals and 1,000 generated datasets, the coverage value should lie between 93.6 and 96.4, as this is the range of two standard errors around the nominal coverage probability. However, if there is systematic bias, the interpretation of the coverage rate is complicated: an accurate standard error associated with a large bias will result in the population value not being included in the CI.
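Coverage follows directly from Equation 15: count how often the nominal interval contains the true value. A sketch, again assuming hypothetical arrays est and se over the T simulations:

```python
import numpy as np
from scipy import stats

def coverage(est, se, beta_true, alpha=0.05):
    """Proportion of intervals (Equation 15) that contain the true parameter."""
    z = stats.norm.ppf(1 - alpha / 2)         # 1.96 for a 95% interval
    lower = est - z * se
    upper = est + z * se
    return np.mean((lower <= beta_true) & (beta_true <= upper))
```

With 1,000 datasets, values outside roughly 93.6-96.4% would flag the over- or under-coverage discussed above.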

Hypotheses and Research Questions

Bias. In the present study, it is hypothesised that the ML model will show smaller bias in the parameter estimation than the RE and FE models when estimating multivariate outcomes. This is because the iterative process in the ML model ensures greater accuracy in the parameter estimates. The FE model is likely to show greater bias in the parameter estimation than the ML or RE models, as it does not include the between-studies variance estimate, and so is less likely to estimate the true population values. In addition, there is no prediction regarding the difference between the multivariate ML model and the general ML model in terms of bias.

RMSE. In the present study, it is hypothesised that the ML model will have a smaller RMSE than the FE and RE models. This is proposed because the iterative process ensures more accurate standard errors in the ML model, and so the estimates are more precise. The RE model may show a higher RMSE than both the ML and FE models. This is hypothesised because the RE model's standard errors tend to be larger than those of the FE model, as they include the random error variance component. The RE model's RMSEs may be larger than those of the ML model because the ML model will have smaller bias (which is included in the calculation of the RMSE), and because the RE model does not have an iterative process to increase precision. Also, it is predicted that the multivariate ML model will have a smaller RMSE than the general ML model because it will estimate a more accurate error structure.

Empirical SD and bias of the SE. We make no predictions regarding differences between the models in terms of the empirical standard deviation (SD). In terms of the difference between the empirical SD and the estimated standard error (SE), it is predicted that the difference will be smallest for the ML model and largest for the RE model, for the same reasons as for the RMSE. We anticipate that the multivariate ML model will have a smaller bias of the SE than the general ML model because it will estimate a more accurate error structure.

Coverage. It is hypothesised that the ML model will show no tendency towards either Type I or Type II error; i.e., the coverage will be between 93.6 and 96.4. The FE model is expected to be prone to Type I error (i.e., coverage < 93.6), because it does not include a random error variance component and so has smaller confidence intervals. Smaller confidence intervals mean that a result is more likely to be statistically significant, and therefore the model is more likely to falsely reject the null hypothesis. The RE model is assumed to be more similar to the ML model in terms of coverage, as both include a random error variance component, especially when heterogeneity is relatively small. However, in the conditions where heterogeneity is large (i.e., when the variance of $\delta_1$ is 0.2 in the present study), the RE model may be too conservative and show greater Type II error. This means that the RE model is more likely to falsely accept the null hypothesis. Finally, it is hypothesised that the multivariate ML model will have better coverage than the general ML model because it will estimate a more accurate error structure.

Results and Discussion

No problems were encountered in estimating the coefficients of the ML, FE, or RE models; the estimation procedure converged in all 144,000 simulated datasets. The bias, RMSE, and coverage for each of the three models were estimated for each generated dataset for all 432 cells (144 conditions × 3 models) in the design.

To determine which of the study's conditions contributed to the bias, the RMSE, the coverage, and the empirical standard deviation, we conducted ANOVAs using the bias, RMSE, coverage, and empirical standard deviation of the estimator as the dependent variables and each manipulated condition (method; number of studies, k; group size, n; population effect size; $\sigma_1^2$; covariance in the population; and covariance in the fitted model) as a factor. The ANOVAs were conducted at the cell level, with each cell average being treated as one observation (so that the highest-order interaction could not be separated from the error).

Assessment of Bias

The bias indicates whether the estimated mean is systematically different from the true population mean. To determine the bias, the cell mean for each of the 1,000 repetitions was calculated, and then the population value was subtracted from this mean (see Equation 13). For the ML approach, across the 144 cell means, the bias was very small in magnitude (M = -.001, SD = .000) for $\delta_1$ and ranged up to .002 (M = .000, SD = .002) for $\delta_2$ (see Table 2). As predicted, the bias for the FE approach was larger (M = -.010, SD = .010 for $\delta_1$; M = -.013, SD = .002 for $\delta_2$). The bias for the RE approach was similar to that of the ML approach (M = -.002, SD = .001 for $\delta_1$; M = -.003, SD = .002 for $\delta_2$). In contrast to the ML and RE approaches, the FE approach slightly underestimated the effect size estimates (i.e., the true population mean was typically larger than the estimated mean). However, it is clear that the biases for all three models are quite small.

<<insert Table 2 about here>>

The source of the bias was further investigated by conducting an ANOVA with the bias for $\delta_1$ as the dependent variable. The values related to $\delta_1$ were used because all of the changes in the conditions affected $\delta_1$ only (i.e., variance, size of the overall mean effect size) or $\delta_1$ and $\delta_2$ equally (i.e., n, k, covariance, correlation). Thus, any differences would be observed most predominantly in $\delta_1$. The factors in the ANOVA were the method (ML, FE, or RE) plus the six conditions described above (k, n, population effect size, $\sigma_1^2$, covariance in the population, and the correlation used in the fitted model).
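A sketch of this moderator ANOVA using statsmodels, run on a hypothetical data frame of cell means (one row per cell, with columns such as bias, method, k, and delta1); eta-squared is computed as each effect's share of the total sum of squares. Only three of the seven factors are shown, to keep the formula readable.

```python
import pandas as pd
import statsmodels.formula.api as smf
from statsmodels.stats.anova import anova_lm

def eta_squared(cells: pd.DataFrame) -> pd.Series:
    """Factorial ANOVA on cell-mean biases; returns eta-squared per effect."""
    fit = smf.ols("bias ~ C(method) * C(k) * C(delta1)", data=cells).fit()
    table = anova_lm(fit, typ=2)              # Type II sums of squares
    return table["sum_sq"] / table["sum_sq"].sum()
```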

Although all factors were included in the 7-way model, only the main effects and the 2-way, 3-way, and 4-way interactions were tested. This is because higher-order interactions would have too few cases per cell, would be difficult to interpret, and were always non-significant or extremely small in size. The results of the ANOVA suggest that the largest moderator of the bias was the main effect of method (η² = .335), followed by the population mean effect size × method interaction (η² = .330), the main effect of the population mean effect size (η² = .247), the variance × method interaction (η² = .027), the population mean effect size × variance × method interaction (η² = .027), the main effect of variance (η² = .016), the population mean effect size × variance interaction (η² = .014), and the main effect of k (η² = .003). All other main and interaction effects had η² values that were zero when rounded to three decimal places (i.e., less than 1/10 of 1% of the variance explained) and were deemed to be substantively negligible. The R² for this model was high due to the fact that analyses were based on cell means, but the total variance of the bias was very small (= .021).

Importantly, the method used explained the most variance in the biases. The FE model had the largest biases (see Table 2), whilst the RE and ML biases were quite similar. This means that, on average, the FE model tends to underestimate the true population value. Explaining the second largest amount of variance was the interaction between the population mean effect size and the method. Oddly, when the population mean effect size is zero, the three models have roughly the same bias; when the effect size increases, the FE model has a larger bias (see Figure 1). This could suggest that the FE model is less accurate at detecting the alternative hypothesis (that the two groups constituting the effect size calculation are not equal). The main effect for the population mean effect size also explained a large proportion of the variance in the biases. An inspection of the cell means indicates that bias is larger (mean bias = -.024, p < .001) for meta-analyses in which the population mean is .5, compared to meta-analyses in which the population mean is zero (mean bias = -.002, p < .001). It is arguable that, despite being statistically significant, these differences are substantively trivial. However, it might be the case that this difference is not trivial when the population mean is even higher than .5. Future simulations could test a broader range of population mean effect sizes to determine this.

The second largest interaction effect for bias was the variance × method interaction. From Figure 2 we can see that the bias for the FE model is consistently larger (in magnitude), but increases when the variance is larger. This means that the FE model is more prone to underestimate the mean effect size when the variance is higher.


Investigating the robustness of the nonparametric Levene test with more than two groups Psicológica (2014), 35, 361-383. Investigating the robustness of the nonparametric Levene test with more than two groups David W. Nordstokke * and S. Mitchell Colp University of Calgary, Canada Testing

More information

Minimizing Uncertainty in Property Casualty Loss Reserve Estimates Chris G. Gross, ACAS, MAAA

Minimizing Uncertainty in Property Casualty Loss Reserve Estimates Chris G. Gross, ACAS, MAAA Minimizing Uncertainty in Property Casualty Loss Reserve Estimates Chris G. Gross, ACAS, MAAA The uncertain nature of property casualty loss reserves Property Casualty loss reserves are inherently uncertain.

More information

In this chapter, we discuss the statistical methods used to test the viability

In this chapter, we discuss the statistical methods used to test the viability 5 Strategy for Measuring Constructs and Testing Relationships In this chapter, we discuss the statistical methods used to test the viability of our conceptual models as well as the methods used to test

More information

Unit 1 Exploring and Understanding Data

Unit 1 Exploring and Understanding Data Unit 1 Exploring and Understanding Data Area Principle Bar Chart Boxplot Conditional Distribution Dotplot Empirical Rule Five Number Summary Frequency Distribution Frequency Polygon Histogram Interquartile

More information

How to interpret results of metaanalysis

How to interpret results of metaanalysis How to interpret results of metaanalysis Tony Hak, Henk van Rhee, & Robert Suurmond Version 1.0, March 2016 Version 1.3, Updated June 2018 Meta-analysis is a systematic method for synthesizing quantitative

More information

Hypothesis Testing. Richard S. Balkin, Ph.D., LPC-S, NCC

Hypothesis Testing. Richard S. Balkin, Ph.D., LPC-S, NCC Hypothesis Testing Richard S. Balkin, Ph.D., LPC-S, NCC Overview When we have questions about the effect of a treatment or intervention or wish to compare groups, we use hypothesis testing Parametric statistics

More information

EXERCISE: HOW TO DO POWER CALCULATIONS IN OPTIMAL DESIGN SOFTWARE

EXERCISE: HOW TO DO POWER CALCULATIONS IN OPTIMAL DESIGN SOFTWARE ...... EXERCISE: HOW TO DO POWER CALCULATIONS IN OPTIMAL DESIGN SOFTWARE TABLE OF CONTENTS 73TKey Vocabulary37T... 1 73TIntroduction37T... 73TUsing the Optimal Design Software37T... 73TEstimating Sample

More information

Estimation of the predictive power of the model in mixed-effects meta-regression: A simulation study

Estimation of the predictive power of the model in mixed-effects meta-regression: A simulation study 30 British Journal of Mathematical and Statistical Psychology (2014), 67, 30 48 2013 The British Psychological Society www.wileyonlinelibrary.com Estimation of the predictive power of the model in mixed-effects

More information

11/24/2017. Do not imply a cause-and-effect relationship

11/24/2017. Do not imply a cause-and-effect relationship Correlational research is used to describe the relationship between two or more naturally occurring variables. Is age related to political conservativism? Are highly extraverted people less afraid of rejection

More information

How many speakers? How many tokens?:

How many speakers? How many tokens?: 1 NWAV 38- Ottawa, Canada 23/10/09 How many speakers? How many tokens?: A methodological contribution to the study of variation. Jorge Aguilar-Sánchez University of Wisconsin-La Crosse 2 Sample size in

More information

Where does "analysis" enter the experimental process?

Where does analysis enter the experimental process? Lecture Topic : ntroduction to the Principles of Experimental Design Experiment: An exercise designed to determine the effects of one or more variables (treatments) on one or more characteristics (response

More information

CHAPTER VI RESEARCH METHODOLOGY

CHAPTER VI RESEARCH METHODOLOGY CHAPTER VI RESEARCH METHODOLOGY 6.1 Research Design Research is an organized, systematic, data based, critical, objective, scientific inquiry or investigation into a specific problem, undertaken with the

More information

META-ANALYSIS OF DEPENDENT EFFECT SIZES: A REVIEW AND CONSOLIDATION OF METHODS

META-ANALYSIS OF DEPENDENT EFFECT SIZES: A REVIEW AND CONSOLIDATION OF METHODS META-ANALYSIS OF DEPENDENT EFFECT SIZES: A REVIEW AND CONSOLIDATION OF METHODS James E. Pustejovsky, UT Austin Beth Tipton, Columbia University Ariel Aloe, University of Iowa AERA NYC April 15, 2018 pusto@austin.utexas.edu

More information

JSM Survey Research Methods Section

JSM Survey Research Methods Section Methods and Issues in Trimming Extreme Weights in Sample Surveys Frank Potter and Yuhong Zheng Mathematica Policy Research, P.O. Box 393, Princeton, NJ 08543 Abstract In survey sampling practice, unequal

More information

COMMITTEE FOR PROPRIETARY MEDICINAL PRODUCTS (CPMP) POINTS TO CONSIDER ON MISSING DATA

COMMITTEE FOR PROPRIETARY MEDICINAL PRODUCTS (CPMP) POINTS TO CONSIDER ON MISSING DATA The European Agency for the Evaluation of Medicinal Products Evaluation of Medicines for Human Use London, 15 November 2001 CPMP/EWP/1776/99 COMMITTEE FOR PROPRIETARY MEDICINAL PRODUCTS (CPMP) POINTS TO

More information

One-Way ANOVAs t-test two statistically significant Type I error alpha null hypothesis dependant variable Independent variable three levels;

One-Way ANOVAs t-test two statistically significant Type I error alpha null hypothesis dependant variable Independent variable three levels; 1 One-Way ANOVAs We have already discussed the t-test. The t-test is used for comparing the means of two groups to determine if there is a statistically significant difference between them. The t-test

More information

The Meta on Meta-Analysis. Presented by Endia J. Lindo, Ph.D. University of North Texas

The Meta on Meta-Analysis. Presented by Endia J. Lindo, Ph.D. University of North Texas The Meta on Meta-Analysis Presented by Endia J. Lindo, Ph.D. University of North Texas Meta-Analysis What is it? Why use it? How to do it? Challenges and benefits? Current trends? What is meta-analysis?

More information

Business Statistics Probability

Business Statistics Probability Business Statistics The following was provided by Dr. Suzanne Delaney, and is a comprehensive review of Business Statistics. The workshop instructor will provide relevant examples during the Skills Assessment

More information

3 CONCEPTUAL FOUNDATIONS OF STATISTICS

3 CONCEPTUAL FOUNDATIONS OF STATISTICS 3 CONCEPTUAL FOUNDATIONS OF STATISTICS In this chapter, we examine the conceptual foundations of statistics. The goal is to give you an appreciation and conceptual understanding of some basic statistical

More information

Georgina Salas. Topics EDCI Intro to Research Dr. A.J. Herrera

Georgina Salas. Topics EDCI Intro to Research Dr. A.J. Herrera Homework assignment topics 51-63 Georgina Salas Topics 51-63 EDCI Intro to Research 6300.62 Dr. A.J. Herrera Topic 51 1. Which average is usually reported when the standard deviation is reported? The mean

More information

Measurement and meaningfulness in Decision Modeling

Measurement and meaningfulness in Decision Modeling Measurement and meaningfulness in Decision Modeling Brice Mayag University Paris Dauphine LAMSADE FRANCE Chapter 2 Brice Mayag (LAMSADE) Measurement theory and meaningfulness Chapter 2 1 / 47 Outline 1

More information

Multilevel IRT for group-level diagnosis. Chanho Park Daniel M. Bolt. University of Wisconsin-Madison

Multilevel IRT for group-level diagnosis. Chanho Park Daniel M. Bolt. University of Wisconsin-Madison Group-Level Diagnosis 1 N.B. Please do not cite or distribute. Multilevel IRT for group-level diagnosis Chanho Park Daniel M. Bolt University of Wisconsin-Madison Paper presented at the annual meeting

More information

Hierarchical Linear Models: Applications to cross-cultural comparisons of school culture

Hierarchical Linear Models: Applications to cross-cultural comparisons of school culture Hierarchical Linear Models: Applications to cross-cultural comparisons of school culture Magdalena M.C. Mok, Macquarie University & Teresa W.C. Ling, City Polytechnic of Hong Kong Paper presented at the

More information

Exploring the Impact of Missing Data in Multiple Regression

Exploring the Impact of Missing Data in Multiple Regression Exploring the Impact of Missing Data in Multiple Regression Michael G Kenward London School of Hygiene and Tropical Medicine 28th May 2015 1. Introduction In this note we are concerned with the conduct

More information

Section on Survey Research Methods JSM 2009

Section on Survey Research Methods JSM 2009 Missing Data and Complex Samples: The Impact of Listwise Deletion vs. Subpopulation Analysis on Statistical Bias and Hypothesis Test Results when Data are MCAR and MAR Bethany A. Bell, Jeffrey D. Kromrey

More information

Sampling Weights, Model Misspecification and Informative Sampling: A Simulation Study

Sampling Weights, Model Misspecification and Informative Sampling: A Simulation Study Sampling Weights, Model Misspecification and Informative Sampling: A Simulation Study Marianne (Marnie) Bertolet Department of Statistics Carnegie Mellon University Abstract Linear mixed-effects (LME)

More information

C h a p t e r 1 1. Psychologists. John B. Nezlek

C h a p t e r 1 1. Psychologists. John B. Nezlek C h a p t e r 1 1 Multilevel Modeling for Psychologists John B. Nezlek Multilevel analyses have become increasingly common in psychological research, although unfortunately, many researchers understanding

More information

Use of the estimated intraclass correlation for correcting differences in effect size by level

Use of the estimated intraclass correlation for correcting differences in effect size by level Behav Res (2012) 44:490 502 DOI 10.3758/s13428-011-0153-1 Use of the estimated intraclass correlation for correcting differences in effect size by level Soyeon Ahn & Nicolas D. Myers & Ying Jin Published

More information

Describe what is meant by a placebo Contrast the double-blind procedure with the single-blind procedure Review the structure for organizing a memo

Describe what is meant by a placebo Contrast the double-blind procedure with the single-blind procedure Review the structure for organizing a memo Business Statistics The following was provided by Dr. Suzanne Delaney, and is a comprehensive review of Business Statistics. The workshop instructor will provide relevant examples during the Skills Assessment

More information

Describe what is meant by a placebo Contrast the double-blind procedure with the single-blind procedure Review the structure for organizing a memo

Describe what is meant by a placebo Contrast the double-blind procedure with the single-blind procedure Review the structure for organizing a memo Please note the page numbers listed for the Lind book may vary by a page or two depending on which version of the textbook you have. Readings: Lind 1 11 (with emphasis on chapters 10, 11) Please note chapter

More information

CHAPTER 4 RESULTS. In this chapter the results of the empirical research are reported and discussed in the following order:

CHAPTER 4 RESULTS. In this chapter the results of the empirical research are reported and discussed in the following order: 71 CHAPTER 4 RESULTS 4.1 INTRODUCTION In this chapter the results of the empirical research are reported and discussed in the following order: (1) Descriptive statistics of the sample; the extraneous variables;

More information

Correlation and Regression

Correlation and Regression Dublin Institute of Technology ARROW@DIT Books/Book Chapters School of Management 2012-10 Correlation and Regression Donal O'Brien Dublin Institute of Technology, donal.obrien@dit.ie Pamela Sharkey Scott

More information

MMI 409 Spring 2009 Final Examination Gordon Bleil. 1. Is there a difference in depression as a function of group and drug?

MMI 409 Spring 2009 Final Examination Gordon Bleil. 1. Is there a difference in depression as a function of group and drug? MMI 409 Spring 2009 Final Examination Gordon Bleil Table of Contents Research Scenario and General Assumptions Questions for Dataset (Questions are hyperlinked to detailed answers) 1. Is there a difference

More information

GUIDELINE COMPARATORS & COMPARISONS:

GUIDELINE COMPARATORS & COMPARISONS: GUIDELINE COMPARATORS & COMPARISONS: Direct and indirect comparisons Adapted version (2015) based on COMPARATORS & COMPARISONS: Direct and indirect comparisons - February 2013 The primary objective of

More information

Context of Best Subset Regression

Context of Best Subset Regression Estimation of the Squared Cross-Validity Coefficient in the Context of Best Subset Regression Eugene Kennedy South Carolina Department of Education A monte carlo study was conducted to examine the performance

More information

(CORRELATIONAL DESIGN AND COMPARATIVE DESIGN)

(CORRELATIONAL DESIGN AND COMPARATIVE DESIGN) UNIT 4 OTHER DESIGNS (CORRELATIONAL DESIGN AND COMPARATIVE DESIGN) Quasi Experimental Design Structure 4.0 Introduction 4.1 Objectives 4.2 Definition of Correlational Research Design 4.3 Types of Correlational

More information

baseline comparisons in RCTs

baseline comparisons in RCTs Stefan L. K. Gruijters Maastricht University Introduction Checks on baseline differences in randomized controlled trials (RCTs) are often done using nullhypothesis significance tests (NHSTs). In a quick

More information

Received: 14 April 2016, Accepted: 28 October 2016 Published online 1 December 2016 in Wiley Online Library

Received: 14 April 2016, Accepted: 28 October 2016 Published online 1 December 2016 in Wiley Online Library Research Article Received: 14 April 2016, Accepted: 28 October 2016 Published online 1 December 2016 in Wiley Online Library (wileyonlinelibrary.com) DOI: 10.1002/sim.7171 One-stage individual participant

More information

CHAPTER III METHODOLOGY AND PROCEDURES. In the first part of this chapter, an overview of the meta-analysis methodology is

CHAPTER III METHODOLOGY AND PROCEDURES. In the first part of this chapter, an overview of the meta-analysis methodology is CHAPTER III METHODOLOGY AND PROCEDURES In the first part of this chapter, an overview of the meta-analysis methodology is provided. General procedures inherent in meta-analysis are described. The other

More information

Regression Discontinuity Analysis

Regression Discontinuity Analysis Regression Discontinuity Analysis A researcher wants to determine whether tutoring underachieving middle school students improves their math grades. Another wonders whether providing financial aid to low-income

More information

Reliability, validity, and all that jazz

Reliability, validity, and all that jazz Reliability, validity, and all that jazz Dylan Wiliam King s College London Introduction No measuring instrument is perfect. The most obvious problems relate to reliability. If we use a thermometer to

More information

Design and Analysis Plan Quantitative Synthesis of Federally-Funded Teen Pregnancy Prevention Programs HHS Contract #HHSP I 5/2/2016

Design and Analysis Plan Quantitative Synthesis of Federally-Funded Teen Pregnancy Prevention Programs HHS Contract #HHSP I 5/2/2016 Design and Analysis Plan Quantitative Synthesis of Federally-Funded Teen Pregnancy Prevention Programs HHS Contract #HHSP233201500069I 5/2/2016 Overview The goal of the meta-analysis is to assess the effects

More information

1 The conceptual underpinnings of statistical power

1 The conceptual underpinnings of statistical power 1 The conceptual underpinnings of statistical power The importance of statistical power As currently practiced in the social and health sciences, inferential statistics rest solidly upon two pillars: statistical

More information

Measuring and Assessing Study Quality

Measuring and Assessing Study Quality Measuring and Assessing Study Quality Jeff Valentine, PhD Co-Chair, Campbell Collaboration Training Group & Associate Professor, College of Education and Human Development, University of Louisville Why

More information

C2 Training: August 2010

C2 Training: August 2010 C2 Training: August 2010 Introduction to meta-analysis The Campbell Collaboration www.campbellcollaboration.org Pooled effect sizes Average across studies Calculated using inverse variance weights Studies

More information

Meta-Analysis and Publication Bias: How Well Does the FAT-PET-PEESE Procedure Work?

Meta-Analysis and Publication Bias: How Well Does the FAT-PET-PEESE Procedure Work? Meta-Analysis and Publication Bias: How Well Does the FAT-PET-PEESE Procedure Work? Nazila Alinaghi W. Robert Reed Department of Economics and Finance, University of Canterbury Abstract: This study uses

More information

A SAS Macro to Investigate Statistical Power in Meta-analysis Jin Liu, Fan Pan University of South Carolina Columbia

A SAS Macro to Investigate Statistical Power in Meta-analysis Jin Liu, Fan Pan University of South Carolina Columbia Paper 109 A SAS Macro to Investigate Statistical Power in Meta-analysis Jin Liu, Fan Pan University of South Carolina Columbia ABSTRACT Meta-analysis is a quantitative review method, which synthesizes

More information

Meta-Analysis David Wilson, Ph.D. Upcoming Seminar: October 20-21, 2017, Philadelphia, Pennsylvania

Meta-Analysis David Wilson, Ph.D. Upcoming Seminar: October 20-21, 2017, Philadelphia, Pennsylvania Meta-Analysis David Wilson, Ph.D. Upcoming Seminar: October 20-21, 2017, Philadelphia, Pennsylvania Meta-Analysis Workshop David B. Wilson, PhD September 16, 2016 George Mason University Department of

More information

Still important ideas

Still important ideas Readings: OpenStax - Chapters 1 13 & Appendix D & E (online) Plous Chapters 17 & 18 - Chapter 17: Social Influences - Chapter 18: Group Judgments and Decisions Still important ideas Contrast the measurement

More information

University of Bristol - Explore Bristol Research. Peer reviewed version. Link to published version (if available): /bmsp.

University of Bristol - Explore Bristol Research. Peer reviewed version. Link to published version (if available): /bmsp. Rubio-Aparicio, M., Sánchez-Meca, J., Lopez-Lopez, J. A., Botella, J., & Marín-Martínez, F. (017). Analysis of Categorical Moderators in Mixedeffects Meta-analysis: Consequences of Using Pooled vs. Separate

More information

Multilevel analysis quantifies variation in the experimental effect while optimizing power and preventing false positives

Multilevel analysis quantifies variation in the experimental effect while optimizing power and preventing false positives DOI 10.1186/s12868-015-0228-5 BMC Neuroscience RESEARCH ARTICLE Open Access Multilevel analysis quantifies variation in the experimental effect while optimizing power and preventing false positives Emmeke

More information

Revised Cochrane risk of bias tool for randomized trials (RoB 2.0) Additional considerations for cross-over trials

Revised Cochrane risk of bias tool for randomized trials (RoB 2.0) Additional considerations for cross-over trials Revised Cochrane risk of bias tool for randomized trials (RoB 2.0) Additional considerations for cross-over trials Edited by Julian PT Higgins on behalf of the RoB 2.0 working group on cross-over trials

More information

A critical look at the use of SEM in international business research

A critical look at the use of SEM in international business research sdss A critical look at the use of SEM in international business research Nicole F. Richter University of Southern Denmark Rudolf R. Sinkovics The University of Manchester Christian M. Ringle Hamburg University

More information

Instrumental Variables Estimation: An Introduction

Instrumental Variables Estimation: An Introduction Instrumental Variables Estimation: An Introduction Susan L. Ettner, Ph.D. Professor Division of General Internal Medicine and Health Services Research, UCLA The Problem The Problem Suppose you wish to

More information

Estimation of effect sizes in the presence of publication bias: a comparison of meta-analysis methods

Estimation of effect sizes in the presence of publication bias: a comparison of meta-analysis methods Estimation of effect sizes in the presence of publication bias: a comparison of meta-analysis methods Hilde Augusteijn M.A.L.M. van Assen R. C. M. van Aert APS May 29, 2016 Today s presentation Estimation

More information

10. LINEAR REGRESSION AND CORRELATION

10. LINEAR REGRESSION AND CORRELATION 1 10. LINEAR REGRESSION AND CORRELATION The contingency table describes an association between two nominal (categorical) variables (e.g., use of supplemental oxygen and mountaineer survival ). We have

More information

Data Analysis in Practice-Based Research. Stephen Zyzanski, PhD Department of Family Medicine Case Western Reserve University School of Medicine

Data Analysis in Practice-Based Research. Stephen Zyzanski, PhD Department of Family Medicine Case Western Reserve University School of Medicine Data Analysis in Practice-Based Research Stephen Zyzanski, PhD Department of Family Medicine Case Western Reserve University School of Medicine Multilevel Data Statistical analyses that fail to recognize

More information

A Comparison of Several Goodness-of-Fit Statistics

A Comparison of Several Goodness-of-Fit Statistics A Comparison of Several Goodness-of-Fit Statistics Robert L. McKinley The University of Toledo Craig N. Mills Educational Testing Service A study was conducted to evaluate four goodnessof-fit procedures

More information

An Empirical Assessment of Bivariate Methods for Meta-analysis of Test Accuracy

An Empirical Assessment of Bivariate Methods for Meta-analysis of Test Accuracy Number XX An Empirical Assessment of Bivariate Methods for Meta-analysis of Test Accuracy Prepared for: Agency for Healthcare Research and Quality U.S. Department of Health and Human Services 54 Gaither

More information

Recent developments for combining evidence within evidence streams: bias-adjusted meta-analysis

Recent developments for combining evidence within evidence streams: bias-adjusted meta-analysis EFSA/EBTC Colloquium, 25 October 2017 Recent developments for combining evidence within evidence streams: bias-adjusted meta-analysis Julian Higgins University of Bristol 1 Introduction to concepts Standard

More information

Meta-Analysis and Subgroups

Meta-Analysis and Subgroups Prev Sci (2013) 14:134 143 DOI 10.1007/s11121-013-0377-7 Meta-Analysis and Subgroups Michael Borenstein & Julian P. T. Higgins Published online: 13 March 2013 # Society for Prevention Research 2013 Abstract

More information

The Effect of Extremes in Small Sample Size on Simple Mixed Models: A Comparison of Level-1 and Level-2 Size

The Effect of Extremes in Small Sample Size on Simple Mixed Models: A Comparison of Level-1 and Level-2 Size INSTITUTE FOR DEFENSE ANALYSES The Effect of Extremes in Small Sample Size on Simple Mixed Models: A Comparison of Level-1 and Level-2 Size Jane Pinelis, Project Leader February 26, 2018 Approved for public

More information

WELCOME! Lecture 11 Thommy Perlinger

WELCOME! Lecture 11 Thommy Perlinger Quantitative Methods II WELCOME! Lecture 11 Thommy Perlinger Regression based on violated assumptions If any of the assumptions are violated, potential inaccuracies may be present in the estimated regression

More information

A Case Study: Two-sample categorical data

A Case Study: Two-sample categorical data A Case Study: Two-sample categorical data Patrick Breheny January 31 Patrick Breheny BST 701: Bayesian Modeling in Biostatistics 1/43 Introduction Model specification Continuous vs. mixture priors Choice

More information

Some General Guidelines for Choosing Missing Data Handling Methods in Educational Research

Some General Guidelines for Choosing Missing Data Handling Methods in Educational Research Journal of Modern Applied Statistical Methods Volume 13 Issue 2 Article 3 11-2014 Some General Guidelines for Choosing Missing Data Handling Methods in Educational Research Jehanzeb R. Cheema University

More information

Basic Statistics and Data Analysis in Work psychology: Statistical Examples

Basic Statistics and Data Analysis in Work psychology: Statistical Examples Basic Statistics and Data Analysis in Work psychology: Statistical Examples WORK PSYCHOLOGY INTRODUCTION In this chapter we examine a topic which is given too little coverage in most texts of this kind,

More information

Sanjay P. Zodpey Clinical Epidemiology Unit, Department of Preventive and Social Medicine, Government Medical College, Nagpur, Maharashtra, India.

Sanjay P. Zodpey Clinical Epidemiology Unit, Department of Preventive and Social Medicine, Government Medical College, Nagpur, Maharashtra, India. Research Methodology Sample size and power analysis in medical research Sanjay P. Zodpey Clinical Epidemiology Unit, Department of Preventive and Social Medicine, Government Medical College, Nagpur, Maharashtra,

More information

Lessons in biostatistics

Lessons in biostatistics Lessons in biostatistics The test of independence Mary L. McHugh Department of Nursing, School of Health and Human Services, National University, Aero Court, San Diego, California, USA Corresponding author:

More information

MODELING HIERARCHICAL STRUCTURES HIERARCHICAL LINEAR MODELING USING MPLUS

MODELING HIERARCHICAL STRUCTURES HIERARCHICAL LINEAR MODELING USING MPLUS MODELING HIERARCHICAL STRUCTURES HIERARCHICAL LINEAR MODELING USING MPLUS M. Jelonek Institute of Sociology, Jagiellonian University Grodzka 52, 31-044 Kraków, Poland e-mail: magjelonek@wp.pl The aim of

More information

Differential Item Functioning

Differential Item Functioning Differential Item Functioning Lecture #11 ICPSR Item Response Theory Workshop Lecture #11: 1of 62 Lecture Overview Detection of Differential Item Functioning (DIF) Distinguish Bias from DIF Test vs. Item

More information

Reliability, validity, and all that jazz

Reliability, validity, and all that jazz Reliability, validity, and all that jazz Dylan Wiliam King s College London Published in Education 3-13, 29 (3) pp. 17-21 (2001) Introduction No measuring instrument is perfect. If we use a thermometer

More information

Mantel-Haenszel Procedures for Detecting Differential Item Functioning

Mantel-Haenszel Procedures for Detecting Differential Item Functioning A Comparison of Logistic Regression and Mantel-Haenszel Procedures for Detecting Differential Item Functioning H. Jane Rogers, Teachers College, Columbia University Hariharan Swaminathan, University of

More information

Still important ideas

Still important ideas Readings: OpenStax - Chapters 1 11 + 13 & Appendix D & E (online) Plous - Chapters 2, 3, and 4 Chapter 2: Cognitive Dissonance, Chapter 3: Memory and Hindsight Bias, Chapter 4: Context Dependence Still

More information

Simple Linear Regression the model, estimation and testing

Simple Linear Regression the model, estimation and testing Simple Linear Regression the model, estimation and testing Lecture No. 05 Example 1 A production manager has compared the dexterity test scores of five assembly-line employees with their hourly productivity.

More information

ECONOMICS SERIES SWP 2013/2. Neither Fixed nor Random: Weighted Least Squares Meta-Regression. T.D. Stanley and Hristos Doucouliagos

ECONOMICS SERIES SWP 2013/2. Neither Fixed nor Random: Weighted Least Squares Meta-Regression. T.D. Stanley and Hristos Doucouliagos Faculty of Business and Law Department of Economics ECONOMICS SERIES SWP 013/ Neither Fixed nor Random: Weighted Least Squares Meta-Regression T.D. Stanley and Hristos Doucouliagos The working papers are

More information

Empowered by Psychometrics The Fundamentals of Psychometrics. Jim Wollack University of Wisconsin Madison

Empowered by Psychometrics The Fundamentals of Psychometrics. Jim Wollack University of Wisconsin Madison Empowered by Psychometrics The Fundamentals of Psychometrics Jim Wollack University of Wisconsin Madison Psycho-what? Psychometrics is the field of study concerned with the measurement of mental and psychological

More information

Analysis of Variance (ANOVA)

Analysis of Variance (ANOVA) Research Methods and Ethics in Psychology Week 4 Analysis of Variance (ANOVA) One Way Independent Groups ANOVA Brief revision of some important concepts To introduce the concept of familywise error rate.

More information

investigate. educate. inform.

investigate. educate. inform. investigate. educate. inform. Research Design What drives your research design? The battle between Qualitative and Quantitative is over Think before you leap What SHOULD drive your research design. Advanced

More information

Empirical assessment of univariate and bivariate meta-analyses for comparing the accuracy of diagnostic tests

Empirical assessment of univariate and bivariate meta-analyses for comparing the accuracy of diagnostic tests Empirical assessment of univariate and bivariate meta-analyses for comparing the accuracy of diagnostic tests Yemisi Takwoingi, Richard Riley and Jon Deeks Outline Rationale Methods Findings Summary Motivating

More information

Analysis of Confidence Rating Pilot Data: Executive Summary for the UKCAT Board

Analysis of Confidence Rating Pilot Data: Executive Summary for the UKCAT Board Analysis of Confidence Rating Pilot Data: Executive Summary for the UKCAT Board Paul Tiffin & Lewis Paton University of York Background Self-confidence may be the best non-cognitive predictor of future

More information