Methods for meta-analysis of individual participant data from Mendelian randomization studies with binary outcomes

Stephen Burgess, Simon G. Thompson, CRP CHD Genetics Collaboration

May 24, 2012

Abstract

Mendelian randomization is an epidemiological method for estimating causal associations from observational data by using genetic variants as instrumental variables. Typically the genetic variants explain only a small proportion of the variation in the risk factor of interest, and so large sample sizes are required, necessitating data from multiple sources. Meta-analysis based on individual patient data requires synthesis of studies which differ in many aspects. A proposed Bayesian framework is able to estimate a causal effect from each study, and combine these using a hierarchical model. The method is illustrated for data on C-reactive protein (CRP) and coronary heart disease (CHD) from the CRP CHD Genetics Collaboration (CCGC). Studies from the CCGC differ in terms of the genetic variants measured, the study design (prospective or retrospective, population-based or case-control), whether CRP was measured, the time of CRP measurement (pre- or post-disease), and whether full or tabular data were shared. We show how these data can be combined in an efficient way to give a single estimate of causal association based on the totality of the data available. Compared to a two-stage analysis, the Bayesian method is able to incorporate data on 23% additional participants and 51% more events, leading to a 23-26% gain in efficiency.

Keywords: Mendelian randomization, meta-analysis, individual participant data, causal inference.

Address: Department of Public Health & Primary Care, Strangeways Research Laboratory, Worts Causeway, Cambridge, CB1 8RN, UK. Correspondence to: sb452@medschl.cam.ac.uk.

1 Introduction

A fundamental epidemiological question of interest is whether an observed correlation between a risk factor and a disease is a causal or a non-causal association. Mendelian randomization is a technique for determining the causal association between a risk factor (X) and an outcome (Y) in the presence of several possibly unmeasured confounders (U) [1]. A genetic variant (G) is sought which is associated with the risk factor, not associated with any of the confounders, and independent of the outcome conditional on the risk factor and confounders [2] [3]. Such a variable is known as an instrumental variable [4].

1.1 Causal inference

A causal association refers to the effect of an intervention on a specific risk factor, and is usually the target of interest for an epidemiologist [5]. This is contrasted with an observational association, which describes how the outcome depends on an observed difference in a risk factor. Unfortunately, interpreting the association between a risk factor and a disease outcome in observational data as a causal association relies on untestable and often implausible assumptions. This has led to several high-profile cases where a risk factor has been widely advocated as important in disease prevention from observational data, only to be later discredited when the evidence from randomized trials did not support a causal interpretation [6]. For example, observational studies reported a strong inverse association between vitamin C and coronary heart disease, which did not attenuate on adjustment for a variety of risk factors [7]. However, results of experimental data obtained from randomized controlled trials (RCTs) showed a null association with a positive point estimate for the association [8].

1.2 Instrumental variables

An instrumental variable (IV) is associated with the risk factor of interest and is used to estimate the causal effect of change in the risk factor while all other risk factors remain constant [9] [10].
The fundamental conditions for an IV to satisfy are summarized as [3] [4] [11]:

i. the IV is associated with the risk factor ($G \not\perp\!\!\!\perp X$),
ii. the IV is not associated with any confounder ($G \perp\!\!\!\perp U$),
iii. the IV is conditionally independent of the outcome given the risk factor and confounders ($G \perp\!\!\!\perp Y \mid X, U$).

Subgroups defined as those with a given value of the IV are analogous to treatment arms in an RCT [12]. From the IV assumptions, these subgroups differ systematically in the risk factor, but not in any other factor [13]. A difference in disease incidence between these subgroups would therefore indicate a true causal relationship between risk factor and outcome [14].
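The analogy between IV subgroups and trial arms can be illustrated with a small simulation. The sketch below is not taken from the paper: it assumes a binary instrument, a single normally distributed confounder, a continuous outcome, and arbitrary effect sizes, and all function names are hypothetical. With a true causal effect of zero, the observational regression slope is pulled away from zero by the confounder, while the comparison between IV subgroups (difference in mean outcome divided by difference in mean risk factor) stays close to zero.

```python
import random
from statistics import fmean


def simulate(n=20000, causal_effect=0.0, seed=1):
    """Simulate (G, X, Y) triples under the three IV conditions (illustrative values)."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        g = 1 if rng.random() < 0.5 else 0            # instrument, independent of U
        u = rng.gauss(0, 1)                           # unmeasured confounder
        x = 0.5 * g + u + rng.gauss(0, 1)             # risk factor depends on G and U
        y = causal_effect * x + u + rng.gauss(0, 1)   # G affects Y only through X
        rows.append((g, x, y))
    return rows


def observational_slope(rows):
    """Least-squares slope of Y on X, ignoring the instrument."""
    xs = [x for _, x, _ in rows]
    ys = [y for _, _, y in rows]
    mx, my = fmean(xs), fmean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    return sxy / sxx


def ratio_estimate(rows):
    """Difference in mean Y between IV subgroups over the difference in mean X."""
    x1 = fmean([x for g, x, _ in rows if g == 1])
    x0 = fmean([x for g, x, _ in rows if g == 0])
    y1 = fmean([y for g, _, y in rows if g == 1])
    y0 = fmean([y for g, _, y in rows if g == 0])
    return (y1 - y0) / (x1 - x0)


if __name__ == "__main__":
    data = simulate(causal_effect=0.0)
    print("observational slope:", round(observational_slope(data), 3))
    print("IV ratio estimate:  ", round(ratio_estimate(data), 3))
```

The outcome here is continuous purely to keep the arithmetic transparent; the binary-outcome case treated in this paper needs the logistic models described in Section 3.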

1.3 Mendelian randomization

Mendelian randomization is instrumental variable analysis using genetic instruments [15]. Although not all Mendelian randomization studies have used IV methodology [16] [17], the use of genetic variants as IVs is at the core of Mendelian randomization. Genetic variants are ideal candidates for IVs, as genes are typically specific in function, affecting a single risk factor [18]. Genetic variation is determined at conception, so no reverse causation of an outcome on a genetic variant is possible. Genetic markers used as IVs are usually single nucleotide polymorphisms (SNPs) [11].

We consider data on C-reactive protein (CRP) and coronary heart disease (CHD) collated by the CRP CHD Genetics Collaboration (CCGC) [19] [20]. Although the methods in this paper were specifically designed for the data from the collaboration, we believe that they cover a wide range of study designs and scenarios and so will be useful for meta-analysis of Mendelian randomization data in other contexts.

The use of any particular genetic variant as an IV requires caution as the IV assumptions may be violated for various epidemiological and biological reasons, such as the gene being associated with variables on multiple risk pathways (pleiotropy) [3] [11] [21] [22] [23]. Although the assumptions cannot be fully tested, when the function of the gene where the genetic variant is located is known, we have good biological plausibility for use of the genetic variant as an IV. We assume that the genetic variants used as IVs in this paper satisfy the necessary conditions; this has been discussed at length elsewhere [19]. To summarize, the SNPs used as IVs are taken from the CRP gene region on chromosome 1 and are not known to be associated with any potential confounding factors either from experimental knowledge or empirical testing in this dataset (out of 84 associations between 4 SNPs and 21 alternative risk factors for CHD, 3 had p < 0.05, with minimal p-value 0.003).

1.4 Meta-analysis

In general, the variation in the risk factor of interest explained by genetic variants is small, and so adequately powered Mendelian randomization studies typically require large sample sizes, demanding synthesis of evidence from multiple, possibly heterogeneous studies [2]. We combat the problems raised by this heterogeneity by extending a Bayesian hierarchical method designed for continuous outcomes [24]. By making certain simplifying assumptions which are fully detailed below, we demonstrate how a range of different designs of studies with binary outcomes can be analysed using a logistic model, and how these causal estimates can be combined in a hierarchical model. We show how the parameters of genetic association can also be combined across studies, to strengthen the instrument and increase precision. By using the random effects distribution as an implicit prior for the genetic association parameters, we show how studies with no data on the risk factor can still be included in the analysis. By including both prevalent disease events (those reported at baseline) and incident events in prospective studies, we use all available data on disease outcomes.

1.5 Structure of paper

Having discussed the data available and sources of heterogeneity between studies in the CCGC (Section 2), we introduce the methodological framework and statistical model for analysis (Section 3). We show how this can be used to analyse each study in the collaboration (Section 4), assessing the model assumptions by sensitivity analysis. Extensions are discussed which efficiently deal with issues of combining evidence across studies (Section 5), and then results are presented for the causal change in CHD due to CRP (Section 6). We conclude by discussing the interpretation and potential applications of this method (Section 7).

2 The CRP CHD Genetics Collaboration

The CCGC is a collaboration of 47 epidemiological studies seeking to ascertain the causal role of C-reactive protein (CRP) in coronary heart disease (CHD) using a Mendelian randomization approach [20]. CRP is an acute-phase protein found in the blood which is associated with inflammation. It is known that CRP is observationally associated with CHD [25], but it is not known whether this association is causal [26] [27] [28]. Studies from the collaboration measure CRP levels, genes relating to CRP and CHD events. Individual participant data (IPD) have been collated by the coordinating centre. Table 1 lists the major statistical features of the studies in the CCGC. Further epidemiological characterization of the studies can be found in Appendix 1 of the published paper from the collaboration [19]. A list of study abbreviations as used in this paper can be found in Web Table A1. In all analyses, we restrict attention to participants of European descent, excluding the four studies with no participants of European descent from analysis. This is to ensure greater homogeneity of the study populations and to counteract violations of the IV assumptions due to population stratification [11]. CRP is positively skewed, and so we take log(CRP) as the risk factor.
We use the term risk ratio as a generic term meaning hazard ratio or odds ratio as appropriate.

2.1 Issues leading to difficulties in evidence synthesis

The studies to be combined differ in several aspects. We list some aspects of between-study variability such as clinical and methodological diversity, and differences in which variables have been measured. These lead to difficulties in evidence synthesis and possible statistical heterogeneity:

1. Study design: The collaboration includes prospective studies: cohort studies, nested case-control studies (both matched and unmatched); and retrospective studies: case-control studies (unmatched). Four of the studies in the collaboration did not provide IPD but only summary data on numbers of individuals with and without CHD events for each genotype. Different study designs are usually analysed using different methods and provide estimates which represent different quantities.

Table 1: Summary of studies from the CRP CHD Genetics Collaboration with subjects of European descent

Columns: Study (note 2); Study type; Total participants; Number of subjects with incident CHD, prevalent CHD and CRP data (note 3); SNP data (note 1) for g1, g2, g3, g4.

Study          Study type
BRHS           Cohort with prevalent cases
BWHHS          Cohort with prevalent cases
CCHS           Cohort with prevalent cases
CGPS           Cohort with prevalent cases
CHS            Cohort with prevalent cases
EAS            Cohort with prevalent cases
ELSA           Cohort with prevalent cases
FRAMOFF        Cohort with prevalent cases
PROSPER        Cohort with prevalent cases
ROTT           Cohort with prevalent cases
NPHSII         Cohort without prevalent cases
WOSCOPS        Cohort without prevalent cases
EPICNOR        Nested matched case-control
HPFS           Nested matched case-control
NHS            Nested matched case-control
NSC            Nested matched case-control
CAPS           Nested unmatched case-control
DDDD           Nested unmatched case-control
EPICNL         Nested unmatched case-control
WHIOS          Nested unmatched case-control
MALMO          Nested unmatched case-control with prevalent cases
SPEED          Nested unmatched case-control with prevalent cases
ARIC           Unmatched case-control
CUDAS          Unmatched case-control
CUPID          Unmatched case-control
HIFMECH        Unmatched case-control
HIMS           Unmatched case-control
ISIS           Unmatched case-control (see Web Appendix)
LURIC          Unmatched case-control
PROCARDIS      Unmatched case-control
SHEEP          Unmatched case-control
WHITE2         Unmatched case-control
CIHDS          Unmatched case-control (CRP in controls only)
BHF-FHS        Unmatched case-control (no CRP data)
CHAOS          Unmatched case-control (no CRP data)
GISSI          Unmatched case-control (no CRP data)
HVHS           Unmatched case-control (no CRP data)
INTHEART       Unmatched case-control (no CRP data)
UCP            Unmatched case-control (no CRP data)
AGES           Tabular data
HEALTHABC      Tabular data
MONICA/KORA    Tabular data
PENNCATH       Tabular data
Total

Note 1: g1 = rs1205, g2 = rs , g3 = rs , g4 = rs or equivalent proxies (see Web Appendix).
Note 2: A list of study abbreviations is given in the Web Table A1.
Note 3: In retrospective case-control studies, CRP data is taken in controls only; in prospective studies, in subjects without prevalent CHD.

2. Outcome data: The outcome was defined as fatal CHD (based on International Classification of Diseases codings) or nonfatal myocardial infarction (using World Health Organization criteria). In five studies, coronary stenosis (more than 50% narrowing of at least one coronary artery assessed by angiography) was also included as a disease outcome. As our case definition is almost uniform across studies, we do not expect this to be a source of heterogeneity. We refer to all outcomes as CHD, using the term prevalent to refer to a CHD event prior to blood draw for CRP measurement and incident to refer to a CHD event subsequent to blood draw. In some prospective studies, CRP measurements have not been taken at baseline, but rather at a later occasion, which we have redefined as our baseline. Hence, some of the individuals who had incident events in the original study will not have incident events in the baseline-transformed study. In most of the prospective cohort studies, the hazard function appears to be a smooth function of follow-up time (Web Figure A1). In later sections, we will investigate the sensitivity of regression of the outcome in cohort studies to parametric assumptions, and to ignoring variable follow-up. Although there are anomalous results in a few of the studies, it seems that these assumptions may not severely misrepresent the data.

3. Risk factor data: Some of the studies do not measure CRP level for all individuals, and others do not measure it for any individuals. In case-control data, cases have been oversampled with respect to the general population, and so we make inferences on the gene-risk factor association from controls [29] [30]. In prospective studies, as a CHD event may affect CRP levels, to prevent problems of reverse causation, we only make inferences on the gene-risk factor association from individuals without a prevalent CHD event [3].
Different studies measured CRP using different assays, after different storage periods, and with and without prior fasting, leading to potential heterogeneity. Apart from low levels of CRP, where assays are not sensitive enough to distinguish between values, the distribution of log(CRP) can be approximated by a normal distribution (Web Figure A2). The mean of the risk factor distribution in genetic subgroups varies, as shown for one of the studies in the collaboration (CGPS), but the standard deviation is similar in each subgroup with no clear trend (Web Figure A3).

4. Genetic data: The 43 studies in the collaboration with participants of European descent measure different genetic information in the form of SNPs. The number of SNPs measured in each study varies from 1 to 13. Over 20 SNPs in total are measured by at least one study. Only 6 studies measure all of the four pre-specified SNPs [20]: rs1205, rs , rs and rs. Some studies measure SNPs which are in complete linkage disequilibrium (LD) with one of the pre-specified SNPs, and which can be used as proxies for these SNPs [31]. In total 20 studies measure all four SNPs or proxies thereof and an additional 17 measure some three out of these four. Others measure fewer than this. SNPs measured all come from the CRP-regulatory gene on chromosome 1 and so display varying degrees of correlation. These genetic variants can be summarized by five haplotypes, which comprise 99% of the variation in European populations (Web Tables A2 and A3). The frequency of the haplotype patterns is similar in each of the studies in the collaboration (Web Figure A4), which provides evidence for the homogeneity of European descent populations, and supports our claims for use of proxy SNPs and determination of haplotypes. Further details about the genetic variants used as SNPs are found in the Web Appendix.

2.2 Weak instruments

Although IV methods give consistent estimates of causal association, IV estimates in finite samples are biased towards the confounded observational estimate of association [32]. The magnitude of this bias depends on the strength of the statistical association between the IV and the risk factor, and is related to the F statistic in the regression of the risk factor on the IV. The F statistics in each study for different models of genetic association are given in Web Table A4. As there is little evidence for a more complex model (see Web Appendix for discussion), an additive per-allele SNP-based model of genetic association in each study is used throughout. Even using this parsimonious model with one parameter for each IV, the F statistics in some studies are less than 10. IVs with an F statistic less than 10 are often labelled as weak instruments [33]. Such classification is misleading for several reasons. First, it gives a binary classification of IVs as either weak or strong based on an arbitrarily chosen threshold F value, whereas the bias is a continuous phenomenon. Secondly, it leads researchers to think that weak instrument bias is due to an intrinsic property of the instruments, whereas any instrument can be made stronger by increasing the sample size.
Thirdly, the measured F statistic in a given dataset is an unreliable guide to the true strength of an instrument, due to the large sampling variability of the F statistic [34]. Fourthly, the use of rules based on measured F statistics, such as the choice of IVs or the exclusion of studies from a meta-analysis with F < 10, can lead to more bias than it prevents [35]. For these reasons, we seek to combat weak instrument bias through careful choice of analysis method rather than by post hoc selection of data. We discuss how weak instrument bias affects the methods used in this paper in Section
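
To make the role of the F statistic concrete, the sketch below computes it for the regression of the risk factor on a single SNP coded as the number of variant alleles (the additive per-allele model). The function names, allele frequency, per-allele effect and simulated data are illustrative assumptions and are not taken from the CCGC analysis. Running the same hypothetical instrument at two sample sizes shows how the measured F statistic depends on sample size as well as on the genetic association itself.

```python
import random


def first_stage_f(g, x):
    """F statistic for the linear regression of risk factor x on allele count g."""
    n = len(g)
    mean_g = sum(g) / n
    mean_x = sum(x) / n
    s_gg = sum((gi - mean_g) ** 2 for gi in g)
    s_gx = sum((gi - mean_g) * (xi - mean_x) for gi, xi in zip(g, x))
    s_xx = sum((xi - mean_x) ** 2 for xi in x)
    ss_regression = s_gx ** 2 / s_gg      # variation in x explained by g
    ss_residual = s_xx - ss_regression    # variation left unexplained
    # One genetic parameter, so 1 numerator and n - 2 denominator degrees of freedom.
    return ss_regression / (ss_residual / (n - 2))


def simulate_study(n, per_allele_effect=0.1, allele_freq=0.3, seed=0):
    """Hypothetical study: allele counts 0/1/2 and a normally distributed risk factor."""
    rng = random.Random(seed)
    g = [(rng.random() < allele_freq) + (rng.random() < allele_freq) for _ in range(n)]
    x = [per_allele_effect * gi + rng.gauss(0, 1) for gi in g]
    return g, x


if __name__ == "__main__":
    for n in (200, 20000):
        g, x = simulate_study(n)
        print(n, round(first_stage_f(g, x), 2))
```

Changing the seed at a fixed small sample size gives a rough feel for the sampling variability of the measured F statistic mentioned above.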

3 Methods for data analysis

In this section, we present methods for instrumental variable analysis. Firstly, we present the classical two-stage method. This is equivalent to the commonly used ratio of coefficients valid for a single IV, but can be used with multiple instruments [36]. Then, we present a Bayesian method similar to the two-stage method, which extends naturally to a meta-analysis model where studies are combined in a hierarchical model on the causal parameter. Finally, we discuss approaches to IV estimation with a survival outcome.

3.1 Two-stage analysis

The causal association can be estimated using a two-stage approach. With continuous outcomes, this is known as two-stage least squares (2SLS) [37]. In 2SLS, we first fit a linear regression of the risk factor on the IVs (G-X regression), and secondly a linear regression of the outcome on the fitted values for the risk factor from the first-stage regression ($\hat{X}$-Y regression). The 2SLS estimate ($\hat{\beta}_{2SLS}$) is the coefficient for the increase in outcome per unit increase in risk factor. With binary outcomes, an analogous estimate has been proposed, called a two-stage [38], pseudo-2SLS [39], two-stage predictor substitution [40] [41], or Wald-type estimator [42]. This replaces the second linear $\hat{X}$-Y regression with a logistic regression. With a single instrument, the 2SLS and two-stage estimators coincide with the ratio of coefficients: the coefficient from the appropriate G-Y regression (linear or logistic) divided by the coefficient from the G-X regression [43].

There are several difficulties with this approach. Firstly, the fitted values for the risk factor are plugged into the second-stage regression without accounting for uncertainty, meaning that the precision in the causal estimate may be overestimated. Secondly, the distribution of the causal parameter is assumed to be normal, which may result in overly narrow confidence intervals when the instrument is weak [44].
Finally, in the non-linear case, the two-stage estimate is uncorrelated with the residuals in the G-X regression, but not with the residuals in the $\hat{X}$-Y regression, leading to bias compared to the conditional causal effect [39] and the coining of such two-stage analyses as forbidden regressions in the field of econometrics [45] [46]. However, when the two-stage estimate is compared to a marginal or population-averaged causal effect, which is different to the conditional causal effect due to non-collapsibility of the logistic function [3] [47], the two-stage estimate is shown to be close to unbiased with strong IVs [48] [49]. This contrasts to the control function approach [50], also known as the adjusted IV [51] or two-stage residual inclusion method [40], which has been shown to be biased for the conditional causal effect when there is confounding [41]. In this method, the first-stage residuals from the G-X regression are included in the second-stage $\hat{X}$-Y regression. The parameter estimated by the adjusted two-stage method does not have a clear interpretation in general, as the residuals represent a univariate combination of the unmeasured confounders, independent variation and measurement error in X [49]. Although we prefer the two-stage method on theoretical grounds, we include results from this method, which we label as the adjusted two-stage method, when pre-CHD event measures of the risk factor are available on the entire cohort, as using risk factor values measured after a CHD event may lead to reverse causation.

We use the term two-stage to refer to a two-stage IV analysis and two-step to a two-step meta-analysis based on combining summary estimates from individual studies. All two-step meta-analyses in this paper use inverse-variance weighting and the DerSimonian-Laird method of moments to estimate heterogeneity in a random-effects model [52].

3.2 Bayesian methods

To combat some of the difficulties with the two-stage methods, we use a Bayesian model with vague priors. We divide our population using genetic information into subgroups, where a subgroup contains all individuals in a study with a certain genotype. For each subgroup $j$, we estimate the mean level of risk factor assuming that, for each individual $i$, the measured values of risk factor $x_{ij}$ come from a normal distribution with mean $\xi_j$ and variance $\sigma^2$, common across subgroups. Using a logistic model of outcome on risk factor, we model the probability of an event $\pi_j$ in subgroup $j$ by assuming a binomial distribution of number of events $n_j$ from total number at risk $N_j$ and a linear relationship between the log-odds of event $\eta_j = \text{logit}(\pi_j)$ and the mean level of risk factor ($\xi_j$). The coefficient $\beta_1$, the increase in log-odds of an event for unit increase in the risk factor, is taken as our causal parameter of interest. As in the two-stage methods, we only use the risk factor values $x_{ij}$ for individuals from the control population in a case-control study, and for individuals without previous history of disease in a cohort study. Individuals with missing risk factor values are still included as cases or controls in the logistic regression.
$$
\begin{aligned}
x_{ij} &\sim N(\xi_j, \sigma^2) \\
n_j &\sim \text{Binomial}(N_j, \pi_j) \\
\text{logit}(\pi_j) &= \eta_j = \beta_0 + \beta_1 \xi_j
\end{aligned}
\tag{1}
$$

We model the risk factor as additive across SNPs with a per-allele model for each SNP; justification for this is provided in the Web Appendix. For each subgroup $j$ comprising all people with $g_{jk}$ (= 0, 1, 2) variant allele copies for SNP $k$, $k = 1, \ldots, K$, we estimate the change in risk factor per allele $\alpha_k$ to give average levels of risk factor $\xi_j$ for each subgroup.

$$
\xi_j = \alpha_0 + \sum_{k=1}^{K} \alpha_k g_{jk}
\tag{2}
$$

A haplotype-based model for the risk factor was also considered; details are given in the Web Appendix.

In a meta-analysis context, we jointly estimate the causal parameter across studies in a hierarchical model. In a fixed-effect model, the causal parameter $\beta_1$ is the same for each study $m = 1, \ldots, M$.

$$
\begin{aligned}
x_{ijm} &\sim N(\xi_{jm}, \sigma_m^2) \\
n_{jm} &\sim \text{Binomial}(N_{jm}, \pi_{jm}) \\
\text{logit}(\pi_{jm}) &= \eta_{jm} = \beta_{0m} + \beta_1 \xi_{jm}
\end{aligned}
\tag{3}
$$

In a random-effects model, the causal parameter is allowed to vary between studies, with a normal distribution imposed on the study-level causal parameters. Here, the causal parameter of interest $\mu_\beta$ is the mean causal effect across studies. We replace the final line from (3) with:

$$
\begin{aligned}
\text{logit}(\pi_{jm}) &= \eta_{jm} = \beta_{0m} + \beta_{1m} \xi_{jm} \\
\beta_{1m} &\sim N(\mu_\beta, \tau^2)
\end{aligned}
\tag{4}
$$

where $\tau^2$, the variance of the random-effects distribution, is a measure of the between-study heterogeneity in the $\beta_{1m}$. Hence, unlike the two-stage IV method, the Bayesian analysis is performed in one stage, and the meta-analysis is performed in one step.

3.3 Survival regression models

Using the two-stage paradigm with survival outcomes, we perform second-stage Cox and Weibull proportional hazards regressions, censoring non-CHD deaths. It is not clear what the parameter estimated by such regressions represents [53], and the results presented here are for comparative purposes only. We also convert the survival outcome into a binary outcome, ignoring variable follow-up, and use a logistic regression model.

In the Bayesian framework, a Weibull distribution can be assumed for survival times, with shape parameter $r$ and a log-linear model for the rate parameter $\mu_j$ for each individual $i$ in genotypic group $j$ with time-to-event $t_{ij}$:

$$
\begin{aligned}
x_{ij} &\sim N(\xi_j, \sigma^2) \\
t_{ij} &\sim \text{Weibull}(r, \mu_j) \\
\log(\mu_j) &= \eta_j = \beta_0 + \beta_1 \xi_j
\end{aligned}
\tag{5}
$$

If there is no event but an individual is right-censored, then we introduce a censoring indicator and use the likelihood contribution from the probability of not having an event until the time of censoring.
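
The likelihood contribution under right-censoring in model (5) can be written out explicitly. The following is a sketch under an assumption not stated in the text, namely that the Weibull distribution uses the WinBUGS parameterization with shape $r$ and rate $\mu_j$; the indicator $\delta_{ij}$ (1 for an observed event at $t_{ij}$, 0 for right-censoring at $t_{ij}$) is notation introduced here for illustration.

```latex
% Assumed parameterization: density f and survivor function S for shape r and rate mu_j
\[
f(t_{ij}) = \mu_j \, r \, t_{ij}^{\,r-1} \exp\!\left(-\mu_j t_{ij}^{\,r}\right),
\qquad
S(t_{ij}) = \exp\!\left(-\mu_j t_{ij}^{\,r}\right)
\]
% Survival-time part of the likelihood: events contribute f, censored individuals contribute S
\[
L = \prod_{j} \prod_{i} f(t_{ij})^{\delta_{ij}} \, S(t_{ij})^{1-\delta_{ij}}
  = \prod_{j} \prod_{i} \left( \mu_j \, r \, t_{ij}^{\,r-1} \right)^{\delta_{ij}}
    \exp\!\left(-\mu_j t_{ij}^{\,r}\right)
\]
```

Under this parameterization the hazard is $\mu_j r t^{r-1}$, so with $\log(\mu_j) = \beta_0 + \beta_1 \xi_j$ the quantity $\exp(\beta_1)$ is a hazard ratio per unit increase in the subgroup mean risk factor. The normal likelihood for the $x_{ij}$ enters the full model as a separate factor.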
A gamma distribution is used for the prior distribution of $r$ with shape parameter 0.1 and rate parameter

3.4 Details of Bayesian analyses

In each of the Bayesian analyses below, vague independent N(0, ) priors were placed throughout on all regression parameters, independent U(0, 20) priors on the standard deviation parameters in the normal distributions of the risk factor, independent U(0, 1) priors on the standard deviation parameters of random effects distributions, and inverse-Wishart priors on the variance-covariance matrix of multivariate normal distributions, where the scale matrix in the Wishart distribution is diagonal with 10 as each diagonal element and 0 as each off-diagonal element. We use Markov chain Monte Carlo (MCMC) methods in WinBUGS [54] with at least iterations, of which the first 1000 are discarded as burn-in. We assess convergence of the posterior distribution by running three parallel chains with different starting values and examining the Gelman-Rubin plots [55], and we perform sensitivity analyses to show lack of dependence on the prior distributions. For ease of expression, we regard the mean of the posterior distribution as the estimate of the parameter of interest, the standard deviation of the posterior distribution as the standard error (SE), and the 2.5th to the 97.5th percentile range as the 95% confidence interval.

4 Analysis of individual studies

For each of the study designs in the CCGC, we use a logistic model of disease association. This is for two reasons: first, to simplify calculations in the computationally intensive Bayesian framework, and secondly, to aim to estimate the same target parameter in each of the studies. We describe below how a logistic model is approximately valid for each study design. The differences between IV estimates based on different approaches (two-stage and Bayesian) and different models of association are examined in Section 6 as a sensitivity analysis for the simplifying assumptions to be made.

In cohort studies, where possible, two analyses are performed. A retrospective analysis is performed by viewing the cohort at baseline as a cross-sectional study with cases taken as individuals with previous history of disease (prevalent cases) and controls as all individuals free from disease at baseline.
A prospective analysis excludes all prevalent cases and considers CHD events within the reporting period. An individual who is censored at the end of the follow-up period is taken as a control in both the retrospective and prospective analyses, as they have two separate opportunities to become a case.

We look in turn at unmatched case-control studies and cohort studies viewed cross-sectionally, then matched case-control studies, and finally cohort studies viewed prospectively. In each case, we use both two-stage and Bayesian methods to estimate a causal effect.

4.1 Unmatched case-control studies and cross-sectional analysis of cohort studies

For the case-control studies and cohort studies viewed cross-sectionally, we use a logistic model in the second-stage regression. In both cases, this is the correct analysis, although with a cohort study, a log-linear model could also be used to estimate a relative risk, which is close to the odds ratio estimated by the logistic model under the rare-disease assumption. Table 2 shows that the two-stage and Bayesian methods give similar answers in most large studies. Some studies give less consistent results, especially ISIS and HIFMECH, where no Bayesian results are given as the posterior distribution of the causal effect did not converge. In both of these studies, only one SNP is available and the F statistic in the additive G-X model is less than 1, indicating that the IV explains less of the variation in the risk factor than would be expected by chance.

4.2 Analysis of matched case-control studies

For the matched case-control studies, in the two-stage approach, we use conditional and unconditional logistic models in the second-stage regression. In a matched case-control study, the effect size should be estimated using conditional logistic regression [56], although the bias from ignoring the matching is generally small [56] [57]. In the Bayesian approach, we use an unconditional logistic model, due to issues of computational complexity and difficulty of Bayesian inference on a conditional likelihood. Table 3 shows that for most studies the two approaches give broadly similar estimates. The Bayesian and two-stage random-effects pooled results are quite different due to different assumptions about heterogeneity. The lack of information on between-study heterogeneity, due to the paucity of studies and the diffuse prior on the heterogeneity parameter in the Bayesian approach, gives a large estimate of $\tau$. This conflict can be redressed by use of a more informative prior; two-stage and Bayesian fixed-effect meta-analyses (effectively a point-mass prior for $\tau$ concentrated at 0) give much closer results.

4.3 Analysis of cohort studies

For the cohort studies viewed prospectively, we use Cox and Weibull proportional hazards, and logistic models in the second-stage regression. The adjusted two-stage method with a logistic model is also used (Web Table A5). In the Bayesian approach, we use a logistic model (1) and a Weibull model (5). For most studies, Table 4 shows that the approaches give similar estimates.
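
The two-step pooling applied to the two-stage estimates (inverse-variance weighting with the DerSimonian-Laird moment estimator of heterogeneity, as stated in Section 3.1) can be sketched as follows. The function name, the returned dictionary and the example numbers are hypothetical and are not results from the CCGC; the inputs are study-level estimates (for example log odds ratios) with their standard errors.

```python
import math


def dersimonian_laird(estimates, std_errors):
    """Fixed-effect and DerSimonian-Laird random-effects pooling of study estimates."""
    k = len(estimates)
    w = [1.0 / se ** 2 for se in std_errors]              # inverse-variance weights
    sum_w = sum(w)
    fixed = sum(wi * b for wi, b in zip(w, estimates)) / sum_w
    # Cochran's Q: weighted squared deviations from the fixed-effect estimate.
    q = sum(wi * (b - fixed) ** 2 for wi, b in zip(w, estimates))
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w
    # Method-of-moments estimate of the between-study variance, truncated at zero.
    tau2 = max(0.0, (q - (k - 1)) / c) if k > 1 and c > 0 else 0.0
    w_star = [1.0 / (se ** 2 + tau2) for se in std_errors]
    pooled = sum(wi * b for wi, b in zip(w_star, estimates)) / sum(w_star)
    return {
        "fixed": fixed,
        "fixed_se": math.sqrt(1.0 / sum_w),
        "tau2": tau2,
        "random": pooled,
        "random_se": math.sqrt(1.0 / sum(w_star)),
    }


if __name__ == "__main__":
    # Hypothetical study-level estimates and standard errors.
    print(dersimonian_laird([0.1, 0.3, 0.5], [0.1, 0.1, 0.1]))
```

When the moment estimate of the between-study variance is zero, the random-effects result reduces to the fixed-effect one, which is one reason fixed-effect results from different approaches can agree closely even when random-effects results do not.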
There is a slight loss in precision in using a logistic model over a Cox or Weibull model, due to the loss of time-to-event information. We note that the Bayesian and two-stage analyses give similar inference throughout, especially in studies with over 100 events. The random-effects meta-analysis results are different between the Bayesian and two-stage analyses, but the fixed-effect results are almost identical. The correlation between the two-stage IV estimates in the ten cohort studies viewed prospectively and cross-sectionally (using a logistic model in both analyses; similar results are obtained using a Cox or Weibull model) is (Web Figure A6).

4.4 Differences between two-stage and Bayesian IV estimates

Although there is broad agreement between the Bayesian and two-stage IV results in this section, there are several differences in results. We discuss some possible reasons for the differences.

1. Weak instrument bias: A simulation study has shown that a similar Bayesian method with a continuous outcome gives estimates which are free from weak instrument bias for IVs which would conventionally be thought of as weak (expected F statistic 5) [48]. This is consistent with the behaviour of the limited information maximum likelihood (LIML) method [58], which is based on the same likelihood function. LIML estimates are known to suffer less from weak instrument bias than two-stage methods [45]. With a binary outcome, simulations are less clear, although it seems that the Bayesian method gives similar results to the two-stage method, which estimates a population-averaged causal effect [49], but suffers less from weak instrument bias [48].

2. Measurement error: When the estimates differ more substantially (say by more than 15% of the standard error), the Bayesian estimates are generally greater in magnitude than their two-stage counterparts, with standard errors similarly greater. The increase in size of effect may be due to random error in the mean risk factor estimates in genotypic groups leading to dilution of the regression coefficients in the second-stage regression and attenuation in the two-stage estimates [59]. As the Bayesian analyses allow for error in X, the Bayesian estimates should be unaffected by regression dilution bias.

3. Propagation of uncertainty: The Bayesian model estimates the causal association in one stage, allowing for propagation of error and feedback throughout the model. In the two-stage model, there is no possibility of propagation of error or feedback from the second-stage to the first-stage regression.

4. Asymptotic assumptions: The Bayesian analysis gives a posterior distribution rather than a single point estimate. When the posterior distribution cannot be well approximated by a normal distribution, the mean and median of the posterior can be quite different, and neither may be an adequate summary of the posterior. The two-stage estimate may be closer to one of the posterior mean or median than the other. The sampling distributions of IV estimates are typically non-normal with long tails, especially when the IV is weak [34]. This means that the asymptotic standard errors of the two-stage method may underestimate the true uncertainty in the causal parameter. Simulations in the continuous case for the two-stage method have shown poor coverage properties [44]. As the Bayesian method does not make asymptotic assumptions, the shape of the posterior distribution will reflect the true uncertainty in the parameter estimate. This may result in wider confidence intervals than those of the two-stage method, but the coverage levels of the confidence intervals in simulations have been shown to be better [48]. With regard to the Bayesian causal estimates which did not converge, this is usually due to lack of differentiation in mean risk factor levels between genetic subgroups, leading to gradients between risk factor and outcome which may be compatible with an infinite (vertical) association [48]. This is expressed in the two-stage method by a large standard error on the causal parameter, but represented more accurately by the confidence interval in the ratio method from

Fieller's Theorem, which may cover the entire real line, or by the Bayesian method, where the posterior distribution fails to converge. Hence, failure to converge in the Bayesian method is not (necessarily) a negative feature, but can be an indication that no proper posterior distribution reflects the uncertainty due to the weakness in the G-X association.

5. Treatment of heterogeneity: The Bayesian model estimates a causal association in one stage. Similarly, the Bayesian meta-analysis model estimates a pooled association in one step. In the Bayesian meta-analysis, the prior for the heterogeneity parameter ensures that the heterogeneity is always positive. In a two-step meta-analysis, the DerSimonian-Laird estimate of heterogeneity can be (and often is) zero. If there are not many studies or studies have imprecise estimates, the DerSimonian-Laird estimate may be zero due to lack of evidence of heterogeneity, whereas the Bayesian one-step model recognizes only a lack of information on the between-study variance, and the posterior for τ is similar to the prior. The point estimate changes as heterogeneity increases, because larger studies are down-weighted in comparison to small studies [60].

For these reasons, while we would expect the results from a Bayesian and a two-stage IV analysis to be close for large studies, they may well give different estimates if the sample size is small, if there are few events, or if the IV is weak. Random-effects meta-analysis estimates may be different if the number of studies is small and the prior on the between-study heterogeneity is diffuse.

4.5 Summary

In this section, we have seen that despite the logistic model relying on certain assumptions, the causal estimates are not particularly sensitive to these assumptions, and the loss of information in discarding survival outcomes is not great. We conclude that using a logistic model in all studies is a reasonable simplifying assumption. The Bayesian and two-stage approaches make different assumptions in terms of feedback and propagation of errors between the regression stages, normality of the causal estimate, and heterogeneity in the random-effects models. We have seen that, where the number of cases is fairly large (n > 100) and the instrument strength is moderate (F statistic in the G-X regression > 5), the Bayesian and two-stage analyses give similar inferences. In meta-analysis models, the fixed-effect two-stage and Bayesian meta-analyses agree throughout, and the random-effects meta-analyses agree when the number of studies is large (e.g. Table 2 with M = 27).
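The behaviour of the DerSimonian-Laird estimate described in point 5 can be illustrated with a short calculation. The following is a minimal Python sketch, not the code used for the analyses reported here: it computes the inverse-variance fixed-effect pooled estimate and the DerSimonian-Laird moment estimate of τ², truncated at zero. The function and variable names are illustrative. The standard errors are those of the two-stage conditional logistic model in Table 3; the point estimates are hypothetical values chosen only to be close to one another.

```python
import math

# Standard errors from Table 3 (two-stage, conditional logistic model).
SES = [0.284, 0.405, 0.327, 0.327]
# Hypothetical point estimates (assumption, for illustration only).
EST = [0.10, 0.20, 0.15, 0.12]


def fixed_effect(estimates, ses):
    """Inverse-variance weighted fixed-effect pooled estimate and its SE."""
    weights = [1.0 / se ** 2 for se in ses]
    total = sum(weights)
    pooled = sum(w * b for w, b in zip(weights, estimates)) / total
    return pooled, math.sqrt(1.0 / total)


def dersimonian_laird_tau2(estimates, ses):
    """Moment estimate of the between-study variance, truncated at zero."""
    weights = [1.0 / se ** 2 for se in ses]
    total = sum(weights)
    pooled, _ = fixed_effect(estimates, ses)
    # Cochran's Q statistic and its degrees of freedom.
    q = sum(w * (b - pooled) ** 2 for w, b in zip(weights, estimates))
    df = len(estimates) - 1
    scale = total - sum(w ** 2 for w in weights) / total
    return max(0.0, (q - df) / scale)


def random_effects(estimates, ses):
    """Random-effects pooled estimate, its SE, and the tau^2 used."""
    tau2 = dersimonian_laird_tau2(estimates, ses)
    weights = [1.0 / (se ** 2 + tau2) for se in ses]
    total = sum(weights)
    pooled = sum(w * b for w, b in zip(weights, estimates)) / total
    return pooled, math.sqrt(1.0 / total), tau2
```

With these inputs Cochran's Q is below its degrees of freedom, so the truncated estimate of τ² is zero and the random-effects result coincides with the fixed-effect one. The pooled standard error is approximately 0.164, in line with the two-stage values in Table 3, whereas a hierarchical model with a diffuse prior on τ would not collapse to the fixed-effect answer in this way.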

Study          N   n   Two-stage analysis   Bayesian analyses

Case-control studies
ARIC                   (0.279)              (0.314)
CAPS                   (0.505)              (0.600)
CIHDS                  (0.225)              (0.235)
CUDAS                  (1.392)              (2.176)
CUPID                  (0.326)              (0.491)
DDDD                   (0.446)              (0.628)
EPICNL                 (0.340)              (0.347)
HIFMECH                (2.508)              - 1
HIMS                   (0.318)              (0.333)
ISIS                   (1.480)              - 1
LURIC                  (0.212)              (0.235)
MALMO                  (0.158)              (0.194)
PROCARDIS              (0.180)              (0.185)
SHEEP                  (0.216)              (0.250)
SPEED                  (0.488)              (0.608)
WHIOS                  (0.202)              (0.216)
WHITE                  (0.901)              (1.238)

Cohort studies
BRHS                   (0.491)              (0.500)
BWHHS                  (0.475)              (0.531)
CCHS                   (0.772)              (0.792)
CGPS                   (0.325)              (0.326)
CHS                    (0.358)              (0.375)
EAS                    (0.974)              (1.209)
ELSA                   (0.461)              (0.496)
FRAMOFF                (0.747)              (0.852)
PROSPER                (0.258)              (0.261)
ROTT                   (0.388)              (0.417)

Pooled                 (0.061)              (0.065)
Heterogeneity          I² = 0% (0-33%)      τ̂ =

Table 2: Unmatched case-control studies and cohort studies viewed cross-sectionally. Log odds ratio of CHD per unit increase in log(CRP) using two-stage and Bayesian IV methods with standard error, number of participants in study (N), number of events (n), pooled results from two-step inverse-variance weighted random-effects meta-analysis (two-stage) or hierarchical random-effects meta-analysis model (Bayesian), heterogeneity estimate (I² with 95% confidence interval for two-step method, τ̂ for hierarchical model).
1 Posterior distribution of causal effect did not converge.

                       Two-stage analyses                                Bayesian analyses
Study          N   n   Conditional logistic   Unconditional logistic    Logistic model
EPICNOR                (0.284)                (0.280)                   (0.319)
HPFS                   (0.405)                (0.362)                   (0.543)
NHS                    (0.327)                (0.308)                   (0.374)
NSC                    (0.327)                (0.316)                   (0.338)
Pooled (FE)            (0.164)                (0.156)                   (0.166)
Pooled (RE)            (0.164)                (0.156)                   (0.266)
Heterogeneity          I² = 0% (0-83%)        I² = 0% (0-82%)           τ̂ =

Table 3: Matched case-control studies. Conditional and unconditional logistic models for causal log odds ratio of CHD per unit increase in log(CRP) using two-stage and Bayesian IV methods with standard error, number of participants in study (N), number of events (n), pooled results from two-step inverse-variance weighted fixed-effects/random-effects (FE/RE) meta-analysis (two-stage) or hierarchical FE/RE meta-analysis model (Bayesian), heterogeneity estimate (I² with 95% confidence interval for two-step method, τ̂ for hierarchical model) from random-effects meta-analysis.

                       Two-stage analyses                      Bayesian analyses
                       log HR      log HR      log OR          log HR          log OR
Study          N   n   Cox         Weibull     Logistic        Weibull 1       Logistic
BRHS                   (0.305)     (0.306)     (0.323)         0.51 (0.33)     (0.351)
BWHHS                  (1.034)     (1.036)     (1.042)              (1.08)     (1.085)
CCHS                   (0.457)     (0.457)     (0.472)         0.04 (0.48)     (0.482)
CGPS                   (0.699)     (0.700)     (0.702)              (0.71)     (0.709)
CHS                    (0.258)     (0.259)     (0.288)         0.68 (0.28)     (0.307)
EAS                    (0.689)     (0.692)     (0.722)         0.67 (0.84)     (0.891)
ELSA                   (0.828)     (0.829)     (0.833)              (0.85)     (0.857)
FRAMOFF                (0.965)     (0.965)     (0.974)         0.51 (1.21)     (1.204)
NPHSII                 (0.815)     (0.830)     (0.837)              (0.97)     (1.008)
PROSPER                (0.311)     (0.312)     (0.328)         0.25 (0.32)     (0.337)
ROTT                   (0.564)     (0.565)     (0.582)              (0.61)     (0.635)
WOSCOPS                (2.539)     (2.540)     (2.806)         - 2
Pooled (FE)            (0.137)     (0.137)     (0.145)         0.26 (0.13)     (0.145)
Pooled (RE)            (0.159)     (0.156)     (0.175)         0.20 (0.21)     (0.234)

Heterogeneity, two-stage analyses: I² = 14% (0-54) for the Cox model, I² = 12% (0-51) for the Weibull model, I² = 19% (0-57) for the logistic model. Heterogeneity, Bayesian analyses: τ̂ = 0.35 for the Weibull model, τ̂ = for the logistic model.

Table 4: Cohort studies. Cox, Weibull and logistic models for causal log risk ratio of CHD per unit increase in log(CRP) using two-stage and Bayesian IV methods with standard error, number of participants in study (N), number of events (n), pooled results from two-step inverse-variance weighted fixed-effects/random-effects (FE/RE) meta-analysis (two-stage) or hierarchical FE/RE meta-analysis model (Bayesian), heterogeneity estimate (I² with 95% confidence interval for two-step method, τ̂ for hierarchical model): log hazard ratio (HR) and log odds ratio (OR).
1 The Weibull models were slower to run and mixed poorly, so results are only given to 2 decimal places due to Monte Carlo random error.
2 Posterior distributions of causal effect did not converge.

5 Dealing with issues of evidence synthesis

In this section, we detail how evidence from heterogeneous sources can be combined efficiently in the Bayesian model described above.

5.1 Cohort studies

We want to include participants in cohort studies up to twice in the analysis, once in the study viewed retrospectively and once prospectively. However, we do not want to include the individual's risk factor data twice, and we want to ensure that the same parameter is estimated in both analyses. In the corresponding model (6), we consider genetic subgroup $j$. This subgroup contains $N_{1j}$ individuals, $n_{1j}$ of whom are prevalent cases, and $N_{2j}$ ($= N_{1j} - n_{1j}$) non-prevalent individuals, $n_{2j}$ of whom have incident events.

$$
\begin{aligned}
X_{ij} &\sim N(\xi_j, \sigma^2) \quad \text{for } i = 1, \ldots, N_{2j} \text{ non-prevalent individuals} \\
\xi_j &= \alpha_0 + \sum_{k=1}^{K} \alpha_k g_{jk} \\
n_{1j} &\sim \text{Binomial}(N_{1j}, \pi_{1j}) \\
n_{2j} &\sim \text{Binomial}(N_{2j}, \pi_{2j}) \\
\text{logit}(\pi_{1j}) &= \eta_{1j} = \beta_{01} + \beta_1 \xi_j \\
\text{logit}(\pi_{2j}) &= \eta_{2j} = \beta_{02} + \beta_1 \xi_j
\end{aligned}
\tag{6}
$$

This model ensures that the same fitted values of the risk factor are used in both logistic regressions without including individuals twice in the regression of risk factor on genotype. Moreover, a single causal parameter $\beta_1$ is estimated.

For comparison, in the two-stage method, we calculate the causal effect separately using prospectively and retrospectively assessed events, combine the two estimates using an inverse-variance weighted fixed-effect meta-analysis, and take the result of this as the study-specific effect. This assumes, incorrectly, that the two estimates are independent; such an assumption is not made in the Bayesian method. Although in this case the risk factor data are used twice, the main source of uncertainty in the causal estimates comes from the second-stage regression, and so inclusion of the risk factor data twice may not add undue precision to the overall pooled result.
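To see how much the independence assumption in the two-stage combination could matter, the variance of an inverse-variance weighted average can be written out with an explicit correlation term. The sketch below is a hedged Python illustration only: the function name, the input estimates and the correlation value `rho` are all hypothetical, and no value of this correlation is reported for the CCGC studies.

```python
import math


def ivw_pooled(b1, se1, b2, se2, rho=0.0):
    """Inverse-variance weighted average of two estimates.

    The weights are always the usual 1/se^2 weights; `rho` is an assumed
    correlation between the two estimates, used only when computing the
    standard error of the weighted average. With rho = 0 the result is the
    standard fixed-effect standard error sqrt(1 / (w1 + w2)).
    """
    w1, w2 = 1.0 / se1 ** 2, 1.0 / se2 ** 2
    pooled = (w1 * b1 + w2 * b2) / (w1 + w2)
    # Var(w1*b1 + w2*b2) = w1^2 se1^2 + w2^2 se2^2 + 2 rho w1 w2 se1 se2
    var = (
        w1 ** 2 * se1 ** 2
        + w2 ** 2 * se2 ** 2
        + 2.0 * rho * w1 * w2 * se1 * se2
    ) / (w1 + w2) ** 2
    return pooled, math.sqrt(var)


# Hypothetical retrospective and prospective estimates from one study.
naive = ivw_pooled(0.10, 0.30, 0.20, 0.40)           # treated as independent
correlated = ivw_pooled(0.10, 0.30, 0.20, 0.40, rho=0.5)
```

Under these made-up inputs the naive standard error is 0.24, and any positive correlation gives a larger value, so treating the two estimates as independent would overstate precision to a degree that depends on the unknown correlation.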
5.2 Common SNPs

Where the same subset of SNPs has been used in several studies, we can combine the estimates of genetic association $\alpha_{km}$ across studies. This should give a more precise model of association in smaller studies and should reduce weak instrument bias, as instrument strength will be combined across the studies. Due to possible heterogeneity between populations, we use a random-effects model, where we impose a multivariate normal distribution on the study-level parameters $\alpha_{km}$, $k = 1, \ldots, K$

with mean vector $\mu_\alpha$ and variance-covariance matrix $\Psi$. Note that the intercept parameters $\alpha_{0m}$ for $m = 1, \ldots, M$ are not pooled.

$$
\begin{aligned}
X_{ijm} &\sim N(\xi_{jm}, \sigma_m^2) \\
\xi_{jm} &= \alpha_{0m} + \sum_{k=1}^{K} \alpha_{km} g_{jkm}
\end{aligned}
\tag{7}
$$

$$
(\alpha_{1m}, \ldots, \alpha_{Km})^T \sim N_K(\mu_\alpha, \Psi)
\tag{8}
$$

5.3 Different sets of SNPs

In the CCGC, there is no common set of SNPs measured in all studies. Due to correlation between the SNPs, it would not be valid to use the same parameters of genetic association $\alpha_k$ in studies measuring different SNPs. In the overall meta-analysis, we use four different sets of parameters of genetic association, corresponding to four different patterns of measured SNPs in studies (see Web Appendix for details).

5.4 Lack of risk factor data

Where a study $m$ has not measured the risk factor (X) but has genetic data in common with other studies, we use the random-effects distributions for the genetic association parameters defined in equation (8) as a predictive distribution or implicit prior for the unknown parameters. This requires an assumption of exchangeability: that the change in risk factor per additional allele is similar to that in the other studies (i.e. can be drawn from the same random-effects distribution). We set $\alpha_{0m} = 0$ because, with no data on the G-X association, this parameter cannot be identified. Incorporation of studies with information on only some of the gene-risk factor-outcome triangle needed for Mendelian randomization analysis is known in econometrics circles as the two-sample problem [61].

5.5 Tabular data

For studies providing tabular data only, we had for each genetic subgroup $j$ the total number of individuals ($N_j$) and the number with an event ($n_j$). We are able to incorporate such studies into our analysis using the random-effects distributions for the parameters of genetic association as above.

6 Meta-analysis of studies

We apply the methods of the previous section to the CCGC data. Firstly, we look at estimation of the causal effect using a single instrument. We then present overall meta-analysis results from pooled two-stage estimates and from Bayesian hierarchical models.

6.1 Using instruments one at a time

The G-X and G-Y associations are estimated in each study. A linear regression model is used for the G-X association; the G-Y association is estimated using logistic regression in unmatched case-control studies and cross-sectional analysis of cohort studies, conditional logistic regression in matched case-control studies and Cox regression in prospective analysis of cohort studies. The pooled results using each of the four pre-specified SNPs in turn are given in Table 5 (study-specific results in Web Figures A7 and A8). Using the method of Thompson et al. [62], we calculate causal estimates using each SNP in turn. Confidence intervals are constructed assuming the within-study correlation between the G-X and G-Y associations is zero, as recommended in the Thompson paper.

G-X and G-Y associations
SNP 1   Number of studies   N   n   Pooled per allele effect (SE)   p-value   Heterogeneity (I² and 95% CI)
g                                   (0.0097)                                  % (37-72%)
g                                   (0.0070)                                  % (0-54%)
g                                   (0.0194)                                  % (0-51%)
g                                   (0.0125)                                  % (0-41%)
g                                   (0.0129)                                  % (0-54%)
g                                   (0.0105)                                  % (0-37%)
g                                   (0.0241)                                  % (0-41%)
g                                   (0.0227)                                  % (0-32%)

X-Y association
SNP     Number of studies   N   n   Causal estimate (95% CI)
g                                   (-0.056, 0.237)
g                                   (-0.146, 0.168)
g                                   (-0.163, 0.195)
g                                   (-0.223, 0.203)

Table 5: Pooled estimates from two-step inverse-variance weighted random-effects meta-analysis of per allele effect on log(CRP) (G-X association) and log odds of CHD (G-Y association) in regression on each SNP in turn with standard error (SE), heterogeneity estimate; causal estimates (X-Y association) of log odds ratio of CHD per unit increase in log(CRP) from meta-analysis using method of Thompson et al. [62]; number of studies, total sample size (N), and total number of events (n) where appropriate.
1 g1 = rs1205, g2 = rs , g3 = rs , g4 = rs or equivalent proxies (see Web Appendix)

The causal estimates from each SNP are similar; heterogeneity of estimates would be evidence against the validity of one or more of the instruments [37]. As the causal estimates are derived from the same data and are correlated, they cannot be naively combined. As none of these analyses uses the totality of the genetic data, an integrated two-stage or Bayesian approach would be preferred.
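A single-SNP causal estimate of this kind is a ratio of the G-Y association to the G-X association, and Section 4.4 noted that the Fieller's Theorem confidence set for such a ratio may be unbounded when the instrument is weak. The following Python sketch illustrates that behaviour under the zero-correlation assumption mentioned above. It is a generic textbook construction offered as an assumption-laden illustration, not a reproduction of the procedure of Thompson et al. [62]; the function name, the return labels and all numerical inputs are hypothetical.

```python
import math
from statistics import NormalDist


def fieller_interval(b_x, se_x, b_y, se_y, level=0.95):
    """Fieller confidence set for the ratio b_y / b_x.

    Assumes the two estimates are uncorrelated and normally distributed.
    The set is {theta : (b_y - theta*b_x)^2 <= z^2 (se_y^2 + theta^2 se_x^2)},
    i.e. the solutions of a*theta^2 + b*theta + c <= 0.
    Returns (kind, lower, upper) where kind is one of:
      "interval"  - the closed interval [lower, upper]
      "two_rays"  - everything outside the open interval (lower, upper)
      "real_line" - the whole real line
    """
    z = NormalDist().inv_cdf(0.5 + level / 2.0)
    a = b_x ** 2 - (z * se_x) ** 2
    b = -2.0 * b_x * b_y
    c = b_y ** 2 - (z * se_y) ** 2
    if a == 0.0:
        raise ValueError("degenerate case: |b_x / se_x| equals z exactly")
    disc = b ** 2 - 4.0 * a * c
    if disc <= 0.0:
        # Only possible when a < 0: the inequality holds for every theta.
        return ("real_line", -math.inf, math.inf)
    root1 = (-b - math.sqrt(disc)) / (2.0 * a)
    root2 = (-b + math.sqrt(disc)) / (2.0 * a)
    lower, upper = min(root1, root2), max(root1, root2)
    if a > 0.0:
        return ("interval", lower, upper)
    return ("two_rays", lower, upper)


# Hypothetical per-allele estimates: a strong and a weak instrument.
strong = fieller_interval(b_x=0.2, se_x=0.01, b_y=0.02, se_y=0.015)
weak = fieller_interval(b_x=0.005, se_x=0.01, b_y=0.001, se_y=0.015)
```

In this sketch the set is a bounded interval only when the squared ratio of the G-X estimate to its standard error exceeds the squared normal quantile (about 3.84 at the 95% level). For the weak-instrument inputs the set is the whole real line, which is the frequentist counterpart of a posterior distribution that fails to converge.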


More information

Index. Springer International Publishing Switzerland 2017 T.J. Cleophas, A.H. Zwinderman, Modern Meta-Analysis, DOI /

Index. Springer International Publishing Switzerland 2017 T.J. Cleophas, A.H. Zwinderman, Modern Meta-Analysis, DOI / Index A Adjusted Heterogeneity without Overdispersion, 63 Agenda-driven bias, 40 Agenda-Driven Meta-Analyses, 306 307 Alternative Methods for diagnostic meta-analyses, 133 Antihypertensive effect of potassium,

More information

Use of allele scores as instrumental variables for Mendelian randomization

Use of allele scores as instrumental variables for Mendelian randomization Use of allele scores as instrumental variables for Mendelian randomization Stephen Burgess Simon G. Thompson March 13, 2013 Summary Background: An allele score is a single variable summarizing multiple

More information

Supplementary Online Content

Supplementary Online Content Supplementary Online Content Hartwig FP, Borges MC, Lessa Horta B, Bowden J, Davey Smith G. Inflammatory biomarkers and risk of schizophrenia: a 2-sample mendelian randomization study. JAMA Psychiatry.

More information

Ordinal Data Modeling

Ordinal Data Modeling Valen E. Johnson James H. Albert Ordinal Data Modeling With 73 illustrations I ". Springer Contents Preface v 1 Review of Classical and Bayesian Inference 1 1.1 Learning about a binomial proportion 1 1.1.1

More information

BAYESIAN HYPOTHESIS TESTING WITH SPSS AMOS

BAYESIAN HYPOTHESIS TESTING WITH SPSS AMOS Sara Garofalo Department of Psychiatry, University of Cambridge BAYESIAN HYPOTHESIS TESTING WITH SPSS AMOS Overview Bayesian VS classical (NHST or Frequentist) statistical approaches Theoretical issues

More information

accuracy (see, e.g., Mislevy & Stocking, 1989; Qualls & Ansley, 1985; Yen, 1987). A general finding of this research is that MML and Bayesian

accuracy (see, e.g., Mislevy & Stocking, 1989; Qualls & Ansley, 1985; Yen, 1987). A general finding of this research is that MML and Bayesian Recovery of Marginal Maximum Likelihood Estimates in the Two-Parameter Logistic Response Model: An Evaluation of MULTILOG Clement A. Stone University of Pittsburgh Marginal maximum likelihood (MML) estimation

More information

11/18/2013. Correlational Research. Correlational Designs. Why Use a Correlational Design? CORRELATIONAL RESEARCH STUDIES

11/18/2013. Correlational Research. Correlational Designs. Why Use a Correlational Design? CORRELATIONAL RESEARCH STUDIES Correlational Research Correlational Designs Correlational research is used to describe the relationship between two or more naturally occurring variables. Is age related to political conservativism? Are

More information

Meta-analysis using individual participant data: one-stage and two-stage approaches, and why they may differ

Meta-analysis using individual participant data: one-stage and two-stage approaches, and why they may differ Tutorial in Biostatistics Received: 11 March 2016, Accepted: 13 September 2016 Published online 16 October 2016 in Wiley Online Library (wileyonlinelibrary.com) DOI: 10.1002/sim.7141 Meta-analysis using

More information

Inference with Difference-in-Differences Revisited

Inference with Difference-in-Differences Revisited Inference with Difference-in-Differences Revisited M. Brewer, T- F. Crossley and R. Joyce Journal of Econometric Methods, 2018 presented by Federico Curci February 22nd, 2018 Brewer, Crossley and Joyce

More information

Meta-analysis of two studies in the presence of heterogeneity with applications in rare diseases

Meta-analysis of two studies in the presence of heterogeneity with applications in rare diseases Meta-analysis of two studies in the presence of heterogeneity with applications in rare diseases Christian Röver 1, Tim Friede 1, Simon Wandel 2 and Beat Neuenschwander 2 1 Department of Medical Statistics,

More information

Propensity scores: what, why and why not?

Propensity scores: what, why and why not? Propensity scores: what, why and why not? Rhian Daniel, Cardiff University @statnav Joint workshop S3RI & Wessex Institute University of Southampton, 22nd March 2018 Rhian Daniel @statnav/propensity scores:

More information

The Loss of Heterozygosity (LOH) Algorithm in Genotyping Console 2.0

The Loss of Heterozygosity (LOH) Algorithm in Genotyping Console 2.0 The Loss of Heterozygosity (LOH) Algorithm in Genotyping Console 2.0 Introduction Loss of erozygosity (LOH) represents the loss of allelic differences. The SNP markers on the SNP Array 6.0 can be used

More information

CENTRE FOR STATISTICAL METHODOLOGY Celebrating the first International Statistics Prize Winner, Prof Sir David Cox

CENTRE FOR STATISTICAL METHODOLOGY Celebrating the first International Statistics Prize Winner, Prof Sir David Cox CENTRE FOR STATISTICAL METHODOLOGY Celebrating the first International Statistics Prize Winner, Prof Sir David Cox Reduce, score*, regress, repeat: using factor analysis to tackle multicollinear HDL metabolomics

More information

EPSE 594: Meta-Analysis: Quantitative Research Synthesis

EPSE 594: Meta-Analysis: Quantitative Research Synthesis EPSE 594: Meta-Analysis: Quantitative Research Synthesis Ed Kroc University of British Columbia ed.kroc@ubc.ca March 28, 2019 Ed Kroc (UBC) EPSE 594 March 28, 2019 1 / 32 Last Time Publication bias Funnel

More information

Supplement for: CD4 cell dynamics in untreated HIV-1 infection: overall rates, and effects of age, viral load, gender and calendar time.

Supplement for: CD4 cell dynamics in untreated HIV-1 infection: overall rates, and effects of age, viral load, gender and calendar time. Supplement for: CD4 cell dynamics in untreated HIV-1 infection: overall rates, and effects of age, viral load, gender and calendar time. Anne Cori* 1, Michael Pickles* 1, Ard van Sighem 2, Luuk Gras 2,

More information

AP Statistics. Semester One Review Part 1 Chapters 1-5

AP Statistics. Semester One Review Part 1 Chapters 1-5 AP Statistics Semester One Review Part 1 Chapters 1-5 AP Statistics Topics Describing Data Producing Data Probability Statistical Inference Describing Data Ch 1: Describing Data: Graphically and Numerically

More information

Bayesian versus maximum likelihood estimation of treatment effects in bivariate probit instrumental variable models

Bayesian versus maximum likelihood estimation of treatment effects in bivariate probit instrumental variable models Bayesian versus maximum likelihood estimation of treatment effects in bivariate probit instrumental variable models Florian M. Hollenbach Department of Political Science Texas A&M University Jacob M. Montgomery

More information

Bayesian Confidence Intervals for Means and Variances of Lognormal and Bivariate Lognormal Distributions

Bayesian Confidence Intervals for Means and Variances of Lognormal and Bivariate Lognormal Distributions Bayesian Confidence Intervals for Means and Variances of Lognormal and Bivariate Lognormal Distributions J. Harvey a,b, & A.J. van der Merwe b a Centre for Statistical Consultation Department of Statistics

More information

Detection of Unknown Confounders. by Bayesian Confirmatory Factor Analysis

Detection of Unknown Confounders. by Bayesian Confirmatory Factor Analysis Advanced Studies in Medical Sciences, Vol. 1, 2013, no. 3, 143-156 HIKARI Ltd, www.m-hikari.com Detection of Unknown Confounders by Bayesian Confirmatory Factor Analysis Emil Kupek Department of Public

More information

Optimal full matching for survival outcomes: a method that merits more widespread use

Optimal full matching for survival outcomes: a method that merits more widespread use Research Article Received 3 November 2014, Accepted 6 July 2015 Published online 6 August 2015 in Wiley Online Library (wileyonlinelibrary.com) DOI: 10.1002/sim.6602 Optimal full matching for survival

More information

Effective Implementation of Bayesian Adaptive Randomization in Early Phase Clinical Development. Pantelis Vlachos.

Effective Implementation of Bayesian Adaptive Randomization in Early Phase Clinical Development. Pantelis Vlachos. Effective Implementation of Bayesian Adaptive Randomization in Early Phase Clinical Development Pantelis Vlachos Cytel Inc, Geneva Acknowledgement Joint work with Giacomo Mordenti, Grünenthal Virginie

More information

Investigating causality in the association between 25(OH)D and schizophrenia

Investigating causality in the association between 25(OH)D and schizophrenia Investigating causality in the association between 25(OH)D and schizophrenia Amy E. Taylor PhD 1,2,3, Stephen Burgess PhD 1,4, Jennifer J. Ware PhD 1,2,5, Suzanne H. Gage PhD 1,2,3, SUNLIGHT consortium,

More information

NEW METHODS FOR SENSITIVITY TESTS OF EXPLOSIVE DEVICES

NEW METHODS FOR SENSITIVITY TESTS OF EXPLOSIVE DEVICES NEW METHODS FOR SENSITIVITY TESTS OF EXPLOSIVE DEVICES Amit Teller 1, David M. Steinberg 2, Lina Teper 1, Rotem Rozenblum 2, Liran Mendel 2, and Mordechai Jaeger 2 1 RAFAEL, POB 2250, Haifa, 3102102, Israel

More information

Russian Journal of Agricultural and Socio-Economic Sciences, 3(15)

Russian Journal of Agricultural and Socio-Economic Sciences, 3(15) ON THE COMPARISON OF BAYESIAN INFORMATION CRITERION AND DRAPER S INFORMATION CRITERION IN SELECTION OF AN ASYMMETRIC PRICE RELATIONSHIP: BOOTSTRAP SIMULATION RESULTS Henry de-graft Acquah, Senior Lecturer

More information

Lecture Outline. Biost 590: Statistical Consulting. Stages of Scientific Studies. Scientific Method

Lecture Outline. Biost 590: Statistical Consulting. Stages of Scientific Studies. Scientific Method Biost 590: Statistical Consulting Statistical Classification of Scientific Studies; Approach to Consulting Lecture Outline Statistical Classification of Scientific Studies Statistical Tasks Approach to

More information

Accommodating informative dropout and death: a joint modelling approach for longitudinal and semicompeting risks data

Accommodating informative dropout and death: a joint modelling approach for longitudinal and semicompeting risks data Appl. Statist. (2018) 67, Part 1, pp. 145 163 Accommodating informative dropout and death: a joint modelling approach for longitudinal and semicompeting risks data Qiuju Li and Li Su Medical Research Council

More information

Propensity Score Methods for Estimating Causality in the Absence of Random Assignment: Applications for Child Care Policy Research

Propensity Score Methods for Estimating Causality in the Absence of Random Assignment: Applications for Child Care Policy Research 2012 CCPRC Meeting Methodology Presession Workshop October 23, 2012, 2:00-5:00 p.m. Propensity Score Methods for Estimating Causality in the Absence of Random Assignment: Applications for Child Care Policy

More information

Identification of population average treatment effects using nonlinear instrumental variables estimators : another cautionary note

Identification of population average treatment effects using nonlinear instrumental variables estimators : another cautionary note University of Iowa Iowa Research Online Theses and Dissertations Fall 2014 Identification of population average treatment effects using nonlinear instrumental variables estimators : another cautionary

More information

Score Tests of Normality in Bivariate Probit Models

Score Tests of Normality in Bivariate Probit Models Score Tests of Normality in Bivariate Probit Models Anthony Murphy Nuffield College, Oxford OX1 1NF, UK Abstract: A relatively simple and convenient score test of normality in the bivariate probit model

More information

Modelling Spatially Correlated Survival Data for Individuals with Multiple Cancers

Modelling Spatially Correlated Survival Data for Individuals with Multiple Cancers Modelling Spatially Correlated Survival Data for Individuals with Multiple Cancers Dipak K. Dey, Ulysses Diva and Sudipto Banerjee Department of Statistics University of Connecticut, Storrs. March 16,

More information

An Introduction to Bayesian Statistics

An Introduction to Bayesian Statistics An Introduction to Bayesian Statistics Robert Weiss Department of Biostatistics UCLA Fielding School of Public Health robweiss@ucla.edu Sept 2015 Robert Weiss (UCLA) An Introduction to Bayesian Statistics

More information

16:35 17:20 Alexander Luedtke (Fred Hutchinson Cancer Research Center)

16:35 17:20 Alexander Luedtke (Fred Hutchinson Cancer Research Center) Conference on Causal Inference in Longitudinal Studies September 21-23, 2017 Columbia University Thursday, September 21, 2017: tutorial 14:15 15:00 Miguel Hernan (Harvard University) 15:00 15:45 Miguel

More information

Meta-Analysis of Correlation Coefficients: A Monte Carlo Comparison of Fixed- and Random-Effects Methods

Meta-Analysis of Correlation Coefficients: A Monte Carlo Comparison of Fixed- and Random-Effects Methods Psychological Methods 01, Vol. 6, No. 2, 161-1 Copyright 01 by the American Psychological Association, Inc. 82-989X/01/S.00 DOI:.37//82-989X.6.2.161 Meta-Analysis of Correlation Coefficients: A Monte Carlo

More information

Empirical evidence on sources of bias in randomised controlled trials: methods of and results from the BRANDO study

Empirical evidence on sources of bias in randomised controlled trials: methods of and results from the BRANDO study Empirical evidence on sources of bias in randomised controlled trials: methods of and results from the BRANDO study Jonathan Sterne, University of Bristol, UK Acknowledgements: Tony Ades, Bodil Als-Nielsen,

More information

Bayesian Analysis of Between-Group Differences in Variance Components in Hierarchical Generalized Linear Models

Bayesian Analysis of Between-Group Differences in Variance Components in Hierarchical Generalized Linear Models Bayesian Analysis of Between-Group Differences in Variance Components in Hierarchical Generalized Linear Models Brady T. West Michigan Program in Survey Methodology, Institute for Social Research, 46 Thompson

More information

Citation for published version (APA): Ebbes, P. (2004). Latent instrumental variables: a new approach to solve for endogeneity s.n.

Citation for published version (APA): Ebbes, P. (2004). Latent instrumental variables: a new approach to solve for endogeneity s.n. University of Groningen Latent instrumental variables Ebbes, P. IMPORTANT NOTE: You are advised to consult the publisher's version (publisher's PDF) if you wish to cite from it. Please check the document

More information

Epidemiologic Methods I & II Epidem 201AB Winter & Spring 2002

Epidemiologic Methods I & II Epidem 201AB Winter & Spring 2002 DETAILED COURSE OUTLINE Epidemiologic Methods I & II Epidem 201AB Winter & Spring 2002 Hal Morgenstern, Ph.D. Department of Epidemiology UCLA School of Public Health Page 1 I. THE NATURE OF EPIDEMIOLOGIC

More information

Bayesian meta-analysis of Papanicolaou smear accuracy

Bayesian meta-analysis of Papanicolaou smear accuracy Gynecologic Oncology 107 (2007) S133 S137 www.elsevier.com/locate/ygyno Bayesian meta-analysis of Papanicolaou smear accuracy Xiuyu Cong a, Dennis D. Cox b, Scott B. Cantor c, a Biometrics and Data Management,

More information

Bias in regression coefficient estimates when assumptions for handling missing data are violated: a simulation study

Bias in regression coefficient estimates when assumptions for handling missing data are violated: a simulation study STATISTICAL METHODS Epidemiology Biostatistics and Public Health - 2016, Volume 13, Number 1 Bias in regression coefficient estimates when assumptions for handling missing data are violated: a simulation

More information

Challenges of Observational and Retrospective Studies

Challenges of Observational and Retrospective Studies Challenges of Observational and Retrospective Studies Kyoungmi Kim, Ph.D. March 8, 2017 This seminar is jointly supported by the following NIH-funded centers: Background There are several methods in which

More information

Lecture Outline. Biost 517 Applied Biostatistics I. Purpose of Descriptive Statistics. Purpose of Descriptive Statistics

Lecture Outline. Biost 517 Applied Biostatistics I. Purpose of Descriptive Statistics. Purpose of Descriptive Statistics Biost 517 Applied Biostatistics I Scott S. Emerson, M.D., Ph.D. Professor of Biostatistics University of Washington Lecture 3: Overview of Descriptive Statistics October 3, 2005 Lecture Outline Purpose

More information

ICH E9(R1) Technical Document. Estimands and Sensitivity Analysis in Clinical Trials STEP I TECHNICAL DOCUMENT TABLE OF CONTENTS

ICH E9(R1) Technical Document. Estimands and Sensitivity Analysis in Clinical Trials STEP I TECHNICAL DOCUMENT TABLE OF CONTENTS ICH E9(R1) Technical Document Estimands and Sensitivity Analysis in Clinical Trials STEP I TECHNICAL DOCUMENT TABLE OF CONTENTS A.1. Purpose and Scope A.2. A Framework to Align Planning, Design, Conduct,

More information

Recent developments for combining evidence within evidence streams: bias-adjusted meta-analysis

Recent developments for combining evidence within evidence streams: bias-adjusted meta-analysis EFSA/EBTC Colloquium, 25 October 2017 Recent developments for combining evidence within evidence streams: bias-adjusted meta-analysis Julian Higgins University of Bristol 1 Introduction to concepts Standard

More information

Combining Information from Diverse Sources

Combining Information from Diverse Sources Combining Information from Diverse Sources Eloise E. Kaizar Thesis Proposal July 6, 2005 Abstract Research synthesis plays a central role in the process of scientific discovery, providing a formal methodology

More information

Generalized Estimating Equations for Depression Dose Regimes

Generalized Estimating Equations for Depression Dose Regimes Generalized Estimating Equations for Depression Dose Regimes Karen Walker, Walker Consulting LLC, Menifee CA Generalized Estimating Equations on the average produce consistent estimates of the regression

More information

Statistical Tolerance Regions: Theory, Applications and Computation

Statistical Tolerance Regions: Theory, Applications and Computation Statistical Tolerance Regions: Theory, Applications and Computation K. KRISHNAMOORTHY University of Louisiana at Lafayette THOMAS MATHEW University of Maryland Baltimore County Contents List of Tables

More information

How should the propensity score be estimated when some confounders are partially observed?

How should the propensity score be estimated when some confounders are partially observed? How should the propensity score be estimated when some confounders are partially observed? Clémence Leyrat 1, James Carpenter 1,2, Elizabeth Williamson 1,3, Helen Blake 1 1 Department of Medical statistics,

More information

RAG Rating Indicator Values

RAG Rating Indicator Values Technical Guide RAG Rating Indicator Values Introduction This document sets out Public Health England s standard approach to the use of RAG ratings for indicator values in relation to comparator or benchmark

More information

Bayesian Mediation Analysis

Bayesian Mediation Analysis Psychological Methods 2009, Vol. 14, No. 4, 301 322 2009 American Psychological Association 1082-989X/09/$12.00 DOI: 10.1037/a0016972 Bayesian Mediation Analysis Ying Yuan The University of Texas M. D.

More information

Models for potentially biased evidence in meta-analysis using empirically based priors

Models for potentially biased evidence in meta-analysis using empirically based priors Models for potentially biased evidence in meta-analysis using empirically based priors Nicky Welton Thanks to: Tony Ades, John Carlin, Doug Altman, Jonathan Sterne, Ross Harris RSS Avon Local Group Meeting,

More information

Bayesian Multiparameter Evidence Synthesis to Inform Decision Making: A Case Study in Metastatic Hormone-Refractory Prostate Cancer

Bayesian Multiparameter Evidence Synthesis to Inform Decision Making: A Case Study in Metastatic Hormone-Refractory Prostate Cancer Original Manuscript Bayesian Multiparameter Evidence Synthesis to Inform Decision Making: A Case Study in Metastatic Hormone-Refractory Prostate Cancer Medical Decision Making 1 15 Ó The Author(s) 2018

More information