Flexible Matching in Case-Control Studies of Gene-Environment Interactions


American Journal of Epidemiology
Copyright 2004 by the Johns Hopkins Bloomberg School of Public Health. All rights reserved.
Vol. 159, No. 1. Printed in U.S.A. DOI: 10.1093/aje/kwg250

ORIGINAL CONTRIBUTIONS

Catherine L. Saunders and Jennifer H. Barrett

From the Genetic Epidemiology Division, Cancer Research UK Clinical Centre at Leeds, Leeds, United Kingdom.

Received for publication January 4, 2003; accepted for publication May 23,

Because of the lack of power of case-control study designs to detect gene-environment interactions, flexible matching has recently been proposed as a method of improving efficiency. In this paper, the authors consider a large-sample approximation method that allows estimation of the most efficient matching strategy when genotype and exposure are either independent or associated. The authors provide tables of the sample sizes required to detect gene-environment interactions if this flexible matching strategy is followed, and they make brief comparisons with other study designs.

case-control studies; epidemiologic methods; interaction; research design; statistics

The potential of matching strategies to improve statistical power for detection of gene-environment interactions has been debated (1–4). Detecting the departure from multiplicative joint effects of two risk factors for disease (as in gene-environment interaction) is important in understanding how risk factors act together in complex diseases (5) and in identifying high-risk groups. The power of a case-control study to detect interactions is low compared with the power to detect main effects (1). This has resulted in many different designs being proposed as strategies for improving power. In addition to matching strategies, because genetic risk factors are often under study, family designs have also been considered for studies of gene-gene or gene-environment interactions (6–8).
Unlike family studies of risk factor main effects (9), these designs have been found to have the potential to improve power to detect interactions in some situations. A design that samples only cases has also been proposed (10). The improvement in power for this design is large. However, if the risk factors under study are not independent in the population from which the cases are sampled, the false-positive rate with this design can become greatly inflated (11). Therefore, matching strategies are one of several approaches to improving the power to detect gene-environment interactions.

Sturmer and Brenner (3) recently proposed the use of flexible matching to address this problem. They showed that increasing the prevalence of the environmental exposure in controls above the prevalence in cases could offer a substantial improvement in statistical power. There are many scenarios in which environmental exposure, or at least a proxy thereof, may be measured in a relatively large set of potential controls. For example, in a case-control study of the interaction between genetic risk factors and smoking in relation to bladder cancer, potential controls could be screened by means of a simple question asking whether or not they had ever been a regular smoker. When interest is in interaction rather than the main effect of smoking, controls could then be selected for genotyping and detailed exposure evaluation by sampling according to their response to this question. Similarly, in a case-control study of sun exposure and the genes involved in melanoma risk, potential controls who had lived for some time in a hot country might be oversampled to improve power to test for gene-environment interactions. Using frequency matching strategies, the researchers would sample controls to have the same exposure frequency as the cases, whereas with a flexible matching strategy they would sample controls at the exposure frequency that maximized the power to test for interactions.
Sturmer and Brenner's simulations showed that the optimal degree of matching for exposure could be found in different situations. However, they concluded, "Given the strong dependence of the power and efficiency gains by matching on the multiple parameters, general recommendations as to the best degree of matching in all settings are difficult, if not impossible" (3, p. 599).

Correspondence to Dr. Jennifer Barrett, Genetic Epidemiology Division, Cancer Research UK Clinical Centre at Leeds, Leeds LS9 7TF, United Kingdom (jenny.barrett@cancer.org.uk).

Am J Epidemiol 2004;159:17–22

In this paper, we use a large-sample approximation of the variance of the interaction odds ratio to show that the exposure frequency among flexibly matched controls that minimizes this variance, and thus maximizes the power of this design, can be estimated.

METHODS

Using the notation of Sturmer and Brenner (3), let $p_{ij}$, $p_{ijc}$, and $p_{ijm}$ be the proportions of persons with level $i$ of the environmental exposure (the matching factor; $i = 1$ (0) if the environmental exposure is present (absent)) and genetic susceptibility $j$ ($j = 1$ (0) if the genetic susceptibility is present (absent)) in the population, in cases, and in matched controls, respectively. In the same way, let $n_{ij}$, $n_{ijc}$, and $n_{ijm}$ be the numbers of persons in each group in a study. The variance of the log of the interaction odds ratio for departure from multiplicative joint effects can be estimated as follows for a population-based case-control study (2):

$$\sum_{i,j} \frac{1}{n_{ijc}} + \sum_{i,j} \frac{1}{n_{ij}}. \qquad (1)$$

By a similar argument, the variance of the interaction effect from a study using flexible matching can be estimated by

$$\sum_{i,j} \frac{1}{n_{ijc}} + \sum_{i,j} \frac{1}{n_{ijm}}. \qquad (2)$$

Here, the contribution to the variance of the log of the interaction odds ratio due to the population-based controls in equation 1, $\sum_{i,j} 1/n_{ij}$, is replaced by the contribution from the flexibly matched controls, $\sum_{i,j} 1/n_{ijm}$, in equation 2. Therefore, the degree of matching that optimizes the efficiency of the flexible matching design will be the degree that minimizes this variance. Because the flexible matching technique samples population-based cases, the variance in the interaction term that is due to the cases is unaffected by the matching strategy. Thus, the optimum strategy can be determined by finding the frequency of the environmental factor among controls that minimizes $\sum_{i,j} 1/n_{ijm}$ or, equivalently, that minimizes $1/p_{00m} + 1/p_{01m} + 1/p_{10m} + 1/p_{11m}$.

Let $M_E$ be the frequency of the matching factor (exposure) among flexibly matched controls, and let $P_G$ be the frequency of the genotype in the source population. When the two risk factors are independent in the source population, this term can be written as

$$\frac{1}{(1 - M_E)(1 - P_G)} + \frac{1}{(1 - M_E)P_G} + \frac{1}{M_E(1 - P_G)} + \frac{1}{M_E P_G},$$

which simplifies to $1/[P_G(1 - P_G)M_E(1 - M_E)]$. Finding the value for $M_E$ that minimizes this variance is equivalent to finding a maximum for $P_G(1 - P_G)M_E(1 - M_E)$, which can be solved by differentiating with respect to $M_E$ and setting the derivative, $P_G(1 - P_G)(1 - 2M_E)$, to 0. Unsurprisingly, the variance is minimized when $M_E = 0.5$.

When the two risk factors are not independent, the most efficient frequency for the exposure sampling depends on both the odds ratio for the association between the genotype and the exposure in the source population (see the Appendix in Sturmer and Brenner (3)) and the frequencies of the two risk factors. The optimum frequency at which to sample exposure among controls ($M_E$) can be estimated using equation 3, where $P_E$ is the population exposure frequency and $p_{00}$, $p_{01}$, $p_{10}$, and $p_{11}$ are, as before, the proportions of the population/unmatched controls with the different exposure/genotype combinations. Further details are given in the Appendix.

The sample size required to detect interactions is calculated using the method of Self et al. (13–15). Briefly, the likelihood ratio test statistic for the interaction asymptotically follows a noncentral chi-squared distribution under the alternative hypothesis. A large exemplary data set with the risk factor frequencies among cases and controls expected under the alternative hypothesis is analyzed using standard statistical software. The likelihood ratio test statistic is the noncentrality parameter for this distribution.
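The power calculation described in this paragraph can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it assumes that the interaction test has 1 degree of freedom (binary exposure and binary genotype), so that the noncentral chi-squared tail probability can be written with standard normal distribution functions. The function names and the idea of rescaling a hypothetical exemplary data set are illustrative and are not taken from Self et al.

```python
from statistics import NormalDist

_STD_NORMAL = NormalDist()


def power_1df(noncentrality, alpha=0.05):
    """Power of a 1-df chi-squared test with the given noncentrality."""
    z_crit = _STD_NORMAL.inv_cdf(1 - alpha / 2)
    shift = noncentrality ** 0.5
    # A noncentral chi-squared variable with 1 df is (Z + shift)^2, so the
    # test rejects when |Z + shift| exceeds the normal critical value.
    return _STD_NORMAL.cdf(shift - z_crit) + _STD_NORMAL.cdf(-shift - z_crit)


def required_noncentrality(target_power=0.80, alpha=0.05):
    """Smallest noncentrality reaching the target power (bisection)."""
    low, high = 0.0, 100.0
    for _ in range(200):
        mid = (low + high) / 2
        if power_1df(mid, alpha) < target_power:
            low = mid
        else:
            high = mid
    return high


def required_cases(exemplary_cases, exemplary_lrt, target_power=0.80, alpha=0.05):
    """Rescale a (hypothetical) exemplary data set to the target power.

    exemplary_lrt is the likelihood ratio statistic obtained from the
    exemplary data set, treated as its noncentrality parameter.
    """
    needed = required_noncentrality(target_power, alpha)
    return exemplary_cases * needed / exemplary_lrt
```

With no effect (noncentrality 0), `power_1df` returns the significance level itself, which is a quick sanity check on the formula.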
The required sample size is simply inversely proportional to this noncentrality parameter, which allows the application of this method to a wide range of designs.

The optimum exposure frequency among flexibly matched controls (equation 3) is

$$M_E = \frac{P_E\sqrt{p_{00}\,p_{01}}}{P_E\sqrt{p_{00}\,p_{01}} + (1 - P_E)\sqrt{p_{10}\,p_{11}}}. \qquad (3)$$

RESULTS

Table 1 shows the exposure frequency that maximizes the efficiency of a study to detect interactions over a range of control group genotype frequencies and magnitudes of risk factor associations (odds ratio for the association between genotype and exposure (OR_GE)). We give the optimum exposure frequencies at particular matched control genotype frequencies rather than for specific population exposure and genotype frequencies. In practice, when exposure is sampled at a specific frequency among controls, this will also affect the frequency of the genotype, unless risk factors are independent; thus, the genotype frequency among flexibly matched controls will not always be the same as that in the source population. (This is reflected in table 3, where the optimum matching frequencies for exposure are expressed with respect to the population genotype frequency and are slightly different from those in table 1.) When exposure/genotype combination frequencies are known among unmatched controls, applying equation 3 is the simplest way to calculate the optimum exposure frequency if risk factors are not independent.

TABLE 1. Optimum flexible matching exposure frequencies when risk factors are not independent
Rows: OR_GE*. Columns: genotype frequency (proportion) among flexibly matched controls.
* OR_GE, odds ratio for the association between genotype and exposure (defined as $p_{00}p_{11}/p_{01}p_{10}$).

When the association between risk factors is small or the genotype frequency among controls is close to 0.5, a frequency for the matching factor of 0.5 remains the most efficient. In addition, the values shown in table 1 confirm the finding in Sturmer and Brenner (3) that when there is a strong positive association between risk factors and genotype frequency is low, the optimum degree of matching is smaller than when there is less association or no association.

To consider the practical use of the flexible matching design, we calculated required sample sizes under this optimal matching strategy for a range of magnitudes of risk factor effects and frequencies. These complement the relative efficiencies presented by Sturmer and Brenner (3). Sample sizes needed (number of cases required, assuming equal numbers of cases and controls) for a statistical power of 80 percent and a two-sided significance level of 0.05 are presented in table 2. Situations in which exposure and genotype are independent are considered first; therefore, in the flexible matching design, exposure frequency among controls is simply sampled at 50 percent, and required sample sizes are provided for comparison with an unmatched population-based study and a case-only design. We consider a situation with a rare disease (population frequency = 0.1
percent) and genotype main effect (relative risk of disease among unexposed people with the susceptibility genotype compared with people exposed to neither risk factor) equaling 2. Required sample sizes are provided for a range of genotype and exposure frequencies and magnitudes of interaction and main effects.

It can be seen from table 2 that the sample size requirements for the flexible matching design are always lower, and can be substantially lower, than those for the population-based case-control design, especially when the exposure is relatively rare (frequency 0.1). Although the sample size requirements for the case-only design are lower still, the flexible matching design does not require the assumption of independence of risk factors that makes the case-only design untenable in many situations.

Situations where exposure and genotype are not independent are also shown. Sample size requirements are not presented for the case-only design, because it would not be an appropriate strategy in these situations. The flexible matching strategy, however, still shows a substantial reduction in the required sample size in comparison with the population-based controls in all situations.

Table 3 shows the optimal frequencies at which exposure is sampled for table 2. When genotype and exposure are independent, this frequency is 50 percent. Because changing the frequency at which exposure is sampled will also alter the genotype frequency among flexibly matched controls when genotype and exposure are not independent, both genotype and exposure population frequencies, as well as the magnitude of their association, affect the optimal matching frequency for exposure.

DISCUSSION

The relative efficiency under the four scenarios given in Sturmer and Brenner's (3) table 2 can also be estimated by the ratio of the variances calculated using this large-sample approximation method.
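The variance ratio just mentioned can be sketched as follows. This is a minimal illustration, not the authors' code. It assumes a rare disease, so that the cell proportions among cases are the population proportions weighted by relative risk, and it uses hypothetical relative risks and frequencies chosen only for the example. Cells are keyed by (exposure i, genotype j), and the matched-control proportions follow the scaling given in the Appendix.

```python
def case_proportions(pop, rr_e, rr_g, rr_int):
    """Cell proportions among cases, assuming a rare disease and
    multiplicative main effects (an assumed model for illustration)."""
    risk = {(0, 0): 1.0, (1, 0): rr_e, (0, 1): rr_g, (1, 1): rr_e * rr_g * rr_int}
    weighted = {cell: pop[cell] * risk[cell] for cell in pop}
    total = sum(weighted.values())
    return {cell: w / total for cell, w in weighted.items()}


def matched_control_proportions(pop, m_e):
    """Cell proportions among controls resampled to exposure frequency m_e."""
    p_e = pop[(1, 0)] + pop[(1, 1)]
    scale = {0: (1 - m_e) / (1 - p_e), 1: m_e / p_e}
    return {(i, j): p * scale[i] for (i, j), p in pop.items()}


def log_or_variance(case_props, control_props, n_cases, n_controls):
    """Large-sample variance of the log interaction odds ratio: the sum of
    reciprocal expected cell counts over cases and controls."""
    case_term = sum(1 / (n_cases * p) for p in case_props.values())
    control_term = sum(1 / (n_controls * p) for p in control_props.values())
    return case_term + control_term


# Hypothetical scenario: independent risk factors with P_E = P_G = 0.1.
population = {(0, 0): 0.81, (0, 1): 0.09, (1, 0): 0.09, (1, 1): 0.01}
cases = case_proportions(population, rr_e=2.0, rr_g=2.0, rr_int=3.0)
flexible = matched_control_proportions(population, m_e=0.5)

var_population = log_or_variance(cases, population, 1000, 1000)
var_flexible = log_or_variance(cases, flexible, 1000, 1000)
relative_efficiency = var_population / var_flexible
print(f"relative efficiency of flexible matching: {relative_efficiency:.2f}")
```

A ratio above 1 indicates that, in this hypothetical scenario, flexibly matched controls give a smaller variance than population-based controls for the same numbers of cases and controls.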
Although, for each scenario, both methods (simulation in the paper by Sturmer and Brenner (3) and approximation here) gave the same degree of matching as the most efficient, the magnitudes of the relative efficiencies were slightly different. Discrepancies can be attributed to equation 2, which, though widely used, only calculates an asymptotic approximation to the variance of the interaction odds ratio.

TABLE 2. Numbers of cases required to detect interactions for the flexible matching, case-control, and case-only study designs*
Columns: exposure frequency; exposure main effect; interaction effect; then, for each of genotype frequency = 0.01, 0.1, and 0.2: flexible matching, case-control, case-only.
OR GE = ,549 69,506 2,76 4,98 8,328 2,862 3,55 5,4, ,320 7,454, ,75 6,479 2,982 3,853 7,236,770 2,46 4,386, ,767 6,822, ,486 3,306 9,369 3,403 3,736,294 2,03 2, ,568 3, ,485 30,287 8,3 3,267 3,599,39 2,0 2, ,558 3, ,556 27,556 8, 3,278 3,278,44 2,09 2, ,582 3, ,043 29,043 9,347 3,494 3,494,340 2,79 2, ,86 3,
OR GE = ,09 52,4 4,230 6,740 2,887 4, ,666 5, ,734 46,563 3,399 5,889 2,246 3, ,299 5, ,880 28,282 3,257 3,456 2,05 2,92 3 3,474 3, ,494 27,89 3,90 3,388,994 2,35 3 3,54 3, ,927 29,309 3,4 3,438 2,085 2, ,847 3, ,0 3,472 3,70 3,736 2,29 2, ,54 4,
OR GE = ,523 43,60 3,907 5,973 2,787 4, ,363 4, ,357 39,303 3,92 5,240 2,87 3,62 3 3,085 4, ,80 27,555 3,248 3,379 2,057 2,68 3 3,536 3, ,799 27,546 3,225 3,356 2,023 2,33 3 3,609 3, ,03 32,040 3,63 3,680 2,79 2,99 3 4,95 4, ,889 34,805 3,984 4,049 2,424 2, ,570 4,
* In the flexible matching and case-control designs, the number of controls is equal to the number of cases. OR_GE, odds ratio for the association between genotype and exposure (defined as $p_{00}p_{11}/p_{01}p_{10}$).

TABLE 3. Flexible matching exposure frequencies for table 2
Rows: exposure frequency. Columns: genotype frequency, in three panels by OR_GE*.
* OR_GE, odds ratio for the association between genotype and exposure (defined as $p_{00}p_{11}/p_{01}p_{10}$).

Strategies similar to flexible matching for interactions have been discussed previously. Cain and Breslow (16) discussed a strategy similar to the one detailed above for improving power to detect interactions and main effects. They considered a situation where exposure information on cases and controls was available before sampling of the particular controls for which more detailed information would be collected (in this case, genotyping). They advocated a strategy in which controls are sampled with balanced numbers from each exposure stratum. Cain and Breslow found that the balanced design is always much more powerful than the unstratified design for detecting interactions. Indeed, the only time they found the strategy less efficient was when there was a strong negative correlation between the variables measured in the first and second stages; this is also reflected here in the case where the optimum sampling frequency for the exposure is potentially greater or less than 50 percent when the two risk factors are strongly associated.

Breslow and Cain (17) similarly recognized for the two-stage design that unbiased estimates of the interaction parameter can be obtained from an unmatched analysis even though the exposure is used as a matching factor, in the same way as for the flexible matching design. However, estimates of the population exposure frequency can also be used to additionally allow estimation of the exposure main effects.
This is an aspect that could also be applied to the flexible matching design if, at the control sampling stage of the study, an estimate of the population exposure frequency could be made, or if the controls were being sampled from a preexisting cohort for which exposure information was available. At the analysis stage, the log of the exposure group frequency (i.e., exposed or unexposed) is used as an offset in the logistic regression model, to retrieve unbiased estimates of exposure main effects. One advantage of this result is that the offset has no effect on the power of the design to detect interactions (17). Thus, if this information is not available, this does not detract from the strength of the design for detecting interactions.

Understanding how the power of the flexible matching design can be optimized is helpful in understanding comparisons between different designs that have been proposed as strategies for detecting interactions. Table 2 shows that although the exposure frequency among controls is chosen to minimize the variance, the decrease in the required sample size is still small in comparison with the case-only design, where there is no component of variance in the interaction estimate due to the controls. The inappropriateness of the case-only design in the presence of risk factor association and concerns about the false-positive rate when this assumption is violated (11, 18) mean that alternative strategies are still attractive and should be explored.

By considering the large-sample approximation to the variance of the interaction parameter for the flexible matching design, one can see why using family controls has the potential to improve the power to detect interactions (7, 8).
When risk factors are rare (and this is the situation in which most improvement in power from family designs has been observed), the exposure frequencies among controls are raised above the population levels toward the optimal frequency of 50 percent because of within-family correlation of genetic, and to a lesser extent environmental, risk factors. Similar arguments can be considered for other designs, such as the design that compares case subjects who have two primary cancers with cases who have only one primary cancer (19). This sampling strategy will increase the prevalence of rare risk factors among all study participants, again decreasing variation in the interaction parameter and increasing power.

Matching strategies such as flexible matching are often the most rational approach to choosing an efficient design for detecting interactions, if the assumption of independence of genotype and exposure that is required for the case-only design proves untenable (11). The strategies described here can be used to find the most informative risk factor frequencies. If the population exposure frequency is known, the theory from two-stage designs can be incorporated at the analysis stage to estimate the main effects of the matching variables. This further increases the attractiveness of these designs.

REFERENCES

1. Smith PG, Day NE. The design of case-control studies: the influence of confounding and interaction effects. Int J Epidemiol 1984;13:
2. Thomas DC, Greenland S. The efficiency of matching in case-control studies of risk-factor interactions. J Chronic Dis 1985;38:
3. Sturmer T, Brenner H. Flexible matching strategies to increase power and efficiency to detect and estimate gene-environment interactions in case-control studies. Am J Epidemiol 2002;155:

4. Sturmer T, Brenner H. Potential gain in efficiency and power to detect gene-environment interactions by matching in case-control studies. Genet Epidemiol 2000;18:
5. Brennan P. Gene-environment interaction and aetiology of cancer: what does it mean and how can we measure it? Carcinogenesis 2002;23:
6. Gauderman WJ. Sample size requirements for association studies of gene-gene interaction. Am J Epidemiol 2002;155:
7. Gauderman WJ. Sample size requirements for matched case-control studies of gene-environment interaction. Stat Med 2002;21:
8. Witte JS, Gauderman WJ, Thomas DC. Asymptotic bias and efficiency in case-control studies of candidate genes and gene-environment interactions: basic family designs. Am J Epidemiol 1999;149:
9. Schaid DJ, Rowland C. Use of parents, sibs, and unrelated controls for detection of associations between genetic markers and disease. Am J Hum Genet 1998;63:
10. Piegorsch WW, Weinberg CR, Taylor JA. Non-hierarchical logistic models and case-only designs for assessing susceptibility in population-based case-control studies. Stat Med 1994;13:
11. Albert PS, Ratnasinghe D, Tangrea J, et al. Limitations of the case-only design for identifying gene-environment interactions. Am J Epidemiol 2001;154:
12. Cuzick J. Interaction, subgroup analysis and sample size. In: Boffetta P, Caporaso N, Cuzick J, et al, eds. Metabolic polymorphisms and susceptibility to cancer. Lyon, France: International Agency for Research on Cancer, 1999:109–21. (IARC Scientific Publication no. 148).
13. Self SG, Mauritsen RH, Ohara J. Power calculations for likelihood ratio tests in generalized linear models. Biometrics 1992;48:
14. Brown BW, Lovato J, Russell K. Asymptotic power calculations: description, examples, computer code. Stat Med 1999;18:
15. Longmate JA. Complexity and power in case-control association studies. Am J Hum Genet 2001;68:
16. Cain KC, Breslow NE. Logistic regression analysis and efficient design for two-stage studies. Am J Epidemiol 1988;128:
17. Breslow NE, Cain KC. Logistic regression for two-stage case-control data. Biometrika 1988;75:
18. Saunders CL, Gooptu C, Bishop DT, et al. The use of case only studies for the detection of interactions, and the non-independence of genetic and environmental risk factors for disease. (Abstract). Genet Epidemiol 2001;21:
19. Begg CB, Berwick M. A note on the estimation of relative risks of rare genetic susceptibility markers. Cancer Epidemiol Biomarkers Prev 1997;6:

APPENDIX

As before, let $p_{ij}$ and $p_{ijm}$ be the proportions of persons with level $i$ of the environmental exposure (the matching factor; $i = 1$ (0) if the environmental exposure is present (absent)) and genetic susceptibility $j$ ($j = 1$ (0) if the genetic susceptibility is present (absent)) in the population and in matched controls, respectively. Let $P_E$ be the exposure frequency in the source population, and let $M_E$ be the exposure frequency among flexibly matched controls.

The $p_{ij}$ are calculated following the method of Sturmer and Brenner (3). They depend on the genotype and exposure frequencies and the magnitude of the association between the two factors. Alternatively, if an unmatched control group were available, then the values of the proportions $p_{ij}$ could be observed directly. The proportions of persons with each genotype/exposure combination when controls are selected under a flexible matching scheme, $p_{ijm}$, are then calculated as follows, such that the frequency of exposure among controls is $M_E$:

$$p_{00m} = \frac{p_{00}(1 - M_E)}{1 - P_E}, \quad p_{01m} = \frac{p_{01}(1 - M_E)}{1 - P_E}, \quad p_{10m} = \frac{p_{10}M_E}{P_E}, \quad p_{11m} = \frac{p_{11}M_E}{P_E}.$$

Therefore, the contribution of the flexibly matched controls to the variance of the log of the interaction odds ratio is proportional to

$$\frac{1 - P_E}{p_{00}(1 - M_E)} + \frac{1 - P_E}{p_{01}(1 - M_E)} + \frac{P_E}{p_{10}M_E} + \frac{P_E}{p_{11}M_E}.$$

By differentiating this function with respect to $M_E$ and finding the value of $M_E$ at which the derivative is zero, one can find the value of $M_E$ that minimizes this variance. Because $p_{00} + p_{01} = 1 - P_E$ and $p_{10} + p_{11} = P_E$, some simple algebra shows that the derivative is zero when

$$\frac{p_{00}\,p_{01}(1 - M_E)^2}{(1 - P_E)^2} - \frac{p_{10}\,p_{11}M_E^2}{P_E^2} = 0.$$

Since the left-hand side is a difference of two squares, the equation can be solved for $M_E$ by factorization, providing the solution in equation 3.
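As a numerical check on the derivation above, the following Python sketch compares the closed-form solution of equation 3 with a brute-force grid search over $M_E$. It is an illustration only: the cell proportions are hypothetical, the function names are not from the paper, and cells are keyed by (exposure i, genotype j).

```python
from math import sqrt


def exposure_frequency(pop):
    """Population exposure frequency P_E = p_10 + p_11."""
    return pop[(1, 0)] + pop[(1, 1)]


def optimal_exposure_frequency(pop):
    """Equation 3: exposure frequency among controls that minimizes the
    control contribution to the variance."""
    p_e = exposure_frequency(pop)
    unexposed = sqrt(pop[(0, 0)] * pop[(0, 1)])
    exposed = sqrt(pop[(1, 0)] * pop[(1, 1)])
    return p_e * unexposed / (p_e * unexposed + (1 - p_e) * exposed)


def control_variance_term(pop, m_e):
    """Sum of 1/p_ijm over the four cells, with p_ijm as in the Appendix."""
    p_e = exposure_frequency(pop)
    return (
        (1 - p_e) / (pop[(0, 0)] * (1 - m_e))
        + (1 - p_e) / (pop[(0, 1)] * (1 - m_e))
        + p_e / (pop[(1, 0)] * m_e)
        + p_e / (pop[(1, 1)] * m_e)
    )


def grid_minimum(pop, steps=10000):
    """Brute-force check: the grid point with the smallest variance term."""
    grid = [k / steps for k in range(1, steps)]
    return min(grid, key=lambda m_e: control_variance_term(pop, m_e))


# Hypothetical cell proportions.
independent = {(0, 0): 0.81, (0, 1): 0.09, (1, 0): 0.09, (1, 1): 0.01}
associated = {(0, 0): 0.80, (0, 1): 0.05, (1, 0): 0.10, (1, 1): 0.05}  # OR_GE = 8

for label, pop in [("independent", independent), ("associated", associated)]:
    closed_form = optimal_exposure_frequency(pop)
    numeric = grid_minimum(pop)
    print(f"{label}: equation 3 gives {closed_form:.4f}, grid search gives {numeric:.4f}")
```

In the independent case, equation 3 reduces to 0.5, in agreement with the Methods section. In the hypothetical associated case, with a strong positive association and a low genotype frequency, the optimum falls below 0.5.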


More information

OLS Regression with Clustered Data

OLS Regression with Clustered Data OLS Regression with Clustered Data Analyzing Clustered Data with OLS Regression: The Effect of a Hierarchical Data Structure Daniel M. McNeish University of Maryland, College Park A previous study by Mundfrom

More information

Pearce, N (2016) Analysis of matched case-control studies. BMJ (Clinical research ed), 352. i969. ISSN DOI: https://doi.org/ /bmj.

Pearce, N (2016) Analysis of matched case-control studies. BMJ (Clinical research ed), 352. i969. ISSN DOI: https://doi.org/ /bmj. Pearce, N (2016) Analysis of matched case-control studies. BMJ (Clinical research ed), 352. i969. ISSN 0959-8138 DOI: https://doi.org/10.1136/bmj.i969 Downloaded from: http://researchonline.lshtm.ac.uk/2534120/

More information

BOOTSTRAPPING CONFIDENCE LEVELS FOR HYPOTHESES ABOUT QUADRATIC (U-SHAPED) REGRESSION MODELS

BOOTSTRAPPING CONFIDENCE LEVELS FOR HYPOTHESES ABOUT QUADRATIC (U-SHAPED) REGRESSION MODELS BOOTSTRAPPING CONFIDENCE LEVELS FOR HYPOTHESES ABOUT QUADRATIC (U-SHAPED) REGRESSION MODELS 12 June 2012 Michael Wood University of Portsmouth Business School SBS Department, Richmond Building Portland

More information

What is Multilevel Modelling Vs Fixed Effects. Will Cook Social Statistics

What is Multilevel Modelling Vs Fixed Effects. Will Cook Social Statistics What is Multilevel Modelling Vs Fixed Effects Will Cook Social Statistics Intro Multilevel models are commonly employed in the social sciences with data that is hierarchically structured Estimated effects

More information

Maria-Athina Altzerinakou1, Xavier Paoletti2. 9 May, 2017

Maria-Athina Altzerinakou1, Xavier Paoletti2. 9 May, 2017 An adaptive design for the identification of the optimal dose using joint modelling of efficacy and toxicity in phase I/II clinical trials of molecularly targeted agents Maria-Athina Altzerinakou1, Xavier

More information

m 11 m.1 > m 12 m.2 risk for smokers risk for nonsmokers

m 11 m.1 > m 12 m.2 risk for smokers risk for nonsmokers SOCY5061 RELATIVE RISKS, RELATIVE ODDS, LOGISTIC REGRESSION RELATIVE RISKS: Suppose we are interested in the association between lung cancer and smoking. Consider the following table for the whole population:

More information

Selected Topics in Biostatistics Seminar Series. Missing Data. Sponsored by: Center For Clinical Investigation and Cleveland CTSC

Selected Topics in Biostatistics Seminar Series. Missing Data. Sponsored by: Center For Clinical Investigation and Cleveland CTSC Selected Topics in Biostatistics Seminar Series Missing Data Sponsored by: Center For Clinical Investigation and Cleveland CTSC Brian Schmotzer, MS Biostatistician, CCI Statistical Sciences Core brian.schmotzer@case.edu

More information

The Exposure-Stratified Retrospective Study: Application to High-Incidence Diseases

The Exposure-Stratified Retrospective Study: Application to High-Incidence Diseases The Exposure-Stratified Retrospective Study: Application to High-Incidence Diseases Peng T. Liu and Debra A. Street Division of Public Health and Biostatistics, CFSAN, FDA 5100 Paint Branch Pkwy, College

More information

Bias in randomised factorial trials

Bias in randomised factorial trials Research Article Received 17 September 2012, Accepted 9 May 2013 Published online 4 June 2013 in Wiley Online Library (wileyonlinelibrary.com) DOI: 10.1002/sim.5869 Bias in randomised factorial trials

More information

Using Statistical Principles to Implement FDA Guidance on Cardiovascular Risk Assessment for Diabetes Drugs

Using Statistical Principles to Implement FDA Guidance on Cardiovascular Risk Assessment for Diabetes Drugs Using Statistical Principles to Implement FDA Guidance on Cardiovascular Risk Assessment for Diabetes Drugs David Manner, Brenda Crowe and Linda Shurzinske BASS XVI November 9-13, 2009 Acknowledgements

More information

Transmission Disequilibrium Methods for Family-Based Studies Daniel J. Schaid Technical Report #72 July, 2004

Transmission Disequilibrium Methods for Family-Based Studies Daniel J. Schaid Technical Report #72 July, 2004 Transmission Disequilibrium Methods for Family-Based Studies Daniel J. Schaid Technical Report #72 July, 2004 Correspondence to: Daniel J. Schaid, Ph.D., Harwick 775, Division of Biostatistics Mayo Clinic/Foundation,

More information

RAG Rating Indicator Values

RAG Rating Indicator Values Technical Guide RAG Rating Indicator Values Introduction This document sets out Public Health England s standard approach to the use of RAG ratings for indicator values in relation to comparator or benchmark

More information

Title: The efficacy of fish oil supplements in the treatment of depression: food for thought

Title: The efficacy of fish oil supplements in the treatment of depression: food for thought Title: The efficacy of fish oil supplements in the treatment of depression: food for thought Response to: Meta-analysis and meta-regression of omega-3 polyunsaturated fatty acid supplementation for major

More information

Confounding, Effect modification, and Stratification

Confounding, Effect modification, and Stratification Confounding, Effect modification, and Stratification Tunisia, 30th October 2014 Acknowledgment: Kostas Danis Takis Panagiotopoulos National Schoool of Public Health, Athens, Greece takis.panagiotopoulos@gmail.com

More information

Choice of axis, tests for funnel plot asymmetry, and methods to adjust for publication bias

Choice of axis, tests for funnel plot asymmetry, and methods to adjust for publication bias Technical appendix Choice of axis, tests for funnel plot asymmetry, and methods to adjust for publication bias Choice of axis in funnel plots Funnel plots were first used in educational research and psychology,

More information

The Regression-Discontinuity Design

The Regression-Discontinuity Design Page 1 of 10 Home» Design» Quasi-Experimental Design» The Regression-Discontinuity Design The regression-discontinuity design. What a terrible name! In everyday language both parts of the term have connotations

More information

Mammographic density and risk of breast cancer by tumor characteristics: a casecontrol

Mammographic density and risk of breast cancer by tumor characteristics: a casecontrol Krishnan et al. BMC Cancer (2017) 17:859 DOI 10.1186/s12885-017-3871-7 RESEARCH ARTICLE Mammographic density and risk of breast cancer by tumor characteristics: a casecontrol study Open Access Kavitha

More information

Case-control studies. Hans Wolff. Service d épidémiologie clinique Département de médecine communautaire. WHO- Postgraduate course 2007 CC studies

Case-control studies. Hans Wolff. Service d épidémiologie clinique Département de médecine communautaire. WHO- Postgraduate course 2007 CC studies Case-control studies Hans Wolff Service d épidémiologie clinique Département de médecine communautaire Hans.Wolff@hcuge.ch Outline Case-control study Relation to cohort study Selection of controls Sampling

More information

Comparison of Some Almost Unbiased Ratio Estimators

Comparison of Some Almost Unbiased Ratio Estimators Comparison of Some Almost Unbiased Ratio Estimators Priyaranjan Dash Department of Statistics, Tripura University, India. ---------------------------------------------------------------------***---------------------------------------------------------------------

More information

Measures of Association

Measures of Association Measures of Association Lakkana Thaikruea M.D., M.S., Ph.D. Community Medicine Department, Faculty of Medicine, Chiang Mai University, Thailand Introduction One of epidemiological studies goal is to determine

More information

A Methodological Issue in the Analysis of Second-Primary Cancer Incidence in Long-Term Survivors of Childhood Cancers

A Methodological Issue in the Analysis of Second-Primary Cancer Incidence in Long-Term Survivors of Childhood Cancers American Journal of Epidemiology Copyright 2003 by the Johns Hopkins Bloomberg School of Public Health All rights reserved Vol. 158, No. 11 Printed in U.S.A. DOI: 10.1093/aje/kwg278 PRACTICE OF EPIDEMIOLOGY

More information

Review: Logistic regression, Gaussian naïve Bayes, linear regression, and their connections

Review: Logistic regression, Gaussian naïve Bayes, linear regression, and their connections Review: Logistic regression, Gaussian naïve Bayes, linear regression, and their connections New: Bias-variance decomposition, biasvariance tradeoff, overfitting, regularization, and feature selection Yi

More information

Analysis of TB prevalence surveys

Analysis of TB prevalence surveys Workshop and training course on TB prevalence surveys with a focus on field operations Analysis of TB prevalence surveys Day 8 Thursday, 4 August 2011 Phnom Penh Babis Sismanidis with acknowledgements

More information

ADENIYI MOFOLUWAKE MPH APPLIED EPIDEMIOLOGY WEEK 5 CASE STUDY ASSIGNMENT APRIL

ADENIYI MOFOLUWAKE MPH APPLIED EPIDEMIOLOGY WEEK 5 CASE STUDY ASSIGNMENT APRIL ADENIYI MOFOLUWAKE MPH 510 - APPLIED EPIDEMIOLOGY WEEK 5 CASE STUDY ASSIGNMENT APRIL 4 2013 Question 1: What makes the first study a case-control study? The first case study is a case-control study because

More information

Unit 1 Exploring and Understanding Data

Unit 1 Exploring and Understanding Data Unit 1 Exploring and Understanding Data Area Principle Bar Chart Boxplot Conditional Distribution Dotplot Empirical Rule Five Number Summary Frequency Distribution Frequency Polygon Histogram Interquartile

More information

How should the propensity score be estimated when some confounders are partially observed?

How should the propensity score be estimated when some confounders are partially observed? How should the propensity score be estimated when some confounders are partially observed? Clémence Leyrat 1, James Carpenter 1,2, Elizabeth Williamson 1,3, Helen Blake 1 1 Department of Medical statistics,

More information

Measuring cancer survival in populations: relative survival vs cancer-specific survival

Measuring cancer survival in populations: relative survival vs cancer-specific survival Int. J. Epidemiol. Advance Access published February 8, 2010 Published by Oxford University Press on behalf of the International Epidemiological Association ß The Author 2010; all rights reserved. International

More information

Brief introduction to instrumental variables. IV Workshop, Bristol, Miguel A. Hernán Department of Epidemiology Harvard School of Public Health

Brief introduction to instrumental variables. IV Workshop, Bristol, Miguel A. Hernán Department of Epidemiology Harvard School of Public Health Brief introduction to instrumental variables IV Workshop, Bristol, 2008 Miguel A. Hernán Department of Epidemiology Harvard School of Public Health Goal: To consistently estimate the average causal effect

More information

Regression Discontinuity Analysis

Regression Discontinuity Analysis Regression Discontinuity Analysis A researcher wants to determine whether tutoring underachieving middle school students improves their math grades. Another wonders whether providing financial aid to low-income

More information

Methods of Calculating Deaths Attributable to Obesity

Methods of Calculating Deaths Attributable to Obesity American Journal of Epidemiology Copyright 2004 by the Johns Hopkins Bloomberg School of Public Health All rights reserved Vol. 160, No. 4 Printed in U.S.A. DOI: 10.1093/aje/kwh222 Methods of Calculating

More information

Fixed Effect Combining

Fixed Effect Combining Meta-Analysis Workshop (part 2) Michael LaValley December 12 th 2014 Villanova University Fixed Effect Combining Each study i provides an effect size estimate d i of the population value For the inverse

More information

We expand our previous deterministic power

We expand our previous deterministic power Power of the Classical Twin Design Revisited: II Detection of Common Environmental Variance Peter M. Visscher, 1 Scott Gordon, 1 and Michael C. Neale 1 Genetic Epidemiology, Queensland Institute of Medical

More information

Modeling Binary outcome

Modeling Binary outcome Statistics April 4, 2013 Debdeep Pati Modeling Binary outcome Test of hypothesis 1. Is the effect observed statistically significant or attributable to chance? 2. Three types of hypothesis: a) tests of

More information

Analysis of single gene effects 1. Quantitative analysis of single gene effects. Gregory Carey, Barbara J. Bowers, Jeanne M.

Analysis of single gene effects 1. Quantitative analysis of single gene effects. Gregory Carey, Barbara J. Bowers, Jeanne M. Analysis of single gene effects 1 Quantitative analysis of single gene effects Gregory Carey, Barbara J. Bowers, Jeanne M. Wehner From the Department of Psychology (GC, JMW) and Institute for Behavioral

More information

Stratified Tables. Example: Effect of seat belt use on accident fatality

Stratified Tables. Example: Effect of seat belt use on accident fatality Stratified Tables Often, a third measure influences the relationship between the two primary measures (i.e. disease and exposure). How do we remove or control for the effect of the third measure? Issues

More information

Chapter 02. Basic Research Methodology

Chapter 02. Basic Research Methodology Chapter 02 Basic Research Methodology Definition RESEARCH Research is a quest for knowledge through diligent search or investigation or experimentation aimed at the discovery and interpretation of new

More information

Allowing for Missing Parents in Genetic Studies of Case-Parent Triads

Allowing for Missing Parents in Genetic Studies of Case-Parent Triads Am. J. Hum. Genet. 64:1186 1193, 1999 Allowing for Missing Parents in Genetic Studies of Case-Parent Triads C. R. Weinberg National Institute of Environmental Health Sciences, Research Triangle Park, NC

More information

THE UNIVERSITY OF OKLAHOMA HEALTH SCIENCES CENTER GRADUATE COLLEGE A COMPARISON OF STATISTICAL ANALYSIS MODELING APPROACHES FOR STEPPED-

THE UNIVERSITY OF OKLAHOMA HEALTH SCIENCES CENTER GRADUATE COLLEGE A COMPARISON OF STATISTICAL ANALYSIS MODELING APPROACHES FOR STEPPED- THE UNIVERSITY OF OKLAHOMA HEALTH SCIENCES CENTER GRADUATE COLLEGE A COMPARISON OF STATISTICAL ANALYSIS MODELING APPROACHES FOR STEPPED- WEDGE CLUSTER RANDOMIZED TRIALS THAT INCLUDE MULTILEVEL CLUSTERING,

More information

A Brief Introduction to Bayesian Statistics

A Brief Introduction to Bayesian Statistics A Brief Introduction to Statistics David Kaplan Department of Educational Psychology Methods for Social Policy Research and, Washington, DC 2017 1 / 37 The Reverend Thomas Bayes, 1701 1761 2 / 37 Pierre-Simon

More information

EPI 200C Final, June 4 th, 2009 This exam includes 24 questions.

EPI 200C Final, June 4 th, 2009 This exam includes 24 questions. Greenland/Arah, Epi 200C Sp 2000 1 of 6 EPI 200C Final, June 4 th, 2009 This exam includes 24 questions. INSTRUCTIONS: Write all answers on the answer sheets supplied; PRINT YOUR NAME and STUDENT ID NUMBER

More information

Statistical questions for statistical methods

Statistical questions for statistical methods Statistical questions for statistical methods Unpaired (two-sample) t-test DECIDE: Does the numerical outcome have a relationship with the categorical explanatory variable? Is the mean of the outcome the

More information

What is indirect comparison?

What is indirect comparison? ...? series New title Statistics Supported by sanofi-aventis What is indirect comparison? Fujian Song BMed MMed PhD Reader in Research Synthesis, Faculty of Health, University of East Anglia Indirect comparison

More information

Multilevel analysis quantifies variation in the experimental effect while optimizing power and preventing false positives

Multilevel analysis quantifies variation in the experimental effect while optimizing power and preventing false positives DOI 10.1186/s12868-015-0228-5 BMC Neuroscience RESEARCH ARTICLE Open Access Multilevel analysis quantifies variation in the experimental effect while optimizing power and preventing false positives Emmeke

More information

Confounding. Confounding and effect modification. Example (after Rothman, 1998) Beer and Rectal Ca. Confounding (after Rothman, 1998)

Confounding. Confounding and effect modification. Example (after Rothman, 1998) Beer and Rectal Ca. Confounding (after Rothman, 1998) Confounding Confounding and effect modification Epidemiology 511 W. A. Kukull vember 23 2004 A function of the complex interrelationships between various exposures and disease. Occurs when the disease

More information

W e have previously described the disease impact

W e have previously described the disease impact 606 THEORY AND METHODS Impact numbers: measures of risk factor impact on the whole population from case-control and cohort studies R F Heller, A J Dobson, J Attia, J Page... See end of article for authors

More information

The Australian longitudinal study on male health sampling design and survey weighting: implications for analysis and interpretation of clustered data

The Australian longitudinal study on male health sampling design and survey weighting: implications for analysis and interpretation of clustered data The Author(s) BMC Public Health 2016, 16(Suppl 3):1062 DOI 10.1186/s12889-016-3699-0 RESEARCH Open Access The Australian longitudinal study on male health sampling design and survey weighting: implications

More information

Causal Mediation Analysis with the CAUSALMED Procedure

Causal Mediation Analysis with the CAUSALMED Procedure Paper SAS1991-2018 Causal Mediation Analysis with the CAUSALMED Procedure Yiu-Fai Yung, Michael Lamm, and Wei Zhang, SAS Institute Inc. Abstract Important policy and health care decisions often depend

More information

Bias in regression coefficient estimates when assumptions for handling missing data are violated: a simulation study

Bias in regression coefficient estimates when assumptions for handling missing data are violated: a simulation study STATISTICAL METHODS Epidemiology Biostatistics and Public Health - 2016, Volume 13, Number 1 Bias in regression coefficient estimates when assumptions for handling missing data are violated: a simulation

More information

How to analyze correlated and longitudinal data?

How to analyze correlated and longitudinal data? How to analyze correlated and longitudinal data? Niloofar Ramezani, University of Northern Colorado, Greeley, Colorado ABSTRACT Longitudinal and correlated data are extensively used across disciplines

More information

Chapter 11: Advanced Remedial Measures. Weighted Least Squares (WLS)

Chapter 11: Advanced Remedial Measures. Weighted Least Squares (WLS) Chapter : Advanced Remedial Measures Weighted Least Squares (WLS) When the error variance appears nonconstant, a transformation (of Y and/or X) is a quick remedy. But it may not solve the problem, or it

More information

Using Direct Standardization SAS Macro for a Valid Comparison in Observational Studies

Using Direct Standardization SAS Macro for a Valid Comparison in Observational Studies T07-2008 Using Direct Standardization SAS Macro for a Valid Comparison in Observational Studies Daojun Mo 1, Xia Li 2 and Alan Zimmermann 1 1 Eli Lilly and Company, Indianapolis, IN 2 inventiv Clinical

More information

What s New in SUDAAN 11

What s New in SUDAAN 11 What s New in SUDAAN 11 Angela Pitts 1, Michael Witt 1, Gayle Bieler 1 1 RTI International, 3040 Cornwallis Rd, RTP, NC 27709 Abstract SUDAAN 11 is due to be released in 2012. SUDAAN is a statistical software

More information

Missing Data and Imputation

Missing Data and Imputation Missing Data and Imputation Barnali Das NAACCR Webinar May 2016 Outline Basic concepts Missing data mechanisms Methods used to handle missing data 1 What are missing data? General term: data we intended

More information

Differential Item Functioning

Differential Item Functioning Differential Item Functioning Lecture #11 ICPSR Item Response Theory Workshop Lecture #11: 1of 62 Lecture Overview Detection of Differential Item Functioning (DIF) Distinguish Bias from DIF Test vs. Item

More information

Performance of the Trim and Fill Method in Adjusting for the Publication Bias in Meta-Analysis of Continuous Data

Performance of the Trim and Fill Method in Adjusting for the Publication Bias in Meta-Analysis of Continuous Data American Journal of Applied Sciences 9 (9): 1512-1517, 2012 ISSN 1546-9239 2012 Science Publication Performance of the Trim and Fill Method in Adjusting for the Publication Bias in Meta-Analysis of Continuous

More information

MODEL SELECTION STRATEGIES. Tony Panzarella

MODEL SELECTION STRATEGIES. Tony Panzarella MODEL SELECTION STRATEGIES Tony Panzarella Lab Course March 20, 2014 2 Preamble Although focus will be on time-to-event data the same principles apply to other outcome data Lab Course March 20, 2014 3

More information

Supplement 2. Use of Directed Acyclic Graphs (DAGs)

Supplement 2. Use of Directed Acyclic Graphs (DAGs) Supplement 2. Use of Directed Acyclic Graphs (DAGs) Abstract This supplement describes how counterfactual theory is used to define causal effects and the conditions in which observed data can be used to

More information

Sample size determination for studies of gene-environment interaction

Sample size determination for studies of gene-environment interaction International Epidemiological Association 001 Printed in Great Britain International Journal of Epidemiology 001;30:1035 1040 Sample size determination for studies of gene-environment interaction JA Luan,

More information

CONDITIONAL REGRESSION MODELS TRANSIENT STATE SURVIVAL ANALYSIS

CONDITIONAL REGRESSION MODELS TRANSIENT STATE SURVIVAL ANALYSIS CONDITIONAL REGRESSION MODELS FOR TRANSIENT STATE SURVIVAL ANALYSIS Robert D. Abbott Field Studies Branch National Heart, Lung and Blood Institute National Institutes of Health Raymond J. Carroll Department

More information

Commentary SANDER GREENLAND, MS, DRPH

Commentary SANDER GREENLAND, MS, DRPH Commentary Modeling and Variable Selection in Epidemiologic Analysis SANDER GREENLAND, MS, DRPH Abstract: This paper provides an overview of problems in multivariate modeling of epidemiologic data, and

More information

Lecture II: Difference in Difference and Regression Discontinuity

Lecture II: Difference in Difference and Regression Discontinuity Review Lecture II: Difference in Difference and Regression Discontinuity it From Lecture I Causality is difficult to Show from cross sectional observational studies What caused what? X caused Y, Y caused

More information

observational studies Descriptive studies

observational studies Descriptive studies form one stage within this broader sequence, which begins with laboratory studies using animal models, thence to human testing: Phase I: The new drug or treatment is tested in a small group of people for

More information

Logistic Regression with Missing Data: A Comparison of Handling Methods, and Effects of Percent Missing Values

Logistic Regression with Missing Data: A Comparison of Handling Methods, and Effects of Percent Missing Values Logistic Regression with Missing Data: A Comparison of Handling Methods, and Effects of Percent Missing Values Sutthipong Meeyai School of Transportation Engineering, Suranaree University of Technology,

More information

Clinical Trials A Practical Guide to Design, Analysis, and Reporting

Clinical Trials A Practical Guide to Design, Analysis, and Reporting Clinical Trials A Practical Guide to Design, Analysis, and Reporting Duolao Wang, PhD Ameet Bakhai, MBBS, MRCP Statistician Cardiologist Clinical Trials A Practical Guide to Design, Analysis, and Reporting

More information

Today: Binomial response variable with an explanatory variable on an ordinal (rank) scale.

Today: Binomial response variable with an explanatory variable on an ordinal (rank) scale. Model Based Statistics in Biology. Part V. The Generalized Linear Model. Single Explanatory Variable on an Ordinal Scale ReCap. Part I (Chapters 1,2,3,4), Part II (Ch 5, 6, 7) ReCap Part III (Ch 9, 10,

More information

Epidemiologic challenges in the study of the efficacy and safety of medicinal herbs

Epidemiologic challenges in the study of the efficacy and safety of medicinal herbs Public Health Nutrition: 3(4A), 453±457 453 Epidemiologic challenges in the study of the efficacy and safety of medicinal herbs Lenore Arab* Departments of Nutrition and Epidemiology, University of North

More information

CLASSICAL AND. MODERN REGRESSION WITH APPLICATIONS

CLASSICAL AND. MODERN REGRESSION WITH APPLICATIONS - CLASSICAL AND. MODERN REGRESSION WITH APPLICATIONS SECOND EDITION Raymond H. Myers Virginia Polytechnic Institute and State university 1 ~l~~l~l~~~~~~~l!~ ~~~~~l~/ll~~ Donated by Duxbury o Thomson Learning,,

More information

Score Tests of Normality in Bivariate Probit Models

Score Tests of Normality in Bivariate Probit Models Score Tests of Normality in Bivariate Probit Models Anthony Murphy Nuffield College, Oxford OX1 1NF, UK Abstract: A relatively simple and convenient score test of normality in the bivariate probit model

More information

THE APPLICATION OF ORDINAL LOGISTIC HEIRARCHICAL LINEAR MODELING IN ITEM RESPONSE THEORY FOR THE PURPOSES OF DIFFERENTIAL ITEM FUNCTIONING DETECTION

THE APPLICATION OF ORDINAL LOGISTIC HEIRARCHICAL LINEAR MODELING IN ITEM RESPONSE THEORY FOR THE PURPOSES OF DIFFERENTIAL ITEM FUNCTIONING DETECTION THE APPLICATION OF ORDINAL LOGISTIC HEIRARCHICAL LINEAR MODELING IN ITEM RESPONSE THEORY FOR THE PURPOSES OF DIFFERENTIAL ITEM FUNCTIONING DETECTION Timothy Olsen HLM II Dr. Gagne ABSTRACT Recent advances

More information

Bayesian Latent Subgroup Design for Basket Trials

Bayesian Latent Subgroup Design for Basket Trials Bayesian Latent Subgroup Design for Basket Trials Yiyi Chu Department of Biostatistics The University of Texas School of Public Health July 30, 2017 Outline Introduction Bayesian latent subgroup (BLAST)

More information

Midterm Exam ANSWERS Categorical Data Analysis, CHL5407H

Midterm Exam ANSWERS Categorical Data Analysis, CHL5407H Midterm Exam ANSWERS Categorical Data Analysis, CHL5407H 1. Data from a survey of women s attitudes towards mammography are provided in Table 1. Women were classified by their experience with mammography

More information

Propensity Score Methods for Estimating Causality in the Absence of Random Assignment: Applications for Child Care Policy Research

Propensity Score Methods for Estimating Causality in the Absence of Random Assignment: Applications for Child Care Policy Research 2012 CCPRC Meeting Methodology Presession Workshop October 23, 2012, 2:00-5:00 p.m. Propensity Score Methods for Estimating Causality in the Absence of Random Assignment: Applications for Child Care Policy

More information

Breast Cancer in First-degree Relatives and Risk of Lung Cancer: Assessment of the Existence of Gene Sex Interactions

Breast Cancer in First-degree Relatives and Risk of Lung Cancer: Assessment of the Existence of Gene Sex Interactions Breast Cancer in First-degree Relatives and Risk of Lung Cancer: Assessment of the Existence of Gene Sex Interactions Masaki Tsuchiya 1, Motoki Iwasaki 1, Tetsuya Otani 1, Jun-ichi Nitadori 1, Koichi Goto

More information

MULTIPLE LINEAR REGRESSION 24.1 INTRODUCTION AND OBJECTIVES OBJECTIVES

MULTIPLE LINEAR REGRESSION 24.1 INTRODUCTION AND OBJECTIVES OBJECTIVES 24 MULTIPLE LINEAR REGRESSION 24.1 INTRODUCTION AND OBJECTIVES In the previous chapter, simple linear regression was used when you have one independent variable and one dependent variable. This chapter

More information

A re-randomisation design for clinical trials

A re-randomisation design for clinical trials Kahan et al. BMC Medical Research Methodology (2015) 15:96 DOI 10.1186/s12874-015-0082-2 RESEARCH ARTICLE Open Access A re-randomisation design for clinical trials Brennan C Kahan 1*, Andrew B Forbes 2,

More information

CHL 5225 H Advanced Statistical Methods for Clinical Trials. CHL 5225 H The Language of Clinical Trials

CHL 5225 H Advanced Statistical Methods for Clinical Trials. CHL 5225 H The Language of Clinical Trials CHL 5225 H Advanced Statistical Methods for Clinical Trials Two sources for course material 1. Electronic blackboard required readings 2. www.andywillan.com/chl5225h code of conduct course outline schedule

More information