Ordered Subset Analysis in Genetic Linkage Mapping of Complex Traits

Size: px
Start display at page:

Download "Ordered Subset Analysis in Genetic Linkage Mapping of Complex Traits"

Transcription

1 Genetic Epidemiology 27: (2004) Ordered Subset Analysis in Genetic Linkage Mapping of Complex Traits Elizabeth R. Hauser, 1n Richard M. Watanabe, 2 William L. Duren, 3 Meredyth P. Bass 1 Carl D. Langefeld 4 and Michael Boehnke 3 1 Section of Medical Genetics, Department of Medicine, Center for Human Genetics, Duke University Medical Center, Durham, North Carolina 2 Division of Biostatistics, Department of Preventive Medicine, USC Keck School of Medicine, Los Angeles, California 3 Department of Biostatistics, School of Public Health, University of Michigan, Ann Arbor, Michigan 4 Section on Biostatistics, Department of Public Health Sciences, Wake Forest University of Medicine, Winston-Salem, North Carolina Etiologic heterogeneity is a fundamental feature of complex disease etiology; genetic linkage analysis methods to map genes for complex traits that acknowledge the presence of genetic heterogeneity are likely to have greater power to identify subtle changes in complex biologic systems. We investigate the use of trait-related covariates to examine evidence for linkage in the presence of heterogeneity. Ordered-subset analysis (OSA) identifies subsets of families defined by the level of a traitrelated covariate that provide maximal evidence for linkage, without requiring a priori specification of the subset. We propose that examining evidence for linkage in the subset directly may result in a more etiologically homogeneous sample. In turn, the reduced impact of heterogeneity will result in increased overall evidence for linkage to a specific region and a more distinct lod score peak. In addition, identification of a subset defined by a specific trait-related covariate showing increased evidence for linkage may help refine the list of candidate genes in a given region and suggest a useful sample in which to begin searching for trait-associated polymorphisms. This method provides a means to begin to bridge the gap between initial identification of linkage and identification of the disease predisposing variant(s) within a region when mapping genes for complex diseases. We illustrate this method by analyzing data on breast cancer age of onset and chromosome 17q [Hall et al., 1990, Science 250: ]. We evaluate OSA using simulation studies under a variety of genetic models. & 2004 Wiley-Liss, Inc. Key words: linkage analysis; genetic heterogeneity; covariate; complex traits Grant sponsor: NIH; Grant numbers: MH59528, DK09525, HG00040, HG00376; Grant sponsor: ADA. n Correspondence to: Elizabeth R. Hauser, Ph.D., Section of Medical Genetics, Department of Medicine, Center for Human Genetics, Duke University Medical Center, Box 3445, Durham, NC Elizabeth.Hauser@duke.edu Received for publication 27 August 2002; revision accepted 8 September 2003 Published online 8 April 2004 in Wiley InterScience ( DOI /gepi INTRODUCTION Complex diseases, such as type 2 diabetes, cancer, hypertension, and cardiovascular disease, are characterized by a multifactorial etiology. Genes and environment likely interact in a complex fashion to cause disease. Presumably, development of a complex disease is a result of perturbations in biological pathways. Genetic polymorphisms may affect various components of these pathways resulting in differential effects on quantitative measurements related to the complex disease. Detecting the effects of genes and their genetic polymorphisms on a complex trait may require taking intermediate traits or covariates into account so that the susceptibility gene is detectable in a subset of the families. This model suggests that genetic heterogeneity will be a fundamental feature of complex disease etiology. Furthermore, genetic linkage analysis methods to map genes for complex traits should acknowledge the presence of genetic heterogeneity to maximize power to identify subtle changes in complex biologic systems. The large genome screens for the genetic analysis of complex traits provide tremendous amounts of information. These screens typically produce lod scores or estimated IBD sharing for hundreds of families typed for hundreds of genetic markers across the genome. This vast amount of information has not often translated directly into impressive lod scores, presumably & 2004 Wiley-Liss, Inc.

2 54 Hauser et al. because of etiologic complexity. The presence of multiple independent and/or interacting disease genes and environmental factors causes significant problems for genetic linkage analysis. Etiologic heterogeneity reduces power to detect linkage, but explicitly taking heterogeneity into account can help minimize this information loss [Arnett et al., 1997; Lunetta and Rogus, 1998; Greenwood and Bull, 1999; Leal and Ott, 2000]. Even when it is possible to detect linkage to a region, etiologic heterogeneity can bias the estimate of gene location and result in broader confidence intervals [Ott, 1983; Falk, 1997; Chiano and Yates, 1994]. Several linkage analysis methods have been proposed to recognize and correct for heterogeneity. The most popular is the admixture test and its extensions, which allow for two or more diseasecausing loci [Ott, 1983; Bhat et al., 1999]. This test uses evidence for or against linkage at a marker to suggest the presence of heterogeneity and may not be particularly sensitive when the sample is composed of small families, such as those usually used for late-onset diseases [Falk, 1997]. Another method to assess genetic heterogeneity is the pre-divided sample test [Morton, 1956]. When a priori evidence for genetic heterogeneity exists, subsets of families may be defined prior to analysis and examined for differences in the evidence for linkage. This idea was applied to a genetic linkage study of 26 Finnish families from the Bothnia region in which no evidence for linkage was obtained in the overall sample but suggestive evidence for linkage was obtained in families in the lowest quartile of 30-minute insulin [Mahtani et al., 1996]. While it may be straightforward to select trait-related covariates based on biological or clinical considerations, the value of the covariate to choose as a cutoff in defining a genetically homogeneous subset often is not obvious. One way to avoid having to define arbitrary a priori cutoffs is to use a function of a covariate to rank the families and then to add in the families one or a few at a time based on the covariate. This idea was exploited in the initial report of the linkage of breast and ovarian cancer to the gene subsequently identified as BRCA1 [Hall et al., 1990; Merette et al., 1992; Miki et al., 1994; Futreal et al., 1994]. Hall et al. [1990] ranked the families by mean age of onset of breast cancer and plotted the cumulative sum of the lod scores. The results indicate that the evidence for linkage was concentrated in the families with early age of onset. Recognition of this key feature was important in subsequent studies to narrow the region of linkage and to identify BRCA1. Had age of onset been ignored, linkage to chromosome 17 might have been excluded as the overall lod score was substantially negative due to the heterogeneous mixture of families in the sample. We expand upon the idea of using a trait-related covariate to examine evidence for linkage in the presence of heterogeneity. Our ordered subset method identifies subsets of families defined by level of a traitrelated covariate that provide maximal evidence for linkage, without requiring a priori specification of the subset. The significance of the subset and its evidence for linkage is evaluated using a permutation procedure to estimate empirical P values. We propose that examining evidence for linkage in the subset directly can result in a more etiologically homogeneous sample. In turn, the reduced impact of heterogeneity may result in increased overall evidence for linkage to a specific region and the potential to produce a more distinct lod score peak with more precise gene localization. The identified ordered subset may be a useful sample in which to begin searching for trait-associated polymorphisms or mutations through sequencing or other fine-mapping techniques. Identification of a trait-related covariate using this ordered subset analysis (OSA) also may help refine the list of candidate genes in a given region based on biological pathways, which connect the covariate and the trait. In this report, we apply OSA to linkage results of breast cancer to D17S74 as published by Hall et al. [1990]. We present the results of a simulation study evaluating type I error rates and power for a variety of genetic models consistent with a complex genetic trait. We find that OSA preserves the appropriate type I error rates. Furthermore, compared to analyses that do not use the covariate information but do allow for heterogeneity among families, OSA has substantially greater power to detect ordered subsets of families defined by a covariate. METHODS Ordered subset analysis (OSA) may be used at any time during a linkage analysis. The requirements are additive linkage statistics and covariates for each family. Typically, use of this method will begin when the genome screen linkage analysis has been completed. We assume in what follows that the results of the linkage analysis are

3 Linkage Analysis in Ordered Subsets 55 lod scores for all families, although other additive test statistics could also be used (see Discussion). The input for OSA is the lod scores Z i (d,g) for family i, at position d where d may range across a region, chromosome, or the genome, and g represents the genetic model. We will refer to the maximum of the sum of the lod scores over all families as the overall baseline lod score, Z(d,g). The genetic model may take a variety of forms from completely unspecified, to a vector of IBD sharing probabilities or recurrence risk ratios (l) or a full Mendelian model with a genetic model parameter set including penetrances and allele frequencies. Depending on the analysis method used for generating the family-specific lod scores, the genetic model may include reduced penetrance, liability classes, or phenocopies. The covariates may be one or more continuous or ordinal variables thought to be related to the trait of interest and are often a function of trait values for members in the family. Examples of covariates include mean, median, or minimum covariate values for affected individuals in the family. We use the term covariate very generally to include disease-related quantitative traits as well as IBD sharing or linkage statistics at another locus. FINDING THE MAXIMUM ORDERED SUBSET STATISTIC We begin by ordering the families based on some covariate value. Let Z (j) (d, g) be the lod score matrix for ordered family j. For example, if we are ordering families for low to high mean age of onset, we would start with the family with the lowest mean age of onset, with lod score matrix Z (1) (d, g). We find the maximum lod score in that family and the estimates of b d 1 andbg 1 at which the maximum occurs. Next we use element-wise addition to add the matrix for the family with the next lowest mean age of onset, Z (2) (d, g), to the matrix for the first family. In general, we create the jth partial sum by adding each element of the lod score matrix for each family up to and including ordered family j: Z j ðd; gþ ¼ Xj k¼1 Z ðkþ ðd; gþ After the addition of each family j, we note the maximum summed lod score, Z j ð b d j ;bg j Þ and estimates of the disease location ð b d j Þ and disease model parameters ðbg j Þ. After all of the N families have been added in turn, we have maxima for each partial sum of the lod scores ðz 1 ð b d 1 ;bg 1 Þ;...; ðz N ð b d N ;bg N ÞÞ, ordered by the family-specific covariate value. Note that the values of d and g can and often are maximized at different values depending on the subset of families. We define the maximum ordered subset lod score Z ð b d ;bg Þ¼max j h Z j ð b i d j ;bg j Þ and the D lod score D ¼ Z ð b d ;bg Þ Zð b d ;bg Þ to be the difference between the maximum ordered subset lod score and the lod score summed over all families at the same position and model. Since a genetically more homogeneous subset may be identified either through high values or low values of a covariate, we generally perform the summation in both ascending and descending order unless there is strong prior evidence to consider only one order or the other. EVALUATING LINKAGE EVIDENCE IN SUBSETS OF FAMILIES Given the maximization over subsets procedure we have employed, the maximum ordered subset lod score will always be at least as large as the overall lod score. Therefore, the distribution of the maximum ordered subset lod score cannot be the same as the distribution of the lod score without this maximization. The observation of a large lod score in a subset of the families must be evaluated in the context of the evidence for linkage in the entire sample and the fact that we have selected a subset of the data. To assess significance of linkage evidence for a subset conditional on linkage evidence in the entire sample, we employ a permutation strategy to test the null hypothesis that there is no relationship between the trait-related covariate and the evidence for linkage. In the permutation test procedure, we assess the significance of the increase in the lod score in the identified ordered subset compared to the baseline lod score in all families. We sample from the distribution of all N! permutations by randomly ordering the N families and calculating the maximum ordered subset lod score for each permutation p, Z p ðd p ; g p Þ, in exactly the same way as for the original data. We estimate the P value for the ordered subset lod score as the proportion of permutations giving maximum ordered subset lod scores as large or larger than that observed. The permutation test

4 56 Hauser et al. can be used to estimate P values when maximizing the ordered subset lod score at a single point, for a chromosomal region or whole chromosome, or for the entire genome. An assumption of permutation tests is that the observations to be permuted are exchangeable. When all families are of the same size and structure, this assumption is met. When families are of different sizes or structures, the P values are approximate. COMPUTATIONAL EFFICIENCY While this analysis and particularly the requirements of the permutation test may appear computationally burdensome, this burden can be minimized. If the number of family-positionmodel-specific lod scores Z i (d,g) is not too large, these lod scores can be calculated once and stored. OSA then may be performed for many different functions of covariates and for as many permutations as desired by simply reordering the families and carrying out the summation and maximization steps. We routinely compute permutation tests for analyses with over 500 families, 400 positions, and 25 disease models in this way. To devote more computation time to assessing significance of the most interesting results, we have implemented an inverse-sampling method for empirical P value estimation. We set the maximum number of permutations desired as a parameter for the empirical P value calculation. We continue sampling until the estimated variance of the empirical P value bp is sufficiently small so that 1 bp 2 is less than the 95% lower bound for bp or until the maximum number of permutations is reached. This procedure uses the Poisson approximation to the Binomial for small P. The variance of bp is evaluated after every 10th permutation replicate. If bp exceeds this boredom threshold, the permutation routine stops before the maximum number of permutations is reached. We also follow the recommendation of North et al. [2002] that the number of permutations giving a larger statistic than the observed statistic be 410. Thus, we can set a high number of maximum permutations but only reach that maximum when trying to estimate small P values, increasing computational efficiency. APPLICATION TO BREAST CANCER DATA We applied OSA to the data published by Hall et al. [1990] in 23 families ascertained for multiple cases of breast cancer. We used the data presented in their table I that includes, for each family, the mean age of onset of breast cancer in the affected family members and lod scores at five positions relative to marker D17S74. We used their figures 1 and 2 to define minimum age of onset, maximum age of onset, and range of age of onset for each family. Hall et al. [1990] considered only one genetic model, so that we stored only five lod scores (one for each position) for each family. SIMULATION STUDIES We performed a simulation study to evaluate the false-positive rate of OSA in affected sib pair (ASP) linkage analysis using the recurrence risk ratio to sibs l to describe the genetic effect [Risch 1990a c; Hauser and Boehnke 1998]. As seen in power studies of ASP linkage analysis, l is a critical factor in determining the power of the linkage analysis [Hauser et al., 1996]. We used the simulation package SIMLA that we developed for simulation of complex genetic traits incorporating both linkage and association [Bass et al., 2004]. We simulated ASPs with genotyped parents under disease models with locus-specific l values of 1.2 and 1.4. We first simulated data sets to represent two null hypotheses: (1) no linkage in any of the ASPs, l¼1, and 2) linkage (1.5olo5.0) in 20 50% of the ASPs but no difference in the covariate distribution between the linked and unlinked subsets (OSA null hypothesis of no covariate effect). We simulated a covariate value for each individual from a normal distribution. To evaluate the power of OSA to identify a linked subset, we simulated a subset in which the covariate values were chosen from a mixture of normal distributions with means different in the linked and unlinked ASPs. Table I shows the parametric models used to perform the simulations to examine power. These models were chosen to yield population prevalences of B5% as well as to cover a range of models for complex diseases. Two values of l are given for each model, the overall l for the combined sets of families and the l in the linked subset. These values allow comparisons between different genetic models resulting in the same l values. We simulated genotype data for a 90-cM map of 10 markers evenly spaced at 10-cM intervals. In simulations with l41, we placed the disease locus at 45 cm. We simulated 5,000 replicate data sets of 400 ASPs for each condition to examine type I error rates under the two null hypotheses and 500 replicate data sets for the power studies.

5 Linkage Analysis in Ordered Subsets 57 We performed two linkage analyses of the simulated data sets to generate family-specific lod scores for input into OSA. First, we carried out multipoint ASP based on the IBD sharing method of Risch [1990a c] as implemented in Siblink [Hauser and Boehnke, 1998]. This method provides OSA input files of lod scores calculated at a grid of map positions and l values assuming an additive genetic model. Second we performed two-point parametric linkage analysis with Vitesse [O Connell and Weeks, 1995] using the parametric model from the linked subset to calculate the family-specific lod scores at a marker 5 cm from the true disease location. We used these family-specific lod-scores as input to the HOMOG package [Ott, 1983, 1999], which implements the admixture test. HOMOG performs nested likelihood ratio tests for linkage with or without genetic heterogeneity and provides a w 2 test statistic for the hypotheses of linkage with homogeneity among the families (yo0.5, a¼1) vs. linkage with heterogeneity (yo0.5, ao1). To assess whether the incorporation of a trait-related covariate would provide information related to heterogeneity among the families above and beyond differences in the maximum lod scores, we compared the power of OSA with the power of the test for heterogeneity provided by the admixture test. RESULTS BREAST CANCER EXAMPLE Table II shows the results of OSA applied to the linkage results for breast cancer and chromosome 17q [Hall et al., 1990] that localized BRCA1 and ultimately led to the identification of BRCA1 [Miki et al., 1994]. OSA replicates the results presented in the paper with a maximum subset lod score of 5.98 and D lod of occurring in the 7 families with youngest mean age of onset. Hall et al. [1990] did not estimate a P value for their result. The approximate empirical P value for this subset from OSA is , suggesting that the result would be highly unlikely if there were no relationship between age of onset and evidence for linkage. Lowest maximum age of onset and range of age of onset in the family also gave significant improvements in linkage evidence with lod scores of 5.10 and 4.66 (D lods of and 11.14, P values of.02 and.01, respectively). TABLE I. Genetic models used for the simulation studies a Penetrances Model number l l in linked subset Genetic model Proportion of families in the linked subset Disease allele frequency P(A) P(D AA) P(D Aa) P(D aa) Dominant Recessive Dominant Recessive Additive Additive Additive a l is the locus-specific recurrence risk ratio in the entire sample. Penetrances are expressed as the probability of disease (D) given the genotype. The disease prevalence was constrained to be ~0.05. TABLE II. Results of ordered subset analysis applied to breast cancer linkage data for 23 families and marker D17S74 [Hall et al. 1990] for four age-related covariates Subste Age of onset covariate Order b y Maximum lod D lod at b y Empirical P-value Families in the subset Mean Ascending Descending Minimum Ascending ,7 10,13,15,16 Descending Maximum Ascending ,7,8 Descending Range Ascending ,3 7 Descending

6 58 Hauser et al. TABLE III. Type I error rates: simulation results for nonparametric multipoint linkage analysis with 400 ASPs using ordered subset analysis under no linkage (k¼l), and no covariate effect with 20 50% of the families linked (k41) a Model number l/linked l model Mean overall lod score Mean maximum subset lod score Mean D lod Mean proportion of families in subset Pðbpo:05Þ Pðbpo:01Þ F /5.0 D /4.0 R /2.5 D /2.5R /2.5 A /1.5 A /2.3 A a For the simulations with l41, the trait-related covariate was distributed as a normal random variable. Models are described in Table I; 5,000 replicates were simulated per model. TABLE IV. Power: simulation results for nonparametric multipoint linkage analysis with 400 ASPs using ordered subset analysis with a proportion of the families linked (k41) a Model number l/linked l model Mean overall lod score Mean maximum subset lod score Mean D lod Mean proportion of families in subset Pðbpo:05Þ Pðbpo:01Þ 1 1.2/5.0 D /4.0R /2.5 D /2.5 R /2.5 A /1.5 A /2.3 A a The trait-related covariate was distributed as a mixture of normals with means two standard deviations apart in the linked and unlinked subsets. Models are described in Table I, 500 replicates were simulated per model. The families included in these subsets are very similar to those included for the mean age subset, reflecting the high correlation in these familyspecific functions of age of onset. The minimum age of onset gave a nonsignificant result with a maximum subset lod score of 3.03, D lod of 1.49, and P value¼0.14. No subsets for any of the agerelated covariates showed an increase in the lod score when the 23 families were ordered from oldest to youngest or largest to smallest maximum, minimum, or range so that the results for these analyses are the same as the results for the complete set of families. SIMULATION STUDY Table III shows the results of a simulation study using multipoint affected sib pair linkage analysis to assess the overall behavior of OSA under no linkage (l¼1) and linkage (l41) but no covariate effect, i.e., the possible null hypotheses for OSA. As expected, the mean maximum subset lod score is larger than the overall lod score in every case. The D lod measures the difference between the maximum subset lod score and the overall baseline lod score at that map position and for that model. For the no linkage and no covariate effect simulations we performed, the mean D lod scores were 41 lod unit. The type I error rates estimated as the proportion of replicates with empirical P values o0.05 and o0.01 are close to the nominal levels for both the no linkage and no covariate effect cases. This observation suggests that, as expected, the permutation approach adequately adjusts for the increase in the lod score when maximizing over subsets. Table IV presents the power of OSA using the multipoint nonparametric affected sibling pair lod scores generated by Siblink for the genetic models listed in Table I. In these models, 20 50% of the families are linked; the covariate distribution is a mixture of normals with means two standard deviations apart in linked and unlinked families. The power of OSA is estimated by the proportion of replicates with empirical P values bp less than 0.05 or Not surprisingly, the power increases as the l in the linked families increases. We applied a variety of different genetic models, holding the l values constant. As discussed by Olson [1995], the underlying genetic model can

7 Linkage Analysis in Ordered Subsets 59 TABLE V. Simulation results for two-point parametric linkage analysis with 400 ASPs using ordered subset analysis and the Admixture Test implemented in HOMOG with a subset of the families linked (k41) a Model number l/linked l model Mean max HLOD Mean chi-square for heterogeneity Pðbpo:05Þ Mean overall lod score Mean maximum subset lod score Pðbpo:05Þ 1 1.2/5.0 D /4.0 R /2.5 D /2.5 R /2.5 A /1.5 A /2.3 A a The trait-related covariate was distributed as a mixture of normals with means two standard deviations apart in the linked and unlinked subsets. The generating model was used for the linkage analysis. Models are described in Table I, 500 replicates were simulated per model. have a significant impact on the power to detect linkage using nonparametric affected sibling pair analysis. The OSA results reflect this difference as seen by comparing the maximum lod scores, maximum subset lod scores, and the OSA power for genetic models with overall l¼1.2 or with linked subset l¼2.5. The recessive models provide more evidence for linkage across all measures. However, even in the dominant and additive models, when the subset is large (proportion of linked families 50%) or when the linked subset l is large (lz2.5), OSA had greater than 78% power at significance level 0.05 to detect the increased evidence for linkage in the subset using the traitrelated covariate. Table V compares the power of OSA to the power of the admixture test as implemented in HOMOG for detecting heterogeneity among the families. The parametric model from the linked subset was used as the analysis model to provide the most information possible for detecting linkage in these small families. The power of the test of heterogeneity in the admixture model as implemented in HOMOG is considerably lower for all genetic models tested for these small families. OSA, using the same family-specific parametric two-point lod scores, provides substantially greater power than that observed with the admixture test to detect heterogeneity using a trait-related covariate. As in the multipoint nonparametric case, the power varies substantially across models. DISCUSSION We propose ordered subset analysis (OSA) as a means to address the etiologic and genetic heterogeneity in the analysis of complex genetic traits. The goal of OSA is to identify genetically more homogeneous subsets and to refine disease gene location estimates. This method is a fast and simple way to identify potentially interesting subsets of families and to help prioritize the list of candidate genes in a region. It can provide direction for follow-up and confirmatory analyses. This work grew out of our attempts to perform stratified linkage analysis on pre-specified subsets based on disease-related covariates. When there is solid a priori evidence for defining strata, procedures such as the predivided sample test [Morton, 1955] should be used. However, when the strata are less obvious it is difficult to choose and defend cutpoints and even more difficult to interpret the results. The advantage of OSA is that it does not require pre-specified cutpoints but allows choices based on the data and then performs a permutation test to evaluate the evidence for linkage in the context of the results for the entire sample. OSA is quite flexible; d and g can be held fixed to highlight a specific location, d, or model, g. In addition, ðzð b d 1 ;bg 1 Þ... Z N ð b d N ;bg N ÞÞ can be plotted against the covariate values for the families to get a graphic representation of the change in the subset lod scores as families are added [Hall et al., 1990]. OSA can use a variety of covariates, including evidence for linkage at other loci as suggested by Cox et al. [1999], and provides empirical P values for these conditional analyses. OSA is not limited to a single method of linkage analysis but may be applied wherever grids of additive linkage statistics by family can be obtained [Watanabe et al., 1999]. For example, Z i (d,g) may be any additive statistic including lod scores, w 2 values, or squared normal Z-scores. The identification of the BRCA1 gene provided an important example of the challenges inherent in mapping complex human disease and of the successes that could be obtained by careful

8 60 Hauser et al. examination of the data. The small empirical P value obtained for the families with low mean age of onset clearly rejected the null hypothesis that these families are a random sample from families with breast cancer. In reality, no P value was necessary to identify the obvious clustering of linkage evidence in the seven families with the earliest mean age of onset of breast cancer [Hall et al., 1990]. However, applying this paradigm to studies of common diseases has been somewhat more difficult because of the greater degree of genetic heterogeneity and the smaller genetic effects of single genes accompanied by the much larger number of families required to detect them. Thus, we sought a flexible test to identify homogeneous subsets for further study. OSA IN A GENOME SCREEN An appealing feature of OSA in the genome scan context is the ability to re-estimate the disease gene location in the subset. When OSA is used on genome screen data, the entire multipoint lod score curves for each chromosome are used to define the maximum subset lod score. Our OSA software provides plot-ready files so that these multipoint curves may be compared in the subset and in the overall group. By re-estimating the gene location in the maximum ordered subset, it may be possible to narrow a large region of linkage to something more manageable, with a narrower one-lod down support interval [Ghosh et al., 2000; Shao et al., 2003]. It may also happen that the maximum likelihood disease gene location in the subset may be quite different from the maximum likelihood disease gene location in the entire sample. In some cases, it may be desirable to compare the OSA maximum subset lod score at a particular point on a map, say at the estimated disease gene location of the maximum in the entire sample, to understand the contribution of the covariate to evidence for linkage at that point. Our OSA software allows for analysis of a particular point and reports empirical P values for the test at that point. INTERPRETATION OF THE MAXIMUM SUBSETS LOD SCORE The maximum ordered subset lod score must be at least as large as the maximum lod score in the original sample. Thus, the value of the maximum subset lod score is sensitive to the linkage evidence in the total sample. If there is no evidence for linkage in the total sample, the empirical P value may be quite significant for a subset showing only moderate evidence for linkage. If there is substantial and perhaps diffuse evidence for linkage, the P values for the maximum subset lod score may not be significant. While the P values may be disappointing in themselves, the resulting subset may be useful in suggesting a group for further study or for refining the location in a diffuse area of linkage. Thus, it is important to consider the significance of a given result in the context of the overall evidence for linkage. Cox et al. [1999] propose examining the difference in the overall and conditional lod scores (D lod) to evaluate the effects of epistasis or genetic heterogeneity. As we have shown, that idea may also be applied in OSA. COMPARISON TO OTHER METHODS We compared the results of OSA to the results of the admixture test for heterogeneity as implemented in HOMOG [Ott, 1983, 1999]. These results demonstrate that when a covariate related to evidence for linkage is identified, power to detect linkage in subsets can be enhanced using OSA, even when the overall genetic effect is low or when the families are small. It is to be expected that the power to detect the heterogeneity and to identify a subset using the admixture test implemented in HOMOG, which uses only the evidence for linkage in each family, would be low in the case of ASPs. When the families are larger and more variability across the familyspecific lod scores is possible, as in the breast cancer example, the admixture test will have increased power to detect heterogeneity. However, for the family samples collected to study common complex genetic traits, OSA provides an additional means of identifying a subset of families with increased evidence for linkage by accumulating that evidence across families with similar covariate levels. There have been a number of other recent methods proposed to identify and control for genetic heterogeneity. The conditional logistic regression method of Olson [1999; Olson et al., 2001] allows for modeling the relationship between evidence for linkage and dependence on covariates. Goddard et al. [2001] showed how this conditional logistic approach can be applied to a genome screen and how the lod score, as a function of a likelihood ratio test on proposed models, can be increased as covariates are

9 Linkage Analysis in Ordered Subsets 61 included in the model. Langefeld and colleagues [Langefeld et al., 2001; Davis et al., 2001] propose nonparametric linkage (NPL) regression to allow conditional or simultaneous tests of multiple genetic loci, phenotypic or environmental covariates and their interactions. Bull et al. [2002] propose modeling ASP IBD sharing probabilities using covariates, including examination of experiment-specific covariates such as plate assignment. Devlin et al. [2002] apply mixture modeling methods, similar to the original heterogeneity models proposed by Smith [1963]. These modeling approaches can be powerful for examining welldefined models for specific genes or genetic locations including covariates. OSA is a simpler approach that can be easily applied to a wide variety of covariates. As a screening tool, OSA does not require a particular location or genetic model to be chosen and, thus, provides greater flexibility in detecting heterogeneity when heterogeneity might not be otherwise suspected on the basis of overall lod scores. SUITABLE COVARIATES For many complex traits, there are multiple choices of trait-related covariates to use in OSA. This method is well suited to any quantitative covariate, either continuous or ordered. The covariate may be adjusted by variables such as age and gender if they are known to influence the value of the covariate. Analyses of data from the FUSION study of the genetics of type 2 diabetes used the mean of the covariate value in all affected sibs to order the families [Ghosh et al., 2000]. Other summary statistics such as the sibship median, minimum, or maximum could be used. Complex functions of the data could be used, including results of data reduction techniques, such as principal components or cluster analysis [Merette et al., 1999; Shao et al., 2003] or residuals from fitting a linear model [Tores et al., 1999]. OSA may be applied to an ordinal categorical covariate such as number of affected family members or the proportion of family members with the specific characteristic, such as presence or absence of a specific allele, or evidence for linkage at another locus [see also Cox et al., 1999]. MULTIPLE COMPARISONS We control for the inflation in the false positive rate induced by examining multiple family subsets for a given covariate by generating empirical P values using a permutation test, which appears to give the proper type 1 error rate in limited simulations. However, we have not controlled for ordered subset comparisons over multiple traitrelated covariates or multiple regions of the genome as conditioning loci. It is likely that correlation between some trait-related covariates will be operating. This suggests that a Bonferronitype correction of the P value will be very conservative. Due to the exploratory nature of the analysis, we do not feel compelled to apply a correction for multiple comparisons. We strive for consistency of results across the various analyses within our study as well as comparisons of results to analyses performed by other groups. In the absence of a correction for multiple comparisons, we stress the hypothesis-generating nature of these results and the need for follow-up. POWER VS. SAMPLE SIZE In analyses that subset the total data set, there is an implicit tradeoff between the power to detect a given effect and the decreasing sample size in the subset. Leal and Ott [2000] show that if the relative risk in the subset is close to one, then the marginal increase in the genetic effect in the strata does not offset the decrease in power due to the reduced sample size. In this case, subsetting the data will not help and only a large increase in sample size will provide sufficient power to detect small genetic effects. In what we present here, we presume that the increase in the apparent genetic effect afforded by increasing homogeneity in the subset will offset the loss of power induced by the reduced sample size [Kovac et al., 1999]. Another factor that impacts power for OSA is the correlation between the evidence for linkage and the levels of the covariate. Our simulations used a large (two standard deviation) difference in the means for the linked and unlinked ASPs. Additional simulations with a one standard deviation difference in the means for linked and unlinked ASPs suggest that, while there is a decrease in the power, it is not substantial (e.g., for 2SD vs for 1SD in the 1.2/2.5D model in Table IV). We are continuing to explore sensitivity of OSA to the covariate values in additional simulations. So far, our power studies suggest that OSA has reasonable power to detect a subset with increased evidence for linkage when using a traitrelated continuous covariate. As a result OSA appears to be a useful tool in the set of statistical methods for the analysis of complex genetic traits.

10 62 Hauser et al. SUMMARY We have developed the ordered subset method to use trait-related covariates to identify a more genetically homogeneous sample. We generate an empirical p value to test the hypothesis that the families providing the maximum subset evidence for linkage are a random sample from all possible ordered subsets. We believe that such analyses will help identify homogenous sets of families that will, in turn, provide better estimates of a disease gene location. These families along with the trait-related covariates may help prioritize candidate genes in regions of linkage in these families. We have developed software to perform these analyses, which is available from the authors via the internet. ACKNOWLEDGMENTS We thank our FUSION colleagues for data and encouragement. We thank Chun Li, Heather Stringham, Julie Douglas, Mike Epstein, Margaret Pericak Vance, Eden Martin, and Bill Scott for helpful discussions. E.R.H. is supported by NIH grant MH R.M.W. was previously supported by an individual NSRA from the NIH (DK09525) and a Career Development Award from the ADA and is currently supported by an ADA Research Award. W.L.D. and C.D.L. were supported by NIH training grant HG M.B. is supported by NIH grant HG ELECTRONIC DATABASE INFORMATION OSA, SIBLINK and SIMLA software: wwwchg.mc.duke.edu/software/index.html REFERENCES Arnett DK, Pankow JS, Atwood LD, Sellers TA Impact of adjustments for multiple phenotypes on the power to detect linkage. Genet Epidemiol 14: Bass MP, Martin ER, Hauser ER Pedigree generation for analysis of genetic linkage and association. Proceedings of the Pacific Symposium on Biocomputing 2004: Bhat A, Heath SC, Ott J Heterogeneity for multiple disease loci in linkage analysis. Hum Hered 49: Bull SB, Greenwood CM, Mirea L, Morgan K Regression models for allele sharing: analysis of accumulating data in affected sib pair studies. Stat Med 21: Chiano MN, Yates JR Bootstrapping in human genetic linkage. Ann Hum Genet 58: Cox NJ, Frigge M, Nicolae DL, Concannon P, Hanis CL, Bell GI, Kong A Loci on chromosomes 2 (NIDDM1) and 15 interact to increase susceptibility to diabetes in Mexican Americans. Nat Genet 21: Davis CC, Brown WM, Lange EM, Rich SS, Langefeld CD Nonparametric linkage regression. II: Identification of influential pedigrees in test for linkage. Genet Epidemiol 21(Suppl):S123 S129. Devlin B, Jones BL, Bacanu SA, Roeder K Mixture models for linkage analysis of affected sibling pairs and covariates. Genet Epidemiol 22: Falk C Effect of genetic heterogeneity and assortative mating on linkage analysis: A simulation study. Am J Hum Genet 61: Futreal PA, Liu Q, Shattuck-Eidens D, Cochran C, Harshman K, Tavtigian S, Bennett LM, Haugen-Strano A, Swensen J, Miki Y, Eddington K, McClure M, Frye C, Weaver-Feldhaus J, Ding W, Gholami Z, Soderkvist P, Terry L, Jhanwar S, Berchuck A, Iglehart JD, Marks J, Ballinger DG, Barrett JC, Skolnick MH, Kamb A, Wiseman R BRCA1 mutations in primary breast and ovarian carcinomas. Science 266: Ghosh S, Watanabe RM, Valle T, Hauser ER, Magnuson VL, Langefeld CD, Ally DS, Mohlke K, Silander K, Kohtamaki K, Chines P, Balow J, Birznieks G, Chang J, Eldridge W, Erdos MR, Karanjawala ZE, Knapp JI, Kuelko K, Martin C, Morales-Mena A, Musick A, Musick T, Pfahl C, Porter R, Rayman JB, Rha D, Segal L, Shapiro S, Sharaf R, Shurtleff B, So A, Tannenbaum J, Te C, Tovar J, Unni A, Whiten R, Witt A, Blaschak-Harven J, Douglas JA, Duren WL, Epstien MP, Fingerlin TE, Kaleta HS, Lange EM, Li C, McEachin RC, Stringham HM, Trager E, White PP, Erikkson J, Toivanen L, Vidren G, Nylund SJ, Tuomilehto- Wolf E, Ross EH, Demirchyan E, Hagopian WA, Buchanan TA, Tuomilehto J, Bergman RN, Collins FS and Boehnke M The Finland-United States investigation of non-insulindependent diabetes mellitus genetic (FUSION) study: I. An autosomal genome scan for genes that predispose to type 2 diabetes. Am J Hum Genet 67: Goddard KA, Witte JS, Suarez BK, Catalona WJ, Olson JM Model-free linkage analysis with covariates confirms linkage of prostate cancer to chromosomes 1 and 4. Am J Hum Genet 68: Greenwood CM, Bull SB Analysis of affected sib pairs, with covariates with and without constraints. Am J Hum Genet 64: Hall JM, Lee MK, Newman B, Morrow JE, Anderson LA, Huey B, King MC Linkage of early-onset familial breast cancer to chromosome 17q21. Science 250: Hauser ER, Boehnke M Genetic linkage analysis of complex genetic traits by using affected sibling pairs. Biometrics 54: Hauser ER, Boehnke M, Guo SW, Risch N Affected-sib-pair interval mapping and exclusion for complex genetic traits: sampling considerations. Genet Epidemiol 13: Kovac I, Rouillard E, Merette C, Palmour R Exploring the impact of extended phenotype in stratified samples. Genet Epidemiol 17:S211 S216. Langefeld CD, Davis CC, Brown WM Nonparametric linkage regression. I: Combined Caucasian CSGA and German genome scans for asthma. Genet Epidemiol 21(Suppl): S136 S141. Leal SM Ott J Effects of stratification in the analysis of affected sib-pair data: Benefits and costs. Am J Hum Genet 66: Lunetta KL Rogus JJ Strategy for mapping minor histocompatibility genes involved in graft-versus-host disease:

11 Linkage Analysis in Ordered Subsets 63 a novel application for the discordant sib pair methodology. Genet Epidemiol 15: Mahtani MM, Widen E, Lehto M, Thomas J, McCarthy M, Brayer J, Bryant B, Chan G, Daly M, Forsblom C, Kanninen T, Kirby A, Kruglyak L, Munnelly K, Parkkonen M, Reeve-Daly MP, Weaver A, Brettin T, Duyk G, Lander ES, Groop LC Mapping of a gene for type 2 diabetes associated with an insulin secretion defect by a genome scan in Finnish families. Nat Genet 14: Merette C, King MC, Ott J Heterogeneity analysis of breast cancer families by using age at onset as a covariate. Am J Hum Genet 50: Merette C, Cayer M, Rouillard E, Roy AK, Guibord P, Kovac I, Ghazzali N, Szatmari P, Roy MA, Maziade M, Palmour R Evidence of linkage in subtypes of alcoholism. Genet Epidemiol 17:S253 S258. Miki Y, Swensen J, Shattuck-Eidens D, Futreal PA, Harshman K, Tavtigian S, Liu Q, Cochran C, Bennett LM, Ding W, Bell R, Rosenthal J, and 33 others A strong candidate for the breast and ovarian cancer susceptibility gene BRCA1. Science 266: Morton NE Sequential tests for the detection of linkage. Am J Hum Genet 7: Morton NE The detection and estimation of linkage between the genes for elliptocytosis and the Rh blood type. Am J Hum Genet 8: North BV, Curtis D, Sham PV A note on the calculation of empirical p values from Monte Carlo procedures. Am J Hum Genet 71: O Connell JR, Weeks DE The VITESSE algorithm for rapid exact multilocus linkage analysis via genotype set-recoding and fuzzy inheritance [see comments]. Nat Genet 11: Olson JM Multipoint linkage analysis using sibpairs: An interval mapping approach for dichotomous outcomes. Am J Hum Genet 56: Olson JM A general conditional-logistic model for affected-relative-pair linkage studies. Am J Hum Genet 65: Olson JM, Goddard KA, Dudek DM The amyloid precursor protein locus and very-late-onset Alzheimer disease. Am J Hum Genet 69: Ott J Linkage analysis and family classification under heterogeneity. Ann Hum Genet 47: Ott J Analysis of human genetic linkage, 3rd ed. Baltimore, MD: The Johns Hopkins University Press Risch N. 1990a. Linkage strategies for genetically complex traits. I. Multilocus models. Am J Hum Genet 46: Risch N. 1990b. Linkage strategies for genetically complex traits. II. The power of affected relative pairs. Am J Hum Genet 46: Risch N. 1990c. Linkage strategies for genetically complex traits. III. The effect of marker polymorphism on analysis of affected relative pairs. Am J Hum Genet 46: Shao Y, Cuccaro ML, Hauser ER, Raiford KL, Menold MM, Wolpert CM, Ravan SA, Elston L, Decena K, Donnelly SL, Abramson RK, Wright HH, DeLong GR, Gilbert JR, Pericak- Vance MA Fine mapping of autistic disorder to chromosome 15q11-q13 by use of phenotypic subtypes. Am J Hum Genet 72: Smith CAB Testing for heterogeneity of recombination fractions values in human genetics. Ann Hum Genet 27: Tores F, Uhry Z, Detroyes B, Demenais F, Martinez M Sib-pair linkage analysis of alcohol dependence taking into account covariates and age-of-onset variability: Evaluation of the residual approach. Genet Epidemiol 17: S349 S354. Watanabe RM, Ghosh S, Birznieks G, Duren WL, Mitchell BD Application of an ordered subset analysis approach to the genetics of alcoholism. Genet Epidemiol 17:S385 S390.

Effects of Stratification in the Analysis of Affected-Sib-Pair Data: Benefits and Costs

Effects of Stratification in the Analysis of Affected-Sib-Pair Data: Benefits and Costs Am. J. Hum. Genet. 66:567 575, 2000 Effects of Stratification in the Analysis of Affected-Sib-Pair Data: Benefits and Costs Suzanne M. Leal and Jurg Ott Laboratory of Statistical Genetics, The Rockefeller

More information

Non-parametric methods for linkage analysis

Non-parametric methods for linkage analysis BIOSTT516 Statistical Methods in Genetic Epidemiology utumn 005 Non-parametric methods for linkage analysis To this point, we have discussed model-based linkage analyses. These require one to specify a

More information

Nonparametric Linkage Analysis. Nonparametric Linkage Analysis

Nonparametric Linkage Analysis. Nonparametric Linkage Analysis Limitations of Parametric Linkage Analysis We previously discued parametric linkage analysis Genetic model for the disease must be specified: allele frequency parameters and penetrance parameters Lod scores

More information

Effect of Genetic Heterogeneity and Assortative Mating on Linkage Analysis: A Simulation Study

Effect of Genetic Heterogeneity and Assortative Mating on Linkage Analysis: A Simulation Study Am. J. Hum. Genet. 61:1169 1178, 1997 Effect of Genetic Heterogeneity and Assortative Mating on Linkage Analysis: A Simulation Study Catherine T. Falk The Lindsley F. Kimball Research Institute of The

More information

Introduction to linkage and family based designs to study the genetic epidemiology of complex traits. Harold Snieder

Introduction to linkage and family based designs to study the genetic epidemiology of complex traits. Harold Snieder Introduction to linkage and family based designs to study the genetic epidemiology of complex traits Harold Snieder Overview of presentation Designs: population vs. family based Mendelian vs. complex diseases/traits

More information

Chapter 2. Linkage Analysis. JenniferH.BarrettandM.DawnTeare. Abstract. 1. Introduction

Chapter 2. Linkage Analysis. JenniferH.BarrettandM.DawnTeare. Abstract. 1. Introduction Chapter 2 Linkage Analysis JenniferH.BarrettandM.DawnTeare Abstract Linkage analysis is used to map genetic loci using observations on relatives. It can be applied to both major gene disorders (parametric

More information

Alzheimer Disease and Complex Segregation Analysis p.1/29

Alzheimer Disease and Complex Segregation Analysis p.1/29 Alzheimer Disease and Complex Segregation Analysis Amanda Halladay Dalhousie University Alzheimer Disease and Complex Segregation Analysis p.1/29 Outline Background Information on Alzheimer Disease Alzheimer

More information

Genome Scan Meta-Analysis of Schizophrenia and Bipolar Disorder, Part I: Methods and Power Analysis

Genome Scan Meta-Analysis of Schizophrenia and Bipolar Disorder, Part I: Methods and Power Analysis Am. J. Hum. Genet. 73:17 33, 2003 Genome Scan Meta-Analysis of Schizophrenia and Bipolar Disorder, Part I: Methods and Power Analysis Douglas F. Levinson, 1 Matthew D. Levinson, 1 Ricardo Segurado, 2 and

More information

In 1996, Hanis et al. (1) reported that a genome-wide

In 1996, Hanis et al. (1) reported that a genome-wide Brief Genetics Report Variation in Three Single Nucleotide Polymorphisms in the Calpain-10 Gene Not Associated With Type 2 Diabetes in a Large Finnish Cohort Tasha E. Fingerlin, 1,2 Michael R. Erdos, 3

More information

Complex Multifactorial Genetic Diseases

Complex Multifactorial Genetic Diseases Complex Multifactorial Genetic Diseases Nicola J Camp, University of Utah, Utah, USA Aruna Bansal, University of Utah, Utah, USA Secondary article Article Contents. Introduction. Continuous Variation.

More information

1248 Letters to the Editor

1248 Letters to the Editor 148 Letters to the Editor Badner JA, Goldin LR (1997) Bipolar disorder and chromosome 18: an analysis of multiple data sets. Genet Epidemiol 14:569 574. Meta-analysis of linkage studies. Genet Epidemiol

More information

Dan Koller, Ph.D. Medical and Molecular Genetics

Dan Koller, Ph.D. Medical and Molecular Genetics Design of Genetic Studies Dan Koller, Ph.D. Research Assistant Professor Medical and Molecular Genetics Genetics and Medicine Over the past decade, advances from genetics have permeated medicine Identification

More information

Stat 531 Statistical Genetics I Homework 4

Stat 531 Statistical Genetics I Homework 4 Stat 531 Statistical Genetics I Homework 4 Erik Erhardt November 17, 2004 1 Duerr et al. report an association between a particular locus on chromosome 12, D12S1724, and in ammatory bowel disease (Am.

More information

Combined Analysis of Hereditary Prostate Cancer Linkage to 1q24-25

Combined Analysis of Hereditary Prostate Cancer Linkage to 1q24-25 Am. J. Hum. Genet. 66:945 957, 000 Combined Analysis of Hereditary Prostate Cancer Linkage to 1q4-5: Results from 77 Hereditary Prostate Cancer Families from the International Consortium for Prostate Cancer

More information

Genetics and Genomics in Medicine Chapter 8 Questions

Genetics and Genomics in Medicine Chapter 8 Questions Genetics and Genomics in Medicine Chapter 8 Questions Linkage Analysis Question Question 8.1 Affected members of the pedigree above have an autosomal dominant disorder, and cytogenetic analyses using conventional

More information

Complex Trait Genetics in Animal Models. Will Valdar Oxford University

Complex Trait Genetics in Animal Models. Will Valdar Oxford University Complex Trait Genetics in Animal Models Will Valdar Oxford University Mapping Genes for Quantitative Traits in Outbred Mice Will Valdar Oxford University What s so great about mice? Share ~99% of genes

More information

Performing. linkage analysis using MERLIN

Performing. linkage analysis using MERLIN Performing linkage analysis using MERLIN David Duffy Queensland Institute of Medical Research Brisbane, Australia Overview MERLIN and associated programs Error checking Parametric linkage analysis Nonparametric

More information

New Enhancements: GWAS Workflows with SVS

New Enhancements: GWAS Workflows with SVS New Enhancements: GWAS Workflows with SVS August 9 th, 2017 Gabe Rudy VP Product & Engineering 20 most promising Biotech Technology Providers Top 10 Analytics Solution Providers Hype Cycle for Life sciences

More information

Human population sub-structure and genetic association studies

Human population sub-structure and genetic association studies Human population sub-structure and genetic association studies Stephanie A. Santorico, Ph.D. Department of Mathematical & Statistical Sciences Stephanie.Santorico@ucdenver.edu Global Similarity Map from

More information

Linkage analysis: Prostate Cancer

Linkage analysis: Prostate Cancer Linkage analysis: Prostate Cancer Prostate Cancer It is the most frequent cancer (after nonmelanoma skin cancer) In 2005, more than 232.000 new cases were diagnosed in USA and more than 30.000 will die

More information

Introduction. Am. J. Hum. Genet. 67: , 2000

Introduction. Am. J. Hum. Genet. 67: , 2000 Am. J. Hum. Genet. 67:1174 1185, 2000 The Finland United States Investigation of Non Insulin-Dependent Diabetes Mellitus Genetics (FUSION) Study. I. An Autosomal Genome Scan for Genes That Predispose to

More information

Multifactorial Inheritance. Prof. Dr. Nedime Serakinci

Multifactorial Inheritance. Prof. Dr. Nedime Serakinci Multifactorial Inheritance Prof. Dr. Nedime Serakinci GENETICS I. Importance of genetics. Genetic terminology. I. Mendelian Genetics, Mendel s Laws (Law of Segregation, Law of Independent Assortment).

More information

Using Sex-Averaged Genetic Maps in Multipoint Linkage Analysis When Identity-by-Descent Status is Incompletely Known

Using Sex-Averaged Genetic Maps in Multipoint Linkage Analysis When Identity-by-Descent Status is Incompletely Known Genetic Epidemiology 30: 384 396 (2006) Using Sex-Averaged Genetic Maps in Multipoint Linkage Analysis When Identity-by-Descent Status is Incompletely Known Tasha E. Fingerlin, 1 Gonc-alo R. Abecasis 2

More information

A Large Sample of Finnish Diabetic Sib-Pairs Reveals No Evidence for a Non Insulin-dependent Diabetes Mellitus Susceptibility Locus at 2qter

A Large Sample of Finnish Diabetic Sib-Pairs Reveals No Evidence for a Non Insulin-dependent Diabetes Mellitus Susceptibility Locus at 2qter A Large Sample of Finnish Diabetic Sib-Pairs Reveals No Evidence for a Non Insulin-dependent Diabetes Mellitus Susceptibility Locus at 2qter Soumitra Ghosh,* Elizabeth R. Hauser, Victoria L. Magnuson,*

More information

Transmission Disequilibrium Methods for Family-Based Studies Daniel J. Schaid Technical Report #72 July, 2004

Transmission Disequilibrium Methods for Family-Based Studies Daniel J. Schaid Technical Report #72 July, 2004 Transmission Disequilibrium Methods for Family-Based Studies Daniel J. Schaid Technical Report #72 July, 2004 Correspondence to: Daniel J. Schaid, Ph.D., Harwick 775, Division of Biostatistics Mayo Clinic/Foundation,

More information

Title: A robustness study of parametric and non-parametric tests in Model-Based Multifactor Dimensionality Reduction for epistasis detection

Title: A robustness study of parametric and non-parametric tests in Model-Based Multifactor Dimensionality Reduction for epistasis detection Author's response to reviews Title: A robustness study of parametric and non-parametric tests in Model-Based Multifactor Dimensionality Reduction for epistasis detection Authors: Jestinah M Mahachie John

More information

Lecture Outline. Biost 590: Statistical Consulting. Stages of Scientific Studies. Scientific Method

Lecture Outline. Biost 590: Statistical Consulting. Stages of Scientific Studies. Scientific Method Biost 590: Statistical Consulting Statistical Classification of Scientific Studies; Approach to Consulting Lecture Outline Statistical Classification of Scientific Studies Statistical Tasks Approach to

More information

Ascertainment Through Family History of Disease Often Decreases the Power of Family-based Association Studies

Ascertainment Through Family History of Disease Often Decreases the Power of Family-based Association Studies Behav Genet (2007) 37:631 636 DOI 17/s10519-007-9149-0 ORIGINAL PAPER Ascertainment Through Family History of Disease Often Decreases the Power of Family-based Association Studies Manuel A. R. Ferreira

More information

Comments on Significance of candidate cancer genes as assessed by the CaMP score by Parmigiani et al.

Comments on Significance of candidate cancer genes as assessed by the CaMP score by Parmigiani et al. Comments on Significance of candidate cancer genes as assessed by the CaMP score by Parmigiani et al. Holger Höfling Gad Getz Robert Tibshirani June 26, 2007 1 Introduction Identifying genes that are involved

More information

CS2220 Introduction to Computational Biology

CS2220 Introduction to Computational Biology CS2220 Introduction to Computational Biology WEEK 8: GENOME-WIDE ASSOCIATION STUDIES (GWAS) 1 Dr. Mengling FENG Institute for Infocomm Research Massachusetts Institute of Technology mfeng@mit.edu PLANS

More information

Understandable Statistics

Understandable Statistics Understandable Statistics correlated to the Advanced Placement Program Course Description for Statistics Prepared for Alabama CC2 6/2003 2003 Understandable Statistics 2003 correlated to the Advanced Placement

More information

A Unified Sampling Approach for Multipoint Analysis of Qualitative and Quantitative Traits in Sib Pairs

A Unified Sampling Approach for Multipoint Analysis of Qualitative and Quantitative Traits in Sib Pairs Am. J. Hum. Genet. 66:1631 1641, 000 A Unified Sampling Approach for Multipoint Analysis of Qualitative and Quantitative Traits in Sib Pairs Kung-Yee Liang, 1 Chiung-Yu Huang, 1 and Terri H. Beaty Departments

More information

Statistical Tests for X Chromosome Association Study. with Simulations. Jian Wang July 10, 2012

Statistical Tests for X Chromosome Association Study. with Simulations. Jian Wang July 10, 2012 Statistical Tests for X Chromosome Association Study with Simulations Jian Wang July 10, 2012 Statistical Tests Zheng G, et al. 2007. Testing association for markers on the X chromosome. Genetic Epidemiology

More information

White Paper Estimating Complex Phenotype Prevalence Using Predictive Models

White Paper Estimating Complex Phenotype Prevalence Using Predictive Models White Paper 23-12 Estimating Complex Phenotype Prevalence Using Predictive Models Authors: Nicholas A. Furlotte Aaron Kleinman Robin Smith David Hinds Created: September 25 th, 2015 September 25th, 2015

More information

Biostatistics II

Biostatistics II Biostatistics II 514-5509 Course Description: Modern multivariable statistical analysis based on the concept of generalized linear models. Includes linear, logistic, and Poisson regression, survival analysis,

More information

Unit 1 Exploring and Understanding Data

Unit 1 Exploring and Understanding Data Unit 1 Exploring and Understanding Data Area Principle Bar Chart Boxplot Conditional Distribution Dotplot Empirical Rule Five Number Summary Frequency Distribution Frequency Polygon Histogram Interquartile

More information

Type 2 diabetes is a common metabolic disorder

Type 2 diabetes is a common metabolic disorder A Genome-Wide Scan for Loci Linked to Plasma Levels of Glucose and HbA 1c in a Community-Based Sample of Caucasian Pedigrees The Framingham Offspring Study James B. Meigs, 1 Carolien I. M. Panhuysen, 2

More information

Estimating genetic variation within families

Estimating genetic variation within families Estimating genetic variation within families Peter M. Visscher Queensland Institute of Medical Research Brisbane, Australia peter.visscher@qimr.edu.au 1 Overview Estimation of genetic parameters Variation

More information

Supplementary Figures

Supplementary Figures Supplementary Figures Supplementary Fig 1. Comparison of sub-samples on the first two principal components of genetic variation. TheBritishsampleisplottedwithredpoints.The sub-samples of the diverse sample

More information

Investigating the robustness of the nonparametric Levene test with more than two groups

Investigating the robustness of the nonparametric Levene test with more than two groups Psicológica (2014), 35, 361-383. Investigating the robustness of the nonparametric Levene test with more than two groups David W. Nordstokke * and S. Mitchell Colp University of Calgary, Canada Testing

More information

MULTIFACTORIAL DISEASES. MG L-10 July 7 th 2014

MULTIFACTORIAL DISEASES. MG L-10 July 7 th 2014 MULTIFACTORIAL DISEASES MG L-10 July 7 th 2014 Genetic Diseases Unifactorial Chromosomal Multifactorial AD Numerical AR Structural X-linked Microdeletions Mitochondrial Spectrum of Alterations in DNA Sequence

More information

Evidence for a type 2 diabetes locus at chromosome

Evidence for a type 2 diabetes locus at chromosome Genetic Variation Near the Hepatocyte Nuclear Factor-4 Gene Predicts Susceptibility to Type 2 Diabetes Kaisa Silander, 1 Karen L. Mohlke, 1 Laura J. Scott, 2 Erin C. Peck, 1 Pablo Hollstein, 1 Andrew D.

More information

National Disease Research Interchange Annual Progress Report: 2010 Formula Grant

National Disease Research Interchange Annual Progress Report: 2010 Formula Grant National Disease Research Interchange Annual Progress Report: 2010 Formula Grant Reporting Period July 1, 2011 June 30, 2012 Formula Grant Overview The National Disease Research Interchange received $62,393

More information

For more information about how to cite these materials visit

For more information about how to cite these materials visit Author(s): Kerby Shedden, Ph.D., 2010 License: Unless otherwise noted, this material is made available under the terms of the Creative Commons Attribution Share Alike 3.0 License: http://creativecommons.org/licenses/by-sa/3.0/

More information

Transmission Disequilibrium Test in GWAS

Transmission Disequilibrium Test in GWAS Department of Computer Science Brown University, Providence sorin@cs.brown.edu November 10, 2010 Outline 1 Outline 2 3 4 The transmission/disequilibrium test (TDT) was intro- duced several years ago by

More information

STATISTICAL ANALYSIS FOR GENETIC EPIDEMIOLOGY (S.A.G.E.) INTRODUCTION

STATISTICAL ANALYSIS FOR GENETIC EPIDEMIOLOGY (S.A.G.E.) INTRODUCTION STATISTICAL ANALYSIS FOR GENETIC EPIDEMIOLOGY (S.A.G.E.) INTRODUCTION Release 3.1 December 1997 - ii - INTRODUCTION Table of Contents 1 Changes Since Last Release... 1 2 Purpose... 3 3 Using the Programs...

More information

Research and Evaluation Methodology Program, School of Human Development and Organizational Studies in Education, University of Florida

Research and Evaluation Methodology Program, School of Human Development and Organizational Studies in Education, University of Florida Vol. 2 (1), pp. 22-39, Jan, 2015 http://www.ijate.net e-issn: 2148-7456 IJATE A Comparison of Logistic Regression Models for Dif Detection in Polytomous Items: The Effect of Small Sample Sizes and Non-Normality

More information

During the hyperinsulinemic-euglycemic clamp [1], a priming dose of human insulin (Novolin,

During the hyperinsulinemic-euglycemic clamp [1], a priming dose of human insulin (Novolin, ESM Methods Hyperinsulinemic-euglycemic clamp procedure During the hyperinsulinemic-euglycemic clamp [1], a priming dose of human insulin (Novolin, Clayton, NC) was followed by a constant rate (60 mu m

More information

The Determination of the Genetic Order and Genetic Map for the Eye Color, Wing Size, and Bristle Morphology in Drosophila melanogaster

The Determination of the Genetic Order and Genetic Map for the Eye Color, Wing Size, and Bristle Morphology in Drosophila melanogaster Kudlac 1 Kaitie Kudlac March 24, 2015 Professor Ma Genetics 356 The Determination of the Genetic Order and Genetic Map for the Eye Color, Wing Size, and Bristle Morphology in Drosophila melanogaster Abstract:

More information

Numerous hypothesis tests were performed in this study. To reduce the false positive due to

Numerous hypothesis tests were performed in this study. To reduce the false positive due to Two alternative data-splitting Numerous hypothesis tests were performed in this study. To reduce the false positive due to multiple testing, we are not only seeking the results with extremely small p values

More information

BST227 Introduction to Statistical Genetics. Lecture 4: Introduction to linkage and association analysis

BST227 Introduction to Statistical Genetics. Lecture 4: Introduction to linkage and association analysis BST227 Introduction to Statistical Genetics Lecture 4: Introduction to linkage and association analysis 1 Housekeeping Homework #1 due today Homework #2 posted (due Monday) Lab at 5:30PM today (FXB G13)

More information

breast cancer; relative risk; risk factor; standard deviation; strength of association

breast cancer; relative risk; risk factor; standard deviation; strength of association American Journal of Epidemiology The Author 2015. Published by Oxford University Press on behalf of the Johns Hopkins Bloomberg School of Public Health. All rights reserved. For permissions, please e-mail:

More information

A review of statistical methods in the analysis of data arising from observer reliability studies (Part 11) *

A review of statistical methods in the analysis of data arising from observer reliability studies (Part 11) * A review of statistical methods in the analysis of data arising from observer reliability studies (Part 11) * by J. RICHARD LANDIS** and GARY G. KOCH** 4 Methods proposed for nominal and ordinal data Many

More information

UNIVERSITY OF CALIFORNIA, LOS ANGELES

UNIVERSITY OF CALIFORNIA, LOS ANGELES UNIVERSITY OF CALIFORNIA, LOS ANGELES BERKELEY DAVIS IRVINE LOS ANGELES MERCED RIVERSIDE SAN DIEGO SAN FRANCISCO UCLA SANTA BARBARA SANTA CRUZ DEPARTMENT OF EPIDEMIOLOGY SCHOOL OF PUBLIC HEALTH CAMPUS

More information

Genomewide Linkage of Forced Mid-Expiratory Flow in Chronic Obstructive Pulmonary Disease

Genomewide Linkage of Forced Mid-Expiratory Flow in Chronic Obstructive Pulmonary Disease ONLINE DATA SUPPLEMENT Genomewide Linkage of Forced Mid-Expiratory Flow in Chronic Obstructive Pulmonary Disease Dawn L. DeMeo, M.D., M.P.H.,Juan C. Celedón, M.D., Dr.P.H., Christoph Lange, John J. Reilly,

More information

Introduction to Genetics and Genomics

Introduction to Genetics and Genomics 2016 Introduction to enetics and enomics 3. ssociation Studies ggibson.gt@gmail.com http://www.cig.gatech.edu Outline eneral overview of association studies Sample results hree steps to WS: primary scan,

More information

USE AND MISUSE OF MIXED MODEL ANALYSIS VARIANCE IN ECOLOGICAL STUDIES1

USE AND MISUSE OF MIXED MODEL ANALYSIS VARIANCE IN ECOLOGICAL STUDIES1 Ecology, 75(3), 1994, pp. 717-722 c) 1994 by the Ecological Society of America USE AND MISUSE OF MIXED MODEL ANALYSIS VARIANCE IN ECOLOGICAL STUDIES1 OF CYNTHIA C. BENNINGTON Department of Biology, West

More information

Strategies for Mapping Heterogeneous Recessive Traits by Allele- Sharing Methods

Strategies for Mapping Heterogeneous Recessive Traits by Allele- Sharing Methods Am. J. Hum. Genet. 60:965-978, 1997 Strategies for Mapping Heterogeneous Recessive Traits by Allele- Sharing Methods Eleanor Feingold' and David 0. Siegmund2 'Department of Biostatistics, Emory University,

More information

Lecture 6 Practice of Linkage Analysis

Lecture 6 Practice of Linkage Analysis Lecture 6 Practice of Linkage Analysis Jurg Ott http://lab.rockefeller.edu/ott/ http://www.jurgott.org/pekingu/ The LINKAGE Programs http://www.jurgott.org/linkage/linkagepc.html Input: pedfile Fam ID

More information

Flexible Matching in Case-Control Studies of Gene-Environment Interactions

Flexible Matching in Case-Control Studies of Gene-Environment Interactions American Journal of Epidemiology Copyright 2004 by the Johns Hopkins Bloomberg School of Public Health All rights reserved Vol. 59, No. Printed in U.S.A. DOI: 0.093/aje/kwg250 ORIGINAL CONTRIBUTIONS Flexible

More information

SNPrints: Defining SNP signatures for prediction of onset in complex diseases

SNPrints: Defining SNP signatures for prediction of onset in complex diseases SNPrints: Defining SNP signatures for prediction of onset in complex diseases Linda Liu, Biomedical Informatics, Stanford University Daniel Newburger, Biomedical Informatics, Stanford University Grace

More information

An Introduction to Quantitative Genetics I. Heather A Lawson Advanced Genetics Spring2018

An Introduction to Quantitative Genetics I. Heather A Lawson Advanced Genetics Spring2018 An Introduction to Quantitative Genetics I Heather A Lawson Advanced Genetics Spring2018 Outline What is Quantitative Genetics? Genotypic Values and Genetic Effects Heritability Linkage Disequilibrium

More information

Lessons in biostatistics

Lessons in biostatistics Lessons in biostatistics The test of independence Mary L. McHugh Department of Nursing, School of Health and Human Services, National University, Aero Court, San Diego, California, USA Corresponding author:

More information

GENETIC LINKAGE ANALYSIS

GENETIC LINKAGE ANALYSIS Atlas of Genetics and Cytogenetics in Oncology and Haematology GENETIC LINKAGE ANALYSIS * I- Recombination fraction II- Definition of the "lod score" of a family III- Test for linkage IV- Estimation of

More information

Replacing IBS with IBD: The MLS Method. Biostatistics 666 Lecture 15

Replacing IBS with IBD: The MLS Method. Biostatistics 666 Lecture 15 Replacing IBS with IBD: The MLS Method Biostatistics 666 Lecture 5 Previous Lecture Analysis of Affected Relative Pairs Test for Increased Sharing at Marker Expected Amount of IBS Sharing Previous Lecture:

More information

A NEW TRIAL DESIGN FULLY INTEGRATING BIOMARKER INFORMATION FOR THE EVALUATION OF TREATMENT-EFFECT MECHANISMS IN PERSONALISED MEDICINE

A NEW TRIAL DESIGN FULLY INTEGRATING BIOMARKER INFORMATION FOR THE EVALUATION OF TREATMENT-EFFECT MECHANISMS IN PERSONALISED MEDICINE A NEW TRIAL DESIGN FULLY INTEGRATING BIOMARKER INFORMATION FOR THE EVALUATION OF TREATMENT-EFFECT MECHANISMS IN PERSONALISED MEDICINE Dr Richard Emsley Centre for Biostatistics, Institute of Population

More information

A Comparison of Sample Size and Power in Case-Only Association Studies of Gene-Environment Interaction

A Comparison of Sample Size and Power in Case-Only Association Studies of Gene-Environment Interaction American Journal of Epidemiology ª The Author 2010. Published by Oxford University Press on behalf of the Johns Hopkins Bloomberg School of Public Health. This is an Open Access article distributed under

More information

Computerized Mastery Testing

Computerized Mastery Testing Computerized Mastery Testing With Nonequivalent Testlets Kathleen Sheehan and Charles Lewis Educational Testing Service A procedure for determining the effect of testlet nonequivalence on the operating

More information

Regression-based Linkage Analysis Methods

Regression-based Linkage Analysis Methods Chapter 1 Regression-based Linkage Analysis Methods Tao Wang and Robert C. Elston Department of Epidemiology and Biostatistics Case Western Reserve University, Wolstein Research Building 2103 Cornell Road,

More information

Geographic Differences in Event Rates by Model for End-Stage Liver Disease Score

Geographic Differences in Event Rates by Model for End-Stage Liver Disease Score American Journal of Transplantation 2006; 6: 2470 2475 Blackwell Munksgaard C 2006 The Authors Journal compilation C 2006 The American Society of Transplantation and the American Society of Transplant

More information

Field wide development of analytic approaches for sequence data

Field wide development of analytic approaches for sequence data Benjamin Neale Field wide development of analytic approaches for sequence data Cohort Allelic Sum Test (CAST; Hobbs, Cohen and others) Li and Leal (AJHG) Madsen and Browning (PLoS Genetics) C alpha and

More information

Analysis and Interpretation of Data Part 1

Analysis and Interpretation of Data Part 1 Analysis and Interpretation of Data Part 1 DATA ANALYSIS: PRELIMINARY STEPS 1. Editing Field Edit Completeness Legibility Comprehensibility Consistency Uniformity Central Office Edit 2. Coding Specifying

More information

Genome-wide Association Analysis Applied to Asthma-Susceptibility Gene. McCaw, Z., Wu, W., Hsiao, S., McKhann, A., Tracy, S.

Genome-wide Association Analysis Applied to Asthma-Susceptibility Gene. McCaw, Z., Wu, W., Hsiao, S., McKhann, A., Tracy, S. Genome-wide Association Analysis Applied to Asthma-Susceptibility Gene McCaw, Z., Wu, W., Hsiao, S., McKhann, A., Tracy, S. December 17, 2014 1 Introduction Asthma is a chronic respiratory disease affecting

More information

Rapid evolution towards equal sex ratios in a system with heterogamety

Rapid evolution towards equal sex ratios in a system with heterogamety Evolutionary Ecology Research, 1999, 1: 277 283 Rapid evolution towards equal sex ratios in a system with heterogamety Mark W. Blows, 1 * David Berrigan 2,3 and George W. Gilchrist 3 1 Department of Zoology,

More information

ARTICLE Identification of Susceptibility Genes for Cancer in a Genome-wide Scan: Results from the Colon Neoplasia Sibling Study

ARTICLE Identification of Susceptibility Genes for Cancer in a Genome-wide Scan: Results from the Colon Neoplasia Sibling Study ARTICLE Identification of Susceptibility Genes for Cancer in a Genome-wide Scan: Results from the Colon Neoplasia Sibling Study Denise Daley, 1,9, * Susan Lewis, 2 Petra Platzer, 3,8 Melissa MacMillen,

More information

Rare Variant Burden Tests. Biostatistics 666

Rare Variant Burden Tests. Biostatistics 666 Rare Variant Burden Tests Biostatistics 666 Last Lecture Analysis of Short Read Sequence Data Low pass sequencing approaches Modeling haplotype sharing between individuals allows accurate variant calls

More information

THERE is broad interest in genetic loci (called quantitative

THERE is broad interest in genetic loci (called quantitative Copyright Ó 2006 by the Genetics Society of America DOI: 10.1534/genetics.106.061176 The X Chromosome in Quantitative Trait Locus Mapping Karl W. Broman,*,1 Śaunak Sen, Sarah E. Owens,,2 Ani Manichaikul,*

More information

A Review of Multiple Hypothesis Testing in Otolaryngology Literature

A Review of Multiple Hypothesis Testing in Otolaryngology Literature The Laryngoscope VC 2014 The American Laryngological, Rhinological and Otological Society, Inc. Systematic Review A Review of Multiple Hypothesis Testing in Otolaryngology Literature Erin M. Kirkham, MD,

More information

Mendelian Inheritance. Jurg Ott Columbia and Rockefeller Universities New York

Mendelian Inheritance. Jurg Ott Columbia and Rockefeller Universities New York Mendelian Inheritance Jurg Ott Columbia and Rockefeller Universities New York Genes Mendelian Inheritance Gregor Mendel, monk in a monastery in Brünn (now Brno in Czech Republic): Breeding experiments

More information

Allowing for Missing Parents in Genetic Studies of Case-Parent Triads

Allowing for Missing Parents in Genetic Studies of Case-Parent Triads Am. J. Hum. Genet. 64:1186 1193, 1999 Allowing for Missing Parents in Genetic Studies of Case-Parent Triads C. R. Weinberg National Institute of Environmental Health Sciences, Research Triangle Park, NC

More information

On Missing Data and Genotyping Errors in Association Studies

On Missing Data and Genotyping Errors in Association Studies On Missing Data and Genotyping Errors in Association Studies Department of Biostatistics Johns Hopkins Bloomberg School of Public Health May 16, 2008 Specific Aims of our R01 1 Develop and evaluate new

More information

MBG* Animal Breeding Methods Fall Final Exam

MBG* Animal Breeding Methods Fall Final Exam MBG*4030 - Animal Breeding Methods Fall 2007 - Final Exam 1 Problem Questions Mick Dundee used his financial resources to purchase the Now That s A Croc crocodile farm that had been operating for a number

More information

1 Introduction. st0020. The Stata Journal (2002) 2, Number 3, pp

1 Introduction. st0020. The Stata Journal (2002) 2, Number 3, pp The Stata Journal (22) 2, Number 3, pp. 28 289 Comparative assessment of three common algorithms for estimating the variance of the area under the nonparametric receiver operating characteristic curve

More information

Lecture 1 Mendelian Inheritance

Lecture 1 Mendelian Inheritance Genes Mendelian Inheritance Lecture 1 Mendelian Inheritance Jurg Ott Gregor Mendel, monk in a monastery in Brünn (now Brno in Czech Republic): Breeding experiments with the garden pea: Flower color and

More information

Lec 02: Estimation & Hypothesis Testing in Animal Ecology

Lec 02: Estimation & Hypothesis Testing in Animal Ecology Lec 02: Estimation & Hypothesis Testing in Animal Ecology Parameter Estimation from Samples Samples We typically observe systems incompletely, i.e., we sample according to a designed protocol. We then

More information

Predicting Breast Cancer Survival Using Treatment and Patient Factors

Predicting Breast Cancer Survival Using Treatment and Patient Factors Predicting Breast Cancer Survival Using Treatment and Patient Factors William Chen wchen808@stanford.edu Henry Wang hwang9@stanford.edu 1. Introduction Breast cancer is the leading type of cancer in women

More information

WDHS Curriculum Map Probability and Statistics. What is Statistics and how does it relate to you?

WDHS Curriculum Map Probability and Statistics. What is Statistics and how does it relate to you? WDHS Curriculum Map Probability and Statistics Time Interval/ Unit 1: Introduction to Statistics 1.1-1.3 2 weeks S-IC-1: Understand statistics as a process for making inferences about population parameters

More information

Breast-Ovarian Cancer Gene on Chromosome 1 7q 1 2-q2

Breast-Ovarian Cancer Gene on Chromosome 1 7q 1 2-q2 Am. J. Hum. Genet. 52:767-776, 1993 Genetic Heterogeneity and Localization of a Familial Breast-Ovarian Cancer Gene on Chromosome 1 7q 1 2-q2 S. A. Smith,* D. F. Eastont D. Ford,t J. Peto,t K. Anderson,t

More information

Mapping quantitative trait loci in humans: achievements and limitations

Mapping quantitative trait loci in humans: achievements and limitations Review series Mapping quantitative trait loci in humans: achievements and limitations Partha P. Majumder and Saurabh Ghosh Human Genetics Unit, Indian Statistical Institute, Kolkata, India Recent advances

More information

Genome-wide linkage scan for prostate cancer susceptibility genes in men with aggressive disease: significant evidence for linkage at chromosome 15q12

Genome-wide linkage scan for prostate cancer susceptibility genes in men with aggressive disease: significant evidence for linkage at chromosome 15q12 Hum Genet (2006) 119: 400 407 DOI 10.1007/s00439-006-0149-6 ORIGINAL INVESTIGATION Ethan M. Lange Æ Lindsey A. Ho Jennifer L. Beebe-Dimmer Æ Yunfei Wang Elizabeth M. Gillanders Æ Jeffrey M. Trent Leslie

More information

What you should know before you collect data. BAE 815 (Fall 2017) Dr. Zifei Liu

What you should know before you collect data. BAE 815 (Fall 2017) Dr. Zifei Liu What you should know before you collect data BAE 815 (Fall 2017) Dr. Zifei Liu Zifeiliu@ksu.edu Types and levels of study Descriptive statistics Inferential statistics How to choose a statistical test

More information

Linkage of a bipolar disorder susceptibility locus to human chromosome 13q32 in a new pedigree series

Linkage of a bipolar disorder susceptibility locus to human chromosome 13q32 in a new pedigree series (2003) 8, 558 564 & 2003 Nature Publishing Group All rights reserved 1359-4184/03 $25.00 www.nature.com/mp ORIGINAL RESEARCH ARTICLE to human chromosome 13q32 in a new pedigree series SH Shaw 1,2, *, Z

More information

Ecological Statistics

Ecological Statistics A Primer of Ecological Statistics Second Edition Nicholas J. Gotelli University of Vermont Aaron M. Ellison Harvard Forest Sinauer Associates, Inc. Publishers Sunderland, Massachusetts U.S.A. Brief Contents

More information

Name: PS#: Biol 3301 Midterm 1 Spring 2012

Name: PS#: Biol 3301 Midterm 1 Spring 2012 Name: PS#: Biol 3301 Midterm 1 Spring 2012 Multiple Choice. Circle the single best answer. (4 pts each) 1. Which of the following changes in the DNA sequence of a gene will produce a new allele? a) base

More information

Minimizing Uncertainty in Property Casualty Loss Reserve Estimates Chris G. Gross, ACAS, MAAA

Minimizing Uncertainty in Property Casualty Loss Reserve Estimates Chris G. Gross, ACAS, MAAA Minimizing Uncertainty in Property Casualty Loss Reserve Estimates Chris G. Gross, ACAS, MAAA The uncertain nature of property casualty loss reserves Property Casualty loss reserves are inherently uncertain.

More information

2.75: 84% 2.5: 80% 2.25: 78% 2: 74% 1.75: 70% 1.5: 66% 1.25: 64% 1.0: 60% 0.5: 50% 0.25: 25% 0: 0%

2.75: 84% 2.5: 80% 2.25: 78% 2: 74% 1.75: 70% 1.5: 66% 1.25: 64% 1.0: 60% 0.5: 50% 0.25: 25% 0: 0% Capstone Test (will consist of FOUR quizzes and the FINAL test grade will be an average of the four quizzes). Capstone #1: Review of Chapters 1-3 Capstone #2: Review of Chapter 4 Capstone #3: Review of

More information

Impact and adjustment of selection bias. in the assessment of measurement equivalence

Impact and adjustment of selection bias. in the assessment of measurement equivalence Impact and adjustment of selection bias in the assessment of measurement equivalence Thomas Klausch, Joop Hox,& Barry Schouten Working Paper, Utrecht, December 2012 Corresponding author: Thomas Klausch,

More information

Pedigree Construction Notes

Pedigree Construction Notes Name Date Pedigree Construction Notes GO TO à Mendelian Inheritance (http://www.uic.edu/classes/bms/bms655/lesson3.html) When human geneticists first began to publish family studies, they used a variety

More information

A MONTE CARLO STUDY OF MODEL SELECTION PROCEDURES FOR THE ANALYSIS OF CATEGORICAL DATA

A MONTE CARLO STUDY OF MODEL SELECTION PROCEDURES FOR THE ANALYSIS OF CATEGORICAL DATA A MONTE CARLO STUDY OF MODEL SELECTION PROCEDURES FOR THE ANALYSIS OF CATEGORICAL DATA Elizabeth Martin Fischer, University of North Carolina Introduction Researchers and social scientists frequently confront

More information