Genome-wide Analysis of Epistatic Effects for Quantitative Traits in Barley. Shizhong Xu and Zhenyu Jia


Genetics: Published Articles Ahead of Print, published on February 4, 2007 as /genetics

Genome-wide Analysis of Epistatic Effects for Quantitative Traits in Barley

Shizhong Xu and Zhenyu Jia

Department of Botany and Plant Sciences, University of California, Riverside, CA 92521

Running head: Epistatic effects in barley

Keywords: Empirical Bayes, Mixed model, QTL mapping, Permutation test and Variance components

Corresponding author: Shizhong Xu, Department of Botany and Plant Sciences, University of California, Riverside, Riverside, CA USA. Phone: (951) Fax: (951)

ABSTRACT

The doubled-haploid (DH) barley population (Harrington × TR306) developed by the North American Barley Genome Mapping Project (NABGMP) for QTL mapping consisted of 145 lines and 127 markers covering a total genome length of 1270 cM. These DH lines were evaluated in about 25 environments for seven quantitative traits: heading, height, kernel weight, lodging, maturity, test weight and yield. We applied an empirical Bayes method that simultaneously estimates 127 main effects for all markers and 127 × (127 − 1)/2 = 8001 interaction effects for all marker pairs in a single model. We found that the largest main effect QTL (single marker) and the largest epistatic effect (single pair of markers) explained about 18% and 2.6% of the phenotypic variance, respectively. On average, the sum of all significant main effects and the sum of all significant epistatic effects contributed 35% and 6% of the total phenotypic variance, respectively. Epistasis seems to be negligible for all seven traits. We also found that whether two loci interact does not depend on whether or not the loci have individual main effects. This invalidates the common practice of epistatic analysis in which epistatic effects are estimated only for pairs of loci that both have main effects.
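To make the model size quoted in the abstract concrete, the following is a minimal Python sketch that builds a design matrix with one column per marker and one column per marker pair, then centers and rescales each column. It is only an illustration under stated assumptions: the genotypes are simulated at random rather than taken from the barley data, the function name `build_design` is hypothetical, and no effects are estimated.

```python
from itertools import combinations

import numpy as np


def build_design(Z):
    """Build main-effect and pairwise-product columns from an n x m matrix of -1/+1 codes.

    Hypothetical helper for illustration; not the authors' program.
    """
    n, m = Z.shape
    pairs = list(combinations(range(m), 2))       # all (l, l') with l < l'
    left = [l for l, _ in pairs]
    right = [k for _, k in pairs]
    # Element-wise products of marker columns give the epistatic columns.
    X = np.hstack([Z, Z[:, left] * Z[:, right]]).astype(float)
    # Center and rescale so each column sums to 0 and its squares sum to n.
    X -= X.mean(axis=0)
    sd = X.std(axis=0)
    sd[sd == 0] = 1.0                             # guard against constant columns
    X /= sd
    return X, pairs


# Simulated genotypes (assumption): 145 lines, 127 markers, codes -1 or +1.
rng = np.random.default_rng(0)
Z = rng.choice([-1, 1], size=(145, 127))
X, pairs = build_design(Z)
print(X.shape, len(pairs))  # (145, 8128) 8001
```

The column count, 127 + 8001 = 8128, is about 56 times the number of lines, which is why an ordinary least squares fit is not available for this model.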

INTRODUCTION

Epistatic effects are statistically defined as interactions between effects of alleles from two or more genetic loci (FISHER, 1918). Interactions, however, are simply deviations from additivity in a general linear model; as such they are often treated as statistical errors. COCKERHAM (1954) showed that epistatic effects can be partitioned into various epistatic components, e.g., A × A and A × D effects. Epistasis is now considered an important source of genetic variation for quantitative traits. Because different components involve interactions of different numbers and different types of alleles, some components are more important than others. In particular, the A × A component has been shown to be heritable (GOODNIGHT, 1988), and thus much attention has been paid to the study of A × A effects in response to selection and evolution (GOODNIGHT, 2000; JANNINK, 2003). Epistasis is an important source of variation contributing to speciation (WRIGHT, 1931) because breakdown of a certain combination of alleles already adapted to a local environment will decrease the fitness of the recombinants. However, the importance of epistasis in quantitative traits among less diversified populations is less clear. Some studies showed that the epistatic variance can account for a large proportion of the genetic variance of quantitative traits among progeny of line crosses (CARLBORG et al., 2005; MALMBERG et al., 2005; MALMBERG and MAURICIO, 2005). However, the reported epistatic effects of QTL are most likely biased because many studies that did not show any significant epistatic effects are perhaps not reported in the literature, a phenomenon called the Beavis effect (BEAVIS, 1994; XU, 2003b). The censorship of small or null

epistatic effects biases the reported results upwards.

Although efficient methods have been developed for mapping QTL with main effects (LANDER and BOTSTEIN, 1989; SILLANPAA and ARJAS, 1998; WANG et al., 2005; XU, 2003a; YI et al., 2003a; ZHANG et al., 2005), methods for mapping QTL with epistatic effects are still immature. These methods either utilize models including a single epistatic effect at a time (HOLLAND, 1998; MALMBERG et al., 2005) or apply a model selection strategy that searches for multiple epistatic effects (YI et al., 2003b; 2005; CARLBORG et al., 2000). These methods may not guarantee that all important epistatic effects are detected. Recently, XU (2007) developed an empirical Bayes method that can simultaneously estimate main effects of all individual markers and epistatic effects of all pairs of markers. The algorithm is computationally efficient, so that large sets of data can be analyzed within a very short computing time.

A doubled-haploid barley population was developed by the North American Barley Genome Mapping Project (TINKER et al., 1996). Each genotype was replicated about 25 times. QTL were mapped for seven agronomic traits. On average, there were three to six QTL contributing to the genetic variance of each trait. The results were quite reliable due to the relatively dense marker map, the reasonable sample size and, more importantly, the large number of replications. Because of this, the dataset has been analyzed many times by various investigators to test new statistical models (XU, 2003a; YI et al., 2003a, 2003b; ZHANG et al., 2005; XU, 2007). However, epistatic effects have not been tested in this barley population for all the traits recorded in the experiment. XU (2007) analyzed only the trait kernel weight (KWT) to demonstrate the application of the empirical Bayes method. No general conclusion was made in that study. We conducted a genome-wide analysis of epistatic effects for all the traits using the new method (XU, 2007). The

genome-wide analysis employed was a true multiple effect analysis that required no variable selection. All markers and marker pairs were included in a single model and their effects were estimated simultaneously. Since the genome coverage of the markers was quite high, no QTL or QTL pairs would be missed. The results are reliable so that conclusions can be made inclusively about the relative importance of epistasis in the genetic variance of quantitative traits.

MATERIALS AND METHODS

Experimental population

Data were retrieved from the NABGMP website ( The experimental design and results were reported by TINKER et al. (1996). For the paper to be self-contained, the experiment is briefly described here. The population consisted of 145 doubled haploid (DH) lines of a cross between two related Canadian two-row barley lines, Harrington and TR306. The cross was made by the NABGMP in order to (1) construct a molecular marker map and (2) locate QTL that affect traits of economic importance. These DH lines were evaluated in about 25 replications (environments) for seven quantitative traits: heading (HED), height (HGT), kernel weight (KWT), lodging (LDG), maturity (MAT), test weight (TWT) and yield (YLD). The number of replicates for an individual trait varied from 15 to 29 with an average of 25. The map consisted of 127 markers (mostly RFLP) distributed over seven chromosomes with an average marker interval of 10.5 cM. The genome coverage of the markers was 1270 cM in length. TINKER et al. (1996) identified, on average, three to six QTL per trait, collectively explaining 35 to 50% of the genetic variance. None of the traits were controlled by a major QTL. Some

QTL had interaction effects with the environments, but many showed effects that were consistent across environments. Epistatic effects were not investigated in the original study. The purpose of this analysis was to conduct a genome-wide investigation of the epistatic effects for the seven traits. For simplicity, we took the average phenotypic value of each line across the environments as the input phenotype for that line. Because of the large number of replicates, the average phenotypic value of each line approximately represents the genotypic value of that line. All QTL detected would represent those showing consistent effects across environments. The genotype of each marker was coded as +1 for the TR306 allele and −1 for the Harrington allele. A missing genotype was coded as 0. About 4.9% of the marker genotypes had missing values.

Statistical analysis

Missing marker genotypes were imputed using information from the nearest nonmissing flanking markers. We first used the genotypes of flanking markers to calculate the conditional probability of the missing marker genotype. We then sampled the genotype of the missing marker from this conditional probability. This is called marker imputation. The missing markers were imputed one at a time from one end of the chromosome to the other end. If two or more consecutive markers were missing, the imputed marker genotype for the first missing marker, combined with the first nonmissing marker on the other side of the second missing marker, was used to calculate the probability of the second missing marker genotype. For example, consider five markers in the order ABCDE, where markers B, C and D have missing genotypes. The genotype of marker B is generated using information from markers A and E. The genotype of marker

C is generated from the imputed genotype of marker B and the genotype of marker E. The genotype of marker D is generated from the imputed genotype of marker C and the genotype of marker E. Once all missing markers were imputed for all individuals, we had a set of imputed marker genotypes for the population. This dataset was used as an input dataset to conduct the epistatic analysis. We generated 20 imputed samples of the marker genotype data, and thus analyzed the data 20 times, once for each imputed marker dataset. The estimated parameters represented the average estimates of the 20 imputed samples.

The marker distribution was quite even across the genome and thus, for simplicity, we treated each marker as a putative QTL, i.e., we only estimated the effects of markers. If a QTL was located between two markers, its effect would be picked up by the two flanking markers. Hereafter, we use markers and putative QTL interchangeably.

The empirical Bayes method (XU, 2007) was used to analyze the data. The model is briefly reintroduced here; for the technical details of the method, the reader is referred to the original study (XU, 2007). Let $n = 145$ be the number of DH lines and $m = 127$ be the number of markers. The vector of phenotypic values for a trait is described by the following linear model,

$$y = 1\mu + \sum_{l=1}^{m} Z_l \gamma_l + \sum_{l'>l}^{m} (Z_l \# Z_{l'}) \gamma_{ll'} + \varepsilon \qquad (1)$$

where $y$ is an $n \times 1$ vector, $\mu$ is the population mean, $Z_l = (Z_{1l} \ldots Z_{nl})^T$ is an $n \times 1$ vector of the genotype indicators for locus $l$ ($l = 1, \ldots, m$), $Z_{il}$ takes one of the two values $\{-1, +1\}$ depending on which parental allele has been passed to line $i$ for locus $l$, $\gamma_l$ is the additive (main) effect for locus $l$, $\gamma_{ll'}$ is the epistatic effect between loci $l$ and $l'$, and $\varepsilon$ is the residual error vector with an assumed $N(0, \sigma^2 I)$ distribution. The notation

$Z_l \# Z_{l'}$ represents the direct (element-wise) product of vectors $Z_l$ and $Z_{l'}$. Excluding $\mu$, the total number of QTL effects is $p = m(m+1)/2 = 8128$, including $m = 127$ additive effects and $m(m-1)/2 = 8001$ epistatic effects. We now use $j$ to index the $j$th genetic effect (including additive and epistatic effects) for $j = 1, \ldots, p$. We can rewrite model (1) as

$$y = 1\mu + \sum_{j=1}^{p} X_j \beta_j + \varepsilon. \qquad (2)$$

Comparing model (2) to model (1), we can see that $X_j = Z_l$ and $\beta_j = \gamma_l$ if the $j$th effect is a main effect, and $X_j = Z_l \# Z_{l'}$ and $\beta_j = \gamma_{ll'}$ if the $j$th effect is an epistatic effect. Therefore, model (2) is a general model for both the main and the epistatic effects. As far as the method of estimation is concerned, a distinction between a main effect and an epistatic effect is unnecessary. For convenience of presentation, we always assumed that $X_j$ has been centered and rescaled so that $\sum_{i=1}^{n} X_{ij} = 0$ and $\sum_{i=1}^{n} X_{ij}^2 = n$. In other words, each $X_j$ variable has been standardized to have a zero mean and a unit standard deviation.

Since the number of model effects is $p/n = 56$ times as large as the sample size, the ordinary least squares method would not work. The empirical Bayes method (XU, 2007) adopted a random model approach by treating each QTL effect, say $\beta_j$, as a random variable sampled from an $N(0, \sigma_j^2)$ distribution. The random model regression analysis is essentially a Bayesian regression method (LINDLEY and SMITH, 1972). We used the two-step approach of XU (2007) to estimate the QTL effects. The first step was a typical random model variance component analysis. The maximum likelihood method (HARTLEY and RAO, 1967) was used to estimate the population mean and all the variance components. The number of variance components was 8128,

requiring a special algorithm to handle a model with such a large number of parameters (XU, 2007). The second step involved BLUP (best linear unbiased prediction) estimation of the QTL effects given the estimated variance components (ROBINSON, 1991). The estimated variance components were used only to shrink the estimation of the QTL effects. Because each QTL had its own estimated variance ($\sigma_j^2$), the shrinkage was selective, meaning that spurious and small-effect QTL would be shrunk to zero and large-effect QTL would be subject to virtually no shrinkage. This is the very reason that the shrinkage approach can handle a super-saturated model.

Once the BLUP estimates of QTL effects were obtained, the genetic variance explained by a QTL was taken as the square of $\hat{\beta}_j$. Therefore, the total additive variance was $V_A = \sum_{l=1}^{m} \hat{\gamma}_l^2$ and the total epistatic variance was $V_{AA} = \sum_{l<l'}^{m} \hat{\gamma}_{ll'}^2$, neglecting the covariance caused by linkage. The overall genetic variance was $V_G = V_A + V_{AA}$. The corresponding proportions of the phenotypic variance contributed by the additive, the epistatic and the total genetic variances were defined as $H_A = V_A / V_P$, $H_{AA} = V_{AA} / V_P$ and $H_G = V_G / V_P$, respectively, where $V_P$ is the observed phenotypic variance. Note that only effects deemed to be significant contributed to the calculation of $V_A$, $V_{AA}$ and $V_G$.

How large must an estimated effect be to be declared significant? One can convert each estimated effect into a t-test statistic, $t_j = \hat{\beta}_j / S_{\hat{\beta}_j}$, and compare $t_j$ with a critical value. We found that $\hat{\beta}_j$ was either close to zero (trivial) or noticeably deviated from zero (non-trivial). When $\hat{\beta}_j$ was trivial, the estimation error $S_{\hat{\beta}_j}$ was also close to zero. However, the speed of approaching zero for $S_{\hat{\beta}_j}$ is slower

than that of $\hat{\beta}_j$. When $\hat{\beta}_j$ was non-trivial, $S_{\hat{\beta}_j}$ was roughly constant across all the non-trivial effects. Therefore, we may simply compare $\hat{\beta}_j$ with a critical value drawn from the distribution of $\hat{\beta}_j$. All effects larger than the critical value are declared as significant. The critical value was obtained from analysis of 20 reshuffled samples (permutation test). The $(1 - \alpha) \times 100$ percentile of the distribution of the estimated effects of the reshuffled sample was a good approximation of the critical value, where $\alpha$ is a controlled experimental Type I error.

RESULTS

The data were analyzed using a SAS IML program written by the author. The computer program can be downloaded from our website ( The program took about two minutes to converge for each trait on a Pentium PC with a 3.60 GHz processor and 3.00 GB of RAM. The numbers of replications, estimated means $\hat{\mu}$ and the phenotypic variances $V_P$ are listed in Table 1 for all seven traits. The estimated QTL effects are plotted (3D) in Figure 1. Clearly, additive effects (diagonals of the plots) dominate in each of the seven traits. We chose the critical value at an experimental Type I error of α = for each trait. The critical values obtained from the average of 20 reshuffled samples are also listed in Table 1. The total number of QTL effects detected ($N$) ranged from 13 (KWT) to 21 (MAT and TWT) with an average of 18. The number of significant additive effects ($N_A$) ranged from 8 (LDG) to 17 (TWT) with an average of 12. The number of significant epistatic effects ($N_{AA}$) ranged from 1 (KWT) to 11 (LDG) with an average of 6. Table 1 shows that, on average, additive variance ($H_A$) dominates

over epistatic variance ($H_{AA}$) for all traits, with an average $H_A$ equal to 0.35 and an average $H_{AA}$ equal to The average $H_G$ was 0.41 with KWT having the highest $H_G$ (0.47) and TWT the lowest (0.35). The highest single additive effect ($H_{A(MAX)}$) explained about 18% of the phenotypic variance (KWT) and the highest single epistatic effect ($H_{AA(MAX)}$) explained 2.6% of the phenotypic variance (LDG). Clearly, the total genetic variance contributed by significant QTL effects was predominantly determined by the additive variance. The result was consistent for all seven traits.

Among the detected epistatic effects, we investigated the proportion of the epistatic effects between pairs of loci that both lack main effects ($N_{AA(0)} / N_{AA}$). We found that these epistatic effects accounted for 90% of the total number of epistatic effects. The remaining 10% of the epistatic effects involved pairs of loci in which only one had a significant main effect ($N_{AA(1)} / N_{AA}$). There were no significant epistatic effects that involved pairs of loci in which both had significant main effects. This means that whether a locus interacts with another locus does not depend on whether or not the locus has a main effect.

DISCUSSION

The epistatic effect model used to analyze the barley data is an oversaturated model. The usual solution to the $p > n$ problem in regression is through variable selection (SCHWARZ, 1978). For a model as large as $p = 8128$, an exhaustive search would require evaluation of all possible models, which is impossible to achieve within a reasonable time frame, even using a supercomputer. A heuristic search is

possible but is not guaranteed to find the optimal model. Bayesian model selection (GEORGE and MCCULLOCH, 1993), by taking advantage of MCMC sampling, is a more efficient algorithm than both the exhaustive and heuristic searches. We employed the stochastic search variable selection (SSVS), which is a Bayesian model selection algorithm (GEORGE and MCCULLOCH, 1995; YI et al., 2003a), to analyze this data set, but it failed because $n = 145$ was too small for this large model. Bayesian shrinkage analysis (XU, 2003a) also failed to generate any meaningful results. We then tried the LASSO algorithm (TIBSHIRANI, 1996) and it failed for most of the traits. The failure of all these algorithms was reflected by a strange result: most of the estimated QTL effects were extremely large, so that the total genetic variance (the sum of squares of the estimated QTL effects) was larger than the phenotypic variance by many orders of magnitude. The reasons for the failure of LASSO are unclear, but the small sample size may be responsible.

The MCMC-based analyses (SSVS and Bayesian shrinkage) use hierarchical models, which involve an extremely large number of parameters. The epistatic model is a linear model. Our primary interest is in the regression coefficients (additive and epistatic effects, both denoted by $\beta_j$, $j = 1, \ldots, p$). However, the MCMC-based methods also infer the variance components ($\sigma_j^2$, $j = 1, \ldots, p$), which are the parameters of the prior distributions of the regression coefficients. The methods draw all the variables (regression coefficients and variance components) sequentially from their conditional posterior distributions. The methods try to infer too much from the data, and thus require large sample sizes. The simulation experiments conducted by XU (2007) showed that the MCMC-based methods performed satisfactorily when the sample size was

The method we employed here is a Bayesian method in terms of estimation of the QTL effects. However, the variance components ($\sigma_j^2$) were supposed to be hyperparameters of the normal priors for the regression coefficients. Because there are so many of them, their values are hard to determine a priori. Therefore, we estimated them first using the variance component analysis and then substituted the estimated values for these prior variances, an approach called empirical Bayes (CARLIN and LOUIS, 1996). The empirical Bayes method not only generated meaningful results, but also was efficient in terms of computing speed.

Given that the empirical Bayes method is also a Bayes method, why is it more robust to small sample sizes than the MCMC-based full Bayes methods? Several special properties of the empirical Bayes method may contribute to the robustness. First, the empirical Bayes method does not use a hierarchical model, and thus it infers a smaller number of parameters than the full Bayes methods. Although the empirical Bayes method still estimates the variance components, these variance components are estimated separately using a marginal ML method before the Bayes analysis. By marginal ML we mean that the likelihood function is only a function of the variance components and the regression coefficients have been integrated out. The estimates of the variance components do not depend on the regression coefficients. This is clearly in contrast to the full Bayes methods, in which the regression coefficients and the variance components are inferred sequentially, with the estimate of one parameter depending on estimates of all other parameters. Secondly, the empirical Bayes method does not involve MCMC sampling for inference of the parameter distribution. The MCMC-based full Bayes methods failed not because they generated wrong estimates but because the Markov chains had never

converged to the stationary distribution of the parameters, and thus never generated meaningful estimates of the parameters. Thirdly, the empirical Bayes method employed in this study infers the marginal posterior mean of each regression coefficient, with other regression coefficients integrated out. Therefore, the estimate of one regression coefficient does not depend on the estimates of other regression coefficients. This clearly has improved the robustness of the method to small sample size.

The justification for the robustness of the empirical Bayes method to small sample sizes does not mean that the empirical Bayes method can deal with an arbitrarily small sample. There is a limit below which any method will fail because the data are just too few to contain sufficient information to make any inferences. Is a sample size of 145 DH lines sufficient for QTL mapping? The answer is probably no in general (BEAVIS, 1994; MELCHINGER et al., 1998), but 145 lines seem to be sufficient to infer large QTL in this particular barley population. One reason is that the phenotypic value of each line in each environment was measured by the average value of several plants, all with the same genotype (TINKER et al., 1996). The mean phenotypic value (plot mean) of a line in a particular environment already represented approximately the genotypic value of that line in that environment. The variance of plot means across environments was largely due to G × E interaction. In addition, each line was replicated in many (about 25) environments. After taking the average of the trait values across the environments, we have further reduced the residual error and made the mean value close to the true genotypic value. Certainly, we would not be able to handle such a large number of effects with 145 lines if the trait value for each line were measured from a

single plant. Nevertheless, 145 is not a large number and thus our conclusions are still limited.

Although the empirical Bayes method worked well for this data set, the sample size was not sufficiently large to shrink all small effects to zero. Many spurious QTL effects still occurred. Therefore, we adopted a permutation test to find a suitable criterion to select those QTL with effects sufficiently large to be declared as significant. Theoretically, the critical value should be obtained from many reshuffled samples, say In this study, the number of effects in the model was so large that 20 reshuffled samples appeared to be sufficient. We found that the variance of the critical values obtained from multiple reshuffled samples was small. For example, the average critical value for HED at α = obtained from 20 reshuffled samples was and the standard deviation was (CV = 8%). When taking the average of 20 reshuffled samples, we have reduced the CV to $0.08 / \sqrt{20} = 1.8\%$. The CVs of other traits are all less than 2% (data not shown). Therefore, we used the critical value obtained from 20 reshuffled samples for each trait as an approximation of the true critical value.

We ranked the estimated QTL effects obtained from both the original sample and the reshuffled sample in descending order and investigated the patterns of change of the estimated effects. Figure 2 shows the patterns of change for the top 500 effects (truncated from a total of 8128 effects) for four of the seven traits. The reshuffled samples showed drastically different patterns of change from the original samples. The shaded area of a reshuffled sample, e.g., HED-PERM, resembles a triangle whereas that of the original sample does not. Therefore, simply comparing the patterns of change for the estimated effects from

the original sample, we can tell that there are some large estimated effects which cannot be explained by chance.

Usually, significance tests are not required in Bayesian analysis; only frequentists emphasize significance tests. We employed a permutation test to make inferences about the significance of a QTL effect. This is a hybrid between the Bayesian and frequentist approaches. The purpose of this study was not QTL detection; rather, it was intended to assess the importance of epistasis relative to additivity for economically important quantitative traits in barley. If QTL detection were the purpose, simply reporting all QTL effects that explain more than a proportion ρ of the total phenotypic variance would suffice, where ρ = 0.05 may be a natural choice. If a QTL explains less than a proportion ρ of the total phenotypic variance, it will not be biologically important, even if it may be statistically significant. However, the purpose of this study was to evaluate the genome-wide QTL effects. On one hand, we wanted to include every estimated QTL effect that cannot be explained by chance. On the other hand, we wanted to exclude all the noisy effects. Therefore, we adopted the permutation test to separate the true QTL from the spurious QTL. Of course, the thresholds drawn from the permutation test are arbitrary. For the thresholds chosen in this study, each QTL effect deemed to be significant has a chance of α = of being a false positive.

The empirical Bayes method has already shrunk a large number of effects to zero, but still left many effects deviating from zero. These deviations, although very small individually, collectively contribute a large proportion of the trait variance, because of the extremely large number of epistatic effects included in the model. We noticed that the residual variance has been consumed by the large number of epistatic effects, i.e., the estimated residual

error variance is close to zero in every case. The situation is equivalent to multiple regression using the least squares method, where the residual variance decreases as the number of independent variables increases until it reaches zero when $p = n$. Once $p > n$, the least squares method becomes invalid. The empirical Bayes method still works even if $p$ is many times larger than $n$, but the residual variance is consumed by so many of the spurious effects. If all the estimated effects were included in the calculation of the genetic variance without using some kind of testing criterion, the overall epistatic variance would always dominate over the additive variance (a useless conclusion). This explains why significance tests have been conducted in this study.

The epistatic effects defined in our model are different from the orthogonal contrasts defined by COCKERHAM (1954) and recently reiterated by KAO and ZENG (2002) and ZENG et al. (2005). KAO and ZENG called COCKERHAM's epistatic effects the statistical parameters and the epistatic effects defined here the genetic parameters. The genetic parameters were also called physiological parameters by CHEVERUD and ROUTMAN (1995). The orthogonal contrasts can be expressed as linear functions of the genetic effects defined in our model. This means that COCKERHAM's epistatic effects are functions of the main effects defined in our model. This may justify the practice of KAO and ZENG (2002) of estimating epistatic effects only for loci that both have main effects, because main effects contribute to the orthogonalized epistatic effects. The presence of an epistatic effect between two loci defined in our study does not depend on whether or not the two loci both have main effects. This has been demonstrated by the analysis of this experiment. Taking trait HED as an example, we detected five epistatic effects, but none of the effects involved two loci that both had significant main effects. One epistatic effect involved a pair of loci

of which only one had a significant main effect (1/5 = 0.2). The remaining four epistatic effects (4/5 = 0.8) involved pairs of loci that both lack main effects. In traditional QTL mapping, people normally scan the genome for QTL with main effects. Once QTL with main effects are identified, an epistatic model is fit to examine the epistatic effects only between the QTL that have main effects. This approach certainly has no logical basis. We would not have been able to detect any epistatic effects for the barley data if this approach had been taken.

Our conclusion from the barley data analysis was that epistatic variance does contribute to the genetic variance. However, the cumulative contribution from significant epistatic effects is very small relative to that from the additive effects. This finding was consistent for all seven traits investigated. A recent study using F2 progeny of a cross between two inbred mouse lines for obesity-related traits showed that many epistatic effects were significant, but they were all small relative to the additive effects (YI et al., 2006). The largest main effect QTL contributed about 20% of the trait variance but the largest epistatic effect accounted for only about 5% of the trait variance. The result from the mouse experiment was similar to that of the barley analysis. YI et al. (2006) used a Bayesian model selection method to analyze the mouse data. Although the model used by YI et al. (2006) was not saturated, it was a multiple effects model in which multiple main effects and epistatic effects were simultaneously estimated in a single model, a feature also shared by the method used in this study. Therefore, simultaneous estimation of all model effects seemed to support the notion that epistasis is not an important contributor to the genetic variance of less diversified varieties of crops or breeds of animals.
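The bookkeeping behind the quantities discussed above (the variance proportions H_A, H_AA and H_G, and the split of significant epistatic effects by how many of the two loci carry a significant main effect) can be sketched in Python as follows. This is an illustrative sketch, not the authors' SAS IML program: the function name `summarize_effects`, the use of absolute values when comparing an effect with the critical value, and the toy numbers are all assumptions.

```python
import numpy as np


def summarize_effects(main_hat, epi_hat, pairs, critical, v_p):
    """Summarize estimated effects given a critical value (hypothetical helper).

    main_hat : estimated main effects, one per locus
    epi_hat  : estimated epistatic effects, one per entry of `pairs`
    pairs    : list of (l, l') locus index pairs
    critical : threshold, e.g. from a permutation test (assumed applied to |effect|)
    v_p      : observed phenotypic variance
    """
    main_hat = np.asarray(main_hat, dtype=float)
    epi_hat = np.asarray(epi_hat, dtype=float)
    sig_main = np.abs(main_hat) > critical
    sig_epi = np.abs(epi_hat) > critical

    # Variance explained is the sum of squared significant effects
    # (columns assumed standardized, covariance due to linkage neglected).
    v_a = float(np.sum(main_hat[sig_main] ** 2))
    v_aa = float(np.sum(epi_hat[sig_epi] ** 2))

    # Count significant pairs by number of loci with a significant main effect.
    by_main = {0: 0, 1: 0, 2: 0}
    for (l, k), keep in zip(pairs, sig_epi):
        if keep:
            by_main[int(sig_main[l]) + int(sig_main[k])] += 1

    return {
        "H_A": v_a / v_p,
        "H_AA": v_aa / v_p,
        "H_G": (v_a + v_aa) / v_p,
        "N_A": int(sig_main.sum()),
        "N_AA": int(sig_epi.sum()),
        "by_main": by_main,
    }


# Toy example with four loci (made-up numbers, not barley estimates).
pairs = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3)]
main_hat = [0.5, 0.0, 0.02, 0.0]
epi_hat = [0.0, 0.0, 0.2, 0.3, 0.0, 0.01]
summary = summarize_effects(main_hat, epi_hat, pairs, critical=0.1, v_p=1.0)
print(summary["N_A"], summary["N_AA"], summary["by_main"])  # 1 2 {0: 1, 1: 1, 2: 0}
```

In the toy example one significant pair involves a locus with a main effect and the other involves none, which mirrors the kind of tabulation reported for HED (four pairs with no main effects, one pair with one).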

ACKNOWLEDGEMENTS

We are grateful to two anonymous reviewers for their useful comments and suggestions on the first version of the manuscript, which have significantly improved the presentation of the current manuscript. The research was supported by the National Institutes of Health Grant R01-GM55321 and the National Science Foundation Grant DBI to SX.

LITERATURE CITED

BEAVIS, W. D., 1994 The power and deceit of QTL experiments: Lessons from comparative QTL studies, pp in Proceedings of the Forty-Ninth Annual Corn & Sorghum Industry Research Conference. American Seed Trade Association, Washington, D.C.
CARLBORG, O., L. ANDERSSON and B. KINGHORN, 2000 The use of a genetic algorithm for simultaneous mapping of multiple interacting quantitative trait loci. Genetics 155:
CARLBORG, O., G. A. BROCKMANN and C. S. HALEY, 2005 Simultaneous mapping of epistatic QTL in DU6i x DBA/2 mice. Mammalian Genome 16:
CARLIN, B. P., and T. A. LOUIS, 1996 Bayes and Empirical Bayes Methods for Data Analysis. Chapman & Hall/CRC, London.
CHEVERUD, J. M., and E. J. ROUTMAN, 1995 Epistasis and its contribution to genetic variance components. Genetics 139:
COCKERHAM, C. C., 1954 An extension of the concept of partitioning hereditary variance for analysis of covariances among relatives when epistasis is present. Genetics 39:
FISHER, R. A., 1918 The correlations between relatives on the supposition of Mendelian inheritance. Trans. R. Soc. Edinburgh 52:
GEORGE, E. I., and R. E. MCCULLOCH, 1993 Variable selection via Gibbs sampling. J. Am. Stat. Assoc. 91:
GEORGE, E. I., and R. E. MCCULLOCH, 1995 Stochastic search variable selection, pp in Markov Chain Monte Carlo in Practice, edited by W. R. GILKS, S. RICHARDSON and D. J. SPIEGELHALTER. Chapman and Hall, London.
GOODNIGHT, C. J., 1988 Epistasis and the effect of founder events on the additive genetic variance. Evolution 42:
GOODNIGHT, C. J., 2000 Quantitative trait loci and gene interaction: the quantitative genetics of metapopulations. Heredity 84:
HARTLEY, H. O., and J. N. K. RAO, 1967 Maximum-likelihood estimation for the mixed analysis of variance model. Biometrika 54:
HOLLAND, J. B., 1998 EPISTACY: a SAS program for detecting two-locus epistatic interactions using marker information. J. Hered. 89:
JANNINK, J.-L., 2003 Selection dynamics and limits under additive x additive epistatic gene action. Crop Sci. 43:

KAO, C.-H., and Z.-B. ZENG, 2002 Modeling epistasis of quantitative trait loci using Cockerham's model. Genetics 160:
LANDER, E. S., and D. BOTSTEIN, 1989 Mapping Mendelian factors underlying quantitative traits using RFLP linkage maps. Genetics 121:
LINDLEY, D. V., and A. F. M. SMITH, 1972 Bayes estimates for the linear model. J. R. Statist. Soc. B 34:
MALMBERG, R. L., S. HELD, A. WAITS and R. MAURICIO, 2005 Epistasis for fitness-related quantitative traits in Arabidopsis thaliana grown in the field and in the greenhouse. Genetics 171:
MALMBERG, R. L., and R. MAURICIO, 2005 QTL-based evidence for the role of epistasis in evolution. Genetical Research 86:
MELCHINGER, A. E., H. F. UTZ and C. C. SCHÖN, 1998 Quantitative trait locus (QTL) mapping using different testers and independent population samples in maize reveals low power of QTL detection and large bias in estimates of QTL effects. Genetics 149:
ROBINSON, G. K., 1991 That BLUP is a good thing: The estimation of random effects. Statistical Science 6:
SCHWARZ, G., 1978 Estimating the dimension of a model. Ann. Statist. 6:
SILLANPÄÄ, M. J., and E. ARJAS, 1998 Bayesian mapping of multiple quantitative trait loci from incomplete inbred line cross data. Genetics 148:
TIBSHIRANI, R., 1996 Regression shrinkage and selection via the lasso. J. R. Statist. Soc. B 58:
TINKER, N. A., D. E. MATHER, B. G. ROSSNAGEL, K. J. KASHA, A. KLEINHOFS et al., 1996 Regions of the genome that affect agronomic performance in two-row barley. Crop Sci. 36:
WANG, H., Y. M. ZHANG, X. LI, G. L. MASINDE, S. MOHAN et al., 2005 Bayesian shrinkage estimation of quantitative trait loci parameters. Genetics 170:
WRIGHT, S., 1931 Evolution in Mendelian populations. Genetics 16:
XU, S., 2003a Estimating polygenic effects using markers of the entire genome. Genetics 163:
XU, S., 2003b Theoretical basis of the Beavis effect. Genetics 165:
XU, S., 2007 An empirical Bayes method for estimating epistatic effects of quantitative trait loci. Biometrics (in press)

YI, N., V. GEORGE and D. B. ALLISON, 2003a Stochastic search variable selection for identifying quantitative trait loci. Genetics 164:
YI, N., S. XU and D. B. ALLISON, 2003b Bayesian model choice and search strategies for mapping interacting quantitative trait loci. Genetics 165:
YI, N., B. S. YANDELL, G. A. CHURCHILL, D. B. ALLISON, E. J. EISEN et al., 2005 Bayesian model selection for genome-wide epistatic quantitative trait loci analysis. Genetics 170:
YI, N., D. K. ZINNIEL, K. KIM, E. J. EISEN, A. BARTOLUCCI et al., 2006 Bayesian analysis of multiple epistatic QTL models for body weight and body composition in mice. Genetical Research 87:
ZENG, Z.-B., T. WANG and W. ZOU, 2005 Modeling quantitative trait loci and interpretation of models. Genetics 169:
ZHANG, M., K. L. MONTOOTH, M. T. WELLS, A. G. CLARK and D. ZHANG, 2005 Mapping multiple quantitative trait loci by Bayesian classification. Genetics 169:

FIGURE LEGENDS

Figure 1. QTL effects for seven agronomic traits of the Harrington × TR306 (H × T) doubled-haploid barley population. The main (additive) effects are on the diagonals and the epistatic effects are on the left triangle of the 3D plots. Blue prisms represent positive effects (T allele > H allele) and gray prisms represent negative effects (H allele > T allele).

Figure 2. Plots of QTL effects against their rankings (in descending order) for four agronomic traits in the Harrington × TR306 doubled-haploid barley population. Each panel in the right column shows the plot for a randomly reshuffled sample of the corresponding trait. The total number of effects included in the model was 8128, but the plots show only the top 500 effects (truncated data).
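The total of 8128 effects quoted in the Figure 2 legend is consistent with the marker count given in the abstract: 127 main effects plus one interaction effect for each of the 127 × 126 / 2 = 8001 marker pairs. As a minimal sketch (the variable names are illustrative and are not taken from the authors' software), the count can be checked as follows:

```python
from itertools import combinations

# Number of markers reported for the Harrington x TR306 population.
n_markers = 127

# One main (additive) effect per marker.
n_main = n_markers

# One epistatic effect per unordered pair of distinct markers.
n_epistatic = sum(1 for _ in combinations(range(n_markers), 2))

# Total number of effects fitted in the single model.
n_total = n_main + n_epistatic
print(n_main, n_epistatic, n_total)
```

Enumerating the pairs explicitly, rather than applying the closed-form formula, mirrors how the interaction terms would be indexed when building a design matrix with one column per marker pair.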

Figure 1

Figure 2

Table 1. Summary statistics for seven agronomic traits in the Harrington × TR306 doubled-haploid barley population.

Columns (traits): HED, HGT, KWT, LDG, MAT, TWT, YLD, Average
Rows: Replication; Mean; Variance; Critical value; N_A; N_AA; N; V_A; V_AA; V_G; H_A; H_AA; H_G; N_A/N_AA (0); N_A/N_AA (1); N_A/N_AA (2); H_A(MAX); H_AA(MAX)

N_A: Number of additive (main) effects
N_AA: Number of epistatic effects
N: Total number of effects
V_A: Variance of additive effects
V_AA: Variance of epistatic effects
V_G: Total genetic variance
H_A: Proportion of additive variance
H_AA: Proportion of epistatic variance
H_G: Proportion of total genetic variance
N_A/N_AA (0): Proportion of epistatic effects between pairs of loci in which neither locus has a main effect
N_A/N_AA (1): Proportion of epistatic effects between pairs of loci in which only one locus has a main effect
N_A/N_AA (2): Proportion of epistatic effects between pairs of loci in which both loci have main effects
H_A(MAX): H_A of the largest additive effect
H_AA(MAX): H_AA of the largest epistatic effect


On Regression Analysis Using Bivariate Extreme Ranked Set Sampling On Regression Analysis Using Bivariate Extreme Ranked Set Sampling Atsu S. S. Dorvlo atsu@squ.edu.om Walid Abu-Dayyeh walidan@squ.edu.om Obaid Alsaidy obaidalsaidy@gmail.com Abstract- Many forms of ranked

More information

Lesson 9: Two Factor ANOVAS

Lesson 9: Two Factor ANOVAS Published on Agron 513 (https://courses.agron.iastate.edu/agron513) Home > Lesson 9 Lesson 9: Two Factor ANOVAS Developed by: Ron Mowers, Marin Harbur, and Ken Moore Completion Time: 1 week Introduction

More information

Analysis of Vaccine Effects on Post-Infection Endpoints Biostat 578A Lecture 3

Analysis of Vaccine Effects on Post-Infection Endpoints Biostat 578A Lecture 3 Analysis of Vaccine Effects on Post-Infection Endpoints Biostat 578A Lecture 3 Analysis of Vaccine Effects on Post-Infection Endpoints p.1/40 Data Collected in Phase IIb/III Vaccine Trial Longitudinal

More information

Statistical Techniques. Meta-Stat provides a wealth of statistical tools to help you examine your data. Overview

Statistical Techniques. Meta-Stat provides a wealth of statistical tools to help you examine your data. Overview 7 Applying Statistical Techniques Meta-Stat provides a wealth of statistical tools to help you examine your data. Overview... 137 Common Functions... 141 Selecting Variables to be Analyzed... 141 Deselecting

More information

Chapter 5: Field experimental designs in agriculture

Chapter 5: Field experimental designs in agriculture Chapter 5: Field experimental designs in agriculture Jose Crossa Biometrics and Statistics Unit Crop Research Informatics Lab (CRIL) CIMMYT. Int. Apdo. Postal 6-641, 06600 Mexico, DF, Mexico Introduction

More information

CHAPTER VI RESEARCH METHODOLOGY

CHAPTER VI RESEARCH METHODOLOGY CHAPTER VI RESEARCH METHODOLOGY 6.1 Research Design Research is an organized, systematic, data based, critical, objective, scientific inquiry or investigation into a specific problem, undertaken with the

More information

Identification of Tissue Independent Cancer Driver Genes

Identification of Tissue Independent Cancer Driver Genes Identification of Tissue Independent Cancer Driver Genes Alexandros Manolakos, Idoia Ochoa, Kartik Venkat Supervisor: Olivier Gevaert Abstract Identification of genomic patterns in tumors is an important

More information

MBG* Animal Breeding Methods Fall Final Exam

MBG* Animal Breeding Methods Fall Final Exam MBG*4030 - Animal Breeding Methods Fall 2007 - Final Exam 1 Problem Questions Mick Dundee used his financial resources to purchase the Now That s A Croc crocodile farm that had been operating for a number

More information

Running Head: BAYESIAN MEDIATION WITH MISSING DATA 1. A Bayesian Approach for Estimating Mediation Effects with Missing Data. Craig K.

Running Head: BAYESIAN MEDIATION WITH MISSING DATA 1. A Bayesian Approach for Estimating Mediation Effects with Missing Data. Craig K. Running Head: BAYESIAN MEDIATION WITH MISSING DATA 1 A Bayesian Approach for Estimating Mediation Effects with Missing Data Craig K. Enders Arizona State University Amanda J. Fairchild University of South

More information

T. R. Golub, D. K. Slonim & Others 1999

T. R. Golub, D. K. Slonim & Others 1999 T. R. Golub, D. K. Slonim & Others 1999 Big Picture in 1999 The Need for Cancer Classification Cancer classification very important for advances in cancer treatment. Cancers of Identical grade can have

More information

Bayesian Mediation Analysis

Bayesian Mediation Analysis Psychological Methods 2009, Vol. 14, No. 4, 301 322 2009 American Psychological Association 1082-989X/09/$12.00 DOI: 10.1037/a0016972 Bayesian Mediation Analysis Ying Yuan The University of Texas M. D.

More information

An Introduction to Quantitative Genetics

An Introduction to Quantitative Genetics An Introduction to Quantitative Genetics Mohammad Keramatipour MD, PhD Keramatipour@tums.ac.ir ac ir 1 Mendel s work Laws of inheritance Basic Concepts Applications Predicting outcome of crosses Phenotype

More information

Mendelian Genetics & Inheritance Patterns. Practice Questions. Slide 1 / 116. Slide 2 / 116. Slide 3 / 116

Mendelian Genetics & Inheritance Patterns. Practice Questions. Slide 1 / 116. Slide 2 / 116. Slide 3 / 116 New Jersey Center for Teaching and Learning Slide 1 / 116 Progressive Science Initiative This material is made freely available at www.njctl.org and is intended for the non-commercial use of students and

More information

Progressive Science Initiative. Click to go to website:

Progressive Science Initiative. Click to go to website: Slide 1 / 116 New Jersey Center for Teaching and Learning Progressive Science Initiative This material is made freely available at www.njctl.org and is intended for the non-commercial use of students and

More information

S Imputation of Categorical Missing Data: A comparison of Multivariate Normal and. Multinomial Methods. Holmes Finch.

S Imputation of Categorical Missing Data: A comparison of Multivariate Normal and. Multinomial Methods. Holmes Finch. S05-2008 Imputation of Categorical Missing Data: A comparison of Multivariate Normal and Abstract Multinomial Methods Holmes Finch Matt Margraf Ball State University Procedures for the imputation of missing

More information

Adjusting for mode of administration effect in surveys using mailed questionnaire and telephone interview data

Adjusting for mode of administration effect in surveys using mailed questionnaire and telephone interview data Adjusting for mode of administration effect in surveys using mailed questionnaire and telephone interview data Karl Bang Christensen National Institute of Occupational Health, Denmark Helene Feveille National

More information

Chapter 1. Introduction

Chapter 1. Introduction Chapter 1 Introduction 1.1 Motivation and Goals The increasing availability and decreasing cost of high-throughput (HT) technologies coupled with the availability of computational tools and data form a

More information

Quantitative Genetics. Statistics Overview: Mean. Statistics Overview: Variance. Statistics Overview: Distributions. Chapter 22

Quantitative Genetics. Statistics Overview: Mean. Statistics Overview: Variance. Statistics Overview: Distributions. Chapter 22 Quantitative Genetics Chapter Statistics Overview: Distributions Phenotypes on X axis, Frequencies on Y axis Statistics Overview: Mean Measure of central tendency (average) of a group of measurements X

More information

Multilevel IRT for group-level diagnosis. Chanho Park Daniel M. Bolt. University of Wisconsin-Madison

Multilevel IRT for group-level diagnosis. Chanho Park Daniel M. Bolt. University of Wisconsin-Madison Group-Level Diagnosis 1 N.B. Please do not cite or distribute. Multilevel IRT for group-level diagnosis Chanho Park Daniel M. Bolt University of Wisconsin-Madison Paper presented at the annual meeting

More information

Variation in Measurement Error in Asymmetry Studies: A New Model, Simulations and Application

Variation in Measurement Error in Asymmetry Studies: A New Model, Simulations and Application Symmetry 2015, 7, 284-293; doi:10.3390/sym7020284 Article OPEN ACCESS symmetry ISSN 2073-8994 www.mdpi.com/journal/symmetry Variation in Measurement Error in Asymmetry Studies: A New Model, Simulations

More information