Analyzing Developmental Trajectories: A Semiparametric, Group-Based Approach


Psychological Methods, 1999, Vol. 4, No. 2
Copyright 1999 by the American Psychological Association, Inc.

Daniel S. Nagin
Carnegie Mellon University

A developmental trajectory describes the course of a behavior over age or time. A group-based method for identifying distinctive groups of individual trajectories within the population and for profiling the characteristics of group members is demonstrated. Such clusters might include groups of "increasers," "decreasers," and "no changers." Suitably defined probability distributions are used to handle 3 data types: count, binary, and psychometric scale data. Four capabilities are demonstrated: (a) the capability to identify rather than assume distinctive groups of trajectories, (b) the capability to estimate the proportion of the population following each such trajectory group, (c) the capability to relate group membership probability to individual characteristics and circumstances, and (d) the capability to use the group membership probabilities for various other purposes such as creating profiles of group members.

Over the past decade, major advances have been made in methodology for analyzing individual-level developmental trajectories. The two main branches of methodology are hierarchical modeling (Bryk & Raudenbush, 1987, 1992; Goldstein, 1995) and latent curve analysis (McArdle & Epstein, 1987; Meredith & Tisak, 1990; Muthén, 1989; Willett & Sayer, 1994). These advanced methods allow researchers to move beyond the use of ad hoc categorization procedures for constructing developmental trajectories. Although these two classes of methodology differ in very important respects, they also have important commonalities (MacCallum, Kim, Malarkey, & Kiecolt-Glaser, 1997; Muthén & Curran, 1997; Raudenbush, in press; Willett & Sayer, 1994).
For the purposes of this article, one is key: Both model the unconditional and conditional population distribution of growth curves based on continuous distribution functions. Unconditional models estimate two key features of the population distribution of growth curve parameters: their mean and covariance structure. The former defines average growth within the population, and the latter calibrates the variances of growth throughout the population. The conditional models are designed to explain this variability by relating growth parameters to one or more explanatory variables. This article demonstrates a distinct semiparametric, group-based approach for modeling developmental trajectories. The modeling strategy is intended to provide a flexible and easily applied approach for identifying distinctive clusters of individual trajectories within the population and for profiling the characteristics of individuals within the clusters. Using mixtures of suitably defined probability distributions, the method identifies distinctive groups of developmental trajectories within the population (Land & Nagin, 1996; Nagin, Harrington, & Moffitt, 1995; Nagin & Land, 1993; Nagin & Tremblay, in press; Roeder, Lynch, & Nagin, in press).

Author note. This work was supported by National Science Foundation Grant SBR and by the National Consortium for Violence Research. I thank Robert Brame, Lisa Broidy, Patrick Curran, Kenneth Dodge, Laura Dugan, Anne Garvin, Don Lynam, Steve Raudenbush, Kathryn Roeder, Robert Siegler, and Larry Wasserman for helpful comments on previous versions of this article. Correspondence concerning this article should be addressed to Daniel S. Nagin, H. John Heinz III School of Public Policy & Management, Carnegie Mellon University, Pittsburgh, Pennsylvania.
Thus, whereas the hierarchical and latent curve methodologies model population variability in growth with multivariate continuous distribution functions, the group-based approach uses a multinomial modeling strategy and is designed to identify relatively homogeneous clusters of developmental trajectories.

The group-based modeling strategy demonstrated in this article is presaged in the work of Rindskopf (1990). Rindskopf's methodology is fully nonparametric. It is designed for analysis of repeated measurements of dichotomous response data. The repeated measurements may come in the form of responses to a series of test items or, alternatively, in the form of panel data. Rindskopf's method is designed to identify distinct groups of response sequences across sampled individuals. The method described here expands on Rindskopf's innovative work in several important respects: it increases the variety of response variables to which a group-based modeling strategy can be applied, it provides a basis for linking group membership probability to relevant individual-level characteristics, and it describes a formal approach for determining the optimal number of groups.

Rationale for a Group-Based Modeling Strategy

Before discussing the technical features of the model, I briefly speak to the underlying rationale for a group-based modeling strategy. There is a long tradition in psychology of group-based theorizing about development. Examples include theories of personality development (Caspi, 1998), drug use (Kandel, 1975), learning (Holyoak & Spellman, 1993), language and conceptual development (Markman, 1989), development of prosocial behaviors such as altruism, and development of antisocial behaviors such as delinquency (Loeber, 1991; Moffitt, 1993; Patterson, DeBaryshe, & Ramsey, 1989).

This group-based approach is well suited to analyzing questions about developmental trajectories that are inherently categorical: Do certain types of people tend to have distinctive developmental trajectories? For example, Moffitt (1993) and Patterson et al.
(1989) have argued that early onset of delinquency is a distinctive feature of the developmental trajectories of chronically criminal and antisocial adults but not of individuals whose offending is limited to their adolescence. The group-based modeling strategy can test whether the developmental trajectories predicted by the theories are actually present in the population. It can also be used to test key predictions such as whether the chronic trajectory is typified by an earlier onset of delinquent behavior than the adolescent-limited trajectory. In contrast, because hierarchical and latent curve modeling assume a continuous distribution of trajectories within the population, these methods are not designed to identify distinct clusters of trajectories. Rather they are designed to describe how patterns of growth vary continuously throughout the population. As a result, it is awkward to use these methods to address research questions that contrast distinct categories of developmental trajectories.

The group-based modeling approach assumes that the population is composed of a mixture of distinct groups defined by their developmental trajectories. The assumption that the population is composed of distinct groups is not likely literally correct. Unlike biological or physical phenomena, in which populations may be composed of actually distinct groups such as different types of animal or plant species, population differences in developmental trajectories of behavior are unlikely to reflect such bright-line differences (although biology has a long tradition of debates concerning classification; see Appel, 1987). To be sure, taxonomic theories predict different trajectories of development across subpopulations, but the purpose of such taxonomies is generally to draw attention to differences in the causes and consequences of different developmental trajectories within the population rather than to suggest that the population is composed of literally distinct groups.
In this respect, developmental taxonomies are analogous to prototypal diagnostic classifications in clinical psychology. Unlike classical classifications, prototypal classifications acknowledge fuzziness in classification, employ polythetic diagnostic criteria, and recognize within-group variation in the prevalence and severity of symptoms (Cantor & Genero, 1986; Widiger & Frances, 1985). The group-based modeling strategy is prototypal in design and provides a methodological complement to theories that predict prototypal developmental etiologies and trajectories within the population. The modeling strategy explicitly recognizes uncertainty in group membership, allows an examination of the impact of multiple factors on probability of group membership, and anoints no set of factors as necessary and sufficient in determining group membership.

With this approach the basic elements of taxonomic theories can be directly tested: Are the relatively homogeneous developmental trajectories and etiologies predicted by the theory actually present in the population? In so doing, the method provides an alternative to using assignment rules based on subjective categorization criteria to construct categories of developmental trajectories. Although such assignment rules are generally reasonable, there are limitations and pitfalls attendant to their use. One is that the existence of the various developmental trajectories that underlie the taxonomic theory cannot be tested; they must be assumed a priori. A related pitfall of constructing groups with subjective classification procedures is overfitting, that is, the creation of trajectory groups that reflect only random variation. Second, ex ante specified rules provide no basis for calibrating the precision of individual classifications to the various groups that compose the taxonomy. The semiparametric, group-based method presented in this article avoids each of these limitations. It provides a formal basis for determining the number of groups that best fits the data and also provides an explicit metric, the posterior probability of group membership, for evaluating the precision of group assignments.

Data and Three Illustrative Examples

Throughout, key methodological points are illustrated with findings from the analysis of two well-known longitudinal studies. One is the Cambridge Study of Delinquent Development, which tracked a sample of 411 British males from a working-class area of London. Data collection began in , when most of the boys were age 8. Criminal involvement is measured by conviction for criminal offenses and is available for all individuals in the sample through age 32, with the exception of the 8 individuals who died prior to this age. Between ages 10 and 32 a wealth of data was assembled on each individual's psychological makeup, family circumstances including parental behaviors, and performance in school and work. For a complete discussion of the data set, see Farrington and West (1990). The other data set is from a prospective longitudinal study based in Montreal, Canada.
This study has been tracking 1,037 White males of French ancestry. Participants were selected in 1984 from kindergarten classes in low-socioeconomic-status Montreal neighborhoods. Following the assessment at age 6, the boys and other informants were interviewed annually from ages 10 to 18. Like the Cambridge study, assessments were made on a wide range of factors. Among these is physical aggression, which was measured on the basis of teacher ratings using the Social Behavior Questionnaire (Tremblay, Desmarais-Gervais, Gagnon, & Charlebois, 1987). The boys were rated on three items: kicking, biting, and hitting other children; fighting with other children; and bullying and intimidating other children. See Tremblay et al. for further details on this study. Figures 1, 2, and 3 show the results of trajectory analyses based on these two data sets. The software used to estimate these models is a customized SAS procedure that was developed with the SAS product SAS/TOOLKIT. The procedure, which is currently available for the PC platform only, is a program written in the C programming language and is designed to interface with the SAS system to perform model fitting. It accommodates missing data. Thus, individuals with incomplete assessment histories do not have to be dropped from the analysis. Also, time between assessment periods does not have to be equally spaced, and assessment periods do not have to be identical across participants. The software, which is easily used and incorporated into the standard SAS package, is available on request and is described in detail in Jones, Nagin, and Roeder (1998). Figure 1 displays the results of the analysis of the Cambridge data. The response variable in this analysis is a count, the number of convictions for each 2-year period from ages 10 to 31 years. The solid lines represent actual behavior, and the dashed lines represent predicted behavior. 
Three groups were identified: A group called "the never convicted" was composed of respondents who, but for a few individuals with a single conviction, were not convicted throughout the observation period. This group accounts for an estimated 71% of the sampled population. A second group of individuals who ceased their offending (as measured by conviction) by their early 20s was labeled "adolescent limited" after Moffitt (1993). This group is estimated to constitute 22% of the population. Finally, a chronic group was identified and is estimated to make up 7% of the population. Individuals in this group offended at a high level throughout the observation period.

Footnote 1. Predicted behavior is calculated as the expected value of the random variable depicting each group's behavior. Expected values are computed based on model coefficient estimates. For the application depicted in Figure 1, this expectation equals the antilog of Equation 1. For the application depicted in Figure 2, it is calculated according to the relationship provided in footnote 4. For the application depicted in Figure 3, it equals the binomial probability as computed by Equation 3. Actual behavior is computed as the mean behavior of all persons assigned to the various groups identified in estimation. As described hereinafter, the assignments are based on the posterior probability of group membership.

Figure 1. Trajectories of number of convictions (Cambridge sample), plotted against age. Series: actual and predicted trajectories for the never, adolescent-limited, and chronic groups. Adol. = adolescent; pred. = predicted.

Figure 2 displays results based on the Montreal data. In this analysis the response variable was a psychometric scale of physical aggression. A four-group model was found to best fit the data. A group called "nevers" is composed of individuals who never display physically aggressive behavior to any substantial degree. This group is estimated to make up about 15% of the sample population. A second group that constitutes about 50% of the population is best labeled "low-level desisters." At age 6 boys in this group displayed modest levels of physical aggression, but by age 10 they had largely desisted. A third group, constituting about 30% of the population, was labeled "high-level near desisters." This group started off scoring high on physical aggression at age 6 but by age 15 scored far lower. Notwithstanding this marked decline, at age 15 they continued to display a modest level of physical aggression. Finally, there is a small chronic group, constituting about 5% of the population, who displayed high levels of physical aggression throughout the observation period.

Figure 3 displays another variant of the analysis of the physical aggression data. For this analysis the aggression data for each assessment period were transformed from an intensity scale over the interval 0 to 6 to a binary indicator equal to 1 if the individual displayed any evidence of physical aggression in that period (i.e., if his score was greater than 0). Such a symptom indicator variable is an example of binary data, a type of data that is widely analyzed in the social sciences. It is also the type of data Rindskopf's group-based method was designed to analyze.
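The transformation from the 0 to 6 intensity scale to a symptom indicator can be sketched in a few lines. This is only an illustration: the function name `symptom_indicator`, the sample ratings, and the choice to pass missing assessments through as `None` are assumptions made here, not part of the study's software.

```python
def symptom_indicator(scores):
    """Map 0-6 scale scores to 1 (any physical aggression) or 0 (none).

    Missing assessments (None) are passed through unchanged.
    """
    return [None if s is None else int(s > 0) for s in scores]


# Hypothetical ratings for one boy across five assessment periods.
ratings = [0, 2, None, 6, 0]
indicators = symptom_indicator(ratings)
print(indicators)
```

Any positive score, whether 1 or 6, maps to the same indicator value, which is why this variant discards intensity information that the psychometric scale analysis retains.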
Three symptom trajectories are depicted in Figure 3. One group, constituting about 20% of the population, never displayed any symptoms. This group is the counterpart of the nevers in Figure 2. A second group includes about 50% of the population. It started off with a fairly high probability of displaying symptoms of physical aggression, about .6, but this probability declined rapidly to near 0 by age 15. The final group, which is estimated to account for about 30% of the population, began with a probability of physical aggression that is only modestly higher than that of the second group, but the subsequent rate of decline was far slower. Even at age 15, the probability of their displaying symptoms of physical aggression is about .4.

Figure 2. Trajectories of physical aggression (Montreal sample). Series: actual and predicted trajectories for the never, low desister, high desister, and chronic groups. pred. = predicted.

Model

Figure 4 provides an overview of the model and its key outputs. Model estimation does not require the ex ante sorting of the data among the various groups depicted in Figures 1, 2, and 3. To the contrary, the data are used to identify the number of groups that best fits the data and the shape of the trajectory for each group. The data also provide an estimate of the proportion of the population whose measured behavior conforms most closely to each trajectory group. In the discussion that follows, I first lay out the general form of the model for any given number of groups. Next, I discuss the determination of the optimal number of groups and the shape of trajectory for each group. I then move to a discussion of how the model's parameter estimates can be used to compute the final element depicted in Figure 4: the probability that each individual i in the estimation sample belongs to each of the trajectory groups identified in estimation. These capabilities are illustrated with results from the Cambridge and Montreal data sets. For the Montreal data the censored normal analysis is used for this purpose. However, all of these capabilities are equally well employed in the analysis of binary data.

Differences in the form of the response variable in the analyses depicted in Figures 1, 2, and 3 (a count variable, a psychometric scale, and a binary variable) necessitate technical differences in the statistical model used in each analysis. For the count data the underlying mixture model builds from the Poisson distribution and its even more general relative, the zero-inflated Poisson (Lambert, 1993).
The Poisson family of distributions is widely used in the analysis of count data (Cameron & Trivedi, 1986; Parzen, 1960). For the psychometric scale data the underlying model is based on the censored normal distribution. The censored normal distribution is well suited to accommodate a common feature of psychometric scale data. Typically, a sizable contingent of the sample exhibits none of the behaviors measured by the scale. The result is a clustering of data at the scale minimum. Also, there is usually a smaller contingent that exhibits all of the behaviors measured by the scale. The result is another cluster of data at the scale

maximum. Finally, the binary logit distribution is used to model binary data. See Maddala (1983) or Greene (1990) for a thorough discussion of the censored normal and binary logit distributions.

Figure 3. Trajectories of symptoms of physical aggression (Montreal sample). Recovered series labels: desister (actual, predicted) and non-desister (actual, predicted). pred. = predicted.

Like hierarchical and latent curve modeling, a polynomial relationship is used to model the link between age and behavior. Specifically, a quadratic relationship is assumed (see Footnote 2). For the Poisson-based model it is assumed that

$\log(\lambda^j_{it}) = \beta^j_0 + \beta^j_1 \mathrm{Age}_{it} + \beta^j_2 \mathrm{Age}^2_{it}$, (1)

where $\lambda^j_{it}$ is the expected number of occurrences of the event of interest (e.g., convictions) of subject i at time t given membership in group j, $\mathrm{Age}_{it}$ is subject i's age at time t, and $\mathrm{Age}^2_{it}$ is the square of subject i's age at time t (see Footnote 3). The model's coefficients $\beta^j_0$, $\beta^j_1$, and $\beta^j_2$ determine the shape of the trajectory and are superscripted by j to denote that the coefficients are not constrained to be the same across the j groups. The conditional probability of the actual number of events, $P(y_{it} \mid j)$, given j is assumed to follow the well-known Poisson distribution (see Footnote 4).

Figure 4. Overview of the model. Outputs shown: optimal number of groups and trajectory shapes; probability that individual i belongs to trajectory group j; proportion of population in each group.

Footnote 2. The software package used in estimating the models reported here allows for estimation of up to a cubic polynomial in age. For ease of exposition I limit the discussion to examples where the maximum order considered is quadratic.

Footnote 3. A log-linear relationship between $\lambda^j_{it}$ and age is assumed to ensure that the requirement that $\lambda^j_{it} > 0$ is fulfilled in model estimation.

Footnote 4. In an even more general version of this model, $P(y_{it} \mid j)$ is assumed to follow the zero-inflated Poisson distribution. See Land, McCall, and Nagin (1996), Nagin and Land (1993), or Roeder, Lynch, and Nagin (in press) for the development of this more general case.

For the censored normal model, the linkage between age and behavior is established by means of a latent variable, $y^{*j}_{it}$, that can be thought of as measuring potential for engaging in the behavior of interest (say, physical aggression) for individual i's age at time t given membership in group j. Again, a quadratic relationship is assumed between $y^{*j}_{it}$ and age:

$y^{*j}_{it} = \beta^j_0 + \beta^j_1 \mathrm{Age}_{it} + \beta^j_2 \mathrm{Age}^2_{it} + \varepsilon$, (2)

where $\mathrm{Age}_{it}$ and $\mathrm{Age}^2_{it}$ are as previously defined and $\varepsilon$ is a disturbance assumed to be normally distributed with zero mean and constant variance $\sigma^2$. The latent variable, $y^{*j}_{it}$, is linked to its observed but censored counterpart, $y_{it}$, as follows. Let $S_{\min}$ and $S_{\max}$, respectively, denote the minimum and maximum possible score on the measurement scale. The model assumes

$y_{it} = S_{\min}$ if $y^{*j}_{it} < S_{\min}$,
$y_{it} = y^{*j}_{it}$ if $S_{\min} \le y^{*j}_{it} \le S_{\max}$, and
$y_{it} = S_{\max}$ if $y^{*j}_{it} > S_{\max}$.

In words, if the latent variable, $y^{*j}_{it}$, is less than $S_{\min}$, it is assumed that observed behavior equals this minimum. Likewise, if the latent variable, $y^{*j}_{it}$, is greater than $S_{\max}$, it is assumed that observed behavior equals this maximum. Only if $y^{*j}_{it}$ is within the scale minimum and maximum does $y_{it} = y^{*j}_{it}$ (see Footnotes 5 and 6).

Finally, consider the case where the measured response, $y_{it}$, is binary. In this case it is assumed that conditional on membership in group j the probability that $y_{it} = 1$, $\alpha^j_{it}$, follows the binary logit distribution:

$\alpha^j_{it} = \dfrac{e^{\beta^j_0 + \beta^j_1 \mathrm{Age}_{it} + \beta^j_2 \mathrm{Age}^2_{it}}}{1 + e^{\beta^j_0 + \beta^j_1 \mathrm{Age}_{it} + \beta^j_2 \mathrm{Age}^2_{it}}}$. (3)

In the Poisson, censored normal, and binary logit formulations, the parameters defining the shape of the trajectory, $\beta^j_0$, $\beta^j_1$, and $\beta^j_2$, are left free to differ across groups. This flexibility is a key feature of the model because it allows for easy identification of population heterogeneity not only in the level of behavior at a given age but also in its development over time. Figure 5 illustrates two hypothetical possibilities. A single-peaked trajectory (Trajectory A) is implied if $\beta_1 > 0$ and $\beta_2 < 0$. Thus, if data collection began at age 1, the trajectory would imply that for this group the occurrence of the behavior rose steadily until age 6 and then began a steady decline.
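The three link functions and the shape of a single-peaked trajectory can be made concrete with a short sketch. The code below is an illustration under stated assumptions rather than the article's SAS procedure: the function names, the coefficient values in `beta_a`, and the use of the standard two-limit censored normal expectation for the predicted scale score are all choices made here for exposition.

```python
import math


def quadratic(beta, age):
    """beta = (b0, b1, b2); returns b0 + b1*age + b2*age**2."""
    b0, b1, b2 = beta
    return b0 + b1 * age + b2 * age ** 2


def poisson_rate(beta, age):
    """Equation 1: the expected count is the antilog of the quadratic in age."""
    return math.exp(quadratic(beta, age))


def logit_probability(beta, age):
    """Equation 3: probability that the binary response equals 1."""
    return 1.0 / (1.0 + math.exp(-quadratic(beta, age)))


def censor(latent, s_min, s_max):
    """Censoring rule: the observed score is the latent value clipped to the scale."""
    return min(max(latent, s_min), s_max)


def normal_cdf(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def normal_pdf(x):
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def expected_censored_score(beta, age, sigma, s_min, s_max):
    """Expected observed score when the latent variable follows Equation 2."""
    mu = quadratic(beta, age)
    lo = (s_min - mu) / sigma
    hi = (s_max - mu) / sigma
    return (normal_cdf(lo) * s_min
            + mu * (normal_cdf(hi) - normal_cdf(lo))
            + sigma * (normal_pdf(lo) - normal_pdf(hi))
            + (1.0 - normal_cdf(hi)) * s_max)


def peak_age(beta):
    """Age at which a single-peaked quadratic (b1 > 0, b2 < 0) is highest."""
    _, b1, b2 = beta
    return -b1 / (2.0 * b2)


# Hypothetical coefficients for a single-peaked group (like Trajectory A).
beta_a = (-2.0, 0.75, -0.0625)
print("peak age:", peak_age(beta_a))
for age in (2, 6, 10):
    print(age, poisson_rate(beta_a, age), logit_probability(beta_a, age))
```

With these assumed coefficients the quadratic peaks at age 6, so the Poisson rate and the logit probability both rise before that age and fall after it, mirroring the rise and decline described for a single-peaked trajectory.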
Alternatively, if data collection began at age 6, as was the case in the Montreal study, generally it would be inappropriate to extrapolate backward to a younger age outside the period of measurement. Thus, for a model based on data from age 6 onward, this trajectory would imply a steady decline in the behavior following the initial assessment. Such a trajectory would typify desistance from the behavior. The second trajectory (B) depicted in Figure 5 has no curvature. Rather it remains constant over age. This trajectory is implied if $\beta_1 = 0$ and $\beta_2 = 0$. If that stable level of the behavior is high, this trajectory would typify a group that chronically engages in the behavior. Other interesting possibilities include trajectories in which growth is either steadily accelerating or decelerating. The former would be characterized by a trajectory in which both $\beta_1$ and $\beta_2$ are positive and the latter by both being negative.

Figure 5. Two hypothetical trajectories, plotted against age. A = single peaked; B = chronic trajectory.

Footnote 5. The expected value of the latent variable $y^{*j}_{it}$ equals $\beta^j_0 + \beta^j_1 \mathrm{Age}_{it} + \beta^j_2 \mathrm{Age}^2_{it}$, which is denoted below by $\beta^j x_{it}$. The expected value of the measured quantity, $E(y^j_{it})$, assuming j was observed, is $E(y^j_{it}) = \Phi^j_{\min} S_{\min} + \beta^j x_{it}(\Phi^j_{\max} - \Phi^j_{\min}) + \sigma(\phi^j_{\min} - \phi^j_{\max}) + (1 - \Phi^j_{\max}) S_{\max}$, where $\Phi(\cdot)$ denotes the cumulative normal distribution function and $\Phi^j_{\min}$ and $\Phi^j_{\max}$, respectively, denote $\Phi[(S_{\min} - \beta^j x_{it})/\sigma]$ and $\Phi[(S_{\max} - \beta^j x_{it})/\sigma]$, with corresponding definitions for $\phi^j_{\min}$ and $\phi^j_{\max}$, where $\phi$ denotes the normal density function. This relationship was used to compute the predicted components of Figure 1.

Footnote 6. I use the term latent variable to describe $y^{*j}_{it}$ because it is not fully observed. Thus, my use of the term latent is different from that in the psychometric literature, where the term latent factor refers to an unobservable construct that is assumed to give rise to multiple manifest variables.

The trajectories depicted in Figures 1, 2, and 3 are the product of maximum-likelihood estimation. The likelihood function was constructed as follows. Let the vector $Y_i = \{y_{i1}, y_{i2}, \ldots, y_{iT}\}$ denote the longitudinal sequence of individual i's behavioral measurements over the T periods of measurement, $P^j(Y_i)$ denote the probability of $Y_i$ given membership in group j, and $\pi_j$ denote the probability of membership in group j. For count data $P^j(Y_i)$ is constructed from the Poisson distribution, for psychometric scale data from the censored normal distribution, and for the binary data from the binary logit distribution. If group membership was observed, the sampled individuals could be sorted by group membership and their trajectory parameters estimated with readily available Poisson, censored normal (tobit), and logit regression software routines (e.g., STATA, 1995 or LIMDEP, Greene, 1991) (see Footnote 7). However, group membership is not observed. Indeed the proportion of the population composing group j, $\pi_j$, is an important parameter of interest in its own right. Thus, construction of the likelihood requires the aggregation of the J conditional likelihoods, $P^j(Y_i)$, to form the unconditional probability of the data, $Y_i$:

$P(Y_i) = \sum_{j=1}^{J} \pi_j P^j(Y_i)$, (4)

where $P(Y_i)$ is the unconditional probability of observing individual i's longitudinal sequence of behavioral measurements. It equals the sum across the J groups of the probability of $Y_i$ given membership in group j weighted by the proportion of the population in group j. The log of the likelihood for the entire sample is thus the sum across all individuals that compose the sample of the log of Equation 4 evaluated for each individual i. The parameters of interest ($\beta^j_0$, $\beta^j_1$, $\beta^j_2$, $\pi_j$, and, in addition, $\sigma$ for the censored normal) can be estimated by maximization of this log likelihood. As previously mentioned, SAS-based software for accomplishing this task is available on request. For a derivation of the likelihood see the appendix, but intuitively, the estimation procedure works as follows.
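Before the intuition, it may help to see Equation 4 evaluated numerically for count data. The sketch below is a minimal illustration, not the SAS procedure: it assumes that counts are independent across periods given group membership, takes each group's per-period rates as given (in the model they would come from Equation 1), and uses hypothetical rates, proportions, and records throughout.

```python
import math


def poisson_log_pmf(y, lam):
    """Log of the Poisson probability of observing count y at rate lam > 0."""
    return y * math.log(lam) - lam - math.lgamma(y + 1)


def group_log_prob(counts, rates):
    """log P^j(Y_i), assuming independence across periods given group j."""
    return sum(poisson_log_pmf(y, lam) for y, lam in zip(counts, rates))


def individual_log_likelihood(counts, group_rates, pis):
    """Log of Equation 4: log of sum_j pi_j * P^j(Y_i), via log-sum-exp."""
    terms = [math.log(pi) + group_log_prob(counts, rates)
             for pi, rates in zip(pis, group_rates)]
    top = max(terms)
    return top + math.log(sum(math.exp(t - top) for t in terms))


def sample_log_likelihood(sample, group_rates, pis):
    """Sum over individuals of the log of Equation 4."""
    return sum(individual_log_likelihood(c, group_rates, pis) for c in sample)


# Hypothetical setup: two periods, two equally sized groups with mirrored rates.
group_rates = [[5.0, 1.0], [1.0, 5.0]]
pis = [0.5, 0.5]
sample = [[6, 0], [1, 4], [0, 7]]  # made-up count records, one list per person

mixture_ll = sample_log_likelihood(sample, group_rates, pis)
pooled_ll = sample_log_likelihood(sample, [[3.0, 3.0]], [1.0])
print(mixture_ll, pooled_ll, mixture_ll > pooled_ll)
```

A maximum-likelihood routine would search over the rates (through the trajectory coefficients) and the proportions to make `sample_log_likelihood` as large as possible; here the parameters are simply fixed to show how a two-component mixture can describe such records better than a single common rate.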
Suppose unbeknownst to us there were two distinct groups in the population: youth offenders, constituting 50% of the population, who up to age 18 have an expected offending rate, $\lambda$, of 5 and who after age 18 have a $\lambda$ of 1, and adult offenders, constituting the other 50% of the population, whose offending trajectory is the reverse of that of the youth offenders: through age 18 their $\lambda = 1$, and after age 18 their $\lambda$ increases to 5. If we had longitudinal data on the recorded offenses of a sample of individuals from this population, we would observe two distinct groups: a clustering of about 50% of the sample who generally have many offenses prior to age 18 and relatively few offenses after age 18 and another 50% clustering with just the reverse pattern. Suppose these data were analyzed under the assumption that the relationship between age and $\lambda$ was identical across all individuals. The estimated value of $\lambda$ would be a "compromise" estimate of about 3 for all ages, from which we would mistakenly conclude that in this population the rate of offending is invariant with age. If the data were analyzed using the approach described here, which specifies the likelihood function as a mixing distribution, no such mathematical compromise would be necessary. The parameters of one component of the mixture would effectively be used to accommodate (i.e., match) the youth offending portion of the data whose offending declines with age, and another component of the mixing distribution would be available to accommodate the adult offender data whose offending increases with age.

Model Selection

This section addresses two important issues in model selection: (a) determination of the optimal number of groups to compose the mixture and (b) determination of the appropriate order of the polynomial used to model each group's trajectory.
Here order refers to the degree of the polynomial used to model the group's trajectory, where a second-order trajectory is defined by a quadratic equation, a first-order trajectory is defined by a linear equation in which $\beta_2$ is set equal to 0, and a zero-order trajectory is defined by a flat line in which $\beta_1$ and $\beta_2$ are set equal to zero. I discuss these two issues in turn.

Footnote 7. In the econometric literature censored normal regression is generally called tobit regression after its originator, James Tobin (Tobin, 1958).

One possible choice for testing the optimality of a specified number of groups is the likelihood ratio test. However, the null hypothesis (i.e., three components vs. more than three components) is on the boundary of the parameter space, and hence the classical asymptotic results that underlie the likelihood ratio test do not hold (Erdfelder, 1990; Ghosh & Sen, 1985; Titterington, Smith, & Makov, 1985). The likelihood ratio test is suitable only for model selection problems in which the alternative models are nested. In mixture models, a k group model is not nested within a k + 1 group model, and, therefore, it is not appropriate to use the likelihood ratio test for model selection (see Footnote 8).

Given these problems with the use of the likelihood ratio test for model selection, we follow the lead of D'Unger, Land, McCall, and Nagin (1998) and use the Bayesian information criterion (BIC) as a basis for selecting the optimal model. For a given model, BIC is calculated as follows:

$\mathrm{BIC} = \log(L) - 0.5 \cdot \log(n) \cdot k$, (5)

where L is the value of the model's maximized likelihood, n is the sample size, and k is the number of parameters in the model. Kass and Raftery (1995) and Raftery (1995) have argued that BIC can be used for comparison of both nested and unnested models under fairly general circumstances. When prior information on the correct model is limited, they recommended selection of the model with the maximum BIC. Note that BIC is always negative, so the maximum BIC will be the least negative value. In even more recent work, Keribin (1997) demonstrated that BIC identifies the optimal number of groups in finite mixture models. Her result is specifically relevant for the mixture models demonstrated here.

For insight into the usefulness of BIC as a criterion for model selection, consider its calculation. The first term of Equation 5 is always negative. For a model that predicts the data perfectly it equals zero. As the quality of the model's fit with the data declines, this term declines (i.e., becomes more negative). One way to improve fit and thereby reduce the magnitude of the first term is to add more parameters to the model. The second term extracts a penalty proportional to the log of the sample size for the addition of more parameters. Thus, on the basis of the BIC criterion, expansion of the model by addition of a trajectory group is desirable only if the resulting improvement in the log likelihood exceeds the penalty for more parameters. As Kass and Raftery (1995) noted, the BIC rewards parsimony.
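The trade-off in Equation 5 can be illustrated with a small calculation. Everything numeric below is hypothetical: the sample size, the maximized log likelihoods, and the parameter-counting rule (three trajectory coefficients per quadratic group plus one free proportion for each group beyond the first, as would apply to the Poisson-based model) are assumptions for illustration and are not results from the article.

```python
import math


def bic(log_likelihood, n, k):
    """Equation 5: BIC = log(L) - 0.5 * log(n) * k."""
    return log_likelihood - 0.5 * math.log(n) * k


def parameter_count(groups, order=2):
    """Assumed count: (order + 1) coefficients per group plus (groups - 1) proportions."""
    return groups * (order + 1) + (groups - 1)


# Hypothetical maximized log likelihoods for 2-, 3-, and 4-group models.
n = 400
candidates = {2: -1500.0, 3: -1480.0, 4: -1478.0}
scores = {g: bic(ll, n, parameter_count(g)) for g, ll in candidates.items()}
best = max(scores, key=scores.get)  # least negative BIC
print(scores)
print("selected number of groups:", best)
```

In this made-up case the fourth group raises the log likelihood by only 2 while adding four parameters, so the penalty outweighs the gain and the three-group model has the maximum BIC.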
For this application the BIC criterion will tend to favor models with fewer groups. Table 1 shows BIC scores for models with varying numbers of groups. For the Cambridge data the pattern is seemingly very distinct: BIC appears to reach a clear maximum at three groups. I say "seemingly," because, without a concrete standard for calibrating the magnitude of the change in BIC, it is difficult to judge what constitutes a distinct maximum. Kass and Wasserman (1995) and Schwarz (1978) have provided such a standard. Let $B_{ij}$ denote the Bayes factor comparing models i and j, where for this application model i might be a two-group model and j a three-group model. The Bayes factor measures the odds of each of the two competing models being the correct model. It is computed as the ratio of the probability of i being the correct model to j being the correct model. Thus, a Bayes factor of 1 implies that the models are equally likely, whereas a Bayes factor of 10 implies that model i is 10 times more likely than j. Table 2 shows Jeffreys's scale of evidence for Bayes factors as reported by Wasserman (1997).⁹ Computation of the Bayes factor is in general very difficult and indeed commonly impossible. Schwarz (1978) and Kass and Wasserman (1995), however, have shown that $e^{\mathrm{BIC}_i - \mathrm{BIC}_j}$ is a good approximation of the Bayes factor for problems in which equal weight is placed on the prior probabilities of models i and j. On the basis of this approximation, for the Cambridge data the odds of the three-group model compared with the two- or four-group model far exceed 1000 to 1. Thus, according to Jeffreys's scale this is very strong evidence in favor of the three-group model.¹⁰

⁸ The problem is most easily illustrated with an example. The likelihood ratio test is computed as minus two times the difference in the log likelihoods of the two nested models. This statistic is asymptotically distributed as chi-squared with degrees of freedom equal to the difference in number of parameters between the nested models. For our mixture model, the degrees of freedom are indeterminate, because a group can become superfluous in two ways. One is by the proportion of the population in that group, $\pi_j$, approaching zero. Alternatively, the three parameters defining the trajectory of one group can collapse onto those for another group. What then is the appropriate degrees of freedom: one or three?

⁹ Jeffreys was an early and very prominent contributor to Bayesian statistics.

Table 1
BIC-Based Calculations of the Probability That a "j" Group Model Is the Correct Model for Different Numbers of Groups of Quadratic Trajectories
Columns: No. of groups; Cambridge data set (BIC, probability correct model); Montreal data set (BIC, probability correct model). [Table entries not preserved in this transcription.]
Note. BIC = Bayesian information criterion.

Table 2
Jeffreys's Scale of Evidence for Bayes Factors

Bayes factor            Interpretation
B_ij < 1/10             Strong evidence for model j
1/10 < B_ij < 1/3       Moderate evidence for model j
1/3 < B_ij < 1          Weak evidence for model j
1 < B_ij < 3            Weak evidence for model i
3 < B_ij < 10           Moderate evidence for model i
B_ij > 10               Strong evidence for model i

Note. Adapted from "Bayesian Model Selection and Model Averaging" by L. Wasserman, 1997, Working Paper No. 666, Carnegie Mellon University, Department of Statistics. Copyright 1997 by L. Wasserman. Adapted with permission.

Schwarz (1978) and Kass and Wasserman (1995) have also provided a related metric for comparing more than two models. Let $p_j$ denote the posterior probability that model j is the correct model, where in general the number of models under consideration, J, is greater than 2. They show that $p_j$ is reasonably approximated by the following:

$$p_j = \frac{e^{\mathrm{BIC}_j - \mathrm{BIC}_{\max}}}{\sum_{j} e^{\mathrm{BIC}_j - \mathrm{BIC}_{\max}}}, \tag{6}$$

where $\mathrm{BIC}_{\max}$ is the maximum BIC score of the models under consideration. Also shown in Table 1 are the probabilities, as computed using Equation 6, that the models with varying numbers of groups are the true model. With this BIC-based probability approximation, for the Cambridge data the probability of the three-group model is near 1.

Consider now the BIC scores for the models estimated with the Montreal data. The four-group model has the best BIC score, but the probability of its being the correct model, .55, is far less than 1. Although the four-group model is far more likely than the five-group model, the three-group model is a close competitor. Its probability is .43. On the basis of Jeffreys's scale, the four-group model is strongly preferred to the five-group model. The odds ratio in favor of the four-group model is 26 to 1 (i.e., .55/.02).
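As a rough illustration of the BIC-based approximations just described, the following Python sketch computes the approximate Bayes factor $e^{\mathrm{BIC}_i - \mathrm{BIC}_j}$ and the Equation 6 model probabilities. The BIC scores are hypothetical and are not the values reported in Table 1; the function names are likewise invented for this example.

```python
import math

def bayes_factor(bic_i: float, bic_j: float) -> float:
    """Approximate Bayes factor B_ij = exp(BIC_i - BIC_j), assuming equal priors."""
    return math.exp(bic_i - bic_j)

def model_probabilities(bics: dict) -> dict:
    """Equation 6: p_j = exp(BIC_j - BIC_max) / sum_j exp(BIC_j - BIC_max)."""
    bic_max = max(bics.values())
    weights = {m: math.exp(b - bic_max) for m, b in bics.items()}
    total = sum(weights.values())
    return {m: w / total for m, w in weights.items()}

# Hypothetical BIC scores keyed by number of groups.
bics = {3: -2001.0, 4: -2000.0, 5: -2004.0}
probs = model_probabilities(bics)

# Odds of the four-group model against the three-group model: e^1, about 2.72,
# which falls in the "weak evidence" band of Jeffreys's scale.
odds_4_vs_3 = bayes_factor(bics[4], bics[3])
```

Subtracting the maximum BIC before exponentiating, as Equation 6 does, keeps the exponentials from underflowing when BIC scores are large negative numbers. The ratio of any two model probabilities equals the corresponding approximate Bayes factor.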
However, the four-group model's edge over the three-group model is slight, with an odds ratio of only 1.28 (i.e., .55/.43). Thus, for models in which each trajectory is described by a full quadratic specification, the three- and four-group models fit the data about equally well. The Montreal analysis illustrates a situation in which the mechanical application of the BIC model selection criterion does not result in an unambiguous determination of the "best" model.

This brings me to the issue of determination of the appropriate order for modeling the trajectory of each group that makes up the mixture. The best model will not necessarily involve a mixture in which all trajectories are of the same order (e.g., quadratic). In principle one could exhaustively explore all possible combinations of orders for a model. In practice this approach is generally not feasible. For a four-group model alone, there are 81 (i.e., $3^4$) possible combinations of trajectory orders. If a fourth candidate order (e.g., cubic) is added to the models, the number of possibilities in a four-group model is 256 (i.e., $4^4$). Further, a full search would require estimating models over varying numbers of groups. To reduce the number of alternatives, an analyst will generally have to use knowledge of the problem domain to limit the model search process. For instance, in the Montreal example substantive considerations suggest that an improved, more parsimonious model should restrict the number of parameters used to describe the never and chronic trajectories. Description of a never trajectory does not require a three-parameter model: a zero-order model with a "very negative" intercept should suffice. Indeed the large standard errors (not reported) associated with the parameters of the never trajectory strongly suggest that the trajectory is overparameterized. Table 3 shows the probabilities that various forms of three- and four-group models are the correct model. We refer to models in which the trajectories for all groups are quadratic as Type A models.
Three- and four-group models in which the never trajectory is described by a zero-order model are labeled Type B models. Also shown are three- and four-group models in which, in addition, the chronic trajectory is described by a zero-order model (Type C). This specification was prompted by the theories of Moffitt (1993) and Patterson et al. (1989), which predict the existence of a small group of chronically antisocial individuals. The BIC-based probability calculations provide strong support for the four-group, Type C model in Table 3. Compared with the other options listed in the table, the probability of this being the correct model is .93. The closest competitor is the four-group, Type B model. However, the posterior probability of it being the correct model is only .07.

¹⁰ Using the BIC criterion, D'Unger et al. (1998) and Roeder, Lynch, and Nagin (in press) showed that a four-group model is best. In those analyses the model is fit based on the more general zero-inflated Poisson.

Table 3
BIC-Based Calculation of Probability of Correct Model for Models Combining Quadratic and Nonquadratic Trajectories: Montreal Data
Columns: No. of groups; Model type A; Model type B; Model type C. Surviving entries: four-group Type B = .07; four-group Type C = .93. [Remaining entries not preserved in this transcription.]
Note. BIC = Bayesian information criterion; Model Type A = all trajectories quadratic; Model Type B = one single-parameter trajectory, remaining quadratic; Model Type C = two single-parameter trajectories, remaining quadratic.

As the Montreal analysis illustrates, the determination of the optimal number of groups is not always a clear-cut process. Although Bayesian statisticians have made important strides in developing theoretically grounded criteria for model selection, the results are recent and not yet widely used. Further, as illustrated by the Montreal example, the model search process may also be guided in part by nonstatistical considerations. This injects a degree of subjectivity into model selection. Still, it is important to recognize that use of the BIC criterion for model selection adds a very significant degree of statistical objectivity to the determination of the optimal number of groups. This provides an important check against the spurious interpretation of random fluctuations in the data as reflecting systematic patterns of behavior.

Calculation and Use of Posterior Group Membership Probabilities

It is not possible to determine definitively an individual's group membership. However, it is possible to calculate the probability of his or her membership in the various groups that make up the model. These probabilities, the posterior probabilities of group membership, are among the most useful products of the group-based modeling approach.
Specifically, based on the model coefficient estimates, for each individual i the probability of membership in group j is calculated on the basis of the individual's longitudinal pattern of behavior, $Y_i$. We denote this probability by $\hat{P}(j \mid Y_i)$. It is computed as follows:

$$\hat{P}(j \mid Y_i) = \frac{\hat{P}(Y_i \mid j)\,\hat{\pi}_j}{\sum_{j} \hat{P}(Y_i \mid j)\,\hat{\pi}_j}, \tag{7}$$

where $\hat{P}(Y_i \mid j)$ is the estimated probability of observing i's actual behavioral trajectory, $Y_i$, given membership in j, and $\hat{\pi}_j$ is the estimated proportion of the population in group j. The quantity $\hat{P}(Y_i \mid j)$ can be calculated postestimation based on the maximum-likelihood estimates of the trajectory parameters.

The posterior probability calculations provide the researcher with an objective basis for assigning individuals to the developmental trajectory group that best matches their behavior. Individuals can be assigned to the group to which their posterior membership probability is largest. On the basis of this maximum posterior probability assignment rule, an individual from the Cambridge sample with a cluster of offenses in adolescence but none thereafter, in all likelihood, will be assigned to the adolescent limited category. Alternatively, an individual who offends at a comparatively high rate throughout the observation period will likely be assigned to the chronic group. Table 4 shows the mean assignment probability for the Montreal and Cambridge data. For example, in the Cambridge data the mean chronic group posterior probability for the 21 individuals assigned to this group was very high, .95. The counterpart average for the 77 individuals assigned to the adolescent limited group is similarly large, .94. For the Montreal data, classification certainty is not so high but still seems reasonably good. Across the four groups it ranges from .73 to .93.

One informative use of the posterior probability-based classifications is to create profiles of the "average" individual following the trajectory characterized by each group.
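The Equation 7 calculation and the maximum posterior probability assignment rule can be sketched as follows for count data. This is a minimal illustration under stated assumptions: the group-specific Poisson rates and group proportions are hypothetical rather than estimates from either data set, the function names are invented here, and observations are taken to be independent across periods conditional on group membership.

```python
import math

def poisson_logpmf(y: int, lam: float) -> float:
    """Log of the Poisson probability of count y given rate lam."""
    return y * math.log(lam) - lam - math.lgamma(y + 1)

def posterior_membership(y_seq, group_rates, group_props):
    """Equation 7: P(j | Y_i) = P(Y_i | j) * pi_j / sum_j P(Y_i | j) * pi_j."""
    joint = []
    for rates, pi_j in zip(group_rates, group_props):
        # P(Y_i | j): product of per-period Poisson probabilities
        # (assumes conditional independence across periods given the group).
        log_p = sum(poisson_logpmf(y, lam) for y, lam in zip(y_seq, rates))
        joint.append(math.exp(log_p) * pi_j)
    total = sum(joint)
    return [v / total for v in joint]

# Hypothetical expected counts per period: never, adolescent limited, chronic.
group_rates = [
    [0.01, 0.01, 0.01, 0.01],
    [0.20, 1.00, 0.80, 0.10],
    [1.00, 2.00, 2.00, 1.50],
]
group_props = [0.7, 0.2, 0.1]

# An individual with offenses clustered in the middle periods only.
post = posterior_membership([0, 2, 1, 0], group_rates, group_props)

# Maximum posterior probability assignment rule.
assigned = max(range(len(post)), key=lambda j: post[j])
```

Averaging the largest posterior probability over all individuals assigned to a given group would give the kind of mean assignment probability summarized in Table 4.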
Table 5 shows summary statistics on individual characteristics and behaviors of each group for both data sets. The profiles conform with longstanding findings on risk and protective factors for antisocial and problem behaviors. In the Cambridge data those in the chronic group, on average, were most likely to have lived in a low-income household, to have had at least one parent with a criminal record, to have been subjected to poor parenting, and to have displayed a high propensity to engage in risky activities. Conversely, the never group was lowest on these risk factors. The pattern is similar for physical aggression. Members of the chronic group had the least well-educated parents and most frequently scored in the lowest quartile of the measured IQ distribution of the sample, whereas the nevers were highest on these protective factors. Further, 90% of those in the chronic group failed to reach the eighth grade on schedule, and 13% had a juvenile record by age 18; only 19% of the nevers had fallen behind by the eighth grade, and none had a juvenile record. In between are the low-level and high-level desisters, who themselves order as expected on these characteristics and behaviors.

Table 4
Average Assignment Probability Conditional on Assignment by Maximum Probability Rule
Rows: Assigned group. Columns: Group. Cambridge data: Never, Adolescent limited, Chronic. Montreal data: Never, Low desister, High desister, Chronic. [Table entries not preserved in this transcription.]

Aside from providing a basis for group assignment, the posterior probabilities can provide an objective criterion for selecting subsamples for follow-up data collection. For example, in an analysis based on the Montreal data, Nagin and Tremblay (in press) found that individuals following the chronic physical aggression trajectory depicted in Figure 3 displayed heightened levels of violence at age 18, controlling for chronic opposition and hyperactivity. Conversely, it was found that chronic opposition and hyperactivity did not predict violence at age 18, controlling for chronic physical aggression. Nagin and Tremblay concluded that chronic physical aggression was a distinct risk factor for later violence. A follow-up study, under way at the time of this writing, was devised to examine whether self-regulatory processes were a possible explanation for the findings. The study involves a series of laboratory assessments of selected individuals in the Montreal study.
Individuals were recruited for this study based on their posterior probabilities of membership in the various trajectory groups for physical aggression, opposition, and hyperactivity. The purpose was to recruit strategically to ensure that the sample included persons with distinct combinations of development in externalizing behaviors (e.g., high probability of chronic physical aggression but low probability of chronic opposition). The posterior probabilities provide objective, quantified criteria for such strategic sampling.

Table 5
Group Profile
Rows (Variable): Low household income (%); Poor parenting (%); High risk taking (%); Parents with criminal record (%); Years of school, mother; Years of school, father; Low IQ^a (%); Completed eighth grade on time (%); Juvenile record (%); No. of sexual partners at age 17 (past year).
Columns (Group): Cambridge data: Never, Adolescent limited, Chronic. Montreal data: Never, Low desister, High desister, Chronic. [Table entries not preserved in this transcription.]
^a Measured IQ score is in the lowest quartile of the sample.

Statistically Linking Group Membership to Covariates

What individual, familial, and environmental factors distinguish the populations of the various trajectory groups? Are the factors consistent with received theory on developmental trajectories? What statistical procedures are most appropriate to test whether such factors distinguish among the trajectory groups? Questions such as these follow naturally from the identification of distinct groups of developmental trajectories. The profiles reported in Table 5 are a first step in addressing these questions but indeed only a beginning. First, the profiles are simply a collection of univariate contrasts. For the purpose of constructing a more parsimonious list of predictors or for causal inference, a multivariate procedure is required to sort out redundant predictors and to control for potential confounds. Second, the profiles are based on group identifications that are probabilistic, not certain. Conventional statistical methods to test for cross-group differences, such as F- and chi-square-based tests, assume no classification error in group identification (Roeder et al., in press). Thus, in general they are technically inappropriate in this setting.¹¹

The mixture model of developmental trajectories comprises two basic components: (a) an expected trajectory given membership in group j and (b) a probability of group membership denoted by $\pi_j$. Thus far, the discussion has focused on the former component. The discussion of the test for factors that distinguish groups turns our focus to the latter component.
The profiles in Table 5 show that the trajectory groups are composed of individuals who differ in substantial and predictable ways. Provision of the capacity to test formally for whether and by what degree such factors distinguish the groups requires that the model be generalized to allow $\pi_j$ to depend on characteristics of the individual. Heretofore, $\pi_j$ has been described as the proportion of the population following each trajectory group j. Equivalently, it can be described as the probability that a randomly chosen individual from the population under study belongs to trajectory group j. By allowing $\pi_j$ to vary with individual characteristics, it is possible to test whether and by how much a specified factor affects probability of group membership controlling for the level of other factors that potentially affect $\pi_j$.

Let $x_i$ denote a vector of variables measuring individual, familial, or environmental factors that potentially are associated with group membership, and let $\pi_j(x_i)$ denote the probability of membership in group j given $x_i$. For a two-group model the logit model is a natural candidate for modeling group membership probability as a function of $x_i$. For this special case, we need only estimate $\pi_j(x_i)$ for one group, say Group 1, because $\pi_2(x_i) = 1 - \pi_1(x_i)$, where

$$\pi_1(x_i) = \frac{e^{x_i \theta}}{1 + e^{x_i \theta}}.$$

For the more general case, in which there are more than two groups, the logit model generalizes to the multinomial logit model (Maddala, 1983):

$$\pi_j(x_i) = \frac{e^{x_i \theta_j}}{\sum_{j} e^{x_i \theta_j}}, \tag{8}$$

where the parameters of the multinomial logit model, $\theta_j$, capture the impact of the covariates of interest, $x_i$, on probability of group membership. Without loss of generality, $\theta_j$ for one "contrast" group can be set equal to zero. The coefficient estimates for the remaining groups should be interpreted as measuring the impact of covariates on group membership relative to the contrast group. Table 6 illustrates the application of the extended model to the Montreal data.
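To make Equation 8 concrete, the sketch below converts multinomial logit coefficients into group membership probabilities for given covariate values. The coefficients and the single binary risk factor are hypothetical and are not the Table 6 estimates; the function name is invented for this illustration. The contrast group's coefficients are fixed at zero, as described above.

```python
import math

def membership_probabilities(x, thetas):
    """Equation 8: pi_j(x) = exp(x . theta_j) / sum_j exp(x . theta_j)."""
    scores = [sum(xk * tk for xk, tk in zip(x, theta)) for theta in thetas]
    m = max(scores)  # subtracting the max leaves the ratios unchanged
    expd = [math.exp(s - m) for s in scores]
    total = sum(expd)
    return [e / total for e in expd]

# Hypothetical coefficients ordered as [constant, risk_factor].
thetas = [
    [0.0, 0.0],    # contrast group: theta set equal to zero
    [0.5, 0.2],    # second group
    [-2.0, 1.5],   # third group, strongly tied to the risk factor
]

# x = [1, 0]: risk factor absent; x = [1, 1]: risk factor present.
no_risk = membership_probabilities([1.0, 0.0], thetas)
with_risk = membership_probabilities([1.0, 1.0], thetas)
```

With only two groups the same function reduces to the ordinary logit expression for $\pi_1(x_i)$ given above. Comparing `no_risk` with `with_risk` mirrors the way predicted probabilities are computed for assumed covariate values.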
Results are reported for a four-group model in which probability of group membership is related to four variables: one parent having less than a ninth-grade education, both parents having less than a ninth-grade education, having a mother who began childbearing as a teenager, and scoring in the lowest quartile of the measured IQ of the sampled boys. The first panel of Table 6 shows coefficient estimates and t statistics. For this example, the never-convicted group serves as the contrast group. Consider first the parental low-education variables. Relative to the never group, having one or more poorly educated parents does not significantly increase the probability of membership in the low-level desister group ($\alpha$ = .05, one-tailed test). However, probability of membership in the high-level near-desister and chronic groups is significantly increased if both parents are poorly educated. Low IQ has a similar impact on group membership probability. Compared with the never group, probability of membership in the low-level group is not significantly affected, whereas probability of membership in the two higher aggression trajectories is significantly increased. Finally, the teen mom variable is associated with a significant increase in all group membership probabilities relative to the never group. The second panel of Table 6 shows calculations of the predicted probabilities of group membership based on the coefficient estimates.

¹¹ Roeder et al. (in press) explored the impact of classification error on statistical inference. One finding is not surprising. When classification is highly certain, as is the case in the Cambridge data, errors in inference using conventional methods are small.

Table 6
The Impact of Parental Education, Low IQ, and Teen Onset of Motherhood on Group Membership Probabilities: Montreal Data
Columns: Variable-condition; Group (Never, Low desister, High desister, Chronic).
Panel 1: Multinomial logit coefficients (with t statistics given in parentheses). Rows: Constant; Low education^a, one parent; Low education^a, both parents; Low IQ^b; Teen mom^c.
Panel 2: Predicted membership probabilities based on multinomial logit model coefficient estimates. Rows: No risk factors; Low education^a, both parents only; Low IQ^b only; Teen mom^c only; Low education^a, both parents and low IQ^b and teen mom^c.
Surviving Panel 1 entries, in the order extracted (cell alignment is not recoverable): 1.04 (6.36); (-0.10); 0.44 (1.46); 0.69 (2.28); 0.18 (0.48); 0.67 (1.83); 0.07 (0.22); 0.85 (2.85); 0.89 (2.03); 1.10 (2.57). Following the Chronic column label: (-4.49); (-0.30); 0.98 (1.73); 0.92 (1.83); 2.27 (3.89).
Surviving Panel 2 entries, one following each of the five row labels: .20; .15; .15; .08; .03. [Remaining entries not preserved in this transcription.]
^a Parent has less than a ninth-grade education. ^b Measured IQ score is in the lowest quartile of the sample. ^c Participant's mother began childbearing as a teenager.
The calculations were performed by substituting the coefficient estimates into Equation 8 and then computing group membership probabilities for assumed values of $x_i$. The calculations show that if none of the risk factors are present, the probability of the never or low-level groups is .77 and the probability of the chronic group is only .02. The largest single factor affecting the chronic group membership probability is the teen mom variable. For the case in which an individual has none of the other risk factors except for a mother who began childbearing as a teenager, the combined probability of the never and the low-level desister group declines to .66 and the probability of the chronic group increases to .09. Also reported is a high-risk scenario in which three risk factors are present: both parents poorly educated, low IQ, and teen motherhood. In this case the predicted probability of the chronic group is .25, an order of magnitude larger than the no-risk-factor case. Also, the predicted probability of the high-level near-desister group of .44 is double the probability for the no-risk-factor case.

Simultaneous estimation of the group-specific trajectory parameters and the group membership probabilities conditional on individual factors is easily accomplished with the above-referenced software. The efficiency of estimation is also relatively good. Although estimation time will depend on data set size, number of groups and covariates, and the speed of the computer, computation time is generally 5 to 10 min. For analyses in which hypotheses are well formed, such computation time is inconsequential. However, in analyses in which hypotheses are less well formed or analyses that are generally exploratory, much more effort will be put into model fitting. In these circumstances computation times of 5 to 10 min per run may be very cumbersome. Two alternatives for highly efficient exploratory model fitting are recommended.
Both are two-stage approaches with the same first stage: identification of the best fitting model in terms of number of groups. The first stage involves a search for the best fitting model. This search is conducted without covariates distinguishing group membership. One alternative for the second-stage analysis makes use of the group membership assignments based on the maximum posterior probability rule. Multinomial logit models are then estimated relating group assignment to individual-level factors. A second alternative for the second-stage analysis is to regress group membership probabilities on the individual-level factors of interest. Both of these approaches for identifying candidate factors that distinguish group membership are highly efficient and easily conducted with conventional statistical software. Personal experience has shown that the results of an exploratory analysis conducted by either of these methods provide a good guide for specification of the proper model in which trajectories and the impact of covariates on membership probabilities are jointly estimated.¹²

Comparative Advantages and Disadvantages of Group-Based Modeling

In this article I have described a group-based modeling strategy for analyzing developmental trajectories. Alternative methods generally treat the population distribution of development as continuous. The two leading examples of this continuous modeling strategy are hierarchical modeling and latent curve modeling.¹³ The group-based and continuous modes of analysis each have their distinctive strengths. Growth curve modeling, whether in the hierarchical or latent variable tradition, is designed to identify average developmental tendencies, to calibrate variability about the average, and to explain that variability in terms of covariates of interest.
By contrast, the mixture modeling strategy is designed to identify distinctive, prototypal developmental trajectories within the population, to calibrate the probability of population members following each such trajectory, and to relate those probabilities to covariates of interest.

Raudenbush (in press) offered a valuable perspective on the types of problems for which these two broad classes of modeling strategies are most suitable. He observed, "In many studies it is reasonable to assume that all participants are growing according to some common function but that the growth parameters vary in magnitude" (p. 30). He offered children's vocabulary growth curves as an example of such a growth process. Two distinctive features of such developmental processes are (a) they are generally monotonic (thus, the term growth) and (b) they vary regularly within the population. For such processes it is natural to ask, "What is the typical pattern of growth within the population and how does this typical growth pattern vary across population members?" Hierarchical and latent curve modeling are specifically designed to answer such a question.

Raudenbush (in press) also offered an example of a developmental process, namely depression, that does not generally change monotonically over time and does not vary regularly through the population. He observed, "It makes no sense to assume that everyone is increasing (or decreasing) in depression.... many persons will never be high in depression, others will always be high, while others will become increasingly depressed" (p. 30). For problems such as this, he recommended the use of a multinomial-type method such as that demonstrated here. The reason for this recommendation is that the development does not vary regularly across population members. Instead trajectories vary greatly across population subgroups both in terms of the level of behavior at the outset of the measurement period and in the rate of growth and decline over time.
For such problems a modeling strategy designed to identify averages and explain variability about that average is far less useful than a group-based strategy designed to identify distinctive clusters of trajectories and to calibrate how characteristics of individuals and their circumstances affect membership in these clusters.

Summary and Discussion

This article has demonstrated a semiparametric, group-based approach for analyzing developmental trajectories. Technically, the model is simply a mixture of probability distributions that are suitably specified to describe the data to be analyzed. Three such distributions were illustrated: the Poisson, which is suitable for analyzing count data; the censored normal, which is suitable for analyzing psychometric scale data with clusters of observations at the scale maximum or minimum or both; and the binary logit, which is suitable for analyzing binary data. For maximum flexibility the parameters that characterize the trajectory of each group are specified to vary freely across groups. This allows for substantial cross-group differences in the shape of trajectories. The examples used to illustrate the method were selected to demonstrate four of the method's most important capabilities: (a) the capability to identify rather than assume distinctive developmental trajectories; (b) the capability to estimate the proportion of the population best approximated by the various trajectories so identified; (c) the capability to relate the probability of membership in the various trajectory groups to characteristics of the individual and his or her circumstances; and (d) the capability to use the posterior probabilities of group membership for various other purposes such as sampling, creating profiles of group membership, or serving as regressors in multivariate statistical analyses.

¹² Although either of the two-stage procedures provides an efficient approach for exploring the potential impact of covariates on group membership, they are not a substitute for a final analysis based on the jointly estimated model. Roeder, Lynch, and Nagin (in press) have found that the first described two-stage procedure in particular tends to overstate the statistical significance of covariates on group membership. The probable reason is that the procedure ignores the uncertainty about group assignments.

¹³ Recent advances in latent curve modeling have adapted the conventional assumption of a continuous distribution of growth curves to accommodate the group-based approach described here. For an excellent summary of this advance in latent curve modeling, see Muthen (in press).

The modeling strategy described here has only recently been developed. Opportunities for extension abound. Three extensions seem particularly worthwhile. Further development of approaches for deciding on the best fitting number of groups would be extremely valuable. Although BIC scores are very useful for this purpose, in addition it would be helpful to have measures of the goodness of fit between actual and predicted trajectories.
A second useful extension involves the joint estimation of trajectories of different behaviors, for example, joint estimation of psychological well-being in childhood and employment status over adulthood. This would provide the capacity to examine the relationship between the development patterns of distinct but likely linked behaviors. A third valuable extension involves developing diagnostics for calibrating the degree to which developmental trajectories cluster into distinct groups or vary regularly according to some specified parametric distribution (e.g., multivariate normal).

Nagin and Tremblay (in press) described another rationale, not discussed here, for the group-based modeling strategy: to avoid making strong and generally untestable assumptions about the population distribution of developmental trajectories. The semiparametric mixture can be thought of as approximating an unspecified but likely continuous distribution of population heterogeneity in developmental trajectories. In so doing, a standard procedure in nonparametric and semiparametric statistics of approximating a continuous distribution by a discrete mixture is adopted (Follman & Lambert, 1989; Heckman & Singer, 1984; Lindsay, 1995). A difficult but valuable research effort would be to develop metrics for calibrating the degree to which trajectory groups are actually closely approximating a specified continuous distribution.

References

Appel, T. A. (1987). The Cuvier-Geoffroy debate: French biology in the decades before Darwin. New York: Oxford University Press.

Bryk, A. S., & Raudenbush, S. W. (1987). Application of hierarchical linear models to assessing change. Psychological Bulletin, 101.

Bryk, A. S., & Raudenbush, S. W. (1992). Hierarchical linear models for social and behavioral research: Application and data analysis methods. Newbury Park, CA: Sage.

Cameron, A. C., & Trivedi, P. K. (1986). Econometric models based on count data: Comparisons and applications of some estimators and tests. Journal of Applied Econometrics, 1.

Cantor, N., & Genero, N. (1986). Psychiatric diagnosis and natural categorization: A close analogy. In T. Millon & G. Klerman (Eds.), Contemporary directions in psychopathology. New York: Guilford Press.

Caspi, A. (1998). Personality development across the life course. In W. Damon (Series Ed.) & N. Eisenberg (Vol. Ed.), Handbook of child psychology: Vol. 3. Social, emotional, and personality development (5th ed.). New York: Wiley.

D'Unger, A., Land, K., McCall, P., & Nagin, D. (1998). How many latent classes of delinquent/criminal careers? Results from mixed Poisson regression analyses of the London, Philadelphia, and Racine cohorts studies. American Journal of Sociology, 103.

Erdfelder, E. (1990). Deterministic developmental hypotheses, probabilistic rules of manifestation, and the analysis of finite mixture distributions. In A. von Eye (Ed.), Statistical methods in longitudinal research: Time series and categorical longitudinal data (Vol. 2). Boston: Academic Press.

Farrington, D. P., & West, D. J. (1990). The Cambridge study in delinquent development: A prospective longitudinal study of 411 males. In H. Kerner & G. Kaiser (Eds.), Criminality: Personality, behavior, and life history. New York: Springer-Verlag.

Follman, D. A., & Lambert, D. (1989). Generalized logistic regression by nonparametric mixing. Journal of the American Statistical Association, 84.

Ghosh, J. K., & Sen, P. K. (1985). On the asymptotic performance of the log-likelihood ratio statistic for the mixture model and related results. In L. M. LeCam & R. A. Olshen (Eds.), Proceedings of the Berkeley Conference in Honor of Jerzy Neyman and Jack Kiefer (Vol. 2). Monterey, CA: Wadsworth.

Goldstein, H. (1995). Multilevel statistical models (2nd ed.). London: Edward Arnold.

Greene, W. H. (1990). Econometric analysis. New York: Macmillan.

Greene, W. H. (1991). LIMDEP user's manual (Version 6.0) [Computer software]. Bellport, NY: Econometrics Software, Inc.

Heckman, J., & Singer, B. (1984). A method for minimizing the impact of distributional assumptions in econometric models for duration data. Econometrica, 52.

Holyoak, K., & Spellman, B. (1993). Thinking. Annual Review of Psychology, 44.

Jones, B. L., Nagin, D. S., & Roeder, K. (1998). A SAS procedure based on mixture models for estimating developmental trajectories (Working Paper No. 684). Carnegie Mellon University, Department of Statistics.

Kandel, D. B. (1975). Stages in adolescent involvement in drug use. Science, 190.

Kass, R. E., & Raftery, A. E. (1995). Bayes factor. Journal of the American Statistical Association, 90.

Kass, R. E., & Wasserman, L. (1995). A reference Bayesian test for nested hypotheses and its relationship to the Schwarz criterion. Journal of the American Statistical Association, 90.

Keribin, C. (1997). Consistent estimation of the order of mixture models (Working Paper No. 61). Université d'Évry-Val d'Essonne, Laboratoire Analyse et Probabilité.

Lambert, D. (1993). Zero-inflated Poisson regression, with an application to defects in manufacturing. Technometrics, 34.

Land, K., McCall, P., & Nagin, D. (1996). A comparison of Poisson, negative binomial, and semiparametric mixed Poisson regression models with empirical applications to criminal careers data. Sociological Methods & Research, 24, 387-442.

Land, K., & Nagin, D. (1996). Micro-models of criminal careers: A synthesis of the criminal careers and life course approaches via semiparametric mixed Poisson models with empirical applications. Journal of Quantitative Criminology, 12.

Lindsay, B. G. (1995). Mixture models: Theory, geometry, and applications. Hayward, CA: Institute of Mathematical Statistics.

Loeber, R. (1991). Questions and advances in the study of developmental pathways. In D. Cicchetti & S. Toth (Eds.), Models and integrations. Rochester Symposium on developmental psychopathology (Vol. 3). Rochester, NY: University of Rochester Press.

MacCallum, R. C., Kim, C., Malarkey, W. B., & Kiecolt-Glaser, J. K. (1997). Studying multivariate change using multilevel models and latent curve models. Multivariate Behavioral Research, 32.

Maddala, G. S. (1983). Limited dependent and qualitative variables in econometrics. New York: Cambridge University Press.

Markman, E. M. (1989). Categorization and naming in children: Problems of induction. Cambridge, MA: MIT Press.

McArdle, J. J., & Epstein, D. (1987). Latent growth curves within developmental structural equation models. Child Development, 58.

Meredith, W., & Tisak, J. (1990). Latent curve analysis. Psychometrika, 55(1).

Moffitt, T. E. (1993). Adolescence-limited and life-course persistent antisocial behavior: A developmental taxonomy. Psychological Review, 100.

Muthen, B. O. (1989). Latent variable modeling in heterogeneous populations. Psychometrika, 54(4).

Muthen, B. O. (in press). 
Second-generation structural equation modeling with a combination of categorical and continuous latent variables: New opportunities for latent class/latent curve modeling. In A. Sayers & L. Collins (Eds.), New methods for the analysis of change. Washington, DC: American Psychological Association. Muthen, B. O., & Curran, P. (1997). General longitudinal modeling of individual differences in experimental design: A latent variable framework for analysis and power estimation. Psychological Methods, 2, Nagin, D., Farrington, D., & Moffitt, T. (1995). Life-course trajectories of different types of offenders. Criminology, 33, Nagin, D., & Land, K. (1993). Age, criminal careers, and population heterogeneity: Specification and estimation of a nonparametric, mixed Poisson model. Criminology, 31,


Doing Quantitative Research 26E02900, 6 ECTS Lecture 6: Structural Equations Modeling. Olli-Pekka Kauppila Daria Kautto Doing Quantitative Research 26E02900, 6 ECTS Lecture 6: Structural Equations Modeling Olli-Pekka Kauppila Daria Kautto Session VI, September 20 2017 Learning objectives 1. Get familiar with the basic idea

More information

Logistic regression: Why we often can do what we think we can do 1.

Logistic regression: Why we often can do what we think we can do 1. Logistic regression: Why we often can do what we think we can do 1. Augst 8 th 2015 Maarten L. Buis, University of Konstanz, Department of History and Sociology maarten.buis@uni.konstanz.de All propositions

More information

Multiple Act criterion:

Multiple Act criterion: Common Features of Trait Theories Generality and Stability of Traits: Trait theorists all use consistencies in an individual s behavior and explain why persons respond in different ways to the same stimulus

More information

A Race Model of Perceptual Forced Choice Reaction Time

A Race Model of Perceptual Forced Choice Reaction Time A Race Model of Perceptual Forced Choice Reaction Time David E. Huber (dhuber@psyc.umd.edu) Department of Psychology, 1147 Biology/Psychology Building College Park, MD 2742 USA Denis Cousineau (Denis.Cousineau@UMontreal.CA)

More information

Conditional spectrum-based ground motion selection. Part II: Intensity-based assessments and evaluation of alternative target spectra

Conditional spectrum-based ground motion selection. Part II: Intensity-based assessments and evaluation of alternative target spectra EARTHQUAKE ENGINEERING & STRUCTURAL DYNAMICS Published online 9 May 203 in Wiley Online Library (wileyonlinelibrary.com)..2303 Conditional spectrum-based ground motion selection. Part II: Intensity-based

More information

Catherine A. Welch 1*, Séverine Sabia 1,2, Eric Brunner 1, Mika Kivimäki 1 and Martin J. Shipley 1

Catherine A. Welch 1*, Séverine Sabia 1,2, Eric Brunner 1, Mika Kivimäki 1 and Martin J. Shipley 1 Welch et al. BMC Medical Research Methodology (2018) 18:89 https://doi.org/10.1186/s12874-018-0548-0 RESEARCH ARTICLE Open Access Does pattern mixture modelling reduce bias due to informative attrition

More information

Maternal and paternal antisocial disorder, prenatal exposure to cigarette smoke, and other risk factors for the development of conduct disorder

Maternal and paternal antisocial disorder, prenatal exposure to cigarette smoke, and other risk factors for the development of conduct disorder Maternal and paternal antisocial disorder, prenatal exposure to cigarette smoke, and other risk factors for the development of conduct disorder Mark Zoccolillo, M.D. Acknowledgements Jeroen Vermunt, Ph.D.

More information

Data Analysis Using Regression and Multilevel/Hierarchical Models

Data Analysis Using Regression and Multilevel/Hierarchical Models Data Analysis Using Regression and Multilevel/Hierarchical Models ANDREW GELMAN Columbia University JENNIFER HILL Columbia University CAMBRIDGE UNIVERSITY PRESS Contents List of examples V a 9 e xv " Preface

More information

A MONTE CARLO STUDY OF MODEL SELECTION PROCEDURES FOR THE ANALYSIS OF CATEGORICAL DATA

A MONTE CARLO STUDY OF MODEL SELECTION PROCEDURES FOR THE ANALYSIS OF CATEGORICAL DATA A MONTE CARLO STUDY OF MODEL SELECTION PROCEDURES FOR THE ANALYSIS OF CATEGORICAL DATA Elizabeth Martin Fischer, University of North Carolina Introduction Researchers and social scientists frequently confront

More information

An informal analysis of multilevel variance

An informal analysis of multilevel variance APPENDIX 11A An informal analysis of multilevel Imagine we are studying the blood pressure of a number of individuals (level 1) from different neighbourhoods (level 2) in the same city. We start by doing

More information

The University of North Carolina at Chapel Hill School of Social Work

The University of North Carolina at Chapel Hill School of Social Work The University of North Carolina at Chapel Hill School of Social Work SOWO 918: Applied Regression Analysis and Generalized Linear Models Spring Semester, 2014 Instructor Shenyang Guo, Ph.D., Room 524j,

More information

Models and strategies for factor mixture analysis: Two examples concerning the structure underlying psychological disorders

Models and strategies for factor mixture analysis: Two examples concerning the structure underlying psychological disorders Running Head: MODELS AND STRATEGIES FOR FMA 1 Models and strategies for factor mixture analysis: Two examples concerning the structure underlying psychological disorders Shaunna L. Clark and Bengt Muthén

More information

Supplement 2. Use of Directed Acyclic Graphs (DAGs)

Supplement 2. Use of Directed Acyclic Graphs (DAGs) Supplement 2. Use of Directed Acyclic Graphs (DAGs) Abstract This supplement describes how counterfactual theory is used to define causal effects and the conditions in which observed data can be used to

More information

PLEASE SCROLL DOWN FOR ARTICLE. Full terms and conditions of use:

PLEASE SCROLL DOWN FOR ARTICLE. Full terms and conditions of use: This article was downloaded by: [CDL Journals Account] On: 26 November 2009 Access details: Access Details: [subscription number 794379768] Publisher Psychology Press Informa Ltd Registered in England

More information

Multilevel Latent Class Analysis: an application to repeated transitive reasoning tasks

Multilevel Latent Class Analysis: an application to repeated transitive reasoning tasks Multilevel Latent Class Analysis: an application to repeated transitive reasoning tasks Multilevel Latent Class Analysis: an application to repeated transitive reasoning tasks MLLC Analysis: an application

More information

Tutorial #7A: Latent Class Growth Model (# seizures)

Tutorial #7A: Latent Class Growth Model (# seizures) Tutorial #7A: Latent Class Growth Model (# seizures) 2.50 Class 3: Unstable (N = 6) Cluster modal 1 2 3 Mean proportional change from baseline 2.00 1.50 1.00 Class 1: No change (N = 36) 0.50 Class 2: Improved

More information

Propensity Score Analysis Shenyang Guo, Ph.D.

Propensity Score Analysis Shenyang Guo, Ph.D. Propensity Score Analysis Shenyang Guo, Ph.D. Upcoming Seminar: April 7-8, 2017, Philadelphia, Pennsylvania Propensity Score Analysis 1. Overview 1.1 Observational studies and challenges 1.2 Why and when

More information

A Comparison of Several Goodness-of-Fit Statistics

A Comparison of Several Goodness-of-Fit Statistics A Comparison of Several Goodness-of-Fit Statistics Robert L. McKinley The University of Toledo Craig N. Mills Educational Testing Service A study was conducted to evaluate four goodnessof-fit procedures

More information

Advanced Bayesian Models for the Social Sciences

Advanced Bayesian Models for the Social Sciences Advanced Bayesian Models for the Social Sciences Jeff Harden Department of Political Science, University of Colorado Boulder jeffrey.harden@colorado.edu Daniel Stegmueller Department of Government, University

More information

Short and Long Views of Integrative Data Analysis: Comments on Contributions to the Special Issue

Short and Long Views of Integrative Data Analysis: Comments on Contributions to the Special Issue Psychological Methods 2009, Vol. 14, No. 2, 177 181 2009 American Psychological Association 1082-989X/09/$12.00 DOI: 10.1037/a0015953 Short and Long Views of Integrative Data Analysis: Comments on Contributions

More information

OLS Regression with Clustered Data

OLS Regression with Clustered Data OLS Regression with Clustered Data Analyzing Clustered Data with OLS Regression: The Effect of a Hierarchical Data Structure Daniel M. McNeish University of Maryland, College Park A previous study by Mundfrom

More information