Analyzing data from educational surveys: a comparison of HLM and Multilevel IRT. Amin Mousavi


Amin Mousavi
Centre for Research in Applied Measurement and Evaluation, University of Alberta

Paper presented at the 2013 annual meeting of the National Council on Measurement in Education, San Francisco, California, USA, April 26-30

Introduction

During the last two decades, there has been an increased worldwide focus on educational quality and learning outcomes. International educational surveys such as TIMSS, PIRLS and PISA provide a basis for assessing educational quality at different levels, together with the factors that influence it. Using the information from such large-scale assessments, participating countries can explore the strengths and weaknesses of their educational systems in comparison with those of other participating countries.

The International Association for the Evaluation of Educational Achievement (IEA) administers TIMSS, which assesses mathematics and science, and PIRLS, which assesses reading literacy. IEA's Trends in International Mathematics and Science Study (TIMSS) provides useful information about students' mathematics and science achievement in an international context. TIMSS assesses students at the fourth and eighth grades and also collects a wealth of data from principals and teachers about curriculum and instruction in mathematics and science. There is also an advanced TIMSS, designed to assess school-leaving students with special preparation in advanced mathematics and physics. Countries participating in advanced TIMSS want internationally comparative data about the achievement of their students enrolled in advanced courses designed to lead into science-oriented university programs. TIMSS uses the curriculum, broadly defined, as the major organizing concept in considering how educational opportunities are provided to students and which factors influence how students use these opportunities.

Large-scale assessments like TIMSS achieve broad coverage of the targeted content domain by dividing the item pool into blocks or clusters of items. Each student responds to one or more of these blocks and so receives only a subset of the total item pool.
Under this design, each student responds to only a portion of the whole assessment in the form of a booklet. The test booklets are partially linked through blocks that occur in multiple booklets. Because each student responds to a relatively small number of items, measurement at the individual level is considerably less accurate than it would be if students were administered the full test. Common approaches to estimating individual proficiency, such as marginal maximum likelihood (MML) and expected a posteriori (EAP) estimates, are optimal for individual students but not for classes and schools. These approaches result in biased estimates of group-level results (von Davier et al., 2009).

One way of solving this problem and obtaining accurate group-level estimates is to use multiple values representing the expected distribution of a student's ability. These so-called plausible values (PVs) allow unbiased estimation of the plausible range and the location of proficiency for groups of students. Plausible values are based on student responses to the subset of items they receive, as well as on other relevant and available background information (Mislevy, 1991). They can be viewed as a set of estimates generated using multiple imputation. Plausible values are not individual scores in the traditional sense and should therefore not be analyzed as multiple indicators of the same score or latent variable (Mislevy, 1993). There are at least two approaches to analyzing data from large-scale assessments, which are briefly described in the next two parts.

Hierarchical Linear Modeling (HLM)

In social and behavioural research, we often work with data sets in which individuals are nested within units or groups, and this grouping can affect individuals' scores. For instance, students are nested within classes or schools, and properties of the class or school, such as differences among teachers or in school resources, can affect students' achievement. To analyze this kind of nested data, a researcher has to take the structure of the data into account in order to extract more meaningful and accurate information about individuals. A general and widely accepted statistical method for analyzing nested data is hierarchical linear modeling, or multilevel modeling. Multilevel models (Goldstein, 1995), also called hierarchical linear models (Raudenbush & Bryk, 2002; Snijders & Bosker, 2012), take the nested structure of the data into account.
Briefly, multilevel models describe relationships between a dependent variable and a set of explanatory variables while respecting the hierarchy in the data. The relative variation in the outcome measure (the dependent variable) between students within the same school and between schools can therefore be evaluated. Multilevel models are used to make inferences about the relationships among explanatory and outcome variables at different levels. Because data from large-scale tests like TIMSS are collected in such a way that students are nested within classes, schools and countries, analyzing data obtained from educational surveys calls for multilevel modeling. The outcome variables in most large-scale tests, such as TIMSS, PISA and PIRLS, are usually in the form of plausible values. The common practice is to analyze the plausible values and explanatory variables using the multiple imputation method in order to obtain the final model estimates. In this approach, test items are not used directly as performance indicators of individuals in the multilevel model. An alternative is to use the test items themselves as indicators of individuals' performance within a multilevel framework. Fox and Glas (2001, 2003) proposed a method for multilevel modeling that takes into account the uncertainty of performance indicators at different levels by embedding Item Response Theory (IRT) models in a multilevel framework, known as Multilevel Item Response Theory (MLIRT).

Multilevel Item Response Theory (MLIRT)

In most educational research, measurements are needed at the individual and group levels. Students' examination results can be regarded as indicators of their abilities, and they are measured with error. In other words, observed test data can be considered indicators of latent variables (i.e., ability), and these latent variables can be integrated into a multilevel model. A multilevel IRT model extends traditional IRT models so that they account for variation in ability between groups, such as schools or classes, as well as within groups. Hence, a multilevel IRT model distinguishes individual-level abilities from group-level abilities. For example, a multilevel extension of the two-parameter logistic IRT model for dichotomously scored items can be written as

$$P(Y_{ipg} = 1 \mid \theta_{pg}) = \frac{\exp[a_i(\theta_{pg} - b_i)]}{1 + \exp[a_i(\theta_{pg} - b_i)]} \qquad (1)$$

where $P(Y_{ipg} = 1 \mid \theta_{pg})$ is the probability of a correct answer to the $i$th item by person $p$ in group $g$; $a_i$ and $b_i$ are the discrimination and difficulty parameters of the $i$th item, respectively; and $\theta_{pg} = \xi_g + \zeta_{pg}$. Here, $\theta_{pg}$ is the ability of person $p$ in group $g$, $\xi_g$ is the mean ability of group $g$, and $\zeta_{pg}$ is the deviation of person $p$ from the group mean ability.
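As a rough illustration of Equation (1), the following Python sketch evaluates the response probability for a person whose ability is split into a group mean and an individual deviation. It assumes the common two-parameter logistic parameterization $a_i(\theta_{pg} - b_i)$; the function names and the numeric values are hypothetical and serve only to show how the pieces fit together.

```python
import math


def group_ability(xi_g, zeta_pg):
    """Ability of person p in group g: theta_pg = xi_g + zeta_pg."""
    return xi_g + zeta_pg


def p_correct_2pl(theta, a, b):
    """Two-parameter logistic probability of a correct response.

    Uses the algebraically equivalent form 1 / (1 + exp(-a * (theta - b))).
    """
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))


# Hypothetical values: group mean 0.3, individual deviation -0.1,
# item discrimination 1.2 and item difficulty 0.5.
theta_pg = group_ability(0.3, -0.1)
probability = p_correct_2pl(theta_pg, a=1.2, b=0.5)
print(f"theta = {theta_pg:.2f}, P(correct) = {probability:.3f}")
```

The sketch only evaluates the model for given parameter values; it does not estimate any of them.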
The model in Equation (1) is one of the simplest forms of a multilevel IRT model. Typical applications of multilevel IRT models, however, include explanatory variables (Kamata & Vaughn, 2011). Fox and Glas (2001) extended this idea to multilevel linear modeling with the two-parameter normal ogive and graded response models as the measurement models. The multilevel model is embedded in the IRT framework to model the relationship between observed individual and group characteristics and an outcome variable measured by dichotomous or polytomous items. Let $y_{ijk}$ denote the observed response of the $i$th student in the $j$th school to item $k$. A two-parameter IRT model relates the observed dichotomous item response to the student's latent ability, $\theta_{ij}$:

$$P(y_{ijk} = 1 \mid \theta_{ij}, a_k, b_k) = \Phi(a_k \theta_{ij} - b_k) \qquad (2)$$

where $\Phi$ is the standard normal cumulative distribution function and $a_k$ and $b_k$ are the discrimination and difficulty parameters of item $k$, respectively. The latent ability can be expressed at level 1 of a multilevel model as a linear combination of level-1 predictors:

$$\theta_{ij} = \beta_{0j} + \beta_{1j} X_{1ij} + \cdots + \beta_{Qj} X_{Qij} + e_{ij} \qquad (3)$$

in which $X_{qij}$ is the $q$th explanatory variable at level 1 ($q = 1, \ldots, Q$). At level 2, the intercept and each regression coefficient can be expressed as a linear combination of level-2 predictors:

$$\beta_{qj} = \gamma_{q0} + \gamma_{q1} W_{1qj} + \cdots + \gamma_{qS} W_{Sqj} + u_{qj} \qquad (4)$$

where $W_{sqj}$ is the $s$th explanatory variable at level 2 ($s = 1, \ldots, S$). Both residuals, $e_{ij}$ and $u_{qj}$, are assumed to be independently and normally distributed.

The above equations define a multilevel IRT model with a latent dependent variable measured by a two-parameter IRT model. Because a latent variable is used as the dependent variable, the IRT model can be seen as an additional level within the multilevel model. A multilevel IRT model can also consist of a multilevel model that defines the relations between different latent variables, together with various IRT models for measuring those latent variables. In summary, MLIRT is a method for using IRT ability estimates based on the normal ogive model as the outcome variable in a multilevel model. Traditional multilevel models assume that their variables are measured without error, which can lead to biased estimates of the model parameters, whereas MLIRT is designed to handle measurement error in the explanatory variables (Fox, 2004). This means that the uncertainty in the measurement of the latent variables is taken into account in the estimation of the other model parameters.
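To make the structure of Equations (2) to (4) concrete, the sketch below generates dichotomous responses for one school by working from level 2 down to the item level. It is a data-generating illustration under assumed values, not the estimation procedure of the mlirt package: all function names, the shared covariate vector `w_j` and the parameter values are hypothetical, and only the Python standard library is used.

```python
import random
from statistics import NormalDist

PHI = NormalDist().cdf  # standard normal CDF, the Phi of Equation (2)


def school_coefficient(gamma_q, w_j, u_qj):
    """Equation (4): beta_qj = gamma_q0 + sum_s gamma_qs * W_sqj + u_qj."""
    return gamma_q[0] + sum(g * w for g, w in zip(gamma_q[1:], w_j)) + u_qj


def latent_ability(beta_j, x_ij, e_ij):
    """Equation (3): theta_ij = beta_0j + sum_q beta_qj * X_qij + e_ij."""
    return beta_j[0] + sum(b * x for b, x in zip(beta_j[1:], x_ij)) + e_ij


def p_correct(theta_ij, a_k, b_k):
    """Equation (2): P(y_ijk = 1) = Phi(a_k * theta_ij - b_k)."""
    return PHI(a_k * theta_ij - b_k)


def simulate_school(rng, gammas, w_j, x_rows, items, tau_sd, sigma_sd):
    """Draw 0/1 item responses for every student in one school."""
    # Level 2: one coefficient per level-1 term; a zero SD gives a fixed effect.
    beta_j = [school_coefficient(g, w_j, rng.gauss(0.0, sd))
              for g, sd in zip(gammas, tau_sd)]
    responses = []
    for x_ij in x_rows:
        # Level 1: latent ability of the student.
        theta = latent_ability(beta_j, x_ij, rng.gauss(0.0, sigma_sd))
        # Measurement level: one Bernoulli draw per item.
        responses.append([int(rng.random() < p_correct(theta, a, b))
                          for a, b in items])
    return responses


rng = random.Random(1)
gammas = [[0.0, 0.3], [0.2], [-0.4]]  # random intercept with one school covariate, two fixed slopes
responses = simulate_school(
    rng, gammas, w_j=[1.0],
    x_rows=[[0.5, 0], [-1.0, 1]],     # two students, two level-1 covariates each
    items=[(1.0, 0.0), (0.8, 0.5)],   # (a_k, b_k) pairs
    tau_sd=[0.5, 0.0, 0.0], sigma_sd=1.0,
)
print(responses)
```

For simplicity the sketch passes the same level-2 covariates to every coefficient, whereas Equation (4) allows them to differ by coefficient.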
Usually, individual abilities or group characteristics are estimated and then imputed in the multilevel analysis, and these measurements are treated as if they were observed without error. As a result, the estimated regression coefficients are biased and their standard deviations are too small (Fox, 2004). Standard multilevel software (e.g., MLwiN, HLM) cannot estimate all parameters of a multilevel IRT model simultaneously. Therefore, a free package called mlirt was developed for the R environment (R Development Core Team, 2012) to estimate MLIRT models (Fox, 2003). This package can be used to analyze dichotomous or polytomous responses, and the method is applicable to data from large-scale assessments like TIMSS. Fox (2007) used the mlirt package to analyze data from PISA 2003 and compared its results with those of the HLM 5 program (Raudenbush et al., 2000). He showed that the estimates from mlirt and HLM are close, but that the variance components estimated with MLIRT are larger than those from HLM with plausible values, which is attributed to mlirt taking into account the measurement error of the explanatory variables. The aim of this study is therefore to replicate Fox's study using advanced TIMSS 2008 data in order to investigate further the MLIRT method proposed by Fox and Glas.

Methodology

Data/participants: Data from advanced TIMSS 2008 mathematics (IEA, 2008) for Iran were used to compare the two procedures. The data set contained 2,362 students (60.6% male, coded as 0; 39.4% female, coded as 1) nested within 116 schools.

Achievement test and questionnaires: The advanced mathematics assessment framework for TIMSS Advanced 2008 was organized around two dimensions: a content dimension specifying the subject matter to be assessed within mathematics (i.e., algebra, calculus, and geometry) and a cognitive dimension specifying the thinking processes to be assessed (i.e., knowing, applying, and reasoning). The items were distributed over four linked booklets (Arora et al., 2009). Although the assessment included both dichotomously and polytomously scored items, only the dichotomous items were used in the present study.
The data for the explanatory variables were obtained from the student, teacher and school questionnaires.

Multilevel modeling: Three HLM models were considered in this study:

1) Null model (M0): This is a model without any explanatory variables. It can be written as

Level-1: $Y_{ij} = \beta_{0j} + e_{ij}$
Level-2: $\beta_{0j} = \gamma_{00} + u_{0j}$

2) Model 1 (M1): This is a random intercept model with two explanatory variables at level 1, attitude towards math (AM) and student's gender (SEX). It can be written as

Level-1: $Y_{ij} = \beta_{0j} + \beta_{1j}(AM)_{ij} + \beta_{2j}(SEX)_{ij} + e_{ij}$
Level-2: $\beta_{0j} = \gamma_{00} + u_{0j}$, $\beta_{1j} = \gamma_{10}$, $\beta_{2j} = \gamma_{20}$

3) Model 2 (M2): This is a random intercept model with two explanatory variables at level 1, attitude towards math (AM) and student's gender (SEX), and one explanatory variable at level 2, school resources for teaching math (RESOURCES). It can be written as

Level-1: $Y_{ij} = \beta_{0j} + \beta_{1j}(AM)_{ij} + \beta_{2j}(SEX)_{ij} + e_{ij}$
Level-2: $\beta_{0j} = \gamma_{00} + \gamma_{01}(RESOURCES)_{j} + u_{0j}$, $\beta_{1j} = \gamma_{10}$, $\beta_{2j} = \gamma_{20}$

All explanatory variables were assumed to have fixed effects across level-2 units. For comparison, the sum of raw scores of students' responses on the mathematics test was used as an outcome variable in HLM in addition to the plausible values. To make the results of the two methods (i.e., MLIRT and HLM) comparable, the outcome variables entered into HLM were standardized to a mean of 0 and a standard deviation of 1, because the mlirt package assumes that ability estimates have a mean of 0 and a standard deviation of 1.

Software/Estimation method: The traditional multilevel models were analyzed with the HLM 6 program (Raudenbush, Bryk, Cheong, & Congdon, 2004), using plausible values and standardized sums of raw scores as outcome variables. In HLM, plausible values are analyzed by multiple imputation. The data set contained five plausible values for each student, and HLM fits the given multilevel model once for each plausible value. The final estimate of each parameter was computed in the following steps (Snijders & Bosker, 2012). First, the average estimate over the multiple imputations is

$$\bar{\theta} = \frac{1}{M} \sum_{m=1}^{M} \hat{\theta}_m \qquad (5)$$

where $M$ is the number of imputations (here, 5) and $\hat{\theta}_m$ is the estimate of a given model parameter from the $m$th imputation. Then the average within-data-set variance and the between-imputation variance are obtained via

$$\bar{W} = \frac{1}{M} \sum_{m=1}^{M} \left[\mathrm{S.E.}(\hat{\theta}_m)\right]^2 \qquad (6)$$

and

$$B = \frac{1}{M-1} \sum_{m=1}^{M} \left(\hat{\theta}_m - \bar{\theta}\right)^2 \qquad (7)$$

The standard error of the combined estimate is then

$$\mathrm{S.E.}(\bar{\theta}) = \sqrt{\bar{W} + \left(1 + \frac{1}{M}\right) B} \qquad (8)$$

The mlirt 2.0 package developed by Fox (2010) for the R environment (R Development Core Team, 2012) was used to fit the multilevel item response theory model. This package uses a Markov chain Monte Carlo (MCMC) algorithm for parameter estimation in a Bayesian framework. Common normal priors are specified for the item parameters, $(a_k, b_k) \sim N(\mu, \Sigma)$. A Gibbs sampler is used to simulate draws from the conditional posterior distributions for binary responses. The estimation method in the mlirt package is complex and computationally intensive; readers can refer to Fox (2007) for details.

Results

The parameter estimates for the null model (M0) from the two methods are reported in Table 1. In this table, $\gamma_{00}$ is the level-2 intercept, $\sigma^2$ is the variance of the outcome variable at level 1, $\tau_{00}$ is the variance of the outcome variable among schools, and $\rho$ is the estimated intraclass correlation.
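The pooling steps in Equations (5) to (8), which combine the separate analyses of the five plausible values, can be sketched in Python as follows. This is a minimal illustration of the combination rules, not the routine that HLM 6 runs internally; the function name `pool_estimates` and the example numbers are hypothetical.

```python
import math
from statistics import mean, variance


def pool_estimates(estimates, standard_errors):
    """Combine per-imputation estimates into one estimate and standard error."""
    if len(estimates) != len(standard_errors) or len(estimates) < 2:
        raise ValueError("need at least two estimates with matching standard errors")
    m = len(estimates)
    pooled = mean(estimates)                           # Equation (5)
    within = mean([se ** 2 for se in standard_errors])  # Equation (6)
    between = variance(estimates)                       # Equation (7), divisor M - 1
    pooled_se = math.sqrt(within + (1 + 1 / m) * between)  # Equation (8)
    return pooled, pooled_se


# Hypothetical coefficient estimates from five plausible-value analyses.
estimate, se = pool_estimates(
    [-0.21, -0.19, -0.23, -0.20, -0.22],
    [0.05, 0.05, 0.06, 0.05, 0.05],
)
print(f"pooled estimate = {estimate:.3f}, S.E. = {se:.3f}")
```

The pooled standard error is never smaller than the square root of the average within-imputation variance, because the between-imputation term adds the uncertainty that comes from the plausible values differing from one another.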

Table 1: Estimates of model parameters for the null model (M0)
Columns: HLM-PV, HLM-Raw, MLIRT (Estimate and S.E. for each)
Fixed part: $\gamma_{00}$ (INTERCEPT)
Random part: $\sigma^2$, $\tau_{00}$, $\rho$
* p ≤ 0.05
[Numeric entries not preserved in this transcription.]

Results from MLIRT are quite different from those of the traditional multilevel model for the null model. In the fixed part, all estimates are close to zero, which is due to the standardization of the outcome variable in HLM and the scale of the ability estimates in MLIRT. In the random part, the $\sigma^2$ estimate based on plausible values is smaller than the one based on raw scores, whereas the MLIRT estimate is larger than both. This was expected because MLIRT also takes into account the measurement error of the explanatory variables. The $\tau_{00}$ estimate based on raw scores is about half of the estimate based on plausible values, indicating more observed variation between schools when plausible values are used; both are higher than the MLIRT estimate. Finally, the intraclass correlation coefficients for the plausible values and raw scores indicate that about 49% and 26.5% of the variation in the outcome variable, respectively, is due to the nesting of students within schools. This share drops sharply to 5.2% for the MLIRT model.

Table 2 shows the parameter estimates for Model 1 (M1) from the two methods. In this table, $\gamma_{10}$ and $\gamma_{20}$ are the level-2 intercepts of the corresponding level-1 explanatory variables.
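The intraclass correlations discussed for the null model follow from the two variance components under the usual definition $\rho = \tau_{00} / (\tau_{00} + \sigma^2)$. The short Python sketch below applies that definition; the paper does not spell the formula out, so treat it as the standard textbook form, and note that the input values are hypothetical rather than taken from Table 1.

```python
def intraclass_correlation(tau00, sigma2):
    """Share of total outcome variance that lies between schools."""
    total = tau00 + sigma2
    if tau00 < 0 or sigma2 < 0 or total == 0:
        raise ValueError("variance components must be non-negative and not both zero")
    return tau00 / total


# Hypothetical variance components for a standardized outcome.
rho = intraclass_correlation(tau00=0.2, sigma2=0.6)
print(f"ICC = {rho:.1%}")
```

With a fixed between-school variance, a larger level-1 variance lowers the intraclass correlation.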

Table 2: Estimates of model parameters for Model 1 (M1)
Columns: HLM-PV, HLM-Raw, MLIRT (Estimate and S.E. for each)
Fixed part: $\gamma_{00}$ (INTERCEPT), $\gamma_{10}$ (AM), $\gamma_{20}$ (SEX, marked * in all three columns)
Random part: $\sigma^2$, $\tau_{00}$, $\rho$
* p ≤ 0.05
[Numeric entries not preserved in this transcription.]

The results reported in Table 2 show that the explained variance using HLM ranges from 24.8% for raw scores to 47.8% for plausible values, whereas the explained variance using MLIRT is only 4.1%. In the fixed part, AM is not significant (p > 0.05) for any method or outcome variable, whereas SEX is significant (p < 0.05) for all methods and outcome variables. The negative sign of the SEX estimate indicates that males outperformed females. Pseudo R-squares, based on the formula provided by Snijders & Bosker (2012), after adding the two explanatory variables to the null model for plausible values, raw score and MLIRT are 0.026, and 0.006, respectively.

Table 3 shows the parameter estimates for Model 2 (M2) for the two methods. In this table, $\gamma_{01}$ is the regression coefficient of RESOURCES at level 2.
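The pseudo R-squares mentioned above compare the residual variance of a fitted model with that of the null model. The sketch below assumes the level-1 version of the Snijders and Bosker measure, $R_1^2 = 1 - (\hat{\sigma}^2_{M} + \hat{\tau}_{00,M}) / (\hat{\sigma}^2_{0} + \hat{\tau}_{00,0})$; the paper does not state which variant was used, and the variance components in the example are hypothetical.

```python
def pseudo_r2_level1(sigma2_null, tau00_null, sigma2_model, tau00_model):
    """Proportional reduction in total residual variance relative to the null model."""
    null_total = sigma2_null + tau00_null
    if null_total <= 0:
        raise ValueError("null-model variance must be positive")
    return 1.0 - (sigma2_model + tau00_model) / null_total


# Hypothetical variance components for the null model and a model with predictors.
r2 = pseudo_r2_level1(sigma2_null=0.8, tau00_null=0.2,
                      sigma2_model=0.75, tau00_model=0.15)
print(f"pseudo R-squared = {r2:.3f}")
```

A value near zero, such as those reported in the text, means that the added explanatory variables remove very little of the unexplained variance.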

Table 3: Estimates of model parameters for Model 2 (M2)
Columns: HLM-PV, HLM-Raw, MLIRT (Estimate and S.E. for each)
Fixed part: $\gamma_{00}$ (INTERCEPT); Student: $\gamma_{10}$ (AM), $\gamma_{20}$ (SEX, marked * in all three columns); School: $\gamma_{01}$ (RESOURCES, marked * in all three columns)
Random part: $\sigma^2$, $\tau_{00}$, $\rho$
* p ≤ 0.05
[Numeric entries not preserved in this transcription.]

From Table 3, it can be seen that the intraclass correlation coefficients are smaller than those for M0 and M1. The amounts of explained variance for plausible values and raw scores are 45.8% and 22.4%, respectively, and 3.8% for MLIRT. In the fixed part, RESOURCES and SEX are significant (p < 0.05) for all methods and outcome variables.

Discussion and Conclusion

The observed difference between MLIRT and HLM could be due to the different outcome variables. Although the results from plausible values and raw scores are not very close to each other, both differ from the MLIRT results. This could be due to the distribution of the thetas estimated by MLIRT, shown in Figure 1, which displays the probability density functions of the outcome variables used in HLM and MLIRT, obtained by kernel density estimation (Silverman, 1986). The mean of the plausible values is used in the following graphs only for illustration, to avoid the clutter of plotting five plausible values in one graph.
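The kind of kernel density estimate used for these graphs can be sketched with the Python standard library alone. The example assumes a Gaussian kernel and Silverman's rule-of-thumb bandwidth, $h = 0.9 \min(s, \mathrm{IQR}/1.34)\, n^{-1/5}$; the paper cites Silverman (1986) but does not state the kernel or bandwidth it used, so both are assumptions, as are the function names and the toy data.

```python
import math
import statistics


def silverman_bandwidth(data):
    """Rule-of-thumb bandwidth: 0.9 * min(sd, IQR / 1.34) * n ** (-1/5)."""
    n = len(data)
    sd = statistics.stdev(data)
    q1, _, q3 = statistics.quantiles(data, n=4)
    return 0.9 * min(sd, (q3 - q1) / 1.34) * n ** (-0.2)


def gaussian_kde(data, bandwidth=None):
    """Return a function that evaluates the Gaussian kernel density estimate."""
    h = bandwidth if bandwidth is not None else silverman_bandwidth(data)
    norm = 1.0 / (len(data) * h * math.sqrt(2.0 * math.pi))

    def density(x):
        return norm * sum(math.exp(-0.5 * ((x - xi) / h) ** 2) for xi in data)

    return density


# Toy bimodal sample standing in for estimated thetas (hypothetical values).
thetas = [-1.1, -1.0, -0.9, 0.9, 1.0, 1.1]
density = gaussian_kde(thetas, bandwidth=0.3)
for x in (-1.0, 0.0, 1.0):
    print(f"f({x:+.1f}) = {density(x):.3f}")
```

A dip in the estimated density between two peaks, as at zero in this toy sample, is the kind of bimodality that such a plot makes visible.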

Figure 1: Distribution of estimated theta from MLIRT against the mean of plausible values and the sum score

From Figure 1, it can clearly be seen that the estimated density functions of the mean of plausible values (solid line) and the sum scores (dotted line) are close to each other, whereas the estimated density function of the MLIRT thetas (dashed line) is different and bimodal. This difference can affect the multilevel parameter estimates. For further investigation, a dummy variable was generated from the theta distribution, with values smaller than 0 coded as 1 and values equal to or greater than 0 coded as 2. With this student-level explanatory variable in the model, the intraclass correlation in the MLIRT analysis increased to 12.8%, suggesting that there is another clustering variable that is not taken into account in the MLIRT analysis. One possible clustering factor is multidimensionality of the test, because MLIRT assumes that the data are unidimensional. The low discrimination indices obtained from MLIRT (i.e., mean = 0.66, variance = 0.40, range = ) also suggest more than one dimension. A dimensionality analysis using the mirt package for R (Chalmers, 2012) revealed that a two-factor model fits the data better (Table 4). The Tucker-Lewis index (TLI) and the Root Mean Square Error of Approximation (RMSEA) suggest that the two-factor model fits better, and compared with the three-factor model it is more parsimonious.

Table 4: Dimensionality analysis of the data
Columns: One factor, Two factor, Three factor
Rows: Log-Likelihood, AIC, BIC, TLI, RMSEA
[Numeric entries not preserved in this transcription.]

Another possible source of this clustering factor is the complexity of the data and of the sampling design used in TIMSS. Fox (2007) analyzed PISA data and found close estimates from the two methods. A kernel density estimate of the estimated thetas from Fox's study shows fairly unidimensional, normally distributed thetas whose distribution is close to that of the mean of PVs (Figure 2).

Figure 2: Distribution of estimated theta from MLIRT in the Fox (2007) study

It seems that the main difference between the results of the current study and those of Fox (2007) in comparing HLM and MLIRT is due to the difference in the latent abilities estimated by the mlirt package. To sum up, although the MLIRT approach seems promising for the analysis of data from large-scale assessments, the accuracy of the estimates depends heavily on meeting the assumptions of the analytical method used. For the advanced TIMSS 2008 data used in this study, it seems that plausible values could capture the underlying grouping variable in the data while MLIRT could not. A researcher who wants to use MLIRT therefore needs to consider the possible existence of such factors in the data and include them in the multilevel model, which is not easy. On the other hand, the similarity between the distributions of theta and of the mean of PVs in the Fox (2007) study indicates that under some circumstances there is no need to generate plausible values, and the estimated theta values can provide credible results (assuming that plausible values can provide credible results). Nevertheless, the results suggest an obvious need for further investigation of the merits of using MLIRT or HLM for analyzing data from large-scale assessments.

References

Arora, A., Foy, P., Martin, M. O., & Mullis, I. V. S. (2009). TIMSS advanced 2008 technical report. Chestnut Hill, MA: TIMSS & PIRLS International Study Center, Boston College.
Chalmers, R. P. (2012). mirt: A multidimensional item response theory package for the R environment. Journal of Statistical Software, 48(6).
Fox, J.-P. (2004). Applications of multilevel IRT modeling. School Effectiveness and School Improvement, 3-4.
Fox, J.-P., & Glas, C. A. W. (2001). Bayesian estimation of a multilevel IRT model using Gibbs sampling. Psychometrika, 66.
Fox, J.-P., & Glas, C. A. W. (2003). Bayesian modeling of measurement error in predictor variables using item response theory. Psychometrika, 68.
Fox, J.-P. (2007). Multilevel IRT modeling in practice with the package mlirt. University of California at Los Angeles, Department of Statistics.
Fox, J.-P. (2010). mlirt: Multilevel item response theory (mlirt) modeling. R package version
Goldstein, H. (1995). Multilevel statistical models (2nd ed.). London: Edward Arnold.
Kamata, A., & Vaughn, B. K. (2011). Multilevel IRT modeling. In J. J. Hox & J. K. Roberts, Handbook of advanced multilevel analysis. Psychology Press.
Mislevy, R. J. (1991). Randomization-based inference about latent variables from complex samples. Psychometrika, 56(2).
Mislevy, R. J. (1993). Should multiple imputations be treated as multiple indicators? Psychometrika, 58(1).
R Core Team (2012). R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria.
Raudenbush, S. W., & Bryk, A. S. (2002). Hierarchical linear models: Applications and data analysis methods. Thousand Oaks: Sage Publications.
Raudenbush, S. W., Bryk, A. S., Cheong, Y. F., & Congdon, R. T., Jr. (2000). HLM 5: Hierarchical linear and nonlinear modeling. Lincolnwood, IL: Scientific Software International.
Raudenbush, S. W., Bryk, A. S., Cheong, Y. F., & Congdon, R. T., Jr. (2004). HLM 6: Hierarchical linear and nonlinear modeling. Lincolnwood, IL: Scientific Software International.
Silverman, B. W. (1986). Density estimation. London, England: Chapman and Hall.
Snijders, T. A. B., & Bosker, R. J. (2012). Multilevel analysis: An introduction to basic and advanced multilevel modeling. Los Angeles: Sage.
von Davier, M., Gonzalez, E., & Mislevy, R. (2009). What are plausible values and why are they useful? IERI Monograph Series, Volume 2, 9-3


Running head: NESTED FACTOR ANALYTIC MODEL COMPARISON 1. John M. Clark III. Pearson. Author Note Running head: NESTED FACTOR ANALYTIC MODEL COMPARISON 1 Nested Factor Analytic Model Comparison as a Means to Detect Aberrant Response Patterns John M. Clark III Pearson Author Note John M. Clark III,

More information

Using response time data to inform the coding of omitted responses

Using response time data to inform the coding of omitted responses Psychological Test and Assessment Modeling, Volume 58, 2016 (4), 671-701 Using response time data to inform the coding of omitted responses Jonathan P. Weeks 1, Matthias von Davier & Kentaro Yamamoto Abstract

More information

Technical Specifications

Technical Specifications Technical Specifications In order to provide summary information across a set of exercises, all tests must employ some form of scoring models. The most familiar of these scoring models is the one typically

More information

C h a p t e r 1 1. Psychologists. John B. Nezlek

C h a p t e r 1 1. Psychologists. John B. Nezlek C h a p t e r 1 1 Multilevel Modeling for Psychologists John B. Nezlek Multilevel analyses have become increasingly common in psychological research, although unfortunately, many researchers understanding

More information

Score Tests of Normality in Bivariate Probit Models

Score Tests of Normality in Bivariate Probit Models Score Tests of Normality in Bivariate Probit Models Anthony Murphy Nuffield College, Oxford OX1 1NF, UK Abstract: A relatively simple and convenient score test of normality in the bivariate probit model

More information

Basic concepts and principles of classical test theory

Basic concepts and principles of classical test theory Basic concepts and principles of classical test theory Jan-Eric Gustafsson What is measurement? Assignment of numbers to aspects of individuals according to some rule. The aspect which is measured must

More information

Hierarchical Bayesian Modeling of Individual Differences in Texture Discrimination

Hierarchical Bayesian Modeling of Individual Differences in Texture Discrimination Hierarchical Bayesian Modeling of Individual Differences in Texture Discrimination Timothy N. Rubin (trubin@uci.edu) Michael D. Lee (mdlee@uci.edu) Charles F. Chubb (cchubb@uci.edu) Department of Cognitive

More information

Understanding and Applying Multilevel Models in Maternal and Child Health Epidemiology and Public Health

Understanding and Applying Multilevel Models in Maternal and Child Health Epidemiology and Public Health Understanding and Applying Multilevel Models in Maternal and Child Health Epidemiology and Public Health Adam C. Carle, M.A., Ph.D. adam.carle@cchmc.org Division of Health Policy and Clinical Effectiveness

More information

Having your cake and eating it too: multiple dimensions and a composite

Having your cake and eating it too: multiple dimensions and a composite Having your cake and eating it too: multiple dimensions and a composite Perman Gochyyev and Mark Wilson UC Berkeley BEAR Seminar October, 2018 outline Motivating example Different modeling approaches Composite

More information

The Classification Accuracy of Measurement Decision Theory. Lawrence Rudner University of Maryland

The Classification Accuracy of Measurement Decision Theory. Lawrence Rudner University of Maryland Paper presented at the annual meeting of the National Council on Measurement in Education, Chicago, April 23-25, 2003 The Classification Accuracy of Measurement Decision Theory Lawrence Rudner University

More information

Bayesian and Frequentist Approaches

Bayesian and Frequentist Approaches Bayesian and Frequentist Approaches G. Jogesh Babu Penn State University http://sites.stat.psu.edu/ babu http://astrostatistics.psu.edu All models are wrong But some are useful George E. P. Box (son-in-law

More information

THE MANTEL-HAENSZEL METHOD FOR DETECTING DIFFERENTIAL ITEM FUNCTIONING IN DICHOTOMOUSLY SCORED ITEMS: A MULTILEVEL APPROACH

THE MANTEL-HAENSZEL METHOD FOR DETECTING DIFFERENTIAL ITEM FUNCTIONING IN DICHOTOMOUSLY SCORED ITEMS: A MULTILEVEL APPROACH THE MANTEL-HAENSZEL METHOD FOR DETECTING DIFFERENTIAL ITEM FUNCTIONING IN DICHOTOMOUSLY SCORED ITEMS: A MULTILEVEL APPROACH By JANN MARIE WISE MACINNES A DISSERTATION PRESENTED TO THE GRADUATE SCHOOL OF

More information

Dimensionality of the Force Concept Inventory: Comparing Bayesian Item Response Models. Xiaowen Liu Eric Loken University of Connecticut

Dimensionality of the Force Concept Inventory: Comparing Bayesian Item Response Models. Xiaowen Liu Eric Loken University of Connecticut Dimensionality of the Force Concept Inventory: Comparing Bayesian Item Response Models Xiaowen Liu Eric Loken University of Connecticut 1 Overview Force Concept Inventory Bayesian implementation of one-

More information

Ordinal Data Modeling

Ordinal Data Modeling Valen E. Johnson James H. Albert Ordinal Data Modeling With 73 illustrations I ". Springer Contents Preface v 1 Review of Classical and Bayesian Inference 1 1.1 Learning about a binomial proportion 1 1.1.1

More information

On Test Scores (Part 2) How to Properly Use Test Scores in Secondary Analyses. Structural Equation Modeling Lecture #12 April 29, 2015

On Test Scores (Part 2) How to Properly Use Test Scores in Secondary Analyses. Structural Equation Modeling Lecture #12 April 29, 2015 On Test Scores (Part 2) How to Properly Use Test Scores in Secondary Analyses Structural Equation Modeling Lecture #12 April 29, 2015 PRE 906, SEM: On Test Scores #2--The Proper Use of Scores Today s Class:

More information

Unit 1 Exploring and Understanding Data

Unit 1 Exploring and Understanding Data Unit 1 Exploring and Understanding Data Area Principle Bar Chart Boxplot Conditional Distribution Dotplot Empirical Rule Five Number Summary Frequency Distribution Frequency Polygon Histogram Interquartile

More information

MISSING DATA AND PARAMETERS ESTIMATES IN MULTIDIMENSIONAL ITEM RESPONSE MODELS. Federico Andreis, Pier Alda Ferrari *

MISSING DATA AND PARAMETERS ESTIMATES IN MULTIDIMENSIONAL ITEM RESPONSE MODELS. Federico Andreis, Pier Alda Ferrari * Electronic Journal of Applied Statistical Analysis EJASA (2012), Electron. J. App. Stat. Anal., Vol. 5, Issue 3, 431 437 e-issn 2070-5948, DOI 10.1285/i20705948v5n3p431 2012 Università del Salento http://siba-ese.unile.it/index.php/ejasa/index

More information

Context of Best Subset Regression

Context of Best Subset Regression Estimation of the Squared Cross-Validity Coefficient in the Context of Best Subset Regression Eugene Kennedy South Carolina Department of Education A monte carlo study was conducted to examine the performance

More information

IDENTIFYING DATA CONDITIONS TO ENHANCE SUBSCALE SCORE ACCURACY BASED ON VARIOUS PSYCHOMETRIC MODELS

IDENTIFYING DATA CONDITIONS TO ENHANCE SUBSCALE SCORE ACCURACY BASED ON VARIOUS PSYCHOMETRIC MODELS IDENTIFYING DATA CONDITIONS TO ENHANCE SUBSCALE SCORE ACCURACY BASED ON VARIOUS PSYCHOMETRIC MODELS A Dissertation Presented to The Academic Faculty by HeaWon Jun In Partial Fulfillment of the Requirements

More information

A Comparison of Methods of Estimating Subscale Scores for Mixed-Format Tests

A Comparison of Methods of Estimating Subscale Scores for Mixed-Format Tests A Comparison of Methods of Estimating Subscale Scores for Mixed-Format Tests David Shin Pearson Educational Measurement May 007 rr0701 Using assessment and research to promote learning Pearson Educational

More information

CHAPTER 3 RESEARCH METHODOLOGY

CHAPTER 3 RESEARCH METHODOLOGY CHAPTER 3 RESEARCH METHODOLOGY 3.1 Introduction 3.1 Methodology 3.1.1 Research Design 3.1. Research Framework Design 3.1.3 Research Instrument 3.1.4 Validity of Questionnaire 3.1.5 Statistical Measurement

More information

You must answer question 1.

You must answer question 1. Research Methods and Statistics Specialty Area Exam October 28, 2015 Part I: Statistics Committee: Richard Williams (Chair), Elizabeth McClintock, Sarah Mustillo You must answer question 1. 1. Suppose

More information

How to analyze correlated and longitudinal data?

How to analyze correlated and longitudinal data? How to analyze correlated and longitudinal data? Niloofar Ramezani, University of Northern Colorado, Greeley, Colorado ABSTRACT Longitudinal and correlated data are extensively used across disciplines

More information

Bayesian Logistic Regression Modelling via Markov Chain Monte Carlo Algorithm

Bayesian Logistic Regression Modelling via Markov Chain Monte Carlo Algorithm Journal of Social and Development Sciences Vol. 4, No. 4, pp. 93-97, Apr 203 (ISSN 222-52) Bayesian Logistic Regression Modelling via Markov Chain Monte Carlo Algorithm Henry De-Graft Acquah University

More information

1.4 - Linear Regression and MS Excel

1.4 - Linear Regression and MS Excel 1.4 - Linear Regression and MS Excel Regression is an analytic technique for determining the relationship between a dependent variable and an independent variable. When the two variables have a linear

More information

Combining Risks from Several Tumors Using Markov Chain Monte Carlo

Combining Risks from Several Tumors Using Markov Chain Monte Carlo University of Nebraska - Lincoln DigitalCommons@University of Nebraska - Lincoln U.S. Environmental Protection Agency Papers U.S. Environmental Protection Agency 2009 Combining Risks from Several Tumors

More information

Multilevel analysis quantifies variation in the experimental effect while optimizing power and preventing false positives

Multilevel analysis quantifies variation in the experimental effect while optimizing power and preventing false positives DOI 10.1186/s12868-015-0228-5 BMC Neuroscience RESEARCH ARTICLE Open Access Multilevel analysis quantifies variation in the experimental effect while optimizing power and preventing false positives Emmeke

More information

Using the Distractor Categories of Multiple-Choice Items to Improve IRT Linking

Using the Distractor Categories of Multiple-Choice Items to Improve IRT Linking Using the Distractor Categories of Multiple-Choice Items to Improve IRT Linking Jee Seon Kim University of Wisconsin, Madison Paper presented at 2006 NCME Annual Meeting San Francisco, CA Correspondence

More information

Bayesian Confidence Intervals for Means and Variances of Lognormal and Bivariate Lognormal Distributions

Bayesian Confidence Intervals for Means and Variances of Lognormal and Bivariate Lognormal Distributions Bayesian Confidence Intervals for Means and Variances of Lognormal and Bivariate Lognormal Distributions J. Harvey a,b, & A.J. van der Merwe b a Centre for Statistical Consultation Department of Statistics

More information

An Introduction to Bayesian Statistics

An Introduction to Bayesian Statistics An Introduction to Bayesian Statistics Robert Weiss Department of Biostatistics UCLA Fielding School of Public Health robweiss@ucla.edu Sept 2015 Robert Weiss (UCLA) An Introduction to Bayesian Statistics

More information

Many studies conducted in practice settings collect patient-level. Multilevel Modeling and Practice-Based Research

Many studies conducted in practice settings collect patient-level. Multilevel Modeling and Practice-Based Research Multilevel Modeling and Practice-Based Research L. Miriam Dickinson, PhD 1 Anirban Basu, PhD 2 1 Department of Family Medicine, University of Colorado Health Sciences Center, Aurora, Colo 2 Section of

More information

Investigating the Invariance of Person Parameter Estimates Based on Classical Test and Item Response Theories

Investigating the Invariance of Person Parameter Estimates Based on Classical Test and Item Response Theories Kamla-Raj 010 Int J Edu Sci, (): 107-113 (010) Investigating the Invariance of Person Parameter Estimates Based on Classical Test and Item Response Theories O.O. Adedoyin Department of Educational Foundations,

More information

Impact of Methods of Scoring Omitted Responses on Achievement Gaps

Impact of Methods of Scoring Omitted Responses on Achievement Gaps Impact of Methods of Scoring Omitted Responses on Achievement Gaps Dr. Nathaniel J. S. Brown (nathaniel.js.brown@bc.edu)! Educational Research, Evaluation, and Measurement, Boston College! Dr. Dubravka

More information

Introduction to Bayesian Analysis 1

Introduction to Bayesian Analysis 1 Biostats VHM 801/802 Courses Fall 2005, Atlantic Veterinary College, PEI Henrik Stryhn Introduction to Bayesian Analysis 1 Little known outside the statistical science, there exist two different approaches

More information

Adjusting for mode of administration effect in surveys using mailed questionnaire and telephone interview data

Adjusting for mode of administration effect in surveys using mailed questionnaire and telephone interview data Adjusting for mode of administration effect in surveys using mailed questionnaire and telephone interview data Karl Bang Christensen National Institute of Occupational Health, Denmark Helene Feveille National

More information

André Cyr and Alexander Davies

André Cyr and Alexander Davies Item Response Theory and Latent variable modeling for surveys with complex sampling design The case of the National Longitudinal Survey of Children and Youth in Canada Background André Cyr and Alexander

More information

Extending Rungie et al. s model of brand image stability to account for heterogeneity

Extending Rungie et al. s model of brand image stability to account for heterogeneity University of Wollongong Research Online Faculty of Commerce - Papers (Archive) Faculty of Business 2007 Extending Rungie et al. s model of brand image stability to account for heterogeneity Sara Dolnicar

More information

How do we combine two treatment arm trials with multiple arms trials in IPD metaanalysis? An Illustration with College Drinking Interventions

How do we combine two treatment arm trials with multiple arms trials in IPD metaanalysis? An Illustration with College Drinking Interventions 1/29 How do we combine two treatment arm trials with multiple arms trials in IPD metaanalysis? An Illustration with College Drinking Interventions David Huh, PhD 1, Eun-Young Mun, PhD 2, & David C. Atkins,

More information

The matching effect of intra-class correlation (ICC) on the estimation of contextual effect: A Bayesian approach of multilevel modeling

The matching effect of intra-class correlation (ICC) on the estimation of contextual effect: A Bayesian approach of multilevel modeling MODERN MODELING METHODS 2016, 2016/05/23-26 University of Connecticut, Storrs CT, USA The matching effect of intra-class correlation (ICC) on the estimation of contextual effect: A Bayesian approach of

More information

Item Parameter Recovery for the Two-Parameter Testlet Model with Different. Estimation Methods. Abstract

Item Parameter Recovery for the Two-Parameter Testlet Model with Different. Estimation Methods. Abstract Item Parameter Recovery for the Two-Parameter Testlet Model with Different Estimation Methods Yong Luo National Center for Assessment in Saudi Arabia Abstract The testlet model is a popular statistical

More information

Chapter 1: Exploring Data

Chapter 1: Exploring Data Chapter 1: Exploring Data Key Vocabulary:! individual! variable! frequency table! relative frequency table! distribution! pie chart! bar graph! two-way table! marginal distributions! conditional distributions!

More information

County-Level Small Area Estimation using the National Health Interview Survey (NHIS) and the Behavioral Risk Factor Surveillance System (BRFSS)

County-Level Small Area Estimation using the National Health Interview Survey (NHIS) and the Behavioral Risk Factor Surveillance System (BRFSS) County-Level Small Area Estimation using the National Health Interview Survey (NHIS) and the Behavioral Risk Factor Surveillance System (BRFSS) Van L. Parsons, Nathaniel Schenker Office of Research and

More information

The Effect of Extremes in Small Sample Size on Simple Mixed Models: A Comparison of Level-1 and Level-2 Size

The Effect of Extremes in Small Sample Size on Simple Mixed Models: A Comparison of Level-1 and Level-2 Size INSTITUTE FOR DEFENSE ANALYSES The Effect of Extremes in Small Sample Size on Simple Mixed Models: A Comparison of Level-1 and Level-2 Size Jane Pinelis, Project Leader February 26, 2018 Approved for public

More information

Ecological Statistics

Ecological Statistics A Primer of Ecological Statistics Second Edition Nicholas J. Gotelli University of Vermont Aaron M. Ellison Harvard Forest Sinauer Associates, Inc. Publishers Sunderland, Massachusetts U.S.A. Brief Contents

More information

Russian Journal of Agricultural and Socio-Economic Sciences, 3(15)

Russian Journal of Agricultural and Socio-Economic Sciences, 3(15) ON THE COMPARISON OF BAYESIAN INFORMATION CRITERION AND DRAPER S INFORMATION CRITERION IN SELECTION OF AN ASYMMETRIC PRICE RELATIONSHIP: BOOTSTRAP SIMULATION RESULTS Henry de-graft Acquah, Senior Lecturer

More information

Design and Analysis Plan Quantitative Synthesis of Federally-Funded Teen Pregnancy Prevention Programs HHS Contract #HHSP I 5/2/2016

Design and Analysis Plan Quantitative Synthesis of Federally-Funded Teen Pregnancy Prevention Programs HHS Contract #HHSP I 5/2/2016 Design and Analysis Plan Quantitative Synthesis of Federally-Funded Teen Pregnancy Prevention Programs HHS Contract #HHSP233201500069I 5/2/2016 Overview The goal of the meta-analysis is to assess the effects

More information

accuracy (see, e.g., Mislevy & Stocking, 1989; Qualls & Ansley, 1985; Yen, 1987). A general finding of this research is that MML and Bayesian

accuracy (see, e.g., Mislevy & Stocking, 1989; Qualls & Ansley, 1985; Yen, 1987). A general finding of this research is that MML and Bayesian Recovery of Marginal Maximum Likelihood Estimates in the Two-Parameter Logistic Response Model: An Evaluation of MULTILOG Clement A. Stone University of Pittsburgh Marginal maximum likelihood (MML) estimation

More information

Small-area estimation of mental illness prevalence for schools

Small-area estimation of mental illness prevalence for schools Small-area estimation of mental illness prevalence for schools Fan Li 1 Alan Zaslavsky 2 1 Department of Statistical Science Duke University 2 Department of Health Care Policy Harvard Medical School March

More information

Linking Mixed-Format Tests Using Multiple Choice Anchors. Michael E. Walker. Sooyeon Kim. ETS, Princeton, NJ

Linking Mixed-Format Tests Using Multiple Choice Anchors. Michael E. Walker. Sooyeon Kim. ETS, Princeton, NJ Linking Mixed-Format Tests Using Multiple Choice Anchors Michael E. Walker Sooyeon Kim ETS, Princeton, NJ Paper presented at the annual meeting of the American Educational Research Association (AERA) and

More information

Impact and adjustment of selection bias. in the assessment of measurement equivalence

Impact and adjustment of selection bias. in the assessment of measurement equivalence Impact and adjustment of selection bias in the assessment of measurement equivalence Thomas Klausch, Joop Hox,& Barry Schouten Working Paper, Utrecht, December 2012 Corresponding author: Thomas Klausch,

More information

UNIVERSITY OF FLORIDA 2010

UNIVERSITY OF FLORIDA 2010 COMPARISON OF LATENT GROWTH MODELS WITH DIFFERENT TIME CODING STRATEGIES IN THE PRESENCE OF INTER-INDIVIDUALLY VARYING TIME POINTS OF MEASUREMENT By BURAK AYDIN A THESIS PRESENTED TO THE GRADUATE SCHOOL

More information

educational assessment and educational measurement

educational assessment and educational measurement EDUCATIONAL ASSESSMENT AND EDUCATIONAL MEASUREMENT research line 5 educational assessment and educational measurement EDUCATIONAL ASSESSMENT AND EDUCATIONAL MEASUREMENT 98 1 Educational Assessment 100

More information

Examining the efficacy of the Theory of Planned Behavior (TPB) to understand pre-service teachers intention to use technology*

Examining the efficacy of the Theory of Planned Behavior (TPB) to understand pre-service teachers intention to use technology* Examining the efficacy of the Theory of Planned Behavior (TPB) to understand pre-service teachers intention to use technology* Timothy Teo & Chwee Beng Lee Nanyang Technology University Singapore This

More information

Paul Irwing, Manchester Business School

Paul Irwing, Manchester Business School Paul Irwing, Manchester Business School Factor analysis has been the prime statistical technique for the development of structural theories in social science, such as the hierarchical factor model of human

More information

The SAGE Encyclopedia of Educational Research, Measurement, and Evaluation Multivariate Analysis of Variance

The SAGE Encyclopedia of Educational Research, Measurement, and Evaluation Multivariate Analysis of Variance The SAGE Encyclopedia of Educational Research, Measurement, Multivariate Analysis of Variance Contributors: David W. Stockburger Edited by: Bruce B. Frey Book Title: Chapter Title: "Multivariate Analysis

More information

Mixed Effect Modeling. Mixed Effects Models. Synonyms. Definition. Description

Mixed Effect Modeling. Mixed Effects Models. Synonyms. Definition. Description ixed Effects odels 4089 ixed Effect odeling Hierarchical Linear odeling ixed Effects odels atthew P. Buman 1 and Eric B. Hekler 2 1 Exercise and Wellness Program, School of Nutrition and Health Promotion

More information

Why item response theory should be used for longitudinal questionnaire data analysis in medical research

Why item response theory should be used for longitudinal questionnaire data analysis in medical research Gorter et al. BMC Medical Research Methodology (2015) 15:55 DOI 10.1186/s12874-015-0050-x RESEARCH ARTICLE Why item response theory should be used for longitudinal questionnaire data analysis in medical

More information

Adaptive EAP Estimation of Ability

Adaptive EAP Estimation of Ability Adaptive EAP Estimation of Ability in a Microcomputer Environment R. Darrell Bock University of Chicago Robert J. Mislevy National Opinion Research Center Expected a posteriori (EAP) estimation of ability,

More information

Meta-analysis using HLM 1. Running head: META-ANALYSIS FOR SINGLE-CASE INTERVENTION DESIGNS

Meta-analysis using HLM 1. Running head: META-ANALYSIS FOR SINGLE-CASE INTERVENTION DESIGNS Meta-analysis using HLM 1 Running head: META-ANALYSIS FOR SINGLE-CASE INTERVENTION DESIGNS Comparing Two Meta-Analysis Approaches for Single Subject Design: Hierarchical Linear Model Perspective Rafa Kasim

More information

Wim Van den Noortgate; Paul De Boeck; Michel Meulders. Journal of Educational and Behavioral Statistics, Vol. 28, No. 4. (Winter, 2003), pp

Wim Van den Noortgate; Paul De Boeck; Michel Meulders. Journal of Educational and Behavioral Statistics, Vol. 28, No. 4. (Winter, 2003), pp Cross-Classification Multilevel Logistic Models in Psychometrics Wim Van den Noortgate; Paul De Boeck; Michel Meulders Journal of Educational and Behavioral Statistics, Vol. 28, No. 4. (Winter, 2003),

More information

In this chapter, we discuss the statistical methods used to test the viability

In this chapter, we discuss the statistical methods used to test the viability 5 Strategy for Measuring Constructs and Testing Relationships In this chapter, we discuss the statistical methods used to test the viability of our conceptual models as well as the methods used to test

More information

Nonparametric DIF. Bruno D. Zumbo and Petronilla M. Witarsa University of British Columbia

Nonparametric DIF. Bruno D. Zumbo and Petronilla M. Witarsa University of British Columbia Nonparametric DIF Nonparametric IRT Methodology For Detecting DIF In Moderate-To-Small Scale Measurement: Operating Characteristics And A Comparison With The Mantel Haenszel Bruno D. Zumbo and Petronilla

More information

How many speakers? How many tokens?:

How many speakers? How many tokens?: 1 NWAV 38- Ottawa, Canada 23/10/09 How many speakers? How many tokens?: A methodological contribution to the study of variation. Jorge Aguilar-Sánchez University of Wisconsin-La Crosse 2 Sample size in

More information

Section on Survey Research Methods JSM 2009

Section on Survey Research Methods JSM 2009 Missing Data and Complex Samples: The Impact of Listwise Deletion vs. Subpopulation Analysis on Statistical Bias and Hypothesis Test Results when Data are MCAR and MAR Bethany A. Bell, Jeffrey D. Kromrey

More information

Performance of Median and Least Squares Regression for Slightly Skewed Data

Performance of Median and Least Squares Regression for Slightly Skewed Data World Academy of Science, Engineering and Technology 9 Performance of Median and Least Squares Regression for Slightly Skewed Data Carolina Bancayrin - Baguio Abstract This paper presents the concept of

More information

Bayesian Analysis of Between-Group Differences in Variance Components in Hierarchical Generalized Linear Models

Bayesian Analysis of Between-Group Differences in Variance Components in Hierarchical Generalized Linear Models Bayesian Analysis of Between-Group Differences in Variance Components in Hierarchical Generalized Linear Models Brady T. West Michigan Program in Survey Methodology, Institute for Social Research, 46 Thompson

More information

Bias in regression coefficient estimates when assumptions for handling missing data are violated: a simulation study

Bias in regression coefficient estimates when assumptions for handling missing data are violated: a simulation study STATISTICAL METHODS Epidemiology Biostatistics and Public Health - 2016, Volume 13, Number 1 Bias in regression coefficient estimates when assumptions for handling missing data are violated: a simulation

More information

Selection of Linking Items

Selection of Linking Items Selection of Linking Items Subset of items that maximally reflect the scale information function Denote the scale information as Linear programming solver (in R, lp_solve 5.5) min(y) Subject to θ, θs,

More information

Exploring the Factors that Impact Injury Severity using Hierarchical Linear Modeling (HLM)

Exploring the Factors that Impact Injury Severity using Hierarchical Linear Modeling (HLM) Exploring the Factors that Impact Injury Severity using Hierarchical Linear Modeling (HLM) Introduction Injury Severity describes the severity of the injury to the person involved in the crash. Understanding

More information

Stepwise method Modern Model Selection Methods Quantile-Quantile plot and tests for normality

Stepwise method Modern Model Selection Methods Quantile-Quantile plot and tests for normality Week 9 Hour 3 Stepwise method Modern Model Selection Methods Quantile-Quantile plot and tests for normality Stat 302 Notes. Week 9, Hour 3, Page 1 / 39 Stepwise Now that we've introduced interactions,

More information

Measuring mathematics anxiety: Paper 2 - Constructing and validating the measure. Rob Cavanagh Len Sparrow Curtin University

Measuring mathematics anxiety: Paper 2 - Constructing and validating the measure. Rob Cavanagh Len Sparrow Curtin University Measuring mathematics anxiety: Paper 2 - Constructing and validating the measure Rob Cavanagh Len Sparrow Curtin University R.Cavanagh@curtin.edu.au Abstract The study sought to measure mathematics anxiety

More information

The Use of Piecewise Growth Models in Evaluations of Interventions. CSE Technical Report 477

The Use of Piecewise Growth Models in Evaluations of Interventions. CSE Technical Report 477 The Use of Piecewise Growth Models in Evaluations of Interventions CSE Technical Report 477 Michael Seltzer CRESST/University of California, Los Angeles Martin Svartberg Norwegian University of Science

More information

Simple Linear Regression the model, estimation and testing

Simple Linear Regression the model, estimation and testing Simple Linear Regression the model, estimation and testing Lecture No. 05 Example 1 A production manager has compared the dexterity test scores of five assembly-line employees with their hourly productivity.

More information

3 CONCEPTUAL FOUNDATIONS OF STATISTICS

3 CONCEPTUAL FOUNDATIONS OF STATISTICS 3 CONCEPTUAL FOUNDATIONS OF STATISTICS In this chapter, we examine the conceptual foundations of statistics. The goal is to give you an appreciation and conceptual understanding of some basic statistical

More information

Center for Advanced Studies in Measurement and Assessment. CASMA Research Report

Center for Advanced Studies in Measurement and Assessment. CASMA Research Report Center for Advanced Studies in Measurement and Assessment CASMA Research Report Number 39 Evaluation of Comparability of Scores and Passing Decisions for Different Item Pools of Computerized Adaptive Examinations

More information

Incorporating Within-Study Correlations in Multivariate Meta-analysis: Multilevel Versus Traditional Models

Incorporating Within-Study Correlations in Multivariate Meta-analysis: Multilevel Versus Traditional Models Incorporating Within-Study Correlations in Multivariate Meta-analysis: Multilevel Versus Traditional Models Alison J. O Mara and Herbert W. Marsh Department of Education, University of Oxford, UK Abstract

More information