Comprehensive Statistical Analysis of a Mathematics Placement Test
Robert J. Hall, Department of Educational Psychology, Texas A&M University, USA (bobhall@tamu.edu)
Eunju Jung, Department of Educational Psychology, Texas A&M University, USA (doduli@tamu.edu)
Michael S. Pilant, Department of Mathematics, Texas A&M University, USA (mpilant@tamu.edu)

Abstract: As part of an NSF Science, Technology, Engineering and Mathematics Talent Expansion Project (STEP) grant, a math placement exam (MPE) has been developed at Texas A&M University for the purpose of evaluating the pre-calculus mathematical skills of entering students. It consists of 33 multiple-choice questions and is taken online. A score of 22 or higher is required in order to register for Calculus I. Following admission to the University and prior to registering for classes, approximately 4,500 students take the placement exam each fall. To date, more than 15,000 students have taken the MPE. This paper focuses on the psychometric properties of the MPE based on analysis of test score results over a four-year period from 2008 to 2011. An item response theory (IRT) analysis has been performed. Various other statistical tests, such as a confirmatory factor analysis and the computation of Cronbach's alpha (α), have also been performed. The value for Cronbach's α (using the entire test sample) exceeds 0.9. This is comparable to high-stakes placement tests such as the SAT and the Advanced Placement tests. A detailed description of the statistical analyses performed on the MPE, as well as the results and their interpretation, is presented in this paper. Finally, a brief analysis of the degree to which student performance in Calculus I can be predicted using MPE scores is presented.

Background

Texas A&M University has the second largest engineering program in the United States, with over 8,000 undergraduate engineering majors.
Traditionally, students in this program take calculus during their freshman year, along with physics and other science, technology, engineering, and mathematics (STEM) courses. Some students are not sufficiently prepared and have difficulty passing their mathematics courses. In order to identify students with potential problems, a Math Placement Exam (MPE) was developed. Our goal was to develop a reliable, robust measure of the preparedness of incoming students for college-level calculus. Beginning in 2008, almost all entering freshmen were required to take one of two math placement exams, either for engineering calculus or for business math. More than 4,000 students took the exams in summer 2011 prior to entering Texas A&M, and 1,781 students were enrolled in Calculus I during the fall 2011 semester. As of January 2012, data (MPE scores and grades) for more than 15,000 students in the science and engineering programs have been collected. This study focuses on three research areas:

1. Internal consistency and robustness of the test instrument;
2. Number of latent variables present; and
3. Difficulty and discrimination of each test item.
Conceptual Design of the Mathematics Placement Exam (MPE)

The MPE consists of 33 multiple choice items covering the following areas: polynomials, functions, graphing, exponentials and logarithms, and trigonometric functions. Questions were designed by two veteran faculty members who are experienced in teaching both pre-calculus and calculus. Each problem has 15 variants constructed from a template with different parameters; consequently, the variants of a given problem are of equal difficulty. When an exam is created, a question is selected and one of its 15 variants is delivered online to the student. Questions are delivered in the same order in every exam. This ensures that each exam is of equal difficulty. After the student selects a response, another question is delivered until all questions are completed. Students have an opportunity to review questions and answers before submitting the test. Following submission, questions are graded (correct or incorrect), and scores are returned to students. There are 15^N different versions of the MPE, where N is the number of questions on the exam; consequently, students each receive an effectively unique set of questions on their version of the MPE. Based on cumulative performance data, a cutoff score of 22 was established to help ensure basic algebra and pre-calculus skills in Calculus I. Historically, more than 70% of students with placement scores of 22 or greater pass Calculus I. Passing is defined as receiving a grade of A, B, or C. Under current guidelines, a student must score 22 or higher on the MPE in order to enroll in Calculus I. Students who score below 22 may take the exam again (after waiting a month) or enroll in a summer program called the Personalized Pre-Calculus Program (PPP). Until fall 2011, students with scores below 22 could still self-enroll in Calculus I, but as of fall 2011, the registration system blocks registration in Calculus I if a student's MPE score is less than 22.
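The delivery scheme described above can be sketched as follows (a hypothetical illustration, not the actual exam software; the names are ours):

```python
import random

N_QUESTIONS = 33   # items on the MPE, delivered in a fixed order
N_VARIANTS = 15    # template-generated variants per item

def assemble_exam(seed=None):
    """Choose one of the 15 variants for each of the 33 questions.
    The per-question variant choices are what distinguish the
    15**33 possible versions of the exam."""
    rng = random.Random(seed)
    return [rng.randrange(N_VARIANTS) for _ in range(N_QUESTIONS)]
```

Because the question order is fixed and the variants of each question are designed to be of equal difficulty, any two assembled exams are intended to be equivalent in difficulty.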
Statistical Analysis

Reliability

Under classical test theory, reliability is defined as the proportion of the true score variance over the observed score variance (Raykov & Marcoulides, 2011). Practically, we can define reliability as the consistency of test scores across different administrations or different populations. Among the many indexes of reliability, Cronbach's α is the most widely used, since internal consistency among test items can be measured with only a single test administration. Cronbach's α is calculated by taking the mean of all possible split-half coefficients, which are computed using Rulon's method (Crocker & Algina, 1986). Given the entire data set (over 15,000 tests), the Cronbach's α coefficient value of the test items is 0.901, which indicates that the internal consistency among the items is very good. One important feature of the Cronbach's α analysis is the re-calculation of α when one question is removed. Removing the question that reduces α the least gives a reduced set of questions with the maximal α. This can be done repeatedly until one has a minimal number of questions with an α above a certain level. The results of this procedure are shown below in Figure 1.

Figure 1. Values of Cronbach's α Removing One Question at a Time from the MPE (n = 15,128)
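The α computation, and the item-removal procedure behind Figure 1, can be sketched as follows (an illustrative implementation of the standard coefficient-α formula; the 0/1 score matrix is assumed):

```python
import numpy as np

def cronbach_alpha(scores):
    """Cronbach's alpha for a score matrix (rows = examinees, columns = items):
    alpha = k/(k-1) * (1 - sum(item variances) / variance(total scores))."""
    scores = np.asarray(scores, dtype=float)
    k = scores.shape[1]
    item_vars = scores.var(axis=0, ddof=1).sum()
    total_var = scores.sum(axis=1).var(ddof=1)
    return (k / (k - 1)) * (1.0 - item_vars / total_var)

def alpha_if_item_deleted(scores):
    """Alpha recomputed with each item removed in turn, as plotted in Figure 1."""
    scores = np.asarray(scores, dtype=float)
    return [cronbach_alpha(np.delete(scores, j, axis=1))
            for j in range(scores.shape[1])]
```

Applying `alpha_if_item_deleted` repeatedly, and dropping the item whose removal hurts α least, reproduces the stepwise curve of Figure 1.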
In general, good internal consistency amongst items on a test is indicated when the Cronbach's α coefficient is greater than 0.8. The MPE has Cronbach's α values which exceed 0.8 for the combined year data (N = 15,128) and for data separated by year (Table 1). Therefore, we can say that the MPE has very good internal consistency.

Table 1. Values of Cronbach's α for the MPE by Year (columns: Year, Number of Students, Mean, Difficulty, Standard Deviation, Cronbach's α)

Construct Validity

Under the traditional confirmatory factor analysis (CFA) model, only continuous observed variables can be handled. Often, however, questions are scored dichotomously, correct or incorrect. Advances in CFA modeling allow researchers to test factor structures using measures made up of dichotomous items. The CFA model for categorical data does not use the item scores themselves, but depends on the assumption of an underlying, normally distributed variable behind each discrete item or instrument component (Raykov & Marcoulides, 2011). Confirmatory factor analysis has been widely used to test factor structures underlying tests or instruments with multiple items. For example, cognitive tests are usually designed to measure one or more theoretical latent constructs, such as math proficiency, reading comprehension, or problem-solving skills. The primary function of CFA models is to relate observed variables to unobserved latent constructs. CFA allows researchers to test hypotheses about a particular factor structure by examining both the dimensionality (i.e., the number of underlying factors) of the test and the pattern of relations between the items and factors (Brown, 2006). To evaluate the construct validity of the MPE, various categorical data analysis models were tested using the software package Mplus6 (Muthén & Muthén). Confirmatory factor analysis of categorical variables uses the concept of latent variables.
These variables are not directly observable but are assumed to give rise to the categorical responses. For the MPE, the observed variables are scores on the algebra test items, and the latent construct is thought to be higher-order algebra skills. Even though the observed scores of our data are dichotomous, the latent variables are thought to be continuous and assumed to be normally distributed. The estimation method used is weighted least squares. Fundamentally, the CFA model under the Mplus6 program is equivalent to the two parameter item response theory (2-P IRT) model. For this analysis, we chose to fit a two parameter normal-ogive model as opposed to a two parameter logistic model (2-PL). The normal-ogive model is mathematically more difficult to fit but provides a more robust, direct interpretation than does the logistic regression model. Mplus6 provides indicators of overall model fit and parameter estimates (i.e., difficulty and discrimination) for the two parameter IRT model. Ultimately, however, we would like to be able to fit the more restrictive one parameter model, as we are primarily interested in whether uni-dimensionality (i.e., one factor underlying the set of items) is plausible. We want to establish that the MPE can be characterized by a single latent higher-order algebra factor. We begin by fitting the two parameter, less restrictive model, and then proceed to examining fit using a one parameter model. The overall fit of the two parameter model can be evaluated by computing a chi-square statistic and accompanying fit indices, the root mean squared error of approximation (RMSEA) and the comparative fit index (CFI). The items of the MPE were fitted to the two parameter model across years and by year. Results are shown in Table 2.
Although the chi-square values were statistically significant (which would suggest that our data did not fit a two parameter model across years or within year), the MPE data samples were large, and large samples often result in significant chi-square values (Brown, 2006). In such cases, it is both necessary and appropriate to check other fit statistics, such as the CFI and RMSEA indices, before judging model fit. Conceptually, the comparative fit index (CFI) measures the distance of the proposed model from a bad model, one that does not posit any relationship amongst variables (i.e., covariances among all input indicators are fixed to zero). Hence, a larger CFI value, 0.95 or above, is deemed acceptable. For the MPE, the individual year data had CFIs of 0.977, 0.979, 0.982, and 0.980, respectively, and the combined year data was comparable. CFIs, therefore, offer a compelling argument for a two parameter model. The RMSEA measures the discrepancy between the model and the observed data; therefore, smaller values of the RMSEA are better. Typically, an RMSEA less than 0.05 is thought to represent a good fit between the proposed model and the observed data. For the MPE, the RMSEA value for the combined years was similarly small, while RMSEA values for years 2008, 2009, 2010, and 2011 were
0.027, 0.025, 0.024, and 0.025, respectively. All RMSEAs were small, indicating very good model fit. To summarize, the CFA analyses were interpreted to indicate a good fit for a two parameter (one latent variable) model whether data were combined into one large group or separated into year cohorts. To further characterize this model, we turned to item response theory (IRT).

Table 2. Uni-dimensionality Confirmatory Factor Analysis Results (Two Parameter Model) (columns: Year, N, χ², df, CFI, RMSEA)

Item Response Theory (IRT) Analysis

Item response theory is called a latent trait model since it uses a mathematical model that relates a theorized latent construct (or trait) to the observed item response using item parameters (Hambleton et al., 1991). In order to fit observed data to an IRT model, two assumptions should be met. First, the dimensionality of the items (i.e., the number of latent variables) should be confirmed. Second, the items should not be correlated after accounting for the latent factor (or factors). This assumption is called local independence. Based on the CFA analysis, we can say that the uni-dimensionality of the complete set of MPE items is supported. If there is a large modification index, suggesting correlation between unique factor scores under the CFA analysis, the local independence assumption is considered to be violated. In our analysis, there was no evidence of violation of the local independence assumption: review of the modification indices did not suggest adding any error covariances to reduce the overall χ². Therefore, the use of a uni-dimensional, single latent variable IRT model is justified. Often, cognitive test items are scored as dichotomous (i.e., correct or incorrect). The IRT model allows for analysis of dichotomous data by relating a latent trait to the probability of a correct response for each item. Basically, the dichotomous IRT model is an S-shaped curve.
The horizontal axis represents the values of the latent trait scaled to have a mean of zero, and the vertical axis represents the probability of making a correct response (see Figure 2).

Figure 2. Item Characteristic Curve: Dichotomous IRT Model
The IRT model mathematically places item difficulty and individual ability on the same continuum. For this reason, we can compare item difficulty and an individual's ability level directly. There are three generally accepted dichotomous IRT models. The first IRT model (one parameter) postulates a single item parameter, difficulty. This corresponds to equation (1) with c_i = 0 and the a_i the same for each question, and is closely related to the classical Rasch model (Rasch, 1961). In the context of a person's ability, the difficulty parameter implies that one needs more ability to endorse or pass a more difficult item. In Figure 2, the x-value corresponding to a 0.5 probability of passing an item indicates the difficulty parameter for that item. The second IRT model involves two parameters and is referred to as the 2-PL model. In this model, an attempt is made to fit two parameters to the observed data, item discrimination and difficulty. The discrimination parameter represents how well the item differentiates individuals according to their ability level; in other words, it is the slope of the item response curve at the location of the difficulty parameter. This corresponds to equation (1) with c_i = 0 for each question. The third IRT model is a three-parameter (3-PL) model that allows the parameter c_i to vary in addition to the difficulty and discrimination parameters. It is called a pseudo-guessing parameter and represents the probability of passing an item at a very low ability level. The three parameter IRT model can be written as:

P_i(θ) = c_i + (1 − c_i) · e^{a_i(θ − b_i)} / (1 + e^{a_i(θ − b_i)})    (1)

In equation (1), the parameter b_i represents the difficulty of the i-th item, a_i represents the discrimination of the i-th item, and c_i represents the pseudo-guessing parameter of the i-th item.
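Equation (1) translates directly into code; a small sketch (the function names are ours):

```python
import math

def p_correct_3pl(theta, a, b, c):
    """Probability of a correct response under the 3PL model, equation (1):
    c + (1 - c) * exp(a*(theta - b)) / (1 + exp(a*(theta - b)))."""
    return c + (1.0 - c) / (1.0 + math.exp(-a * (theta - b)))

def p_correct_2pl(theta, a, b):
    """The 2PL model is equation (1) with the pseudo-guessing parameter c = 0."""
    return p_correct_3pl(theta, a, b, 0.0)
```

Fixing a to a common value across items in the 2PL function recovers the one parameter (Rasch-type) model; at θ = b the 2PL probability is exactly 0.5, matching the reading of Figure 2.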
As the skill (latent trait) increases, the probability of a correct response goes to 1. As the skill decreases, the probability asymptotically approaches that of guessing; the effect of guessing is reduced as the value of the latent variable (skill) increases. In the case of the MPE, we chose not to fit a three parameter model. Conceptually, the pseudo-guessing parameter is problematic because it does not take into account differential option attractiveness; thus the random-guessing model's assumption is not reflected in the response data. Given potential problems with the pseudo-guessing parameter, de Ayala (2009) suggests that a two parameter model may provide a sufficiently reasonable representation of the data (p. 126). To handle random guessing, we based our analyses on protocols with scores of 7 or higher: on the 33-item MPE, with each question having 5 possible responses, random guessing could result in scores of 6 or below. Mplus6 was used to fit both one and two parameter IRT models to the MPE data. Data were fitted to both models as a whole (i.e., combining years) and by year. Mplus6 also allows us to fit the one and two parameter IRT models using either normal-ogive (inverse normal cumulative distribution) or logistic functions; in this analysis, we used the normal-ogive function. First, the two parameter model, which is equivalent to the confirmatory factor analysis model previously tested, was found to be feasible with a single dimension. Next, a one parameter IRT model was tested by fixing all the factor loadings to be the same. Table 3 presents the results of the one parameter IRT analyses of data combined across years and separated by year. The combined data (N = 15,128) has a good fit on both the CFI and RMSEA indices. However, for the two parameter model, the data by year fit is not as good: all the RMSEAs indicate good fit, while all the CFIs indicate adequate fit.
Conversely, the one parameter model showed acceptable fit statistics for the combined-year as well as for the by-year MPE data.

Table 3. Confirmatory Factor Analysis Results (i.e., the One Parameter IRT Model) (columns: Year, N, χ², df, CFI, RMSEA)
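For reference, the two fit indices used throughout can be computed from reported chi-square statistics with the standard formulas (a sketch; the baseline for the CFI is the zero-covariance model described earlier):

```python
import math

def rmsea(chi2, df, n):
    """Root mean squared error of approximation from a model chi-square:
    sqrt(max(chi2 - df, 0) / (df * (n - 1)))."""
    return math.sqrt(max(chi2 - df, 0.0) / (df * (n - 1)))

def cfi(chi2, df, chi2_base, df_base):
    """Comparative fit index: 1 minus the model's excess chi-square
    relative to the baseline model with all covariances fixed to zero."""
    d_model = max(chi2 - df, 0.0)
    d_base = max(d_model, chi2_base - df_base)
    return 1.0 if d_base == 0 else 1.0 - d_model / d_base
```

These formulas also show why large samples behave as described above: n enters the RMSEA denominator directly, so a significant chi-square in a sample of 15,000 can still yield a very small RMSEA.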
In summary, further analyses regarding the difficulty and discrimination parameters will follow depending upon need, but at this point we are confident that the MPE instrument is psychometrically sound, measuring a single latent variable sensitive to how well items differentiate individuals according to their ability level. This is true whether we look at the group as a whole or break the group into year-based cohorts.

Predicting Student Performance Using the MPE

Given that the outcome variable is categorical (i.e., A, B, C, D, F, or Pass/Fail), the Pearson correlation coefficient is not a good indicator of the relationship between grades and MPE scores; in the case of our data, for example, the resulting Pearson r² values remained low, ranging upward from 0.15. To better understand the relationship between MPE scores and grades, we switched to an odds-ratio, or probability, analysis. Using historical data, we can compute the frequency of students passing the course with MPE scores in a given range. This cumulative frequency distribution function can be modeled very accurately by a two parameter logistic function. Consequently, instead of using grades as the outcome variable, we compute the probability of passing with MPE scores in a given range. The r² value between the output of this model and the actual historical data is high. Using this model, we can identify an MPE score such that 70% of the students with this score or higher pass Calculus I. The MPE cutoff score of 22 originated from this analysis.

MPE Scoring Across Years

Using a logistic curve-fitting model, it was determined that a cutoff score of 22 for the 33-item MPE instrument could be used to indicate who would be successful in the beginning calculus class, Calculus I. The MPE instrument, psychometrically, does a good job of discriminating performance on one latent variable (measuring calculus readiness or pre-calculus ability) across item difficulty levels.
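The cutoff analysis above can be sketched as follows (illustrative only: the logistic parameters k and x0 are hypothetical placeholders, not the values fitted to the historical data):

```python
import math

def pass_probability(score, k, x0):
    """Two-parameter logistic model of the pass rate:
    P(pass) = 1 / (1 + exp(-k * (score - x0)))."""
    return 1.0 / (1.0 + math.exp(-k * (score - x0)))

def cutoff_for_target(k, x0, target=0.70):
    """Smallest integer MPE score whose modeled pass probability meets
    the target. Inverts the logistic: score = x0 - ln(1/target - 1) / k."""
    exact = x0 - math.log(1.0 / target - 1.0) / k
    return math.ceil(exact)
```

With hypothetical fitted values such as k = 0.3 and x0 = 19, the 70% target yields a cutoff of 22, illustrating the shape of the analysis from which the MPE cutoff originated.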
The instrument was designed to measure the latent variable, higher-order algebraic ability, and a cutoff score of 22 indicates a probability of about 0.7 that students majoring in engineering or the sciences will be successful in the gateway calculus class, Calculus I. The following analyses look at the relationship of student grades in Calculus I to measured ability on the Math Placement Exam. Table 4 provides descriptive statistics for the MPE by year for 15,160 students. Average scores for the MPE are similar across years, but there is a noticeable upward (creeping) trend in the scores. An ANOVA with Year as the independent variable and MPE score as the dependent variable produced a statistically significant main effect for year (F(3, 15156) = 32.44, p < .001). Subsequent post hoc testing (Tukey HSD and Bonferroni) indicated that the mean MPE score for 2008 was lower than the mean score for any other year. The mean MPE scores for 2009 and 2010 grouped together (no difference between means), as did the mean scores for 2010 and 2011. Overall, there is evidence that might be interpreted to suggest that over time MPE scores are increasing. What does this mean for using a firm cutoff point for the MPE? First, even though the cohort differences appear to be getting larger, the largest cohort mean difference is only 1.5 questions. Although the difference is statistically significant, it may not be meaningfully significant. Meaningful significance is addressed through the concept of explained or common variance, the effect size. Partial eta squared for the year main effect was 0.006, or 0.6 of one percent. Knowing a student's MPE score tells us essentially nothing about the student's cohort year. An artifact of large samples is small standard error values, making small mean differences statistically significant.

(Table 4 columns: Year, Mean, N, S.D.) Note. At the time of this analysis, grades for 32 additional students were paired with MPE scores, increasing the n from 15,128 to 15,160.

Table 4.
Descriptive Statistics for MPE by Year
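The effect-size arithmetic above can be reproduced from the reported F statistic alone, using the standard identity for partial eta squared:

```python
def partial_eta_squared(f_stat, df_effect, df_error):
    """Partial eta squared recovered from a reported F test:
    eta^2 = (F * df_effect) / (F * df_effect + df_error)."""
    num = f_stat * df_effect
    return num / (num + df_error)

# Year main effect reported in the text: F(3, 15156) = 32.44
year_effect = partial_eta_squared(32.44, 3, 15156)  # about 0.006, i.e. 0.6 of one percent
```

The tiny effect size alongside a highly significant F illustrates the point about large samples: with df_error above 15,000, even trivial mean differences clear the significance threshold.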
Despite the fact that the mean increases are small and not of major concern, there is an upward trend. The creep in scoring, however, makes sense given the evolution of the instrument. Initially, in 2008, students were asked to complete the MPE before beginning their freshman year. At that time, the MPE was in a development phase. Since nothing was at stake, students tended not to take the MPE more than one time. As MPE scoring has become tied to placement in freshman math classes, students are now more aware that low scores on the MPE will prevent them from taking Calculus I as entering freshmen. Since the highest score on repeated administrations of the MPE is the officially registered MPE value, we might expect some upward movement like that seen in the current data set. Students are also made aware, through academic counseling, that poor performance on the MPE reflects relative weakness in a skill area, complex algebra, which is related to poor performance in the four-course engineering math sequence. Students taking time to refresh their algebraic skills may improve their performance on the MPE, thus contributing to the slight elevation in overall performance observed in the data set.

Summary and Conclusions

The purpose of IRT and Rasch modeling is to provide a framework for evaluating how well assessments work and how well individual items on assessments work (Embretson & Reise, 2000). Typically they are used in conjunction with test development, not to test whether assessments are psychometrically sound following development. We built the MPE using subject-matter experts, then used Rasch and IRT modeling to validate the MPE, thus lending support to the product development process. By performing a CFA analysis of the MPE data, we found that we could model the MPE data using a uni-dimensional, single latent variable model.
From this information, we were justified in applying an item response theory analysis, which supported a two parameter model. RMSEA and CFI values confirmed the fit of the two parameter model. In addition, we looked at the internal consistency of the 33-item MPE and found Cronbach's α values of approximately 0.9. This was true both cumulatively and by year, indicating the high internal consistency of the MPE placement test both longitudinally and cumulatively. Finally, if the relationship between grades in Calculus I and MPE scores is expressed as the probability of passing Calculus I given an MPE score range, the result is an accurate prediction of success (retention rate), that is, of how students typically perform in Calculus I. In summary, results from these analyses, Cronbach's α, CFA, and IRT, attest to the psychometric soundness of the instrument for this sample of students. Moreover, the relationship between grades and MPE scores suggests that knowledge of performance on the MPE can be used to predict the probability that a student will experience difficulties in the first year engineering calculus sequence. We recognize, however, that what may be a good instrument for measuring the pre-calculus mathematical skills of students entering this particular university may not work well for measuring math readiness at other institutions due to differing student populations. Still, the MPE developed at Texas A&M University for a particular engineering mathematics course provides some measure of confidence that one can develop placement exams with similar statistical properties for other courses at other institutions.

References

Brown, T. A. (2006). Confirmatory factor analysis for applied research. New York, NY: The Guilford Press.

Crocker, L. & Algina, J. (1986). Introduction to classical and modern test theory. Fort Worth, TX: Harcourt College.

de Ayala, R. J. (2009). The theory and practice of item response theory. New York, NY: The Guilford Press.
Embretson, S. E. & Reise, S. P. (2000). Item response theory for psychologists. Mahwah, NJ: Lawrence Erlbaum Associates.

Hambleton, R. K., Swaminathan, H., & Rogers, H. J. (1991). Fundamentals of item response theory. Newbury Park, CA: Sage.
Muthén, L. K. & Muthén, B. O. Mplus User's Guide. Sixth Edition. Los Angeles, CA: Muthén & Muthén.

Rasch, G. (1961). On general laws and the meaning of measurement in psychology. In Proceedings of the Fourth Berkeley Symposium on Mathematical Statistics and Probability, IV. Berkeley, CA: University of California Press.

Raykov, T. & Marcoulides, G. A. (2011). Introduction to psychometric theory. New York, NY: Routledge.

Acknowledgements

Research supported in part by a grant from the National Science Foundation, NSF-DUE#. Any opinions, findings, and conclusions or recommendations expressed in this publication are those of the authors and do not necessarily reflect the views of the National Science Foundation.
More informationRunning head: NESTED FACTOR ANALYTIC MODEL COMPARISON 1. John M. Clark III. Pearson. Author Note
Running head: NESTED FACTOR ANALYTIC MODEL COMPARISON 1 Nested Factor Analytic Model Comparison as a Means to Detect Aberrant Response Patterns John M. Clark III Pearson Author Note John M. Clark III,
More informationTechniques for Explaining Item Response Theory to Stakeholder
Techniques for Explaining Item Response Theory to Stakeholder Kate DeRoche Antonio Olmos C.J. Mckinney Mental Health Center of Denver Presented on March 23, 2007 at the Eastern Evaluation Research Society
More informationKnown-Groups Validity 2017 FSSE Measurement Invariance
Known-Groups Validity 2017 FSSE Measurement Invariance A key assumption of any latent measure (any questionnaire trying to assess an unobservable construct) is that it functions equally across all different
More informationLikelihood Ratio Based Computerized Classification Testing. Nathan A. Thompson. Assessment Systems Corporation & University of Cincinnati.
Likelihood Ratio Based Computerized Classification Testing Nathan A. Thompson Assessment Systems Corporation & University of Cincinnati Shungwon Ro Kenexa Abstract An efficient method for making decisions
More informationPersonal Style Inventory Item Revision: Confirmatory Factor Analysis
Personal Style Inventory Item Revision: Confirmatory Factor Analysis This research was a team effort of Enzo Valenzi and myself. I m deeply grateful to Enzo for his years of statistical contributions to
More informationConnexion of Item Response Theory to Decision Making in Chess. Presented by Tamal Biswas Research Advised by Dr. Kenneth Regan
Connexion of Item Response Theory to Decision Making in Chess Presented by Tamal Biswas Research Advised by Dr. Kenneth Regan Acknowledgement A few Slides have been taken from the following presentation
More informationGENERALIZABILITY AND RELIABILITY: APPROACHES FOR THROUGH-COURSE ASSESSMENTS
GENERALIZABILITY AND RELIABILITY: APPROACHES FOR THROUGH-COURSE ASSESSMENTS Michael J. Kolen The University of Iowa March 2011 Commissioned by the Center for K 12 Assessment & Performance Management at
More informationThe Classification Accuracy of Measurement Decision Theory. Lawrence Rudner University of Maryland
Paper presented at the annual meeting of the National Council on Measurement in Education, Chicago, April 23-25, 2003 The Classification Accuracy of Measurement Decision Theory Lawrence Rudner University
More informationThe SAGE Encyclopedia of Educational Research, Measurement, and Evaluation Multivariate Analysis of Variance
The SAGE Encyclopedia of Educational Research, Measurement, Multivariate Analysis of Variance Contributors: David W. Stockburger Edited by: Bruce B. Frey Book Title: Chapter Title: "Multivariate Analysis
More informationChapter 1 Introduction. Measurement Theory. broadest sense and not, as it is sometimes used, as a proxy for deterministic models.
Ostini & Nering - Chapter 1 - Page 1 POLYTOMOUS ITEM RESPONSE THEORY MODELS Chapter 1 Introduction Measurement Theory Mathematical models have been found to be very useful tools in the process of human
More informationOn the Many Claims and Applications of the Latent Variable
On the Many Claims and Applications of the Latent Variable Science is an attempt to exploit this contact between our minds and the world, and science is also motivated by the limitations that result from
More information3 CONCEPTUAL FOUNDATIONS OF STATISTICS
3 CONCEPTUAL FOUNDATIONS OF STATISTICS In this chapter, we examine the conceptual foundations of statistics. The goal is to give you an appreciation and conceptual understanding of some basic statistical
More informationFundamental Concepts for Using Diagnostic Classification Models. Section #2 NCME 2016 Training Session. NCME 2016 Training Session: Section 2
Fundamental Concepts for Using Diagnostic Classification Models Section #2 NCME 2016 Training Session NCME 2016 Training Session: Section 2 Lecture Overview Nature of attributes What s in a name? Grain
More informationProceedings of the 2011 International Conference on Teaching, Learning and Change (c) International Association for Teaching and Learning (IATEL)
EVALUATION OF MATHEMATICS ACHIEVEMENT TEST: A COMPARISON BETWEEN CLASSICAL TEST THEORY (CTT)AND ITEM RESPONSE THEORY (IRT) Eluwa, O. Idowu 1, Akubuike N. Eluwa 2 and Bekom K. Abang 3 1& 3 Dept of Educational
More informationTesting the Multiple Intelligences Theory in Oman
Available online at www.sciencedirect.com ScienceDirect Procedia - Social and Behavioral Sciences 190 ( 2015 ) 106 112 2nd GLOBAL CONFERENCE on PSYCHOLOGY RESEARCHES, 28-29, November 2014 Testing the Multiple
More informationThe Development of Scales to Measure QISA s Three Guiding Principles of Student Aspirations Using the My Voice TM Survey
The Development of Scales to Measure QISA s Three Guiding Principles of Student Aspirations Using the My Voice TM Survey Matthew J. Bundick, Ph.D. Director of Research February 2011 The Development of
More informationUsing the Rasch Modeling for psychometrics examination of food security and acculturation surveys
Using the Rasch Modeling for psychometrics examination of food security and acculturation surveys Jill F. Kilanowski, PhD, APRN,CPNP Associate Professor Alpha Zeta & Mu Chi Acknowledgements Dr. Li Lin,
More informationFACTOR VALIDITY OF THE MERIDEN SCHOOL CLIMATE SURVEY- STUDENT VERSION (MSCS-SV)
FACTOR VALIDITY OF THE MERIDEN SCHOOL CLIMATE SURVEY- STUDENT VERSION (MSCS-SV) Nela Marinković 1,2, Ivana Zečević 2 & Siniša Subotić 3 2 Faculty of Philosophy, University of Banja Luka 3 University of
More informationInfluences of IRT Item Attributes on Angoff Rater Judgments
Influences of IRT Item Attributes on Angoff Rater Judgments Christian Jones, M.A. CPS Human Resource Services Greg Hurt!, Ph.D. CSUS, Sacramento Angoff Method Assemble a panel of subject matter experts
More informationExploratory Factor Analysis Student Anxiety Questionnaire on Statistics
Proceedings of Ahmad Dahlan International Conference on Mathematics and Mathematics Education Universitas Ahmad Dahlan, Yogyakarta, 13-14 October 2017 Exploratory Factor Analysis Student Anxiety Questionnaire
More informationConfirmatory Factor Analysis of the Procrastination Assessment Scale for Students
611456SGOXXX10.1177/2158244015611456SAGE OpenYockey and Kralowec research-article2015 Article Confirmatory Factor Analysis of the Procrastination Assessment Scale for Students SAGE Open October-December
More informationBefore we get started:
Before we get started: http://arievaluation.org/projects-3/ AEA 2018 R-Commander 1 Antonio Olmos Kai Schramm Priyalathta Govindasamy Antonio.Olmos@du.edu AntonioOlmos@aumhc.org AEA 2018 R-Commander 2 Plan
More informationMeasuring mathematics anxiety: Paper 2 - Constructing and validating the measure. Rob Cavanagh Len Sparrow Curtin University
Measuring mathematics anxiety: Paper 2 - Constructing and validating the measure Rob Cavanagh Len Sparrow Curtin University R.Cavanagh@curtin.edu.au Abstract The study sought to measure mathematics anxiety
More informationUsing Analytical and Psychometric Tools in Medium- and High-Stakes Environments
Using Analytical and Psychometric Tools in Medium- and High-Stakes Environments Greg Pope, Analytics and Psychometrics Manager 2008 Users Conference San Antonio Introduction and purpose of this session
More informationItem Analysis: Classical and Beyond
Item Analysis: Classical and Beyond SCROLLA Symposium Measurement Theory and Item Analysis Modified for EPE/EDP 711 by Kelly Bradley on January 8, 2013 Why is item analysis relevant? Item analysis provides
More informationNonparametric DIF. Bruno D. Zumbo and Petronilla M. Witarsa University of British Columbia
Nonparametric DIF Nonparametric IRT Methodology For Detecting DIF In Moderate-To-Small Scale Measurement: Operating Characteristics And A Comparison With The Mantel Haenszel Bruno D. Zumbo and Petronilla
More informationBy Hui Bian Office for Faculty Excellence
By Hui Bian Office for Faculty Excellence 1 Email: bianh@ecu.edu Phone: 328-5428 Location: 1001 Joyner Library, room 1006 Office hours: 8:00am-5:00pm, Monday-Friday 2 Educational tests and regular surveys
More informationDescribe what is meant by a placebo Contrast the double-blind procedure with the single-blind procedure Review the structure for organizing a memo
Business Statistics The following was provided by Dr. Suzanne Delaney, and is a comprehensive review of Business Statistics. The workshop instructor will provide relevant examples during the Skills Assessment
More informationPaul Irwing, Manchester Business School
Paul Irwing, Manchester Business School Factor analysis has been the prime statistical technique for the development of structural theories in social science, such as the hierarchical factor model of human
More informationGMAC. Scaling Item Difficulty Estimates from Nonequivalent Groups
GMAC Scaling Item Difficulty Estimates from Nonequivalent Groups Fanmin Guo, Lawrence Rudner, and Eileen Talento-Miller GMAC Research Reports RR-09-03 April 3, 2009 Abstract By placing item statistics
More informationModeling the Influential Factors of 8 th Grades Student s Mathematics Achievement in Malaysia by Using Structural Equation Modeling (SEM)
International Journal of Advances in Applied Sciences (IJAAS) Vol. 3, No. 4, December 2014, pp. 172~177 ISSN: 2252-8814 172 Modeling the Influential Factors of 8 th Grades Student s Mathematics Achievement
More informationASSESSING THE UNIDIMENSIONALITY, RELIABILITY, VALIDITY AND FITNESS OF INFLUENTIAL FACTORS OF 8 TH GRADES STUDENT S MATHEMATICS ACHIEVEMENT IN MALAYSIA
1 International Journal of Advance Research, IJOAR.org Volume 1, Issue 2, MAY 2013, Online: ASSESSING THE UNIDIMENSIONALITY, RELIABILITY, VALIDITY AND FITNESS OF INFLUENTIAL FACTORS OF 8 TH GRADES STUDENT
More informationStructural Validation of the 3 X 2 Achievement Goal Model
50 Educational Measurement and Evaluation Review (2012), Vol. 3, 50-59 2012 Philippine Educational Measurement and Evaluation Association Structural Validation of the 3 X 2 Achievement Goal Model Adonis
More informationScale Building with Confirmatory Factor Analysis
Scale Building with Confirmatory Factor Analysis Latent Trait Measurement and Structural Equation Models Lecture #7 February 27, 2013 PSYC 948: Lecture #7 Today s Class Scale building with confirmatory
More informationDifferential Item Functioning
Differential Item Functioning Lecture #11 ICPSR Item Response Theory Workshop Lecture #11: 1of 62 Lecture Overview Detection of Differential Item Functioning (DIF) Distinguish Bias from DIF Test vs. Item
More informationConstruct Invariance of the Survey of Knowledge of Internet Risk and Internet Behavior Knowledge Scale
University of Connecticut DigitalCommons@UConn NERA Conference Proceedings 2010 Northeastern Educational Research Association (NERA) Annual Conference Fall 10-20-2010 Construct Invariance of the Survey
More informationDoing Quantitative Research 26E02900, 6 ECTS Lecture 6: Structural Equations Modeling. Olli-Pekka Kauppila Daria Kautto
Doing Quantitative Research 26E02900, 6 ECTS Lecture 6: Structural Equations Modeling Olli-Pekka Kauppila Daria Kautto Session VI, September 20 2017 Learning objectives 1. Get familiar with the basic idea
More informationInstrument equivalence across ethnic groups. Antonio Olmos (MHCD) Susan R. Hutchinson (UNC)
Instrument equivalence across ethnic groups Antonio Olmos (MHCD) Susan R. Hutchinson (UNC) Overview Instrument Equivalence Measurement Invariance Invariance in Reliability Scores Factorial Invariance Item
More informationConstruct Validity of Mathematics Test Items Using the Rasch Model
Construct Validity of Mathematics Test Items Using the Rasch Model ALIYU, R.TAIWO Department of Guidance and Counselling (Measurement and Evaluation Units) Faculty of Education, Delta State University,
More informationSection 5. Field Test Analyses
Section 5. Field Test Analyses Following the receipt of the final scored file from Measurement Incorporated (MI), the field test analyses were completed. The analysis of the field test data can be broken
More informationPersonality Traits Effects on Job Satisfaction: The Role of Goal Commitment
Marshall University Marshall Digital Scholar Management Faculty Research Management, Marketing and MIS Fall 11-14-2009 Personality Traits Effects on Job Satisfaction: The Role of Goal Commitment Wai Kwan
More informationMantel-Haenszel Procedures for Detecting Differential Item Functioning
A Comparison of Logistic Regression and Mantel-Haenszel Procedures for Detecting Differential Item Functioning H. Jane Rogers, Teachers College, Columbia University Hariharan Swaminathan, University of
More informationA Comparison of Several Goodness-of-Fit Statistics
A Comparison of Several Goodness-of-Fit Statistics Robert L. McKinley The University of Toledo Craig N. Mills Educational Testing Service A study was conducted to evaluate four goodnessof-fit procedures
More informationBruno D. Zumbo, Ph.D. University of Northern British Columbia
Bruno Zumbo 1 The Effect of DIF and Impact on Classical Test Statistics: Undetected DIF and Impact, and the Reliability and Interpretability of Scores from a Language Proficiency Test Bruno D. Zumbo, Ph.D.
More informationPUBLIC KNOWLEDGE AND ATTITUDES SCALE CONSTRUCTION: DEVELOPMENT OF SHORT FORMS
PUBLIC KNOWLEDGE AND ATTITUDES SCALE CONSTRUCTION: DEVELOPMENT OF SHORT FORMS Prepared for: Robert K. Bell, Ph.D. National Science Foundation Division of Science Resources Studies 4201 Wilson Blvd. Arlington,
More informationComparability Study of Online and Paper and Pencil Tests Using Modified Internally and Externally Matched Criteria
Comparability Study of Online and Paper and Pencil Tests Using Modified Internally and Externally Matched Criteria Thakur Karkee Measurement Incorporated Dong-In Kim CTB/McGraw-Hill Kevin Fatica CTB/McGraw-Hill
More informationDevelopment, Standardization and Application of
American Journal of Educational Research, 2018, Vol. 6, No. 3, 238-257 Available online at http://pubs.sciepub.com/education/6/3/11 Science and Education Publishing DOI:10.12691/education-6-3-11 Development,
More informationModels in Educational Measurement
Models in Educational Measurement Jan-Eric Gustafsson Department of Education and Special Education University of Gothenburg Background Measurement in education and psychology has increasingly come to
More informationThe MHSIP: A Tale of Three Centers
The MHSIP: A Tale of Three Centers P. Antonio Olmos-Gallo, Ph.D. Kathryn DeRoche, M.A. Mental Health Center of Denver Richard Swanson, Ph.D., J.D. Aurora Research Institute John Mahalik, Ph.D., M.P.A.
More informationRunning head: CFA OF STICSA 1. Model-Based Factor Reliability and Replicability of the STICSA
Running head: CFA OF STICSA 1 Model-Based Factor Reliability and Replicability of the STICSA The State-Trait Inventory of Cognitive and Somatic Anxiety (STICSA; Ree et al., 2008) is a new measure of anxiety
More informationEmpirical Formula for Creating Error Bars for the Method of Paired Comparison
Empirical Formula for Creating Error Bars for the Method of Paired Comparison Ethan D. Montag Rochester Institute of Technology Munsell Color Science Laboratory Chester F. Carlson Center for Imaging Science
More informationImpact and adjustment of selection bias. in the assessment of measurement equivalence
Impact and adjustment of selection bias in the assessment of measurement equivalence Thomas Klausch, Joop Hox,& Barry Schouten Working Paper, Utrecht, December 2012 Corresponding author: Thomas Klausch,
More informationSurvey Sampling Weights and Item Response Parameter Estimation
Survey Sampling Weights and Item Response Parameter Estimation Spring 2014 Survey Methodology Simmons School of Education and Human Development Center on Research & Evaluation Paul Yovanoff, Ph.D. Department
More informationA Brief (very brief) Overview of Biostatistics. Jody Kreiman, PhD Bureau of Glottal Affairs
A Brief (very brief) Overview of Biostatistics Jody Kreiman, PhD Bureau of Glottal Affairs What We ll Cover Fundamentals of measurement Parametric versus nonparametric tests Descriptive versus inferential
More informationIntroduction to Multilevel Models for Longitudinal and Repeated Measures Data
Introduction to Multilevel Models for Longitudinal and Repeated Measures Data Today s Class: Features of longitudinal data Features of longitudinal models What can MLM do for you? What to expect in this
More informationTHE MANTEL-HAENSZEL METHOD FOR DETECTING DIFFERENTIAL ITEM FUNCTIONING IN DICHOTOMOUSLY SCORED ITEMS: A MULTILEVEL APPROACH
THE MANTEL-HAENSZEL METHOD FOR DETECTING DIFFERENTIAL ITEM FUNCTIONING IN DICHOTOMOUSLY SCORED ITEMS: A MULTILEVEL APPROACH By JANN MARIE WISE MACINNES A DISSERTATION PRESENTED TO THE GRADUATE SCHOOL OF
More informationThe effects of ordinal data on coefficient alpha
James Madison University JMU Scholarly Commons Masters Theses The Graduate School Spring 2015 The effects of ordinal data on coefficient alpha Kathryn E. Pinder James Madison University Follow this and
More informationMultifactor Confirmatory Factor Analysis
Multifactor Confirmatory Factor Analysis Latent Trait Measurement and Structural Equation Models Lecture #9 March 13, 2013 PSYC 948: Lecture #9 Today s Class Confirmatory Factor Analysis with more than
More informationThe Influence of Psychological Empowerment on Innovative Work Behavior among Academia in Malaysian Research Universities
DOI: 10.7763/IPEDR. 2014. V 78. 21 The Influence of Psychological Empowerment on Innovative Work Behavior among Academia in Malaysian Research Universities Azra Ayue Abdul Rahman 1, Siti Aisyah Panatik
More information11/18/2013. Correlational Research. Correlational Designs. Why Use a Correlational Design? CORRELATIONAL RESEARCH STUDIES
Correlational Research Correlational Designs Correlational research is used to describe the relationship between two or more naturally occurring variables. Is age related to political conservativism? Are
More informationCYRINUS B. ESSEN, IDAKA E. IDAKA AND MICHAEL A. METIBEMU. (Received 31, January 2017; Revision Accepted 13, April 2017)
DOI: http://dx.doi.org/10.4314/gjedr.v16i2.2 GLOBAL JOURNAL OF EDUCATIONAL RESEARCH VOL 16, 2017: 87-94 COPYRIGHT BACHUDO SCIENCE CO. LTD PRINTED IN NIGERIA. ISSN 1596-6224 www.globaljournalseries.com;
More informationA PRELIMINARY EXAMINATION OF THE ICAR PROGRESSIVE MATRICES TEST OF INTELLIGENCE
A PRELIMINARY EXAMINATION OF THE ICAR PROGRESSIVE MATRICES TEST OF INTELLIGENCE Jovana Jankovski 1,2, Ivana Zečević 2 & Siniša Subotić 3 2 Faculty of Philosophy, University of Banja Luka 3 University of
More informationA Brief Introduction to Bayesian Statistics
A Brief Introduction to Statistics David Kaplan Department of Educational Psychology Methods for Social Policy Research and, Washington, DC 2017 1 / 37 The Reverend Thomas Bayes, 1701 1761 2 / 37 Pierre-Simon
More informationValidating Measures of Self Control via Rasch Measurement. Jonathan Hasford Department of Marketing, University of Kentucky
Validating Measures of Self Control via Rasch Measurement Jonathan Hasford Department of Marketing, University of Kentucky Kelly D. Bradley Department of Educational Policy Studies & Evaluation, University
More informationEffects of the Number of Response Categories on Rating Scales
NUMBER OF RESPONSE CATEGORIES 1 Effects of the Number of Response Categories on Rating Scales Roundtable presented at the annual conference of the American Educational Research Association, Vancouver,
More informationDescription of components in tailored testing
Behavior Research Methods & Instrumentation 1977. Vol. 9 (2).153-157 Description of components in tailored testing WAYNE M. PATIENCE University ofmissouri, Columbia, Missouri 65201 The major purpose of
More informationCentre for Education Research and Policy
THE EFFECT OF SAMPLE SIZE ON ITEM PARAMETER ESTIMATION FOR THE PARTIAL CREDIT MODEL ABSTRACT Item Response Theory (IRT) models have been widely used to analyse test data and develop IRT-based tests. An
More informationAnalysis of Variance (ANOVA)
Research Methods and Ethics in Psychology Week 4 Analysis of Variance (ANOVA) One Way Independent Groups ANOVA Brief revision of some important concepts To introduce the concept of familywise error rate.
More informationStill important ideas
Readings: OpenStax - Chapters 1 13 & Appendix D & E (online) Plous Chapters 17 & 18 - Chapter 17: Social Influences - Chapter 18: Group Judgments and Decisions Still important ideas Contrast the measurement
More informationAdaptive Testing With the Multi-Unidimensional Pairwise Preference Model Stephen Stark University of South Florida
Adaptive Testing With the Multi-Unidimensional Pairwise Preference Model Stephen Stark University of South Florida and Oleksandr S. Chernyshenko University of Canterbury Presented at the New CAT Models
More informationEncoding of Elements and Relations of Object Arrangements by Young Children
Encoding of Elements and Relations of Object Arrangements by Young Children Leslee J. Martin (martin.1103@osu.edu) Department of Psychology & Center for Cognitive Science Ohio State University 216 Lazenby
More informationReferences. Embretson, S. E. & Reise, S. P. (2000). Item response theory for psychologists. Mahwah,
The Western Aphasia Battery (WAB) (Kertesz, 1982) is used to classify aphasia by classical type, measure overall severity, and measure change over time. Despite its near-ubiquitousness, it has significant
More informationlinking in educational measurement: Taking differential motivation into account 1
Selecting a data collection design for linking in educational measurement: Taking differential motivation into account 1 Abstract In educational measurement, multiple test forms are often constructed to
More informationAnumber of studies have shown that ignorance regarding fundamental measurement
10.1177/0013164406288165 Educational Graham / Congeneric and Psychological Reliability Measurement Congeneric and (Essentially) Tau-Equivalent Estimates of Score Reliability What They Are and How to Use
More information