Difficulty and Discrimination Parameters of Boston Naming Test Items in a Consecutive Clinical Series
Archives of Clinical Neuropsychology 26 (2011)

Otto Pedraza*, Bonnie C. Sachs, Tanis J. Ferman, Beth K. Rush, John A. Lucas
Department of Psychiatry and Psychology, Mayo Clinic, Jacksonville, FL, USA

*Corresponding author at: Department of Psychiatry and Psychology, Mayo Clinic, Jacksonville, FL 32224, USA. E-mail address: otto.pedraza@mayo.edu (O. Pedraza).

Accepted 21 April 2011

Abstract

The Boston Naming Test is one of the most widely used neuropsychological instruments; yet, there has been limited use of modern psychometric methods to investigate its properties at the item level. The current study used item response theory to examine each item's difficulty and discrimination properties, as well as the test's measurement precision across the range of naming ability. Participants included 300 consecutive referrals to the outpatient neuropsychology service at Mayo Clinic in Florida. Results showed that successive items do not necessarily reflect a monotonic increase in psychometric difficulty, some items are inadequate to distinguish individuals at various levels of naming ability, multiple items provide redundant psychometric information, and measurement precision is greatest for persons within a low-average range of ability. These findings may be used to develop short forms, improve reliability in future test versions by replacing psychometrically poor items, and analyze profiles of intra-individual variability.

Keywords: Boston Naming Test; Item response theory; Item difficulty; Item discriminability

Introduction

The Boston Naming Test (BNT) (Kaplan, Goodglass, & Weintraub, 1983) is the most frequently used instrument for the assessment of visual naming ability (Rabin, Barr, & Burton, 2005). Its validity and reliability are well established and reviewed elsewhere (Strauss, Sherman, & Spreen, 2006).
© The Author. Published by Oxford University Press. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com. doi: /arclin/acr042. Advance Access publication on 18 May 2011.

Briefly, internal consistency for the 60-item version ranges from r = .78 to .96 across studies. Test-retest stability in cognitively normal adults varies as a function of time interval and sample composition, but generally ranges from r = .59 to .92. Moreover, the BNT correlates highly (r = .76 to .86) with other naming tests, such as the Visual Naming Test from the Multilingual Aphasia Examination.

Although the psychometric properties of the BNT have been established at the global test level, few studies have used modern psychometric methods to evaluate the BNT at the item level (some studies have considered item characteristics at a descriptive level, e.g., Tombaugh & Hubley, 1997). This information can be helpful to develop new short forms, improve test reliability by replacing psychometrically poor items, analyze error patterns or profiles of intra-individual variability, or take into account regional or cultural influences on individual item responses. For instance, Graves, Bezeau, Fogarty, and Blair (2004) used a one-parameter (Rasch) model to analyze the difficulty of BNT items and develop a new short form. Items were excluded from the short form if they were too easy, failed to fit the Rasch model, or had poor loadings on the first component of a principal components analysis.

Item response theory (IRT) is a state-of-the-art measurement approach that uses examinees' item responses to simultaneously estimate each person's underlying (latent) ability and the characteristics of the test items used to measure that ability (Embretson & Reise, 2000; Hambleton & Swaminathan, 1985; Hambleton, Swaminathan, & Rogers, 1991). In this framework, a person's ability level is considered a function of the pattern of unique item responses as well as the parametric properties of the test items. It thus becomes possible to estimate an item's discrimination (a), or the degree to which the item distinguishes persons with higher ability from those with lower ability, and its difficulty (b), the point on the ability scale at which a person has a 50% chance of responding correctly to the item. Models that estimate both item discrimination and difficulty parameters are well suited for the investigation of cognitive tests and abilities (Teresi, 2006).

In IRT, item characteristic curves (ICCs) trace the probability of a correct item response as a function of the underlying ability construct and can be thought of as the regression of an item score on the person's latent ability. Item difficulty is depicted by the location along the x-coordinate where the probability of a correct response for a binary item is 50%, and item discrimination is represented by the slope of the trace line at that location. For instance, Fig. 1 depicts a theoretical test item with a difficulty parameter equal to zero. In this case, a person with average ability has a 50% chance of responding correctly to the item. In contrast, Fig. 2 depicts a theoretical item with a negative difficulty parameter. Because a lesser degree of latent ability is required to obtain a 50% chance of responding correctly, the item in Fig. 2 is considered less difficult than that in Fig. 1. Note also the differences in the discrimination parameters between the two items. The steeper slope (i.e., higher discrimination) for the item in Fig. 2 indicates that it is better at distinguishing persons within a very narrow range of ability. When item discrimination is zero, every person has an equal probability of providing a correct response. In this case, the ICC is flat and the item should be flagged for deletion or replacement from the pool of test items. An advantage of IRT over classical test theory methods is that reliability is not constrained to a single coefficient, but instead can be measured continuously over the entire ability spectrum.
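A standard way to formalize the a and b parameters described above is the two-parameter logistic (2PL) model. The short sketch below illustrates the behavior of an ICC under that model; the parameter values are hypothetical illustrations, not estimates from this study.

```python
import math

def icc_2pl(theta, a, b):
    """Two-parameter logistic ICC: probability of a correct response
    for a person with latent ability theta on an item with
    discrimination a and difficulty b."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

# An item with difficulty b = 0: a person of average ability (theta = 0)
# has exactly a 50% chance of responding correctly.
print(icc_2pl(0.0, a=2.0, b=0.0))

# A steeper (more discriminating) item separates nearby ability levels
# more sharply around its difficulty point than a flatter one does.
print(icc_2pl(0.5, a=3.0, b=0.0) - icc_2pl(-0.5, a=3.0, b=0.0))
print(icc_2pl(0.5, a=1.0, b=0.0) - icc_2pl(-0.5, a=1.0, b=0.0))
```

Note that the 50%-probability point always sits at theta = b regardless of a, which is exactly why difficulty is read off the x-axis of the ICC and discrimination off its slope.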
Reliability in IRT is equivalent to the concept of information and is inversely related to the standard error of measurement (Embretson & Reise, 2000). Item, and hence test, information is maximized by higher discrimination parameters and an adequate match between item difficulty and a person's ability level. A further attractive property of IRT models is that item information can be summed to yield a global test information function, which represents the degree of precision for the test at each level of the latent ability.

Recently, Pedraza and colleagues (2009) used an IRT approach to evaluate the differential response pattern for BNT items in cognitively normal Caucasian and African American adults. Results showed that successive BNT items do not necessarily reflect an increase in psychometric difficulty, many items do not discriminate persons with low versus high naming ability, and a subset of items demonstrates comparable difficulty or discrimination properties, suggesting that these items may be psychometrically redundant. In addition, the BNT showed the greatest measurement precision for individuals with mild naming difficulty. The current study extends the work of Pedraza and colleagues (2009) by investigating the item-level properties of the BNT in a prospective series of adult patients with a broad range of naming ability.

Fig. 1. Theoretical item with discrimination (a) = 2.0 and difficulty (b) = 0.

Fig. 2. Theoretical item with discrimination (a) = 3.0 and a negative difficulty (b).

Method

Participants

Study participants included 300 consecutive referrals to the outpatient clinical neuropsychology service at Mayo Clinic in Florida. Patients were referred predominantly by the Departments of Neurology and Neurosurgery (65%) and Internal Medicine and its subspecialties (19%). Approximately half of the patients were referred for dementia evaluations, with the remainder including epilepsy, normal pressure hydrocephalus, depression, poststroke status, and other medical and neurologic conditions. All patients were evaluated between July 2009 and January. Only those patients whose primary language was English were considered for inclusion. All data were obtained in full compliance with a study protocol approved by the Mayo Clinic Institutional Review Board.

Materials

The BNT was administered in ascending order, beginning with item 1 and proceeding through item 60. Items were scored as correct or incorrect following standardized instructions (Kaplan et al., 1983). For the purposes of the current investigation, the BNT total raw score represents the sum of all correct items regardless of basal or discontinuation rules. A separate score using basal and discontinuation rules was recorded for the purposes of the clinical examination and will not be considered in this study.

Statistical Analyses

A fundamental assumption in IRT is that the set of test items should measure a single dimension or construct. The dimensionality of the BNT was evaluated using multiple approaches. First, internal consistency was examined using Cronbach's alpha coefficient. Although adequate internal consistency (i.e., alpha > 0.70) does not preclude the presence of multiple constructs, it represents a necessary but insufficient component of unidimensionality and is considered in that context (Gardner, 1995; Schmitt, 1996). Second, an exploratory factor analysis was performed using unweighted least squares extraction, followed by confirmatory factor analysis (CFA) in LISREL (Jöreskog & Sörbom, 1997, 2006) on the tetrachoric covariance matrix using an asymptotic distribution-free (ADF) estimator.
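The internal-consistency check mentioned above is easy to illustrate from first principles. The sketch below computes Cronbach's alpha directly from a person-by-item matrix of 0/1 scores; the tiny response matrix is invented for the example and is not BNT data.

```python
def cronbach_alpha(scores):
    """Cronbach's alpha for a person-by-item matrix of 0/1 item scores:
    alpha = k/(k-1) * (1 - sum of item variances / variance of totals)."""
    k = len(scores[0])              # number of items
    def var(xs):                    # sample variance (n - 1 denominator)
        m = sum(xs) / len(xs)
        return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)
    item_vars = [var([row[j] for row in scores]) for j in range(k)]
    totals = [sum(row) for row in scores]
    return (k / (k - 1)) * (1.0 - sum(item_vars) / var(totals))

# Perfectly consistent items (every item produces the same response
# pattern across persons) yield alpha = 1.0.
print(cronbach_alpha([[1, 1], [0, 0], [1, 1], [0, 0]]))
```

As the surrounding text notes, a high alpha is necessary but not sufficient evidence of unidimensionality, which is why the factor-analytic and DIMTEST checks follow.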
A limitation of ADF estimators, however, is that substantially large sample sizes are necessary to generate admissible solutions (Boomsma & Hoogland, 2001). Non-admissible solutions can result from parameter estimates failing to converge after multiple iterations or from negative variance estimates due to sampling fluctuations. Given our sample size, as well as our prior experience resulting in non-admissible solutions (Pedraza et al., 2009), robust maximum-likelihood estimation was also considered. The asymptotic covariance matrix was generated using PRELIS 2.0. Model fit was evaluated with the comparative fit index (CFI; values > 0.90 indicate better fit) and the root-mean-square error of approximation (RMSEA; values < 0.10 indicate better fit), as well as the Satorra-Bentler scaled chi-square statistic for the robust model (Satorra & Bentler, 1988). Lastly, unidimensionality was evaluated further using DIMTEST 2.0, a non-parametric conditional covariance-based test (Nandakumar & Stout, 1993; Stout, 1987; Stout, Froelich, & Gao, 2001).

Item difficulty and discriminability parameters, standard errors, and summary statistics were obtained using marginal maximum-likelihood estimation in MULTILOG (Thissen, 2003). The characteristic curves for each item were plotted for visual inspection, and the overall test information was calculated to measure reliability across the range of naming ability.
Table 1. Demographic characteristics and BNT data for 300 patients

                  Mean    SD    Range
Age                             22-92
Education
Sex (% men)       53.0
BNT total score                 22-60

Note: BNT = Boston Naming Test.

Fig. 3. Mean percent correct item responses on the BNT.

Results

Demographic characteristics and mean BNT data are presented in Table 1. Participants ranged in age from 22 to 92 years, and the majority were Caucasian (>95%). BNT scores were significantly correlated with age (r = -.21, p < .001) and years of education (r = .28, p < .001), but not with sex (r = -.10, p = .10). As expected, internal consistency was high (alpha = 0.91). Exploratory factor analysis revealed a 5.3:1 ratio between the first and second eigenvalues. A single-factor CFA using ADF estimators returned non-admissible solutions, but the use of robust maximum-likelihood estimation yielded a well-fitting single-factor model (CFI = 0.97; RMSEA = ; Satorra-Bentler scaled chi-square = , p < .001). A two-factor model did not result in improved fit. Moreover, the result from DIMTEST (T-statistic = 0.99, p = .16) was consistent with the prior dimensionality assessments. Altogether, these findings suggest that the BNT was sufficiently unidimensional to proceed with IRT modeling.

BNT total scores ranged from 22 to 60. As shown in Fig. 3, participants provided 100% correct responses to four items (BNT item numbers denoted in parentheses): bed (1), tree (2), toothbrush (10), and hanger (15). Protractor (59) had the fewest correct responses (19%). The graph in Fig. 3 also illustrates multiple dips, or points at which there is a prominent decline in the percent of correct responses for consecutive items. For example, 92% of participants responded correctly to wreath (28) and 88% responded correctly to harmonica (30), but only 63% responded correctly to beaver (29). Similarly, 81% responded correctly to asparagus (49), yet only 41% responded correctly to the following item, compass (50).
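Dips of the kind just described can be flagged automatically from raw response data. The sketch below computes the classical percent-correct statistic per item and flags items that fall well below both neighbors; the response matrix and the 10-point margin are hypothetical choices for illustration.

```python
def percent_correct(responses):
    """Proportion of correct responses per item, given a list of
    per-person 0/1 response vectors (hypothetical data)."""
    n = len(responses)
    k = len(responses[0])
    return [sum(person[j] for person in responses) / n for j in range(k)]

def find_dips(p, margin=0.10):
    """Flag item positions (1-based) whose percent correct falls at
    least `margin` below BOTH neighbors: the 'dips' described above,
    e.g. beaver (29) between wreath (28) and harmonica (30)."""
    return [i + 1 for i in range(1, len(p) - 1)
            if p[i] <= p[i - 1] - margin and p[i] <= p[i + 1] - margin]

# The wreath/beaver/harmonica pattern reported in the text:
print(find_dips([0.92, 0.63, 0.88]))
```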
Table 2 presents the IRT item discrimination and difficulty parameters. As expected, there was no variance associated with the four items answered correctly by all participants. The standard errors for items with highly skewed response patterns (e.g., scissors, broom) could not be defined under maximum-likelihood estimation. Protractor (59) had a negative, near-zero discrimination parameter, suggesting that it was a poor item yielding minimal-to-no psychometric information for the IRT model.
Table 2. Item discrimination and difficulty parameters for the BNT

For each item, the table reports discrimination (a) with its standard error and difficulty (b) with its standard error. Items, in administration order: 1. Bed, 2. Tree, 3. Pencil, 4. House, 5. Whistle, 6. Scissors, 7. Comb, 8. Flower, 9. Saw, 10. Toothbrush, 11. Helicopter, 12. Broom, 13. Octopus, 14. Mushroom, 15. Hanger, 16. Wheelchair, 17. Camel, 18. Mask, 19. Pretzel, 20. Bench, 21. Racquet, 22. Snail, 23. Volcano, 24. Seahorse, 25. Dart, 26. Canoe, 27. Globe, 28. Wreath, 29. Beaver, 30. Harmonica, 31. Rhinoceros, 32. Acorn, 33. Igloo, 34. Stilts, 35. Dominoes, 36. Cactus, 37. Escalator, 38. Harp, 39. Hammock, 40. Knocker, 41. Pelican, 42. Stethoscope, 43. Pyramid, 44. Muzzle, 45. Unicorn, 46. Funnel, 47. Accordion, 48. Noose, 49. Asparagus, 50. Compass, 51. Latch, 52. Tripod, 53. Scroll, 54. Tongs, 55. Sphinx, 56. Yoke, 57. Trellis, 58. Palette, 59. Protractor, 60. Abacus.

Note: BNT = Boston Naming Test.
Fig. 4. Matrix of ICCs for the BNT (Note: ICCs not available for items 1, 2, 10, and 15).

Among the remaining items, comb (7) showed the highest magnitude of discrimination, followed by racquet (21), saw (9), canoe (26), and wheelchair (16). The least discriminating items were flower (8), scissors (6), latch (51), yoke (56), and trellis (57). These findings are more clearly visualized in Fig. 4, where the items with the highest degree of discrimination show the steepest slopes, and those with the lowest discrimination have relatively flat slopes.
In terms of difficulty, abacus (60) exhibited the highest parameter and was followed by compass (50), yoke (56), palette (58), and sphinx (55). As noted earlier, although 81% of participants responded incorrectly to protractor (59), its IRT difficulty parameter could not be properly estimated because the likelihood of responding correctly was nearly equal for any individual along the ability spectrum. Besides the four items to which all participants responded correctly, the next five easiest items were flower (8), scissors (6), broom (12), camel (17), and house (4). Several items had difficulty parameters that suggested a notable discrepancy from their ordered placement on the test. For instance, acorn (32) was the 19th easiest item and harp (38) the 22nd easiest item. In contrast, octopus (13) was the 36th easiest item and seahorse (24) the 48th easiest item. These results highlight the lack of a monotonic increase in psychometric difficulty among successive items.
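Discrepancies between administration order and estimated difficulty, like those just described for acorn and octopus, can be surfaced mechanically by ranking items on their b estimates. The b values below are invented for illustration, not the published estimates.

```python
def difficulty_ranks(b_params):
    """ranks[i] = easiness rank (1 = easiest) of the item administered
    at position i, given IRT difficulty estimates b_params."""
    order = sorted(range(len(b_params)), key=lambda i: b_params[i])
    ranks = [0] * len(b_params)
    for rank, idx in enumerate(order, start=1):
        ranks[idx] = rank
    return ranks

def order_discrepancies(b_params, threshold=2):
    """Administration positions (1-based) whose difficulty rank differs
    from the position itself by at least `threshold`."""
    return [i + 1 for i, r in enumerate(difficulty_ranks(b_params))
            if abs(r - (i + 1)) >= threshold]

# A perfectly ordered test would yield ranks [1, 2, 3, ...] and no
# discrepancies; out-of-order items show up immediately.
print(difficulty_ranks([-2.0, 0.5, -1.0, 1.0]))
```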
Figure 5 displays the global test information curve. The BNT provided the most information (reliability) for individuals in the low-average range of naming ability, or approximately -1.0 standardized units. Measurement error increased considerably when assessing individuals with at least a high-average degree of naming ability.

Discussion

The present study explores the item-level psychometric properties of the BNT in a clinical outpatient sample and suggests the following. First, each successive BNT item does not necessarily confer a stepwise increase in psychometric difficulty. Easier items generally do group together in the first half of the test and harder items in the second half, but there is marked variability in difficulty levels within smaller clusters of successive items. Second, some BNT items do not discriminate well between individuals at close levels of naming ability, and a few items are simply inadequate to make such distinctions.
Fig. 5. Test information and standard error curves for the BNT.

For instance, scissors, flower, and protractor do not discriminate well within any range of naming ability, and this lack of discrimination is independent of their difficulty level. Third, a subset of items exhibits a comparable degree of difficulty or discrimination, which suggests that these items provide redundant psychometric information. For instance, there is a high degree of redundancy between the following item pairs: octopus and asparagus, racquet and canoe, wreath and harp, and igloo and volcano. It can be expected that excluding an item from each of these pairs would result in negligible psychometric loss. This may be helpful for the future derivation of shorter naming tasks using BNT items without a loss of discrimination characteristics. And fourth, the BNT yields the highest degree of measurement precision near the low-average to mildly impaired range of naming ability (i.e., between -1.0 and -1.5 standardized units). Measurement precision remains acceptable in the moderate range of impairment, but declines markedly in the high-to-above-average ability range, likely due to the test's measurement ceiling. In practical terms, this suggests that the BNT is most precise for adults who present to an outpatient clinical practice with an early or mild naming deficit. These findings in a neurologic and medical outpatient sample are consistent with those reported by Pedraza and colleagues (2009) among cognitively normal older adults. This detailed item-level psychometric information may be useful to supplement the test's total score and more clearly delineate the nature of a patient's naming deficit.
A briefer short form could be created empirically for a clinical trial by selecting highly discriminating items located at equidistant intervals along the entire range of difficulty and retaining only those without differential item functioning. For instance, a brief 10-item form could include the following items, ordered from easiest to hardest: saw, comb, mushroom, racquet, harmonica, pyramid, seahorse, beaver, sphinx, and abacus. These items demonstrate relatively equidistant difficulty parameters, relatively high discriminability, and no differential item functioning with respect to Caucasian versus African American adults. These data could also be used to construct alternate test forms in which items have equivalent difficulty and discrimination, which may be helpful in rehabilitation or other settings requiring repeated evaluations. Moreover, examining a person's pattern of BNT item responses as part of a forensic examination could yield a symptom validity index if the person makes a disproportionate number of errors on psychometrically easy items but responds correctly to difficult items.

A few limitations are worth noting. First, although a key advantage of IRT over classical test theory is that item parameters are invariant across populations, this property holds only when the range of the sampled ability is maximized. Participants in this study obtained BNT total scores ranging from 22 to 60, and only 10 participants had scores between 22 and 29. Thus, the findings may not generalize to patients with acute, language-dominant hemisphere stroke or advanced semantic dementia, who may be expected to make a substantially greater number of errors on the BNT. Also, the clinical sample in this study has limited ethnic minority representation, and our past findings from cognitively normal adults suggest that slight differences in item parameters exist between ethnic groups (Pedraza et al., 2009).
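The short-form strategy described above, picking highly discriminating items at roughly equidistant difficulty intervals, can be sketched as a greedy selection rule. This is a hypothetical illustration of the idea only, not the procedure the authors used, and the item names and (a, b) values below are invented for the example.

```python
def pick_short_form(items, n_items):
    """Greedy sketch of empirical short-form selection: choose one item
    per target difficulty, with targets equally spaced across the
    observed difficulty range; near-ties go to the more discriminating
    item. `items` maps name -> (a, b); all values are hypothetical."""
    bs = [b for (_, b) in items.values()]
    lo, hi = min(bs), max(bs)
    step = (hi - lo) / (n_items - 1)
    pool = dict(items)
    chosen = []
    for i in range(n_items):
        target = lo + i * step
        # nearest remaining item by difficulty; higher a breaks ties
        name = min(pool, key=lambda k: (abs(pool[k][1] - target), -pool[k][0]))
        chosen.append(name)
        del pool[name]
    return chosen

demo = {"saw": (2.5, -2.0), "racquet": (2.6, -1.0), "seahorse": (2.0, 0.0),
        "sphinx": (1.8, 1.0), "abacus": (1.5, 2.0)}
print(pick_short_form(demo, 3))
```

A differential-item-functioning filter, as the text recommends, would simply remove flagged items from the pool before selection.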
Although these results demonstrate a lack of incremental or monotonic difficulty among ordered items, the extent to which this factor may contribute to variation in total test scores under standard basal and discontinuation criteria is unknown. It seems reasonable to assume, however, that such an effect may be negligible because normative values (e.g., MOANS/MOAANS age-scaled scores; Heaton T-scores) generally comprise a range of raw scores rather than a single score. Lastly, it bears noting that these results do not negate the utility of error pattern analyses as originally intended by the BNT authors. In summary, these results offer additional information regarding the psychometric properties of the BNT that may be useful in clinical practice, research, and future test development or refinement.
Funding

This work was supported by the National Institutes of Health (NS to O.P.).

Conflict of Interest

None declared.

Acknowledgements

We would like to thank Dan Mungas, Ph.D., for helpful comments on an earlier portion of the manuscript. We also extend our gratitude to our wonderful team of psychometrists: Diana Achem, Cameron Griffin, Ashley Marshall, Jill McBride, Wendy Mercer, and Sonya Prescott.

References

Boomsma, A., & Hoogland, J. J. (2001). The robustness of LISREL modeling revisited. In R. Cudeck, S. du Toit, & D. Sörbom (Eds.), Structural equation models: Present and future. Lincolnwood, IL: Scientific Software International.

Embretson, S. E., & Reise, S. P. (2000). Item response theory for psychologists. Mahwah, NJ: Lawrence Erlbaum Associates.

Gardner, P. L. (1995). Measuring attitudes to science: Unidimensionality and internal consistency revisited. Research in Science Education, 25(3).

Graves, R. E., Bezeau, S. C., Fogarty, J., & Blair, R. (2004). Boston Naming Test short forms: A comparison of previous forms with new item response theory based forms. Journal of Clinical and Experimental Neuropsychology, 26(7).

Hambleton, R. K., & Swaminathan, H. (1985). Item response theory: Principles and applications. Boston: Kluwer-Nijhoff Publishing.

Hambleton, R. K., Swaminathan, H., & Rogers, H. J. (1991). Fundamentals of item response theory. Newbury Park, CA: Sage Publications.

Jöreskog, K. G., & Sörbom, D. (1997). LISREL 8: User's reference guide (2nd ed.). Chicago, IL: Scientific Software International.

Jöreskog, K. G., & Sörbom, D. (2006). LISREL. Chicago, IL: Scientific Software International.

Kaplan, E., Goodglass, H., & Weintraub, S. (1983). The Boston Naming Test. Philadelphia: Lea & Febiger.

Nandakumar, R., & Stout, W. (1993). Refinements of Stout's procedure for assessing latent trait unidimensionality. Journal of Educational Statistics, 18.

Pedraza, O., Graff-Radford, N. R., Smith, G. E., Ivnik, R. J., Willis, F. B., Petersen, R. C., et al. (2009). Differential item functioning of the Boston Naming Test in cognitively normal African American and Caucasian older adults. Journal of the International Neuropsychological Society, 15(5).

Rabin, L. A., Barr, W. B., & Burton, L. A. (2005). Assessment practices of clinical neuropsychologists in the United States and Canada: A survey of INS, NAN, and APA Division 40 members. Archives of Clinical Neuropsychology, 20(1).

Satorra, A., & Bentler, P. M. (1988). Scaling corrections for chi-square statistics in covariance structure analysis. In American Statistical Association 1988 proceedings of the business and economics section. Alexandria, VA: American Statistical Association.

Schmitt, N. (1996). Uses and abuses of coefficient alpha. Psychological Assessment, 8(4).

Stout, W. (1987). A nonparametric approach for assessing latent trait unidimensionality. Psychometrika, 52(4).

Stout, W., Froelich, A., & Gao, F. (2001). Using resampling methods to produce an improved DIMTEST procedure. In A. Boomsma, M. A. J. van Duijn, & T. A. B. Snijders (Eds.), Essays on item response theory. New York: Springer-Verlag.

Strauss, E., Sherman, E. M. S., & Spreen, O. (2006). A compendium of neuropsychological tests: Administration, norms, and commentary (3rd ed.). New York: Oxford University Press.

Teresi, J. A. (2006). Different approaches to differential item functioning in health applications: Advantages, disadvantages and some neglected topics. Medical Care, 44(11 Suppl. 3), S152-S170.

Thissen, D. (2003). MULTILOG 7.0: Multiple, categorical item analysis and test scoring using item response theory. Chicago: Scientific Software International.

Tombaugh, T. N., & Hubley, A. M. (1997). The 60-item Boston Naming Test: Norms for cognitively intact adults aged 25 to 88 years. Journal of Clinical and Experimental Neuropsychology, 19(6).
More informationResearch and Evaluation Methodology Program, School of Human Development and Organizational Studies in Education, University of Florida
Vol. 2 (1), pp. 22-39, Jan, 2015 http://www.ijate.net e-issn: 2148-7456 IJATE A Comparison of Logistic Regression Models for Dif Detection in Polytomous Items: The Effect of Small Sample Sizes and Non-Normality
More informationInfluences of IRT Item Attributes on Angoff Rater Judgments
Influences of IRT Item Attributes on Angoff Rater Judgments Christian Jones, M.A. CPS Human Resource Services Greg Hurt!, Ph.D. CSUS, Sacramento Angoff Method Assemble a panel of subject matter experts
More informationGENERALIZABILITY AND RELIABILITY: APPROACHES FOR THROUGH-COURSE ASSESSMENTS
GENERALIZABILITY AND RELIABILITY: APPROACHES FOR THROUGH-COURSE ASSESSMENTS Michael J. Kolen The University of Iowa March 2011 Commissioned by the Center for K 12 Assessment & Performance Management at
More informationCenter for Advanced Studies in Measurement and Assessment. CASMA Research Report
Center for Advanced Studies in Measurement and Assessment CASMA Research Report Number 39 Evaluation of Comparability of Scores and Passing Decisions for Different Item Pools of Computerized Adaptive Examinations
More informationThe Patient-Reported Outcomes Measurement Information
ORIGINAL ARTICLE Practical Issues in the Application of Item Response Theory A Demonstration Using Items From the Pediatric Quality of Life Inventory (PedsQL) 4.0 Generic Core Scales Cheryl D. Hill, PhD,*
More informationTechniques for Explaining Item Response Theory to Stakeholder
Techniques for Explaining Item Response Theory to Stakeholder Kate DeRoche Antonio Olmos C.J. Mckinney Mental Health Center of Denver Presented on March 23, 2007 at the Eastern Evaluation Research Society
More informationUsing Analytical and Psychometric Tools in Medium- and High-Stakes Environments
Using Analytical and Psychometric Tools in Medium- and High-Stakes Environments Greg Pope, Analytics and Psychometrics Manager 2008 Users Conference San Antonio Introduction and purpose of this session
More informationvalidscale: A Stata module to validate subjective measurement scales using Classical Test Theory
: A Stata module to validate subjective measurement scales using Classical Test Theory Bastien Perrot, Emmanuelle Bataille, Jean-Benoit Hardouin UMR INSERM U1246 - SPHERE "methods in Patient-centered outcomes
More informationA Comparison of Several Goodness-of-Fit Statistics
A Comparison of Several Goodness-of-Fit Statistics Robert L. McKinley The University of Toledo Craig N. Mills Educational Testing Service A study was conducted to evaluate four goodnessof-fit procedures
More informationChapter 9. Youth Counseling Impact Scale (YCIS)
Chapter 9 Youth Counseling Impact Scale (YCIS) Background Purpose The Youth Counseling Impact Scale (YCIS) is a measure of perceived effectiveness of a specific counseling session. In general, measures
More informationConfirmatory Factor Analysis of Preschool Child Behavior Checklist (CBCL) (1.5 5 yrs.) among Canadian children
Confirmatory Factor Analysis of Preschool Child Behavior Checklist (CBCL) (1.5 5 yrs.) among Canadian children Dr. KAMALPREET RAKHRA MD MPH PhD(Candidate) No conflict of interest Child Behavioural Check
More informationThe Modification of Dichotomous and Polytomous Item Response Theory to Structural Equation Modeling Analysis
Canadian Social Science Vol. 8, No. 5, 2012, pp. 71-78 DOI:10.3968/j.css.1923669720120805.1148 ISSN 1712-8056[Print] ISSN 1923-6697[Online] www.cscanada.net www.cscanada.org The Modification of Dichotomous
More informationMultidimensional Modeling of Learning Progression-based Vertical Scales 1
Multidimensional Modeling of Learning Progression-based Vertical Scales 1 Nina Deng deng.nina@measuredprogress.org Louis Roussos roussos.louis@measuredprogress.org Lee LaFond leelafond74@gmail.com 1 This
More informationDuring the past century, mathematics
An Evaluation of Mathematics Competitions Using Item Response Theory Jim Gleason During the past century, mathematics competitions have become part of the landscape in mathematics education. The first
More informationExploratory Factor Analysis Student Anxiety Questionnaire on Statistics
Proceedings of Ahmad Dahlan International Conference on Mathematics and Mathematics Education Universitas Ahmad Dahlan, Yogyakarta, 13-14 October 2017 Exploratory Factor Analysis Student Anxiety Questionnaire
More informationAnalysis of the Reliability and Validity of an Edgenuity Algebra I Quiz
Analysis of the Reliability and Validity of an Edgenuity Algebra I Quiz This study presents the steps Edgenuity uses to evaluate the reliability and validity of its quizzes, topic tests, and cumulative
More informationDoes factor indeterminacy matter in multi-dimensional item response theory?
ABSTRACT Paper 957-2017 Does factor indeterminacy matter in multi-dimensional item response theory? Chong Ho Yu, Ph.D., Azusa Pacific University This paper aims to illustrate proper applications of multi-dimensional
More informationUsing the Rasch Modeling for psychometrics examination of food security and acculturation surveys
Using the Rasch Modeling for psychometrics examination of food security and acculturation surveys Jill F. Kilanowski, PhD, APRN,CPNP Associate Professor Alpha Zeta & Mu Chi Acknowledgements Dr. Li Lin,
More informationRunning head: CFA OF TDI AND STICSA 1. p Factor or Negative Emotionality? Joint CFA of Internalizing Symptomology
Running head: CFA OF TDI AND STICSA 1 p Factor or Negative Emotionality? Joint CFA of Internalizing Symptomology Caspi et al. (2014) reported that CFA results supported a general psychopathology factor,
More informationDescription of components in tailored testing
Behavior Research Methods & Instrumentation 1977. Vol. 9 (2).153-157 Description of components in tailored testing WAYNE M. PATIENCE University ofmissouri, Columbia, Missouri 65201 The major purpose of
More informationSurvey Sampling Weights and Item Response Parameter Estimation
Survey Sampling Weights and Item Response Parameter Estimation Spring 2014 Survey Methodology Simmons School of Education and Human Development Center on Research & Evaluation Paul Yovanoff, Ph.D. Department
More informationDevelopment, Standardization and Application of
American Journal of Educational Research, 2018, Vol. 6, No. 3, 238-257 Available online at http://pubs.sciepub.com/education/6/3/11 Science and Education Publishing DOI:10.12691/education-6-3-11 Development,
More informationUSE OF DIFFERENTIAL ITEM FUNCTIONING (DIF) ANALYSIS FOR BIAS ANALYSIS IN TEST CONSTRUCTION
USE OF DIFFERENTIAL ITEM FUNCTIONING (DIF) ANALYSIS FOR BIAS ANALYSIS IN TEST CONSTRUCTION Iweka Fidelis (Ph.D) Department of Educational Psychology, Guidance and Counselling, University of Port Harcourt,
More informationREPORT. Technical Report: Item Characteristics. Jessica Masters
August 2010 REPORT Diagnostic Geometry Assessment Project Technical Report: Item Characteristics Jessica Masters Technology and Assessment Study Collaborative Lynch School of Education Boston College Chestnut
More informationAnalyzing Teacher Professional Standards as Latent Factors of Assessment Data: The Case of Teacher Test-English in Saudi Arabia
Analyzing Teacher Professional Standards as Latent Factors of Assessment Data: The Case of Teacher Test-English in Saudi Arabia 1 Introduction The Teacher Test-English (TT-E) is administered by the NCA
More informationThe Functional Outcome Questionnaire- Aphasia (FOQ-A) is a conceptually-driven
Introduction The Functional Outcome Questionnaire- Aphasia (FOQ-A) is a conceptually-driven outcome measure that was developed to address the growing need for an ecologically valid functional communication
More informationUnit 1 Exploring and Understanding Data
Unit 1 Exploring and Understanding Data Area Principle Bar Chart Boxplot Conditional Distribution Dotplot Empirical Rule Five Number Summary Frequency Distribution Frequency Polygon Histogram Interquartile
More informationComparability Study of Online and Paper and Pencil Tests Using Modified Internally and Externally Matched Criteria
Comparability Study of Online and Paper and Pencil Tests Using Modified Internally and Externally Matched Criteria Thakur Karkee Measurement Incorporated Dong-In Kim CTB/McGraw-Hill Kevin Fatica CTB/McGraw-Hill
More informationMantel-Haenszel Procedures for Detecting Differential Item Functioning
A Comparison of Logistic Regression and Mantel-Haenszel Procedures for Detecting Differential Item Functioning H. Jane Rogers, Teachers College, Columbia University Hariharan Swaminathan, University of
More informationAlternative Methods for Assessing the Fit of Structural Equation Models in Developmental Research
Alternative Methods for Assessing the Fit of Structural Equation Models in Developmental Research Michael T. Willoughby, B.S. & Patrick J. Curran, Ph.D. Duke University Abstract Structural Equation Modeling
More informationConstruct Invariance of the Survey of Knowledge of Internet Risk and Internet Behavior Knowledge Scale
University of Connecticut DigitalCommons@UConn NERA Conference Proceedings 2010 Northeastern Educational Research Association (NERA) Annual Conference Fall 10-20-2010 Construct Invariance of the Survey
More informationConfirmatory Factor Analysis of the Group Environment Questionnaire With an Intercollegiate Sample
JOURNAL OF SPORT & EXERCISE PSYCHOLOGY, 19%. 18,49-63 O 1996 Human Kinetics Publishers, Inc. Confirmatory Factor Analysis of the Group Environment Questionnaire With an Intercollegiate Sample Fuzhong Li
More informationINTERPRETING IRT PARAMETERS: PUTTING PSYCHOLOGICAL MEAT ON THE PSYCHOMETRIC BONE
The University of British Columbia Edgeworth Laboratory for Quantitative Educational & Behavioural Science INTERPRETING IRT PARAMETERS: PUTTING PSYCHOLOGICAL MEAT ON THE PSYCHOMETRIC BONE Anita M. Hubley,
More informationMEASURING MIDDLE GRADES STUDENTS UNDERSTANDING OF FORCE AND MOTION CONCEPTS: INSIGHTS INTO THE STRUCTURE OF STUDENT IDEAS
MEASURING MIDDLE GRADES STUDENTS UNDERSTANDING OF FORCE AND MOTION CONCEPTS: INSIGHTS INTO THE STRUCTURE OF STUDENT IDEAS The purpose of this study was to create an instrument that measures middle grades
More informationBruno D. Zumbo, Ph.D. University of Northern British Columbia
Bruno Zumbo 1 The Effect of DIF and Impact on Classical Test Statistics: Undetected DIF and Impact, and the Reliability and Interpretability of Scores from a Language Proficiency Test Bruno D. Zumbo, Ph.D.
More informationA Comparison of Pseudo-Bayesian and Joint Maximum Likelihood Procedures for Estimating Item Parameters in the Three-Parameter IRT Model
A Comparison of Pseudo-Bayesian and Joint Maximum Likelihood Procedures for Estimating Item Parameters in the Three-Parameter IRT Model Gary Skaggs Fairfax County, Virginia Public Schools José Stevenson
More informationInvestigating the Reliability of Classroom Observation Protocols: The Case of PLATO. M. Ken Cor Stanford University School of Education.
The Reliability of PLATO Running Head: THE RELIABILTY OF PLATO Investigating the Reliability of Classroom Observation Protocols: The Case of PLATO M. Ken Cor Stanford University School of Education April,
More informationBipolar items for the measurement of personal optimism instead of unipolar items
Psychological Test and Assessment Modeling, Volume 53, 2011 (4), 399-413 Bipolar items for the measurement of personal optimism instead of unipolar items Karl Schweizer 1, Wolfgang Rauch 2 & Andreas Gold
More informationNaming Test of the Neuropsychological Assessment Battery: Convergent and Discriminant Validity
Archives of Clinical Neuropsychology 24 (2009) 575 583 Naming Test of the Neuropsychological Assessment Battery: Convergent and Discriminant Validity Brian P. Yochim*, Katherine D. Kane, Anne E. Mueller
More informationInternational Conference on Humanities and Social Science (HSS 2016)
International Conference on Humanities and Social Science (HSS 2016) The Chinese Version of WOrk-reLated Flow Inventory (WOLF): An Examination of Reliability and Validity Yi-yu CHEN1, a, Xiao-tong YU2,
More informationlinking in educational measurement: Taking differential motivation into account 1
Selecting a data collection design for linking in educational measurement: Taking differential motivation into account 1 Abstract In educational measurement, multiple test forms are often constructed to
More informationA Modified CATSIB Procedure for Detecting Differential Item Function. on Computer-Based Tests. Johnson Ching-hong Li 1. Mark J. Gierl 1.
Running Head: A MODIFIED CATSIB PROCEDURE FOR DETECTING DIF ITEMS 1 A Modified CATSIB Procedure for Detecting Differential Item Function on Computer-Based Tests Johnson Ching-hong Li 1 Mark J. Gierl 1
More informationRapidly-administered short forms of the Wechsler Adult Intelligence Scale 3rd edition
Archives of Clinical Neuropsychology 22 (2007) 917 924 Abstract Rapidly-administered short forms of the Wechsler Adult Intelligence Scale 3rd edition Alison J. Donnell a, Neil Pliskin a, James Holdnack
More informationThe MHSIP: A Tale of Three Centers
The MHSIP: A Tale of Three Centers P. Antonio Olmos-Gallo, Ph.D. Kathryn DeRoche, M.A. Mental Health Center of Denver Richard Swanson, Ph.D., J.D. Aurora Research Institute John Mahalik, Ph.D., M.P.A.
More informationRunning head: CPPS REVIEW 1
Running head: CPPS REVIEW 1 Please use the following citation when referencing this work: McGill, R. J. (2013). Test review: Children s Psychological Processing Scale (CPPS). Journal of Psychoeducational
More informationPresented By: Yip, C.K., OT, PhD. School of Medical and Health Sciences, Tung Wah College
Presented By: Yip, C.K., OT, PhD. School of Medical and Health Sciences, Tung Wah College Background of problem in assessment for elderly Key feature of CCAS Structural Framework of CCAS Methodology Result
More informationJason L. Meyers. Ahmet Turhan. Steven J. Fitzpatrick. Pearson. Paper presented at the annual meeting of the
Performance of Ability Estimation Methods for Writing Assessments under Conditio ns of Multidime nsionality Jason L. Meyers Ahmet Turhan Steven J. Fitzpatrick Pearson Paper presented at the annual meeting
More informationElderly Norms for the Hopkins Verbal Learning Test-Revised*
The Clinical Neuropsychologist -//-$., Vol., No., pp. - Swets & Zeitlinger Elderly Norms for the Hopkins Verbal Learning Test-Revised* Rodney D. Vanderploeg, John A. Schinka, Tatyana Jones, Brent J. Small,
More informationProceedings of the 2011 International Conference on Teaching, Learning and Change (c) International Association for Teaching and Learning (IATEL)
EVALUATION OF MATHEMATICS ACHIEVEMENT TEST: A COMPARISON BETWEEN CLASSICAL TEST THEORY (CTT)AND ITEM RESPONSE THEORY (IRT) Eluwa, O. Idowu 1, Akubuike N. Eluwa 2 and Bekom K. Abang 3 1& 3 Dept of Educational
More informationNonparametric DIF. Bruno D. Zumbo and Petronilla M. Witarsa University of British Columbia
Nonparametric DIF Nonparametric IRT Methodology For Detecting DIF In Moderate-To-Small Scale Measurement: Operating Characteristics And A Comparison With The Mantel Haenszel Bruno D. Zumbo and Petronilla
More informationConfirmatory Factor Analysis and Item Response Theory: Two Approaches for Exploring Measurement Invariance
Psychological Bulletin 1993, Vol. 114, No. 3, 552-566 Copyright 1993 by the American Psychological Association, Inc 0033-2909/93/S3.00 Confirmatory Factor Analysis and Item Response Theory: Two Approaches
More informationORIGINAL CONTRIBUTION. Detecting Dementia With the Mini-Mental State Examination in Highly Educated Individuals
ORIGINAL CONTRIBUTION Detecting Dementia With the Mini-Mental State Examination in Highly Educated Individuals Sid E. O Bryant, PhD; Joy D. Humphreys, MA; Glenn E. Smith, PhD; Robert J. Ivnik, PhD; Neill
More informationParallel Forms for Diagnostic Purpose
Paper presented at AERA, 2010 Parallel Forms for Diagnostic Purpose Fang Chen Xinrui Wang UNCG, USA May, 2010 INTRODUCTION With the advancement of validity discussions, the measurement field is pushing
More informationTest review. Comprehensive Trail Making Test (CTMT) By Cecil R. Reynolds. Austin, Texas: PRO-ED, Inc., Test description
Archives of Clinical Neuropsychology 19 (2004) 703 708 Test review Comprehensive Trail Making Test (CTMT) By Cecil R. Reynolds. Austin, Texas: PRO-ED, Inc., 2002 1. Test description The Trail Making Test
More informationDifferential Item Functioning
Differential Item Functioning Lecture #11 ICPSR Item Response Theory Workshop Lecture #11: 1of 62 Lecture Overview Detection of Differential Item Functioning (DIF) Distinguish Bias from DIF Test vs. Item
More informationExamining the Validity and Fairness of a State Standards-Based Assessment of English-Language Arts for Deaf or Hard of Hearing Students
Examining the Validity and Fairness of a State Standards-Based Assessment of English-Language Arts for Deaf or Hard of Hearing Students Jonathan Steinberg Frederick Cline Guangming Ling Linda Cook Namrata
More informationThe Ego Identity Process Questionnaire: Factor Structure, Reliability, and Convergent Validity in Dutch-Speaking Late. Adolescents
33 2 The Ego Identity Process Questionnaire: Factor Structure, Reliability, and Convergent Validity in Dutch-Speaking Late Adolescents Koen Luyckx, Luc Goossens, Wim Beyers, & Bart Soenens (2006). Journal
More informationThe Psychometric Properties of Dispositional Flow Scale-2 in Internet Gaming
Curr Psychol (2009) 28:194 201 DOI 10.1007/s12144-009-9058-x The Psychometric Properties of Dispositional Flow Scale-2 in Internet Gaming C. K. John Wang & W. C. Liu & A. Khoo Published online: 27 May
More informationValidation of the Patient Perception of Migraine Questionnaire
Volume 5 Number 5 2002 VALUE IN HEALTH Validation of the Patient Perception of Migraine Questionnaire Kimberly Hunt Davis, MS, 1 Libby Black, PharmD, 1 Betsy Sleath, PhD 2 1 GlaxoSmithKline, Research Triangle
More informationaccuracy (see, e.g., Mislevy & Stocking, 1989; Qualls & Ansley, 1985; Yen, 1987). A general finding of this research is that MML and Bayesian
Recovery of Marginal Maximum Likelihood Estimates in the Two-Parameter Logistic Response Model: An Evaluation of MULTILOG Clement A. Stone University of Pittsburgh Marginal maximum likelihood (MML) estimation
More informationA Broad-Range Tailored Test of Verbal Ability
A Broad-Range Tailored Test of Verbal Ability Frederic M. Lord Educational Testing Service Two parallel forms of a broad-range tailored test of verbal ability have been built. The test is appropriate from
More informationGraphical Representation of Multidimensional
Graphical Representation of Multidimensional Item Response Theory Analyses Terry Ackerman University of Illinois, Champaign-Urbana This paper illustrates how graphical analyses can enhance the interpretation
More informationwas also my mentor, teacher, colleague, and friend. It is tempting to review John Horn s main contributions to the field of intelligence by
Horn, J. L. (1965). A rationale and test for the number of factors in factor analysis. Psychometrika, 30, 179 185. (3362 citations in Google Scholar as of 4/1/2016) Who would have thought that a paper
More informationAn item response theory analysis of Wong and Law emotional intelligence scale
Available online at www.sciencedirect.com Procedia Social and Behavioral Sciences 2 (2010) 4038 4047 WCES-2010 An item response theory analysis of Wong and Law emotional intelligence scale Jahanvash Karim
More informationSchool Administrators Level of Self-Esteem and its Relationship To Their Trust in Teachers. Mualla Aksu, Soner Polat, & Türkan Aksu
School Administrators Level of Self-Esteem and its Relationship To Their Trust in Teachers Mualla Aksu, Soner Polat, & Türkan Aksu What is Self-Esteem? Confidence in one s own worth or abilities (http://www.oxforddictionaries.com/definition/english/self-esteem)
More informationReliability and Validity of the Divided
Aging, Neuropsychology, and Cognition, 12:89 98 Copyright 2005 Taylor & Francis, Inc. ISSN: 1382-5585/05 DOI: 10.1080/13825580590925143 Reliability and Validity of the Divided Aging, 121Taylor NANC 52900
More informationRunning head: CFA OF STICSA 1. Model-Based Factor Reliability and Replicability of the STICSA
Running head: CFA OF STICSA 1 Model-Based Factor Reliability and Replicability of the STICSA The State-Trait Inventory of Cognitive and Somatic Anxiety (STICSA; Ree et al., 2008) is a new measure of anxiety
More informationItem Response Theory (IRT): A Modern Statistical Theory for Solving Measurement Problem in 21st Century
International Journal of Scientific Research in Education, SEPTEMBER 2018, Vol. 11(3B), 627-635. Item Response Theory (IRT): A Modern Statistical Theory for Solving Measurement Problem in 21st Century
More informationAn Assessment of the Mathematics Information Processing Scale: A Potential Instrument for Extending Technology Education Research
Association for Information Systems AIS Electronic Library (AISeL) SAIS 2009 Proceedings Southern (SAIS) 3-1-2009 An Assessment of the Mathematics Information Processing Scale: A Potential Instrument for
More informationMultidimensionality and Item Bias
Multidimensionality and Item Bias in Item Response Theory T. C. Oshima, Georgia State University M. David Miller, University of Florida This paper demonstrates empirically how item bias indexes based on
More informationAnumber of studies have shown that ignorance regarding fundamental measurement
10.1177/0013164406288165 Educational Graham / Congeneric and Psychological Reliability Measurement Congeneric and (Essentially) Tau-Equivalent Estimates of Score Reliability What They Are and How to Use
More informationalternate-form reliability The degree to which two or more versions of the same test correlate with one another. In clinical studies in which a given function is going to be tested more than once over
More informationConstruct Validity of Mathematics Test Items Using the Rasch Model
Construct Validity of Mathematics Test Items Using the Rasch Model ALIYU, R.TAIWO Department of Guidance and Counselling (Measurement and Evaluation Units) Faculty of Education, Delta State University,
More information