Parallel Forms for Diagnostic Purpose
Paper presented at AERA, 2010
Parallel Forms for Diagnostic Purpose
Fang Chen, Xinrui Wang
UNCG, USA
May 2010
INTRODUCTION

With the advancement of validity discussions, the measurement field is pushing research from the initial stage of test development through to the final stage of score interpretation and usage. Strictly speaking, a test designed for a specific purpose should not be used for other, unintended purposes. However, a test is expected to fulfill so many goals for different audiences that compromises are frequent in practice. One strong argument in favor of using one test for more than one purpose is to minimize interference with, and time taken from, classroom teaching. This is especially true for achievement tests in many parts of the world. With this in mind, how to maximize the information from a single test is a topic of interest for many measurement researchers as well as practitioners. One approach is to analyze the data from different perspectives for different purposes: for example, using Item Response Theory (IRT) to find the best cut score for selection purposes, and using Cognitive Diagnostic Models (CDM) to look for profiles of skill mastery for diagnostic, placement, or program evaluation purposes.

Another issue of interest to the measurement field is parallel test forms. Generating parallel test forms is important when measuring achievement: it helps with test security, and it enables multiple testing windows to ensure fairness and the best performance of every test taker. This is even more important when the same test result is used to allocate limited further educational opportunities. However, generating parallel forms is also a challenging task, as tests have to strike a balance between content and measurement specifications at the same time (Gibson & Weiner, 1998). Under classical test theory (CTT), item difficulty and item discrimination are used to measure whether test forms are parallel.
With modern test theories (models) such as item response theory (IRT), one selects items from pre-calibrated banks according to the test information function under constraints. Tests do not need to be parallel in terms of item or test difficulty, although comparable content coverage is still regarded as better. However, while IRT has become the norm for modern testing, it cannot provide the more refined information that would benefit teachers and students for diagnostic and teaching purposes. CDMs were developed for this purpose. CDMs can produce a detailed analysis of a person's ability profile and help maximize the information from a test beyond what IRT can provide. However, for this emerging class of models, the question of whether test forms are parallel has not been covered as much in the literature. For this reason, it is interesting to explore procedures to evaluate the parallelism of test forms from a cognitive diagnostic perspective. A detailed introduction to CDM is beyond the scope of this paper; interested readers can refer to Leighton and Gierl (2007) and Rupp, Templin, and Henson (2010). Chinese readers can also refer to a non-technical introduction by Chen (2011).

In CDMs, latent classes are involved: we regard the correct and incorrect answers to an item as a manifestation of a group of latent attributes working together for that item. Different attribute patterns lead to different probabilities of correct or incorrect responses; these patterns are called latent classes. Within latent classes is the idea of conditional independence, in which the probabilities of responses are independent of one another given latent class membership (Rupp et al., 2010). CDM includes a whole series of sub-models with their own particular constraints.
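The latent-class idea can be made concrete with a short sketch: with A binary attributes there are 2^A candidate latent classes. A minimal Python example (the attribute count of three is chosen only for illustration) enumerates them:

```python
from itertools import product

# Each examinee belongs to one of 2**A latent classes, where a class is
# a pattern of mastered (1) / non-mastered (0) attributes.
A = 3  # e.g., three attributes such as Knowing, Applying, Reasoning
latent_classes = list(product([0, 1], repeat=A))

print(len(latent_classes))  # 8 candidate classes
print(latent_classes[0])    # (0, 0, 0): nothing mastered
print(latent_classes[-1])   # (1, 1, 1): everything mastered
```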
One model in particular, the noncompensatory reduced reparameterized unified model (NCRUM), assumes that the probability of a correct response decreases as the number of attributes mastered decreases (Rupp et al., 2010). Put simply, the NCRUM requires that a person master simpler attributes before more complex attributes. A model such as this may be especially useful in achievement testing, where there are
several content areas that range from rudimentary to more complex, i.e., hierarchical learning is expected (Bloom, 1956).

Benjamin S. Bloom developed the taxonomy of learning as a result of instruction, known as the Taxonomy of Educational Objectives, as an easier approach to developing examinations (Krathwohl, 2002). The taxonomy is used as a tool to measure learning in six main cognitive domains: knowledge, comprehension, application, analysis, synthesis, and evaluation (Krathwohl, 2002). Furthermore, the taxonomy is arranged as a hierarchy: in order to progress to the next level of thinking, one must possess the skills of the levels that precede it (Bloom, 1956). For example, one cannot progress to comprehension without having acquired the skills of knowledge. Although Bloom's taxonomy has been extended and developed into numerous new frameworks, the central concept remains: the cognitive skills are hierarchical.

Is this assumption supported by real achievement test data? If so, is the cognitive diagnostic information for test takers based on traditionally defined parallel forms also consistent from a CDM perspective? How should parallelism be evaluated if the test's purpose is diagnostic? We decided to explore the parallelism of test forms in terms of cognitive assessment and diagnosis, which is closely related to the proposal of maximizing information from achievement tests through CDMs. We demonstrate the considerations needed to evaluate parallelism for diagnostic purposes and explore indices that can help with the judgment.

METHOD

Data

We used data from the 2007 Trends in International Mathematics and Science Study (TIMSS) program. TIMSS is an international program designed to improve students' mathematical and science skills (Olson, Martin, & Mullis, 2009). TIMSS measures trends in math and science every four years at the fourth and eighth grade levels in fifty-nine countries (Olson et al., 2009).
The sampling selection in TIMSS 2007 follows a systematic, two-stage probability proportional-to-size (PPS) sampling technique, where schools are first selected and then classes within sampled (and participating) schools. This sampling method is a natural match with the hierarchical nature of the population, where classes of students are nested within schools. The schools are sampled to mimic the variety of school types, and classes within schools are sampled to mimic the diversity among classes. For our study, we used the United States student sample, with a total of 545 students, of which 50.6% were girls and 49.4% were boys. We also decided to focus on the mathematics test for exploration purposes, where the cognitive hierarchy is clear to define.

For the mathematics test, students are given a booklet of questions to measure achievement (Olson et al., 2009). The booklets are divided into two blocks (Block 1 and Block 2). While each student takes two different blocks, each block is shared between two groups of test takers for linking purposes. The test questions are divided into three cognitive domains similar to those of Bloom's taxonomy: Knowing, Applying, and Reasoning. We can examine whether the two blocks of the TIMSS mathematics achievement test were parallel from a diagnostic perspective, i.e., whether they give similar estimates of students' ability profiles according to the test blueprint. The NCRUM model was chosen based on our theory that the three cognitive domains are hierarchical in nature. Put simply, if Reasoning is the skill required to respond to a question correctly, knowing the concept and being able to apply the knowledge are not enough to ensure a correct answer.

The chosen mathematics test is divided into two blocks with 13 and 16 questions, respectively. The design of the blocks makes it clear that the blocks are assumed to be parallel in
terms of structure, content, and quality. That is, they can be exchanged for each other and provide reliable score interpretation for any group of test takers. The cognitive skills are the focus of this paper, and the coverage of the skills as measured by the blocks is defined by the test development team and summarized in Table 1.

Table 1. Distribution of Cognitive Skills
            Block 1   Block 2
Knowing         3         6
Applying        9         6
Reasoning       1         4
Total          13        16

There are three relevant research questions:
1. How well can the TIMSS items discriminate between students with high and low cognitive abilities?
2. Can the two test forms (blocks) give consistent and reliable classification of students in terms of cognitive abilities?
3. Do student responses reflect a hierarchy of the three cognitive skills? That is, is the assumed hierarchy of Bloom's taxonomy supported by the data?

Model

A cognitive diagnostic model (DCM/CDM), the NCRUM, was chosen for several reasons. First, we wanted to see more detailed information than just a total score. A regular unidimensional item response theory (IRT) model was enough to provide overall item quality and person ability estimation, but it did not differentiate between test takers on each cognitive skill required by each item. A DCM, however, provides this type of information. When combined with analyses of math attributes such as Algebra and Number, as classified by TIMSS, users will be able to explain the differences in student responses in terms of both math ability and cognitive ability. As a subtype of DCM models, the NCRUM assumes that the attributes measured by an item cannot compensate for the lack of other attributes required by the item (Rupp et al., 2010). As previously mentioned, this is consistent with Bloom's taxonomy and justifies our choice of this particular model within the DCM family. A Q-matrix relevant to our research purpose is retrofitted to the data for analysis.
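The noncompensatory assumption can be sketched with the reduced-RUM response function, P(X_i = 1 | α) = π*_i ∏_a (r*_ia)^(q_ia(1−α_a)) (Rupp et al., 2010). The parameter values below are invented purely for illustration:

```python
def rrum_prob(pi_star, r_star, q, alpha):
    """P(correct) under the reduced RUM: start from pi* (the probability
    for a master of every required attribute) and apply the penalty r*_ia
    for each attribute the item requires (q_ia = 1) but the examinee has
    not mastered (alpha_a = 0)."""
    p = pi_star
    for r_ia, q_ia, alpha_a in zip(r_star, q, alpha):
        if q_ia == 1 and alpha_a == 0:
            p *= r_ia
    return p

# Hypothetical item requiring Knowing and Applying: q = [1, 1, 0].
pi_star = 0.9
r_star = [0.5, 0.4, 1.0]  # invented penalties per attribute

print(rrum_prob(pi_star, r_star, [1, 1, 0], [1, 1, 0]))  # master of both: 0.9
print(rrum_prob(pi_star, r_star, [1, 1, 0], [1, 0, 0]))  # lacks Applying: 0.9 * 0.4
```

Because the penalties multiply, mastering Applying cannot make up for not knowing the concept; that is the noncompensatory property in miniature.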
To create a Q-matrix, we re-specified the cognitive skills to match Bloom's Taxonomy. Thus, an item meant to measure Knowing was coded (1, 0, 0), meaning only the level-1 cognitive skill is required; an item measuring Applying was coded (1, 1, 0), corresponding to level 2; and an item measuring Reasoning was coded (1, 1, 1). We used the program RUM, written by Dr. Robert Henson, for this analysis. It was an easy tool to implement, and it provided the attribute-level item parameters π* and r* (Rupp et al., 2010). These parameters were then used to calculate the attribute-level discrimination parameters for the purpose of item evaluation.
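This coding scheme is mechanical enough to express in a few lines. A sketch (the helper and its input format are hypothetical; Block 1's skill counts are taken from Table 1) of building the Q-matrix:

```python
# Under the hierarchy assumption, each higher skill presupposes the lower ones.
Q_ROWS = {
    "Knowing":   [1, 0, 0],  # requires Knowing only
    "Applying":  [1, 1, 0],  # requires Knowing and Applying
    "Reasoning": [1, 1, 1],  # requires all three skills
}

def build_q_matrix(item_skills):
    """item_skills: one target-skill label per item (hypothetical input)."""
    return [Q_ROWS[skill] for skill in item_skills]

# Block 1 per Table 1: 3 Knowing, 9 Applying, 1 Reasoning item.
block1_skills = ["Knowing"] * 3 + ["Applying"] * 9 + ["Reasoning"]
Q = build_q_matrix(block1_skills)
print(len(Q))  # 13 rows, one per item
print(Q[0])    # [1, 0, 0]
print(Q[-1])   # [1, 1, 1]
```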
Analyses procedures

We calculated discrimination parameters using Equation 1, following the notation in Rupp et al. (2010). For each attribute a = 1, ..., A,

    d_ia = π*_i − π*_i (r*_ia)^(q_ia)    (Equation 1)

High π* and low r* values indicate good items. We also used the traditional difficulty index, the p-value in classical test theory, and compared the results between the two approaches. Next, we classified the students into different categories. Although there were eight possible attribute categories, Bloom's taxonomy allows only four of them because of its hierarchical nature. However, we summarized both sets of categories in our study to explore question 3. Finally, we compared the students' profiles resulting from the test across the two blocks. The percentage of profile change was probed and analyzed. If the blocks were parallel and the items were good, we expected a smaller percentage change for each attribute, and vice versa.

RESULTS

Item analyses

Item discrimination analyses based on the NCRUM model and classical test theory are shown in Tables 2 and 3.

Table 2. Attribute-Level Item Discrimination Based on Block 1
NCRUM
Item   Knowing   Applying   Reasoning   Mean

Table 3. Attribute-Level Item Discrimination Based on Block 2
NCRUM
Item   Knowing   Applying   Reasoning   Mean

There is little literature to guide the evaluation of item quality under the NCRUM. We decided to use .20 as a reasonable cut point: if the proportion of masters answering an item correctly exceeds that of non-masters by at least 20 percentage points, the item discriminates between masters and non-masters well. Using this rule, the two blocks were found to discriminate masters from non-masters moderately well. Specifically, of the twenty-four attribute measures that could be evaluated for Block 1, eighteen were discriminating; of the thirty possible attribute measures for Block 2, twenty-four were discriminating.

Classical Test Theory (CTT) was also used to examine the quality of the test for Blocks 1 and 2, as shown in Table 4. Overall, the reliability of the 29 test items was According to the item statistics, item 22 (p-value = 0.14) and item 12 (p-value = 0.17) were the hardest items on the test. Item 27 (p-value = 0.84) was the easiest item among the examinees taking this test. The least discriminating of the 29 items were item 17 (r_pb = 0.162) and item 6 (r_pb = 0.164).

Table 4. Classical Test Theory: Reliability Statistics (N = 545)
Block     Cronbach's Alpha   # of Items
Block 1                          13
Block 2                          16
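The CTT statistics used here (item p-values, corrected item-total correlations, and Cronbach's alpha) are straightforward to compute from a 0/1 score matrix. A self-contained sketch on a tiny invented data set:

```python
import statistics as st

def p_values(X):
    """Proportion correct per item; X is an examinees-by-items 0/1 matrix."""
    n = len(X)
    return [sum(row[j] for row in X) / n for j in range(len(X[0]))]

def cronbach_alpha(X):
    """alpha = k/(k-1) * (1 - sum of item variances / variance of total)."""
    k = len(X[0])
    item_vars = [st.pvariance([row[j] for row in X]) for j in range(k)]
    total_var = st.pvariance([sum(row) for row in X])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)

def corrected_item_total(X, j):
    """Correlation of item j with the total score excluding item j (r_pb)."""
    n = len(X)
    item = [row[j] for row in X]
    rest = [sum(row) - row[j] for row in X]
    m_i, m_r = sum(item) / n, sum(rest) / n
    cov = sum((x - m_i) * (y - m_r) for x, y in zip(item, rest)) / n
    sd = lambda v, m: (sum((x - m) ** 2 for x in v) / n) ** 0.5
    return cov / (sd(item, m_i) * sd(rest, m_r))

# Invented responses: 4 examinees x 3 items (far too small to be meaningful).
X = [[1, 1, 0],
     [1, 0, 0],
     [0, 1, 1],
     [1, 1, 1]]
print(p_values(X))  # [0.75, 0.75, 0.5]
```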
Block 1 consisted of 13 items that had a reliability of The easiest item was item 1 (p-value = 0.81) and the hardest item was item 12 (p-value = 0.17). Item 6 (r_pb = 0.159) and item 1 (r_pb = 0.195) had the lowest discrimination values. Block 2 had a total of 16 items with a reliability of Item 27 had a p-value of 0.84, which made it the easiest item in Block 2. Additionally, item 22 seemed to be the hardest item, with a p-value of 0.14. Item 17 (r_pb = 0.158) does not discriminate well among examinees. These results are also listed in Table 5.

Table 5. Classical Test Theory: Item Statistics (N = 545)
Block 1: Item, M, Corrected Item-Total Correlation
Block 2: Item, M, Corrected Item-Total Correlation

Block 1 and Block 2 were separated according to Bloom's Taxonomy to examine the item analyses in each subcategory (Table 6): Knowing, Applying, and Reasoning. In Block 1, Knowing had a total of 3 items with a reliability of The Applying category included 9 items with a reliability of 0.71; however, item analyses could not be conducted for the Reasoning category because it contained only one item.
In Block 2, Knowing included 6 items (α = 0.51) and Applying included 6 items (α = 0.49). The Reasoning section included 4 items with a reliability of

Table 6. Classical Test Theory: Reliability Statistics (N = 545)
Block                   Cronbach's Alpha   # of Items
Block 1   Knowing                               3
          Applying          0.71                9
          Reasoning                             1
Block 2   Knowing           0.51                6
          Applying          0.49                6
          Reasoning                             4

If a higher-level cognitive skill is required for an item, most of the time the item discriminates the lower-level skills better than the higher ones. This is in line with Bloom's Taxonomy, because more guessing may be involved for an item requiring a higher skill, making responses to it more variable and discrimination less clear. In addition, having more items and/or more highly discriminating items would improve the quality of both Block 1 and Block 2.

When items were analyzed for the cognitive attributes specifically, we found that the item that discriminated Knowing best was item 9 in Block 2. This item measured Knowing, Applying, and Reasoning, and only 14% of students got it right. Conversely, item 6 in Block 1 discriminated Knowing least; it measured Knowing and Applying, and 73% of students got it correct. The item that discriminated Applying best was item 13 in Block 1, which measured Knowing and Applying; 26% of students got it right. The item that discriminated Applying least was item 2 in Block 1, which also measured Knowing and Applying; 76% of students got it right. Interestingly, item 2 is a word problem rather than a multiple-choice question. This suggests that perhaps the attributes the question was intended to measure were not clearly defined (for the questions in both blocks, see Appendix A). The item that discriminated Reasoning best was item 8 in Block 2; it measured all three attributes, and 32% of students got it right. The item that was least discriminating for the Reasoning attribute was item 10 in Block 2, which had a correct-response rate of 37%.
For the attributes of Knowing and Applying, easy items discriminated poorly. This may suggest a ceiling effect for these items: most students knew the correct answer, so the items lost the ability to discriminate. For the attribute of Reasoning, no such pattern was found.
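The attribute-level discrimination behind these comparisons can be sketched directly from Equation 1's quantities as the drop in success probability when a required attribute alone is unmastered. The parameter values below are invented for illustration:

```python
def attribute_discrimination(pi_star, r_star, q):
    """d_ia = pi*_i * (1 - r*_ia) for each required attribute (q_ia = 1):
    the drop in P(correct) when attribute a alone is not mastered.
    Returns None for attributes the item does not require."""
    return [pi_star * (1 - r_ia) if q_ia == 1 else None
            for r_ia, q_ia in zip(r_star, q)]

# Hypothetical item requiring Knowing and Applying.
d = attribute_discrimination(0.8, [0.6, 0.3, 1.0], [1, 1, 0])
print([round(x, 2) if x is not None else None for x in d])  # [0.32, 0.56, None]

# Apply the .20 cut point used above.
print([x is not None and x >= 0.20 for x in d])  # [True, True, False]
```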
Attribute Profile

A probability higher than 0.60 suggests that one has mastered an attribute, while a probability lower than 0.40 suggests that one has not (Rupp et al., 2010). We deleted the cases with any attribute-mastery probability between 0.40 and 0.60, meaning we would need more information to classify these students accurately. This left 402 of the 545 individuals with valid data. The possible latent class profiles for three attributes are shown in Table 7.

Table 7. Possible student profile classification
Latent Class   Attribute Profile

Only the first four classes were reasonable according to the hierarchy described in Bloom's Taxonomy. In our results, each block generated five different attribute profiles. The attribute profile for each latent class and the corresponding probability are shown in Table 8.

Table 8. The probability for attribute profiles
                 Block 1                    Block 2
Latent class   Attribute Profile   p      Attribute Profile   p
1              α11 = (0,0,0)       .493   α21 = (0,0,0)
2              α12 = (0,1,0)       .032   α22 = (1,0,0)
3              α13 = (1,0,0)       .313   α23 = (1,0,1)
4              α14 = (1,1,0)       .159   α24 = (1,1,0)
5              α15 = (1,1,1)       .002   α25 = (1,1,1)       .170

According to the table, Bloom's Taxonomy is generally supported. Three attribute profiles had low probabilities: (0,1,0), (1,0,1), and (1,1,1). The first two are unreasonable under Bloom's Taxonomy, and their low probabilities in the data support this. Excluding these profiles, we found only four latent classes: (0,0,0), (1,0,0), (1,1,0), and (1,1,1). These are exactly what Bloom's taxonomy would predict. Among the four latent classes, we found the percentage of students mastering each attribute also reasonable: the probability decreases as the number of mastered attributes increases, suggesting a hierarchy among the three attributes. However, the probability of latent class 5 in Block 1 was extremely low.
This may be because there is only one item in Block 1 measuring Reasoning, so the results may not be accurate.
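The 0.60/0.40 classification rule, and the between-block comparison it feeds, can be sketched as follows; the posterior mastery probabilities are invented for illustration:

```python
def classify(p):
    """Map a posterior mastery probability to 1 (master), 0 (non-master),
    or None for the 0.40-0.60 band, where the case is dropped."""
    if p >= 0.60:
        return 1
    if p <= 0.40:
        return 0
    return None

def percent_switching(probs_block1, probs_block2):
    """Share of validly classified examinees whose mastery status for one
    attribute differs between the two blocks."""
    pairs = [(classify(p1), classify(p2))
             for p1, p2 in zip(probs_block1, probs_block2)]
    valid = [(c1, c2) for c1, c2 in pairs if c1 is not None and c2 is not None]
    return sum(c1 != c2 for c1, c2 in valid) / len(valid)

# Invented posterior probabilities for one attribute on each block.
b1 = [0.9, 0.2, 0.7, 0.55, 0.1]
b2 = [0.8, 0.7, 0.3, 0.65, 0.2]
print(percent_switching(b1, b2))  # 0.5 (2 of the 4 valid cases switch)
```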
Block Comparison

Our hypothesis was that Block 1 and Block 2 should generate the same mastery profile for each person because the blocks are parallel. If the mastery profiles generated by the two blocks differ, they are not parallel tests in practice. By comparing each examinee's attribute mastery profiles generated by Block 1 and Block 2, discrepancies were found across all three attributes. The percentage of students switching between master and non-master status was 29.7% for Knowing, 17.4% for Applying, and 12.2% for Reasoning. This means our blocks could not categorize the students reliably. Though the discrepancy seems to decrease at the higher cognitive domains, it should be noted that the probability of answering the Applying and Reasoning items correctly also decreases. Apparently, these two blocks were not parallel tests for classification and diagnostic purposes. A second reason for the discrepancy between the diagnostic results may be the unbalanced item distribution in Block 1, in which there is only one item measuring Reasoning. Therefore, the judgment of students' Reasoning ability obtained from Block 1 was not reliable.

Relating CTT with CDM indices

Generally speaking, both blocks discriminated students well. For Block 1, the discrimination index ranged over (0.01, 0.75) based on the CDM and (0.15, 0.53) based on CTT. For Block 2, it ranged over (0.02, 0.85) based on the CDM and (0.15, 0.52) based on CTT. All item discrimination indices were positive, which shows that the items were reasonable. While the discrimination index based on CTT is an item-level discrimination, the index based on the CDM is an attribute-level discrimination. When an item measured only one attribute, the CDM and CTT discrimination indices were consistent. When all attributes for an item had small discrimination indices, the item had a small CTT discrimination index as well (e.g.,
item 4 in Block 2). However, when the attribute discrimination index varied across the attributes of an item, the CTT item discrimination index seemed to be a balance of all the attribute discrimination indices.

SUMMARY

This paper evaluated the efficacy of Booklet 1 of TIMSS 2007 in measuring the cognitive ability of eighth-grade students. We used methods based on both Cognitive Diagnostic Modeling (CDM) and Classical Test Theory (CTT) to examine how well the items discriminated between students of different ability levels. We demonstrated how to use various indices to evaluate parallelism from a CDM perspective and compared them to CTT indices. More empirical research should be done to explore the relationship between item difficulty in CTT and attribute discrimination in CDM.

We also compared the two blocks in Booklet 1 regarding their ability to classify students accurately. The reliability evaluation based on CTT showed that both blocks had good internal consistency. However, after using the CDM to categorize students into latent classes, we did not obtain evidence that the two blocks give consistent and reliable classifications. The percentage of students switching between master and non-master status was 29.7% for Knowing, 17.4% for Applying, and 12.2% for Reasoning. This raises concerns about the validity of test score interpretation if a diagnostic feature is built in at the design stage and is expected to be shared with score users. Fortunately, TIMSS was not designed for diagnostic purposes. If a test is designed for diagnostic purposes, concepts such as parallel
forms and reliability will have to differ from those of non-diagnostic tests. This paper explores this issue and casts a new understanding of these traditional concepts from a CDM perspective. This is relevant not only to the validity of test score interpretation but also to the initial stage of test development, where the test blueprint will have to consider new dimensions to ensure test quality. Of course, many other concepts related to the current trend toward computer-based testing, such as test-assembly engineering, will also change. This is a worthwhile field for further exploration in diagnostic assessment.

The CDM was also used to examine the hierarchy among the three cognitive domains. Our results support the hypothesis that student responses reflect a hierarchy of the three cognitive skills in the TIMSS math measurement. If a higher-level cognitive skill was required for an item, all the lower-level skills had to be present to give a correct response. These three attributes are in the same order as defined in Bloom's Taxonomy, with Knowing being the lowest skill and Reasoning (Evaluation) being the highest.

References

Bloom, B.S. (1956). Taxonomy of educational objectives, handbook 1: The cognitive domain. New York: David McKay.
Chen, F. (2011). Diagnostic classification models: A new tool for the testing field. Foreign Language Learning Theory and Practice, (2). [In Chinese]
Gibson, W.M., & Weiner, J.A. (1998). Generating parallel test forms using CTT in a computer-based environment. Journal of Educational Measurement, 35.
Hambleton, R.K., & Swaminathan, H. (1990). Item response theory: Principles and applications. Norwell, MA: Kluwer Academic Publishers.
Krathwohl, D.R. (2002). A revision of Bloom's taxonomy: An overview. Theory Into Practice, 41.
Leighton, J.P., & Gierl, M.J. (Eds.). (2007). Cognitive diagnostic assessment for education: Theory and applications. New York, NY: Cambridge University Press.
Olson, J.F., Martin, M.O., & Mullis, I.V. (2009). TIMSS 2007 technical report. TIMSS & PIRLS International Study Center, Lynch School of Education, Boston College.
Olson, J.F., & Foy, P.
(2009). TIMSS 2007 user guide for the international database. TIMSS & PIRLS International Study Center, Lynch School of Education, Boston College.
Rupp, A.A., Templin, J., & Henson, R. (2010). Diagnostic measurement: Theory, methods, and applications. New York: Guilford Press.
More informationDuring the past century, mathematics
An Evaluation of Mathematics Competitions Using Item Response Theory Jim Gleason During the past century, mathematics competitions have become part of the landscape in mathematics education. The first
More informationJONATHAN TEMPLIN LAINE BRADSHAW THE USE AND MISUSE OF PSYCHOMETRIC MODELS
PSYCHOMETRIKA VOL. 79, NO. 2, 347 354 APRIL 2014 DOI: 10.1007/S11336-013-9364-Y THE USE AND MISUSE OF PSYCHOMETRIC MODELS JONATHAN TEMPLIN UNIVERSITY OF KANSAS LAINE BRADSHAW THE UNIVERSITY OF GEORGIA
More informationThe Psychometric Development Process of Recovery Measures and Markers: Classical Test Theory and Item Response Theory
The Psychometric Development Process of Recovery Measures and Markers: Classical Test Theory and Item Response Theory Kate DeRoche, M.A. Mental Health Center of Denver Antonio Olmos, Ph.D. Mental Health
More informationItem Analysis Explanation
Item Analysis Explanation The item difficulty is the percentage of candidates who answered the question correctly. The recommended range for item difficulty set forth by CASTLE Worldwide, Inc., is between
More informationBy Hui Bian Office for Faculty Excellence
By Hui Bian Office for Faculty Excellence 1 Email: bianh@ecu.edu Phone: 328-5428 Location: 1001 Joyner Library, room 1006 Office hours: 8:00am-5:00pm, Monday-Friday 2 Educational tests and regular surveys
More informationAN ANALYSIS ON VALIDITY AND RELIABILITY OF TEST ITEMS IN PRE-NATIONAL EXAMINATION TEST SMPN 14 PONTIANAK
AN ANALYSIS ON VALIDITY AND RELIABILITY OF TEST ITEMS IN PRE-NATIONAL EXAMINATION TEST SMPN 14 PONTIANAK Hanny Pradana, Gatot Sutapa, Luwandi Suhartono Sarjana Degree of English Language Education, Teacher
More informationBruno D. Zumbo, Ph.D. University of Northern British Columbia
Bruno Zumbo 1 The Effect of DIF and Impact on Classical Test Statistics: Undetected DIF and Impact, and the Reliability and Interpretability of Scores from a Language Proficiency Test Bruno D. Zumbo, Ph.D.
More informationComputerized Mastery Testing
Computerized Mastery Testing With Nonequivalent Testlets Kathleen Sheehan and Charles Lewis Educational Testing Service A procedure for determining the effect of testlet nonequivalence on the operating
More informationBrent Duckor Ph.D. (SJSU) Kip Tellez, Ph.D. (UCSC) BEAR Seminar April 22, 2014
Brent Duckor Ph.D. (SJSU) Kip Tellez, Ph.D. (UCSC) BEAR Seminar April 22, 2014 Studies under review ELA event Mathematics event Duckor, B., Castellano, K., Téllez, K., & Wilson, M. (2013, April). Validating
More informationThe Use of Unidimensional Parameter Estimates of Multidimensional Items in Adaptive Testing
The Use of Unidimensional Parameter Estimates of Multidimensional Items in Adaptive Testing Terry A. Ackerman University of Illinois This study investigated the effect of using multidimensional items in
More informationValidity, Reliability, and Fairness in Music Testing
chapter 20 Validity, Reliability, and Fairness in Music Testing Brian C. Wesolowski and Stefanie A. Wind The focus of this chapter is on validity, reliability, and fairness in music testing. A test can
More informationThe Classification Accuracy of Measurement Decision Theory. Lawrence Rudner University of Maryland
Paper presented at the annual meeting of the National Council on Measurement in Education, Chicago, April 23-25, 2003 The Classification Accuracy of Measurement Decision Theory Lawrence Rudner University
More informationTHE APPLICATION OF ORDINAL LOGISTIC HEIRARCHICAL LINEAR MODELING IN ITEM RESPONSE THEORY FOR THE PURPOSES OF DIFFERENTIAL ITEM FUNCTIONING DETECTION
THE APPLICATION OF ORDINAL LOGISTIC HEIRARCHICAL LINEAR MODELING IN ITEM RESPONSE THEORY FOR THE PURPOSES OF DIFFERENTIAL ITEM FUNCTIONING DETECTION Timothy Olsen HLM II Dr. Gagne ABSTRACT Recent advances
More informationModels in Educational Measurement
Models in Educational Measurement Jan-Eric Gustafsson Department of Education and Special Education University of Gothenburg Background Measurement in education and psychology has increasingly come to
More informationAdaptive Testing With the Multi-Unidimensional Pairwise Preference Model Stephen Stark University of South Florida
Adaptive Testing With the Multi-Unidimensional Pairwise Preference Model Stephen Stark University of South Florida and Oleksandr S. Chernyshenko University of Canterbury Presented at the New CAT Models
More informationTechnical Specifications
Technical Specifications In order to provide summary information across a set of exercises, all tests must employ some form of scoring models. The most familiar of these scoring models is the one typically
More informationEvaluating the Consistency of Test Content across Two Successive Administrations of a State- Mandated Science and Technology Assessment 1,2
Evaluating the Consistency of Test Content across Two Successive Administrations of a State- Mandated Science and Technology Assessment 1,2 Timothy O Neil 3 and Stephen G. Sireci University of Massachusetts
More informationItem Analysis: Classical and Beyond
Item Analysis: Classical and Beyond SCROLLA Symposium Measurement Theory and Item Analysis Modified for EPE/EDP 711 by Kelly Bradley on January 8, 2013 Why is item analysis relevant? Item analysis provides
More informationWorkshop Overview. Diagnostic Measurement. Theory, Methods, and Applications. Session Overview. Conceptual Foundations of. Workshop Sessions:
Workshop Overview Workshop Sessions: Diagnostic Measurement: Theory, Methods, and Applications Jonathan Templin The University of Georgia Session 1 Conceptual Foundations of Diagnostic Measurement Session
More informationShiken: JALT Testing & Evaluation SIG Newsletter. 12 (2). April 2008 (p )
Rasch Measurementt iin Language Educattiion Partt 2:: Measurementt Scalles and Invariiance by James Sick, Ed.D. (J. F. Oberlin University, Tokyo) Part 1 of this series presented an overview of Rasch measurement
More informationInfluences of IRT Item Attributes on Angoff Rater Judgments
Influences of IRT Item Attributes on Angoff Rater Judgments Christian Jones, M.A. CPS Human Resource Services Greg Hurt!, Ph.D. CSUS, Sacramento Angoff Method Assemble a panel of subject matter experts
More informationDetermining Differential Item Functioning in Mathematics Word Problems Using Item Response Theory
Determining Differential Item Functioning in Mathematics Word Problems Using Item Response Theory Teodora M. Salubayba St. Scholastica s College-Manila dory41@yahoo.com Abstract Mathematics word-problem
More informationRe-Examining the Role of Individual Differences in Educational Assessment
Re-Examining the Role of Individual Differences in Educational Assesent Rebecca Kopriva David Wiley Phoebe Winter University of Maryland College Park Paper presented at the Annual Conference of the National
More informationA Comparison of Three Measures of the Association Between a Feature and a Concept
A Comparison of Three Measures of the Association Between a Feature and a Concept Matthew D. Zeigenfuse (mzeigenf@msu.edu) Department of Psychology, Michigan State University East Lansing, MI 48823 USA
More informationACADEMIC APPLICATION:
Academic Skills: Critical Thinking Bloom s Taxonomy Name Point of the Assignment: To help you realize there are different forms of critical thinking to be used in education. Some forms of critical thinking
More informationfor Scaling Ability and Diagnosing Misconceptions Laine P. Bradshaw James Madison University Jonathan Templin University of Georgia Author Note
Combing Item Response Theory and Diagnostic Classification Models: A Psychometric Model for Scaling Ability and Diagnosing Misconceptions Laine P. Bradshaw James Madison University Jonathan Templin University
More informationComputer Adaptive-Attribute Testing
Zeitschrift M.J. für Psychologie Gierl& J. / Zhou: Journalof Computer Psychology 2008Adaptive-Attribute Hogrefe 2008; & Vol. Huber 216(1):29 39 Publishers Testing Computer Adaptive-Attribute Testing A
More informationUSE OF DIFFERENTIAL ITEM FUNCTIONING (DIF) ANALYSIS FOR BIAS ANALYSIS IN TEST CONSTRUCTION
USE OF DIFFERENTIAL ITEM FUNCTIONING (DIF) ANALYSIS FOR BIAS ANALYSIS IN TEST CONSTRUCTION Iweka Fidelis (Ph.D) Department of Educational Psychology, Guidance and Counselling, University of Port Harcourt,
More informationAnswers to end of chapter questions
Answers to end of chapter questions Chapter 1 What are the three most important characteristics of QCA as a method of data analysis? QCA is (1) systematic, (2) flexible, and (3) it reduces data. What are
More informationBlending Psychometrics with Bayesian Inference Networks: Measuring Hundreds of Latent Variables Simultaneously
Blending Psychometrics with Bayesian Inference Networks: Measuring Hundreds of Latent Variables Simultaneously Jonathan Templin Department of Educational Psychology Achievement and Assessment Institute
More informationDoes factor indeterminacy matter in multi-dimensional item response theory?
ABSTRACT Paper 957-2017 Does factor indeterminacy matter in multi-dimensional item response theory? Chong Ho Yu, Ph.D., Azusa Pacific University This paper aims to illustrate proper applications of multi-dimensional
More informationDecision consistency and accuracy indices for the bifactor and testlet response theory models
University of Iowa Iowa Research Online Theses and Dissertations Summer 2014 Decision consistency and accuracy indices for the bifactor and testlet response theory models Lee James LaFond University of
More informationHaving your cake and eating it too: multiple dimensions and a composite
Having your cake and eating it too: multiple dimensions and a composite Perman Gochyyev and Mark Wilson UC Berkeley BEAR Seminar October, 2018 outline Motivating example Different modeling approaches Composite
More informationUniversity of Alberta
University of Alberta Estimating Attribute-Based Reliability in Cognitive Diagnostic Assessment by Jiawen Zhou A thesis submitted to the Faculty of Graduate Studies and Research in partial fulfillment
More informationJanuary 2, Overview
American Statistical Association Position on Statistical Statements for Forensic Evidence Presented under the guidance of the ASA Forensic Science Advisory Committee * January 2, 2019 Overview The American
More informationPearson Education Limited Edinburgh Gate Harlow Essex CM20 2JE England and Associated Companies throughout the world
Pearson Education Limited Edinburgh Gate Harlow Essex CM20 2JE England and Associated Companies throughout the world Visit us on the World Wide Web at: www.pearsoned.co.uk Pearson Education Limited 2014
More informationImpact of Methods of Scoring Omitted Responses on Achievement Gaps
Impact of Methods of Scoring Omitted Responses on Achievement Gaps Dr. Nathaniel J. S. Brown (nathaniel.js.brown@bc.edu)! Educational Research, Evaluation, and Measurement, Boston College! Dr. Dubravka
More informationChanging the Order of Mathematics Test Items: Helping or Hindering Student Performance?
Journal of Humanistic Mathematics Volume 3 Issue 1 January 2013 Changing the Order of Mathematics Test Items: Helping or Hindering Student Performance? Kristin T. Kennedy Bryant University, kkennedy@bryant.edu
More informationINSPECT Overview and FAQs
WWW.KEYDATASYS.COM ContactUs@KeyDataSys.com 951.245.0828 Table of Contents INSPECT Overview...3 What Comes with INSPECT?....4 Reliability and Validity of the INSPECT Item Bank. 5 The INSPECT Item Process......6
More informationUsing the Score-based Testlet Method to Handle Local Item Dependence
Using the Score-based Testlet Method to Handle Local Item Dependence Author: Wei Tao Persistent link: http://hdl.handle.net/2345/1363 This work is posted on escholarship@bc, Boston College University Libraries.
More informationAlignment in Educational Testing: What it is, What it isn t, and Why it is Important
Alignment in Educational Testing: What it is, What it isn t, and Why it is Important Stephen G. Sireci University of Massachusetts Amherst Presentation delivered at the Connecticut Assessment Forum Rocky
More informationA Modified CATSIB Procedure for Detecting Differential Item Function. on Computer-Based Tests. Johnson Ching-hong Li 1. Mark J. Gierl 1.
Running Head: A MODIFIED CATSIB PROCEDURE FOR DETECTING DIF ITEMS 1 A Modified CATSIB Procedure for Detecting Differential Item Function on Computer-Based Tests Johnson Ching-hong Li 1 Mark J. Gierl 1
More informationCYRINUS B. ESSEN, IDAKA E. IDAKA AND MICHAEL A. METIBEMU. (Received 31, January 2017; Revision Accepted 13, April 2017)
DOI: http://dx.doi.org/10.4314/gjedr.v16i2.2 GLOBAL JOURNAL OF EDUCATIONAL RESEARCH VOL 16, 2017: 87-94 COPYRIGHT BACHUDO SCIENCE CO. LTD PRINTED IN NIGERIA. ISSN 1596-6224 www.globaljournalseries.com;
More informationAnalyzing data from educational surveys: a comparison of HLM and Multilevel IRT. Amin Mousavi
Analyzing data from educational surveys: a comparison of HLM and Multilevel IRT Amin Mousavi Centre for Research in Applied Measurement and Evaluation University of Alberta Paper Presented at the 2013
More informationMEASURING MIDDLE GRADES STUDENTS UNDERSTANDING OF FORCE AND MOTION CONCEPTS: INSIGHTS INTO THE STRUCTURE OF STUDENT IDEAS
MEASURING MIDDLE GRADES STUDENTS UNDERSTANDING OF FORCE AND MOTION CONCEPTS: INSIGHTS INTO THE STRUCTURE OF STUDENT IDEAS The purpose of this study was to create an instrument that measures middle grades
More informationChapter 1 Introduction to Educational Research
Chapter 1 Introduction to Educational Research The purpose of Chapter One is to provide an overview of educational research and introduce you to some important terms and concepts. My discussion in this
More informationA Comparison of Pseudo-Bayesian and Joint Maximum Likelihood Procedures for Estimating Item Parameters in the Three-Parameter IRT Model
A Comparison of Pseudo-Bayesian and Joint Maximum Likelihood Procedures for Estimating Item Parameters in the Three-Parameter IRT Model Gary Skaggs Fairfax County, Virginia Public Schools José Stevenson
More informationDevelopment, Standardization and Application of
American Journal of Educational Research, 2018, Vol. 6, No. 3, 238-257 Available online at http://pubs.sciepub.com/education/6/3/11 Science and Education Publishing DOI:10.12691/education-6-3-11 Development,
More information3 CONCEPTUAL FOUNDATIONS OF STATISTICS
3 CONCEPTUAL FOUNDATIONS OF STATISTICS In this chapter, we examine the conceptual foundations of statistics. The goal is to give you an appreciation and conceptual understanding of some basic statistical
More informationAssessment with Multiple-Choice Questions in Medical Education: Arguments for Selected-Response Formats
Assessment with Multiple-Choice Questions in Medical Education: Arguments for Selected-Response Formats Congreso Nacional De Educacion Medica Puebla, Mexico 11 January, 2007 Steven M. Downing, PhD Department
More informationPublished by European Centre for Research Training and Development UK (
DETERMINATION OF DIFFERENTIAL ITEM FUNCTIONING BY GENDER IN THE NATIONAL BUSINESS AND TECHNICAL EXAMINATIONS BOARD (NABTEB) 2015 MATHEMATICS MULTIPLE CHOICE EXAMINATION Kingsley Osamede, OMOROGIUWA (Ph.
More informationMantel-Haenszel Procedures for Detecting Differential Item Functioning
A Comparison of Logistic Regression and Mantel-Haenszel Procedures for Detecting Differential Item Functioning H. Jane Rogers, Teachers College, Columbia University Hariharan Swaminathan, University of
More informationMeasuring mathematics anxiety: Paper 2 - Constructing and validating the measure. Rob Cavanagh Len Sparrow Curtin University
Measuring mathematics anxiety: Paper 2 - Constructing and validating the measure Rob Cavanagh Len Sparrow Curtin University R.Cavanagh@curtin.edu.au Abstract The study sought to measure mathematics anxiety
More informationAnalyzing Teacher Professional Standards as Latent Factors of Assessment Data: The Case of Teacher Test-English in Saudi Arabia
Analyzing Teacher Professional Standards as Latent Factors of Assessment Data: The Case of Teacher Test-English in Saudi Arabia 1 Introduction The Teacher Test-English (TT-E) is administered by the NCA
More informationComputerized Adaptive Testing for Classifying Examinees Into Three Categories
Measurement and Research Department Reports 96-3 Computerized Adaptive Testing for Classifying Examinees Into Three Categories T.J.H.M. Eggen G.J.J.M. Straetmans Measurement and Research Department Reports
More informationHow Many Options do Multiple-Choice Questions Really Have?
How Many Options do Multiple-Choice Questions Really Have? ABSTRACT One of the major difficulties perhaps the major difficulty in composing multiple-choice questions is the writing of distractors, i.e.,
More informationComparability Study of Online and Paper and Pencil Tests Using Modified Internally and Externally Matched Criteria
Comparability Study of Online and Paper and Pencil Tests Using Modified Internally and Externally Matched Criteria Thakur Karkee Measurement Incorporated Dong-In Kim CTB/McGraw-Hill Kevin Fatica CTB/McGraw-Hill
More informationChapter 1 Introduction. Measurement Theory. broadest sense and not, as it is sometimes used, as a proxy for deterministic models.
Ostini & Nering - Chapter 1 - Page 1 POLYTOMOUS ITEM RESPONSE THEORY MODELS Chapter 1 Introduction Measurement Theory Mathematical models have been found to be very useful tools in the process of human
More informationPsychometrics for Beginners. Lawrence J. Fabrey, PhD Applied Measurement Professionals
Psychometrics for Beginners Lawrence J. Fabrey, PhD Applied Measurement Professionals Learning Objectives Identify key NCCA Accreditation requirements Identify two underlying models of measurement Describe
More informationSupplementary Material*
Supplementary Material* Lipner RS, Brossman BG, Samonte KM, Durning SJ. Effect of Access to an Electronic Medical Resource on Performance Characteristics of a Certification Examination. A Randomized Controlled
More informationThe Functional Outcome Questionnaire- Aphasia (FOQ-A) is a conceptually-driven
Introduction The Functional Outcome Questionnaire- Aphasia (FOQ-A) is a conceptually-driven outcome measure that was developed to address the growing need for an ecologically valid functional communication
More informationEnglish 10 Writing Assessment Results and Analysis
Academic Assessment English 10 Writing Assessment Results and Analysis OVERVIEW This study is part of a multi-year effort undertaken by the Department of English to develop sustainable outcomes assessment
More informationA Broad-Range Tailored Test of Verbal Ability
A Broad-Range Tailored Test of Verbal Ability Frederic M. Lord Educational Testing Service Two parallel forms of a broad-range tailored test of verbal ability have been built. The test is appropriate from
More informationItem Response Theory (IRT): A Modern Statistical Theory for Solving Measurement Problem in 21st Century
International Journal of Scientific Research in Education, SEPTEMBER 2018, Vol. 11(3B), 627-635. Item Response Theory (IRT): A Modern Statistical Theory for Solving Measurement Problem in 21st Century
More informationCHAPTER 3 DATA ANALYSIS: DESCRIBING DATA
Data Analysis: Describing Data CHAPTER 3 DATA ANALYSIS: DESCRIBING DATA In the analysis process, the researcher tries to evaluate the data collected both from written documents and from other sources such
More informationOn indirect measurement of health based on survey data. Responses to health related questions (items) Y 1,..,Y k A unidimensional latent health state
On indirect measurement of health based on survey data Responses to health related questions (items) Y 1,..,Y k A unidimensional latent health state A scaling model: P(Y 1,..,Y k ;α, ) α = item difficulties
More informationUNIT 4 ALGEBRA II TEMPLATE CREATED BY REGION 1 ESA UNIT 4
UNIT 4 ALGEBRA II TEMPLATE CREATED BY REGION 1 ESA UNIT 4 Algebra II Unit 4 Overview: Inferences and Conclusions from Data In this unit, students see how the visual displays and summary statistics they
More informationRunning head: MAJOR FIELD TEST SCORES AND PSYCHOLOGY COURSE WORK
Major Field Test 1 Running head: MAJOR FIELD TEST SCORES AND PSYCHOLOGY COURSE WORK Major Field Test Scores Related to Quantity, Quality and Recency of Psychology Majors Course Work at Two Universities
More informationDifferential Item Functioning from a Compensatory-Noncompensatory Perspective
Differential Item Functioning from a Compensatory-Noncompensatory Perspective Terry Ackerman, Bruce McCollaum, Gilbert Ngerano University of North Carolina at Greensboro Motivation for my Presentation
More information