MEASURING MIDDLE GRADES STUDENTS' UNDERSTANDING OF FORCE AND MOTION CONCEPTS: INSIGHTS INTO THE STRUCTURE OF STUDENT IDEAS
The purpose of this study was to create an instrument that measures middle grades students' understanding of concepts relating to Newton's First Law of Motion. Design criteria included: 1) an exclusive focus on force and motion ideas (no mathematics or other science concepts); 2) distractors based upon the research on student thinking; 3) minimal burden on the test taker and the researcher; and 4) the measurement property of providing reliable information about students across a broad ability spectrum. After defining the content domain, multiple-choice items were drafted and revised using feedback from cognitive interviews with students. A pool of items was piloted, revised, field tested with approximately 5,000 students, and reviewed by a panel of physicists for content accuracy and domain coverage. Dimensionality analyses revealed that items clustered in two sets: one representing general knowledge of Newton's First Law, and one tapping a particularly prominent misconception, the idea that a constant non-zero net force results in constant speed. Item response theory was used to select items for a 25-item scale. The study generated a valid, rigorously constructed, minimally burdensome instrument that researchers can use to study the effect of instructional strategies. Further, it added to the knowledge base on how student thinking about force and motion is organized.

P. Sean Smith, Horizon Research, Inc.
Eric R. Banilower, Horizon Research, Inc.

Introduction

The purpose of this study was to create a measure of middle grades students' understanding of concepts related to Newton's First Law of Motion. Specifically, we set out to create a tool that could be used by researchers to study the effect of different instructional strategies on student understanding. Similar tools exist.
For instance, the Force Concept Inventory (FCI) (Hestenes, Wells, & Swackhamer, 1992) is well respected and widely used for studying force and motion learning at the undergraduate level. The FCI is often used at the high school level as well, but it is not appropriate for the middle grades because it covers topics beyond what national standards indicate middle school students should know. Our intent was to create a measure of the targeted concepts in as pure a form as possible; i.e., the instrument would not draw on other understandings such as mathematical or graphing skills. Finally, to enable wide-scale research, we set out to create an instrument that would be minimally burdensome, both for the test taker and the researcher. Thus, we opted for a multiple-choice format. Although this format has some limits, multiple-choice items can probe conceptual understanding, and the format is the one best suited to our purposes.

Smith & Banilower Page 1 of 13 Horizon Research. Inc.
Theoretical Underpinnings

This study is firmly rooted in the literature on student thinking (summarized in: Driver, Squires, Rushworth, & Wood-Robinson, 2002; Driver, Guesne, & Tiberghien, 2002), particularly the literature on the developmental progression of student ideas (correct and incorrect) in force and motion. The process we used to develop our instrument draws on and adds to this literature. The work is also situated in item response theory (IRT) (Swaminathan & Rogers, 1991), which we drew on to develop a scale that provides reliable information about student understanding across a wide range of ability levels.

Instrument Development

The development effort described in this paper is part of a much larger and well-funded project¹, which afforded the luxury of an elaborate and thorough development process. The process began with identifying the content domain: the idea that an unbalanced force acting on an object changes its speed (American Association for the Advancement of Science/Project 2061, 1993). For assessment purposes, we restricted the domain to motion in one dimension and defined the performance space by unpacking this idea into six sub-ideas. The content domain was reviewed by a panel of physicists and physics educators, which prompted minor revisions. The final version of the content domain is shown in Table 1.

¹ The project is titled ATLAST (Assessing Teacher Learning About Science Teaching). ATLAST is funded by the National Science Foundation under grant number EHR. The views expressed in this paper are those of the authors and do not necessarily represent the opinions of the National Science Foundation.
Table 1
Force and Motion Content Domain

Targeted Idea: An unbalanced force acting on an object changes its speed.

Sub-ideas:
A. A force is a push or pull interaction between two objects, and has both magnitude and direction.
B. All of the forces acting on an object combine through vector addition into a net force; they either balance each other out (net force is zero), or act like an unbalanced force (net force is not zero).
   1. If the sum of forces exerted on an object in one direction is the same strength as the sum of forces exerted on the object in the opposite direction, then the forces on the object are balanced (i.e., the net force is zero).
   2. If the sum of forces exerted on an object in one direction is greater than the sum of forces exerted on the object in the opposite direction, then the forces on the object are unbalanced (i.e., the net force is not zero).
C. If an object is moving faster and faster, then there is a net force acting on the object in the same direction as the motion.
D. If an object is moving slower and slower, then there is a net force acting on the object in the direction opposite to the object's motion.
E. If an object has constant speed in a straight line (or zero speed), then there is no net force acting on the object. This can occur either when:
   1. the forces on the object are balanced; or
   2. there are no forces exerted on the object.
F. The force of friction acts to oppose the relative motion of two objects in contact. Friction acts on both objects along the surfaces in contact with each other. The magnitude of friction depends upon the smoothness/roughness of the surfaces and how hard the objects are pushed together.

Force and motion is one of the few science topics that enjoys a robust literature on student thinking. After an extensive search of this literature, we associated known misconceptions² with the relevant sub-idea(s) in preparation for writing distractors.
We then drafted multiple-choice items and began a months-long iterative process of conducting cognitive interviews with students (well over 50) and revising items. A pool of 35 items was piloted with approximately 2,000 middle grades students in spring 2004; at the same time, each item was critiqued through Project 2061's extensive item analysis procedure (DeBoer, 2005). Results of the piloting and analysis by Project 2061 were used to revise the item pool, which necessitated more student interviews. Ultimately, we field tested a pool of 48 items in fall 2004 with approximately 5,000 middle grades students. The items were split between two forms, with 16 items common to both forms.

² We use the term misconception to describe anything that precedes full understanding of a specific idea. Some misconceptions are prior conceptions and may represent important steps in a learning progression.
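For the two field-test forms described above, the item counts imply a simple check on form length. The paper does not state how the non-common items were divided, so the even split below is an assumption made only for illustration:

```latex
\underbrace{48 - 16}_{\text{items unique to one form}} = 32,
\qquad
\underbrace{\tfrac{32}{2}}_{\text{unique items per form}} + \underbrace{16}_{\text{common items}} = 32 \ \text{items per form}.
```

Under that assumption, half of each form would consist of the common items that link the two forms.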
Lessons Learned in the Development Process

Assessment items with distractors based on misconceptions

Writing multiple-choice items with misconception-based distractors is a very appealing approach. Multiple-choice items are often criticized as focusing on factual recall rather than conceptual understanding. However, items that use misconceptions as distractors not only probe deeper understanding, but can also serve a diagnostic function in planning instruction. Misconceptions often represent important, perhaps even necessary, steps on a trajectory to full understanding. Items that provide evidence of where a student is on that trajectory can be very useful for diagnosing thinking and guiding instruction.

Given the obvious value of misconception-based distractors, one may wonder why the approach is not more common. Interestingly, such items present a challenge to development efforts using IRT. To understand this challenge, a bit of background on IRT is necessary. IRT affords many advantages to the test maker. Chief among these is the power to design a test that provides reliable information at the ability level of interest, in our case over a broad range of ability. As with any theory, IRT rests on a number of assumptions. One of the most important is that the probability of a correct response increases as respondent ability increases (i.e., that the item is monotonic). For monotonic items, graphing the probability of answering an item correctly by ability level results in an S-shaped curve like the one illustrated in Figure 1. In IRT, this graph is known as an item characteristic curve, or ICC, and is central to the theory: "The item characteristic curve is the basic building block of item response theory; all the other constructs of the theory depend upon this curve" (Baker, 2001, p. 7). No item will match the shape in Figure 1 exactly, but the general trend of increasing probability with increasing ability must hold.
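The S-shaped curve and the monotonicity assumption described above can be sketched numerically. The snippet below is a minimal illustration, assuming a two-parameter logistic form for the curve; the function name `icc_2pl` and the parameter values are hypothetical and are not taken from Figure 1.

```python
import math


def icc_2pl(theta, a, b):
    """Probability of a correct response at ability theta.

    Assumes a two-parameter logistic curve with discrimination a
    and difficulty b (an illustrative choice, not the paper's data).
    """
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))


# Hypothetical item parameters, chosen only for illustration.
a, b = 1.2, -0.5

abilities = [-3, -2, -1, 0, 1, 2, 3]
probs = [icc_2pl(theta, a, b) for theta in abilities]

# A monotonic item: each step up in ability raises P(correct).
is_monotonic = all(lo < hi for lo, hi in zip(probs, probs[1:]))

for theta, p in zip(abilities, probs):
    print(f"ability {theta:+d}: P(correct) = {p:.3f}")
print("monotonic:", is_monotonic)
```

An item whose empirical proportions correct dipped at some ability level would fail the same pairwise check, which is one simple way to screen for the non-monotonic behavior a logistic curve cannot represent.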
[Figure 1. Item characteristic curve for item ES018V03: probability of a correct response plotted against ability.]

Sadler (1998) and others have conducted empirical studies suggesting that items with misconception-based distractors present a challenge to IRT. Specifically, such items may not be monotonic; that is, at some point in the ability spectrum, a respondent with higher ability is less likely than one with lower ability to choose the correct response. A possible explanation for this finding is as follows: a respondent with no understanding will likely guess and have a 25 percent chance of answering correctly, assuming four choices. A respondent with some understanding (ability) may be drawn to one of the misconception-based distractors, making the probability of choosing the correct answer less than 25 percent. Some of our items exhibit a small degree of non-monotonicity. However, most met the assumption of monotonicity, and we decided to proceed with an IRT-based model.

Insights into student thinking from item analysis

Our intent at the beginning of the development process was to generate a single scale that would measure students' understanding of the idea that an unbalanced force acting on an object changes
its speed. Analysis of the field test data included examining the dimensionality of the items via factor and cluster analyses. These analyses indicated that our items fell into two groups, each measuring different aspects of student thinking about Newton's First Law. This grouping provides some insight into how students' thinking about force and motion is organized. The first set includes items that address each sub-idea in Table 1, and can be thought of as general knowledge of the targeted idea. The second set includes items from only a few sub-ideas, primarily sub-ideas C (an object moves faster and faster as a result of a non-zero net force in the direction of motion), D (an object moves slower and slower as a result of a non-zero net force in the direction opposite its motion), and E (constant speed is a result of a zero net force). All of the items in the second set relate to the misconception that a constant non-zero net force applied to an object results in constant speed (or vice versa, that an object moving with constant speed must be acted on by a constant non-zero net force).

Figure 3 shows an item with a choice based on this misconception. The most commonly selected choice was D (40 percent), the correct answer. However, 37 percent of students chose B, indicating they think a non-zero net force is needed to keep the bicycle moving at constant speed. We saw very similar results on the item in Figure 4; 40 percent chose B (the correct answer), and 36 percent chose C.

FM009V04
A boy is pedaling his bike on level ground so that he is moving at a constant speed. Which of the following is true about the forces on the bike?
A. There are no forces being applied to the bike.
B. The total force in the direction of the bike's motion is greater than the total force in the opposite direction.
C. The total force in the direction of the bike's motion is getting larger and larger.
D. The total force in the direction of the bike's motion is equal in strength to the total force in the opposite direction.
Figure 3

FM003V05
The total force acting on an object in one direction is greater than the total force acting on the object in the opposite direction. What is true about the object?
A. It is not moving.
B. It is changing speed.
C. It is moving at a constant speed.
D. It is moving back and forth.

Figure 4

This second set of items, as a group, was much more difficult than the first set, indicating that the misconception is very prevalent among middle school students and may dominate their thinking about force and motion. The power and pervasiveness of this misconception are not surprising. All motion on Earth is affected by friction, and unless students are aware of friction's effects, they can hardly help but form the idea that a constant force is needed to make an object move with constant speed. This pattern of student thinking is well documented in the literature. Gunstone and Watts (2002) provide a summary of studies that consistently identified the misconception among students.

Although the items related to this idea seemed to form a distinct subset, the inter-item reliability was quite low, below 0.4. To understand why, a bit more background on IRT is necessary. Figure 1 depicts the item characteristic curve (ICC). Figure 1 illustrates two other key ideas from IRT as well. The first is the difficulty parameter (a.k.a. the b parameter). In classical test theory, item difficulty typically represents the probability of students answering an item correctly. In IRT, the difficulty parameter describes the ability level at which a respondent has a 50 percent chance of answering correctly (Swaminathan and Rogers, 1991). Difficulty parameters less than zero indicate items are relatively easy; difficulty parameters greater than zero indicate items are relatively difficult. In Figure 1, the item difficulty is below zero, indicating that the item is relatively easy.
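The definition of the difficulty parameter can be checked directly. The lines below assume a two-parameter logistic curve, with θ denoting ability and a and b the discrimination and difficulty parameters; the logistic form is an assumption made here for illustration.

```latex
P(\theta) = \frac{1}{1 + e^{-a(\theta - b)}},
\qquad
P(b) = \frac{1}{1 + e^{-a(b - b)}} = \frac{1}{1 + e^{0}} = \frac{1}{2}.
```

Whatever the value of a, the curve crosses 50 percent at θ = b. An item with b below zero is therefore answered correctly half the time by respondents whose ability is below zero, which is the sense in which it is relatively easy.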
The second key idea is that of item discrimination (a.k.a. the a parameter). Item discrimination describes how well an item can distinguish among respondents of different ability levels (Swaminathan and Rogers, 1991). Items for which there is a large change in the probability of responding correctly over a small change in ability are said to be highly discriminating. The more discriminating an item is, the more information it provides about a respondent; in other words, the more reliable the estimate of ability for that respondent is. With regard to ICCs, items that are more discriminating have steeper slopes. The effect of the discrimination parameter on item information is illustrated in Figure 2, which plots the item information for two items with roughly equal difficulty parameters but different discrimination parameters.

[Figure 2. Item information plotted against ability for two items, one with discrimination 1.56 and one with discrimination 0.95.]

Narode (1987, cited in Sadler, 1998) found that mathematics items with misconception-based distractors were both more difficult (higher b parameter) and less discriminating (lower a parameter) than more traditional multiple-choice items. A scale constructed of items with low discriminating power cannot be very reliable, as the two are directly linked. The discrimination parameters of the items in the second group are shown in Table 4 below. Generally, a discrimination parameter below 1 (using a logistic metric) is less than desirable. Clearly, low discrimination presents a measurement dilemma: the field is very interested in assessing student thinking in areas that are laden with misconceptions, but including the misconceptions as distractors may make a reliable scale difficult to construct.
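The link between discrimination and information described above can be sketched as follows. This is a minimal illustration, assuming a two-parameter logistic model on the logistic metric, for which item information is a² · P · (1 − P). The two discrimination values are the ones given for Figure 2; the shared difficulty of 0 is a hypothetical value standing in for "roughly equal" difficulties, and the function names are illustrative.

```python
import math


def prob_2pl(theta, a, b):
    """P(correct) under an assumed two-parameter logistic model."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))


def item_information(theta, a, b):
    """Item information for the 2PL model: a^2 * P * (1 - P)."""
    p = prob_2pl(theta, a, b)
    return a * a * p * (1.0 - p)


# Hypothetical common difficulty; discriminations as in Figure 2.
b = 0.0
a_high, a_low = 1.56, 0.95

# Information peaks at theta == b, where P = 0.5, giving a^2 / 4.
peak_high = item_information(b, a_high, b)
peak_low = item_information(b, a_low, b)

print(f"peak information, a = {a_high}: {peak_high:.4f}")
print(f"peak information, a = {a_low}: {peak_low:.4f}")
print(f"ratio: {peak_high / peak_low:.2f}")
```

Because peak information grows with the square of the discrimination parameter, a modest drop in discrimination costs a disproportionate amount of information near the item's difficulty.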
[Table 3. Columns: Item, Discrimination, Difficulty.]

Characteristics of the final scale

Given that we could not reliably measure what appeared to be a separate factor, we opted to focus our final scale on overall understanding of the idea that an unbalanced force acting on an object changes the object's speed. Using BILOG-MG 3.0 (Zimowski, Muraki, Mislevy, & Bock, 2003), we estimated the discrimination and difficulty parameters (i.e., a two-parameter logistic model)³ for all items that loaded on the first, more general factor.

IRT allows the construction of scales with specific properties, a distinct advantage over classical test theory. The ultimate goal of a scale created using IRT is to generate ability estimates for test takers. In IRT, ability is plotted on a scale from negative to positive infinity in terms of standard deviations, with a mean of 0. However, practically all test takers fall within the range -3 to +3 on the ability scale. Our goal was to create a scale that would allow us to accurately estimate ability over a wide range. Using the difficulty and discrimination parameters, we selected 25 items that covered the content domain. Table 3 shows the number of items addressing each sub-idea. The items total to more than 25 because some items address more than one sub-idea.

³ A three-parameter logistic model did not fit the data any better than the two-parameter model. In the interest of simplicity, we opted for the two-parameter model.
Table 3
Number of Items Addressing Each Sub-idea

Columns: Sub-idea, Number of items

Sub-ideas:
- A force is a push or pull interaction between two objects, and has both magnitude and direction.
- All of the forces acting on an object combine through vector addition into a net force; they either balance each other out (net force is zero), or act like an unbalanced force (net force is not zero).
- If an object is moving faster and faster, then there is a net force acting on the object in the same direction as the motion.
- If an object is moving slower and slower, then there is a net force acting on the object in the direction opposite to the object's motion.
- If an object has constant speed in a straight line (or zero speed), then there is no net force acting on the object. This can occur either when the forces on the object are balanced or when there are no forces exerted on the object.
- The force of friction acts to oppose the relative motion of two objects in contact. Friction acts on both objects along the surfaces in contact with each other. The magnitude of friction depends upon the smoothness/roughness of the surfaces and how hard the objects are pushed together.

Estimating ability accurately requires an adequate amount of information. The amount of information a test provides is described by the test information curve. The curve is constructed simply by summing the information contributed by each item. Again, we were interested in constructing a test that functions well over a broad ability range, which stands in contrast to other purposes, for example a credentialing exam. In the latter scenario, the test constructor's interest is in maximizing the amount of information at the ability determined to be necessary for credentialing. Figure 5 displays the test information curve for our 25-item scale. Clearly the scale provides a maximum amount of information near the middle of the ability scale. Consistent with our goals, the scale provides information for making sufficiently reliable ability estimates between about -2 and
[Figure 5. Test information curve for the 25-item scale: information plotted against ability.]

Conclusions

We set out to develop an instrument that measures student understanding of ideas related to Newton's First Law; specifically, the idea that an unbalanced force acting on an object changes the object's speed. The instrument represents an important contribution to the field in two regards. First, the development process itself and the resulting instrument provide insight into student thinking about the targeted concepts. The misconception that a non-zero net force results in constant speed appears to be quite prevalent among middle grades students, so prevalent that it may dominate their thinking about Newton's First Law. More generally, although the instrument was not developed to be a diagnostic measure, it does shed light on student thinking, as most distractors were written from documented misconceptions about force and motion. Second, the work provides researchers with a valid, rigorously constructed, minimally burdensome tool to use in studying teaching and learning at the middle grades level. In particular, the tool allows researchers to study the effect of different instructional approaches on students' understanding of the targeted concepts.

The development process revealed a particularly challenging measurement dilemma. Items that used strongly held misconceptions as distractors tended to be poorly discriminating. That is,
they did not distinguish well between students who understood the target idea and those who did not. In an IRT measurement framework, such items do not function well in estimating student ability. Our work suggests that while using misconception-based distractors is very appealing from a diagnostic perspective, the approach can, especially when strongly held misconceptions are employed, make scale construction quite challenging. It is clear that more work is needed in this area.

References

American Association for the Advancement of Science/Project 2061 (1993). Benchmarks for Science Literacy. New York: Oxford University Press.

Baker, F.B. (2001). The basics of item response theory. ERIC Clearinghouse on Assessment and Evaluation, University of Maryland, College Park, MD.

DeBoer, G.E. (2005). Aligning student assessment to state and national content standards. Paper presented at the NSTA National Convention, Dallas, Texas.

Driver, R., Squires, A., Rushworth, P., and Wood-Robinson, V. (2002). Making Sense of Secondary Science: Research into Children's Ideas. London and New York, NY: RoutledgeFalmer.

Driver, R., Guesne, E., and Tiberghien, A. (2002). Children's Ideas in Science. Philadelphia, PA: Open University Press.

Gunstone, R. and Watts, M. (2002). Force and motion. In Driver, R., Guesne, E., and Tiberghien, A. (Eds.), Children's Ideas in Science. Philadelphia, PA: Open University Press.

Hestenes, D., Wells, M., and Swackhamer, G. (1992). Force Concept Inventory. The Physics Teacher, 30(3).

Narode, R. (1987). Standardized testing for alternative conceptions in basic mathematics. In J.D. Novak (Ed.), 2nd International Seminar on Misconception and Educational Strategies in Science and Mathematics (Vol. 1). Ithaca, NY: Cornell University Press.
Sadler, P.M. (1998). Psychometric models of student conceptions in science: Reconciling qualitative studies and distractor-driven assessment instruments. Journal of Research in Science Teaching, 35(3).

Swaminathan, H. and Rogers, H.J. (1991). Fundamentals of Item Response Theory. Thousand Oaks, CA: Sage Publications.

Zimowski, M., Muraki, E., Mislevy, R., and Bock, R. (2003). BILOG-MG 3. St. Paul, MN: Assessment Systems Corporation.
More informationItem Analysis Explanation
Item Analysis Explanation The item difficulty is the percentage of candidates who answered the question correctly. The recommended range for item difficulty set forth by CASTLE Worldwide, Inc., is between
More informationDo First Year College Female and Male Students Hold Different Misconceptions about Force and Motion?
IOSR Journal of Applied Physics (IOSR-JAP) e-issn: 2278-4861.Volume 9, Issue 2 Ver. II (Mar. Apr. 2017), PP 14-18 www.iosrjournals.org Do First Year College Female and Male Students Hold Different Misconceptions
More informationReliability, validity, and all that jazz
Reliability, validity, and all that jazz Dylan Wiliam King s College London Introduction No measuring instrument is perfect. The most obvious problems relate to reliability. If we use a thermometer to
More informationVisualizing Higher Level Mathematical Concepts Using Computer Graphics
Visualizing Higher Level Mathematical Concepts Using Computer Graphics David Tall Mathematics Education Research Centre Warwick University U K Geoff Sheath Polytechnic of the South Bank London U K ABSTRACT
More informationDiscrimination Weighting on a Multiple Choice Exam
Proceedings of the Iowa Academy of Science Volume 75 Annual Issue Article 44 1968 Discrimination Weighting on a Multiple Choice Exam Timothy J. Gannon Loras College Thomas Sannito Loras College Copyright
More informationDescription of components in tailored testing
Behavior Research Methods & Instrumentation 1977. Vol. 9 (2).153-157 Description of components in tailored testing WAYNE M. PATIENCE University ofmissouri, Columbia, Missouri 65201 The major purpose of
More informationThe Effect of Guessing on Item Reliability
The Effect of Guessing on Item Reliability under Answer-Until-Correct Scoring Michael Kane National League for Nursing, Inc. James Moloney State University of New York at Brockport The answer-until-correct
More informationThe Classification Accuracy of Measurement Decision Theory. Lawrence Rudner University of Maryland
Paper presented at the annual meeting of the National Council on Measurement in Education, Chicago, April 23-25, 2003 The Classification Accuracy of Measurement Decision Theory Lawrence Rudner University
More informationDuring the past century, mathematics
An Evaluation of Mathematics Competitions Using Item Response Theory Jim Gleason During the past century, mathematics competitions have become part of the landscape in mathematics education. The first
More informationChapter 1 Introduction. Measurement Theory. broadest sense and not, as it is sometimes used, as a proxy for deterministic models.
Ostini & Nering - Chapter 1 - Page 1 POLYTOMOUS ITEM RESPONSE THEORY MODELS Chapter 1 Introduction Measurement Theory Mathematical models have been found to be very useful tools in the process of human
More informationParallel Forms for Diagnostic Purpose
Paper presented at AERA, 2010 Parallel Forms for Diagnostic Purpose Fang Chen Xinrui Wang UNCG, USA May, 2010 INTRODUCTION With the advancement of validity discussions, the measurement field is pushing
More informationContents. What is item analysis in general? Psy 427 Cal State Northridge Andrew Ainsworth, PhD
Psy 427 Cal State Northridge Andrew Ainsworth, PhD Contents Item Analysis in General Classical Test Theory Item Response Theory Basics Item Response Functions Item Information Functions Invariance IRT
More informationReviewing the TIMSS Advanced 2015 Achievement Item Statistics
CHAPTER 11 Reviewing the TIMSS Advanced 2015 Achievement Item Statistics Pierre Foy Michael O. Martin Ina V.S. Mullis Liqun Yin Kerry Cotter Jenny Liu The TIMSS & PIRLS conducted a review of a range of
More informationAppendix B Statistical Methods
Appendix B Statistical Methods Figure B. Graphing data. (a) The raw data are tallied into a frequency distribution. (b) The same data are portrayed in a bar graph called a histogram. (c) A frequency polygon
More informationSurvey Methods in Relationship Research
Purdue University Purdue e-pubs Department of Psychological Sciences Faculty Publications Department of Psychological Sciences 1-1-2009 Survey Methods in Relationship Research Christopher Agnew Purdue
More informationCHAPTER VI RESEARCH METHODOLOGY
CHAPTER VI RESEARCH METHODOLOGY 6.1 Research Design Research is an organized, systematic, data based, critical, objective, scientific inquiry or investigation into a specific problem, undertaken with the
More informationTEACHING BAYESIAN METHODS FOR EXPERIMENTAL DATA ANALYSIS
TEACHING BAYESIAN METHODS FOR EXPERIMENTAL DATA ANALYSIS Bruno Lecoutre, C.N.R.S. et Université de Rouen Mathématiques, France The innumerable articles denouncing the deficiencies of significance testing
More informationTHE USE OF CRONBACH ALPHA RELIABILITY ESTIMATE IN RESEARCH AMONG STUDENTS IN PUBLIC UNIVERSITIES IN GHANA.
Africa Journal of Teacher Education ISSN 1916-7822. A Journal of Spread Corporation Vol. 6 No. 1 2017 Pages 56-64 THE USE OF CRONBACH ALPHA RELIABILITY ESTIMATE IN RESEARCH AMONG STUDENTS IN PUBLIC UNIVERSITIES
More informationForces and motion 1: Identifying forces
Forces and motion 1: Identifying forces University of York 2003 5 Identifying forces All the questions in this set focus on the ability to identify the forces acting in everyday situations. Although there
More informationConnexion of Item Response Theory to Decision Making in Chess. Presented by Tamal Biswas Research Advised by Dr. Kenneth Regan
Connexion of Item Response Theory to Decision Making in Chess Presented by Tamal Biswas Research Advised by Dr. Kenneth Regan Acknowledgement A few Slides have been taken from the following presentation
More informationFORUM: QUALITATIVE SOCIAL RESEARCH SOZIALFORSCHUNG
FORUM: QUALITATIVE SOCIAL RESEARCH SOZIALFORSCHUNG Volume 5, No. 1, Art. 27 January 2004 Review: Mechthild Kiegelmann Melanie Mauthner, Maxine Birch, Julie Jessop & Tina Miller (Eds.) (2002). Ethics in
More informationDetecting Suspect Examinees: An Application of Differential Person Functioning Analysis. Russell W. Smith Susan L. Davis-Becker
Detecting Suspect Examinees: An Application of Differential Person Functioning Analysis Russell W. Smith Susan L. Davis-Becker Alpine Testing Solutions Paper presented at the annual conference of the National
More informationMODULE 3 APPRAISING EVIDENCE. Evidence-Informed Policy Making Training
MODULE 3 APPRAISING EVIDENCE Evidence-Informed Policy Making Training RECAP OF PREVIOUS DAY OR SESSION MODULE 3 OBJECTIVES At the end of this module participants will: Identify characteristics of basic
More informationA framework for predicting item difficulty in reading tests
Australian Council for Educational Research ACEReSearch OECD Programme for International Student Assessment (PISA) National and International Surveys 4-2012 A framework for predicting item difficulty in
More informationComparability Study of Online and Paper and Pencil Tests Using Modified Internally and Externally Matched Criteria
Comparability Study of Online and Paper and Pencil Tests Using Modified Internally and Externally Matched Criteria Thakur Karkee Measurement Incorporated Dong-In Kim CTB/McGraw-Hill Kevin Fatica CTB/McGraw-Hill
More informationPsychological testing
Psychological testing Lecture 12 Mikołaj Winiewski, PhD Test Construction Strategies Content validation Empirical Criterion Factor Analysis Mixed approach (all of the above) Content Validation Defining
More informationAnalyzing the FCI based on a Force and Motion Learning Progression
Analyzing the FCI based on a Force and Motion Learning Progression Irene Neumann a*, Gavin W. Fulmer b and Ling L. Liang c a Department of Physics Education, Ruhr-Universität Bochum, Bochum, Germany b
More informationUsing Analytical and Psychometric Tools in Medium- and High-Stakes Environments
Using Analytical and Psychometric Tools in Medium- and High-Stakes Environments Greg Pope, Analytics and Psychometrics Manager 2008 Users Conference San Antonio Introduction and purpose of this session
More informationfor Scaling Ability and Diagnosing Misconceptions Laine P. Bradshaw James Madison University Jonathan Templin University of Georgia Author Note
Combing Item Response Theory and Diagnostic Classification Models: A Psychometric Model for Scaling Ability and Diagnosing Misconceptions Laine P. Bradshaw James Madison University Jonathan Templin University
More informationRisk Aversion in Games of Chance
Risk Aversion in Games of Chance Imagine the following scenario: Someone asks you to play a game and you are given $5,000 to begin. A ball is drawn from a bin containing 39 balls each numbered 1-39 and
More informationBy Hui Bian Office for Faculty Excellence
By Hui Bian Office for Faculty Excellence 1 Email: bianh@ecu.edu Phone: 328-5428 Location: 1001 Joyner Library, room 1006 Office hours: 8:00am-5:00pm, Monday-Friday 2 Educational tests and regular surveys
More informationSTAT 110: Chapter 1 Introduction to Thinking Statistically
In the Evaluating a Claim of Hearing Loss activity, we encountered several basic examples that required us to think statistically in order to investigate a question of interest. Before we move on to slightly
More informationSAT School Day April Sample Release Questions. (& answers!) your parent s SAT!.. - Not
1 - Not your parent s SAT!.. Sample Release Questions SAT School Day April 2017 (& answers!) instance, a study that analyzed a set of published experiments all sharing Line Knowing your own reputation
More informationINVESTIGATING FIT WITH THE RASCH MODEL. Benjamin Wright and Ronald Mead (1979?) Most disturbances in the measurement process can be considered a form
INVESTIGATING FIT WITH THE RASCH MODEL Benjamin Wright and Ronald Mead (1979?) Most disturbances in the measurement process can be considered a form of multidimensionality. The settings in which measurement
More informationRegression Discontinuity Analysis
Regression Discontinuity Analysis A researcher wants to determine whether tutoring underachieving middle school students improves their math grades. Another wonders whether providing financial aid to low-income
More informationUnit 1 Exploring and Understanding Data
Unit 1 Exploring and Understanding Data Area Principle Bar Chart Boxplot Conditional Distribution Dotplot Empirical Rule Five Number Summary Frequency Distribution Frequency Polygon Histogram Interquartile
More informationCHAPTER - 6 STATISTICAL ANALYSIS. This chapter discusses inferential statistics, which use sample data to
CHAPTER - 6 STATISTICAL ANALYSIS 6.1 Introduction This chapter discusses inferential statistics, which use sample data to make decisions or inferences about population. Populations are group of interest
More informationUtilizing the NIH Patient-Reported Outcomes Measurement Information System
www.nihpromis.org/ Utilizing the NIH Patient-Reported Outcomes Measurement Information System Thelma Mielenz, PhD Assistant Professor, Department of Epidemiology Columbia University, Mailman School of
More informationQualitative and quantitative research as complements (Black, 1999)
Qualitative and quantitative research as complements (Black, 1999) Use of design depends on research problem, and the two RDs complement each other Single or a few selected groups (case study): Why? How?
More informationChapter 1 Introduction to Educational Research
Chapter 1 Introduction to Educational Research The purpose of Chapter One is to provide an overview of educational research and introduce you to some important terms and concepts. My discussion in this
More information3 CONCEPTUAL FOUNDATIONS OF STATISTICS
3 CONCEPTUAL FOUNDATIONS OF STATISTICS In this chapter, we examine the conceptual foundations of statistics. The goal is to give you an appreciation and conceptual understanding of some basic statistical
More informationHow Does Analysis of Competing Hypotheses (ACH) Improve Intelligence Analysis?
How Does Analysis of Competing Hypotheses (ACH) Improve Intelligence Analysis? Richards J. Heuer, Jr. Version 1.2, October 16, 2005 This document is from a collection of works by Richards J. Heuer, Jr.
More informationSOTM LAB: P5R Forces and Motion I. TEACHER NOTES & GUIDELINES TITLE OF LAB. Forces and Motion (with a 1.2m track and smart pulley) DEVELOPERS OF LAB
SOTM LAB: P5R Forces and Motion I. TEACHER NOTES & GUIDELINES TITLE OF LAB Forces and Motion (with a 1.2m track and smart pulley) DEVELOPERS OF LAB Kirk Reinhardt, JD738 Ted Brown, JD806 OVERVIEW OF LAB
More informationWork, Employment, and Industrial Relations Theory Spring 2008
MIT OpenCourseWare http://ocw.mit.edu 15.676 Work, Employment, and Industrial Relations Theory Spring 2008 For information about citing these materials or our Terms of Use, visit: http://ocw.mit.edu/terms.
More informationAnalysis test of understanding of vectors with the three-parameter logistic model of item response theory and item response curves technique
PHYSICAL REVIEW PHYSICS EDUCATION RESEARCH 12, 020135 (2016) Analysis test of understanding of vectors with the three-parameter logistic model of item response theory and item response curves technique
More informationField-normalized citation impact indicators and the choice of an appropriate counting method
Field-normalized citation impact indicators and the choice of an appropriate counting method Ludo Waltman and Nees Jan van Eck Centre for Science and Technology Studies, Leiden University, The Netherlands
More informationGroup Assignment #1: Concept Explication. For each concept, ask and answer the questions before your literature search.
Group Assignment #1: Concept Explication 1. Preliminary identification of the concept. Identify and name each concept your group is interested in examining. Questions to asked and answered: Is each concept
More informationcaspa Comparison and Analysis of Special Pupil Attainment
caspa Comparison and Analysis of Special Pupil Attainment Analysis and bench-marking in CASPA This document describes of the analysis and bench-marking features in CASPA and an explanation of the analysis
More informationDeveloping and Testing Survey Items
Developing and Testing Survey Items William Riley, Ph.D. Chief, Science of Research and Technology Branch National Cancer Institute With Thanks to Gordon Willis Contributions to Self-Report Errors Self-report
More informationGrade 3 Science, Quarter 3, Unit 3.1. Force and Motion. Overview
Grade 3 Science, Quarter 3, Unit 3.1 Force and Motion Overview Number of instructional days: 7 (1 day = 45 minutes) Content to be learned Use prior knowledge and investigations in order to predict whether
More informationEVALUATING AND IMPROVING MULTIPLE CHOICE QUESTIONS
DePaul University INTRODUCTION TO ITEM ANALYSIS: EVALUATING AND IMPROVING MULTIPLE CHOICE QUESTIONS Ivan Hernandez, PhD OVERVIEW What is Item Analysis? Overview Benefits of Item Analysis Applications Main
More informationLikert Scaling: A how to do it guide As quoted from
Likert Scaling: A how to do it guide As quoted from www.drweedman.com/likert.doc Likert scaling is a process which relies heavily on computer processing of results and as a consequence is my favorite method
More informationThe Design Pattern: A Blueprint for a New Domain-Specific Assessment. U.S. Department of Education of NSF disclaimer.
The Design Pattern: A Blueprint for a New Domain-Specific Assessment Louise Yarnall, Ph.D. SRI International April 9, 2011 U.S. Department of Education of NSF disclaimer. Background Prototype Assessment
More informationTRENDS IN LEGAL ADVOCACY: INTERVIEWS WITH LEADING PROSECUTORS AND DEFENCE LAWYERS ACROSS THE GLOBE
TRENDS IN LEGAL ADVOCACY: INTERVIEWS WITH LEADING PROSECUTORS AND DEFENCE LAWYERS ACROSS THE GLOBE Instructions to Interviewers Each interview with a prosecutor or defence lawyer will comprise a book chapter
More information2002 AP BIOLOGY FREE-RESPONSE QUESTIONS
2002 AP BIOLOGY FREE-RESPONSE QUESTIONS 2. The activities of organisms change at regular time intervals. These changes are called biological rhythms. The graph depicts the activity cycle over a 48-hour
More informationLikelihood Ratio Based Computerized Classification Testing. Nathan A. Thompson. Assessment Systems Corporation & University of Cincinnati.
Likelihood Ratio Based Computerized Classification Testing Nathan A. Thompson Assessment Systems Corporation & University of Cincinnati Shungwon Ro Kenexa Abstract An efficient method for making decisions
More information1st Proofs Not for Distribution.
3 A FRAMEWORK FOR INCORPORATING INTERVENTION FIDELITY IN EDUCATIONAL EVALUATION STUDIES William M. Murrah, Jeff Kosovich, and Chris Hulleman The randomized controlled trial (RCT) is considered by many
More informationAnalysis of OLLI Membership Survey 2016 Q1. I have been an OLLI member for: Less than 5 years 42.5% 5-15 years 48.0% years 9.
Analysis of OLLI Membership Survey 2016 The fall survey of OLLI members was written by a committee of OLLI members, including board members, with an eye toward having results in time for the annual Town
More informationCrossing boundaries between disciplines: A perspective on Basil Bernstein s legacy
Crossing boundaries between disciplines: A perspective on Basil Bernstein s legacy Ana M. Morais Department of Education & Centre for Educational Research School of Science University of Lisbon Revised
More informationRecollection Can Be Weak and Familiarity Can Be Strong
Journal of Experimental Psychology: Learning, Memory, and Cognition 2012, Vol. 38, No. 2, 325 339 2011 American Psychological Association 0278-7393/11/$12.00 DOI: 10.1037/a0025483 Recollection Can Be Weak
More informationReliability. Internal Reliability
32 Reliability T he reliability of assessments like the DECA-I/T is defined as, the consistency of scores obtained by the same person when reexamined with the same test on different occasions, or with
More informationComprehensive Statistical Analysis of a Mathematics Placement Test
Comprehensive Statistical Analysis of a Mathematics Placement Test Robert J. Hall Department of Educational Psychology Texas A&M University, USA (bobhall@tamu.edu) Eunju Jung Department of Educational
More informationRegistered Radiologist Assistant (R.R.A. ) 2016 Examination Statistics
Registered Radiologist Assistant (R.R.A. ) Examination Statistics INTRODUCTION This report summarizes the results of the Registered Radiologist Assistant (R.R.A. ) examinations developed and administered
More informationHARRISON ASSESSMENTS DEBRIEF GUIDE 1. OVERVIEW OF HARRISON ASSESSMENT
HARRISON ASSESSMENTS HARRISON ASSESSMENTS DEBRIEF GUIDE 1. OVERVIEW OF HARRISON ASSESSMENT Have you put aside an hour and do you have a hard copy of your report? Get a quick take on their initial reactions
More informationExamining the Psychometric Properties of The McQuaig Occupational Test
Examining the Psychometric Properties of The McQuaig Occupational Test Prepared for: The McQuaig Institute of Executive Development Ltd., Toronto, Canada Prepared by: Henryk Krajewski, Ph.D., Senior Consultant,
More informationWhat Do You Think? For You To Do GOALS. The men s high jump record is over 8 feet.
Activity 5 Run and Jump GOALS In this activity you will: Understand the definition of acceleration. Understand meters per second per second as the unit of acceleration. Use an accelerometer to detect acceleration.
More information