Basic concepts and principles of classical test theory
- Clara Harrison
- 5 years ago
1 Basic concepts and principles of classical test theory Jan-Eric Gustafsson
2 What is measurement?
Assignment of numbers to aspects of individuals according to some rule. The aspect that is measured must be defined in theoretical terms. Measurement should be understood in a broad sense: it also encompasses classification and assessment.
3 Why should one measure?
- Increases precision and comparability
- Gives access to a well-developed set of tools for:
  - developing measurement instruments
  - collecting data
  - summarizing and describing data
  - analyzing data
  - making inferences and generalizations
However: measurement suits certain purposes but not others.
4 Classical and modern theories of measurement
Theories of measurement to a large extent deal with how to put together components (items or subscales) into scales with known properties:
- Classical theories of measurement assume simple relations between the components and the dimension to be measured. Measures of test properties are typically group-dependent.
- Modern theories of measurement (IRT) are based on probabilistic models of the relations between item scores and the characteristics of persons and items. This allows ability to be estimated from different items for different persons, and allows estimation of item characteristics that are invariant over groups of persons.
5 An example: the IEA Reading Literacy study 1991
In 1991 some Swedish students in grade 3 participated in a study of reading literacy along with samples of students in about 30 other countries (RL 1991).
A large number of instruments:
- Reading literacy tests: 15 texts from three categories and 66 multiple-choice items.
- Student questionnaire: Questions about home background, attitudes towards reading, and reading habits.
- Parental questionnaire: Literacy activities in the home, economic and cultural resources, reading habits and attitudes, relations between home and school, education and occupation.
- Teacher questionnaire: Questions about the class, the teaching of reading, resources and the teacher.
- School questionnaire: Characteristics of the school, resources, school climate, and relations between home and school.
6 Starting points for the construction of the reading literacy test
Definition: Reading literacy is the ability to understand and use those forms of written language which are required in society and/or are of value for the individual.
Requirements on the texts:
- The students should not have met them before.
- The texts should be possible to use again after 10 years.
- The texts should be appropriate for all countries, languages, ethnic and socioeconomic groups and both genders.
- They should be usable stand-alone, in such a way that they provide a meaningful reading experience.
- They should not be formulated in such general terms that the students would be able to answer the questions without reading the texts.
- They should comprise different levels of difficulty.
7 The reading literacy test
Three types of texts:
- Narrative prose. These are continuous texts which aim to tell a story. The texts typically follow a linear time sequence, and are usually intended to entertain or to involve the reader emotionally. The texts ranged in length from short fables to longer stories.
- Expository prose. This category comprises continuous texts which aim to convey factual information or opinion to the reader.
- Documents. These are structured presentations of information, in the form of graphs, charts, maps, lists, or sets of instructions. The reader can process the information in a nonlinear fashion without reading the whole text, and typically the number of words is limited.
Items:
- Between two and six questions were asked about each text.
- Altogether there were 66 multiple-choice items and two open-ended questions. The latter were not included in the analysis because inter-rater agreement was too low.
Booklets:
- The 15 texts and the 66 multiple-choice items were distributed over two booklets (A and B).
8 Test components
This test, like most other tests, thus consists of different types of components:
- Single questions (items), which here are scored 0 (incorrect choice) and 1 (correct choice)
- Text passages (testlets, parcels, or item bundles) with between 2 and 6 as the maximum score
- Booklets
If test components (items) are independent, there is more flexibility and power in designing tests than if there are dependencies among the components.
9 A minitest
A minitest has been created from the 10 items belonging to two of the narrative texts ("Bird" and "Shark").
10 Statistical measures for the minitest
[Table: Descriptive Statistics for the minitest score, with columns N, Minimum, Maximum, Mean, Std. Deviation and a Valid N (listwise) row; the values are not preserved.]
11 Distribution of scores for the minitest
12 Means and standard deviations of the items
[Table: Item Statistics (Mean, Std. Deviation, N) for the items nbird1r to nbird5r and nshak1r to nshak5r; the values are not preserved.]
13 Correlations among the items
[Table: Inter-Item Correlation Matrix for the items nbird1r to nbird5r and nshak1r to nshak5r; the values are not preserved.]
14 Relations between items and the total score
[Table: Item-Total Statistics for the items nbird1r to nbird5r and nshak1r to nshak5r, with columns Scale Mean if Item Deleted, Scale Variance if Item Deleted, Corrected Item-Total Correlation, Squared Multiple Correlation, Cronbach's Alpha if Item Deleted; the values are not preserved.]
15 Reliability
- The precision of an instrument
- How well an instrument resists the influence of random variation
- Does the instrument give the same result upon repeated measurements?
16 Definition of reliability
Observed score = True score + Error
- An instrument's correlation with itself
- The ratio between true score variance and observed score variance (observed score variance = true score variance + error variance)
What is a true score and what is error?
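As a rough illustration of the two definitions above, the sketch below simulates observed scores as true score plus independent error and compares the variance ratio with the correlation between two administrations. The standard deviations, sample size, and seed are assumptions chosen for illustration, not values from the study.

```python
import random
import statistics


def pearson(x, y):
    """Pearson correlation between two equally long sequences."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def simulate_reliability(n=20000, true_sd=2.0, error_sd=1.0, seed=1):
    """Return (true/observed variance ratio, correlation of two administrations)."""
    rng = random.Random(seed)
    true = [rng.gauss(0.0, true_sd) for _ in range(n)]
    # Both administrations share the true score but have independent errors.
    form_a = [t + rng.gauss(0.0, error_sd) for t in true]
    form_b = [t + rng.gauss(0.0, error_sd) for t in true]
    variance_ratio = statistics.pvariance(true) / statistics.pvariance(form_a)
    return variance_ratio, pearson(form_a, form_b)


# Population value implied by the assumed standard deviations: 4 / (4 + 1).
theoretical = 2.0 ** 2 / (2.0 ** 2 + 1.0 ** 2)
ratio, test_corr = simulate_reliability()
print(theoretical, round(ratio, 3), round(test_corr, 3))
```

Both simulated quantities should land close to the population value, which is why the correlation of a test "with itself" can stand in for the unobservable variance ratio.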
17 Sources of variance in test scores (after Thorndike, 1951)
- Individual characteristics
- External/situational factors
18 Reasons for reliability loss
- Factors at test administration
- Rating of responses
- Guessing
- Selection of items for the test
- Variation in individuals' true scores
19 Ways to determine reliability
To determine reliability we would like to compute the correlation between the test and itself, or to know the true scores and the error scores. This is not possible, so different approaches have been devised:
- Test-retest: Administer the same test twice (memory effects may be a problem; sensitive to temporal instability, but not to effects of item selection).
- Parallel test: Create an identical twin of the test (sensitive to effects of item selection; may or may not be sensitive to temporal instability).
- Split-half: Create two parallel tests by randomly splitting the items into two groups (sensitive to effects of item selection; not sensitive to temporal instability). The split-half correlation gives the reliability of a half test; to obtain the reliability of the full test it must be corrected with the Spearman-Brown prophecy formula.
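The split-half procedure described above can be sketched as follows. The data are simulated (10 items, each equal to a common true score plus independent error), so the item model, error standard deviation, sample size, and seeds are all assumptions for illustration only.

```python
import random
import statistics


def pearson(x, y):
    """Pearson correlation between two equally long sequences."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def spearman_brown_full(r_half):
    """Step a half-test correlation up to full-test length."""
    return 2.0 * r_half / (1.0 + r_half)


def split_half_reliability(scores, seed=0):
    """scores: one list of item scores per person; items are split at random."""
    k = len(scores[0])
    order = list(range(k))
    random.Random(seed).shuffle(order)
    first, second = order[: k // 2], order[k // 2:]
    half_a = [sum(person[i] for i in first) for person in scores]
    half_b = [sum(person[i] for i in second) for person in scores]
    return spearman_brown_full(pearson(half_a, half_b))


# Simulated data: true-score variance 1, error variance 4 per item (assumed).
rng = random.Random(42)
data = []
for _ in range(5000):
    t = rng.gauss(0.0, 1.0)
    data.append([t + rng.gauss(0.0, 2.0) for _ in range(10)])

estimate = split_half_reliability(data)
expected = 10 / (10 + 4)  # reliability of the 10-item sum under this model
print(round(estimate, 3), round(expected, 3))
```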
20 Ways to determine reliability, cont.
Cronbach's α
- A measure of internal consistency among items
- Sensitive to effects of item selection; not sensitive to temporal instability
- The mean of all possible split-half coefficients
- Increases as a function of the correlation among the items and as a function of the number of items
α = (k/(k-1)) * [1 - Σ var(item scores) / var(total score)], where k is the number of items
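A minimal sketch of the α formula above, using only the Python standard library. The small 0/1 data matrix is made up for illustration and is not taken from the reading literacy data.

```python
import statistics


def cronbach_alpha(scores):
    """scores: one list of item scores per person (all lists equally long)."""
    k = len(scores[0])
    # Variance of each item across persons.
    item_vars = [statistics.variance([person[i] for person in scores])
                 for i in range(k)]
    # Variance of the total (sum) score across persons.
    total_var = statistics.variance([sum(person) for person in scores])
    return (k / (k - 1)) * (1.0 - sum(item_vars) / total_var)


# Hypothetical responses of six persons to four dichotomous items.
responses = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
    [1, 1, 1, 1],
]
alpha = cronbach_alpha(responses)
print(round(alpha, 3))
```

Because α depends only on the ratio of the summed item variances to the total-score variance, the result is the same whether sample or population variances are used, as long as the choice is consistent.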
21 Computation of Cronbach's α with SPSS
RELIABILITY
  /VARIABLES=nbird1r nbird2r nbird3r nbird4r nbird5r nshak1r nshak2r nshak3r nshak4r nshak5r
  /SCALE('ALL VARIABLES') ALL
  /MODEL=ALPHA.
22 Reliability and test length
The reliability increases as a function of the test length according to the Spearman-Brown prophecy formula: r(n) = n*r/(1 + (n-1)*r), where r is the reliability of the original test and n is the factor by which the test is lengthened.
If we increase our minitest 6.5 times to 65 items we expect a reliability of r(6.5) = 6.5*.737/(1 + 5.5*.737) = .948
If we compute Cronbach's α from the 65 items we obtain:
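The arithmetic of the Spearman-Brown projection can be checked with a short sketch. The function name is an assumption for illustration; the inputs .737 and 6.5 are the values quoted on the slide.

```python
def spearman_brown(reliability, length_factor):
    """Projected reliability when the test is lengthened by length_factor."""
    return (length_factor * reliability
            / (1.0 + (length_factor - 1.0) * reliability))


# Minitest reliability .737, lengthened 6.5 times (10 items to 65 items).
projected = spearman_brown(0.737, 6.5)
print(round(projected, 3))

# The same function shows the diminishing returns of adding more items.
for factor in (1, 2, 4, 6.5, 10):
    print(factor, round(spearman_brown(0.737, factor), 3))
```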
23 Reliability as a function of test length
24 Split-half reliability (first 33 items versus last 32 items)
This analysis yields a much lower reliability estimate than Cronbach's α did!
Reliability Statistics
- Cronbach's Alpha, Part 1: Value .888, N of Items 33
- Cronbach's Alpha, Part 2: Value .913, N of Items 32
- Total N of Items, Correlation Between Forms, Spearman-Brown Coefficient (Equal Length, Unequal Length), Guttman Split-Half Coefficient: values not preserved
25 Cronbach's α for passage scores
This analysis too yields a much lower reliability estimate than Cronbach's α for item scores did!
26 Cronbach's α assumptions
1. All components measure the same underlying dimension
2. All components have the same relation to the underlying dimension
3. All components have the same error variance
If assumptions 2 and 3 are violated, α will underestimate reliability but will provide a lower bound to reliability.
If assumption 1 is violated, α misestimates reliability, and we also run into interpretational difficulties.
We need methods to test these assumptions.
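The lower-bound property can be illustrated at the population level with a one-factor model. In the sketch below the loadings, latent variance, and residual variances are hypothetical; α is computed from the model-implied covariance matrix and compared with the reliability of the sum score implied by the same model.

```python
def covariance_matrix(loadings, latent_var, residual_vars):
    """Model-implied covariance matrix of a one-factor model."""
    k = len(loadings)
    sigma = [[loadings[i] * latent_var * loadings[j] for j in range(k)]
             for i in range(k)]
    for i in range(k):
        sigma[i][i] += residual_vars[i]
    return sigma


def alpha_from_covariance(sigma):
    """Cronbach's alpha computed from a covariance matrix."""
    k = len(sigma)
    item_var_sum = sum(sigma[i][i] for i in range(k))
    total_var = sum(sum(row) for row in sigma)
    return (k / (k - 1)) * (1.0 - item_var_sum / total_var)


def model_reliability(loadings, latent_var, residual_vars):
    """Reliability of the sum score under the one-factor model."""
    true_var = latent_var * sum(loadings) ** 2
    return true_var / (true_var + sum(residual_vars))


residuals = [1.0] * 5                   # hypothetical residual variances
equal = [0.6] * 5                       # all items relate equally to the factor
unequal = [1.0, 0.8, 0.6, 0.4, 0.2]     # assumption 2 violated

alpha_equal = alpha_from_covariance(covariance_matrix(equal, 1.0, residuals))
rel_equal = model_reliability(equal, 1.0, residuals)
alpha_unequal = alpha_from_covariance(covariance_matrix(unequal, 1.0, residuals))
rel_unequal = model_reliability(unequal, 1.0, residuals)
print(round(alpha_equal, 3), round(rel_equal, 3))
print(round(alpha_unequal, 3), round(rel_unequal, 3))
```

With equal loadings α coincides with the model-based reliability; with unequal loadings it falls below it.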
27 The Birds passage
28 A congeneric latent variable model for the items in the Birds passage
29 Estimating the model
Needed: estimates of 10 parameters (4 regression coefficients, 5 error variances, 1 variance of the latent variable).
Available: 15 elements of the covariance matrix (10 covariances and 5 variances).
Express the known entities in terms of the unknown parameters through application of path rules, e.g.:
- Cov(NBIRD4R, NBIRD5R) = b4 × Var(nbird) × b3
- Cov(NBIRD2R, NBIRD1R) = b1 × Var(nbird) × 1
- Var(NBIRD2R) = b1 × Var(nbird) × b1 + 1 × Var(NBIRD2R&) × 1, where NBIRD2R& is the residual of NBIRD2R
Solve the 15 equations for the 10 unknown parameters.
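The path rules above amount to building a model-implied covariance matrix from the parameters. The sketch below does this for a one-factor model and counts degrees of freedom; the numeric loadings and variances are hypothetical and are not the estimates from the slides.

```python
def implied_covariance(loadings, latent_var, residual_vars):
    """Apply the path rules: cov(i, j) = b_i * Var(latent) * b_j,
    plus the residual variance on the diagonal."""
    k = len(loadings)
    sigma = [[loadings[i] * latent_var * loadings[j] for j in range(k)]
             for i in range(k)]
    for i in range(k):
        sigma[i][i] += residual_vars[i]
    return sigma


def degrees_of_freedom(n_items, n_free_parameters):
    """Distinct covariance-matrix elements minus free parameters."""
    return n_items * (n_items + 1) // 2 - n_free_parameters


# Hypothetical values; the first loading is fixed to 1 to set the scale.
loadings = [1.0, 0.7, 0.9, 0.5, 1.2]
sigma = implied_covariance(loadings, latent_var=0.05, residual_vars=[0.15] * 5)

# Congeneric model: 4 loadings + 5 residual variances + 1 latent variance.
df_congeneric = degrees_of_freedom(5, 4 + 5 + 1)
# Parallel model: 1 common residual variance + 1 latent variance.
df_parallel = degrees_of_freedom(5, 2)
print(sigma[3][4], df_congeneric, df_parallel)
```

The two degrees-of-freedom values agree with the df = 5 and df = 13 reported for the congeneric and parallel models.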
30 Unstandardized parameter estimates Standardized parameter estimates
31 Does the model fit the data?
Reproduce the covariance matrix from the estimated parameters (the implied matrix) and compare it with the observed matrix, e.g.:
- Cov(NBIRD4R, NBIRD5R) = 0.54 × 0.05 × 1.20 (observed value = 0.035)
- Cov(NBIRD1R, NBIRD2R) = 1.00 × 0.05 × 0.74 (observed value = 0.037)
A chi-square test of model fit may be computed: Chi-square = , df = 5, p <.00
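As a small worked check, the two implied covariances can be recomputed from the rounded estimates shown above and compared with the observed values. Because the inputs are rounded to two decimals, the products may differ slightly from the figures on the original slide.

```python
# Rounded estimates as displayed on the slide.
latent_var = 0.05
loading = {"NBIRD1R": 1.00, "NBIRD2R": 0.74, "NBIRD4R": 0.54, "NBIRD5R": 1.20}
observed = {("NBIRD4R", "NBIRD5R"): 0.035, ("NBIRD1R", "NBIRD2R"): 0.037}

implied = {}
residual = {}
for (a, b), obs in observed.items():
    # Path rule: loading of a, times latent variance, times loading of b.
    implied[(a, b)] = loading[a] * latent_var * loading[b]
    residual[(a, b)] = obs - implied[(a, b)]
    print(a, b, round(implied[(a, b)], 4), round(residual[(a, b)], 4))
```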
32 Problems with the Chi-square Goodness of Fit test
- The test statistic is χ² distributed only when the data have a multivariate normal distribution under maximum likelihood estimation.
- When the sample size is large, even trivial deviations between model and data cause the χ² test to be significant.
- When the sample size is small, even important deviations from the true model may go undetected.
- A model with many free parameters has a better χ² value than a model with few free parameters. However, models with few free parameters are generally to be preferred over models with many free parameters.
33 The Root Mean Square Error of Approximation (RMSEA)
- The RMSEA measures the amount of discrepancy between model and data in the population, taking model complexity (i.e., number of estimated parameters) into account. Values less than 0.05 indicate good fit, and values up to 0.07 or 0.08 may be accepted.
- The Test of Close Fit tests the hypothesis that RMSEA < 0.05.
- A 90 % confidence interval of RMSEA may be constructed. The lower limit of the interval should be less than .05 and the upper limit of the interval should not be higher than
The nbird model: RMSEA = 0.064, 90 % CI
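For orientation, one commonly used point estimate of the RMSEA can be sketched as below. The slides do not state which formula the software used (some programs divide by N rather than N - 1), and the chi-square values and sample size in the example are hypothetical, since the originals are not preserved.

```python
import math


def rmsea(chi_square, df, n):
    """Point estimate: sqrt(max(chi2 - df, 0) / (df * (n - 1)))."""
    return math.sqrt(max(chi_square - df, 0.0) / (df * (n - 1)))


# Hypothetical inputs, chosen only to show how the index behaves.
poor_fit = rmsea(chi_square=50.0, df=5, n=1000)
good_fit = rmsea(chi_square=3.0, df=5, n=1000)   # chi2 below df gives 0
print(round(poor_fit, 4), good_fit)
```

Dividing the excess chi-square by the degrees of freedom is what makes the index penalize models with many free parameters.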
34 A parallel latent variable model for the items in the Birds passage
35 Unstandardized parameter estimates Standardized parameter estimates Chi-square = , df = 13, RMSEA = 0.226, 90 % CI
36 Reliability calculations
ρ for a single item = 0.04/(0.04 + 0.134) = 0.23
ρ for 5 items according to S-B = 5*0.23/(1 + 4*0.23) = 0.599
ρ for the total score can also be computed with the formula: ρ = Latvar * (Σ b_i)² / (Latvar * (Σ b_i)² + Σ Resvar_i)
Parallel model: ρ = 0.04*25/(0.04*25 + 5*0.134) = 0.599
Congeneric model: 0.049* /(0.049* ) = 0.598
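The parallel-model calculations above can be reproduced with a short sketch. The function names are assumptions for illustration; the latent variance 0.04 and residual variance 0.134 are the values quoted on the slide. The congeneric estimates are not preserved, so only the parallel model is computed.

```python
def spearman_brown(reliability, length_factor):
    """Projected reliability of a test lengthened by length_factor."""
    return (length_factor * reliability
            / (1.0 + (length_factor - 1.0) * reliability))


def composite_reliability(loadings, latent_var, residual_vars):
    """rho = Latvar * (sum b)^2 / (Latvar * (sum b)^2 + sum Resvar)."""
    true_var = latent_var * sum(loadings) ** 2
    return true_var / (true_var + sum(residual_vars))


latent_var, residual_var = 0.04, 0.134

# Reliability of one item, then stepped up to five items.
single_item = latent_var / (latent_var + residual_var)
five_items_sb = spearman_brown(single_item, 5)

# Same quantity from the total-score formula (all loadings equal to 1).
parallel = composite_reliability([1.0] * 5, latent_var, [residual_var] * 5)
print(round(single_item, 2), round(five_items_sb, 3), round(parallel, 3))
```

Under the parallel model the two routes are algebraically identical, which is why they give the same value.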
37 Some conclusions
Cronbach's α is based on several assumptions. One of them is that items should be identical in terms of their relation to the latent variable ("discrimination") and their residual variance (the parallelity assumption). However, our results indicate that estimation of α is robust against deviations from the parallelity assumption. This seems to be quite a general finding.
The unidimensionality assumption is another assumption, to which we now turn.
38 A two-dimensional model for the passages Bird and Shark
39 Standardized estimates for the two-dimensional model Chi-square , df = 35, RMSEA = 0.046, 90 % CI
40 Standardized estimates for a one-dimensional model Chi-square = , df = 36, RMSEA = 0.056, 90 % CI
41 An orthogonal model with general and specific factors
42 Results from a model with one general factor and 15 passage factors
Fit statistics for the one-dimensional model: Chi-square = , df = 2015, RMSEA = 0.056, 90 % CI
Fit statistics for a model with one general factor and 15 passage factors: Chi-square = , df = 2000, RMSEA = 0.032, 90 % CI
Estimated components of variance in sum of scores:
READGEN (value not preserved)
ECARD 0.03
NBIRD 0.33
DISLA 0.38
DMARIA 0.12
NDOG 0.64
EWLR 1.66
ESND 0.13
NSHK 0.38
DBTT 0.37
DBUS 0.35
DCNT 0.34
DTEMP 0.30
EMRM 0.30
NGRP 1.44
ETRE 1.84
Error 7.58
Estimated total variance (value not preserved)
Sum of passage variances 8.61
Estimated systematic variance (value not preserved)
Estimated total reliability (value not preserved)
Estimated reliability for ReadGen 0.904
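The variance decomposition above can be sketched as follows. The passage and error variances are those listed on the slide; the READGEN variance is not preserved, so the value used here is purely hypothetical. The sketch also assumes that total reliability is systematic (general plus passage) variance over total variance and that the ReadGen reliability is general variance over total variance, which the slide does not state explicitly.

```python
passage_variances = {
    "ECARD": 0.03, "NBIRD": 0.33, "DISLA": 0.38, "DMARIA": 0.12,
    "NDOG": 0.64, "EWLR": 1.66, "ESND": 0.13, "NSHK": 0.38,
    "DBTT": 0.37, "DBUS": 0.35, "DCNT": 0.34, "DTEMP": 0.30,
    "EMRM": 0.30, "NGRP": 1.44, "ETRE": 1.84,
}
error_variance = 7.58


def reliabilities(general_var, passage_vars, error_var):
    """Return (total reliability, reliability for the general factor)."""
    specific = sum(passage_vars.values())
    total = general_var + specific + error_var
    return (general_var + specific) / total, general_var / total


passage_sum = sum(passage_variances.values())   # matches the reported 8.61

# Hypothetical general-factor variance, NOT the value from the slide.
hypothetical_general = 150.0
total_rel, general_rel = reliabilities(
    hypothetical_general, passage_variances, error_variance)
print(round(passage_sum, 2), round(total_rel, 3), round(general_rel, 3))
```

The gap between the two coefficients shows how much of the systematic variance in the sum score is passage-specific rather than general reading ability.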
43 Definition of validity
Does the instrument measure what it intends to measure?
"Validity is an integrated evaluative judgement of the degree to which empirical evidence and theoretical rationales support the adequacy and appropriateness of inferences and actions based on test scores or other modes of assessment." (Messick, 1989, p. 5)
44 The three classical forms of validity
- Content validity. How well do the items in a test cover a certain domain?
- Criterion-related validity. How well does the test predict a criterion?
- Construct validity. How well does the test function as an indicator of a construct?
45 Construct validity as the overarching validity construct
- Content validity and criterion-related validity are insufficient forms of validity and require construct validity.
- This has led to the view that construct validity is the only needed validity construct.
- The meaning of construct validity has been broadened, particularly by Messick through the introduction of consequential aspects of validity.
46 Threats against construct validity
- Construct underrepresentation. The instrument covers only parts of the construct, and leaves out important dimensions or facets.
- Construct-irrelevant variance. The instrument is influenced by sources of variation which have nothing to do with the construct.
47 Testing construct validity
"... test validation embraces all of the experimental, statistical, and philosophical means by which hypotheses and scientific theories are evaluated." (Messick, 1989, p. 6)
48 Sources of information about construct validity
- Internal structure (explorative and confirmatory factor analysis)
- Relations with other variables (external structure)
- Assessment of content
- Studies of processes
- Differences over time and between groups
- Effects of experimental interventions
- Value implications and social consequences, concerning both intended and unintended effects
49 A three-dimensional model for the RL-test
50 Messick's progressive matrix
More informationA Brief Introduction to Bayesian Statistics
A Brief Introduction to Statistics David Kaplan Department of Educational Psychology Methods for Social Policy Research and, Washington, DC 2017 1 / 37 The Reverend Thomas Bayes, 1701 1761 2 / 37 Pierre-Simon
More informationHow!Good!Are!Our!Measures?! Investigating!the!Appropriate! Use!of!Factor!Analysis!for!Survey! Instruments!
22 JournalofMultiDisciplinaryEvaluation Volume11,Issue25,2015 HowGoodAreOurMeasures? InvestigatingtheAppropriate UseofFactorAnalysisforSurvey Instruments MeganSanders TheOhioStateUniversity P.CristianGugiu
More informationBusiness Research Methods. Introduction to Data Analysis
Business Research Methods Introduction to Data Analysis Data Analysis Process STAGES OF DATA ANALYSIS EDITING CODING DATA ENTRY ERROR CHECKING AND VERIFICATION DATA ANALYSIS Introduction Preparation of
More informationMeasurement is the process of observing and recording the observations. Two important issues:
Farzad Eskandanian Measurement is the process of observing and recording the observations. Two important issues: 1. Understanding the fundamental ideas: Levels of measurement: nominal, ordinal, interval
More informationResearch Questions and Survey Development
Research Questions and Survey Development R. Eric Heidel, PhD Associate Professor of Biostatistics Department of Surgery University of Tennessee Graduate School of Medicine Research Questions 1 Research
More informationComprehensive Statistical Analysis of a Mathematics Placement Test
Comprehensive Statistical Analysis of a Mathematics Placement Test Robert J. Hall Department of Educational Psychology Texas A&M University, USA (bobhall@tamu.edu) Eunju Jung Department of Educational
More information11/18/2013. Correlational Research. Correlational Designs. Why Use a Correlational Design? CORRELATIONAL RESEARCH STUDIES
Correlational Research Correlational Designs Correlational research is used to describe the relationship between two or more naturally occurring variables. Is age related to political conservativism? Are
More informationConnectedness DEOCS 4.1 Construct Validity Summary
Connectedness DEOCS 4.1 Construct Validity Summary DEFENSE EQUAL OPPORTUNITY MANAGEMENT INSTITUTE DIRECTORATE OF RESEARCH DEVELOPMENT AND STRATEGIC INITIATIVES Directed by Dr. Daniel P. McDonald, Executive
More informationAnalysis and Interpretation of Data Part 1
Analysis and Interpretation of Data Part 1 DATA ANALYSIS: PRELIMINARY STEPS 1. Editing Field Edit Completeness Legibility Comprehensibility Consistency Uniformity Central Office Edit 2. Coding Specifying
More informationThe Psychometric Properties of Dispositional Flow Scale-2 in Internet Gaming
Curr Psychol (2009) 28:194 201 DOI 10.1007/s12144-009-9058-x The Psychometric Properties of Dispositional Flow Scale-2 in Internet Gaming C. K. John Wang & W. C. Liu & A. Khoo Published online: 27 May
More informationMidterm Exam MMI 409 Spring 2009 Gordon Bleil
Midterm Exam MMI 409 Spring 2009 Gordon Bleil Table of contents: (Hyperlinked to problem sections) Problem 1 Hypothesis Tests Results Inferences Problem 2 Hypothesis Tests Results Inferences Problem 3
More informationCHAPTER NINE DATA ANALYSIS / EVALUATING QUALITY (VALIDITY) OF BETWEEN GROUP EXPERIMENTS
CHAPTER NINE DATA ANALYSIS / EVALUATING QUALITY (VALIDITY) OF BETWEEN GROUP EXPERIMENTS Chapter Objectives: Understand Null Hypothesis Significance Testing (NHST) Understand statistical significance and
More informationThe Effect of Guessing on Item Reliability
The Effect of Guessing on Item Reliability under Answer-Until-Correct Scoring Michael Kane National League for Nursing, Inc. James Moloney State University of New York at Brockport The answer-until-correct
More informationPsychometric Instrument Development
Psychometric Instrument Development Lecture 6 Survey Research & Design in Psychology James Neill, 2012 Readings: Psychometrics 1. Bryman & Cramer (1997). Concepts and their measurement. [chapter - ereserve]
More informationResults & Statistics: Description and Correlation. I. Scales of Measurement A Review
Results & Statistics: Description and Correlation The description and presentation of results involves a number of topics. These include scales of measurement, descriptive statistics used to summarize
More informationValidity and reliability of measurements
Validity and reliability of measurements 2 Validity and reliability of measurements 4 5 Components in a dataset Why bother (examples from research) What is reliability? What is validity? How should I treat
More informationRunning head: NESTED FACTOR ANALYTIC MODEL COMPARISON 1. John M. Clark III. Pearson. Author Note
Running head: NESTED FACTOR ANALYTIC MODEL COMPARISON 1 Nested Factor Analytic Model Comparison as a Means to Detect Aberrant Response Patterns John M. Clark III Pearson Author Note John M. Clark III,
More informationOn Test Scores (Part 2) How to Properly Use Test Scores in Secondary Analyses. Structural Equation Modeling Lecture #12 April 29, 2015
On Test Scores (Part 2) How to Properly Use Test Scores in Secondary Analyses Structural Equation Modeling Lecture #12 April 29, 2015 PRE 906, SEM: On Test Scores #2--The Proper Use of Scores Today s Class:
More informationUsing the Rasch Modeling for psychometrics examination of food security and acculturation surveys
Using the Rasch Modeling for psychometrics examination of food security and acculturation surveys Jill F. Kilanowski, PhD, APRN,CPNP Associate Professor Alpha Zeta & Mu Chi Acknowledgements Dr. Li Lin,
More informationASSESSING THE UNIDIMENSIONALITY, RELIABILITY, VALIDITY AND FITNESS OF INFLUENTIAL FACTORS OF 8 TH GRADES STUDENT S MATHEMATICS ACHIEVEMENT IN MALAYSIA
1 International Journal of Advance Research, IJOAR.org Volume 1, Issue 2, MAY 2013, Online: ASSESSING THE UNIDIMENSIONALITY, RELIABILITY, VALIDITY AND FITNESS OF INFLUENTIAL FACTORS OF 8 TH GRADES STUDENT
More informationKnowledge as a driver of public perceptions about climate change reassessed
1. Method and measures 1.1 Sample Knowledge as a driver of public perceptions about climate change reassessed In the cross-country study, the age of the participants ranged between 20 and 79 years, with
More informationExamining the efficacy of the Theory of Planned Behavior (TPB) to understand pre-service teachers intention to use technology*
Examining the efficacy of the Theory of Planned Behavior (TPB) to understand pre-service teachers intention to use technology* Timothy Teo & Chwee Beng Lee Nanyang Technology University Singapore This
More informationAndré Cyr and Alexander Davies
Item Response Theory and Latent variable modeling for surveys with complex sampling design The case of the National Longitudinal Survey of Children and Youth in Canada Background André Cyr and Alexander
More informationvalidscale: A Stata module to validate subjective measurement scales using Classical Test Theory
: A Stata module to validate subjective measurement scales using Classical Test Theory Bastien Perrot, Emmanuelle Bataille, Jean-Benoit Hardouin UMR INSERM U1246 - SPHERE "methods in Patient-centered outcomes
More informationContents. What is item analysis in general? Psy 427 Cal State Northridge Andrew Ainsworth, PhD
Psy 427 Cal State Northridge Andrew Ainsworth, PhD Contents Item Analysis in General Classical Test Theory Item Response Theory Basics Item Response Functions Item Information Functions Invariance IRT
More informationRESULTS. Chapter INTRODUCTION
8.1 Chapter 8 RESULTS 8.1 INTRODUCTION The previous chapter provided a theoretical discussion of the research and statistical methodology. This chapter focuses on the interpretation and discussion of the
More informationCollecting & Making Sense of
Collecting & Making Sense of Quantitative Data Deborah Eldredge, PhD, RN Director, Quality, Research & Magnet Recognition i Oregon Health & Science University Margo A. Halm, RN, PhD, ACNS-BC, FAHA Director,
More informationPackianathan Chelladurai Troy University, Troy, Alabama, USA.
DIMENSIONS OF ORGANIZATIONAL CAPACITY OF SPORT GOVERNING BODIES OF GHANA: DEVELOPMENT OF A SCALE Christopher Essilfie I.B.S Consulting Alliance, Accra, Ghana E-mail: chrisessilfie@yahoo.com Packianathan
More informationValidity, Reliability and Classical Assumptions
, Reliability and Classical Assumptions Presented by Mahendra AN Sources: www-psych.stanford.edu/~bigopp/.ppt http://ets.mnsu.edu/darbok/ethn402-502/reliability.ppt http://5martconsultingbandung.blogspot.com/2011/01/uji-asumsi-klasik.html
More informationUsing Analytical and Psychometric Tools in Medium- and High-Stakes Environments
Using Analytical and Psychometric Tools in Medium- and High-Stakes Environments Greg Pope, Analytics and Psychometrics Manager 2008 Users Conference San Antonio Introduction and purpose of this session
More information1. Evaluate the methodological quality of a study with the COSMIN checklist
Answers 1. Evaluate the methodological quality of a study with the COSMIN checklist We follow the four steps as presented in Table 9.2. Step 1: The following measurement properties are evaluated in the
More informationDiscriminant Analysis with Categorical Data
- AW)a Discriminant Analysis with Categorical Data John E. Overall and J. Arthur Woodward The University of Texas Medical Branch, Galveston A method for studying relationships among groups in terms of
More informationOn the purpose of testing:
Why Evaluation & Assessment is Important Feedback to students Feedback to teachers Information to parents Information for selection and certification Information for accountability Incentives to increase
More information