Test Validity (IOP 301-T). What is validity? Types of validity: content-description, criterion-description, construct-identification.


What is validity?

Validity is the accuracy of a measure in reflecting the concept it is supposed to measure. In simple English, the validity of a test concerns what the test measures and how well it measures it.

Types of validity:
- Content validity (content-description): face validity
- Criterion-related validity (criterion-description): concurrent, predictive
- Construct validity (construct-identification): correlational, factorial, convergent, discriminant

Content validity
- Non-statistical in nature: it involves determining whether the sample used for the measure is representative of the aspect to be measured.
- It adopts a subjective approach whereby we have recourse, e.g., to expert opinion for the evaluation of items during the test construction phase.

Content validity
- Relevant for evaluating achievement, educational and occupational measures.
- A basic requirement for criterion-referenced and job-sample measures, which are essential for employee selection and employee classification.
- Such measures are interpreted in terms of mastery of knowledge and skills for a specific job.
- Not appropriate for aptitude and personality measures, since their validation has to be made through criterion-prediction procedures.

Face validity
- It is not validity in psychometric terms! It just refers to what the test appears to measure, not to what it measures in fact.
- It is not useless, since the aim may be achieved merely by using appropriate phrasing: in a sense, it ensures relevance to the context by employing correct expressions.

Criterion-related validity
- Definition: a criterion variable is one with (or against) which psychological measures are compared or evaluated. A criterion must be reliable!
- It is a quantitative procedure which involves calculating the correlation coefficient between one or more predictor variables and a criterion variable.
- The validity of the measure is also determined by its ability to predict performance on the criterion.
- Concurrent validity: accuracy in identifying the current status regarding skills and characteristics.
- Predictive validity: accuracy in forecasting future behaviour; it implicitly contains the concept of decision-making.

Warning! Sometimes a factor may affect the criterion such that it is no longer a valid measure. This is known as criterion contamination. For example, the rater might be too lenient, or commit a halo error, i.e., rely on general impressions.
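The correlation between predictor and criterion described above can be sketched in a few lines of Python. The data below are hypothetical; only the Pearson formula itself comes from standard statistics.

```python
# Sketch of a criterion-related validity check: correlate predictor (test)
# scores with criterion scores. All data are hypothetical.
from statistics import mean, stdev


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y)) / (len(x) - 1)
    return cov / (stdev(x) * stdev(y))


# Hypothetical data: selection-test scores and later job-performance ratings.
test_scores = [52, 61, 47, 70, 58, 65, 44, 73]
performance = [3.1, 3.8, 2.9, 4.4, 3.5, 4.0, 2.7, 4.6]

r = pearson_r(test_scores, performance)
print(f"validity coefficient r = {r:.2f}")
```

A coefficient this close to 1 would be unrealistically high for real selection data; the point of the sketch is only the computation, not the magnitude.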

Warning! Therefore, we must make sure that the criterion is free from bias (prejudice), since bias definitely influences the correlation coefficient.

Common criteria
- Academic achievement: used for validating intelligence, multiple-aptitude and personality measures. Indices include school, college or university grades, achievement test scores, and special awards.
- Performance in specialised training: used for specific aptitude measures. Indices include training outcomes for technical and academic courses.
- Job performance: used for validating intelligence, special-aptitude and personality measures. Indices include jobs in industry, business, the armed services and government. Tests describe the duties performed and the ways in which they are measured.
- Contrasted groups: sometimes used for validating personality measures; relevant when distinguishing the nature of occupations (e.g. a social versus a non-social occupation, such as Public Relations Officer versus Clerk).

- Psychiatric diagnoses: used mainly for validating personality measures; based on prolonged observation and case history.
- Ratings: suitable for almost any type of measure (intelligence, aptitude, personality, and the criteria above: academic achievement, performance in specialised training, job performance, contrasted groups, psychiatric diagnoses). Very subjective in nature, yet often the only source available. Ratings are given by teachers, lecturers, instructors, supervisors, officers, etc. Raters may be trained to avoid common errors such as halo error, ambiguity, error of central tendency, and leniency.

We can always validate a new measure by correlating it with another valid (and, obviously, reliable!) test.

Some modern and popular criterion-prediction procedures which are now widely used are validity generalisation, meta-analysis, and cross-validation.

Criterion-prediction procedures
- Validity generalisation: Schmidt, Hunter et al. showed that the validity of tests measuring verbal, numerical and reasoning aptitudes can be generalised widely across occupations (these require common cognitive skills).
- Meta-analysis: a method of reviewing research literature through statistical integration and analysis of previous and current findings on a topic. Validation by correlation.
- Cross-validation: refinement of the initial measure, application to another representative normative sample, and recalculation of the coefficients. A lowering of the coefficient is expected after the minimisation of chance differences and sampling errors (spuriousness).

Construct-identification

Construct validity is the sensitivity of the instrument to pick up minor variations in the concept being measured. Can an instrument (questionnaire) designed to measure anxiety pick up different levels of anxiety, or just its presence or absence?

Any data throwing light on the nature of the trait, and on the conditions affecting its development and manifestations, represent appropriate evidence for this validation.

Example: I have designed a program to lower girls' Math phobia. The girls who complete my program should have lower scores on the Math Phobia Measure compared to their scores before the program, and compared to the scores of girls who have not completed the program.

Construct validation methods: correlational, factor analysis, convergent and discriminant.
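The cross-validation shrinkage described above can be sketched as follows: fit a regression on a derivation sample, apply the fitted weights to a fresh sample, and recompute the validity coefficient. All data are hypothetical and chosen so that the expected shrinkage is visible.

```python
# Sketch of cross-validation: weights derived on one sample are applied to a
# new normative sample, and the coefficient is recalculated. Hypothetical data.
from statistics import mean, stdev


def fit_line(x, y):
    """Least-squares slope and intercept for simple regression."""
    mx, my = mean(x), mean(y)
    slope = sum((a - mx) * (b - my) for a, b in zip(x, y)) / \
            sum((a - mx) ** 2 for a in x)
    return slope, my - slope * mx


def pearson_r(x, y):
    """Pearson correlation coefficient."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y)) / (len(x) - 1)
    return cov / (stdev(x) * stdev(y))


# Derivation sample: predictor and criterion.
x1 = [10, 12, 14, 16, 18, 20]
y1 = [20, 25, 27, 33, 34, 41]

# New (cross-validation) sample from the same population.
x2 = [11, 13, 15, 17, 19, 21]
y2 = [24, 23, 31, 30, 38, 40]

slope, intercept = fit_line(x1, y1)
predicted = [slope * a + intercept for a in x2]

r_derivation = pearson_r(x1, y1)
r_cross = pearson_r(predicted, y2)
print(f"derivation r = {r_derivation:.2f}, cross-validated r = {r_cross:.2f}")
```

The cross-validated coefficient comes out lower than the derivation coefficient, which is exactly the expected lowering after chance differences are minimised.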

Construct validation methods
- Correlational: this involves correlating a new measure with similar previous measures of the same name. Warning! A very high correlation may indicate mere duplication of measures.
- Factor analysis (FA): a multivariate statistical technique used to group multiple variables into a few factors. In doing FA you hope to find clusters of variables that can be identified as new hypothetical factors.
- Convergent and discriminant: the idea is that a test should correlate highly with other similar tests, and poorly with tests that are very dissimilar. Example: a newly developed test of motor coordination should correlate highly with other tests of motor coordination, and have low correlations with tests that measure attitudes.

The validity coefficient
- Magnitude of the coefficient, and factors affecting it
- Coefficient of determination
- Standard error of estimation
- Regression analysis (prediction)

Definition: the validity coefficient is the correlation coefficient between the criterion and the predictor variable(s).

- Differential validity refers to differences in the magnitude of the correlation coefficients for different groups of test-takers.
- Magnitude of the coefficient: treated in the same way as the Pearson correlation coefficient!

Factors affecting the validity coefficient: nature of the group, sample heterogeneity, the criterion-predictor relationship, validity-reliability proportionality, criterion contamination, and moderator variables.

- Nature of the group: consistency of the coefficient across subgroups which differ in some characteristic (e.g. age, gender, educational level).
- Sample heterogeneity: a wider range of scores results in a higher coefficient (the range-restriction phenomenon).
- Criterion-predictor relationship: there must be a linear relationship between predictor and criterion; otherwise the Pearson correlation coefficient would be of no use!
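The range-restriction effect mentioned under sample heterogeneity can be demonstrated numerically: the same predictor-criterion relationship yields a smaller correlation when only a narrow band of scores is available. The data below are hypothetical and chosen to make the effect visible.

```python
# Sketch of range restriction: restricting the predictor to its top half
# lowers the correlation even though the relationship is unchanged.
from statistics import mean, stdev


def pearson_r(x, y):
    """Pearson correlation coefficient."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y)) / (len(x) - 1)
    return cov / (stdev(x) * stdev(y))


# Heterogeneous sample: full range of predictor scores, criterion tracks the
# predictor with a small fixed disturbance.
x_full = list(range(1, 13))
y_full = [2, 1, 4, 3, 6, 5, 8, 7, 10, 9, 12, 11]

# Restricted sample: only the top half of the predictor range, as would
# happen if only selected applicants were followed up.
x_top = x_full[6:]
y_top = y_full[6:]

print(f"full-range r       = {pearson_r(x_full, y_full):.2f}")
print(f"restricted-range r = {pearson_r(x_top, y_top):.2f}")
```

This is why validity coefficients computed on pre-selected groups (e.g. hired employees only) understate the validity of the test in the full applicant pool.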

- Validity-reliability proportionality: reliability has a limiting influence on validity; we simply cannot validate an unreliable measure!
- Criterion contamination: get rid of bias by measuring the contaminating influences, then correct for them statistically by use of partial correlation.
- Moderator variables: variables like age, gender and personality characteristics may help to predict performance for particular groups only; keep them in mind!

Coefficient of determination
- Indicates the proportion of variance in the criterion variable explained by the predictor. E.g. if r = 0.9, then r² = 0.81, so 81% of the variation in the criterion is accounted for by the predictor.

Standard error of estimation
- SE_est = s_y √(1 − r_xy²)
- Treated just like the standard deviation. (True and predicted values for the criterion should differ by at most 1.96 SE_est at the 95% confidence level.)

Regression analysis
- Mainly used to predict values of the criterion variable. If r is high, prediction is more accurate. Predicted values are obtained from the line of best fit.
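The link between r, the coefficient of determination, and the standard error of estimation can be worked through numerically. The validity coefficient and criterion scores below are hypothetical; the formula SE_est = s_y √(1 − r_xy²) is the one given above.

```python
# Worked example: coefficient of determination and standard error of
# estimation from a hypothetical validity coefficient and criterion sample.
import math
from statistics import stdev

r_xy = 0.9                                  # hypothetical validity coefficient
criterion = [55, 62, 48, 71, 66, 59, 50, 69]  # hypothetical criterion scores

r_squared = r_xy ** 2                       # proportion of variance explained
s_y = stdev(criterion)                      # criterion standard deviation
se_est = s_y * math.sqrt(1 - r_xy ** 2)     # standard error of estimation

print(f"r^2    = {r_squared:.2f}")          # 81% of criterion variance
print(f"SE_est = {se_est:.2f}")
print(f"95% band: predicted criterion +/- {1.96 * se_est:.2f}")
```

Note that SE_est is always smaller than s_y whenever r is non-zero: the better the predictor, the tighter the band around each predicted criterion value.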

Regression analysis: linear regression involves one criterion variable, but may involve one predictor variable (simple regression) or more than one (multiple regression).

Reliability and validity
- A valid test is always reliable (for a test to be valid, it needs to be reliable in the first place).
- A reliable test is not always valid.
- Validity is more important than reliability. To be useful, a measuring instrument (test, scale) must be both reasonably reliable and valid.
- Aim for validity first, and then try to make the test more reliable little by little, rather than the other way around.

Optimising reliability and validity:

IF the test is...          THEN...
Unreliable                 the test is undermined.
Reliable, but not valid    the test is not useful.
Unreliable and invalid     the test is definitely NOT useful!
Reliable and valid         the test can be used with good results.

Ways to improve reliability and validity:
- The more questions the better (number of test items).
- Ask questions several times in slightly different ways (homogeneity).
- Get as many people as you can in your program (sample size n).
- Get different kinds of people in your program (sample heterogeneity).
- Ensure a linear relationship between the test and the criterion (Pearson correlation coefficient).

Selecting and creating measures
- Define the construct(s) that you want to measure clearly.
- Identify existing measures, particularly those with established reliability and validity.
- Determine whether those measures will work for your purpose, and identify any areas where you may need to create a new measure or add new questions.
- Create the additional questions/measures.
- Identify criteria that your measure should correlate with or predict, and develop procedures for assessing those criteria.
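Prediction from the line of best fit, as described in the regression section above, can be sketched in a few lines. The test scores, ratings, and the new applicant's score are all hypothetical.

```python
# Sketch of simple linear regression for criterion prediction: fit the line
# of best fit on past data, then predict the criterion for a new test-taker.
from statistics import mean


def fit_line(x, y):
    """Least-squares slope and intercept for one predictor."""
    mx, my = mean(x), mean(y)
    slope = sum((a - mx) * (b - my) for a, b in zip(x, y)) / \
            sum((a - mx) ** 2 for a in x)
    return slope, my - slope * mx


# Hypothetical data: selection-test scores and job-performance ratings.
scores = [40, 45, 50, 55, 60, 65]
ratings = [2.0, 2.4, 2.9, 3.3, 3.8, 4.2]

slope, intercept = fit_line(scores, ratings)

# Predict the criterion for a new applicant who scored 58 on the test.
new_score = 58
predicted_rating = slope * new_score + intercept
print(f"predicted rating for score {new_score}: {predicted_rating:.2f}")
```

With more than one predictor the same idea extends to multiple regression; the single-predictor case shown here is the "simple regression" named in the text.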