PÄIVI KARHU THE THEORY OF MEASUREMENT


AGENDA
1. Quality of Measurement
   a) Validity
      - Definition and types of validity
      - Assessment of validity
      - Threats to validity
   b) Reliability
      - True Score Theory
      - Definition and types of reliability
      - Assessment of reliability
2. Levels of Measurement

Quality of Measurement - Validity - Definition and Types of validity
WHAT IS MEASUREMENT?
Measurement is the process of observing and recording the observations that are collected as part of a research effort.

Quality of Measurement - Validity - Definition and Types of validity
VALIDITY
Validity is the best available approximation of the truth of a given proposition, inference or conclusion.
VALIDITY QUESTIONS ARE CUMULATIVE:
- Conclusion: Is there a relationship between the cause and the effect?
- Internal: Is the relationship causal?
- Construct: Did you measure what you wanted to measure?
- External: How well can you generalize from your sample to other persons, places, times?

Quality of Measurement - Validity - Definition and Types of validity
VALIDITY TYPES
CONSTRUCT VALIDITY: an assessment of how well your actual programs or measures reflect your ideas or theories. Is the operationalization* a good reflection of the construct? Does the operationalization* behave the way it should?
- TRANSLATION VALIDITY: Face Validity, Content Validity
- CRITERION-RELATED VALIDITY: Predictive Validity, Concurrent Validity, Convergent Validity, Discriminant Validity
*) operationalization: the act of translating a construct into its manifestation

Quality of Measurement - Validity - Definition and Types of validity
TRANSLATION VALIDITY
Face Validity: you look at the operationalization and see whether "on its face" it seems like a good translation of the construct.
Content Validity: you essentially check the operationalization against the relevant content domain for the construct.

Quality of Measurement - Validity - Definition and Types of validity
CRITERION-RELATED VALIDITY
Predictive Validity: we assess the operationalization's ability to predict something it should theoretically be able to predict.
Concurrent Validity: we assess the operationalization's ability to distinguish between groups that it should theoretically be able to distinguish between.
Convergent Validity: we examine the degree to which the operationalization is similar to (converges on) other operationalizations that it theoretically should be similar to.
Discriminant Validity: we examine the degree to which the operationalization is not similar to (diverges from) other operationalizations that it theoretically should not be similar to.

Quality of Measurement - Validity - Definition and Types of validity
CONSTRUCT VALIDITY
[Diagram: the land of theory (the cause construct, the effect construct, and the cause-effect relationship you think about) is mapped onto the land of observation (the program you do, the observations you see, and the program-outcome relationship). Construct validity asks: can we generalize from what we observe to the constructs?]

Quality of Measurement - Validity - Definition and Types of validity
CONSTRUCT VALIDITY
Convergent Validity: to establish convergent validity, you need to show that measures that should be related are in reality related.
Discriminant Validity: to establish discriminant validity, you need to show that measures that should not be related are in reality not related.
Convergent correlations should be greater than discriminant correlations. The two work together: evidence for both is evidence for construct validity.

Quality of Measurement - Validity - Definition and Types of validity
CONSTRUCT VALIDITY - WHY?
The more complex your theoretical model, the more evidence you are providing that you know what you are talking about. Again, convergent correlations should exceed discriminant correlations; together they provide evidence for construct validity.

GREAT! This analysis provided evidence for both convergent and discriminant validity. It shows that the three self-esteem measures reflect the same construct, that the three locus-of-control measures reflect the same construct, and that these two sets of measures reflect two different constructs. BUT: what are the constructs measuring? How do you show that your measures are actually measuring self-esteem or locus-of-control?
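
A minimal Python sketch of this kind of check, using simulated data (the item scores, sample size, and noise levels below are illustrative assumptions, not the lecture's data): measures of the same construct should correlate highly with each other and only weakly with measures of the other construct.

import numpy as np

rng = np.random.default_rng(0)
n = 200

# Simulate two unrelated constructs, each tapped by three measures.
self_esteem = rng.normal(size=n)
locus_of_control = rng.normal(size=n)
se_items = np.column_stack([self_esteem + rng.normal(scale=0.5, size=n) for _ in range(3)])
loc_items = np.column_stack([locus_of_control + rng.normal(scale=0.5, size=n) for _ in range(3)])

def mean_offdiag_corr(x):
    # Average of the off-diagonal correlations among the columns of x.
    r = np.corrcoef(x, rowvar=False)
    return r[np.triu_indices_from(r, k=1)].mean()

def mean_cross_corr(a, b):
    # Average correlation between each column of a and each column of b.
    return np.mean([np.corrcoef(a[:, i], b[:, j])[0, 1]
                    for i in range(a.shape[1]) for j in range(b.shape[1])])

convergent = mean_offdiag_corr(se_items)          # should be high
discriminant = mean_cross_corr(se_items, loc_items)  # should be near zero
print(f"convergent (within self-esteem items):     {convergent:.2f}")
print(f"discriminant (self-esteem vs locus items): {discriminant:.2f}")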

Quality of Measurement - Validity - Assessment of Construct Validity
CONSTRUCT VALIDITY ASSESSMENT
Assessment tools:
- The Nomological Network
- The Multitrait-Multimethod Matrix (MTMM)
- Pattern Matching
- Structural Equation Modeling (SEM)

Quality of Measurement - Validity - Assessment of Construct Validity
THE NOMOLOGICAL NETWORK
Developed by Lee Cronbach and Paul Meehl (1955); nomological network = lawful network.
The network includes:
- a theoretical framework for what you are trying to measure
- an empirical framework for how you are going to measure it
- a specification of the linkages among and between those two frameworks
It is not practical; it is only useful as a philosophical foundation for construct validity.

Quality of Measurement - Validity - Assessment of Construct Validity
THE MULTITRAIT-MULTIMETHOD MATRIX (MTMM)
Developed by Campbell and Fiske (1959) as an attempt to provide a practical methodology; it introduced convergent and discriminant validity as subcategories of construct validity.
A matrix of correlations arranged to facilitate the assessment of construct validity.
Assumes that you have measured each construct (trait) with different methods in a fully crossed design (traits by methods).
A restrictive methodology.

Quality of Measurement - Validity - Assessment of Construct Validity
MTMM EXAMPLE
[Matrix figure: an MTMM for three traits (A, B, C), each measured with three methods (1, 2, 3). The correlations form three kinds of shapes to inspect: diagonals, triangles, and blocks.]

Quality of Measurement - Validity - Assessment of Construct Validity
INTERPRETATION OF THE MTMM
- Coefficients in the reliability diagonal should consistently be the highest in the matrix.
- Coefficients in the validity diagonals should be significantly different from zero and high enough to warrant further investigation.
- A validity coefficient should be higher than the values lying in its column and row in the same heteromethod block (a small check of this rule is sketched below).
- A validity coefficient should be higher than all coefficients in the heterotrait-monomethod triangles.
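
A minimal sketch of the third rule on a small hypothetical MTMM with three traits (A, B, C) and only two methods; all correlation values below are illustrative assumptions, not data from the lecture.

import pandas as pd

labels = ["A1", "B1", "C1", "A2", "B2", "C2"]   # trait A/B/C measured by method 1/2
r = pd.DataFrame(
    [[1.00, 0.30, 0.25, 0.60, 0.20, 0.15],
     [0.30, 1.00, 0.28, 0.22, 0.58, 0.18],
     [0.25, 0.28, 1.00, 0.17, 0.21, 0.55],
     [0.60, 0.22, 0.17, 1.00, 0.32, 0.27],
     [0.20, 0.58, 0.21, 0.32, 1.00, 0.30],
     [0.15, 0.18, 0.55, 0.27, 0.30, 1.00]],
    index=labels, columns=labels)

# Rule checked: each validity coefficient (same trait, different method) should
# exceed the other values in its row and column of the heteromethod block.
for trait in "ABC":
    validity = r.loc[f"{trait}1", f"{trait}2"]
    same_block_others = [r.loc[f"{trait}1", f"{o}2"] for o in "ABC" if o != trait] + \
                        [r.loc[f"{o}1", f"{trait}2"] for o in "ABC" if o != trait]
    verdict = "ok" if validity > max(same_block_others) else "check"
    print(f"trait {trait}: validity {validity:.2f} vs max other {max(same_block_others):.2f} -> {verdict}")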

Quality of Measurement - Validity - Assessment of Construct Validity
PROS AND CONS OF MTMM
+ An operational methodology for assessing construct validity
+ A rigorous framework for assessing construct validity
- Requires a fully crossed measurement design
- Its judgemental nature has hindered adoption
- It is impossible to quantify the degree of construct validity in a study

Quality of Measurement - Validity - Assessment of Construct Validity
PATTERN* MATCHING
Construct validity is assessed as the degree of correspondence between the theoretical pattern and the observed pattern.
*) A pattern is any arrangement of objects or entities; nonrandom and at least potentially describable.

Quality of Measurement - Validity - Assessment of Construct Validity
A PATTERN MATCHING EXAMPLE
[Figure: an example of matching a theoretical pattern against an observed pattern.]

Quality of Measurement - Validity - Assessment of Construct Validity
PROS AND CONS OF PATTERN MATCHING
+ Higher generality and flexibility than MTMM: it does not require that you measure each construct with multiple methods
+ Treats convergence and discrimination as a continuum
+ Estimates the overall construct validity for a set of measures in a specific context
+ Makes you specify what you think about the constructs
- Requires a precise specification of the theory of the constructs
- Requires the quantification of both patterns
- Requires that both patterns be described in matrices with the same constructs

Quality of Measurement - Validity - Threats of Validity
THREATS TO CONSTRUCT VALIDITY
Design threats:
- Inadequate preoperational explication of constructs
- Mono-operation bias: a single version of your independent variable
- Mono-method bias: a single method of measurement
- Interaction of different treatments: you are not able to isolate the effects of your program from other treatments
- Interaction of testing and treatment: the pretest makes participants more sensitive to the treatment
- Restricted generalizability across constructs: side effects of your treatment
- Confounding constructs and levels of constructs: the dosage level changes the results
Social threats:
- Hypothesis guessing: participants base their behavior on what they guess the real purpose of the study is
- Evaluation apprehension: participants are afraid of being evaluated
- Experimenter expectancies: the researcher communicates what the desired outcome is

Quality of Measurement - Reliability - True Score Theory
TRUE SCORE THEORY
Assumes that every observation is composed of two components: the true value plus error.
The error can be divided further into two subcomponents:
- random error
- systematic error

Quality of Measurement - Reliability - True Score Theory
MEASUREMENT ERROR
Random error:
- caused by random factors
- pushes scores up or down randomly
- random errors sum to 0
- adds variability to the data but does not affect the average performance of the group
Systematic error:
- caused systematically by a certain factor
- affects the whole sample
- pushes scores consistently either up or down
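
A minimal simulation sketch of how the two error types behave (all numbers are illustrative assumptions): random error leaves the group mean roughly unchanged but adds variance, while systematic error shifts every score in the same direction.

import numpy as np

rng = np.random.default_rng(1)
true_scores = rng.normal(loc=50, scale=10, size=1000)

random_error = rng.normal(loc=0, scale=5, size=true_scores.size)  # mean ~ 0
systematic_error = 3.0                                            # e.g. a miscalibrated instrument

observed_random = true_scores + random_error
observed_systematic = true_scores + systematic_error

print("true mean:                ", round(true_scores.mean(), 2))
print("mean with random error:   ", round(observed_random.mean(), 2))      # ~ same mean
print("mean with systematic error:", round(observed_systematic.mean(), 2))  # shifted up by ~3
print("variance, true vs random: ", round(true_scores.var(), 1), round(observed_random.var(), 1))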

Quality of Measurement - Reliability - Definition and Types of reliability
DEFINITION
Reliability is the repeatability or consistency of a measure: reliability = Var(T) / Var(X), the proportion of observed-score variance that is true-score variance.
We cannot calculate reliability directly because we cannot measure the true score component.
We can estimate it from the covariance or the correlation between two observations of the same measure, Cov(X1, X2) or Corr(X1, X2).
Reliability ranges between 0 and 1 (0 ≤ r ≤ 1); a measure is perfectly reliable (r = 1) when the random error equals 0.
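
A minimal simulation sketch of this definition (the true scores are only available because the data are simulated; sample size and error variance are illustrative assumptions): the correlation between two observations of the same measure recovers Var(T)/Var(X).

import numpy as np

rng = np.random.default_rng(2)
n = 5000
true = rng.normal(size=n)                   # unobservable true scores T
x1 = true + rng.normal(scale=0.7, size=n)   # first observation  X1 = T + e1
x2 = true + rng.normal(scale=0.7, size=n)   # second observation X2 = T + e2

theoretical = true.var() / x1.var()         # Var(T)/Var(X), known only in simulation
estimated = np.corrcoef(x1, x2)[0, 1]       # what we can actually compute from data

print(f"Var(T)/Var(X): {theoretical:.2f}   Corr(X1, X2): {estimated:.2f}")  # both around 0.67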

Quality of Measurement - Reliability - Definition and Types of reliability
RELIABILITY TYPES
Inter-Rater or Inter-Observer Reliability: the degree to which different raters/observers give consistent estimates of the same phenomenon; used to test how similarly people categorize and score items.
Test-Retest Reliability: the consistency of a measure from one time to another; good tests have less retest variation over time.
Parallel-Forms Reliability: the consistency of two tests constructed in the same way from the same content domain; evaluates different questions that seek to assess the same construct.
Internal Consistency Reliability: the consistency of results across items within a test; evaluates how consistent the results are for different items measuring the same construct within a measure.

Quality of Measurement - Reliability - Assessment of reliability
INTERNAL CONSISTENCY RELIABILITY
Consistency of results across items within a test. Different approaches:
- Split-Half Reliability
- Average Inter-Item Correlation
- Average Item-Total Correlation
- Cronbach's Alpha (α)

Quality of Measurement - Reliability - Assessment of reliability
SPLIT-HALF RELIABILITY
Randomly divide the items that measure the same construct into two sets. Administer the entire instrument to a sample of people and calculate the total score for each half. The correlation between the two total scores is the split-half reliability.
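
A minimal sketch with simulated item responses (the number of people, number of items, and noise level are illustrative assumptions): split the items at random, score each half, and correlate the two totals.

import numpy as np

rng = np.random.default_rng(3)
n_people, n_items = 300, 10
ability = rng.normal(size=(n_people, 1))
items = ability + rng.normal(scale=1.0, size=(n_people, n_items))  # 10 items tapping one construct

half = rng.permutation(n_items)                       # random split into two halves
total_a = items[:, half[:n_items // 2]].sum(axis=1)
total_b = items[:, half[n_items // 2:]].sum(axis=1)

split_half_r = np.corrcoef(total_a, total_b)[0, 1]
print(f"split-half reliability: {split_half_r:.2f}")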

Quality of Measurement - Reliability - Assessment of reliability
AVERAGE INTER-ITEM CORRELATION
Compares the correlations between all pairs of items that measure the same construct and takes the mean of all these paired correlations.
AVERAGE ITEM-TOTAL CORRELATION
Calculates a total score across the items (e.g. a seventh variable for six items), correlates each item with that total, and averages these item-total correlations.
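
A minimal sketch of both quantities for six simulated items measuring one construct (data and sample size are illustrative assumptions).

import numpy as np

rng = np.random.default_rng(4)
ability = rng.normal(size=(300, 1))
items = ability + rng.normal(scale=1.0, size=(300, 6))     # six items, as in the lecture example

r = np.corrcoef(items, rowvar=False)                        # 6 x 6 inter-item correlation matrix
avg_interitem = r[np.triu_indices_from(r, k=1)].mean()      # mean of the 15 pairwise correlations

total = items.sum(axis=1)                                   # the "seventh variable": the total score
item_total = [np.corrcoef(items[:, i], total)[0, 1] for i in range(items.shape[1])]
avg_item_total = np.mean(item_total)

print(f"average inter-item correlation: {avg_interitem:.2f}")
print(f"average item-total correlation: {avg_item_total:.2f}")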

Quality of Measurement - Reliability - Assessment of reliability
CRONBACH'S ALPHA
Conceptually equivalent to the average of all possible split-half correlations, but not calculated that way. Computation is quick, and it is the most widely used estimate of internal consistency.
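
A minimal sketch using the standard computational formula, α = k/(k-1) · (1 − Σ var(item_i) / var(total)), on the same kind of simulated item data as above (data illustrative).

import numpy as np

rng = np.random.default_rng(5)
ability = rng.normal(size=(300, 1))
items = ability + rng.normal(scale=1.0, size=(300, 6))

def cronbach_alpha(x):
    # alpha = k/(k-1) * (1 - sum of item variances / variance of the total score)
    k = x.shape[1]
    item_vars = x.var(axis=0, ddof=1)
    total_var = x.sum(axis=1).var(ddof=1)
    return k / (k - 1) * (1 - item_vars.sum() / total_var)

print(f"Cronbach's alpha: {cronbach_alpha(items):.2f}")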

Levels of Measurement
LEVELS OF MEASUREMENT
The level of measurement helps in deciding how to interpret the data from a variable and defines the kind of statistical analysis that is appropriate. Data can be classified by different levels of precision, i.e. levels of measurement; the higher, the better.

Levels of Measurement
LEVELS OF MEASUREMENT
Nominal: attributes are just named; no ordering of the attributes (categories) is implied. Example: jersey numbers in basketball.
Ordinal: attributes can be rank-ordered, but the distances between attributes do not have any meaning. Example: grades at school.
Interval: the distance between attributes does have a meaning. Example: temperature (in Fahrenheit).
Ratio: there is an absolute zero that is meaningful, so you can construct a meaningful fraction (or ratio). Example: weight.
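
A minimal sketch pairing each level with the summaries that are meaningful for it; the example variables come from the list above, and the pairings follow the standard convention.

levels = {
    "nominal":  {"example": "jersey number",     "meaningful": ["counts / mode"]},
    "ordinal":  {"example": "school grade",      "meaningful": ["counts / mode", "median / rank order"]},
    "interval": {"example": "temperature (F)",   "meaningful": ["counts / mode", "median", "mean / differences"]},
    "ratio":    {"example": "weight",            "meaningful": ["counts / mode", "median", "mean", "ratios (twice as heavy)"]},
}

for level, info in levels.items():
    print(f"{level:8s}  e.g. {info['example']:16s}  meaningful statistics: {', '.join(info['meaningful'])}")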

Quality of Measurement
VALIDITY AND RELIABILITY
Validity: the quality of the operationalization.
Reliability: the quality of the measurement.

EXERCISES

EXERCISE
What is the level of measurement of each of the following variables?
1. Gender
2. Hair colour
3. Pulse rate (in bpm)
4. Body temperature
5. Team number
6. Shoe size

HOW DO I REMEMBER ALL THIS? IT'S EASY!
The 5 C's of Validity
The DPEFT Validity
NoNeMuMuPaMaSEM Construct Validity
TIROPI Reliability
SACA Reliability Assessment
NOIR Levels of Measurement

THE 5 C'S OF VALIDITY
Concurrent validity
Construct validity
Content validity
Convergent validity
Criterion-related validity

THE DPEFT VALIDITY
Discriminant validity
Predictive validity
External validity
Face validity
Translation validity

NONEMUMUPAMASEM CONSTRUCT VALIDITY ASSESSMENT TOOLS
The Nomological Network
The Multitrait-Multimethod Matrix (MTMM)
Pattern Matching
Structural Equation Modeling (SEM)

TIROPI RELIABILITY
Test-Retest Reliability
Inter-Rater or Inter-Observer Reliability
Parallel-Forms Reliability
Internal Consistency Reliability

SACA RELIABILITY ASSESSMENT
Split-half correlation
Average inter-item correlation
Cronbach's alpha
Average item-total correlation

NOIR LEVELS OF MEASUREMENT
Nominal
Ordinal
Interval
Ratio