
Validity refers to the accuracy of a measure. A measurement is valid when it measures what it is supposed to measure and performs the functions that it purports to perform. Does an indicator accurately measure the variable that it is intended to measure?

Types of validity: content, face, predictive, concurrent, construct.

Content validity is determined by expert judgements of the appropriateness of the contents of a measure.

Content validity refers to the degree to which one has representatively sampled from the relevant domain of meaning. This type of validity receives the most attention in the construction of achievement & proficiency measures in psychology & education.

Consider test time & types of items. Suppose there are 20 topics I wish to cover and I decide to use 35 multiple-choice questions in a 50-minute period. I might write one question on each topic & distribute the remaining questions among the topics I consider most important.

Face validity is determined by judgements made by the researcher and based on surface appearance.

Criterion-related validity has two basic types: predictive and concurrent.

The criterion is the standard by which your measure is being judged or evaluated.

Predictive validity is determined by correlating test scores with criterion scores obtained after examinees have had a chance to perform what is predicted by the test. A measure is validated with reference to future standing on the criterion variable.

The criterion is a measure of what the test is designed to predict.

The criterion is measured after examinees have had a chance to exhibit the predicted behavior.

The purpose of an employment test is to predict success on the job. The most appropriate test of its validity is predictive validity - to what extent does the test predict what it is supposed to predict?

Give the test to applicants for a position. For all those hired, compare their test scores to supervisors' ratings after 6 months on the job. The supervisors' ratings are the criterion. If employees' test scores correspond closely to the supervisors' ratings, the predictive validity of the test is supported.
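The procedure above amounts to computing a validity coefficient: the correlation between test scores and the criterion. A minimal sketch in Python (the applicant scores and supervisor ratings below are invented for illustration):

```python
from statistics import mean, pstdev

def pearson_r(x, y):
    """Pearson product-moment correlation between two score lists."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y)) / len(x)
    return cov / (pstdev(x) * pstdev(y))

# Hypothetical data: six applicants' employment-test scores and their
# supervisors' ratings after six months on the job (the criterion).
test_scores = [62, 75, 81, 58, 90, 70]
ratings = [3.1, 3.8, 4.0, 2.9, 4.6, 3.5]

# A strong positive correlation supports the test's predictive validity.
validity_coefficient = pearson_r(test_scores, ratings)
```

Note that in practice the coefficient can only be computed for those actually hired, which restricts the range of test scores and tends to understate the true correlation.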

Concurrent validity: correlate test scores with criterion scores obtained at about the same time. It is the ability of a measure to indicate an individual's present standing on the criterion variable.

The criterion is an independent measure of the same trait that the test is designed to measure.

The criterion is measured at about the same time that the test is administered.

A multiple-choice test of reading achievement (which is designed to measure achievement at the time of testing, not to predict any future behavior) might be validated by comparing scores on the test with teachers' ratings of students' reading abilities. The teachers' ratings are the criterion.

Validate a measure of political conservatism by correlating it with reported voting behavior. Travis Hirschi validated his index of self-reported delinquency by comparing responses to official police records on the respondents.

Construct validity: hypothesize a relationship between the test scores and scores on another variable. Then test the hypothesis.

A construct stands for a collection of related behaviors that are associated in a meaningful way. Depression is a construct that stands for a personality trait that is manifested by behaviors such as lethargy, loss of appetite, difficulty in concentrating on tasks, and so forth.

The construct itself does not have a physical being outside of its indicators. Thus, we infer its existence by observing the collection of related indicators. The emphasis on a collection is crucial because any one sign may be associated with several constructs.

Loss of appetite is a sign of depression. But loss of appetite may also be a sign of anxiety, or fear, or falling in love. Thus, loss of appetite is indicative of depression only when it is found in association with other indicators of depression.

Begin by hypothesizing about how the construct that the measure is designed to measure should relate to other variables.

Hypothesize that students who are very depressed will earn lower grades & be more likely to drop out of college than students who are less depressed. Test the hypothesis on a sample of students. Suppose we find no relationship between scores on the depression measure and success in college.

Either (1) the measure lacks validity for measuring depression; that is, it measures something else that is not related to success in college, or (2) the hypothesis is wrong. If we continue to have faith in the hypothesis, we will have to conclude that the empirical evidence argues against the validity of the measure.

Suppose instead that we do find the predicted relationship. Either (1) the depression measure is, to some degree, valid for measuring depression, or (2) the depression index measures a variable other than depression that is also related to success in college. This other variable could be many things.

Maybe the index is heavily laden with signs of depression that are also signs of anxiety, such that it is more a measure of anxiety than of depression. Since debilitating anxiety may lead to failure in college, scores on the scale may relate to success in college because it measures anxiety, not depression.

Reliability: is the indicator measuring something consistently & dependably? Do repeated applications of the operational definition under similar conditions yield consistent results?

Types of reliability: test-retest, interitem (internal consistency), alternate-forms, interobserver.

You apply a measure to a sample of people &, at a somewhat later time, apply the same measure to the same people again. The result will be two sets of scores. Correlate the scores; you should find a high degree of association.

If the interval is too short, respondents may remember & simply repeat the responses they gave the first time, thereby inflating the reliability estimate. If the interval is too long, real change in behavior may have occurred between the two tests.

Appropriate when multiple items are used to measure a concept. The concern is with assessing the reliability of a set of measures of the same concept by comparing all possible combinations of these items. The degree to which this has been achieved is assessed by a statistic (e.g., Cronbach's alpha) based on the average inter-item correlation for the set of items.
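Cronbach's alpha can be sketched directly from its definition: k/(k-1) times (1 minus the ratio of the summed item variances to the variance of respondents' total scores). The item scores below are invented for illustration:

```python
from statistics import pvariance

def cronbach_alpha(items):
    """Cronbach's alpha for a set of item-score columns.

    `items` is a list of k lists, one per item, each holding the
    scores that every respondent gave on that item.
    """
    k = len(items)
    totals = [sum(scores) for scores in zip(*items)]  # each respondent's total
    item_var = sum(pvariance(item) for item in items)
    return (k / (k - 1)) * (1 - item_var / pvariance(totals))

# Hypothetical 4-item depression scale answered by five respondents.
item_scores = [
    [2, 4, 3, 5, 1],  # item 1
    [3, 4, 3, 5, 2],  # item 2
    [2, 5, 4, 4, 1],  # item 3
    [3, 3, 3, 5, 2],  # item 4
]
alpha = cronbach_alpha(item_scores)
```

When the items all covary strongly, as in this invented example, alpha approaches 1; uncorrelated items drive it toward 0.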

Compare the responses of subjects to variations on a particular survey question. The variations should be extremely close, and the two sets of responses should be quite similar. Split-halves reliability is similar to this.
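Split-halves reliability can be sketched by splitting the items into two halves, correlating respondents' half-scores, and stepping the result up to full test length with the Spearman-Brown formula, 2r / (1 + r). The item scores below are invented:

```python
from statistics import mean, pstdev

def pearson_r(x, y):
    """Pearson product-moment correlation between two score lists."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y)) / len(x)
    return cov / (pstdev(x) * pstdev(y))

def split_half_reliability(items):
    """Correlate odd-numbered and even-numbered item halves, then apply
    the Spearman-Brown correction to estimate full-length reliability."""
    odd = [sum(s) for s in zip(*items[0::2])]   # each respondent's odd-half total
    even = [sum(s) for s in zip(*items[1::2])]  # each respondent's even-half total
    r = pearson_r(odd, even)
    return (2 * r) / (1 + r)  # Spearman-Brown step-up to full test length

# Hypothetical 4-item scale answered by five respondents.
item_scores = [
    [2, 4, 3, 5, 1],
    [3, 4, 3, 5, 2],
    [2, 5, 4, 4, 1],
    [3, 3, 3, 5, 2],
]
reliability = split_half_reliability(item_scores)
```

The odd/even split is only one of many possible splits; alpha (above in spirit) can be seen as the average over all of them.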

Using a particular measure, two or more different observers should give similar ratings of the same phenomenon. If they do, the interobserver reliability of the measure is supported.
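Interobserver reliability is often summarized as simple percent agreement, or as Cohen's kappa, which corrects agreement for chance. A sketch with invented ratings from two observers coding the same six cases:

```python
from collections import Counter

def percent_agreement(r1, r2):
    """Share of cases where the two observers gave the same rating."""
    return sum(a == b for a, b in zip(r1, r2)) / len(r1)

def cohens_kappa(r1, r2):
    """Cohen's kappa: agreement corrected for chance agreement."""
    n = len(r1)
    p_obs = percent_agreement(r1, r2)
    c1, c2 = Counter(r1), Counter(r2)
    # Chance agreement from each observer's marginal rating frequencies.
    p_chance = sum(c1[k] * c2[k] for k in c1) / (n * n)
    return (p_obs - p_chance) / (1 - p_chance)

# Hypothetical codings of six observed behaviors by two observers.
obs1 = ["low", "low", "high", "high", "low", "high"]
obs2 = ["low", "low", "high", "low", "low", "high"]

agreement = percent_agreement(obs1, obs2)
kappa = cohens_kappa(obs1, obs2)
```

Kappa is lower than raw agreement here because with only two categories the observers would agree fairly often by chance alone.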

Some social researchers have claimed that we can assume members of certain minority groups are more active in criminal behavior because they are more likely to be arrested for such activities. Is this claim more easily attacked on grounds of reliability or validity?

Arrest rates are not necessarily indicative of rates of involvement in illegal behavior. They can be affected by police bias, location, and other demographic factors. Therefore, there are problems with the validity of arrest rates as a measure of certain criminal behaviors and for certain populations.

The National Crime Victimization Survey (NCVS) was developed to provide an alternative source of data on crime. The NCVS is used to assess the validity of arrest rates & reports as measures of crime. Convergence between the two in terms of relevant factors supports official measures as more valid indicators of certain crimes (e.g., homicide) than of others (e.g., burglary).

Joe and Mary were interested in whether male or female college students tended to have more sexual partners. They each stationed themselves at different entrances to the college cafeteria and stopped every tenth person who was entering the building and asked them how many people they had had sexual intercourse with. When they examined their results, they found that the men that Mary interviewed had had fewer partners than the men that Joe interviewed. Is this a problem with validity or reliability?

It looks like when Mary conducts the interviews, males under-report their number of partners, and when Joe conducts the interviews, males over-report. When two people applying the same measure get different results, this is a reliability problem.