Psychometrics, Measurement Validity & Data Collection


Overview
- Psychometrics & Measurement Validity: measurement & constructs; kinds of items and their combinations; properties of a good measure
- Data Collection: observational, self-report, and trace data; primary vs. archival data; data collection settings

Psychometrics (psychological measurement) is the process of assigning values to represent the amounts and kinds of specified behaviors or attributes, in order to describe participants. We do not measure participants; we measure specific behaviors or attributes of a participant.

Psychometrics is the centerpiece of empirical psychological research and practice. All data result from some form of measurement, and for those data to be useful we need measurement validity. The better the measurement validity, the better the data, and the better the conclusions of the psychological research or application.

Most of the behaviors and characteristics we want to study are constructs. They are called constructs because most of what we care about as psychologists are not physical measurements, such as time, length, height, weight, pressure & velocity. Rather, the stuff of psychology (performance, learning, motivation, anxiety, social skills, depression, wellness, etc.) consists of things that don't physically exist: we have constructed them to give organization and structure to our study of behaviors and characteristics. Essentially all of the things we psychologists research and apply, both as causes and effects, are attributive hypotheses with different levels of support and acceptance!

What are the different types of constructs we use? The most commonly discussed types are:
- Achievement: performance broadly defined (judgments), e.g., scholastic skills, job-related skills, research DVs, etc.
- Attitude/Opinion: how things should be (sentiments), e.g., polls, product evaluations, etc.
- Personality: characterological attributes (keyed sentiments), e.g., anxiety, psychoses, assertiveness, etc.

There are other types of measures that are often used:
- Social skills: achievement or personality?
- Aptitude: how well someone will perform after they are trained and gain experience, but measured before the training & experience; some combination of achievement, personality, and preferences.
- IQ: is it achievement (things learned), or is it aptitude for academics, career, and life?

Each question/behavior is called an item. There are two kinds of items, objective items vs. subjective items. "Objective" does not mean true, real, or accurate, and "subjective" does not mean made up or inaccurate; these are names for how the observer/interviewer/coder transforms a participant's responses into data.
- Objective items: no evaluation, judgment, or decision is needed; either the response = the data, or a mathematical transformation of it does. E.g., multiple choice, true/false, matching, fill-in-the-blank.
- Subjective items: the response must be evaluated and a decision or judgment made about what the data value should be, using content coding, diagnostic systems, or behavioral taxonomies. E.g., essays, interview answers, drawings, facial expressions.

Some more language: a collection of items is called many things, e.g., survey, questionnaire, instrument, measure, test, or scale. Three kinds of item collections you should know:
- Scale (test): all items are put together to get a single score
- Subscale (subtest): item sets are put together to get multiple separate scores
- Survey: each item gives a specific piece of information
Most questionnaires, surveys, or interviews are a combination of all three.

Reverse Keying
We want respondents to carefully read and separately respond to each item of our scale/test. One thing we do is write the items so that some of them are backwards, or reversed. Consider these items from a depression measure, each rated disagree 1 2 3 4 5 agree:
1. It is tough to get out of bed some mornings.
2. I'm generally happy about my life.
3. I sometimes just want to sit and cry.
4. Most of the time I have a smile on my face.
If the person is depressed, we would expect them to give a fairly high rating on items 1 & 3, but a low rating on items 2 & 4. Before aggregating these items into a composite scale or test score, we would reverse key items 2 & 4 (1=5, 2=4, 4=2, 5=1).

Desirable Properties of Psychological Measures
- Interpretability of individual and group scores
- Population norms
- Validity
- Reliability
- Standardization

Standardization
- Administration: the test is given the same way every time (who administers the instrument; specific instructions, order of items, timing, etc.). This varies greatly, from a multiple-choice classroom test (hand it out) to the WAIS (a 100+ page administration manual).
- Scoring: the test is scored the same way every time (who scores the instrument; correct, partial, and incorrect answers; points awarded, etc.). This also varies greatly, from a multiple-choice test (fill in the sheet) to the WAIS (a 200+ page scoring manual).
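The reverse-keying arithmetic above can be sketched in a few lines of Python. This is a minimal illustration, not part of the original notes: the 1-5 rating scale and the reversed items (2 & 4) come from the depression example, while the function names and the sample respondent are hypothetical.

```python
# Reverse-key selected items on a 1-5 Likert scale, then sum a composite score.
# For a scale with minimum 1 and maximum 5, reversing is: new = (1 + 5) - old.

SCALE_MIN, SCALE_MAX = 1, 5
REVERSED_ITEMS = {2, 4}  # items 2 & 4 are worded opposite to depression

def reverse_key(response):
    """Map 1->5, 2->4, 3->3, 4->2, 5->1."""
    return (SCALE_MIN + SCALE_MAX) - response

def composite_score(responses):
    """responses: dict of item number -> raw 1-5 rating."""
    total = 0
    for item, raw in responses.items():
        total += reverse_key(raw) if item in REVERSED_ITEMS else raw
    return total

# A hypothetical depressed respondent: high on items 1 & 3, low on 2 & 4.
answers = {1: 5, 2: 1, 3: 4, 4: 2}
print(composite_score(answers))  # 5 + (6-1) + 4 + (6-2) = 18
```

Without the reverse-keying step, this respondent's high ratings on 1 & 3 and low ratings on 2 & 4 would partly cancel out, hiding the depression the scale is meant to detect.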

Reliability (Agreement or Consistency)
- Inter-rater (inter-observer) reliability: do multiple observers/coders score an item the same way? Especially important whenever using subjective items.
- Internal reliability: do the items measure a central thing, so that the items will add up to a meaningful score?
- Test-retest reliability: if the test is repeated, does it give the same score each time (when the characteristic/behavior hasn't changed)?

Validity: Non-statistical Types
- Face validity: does the respondent know what is being measured? This can be good or bad, depending on the construct and population; members of the target population are asked what they think is being tested.
- Content validity: do the items cover the desired content domain? Especially important when a test is designed to have low face validity, e.g., tests of honesty used for hiring decisions. Content validity is simpler for more concrete constructs: it is easier for math experts to agree about an algebra item than for psychological experts to agree about a depression item. Content validity is not tested for; rather, it is assured by having the domain and population expertise to write good items, and by having other content and population experts evaluate the items.

Validity: Statistical Types
Criterion-related validity: do test scores correlate with the criterion behavior or attribute they are trying to estimate (e.g., do ACT scores correlate with GPA)? The types are labeled for when the test and the criterion are measured:

  When the criterion behavior occurs:   Before        Now          Later
  Test taken now:                       postdictive   concurrent   predictive

- Concurrent: a test taken now replaces a criterion measured now, e.g., a written driver's test instead of a road test.
- Predictive: a test taken now predicts a criterion measured later, e.g., the GRE taken in college predicts grades in grad school.
- Postdictive: a test taken now predicts a criterion measured earlier, e.g., almost every self-report measure (ever).
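Internal reliability is commonly quantified with Cronbach's alpha, which rises as the items covary, i.e., as they measure a central thing. The notes don't name a specific index, so this is a supplementary sketch; the respondent data are made up for illustration.

```python
# Cronbach's alpha: a common index of internal reliability.
# alpha = k/(k-1) * (1 - sum(item variances) / variance of total scores)

def variance(xs):
    """Sample variance (n - 1 denominator)."""
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)

def cronbach_alpha(items):
    """items: list of per-item score lists, one entry per respondent,
    all the same length. Items should already be reverse-keyed."""
    k = len(items)
    totals = [sum(scores) for scores in zip(*items)]       # each respondent's total
    item_var_sum = sum(variance(scores) for scores in items)
    return (k / (k - 1)) * (1 - item_var_sum / variance(totals))

# Four (already reverse-keyed) items answered by five hypothetical respondents:
items = [
    [4, 5, 2, 3, 4],
    [4, 4, 1, 3, 5],
    [5, 4, 2, 2, 4],
    [3, 5, 1, 3, 4],
]
print(round(cronbach_alpha(items), 2))
```

Here the items move together across respondents (respondent 3 is low on everything, respondent 2 high on everything), so alpha comes out high; items that did not measure a central thing would drag it down.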

Validity: Statistical Types (continued)
Construct validity refers to whether a test/scale measures the theorized construct that it purports to measure. Attention to construct validity reminds us that many of the characteristics and behaviors we study are constructs. Construct validity is assessed as the extent to which a measure correlates as it should with other measures of similar and different constructs. Statistically, construct validity has two parts:
- Convergent validity: the test correlates with other measures of similar constructs
- Divergent validity: the test isn't correlated with measures of other, different constructs

Evaluate this measure of depression (correlations among the new depression measure and several other measures):

             NewDep  OldDep1  OldDep2   Anx   Happy  PhyHlth
  OldDep1      .61
  OldDep2      .59      .60
  Anx          .25      .30      .28
  Happy       -.59     -.61     -.56   -.75
  PhyHlth      .16      .18      .22    .45   -.35
  FakBad       .15      .14      .21    .10   -.21     .31

The first column is what matters: the new measure shows convergent validity through its substantial correlations with the two established depression measures (.61 and .59), and divergent validity through its weaker correlations with the measures of different constructs (Anx .25, PhyHlth .16, FakBad .15). The strong negative correlation with Happy (-.59) fits as well, since happiness is conceptually opposite to depression.

Population Norms
In order to interpret a score from an individual or group, you must know what scores are typical for that population. This requires:
- a large representative sample of the target population (preferably random, researcher-selected & stratified)
- solid standardization, both administrative & scoring
- great inter-rater reliability, if the items are subjective
The result? A scoring distribution for the population, which lets us identify individual scores as normal, high, or low, and lets us identify cutoff scores to place individual scores into importantly different populations and subpopulations.
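Using norms and cutoffs to label an individual score can be sketched as follows. The norm mean, SD, and the plus/minus 1 SD cutoffs are hypothetical illustrations, not values from the notes.

```python
# Interpreting an individual score against population norms: once a norming
# sample gives us the population's scoring distribution, we can locate any
# raw score within it (as a z-score) and apply cutoffs to classify it.

NORM_MEAN = 50.0   # from a (hypothetical) large, representative norming sample
NORM_SD = 10.0

def z_score(raw):
    """How many standard deviations the raw score is from the norm mean."""
    return (raw - NORM_MEAN) / NORM_SD

def classify(raw, low_cut=-1.0, high_cut=1.0):
    """Label a score as low / normal / high relative to z-score cutoffs."""
    z = z_score(raw)
    if z < low_cut:
        return "low"
    if z > high_cut:
        return "high"
    return "normal"

for raw in (35, 52, 68):
    print(raw, round(z_score(raw), 1), classify(raw))
```

In practice the cutoffs are chosen substantively (e.g., the score above which a subpopulation differs importantly), not always at one standard deviation; the mechanics of comparing a score to the norming distribution are the same.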

Desirable Properties of Psychological Measures (summary)
- Interpretability of individual and group scores: population norms, scoring distribution & cutoffs
- Validity: face, content, criterion-related & construct
- Reliability: inter-rater, internal consistency & test-retest
- Standardization: administration & scoring

All data are collected using one of three major methods:
- Behavioral observation data: studies the actual behavior of participants; can require elaborate data collection & coding techniques; the quality of the data can depend upon secrecy (naturalistic or disguised participant observation) or rapport (habituation or desensitization).
- Self-report data: allows us to learn about non-public behavior (thoughts, feelings, intentions, personality, etc.); adds the structure and completeness of a prepared set of questions; participation and data quality/honesty depend upon rapport.
- Trace data: limited to studying behaviors that leave a trace; least susceptible to participant dishonesty; can require elaborate data collection & coding techniques.

Behavioral Observation Data Collection
It is useful to discriminate among different types of observation:
- Naturalistic observation: participants don't know that they are being observed; this requires camouflage or distance, and researchers can be VERY creative & committed!
- Participant observation (which has two types): participants know someone is there, and the researcher is a participant in the situation.
  - Undisguised: the someone is an observer who is in plain view (maybe the participants even know they're collecting data).
  - Disguised: the observer looks like someone who belongs there.
Observational data collection can be part of experiments (with random assignment & IV manipulation) or of non-experiments!

Self-Report Data Collection
We need to discriminate among various self-report data collection procedures: mail questionnaires, computerized questionnaires, group-administered questionnaires, personal interviews, phone interviews, group interviews (focus groups), and journals/diaries. In each of these, participants respond to a series of questions prepared by the researcher. Self-report data collection can be part of experiments (with random assignment & IV manipulation) or of non-experiments!

Trace Data
Trace data are collected from the marks & remains left behind by the behavior we are trying to measure. There are two major types of trace data:
- Accretion: when behavior adds something to the environment, e.g., trash, noseprints, graffiti
- Deletion: when behavior wears away the environment, e.g., wear on steps or walkways, shiny places

Garbageology: the scientific study of society based on what it discards, its garbage! Researchers looking at family eating habits collected data from several thousand families about eating take-out food. Self-reports were that people ate take-out food about 1.3 times per week. These data seemed at odds with economic data obtained from fast-food restaurants, which suggested 3.2 times per week. The solution: they dug through the garbage cans of several hundred families before pick-up for 3 weeks and estimated that about 2.8 take-out meals were eaten each week.

Data Sources
It is useful to discriminate between two kinds of data sources:
- Primary data sources: sampling, questions, and data collection are completed for the purpose of this specific research. The researcher has maximal control over the planning and completion of the study, but at substantial time and cost.
- Archival data sources (AKA secondary analysis): sampling, questions, and data collection were completed for some previous research, or as standard practice, and the data are later made available to the researcher for secondary analysis. Often quicker and less expensive, but not always the data you would have collected if you had greater control.

Is each primary or archival data?
- Collect data to compare the outcomes of those patients I've treated using Behavioral vs. Cognitive interventions: primary
- Go through past patient records to compare Behavioral vs. Cognitive interventions: archival
- Purchase copies of sales receipts from a store to explore shopping patterns: archival
- Ask shoppers what they bought to explore shopping patterns: primary
- Use the data from someone else's research to conduct a pilot study for your own research: archival
- Use a database available from the web to perform your own research analyses: archival
- Collect new survey data using the web: primary

Data Collection Settings
This is the same thing we discussed as an element of external validity. Any time we collect data, we have to collect it somewhere, and there are three general categories of settings:
- Field: usually defined as where the participants naturally behave. Helps external validity, but can make control (internal validity) more difficult (random assignment and manipulation are possible with some creativity).
- Laboratory: helps with control (internal validity), but can make external validity more difficult (remember ecological validity?).
- Structured setting: a natural-appearing setting that promotes natural behavior while increasing the opportunity for control; an attempt to blend the best attributes of field and laboratory settings!

Identify each as laboratory, field, or structured:
- Study of turtle food preference conducted in Salt Creek: field
- Study of turtle food preference conducted with turtles in 10-gallon tanks: laboratory
- Study of turtle food preference conducted in a 13,000-gallon cement pond with natural plants, soil, rocks, etc.: structured
- Study of jury decision making conducted in 74 Burnett, having participants read a trial transcript: laboratory
- Study of jury decision making with mock juries conducted in the mock trial room at the Law College: structured
- Study of jury decision making conducted with real jurors at the Court Building: field