In this module we will cover Correlation and Validity.



A correlation coefficient is a statistic that is often used to estimate measurement properties such as validity and reliability. You will learn about the strength of correlation and the direction of correlation. Validity is a crucial aspect of test quality. You will learn three types of validity: content validity, criterion validity, and construct validity.

Correlation is a statistical procedure used to measure the relationship, or association, between two variables. Correlations allow us to answer questions such as: Are athletes poor scholars? Or: Do students with high grades in high school tend to get high grades in college? You may ask: Why do we need correlation? After all, this is a measurement course, not a statistics course. However, correlation is used heavily in certain aspects of measurement, including the estimation of validity and reliability.

A correlation statistic measures the relationship between two variables. There are two aspects to this association: the strength of the relationship and the direction of the relationship. Correlation coefficients range from negative 1 to positive 1.

The letter r is used to denote the correlation coefficient. The closer a coefficient gets to -1 or +1, the stronger the relationship between the two variables.

Scatterplots can be used to visually represent the association between two variables. They can also show the strength and direction of the relationship. The strength of the relationship shows how accurately a prediction can be made from the test score to the criterion. The direction of the relationship can be positive or negative. A positive association between two variables means that as one increases, the other increases, whereas a negative relationship means that as one increases, the other decreases.

Here is an example that uses a reading readiness test (denoted by X) to predict future reading skills (denoted by Y). We collected ten students' scores on the reading readiness test and on future reading skills, so each student has two scores. These scores can be transferred into a two-dimensional graph known as a scatterplot. The X axis represents reading readiness scores and the Y axis represents future reading skills scores. Each dot in the scatterplot represents one student's pair of scores. For instance, Sheila has two scores of (2 and 1); here is the corresponding dot. The rest of the students also have corresponding dots in the scatterplot. Based on a formula and the data we have, the correlation coefficient can be calculated. The correlation coefficient in this case is 0.529, which is a positive and moderate correlation.
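The calculation behind that coefficient can be sketched in a few lines of Python. This is a minimal illustration using made-up score pairs, since the transcript does not reproduce the full data for the ten students:

```python
def pearson_r(x, y):
    """Pearson product-moment correlation between two lists of scores."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Covariance term: how the two sets of deviations move together
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    # Spread of each variable around its own mean
    sd_x = sum((a - mean_x) ** 2 for a in x) ** 0.5
    sd_y = sum((b - mean_y) ** 2 for b in y) ** 0.5
    return cov / (sd_x * sd_y)

# Hypothetical readiness (X) and future reading (Y) scores
readiness = [2, 4, 5, 7, 8]
reading = [1, 5, 4, 6, 9]
r = pearson_r(readiness, reading)  # positive r: higher X tends to go with higher Y
```

The coefficient comes out positive here because, as in the reading example, higher readiness scores tend to accompany higher future reading scores.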

Generally speaking, the relationship between two variables can be classified into four categories: a positive correlation, a negative correlation, no relationship, and a curvilinear relationship.

As mentioned earlier, a positive correlation suggests that as scores on Test A increase, scores on Test B increase. This scatterplot shows the positive relationship between Tests A and B. There is a pattern among the dots from the lower left to the upper right.

A negative correlation suggests that as scores on Test A increase, scores on Test B decrease. A popular example is that as the number of absences increases, GPA decreases. This scatterplot shows this negative correlation between GPA and the number of absences. There is a pattern among the dots from the upper left to the lower right.

No relationship means that the two variables are not associated. The correlation coefficient would be close to zero. This scatterplot represents no relationship between Tests A and B. As you can see, there is no pattern among these dots.

A curvilinear relationship between two variables suggests that as scores on Test A increase, scores on Test B first increase, then decrease. There is a curvilinear pattern among the dots in this scatterplot.

Another thing we should know about is the restriction of range problem. When we look at the entire scatterplot, there is a clear positive correlation between the tests. But when a truncated or restricted range of scores is examined, such as high scores on both tests, the relationship becomes weak and the direction is unclear. A typical example of this situation is exploring the relationship between students' scores on the GRE and their first-year graduate GPA. This is a restricted-range case because most of the students who are admitted into graduate schools have higher GRE scores.
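Restriction of range can be demonstrated numerically. The sketch below uses synthetic data (not from the transcript): over the full sample the Pearson coefficient is strong, but keeping only the highest-scoring cases, as happens with admitted students, shrinks it noticeably:

```python
def pearson_r(x, y):
    """Pearson product-moment correlation between two lists of scores."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)

# Synthetic scores: Y tracks X with small alternating noise
x_full = list(range(20))
y_full = [xi + (2 if xi % 2 == 0 else -2) for xi in x_full]
r_full = pearson_r(x_full, y_full)  # strong positive correlation

# Restrict to the top of the range (x >= 15), mimicking an admitted group
x_top = x_full[15:]
y_top = y_full[15:]
r_restricted = pearson_r(x_top, y_top)  # noticeably weaker correlation
```

The same noise that barely disturbs the full-range trend dominates once the range of X is narrow, which is why GRE-to-GPA correlations computed only on admitted students understate the true relationship.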

There are several different correlation coefficients. Two types are commonly used: the Pearson product-moment correlation and the rank-difference correlation. The Pearson product-moment correlation uses students' raw, continuous scores and requires a sample of at least 30. For the rank-difference correlation, the students' raw scores are ordered, and then the ranks, or ordered numbers, are used. This coefficient allows us to measure the association for small sample sizes. For information on the Pearson product-moment correlation, refer to Appendix B of your text. For information on the rank-difference correlation, refer to Chapter 14.
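The rank-difference coefficient (often called Spearman's rho) can be sketched with the classic formula rho = 1 - 6*sum(d^2) / (n*(n^2 - 1)), where d is the difference between each pair of ranks. This is an illustrative sketch that assumes no tied scores:

```python
def ranks(values):
    """Rank scores from 1 (lowest) to n; assumes no ties."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    r = [0] * len(values)
    for rank, i in enumerate(order, start=1):
        r[i] = rank
    return r

def rank_difference_corr(x, y):
    """Rank-difference (Spearman) correlation, suitable for small samples."""
    n = len(x)
    d_sq = sum((a - b) ** 2 for a, b in zip(ranks(x), ranks(y)))
    return 1 - (6 * d_sq) / (n * (n ** 2 - 1))

# Five students' scores on two short tests (hypothetical data)
test_a = [12, 47, 35, 20, 58]
test_b = [3, 30, 21, 11, 46]
rho = rank_difference_corr(test_a, test_b)  # ranks agree exactly, so rho = 1.0
```

Because only the ordering of the scores matters, this coefficient is less sensitive to the raw score scale than the Pearson coefficient, which is part of why it suits small samples.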

Caution: correlation does not imply causality. You cannot conclude that one thing is the result of another just because two distributions are related. For instance, there may be a high correlation between ice cream sales and the number of drownings, but it would be odd to say that eating ice cream causes drowning. The more reasonable explanation is that both are affected by temperature.

We measure validity because it provides information about the usefulness of a test in a particular situation. We always make inferences from test scores, so it is important to make sure that those inferences are appropriate. That is why we care about validity.

Recall that validity and reliability are two very important aspects of test quality. Validity is the extent to which a test measures what it is supposed to measure; if a test is not measuring what it is supposed to measure, it is rather pointless at best. Reliability is the extent to which a test score is consistent. If test scores are not reliable, they do not give us good information about a student's true achievement level.

There are three popular types of validity: content validity, criterion-related validity, and construct validity. Criterion-related validity can be divided into predictive criterion validity and concurrent criterion validity.

Content validity can indicate whether the instruction matches the objectives, whether the test blueprint matches the instruction or objectives, and whether the test items match the test blueprint. To increase content validity, each item should be linkable to the instruction or objectives, and there should be a proper balance of content and a proper balance of cognitive process levels.

Content validity is relevant to achievement testing. To be content valid, the test should be representative of the topics and cognitive processes covered in the course unit, and must be consistent with the objectives and the instruction. Evidence for content validity is obtained by logically analyzing the course objectives and instruction, and comparing these to the test.

To investigate criterion validity, test scores are correlated with a criterion variable. Criterion validity is established through an empirical process: first gather data on the test and the criterion of interest, then create a scatterplot and compute the correlation between the two sets of scores. The computed correlation represents the criterion validity.

There are two types of criterion validity: predictive criterion validity and concurrent criterion validity. Predictive criterion validity is computed by correlating test scores from an instrument with scores from a criterion measure taken in the future. In contrast, concurrent criterion validity is the correlation between the scores from an instrument and the scores from a measure taken at the same time.

Here are some examples of predictive criterion validity: the correlation between SAT scores and college GPA; the correlation between GRE scores and graduate GPA; the correlation between a job placement score and work productivity; and the correlation between a reading readiness score and future reading performance.

Here are some examples of concurrent criterion validity: the correlation between the score on a short group IQ test and the score on an individually administered IQ test; the correlation between the score on a standardized achievement test and the score on teacher-made assessments; and the correlation between the score on the Terra Nova reading test and the score on the FCAT reading test. These are examples of concurrent criterion validity because each pair of tests is administered at the same time.

Concurrent coefficients are generally higher than predictive coefficients. This does not mean that the test with the higher validity coefficient is better in a specific situation. Group variability affects the size of the validity coefficient: heterogeneous groups yield higher validity coefficients than homogeneous ones. The relevance and reliability of the criterion need to be considered, as well as the test itself. A poor criterion will lower the validity coefficient.

Construct validity can be thought of in two ways. In the first, construct validity is simply another type of validity, like content, predictive, and concurrent validity. It is used specifically to establish the validity of a test that is intended to measure some abstract trait or skill, such as intelligence, anxiety, or mechanical aptitude. The construct validity of a test is often investigated by correlating the test with tests of other attributes that, theoretically, ought to be related to the trait that this test is supposed to measure.

For example, scores on a test of intelligence ought to show a strong relationship with measures of achievement. For some tests, this is the only type of validity investigation that can be carried out: there may be no direct connection to another measure that could be used for predictive or concurrent validity analyses.

The second way of thinking about construct validity is as the foundational, or all-encompassing, type of validity. In this sense, all the other types of validity can be seen as also providing information about whether the test is actually measuring the abstract trait, or construct, that was theorized. Either way, construct validity is important in helping define the actual construct that a test measures.

Construct validity investigates whether the test score actually taps an abstract trait or ability (e.g., intelligence, test anxiety, academic motivation). For example: Does the Stanford-Binet IQ test measure the abstract trait of intelligence?

In other words, with construct validity we are interested in making inferences about the amount of a trait that is possessed. Establishing this form of validity therefore usually involves finding evidence that agrees with logical and theoretical expectations, which means that construct validity is obtained by correlating observed test scores with other scores or measures.

Here is an example of construct validity. To establish the construct validity of the Academic Motivation Test (AMT): we could have teachers rate students' motivation, and we would expect the teacher ratings to correlate positively with the motivation scale. If so, we would have evidence for the construct validity of the AMT. We could also examine completed homework assignments: we would expect highly motivated students to complete homework, and for this to relate to the motivation measure.

It is sometimes argued that all validity is construct validity. This is because all evidence that helps establish any type of test validity helps establish the construct as well. It all helps determine whether the test is measuring what it is supposed to measure.