Constructing Indices and Scales. Hsueh-Sheng Wu CFDR Workshop Series June 8, 2015


Outline
- What are scales and indices?
- Graphical presentation of relations between items and constructs for scales and indices
- Why do sociologists need scales and indices?
- Similarities and differences between scales and indices
- Construction of scales and indices
- Criteria for evaluating a composite measure
- Evaluation of scales and indices
- How to obtain the sum score of a scale or an index
- Conclusion

What Are Scales and Indices?
Scales and indices are composite measures that use multiple items to collect information about a construct. These items are then used to rank individuals.
Examples of scales:
- Depression scale
- Anxiety scale
- Mastery scale
Examples of indices:
- Socio-Economic Status (SES) index
- Consumer Price Index
- Stock market index
- Body Mass Index

Graphical Presentation of Relations between Items and Constructs for Scales and Indices
[Diagram] Scale: the latent construct Depression points to its effect indicators (feeling sad, sleeplessness, suicidal ideation), each with an error term (e).
[Diagram] Index: the causal indicators (education, income, occupation) point to the composite Socio-Economic Status, which carries an error term (e).

Why Do Sociologists Need Scales and Indices?
- Most social phenomena of interest are multidimensional constructs and cannot be measured by a single question, for example: well-being, violence.
- When a single question is used, the information may not be very reliable, because people may respond differently to a particular word or idea in the question.
- The variation in one question may not be enough to differentiate individuals.
- Scales and indices allow researchers to focus on large theoretical constructs rather than individual empirical indicators.

Similarities and Differences between Scales and Indices
Similarities:
- Both try to measure a composite construct or constructs.
- Both recognize that the construct or constructs have multidimensional attributes.
- Both use multiple items to capture these attributes.
- Both can apply various measurement levels (i.e., nominal, ordinal, interval, and ratio) to the items.
- Both are composite measures, as they aggregate the information from multiple items.
- Both use a weighted sum of the items to assign a score to individuals. The score an individual receives on an index or a scale indicates his/her position relative to other people.
Differences:
- Scales consist of effect indicators, whereas indices consist of causal indicators.
- Scales are always used to give scores at the individual level; indices can give scores at both individual and aggregate levels.
- They differ in how the items are aggregated.
- There are many discussions of the reliability and validity of scales, but few of indices.

Construction of Scales
DeVellis, Robert (2011), Scale Development: Theory and Applications, suggests eight steps:
1. Determine clearly what it is you want to measure.
2. Generate an item pool.
3. Determine the format for measurement.
4. Have the initial item pool reviewed by experts.
5. Consider inclusion of validation items.
6. Administer items to a development sample.
7. Evaluate the items.
8. Optimize scale length.

Construction of Indices
Babbie, E. (2010) suggests the following steps for constructing an index:
1. Selecting possible items
   - Decide how general or specific your variable will be.
   - Select items with high face validity.
   - Choose items that measure one dimension of the construct.
   - Consider the amount of variance that each item provides.
2. Examining their empirical relations
   - Examine the empirical relations among the items you wish to include in the index.
3. Scoring the index
   - Assign scores for particular responses, thereby making a composite variable out of your several items.
4. Validating it
   - Item analysis; examine the association between the index and other related measures.
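The variance check in step 1 can be sketched in a few lines: an item that nearly everyone answers the same way cannot differentiate respondents. The item names and responses below are invented for illustration.

```python
from statistics import pvariance

# Hypothetical yes/no (1/0) candidate items for an index
candidates = {
    "attended_protest": [0, 0, 0, 0, 1, 0, 0, 0],  # almost everyone says no
    "recycles":         [1, 0, 1, 1, 0, 1, 0, 1],  # splits the sample
}

# An item with near-zero variance provides little information for ranking
for name, values in candidates.items():
    print(name, round(pvariance(values), 3))
```

Here "recycles" (variance 0.234) would be a better candidate than "attended_protest" (variance 0.109), all else being equal.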

Concepts of Reliability and Validity
Reliability: whether a particular technique, applied repeatedly to the same object, yields the same result each time.
- Test-retest reliability
- Alternate-forms reliability (split-halves reliability)
- Inter-observer reliability
- Inter-item reliability (internal consistency)
Validity: the extent to which an empirical measure adequately reflects the real meaning of the concept under consideration.
- Face validity
- Content validity
- Construct validity (convergent validity and discriminant validity)
- Criterion validity (concurrent validity and predictive validity)

Reliability: Test-retest Reliability
Apply the test at two different time points. The degree to which the two measurements are related to each other is called test-retest reliability.
Example: take a test of math ability and then retake the same test two months later. If you receive a similar score both times, the reliability of the test is high.
Possible problems:
- Test-retest reliability holds only when the phenomenon does not change between the two points in time.
- Respondents may get a better score when taking the same test a second time, which reduces test-retest reliability.
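In practice, test-retest reliability is usually quantified as the correlation between the two administrations. A minimal sketch, using hypothetical math-test scores for five respondents:

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

# Hypothetical scores for five respondents, two months apart
time1 = [70, 82, 65, 90, 75]
time2 = [72, 80, 68, 88, 77]
print(round(pearson_r(time1, time2), 3))  # -> 0.993
```

A correlation this close to 1 would indicate high test-retest reliability, assuming the underlying ability did not change between administrations.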

Reliability (cont.)
Alternate-forms reliability: compare respondents' answers to slightly different versions of survey questions. The degree to which the two measurements are related to each other is called alternate-forms reliability.
Possible problem: how do we make sure the two alternate forms are equivalent?

Reliability (cont.)
Split-halves reliability: similar to the concept of alternate-forms reliability.
- Randomly divide the survey sample into two halves; each half answers one of two forms of the questions.
- If the responses of the two halves of the sample are about the same, the measure's reliability is established.
Possible problem: what if the two halves are not equivalent?

Reliability (cont.)
Inter-observer reliability: used when more than one observer rates the same people, events, or places.
- If observers are using the same instrument to rate the same thing, their ratings should be very similar. If they are, we can have more confidence that the ratings reflect the phenomenon being assessed rather than the orientations of the observers.
Possible problem: the reliability is established for the observers, not for the measurement items. Thus, inter-observer reliability cannot be generalized to studies with different observers.

Reliability (cont.)
Inter-item reliability: applies only when you have multiple items to measure a single concept.
- The stronger the association among the individual items, the higher the reliability of the measure.
- In statistics, we use Cronbach's alpha to measure inter-item reliability.
Possible problem: you can increase the value of Cronbach's alpha by increasing the number of scale items, even if these items are not highly correlated.
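Cronbach's alpha can be computed directly from its definition: alpha = (k / (k - 1)) * (1 - sum of item variances / variance of total scores), where k is the number of items. A minimal sketch with hypothetical responses:

```python
from statistics import pvariance

def cronbach_alpha(items):
    """items: list of k lists, each holding one item's responses across n people."""
    k = len(items)
    item_vars = sum(pvariance(col) for col in items)
    totals = [sum(vals) for vals in zip(*items)]  # each person's sum score
    return (k / (k - 1)) * (1 - item_vars / pvariance(totals))

# Hypothetical three-item measure answered by five respondents
items = [
    [2, 3, 1, 4, 2],
    [3, 3, 2, 4, 1],
    [2, 4, 1, 3, 2],
]
print(round(cronbach_alpha(items), 3))  # -> 0.871
```

In Stata, the same statistic is produced by the alpha command; the sketch above only illustrates the formula.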

Validity
The extent to which an empirical measure adequately reflects the real meaning of the concept under consideration:
- Face validity
- Content validity
- Criterion validity (concurrent and predictive validity)
- Construct validity (convergent and discriminant validity)

Face Validity
The quality of an indicator that makes it seem a reasonable measure of some variable ("on its face").
Example: frequency of church attendance as an indicator of a person's religiosity.

Content Validity
The degree to which a measure covers the full range of the concept's meaning.
Example: attitudes toward the police department span different domains, such as expectations, past experience, others' experience, and mass media.

Criterion Validity
The degree to which a measure relates to some external criterion.
Example: people who score high on a depression scale are also more likely to receive a diagnosis of clinical depression.

Concurrent Validity
A measure yields scores that are closely related to scores obtained using previously established measures of the same construct.
Example: comparing a test of sales ability to the person's current sales performance.

Predictive Validity
The ability of a measure to predict scores on a criterion measured in the future.
Example: SAT scores can predict college students' GPA (SAT scores would be a valid indicator of a college student's success).

Construct Validity
The degree to which a measure relates to other variables as expected within a system of theoretical relationships.
Example: if we believe that marital satisfaction is related to marital fidelity, responses to the measure of marital satisfaction and responses to the measure of marital fidelity should act in the expected direction (i.e., more satisfied couples should also be less likely to cheat on each other).

Convergent vs. Discriminant Validity
- Convergent validity: one measure of a concept is associated with different types of measures of the same concept.
- Discriminant validity: one measure of a concept is not associated with measures of different concepts.

Reliability and Validity of Scales and Indices

Table 1. The reliability and validity of scales and indices

                                                Scales   Indices
  Reliability
    Test-retest reliability                       X         X
    Alternate-forms reliability (split-halves)    X         ?
    Inter-observer reliability                    X         X
    Inter-item reliability                        X         ?
  Validity
    Face validity                                 X         X
    Content validity                              X         X
    Criterion validity
      Concurrent validity                         X         X
      Predictive validity                         X         X
    Construct validity
      Convergent validity                         X         X
      Discriminant validity                       X         X

How to Obtain the Sum Score of a Scale or an Index
Common way: assume that each item has equal weight, and simply sum the items together.
Alternative: use factor analysis.

Table 2. Variables about the environment in the General Social Survey, 2010

  Variable name   Variable description
  grncon          concerned about environment
  grndemo         protested for environmental issue
  grnecon         worry too much about environment, too little about economy
  grneffme        environment affects everyday life
  grnexagg        environmental threats exaggerated
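The common equal-weight approach amounts to adding up each respondent's item responses. A minimal sketch, using hypothetical responses to five items (the 1-5 coding and the data are invented, and any reverse-coded items are assumed to have been recoded already):

```python
# Each inner list holds one respondent's answers to five items (assumed 1-5 coding)
respondents = [
    [4, 2, 3, 4, 2],
    [5, 1, 4, 5, 1],
    [2, 1, 2, 2, 4],
    [3, 3, 3, 4, 2],
]

# Equal-weight sum score: every item contributes equally to the composite
sum_scores = [sum(r) for r in respondents]
print(sum_scores)  # -> [15, 16, 11, 15]
```

In Stata, the equivalent would be a generate statement adding the item variables together.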

How to Obtain the Sum Score of a Scale or an Index (cont.)

. factor grncon grndemo grnecon grneffme grnexagg
(obs=1287)

Factor analysis/correlation                 Number of obs    =     1287
    Method: principal factors               Retained factors =        3
    Rotation: (unrotated)                   Number of params =       10

    Factor     Eigenvalue   Difference   Proportion   Cumulative
    Factor1      1.14876      1.09574      1.3943       1.3943
    Factor2      0.05302      0.03888      0.0643       1.4586
    Factor3      0.01414      0.18208      0.0172       1.4758
    Factor4     -0.16794      0.05614     -0.2038       1.2720
    Factor5     -0.22408           .      -0.2720       1.0000

    LR test: independent vs. saturated: chi2(10) = 670.67  Prob>chi2 = 0.0000

Factor loadings (pattern matrix) and unique variances

    Variable    Factor1   Factor2   Factor3   Uniqueness
    grncon       0.5491   -0.0853    0.0352     0.6900
    grndemo     -0.1385    0.0054    0.1125     0.9682
    grnecon      0.5622    0.1020   -0.0046     0.6735
    grneffme    -0.3677    0.1678    0.0138     0.8365
    grnexagg     0.6139    0.0846    0.0064     0.6160

How to Obtain the Sum Score of a Scale or an Index (cont.)
Factor1 = 0.5491*grncon + -0.1385*grndemo + 0.5622*grnecon + -0.3677*grneffme + 0.6139*grnexagg
Factor2 = -0.0853*grncon + 0.0054*grndemo + 0.1020*grnecon + 0.1678*grneffme + 0.0846*grnexagg
Factor3 = 0.0352*grncon + 0.1125*grndemo + -0.0046*grnecon + 0.0138*grneffme + 0.0064*grnexagg
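As a worked illustration, the Factor1 formula above can be applied to one respondent's answers. The loadings are taken from the Stata output; the respondent's responses (and their coding) are hypothetical.

```python
# Factor1 loadings from the factor analysis above
loadings = {"grncon": 0.5491, "grndemo": -0.1385, "grnecon": 0.5622,
            "grneffme": -0.3677, "grnexagg": 0.6139}

# Hypothetical item responses for one respondent (assumed coding)
responses = {"grncon": 3, "grndemo": 1, "grnecon": 4,
             "grneffme": 2, "grnexagg": 3}

# Weighted sum: each item is multiplied by its loading and the products summed
factor1 = sum(loadings[v] * responses[v] for v in loadings)
print(round(factor1, 4))  # -> 4.8639
```

In Stata, predict after factor computes factor scores for every observation; the sketch above just makes the arithmetic of one weighted sum explicit.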

How to Obtain the Sum Score of a Scale or an Index (cont.)
Using principal component analysis for indices:

. pca educ realrinc prestg80

Principal components/correlation            Number of obs    =     1200
                                            Number of comp.  =        3
                                            Trace            =        3
    Rotation: (unrotated = principal)       Rho              =   1.0000

    Component   Eigenvalue   Difference   Proportion   Cumulative
    Comp1         1.86370      1.21591      0.6212       0.6212
    Comp2         0.64779      0.15927      0.2159       0.8372
    Comp3         0.48852           .       0.1628       1.0000

Principal components (eigenvectors)

    Variable     Comp1     Comp2     Comp3   Unexplained
    educ        0.5964   -0.3750   -0.7097       0
    realrinc    0.5382    0.8428    0.0070       0
    prestg80    0.5955   -0.3862    0.7044       0

Score1 = 0.5964*educ + 0.5382*realrinc + 0.5955*prestg80
Score2 = -0.3750*educ + 0.8428*realrinc + -0.3862*prestg80
Score3 = -0.7097*educ + 0.0070*realrinc + 0.7044*prestg80
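The logic behind the PCA output can be seen in the simplest case of two standardized variables with correlation r: the correlation matrix [[1, r], [r, 1]] has eigenvalues 1 + r and 1 - r, with eigenvectors proportional to [1, 1] and [1, -1]. A sketch with a hypothetical r (the value 0.6 and the respondent's z-scores are invented):

```python
from math import sqrt

r = 0.6  # hypothetical correlation between two standardized variables
eigenvalues = [1 + r, 1 - r]
w = 1 / sqrt(2)  # weight of each variable in the (normalized) first eigenvector

# Proportion of total variance captured by the first component
proportion = eigenvalues[0] / sum(eigenvalues)
print(round(proportion, 2))  # -> 0.8

# First-component score for a respondent with standardized values (z1, z2)
z1, z2 = 1.2, 0.4
score1 = w * z1 + w * z2
print(round(score1, 3))  # -> 1.131
```

The Stata output above does the same thing for three variables: the eigenvectors supply the weights, and Score1 is the weighted sum of the (standardized) variables for each respondent.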

Conclusion
- Both indices and scales are composite measures that use multiple items to collect information about a construct.
- With indices and scales, researchers can move beyond examining observable indicators and start examining abstract constructs.
- Scales have effect indicators, while indices have causal indicators. To determine whether you are constructing an index or a scale, you need to think about whether the indicators are highly correlated with each other.
- If you have any questions about measuring a construct or constructs, please come see me at 5D Williams Hall or send me an email (wuh@bgsu.edu).