Introduction. Chapter 1


Whatever exists at all exists in some amount. To know it thoroughly involves knowing its quantity as well as its quality. Education is concerned with changes in human beings; a change is a difference between two conditions; each of these conditions is known to us only by the products produced by it - things made, words spoken, acts performed, and the like. To measure any of these products means to define its amount in some way so that competent persons will know how large it is, better than they would without measurement. To measure a product well means so to define its amount that competent persons will know how large it is, with some precision, and that this knowledge may be conveniently recorded and used. This is the general credo of those who, in the last decade, have been busy trying to extend and improve measurements of educational products (Thorndike, 1918, p. 16).

Psychometrics is that area of psychology that specializes in how to measure what we talk and think about. It is how to assign numbers to observations in a way that best allows us to summarize our observations in order to advance our knowledge. Although in particular it is the study of how to measure psychological constructs, the techniques of psychometrics are applicable to most problems in measurement. The measurement of intelligence, extraversion, severity of crimes, or even batting averages in baseball are all grist for the psychometric mill. Any set of observations that are not perfect exemplars of the construct of interest is open to questions of reliability and validity and to psychometric analysis.

Although it is possible to make the study of psychometrics seem dauntingly difficult, in fact the basic concepts are straightforward. This text is an attempt to introduce the fundamental concepts in psychometric theory so that the reader will be able to understand how to apply them to real data sets of interest.
It is not meant to make one an expert, but merely to instill confidence and an understanding of the fundamentals of measurement so that the reader can better understand and contribute to the research enterprise.

With the advent of powerful computer languages that have been developed with the specific aim of doing statistics (R and S+), the process of doing psychometrics has become more approachable. It is no longer necessary to write long programs in Fortran (Backus, 1998), nor is it necessary to rely on sets of proprietary computer packages that have been developed to do particular analyses. It is now possible to use R for almost all of one's basic (and even advanced) psychometric requirements.

R is an open source implementation of the computer language S. As such it is available free of charge under the General Public License (GPL) of the GNU Project of the Free Software Foundation.¹ The source code of all R programs and packages is open to inspection and change. The core of R has been developed over the past 20 years by a dedicated group (the R Core Team) whose members include some of the original authors of S. In addition, there are at least 1000 packages that are written in R and contributed to the overall R project by many different authors. In the psychometrics community, at least 10 to 20 packages have been developed and made available to the psychometrics user in particular and the R community in general.

Like psychometrics, R is initially daunting. Also like psychometrics, while it takes years to master, the basics can be learned fairly easily and expertise comes with practice. Moreover, combining examples written and analyzed in R with psychometric problems allows one to learn both at the same time. This text is thus an introduction both to psychometric theory and to R.

1.1 An overview of the book

The structure of this book is best represented in the form of a structural model showing a number of boxes and circles with a set of connecting paths (Figure 1.1). This symbolic notation is a way of showing the relationship between a set of observed variables (the boxes on the far left and right of the figure) in terms of a smaller set of unobserved, or latent, variables said to account for the observed variables (the circles in the middle of the figure). Paths in the figure represent relationships. In the following chapters we will use this figure to help locate where we are.

Part I: Basic issues

Constructs and measures (Chapter 2)

A basic distinction in science may be made between theoretical constructs and observed measures thought to represent these constructs. This distinction is perhaps best understood in terms of Plato's Allegory of the Cave in the Republic (Book VII). Consider a group of prisoners confined in the darkness of a cave.
They are chained so that they face away from the mouth of the cave and cannot observe anything behind them. Behind them there is a fire and people are walking back and forth in front of the fire carrying a variety of objects. To the prisoners, all that is observable are the shadows cast on the wall of the cave of the walking people and of the objects that they carry. From the patterns of these shadows the prisoners need to infer the reality of the people and objects. These shadows are analogous to the observed variables that we study, while the real but unobservable people and objects are the latent variables about which we make theoretical inferences. The prisoners make their inferences based upon the patterning of the shadows.

¹ Under the Free Software Foundation's GPL, all programs are required to have the following statement: This program is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version. This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details. The GNU.org website has an extensive discussion of the meaning of free software.

Fig. 1.1 A conceptual overview of the book. Psychometrics integrates the relationships between observed variables (rectangles) in terms of latent or unobserved constructs (ellipses). Chapter 2 considers what goes into an observable (e.g., the box X1) while Chapter 3 addresses the shape of the mapping function between a latent variable and an observed variable (path Ξ1 to X1). Chapter 4 considers how to assess relationships between two variables (e.g., the simple correlation or regression of X1 and Y1), how to combine two or more variables to predict a third variable (multiple regression of Y1 on X1 and X2), and how to assess the relationship between two variables while holding the effect of a third variable constant (partial correlation of X1 and Y1 with X2 partialed out). Chapter 6 considers how to estimate the correlation between an observed variable and a latent variable (X1 with Ξ1) to estimate the reliability of X1. Chapter 9 addresses the question of how many latent variables are needed. Chapter 8 considers how the structure of relationships allows alternative conceptualizations of validity. Chapter 10 addresses how to model the entire figure.

While behaviorism reigned supreme in psychology until the late 1950s, the emphasis was upon measurement of observed variables. With a greater appreciation of the process of theory building, the use of latent variables and hypothetical constructs gradually became more accepted in psychology in general, and in psychometrics in particular. For rarely are we interested in specific observations of twitches and utterances. Psychological theories are concerned with higher level constructs such as love, intelligence, or happiness. Although these constructs cannot be measured directly, good theory relates them through a process of measurement to specific observed behaviors and physiological markers.

A very deep question, and one that will not be addressed in the detail it requires, is what it means to say that we measure something. The assigning of numbers to observations does not necessarily (and indeed probably does not) imply that the data are isomorphic to the real numbers. Which types of inferences can be made from observations, and which types of analysis are or are not appropriate for the observed data, are deep questions that go beyond the scope of an introductory text.

Observational and Experimental Psychology

A recurring debate in psychology ever since Wundt (1874) and Galton (1865, 1884) has been the merits of experimental versus observational approaches to the study of psychology. These two approaches tend to represent different subfields within psychology and to emphasize different types of training. Experimental psychology tends to emphasize central tendencies and strives to formulate general laws of behavior and cognition.
Observational psychologists, on the other hand, emphasize variability and covariation and study individual differences in ability, character, and temperament. For many years these two approaches also used different statistical procedures to analyze data, with experimentalists comparing means using the t-test and its generalization, the Analysis of Variance (ANOVA). This was in contrast to observationalists who would study variability and particularly covariation with the correlation coefficient and multivariate procedures. However, with the recognition that ANOVA and correlations are just special cases of the general linear model, the statistical distinction is less relevant than the distinction of research methodologies.

Despite eloquent pleas for the reunification of these two disciplines (Cronbach, 1957, 1975; Eysenck, 1966, 1997; Vale and Vale, 1969), there is surprisingly little emphasis upon individual differences within experimental psychology (but see Underwood, 1975, for why individual differences are necessary for theory building in cognitive psychology) or upon experimental techniques in personality research (Revelle and Oehleberg, 2008). However, both research approaches need to understand the quality of measurements in order to make experimental or correlational inferences. With an emphasis upon the type of data we collect, and the problems of inference associated with the metric properties of our data, Chapters 2 and 3 are particularly relevant for the experimentalist who wants to interpret differences between observed means in terms of differences at a latent level. These two chapters are also important for the student of correlations who wants to interpret scale score values as if they have more than ordinal meaning.
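The remark that the t-test and the correlation coefficient are special cases of one linear model can be checked numerically. The following is a minimal sketch in Python (the book itself works in R); the two small groups of scores are made-up illustrative values, and the helper names are hypothetical. It compares the pooled-variance t statistic for two groups with the t implied by the correlation between a 0/1 group code and the scores, using t = r * sqrt(n - 2) / sqrt(1 - r^2).

```python
import math


def mean(xs):
    """Arithmetic mean of a list of numbers."""
    return sum(xs) / len(xs)


def pearson_r(xs, ys):
    """Pearson product-moment correlation of two equal-length lists."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def pooled_t(group0, group1):
    """Two-sample t statistic (group1 minus group0) with pooled variance."""
    n0, n1 = len(group0), len(group1)
    m0, m1 = mean(group0), mean(group1)
    ss0 = sum((x - m0) ** 2 for x in group0)
    ss1 = sum((x - m1) ** 2 for x in group1)
    pooled_var = (ss0 + ss1) / (n0 + n1 - 2)
    return (m1 - m0) / math.sqrt(pooled_var * (1 / n0 + 1 / n1))


# Hypothetical scores for two experimental conditions (illustrative only).
control = [4.0, 5.0, 6.0, 5.0, 7.0]
treated = [6.0, 8.0, 7.0, 9.0, 8.0]

# The "experimental" analysis: compare the two means.
t_from_means = pooled_t(control, treated)

# The "correlational" analysis: correlate a 0/1 group code with the scores.
scores = control + treated
group = [0] * len(control) + [1] * len(treated)
r = pearson_r(group, scores)
n = len(scores)
t_from_r = r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)

print(f"t from group means:  {t_from_means:.4f}")
print(f"t from correlation:  {t_from_r:.4f}")
```

Both lines print the same value, which is the sense in which the choice between the two statistics is a matter of presentation rather than of model.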

Data = Model + Error

Perhaps the most important concept to realize in psychometrics in particular and statistics in general is that we are modeling data, not merely reporting it. Complex data can be partly understood in terms of simpler theories or models. But our models are incomplete and do not completely capture the data. The data we collect, no matter how carefully we do it, nor how well we understand the process that generates the data, are never quite what we expect. The world is complex and the observations that we take are multiply determined. At the most basic atomic level of physics, quantum randomness occurs. At the level of the neuron, coding is a statistical frequency of firing, not a binary outcome. At the level of human behavior, although our theories might be powerful (which they tend not to be), there are causes that have not been observed.

Throughout the book we will propose models of the data and try to evaluate those models in terms of how well they fit. Conceptually, we use the equation

Data = Model + Error (1.1)

to represent the problem of inference. We evaluate how well our models fit by examining error, defined as

Error = Data - Model (1.2)

and evaluate the magnitude of some function of the error (Judd and McClelland, 1989). These equations would seem to imply a greater quantitative level of measurement precision than is generally the case, and should be treated as abstractions to remind us that we are evaluating models of data and need to continuously ask how appropriate the model is.

The distinction between model and error is not new, for it dates at least to Plato. As discussed by Stigler (1999), the whole of nineteenth century statistical theory was based upon this distinction between physical truth as modeled by Newton and actual observations taken to extend the theories.

A theory of data (Chapter 2)

Clyde Coombs introduced a taxonomy of the kinds of data that we can observe that allows us to abstractly organize what goes into each observation (Coombs, 1964). Although many of the examples in this book are drawn from a small portion of the kinds of data that could be collected, by thinking about the basic distinctions made by Coombs we see the range of possibilities for psychometric applications.

Basic summary statistics - problems of scale (Chapter 3)

The problem of how to summarize data reflects some deep issues in measurement. The naive assumption that our measures are linearly related to our constructs leads to many a misinterpretation. Indeed, to some, this assumption reflects a pathological thought disorder (Michell, 1997). Although not taking such a strong position, examples of misinterpretation of findings because of faulty assumptions of interval or ratio levels of measurement are easy to find (3.5).
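To make Data = Model + Error concrete, and to show why the choice of summary statistic matters, here is a minimal sketch in Python (the book itself works in R). The five data values are made up for illustration. Two one-number models of the same data, the mean and the median, are compared; each leaves a different error vector, and which model looks better depends on the function of the error that is evaluated.

```python
import statistics

# Hypothetical observations with one extreme value (illustrative only).
data = [2.0, 3.0, 3.0, 4.0, 18.0]


def errors(observations, model_value):
    """Error = Data - Model, for a model that predicts one value for everyone."""
    return [x - model_value for x in observations]


# Two competing models of the same data.
mean_model = statistics.mean(data)
median_model = statistics.median(data)

mean_errors = errors(data, mean_model)
median_errors = errors(data, median_model)

# Two different functions of the error.
sse_mean = sum(e ** 2 for e in mean_errors)      # sum of squared errors
sse_median = sum(e ** 2 for e in median_errors)
sae_mean = sum(abs(e) for e in mean_errors)      # sum of absolute errors
sae_median = sum(abs(e) for e in median_errors)

print(f"mean model   = {mean_model}: SSE = {sse_mean}, SAE = {sae_mean}")
print(f"median model = {median_model}: SSE = {sse_median}, SAE = {sae_median}")

# Data is recovered exactly as Model + Error for either model.
rebuilt = [mean_model + e for e in mean_errors]
```

The mean gives the smaller sum of squared errors and the median the smaller sum of absolute errors, so "best fit" is only defined once the error function has been chosen.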

Alternative ways of estimating central tendencies can lead to very different conclusions about sets of data (3.4).

Covariance, regression, and correlation (Chapter 4)

Perhaps the most fundamental concept in psychometrics is the correlation coefficient. How to best represent the relationship between two or more variables is a fundamental problem that, if understood, may be generalized to most of psychometrics. Correlation takes many forms and understanding when to use which type of correlation is important. Understanding how multiple and partial correlation are generalizations of the simple zero order correlation allows for an understanding of classic reliability theory.

Part II: Classical and modern reliability theory

Classical theory and the Measurement of Reliability (Chapter 6)

How well does a scale measure whatever it is measuring? Do alternative measures give the same or similar values? Are measures of the construct the same over time, over items, over situations? These are the basic questions of both classical test theory and its generalization to Latent Trait and Item Response Theory.

Parallel tests and their generalizations

If one observed scale is thought to be composed of True score and Error, then the correlation of the test with true score may be calculated in terms of the proportion of variance that is True score. But how to estimate this? Parallel tests, tau equivalent tests, and congeneric test models make progressively fewer assumptions about the data and all yield ways of estimating true scores.

Domain sampling theory

An alternative approach, one that yields the same solution as congeneric test theory, is to think of tests as representing larger and larger sets of items sampled from an infinite domain of items. By thinking in terms of domain sampling, the meaning of alternative estimates of reliability (α, β, ω) is easily understood. Hierarchical structures of tests in terms of group and general factors emphasize the need to understand the test structure.
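The parallel-tests idea can be illustrated with a small simulation. The following is a minimal sketch in Python (the book itself works in R), assuming normally distributed true scores and errors with equal variances, so that the proportion of true score variance is 0.5; the function names, sample size, and seed are hypothetical choices. Under these assumptions the correlation between two parallel forms should approximate the reliability, and the squared correlation of one form with the (normally unobservable) true score should approximate the same value.

```python
import math
import random


def pearson_r(xs, ys):
    """Pearson product-moment correlation of two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def simulate_parallel_forms(n_people, true_sd, error_sd, seed=42):
    """Simulate true scores and two parallel forms: observed = true + error."""
    rng = random.Random(seed)
    true_scores = [rng.gauss(0.0, true_sd) for _ in range(n_people)]
    form_a = [t + rng.gauss(0.0, error_sd) for t in true_scores]
    form_b = [t + rng.gauss(0.0, error_sd) for t in true_scores]
    return true_scores, form_a, form_b


true_sd, error_sd = 1.0, 1.0  # assumed values, chosen for illustration
true_scores, form_a, form_b = simulate_parallel_forms(5000, true_sd, error_sd)

# Reliability as the proportion of observed variance that is true score variance.
reliability = true_sd ** 2 / (true_sd ** 2 + error_sd ** 2)

r_parallel = pearson_r(form_a, form_b)        # estimable from real data
r_with_true = pearson_r(form_a, true_scores)  # only available in a simulation

print(f"theoretical reliability:           {reliability:.3f}")
print(f"correlation of parallel forms:     {r_parallel:.3f}")
print(f"squared correlation with true:     {r_with_true ** 2:.3f}")
```

In real data only the two observed forms are available, which is why the correlation between parallel forms is useful as an estimate of a quantity that cannot be computed directly.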

The many sources of reliability: Generalizability theory

Reliability needs to be considered in terms of the dimensions across which we want to generalize our measures. A measure of a single trait should be a good indicator of a single domain and should be consistent across time and situation. A measure of a mood, however, should have high internal consistency (be a good measure of a single domain) but should not show consistency over time. In a prediction setting, it is possible that a test need not have high internal consistency, but it should be stable over time, situations, and perhaps forms. Reliability is not just a concept of the item, but also is concerned with the source of the data. Do multiple raters agree with each other? Do various forms of the test give similar answers?

Latent Trait Theory - The New Psychometrics (Chapter 7)

Although Classical Test Theory treats items as random replicates, it is possible to consider item parameters as well. This leads to a more efficient means of estimating person parameters and also emphasizes issues of scaling shape. Extensions to two and three parameter models, ability and unfolding models, and dichotomous versus polytomous models will be considered.

Validity (Chapter 8)

Does a test measure what it is supposed to test? How do we know? The most direct (and perhaps least accurate) way is to simply examine the item content. Does the content appear related to the construct of interest? Face validity (sometimes known as "Faith" validity) addresses the question of obvious relevance. Do questions about psychometric knowledge ask about matrices or do they ask about general knowledge of English? The former item would seem to be more valid than the latter. If tests are used for selection or diagnosis, merely looking good is not enough. It is also necessary to assess how well the measure correlates with current status of known criterion groups or how well the test predicts future outcomes.
Concurrent and predictive validity assess how well tests correlate with alternative measures of the same construct right now and whether they allow future predictions. For theory testing and development, validity is a process more than a particular value. In assessing the construct validity of a measure, it is necessary to examine the location of the test in the complete nomological network of the theory. Assessing convergent validity asks whether measures correlate with what they should correlate with given the theory. Equally important is discriminant validity: Do measures not correlate with what the theory says they should not correlate with? A final part of construct validity is incremental validity: does it make any difference if we add a test to a battery?

Decision Theory

The practical use of tests also involves knowing how to combine data to make decisions. Although the measures used to predict and to validate tend to be continuous, decisions and outcomes are frequently discrete. Students are admitted to graduate school or they are not. They finish their Ph.D. or they do not. People are offered jobs, are promoted, are accused of crimes, and are found guilty or innocent. These are binary decisions and binary outcomes based upon linear and non-linear models of predictors. In addition to considering the base rates of outcomes and the selectivity of the choice process, the utility of a test reflects the value applied to the various types of outcomes as well as the cost of developing and giving the test.

Part III: Latent variables

Factor, Principal Components and Cluster Analysis (Chapter 9)

How many latent constructs are represented in a data set of N variables? Is it possible to reduce the complexity of the data without a great loss of information? How much information is lost?

Exploratory versus Confirmatory models

Early in the development of a particular subarea, exploratory data reduction techniques are most appropriate. These techniques can include cluster analysis and principal components analysis as well as exploratory factor analysis. All of the procedures are faced with the problem of what is the appropriate amount of data reduction and how to evaluate alternative models. Confirmatory models, primarily confirmatory factor models, are special cases of structural equation modeling procedures.

Structural Equation Modeling (Chapter 10)

Structural Equation Modeling = Reliability + Validity. This chapter considers how to evaluate the measurement model and the structural model at the same time. There are severe limitations on the type of inferences that can be drawn, even from the best fitting structural equation. Through the use of simulated data representing various threats to measurement, it is possible to better understand how to properly interpret results of standard SEM packages.
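One simple way to see how reliability (the measurement model) and validity (the structural model) interact is the classical correction for attenuation, in which an observed correlation is divided by the square root of the product of the two reliabilities. This is offered only as an illustration of the slogan above, not as the procedure the chapter develops, and it is written in Python rather than the R the book uses. The function name and all numeric values are hypothetical.

```python
import math


def disattenuate(r_xy, rel_x, rel_y):
    """Estimate the correlation between two latent variables from the
    observed correlation r_xy and the reliabilities of the two measures
    (classical correction for attenuation)."""
    if not (0 < rel_x <= 1 and 0 < rel_y <= 1):
        raise ValueError("reliabilities must lie in the interval (0, 1]")
    return r_xy / math.sqrt(rel_x * rel_y)


# Hypothetical values chosen for illustration only.
observed_r = 0.30   # correlation between two observed scales
rel_x = 0.70        # reliability of the first scale
rel_y = 0.80        # reliability of the second scale

latent_r = disattenuate(observed_r, rel_x, rel_y)
print(f"observed correlation:          {observed_r:.3f}")
print(f"estimated latent correlation:  {latent_r:.3f}")
```

Because unreliable measures pull observed correlations toward zero, the estimated relationship at the latent level is larger than the observed one; with poor reliability estimates the corrected value can be badly wrong, which is one reason such estimates need cautious interpretation.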
Part IV: The construction of tests and the analysis of data

Scale construction (Chapter 15)

Practical suggestions about how to construct scales based upon basic item statistics. For students and practitioners with limited resources, some procedures are much more useful than others. What are the tradeoffs involved in making particular decisions when developing tests to use in research and applied settings? Knowing a few simple rules of test construction and evaluation helps speed up the cycle of test development.

Appendices

Appendix - Basic R

The basic commands and methods for using R. How to get, install, and use the basic R packages.

Appendix - Review of Matrix algebra

Although understanding matrix algebra is not completely necessary to understand psychometrics, it makes it much easier. Because it is more abstract, matrix notation is far more compact than the alternative notation using the summation of cross products. In addition, programming operations in R using matrices produces much cleaner and faster code. This appendix is what you should have learned in college but have probably forgotten.

1.2 General comments

This book is aimed at beginners in psychometrics (and perhaps in R) who want to use the basic principles of psychometric theory in their substantive research. As an introduction to psychometrics, some major philosophical issues about the meaning of measurement (e.g., Barrett, 2005, Borsboom, 2004, and Michell, 1997) will not be discussed in the detail they deserve, nor will many of the basic models be derived from first principles in the manner of Guilford (1954), McDonald (1999), or Nunnally (1967). It is hoped, however, that the reader will become interested enough in the theory and practice of psychometrics to delve into those much deeper texts.

Most scientists read books backwards. That is, we start at the later chapters and if we understand them, we are finished. If we don't, we go to an earlier chapter and test ourselves with that. For that reason, the appendix on R is meant to allow the eager reader to start running programs in R without reading anything else.
However, the introductory chapters are meant to be useful as they consider the meaning of our observations, the inferences we are able to draw from observations and the inferences we can not make.


More information

Multiple Regression Analysis

Multiple Regression Analysis Multiple Regression Analysis 5A.1 General Considerations 5A CHAPTER Multiple regression analysis, a term first used by Karl Pearson (1908), is an extremely useful extension of simple linear regression

More information

Simultaneous Equation and Instrumental Variable Models for Sexiness and Power/Status

Simultaneous Equation and Instrumental Variable Models for Sexiness and Power/Status Simultaneous Equation and Instrumental Variable Models for Seiness and Power/Status We would like ideally to determine whether power is indeed sey, or whether seiness is powerful. We here describe the

More information

Test Validity. What is validity? Types of validity IOP 301-T. Content validity. Content-description Criterion-description Construct-identification

Test Validity. What is validity? Types of validity IOP 301-T. Content validity. Content-description Criterion-description Construct-identification What is? IOP 301-T Test Validity It is the accuracy of the measure in reflecting the concept it is supposed to measure. In simple English, the of a test concerns what the test measures and how well it

More information

Lec 02: Estimation & Hypothesis Testing in Animal Ecology

Lec 02: Estimation & Hypothesis Testing in Animal Ecology Lec 02: Estimation & Hypothesis Testing in Animal Ecology Parameter Estimation from Samples Samples We typically observe systems incompletely, i.e., we sample according to a designed protocol. We then

More information

Neuroscience and Generalized Empirical Method Go Three Rounds

Neuroscience and Generalized Empirical Method Go Three Rounds Bruce Anderson, Neuroscience and Generalized Empirical Method Go Three Rounds: Review of Robert Henman s Global Collaboration: Neuroscience as Paradigmatic Journal of Macrodynamic Analysis 9 (2016): 74-78.

More information

Chapter 14: More Powerful Statistical Methods

Chapter 14: More Powerful Statistical Methods Chapter 14: More Powerful Statistical Methods Most questions will be on correlation and regression analysis, but I would like you to know just basically what cluster analysis, factor analysis, and conjoint

More information

Audio: In this lecture we are going to address psychology as a science. Slide #2

Audio: In this lecture we are going to address psychology as a science. Slide #2 Psychology 312: Lecture 2 Psychology as a Science Slide #1 Psychology As A Science In this lecture we are going to address psychology as a science. Slide #2 Outline Psychology is an empirical science.

More information

Certificate Courses in Biostatistics

Certificate Courses in Biostatistics Certificate Courses in Biostatistics Term I : September December 2015 Term II : Term III : January March 2016 April June 2016 Course Code Module Unit Term BIOS5001 Introduction to Biostatistics 3 I BIOS5005

More information

Work Personality Index Factorial Similarity Across 4 Countries

Work Personality Index Factorial Similarity Across 4 Countries Work Personality Index Factorial Similarity Across 4 Countries Donald Macnab Psychometrics Canada Copyright Psychometrics Canada 2011. All rights reserved. The Work Personality Index is a trademark of

More information

Associate Prof. Dr Anne Yee. Dr Mahmoud Danaee

Associate Prof. Dr Anne Yee. Dr Mahmoud Danaee Associate Prof. Dr Anne Yee Dr Mahmoud Danaee 1 2 What does this resemble? Rorschach test At the end of the test, the tester says you need therapy or you can't work for this company 3 Psychological Testing

More information

Structural Equation Modelling: Tips for Getting Started with Your Research

Structural Equation Modelling: Tips for Getting Started with Your Research : Tips for Getting Started with Your Research Kathryn Meldrum School of Education, Deakin University kmeldrum@deakin.edu.au At a time when numerical expression of data is becoming more important in the

More information

INADEQUACIES OF SIGNIFICANCE TESTS IN

INADEQUACIES OF SIGNIFICANCE TESTS IN INADEQUACIES OF SIGNIFICANCE TESTS IN EDUCATIONAL RESEARCH M. S. Lalithamma Masoomeh Khosravi Tests of statistical significance are a common tool of quantitative research. The goal of these tests is to

More information

Introduction to Multilevel Models for Longitudinal and Repeated Measures Data

Introduction to Multilevel Models for Longitudinal and Repeated Measures Data Introduction to Multilevel Models for Longitudinal and Repeated Measures Data Today s Class: Features of longitudinal data Features of longitudinal models What can MLM do for you? What to expect in this

More information

Analysis of Environmental Data Conceptual Foundations: En viro n m e n tal Data

Analysis of Environmental Data Conceptual Foundations: En viro n m e n tal Data Analysis of Environmental Data Conceptual Foundations: En viro n m e n tal Data 1. Purpose of data collection...................................................... 2 2. Samples and populations.......................................................

More information

INVESTIGATING FIT WITH THE RASCH MODEL. Benjamin Wright and Ronald Mead (1979?) Most disturbances in the measurement process can be considered a form

INVESTIGATING FIT WITH THE RASCH MODEL. Benjamin Wright and Ronald Mead (1979?) Most disturbances in the measurement process can be considered a form INVESTIGATING FIT WITH THE RASCH MODEL Benjamin Wright and Ronald Mead (1979?) Most disturbances in the measurement process can be considered a form of multidimensionality. The settings in which measurement

More information

Chapter 3: Examining Relationships

Chapter 3: Examining Relationships Name Date Per Key Vocabulary: response variable explanatory variable independent variable dependent variable scatterplot positive association negative association linear correlation r-value regression

More information

11/18/2013. Correlational Research. Correlational Designs. Why Use a Correlational Design? CORRELATIONAL RESEARCH STUDIES

11/18/2013. Correlational Research. Correlational Designs. Why Use a Correlational Design? CORRELATIONAL RESEARCH STUDIES Correlational Research Correlational Designs Correlational research is used to describe the relationship between two or more naturally occurring variables. Is age related to political conservativism? Are

More information

The reality.1. Project IT89, Ravens Advanced Progressive Matrices Correlation: r = -.52, N = 76, 99% normal bivariate confidence ellipse

The reality.1. Project IT89, Ravens Advanced Progressive Matrices Correlation: r = -.52, N = 76, 99% normal bivariate confidence ellipse The reality.1 45 35 Project IT89, Ravens Advanced Progressive Matrices Correlation: r = -.52, N = 76, 99% normal bivariate confidence ellipse 25 15 5-5 4 8 12 16 2 24 28 32 RAVEN APM Score Let us examine

More information

Agents with Attitude: Exploring Coombs Unfolding Technique with Agent-Based Models

Agents with Attitude: Exploring Coombs Unfolding Technique with Agent-Based Models Int J Comput Math Learning (2009) 14:51 60 DOI 10.1007/s10758-008-9142-6 COMPUTER MATH SNAPHSHOTS - COLUMN EDITOR: URI WILENSKY* Agents with Attitude: Exploring Coombs Unfolding Technique with Agent-Based

More information

In this chapter we discuss validity issues for quantitative research and for qualitative research.

In this chapter we discuss validity issues for quantitative research and for qualitative research. Chapter 8 Validity of Research Results (Reminder: Don t forget to utilize the concept maps and study questions as you study this and the other chapters.) In this chapter we discuss validity issues for

More information

In 1980, a new term entered our vocabulary: Attention deficit disorder. It

In 1980, a new term entered our vocabulary: Attention deficit disorder. It In This Chapter Chapter 1 AD/HD Basics Recognizing symptoms of attention deficit/hyperactivity disorder Understanding the origins of AD/HD Viewing AD/HD diagnosis and treatment Coping with AD/HD in your

More information

Denny Borsboom Jaap van Heerden Gideon J. Mellenbergh

Denny Borsboom Jaap van Heerden Gideon J. Mellenbergh Validity and Truth Denny Borsboom Jaap van Heerden Gideon J. Mellenbergh Department of Psychology, University of Amsterdam ml borsboom.d@macmail.psy.uva.nl Summary. This paper analyzes the semantics of

More information

Introduction to Multilevel Models for Longitudinal and Repeated Measures Data

Introduction to Multilevel Models for Longitudinal and Repeated Measures Data Introduction to Multilevel Models for Longitudinal and Repeated Measures Data Today s Class: Features of longitudinal data Features of longitudinal models What can MLM do for you? What to expect in this

More information

PARTIAL IDENTIFICATION OF PROBABILITY DISTRIBUTIONS. Charles F. Manski. Springer-Verlag, 2003

PARTIAL IDENTIFICATION OF PROBABILITY DISTRIBUTIONS. Charles F. Manski. Springer-Verlag, 2003 PARTIAL IDENTIFICATION OF PROBABILITY DISTRIBUTIONS Charles F. Manski Springer-Verlag, 2003 Contents Preface vii Introduction: Partial Identification and Credible Inference 1 1 Missing Outcomes 6 1.1.

More information

Reliability Theory for Total Test Scores. Measurement Methods Lecture 7 2/27/2007

Reliability Theory for Total Test Scores. Measurement Methods Lecture 7 2/27/2007 Reliability Theory for Total Test Scores Measurement Methods Lecture 7 2/27/2007 Today s Class Reliability theory True score model Applications of the model Lecture 7 Psych 892 2 Great Moments in Measurement

More information

alternate-form reliability The degree to which two or more versions of the same test correlate with one another. In clinical studies in which a given function is going to be tested more than once over

More information

Psychology, 2010, 1: doi: /psych Published Online August 2010 (

Psychology, 2010, 1: doi: /psych Published Online August 2010 ( Psychology, 2010, 1: 194-198 doi:10.4236/psych.2010.13026 Published Online August 2010 (http://www.scirp.org/journal/psych) Using Generalizability Theory to Evaluate the Applicability of a Serial Bayes

More information

Doctoral Dissertation Boot Camp Quantitative Methods Kamiar Kouzekanani, PhD January 27, The Scientific Method of Problem Solving

Doctoral Dissertation Boot Camp Quantitative Methods Kamiar Kouzekanani, PhD January 27, The Scientific Method of Problem Solving Doctoral Dissertation Boot Camp Quantitative Methods Kamiar Kouzekanani, PhD January 27, 2018 The Scientific Method of Problem Solving The conceptual phase Reviewing the literature, stating the problem,

More information

computation and interpretation of indices of reliability. In

computation and interpretation of indices of reliability. In THE CONCEPTS OF RELIABILITY AND HOMOGENEITY C. H. COOMBS 1 University of Michigan I. Introduction THE literature of test theory is replete with articles on the computation and interpretation of indices

More information

Appendix B Statistical Methods

Appendix B Statistical Methods Appendix B Statistical Methods Figure B. Graphing data. (a) The raw data are tallied into a frequency distribution. (b) The same data are portrayed in a bar graph called a histogram. (c) A frequency polygon

More information

investigate. educate. inform.

investigate. educate. inform. investigate. educate. inform. Research Design What drives your research design? The battle between Qualitative and Quantitative is over Think before you leap What SHOULD drive your research design. Advanced

More information

Multifactor Confirmatory Factor Analysis

Multifactor Confirmatory Factor Analysis Multifactor Confirmatory Factor Analysis Latent Trait Measurement and Structural Equation Models Lecture #9 March 13, 2013 PSYC 948: Lecture #9 Today s Class Confirmatory Factor Analysis with more than

More information

Reveal Relationships in Categorical Data

Reveal Relationships in Categorical Data SPSS Categories 15.0 Specifications Reveal Relationships in Categorical Data Unleash the full potential of your data through perceptual mapping, optimal scaling, preference scaling, and dimension reduction

More information

Experimental Design. Dewayne E Perry ENS C Empirical Studies in Software Engineering Lecture 8

Experimental Design. Dewayne E Perry ENS C Empirical Studies in Software Engineering Lecture 8 Experimental Design Dewayne E Perry ENS 623 Perry@ece.utexas.edu 1 Problems in Experimental Design 2 True Experimental Design Goal: uncover causal mechanisms Primary characteristic: random assignment to

More information

Ursuline College Accelerated Program

Ursuline College Accelerated Program Ursuline College Accelerated Program CRITICAL INFORMATION! DO NOT SKIP THIS LINK BELOW... BEFORE PROCEEDING TO READ THE UCAP MODULE, YOU ARE EXPECTED TO READ AND ADHERE TO ALL UCAP POLICY INFORMATION CONTAINED

More information

ELEMENTS OF PSYCHOPHYSICS Sections VII and XVI. Gustav Theodor Fechner (1860/1912)

ELEMENTS OF PSYCHOPHYSICS Sections VII and XVI. Gustav Theodor Fechner (1860/1912) ELEMENTS OF PSYCHOPHYSICS Sections VII and XVI Gustav Theodor Fechner (1860/1912) Translated by Herbert Sidney Langfeld (1912) [Classics Editor's note: This translation of these passages from Fechner's

More information

Chapter 1 Introduction to Educational Research

Chapter 1 Introduction to Educational Research Chapter 1 Introduction to Educational Research The purpose of Chapter One is to provide an overview of educational research and introduce you to some important terms and concepts. My discussion in this

More information

Bayesian and Frequentist Approaches

Bayesian and Frequentist Approaches Bayesian and Frequentist Approaches G. Jogesh Babu Penn State University http://sites.stat.psu.edu/ babu http://astrostatistics.psu.edu All models are wrong But some are useful George E. P. Box (son-in-law

More information

Chapter 4: Defining and Measuring Variables

Chapter 4: Defining and Measuring Variables Chapter 4: Defining and Measuring Variables A. LEARNING OUTCOMES. After studying this chapter students should be able to: Distinguish between qualitative and quantitative, discrete and continuous, and

More information

Shiken: JALT Testing & Evaluation SIG Newsletter. 12 (2). April 2008 (p )

Shiken: JALT Testing & Evaluation SIG Newsletter. 12 (2). April 2008 (p ) Rasch Measurementt iin Language Educattiion Partt 2:: Measurementt Scalles and Invariiance by James Sick, Ed.D. (J. F. Oberlin University, Tokyo) Part 1 of this series presented an overview of Rasch measurement

More information

COURSE: NURSING RESEARCH CHAPTER I: INTRODUCTION

COURSE: NURSING RESEARCH CHAPTER I: INTRODUCTION COURSE: NURSING RESEARCH CHAPTER I: INTRODUCTION 1. TERMINOLOGY 1.1 Research Research is a systematic enquiry about a particular situation for a certain truth. That is: i. It is a search for knowledge

More information

Generalization and Theory-Building in Software Engineering Research

Generalization and Theory-Building in Software Engineering Research Generalization and Theory-Building in Software Engineering Research Magne Jørgensen, Dag Sjøberg Simula Research Laboratory {magne.jorgensen, dagsj}@simula.no Abstract The main purpose of this paper is

More information

Speaker Notes: Qualitative Comparative Analysis (QCA) in Implementation Studies

Speaker Notes: Qualitative Comparative Analysis (QCA) in Implementation Studies Speaker Notes: Qualitative Comparative Analysis (QCA) in Implementation Studies PART 1: OVERVIEW Slide 1: Overview Welcome to Qualitative Comparative Analysis in Implementation Studies. This narrated powerpoint

More information

Bill Wilson. Categorizing Cognition: Toward Conceptual Coherence in the Foundations of Psychology

Bill Wilson. Categorizing Cognition: Toward Conceptual Coherence in the Foundations of Psychology Categorizing Cognition: Toward Conceptual Coherence in the Foundations of Psychology Halford, G.S., Wilson, W.H., Andrews, G., & Phillips, S. (2014). Cambridge, MA: MIT Press http://mitpress.mit.edu/books/categorizing-cognition

More information

Analysis of the Reliability and Validity of an Edgenuity Algebra I Quiz

Analysis of the Reliability and Validity of an Edgenuity Algebra I Quiz Analysis of the Reliability and Validity of an Edgenuity Algebra I Quiz This study presents the steps Edgenuity uses to evaluate the reliability and validity of its quizzes, topic tests, and cumulative

More information

The SAGE Encyclopedia of Educational Research, Measurement, and Evaluation Multivariate Analysis of Variance

The SAGE Encyclopedia of Educational Research, Measurement, and Evaluation Multivariate Analysis of Variance The SAGE Encyclopedia of Educational Research, Measurement, Multivariate Analysis of Variance Contributors: David W. Stockburger Edited by: Bruce B. Frey Book Title: Chapter Title: "Multivariate Analysis

More information

ORIGINS AND DISCUSSION OF EMERGENETICS RESEARCH

ORIGINS AND DISCUSSION OF EMERGENETICS RESEARCH ORIGINS AND DISCUSSION OF EMERGENETICS RESEARCH The following document provides background information on the research and development of the Emergenetics Profile instrument. Emergenetics Defined 1. Emergenetics

More information

Validity and reliability of measurements

Validity and reliability of measurements Validity and reliability of measurements 2 Validity and reliability of measurements 4 5 Components in a dataset Why bother (examples from research) What is reliability? What is validity? How should I treat

More information

Multiple Act criterion:

Multiple Act criterion: Common Features of Trait Theories Generality and Stability of Traits: Trait theorists all use consistencies in an individual s behavior and explain why persons respond in different ways to the same stimulus

More information

Panel: Using Structural Equation Modeling (SEM) Using Partial Least Squares (SmartPLS)

Panel: Using Structural Equation Modeling (SEM) Using Partial Least Squares (SmartPLS) Panel: Using Structural Equation Modeling (SEM) Using Partial Least Squares (SmartPLS) Presenters: Dr. Faizan Ali, Assistant Professor Dr. Cihan Cobanoglu, McKibbon Endowed Chair Professor University of

More information

Psychological testing

Psychological testing Psychological testing Lecture 12 Mikołaj Winiewski, PhD Test Construction Strategies Content validation Empirical Criterion Factor Analysis Mixed approach (all of the above) Content Validation Defining

More information

Regression Discontinuity Analysis

Regression Discontinuity Analysis Regression Discontinuity Analysis A researcher wants to determine whether tutoring underachieving middle school students improves their math grades. Another wonders whether providing financial aid to low-income

More information

Research Methodologies

Research Methodologies Research Methodologies Quantitative, Qualitative, and Mixed Methods By Wylie J. D. Tidwell, III, Ph.D. www.linkedin.com/in/wylietidwell3 Consider... The research design is the blueprint that enables the

More information

The Classification Accuracy of Measurement Decision Theory. Lawrence Rudner University of Maryland

The Classification Accuracy of Measurement Decision Theory. Lawrence Rudner University of Maryland Paper presented at the annual meeting of the National Council on Measurement in Education, Chicago, April 23-25, 2003 The Classification Accuracy of Measurement Decision Theory Lawrence Rudner University

More information

Understanding. Regression Analysis

Understanding. Regression Analysis Understanding Regression Analysis Understanding Regression Analysis Michael Patrick Allen Washington State University Pullman, Washington Plenum Press New York and London Llbrary of Congress Cataloging-in-Publication

More information

Validity and reliability of measurements

Validity and reliability of measurements Validity and reliability of measurements 2 3 Request: Intention to treat Intention to treat and per protocol dealing with cross-overs (ref Hulley 2013) For example: Patients who did not take/get the medication

More information

11-3. Learning Objectives

11-3. Learning Objectives 11-1 Measurement Learning Objectives 11-3 Understand... The distinction between measuring objects, properties, and indicants of properties. The similarities and differences between the four scale types

More information

Lecture 11: Measurement to Hypotheses. Benjamin Graham

Lecture 11: Measurement to Hypotheses. Benjamin Graham Lecture 11: Measurement to Hypotheses Benjamin Graham Today s Schedule Homework #2 will be posted Friday. It is due October 2. Finish up validity and reliability Levels of Analysis Direction of Causation

More information

Reliability and Validity

Reliability and Validity Reliability and Validity Why Are They Important? Check out our opening graphics. In a nutshell, do you want that car? It's not reliable. Would you recommend that car magazine (Auto Tester Weakly) to a

More information

Using Generalizability Theory to Investigate the Psychometric Property of an Assessment Center in Indonesia

Using Generalizability Theory to Investigate the Psychometric Property of an Assessment Center in Indonesia Using Generalizability Theory to Investigate the Psychometric Property of an Assessment Center in Indonesia Urip Purwono received his Ph.D. (psychology) from the University of Massachusetts at Amherst,

More information

Intelligence as the Tests Test It

Intelligence as the Tests Test It Boring. E. G. (1923). Intelligence as the tests test it. New Republic, 36, 35 37. Intelligence as the Tests Test It Edwin G. Boring If you take on of the ready-made tests of intelligence and try it on

More information

Technical Whitepaper

Technical Whitepaper Technical Whitepaper July, 2001 Prorating Scale Scores Consequential analysis using scales from: BDI (Beck Depression Inventory) NAS (Novaco Anger Scales) STAXI (State-Trait Anxiety Inventory) PIP (Psychotic

More information

Introduction to Test Theory & Historical Perspectives

Introduction to Test Theory & Historical Perspectives Introduction to Test Theory & Historical Perspectives Measurement Methods in Psychological Research Lecture 2 02/06/2007 01/31/2006 Today s Lecture General introduction to test theory/what we will cover

More information

Introduction to Measurement

Introduction to Measurement This is a chapter excerpt from Guilford Publications. The Theory and Practice of Item Response Theory, by R. J. de Ayala. Copyright 2009. 1 Introduction to Measurement I often say that when you can measure

More information

Social-semiotic externalism in Can robot understand values?

Social-semiotic externalism in Can robot understand values? Social-semiotic externalism in Can robot understand values? English presentation seminar Prof. Minao Kukika November 20, 2015 Seiichiro Yasuda 1 List of acronyms (for your reference) SGP: ESGP: AMA: AI:

More information

Game Theoretic Statistics

Game Theoretic Statistics Game Theoretic Statistics Glenn Shafer CWI, Amsterdam, December 17, 2018 Game theory as an alternative foundation for probability Using the game theoretic foundation in statistics 1 May 2019 2 ON THE BACK

More information

Overview of the Logic and Language of Psychology Research

Overview of the Logic and Language of Psychology Research CHAPTER W1 Overview of the Logic and Language of Psychology Research Chapter Outline The Traditionally Ideal Research Approach Equivalence of Participants in Experimental and Control Groups Equivalence

More information