Basic Psychometrics for the Practicing Psychologist Presented by Yossef S. Ben-Porath, PhD, ABPP


1 Basic Psychometrics for the Practicing Psychologist Presented by Yossef S. Ben-Porath, PhD, ABPP. ABPP Annual Conference & Workshops, San Diego, CA, May 18, 2017

2 Basic Psychometrics for the Practicing Psychologist. Yossef S. Ben-Porath, Kent State University

3 Learning Objectives. Attendees will be able to: conceptualize and explain correlational statistics; evaluate the strengths and weaknesses of various methods for estimating reliability; use reliability estimates to calculate and use confidence intervals and standard errors of measurement; and evaluate the validity of psychological test scores used in psychological assessments.

Correlation. A researcher seeks to develop a psychological test to assist in identifying individuals with good leadership ability. He or she has studied the scientific literature on what makes good leaders and has concluded that dominance, the ability to influence others through interpersonal interactions, plays an important role in leadership. To explore this further, correlational analyses will be used to establish the association between dominance test scores and leadership ratings.

Correlation. A sample of individuals is administered a psychological test measuring dominance (variable X) and receives leadership ratings (variable Y) from peers who know them well. These two procedures yield raw scores for each participant on variables X and Y. Individual raw scores on psychological measures typically have no intrinsic meaning; they must be converted into a meaningful, standard metric, that is, they must be standardized.

4 Correlation. Deviation Scores reflect an individual's standing on a variable in reference to a group mean (M):

M_X = ΣX / N
x = X - M_X

where M_X = mean of variable X scores, Σ = sum of, X = raw score, N = number of participants (i.e., sample size), and x = deviation score on X. Like raw scores, deviation scores have no intrinsic meaning (though they are a bit more informative) because they are still expressed on the same arbitrary metric as the raw score.

Standard Scores reflect an individual's standing on a variable in reference to their group mean and its associated standard deviation:

SD_X = sqrt(Σx² / (N - 1))
Z_X = x / SD_X

where SD_X = standard deviation of X and Z_X = standard Z score for X. Standard scores are expressed on a uniform metric and can be compared across variables. Z scores are standard scores that have a mean of zero and a standard deviation of 1.

[Table: Some Standard Score Equivalents]
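As a minimal sketch of the mean, deviation-score, and Z-score formulas above (the raw scores here are invented for illustration):

```python
# Compute deviation scores and Z scores for a small made-up sample.
raw = [2, 4, 5, 7, 10]            # hypothetical raw scores on variable X
N = len(raw)
M = sum(raw) / N                  # M_X = sum(X) / N
dev = [x - M for x in raw]        # deviation scores: x = X - M_X
SD = (sum(d ** 2 for d in dev) / (N - 1)) ** 0.5   # sample SD of X
z = [d / SD for d in dev]         # Z scores: mean 0, SD 1

print(round(M, 2), round(SD, 2))
print([round(v, 2) for v in z])
```

Note that the Z scores always sum to zero, since they are deviations from the mean rescaled by a constant.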

5 Correlation. The Pearson Product-Moment Correlation (r) represents the extent to which standardized deviations from the mean on one variable (X) are associated with standardized deviations from the mean on a second variable (Y):

r_xy = Σ(Z_X * Z_Y) / (N - 1)

that is, the sum of the products of the Z scores divided by N - 1, where r_xy = correlation between variables X and Y.

[Figures: Scatterplots & Correlations for Positive Values of r; Scatterplots & Correlations for Negative Values of r]

Correlation. Measures with limited variability (i.e., not much deviation from the mean) cannot be correlated strongly with other measures. Extreme example where both variables have no variance: every subject obtains the same score on X and on Y.
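The formula above (sum of the products of Z scores, divided by N - 1) can be checked directly; the paired scores below are invented for illustration:

```python
# Pearson r computed as the sum of Z-score products divided by N - 1.
X = [2, 4, 5, 7, 10]   # hypothetical dominance scores
Y = [3, 5, 4, 8, 9]    # hypothetical leadership ratings

def zscores(v):
    n = len(v)
    m = sum(v) / n
    sd = (sum((x - m) ** 2 for x in v) / (n - 1)) ** 0.5
    return [(x - m) / sd for x in v]

zx, zy = zscores(X), zscores(Y)
r = sum(a * b for a, b in zip(zx, zy)) / (len(X) - 1)
print(round(r, 3))
```

For these made-up data the correlation comes out strongly positive (about .94), as the scatter of the pairs would suggest.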

6 Correlation. When both variables have no variance, all deviation scores are zero, the Z scores are undefined (division by a standard deviation of zero), and r cannot be computed. Where only one variable has limited variance, the Z-score products shrink, and the observed correlation is attenuated.

[Figures: Scatterplots Illustrating Unrestricted and Restricted Ranges]

7 [Figures: Scatterplots Illustrating Unrestricted and Restricted Ranges]

Range Restriction. [Figure: Scattergram of adult males and NBA players, illustrating the effect of restriction of range on the correlation of height and weight.] Limited variance (a.k.a. range restriction) may be a function of the variable, in which case it is not going to be very useful as a predictor of any other variable. Or it may be a function of restricted sampling, in which case the data will underestimate the true association between variables. As just noted, range restriction in just one variable will lower the observed correlation between the two variables.

8 Range Restriction. When might range restriction affect psychology research findings? Restricted sampling: use of convenience samples (e.g., college students, prison inmates) to study psychopathology (e.g., the association between psychopathy and substance abuse); range restriction will produce an underestimate of the true association. Low base rate phenomena (e.g., suicide) are difficult to predict because of range restriction.

Regression. In assessment, we typically use correlations between variables to predict an individual's standing on one variable (leadership) as a function of their standing on another (dominance). This can be accomplished by using regression models that take on the basic form:

Ŷ = bX + a

where Ŷ = predicted (leadership) score, b = slope of the regression line = r * (SD_Y / SD_X), X = raw test (dominance) score, and a = Y intercept = M_Y - b*M_X.

Numerical Example. Participants' dominance (X) and leadership (Y) scores and their Z-score products, with M_X = 5.6, M_Y = 5.8, SD_X = 2.79, SD_Y = 1.75. [Table: raw scores, Z scores, and Z-score products for the example sample]
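A minimal sketch of the regression computation, using the summary statistics from the slide's numerical example; the correlation r = .80 is an assumed value, since the raw-data table did not survive transcription:

```python
# Regression coefficients from summary statistics:
#   b = r * (SD_Y / SD_X),  a = M_Y - b * M_X,  Y_hat = b * X + a
# Means and SDs are from the slide's example; r = .80 is assumed.
r = 0.80
M_X, M_Y = 5.6, 5.8
SD_X, SD_Y = 2.79, 1.75

b = r * (SD_Y / SD_X)          # slope of the regression line
a = M_Y - b * M_X              # Y intercept
Y_hat = b * 8 + a              # predicted leadership for a dominance score of 8
print(round(b, 3), round(a, 3), round(Y_hat, 2))
```

With these values the slope is about .50, so each additional dominance point predicts about half a point of leadership rating.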

9 Regression and Range Restriction.

Ŷ = bX + a

where Ŷ = predicted (leadership) score, b = slope of the regression line = r * (SD_Y / SD_X), X = raw test (dominance) score, and a = Y intercept = M_Y - b*M_X.

Implications: Range-restricted correlations (r) result in greater prediction error.

10 Regression and Range Restriction. Implications: Range-restricted correlations (r) result in greater prediction error. If SD_Y and SD_X differ across samples or settings, prediction errors will vary as well.

Reliability and Measurement Error

Reliability. Definitions: conceptual; psychometric. Ways of estimating. Applications: estimating true scores; establishing confidence intervals.

Reliability Defined Conceptually. The reliability of a test score reflects the precision with which it represents an individual's standing on the test. If we administer a WAIS-IV, how accurate is the resulting IQ score as an indication of the individual's actual IQ? If Mr. Jones obtains an IQ score of 120, how much confidence do we have that his actual IQ (not his intelligence!) is 120?

11 Reliability Defined Psychometrically. In Classical Test Theory:

X = T + e

where X = Observed Score (the test score), T = True Score (the individual's actual score if measured without error), and e = random measurement error.

Two Types of Measurement Error. Random measurement error results from unsystematic factors that contribute to the observed score. Test-taker: fatigue, distractibility, mood. Testing conditions: lighting, noise, variability in the examiner. By definition, random measurement error cannot be correlated with any other variable; if it correlates with another variable, it is not unsystematic.

Systematic error affects the True Score and, by definition, is not random. It affects test score validity rather than reliability. Examples of systematic error: items don't measure the intended construct, reactivity, bias. Systematic error can correlate with other variables.

Reliability Defined Psychometrically. Every time we conduct a measurement, we encounter a different amount of random error:

X_1 = T + e_1
X_2 = T + e_2
X_3 = T + e_3
...
X_k = T + e_k

T remains constant, whereas X_i varies as a function of e_i. Changes in X are a product of random error.

12 Reliability Defined Psychometrically. Random measurement error will sometimes push the observed score above the true score and other times pull the observed score below the true score. The mean of a hypothetical, infinite number of measurements is equal to the True Score:

T = ΣX_i / k   (as k, the number of measurements, grows without limit)

This method for determining the true score is impossible to implement. We can, alternatively, estimate the correlation between observed and true scores, r_XT. If r_XT = 1.0, then X = T and there is no error. Random error is reflected in the extent to which r_XT < 1.0. Expressed in terms of standard deviations from the mean:

r_XT = σ_T / σ_X

If σ_X > σ_T, then r_XT < 1.0.

13 Reliability Defined Psychometrically. The square of a correlation indicates the proportion of variance shared by the two variables. The squared correlation between the Observed and True scores reflects the proportion of true score variance in observed score variance. This is the psychometric definition of reliability:

r_XX = r_XT² = σ_T² / σ_X²

Reliability is the ratio of True Score variance to Observed Score variance.

Ways of Estimating Reliability. Because we don't know the actual true score, we can only estimate reliability by indirect means: test-retest, alternate (equivalent) forms, internal consistency, inter-scorer/inter-rater.

Test-Retest Reliability. A test is administered to a group of people twice. Scores are correlated across Time 1 and Time 2.

X_1 = T_1 + e_1
X_2 = T_2 + e_2

Because they are random variables, e_1 and e_2 cannot be correlated with each other or with T_1, T_2, X_1, or X_2. It is assumed that T_1 = T_2. It can be derived that r_X1X2 = r_XX.

Test-Retest Reliability Shortcomings. Reasons why T_1 may not equal T_2: temporal instability; repeated measurement. If T_1 ≠ T_2, then r_X1X2 will underestimate reliability by attributing changes from T_1 to T_2 to measurement error. The more unstable the measured construct, and the longer the interval between measurements, the more likely this is to occur. Example: anger.
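The claim that the Time 1-Time 2 correlation estimates the ratio of true-score variance to observed-score variance can be checked with a small simulation (all population values here are invented):

```python
# Simulate X = T + e twice per person and verify that the test-retest
# correlation approximates reliability = var(T) / var(X).
import random

random.seed(0)
n = 200_000
sd_true, sd_err = 10.0, 5.0                     # assumed population SDs
T = [random.gauss(100, sd_true) for _ in range(n)]
X1 = [t + random.gauss(0, sd_err) for t in T]   # Time 1 observed scores
X2 = [t + random.gauss(0, sd_err) for t in T]   # Time 2 observed scores

def corr(a, b):
    m = len(a)
    ma, mb = sum(a) / m, sum(b) / m
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b)) / m
    va = sum((x - ma) ** 2 for x in a) / m
    vb = sum((y - mb) ** 2 for y in b) / m
    return cov / (va * vb) ** 0.5

expected = sd_true ** 2 / (sd_true ** 2 + sd_err ** 2)   # 100 / 125 = .80
print(round(corr(X1, X2), 2), round(expected, 2))
```

Because the two error terms are independent of T and of each other, the only shared variance across administrations is true-score variance, which is why the correlation recovers the reliability.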

14 Alternate Forms Reliability. Rather than administer the same test (X) twice, an alternate form (Y), assumed to be equivalent to X, is administered, and scores are correlated across forms administered at the same time.

X_1 = T_1 + e_1
Y_1 = T_2 + e_2

Because they are random variables, e_1 and e_2 cannot be correlated with each other or with T_1, T_2, X_1, or Y_1. It is assumed that T_1 = T_2. It can be derived that r_X1Y1 = r_XX.

Alternate Forms Reliability Shortcomings. Reasons why T_1 may not equal T_2: forms may not be truly equivalent measures; repeated measurement. If T_1 ≠ T_2, then r_X1Y1 will underestimate reliability by attributing changes from T_1 to T_2 to measurement error. The more difficult it is to construct truly equivalent forms, the more likely this is to occur. This is more likely to be problematic when measuring personality or psychopathology than cognitive functioning.

Internal Consistency Reliability. Rather than construct alternate forms, we treat each item on a scale as an alternate measure of T. We calculate correlations between all possible pairs of items and create a variable that represents the average correlation between items. We then adjust that average correlation to account for the fact that each observation is based on a correlation between single items, whereas the test score represents a composite of all of the test items (i.e., multiple measurements).

Internal Consistency Reliability. Recall that T = ΣX_i / k. Therefore, the more measurements we take, the closer their average is likely to be to the True Score. When we create a composite Observed Score based on multiple measurements (items), we enhance the Observed Score's reliability.

15 Internal Consistency Reliability. The relation between the number of items on a scale and the composite score's reliability is reflected in the Spearman-Brown formula:

r_kk = (k * r_xx) / (1 + (k - 1) * r_xx)

where r_kk = the corrected reliability estimate, k = the factor by which the number of items is increased, and r_xx = the uncorrected reliability estimate. This assumes that additional items are equivalent measures of T.

For example: A five-item scale has been found to have an internal consistency of .70. An investigator wants to estimate what the reliability would be if five more items are added (i.e., 5 * 2 = 10 items; k = 2):

r_kk = (2 * .70) / (1 + (2 - 1) * .70) = 1.4 / 1.7 = .82

Cronbach's Coefficient Alpha (α) is the most commonly used index of internal consistency. It estimates a test's reliability based on the average correlation between all possible pairs of items on a scale, corrected for the fact that these correlations are based on single measurements (correlations between single items). One way to increase reliability is by adding items: short tests tend to be less reliable, and single items are least reliable.

Inter-Scorer/Rater Reliability. Some psychological measures require that scorers (or raters) exercise judgment. Examples: scoring Rorschach variables; structured diagnostic interviewing; coding behavioral observations. Random differences among scorers/raters contribute random error to the Observed Score, resulting in lower reliability.
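The Spearman-Brown projection above is a one-liner to verify:

```python
# Spearman-Brown: projected reliability when a test is lengthened by factor k.
def spearman_brown(r_xx, k):
    return (k * r_xx) / (1 + (k - 1) * r_xx)

# The slide's example: a 5-item scale with reliability .70, doubled to 10 items.
print(round(spearman_brown(0.70, 2), 3))   # about .824
```

Setting k < 1 gives the projected reliability of a shortened form, which is the same formula run in reverse.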

16 Inter-Scorer/Rater Reliability. Inter-scorer/rater reliability is determined by examining correlations between test scores across scorers/raters. Examples: a set of Rorschach protocols is scored by two different individuals; two interviewers code the same set of responses to a structured diagnostic interview; two raters code the same video-recorded sample of behavior.

Note that the same set of data is scored in each instance. If, for example, two psychologists administered and scored Rorschachs separately, we could be confounding test-retest and inter-scorer reliability, as well as any interaction effects between the examiner and the test taker.

When two or more individuals actually generate the data to be scored (i.e., two separate structured psychiatric interviews are conducted and scored by two independent interviewers), then we evaluate inter-examination reliability. The same techniques are used; however, the results may be interpreted differently because of the added effects of repeated measurement.

Estimating Reliability. All four methods for estimating reliability (test-retest, alternate forms, internal consistency, inter-scorer or inter-rater) are just that: estimates. For various reasons already discussed, they are usually underestimates of actual reliability. In addition, if we use samples with restricted ranges on the variables of interest, this too will yield an underestimate of reliability; for example, if we use a non-clinical sample to estimate the reliability of a psychopathology measure.

17 Applications of Reliability: estimating true scores; establishing confidence intervals and the significance of the difference between test scores.

Estimating True Scores. With knowledge of the following: an individual's Observed Score, the reliability of that score, and the mean of that score in that individual's population, it is possible to estimate his or her true score on the construct being measured:

T' = (1 - r_xx) * M_X + r_xx * X

where T' = estimated True Score, r_xx = the test's reliability, M_X = the population's mean observed score on X, and X = the individual's raw score on X.

18 Estimating True Scores. T' = (1 - r_xx) * M_X + r_xx * X. Note that: if r_xx = 1, then T' = (1 - 1) * M_X + 1 * X = X (i.e., if a test is perfectly reliable, the estimated true score actually equals the observed score); if r_xx = 0, then T' = (1 - 0) * M_X + 0 * X = M_X (i.e., if a test is perfectly unreliable, then our best estimate of the individual's true score [the one that's least likely to be wrong] is the population mean). As reliability decreases, the predicted True Score approaches the population mean. This is known as regression to the mean due to measurement error.

An individual obtains an IQ score of 120. If reliability is .90: T' = (1 - .9) * 100 + .9 * 120 = 118. If reliability is .80: T' = (1 - .8) * 100 + .8 * 120 = 116. If reliability is .70: T' = (1 - .7) * 100 + .7 * 120 = 114.
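The estimated-true-score formula and the slide's IQ examples can be sketched as:

```python
# Estimated true score: T' = (1 - r_xx) * M_X + r_xx * X
# (IQ metric assumed: population mean M_X = 100.)
def estimated_true_score(X, r_xx, M_X=100):
    return (1 - r_xx) * M_X + r_xx * X

for r in (0.90, 0.80, 0.70):
    print(r, round(estimated_true_score(120, r)))
```

Each drop in reliability pulls the estimate further from the observed score of 120 and toward the population mean of 100, which is regression to the mean in action.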

19 Establishing Confidence Intervals. The estimated True Score represents the most accurate estimate of the actual True Score, as constrained by the reliability of a measure. As a measure's reliability approaches 1.0, our estimate of the True Score is more likely to be accurate.

With knowledge of the following: an individual's estimated True Score, the reliability of that score, AND the standard deviation of that score in the individual's population, it is possible to identify a range of scores within which that individual's True Score is likely to fall at a given probability level. This requires calculating the standard error of measurement:

SE_M = SD_X * sqrt(1 - r_xx)

where SE_M = standard error of measurement, SD_X = standard deviation of X, and r_xx = reliability of X. If r_xx of an IQ test is .90: SE_M = 15 * .316 = 4.74. If r_xx is .80: SE_M = 15 * .447 = 6.71. If r_xx is .70: SE_M = 15 * .548 = 8.22.
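A minimal check of the SEM formula, using the IQ-scale standard deviation of 15 from the slide:

```python
# Standard error of measurement: SEM = SD * sqrt(1 - r_xx)
def sem(sd, r_xx):
    return sd * (1 - r_xx) ** 0.5

for r in (0.90, 0.80, 0.70):
    print(r, round(sem(15, r), 2))
```

Note how quickly the SEM grows as reliability falls: a drop from .90 to .70 nearly doubles the measurement error band.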

20 Confidence Intervals. The standard error of measurement can be used to place confidence intervals around the estimated True Score. Based on the characteristics of the normal curve, we can determine that there is a 68% probability that the actual True Score will fall within one SE_M above or below the estimated True Score, and a 95% probability that it will fall within two SE_Ms of this score. [Figure: Area under the normal curve, marked at -2 SEm, -1 SEm, T, +1 SEm, +2 SEm]

If an individual obtains an IQ score of 120 on a measure with estimated .90 reliability, then: estimated True Score = 118; 68% confidence interval: 113-123; 95% confidence interval: 109-127. Note that confidence intervals range symmetrically about the estimated True Score (118), NOT the Observed Score (120).

If an individual obtains an IQ score of 120 on a measure with estimated .80 reliability, then: estimated True Score = 116; 68% confidence interval: 109-123; 95% confidence interval: 103-129. If an individual obtains an IQ score of 120 on a measure with estimated .70 reliability, then: estimated True Score = 114; 68% confidence interval: 106-122; 95% confidence interval: 98-130.
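Putting the pieces together, the confidence-interval construction can be sketched as follows (IQ metric assumed: mean 100, SD 15):

```python
# Confidence interval around the estimated true score:
#   T' = (1 - r_xx) * M + r_xx * X
#   SEM = SD * sqrt(1 - r_xx)
#   CI = T' +/- n_sem * SEM   (n_sem = 1 for ~68%, 2 for ~95%)
def confidence_interval(X, r_xx, M=100, SD=15, n_sem=1):
    t_prime = (1 - r_xx) * M + r_xx * X
    sem = SD * (1 - r_xx) ** 0.5
    return t_prime - n_sem * sem, t_prime + n_sem * sem

# The slide's example: observed IQ of 120 with .90 reliability.
lo68, hi68 = confidence_interval(120, 0.90, n_sem=1)
lo95, hi95 = confidence_interval(120, 0.90, n_sem=2)
print(round(lo68), round(hi68))
print(round(lo95), round(hi95))
```

The bands are centered on the estimated true score (118 here), not on the observed score, which is why they sit asymmetrically around 120.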

21 Confidence Intervals. If an individual obtains an IQ score of 75 on a measure with estimated .80 reliability, then: T' = (1 - .80) * 100 + .80 * 75 = 20 + 60 = 80; estimated True Score = 80; 68% confidence interval: 73-87; 95% confidence interval: 67-93.

Hall v. Florida, 134 S. Ct. (2014). The U.S. Supreme Court held that the SEM must be recognized when assessing intellectual disability and found Florida's bright-line statute unconstitutional.

Validity. The validity of a test score reflects the extent to which it represents an individual's standing on the construct of interest. The reliability of a test score reflects its precision. Reliability sets the upper limit for validity, but has no bearing on its lower limit.

22 Validity. Unlike reliability, establishing information about test-score validity is an ongoing process. Validation is the ongoing process of accumulating information about the meaning of test scores. Sound psychological testing and assessment practice requires that test users remain current on the products of ongoing test validation research.

Validity is not a dichotomous property of test scores. Rather, test scores may be more or less valid depending upon their intended use AND the population with which the test is used. Examples: Intended use: IQ test scores may help identify children with special educational needs; however, they are not good predictors of special behavioral needs. Population: An adult IQ test will not be very helpful in identifying a 10-year-old's special educational needs.

Three primary sources may provide test score validity information: (a) the test's content; (b) the test scores' empirical associations with other variables; and (c) the relation of (a) and (b) to theory about the construct being measured. These are also known, respectively, as Content, Criterion, and Construct Validity.

23 Content Validity. Content validity reflects the extent to which test items canvass the relevant content domain. Being able to define the relevant content domain is a crucial step in the process of establishing content validity. A structured psychiatric interview designed to diagnose schizophrenia could be examined to determine the extent to which it includes questions that address all of the diagnostic criteria for the disorder.

Content validity is not a static property of test scores. As our understanding or definition of a construct's relevant content domain changes, so does the content validity of its measures. For example, if we learn more about manifestations of PTSD or psychopathy, content-based measures of these constructs need to be updated.

Content validity can be sufficient, but is not necessary, for test scores to be valid. Sufficient: when test score interpretation is content-based, adequate representation of the content domain is sufficient to support test score validity; if a reliable structured psychiatric interview adequately canvasses the current diagnostic criteria, a positive test score validly indicates the diagnosis. Not necessary: scores on a test that does not satisfy content validity criteria may nonetheless be valid.

24 Criterion Validity. Criterion validity of test scores is established based on their empirical associations with other variables; the other variables are the criteria. For a self-report test designed to yield valid psycho-diagnoses, correlations between test scores and actual diagnoses would provide information about criterion validity. For a test designed to identify potentially problematic employees, future job ratings would provide criterion validity information.

These two examples represent two sub-types of criterion validity. Concurrent criterion validity: criteria are measured at the same time the test is administered. Predictive criterion validity: criteria are measured some time after the test is administered. Prediction, in psychometric lingo, is not necessarily future oriented; predictive validity, on the other hand, is future oriented.

Three major challenges in establishing test scores' criterion validity: identifying appropriate criteria; measuring them adequately and appropriately; and determining their appropriate association with test scores.

Construct Validity. The absence of a gold standard is one of the primary challenges to evaluating criterion validity; in its absence, we do the best we can. This challenge led to the delineation of construct validity as an important psychometric feature of psychological test scores. The term construct validity was coined in the 1954 APA Technical Recommendations for Psychological Tests and Diagnostic Techniques.

25 Construct Validity. Construct validity defined conceptually: the extent to which test scores measure a theoretical construct or trait. Examples of theoretical constructs: scholastic aptitude, neuroticism, intelligence, leadership, depression. This definition is synonymous with the definition of validity; it is the research method and its underlying conceptualization that differ.

Theoretical foundations and research methods for elucidating test scores' construct validity were presented by Cronbach and Meehl (1955), both of whom were members of the APA Committee on Psychological Tests that produced the technical recommendations. Construct validity allows the test user to go beyond empirical correlates when interpreting test results.

Classification Accuracy Statistics. Often, in assessment, we are called upon to make dichotomous predictions: schizophrenia v. not schizophrenia; alcohol abuser v. not alcohol abuser; genuine psychopathology v. malingering; will act out violently v. will not act out violently; will complete therapy v. will not complete therapy; will attempt suicide v. will not attempt suicide.

26 Classification Prediction. However, the measures we use, and the constructs they assess, tend to be continuous: WAIS: intelligence; CPI/MMPI/PAI: personality traits/behavioral proclivities/psychopathology. To make classification predictions, we establish cutoffs and create dichotomies (elevated versus non-elevated). To establish cutoffs and evaluate their validity, we collect data that allow for calculating classification accuracy statistics, arranged in a 2x2 table:

                      Problem Occurs
Test Score            Yes     No
  Elevated             a       b
  Non-elevated         c       d

Desirable outcomes: a = elevated score correctly predicts that a problem will occur; d = non-elevated score correctly predicts that a problem will not occur.

Classification Accuracy Statistics. Base Rate (BR): proportion of the sample that actually has the predicted condition. Selection Ratio (SR): proportion of the sample that is predicted to have the condition. Sensitivity (Sen): proportion of those who have the condition who are predicted to have the condition. Specificity (Spec): proportion of those who do NOT have the condition who are predicted NOT to have the condition. Positive Predictive Power (PPP): proportion of those predicted to have the condition who actually have the condition. Negative Predictive Power (NPP): proportion of those predicted NOT to have the condition who actually do not have the condition. True Positive Rate (TPR): proportion of those who have the condition who are predicted to have the condition (= sensitivity). False Positive Rate (FPR): proportion of those who do NOT have the condition who are predicted to have it (= 1 - specificity). Hit Rate (HR): proportion of correct predictions in the overall sample.

27 Classification Accuracy Statistics. With cell counts a, b, c, d and n = a + b + c + d:

                      Problem Occurs
Test Score            Yes     No
  Elevated             a       b      a+b
  Non-elevated         c       d      c+d
                      a+c     b+d      n

BR = (a+c)/n     SR = (a+b)/n
Sen = a/(a+c)    Spec = d/(b+d)
PPP = a/(a+b)    NPP = d/(c+d)
TPR = a/(a+c)    FPR = 1 - d/(b+d)
HR = (a+d)/n

Classification Prediction. A psychologist has been asked by the FAA to construct an assessment procedure designed to identify pilots who abuse alcohol. The psychologist develops a test using large validation and cross-validation samples. The psychologist indicates that this is a very effective test; in fact, it identifies 95% of all pilots who abuse alcohol. The same statistics apply, with the rows as predictions (abuse / not abuse) and the columns as actual status (positive / negative).

28 Classification Accuracy Statistics (FAA example, n = 10,000):

                      Actual
Prediction        Abuse    Not Abuse
  Abuse            475       4525
  Not Abuse         25       4975

BR = (a+c)/n = 500/10000 = .05
SR = (a+b)/n = 5000/10000 = .50
Sen = a/(a+c) = 475/500 = .95
TPR = a/(a+c) = 475/500 = .95
Spec = d/(b+d) = 4975/9500 = .52
FPR = 1 - (d/(b+d)) = 1 - (4975/9500) = .48
PPP = a/(a+b) = 475/5000 = .095
NPP = d/(c+d) = 4975/5000 = .995
HR = (a+d)/n = 5450/10000 = .54

Using a higher classification cutoff will typically improve specificity at the cost of sensitivity. Lower specificity means higher false positives (FPR = 1 - Spec). Which of these is emphasized depends in part on the nature of the assessment. For screening purposes (leading to more detailed assessment), sensitivity is key to not missing actual cases; the more detailed assessment will (hopefully) further reduce false positives.

Classification Prediction. Low base rate phenomena present a challenge for positive predictive power. Why? Range restriction! Implication: some test classification accuracy characteristics established with one population will not apply to another population with a different BR.
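The FAA example's statistics can be recomputed from the four cell counts; the cell values below follow from the slide's figures (base rate .05, selection ratio .50, sensitivity .95, n = 10,000):

```python
# Classification accuracy statistics for a 2x2 prediction table.
a, b, c, d = 475, 4525, 25, 4975   # TP, FP, FN, TN from the FAA example
n = a + b + c + d

BR   = (a + c) / n       # base rate
SR   = (a + b) / n       # selection ratio
sens = a / (a + c)       # sensitivity / true positive rate
spec = d / (b + d)       # specificity
PPP  = a / (a + b)       # positive predictive power
NPP  = d / (c + d)       # negative predictive power
FPR  = 1 - spec          # false positive rate
HR   = (a + d) / n       # hit rate

print(BR, SR, sens, round(spec, 2), PPP, NPP, round(HR, 3))
```

Despite the impressive 95% sensitivity, only about 1 in 10 flagged pilots actually abuses alcohol (PPP = .095), which is the low-base-rate problem the slide warns about.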


More information

Introduction to Reliability

Introduction to Reliability Reliability Thought Questions: How does/will reliability affect what you do/will do in your future job? Which method of reliability analysis do you find most confusing? Introduction to Reliability What

More information

Test Validity. What is validity? Types of validity IOP 301-T. Content validity. Content-description Criterion-description Construct-identification

Test Validity. What is validity? Types of validity IOP 301-T. Content validity. Content-description Criterion-description Construct-identification What is? IOP 301-T Test Validity It is the accuracy of the measure in reflecting the concept it is supposed to measure. In simple English, the of a test concerns what the test measures and how well it

More information

CHAPTER ONE CORRELATION

CHAPTER ONE CORRELATION CHAPTER ONE CORRELATION 1.0 Introduction The first chapter focuses on the nature of statistical data of correlation. The aim of the series of exercises is to ensure the students are able to use SPSS to

More information

Throughout this book, we have emphasized the fact that psychological measurement

Throughout this book, we have emphasized the fact that psychological measurement CHAPTER 7 The Importance of Reliability Throughout this book, we have emphasized the fact that psychological measurement is crucial for research in behavioral science and for the application of behavioral

More information

PÄIVI KARHU THE THEORY OF MEASUREMENT

PÄIVI KARHU THE THEORY OF MEASUREMENT PÄIVI KARHU THE THEORY OF MEASUREMENT AGENDA 1. Quality of Measurement a) Validity Definition and Types of validity Assessment of validity Threats of Validity b) Reliability True Score Theory Definition

More information

Variables in Research. What We Will Cover in This Section. What Does Variable Mean?

Variables in Research. What We Will Cover in This Section. What Does Variable Mean? Variables in Research 9/20/2005 P767 Variables in Research 1 What We Will Cover in This Section Nature of variables. Measuring variables. Reliability. Validity. Measurement Modes. Issues. 9/20/2005 P767

More information

RELIABILITY AND VALIDITY (EXTERNAL AND INTERNAL)

RELIABILITY AND VALIDITY (EXTERNAL AND INTERNAL) UNIT 2 RELIABILITY AND VALIDITY (EXTERNAL AND INTERNAL) Basic Process/Concept in Research Structure 2.0 Introduction 2.1 Objectives 2.2 Reliability 2.3 Methods of Estimating Reliability 2.3.1 External

More information

Research Questions and Survey Development

Research Questions and Survey Development Research Questions and Survey Development R. Eric Heidel, PhD Associate Professor of Biostatistics Department of Surgery University of Tennessee Graduate School of Medicine Research Questions 1 Research

More information

Understanding CELF-5 Reliability & Validity to Improve Diagnostic Decisions

Understanding CELF-5 Reliability & Validity to Improve Diagnostic Decisions Understanding CELF-5 Reliability & Validity to Improve Diagnostic Decisions Senior Educational Consultant Pearson Disclosures Dr. Scheller is an employee of Pearson, publisher of the CELF-5. No other language

More information

Reliability & Validity Dr. Sudip Chaudhuri

Reliability & Validity Dr. Sudip Chaudhuri Reliability & Validity Dr. Sudip Chaudhuri M. Sc., M. Tech., Ph.D., M. Ed. Assistant Professor, G.C.B.T. College, Habra, India, Honorary Researcher, Saha Institute of Nuclear Physics, Life Member, Indian

More information

Psychology Research Process

Psychology Research Process Psychology Research Process Logical Processes Induction Observation/Association/Using Correlation Trying to assess, through observation of a large group/sample, what is associated with what? Examples:

More information

Measurement and Descriptive Statistics. Katie Rommel-Esham Education 604

Measurement and Descriptive Statistics. Katie Rommel-Esham Education 604 Measurement and Descriptive Statistics Katie Rommel-Esham Education 604 Frequency Distributions Frequency table # grad courses taken f 3 or fewer 5 4-6 3 7-9 2 10 or more 4 Pictorial Representations Frequency

More information

EVALUATING AND IMPROVING MULTIPLE CHOICE QUESTIONS

EVALUATING AND IMPROVING MULTIPLE CHOICE QUESTIONS DePaul University INTRODUCTION TO ITEM ANALYSIS: EVALUATING AND IMPROVING MULTIPLE CHOICE QUESTIONS Ivan Hernandez, PhD OVERVIEW What is Item Analysis? Overview Benefits of Item Analysis Applications Main

More information

Chapter 3. Psychometric Properties

Chapter 3. Psychometric Properties Chapter 3 Psychometric Properties Reliability The reliability of an assessment tool like the DECA-C is defined as, the consistency of scores obtained by the same person when reexamined with the same test

More information

Overview of Experimentation

Overview of Experimentation The Basics of Experimentation Overview of Experiments. IVs & DVs. Operational Definitions. Reliability. Validity. Internal vs. External Validity. Classic Threats to Internal Validity. Lab: FP Overview;

More information

Item Analysis Explanation

Item Analysis Explanation Item Analysis Explanation The item difficulty is the percentage of candidates who answered the question correctly. The recommended range for item difficulty set forth by CASTLE Worldwide, Inc., is between

More information

Statistical Methods and Reasoning for the Clinical Sciences

Statistical Methods and Reasoning for the Clinical Sciences Statistical Methods and Reasoning for the Clinical Sciences Evidence-Based Practice Eiki B. Satake, PhD Contents Preface Introduction to Evidence-Based Statistics: Philosophical Foundation and Preliminaries

More information

Clinician-reported Outcomes (ClinROs), Concepts and Development

Clinician-reported Outcomes (ClinROs), Concepts and Development Clinician-reported Outcomes (ClinROs), Concepts and Development William Lenderking, PhD, Senior Research Leader; Dennis Revicki, PhD, Senior Vice President, Outcomes Research In heathcare, there are many

More information

Reliability. Internal Reliability

Reliability. Internal Reliability 32 Reliability T he reliability of assessments like the DECA-I/T is defined as, the consistency of scores obtained by the same person when reexamined with the same test on different occasions, or with

More information

Using Analytical and Psychometric Tools in Medium- and High-Stakes Environments

Using Analytical and Psychometric Tools in Medium- and High-Stakes Environments Using Analytical and Psychometric Tools in Medium- and High-Stakes Environments Greg Pope, Analytics and Psychometrics Manager 2008 Users Conference San Antonio Introduction and purpose of this session

More information

HPS301 Exam Notes- Contents

HPS301 Exam Notes- Contents HPS301 Exam Notes- Contents Week 1 Research Design: What characterises different approaches 1 Experimental Design 1 Key Features 1 Criteria for establishing causality 2 Validity Internal Validity 2 Threats

More information

Sample Exam Questions Psychology 3201 Exam 1

Sample Exam Questions Psychology 3201 Exam 1 Scientific Method Scientific Researcher Scientific Practitioner Authority External Explanations (Metaphysical Systems) Unreliable Senses Determinism Lawfulness Discoverability Empiricism Control Objectivity

More information

Chapter 3 Psychometrics: Reliability and Validity

Chapter 3 Psychometrics: Reliability and Validity 34 Chapter 3 Psychometrics: Reliability and Validity Every classroom assessment measure must be appropriately reliable and valid, be it the classic classroom achievement test, attitudinal measure, or performance

More information

Collecting & Making Sense of

Collecting & Making Sense of Collecting & Making Sense of Quantitative Data Deborah Eldredge, PhD, RN Director, Quality, Research & Magnet Recognition i Oregon Health & Science University Margo A. Halm, RN, PhD, ACNS-BC, FAHA Director,

More information

11-3. Learning Objectives

11-3. Learning Objectives 11-1 Measurement Learning Objectives 11-3 Understand... The distinction between measuring objects, properties, and indicants of properties. The similarities and differences between the four scale types

More information

ADMS Sampling Technique and Survey Studies

ADMS Sampling Technique and Survey Studies Principles of Measurement Measurement As a way of understanding, evaluating, and differentiating characteristics Provides a mechanism to achieve precision in this understanding, the extent or quality As

More information

Collecting & Making Sense of

Collecting & Making Sense of Collecting & Making Sense of Quantitative Data Deborah Eldredge, PhD, RN Director, Quality, Research & Magnet Recognition i Oregon Health & Science University Margo A. Halm, RN, PhD, ACNS-BC, FAHA Director,

More information

PTHP 7101 Research 1 Chapter Assignments

PTHP 7101 Research 1 Chapter Assignments PTHP 7101 Research 1 Chapter Assignments INSTRUCTIONS: Go over the questions/pointers pertaining to the chapters and turn in a hard copy of your answers at the beginning of class (on the day that it is

More information

Any phenomenon we decide to measure in psychology, whether it is

Any phenomenon we decide to measure in psychology, whether it is 05-Shultz.qxd 6/4/2004 6:01 PM Page 69 Module 5 Classical True Score Theory and Reliability Any phenomenon we decide to measure in psychology, whether it is a physical or mental characteristic, will inevitably

More information

Using the Rasch Modeling for psychometrics examination of food security and acculturation surveys

Using the Rasch Modeling for psychometrics examination of food security and acculturation surveys Using the Rasch Modeling for psychometrics examination of food security and acculturation surveys Jill F. Kilanowski, PhD, APRN,CPNP Associate Professor Alpha Zeta & Mu Chi Acknowledgements Dr. Li Lin,

More information

The Psychometric Principles Maximizing the quality of assessment

The Psychometric Principles Maximizing the quality of assessment Summer School 2009 Psychometric Principles Professor John Rust University of Cambridge The Psychometric Principles Maximizing the quality of assessment Reliability Validity Standardisation Equivalence

More information

Appendix B Statistical Methods

Appendix B Statistical Methods Appendix B Statistical Methods Figure B. Graphing data. (a) The raw data are tallied into a frequency distribution. (b) The same data are portrayed in a bar graph called a histogram. (c) A frequency polygon

More information

Associate Prof. Dr Anne Yee. Dr Mahmoud Danaee

Associate Prof. Dr Anne Yee. Dr Mahmoud Danaee Associate Prof. Dr Anne Yee Dr Mahmoud Danaee 1 2 What does this resemble? Rorschach test At the end of the test, the tester says you need therapy or you can't work for this company 3 Psychological Testing

More information

12/31/2016. PSY 512: Advanced Statistics for Psychological and Behavioral Research 2

12/31/2016. PSY 512: Advanced Statistics for Psychological and Behavioral Research 2 PSY 512: Advanced Statistics for Psychological and Behavioral Research 2 Introduce moderated multiple regression Continuous predictor continuous predictor Continuous predictor categorical predictor Understand

More information

Unit 1 Exploring and Understanding Data

Unit 1 Exploring and Understanding Data Unit 1 Exploring and Understanding Data Area Principle Bar Chart Boxplot Conditional Distribution Dotplot Empirical Rule Five Number Summary Frequency Distribution Frequency Polygon Histogram Interquartile

More information

26:010:557 / 26:620:557 Social Science Research Methods

26:010:557 / 26:620:557 Social Science Research Methods 26:010:557 / 26:620:557 Social Science Research Methods Dr. Peter R. Gillett Associate Professor Department of Accounting & Information Systems Rutgers Business School Newark & New Brunswick 1 Overview

More information

Validity refers to the accuracy of a measure. A measurement is valid when it measures what it is suppose to measure and performs the functions that

Validity refers to the accuracy of a measure. A measurement is valid when it measures what it is suppose to measure and performs the functions that Validity refers to the accuracy of a measure. A measurement is valid when it measures what it is suppose to measure and performs the functions that it purports to perform. Does an indicator accurately

More information

CLINICAL VS. BEHAVIOR ASSESSMENT

CLINICAL VS. BEHAVIOR ASSESSMENT CLINICAL VS. BEHAVIOR ASSESSMENT Informal Tes3ng Personality Tes3ng Assessment Procedures Ability Tes3ng The Clinical Interview 3 Defining Clinical Assessment The process of assessing the client through

More information

PLS 506 Mark T. Imperial, Ph.D. Lecture Notes: Reliability & Validity

PLS 506 Mark T. Imperial, Ph.D. Lecture Notes: Reliability & Validity PLS 506 Mark T. Imperial, Ph.D. Lecture Notes: Reliability & Validity Measurement & Variables - Initial step is to conceptualize and clarify the concepts embedded in a hypothesis or research question with

More information

Regression CHAPTER SIXTEEN NOTE TO INSTRUCTORS OUTLINE OF RESOURCES

Regression CHAPTER SIXTEEN NOTE TO INSTRUCTORS OUTLINE OF RESOURCES CHAPTER SIXTEEN Regression NOTE TO INSTRUCTORS This chapter includes a number of complex concepts that may seem intimidating to students. Encourage students to focus on the big picture through some of

More information

How Do We Gather Evidence of Validity Based on a Test s Relationships With External Criteria?

How Do We Gather Evidence of Validity Based on a Test s Relationships With External Criteria? CHAPTER 8 How Do We Gather Evidence of Validity Based on a Test s Relationships With External Criteria? CHAPTER 8: HOW DO WE GATHER EVIDENCE OF VALIDITY BASED ON A TEST S RELATIONSHIPS WITH EXTERNAL CRITERIA?

More information

Doctoral Dissertation Boot Camp Quantitative Methods Kamiar Kouzekanani, PhD January 27, The Scientific Method of Problem Solving

Doctoral Dissertation Boot Camp Quantitative Methods Kamiar Kouzekanani, PhD January 27, The Scientific Method of Problem Solving Doctoral Dissertation Boot Camp Quantitative Methods Kamiar Kouzekanani, PhD January 27, 2018 The Scientific Method of Problem Solving The conceptual phase Reviewing the literature, stating the problem,

More information

Evaluation and Assessment: 2PSY Summer, 2017

Evaluation and Assessment: 2PSY Summer, 2017 Evaluation and Assessment: 2PSY - 542 Summer, 2017 Instructor: Daniel Gutierrez, Ph.D., LPC, LMHC, NCC Meeting times: Perquisite: July 24 28, 2017; 8:00am-5:00pm Admission to the MAC program Required Texts

More information

TLQ Reliability, Validity and Norms

TLQ Reliability, Validity and Norms MSP Research Note TLQ Reliability, Validity and Norms Introduction This research note describes the reliability and validity of the TLQ. Evidence for the reliability and validity of is presented against

More information

GENERALIZABILITY AND RELIABILITY: APPROACHES FOR THROUGH-COURSE ASSESSMENTS

GENERALIZABILITY AND RELIABILITY: APPROACHES FOR THROUGH-COURSE ASSESSMENTS GENERALIZABILITY AND RELIABILITY: APPROACHES FOR THROUGH-COURSE ASSESSMENTS Michael J. Kolen The University of Iowa March 2011 Commissioned by the Center for K 12 Assessment & Performance Management at

More information

Psychology Research Process

Psychology Research Process Psychology Research Process Logical Processes Induction Observation/Association/Using Correlation Trying to assess, through observation of a large group/sample, what is associated with what? Examples:

More information

No part of this page may be reproduced without written permission from the publisher. (

No part of this page may be reproduced without written permission from the publisher. ( CHAPTER 4 UTAGS Reliability Test scores are composed of two sources of variation: reliable variance and error variance. Reliable variance is the proportion of a test score that is true or consistent, while

More information

The Short NART: Cross-validation, relationship to IQ and some practical considerations

The Short NART: Cross-validation, relationship to IQ and some practical considerations British journal of Clinical Psychology (1991), 30, 223-229 Printed in Great Britain 2 2 3 1991 The British Psychological Society The Short NART: Cross-validation, relationship to IQ and some practical

More information

Simple Linear Regression the model, estimation and testing

Simple Linear Regression the model, estimation and testing Simple Linear Regression the model, estimation and testing Lecture No. 05 Example 1 A production manager has compared the dexterity test scores of five assembly-line employees with their hourly productivity.

More information

MMPI-2 short form proposal: CAUTION

MMPI-2 short form proposal: CAUTION Archives of Clinical Neuropsychology 18 (2003) 521 527 Abstract MMPI-2 short form proposal: CAUTION Carlton S. Gass, Camille Gonzalez Neuropsychology Division, Psychology Service (116-B), Veterans Affairs

More information

Ch. 11 Measurement. Measurement

Ch. 11 Measurement. Measurement TECH 646 Analysis of Research in Industry and Technology PART III The Sources and Collection of data: Measurement, Measurement Scales, Questionnaires & Instruments, Sampling Ch. 11 Measurement Lecture

More information

USE OF THE MMPI-2-RF IN POLICE & PUBLIC SAFETY ASSESSMENTS

USE OF THE MMPI-2-RF IN POLICE & PUBLIC SAFETY ASSESSMENTS USE OF THE MMPI-2-RF IN POLICE & PUBLIC SAFETY ASSESSMENTS Yossef S. Ben-Porath Kent State University ybenpora@kent.edu Disclosure Yossef Ben-Porath is a paid consultant to the MMPI publisher, the University

More information

Variables in Research. What We Will Cover in This Section. What Does Variable Mean? Any object or event that can take on more than one form or value.

Variables in Research. What We Will Cover in This Section. What Does Variable Mean? Any object or event that can take on more than one form or value. Variables in Research 6/27/2005 P360 Variables in Research 1 What We Will Cover in This Section Nature of variables. Measuring variables. Reliability. Validity. Measurement Modes. Issues. 6/27/2005 P360

More information

In search of the correct answer in an ability-based Emotional Intelligence (EI) test * Tamara Mohoric. Vladimir Taksic.

In search of the correct answer in an ability-based Emotional Intelligence (EI) test * Tamara Mohoric. Vladimir Taksic. Published in: Studia Psychologica - Volume 52 / No. 3 / 2010, pp. 219-228 In search of the correct answer in an ability-based Emotional Intelligence (EI) test * 1 Tamara Mohoric 1 Vladimir Taksic 2 Mirjana

More information

Interpreting change on the WAIS-III/WMS-III in clinical samples

Interpreting change on the WAIS-III/WMS-III in clinical samples Archives of Clinical Neuropsychology 16 (2001) 183±191 Interpreting change on the WAIS-III/WMS-III in clinical samples Grant L. Iverson* Department of Psychiatry, University of British Columbia, 2255 Wesbrook

More information

Reliability AND Validity. Fact checking your instrument

Reliability AND Validity. Fact checking your instrument Reliability AND Validity Fact checking your instrument General Principles Clearly Identify the Construct of Interest Use Multiple Items Use One or More Reverse Scored Items Use a Consistent Response Format

More information

Importance of Good Measurement

Importance of Good Measurement Importance of Good Measurement Technical Adequacy of Assessments: Validity and Reliability Dr. K. A. Korb University of Jos The conclusions in a study are only as good as the data that is collected. The

More information

Class 7 Everything is Related

Class 7 Everything is Related Class 7 Everything is Related Correlational Designs l 1 Topics Types of Correlational Designs Understanding Correlation Reporting Correlational Statistics Quantitative Designs l 2 Types of Correlational

More information

CHAPTER VI RESEARCH METHODOLOGY

CHAPTER VI RESEARCH METHODOLOGY CHAPTER VI RESEARCH METHODOLOGY 6.1 Research Design Research is an organized, systematic, data based, critical, objective, scientific inquiry or investigation into a specific problem, undertaken with the

More information

Georgina Salas. Topics EDCI Intro to Research Dr. A.J. Herrera

Georgina Salas. Topics EDCI Intro to Research Dr. A.J. Herrera Homework assignment topics 51-63 Georgina Salas Topics 51-63 EDCI Intro to Research 6300.62 Dr. A.J. Herrera Topic 51 1. Which average is usually reported when the standard deviation is reported? The mean

More information

Data and Statistics 101: Key Concepts in the Collection, Analysis, and Application of Child Welfare Data

Data and Statistics 101: Key Concepts in the Collection, Analysis, and Application of Child Welfare Data TECHNICAL REPORT Data and Statistics 101: Key Concepts in the Collection, Analysis, and Application of Child Welfare Data CONTENTS Executive Summary...1 Introduction...2 Overview of Data Analysis Concepts...2

More information

Assessing the Validity and Reliability of the Teacher Keys Effectiveness. System (TKES) and the Leader Keys Effectiveness System (LKES)

Assessing the Validity and Reliability of the Teacher Keys Effectiveness. System (TKES) and the Leader Keys Effectiveness System (LKES) Assessing the Validity and Reliability of the Teacher Keys Effectiveness System (TKES) and the Leader Keys Effectiveness System (LKES) of the Georgia Department of Education Submitted by The Georgia Center

More information

Statistics for Psychology

Statistics for Psychology Statistics for Psychology SIXTH EDITION CHAPTER 12 Prediction Prediction a major practical application of statistical methods: making predictions make informed (and precise) guesses about such things as

More information

Ch. 11 Measurement. Paul I-Hai Lin, Professor A Core Course for M.S. Technology Purdue University Fort Wayne Campus

Ch. 11 Measurement. Paul I-Hai Lin, Professor  A Core Course for M.S. Technology Purdue University Fort Wayne Campus TECH 646 Analysis of Research in Industry and Technology PART III The Sources and Collection of data: Measurement, Measurement Scales, Questionnaires & Instruments, Sampling Ch. 11 Measurement Lecture

More information

CHAPTER III RESEARCH METHODOLOGY

CHAPTER III RESEARCH METHODOLOGY CHAPTER III RESEARCH METHODOLOGY In this chapter, the researcher will elaborate the methodology of the measurements. This chapter emphasize about the research methodology, data source, population and sampling,

More information

Questionnaire design. Questionnaire Design: Content. Questionnaire Design. Questionnaire Design: Wording. Questionnaire Design: Wording OUTLINE

Questionnaire design. Questionnaire Design: Content. Questionnaire Design. Questionnaire Design: Wording. Questionnaire Design: Wording OUTLINE Questionnaire design OUTLINE Questionnaire design tests Reliability Validity POINTS TO CONSIDER Identify your research objectives. Identify your population or study sample Decide how to collect the information

More information

Analysis of the Reliability and Validity of an Edgenuity Algebra I Quiz

Analysis of the Reliability and Validity of an Edgenuity Algebra I Quiz Analysis of the Reliability and Validity of an Edgenuity Algebra I Quiz This study presents the steps Edgenuity uses to evaluate the reliability and validity of its quizzes, topic tests, and cumulative

More information

Basic concepts and principles of classical test theory

Basic concepts and principles of classical test theory Basic concepts and principles of classical test theory Jan-Eric Gustafsson What is measurement? Assignment of numbers to aspects of individuals according to some rule. The aspect which is measured must

More information

32.5. percent of U.S. manufacturers experiencing unfair currency manipulation in the trade practices of other countries.

32.5. percent of U.S. manufacturers experiencing unfair currency manipulation in the trade practices of other countries. TECH 646 Analysis of Research in Industry and Technology PART III The Sources and Collection of data: Measurement, Measurement Scales, Questionnaires & Instruments, Sampling Ch. 11 Measurement Lecture

More information

Regression Discontinuity Analysis

Regression Discontinuity Analysis Regression Discontinuity Analysis A researcher wants to determine whether tutoring underachieving middle school students improves their math grades. Another wonders whether providing financial aid to low-income

More information

Error in the estimation of intellectual ability in the low range using the WISC-IV and WAIS- III By. Simon Whitaker

Error in the estimation of intellectual ability in the low range using the WISC-IV and WAIS- III By. Simon Whitaker Error in the estimation of intellectual ability in the low range using the WISC-IV and WAIS- III By Simon Whitaker In press Personality and Individual Differences Abstract The error, both chance and systematic,

More information

Center for Advanced Studies in Measurement and Assessment. CASMA Research Report

Center for Advanced Studies in Measurement and Assessment. CASMA Research Report Center for Advanced Studies in Measurement and Assessment CASMA Research Report Number 39 Evaluation of Comparability of Scores and Passing Decisions for Different Item Pools of Computerized Adaptive Examinations

More information

Prepared by: Assoc. Prof. Dr Bahaman Abu Samah Department of Professional Development and Continuing Education Faculty of Educational Studies

Prepared by: Assoc. Prof. Dr Bahaman Abu Samah Department of Professional Development and Continuing Education Faculty of Educational Studies Prepared by: Assoc. Prof. Dr Bahaman Abu Samah Department of Professional Development and Continuing Education Faculty of Educational Studies Universiti Putra Malaysia Serdang At the end of this session,

More information

CHAPTER 3 DATA ANALYSIS: DESCRIBING DATA

CHAPTER 3 DATA ANALYSIS: DESCRIBING DATA Data Analysis: Describing Data CHAPTER 3 DATA ANALYSIS: DESCRIBING DATA In the analysis process, the researcher tries to evaluate the data collected both from written documents and from other sources such

More information

CHAPTER 2. RESEARCH METHODS AND PERSONALITY ASSESSMENT (64 items)

CHAPTER 2. RESEARCH METHODS AND PERSONALITY ASSESSMENT (64 items) CHAPTER 2. RESEARCH METHODS AND PERSONALITY ASSESSMENT (64 items) 1. Darwin s point of view about empirical research can be accurately summarized as... a. Any observation is better than no observation

More information

ISC- GRADE XI HUMANITIES ( ) PSYCHOLOGY. Chapter 2- Methods of Psychology

ISC- GRADE XI HUMANITIES ( ) PSYCHOLOGY. Chapter 2- Methods of Psychology ISC- GRADE XI HUMANITIES (2018-19) PSYCHOLOGY Chapter 2- Methods of Psychology OUTLINE OF THE CHAPTER (i) Scientific Methods in Psychology -observation, case study, surveys, psychological tests, experimentation

More information

Chapter 6 Topic 6B Test Bias and Other Controversies. The Question of Test Bias

Chapter 6 Topic 6B Test Bias and Other Controversies. The Question of Test Bias Chapter 6 Topic 6B Test Bias and Other Controversies The Question of Test Bias Test bias is an objective, empirical question, not a matter of personal judgment. Test bias is a technical concept of amenable

More information

Variables in Research. What We Will Cover in This Section. What Does Variable Mean? Any object or event that can take on more than one form or value.

Variables in Research. What We Will Cover in This Section. What Does Variable Mean? Any object or event that can take on more than one form or value. Variables in Research 1/1/2003 P365 Variables in Research 1 What We Will Cover in This Section Nature of variables. Measuring variables. Reliability. Validity. Measurement Modes. Issues. 1/1/2003 P365

More information

Results & Statistics: Description and Correlation. I. Scales of Measurement A Review

Results & Statistics: Description and Correlation. I. Scales of Measurement A Review Results & Statistics: Description and Correlation The description and presentation of results involves a number of topics. These include scales of measurement, descriptive statistics used to summarize

More information

11/24/2017. Do not imply a cause-and-effect relationship

11/24/2017. Do not imply a cause-and-effect relationship Correlational research is used to describe the relationship between two or more naturally occurring variables. Is age related to political conservativism? Are highly extraverted people less afraid of rejection

More information

Treatment Effects: Experimental Artifacts and Their Impact

Treatment Effects: Experimental Artifacts and Their Impact 6 Treatment Effects: Experimental Artifacts and Their Impact This chapter presents a substantive discussion of the evaluation of experiments and interventions. The next chapter (Chapter 7) will present

More information

3 CONCEPTUAL FOUNDATIONS OF STATISTICS

3 CONCEPTUAL FOUNDATIONS OF STATISTICS 3 CONCEPTUAL FOUNDATIONS OF STATISTICS In this chapter, we examine the conceptual foundations of statistics. The goal is to give you an appreciation and conceptual understanding of some basic statistical

More information

Conditional Distributions and the Bivariate Normal Distribution. James H. Steiger

Conditional Distributions and the Bivariate Normal Distribution. James H. Steiger Conditional Distributions and the Bivariate Normal Distribution James H. Steiger Overview In this module, we have several goals: Introduce several technical terms Bivariate frequency distribution Marginal

More information

Theory Testing and Measurement Error

Theory Testing and Measurement Error EDITORIAL Theory Testing and Measurement Error Frank L. Schmidt University of Iowa, Iowa City, IA, USA John E. Hunter Michigan State University, East Lansing, MI, USA Accurate empirical tests of theories

More information