Reliability Analysis: Its Application in Clinical Practice


Nahathai Wongpakaran, Department of Psychiatry, Faculty of Medicine, Chiang Mai University, Thailand
Tinakon Wongpakaran, Department of Psychiatry, Faculty of Medicine, Chiang Mai University, Thailand

1 Introduction

1.1 What Do We Need From Measurement?

In order to evaluate the empirical applicability of theoretical propositions, we need indicators that are neither weak nor unreliable. But how can we evaluate the degree to which the ten items of Rosenberg's scale, a scale used to measure self-esteem, correctly represent that concept? In essence, two basic measurement properties need to be present. First, the indicator needs to be reliable, that is, the measurement should yield the same results over repeated trials. In practice it may not be possible to obtain exactly the same results over repeated measurements, but they should tend to be consistent. Second, an indicator is only useful if it provides an accurate representation of the concept being measured; this property is called validity. For example, a neuroticism scale is considered a valid measure if it is bound to the neuroticism trait shown by the respondent, in accordance with Eysenck's theory of personality, rather than to other phenomena such as anxiety disorders. Put another way, reliability is concerned with the degree to which a measurement is consistent across repeated uses, while validity reflects the relationship between an indicator and a concept. Like reliability, validity is a matter of degree.

2 Types of Reliability Measure

Reliability refers to the accuracy and precision of a scale's score. In this chapter, reliability is treated within classical test theory (CTT), which provides a basis for evaluating the effectiveness of a measurement or scale. According to CTT, the variability of most psychological measurements is made up of the actual variation across individuals in the phenomenon that the scale is measuring (the true score) plus error. An observed score is therefore determined by the true score (T) and the measurement error (E), or $O_X = T_X + E_X$. True scores are the signal we wish to detect, whereas measurement errors are the noise that clouds that signal. Errors that affect measurement include random and non-random errors; noise refers to random error, anything that confounds the measurement. Reliability can accordingly be viewed as the degree to which observed-score variance is unaffected by error-score variance; for example, a reliability of 0.65 means that 65% of the variability in the observed scale score is attributable to true-score variability. Since it is not possible to observe a true score directly, a number of methods have been devised for assessing reliability: (i) the test-retest method, in which the same group of respondents completes the same measure at different times; (ii) the measure of equivalence, which uses equivalent forms over the same period of time (to test inter-method reliability); (iii) the measure of stability and equivalence; (iv) the measure of internal consistency, meaning the consistency between items of the same test, or of the same sub-scale of a test, that tap the same underlying latent construct. Under this last method, internal consistency is assessed using the split-half method, the Kuder-Richardson method (Feldt, 1969), Cronbach's alpha (Cronbach, 1951) or Hoyt's analysis of variance method (Hoyt, 1941). Finally, (v) the inter-rater reliability method refers to measurements taken by different persons using the same method or instrument.
In this chapter, we will focus on the most commonly used reliability measures, these being the internal consistency and test-retest methods.
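Because the true score cannot be observed, the decomposition $O_X = T_X + E_X$ is easiest to see in simulation. The sketch below (Python with numpy; the variances chosen are arbitrary illustrations, not values from any study cited here) generates true scores and uncorrelated random error, and checks that the ratio of true-score variance to observed-score variance, which is what a reliability coefficient estimates, matches the squared correlation between true and observed scores.

```python
import numpy as np

rng = np.random.default_rng(42)
n = 10_000

# Classical test theory: observed score = true score + random error
true = rng.normal(loc=50, scale=10, size=n)     # T, variance = 100 (assumed)
error = rng.normal(loc=0, scale=5, size=n)      # E, variance = 25, uncorrelated with T
observed = true + error                         # O = T + E

# Reliability = proportion of observed-score variance due to true scores
reliability = true.var() / observed.var()       # expected: 100 / 125 = 0.80
print(f"var(T)/var(O)      = {reliability:.3f}")

# Equivalently, the squared correlation between true and observed scores
r = np.corrcoef(true, observed)[0, 1]
print(f"corr(T, O) squared = {r ** 2:.3f}")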

For all of these measures, reliability ranges between 0 and 1: the higher the value, the better the psychometric properties. A value of 0.7 is considered acceptable, 0.8 or higher is considered good, and 0.9 or higher excellent; a value below 0.7 signals psychometric problems. Reliability affects statistical significance (the t-test or F-test), effect size, and the factor structure of a scale. Since high reliability means that an individual score is a good estimate of the true score, low reliability can prevent real effects from emerging. Furr (2011) suggested that once the estimated reliability is acceptable, researchers should go on to examine the appropriate psychological interpretation of the scale's score (i.e. its validity).

2.1 Internal Consistency

What Determines Internal Consistency?

Internal consistency is used to estimate the reliability of a test from a single administration of one form, and is commonly applied to multiple-item scales whose composite score is computed by summing or averaging the items. Internal consistency depends on the individual's performance from item to item, and is based on the standard deviation of the test and the standard deviations of the individual items. Internal consistency approaches include the split-half method, the Kuder-Richardson method, Cronbach's alpha and Hoyt's analysis of variance method. One of the statistics most commonly used for exploring internal consistency is Cronbach's alpha, an item-level approach that uses the inter-item correlations to determine the level of reliability. In short, the purpose of Cronbach's alpha is to show how closely the items are linked; put another way, it provides an index of reliability. A high Cronbach's alpha does not by itself mean that the scale is reliable in every sense, although this can be further tested through exact repetition of the measurement. There are two types of Cronbach's alpha: raw and standardized. The raw version is defined as

$$\alpha = \frac{K}{K-1}\left(1 - \frac{\sum_{i=1}^{K}\sigma^2_{Y_i}}{\sigma^2_X}\right) \quad \text{or} \quad \alpha = \frac{K\bar{c}}{\bar{v} + (K-1)\bar{c}},$$

where $K$ is the number of items, $\sigma^2_X$ is the variance of the observed total test scores and $\sigma^2_{Y_i}$ is the variance of component $i$ for the current sample of people (DeVellis, 1991); $\bar{v}$ is the average variance of the components and $\bar{c}$ is the average of all the covariances between components taken across the current sample. Under the assumption that the item variances are all equal, this ratio reduces to one based on the average inter-item correlation $\bar{r}$, and the result is known as the standardized item alpha:

$$\alpha_{\text{standardized}} = \frac{K\bar{r}}{1 + (K-1)\bar{r}}.$$

In general, the raw and standardized alphas are similar or very close in value, except when different item response formats are combined, for example a 5-point with a 7-point Likert scale, or an ordinal with a dichotomous scale. Inter-item covariances are normally positive when a scale is internally consistent, and an appropriate inter-item correlation lies in the range of 0.3 to 0.5. When a value is negative or close to zero, the items do not measure the same variable, or there is measurement error (for example, particular items are unclear, so respondents cannot express their true scores). For example, the Rosenberg Self-Esteem Scale item "I wish I could have more respect for myself" appears to be problematic in many studies (see below for details).
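As a rough illustration of the two formulas above, the sketch below computes both the raw and the standardized alpha with numpy; the simulated respondents-by-items matrix and the helper name cronbach_alpha are purely illustrative and are not taken from any of the data sets discussed in this chapter.

```python
import numpy as np

def cronbach_alpha(items: np.ndarray) -> tuple[float, float]:
    """Return (raw alpha, standardized alpha) for a respondents x items matrix."""
    k = items.shape[1]

    # Raw alpha: K/(K-1) * (1 - sum of item variances / variance of the total score)
    item_vars = items.var(axis=0, ddof=1)
    total_var = items.sum(axis=1).var(ddof=1)
    raw = (k / (k - 1)) * (1 - item_vars.sum() / total_var)

    # Standardized alpha: based on the average inter-item correlation r_bar
    corr = np.corrcoef(items, rowvar=False)
    r_bar = corr[np.triu_indices(k, k=1)].mean()
    standardized = (k * r_bar) / (1 + (k - 1) * r_bar)

    return raw, standardized

# Illustrative data: 200 respondents, 6 items sharing a single latent factor
rng = np.random.default_rng(0)
items = rng.normal(size=(200, 1)) + rng.normal(scale=1.0, size=(200, 6))
raw, std = cronbach_alpha(items)
print(f"raw alpha = {raw:.3f}, standardized alpha = {std:.3f}")
```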

As for the item-total statistic, this follows the inter-item correlation and should be higher than 0.2. The example in Figure 1 is drawn from a study by Wongpakaran and Wongpakaran (2010d) on the therapeutic alliance (with the data modified for simplicity), and shows the results of a Cronbach's alpha analysis when two therapeutic alliance scales with different response formats are combined. The standardized alpha is higher than the raw alpha (0.711 as compared with 0.655).

Figure 1: Reliability analysis results using SPSS, showing the reliability statistics (raw and standardized Cronbach's alpha), the item statistics (mean, standard deviation and N per item) and the item-total statistics (scale mean and variance if item deleted, corrected item-total correlation, squared multiple correlation, and Cronbach's alpha if item deleted) for the six items.
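The corrected item-total correlations and the "Cronbach's alpha if item deleted" column that SPSS reports in Figure 1 can be reproduced with a few lines of code. The sketch below is again only an illustration on an arbitrary respondents-by-items matrix; items whose corrected item-total correlation falls below the 0.2 guideline mentioned above would be candidates for review.

```python
import numpy as np

def item_total_statistics(items: np.ndarray) -> None:
    """Print the corrected item-total correlation and alpha-if-item-deleted per item."""
    for i in range(items.shape[1]):
        rest = np.delete(items, i, axis=1)          # all items except item i
        rest_total = rest.sum(axis=1)

        # Corrected item-total correlation: item i vs. the total of the other items
        r_it = np.corrcoef(items[:, i], rest_total)[0, 1]

        # Raw Cronbach's alpha recomputed with item i removed
        k = rest.shape[1]
        alpha_del = (k / (k - 1)) * (1 - rest.var(axis=0, ddof=1).sum()
                                     / rest_total.var(ddof=1))
        print(f"item {i + 1}: corrected item-total r = {r_it:.3f}, "
              f"alpha if item deleted = {alpha_del:.3f}")

# Illustrative data: 200 respondents, 6 items sharing one latent factor
rng = np.random.default_rng(7)
items = rng.normal(size=(200, 1)) + rng.normal(scale=1.0, size=(200, 6))
item_total_statistics(items)
```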

Factors Affecting Reliability Coefficients

1. Group homogeneity. We can see from the formula that the greater the variance, the higher the reliability coefficient. Why is a large variance good? A large variance means there is a wide spread of scores, which makes it easier to differentiate between respondents. If a test shows little variance, the scores in the group are close together (a high level of homogeneity). It is important to note that the reliability obtained in this way is not the reliability of the test itself, but the reliability of the scores in that sample.
2. Test length. The longer the test, the more accurate the reliability coefficient; however, a long test may compromise the motivation or concentration of respondents. As a result, when shortening a measurement, we need to be cautious about its reliability.
3. Inter-item correlation. The more the items correlate with one another, the higher the reliability, because of the greater homogeneity (Guilford, 1954).
4. Time limitation: too little time given for completing the test.
5. Test factors: these include poor instructions and ambiguous questionnaires.
6. Examinee factors: these include poor concentration, poor motivation and fatigue.
7. Administrator factors: the influence of staff on respondent biases.

Cronbach's alpha is obtained from a single administration and may be affected by some of the factors above, but not necessarily in the same way on different occasions; for example, reliability may be poor the first time because the participants have poor concentration, but better the second time when their concentration improves. In addition, it is important to note that the alpha coefficient will vary with the group of respondents involved, for example students or clinical patients. In patient groups, alpha tends to be lower than in student groups, because patients are more likely to be affected by compromising factors such as poorer concentration and lower cognition (T. Wongpakaran et al., 2012b). Finally, the alpha coefficient does not indicate that a scale or measurement is uni-dimensional; if multiple dimensions do exist, alpha should be calculated for each dimension or sub-scale, in addition to the overall scale.

2.2 Test-Retest

The assumptions underlying the test-retest method are: 1) that the true scores are the same at the first and second administration, and 2) that the error variances of the two administrations are equal. However, a number of factors can prevent these assumptions from holding, especially the first. For example, some attributes probed by particular items may change over time, such as feelings of boredom, whereas interpersonal style or personality traits take longer to change. As a result, the interval between the two tests needs to be considered carefully; a two- to eight-week interval is common in psychological and psychiatric practice. Another factor that may affect test-retest reliability is the measurement itself: we have found that the longer the test, the lower the test-retest reliability (Wongpakaran et al., 2012b). Pearson's product-moment correlation coefficient is commonly used for test-retest reliability. However, Yen and Lo (2002) demonstrated that the intra-class correlation (ICC) is more sensitive to the detection of systematic error and that, because Pearson's product-moment correlation is theoretically the correlation between two different variables, it should not be used in reliability analyses, as it cannot detect the existence of systematic errors. For example, in one comparison the Pearson's product-moment correlation coefficient between test 1 and test 2 was 0.816 both before and after a systematic error of 18 points was introduced, whereas the ICC decreased once the systematic error was present. Therefore, the ICC is the more appropriate coefficient for evaluating test-retest reliability.
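As a small demonstration of the point made by Yen and Lo (2002), the sketch below computes a two-way, absolute-agreement, single-measurement ICC (often written ICC(A,1) in McGraw and Wong's notation) alongside Pearson's r for simulated test and retest scores; the sample size, score distribution and the 18-point shift are illustrative assumptions, not the data behind the figures quoted above.

```python
import numpy as np

def icc_a1(scores: np.ndarray) -> float:
    """Two-way, absolute-agreement, single-measurement ICC (McGraw & Wong's ICC(A,1)).
    `scores` has shape (subjects, occasions)."""
    n, k = scores.shape
    grand = scores.mean()
    row_means = scores.mean(axis=1)                           # per subject
    col_means = scores.mean(axis=0)                           # per occasion

    msr = k * ((row_means - grand) ** 2).sum() / (n - 1)      # between-subjects mean square
    msc = n * ((col_means - grand) ** 2).sum() / (k - 1)      # between-occasions mean square
    sse = ((scores - row_means[:, None] - col_means[None, :] + grand) ** 2).sum()
    mse = sse / ((n - 1) * (k - 1))                           # residual mean square

    return (msr - mse) / (msr + (k - 1) * mse + (k / n) * (msc - mse))

rng = np.random.default_rng(1)
test1 = rng.normal(50, 10, size=60)
test2 = test1 + rng.normal(0, 5, size=60)            # retest: random error only
shifted = test2 + 18                                 # retest with an added systematic error

for label, retest in (("random error only  ", test2), ("plus 18-point shift", shifted)):
    r = np.corrcoef(test1, retest)[0, 1]
    icc = icc_a1(np.column_stack([test1, retest]))
    print(f"{label}  Pearson r = {r:.3f}  ICC(A,1) = {icc:.3f}")
```

Pearson's r is identical in both rows because a correlation is unaffected by a constant shift, whereas the absolute-agreement ICC drops once the systematic error is introduced.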

2.3 Inter-Rater Reliability

Inter-rater reliability measures the extent to which a measurement is consistent (agreed upon) between raters. It is commonly used by clinicians who want to apply an instrument to diagnostic criteria, where the responses are usually dichotomous (a yes/no format); for example, when general practitioners use the Structured Clinical Interview for DSM-IV Axis II personality disorders (SCID II) (First et al., 1997) or the CAM algorithm to detect delirium in inpatient units (Wongpakaran et al., 2011). A number of factors are associated with the degree of inter-rater reliability:

1. The number of raters: the more raters there are, the lower the reliability.
2. The rating-scale format: an ordinal or Likert-type rating is likely to give higher reliability than dichotomous responses.
3. The rating method used (nested, joint, independent or mixed, as with the SCID II). Joint rating (i.e. both raters see the interviewee at the same time, or the second rater uses a video of the first rater's interview) yields the highest reliability, while a nested design, in which each rater sees the interviewee independently, yields the lowest. If an interval between the interviews is involved, the degree of agreement may be even lower, because the interval adds variance due to interviewee factors (as found in the test-retest design) (T. Wongpakaran et al., 2012a).

In sum, the method of rating determines which type of variation plays a role: rater variation (e.g. in data interpretation) or patient variation (e.g. consistency in giving information on different occasions and/or to different raters).

How to Calculate Inter-Rater Reliability

Statistical methods used to assess inter-rater reliability include Cohen's kappa (Cohen, 1988), Fleiss' kappa (Fleiss, 1981) or a weighted kappa, inter-rater correlation, the concordance correlation coefficient and the intra-class correlation. Which one is chosen depends on the type of data being analyzed: kappa statistics are used for nominal data, while correlation coefficients and the intra-class correlation are used for ordinal (Spearman's rho) or continuous (Pearson's r) scales. However, the intra-class correlation coefficient (ICC) tends to be preferred over Spearman's rho and Pearson's r because it takes into account the differences between individual ratings as well as the association between raters. It also provides flexibility in terms of the number of raters, the number of participants, the response format and the handling of missing data. Table 1 shows an example of inter-rater reliability between raters for the SCID II. When a categorical diagnosis (yes/no) is used, the kappa statistics indicate that obsessive-compulsive personality disorder (PD) yields the highest agreement between raters, while depressive PD yields the lowest. When the summed score (continuous data) is used, the ICC for passive-aggressive PD is much improved compared with Cohen's kappa.
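Before turning to Table 1, a minimal sketch of Cohen's kappa for a dichotomous (yes/no) diagnosis made by two raters is shown below; the ratings are invented for illustration and are not the SCID II data summarized in the table.

```python
import numpy as np

def cohens_kappa(rater1: np.ndarray, rater2: np.ndarray) -> float:
    """Cohen's kappa: chance-corrected agreement between two raters."""
    categories = np.union1d(rater1, rater2)

    # Observed agreement
    p_o = np.mean(rater1 == rater2)

    # Expected (chance) agreement from each rater's marginal proportions
    p_e = sum(np.mean(rater1 == c) * np.mean(rater2 == c) for c in categories)

    return (p_o - p_e) / (1 - p_e)

# Hypothetical yes/no diagnoses for 14 patients from two raters (1 = disorder present)
r1 = np.array([1, 1, 0, 0, 1, 0, 0, 0, 1, 0, 0, 1, 0, 0])
r2 = np.array([1, 0, 0, 0, 1, 0, 1, 0, 1, 0, 0, 1, 0, 0])
print(f"observed agreement = {np.mean(r1 == r2):.2f}, kappa = {cohens_kappa(r1, r2):.2f}")
```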

Personality disorder (PD)   No. of diagnoses (%), 1st rater   No. of diagnoses (%), 2nd rater
Avoidant                    11 (24.49)                        11 (20.4)
Obsessive-compulsive        14 (25.9)                         14 (25.9)
Passive-aggressive           8 (14.8)                          7 (13.0)
Depressive                   5 (9.3)                           6 (11.1)
Paranoid                     9 (16.7)                          6 (11.1)
Schizoid                     7 (13.0)                          8 (14.8)
Borderline                   7 (13.0)                          7 (13.0)
Anti-social                  5 (9.3)                           5 (9.3)

Table 1: Kappa values and ICC for the SCID II personality disorder diagnoses made by two raters

3 Relationship Between Reliability and Validity

Validity concerns the interpretation of what a scale or measurement represents. When a measurement is internally consistent, it gauges something, but it cannot tell us what that something is. To examine validity therefore means to test hypotheses about the constructs related to the measurement. There are several ways to test constructs, for example by examining inter-correlations with other designated constructs. Dimensionality and reliability are important elements of the psychometric properties of a measurement; validity, however, is even more important, because it tells you how useful a measurement or scale is. Some important points about validity should be noted:

1. It is not a yes/no quality; it is a matter of degree.
2. It varies from study setting to study setting and from sample to sample; for example, the Hamilton Depression Rating Scale may behave differently when respondents are depressed patients than when they come from a non-depressed part of the population.
3. It can be assessed in a variety of ways.
4. It requires both scientific evidence and a theoretically based structure to compare against.
5. It concerns the interpretation of a scale's score, not the scale itself (Furr, 2011).

The techniques used for studying validity include:

1. Criterion-related validity, which compares the score on a measurement or scale with scores on other variables (the criteria). Nunnally defined criterion-related validity as using an instrument "to estimate some important form of behavior that is external to the measuring instrument itself, the latter being referred to as the criterion" (Nunnally & Bernstein, 1994). For example, Chan-Ob and Boonyanaruthee (1999) examined how well high school students' performance in mathematics predicted whether students with high scores would obtain high grade-point averages (GPA) in their medical school studies.

There are two types of criterion-related validity: predictive validity, as mentioned above, and concurrent validity, which assesses the degree of correlation between the measurement and the criterion (another measurement) obtained at the same time. For example, Wongpakaran et al. (2011b) studied the concurrent validity of the Experiences of Close Relationships questionnaire (revised), or ECR-R, by computing the Pearson's product-moment correlation between the ECR-R anxiety sub-scale and Spielberger's State-Trait Anxiety Inventory, finding the two to be positively correlated.

2. Content validity, which is the extent to which a measurement represents all aspects of the given construct. For example, when assessing content validity for depression, the test items should cover the required domains of the depressive syndrome, such as dysphoric mood, poor concentration, lack of energy, anxiety and sleep problems. In developing a scale or measurement, experts can carefully select items that are relevant to a test specification drawn up for the particular subject domain.

3. Construct validity, which refers to validity evidence that requires empirical and theoretical support in order to allow interpretation of the construct. This evidence includes the internal structure, the degree of association with other variables, the test's content, response processes and the consequences of its use. In this chapter, we focus on internal structure, using exploratory and confirmatory factor analysis.

3.1 How Reliability Affects Validity

Validity is often assessed alongside reliability, the extent to which a measurement gives consistent results. A measure can have high reliability without having validity (in which case the measure is not useful at all). On the other hand, any attempt to establish validity is futile if a test is not reliable. Within classical test theory, the predictive or concurrent validity coefficient (the correlation between the predictor and the predicted) cannot exceed the square root of the product of the two reliabilities, $r_{12\max} = \sqrt{r_{11} r_{22}}$, where $r_{11}$ and $r_{22}$ are the reliabilities of the two variables. Internal consistency error and time-sampling error are two sources of unreliability that reduce the validity of tests.

3.2 Confirmatory Factor Analysis (CFA)

CFA is used to evaluate the dimensionality of a scale, or to verify the factor structure of a set of observed variables. It allows researchers to test the hypothesis that a relationship exists between the observed variables and their underlying latent constructs. It is the next step after exploratory factor analysis (EFA), because it is used to confirm the factor structure as theoretically proposed. More importantly, CFA can also be used to determine, compare and revise models (model modification and re-analysis), which EFA cannot do. Statistics used in CFA include the $\chi^2$ test, which quantifies the difference between the expected and observed covariance matrices; a good fit is indicated by a probability greater than 0.05 and a $\chi^2$ value close to zero. Other indices of how well a model fits the data include the Normed Fit Index (NFI), the Comparative Fit Index (CFI), the Tucker-Lewis Index (TLI) and the Goodness-of-Fit Index (GFI). These indices range from 0 to 1, with larger values indicating better fit. There are also badness-of-fit indices: the standardized root-mean-square residual (SRMR), which should be no more than 0.08, and the root-mean-square error of approximation (RMSEA), for which an acceptable value is no more than 0.06 (Hu & Bentler, 1995, 1998, 1999).
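These indices can be computed directly from the chi-square of the fitted model and of the baseline (independence) model that any SEM program reports. The sketch below uses the standard textbook formulas; the chi-square values, degrees of freedom and sample size fed in at the bottom are placeholders, not results from the studies discussed in this chapter.

```python
import math

def fit_indices(chi2_m: float, df_m: int, chi2_b: float, df_b: int, n: int) -> dict:
    """Common SEM fit indices from the model (m) and baseline (b) chi-square values."""
    d_m = max(chi2_m - df_m, 0.0)          # model non-centrality
    d_b = max(chi2_b - df_b, 0.0)          # baseline non-centrality

    return {
        "NFI":   (chi2_b - chi2_m) / chi2_b,
        "TLI":   (chi2_b / df_b - chi2_m / df_m) / (chi2_b / df_b - 1),
        "CFI":   1 - d_m / max(d_b, d_m),
        "RMSEA": math.sqrt(d_m / (df_m * (n - 1))),
    }

# Placeholder values for illustration only
print(fit_indices(chi2_m=55.0, df_m=31, chi2_b=900.0, df_b=45, n=400))
```

The SRMR and GFI are not simple functions of the chi-square values, so they are left to the SEM software itself.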

Since validity requires test reliability, there also needs to be an acceptable level of internal consistency (at least 0.7) for the scale or sub-scale. For example, one study examined perceived stress in a non-clinical sample using the Perceived Stress Scale-10 (PSS-10) (N. Wongpakaran & Wongpakaran, 2010), a scale that measures the level of stress perceived by respondents at a particular point in time. The scale has ten items and two sub-scales (latent factors): stress (Factor I) and control (Factor II). The items loaded onto their designated factors, items 1, 2, 3, 6, 9 and 10 on Factor I and items 4, 5, 7 and 8 on Factor II, as hypothesized (Table 2). The communalities (the reliability of each indicator, h2) also reflected the inter-item correlations, and all values were acceptable (above 0.3). The Cronbach's alpha for each sub-scale was very good (0.83 and 0.90), meaning that the inter-item correlations within each designated factor were also good.

The PSS-10 items are:

1. In the last month, how often have you been upset because of something that has happened unexpectedly?
2. In the last month, how often have you felt unable to control the important things in your life?
3. In the last month, how often have you felt nervous and "stressed"?
4. In the last month, how often have you found yourself unable to cope with all the things that you have had to do?
5. In the last month, how often have you been angered because of things outside of your control?
6. In the last month, how often have you felt difficulties piling up so high that you could not overcome them?
7. In the last month, how often have you felt confident about your ability to handle your personal problems?
8. In the last month, how often have you felt that things were going your way?
9. In the last month, how often have you been able to control the irritations in your life?
10. In the last month, how often have you felt on top of things?

Table 2: Factor structure of the PSS-10 (factor loadings on Factors I and II and communalities, h2, for each item, together with the eigenvalues, percentage of variance, means, SDs and Cronbach's alphas per factor)

CFA was also carried out to examine the extent to which the hypothesized factor structure held. The original PSS-10 has been shown to have two latent constructs, i.e. stress and control. In this sample, EFA yielded two factors, and the fit statistics from the CFA indicated a good fit, that is:

$\chi^2$ (df = 31), p < 0.002; Goodness-of-Fit Index (GFI) = 0.97; Non-Normed Fit Index (NNFI or TLI) = 0.95; Normed Fit Index (NFI) = 0.93; Comparative Fit Index (CFI) = 0.97; Standardized Root Mean Square Residual (SRMR) = 0.041; and a Root Mean Square Error of Approximation (RMSEA) within the acceptable limit (Figure 2).

Figure 2: Measurement model with parameter estimates for the PSS-10.

3.3 CFA and Reliability

As we know, Cronbach's alpha may not accurately reflect the reliability of a measurement, because reliability is affected by the nature of the psychometric properties, in particular by the existence of correlated error terms (Furr, 2011; Miller, 1995). Rather than using the traditional alpha, in this example CFA is used to assess reliability. The example below shows the modified model and parameter estimates for the uni-dimensional scale of the modified Rosenberg Self-Esteem Scale (m-RSES) (T. Wongpakaran & Wongpakaran, 2012a), in which items 5 and 7 were revised (see Table 3 for the revised items). Figure 3 shows the final model after modification, which represents a one-factor solution with correlated error terms (for the positively worded and negatively worded items). The model yields good fit statistics: $\chi^2$ = 47.50, df = 27, p = 0.009; Goodness-of-Fit Index (GFI) = 0.96; Non-Normed Fit Index (NNFI or TLI) = 0.96; Normed Fit Index (NFI) = 0.95; Comparative Fit Index (CFI) = 0.98; Standardized Root Mean Square Residual (SRMR) = 0.038; and Root Mean Square Error of Approximation (RMSEA) = 0.055; together with a Cronbach's alpha for the scale of around 0.84 (see below).

If the reliability of the scale is instead calculated through structural equation modeling, using the CFA solution and the following formula, the estimated reliability (Brown, 2006) is:

$$\omega = \frac{\left(\sum \lambda_i\right)^2}{\left(\sum \lambda_i\right)^2 + \sum \theta_{ii} + 2\sum \theta_{ij}},$$

where $\lambda_i$ is the factor loading of item $i$, $\theta_{ii}$ is the error variance of item $i$, and $\theta_{ij}$ is the covariance between the error terms of items $i$ and $j$. In this example, $\left(\sum \lambda_i\right)^2 = 29.48$, and the resulting CFA-based estimate differs from the Cronbach's alpha given above. As stated by Miller (1995), alpha has a tendency to estimate reliability wrongly.

Figure 3: Measurement model with parameter estimates of the modified Rosenberg Self-Esteem Scale (m-RSES).
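The composite reliability above can be computed in a few lines once the standardized factor loadings, error variances and correlated error terms have been extracted from the CFA output. The loadings and error covariances below are placeholders chosen for illustration; they are not the m-RSES parameter estimates.

```python
import numpy as np

def composite_reliability(loadings, error_vars, error_covs=()):
    """Brown's (2006) SEM-based reliability:
    omega = (sum lambda)^2 / [(sum lambda)^2 + sum theta_ii + 2 * sum theta_ij]."""
    loadings = np.asarray(loadings, dtype=float)
    num = loadings.sum() ** 2
    denom = num + np.sum(error_vars) + 2 * np.sum(error_covs)
    return num / denom

# Placeholder standardized loadings for a 10-item, one-factor model
lam = np.array([0.62, 0.55, 0.70, 0.58, 0.48, 0.66, 0.52, 0.45, 0.68, 0.60])
theta_ii = 1 - lam ** 2                  # error variances implied by standardized loadings
theta_ij = [0.12, 0.09]                  # any correlated error terms retained in the model
print(f"composite reliability = {composite_reliability(lam, theta_ii, theta_ij):.3f}")
```

When there are no correlated errors, the same formula reduces to the familiar omega coefficient for a congeneric one-factor model.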

4 How to Solve the Problem of Low Reliability in a Measurement?

As mentioned earlier, a number of factors influence reliability and validity. In clinical practice, we commonly find that clinical samples respond differently to questionnaires or psychological measurements than non-clinical samples do. This does not simply mean that clinical participants score higher on problems than non-clinical participants; rather, the ways in which they respond lead to different factor structures. For example, in a review of the factor structure of the Multidimensional Scale of Perceived Social Support (MSPSS), which is designed to tap three sub-scales (friends, family and significant others), we found that clinically distressed patients tended to merge family members with significant others, whereas non-clinical respondents, especially younger ones, tended to merge significant others with friends (Wongpakaran et al., 2011a). Invariance tests within the CFA showed a significant difference in factor structure, which is reflected in poor reliability for some of the MSPSS sub-scales, a fact which may compromise the effect size and affect any analyses that use this measurement. As a result, we modified the instructions of the questionnaire in order to make respondents clearly aware of the existence of significant others (Wongpakaran & Wongpakaran, 2010c).

Other ways to address low internal consistency include increasing the number of items, increasing the sample size and removing items that cause low inter-item correlation; such trimming, however, should first pass theoretical evaluation. In practice, rather than increasing the number of items, researchers tend to reduce them to the smallest number possible, as long as the test retains acceptable psychometric properties. A long test, by contrast, may compromise adherence to the test, leading to poor reliability. This is clearly seen in clinical settings, especially among patients with mental health or psychiatric problems, whose minds tend to wander easily.

As for items causing low inter-item correlation, we have found in our previous studies that negatively worded items can cause problems for the internal consistency and factor structure of a scale. As mentioned before, in a clinical setting patients tend to have poorer cognitive ability, regardless of their motivation; they usually perform particularly poorly on negated or double-negated sentences, or on sentences containing the word "not" (Wongpakaran & Wongpakaran, 2012a; Wongpakaran & Wongpakaran, 2012b). This occurs not only in clinical samples but also in general samples. Consider an example from a well-known scale, the Rosenberg Self-Esteem Scale. The item that appears most problematic, and that tends to create poor inter-item correlation and low reliability, is item 5, "I wish I could have more respect for myself." This is clearly a negatively worded statement and, in a number of studies, it has yielded the poorest correlation with the other items (that is, it is the least internally consistent item) (Beeber et al., 2007; Carmines, 1978; Wongpakaran & Wongpakaran, 2011). This item also leads to an indeterminable factor structure.
Table 3 shows a communality value of 0.077 for this item, which is unacceptable, so we rephrased the statement in a positive direction, as "I think I am able to give myself more respect", and re-analyzed it with another sample. This produced a better result in terms of the item's factor loading (h2 increased to 0.468) and an acceptable model fit in the CFA. However, it then exposed a problem with item 7, "I feel that I'm a person of worth, at least on an equal plane with others" (h2 increased, but only to 0.149). Therefore, in the latest revision, we altered this sentence into a more positive form, "I feel that I'm a person of worth, and I feel this emotion is stronger in me than in many other people", and tested it on a third sample. We now find that it provides an acceptable factor loading for all of the items, with the same (good) model fit as found with the second revision.

Interestingly, the Cronbach's alphas are similar, around 0.84, for all three versions of the test. The item wordings of the original and revised versions are as follows:

1. On the whole, I am satisfied with myself.
2. At times I think I am no good at all.
3. I am able to do things as well as most other people.
4. I feel I do not have much to be proud of.
5. I wish I could have more respect for myself.
[5]* I think I am able to give myself more respect.
6. I certainly feel useless at times.
7. I feel that I'm a person of worth, at least on an equal plane with others.
[7]* I feel that I'm a person of worth, and I feel this emotion is stronger in me than in many other people.
8. All in all, I am inclined to feel that I am a failure.
9. I feel that I have a number of good qualities.
10. I take a positive attitude toward myself.

* Re-worded in a positive direction.

Table 3: A comparison of the mean (m), standard deviation (SD), factor loading (F.L.) and communality (h2) of each item between the original version (n = 664), the version with item 5 revised (n = 187), and the version with items 5 and 7 revised (n = 251)

5 Conclusion

Reliability can be assessed in a number of ways, and its importance lies in the impact it has on the construct validity of a scale or measurement. Since reliability and validity are specific to clinical settings and samples, identifying possible sources of error and modifying problematic items that cause low reliability should be encouraged, so that scales and measurements do not mislead or skew the outcomes of analysis.

References

Beeber, L. B., Seeherunwong, A., Schwartz, T., Funk, S. G., & Vongsirimas, N. (2007). Validity of the Rosenberg Self-Esteem Scale in young women from Thailand and the USA. Thai Journal of Nursing Research, 11(4).
Brown, T. A. (2006). Confirmatory factor analysis for applied research. New York: Guilford.
Carmines, E. G. (1978). Psychological origins of adolescent political attitudes: Self-esteem, political salience, and political involvement. American Politics Quarterly, 6.
Chan-Ob, T., & Boonyanaruthee, V. (1999). Medical student selection: Which matriculation scores and personality factors are important? Journal of the Medical Association of Thailand, 82(6).
Cohen, J. (1988). Statistical power analysis for the behavioral sciences. New Jersey: Lawrence Erlbaum Associates.
Cronbach, L. J. (1951). Coefficient alpha and the internal structure of tests. Psychometrika.
DeVellis, R. F. (1991). Scale development: Theory and applications. Newbury Park: Sage.
Feldt, L. S. (1969). A test of the hypothesis that Cronbach's alpha or Kuder-Richardson coefficient twenty is the same for two tests. Psychometrika, 34.
First, M. B., Gibbon, M., Spitzer, R. L., Williams, J. B. W., & Benjamin, L. S. (1997). Structured Clinical Interview for DSM-IV Axis II Personality Disorders (SCID-II). Washington, DC: American Psychiatric Press.
Fleiss, J. (1981). Statistical methods for rates and proportions (2nd ed.). New York: John Wiley.
Furr, R. M. (2011). Scale construction and psychometrics for social and personality psychology. London, UK: Sage Publications.
Guilford, J. P. (Ed.). (1954). Psychometric methods. New York: McGraw-Hill Book Company.
Hoyt, C. (1941). Test reliability estimated by analysis of variance. Psychometrika, 6.
Hu, L., & Bentler, P. M. (1995). Evaluating model fit. In R. H. Hoyle (Ed.), Structural equation modeling: Concepts, issues and applications. California: Sage.
Hu, L., & Bentler, P. M. (1998). Fit indices in covariance structure modeling: Sensitivity to underparameterized model misspecification. Psychological Methods, 3.
Hu, L., & Bentler, P. M. (1999). Cutoff criteria for fit indexes in covariance structure analysis: Conventional criteria versus new alternatives. Structural Equation Modeling, 6.
Miller, M. B. (1995). Coefficient alpha: A basic introduction from the perspectives of classical test theory and structural equation modeling. Structural Equation Modeling, 2(3).
Nunnally, J., & Bernstein, I. (1994). Psychometric theory. New York, USA: McGraw-Hill Book Company.

Wongpakaran, N., & Wongpakaran, T. (2010). The Thai version of the PSS-10: An investigation of its psychometric properties. BioPsychoSocial Medicine, 4, 6.
Wongpakaran, N., Wongpakaran, T., Bookamana, P., Pinyopornpanish, M., Maneeton, B., Lerttrakarnnon, P., & Jiraniramai, S. (2011). Diagnosing delirium in elderly Thai patients: Utilization of the CAM algorithm. BMC Family Practice, 12, 65.
Wongpakaran, T., & Wongpakaran, N. (2011). Confirmatory factor analysis of the Rosenberg Self-Esteem Scale: A study of a Thai student sample. Journal of the Psychiatric Association of Thailand, 65(1).
Wongpakaran, T., Wongpakaran, N., & Ruktrakul, R. (2011a). Reliability and validity of the Multidimensional Scale of Perceived Social Support (MSPSS): Thai version. Clinical Practice & Epidemiology in Mental Health, 7.
Wongpakaran, T., Wongpakaran, N., & Wannarit, K. (2011b). Validity and reliability of the Thai version of the Experiences of Close Relationships-Revised questionnaire. Singapore Medical Journal, 52(2).
Wongpakaran, T., & Wongpakaran, N. (2012a). A comparison of reliability and construct validity between the original and revised versions of the Rosenberg Self-Esteem Scale. Psychiatry Investigation, 9(1).
Wongpakaran, T., & Wongpakaran, N. (2012b). A short version of the revised Experience of Close Relationships Questionnaire: Investigating non-clinical and clinical samples. Clinical Practice & Epidemiology in Mental Health, 8.
Wongpakaran, T., & Wongpakaran, N. (2010c). A revised Thai Multi-dimensional Scale of Perceived Social Support (MSPSS). Spanish Journal of Psychology, 15(3).
Wongpakaran, T., & Wongpakaran, N. (2010d). How do the interpersonal and attachment styles of therapists impact upon the therapeutic alliance and therapeutic outcomes? Journal of the Medical Association of Thailand, 95(12).
Wongpakaran, T., Wongpakaran, N., Bukkamana, P., Boonyanaruthee, V., Pinyopornpanish, M., Likhitsathian, S., & Srisutadsanavong, U. (2012a). Interrater reliability of the Thai version of the Structured Clinical Interview for DSM-IV Axis II Disorders (T-SCID II). Journal of the Medical Association of Thailand, 95(2).
Wongpakaran, T., Wongpakaran, N., Sirithepthawee, U., Pratoomsri, W., Burapakajornpong, N., Rangseekajee, P., & Temboonkiat, A. (2012b). Interpersonal problems among psychiatric outpatients and non-clinical samples. Singapore Medical Journal, 53(7).
Yen, M., & Lo, L. H. (2002). Examining test-retest reliability: An intra-class correlation approach. Nursing Research, 51(1).


STUDY ON THE CORRELATION BETWEEN SELF-ESTEEM, COPING AND CLINICAL SYMPTOMS IN A GROUP OF YOUNG ADULTS: A BRIEF REPORT STUDY ON THE CORRELATION BETWEEN SELF-ESTEEM, COPING AND CLINICAL SYMPTOMS IN A GROUP OF YOUNG ADULTS: A BRIEF REPORT Giulia Savarese, PhD Luna Carpinelli, MA Oreste Fasano, PhD Monica Mollo, PhD Nadia

More information

11/24/2017. Do not imply a cause-and-effect relationship

11/24/2017. Do not imply a cause-and-effect relationship Correlational research is used to describe the relationship between two or more naturally occurring variables. Is age related to political conservativism? Are highly extraverted people less afraid of rejection

More information

Using Analytical and Psychometric Tools in Medium- and High-Stakes Environments

Using Analytical and Psychometric Tools in Medium- and High-Stakes Environments Using Analytical and Psychometric Tools in Medium- and High-Stakes Environments Greg Pope, Analytics and Psychometrics Manager 2008 Users Conference San Antonio Introduction and purpose of this session

More information

So far. INFOWO Lecture M5 Homogeneity and Reliability. Homogeneity. Homogeneity

So far. INFOWO Lecture M5 Homogeneity and Reliability. Homogeneity. Homogeneity So far INFOWO Lecture M5 Homogeneity and Reliability Peter de Waal Department of Information and Computing Sciences Faculty of Science, Universiteit Utrecht Descriptive statistics Scores and probability

More information

DEVELOPMENT AND VALIDATION OF THE JAPANESE SCALE OF MINDFULNESS SKILLS BASED ON DBT STRATEGIES

DEVELOPMENT AND VALIDATION OF THE JAPANESE SCALE OF MINDFULNESS SKILLS BASED ON DBT STRATEGIES DEVELOPMENT AND VALIDATION OF THE JAPANESE SCALE OF MINDFULNESS SKILLS BASED ON DBT STRATEGIES Keiko Nakano Department of Clinical Psychology/Atomi University JAPAN ABSTRACT The present study reports findings

More information

The Psychometric Development Process of Recovery Measures and Markers: Classical Test Theory and Item Response Theory

The Psychometric Development Process of Recovery Measures and Markers: Classical Test Theory and Item Response Theory The Psychometric Development Process of Recovery Measures and Markers: Classical Test Theory and Item Response Theory Kate DeRoche, M.A. Mental Health Center of Denver Antonio Olmos, Ph.D. Mental Health

More information

The Ego Identity Process Questionnaire: Factor Structure, Reliability, and Convergent Validity in Dutch-Speaking Late. Adolescents

The Ego Identity Process Questionnaire: Factor Structure, Reliability, and Convergent Validity in Dutch-Speaking Late. Adolescents 33 2 The Ego Identity Process Questionnaire: Factor Structure, Reliability, and Convergent Validity in Dutch-Speaking Late Adolescents Koen Luyckx, Luc Goossens, Wim Beyers, & Bart Soenens (2006). Journal

More information

The revised short-form of the Eating Beliefs Questionnaire: Measuring positive, negative, and permissive beliefs about binge eating

The revised short-form of the Eating Beliefs Questionnaire: Measuring positive, negative, and permissive beliefs about binge eating Burton and Abbott Journal of Eating Disorders (2018) 6:37 https://doi.org/10.1186/s40337-018-0224-0 RESEARCH ARTICLE Open Access The revised short-form of the Eating Beliefs Questionnaire: Measuring positive,

More information

Procedia - Social and Behavioral Sciences 152 ( 2014 ) ERPA Academic functional procrastination: Validity and reliability study

Procedia - Social and Behavioral Sciences 152 ( 2014 ) ERPA Academic functional procrastination: Validity and reliability study Available online at www.sciencedirect.com ScienceDirect Procedia - Social and Behavioral Sciences 152 ( 2014 ) 194 198 ERPA 2014 Academic functional procrastination: Validity and reliability study Mehmet

More information

Validity and reliability of measurements

Validity and reliability of measurements Validity and reliability of measurements 2 Validity and reliability of measurements 4 5 Components in a dataset Why bother (examples from research) What is reliability? What is validity? How should I treat

More information

Words: 1393 (excluding table and references) Exploring the structural relationship between interviewer and self-rated affective

Words: 1393 (excluding table and references) Exploring the structural relationship between interviewer and self-rated affective Interviewer and self-rated affective symptoms in HD 1 Words: 1393 (excluding table and references) Tables: 1 Corresponding author: Email: Maria.Dale@leicspart.nhs.uk Tel: +44 (0) 116 295 3098 Exploring

More information

11-3. Learning Objectives

11-3. Learning Objectives 11-1 Measurement Learning Objectives 11-3 Understand... The distinction between measuring objects, properties, and indicants of properties. The similarities and differences between the four scale types

More information

Chapter 4: Defining and Measuring Variables

Chapter 4: Defining and Measuring Variables Chapter 4: Defining and Measuring Variables A. LEARNING OUTCOMES. After studying this chapter students should be able to: Distinguish between qualitative and quantitative, discrete and continuous, and

More information

Chapter 3 Psychometrics: Reliability and Validity

Chapter 3 Psychometrics: Reliability and Validity 34 Chapter 3 Psychometrics: Reliability and Validity Every classroom assessment measure must be appropriately reliable and valid, be it the classic classroom achievement test, attitudinal measure, or performance

More information

EFFECTS OF ITEM ORDER ON CONSISTENCY AND PRECISION UNDER DIFFERENT ORDERING SCHEMES IN ATTITUDINAL SCALES: A CASE OF PHYSICAL SELF-CONCEPT SCALES

EFFECTS OF ITEM ORDER ON CONSISTENCY AND PRECISION UNDER DIFFERENT ORDERING SCHEMES IN ATTITUDINAL SCALES: A CASE OF PHYSICAL SELF-CONCEPT SCALES Item Ordering 1 Edgeworth Series in Quantitative Educational and Social Science (Report No.ESQESS-2001-3) EFFECTS OF ITEM ORDER ON CONSISTENCY AND PRECISION UNDER DIFFERENT ORDERING SCHEMES IN ATTITUDINAL

More information

Examination of the factor structure of critical thinking disposition scale according to different variables

Examination of the factor structure of critical thinking disposition scale according to different variables American Journal of Theoretical and Applied Statistics 2015; 4(1-1): 1-8 Published online August 30, 2015 (http://www.sciencepublishinggroup.com/j/ajtas) doi: 10.11648/j.ajtas.s.2015040101.11 ISSN: 2326-8999

More information

The Institute for Motivational Living, Inc.

The Institute for Motivational Living, Inc. The Institute for Motivational Living, Inc. DISC Instrument Validation Study Technical Report February 1, 2006 Copyright 2006 retained by The Institute for Motivational Living, Incorporated, Larry R. Price,

More information

Packianathan Chelladurai Troy University, Troy, Alabama, USA.

Packianathan Chelladurai Troy University, Troy, Alabama, USA. DIMENSIONS OF ORGANIZATIONAL CAPACITY OF SPORT GOVERNING BODIES OF GHANA: DEVELOPMENT OF A SCALE Christopher Essilfie I.B.S Consulting Alliance, Accra, Ghana E-mail: chrisessilfie@yahoo.com Packianathan

More information

Chapter 3. Psychometric Properties

Chapter 3. Psychometric Properties Chapter 3 Psychometric Properties Reliability The reliability of an assessment tool like the DECA-C is defined as, the consistency of scores obtained by the same person when reexamined with the same test

More information

Internal Consistency and Reliability of the Networked Minds Measure of Social Presence

Internal Consistency and Reliability of the Networked Minds Measure of Social Presence Internal Consistency and Reliability of the Networked Minds Measure of Social Presence Chad Harms Iowa State University Frank Biocca Michigan State University Abstract This study sought to develop and

More information

FACTOR VALIDITY OF THE MERIDEN SCHOOL CLIMATE SURVEY- STUDENT VERSION (MSCS-SV)

FACTOR VALIDITY OF THE MERIDEN SCHOOL CLIMATE SURVEY- STUDENT VERSION (MSCS-SV) FACTOR VALIDITY OF THE MERIDEN SCHOOL CLIMATE SURVEY- STUDENT VERSION (MSCS-SV) Nela Marinković 1,2, Ivana Zečević 2 & Siniša Subotić 3 2 Faculty of Philosophy, University of Banja Luka 3 University of

More information

Validity. Ch. 5: Validity. Griggs v. Duke Power - 2. Griggs v. Duke Power (1971)

Validity. Ch. 5: Validity. Griggs v. Duke Power - 2. Griggs v. Duke Power (1971) Ch. 5: Validity Validity History Griggs v. Duke Power Ricci vs. DeStefano Defining Validity Aspects of Validity Face Validity Content Validity Criterion Validity Construct Validity Reliability vs. Validity

More information

Comprehensive Statistical Analysis of a Mathematics Placement Test

Comprehensive Statistical Analysis of a Mathematics Placement Test Comprehensive Statistical Analysis of a Mathematics Placement Test Robert J. Hall Department of Educational Psychology Texas A&M University, USA (bobhall@tamu.edu) Eunju Jung Department of Educational

More information

Instrument Validation Study

Instrument Validation Study Instrument Validation Study REGARDING LEADERSHIP CIRCLE PROFILE By Industrial Psychology Department Bowling Green State University INSTRUMENT VALIDATION STUDY EXECUTIVE SUMMARY AND RESPONSE TO THE RECOMMENDATIONS

More information

Chapter 1: Explaining Behavior

Chapter 1: Explaining Behavior Chapter 1: Explaining Behavior GOAL OF SCIENCE is to generate explanations for various puzzling natural phenomenon. - Generate general laws of behavior (psychology) RESEARCH: principle method for acquiring

More information

ESTABLISHING VALIDITY AND RELIABILITY OF ACHIEVEMENT TEST IN BIOLOGY FOR STD. IX STUDENTS

ESTABLISHING VALIDITY AND RELIABILITY OF ACHIEVEMENT TEST IN BIOLOGY FOR STD. IX STUDENTS International Journal of Educational Science and Research (IJESR) ISSN(P): 2249-6947; ISSN(E): 2249-8052 Vol. 4, Issue 4, Aug 2014, 29-36 TJPRC Pvt. Ltd. ESTABLISHING VALIDITY AND RELIABILITY OF ACHIEVEMENT

More information

The Youth Experience Survey 2.0: Instrument Revisions and Validity Testing* David M. Hansen 1 University of Illinois, Urbana-Champaign

The Youth Experience Survey 2.0: Instrument Revisions and Validity Testing* David M. Hansen 1 University of Illinois, Urbana-Champaign The Youth Experience Survey 2.0: Instrument Revisions and Validity Testing* David M. Hansen 1 University of Illinois, Urbana-Champaign Reed Larson 2 University of Illinois, Urbana-Champaign February 28,

More information

Paul Irwing, Manchester Business School

Paul Irwing, Manchester Business School Paul Irwing, Manchester Business School Factor analysis has been the prime statistical technique for the development of structural theories in social science, such as the hierarchical factor model of human

More information

Confirmatory Factor Analysis of the BCSSE Scales

Confirmatory Factor Analysis of the BCSSE Scales Confirmatory Factor Analysis of the BCSSE Scales Justin Paulsen, ABD James Cole, PhD January 2019 Indiana University Center for Postsecondary Research 1900 East 10th Street, Suite 419 Bloomington, Indiana

More information

ISC- GRADE XI HUMANITIES ( ) PSYCHOLOGY. Chapter 2- Methods of Psychology

ISC- GRADE XI HUMANITIES ( ) PSYCHOLOGY. Chapter 2- Methods of Psychology ISC- GRADE XI HUMANITIES (2018-19) PSYCHOLOGY Chapter 2- Methods of Psychology OUTLINE OF THE CHAPTER (i) Scientific Methods in Psychology -observation, case study, surveys, psychological tests, experimentation

More information

Connectedness DEOCS 4.1 Construct Validity Summary

Connectedness DEOCS 4.1 Construct Validity Summary Connectedness DEOCS 4.1 Construct Validity Summary DEFENSE EQUAL OPPORTUNITY MANAGEMENT INSTITUTE DIRECTORATE OF RESEARCH DEVELOPMENT AND STRATEGIC INITIATIVES Directed by Dr. Daniel P. McDonald, Executive

More information

Technical Whitepaper

Technical Whitepaper Technical Whitepaper July, 2001 Prorating Scale Scores Consequential analysis using scales from: BDI (Beck Depression Inventory) NAS (Novaco Anger Scales) STAXI (State-Trait Anxiety Inventory) PIP (Psychotic

More information

Types of Tests. Measurement Reliability. Most self-report tests used in Psychology and Education are objective tests :

Types of Tests. Measurement Reliability. Most self-report tests used in Psychology and Education are objective tests : Measurement Reliability Objective & Subjective tests Standardization & Inter-rater reliability Properties of a good item Item Analysis Internal Reliability Spearman-Brown Prophesy Formla -- α & # items

More information

An Assessment of the Mathematics Information Processing Scale: A Potential Instrument for Extending Technology Education Research

An Assessment of the Mathematics Information Processing Scale: A Potential Instrument for Extending Technology Education Research Association for Information Systems AIS Electronic Library (AISeL) SAIS 2009 Proceedings Southern (SAIS) 3-1-2009 An Assessment of the Mathematics Information Processing Scale: A Potential Instrument for

More information

Brooding and Pondering: Isolating the Active Ingredients of Depressive Rumination with Confirmatory Factor Analysis

Brooding and Pondering: Isolating the Active Ingredients of Depressive Rumination with Confirmatory Factor Analysis Michael Armey David M. Fresco Kent State University Brooding and Pondering: Isolating the Active Ingredients of Depressive Rumination with Confirmatory Factor Analysis Douglas S. Mennin Yale University

More information

While many studies have employed Young s Internet

While many studies have employed Young s Internet CYBERPSYCHOLOGY, BEHAVIOR, AND SOCIAL NETWORKING Volume 16, Number 3, 2013 ª Mary Ann Liebert, Inc. DOI: 10.1089/cyber.2012.0426 Arabic Validation of the Internet Addiction Test Nazir S. Hawi, EdD Abstract

More information

Do not write your name on this examination all 40 best

Do not write your name on this examination all 40 best Student #: Do not write your name on this examination Research in Psychology I; Final Exam Fall 200 Instructor: Jeff Aspelmeier NOTE: THIS EXAMINATION MAY NOT BE RETAINED BY STUDENTS This exam is worth

More information

Wellbeing Measurement Framework for Colleges

Wellbeing Measurement Framework for Colleges Funded by Wellbeing Measurement Framework for Colleges In partnership with Child Outcomes Research Consortium CONTENTS About the wellbeing measurement framework for colleges 3 General Population Clinical

More information

The MHSIP: A Tale of Three Centers

The MHSIP: A Tale of Three Centers The MHSIP: A Tale of Three Centers P. Antonio Olmos-Gallo, Ph.D. Kathryn DeRoche, M.A. Mental Health Center of Denver Richard Swanson, Ph.D., J.D. Aurora Research Institute John Mahalik, Ph.D., M.P.A.

More information