A Psychometric Investigation of the Marlowe-Crowne Social Desirability Scale Using Rasch Measurement

Hyunsoo Seol

The author used Rasch measurement to examine the reliability and validity of 382 Korean university students' scores on the Marlowe-Crowne Social Desirability Scale (MCSDS; D. P. Crowne & D. Marlowe, 1960). Results revealed that item-fit statistics and principal component analysis with standardized residuals provide evidence of the MCSDS's unidimensionality. Six (of 33) items displayed differential item functioning for gender.

Over the years, researchers have been concerned about the possibility that self-report instruments could be contaminated by respondents answering questions in ways that influence whether they are perceived favorably or unfavorably. Socially desirable responding in self-reports on sensitive behaviors such as delinquency, violence, and drug abuse interferes with accurate inferences from self-report scores (Fraboni & Cooper, 1989; King & Bruner, 2000). Social desirability bias, which is defined as the tendency for individuals to portray themselves in a generally favorable fashion (Holden, 1994, p. 429), has been studied since the 1950s (e.g., Edwards, 1957). One concern, however, is that Edwards's scale was derived from the Minnesota Multiphasic Personality Inventory (MMPI) and was associated with psychopathology. Crowne and Marlowe (1960) noted that when Edwards's scale, with items drawn from the MMPI, is administered to college students, the meaning of high social desirability scores is not at all clear (p. 349). Accordingly, the Marlowe-Crowne Social Desirability Scale (MCSDS; Crowne & Marlowe, 1960) was developed to minimize pathological implications (p. 349) and tested on college students.
Hyunsoo Seol, Department of Education, Chung-Ang University, Seoul, Republic of Korea. This work was supported by a Korea Research Foundation Grant (MOEHRD, Basic Research Promotion Fund, KRF B00232) funded by the government of the Republic of Korea. Correspondence concerning this article should be addressed to Hyunsoo Seol, Department of Education, Chung-Ang University, Dongjak-Gu, Seoul, Republic of Korea (snow@cau.ac.kr). American Counseling Association. All rights reserved. Measurement and Evaluation in Counseling and Development, October 2007, Volume 40.

The MCSDS consists of 33 true-false statements, of which 18 items describe socially approved but infrequent behaviors (e.g., "I never hesitate to go out of my way to help someone in trouble"), and the other 15 items refer to socially disapproved but frequent behaviors (e.g., "I like to gossip at times"). Since its development, the MCSDS has been described and adopted in more than 1,000 diverse studies (Beretvas, Meyers, & Leite, 2002). Although this scale has been widely used to measure one form of response bias, or faking good, its length has led to the development and use of several short forms of the MCSDS (Ballard, 1992; Reynolds, 1982; Strahan & Gerbasi, 1972). According to Barger (2002), for example, Reynolds's forms were cited in 128 studies and Strahan and Gerbasi's forms were cited in 145 studies in the 1990s. In addition to the variety of uses to which the MCSDS has been put, there has also been some concern about whether the scale represents a unidimensional structure (called social desirability or need for approval). Several researchers have investigated the dimensionality of the full and short forms of the MCSDS using factor analytic techniques, and their findings have been contradictory (Barger, 2002; Collazo, 2005; Fischer & Fick, 1993; Leite & Beretvas, 2005; Loo & Thorpe, 2000; Paulhus, 1984; Ramanaiah & Martin, 1980; Ramanaiah, Schill, & Leung, 1977). In most of the studies investigating the factor structure of responses to the MCSDS, the MCSDS did not fit a one-factor model, suggesting a multidimensional structure. Recently, Barger found that the MCSDS did not fit the one-factor model using the Satorra-Bentler robust maximum likelihood estimation (Bentler, 1995). Leite and Beretvas also conducted confirmatory factor analysis using the mean- and variance-adjusted weighted least squares estimator (L. Muthén & Muthén, 1998) and found two subscales, thus raising questions about the unidimensional structure of the MCSDS.

A number of problems regarding the use of linear factor analytic techniques on dichotomous data have been noted (Meara, Robin, & Sireci, 2000; B. Muthén, 1978; Olsson, 1979). The dimensionality of a matrix of phi coefficients, for example, may differ from the dimensionality of the underlying continuous variables (Bernstein & Teng, 1989; Hambleton & Rovinelli, 1986). Because responses to the MCSDS are in a true-false format, linear factor analysis may distort the underlying structure of the dichotomous data (Bock, Gibbons, & Muraki, 1988; McDonald & Ahlawat, 1974). Many of the problems posed by factor analysis procedures with dichotomous items can be avoided with the use of item response theory models or, as illustrated in this study, with the use of the Rasch model (Rasch, 1960). Unlike the traditional factor analytic techniques, Rasch modeling places both the difficulty of items and the ability of persons on a common scale. The graphical representation (item-person map) of the conjoint distribution of the person and item estimates on the same scale provides evidence about whether the instrument is appropriate for a given sample.
The lack of spread of item difficulty estimates on the item-person map suggests that some of the items may be redundant. Furthermore, although the models of factor analysis work under the assumption of normal distribution of the data, Rasch models make no such assumption (Slinde & Linn, 1979). It should be noted, however, that the importance of item-level analysis (i.e., focusing on the quality of the individual items, rather than on the whole test) has not been properly recognized in the published research on the MCSDS, mostly because previous research on the MCSDS is based on traditional methods of factor analysis. Because the Rasch model is based on the assumption of unidimensionality, deviations from Rasch model data fit are considered evidence of multidimensionality (Linacre, 1998; E. V. Smith, 2002; R. Smith & Miao, 1994; Wright, 1996).

Given the frequent use of the MCSDS in many different fields and the conflicting findings about whether the MCSDS represents a unidimensional or a multidimensional construct, it is imperative to examine the dimensionality of the scale with the use of appropriate statistical techniques (Collazo, 2005; Leite & Beretvas, 2005); therefore, the purpose of this study was to examine the psychometric properties of the MCSDS by using the Rasch model with a college student sample.

Method

Participants

The participants were 382 (248 undergraduate and 134 graduate) students from several education classes at a large private university in the Republic of Korea. Participants received extra course credit for participating. Participants were given a consent form and a brief explanation of the goals of the study. The questionnaire consisted of a page of demographic statements, followed by 33 true-false items. Of the participants, 28.3% (n = 108) were men and 71.7% (n = 274) were women. The average age was years (median = 23.0). All participants completed the MCSDS in its entirety.

Instrument

The MCSDS consists of 33 items, of which 18 are keyed true (i.e., highly desirable behaviors with a low probability of occurrence) and 15 are keyed false (i.e., socially disapproved behaviors with a high probability of occurrence). Each item was rated on a 2-point (true-false) scale. Gender and age were requested at the end of the questionnaire. Responses to the items were dummy coded as true = 1 and false = 0 for the positively keyed items and the reverse for the negatively keyed items. Thus, high scale scores indicate a strong tendency to respond in a socially desirable fashion.

Translation Procedure

I translated the original version of the MCSDS into Korean. The translation was checked by a counseling psychologist. Some modifications were made in the wording of certain items. The Korean version was then translated back into English by two bilingually fluent researchers. High convergence between the two versions was obtained.

The Dichotomous Rasch Model

The dichotomous Rasch model specifies through log-odds that the probability of person n's response to item i is governed by the ability of the person ($B_n$) and the difficulty of the item ($D_i$):

$$\ln\left(\frac{P_{ni1}}{P_{ni0}}\right) = B_n - D_i,$$

where $P_{ni1}$ is the probability of an endorsed response, $P_{ni0}$ is the probability of a nonendorsed response, $B_n$ is the trait (or ability) measure of person n, and $D_i$ is the difficulty of endorsing item i. One of the most important features of the Rasch model is its specific objectivity (Wright & Stone, 1979), which means that, if the data fit the model requirements, the estimates of the ability of a person are freed from the sampling distribution of the items attempted and the estimates of the difficulty of the items are freed from the sampling distribution of the specific people used in the calibration. The Rasch model, according to Dinero and Haertel (1977), has also been found to be robust to departures from the assumption of equal item discrimination.
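As a worked step that the text leaves implicit, the log-odds form can be solved for the endorsement probability. The sketch below assumes only that the two response probabilities of a dichotomous item sum to one ($P_{ni0} = 1 - P_{ni1}$); the symbols are those defined for the model above.

```latex
\begin{align}
\ln\frac{P_{ni1}}{1 - P_{ni1}} &= B_n - D_i \\
\frac{P_{ni1}}{1 - P_{ni1}} &= e^{B_n - D_i} \\
P_{ni1} &= \frac{e^{B_n - D_i}}{1 + e^{B_n - D_i}}
\end{align}
```

For instance, a person whose measure equals the item difficulty ($B_n = D_i$) has a .50 probability of endorsing the item, and the probability rises toward 1 as $B_n - D_i$ grows.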
In the Rasch model, item-fit statistics, item standardized residual correlations, and principal component analysis (PCA) of the standardized residuals can be used to investigate the dimensionality of the MCSDS (Andrich, Sheridan, & Luo, 2005; Linacre, 1998, 2006). The Rasch analysis was conducted using the Winsteps computer program (Linacre, 2006).

Measurement Properties

Measurement properties of the MCSDS, including person separation reliability, person separation, and person strata indices, were evaluated. The Rasch analogue to Cronbach's alpha is called the person separation reliability, which refers to the ability to differentiate persons on the measured variable and the replicability of person placement across other items measuring the same construct (Wright & Masters, 1982). Fox and Jones (1998) considered person separation reliability equal to or greater than .80 acceptable. On the other hand, E. V. Smith (2001) has suggested that the index of person separation reliability is nonlinear and suffers from a ceiling effect. The person separation index is an estimate of the spread of persons on the measured variable and is computed as the adjusted person standard deviation divided by the root mean square measurement error (Wright & Masters, 1982). The higher the value of the person separation index, which has a range from zero to infinity, the more spread out the persons are on the variable being measured. In this study, I also examined the person strata index, which indicates the number of distinct ability levels separated by three errors of measurement. The person strata index is helpful when creating an assessment instrument designed to determine group differences. For example, a reliability of .80 gives a separation of 2.0, so that three strata can be usefully distinguished.

Content Evidence of Construct Validity

Messick (1995) defined the content evidence of construct validity as the degree to which the items on the instrument are representative of and relevant to the construct domain, as well as the technical quality of individual items. To examine content evidence of construct validity of the MCSDS, several methods are available within the framework of Rasch measurement (e.g., E. V. Smith, 2001; Wolfe & Smith, 2007a, 2007b). First, content evidence of construct validity can be examined by an interpretation of the intended item hierarchy, the spread of the item calibrations, and separation and reliability statistics. In the Rasch analysis, the distributions of the person ability and item parameter estimates can be visually inspected to assess whether the difficulty level of the MCSDS is appropriate for the sample of persons. If an instrument is appropriately targeting a sample, there should be sufficient overlap between the range of person trait (ability) levels and the range of item difficulty levels. E. V. Smith (2001) also noted that the examination of person-fit statistics provides an indication of substantive validity. Therefore, to ensure that a person's response patterns on the MCSDS items correspond to those predicted by the Rasch model, I also examined person-fit statistics.
Infit and outfit person-fit mean square (MNSQ) values within a range of 0.5 to 1.5 are considered evidence of adequate fit, but these cutoff values are not absolute rules (Wright & Linacre, 1994).

To provide evidence of the internal structure of the MCSDS for the study sample, two (infit and outfit) item-fit MNSQ statistics (Wright & Stone, 1979) and the point-measure correlation of each item were computed to determine how well each item contributes to defining one common construct. MNSQ has an expectation of 1.0 and a range from zero to infinity. Although a variety of ranges have been suggested to indicate adequate fit, I considered items to fit if their MNSQ fell within the range of 0.5 to 1.5, as suggested by Linacre (2006). Item MNSQ values greater than 1.5 indicate a lack of construct homogeneity, and values less than 0.5 indicate redundancy with other items. The point-measure correlations are computed in the same way as point-biserial correlations, except that the total raw scores are replaced by Rasch measures. Negative point-measure correlations may be due to scoring errors, a rating scale with reversed direction, or multidimensionality.

As noted earlier, the Rasch model is based on the assumption of unidimensionality. If there is a good model fit for persons and items, the residual components, representing the difference between what the Rasch model predicts and what is observed, should be random and contain no additional systematic information (Andrich et al., 2005; Linacre, 1998, 2006). This implies that the interitem residual correlations would approach zero if the data fit the Rasch unidimensional measurement model. The residual analysis for assessing unidimensionality in the Rasch model is similar to the method based on the principle of essential unidimensionality (W. F. Stout, 1990; W. R. Stout, 1987). To help in decisions about the dimensionality of the MCSDS, I conducted PCA with standardized residuals.
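The two mean-square statistics can be illustrated with a short sketch. It is not the Winsteps implementation; it assumes the commonly used definitions in which outfit is the unweighted mean of the squared standardized residuals and infit is the sum of squared residuals divided by the sum of the model variances. The function names and the toy data are hypothetical and serve only to show the arithmetic for one dichotomous item.

```python
import math


def endorse_probability(ability, difficulty):
    """Rasch probability of endorsing an item (both arguments in logits)."""
    return 1.0 / (1.0 + math.exp(-(ability - difficulty)))


def item_fit(responses, abilities, difficulty):
    """Return (infit, outfit) mean squares for one dichotomous item.

    responses: 0/1 answers, one per person.
    abilities: person measures in logits, in the same order.
    """
    squared_residuals = 0.0  # running sum of (x - p)^2
    model_variance = 0.0     # running sum of p * (1 - p)
    squared_z = 0.0          # running sum of squared standardized residuals
    for x, b in zip(responses, abilities):
        p = endorse_probability(b, difficulty)
        variance = p * (1.0 - p)
        squared_residuals += (x - p) ** 2
        model_variance += variance
        squared_z += (x - p) ** 2 / variance
    infit = squared_residuals / model_variance   # information-weighted
    outfit = squared_z / len(responses)          # outlier-sensitive
    return infit, outfit


def within_fit_range(mnsq, low=0.5, high=1.5):
    """Apply the 0.5 to 1.5 screening range used in this study."""
    return low <= mnsq <= high


# Hypothetical toy data: four persons located exactly at the item difficulty.
toy_infit, toy_outfit = item_fit([1, 0, 1, 0], [0.0, 0.0, 0.0, 0.0], 0.0)
print(toy_infit, toy_outfit, within_fit_range(toy_infit))
```

When able persons reject the item and less able persons endorse it, the squared residuals exceed the model variance and both statistics rise above 1, which is the pattern the 1.5 cutoff is meant to flag.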
Linacre (1998) noted that standardized residuals (residuals divided by their model standard deviation) are very useful, along with other types of residuals (e.g., raw score residuals, logit residuals). In this study, I considered an eigenvalue less than 3.0 for the first residual component (Linacre, 2006) and variance explained by the first residual component of less than 10% to be an indication of unidimensionality, but there are also some other criteria for representing the existence of a secondary dimension (Andrich et al., 2005; Linacre, 2006; Raiche, 2005; Reckase, 1979; E. V. Smith, 2002; R. Smith & Miao, 1994).
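The two screening criteria can be written as a small check. This is a sketch under one stated assumption: the residual variance is expressed in eigenvalue units that total the number of items, as is the case when the components are extracted from a correlation matrix of the item residuals. The function name is hypothetical; the example values (a first eigenvalue of 2.27 for 33 items) are those reported in the Results of this article.

```python
def first_component_screen(first_eigenvalue, n_items,
                           max_eigenvalue=3.0, max_vaf_percent=10.0):
    """Screen the first residual component against the two study criteria.

    Assumes total residual variance equals n_items in eigenvalue units,
    so the percentage of variance is 100 * eigenvalue / n_items.
    """
    vaf_percent = 100.0 * first_eigenvalue / n_items
    meets_criteria = (first_eigenvalue < max_eigenvalue
                      and vaf_percent < max_vaf_percent)
    return vaf_percent, meets_criteria


vaf, ok = first_component_screen(2.27, 33)
print(round(vaf, 2), ok)
```

Under that assumption the computed percentage is close to the 6.87% reported for the first component, with the small gap attributable to rounding of the eigenvalue.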

Generalizability of Construct Validity

In the context of Rasch measurement, E. V. Smith (2001) stated that the generalizability aspect of construct validity is concerned with the scope to which inferences regarding person measures or item calibration are invariant, within measurement error, across different tasks, items, groups, or contexts (p. 298). To provide evidence of generalizability of construct validity in this study, a differential item functioning (DIF) analysis of the MCSDS data was carried out to examine whether the items work in the same way for the gender groups, regardless of respondents' placement on the trait or ability. Differences in item location parameters from one subgroup to another were tested by dividing each difference by the joint standard error of the two item estimates (Linacre, 2006). The resulting ratio is considered significant when it exceeds 2.0 in absolute value. This test is equivalent to the Mantel-Haenszel significance test (Linacre & Wright, 1989). I also examined the Mantel-Haenszel (Mantel & Haenszel, 1959) DIF estimates and the empirical item characteristic curve for each group.

Results

Participants' Summary Statistics

Table 1 provides participants' summary statistics for the total sample and by gender. The mean measures for the male respondents and the female respondents on the MCSDS were .03 (SE = .07) and .37 (SE = .05), respectively, on the logit scale. A statistically significant gender difference was found, t(380) = 4.39, p < .01. The person separation reliability of the MCSDS was slightly higher for the female respondents than for the male respondents (.74 vs. .71, respectively). Similar patterns were found for the male and female respondents with respect to the person separation and person strata indices. The person separation reliability, person separation, and person strata indices for the total sample were 0.74, 1.69, and 2.59, respectively.
This result implies that if the sample is normally distributed, there are about two measurably different levels of performance in this sample.

Item Hierarchy and Targeting

Figure 1 shows the distribution of the 382 participants and 33 MCSDS items on a common metric (logit scale), where each # represents 3 persons, and the MCSDS items are indicated by the item number.

Table 1
Participants' Summary Statistics for the Marlowe-Crowne Social Desirability Scale
Columns: Gender; Rasch Measure (M, SE a); R p b; G p c; Strata d. Rows: Men (n = 108); Women (n = 274); Total Sample (N = 382).
Note. a Standard error of person mean. b Person separation reliability, which has a range from 0 to 1. c Person separation index, which has a range from 0 to infinity. d Person strata index = (4G_p + 1)/3.
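The relation among the three indices in Table 1 can be sketched in a few lines. The strata formula is the one given in the table note; the conversion from reliability to separation, G = sqrt(R / (1 - R)), is the standard Rasch relationship and is stated here as an assumption because the article reports the indices without writing it out. Function names are hypothetical.

```python
import math


def separation_from_reliability(reliability):
    """Separation index G implied by a separation reliability R (0 <= R < 1)."""
    return math.sqrt(reliability / (1.0 - reliability))


def strata(separation):
    """Strata index from the Table 1 note: (4G + 1) / 3."""
    return (4.0 * separation + 1.0) / 3.0


# The example from the Method section: R = .80 gives G = 2.0 and 3 strata.
g_example = separation_from_reliability(0.80)
print(round(g_example, 2), round(strata(g_example), 2))

# The total-sample values: R = .74 reproduces G = 1.69, and G = 1.69 gives 2.59.
print(round(separation_from_reliability(0.74), 2), round(strata(1.69), 2))
```

The same strata formula applied to the item separation index of 7.54 reported later gives 10.39, which matches the item strata value in the text.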

Figure 1. Comparison Between Person Trait and Item Difficulty Measure (item-person map on the logit scale). Key: <more> = most able persons; <less> = least able persons; <rare> = most difficult to endorse items; <frequ> = least difficult to endorse items; # = 3 persons; M = the location of the mean person and item measures; S = one standard deviation away from the mean; T = two standard deviations away from the mean.

As shown in Figure 1, the MCSDS items are arranged by degree of item difficulty from most difficult to endorse at the top (Item 31) to least difficult to endorse at the bottom (Item 24). The coverage of the 33 MCSDS items, which spans a wide range of person ability, indicates that this set of items is appropriately targeted for the sample being measured. The item separation index, which has a range from zero to infinity, shows whether the items are adequately dispersed along the latent trait of the MCSDS, implying the replicability of the order of the items across other samples (Bond & Fox, 2001). A larger separation index value indicates higher reliability (E. V. Smith, 2001). The item strata index can be used to determine the number of statistically distinct levels of item difficulty that respondents have distinguished (E. V. Smith, 2001; Wright & Masters, 1982). In this study, the item separation and strata indices were 7.54 and 10.39, respectively, thus indicating adequately dispersed items on the logit scale.

Item Fit

Two (infit and outfit) item-fit MNSQ statistics (Wright & Stone, 1979) and the point-measure correlation of each item were computed to determine how well each item contributes to defining one common construct for the internal structure of the MCSDS (see Table 2). The results showed that no items produced infit or outfit values greater than 1.5 or less than 0.5. The fit values for one item (Item 7: "I am always careful about my manner of dress") were substantially higher than the other fit values (infit = 1.27, outfit = 1.39), and there was a negative point-measure correlation for this item (r_pm = -.01), which may be viewed as a problem with the unidimensionality of the MCSDS.

Table 2
Item Location Estimates and Fit Statistics of the Marlowe-Crowne Social Desirability Scale
Columns: Item; Measure; Error; Infit; Outfit; r pm.
Note. Infit = information-weighted mean square statistic; Outfit = outlier-sensitive mean square statistic; r pm = point-measure correlation.

Person Fit

Only 16 participants (out of 382; approximately 4%) in the study sample showed outfit values greater than 1.5, and no participants produced infit values greater than 1.5. This implies that the data from this sample do not severely violate the Rasch model requirements.

Unidimensionality of the MCSDS

To investigate whether the items constituting the Korean version of the MCSDS were measuring one general construct of social desirability, I examined the standardized item residual correlation matrix. A high positive (or negative) correlation of residuals between pairs of items may indicate item redundancy; that is, the dependency in the responses to these items is higher than can be accounted for by the unidimensional model. The findings showed that even the largest correlation coefficient was relatively small (.23), which indicates that the standardized residuals may contain no additional systematic information.

Next, a PCA of standardized residuals was performed. Table 3 shows the eigenvalues and percentage of variance attributed to each residual component. Sixteen residual components had eigenvalues greater than 1.0 and accounted for 65.10% of the total residual variance. The first component had an eigenvalue of 2.27, representing 6.87% of the residual variance. This can be interpreted as indicating unidimensionality of the MCSDS on the basis of the guidelines indicated earlier. As further evidence, the more similar the distribution of the percentages of variance accounted for by the components across the set of items, the more likely all items will fit the Rasch model (Andrich et al., 2005).
Table 3 shows that the percentages of variance accounted for by the components were quite similar in magnitude, thus suggesting the lack of secondary structures (subdimensions) underlying the data. Figure 2 displays the expected and empirical item characteristic curves (ICCs). The boundary lines indicate the upper and lower limits of the 95% two-sided confidence intervals. When an empirical point, which is represented by x, lies outside the boundaries, some unmodeled source of variance may be present in the observations (Linacre, 2006). Clearly, most of the empirical points are located within the 95% confidence interval along the expected ICC, implying that the data fit the Rasch unidimensional model.

Table 3
Principal Component Analysis of Standardized Residuals
Columns: Component; Eigenvalue; VAF; Cumulative VAF.
Note. VAF = percentage of residual variance accounted for by the component.

Figure 2. Item Characteristic Curves (With 95% Confidence Intervals). Key: X = empirical point; line = item characteristic curve.

Differential Item Functioning for Gender

The results of the DIF analyses for the MCSDS data are presented in Table 4. The DIF t tests identified 6 (out of 33) items as displaying DIF. As expected, the Mantel-Haenszel significance test also revealed the same items as showing significant DIF, with the exception of Item 32, which showed nonsignificant gender-related DIF. Item 7 ("I am always careful about my manner of dress") and Item 27 ("I never make a long trip without checking the safety of my car") were significantly easier for women to endorse as true than they were for men. On the other hand, Item 19 ("I sometimes try to get even rather than forgive and forget") and Item 32 ("I sometimes think when people have a misfortune they only got what they deserved") were easier for women to endorse as false than they were for men. However, Item 3 ("It is sometimes hard to go on with my work if I am not encouraged") and Item 28 ("There have been times when I was quite jealous of the good fortune of others") were easier for men to endorse as false than they were for women. The expected and empirical ICCs of the 6 items showing significant DIF by gender are depicted in Figure 3. For Item 3 and Item 28, the bold line for men is higher than the line for women along the expected ICC, indicating that the DIF is in favor of men on these 2 items.

Discussion

The MCSDS was designed to assess the impact of social desirability on self-report measures.
Despite a sufficient theoretical grounding in developing this instrument, the linear factor analyses with dichotomous data have yielded conflicting results. Several conclusions can be drawn from the present study, in which I used a Rasch measurement model to investigate the psychometric properties of the MCSDS.

First, as seen in Figure 1, the difficulty levels of the 33 items reflect a wide range of respondents' trait levels, indicating that the distribution of the person trait level is well targeted by the difficulty level of the 33 items. The person and item separation indices also indicate adequately dispersed persons and items on the logit scale. The order of item difficulty seems reasonable, although there is no previous research to check for agreement on this matter. Item 31 ("I have never felt that I was punished without cause") was the most difficult to endorse item, followed by Item 2 ("I never hesitate to go out of my way to help someone in trouble"). These hard-to-endorse items were responded to as true by few participants and as false by most of the participants (high severity indicators). On the other hand, the easiest to endorse item was Item 24 ("I would never think of letting someone else be punished for my wrong-doings"), and the next easiest was Item 17 ("I always try to practice what I preach"); that is, these easy-to-endorse items were responded to as true by most of the participants (low severity indicators). These findings on item and person hierarchy can also be useful for developing short forms of the MCSDS. In this study, the most difficult (or easiest) to endorse items provide little information about social desirability for the study sample.

Table 4
Differential Item Functioning for Gender Groups
Columns: Item; Men (Measure, SE); Women (Measure, SE); t; MH.
Note. MH = Mantel-Haenszel significance test estimates. *p < .05. **p < .01.

Figure 3. The Expected and Empirical Item Characteristic Curves of Six MCSDS Items Showing Significant Differential Item Functioning by Gender. Key: X = empirical point; separate lines mark women, men, and the item characteristic curve.

Second, the results from the PCA with standardized residuals seem to support a unidimensional construct of the MCSDS, thus suggesting that the application of the scale as a measure of social desirability is warranted. In addition, item-fit MNSQ statistics also support a unidimensional construct of the MCSDS, with the exception of Item 7 ("I am always careful about my manner of dress"), which exhibited a negative point-measure correlation and statistically significant DIF showing that, for persons with the same trait location on the scale, women scored higher (responding true) than did men on this item. In other words, Item 7 is functioning differentially in favor of women. This is expected because, in general, Korean women are more concerned about appearance than are men during college life, irrespective of women's tendency to respond in a socially desirable way. Moreover, because of cultural differences between Western and non-Western samples, it might be that Item 7 demonstrates inadequate fit. Because little research exists to draw any definitive conclusions, cross-cultural investigations using the Rasch measurement models are needed to understand different displays of item fit across Western and non-Western samples.

Third, in this study, the male respondents had higher mean scores than did the female respondents on the MCSDS. Fraboni and Cooper (1989) also reported similar results. However, there are studies (e.g., Dalton, 1994; Robinette, 1991; Zook & Sipps, 1985) indicating that women score higher than men, as well as studies that reveal no gender differences on the MCSDS (e.g., Andrews & Meyer, 2003; Barger, 2002; Loo & Thorpe, 2000). To examine whether DIF exists on the scale, the item difficulty estimates for the male and female participants were compared.
The results showed that 6 (out of 33) items displayed DIF. These 6 items are likely to have different meanings for men and women, and, therefore, extreme caution is warranted in gender comparisons on MCSDS scores. Additional research, involving content analysis, is needed to examine the role of culture and other factors that may cause differential functioning for some MCSDS items.

Some limitations of this study must also be mentioned. First, the generalizability of the findings is limited to the Korean college student population. Because the MCSDS was developed and validated using data from Western samples, the adequacy of the MCSDS with non-Western samples should be examined before using the MCSDS with such samples. Second, even though proper procedures were used in this study to ensure adequate translation of the original MCSDS, it is possible that cultural differences in the verbal expression of emotional and physical experiences may have affected the validity of MCSDS items; therefore, it should be noted that the Rasch analysis may have affected the validity of the MCSDS items and may lead to inconsistent results for the dimensionality of the MCSDS across Western and non-Western samples. Third, to a certain degree, indications of the lack (or presence) of DIF may also result from low statistical power because of the relatively small sample size used in this study. Future research is needed to cross-validate the findings with larger samples (say, between 500 and 1,000 people).

In summary, this article provides evidence of the MCSDS's unidimensionality. At the same time, however, some items display gender-related DIF. If there is a common cause for such items to function differentially, this may suggest the presence of a secondary dimension in the MCSDS. If, however, the DIF is caused by an unknown reason for each item, this should not be perceived as indicating new dimension(s).
The content examination of the DIF items in this study did not indicate a common cause for their differential behavior, and, therefore, the presence of a second dimension in the MCSDS was ruled out. In conclusion, using the Rasch model to analyze the validity of the content and the generalizability of the MCSDS can be useful to researchers in counseling, education, and related fields.
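For readers who wish to repeat the gender DIF screen on their own calibrations, the t-type ratio described under Generalizability of Construct Validity can be sketched as follows. The sketch assumes that the joint standard error is the square root of the sum of the two squared standard errors; the function names and the item values are hypothetical and are not taken from Table 4.

```python
import math


def dif_t(measure_group_a, se_group_a, measure_group_b, se_group_b):
    """Difference in an item's difficulty between two groups over its joint SE."""
    joint_se = math.sqrt(se_group_a ** 2 + se_group_b ** 2)  # assumed definition
    return (measure_group_a - measure_group_b) / joint_se


def shows_dif(t_value, critical=2.0):
    """Flag the item when the ratio exceeds 2.0 in absolute value."""
    return abs(t_value) > critical


# Hypothetical item: difficulty 0.50 (SE 0.20) for men, -0.10 (SE 0.15) for women.
t_item = dif_t(0.50, 0.20, -0.10, 0.15)
print(round(t_item, 2), shows_dif(t_item))
```

A flagged item is only a candidate for differential functioning; as the Discussion notes, the content of each flagged item still has to be examined before it is read as evidence of a second dimension.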


Validating Measures of Self Control via Rasch Measurement. Jonathan Hasford Department of Marketing, University of Kentucky

Validating Measures of Self Control via Rasch Measurement. Jonathan Hasford Department of Marketing, University of Kentucky Validating Measures of Self Control via Rasch Measurement Jonathan Hasford Department of Marketing, University of Kentucky Kelly D. Bradley Department of Educational Policy Studies & Evaluation, University

More information

Measuring mathematics anxiety: Paper 2 - Constructing and validating the measure. Rob Cavanagh Len Sparrow Curtin University

Measuring mathematics anxiety: Paper 2 - Constructing and validating the measure. Rob Cavanagh Len Sparrow Curtin University Measuring mathematics anxiety: Paper 2 - Constructing and validating the measure Rob Cavanagh Len Sparrow Curtin University R.Cavanagh@curtin.edu.au Abstract The study sought to measure mathematics anxiety

More information

Author s response to reviews

Author s response to reviews Author s response to reviews Title: The validity of a professional competence tool for physiotherapy students in simulationbased clinical education: a Rasch analysis Authors: Belinda Judd (belinda.judd@sydney.edu.au)

More information

Using the Rasch Modeling for psychometrics examination of food security and acculturation surveys

Using the Rasch Modeling for psychometrics examination of food security and acculturation surveys Using the Rasch Modeling for psychometrics examination of food security and acculturation surveys Jill F. Kilanowski, PhD, APRN,CPNP Associate Professor Alpha Zeta & Mu Chi Acknowledgements Dr. Li Lin,

More information

Construct Validity of Mathematics Test Items Using the Rasch Model

Construct Validity of Mathematics Test Items Using the Rasch Model Construct Validity of Mathematics Test Items Using the Rasch Model ALIYU, R.TAIWO Department of Guidance and Counselling (Measurement and Evaluation Units) Faculty of Education, Delta State University,

More information

Evaluating and restructuring a new faculty survey: Measuring perceptions related to research, service, and teaching

Evaluating and restructuring a new faculty survey: Measuring perceptions related to research, service, and teaching Evaluating and restructuring a new faculty survey: Measuring perceptions related to research, service, and teaching Kelly D. Bradley 1, Linda Worley, Jessica D. Cunningham, and Jeffery P. Bieber University

More information

Centre for Education Research and Policy

Centre for Education Research and Policy THE EFFECT OF SAMPLE SIZE ON ITEM PARAMETER ESTIMATION FOR THE PARTIAL CREDIT MODEL ABSTRACT Item Response Theory (IRT) models have been widely used to analyse test data and develop IRT-based tests. An

More information

Construct Invariance of the Survey of Knowledge of Internet Risk and Internet Behavior Knowledge Scale

Construct Invariance of the Survey of Knowledge of Internet Risk and Internet Behavior Knowledge Scale University of Connecticut DigitalCommons@UConn NERA Conference Proceedings 2010 Northeastern Educational Research Association (NERA) Annual Conference Fall 10-20-2010 Construct Invariance of the Survey

More information

Psychometric properties of the PsychoSomatic Problems scale an examination using the Rasch model

Psychometric properties of the PsychoSomatic Problems scale an examination using the Rasch model Psychometric properties of the PsychoSomatic Problems scale an examination using the Rasch model Curt Hagquist Karlstad University, Karlstad, Sweden Address: Karlstad University SE-651 88 Karlstad Sweden

More information

Contents. What is item analysis in general? Psy 427 Cal State Northridge Andrew Ainsworth, PhD

Contents. What is item analysis in general? Psy 427 Cal State Northridge Andrew Ainsworth, PhD Psy 427 Cal State Northridge Andrew Ainsworth, PhD Contents Item Analysis in General Classical Test Theory Item Response Theory Basics Item Response Functions Item Information Functions Invariance IRT

More information

The Functional Outcome Questionnaire- Aphasia (FOQ-A) is a conceptually-driven

The Functional Outcome Questionnaire- Aphasia (FOQ-A) is a conceptually-driven Introduction The Functional Outcome Questionnaire- Aphasia (FOQ-A) is a conceptually-driven outcome measure that was developed to address the growing need for an ecologically valid functional communication

More information

The Impact of Item Sequence Order on Local Item Dependence: An Item Response Theory Perspective

The Impact of Item Sequence Order on Local Item Dependence: An Item Response Theory Perspective Vol. 9, Issue 5, 2016 The Impact of Item Sequence Order on Local Item Dependence: An Item Response Theory Perspective Kenneth D. Royal 1 Survey Practice 10.29115/SP-2016-0027 Sep 01, 2016 Tags: bias, item

More information

RATER EFFECTS AND ALIGNMENT 1. Modeling Rater Effects in a Formative Mathematics Alignment Study

RATER EFFECTS AND ALIGNMENT 1. Modeling Rater Effects in a Formative Mathematics Alignment Study RATER EFFECTS AND ALIGNMENT 1 Modeling Rater Effects in a Formative Mathematics Alignment Study An integrated assessment system considers the alignment of both summative and formative assessments with

More information

A Rasch Analysis of the Statistical Anxiety Rating Scale

A Rasch Analysis of the Statistical Anxiety Rating Scale University of Wyoming From the SelectedWorks of Eric D Teman, J.D., Ph.D. 2013 A Rasch Analysis of the Statistical Anxiety Rating Scale Eric D Teman, Ph.D., University of Wyoming Available at: https://works.bepress.com/ericteman/5/

More information

Item Response Theory. Steven P. Reise University of California, U.S.A. Unidimensional IRT Models for Dichotomous Item Responses

Item Response Theory. Steven P. Reise University of California, U.S.A. Unidimensional IRT Models for Dichotomous Item Responses Item Response Theory Steven P. Reise University of California, U.S.A. Item response theory (IRT), or modern measurement theory, provides alternatives to classical test theory (CTT) methods for the construction,

More information

RUNNING HEAD: EVALUATING SCIENCE STUDENT ASSESSMENT. Evaluating and Restructuring Science Assessments: An Example Measuring Student s

RUNNING HEAD: EVALUATING SCIENCE STUDENT ASSESSMENT. Evaluating and Restructuring Science Assessments: An Example Measuring Student s RUNNING HEAD: EVALUATING SCIENCE STUDENT ASSESSMENT Evaluating and Restructuring Science Assessments: An Example Measuring Student s Conceptual Understanding of Heat Kelly D. Bradley, Jessica D. Cunningham

More information

Validation of an Analytic Rating Scale for Writing: A Rasch Modeling Approach

Validation of an Analytic Rating Scale for Writing: A Rasch Modeling Approach Tabaran Institute of Higher Education ISSN 2251-7324 Iranian Journal of Language Testing Vol. 3, No. 1, March 2013 Received: Feb14, 2013 Accepted: March 7, 2013 Validation of an Analytic Rating Scale for

More information

MEASURING AFFECTIVE RESPONSES TO CONFECTIONARIES USING PAIRED COMPARISONS

MEASURING AFFECTIVE RESPONSES TO CONFECTIONARIES USING PAIRED COMPARISONS MEASURING AFFECTIVE RESPONSES TO CONFECTIONARIES USING PAIRED COMPARISONS Farzilnizam AHMAD a, Raymond HOLT a and Brian HENSON a a Institute Design, Robotic & Optimizations (IDRO), School of Mechanical

More information

Validation of the Behavioral Complexity Scale (BCS) to the Rasch Measurement Model, GAIN Methods Report 1.1

Validation of the Behavioral Complexity Scale (BCS) to the Rasch Measurement Model, GAIN Methods Report 1.1 Page 1 of 36 Validation of the Behavioral Complexity Scale (BCS) to the Rasch Measurement Model, GAIN Methods Report 1.1 Kendon J. Conrad University of Illinois at Chicago Karen M. Conrad University of

More information

Description of components in tailored testing

Description of components in tailored testing Behavior Research Methods & Instrumentation 1977. Vol. 9 (2).153-157 Description of components in tailored testing WAYNE M. PATIENCE University ofmissouri, Columbia, Missouri 65201 The major purpose of

More information

Modeling DIF with the Rasch Model: The Unfortunate Combination of Mean Ability Differences and Guessing

Modeling DIF with the Rasch Model: The Unfortunate Combination of Mean Ability Differences and Guessing James Madison University JMU Scholarly Commons Department of Graduate Psychology - Faculty Scholarship Department of Graduate Psychology 4-2014 Modeling DIF with the Rasch Model: The Unfortunate Combination

More information

Unit 1 Exploring and Understanding Data

Unit 1 Exploring and Understanding Data Unit 1 Exploring and Understanding Data Area Principle Bar Chart Boxplot Conditional Distribution Dotplot Empirical Rule Five Number Summary Frequency Distribution Frequency Polygon Histogram Interquartile

More information

METHODS. Participants

METHODS. Participants INTRODUCTION Stroke is one of the most prevalent, debilitating and costly chronic diseases in the United States (ASA, 2003). A common consequence of stroke is aphasia, a disorder that often results in

More information

Validation of the HIV Scale to the Rasch Measurement Model, GAIN Methods Report 1.1

Validation of the HIV Scale to the Rasch Measurement Model, GAIN Methods Report 1.1 Page 1 of 35 Validation of the HIV Scale to the Rasch Measurement Model, GAIN Methods Report 1.1 Kendon J. Conrad University of Illinois at Chicago Karen M. Conrad University of Illinois at Chicago Michael

More information

Proceedings of the 2011 International Conference on Teaching, Learning and Change (c) International Association for Teaching and Learning (IATEL)

Proceedings of the 2011 International Conference on Teaching, Learning and Change (c) International Association for Teaching and Learning (IATEL) EVALUATION OF MATHEMATICS ACHIEVEMENT TEST: A COMPARISON BETWEEN CLASSICAL TEST THEORY (CTT)AND ITEM RESPONSE THEORY (IRT) Eluwa, O. Idowu 1, Akubuike N. Eluwa 2 and Bekom K. Abang 3 1& 3 Dept of Educational

More information

Running head: PRELIM KSVS SCALES 1

Running head: PRELIM KSVS SCALES 1 Running head: PRELIM KSVS SCALES 1 Psychometric Examination of a Risk Perception Scale for Evaluation Anthony P. Setari*, Kelly D. Bradley*, Marjorie L. Stanek**, & Shannon O. Sampson* *University of Kentucky

More information

Manifestation Of Differences In Item-Level Characteristics In Scale-Level Measurement Invariance Tests Of Multi-Group Confirmatory Factor Analyses

Manifestation Of Differences In Item-Level Characteristics In Scale-Level Measurement Invariance Tests Of Multi-Group Confirmatory Factor Analyses Journal of Modern Applied Statistical Methods Copyright 2005 JMASM, Inc. May, 2005, Vol. 4, No.1, 275-282 1538 9472/05/$95.00 Manifestation Of Differences In Item-Level Characteristics In Scale-Level Measurement

More information

Application of Latent Trait Models to Identifying Substantively Interesting Raters

Application of Latent Trait Models to Identifying Substantively Interesting Raters Application of Latent Trait Models to Identifying Substantively Interesting Raters American Educational Research Association New Orleans, LA Edward W. Wolfe Aaron McVay April 2011 LATENT TRAIT MODELS &

More information

Analysis of the Reliability and Validity of an Edgenuity Algebra I Quiz

Analysis of the Reliability and Validity of an Edgenuity Algebra I Quiz Analysis of the Reliability and Validity of an Edgenuity Algebra I Quiz This study presents the steps Edgenuity uses to evaluate the reliability and validity of its quizzes, topic tests, and cumulative

More information

On indirect measurement of health based on survey data. Responses to health related questions (items) Y 1,..,Y k A unidimensional latent health state

On indirect measurement of health based on survey data. Responses to health related questions (items) Y 1,..,Y k A unidimensional latent health state On indirect measurement of health based on survey data Responses to health related questions (items) Y 1,..,Y k A unidimensional latent health state A scaling model: P(Y 1,..,Y k ;α, ) α = item difficulties

More information

The following is an example from the CCRSA:

The following is an example from the CCRSA: Communication skills and the confidence to utilize those skills substantially impact the quality of life of individuals with aphasia, who are prone to isolation and exclusion given their difficulty with

More information

CHAPTER 7 RESEARCH DESIGN AND METHODOLOGY. This chapter addresses the research design and describes the research methodology

CHAPTER 7 RESEARCH DESIGN AND METHODOLOGY. This chapter addresses the research design and describes the research methodology CHAPTER 7 RESEARCH DESIGN AND METHODOLOGY 7.1 Introduction This chapter addresses the research design and describes the research methodology employed in this study. The sample and sampling procedure is

More information

The Psychometric Development Process of Recovery Measures and Markers: Classical Test Theory and Item Response Theory

The Psychometric Development Process of Recovery Measures and Markers: Classical Test Theory and Item Response Theory The Psychometric Development Process of Recovery Measures and Markers: Classical Test Theory and Item Response Theory Kate DeRoche, M.A. Mental Health Center of Denver Antonio Olmos, Ph.D. Mental Health

More information

Students' perceived understanding and competency in probability concepts in an e- learning environment: An Australian experience

Students' perceived understanding and competency in probability concepts in an e- learning environment: An Australian experience University of Wollongong Research Online Faculty of Engineering and Information Sciences - Papers: Part A Faculty of Engineering and Information Sciences 2016 Students' perceived understanding and competency

More information

GENERALIZABILITY AND RELIABILITY: APPROACHES FOR THROUGH-COURSE ASSESSMENTS

GENERALIZABILITY AND RELIABILITY: APPROACHES FOR THROUGH-COURSE ASSESSMENTS GENERALIZABILITY AND RELIABILITY: APPROACHES FOR THROUGH-COURSE ASSESSMENTS Michael J. Kolen The University of Iowa March 2011 Commissioned by the Center for K 12 Assessment & Performance Management at

More information

11/18/2013. Correlational Research. Correlational Designs. Why Use a Correlational Design? CORRELATIONAL RESEARCH STUDIES

11/18/2013. Correlational Research. Correlational Designs. Why Use a Correlational Design? CORRELATIONAL RESEARCH STUDIES Correlational Research Correlational Designs Correlational research is used to describe the relationship between two or more naturally occurring variables. Is age related to political conservativism? Are

More information

Investigating the Invariance of Person Parameter Estimates Based on Classical Test and Item Response Theories

Investigating the Invariance of Person Parameter Estimates Based on Classical Test and Item Response Theories Kamla-Raj 010 Int J Edu Sci, (): 107-113 (010) Investigating the Invariance of Person Parameter Estimates Based on Classical Test and Item Response Theories O.O. Adedoyin Department of Educational Foundations,

More information

INTRODUCTION TO ITEM RESPONSE THEORY APPLIED TO FOOD SECURITY MEASUREMENT. Basic Concepts, Parameters and Statistics

INTRODUCTION TO ITEM RESPONSE THEORY APPLIED TO FOOD SECURITY MEASUREMENT. Basic Concepts, Parameters and Statistics INTRODUCTION TO ITEM RESPONSE THEORY APPLIED TO FOOD SECURITY MEASUREMENT Basic Concepts, Parameters and Statistics The designations employed and the presentation of material in this information product

More information

Assessing Measurement Invariance in the Attitude to Marriage Scale across East Asian Societies. Xiaowen Zhu. Xi an Jiaotong University.

Assessing Measurement Invariance in the Attitude to Marriage Scale across East Asian Societies. Xiaowen Zhu. Xi an Jiaotong University. Running head: ASSESS MEASUREMENT INVARIANCE Assessing Measurement Invariance in the Attitude to Marriage Scale across East Asian Societies Xiaowen Zhu Xi an Jiaotong University Yanjie Bian Xi an Jiaotong

More information

The Effect of Guessing on Assessing Dimensionality in Multiple-Choice Tests: A Monte Carlo Study with Application. Chien-Chi Yeh

The Effect of Guessing on Assessing Dimensionality in Multiple-Choice Tests: A Monte Carlo Study with Application. Chien-Chi Yeh The Effect of Guessing on Assessing Dimensionality in Multiple-Choice Tests: A Monte Carlo Study with Application by Chien-Chi Yeh B.S., Chung Yuan Christian University, 1988 M.Ed., National Tainan Teachers

More information

Presented By: Yip, C.K., OT, PhD. School of Medical and Health Sciences, Tung Wah College

Presented By: Yip, C.K., OT, PhD. School of Medical and Health Sciences, Tung Wah College Presented By: Yip, C.K., OT, PhD. School of Medical and Health Sciences, Tung Wah College Background of problem in assessment for elderly Key feature of CCAS Structural Framework of CCAS Methodology Result

More information

The outcome of cataract surgery measured with the Catquest-9SF

The outcome of cataract surgery measured with the Catquest-9SF The outcome of cataract surgery measured with the Catquest-9SF Mats Lundstrom, 1 Anders Behndig, 2 Maria Kugelberg, 3 Per Montan, 3 Ulf Stenevi 4 and Konrad Pesudovs 5 1 EyeNet Sweden, Blekinge Hospital,

More information

A Comparison of Pseudo-Bayesian and Joint Maximum Likelihood Procedures for Estimating Item Parameters in the Three-Parameter IRT Model

A Comparison of Pseudo-Bayesian and Joint Maximum Likelihood Procedures for Estimating Item Parameters in the Three-Parameter IRT Model A Comparison of Pseudo-Bayesian and Joint Maximum Likelihood Procedures for Estimating Item Parameters in the Three-Parameter IRT Model Gary Skaggs Fairfax County, Virginia Public Schools José Stevenson

More information

A Comparison of Several Goodness-of-Fit Statistics

A Comparison of Several Goodness-of-Fit Statistics A Comparison of Several Goodness-of-Fit Statistics Robert L. McKinley The University of Toledo Craig N. Mills Educational Testing Service A study was conducted to evaluate four goodnessof-fit procedures

More information

11/24/2017. Do not imply a cause-and-effect relationship

11/24/2017. Do not imply a cause-and-effect relationship Correlational research is used to describe the relationship between two or more naturally occurring variables. Is age related to political conservativism? Are highly extraverted people less afraid of rejection

More information

5/6/2008. Psy 427 Cal State Northridge Andrew Ainsworth PhD

5/6/2008. Psy 427 Cal State Northridge Andrew Ainsworth PhD Psy 427 Cal State Northridge Andrew Ainsworth PhD Some Definitions Personality the relatively stable and distinctive patterns of behavior that characterize an individual and his or her reactions to the

More information

By Hui Bian Office for Faculty Excellence

By Hui Bian Office for Faculty Excellence By Hui Bian Office for Faculty Excellence 1 Email: bianh@ecu.edu Phone: 328-5428 Location: 1001 Joyner Library, room 1006 Office hours: 8:00am-5:00pm, Monday-Friday 2 Educational tests and regular surveys

More information

Development, Standardization and Application of

Development, Standardization and Application of American Journal of Educational Research, 2018, Vol. 6, No. 3, 238-257 Available online at http://pubs.sciepub.com/education/6/3/11 Science and Education Publishing DOI:10.12691/education-6-3-11 Development,

More information

RASCH ANALYSIS OF SOME MMPI-2 SCALES IN A SAMPLE OF UNIVERSITY FRESHMEN

RASCH ANALYSIS OF SOME MMPI-2 SCALES IN A SAMPLE OF UNIVERSITY FRESHMEN International Journal of Arts & Sciences, CD-ROM. ISSN: 1944-6934 :: 08(03):107 150 (2015) RASCH ANALYSIS OF SOME MMPI-2 SCALES IN A SAMPLE OF UNIVERSITY FRESHMEN Enrico Gori University of Udine, Italy

More information

Empowered by Psychometrics The Fundamentals of Psychometrics. Jim Wollack University of Wisconsin Madison

Empowered by Psychometrics The Fundamentals of Psychometrics. Jim Wollack University of Wisconsin Madison Empowered by Psychometrics The Fundamentals of Psychometrics Jim Wollack University of Wisconsin Madison Psycho-what? Psychometrics is the field of study concerned with the measurement of mental and psychological

More information

The Rasch Measurement Model in Rheumatology: What Is It and Why Use It? When Should It Be Applied, and What Should One Look for in a Rasch Paper?

The Rasch Measurement Model in Rheumatology: What Is It and Why Use It? When Should It Be Applied, and What Should One Look for in a Rasch Paper? Arthritis & Rheumatism (Arthritis Care & Research) Vol. 57, No. 8, December 15, 2007, pp 1358 1362 DOI 10.1002/art.23108 2007, American College of Rheumatology SPECIAL ARTICLE The Rasch Measurement Model

More information

Evaluation of the Short-Form Health Survey (SF-36) Using the Rasch Model

Evaluation of the Short-Form Health Survey (SF-36) Using the Rasch Model American Journal of Public Health Research, 2015, Vol. 3, No. 4, 136-147 Available online at http://pubs.sciepub.com/ajphr/3/4/3 Science and Education Publishing DOI:10.12691/ajphr-3-4-3 Evaluation of

More information

REPORT. Technical Report: Item Characteristics. Jessica Masters

REPORT. Technical Report: Item Characteristics. Jessica Masters August 2010 REPORT Diagnostic Geometry Assessment Project Technical Report: Item Characteristics Jessica Masters Technology and Assessment Study Collaborative Lynch School of Education Boston College Chestnut

More information

INVESTIGATING FIT WITH THE RASCH MODEL. Benjamin Wright and Ronald Mead (1979?) Most disturbances in the measurement process can be considered a form

INVESTIGATING FIT WITH THE RASCH MODEL. Benjamin Wright and Ronald Mead (1979?) Most disturbances in the measurement process can be considered a form INVESTIGATING FIT WITH THE RASCH MODEL Benjamin Wright and Ronald Mead (1979?) Most disturbances in the measurement process can be considered a form of multidimensionality. The settings in which measurement

More information

Conceptualising computerized adaptive testing for measurement of latent variables associated with physical objects

Conceptualising computerized adaptive testing for measurement of latent variables associated with physical objects Journal of Physics: Conference Series OPEN ACCESS Conceptualising computerized adaptive testing for measurement of latent variables associated with physical objects Recent citations - Adaptive Measurement

More information

Comparing DIF methods for data with dual dependency

Comparing DIF methods for data with dual dependency DOI 10.1186/s40536-016-0033-3 METHODOLOGY Open Access Comparing DIF methods for data with dual dependency Ying Jin 1* and Minsoo Kang 2 *Correspondence: ying.jin@mtsu.edu 1 Department of Psychology, Middle

More information

Measuring the External Factors Related to Young Alumni Giving to Higher Education. J. Travis McDearmon, University of Kentucky

Measuring the External Factors Related to Young Alumni Giving to Higher Education. J. Travis McDearmon, University of Kentucky Measuring the External Factors Related to Young Alumni Giving to Higher Education Kathryn Shirley Akers 1, University of Kentucky J. Travis McDearmon, University of Kentucky 1 1 Please use Kathryn Akers

More information

Techniques for Explaining Item Response Theory to Stakeholder

Techniques for Explaining Item Response Theory to Stakeholder Techniques for Explaining Item Response Theory to Stakeholder Kate DeRoche Antonio Olmos C.J. Mckinney Mental Health Center of Denver Presented on March 23, 2007 at the Eastern Evaluation Research Society

More information

Adjusting for mode of administration effect in surveys using mailed questionnaire and telephone interview data

Adjusting for mode of administration effect in surveys using mailed questionnaire and telephone interview data Adjusting for mode of administration effect in surveys using mailed questionnaire and telephone interview data Karl Bang Christensen National Institute of Occupational Health, Denmark Helene Feveille National

More information

Information Structure for Geometric Analogies: A Test Theory Approach

Information Structure for Geometric Analogies: A Test Theory Approach Information Structure for Geometric Analogies: A Test Theory Approach Susan E. Whitely and Lisa M. Schneider University of Kansas Although geometric analogies are popular items for measuring intelligence,

More information

BACKGROUND CHARACTERISTICS OF EXAMINEES SHOWING UNUSUAL TEST BEHAVIOR ON THE GRADUATE RECORD EXAMINATIONS

BACKGROUND CHARACTERISTICS OF EXAMINEES SHOWING UNUSUAL TEST BEHAVIOR ON THE GRADUATE RECORD EXAMINATIONS ---5 BACKGROUND CHARACTERISTICS OF EXAMINEES SHOWING UNUSUAL TEST BEHAVIOR ON THE GRADUATE RECORD EXAMINATIONS Philip K. Oltman GRE Board Professional Report GREB No. 82-8P ETS Research Report 85-39 December

More information

Differential Item Functioning

Differential Item Functioning Differential Item Functioning Lecture #11 ICPSR Item Response Theory Workshop Lecture #11: 1of 62 Lecture Overview Detection of Differential Item Functioning (DIF) Distinguish Bias from DIF Test vs. Item

More information

Comprehensive Statistical Analysis of a Mathematics Placement Test

Comprehensive Statistical Analysis of a Mathematics Placement Test Comprehensive Statistical Analysis of a Mathematics Placement Test Robert J. Hall Department of Educational Psychology Texas A&M University, USA (bobhall@tamu.edu) Eunju Jung Department of Educational

More information

Using Analytical and Psychometric Tools in Medium- and High-Stakes Environments

Greg Pope, Analytics and Psychometrics Manager. 2008 Users Conference, San Antonio. Introduction and purpose of this session

Selecting a data collection design for linking in educational measurement: Taking differential motivation into account

Abstract: In educational measurement, multiple test forms are often constructed to

Psychological Experience of Attitudinal Ambivalence as a Function of Manipulated Source of Conflict and Individual Difference in Self-Construal

Seoul Journal of Business, Volume 11, Number 1 (June 2005).

Hanne Søberg Finbråten 1,2*, Bodil Wilde-Larsson 2,3, Gun Nordström 3, Kjell Sverre Pettersen 4, Anne Trollvik 3 and Øystein Guttersrud 5

Finbråten et al., BMC Health Services Research (2018) 18:506, https://doi.org/10.1186/s12913-018-3275-7. Research article, open access. Establishing the HLS-Q12 short version of the European Health Literacy

Latent Trait Standardization of the Benzodiazepine Dependence Self-Report Questionnaire using the Rasch Scaling Model

Chapter 7. C.C. Kan 1, A.H.G.S. van der Ven 2, M.H.M. Breteler 3 and F.G. Zitman 1

Assessing the Validity and Reliability of Dichotomous Test Results Using Item Response Theory on a Group of First Year Engineering Students

Dublin Institute of Technology, ARROW@DIT. Conference papers, School of Civil and Structural Engineering, 2015-07-13.

References. Embretson, S. E. & Reise, S. P. (2000). Item response theory for psychologists. Mahwah,

The Western Aphasia Battery (WAB) (Kertesz, 1982) is used to classify aphasia by classical type, measure overall severity, and measure change over time. Despite its near-ubiquitousness, it has significant

On the purpose of testing:

Why Evaluation & Assessment is Important: feedback to students; feedback to teachers; information to parents; information for selection and certification; information for accountability; incentives to increase

Examining the efficacy of the Theory of Planned Behavior (TPB) to understand pre-service teachers' intention to use technology

Timothy Teo & Chwee Beng Lee, Nanyang Technology University, Singapore. This

Key words: Educational measurement, Professional competence, Clinical competence, Physical therapy (specialty)

Dalton et al: Assessment of professional competence of students. The Assessment of Physiotherapy Practice (APP) is a valid measure of professional competence of physiotherapy students: a cross-sectional

André Cyr and Alexander Davies

Item Response Theory and Latent variable modeling for surveys with complex sampling design: The case of the National Longitudinal Survey of Children and Youth in Canada. Background

alternate-form reliability: The degree to which two or more versions of the same test correlate with one another. In clinical studies in which a given function is going to be tested more than once over

Chapter 9. Youth Counseling Impact Scale (YCIS)

Background. Purpose: The Youth Counseling Impact Scale (YCIS) is a measure of perceived effectiveness of a specific counseling session. In general, measures

IMPACT ON PARTICIPATION AND AUTONOMY QUESTIONNAIRE: INTERNAL SCALE VALIDITY OF THE SWEDISH VERSION FOR USE IN PEOPLE WITH SPINAL CORD INJURY

J Rehabil Med 2007; 39: 156-162. Original report. Maria Larsson

Self-Oriented and Socially Prescribed Perfectionism in the Eating Disorder Inventory Perfectionism Subscale

Simon B. Sherry, 1 Paul L. Hewitt, 1 * Avi Besser, 2 Brandy J. McGee, 1 and Gordon L. Flett 3

Validation the Measures of Self-Directed Learning: Evidence from Confirmatory Factor Analysis and Multidimensional Item Response Analysis

Doi:10.5901/mjss.2015.v6n4p579. Abstract. Chaiwichit Chianchana, Faculty

ITEM RESPONSE THEORY ANALYSIS OF THE TOP LEADERSHIP DIRECTION SCALE

California State University, San Bernardino, CSUSB ScholarWorks. Electronic Theses, Projects, and Dissertations, Office of Graduate Studies, 6-2016.

Glossary of Terms. Clinical Cut-off Score: A test score that is used to classify test-takers who are likely to possess the attribute being measured to a clinically significant degree

Paul Irwing, Manchester Business School

Factor analysis has been the prime statistical technique for the development of structural theories in social science, such as the hierarchical factor model of human

A Comparison of Logistic Regression and Mantel-Haenszel Procedures for Detecting Differential Item Functioning

H. Jane Rogers, Teachers College, Columbia University; Hariharan Swaminathan, University of

Technical Specifications

In order to provide summary information across a set of exercises, all tests must employ some form of scoring models. The most familiar of these scoring models is the one typically

Personality measures under focus: The NEO-PI-R and the MBTI

Published 2009. Journal title: Griffith University Undergraduate Psychology Journal. Downloaded from http://hdl.handle.net/10072/340329. Link to published version: http://pandora.nla.gov.au/tep/145784

Nested Factor Analytic Model Comparison as a Means to Detect Aberrant Response Patterns

Running head: NESTED FACTOR ANALYTIC MODEL COMPARISON. John M. Clark III, Pearson. Author Note: John M. Clark III,

Measuring Academic Misconduct: Evaluating the Construct Validity of the Exams and Assignments Scale

American Journal of Applied Psychology 2015; 4(3-1): 58-64. Published online June 29, 2015 (http://www.sciencepublishinggroup.com/j/ajap). doi: 10.11648/j.ajap.s.2015040301.20. ISSN: 2328-5664 (Print); ISSN:

Improving Measurement of Ambiguity Tolerance (AT) Among Teacher Candidates

Kent Rittschof, Department of Curriculum, Foundations, & Reading. What is Ambiguity Tolerance (AT) and why should it be measured?

Performance of Ability Estimation Methods for Writing Assessments under Conditions of Multidimensionality

Jason L. Meyers, Ahmet Turhan, Steven J. Fitzpatrick, Pearson. Paper presented at the annual meeting

Instrument equivalence across ethnic groups

Antonio Olmos (MHCD), Susan R. Hutchinson (UNC). Overview: Instrument Equivalence, Measurement Invariance, Invariance in Reliability Scores, Factorial Invariance, Item

Item Response Theory: Methods for the Analysis of Discrete Survey Response Data

ICPSR Summer Workshop at the University of Michigan, June 29, 2015 to July 3, 2015. Presented by: Dr. Jonathan Templin, Department

The validity of polytomous items in the Rasch model: The role of statistical evidence of the threshold order

Psychological Test and Assessment Modeling, Volume 57, 2015 (3), 377-395. Thomas Salzberger 1

Mapping the Continuum of Alcohol Problems in College Students: A Rasch Model Analysis

Psychology of Addictive Behaviors, 2004, Vol. 18, No. 4, 322-333. Copyright 2004 by the Educational Publishing Foundation. 0893-164X/04/$12.00. DOI: 10.1037/0893-164X.18.4.322

VARIABLES AND MEASUREMENT

ARTHUR PSYC 204 (EXPERIMENTAL PSYCHOLOGY) 16A LECTURE NOTES [01/29/16]. Topic #3. VARIABLES: Some definitions of variables include the following: 1.

The Use of Rasch Wright Map in Assessing Conceptual Understanding of Electricity

Pertanika J. Soc. Sci. & Hum. 25 (S): 81-88 (2017). Social Sciences & Humanities. Journal homepage: http://www.pertanika.upm.edu.my/

Estimating Individual Rater Reliabilities

John E. Overall and Kevin N. Magee, University of Texas Medical School. Rating scales have no inherent reliability that is independent of the observers who use them.

Basic concepts and principles of classical test theory

Jan-Eric Gustafsson. What is measurement? Assignment of numbers to aspects of individuals according to some rule. The aspect which is measured must

Developing the First Validity of Shared Medical Decision-Making Questionnaire in Taiwan

Global Journal of Medical Research: K Interdisciplinary, Volume 14, Issue 2, Version 1.0, Year 2014. Type: Double Blind Peer Reviewed International Research Journal. Publisher: Global Journals Inc. (USA). Online

Development of the Mental, Emotional, and Bodily Toughness Inventory in Collegiate Athletes and Nonathletes

Journal of Athletic Training 2008;43(2):125-132. © by the National Athletic Trainers' Association, Inc. www.nata.org/jat. Original research.