MEASURING SUBJECTIVE HEALTH AMONG ADOLESCENTS IN SWEDEN A Rasch-analysis of the HBSC Instrument


CURT HAGQUIST and DAVID ANDRICH

(Accepted 27 June 2003)

Social Indicators Research 68. Kluwer Academic Publishers. Printed in the Netherlands.

ABSTRACT. The cross-national WHO-study Health Behaviour in School-Aged Children (HBSC) is a comprehensive adolescent survey ongoing in Europe based on a public health perspective. The present study, examining the HBSC-instrument on subjective health, uses the unidimensional Rasch model. Items are analysed with respect to their operating characteristics across the whole range of the subjective health scale and the empirical operation of the response categories, which are intended to be ordered for all items. The study is based on cross-sectional data collected in Sweden during the 1980s and 1990s among students in years five, seven and nine. The analyses reveal that the symptom checklist in the HBSC-instrument does not work consistently with the Rasch model when all eight items are analysed simultaneously. In particular, the response categories do not work as intended. Hence, the original set of eight items should not be used to construct a latent measure of subjective health. In order to bring the instrument to meet the requirements of the Rasch model, three items were removed. The reduced set of five items did work consistently with the model with respect to the response categories, and did show relative invariance across the latent trait. Since a few of the remaining items showed lack of invariance across genders and grades, that problem should be solved if the reduced item set is to be used for post-hoc analyses. Furthermore, the analysis of the reduced set of items suggests that both somatic and psychological complaints might be considered as parts of one higher order dimension of subjective health. In order to improve the questionnaire, further attention should be paid to the response format of the items.

INTRODUCTION

The cross-national WHO-study Health Behaviour in School-Aged Children (HBSC) is a comprehensive adolescent survey ongoing in Europe based on a public health perspective (HBSC, 1998; Currie et al., 2000). Since the start of the study in 1985/86, five waves of data collection have been carried out, the last one in the school year of 2001/2002. In 1997/98, 25 countries including Canada and USA

participated in the study (Currie et al., 2000). In addition to health-related behaviours, the reports on the study have paid great attention to measures of wellbeing and subjective health. Not least in Sweden, which is characterised by a lack of longitudinal data on children's and adolescents' mental health, the study has been widely quoted and referenced.

The measures of subjective health included in the HBSC-survey comprise different items reflecting self-reported health complaints. Some of the psychometric properties of these measures have been analysed and reported. Relying on qualitative and quantitative analyses based on Norwegian adolescent data, Haugland and Wold (2001) report good face validity as well as adequate test-retest reliability. Focusing on the dimensionality of the health complaints, Haugland et al. (2001) identified two qualitatively different dimensions using confirmatory factor analysis on adolescent data from Finland, Norway, Poland and Scotland. One factor, labelled somatic, included headache, abdominal pain, backache and dizziness. A second factor, labelled psychological, included feeling low, irritable, nervous and sleeping difficulties. Although a two-factor solution is consistent with some analyses carried out on similar checklists (Hurrelmann et al., 1988; Hopland et al., 1993), there are also analyses favouring one-factor solutions (Attanasio et al., 1984; Wisniewski et al., 1988).

In analysing the dimensionality of the HBSC-checklist for subjective health complaints, this study takes a different approach from the earlier studies. Instead of using factor analysis, this study makes use of the unidimensional Rasch model (Rasch, 1980), implying that the primary focus is on the operating characteristics of the items intended to measure a single construct or trait of subjective health.
Such items, like many in the field, have Likert-style response formats with ordered categories intended to indicate increasing levels of intensity or frequency of health problems. Critical to any interpretation, therefore, is that these categories do operate empirically as intended. The need for categories intended to be ordered to operate as required empirically, and therefore to be a property of the data, was recognised by Fisher (1958). Concern with this kind of operation of the categories of items is not new (Dubois and Burns, 1975; Duncan and Stenbeck, 1987), but such operation has not been examined systematically and routinely in studies where these items are used. The Rasch model, which is sensitive to the operational ordering of categories, provides an opportunity for studying the relative operation of response categories routinely (Andrich, 1979, 1982, 1996).

The purpose of the article is to examine to what extent the HBSC-checklist on subjective health complaints works within the framework of unidimensional measurement, with a view to its mixture of items indicating psychological as well as somatic complaints.

METHODS

Material

The study is based on Swedish data collected in 1985/86, 1989/90, 1993/94 and 1997/98 among students in years five, seven and nine in the compulsory school system. These data collections are parts of the cross-national HBSC-study. A two-stage sampling procedure was used. At the first stage, within each grade a random sample of schools was drawn from a sample frame comprising all governmental and non-governmental schools in Sweden. At the second stage, within each of the sampled schools, one school class was drawn in which all students comprised the targeted population (Danielson and Marklund, 2000). The number of participating students was 2933 (1985/86), 3553 (1989/90), 3584 (1993/94) and 3802 (1997/98). Across years of investigations the attrition rates varied between 10 and 15 percent (Danielson and Marklund, 2000). Attrition due to non-participation from schools is not included in those rates. For the purpose of the current analyses only students who completed all eight items on subjective health complaints were included, implying the following number of participants at each year of investigation: 2899 (1985/86), 3504 (1989/90), 3492 (1993/94) and 3734 (1997/98).

Instrument

In the HBSC-survey subjective health was measured using a symptom checklist based on different health complaints. The measures

used were not the same across all years of investigations, which is shown in Table I. In 1985/86, 1993/94 and 1997/98 the measures were almost the same, including eight items with five identical response categories. In contrast, in 1989/90 only six of the eight complaints used in the other three years of investigations were included. Moreover, in contrast to all other years of investigations, in 1989/90 only four response categories, with different wording compared to the other years, were used.

TABLE I
Item characteristics at different years of investigations

1985/86
  Initial question: How often do you have the following complaints?
  Items: Headache; Stomach-ache; Backache; Feel low; Being irritable or in a bad temper; Feel nervous; Difficulty getting to sleep; Feel dizzy
  Response categories: About every day; More than once a week; About once a week; About once a month; Seldom or never

1989/90
  Initial question: In the last 6 months, how often have you had the following complaints?
  Items: Headache; Stomach-ache; Have been irritable or in a bad temper; Felt nervous; Have had difficulty getting to sleep; Felt dizzy
  Response categories: Often; Sometimes; Seldom; Never

1993/94 & 1997/98
  Initial question: In the last 6 months, how often have you had the following complaints?
  Items: Headache; Stomach-ache; Backache; Felt low; Have been irritable or in a bad temper; Felt nervous; Have had difficulty getting to sleep; Felt dizzy
  Response categories: About every day; More than once a week; About once a week; About once a month; Seldom or never

The Model of Analysis

Different sets of items intended to measure subjective health complaints were examined by means of Rasch-analyses (Rasch, 1980).

The Rasch model offers opportunities to examine latent trait constructs, e.g., composite measures of subjective health, in a rigorous way (Hagquist, 2001). The Rasch model is a member of the group of models based on latent trait analysis, implying a focus on the operating characteristics of the items across the whole range of the latent trait and the properties of those characteristics. However, the case for the model is that it is constructed to reflect the property of invariance (the comparison between two persons should be independent of which items of a class of items are used, and vice versa), and if data accord with the model with respect to tests of relevant hypotheses for misfit to the model, then they reflect invariance with respect to that hypothesis. Thus the model is constructed independently of data, and as a result, it turns upside down the traditional way of viewing the data-model relationship (Rasch, 1980; Andrich, 2001). Traditionally, fit between the model and the data is enhanced by finding a different model; in the case of applying the Rasch model, the emphasis is on understanding the data-model misfit by examining relevant aspects of the data collection. The Rasch model provides evidence as to where to study potential problems with the data (Andrich, 2001).

There are Rasch-models for dichotomous data (Andrich, 1988a) as well as for ordered data with polytomous response categories (Andrich, 1978, 1979). The general form is given by

$$\Pr\{X_{ni} = x\} = \frac{1}{\gamma_{ni}} \exp\bigl(\kappa_{xi} + x(\beta_n - \delta_i)\bigr) \qquad (1)$$

in which $\kappa_{xi} = -\tau_{1i} - \tau_{2i} - \dots - \tau_{xi}$, and where $\tau_{xi}$, $x = 1, 2, \dots, m_i$, are thresholds which partition the latent continuum of item $i$ into $m_i + 1$ ordered categories, and $X_{ni} = x \in \{0, 1, 2, \dots, m_i\}$ is a random integer variable that scores the successive ordered categories. In the case of a dichotomous item, the thresholds are absorbed as the single location parameter $\delta_i$ and $X_{ni} = x \in \{0, 1\}$. $\beta_n$ is the location parameter of person $n$.
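As a minimal sketch of equation (1), the function below computes the category probabilities for one person and one item. It assumes the usual conventions that $\kappa_{0i} = 0$ and that $\gamma_{ni}$ is the sum of the unnormalised terms over all categories; the function name and the parameter values are hypothetical and are not estimates from the study.

```python
import math


def category_probabilities(beta, delta, taus):
    """Return [Pr{X = 0}, ..., Pr{X = m}] for one person-item pair under equation (1)."""
    # Assumed convention: kappa_0 = 0 and kappa_x = -(tau_1 + ... + tau_x)
    kappas = [0.0]
    for tau in taus:
        kappas.append(kappas[-1] - tau)
    # Unnormalised terms exp(kappa_x + x * (beta - delta)) for x = 0..m
    weights = [math.exp(kappa + x * (beta - delta)) for x, kappa in enumerate(kappas)]
    gamma = sum(weights)  # normalising factor gamma_ni (assumed to be this sum)
    return [w / gamma for w in weights]


# Hypothetical five-category item (four thresholds), not values from the paper
probs = category_probabilities(beta=0.5, delta=0.0, taus=[-1.5, -0.5, 0.5, 1.5])
print([round(p, 3) for p in probs])
```

With a single threshold set to zero, the same function reduces to the dichotomous case in which only $\delta_i$ locates the item.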
The model of equation (1) has sufficient statistics enabling independent estimation of item and person parameters (Rasch, 1961, 1980). Operationally, the item parameters are estimated by conditioning on the sum of the raw scores for each person across items. The person parameters are estimated given the estimates of the item parameters, and these effectively involve a log-odds transformation of the raw scores of persons.

The requirement of invariance, reflected in the Rasch-model, requires that the items work invariantly across possible classifications of individuals. In the dichotomous case this implies that all items have the same discrimination, regardless of the persons' locations on the latent health scale (Andrich, 1988a). In the case of ordered polytomous response categories that requirement applies to the latent thresholds of the items (Andrich, 1978, 1979). The thresholds are the points on the latent scale where the conditional probability of scoring in one of two adjacent categories is equal. Thus from equation (1), it is evident that

$$\Pr\{X_{ni} = x \mid x - 1 \text{ or } x\} = \frac{\Pr\{X_{ni} = x\}}{\Pr\{X_{ni} = x - 1\} + \Pr\{X_{ni} = x\}} = \frac{\exp\{\beta_n - \delta_i - \tau_{xi}\}}{1 + \exp\{\beta_n - \delta_i - \tau_{xi}\}}. \qquad (2)$$

This equation indicates the relative success of the higher score $x$, given that the response is in one of the adjacent categories $x - 1$ or $x$, and takes the form of the dichotomous Rasch model $\Pr\{X_{ni} = 1\} = \exp\{\beta_n - \delta_i\} / (1 + \exp\{\beta_n - \delta_i\})$, except that the location of the item $\delta_i$ is qualified by the threshold $\tau_{xi}$. It is stressed, however, that there is no single response at a threshold: there is only one response across all categories, and the responses in adjacent categories of a threshold are latent and only implied.

The criterion of invariance is reflected in fit to the Rasch model. However, data can fit the model even though another important criterion, that of the intended operation of the categories, is not met. This criterion is that the intended increasing levels of severity across the response categories for each item are reflected in the data. This implies that the threshold estimates on the latent trait must appear in the same order as the manifest categories (Andrich, 1988b, 1996; Andrich et al., 1997).
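The step from equation (1) to equation (2) can be checked numerically. The sketch below, under the same assumed conventions ($\kappa_{0i} = 0$, $\gamma_{ni}$ as the normalising sum) and with hypothetical parameter values, compares the conditional probability computed from the category probabilities with the logistic form on the right-hand side of equation (2).

```python
import math


def category_probabilities(beta, delta, taus):
    """Category probabilities under equation (1), with kappa_0 = 0 assumed."""
    kappas = [0.0]
    for tau in taus:
        kappas.append(kappas[-1] - tau)
    weights = [math.exp(kappa + x * (beta - delta)) for x, kappa in enumerate(kappas)]
    gamma = sum(weights)
    return [w / gamma for w in weights]


def adjacent_conditional(beta, delta, taus, x):
    """Left-hand side of equation (2): Pr{X = x | X is x - 1 or x}."""
    p = category_probabilities(beta, delta, taus)
    return p[x] / (p[x - 1] + p[x])


def threshold_logistic(beta, delta, tau):
    """Right-hand side of equation (2): dichotomous form located at delta + tau."""
    z = beta - delta - tau
    return math.exp(z) / (1.0 + math.exp(z))


# Hypothetical values for illustration only
beta, delta, taus = 0.3, -0.2, [-1.0, -0.2, 0.4, 1.1]
for x in range(1, len(taus) + 1):
    lhs = adjacent_conditional(beta, delta, taus, x)
    rhs = threshold_logistic(beta, delta, taus[x - 1])
    print(x, round(lhs, 6), round(rhs, 6))
```

A person located exactly at $\delta_i + \tau_{xi}$ has a conditional probability of one half, which is the sense in which a threshold is the point where the two adjacent categories are equally likely.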
This means that the threshold ordering is $\tau_{1i} < \tau_{2i} < \dots < \tau_{xi} < \dots < \tau_{mi}$ as in Figure 1, which shows the response probabilities in the ordered categories and the latent dichotomous responses at each threshold. As will be seen in the example, it is possible for this ordering to be violated in the empirical data, indicating a problem with the empirical operation of the categories.
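Checking the ordering requirement on a set of threshold estimates is a simple comparison of successive values. The helper below is an illustrative sketch; its name and the example thresholds are hypothetical and do not reproduce the estimates reported in the tables.

```python
def disordered_thresholds(taus):
    """Return the 1-based numbers x of thresholds with tau_x <= tau_(x-1)."""
    # taus[0] holds tau_1, so list index i corresponds to threshold number i + 1
    return [i + 1 for i in range(1, len(taus)) if taus[i] <= taus[i - 1]]


ordered = [-1.2, -0.3, 0.4, 1.3]   # hypothetical, ordered as required
reversed_ = [-1.2, 0.4, 0.1, 1.3]  # hypothetical, tau_3 falls below tau_2

print(disordered_thresholds(ordered))    # []
print(disordered_thresholds(reversed_))  # [3]
```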

Figure 1. Response probabilities in the ordered categories and the latent dichotomous responses at each threshold.

Analyses

The software used for the Rasch-analysis was RUMM2010, which uses a pairwise conditional method to estimate the item parameters while eliminating the person parameters, and then estimates the person parameters given the item parameters (Andrich et al., 2000). The analyses were carried out by means of the model of equation (1), which allows the thresholds to vary across items. The item sets were analysed with respect to the reliability measured by Cronbach's alpha; the ordering of the thresholds across the latent trait; and potential differential item functioning (DIF), which reflects lack of invariance in the working of the item with respect to the latent trait, as well as across genders, grades and years of investigations.

The analyses were based on two separate sets of data: (a) one main set comprising data from 1985/86, 1993/94 and 1997/98, i.e., a data set consisting of eight items with five identical response categories; (b) one minor set comprising data from 1989/90, i.e., a data set consisting of six items with four response categories. These two sets of data were analysed separately. Since the original eight items from the main set of data did not conform adequately to the Rasch model, the original set of items from 1985/86, 1993/94 and 1997/98 was reduced through a step-by-step removal of items that had reversed empirical thresholds. In addition, for tentative purposes, separate analyses were conducted for each of these three years of investigations. Moreover, in order to further examine the differences in response formats between the main and the minor data sets, the six-item set from 1989/90 was compared with the corresponding six-item set from 1985/86, 1993/94 and 1997/98.

The test of invariance across the latent trait and sample characteristics, that is DIF, was conducted using two-way analysis of variance (Glass and Stanley, 1970) of the residuals. Specifically, a standardised residual between the observed response and the expected response according to the model, given the person and item parameter estimates, was calculated, and these residuals were placed into cells according to the sample characteristic (e.g., gender) and class intervals along the continuum. The main effect of the sample characteristic and the interaction effect between the sample characteristic and the class interval in the two-way analysis of variance detect uniform and non-uniform DIF respectively (Andrich and Hagquist, 2001). The main effect of the class intervals provides a general test of fit of the item across the trait, irrespective of the classification by sample characteristic. Separate analyses were conducted for each of the sample characteristics of interest, i.e., gender, grade and years of investigations.

Any test of fit is sensitive to the relative locations of the person and item parameters and to the sample size. In addition, when analysis of variance is applied, many tests of fit are conducted among items and within items when DIF is examined, increasing the possibility of a type I error.
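The residual-based procedure described above can be sketched as follows. This is an illustrative outline under stated assumptions, not the RUMM2010 implementation: the standardised residual is taken as the observed score minus the model expectation, divided by the model standard deviation, and the function names, cut points and records are hypothetical. The resulting cells are what a two-way analysis of variance (sample characteristic by class interval) would then be applied to.

```python
import math
from collections import defaultdict


def category_probabilities(beta, delta, taus):
    """Category probabilities under equation (1), with kappa_0 = 0 assumed."""
    kappas = [0.0]
    for tau in taus:
        kappas.append(kappas[-1] - tau)
    weights = [math.exp(kappa + x * (beta - delta)) for x, kappa in enumerate(kappas)]
    gamma = sum(weights)
    return [w / gamma for w in weights]


def standardised_residual(x, beta, delta, taus):
    """(observed - expected) / sqrt(variance) for one response to one item."""
    p = category_probabilities(beta, delta, taus)
    expected = sum(k * pk for k, pk in enumerate(p))
    variance = sum((k - expected) ** 2 * pk for k, pk in enumerate(p))
    return (x - expected) / math.sqrt(variance)


def cell_means(records, delta, taus, cut_points):
    """Mean residual per (group, class interval) cell for one item."""
    cells = defaultdict(list)
    for group, beta, x in records:
        # Class interval index = number of cut points below the person location
        interval = sum(beta > c for c in cut_points)
        cells[(group, interval)].append(standardised_residual(x, beta, delta, taus))
    return {cell: sum(v) / len(v) for cell, v in cells.items()}


# Hypothetical records: (group label, person location, observed score)
records = [("girls", 0.0, 1), ("boys", 0.0, 2), ("boys", -1.0, 0)]
means = cell_means(records, delta=0.0, taus=[-1.0, 1.0], cut_points=[-0.5, 0.5])
print(means)
```

A group whose mean residuals are consistently above or below zero across class intervals would correspond to uniform DIF, while a pattern that changes sign across intervals would correspond to non-uniform DIF.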
Finally, the test of fit conducted with the model is a test of deviation from perfection rather than of deviation from some null effect, meaning that items which are not perfect, but nevertheless useful for the purpose of increasing the precision of person locations, will be rejected statistically given a large enough sample size. In this case the sample sizes were extremely large, of the order of 3000 or more in each sample and larger still where the different years of data collection are combined, making even small deviations from the model inevitably show statistical misfit. Therefore graphical examinations in the form of item characteristic curves were used as heuristic tools, in order to judge the substantive meaning of potential misfit indicated by the formal tests. These displays are not shown in the results section for reasons of space.

In order to facilitate comparisons between the two sets of data used, tests of fit were carried out with the sample size of the main data set (including three years of investigations) adjusted to a value of the order of 3462, which is the effective sample size for the test of fit within the minor set of data from 1989/90.

RESULTS

Table II shows the frequency distribution for the set of eight items based on data from 1985/86, 1993/94 and 1997/98, indicating that the prevalence of different health complaints differs substantially. Dizziness and Backache are complaints that two out of three students experience seldom or never, while only one out of six students reports being Irritable seldom or never.

TABLE II
The percent of responses in different categories for eight items based on data from 1985/86, 1993/94 and 1997/98
Response categories (scores): About every day (0); More than once a week (1); About once a week (2); About once a month (3); Seldom or never (4)
Item labels: Headache; Stomach-ache; Backache; Felt low; Irritable; Nervous; Sleeping difficulties; Dizziness
[Cell values are missing from this copy.]

Figure 2 shows the locations of the item threshold parameter estimates relative to the distribution of the persons for the item set

consisting of eight items based on the data from 1985/86, 1993/94 and 1997/98. The person locations are negatively skewed with a relatively positive mean and with a number of thresholds at the margin of the person distribution around 1.0 logits. The dislocation that appears in the distribution results from the students showing relatively good health with respect to the complaints specified in the items.

Figure 2. Person-item threshold distribution. The higher the score, the better the health.

Table III shows the item parameter estimates and threshold parameter estimates for eight items based on data from 1985/86, 1993/94 and 1997/98. The table shows that the items represent different levels of severity, Irritable showing the lowest and Dizziness the highest estimates of the item parameters. Furthermore, half of the items show reversed item thresholds. The Cronbach's alpha value for this set of eight items was [...]. Separate analyses for each year of investigations (not reported here) confirm the results shown in Table III as regards the disordered thresholds.

TABLE III
Estimates of individual item parameters and threshold parameters (in italics = disordered) based on data from 1985/86, 1993/94 and 1997/98; original set of 8 items
Columns: Item label; Location estimate; SE; Thresholds
Item labels: Headache; Stomach-ache; Backache; Felt low; Irritable; Nervous; Sleeping difficulties; Dizziness
[Cell values are missing from this copy.]

Figures 3 and 4 show category probability curves for the items Irritable and Dizziness. Figure 3 shows that the estimates of the thresholds defining the successive categories in the item Irritable are ordered as required. Having a low (negative) value on the health scale indicates a high probability of scoring the lowest value on the item. Conversely, for a person with a high (positive) value on the overall measure, the probability of scoring a high value on the single item is high, and having a central value on the continuum implies responding in one of the middle two categories.

Figure 3. Category probability curve for Irritable.

In contrast, Figure 4 shows that the estimates of the thresholds defining the categories in the item Dizziness are not ordered as required. Instead of forming distinctive regions, the thresholds are twisted, leaving regions of the continuum undefined, so that a person who is located at around 0 logits is less likely

to respond in category 2 (about once a week) than in categories 1 (about every day), 3 (about once a month) and 4 (seldom or never). This is a concrete illustration of a set of categories not working empirically as intended and not constituting increasing levels of the trait as required by the model.

Figure 4. Category probability curve for Dizziness.

In all, three items were removed in order to meet the requirement of ordered thresholds. Table IV shows the analysis of variance based on standardised residuals for the set of five remaining items, reported separately for gender, grade and years of investigations. Table IV shows that the reduced set of five items as a whole fits the model relatively well as regards item by latent trait interaction, although the class interval p-value for the item Felt low is on the borderline. Two items show DIF with respect to sample groups: the item Nervous across genders and the item Felt low across grades. The Cronbach's alpha value for this reduced set of five items was 0.764.

TABLE IV
Analysis of variance of residuals for test of DIF between genders, grades and years of investigations as well as tests of class interval fit based on data from 1985/86, 1993/94 and 1997/98; number of class intervals = 10
Columns (probability values):
  Division by gender: Gender; Class interval; Gender by class interval
  Division by grade: Grade; Class interval; Grade by class interval
  Division by years of investigations: Year; Class interval; Year by class interval
Item labels: Headache; Stomach-ache; Felt low (one cell reported as N/Sig); Irritable; Nervous (one cell reported as N/Sig)
[Cell values are missing from this copy.]

Table V shows the frequency distribution for the set of six items based on data from 1989/90. Table V shows that the prevalence of different health complaints differs substantially. Dizziness is a complaint that 45 percent of the students experience seldom or never, while Irritable is a complaint which only six percent report seldom or never.

TABLE V
The percent of responses in different categories for all items used based on data from 1989/90
Response categories (scores): Often (0); Sometimes (1); Seldom (2); Never (3)
Item labels: Headache; Stomach-ache; Irritable; Nervous; Sleeping difficulties; Dizziness
[Cell values are missing from this copy.]

In Table VI item parameter estimates, probability values and threshold parameter estimates based on the six-item set from 1989/90 are reported. The threshold ordering for the set is as required. The Cronbach's alpha value for this set of six items was [...].

TABLE VI
Estimates of individual item parameters, item fit and threshold parameters based on data from 1989/90; original set of 6 items
Columns: Item label; Estimate; SE; Thresholds
Item labels: Headache; Stomach-ache; Irritable; Nervous; Sleeping difficulties; Dizziness
[Cell values are missing from this copy.]

Table VII shows the analysis of variance based on standardised residuals for the item set consisting of six items from 1989/90, reported separately for gender and grade. Table VII shows that all six items work consistently across the latent trait. There are clearly no main effects as regards the class intervals for five out of six items, while the item Sleeping difficulties

is on the borderline. However, two items show DIF across genders: the item Headache and the item Sleeping difficulties.

TABLE VII
Analysis of variance of residuals for test of DIF between genders and grades as well as tests of class interval fit based on data from 1989/90; number of class intervals = 10
Columns (probability values):
  Division by gender: Gender; Class interval; Gender by class interval
  Division by grade: Grade; Class interval; Grade by class interval
Item labels: Headache; Stomach-ache; Irritable; Nervous; Sleeping difficulties (one cell reported as N/Sig); Dizziness
[Cell values are missing from this copy.]

In order to achieve comparability between the set of items from 1989/90 and the set of items covering three years of investigations, two items, Backache and Felt low, were eliminated from the original set of items from 1985/86, 1993/94 and 1997/98. Comparing these two sets of items with identical complaints included, but with different response categories, the results reported before still hold: the data from 1985/86, 1993/94 and 1997/98 show disordered thresholds, while the data from 1989/90 do not (not reported here).

DISCUSSION

The results reported in this article on measurement of self-reported health among adolescents in Sweden indicate that the HBSC-instrument used to measure subjective health complaints at three years of investigations does not work adequately according to the Rasch model when all eight original items are included in the analysis. This is indicated by disordered thresholds among some of the items, showing that the categories are not working as intended. Hence, the original set of eight items should not be used to construct a latent measure of subjective health.

Because some threshold estimates were located at the margin of the person distribution, it would be reasonable to hypothesise that lack of information for parameter estimation might be the cause of the disordered thresholds. However, none of the three items that were removed belonged to the group of items with extreme threshold values.

Notably, the problems with disordered thresholds described above do not occur in the HBSC-instrument used in 1989/90. Reducing the number of items based on data from 1985/86, 1993/94 and 1997/98 to six, and thereby constructing a set of items identical to that from 1989/90, does not make any difference: three items in the data set from 1985/86, 1993/94 and 1997/98 still reveal reversed threshold ordering.
At face value, the outcome from this comparison may come as a surprise, since most of the response categories used in 1985/86, 1993/94 and 1997/98 were primarily built upon quantitative expressions (e.g., "About once a week") while those used in 1989/90 comprised qualitative expressions (e.g., "Often"). Using quantitative instead of qualitative expressions may be a way to avoid contextual bias owing to possibly different meanings of qualitative expressions across items. On the other hand, quantification may work in the opposite direction due to loss of conceptualisation and accuracy in the responses. These kinds of trade-off problems are similar to those discussed in statistics as to whether qualitative probabilistic expressions should be quantified or not (Mosteller and Youtz, 1990). The fact that the sets of items with response categories mainly based on quantitative expressions were mixed up with one category based on a qualitative expression may have reinforced the problems for the operation of the categories described above. Interestingly, items on perceived health invoking response categories similar to those used in 1989/90 (but with five instead of four response categories) have shown the required properties, not least with respect to the threshold ordering, when applied to another adolescent data set from Sweden (Hagquist, 2001).

In order to find a set of items based on data from 1985/86, 1993/94 and 1997/98 that did not suffer from disordered thresholds, three items had to be removed. The reduced set of five items did work consistently with the model with respect to the response categories, and did show relative invariance across the latent trait. Moreover, the reliability measured by Cronbach's alpha remained high (0.764 compared to [...] in the initial set), indicating that the loss of precision in the measurement due to the removal of three items was negligible. However, because of the threshold reversal, the precision that is apparent from the statistical formula cannot be taken at face value; the lower value in the presence of the correct ordering of the thresholds is more tenable.
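For reference, Cronbach's alpha for a set of $k$ items is $\alpha = \frac{k}{k-1}\bigl(1 - \sum_j s_j^2 / s_T^2\bigr)$, where $s_j^2$ is the variance of item $j$ and $s_T^2$ the variance of the total score. The sketch below computes it on a small hypothetical response matrix; the data are invented for illustration and are unrelated to the HBSC samples.

```python
def cronbach_alpha(scores):
    """Cronbach's alpha for a list of persons, each a list of k item scores."""
    k = len(scores[0])

    def variance(values):
        # Sample variance with n - 1 in the denominator
        mean = sum(values) / len(values)
        return sum((v - mean) ** 2 for v in values) / (len(values) - 1)

    item_variances = [variance([row[j] for row in scores]) for j in range(k)]
    total_variance = variance([sum(row) for row in scores])
    return k / (k - 1) * (1.0 - sum(item_variances) / total_variance)


# Hypothetical responses of five persons to three items scored 0-4
responses = [
    [4, 3, 4],
    [2, 2, 3],
    [0, 1, 1],
    [3, 3, 2],
    [1, 0, 1],
]
print(round(cronbach_alpha(responses), 3))
```

Because alpha depends on the number of items as well as on their intercorrelations, removing items from a set will usually lower it somewhat, which is the comparison made in the paragraph above.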
Since a few of the remaining items showed lack of invariance across genders, that problem should be solved through principles of equating (Andrich and Hagquist, 2001) if the reduced item set is to be used for post-hoc analyses.

Following the categorisation suggested by Haugland et al. (2001), two out of the five items in the reduced set were somatic, while three were psychological, suggesting that there are somatic and psychological complaints that might reflect one higher order dimension of subjective health.

Since discarding items may sometimes decrease the precision of measurement too much, an alternative strategy for post-hoc analyses might be to collapse the response categories in order to try to bring the items into accordance with the Rasch model (Andrich, 1996; Zhu et al., 1997). Collapsing response categories is an appropriate technique only if the data do not conform to the Rasch model (Andrich, 1996). In contrast, if the data fit the Rasch-model, collapsing response categories violates the model (Rasch, 1966).

In order to judge whether additional items could be retained through a revision of the response categories, complementary analyses by means of qualitative interviews might be useful. Such interviews would facilitate the understanding of the ways different groups of individuals perceive and internalise different response categories in relation to the complaints. In particular, it would be instructive to check whether respondents thought that those items with reversed thresholds created difficulties for them in making a choice.

ACKNOWLEDGEMENTS

The data were provided by the National Institute of Public Health in Sweden. Principal investigator of the HBSC-survey in Sweden is Dr Ulla Marklund.

REFERENCES

Andrich, D.: 1978, A rating formulation for ordered response categories, Psychometrika 43.
Andrich, D.: 1979, A model for contingency tables having an ordered response classification, Biometrics 35.
Andrich, D.: 1982, Using latent trait measurement models to analyse attitudinal data: A synthesis of viewpoints, in D. Spearritt (ed.), Proceedings of the 1980 Invitational Conference for the Improvement of Testing, celebrating the 50th anniversary of the Australian Council for Educational Research (A.C.E.R., Melbourne).
Andrich, D.: 1988a, Rasch Models for Measurement (Sage Publications, Newbury Park).

Andrich, D.: 1988b, A general form of Rasch's extended logistic model for partial credit scoring, Applied Measurement in Education 1, pp.

Andrich, D.: 1996, Measurement criteria for choosing among models with graded responses, in A. von Eye and C.C. Clogg (eds.), Categorical Variables in Developmental Research. Methods of Analysis (Academic Press, San Diego), pp.

Andrich, D.: 2001, Controversy and the Rasch model: A characteristic of a scientific revolution?, Invitational presentation at the meeting of the International Conference on Objective Measurement: Focus on Health Care, Chicago, October.

Andrich, D. and C. Hagquist: 2001, Taking account of differential item functioning through principles of equating, Research report no. 12, April 2001 (Social Measurement Laboratory, Murdoch University, Perth).

Andrich, D., J.H.A.L. de Jong and B.E. Sheridan: 1997, Diagnostic opportunities with the Rasch model for ordered response categories, in J. Rost and R. Langeheine (eds.), Applications of Latent Trait and Latent Class Models in the Social Sciences (Waxmann Verlag, Münster), pp.

Andrich, D., B. Sheridan and G. Luo: 2000, RUMM2010: A Windows interactive program for analysing data with Rasch Unidimensional Models for Measurement (RUMM Laboratory, Perth, Western Australia).

Attanasio, V., F. Andrasik, E.B. Blanchard and J.G. Arena: 1984, Psychometric properties of the SUNYA revision of the psychosomatic symptom checklist, Journal of Behavioral Medicine 7, pp.

Currie, C., K. Hurrelmann, W. Settertobulte, R. Smith and J. Todd (eds.): 2000, Health and health behaviour among young people (World Health Organization, Copenhagen).

Danielson, M. and U. Marklund: 2000, Svenska skolbarns hälsovanor 1997/98 [Health-related behaviours among Swedish schoolchildren 1997/98], Tabellrapport (Folkhälsoinstitutet, Stockholm).

Dubois, B. and J.A. Burns: 1975, An analysis of the meaning of the question mark response category in attitude scales, Educational and Psychological Measurement 35, pp.

Duncan, O.D. and M. Stenbeck: 1987, Are Likert scales unidimensional?, Social Science Research 16, pp.

Fisher, R.A.: 1958, Statistical Methods for Research Workers, 13th ed. (Hafner, New York).

Glass, G.V. and J.C. Stanley: 1970, Statistical Methods in Education and Psychology (Prentice-Hall, New Jersey).

Hagquist, C.: 2001, Evaluating composite health measures using Rasch modeling: An illustrative example, Social and Preventive Medicine 46, pp.

Haugland, S. and B. Wold: 2001, Subjective health complaints in adolescence: Reliability and validity of survey methods, Journal of Adolescence 24, pp.

Haugland, S., B. Wold, J. Stevenson, L.E. Aaroe and B. Woynarowska: 2001, Subjective health complaints in adolescence. A cross-national comparison of prevalence and dimensionality, European Journal of Public Health 11, pp.

HBSC: 1998, Health Behaviour in School-aged Children. A WHO Cross-national Survey. Research Protocol for the Study (University of Edinburgh, Edinburgh).

Hopland, K., L.E. Aaroe and B. Wold: 1993, Sosialt nettverk, einsemd og kvardagsplager. Ei epidemiologisk undersöking blant 9. klassingar [Social network, loneliness and everyday complaints. An epidemiological survey among adolescents], Tidsskrift for Norsk Psykologforening 30, pp.

Hurrelmann, K., U. Engel, B. Holler and E. Nordlohne: 1988, Failure in school, family conflicts, and psychosomatic disorders in adolescence, Journal of Adolescence 11, pp.

Mosteller, F. and C. Youtz: 1990, Quantifying probabilistic expressions, Statistical Science 5, pp.

Rasch, G.: 1961, On general laws and the meaning of measurement in psychology, Proceedings of the Fourth Berkeley Symposium on Mathematical Statistics and Probability (University of California Press, Berkeley), pp.

Rasch, G.: 1966, An individualistic approach to item analysis, in P.F. Lazarsfeld and N.W. Henry (eds.), Readings in Mathematical Social Science (Science Research Associates, Chicago), pp.

Rasch, G.: 1980, Probabilistic Models for Some Intelligence and Attainment Tests (First published 1960 by the Danish Institute for Educational Research) (MESA Press, Chicago).

Wisniewski, J.J., J.A. Naglieri and J.A. Mulick: 1988, Psychometric properties of a children's psychosomatic symptom checklist, Journal of Behavioral Medicine 11, pp.

Zhu, W., W.F. Updyke and C. Lewandowski: 1997, Post-hoc Rasch analysis of optimal categorization of an ordered-response scale, Journal of Outcome Measurement 1, pp.

Karlstad University
SE Karlstad, Sweden
and
National Board of Health and Welfare
Centre for Epidemiology, Stockholm, Sweden
Curt Hagquist

Murdoch University, Perth
Australia
David Andrich
