Development and Validation of an Improved, COPD-Specific Version of the St. George Respiratory Questionnaire*

Original Research: COPD

Makiko Meguro, MPhil; Elizabeth A. Barley, PhD, CPsychol; Sally Spencer, PhD; Paul W. Jones, PhD, FRCP

Objective: To produce an improved, COPD-specific version of the St. George respiratory questionnaire (SGRQ-C).

Methods: Five different steps were required: (1) Rasch analysis of the responses of 893 COPD patients to the St. George respiratory questionnaire (SGRQ) identified weaker items to be removed; (2) a scoring algorithm was produced using data from 1,036 patients; (3) validity of the new and original SGRQ was tested using data from the original validation study; (4) responsiveness was tested using data from a previously published trial; and (5) a reworded version (SGRQ-C) that no longer specified the recall period was administered to 63 pulmonary rehabilitation participants.

Results: Items were removed due to lack of response (n = 1), misfit to the Rasch model (n = 8), and disordered responses (n = 1). Another six items had disordered responses; this was corrected. Scores from the two versions differed slightly, so the scoring algorithm was revised to produce scores equivalent to the original. The intraclass correlation coefficient (ICCC) for the scores for original and new versions was 0.99. Correlations with other measures of disease were very similar to those obtained with the original. New and original scores for treatment effects were similar: difference, 0.1 ± 2.7 U (± SD). Baseline SGRQ and SGRQ-C scores were similar (ICCC, 0.95; 95% confidence interval, 0.92 to 0.97; mean difference, 0.9 ± 5.8 U). Change scores were similar (difference, 1.0 ± 7.3 U).

Conclusions: The SGRQ-C contains the best of the original items, no longer specifies a recall period, and produces scores equivalent to the original.

(CHEST 2007; 132:456–463)

Key words: COPD; health-related quality of life; health status; St. George respiratory questionnaire; questionnaires

Abbreviations: ICC = item characteristic curve; ICCC = intraclass correlation coefficient; PSI = person separation index; SGRQ = St. George respiratory questionnaire; SGRQ-C = COPD-specific version of the St. George respiratory questionnaire

*From Respiratory Medicine, Cardiac and Vascular Sciences (Miss Meguro, Dr. Barley, and Dr. Jones), St. George's, University of London, London; and Brunel University (Dr. Spencer), Osterley Campus, Isleworth, Middlesex, UK.
The authors have no financial or other potential conflicts of interest to disclose.
Manuscript received March 16, 2006; revision accepted April 19, 2007.
Reproduction of this article is prohibited without written permission from the American College of Chest Physicians (www.chestjournal.org/misc/reprints.shtml).
Correspondence to: Paul W. Jones, PhD, FRCP, Respiratory Medicine, Cardiac and Vascular Sciences, St. George's, University of London, Cranmer Terrace, London, SW17 0RE; e-mail: p.jones@sgul.ac.uk
DOI: 10.1378/chest.06-0702

The St. George respiratory questionnaire (SGRQ) [1] is a measure of health status for COPD. It has been in existence for 15 years, and there is a large body of evidence concerning its validity. Rigorous methodology was used in its development, but it is inevitable that some items have weaker measurement properties than others. Removal of these should improve reliability and responsiveness. This study was designed to identify such items and test the effect of their removal on the performance of the questionnaire.

Shortening the questionnaire was not a specific aim. Short instruments for measuring health status in respiratory disease are available, [2–4] but they may perform slightly less well than longer instruments. [2,3,5] From experience, we knew that up to 10 randomly missed items did not affect scores significantly, so we limited item removal to 10 items. We wanted to ensure that the revised instrument would produce scores directly comparable to those from the original, to enable direct comparison between studies using different versions.

The SGRQ was developed initially for use in asthma and COPD. It has found widespread use in COPD but less application in asthma, so we restricted this analysis to data from COPD patients. We describe a series of studies that reanalyzed the SGRQ items, produced a revised version and scoring system, and tested validity and responsiveness. A final study reports a preliminary test of the cross-sectional and longitudinal validity of the new revised and reworded measure.

Materials and Methods

The SGRQ

A total and three component scores are provided: symptoms (8 items), activity (16 items), and impacts (26 items). The 50 items vary in form between polytomous (eg, Likert-type scale) and dichotomous (eg, true/false). In the symptoms component, patients are asked to recall symptoms over a specified time frame: 1 month, 3 months, or 1 year. Other items are not time related. Each item has an empirically derived weight. [6] Scores range between 0 (no impairment) and 100 (worst possible health).

Study Protocols

Study 1: Development of the Revised Version: Baseline data were available from the Inhaled Steroids in Obstructive Lung Disease in Europe study [7] and the placebo limb of a study [8] of tiotropium in COPD. Only data from patients with no missing responses were used (n = 893; 371 women; mean age, 64 ± 7 years [± SD]; mean prebronchodilator FEV1, 1.4 ± 0.5 L). The performance of each item was tested using Rasch analysis.

Study 2: Development of a New Scoring System: Rasch analysis showed that responses to some polytomous items should be amalgamated.
For the revised version, we averaged the item weights from the response categories that were to be collapsed to produce weights for the new categories.

Study 3: Cross-sectional Validity: Scores for the revised version were compared with scores from the original using data from the original validation study [1] (n = 152; 53 women; mean age, 63 ± 8 years; mean FEV1, 53.3 ± 23.3% of predicted).

Study 4: Responsiveness: Sensitivity to change was compared with that of the original using data from a trial [9] of salmeterol. We used only patients who provided SGRQ data with no missed responses (n = 169; 25 women; mean age, 62 ± 8 years; mean FEV1, 46 ± 15% of predicted).

Study 5: Cross-sectional and Longitudinal Validity: The analyses described in studies 3 and 4 were performed using a revised scoring algorithm developed from studies 1 and 2. Since this involved collapsing response categories for a few items, we revised the wording (details in on-line Supplement). We also removed the recall period used in the symptoms component (details in on-line Supplement) since previous work had shown that the 1-month version had only marginally inferior characteristics to the 1-year version. [10] The new SGRQ is termed the COPD-specific version of the St. George respiratory questionnaire (SGRQ-C). The SGRQ-C was administered with the SGRQ (1-year version) at the same time, in randomized order, to 63 COPD patients (23 women; mean age, 70 ± 9 years; mean FEV1, 47 ± 14% of predicted) participating in three pulmonary rehabilitation programs throughout the United Kingdom. The SGRQ-C was readministered after 4 weeks.

Analysis

Study 1: Prior to the Rasch analysis, we removed those items that had a high proportion of missed responses relative to the others.

Rasch Analysis: The Rasch model [11,12] provides a template for measurement in which all the items are assumed to conform to a unidimensional structure. It enables tests of how well each item fits that structure.
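As a minimal illustration of the unidimensional structure described above, the sketch below evaluates the standard dichotomous Rasch model, in which the probability of endorsing an item depends only on the difference between a person's location and the item's location (both in logits). The item names and locations are hypothetical and are not taken from the SGRQ; the study itself used RUMM 2020, which also handles polytomous items.

```python
import math


def rasch_probability(person_location: float, item_location: float) -> float:
    """Probability of endorsing a dichotomous item under the Rasch model.

    Both arguments are in logits. The probability is 0.5 when the person
    and item locations coincide and rises as the person location exceeds
    the item location.
    """
    return 1.0 / (1.0 + math.exp(-(person_location - item_location)))


# Hypothetical item locations (logits); not values from the study.
ITEM_LOCATIONS = {"mild item": -1.0, "moderate item": 0.0, "severe item": 1.5}

if __name__ == "__main__":
    for person in (-2.0, 0.0, 2.0):
        row = {
            name: round(rasch_probability(person, loc), 2)
            for name, loc in ITEM_LOCATIONS.items()
        }
        print(f"person location {person:+.1f}: {row}")
```

For any fixed person, items with higher locations are endorsed less often, which is the ordering that the fit tests check each item against.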
The first step examines the ordering of responses to polytomous items. When correctly ordered, the probability of responding to items of different severities should follow the level of health status impairment of the respondents. Item responses that do not follow a logical order are disordered, but this may be corrected by combining response categories.

We used a range of tests to identify the least well-performing items that could be removed without altering the basic properties of the SGRQ. Analyses were performed using a Rasch unidimensional measurement model (RUMM 2020; RUMM Laboratory; Perth, Western Australia). [13]

The capacity of the scale to discriminate between patients was tested using the person separation index (PSI) from the Rasch analysis. It is analogous to Cronbach α. [14] The PSI for each component was calculated before and after removing an item. If it worsened, the item was retained. Finally, the PSI was calculated for all items used in the revised version and compared with that for the original 50 items.

We examined the item characteristic curve (ICC) and χ² fit statistic for each item to test its ability to discriminate between patients with different levels of health. ICCs show whether persons with a similar level of health respond similarly to an item. The χ² was used to test the quality of fit of each ICC to the unidimensional model. To maintain the balance of symptom, activity, and impact items present in the original version, we examined the fit of each item only in relation to the others within its component.

Items for which ordered thresholds could not be achieved and items showing misfit were removed. Where several items met the removal criteria, item content was also considered in order to preserve content validity.

Other Analysis: Differences between original and revised versions of the SGRQ were tested using paired t tests and intraclass correlation coefficients (ICCCs).
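The two comparisons named above can be sketched as follows. The text does not say which form of intraclass correlation was used, so the two-way, single-measure, absolute-agreement form is an assumption here, and the scores in the usage section are invented for illustration.

```python
import math
from statistics import mean, stdev


def paired_t(first, second):
    """Return (t statistic, degrees of freedom) for a paired t test."""
    diffs = [a - b for a, b in zip(first, second)]
    n = len(diffs)
    return mean(diffs) / (stdev(diffs) / math.sqrt(n)), n - 1


def icc_absolute_agreement(first, second):
    """Two-way, single-measure, absolute-agreement ICC for two score sets.

    Assumption: this form is chosen for illustration only; the source does
    not specify the ICC variant.
    """
    first, second = list(first), list(second)
    n, k = len(first), 2
    grand = mean(first + second)
    subject_means = [(a + b) / 2 for a, b in zip(first, second)]
    version_means = [mean(first), mean(second)]
    ss_subjects = k * sum((m - grand) ** 2 for m in subject_means)
    ss_versions = n * sum((m - grand) ** 2 for m in version_means)
    ss_total = sum((v - grand) ** 2 for v in first + second)
    ss_error = ss_total - ss_subjects - ss_versions
    ms_subjects = ss_subjects / (n - 1)
    ms_versions = ss_versions / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))
    return (ms_subjects - ms_error) / (
        ms_subjects + (k - 1) * ms_error + k * (ms_versions - ms_error) / n
    )


if __name__ == "__main__":
    # Invented scores for six patients on two versions of a questionnaire.
    original = [32.0, 45.5, 51.0, 60.2, 71.4, 38.9]
    revised = [31.0, 46.0, 52.5, 59.0, 70.1, 40.2]
    print(paired_t(original, revised))
    print(icc_absolute_agreement(original, revised))
```

The paired t test asks whether the mean difference between versions departs from zero, while the ICC summarizes how closely individual patients' scores agree across versions; the two can disagree, which is why both are reported.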
Agreement was tested by plotting the difference between original and revised scores against the mean of the two. [15] Comparative validity was tested by examining the correlations (r) between the two versions of the SGRQ and other measures of disease severity. Differences between patient groups were tested using analysis of variance. Data are reported as mean ± SD.

Results

Study 1

The PSI of the original SGRQ indicated excellent discriminatory function (PSI, 0.90). One item referring to ability to work was found to have a large number of missing responses (19% compared with a mean of 1% for the other items). It was removed.

www.chestjournal.org CHEST / 132 / 2 / AUGUST, 2007

Symptoms Items: The PSI for this component was 0.74 ("good"). Six of seven Likert items had disordered thresholds. In five, ordered response options were achieved by combining two or more categories (details in the on-line Supplement). One item, "How long did the worst attack of chest trouble last?", could not be rescored and was removed. Following these changes, the PSI for this component was 0.74.

Activity Items: The initial PSI was 0.84. The two worst-fitting (p < 0.0001) items, "I get breathless lying still" and "I get breathless playing sports or games," were removed. An item concerning very strenuous activities was also removed. It fitted the model, but its location (2.32 logits) showed it was less severe than an item about moderately strenuous activity (2.49 logits), which is counterintuitive. After removal of these items, the PSI was 0.88.

Impact Items: The initial PSI was 0.84. One item showed disordered thresholds; this was resolved by combining two response categories. The item with the worst fit ("I do not expect my chest to get any better"; p < 0.0001) was removed. Removal of the next worst-fitting item caused another previously good item to misfit. Both were retained and other misfitting items examined. Three referred to medication. A fourth medication item had quite a good fit, but as a single item it now appeared incongruous with the other items that reflect quite different psychosocial effects. It was removed. Removal of these items reduced the PSI by 0.005, which was considered acceptable.

The PSI for the 40 retained items showed that the excellent discriminatory function had been maintained (PSI, 0.90).
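The PSI-based retention rule used in this step can be sketched as below. The formula shown is one common formulation of a person separation reliability (the share of observed variance in person estimates that is not error variance); the exact computation inside RUMM 2020 is not given in the text, so treat this as an assumption. The person estimates and standard errors are invented.

```python
from statistics import mean, variance


def person_separation_index(person_estimates, standard_errors):
    """Proportion of observed person variance not attributable to error.

    Assumed formulation: (observed variance - mean squared standard error)
    divided by observed variance.
    """
    observed = variance(person_estimates)
    error = mean([se ** 2 for se in standard_errors])
    return (observed - error) / observed


def retain_item(psi_with_item: float, psi_without_item: float) -> bool:
    """Retain the item if removing it worsens (lowers) the PSI."""
    return psi_without_item < psi_with_item


if __name__ == "__main__":
    # Invented person locations (logits) and their standard errors.
    estimates = [-2.0, -1.0, 0.0, 1.0, 2.0]
    errors = [0.5, 0.5, 0.5, 0.5, 0.5]
    print(person_separation_index(estimates, errors))
    print(retain_item(psi_with_item=0.84, psi_without_item=0.88))
```

In the usage lines, the second call mirrors the activity component, where the PSI rose from 0.84 to 0.88 after removal, so the function reports that the removed items need not be retained.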
Study 2

A Bland and Altman plot [15] for the total score (not shown) showed that the difference between the new and original scores also increased very slightly with higher scores (linear regression slope, 0.10; R² = 0.30; p < 0.0001). This was also the case for the component scores (symptoms: slope, 0.02; R² = 0.01, p 0.01; activity: slope, 0.13; R² = 0.44, p < 0.0001; and impacts: slope, 0.10; R² = 0.20, p < 0.0001). This appeared to be due to the removal of items that had a high response rate in severe disease but had a relatively low item weight.

To achieve equivalence between the original and revised scores, the revised scores were adjusted. We calculated regressions between the original (dependent variable) and the revised (independent variable) scores (total: Y = 3.104 + [0.90 × X], R² = 0.98; symptoms: Y = 0.94 + [0.99 × X], R² = 0.94; activity: Y = 7.01 + [0.87 × X], R² = 0.98; and impacts: Y = 2.18 + [0.88 × X], R² = 0.96) [p < 0.0001 for all scores]. From this, we created algorithms that would produce original SGRQ scores from the 40-item set. Subsequent analyses of the revised SGRQ reported in this article use scores adjusted in this way. Table 1 shows the regressions between the original and revised total scores before and after using these algorithms.

Study 3

Differences between the two scores were minimal: mean total, 0.02 ± 2.4 U (minimum, −6.4 U; maximum, 5.7 U; median, 0.1 U); mean symptoms, 0.001 ± 5.0 U (minimum, −16.2 U; maximum, 13.1 U; median, 0.5 U); mean activity, 0.01 ± 3.9 U (minimum, −9.3 U; maximum, 10.6 U; median, 0.8 U); mean impacts, 0.01 ± 3.9 U (minimum, −9.4 U; maximum, 11.0 U; median, 0.6 U; all p > 0.05) [paired t test, degrees of freedom = 151; t = 0.08, total; t = 0.003, symptoms; t = 0.02, activity; t = 0.04, impacts]. The ICCCs showed excellent reliability: total, 0.99; symptoms, 0.96; activity, 0.99; and impacts, 0.98. Bland and Altman plots indicated good agreement (Fig 1).
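The Study 2 adjustment can be written as a small conversion routine. The coefficients are the intercepts and slopes quoted in the regressions above; the sketch assumes each regression has the form Y = intercept + slope × X, where X is the unadjusted score from the 40-item set and Y is the score on the original SGRQ scale. The function name is hypothetical.

```python
# (intercept, slope) for each score, as quoted in the Study 2 regressions.
# Assumption: the terms combine as Y = intercept + slope * X.
CONVERSION = {
    "total": (3.104, 0.90),
    "symptoms": (0.94, 0.99),
    "activity": (7.01, 0.87),
    "impacts": (2.18, 0.88),
}


def to_original_scale(score_name: str, revised_score: float) -> float:
    """Map an unadjusted 40-item score onto the original SGRQ scale."""
    intercept, slope = CONVERSION[score_name]
    return intercept + slope * revised_score


if __name__ == "__main__":
    for name in CONVERSION:
        print(name, round(to_original_scale(name, 50.0), 2))
```

Because the slopes are below 1 and the intercepts are positive, the adjustment raises low scores slightly and lowers high scores, which matches the observation that the difference between versions grew with score.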
The difference between the two total scores was less than the minimum clinically important score of 4 points for all but 13 (9%) of patients (Fig 1). Scores from the new and original versions correlated very similarly with other measures of disease (Table 2). Both were independent of age and sex (p > 0.05).

Table 1. Step 2, Regression Estimates Between Original SGRQ Total Scores (Dependent Variable) and the Revised Total Scores (Independent Variable) Before and After the Correction Algorithm Was Applied*

Variables          No.   Before Algorithm Applied     After Algorithm Applied
                         Intercept  Slope  R²         Intercept  Slope  R²
Step 2             893   3.10       0.90   0.98       0.07       1.00   0.98
Step 3             152   2.69       0.92   0.98       0.001      1.00   0.98
Step 4             169   4.12       0.89   0.98       1.05       0.99   0.98
Prospective study  40    3.084      0.90   0.87       0.03       1.00   0.87

*The intercept is measured in SGRQ units; the slope is a ratio (original score divided by revised score). For steps 2 to 4 the revised and original SGRQ scores were calculated from the same set of item responses, but the scores were produced using different items. In the prospective study, two different questionnaires were administered to the patients: the original version of the questionnaire and the reworded revised questionnaire (the SGRQ-C).

Figure 1. Difference between original and revised SGRQ scores plotted against the mean of the original and revised SGRQ scores.

Table 2. Step 3, Correlations (r) Between Original and Revised SGRQ Scores and Concurrent Measures of Disease Severity (n = 152)*

Variables            Total              Symptoms           Activity           Impacts
                     Original  Revised  Original  Revised  Original  Revised  Original  Revised
FEV1                 0.29      0.29     0.12      0.17     0.31      0.30     0.26      0.25
FVC                  0.41      0.42     0.24      0.28     0.32      0.34     0.43      0.42
6-min walk distance  0.60      0.63     0.27      0.27     0.59      0.62     0.58      0.60
MRC dyspnea grade    0.70      0.72     0.36      0.38     0.71      0.72     0.64      0.64
HADS anxiety         0.57      0.56     0.36      0.33     0.43      0.43     0.61      0.60
SIP total            0.71      0.70     0.34      0.35     0.62      0.64     0.72      0.69
Global health        0.63      0.61     0.39      0.38     0.54      0.55     0.62      0.59
Cough/phlegm         0.36      0.36     0.39      0.36     0.32      0.30     0.31      0.33
Daily wheeze         0.51      0.49     0.57      0.57     0.34      0.32     0.50      0.49

*Data are from Jones et al. [1] MRC = Medical Research Council; HADS = hospital anxiety and depression scale; SIP = sickness impact profile; Global health = 5-point global health rating (very poor, poor, satisfactory, good, very good); Cough/phlegm and daily wheeze were measured using the Medical Research Council Respiratory Questionnaire; p < 0.0001 unless otherwise indicated. p > 0.05, not significant. p 0.04. p 0.001.

Study 4

Mean change scores (follow-up − baseline) for the original and revised versions were very similar: mean difference, total: 0.1 ± 2.7 U (minimum, −6.6 U; maximum, 8.2 U; median, 0.4 U); symptoms: 0.02 ± 6.1 U (minimum, −16.2 U; maximum, 14.5 U; median, 0.003 U); activity: 0.2 ± 3.2 U (minimum, −7.5 U; maximum, 8.1 U; median, 0.003 U); and impacts: 0.4 ± 4.1 U (minimum, −10.6 U; maximum, 12.1 U; median, 0.01 U) [all scores, p > 0.05]. The ICCC between the original and revised scores (n = 169) showed excellent reliability: total, 0.98; symptoms, 0.93; activity, 0.98; and impacts, 0.97. Original and revised SGRQ change scores were similarly correlated with change in the other variables (Table 3).

According to both SGRQ scores, patients receiving salmeterol showed greater improvement than those receiving placebo (Table 4). Differences between treatment groups were significant for impacts and total scores (original and revised) only (analysis of variance, degrees of freedom = 1,168; impacts: original, F = 7.5, p = 0.007; new, F = 6.1, p = 0.02; total: original, F = 6.0, p = 0.02; new, F = 4.3, p = 0.04). In the placebo-treated arm of the study, repeatability measured as ICCC was the same for both versions (Table 4).

Study 5

At baseline, SGRQ and SGRQ-C scores were very similar (mean difference: total, 0.9 ± 5.8 U; symptoms, 3.0 ± 12.0 U; activity, 1.3 ± 7.1 U; impacts, 0.1 ± 8.5 U) [all p > 0.05]. The ICCCs showed excellent reliability: total, 0.95; activity, 0.93; and impacts, 0.91. The symptoms component was slightly less reliable (ICCC, 0.80). The Bland and Altman plots showed that the activity score differed slightly depending on severity (linear regression slope, R² = 0.10, p = 0.01), but this was not seen with the other components or total score (Fig 2). The two questionnaires were correlated similarly with the other measures of disease severity (Table 5). Both were independent of age and sex (p > 0.05).

Thirty-six patients were available for follow-up (mean age, 70 ± 9 years; mean FEV1, 49 ± 13% of predicted). Their global health rating did not change. SGRQ and SGRQ-C total scores did not change: SGRQ, 1.8 ± 9.4 U; SGRQ-C, 2.8 ± 11.1 U.
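The Bland and Altman comparison used in these studies, including the regression of the difference on the mean that detects severity-dependent disagreement, can be sketched as follows. The direction of the difference (original minus revised) and the 1.96 × SD limits of agreement are assumptions for illustration, and the scores are invented.

```python
from statistics import mean, stdev


def bland_altman(original, revised):
    """Summarize agreement between two sets of paired scores.

    Returns the mean difference (bias), the limits of agreement
    (bias +/- 1.96 SD), and the least-squares slope of the difference
    regressed on the pairwise mean.
    """
    diffs = [o - r for o, r in zip(original, revised)]
    means = [(o + r) / 2 for o, r in zip(original, revised)]
    bias = mean(diffs)
    spread = stdev(diffs)
    mean_of_means = mean(means)
    sxx = sum((m - mean_of_means) ** 2 for m in means)
    sxy = sum((m - mean_of_means) * (d - bias) for m, d in zip(means, diffs))
    return {
        "bias": bias,
        "limits": (bias - 1.96 * spread, bias + 1.96 * spread),
        "slope": sxy / sxx,
    }


if __name__ == "__main__":
    # Invented scores for five patients.
    sgrq = [25.0, 40.0, 55.0, 62.0, 78.0]
    sgrq_c = [24.0, 41.5, 53.0, 63.0, 74.5]
    print(bland_altman(sgrq, sgrq_c))
```

A slope near zero indicates that the two versions disagree by about the same amount across the severity range, whereas a nonzero slope corresponds to the kind of severity dependence reported for the activity score.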
The SGRQ-C changed slightly more than the SGRQ, but this was not significant (paired t test, p > 0.05).

Table 3. Step 4, Correlations (r) Between Changes in Original and Revised SGRQ Scores and Changes in Other Measures of Disease Severity From a Randomized Trial of Salmeterol (n = 169)*

Variables                         Total              Symptoms           Activity           Impacts
                                  Original  Revised  Original  Revised  Original  Revised  Original  Revised
FEV1 % predicted                  0.23      0.21     0.14      0.13     0.17      0.17     0.19      0.17
6-min walk distance               0.03      0.04     0.08      0.10     0.04      0.04     0.04      0.05
SF-36, physical functioning       0.39      0.37     0.05      0.02     0.41      0.41     0.33      0.31
SF-36, social functioning         0.19      0.19     0.16      0.16     0.19      0.17     0.13      0.14
SF-36, physical role limitation   0.26      0.25     0.06      0.03     0.27      0.26     0.21      0.22
SF-36, emotional role limitation  0.17      0.17     0.01      0.01     0.12      0.11     0.18      0.16
SF-36, mental health              0.17      0.19     0.12      0.13     0.01      0.01     0.21      0.23
SF-36, vitality                   0.31      0.31     0.16      0.19     0.24      0.23     0.28      0.27
SF-36, health perception          0.18      0.19     0.07      0.08     0.15      0.15     0.15      0.17

*From Jones and Bosh. [9] SF-36 = Medical Outcomes Study Short Form-36 health scale; p 0.05 unless otherwise indicated. p 0.01. p 0.05. p 0.0001. p 0.001.

Table 4. Step 4, Change in Score Over 16 Weeks for Placebo- (n = 81) and Salmeterol- (n = 88) Treated Patients for the Original and Revised SGRQ Scores. The ICCC Shows the Degree of Association Between Baseline and Follow-up Scores in Placebo-Treated Patients*

Variables     Original SGRQ                     Revised SGRQ
              Mean (SD)    ICCC (95% CI)        Mean (SD)    ICCC (95% CI)
Total
  Placebo     1.4 (11.5)   0.81 (0.73–0.87)     1.6 (11.6)   0.80 (0.71–0.87)
  Salmeterol  5.9 (12.7)                        5.5 (13.1)
Symptoms
  Placebo     4.9 (15.0)   0.73 (0.61–0.81)     5.1 (15.7)   0.66 (0.53–0.77)
  Salmeterol  6.7 (16.6)                        6.4 (16.9)
Activity
  Placebo     1.8 (15.1)   0.78 (0.68–0.85)     2.3 (14.8)   0.77 (0.67–0.85)
  Salmeterol  3.4 (14.6)                        3.3 (14.4)
Impacts
  Placebo     0.2 (15.8)   0.70 (0.57–0.79)     0.1 (15.8)   0.69 (0.56–0.78)
  Salmeterol  7.0 (16.3)                        6.4 (17.0)

*Data are from Jones and Bosh. [9] Mean change was calculated as follow-up − baseline score. Differences between placebo and treatment groups were significant for impacts (original, p = 0.007; revised, p = 0.02) and total (original, p = 0.02; revised, p = 0.04) scores. Changes in score calculated in this analysis differ from those in the original publication because that analysis included patients for whom there were missed responses in the questionnaires and the SGRQ was scored using preset rules for including/excluding questionnaires based on the number of missing items. This analysis used only those questionnaires for which there were no missed items at baseline or follow-up. CI = confidence interval.

Discussion

This analysis of responses to the SGRQ from a large number of COPD patients identified 10 weaker items that could be removed without altering the performance of the instrument. Initial testing found that the SGRQ already had good measurement properties. Its internal reliability has now been improved by removing an item with a low response rate and by using Rasch analysis to identify items that fitted a unidimensional model less well. The content balance of the SGRQ has been maintained.

The item with a low response rate referred to employment; since most of these patients were beyond retirement age, this finding was unsurprising. The most important finding was that the response options for some polytomous items did not operate in an ordered manner. This could be resolved by combining two or more options for all except one item, which was removed. Eight items were removed because they showed a poor fit relative to the rest of the items in the same component of the questionnaire.
Rasch analysis is a useful tool for examining the performance of individual items. The properties of each item are defined in relation to one another. No single test of fit is both necessary and sufficient for confirming the model; rather, evidence of statistical misfit needs to be seen as giving a clue to a substantive anomaly to be studied further.16 Removal of the worst-fitting impact item caused another previously well-fitting item to fit less well. When this occurs, the effect of removing items that show less misfit can be tested. Item content should guide which items to examine because this will help maintain content validity. This reasoning led us to remove four impact items. Only three showed misfit, but all four referred to medication use. A single medication item would have been incongruous with the remaining items in that component.

Initial testing of the revised SGRQ showed its scores to be a little different from the original, so we derived a scoring algorithm to ensure that scores from the new version are equivalent to the original. Cross-sectional and longitudinal tests of validity using existing datasets showed this algorithm to be reliable. Finally, the correlation between the SGRQ-C and the original was slightly lower than that found in studies 3 and 4. This was to be expected because in those studies patients completed only one questionnaire and the comparison was between scores calculated in different ways. By contrast, in the final study, two different questionnaires were completed, so the correlation between them contains measurement error due to between-test repeatability and the effects of differences between the two. Overall, good agreement was found between the SGRQ and the SGRQ-C. Larger studies comparing the two questionnaires are needed to confirm these findings.

Table 5. Prospective Study of Reworded, Revised Version (SGRQ-C) vs Original SGRQ in Patients Recruited to a Rehabilitation Program: Baseline Correlations (r) Between SGRQ and SGRQ-C Scores and Concurrent Measures of Disease Severity*

                     Total            Symptoms         Activity         Impacts
Variables            SGRQ   SGRQ-C    SGRQ   SGRQ-C    SGRQ   SGRQ-C    SGRQ   SGRQ-C
FEV1                 0.04   0.02      0.10   0.02      0.06   0.06      0.06   0.05
HADS anxiety         0.46   0.42      0.42   0.28      0.27   0.27      0.48   0.47
HADS depression      0.59   0.56      0.34   0.29      0.52   0.44      0.58   0.58
MRC dyspnea grade    0.43   0.49      0.31   0.28      0.46   0.55      0.36   0.40
Global health        0.60   0.54      0.34   0.27      0.52   0.48      0.59   0.53

*See Table 2 for expansion of abbreviations. p > 0.05, not significant. p < 0.0001. p < 0.001. p < 0.05.

Figure 2. Difference between the original SGRQ and the SGRQ-C scores plotted against the mean of the original SGRQ and the SGRQ-C scores.

In summary, the new COPD-only version of the SGRQ, the SGRQ-C, is shorter, contains the best of the original items, no longer specifies a recall period, and produces scores equivalent to the existing instrument.

References
1 Jones PW, Quirk FH, Baveystock CM, et al. A self-complete measure of health status for chronic airflow limitation. Am Rev Respir Dis 1992; 145:1321-1327
2 Barley EA, Quirk FH, Jones PW. Asthma health status measurement in clinical practice: validity of a new short and simple instrument. Respir Med 1998; 92:1207-1214
3 Juniper EF, Guyatt GH, Cox FM, et al. Development and validation of the mini asthma quality of life questionnaire. Eur Respir J 2001; 14:32-38
4 Katsura H, Yamada K, Kida K. Usefulness of a linear analog scale questionnaire to measure health-related quality of life in elderly patients with chronic obstructive pulmonary disease. J Am Geriatr Soc 2003; 51:1131-1135
5 Oga T, Nishimura K, Tsukino M, et al. Comparison of the responsiveness of different disease-specific health status measures in patients with asthma. Chest 2002; 122:1228-1233
6 Quirk F, Baveystock C, Wilson R, et al.
Influence of demographic and disease related factors on the degree of distress associated with symptoms and restrictions on daily living due to asthma in six countries. Eur Respir J 1991; 4:167-171
7 Burge PS, Calverley PM, Jones PW, et al. Prednisolone response in patients with chronic obstructive pulmonary disease: results from the ISOLDE study. Thorax 2003; 58:654-658
8 Casaburi R, Mahler DA, Jones PW, et al. A long-term evaluation of once-daily inhaled tiotropium in chronic obstructive pulmonary disease. Eur Respir J 2002; 19:217-224
9 Jones PW, Bosh TK. Quality of life changes in COPD patients treated with salmeterol. Am J Respir Crit Care Med 1997; 155:1283-1289
10 Barr JT, Schumacher GE, Freeman S, et al. American translation, modification and validation of the St George's Respiratory Questionnaire. Clin Ther 2000; 22:1121-1145
11 Rasch G. Probabilistic models for some intelligence and attainment tests. Chicago, IL: University of Chicago Press, 1960
12 Barley EA, Jones PW. Repeatability of a Rasch model of the AQ20 over five assessments. Qual Life Res 2006; 15:801-809
13 Andrich D, Sheridan BE, Luo G. RUMM2020 (Rasch unidimensional measurement models). Perth, Western Australia: RUMM Laboratory, 2005
14 Andrich D. Rasch models for measurement. Newbury Park, CA: Sage Publications, 1998
15 Bland JM, Altman DG. Statistical methods for assessing agreement between two methods of clinical measurement. Lancet 1986; 1:307-310
16 Hagquist C, Andrich D. Is the sense of coherence-instrument applicable on adolescents? A latent trait analysis using Rasch-modelling. Person Indiv Diff 2004; 36:955-968
www.chestjournal.org CHEST / 132 / 2 / AUGUST, 2007 463