
Osteoarthritis and Cartilage (2001) 9, 771–778
© 2001 OsteoArthritis Research Society International 1063-4584/01/080771+08 $35.00/0
doi:10.1053/joca.2001.0474, available online at http://www.idealibrary.com

The development of a disease-specific quality of life measurement tool for osteoarthritis of the shoulder: The Western Ontario Osteoarthritis of the Shoulder (WOOS) index

I. K. Y. Lo, S. Griffin and A. Kirkley
Fowler Kennedy Sport Medicine Clinic, University of Western Ontario, London, Ontario, Canada

Summary

Objective: The purpose of this study was to develop and validate a disease-specific quality of life measurement tool for osteoarthritis (OA) of the shoulder.

Methods: An instrument which could be used as the primary outcome measure in clinical trials involving patients with OA of the shoulder was developed using a specific methodological protocol: (1) identification of a specific patient population; (2) item generation; (3) item reduction; (4) pre-testing of the prototype questionnaire and (5) determining the validity, reliability and responsiveness of the final questionnaire.

Results: The final instrument contains 19 items, each with a visual analog response option, for the four domains (six questions for pain and physical symptoms, five questions for sport, recreation and work, five questions for lifestyle function and three questions for emotional function). Ten of the 19 questions had not been identified previously on other shoulder measurement tools. The instrument proved to be valid by demonstrating predicted correlations with previously published shoulder measures, a global health status measure and range of motion. The new instrument was also more responsive than other shoulder measurement tools, a global health status measure and range of motion.
Conclusions: Since the patient's own perception of changes in health status is the most important indicator of the success of treatment, we suggest that this measurement tool be used as the primary outcome in clinical evaluation of various treatments for OA of the shoulder and monitoring patients over time. © 2001 OsteoArthritis Research Society International

Key words: Osteoarthritis, Outcome, Shoulder, Quality of life.

Received 15 August 2000; revision requested 17 November 2000; revision received 28 March 2001; accepted 9 July 2001.
Address correspondence to: Dr A. Kirkley, Fowler Kennedy Sport Medicine Clinic, The University of Western Ontario, London, Ontario, Canada N6A 3K7. Tel: 519-661-4154; Fax: 519-661-4052; E-mail: akirkley@julian.uwo.ca
This work was supported by a grant from 3M Canada.

Introduction

Osteoarthritis (OA) is a common disease of great economic importance and its burden on mankind is increasing in step with the aging of the population [1]. OA of the shoulder, although not as prevalent when compared with the hip or knee, has been demonstrated in cadaveric studies to affect 28.6% of patients greater than 60 years of age [2]. It is estimated that over 10,000 shoulder arthroplasties are performed each year in the United States alone [3]. A wide range of treatment modalities is available including analgesics, non-steroidal antiinflammatory drugs, physiotherapy, arthroscopic debridement, arthrodesis, resection arthroplasty and shoulder arthroplasty [4]. All of these interventions are designed to improve a patient's quality of life and yet a validated, disease-specific quality of life measurement tool is not available to assess the benefit of such interventions [5].

Thus, the purpose of this study was to develop a measurement tool that was valid, reproducible, responsive and user friendly for the assessment of quality of life as it pertains to the shoulder in patients with OA of the shoulder. Since it is the patient's subjective impression of their health status that is most important to the success of treatment, it was decided that a disease-specific quality of life measurement tool would be most appropriate. This tool was designed to be used as the primary outcome measure in clinical trials evaluating the treatment of patients with OA of the shoulder or to be used by clinicians for ongoing evaluation of patients in their practice.

Methods

The method for developing quality of life measurement tools has been well defined by Kirshner and Guyatt [6] and has previously been published for the development of a tool for patients with instability of the shoulder [7]. The five major steps include (1) identification of a specific patient population, (2) item generation, (3) item reduction, (4) pre-testing the prototype instrument and (5) determining the validity, reliability and responsiveness of the instrument.

(1) POPULATION IDENTIFICATION

The purpose of this instrument is to evaluate the disease-specific quality of life of patients with symptomatic OA of the shoulder. Thus, the target population was patients of all ages with a diagnosis of primary OA of the shoulder. The diagnosis of primary OA was defined as:

(1) Chronic progressive shoulder pain.
(2) Radiographic evidence demonstrating all of the following: joint space narrowing, osteophyte formation and subchondral sclerosis.

Patient exclusion criteria included:

(1) Large or massive rotator cuff tears and rotator cuff arthropathy.
(2) Language, psychiatric, or cognitive difficulties that prevented reliable completion of interviews or questionnaires.
(3) A diagnosis other than shoulder OA that would significantly contribute to the patient's shoulder dysfunction (such as cervical spine disease, weakness relating to neurologic condition).
(4) Another major illness, which substantially influenced the patient's quality of life (i.e. unstable angina).
(5) History of significant shoulder trauma, infection, avascular necrosis, cuff tear arthropathy, chronic dislocation, or secondary cause of OA.

Since the final instrument could potentially be used in evaluating all non-operative and operative treatments, patients were included who had received no treatment or any combination of physiotherapy, oral medications, injections or operative treatments. From the files of three participating orthopedic surgeons, a database of 150 patients meeting the inclusion criteria was established. Charts were reviewed and data collected to ensure that patients met the inclusion and exclusion criteria. Data were collected on the patients' age, gender, treatments to date and their overall assessment of the severity of their disability.

(2) ITEM GENERATION

Item generation is considered the most important step in the development of a disease-specific quality of life measurement tool. This step must be comprehensive since the final measurement tool can only consist of the specific items identified in this stage. Since the ultimate goal of the final measurement tool is to measure health-related quality of life, a comprehensive definition of health was chosen.
The World Health Organization defines health as a state of complete physical, mental and social well-being [8]. We therefore included five domains encompassing all aspects of health including: (1) pain and physical symptoms, (2) sports and recreation, and work function, (3) social function and (4) emotional function [8].

The item generation was carried out in three steps as previously published [7]. In the first step, a review of the literature was conducted to identify items that would be appropriate from descriptions of the syndrome of OA of the shoulder, from global health measurement tools (Index of Well-Being [9], Sickness Impact Profile [10], SF-36 [11]), from disease-specific questionnaires in related areas (Arthritis Impact Measurement Scale [12]), and existing instruments specific to the shoulder (Constant score [13], UCLA shoulder scale [14], American Shoulder and Elbow Surgeon's Evaluation Form [15]). In addition, items were identified from the unpublished shoulder rating scales of colleagues with an interest in shoulder surgery.

In the second step, orthopedic surgeons and physiotherapists with interest in the treatment of OA of the shoulder were interviewed. Participants were asked to identify the most important patient symptoms for each domain using open-ended questions in an interview setting.

In the final and most important step, patients with OA of the shoulder were interviewed. Patients were selected with a wide spectrum of patient characteristics, disease severity, and treatments to ensure that the entire spectrum of symptomatology would be elicited. Patients signed an informed consent to participate in the study. Patients then underwent a semi-structured interview by a research assistant (SG) with expertise in the development of quality of life measurement tools [7]. During the interview, patients were asked to identify any items that contributed to their shoulder functioning less than perfectly. Patients were then asked specifically about each of the five domains.
At the end of the interview a spouse or significant other was invited to identify any items that the subject may have omitted. New subjects were interviewed until no new items had been identified with five consecutive interviews, a technique termed interviewing to redundancy [16]. As a result, a total of 20 patients were interviewed. There were nine females and 11 males and the patients' ages ranged from 38 to 89 (mean 66.2 years). Seven had undergone arthroplasty surgery, three arthroscopy and debridement, eight were undergoing combinations of non-surgical treatment which included cortisone injections, physiotherapy and NSAIDs and two had had no treatment at the time of the interview. By the end of this step 199 items were identified.

(3) ITEM REDUCTION

Item reduction involves determining which items should be retained on the final measurement tool. Obviously, a questionnaire with close to 200 items would be clinically impractical. Therefore, the goal is to retain 20–30 items that are most important to the patients and are representative of the total concept of health-related quality of life.

Item reduction was carried out in three stages as previously published [6,7]. The investigators reviewed each item and eliminated items that were duplicated or incomprehensible. Each item was restructured to a grade 8 reading level. Items were screened to remove any ambiguity, jargon or value-laden wording and double-barreled questions [16]. Any items which could not change following treatment were also discarded. Seventy-one items remained at this point.

This questionnaire of 71 items was then administered to 100 new patients (not previously involved in item generation) selected from the database who again represented the full spectrum of patients. For each item, subjects were asked to assess whether they experienced the item or not. If the item was experienced, they were further asked to rate the importance of the item to their overall shoulder functioning. The importance was ranked on a scale of 1 (not important) to 5 (extremely important).
The frequency with which each item was experienced and the mean importance were calculated for each item [17]. The frequency importance product (frequency × mean importance) was then generated for each item. The top 50 items based on the frequency importance product (FIP) were then subjected to further analysis. Each question was correlated with every other question and also correlated with the grouped questions in its domain. This allowed us to eliminate questions that were highly correlated (r² = 0.58–0.999) and were therefore measuring the same concept.
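The frequency importance product described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the paper does not give the unit of frequency, so the sketch assumes it is the percentage of respondents who experienced the item (which is at least compatible with the reported FIP values on a 1 to 5 importance scale). The names `ItemResponses`, `frequency_importance_product` and `top_items` are hypothetical, and the data in the usage note are invented.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ItemResponses:
    """Responses to one candidate item (hypothetical container).

    ratings holds one entry per respondent: None if the item was not
    experienced, otherwise an importance rating from 1 to 5.
    """
    label: str
    ratings: List[Optional[int]]


def frequency_importance_product(item: ItemResponses) -> float:
    """Return frequency (assumed: % of respondents) x mean importance."""
    experienced = [r for r in item.ratings if r is not None]
    if not experienced:
        return 0.0
    frequency = 100.0 * len(experienced) / len(item.ratings)
    mean_importance = sum(experienced) / len(experienced)
    return frequency * mean_importance


def top_items(items: List[ItemResponses], k: int) -> List[ItemResponses]:
    """Rank items by FIP, highest first, and keep the first k."""
    return sorted(items, key=frequency_importance_product, reverse=True)[:k]
```

For instance, an invented item experienced by two of four respondents with ratings 4 and 2 has a frequency of 50 and a mean importance of 3, giving an FIP of 150.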

Table I. Construct validation showing correlations between the WOOS and other outcome measures (Pearson product moment correlation, r)

Outcome measure               Baseline r                 Change score r
                              A priori      Actual       A priori      Actual
                              prediction                 prediction
UCLA shoulder rating scale    0.5           0.630        0.5           0.604
Constant score                0.6           0.730        0.6           0.685
Global change                 N/A           N/A          0.5           0.475
ASES                          0.5           0.590        0.5           0.425
McGill pain questionnaire     0.4           0.578        0.4           0.536
McGill VAS                    0.4           0.407        0.4           0.218
SF12 physical score           0.5           0.650        0.4           0.287
SF12 mental score             0.3           0.460        0.2           0.159
Range of motion               0.3           0.607        0.2           0.545

N/A: not applicable.

(4) PRE-TESTING

Pre-testing the prototype questionnaire is conducted to ensure that the wording is clear and that patients interpret the items as they are intended. The questionnaire was administered to two groups of 10 subjects not previously involved in the development of the tool. Subjects were asked to complete the questionnaire and then to give their interpretation of each item, to identify any questions with unclear wording, and finally to give their opinion as to whether there were any omissions from the questionnaire. Changes were made to the questionnaire after the first set of 10 patients but no further modifications were necessary after the second set of patients.

ITEM SCALING AND WEIGHTING

In general, two methods exist for quantifying an individual's response to a specific item: multi-item Likert scales (i.e. no pain, a little pain, moderate pain, a great deal of pain) and visual analog scales (100 mm lines anchored at either end by the extremes of the dimension being measured (no pain, extreme pain) along which the subjects are asked to place a mark to indicate their status) [18]. We have found it difficult to construct Likert scales that are consistent, meaningful and evenly spaced and, therefore, the visual analog scale was used for this tool [7,18]. It is also controversial as to how the individual items within an instrument should be weighted.
In some instruments the author who has devised the instrument has arbitrarily chosen some items to be more important than others and therefore has given them more weight [13,14]. It would be possible to weight the items according to the mean importance or the FIP. The FIPs for the questionnaire of 71 items ranged from 29 to 206. The final 19 items, however, had a narrow range from 142 to 206. Since the items on the final tool had consistently high FIPs falling within a narrow range, we decided to give all the items the same weight [7,19]. Therefore, in the final instrument, each question has a possible score from 0–100 (100 mm VAS) and is not multiplied by any factor because of the equal weighting. These scores are added to give a total score out of a possible 1900. The raw score can then be converted to a percentage score.

(5) VALIDITY, RELIABILITY AND RESPONSIVENESS

Validity

A tool is considered valid if it is measuring what it is supposed to measure [20]. However, a gold standard by which to determine this does not exist for quality of life. Therefore we must rely on construct validation, which tests the hypothesis that the questionnaire behaves in relation to other measures as would be expected if it were measuring quality of life. This is accomplished by making a priori predictions regarding the correlation of the questionnaire with other related measures at one point in time, for evaluating the discriminative characteristics of the tools, and the change in scores over time, for evaluating the evaluative characteristics of the tools. A discriminative index is used to distinguish between individuals or groups with respect to an underlying condition whereas an evaluative index is used to measure the magnitude of longitudinal change in an individual or group [21].
The new measurement tool and other measurement tools (Constant Score [13], UCLA Shoulder Rating Scale [14], American Shoulder and Elbow Surgeons Standardized Shoulder Assessment Form (ASES) [15], SF12 global health instrument [11], range of motion [11], global change rating scale (Appendix B)) were administered on two occasions to a group of 41 patients selected from the database who were undergoing treatment for OA of the shoulder. (Some of these patients were involved in another step of the development of the tool.) Correlations of the baseline and change scores were determined by the Pearson product moment correlation. The a priori predictions of the correlations for baseline and change scores were made giving careful consideration to the content and weighting of the various tools and how they have behaved in published studies and are presented in Table I.

Reliability

Reliability encompasses the concept that repeated administration of a measurement tool in stable subjects will yield the same results. The reliability of the measurement tool was evaluated in 58 patients over a 3-month period. This interval was chosen since it represents a common time interval for evaluation of patients in clinical trials. Because it was possible for patients to change over 3 months of time, a global rating of change questionnaire (Appendix 1) was concurrently administered to the subjects. Subjects who reported no change were considered stable and those who reported change were eliminated from this analysis. Intraclass correlation coefficient is considered to be the preferable index of reliability and was calculated from a one-way random effects analysis of variance [22–25].

Responsiveness

A group of 41 patients, who were involved in a randomized clinical trial comparing hemiarthroplasty with total shoulder arthroplasty in primary OA of the shoulder and therefore were expected to change between testings, were evaluated. These patients were evaluated pre-operatively (baseline) and 3 months post-operatively. Several methods have been described for determining the responsiveness of a tool [20,26–28]. In this case the standardized response mean [28] was chosen since it correlates with the power of a test and therefore is most relevant when designing clinical trials. The standardized response mean is calculated by dividing the mean change in score by the standard deviation of the change scores [28].

Results

The final instrument has 19 items, representing the four domains (six questions for pain and physical symptoms, five questions for sport, recreation and work function, five questions for lifestyle function and three questions for emotional function (Appendix 2)). The response time is approximately 10 min. The highest or most symptomatic score is 1900 and the best or asymptomatic score is 0. In order to present this in a clinically more meaningful format, the score can be reported as a percentage of normal by subtracting the total from 1900, dividing by 1900 and multiplying by 100.
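The conversion just described (19 equally weighted items, each scored 0 to 100 on a 100 mm visual analog scale, summed and then expressed as a percentage of normal) can be written as a short Python sketch. The function and constant names are hypothetical and not part of the instrument; the range checks are an added assumption about how invalid input might be handled.

```python
N_ITEMS = 19          # number of questions in the final instrument
MAX_ITEM_SCORE = 100  # each item is a 100 mm visual analog scale
MAX_TOTAL = N_ITEMS * MAX_ITEM_SCORE  # 1900, the most symptomatic score


def woos_total(item_scores):
    """Sum the 19 item scores (equal weighting, no multipliers)."""
    if len(item_scores) != N_ITEMS:
        raise ValueError(f"expected {N_ITEMS} item scores")
    if any(s < 0 or s > MAX_ITEM_SCORE for s in item_scores):
        raise ValueError("each item score must lie between 0 and 100")
    return sum(item_scores)


def percent_of_normal(total):
    """Convert a raw total (0 = asymptomatic) to a percentage of normal."""
    if total < 0 or total > MAX_TOTAL:
        raise ValueError("total must lie between 0 and 1900")
    return (MAX_TOTAL - total) / MAX_TOTAL * 100
```

With this sketch, a hypothetical respondent marking 50 mm on every item has a raw total of 950 and a percentage score of 50.0; a raw total of 0 maps to 100 and a raw total of 1900 maps to 0.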
As an example, a patient with a total score of 450 would have a percentage score of 76.3%.

The instrument contains specific instructions to be read by the subjects prior to beginning and a supplement to the instrument may be referred to if patients are unsure of the meaning of any question. The instrument also has specific instructions to the clinician on how it should be scored. These features allow for a more consistent presentation to all subjects and evaluations can be done by mail when necessary. Thus, results using this measurement tool may be compared between centres.

The correlation between the WOOS and each of the other measurement tools for construct validation is summarized in Table I. As predicted all the actual correlations were within 0.2 of the a priori predictions. As a discriminative instrument, the WOOS correlated most strongly with the Constant Score (r=0.685) and lowest with the SF12 mental score (r=0.159). As an evaluative instrument it also correlated best with the Constant Score (r=0.73) and lowest with the SF12 mental score (r=0.460).

Of the 58 patients evaluated for reliability, 22 remained stable (as measured by a global rating of change (Appendix 1)) over a 3-month period. The WOOS total score and each of the domains had excellent reliability (intraclass correlation coefficient >0.75) (Table II) [25,30].

Table II. Reliability of the WOOS and its domains as shown by intraclass correlation coefficients at 3 months after original administration of the WOOS

Domain                    Intraclass correlation coefficient (r)
WOOS total score          0.964
Physical symptoms         0.946
Sports/recreation/work    0.939
Lifestyle                 0.869
Emotions                  0.907

Data on responsiveness revealed (Table III) that the WOOS was more sensitive for detecting change over time than the other measurement tools. The SF12 mental score was the least responsive.

Discussion

It is estimated that over 10,000 shoulder arthroplasties are performed annually in the United States and that the number will continue to increase dramatically in the near future [3].
Total shoulder arthroplasty has been shown to be as cost effective as total hip or knee arthroplasty in improving quality of life (when using the method of quality adjusted life years) [31]. Most orthopedic interventions are designed to improve a patient's quality of life (morbidity) and yet a disease-specific quality of life measurement tool is not available to assess the benefit of such interventions in the shoulder OA population.

Global health-related quality of life measurement tools have been developed for the general population such as the Index of Well-Being [9], Sickness Impact Profile [11], and the SF12 [11]. However, these tools are poor at detecting small but clinically important changes in quality of life of patients with specific medical conditions. Therefore, disease-specific measurement tools have been developed. As an example, the Western Ontario McMaster Arthritis Index (WOMAC) [32], a disease-specific tool, was developed for patients with OA of the hip and knee. The WOMAC has subsequently been shown to be the most responsive of all currently available tools for measuring the outcome following total knee replacement [33]. A highly responsive measurement tool may decrease costs in clinical trials since it allows for smaller sample sizes to identify clinically relevant differences and is more likely to be useful in making treatment decisions in individual patients in clinical practice.

Table III. The standardized response mean of the outcome measures tested

Outcome measure               Standardized response mean
WOOS                          1.910
McGill VAS                    1.710
UCLA shoulder rating scale    1.370
ASES                          1.290
McGill Pain                   1.240
Constant score                1.210
SF12 physical score           0.970
Range of motion               0.720
SF12 mental score             0.430

The three most commonly used functional rating scales for studies evaluating patients with OA of the shoulder are the UCLA shoulder rating scale [14], the Constant score [13] and the American Shoulder and Elbow Surgeons (ASES) evaluation form [15]. Each of these scales provides gauges by which to document improvement following treatment. However, none is specific for OA of the shoulder. The questions in the subjective portion of these tools were generated by experts as opposed to patients and, therefore, the items included in the measurement tools are those the clinicians deem important and not necessarily those that are important to patients. It is not clear how the weighting of the items has been decided. In a study involving patients with shoulder disorders, Lirette et al. [34] demonstrated that by using different physician generated scales and different accepted criteria for a satisfactory result, the proportion of satisfactory results ranged from 37% to 80%. In addition, what may be a documented change in questionnaire scoring may not prove to be a significant clinical change or an improvement in quality of life (particularly in physician generated scales).

Many of these scales require clinical evaluations of the patients by a physician. It has been shown that clinical examination variables (i.e. range of motion), even when carried out by experienced clinicians, have very poor reliability [35] and correlate poorly with patients' subjective evaluations of their function [20]. In addition, it has been well documented that physicians tend to evaluate their patients as functioning better than the patients perceive themselves to be [36].

When comparing previous scales to the WOOS, 10 of the 19 items were not identified in any of the three previously published scales despite these items being considered to be among the 19 most frequently experienced and most important items to patients with OA of the shoulder. For example, 'How much is your shoulder affected by the weather?' and 'How much do you experience reaching behind to tuck in a shirt, get a wallet from your back pocket or do up clothing because of your shoulder?' In addition, most scales do not have any questions about the domains of lifestyle and emotional function, making them less comprehensive for the evaluation of total health.

Validating a disease-specific quality of life measurement tool is an ongoing process. Since there is no gold standard with which to compare, we must rely on construct validation. The WOOS correlated moderately well with the Constant score [13] and the UCLA shoulder rating scale [14], which would be expected since these are joint-specific outcome tools.

When assessing the reliability of an instrument, the intraclass correlation coefficient is the preferable statistical test [22–25]. However, there is no consensus on the acceptable degree of reliability. In general, the acceptable reliability for interpreting scores for individuals [37,38] ranges from 0.85 to 0.94 and the acceptable reliability for describing groups is lower (0.75) [16,30].
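As a rough illustration of the reliability index discussed here, the following Python sketch computes an intraclass correlation coefficient from a one-way random effects analysis of variance. It assumes the single-measurement form, often written ICC(1,1); the paper states only that a one-way random effects model was used, so the exact variant is an assumption. The function name `icc_oneway` is hypothetical and the numbers in the usage note are invented.

```python
def icc_oneway(scores):
    """One-way random effects ICC, single-measurement form (assumed).

    scores is a list of rows, one row per stable subject, each row holding
    that subject's k repeated total scores (for example k = 2 for test and
    retest). ICC = (MSB - MSW) / (MSB + (k - 1) * MSW).
    """
    n = len(scores)
    if n < 2:
        raise ValueError("need at least two subjects")
    k = len(scores[0])
    if k < 2 or any(len(row) != k for row in scores):
        raise ValueError("every subject needs the same number (>= 2) of scores")

    grand_mean = sum(sum(row) for row in scores) / (n * k)
    subject_means = [sum(row) / k for row in scores]

    # Between-subject mean square (n - 1 degrees of freedom).
    msb = k * sum((m - grand_mean) ** 2 for m in subject_means) / (n - 1)
    # Within-subject mean square (n * (k - 1) degrees of freedom).
    msw = sum(
        (x - m) ** 2 for row, m in zip(scores, subject_means) for x in row
    ) / (n * (k - 1))

    return (msb - msw) / (msb + (k - 1) * msw)
```

If every subject scores exactly the same on both occasions, for example the invented data `[[10, 10], [20, 20], [30, 30]]`, the within-subject mean square is zero and the coefficient is 1.0; disagreement between occasions pulls the value down.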
The WOOS, with an intraclass correlation coefficient of 0.964, is therefore adequate for use both in the research setting, where conclusions would be drawn based on mean scores from a group of patients, and in the clinical setting for monitoring an individual patient's progress and potentially using the results to aid in clinical decision making.

The WOOS also proved to be highly responsive and had the highest standardized response mean of those tested. A highly responsive measurement tool has two major advantages. Highly responsive tools are best able to identify small, but clinically important, changes following an intervention, making them most useful for monitoring patients in the clinical setting. In addition, such tools may decrease costs in clinical trials, since a more responsive tool allows smaller sample sizes to identify clinically significant differences. For example, comparing the WOOS with the ASES, the ASES would require approximately 35% more subjects than the WOOS to detect the same amount of change (square of the ratio of standardized response means = (1.158/0.996)^2 = 1.35); equivalently, the WOOS would require about 26% fewer subjects than the ASES.

There are now several outcome tools available to choose from when evaluating patients with shoulder disorders [7,9–15,32,39–41]. One needs to select the most appropriate tool for the task required. Global health tools are convenient to use, but they are of little clinical usefulness because most of the items are not relevant to the patient's condition and they are insensitive to change. However, these tools are useful for comparing across diseases (e.g. effectiveness of coronary artery bypass graft vs total knee arthroplasty). Joint-specific tools are the middle ground: practical for monitoring and retrospective reviews. Finally, disease-specific tools are best for clinical trials and for monitoring and clinical decision making with individual patients.
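The sample-size comparison between the WOOS and the ASES given in this discussion rests on the approximation that the number of subjects needed to detect a given change scales with the inverse square of the standardized response mean (SRM). The short Python sketch below reproduces that arithmetic under this assumption; the function name `relative_sample_size` and the variable names are illustrative only and do not come from the paper.

```python
def relative_sample_size(srm_reference: float, srm_comparator: float) -> float:
    """Return n_comparator / n_reference for detecting the same change.

    Assumes the required sample size is proportional to 1 / SRM**2,
    so the ratio of sample sizes is the squared ratio of the SRMs.
    """
    if srm_reference <= 0 or srm_comparator <= 0:
        raise ValueError("standardized response means must be positive")
    return (srm_reference / srm_comparator) ** 2


# SRM values quoted in the text for the two instruments.
woos_srm = 1.158
ases_srm = 0.996

# How many ASES subjects are needed per WOOS subject.
ratio = relative_sample_size(woos_srm, ases_srm)

# The same ratio expressed from each instrument's point of view.
extra_for_ases = ratio - 1.0           # fraction more subjects with the ASES
saving_with_woos = 1.0 - 1.0 / ratio   # fraction fewer subjects with the WOOS

print(f"sample-size ratio (ASES / WOOS): {ratio:.2f}")
print(f"additional subjects needed with the ASES: {extra_for_ases:.0%}")
print(f"subjects saved by using the WOOS: {saving_with_woos:.0%}")
```

The two percentages describe the same ratio from opposite directions, which is worth keeping in mind when a responsiveness advantage is quoted as a single figure.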
For example, if one wishes to determine and compare the overall shoulder function of several different patient populations with different shoulder disorders and treatments, then perhaps a general shoulder function questionnaire such as the Shoulder Rating Scale [41] should be included as an outcome measure. However, if one is conducting a clinical trial and wishes to determine whether there is a difference in quality of life outcome in osteoarthritic patients treated with total shoulder arthroplasty vs hemiarthroplasty, then a highly responsive disease-specific quality of life outcome measure such as the WOOS is most appropriate.

Conclusion

The WOOS is a rigorously designed measurement tool for patients with OA of the shoulder that is valid, reliable and highly responsive. Since the patient's own perception of changes in health status is the most important indicator of the success of treatment, we suggest that this measurement tool may be used as the primary outcome in clinical trials of treatments in this patient population. Its properties also allow it to be used in the clinical setting.

References

1. Phillips WC, Kattapuram SV. Osteoarthritis: with emphasis on primary osteoarthritis of the shoulder. Del Med J 1991;63(10):609–15.
2. Peterson CJ. Degeneration of the gleno-humeral joint. Acta Orthop Scand 1983;54:277–83.

3. Wirth MA, Rockwood CA. Complications of shoulder arthroplasty. Clin Orthop 1994;307:47–69.
4. Matsen FA, Rockwood CA, Wirth MA, Lippitt SB. Glenohumeral arthritis and its management. In: Rockwood CA, Matsen FA, Eds. The Shoulder, 2nd edn. Philadelphia: WB Saunders Co 1998:840–964.
5. Romeo AA, Bach BR, O'Halloran KL. Scoring systems for shoulder conditions. Am J Sports Med 1996;24:472–6.
6. Kirshner B, Guyatt GH. Methodological framework for assessing health indices. J Chron Dis 1985;1:27–36.
7. Kirkley A, Griffin S, McClintock J, Ng L. The development of a disease specific quality of life measurement tool for shoulder instability. Am J Sports Med 1998;26(6):764–72.
8. World Health Organization: The constitution of the World Health Organization. Basic Documents. Geneva: WHO, 1984.
9. Kaplan RM, Bush JW, Berry CC. Health status: types of validity for an index of well-being. Health Serv Res 1976;11:478.
10. Bergner M, Bobbitt RA, Carter WB, Gilson BS. The Sickness Impact Profile: development and final revision of a health status measure. Med Care 1981;19:787–805.
11. Ware JE, Sherbourne CD. The MOS 36-item short-form health survey (SF-36). Med Care 1992;30(6):473–83.
12. Meenan RF, Gertman PM, Mason JM. Measuring health status in arthritis: the arthritis impact measurement scales. Arthritis Rheum 1980;23:146–52.
13. Constant CR, Murley AHG. A clinical method of functional assessment of the shoulder. Clin Orthop 1987;214:160–4.
14. Amstutz HC, Hoy ALS, Clarke IC. UCLA anatomic total shoulder arthroplasty. Clin Orthop 1981;15:7–20.
15. Richards RR, An KN, Bigliani L, Friedman RJ, Gartsman GM, Gristina AG, et al. A standardized method for the assessment of shoulder function. J Shoulder Elbow Surg 1994;3:347–52.
16. Streiner DL, Norman GR. Health Measurement Scales. A Practical Guide to their Development and Use. Oxford: Oxford University Press, 1989.
17. Guyatt GH, Bombardier C, Tugwell PX. Measuring disease specific quality of life in clinical trials. Can Med Assoc J 1986;134:889–95.
18. Guyatt GH, Townsend M, Burman L, Keller JL. A comparison of Likert and visual analogue scales for measuring change in function. J Chron Dis 1987;49:1129–33.
19. Wainer H. Estimating coefficients in linear models: it don't make no nevermind. Psychol Bull 1976;83:213–7.
20. Guyatt G, Mitchell A, Irvine EJ, Singer J, Williams N, Goodacre R, et al. A new measure of health status for clinical trials in inflammatory bowel disease. Gastroenterology 1989;96:804–10.
21. Guyatt G, Jaeschke R, Feeny DH, Patrick DL, Spilker B (Eds). Quality of Life and Pharmacoeconomics. 2nd ed. Philadelphia: Lippincott-Raven, 5, Measurements in Clinical Trials: Choosing the Right Approach 1996:41–8.
22. Bartko JJ. The intraclass correlation coefficient as a measure of reliability. Psychol Rep 1966;19:3–11.
23. Bland JM, Altman DG. Statistical methods for assessing agreement between two methods of clinical measurement. Lancet 1986;1:307–10.
24. Shrout PE, Fleiss JL. Intraclass correlations: uses in assessing rater reliability. Psychol Bull 1979;86:420.
25. Tammemagi MC, Frank JW, Leblanc M, Artsob H, Streiner DL. Methodological issues in assessing reproducibility: a comparative study of various indices of reproducibility applied to repeat ELISA serological tests for Lyme disease. J Clin Epidemiol 1995;9:1123–32.
26. Guyatt GH, Walter S, Norman G. Measuring change over time: Assessing the usefulness of evaluative instruments. J Chronic Dis 1987;40:171–8.
27. Kazis LE, Anderson JJ, Meenan RF. Effect sizes for interpreting changes in health status. Med Care 1989;27:S178–89.
28. Liang MH, Larson MG, Cullen KE, Schwartz JA. Comparative measurement efficiency and sensitivity of five health status instruments for arthritis research. Arthritis Rheum 1985;28:542–7.
29. Liang MH, Fossel AH, Larson MG. Comparisons of five health status instruments for orthopaedic evaluation. Med Care 1990;28:632–42.
30. Fleiss JL. The design and analysis of clinical experiments. New York: John Wiley and Sons, 1986.
31. Williams A. The importance of quality of life in policy decisions. In: Walker SR, Rosser RM, Eds. Quality of Life: Assessment and Application. Lancaster: MTP Press Ltd 1988:279–90.
32. Bellamy N, Buchanan WW, Goldsmith CH, Campbell J, Stitt LW. Validation study of WOMAC: a health status instrument for measuring clinically important patient relevant outcomes to antirheumatic drug therapy in patients with OA of the hip or knee. J Rheum 1988;15(12):1833–40.
33. Kreibich DN, Vaz M, Bourne RB, Rorabeck CH, Kim P, Hardie R, et al. What is the best way of assessing outcome in total knee replacements? Clin Orthop 1996;331:221–5.
34. Lirette R, Morin F, Kinnard P. The difficulties in assessment of results of anterior acromioplasty. Clin Orthop 1992;278:14–16.
35. Koran LM. The reliability of clinical methods, data and judgments. N Eng J Med 1975;293:642–6.
36. Gerber C. Integrated scoring systems for the functional assessment of the shoulder. In: Matsen FA, Fu FH, Hawkins RJ, Eds. The Shoulder: A Balance of Mobility and Stability. Rosemont: American Academy of Orthopaedic Surgeons 1992:531–50.
37. Kelley TL. Interpretation of Educational Measurements. Yonkers, NY: World Books, 1927.
38. Weiner EA, Stewart GJ. Assessing Individuals: Psychological and Educational Tests and Measurements. Boston: Little Brown, 1984.
39. Hudak PL, Amadio PL, Bombardier C, the Upper Extremity Collaborative Group. Development of an upper extremity outcome measure: The DASH (Disabilities of the Arm, Shoulder and Hand). Am J Ind Med 1996;29:602–8.
40. Lippitt SB, Harryman DT II, Matsen FA III. A practical tool for evaluation of function: the simple shoulder test. In: Matsen FA, Fu FH, Hawkins RJ, Eds. The Shoulder: A Balance of Mobility and Stability. Rosemont: American Academy of Orthopaedic Surgeons 1993:501–18.
41. L'Insalata JC, Warren RF, Cohen SB, Altchek DW, Peterson MGE. A self-administered questionnaire for assessment of symptoms and function of the shoulder. J Bone Joint Surg 1997;79-A:738–48.

Appendix 1: Global rating scale follow-up

A: Overall, how is your shoulder compared to 3 months ago?
Better / Same / Worse
If you are feeling better or worse, how much better or worse do you feel?
Very little difference / A little different / Somewhat different / A good deal different / A great deal different

*B: Considering your Physical Symptoms: How is your shoulder compared to 3 months ago?
Better / Same / Worse
If you are feeling better or worse, how much better or worse do you feel?
Very little difference / A little different / Somewhat different / A good deal different / A great deal different

*The same scale was administered for each domain (Sports/recreation/work, Lifestyle, Emotions).

Appendix 2*: The Western Ontario Osteoarthritis of the Shoulder (WOOS) Index

SECTION A: Physical Symptoms

INSTRUCTIONS TO PATIENTS
The following questions concern the physical symptoms you have experienced due to your shoulder problem. In all cases, please enter the amount of the symptom you have experienced in the last week. (Please mark your answers with a slash '/'.)

1. How much pain do you experience in your shoulder with movement? (no pain / extreme pain)
2. How much constant, nagging pain do you have in your shoulder? (no pain / extreme pain)
3. How much weakness do you experience in your shoulder? (no weakness / extreme weakness)
4. How much stiffness do you experience in your shoulder? (no stiffness / extreme stiffness)
5. How much grinding do you experience in your shoulder? (none / extreme)
6. How much is your shoulder affected by the weather? (not affected / extremely affected)

SECTION B: Sports/Recreation/Work

INSTRUCTIONS TO PATIENTS
The following section concerns how your shoulder problem has affected your sports or recreational activities in the past week. For each question, please mark your answers with a slash '/'.

7. How much difficulty do you experience working or reaching above shoulder level?
8. How much difficulty do you experience with lifting objects (e.g. grocery bags, garbage can, etc.) below shoulder level?
9. How much difficulty do you experience doing repetitive motions below shoulder level such as raking, sweeping or washing floors because of your shoulder?
10. How much difficulty do you experience pushing or pulling forcefully because of your shoulder?
11. How troubled are you by an increase in pain in your shoulder after activities? (not at all / extremely troubled)

*On the actual form the lines are 100 mm long. This form is reproduced by permission of the Fowler Kennedy Sport Medicine Clinic.

SECTION C: Lifestyle

INSTRUCTIONS TO PATIENTS
The following section concerns the amount that your shoulder problem has affected or changed your lifestyle. Again, please indicate the appropriate amount for the past week with a slash '/'.

12. How much difficulty do you have sleeping because of your shoulder?
13. How much difficulty have you experienced with styling your hair because of your shoulder?
14. How much difficulty do you have maintaining your desired level of fitness because of your shoulder?
15. How much difficulty do you experience reaching behind to tuck in a shirt, get a wallet from your back pocket or do up clothing because of your shoulder?
16. How much difficulty do you have dressing or undressing?

SECTION D: Emotions

INSTRUCTIONS TO PATIENTS
The following questions relate to how you have felt in the past week with regard to your shoulder problem. Please indicate your answer with a slash '/'.

17. How much frustration or discouragement do you feel because of your shoulder? (no frustration / extreme frustration)
18. How worried are you about what will happen to your shoulder in the future? (not worried at all / extremely worried)
19. How much of a burden do you feel you are on others? (not at all / extreme burden)