Kylie N. Johnston, Adrian J. Potter, Anna Phillips

Review

Measurement Properties of Short Lower Extremity Functional Exercise Tests in People With Chronic Obstructive Pulmonary Disease: Systematic Review

Kylie N. Johnston, Adrian J. Potter, Anna Phillips

K.N. Johnston, PhD, B App Sc(Physio), School of Health Sciences, University of South Australia, GPO Box 2471, Adelaide, South Australia, 5001, Australia, and Sansom Institute for Health Research, Division of Health Sciences, University of South Australia. Address all correspondence to Dr Johnston at: kylie.johnston@unisa.edu.au.

A.J. Potter, B Physiotherapy, School of Health Sciences, University of South Australia.

A. Phillips, PhD, B App Sc(Physio), School of Health Sciences, University of South Australia and Sansom Institute for Health Research, Division of Health Sciences, University of South Australia.

[Johnston KN, Potter AJ, Phillips A. Measurement properties of short lower extremity functional exercise tests in people with chronic obstructive pulmonary disease: systematic review. Phys Ther. 2017;97:926-943.]

© 2017 American Physical Therapy Association
Published Ahead of Print: June 12, 2017
Accepted: June 07, 2017
Submitted: December 15, 2017

Background. An increasing variety of short functional exercise tests are reported in people with chronic obstructive pulmonary disease (COPD). Systematic review of the psychometric properties of these exercise tests is indicated.

Purpose. The aim of this study was to determine the reliability, validity, and responsiveness of short (duration < 6 min) lower extremity functional exercise tests in people with COPD.

Data Sources. Five databases were searched: MEDLINE, Embase, Scopus, AMED, and CINAHL.

Study Selection. Studies reporting psychometric properties of short functional exercise tests in people with COPD were included.

Data Extraction. Two reviewers independently extracted data and rated the quality of each measurement property using the COnsensus-based Standards for the Selection of Health Measurement INstruments (COSMIN).

Data Synthesis. Twenty-nine studies were identified reporting properties of 11 different tests. Four-meter gait speed (4MGS) and 5-repetition sit-to-stand (5STS) demonstrated high reliability (ICC = .95 to .99; .97) with no learning effect (COSMIN study ratings = good to excellent). Their validity for use as a stratification tool anchored against an established prognostic indicator (area under receiver operator characteristics curve [AUC] = 0.72 to 0.87; 0.82) and responsiveness to change after pulmonary rehabilitation was greatest in more frail people with COPD. Studies of the Timed Up and Go (TUG) test support use of a practice test and show discriminative ability to detect falls history and low six-minute walk distance (AUC = 0.77; 0.82, COSMIN ratings = fair to excellent).

Limitations. Earlier studies were limited by small sample size. Limited data of lower study quality were identified for step tests and the Two-Minute Walk Test.

Conclusions. Selected short functional exercise tests can complement established exercise capacity measures in stratification and in measuring responsiveness to change, especially in people with COPD and lower functional ability.

Post a comment for this article at: https://academic.oup.com/ptj

926 Physical Therapy Volume 97 Number 9 September 2017

People with chronic obstructive pulmonary disease (COPD) experience reduced exercise tolerance related to multiple factors. Altered ventilatory mechanics, specifically dynamic lung hyperinflation during physical activity and resultant dyspnea,[1] gas exchange abnormalities,[2,3] and peripheral muscle dysfunction[4] may all contribute, with further limitations associated with cardiovascular comorbidities,[5] depression,[6] balance,[7] poor sleep,[8] and nutritional deficits.[9] Reduced exercise capacity is associated with worse survival,[10] and lower physical activity levels are associated with greater morbidity, including hospital admissions,[11] and greater mortality.[12] However, interventions that include exercise training are highly effective in improving exercise tolerance and health-related quality of life and in reducing hospitalization in people with COPD.[13]

Field walking tests provide important outcomes of functional exercise capacity in people with COPD. These tests enable clinicians to quantify a person's exercise capacity and response to treatment, predict prognosis, and evaluate the effect of new interventions in clinical trials. Established tests include the Six-Minute Walk Test (6MWT), the Incremental Shuttle Walk Test (ISWT), and the Endurance Shuttle Walk Test (ESWT).[14] These tests are valid, reliable, and functionally relevant outcome measures in COPD.[15] However, there are barriers to their use, related to both the ability of the patient and time and resource constraints. Test duration (6 to 20 min) may not be well tolerated by some people, including those in acute or subacute settings recovering from an acute exacerbation of symptoms, the elderly, people with severe COPD, or people with significant cardiovascular comorbidities.[16] To account for learning effects and ensure the accuracy of measurement, 2 repetitions of these tests are recommended,[14] and this may reduce feasibility in the health care setting. For example, use of one 6MWT only has been reported in 54% (preprogram) to 77% (postprogram) of pulmonary rehabilitation programs in Australia.[17]

There is an increasing variety of short functional exercise tests reported in people with COPD, such as evaluation of gait speed, short duration walking distance tests, tests involving the sit-to-stand transition, and step tests.[18] Shorter tests require less time for the professional and patient to conduct; some require less space; and some have potential suitability in more frail patients. It is critical that these tools are valid, reliable, and able to detect change so that the measurements they produce can be interpreted with confidence. Although narrative[18] and systematic reviews (including literature up to 2014[19]) have been published, there has been much further recent quality research activity in this area. Therefore, the primary aim of this study was to systematically review available evidence regarding the psychometric properties (reliability, validity, and responsiveness) of short lower extremity functional exercise tests, specifically tests of less than 6 minutes in duration, in people with COPD.

Review question: What are the psychometric properties of short lower extremity functional exercise tests in people with COPD?

Methods

This systematic review was reported in accordance with the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA)[20] (Supplementary File 1).

Data Sources and Searches

The search strategy underwent several iterations of peer review by an academic librarian prior to being finalized and registered with the International Prospective Register of Systematic Reviews (Registration no: CRD42015026631). A systematic search was conducted in 5 databases (MEDLINE, Embase, AMED, Scopus, and CINAHL) on September 2, 2015. An identical updated search was completed on September 15, 2016. No restrictions on publication dates were used.
The search strategy (Supplementary File 2) included combinations of terms to identify the population of interest (COPD) and short walking or lower limb functional exercise tests.

Study Selection

Studies reporting the psychometric properties (reliability, validity, and/or responsiveness) of the Two-Minute Walk Test (2MWT), Three-Minute Walk Test, Sit-to-Stand (STS), Timed Up and Go (TUG), gait speed, or step tests in adults with COPD were included. Variations of these tests were included, provided the total duration of the test was less than 6 minutes. There were no restrictions on study design. Studies reporting on other chronic lung diseases, pediatric studies, and studies relating to tests equal to or greater than 6 minutes in duration were excluded. This test duration cutoff of less than 6 minutes was chosen because it is shorter than the already-established 6MWT. Screening of search results was performed by 2 independent reviewers, and discrepancies were resolved by consensus with a third reviewer. Studies were first screened at the level of title and abstract. Full texts were then obtained for studies that met the eligibility criteria or whose eligibility could not be determined from review of title and abstract. Hand-searching of reference lists and forward citation searches of included papers were conducted.

Data Extraction and Quality Assessment

Data were extracted using pre-piloted forms (Supplementary File 3). Extracted data included identifying information (ie, title, year of publication), study methodology, participant demographics and disease characteristics, protocols of the outcome measures utilized, and results regarding their psychometric properties. The primary measurement properties evaluated in this study were reliability, validity, and responsiveness. Terminology and definitions were consistent with those of the Consensus-Based Standards for the Selection of Health Measurement Instruments (COSMIN) taxonomy.[21] Reliability was defined as the proportion of the total variance in measurements which is due to true differences between patients.[21(p743)] This included both test-retest reproducibility (the similarity of results of separate measurements in an unchanged population) and interrater reliability (the similarity of results of separate measurements by different persons on the same occasion). Validity was defined as the degree to which an instrument measures the construct(s) it purports to measure.[21(p743)] Responsiveness was defined as the ability of an instrument to detect change over time.[21(p743)] When available, information was collected on the presence of learning effects and the impact of technical factors on the results of the tests.

Quality assessment of all included studies was conducted using the COSMIN instrument.[22] This tool assesses the quality of each measurement property investigated in a study. For each measurement property, quality was rated as excellent, good, fair, or poor for specific criteria, including study design and statistical methods. Criteria for rating study design and statistical methods are specific to the measurement property being rated, and all aspects are clearly defined by the COSMIN instrument.[22] For example, general criteria rating study design included handling of missing items, adequacy of sample size, and any important flaws in methods or design; reliability studies included additional questions regarding independent administration of at least 2 measurements under similar test conditions and appropriate features of the time interval; validity studies included stating of hypotheses about direction and magnitude of correlations and description and appropriateness of comparator instruments; responsiveness included further questions regarding description of the intervention/interim period and proportion of patients changed. For each measurement property, appropriateness of specific statistical methods was evaluated. Overall quality corresponded to the lowest score obtained on the checklist for that property. Data extraction and quality appraisal were conducted by 2 independent reviewers, and differences were resolved by consensus with a third reviewer.
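The rule that overall quality equals the lowest score on the checklist can be illustrated with a short sketch. This is not the COSMIN instrument itself: the function name `overall_quality` and the example item ratings are hypothetical, and only the four rating levels and the lowest-score rule are taken from the text above.

```python
# Rating levels named in the text, ordered from worst to best.
RATING_ORDER = ["poor", "fair", "good", "excellent"]


def overall_quality(item_ratings):
    """Return the lowest rating among the checklist items of one measurement property.

    Hypothetical helper: it mirrors the "lowest score obtained on the
    checklist" rule described above, nothing more.
    """
    if not item_ratings:
        raise ValueError("at least one item rating is required")
    # min() with the position in RATING_ORDER as key picks the worst rating;
    # an unknown label raises ValueError from list.index.
    return min(item_ratings, key=RATING_ORDER.index)


# Hypothetical item ratings for one reliability study (not from any included study).
example_items = ["excellent", "good", "fair", "excellent"]
print(overall_quality(example_items))  # fair
```

Under this rule a single weak item, such as a small sample size, caps the rating of an otherwise well-conducted study, which is consistent with the poor to fair ratings attributed to small samples in the results.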
Data Synthesis and Analysis

Descriptive synthesis of results was conducted, and meta-analysis was not performed due to significant variation between studies.

Results

A total of 29 studies were included (Figure); 18 studies met the inclusion criteria in the initial search, with a further 11 identified in the updated 2016 search.

Study Characteristics

Measurement properties in people with COPD were described in 4 types of tests: (1) short walking tests (<6 min duration) where distance was measured;[23-27] (2) gait speed tests;[28-34] (3) step tests;[35-38] and (4) lower limb functional exercise tests involving the sit-to-stand transition,[29,39-43] the TUG test,[44-50] or both.[51] Protocol details of all included tests are provided in Supplementary File 4. All included studies were conducted in people with moderate-severe COPD according to recognized international criteria based on lung function measures, except in 1 study where lung function was not reported.[46] People with milder COPD were also included in 4 studies.[37,44,45,47] All studies were conducted in outpatient, community, or primary care settings except for 2 that were conducted in inpatient pulmonary rehabilitation settings, although they also included patients with stable COPD.[24,43] Characteristics of the study setting and included populations are shown in Table 1.

Short walking distance and gait speed tests in COPD: reliability. Reliability was reported for the 2MWT[23,26] and Five-Minute Walk Test (5MWT)[25] and for the measurement of gait speeds over 4 m,[29,31,33] 10 m,[31,34] and 30 m[28] (Tab. 2). Same-day test-retest reliability of the 2MWT was evaluated in 2 studies (Tab. 2), with the mean difference between first and second tests reported as 0.3 m[26] to 2.5 m.[23] The study by Leung et al[26] included an unreported practice test, and the walking track (straight vs circuit) varied between studies. Knox et al[25] reported a large learning effect in the 5MWT in a small sample (n = 12), most marked up to the first 4 walks but continuing throughout their 12 walks. Study quality was limited mainly by small sample size, with COSMIN ratings of poor to fair for reliability in short walking tests (Tab. 2). Reliability of 4-m gait speed (4MGS) was assessed in 2 studies, with COSMIN ratings of fair[31] and good[33] (Tab. 2). A key difference between these studies was inclusion of a 2-m acceleration zone before the measured 4 m,[31] compared with a static start.[33] However, consistently high intraclass correlation coefficients (ICCs > .95), low percentages of standard errors of measurement (4.4% to 4.8%), and no significant differences in mean speed between the 2 trials (Tab. 2) indicated that both methods were highly reliable.

Short walking distance and gait speed tests in COPD: validity. To examine validity, short walking distance and gait speed tests were compared with other field walking tests (6MWT and ISWT) or maximum oxygen uptake (VO2max) measured during a cycle exercise test (Tab. 2). Leung et al[26] reported a very high correlation of the two-minute walk distance (2MWD) with the six-minute walk distance (6MWD, r = .937, n = 45) and moderately strong correlations with VO2max (r = .45, COSMIN ratings = fair). Nadir values of peripheral oxygen saturation (SpO2) during conduct of the 2MWT and 6MWT were correlated (r = .81, P < .0001),[24] and high congruence was reported in detecting participants who desaturated to <90% during both tests.[24] Usual gait speed over 4 meters demonstrated a linear relationship with 6MWD (r = .77, n = 130)[30] and a curvilinear relationship with incremental shuttle walk distance (ISWD, goodness of fit R² = .59, Tab. 2), with the curve flatter at lower ISWD, indicating potential usefulness in individuals with lower functional exercise capacity.[33] Two studies examined the ability of the 4MGS to identify participants with lower exercise capacity (based on an established prognostic marker of 6MWD less than 350 m), with proposed cut points of 0.9 m/s (sensitivity 64%, specificity 90%)[30] and 1.0 m/s (sensitivity 80%, specificity 50%).[29] These studies[29,30,33] were rated as good to excellent on the COSMIN instrument (Tab. 2).
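How a gait speed cut point yields a sensitivity and specificity against the 6MWD criterion can be sketched as follows. This is a minimal illustration, not the analysis used in the cited studies: the function name, the choice of strict inequalities (speed below the cut point counts as a positive test, 6MWD below 350 m as the reference condition), and all data values are assumptions made for the example.

```python
def sensitivity_specificity(gait_speeds, walk_distances,
                            speed_cut=0.9, distance_cut=350.0):
    """Sensitivity and specificity of 'slow gait' for detecting 'low 6MWD'.

    Assumptions (not from the source): a test is positive when
    speed < speed_cut (m/s); the reference condition is present when
    6MWD < distance_cut (m).
    """
    if len(gait_speeds) != len(walk_distances):
        raise ValueError("both lists must describe the same participants")
    tp = fp = tn = fn = 0
    for speed, distance in zip(gait_speeds, walk_distances):
        slow = speed < speed_cut
        low_capacity = distance < distance_cut
        if slow and low_capacity:
            tp += 1          # slow walker with low 6MWD
        elif slow:
            fp += 1          # slow walker with adequate 6MWD
        elif low_capacity:
            fn += 1          # low 6MWD missed by the gait speed test
        else:
            tn += 1          # neither slow nor low 6MWD
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one participant in each reference group")
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical data for six participants (not from any included study).
speeds = [0.70, 0.85, 1.00, 1.10, 0.95, 1.20]      # 4MGS in m/s
distances = [300.0, 320.0, 340.0, 420.0, 380.0, 450.0]  # 6MWD in m
sens, spec = sensitivity_specificity(speeds, distances)
print(f"sensitivity={sens:.2f}, specificity={spec:.2f}")  # sensitivity=0.67, specificity=1.00
```

Raising the cut point classifies more people as slow, which tends to raise sensitivity and lower specificity. The two proposed cut points above (0.9 m/s and 1.0 m/s) show this pattern, although they come from different studies and samples.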

Figure. Systematic search and identification of included studies. COPD = chronic obstructive pulmonary disease.

Four-meter gait speed was reported (Tab. 2) to correlate moderately with ratings of functional impairment due to breathlessness and health-related quality of life but had a low correlation with physical activity level or lung function.

Short walking distance and gait speed tests in COPD: responsiveness. Responsiveness to change was reported for the 2MWT, 3-minute constant rate shuttle test, and 4MGS. Leung et al[26] reported large standardized response means (mean change/standard deviation of change) for the 2MWD and 6MWD after pulmonary rehabilitation (1.25 and 1.70, respectively). Sample size was small (n = 9, thus COSMIN rating = poor) and did not include 6 participants who dropped out due to low motivation or exacerbation at the time of posttest. In addition, the pulmonary rehabilitation program was intensive, delivered 3 days per week, 6 hours per day for 5 weeks in an inpatient rehabilitation setting. In a sample of 54 participants with COPD, the standardized response mean of the 2MWT after delivery of bronchodilator medication was moderate at 0.75, but lower than that for the 6MWT (0.84) or forced expiratory volume in 1 second (1.5), with a COSMIN rating of fair.[23] No studies evaluated minimal important difference in 2MWD. Sava et al[27] reported that the mean reduction in Borg rating of dyspnea at the end of the 3-minute constant rate shuttle test was 1.0 (standard deviation [SD] = 0.2) lower after delivery of nebulized

Table 1. Study Settings and Populations^a

| Study | Year | Test | No. of Participants With COPD | Disease Severity^b | Country | Setting |
| Short walking tests | | | | | | |
| Eiser et al[23] | 2003 | 2MWT | 57 | 35 (15) | UK | |
| Gloeckl et al[24] | 2016 | 2MWT | 26 | 37 (5) | Germany | Inpatient pulmonary rehabilitation |
| Knox et al[25] | 1988 | 5MWT | 36^c | 0.75 (0.34) | UK | |
| Leung et al[26] | 2006 | 2MWT | 45^d | 42 (13) | China | Pulmonary rehabilitation |
| Sava et al[27] | 2012 | 3MST | 39 | 51 (14) | Canada | |
| Gait speed tests | | | | | | |
| Andersson et al[28] | 2011 | 30mWT | 47 | 46 (17) | Sweden | Pulmonary outpatients |
| Bernabeu-Mora et al[29] | 2016 | 4MGS | 137 | 50 (17) | Spain | Pulmonary outpatients |
| Karpman et al[30] | 2014 | 4MGS | 130 | 50 (20) | US | Pulmonary rehabilitation |
| Karpman et al[31] | 2014 | 4MGS, 10MGS | 70 | 53 (18) | US | Pulmonary rehabilitation |
| Kon et al[33] | 2013 | 4MGS | 586^e | 46 (29-61)^f | UK | Pulmonary outpatient clinics |
| Kon et al[32] | 2014 | 4MGS | 301 | 49 (32-63)^f | UK | Pulmonary rehabilitation or pulmonary outpatient clinics |
| Rozenberg et al[34] | 2014 | 10MGS | 29 | 42 (20) | UK | Pulmonary outpatients |
| Step tests | | | | | | |
| de Camargo et al[35] | 2011 | CST | 32^g | 46 (15) | Brazil | |
| Karloh et al[36] | 2013 | CST | 10 | 38 (12) | Brazil | Pulmonary outpatients and general public |
| Kramer et al[37] | 1999 | 15SOT | 96 | Severe group = 30 (10); moderate group = 57 (4); mild group = 83 (15) | Israel | Pulmonary rehabilitation |
| Starobin et al[38] | 2006 | 15SOT | 50 | 46.3 (19.9) | Israel | Pulmonary outpatients |
| Lower limb functional tests involving the sit-to-stand transition | | | | | | |
| Aguilaniu et al[39] | 2014 | 3-min chair rise | 40 | 51 (9) | France | Stable patients, setting not reported |
| Bernabeu-Mora et al[29] | 2016 | 5STS | 137 | 50 (17) | Spain | Pulmonary outpatients |
| Jones et al[40] | 2013 | 5STS | 475^h | 47.6 (20.6) | UK | Pulmonary outpatients and pulmonary rehabilitation |
| Ozalevli et al[41] | 2007 | 60STS | 53 | 46 (9) | Turkey | Pulmonary outpatients |
| Puhan et al[42] | 2013 | 60STS | 374 | 64% GOLD II (moderate); 22% GOLD III (severe); 14% GOLD IV (very severe) | Switzerland and the Netherlands | Primary care |
| Zanini et al[43] | 2015 | 30STS, 60STS | 60 | 46 (14), 51 (16)^i | Italy | Inpatient pulmonary rehabilitation |
| Al Haddad et al[44] | 2016 | TUG | 119 | 59 (18) | UK | Pulmonary outpatients |
| Albarrati et al[45] | 2016 | TUG | 520 | 58 (19) | UK | |
| Butcher et al[51] | 2012 | TUG, 30STS | 13 | 47.9 (13.9) | Canada | Pulmonary rehabilitation and pulmonary outpatients |
| De Buyser et al[46] | 2013 | TUG | 28 | | Belgium | Community |
| Marques et al[47] | 2016 | TUG | 60 | 62 (23) | Portugal | Primary care centers and hospital outpatients |
| Mesquita et al[48] | 2013 | TUG | 95 | 33 (26-42)^f | The Netherlands | Outpatient clinics |
| Mesquita et al[49] | 2016 | TUG | 500 | 46 (32-63)^f | The Netherlands | Outpatient pulmonary rehabilitation |
| Roig et al[50] | 2010 | TUG | 21 | 47.2 (12.9) | Canada | |

a FEV1 = forced expiratory volume in 1 second, COPD = chronic obstructive pulmonary disease, 2MWT = 2-minute walk test, 5MWT = 5-minute walk test, 3MST = 3-minute constant-rate shuttle test, 30mWT = 30-m walk test, 4MGS = 4-m gait speed, 10MGS = 10-m gait speed, CST = Chester Step Test, 15SOT = 15-step oximetry test, 5STS = 5-times sit-to-stand test, 60STS = 60-second sit-to-stand test, GOLD II to IV = Global Initiative for Chronic Obstructive Lung Disease severity classifications, 30STS = 30-second sit-to-stand test, TUG = Timed Up & Go Test.
b Reported as percent predicted mean (standard deviation) FEV1 unless otherwise indicated.
c Included 36 for validity but 12 for reliability.
d Included 45 for reliability/validity but 9 for responsiveness.
e Included 586 for validity but 80 for test/retest reliability.
f Reported as median (25th-75th centile).
g Included 32 for repeatability but only 9 for validity with cycle exercise test.
h Included 475 for validity, 50 for repeatability, and 239 for responsiveness.
i Data are presented separately for mean values of FEV1% pred for the 2 intervention groups in the study: group 1 = strength-specific training plus usual pulmonary rehabilitation, FEV1% pred = 46 (14); group 2 = usual pulmonary rehabilitation, FEV1% pred = 51 (16).

Table 2. Reliability and Validity of Short Walking Distance and Gait Speed Tests in People With Chronic Obstructive Pulmonary Disease a Types of Tests Study Year No. of Participants Test or Protocol Time Interval Distance or Difference Speed b Between Trials b Reliability Validity Reliability Coefficient (Intraclass Correlation Coefficient) Measurement COSMIN Error c Comparison Correlation COS- d MIN Short walking tests (distance measured) Eiser 2003 57 2MWT, 120-m et al 23 circuit track 3 trials (T1 T3) on same day (30-min break), repeated on 3 consecutive days 155.2 m in T1, 157.7 m in T2, 159 m in T3 2.5 m (T1 and T2), 3.8 m (T1 and T3), 1.3 m (T2 and T3) Variability from within-subject changes = 5.1% Poor Validity not Gloeckl et 2016 26 2MWT, 30-m al 24 track N/A (1 trial conducted) 149.7 m Reliability not Nadir SpO2 in 6MWT.81 Poor Knox 1988 12 5MWT, et al 25 rectangular corridor (length not ) 4 trials on same day, repeated on 3 consecutive days 192 m in T1 on day 1, 254 m in T12 on day 3 Mean increase from T1 to T12 = 33% Poor FEV1.46 Leung et 2006 45 2MWT, 30-m al 26 track 3 trials on same day (20- min break) 129.5 m in T1, 129.8 m in T2, 130.3 m in T3 0.3 m (T1 and T2), 0.8 m (T1 and T3), 0.4 m (T2 and T3).9994 3.1 to 3.8 m in T1, 3.1 to 4.6 m in T2, 2.1 to 3.0 m in T3 (95% LOA) 6MWT.937 CPET (VO2 max).454 CPET (VO2 max/kg).555 Gait speed tests (speed measured) Andersson 2011 47 30mWT, et al 28 usual and fast speeds; static start 2 trials, 7 14 d apart Fast speed: 1.14 min in T1, 1.15 min in T2; usual speed: 1.55 min in T1, 1.60 min in T2 0.01 (95% CI = 0.40 to 0.43).87 5.9% 6MWT Fast speed:.78; usual speed:.73 Bernabeu- Mora et al 29 2016 137 4MGS, usual speed; static start N/A 4.7 s (thus, 1.18 m/s) Reliability not 6MWT (to discriminate 6MWD < 350 m) Cut point = 4 s, AUC = 0.72, Se = 79.6, Sp = 49.4 (Continued) September 2017 Volume 97 Number 9 Physical Therapy 931

Table 2. Continued. Types of Tests Gait speed tests (speed measured) and merge cells. Study Year No. of Participants Test or Protocol Karpman 2014 70 4MGS, et al. 31 usual and fast speeds; dynamic start 10MGS, usual speed; dynamic start Karpman 2014 130 (85 for et al 30 PAL) 4MGS, usual and fast speeds; 2-m acceleration zone Kon et al 33 2013 80 for test-retest reliability, 58 for interobserver reliability, 586 for validity 4MGS, usual speed; static start Time Interval 2 trials on same day (5- to 10-s break) 2 trials on same day (5- to 10-s break) N/A (1 trial only 24 48 h, test-retest Reliability Validity Distance or Difference Speed b Between Trials b Reliability Coefficient (Intraclass Correlation Coefficient) Measurement COSMIN Error c Comparison Correlation COS- d MIN Fast speed: 1.13 m/s; usual speed: 1.68 m/s Fast speed: 0.01 m/s (95% CI = 0.03 to 0.01); usual speed: 0.02 m/s (95% CI = 0.04 to 0.01) Fast speed:.95; usual speed:.95 Fast speed: 4.4%; usual speed: 4.8% Validity not 1.27 m/s <0.01 m/s (95% CI = 0.02 to 0.01).97 3.3% Validity not Fast speed: 1.63 m/s; usual speed: 1.11 m/s Reliability not PAL Fast speed:.28; usual speed:.24 Good 6MWT Fast speed:.80; usual speed:.77 Good Discriminate 6MWD 350 m Cut point = 0.9 m/s, AUC = 0.87, Se = 64%, Sp = 90% Good Discriminate 6MWD 200m Cut point = 0.8 m/s, AUC = 0.98, Se = 82%, Sp = 97% Good 0.89 m/s 0.0005 m/s.97 1.5% Good ISWT.78 (Continued) 932 Physical Therapy Volume 97 Number 9 September 2017

Table 2. Continued. Types of Tests Study Year No. of Participants Test or Protocol Time Interval Distance or Difference Speed b Between Trials b Reliability Validity Reliability Coefficient (Intraclass Correlation Coefficient) Measurement COSMIN Error c Comparison Correlation COS- d MIN Simultaneously (2 observers), interrater 0.01 m/s.99 1.4% Good % predicted FEV1 rs =.10 Good MRC rs =.55 Good SGRQ total rs =.55 Good Rozenberg 2014 29 10MGS, et al 34 usual and fast speeds; acceleration zone 2 trials on same day (5-min break), repeated on 3 d in 1 wk Fast speed: 74.3 m/min; usual speed: 60.3 m/min Fast speed: 3.1 m/min (95% CI = 1.5 to 4.7) on days 1 and 2; usual speed: 2.9 m/min (95% CI = 2.0 to 3.8) in T1 and T2 on day 1, 2.7 m/min (95% CI = 1.1 to 4.3) on days 1 and 2 Coefficient of repeatability: 7.1 m/min for fast speed, 7.5 (5 10) m/ min for usual speed Poor DLCO (fast speed).45 Poor FEV1/FVC (fast speed).54 Poor PaO2 (fast speed).42 Poor PaO2 (usual speed).42 Poor a COSMIN = Consensus-Based Standards for the Selection of Health Measurement Instruments, 2MWT = 2-minute walk test, T = trial, N/A = not applicable, SpO2 = peripheral oxygen saturation, 6MWT = 6-minute walk test, 5MWT = 5-minute walk test, FEV1 = forced expiratory volume in 1 second, LOA = limits of agreement, CPET = cardiopulmonary exercise test, VO2max = maximal oxygen uptake, 30mWT = 30-m walk test, 4MGS = 4-m gait speed, 6MWD = 6-minute walk distance, AUC = area under the receiver operating characteristic curve, Se = sensitivity, Sp = specificity, 10MGS = 10-m gait speed, PAL = physical activity level, ISWT = incremental shuttle walk test, rs = Spearman correlation coefficient, MRC = Medical Research Council dyspnea scale, SGRQ = St George Respiratory Questionnaire, DLCO = diffusing capacity of the lungs for carbon monoxide, FVC = forced vital capacity, PaO2 = partial pressure of arterial oxygen. b Reported as mean unless otherwise indicated. 
c Reported as standard error of measurement unless otherwise indicated. d Reported as Pearson correlation coefficient (r) unless otherwise indicated.

bronchodilator than with placebo (nebulized saline) in 38 participants with COPD. In a well-powered study 32 (n=301) examining the 4MGS before and after pulmonary rehabilitation (COSMIN rating=excellent for responsiveness), the minimal important difference was calculated as between 0.11 m/s (anchored against change in the ISWT) and 0.08 m/s (anchored against "feeling better"). For patients with low baseline gait speed (ie, lowest quartile, <0.77 m/s), the effect size after pulmonary rehabilitation was high at 1.0; in contrast, patients in the highest quartile (1.05 m/s and faster) showed a lower effect size of 0.2, indicating greater responsiveness to change in patients who were more frail. Responsiveness of the 4MGS to longitudinal change over a 12-month period was reported in 162 patients in the same study. Mean 4MGS declined during this period (−0.04 m/s), and change in 4MGS was correlated with change in ISWD (r=.45; P<.001).

Step tests in COPD: reliability, validity, and responsiveness. Same-day test-retest reliability of the Chester Step Test was found to be excellent (ICC=.99, 95% CI=.97–.99) in a single study of 32 participants with COPD (COSMIN rating=fair). 35 The mean difference between tests was −1.1 steps/min (95% limits of agreement, −20.2 to 17.9 steps). Strong positive correlations (Tab. 3) were reported between the number of steps taken in the Chester Step Test and 6MWD 35,36 and ISWT highest level, 36 although these studies were limited by small sample sizes, with COSMIN ratings of fair to poor. The cadence required in the Chester Step Test increased every 2 minutes; the average test time tolerated by people with COPD was between 3.8 and 4 minutes, with frequent desaturation (SpO2 <88% in 18 of 32 participants) and a high proportion of test cessation due to participant dyspnea (63%).
35 Using the 15-step exercise oximetry test, Kramer and colleagues 37 examined desaturation time (calculated as the area below 98% saturation on the SpO2/test time graph) and reported inverse correlations with 6MWD and lung function measures (Tab. 3). In contrast, no correlation was reported by Starobin and colleagues 38 between the 15-step exercise oximetry test results and either the cardiopulmonary exercise test or the 6MWD, although it is unclear which of the measured test variables this relates to. No studies investigated responsiveness to change in step tests.

Lower limb functional tests involving the sit-to-stand transition in COPD: reliability. Variations of sit-to-stand tests evaluated in people with COPD involved recording either the time taken to perform 5 repetitions of this movement (5STS, taking 14–15 sec 29,40 ) or the number of chair rises performed in 30 seconds, 43,51 60 seconds, 41–43 or 3 minutes. 39 Also included was evaluation of the TUG test in people with COPD, in which the time taken for participants to stand up, walk at a regular pace for 3 meters (indicated by a line on the floor), turn around, walk back, and sit down was recorded. 44,45,47,51 Reliability, validity, and responsiveness and ratings of study quality are reported in Table 4. Test-retest reliability of the 5STS was examined in 1 study (n=50, COSMIN rating=good) 40 and reported to have a high ICC (.97) with no significant bias to indicate a learning effect. Although no studies of reliability were identified for the 30- or 60-second sit-to-stand tests, variations of the 3-minute sit-to-stand test that used different rates of external pacing also demonstrated reliable results (ICC=.82–.90 and no significant bias between test repetitions). 39 Test-retest reliability of the TUG test was described in 4 studies of older patients with COPD (mean age >70 years, Tab. 4).
44,47,48,50 Of the 2 studies that provided detailed analysis of the TUG test trials, 1 reported an acceptable ICC (ie, greater than .90) and no significant differences between first and second trials (COSMIN rating=fair) 47 ; the other 48 found an acceptable ICC and lower standard error of measurement between the second and third trials only (COSMIN rating=good). A floor effect was observed in this group of lower limb functional tests, with some participants unable to rise from the chair independently to commence the 60-second sit-to-stand test (35 participants, or 8% of sample 42 ) and 5STS (15% of sample 40 ), or unable to keep up with the higher pacing on the 3-minute chair rise test (3 patients). 39

Lower limb functional tests involving the sit-to-stand transition in COPD: validity. A moderately strong inverse correlation (r=−.59) between 5STS time and ISWT score was reported by Jones et al 40 (n=475, COSMIN rating=excellent); correlations between timed sit-to-stand tests and the 6MWD ranged from r=.44 43 to .75 41 (COSMIN ratings both fair). In contrast, the TUG test consistently had a high inverse correlation with the 6MWD (COSMIN ratings all excellent, Tab. 4). The ability of the TUG test to discriminate those who walked less than 350 m on the 6MWT was evaluated in 2 studies, with TUG test cut points of 8.4 seconds 45 and 11.2 seconds 49 recommended. Comparisons of lower limb functional tests involving the sit-to-stand transition with different measures of quadriceps strength were highly varied in results (Tab. 4). The 5STS had a moderate inverse correlation with quadriceps maximal voluntary contraction (rs=−.33 40 ), similar to that of the TUG test with isokinetic quadriceps torque (rs=−.33 49 ), both in highly rated studies (COSMIN rating=excellent).

Lower limb functional tests involving the sit-to-stand transition in COPD: responsiveness. Change in response to pulmonary rehabilitation programs was reported for 3 types of sit-to-stand tests and for the TUG test (Tab. 4).
In 239 participants with moderate to severe COPD, a significant reduction in median 5STS time (−1.4 sec) was found after pulmonary rehabilitation (8-week, twice-weekly outpatient program), which correlated significantly but weakly with change in ISWT 40 (r=.13, P<.05). The minimal important difference in 5STS time was estimated at between 1.3 seconds (anchored against "feeling much better" or "better") and 1.4 seconds (anchored

Table 3. Validity of Step Tests in People With Chronic Obstructive Pulmonary Disease a

| Study | Year | No. of Participants | Test | Comparison | Correlation (Pearson r) | P Value | COSMIN |
| --- | --- | --- | --- | --- | --- | --- | --- |
| de Camargo et al 35 | 2011 | 32 | CST, 20-cm step: no. of steps taken | 6MWD (2 tests) | .60 | .001 | |
| | | | | CPET (peak W) (n = 11) | .69 | .02 | Poor |
| | | | | FEV1 | .43 | .02 | |
| Karloh et al 36 | 2013 | 10 | CST, 17-cm step | 6MWD | .85 (CST last level achieved), .76 (CST steps taken) | <.01 (CST last level achieved), .02 (CST steps taken) | Poor |
| | | | | ISWT (level) | .67 (CST last level achieved) | <.01 (CST last level achieved) | Poor |
| Kramer et al 37 | 1999 | 96 | 15-step exercise oximetry test (15SOT) (desaturation area), 20-cm step | 6MWD (n = 51) | .65, b .7 c | <.01, b <.05 c | |
| | | | | DLCO | .52 b | <.05 | |
| | | | | FEV1 | .65 b | <.05 | |
| | | | | FVC | .60 b | <.05 | |
| Starobin et al 38 | 2006 | 50 | 15SOT test results, 20-cm step | 6MWD | .13 | .34 | Poor |
| | | | | CPET | 0.07 | .64 | Poor |
| | | | | FEV1 | 0.44 | .001 | Poor |

a COSMIN = Consensus-Based Standards for the Selection of Health Measurement Instruments, CST = Chester Step Test, 6MWD = 6-minute walk distance, CPET = cardiopulmonary exercise test, FEV1 = forced expiratory volume in 1 second, ISWT = incremental shuttle walk test, DLCO = diffusing capacity of the lungs for carbon monoxide, FVC = forced vital capacity.
b Correlation with the main test outcome variable, desaturation area in the 15SOT.
c Correlation with the secondary test outcome variable, 15SOT test time.

against change in ISWT and St George Respiratory Questionnaire, COSMIN rating=good). Another study comparing usual pulmonary rehabilitation with pulmonary rehabilitation plus a strength program 43 reported significant improvements in 30-second STS count with both interventions, and changes in 30- and 60-second STS in the group with extra strength training (COSMIN rating=fair).
In a sample of 378 participants, TUG test time improved after intensive pulmonary rehabilitation (40-day program, either inpatient for 8 weeks, 5 days/week, or outpatient for 16 weeks, 3 to 2 half days/week) by a mean of 0.5 seconds (effect size=0.16, COSMIN rating=excellent). 49 Using a clinically meaningful change in the 6MWD (30 m) and distribution-based methods as anchors, the authors estimated the minimal clinically important difference to be between 0.9 and 1.4 seconds; notably, this was greater than the actual change observed during the intensive rehabilitation program. 49

Discussion

Twenty-nine studies evaluating the psychometric properties of 11 different short (<6 min) walking and lower limb functional exercise tests in people with COPD were examined in this systematic review. This review updates and adds to a previous analysis 19 in 3 important ways. First, while Bisca and colleagues 19 reported on the measurement properties of 17 unique studies, the current study updates that review by evaluating measurement properties in 29 studies, including 7 published since the end of 2014. Second, this review is the first to include the psychometric properties of short (2- and 5-min) walking distance tests, drawn from 6 studies. Third, the new studies in this review add clinically important information about STS 43 and TUG 49 test responsiveness following pulmonary rehabilitation (a common intervention in COPD) and about the discriminative ability of the TUG, 44,45 4MGS, and STS tests, 29 supported by excellent study quality ratings in most cases.

Comprehensive information (reliability, validity, and responsiveness, including estimates of minimal important difference) of high quality is now available for the 4MGS, 5STS, and TUG tests. The shorter of these tests (4MGS and 5STS) were highly repeatable with no learning effect.
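The comparison drawn above between the observed TUG change and its estimated minimal clinically important difference can be sketched numerically. The snippet below is only an illustration: it assumes the effect size was computed as mean change divided by a standard deviation (the text does not state the exact definition), and the function names are hypothetical.

```python
def implied_sd(mean_change: float, effect_size: float) -> float:
    """Standard deviation implied by effect size = |mean change| / SD.

    ASSUMPTION: this standardization is not confirmed by the source.
    """
    return abs(mean_change) / effect_size


def compare_with_mcid(mean_change: float, mcid_low: float, mcid_high: float) -> str:
    """Place the magnitude of an observed change relative to an MCID range."""
    magnitude = abs(mean_change)
    if magnitude < mcid_low:
        return "below MCID range"
    if magnitude <= mcid_high:
        return "within MCID range"
    return "above MCID range"


# Values quoted in the text for the TUG test after pulmonary rehabilitation.
tug_mean_change_s = 0.5          # mean improvement, seconds
tug_effect_size = 0.16
tug_mcid_range_s = (0.9, 1.4)    # estimated MCID, seconds

print(f"Implied SD: {implied_sd(tug_mean_change_s, tug_effect_size):.1f} s")
print(compare_with_mcid(tug_mean_change_s, *tug_mcid_range_s))
```

Under that assumption, a 0.5-second change with an effect size of 0.16 corresponds to a standard deviation of roughly 3 seconds, and the observed mean change falls below the lower MCID estimate, matching the authors' remark.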
The included studies of the 2MWT differed in study population (Germany, United Kingdom, and China) and protocol, but together indicated high reliability, albeit with evidence of a learning effect between first and second walks. Interpretation of existing data on the short walking distance

Table 4. Reliability, Validity, and Responsiveness of Functional Exercise Tests Featuring the Sit-to-Stand Transition in People With Chronic Obstructive Pulmonary Disease a Study Year No. of Participants Aguilaniu et al 39 2014 40 3-min chair (48 cm) rise test, with minute 1 paced at 12 rises/min Bernabeu-Mora et al 29 Test Reliability Validity Responsiveness 3-min chair (48 cm) rise test, with minute 1 paced at 15 rises/min 3-min chair (48 cm) rise test, with minute 1 paced at 20 rises/min 2016 137 5STS (chair height not reported) Jones et al 40 2013 50 for reliability, 475 for validity, 239 for responsiveness 5STS (chair height = 48 cm) Reliability Test Details Test-retest (2–10 d between visits) ICC = .82 (95% CI = .61 to .93) Test-retest (2–10 d between visits) ICC = .90 (95% CI = .82 to .94) Test-retest (2–10 d between visits) ICC = .90 (95% CI = .82 to .95) Test-retest (24–48 h) ICC = .97, mean difference between tests = 0.04 s (95% CI = 0.21 to 0.20), interrater ICC = .99 COSMIN Comparison Correlation COSMIN 6MWD r = .54–.65 for the 3 test versions 6MWT (to discriminate 6MWD <350 m) Cut point = 13 s, AUC = 0.71, Se = 75.6, Sp = 45.9 Good ISWT rs = .59, ISWT to discriminate <170 m AUC = 0.82 Intervention COSMIN Pulmonary rehabilitation, 2 times/wk, 8 wk; MCID estimates of 1.3–1.7 s Good Quadriceps MVC rs = .33, P <.01 MRC rs = .43, P <.01 SGRQ (total) rs = .35, P <.01 CAT rs = .31, P <.01 Ozalevli 2007 53 60STS (chair et al 41 height = 46 cm) 6MWD r = .75, Quadriceps strength (manual muscle test grade = 0–5) Nottingham Health Profile (physical mobility category) rs = .65, P <.01 r = .63, P <.01

Table 4. Continued. Study Year No. of Participants Puhan 2013 374 60STS (chair et al 42 height = 46–48 cm) Zanini et 2015 60 30STS (chair al 43 height = 47 cm) Al Haddad et al 44 2016 22 for interrater, 87 for test-retest, 119 for validity Test Reliability Validity Responsiveness 60STS (chair height = 47 cm) TUG (chair height not reported) Albarrati 2016 520 TUG (chair et al 45 height = 45 cm) Reliability Test Details Interrater (simultaneous) ICC = .99 Test-retest (same day) mean difference (second test − first test) = 0.6 s (SD = 0.9 s) COSMIN Comparison Correlation COSMIN Association with 2-y mortality Association with CRQ 1-RM (isokinetic bilateral knee extension) Hazard ratio = 0.58 (95% CI = 0.4 to 0.85, P = .004) per 5 more STS repetitions b Effect of 5 more STS repetitions: CRQ dyspnea score = 0.26 (95% CI = 0.19 to 0.34) c Intervention COSMIN Good Good r = .48, Pulmonary rehabilitation, 3 wk, 15 sessions, plus strength training; % change correlated with change in 1-RM: r = .47, P = .009 6MWD r = .44, 1-RM (isokinetic bilateral knee extension) r = .36, P = .005 Pulmonary rehabilitation, 3 wk, 15 sessions, plus strength training; no change in correlation 6MWD r = .48, Poor 6MWD r = .74, Poor BODE r = .53, Discriminate falls in previous year 6MWD 6MWT (to discriminate 6MWD <360 m) Hand grip strength Cut point = 12 s, AUC = 0.77, Se = 0.74, Sp = 0.74 r = .71, P = .001 Cut point = 8.4 s, AUC = 0.82, Se = 0.9, Sp = 0.8 r = .27, P = .001

Table 4. Continued. Study Year No. of Participants Butcher 2012 13 TUG (chair et al 51 height not reported) Marques 2016 60 for interrater, et al 47 41 for test-retest Test Reliability Validity Responsiveness 30STS (chair height not reported) TUG (chair height not reported) Mesquita 2013 95 TUG (chair et al 48 height not reported) Mesquita 2016 500 (378 for et al 49 responsiveness) TUG (chair height not reported) Reliability Test Details Interrater ICC = .997 Test-retest (48–72 h) ICC = .921, SEM = 12.1%, MDC95 = 2.68 s Test-retest (same day): trials 1 and 2, ICC = .85, SEM = 1.76 s, MDC95 = 4.88 s; trials 2 and 3, ICC = .95, SEM = 0.79 s, MDC95 = 2.19 s COSMIN Good Comparison Correlation COSMIN mMRC r = .34, P = .001 SGRQ (total) r = .37, CAT r = .37, Exacerbation frequency CPET (cycle) Steep ramp anaerobic test Quadriceps strength (isometric) CPET (cycle) Steep ramp anaerobic test Quadriceps strength (isometric) 6MWD (n = 497) rs = .24, No significant correlation r = .71, P <.05 r = .92, P <.01 No significant correlation r = .67, P <.05 r = .76, P <.05 rs = .74, Intervention COSMIN Poor Poor Poor Poor Poor Poor 40-d pulmonary rehabilitation program (inpatient or outpatient); mean change = 0.5 s (95% CI = 0.6 to 0.3 s), P <.0001, effect size = 0.16; correlation of changes in TUG and 6MWD: r = .32, P <.0001; distribution-based MCID = 0.9–1.4 s

Table 4. Continued. Study Year No. of Participants Test Reliability Validity Responsiveness Reliability Test Details COSMIN Comparison Correlation COSMIN 6MWT (to discriminate 6MWD <350 m) CPET (cycle) (n=479) Quadriceps torque (isokinetic) (n=452) mMRC (n=495) Cut point = 11.2 s, AUC = 0.86, Se = 0.75, Sp = 0.83 rs = .44, rs = .33, rs = .49, CAT (n=488) rs = .27, SGRQ total (n=488) Updated BODE (n=492) HADS anxiety (n=485) HADS depression (n=485) rs = .41, rs = .55, rs = .21, rs = .26, Intervention COSMIN Roig 2010 21 TUG (chair et al 50 height not reported) Test-retest (same day) ICC = .95 Poor

a COSMIN = Consensus-Based Standards for the Selection of Health Measurement Instruments; ICC = intraclass correlation coefficient; 6MWD = 6-minute walk distance; r = Pearson correlation coefficient; 5STS = 5-times sit-to-stand test; 6MWT = 6-minute walk test; AUC = area under the receiver operating characteristic curve; Se = sensitivity; Sp = specificity; ISWT = incremental shuttle walk test; rs = Spearman correlation coefficient; MCID = minimal clinically important difference; MVC = maximal voluntary contraction; MRC = Medical Research Council dyspnea scale; SGRQ = St George Respiratory Questionnaire; CAT = Chronic Obstructive Pulmonary Disease Assessment Test; 60STS = 60-second sit-to-stand test; STS = sit-to-stand test; CRQ = Chronic Respiratory Questionnaire; 30STS = 30-second sit-to-stand test; 1-RM = 1-repetition maximum; TUG = Timed Up & Go Test; BODE = index including body mass index, airflow obstruction, dyspnea, and exercise capacity; mMRC = modified MRC; CPET = cardiopulmonary exercise test; SEM = standard error of measurement; MDC95 = minimal detectable change at the 95% level of confidence (calculated on the basis of the SEM and reflecting the smallest change in a participant's score that indicated a true change in the participant's performance rather than the result of random measurement error 47,48 ); HADS = Hospital Anxiety and Depression Scale.
b Mortality hazard ratio was adjusted for forced expiratory volume in 1 second, dyspnea, and inhaled corticosteroid/long-acting beta-agonist (ICS/LABA) use.
c Effect was adjusted for age and forced expiratory volume in 1 second; there were also smaller but significant effects for other CRQ domains.

tests is limited by poor to fair study quality, but the high correlation of 2MWD with 6MWD suggests potential for consideration as an alternative field test of exercise tolerance, consistent with a study of patients with sporadic inclusion body myositis 52 in which 2MWD was also highly correlated with 6MWD (r=.97, n=62). However, investigation of the responsiveness and minimal important difference in 2MWD before and after clinical interventions in patients with COPD, with larger samples, would also be required to improve the usefulness of this outcome measure.

A key purpose of established functional exercise tests in COPD is to prescribe exercise training intensity. This use was developed through a body of work that measured oxygen uptake during the 6MWT and ISWT in people with COPD and compared it with oxygen uptake during maximal exercise testing. 53–55 Subsequently, direct measurements of oxygen uptake during walking training at the intensity prescribed from both the 6MWT 56 and ISWT 57 confirmed that participants were working at 77% and 70% of peak oxygen uptake, respectively. Although the 2MWT was correlated moderately with peak oxygen uptake, 26 further studies are needed to consider whether exercise training intensity could be prescribed from this test.

Although this review demonstrates their robust measurement properties, there is clear agreement in the literature that tests including the 4MGS, 5STS, and TUG are not intended as replacements for the 6MWT or ISWT.
49,58,59 In contrast, there is now evidence (supported by excellent COSMIN ratings for most studies in this group) for their suitability as stratification or screening

assessment tools in people with COPD. Gait speeds in people with COPD of less than 0.9 to 1.0 m/s were associated with a 6MWD of less than 350 m, itself a strong predictor of mortality in this population; 60–62 a cut point of 0.8 m/s, aligned with a 6MWD of less than 200 m, had even greater sensitivity and specificity. 30 Cut points for the TUG test identified people with COPD who walked less than 350 m or 360 m in the 6MWT 45,49 or experienced a fall in the previous year 44 with high levels of sensitivity and specificity. Similarly, the 5STS provided a strong model (area under the curve 0.82) to identify those with an ISWT distance associated with greater mortality 40 (ie, less than 170 m). 63 In this way, these short tests have direct clinical relevance regarding prognosis of people with COPD. This is consistent with the use of gait speed 64 and sit-to-stand tests 65 alone or as part of the Short Physical Performance Battery (SPPB) 66 in prognosis of older community-based adults. As in the use of the SPPB, the use of 2 short exercise tests (eg, 4MGS and the 5STS or TUG) may provide complementary information in patients with COPD.

Short functional exercise tests in people with COPD may also have greater applicability at times when exercise ability is limited, such as during hospitalization for an acute exacerbation. For example, during hospital admission for an exacerbation of COPD, total time spent walking has been measured using accelerometers as 7.2 min/day, 67 so a 6-minute walking test would represent a large proportion of daily activity. Some short exercise tests (eg, 4MGS, 5STS) may also be applicable where space is limited and use of a long straight walking track is not possible, such as in a hospital ward setting. However, no studies identified in this review were conducted in hospital inpatient settings at the time of admission for an exacerbation of COPD.
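To make the screening use of a cut point concrete, the sketch below computes sensitivity and specificity for a gait speed threshold. The participant values are hypothetical and invented purely for illustration; they are not data from any included study. Only the 0.9 m/s threshold and the 350 m reference criterion come from the text, and the function name is an assumption.

```python
def sensitivity_specificity(values, has_condition, cut_point, positive_if_below=True):
    """Return (sensitivity, specificity) for a screening cut point.

    A test is 'positive' when the value is below the cut point (eg, slow
    gait speed) or, with positive_if_below=False, above it (eg, long TUG time).
    Raises ZeroDivisionError if one of the groups is empty.
    """
    tp = fp = tn = fn = 0
    for value, condition in zip(values, has_condition):
        positive = value < cut_point if positive_if_below else value > cut_point
        if positive and condition:
            tp += 1
        elif positive and not condition:
            fp += 1
        elif not positive and condition:
            fn += 1
        else:
            tn += 1
    return tp / (tp + fn), tn / (tn + fp)


# HYPOTHETICAL 4MGS values (m/s) and whether each person walked <350 m in the 6MWT.
gait_speed = [0.62, 0.75, 0.85, 0.93, 0.95, 1.02, 1.10, 1.21, 0.88]
low_6mwd = [True, True, True, True, False, False, False, False, False]

se, sp = sensitivity_specificity(gait_speed, low_6mwd, cut_point=0.9)
print(f"Sensitivity = {se:.2f}, specificity = {sp:.2f}")
```

For a timed test such as the TUG, where longer times indicate poorer function, the same function would be called with `positive_if_below=False`.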
Studies that evaluate measurement properties of short exercise tests such as the 4MGS and 5STS at the time of a hospital admission for an exacerbation of COPD could improve the usefulness of these outcome measures in this setting.

Shorter exercise tests show applicability for patients with COPD who have greater functional limitation, and a ceiling effect in higher-functioning individuals. The 4MGS showed greater responsiveness to change after pulmonary rehabilitation in patients who were more frail. 32 This is in contrast with a study of gait speed over 5 and 10 meters in patients with acute stroke, which found less of a ceiling effect than other measures, including the Berg Balance Scale. 68 Similarly, the TUG test showed responsiveness to change after pulmonary rehabilitation only in participants with COPD who had low baseline values and lower functional exercise capacity. 49 Minimal important differences estimated for the 4MGS, 5STS, and TUG in people with stable COPD following pulmonary rehabilitation have been summarized in this review, providing guidance for interpretation of change in these measures.

Limited data on reliability and validity were identified for the Chester Step Test in people with COPD, and it was not well tolerated due to dyspnea and oxygen desaturation. Oxygen desaturation during the 15-step exercise oximetry test showed some relationship with desaturation during the 6MWT in only 1 of the 2 identified studies. Exercise-induced oxygen desaturation occurs frequently in people with COPD during the 6MWT, and this has been associated with disease severity and prognosis, 15 but similar relationships were not established in the included studies of the 15-step oximetry test. Although all tests demonstrated at least some validity through correlation with established functional exercise tests (6MWT/ISWT), this association was strongest in the 2MWT.
This is to be expected, as the muscle activation in stair climbing 69 and the sit-to-stand maneuver 70 differs greatly from that in level walking, especially in knee extensor force generation. In people with severe COPD, stair climbing time was not associated with 6MWD, and stair climbing resulted in more dyspnea, higher blood lactate, and lung hyperinflation (measured by total lung capacity difference post exercise). 71 In addition, short, high-intensity exercise (such as the 15-step oximetry test) draws on anaerobic and aerobic energy systems in different proportions than longer periods of exercise at lower intensity. 72 These factors should be considered when interpreting the validity of other tests by comparison with the 6MWT or ISWT.

The search strategy utilized was peer-reviewed and clearly defined, but no strategy is guaranteed, and the use of "exercise test" as a keyword rather than "functional test" may have restricted identification of certain tests; however, we believe this was adequately addressed by other aspects of the systematic search (hand-searching of references and citations). Inclusion and exclusion criteria for this review designated the selection of tests that had a duration of less than 6 minutes, which may require clarification regarding step tests. In the included papers examining the psychometric properties of the Chester Step Test in people with COPD, test duration was reported as less than 6 minutes (mean time 3.8 [1.8] min 35 or completion of 2.1 [0.9] levels, where each level was 1 min long). 36 In contrast, studies of a variation of the Chester Step Test, the Incremental Step Test, were excluded from this review on examination of the full text, as test duration in people with COPD was reported as greater than or equal to 6 minutes in some 73 or at least 50% 74 of cases. Similarly, studies examining the Six-Minute Step Test 75–77 were excluded, as they did not meet the a priori criterion of duration less than 6 minutes.
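For readers interpreting change scores from the tests reviewed here, the Table 4 footnote notes that MDC95 is calculated on the basis of the SEM. The sketch below assumes the commonly used formula MDC95 = 1.96 × √2 × SEM; the source does not spell out the formula, so this is an assumption, although it reproduces the same-day TUG values listed in Table 4. The function name is hypothetical.

```python
import math


def mdc95(sem: float) -> float:
    """Minimal detectable change at the 95% confidence level.

    ASSUMPTION: MDC95 = 1.96 * sqrt(2) * SEM, a commonly used formula.
    The sqrt(2) term reflects measurement error in both of the two
    scores being compared.
    """
    return 1.96 * math.sqrt(2) * sem


# SEM values for same-day TUG test-retest trials as listed in Table 4.
for label, sem_s in [("trials 1 and 2", 1.76), ("trials 2 and 3", 0.79)]:
    print(f"TUG {label}: SEM = {sem_s} s, MDC95 = {mdc95(sem_s):.2f} s")
```

With these inputs the results round to 4.88 s and 2.19 s, in line with the tabulated MDC95 values, which suggests (but does not prove) that the cited study used this formula.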
Methodological quality of the included studies evaluated by the COSMIN checklist was generally fair to poor, with fewer studies rated as good or excellent. This is likely due to the conservative nature of the COSMIN checklist, 22 as scoring requires that the overall rating for each measurement property assessed be given according to the lowest score assigned over multiple criteria. Criterion validity using the 6MWT or ISWT was limited in some cases where only a single repetition of this measure was conducted. Included studies were all conducted in stable outpatients with mostly moderate to severe COPD. Therefore, findings for the included exercise tests are limited to these populations, and judgments cannot be made regarding their use in other groups.
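The "lowest score counts" rule described above for COSMIN ratings can be illustrated with a short sketch. The criterion names and ratings below are hypothetical examples, not items taken from the checklist or from any included study.

```python
# Ordinal scale from worst to best, as used for the ratings in this review.
RATING_ORDER = ["poor", "fair", "good", "excellent"]


def overall_rating(item_ratings):
    """Overall rating for one measurement property: the lowest item rating."""
    if not item_ratings:
        raise ValueError("at least one item rating is required")
    return min(item_ratings, key=RATING_ORDER.index)


# HYPOTHETICAL criterion ratings for a single reliability study.
reliability_items = {
    "sample size": "fair",
    "independent administrations": "excellent",
    "appropriate time interval": "good",
    "statistical method": "excellent",
}

print(overall_rating(list(reliability_items.values())))
```

Because a single low-rated criterion determines the overall result, a study rated excellent on most criteria can still receive an overall rating of fair or poor.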