Quality of Life. The assessment, analysis and reporting of patient-reported outcomes. Third Edition

PETER M. FAYERS
Institute of Applied Health Sciences, University of Aberdeen School of Medicine and Dentistry, Scotland, UK
and
Department of Cancer Research and Molecular Medicine, Norwegian University of Science and Technology (NTNU), Trondheim, Norway

and

DAVID MACHIN
Medical Statistics Group, School of Health and Related Research, University of Sheffield, Sheffield, UK
and
Department of Cancer Studies and Molecular Medicine, University of Leicester, Leicester, UK

WILEY Blackwell

Contents

Preface to the third edition
Preface to the second edition
Preface to the first edition
List of abbreviations

PART 1 Developing and Validating Instruments for Assessing Quality of Life and Patient-Reported Outcomes

1 Introduction
  1.1 Patient-reported outcomes
  1.2 What is a patient-reported outcome?
  1.3 What is quality of life?
  1.4 Historical development
  1.5 Why measure quality of life?
  1.6 Which clinical trials should assess QoL?
  1.7 How to measure quality of life
  1.8 Instruments
  1.9 Computer-adaptive instruments
  1.10 Conclusions

2 Principles of measurement scales
  2.1 Introduction
  2.2 Scales and items
  2.3 Constructs and latent variables
  2.4 Single global questions versus multi-item scales
  2.5 Single-item versus multi-item scales
  2.6 Effect indicators and causal indicators
  2.7 Psychometrics, factor analysis and item response theory
  2.8 Psychometric versus clinimetric scales
  2.9 Sufficient causes, necessary causes and scoring items
  2.10 Discriminative, evaluative and predictive instruments
  2.11 Measuring quality of life: reflective, causal and composite indicators?
  2.12 Further reading
  2.13 Conclusions

3 Developing a questionnaire
  3.1 Introduction
  3.2 General issues
  3.3 Defining the target population
  3.4 Phases of development
  3.5 Phase 1: Generation of issues
  3.6 Qualitative methods
  3.7 Sample sizes
  3.8 Phase 2: Developing items
  3.9 Multi-item scales
  3.10 Wording of questions
  3.11 Face and content validity of the proposed questionnaire
  3.12 Phase 3: Pre-testing the questionnaire
  3.13 Cognitive interviewing
  3.14 Translation
  3.15 Phase 4: Field-testing
  3.16 Conclusions
  3.17 Further reading

4 Scores and measurements: validity, reliability, sensitivity
  4.1 Introduction
  4.2 Content validity
  4.3 Criterion validity
  4.4 Construct validity
  4.5 Repeated assessments and change over time
  4.6 Reliability
  4.7 Sensitivity and responsiveness
  4.8 Conclusions
  4.9 Further reading

5 Multi-item scales
  5.1 Introduction
  5.2 Significance tests
  5.3 Correlations
  5.4 Construct validity
  5.5 Cronbach's α and internal consistency
  5.6 Validation or alteration?
  5.7 Implications for formative or causal items
  5.8 Conclusions

6 Factor analysis and structural equation modelling
  6.1 Introduction
  6.2 Correlation patterns
  6.3 Path diagrams
  6.4 Factor analysis

  6.5 Factor analysis of the HADS questionnaire
  6.6 Uses of factor analysis
  6.7 Applying factor analysis: Choices and decisions
  6.8 Assumptions for factor analysis
  6.9 Factor analysis in QoL research
  6.10 Limitations of correlation-based analysis
  6.11 Formative or causal models
  6.12 Confirmatory factor analysis and structural equation modelling
  6.13 Chi-square goodness-of-fit test
  6.14 Approximate goodness-of-fit indices
  6.15 Comparative fit of models
  6.16 Difficulty-factors
  6.17 Bifactor analysis
  6.18 Do formative or causal relationships matter?
  6.19 Conclusions
  6.20 Further reading, and software

7 Item response theory and differential item functioning
  7.1 Introduction
  7.2 Item characteristic curves
  7.3 Logistic models
  7.4 Polytomous item response theory models
  7.5 Applying logistic IRT models
  7.6 Assumptions of IRT models
  7.7 Fitting item response theory models: Tips
  7.8 Test design and validation
  7.9 IRT versus traditional and Guttman scales
  7.10 Differential item functioning
  7.11 Sample size for DIF analyses
  7.12 Quantifying differential item functioning
  7.13 Exploring differential item functioning: Tips
  7.14 Conclusions
  7.15 Further reading, and software

8 Item banks, item linking and computer-adaptive tests
  8.1 Introduction
  8.2 Item bank
  8.3 Item evaluation, reduction and calibration
  8.4 Item linking and test equating
  8.5 Test information
  8.6 Computer-adaptive testing
  8.7 Stopping rules and simulations
  8.8 Computer-adaptive testing software
  8.9 CATs for PROs
  8.10 Computer-assisted tests

  8.11 Short-form tests
  8.12 Conclusions
  8.13 Further reading

PART 2 Assessing, Analysing and Reporting Patient-Reported Outcomes and the Quality of Life of Patients

9 Choosing and scoring questionnaires
  9.1 Introduction
  9.2 Finding instruments
  9.3 Generic versus specific
  9.4 Content and presentation
  9.5 Choice of instrument
  9.6 Scoring multi-item scales
  9.7 Conclusions
  9.8 Further reading

10 Clinical trials
  10.1 Introduction
  10.2 Basic design issues
  10.3 Compliance
  10.4 Administering a quality-of-life assessment
  10.5 Recommendations for writing protocols
  10.6 Standard operating procedures
  10.7 Summary and checklist
  10.8 Further reading

11 Sample sizes
  11.1 Introduction
  11.2 Significance tests, p-values and power
  11.3 Estimating sample size
  11.4 Comparing two groups
  11.5 Comparison with a reference population
  11.6 Non-inferiority studies
  11.7 Choice of sample size method
  11.8 Non-Normal distributions
  11.9 Multiple testing
  11.10 Specifying the target difference
  11.11 Sample size estimation is pre-study
  11.12 Attrition
  11.13 Circumspection
  11.14 Conclusion
  11.15 Further reading

12 Cross-sectional analysis
  12.1 Types of data
  12.2 Comparing two groups
  12.3 Adjusting for covariates
  12.4 Changes from baseline
  12.5 Analysis of variance
  12.6 Analysis of variance models
  12.7 Graphical summaries
  12.8 Endpoints
  12.9 Conclusions

13 Exploring longitudinal data
  13.1 Area under the curve
  13.2 Graphical presentations
  13.3 Tabular presentations
  13.4 Reporting
  13.5 Conclusions

14 Modelling longitudinal data
  14.1 Preliminaries
  14.2 Auto-correlation
  14.3 Repeated measures
  14.4 Other situations
  14.5 Modelling versus area under the curve
  14.6 Conclusions

15 Missing data
  15.1 Introduction
  15.2 Why do missing data matter?
  15.3 Types of missing data
  15.4 Missing items
  15.5 Methods for missing items within a form
  15.6 Missing forms
  15.7 Methods for missing forms
  15.8 Simple methods for missing forms
  15.9 Methods of imputation that incorporate variability
  15.10 Multiple imputation
  15.11 Pattern mixture models
  15.12 Comments
  15.13 Degrees of freedom
  15.14 Sensitivity analysis
  15.15 Conclusions
  15.16 Further reading

16 Practical and reporting issues
  16.1 Introduction
  16.2 The reporting of design issues
  16.3 Data analysis
  16.4 Elements of good graphics
  16.5 Some errors
  16.6 Guidelines for reporting
  16.7 Further reading

17 Death, and quality-adjusted survival
  17.1 Introduction
  17.2 Attrition due to death
  17.3 Preferences and utilities
  17.4 Multi-attribute utility (MAU) measures
  17.5 Utility-based instruments
  17.6 Quality-adjusted life years (QALYs)
  17.7 Utilities for traditional instruments
  17.8 Q-TWiST
  17.9 Sensitivity analysis
  17.10 Prognosis and variation with time
  17.11 Alternatives to QALY
  17.12 Conclusions
  17.13 Further reading

18 Clinical interpretation
  18.1 Introduction
  18.2 Statistical significance
  18.3 Absolute levels and changes over time
  18.4 Threshold values: percentages
  18.5 Population norms
  18.6 Minimal important difference
  18.7 Anchoring against other measurements
  18.8 Minimum detectable change
  18.9 Expert judgement for evidence-based guidelines
  18.10 Impact of the state of quality of life
  18.11 Changes in relation to life events
  18.12 Effect size statistics
  18.13 Patient variability
  18.14 Number needed to treat
  18.15 Conclusions
  18.16 Further reading

19 Biased reporting and response shift
  19.1 Bias
  19.2 Recall bias

  19.3 Selective reporting bias
  19.4 Other biases affecting PROs
  19.5 Response shift
  19.6 Assessing response shift
  19.7 Impact of response shift
  19.8 Clinical trials
  19.9 Non-randomised studies
  19.10 Conclusions

20 Meta-analysis
  20.1 Introduction
  20.2 Defining objectives
  20.3 Defining outcomes
  20.4 Literature searching
  20.5 Assessing quality
  20.6 Summarising results
  20.7 Measures of treatment effect
  20.8 Combining studies
  20.9 Forest plot
  20.10 Heterogeneity
  20.11 Publication bias and funnel plots
  20.12 Conclusions
  20.13 Further reading

Appendix 1: Examples of instruments
  Generic instruments
    E1 Sickness Impact Profile (SIP)
    E2 Nottingham Health Profile (NHP)
    E3 SF36v2 Health Survey Standard Version
    E4 EuroQoL EQ-5D-5L
    E5 Patient Generated Index of quality of life (PGI)
  Disease-specific instruments
    E6 European Organisation for Research and Treatment of Cancer QLQ-C30 (EORTC QLQ-C30)
    E7 Elderly cancer patients module (EORTC QLQ-ELD14)
    E8 Functional Assessment of Cancer Therapy - General (FACT-G)
    E9 Rotterdam Symptom Checklist (RSCL)
    E10 Quality of Life in Epilepsy Inventory (QOLIE-89)
    E11 Paediatric Asthma Quality of Life Questionnaire (PAQLQ)
  Domain-specific instruments
    E12 Hospital Anxiety and Depression Scale (HADS)
    E13 Short-Form McGill Pain Questionnaire (SF-MPQ)
    E14 Multidimensional Fatigue Inventory (MFI-20)

  ADL and disability
    E15 (Modified) Barthel Index of Disability (MBI)

Appendix 2: Statistical tables
  Table T1: Normal distribution
  Table T2: Probability points of the Normal distribution
  Table T3: Student's t-distribution
  Table T4: The χ² distribution
  Table T5: The F-distribution

References
Index