Comparing Generic and Condition-Specific Preference-Based Measures in Epilepsy: EQ-5D-3L and NEWQOL-6D


VALUE IN HEALTH 20 (2017) 687–693
Available online at www.sciencedirect.com
journal homepage: www.elsevier.com/locate/jval

Comparing Generic and Condition-Specific Preference-Based Measures in Epilepsy: EQ-5D-3L and NEWQOL-6D

Brendan Mulhern, MRes 1,2,*, Joshua Pink, PhD 3, Donna Rowen, PhD 2, Simon Borghs, MSc 4, Thomas Butt, PhD 5, Dyfrig Hughes, PhD 6, Antony Marson, MD 7, John Brazier, PhD 2

1 Centre for Health Economics Research and Evaluation, University of Technology Sydney, Sydney, New South Wales, Australia; 2 School of Health and Related Research, University of Sheffield, Sheffield, South Yorkshire, UK; 3 Warwick Medical School, University of Warwick, Coventry, West Midlands, UK; 4 UCB Pharma, Slough, Berkshire, UK; 5 UCB Biopharma, Brussels, Belgium; 6 Centre for Health Economics and Medicines Evaluation, Bangor University, Bangor, Gwynedd, UK; 7 Department of Molecular and Clinical Pharmacology, University of Liverpool, Liverpool, Merseyside, UK

ABSTRACT
Background: There is debate about the psychometric characteristics of the three-level EuroQol five-dimensional questionnaire (EQ-5D-3L) for use in epilepsy. In response to the concerns, an epilepsy-specific preference-based measure (NEWQOL-6D) was developed. The psychometric characteristics of the NEWQOL-6D, however, have not been assessed.
Objectives: To investigate the validity and responsiveness of the EQ-5D-3L and the Quality of Life in Newly Diagnosed Epilepsy Instrument-six dimensions (NEWQOL-6D) for use in the assessment of treatments for newly diagnosed focal epilepsy.
Methods: The analysis used data from the Standard And New Antiepileptic Drugs trial including patients with focal epilepsy. We assessed convergent validity using correlations, and known-group validity across different epilepsy and general health severity indicators using analysis of variance and effect sizes. The responsiveness of the measures to change over time was assessed using standardized response means.
We also assessed agreement between the measures.
Results: There was some level of convergence and agreement between the measures in terms of utility score but divergence in the concepts measured by the descriptive systems. Both instruments displayed known-group validity, with significant differences between severity groups, and generally slightly larger effect sizes for the NEWQOL-6D across the epilepsy-specific indicators. Evidence for responsiveness was less clear, with small to moderate standardized response means demonstrating different levels of change across different indicators.
Conclusions: There was an overall tendency for the NEWQOL-6D to better reflect differences across groups, but this does not translate into large absolute utility differences. Both the EQ-5D-3L and the NEWQOL-6D show some evidence of validity for providing utility values for economic evaluations in newly diagnosed focal epilepsy.
Keywords: epilepsy, EQ-5D, NEWQOL-6D, preference-based measures, psychometrics.
Copyright © 2017, International Society for Pharmacoeconomics and Outcomes Research (ISPOR). Published by Elsevier Inc.

Introduction

In the economic evaluation of health technologies, the quality-adjusted life-year (QALY) is often the preferred measure of health outcome. The QALY combines a period of time spent in a health state and the associated quality of life into a single figure to assess the overall effectiveness of treatments. The quality aspect of the QALY is known as health utility, and can be derived from preference-based measures of health such as the three-level EuroQol five-dimensional questionnaire (EQ-5D-3L), which has a utility value set based on the preferences of the general population and is anchored at 1 (equivalent to full health) and 0 (equivalent to dead).

Epilepsy is a neurological disorder characterized by unprovoked, recurring seizures.
There are different types of seizures, with focal (partial-onset) seizures originating within specific areas in the brain [1]. It is a prevalent condition and is present in around 1% of the world's population [2]. Patients newly diagnosed with epilepsy are usually initiated on antiepileptic drug treatment, with dose and subsequent regimens adapted according to treatment response and tolerability. For those whose seizures are difficult to control, multiple concurrent therapies can be used to maximize seizure control [3,4].

Given the nature of the condition, both epilepsy and antiepileptic drug treatment have a range of impacts on health-related quality of life (HRQOL), including mental health (e.g., anxiety about the onset of seizures), and cognitive and social functioning. For the economic evaluation of treatments for epilepsy, it is important to measure the HRQOL (and utilities) of the patient groups receiving treatment, and the EQ-5D-3L is often used for this purpose. Nevertheless, although the EQ-5D-3L is a generic measure used to compare health status across conditions, there is evidence suggesting that it may lack some level of validity for use in epilepsy. Stavem et al. [5] found that a number of the EQ-5D-3L dimensions did not discriminate well between patients using antiepileptic drugs and patients with neurologic comorbidities. Furthermore, the EQ-5D-3L may not capture all the HRQOL dimensions relevant to epilepsy patients [6] and may not be sensitive to seizure control [7]. In terms of responsiveness, the EQ-5D-3L may be insensitive to health status at the upper end because of a ceiling effect (with many patients reporting the best health state, with no problems, on the measure). Furthermore, the recall period of 1 day may impact the sensitivity of the instrument when used for episodic conditions such as epilepsy [8].

* Address correspondence to: Brendan Mulhern, Centre for Health Economics Research and Evaluation, University of Technology Sydney, 1-59 Quay Street, Sydney, New South Wales 2000, Australia. E-mail: Brendan.mulhern@chere.uts.edu.au. 1098-3015$36.00 see front matter. http://dx.doi.org/10.1016/j.jval.2016.03.1860

In response to these findings, an epilepsy-specific preference-based measure was developed for use in the assessment of HRQOL (NEWQOL-6D) [9]. The NEWQOL-6D was developed using baseline data from the Standard And New Antiepileptic Drugs (SANAD) study [10]. The SANAD study included patients with predominantly newly developed focal epilepsy for whom one treatment (carbamazepine) was considered the standard treatment, and who were randomly assigned to receive treatment as usual or one of four other treatments (gabapentin, lamotrigine, oxcarbazepine, or topiramate). Condition-specific preference-based measures may improve health care decision making in areas in which the psychometric validity of generic measures is lacking by providing more precise utility values that are based on the dimensions of HRQOL impacted by the condition. At present, however, it is unclear whether the NEWQOL-6D provides an improvement in the validity of utility values generated for the calculation of QALYs, and therefore psychometric evidence comparing generic and condition-specific measures is required.
The aim of this study was to investigate the psychometric performance of the EQ-5D-3L and the NEWQOL-6D to provide evidence for the use of the utility values generated in the assessment of epilepsy treatments in patients newly diagnosed with focal epilepsy using an existing data set. This includes the following objectives:
1. to assess the psychometric validity of the EQ-5D-3L and the NEWQOL-6D;
2. to assess the responsiveness (i.e., change in utility) of the EQ-5D-3L and the NEWQOL-6D to clinical outcomes in newly diagnosed focal epilepsy.

Methods

Measures

EQ-5D-3L
The EQ-5D-3L [11] is the most widely used generic preference-based measure internationally. It is recommended by the National Institute for Health and Care Excellence in the United Kingdom for use in the economic evaluation of interventions [12], and is accepted by other reimbursement agencies around the world [13,14]. The EQ-5D-3L measures health across five dimensions (mobility, self-care, usual activities, pain/discomfort, and anxiety/depression), each with three response levels (none, some, and extreme/unable), and therefore describes 243 (3^5) health states. A selection of health states was valued by a representative sample of the UK general population using the time trade-off preference elicitation technique [15]. This produced a utility scale with a range from 1 for the best state (11111) to -0.594 for the worst state (33333), with negative values being equivalent to states valued as worse than dead [15]. The EQ visual analogue scale (which reports health on a 0 [worst imaginable] to 100 [best imaginable] scale) was also used to provide an indication of the overall health of the sample.

NEWQOL-6D
The NEWQOL-6D [9] was developed from the NEWQOL instrument [16] using baseline data from a large randomized controlled trial of antiepileptic drugs (the SANAD study; see Data Set section). The NEWQOL-6D assesses epilepsy-specific HRQOL across six dimensions (worry about attacks [seizures], depression, memory, concentration, control, and stigma), each with four severity response levels, therefore describing 4096 (4^6) possible health states. To produce the utility scale, 50 health states were valued by a representative sample of the UK general population using time trade-off (to promote comparability with the UK EQ-5D-3L tariff [15]). This produced a value set with a range from 0.954 (for the best state, 111111) to 0.341 (for the worst state, 444444). Further work with the instrument has shown that general population and patient valuations of the NEWQOL-6D health states have limited differences [17].

Data Set
The SANAD study data [10] from three time points (baseline, year 1, and year 2) were used for the analysis. The data for each time point included the EQ-5D-3L, the NEWQOL battery [16] that allows the NEWQOL-6D to be calculated, and a range of clinical and other health indicators, including the number of seizures, self-reported health change over the last year, cognitive problems, and health care services received. The Hospital Anxiety and Depression Scale (HADS) [18], a widely used measure of anxiety and depression that provides cutoffs for possible and probable caseness, was also used as a comparative measure. Patients with complete data for each of the measures across each time point were included.

Analysis
A range of psychometric analyses were carried out to assess the validity and responsiveness of the EQ-5D-3L and the NEWQOL-6D.

Descriptive analysis
The means, standard deviations, medians, ranges, and completion rates of the utility scores for each of the measures at each time point were assessed.

Validity
Psychometric analysis of validity assesses the extent to which an instrument measures what it is intended to measure (in this case HRQOL).
It should be noted that the validity of patient-reported outcome measures such as the EQ-5D-3L and the NEWQOL-6D is difficult to prove because in many health conditions there is rarely a criterion standard against which to compare. This is the case in epilepsy, for which many of the measures lack psychometric evidence, and it means that validity is compared across measures using various well-established tests and guidelines about the magnitude of the relationship.

Concurrent validity
Concurrent validity assesses the strength of the relationship between measures of the same concept using Pearson correlations. Strong correlations indicate that the preference-based measures are assessing related constructs. Correlations are considered weak if they are less than 0.3, moderate if they are between 0.3 and less than 0.7, and strong if they are 0.7 or higher [19]. Correlations between the EQ-5D-3L and the NEWQOL-6D utility scores and dimensions at each time point (baseline, year 1, year 2) were assessed.
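The correlation thresholds quoted above can be expressed as a short sketch. This is an illustration only, not the authors' analysis code: the function names `pearson_r` and `correlation_strength` are hypothetical, and applying the thresholds to the absolute value of the coefficient is an assumption, since the text does not say how negative correlations were treated.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient for two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length (>= 2)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def correlation_strength(r):
    """Label a coefficient with the weak/moderate/strong thresholds from the text [19].

    Assumption: the thresholds are applied to the absolute value of r.
    """
    size = abs(r)
    if size < 0.3:
        return "weak"
    if size < 0.7:
        return "moderate"
    return "strong"


# Utility-score correlations reported in the Results (baseline, year 1, year 2).
for r in (0.617, 0.651, 0.647):
    print(r, correlation_strength(r))
```

Under these thresholds all three reported utility-score correlations fall in the moderate band, which is how the Results describe them. `pearson_r` does not guard against a constant input, for which the coefficient is undefined.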

Agreement between the utility values was assessed using Bland-Altman plots, which are used to visualize the relationship between measures scored on the same scale. This was done by plotting the mean of the EQ-5D-3L and the NEWQOL-6D values against the difference (i.e., NEWQOL-6D minus EQ-5D-3L) and assessing the number of points outside the agreement range (±2 SD from the mean) [20].

Known-group validity
Known-group validity assesses the extent to which scores on an instrument differ across groups in which they are expected to differ (e.g., severity groups). This was measured using analysis of variance difference tests and by calculating effect sizes between the groups, which provide a standard indicator of the size of the difference, and are calculated as (score 2 - score 1)/SD of score 1. Effect sizes of less than 0.2 are considered small, 0.5 moderate, and 0.8 large [19]. We assessed differences in the EQ-5D-3L and the NEWQOL-6D utility scores at baseline across a range of groups:
1. total number of seizures at baseline (defined as four groups: 1–3, 4–5, 6–9, and 10+ total lifetime seizures);
2. self-reported health status (defined as three groups from a five-point Likert scale: very good/excellent, good, and fair/poor);
3. remission status (two groups: remission and no remission, where remission was defined as no attack in the past year); this was carried out at year 1 and year 2, and can be interpreted as known-group validity because the utility of those in different frequency groups can be assessed at one time point;
4. HADS anxiety and depression dimension caseness cutoffs at baseline (defined as no case [a score below 8] and possible case [a score of 8+]).

Responsiveness
Responsiveness assesses the ability of an instrument to detect changes in HRQOL over time. Responsiveness, however, is difficult to prove because there is no true reference measure of HRQOL.
Therefore, clinical variables that are expected to have an effect on HRQOL are used to identify groups of patients whose HRQOL is expected to have changed, and the responsiveness of the utility measures is compared across the groups. We assessed responsiveness for full completers at each time point, and compared baseline to year 1, and baseline to year 2, by assessing mean change in utility overall, and across identified change groups using the standardized response mean (SRM; [T2 - T1]/SD of change), which is a standard indicator of change across measures and time points. SRMs of less than 0.2 are considered small, 0.5 moderate, and 0.8 large [19]. The following analyses were carried out:
1. Assessment of floor and ceiling effects (if a large proportion is at the floor [lowest possible score] or ceiling [highest possible score], then this impairs the ability of the measure to pick up decreases or increases in HRQOL, respectively);
2. Mean change and SRMs based on self-reported health transition anchor (assessing whether health has improved, stayed the same, or worsened over the last year);
3. Mean change and SRMs based on remission, defined as those with no seizures during the past year. This was done between year 1 and year 2 because it is difficult to interpret baseline frequency in terms of when the attacks occurred outside the SANAD study period. Although the sample consists of patients newly diagnosed with focal epilepsy, the exact time of seizure occurrence is unclear;
4. Change in the actual NEWQOL-6D and EQ-5D-3L health states reported over time.

Results

Sample Characteristics
The characteristics of the sample at baseline are presented in Table 1, with the sample size at years 1 and 2 also indicated. Fifty-five percent of the patients in the sample were men, and the mean age was 40 years. The mean EQ visual analogue scale score was 68, with a wide range of health states reported (from 5 to 100).
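The two standardized indicators defined in the Methods, the between-group effect size and the SRM, can be sketched as follows. This is an illustrative sketch rather than the study's code: the function names are hypothetical, the paired utilities are invented, and using the sample standard deviation (`statistics.stdev`) for the SD of change is an assumption.

```python
from statistics import mean, stdev


def effect_size(mean_1, sd_1, mean_2):
    """(score 2 - score 1) / SD of score 1, from group summary statistics."""
    return (mean_2 - mean_1) / sd_1


def standardized_response_mean(t1_scores, t2_scores):
    """Mean of the paired changes (T2 - T1) divided by the SD of those changes.

    Assumption: the SD of change is the sample standard deviation.
    """
    changes = [after - before for before, after in zip(t1_scores, t2_scores)]
    return mean(changes) / stdev(changes)


# EQ-5D-3L baseline summary values from Table 4:
# 1-3 seizures (mean 0.811, SD 0.26) compared with 4-9 seizures (mean 0.737).
es = effect_size(0.811, 0.26, 0.737)
print(round(abs(es), 2))  # magnitude of the difference in SD units

# Hypothetical paired utilities for five patients (illustrative values only).
baseline = [0.60, 0.70, 0.80, 0.75, 0.90]
year_1 = [0.70, 0.72, 0.85, 0.75, 0.88]
print(round(standardized_response_mean(baseline, year_1), 2))
```

With the Table 4 inputs the formula gives a negative value, because utility is lower in the group with more seizures; its magnitude of 0.28 matches the effect size listed in Table 4 for that comparison.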
Table 1 Background characteristics (baseline).

Characteristics           n (%)
Sample size
  Baseline                1611
  Year 1                  1288
  Year 2                  1134
Sex: male                 885 (55.2)
Age (y)
  Mean ± SD               39.7 ± 16.5
  Range                   16–86
In employment             1297 (78.8)
Marital status
  Married/partner         885 (55.1)
  Single                  548 (34.1)
  Divorced/separated      134 (8.3)
  Widowed                 38 (2.4)
EQ-VAS
  Mean ± SD               68.14 ± 20.7
  Range                   5–100

EQ-VAS, EuroQol Visual Analogue Scale.

Descriptive Analysis
The completion rate of the EQ-5D-3L was higher than that of the NEWQOL-6D. Descriptive statistics for the EQ-5D-3L and the NEWQOL-6D are presented in Table 2. The mean EQ-5D-3L scores were lower than those of the NEWQOL-6D across each time point, with a larger standard deviation. The median of the EQ-5D-3L is higher in each case, reflecting the skewed nature of the data. The EQ-5D-3L data are bimodal, with a large number of responses at 1, whereas the NEWQOL-6D data are unimodal. Bimodal EQ-5D-3L data are common because of the difference in utility values between the best state (11111 with a utility value of 1) and the next best state (11211 with a utility value of 0.883).

Table 2 Descriptive analysis of the EQ-5D-3L and NEWQOL-6D utility scores.

Descriptive statistics for each measure   Baseline         Year 1           Year 2
EQ-5D-3L
  N, fully completing (% completion)      1563 (98)        1244 (98)        1091 (98)
  Mean ± SD                               0.735 ± 0.30     0.769 ± 0.29     0.789 ± 0.28
  Median                                  0.848            0.848            0.848
  Range                                   -0.38 to 1       -0.454 to 1      -0.239 to 1
NEWQOL-6D
  N, fully completing (% completion)      1508 (94)        1156 (90)        1023 (90)
  Mean ± SD                               0.766 ± 0.13     0.798 ± 0.13     0.805 ± 0.13
  Median                                  0.786            0.832            0.844
  Range                                   0.341 to 0.957   0.341 to 0.957   0.341 to 0.957

EQ-5D-3L, three-level EuroQol five-dimensional questionnaire; NEWQOL-6D, Quality of Life in Newly Diagnosed Epilepsy Instrument-six dimensions.

Concurrent Validity
Correlations indicate moderate convergence between the EQ-5D-3L and the NEWQOL-6D utility scores at baseline (0.617), year 1 (0.651), and year 2 (0.647). Nevertheless, the correlations between the NEWQOL-6D and the EQ-5D-3L dimensions at baseline presented in Table 3 indicate a low level of overlap except for the depression dimension. Only baseline correlations are presented because the correlation pattern is similar at both follow-up time points. Figure 1 displays the Bland-Altman plot at baseline, with the central line indicating the mean and the lines on either side indicating ±2 SD away from the mean, and shows that disagreement between the measures is larger at the more severe end of the scale, where the mean of the measures is low, and many data points are outside the limits of agreement. The relationship is similar at both follow-up time points.

Known-Group Validity
From Table 4 it can be seen that both the EQ-5D-3L and the NEWQOL-6D significantly discriminate between groups defined by number of seizures (EQ-5D-3L: F(2,1543) = 28.02, P = 0.000; NEWQOL-6D: F(2,1496) = 32.50, P = 0.000), with a small to moderate effect size. For self-reported health, the differences are significant (EQ-5D-3L: F(2,1557) = 333.03, P = 0.000; NEWQOL-6D: F(2,1505) = 212.72, P = 0.000), with a moderate to large effect size. A large effect size is observed for both measures between groups defined by impact of attack score (EQ-5D-3L: F(1,808) = 127.33, P = 0.000; NEWQOL-6D: F(1,793) = 174.07, P = 0.000), HADS-A (anxiety) (EQ-5D-3L: F(1,1526) = 418.51, P = 0.000; NEWQOL-6D: F(1,1483) = 670.57, P = 0.000), and HADS-D (depression) (EQ-5D-3L: F(1,1538) = 625.82, P = 0.000; NEWQOL-6D: F(1,1493) = 792.24, P = 0.000) cutoff scores. A moderate to large effect size for remission status is observed for both the EQ-5D-3L and the NEWQOL-6D.

Responsiveness
The EQ-5D-3L has a large ceiling effect at each time point, with 31.6% reporting the best health state (11111) at baseline.
This may limit the ability of the descriptive system to measure increases in health, and is a common finding with the EQ-5D-3L [21]. The NEWQOL-6D does not have floor or ceiling effects. Table 5 presents the responsiveness of the EQ-5D-3L and the NEWQOL-6D. The NEWQOL-6D has larger SRMs than the EQ-5D-3L for those who reported remission in number of seizures and for those who reported improved health over the last year, with utility values improving. The EQ-5D-3L is slightly more responsive in those who reported worsening health, although the SRMs are very similar. Overall, the SRMs are generally in the low to moderate range, with the largest SRMs reported between baseline and year 1.

The frequency of respondents changing response category and overall utility value differs between the preference-based measures. The NEWQOL-6D demonstrates a higher level of change than the EQ-5D-3L between baseline and year 1 at the utility (85% vs. 62%) and response category levels. At the response level, the number of patients changing category ranges from 33% (stigma) to 60% (worry) for the NEWQOL-6D and from 12% (mobility) to 35% (anxiety/depression) for the EQ-5D-3L.

Discussion

There is some evidence from earlier research that the EQ-5D-3L may not demonstrate a high level of validity for use in the measurement of HRQOL in epilepsy [5–7]. We have carried out psychometric analysis to compare the recently developed epilepsy-specific preference-based measure NEWQOL-6D and the EQ-5D-3L in a sample of patients newly diagnosed with focal epilepsy. This adds to the past work investigating the psychometric validity of the EQ-5D-3L, and establishes evidence about the validity of the NEWQOL-6D. The results suggest that there are similarities and differences between the measures, which impact on the psychometric characteristics, and are likely to affect subsequent cost-effectiveness analyses.
First, the mean EQ-5D-3L scores are lower, and this is commonly found when comparing the EQ-5D-3L to other generic and condition-specific preference-based measures due to comparative differences in the range of the utility scales and the concepts measured by the descriptive systems [22,23].

Table 3 The EQ-5D-3L and NEWQOL-6D dimension-level correlations (baseline).

NEWQOL-6D        EQ-5D-3L
                 Mobility   Self-care   Usual activities   Pain/discomfort   Anxiety/depression
Worry            0.221      0.169       0.326              0.243             0.327
Depression       0.235      0.217       0.355              0.303             0.633
Memory           0.267      0.207       0.346              0.271             0.319
Concentration    0.289      0.237       0.388              0.265             0.326
Control          0.307      0.255       0.387              0.288             0.357
Stigma           0.243      0.213       0.305              0.230             0.261

EQ-5D-3L, three-level EuroQol five-dimensional questionnaire; NEWQOL-6D, Quality of Life in Newly Diagnosed Epilepsy Instrument-six dimensions.
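The Bland-Altman agreement analysis described in the Methods (mean of the two utilities against NEWQOL-6D minus EQ-5D-3L, with limits at ±2 SD of the difference) can be sketched as below. This is a hedged illustration: `bland_altman` is a hypothetical helper, the six utility pairs are invented, and the use of the sample standard deviation is an assumption.

```python
from statistics import mean, stdev


def bland_altman(eq5d, newqol):
    """Return per-patient (mean, difference) pairs and the +/-2 SD agreement limits."""
    pair_means = [(a + b) / 2 for a, b in zip(eq5d, newqol)]
    diffs = [b - a for a, b in zip(eq5d, newqol)]  # NEWQOL-6D minus EQ-5D-3L
    centre = mean(diffs)
    spread = 2 * stdev(diffs)  # assumption: sample SD of the differences
    lower, upper = centre - spread, centre + spread
    outside = sum(1 for d in diffs if d < lower or d > upper)
    return {
        "points": list(zip(pair_means, diffs)),  # x = pair mean, y = difference
        "mean_diff": centre,
        "lower": lower,
        "upper": upper,
        "n_outside": outside,
    }


# Hypothetical utilities for six patients (illustrative values only).
eq5d = [1.0, 0.85, 0.76, 0.62, 0.52, -0.02]
newqol = [0.95, 0.84, 0.80, 0.74, 0.70, 0.45]
result = bland_altman(eq5d, newqol)
print(result["mean_diff"], result["lower"], result["upper"], result["n_outside"])
```

The `points` list holds the coordinates that would be plotted, and `n_outside` counts the observations beyond the limits of agreement, which is the quantity the Methods describe assessing.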

VALUE IN HEALTH 20 (2017) 687–693

Fig. 1 Bland-Altman plot at baseline (BL) of the EQ-5D-3L and the NEWQOL-6D. Note. The x-axis is the mean utility score of the EQ-5D-3L and the NEWQOL-6D, and the y-axis is the difference in utility scores between the measures. EQ-5D-3L, three-level EuroQol five-dimensional questionnaire; NEWQOL-6D, Quality of Life in Newly Diagnosed Epilepsy Instrument-six dimensions.

Table 4 Known-group validity of the EQ-5D-3L and NEWQOL-6D utilities.

Known group indicator                 EQ-5D-3L                         NEWQOL-6D
                                      N     Mean   SD    ES    Sig.    N     Mean   SD    ES    Sig.
No. of seizures last year (baseline)                           0.000                            0.000
  1–3                                 474   0.811  0.26                468   0.802  0.11
  4–9                                 447   0.737  0.29  0.28          432   0.765  0.12  0.34
  10+                                 625   0.676  0.33  0.21          599   0.738  0.14  0.23
Self-reported health
  Very good/excellent                 545   0.893  0.18        0.000   515   0.830  0.10        0.000
  Good                                581   0.787  0.22  0.59          554   0.776  0.11  0.54
  Fair/poor                           482   0.489  0.35  1.35          439   0.678  0.13  0.89
Remission status (year 1)
  No remission (>1 attack)            742   0.701  0.32        0.000   695   0.762  0.13        0.000
  Remission (0 attack)                495   0.876  0.20  0.88          456   0.855  0.10  0.93
Remission status (year 2)
  No remission (>1 attack)            513   0.700  0.31        0.000   479   0.756  0.14        0.000
  Remission (0 attack)                573   0.869  0.22  0.77          540   0.849  0.09  1.03
HADS-A cutoff
  No case                             743   0.880  0.19        0.000   718   0.840  0.08        0.000
  Possible case                       785   0.599  0.32  1.48          767   0.696  0.13  1.80
HADS-D cutoff
  No case                             1036  0.848  0.20        0.000   999   0.820  0.09        0.000
  Possible case                       504   0.501  0.34  1.02          496   0.657  0.13  1.81

EQ-5D-3L, three-level EuroQol five-dimensional questionnaire; ES, effect size; HADS-A, Hospital Anxiety and Depression Scale-Anxiety; HADS-D, HADS-Depression; NEWQOL-6D, Quality of Life in Newly Diagnosed Epilepsy Instrument-six dimensions; Sig., significance.

Table 5 Responsiveness of the EQ-5D-3L and the NEWQOL-6D.

Change indicator                      T0–T1                         T1–T2
                                      Mean utility change   SRM     Mean utility change   SRM
EQ-5D-3L
  Overall                             0.025                 0.10    0.005                 0.03
  Self-reported health change anchor
    Improved                          0.076                 0.38    0.051                 0.30
    Same                              0.019                 0.09    0.002                 0.01
    Declined                          0.092                 0.29    0.084                 0.35
  Seizure frequency, remission        0.045                 0.23    0.031                 0.18
NEWQOL-6D
  Overall                             0.027                 0.27    0.002                 0.02
  Self-reported health change anchor
    Improved                          0.056                 0.52    0.022                 0.24
    Same                              0.018                 0.20    0.008                 0.11
    Declined                          0.022                 0.19    0.036                 0.31
  Seizure frequency, remission        0.047                 0.53    0.019                 0.23

EQ-5D-3L, three-level EuroQol five-dimensional questionnaire; NEWQOL-6D, Quality of Life in Newly Diagnosed Epilepsy Instrument-six dimensions; SRM, standardized response mean.

Second, in terms of convergence, the utility scales are moderately correlated, but there are low correlations between the descriptive systems. This has been shown elsewhere [22–24] and suggests that these instruments are measuring some overlapping concepts, but there are some important differences, and the utility scales, developed using the same method, are complementary.

In terms of other differences between the descriptive systems, the EQ-5D-3L has a large ceiling effect, a common finding [21] that suggests that the EQ-5D-3L is not sensitive to small differences in health (i.e., the difference between no problems and some problems is too large to pick up small changes), but also cannot pick up improvements in health at follow-up among those who report the best EQ-5D-3L health state at baseline. The NEWQOL-6D does not display a ceiling effect, and has a larger number of respondents changing the reported health state between time points, in part because of the number of health states described (4096 vs. 243). This may mean that the NEWQOL-6D could be more sensitive in less severe epilepsy groups that are likely to be at the upper end of the utility scale. It should be noted that the five-level EQ-5D (EQ-5D-5L) [25] may help resolve the ceiling effect issue to some extent because it increases the number of health states to 3125. The EQ-5D-5L should now be considered for inclusion in trials in which cost-effectiveness analysis is required.

The EQ-5D-3L has a higher overall response rate, because it is administered as a five-item questionnaire, whereas the NEWQOL-6D is derived from a much longer battery measure. Furthermore, the EQ-5D-3L appeared before the instruments included in the NEWQOL battery [16] in the SANAD study questionnaire, and therefore the later measures may be affected by questionnaire fatigue. Both measures, however, have response rates of more than 90%. Finally, the recall period for the EQ-5D relates to today, and this may limit its sensitivity in a chronic, episodic disease such as epilepsy.
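The health-state counts quoted above (243, 4096, and 3125) follow from raising the number of levels to the power of the number of dimensions. The sketch below reproduces that arithmetic. The four-level structure assumed for the NEWQOL-6D is inferred from its six dimensions and the 4096 states reported, and the function name is hypothetical.

```python
def n_health_states(dimensions: int, levels: int) -> int:
    """Number of distinct profiles when every dimension has the same number of levels."""
    return levels ** dimensions


# (dimensions, levels per dimension) for each descriptive system.
INSTRUMENTS = {
    "EQ-5D-3L": (5, 3),
    "EQ-5D-5L": (5, 5),
    "NEWQOL-6D": (6, 4),  # four levels per dimension is an assumption consistent with 4096
}

if __name__ == "__main__":
    for name, (dims, levels) in INSTRUMENTS.items():
        print(f"{name}: {n_health_states(dims, levels)} health states")
```

A descriptive system with more states can register smaller movements between adjacent profiles, which is one reason given above for the NEWQOL-6D showing more respondents changing state between time points.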
In terms of other psychometric indicators, the EQ-5D-3L and the NEWQOL-6D perform similarly overall, which suggests that both measures have validity for use in the economic evaluation of treatments for newly diagnosed focal epilepsy. Both display generally good discriminant validity between clinical and general health severity groups, with the same category effect size. Therefore, both measures can be used with confidence to distinguish severity groups. Agreement between the measures is better at the less severe end of the utility scale, which is a common finding given differences in the range reported by the measures, with the EQ-5D-3L having a larger range than most utility instruments [26]. The evidence for responsiveness is less clear. There is an indication that the NEWQOL-6D has slightly better responsiveness among those who report general health and seizure frequency improvement, but the measures are more similar when assessing those who report worsening health between time points. Further research using clinical rather than self-reported indicators of change would be useful. The general determinants of HRQOL in epilepsy, which is determined not just by efficacy but also by comorbidities, treatment safety, and baseline severity of the condition [27], may provide context for some of the results reported. For example, in terms of the descriptive system, both measures are highly sensitive to anxiety and depression, which are well-established epilepsy comorbidities that are related to HRQOL regardless of seizure frequency [28]. Anxiety and depression, however, are not correlated entirely with treatment efficacy in epilepsy and so they do not effect much change in overall utility in studies of antiepileptic drug treatment. It is therefore difficult to delineate the impact of HRQOL in epilepsy on the sensitivity of and change in utility values measured (rather than at the dimension level) given the overall similar findings.
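The discriminant-validity comparison above relies on the effect sizes (ES) in Table 4. The following is a minimal sketch, assuming the ES is the difference between adjacent group means divided by the SD of the less severe group. That definition is an assumption on our part, because the methods section is not reproduced here, although it does appear to reproduce the self-reported health entries. The category labels use Cohen's conventional thresholds [19], and the function names are hypothetical.

```python
def effect_size(mean_less_severe: float, sd_less_severe: float,
                mean_more_severe: float) -> float:
    """Difference in group means scaled by the SD of the less severe group.

    Assumed definition; the paper's own formula is not shown in this section.
    """
    return (mean_less_severe - mean_more_severe) / sd_less_severe


def cohen_category(es: float) -> str:
    """Label an effect size using Cohen's conventional cut-offs (0.2, 0.5, 0.8)."""
    size = abs(es)
    if size < 0.2:
        return "negligible"
    if size < 0.5:
        return "small"
    if size < 0.8:
        return "medium"
    return "large"


if __name__ == "__main__":
    # Table 4, self-reported health: very good/excellent vs. good.
    eq5d = effect_size(0.893, 0.18, 0.787)    # Table 4 reports 0.59
    newqol = effect_size(0.830, 0.10, 0.776)  # Table 4 reports 0.54
    for name, es in (("EQ-5D-3L", eq5d), ("NEWQOL-6D", newqol)):
        print(f"{name}: ES = {es:.2f} ({cohen_category(es)})")
```

Comparing the category labels rather than the raw values is one way to read the statement that both measures show the same category of effect size.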
The analysis reported here raises wider issues about the use and acceptability of generic and condition-specific preference-based measures. For example, the EQ-5D-3L allows for comparisons of utilities across health conditions that in some cases are limited by the psychometric validity of the instrument in that area. By definition, condition-specific measures do not allow for comparisons across health areas [29,30], but may provide a set of complementary utility values that can be assessed alongside, for example, the EQ-5D, to allow for more holistic measurement of the HRQOL impacts of the condition. This might be particularly useful in settings in which a particular measure is recommended, but evidence relating to that measure in certain health conditions is mixed (e.g., the National Institute for Health and Care Excellence recommending the EQ-5D in England [12]). Furthermore, condition-specific measures such as the NEWQOL-6D may be limited in populations in which there are comorbidities (because it is not designed to pick up changes in those). Further psychometric comparisons of generic and condition-specific preference-based measures across a range of health areas are warranted. The differences in the psychometric indicators may be reflected in some ways in differences in cost-effectiveness results, in which the incremental cost-effectiveness ratio using the NEWQOL-6D values may be different when compared with that using the EQ-5D-3L values. Further work could investigate this issue in more detail across different trial data sets.

There are limitations to this analysis. Data were available from only one source (the SANAD trial that was also used to develop the NEWQOL-6D) and replication of the results on a different sample with different external clinical indicators and types of epilepsy would be of interest.
The time period studied in the SANAD trial, in which the measures were taken once a year over a 2-year observation period, is quite different from the typical randomized controlled trial duration used for drug registration purposes. We also did not have access to data about the type of seizure experienced by those included in the SANAD study, and this may have helped inform the utility values used. The baseline seizure frequency data are difficult to interpret because a patient with two seizures in the 12 months before starting treatment may have had them in the space of a few weeks before starting treatment or over a longer period. The analysis could also not fully assess the responsiveness of the measures by varying levels of response to antiepileptic drug treatment because we could define patients only as seizure-free or having one or more seizures. Finally, using samples based on those fully completing each of the measures also excludes a number of respondents, and potentially those in poorer health who may have not managed to complete the long NEWQOL battery, but is a valid criterion given that we are trying to compare the performance of measures across matched samples.

Conclusions

There is the indication that the NEWQOL may better reflect differences across groups particularly at the least severe end, but both measures perform at a similar level. Therefore, there is some evidence that both the EQ-5D-3L and the NEWQOL-6D have validity for providing utility values for use in the economic evaluation of interventions for newly diagnosed focal epilepsy and in subsequent decision making. Given the widespread use of the EQ-5D-3L (and its use across conditions given its generic nature), and the subsequent development of the EQ-5D-5L, which may increase responsiveness, both measures could be used alongside each other to provide complementary utility values from a generic and condition-specific perspective and a more holistic measurement of HRQOL in epilepsy.

Acknowledgments

Source of financial support: This study was funded by Epilepsy UK and UCB Biopharma.

REFERENCES

[1] Berg AT, Berkovic SF, Brodie MJ, et al. Revised terminology and concepts for organization of seizures and epilepsies: report of the ILAE Commission on Classification and Terminology, 2005–2009. Epilepsia 2010;51:676–85.
[2] Thurman DJ, Beghi E, Begley CE, et al. Standards for epidemiologic studies and surveillance of epilepsy. Epilepsia 2011;52:2–26.
[3] Shorvon SD, Reynolds EH. Unnecessary polypharmacy for epilepsy. BMJ 1977;1:1635–7.
[4] Shorvon SD, Reynolds EH. Reduction of polypharmacy for epilepsy. BMJ 1979;2:1023–5.
[5] Stavem K, Bjornaes H, Lossius MI. Properties of the 15D and EQ-5D utility measures in a community sample of people with epilepsy. Epilepsy Res 2001;44:179–89.
[6] Selai CE, Trimble MR, Price MJ, et al. Evaluation of health status in epilepsy using the EQ-5D questionnaire: a prospective, observational, 6-month study of adjunctive therapy with anti-epileptic drugs. Curr Med Res Opin 2005;21:733–9.
[7] Langfitt JT, Vickrey BG, McDermott MP. Validity and responsiveness of generic preference-based HRQOL instruments in chronic epilepsy. Qual Life Res 2006;15:899–914.
[8] Böckerman P, Johansson E, Saarni SI. Do established health-related quality-of-life measures adequately capture the impact of chronic conditions on subjective well-being? Health Policy 2011;100:91–5.
[9] Mulhern B, Rowen D, Jacoby A, et al. The development of a QALY measure for epilepsy: NEWQOL-6D. Epilepsy Behav 2012;24:36–43.
[10] Marson AG, Al-Kharusi AM, Alwaidh M, et al. The SANAD study of effectiveness of carbamazepine, gabapentin, lamotrigine, oxcarbazepine, or topiramate for treatment of partial epilepsy: an unblinded randomised controlled trial. Lancet 2007;369:1000–15.
[11] Brooks R. EuroQol: the current state of play. Health Policy 1996;37:53–72.
[12] National Institute for Health and Care Excellence. Guides to the Methods for Technology Appraisal. London: National Institute for Health and Care Excellence, 2013.
[13] Pharmaceutical Benefits Advisory Committee. Guidelines to Preparing Submissions to the Pharmaceutical Benefits Advisory Committee. Canberra: Australian Department of Health, 2013.
[14] Canadian Agency for Drugs and Technologies in Health. Guidelines for the Economic Evaluation of Health Technologies. Ottawa, ON: Canadian Agency for Drugs and Technologies in Health, 2006.
[15] Dolan P. Modeling valuations for EuroQol health states. Med Care 1997;35:1095–108.
[16] Abetz L, Jacoby A, Baker GA, McNulty P. Patient-based assessments of quality of life in newly diagnosed epilepsy patients: validation of the NEWQOL. Epilepsia 2000;41:1119–28.
[17] Mulhern B, Rowen D, Snape A, et al. Valuations of epilepsy-specific health states: a comparison of patients with epilepsy and the general population. Epilepsy Behav 2013;36:12–7.
[18] Zigmond AS, Snaith RP. The Hospital Anxiety and Depression Scale. Acta Psychiatr Scand 1983;67:361–70.
[19] Cohen J. Statistical Power Analysis for the Behavioral Sciences (2nd ed.). Hillsdale, NJ: Erlbaum, 1988.
[20] Bland MJ, Altman D. Statistical methods for assessing agreement between two methods of clinical measurement. Lancet 1986;327:307–10.
[21] Brazier J, Roberts J, Tsuchiya A, Busschbach J. A comparison of the EQ-5D and SF-6D across seven patient groups. Health Econ 2004;13:873–84.
[22] Rowen D, Young T, Brazier J, Gaugris S. Comparison of generic, condition-specific, and mapped health state utility values for multiple myeloma cancer. Value Health 2012;15:1059–68.
[23] Mulhern B, Rowen D, Brazier J, et al. Development of DEMQOL-U and DEMQOL-Proxy-U: generation of preference based indices from DEMQOL and DEMQOL-Proxy for use in economic evaluation. Health Technol Assess 2013;17:v–xv, 1–140.
[24] Mulhern B, Mukuria C, Barkham M, et al. Using preference based measures in mental health conditions: the psychometric validity of the EQ-5D and SF-6D. Br J Psychiatry 2014;205:236–43.
[25] Herdman M, Gudex C, Lloyd A, et al. Development and preliminary testing of the new five-level version of EQ-5D (EQ-5D-5L). Qual Life Res 2011;20:1727–36.
[26] Whitehurst D, Bryan S. Another study showing that two preference-based measures of health-related quality of life (EQ-5D and SF-6D) are not interchangeable. But why should we expect them to be? Value Health 2011;14:531–8.
[27] Taylor R, Sander J, Taylor R, Baker G. Predictors of health-related quality of life and costs in adults with epilepsy: a systematic review. Epilepsia 2011;52:2168–80.
[28] Kwan P, Yu E, Leung H, et al. Association of subjective anxiety, depression, and sleep disturbance with quality-of-life ratings in adults with epilepsy. Epilepsia 2009;50:1059–66.
[29] Brazier JE, Rowen D, Mavranezouli I, et al. Developing and testing methods for deriving preference-based measures of health from condition-specific measures (and other patient-based measures of outcome). Health Technol Assess 2012;16:32.
[30] Brazier J, Tsuchiya A. Preference-based condition-specific measures of health: what happens to cross programme comparability? Health Econ 2010;19:125–9.