Time Trade-Off and Ranking Exercises Are Sensitive to Different Dimensions of EQ-5D Health States


VALUE IN HEALTH 15 (2012) 777-782
Available online at www.sciencedirect.com
journal homepage: www.elsevier.com/locate/jval

Time Trade-Off and Ranking Exercises Are Sensitive to Different Dimensions of EQ-5D Health States

Kim Rand-Hendriksen, Cand.Psychol, MSc 1,2,*, Liv Ariane Augestad, MD 1,2
1 Department of Health Management and Health Economics, University of Oslo, Oslo, Norway; 2 Health Services Research Centre, Akershus University Hospital, Lørenskog, Norway

ABSTRACT

Background: One method suggested for creating preference-based tariffs for the new five-level EuroQol five-dimensional (EQ-5D) questionnaire is combining time trade-off (TTO) and discrete choice exercises. Rank values from previous valuation studies can be used as proxies for discrete choice exercises. This study examined rank and TTO data to determine whether the methods differ in sensitivity to the EQ-5D questionnaire dimensions.

Methods: We used rank and TTO data for 42 EQ-5D questionnaire health states from the US and UK three-level EQ-5D questionnaire valuation studies, extracting overall ranks of mean TTO and mean rank values, ranging from 1 (best) to 42 (worst). We identified pairs of health states with reversed overall ranks between TTO and rank data and regressed overall rank differences (TTO minus ranking) on dummy variables representing impairments on EQ-5D questionnaire dimensions.

Results: Forty-three (US) and 41 (UK) health state pairs displayed reversed rank order. Both US and UK regression models on rank differences indicated that respondents rated impairments involving pain/discomfort and anxiety/depression as relatively worse in TTO than in the ranking task.

Discussion: Different dimension sensitivity between TTO and ranking methods suggests that combining them could lead to inconsistent tariffs.
Differences could be caused by respondents focusing on the first presented dimensions when ranking states or could be related to the longest endurable time for health states involving pain/discomfort or anxiety/depression. The observed differences call into question which method best represents the preferences of the population.

Keywords: EQ-5D, QALY, ranking, TTO, utility, valuation.

Copyright 2012, International Society for Pharmacoeconomics and Outcomes Research (ISPOR). Published by Elsevier Inc.

Background

The EuroQol five-dimensional (EQ-5D) questionnaire is a health-related quality-of-life instrument that is used extensively to estimate quality-adjusted life-years in health economic evaluations [1,2]. It uses five dimensions of health: mobility, self-care, usual activities, pain/discomfort, and anxiety/depression. Until recently, these dimensions could be rated at three levels, corresponding to no problems, some problems, and extreme problems. The EuroQol group, however, has released official versions of the new five-level EQ-5D questionnaire, an expansion of the previous three-level EQ-5D questionnaire, in which each of the instrument's five dimensions can be rated at five levels. This expansion has increased the number of possible health states from 243 to 3125.

Value sets for the three-level EQ-5D (EQ-5D-3L) questionnaire have typically been made by using mean preference values from the general population, elicited by using the time trade-off (TTO) method, in which health states are valued in relation to perfect health and death. As TTO interviews are costly and time-consuming, EQ-5D-3L questionnaire valuation studies have typically elicited TTO values for subsets (17-46) of the 243 possible health states, and values for all 243 states have been estimated by using regression modeling.
Differences in the number of health states directly valued have been found to contribute to observed differences between national EQ-5D questionnaire value sets [3], and two recent valuation studies that directly valued greater numbers of health states have revealed more complex interactions than those identified by previous valuation studies [4,5]. The increase in the number of possible health states that accompanies the new five-level EQ-5D questionnaire makes the conventional method economically unfeasible and has led to a renewed focus on alternative valuation methods.

One suggested method for creating value sets for the five-level EQ-5D questionnaire is combining TTO values for a limited set of health states with discrete choice exercise (DCE) data for a larger sample of health states [6]. In DCE, respondents are asked to state which of two alternative health states they think is best, a simpler and less costly method than TTO valuation. Combining TTO and DCE data in this manner requires that the two methods measure the same construct in similar manners.

Preliminary analyses of results from a set of experimental valuation exercises performed in Norway, however, led us to wonder whether ranking and TTO exercises may make respondents sensitive to different EQ-5D questionnaire dimensions; we observed unexpected and stable mean rank transpositions between TTO and ranking of health state pairs involving impairments on different EQ-5D questionnaire dimensions.

Both in our valuation experiments and in previous TTO-based EQ-5D-3L questionnaire valuation studies, respondents have been familiarized with health state valuation before TTO elicitation by having them rank the presented health states from subjective best to worst and then value the ranked states on a visual analogue scale (VAS). Lacking a gold standard for comparison, several researchers have proposed the use of ranking as a benchmark when considering the validity of other valuation methods such as TTO [7,8]. Furthermore, the ranking task can be considered as an ordered set of discrete choices. As such, existing rank data may be used as imperfect proxies for DCE data. Since ranking tasks were used in previous TTO-based EQ-5D questionnaire valuation studies, an abundance of data is available that enables comparison of ranking and TTO values.

The aim of this study was to examine data from previous valuation studies to determine whether respondents were sensitive to different dimensions of health state impairment when ranking health states than when performing TTO valuation.

* Address correspondence to: Kim Rand-Hendriksen, Department of Health Management and Health Economics, University of Oslo, Pb. 1089 Blindern, 0318 Oslo, Norway. E-mail: kim.rand-hendriksen@medisin.uio.no.
1098-3015/$36.00 see front matter. http://dx.doi.org/10.1016/j.jval.2012.04.002

Methods

Data

We used data from the UK (measuring and valuing health) [9] and US [10] TTO-based EQ-5D questionnaire valuation studies, both of which asked respondents from the general population to value the same 42 EQ-5D questionnaire health states by using a ranking task, VAS, and TTO. Performed in 1993, the UK valuation study was the first large-scale national EQ-5D questionnaire valuation study using TTO interviews, and it has become the model upon which most subsequent valuation studies have been built. After exclusions, 2997 respondents were included in the valuation sample. The US valuation study was performed in 2001 by using an interview protocol that was nearly identical to the UK protocol. The sampling techniques used were more advanced than in the UK study, as were the statistical methods used to ensure population representativeness. With 3773 respondents after exclusions, the US valuation study has partially supplanted the UK study as the valuation study of reference.
We were interested in two variables from these studies: rank order and TTO values for the measured EQ-5D questionnaire health states.

Rank order

In both studies, respondents were familiarized with the EQ-5D questionnaire and valuation of hypothetical health states prior to TTO valuation. First, they were asked to describe their own current health by using the EQ-5D questionnaire descriptive system. They were then asked to rank, from subjective best to worst, a set of 15 cards, each describing a health state. The cards included the states death, unconscious, EQ-5D questionnaire state 11111 (no health problems), and 12 other EQ-5D questionnaire states selected from the pool of 42 EQ-5D questionnaire states that were valued in the study. We were interested in the rank order of the 12 health states from the pool of 42, and therefore discarded the ranks of death, unconscious, and state 11111. The remaining 12 states then had ranks from 1 (best) to 12 (worst). We extracted the mean rank values of all 42 measured EQ-5D questionnaire health states from the UK and US data sets.

TTO values

The states death and 11111 were used as anchors in the TTO interview, with values of 0 and 1, respectively. In the TTO interview, respondents valued the 12 EQ-5D questionnaire health states they had previously ranked, one by one in random order. The objective of the TTO task was to identify the respondent's point of preferential indifference between 10 years in the impaired health state in question (the target state) and a shorter life in state 11111. The point of indifference was identified through a sequence of choice tasks in which the length of life in state 11111 was manipulated. When the point of indifference was found, the TTO value of the target state was calculated as the time in state 11111 divided by the time in the target state (10 years). We extracted the mean TTO values for each of the 42 measured health states from the UK and US data sets.
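As an illustration of the TTO calculation for states regarded as better than death, the following minimal Python sketch divides the length of life in state 11111 at the point of indifference by the 10 years offered in the target state. The function name and its parameters are assumptions made for this example; the sketch does not cover states considered worse than death, which the valuation studies handled with separate transformations.

```python
def tto_value(years_in_11111, years_in_target_state=10.0):
    """TTO value of a target state regarded as better than death.

    years_in_11111: length of life in state 11111 (no problems) at the
        respondent's point of indifference (hypothetical parameter name).
    years_in_target_state: fixed length of life in the impaired target
        state; 10 years in the protocol described in the text.
    """
    if not 0 <= years_in_11111 <= years_in_target_state:
        raise ValueError("time in state 11111 must lie between 0 and the target duration")
    return years_in_11111 / years_in_target_state


# A respondent indifferent between 10 years in the target state and
# 7.5 years in state 11111 implies a TTO value of 0.75 for that state.
example_value = tto_value(7.5)
```

Under this sketch, indifference at the full 10 years gives a value of 1 (the state is treated as equivalent to state 11111), and indifference at 0 years gives a value of 0 (equivalent to death).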
In the calculation of mean rank and TTO values, we used the same methods and exclusion criteria employed in the respective valuation studies. Two primary differences between the UK and US valuation studies merit mention here. First, the US valuation study intentionally oversampled certain ethnic groups; to achieve population representativeness, respondent survey weights were used. The second difference pertains to health states considered worse than death. In both studies, TTO values elicited when respondents considered the presented health states to be worse than death were transformed before health state means were calculated. However, the UK valuation study used a transformation suggested by Patrick et al. [9,11], while the US study used a method suggested by Torrance [10,12]. The worse-than-death valuation procedure and subsequent transformations have been extensively discussed and criticized for being atheoretical [3,13-16], and the differences between the two methods used have been found to contribute substantially to observed differences between the published US and UK EQ-5D questionnaire tariffs [16]. Nevertheless, worse-than-death transformation has been considered a necessary evil and has been used in the calculation of all published mean-based TTO tariffs for the EQ-5D questionnaire, with the Torrance transformation used only in the US valuation study and the Patrick transformation used in the remaining 14 valuation studies (UK, Spain, Germany, Japan, Denmark, Zimbabwe, the Netherlands, Argentina, South Korea, Thailand, Poland, France, Chile, and Australia). We consider discussion of the appropriateness of the valuation studies' exclusion criteria, sampling methods, and worse-than-death transformation methods to be outside the scope of this article.
Because of previous observations of the substantial impact of the choice of transformation method, however, we performed sensitivity analyses substituting transformation methods.

Overall rank orders

Mean TTO values and mean rank orders are not directly comparable. To enable a crude comparison using a common scale, we chose to perform analyses on the overall rank orders of the mean TTO values and the mean rank values: In each data set, the best mean TTO value was assigned an overall rank value of 1 and the worst was assigned a rank value of 42. Similarly, the state with the best mean rank order (close to 1) was assigned an overall mean rank value of 1 and the worst (close to 12) was assigned a value of 42. In this procedure, we disregard the relative distances between mean TTO values and between mean ranks. For simplicity, we will refer to the overall mean TTO rank orders as mean TTO ranks and the overall mean ranking task rank orders as mean rankings.

For each of the two data sets, we then subtracted the mean rankings from the mean TTO ranks, rendering a measure of difference between the two valuation methods' relative ordering of the 42 EQ-5D questionnaire health states. We refer to these values as mean rank differences. A positive mean rank difference reflected states that were ranked as worse in the TTO valuation than in the ranking task, and a negative mean rank difference reflected states ranked as worse in the ranking task than in the TTO.

Analyses

First, we analyzed the mean TTO ranks and the mean rankings to identify pairs of health states for which the rank orders were reversed between mean TTO rank and mean rankings, that is, pairs in which one of the health states was considered better than the other using TTO while the other was considered better based on rank order data.

Second, we used multiple linear regressions to determine whether specific EQ-5D questionnaire dimension impairments were related to the mean rank differences. For this purpose, we used 10 dummy variables commonly used in valuation studies (m2, m3, s2, s3, u2, u3, p2, p3, a2, and a3) to represent each of the five EQ-5D questionnaire dimensions at some problems and extreme problems. For example, m2 represents the mobility dimension at some problems and p3 represents the pain/discomfort dimension at extreme problems. Separately for the US and UK data sets, we used the 42 mean rank differences as our dependent variables and the 10 dimension dummy variables for the corresponding health states as our independent variables. Using this regression procedure, a positive coefficient value for a specific dummy variable would indicate that the health impairment represented was related to worse rating using TTO than ranking, while a negative coefficient would indicate that the impairment was considered worse in the ranking.

Finally, as a measure of relative agreement between TTO and rank values, we calculated Spearman's rho between US mean TTO values and UK mean TTO values and between US mean rank values and UK mean rank values. These rank correlations were then compared with Spearman's rho between mean TTO values and mean rank values within the same data set.

Table 1. Mean TTO, mean rank, and mean rank differences for valued EQ-5D questionnaire health states.

         US data                          UK data
Health   TTO       Ranking       Rank     TTO        Ranking       Rank
state    Mean Rank    Mean Rank  diff.     Mean Rank    Mean Rank  diff.
11121    0.88    1    1.55    1     0      0.85    3    1.66    2     1
11211    0.87    3    1.76    3     0      0.87    2    1.64    1     1
21111    0.87    2    1.77    4    -2      0.88    1    1.82    4    -3
11112    0.83    5    1.73    2     3      0.83    5    1.68    3     2
12111    0.84    4    2.08    5    -1      0.83    4    1.86    5    -1
12211    0.79    6    3.64    9    -3      0.76    6    3.02    7    -1
12121    0.79    7    3.22    6     1      0.74    7    3.19    8    -1
11122    0.76    8    3.29    7     1      0.73    8    3.00    6     2
22121    0.74    9    3.56    8     1      0.64   10    3.87    9     1
22112    0.70   10    4.52   13    -3      0.66    9    3.87   10    -1
21222    0.68   12    4.07   11     1      0.56   11    4.26   11     0
12222    0.66   13    3.73   10     3      0.54   13    4.43   12     1
22122    0.68   11    4.09   12    -1      0.53   14    4.58   13     1
11312    0.65   14    4.97   16    -2      0.55   12    4.63   14    -2
21312    0.63   15    4.77   15     0      0.52   15    5.13   17    -2
11113    0.56   17    4.56   14     3      0.39   17    4.64   15     2
22222    0.60   16    5.81   19    -3      0.50   16    5.10   16     0
13212    0.51   18    6.03   20    -2      0.38   18    5.60   19    -1
12223    0.47   20    5.80   18     2      0.21   20    6.07   20     0
11131    0.39   23    5.59   17     6      0.21   21    5.40   18     3
13311    0.48   19    6.61   22    -3      0.33   19    6.24   21    -2
21232    0.41   21    6.33   21     0      0.06   25    6.49   22     3
21323    0.39   22    7.02   23    -1      0.15   22    6.91   24    -2
32211    0.33   26    7.36   28    -2      0.15   23    7.04   25    -2
23321    0.38   24    7.40   29    -5      0.13   24    7.34   27    -3
11133    0.29   28    7.13   24     4     -0.05   29    6.87   23     6
22323    0.36   25    7.14   25     0      0.05   26    7.78   29    -3
22331    0.30   27    7.22   27     0      0.02   27    7.39   28    -1
21133    0.28   29    7.17   26     3     -0.07   30    7.24   26     4
23232    0.22   31    8.18   31     0     -0.09   32    8.06   30     2
33212    0.20   32    8.54   33    -1      0.02   28    8.08   31    -3
23313    0.22   30    8.36   32    -2     -0.07   31    8.22   32    -1
22233    0.20   33    7.89   30     3     -0.15   34    8.34   33     1
32223    0.20   34    8.54   34     0     -0.18   36    8.70   37    -1
13332    0.14   36    8.83   35     1     -0.23   37    8.61   34     3
32232    0.15   35    8.86   36    -1     -0.23   38    8.63   36     2
33321    0.14   37    9.15   37     0     -0.13   33    8.98   39    -6
32313    0.13   38    9.25   38     0     -0.15   35    8.62   35     0
32331    0.05   40    9.49   39     1     -0.27   39    8.98   38     1
33232    0.06   39    9.86   40    -1     -0.33   40    9.62   40     0
33323    0.02   41    9.98   41     0     -0.38   41    9.83   41     0
33333   -0.10   42   11.20   42     0     -0.54   42   11.12   42     0

EQ-5D, EuroQol five-dimensional; TTO, time trade-off.

Ethics, funding, and conflicts of interest

All analyses were performed on anonymized and publicly available data from previously conducted studies. Therefore, no ethics committee has considered the appropriateness of our study.
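The rank-based quantities described in the Methods can be sketched in a few lines of Python. The sketch is illustrative only and rests on several assumptions: it uses five states taken from the US columns of Table 1, ranks them within that subset rather than across all 42 states, ignores ties, and uses function names (`overall_ranks`, `dimension_dummies`) invented for this example. It is not the authors' analysis code.

```python
from itertools import combinations

# Mean TTO values and mean ranking-task ranks for five states, copied from
# the US columns of Table 1. Ranking within this subset (not across all 42
# states) is a simplifying assumption of this sketch.
MEAN_TTO = {"11121": 0.88, "21111": 0.87, "11112": 0.83, "11131": 0.39, "13311": 0.48}
MEAN_RANK = {"11121": 1.55, "21111": 1.77, "11112": 1.73, "11131": 5.59, "13311": 6.61}


def overall_ranks(values, best_is_high):
    """Rank states from 1 (best) to n (worst); tied values are not handled."""
    ordered = sorted(values, key=values.get, reverse=best_is_high)
    return {state: position for position, state in enumerate(ordered, start=1)}


def dimension_dummies(state):
    """Code a five-digit EQ-5D-3L state as the dummies m2, m3, ..., a2, a3."""
    dummies = {}
    for prefix, level in zip("msupa", state):
        dummies[prefix + "2"] = int(level == "2")
        dummies[prefix + "3"] = int(level == "3")
    return dummies


tto_ranks = overall_ranks(MEAN_TTO, best_is_high=True)    # a high TTO value is best
rankings = overall_ranks(MEAN_RANK, best_is_high=False)   # a low mean rank is best

# Mean rank difference: TTO rank minus ranking-task rank, so a positive
# value marks a state placed worse by TTO than by the ranking task.
rank_diff = {state: tto_ranks[state] - rankings[state] for state in MEAN_TTO}

# A pair is reversed when the two methods order its two states oppositely.
reversed_pairs = [
    (a, b)
    for a, b in combinations(MEAN_TTO, 2)
    if (tto_ranks[a] - tto_ranks[b]) * (rankings[a] - rankings[b]) < 0
]
```

Within this five-state subset, states 21111 and 11112 form a reversed pair, as do states 11131 and 13311. The dictionaries returned by `dimension_dummies` correspond to the independent variables of the regression on mean rank differences; fitting that regression would additionally require a least-squares routine and the full set of 42 states.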

The study was indirectly funded by the Southern and Eastern Regional Health Authorities and the Norwegian Research Council through PhD grants to the authors. The funding sources had no involvement in the study, and neither author had any conflict of interest.

Results

Table 1 lists mean TTO ranks, mean rankings, and mean rank differences from the US and UK data sets for the 42 measured health states. Positive mean rank differences (i.e., the health states that were ranked as worse on the TTO than in the ranking task) were common for health states dominated by impairments involving the pain/discomfort or anxiety/depression dimensions, while negative mean rank differences were predominant for health states with impairments involving mobility, self-care, and usual activities. For instance, state 11131, indicating extreme pain/discomfort, had a US (UK) mean ranking of 17 (18) and a mean TTO rank of 23 (21), yielding a rank difference of 6 (3).

We identified a total of 43 US and 41 UK health state pairs for which the mean TTO ranks and mean rankings were reversed. These are listed in Table 2. Most of these transpositions took place in health state pairs in which one state was dominated by impairments of mobility, self-care, and usual activities (considered best when performing TTO) while the other was dominated by impairments involving pain/discomfort and anxiety/depression (considered best when ranking). For instance, in both data sets, state 13311, indicating extreme problems in the self-care and usual activities dimensions, was considered better than state 11131 (presented previously) in TTO while it was considered worse in ranking.

Table 2. Health state pairs with reversed mean rank order between TTO and ranking.

US data                             UK data
EQ-5D vector TTO ranks  Rankings    EQ-5D vector TTO ranks  Rankings
a      b        a    b    a    b    a      b        a    b    a    b
11211  21111    3    2    3    4    11112  21111    5    1    3    4
11112  11211    5    3    2    3    11121  21111    3    1    2    4
11112  21111    5    2    2    4    11211  21111    2    1    1    4
11112  12111    5    4    2    5    11112  12111    5    4    3    5
12121  12211    7    6    6    9    11122  12211    8    6    6    7
11122  12211    8    6    7    9    11122  12121    8    7    6    8
22121  12211    9    6    8    9    22121  22112   10    9    9   10
12222  21222   13   12   10   11    12222  11312   13   12   12   14
12222  22112   13   10   10   13    22122  11312   14   12   13   14
12222  22122   13   11   10   12    11113  22222   17   16   15   16
21222  22112   12   10   11   13    11113  21312   17   15   15   17
21222  22122   12   11   11   12    22222  21312   16   15   16   17
22122  22112   11   10   12   13    11131  13212   21   18   18   19
21312  11312   15   14   15   16    11131  12223   21   20   18   20
11113  11312   17   14   14   16    11131  13311   21   19   18   21
11113  21312   17   15   14   15    12223  13311   20   19   20   21
11113  22222   17   16   14   19    11133  21323   29   22   23   24
12223  22222   20   16   18   19    21232  21323   25   22   22   24
11131  22222   23   16   17   19    11133  32211   29   23   23   25
12223  13212   20   18   18   20    21232  32211   25   23   22   25
11131  12223   23   20   17   18    11133  23321   29   24   23   27
11131  13212   23   18   17   20    21133  23321   30   24   26   27
12223  13311   20   19   18   22    21232  23321   25   24   22   27
11131  13311   23   19   17   22    11133  22331   29   27   23   28
11131  21232   23   21   17   21    21133  22331   30   27   26   28
21232  13311   21   19   21   22    11133  22323   29   26   23   29
11131  21323   23   22   17   23    21133  22323   30   26   26   29
11133  22323   28   25   24   25    22331  22323   27   26   28   29
22323  23321   25   24   25   29    11133  33212   29   28   23   31
11133  23321   28   24   24   29    21133  33212   30   28   26   31
11133  22331   28   27   24   27    23232  33212   32   28   30   31
11133  32211   28   26   24   28    23232  23313   32   31   30   32
22331  23321   27   24   27   29    13332  32313   37   35   34   35
32211  23321   26   24   28   29    13332  32223   37   36   34   37
21133  23321   29   24   26   29    32232  32223   38   36   36   37
22331  32211   27   26   27   28    13332  33321   37   33   34   39
21133  22331   29   27   26   27    22233  33321   34   33   33   39
21133  32211   29   26   26   28    32223  33321   36   33   37   39
22233  23232   33   31   30   31    32232  33321   38   33   36   39
22233  23313   33   30   30   32    32313  33321   35   33   35   39
23232  23313   31   30   31   32    32331  33321   39   33   38   39
13332  32232   36   35   35   36
32331  33232   40   39   39   40

EQ-5D, EuroQol five-dimensional; TTO, time trade-off. "TTO ranks" are mean TTO ranks and "Rankings" are mean rankings for states a and b.

Table 3 lists the regression models predicting rank differences using the 10 main dimension dummies. The coefficients for the dummy variables representing the first three dimensions (m2, m3, s2, s3, u2, and u3) were consistently negative, while the coefficients for the dummies representing the final two dimensions (p2, p3, a2, and a3) were consistently positive.

Table 3. Multiple linear regression models predicting mean rank differences.

                                   US data (R² = .626)    UK data (R² = .755)
Predictor                          Coefficient       P    Coefficient       P
Constant                                 0.683   0.287          0.205   0.703
m2  Mobility level 2                    -1.789*  0.004         -1.424*  0.006
m3  Mobility level 3                    -0.765   0.298         -1.542*  0.017
s2  Self-care level 2                   -0.960   0.130         -0.334   0.527
s3  Self-care level 3                   -1.796*  0.021         -0.918   0.150
u2  Usual activities level 2            -0.970   0.156         -1.002   0.085
u3  Usual activities level 3            -1.167   0.109         -1.857*  0.004
p2  Pain/discomfort level 2              1.182*  0.048          0.418   0.395
p3  Pain/discomfort level 3              2.609* <0.001          3.250* <0.001
a2  Anxiety/depression level 2           0.371   0.547          1.302*  0.017
a3  Anxiety/depression level 3           1.547*  0.018          1.693*  0.003

Statistically significant (P < .05) coefficients are marked with an asterisk.

Spearman's rho values between US and UK mean values were 0.991 (TTO, P < 0.001) and 0.990 (rank, P < 0.001). Spearman's rho values between mean TTO and mean rank values were 0.984 (US, P < 0.001) and 0.983 (UK, P < 0.001).

Discussion

Respondents in the two valuation studies appear to have been more sensitive to impairments on the dimensions of mobility, self-care, and usual activities when ranking health states and more sensitive to impairments involving pain/discomfort and anxiety/depression in the TTO valuation.
In both data sets, there were many examples of health state pairs in which respondents ranked one state as better than the other but were willing to trade away more lifetime to avoid the better ranked state than the worse ranked one. In nearly all these pairs, one of the states was predominantly impaired on the first three dimensions of the EQ-5D questionnaire, while the other state was dominated by impairments on the last two dimensions. The regression models indicate that the health state pairs in which the overall rank order was reversed between ranking and TTO represent extreme examples of a general trend. This apparent inconsistency in how the two methods value the different dimensions of health constitutes a breach of procedural invariance [17,18] and casts doubt on the two methods' ability to capture the same underlying construct: the population's preferences for EQ-5D questionnaire health states.

Because the analyses were performed on ranks of means, interpreting the magnitude of the observed differences is not a straightforward task. Spearman's rho between US and UK data within each method, however, was higher than between the two methods within the same country, indicating that the difference between the methods is greater than the differences in mean preferences between the two countries.

There is a large body of literature documenting how different valuation methods yield different results [7,19-21]. For instance, it has often been found that the standard gamble method yields higher values than TTO, which yields higher values than the VAS. Such comparisons, however, have typically focused on differences in absolute levels of values or on the functional form of values from different instruments. Our finding of dimension-specific inconsistencies between the ranking and TTO methods underscores the importance of investigating potential disagreements on the level of health dimensions when comparing valuation methods.
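The rank correlations compared above can be illustrated with a short Python sketch of Spearman's rho. It uses the textbook formula for untied ranks, rho = 1 - 6 * sum(d^2) / (n * (n^2 - 1)), applied to five states from the US columns of Table 1; the choice of states, the helper names, and the assumption of no ties are all specific to this example, so the resulting value is not comparable to the coefficients reported for the full set of 42 states.

```python
def ranks(values, best_is_high):
    """Rank positions from 1 (best) to n (worst); assumes no tied values."""
    order = sorted(range(len(values)), key=lambda i: values[i], reverse=best_is_high)
    result = [0] * len(values)
    for position, index in enumerate(order, start=1):
        result[index] = position
    return result


def spearman_rho(ranks_a, ranks_b):
    """Spearman's rho as 1 - 6*sum(d^2) / (n*(n^2 - 1)); valid without ties."""
    n = len(ranks_a)
    d_squared = sum((a - b) ** 2 for a, b in zip(ranks_a, ranks_b))
    return 1 - 6 * d_squared / (n * (n ** 2 - 1))


# US mean TTO values and mean ranking-task ranks for states
# 11121, 21111, 11112, 11131, and 13311 (Table 1), in that order.
mean_tto = [0.88, 0.87, 0.83, 0.39, 0.48]
mean_rank = [1.55, 1.77, 1.73, 5.59, 6.61]

# Both variables are converted to ranks running from best (1) to worst (n),
# so that agreement between the methods gives a positive coefficient.
rho = spearman_rho(ranks(mean_tto, True), ranks(mean_rank, False))
```

For these five states the two orderings differ in two adjacent swaps, which gives a rho of 0.8. Data containing tied values would call for a tie-aware implementation instead of this shortcut formula.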
In addition to the analyses presented, we performed several tests that did not add any new information. Analyses on data from the Danish TTO-based valuation study replicated the findings from the UK and US data. Switching the transformation methods for health states considered worse than death in the TTO task resulted in slight changes to the magnitudes of the regression coefficients, but the overall picture remained unchanged. In the two valuation studies from which our data were acquired, respondents were asked to value the same set of health states by using a thermometer-like VAS. VAS valuation was performed right after the ranking task, with the health states still in their ranked order, meaning that the VAS values were highly dependent on the previous ranking. We performed analyses substituting the overall mean rankings with overall rankings of mean VAS scores, with nearly identical results. Because of the intertwined nature of the ranking and VAS valuations, this does not necessarily mean that VAS valuation without prior ranking would induce sensitivity to the same EQ-5D questionnaire dimensions that ranking apparently does. Details and results for these analyses are available from the corresponding author.

This study had four primary limitations. First, because ranking can be conceptualized as a set of discrete choices, we have used rank order as a proxy for DCE data. Empirical testing, however, would be required to determine whether respondents perform consecutive DCE tasks in the same manner as they perform ranking. As ranking involves simultaneous comparison of more items than does DCE, there may be differences in how the two tasks are processed by respondents. Second, we analyzed the rank order of mean values from TTO and mean rank data. This procedure is insensitive to the relative distance between health states. Third, the analyses were performed on data collected for the EQ-5D-3L questionnaire.
The degree to which this is generalizable to the five-level version is unknown, though some studies indicate that there is considerable agreement between the three- and five-level versions [22–25]. Finally, multiple linear regressions on rank data are not ideal. Because our objective was to identify and illustrate differences between the two valuation methods in terms of how respondents value the five EQ-5D questionnaire dimensions, we considered multiple linear regression to be the simplest and most accessible method sufficient for our purpose.

This study does not inform us as to why respondents rank the EQ-5D questionnaire dimensions in an apparently inconsistent manner when performing ranking and TTO valuation. We offer two hypotheses that are congruent with our findings but must warn that they are speculative at present. First, the respondents could be more influenced by the ordering of presentation for the EQ-5D questionnaire dimensions when ranking than when performing TTO; the ordering was fixed in both studies, and it is conceivable that respondents performing ranking of health states start by comparing the first dimension, then go on to the second, and so on, increasing the relative impact of impairments on the first dimensions. Fortunately, if this is the case, the observed differences should disappear if the ordering of the five dimensions were randomized. Alternatively, it could be that time framing is more salient in TTO and that respondents find the thought of long-term impairments involving pain/discomfort or anxiety/depression unbearable. This interpretation is compatible with previous findings about nonlinear time preferences and the concept of maximum endurable time in TTO [26–29]. In a recent cognitive debriefing study of EQ-5D questionnaire valuation by Bailey et al. [30], respondents frequently ignored the 10-year duration in the VAS and ranking tasks but were sensitive to time when performing TTO.

The observed inconsistency between TTO and ranking raises two important issues: First, which of the two valuation methods should be considered as being the best or most correct? Second, if these findings can be generalized to TTO and DCE for the five-level version of the EQ-5D questionnaire, combining the two methods for the purpose of tariff generation may prove troublesome.
If we understand the methods required for such hybrid tariff generation correctly, and DCE behaves as ranking does in our study, combining data from a small set of health states valued with the TTO with data from a large set of states valued with DCE could result in inconsistent tariffs: health states in proximity to the states selected for TTO valuation would be more influenced by pain/discomfort and anxiety/depression, while states further from the selected TTO states would be more influenced by mobility, self-care, and usual activities.

In conclusion, experimental studies on DCE and TTO need to be performed to determine whether the two methods can be combined for the purpose of tariff generation without creating inconsistent tariffs.

Source of financial support: The study was indirectly financed by the Norwegian Research Council and the Southern and Eastern Regional Health Care Authorities through PhD grants for the first two authors. Neither funding source had any involvement in the study.

REFERENCES

[1] Brooks R. EuroQol: the current state of play. Health Policy 1996;37:53–72.
[2] Brazier J, Ratcliffe J, Tsuchiya A, et al. Measuring and Valuing Health Benefits for Economic Evaluation. Oxford: Oxford University Press, 2007.
[3] Norman R, Cronin P, Viney R, et al. International comparisons in valuing EQ-5D health states: a review and analysis. Value Health 2009;12:1194–200.
[4] Viney R, Norman R, King MT, et al. Time trade-off derived EQ-5D weights for Australia. Value Health 2011;14:928–36.
[5] Lee YK, Nam HS, Chuang LH, et al. South Korean time trade-off values for EQ-5D health states: modeling with observed values for 101 health states. Value Health 2009;12:1187–93.
[6] Oppe M, van Hout B. The optimal hybrid: experimental design and modeling of a combination of TTO and DCE. In: Yfantopoulos J, ed., 27th Scientific Plenary Meeting of the EuroQol Group - Proceedings. Rotterdam: EuroQol Group Executive Office, 2010.
[7] Bleichrodt H, Johannesson M. Standard gamble, time trade-off and rating scale: experimental results on the ranking properties of QALYs. J Health Econ 1997;16:155–75.
[8] Giesler RB, Ashton CM, Brody B, et al. Assessing the performance of utility techniques in the absence of a gold standard. Med Care 1999;37:580.
[9] Dolan P. Modeling valuations for EuroQol health states. Med Care 1997;35:1095–108.
[10] Shaw JW, Johnson JA, Coons SJ. US valuation of the EQ-5D health states: development and testing of the D1 valuation model. Med Care 2005;43:203–20.
[11] Patrick DL, Starks HE, Cain KC, et al. Measuring preferences for health states worse than death. Med Decis Making 1994;14:9–18.
[12] Torrance GW. Multi-attribute utility theory as a method of measuring social preferences for health states in long-term care. In: Kane RA, Kane RL, eds. Values and Long Term Care. Lexington Books, Lexington, MA, 1982.
[13] Lamers LM. The transformation of utilities for health states worse than death: consequences for the estimation of EQ-5D value sets. Med Care 2007;45:238–44.
[14] Devlin NJ, Tsuchiya A, Buckingham K, Tilling C. A uniform time trade off method for states better and worse than dead: feasibility study of the lead time approach. Health Econ 2011;20:348–61.
[15] Craig BM, Busschbach JJ. The episodic random utility model unifies time trade-off and discrete choice approaches in health state valuation. Pop Health Metrics 2009;7:3.
[16] Augestad LA, Rand-Hendriksen K, Kristiansen IS, Stavem K. Impact of transformation of negative values and regression models on differences between the UK and US EQ-5D TTO value sets. Pharmacoeconomics. In press.
[17] Tversky A, Kahneman D. Rational choice and the framing of decisions. J Bus 1986;59:251–78.
[18] Tversky A, Sattath S, Slovic P. Contingent weighting in judgment and choice. Psychol Rev 1988;95:371.
[19] Dolan P, Gudex C, Kind P, Williams A. Valuing health states: a comparison of methods. J Health Econ 1996;15:209–31.
[20] Brazier J, Deverill M, Green C. A review of the use of health status measures in economic evaluation. J Health Serv Res Policy 1999;4:174–84.
[21] Green C, Brazier J, Deverill M. Valuing health-related quality of life: a review of health state valuation techniques. Pharmacoeconomics 2000;17:151–65.
[22] Janssen MF, Birnie E, Haagsma JA, Bonsel GJ. Comparing the standard EQ-5D three-level system with a five-level version. Value Health 2008;11:275–84.
[23] Herdman M, Gudex C, Lloyd A, et al. Development and preliminary testing of the new five-level version of EQ-5D (EQ-5D-5L). Qual Life Res 2011;20:1–10.
[24] Pickard AS, De Leon MC, Kohlmann T, et al. Psychometric comparison of the standard EQ-5D to a 5 level version in cancer patients. Med Care 2007;45:259.
[25] Pickard AS, Kohlmann T, Janssen MF, et al. Evaluating equivalency between response systems: application of the Rasch model to a 3-level and 5-level EQ-5D. Med Care 2007;45:812.
[26] Stalmeier PF, Lamers LM, Busschbach JJ, Krabbe PF. On the assessment of preferences for health and duration: maximal endurable time and better than dead preferences. Med Care 2007;45:835.
[27] Attema AE, Versteegh MM, Oppe M, et al. Lead time TTO: leading to better health state valuations? In: Yfantopoulos J, ed., 27th Scientific Plenary Meeting of the EuroQol Group - Proceedings. Rotterdam: EuroQol Group Executive Office, 2010.
[28] Attema AE, Brouwer WBF. The value of correcting values: influence and importance of correcting TTO scores for time preference. Value Health 2010;13:879–84.
[29] Attema AE, Brouwer WB. The correction of TTO-scores for utility curvature using a risk-free utility elicitation method. J Health Econ 2009;28:234–43.
[30] Bailey H, Kind P, Lascelles K. Poster 2: What are we asking? What are they thinking? Preliminary results from a cognitive debriefing study of EQ-5D elicitation exercises. In: Yfantopoulos J, ed., 27th Scientific Plenary Meeting of the EuroQol Group - Proceedings. Rotterdam: EuroQol Group Executive Office, 2010:276–9.