Does reporting heterogeneity bias the measurement of health disparities?

HEDG Working Paper 06/03

Teresa Bago d'Uva
Eddy Van Doorslaer
Maarten Lindeboom
Owen O'Donnell
Somnath Chatterji

March 2006
ISSN 1751-1976
york.ac.uk/res/herc/hedgwp

Teresa Bago d'Uva, University of York
Eddy Van Doorslaer, Erasmus University, Netspar & Tinbergen Institute
Maarten Lindeboom, Free University of Amsterdam, HEB, IZA, Netspar & Tinbergen Institute
Owen O'Donnell, University of Macedonia, Erasmus University & Netspar
Somnath Chatterji, World Health Organization

Abstract

Heterogeneity in reporting of health by socio-economic and demographic characteristics potentially biases the measurement of health disparities. Responses to anchoring vignettes have been proposed as a means of identifying reporting heterogeneity. We apply the vignette methodology to data from Indonesia, India and China in order to test for systematic differences in reporting of health by sex, age, urban/rural location, education and income and to establish the sensitivity of estimated disparities in health to the purging of reporting differences. The hypothesis of homogeneous reporting across all socio-demographics is rejected. Homogeneity tends to be most consistently and decisively rejected across urban/rural, income and age differences and less consistently across sex and education groups. In general, younger, male (not Indonesia), better educated (not China), low income and urban respondents display lower health expectations. Correcting for reporting heterogeneity tends to reduce disparities in health by age, sex (not Indonesia), urban/rural and education (not China) and to increase income disparities in health. Overall, while homogeneous reporting is significantly rejected, the results suggest that the size of the reporting bias in measures of health disparities is not large.

Keywords: health measurement, vignettes, self-reported health, reporting heterogeneity

JEL Classification: D30, D31, I10, I12

Acknowledgements

Teresa Bago d'Uva is funded by Fundação para a Ciência e Tecnologia, under PhD grant SFRH/BD/10551/2002. The authors thank the WHO for providing access to the MCS data. They acknowledge comments on earlier versions by Andrew Jones and Nigel Rice and by seminar participants at the University of Melbourne, the University of New South Wales, the University of York, Erasmus University Rotterdam and the ECuity Project meeting at the IZA in Bonn. The usual disclaimer applies.

Corresponding author: Teresa Bago d'Uva, Centre for Health Economics, Alcuin A Block, University of York, York, YO10 5DD, United Kingdom. E-mail: tbd100@york.ac.uk.

1 Introduction

Self-reported health is a convenient and informative instrument, widely used in analyses of health determinants as well as the economic consequences of ill health. Inevitably, there is heterogeneity in the reporting of health. For a given true but unobserved health state, individuals will report health differently depending upon conceptions of health in general, expectations for own health, financial incentives to report ill health and comprehension of the survey questions. In many contexts, reporting heterogeneity need not be a major concern provided that it is random. Systematic differences in reporting behaviour are more problematic. For example, measurement of inequality in health will be biased if there are systematic differences in the way in which health is reported across the demographic and socio-economic characteristics against which inequality is being assessed. The purpose of this paper is to test and correct for reporting bias in measures of health disparities in developing countries.

Differences in health disparities derived from self-reported and more objective indicators are suggestive of systematic variation in reporting behaviour. One frequently cited example is the tendency for Aboriginals to report better health than the general Australian population despite being seriously disadvantaged according to more objective health indicators, such as mortality (Mathers and Douglas 1998). Discrepancy in health gradients measured by objective and subjective indicators is even more common in evidence from the developing world. In India, the state of Kerala consistently shows the highest rates of reported morbidity, in spite of having the lowest rates of infant and child mortality (Murray 1996). Wagstaff (2002) notes that income-related inequalities in objective indicators of ill health, such as malnutrition and mortality, tend to be higher than those in subjective health.
Moreover, the use of subjective health measures has led to some improbable health gradients in developing countries, with the rich reporting worse health than the poor (Baker and Van der Gaag 1993), which seems quite inconsistent with substantial pro-rich inequality in infant and child mortality rate and in anthropometric indicators (Gwatkin, Rustein et al. 2000). Sen (2002) argues: There is a strong need for scrutinising statistics on self perception of illness in a social context by taking note of levels of education, availability of medical facilities and public information on illness and remedy.

Formal testing of reporting heterogeneity by demographic and socio-economic status has been undertaken in recent studies, albeit not in an exhaustive way, and not for less developed countries. Van Doorslaer and Gerdtham (2003) use Swedish data to assess the extent to which the capacity of self-reported health to predict mortality varies across socio-demographic groups. Self-reported health is found to be a very strong predictor of subsequent mortality risk. The relationship varies with demographic and disease characteristics but not by socio-economic status. Lindeboom and van Doorslaer (2004) assume that the McMaster Health Utility Index (HUI) provides an objective and comprehensive health indicator and test whether, conditional on this, there is variation in stated health in Canada that can be attributed to reporting behaviour. The results are consistent with those of Van Doorslaer and Gerdtham: there is evidence of reporting heterogeneity for age and sex, but not for education and income.¹ While this evidence is encouraging for the measurement of socio-economic inequalities in health in developed countries, it says nothing about the effect of reporting heterogeneity on the measurement of health inequality in developing countries where differences in conceptions of illness by education and income levels and between urban and rural locations may be greater.

The studies discussed in the previous paragraph test for reporting heterogeneity through examination of variation in health reporting conditional on some objective measure of health. One problem is that objective indicators, for example mortality, may not be available. Less objective indicators, such as health conditions, are more likely to be available but are also self-reported and are subject to error (Baker, Stabile et al. forthcoming). The test might uncover different types of reporting heterogeneity in different indicators rather than deviations from a purely objective benchmark of health.
A further disadvantage of using objective indicators to test and correct for reporting heterogeneity is that this strips out any socio-economic related variation in self-reported health conditional on the objective indicators. If the self-reported health contains information on true health, conditional on objective indicators, then this is lost. If self-reported health does not contain additional information, then one might as well examine the relationship between objective indicators and socio-economic characteristics from the outset.

Rather than attempt to identify reporting behaviour from variation in self-reported health beyond that explained by objective indicators, an alternative is to examine variation in the evaluation of given health states represented by hypothetical case vignettes (Tandon, Murray et al. 2003; King, Murray et al. 2004; Salomon, Tandon et al. 2004). The vignettes represent fixed levels of latent health and so all variation in the rating of them can arguably be attributed to reporting behaviour, which can be examined in relation to observed characteristics. Under the assumption that individuals rate the vignettes in the same way as they rate their own health, it is possible to identify a measure of health that is purged of reporting heterogeneity. Murray, Ozaltin et al. (2003) evaluate this approach to the measurement of health, in the domain of mobility, using data from 55 countries covered by World Health Organisation (WHO) surveys. The principal objective of their analysis is to obtain comparable measures of population health that are purged of cross-country differences in the reporting of health. Besides country, reporting of health is allowed to vary with age, sex and education but there is no detailed examination of these dimensions of reporting heterogeneity or of the impact on measured health disparities. Using the vignettes method, Kapteyn, Smith et al. (2004) find that about half of the difference in rates of self-reported work disability between the Netherlands and the US can be attributed to reporting behaviour.

Our concern in this paper is not with the cross-country comparability of health measures but with the comparability of self-reported health across demographic and socio-economic groups within a country and the consequences of any systematic differences in reporting behaviour for measures of health disparities between these groups. Our primary interest is in the degree to which measures of health inequality are biased by reporting heterogeneity in developing countries.
We apply the vignettes methodology to data from the three largest Asian countries - Indonesia, India and China - in order to test for systematic differences in reporting of health by sex, age, urban/rural location, education and income and to establish the extent to which estimated disparities in health change when reporting differences are purged from the health measures. In subsequent sections of the paper, the data, econometric models, results and conclusions are presented.

2 Data

WHO Multi-Country Survey

The data used in this paper, as in Murray, Ozaltin et al. (2003), are from the WHO Multi-Country Survey Study on Health and Responsiveness 2000-2001 (WHO-MCS) that covered 71 adult populations in 61 countries. Ustun, Villanueva et al. (2003) provide a comprehensive report on the goals, design, instrument development and execution of this survey. Individuals were asked to report their health in each of six health domains (mobility, cognitive functioning, affective behaviour, pain or discomfort, self-care and usual activities). In addition, a sub-sample of individuals were asked to rate a set of anchoring vignettes describing fixed ability levels on each health domain. The general idea is to use the responses to these vignettes to identify reporting heterogeneity. Assessments of own health by domain can then be calibrated against the vignettes, purging reporting heterogeneity and giving interpersonally comparable health measures.

We use the WHO-MCS data for Indonesia (excluding Papua, Aceh and Maluku), an Indian state (Andhra Pradesh) and three Chinese provinces (Gansu, Henan and Shandong).² The dataset used here results from dropping individuals with missing data on own health, the socio-demographic variables used in the analysis and the vignettes. The resulting dataset contains 7770 observations for Indonesia, 5129 for India and 7156 for China. Table A1 in the Appendix documents the number of observations lost due to item non-response.

2.1 Health variables: own health and vignettes

Health by domain is obtained from the questions: Overall in the last 30 days, how much...:

difficulty did you have with moving around? (mobility)
difficulty did you have with concentrating or remembering things? (cognition)
pain or discomfort did you have? (pain)
difficulty did you have with self-care, such as washing or dressing yourself? (self-care)
difficulty did you have with work or household activities? (usual)
distress, sadness or worry did you experience? (affect)

The five response categories are: Extreme/Cannot do, Severe, Moderate, Mild, None. Table 1 presents the distributions of these self-reported health variables.
For each domain, a random sub-sample of individuals is presented with a set of vignettes, describing levels of difficulty on that domain, and asked to evaluate these hypothetical cases in the same way as they evaluate their own health for that domain (i.e., using the same 5 response categories). Of course, there can be no reference to the experience of the vignettes over the last 30 days. One-half of the samples evaluate the vignettes in the domain of mobility and roughly one-quarter of the samples respond to the vignettes in each of the other domains. Each respondent is asked to rate vignettes on two domains. Within a given domain, the set of vignettes is the same for all respondents. The vignette descriptions for all the domains are presented in the Appendix.

TABLE 1

The distributions of the vignette evaluations are presented in Table 1. Despite representing fixed levels of ability by domain, the vignette ratings show considerable variation, which can be attributed to reporting heterogeneity. For example, vignette 4 in the mobility domain describes a person who has chest pains and gets breathless after walking up to 200 metres but is able to do so without assistance. In Indonesia, almost 35% of respondents categorise this as a moderate mobility problem but 36% define it as severe and almost 19% as mild. There are even 2.8% with sufficiently high health expectations that they consider this an extreme mobility problem. On the other hand, 7% do not consider this a problem at all. Varying degrees of reporting heterogeneity can be seen across the vignettes for all domains and countries. This is the variation we exploit to test for systematic reporting heterogeneity in relation to demographic and socio-economic characteristics and to purge health disparities across such characteristics of reporting bias.

2.2 Socio-demographic variables

Expectations for health and tolerance of illness may be influenced by an individual's socio-economic environment and demographic characteristics. The degree of functioning considered as good health might be expected to decline with age. Conceptions of good health may also differ by sex although it is more difficult to predict the sign of the effect, which might differ across different health domains. Geographic and economic circumstances may mould health expectations through peer effects and access to medical care.
Living within a community in which a large proportion of the population suffers poor health may lower the individual's expectations for her own health. Improved access to effective health care may lower tolerance of illness and disease. Reporting of health may vary with education not only because education acts as a proxy for permanent income but also through a direct effect. The latter will operate through conceptions of illness, understanding of disease and knowledge of the availability and effectiveness of health care. It is not immediately clear in which direction such effects will shift the reporting of health. One might expect the better educated to be less tolerant of poor health. On the other hand, the better educated should be better informed of the health of others and able to appreciate their relatively privileged position in the health distribution.

We test for reporting heterogeneity in relation to age, sex, urban/rural status, education and income. Age is represented by categories: 15 to 29 years (reference category), 30 to 44 (AGE3044), 45 to 59 (AGE4559) and more than 60 (AGE60). Sex is represented by the dummy variable FEMALE and location by the dummy URBAN. A flexible education effect is allowed for through a series of dummies indicating the highest level of education completed: less than primary (reference category), primary (EDUC2), secondary (EDUC3), and high school or above (EDUC4).³ The variable log(INCOME) is the log of monthly household earnings by equivalent adult (in national currencies).⁴ ⁵ Table 2 presents descriptive statistics for the covariates by country.

TABLE 2

3 Econometric models

Categorical data on health are typically modelled by assuming that the observed categorical variable is a discrete representation of an underlying unobserved true level of health, measured on a continuous scale. The categorical variable is defined as the result of a mapping between latent health and the response categories. Homogeneous reporting behaviour corresponds to the assumption that the mapping is constant across individuals. By contrast, reporting heterogeneity translates into different mappings between the latent variable and the observed categorical variable. Individuals might attach very different meanings to the labels used for each of the response categories, thus making the observed health variables incomparable, since they do not correspond to the same intervals in the latent health scale.
After presenting the homogeneous case, we describe in detail below how vignette information can be used to identify reporting heterogeneity in self-reported health.

3.1 Ordered Probit: Homogeneous reporting behaviour

Let $y_i$, $i = 1, \ldots, N$, be a self-reported categorical health measure. It is assumed that $y_i$ is generated by the latent health variable $Y_i^*$, specified as:

$$Y_i^* = Z_i \beta + \varepsilon_i, \quad \varepsilon_i \mid Z_i \sim N(0, 1) \qquad (1)$$

where $Z_i$ is a vector of covariates. Since the latent variable is unobserved and its observed counterpart is categorical, the variance of the error term $\varepsilon_i$, conditional on $Z_i$, and the constant term are not identified and are usually set to 1 and 0, respectively.⁶ The observed categorical response of individual $i$, $y_i$, relates to latent health in the following way:

$$y_i = k \iff \tau^{k-1} \le Y_i^* < \tau^k \qquad (2)$$

for $k = 1, \ldots, K$, with $\tau^0 < \tau^1 < \ldots < \tau^{K-1} < \tau^K$, $\tau^0 = -\infty$ and $\tau^K = \infty$. The parameters $\tau^k$, $k = 1, \ldots, K-1$, are estimated along with the other parameters of the model ($\beta$). The assumption of homogeneous reporting that is inherent to the ordered probit model arises from the constant cut-points $\tau^k$. If this assumption does not hold, in particular, if the cut-points vary according to some of the covariates $Z_i$, then imposing the restriction will lead to biased estimates of $\beta$ since they will reflect both health effects and reporting effects.

It is possible to generalise the ordered probit model and allow the cut-points to depend on covariates, $\tau_i^k = \tau^k(x_i \alpha)$ (Terza 1985). Normalising one threshold to a constant, the other threshold parameters are identified in the sense of showing how covariates shift the thresholds relative to their impact on the baseline threshold. If the covariate effect were the same on all thresholds, labelled parallel shift (Hernandez-Quevedo, Jones et al. 2004), then the threshold coefficients would be zero and it is not possible to distinguish this case from an effect on the index function alone. Less attractively, identification could be achieved through a series of maintained assumptions that each covariate can be excluded from either the threshold or health index function (Pudney and Shields 2000).
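The identification problem with a parallel shift can be illustrated numerically. The following is a minimal sketch, assuming the standard normal error of equation (1); the function name `ordered_probit_probs` and the numerical values are illustrative and do not come from the paper. It computes the category probabilities implied by (1) and (2), and shows that adding a constant to every cut-point gives the same probabilities as subtracting that constant from the index.

```python
from statistics import NormalDist

_PHI = NormalDist().cdf  # standard normal CDF


def ordered_probit_probs(index, cutpoints):
    """Return [P(y = 1), ..., P(y = K)] for latent Y* = index + eps, eps ~ N(0, 1).

    `cutpoints` holds the K-1 interior thresholds tau^1 < ... < tau^{K-1};
    tau^0 = -inf and tau^K = +inf are implied.
    """
    if any(a >= b for a, b in zip(cutpoints, cutpoints[1:])):
        raise ValueError("cut-points must be strictly increasing")
    # P(Y* < tau^k) = Phi(tau^k - index), padded with 0 and 1 for the open ends.
    cdf = [0.0] + [_PHI(t - index) for t in cutpoints] + [1.0]
    return [hi - lo for lo, hi in zip(cdf, cdf[1:])]


# Hypothetical values: five response categories, hence four interior cut-points.
cuts = [-1.0, 0.0, 1.0, 2.0]

# Raising every cut-point by 0.3 (a parallel shift in reporting) ...
shifted_cutpoints = ordered_probit_probs(0.5, [t + 0.3 for t in cuts])
# ... is observationally equivalent to lowering the health index by 0.3.
shifted_index = ordered_probit_probs(0.2, cuts)
```

Without outside information such as vignette ratings, the two cases cannot be told apart from the reported categories alone.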
While the generalised ordered probit allows thresholds to vary with covariates, in the present context it would be hazardous to interpret such effects as a reflection of reporting heterogeneity rather than heterogeneity in the latent health index itself (Hernandez-Quevedo, Jones et al. 2004). As specified in (1), it is assumed that there is a single latent health index that applies for all individuals. It is possible, however, that the relationship of true health with the covariates varies with the level of health itself. For example, income may have a weaker marginal impact on health at better levels of health and this may result in variation of the income coefficient across the categories of reported health. Interpretation of the varying thresholds of a generalised probit model as an indication of reporting heterogeneity would therefore rely strongly on the assumption that the latent health index was correctly specified as a homogeneous function of covariates.⁷ With additional information provided by vignettes it is possible to identify the separate effects of covariates on reporting behaviour and true health without relying on functional form and/or exclusion restrictions.

3.2 Hierarchical Ordered Probit: Heterogeneous reporting behaviour

Suppose one has access to individuals' self reports $y_{ij}$ on specific health domains $j$ and vignette ratings on these same domains, $y_{ij}^v$. The vignettes describe the level of ability on each domain, and individuals are asked to rate these hypothetical cases in the same way as they evaluate their own health for that domain (i.e. using the same response scale). The health status of the hypothetical individual is exogenously varied across the vignettes and therefore individual variation in responses to these vignettes must be due to reporting heterogeneity. In the context of the generalised ordered probit this means that we can use the external vignette information to separately identify the thresholds ($\tau_i^k = \tau^k(x_i \alpha)$, $k = 1, \ldots, K-1$). These cut-offs can be imposed on the model for the self reports with respect to the individual's own health, so that estimates of $\beta$ now reflect true health differences rather than a mixture of health differences and reporting heterogeneity. This has been suggested by King, Murray et al. (2004), who label their model the hierarchical ordered probit (HOPIT).
The HOPIT model is specified in two parts: one reflecting reporting behaviour and another representing the relationship between the individual's own health and the observables. The use of vignettes to identify the cut-points and so systematic reporting heterogeneity relies on two assumptions. First, there must be response consistency: individuals classify the hypothetical cases represented by the vignettes in the same way as they rate their own health. That is, the mapping used to translate the perceived latent health of others to reported categories is the same as that governing the correspondence between own latent and reported health. This is essential if we are to learn about how individuals report their own health from how they rate others' health. The assumption is not indisputable. Strategic behaviour might influence reporting of own health but not that of others. For example, entitlement rules for disability transfers provide an incentive to understate own health but are irrelevant to the reporting on others' health. The second assumption necessary for identification of reporting behaviour via the vignettes is vignette equivalence: the level of the variable represented by any one vignette is perceived by all respondents in the same way and on the same unidimensional scale (King, Murray et al. 2004, p.194). If this did not hold, then one could not interpret variation in responses to a given vignette as reflecting differences in evaluations of health for a given level of functioning in a single health domain.⁸

3.2.1 Reporting behaviour

The first (vignette) component of the HOPIT uses information on the vignette ratings to model the cut-points as functions of covariates. For a given health domain, let $Y_{ij}^{v*}$ be the latent health level of vignette $j$ as perceived by individual $i$. Given that each vignette $j$ is assumed to represent a fixed level of ability, any association between the latent level of health $Y_{ij}^{v*}$ and individual characteristics is ruled out. $E[Y_{ij}^{v*}]$ is therefore assumed to depend solely on the corresponding vignette. Formally, it is assumed that $Y_{ij}^{v*}$ is determined by:

$$Y_{ij}^{v*} = \alpha_j + \varepsilon_{ij}^v, \quad \varepsilon_{ij}^v \sim N(0, 1). \qquad (3)$$

The observed vignette ratings $y_{ij}^v$ relate to $Y_{ij}^{v*}$ in the following way:

$$y_{ij}^v = k \iff \tau_i^{k-1} \le Y_{ij}^{v*} < \tau_i^k \qquad (4)$$

for $k = 1, \ldots, K$, with $\tau_i^0 < \tau_i^1 < \ldots < \tau_i^{K-1} < \tau_i^K$, $\tau_i^0 = -\infty$ and $\tau_i^K = \infty$. The cut-points are defined as functions of covariates but are assumed not to vary across different vignettes $j$ for a given health domain, for instance:

$$\tau_i^k = X_i \gamma^k \qquad (5)$$

Note that the individual's characteristics are included only in the cut-points, reflecting the assumption that all the systematic variation in the vignette ratings can be attributed to individual reporting behaviour.⁹

3.2.2 Health equation

Similar to the ordered probit, the second component of the HOPIT defines the latent level of individual own health, $Y_i^{s*}$, and the observation mechanism that relates this latent variable to the observed categorical variable, $y_i$. The difference is that the cut-points are no longer constant parameters but can vary across individuals, being determined by the vignette component of the model. Identification derives from the response consistency and the vignette equivalence assumptions. The possibility of fixing the cut-points leads to the specification of the model for individual own health as an interval regression, enabling the identification of the constant term and the variance. The latent level of individual own health is specified as:

$$Y_i^{s*} = Z_i \beta + \varepsilon_i^s, \quad \varepsilon_i^s \mid Z_i \sim N(0, \sigma^2) \qquad (6)$$

where $Z_i$ is a vector of covariates including a constant. The observed categorical variable $y_i$ is determined by:

$$y_i = k \iff \tau_i^{k-1} \le Y_i^{s*} < \tau_i^k \qquad (7)$$

for $k = 1, \ldots, K$, with $\tau_i^0 < \tau_i^1 < \ldots < \tau_i^{K-1} < \tau_i^K$, $\tau_i^0 = -\infty$ and $\tau_i^K = \infty$, and where the $\tau_i^k$ are as defined in (5). It is assumed that the error terms in the vignette and own latent health equations, $\varepsilon_{ij}^v$ and $\varepsilon_i^s$ respectively, are independent for all $i = 1, \ldots, N$ and $j = 1, \ldots, V$.

The likelihood function depends on the probabilities of observing particular vignette responses and the probability of a particular own health category being reported. Although the errors in the two components of the model are assumed independent, the likelihood does not factorise into two independent parts since the two components of the model are linked through parameter restrictions. The vignette component identifies the threshold parameters, which are imposed in the estimation of the latent health function.
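How the two components share the cut-points can be made concrete with a small sketch of one individual's contribution to the log-likelihood. It assumes the linear cut-point form given "for instance" in (5) and the normal errors of (3) and (6). All function and argument names are illustrative and not from the paper, and the sketch leaves out the normalisations and the optimiser a real estimation routine would need.

```python
from math import log
from statistics import NormalDist

_PHI = NormalDist().cdf  # standard normal CDF


def cutpoints(x, gammas):
    """Individual cut-points tau_i^k = x_i . gamma^k, k = 1..K-1 (equation 5)."""
    taus = [sum(xj * gj for xj, gj in zip(x, g)) for g in gammas]
    if any(a >= b for a, b in zip(taus, taus[1:])):
        raise ValueError("cut-points must be strictly increasing")
    return taus


def category_prob(k, mean, sd, taus):
    """P(tau^{k-1} <= Y* < tau^k) for Y* ~ N(mean, sd^2), with k in 1..K."""
    n_cat = len(taus) + 1
    upper = 1.0 if k == n_cat else _PHI((taus[k - 1] - mean) / sd)
    lower = 0.0 if k == 1 else _PHI((taus[k - 2] - mean) / sd)
    return upper - lower


def hopit_loglik_i(y_own, y_vign, x, z, beta, sigma, alphas, gammas):
    """Log-likelihood contribution of one respondent.

    y_own  : reported own-health category (1..K)
    y_vign : list of reported categories, one per vignette j
    alphas : vignette means alpha_j (equation 3, unit error variance)
    beta, sigma : own-health index coefficients and error s.d. (equation 6)
    """
    taus = cutpoints(x, gammas)  # the same cut-points enter both components
    ll = 0.0
    for j, rating in enumerate(y_vign):
        ll += log(category_prob(rating, alphas[j], 1.0, taus))  # equations (3)-(4)
    own_mean = sum(zj * bj for zj, bj in zip(z, beta))
    ll += log(category_prob(y_own, own_mean, sigma, taus))  # equations (6)-(7)
    return ll
```

Because `gammas` appears in both the vignette terms and the own-health term, the sum cannot be maximised as two separate problems, which is the sense in which the likelihood does not factorise.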

3.2.3 Test of homogeneous reporting behaviour

This framework offers the possibility of testing for heterogeneous reporting behaviour in relation to individual characteristics. This is done by means of log-likelihood ratio tests of significance of (groups of) covariates in the cut-points of model (4), (5). If a set of coefficients relating to some factors is found to be jointly significant, then the null hypothesis of homogeneity of reporting behaviour with respect to these factors is rejected. We also use likelihood ratio tests to test whether the effect of a covariate is equal across all thresholds (this is labelled as parallel cut-point shift). We return to this in the next section.

4 Results

For each of the six health domains, we estimate ordered probit models, equations (1) and (2), and HOPIT models, equations (3)-(7), separately for each of the three countries. The index function and the cut-points are specified as functions of the same covariates: FEMALE, AGE3044, AGE4559, AGE60, EDUC2, EDUC3, EDUC4, Log(INCOME) and URBAN. The mean health function in the vignette component of the HOPIT includes only dummies indicating the respective vignettes. With 2 models estimated for 6 domains and 3 countries, we do not present all the parameter estimates.¹⁰ We first report results on tests for homogeneous reporting behaviour. Next we turn to the quantitative effects of reporting heterogeneity and to what degree reporting heterogeneity biases measures of inequality in health.

4.1 Tests of reporting homogeneity

Table 3 presents the results of tests of homogeneous reporting behaviour and parallel cut-point shift. For homogeneity, each column gives the p-values of likelihood ratio tests of joint significance of the respective (groups of) covariates in the 4 cut-points. For each country, the first column shows evidence of cut-point heterogeneity according to at least one of the characteristics for all health domains. For the specific characteristics, the tests indicate some variation in the presence of reporting heterogeneity across domains and countries.
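The likelihood ratio tests of section 3.2.3 can be illustrated with a short Python sketch. The helper name `lr_test` and the log-likelihood values in the usage note are hypothetical; only the test statistic and its chi-squared reference distribution are standard.

```python
from scipy.stats import chi2

def lr_test(loglik_restricted, loglik_unrestricted, n_restrictions):
    """Likelihood ratio test of a restricted model nested in an unrestricted one.

    Returns the statistic 2 * (lnL_u - lnL_r) and its p-value from a
    chi-squared distribution with n_restrictions degrees of freedom."""
    stat = 2.0 * (loglik_unrestricted - loglik_restricted)
    p_value = chi2.sf(stat, df=n_restrictions)
    return stat, p_value
```

For example, excluding a single covariate such as Log(INCOME) from all 4 cut-points imposes 4 restrictions, so a call might look like `lr_test(-1005.0, -1000.0, 4)` with made-up log-likelihoods. Testing parallel cut-point shift for one covariate instead constrains its 4 coefficients to be equal, which imposes 3 restrictions.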
Homogeneity of reporting by sex is rejected (5% or less) for all domains in the case of India but only for two domains in each of Indonesia and China. Homogeneity by age is rejected for four domains in China, three domains in Indonesia and two domains in India. The null hypothesis that the cut-points are invariant with respect to education is rejected for three domains in Indonesia and two in each of India and China but there is relatively little consistency across the countries in the domains for which there is evidence of reporting heterogeneity. The evidence for reporting heterogeneity by income is stronger. The null is rejected for all but one domain in each of India and China and for all but two domains in Indonesia. There is also strong evidence for differences in reporting behaviour across urban and rural locations.

TABLE 3

In the final column of Table 3, we report tests of whether the covariates affect all cut-points by the same magnitude, i.e. whether there is parallel cut-point shift. The null is decisively rejected in all cases but for affective behaviour in China. This suggests that covariates do not simply alter the overall conception of health but that reporting behaviour is stronger at some levels of health than others and that the effect need not even be monotonic. The nature of the reporting differences can be better understood through examination of the cut-point coefficients themselves. We now turn to this.

4.2 Reporting behaviour

The response categories for the degree of difficulty / pain / distress within any domain range from Extreme / Cannot do to None and so higher health standards or expectations are represented in the HOPIT model by positive shifts in the cut-points. If a certain covariate has positive coefficients across all the cut-points, then higher values of the covariate are associated with higher health standards, i.e. lower probabilities of reporting better levels of health. Tables 4 and 5 present the cut-point coefficients of the income and education variables respectively. To save on space, we do not present the coefficients for sex, age and urban but illustrate the direction and magnitude of all effects graphically (Figure 1).
For income, consistent with the LR tests, there is a significant effect on at least one cut-point for all domains and countries except for mobility in Indonesia and China. The significant coefficients are mostly positive, with the exceptions of pain and self-care for Indonesia and mobility for India. There are no significant negative effects on the uppermost cut-point (4) and many significant positive effects, the latter implying that the better-off have a lower probability of rating a vignette as corresponding to no difficulty / pain / distress. They have a higher standard regarding what it means to have very good health.

TABLE 4

Reporting behaviour by education level is illustrated by the results presented in Table 5. There is less consistency than there is for income, with more cases of both positive and negative coefficients for a given domain and country and even cases of significant effects in opposite directions. For Indonesia and India, most of the coefficients for the uppermost cut-point are negative, although significance is reached only for mobility and cognition, indicating that more educated people are more likely to report no difficulties. The opposite is true for China, although there are significant positive effects on the uppermost cut-point only for cognition.

The results for Indonesia and India are perhaps surprising. They do not support the contention that education raises health expectations. Rather, taken at face value, they suggest that the better educated are more likely to tolerate ill-health. Another possibility is that there is differential capacity by education level in comprehension of the vignette rating exercise. Indeed, Murray, Ozaltin et al. (2003) find that lower educated groups display greater inconsistencies between their ranking of the vignettes and the average ranking. But while this suggests that there is more noise in the ratings of the less well educated, it does not explain why the better educated are more likely to give more positive evaluations. It is notable that the Chinese sample has considerably higher levels of education than the others and that the direction of the education effect is consistent with what might be hypothesised a priori; that is, health expectations rising with education. It may be that the vignette exercise was more comprehensible for the Chinese sample.
TABLE 5

Since it is difficult to directly assess the relative importance of the reporting effects from the coefficients alone, we use the parameter estimates of the reporting model (3)-(5) to calculate, per domain, the probability that an individual with given characteristics will rate a hypothetical individual (vignette) as being without difficulty / pain / distress, which we will refer to as very good health. As the reference individual we use a male from the youngest age group, without even primary level education, living in a rural area and with income at the threshold of the poorest quintile. To assess the effect of reporting differences by income alone, we redo the calculation for the same individual, but now with income at the threshold of the richest quintile. The ratio of the two probabilities is used as a measure of the relative magnitude of the reporting effect. We repeat these calculations changing in turn sex from male to female, age from the youngest to oldest group, location from rural to urban and education from the lowest to highest level. The results are depicted in Figure 1.

FIGURE 1

The top row of the figure presents the income effects. A ratio smaller than one implies that an individual with income at the threshold of the top quintile has a lower probability of reporting very good health than an otherwise identical individual at the threshold of the bottom income quintile. For Indonesia, as was evident from the coefficients in Table 4, there are no differences in reporting by income for four of the six domains. Even the two significant effects for cognition and affect are quantitatively unimportant, with the probability of the highest quintile reporting very good health being only 1% smaller than that of the bottom quintile. For India, the significant effects of income for pain, self-care and usual are quantitatively slightly more important. In particular, at the highest quintile, the probability of reporting no pain is 4% less than that of the lowest quintile. For China, the relative difference in probabilities reaches about 6% for both pain and affect.

As regards education (second row), there are again few quantitatively important effects. For Indonesia, the highest education category has a probability of reporting no pain that is 5% greater than that of the lowest group. For China, the direction of the education effect on the reporting of pain is reversed but is again about 5% in relative magnitude.
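The probability ratios plotted in Figure 1 can be illustrated with a small Python sketch based on equations (3)-(5). It is a sketch under assumptions rather than the authors' calculation: the function names are hypothetical, the choice of vignette (through its mean `alpha_j`) is left to the caller because the text does not specify it, and any numbers passed in would be illustrative, not estimates from the paper.

```python
import numpy as np
from scipy.stats import norm

def prob_top_category(x, gamma_top, alpha_j):
    """P(vignette j rated in the best category) = 1 - Phi(tau^{K-1} - alpha_j),

    where tau^{K-1} = x @ gamma_top is the uppermost cut-point from eq. (5)
    and the vignette error has unit variance as in eq. (3)."""
    return norm.sf(x @ gamma_top - alpha_j)

def reporting_ratio(x_ref, x_alt, gamma_top, alpha_j):
    """Ratio of the probability for the altered individual to the reference one.

    x_alt should differ from x_ref in a single characteristic."""
    return (prob_top_category(x_alt, gamma_top, alpha_j)
            / prob_top_category(x_ref, gamma_top, alpha_j))
```

With a positive income coefficient on the uppermost cut-point, moving income from the bottom to the top quintile threshold raises that cut-point, so the ratio falls below one, which is the pattern the text describes for the better-off.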
Otherwise, the relative differences in reporting between high and low education groups barely exceed 2%.

For Indonesia and India, people from urban areas are more likely to report very good health than those living in rural areas (Figure 1, row 3). The effect is largest for pain, with about a 10% relative difference in probabilities for Indonesia and a difference of almost 4% for India. In China, the urban-rural effect is in the opposite direction for pain.

For India and China but not for Indonesia, men are more likely than women to report very good health. The relative difference in probabilities is largest for pain, reaching 7% and 9% for India and China respectively. This is consistent with a large body of epidemiological and experimental evidence showing that women are more likely to report negative responses to (their own) pain (Unruh 1996; Riley, Robinson et al. 1998).

Generally, the youngest age group (15-29 years) is more likely to rate a vignette as corresponding to no difficulty / pain / distress than the oldest group (60+ years). The differences are again greatest for pain. The direction of these age effects is perhaps surprising given evidence that, conditional on some objective health indicator, the elderly are more likely to assess their health positively (Idler 1993; Van Doorslaer and Gerdtham 2003; Lindeboom and van Doorslaer 2004). The evidence on whether perceptions of pain vary with age is, however, ambiguous (Gibson and Helme 2001). Our results may be explained by a greater capacity of elderly respondents, given their greater life experiences, to empathise with the vignette descriptions. If this were the case, it would imply violation of the assumption of vignette equivalence. Murray, Ozaltin et al. (2003) claim some support for this proposition in the WHO-MCS data, finding that the degree of correlation between an individual's ranking of vignettes and the average ranking is increasing with age.

The general picture that emerges from the figures is that the reporting effects of income and education are relatively small compared to the age, sex and urban/rural effects. This suggests that income and education related reporting heterogeneity may not have a large impact on measures of socio-economic inequality in health. We now turn to this issue.

4.3 Purging reporting bias from measures of health disparities

The parameter estimates of the index function of the standard ordered probit model (1) will reflect true health effects and the effects of reporting heterogeneity. Therefore, in the presence of reporting heterogeneity, inequality measures based on these parameter estimates will be biased.
With estimates from the HOPIT model, we can separate the reporting heterogeneity (parameters γ from equation 5) from the true health effects (parameters β from equation 6). In order to gauge the degree of bias generated by reporting heterogeneity we compare inequality measures based on the ordered probit model with those obtained from the appropriate parameters of the HOPIT model. Given that the scale of the latent variable is not identifiable in the ordered probit model, the constant term and the variance are usually set equal to 0 and 1, respectively. Here, in order to make the estimated effects from the two models comparable, we fix the scale of the ordered probit model by setting the constant term and the variance equal to those estimated by the HOPIT model.

TABLES 6 & 7

The income and education coefficients in the health equations, (1) and (6), are shown in Tables 6 and 7 respectively. For all three countries, the ordered probit results indicate significant positive relationships between income and all health domains, except for pain in Indonesia and India, even without any adjustment for reporting heterogeneity. For 14 of the 18 cases (6 health domains by 3 countries), the HOPIT adjustment increases the magnitude of the income coefficient. Indeed, regarding reporting behaviour, as we saw in section 4.2, better-off individuals generally have higher expectations (standards) for health. The HOPIT model can separate this effect from the health effects and therefore gives greater income gradients than the ordered probit model. A first conclusion is therefore that the positive association between income and health is underestimated if reporting heterogeneity by income is not accounted for (in 14 of the 18 cases). In India, the income coefficient in the pain function becomes significant after the purging of reporting heterogeneity. But in Indonesia, significance is lost from the income coefficient in the self-care function.

The ordered probit education coefficients are significantly positive for all three countries in almost every domain, confirming a positive association between health and education before the correction for reporting bias (Table 7). The vignette adjustment for reporting bias leads to a decrease in 13 of the 18 education coefficients for Indonesia and 15 of the 18 for India.
For these two countries, in general, more educated people appear to over-report their health (in particular, they are more likely to report no difficulties/pain/distress in a given domain) and this means that the estimated effects of education on health are overstated when reporting bias is not accounted for. The adjustment is in the opposite direction in the case of China. Purging reporting bias raises 12 of the 18 education coefficients.

Again we performed some calculations with the model in order to quantify the effects of correcting for reporting heterogeneity on a measure of inequality. We calculate per domain the probability of having no difficulty/pain/distress for the reference individual as defined above, i.e. male, youngest age group, lowest education, rural dweller and income at threshold of poorest quintile. Changing one characteristic, re-computing the probability and expressing this as a ratio of that for the reference individual gives a measure of relative inequality that reflects the health gradient of each characteristic holding the others constant. We calculate the ratio using the standard ordered probit and the HOPIT model. For the HOPIT calculations we fix the cut-points to the characteristics of the reference individual. The calculated ratios based on the HOPIT model now reflect purely health effects. The difference between the ordered probit and the HOPIT results gives an indication of the extent of the bias induced by reporting heterogeneity. Note that this procedure purges reporting heterogeneity deriving from all covariates and not only that for which the health disparity is computed. The results for income, education, sex, age and the urban-rural differences are depicted in Figure 2.

FIGURE 2

From the first row of the figure it is immediately clear that income gradients in health are strongest in China and are negligible in Indonesia. The correction for reporting heterogeneity does not give rise to any noticeable gradient in Indonesia. For India, the income gradients are modest and while purging reporting bias (indicated by the difference between the light and dark bars) consistently shifts them upward, substantially so in relative terms for pain, self-care and usual, they remain modest after the adjustment. For China we generally see a strong positive income gradient that is increased, most noticeably for pain and affect, after correction for reporting heterogeneity.

There is a positive education gradient for all countries, which is shifted slightly downward for India and Indonesia but not substantially so. For Indonesia and India there is an apparent urban health advantage that is reduced after purging reporting bias.
The adjustment is greatest for pain, particularly in Indonesia, where we find an 8 percentage point difference between the two bars. For China, we always find the probability of very good health to be lower in urban areas and this disparity is increased in four of the six domains after purging reporting bias. The gender gradient varies most across the different health dimensions and less so across countries. Where there is a clear male advantage (cognition and pain), purging reporting differences reduces the disparity in India and China and raises it in Indonesia. Finally, as expected, we find large and positive age effects for all three countries. The correction for reporting heterogeneity generally reduces the age gradient and in some cases the changes are substantial.

Figure 2 illustrates the effect of purging reporting heterogeneity from partial associations between covariates and health. Measurement of socio-economic inequalities in health usually focuses on the total association between health and some measure of socio-economic rank, possibly standardised for demographics like age and sex. To check on the effect of reporting bias on a measure of total socio-economic inequality in health, we compute the concentration index (Kakwani, Wagstaff et al. 1997) for the predicted probability of reporting good health (as defined in Figure 2) against income. Probabilities are obtained both from the ordered probit and from the HOPIT with cut-points set equal to those of the reference individual as defined above. Control is made for differences in demographic composition by income level by setting age and sex to the values of the reference individual in predicting the health index from both models. Results are presented in Figure 3 and show that total income-related health inequality is generally largest in India whereas the partial correlations show greatest disparities in China (Figure 2, row 1). But the effect of purging reporting heterogeneity from both the partial and total correlations is similar. There is a slight upward adjustment to the disparities in most cases and a marked increase in health inequality by income only in the domains of pain and affect in China.

5 Conclusion

In this paper we have investigated whether there is heterogeneity in health reporting and whether and how this affects the measurement of socio-demographic disparities in six domains of self-reported health. We have done this for three low/middle income Asian countries (India, Indonesia and China) using WHO-MCS data that, in addition to respondents' assessments of their own health domains, include their assessments of vignette descriptions of the health domains.
Such data allow for the estimation of hierarchical ordered probit models which consist of two simultaneously estimated parts: the vignette ratings are used to estimate the effects of socio-demographics on thresholds for reporting levels of health, while respondents' own health ratings are used to estimate socio-demographic effects on own health. We then use these estimates to test reporting homogeneity and to examine the impact of correcting for heterogeneity on disparities in health by socio-economic and demographic characteristics.

The hypothesis of homogeneous reporting across all socio-demographics is rejected for all countries and health domains. There is variation across countries and health domains in the rejection of homogeneous reporting with respect to individual socio-demographic characteristics. Homogeneity tends to be most consistently and decisively rejected across urban/rural, income and age differences and less consistently across sex and education groups. Parallel shift of reporting thresholds is rejected in all but one case, indicating that socio-demographics do not simply shift the thresholds by the same magnitude and in the same direction.

There is variation in the direction and strength of the reporting differences across countries and domains. Generalising and so obscuring this variation, younger, male (not Indonesia), better educated (not China), low income and urban respondents display lower health expectations. These groups are more likely to assess a health condition positively. Reporting of health varies most across age, sex and urban/rural differences and less by education and income. Correcting for reporting heterogeneity tends to reduce disparities in health by age, sex (not Indonesia), urban/rural and education (not China) and to increase income disparities in health. Overall, while homogeneous reporting is significantly rejected, our results suggest that the size of the reporting bias in measures of health disparities is not large.

Some of these results might be considered surprising. Previous evidence suggests that the elderly have lower expectations of health and on this basis one would expect age disparities in health to increase after purging reporting heterogeneity. As mentioned above, a possible explanation for our contradictory result is that the assumption of vignette equivalence does not hold with respect to age. Older respondents may comprehend the level of functioning/pain/distress that a vignette is intended to describe differently from younger respondents.
Recognising conditions described in the vignettes from their own experience, older respondents may be more appreciative of the consequent constraints on health while younger respondents, lacking exposure to such conditions, may be more dismissive. If this were true, it would invalidate the vignette approach. We have no evidence to support such a conclusion. In fact, it is mainly for pain that we find the elderly to have lower thresholds and, as noted above, the evidence on whether perceptions of pain vary with age is ambiguous (Gibson and Helme 2001). In the domain of mobility, Murray, Ozaltin et al. (2003) find reporting behaviour consistent with health expectations falling with age for six countries covered by the WHO-MCS. But they also find that ranking of vignettes varies systematically with age and education, suggesting that comprehension of the described levels of health varies with these characteristics. Further evaluation of the validity of the vignette approach is clearly required in the form of experiments designed to directly test the assumption of vignette equivalence and that of response consistency.

Future applications of the vignette approach should also give consideration to what variation in reporting it is appropriate to remove from a health measure. Arguably, perceptions of health are more important to quality of life experiences than are objective health conditions. This raises the difficult question of whether health is interpersonally comparable. Any attempt to measure health inequality must assume that it is. In this context, we argue that an appropriate measure of socio-economic inequality in health should correct for any tendency of better-off individuals to report their health more negatively for a given condition. But it may not be considered appropriate to remove differences in the reporting of health by sex, for example. The tendency for women to report pain more negatively, confirmed here for India and China, presumably does indicate that the real experience of pain is greater for women and this should be reflected in a health measure.

Finally, our general finding that, while significant, reporting heterogeneity does not appear to have a large quantitative impact on measured socio-economic disparities in health may be contingent upon the measurement of health separately in each of six domains rather than through a single indicator of general health. By separating health into six dimensions, much of the heterogeneity in the reporting of the standard self-assessed health question is removed. There is no heterogeneity deriving from differential weighting of each dimension of health.
It remains to be seen whether the vignette approach can be extended to the measurement of general health and if so what will be the impact on disparities in general health.

References

Baker, J. L. and J. Van der Gaag (1993). Equity in health care and health care financing: Evidence from five developing countries. Equity in the finance and delivery of health care. E. Van Doorslaer, A. Wagstaff and F. Rutten. Oxford, Oxford University Press.

Baker, M., M. Stabile and C. Deri (forthcoming). What do self-reported, objective measures of health measure? Journal of Human Resources.

Benitez-Silva, H., M. Buchinsky, H. M. Chan, S. Cheidvasser and J. Rust (1999). How large is the bias in self-reported disability? Journal of Applied Econometrics 19(6): 649-670.

Bound, J. (1991). Self reported versus objective measures of health in retirement models. Journal of Human Resources 26: 107-137.

Disney, R., C. Emmerson and M. Wakefield (forthcoming). Ill-health and retirement in Britain: A panel data based analysis. Journal of Health Economics.

Gibson, S. J. and R. D. Helme (2001). Age-related differences in pain perception and report. Clinical Geriatric Medicine 17: 433-456.

Gwatkin, D. R., S. Rutstein, K. Johnson, R. Pande and A. Wagstaff (2000). Socioeconomic differences in health, nutrition and population. World Bank Health, Nutrition and Population Discussion Paper. Washington DC.

Hernandez-Quevedo, C., A. M. Jones and N. Rice (2004). Reporting bias and heterogeneity in self-assessed health. Evidence from the British Household Panel Survey. ECuity III Working Papers. York.

Idler, E. L. (1993). Age differences in self-assessments of health: age changes, cohort differences, or survivorship? Journal of Gerontology 48(6): S289-300.

Kakwani, N., A. Wagstaff and E. van Doorslaer (1997). Socioeconomic inequalities in health: measurement, computation and statistical inference. Journal of Econometrics 77(1): 87-104.

Kapteyn, A., J. Smith and A. van Soest (2004). Self-reported work disability in the US and the Netherlands. RAND Working Paper. Santa Monica.

Kerkhofs, M. J. M. and M. Lindeboom (1995). Subjective health measures and state dependent reporting errors. Health Economics 4: 221-235.

King, G., C. J. L. Murray, J. Salomon and A. Tandon (2004). Enhancing the validity and cross-cultural comparability of measurement in survey research. American Political Science Review 98(1): 184-91.

Kreider, B. (1999). Latent work disability and reporting bias. Journal of Human Resources 34(4): 734-769.

Lindeboom, M. and E. van Doorslaer (2004). Cut-point shift and index shift in self-reported health. Journal of Health Economics 23(6): 1083-1099.

Mathers, C. D. and R. M. Douglas (1998). Measuring progress in population health and well-being. Measuring progress: is life getting better? R. Eckersley. Collingwood, CSIRO Publishing: 125-155.

Murray, C. J. L. (1996). Epidemiology and morbidity transitions in India. Health, poverty and development in India. M. Dasgupta, C. L.C. and T. N. Krishnan. Delhi, Oxford University Press: 122-147.

Murray, C. J. L., E. Ozaltin, A. Tandon, J. Salomon, R. Sadana and S. Chatterji (2003). Empirical evaluation of the anchoring vignettes approach in health surveys. Health systems performance assessment: debates, methods and empiricism. C. J. L. Murray and D. B. Evans. Geneva, World Health Organization.

Pudney, S. and M. Shields (2000). Gender, race, pay and promotion in the British nursing profession: estimation of a generalized probit model. Journal of Applied Econometrics 15: 367-399.

Riley, J. L., M. E. Robinson, E. A. Wise, C. D. Myers and R. B. Fillingim (1998). Sex differences in the perception of noxious experimental stimuli: a meta-analysis. Pain 74: 181-187.

Salomon, J., A. Tandon, C. J. L. Murray and W. H. S. P. S. C. Group (2004). Comparability of self-rated health: Cross sectional multi-country survey using anchoring vignettes. British Medical Journal 328: 258.

Sen, A. (2002). Health: perception versus observation. British Medical Journal 324: 860-1.

Stern, S. (1989). Measuring the effect of disability on labor force participation. Journal of Human Resources 24: 361-395.

Tandon, A., C. J. L. Murray, J. A. Salomon and G. King (2003). Statistical models for enhancing cross-population comparability. Health systems performance assessment: debates, methods and empiricism. C. J. L. Murray and D. B. Evans. Geneva, World Health Organization: 727-746.

Terza, J. V. (1985). Ordinal probit: a generalization. Communications in Statistics 14(1): 1-11.

Unruh, A. M. (1996). Gender variations in clinical pain experience. Pain 65: 123-167.