WORKING PAPERS IN ECONOMICS & ECONOMETRICS TESTING PROVIDERS' MORAL HAZARD CAUSED BY A HEALTH CARE REPORT CARD POLICY


WORKING PAPERS IN ECONOMICS & ECONOMETRICS

TESTING PROVIDERS' MORAL HAZARD CAUSED BY A HEALTH CARE REPORT CARD POLICY

Yijuan Chen
Research School of Economics
Australian National University
Canberra, ACT 0200, Australia
yijuan.chen@anu.edu.au

Juergen Meinecke
Research School of Economics
Australian National University
Canberra, ACT 0200, Australia
juergen.meinecke@anu.edu.au

JEL codes: I18, D82, C31
Working Paper No: 527
ISBN: 086831 527 3
September 2010

Testing Providers' Moral Hazard Caused by a Health Care Report Card Policy

Yijuan Chen (yijuan.chen@anu.edu.au) and Jürgen Meinecke (juergen.meinecke@anu.edu.au)
Research School of Economics, H.W. Arndt Building 25A, Australian National University, Canberra, ACT 0200, Australia

September 22, 2010

Abstract

This paper tests for providers' moral hazard caused by a health care report card policy. We argue that, to identify providers' moral hazard, empirical approaches must recognize that the policy may cause different sides of the market to take actions. Neglecting this, an estimation strategy will estimate treatment effects that capture only a mixture of providers' and patients' actions, and therefore cannot identify either side's action. We propose a simple remedy to the estimation strategy in the previous literature: restrict the sample to data before the report cards are published, and set the date when providers' performance starts being recorded as the effective date of the policy. The U.S. state of Pennsylvania started collecting information on mortality outcomes for coronary artery bypass graft (CABG) surgery in 1990. The first report cards were published in 1992. Using U.S. Nationwide Inpatient Sample data from 1988 to 1992, we find insignificant quantity and incidence effects of the report-card policy before report cards are published. This means that the report card policy did not affect the likelihood that heart patients receive CABG surgery, and it did not lead hospitals to select patients strategically.

Keywords: health care report cards; provider moral hazard; difference in differences estimation; incidence effect

Acknowledgements: We are very grateful for comments from participants at the 2010 Labour Econometrics Workshop at Deakin University and the ANU micro brownbag. All remaining errors are our own.

1 Introduction

In sectors such as health care and education, consumers typically don't know the quality of producers (hospitals, physicians, schools, teachers, etc.) before making their consumption decisions. To reduce such informational asymmetry, policy makers increasingly resort to quality report cards, which publish information about producers' performance. A recent example is the My School web site (http://www.myschool.edu.au), launched in January 2010, which publishes nationally comparable data on Australian schools. Among these policies, however, the ones that arguably receive the most academic attention are the state-wide health care report cards for coronary artery bypass graft (CABG) surgery providers in the U.S. For a comprehensive review of studies on the CABG report cards, see Epstein (2006); recent studies include Gravelle and Sivey (2010), among others.

One particular focus of the literature is on health care providers' moral hazard caused by the report cards. Arguably the most compelling results to date come from Dranove et al. (2003). They point out that, despite the risk adjustment procedures used in producing the report cards, providers are likely to have better information on patients' conditions than the clinically detailed database, and may use such private information to improve their results by selecting patients. Dranove et al. use a comprehensive longitudinal Medicare claims data set, combined with American Hospital Association data, that allows them to follow patients over time to test empirically for provider selection. Using total inpatient hospital expenditures in the year prior to admission as a proxy for illness severity before CABG surgery, they find that illness severity decreases by 3.47%–5.30% due to the introduction of hospital report cards. Furthermore, they estimate that this decrease in illness severity goes hand in hand with an increase of 0.60–0.91 percentage points in the probability that an average patient will undergo CABG surgery within one year of admission for acute myocardial infarction (AMI). Dranove et al. conclude that the increase in quantity is accounted for by surgeries on less severely ill patients.

In this paper we argue that, to identify providers' moral hazard, empirical approaches should recognize that the report-card policy may cause different sides of the market to take actions. Specifically, the policy may not only affect providers' actions but also influence patients' choices once they can infer providers' quality from the report cards. Each side's action can affect a treatment effect of interest. Neglecting this, an estimation strategy will lead to an estimated treatment effect that only captures the mixture of providers' and patients' actions, and therefore cannot identify either side's action. To test providers' moral hazard, we propose a simple remedy to the estimation strategy in the previous literature: restrict the sample to data before the report cards are published, and set the date when providers' performance starts being recorded as the effective date of the policy. The period between the effective date of the policy and the publication of the report cards is the period when patients do not yet have access to the report cards, but providers, knowing their performance data are being collected, may take actions to affect their report-card results.

Using U.S. Nationwide Inpatient Sample data from 1988 to 1992, we estimate how health care report cards on CABG surgery in Pennsylvania affected the likelihood that heart patients receive CABG surgery (quantity effect). We also estimate whether hospitals select patients strategically to bias the report cards in their favor (incidence effect). We find insignificant quantity and incidence effects of health care report cards. This finding is consistent with Chen's (2009) theoretical paper on hospital and patient selection. It is, however, in stark contrast to Dranove et al.'s (2003) findings of negative incidence effects and positive quantity effects.

2 Theoretical Framework

A policy like the health care report card is essentially a signalling device, so signalling-game models are the most suitable tools for theoretically analyzing the treatment effects of such a policy, and they should provide a solid foundation on which empirical studies can be based. For concreteness, yet without loss of generality, we focus on the CABG report cards hereafter. The simplest theoretical model is one where the report cards are issued only once, which entails a two-period game. In each period, there is a set of patients who will be active only within that period. There is also a set of health care providers who practice in both periods. The providers differ in their types (skills, ability, etc.), which are unobserved by the patients at the beginning of the game. The interaction between a health care provider and her 1st-period patients forms the database on which a report card for her will be published at the beginning of the second period. Upon seeing the report cards, the 2nd-period patients will update their beliefs about the providers' types and choose providers accordingly. In the first period, the report card policy may cause the providers to take actions (moral

hazard), such as selecting patients, in order to signal or hide their types through the report cards. On the other hand, not knowing the providers' types, the 1st-period patients have to choose providers as if the policy were not implemented. In the second period, however, the scenario is different: since it is the last period of the game, the providers have no incentive to shun patients or to exert extra effort, while the report cards allow the 2nd-period patients to select providers. In short, the policy has different treatment effects in different periods. We call the treatment effects in the first period the period-1 effects, and those in the second the period-2 effects, as indicated on the time line in Figure 1.

[Figure 1 about here.]

If we have pre- and post-policy data from the treatment group and a control group, the period-1 effects and the period-2 effects can be estimated by the difference in differences approach, as the following model illustrates. Without loss of generality, suppose there is only one period-1 effect and one period-2 effect. Denote a time period by t = 0, 1, 2. Let the superscript T denote the treatment group and NT the control group. Let y be the dependent variable on which we study the treatment effect of the policy. Suppose there are n members in the treatment group and m members in the control group. For ease of illustration, ignore other covariates. Suppose

    y_{j0}^T = \alpha_0 + \varepsilon_{j0},              y_{j'0}^{NT} = \alpha_0 + \varepsilon_{j'0},
    y_{j1}^T = \alpha_1 + \beta_1 + \varepsilon_{j1},    y_{j'1}^{NT} = \alpha_1 + \varepsilon_{j'1},
    y_{j2}^T = \alpha_1 + \beta_2 + \varepsilon_{j2},    y_{j'2}^{NT} = \alpha_1 + \varepsilon_{j'2},

where \varepsilon_{jt} and \varepsilon_{j't} are i.i.d. with E[\varepsilon_{jt}] = E[\varepsilon_{j't}] = 0. Then

    E[y_0^T] = \alpha_0,    E[y_1^T] = \alpha_1 + \beta_1,    E[y_2^T] = \alpha_1 + \beta_2,
    E[y_0^{NT}] = \alpha_0,    E[y_1^{NT}] = E[y_2^{NT}] = \alpha_1,

and identification of the period-1 effect \beta_1 and the period-2 effect \beta_2, which validates the difference in differences estimators, stems from

    \beta_1 = [E[y_1^T] - E[y_0^T]] - [E[y_1^{NT}] - E[y_0^{NT}]],
    \beta_2 = [E[y_2^T] - E[y_0^T]] - [E[y_2^{NT}] - E[y_0^{NT}]].

Straightforward though it looks, estimating period-1 effects and period-2 effects separately closes two loopholes in using a single difference in differences estimator,

which, after an effective date of the policy is decided, compares the pre- and post-policy data only once. First, when using the single difference in differences estimator, previous research often sets t = 2, i.e. the date when the report cards are issued, as the effective date of the policy. This essentially uses the combination of period-0 and period-1 data as the pre-policy data and treats the period-2 data as the post-policy data. Consequently, one cannot conclusively interpret the estimation results. For example, a negative estimate can result from, among many other possibilities, (i) a negative period-2 effect and a positive period-1 effect, (ii) a strongly negative period-2 effect and a less negative period-1 effect, or (iii) a less positive period-2 effect and a strongly positive period-1 effect. Secondly, even if one correctly sets t = 1 as the effective date of the policy, which means that the period-0 data are regarded as the pre-policy data and the combination of period-1 and period-2 data is used as the post-policy data, the single difference in differences approach can only estimate the average of a period-1 effect and a period-2 effect. To see this, continuing the illustrative model above, the single difference in differences estimate \hat\beta satisfies

    \hat\beta = \left[ \frac{1}{2n} \sum_j (y_{j1}^T + y_{j2}^T) - \frac{1}{n} \sum_j y_{j0}^T \right] - \left[ \frac{1}{2m} \sum_{j'} (y_{j'1}^{NT} + y_{j'2}^{NT}) - \frac{1}{m} \sum_{j'} y_{j'0}^{NT} \right]
             \stackrel{p}{\to} \frac{1}{2} \left\{ (E[y_1^T] + E[y_2^T]) - 2E[y_0^T] - 2(E[y_1^{NT}] - E[y_0^{NT}]) \right\} = \frac{1}{2}(\beta_1 + \beta_2).    (2.1)

One factor not captured by the two-period signalling model is that, in reality, report cards may be issued more than once, reflecting changes on the providers' side over time. The changes may come from, for example, (i) at the physician level, changes of providers' types due to learning or aging, and (ii) at the hospital level, exit or mergers of hospitals and entry of new hospitals. A model with more than two periods will necessarily be more complicated.
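To make the identification argument concrete, here is a minimal simulation of the illustrative model (pure Python, with invented parameter values): the period-specific difference in differences estimators recover \beta_1 and \beta_2 separately, while the single pooled estimator converges to (\beta_1 + \beta_2)/2, as in equation (2.1).

```python
import random
import statistics

random.seed(0)
mean = statistics.fmean

# Illustrative parameter values -- invented for this sketch,
# not taken from the paper's data.
alpha0, alpha1 = 1.0, 1.5    # common time effects
beta1, beta2 = 0.4, -0.3     # period-1 and period-2 treatment effects
n = m = 50_000               # treatment (T) and control (NT) group sizes

# Simulate outcomes y_{jt} from the three-period model in the text
yT = {0: [alpha0 + random.gauss(0, 1) for _ in range(n)],
      1: [alpha1 + beta1 + random.gauss(0, 1) for _ in range(n)],
      2: [alpha1 + beta2 + random.gauss(0, 1) for _ in range(n)]}
yNT = {0: [alpha0 + random.gauss(0, 1) for _ in range(m)],
       1: [alpha1 + random.gauss(0, 1) for _ in range(m)],
       2: [alpha1 + random.gauss(0, 1) for _ in range(m)]}

# Period-specific difference in differences estimators of beta1, beta2
b1_hat = (mean(yT[1]) - mean(yT[0])) - (mean(yNT[1]) - mean(yNT[0]))
b2_hat = (mean(yT[2]) - mean(yT[0])) - (mean(yNT[2]) - mean(yNT[0]))

# Single pooled DD estimator: periods 1 and 2 lumped together as "post"
pooled = (mean(yT[1] + yT[2]) - mean(yT[0])) \
       - (mean(yNT[1] + yNT[2]) - mean(yNT[0]))

print(b1_hat, b2_hat, pooled)  # pooled is near (beta1 + beta2)/2
```

With equal period sizes, the pooled estimate equals the average of the two period-specific estimates exactly, which is the algebraic content of equation (2.1): the pooled design cannot separate the provider-driven period-1 effect from the patient-driven period-2 effect.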
In a two-period model, the period-1 effects reflect providers' actions, and the period-2 effects indicate patients' actions. But with more than two periods, even after the first report cards are published, providers' moral hazard remains a problem because of their concern with future report cards. Consequently, the corresponding periodic effects will reflect not only providers' actions but also patients' actions caused by the policy. The immediate corollary is that the period-1 data become the most valuable for identifying provider moral hazard when report cards are issued

more than once, because, as discussed, in the first period patients have no access to report cards. Following the timeline in Figure 1, we identify hospital selection by comparing periods 0 and 1. As our focus is on Pennsylvania, which issued its first CABG report cards in 1992, in our data period 0 corresponds to the years 1988 and 1989, while period 1 corresponds to the years 1990, 1991, and 1992.

3 Data

3.1 The Nationwide Inpatient Sample

We use the Healthcare Cost and Utilization Project (HCUP) Nationwide Inpatient Sample (NIS) data set sponsored by the U.S. Agency for Health Care Policy and Research. So far, we have combined NIS release 1 (years 1988 to 1992), release 2 (year 1993), and release 3 (year 1994). Each NIS release approximates a 20 percent sample of U.S. community hospitals for each year. The NIS contains the patient-level clinical and resource use information included in a typical discharge abstract; individual persons cannot be identified. The universe of U.S. community hospitals is divided into strata using five hospital characteristics: ownership/control, bedsize, teaching status, urban/rural location, and U.S. region. The NIS is a stratified probability sample of hospitals in this frame, with sampling probabilities proportional to the number of U.S. community hospitals in each stratum.

Table 1 shows which states are available in the sample at what time. In the first year of the NIS, 1988, eight states are included, with a total of about 5,265,000 patient observations distributed over 759 different hospitals. The first year for which data for Pennsylvania are available is 1989. Altogether we have data on eleven states for the years 1989–92, with a total of more than 6,100,000 patient observations distributed across more than 850 hospitals each year. Six more states join the NIS in the years 1993 and 1994.

[Table 1 about here.]
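The stratified design described above can be sketched as follows; the strata labels and hospital counts here are invented for illustration and are not the actual NIS sampling frame.

```python
import random

random.seed(0)

# Hypothetical hospital universe grouped by stratum (e.g. a
# bedsize/teaching/location cell); counts are invented for illustration.
universe = {"small_rural": 400, "medium_urban": 300,
            "large_urban_teaching": 150, "large_urban_nonteaching": 150}
rate = 0.20  # target sampling fraction, as in the NIS

# Draw from each stratum in proportion to its size in the universe
sample = {}
for stratum, n_hosp in universe.items():
    k = round(rate * n_hosp)              # proportional allocation
    hospitals = [f"{stratum}_{i}" for i in range(n_hosp)]
    sample[stratum] = random.sample(hospitals, k)

n_total = sum(len(v) for v in sample.values())
print(n_total, sum(universe.values()))    # 200 of 1,000 hospitals
```

Proportional allocation keeps each stratum's share of the sample equal to its share of the universe, so hospital-level statistics are representative of the frame by construction.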

Table 2 presents a subset of the variables available in the NIS. Two variables that stand out are the principal diagnosis and the principal procedure. Via the principal diagnosis we are able to identify patients who have suffered a heart attack (AMI). The two principal procedures we focus on in our study are coronary artery bypass graft (CABG) and percutaneous transluminal coronary angioplasty (PTCA). We present a model that treats CABG and PTCA as substitutes in the treatment of AMI patients.

[Table 2 about here.]

3.2 Descriptive Statistics for Patients

We focus only on patients with a principal diagnosis of AMI, as they are the subject of our identification strategy for the quantity effect. We also restrict our attention to people aged 65–99. Table 3 contains descriptive statistics on patient characteristics for a subset of the control states. The fraction of males in the sample is 53.0% in 1988 and stays fairly stable across all years. So does the fraction of African-American patients, at about 2.5%. The average age of the control-state population (given that we restrict ourselves to the range 65–99) is about 76. Total hospital charges, measured in 1994 dollars, increase from $13,804 in 1988 to $21,253 in 1994. These charges cover the expenses for the entire length of stay at the hospital as recorded in the discharge abstract.

[Table 3 about here.]

Regarding the main clinical outcomes of interest, 3.4% of all patients admitted to the hospital with a principal diagnosis of AMI receive CABG as the principal procedure in 1988. The alternative procedure, PTCA, is given to 4.0% of AMI patients. Both numbers increase over the years, to 7.4% for CABG and 11.2% for PTCA. Over the same period the mortality rate for all AMI patients declines from 19.5% in 1988 to 13.9% in 1994. The mortality rate for AMI patients who received CABG as the principal procedure is lower, declining from 12.1% in 1988 to 7.2% in 1994. Patients who receive PTCA die with a probability of 6.8% in 1988 versus 4.1% in 1994. The size of the AMI patient sample declines from 52,860 in 1988 to 45,507 in 1994 because Table 3 is based on only the eight states that were in the sample over the entire range 1988–94, and the number of observations per state decreases over time as nine more states join the NIS while the overall sample size does not grow proportionally. Table 4 contains the descriptive statistics on patient characteristics for Pennsylvania.

There are about 3 percentage points fewer males treated for AMI in Pennsylvania hospitals than in the control states. The mean age in Pennsylvania appears to be the same as in the control states. Total hospital charges for AMI patients in Pennsylvania are generally higher than in the control states. In 1989 (the first year for which we have Pennsylvania data) hospitals charged an average of $15,809, which rises to $23,357 in 1994.

[Table 4 about here.]

The probability that a randomly chosen AMI patient receives CABG surgery in Pennsylvania equals 2.2% in 1989 and climbs to 7.7% in 1994. The CABG incidence over the time span 1989–94 equals 5.2% in Pennsylvania and 5.9% in the control states (see Table 3). The probability of a randomly chosen AMI patient receiving PTCA increases from 4.2% in 1989 to 11.1% in 1994 in Pennsylvania. The average PTCA incidence over the 1989–94 time frame equals 7.2% in Pennsylvania; in the control states the incidence is 8.2%. The mortality rates of AMI patients in Pennsylvania are slightly lower than in the control states. For the whole time frame 1989–94 the unconditional (i.e., whole-sample) mortality rate equals 15.0% in Pennsylvania and 15.9% in the control states. The mortality rate conditional on receiving CABG surgery is 7.7% in Pennsylvania and 7.8% in the control states. The conditional mortality rate for PTCA equals 4.5% in Pennsylvania and 4.7% in the control states. The declining trend in the control states' mortality rates is partly mirrored in the Pennsylvania data. For the whole sample, mortality decreases from 16.8% in 1989 to 13.0% in 1994. For the CABG subsample the decline is from 11.7% to 7.2% between 1989 and 1994, while for PTCA the trend is ambiguous: contrary to the control states, there is no steady downward development in PTCA mortality, with spikes of 5.9% and 5.7% in the years 1991 and 1992.

3.3 Descriptive Statistics for Hospitals

We observe altogether 4,197 hospitals treating AMI patients in the subset of control states that have been in the NIS uninterruptedly between 1988 and 1994. Table 5 shows that roughly a third of all hospitals are of medium size and roughly a third are of large size. About 65% of all hospitals are in urban areas, while only 12.6% are both in an urban area and a teaching hospital. A quarter of all hospitals are owned publicly and about 17% privately for profit.

[Table 5 about here.]

The hospitals in Pennsylvania, in contrast, are generally larger: about 76.4% are either of medium or large size, as Table 6 indicates. Pennsylvania also has more urban hospitals, of which a greater fraction are also teaching hospitals. The ownership structure of hospitals in Pennsylvania differs substantially from the control states. In total, we observe 441 hospitals in Pennsylvania treating AMI patients.

[Table 6 about here.]

4 Estimation

4.1 Incidence Effect

We want to test whether hospitals in Pennsylvania after 1990, as a response to the introduction of the report cards, have selected patients. Report card data are collected only for patients who received CABG surgery. Dranove et al. (2003), using U.S. Medicare claims data, use total inpatient hospital expenditures one year prior to admission for the current episode of heart problems as a proxy for illness severity. Their illness severity measure has the advantage of not being subject to providers' selection behavior, and thus circumvents the endogeneity problem when estimating the incidence effect.

We propose an intuitive way of testing for the presence of an incidence effect. If hospitals in Pennsylvania in 1990, 1991, and 1992 have engaged in strategic selection of patients (bad risks shifted from CABG surgery to PTCA surgery, good risks from PTCA surgery to CABG surgery), then we would expect to see a change in the mortality rate for CABG surgery in those years. Since patient hospital expenditures are not available in our data set, in line with the illness severity measure proposed by Elixhauser et al. (1998), we identify and estimate whether the incidence of mortality has changed due to the introduction of report cards in Pennsylvania. We apply difference in differences estimation where the treatment state is Pennsylvania and the treatment years are 1990, 1991, and 1992. Figure 1 illustrates that we cannot use years after 1992 because we would then conflate hospital selection with patient selection. We use all other state and year combinations as controls; see Table 1. The general difference in differences equation for the incidence effect looks as follows:

    M_{ig} = c + X_g \beta + Z_{ig}' \gamma + C_{ig} \phi + \alpha_g + (\delta_{ig} + \varepsilon_{ig}),    (4.1)

where M_{ig} is a dummy for patients who die during their hospital stay, X_g is a treatment group indicator, Z_{ig} collects variables that vary over individuals i and groups g, C_{ig} is a dummy for patients who received CABG surgery, \alpha_g is a group-specific constant, \delta_{ig} is an illness severity measure, and \varepsilon_{ig} is an error term. The index variable g runs over all ordered pairs of states and years (s, t). For example, California in the year 1988 could be assigned g = 1. Every value of g represents a particular state at a particular time. Whenever g coincides with Pennsylvania in the years 1990 through 1992 we set X_g equal to one, and zero otherwise. Our objective is to estimate \beta.

The variable Z_{ig} collects characteristics of patients. We include in Z_{ig} a person's age category (65–69, 70–74, 75–79, 80–89, 90–99), gender, admission status (emergency, urgent, elective, other), a dummy for whether a person had a primary diagnosis of AMI, and dummies for clinical comorbidities (hypertension, diabetes, cardiogenic shock, previous CABG surgery). The clinical comorbidities are the same comorbidities used in the construction of the report cards.

The term \delta_{ig} represents private information that hospitals have on the illness severity (conditional on observables) of a patient. It is unobserved by the econometrician. While \delta_{ig} is correlated with C_{ig}, we assume that it is uncorrelated with Z_{ig} because hospitals are aware that the report cards control for Z_{ig}. For those patients who receive CABG surgery because they have a low draw of \delta_{ig}, we assume that Z_{ig} and C_{ig} are uncorrelated. The justification for this assumption is that, for those patients, Z_{ig} is at the same time consistent with receiving PTCA surgery: hospitals can only select patients when their observables Z_{ig} are consistent with both CABG and PTCA. The econometric implication of this assumption is that we can estimate average mortality rates \bar M_g consistently for Pennsylvania in 1990, 1991, and 1992.

We estimate equation (4.1) following Donald and Lang (2007), who present an improved inference framework for panel and repeated cross-section data when the number of groups is small. In our case, we observe altogether 52 pairs of states and years; see Table 1. Donald and Lang propose a two-step estimation following Amemiya (1978). First, within each (state, year) pair, estimate averages of the dependent variable (controlling for covariates). Second, between (state, year) pairs, estimate the policy effect by feasible least squares. This estimation strategy lends itself well to our objective of analyzing hospital selection. Since we do not observe the actual illness severity of patients, we focus on measuring mortality rates over time. If hospital selection of patients did

occur then we would expect to see a significant change in mortality rates in Pennsylvania starting in 1990. We therefore study the development of mortality rates as a measure of change in actual illness severity. In particular, estimate first, for each g separately and conditional on all patients having received CABG surgery, the coefficient γ in the regression M ig = d g + Z igγ + (δ ig + ε ig ). The presence of δ ig does not bias our estimation because we assume it is not correlated with Z ig. We also assume that, for those patients that have been selected into the sample of CABG patients because of their low draws for δ ig, the observable characteristics, Z ig, are the same as for those that have left the CABG sample due to them having high draws for δ ig. The OLS estimator of d g is equal to ˆd s := M g Z gˆγ. Second, regress ˆd g on X g, ˆd g = c + X g β + u g, via feasible GLS to obtain an estimator ˆβ. Donald and Lang (2007) show that the resulting estimator has approximately a t distribution with degrees of freedom that depend on the number of state year combinations g. Table 7 presents the estimation results for β. [Table 7 about here.] In no case do we find an incidence effect. When we focus on CABG surgery we find very small effects of -0.002 (not controlling for observable characteristics) and -0.277 (controlling for characteristics). In both cases we reject the null hypothesis that patient mortality has changed due to the introduction of report cards. Recall that, due to hospital selection of patients, we would expect a change in mortality. In addition, hospital report cards not only induce selection of patients by hospitals but could also improved quality in CABG treatment. The direction of this quality effect should also reduce mortality. We find no such quality improvement effects. 
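The two-step procedure above can be sketched on simulated data. The sketch below is only an illustration, not the paper's implementation: the group counts, covariates, and magnitudes are invented, and the second step weights solely by the first-stage sampling variance of ˆd_g, a simplification of Donald and Lang's feasible GLS (which also accounts for the group-level error u_g).

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulated patient-level data; group g = 0 stands in for a treated
# (state, year) pair with X_g = 1. All magnitudes are illustrative.
n_groups, n_per_group, beta_true = 12, 400, 0.0
X = np.zeros(n_groups)
X[0] = 1.0                                   # treatment indicator X_g
alpha = rng.normal(0.15, 0.02, n_groups)     # group-specific constants

d_hat = np.zeros(n_groups)
d_var = np.zeros(n_groups)
for g in range(n_groups):
    Z = rng.normal(size=(n_per_group, 3))    # patient covariates Z_ig
    eps = rng.normal(0.0, 0.3, n_per_group)
    M = alpha[g] + beta_true * X[g] + Z @ np.array([0.02, -0.01, 0.03]) + eps
    # Step 1: within-group OLS of M_ig on a constant and Z_ig.
    # The intercept estimates d_g = alpha_g + beta * X_g.
    W = np.column_stack([np.ones(n_per_group), Z])
    coef, res, *_ = np.linalg.lstsq(W, M, rcond=None)
    d_hat[g] = coef[0]
    sigma2 = res[0] / (n_per_group - W.shape[1])
    d_var[g] = sigma2 * np.linalg.inv(W.T @ W)[0, 0]   # Var(d_hat_g)

# Step 2: weighted least squares of d_hat_g on a constant and X_g,
# weighting each group by the inverse first-stage variance.
w = 1.0 / d_var
G = np.column_stack([np.ones(n_groups), X])
gls = np.linalg.solve(G.T @ (w[:, None] * G), G.T @ (w * d_hat))
beta_hat = gls[1]
print(f"beta_hat = {beta_hat:.4f}")
```

With the true effect set to zero, the estimate should land near zero, mirroring the insignificant estimates reported in Table 7.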
Table 7 shows that the difference-in-differences estimates for PTCA surgery are 0.0001 (not controlling for observable characteristics) and -0.082 (controlling for characteristics). In both cases we cannot reject the null hypothesis that patient mortality was unaffected by the introduction of report cards. We find no such incidence effect, which, given our earlier results, is consistent with report cards having no effect whatsoever on the selection of patients by hospitals or on quality improvements.
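As a small sanity check on the inference reported in the table notes, the two-sided 5% critical value of a t distribution with 36 degrees of freedom (2.03) can be reproduced with only the standard library, using P(T > t) = ½ I_{ν/(ν+t²)}(ν/2, ½) and bisection. This is a generic numerical recipe (the standard Lentz continued fraction for the incomplete beta function), not code from the paper.

```python
import math

def betacf(a, b, x, max_iter=200, eps=3e-12, tiny=1e-300):
    """Continued fraction for the incomplete beta function (Lentz's method)."""
    qab, qap, qam = a + b, a + 1.0, a - 1.0
    c, d = 1.0, 1.0 - qab * x / qap
    d = 1.0 / (d if abs(d) >= tiny else tiny)
    h = d
    for m in range(1, max_iter + 1):
        m2 = 2 * m
        for aa in (m * (b - m) * x / ((qam + m2) * (a + m2)),
                   -(a + m) * (qab + m) * x / ((a + m2) * (qap + m2))):
            d = 1.0 + aa * d
            d = 1.0 / (d if abs(d) >= tiny else tiny)
            c = 1.0 + aa / c
            c = c if abs(c) >= tiny else tiny
            delta = d * c
            h *= delta
        if abs(delta - 1.0) < eps:
            break
    return h

def betai(a, b, x):
    """Regularized incomplete beta function I_x(a, b)."""
    if x <= 0.0:
        return 0.0
    if x >= 1.0:
        return 1.0
    ln_bt = (math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
             + a * math.log(x) + b * math.log(1.0 - x))
    bt = math.exp(ln_bt)
    if x < (a + 1.0) / (a + b + 2.0):
        return bt * betacf(a, b, x) / a
    return 1.0 - bt * betacf(b, a, 1.0 - x) / b

def t_sf(t, df):
    """Upper-tail probability P(T > t) for Student's t, t >= 0."""
    return 0.5 * betai(df / 2.0, 0.5, df / (df + t * t))

# Bisect for the value with upper-tail probability 0.025 at 36 df.
lo, hi = 0.0, 10.0
for _ in range(80):
    mid = 0.5 * (lo + hi)
    if t_sf(mid, 36.0) > 0.025:
        lo = mid
    else:
        hi = mid
crit = 0.5 * (lo + hi)
print(f"critical value: {crit:.3f}")   # about 2.028, matching the 2.03 in the notes
```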

4.2 Quantity Effect

We want to identify and estimate whether the incidence of CABG surgery has changed due to the introduction of report cards in Pennsylvania. We apply difference-in-differences estimation where the treatment state is Pennsylvania and the treatment years are 1990, 1991, and 1992. Figure 1 illustrates that we cannot use years after 1992 because we would then conflate hospital selection with patient selection. We use all other state and year combinations as controls; see Table 1. The general difference-in-differences equation for the quantity effect is

C_ig = c + X_g β + Z_ig'γ + α_g + (δ_ig + ε_ig),  (4.2)

where C_ig is a dummy for patients who received CABG surgery, X_g is a treatment group indicator, Z_ig collects variables that vary over individuals i and groups g, α_g is a group-specific constant, δ_ig is an illness severity measure, and ε_ig is an error term. The descriptions of the index variable g and the variable Z_ig are the same as for (4.1). The term δ_ig represents private information that hospitals have on the illness severity of a patient (conditional on observables); it is unobserved by the econometrician. We assume that E[δ_ig | Z_ig, C_ig] = E[δ_ig | C_ig] for all i and g (in particular also when g corresponds to Pennsylvania in 1990, 1991, and 1992). We estimate equation (4.2) following Donald and Lang (2007) in two steps. First, we estimate, for each g separately, the coefficient γ in the regression C_ig = d_g + Z_ig'γ + (δ_ig + ε_ig). The presence of δ_ig does not bias our estimation because we assume it is uncorrelated with Z_ig. The OLS estimator of d_g equals ˆd_g := C̄_g − Z̄_g'ˆγ. Second, we regress ˆd_g on X_g, ˆd_g = c + X_g β + u_g, via feasible GLS to obtain an estimator ˆβ. Donald and Lang (2007) show that the resulting estimator has approximately a t distribution with degrees of freedom that depend on the number of (state, year) combinations g. We run the two-stage regression for three different sub-samples.
First, similar to Dranove et al. (2003), we condition on those patients who received a principal diagnosis of AMI upon admission to the hospital. As argued earlier, that sub-sample constitutes an important candidate pool for the CABG procedure, but also for the PTCA procedure.

Table 8 shows that the estimate of β is small and insignificant. When we do not control for patient characteristics, the estimate of β equals 0.004, which is insignificant; when we control for patient characteristics, the estimate is -0.031, which is also insignificant. Second, we expand the sub-sample to which we apply the estimation by also including patients who received a principal diagnosis of coronary atherosclerosis (CA). Together, AMI and CA patients are by far the dominant candidate pool for CABG and PTCA surgery: about 96% of all patients who received CABG surgery entered the hospital with an AMI or CA principal diagnosis, and the same is true for PTCA surgery. The difference-in-differences estimates of β are 0.015 (not controlling for characteristics) and 0.008 (controlling for characteristics). Both estimates suggest an insignificant population coefficient. The same holds, third, when we condition only on patients who ended up receiving CABG or PTCA surgery. The coefficient estimates are 0.004 (not controlling for characteristics) and 0.095 (controlling for characteristics), which again suggests insignificance of β. [Table 8 about here.] For additional robustness, we also run a difference-in-differences estimation along the lines of equation (2) in Dranove et al. (2003):

C_kst = A_s + B_t + Z_kst'g + p L_st + e_kst.

Here, k indexes patients, s indexes states, and t indexes time. The variable C_kst is binary and takes the value one if patient k from state s at time t received CABG surgery within one year of admission. The variables A_s and B_t are state (50 states) and year fixed effects. The variable Z_kst collects patient characteristics (rural dummy, gender, race, age groups, and interactions). The policy variable of interest is L_st. A positive value of p means that the quantity effect is positive, i.e., the probability of receiving CABG surgery is higher in treatment states. We run Dranove et al.'s regression using our timing of events, i.e., the policy variable L_st is binary with value one if patient k's residence is in Pennsylvania in or after 1990. (Dranove et al. used the year 1992 as the cutoff for Pennsylvania.) Table 8 confirms that our estimates are not merely an artefact of the particular difference-in-differences estimator used: the estimate of p is 0.004, and the t statistic indicates insignificance.
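A state-and-year fixed effects specification of this kind can be illustrated as a linear probability regression on simulated data. Everything below is invented for illustration (state 0 plays the role of Pennsylvania, the true policy effect is set to zero); it is not the paper's data or code.

```python
import numpy as np

rng = np.random.default_rng(1)

# Illustrative patient-level data: states s, years t, policy indicator L_st.
# State 0 stands in for Pennsylvania; L_st = 1 for state 0 in or after 1990.
n = 20_000
state = rng.integers(0, 8, n)
year = rng.integers(1988, 1993, n)
L = ((state == 0) & (year >= 1990)).astype(float)
Z = rng.normal(size=(n, 2))            # patient characteristics
p_true = 0.0                           # null quantity effect

# Latent propensity with state and year effects; C_kst is a binary outcome.
state_fe = rng.normal(0.05, 0.02, 8)[state]
year_fe = 0.01 * (year - 1988)
prob = np.clip(0.05 + state_fe + year_fe + p_true * L + 0.01 * Z[:, 0], 0, 1)
C = rng.binomial(1, prob).astype(float)

# Design matrix: intercept, state dummies (one dropped), year dummies
# (one dropped), patient characteristics, and the policy indicator L_st.
S = (state[:, None] == np.arange(1, 8)).astype(float)
T = (year[:, None] == np.arange(1989, 1993)).astype(float)
X = np.column_stack([np.ones(n), S, T, Z, L])
coef, *_ = np.linalg.lstsq(X, C, rcond=None)
p_hat = coef[-1]                       # estimate of the policy effect p
print(f"p_hat = {p_hat:.4f}")
```

With the true effect set to zero, the recovered coefficient on L_st should be statistically indistinguishable from zero, as in the replication row of Table 8.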

5 Conclusion

To detect providers' moral hazard caused by a report card policy, empirical researchers should recognize that the policy may cause all sides of participants to take actions. Based on our theoretical framework and the U.S. NIS data from 1988 to 1992, we separate providers' actions in the treatment state from patients' actions and, in contrast to previous empirical findings, find no evidence that the report card policy causes providers to select patients. An immediate future direction is, given a richer data set, to extend the present study to other states, such as New York, that also issue CABG report cards. One caveat from our theoretical framework, however, is that one should treat the treatment states separately if those states issued their report cards on different dates. For example, while New York issued its first report cards in 1991, Pennsylvania issued its own in 1992. An estimation pooling the two states necessarily involves period-2 data of one state and thus complicates the estimation. To illustrate the insight gained from the estimation strategy based on our theoretical framework, in this paper we have focused on period-1 and period-0 data and otherwise stayed as close as possible to estimation strategies in the literature. However, with the advantage of separating one participant group's (providers') actions from the other's (patients'), our estimation strategy can readily be applied to study other effects of the CABG report cards and, more generally, extended to study the effects of report-card policies in other sectors.

References

Chen, Y. (2009): "Why Are Health Care Report Cards So Bad (Good)?" ANU School of Economics Working Papers, No. 511.

Donald, S. G. and K. Lang (2007): "Inference with Difference in Differences and Other Panel Data," The Review of Economics and Statistics, 89, 221–233.

Dranove, D., D. Kessler, M. McClellan, and M. Satterthwaite (2003): "Is More Information Better? The Effects of Report Cards on Health Care Providers," The Journal of Political Economy, 111, 555–588.

Elixhauser, A., C. Steiner, D. Harris, and R. Coffey (1998): "Comorbidity Measures for Use with Administrative Data," Medical Care, 36, 8–27.

Epstein, A. J. (2006): "Do Cardiac Surgery Report Cards Reduce Mortality? Assessing the Evidence," Medical Care Research and Review, 63, 403–426.

——— (2010): "Effect of Report Cards on Referral Patterns to Cardiac Surgeons," Journal of Health Economics, forthcoming.

Gravelle, H. and P. Sivey (2010): "Imperfect Information in a Quality-Competitive Hospital Market," Journal of Health Economics, 29, 524–535.

Figure 1: Timing of Report Card Policy. Period 0 (before 1990): the report card policy is not yet effective. Period 1 (1990–1992): data about providers are being collected; the policy has its period-1 effect. Period 2 (after 1992): report cards are issued; the policy has its period-2 effect.

Table 1. Nationwide Inpatient Sample (NIS) Data Coverage

Year   States covered                                                      Hospitals   Patient stays
1988   CA, CO, FL, IL, IA, MA, NJ, WA                                        759       5,265,756
1989   AZ, CA, CO, FL, IL, IA, MA, NJ, PA, WA, WI                            882       6,110,064
1990   AZ, CA, CO, FL, IL, IA, MA, NJ, PA, WA, WI                            871       6,268,515
1991   AZ, CA, CO, FL, IL, IA, MA, NJ, PA, WA, WI                            859       6,156,188
1992   AZ, CA, CO, FL, IL, IA, MA, NJ, PA, WA, WI                            856       6,195,744
1993   AZ, CA, CO, CT, FL, IL, IA, KS, MD, MA, NJ, NY, OR, PA, SC, WA, WI    913       6,538,976
1994   AZ, CA, CO, CT, FL, IL, IA, KS, MD, MA, NJ, NY, OR, PA, SC, WA, WI    904       6,385,011

Note. Source: NIS documentation, release 3.

Table 2. Nationwide Inpatient Sample (NIS) Variable Description

Variable                            Selected variable details
Age at admission
Admission type                      Emergency, urgent, elective, newborn, delivery, other
Principal diagnosis                 Clinical Classifications for Health Policy Research diagnosis label: Acute Myocardial Infarction (code #100)
Died during hospitalization
Hospital state
Length of stay (cleaned)
Principal procedure                 Clinical Classifications for Health Policy Research procedure label: Coronary Artery Bypass Graft (code #44)
Race                                White, Black, Hispanic, Asian or Pacific Islander, Native American, Other
Sex
Total charges (cleaned)
Median income, patient's ZIP code   Based on 4 or 8 income categories
Year                                1988 through 1994
Hospital size                       Small, medium, large
Hospital location                   Urban or rural
Hospital teaching status            Urban teaching only
Hospital owner status               Public, private non-profit, private for-profit

Note. Source: NIS documentation, release 3.

Table 3. Descriptive Statistics: Patient Characteristics for Control States

Year                  1988     1989     1990     1991     1992     1993     1994    1989–1994
Male                  53.0     54.0     53.7     54.1     54.4     54.6     54.3     54.2
African American       2.5      2.7      2.5      2.0      2.2      2.1      2.5      2.3
Age                   75.7     75.8     75.8     75.9     75.9     76.0     76.2     75.9
Total charges        13804    15544    16437    18496    20185    21481    21253    18905
                    (17447)  (19802)  (21169)  (24583)  (25659)  (26619)  (25950)  (24234)
CABG                   3.4      3.8      4.5      5.8      6.7      7.4      7.4      5.9
PTCA                   4.0      5.1      6.2      7.6      8.6     10.9     11.2      8.2
Died, whole sample    19.5     18.2     17.0     16.2     15.3     14.8     13.9     15.9
Died, only CABG       12.1      9.4      8.5      8.4      7.0      7.7      7.2      7.8
Died, only PTCA        6.8      5.6      5.5      5.1      4.5      4.4      4.1      4.7
Sample size          52860    47008    49654    52738    56805    46553    45507   298265

Note. Sample: ages 65–99 with a primary diagnosis of acute myocardial infarction. Numbers in parentheses are standard deviations. Total charges are in 1994 dollars. The table is based only on the subset of states that were always in the NIS sample between 1988 and 1994 (i.e., CA, CO, FL, IL, IA, MA, NJ, WA).

Table 4. Descriptive Statistics: Patient Characteristics for Pennsylvania

Year                  1989     1990     1991     1992     1993     1994    1989–1994
Male                  49.7     50.6     50.5     51.3     50.9     50.1     50.6
Age                  75.46    75.35    75.55    75.64    75.61    75.78    75.56
Total charges        15809    18520    19509    21745    22247    23357    20189
                    (19325)  (26934)  (26158)  (30494)  (34036)  (31439)  (28503)
CABG                   2.2      4.6      4.6      5.7      6.2      7.7      5.2
PTCA                   4.2      5.5      6.1      7.5      9.9     11.1      7.2
Died, whole sample    16.8     15.2     14.8     15.5     14.5     13.0     15.0
Died, only CABG       11.7      6.9      7.7      8.2      7.2      7.2      7.7
Died, only PTCA        3.5      3.8      5.9      5.7      4.3      3.3      4.5
Sample size           5413     8744     9008     9393     5596     5765    43919

Note. Sample: ages 65–99 with a primary diagnosis of acute myocardial infarction. Numbers in parentheses are standard deviations. Total charges are in 1994 dollars. Race is not observed for Pennsylvania.

Table 5. Descriptive Statistics: Hospital Characteristics for Control States

Year                    1988   1989   1990   1991   1992   1993   1994   1988–1994
Hospital size: medium   32.8   31.4   31.5   31.5   32.0   33.7   33.5    32.3
Hospital size: large    32.0   32.9   32.4   31.8   31.7   31.6   29.4    31.8
Urban                   64.9   62.4   63.4   63.9   65.1   68.2   68.1    64.9
Urban and teaching      13.5   11.9   12.6   12.2   12.4   12.3   13.4    12.6
Public owner            23.9   26.7   26.2   25.0   24.1   23.8   23.3    24.8
For-profit owner        15.4   16.8   16.8   15.9   15.3   19.1   18.9    16.7
Sample size              735    630    626    623    619    487    477    4197

Note. Sample: ages 65–99 with a primary diagnosis of acute myocardial infarction. The table is based only on the subset of states that were always in the NIS sample between 1988 and 1994 (i.e., CA, CO, FL, IL, IA, MA, NJ, WA).

Table 6. Descriptive Statistics: Hospital Characteristics for Pennsylvania

Year                    1989   1990   1991   1992   1993   1994   1989–1994
Hospital size: medium   40.7   38.4   43.0   43.0   44.4   48.0    42.4
Hospital size: large    34.9   37.2   36.7   34.9   29.6   26.0    34.0
Urban                   70.9   72.1   69.6   67.4   70.4   70.0    70.1
Urban and teaching      25.6   23.3   24.1   22.1   18.5   24.0    23.1
Public owner             3.5    3.5    1.3    0.0    1.9    2.0     2.0
For-profit owner         0.2    0.3    0.2    0.2    0.1    0.1     0.2
N                         86     86     79     86     54     50     441

Note. Sample: ages 65–99 with a primary diagnosis of acute myocardial infarction.

Table 7. Estimation Results: Incidence Effect

Sub-sample used                     Controls for patient characteristics   Diff-in-diff estimate   t-stat   Significant?
Principal procedure: CABG           No                                     -0.002                  -0.12    No
Principal procedure: PTCA           No                                      0.0001                  0.01    No
Principal procedure: CABG           Yes                                    -0.277                  -0.78    No
Principal procedure: PTCA           Yes                                    -0.082                  -0.58    No
Replicating Dranove et al. (2003)   Yes                                     0.004                   1.11    No

Note. The critical value for the absolute value of the t-statistic is 2.03 (based on a t distribution with 36 degrees of freedom). Principal procedure acronyms: CABG = coronary artery bypass graft, PTCA = percutaneous transluminal coronary angioplasty.

Table 8. Estimation Results: Quantity Effect

Sub-sample used                       Controls for patient characteristics   Diff-in-diff estimate   t-stat   Significant?
Principal diagnosis: AMI              No                                      0.004                   0.34    No
Principal diagnosis: AMI or CA        No                                      0.015                   0.48    No
Principal procedure: CABG or PTCA     No                                      0.004                   0.13    No
Principal diagnosis: AMI              Yes                                    -0.031                  -0.44    No
Principal diagnosis: AMI or CA        Yes                                     0.008                   0.15    No
Principal procedure: CABG or PTCA     Yes                                     0.095                   0.30    No
Replicating Dranove et al. (2003)     Yes                                     0.004                   1.11    No

Note. The critical value for the absolute value of the t-statistic is 2.03 (based on a t distribution with 36 degrees of freedom). Principal diagnosis and procedure acronyms: AMI = acute myocardial infarction, CA = coronary atherosclerosis, CABG = coronary artery bypass graft, PTCA = percutaneous transluminal coronary angioplasty.