DIABETES VALUE METRIC
Final Report

A participant in the Robert Wood Johnson Foundation's Aligning Forces for Quality initiative to improve health and health care in Wisconsin.

valueworks measure enlighten improve

Executive summary

Beginning in 2009, the Wisconsin Collaborative for Healthcare Quality (WCHQ), with the support of an Aligning Forces for Quality grant from the Robert Wood Johnson Foundation, undertook a project to develop, test and produce a workable value metric for the management of diabetes. The project set out to answer the question, "Are WCHQ and Wisconsin Health Information Organization (WHIO) diabetes performance measure results reasonably close to suggest WHIO standard cost data can be combined with WCHQ quality data to report on value as a metric, or as relative value points on a cost-quality quadrant graph?" In this study, value was defined as quality divided by cost (v = q/c).

The WCHQ and WHIO reporting systems were each developed for their own original purposes. Because of known differences between their algorithms and patient populations, additional testing was required to determine whether the two data sources could be combined to create a value metric. A close match between the data sets would support the concept that WCHQ quality data could be combined with WHIO cost data to drive improvement based on value.

The project included two tests of the WCHQ and WHIO data to determine compatibility for measurement of value and reporting at the level of clinic location and specialty. The tests provided a) comparisons of annual compliance rates for diabetes testing and b) comparisons of value ranking. The results were compared using unmodified and modified data specifications. The data were tested in partnership with three WCHQ member test site participants: Bellin Health, ProHealth Care and ThedaCare.

Conclusion

The diabetes performance results, using different populations and measure specifications, do not match as closely as hoped.
Additional inquiry is necessary to understand the factors contributing to these compliance rate and value rank differences.

The need for a measure of value

The U.S. spent more than $2.3 trillion on healthcare in 2009. However, provider performance cannot be compared on value because healthcare cost and quality information is lacking. For almost 20 years, policymakers have advocated that healthcare services be measured, reported and reimbursed based on value. Purchasers have had mixed success in implementing value-based purchasing strategies, due in part to the lack of a construct for measuring value.

In the current healthcare system, quality and cost are usually examined in isolation from each other. Additionally, the data for any given patient is typically spread across multiple institutions, with multiple payers owning various parts of the cost data. This fragmentation has made it difficult to create a complete view.

A standard set of specifications is necessary to measure and report on healthcare systems delivering high-value care. According to the Institute of Medicine
(IOM), a high-value healthcare delivery system has the following characteristics:

Figure I. The Critical Aims of Healthcare
  - Safe
  - Timely
  - Effective
  - Efficient
  - Equitable
  - Patient Centered

The next level of performance measurement needs to demonstrate an organization's ability to simultaneously provide a high level of quality across the continuum of care at the lowest cost possible (i.e., deliver high-value care). The assumption is that greater value will be generated with improved quality, reduced cost, or both. An ideal measure of value would include at least three dimensions: cost (price and resource use), quality (including appropriateness) and patient experience.

WCHQ undertook a project in 2009-2010 to determine the feasibility of combining quality and cost resource data into a single measurement of value for the management of diabetes. The project was supported through Aligning Forces for Quality, the Robert Wood Johnson Foundation's signature effort to lift the overall quality of health care in targeted communities, reduce racial and ethnic disparities, and provide models for national reform.

Combining WCHQ quality data and WHIO cost data to report value

The value metric project attempted to combine WCHQ performance measures derived from clinical systems data with resource utilization information gathered from WHIO. Compliance rates and value metrics were matched between WCHQ provider-based encounter records and WHIO insurance-based claims records for 30 clinics, by primary care specialty.

WCHQ is a voluntary consortium of organizations learning and working together to improve the quality and cost-effectiveness of healthcare for the people of Wisconsin. Members represent a diverse group of healthcare organizations: physician groups, hospitals and health plans. Partner stakeholder organizations include healthcare purchasers, governmental agencies, and healthcare foundations and associations.
WCHQ has built a measurement development and reporting system from internal clinical systems data.

WHIO is a voluntary initiative supported by leaders from insurance companies, healthcare providers, major employers and public agencies who share a commitment to the future of healthcare. The WHIO database was built with patient de-identified claims data from health insurers. At the time of this project, the WHIO database included data from five data contributors, representing 72 million claims records and 7.2 million episodes of care for approximately 1.6 million Wisconsin residents. WHIO claims data was processed through the Ingenix Impact Intelligence software to create diabetes episodes of care. WHIO reports included measures of quality of care and risk-adjusted cost of care.

The WCHQ data represented all patients and all payers for the three participating test site organizations. The WHIO data mart is designed for tracking and measuring episodes of care using healthcare claims data. The WHIO data represented the enrollees of the five data contributors.

Source data and specification differences between the WCHQ and WHIO systems were tested to better understand whether the two could be combined for a measure of value. Known differences are described in Figure II.
Figure II. Current WCHQ/WHIO data specification differences

  Concern: Population reported
    WCHQ: All patients, all payers
    WHIO: Limited payer sources from five data contributors: WPS, WEA, Humana, UHC, Anthem and non-commercial Medicare

  Concern: Denominator build differences
    WCHQ: Ages 18-75 and at least 2 office visit encounters
    WHIO: No age exclusions and use office visits or inpatient admissions, ER visits or 1x insulin or hyper/hypoglycemic

  Concern: Attribution
    WCHQ: Currently reports at system level only
    WHIO: Reports at system level, clinic by department and by individual physician

Figure III. Model to compare WCHQ and WHIO results (flowchart; VALUE = Quality/Cost)

  Step 1: Compare WCHQ and WHIO process measures and value rankings with unmodified specifications.
  Decision: Do process measures and value rankings match? (Yes / No)
  Step 2: Modify WHIO data to match WCHQ specs.
  Decision: Do process measures and value rankings match? (Yes / No)
  Final question: Will we change specs to create a value metric?

Analyses

At the heart of the analysis for the value metric project was determining whether the cost results in WHIO are representative of the quality measured through WCHQ. This was the critical hypothesis inherent in the attempt to answer the question, "Are WCHQ and WHIO diabetes performance measure results reasonably close to suggest WHIO standard cost data can be combined with WCHQ quality data to report on value as a metric, or as relative value points on a cost-quality quadrant graph?"

Two analyses were performed to answer the question:

1. Comparison of WCHQ and WHIO diabetes process measure results: rates of compliance for twice annual A1c testing, annual LDL testing and annual nephropathy screening
2. Comparison of WCHQ value rank results vs.
WHIO value rank results

The model to answer the question is presented in Figure III.

Comparing diabetes process measures and value rankings with unmodified specifications

Process measure results

Results for twice annual A1c testing and nephropathy screening were significantly different (p < 0.05) at the system level of reporting. Differences in LDL test results were not statistically significant (p > 0.05). See Figure IV.

Figure IV. Process measure results using unmodified specifications

  Measure        Test site   WCHQ     WHIO
  A1c            BHC         68.0%    83.7%
  A1c            PHC         69.0%    76.3%
  A1c            TC          81.5%    89.9%
  LDL            BHC         91.1%    93.2%
  LDL            PHC         83.7%    84.2%
  LDL            TC          95.6%    94.4%
  Neph. screen   BHC         78.4%    87.2%
  Neph. screen   PHC         72.7%    86.8%
  Neph. screen   TC          83.5%    88.6%

  (BHC = Bellin Health; PHC = ProHealth Care; TC = ThedaCare)
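The report does not state which statistical test produced the system-level p-values, and it does not publish the denominators behind Figure IV. As an illustration only, the sketch below shows how a pooled two-proportion z-test could compare a WCHQ rate with a WHIO rate. The function name and both patient counts are hypothetical placeholders, so the printed p-values demonstrate the technique and are not results of the study.

```python
import math


def two_proportion_z_test(rate_a, n_a, rate_b, n_b):
    """Return (z, two-sided p-value) for the difference between two proportions."""
    # Pooled proportion under the null hypothesis that both rates are equal.
    pooled = (rate_a * n_a + rate_b * n_b) / (n_a + n_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (rate_a - rate_b) / std_err
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Rates come from Figure IV; the denominators are HYPOTHETICAL placeholders
# because the report does not give patient counts.
HYPOTHETICAL_N_WCHQ = 1000
HYPOTHETICAL_N_WHIO = 400

# A1c at BHC: 68.0% (WCHQ) vs. 83.7% (WHIO).
z_a1c, p_a1c = two_proportion_z_test(0.680, HYPOTHETICAL_N_WCHQ, 0.837, HYPOTHETICAL_N_WHIO)
# LDL at TC: 95.6% (WCHQ) vs. 94.4% (WHIO).
z_ldl, p_ldl = two_proportion_z_test(0.956, HYPOTHETICAL_N_WCHQ, 0.944, HYPOTHETICAL_N_WHIO)

print(f"A1c, BHC: z = {z_a1c:.2f}, p = {p_a1c:.4f}")
print(f"LDL, TC:  z = {z_ldl:.2f}, p = {p_ldl:.4f}")
```

With real denominators in place of the placeholders, the same function could be applied to each of the nine measure and test site pairs in Figure IV.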
Comparisons of the WCHQ and WHIO test results were also performed at the level of clinic by specialty. The results are displayed in Figure V. WHIO commercial data was presented with upper and lower control limits at a 95% confidence interval. WCHQ results falling within the upper and lower control limits were considered statistically similar to the WHIO result.

Figure V. Clinic level, confidence interval process measure results using unmodified specifications: percent of WCHQ results within WHIO control limits

  A1c     60.0%
  LDL     76.6%
  Neph    46.6%

Figure VI. Percent compliance with WHIO upper and lower control limits (UCL and LCL) and WCHQ results, by clinic (chart of 25 clinics by specialty: TC Clinics 1-12, PHC Clinics 1-12 and BHC Clinic 1)

Figure VI shows that 15 of 25 (60%) of the WCHQ clinic results fell within the control limits for twice annual A1c testing. Approximately 77% of clinics fell within the control limits for LDL testing and 47% for nephropathy screening.

Value ranking results

Value ranking was the second method used to compare the diabetes process measure test results from WCHQ to WHIO. Value in this study was defined as quality divided by cost. Using absolute numbers generated from diabetes testing rates and cost per episode, a value metric can be calculated. Ranked organizations were further segmented into quartiles. Rather than look for a one-to-one correspondence of value ranking between WCHQ and WHIO, the study examined whether the test sites remained in the same quartile even with a change in rank designation. See Figure VII for a comparison of the WCHQ value ranking vs. WHIO value ranking.
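The value calculation and quartile segmentation described above can be sketched as follows. This is a minimal illustration, not the study's actual procedure: the clinic names, testing rates and costs per episode are all hypothetical, and the rule for splitting ranks into quartiles (equal bands, with any remainder placed in the lowest quartile) is an assumption. Quartile 4 is used for the highest-value clinics, following the labeling in the report's ranking figures.

```python
from dataclasses import dataclass


@dataclass
class ClinicResult:
    name: str
    quality_rate: float      # e.g. share of patients with twice annual A1c testing
    cost_per_episode: float  # cost per diabetes episode (hypothetical values below)

    @property
    def value(self) -> float:
        # Value as defined in the report: quality divided by cost.
        return self.quality_rate / self.cost_per_episode


def rank_into_quartiles(clinics):
    """Rank clinics by value (highest first); label quartiles 4 (top) to 1 (bottom)."""
    ordered = sorted(clinics, key=lambda c: c.value, reverse=True)
    band_size = max(1, len(ordered) // 4)
    ranked = []
    for rank, clinic in enumerate(ordered, start=1):
        # Band 0 is the top quarter; leftover clinics fall into the bottom band.
        band = min(3, (rank - 1) // band_size)
        ranked.append((rank, 4 - band, clinic.name))
    return ranked


# HYPOTHETICAL clinics: neither the rates nor the costs come from the report.
clinics = [
    ClinicResult("Clinic A", 0.90, 1200),
    ClinicResult("Clinic B", 0.80, 1000),
    ClinicResult("Clinic C", 0.85, 900),
    ClinicResult("Clinic D", 0.70, 1100),
    ClinicResult("Clinic E", 0.75, 1500),
    ClinicResult("Clinic F", 0.60, 1000),
    ClinicResult("Clinic G", 0.95, 1000),
    ClinicResult("Clinic H", 0.65, 1000),
]

ranking = rank_into_quartiles(clinics)
for rank, quartile, name in ranking:
    print(f"rank {rank}: {name} (quartile {quartile})")
```

In this made-up data, Clinic A has the second-highest testing rate but lands in the third quartile because of its higher cost per episode, which is the kind of trade-off a value metric is meant to surface.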
Figure VII. A1c testing value ranking results using unmodified specifications

  Quartile   WCHQ-APAP (Rank)            WHIO Unmodified (Rank)
  4th        TC CLINIC #6 FM (1)         TC CLINIC #6 FM (1)
             TC CLINIC #3 FM (2)         TC CLINIC #3 FM (2)
             TC CLINIC #10 FM (3)        TC CLINIC #10 FM (3)
             TC CLINIC #9 FM (4)         TC CLINIC #10 FM (5)
             TC CLINIC #10 FM (5)        TC CLINIC #8 FM (9)
             TC CLINIC #1 FM (6)         TC CLINIC #9 FM (4)
  3rd        PHC CLINIC #10 FM (7)       TC CLINIC #11 FM (8)
             TC CLINIC #11 FM (8)        PHC CLINIC #10 FM (7)
             TC CLINIC #8 FM (9)         TC CLINIC #1 FM (6)
             TC CLINIC #4 FM (10)        BHC CLINIC #1 FM (12)
             PHC CLINIC #4 FM (11)       TC CLINIC #5 FM (13)
             BHC CLINIC #1 FM (12)       PHC CLINIC #12 IM (15)
  2nd        TC CLINIC #5 FM (13)        TC CLINIC #4 FM (10)
             PHC CLINIC #5 IM (14)       PHC CLINIC #1 FM (16)
             PHC CLINIC #12 IM (15)      PHC CLINIC #8 FM (17)
             PHC CLINIC #1 FM (16)       PHC CLINIC #4 FM (11)
             PHC CLINIC #8 FM (17)       PHC CLINIC #11 IM (22)
             PHC CLINIC #3 FM (18)       PHC CLINIC #5 IM (14)
  1st        PHC CLINIC #9 IM (19)       TC CLINIC #2 IM (21)
             PHC CLINIC #7 FM (20)       PHC CLINIC #3 FM (18)
             TC CLINIC #2 IM (21)        TC CLINIC #12 FM (23)
             PHC CLINIC #11 IM (22)      PHC CLINIC #7 FM (20)
             TC CLINIC #12 FM (23)       PHC CLINIC #6 FM (25)
             PHC CLINIC #2 IM (24)       PHC CLINIC #9 IM (19)
             PHC CLINIC #6 FM (25)       PHC CLINIC #2 IM (24)

  (APAP = All patients, all payers)

Comparing WCHQ and WHIO value rankings with unmodified specifications resulted in 7 of 25 clinics moving to another quartile for twice annual A1c testing. The comparison of value ranking using LDL tests resulted in 4 of 30 clinics moving to a different quartile. Similarly, 9 of 30 clinics shifted quartiles when nephropathy screening results were compared.

Comparing diabetes process measures and value rankings with MODIFIED specifications

Following the testing model presented in Figure III, the answer to the question, "Do process measures and value rankings match?" was "No" for unmodified specifications. Therefore, a comparison of the WCHQ and WHIO process measures and value rankings using modified specifications was undertaken.
Process measure results

The WCHQ specifications were applied to the WHIO raw claims data to test the hypothesis that, once the WHIO specifications were modified, WHIO performance measure results would not differ significantly from WCHQ results. The A1c, LDL and nephropathy screening performance measures were then calculated. All results, except A1c testing at Bellin Health System, were significantly different when comparing WCHQ to WHIO. Results at the healthcare system level are presented in Figure VIII.

Figure VIII. Process measure results for unmodified and modified specifications

  Measure        Test site   WCHQ APAP   WCHQ 5DC   WHIO Unmod.   WHIO Mod.
  A1c            BHC         68.0%       70.6%      83.7%         67.4%
  A1c            PHC         69.0%       69.5%      76.3%         57.6%
  A1c            TC          81.5%       81.5%      89.9%         67.3%
  LDL            BHC         91.1%       91.9%      93.2%         83.8%
  LDL            PHC         84.9%       84.6%      84.2%         71.6%
  LDL            TC          95.6%       96.0%      94.4%         83.9%
  Neph. screen   BHC         78.4%       78.1%      87.2%         72.3%
  Neph. screen   PHC         72.7%       72.7%      86.8%         66.2%
  Neph. screen   TC          83.5%       84.1%      88.6%         73.5%

  (APAP = All patients, all payers; 5DC = Five data contributors)

The WHIO modified diabetes test results were then compared to the WCHQ commercially insured population limited to the same five WHIO data contributors. This comparison was meant to test the hypothesis that similar patient populations, by payer type, would yield statistically similar results. However, even when comparing the same data contributors, the results were still significantly different.
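To make the size of the remaining differences concrete, the short sketch below tabulates the percentage-point gap between the WCHQ 5DC rate and the WHIO modified rate for each measure and test site. The rates are transcribed from Figure VIII; the variable names are illustrative only, and the script simply subtracts one column from the other without making any claim about statistical significance.

```python
# Rates (percent) transcribed from Figure VIII: WCHQ limited to the five
# data contributors (5DC) vs. WHIO claims run against WCHQ specifications.
figure_viii = {
    ("A1c", "BHC"): {"wchq_5dc": 70.6, "whio_mod": 67.4},
    ("A1c", "PHC"): {"wchq_5dc": 69.5, "whio_mod": 57.6},
    ("A1c", "TC"): {"wchq_5dc": 81.5, "whio_mod": 67.3},
    ("LDL", "BHC"): {"wchq_5dc": 91.9, "whio_mod": 83.8},
    ("LDL", "PHC"): {"wchq_5dc": 84.6, "whio_mod": 71.6},
    ("LDL", "TC"): {"wchq_5dc": 96.0, "whio_mod": 83.9},
    ("Neph", "BHC"): {"wchq_5dc": 78.1, "whio_mod": 72.3},
    ("Neph", "PHC"): {"wchq_5dc": 72.7, "whio_mod": 66.2},
    ("Neph", "TC"): {"wchq_5dc": 84.1, "whio_mod": 73.5},
}

# Gap in percentage points, rounded to one decimal to avoid float noise.
gaps = {
    key: round(rates["wchq_5dc"] - rates["whio_mod"], 1)
    for key, rates in figure_viii.items()
}

# Print the gaps from largest to smallest.
for (measure, site), gap in sorted(gaps.items(), key=lambda item: item[1], reverse=True):
    print(f"{measure:5s} {site:4s} WCHQ 5DC minus WHIO Mod. = {gap:4.1f} points")
```

In every row the WHIO modified rate is lower than the WCHQ 5DC rate, and the smallest gap is A1c testing at Bellin Health, the one result the report describes as not significantly different.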
Value ranking results

Employing the same quartile logic as the unmodified specification value rank testing, WCHQ and WHIO value rankings were then compared using modified specifications. The WHIO claims were run against the WCHQ specifications. The WHIO modified results are presented in Figure IX.

Figure IX. A1c testing value ranking results using modified specifications

  Quartile   WCHQ-APAP (Rank)            WHIO Modified (Rank)
  4th        TC CLINIC #6 FM (1)         TC CLINIC #6 FM (1)
             TC CLINIC #3 FM (2)         TC CLINIC #3 FM (2)
             TC CLINIC #10 FM (3)        TC CLINIC #10 FM (3)
             TC CLINIC #9 FM (4)         TC CLINIC #9 FM (4)
             TC CLINIC #7 FM (5)         TC CLINIC #7 FM (5)
             TC CLINIC #1 FM (6)         PHC CLINIC #10 FM (7)
  3rd        PHC CLINIC #10 FM (7)       BHC CLINIC #1 FM (12)
             TC CLINIC #11 FM (8)        TC CLINIC #1 FM (6)
             TC CLINIC #8 FM (9)         TC CLINIC #8 FM (9)
             TC CLINIC #4 FM (10)        PHC CLINIC #12 IM (15)
             PHC CLINIC #4 FM (11)       TC CLINIC #4 FM (10)
             BHC CLINIC #1 FM (12)       TC CLINIC #11 FM (8)
  2nd        TC CLINIC #5 FM (13)        PHC CLINIC #4 FM (11)
             PHC CLINIC #5 IM (14)       TC CLINIC #5 FM (13)
             PHC CLINIC #12 IM (15)      PHC CLINIC #1 FM (16)
             PHC CLINIC #1 FM (16)       PHC CLINIC #8 FM (17)
             PHC CLINIC #8 FM (17)       PHC CLINIC #11 IM (22)
             PHC CLINIC #3 FM (18)       PHC CLINIC #7 FM (20)
  1st        PHC CLINIC #9 IM (19)       TC CLINIC #2 IM (21)
             PHC CLINIC #7 FM (20)       PHC CLINIC #5 IM (14)
             TC CLINIC #2 IM (21)        TC CLINIC #12 FM (23)
             PHC CLINIC #11 IM (22)      PHC CLINIC #3 FM (18)
             TC CLINIC #12 FM (23)       PHC CLINIC #6 FM (25)
             PHC CLINIC #2 IM (24)       PHC CLINIC #9 IM (19)
             PHC CLINIC #6 FM (25)       PHC CLINIC #2 IM (24)

  (APAP = All patients, all payers)

Comparing the WCHQ and WHIO value rankings in this scenario resulted in 9 of 25 clinics moving to another quartile for twice annual A1c testing. The comparison of value ranking using LDL tests resulted in 9 of 30 clinics moving to a different quartile. Similarly, 9 of 30 clinics shifted quartiles when nephropathy screening results were compared.
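The quartile-shift counts reported for the unmodified and modified specifications can be tallied into a single "stayed in the same quartile" figure, as in the sketch below. The counts are taken from the text; the dictionary layout and function name are illustrative. For the unmodified specifications the tally gives 65 of 85, which matches the figure quoted in the Findings; the modified total is simple arithmetic on the reported counts and is not a number stated in the report.

```python
# (clinics that changed quartile, clinics compared), as reported in the text.
quartile_shifts = {
    "unmodified": {"A1c": (7, 25), "LDL": (4, 30), "Nephropathy": (9, 30)},
    "modified": {"A1c": (9, 25), "LDL": (9, 30), "Nephropathy": (9, 30)},
}


def same_quartile_summary(counts):
    """Return (stayed, total, percent stayed) across all three measures."""
    total = sum(compared for _, compared in counts.values())
    moved = sum(changed for changed, _ in counts.values())
    stayed = total - moved
    return stayed, total, 100 * stayed / total


for scenario, counts in quartile_shifts.items():
    stayed, total, pct = same_quartile_summary(counts)
    print(f"{scenario}: {stayed} of {total} stayed in the same quartile ({pct:.1f}%)")
```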
Findings

The comparison of WCHQ and WHIO process measure and value ranking results when using unmodified specifications can be summarized as follows:

  - 3 of 9 (33%) performance measure results are statistically similar at the system level of reporting (p > 0.05)
  - 57 of 85 (67%) measures are statistically similar at the clinic by specialty level of reporting (p > 0.05)
  - 65 of 85 (76%) clinics stay in the same quartile value rank (V = Q/C)

In conclusion, the diabetes performance results, using different populations and measure specifications, do not match as closely as hoped. Additional inquiry is needed to understand the factors contributing to those clinics and tests that do not match in compliance rate or value rank.

Given the differences in performance measure algorithms and populations measured, it is perhaps not surprising that the diabetes performance measures did not match. More surprising are the significant differences that remained when the WHIO data, modified to the WCHQ specifications, was compared with the WCHQ results.

Discussion

A number of factors were considered to explain the differences between WCHQ and WHIO results. Where possible, more testing was conducted to strengthen the analyses and provide new insights. The additional testing included:

Matching age limits. There was some inconsistency in applying an upper age limit (75 vs. 85) at the test site organizations. Upon further testing, moving from age 75 to age 85 increased the twice annual A1c testing compliance results by 1-2%.

Matching codes. WCHQ does include a few additional CPT codes to identify diabetes tests.
Upon further testing, these codes increased A1c testing results by tenths of a percent.

Matching dates of service and reporting periods. These were determined to be the same and, therefore, not a source of the differences.

Impact of a minimum insurance enrollment period. Upon further testing, the requirement of an 11-month insurance enrollment period during the measurement year increased the WHIO testing results by 5-6%.

Access to non-medical claims data by WCHQ reporting organizations (e.g., employer-paid health risk assessments). While this factor was considered, no additional testing was performed to determine the impact of non-medical claims data.

Impact of capitated contracts. The pilot organizations confirmed they did not have any capitated contracts at the time of this project.

Impact of self-funded plans. The WCHQ unmodified data included patients covered by self-funded plans, while these patients were not included in the WHIO data. Unless self-funded patients are cared for in a manner significantly different from other patients, the impact should be negligible. Confirming this hypothesis requires analysis of the actual data.

Although no single factor is likely to explain the gap in test results and value rank, the significant differences may be attributable, in part, to each of the above factors. A much more detailed analysis, most likely at the patient level, is needed to truly understand all the factors affecting the difference in diabetes testing results and value rank. A future study could track a random selection of WCHQ patients from the provider encounter records through billing, insurance processing and the WHIO/Ingenix algorithm. Such a study might help to identify the variety and significance of factors contributing to the differences between health system encounter-based reporting and administrative claims-based reporting.
Next steps

The results of this project provide an important foundation for additional efforts to further develop and test a workable value metric for the management of diabetes. This project represented the first cycle of testing and evaluation of clinical encounter data and claims data to report measures of healthcare value.

The WCHQ board of directors will reconvene for a strategic planning session in May 2010. At that time, the board will consider additional options to test and develop measures of value with the WCHQ and WHIO data, as well as how to use these rich sources of data to improve healthcare in the state of Wisconsin.

WCHQ is sincerely appreciative of the support it has received from the Robert Wood Johnson Foundation through Aligning Forces for Quality, as well as the hardworking team members at Bellin Health, ProHealth Care, ThedaCare, WHIO and Ingenix.

Questions, feedback and suggestions are encouraged as an integral part of the collaborative learning process. For more information, please contact Jack Bowhan at 608-826-6842 or jbowhan@wchq.org.