Evaluating E&M Coding Accuracy of GoCode as Compared to Internal Medicine Physicians and Auditors

by Rhonda R. Thomas, PhD, MBA; Constantin Mitocaru; and Judith Hanlon

Introduction

Change is continually upon us, and the challenge is to leverage technology to become more efficient professionals, better problem solvers, and true knowledge workers, using technology to work smarter, faster, and more accurately. Reviewing free text has been a time-consuming and tedious manual task. Humans were needed to understand the meanings of synonyms, acronyms, and complex conceptual expressions. Statistics-based trigger words such as "complications" or "history" often do not identify synonyms or closely related terms. 1 Just as technology has raced ahead of consumers in the past decade, it has also been racing forward in development (albeit sometimes quietly and behind the scenes) toward revolutionary new work in healthcare. New advances in natural language processing (NLP) allow the computer to understand what clinicians mean in their documentation, with little regard to how they say it. This preliminary study examines both the qualitative and quantitative impact of new NLP technology, specifically that found in GoCode, for assisting HIM professionals in achieving a more standardized interpretation of evaluation and management (E&M) coding guidelines in internal medicine encounters. Observations on how this technology affects the coding behavior of human reviewers are also tracked and discussed, both qualitatively, from the reviewers' own words, and quantitatively, from observing changes to their subsequent coding behavior during and after their interaction with the system.

Background

Most healthcare payors require the clinician to provide a CPT (Current Procedural Terminology) 2 visit code in order to bill for patient services in the evaluation and management of their care.
In fact, reimbursement from payors is largely based on the evaluation and management (E&M) level reported and supported by the exam documentation for a patient visit. CPT guidelines are, however, just that: guidelines, subject to regional and individual interpretation. The complexity of and variation in interpretations of CPT E&M levels are well documented in various studies. 3, 4 Unstructured, free-text documentation of the patient encounter is certainly the norm throughout the largest single healthcare enterprise in the United States, the federal government, especially the Veterans Health Administration 5 and the Department of Defense. 6 Free-text documentation (from dictation, transcription, or voice recognition) is currently the preferred method of documentation and thus the norm in most US practice settings. 7 However, extracting meaning from unstructured text is a complex task, previously possible only for human reviewers, requiring an understanding of syntax, negation, synonyms, and prose to unlock the meaning contained in narrative documentation.

Perspectives in Health Information Management, CAC Proceedings; Fall 2008

Research Hypothesis: Null Hypothesis

The study sought to disprove the following null hypothesis: Semantic- and syntactic-based natural language processing technology, combined with a sophisticated medical coding engine known as GoCode, has no impact on improving the accuracy of human coder reviews for E&M coding of free-text patient encounter documents.

Methodology

The test production environment of a major, multispecialty ambulatory medical clinic was polled to extract all internal medicine patient encounter notes processed by the GoCode system from May 1, 2007, to July 26, 2007. This resulted in a total population of 2,515 internal medicine patient encounter notes. These notes were then processed with GoCode version 1.6 and compared with the E&M level of care indicated by the physician at the time of dictation (Table 1). Note that a difference of 0 means that the provider and GoCode were a perfect match in the E&M coding assignment, while negative differences (-1, -2, -3) indicate possible undercoding and positive differences (1, 2, 3, 4) indicate potential overcoding. Notes falling within a +/-1 level of care difference were less concerning than the outliers (those with greater than a +/-1 level of difference), for which the coding issues needed to be understood. Notes that had a physician-entered E&M code within +/-1 level of agreement with the GoCode-assigned code were eliminated, resulting in a set of 131 discordant notes (disagreement by two levels or more) that would require human review and resolution prior to release for billing. In the typical workflow within GoCode, these 131 records (approximately 5.2 percent of the total) would have been routed back to both the physician and other human reviewers for evaluation as exceptions and would not be automatically released for billing. Four certified coding professionals were selected by AHIMA staff to assist with this coding evaluation.
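The routing rule described above can be sketched in a few lines of Python. This is an illustrative sketch, not GoCode's implementation; the function name and integer E&M levels are assumptions for the example.

```python
# Sketch of the concordance rule described in the Methodology: a note is
# auto-released for billing only when the GoCode-assigned E&M level is
# within +/-1 of the physician-entered level; larger deviations are
# routed back as exceptions for human review. (Hypothetical names.)

def route_note(provider_level: int, gocode_level: int) -> str:
    """Return 'release' or 'review' per the +/-1 concordance rule."""
    deviation = gocode_level - provider_level
    # Negative deviation suggests possible provider undercoding;
    # positive deviation suggests potential overcoding.
    return "release" if abs(deviation) <= 1 else "review"

# Example: provider entered level 3, GoCode assigned level 5
print(route_note(3, 5))  # deviation = +2, so the note is routed for review
```

Applied to the study's population, this rule released 2,384 of the 2,515 notes and flagged the 131 discordant notes for review.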
AHIMA staff divided the sample set of 131 discordant notes among these four independent reviewers. Reviewers then evaluated the encounter notes without the aid of GoCode findings, using the same audit form, and applied only the 1995 version of the American Medical Association (AMA)/Centers for Medicare and Medicaid Services (CMS) documentation guidelines for establishing the level of service. Upon completion of the reviewers' evaluation, 64 notes (or 48.9 percent of the sample) demonstrated no agreement among the physicians, GoCode, and the auditors. Human reviewers noted that three documents had incomplete or insufficient documentation for coding, and these documents were removed from the sample population and excluded from further analysis. In 12 cases the reviewers believed the providers had selected the wrong type of encounter, resulting in an invalid CPT code, and these notes were also excluded from further evaluation. Thus, 116 notes remained for further analysis. Next, a random-number generator was applied to each reviewer's note set, and five notes from each of the four reviewers (n = 20) were randomly selected for a second pass involving a group evaluation and interaction with GoCode processing detail. The group leader was given Web access to the processing detail provided by GoCode for this second pass. Through a virtual meeting environment, the group was able to review findings from GoCode, discuss interpretations of the documentation, and record comments and/or disagreements in interpretation of the record's content. A final E&M coding level was reviewed and recorded.

Analysis

Discussion with HIM professionals regarding the interpretive and often subjective nature of coding guidelines has generated some consensus that coding results within +/-1 level of agreement should be acceptable.
A statistical test of proportions was conducted to evaluate the change in the proportion of records falling within +/-1 level of agreement with GoCode before and after the reviewers' interaction with the GoCode tool.
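A standard pooled two-proportion test can be sketched from the counts later reported in Tables 2 and 3 (61 of 116 notes within +/-1 on the first pass versus 16 of 20 on the second pass). This is an illustrative z-test computation, not necessarily the authors' exact procedure, which they describe as a t-test statistic.

```python
# Illustrative two-sided, pooled test of proportions using the counts
# reported in Tables 2 and 3. The normal CDF is computed via math.erf.
import math

def two_proportion_ztest(x1, n1, x2, n2):
    """Pooled two-proportion z-test; returns (z, two-tailed p-value)."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p2 - p1) / se
    # Two-tailed p-value from the standard normal distribution
    p_value = 2 * (1 - 0.5 * (1 + math.erf(abs(z) / math.sqrt(2))))
    return z, p_value

# First pass: 61/116 within +/-1; second pass: 16/20 within +/-1
z, p = two_proportion_ztest(61, 116, 16, 20)
print(f"z = {z:.2f}, p = {p:.3f}")  # p falls below .03, consistent with the text
```

Under these counts the two-tailed p-value lands below .03, consistent with the rejection of the null hypothesis reported in the results.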

Based on statistical test results comparing the reviewers' first pass to the changes in the second pass, the null hypothesis was rejected at p < .03 (the 97 percent confidence level) on a two-tailed test of proportions, using a t-test statistic. Human review initially demonstrated that 52.6 percent of notes fell within the +/-1 level of agreement with the GoCode findings (see Table 2). Upon interacting with GoCode, reviewers achieved 80 percent agreement in the second pass within the +/-1 range of concordance with GoCode (see Table 3). It appears that in this preliminary study, interaction with GoCode technology did improve consensus in the interpretation and coding outcome of free-text narrative documentation. Using the auditing software on the LingoLogix Web portal, a second pass was made comparing the evaluation of the human coders with the software program. This resulted in a change in E&M level in 55 percent of the cases once the components of the documentation were evaluated using the tool. Initially, in only 1 of the 20 cases (5 percent) did GoCode and the human reviewers agree exactly (see Figure 1). At the close of this preliminary study, there were only four cases with a variance of more than one level between the human coder and the software results, and the causes for the variance are still under review. The light blue bars in Figure 1 show the deviation of reviewers' E&M levels at the beginning of the second pass (n = 20). Note that the distribution is skewed toward the negative levels, indicating a propensity to undercode the documents. After interaction with GoCode, the distribution shows results similar to those in Table 3, with much higher agreement and concurrence of human E&M coding with GoCode. Prior to interaction with GoCode, only 11 of the 20 documents (55 percent) were concurrent (within the +/-1 range) with GoCode, versus 80 percent after interaction with GoCode.
Discussion

New technologies involving natural language processing based on semantic and syntactic rules (rather than string word matching or keyword searching) offer new and exciting opportunities to make health information management professionals more efficient, effective, and productive, while simultaneously improving accuracy. Comments from the AHIMA staff participants include the following. (Note: These comments are those of individuals rather than official remarks by AHIMA.)

The project was interesting and exciting. It was very valuable to look at the software to compare the E&M assignment, since the software application referenced the note and displayed it in the components required by the guidelines for review. Since E&M coding is so subjective, there were areas in which the human coder varied from the interpretation of the software application; some were discovered to be facility-specific guidelines (medication risk). One area that was evident to the coders examining the overall results was that GoCode frequently selected a higher-level code for the case than the physician or the human coder did. In many cases the documentation supported the higher-level code. We were very impressed with the software and its capabilities, and particularly with the fact that SNOMED-CT was utilized. The audit tool was very easy to use and helpful in organizing the component parts of the documentation available to evaluate for level of service. The educational value as a feedback loop to physicians is impressive. The ability to add organization-specific criteria to support care levels is useful and in many cases not available to the human coder at the time of code selection.

Other specific observations from the reviewers were as follows:

1. In the History section, GoCode seemed to use isolated words out of context. For example, the history of present illness must concern elements of the present condition, not a past condition, and the computer could not differentiate the two.

Occasionally, some false-positive findings may occur, especially between the history of present illness and the past medical history, if there is no clear delineation between these sections, as the computer cannot tell the recent past from significant past history.

2. In the physical examination section, it is noteworthy that the body areas and the organ systems were randomly mixed, and it appeared that many times a system or area was counted more than once; however, this is not actually the case. For example, if you count an examination of the ears, nose, mouth, and throat, then you cannot count the head; if you count cardiovascular, then you don't count the chest; if you count GI, then you don't count the abdomen; and so forth. It also appeared that for these reasons GoCode gave a higher assignment to the Examination component. Even though GoCode displays all the body areas and organ systems, they are not being double counted. There was one particular record in which the total exam was only four lines long, with two lines dedicated to constitutional findings (vital signs), and GoCode assigned a detailed-level exam.

3. In the Medical Decision Making section, GoCode also appears to code at a level higher than humans do. There may be issues with the software recognizing what was current data and what was a test that appeared in the past medical history section. For example, statements of the patient having had a colonoscopy in the past might be counted in the amount of data to review. It also seemed that cases were inappropriately assigned to high medical decision making due to medication management at the high-risk level, as knowledge of these high-risk medications was not taken into account by the reviewers.

Conclusions

This is a preliminary study that could be expanded to other specialty areas. Focusing on internal medicine is challenging because of the complexity and variability of the exams and documentation that must be interpreted by the NLP technology against coding guidelines. More standardization of coding guidelines will help ensure that codes are interpreted and applied more uniformly across all areas of healthcare and across institutions.

Rhonda R. Thomas, PhD, MBA, is the president of LingoLogix in Dallas, TX. Constantin Mitocaru is the lead software developer for LingoLogix in Dallas, TX. Judith Hanlon is the chief operating officer and vice president of professional services for LingoLogix in Dallas, TX.

Notes

1. Brown, Steven H., T. Speroff, E. Fielstein, et al. "eQuality: Electronic Quality Assessment from Narrative Clinical Reports." Mayo Clinic Proceedings 81, no. 11 (November 2006): 1472–1481.
2. Current Procedural Terminology: CPT. Chicago, IL: American Medical Association, 1999.
3. Lasker, Roz, and Susan Marquis. "The Intensity of Physicians' Work in Patient Visits." New England Journal of Medicine 341, no. 5 (1999): 337–341.
4. King, M. S., L. Sharp, and M. S. Lipsky. "Accuracy of CPT Evaluation and Management Coding by Family Physicians." Journal of the American Board of Family Practice 14, no. 3 (May–June 2001): 184–192.
5. Brown, Steven H., T. Speroff, E. Fielstein, et al. "eQuality: Electronic Quality Assessment from Narrative Clinical Reports."
6. The authors have had meetings and conferences with Dr. Pak and others at Fort Detrick, 2007.
7. Brown, Steven H., T. Speroff, E. Fielstein, et al. "eQuality: Electronic Quality Assessment from Narrative Clinical Reports."

Table 1
E&M Coding Level Deviation of Internal Medicine Notes

Difference between Provider and GoCode    Number of Notes
-3                                        5
-2                                        110
-1                                        866
 0                                        1,425
 1                                        93
 2                                        13
 3                                        1
 4                                        2
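As a quick arithmetic check, the deviation counts in Table 1 can be transcribed into a short Python sketch, which confirms the population total of 2,515 notes and the 131 discordant notes (deviation beyond +/-1) cited in the Methodology.

```python
# Deviation counts transcribed from Table 1
# (provider-vs-GoCode E&M level difference -> number of notes)
table1 = {-3: 5, -2: 110, -1: 866, 0: 1425, 1: 93, 2: 13, 3: 1, 4: 2}

total = sum(table1.values())
discordant = sum(n for d, n in table1.items() if abs(d) > 1)
within_one = total - discordant

print(total, discordant, within_one)  # 2515 131 2384
print(f"discordant share: {100 * discordant / total:.1f}%")  # 5.2%
```

The discordant share works out to roughly 5.2 percent of the total, i.e., the set of notes that would be routed for human review rather than released automatically.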

Table 2
GoCode and Human Review without GoCode Assistance

Deviation    Frequency    Percent
-3           0            0.0%
-2           2            1.7%
-1           4            3.4%
 0           6            5.2%
 1           51           44.0%
 2           52           44.8%
 3           1            0.9%

(Within +/-1 level of agreement: 52.6%)

Table 3
GoCode and Human Review with GoCode Assistance (Second Pass)

Deviation    Frequency    Percent
-3           0            0.0%
-2           0            0.0%
-1           0            0.0%
 0           8            40.0%
 1           8            40.0%
 2           4            20.0%
 3           0            0.0%

(Within +/-1 level of agreement: 80%)

Figure 1
Human Reviewer Before/After GoCode

[Bar chart showing the number of records (0 to 12) at each E&M level deviation (-3 to 3) for two series: AHIMA I - GoCode (reviewer codes before interaction with GoCode) and AHIMA II - GoCode (after interaction). X-axis: Deviation; Y-axis: Records.]