A Simple Pipeline Application for Identifying and Negating SNOMED CT in Free Text

Anthony Nguyen 1, Michael Lawley 1, David Hansen 1, Shoni Colquist 2
1 The Australian e-health Research Centre, CSIRO ICT Centre, Brisbane, Australia
2 Queensland Cancer Control Analysis Team, Queensland Health, Brisbane, Australia

Medical Free Text to SNOMED CT Mapping
Medical free text
- Non-standardised
- Individual & institutional variation
- Manual extraction of information
SNOMED CT
- Standard reference terminology (1)
- Large formal ontology
- Aggregate clinical information for retrieval & analysis
(1) SNOMED CT is identified by NeHTA as the standard ontology to be used in systems within Australian healthcare.
Figure 1. General methodology for coded biomedical terminology mapping from medical free text (stages shown: Free-text Report, Detect Negation Phrases, Parse into Medical Terms)

Case Study: Cancer Notifications
Manual review of (text-based) medical records
- Multi-year cancer information reporting delay
- Expensive and labour-intensive process
- Subject to omission errors: cases inadvertently skipped, keywords missed
Figure 2. De-identified pathology report demonstrating how cancer notification was historically performed

Goal
Hypothesis: An automated computer system could perform the time- and labour-intensive manual coding of clinical information
Method: Automatically scan free-text medical documents for clinically relevant terms (e.g. Metastatic Small Cell Carcinoma)

Pipeline Application [Tools]
Figure 3. Pipeline application for the annotation of SNOMED CT concepts from free text
General Architecture for Text Engineering (GATE)
- Open source architecture for natural language processing (NLP)
MetaMap Transfer (MMTx) application
- Identifies concepts in free text, restricted to the SNOMED CT source
Negation detection algorithm (NegEx)
- Detection of negation phrases
Ontology server (OntoServer)
- SNOMED CT semantics to identify subsets of finding and disease concepts for negation
Development corpus (AUSLAB)
- Lung cancer reports from state-wide pathology information system

Pipeline Application
Document Reset: Restores the document to its original state by deleting existing annotations
Tokeniser: Splits the document into:
- Tokens
- Length measurements & units
- TNM cancer stages
- Legacy SNOMED IDs
- De-identified information
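The token types listed for the Tokeniser can be illustrated with a small regular-expression sketch. The slides name the token types but not the rules behind them, so every pattern below (including the shape assumed for legacy SNOMED IDs such as `M-80413`) is a hypothetical stand-in, and de-identified placeholders are not covered.

```python
import re

# Hypothetical patterns: illustrative only, not the rules used in the pipeline.
# Order matters: specialised token types are tried before generic ones.
TOKEN_PATTERNS = [
    ("TNM_STAGE", r"\b[pcy]?T[0-4Xx][a-c]?\s?N[0-3Xx][a-c]?(?:\s?M[01Xx][a-c]?)?\b"),
    ("MEASUREMENT", r"\b\d+(?:\.\d+)?\s?(?:mm|cm|ml|g)\b"),
    ("LEGACY_SNOMED_ID", r"\b[TMPDF]-\d{5}\b"),
    ("WORD", r"[A-Za-z]+(?:-[A-Za-z]+)*"),
    ("NUMBER", r"\d+(?:\.\d+)?"),
    ("PUNCT", r"[^\w\s]"),
]
MASTER = re.compile("|".join(f"(?P<{name}>{pat})" for name, pat in TOKEN_PATTERNS))


def tokenise(text):
    """Return (token_type, token_text, start, end) tuples in document order."""
    return [(m.lastgroup, m.group(), m.start(), m.end()) for m in MASTER.finditer(text)]


if __name__ == "__main__":
    for token in tokenise("Tumour 35 mm, pT2N0M0 (M-80413)."):
        print(token)
```

Keeping character offsets with each token mirrors how annotation frameworks attach later annotations (concepts, negation) to spans of the original text.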

Pipeline Application
Sentence Splitter: Segments text into sentences using regular expressions
UMLS (MMTx) Annotator: Maps strings in free text to the closest concepts in UMLS
- UMLS concepts were restricted to the SNOMED CT source
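A regular-expression sentence splitter of the kind named above might look like the following minimal sketch. The boundary pattern and the abbreviation list are assumptions made for illustration; the slides do not give the expressions actually used.

```python
import re

# Hypothetical abbreviation list: a full stop after these does not end a sentence.
ABBREVIATIONS = {"dr", "mr", "mrs", "approx", "e.g", "i.e"}

# Candidate boundary: sentence-final punctuation, whitespace, then a capital or digit.
BOUNDARY = re.compile(r"(?<=[.!?])\s+(?=[A-Z0-9])")


def split_sentences(text):
    """Split text at candidate boundaries, skipping those that follow an abbreviation."""
    sentences, start = [], 0
    for match in BOUNDARY.finditer(text):
        candidate = text[start:match.start()]
        words = candidate.split()
        last_word = words[-1].rstrip(".").lower() if words else ""
        if last_word in ABBREVIATIONS:
            continue  # keep scanning; this full stop belongs to an abbreviation
        sentences.append(candidate.strip())
        start = match.end()
    tail = text[start:].strip()
    if tail:
        sentences.append(tail)
    return sentences


if __name__ == "__main__":
    report = "No evidence of malignancy. Dr. Smith reviewed the slides. Tumour measures 3.5 cm."
    for sentence in split_sentences(report):
        print(sentence)
```

Decimal values such as `3.5` are left intact because the boundary pattern requires whitespace after the full stop.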

Pipeline Application
UMLS to SNOMED CT Mapper:
- UMLS concepts mapped to SNOMED CT using the UMLS Metathesaurus database file
- Only active concepts were retained
- SNOMED ID tokens mapped to SNOMED CT concepts
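One way to build such a mapping is sketched below, assuming the "UMLS Metathesaurus database file" is the pipe-delimited `MRCONSO.RRF` file. That choice, the source abbreviation `SNOMEDCT_US`, and the use of the `SUPPRESS` flag as a stand-in for "active" are all assumptions; the slides do not say how the mapping or the active-concept filter was implemented. The sample rows are invented placeholders, not real UMLS content.

```python
from collections import defaultdict

# Column positions in MRCONSO.RRF (pipe-delimited, with a trailing pipe).
CUI, SAB, CODE, STR, SUPPRESS = 0, 11, 13, 14, 16


def build_cui_to_snomed(rrf_lines, sab="SNOMEDCT_US"):
    """Map each UMLS CUI to the set of SNOMED CT codes asserted for it.

    Rows from other source vocabularies are ignored, and rows flagged as
    suppressible are dropped (an assumed proxy for 'only active concepts').
    """
    mapping = defaultdict(set)
    for line in rrf_lines:
        fields = line.rstrip("\n").split("|")
        if len(fields) < 17:
            continue  # malformed row
        if fields[SAB] == sab and fields[SUPPRESS] == "N":
            mapping[fields[CUI]].add(fields[CODE])
    return dict(mapping)


# Invented rows for illustration only.
SAMPLE_ROWS = [
    "C9000001|ENG|P|L1|PF|S1|Y|A1||100001||SNOMEDCT_US|PT|100001|Example disorder|9|N||",
    "C9000001|ENG|S|L2|PF|S2|Y|A2||D000001||MSH|MH|D000001|Example disorder|0|N||",
    "C9000002|ENG|P|L3|PF|S3|Y|A3||100002||SNOMEDCT_US|PT|100002|Retired example|9|O||",
]

if __name__ == "__main__":
    print(build_cui_to_snomed(SAMPLE_ROWS))
```

In practice the function would be fed an open file handle on `MRCONSO.RRF`, and the source abbreviation should be checked against the UMLS release in use, since it has changed between releases.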

Pipeline Application
Negation Phrase (NegEx) Finder
- Finds common medical negation phrases
SNOMED CT Negation (NegEx) Applicator
- Tags negated concepts by associating negation phrases with neighbouring SNOMED CT finding or disease concepts
SNOMED CT sub-hierarchies considered for negation:
- 404684003 clinical finding (e.g. decreased capillary fragility, diabetes mellitus, dyspnea)
- 49755003 morphologically abnormal structure (e.g. wallerian degeneration, carcinoma)
- 363787002 observable entity (e.g. status of invasion by tumour)
- 272379006 event (e.g. exposure to toxin, accident caused by bench saw)
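The finder and applicator steps can be sketched together as a much-simplified, NegEx-style routine. The trigger phrases, terminating words, and five-token window below are a small illustrative subset chosen as assumptions, not the full NegEx rule set or the pipeline's configuration. The `Concept` class and its token-index spans are hypothetical; only the four sub-hierarchy identifiers come from the slide.

```python
import re
from dataclasses import dataclass

# Sub-hierarchy roots listed on the slide as eligible for negation.
NEGATABLE_ROOTS = {"404684003", "49755003", "363787002", "272379006"}

# Assumed subset of pre-negation triggers, longest first so phrases win over "no".
PRE_TRIGGERS = ["no evidence of", "negative for", "without", "no"]
TERMINATORS = {"but", "however", "although"}  # assumed scope-ending words
WINDOW = 5  # assumed scope length in tokens


@dataclass
class Concept:
    text: str
    start: int          # index of the concept's first token in the sentence
    root: str           # SNOMED CT sub-hierarchy root of the concept
    negated: bool = False


def tokens(sentence):
    return re.findall(r"[a-z0-9]+", sentence.lower())


def find_triggers(toks):
    """Return the token index immediately after each negation phrase."""
    hits, i = [], 0
    while i < len(toks):
        for phrase in PRE_TRIGGERS:
            words = phrase.split()
            if toks[i:i + len(words)] == words:
                hits.append(i + len(words))
                i += len(words) - 1
                break
        i += 1
    return hits


def apply_negation(sentence, concepts):
    """Mark eligible concepts that start inside the scope of a negation phrase."""
    toks = tokens(sentence)
    for scope_start in find_triggers(toks):
        scope_end = min(len(toks), scope_start + WINDOW)
        for j in range(scope_start, scope_end):
            if toks[j] in TERMINATORS:
                scope_end = j
                break
        for concept in concepts:
            if concept.root in NEGATABLE_ROOTS and scope_start <= concept.start < scope_end:
                concept.negated = True
    return concepts


if __name__ == "__main__":
    sentence = "No evidence of carcinoma but emphysema is present."
    found = [Concept("carcinoma", 3, "49755003"), Concept("emphysema", 5, "404684003")]
    for concept in apply_negation(sentence, found):
        print(concept.text, concept.negated)
```

Restricting negation to the listed sub-hierarchies means that an anatomical site mentioned inside the negation scope (for example "lung" in "without lung carcinoma") is left unmarked, while the finding or morphology it qualifies is tagged.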

Results

Symbolic Rule-Based Post-Coordination
Subsumption relationships can be used to retrieve concepts that relate to the same disease, anatomy or finding, or to infer whether two descriptions relate to the same concept.
Example: Lung Resection Subsumption Test
Test whether the following post-coordinated expression template
<procedure> : { 260686004 method = <procedure.method>, 405813007 procedure site - Direct = <topology> }
is subsumed by:
119746007 lung excision
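The idea behind the subsumption test can be sketched with a toy is-a hierarchy. This is a structural simplification, not a description-logic classifier and not the ontology server's method: the concept labels in `IS_A` are illustrative names rather than SNOMED CT content, and the definition assumed for lung excision (an excision method applied to a lung site) is a hypothetical stand-in. Only the two attribute names and `119746007 lung excision` come from the slide.

```python
# Toy is-a hierarchy (child -> parents). Labels are illustrative, not SNOMED CT content.
IS_A = {
    "wedge excision": {"excision"},
    "excision": {"surgical action"},
    "biopsy action": {"surgical action"},
    "upper lobe of lung": {"lung structure"},
    "lung structure": {"thoracic structure"},
    "pleural structure": {"thoracic structure"},
}

METHOD = "260686004 method"
SITE = "405813007 procedure site - Direct"

# Assumed, simplified definition of 119746007 lung excision.
LUNG_EXCISION = {METHOD: "excision", SITE: "lung structure"}


def ancestors_or_self(concept):
    """Collect the concept and everything reachable through is-a links."""
    seen, stack = set(), [concept]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(IS_A.get(current, ()))
    return seen


def is_subsumed_by(child, parent):
    return parent in ancestors_or_self(child)


def expression_subsumed_by(expression, definition):
    """True if every attribute value in the definition subsumes the expression's value."""
    return all(
        attribute in expression and is_subsumed_by(expression[attribute], value)
        for attribute, value in definition.items()
    )


if __name__ == "__main__":
    filled_template = {METHOD: "wedge excision", SITE: "upper lobe of lung"}
    print(expression_subsumed_by(filled_template, LUNG_EXCISION))
```

A template filled with a biopsy method, or with a pleural site, fails the test under these assumptions because one of its attribute values falls outside the corresponding branch of the hierarchy.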

Symbolic Rule-Based Post-Coordination
Figure 4. Synoptic factor extraction using SNOMED CT semantics

The Australian e-health Research Centre
Michael Lawley, Principal Research Scientist
Phone: +61 7 3253 3609
Email: michael.lawley@csiro.au
Web: http://aehrc.com

Thank you

Contact Us
Phone: 1300 363 400 or +61 3 9545 2176
Email: Enquiries@csiro.au
Web: www.csiro.au