Toward a Unified Representation of Findings in Clinical Radiology

Valérie Bertaud a, Jérémy Lasbleiz a,b, Fleur Mougin a, Franck Marin a, Anita Burgun a, Régis Duvauferrier a,b

a EA 3888, LIM, Faculty of Medicine, University of Rennes 1, Rennes, France
b Département de Radiologie et Imagerie Médicale, CHU de Rennes, France

Abstract

The representations of findings in clinical radiology are heterogeneous. Motivations for developing a unified representation include the semantic integration of medical reports based on DICOM SR (Digital Imaging and Communications in Medicine Structured Reporting), bibliographic databases in the context of evidence-based medicine, and teaching resources. In this work, we propose a unified representation integrating the representations of findings in the UMLS, the GAMUTS in Radiology and DICOM SR. We analyse the UMLS and the DCMR (DICOM Content Mapping Resource) of DICOM SR to determine how each of them represents findings. We then set up a syntax between the UMLS concepts using DICOM SR relations in order to rewrite the GAMUTS sentences. Translating the whole GAMUTS using the UMLS concepts and the DICOM SR syntax could be a method to create or supplement the DCMR TIDs (Template IDs: template identifiers) and CIDs (Context IDs: context group identifiers) for the description of findings in medical imaging. This method could also give an ontological dimension to the DICOM SR information representation system. The meaning of the CIDs would then be enhanced far beyond the simple use of the SNOMED vocabulary.

Keywords: Diagnosis [Subheading], Radiology, Unified Medical Language System, Medical Informatics

1. Introduction

The representations of findings in clinical radiology are heterogeneous: medical reports (DICOM SR), bibliographic databases (MEDLINE), and teaching resources. A unified representation would make it possible to build a bridge between clinical activity, evidence-based medicine and teaching.
This concern has already been addressed in earlier work (e.g. [1,2]), but a single representation of findings that would fit all these kinds of resources was still missing. Our work could provide the basis needed for the "semantic Web" [3] and for "just-in-time" information in clinical activity [4].
The Unified Medical Language System (UMLS) [5] already unifies the main medical classifications and nomenclatures in its Metathesaurus [6]. The DCMR (DICOM Content Mapping Resource) is used by DICOM SR (Digital Imaging and Communications in Medicine Structured Reporting) [7] to produce standardized imaging reports. We have to consider this representation of medical information as a reference insofar as it is integrated in HL7 v3 (Health Level Seven). This representation of radiological information has already acknowledged the importance of a rigorous terminology, since it largely integrates SNOMED (Systematized NOmenclature of MEDicine).

Descriptions of radiological findings are proposed in books that list ranges of diagnoses (gamuts) based on descriptions of findings [8,9,10]. These books are numerous. Most of them have gone through many editions and some have been translated into several languages [11]. We note that the descriptions of findings remain unchanged (in vocabulary and in structure) across books, authors and editions [12].

The aim of this work is to propose a unified representation integrating the representations of findings in the UMLS, the GAMUTS in Radiology from Reeder and Felson [8] and DICOM SR.

2. Materials and methods

First, we analyse how findings are represented in the UMLS. The UMLS has been developed and maintained by the U.S. National Library of Medicine since 1990. It comprises two major inter-related components: the Metathesaurus, a huge repository of concepts, and the Semantic Network, a limited network of 135 Semantic Types and 54 Relations. The latter is a high-level representation of the biomedical domain based on Semantic Types under which all the Metathesaurus concepts are categorized, and which is intended to provide a basic ontology for the biomedical domain. In order to analyse the representation of findings in the UMLS, we start from the 500 sentences of osteo-articular semiology of the Gamuts 2003.
Some examples are given in Table 1. We use the MetaMap program [13] to discover UMLS Metathesaurus concepts in the Gamuts phrases.

Table 1 - Examples of finding descriptions (Gamuts in Radiology, 2003)

Code      Phrase
D-126     Hypoplastic (spindle-shaped or stubby) terminal phalanges
D-127-1   Acro-osteolysis (erosion or destruction of multiple terminal phalangeal tufts)
D-127-2   Acquired acro-osteolysis confined to one digit
D-127-3   Band-like destruction or erosion of the midportion of a terminal phalanx
D-128     Acro-osteosclerosis (terminal phalangeal sclerosis)
D-129-1   Amputation or absence of a phalanx, digit, hand, or foot, acquired
D-129-2   Amputation or absence of a phalanx, digit, hand, or foot, congenital
D-129-3   Self-mutilation of digits
D-130     Gangrene of a finger or toe
D-131     Lytic lesion(s) in a phalanx (often cyst-like)

Then, we analyse DICOM SR, and particularly Part 16 [14] and Supplement 23 [15], to identify the architecture of findings held in this model. Part 16 of DICOM was created in 2001. It gathers all the coding and terminology elements which appear in DICOM objects (images, structured reports, physiological signals). It is thus considered the DICOM Content Mapping Resource (DCMR). This document contains all the terms, sorted by application domain into context groups (identified by Context IDs, CIDs). Data templates (identified by TIDs) are built from these resources. The structured report is based on the TIDs.
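The relationship between context groups and templates can be sketched in code. The following Python fragment is only an illustration under stated assumptions, not an excerpt of the DCMR: the class names, the check_report helper, the coding scheme "99DEMO", the term codes and the template number 9999 are all hypothetical, and only the number and name of CID 6135 (Chest image quality finding) come from the standard as cited in this paper. It shows a context group as a set of coded terms and a template row that restricts a value to one context group.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class CodedTerm:
    scheme: str    # coding scheme designator (hypothetical here)
    code: str      # code value (hypothetical here)
    meaning: str   # human-readable meaning


@dataclass
class ContextGroup:
    cid: int                                  # Context ID
    name: str
    terms: set = field(default_factory=set)   # allowed CodedTerm values


@dataclass(frozen=True)
class TemplateRow:
    relation: str    # e.g. "CONTAINS"
    concept: str     # name of the item to be filled in
    value_cid: int   # context group the value must be drawn from


@dataclass
class Template:
    tid: int         # Template ID
    name: str
    rows: list


def check_report(template, groups, values):
    """Return the concepts whose value is missing or outside the allowed CID."""
    problems = []
    for row in template.rows:
        value = values.get(row.concept)
        allowed = groups[row.value_cid].terms
        if value is None or value not in allowed:
            problems.append(row.concept)
    return problems


# Hypothetical content: only the CID number and name come from the text.
artefact = CodedTerm("99DEMO", "A-001", "Motion artefact")
quality = ContextGroup(6135, "Chest image quality finding", {artefact})
template = Template(9999, "Demo chest quality template",
                    [TemplateRow("CONTAINS", "Image quality", 6135)])

ok = check_report(template, {6135: quality}, {"Image quality": artefact})
bad = check_report(template, {6135: quality},
                   {"Image quality": CodedTerm("99DEMO", "Z-999", "Unknown")})
print(ok, bad)
```

Here ok is an empty list and bad contains "Image quality", since the second value does not belong to the context group that the template row points to.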
Lastly, we set up a syntax between the UMLS concepts using DICOM SR relations in order to rewrite the GAMUTS sentences.

3. Results

Findings in radiology according to the UMLS

MetaMap makes it possible to discover UMLS Metathesaurus concepts in the Gamuts phrases (500 phrases, 3780 words, 476 different words). It also indicates, for each UMLS concept (1271 different concepts, 1149 different CUIs), the corresponding Semantic Types (Table 2).

Table 2 - MetaMap results for the most frequent Semantic Types: number of words in Gamuts phrases, number of different UMLS concepts, number of different CUIs (Concept Unique Identifiers).

Semantic Type                                   Words  Concepts  CUIs
1. Entity
1.1. Anatomical structure                          26         4     4
1.1.1. Anatomical abnormality                      78        19    15
1.1.1.1. Congenital abnormality                   195        50    42
1.1.1.2. Acquired abnormality                      82        18    16
1.1.2. Fully formed anatomical structure          186         3     1
1.1.2.1. Body part, organ, or organ component     855       118    99
1.1.2.2. Tissue                                   323        24    19
1.2. Temporal concept                             243        33    33
1.3. Qualitative concept                          721        97    91
1.4. Quantitative concept                         370        44    41
1.5. Functional concept                           949       123   117
1.6. Spatial concept                              842       146   137
1.6.1. Body space or junction                     101        15    13
1.6.2. Body location or region                    246        63    49
1.7. Finding                                      232        64    59
1.8. Sign or symptom                              241        93    21
2. Event
2.1. Pathologic function                          297        39    34
2.1.1. Disease or syndrome                        464       100    89
2.1.1.1. Neoplastic process                       174        33    21
2.2. Injury or poisoning                           67        21    17

First, we can identify the part of the Semantic Network that concerns findings. The UMLS Semantic Types most frequently assigned to findings in clinical radiology, and the relations that hold between these Semantic Types, are presented in Figure 1.

Secondly, a large number of Gamuts phrases are combinations of simple UMLS concepts. These composed concepts need a syntax to be represented (Fig. 2). On this point, we note that:

- Quantitative concepts, qualitative concepts and functional concepts can be attributes that make spatial concepts and anatomical structures pathologic.
- Finding, sign or symptom, disease or syndrome, pathologic function and anatomical abnormality are either self-supporting expressions of pathology or, as above, attributes that make spatial concepts and anatomical structures pathologic.

Beyond these syntactical considerations, some terms of the Gamuts cannot be found in the UMLS. The coverage of the UMLS is incomplete, in particular for the description of image characteristics. Thus, the following concepts are missing in the UMLS: radiopaque; high attenuation; low attenuation; hypodense; isodense; hyperdense; hypoechoic; isoechoic; hyperechoic; enhancing; intravenous contrast; homogeneous; heterogenous. Some proper nouns are also missing (e.g. Madelung deformity), as are some metaphoric expressions (e.g. salt and pepper demineralization), some anatomic localizations (e.g. parosteal) and various findings (e.g. cupping).

[Figure 1 - The semantic network representing findings in the UMLS. Nodes: Finding, Sign or symptom, Anatomical abnormality, Pathologic function, Disease or syndrome, Diagnostic procedure. Relation labels: is a, manifestation of, location of, diagnoses.]

[Figure 2 - The semantic network of the Semantic Types listed in Table 2 (Entity and Event hierarchies). In bold, the semantic types which are self-supporting; in normal type, the semantic types of nouns that need an adjective; in italics, the semantic types of adjectives.]

Findings in radiology according to DICOM SR

[Figure 3 - Representation of findings in DICOM SR. The main relations are: 1 - CONTAINS, 2 - HAS PROPERTIES, 3 - HAS CONC MOD.]

In the DCMR, the "Findings" are found in multiple CIDs (Fig. 3). They are, for example, "Previous findings" (former results) and "Clinical findings" (clinical elements) coming within the scope of the patient's history or the reason for the procedure. They can also be the "anatomy findings"
(anatomical elements observed) and the "finding site" (site of the lesion), possibly specified by "topographical modifiers" (for example, the anterior part of the mitral valve). Above all, they are the "lesion findings" (lesion or anomaly), possibly specified by a "finding modifier" (a modifier related to a visual anomaly) which states, for example, that a radiological line is a fluid level. To these must be added the "quality findings" (quality standards), which include artefacts, as in CID 6041 (Mammography image quality finding) or CID 6135 (Chest image quality finding). The "certainty of finding" expresses, as a percentage, the level of certainty of the result. This richness makes it possible to describe radiological lesions with a high level of accuracy. The relations used between CIDs and TIDs to express findings in DICOM SR are: CONTAINS, HAS CONCEPT MOD (has concept modifier), and HAS PROPERTIES.

Rewriting the GAMUTS phrases using the UMLS and DICOM SR

Although the coverage of the UMLS is not complete, it is possible to represent a large number of the GAMUTS phrases by combining the UMLS concepts with Boolean operators, brackets and the DICOM SR relations regarding findings (HAS CONCEPT MOD, HAS PROPERTIES) (Fig. 4).
- D-126 Hypoplastic (spindle-shaped or stubby) terminal phalanges:
  Hypoplastic; Functional Concept; C0543481; HAS PROPERTIES; Terminal phalanx; Body Part, Organ, or Organ Component; C0576464
- D-127-1 Acro-osteolysis (erosion or destruction of multiple terminal phalangeal tufts):
  Acro-Osteolysis; Disease or Syndrome; C0917990 / Terminal phalanx; Body Part, Organ, or Organ Component; C0576464; HAS CONCEPT MOD; Multiple; Quantitative Concept; C0205294; HAS PROPERTIES; Erosion; Acquired Abnormality; C0333307
- D-127-2 Acquired acro-osteolysis confined to one digit:
  Acro-Osteolysis; Disease or Syndrome; C0917990; HAS CONCEPT MOD; Acquired; Temporal Concept; C0439661; HAS PROPERTIES; Digit, NOS; Body Part, Organ, or Organ Component; C0278455; HAS CONCEPT MOD; One; Quantitative Concept; C0205447
- D-127-3 Band-like destruction or erosion of the midportion of a terminal phalanx:
  Erosion; Acquired Abnormality; C0333307; HAS PROPERTIES; Band; Spatial Concept; C0439645; HAS PROPERTIES; Terminal phalanx; Body Part, Organ, or Organ Component; C0576464
- D-128 Acro-osteosclerosis (terminal phalangeal sclerosis):
  Osteosclerosis; Disease or Syndrome; C0029464 / Sclerosis; Finding; C0036429; HAS PROPERTIES; Terminal phalanx; Body Part, Organ, or Organ Component; C0576464
- D-129-1 Amputation or absence of a phalanx, digit, hand, or foot, acquired:
  (Amputation; Therapeutic or Preventive Procedure; C0002688; OR; Absences; Finding; C0424530); HAS CONCEPT MOD; Acquired; Temporal Concept; C0439661; HAS PROPERTIES; (Phalanx, NOS; Body Part, Organ, or Organ Component; C0222682; OR; Digit, NOS; Body Part, Organ, or Organ Component; C0278455; OR; hand <2>; Body Part, Organ, or Organ Component; C0018563; OR; Foot <2>; Body Location or Region; C0016504)
- D-129-2 Amputation or absence of a phalanx, digit, hand, or foot, congenital:
  Congenital is not recognized in the UMLS, but it in fact corresponds to Congenital Abnormality.
- D-129-3 Self-mutilation of digits:
  Self Mutilation; Injury or Poisoning; C0036601; HAS PROPERTIES; Digit, NOS; Body Part, Organ, or Organ Component; C0278455
- D-130 Gangrene of a finger or toe:
  Gangrene; Disease or Syndrome; C0017086; HAS PROPERTIES; (Finger; Body Location or Region; C0016129; OR; Toe; Body Location or Region; C0040357)
- D-131 Lytic lesion(s) in a phalanx (often cyst-like):
  (Lytic lesion; Disease or Syndrome; C0221204; OR; Cyst; Neoplastic Process; C0010709); HAS PROPERTIES; Phalanx, NOS; Body Part, Organ, or Organ Component; C0222682

Figure 4 - GAMUTS phrases translated using the UMLS concepts and the DICOM SR syntax. In italics, the GAMUTS phrases; in bold, the UMLS concepts; in bold upper case, the DICOM relations and Boolean operators; in normal type, the UMLS semantic types and their CUIs (Concept Unique Identifiers).

4. Discussion

In this work, after analysing the structure and content of the UMLS and DICOM SR concerning the representation of findings, we demonstrate that it is possible to represent findings using the UMLS terminology and the DICOM SR syntax.
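As an illustration of this representation, entry D-130 of Figure 4 can be modelled as a small tree. The sketch below is one possible encoding, not part of DICOM SR or the UMLS: the class names Concept, AnyOf and Link and the functions render and cuis are hypothetical names chosen for this example. Concepts carry the name, semantic type and CUI given in Figure 4, bracketed OR groups become AnyOf nodes, and a DICOM SR relation joins two nodes.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Concept:
    name: str            # UMLS concept name
    semantic_type: str   # UMLS semantic type
    cui: str             # Concept Unique Identifier


@dataclass(frozen=True)
class AnyOf:
    """Bracketed alternatives joined by the Boolean operator OR."""
    options: tuple


@dataclass(frozen=True)
class Link:
    """A DICOM SR relation from a source node to a target node."""
    source: "Node"
    relation: str
    target: "Node"


Node = Union[Concept, AnyOf, Link]
RELATIONS = {"HAS PROPERTIES", "HAS CONCEPT MOD"}


def render(node):
    """Serialize a node in the semicolon-separated notation of Figure 4."""
    if isinstance(node, Concept):
        return f"{node.name}; {node.semantic_type}; {node.cui}"
    if isinstance(node, AnyOf):
        return "(" + "; OR; ".join(render(o) for o in node.options) + ")"
    if node.relation not in RELATIONS:
        raise ValueError(f"unknown relation: {node.relation}")
    return f"{render(node.source)}; {node.relation}; {render(node.target)}"


def cuis(node):
    """Collect the CUIs of all concepts in a node, left to right."""
    if isinstance(node, Concept):
        return [node.cui]
    if isinstance(node, AnyOf):
        return [c for o in node.options for c in cuis(o)]
    return cuis(node.source) + cuis(node.target)


# D-130 "Gangrene of a finger or toe", with the values given in Figure 4.
d130 = Link(
    Concept("Gangrene", "Disease or Syndrome", "C0017086"),
    "HAS PROPERTIES",
    AnyOf((
        Concept("Finger", "Body Location or Region", "C0016129"),
        Concept("Toe", "Body Location or Region", "C0040357"),
    )),
)

print(render(d130))
print(cuis(d130))
```

With these definitions, render(d130) reproduces the expression given for D-130 in Figure 4, and cuis(d130) lists its three CUIs, which could serve as keys when looking up resources indexed with UMLS concepts.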
Nevertheless, some essential findings in radiology remain missing in the UMLS, in particular the terms describing image characteristics. From this point of view, the UMLS needs to be supplemented. Our unified representation makes it possible to connect the medical report (DICOM SR) to bibliographic databases (MEDLINE) and to the reference books used by students. We envision potential applications in evidence-based medicine and in information retrieval in the framework of the semantic Web, for example.

5. Conclusion

Translating the whole GAMUTS using the UMLS concepts and the DICOM SR syntax could be a method to create or supplement the DCMR TIDs and CIDs for the description of imaging findings. This method could give an ontological dimension to the DICOM SR information representation system. The meaning of the CIDs would then be enhanced far beyond the simple use of the SNOMED vocabulary.

6. References

[1] Sneiderman CA, Rindflesch TC, Aronson AR. Finding the findings: identification of findings in medical literature using restricted natural language processing. Proc AMIA Annu Fall Symp. 1996:239-43.
[2] Hersh W, Mailhot M, Arnott-Smith C, Lowe H. Selective automated indexing of findings and diagnoses in radiology reports. J Biomed Inform. 2001 Aug;34(4):262-73.
[3] Berners-Lee T, Hendler J, Lassila O. The Semantic Web. Scientific American. May 2001.
[4] Chueh H, Barnett GO. "Just-in-time" clinical information. Acad Med. 1997 Jun;72(6):512-7.
[5] UMLS Knowledge Source Server, version 4.2.3. Available at: http://umlsks.nlm.nih.gov/kss/servlet/turbine/template/admin,user,kss_login.vm
[6] Burgun A, Bodenreider O. Mapping the UMLS Semantic Network into general ontologies. Proc AMIA Symp. 2001:81-5.
[7] Hussein R, Engelmann U, Schroeter A, Meinzer HP. DICOM structured reporting: Part 1. Overview and characteristics. Radiographics. 2004 May-Jun;24(3):891-6.
[8] Reeder MM, Felson B. Gamuts in Radiology: Comprehensive lists of roentgen differential diagnosis. 4th ed. Springer, 2003.
[9] Eisenberg RL. Clinical Imaging: An atlas of differential diagnosis. Aspen Publishers, 1988.
[10] Chapman S, Nakielny R. Aids to radiological diagnosis. 3rd ed. Saunders, 1995.
[11] Chapman S, Nakielny R. Guide du diagnostic différentiel en radiologie. Translated by G. Coche from the 1984 edition. Vigot, 1989.
[12] Reeder MM, Felson B. Gamuts in Radiology: Comprehensive lists of roentgen differential diagnosis. Audiovisual Radiology of Cincinnati, Inc., 1975.
[13] Aronson AR. Effective mapping of biomedical text to the UMLS Metathesaurus: the MetaMap program. Proc AMIA Symp. 2001:17-21.
[14] National Electrical Manufacturers Association. Digital Imaging and Communications in Medicine (DICOM), Part 16: Content Mapping Resource. Rosslyn, VA: NEMA, 2001. Available at: http://medical.nema.org/dicom/2003.html
[15] National Electrical Manufacturers Association. Digital Imaging and Communications in Medicine (DICOM), Supplement 23: Structured Reporting Storage SOP Classes. Rosslyn, VA: NEMA, 2000. Available at: http://medical.nema.org

7. Address for correspondence

Valérie Bertaud: valerie.bertaud@univ-rennes1.fr