Deriving an Abstraction Network to Support Quality Assurance in OCRe

Similar documents
Auditing SNOMED Relationships Using a Converse Abstraction Network

A Descriptive Delta for Identifying Changes in SNOMED CT

Copyright Warning & Restrictions

Representation of Part-Whole Relationships in SNOMED CT

Formalizing UMLS Relations using Semantic Partitions in the context of task-based Clinical Guidelines Model

Copyright Warning & Restrictions

Semantic Alignment between ICD-11 and SNOMED-CT. By Marcie Wright RHIA, CHDA, CCS

AMIA 2005 Symposium Proceedings Page - 166

Building a Diseases Symptoms Ontology for Medical Diagnosis: An Integrative Approach

AN ONTOLOGICAL APPROACH TO REPRESENTING AND REASONING WITH TEMPORAL CONSTRAINTS IN CLINICAL TRIAL PROTOCOLS

Harvard-MIT Division of Health Sciences and Technology HST.952: Computing for Biomedical Scientists. Data and Knowledge Representation Lecture 6

A Content Model for the ICD-11 Revision

Using an Integrated Ontology and Information Model for Querying and Reasoning about Phenotypes: The Case of Autism

Research Scholar, Department of Computer Science and Engineering Man onmaniam Sundaranar University, Thirunelveli, Tamilnadu, India.

Binding Ontologies and Coding Systems to Electronic Health Records and Message

Data Structures vs. Study Results:

Temporal Knowledge Representation for Scheduling Tasks in Clinical Trial Protocols

A Lexical-Ontological Resource forconsumerheathcare

Defined versus Asserted Classes: Working with the OWL Ontologies. NIF Webinar February 9 th 2010

Design of a Goal Ontology for Medical Decision-Support

A FRAMEWORK FOR CLINICAL DECISION SUPPORT IN INTERNAL MEDICINE A PRELIMINARY VIEW Kopecky D 1, Adlassnig K-P 1

What Is A Knowledge Representation? Lecture 13

An introduction to case finding and outcomes

Clinical Genome Knowledge Base and Linked Data technologies. Aleksandar Milosavljevic

Kaiser Permanente Convergent Medical Terminology (CMT)

Stepwise Knowledge Acquisition in a Fuzzy Knowledge Representation Framework

Role Representation Model Using OWL and SWRL

International Journal of Software and Web Sciences (IJSWS)

Keeping Abreast of Breast Imagers: Radiology Pathology Correlation for the Rest of Us

STIN2103. Knowledge. engineering expert systems. Wan Hussain Wan Ishak. SOC 2079 Ext.: Url:

Basis for Conclusions: ISA 530 (Redrafted), Audit Sampling

HBML: A Representation Language for Quantitative Behavioral Models in the Human Terrain

An Interval-Based Representation of Temporal Knowledge

When and Why to use a Classifier?

A Comparison of Collaborative Filtering Methods for Medication Reconciliation

Models of Information Retrieval

A Method for Analyzing Commonalities in Clinical Trial Target Populations

CIMI Modeling Architecture, Methodology & Style Guide

A Framework Ontology for Computer-based Patient Record Systems

Explanation-Boosted Question Selection in Conversational CBR

Using the CEN/ISO Standard for Categorial Structure to Harmonise the Development of WHO International Terminologies

Extending the GuideLine Implementability Appraisal (GLIA) instrument to identify problems in control flow

Weighted Ontology and Weighted Tree Similarity Algorithm for Diagnosing Diabetes Mellitus

Foundations of AI. 10. Knowledge Representation: Modeling with Logic. Concepts, Actions, Time, & All the Rest

Causal Knowledge Modeling for Traditional Chinese Medicine using OWL 2

EER Assurance Criteria & Assertions IAASB Main Agenda (June 2018)

A Visual Representation of Part-Whole Relationships in BFO-Conformant Ontologies

PO Box 19015, Arlington, TX {ramirez, 5323 Harry Hines Boulevard, Dallas, TX

APPROVAL SHEET. Uncertainty in Semantic Web. Doctor of Philosophy, 2005

Accepted Manuscript. Use of Ontology Structure and Bayesian Models to Aid the Crowdsourcing of ICD-11 Sanctioning Rules

APPLYING ONTOLOGY AND SEMANTIC WEB TECHNOLOGIES TO CLINICAL AND TRANSLATIONAL STUDIES

Recognizing Scenes by Simulating Implied Social Interaction Networks

Case-based reasoning using electronic health records efficiently identifies eligible patients for clinical trials

Hoare Logic and Model Checking. LTL and CTL: a perspective. Learning outcomes. Model Checking Lecture 12: Loose ends

Heiner Oberkampf. DISSERTATION for the degree of Doctor of Natural Sciences (Dr. rer. nat.)

Development of Application Ontology of Lenke s classification of Scoliosis - OBR-Scolio

Semantic Clarification of the Representation of Procedures and Diseases in SNOMED CT

ORC: an Ontology Reasoning Component for Diabetes

Plan Recognition through Goal Graph Analysis

Data driven Ontology Alignment. Nigam Shah

Ontology Development for Type II Diabetes Mellitus Clinical Support System

Including Quantification in Defeasible Reasoning for the Description Logic EL

An OWL Ontology of Norms and Normative Judgements

Design of a Goal Ontology for Medical Decision-Support

Toward a Unified Representation of Findings in Clinical Radiology

Health informatics Digital imaging and communication in medicine (DICOM) including workflow and data management

Knowledge Representation defined. Ontological Engineering. The upper ontology of the world. Knowledge Representation

NIH Public Access Author Manuscript Stud Health Technol Inform. Author manuscript; available in PMC 2010 February 28.

Evaluation of Semantic Web Technologies for Storing Computable Definitions of Electronic Health Records Phenotyping Algorithms

Lecture 2: Foundations of Concept Learning

Effective radiology workflow in a filmless, An Ontology for PACS Integration

Evaluating OWL 2 Reasoners in the context of Clinical Decision Support in Lung Cancer Treatment Selection

Not all NLP is Created Equal:

The Potential of SNOMED CT to Improve Patient Care. Dec 2015

A Study on a Comparison of Diagnostic Domain between SNOMED CT and Korea Standard Terminology of Medicine. Mijung Kim 1* Abstract

Rehabilitation Robotics Ontology on the Cloud

Ontologies for the Study of Neurological Disease

A Simple Pipeline Application for Identifying and Negating SNOMED CT in Free Text

High-level Vision. Bernd Neumann Slides for the course in WS 2004/05. Faculty of Informatics Hamburg University Germany

Using Semantic Web technologies for Clinical Trial Recruitment

Using Scripts to help in Biomedical Text Interpretation

Expert System Profile

Grounding Ontologies in the External World

Chapter 5: Field experimental designs in agriculture

AN ONTOLOGY FOR DIABETES EARLY WARNING SYSTEM

Hierarchical Bayesian Modeling of Individual Differences in Texture Discrimination

Support system for breast cancer treatment

Assessing the Foundations of Conscious Computing: A Bayesian Exercise

Network Analysis of Toxic Chemicals and Symptoms: Implications for Designing First-Responder Systems

An Evolutionary Approach to the Representation of Adverse Events

Technical Specifications

A Description Logics Approach to Clinical Guidelines and Protocols

Learning Deterministic Causal Networks from Observational Data

NIH Public Access Author Manuscript J Biomed Inform. Author manuscript; available in PMC 2013 February 1.

Rare Diseases Nomenclature and classification

Semantics-Powered Healthcare Engineering and Data Analytics

Biomedical resources for text mining

Understanding Users. - cognitive processes. Unit 3

Plan Recognition through Goal Graph Analysis

Transcription:

Deriving an Abstraction Network to Support Quality Assurance in OCRe Christopher Ochs, MS 1, Ankur Agrawal, BE 1, Yehoshua Perl, PhD 1, Michael Halper, PhD 1, Samson W. Tu, MS 3, Simona Carini, MA 2, Ida Sim, MD, PhD 2, Natasha Noy, PhD 3, Mark Musen, MD, PhD 3, James Geller, PhD 1 1 New Jersey Institute of Technology, Newark, NJ; 2 Univ. of California, San Francisco, San Francisco, CA; 3 Stanford University, Stanford, CA Abstract An abstraction network is an auxiliary network of nodes and links that provides a compact, high-level view of an ontology, which is typically large and complex in nature. Such a view lends support to ontology orientation, comprehension, and quality-assurance efforts. A methodology is presented for deriving a kind of abstraction network, called a partial-area taxonomy, for the Ontology of Clinical Research (OCRe), covering the topic of studies with human subjects. OCRe was selected as a representative of ontologies implemented using the Web Ontology Language (OWL). The derivation of the partial-area taxonomy for one specific hierarchy of OCRe, namely, the Entity hierarchy, is described. Utilizing the enhanced visualization of the content and structure of the hierarchy provided by the taxonomy, the Entity hierarchy is audited, and several errors and inconsistencies in OCRe s modeling of its domain are exposed. After appropriate corrections are made to OCRe, a new partial-area taxonomy is derived. The generalizability of the paradigm of the derivation methodology to various families of biomedical ontologies is discussed. Introduction Biomedical ontologies support communication and integration among healthcare information systems. These terminological knowledge bases are typically large and complex, making them difficult to visualize and comprehend. As a result, errors and inconsistencies in their content are nearly unavoidable and are often difficult to detect. Visualization tools like FlexViz [1] help in this regard by providing a view of one concept s contextual neighborhood, including its parents, children, and target concepts of its lateral relationships. Such visualization tools provide micro-level contextual knowledge for concepts, but are not helpful in gaining an understanding of the whole ontology, i.e., getting an idea of its gestalt. To accomplishment that task, other tools are required. One example of such a tool is an abstraction network, a relatively compact, auxiliary collection of nodes and links that summarizes the structure and content of an ontology. Each node in an abstraction network derived from the ontology itself represents a whole subset of classes, unified by a similar structure and/or semantics. The nodes are organized hierarchically in a manner derived from the hierarchical relationships in the original ontology. While abstraction networks have previously been constructed for various terminologies and terminological systems [2], idiosyncrasies in the underlying knowledge models can limit the overall applicability of the methodologies. There is a need to formulate a unified abstraction methodology that can work with an entire family of ontologies exhibiting similar knowledge properties. In this paper, we begin the work toward that goal. In fact, fruitful ground for this work can be found in the extensive collection of important ontologies hosted in the National Center for Biomedical Ontology (NCBO) BioPortal repository [3, 4]. In particular, we present an abstraction methodology for the Ontology of Clinical Research (OCRe), as a representative description-logic-based ontology, modeled using OWL. The abstraction network derived for OCRe is called a partial-area taxonomy. Its usefulness in auditing OCRe and guiding selective remodeling is demonstrated. Indeed, the results of this research have already been incorporated into a newer release of OCRe. The general applicability of the methodology with respect to other ontologies is discussed. Background In previous work, two abstraction networks, the area taxonomy and the partial-area taxonomy, have been formulated and successfully employed in quality-assurance efforts for SNOMED CT hierarchies [5]. The area taxonomy is derived based on the relationship patterns exhibited by the hierarchy s concepts, which are partitioned accordingly into groups called areas. The partial-area taxonomy goes further by systematically dividing areas into

smaller groups of concepts, called partial-areas, based on relationship patterns and hierarchical proximity. The methodology presented in this paper significantly adapts some of the techniques to OCRe (and related ontologies). The Ontology of Clinical Research (OCRe) provides classes and relationships to characterize the different types of human studies in a uniform way [6]. It was developed in the Web Ontology Language (OWL) 1.1 [6] using Protégé 4.1 [6], and focuses on annotating human studies according to their design and analysis. OCRe is organized as a set of modular components with the core modules being study protocol, study design, and statistics. OCRe includes significant information on human investigations, with its Revision 258 consisting of 342 unique classes and 192 different kinds of relationships [7]. In this research, we have accessed OCRe through the National Center for Biomedical Ontologies (NCBO) BioPortal, which provides access to a collection of over 300 ontologies with a uniform framework [2]. OWL [8] is based on description logic (DL) and defines ontologies in an abstract syntax. Ontology authoring tools, such as Protégé [9, 10], help create ontologies in OWL format. Within NCBO s BioPortal [3, 4], many ontologies, including OCRe, are released in this format [11-13]. Overall, an OWL ontology consists of a collection of classes that may be related to one another through a class hierarchy. All classes are subclasses of owl:thing which is the root class and all classes are subclassed by owl:nothing which is the empty class. An important knowledge element used in this paper is the object property, which defines a directed binary relationship between two classes, allowing for their respective instances to be related. As an example from [14], consider this definition of the object property madefromgrape: <owl:objectproperty rdf:id="madefromgrape"> <rdfs:domain rdf:resource="#wine"/> <rdfs:range rdf:resource="#winegrape"/> </owl:objectproperty> This states that madefromgrape has the domain (class) Wine and the range (class) WineGrape. Therefore, for example, LindemansBin65Chardonnay (a type of wine) can be related to ChardonnayGrape (a type of grape) using the object property madefromgrape. OCRe is released in the asserted OWL format. An ontology released in the asserted format consists of only the knowledge explicitly defined by the editor(s) of the ontology. This stands in contrast to the inferred view of an ontology, which is obtained by running a reasoner on the asserted view. More precisely, classifiers take the facts from the asserted view of an ontology and infer new knowledge to obtain the inferred view of the same ontology. The inferred view may contain new hierarchical relationships, different domains for object properties, and other knowledge that was not explicitly expressed in the asserted view. Methods An Abstraction Network for OCRe The derivation of a partial-area taxonomy abstraction network for OCRe requires the reformulation of the taxonomic elements of area and partial-area. Here, these notions are based on object properties and their defined (class) domains. This represents a shift from a reliance on actual relationship occurrences to potential relationship occurrences. Let O be a non-empty set of object properties. The area with respect to O is defined as the set of all classes that are explicitly defined (or are inferred) as being in the exact domains of O s object properties. The object properties collectively are used to name the area. For example, the OCRe class Entity is explicitly asserted as the domain for the object properties has part and part of; therefore, it belongs to the area named {has part, part of}. All of the descendants of Entity are also implicitly within the domain of has part and part of. However, many descendants introduce new object properties of their own and therefore will have different (larger) sets of object properties and reside in different areas. This inheritance and introduction of object properties within OCRe is the basis for defining an area taxonomy. Areas are linked by child-of relationships that abstract the underlying subclass hierarchy. A root within an area is defined as a class whose set of object properties differs from all such sets of its superclasses. An area may have more than one root. A partial-area, which is based on a root within an area, is defined as a collection of classes that share a common set of object properties and a common ancestor class, namely, the root, which introduced the partial-area s new object properties (while the rest of the object properties were inherited from ancestors of the root). Let R be a root of an area A. The set of classes consisting of R and all its descendants in A is called a partial-area and is named after the root. Partial-areas are linked by child-of relationships derived from the

underlying subclass hierarchy in the ontology. In this paper, we focus on OCRe s Entity hierarchy, which is the largest and features a rich set of object properties. As of Version 244, there were 120 distinct classes and 75 unique types of object properties defined within Entity. This hierarchy also contains the important Study class, which is considered the primary element of OCRe. A preliminary step in creating the partial-area taxonomy was to run a reasoner on the ontology. We applied Pellet, provided within Protégé [7]. As the first step, we analyzed where object properties were introduced within the hierarchy. Figure 1 shows an indented subhierarchy of 21 classes from Entity, along with classes that are explicitly defined as the domains for the given object properties. The object properties introduced at a given class are shown in bold brackets next to the class name. As an example, the class Physical entity is defined as being within the domain of two object properties, is element of and plays. In addition, Physical entity has the two object properties has part and part of that are inherited from the Entity class. Altogether, Physical entity has four object properties. Object properties are color-coded according to the total number of properties for this class. For example, all classes with a green-colored object property have three in total. This color coding will be utilized in the following figures. Figure 1. A portion of OCRe s Entity hierarchy in an indented format with object properties introduced. The number after + indicates the inherited properties. Once the inferred hierarchy was established and all the properties were identified, the hierarchy was traversed using a depth first search algorithm and classes that shared the exact same set of properties were grouped together. This second step established the area taxonomy for Entity. Figure 2 shows the grouping of classes for the part of the hierarchy from Figure 1. Areas are represented as colored boxes. Different colors indicate different numbers of object properties. Sets of properties are shown at the top of each colored box. Each such list of properties is the respective area s name. Classes with that set of properties are shown in the box, with descendants of each root shown indented. For example, the class Collection in Figure 2 has the object property set {has part, part of, has element} and a child class Population along with two grandchildren, Enrolled population and Study population. Edges are used to represent child-of links between areas. Child-of links indicate the chain of inheritance involved with a particular set of object properties. The edge from the area containing Organization indicates that this area inherited five properties from the area containing Social institution. The third and final step was to identify the roots of each area and group the descendants of these roots into partialareas. In Figure 2, Planned activity is the ancestor of five classes that all share the same set of object properties: has part, part of, has effective time, has planned component relationship, and occurs in. This group of classes is joined into a single partial-area rooted at Planned activity. Partial-areas are then arranged into the partial-area taxonomy. Child-of links between partial-areas are established, linking each root to the partial-area where its superclass resides. Figure 3 shows the partial-area taxonomy for the subset of classes from Figure 2. In Figure 3, we explicitly identity

the root that names each partial-area. The numbers in parentheses indicate the total numbers of classes of the respective partial-areas. The targets of the child-of s between partial-areas are listed in the boxes after CHILD OF. Figure 2. The grouping of classes from Figure 1 into areas based on each class s set of object properties. Edges are child-of links between areas. Figure 3. Partial-areas derived for each area in Figure 2. The targets of child-of links within partial-areas are indicated after CHILD OF. Figure 4 shows the final diagram representation of the partial-area taxonomy created for the subhierarchy of Entity given in Figure 1. It is a representation that is more compact than the original hierarchy. In Figure 4, we have condensed the 21 classes of Figure 1 into a structure of ten partial-areas residing in nine areas. In the figure, the colored boxes represent areas. The white boxes in an area are the partial-areas. The number of classes for each partial-area is shown. We organized areas into levels based on their number of object properties. Areas containing classes with the fewest object properties are at the top, while those with the most properties are at the bottom. Within the partial-area taxonomy, edges are used to represent the child-of s links between partial-areas. As a graphical simplification, we draw edges only between areas when all of the partial-areas of a given area are children of the same parent area. As the level structure is now explicit, arrow heads may be omitted. For additional clarity, edges may be color coded based on which area the parent partial-area resides in. For example, Biospecimen is childof Physical entity. Because Physical entity is at Level 2 (blue), a blue line is used to connect the two areas that contain these two partial-areas.

Figure 4. Partial-area taxonomy for the subset of classes from OCRe s Entity hierarchy used in Figures 1 3. Auditing Methodology One way to perform quality assurance is to review the partial-area taxonomy to see whether it conforms to the original conception that the designer of the ontology had. For example, do the various partial-areas indeed have the correct sets of object properties? Such a review can be done by an individual who is familiar with the content and structure of the ontology. Another way of utilizing the taxonomy is by identifying any components that display an anomaly vis-à-vis the rest of the ontology. For example, a partial-area that is much larger than all the other partialareas might be considered an anomaly. Another example is a partial-area in which a very large number of object properties are introduced. A further anomaly may relate to exceptions in the number of child-of relationships emanating from a partial-area. If, for example, most partial-areas have just one child-of and only a few have multiple child-of s, the latter constitute an exception to the norm and are recommended for review. It is not necessarily the case that each such anomaly manifests an error, but the anomalous classes are recommended for in-depth review by an auditor or curator of the ontology. Some anomalies are the results of modeling errors that can be discovered during an in-depth review. Results The complete partial-area taxonomy for the Entity hierarchy, shown in Figure 5, was created for Version 244 of OCRe, as hosted on NCBO s BioPortal repository. This version of Entity consists of 120 classes and 75 unique types of object properties. Levels are numbered, with the root area {has part, part of} at Level 0. Lower levels have larger level numbers and also larger numbers of object properties; however, these numbers are usually not equal. The partial-area Physical entity is in the area {has part, part of, is element of, plays} at Level 2. Physical entity has three classes, the root and its two children, Material and Organism (see Figure 2). There are two partial-areas that are child-of the partial-area Physical entity: Person which is a subclass of Organism, and Biospecimen which is a subclass of Material in the original ontology. The corresponding child-of relationships are shown as (blue) lines in Figure 5. In total, the taxonomy has 21 areas organized into nine levels. There are 23 partial-areas in total, because two areas at Level 1 contain two partial-areas. Twelve of the partial-areas consist of just one class. To observe the main focus of the content of the Entity hierarchy, we should concentrate on the larger partial-areas: Entity (14 classes), Study design (13 classes), Outcome analysis specification (34 classes), Planned activity (6 classes), and Study (24 classes). By reviewing the 21 partial-areas and concentrating on the large ones, one can get an orientation to the structure and content of this hierarchy. The two largest partial-areas, Outcome analysis specification and Study, should be considered anomalous. At first, the OCRe curators were surprised to see so many classes included within the former. Upon closer inspection, it was seen that all 33 descendants of Outcome analysis specification describe statistical methods and clearly do not belong under

this class. Furthermore, they do not even belong in the Entity hierarchy. In the revised modeling of OCRe, all 33 were moved to OCRe s Statistical concept hierarchy. Figure 5. Complete partial-area taxonomy for OCRe s Entity hierarchy prior to auditing efforts. Figure 6 shows the taxonomy of the revised Entity hierarchy, which has only 88 classes. It was made available on BioPortal as Version 258 of OCRe. In the revised taxonomy, the partial-area Outcome analysis specification contains just one class on Level 2 (blue) with only four object properties. Outcome analysis specification was removed as the domain of two object properties, has dependent variable and has independent variable, which were formulated as existential restrictions on Outcome analysis specification. Variable specification was added as the domain of these properties inverses. Because of this addition, Variable specification moved down one level to Level 7 in the taxonomy. Another anomaly was encountered at the partial-area Relative time point (one class). It is the only partial-area that has two child-of s emanating from it, one to the partial-area Entity, where it is a subclass of the Time point class, and the other to the partial-area Time interval, where it is a subclass of the root class Time interval. According to the definition provided within the ontology, Relative time point is not a Time interval at all, but a Time point in reference to some other given time point. Furthermore, the subclass to Time interval was not in the asserted view of OCRe, but was instead inferred by the Pellet reasoner. The error was due to an error in the specified domain of the duration object property, leading the reasoner to infer an unintended subsumption relationship. During the review of OCRe, the domain of the property was changed and as a result the second subclass relationship to Time interval is no longer inferred. In Figure 6, Relative time point appears on Level 3 (red), with only five object properties. This is due to the fact that the two object properties has start time and has stop time, originally inherited from Time interval, disappeared since Relative time point is no longer a subclass of Time interval.

Figure 6. Partial-area taxonomy for OCRe s Entity hierarchy, revised after an audit. Another change in the ontology resulted from observations made upon review of the various partial-areas and their sets of object properties. The partial-area Physical entity (two classes) had an irrelevant object property has semantic constraint, which was removed. This partial-area is now on Level 2 (blue) of the taxonomy (Figure 6). Discussion Our long-range goal in a broader research project is to develop methodologies for deriving partial-area taxonomies and similar abstraction networks that are applicable to whole families of ontologies, with members sharing a similar model and similar meta-properties. Examples of candidate ontologies that all share a similar model and have similar properties to OCRe include the Orphanet Ontology of Rare Diseases (OntoOrpha) [15, 16], the Bone Dysplasia Ontology (BDO) [17], and the Sleep Domain Ontology (SDO) [18, 19]. Hence, we hypothesize that they form a family and that the methodology developed in this paper for deriving a partial-area taxonomy is applicable to them and further that such a taxonomy is useful for quality assurance. However, we note that OCRe is somewhat unusual among OWL ontologies in its careful assignment of domains for object properties. Future research may lead to distinguishing among various kinds of OWL ontologies, and to the formulation of several families requiring potentially different methodologies for taxonomy derivation. OCRe served as an initial test case for developing a methodology for deriving taxonomies for medical ontologies based on DL and expressed using OWL. It is an encouraging sign that the derived taxonomy for OCRe has proven helpful in quality assurance and was embraced by the OCRe editorial team for this purpose. Further taxonomy derivation is planned for the other six hierarchies of OCRe for the purpose of auditing. With regards to the methodology used in this paper, we note that the intermediate stages, reflected by Figures 1, 2, and 3, are only shown for expository purposes. The methodology is algorithmic and can automatically generate the partial-area taxonomy (as has been done previously for SNOMED CT). The purpose of highlighting these stages and

the transitions between them was to help the reader develop an understanding of the process. In other words, we provided the figures for the intermediate stages to demystify the black box derivation. A human auditor will only see the final results, corresponding to Figure 5 before the auditing and Figure 6 afterward. The remodeling of OCRe following the auditing that was facilitated by the taxonomy of Figure 5 has not yet been completed. In addition to the changes reflected in Figure 6 and presented in the previous section, there is some additional remodeling work underway for the partial-area Study (bottom of Figure 5 on Level 8). This partial-area has 24 object properties, a large increase versus the nine object properties at Level 7 in the taxonomy. It is surprising that besides part of and has part all 22 other object properties are introduced at the same class Study. One could envision some of the object properties being defined at children or grandchildren of Study. For example, the object properties has recruitment status or has biospecimen collected may not be relevant for all 24 classes of this partialarea and should be introduced at appropriate descendants. This introduction would partition the large partial-area Study into several smaller ones, likely improving the presentation of OCRe to users. The editorial team of OCRe is currently using this feedback to re-examine the modeling of these classes, which are critical to the purpose of OCRe. Some initial changes are reflected in Figure 6, where Study has grown from 24 to 26 classes, compared to Figure 5, reflecting a finer distinction between classes. Conclusions An abstraction-network derivation methodology has been introduced and used to obtain a partial-area taxonomy for OCRe, an OWL ontology, focusing on human studies. This taxonomy was shown to support quality-assurance efforts and led to an improved version of the Entity hierarchy of OCRe, which was reduced from 120 classes to 88 classes. The corrections suggested by the study have already been incorporated by the OCRe editorial team into Version 258 of OCRe, hosted in the NCBO BioPortal, and more corrections are expected in the future. The derivation methodology is applicable to three more ontologies in the NCBO BioPortal. Further work is planned in studying the generalizability of the paradigm for taxonomy derivation methodologies for various families of ontologies sharing similar models and properties. References 1. The CHISEL Group, University of Victoria. [cited 2012 January 24]; Available from: http://www.thechiselgroup.org/flexviz. 2. Min H, Perl Y, Chen Y, Halper M, Geller J, Wang Y. Auditing as part of the terminology design life cycle. J Am Med Inform Assoc. 2006;13(6):676-90. Epub 2006/08/25. 3. BioPortal. [cited 2012 January 30]; Available from: http://bioportal.bioontology.org/. 4. Musen MA, Noy NF, Shah NH, Whetzel PL, Chute CG, Story MA, et al. The national center for biomedical ontology. J Am Med Inform Assoc. 2012;19(2):190-5. 5. Wang Y, Halper M, Min H, Perl Y, Chen Y, Spackman KA. Structural methodologies for auditing SNOMED. J Biomed Inform. 2007;40(5):561-81. Epub 2007/02/06. 6. Tu S, Carini S, Rector A, Maccalum P, Toujilov I, Harris S, et al. OCRe: An Ontology of Clinical Research. 11th International Protege Conference2009. 7. OCRe in BioPortal. BioPortal [updated February 21, 2012; cited 2012 February 22]; Available from: http://bioportal.bioontology.org/ontologies/45657. 8. OWL Web Ontology Language Overview. [cited 2012 February 23]; Available from: http://www.w3.org/tr/owl-features. 9. Noy NF, Crubezy M, Fergerson RW, Knublauch H, Tu SW, Vendetti J, et al. Protege-2000: an open-source ontology-development and knowledge-acquisition environment. AMIA Annu Symp Proc. 2003:953. Epub 2004/01/20. 10. Protégé. [cited 2012 February 23]; Available from: http://protege.stanford.edu/. 11. Bertaud-Gounot V, Duvauferrier R, Burgun A. Ontology and medical diagnosis. Inform Health Soc Care. 2012;37(1):22-32. 12. Bright TJ, Yoko Furuya E, Kuperman GJ, Cimino JJ, Bakken S. Development and evaluation of an ontology for guiding appropriate antibiotic prescribing. J Biomed Inform. 2012;45(1):120-8. 13. Hardiker NR, Kim TY, Coenen AM, Jansen KR. A dynamic classification approach for nursing. AMIA Annu Symp Proc. 2011:543-8.

14. OWL Web Ontology Language Guide. [cited 2012 March 14]; Available from: http://www.w3.org/tr/owlguide. 15. Dhombres F, Vandenbussche P-Y, Rath A, Hanauer M, Orly A, Urbero B, et al. OntoOrpha : an ontology to support the editing and audit of rare diseases knowledge in Orphanet. International Conference of Biomedical Ontology2011. 16. Orphanet Ontology of Rare Diseases [cited 2012 March 4]; Available from: http://bioportal.bioontology.org/ontologies/1586. 17. Bone Dysplasia Ontology. [cited 2012 March 4]; Available from: http://bioportal.bioontology.org/ontologies/1613. 18. Arabandi S, Ogbuji C, Redline S, Chervin R, Boero J, Benca R, et al. Developing a Sleep Domain Ontology. AMIA Clinical Research Informatics Summit; San FranciscoMar 12-13, 2010. 19. Sleep Domain Ontology. [cited 2012 March 4]; Available from: http://bioportal.bioontology.org/ontologies/1651.