Semantic Web (6)

What is an Ontology?

"An explicit specification of a conceptualization" [Tom Gruber 1993]

An abstract model of some domain:
- concepts
- properties and attributes of concepts
- constraints on properties and attributes
- individuals (often, but not always)

An ontology defines:
- a common vocabulary
- a shared understanding

Ontology examples: Lightweight taxonomies

Taxonomy is the practice and science of classification. The word comes from the Greek τάξις ("order") and νόμος ("law" or "science").

- Linnaean taxonomy: a classification of living things (Carl Linnaeus, 1707-1778, "father of modern taxonomy")
- Yahoo! Web Directory (http://dir.yahoo.com/)
- Open Directory Project: 590,000 categories (http://dmoz.org/)
- Amazon product catalog

Ontology examples: Heavyweight

Cyc, an upper ontology for "all of human consensus reality" (the Cyc project started in 1984)
http://www.opencyc.org/
"the world's largest and most complete general knowledge base and commonsense reasoning engine"

Ontology examples: e-science

Open Biomedical Ontologies Consortium:
- GO, http://www.geneontology.org/
- MGED, http://mged.sourceforge.net/ontologies/mgedontology.php
- ...

Used, e.g., for "in silico" investigations relating theory and data

Ontology examples: Medicine

Building/maintaining terminologies such as Snomed CT, NCI, Galen and FMA
(Snomed CT: Systematised Nomenclature of Medicine Clinical Terms)

Used, e.g., for semi-automated annotation of MRI images

Clinicians use different terms that mean the same thing: "heart attack", "myocardial infarction" and "MI" may mean the same thing to a cardiologist, but to a computer they are all different.

Ontology examples: organising complex information

E.g., UN-FAO, NASA, Ordnance Survey, General Motors, Lockheed Martin, ...

What is Ontology Engineering?

Ontology engineering means defining the terms in a domain and the relations among them:
- defining concepts in the domain (classes)
- arranging the concepts in a hierarchy (subclass-superclass)
- defining which attributes and properties classes can have, and the constraints on their values
- defining individuals and filling in attribute/property values

Ontology engineering is transferring knowledge into a computer-accessible form.

Why develop an Ontology?

- to share a common understanding of the structure of information
    - among people
    - among software agents
- to enable reuse of domain knowledge
    - to avoid re-inventing the wheel
    - to introduce standards to allow interoperability
- to make domain assumptions explicit
    - easier to change domain assumptions
    - easier to understand and update legacy data
- to separate domain knowledge from operational knowledge
    - re-use domain and operational knowledge separately

Steps in developing an Ontology

1. Determine domain and scope
    - what is the domain that the ontology will cover?
    - what are we going to use the ontology for?
    - what types of questions should the information in the ontology answer (competency questions)?
2. Informal/semiformal knowledge acquisition
    - collect the terms
    - organise them informally
    - paraphrase and clarify terms to produce informal concept definitions
    - diagram informally
3. Refine requirements and tests

Steps in developing an Ontology (cont.)

4. Implementation
    - paraphrase and comment at each stage before implementing
    - develop a normalised schema
    - implement a prototype, recording the intension as a paraphrase
    - scale up a bit and check performance
5. Evaluation and quality assurance
    - against goals (ontology design is subjective!)
    - include tests for evolution and change management
    - design regression tests and "probes"
6. Maintenance: usage monitoring and evolution
    - compatibility between different versions of the same ontology, and between versions of an ontology and instance data

Process, not product!

Example: Animal ontology

Purpose and scope: to provide an ontology for the index of a children's book of animals, including
- where they live
- what they eat (carnivores, herbivores and omnivores)
- how dangerous they are
- how big they are
- a bit of basic anatomy (number of legs, wings, toes, etc.)

Collect the terms

- what are the terms we need to talk about?
- what are the properties of these terms?
- what do we want to say about the terms?

Card sorting is often the best way:
- write down each concept/idea on a card
- organise the cards into piles
- link the piles together
- do it again, and again (works best in a small group)

Example: Animals and Plants

Terms collected:
  Dog, Carnivore, Dangerous, Cat, Plant, Pet, Cow, Animal, Domestic Animal,
  Person, Draught Animal (a), Farm Animal, Tree, Child, Food Animal, Grass,
  Parent, Fish, Herbivore, Mother, Carp, Male, Father, Goldfish, Female, Pig

(a) used for pulling heavy loads, etc.

Extend the Concepts

Take a group of things and ask what they have in common, and then what other siblings there might be. For example:
- Plant, Animal -> Living Thing (might add Bacteria, Fungi?)
- Cat, Dog, Cow, Person -> Mammal (might add Goat, Rabbit?)
- Cow, Goat, Sheep, Horse -> Ungulate (hoofed animal) (what others are there? do they divide amongst themselves? even/odd-toed?)
- Wild, Domestic -> Domestication (what other states?)

Organise the Concepts

Choose some main axes:
- add abstractions where needed (e.g., Living Thing, Mammal, Fish)
- identify relations (e.g., eats, owns, parent of)
- identify definable things (e.g., Draught Animal, Father, Herbivore),
  i.e., things where you can say clearly what they mean
  (try to define a dog precisely: very difficult, it is a "natural kind")

Self-standing things vs. Modifiers:
- self-standing things can exist on their own (roughly nouns)
  (e.g., people, animals, houses, actions, processes)
- modifiers modify other things (roughly adjectives and adverbs)
  (e.g., wild/domestic, male/female, healthy/sick, dangerous/safe)

Arrange Concepts/Properties into Hierarchy

Reorganise everything but definable things into pure trees; these will be the primitives.

Self-standing:
  LivingThing
    Animal
      Mammal
        Cat, Dog, Cow, Person, Pig
      Fish
        Carp, Goldfish
    Plant
      Tree, Grass

Modifiers:
  Domestication: Domestic, Wild
  Use: Pet, Food, Draught
  Dangerousness: Dangerous, Safe
  Sex: Male, Female
  Age: Adult, Child

Relations:
  eats, owns, parentof, ...

Definable:
  Carnivore, Herbivore, Child, Parent, Mother, Father, FoodAnimal, DraughtAnimal

Defining Classes and a Class Hierarchy

Things to remember:
- there is no single correct class hierarchy
- but there are some guidelines

Question to ask: is each instance of the subclass also an instance of its superclass?

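
The question above can be tried out on sample data before committing to a subclass link. The following is a minimal sketch, assuming each class is modelled simply as the set of its known instances; the individuals (rex, tom, daisy, nemo) and the helper `violates_subclass` are made up for illustration.

```python
# Hypothetical toy data: each class is modelled as the set of its known instances.
instances = {
    "Animal": {"rex", "tom", "daisy", "nemo"},
    "Mammal": {"rex", "tom", "daisy"},
    "Pet": {"rex", "tom", "nemo"},
}

def violates_subclass(sub, sup):
    """Return the instances of `sub` that are NOT instances of `sup`.

    An empty result means the sample data is consistent with
    'sub is a subclass of sup'; it does not prove the link in general.
    """
    return instances[sub] - instances[sup]

# Every known Mammal is an Animal, so Mammal may sit below Animal.
mammal_ok = not violates_subclass("Mammal", "Animal")

# nemo (a goldfish, say) is a Pet but not a Mammal: Pet is not a subclass of Mammal.
pet_counterexamples = violates_subclass("Pet", "Mammal")
```

A single counterexample is enough to reject a proposed subclass link, while a clean result only says that no counterexample has been found yet.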
Defining Classes and a Class Hierarchy (cont.)

- All the siblings in the class hierarchy must be at the same level of generality (compare to sections and subsections in a book)
- If a class has more than a dozen direct subclasses, additional subcategories may be necessary (compare to bullets in a list). However, if no natural classification exists, the long list may be more natural
- Class names should be either all singular or all plural (Animal is not a kind-of Animals)
- Classes represent concepts in the domain, not their names. The class name can change, but it will still refer to the same concept (synonym names for the same concept are not different classes)

Properties

Identify the domain and range constraints for properties
(if anything is used in a special way, add a text comment):
- Animal eats LivingThing
    domain: Animal    range: LivingThing
    (NB: ignore the difference between parts of LivingThings and LivingThings)
- Person owns LivingThing except Person
    domain: Person    range: LivingThing and not Person
- Animal parentof Animal
    domain: Animal    range: Animal

Identify property restrictions: what can we say about all instances of a class? (descriptions of self-standing things)
- all Cows eat some Plants
- all Cats eat some Animals
- all Pigs eat some Animals and eat some Plants
- ...

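
To make the domain and range declarations concrete, here is a minimal sketch that treats them as checks on class-level statements. The class and property names come from the slides; the table layout and the helpers `is_a` and `allowed` are assumptions made for illustration. Note that in OWL, domain and range axioms are used for inference rather than validation, so this is a simplification.

```python
# Parent links of the primitive tree (child -> parent); names follow the slides.
PARENT = {
    "Animal": "LivingThing", "Plant": "LivingThing", "Mammal": "Animal",
    "Person": "Mammal", "Cow": "Mammal", "Dog": "Mammal", "Grass": "Plant",
}

def is_a(cls, ancestor):
    """True if `cls` equals `ancestor` or lies below it in the tree."""
    while cls is not None:
        if cls == ancestor:
            return True
        cls = PARENT.get(cls)  # None once we step past the root
    return False

# property -> (domain, range, class excluded from the range or None)
PROPERTIES = {
    "eats": ("Animal", "LivingThing", None),
    "owns": ("Person", "LivingThing", "Person"),  # LivingThing and not Person
    "parentof": ("Animal", "Animal", None),
}

def allowed(subject, prop, obj):
    """Check a statement 'subject prop obj' against the declared domain and range."""
    domain, range_, excluded = PROPERTIES[prop]
    if not is_a(subject, domain) or not is_a(obj, range_):
        return False
    return excluded is None or not is_a(obj, excluded)
```

For example, `allowed("Person", "owns", "Dog")` holds, while `allowed("Person", "owns", "Person")` is rejected by the "and not Person" part of the range.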
Definable things

Paraphrase and formalise the definitions in terms of the primitives, relations and other definables.
Note any assumptions to be represented elsewhere (add as comments when implementing).

- "A Parent is an Animal that is a parent of some other Animal" (NB: ignore Plants for now)
    Parent = Animal and parentof some Animal
- "A Herbivore is an Animal that eats only Plants" (NB: all Animals eat some LivingThings)
    Herbivore = Animal and eats only Plant
- "An Omnivore is an Animal that eats both Plants and Animals"
    Omnivore = Animal and eats some Plant and eats some Animal

Without a paraphrase, we cannot tell whether we disagree on what you meant to represent or on how you represented it.

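
The difference between "some" and "only" in these definitions can be seen on a small example. This is a sketch under a closed-world assumption (everything an animal eats is listed), which OWL itself does not make; the facts in `EATS` and the helper names are hypothetical.

```python
PLANTS = {"Grass", "Tree"}
ANIMALS = {"Cow", "Cat", "Pig", "Mouse"}

# Hypothetical closed-world facts: everything each animal eats is listed here.
EATS = {
    "Cow": {"Grass"},
    "Cat": {"Mouse"},
    "Pig": {"Grass", "Mouse"},
    "Mouse": {"Grass"},
}

def eats_some(x, kind):
    """'eats some kind': at least one thing x eats belongs to kind."""
    return any(food in kind for food in EATS.get(x, set()))

def eats_only(x, kind):
    """'eats only kind': everything x eats belongs to kind.

    This is vacuously true if x eats nothing, which is why the slide
    notes that all Animals eat some LivingThings.
    """
    return all(food in kind for food in EATS.get(x, set()))

def is_herbivore(x):
    # Herbivore = Animal and eats only Plant
    return x in ANIMALS and eats_only(x, PLANTS)

def is_omnivore(x):
    # Omnivore = Animal and eats some Plant and eats some Animal
    return x in ANIMALS and eats_some(x, PLANTS) and eats_some(x, ANIMALS)

herbivores = sorted(a for a in ANIMALS if is_herbivore(a))
omnivores = sorted(a for a in ANIMALS if is_omnivore(a))
```

Writing the paraphrase next to each function, as in the comments above, makes it easy to spot when the formal definition and the intended meaning drift apart.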
Normalisation and Untangling

- Tree: everything (but the root) has one parent (strict hierarchy)
- Directed Acyclic Graph (DAG): things can have multiple parents (polyhierarchy)

Normalisation:
- separate primitives into disjoint trees
- link the trees with definitions and restrictions
- let the classifier produce the DAG

Trees are easier to manage than DAGs.

[Figure: two diagrams over the classes Animal, Mammal, Herbivore, Carnivore, Omnivore, Cow, Cat, Dog, Person and Pig, contrasting a tree with a DAG]

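
The idea of asserting only a tree and letting a classifier add the extra parents can be sketched as follows. This is a toy stand-in for a description logic reasoner, not how a real classifier works: it assumes each primitive class carries a closed, hypothetical description of what it eats (`DIET`), and it matches those descriptions against the definitions of Herbivore and Omnivore.

```python
# Asserted primitive tree: every class has exactly one parent.
ASSERTED_PARENT = {"Mammal": "Animal", "Cow": "Mammal", "Cat": "Mammal", "Pig": "Mammal"}

# Hypothetical closed descriptions: the kinds of thing each class eats, and nothing else.
DIET = {"Cow": {"Plant"}, "Cat": {"Animal"}, "Pig": {"Plant", "Animal"}}

# Definable classes, written as predicates over a diet description.
DEFINITIONS = {
    "Herbivore": lambda diet: diet <= {"Plant"},           # eats only Plant
    "Omnivore": lambda diet: {"Plant", "Animal"} <= diet,  # eats some Plant and some Animal
}

def classify():
    """Return class -> sorted list of parents (asserted plus inferred)."""
    parents = {cls: [parent] for cls, parent in ASSERTED_PARENT.items()}
    for cls, diet in DIET.items():
        for name, matches in DEFINITIONS.items():
            if matches(diet):
                parents[cls].append(name)  # inferred link turns the tree into a DAG
    return {cls: sorted(ps) for cls, ps in parents.items()}

dag = classify()
```

Here Cow ends up with the two parents Herbivore and Mammal, although only the link to Mammal was asserted by hand. The maintained structure stays a tree; the polyhierarchy is derived.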
Modifiers

- Identify modifiers that have mutually exclusive values (Domestication, Dangerousness, Sex, Age)
  NB: Uses are not mutually exclusive (an animal can be both Draught and Food)
- Extend and complete the lists of values (Dangerousness: Dangerous, Risky, Safe)
- Define a functional property for every such modifier

Modifiers and their values so far:
  Domestication: Domestic, Wild
  Use: Pet, Food, Draught
  Dangerousness: Dangerous, Safe
  Sex: Male, Female
  Age: Adult, Child

There are two ways of specifying values for modifiers:
- value partitions (classes that partition a quality)
- value sets (individuals that enumerate all states of a quality)

Specifying Values: Value Partitions

Example: a parent quality Dangerousness

- Define subqualities for each degree: Dangerous, Risky, Safe
    - all subqualities are disjoint
    - subqualities cover the parent quality, i.e., Dangerousness = Dangerous or Risky or Safe
- Define a functional property hasdangerousness
    - range is the parent quality, i.e., Dangerousness
    - domain must be specified separately

DangerousAnimal = Animal and hasdangerousness some Dangerous

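
A minimal sketch of the value partition pattern, with the subquality classes reduced to names and the property stored as a list of pairs. The individuals (lion, cow, dog) and the helper names are hypothetical; the checks mirror the range, functionality and disjointness conditions listed above.

```python
# Subqualities that partition the parent quality Dangerousness (names from the slide).
SUBQUALITIES = {"Dangerous", "Risky", "Safe"}

# Hypothetical individuals and hasdangerousness assertions (individual, subquality).
ANIMALS = {"lion", "cow", "dog"}
HAS_DANGEROUSNESS = [("lion", "Dangerous"), ("cow", "Safe"), ("dog", "Risky")]

def dangerousness_of(individual):
    """Return the single subquality of an individual, or None if unknown."""
    values = {v for (i, v) in HAS_DANGEROUSNESS if i == individual}
    if not values <= SUBQUALITIES:
        raise ValueError("value outside the range Dangerousness")
    if len(values) > 1:
        # functional property + disjoint subqualities: two values are a contradiction
        raise ValueError("hasdangerousness is functional")
    return next(iter(values), None)

def is_dangerous_animal(individual):
    # DangerousAnimal = Animal and hasdangerousness some Dangerous
    return individual in ANIMALS and dangerousness_of(individual) == "Dangerous"
```

Because the values are classes, a partition can later be refined (for example by splitting Dangerous further) without touching the existing assertions.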
Specifying Values: Value Sets

Example: a parent quality SexValue

- Define individuals for each value: male, female
    - values are different (NOT assumed in OWL)
    - value type is the enumeration of values, i.e., SexValue = { female, male }
- Define a functional property hassex
    - range is the parent quality, i.e., SexValue
    - domain must be specified separately

MaleAnimal = Animal and hassex is male

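
A value set corresponds closely to an enumeration type. The sketch below uses Python's `enum.Enum` for SexValue and a dictionary for the functional property hassex; the individuals (bull, hen) are made up. One difference worth noting: enum members are distinct by construction, whereas in OWL the values must be declared different explicitly (for example with owl:AllDifferent).

```python
from enum import Enum

class SexValue(Enum):
    """Value set: SexValue = { female, male }."""
    FEMALE = "female"
    MALE = "male"

# Hypothetical individuals. A dict maps each key to one value,
# which models hassex being a functional property.
ANIMALS = {"bull", "hen"}
HAS_SEX = {"bull": SexValue.MALE, "hen": SexValue.FEMALE}

def is_male_animal(individual):
    # MaleAnimal = Animal and hassex is male
    return individual in ANIMALS and HAS_SEX.get(individual) is SexValue.MALE

male_animals = sorted(a for a in ANIMALS if is_male_animal(a))
```

Unlike a value partition, the enumeration cannot be subdivided later without changing the set of values itself.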
Issues in Specifying Values

Value Partitions
- can be subdivided and specialised
- fit with the philosophical notion of a quality space
- require interpretation to go into databases as values (in theory, but rarely considered in practice)
- work better with existing classifiers in OWL DL

Value Sets
- cannot be subdivided
- fit with intuition
- more similar to databases: no interpretation
- work less well with existing classifiers

Roles

To keep primitives disjoint:
- need to distinguish the roles things play in different situations from what they are
  (e.g., pet, farm animal, draught animal; professor, student; doctor, nurse, patient)
- often need to distinguish qualifications from roles
  (a person may be qualified as a doctor but playing the role of a patient)

Roles usually summarise relations:
- to play the role of pet is to say that there is somebody for whom the animal is a pet
- to play the role of doctor is to say that there is somebody for whom the person is acting as the doctor, or some situation in which they play that role

But we often do not want to explain the situation or relation completely.

Roles and Untangling

Example: DraughtAnimal, FoodAnimal, PetAnimal

Identify roles:
- draught: cow, horse, dog
- food: cow, horse
- pet: horse, dog

Define subclasses of AnimalUseRole:
- FoodRole
- PetRole
- DraughtRole

DraughtAnimal = Animal and hasrole some DraughtRole

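
The role assignments above can be written down as a small table, which keeps Cow, Horse and Dog in one primitive tree while the use-related classes are derived. This is a sketch only: the table `ROLE_PLAYERS` and the helpers are hypothetical, and kinds of animal stand in for individuals.

```python
# Which kinds of animal play which role (taken from the slide).
ROLE_PLAYERS = {
    "DraughtRole": {"cow", "horse", "dog"},
    "FoodRole": {"cow", "horse"},
    "PetRole": {"horse", "dog"},
}

def has_role(animal, role):
    """'animal hasrole some role' in this toy table."""
    return animal in ROLE_PLAYERS[role]

def roles_of(animal):
    """All roles an animal plays; roles are not mutually exclusive."""
    return sorted(role for role, players in ROLE_PLAYERS.items() if animal in players)

# DraughtAnimal = Animal and hasrole some DraughtRole
draught_animals = sorted(a for a in ROLE_PLAYERS["DraughtRole"])
```

Since an animal can play several roles at once, modelling DraughtAnimal, FoodAnimal and PetAnimal as primitive subclasses of Animal would have forced multiple parents; with roles, the primitives stay disjoint.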
Limiting the Scope

An ontology should not contain all the possible information about the domain:
- no need to specialise or generalise more than the application requires
- no need to include all possible properties of a class

Example: an ontology of biological experiments contains BiologicalOrganism and Experimenter. Is the class Experimenter a subclass of BiologicalOrganism?

Summary: Normalised Ontology Development

- Identify the self-standing primitives (comment any that are not self-evident)
- Separate them into trees (you may have to create some roles or other auxiliary concepts to do so)
- Identify the relations (comment any that are not self-evident)
- Create the descriptions and definitions (provide a paraphrase for each)
- Identify how key items should be classified (define regression tests)
- Use a classifier to form a DAG
- Check if the tests are satisfied

Tutorial: Travel agency ontology

Use the Protégé editor to define a normalised ontology for use by a travel agency covering the following:

  Hotel, restaurant, sports, luxury hotel, bed and breakfast, safari, activity, hiking, spa treatment, sunbathing, sightseeing, accommodation rating (three stars, etc.), campground, surfing.

Build a class hierarchy and indicate which classes in it are primitive and which are definable.
Define the required relations, their properties, domains and ranges, as well as individuals.

Define the following classes:
1. A two-star hotel.
2. A spa resort (i.e., a destination offering a spa treatment).
3. A destination with sport activities but without safari.
4. A destination where all hotels have a three-star rating.
5. A destination with at least three restaurants and at least four hotels.