Clinical Genome Knowledge Base and Linked Data technologies. Aleksandar Milosavljevic

Topics
1. ClinGen Resource project
2. Building the Clinical Genome Knowledge Base
3. Linked Data technologies
4. Using Linked Data technologies to enable pattern discovery in the Clinical Genome Knowledge Base

ClinGen Resource (www.clinicalgenome.org)
- Engage the clinical genomics community in data sharing efforts (sharing of data from genetic testing labs)
- Develop the infrastructure and standards to support curation of knowledge about the role of genes and genetic variants in human diseases
- Incorporate machine-learning approaches to speed the identification of clinically relevant variants

Variants in ClinVar (mostly from diagnostic genetic testing)
- Variant
- Mode of inheritance
- Disease condition
- Ideally: evidence for the assertion as well
[Figure: number of variants in ClinVar over time; y-axis 0 to 180,000; x-axis 4/5/2013 to 4/1/2015. Source: ClinVar]

Evaluating pathogenicity of genetic variants for specific diseases
Evidence items (supplied as Linked Data) and rules (ACMG Guidelines) are the inputs to a rule-based inference engine. Its outputs are an assertion about pathogenicity and an explanation (the rules supported by evidence).
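The flow from evidence items through rules to an assertion plus explanation can be sketched in a few lines. This is a minimal illustration, not the ClinGen engine: the `RULES` table holds only a small, simplified subset of combining criteria modeled on the ACMG guidelines, the strength labels and the `classify` function are assumptions made for the example, and the result is not suitable for clinical use.

```python
from collections import Counter

# Simplified, illustrative subset of combining rules (assumption: the full
# ACMG guidelines contain many more rules, including benign criteria).
# Each rule is (assertion, minimum number of evidence items per strength).
RULES = [
    ("Pathogenic", {"very_strong": 1, "strong": 1}),
    ("Pathogenic", {"strong": 2}),
    ("Likely pathogenic", {"moderate": 3}),
]


def classify(evidence):
    """Return (assertion, explanation) for a list of (code, strength) pairs."""
    counts = Counter(strength for _, strength in evidence)
    for assertion, required in RULES:
        # A rule fires when every required strength level is sufficiently populated.
        if all(counts[strength] >= n for strength, n in required.items()):
            used = [code for code, strength in evidence if strength in required]
            return assertion, {"rule": required, "supporting_evidence": used}
    # No rule fired: fall back to an uncertain assertion with an empty explanation.
    return "Uncertain significance", {"rule": None, "supporting_evidence": []}


# Hypothetical evidence for one variant: three moderate-strength items.
assertion, explanation = classify(
    [("PM1", "moderate"), ("PM2", "moderate"), ("PM5", "moderate")]
)
print(assertion, explanation)
```

Keeping the rules as data rather than code means the explanation can simply report which rule fired and which evidence items satisfied it.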

Evidence code PM1: mutational hotspot
[Figure panels: variant frequency profile in normal population; cancer-predisposing variants]

Collating information about variants from public sources
[Figure panels: BRCA1; APC]

Ontology traversal to collate variants linked to related disease conditions
Gene: TP53
Higher-level terms inferred from the original disease data:
- Reproductive organ cancer
- Integumentary system cancer
- Gastrointestinal system cancer
- Respiratory system cancer
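The traversal described above can be sketched as an upward walk along is-a edges, after which each variant is also filed under every inferred ancestor term. The ontology fragment, the root term "organ system cancer" as used here, and the variant identifiers are all hypothetical toy data; a real implementation would load a disease ontology instead.

```python
# Toy is-a edges (child -> list of parents). Hypothetical fragment for illustration.
IS_A = {
    "ovarian cancer": ["reproductive organ cancer"],
    "skin cancer": ["integumentary system cancer"],
    "colon cancer": ["gastrointestinal system cancer"],
    "lung cancer": ["respiratory system cancer"],
    "reproductive organ cancer": ["organ system cancer"],
    "integumentary system cancer": ["organ system cancer"],
    "gastrointestinal system cancer": ["organ system cancer"],
    "respiratory system cancer": ["organ system cancer"],
}


def ancestors(term, is_a):
    """Collect every term reachable from `term` by following is-a edges upward."""
    seen = set()
    stack = [term]
    while stack:
        for parent in is_a.get(stack.pop(), []):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def collate(variant_to_disease, is_a):
    """Group variants under their original disease term and all inferred ancestors."""
    by_term = {}
    for variant, disease in variant_to_disease.items():
        for term in {disease} | ancestors(disease, is_a):
            by_term.setdefault(term, set()).add(variant)
    return by_term


# Hypothetical variant identifiers annotated with specific disease terms.
variants = {"TP53:variant_A": "ovarian cancer", "TP53:variant_B": "lung cancer"}
collated = collate(variants, IS_A)
print(sorted(collated["organ system cancer"]))
```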

RDF: Resource Description Framework
- Originally: a metadata data model describing web resources
- Web resource = anything identifiable by a URL
- Subject-predicate-object expressions (related to classical entity-property-value)
- Subject = anything identifiable by a URL / URI
- Expression = edge in a Knowledge Graph
- Example: metastatic brain tumor pathway; WikiPathways serializes biological pathway data as RDF
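To make the subject-predicate-object idea concrete, the sketch below stores a few triples as plain Python tuples and answers pattern queries over them. The `http://example.org/` namespace, the predicate names, and the pathway contents are hypothetical and are not taken from WikiPathways; a production system would use an RDF library and a SPARQL endpoint instead.

```python
EX = "http://example.org/"  # hypothetical namespace, for illustration only

# Each triple is one edge in the knowledge graph: (subject, predicate, object).
triples = {
    (EX + "pathway/P1", EX + "hasName", "Metastatic brain tumor pathway"),
    (EX + "pathway/P1", EX + "hasParticipant", EX + "gene/TP53"),
    (EX + "gene/TP53", EX + "hasSymbol", "TP53"),
}


def match(graph, s=None, p=None, o=None):
    """Return triples matching the pattern; None acts as a wildcard."""
    return sorted(
        t
        for t in graph
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    )


# Which resources participate in pathway P1?
participants = [o for _, _, o in match(triples, s=EX + "pathway/P1", p=EX + "hasParticipant")]
print(participants)
```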

RDFS: RDF Schema
- Uses RDF to express a data model (data schema)
- Example: classification of body fluids
- Some RDFS predicates:
  - subClassOf: the subject is a subclass of a class
  - subPropertyOf: the subject is a subproperty of a property
  - domain: the domain of the subject property
  - range: the range of the subject property
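The predicates above license simple inferences, which the following sketch applies until no new triples appear. It covers only subClassOf, domain, and range (subPropertyOf is omitted for brevity), and the `ex:` classes and instances in the body-fluid example are hypothetical.

```python
SUBCLASS, DOMAIN, RANGE, TYPE = "rdfs:subClassOf", "rdfs:domain", "rdfs:range", "rdf:type"

# Hypothetical body-fluid schema plus two instance-level statements.
triples = {
    ("ex:Serum", SUBCLASS, "ex:BloodDerivedFluid"),
    ("ex:BloodDerivedFluid", SUBCLASS, "ex:BodyFluid"),
    ("ex:contains", DOMAIN, "ex:Specimen"),
    ("ex:contains", RANGE, "ex:BodyFluid"),
    ("ex:aliquot1", TYPE, "ex:Serum"),
    ("ex:tube7", "ex:contains", "ex:aliquot1"),
}


def rdfs_closure(graph):
    """Apply subclass, domain, and range inference until a fixed point is reached."""
    inferred = set(graph)
    while True:
        new = set()
        for s, p, o in inferred:
            if p == SUBCLASS:
                # Subclass transitivity: s < o and o < o2 gives s < o2.
                new |= {(s, SUBCLASS, o2) for s2, p2, o2 in inferred if p2 == SUBCLASS and s2 == o}
                # Type propagation: x a s and s < o gives x a o.
                new |= {(x, TYPE, o) for x, p2, c in inferred if p2 == TYPE and c == s}
            elif p == DOMAIN:
                # Any subject that uses property s is an instance of o.
                new |= {(x, TYPE, o) for x, p2, _ in inferred if p2 == s}
            elif p == RANGE:
                # Any object of property s is an instance of o.
                new |= {(y, TYPE, o) for _, p2, y in inferred if p2 == s}
        if new <= inferred:
            return inferred
        inferred |= new


closure = rdfs_closure(triples)
print(sorted(t for t in closure - triples))
```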

OWL: Web Ontology Language
- Uses RDF to describe taxonomies and classification networks
- Defines the structure of knowledge for various domains
- Example: ontology traversal to annotate variants to more general disease categories

Some Linked Data projects
RDF-like graphs:
- Facebook Social Graph (Open Graph API)
- Google Knowledge Graph / Vault
- Wikidata
RDF graphs:
- data.gov
- Bio2RDF
- Bioontologies / BioPortal (RDF, RDFS, OWL)

What is Linked Open Data?

Linked RDF Standards
- 1999-2014: RDF, RDFS, OWL (graph syntax and semantics)
- 2014: JSON-LD, a JSON-based data format for RDF graphs
- Feb 26, 2015: Linked Data Platform 1.0 (HTTP operations)

Linked Data principles:
1. Use URIs as names for things
2. Use HTTP URIs so that people can look up those names
3. When someone looks up a URI, provide useful information, using the standards (RDF*, SPARQL)
4. Include links to other URIs, so that they can discover more things
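As a rough illustration of how JSON-LD carries an RDF graph, the sketch below reads a small JSON-LD document and lists the triples it encodes. The `to_triples` helper is a deliberately simplified assumption that handles only a flat `@context` on a single node; it is not a JSON-LD processor. The `example.org` URIs are hypothetical; `http://schema.org/name` is used only as a familiar property URI.

```python
import json

# A small JSON-LD document: HTTP URIs name the gene and link it to another resource.
doc = json.loads("""
{
  "@context": {
    "name": "http://schema.org/name",
    "associatedWith": {"@id": "http://example.org/associatedWith", "@type": "@id"}
  },
  "@id": "http://example.org/gene/TP53",
  "name": "TP53",
  "associatedWith": "http://example.org/disease/lung-cancer"
}
""")


def to_triples(node):
    """Simplified expansion: map each term to its URI using a flat @context."""
    context = node["@context"]
    subject = node["@id"]
    out = []
    for key, value in node.items():
        if key.startswith("@"):
            continue  # keywords such as @context and @id are not properties
        term = context[key]
        if isinstance(term, dict):
            predicate = term["@id"]
            kind = "iri" if term.get("@type") == "@id" else "literal"
        else:
            predicate, kind = term, "literal"
        out.append((subject, predicate, value, kind))
    return out


jsonld_triples = to_triples(doc)
for triple in jsonld_triples:
    print(triple)
```

The second triple has a URI as its object, which is the "include links to other URIs" principle in practice: a client can look up that URI to discover more.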

Use Case #1: Discovering match patterns
- Neighboring genes in pathway graph
- Neighboring diseases in a disease ontology

Use Case #2: Discovering Recurrent Mutually Exclusive (RME) mutational patterns in cancer

Hallmark phenotypes of cancer are acquired by positive selection (Hanahan and Weinberg, Cell 2000 and 2011)

Recurrent mutually exclusive mutational patterns identified in key glioblastoma pathways
(TCGA Research Network, "Comprehensive genomic characterization defines human glioblastoma genes and core pathways", Nature 455, 1061-1068 (2008))

Aristotle's Zoology: either horns or tusks
[Aristotle] believed that nature was economical and gave no animal too many gifts, observing that no animal possessed both horns and tusks.
Acquired capability of animals: defense (horns or tusks)

Discovering recurrent mutually exclusive patterns
[Figure: tumor samples versus genes altered by mutations (genes 1, 2, 3), illustrating mutual exclusivity]
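One simple way to quantify such a pattern for a candidate gene set is to measure how many tumor samples carry at least one mutation in the set (coverage) and how many of those carry exactly one (exclusivity). This is an illustrative measure under stated assumptions, not the RME algorithm of Miller et al.; the sample and gene names are hypothetical.

```python
def coverage_and_exclusivity(samples, genes):
    """Score a gene set against a mapping of sample -> set of mutated genes.

    coverage    = fraction of samples with at least one mutated gene in the set
    exclusivity = fraction of covered samples with exactly one mutated gene in the set
    """
    genes = set(genes)
    hits = [len(genes & mutated) for mutated in samples.values()]
    covered = sum(1 for h in hits if h >= 1)
    exclusive = sum(1 for h in hits if h == 1)
    coverage = covered / len(samples) if samples else 0.0
    exclusivity = exclusive / covered if covered else 0.0
    return coverage, exclusivity


# Hypothetical mutation calls for five tumor samples.
samples = {
    "T1": {"G1"},
    "T2": {"G2"},
    "T3": {"G3"},
    "T4": {"G1", "G2"},  # two genes hit in the same sample breaks exclusivity
    "T5": set(),
}
coverage, exclusivity = coverage_and_exclusivity(samples, ["G1", "G2", "G3"])
print(coverage, exclusivity)
```

A gene set with both high coverage and high exclusivity is a candidate for a recurrent mutually exclusive pattern.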

Discovering recurrent mutually exclusive patterns (Hanahan and Weinberg, Cell 2000 and 2011)

Can key modules / pathways be discovered based on RME patterns alone? (Miller CA et al. BMC Med Genomics. 4(1):34 2011)

RME algorithm applied to glioblastoma TCGA collection (145 samples)
Rediscovered modules within all 3 pathways in glioblastoma reported by the TCGA Research Network, Nature 455, 1061-1068 (2008). (Miller CA et al. BMC Med Genomics. 4(1):34 2011)

RME algorithm applied to glioblastoma TCGA collection (145 samples)
Role in glioblastoma? Activation by acetylation. (Miller CA et al. BMC Med Genomics. 4(1):34 2011)

Effect of EP300 expression on grade-independent and age-independent survival in glioblastoma (Miller CA et al. BMC Med Genomics. 4(1):34 2011)

Use Case #2: Discovering Recurrent Mutually Exclusive (RME) mutational patterns in cancer
Step 1: Identify variants associated with a specific phenotype (cancer), using disease ontology traversal
Step 2: Find groups of genes that form an RME pattern
Step 3: Examine the connectedness of the genes within pathways/networks, evaluating the pattern of connectedness within networks or pathways
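Step 3 can be sketched as a graph check: do the genes of a candidate RME group form a single connected piece of a pathway network? The version below tests connectivity of the subgraph induced by the gene set using breadth-first search. The edge list and gene names are hypothetical, and looking only at direct edges between group members is a simplifying assumption; other notions of connectedness (for example, allowing intermediate genes) are equally plausible.

```python
from collections import deque


def is_connected(genes, edges):
    """Return True if `genes` form one connected component using only edges among them."""
    genes = set(genes)
    if not genes:
        return True
    # Build an undirected adjacency map restricted to the candidate genes.
    adjacency = {g: set() for g in genes}
    for a, b in edges:
        if a in genes and b in genes:
            adjacency[a].add(b)
            adjacency[b].add(a)
    # Breadth-first search from an arbitrary member of the group.
    start = next(iter(genes))
    seen = {start}
    queue = deque([start])
    while queue:
        for neighbor in adjacency[queue.popleft()]:
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(neighbor)
    return seen == genes


# Hypothetical pathway network with two separate modules.
pathway_edges = [("G1", "G2"), ("G2", "G3"), ("G4", "G5")]
print(is_connected({"G1", "G2", "G3"}, pathway_edges))
print(is_connected({"G1", "G4"}, pathway_edges))
```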

Conclusion
- The Web of hyperlinks (Web 1.0, 2.0) has greatly amplified the impact of the Human Genome Project
- The Web of Linked Data (Web 3.0) will enable creation of a computable Clinical Genome Knowledge Base to inform:
  - Clinical interpretation of genomic variation
  - Pattern discovery leading to new hypotheses

Acknowledgements
Sharon Plon, Andrew R. Jackson, Xin Feng, Ronak Patel, Rajarshi Ghosh