TIES Cancer Research Network Y2 Face to Face Meeting U24 CA October 29th, 2014 University of Pennsylvania

TIES Cancer Research Network Y2 Face to Face Meeting U24 CA 180921
Session IV: The Future of TIES
October 29th, 2014, University of Pennsylvania

Afternoon
Other Uses of TIES/Future of TIES
    12:45-1:15 TIES Radiology at University of Pittsburgh (Legowski and Crowley-Jacobson)
    1:15-1:30 Cancer Deep Phenotyping (Mitchell and Crowley-Jacobson)
    1:30-2:00 EMR corpus Deep Phenotyping (Feldman)
Year 2 Development Projects
    2:00-2:30 Whole Slide Imaging inside TIES (Tseytlin)
    2:30-3:00 Paraffin Archive (Chavan)
    3:00-3:30 Prioritization of current feature requests (Chavan)
Wrap Up
    3:30-4:00 Y2 Project Plan, Action Items and Wrap Up (Crowley-Jacobson)

Data in isolation
Isolated laboratory information systems
Isolated radiology information systems
Text is usually last on the list for enterprise data warehousing efforts
Ability to access data, tissue and images is limited to those who have clinical access (HIPAA)

Previous Work
Significant previous work in developing NLP systems for processing clinical text (MedLEE, cTAKES)
At least two preceding systems have brought together radiology and pathology data using NLP:
    RadBank/RadTF (Rubin et al.)
    Presto/Montage (Langlotz et al.)
Realizing the potential for enhancing research and QI will probably require more sophisticated information extraction methods

Montage
Combined Radiology / Pathology timeline
Initial CT shows mass lesion in this patient evaluated for aphasia

Montage
Final pathology result

RadTF
Natural language report query (ontologically assisted)
Linkage to images in PACS
Do B, Wu A, Biswal S, Kamaya A, Rubin DL. Radiographics. 2010 Nov;30(7):2039-48

TIES Radiology
TIES Radiology was deployed at University of Pittsburgh in January 2014.
Currently contains over 19 million de-identified radiology reports across all UPMC hospitals, from 2003 to present
Fully integrated with Radiology HB System. HBs use TIES to collect an accession list, and then provide de-identified images from the PACS system to investigators
Approval for Radiology and Pathology done by separate groups; governance model
Prior to deployment, extensive QA was conducted to ensure accuracy of report coding and search results.

Key differences for Radiology
Need to use imaging exam type derived from source system metadata.
    Probably worse at UPMC than any of your institutions
    A large number of vendor systems were involved
More variation in section labeling required up-front work to map them.
    How much will this generalize?
Different vocabulary and semantic types used
    Required a fair amount of trial and error and expert curation
    WSD becomes more and more important as you add new domains

Vocabulary Building
Steps: Select and prioritize vocabularies; select semantic type filters; acronyms and stop words
Source vocabularies: UMLS or NCIM
Radiology: 12 vocabularies, 45 semantic types
Pathology: 14 vocabularies, 51 semantic types
Prioritized vocabularies (first list shown): 1. RADLEX 2. NCI 3. FMA 4. SNOMEDCT 5. ICD10PCS 6. MSH 7. OMIM 8. ICD10CM
Prioritized vocabularies (second list shown): 1. NCI 2. FMA 3. SNOMEDCT 4. RADLEX 5. CBO 6. ICD10PCS 7. MSH 8. OMIM
Semantic types (first list shown): Acquired Abnormality; Biomedical or Dental Material; Element, Ion, or Isotope; Anatomical Abnormality; Indicator, Reagent, or Diagnostic Aid; Anatomical Structure; Bacterium; Medical Device; Biologic Function; Body Part, Organ, or Organ Component; Body Space or Junction
Semantic types (second list shown): Acquired Abnormality; Cell; Anatomical Abnormality; Cell Component; Anatomical Structure; Cell Function; Bacterium; Cell or Molecular Dysfunction; Biologic Function; Gene or Genome; Body Part, Organ, or Organ Component; Molecular Biology Research Technique; Body Space or Junction
Word sense disambiguation example: "CT". Would that be Computerized Tomography? Chest tube? Cardiothoracic? Clotting Time? Connecticut?
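The prioritization and filtering steps on this slide can be sketched in a few lines. This is a minimal illustration, not the TIES implementation: the `select_concept` helper, the candidate senses for "CT", and their vocabulary and semantic type assignments are all hypothetical. Only the vocabulary order and the semantic type names are taken from the slide.

```python
# Hypothetical sketch: pick one concept for a term using a prioritized
# vocabulary list and a semantic type filter. Not the TIES implementation.

# Vocabulary order from the first list on the slide (highest priority first).
VOCAB_PRIORITY = ["RADLEX", "NCI", "FMA", "SNOMEDCT",
                  "ICD10PCS", "MSH", "OMIM", "ICD10CM"]

# A subset of the semantic types named on the slide.
ALLOWED_SEMANTIC_TYPES = {
    "Acquired Abnormality",
    "Anatomical Abnormality",
    "Anatomical Structure",
    "Medical Device",
    "Body Part, Organ, or Organ Component",
}


def select_concept(candidates):
    """Return the candidate from the highest-priority vocabulary whose
    semantic type passes the filter, or None if nothing qualifies."""
    allowed = [
        c for c in candidates
        if c["semantic_type"] in ALLOWED_SEMANTIC_TYPES
        and c["vocab"] in VOCAB_PRIORITY
    ]
    if not allowed:
        return None
    return min(allowed, key=lambda c: VOCAB_PRIORITY.index(c["vocab"]))


# Made-up candidate senses for the ambiguous token "CT" (illustration only).
ct_candidates = [
    {"name": "Connecticut", "vocab": "MSH", "semantic_type": "Geographic Area"},
    {"name": "Chest tube", "vocab": "SNOMEDCT", "semantic_type": "Medical Device"},
    {"name": "Clotting time", "vocab": "NCI", "semantic_type": "Laboratory Procedure"},
]

chosen = select_concept(ct_candidates)
```

Filtering only narrows the candidate set. When several senses survive the filter, the vocabulary order decides, which is one reason WSD gets harder as new domains are added.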

RADIOLOGY DEMONSTRATION

Comparison against existing methods
Before TIES was deployed in Radiology there were questions about adequacy:
    Would we be able to match up patients?
    How would the system compare against the current method, in which experts search our MARS data repository, which includes free text?
Used an iterative QA approach with queries from radiology leadership, based on previous studies that they had done
As such, these were hard queries where a MARS expert (with more than 20 years of experience) had already done extensive work to identify the best possible query

Radiology QA
Several QA queries were conducted comparing results from TIES searches with results from searches conducted in MARS.
MARS is the database center at UPMC from which TIES reports come.
MARS is searched using text terms. Searches can be restricted to certain report types, particular header sections, etc.
Results from MARS were treated as the gold standard.

Radiology QA Process
TIES and MARS queries were constructed to be as analogous as possible.
For each query, we first verified that all MARS results existed in the TIES database.
Reports were scored as true positive (TP) or false positive (FP).
Liz Legowski did the initial scoring, which was validated by a radiologist.
Precision and recall (in comparison to MARS) were computed.

QA Query #1
Query: Hepatocellular carcinoma found on abdominal/pelvis MRIs with contrast
System   # of Distinct Reports   TP    FP    Relative Recall (measured against MARS)   Precision
TIES     361                     219   142   0.94                                      0.61
MARS     397                     233   164   N/A                                       0.59
[Chart: TP reports and FP reports]

QA Query #1 (Cont'd)
TP reports not returned by TIES (TIES FN): All 15 TP reports were missed due to negation (wordings such as "not typical for" or "hepatocellular carcinoma is considered unlikely"). Since HCC was not 100% ruled out, the reports were considered TP. This may have been too stringent.
FP reports returned only by TIES: Reports were returned due to hepatocellular carcinoma appearing in the clinical history (section searching was not enabled at the time this query was conducted).
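The distinction drawn on this slide, between a finding that is ruled out and one that is only considered unlikely, can be illustrated with a small trigger-based check. This is a sketch in the spirit of rule-based negation detection, not the method TIES uses: the trigger lists and the `assertion_status` function are hypothetical, and the check works on whole sentences without modeling trigger scope.

```python
import re

# Hypothetical trigger lists, for illustration only.
DEFINITE_NEGATION = [r"\bno evidence of\b", r"\bnegative for\b", r"\bwithout\b"]
UNCERTAIN = [r"\bnot typical for\b", r"\bconsidered unlikely\b",
             r"\bcannot be excluded\b"]


def assertion_status(sentence, concept):
    """Classify a concept mention as 'present', 'uncertain', or 'absent'.

    Returns None when the concept is not mentioned in the sentence.
    Uncertainty triggers are checked first so hedged wording is not
    collapsed into plain negation.
    """
    text = sentence.lower()
    if concept.lower() not in text:
        return None
    if any(re.search(p, text) for p in UNCERTAIN):
        return "uncertain"
    if any(re.search(p, text) for p in DEFINITE_NEGATION):
        return "absent"
    return "present"


hcc = "hepatocellular carcinoma"
status = assertion_status("Findings are not typical for hepatocellular carcinoma.", hcc)
```

A search that excludes only "absent" mentions would keep hedged reports like the ones scored as TP in this query, at the cost of returning more reports for review.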

QA Query #2
Query: Pulmonary embolism found on chest/thorax CT
Due to the large number of reports, only discrepant reports (reports returned by only one system) were scored
Final Results:
System   # of Distinct Reports                   TP   FP    Relative Recall (measured against MARS)   Precision
TIES     187                                     52   135   0.79                                      0.28
MARS     12200 (random sample of 115 scored)     14   101   N/A                                       0.12

QA Query #2 (Cont'd)
All reports returned only by TIES contained wording for pulmonary embolism that did not exactly match the MARS query (e.g. pulmonary thromboembolism)
128 of the 135 TIES-only FP reports were returned due to missed negation. The remaining 7 TIES-only FP reports stated the study was adequate to evaluate for pulmonary thromboembolism, but no PE was found.
TP reports not returned by TIES (TIES FN): All 14 reports had PE wordings that were used in the MARS query but are not synonyms of the concepts used in TIES (e.g. pulmonary embolus)
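The reported figures for both QA queries can be recomputed from the counts on the slides. The sketch below assumes precision is TP / (TP + FP) and that relative recall is TP / (TP + FN), where FN is the number of TP reports found through MARS but not returned by TIES (15 for Query #1, 14 for Query #2). The relative recall formula is an assumption; the slides do not state it. The function and variable names are illustrative.

```python
def precision(tp, fp):
    """Fraction of returned reports that were scored true positive."""
    return tp / (tp + fp)


def relative_recall(tp, fn):
    """Recall relative to MARS (assumed formula): TP / (TP + FN), where FN
    counts TP reports found via MARS that TIES did not return."""
    return tp / (tp + fn)


# Counts copied from the QA Query #1 and #2 slides.
queries = {
    "Query #1 (HCC on abdominal/pelvis MRI)": {
        "ties": {"tp": 219, "fp": 142}, "mars": {"tp": 233, "fp": 164}, "ties_fn": 15,
    },
    "Query #2 (PE on chest/thorax CT)": {
        "ties": {"tp": 52, "fp": 135}, "mars": {"tp": 14, "fp": 101}, "ties_fn": 14,
    },
}


def summarize(q):
    """Return rounded metrics for one query."""
    return {
        "ties_precision": round(precision(q["ties"]["tp"], q["ties"]["fp"]), 2),
        "mars_precision": round(precision(q["mars"]["tp"], q["mars"]["fp"]), 2),
        "ties_relative_recall": round(relative_recall(q["ties"]["tp"], q["ties_fn"]), 2),
    }


for name, q in queries.items():
    print(name, summarize(q))
```

Under these assumptions the rounded values agree with the tables. For Query #2 the MARS precision is computed on the scored random sample of 115 reports, not on all 12200.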

Challenges
Coding on the scale required to create such massive text repositories
    Schema changes to speed up database operations
    JMS for coordination, enabling an arbitrary and dynamic number of coding machines
    Further modularized system and parallelization of tasks (e.g. database operations and coding happen simultaneously)
Differences in pathology and radiology sublanguages (for example nuances of uncertainty), level of maturity of controlled vocabulary, increased complications of WSD
Architectures to support
    Further information extraction
    Data mining (potentially in combination with structured data)
    QI programs
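The coordination pattern mentioned here, a shared queue feeding any number of coding workers, can be sketched in process with Python's standard library. This is only a stand-in for the idea: TIES uses JMS across machines, whereas the sketch uses threads and `queue.Queue` within one process, and `code_report` is a hypothetical placeholder for the NLP coding step.

```python
import queue
import threading


def code_report(report_id):
    """Hypothetical placeholder for the NLP coding of one report."""
    return f"coded:{report_id}"


def run_coders(report_ids, n_workers):
    """Distribute reports to n_workers via a shared queue and collect results."""
    tasks = queue.Queue()
    for rid in report_ids:
        tasks.put(rid)

    results = {}
    lock = threading.Lock()

    def worker():
        # Each worker pulls the next available report until the queue is empty.
        while True:
            try:
                rid = tasks.get_nowait()
            except queue.Empty:
                return
            coded = code_report(rid)
            with lock:
                results[rid] = coded

    threads = [threading.Thread(target=worker) for _ in range(n_workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results


coded = run_coders(range(100), n_workers=4)
```

Because workers pull from the queue rather than being assigned fixed batches, the number of workers can change without repartitioning the work, which is the property the slide attributes to the JMS setup.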

Quality Measures
28 ACR performance measures
5 of which relate to mammography, including one requiring entry into a separate database

Foundation for QI
Opportunity for Pathology/Radiology correlation, analysis and feedback beyond what we could easily do previously
Currently working on BIRADS extraction and correlation with pathology
    630K reports of various types (mammogram, Breast MRI, Breast US) with BIRADS term in Pitt TIES system
    Represents 385K patients
    Of these, 101K patients have pathology reports
    And 16,098 of those patients have pathology reports within 1 month of BIRADS classification including the term "breast"
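The last cohort step, finding pathology reports within 1 month of a BIRADS classification that include the term "breast", amounts to a per-patient date window join. The sketch below is illustrative only: the record layout and the `link_reports` function are hypothetical, "1 month" is approximated as 30 days, and the window is applied in both directions because the slide does not say which report must come first.

```python
from datetime import date, timedelta

WINDOW = timedelta(days=30)  # assumption: "1 month" approximated as 30 days


def link_reports(imaging, pathology, window=WINDOW, term="breast"):
    """Pair imaging reports with same-patient pathology reports that mention
    `term` and are dated within `window` of the imaging report."""
    by_patient = {}
    for p in pathology:
        if term in p["text"].lower():
            by_patient.setdefault(p["patient_id"], []).append(p)

    links = []
    for img in imaging:
        for p in by_patient.get(img["patient_id"], []):
            if abs(p["date"] - img["date"]) <= window:
                links.append((img["report_id"], p["report_id"]))
    return links


# Made-up records for illustration.
imaging = [
    {"report_id": "R1", "patient_id": "P1", "date": date(2014, 3, 1)},
    {"report_id": "R2", "patient_id": "P2", "date": date(2014, 3, 1)},
]
pathology = [
    {"report_id": "S1", "patient_id": "P1", "date": date(2014, 3, 20),
     "text": "Breast, left, core biopsy"},
    {"report_id": "S2", "patient_id": "P1", "date": date(2014, 6, 1),
     "text": "Breast, left, excision"},
    {"report_id": "S3", "patient_id": "P2", "date": date(2014, 3, 5),
     "text": "Skin, shave biopsy"},
]

links = link_reports(imaging, pathology)
```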

BIRADS Extraction
Existing regular expression produced very high accuracy results in a recent publication
Approach needs work, especially due to addendums, which are plentiful
First evaluation of the published code on our corpus, with error analysis, is underway
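To show why addendums complicate a regular expression approach, here is a small sketch. The pattern is hypothetical and is not the published regular expression referred to on this slide. It accepts "BIRADS" or "BI-RADS", an optional word such as "category", and a category from 0 to 6 with an optional a/b/c subcategory.

```python
import re

# Hypothetical pattern, for illustration only (not the published expression).
BIRADS_RE = re.compile(
    r"\bBI-?RADS\b(?:\s+(?:category|assessment|score))?\s*[:\-]?\s*([0-6])([abc])?\b",
    re.IGNORECASE,
)


def extract_birads(text):
    """Return every BI-RADS category mentioned in the text, in order."""
    return [
        (m.group(1) + (m.group(2) or "")).upper()
        for m in BIRADS_RE.finditer(text)
    ]


report = (
    "BI-RADS Category 4: Suspicious abnormality. "
    "ADDENDUM: BIRADS: 4a after review of prior films."
)
categories = extract_birads(report)
```

A report with an addendum can yield more than one category, so whatever consumes this list still has to decide which mention is the final assessment.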

Breast Imaging and Pathology
Inclusion of data in data warehousing efforts may be particularly important in decreasing unnecessary testing
    Quality metrics and dashboards
    Audit and feedback efforts
    Comparison of mammography reading to other imaging tests
    Comparison of imaging tests to pathology results
    Ability for clinicians to create their own reports
Prototyping an interactive system with a group of design students from University of Pittsburgh School of Information Science

Questions for Discussion
Interest in using Radiology in your local institutions (even if it is not used for TCRN)?
What kinds of quality metrics would you be interested in?
Does this dovetail with other efforts ongoing at your institutions? How?

Team and Funding
University of Pittsburgh: Rebecca Crowley-Jacobson (MPI), Harry Hochheiser, Roger Day, Adrian Lee, Robert Edwards, John Kirkwood, Kevin Mitchell, Eugene Tseytlin, Girish Chavan, Liz Legowski
Boston Children's Hospital/Harvard Medical School: Guergana Savova (MPI), Dmitriy Dligach, Sameer Pradhan, Timothy Miller, Sean Finan, David Harris, Pei Chen
NCI 1U24CA184407-01; another NCIP ITCR grant
Funding period 2014-2019; NCI PO is Kim Jessup

Specific Aims - Methods
Specific Aim 1: Develop methods for extracting phenotypic profiles. Extract patient's deep phenotypes, and their attributes such as general modifiers (negation, uncertainty, subject) and cancer specific characteristics (e.g. grade, invasion, lymph node involvement, metastasis, size, stage)
Specific Aim 2: Extract gene/protein mentions and clinically significant molecular information from the clinical narrative
Specific Aim 3: Create longitudinal representation of disease process and its resolution. Link phenotypes, treatments and outcomes in temporal associations to create a longitudinal abstraction of the disease
Specific Aim 4: Extract discourses containing explanations, speculations, and hypotheses, to support explorations of causality

Specific Aims - Design and Dissemination
Specific Aim 5: Design and implement a computational platform for deep phenotype discovery and analytics for translational investigators, including integrative visual analytics.
Specific Aim 6: Advance translational research in driving cancer biology research projects in breast cancer, ovarian cancer, and melanoma. Include research community throughout the design of the platform and its evaluation. Disseminate freely available software.

Use Cases and Scientific Experts
Melanoma (John Kirkwood)
Breast Cancer (Adrian Lee)
Ovarian Cancer (Robert Edwards)

Software Dissemination
Apache cTAKES - ctakes.apache.org
TIES software - http://ties.pitt.edu/

Combining Structured and Unstructured Data
Using clinical element models (Intermountain Health) as templates for information extraction
Benefits us in several ways, including the potential to merge structured and unstructured data
Models will be agnostic in the sense that they simply represent the kind of information that translational researchers want to use
Can be populated through NLP but also from structured data sources
Currently investigating i2b2 versus tranSMART as warehouse for data

Information modeling and template creation
Instance level, document level and phenotype level annotations
Methods for aggregating from instance level to document level and from document level to phenotype level
These methods probably apply equally well to structured data as unstructured data
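The two aggregation steps named on this slide can be sketched as follows. The rules used here are assumptions chosen for illustration, not the project's methods: the document-level value is the most common non-negated instance value for an attribute, and the phenotype-level value is taken from the most recent document that states the attribute. The record layout and function names are hypothetical.

```python
from collections import Counter


def document_level(instances):
    """Collapse instance-level mentions into one value per attribute,
    using the most common non-negated value (assumed rule)."""
    counts = {}
    for inst in instances:
        if inst["negated"]:
            continue
        counts.setdefault(inst["attribute"], Counter())[inst["value"]] += 1
    return {attr: c.most_common(1)[0][0] for attr, c in counts.items()}


def phenotype_level(documents):
    """Merge document-level summaries in date order so that later documents
    override earlier ones (assumed rule). Dates are ISO strings."""
    summary = {}
    for doc in sorted(documents, key=lambda d: d["date"]):
        summary.update(document_level(doc["instances"]))
    return summary


# Made-up annotations for illustration.
docs = [
    {"date": "2014-01-10", "instances": [
        {"attribute": "grade", "value": "2", "negated": False},
        {"attribute": "grade", "value": "2", "negated": False},
        {"attribute": "grade", "value": "3", "negated": False},
        {"attribute": "metastasis", "value": "present", "negated": True},
    ]},
    {"date": "2014-06-02", "instances": [
        {"attribute": "stage", "value": "IIA", "negated": False},
    ]},
]

phenotype = phenotype_level(docs)
```

Nothing in these functions depends on the values having come from NLP, which is in line with the slide's point that the same aggregation could apply to structured data.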

INFORMATION EXTRACTION FROM SELECTED COHORT
[Diagram labels: i2b2 / tranSMART; target model; IE pipeline; case set; structured data; add to existing report data]

ASSISTED ANNOTATIONS
[Diagram labels: i2b2 / tranSMART; target model; annotator software; final structured data; auto extracted data]

EHR driven phenotyping
[Diagram labels: True patient state; Recording; Raw EHR data; Phenotype Discovery; Discrete Phenotype Knowledge (Classify, Predict, Understand, Intervene); Inform; Inform]
Representation: bidirectional, frequently unidirectional
Use case driven process model: real world; defined parameters; semantic ontology
Incomplete; inaccurate; highly complex; bias
Not developed with real world use cases, just the old care delivery model

Techniques
Concept extraction
Coreference resolution
Word sense disambiguation
Temporal relationships (Bache)
Spatial relationships
Validation
Standardization across approaches
Results are annotated corpora, but is that enough, or is it a start?

Purpose-developed NLP
Specific feature extraction
High accuracy
Semantic web/ontologies based on real world use cases (Pathak, Fernandez-Breis)
Marry NLP with real world use cases to mine for features and develop machine learnable patterns within annotated corpora

NLP + discrete data
NLP alone may not be the answer, but combinations may be critical (Tien, Ludvigson)
    More precision
    Allow better modeling of extraction of NLP concepts
    Allow larger multidimensional data
    Fodder for machine learning algorithms
    High yield inputs, not entire corpus
Cancer and cardiology are two low-hanging fruits

Statistical approaches
Active learning (Chen)
Dimensionality reduction (Lyalina)
Graph embedding (my idea)
Graph theory (my idea)
Bayesian networks (Klann)
Conditional random fields (Deleger)
Visual Phenome amongst the concept cloud (Warner)