COMPARISON OF BREAST CANCER STAGING IN NATURAL LANGUAGE TEXT AND SNOMED ANNOTATED TEXT


Volume 116 No , ISSN: (printed version); ISSN: (on-line version)
url: ijpam.eu

1 Johanna Johnsi Rani G, 2 Dennis Gladis, 3 Joy John Mammen
1 Department of Computer Science, Madras Christian College, Chennai, South India
2 Department of Computer Science, Presidency College, Chennai, South India
3 Department of Transfusion Medicine, Christian Medical College, Vellore, South India
1 johanna.g@mcc.edu.in, 2 Christophergladis67@gmail.com, 3 joymammen@cmcvellore.ac.in

Abstract: In recent times, medical reports are generated electronically and stored in databases so that automated systems can collate, process, analyze and interpret the patient data. A collection of such reports can help in population studies on the disease domain. Automated systems can also verify the manual diagnosis presented in the reports by experts. The corpus for the automated system discussed here is a set of breast cancer pathology reports retrieved and processed using Natural Language Processing (NLP) techniques. Under the protocol of the American Joint Committee on Cancer (AJCC), the pTNM classification is used to determine the pathological staging of breast cancer. The characteristics and classifications of Tumour T, Lymph node N and Distant Metastases M determine the stage of cancer. M is not evident from pathology reports, hence it is given a default value of M0. The T and N classifications in the reports are validated and, where necessary, modified by the domain experts to give the Gold Standard, and a discrepancy report is generated for those reports whose values differ. The cancer staging parameters extracted by the automated system are compared against the Gold Standard for analysis. The focus of the work is to extract the parameters required to determine the cancer stage of patients from two kinds of reports, namely reports with natural language text and reports with SNOMED annotated text.
The cancer staging process on both types of reports is compared, and the results indicate that staging on SNOMED annotated pathology reports yields better results than staging on natural language text.

Keywords: Breast cancer; Pathology reports; Natural Language Processing; Annotated text; Cancer stage

1. Introduction

Most medical reports, both written and electronic, contain descriptive narrations in natural language, mostly in English. Textual data from these documents can be processed through natural language processing methods. Most hospitals in India generate medical reports and store them in databases. Processing these reports with automated systems can provide valuable information for analysis and interpretation about the patient population. Statistics indicate that India ranks at the top in breast cancer deaths. With the available set of breast cancer pathology reports, an automated system is developed to determine the cancer stage of patients. The required parameters are extracted from both natural language reports and reports annotated using Systematized Nomenclature of Medicine Clinical Terms (SNOMED CT).

The set of breast cancer pathology reports is obtained from a hospital in South India. Each report has the following sections: Specimen, Clinical, Gross, Micro, and Impression. The Impression section of the de-identified pathology reports is processed to derive the pathological classification pTNM, in which T represents Tumour, N represents Lymph node and M represents Distant Metastasis. The grouping of the T, N and M classifications is used to determine the stage of cancer of patients. The American Joint Committee on Cancer (AJCC) has created resource materials that provide in-depth and easy-to-access information for doctors and other medical professionals who perform the staging of cancer patients, and for cancer registrars who abstract the cancer cases [11].
The existence of a primary tumour and its size are the prime values required to classify T. Breast cancer may spread to the axillary lymph nodes in the armpit, and the conditions for classifying lymph node N are more complex than those for T. Distant Metastases M cannot be classified from the details in a pathology report, so the system sets a default value of M0 to derive the cancer stage. The stage of breast cancer in a patient describes the extent of the spread of cancer in the body, and the grouping of T, N and M specifies that extent. The cancer stage is determined through the grouping of T, N and M as recommended by AJCC.
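As an illustration of how T can be read off the tumour size and how T, N and M are grouped into a stage, the following minimal sketch may help. It is not the system described in this paper: the function names are hypothetical, and the size thresholds and grouping table follow the commonly cited AJCC 7th edition breast cancer tables, which should be verified against the manual [11] before any use.

```python
# Illustrative sketch only. Thresholds and groupings are assumed from the
# commonly cited AJCC 7th edition breast cancer tables; verify against [11].

def t_category(size_mm, invades_chest_wall_or_skin=False, in_situ_only=False):
    """Return a coarse pT category from the largest tumour size in millimetres."""
    if invades_chest_wall_or_skin:
        return "T4"          # any size with direct extension
    if in_situ_only:
        return "Tis"         # carcinoma in situ
    if size_mm is None or size_mm == 0:
        return "T0"          # no evidence of primary tumour
    if size_mm <= 20:
        return "T1"
    if size_mm <= 50:
        return "T2"
    return "T3"

# (T, N) pairs for M0 disease; N3 and M1 are handled separately below.
STAGE_GROUPS = {
    ("Tis", "N0"): "0",
    ("T1", "N0"): "IA",
    ("T0", "N1mi"): "IB", ("T1", "N1mi"): "IB",
    ("T0", "N1"): "IIA", ("T1", "N1"): "IIA", ("T2", "N0"): "IIA",
    ("T2", "N1"): "IIB", ("T3", "N0"): "IIB",
    ("T0", "N2"): "IIIA", ("T1", "N2"): "IIIA", ("T2", "N2"): "IIIA",
    ("T3", "N1"): "IIIA", ("T3", "N2"): "IIIA",
    ("T4", "N0"): "IIIB", ("T4", "N1"): "IIIB", ("T4", "N2"): "IIIB",
}

def stage_group(t, n, m="M0"):
    """Group T, N and M into a stage; M defaults to M0 as in this work."""
    if m == "M1":
        return "IV"
    if n == "N3":
        return "IIIC"
    # Assumption: micrometastasis counts as N1 for tumours larger than T1.
    if n == "N1mi" and t not in ("T0", "T1"):
        n = "N1"
    return STAGE_GROUPS.get((t, n), "unknown")
```

Under these assumptions, a 15 mm tumour with no nodal involvement, `stage_group(t_category(15), "N0")`, is grouped as stage IA. Combinations absent from the table are reported as "unknown" rather than guessed.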

Before the cancer stage is determined on natural language text and on SNOMED annotated text, the textual content is pre-processed. The pre-processing steps required for natural language include Natural Language Processing (NLP) related tasks and standardization of numerical and non-numerical values in the text. The SNOMED annotated text requires a major pre-processing step of extracting a disease-specific subset from the SNOMED database. In the subsequent pre-processing step, the SNOMED subset extracted for the disease domain is used to annotate the text with SNOMED terms and their codes. Of these processing steps, the extraction of the SNOMED subset for the breast cancer domain and the annotation of natural language text using the subset are out of the scope of this paper.

The work uses regional data collected from a hospital in India. Hence it has practical applicability in the diagnosis, treatment and population-based studies of breast cancer in women in India. The paper is organized as follows: Section 2 describes Related Works in Natural Language Processing (NLP), SNOMED annotation of text, and Cancer Staging. Section 3 explains the Materials and Methods used. Section 4 describes the Results obtained. Section 5 presents the Conclusions.

2. Related Works

Electronic Health Records (EHR), especially those in narrative text form, are processed by applying Natural Language Processing (NLP) and Information Extraction (IE) techniques. Erik Cambria and White mention various approaches that use production rules and semantic categories, and approaches based on First-Order Logic (FOL), Bayesian networks and semantic networks [1]. Dunham et al. [12], Schadow and McDonald [4], Xu et al. [3], Anni Coden et al. [6], and Nguyen et al. [5] used domain-specific lexicons and rules in processing pathology reports. Nelson et al. developed a web-based search application with sequential queries [9]. Buckley JM et al. converted free-text EHRs to a machine-readable form using NLP techniques [2]. Anni Coden et al. automatically extracted cancer disease characteristics from pathology reports [6]. David Martinez and Yue Li used text mining tools to extract information with minimal human intervention [8]. Cancer staging in this work is done by extracting the required parameters using pattern-matching on free text and annotated text.

Clinical reports in the developed countries use medical terminologies such as SNOMED or ICD. Buckley et al. used ICD and Current Procedural Terminology (CPT) codes to identify those reports pertaining to breast [2]. Schadow G and McDonald developed a method for extracting details about specimens and their related findings from coded text [4]. Nguyen et al. applied a symbolic rule-based classification methodology to identify SNOMED CT concepts in free text [14]. Napolitano G, Fox C, Middleton R and Connolly D used pattern-based extraction from pathology reports [7].

Many breast cancer related research works in India use the Wisconsin Breast Cancer dataset. This work uses regional data and hence the results have practical relevance and applicability. The system uses pattern-matching rules for extraction and cancer staging on both natural language text [17] and annotated text. The annotation is done using SNOMED. Ching-Heng Lin, Nai-Yuan Wu, Wei-Shao Lai and Der-Ming Liou developed an auto-annotation tool that selects terms using a suggesting and ranking algorithm to annotate reports with terms from a SNOMED subset [16].

The two essential processing steps in this work are the use of pattern-matching algorithms to extract the parameters necessary for cancer staging, and the annotation of text using SNOMED. The cancer staging process is run on both the natural language text and the annotated text of the same dataset, and the results are compared and analyzed to determine which performs better.

3. Materials and Methods

The dataset and the methods applied to determine the cancer stage of patients are explained in this section.
The process applies natural language processing steps and pattern-matching rules to determine the cancer stage.

A. Dataset

One hundred and fifty de-identified breast cancer pathology reports constitute the corpus used in this work. Each report, written by a pathologist, narrates the patient's condition as determined by examining cells and tissues under a microscope. The report has the following sections: Demographic information; the Specimen section, indicating the body part from which the tissue samples are taken; Clinical history, describing the breast abnormality and the kind of surgery done; and Gross description, giving the size, weight, and color of each piece of tissue removed. The Microscopic description describes how the cancer cells look under the microscope, their relationship to the normal surrounding tissue, the size of the cancer, the results of special tests and the growth rate of cells. The Impression section summarizes all the important findings from the tissues examined.
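Since only the Impression section is used for staging, a report first has to be split into its sections. The sketch below shows one way this could be done with a regular expression. It is a minimal illustration, not the authors' implementation: the function name is hypothetical, and it assumes that each section starts on its own line with a header such as "Specimen:" or "Microscopic description:", which the source does not specify. The sample report is invented.

```python
import re

# Section names as listed for the reports; the "Header ... :" layout is an assumption.
SECTION_HEADERS = ["Specimen", "Clinical", "Gross", "Micro", "Impression"]

# A header is one of the names at the start of a line, optionally followed by
# more words on the same line (e.g. "Microscopic description"), then a colon.
_HEADER_RE = re.compile(
    r"^[ \t]*(" + "|".join(SECTION_HEADERS) + r")[^:\n]*:",
    re.IGNORECASE | re.MULTILINE,
)

def split_sections(report_text):
    """Return a dict mapping each section name to the text under its header."""
    sections = {}
    matches = list(_HEADER_RE.finditer(report_text))
    for i, match in enumerate(matches):
        name = match.group(1).capitalize()
        # A section runs until the next header, or to the end of the report.
        end = matches[i + 1].start() if i + 1 < len(matches) else len(report_text)
        sections[name] = report_text[match.end():end].strip()
    return sections

# Invented sample text, used only to exercise the function.
sample_report = (
    "Specimen: Left breast, mastectomy\n"
    "Clinical history: Lump in left breast\n"
    "Gross description: Tumour measuring 2.5 cm\n"
    "Microscopic description: Infiltrating ductal carcinoma\n"
    "Impression: Infiltrating ductal carcinoma, 1/3 nodes positive\n"
)
sections = split_sections(sample_report)
impression = sections["Impression"]
```

A body line that happens to begin with one of the header words and contains a colon would be mistaken for a header, so a real system would need stricter header rules or a check for missing sections.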

B. Cancer staging

The stage of cancer indicates how far the cancer has spread. There are two types of cancer staging: Clinical staging and Pathological staging. Of the two, Pathological staging is the more accurate. The T, N and M classifications are found from the Impression section by applying the AJCC protocol, and their grouping determines the stage of cancer. The stage is determined on reports with natural language text and on reports with SNOMED annotated text.

C. Preprocessing for cancer staging on Plain text

Retrieval of reports, pre-processing of the report content, extraction of the details required for TNM classification, and staging are the major tasks performed in the developed automated system. The pathology report is retrieved either as a .pdf or a .txt file, and the listed preprocessing steps are performed on plain text reports. The precision of results in any process on natural language text depends on the number of preprocessing steps applied to homogenize and standardize the data. The preprocessing steps applied to the breast cancer pathology reports are listed below.

Report segregation: Separating multiple reports into individual reports.
Section segmentation: Extracting the contents of the sections in the reports as separate sections.
Standardization of measures: Tumour sizes are given either in centimeters or in millimeters. This step converts all the measures into millimeters.
Date formats: All dates are converted to a uniform DD/MM/YYYY format.
Sentence segmentation: The contents of each section are separated into individual sentences. Period (.) is used to identify the sentences, with handling of exceptions for fraction values.
Standardization of numerical values: The pathology reports have numeric values represented in numerals (3) or in English words (three). Such numerical values are standardized to Arabic numerals.
Alpha numeric representations: The number of lymph nodes is represented as 1/3, or 1 out of three, or one out of three. This value is converted into complete textual form as one out of three.
Abbreviations: Abbreviations are expanded by the system.
Spelling variations: All discrepancies in spelling between British and American English are standardized using British English.
Whitespace removal: Whitespaces are removed from the document. This improves the data extraction process.
Handling parenthesized terms: Parentheses ( ) or [ ] in the document are homogenized into [ ].
Case sensitivity: All text comparisons are made by converting the terms into lower case. Medical terms such as Ductal Carcinoma in situ are converted to the form found in SNOMED.
Missing headers: The pre-processing module appends missing headers to the document whenever necessary.

The application of the above pre-processing steps homogenizes the reports and improves the extraction of parameters for cancer staging. The efficiency and precision of the annotation of medical terms in the report using SNOMED also improve with the preprocessing steps. Fig. 1 shows the workflow for the cancer staging process on natural language text in pathology reports and on SNOMED annotated reports. The diagram shows that both archived reports and newly generated reports are processed to determine the cancer stage of patients.

Figure 1. Workflow of Comparison on Breast Cancer Staging

D. Preprocessing for cancer staging on SNOMED Annotated text

The pre-processing steps required for extraction of the cancer stage on SNOMED annotated text are manually building a Lexicon of breast cancer terms, and extracting a SNOMED subset for the breast cancer domain using the Lexicon and queries. These are part of our earlier work in the development of the automated system. The Lexicon is built in two ways:
i. Through a manual process of examining the reports to accumulate terms and store them in a database, and
ii. Through application of NLP-based tasks such as sectioning, sentence splitting, tokenization and stop word removal, after tagging the medical terms found in the manual lexicon [18].
The above two preprocessing steps are out of the scope of this paper.

In the annotation process using the subset, each medical term in the report is replaced with its corresponding SNOMED term and its code. Pattern-matching algorithms are applied on the SNOMED annotated text to find the pTNM classification. The patterns used for cancer staging on annotated text are built from several components: SNOMED Concept Ids identified using the CliniClue SNOMED browser, numerical values, negation values (No / Not), and logical connectives (and, or). The conditions are the same classification conditions specified in the AJCC protocol. When the free text is annotated with the SNOMED codes for medical terms, the pTNM classifications are also annotated with their respective codes.

4. Results

The automated system determined the cancer stage for each patient from the natural language text and the annotated text in all the 150 reports. The pattern-matching rules applied in the process extracted the details required for the classification of T and N and for cancer staging. Figures 2 to 4 present the analysis of T, N and Cancer Stage extracted from natural language text.

E. Gold Standard for the Cancer Staging

The system has three pTNM classifications:
i. The pTNM given at the end of the report, manually derived by the Pathologist by examining the parameters in the report,
ii. The Gold Standard pTNM, verified and validated by the Pathologist through a graphical interface, and
iii. The pTNM classification automatically derived by the application.
The pTNM specified in the Impression section of each pathology report is verified by the Pathologists to correct erroneous and missing classifications. This is the Gold Standard that is used to validate the automatically derived pTNM classification. The pTNM is of prime importance as it determines the stage of cancer in patients.

Figure 2.
Analysis of T-Classification on Natural Language text

F. Analysis of Cancer staging process

The analysis of the derived cancer stage values is performed by finding the True Positive (TP), True Negative (TN), False Positive (FP) and False Negative (FN) values. The evaluation parameters used in the analysis are listed below.

Precision (P) = TP / (TP + FP)
Recall (R) = TP / (TP + FN)
Specificity = TN / (TN + FP)
Accuracy = (TP + TN) / (TP + TN + FP + FN)
F-measure = (2 * Precision * Recall) / (Precision + Recall)
Error Rate = (FP + FN) / (TP + TN + FP + FN)

Figure 3. Analysis of N-Classification on Natural Language text

Figure 4. Analysis of Cancer Staging on Natural Language text

Cancer staging on natural language text gives an average Precision of 94.72%, an average Recall of 95.94%, an average Accuracy of 92.12% and an average Specificity of 80.96%. The average F-measure for the process is 95.89% and the average Error is 3.94%. The results show that the system performs well in extracting the cancer stage of patients. This efficiency can be attributed to the numerous pre-processing steps applied to the textual contents before the extraction process. The results of cancer staging on SNOMED annotated text in pathology reports are presented in Figures 5 to 7.

Figure 5. Analysis of T-Classification on SNOMED Annotated text

Figure 6. Analysis of N-Classification on SNOMED Annotated text

Figure 7. Analysis of Cancer staging on SNOMED Annotated text

The cancer staging process on SNOMED annotated reports yields the following results. The average Precision of the process is 95.48%, the average Recall is 100%, the average Accuracy is 97.97% and the average Specificity is 96.27%. The average F-measure for the process is 97.66% and the average Error is 0.04%. As the analysis parameters indicate, the cancer staging process on SNOMED annotated text yields better results. This can be attributed to the following reasons.
i. The preprocessing steps extensively applied on the medical text contribute to homogenization and standardization of the text in the reports. This cleans the dataset for efficient processing.
ii. The correctness of the process is ensured by manually collating a Lexicon of medical terms relating to breast cancer from the pathology reports and using it for the annotation process. The Lexicon has been obtained and verified using manual and automated means, which standardized the subset extraction process.
iii. The Lexicon generated by the system is used in SNOMED subset extraction. The comprehensiveness and completeness of the lexicon terms contribute to effective subset extraction.
iv. The SNOMED subset for cancer consists of about 1% of all the SNOMED CT concepts in the database. Using the extracted SNOMED subset for the breast cancer domain, instead of the complete SNOMED database, results in faster and more precise annotation of reports, thus giving better results for cancer staging than on natural language text.
v. The annotation process standardized every medical term in the report by replacing it with its equivalent term in the medical vocabulary in SNOMED.

5. Conclusions

The objective of the work, to derive the stage of cancer on natural language textual reports and on SNOMED annotated reports, was achieved. The use of the standard AJCC protocol for cancer staging and a globally accepted medical vocabulary such as SNOMED yielded better results in the staging process. The natural language text is heterogeneous, but the pre-processing steps bring homogeneity to the text. In spite of this, the lower efficiency of cancer staging on natural language text reports can be attributed to the use of only the Impression section of the report for the staging process. Processing other sections would improve the results. The accuracy of automated systems in the medical domain, especially in a task as critical as cancer staging, is of vital importance, as it involves diagnostic and treatment decisions on a human being. This critical factor necessitates that reports be annotated and processed for better results, analysis and decision-making. Annotation of the reports using SNOMED also makes it possible to apply numerous

queries on any annotated disease dataset to get a better understanding of the patient population. The work indicates that, between the cancer staging process on natural language text and on SNOMED annotated text, the process on annotated text yields the better results. As an extension of this work, the annotation process can be performed on reports of other disease domains for the required processing and decision making.

6. Acknowledgement

The authors would like to thank the Department of Pathology, Christian Medical College and Hospital, Vellore for providing the sample data for the study. The authors would also like to acknowledge S. Pradeep Vignesh, student of MCA in the Department of Computer Science, Madras Christian College, for his contributions towards developing the automated system.

References

[1] Erik Cambria, Bebo White, Jumping NLP Curves: A Review of Natural Language Processing Research, IEEE Computational Intelligence Magazine, pp 48-57, May
[2] Buckley JM, Coopey SB, Sharko J, et al. The feasibility of using natural language processing to extract clinical information from breast pathology reports. Journal of Pathology Informatics. 2012;3:23. doi: /
[3] Xu H, Friedman C. Facilitating research in pathology using natural language processing. AMIA Annual Symp. Proc. 2003:1057.
[4] Schadow G, McDonald CJ. Extracting Structured Information from Free Text Pathology Reports. AMIA Annual Symposium Proceedings., pp ,
[5] Nguyen, Moore, Lawley, Hansen, Colquist, Automatic extraction of cancer characteristics from free-text pathology reports for cancer notifications, Stud Health Technol Inform. 2011;168:
[6] Anni Coden et al., Automatically extracting cancer disease characteristics from pathology reports into a Disease Knowledge Representation Model, Elsevier, Journal of Biomedical Informatics 42, pp ,
[7] Napolitano G, Fox C, Middleton R, Connolly D, Pattern-based information extraction from pathology reports for cancer registration, Cancer Causes Control, 2010 Nov;21(11): doi: /s Epub 2010 Jul 23.
[8] David Martinez, Yue Li, Information Extraction from Pathology reports in a hospital setting, Proceedings of the 20th ACM international conference on Information and knowledge management, pp ,
[9] Nelson HD, Weerasinghe R, Martel M, Bifulco C, Assur T, Elmore JG, et al. Development of an electronic breast pathology database in a community health system. J Pathol Inform 2014;5:26.
[10] McCowan I, Moore D, Nguyen AN, Bowman RV, Clarke BE, Duhig EE, et al. Application of Information Technology: Collection of Cancer Stage Data by Classifying Free-text Medical Reports. JAMIA. 2007;14(6):
[11] AJCC Cancer Staging Manual. 7th ed. New York, NY: Springer, ,
[12] G.S. Dunham, M.G. Pacak and A. W. Pratt, Automatic indexing of Pathology data, Journal of the American Society for Information Science, 29(2):81-90, Mar.,
[13] David A. Hanauer et al., The registry case finding engine: an automated tool to identify cancer cases from unstructured, free-text pathology reports and clinical notes, Journal of the American College of Surgeons, 205(5): pp , Nov
[14] Anthony N Nguyen et al., Symbolic rule-based classification of lung cancer stages from free-text pathology reports, Journal of the American Medical Informatics Association (JAMIA), 17: ,
[15] Carlos Rodrigues-Solano, Leonardo Lezcano, Miguel-Angel Sicilia, Information Systems and Technologies for Enhancing Health and Social Care, 2013, pp. 15.
[16] Lin C-H, Wu N-Y, Lai W-S, Liou D-M. Comparison of a semi-automatic annotation tool and a natural language processing application for the generation of clinical statement entries. Journal of the American Medical Informatics Association: JAMIA. 2015;22(1): doi: /amiajnl
[17] Johanna Johnsi Rani G., Dennis Gladis, Marie Therese Manipadam, Gunadala Ishitha, Breast Cancer Staging using Natural Language Processing, 2015, IEEE Conference publications, pp , DOI: /ICACCI
[18] Johanna Johnsi Rani G., Dennis Gladis, Joy John Mammen, Lexicon-based and Query-based Auto-annotation of Medical Reports using SNOMED, Proceedings of the International Conference on Computing Paradigms (ICCP), 2017
[19] Johanna Johnsi Rani G., Dennis Gladis, Joy John Mammen, SNOMED Subset Extraction for Annotation of Breast Cancer Pathology Reports, Proceedings of National Conference on ICT Solutions for Challenges and Issues in e-health (NCICTEH'17),
[20] K. Srikar, M. Akhil, V. Krishna Reddy, Execution of Cloud Scheduling Algorithms, International Innovative Research Journal of Engineering and Technology, vol 02, no 04, pp ,


Automatic Extraction of Synoptic Data. George Cernile Artificial Intelligence in Medicine AIM Automatic Extraction of Synoptic Data George Cernile Artificial Intelligence in Medicine AIM Agenda Background Technology used Demonstration Questions How often are checklist elements included in a report,

More information

NUMERATOR: Reports that include the pt category, the pn category and the histologic grade

NUMERATOR: Reports that include the pt category, the pn category and the histologic grade Quality ID #100 (NQF 0392): Colorectal Cancer Resection Pathology Reporting: pt Category (Primary Tumor) and pn Category (Regional Lymph Nodes) with Histologic Grade National Quality Strategy Domain: Effective

More information

TeamHCMUS: Analysis of Clinical Text

TeamHCMUS: Analysis of Clinical Text TeamHCMUS: Analysis of Clinical Text Nghia Huynh Faculty of Information Technology University of Science, Ho Chi Minh City, Vietnam huynhnghiavn@gmail.com Quoc Ho Faculty of Information Technology University

More information

EXTRACT THE BREAST CANCER IN MAMMOGRAM IMAGES

EXTRACT THE BREAST CANCER IN MAMMOGRAM IMAGES International Journal of Civil Engineering and Technology (IJCIET) Volume 10, Issue 02, February 2019, pp. 96-105, Article ID: IJCIET_10_02_012 Available online at http://www.iaeme.com/ijciet/issues.asp?jtype=ijciet&vtype=10&itype=02

More information

The feasibility of using natural language processing to extract clinical information from breast pathology reports

The feasibility of using natural language processing to extract clinical information from breast pathology reports J Pathol Inform Editor-in-Chief: Anil V. Parwani, Liron Pantanowitz, Pittsburgh, PA, USA Pittsburgh, PA, USA OPEN ACCESS HTML format For entire Editorial Board visit : www.jpathinformatics.org/editorialboard.asp

More information
