Augmented Medical Decisions


Machine Learning Applied to Biomedical Challenges 2016 Rulex, Inc.

Intelligible Rules for Reliable Diagnostics

Rulex is a predictive analytics platform that manages and analyzes large amounts of heterogeneous data. Using machine learning methods, Rulex builds models that describe the data provided by the user without requiring any prior information. These models can then be used to forecast the behavior of the system in future situations.

Besides standard statistical and machine learning methods, Rulex incorporates innovative approaches named Logic Learning Machines (LLMs). The distinguishing feature of LLMs is that they generate intelligible rules about the problem at hand. Models produced by standard machine learning methods can forecast future behavior but, being described by complex equations, give the user no insight into the system under study. The LLM approach, by contrast, generates a set of threshold rules that a human can easily understand.

Consider, for example, medical diagnosis. While standard machine learning methods provide the state of a patient (e.g. ill or healthy) as a complex function of the inputs (e.g. blood pressure, markers, etc.), LLMs produce rules of the form:

IF blood pressure > x AND lymphoid cell concentration > y AND ... THEN Patient is sick

where x and y are thresholds automatically determined by the Rulex software. The number of conditions in a rule is variable and depends on the complexity of the problem considered. The rules generated by Rulex-LLMs can be interpreted by a physician and, if needed, adjusted on the basis of her experience. Moreover, Rulex-LLMs provide, with no additional effort, a ranking of the most relevant features in determining, for example, whether a patient is sick or healthy. It is thus possible to discover that a variable is not important and, if so, to remove it from the list of quantities to be monitored.

Rulex-LLMs have been employed to solve problems in different application fields. In particular, several applications in the biomedical area have been carried out successfully; some examples are reported below.
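The IF ... AND ... THEN rule form described above can be made concrete with a short Python sketch. It only illustrates how a set of threshold rules can be stored and evaluated; it is not the Rulex software or its API, and the class names, feature names and threshold values (140.0 and 4.0) are hypothetical placeholders, not learned or clinically meaningful values.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Condition:
    """One 'feature > threshold' test, as in the rule form above."""
    feature: str
    threshold: float  # hypothetical value; a rule learner would fit this

    def holds(self, sample: Dict[str, float]) -> bool:
        return sample[self.feature] > self.threshold


@dataclass
class Rule:
    """A conjunction (AND) of conditions leading to one outcome."""
    conditions: List[Condition]
    outcome: str

    def fires(self, sample: Dict[str, float]) -> bool:
        return all(c.holds(sample) for c in self.conditions)


def classify(rules: List[Rule], sample: Dict[str, float],
             default: str = "healthy") -> str:
    """Return the outcome of the first rule that fires, else the default."""
    for rule in rules:
        if rule.fires(sample):
            return rule.outcome
    return default


# Illustrative rule set with made-up thresholds.
RULES = [
    Rule(
        conditions=[
            Condition("blood_pressure", 140.0),
            Condition("lymphoid_cell_concentration", 4.0),
        ],
        outcome="sick",
    )
]

if __name__ == "__main__":
    patient = {"blood_pressure": 150.0, "lymphoid_cell_concentration": 5.0}
    print(classify(RULES, patient))
```

Because each rule is a plain list of conditions, a reviewer can read a threshold, or change it, without touching the evaluation logic, which is the kind of adjustment by a physician that the text describes.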

Extraction of Rules for Pleural Mesothelioma Diagnosis

Malignant pleural mesothelioma (MPM) is a rare, highly fatal tumor whose incidence is rapidly increasing in developed countries due to widespread past exposure to asbestos in environmental and occupational settings. Correct diagnosis of MPM is often hampered by atypical clinical symptoms that may cause it to be mistaken for other malignancies (especially adenocarcinomas) or for benign inflammatory or infectious diseases (BD) causing pleurisy. Cytological examination (CE) can identify malignant cells, but a very high proportion of false negatives is sometimes encountered because of the high prevalence of non-neoplastic cells. Moreover, in most cases a positive CE result alone does not distinguish MPM from other malignancies [3].

Many tumor markers (TM) have been shown to be useful complementary tools for the diagnosis of MPM. In particular, recent investigations analyzed the concentrations of three tumor markers in pleural effusions, namely soluble mesothelin-related peptide (SMRP), CYFRA 21-1 and CEA, and their association with a differential diagnosis of MPM, pleural metastasis from other tumors (MTX) and BD. SMRP showed the best performance in separating MPM from both MTX and BD, while high values of CYFRA 21-1 were associated with both MPM and MTX. Conversely, high concentrations of CEA were mainly observed in patients with MTX. Taken together, these results indicate that information from the three markers and from CE might be combined to obtain a classifier that separates MPM from both MTX and BD. In this context, Rulex has been applied to the differential diagnosis of MPM by identifying simple and intelligible rules based on CE and TM concentrations.
The results were compared with those obtained by other supervised methods, showing that Rulex outperforms all the competing approaches (Decision Trees, K-Nearest Neighbors and Artificial Neural Networks).

Extraction of a Simplified Gene Expression Signature for Neuroblastoma Prognosis

A cancer patient's outcome is written, in part, in the gene expression profile of the tumor. A 62-probe-set signature (NB-hypo) that identifies tissue hypoxia in neuroblastoma had previously been identified and shown to stratify neuroblastoma patients into good and poor outcome groups. It was important to develop a prognostic classifier to cluster patients into risk groups that benefit from defined therapeutic approaches. Novel classification and data discretization approaches can be instrumental in generating accurate predictors and robust tools for clinical decision support. In this study, Rulex was applied to gene expression data; in particular, the Attribute Driven Incremental Discretization technique, which transforms continuous variables into simplified discrete ones, was employed as a pre-processing step for rule extraction by means of the Logic Learning Machine.

The application of Rulex-LLMs produced 9 rules utilizing mainly two conditions of the relative expression of 11 probe sets. These rules were very effective predictors, as shown in an independent validation set, demonstrating the validity of Rulex-LLMs applied to microarray data and patient classification. Rulex-LLMs performed as efficiently as Prediction Analysis of Microarray and Support Vector Machine, and outperformed other learning algorithms such as C4.5. Rulex carried out a feature selection by selecting a new signature (NB-hypo-II) of 11 probe sets that turned out to be the most relevant in predicting outcome among the 62 of the NB-hypo signature. The rules are easily interpretable, as they involve only a few conditions.

Extraction of Intelligible Rules Concerning the Prognosis of Neuroblastoma

Neuroblastoma is the most common pediatric solid tumor. About fifty percent of high-risk patients die despite treatment, making the exploration of new and more effective strategies for improving stratification mandatory. Hypoxia is a condition of low oxygen tension that occurs in poorly vascularized areas of the tumor and is associated with poor prognosis. The aim of this study was to develop a prognostic classifier of neuroblastoma patients' outcome, blending existing knowledge on clinical and molecular risk factors with the prognostic NB-hypo signature. Classifiers that output explicit rules, which can be easily translated into the clinical setting, are particularly interesting in this context. Rulex-LLMs exhibited good accuracy and promised to fulfill the aims of the work. This algorithm was used to classify neuroblastoma patients on the basis of the following risk factors: age at diagnosis, INSS stage, MYCN amplification and NB-hypo. The algorithm generated explicit classification rules in good agreement with existing clinical knowledge.
Through an iterative procedure, the examples causing instability in the rules were identified and removed from the dataset. This workflow generated a stable classifier that was very accurate in predicting good and poor outcome patients. The good performance of the classifier was validated in an independent dataset. NB-hypo was an important component of the rules, with a relevance similar to that of tumor staging.

Validation of a New Classification for Multiple Osteochondromas Patients

Multiple osteochondromas (MO), previously known as hereditary multiple exostoses (HME), is an autosomal dominant disease characterized by the formation of several benign cartilage-capped bone growths called osteochondromas or exostoses. Various clinical classifications have been proposed, but a consensus has not been reached. The aim of this study was to validate, using a machine learning approach, an easy-to-use tool that assigns MO patients to three classes according to the number of bone segments affected and the presence of skeletal deformities and/or functional limitations. The proposed classification was validated, with a highly satisfactory mean accuracy, by analyzing 150 different variables on 289 MO patients through Rulex-LLMs. This approach allowed us to identify Madelung deformity and limitation of the hip extra-rotation as tags of the three clinical classes. In conclusion, the proposed classification provides an efficient system to characterize this rare disease and can define homogeneous cohorts of patients for investigating MO pathogenesis.

Benchmarking Rulex LLMs Performances on Standard Biomedical Datasets

In this study, LLMs were applied to three benchmark datasets covering different biomedical problems. The datasets are taken from the UCI archive, a collection of data for machine learning benchmarking, and include:

Diabetes: the problem of diagnosing diabetes from the values of 8 variables. All 768 patients considered are females at least 21 years old of Pima Indian heritage; 268 of them are cases and the remaining 500 are controls.

Heart: the detection of heart disease from a set of 13 input variables concerning patient status. The total sample of 270 elements is formed by 120 cases and 150 controls.

DNA: the recognition of acceptor and donor sites in primate gene sequences of length 60 (bases). The dataset consists of 3186 sequences, subdivided into three classes: acceptor, donor, none.

The performance of LLMs was compared to that of other supervised methods, namely Decision Trees (DT), Artificial Neural Networks (ANN), Logistic Regression (LR) and K-Nearest Neighbor (KNN).
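As context for the comparison above, the class counts quoted for the Diabetes and Heart datasets fix the accuracy of a trivial predictor that always outputs the more frequent class, a floor that any of the compared methods should exceed. The sketch below computes only that arithmetic from the counts given in the text; it is not part of the Rulex study, the function name is illustrative, and the DNA dataset is left out because its per-class counts are not stated.

```python
def majority_baseline(cases: int, controls: int) -> float:
    """Accuracy of always predicting the more frequent of two classes."""
    total = cases + controls
    return max(cases, controls) / total


# (cases, controls) exactly as quoted in the dataset descriptions above.
DATASETS = {
    "Diabetes": (268, 500),
    "Heart": (120, 150),
}

if __name__ == "__main__":
    for name, (cases, controls) in DATASETS.items():
        print(f"{name}: {majority_baseline(cases, controls):.1%}")
    # Diabetes: 65.1%
    # Heart: 55.6%
```

An accuracy figure on the Diabetes data is therefore informative only to the extent that it exceeds roughly 65%, and on the Heart data roughly 56%.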
These tests showed that the results of Rulex-LLMs are better than those of ANN, DT (which also produce rules) and KNN, and are comparable with those of LR.

To find out more about Rulex's technology and its applications, please visit http://rulex.ai.

Bibliography

[1] S. PARODI, R. FILIBERTI, P. MARRONI, R. LIBENER, G.P. IVALDI, M. MUSSAP, E. FERRARI, C. MANNESCHI, E. MONTANI, M. MUSELLI. Differential diagnosis of pleural mesothelioma using Logic Learning Machine. Submitted to BMC Bioinformatics (2014).

[2] D. CANGELOSI, M. MUSELLI, S. PARODI, F. BLENGIO, J. KOSTER, A. SCHRAMM, A. GARAVENTA, C. GAMBINI, L. VARESIO. Use of Attribute Driven Incremental Discretization and Logic Learning Machine to build a prognostic classifier for neuroblastoma patients. To appear in BMC Bioinformatics (2014).

[3] D. CANGELOSI, F. BLENGIO, R. VERSTEEG, A. EGGERT, A. GARAVENTA, C. GAMBINI, M. CONTE, A. EVA, M. MUSELLI, L. VARESIO. Logic Learning Machine creates explicit and stable rules stratifying neuroblastoma patients. BMC Bioinformatics 14:S12 (2013).

[4] M. MORDENTI, E. FERRARI, E. PEDRINI, N. FABBRI, L. CAMPANACCI, M. MUSELLI, L. SANGIORGI. Validation of a New Hereditary Multiple Exostoses Classification Through Switching Neural Networks. American Journal of Medical Genetics 161 (2013), 556-560. DOI: 10.1002/ajmg.a.35819.

[5] M. MUSELLI. Extracting knowledge from biomedical data through Logic Learning Machines and Rulex. EMBnet Journal 18B (2012), 56-58.