Predicting Breast Cancer Survivability Rates
- Ella Lloyd
- 5 years ago
Predicting Breast Cancer Survivability Rates
For Data Collected from Saudi Arabia Registries

Ghofran Othoum 1 and Wadee Al-Halabi 2
1 Computer Science, Effat University, Jeddah, Saudi Arabia
2 Computer Science, Effat University, Jeddah, Saudi Arabia

Abstract
The application of data mining and machine learning to uncover hidden knowledge is becoming highly influential in cancer research. This research compares three data mining classification models: multi-layer perceptron neural networks, C4.5 decision trees and Naïve Bayes. The classification models are built for breast cancer survivability prediction. The data set was collected from registries across Saudi Arabia. Due to data scarcity, data synthesis had to be performed using a random seed and a double sampling procedure. After the data had been preprocessed, the classification models were built and three performance measures were used to rank them: accuracy, sensitivity and specificity. The experiment was set up with the multi-layer perceptron as the baseline scheme and a statistical significance level of [value missing]. Decision trees performed marginally better than the multi-layer perceptron, and Naïve Bayes performed significantly worse than the baseline scheme. The results showed that the decision tree is the most accurate predictor of breast cancer survivability in Saudi Arabia (accuracy 0.979).

Keywords: Data Mining, Breast Cancer Prediction, Neural Networks, Decision Trees, Naïve Bayes.

1 INTRODUCTION
With the unprecedented growth of data, especially in bioinformatics and the medical field, it has become very important to devise computer-based methodologies that extract meaningful and biologically significant information from rapidly growing data banks (knowledge discovery). Data mining is a set of algorithmic techniques used to describe relationships between patterns and to predict classifications based on the data.
Cancer is a malignant disease that is one of the leading causes of death worldwide. Survivability rates in cancer are systematically affected by different interdependent factors. These factors include genetic factors (gene expressivity, which determines phenotypic conditions, including cancer), lifestyle factors and medical history. These factors can be obtained from different data sets. This research aims to implement data mining algorithms, apply them to data banks collected from various hospitals in Saudi Arabia, and then evaluate the classification models built. The three prediction models are neural networks, decision trees and Naïve Bayes.

2 SIMILAR WORK
To date, numerous studies and projects have explored the benefits of using machine learning methods to predict cancer survivability. However, few cancer classification projects have been completed in Saudi Arabia. The available international research was reviewed in order to formulate a generic structure for how the research can be conducted. Bellaachia and Guven [1] used statistical data provided by the National Cancer Institute in the United States through SEER (Surveillance, Epidemiology, and End Results). The data set contained 16 fields in a total of 151,886 records. In their methodology, the authors investigated three data mining algorithms (Naïve Bayes, back-propagated neural network, and the C4.5 decision tree algorithm) to predict the survivability rates of cancer patients. WEKA (Waikato Environment for Knowledge Analysis) was used to implement the algorithms, and k-fold cross-validation was used to validate the results. Their hypothesis was that their results would differ from those of Delen and Walker [2]. The hypothesis was verified, as the results showed that the methodology used by Bellaachia and Guven outperformed the approach of Delen et al. Scharber [3] investigated the role of text data mining in predicting cancer survivability.
The paper concluded that text data mining helps in identifying vital cancer information such as cancer type, tumor size and medical history. This information, according to Scharber, will improve the treatment plan for each cancer patient. Zhou and Jiang [4] used case studies on diabetes, hepatitis and breast cancer as their dataset. A neural network ensemble, a collection of artificial neural networks, was compared to C4.5 Rule-PANE. The neural network was used as a preprocessing tool and C4.5 Rule-PANE as the main training algorithm. The results showed that C4.5 Rule-PANE is a powerful rule generator and that the rules generated by C4.5 have strong generalization ability. Shital Shah and Andrew Kusiak [5] analyzed gene expression data to identify and classify cancer based on its causative genes. Decision tree and support vector machine algorithms were used to generate the predictions. The data bank contained datasets for ovarian, prostate and lung cancer. The study integrated all of the specified algorithms into a single gene-finding algorithm. For each type of cancer studied, a set of the most significant genes was identified with an accuracy in the range of 94-98%.

3 DATA CLEANING AND PREPARATION
Before the dataset is used, it needs to be properly preprocessed and a complete relevancy analysis needs to be performed. Preprocessing entails functions such as replacing missing values, normalizing numeric attributes and converting discrete attributes to nominal type. Feature selection involves selecting the attributes that are most relevant to the classification problem. The method used for relevancy analysis is the information gain ranker. Below is a detailed presentation of the steps completed in the preprocessing and relevancy analysis (feature selection) phases.

3.1 Data Preprocessing
WEKA's filters are an integral component of the WEKA package. These filters can be either supervised, belonging to weka.filters.supervised, or unsupervised, belonging to weka.filters.unsupervised. From the unsupervised filters, numeric-to-nominal conversion was used for nominal attributes and normalization for numerical attributes. Attributes with more than 70.0% missing values were omitted. However, the contribution of some fields to the pattern was critical, and removing them was considered counterproductive.
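The two preprocessing rules described above (dropping attributes with more than 70% missing values and normalizing numeric attributes) can be illustrated with a minimal sketch. It is written in plain Python rather than WEKA, and the attribute names, the sample records and the choice of min-max scaling to [0, 1] are assumptions made for illustration only, not the authors' actual pipeline.

```python
def drop_sparse_attributes(rows, max_missing=0.70):
    """Return copies of the rows without attributes whose share of
    missing values (None) exceeds max_missing."""
    keep = []
    for name in rows[0]:
        missing = sum(1 for row in rows if row[name] is None)
        if missing / len(rows) <= max_missing:
            keep.append(name)
    return [{name: row[name] for name in keep} for row in rows]


def min_max_normalize(rows, name):
    """Rescale one numeric attribute to [0, 1] in place; missing values stay None."""
    present = [row[name] for row in rows if row[name] is not None]
    low, high = min(present), max(present)
    span = high - low
    for row in rows:
        if row[name] is not None:
            row[name] = 0.0 if span == 0 else (row[name] - low) / span
    return rows


# Hypothetical records: "tumor_marker" is missing in 75% of the rows.
records = [
    {"age": 40.0, "tumor_marker": None},
    {"age": 60.0, "tumor_marker": None},
    {"age": 50.0, "tumor_marker": None},
    {"age": None, "tumor_marker": 1.2},
]

cleaned = min_max_normalize(drop_sparse_attributes(records), "age")
print(cleaned)
```

In this sketch the sparse attribute is removed before normalization, and the remaining missing age is left for a separate missing-value replacement step.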
Table 1 shows the distribution of the class attribute over the data collected from various registries.

3.2 Feature selection
To avoid inaccurate or random predictions caused by redundant, insignificant or noisy attributes, only selected features were included in the classification (see Table 2). This vector of attributes was sent to the Ministry of Health and various hospitals in order to obtain similar data fields from Saudi hospital registries.

Table 1. Survivability distribution in data collected from registries in Saudi Arabia (after random sampling)

| Categorical Variable | Frequency | Percentage |
|---|---|---|
| Survive breast cancer | [value missing] | [value missing] % |
| Not survive breast cancer | [value missing] | [value missing] % |
| Total | 1358 | |

Table 2. Predictor variables

| Attribute | Description |
|---|---|
| Marital status | The patient's marital status at the time of diagnosis for the reportable tumor |
| Birth place | Patient's birth place |
| Laterality | The side of the organ in which the tumor was detected |
| Age at diagnosis | |
| Grade | Differentiation of cells |
| Radiation | The radiation therapy methodology |
| Survivability | 1 for not survived and 0 for survived |
| Sex | |
| Primary site | The origination site of the tumor |

4 PREDICTION MODEL ANALYSIS

4.1 C4.5
J48 in WEKA refers to Quinlan's C4.5 algorithm with optional pruning. J48 builds decision trees from a set of labeled training data using the concept of information entropy. To split the data at each stage of tree construction, a test is performed to select the attribute that yields the lowest entropy. Information gain (IG), shown in equation (3), measures the reduction in entropy (H) with respect to the class attribute (C):

$$H(C) = -\sum_{c \in C} p(c) \log p(c) \tag{1}$$

$$H(C \mid X_i) = -\sum_{x \in X_i} p(x) \sum_{c \in C} p(c \mid x) \log p(c \mid x) \tag{2}$$

$$IG_i = H(C) - H(C \mid X_i) \tag{3}$$

[1]

In survivability analysis, C is the survivability class. The attribute on which a branch is formed is chosen for its low entropy value and high information gain value.
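Equations (1) to (3) can be checked numerically with a small sketch. The toy attribute values below are invented for illustration, and the base-2 logarithm is an assumption, since the text does not state the base of the logarithm.

```python
import math
from collections import Counter


def entropy(labels):
    """H(C): entropy of the class labels, in bits (base-2 log is an assumption)."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())


def information_gain(values, labels):
    """IG_i = H(C) - H(C | X_i) for one attribute X_i."""
    total = len(labels)
    conditional = 0.0
    for value in set(values):
        subset = [c for x, c in zip(values, labels) if x == value]
        conditional += (len(subset) / total) * entropy(subset)
    return entropy(labels) - conditional


# Hypothetical toy data: four patients, two candidate attributes.
survived = ["yes", "yes", "no", "no"]
grade = ["low", "low", "high", "high"]            # separates the classes perfectly
laterality = ["left", "right", "left", "right"]   # carries no class information

gain_grade = information_gain(grade, survived)
gain_laterality = information_gain(laterality, survived)
print(gain_grade, gain_laterality)
```

On this toy data the perfectly separating attribute obtains a gain of one bit and the uninformative attribute a gain of zero, so a ranker based on information gain would place the first attribute ahead of the second.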
In each iteration, if an entropy value is found to be higher than in the previous iteration, the tree is pruned. Tree pruning removes the branch with the high entropy value. C4.5's accurate performance is attributed to its ability to split continuous attributes [7]. Each leaf in a decision tree constructed using C4.5 corresponds to a rule.

4.2 Naïve Bayes
The Naïve Bayes model is a classical data mining algorithm. It is commonly used to solve prediction problems because of its ease of implementation and use. At the same time, its simplicity does not undermine its robustness and effectiveness. Text classification is one of its most common applications. Over the years, it has undergone various improvements that are reflected not only in its data mining capabilities but also in its machine learning pattern recognition.

4.3 Multi-layer perceptron neural network
To solve this non-linear classification problem, a multi-layer perceptron with back-propagation learning was employed. The network was divided into input, hidden and output layers. The number of input neurons was determined by the attributes, and the number of output neurons by the possible values of the class attribute. The initial connection weights were assigned randomly, and the sigmoid function was used as the activation function to process the input at each layer and pass it to the next layer:

$$f(v) = \frac{1}{1 + e^{-v}} \tag{4}$$

where v is the weighted sum of the input nodes.

Figure 1. Neural network architecture

Five main steps are used in the learning phase of a back-propagation neural network. These steps are repeated until the propagated error is small enough.
1. Randomly assign weights to the network
2. Feed-forward computation of the activation function
3. Back-propagate the error function to the output layer
4. Back-propagate the error function to the hidden layers
5. Change the weights accordingly

5 CLASSIFIER'S EVALUATION
The Experimenter module in WEKA shows both text and graphical representations of the results. In addition, each classifier can have its own graphical representations, such as the decision tree in C4.5. The performance parameters common to all models are the number of correctly classified instances, the number of incorrectly classified instances, and the kappa statistic, which measures the agreement of the prediction with the actual classes. There are also error measurements such as root mean squared error, mean absolute error, relative absolute error and root relative squared error.
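The four error measurements named above can be sketched as follows, using their usual textbook definitions. The sketch assumes that each prediction is the estimated probability of the positive class and that the relative measures use the mean of the actual values as the reference predictor; the sample values are hypothetical, and results are returned as fractions rather than percentages.

```python
import math


def error_measures(predicted, actual):
    """Mean absolute error, root mean squared error, relative absolute error
    and root relative squared error for paired predictions and targets."""
    n = len(actual)
    mean_actual = sum(actual) / n
    abs_errors = [abs(p - a) for p, a in zip(predicted, actual)]
    sq_errors = [(p - a) ** 2 for p, a in zip(predicted, actual)]
    # Reference errors of a predictor that always outputs the mean target.
    base_abs = sum(abs(a - mean_actual) for a in actual)
    base_sq = sum((a - mean_actual) ** 2 for a in actual)
    return {
        "mae": sum(abs_errors) / n,
        "rmse": math.sqrt(sum(sq_errors) / n),
        "rae": sum(abs_errors) / base_abs,
        "rrse": math.sqrt(sum(sq_errors) / base_sq),
    }


# Hypothetical targets (1 = positive class) and predicted probabilities.
actual = [1.0, 0.0, 1.0, 0.0]
predicted = [0.5, 0.5, 1.0, 0.0]

measures = error_measures(predicted, actual)
print(measures)
```

A relative measure below 1 means the classifier does better than always predicting the mean, which is why these two measures are convenient when comparing models trained on the same data.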
5.1 Confusion Matrix
A confusion matrix tabulates the results of a classifier as true positives, true negatives, false positives and false negatives, and is built to interpret those results. The upper row of a confusion matrix holds the number of instances classified for the positive class and the lower row those for the negative class. The true positive cell counts the instances that are correctly classified as positive, whereas the false positive cell counts the instances that are incorrectly classified as positive; the true negative and false negative cells count the instances that are correctly and incorrectly classified as negative, respectively (see Table 3).

5.2 Kappa Statistic
The kappa statistic indicates the agreement between the predicted values and the actual values beyond what would be expected by chance. It measures the pair-wise agreement between different observed values [6]. Based on the values in the confusion matrix, a kappa value of 1 indicates complete agreement, and a kappa value between 0.61 and 0.80 indicates substantial agreement. The built classifiers are expected to have a kappa value greater than zero, indicating that the predicted classification is not the result of random chance.

5.3 Performance Measures
Three main performance measures are used: accuracy, sensitivity and specificity, defined respectively as:

$$\text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN} \tag{5}$$

$$\text{Sensitivity} = \frac{TP}{TP + FN} \tag{6}$$

$$\text{Specificity} = \frac{TN}{TN + FP} \tag{7}$$

where TP is true positive, TN true negative, FP false positive and FN false negative. These three measures are used for this binary classification problem as follows. Sensitivity is the proportion of patients who actually survived (true positives and false negatives) that were correctly classified as survived.
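As a worked illustration, the sketch below derives accuracy, sensitivity, specificity and the kappa statistic from the four cells of a confusion matrix. The counts are hypothetical and are not results from this study; kappa is computed with the standard Cohen formula, (observed agreement - chance agreement) / (1 - chance agreement).

```python
def binary_metrics(tp, fn, fp, tn):
    """Accuracy, sensitivity, specificity and Cohen's kappa from a
    2x2 confusion matrix."""
    total = tp + fn + fp + tn
    accuracy = (tp + tn) / total
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    # Agreement expected by chance: product of actual and predicted
    # class proportions, summed over both classes.
    chance_pos = ((tp + fn) / total) * ((tp + fp) / total)
    chance_neg = ((fp + tn) / total) * ((fn + tn) / total)
    expected = chance_pos + chance_neg
    kappa = (accuracy - expected) / (1 - expected)
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "kappa": kappa,
    }


# Hypothetical counts for 100 patients.
metrics = binary_metrics(tp=40, fn=10, fp=5, tn=45)
print(metrics)
```

With these counts the kappa value falls inside the 0.61 to 0.80 band mentioned above, even though the raw accuracy is noticeably higher, because kappa discounts the agreement a random labelling would already achieve.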
Specificity is the proportion of patients who did not survive (true negatives and false positives) that were correctly classified as not survived. Accuracy is the proportion of correctly classified instances out of all instances (true positives, true negatives, false positives and false negatives). Sensitivity is also referred to as the true positive rate (TPR) and specificity as the true negative rate (TNR). Accordingly, the false negative rate equals 1 - TPR and the false positive rate (FPR) equals 1 - TNR. Receiver operating characteristic (ROC) analysis is widely used for diagnosis in the medical field. A ROC graph compares classifiers by plotting the TP rate (sensitivity) on the vertical axis against the FP rate (1 - specificity) on the horizontal axis. Classifier quality is assessed by the area under the curve: the larger the area under a classifier's ROC curve, the better the classifier.

6 RESULTS
The three models were built using an 80% split and evaluated using 10-fold cross-validation. Decision trees had the highest accuracy and sensitivity, with [value missing] and [value missing] respectively. Neural networks had the highest specificity, with [value missing]. Table 3 shows the detailed performance of the three models. The ROC area was lowest for Naïve Bayes, with an area of 0.873 (see Figure 2), and highest for decision trees, with a ROC (receiver operating characteristic) area of [value missing] (see Figure 3). The multi-layer perceptron had a ROC area of [value missing] (see Figure 4).

Table 3. Tabular results of the models' performance (columns: Accuracy, Sensitivity, Specificity, Mean Error, Kappa Statistic; rows: Neural Network, Decision Tree, Naïve Bayes; cell values missing)

Figure 2. ROC curve for Naïve Bayes after 10-fold cross-validation

Figure 3. ROC curve for decision tree after 10-fold cross-validation

Figure 4. ROC curve for multi-layer perceptron after 10-fold cross-validation
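The ROC area reported for each model can be understood through a small sketch that builds the ROC points from ranked classifier scores and integrates them with the trapezoidal rule. The scores and labels are hypothetical, and this is a generic construction rather than the exact routine WEKA uses.

```python
def roc_points(scores, labels):
    """Return (FP rate, TP rate) points obtained by lowering the decision
    threshold one distinct score at a time (label 1 = positive class)."""
    positives = sum(labels)
    negatives = len(labels) - positives
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    points = [(0.0, 0.0)]
    tp = fp = 0
    i = 0
    while i < len(ranked):
        threshold = ranked[i][0]
        # Consume every instance tied at this score before emitting a point.
        while i < len(ranked) and ranked[i][0] == threshold:
            if ranked[i][1] == 1:
                tp += 1
            else:
                fp += 1
            i += 1
        points.append((fp / negatives, tp / positives))
    return points


def area_under_curve(points):
    """Trapezoidal area under a piecewise-linear ROC curve."""
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(points, points[1:]))


# Hypothetical predicted probabilities and true labels.
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
labels = [1, 1, 0, 1, 0, 0]

points = roc_points(scores, labels)
area = area_under_curve(points)
print(points, area)
```

The resulting area equals the fraction of positive-negative pairs in which the positive instance receives the higher score, which is why an area of 0.5 corresponds to random ranking and an area of 1 to perfect separation.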
7 CONCLUSION
The high rate of low survivability following a breast cancer diagnosis is the motivation behind this research, especially given the noticeable scarcity of reports and data mining projects complementing the clinical research under way in Saudi Arabia. Results from predictive models are of little use without analysis and feedback from those in the field, who decide whether the results obtained are logical and whether they represent novel findings in the medical field. Thus, data mining and the medical domain are two integrated areas that complement each other. Although data mining is becoming a complementary tool for much clinical research in the medical and bioinformatics fields, there are still limitations that cannot be ignored. Total dependence on the automation of data mining is not always feasible, and human interpretation is needed to explore the extracted knowledge. This research can be extended in different ways to increase its usefulness and effectiveness. First, an ensemble of predictive models can be implemented, which could increase accuracy and help introduce novel data mining techniques. Also, the scope of survivability could be expanded beyond breast cancer to include comparisons of survivability rates among different cancer types.

8 REFERENCES
[1] Bellaachia and Guven. Predicting cancer survivability using data mining technique. George Washington University, Department of Computer Science.
[2] Delen, D. et al. Predicting Breast Cancer Survivability: a Comparison of Three Data Mining Methods. Artificial Intelligence in Medicine, 34(2).
[3] Scharber. Evaluation of Open Source Text Mining Tools for Cancer Surveillance: Phase I: Understanding text mining and identifying tools. NPCR-AERRO Technical Development Team.
[4] Zhou, Z. and Jiang, Y. (2003). Medical diagnosis with C4.5 rule preceded by artificial neural network ensemble. Information Technology in Biomedicine, 7(1).
[5] Shah, S. and Kusiak, A. (2007). Cancer Gene Search with Data Mining and Genetic Algorithms. Computers in Biology and Medicine, 37.
[6] Jonsdottir, T. et al. (2008). The Feasibility of Constructing a Predictive Outcome Model for Breast Cancer using the Tools of Data Mining. Expert Systems with Applications, 34.
[7] Quinlan, J. R. (1996). Learning Decision Tree Classifiers. ACM Comput. Surv., 28(1).
More informationA Fuzzy Improved Neural based Soft Computing Approach for Pest Disease Prediction
International Journal of Information & Computation Technology. ISSN 0974-2239 Volume 4, Number 13 (2014), pp. 1335-1341 International Research Publications House http://www. irphouse.com A Fuzzy Improved
More informationCredal decision trees in noisy domains
Credal decision trees in noisy domains Carlos J. Mantas and Joaquín Abellán Department of Computer Science and Artificial Intelligence University of Granada, Granada, Spain {cmantas,jabellan}@decsai.ugr.es
More informationAn Experimental Study of Diabetes Disease Prediction System Using Classification Techniques
IOSR Journal of Computer Engineering (IOSR-JCE) e-issn: 2278-0661,p-ISSN: 2278-8727, Volume 19, Issue 1, Ver. IV (Jan.-Feb. 2017), PP 39-44 www.iosrjournals.org An Experimental Study of Diabetes Disease
More informationA Comparison of Collaborative Filtering Methods for Medication Reconciliation
A Comparison of Collaborative Filtering Methods for Medication Reconciliation Huanian Zheng, Rema Padman, Daniel B. Neill The H. John Heinz III College, Carnegie Mellon University, Pittsburgh, PA, 15213,
More informationYeast Cells Classification Machine Learning Approach to Discriminate Saccharomyces cerevisiae Yeast Cells Using Sophisticated Image Features.
Yeast Cells Classification Machine Learning Approach to Discriminate Saccharomyces cerevisiae Yeast Cells Using Sophisticated Image Features. Mohamed Tleis Supervisor: Fons J. Verbeek Leiden University
More informationSVM-Kmeans: Support Vector Machine based on Kmeans Clustering for Breast Cancer Diagnosis
SVM-Kmeans: Support Vector Machine based on Kmeans Clustering for Breast Cancer Diagnosis Walaa Gad Faculty of Computers and Information Sciences Ain Shams University Cairo, Egypt Email: walaagad [AT]
More informationA prediction model for type 2 diabetes using adaptive neuro-fuzzy interface system.
Biomedical Research 208; Special Issue: S69-S74 ISSN 0970-938X www.biomedres.info A prediction model for type 2 diabetes using adaptive neuro-fuzzy interface system. S Alby *, BL Shivakumar 2 Research
More informationGene Selection for Tumor Classification Using Microarray Gene Expression Data
Gene Selection for Tumor Classification Using Microarray Gene Expression Data K. Yendrapalli, R. Basnet, S. Mukkamala, A. H. Sung Department of Computer Science New Mexico Institute of Mining and Technology
More informationPrediction of Malignant and Benign Tumor using Machine Learning
Prediction of Malignant and Benign Tumor using Machine Learning Ashish Shah Department of Computer Science and Engineering Manipal Institute of Technology, Manipal University, Manipal, Karnataka, India
More informationA Naïve Bayesian Classifier for Educational Qualification
Indian Journal of Science and Technology, Vol 8(16, DOI: 10.17485/ijst/2015/v8i16/62055, July 2015 ISSN (Print : 0974-6846 ISSN (Online : 0974-5645 A Naïve Bayesian Classifier for Educational Qualification
More informationConsumer Review Analysis with Linear Regression
Consumer Review Analysis with Linear Regression Cliff Engle Antonio Lupher February 27, 2012 1 Introduction Sentiment analysis aims to classify people s sentiments towards a particular subject based on
More informationDottorato di Ricerca in Statistica Biomedica. XXVIII Ciclo Settore scientifico disciplinare MED/01 A.A. 2014/2015
UNIVERSITA DEGLI STUDI DI MILANO Facoltà di Medicina e Chirurgia Dipartimento di Scienze Cliniche e di Comunità Sezione di Statistica Medica e Biometria "Giulio A. Maccacaro Dottorato di Ricerca in Statistica
More informationContents. Just Classifier? Rules. Rules: example. Classification Rule Generation for Bioinformatics. Rule Extraction from a trained network
Contents Classification Rule Generation for Bioinformatics Hyeoncheol Kim Rule Extraction from Neural Networks Algorithm Ex] Promoter Domain Hybrid Model of Knowledge and Learning Knowledge refinement
More informationJ2.6 Imputation of missing data with nonlinear relationships
Sixth Conference on Artificial Intelligence Applications to Environmental Science 88th AMS Annual Meeting, New Orleans, LA 20-24 January 2008 J2.6 Imputation of missing with nonlinear relationships Michael
More informationFeature selection methods for early predictive biomarker discovery using untargeted metabolomic data
Feature selection methods for early predictive biomarker discovery using untargeted metabolomic data Dhouha Grissa, Mélanie Pétéra, Marion Brandolini, Amedeo Napoli, Blandine Comte and Estelle Pujos-Guillot
More informationCognitive Neuroscience History of Neural Networks in Artificial Intelligence The concept of neural network in artificial intelligence
Cognitive Neuroscience History of Neural Networks in Artificial Intelligence The concept of neural network in artificial intelligence To understand the network paradigm also requires examining the history
More informationChapter 1. Introduction
Chapter 1 Introduction 1.1 Motivation and Goals The increasing availability and decreasing cost of high-throughput (HT) technologies coupled with the availability of computational tools and data form a
More informationAn Empirical and Formal Analysis of Decision Trees for Ranking
An Empirical and Formal Analysis of Decision Trees for Ranking Eyke Hüllermeier Department of Mathematics and Computer Science Marburg University 35032 Marburg, Germany eyke@mathematik.uni-marburg.de Stijn
More informationVariable Features Selection for Classification of Medical Data using SVM
Variable Features Selection for Classification of Medical Data using SVM Monika Lamba USICT, GGSIPU, Delhi, India ABSTRACT: The parameters selection in support vector machines (SVM), with regards to accuracy
More informationarxiv: v2 [cs.cv] 8 Mar 2018
Automated soft tissue lesion detection and segmentation in digital mammography using a u-net deep learning network Timothy de Moor a, Alejandro Rodriguez-Ruiz a, Albert Gubern Mérida a, Ritse Mann a, and
More informationFuzzy Decision Tree FID
Fuzzy Decision Tree FID Cezary Z. Janikow Krzysztof Kawa Math & Computer Science Department Math & Computer Science Department University of Missouri St. Louis University of Missouri St. Louis St. Louis,
More informationCHAPTER 6 HUMAN BEHAVIOR UNDERSTANDING MODEL
127 CHAPTER 6 HUMAN BEHAVIOR UNDERSTANDING MODEL 6.1 INTRODUCTION Analyzing the human behavior in video sequences is an active field of research for the past few years. The vital applications of this field
More informationClassification of Mammograms using Gray-level Co-occurrence Matrix and Support Vector Machine Classifier
Classification of Mammograms using Gray-level Co-occurrence Matrix and Support Vector Machine Classifier P.Samyuktha,Vasavi College of engineering,cse dept. D.Sriharsha, IDD, Comp. Sc. & Engg., IIT (BHU),
More informationCHAPTER 5 WAVELET BASED DETECTION OF VENTRICULAR ARRHYTHMIAS WITH NEURAL NETWORK CLASSIFIER
57 CHAPTER 5 WAVELET BASED DETECTION OF VENTRICULAR ARRHYTHMIAS WITH NEURAL NETWORK CLASSIFIER 5.1 INTRODUCTION The cardiac disorders which are life threatening are the ventricular arrhythmias such as
More informationA Feed-Forward Neural Network Model For The Accurate Prediction Of Diabetes Mellitus
A Feed-Forward Neural Network Model For The Accurate Prediction Of Diabetes Mellitus Yinghui Zhang, Zihan Lin, Yubeen Kang, Ruoci Ning, Yuqi Meng Abstract: Diabetes mellitus is a group of metabolic diseases
More informationClassification of breast cancer using Wrapper and Naïve Bayes algorithms
Journal of Physics: Conference Series PAPER OPEN ACCESS Classification of breast cancer using Wrapper and Naïve Bayes algorithms To cite this article: I M D Maysanjaya et al 2018 J. Phys.: Conf. Ser. 1040
More informationA scored AUC Metric for Classifier Evaluation and Selection
A scored AUC Metric for Classifier Evaluation and Selection Shaomin Wu SHAOMIN.WU@READING.AC.UK School of Construction Management and Engineering, The University of Reading, Reading RG6 6AW, UK Peter Flach
More informationDetection of Lung Cancer Using Backpropagation Neural Networks and Genetic Algorithm
Detection of Lung Cancer Using Backpropagation Neural Networks and Genetic Algorithm Ms. Jennifer D Cruz 1, Mr. Akshay Jadhav 2, Ms. Ashvini Dighe 3, Mr. Virendra Chavan 4, Prof. J.L.Chaudhari 5 1, 2,3,4,5
More informationAutomatic Detection of Epileptic Seizures in EEG Using Machine Learning Methods
Automatic Detection of Epileptic Seizures in EEG Using Machine Learning Methods Ying-Fang Lai 1 and Hsiu-Sen Chiang 2* 1 Department of Industrial Education, National Taiwan Normal University 162, Heping
More informationEfficient Classification of Lung Tumor using Neural Classifier
Efficient Classification of Lung Tumor using Neural Classifier Mohd.Shoeb Shiraj 1, Vijay L. Agrawal 2 PG Student, Dept. of EnTC, HVPM S College of Engineering and Technology Amravati, India Associate
More informationOvarian Cancer Classification Using Hybrid Synthetic Minority Over-Sampling Technique and Neural Network
Journal of Advances in Computer Research Quarterly pissn: 2345-606x eissn: 2345-6078 Sari Branch, Islamic Azad University, Sari, I.R.Iran (Vol. 7, No. 4, November 2016), Pages: 109-124 www.jacr.iausari.ac.ir
More informationA Predication Survival Model for Colorectal Cancer
A Predication Survival Model for Colorectal Cancer PROF. DR. SHERIF KASSEM FATHY Information System Department, College of Computer and Information Technology King Faisal University SAUDI ARABIA Sherif_kassem@kfu.edu.com;
More informationApplication of Artificial Neural Network-Based Survival Analysis on Two Breast Cancer Datasets
Application of Artificial Neural Network-Based Survival Analysis on Two Breast Cancer Datasets Chih-Lin Chi a, W. Nick Street b, William H. Wolberg c a Health Informatics Program, University of Iowa b
More informationMulti Parametric Approach Using Fuzzification On Heart Disease Analysis Upasana Juneja #1, Deepti #2 *
Multi Parametric Approach Using Fuzzification On Heart Disease Analysis Upasana Juneja #1, Deepti #2 * Department of CSE, Kurukshetra University, India 1 upasana_jdkps@yahoo.com Abstract : The aim of this
More informationComparative Analysis of Predictive Models for the Likelihood of Infertility in Women Using Supervised Machine Learning Techniques
Comparative Analysis of Predictive Models for the Likelihood of Infertility in Women Using Supervised Machine Learning Techniques Jeremiah Ademola Balogun, Ngozi Chidozie Egejuru, and Peter Adebayo Idowu
More informationINTERNATIONAL JOURNAL OF ENGINEERING SCIENCES & RESEARCH TECHNOLOGY
IJESRT INTERNATIONAL JOURNAL OF ENGINEERING SCIENCES & RESEARCH TECHNOLOGY A Medical Decision Support System based on Genetic Algorithm and Least Square Support Vector Machine for Diabetes Disease Diagnosis
More informationApplying Data Mining for Epileptic Seizure Detection
Applying Data Mining for Epileptic Seizure Detection Ying-Fang Lai 1 and Hsiu-Sen Chiang 2* 1 Department of Industrial Education, National Taiwan Normal University 162, Heping East Road Sec 1, Taipei,
More informationPredictive performance and discrimination in unbalanced classification
MASTER Predictive performance and discrimination in unbalanced classification van der Zon, S.B. Award date: 2016 Link to publication Disclaimer This document contains a student thesis (bachelor's or master's),
More informationEarly Detection of Dengue Using Machine Learning Algorithms
Volume 118 No. 18 2018, 3881-3887 ISSN: 1311-8080 (printed version); ISSN: 1314-3395 (on-line version) url: http://www.ijpam.eu ijpam.eu Early Detection of Dengue Using Machine Learning Algorithms 1 N.Rajathi,
More informationSix Sigma Glossary Lean 6 Society
Six Sigma Glossary Lean 6 Society ABSCISSA ACCEPTANCE REGION ALPHA RISK ALTERNATIVE HYPOTHESIS ASSIGNABLE CAUSE ASSIGNABLE VARIATIONS The horizontal axis of a graph The region of values for which the null
More informationKeywords Missing values, Medoids, Partitioning Around Medoids, Auto Associative Neural Network classifier, Pima Indian Diabetes dataset.
Volume 7, Issue 3, March 2017 ISSN: 2277 128X International Journal of Advanced Research in Computer Science and Software Engineering Research Paper Available online at: www.ijarcsse.com Medoid Based Approach
More informationIntroduction to Machine Learning. Katherine Heller Deep Learning Summer School 2018
Introduction to Machine Learning Katherine Heller Deep Learning Summer School 2018 Outline Kinds of machine learning Linear regression Regularization Bayesian methods Logistic Regression Why we do this
More informationEnhanced Detection of Lung Cancer using Hybrid Method of Image Segmentation
Enhanced Detection of Lung Cancer using Hybrid Method of Image Segmentation L Uma Maheshwari Department of ECE, Stanley College of Engineering and Technology for Women, Hyderabad - 500001, India. Udayini
More information