AN EXPERT SYSTEM FOR THE DIAGNOSIS OF DIABETIC PATIENTS USING DEEP NEURAL NETWORKS AND RECURSIVE FEATURE ELIMINATION

International Journal of Civil Engineering and Technology (IJCIET) Volume 8, Issue 12, December 2017, pp. 633 641, Article ID: IJCIET_08_12_069 Available online at http://http://www.iaeme.com/ijciet/issues.asp?jtype=ijciet&vtype=8&itype=12 ISSN Print: 0976-6308 and ISSN Online: 0976-6316 IAEME Publication Scopus Indexed AN EXPERT SYSTEM FOR THE DIAGNOSIS OF DIABETIC PATIENTS USING DEEP NEURAL NETWORKS AND RECURSIVE FEATURE ELIMINATION J. Vijayashree, J. Jayashree School of Computer Science and Engineering, VIT, Vellore, India ABSTRACT The aim of this study is to develop an expert system for the diagnosis of diabetic patients. PIMA Indian Diabetes Dataset is used by the proposed system. The proposed system uses feature reduction techniques like Recursive feature elimination and principal component analysis for enhancing the performance. The expert system is built using Deep Neural Networks and Artificial Neural Networks. This study also compares the performance of proposed deep neural network with artificial neural networks using various performance measures: accuracy, sensitivity, specificity. Key words: Artificial Neural Networks, Deep Neural Networks, Diabetes mellitus, Feature reduction. Cite this Article: J. Vijayashree, J. Jayashree. An Expert System for the Diagnosis of Diabetic Patients using Deep Neural Networks and Recursive Feature Elimination. International Journal of Civil Engineering and Technology, 8(12), 2017, pp. 633-641. http://www.iaeme.com/ijciet/issues.asp?jtype=ijciet&vtype=8&itype=12 1. INTRODUCTION Diabetes Mellitus (DM) is a chronic disease which increases the concentration of glucose in blood. This is because of two reasons; one the pancreas inability to produce enough insulin and the other is the inadequacy of body to use the insulin. Type1 or insulin dependent or juvenile onset diabetes is due to lack of insulin and Type2 or insulin independent or adult onset diabetes is due to the inefficiency of body to use insulin. Gestational diabetes is the third type of diabetes caused during pregnancy due to high glucose level in blood. This type of diabetes disappears after delivery but the mother and child are prone to type2 diabetes in future. According to International Diabetes Federation (IDE) statistics 2015, South East Asia region stands second next to Western Pacific region in diabetes. India, Bangladesh, Bhutan, Sri Lanka, Nepal, Mauritius and Maldives are the seven countries belonging to South East Asia region. About 8.5% that is 78.3 million inhabitants suffer with diabetes of which 52.1% are undiagnosed and 1.2 million deaths occur every year due to diabetes. http://www.iaeme.com/ijciet/index.asp 633 editor@iaeme.com

J. Vijayashree, J. Jayashree According to 2014 estimates of World Health Organization, the population suffering with diabetes has escalated from 108 million in 1980 to 422 million in 2014 and the universal frequency of diabetes amongst adults has escalated from 4.7% in 1980 to 8.5% in 2014. Also in 2030 diabetes will be the 7 th leading cause of death. Family history, unhealthy eating, lack of exercise, increased weight, high blood pressure and history of gestational diabetes are the risk factors which influences diabetes. Most substantial indicators of diabetes are frequent urination, excessive thirst, weight loss and lack of energy. Diabetes may affect other parts of the body like blood vessels, eyes, kidneys and nerves which lead to other health issues like cardiovascular disease, blindness (diabetic retinopathy), kidney failure (diabetic nephropathy) and lower limb amputation (diabetic neuropathy). Diabetes affects different parts of the body leading to other diseases like heart attack, kidney failure, loss of vision and foot ulcers. Herat attack and kidney failure are the major cause of death worldwide. So it is necessary to diagnose diabetes at an early stage to avoid unnecessary deaths. If diabetes is diagnosed at premature phase, it can be meticulously controlled by means of proper diet, insulin, regular exercise and medication. Large amount of patient records were available in the hospitals using which medical practitioners find useful patterns. Classification, prediction and diagnosis of the diseases can be done with the help of patterns found in the data set. Evaluations of patient s record and medical expert s decision are the most important factors in medical diagnosis. Nowadays expert systems, artificial intelligent techniques and classification systems play a major role in helping medical experts to diagnose the disease like tumors, cancers, hepatitis, lung diseases, etc. in an efficient way. 2. LITERATURE REVIEW For diabetes diagnosis, T. Santhanam and M.S Padmavathi (2015) proposed the application of K-Means and genetic algorithm and Support vector machine. K-Means for removing noisy data, genetic algorithm for feature selection and Support vector machine for classification. Nongyao Nai-arun and Rungruttikarn Moungmai (2015) conducted a study to perform a comparison between the performances of five classifiers; decision Tree, Artifcial Neural Networks, Logistic Regression, Naïve Bayes and Random Forest. Then to improve the robustness of these classifiers bagging and boosting techniques were applied. The study concluded that random forest algorithm is the best. Performance analysis of four prediction models; J48 Decision Tree, K-Nearest Neighbors, Random Forest and Support Vector Machines has been done by J. Pradeep Kandhasamy*, S. Balamurali (2015) under two situations. One with noisy data and theother without noisy data. Analysis revealed that prediction models KNN and Random Forest produces good results after removing noisy data. To improve the accuracy of diabetes diagnosis, Kemal Polat, Salih Gunes (2007) combined Principal Component Analysis (PCA) and Adaptive neuro-fuzzy inference system (ANFIS).PCA for feature selection and ANFIS for diabetes disease diagnosis. Lukmanto (2015) proposed fuzzy hierarchical model for early detection of diabetes mellitus. In order to overcome the crisp boundary problem in traditional Desition Tree, varma, kamadi VSRP.et.al used Fuzzy decision boundaries and proposed a modified Gini Index- Gaussian fuzzy Decision Tree algorithm where split points are identified using gini index. http://www.iaeme.com/ijciet/index.asp 634 editor@iaeme.com

An Expert System for the Diagnosis of Diabetic Patients using Deep Neural Networks and Recursive Feature Elimination Patil (2010) proposed a Hybrid Prediction model to predict type-2 diabetic patients. This model uses K-Means clustering algorithm to remove the incorrectly classified instances. Then classification is performed by constructing the decision tree using C4.5 algorithm. M. Durairaj, G. Kalaiselvi (2015) developed diabetes prediction model using three algorithms; Naïve Bayes, Artificial Neural Networks & K_Nearest Neighbors and their efficiency was calculated. The result indicate that ANN was the best followed by Naïve Bayes and then KNN. M. Durairaj, G. Kalaiselvi (2015) reviewed various data mining techniques like Artificial Neural Network, Support Vector Machine, K-Nearest Neighbors & C4.5 and their applications in different medical data sets. Review results showed that Artificial neural networks has got the highest predicted accuracy among the four data mining techniques compared. 3. MATERIALS AND METHODS 3.1. Dataset The benchmark dataset used in this study is the Pima Indian Diabetes Dataset from University of California, UCI repository of machine learning databases. The dataset contains 768 instances where each instance contains 8 attributes and 1 class variable. The values of all the 8 attributes are of numeric type and the class variable has two values 1 and 0. Class value 1 is inferred as tested positive for diabetes and class value 0 is inferred as tested negative for diabetes. 3.2. Proposed Work Figure 1 Process flow of proposed model http://www.iaeme.com/ijciet/index.asp 635 editor@iaeme.com

J. Vijayashree, J. Jayashree Figure 1 shows the process flow of the proposed method. The proposed method uses PIMA Indian diabetes dataset. The dataset contains some missing values. In order to remove the missing values pre-processing is done by filling the missing values using null value. The dataset consists of eight features and all the eight features may not have utmost importance in diagnosing the disease. To find the best features that could diagnose the disease feature reduction techniques like recursive feature elimination and principal component analysis is used. After feature reduction only four features were selected and they form the input to the classification networks. In this paper, the proposed deep neural network is compared with the artificial neural networks for accuracy, sensitivity and specificity. 3.3. Feature Reduction Strategies Data and learning algorithm are the two main characteristics that distress the performance of a classifier. Problems with regard to data are either too much data or to little data or noisy data. Problems with regard to the choice of learning algorithm may affect the learned results, predictive accuracy, speed of learning and comprehensibility. In feature reduction both data and the classifier are considered to overcome the hindrances caused by data as well as to learn efficiently. Feature selection deals with the problems regard to data. Feature selection is a process that chooses an optimal subset of features according to a certain criterion (Liu and Motoda, 1998). It removes redundant or irrelevant features from the dataset which can reduce the classification accuracy and clustering quality. The main advantage of feature selection is to reduce the cost and complexity, and to improve the accuracy and visualization of the classification model (Sebban and Nock (2002). Feature selection method is of three types based on how they combine selection algorithm and model construction; Filter, Wrapper and Embedded method. Filter method selects a subset of variables independent of the later applied machine learning algorithm that shall subsequently use them. Wrapper method is a feedback method that selects a subset of variables by incorporating the machine learning algorithm in feature selection process. Whereas in embedded method the feature selection process is built into the machine learning algorithm itself. 3.3.1. Recursive Feature Elimination algorithm aims to find the best performing feature subset. First the model is trained on the initial set of features and weights are assigned to each one of the features. Then the feature whose weight is smallest among all is detached from the initial of features. Again re-compute the weight on the existing set of features. This procedure is recursively repeated until the desired number of features is reached which is the stop criterion. Recursive feature elimination works as follows. Step 1 - Train the model using initial set of features Step 2 - Compute weight (w) on all the features in initial set Step 3 - Detach the feature with the smallest weight (wi) from the initial set Step 4 - Train the model using existing set of features Step 5- Re-compute weight on the features in the existing set Step 6 - Detach the feature with the smallest weight from the existing set Step 7 - Until stop criterion is met go to step 4 3.3.2. Principal Component Analysis Principal component analysis method is for identifying the important directions in the data. This can rotate data into (reduced) coordinate system that is given by those directions. Find http://www.iaeme.com/ijciet/index.asp 636 editor@iaeme.com

An Expert System for the Diagnosis of Diabetic Patients using Deep Neural Networks and Recursive Feature Elimination direction (axis) of greatest variance. Find direction of greatest variance that is perpendicular to previous direction and repeat. Implementation: find eigenvectors of covariance matrix by diagonalization. Eigenvectors (sorted by eigenvalues) are the directions. 3.4. Classification Techniques 3.4.1. Artificial Neural Networks Artificial neural Network refers to the modelling of the brain in two aspects which are functions and structure. The models are composed of many computing elements, usually denoted by neurons; each neuron has a number of input and output as in figure.2. The input to the neuron is obtained as the sum of the synaptic weight of the input. The input value is compared to the threshold value associated with the neuron to determine the activation signal of the neuron. This activation signal is then passed through an activation function to produce the output of the neuron. Sigmoid activation function was chosen for this research work due to its soft switching operation and also it helps artificial neural network to represent a more complex problem. 3.4.2. Deep Neural Networks Figure 2 Artificial Neural Network Structure Figure 3 Deep Neural Networks Structure http://www.iaeme.com/ijciet/index.asp 637 editor@iaeme.com

J. Vijayashree, J. Jayashree A deep neural network (DNN) in figure.3 is an ANN with multiple hidden layers of units between the input and output layers. Similar to shallow ANNs, DNNs can model complex non-linear relationships. The extra layers enable composition of features from lower layers, giving the potential of modelling complex data with fewer units than a similarly performing shallow network. DNNs are prone to over fitting because of the added layers of abstraction, which allow them to model rare dependencies in the training data. 4. PERFORMANCE EVALUATION OF THE PROPOSED SYSTEM The diagnostic performance is usually evaluated in terms of confusion matrix, classification accuracy, sensitivity, specificity, precision, recall, F- score, ROC. Confusion matrix is given in table 1.The results of various performance measures is displayed in table 2 and the comparison of Precision, Recall and F-score performance measures is given in table 3. Table 1 Confusion Matrices PCA Predicted Predicted Actual P N Actual P N P 36 14 P 38 19 N 12 88 N 17 81 ANN with ANN with PCA Predicted Predicted Actual P N Actual P N P 35 16 P 19 30 N 18 90 N 12 79 Performance Measures Accuracy Sensitivity Specificity Table 2 Results of Various Performance Measures PCA ANN with ANN with PCA 82.67% 76.77% 78.62% 70% 72% 66.67% 68.63% 38.78% 88% 82.65% 83.33% 86.81% Positive Likelihood Ratio 6.00 3.84 4.12 2.94 Negative Likelihood Ratio 0.32 0.40 0.38 0.71 Positively Predicted Value 75% 69.09% 66.04% 61.29% Negatively Predicted Value 86.27% 81% 84.91% 72.48% PCA ANN with ANN with PCA Table 3 Comparison of Precision, Recall and F-score performance measures Precision Recall F-Score Tested negative for diabetes 0.86 0.88 0.87 Tested positive for diabetes 0.75 0.72 0.73 Precision Recall F-Score Tested negative for diabetes 0.81 0.82 0.81 Tested positive for diabetes 0.69 0.67 0.68 Precision Recall F-Score Tested negative for diabetes 0.85 0.83 0.84 Tested positive for diabetes 0.66 0.69 0.67 Precision Recall F-Score Tested Negative for diabetes 0.72 0.87 0.79 Tested positive for diabetes 0.61 0.39 0.47 http://www.iaeme.com/ijciet/index.asp 638 editor@iaeme.com

Accuracy An Expert System for the Diagnosis of Diabetic Patients using Deep Neural Networks and Recursive Feature Elimination 5. RESULTS AND DISCUSSION Accuracy, Sensitivity, Specificity and ROC curve were the performance parameters used to compare the two classifiers: Artificial neural networks and Deep neural networks. Accuracy percentage of DNN and ANN for two different feature selection methods: and PCA are shown in figure 4. Sensitivity and Specificity comparison for deep neural networks and artificial neural networks are shown in figure 5. Receiver Operating Characteristic (ROC) Curve is shown in figure 6. 85.00% 80.00% 75.00% 70.00% 65.00% 60.00% PCA ANN with ANN with PCA Figure 4 Comparison of Accuracy percentage 100% 80% 60% 40% ANN with 20% 0% Sensitivity Specificity Figure 5 Comparison results of sensitivity and specificity ROC curve for ROC curve for PCA http://www.iaeme.com/ijciet/index.asp 639 editor@iaeme.com

J. Vijayashree, J. Jayashree ROC curve for ANN with ROC curve for ANN with PCA Figure 6 ROC Curve 6. CONCLUSIONS Among the two features reduction techniques recursive feature elimination selects the best features for classification. The classification accuracy of proposed deep neural networks is better than artificial neural networks. REFERENCES [1] Santhanam, T., and M. S. Padmavathi. Application of K-means and genetic algorithms for dimension reduction by integrating SVM for diabetes diagnosis. Procedia Computer Science 47 (2015): 76-83. [2] Nai-arun, Nongyao, and Rungruttikarn Moungmai. Comparison of Classifiers for the Risk of Diabetes Prediction. Procedia Computer Science 69 (2015): 132-142. [3] Kandhasamy, J. Pradeep, and S. Balamurali. Performance analysis of classifier models to predict diabetes mellitus. Procedia Computer Science 47 (2015): 45-51. [4] Polat, Kemal, and Salih Güneş. An expert system approach based on principal component analysis and adaptive neuro-fuzzy inference system to diagnosis of diabetes disease. Digital Signal Processing 17.4 (2007): 702-710. [5] Lukmanto, Rian Budi, and E. Irwansyah. The Early Detection of Diabetes Mellitus (DM) Using Fuzzy Hierarchical Model. Procedia Computer Science 59 (2015): 312-319. [6] Varma, Kamadi VSRP, et al. A computational intelligence approach for a better diagnosis of diabetic patients. Computers & Electrical Engineering 40.5 (2014): 1758-1765. [7] Patil, Bankat M., Ramesh Chandra Joshi, and Durga Toshniwal. Hybrid prediction model for type-2 diabetic patients. Expert systems with applications 37.12 (2010): 8102-8108. [8] Durairaj, M., and G. Kalaiselvi. Prediction of diabetes using back propagation algorithm. International Journal of Emerging Technology and Innovative Engineering1.8 (2015) [9] Durairaj, M., and G. Kalaiselvi. "Prediction of diabetes using soft computing techniques- A survey." International journal of scientific and technology research 4.3 (2015): 190-192. [10] Liu, Huan, and Hiroshi Motoda, eds. Feature extraction, construction and selection: A data mining perspective. Vol. 453. Springer Science & Business Media, 1998. [11] Sebban, Marc, and Richard Nock. "A hybrid filter/wrapper approach of feature selection using information theory." Pattern Recognition 35.4 (2002): 835-846. http://www.iaeme.com/ijciet/index.asp 640 editor@iaeme.com

An Expert System for the Diagnosis of Diabetic Patients using Deep Neural Networks and Recursive Feature Elimination [12] Federation, Internation Diabetes. "IDF diabetes atlas." Brussels: International Diabetes Federation (2013) [13] R S Deshmukh, Analysis of Diabetic Retinopathy by Automatic Detection of Exudates, International journal of Electronics and Communication Engineering & Technology, Volume 5, Issue 12, 201 4, pp. 266-275. [14] Ideyat Melsovich Bapiyev, Bekmurza Husainovich Aitchanov, Ihor Anatolyevich Tereikovskyi, Liudmyla Alekseevna Tereikovska and Anna Alexandrovna Korchenko, Deep Neural Networks in Cyber Attack Detection Systems, International Journal of Civil Engineering and Technology, 8(11), 2017, pp. 1086-1092 [15] Preeti Manke and Sharad Tembhurne, Artificial Neural Network Based Nitrogen Oxides Emission Prediction and Optimization In Thermal Power Plant International journal of Computer Engineering & Technology (IJCET), Volume 4, Issue 3, 2013, pp. 491-502 [16] http://www.idf.org/about-diabetes/facts-figures [17] http://www.diabetesatlas.org/ http://www.iaeme.com/ijciet/index.asp 641 editor@iaeme.com