Comparison of support vector machine based on genetic algorithm with logistic regression to diagnose obstructive sleep apnea

Original Article

Zohreh Manoochehri, Nader Salari 1, Mansour Rezaei 1, Habibolah Khazaie 2, Sara Manoochehri, Behnam Khaledi Pavah 2

Student Research Committee, Kermanshah University of Medical Sciences, Kermanshah, Iran, 1 Department of Biostatistics and Epidemiology, School of Public Health, Kermanshah University of Medical Sciences, Kermanshah, Iran, 2 Sleep Disorders Research Center, Kermanshah University of Medical Sciences, Kermanshah, Iran

Background: Diagnosing obstructive sleep apnea (OSA) is an important subject in medicine. This study aimed to compare the performance of two data mining techniques, support vector machine (SVM) and logistic regression (LR), in diagnosing OSA. The best-fit model was used as a substitute for polysomnography (PSG), which is the gold standard for diagnosing this disease.

Materials and Methods: A total of 250 patients with sleep problem complaints, whose disease had been diagnosed by PSG and who were referred to the Sleep Disorders Research Center of Farabi Hospital, Kermanshah, between 2012 and 2015, were recruited in this study. To fit the best LR model, a model was first fitted with all variables and then compared with a model made from the significant variables using Akaike's information criterion (AIC). An SVM model with a radial basis function (RBF) kernel, whose parameters had been optimized by a genetic algorithm, was used to diagnose OSA.

Results: Based on AIC, the best LR model obtained in this study was the model fitted with all variables. The performance of the final LR model was compared with that of the SVM model: accuracy was 0.729 versus 0.797, sensitivity 0.777 versus 0.714, and specificity 0.702 versus 0.847, respectively.

Conclusion: Both models were found to have an appropriate performance. However, considering accuracy as an important criterion for comparing the performance of models in this domain, it can be argued that SVM could have a better efficiency than LR in diagnosing OSA in patients.
Key words: Genetic algorithms, logistic regression, obstructive sleep apnea, polysomnography, support vector machine

How to cite this article: Manoochehri Z, Salari N, Rezaei M, Khazaie H, Manoochehri S, Pavah BK. Comparison of support vector machine based on genetic algorithm with logistic regression to diagnose obstructive sleep apnea. J Res Med Sci 2018;23:65.

Website: www.jmsjournal.net
DOI: 10.4103/jrms.JRMS_357_17

INTRODUCTION

Sleep is a fundamental behavior of all animal species and accounts for almost one third of human life. There are many kinds of sleep disorders, one of which is sleep-disordered breathing (SDB). [1] One of the most common sleep disorders is obstructive sleep apnea (OSA). It affects different aspects of human life including health and quality of life. [2] OSA is characterized by repetitive partial or complete collapse of the upper airway during sleep. [1] OSA is associated with the incidence of various diseases. [3-5] Because it induces sleepiness and fatigue during the day, this syndrome increases the risk of car accidents 2-7 times. [6] OSA is a common disease that affects 2% of women and 4% of men in Western society. [7] The incidence of OSA differs in various regions of Iran (Isfahan province [5%] [8] and Kermanshah province [27.3%] [2]). Risk factors for sleep apnea include high body mass index (BMI), old age, male gender, abdominal obesity, larger neck circumference, diabetes, cardiovascular diseases, hypertension, epilepsy, cigarette smoking, underactive thyroid, and acromegaly. [9-11]

This is an open access journal, and articles are distributed under the terms of the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 License, which allows others to remix, tweak, and build upon the work non-commercially, as long as appropriate credit is given and the new creations are licensed under the identical terms. For reprints contact: reprints@medknow.com

Address for correspondence: Dr. Nader Salari, Department of Biostatistics and Epidemiology, School of Public Health, Kermanshah University of Medical Sciences, Kermanshah, Iran. E-mail: n_s_514@yahoo.com
Received: 19-05-2017; Revised: 01-01-2018; Accepted: 29-04-2018
2018 Journal of Research in Medical Sciences, published by Wolters Kluwer - Medknow

Polysomnography (PSG) is used as the gold standard for the diagnosis of OSA. Although PSG is an accurate diagnostic tool in the domain of sleep disorders, it is not appropriate for screening purposes because it is expensive and time consuming, it is not available to all patients, and patients have to wait a long time to use it. [12] Thus, various studies have targeted new approaches to substitute for PSG. Several studies have used parameters and signals obtained from electrocardiogram (ECG), overnight oximetry, and other signals. [13,14] Other studies have applied tools such as the Berlin Questionnaire and the Epworth sleepiness scale (ESS) as screening tools to diagnose OSA syndrome and SDB. [12,15]

The diagnosis of OSA can be considered a binary classification problem, i.e., an attempt to find an optimal classifier to differentiate patients with or without OSA. The realm of data mining and machine learning has various classification methods such as artificial neural networks, decision trees, k-nearest neighbor, random forests, and support vector machine (SVM). Of these methods, SVM has attracted more attention in recent years owing to its better ability and higher performance in the diagnosis of different diseases. [16]

Almazaydeh et al. performed a study on OSA diagnosis. ECG characteristics such as RR distance, which are more effective for the diagnosis of sleep apnea, were calculated and applied to SVM. The findings showed that SVM could diagnose sleep disorder periods and patients with and without OSA with a high accuracy of approximately 96.5%. [13] A study conducted by Hang used overnight oximetry to diagnose moderate and severe OSA in patients. The SVM model was designed according to the oxyhemoglobin desaturation index. The SVM model designed with the radial basis function (RBF) kernel had the highest accuracy. The accuracy obtained for the diagnosis of patients with severe OSA (90.42%-90.55%) was higher than that for patients with moderate to severe OSA (87.33%-87.77%). [14] However, no study has used clinical characteristics as SVM input to diagnose OSA and compared its accuracy with logistic regression (LR).
Hence, this study was an attempt to develop the mentioned classification models to differentiate patients with and without OSA using clinical characteristics. In this respect, the best LR model was compared with an SVM model whose parameters had been optimized by genetic algorithm (GA).

MATERIALS AND METHODS

Study design and participants

A total of 250 patients with sleep problem complaints who had been diagnosed by PSG and were referred to the Sleep Disorders Research Center at Farabi Hospital, Kermanshah, from 2012 to 2015 were recruited in this study. All patients underwent one night of PSG (Somnoscreen plus, Somnomedics, Germany). The study data were collected from the patients' files and included variables such as age (year), gender (male/female), BMI (kg/m2), neck circumference (cm), waist circumference (cm), tea consumption (number of tea cups per day), cigarette smoking (number of cigarettes per day), and underlying diseases: hypertension (measured with mercury barometric measurements; blood pressure >140 by 80), chronic headache (any headache that occurs 15 days a month or more and lasts for at least 3 months), heart disease (any disease that occurs in the heart and the veins according to a cardiologist), respiratory disease (a disease that affects the respiratory system and can disrupt the function of the lungs according to a specialist physician), neurological disease, and diabetes (fasting blood glucose more than 127 mg/dl). The units of these variables are given in Tables 1 and 2. Data were also obtained from the Berlin questionnaire (low risk and high risk), ESS, Global sleep assessment questionnaire (GSAQ) (snoring, break of breathing in sleep, sadness or anxiety), and from PSG, which determines the OSA status.

Study instruments

The Berlin questionnaire is composed of 10 items in three sections: snoring, where participants are asked to score their snoring; excessive fatigue and sleepiness during the day; and hypertension. If the patient's score is positive in at least two sections, the patient is considered high risk.
[17] In the study conducted by Amra et al., Cronbach's alpha for this questionnaire in the general population of Iran was 0.70 for group one and 0.50 for group two. We used the same version in this study. [8,18]

ESS consists of eight items that aim to determine daily sleepiness among adults. The ESS score is the total score of the eight items, ranging from 0 to 24. A higher score indicates a higher level of sleepiness during the day. [15,19] The mean rho of ESS item scores was 0.56. [19]

GSAQ is used as a familiar screening tool in primary care clinics and sleep centers. The Pearson correlation coefficients [20] showed that GSAQ differentiated between different diagnoses and could be helpful in diagnosing various sleep disorders including OSA. The reliability of this questionnaire was reported to range from 0.51 to 0.92. [20]

Table 1: Comparison (mean±standard deviation) of quantitative variables between two groups (with and without obstructive sleep apnea)
Variable | Without OSA (n=96) | With OSA (n=154) | Test statistic | P
Age (year) | 37.60±16.19 | 46.99±12.30 | T=-4.869 | <0.001*
BMI (kg/m2) | 25.28±4.80 | 28.77±4.52 | T=-5.812 | <0.001*
Neck circumference (cm) | 35.54±4.22 | 39.24±3.79 | T=-7.192 | <0.001*
Waist circumference (cm) | 90.19±15.61 | 100.97±11.67 | T=-6.228 | <0.001*
Cigarette smoking (n) | 1.20±4.83 | 1.68±5.4 | U=7122.5 | 0.343
Tea consumption (n) | 3.68±3.45 | 3.90±3.45 | U=6807.5 | 0.288
Epworth questionnaire score | 4.92±4.74 | 8.16±5.29 | U=4599 | <0.001*
*T is the abbreviation of the test statistic of Student's t-test and U is the abbreviation of the test statistic of the Mann-Whitney U test. BMI=Body mass index; OSA=Obstructive sleep apnea

Table 2: Association between qualitative variables and obstructive sleep apnea using Chi-square test (frequency [%])
Variable | Category | Without OSA (n=96) | With OSA (n=154) | Total | P
Gender | Female | 49 (51) | 46 (29.9) | 95 (38) | 0.001*
Gender | Male | 47 (49) | 108 (70.1) | 155 (62) |
Hypertension | Yes | 10 (10.4) | 35 (22.7) | 45 (18) | 0.014*
Hypertension | No | 86 (89.6) | 119 (77.3) | 205 (82) |
Chronic headache | Yes | 12 (12.5) | 23 (14.9) | 35 (14) | 0.589
Chronic headache | No | 84 (87.5) | 131 (85.1) | 215 (86) |
Heart disease | Yes | 9 (9.4) | 11 (7.1) | 20 (8) | 0.527
Heart disease | No | 87 (90.6) | 143 (92.9) | 230 (92) |
Respiratory disease | Yes | 6 (6.2) | 11 (7.1) | 17 (6.8) | 0.785
Respiratory disease | No | 90 (93.8) | 143 (92.9) | 233 (93.2) |
Neurological disease | Yes | 0 | 4 (2.6) | 4 (1.6) | 0.111
Neurological disease | No | 96 (100) | 150 (97.4) | 246 (98.4) |
Diabetes | Yes | 0 | 9 (5.8) | 9 (3.6) | 0.016*
Diabetes | No | 96 (100) | 145 (94.2) | 241 (96.4) |
Score of Berlin questionnaire | High risk | 73 (76) | 61 (39.6) | 134 (53.6) | <0.001*
Score of Berlin questionnaire | Low risk | 23 (24) | 93 (60.4) | 116 (46.4) |
Snoring | Yes | 15 (15.6) | 90 (58.8) | 105 (42.2) | <0.001*
Snoring | No | 81 (84.4) | 63 (41.2) | 144 (57.8) |
Break of breathing in sleep | Yes | 10 (10.4) | 55 (35.7) | 65 (26) | <0.001*
Break of breathing in sleep | No | 86 (89.6) | 99 (64.3) | 185 (74) |
Sadness or anxiety | Yes | 58 (60.4) | 73 (47.4) | 131 (52.4) | 0.045*
Sadness or anxiety | No | 38 (39.6) | 81 (52.6) | 119 (47.6) |
*Significant test in level 0.05, OSA=Obstructive sleep apnea

Statistical analysis

The LR model, an important approach in the domain of classification, expresses the probability of one of the two classes of a binary outcome. If the estimated probability is larger than 0.5 (or any other chosen cutoff), that member is classified into the class of patients with OSA.
[21] SVM is a supervised learning method in machine learning, based on the statistical learning theory proposed by Vapnik in 1979, that performs classification and regression analysis. [22] The main purpose of SVM is to find the best hyperplane among the possible separating hyperplanes. This optimal separating hyperplane classifies the input data points so that the distance between the two classes (the margin) is maximized. [23] The kernel function contains an inner nonlinear function. When the input data set can only be separated nonlinearly, the kernel function transfers the data set to a high-dimensional feature space where it can be separated linearly. [24,25] The most common kernel functions used in SVM include:

$K(x_i, x_j) = x_i^{T} x_j$ : Linear (1)

$K(x_i, x_j) = (x_i^{T} x_j + 1)^{d}$ : Polynomial, and (2)

$K(x_i, x_j) = \exp\left(-\frac{\lVert x_i - x_j \rVert^{2}}{2\sigma^{2}}\right),\ \sigma > 0$ : RBF kernel function (3)

Among the kernel functions presented, the RBF kernel is an attractive choice because it is more generalizable and less complex than the polynomial kernel function. When the polynomial degree is increased in the polynomial kernel function, many problems occur. [26] Whichever kernel function is used in the SVM model, its parameters must be set correctly to increase performance in data classification. Numerous optimization algorithms have been suggested in this regard, including the grid search method, [27] particle optimization algorithm, and GA. In the present study, we used GA as the optimization method.

GA was introduced by Holland (1975). GA is a random search method in artificial intelligence and is based on the theory of evolution by natural selection. This method has been inspired by biological processes of heredity such as mutation, natural selection, and genetic crossover. [28] GA uses a number of possible solutions, called the population, simultaneously to solve a problem. Each member of this population is represented by a coded string called a chromosome. Three important concepts in GA are the initial population, the fitness function, and the operators.
In GA, first, several possible solutions, encoded as chromosomes, are randomly chosen to form the initial population and are evaluated by the fitness function. Next, these chromosomes are used to produce the next generation by the selection, mutation, and crossover operators.
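The GA steps described above (random initial population, fitness evaluation, then selection, crossover, and mutation) can be sketched in a few lines. This is only an illustrative sketch, not the authors' implementation: the function name `genetic_search`, the tournament selection, the blend crossover, the rates, and the toy fitness function are all assumptions. In the setting of this study a chromosome would presumably encode the SVM parameters (C, γ), but the fitness function the authors used is not stated, so a simple stand-in with a known optimum is used here.

```python
import random


def genetic_search(fitness, bounds, pop_size=20, generations=30,
                   crossover_rate=0.8, mutation_rate=0.1, seed=0):
    """Maximise `fitness` over real-valued chromosomes inside `bounds`."""
    rng = random.Random(seed)
    # Initial population: chromosomes drawn uniformly within the bounds.
    population = [[rng.uniform(lo, hi) for lo, hi in bounds]
                  for _ in range(pop_size)]
    best, best_score = None, float("-inf")
    for _ in range(generations):
        # Fitness evaluation of the current generation.
        scored = [(fitness(c), c) for c in population]
        for score, chrom in scored:
            if score > best_score:
                best, best_score = list(chrom), score
        next_gen = []
        while len(next_gen) < pop_size:
            # Selection: the fitter of two randomly drawn chromosomes wins.
            p1 = max(rng.sample(scored, 2), key=lambda s: s[0])[1]
            p2 = max(rng.sample(scored, 2), key=lambda s: s[0])[1]
            # Crossover: blend the two parents gene by gene.
            if rng.random() < crossover_rate:
                w = rng.random()
                child = [w * a + (1 - w) * b for a, b in zip(p1, p2)]
            else:
                child = list(p1)
            # Mutation: occasionally resample a gene; clamp to the bounds.
            child = [rng.uniform(lo, hi) if rng.random() < mutation_rate
                     else min(max(g, lo), hi)
                     for g, (lo, hi) in zip(child, bounds)]
            next_gen.append(child)
        population = next_gen
    # Stopping condition here is simply a fixed number of generations.
    return best, best_score


def toy_fitness(chromosome):
    """Hypothetical stand-in fitness with its maximum (0) at (3.0, 0.5)."""
    c, gamma = chromosome
    return -((c - 3.0) ** 2 + (gamma - 0.5) ** 2)
```

Calling `genetic_search(toy_fitness, [(0.0, 10.0), (0.0, 1.0)])` returns the best chromosome found and its fitness. Replacing `toy_fitness` with a function that trains a classifier and returns its validation accuracy would turn the same loop into a parameter search.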

This process continues from one generation to the next until the algorithm's stopping condition is met. [29]

Achievement of proposed models

First, in univariate analysis, the relationship between each independent variable and the dependent variable was analyzed separately by appropriate parametric (Student's t-test for quantitative variables) and nonparametric (Mann-Whitney U test for quantitative variables and Chi-square test for qualitative variables) tests. Because the number of positive cases for diabetes and neurological diseases was very small in the data set of this study, which inflated the estimated standard deviation of the regression coefficients and caused an unexpected rise in the estimated odds ratios, these variables were excluded from the dataset. To fit the SVM model, all variables were normalized and then the model was fitted. With this conversion, the data were mapped to the [0, 1] interval. The samples were divided into two groups, training data (176 patients, 70%) and testing data (74 patients, 30%), by producing random numbers from a Bernoulli distribution with a success probability of around 70%. LR and SVM models were fitted to the training set and examined on the testing set.

To find the best LR model, a model with all variables was first fitted by the backward elimination and stepwise regression method. Then, the variables with a significant association with the response variable in univariate analysis were used to make another LR model. These models were compared using Akaike's information criterion (AIC):

$\mathrm{AIC} = 2P - 2\ln(L)$, where $P$ is the number of parameters and $L$ is the maximum likelihood.

The model with the smaller AIC was selected as the better model. [30] Having built the best LR model, the parameters of the SVM model with RBF kernel were optimized by GA. Then, the groupings obtained from the models were compared with the actual data to find out whether there was a difference between the fitted groups and the real groups. To this end, the McNemar test was run.
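The preparation steps described above (scaling each variable to [0, 1], a Bernoulli train/test assignment with success probability of about 0.7, and comparing models by AIC) can be sketched as follows. This is a minimal sketch under stated assumptions: the function names are hypothetical, min-max scaling is assumed as the normalization (the text only says the data were mapped to [0, 1]), and the log-likelihood values in the usage note are made up for illustration. The study itself was carried out in R.

```python
import random


def min_max_scale(values):
    """Rescale a list of numbers linearly onto the [0, 1] interval."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant variable carries no information; map it to 0.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def bernoulli_split(n, p_train=0.7, seed=0):
    """Assign each of n samples to the training set with probability p_train.

    Group sizes are random, so they are only approximately 70% / 30%.
    """
    rng = random.Random(seed)
    train, held_out = [], []
    for i in range(n):
        (train if rng.random() < p_train else held_out).append(i)
    return train, held_out


def aic(n_params, log_likelihood):
    """AIC = 2P - 2 ln(L), where log_likelihood is the maximised ln(L)."""
    return 2 * n_params - 2 * log_likelihood
```

For example, `aic(6, -80.0)` gives 172.0 and `aic(5, -83.0)` gives 176.0 (both log-likelihoods are hypothetical), so the first model would be preferred even though it has one more parameter.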
Then, the performance of the regression model selected from the two proposed regressions was compared with that of the SVM-based hybrid model, according to the evaluation criteria of classification models. Data analysis was performed using R 3.2.2 software, with the e1071 package for the SVM model and the GA package for GA. In addition, the caret package was used to obtain the evaluation criteria of the models. Of the kernel functions presented, the RBF kernel was used in the SVM model. [31] To calculate the indices, four numbers were determined for each method: true positive (TP), true negative (TN), false positive (FP), and false negative (FN), as calculated by the classifier. [32]

$\mathrm{Acc} = \frac{TP + TN}{N}$ (4)

$\mathrm{Sensitivity} = \text{True positive rate (TPR)} = \frac{TP}{TP + FN}$ (5)

$\mathrm{Specificity} = \text{True negative rate (TNR)} = \frac{TN}{FP + TN}$ (6)

RESULTS

Of the 250 participants, 154 had OSA and 96 did not. The mean age of the patients with OSA (46.99 ± 12.3) was higher than that of the patients without OSA (37.6 ± 16.19) (P < 0.001). In addition, the waist circumference and neck circumference in the patients with OSA were higher than those of the patients without OSA. The patients with OSA had a larger BMI, indicating a significant difference between groups (P < 0.001). Of the patients with OSA, 70.1% were male, compared with 49% of the patients without OSA [Tables 1 and 2].

Logistic regression model

To fit the best LR model, all predictor variables were first fed into the model and a set of five variables was selected by stepwise selection with backward elimination for inclusion in the model. Regression model 1 was fitted to the training data set using these variables [Table 3]. The final form of the fitted model is presented in equation 7.

$\log\left(\frac{P}{1 - P}\right) = -9.41 + 0.045\,(\text{age}) + 0.180\,(\text{neck circumference}) - 2.21\,(\text{heart disease}) + 0.939\,(\text{snoring}) + 0.147\,(\text{Epworth questionnaire score})$ (7)

where AIC = 172.33.
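To make the use of equation 7 concrete, the sketch below turns the fitted coefficients into a predicted probability and applies the 0.5 cutoff mentioned in the statistical analysis section. Several points are assumptions rather than facts from the article: the intercept and the heart disease coefficient are taken as negative (the heart disease sign is inferred from its odds ratio of 0.110 in Table 3), binary predictors are assumed to be coded 1 for yes and 0 for no, and the function names and the example patient are hypothetical.

```python
import math


def osa_probability(age, neck_cm, heart_disease, snoring, epworth):
    """Predicted probability of OSA from the coefficients of equation 7.

    Assumptions: negative intercept and heart-disease coefficient, and
    binary predictors coded 1 = yes, 0 = no.
    """
    logit = (-9.41
             + 0.045 * age
             + 0.180 * neck_cm
             - 2.21 * heart_disease
             + 0.939 * snoring
             + 0.147 * epworth)
    # Inverse of the log-odds transform log(P / (1 - P)).
    return 1.0 / (1.0 + math.exp(-logit))


def classify_as_osa(probability, cutoff=0.5):
    """Assign the 'with OSA' class when the probability exceeds the cutoff."""
    return probability > cutoff
```

For a hypothetical 47-year-old snorer with a 39 cm neck circumference, no heart disease, and an Epworth score of 8, the log-odds are about 1.84, which corresponds to a probability of roughly 0.86 and a "with OSA" classification at the 0.5 cutoff.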
Table 3: Regression coefficients and their details in the model with all factors
Variable | B | SE (B) | Wald statistic | P | OR
Constant | -9.41 | 2.121 | -4.436 | <0.001* |
Age | 0.045 | 0.015 | 3.076 | 0.002* | 1.046
Neck circumference | 0.180 | 0.055 | 3.270 | 0.001* | 1.197
Cigarette smoking | -0.070 | 0.039 | -1.808 | 0.07 | 0.932
Heart disease | -2.21 | 1.104 | -2.000 | 0.045* | 0.110
Snoring | 0.939 | 0.467 | 2.009 | 0.044* | 2.557
Break of breathing in sleep | 1.09 | 0.649 | 1.679 | 0.093 | 2.976
Epworth questionnaire score | 0.147 | 0.049 | 3.022 | 0.002* | 1.158
*Significant test in level 0.05, OR=Odds ratio; SE=Standard error

To fit the LR model with significant factors, the variables that were significant in univariate analysis were included in the model by stepwise selection with backward elimination. These variables included age, BMI, neck circumference, waist circumference, Epworth scale score, gender, hypertension, Berlin questionnaire score, snoring, break of breathing in sleep, and feeling anxiety or sadness. Finally, four variables remained in the LR model. It is noteworthy that the training data set was used for fitting this model. The final form of the LR model with significant factors is presented in equation 8 [Table 4].

$\log\left(\frac{P}{1 - P}\right) = -8.903 + 0.038\,(\text{age}) + 0.171\,(\text{neck circumference}) + 1.002\,(\text{snoring}) + 0.135\,(\text{Epworth questionnaire score})$ (8)

For this model, AIC = 175.97. The model with the smaller AIC, i.e., the model in equation 7, which had been fitted with all factors (AIC = 172.33), was selected as the final LR model to be compared with the SVM method. The results of the McNemar test showed no significant difference between the groups fitted with this model and the real groups (P = 0.117). Then, the performance of this model was analyzed on the testing data [Table 5], giving accuracy = 0.7297, sensitivity = 0.777, and specificity = 0.7021.

Support vector machine model

The GA searched for the best parameters within the ranges considered, [27] namely 0.01 to 35000 for C (a penalty parameter that applies a compromise between training error and generalization) and 0.0001 to 10 for γ. After the stopping condition for the GA was met, the obtained parameters, C = 17034.19 and γ = 0.3049167, were applied as SVM input. The SVM model was trained using the training data, and the evaluation criteria of this model were obtained from the testing data [Table 5], giving accuracy = 0.797, sensitivity = 0.7143, and specificity = 0.8478. Further, the McNemar test was run to test for a significant difference between the groups predicted by this method and the real groups; it yielded no statistically significant difference (P = 1.0).
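The evaluation criteria reported for both models can be recomputed from the test-set counts in Table 5 using equations 4 to 6. The sketch below is illustrative only: the function name is hypothetical, and the assignment of the four counts to TP, FN, FP, and TN is an assumption, chosen because it reproduces the reported accuracy, sensitivity, specificity, and predictive values.

```python
def evaluation_criteria(tp, fn, fp, tn):
    """Classification criteria from the four cells of a confusion matrix."""
    n = tp + fn + fp + tn
    return {
        "accuracy": (tp + tn) / n,       # equation 4
        "sensitivity": tp / (tp + fn),   # equation 5, true positive rate
        "specificity": tn / (fp + tn),   # equation 6, true negative rate
        "ppv": tp / (tp + fp),           # positive predictive value
        "npv": tn / (tn + fn),           # negative predictive value
    }


# Counts as read from Table 5 (n = 74); the cell-to-label mapping is assumed.
lr_criteria = evaluation_criteria(tp=21, fn=6, fp=14, tn=33)
svm_criteria = evaluation_criteria(tp=20, fn=8, fp=7, tn=39)
```

Under this mapping, SVM has the higher accuracy, specificity, and positive predictive value, while LR has the higher sensitivity and negative predictive value.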
Comparison

The assessment criteria for both models [Table 5] show that all criteria except sensitivity and negative predictive value are better for the diagnosis of OSA in the SVM model than in the LR model. However, the McNemar test revealed no significant difference between the performance of the two models (P = 0.07).

Table 4: Regression coefficients and their details in the model with significant factors
Variable | B | SE (B) | Wald statistic | P | OR
Constant | -8.903 | 2.065 | -4.312 | <0.001* |
Age | 0.038 | 0.014 | 2.672 | 0.007* | 1.039
Neck circumference | 0.171 | 0.055 | 3.114 | 0.002* | 1.186
Snoring | 1.002 | 0.459 | 2.185 | 0.03* | 2.726
Break of breathing in sleep | 0.868 | 0.626 | 1.385 | 0.166 | 2.381
Epworth questionnaire score | 0.135 | 0.047 | 2.858 | 0.004* | 1.144
*Significant test in level 0.05, OR=Odds ratio; SE=Standard error

Table 5: Evaluation criteria for logistic regression model with all factors and support vector machine for test data (n=74)
Actual class | Predicted by LR: With OSA | Predicted by LR: Without OSA | Predicted by SVM: With OSA | Predicted by SVM: Without OSA
With OSA | 21 | 6 | 20 | 8
Without OSA | 14 | 33 | 7 | 39

Evaluation criteria | LR | SVM
Accuracy | 0.7297 | 0.797
Sensitivity | 0.777 | 0.7143
Specificity | 0.7021 | 0.8478
Positive predictive value | 0.6000 | 0.7407
Negative predictive value | 0.8462 | 0.8298
LR=Logistic regression; SVM=Support vector machine; OSA=Obstructive sleep apnea

DISCUSSION

Classification models based on artificial intelligence have dramatically affected the decision-making process in different sciences including medical sciences. One of the most popular models in this field is SVM. The main advantage of SVM as a data mining method is that it overcomes the high dimensionality problem, which occurs when the number of input variables is larger than the number of observations. [33] Unlike parametric statistical methods used for prediction, such as linear and nonlinear regression models, cluster analysis, discriminant analysis, and time series analysis, machine learning methods make no initial assumptions about the statistical distributions of the inputs. [34,35] Many of these methods work like a black box that is inappropriate for clinical interpretation, [34] whereas in statistical methods such as LR, the coefficients can be interpreted.
Despite the countless studies carried out on building these models, researchers are constantly trying to achieve models with simpler operation and better performance. The main objective of the present study was to introduce a basic artificial intelligence classification model to differentiate patients with or without OSA. Former models presented in this domain used features obtained from PSG findings, which are very costly and time consuming, as input features of the classification model. Compared with them, the proposed classification model had the advantage of achieving good accuracy while using only the clinical features in the patients' clinical files as input features (predictor variables).

The proposed hybrid classification model in this study was derived from a combination of two artificial intelligence approaches, SVM and GA. The performance of this model was compared with the best LR model in this study, i.e., the model fitted with all predictor variables. Because the LR model obtained from all studied features had a lower AIC than the LR model fitted only with the significant factors from univariate analysis (AIC = 172.3 vs. AIC = 176), it was chosen as the better model for comparison with the SVM-based model. Although the McNemar test showed no significant difference between the performance of the two models (P = 0.07) and both models performed well, the results of this study indicated that the proposed SVM-based model had a better and more promising performance than the LR model in terms of accuracy (0.797 vs. 0.729), which is an important criterion for comparing the performance of models, and specificity (0.847 vs. 0.702) in the classification of patients with OSA.

The superiority of the SVM model has been reported in various studies; for example, in a study on predicting diabetes, SVM had the best performance among 6 data mining methods including the LR model. [36] Although in most studies, including the present one, the SVM model has more diagnostic power than the LR model, in the study performed by Verplanck et al. there was no significant difference between multiple LRs and SVM for mortality prediction. [21]

One of the limitations of this study was the small sample size (250) compared with other studies in this field. Many data mining methods, including SVM, perform better on a high-volume dataset because of the larger amount of training data, so these methods could be applied to a larger data set to increase performance.
CONCLUSION

Although the results of McNemar's test showed no significant difference between the performance of the two models (P = 0.07), both models were found to perform adequately. However, considering accuracy as an important criterion for comparing the performance of models in this domain, it can be argued that SVM can be more efficient than LR in the diagnosis of OSA in patients. Therefore, physicians are recommended to use this model as an auxiliary screening tool for the diagnosis of patients with OSA.

Acknowledgments

The authors gratefully acknowledge the Research Council of Kermanshah University of Medical Sciences (Grant Number: 94507) for the financial support. This work was performed in partial fulfillment of the requirements for the postgraduate degree of Zohreh Manoochehri, in the Department of Biostatistics and Epidemiology, Faculty of Public Health, Kermanshah University of Medical Sciences, Kermanshah, Iran.

Financial support and sponsorship

This study was financially supported by Kermanshah University of Medical Sciences.

Conflicts of interest

There are no conflicts of interest.

REFERENCES

1. Sadock B, Sadock V, Ruiz P. Comprehensive Text Book of Psychiatry. 9th ed. New York: Lippincott Williams & Wilkins; 2009.
2. Khazaie H, Najafi F, Rezaie L, Tahmasian M, Sepehry AA, Herth FJ, et al. Prevalence of symptoms and risk of obstructive sleep apnea syndrome in the general population. Arch Iran Med 2011;14:335-8.
3. Bentivoglio M, Bergamini E, Fabbri M, Andreoli C, Bartolini C, Cosmi D, et al. Obstructive sleep apnea syndrome and cardiovascular diseases. G Ital Cardiol (Rome) 2008;9:472-81.
4. Shaw JE, Punjabi NM, Wilding JP, Alberti KG, Zimmet PZ; International Diabetes Federation Taskforce on Epidemiology and Prevention. Sleep-disordered breathing and type 2 diabetes: A report from the International Diabetes Federation Taskforce on Epidemiology and Prevention. Diabetes Res Clin Pract 2008;81:2-12.
5. Ghiasi F, Ahmadpoor A, Amra B. Relationship between obstructive sleep apnea and 30-day mortality among patients with pulmonary embolism. J Res Med Sci 2015;20:662-7.
6. Hartenbaum N, Collop N, Rosen IM, Phillips B, George CF, Rowley JA, et al. Sleep apnea and commercial motor vehicle operators: Statement from the joint task force of the American College of Chest Physicians, the American College of Occupational and Environmental Medicine, and the National Sleep Foundation. Chest 2006;130:902-5.
7. Young T, Palta M, Dempsey J, Skatrud J, Weber S, Badr S, et al. The occurrence of sleep-disordered breathing among middle-aged adults. N Engl J Med 1993;328:1230-5.
8. Amra B, Farajzadegan Z, Golshan M, Fietze I, Penzel T. Prevalence of sleep apnea-related symptoms in a Persian population. Sleep Breath 2011;15:425-9.
9. Koyama RG, Esteves AM, Oliveira e Silva L, Lira FS, Bittencourt LR, Tufik S, et al. Prevalence of and risk factors for obstructive sleep apnea syndrome in Brazilian railroad workers. Sleep Med 2012;13:1028-32.
10. Kramer MF, de la Chaux R, Fintelmann R, Rasp G. NARES: A risk factor for obstructive sleep apnea? Am J Otolaryngol 2004;25:173-7.
11. Reddy EV, Kadhiravan T, Mishra HK, Sreenivas V, Handa KK, Sinha S, et al. Prevalence and risk factors of obstructive sleep apnea among middle-aged urban Indians: A community-based study. Sleep Med 2009;10:913-8.
12. Chung F, Ward B, Ho J, Yuan H, Kayumov L, Shapiro C, et al. Preoperative identification of sleep apnea risk in elective surgical patients, using the Berlin questionnaire. J Clin Anesth 2007;19:130-4.
13. Almazaydeh L, Elleithy K, Faezipour M. Obstructive sleep apnea detection using SVM-based classification of ECG signal features. Conf Proc IEEE Eng Med Biol Soc 2012;2012:4938-41.
14. Hang LW, Wang HL, Chen JH, Hsu JC, Lin HH, Chung WS, et al. Validation of overnight oximetry to diagnose patients with moderate to severe obstructive sleep apnea. BMC Pulm Med 2015;15:24.
15. Johns MW. A new method for measuring daytime sleepiness: The Epworth sleepiness scale. Sleep 1991;14:540-5.
16. Tomar D, Agarwal S. A survey on data mining approaches for healthcare. Int J Biosci Biotechnol 2013;5:241-66.
17. Netzer NC, Stoohs RA, Netzer CM, Clark K, Strohl KP. Using the Berlin questionnaire to identify patients at risk for the sleep apnea syndrome. Ann Intern Med 1999;131:485-91.
18. Amra B, Nouranian E, Golshan M, Fietze I, Penzel T. Validation of the Persian version of Berlin sleep questionnaire for diagnosing obstructive sleep apnea. Int J Prev Med 2013;4:334-9.
19. Johns MW. Sleepiness in different situations measured by the Epworth sleepiness scale. Sleep 1994;17:703-10.
20. Roth T, Zammit G, Kushida C, Doghramji K, Mathias SD, Wong JM, et al. A new questionnaire to detect sleep disorders. Sleep Med 2002;3:99-108.
21. Verplancke T, Van Looy S, Benoit D, Vansteelandt S, Depuydt P, De Turck F, et al. Support vector machine versus logistic regression modeling for prediction of hospital mortality in critically ill patients with haematological malignancies. BMC Med Inform Decis Mak 2008;8:56.
22. Grimaldi M, Cunningham P, Kokaram A. An Evaluation of Alternative Feature Selection Strategies and Ensemble Techniques for Classifying Music, 2003. Citeseer; 2003.
23. Tamura H, Tanno K. Midpoint validation method for support vector machines with margin adjustment technique. Int J Innov Comput Inf Control 2009;5:4025-32.
24. Lin CF, Wang SD. Fuzzy support vector machines. IEEE Trans Neural Netw 2002;13:464-71.
25. Javed I, Ayyaz M, Mehmood W. Efficient Training Data Reduction for SVM Based Handwritten Digits Recognition, 2007. IEEE; 2007.
26. Kapp MN, Sabourin R, Maupin P. A dynamic model selection strategy for support vector machine classifiers. Appl Soft Comput 2012;12:2550-65.
27. Hsu CW, Chang CC, Lin CJ. Practical Guide to Support Vector Classification, 2003. researchgate.net.
28. Michalewicz Z. GAs: What are they? Genetic Algorithms + Data Structures = Evolution Programs. USA: Springer; 1994. p. 13-30.
29. Huang J, Hu X, Yang F. Support vector machine with genetic algorithm for machinery fault diagnosis of high voltage circuit breaker. Measurement 2011;44:1018-27.
30. Anderson DR. Model Selection and Multi-Model Inference: A Practical Information-Theoretic Approach. New York, USA: Springer; 2002.
31. Gong L, Li Z, Zhang Z. Diagnosis model of pipeline cracks according to metal magnetic memory signals based on adaptive genetic algorithm and support vector machine. Open Mech Eng J 2015;9:1076-80.
32. Salari N, Shohaimi S, Najafi F, Nallappan M, Karishnarajah I. A novel hybrid classification model of genetic algorithms, modified k-nearest neighbor and developed backpropagation neural network. PLoS One 2014;9:e112987.
33. Chu A, Ahn H, Halwan B, Kalmin B, Artifon EL, Barkun A, et al. A decision support system to facilitate management of patients with acute gastrointestinal bleeding. Artif Intell Med 2008;42:247-59.
34. Stylianou N, Akbarov A, Kontopantelis E, Buchan I, Dunn KW. Mortality risk prediction in burn injury: Comparison of logistic regression with machine learning approaches. Burns 2015;41:925-34.
35. Biglarian A, Bakhshi E, Rahgozar M, Karimloo M. Comparison of artificial neural network and logistic regression in predicting of binary response for medical data: The stage of disease in gastric cancer. J North Khorasan Univ Med Sci 2012;3:15-21.
36. Tapak L, Mahjub H, Hamidi O, Poorolajal J. Real-data comparison of data mining methods in prediction of diabetes in Iran. Healthc Inform Res 2013;19:177-85.

Journal of Research in Medical Sciences 2018