AR Journal of Engneerng and Appled Scences 2006-2014 Asan Research ublshng etwork (AR). All rghts reserved. ERFORMACE EVALUAIO OF DIVERSIFIED SVM KEREL FUCIOS FOR BREAS UMOR EARLY ROGOSIS Khondker Jahd Reza 1, Sabra Khatun 1, Mohd F. Jamlos 2, Md. Moslemuddn Fakr 3 and Shekh Shanawaz Mostafa 4 1 School of Computer and Communcaton Engneerng, Unverst Malaysa erls, auh utra Man Campus, Arau, erls, Malaysa 2 Advanced Communcaton Engneerng Centre (ACE), School of Computer and Communcaton Engneerng, Unverst Malaysa erls, auh utra Man Campus, Arau, erls, Malaysa 3 Insttute of Engneerng Mathematcs, Unverst Malaysa erls (UnMA), auh utra Man Campus, Arau, erls, Malaysa 4 Department of Bomedcal Engneerng, Khulna Unversty of Engneerng and echnology, Khulna, Bangladesh E-Mal: ahd_rfat@yahoo.com ABSRAC Ultra wde-band (UWB) mcrowave technology s a promsng canddate to detect the early breast cancer. hs paper ams to depct pattern recognton performance of support vector machne (SVM) for confocal UWB breast tumor magng dataset. A novel feature extracton technque s also ntroduced n ths paper for the sgnal classfcaton perfectly and promptly. SVM classfer functons the comparatve study between SVM kernel functons ncludes lnear functon, radal bass functon, polynomal and mult layer perceptons are nvestgated and verfed for pattern recognton performance wth the help of recever operatng characterstc (ROC) graph and confuson matrx. he man motto of ths paper s to dentfy the tumor n ts smallest dmenson from avalable works ncludng ther data usng the proposed feature extracton. In total, thrteen dfferent szes of bengn tumors are beng consdered where the smallest and largest tumor szes utlzed are 1mm and 9 mm, respectvely. Keywords: bengn tumor, breast cancer, support vector machne, ultra wde-band. IRODUCIO Breast cancer s ndsputably consdered as the second deadly cancer among women [1]. Usually, breast cancer causes due to unusual growth of ether lobules (mlk producng tssue) or ducts (connectng tssue between lobule and npple). Most breast tumor tssue s bascally two types: bengn and malgnant. Bengn tssue s not the cancerous; t may remans n the body lfe-long wth the neglgble growth or by the perod of tme t may turns to deadly cancer malgnant tumor [2, 3]. In a recent survey from Amercan cancer socety, t s found that 89% of women could survve f the cancer s detected wthn 5 years. hs rate s lower as 82% and 77% f t s detected after 10 and 15 years, respectvely [4]. Joy et al., [5] and abar et al., [6] showed n ther researches that, early detecton of tumor s mportant for the long term survval n the way of reducng the rsk. At present, Mcrowave UWB magng s one of the most promsng technques n the arena of breast cancer detecton. Many researchers have utlzed ths technology to detect breast cancer n early stage [7-11]. Very hgh detecton capablty and low power consumpton encourage the researchers to utlze ths technque. At present pattern recognton methods are useful term n case of sgnal processng. A good number of technques are also avalable. Artfcal eural etwork (A) s already well ustfed for breast cancer pattern recognton and detecton [11-12] that lterally showed tremendous successful detecton and sze dentfer capablty wth 95.8% accuracy. Some crucal factors.e., tranng tme, feature extracton etc., whch nfluence the system performance was not well consdered. A huge feature value also ncreased the A tranng duraton and reduces the tranng performance. artcle Swam Algorthm can also be a better opton [13]. Machne learnng based artfcal ntellgence s employed n fne needle asprate of breast for early breast cancer detecton. Support Vector Machne (SVM) s already utlzed to dentfy tumor ether from mammographc mage [14] or mammogram data [15]. A revolutonary technology of gene sgnature for breast cancer detecton s descrbed n [16]. hs study gves an dea that SVM can classfy the breast cancer wth 34% better accuracy than conventonal 70-gene sgnature. he basc dea of sgnal processng usng SVM for breast cancer detecton was descrbed n [16]. Several studes conducted usng Wsconsn breast cancer database of Machne Learnng Repostory from the Unversty of Calforna. he data set are manly demonstrated to dfferentate between malgnant and bengn tumors ncludng 97.51% classfcaton accuracy, 97.49% senstvty and 97.52% specfcty are acheved [17]. An mprovement about 100% of accuracy usng radal bass functon (rbf) and about 83.55% utlzng the sgmod kernel functon s achevable and descrbed n [18]. Another study shows that, among SVM kernel functons, polynomal kernel functons provde hgher accuracy of detecton about 97.5% [19]. But SVM become overwhelmed due to large amount of nput data. hs paper s organzed as follows. he next secton presents the system model elaboraton ncludng data acquston, sgnal pre-processng; followed by feature extracton. In secton III, SVM Kernel Functons are defned breast phantom preparaton and devce specfcaton. After that, sgnal processng tool, followed by System performance then results, and fnally the concluson. 329
AR Journal of Engneerng and Appled Scences 2006-2014 Asan Research ublshng etwork (AR). All rghts reserved. MEHODOLOGY he graphcal presentaton of the proposed system model s shown n the Fgure-1. UWB x Antenna Legends x=ransmtter Rx=Recever o = tumor Breast hantom wth/ wthout umor Feature Selecton Feature Reducton UWB Rx Antenna Receved Scattered UWB pulses Sgnal Segmentaton Classfcaton wth SVM umor Exstence/ on-exstence Fgure-1. roposed system model data acquston. Data acquston he UWB transcever from me doman Co. [20] s used for experment wth resonant frequency 4.7GHz and forward scattered pulses were receved at the recever end for sgnal processng. he descrptons of dataset and breast phantom are smlar to Saleh et al., [21]. Used tumor szes for the experment are 1mm, 1.6 mm, 2 mm, 2.7 mm, 3 mm, 3.8 mm, 4 mm, 4.2 mm, 5 mm, 5.3 mm, 6 mm, 6.5 mm, 7 mm and n sze. For each of these tumor szes, 22 tmes UWB sgnals are transmtted through the breast phantom. On the other hand, 36 tmes receved UWB sgnals were examned wthout any tumor nsde the breast phantom. Each of the above cases, n average of 4500 data ponts was generated. Among these huge data ponts, 67% and 33% data sets kept for tranng and testng purposes, respectvely. Sgnal pre-processng In general, each pattern recognton methodology for sgnal processng s almost the same whch s shown n Fgure-1. he UWB receved sgnals go through the segmentaton processng. umerous analog to dgtal converson technques are avalable, Fast Fourer ransform (FF), Dscrete Fourer ransform (DF), Dscrete Cosne ransform (DC) etc. are well known terms. For our research, we have used DC, so that after recevng UWB pulses from recever the analog UWB pulses are dvded nto dscrete values. one of the denosng method was appled n ths context. Feature extracton rocessed segmented data are ready for feature extracton, where large amount of data ponts are reduced to some characterstc features. remendous feature reducton technques are avalable; n our prevous study Saleh et al., [21] used rncple Feature Analyss (FA) extracton where 50-300 larger DC values were consdered. So far t s stll tme consumng for computng complexty of a huge set of data. Another, feature extracton method descrbed n [22] where, features are selected from EEG features for neonatal sezure detecton by acqurng the RMS ampltude, lne length, and number of maxma and mnma values. So n ths paper, a new feature s ntroduced n whch, we have taken only four feature values.e., maxma, mnma, mean and standard devaton values from the data set. It enhanced the processng speed and reduced the computaton complexty. he general equaton used for mean or average, µ and standard devaton, σ are gven n equaton (1) and equaton (2), respectvely. µ = σ = 1 1 [ X n n= 0 1 1 [ X n n= 0 ] µ ] 2 (1) (2) In these equatons, s the total number and X n s the number of data ponts. SVM kernel functons SVM s a supervsed learnng algorthm proposed by Vapnk n 1992 [23, 24]. redomnantly, ths machne learnng algorthm s prospered for the soluton of classfcaton problems. Wth the dscovery of e- nsenstve loss functon [18], SVM s currently extended to solve non-lnear regresson problems. he basc concept of the SVM can be descrbed as follows, consderng (x, y ) s a tranng set of data as f, n ' = 1, 2,..., n wth x R and y { 1,1},. hen the equaton of a decson surface n the form of a hyperplane can be defned as, w x + b = 0 (3) where, w stands for adustable vector and b s for bas.e., he condton then need to be satsfed s, () w x + b 0 for y = + 1 (4) () w x + b < 0 for y = 1 (5) o fnd out the optmzed value of w and b the condton () needs to be satsfed. hen the w mnmzes the cost functon: 330
AR Journal of Engneerng and Appled Scences 2006-2014 Asan Research ublshng etwork (AR). All rghts reserved. 1 φ ( w) = 2 w w Kernel functons are ntroduced n SVM for solvng non-lnear functons. In ths work, fve kernels are verfed, as stated above, to check the classfcaton performance. he kernel functons and ther respected equatons are gven below: Kernel functon: k x, x = φ( x ). φ( x ) Lnear kernel: ( ) k ( x, x ) = x. x olynomal order kernel: k ( x, x ) = ( γx x + r), γ > 0 Radal bass functon (RBF): k ( x, x 2 ) = exp( γ x x ), γ > 0. Mult layer percepton or sgmod kernel: k( x, x ) = tan h( γ x x + r) Here, γ, r and n are kernel parameters [25]. n 2 nd order olynomal kernel s known as quadratc kernel functon and n use n the paper to dfferentate from 3 rd order polynomal kernel. RESULS AD DISCUSSIOS A basc two-by-two contngency table s represented n able-1. A set {, } of postve and negatve label class s mapped whch s known as rue class. Another set {Y, } or Hypotheszed Class s used to separate the actual class and the predcted class [26]. So that, four possble outcomes are found from able-1. If provded nstance s postve and t s classfed as postve, then t s counted as rue ostve, marked as lght green color. However, f t s classfed as egatve then s called as False egatve and marked as lght orange colored cell. On the other hand, f the nstance s negatve and t s classfed as negatve then t s called as rue egatve and marked as lght green cell. Alternatvely, f the nstance s postve then t s counted as False ostve and marked as orange cell. Senstvty and specfcty are defned to clarfy the system performance. able-1. Confuson matrx. [26] Hypotheszed Class rue class rue ostve False ostve Senstvty False egatve rue egatve Specfcty otal ostve egatve Accuracy true postve rate false postve rate Both can be defned as, ostves correctly classfed otal postves egatves ncorrectly classfed otal negatves rue ostve( ) Senstvty = rue ostve() + False ostve( F) rueegatve ( ) Specfcty = rue egatve ( ) + False ostve ( F) rueost ve ( ) + rue egatve ( ) Accuracy = ostve + egatve (6) (7) (8) (9) (10) It s stated earler that, 12 dfferent szes of tumor are used and tumor szes are already defned n secton II (A). We have traned and tested the SVM wthout tumor condton and wth tumor condton. However, UWB pulses are passed through the breast phantom 22 tmes for each sze of tumor. hese propagaton teratons are ncreased to 36 n case of wthout tumor condton. In the frst case, 8 dfferent tumor szes (.e., 0.16mm, 0.2mm, 0.27mm, 0.3mm, 0.38mm, 0.4mm, 0.42mm, 0.53mm) data sets are kept for tranng purposes. So n total 200 tmes (8 22tmes=176 and 24 tmes wthout tumor condton) of data set are used to tran the SVM and kept the rest of the tumor szes data set for the testng purpose. he number of teraton for testng purpose s 100 tmes (4 22=88 and 12 tmes wthout tumor condton). hen the proposed feature extracton.e., four characterstc features s appled on the raw data set. Among these 200 cases, 24 cases out of 200 cases are found true negatve and rest of the 176 cases are found true postve. he result s perfectly matched wth the gven data set. So the tranng accuracy reaches up to 100%. 331
AR Journal of Engneerng and Appled Scences 2006-2014 Asan Research ublshng etwork (AR). All rghts reserved. able-2. Confuson matrx of lnear, quadratc, polynomal order 3 and Rbf kernel functons. Output Class otal ranng estng 176 0 100% 88 0 100% 88% 0% 0% 88% 0% 0% 0 24 100% 0 12 100% 0% 12% 0% 0% 12% 0% 100% 100% 100% 100% 100% 100% 0% 0% 0% 0% 0% 0% On the other hand, n testng phase, 100 sets of data are feed nto the system where 88 are recognzed as true postve. here are another 12 sets of data wthout tumor condton and t s found as true negatve. So the accuracy for testng phase also rses up to 100%. he correspondng confuson matrx s represented n able-2 for lnear, quadratc, polynomal order 3 and rbf kernel functons. wo columns are beng used to separate the tranng and testng outcomes. For the four kernel functons, tranng and testng data set are separately feed n to the SVM. A recever operatng characterstc (ROC) graph s used to vsualze, organze and selectng the classfers based on ther performance. It s bascally used to tradeoff between true postve rates (tp rate) and false postve rate (fp rate) of classfers whch are defned n equaton x and y. Relevant ROC graphs for tranng and testng are shown Fgures 2 (a) and (b). In Fgure-2, the horzontal axs s marked as False ostve Rate and vertcal axs s marked as rue ostve rate. he blue coloured lne s ROC curve and grey coloured lne s a reference curve. It s vsble from both the Fgures 2(a) and (b), ROC curve has a hgher left pont (0, 1) both n tranng and testng whch represents such a classfer commts no false postve errors as well as no negatve errors whch means the perfect classfcaton [26]. It can be defned n another way, the pont where the ROC curve goes flat thence false postve rate s 0 (zero). It mentons the 100% pattern recognton accuracy s acheved. (a) (b) Fgure-2. (a) ranng and (b) estng ROC graphs of lnear, quadratc, polynomal order 3 and Rbf kernel functons. he worst performance s notced for multlayer percepton or sgmod kernel functon among these fve kernel functons. Senstvty and specfcty percentage for ths type of kernel functon s ndexed n able-3. In total 332
AR Journal of Engneerng and Appled Scences 2006-2014 Asan Research ublshng etwork (AR). All rghts reserved. of 200 tranng cases, rue egatve s found n 24 cases; but False egatve s about 16.5% that reduces the performance. On the other hand, 143 out of 200 cases are successfully dentfed the exstence of a tumor. hs s equvalent to 71.5% success as rue ostve wthout any False ostve. hat's why, the total tranng accuracy reduces to 83.5% or ncorrect classfcaton performance s found 16.5%. However, n testng phase, rue egatve and rue ostve success rate are recorded as 12% and 42%, respectvely. able-3. Confuson matrx of mult layer percepton kernel functons. Output Class otal ranng estng 143 33 42.1% 42 46 20.7% 71.5% 16.5% 57.9% 42% 46% 79.3% 0 24 100% 0 12 100% 0% 12% 0% 0% 12% 0% 100% 81.3% 83.5% 100% 47.7% 54% 0% 18.8% 16.5% 0% 52.3% 46% However, the False egatve percentage of testng performance s 46%. So, the total accuracy of testng s 54% and ncorrect classfcaton percentage s reached to 46%. he ROC graphs for tranng and testng phase are presented n Fgures 3(a) and (b), respectvely. It s also matched wth able-3. In Fgure-3(a) pont (0.188, 1) ndcate that 18.8% of postve data are classfed as negatve. And ths false postve rate ncrease to 52.3% n testng phase cleared from (0.523, 1) pont n Fgure-3(b). In both cases senstvty was 1 [26]. (b) Fgure-3. (a) ranng and (b) estng ROC graphs of mult layer percepton kernel functons. (a) COCLUSIOS SVM based pattern recognton technque for early breast tumor detecton s nvestgated n ths paper. Comparatve study among dfferent kernel functons exhbt that, Lnear, Quadratc, olynomal order 3 and Radal Bass kernel functons can classfy the tumor avalablty wth 100% accuracy both n tranng and testng phases. umor szes are vared from the smallest sze of 1mm up to 7mm. Unlke other kernel functons, ML kernel functons provde less accuracy about 83.5% n tranng and 54% n testng result. It can be concluded that, SVM s well capable of classfyng the tumor avalablty usng the proposed feature extracton and s recommended to solve the regresson problem for further 333
AR Journal of Engneerng and Appled Scences 2006-2014 Asan Research ublshng etwork (AR). All rghts reserved. nvestgaton. Also, t s kept for the future research to dentfy the more than 2 patterns from the set of data. ACKOWLEDGEME hs work s supported by the Long-term Research Grant Scheme, LRGS (Code no. LRGS/D/2011/UKM/IC/02/05/, Grant no.: 9012-00005) of Mnstry of Hgher Educaton, Malaysa. Authors would also lke to express ther grattude to the R&D Department of Unverst Malaysa erls, Malaysa for fnancal (Graduate Assstance) support and research facltes. REFERECES [1] D. arkn. 1998. Epdemology of cancer: global patterns and trends. oxcology Letters. 5: 102-103. [2] Abdullah. Hsham and Cheng-Har Yp. 2004. Overvew of breast cancer n Malaysan women: a problem wth late dagnoss. Asan Journal of Surgery. 27(2). [3] Avalable at: http://www.radologymalaysa.org/breasthealth/about/ FactsStats.htm. Updated on: updated 11 August 2008. [4] Breast cancer facts and fgures, Atlanta: Amercan Cancer Socety, Inc, 2011-2012. [5] Joy JE, enhoet EE and ettt DB. 2005. Savng women s lves: strateges for mprovng breast cancer detecton and dagnoss. he atonal Academy ress. [6] abar L, Yen MF, Vtak B, Chen HH, Smth RA and Duffy SW. 2003. Mammography servce screenng and mortalty n breast cancer patents: 20-year follow-up before and after ntroducton of screenng. Lancet. 361: 1405-1410. [7] Fear E. C., S. C. Hagness,. M. Meaney, M. Okonewsk and M. A. Stuchly. 2002. Enhancng breast tumor detecton wth near-feld magng. IEEE Mcrowave Magazne. 3: 48-56. [8] Bond E. J., X. L, S. C. Hagness and B. D. Van Veen. 2003. Mcrowave magng va space-tme beam formng for early detecton of breast cancer. IEEE rans. Antennas ropag. 51: 1690-1705. [9] L X., S. K. Davs, S. C. Hagness, D. W. Wede and B. D. Veen. 2004. Mcrowave Imagng va Space- me Beamformng: Expermental Investgaton of umor Detecton n Multlayer Breast hantoms. IEEE rans. Mcrowav. heory ech. 52: 1856-1865. [10] Shannon C., E. Fear and M. Okonewsk. 2005. Delectrc-flled slotlne bowte antenna for breast cancer detecton. Electroncs Letters. 41: 388-390. [11] S. A. Alshehr, S. Khatun, A. B. Jantan, R. S. A. Raa Abdullah, R. Mahmud, Z. Awang. 2011. 3D expermental detecton and dscrmnaton of malgnant and bengn breast tumor usng -based UWB magng system. rogress In Electromagnetcs Research. 116: 221-237. [12] S. A. AlShehr and S. Khatun. 2009. UWB Imagng For Breast Cancer Detecton Usng eural etwork. rogress In Electromagnetcs Research C. 7: 79 93,. [13] S. H. Zanud-Deen, W. M. Hassen, E. M. Al, K. H. Awadalla and H. A. Sharshar. 2008. Breast Cancer Detecton Usng a Hybrd Fnte Dfference Frequency Doman and artcle Swarm Optmzaton echnques. rogress n Electromagnetcs Research B. 3: 35-46. [14] Y. Ireaneus Anna Rean and S. hamara Selv. 2009. Early Detecton of Breast Cancer Usng SVM Classfer echnque. Internatonal Journal on Computer Scence and Engneerng. 1(3): 127-130. [15] Walker H. Land, Jr., Lut Wong, Dan Mckee, Mark embrechts and Frances Anderson. 2004. Applyng Support Vector Machnes to Breast Cancer Dagnoss usng Screen Flm Mammogram Data. 17 th IEEE Symposum on Computer -Based Medcal Systems (CBMS'04). pp. 224-228. [16] Wang Y and Wan Fuyong. 2006. Breast Cancer Dagnoss va Support Vector Machnes. 25 th Chnese Control Conference. pp. 1853-1856. [17] Sant Wulan urnam, S.. Rahayu and Abdullah Embong. 2008. Feature Selecton and Classfcaton of Breast Cancer Dagnoss Based on Support Vector Machnes. Internatonal Symposum on Informaton echnology, ISm. pp. 1-6. [18] Shang Gao and Hongme L. 2012. Breast Cancer Dagnoss Based on Support Vector Machne. Internatonal Conference on Uncertanty Reasonng and Knowledge Engneerng. pp. 240-243. [19] Muhammad Hussan, Summrna Kanwal Wad, Al Elzaart and Mohammed Berbar. 2011. A Comparson of SVM Kernel Functons for Breast Cancer Detecton. 8 th Internatonal Conference Computer Graphcs, Imagng and Vsualzaton. pp. 145-150. [20] me Doman Corporaton, Cummngs Research ark, 330 Wynn Drve, Sute 300, Huntsvlle, AL 35805, USA. 334
AR Journal of Engneerng and Appled Scences 2006-2014 Asan Research ublshng etwork (AR). All rghts reserved. [21] S. A. Alshehr, S. Khatun, A. B. Jantan, R. S. A. Raa Abdullah, R. Mahmood and Z. Awang. 2011. Expermental Breast umor Detecton Usng n- Based UWB Imagng. rogress n Electromagnetcs Research. 111: 447-465. [22] emko, A., E. homas, W. Marnane, G. Lghtbody, and G. Boylan. 2011. EEG-based neonatal sezure detecton wth support vector machnes. Clncal europhysology 122 (3): 464-473. [23] V.. Vapnk. 1995. he ature of Statstcal Learnng heory. Wley, ew Work. [24] V. Vapnk. 1998. Statstcal Learnng heory. Wley, ew York, USA. [25] Smon Haykn. 2001. eural etworks: A Comprehensve Foundaton. earson Educaton, Inc., 2 nd Edton. [26] om Fawcett. 2006. An Introducton to ROC analyss. attern Recognton Letters. 27: 861-874. 335