Internatonal OEN ACCESS Journal Of Modern Engneerng esearch (IJME) Survval ate of atents of Ovaran Cancer: ough Set Approach Kamn Agrawal 1, ragat Jan 1 Department of Appled Mathematcs, IET, Indore, Inda Department of Mathematcs, SIS, Indore, Inda ABSTACT: Ovary cancer s becomng a record cause of death among women n the whole world. The survval of ovary cancer patents cannot be eactly predcted how long they wll alve. In ths paper, ough Set Theory s used for feature etracton and rule generaton. Eperments conducted on the database of ovary cancer patents who were operated for ovary cancer 10 years before. An epermental result shows that the type of surgery has a sgnfcant role n the survval of patents. Keywords: Ovary Cancer, ough Set, Johnson s Heurstc Algorthm, Decson ule. I. INTODUCTION Cancer s one of the leadng dseases n all over the world. Cancer dseases are developed by the abnormal cells that dvde uncontrollably and destroy body tssues. Cancers are varous types lke Breast Cancer, Bone Cancer, Ovaran Cancer, Lver Cancer, Lung Cancer etc. Ovaran cancer ranks fourth among cancer deaths n women. Ths hgh death rate s due to the fact that almost 70% of women wth epthelal ovaran cancer are not dagnosed untl the dsease gets advanced to Stage III (.e. cancer has spread to the upper abdomen) or beyond [1]. The Mult Class Support Vector Machne was used n the prevous researches for classfyng the normal, early stage and late stage cases as an optmzed attrbute set. The optmzed feature selecton process was acheved by the hybrd artcle Genetc Swarm Optmzaton (GSO) based rough set theory []. Dfferent types of dagnoss are done for detectng cancer dsease. After surgery, the survval of patent s unpredctable because some patents survve so many years and some for few years or months. For the predcton of survval rate of patents, ough Set Theory s used as one of the tools. The theory of ough Sets (S) proposed by Zdzslaw awalk n 198 [3]. It s a mathematcal tool for etractng knowledge from uncertan and ncomplete data. It gves the optmal results for the analyss of data wthout loss of nformaton. The advantage of ths approach s that, t does not requre the user to make any pror assumptons about the data. It can be used to fnd attrbutes effcently; a crtcal step n the decson-makng process, through the computng of reducts [4]. II. FUNDAMENTALS OF OUGH SET THEOY (ST) An nformaton system (IS) s a par (U,A), where U s a nonempty, fnte set of obects called the Unverse and A s a nonempty, fnte set of Attrbutes, such that f a : U Va for any a A, where V a s the set of values of a. If A, there s an assocated Equvalence elaton, called as Indscernblty elaton: IND, U, b b, b, where all dentcal obects of set are consdered as elementary sets of the unverse. The artton determned by, denoted as U/IND(), or smply U/. Let U of set are defne as [5, 6]:, the -lower appromaton and -upper appromaton U : ( ) U : ( ) The dfference between Upper and Lower Appromaton s called as Boundary egon: BN ) ( ) ( If the Boundary egon of s the empty set,.e., BN () =, then the set s crsp (eact) wth respect to ; n the opposte case,.e., f BN (), the set s rough (neact) wth respect to. ough set can be also characterzed numercally by the followng coeffcent: card card IJME ISSN: 49 6645 www.mer.com Vol. 6 Iss. 11 November 016 31
Survval ate Of atents Of Ovaran Cancer: ough Set Approach whch s called as accuracy of appromaton. Obvously 0 () 1. If () = 1, s crsp wth respect to ( s precse wth respect to ), and otherwse, f () < 1, s rough wth respect to ( s vague wth respect to ). The reduct process for attrbutes reduces elementary set numbers, the goal of whch s to mprove the precson of decsons. After the attrbute dependence process, the reduct attrbute sets are generated to remove superfluous attrbutes. The complete set of attrbutes s called a reduct attrbute set. There may be more than one reduct attrbute set n an nformaton system, but ntersectng a number of reduct attrbute sets yelds a core attrbute set. ED COE A ED.1. Johnson s Heurstc Algorthm The Johnson s Heurstc Algorthm s used to calculate reduct for a decson problem. It sequentally selects features by fndng those that are most dscernble for gven decson feature [7]. It computes a dscernblty matr M, where m f F : f c f c for f c f c, and otherwse Johnsons _educt (F p, F d, C) Input F p : condtonal features, f d : decson feature, C: cases Output : educt F, p. 1. F F p M Compute Dscernblty Matr C, F,. 3. Do 4. h f Select Hghest Scorng Feature M 5. f h 6. For 0 to C, to C 7. m f f h m 8. F F f h 9. Untl m, 10. eturn For the standard Johnson s Algorthm, ths s typcally a count of the number of appearances an attrbute makes wthn clauses; attrbutes that appear more frequently are consdered to be more sgnfcant. The attrbute wth the hghest heurstc value s added to the reduct canddate, and all clauses n the dscernblty functon contanng ths attrbute are removed. As soon as all clauses are removed, the algorthm termnates and returns to reduct... Decson ules Let S= (U,, ) be a decson table. Every U 1, n where,......, n., d f d d determnes a sequence 1.,.. n 1, and 1,,......, n, The sequence s called as Decson ule nduced by (n S) and denoted by,...,,,...., n 1..., n The number 1 or The number,, called as Support of Decson ule. IJME ISSN: 49 6645 www.mer.com Vol. 6 Iss. 11 November 016 3
s referred to as the Strength of the decson rule decson rule Survval ate Of atents Of Ovaran Cancer: ough Set Approach,, U the certanty factor s denoted by cer, ( cer,, where denotes the cardnalty of. Wth every, ) and defned as follows: If cer (, ) = 1, then s called as Certan Decson ule; If 0 < cer (, ) < 1 the decson rule s referred to as an uncertan decson rule. Besdes, a Coverage Factor of the decson rule, denoted as cov (, ) and defned as If cov s a decson rule then, further used to gve eplanatons (reasons) for a decson.,, s called an nverse decson rule. The nverse decson rules are III. DESCITION OF DATASET The treatments for ovaran cancer are surgery, chemotherapy, bologcal therapy or radotherapy, the type of treatment depends on the stage of the cancer. Sometmes both surgery and chemotherapy treatment are appled for advance stage of cancer. Advance stage means the cancer has spread away from the ovary. In surgery treatment surgeon removes as much of the cancer as possble. Chemotherapy appled before the surgery. It shrnks the cancer and make t easer to remove. In radotherapy, -rays are used to kll or damage cancer cell and reduce ther actvty [8]. The dataset of the Ovary Cancer dsease s taken from Obel (1975) who studed on women, who were operated for ovary cancer 10 years before. The dataset conssts of 5 attrbutes where 3 are condtonal attrbute, one s decson attrbute and one s frequency attrbute [9, 10, 11]. Table I shows the notaton of ovary cancer attrbutes wth ther descrpton. Table I: Ovary Cancer Database Attrbutes Descrpton Notaton Descrpton A1 Stage of the cancer at the tme of operaton (e = early, a = advanced). A Type of operaton (r = radcal, l = lmted) A3 -ray treatment was receved (yes, no). D Survval status after 10 years (yes, no). F Frequency IV. EEIMENTAL ESULTS AND DISCUSSION Table II represents the three factors affectng ovary cancer survval that concern fourteen cases. U s a unversal set of obects. In table, columns C A, A, } represent condtonal attrbutes and column D { 1 A3 represents decson attrbute. The elementary set of three attrbutes s,,,,,,,,,,,, U C and 1 3 4 5 7 6 8 10 9 11 1 13, equvalence class of decson attrbute s Y Yes, No. Table II: Database of Ovary Cancer U A1 A A3 D F 1 e r No No 10 e r Yes No 17 3 e r No Yes 41 4 e r Yes Yes 64 5 e l Yes No 3 6 e l No Yes 13 14 IJME ISSN: 49 6645 www.mer.com Vol. 6 Iss. 11 November 016 33
Thus, the Lower appromaton s C yes Survval ate Of atents Of Ovaran Cancer: ough Set Approach 7 e l Yes Yes 9 8 a r No No 38 9 a r Yes No 64 10 a r No Yes 6 11 a r Yes Yes 11 1 a l No No 3 13 a l Yes No 13 14 a l Yes Yes 5 6 Upper appromaton of set s C yes,,,,,,,,,,,,,. 1 3 4 5 6 7 8 9 10 11 1 13 14 Boundary regon of set s yes,,,,,,,,,, BNB. 1 3 4 5 7 8 9 10 11 1 13, For fndng the educt, the Johanson Herurstc Algorthm s used. The set of attrbutes thus obtaned s { A 1, A, A3} A e A, l A, no D, yes. The decsons rules are generated from ths reduct are as follows: 1. 1, 3. A 1, a A, la3, no D, no 3. A, r D, yes 4. A, r D, no 5. A 3, yes D, yes 6. A, yes D, no 3 The certanty and coverage factors of the decson rules are gven s Table III. Table III: Certanty and Coverage factors ules Support Strength Certanty Coverage 1 13 0.09 1 0.09 3 0.007 1 0.01 3 1 0.69 0.48 0.8 4 19 0.85 0.51 0.87 5 89 0.196 0.48 0.59 6 97 0.14 0.5 0.65 From the certanty and coverage factors of all decson rules, It s observed that the patent could survve after 10 years of surgery f the type of surgery s lmted and wthout -ray treatment n early or advance stage of ovaran cancer. The patent would be dffcult to survve after 10 years, f the surgery s radcal type or receves -ray treatment. V. CONCLUSION In ths paper, rough set theory approach s used to predct the survval rate of patents after the 10 years treatment of ovary cancer. Johnson Heurstc Algorthm s used for feature selecton of ovary cancer survval. Decson rules are generated on the bass of the features. Certanty and coverage factors are calculated on the bass of decson rules. Both are helpful to predct the survval of patents after 10 years of surgery. It s concluded that patents survve after 10 years, f the surgery s not radcal type. EFENCES [1]. T. Z. Tan, C. uek and G.S. Ng, Ovaran Cancer Dagnoss by Hppocampus and Neocorte Inspred Learnng Memory Structures, Neural Networks, 18, 005, 818 85. [].. Yasodha and N.. Anathanarayanan, Analyss Bg Data to Buld Knowledge Based System For Early Detecton of Ovaran Cancer, Indan Journal of Scence and Technology,8, 015, 1-7. [3]. Z. awlak, ough Sets, Internatonal Journal of Computer and Informaton Scences, 11, 198, 341 356. 14 IJME ISSN: 49 6645 www.mer.com Vol. 6 Iss. 11 November 016 34
Survval ate Of atents Of Ovaran Cancer: ough Set Approach [4]. H. Ba, Y. Ge, J. Wang and Y. L. Lao, Usng ough Set Theory to Identfy Vllages Affected by Brth Defects: The Eample of Heshun, Shan, Chna, Internatonal Journal of Geographcal Informaton Scence, 4, 010, 559 576. [5].. Shen and. Jensen, ough Sets, Ther Etensons and Applcaton, Internatonal Journal of Automaton and Computng, 4, 007, 100-106. [6]. B. Walczak, D.L. Massart, ough Set Theory, Chemometrcs and Intellgent Laboratory Systems, 47, 1999, 1 16. [7]. A. H. El-Baz, Hybrd Intellgent System Based ough Set and Ensemble Classfer for Breast Cancer Dagnoss, Neural Computng and Applcatons, 6, 015, 437 446. [8]. http://www.cancerresearchuk.org/about-cancer/type/ovaran-cancer/treatment/whch-treatment-for-ovaran-cancer. [9]. E. B. Obel, A Comparatve Study of atents wth Cancer of the Ovary Who Have Survved More or Less Than 10 Years. Acta Obstetrca et Gynecologca Scandnavca, 55, 1975, 49-439. [10]. E. B. Andersen, The Statstcal Analyss of Categorcal Data. nd edton, (Berln, Sprnger-Verlag, 1991), 19-193. [11]. https://vncentarelbundock.gthub.o/datasets/datasets.html IJME ISSN: 49 6645 www.mer.com Vol. 6 Iss. 11 November 016 35