AN ENHANCED GAGS BASED MTSVSL LEARNING TECHNIQUE FOR CANCER MOLECULAR PATTERN PREDICTION OF CANCER CLASSIFICATION

www.arpapress.com/volumes/vol8issue2/ijrras_8_2_02.pdf

I. Julie 1 & E. Kirubakaran 2
1 Department of Computer Science, Arignar Anna Government Arts College, Musiri 621 201, India
2 Senior Deputy General Manager, BHEL, Trichy 620 014, India

ABSTRACT
Cancer classification is becoming a critical basis for patient therapy. Researchers are continuously developing and applying more accurate classification algorithms based on the gene expression profiles of patients. Microarray technologies have had an enormous impact on cancer genome research. To predict cancer classes, two methods have been proposed: the Signal-to-Noise Ratio (SNR) based Genetic Algorithm on Gene Selection (GAGS) and the Multi-Task Support Vector Sample Learning (MTSVSL) technique. GAGS is a filter, used to select target genes in the diagnosis of cancer. MTSVSL is a wrapper, based on a Back Propagation Neural Network (BPNN) and a linear Support Vector Machine. That work yields good classification accuracy for leukaemia cancer genes. From the literature survey, this research work found that classification performance, in terms of accuracy and error rate, could be improved if a Counter Propagation Neural Network (CPNN) is combined with MTSVSL instead of the BPNN. This is called the Enhanced MTSVSL (EMTSVSL) learning technique. The experimental results establish that the proposed technique achieves higher classification performance, in terms of accuracy and error rate, than the existing technique.

Keywords: Gene Prediction, Genetic Algorithm Gene Selection, Cancer Classification, Multi-Task Learning, Support Vectors, Back Propagation Neural Network, Counter Propagation Neural Network.

1. INTRODUCTION
Microarray technologies, which measure the expression levels of thousands of genes simultaneously, have had a great impact on cancer genome research over the past few years.
The microarray gene selection [1,2,4,7] procedure is shown in Figure 1. Currently, microarray-based gene expression profiling is viewed as a promising approach for predicting cancer classes and prognosis outcomes. In most cases, cancer diagnosis depends on a complex combination of clinical and histopathological data. However, it is often difficult or impossible to recognize a tumor type in some atypical instances. Large-scale profiling of genetic expression and genomic alterations using DNA microarrays can reveal the differences between normal and malignant cells, the genetic and cellular changes at each stage of tumor progression and metastasis, and the differences among cancers of different origins.

Cancer classification is becoming a critical basis for patient therapy. Researchers are continuously developing and applying more accurate classification algorithms based on the gene expression profiles of patients. Several data mining techniques [1,2,4,5,7,8,9], such as Support Vector Machines (SVM), K-Nearest Neighbors, the Ensemble Rough Hypercuboid Approach, Principal Component Analysis (PCA), Nonnegative Principal Component Analysis (NPCA), the Nonparallel Plane Proximal Classifier (NPPC), the Back Propagation Neural Network, and the Multiple-Filter-Multiple-Wrapper (MFMW) approach, have been proposed and applied in cancer diagnosis and classification.

In a microarray chip, the number of genes available is far greater than the number of samples, a well-known problem called the curse of dimensionality [8]. However, most genes in a microarray contribute little to the sample classification problem. Therefore, prior to sample classification, it is important to perform gene selection, whereby more interpretable genes are identified as biomarkers, so that more efficient, accurate, and reliable classification performance can be expected. These biomarkers may also be useful for assessing disease risk [6] and understanding the basic biology of a disorder [8]. There are, in general, two approaches to gene selection, namely filters and wrappers [8].
The filter approach selects genes according to their discriminative power with regard to the class labels of the samples. Scores such as the Signal-to-Noise Ratio (SNR), t-statistics (TS), the threshold number of misclassifications (TNoM) score, and the F-test have been shown to be effective measures of the discriminative power of genes. In all cases, genes are ranked according to their statistical scores, and a certain number of the highest-ranking genes are selected for classification. However, these filters have failed to select more interpretable genes. To overcome this problem, this paper focuses on the Signal-to-Noise Ratio (SNR) based Genetic Algorithm on Gene Selection (GAGS), which improves the performance of the gene selection technique so that more interpretable genes can be identified and a more efficient, accurate, and reliable classification performance can be achieved.

In the wrapper approach, genes are selected sequentially, one by one, so as to optimize the training accuracy of a particular classifier [8]. That is, the classifier is first trained using one single gene, and this training is performed for every gene in the original gene set. The gene that gives the highest training accuracy is selected. Then a second gene is added to the selected gene, and the gene that gives the highest training accuracy for the two-gene classifier is chosen. This process continues until a sufficiently high accuracy is achieved with a certain gene subset. From the literature survey, it is observed that existing classifiers such as the Support Vector Machine (SVM) and k-Nearest Neighbor have their own limitations, such as false positive and false negative classifications.

Figure 1. Microarray gene selection mechanism

To overcome this, Austin H. Chen et al. [1] proposed the MTSVSL learning technique, which is based on a Back Propagation Neural Network and a linear Support Vector Machine. That work yields good classification performance in terms of accuracy and error rate.

1.1 Objective of this Work
However, from our literature survey [1,2,8,9], it is identified that the performance of the Multi-Task Support Vector Sample Learning (MTSVSL) technique could be improved if a Counter Propagation Neural Network is introduced with Genetic Algorithm based Gene Selection (GAGS) in place of the Back Propagation Neural Network (BPNN). The result is named the Enhanced MTSVSL (EMTSVSL) learning technique. It aims to find an optimal informative gene subset, thereby avoiding the over-fitting problem caused by attempting to apply a large number of genes to a small number of samples.

2. BACKGROUND
In this section, the features of the Signal-to-Noise Ratio (SNR) gene selection method, the Genetic Algorithm based Gene Selection (GAGS) method, the Support Vector Sampling (SVS) technique, and the Multi-Task Learning (MTL) method are discussed.

2.1 Signal-to-Noise (SNR) based Gene Selection Method
Gene selection is widely used to select target genes in the diagnosis of cancer. One of the prime goals of gene selection is to avoid the over-fitting problems caused by the high dimensionality and relatively small number of samples of microarray data. Theoretically, in cancer classification, only informative genes that are highly related to particular classes should be selected. The study of Austin H. Chen et al. [1] used the Signal-to-Noise Ratio (SNR) as the gene selection method. For each gene, the expression data are normalized by subtracting the mean and then dividing by the standard deviation of the expression values. Every sample is labeled with {+1, -1} as either a normal or a cancer sample. The following formula is used to calculate the F score of each gene g_i:

$$F(g_i) = \left| \frac{\mu_{+1}(g_i) - \mu_{-1}(g_i)}{\sigma_{+1}(g_i) + \sigma_{-1}(g_i)} \right| \tag{1}$$

Here µ and σ are the mean and standard deviation of the gene's expression over the samples of each class (+1 or -1), taken individually. The genes are then ranked by their F scores.

2.2 Genetic Algorithm based Gene Selection (GAGS) Technique
The genetic algorithm [1] is an effective algorithm for searching a complex high-dimensional space and finding the optimal solution. Austin H. Chen et al. proposed this Genetic Algorithm based Gene Selection method, which can find the most informative gene set. The genetic algorithm is a type of evolutionary computing method widely used to simulate the process of natural selection. The basic concept behind the genetic algorithm consists of four steps: population, reproduction, crossover, and mutation.

Before beginning the genetic algorithm, the gene expression data are randomly separated into three parts: a testing dataset, a training dataset, and a validation dataset. The testing dataset is an independent dataset used purely for measuring classification performance.

2.3 Population
All the genes are randomly separated into m chromosomes, and each chromosome contains n genes. Each chromosome represents a possible gene subset. The values of m and n can be set according to the requirement.

2.4 Reproduction
In the biological evolutionary process, only the organisms that adapt to the environment survive. Likewise, only chromosomes with high fitness scores replicate and are passed on to the next stage. The fitness function is defined as

$$\mathrm{Fitness} = \frac{1}{3}\,\mathrm{ATR} + \frac{2}{3}\,\mathrm{ATV} \tag{2}$$

where ATR is the predictive accuracy on the training dataset using the support vector machine and ATV is the predictive accuracy on the validation dataset. The reproduction rate may influence the variety of chromosomes. If the variety of chromosomes is low, the genetic algorithm may settle on a local optimum instead of the global optimum.

2.5 Crossover
After the reproduction phase, offspring are created by crossing over the parent chromosomes at the cross point.
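The fitness score of Eq. (2), the reproduction step, and the crossing of two parent chromosomes can be sketched as follows. This is only an illustrative sketch: the function names, the `keep` parameter (how many chromosomes survive reproduction), and the accuracy values are assumptions, and in practice ATR and ATV would come from an SVM evaluated on the training and validation datasets.

```python
import random

def fitness(atr, atv):
    """Eq. (2): validation accuracy is weighted twice as heavily as training accuracy."""
    return atr / 3.0 + 2.0 * atv / 3.0

def reproduce(population, scores, keep):
    """Keep the `keep` chromosomes with the highest fitness scores (assumed policy)."""
    ranked = sorted(zip(scores, population), key=lambda pair: pair[0], reverse=True)
    return [chromosome for _, chromosome in ranked[:keep]]

def single_point_crossover(parent_a, parent_b, rng):
    """Swap the tails of two equal-length chromosomes at a random cross point."""
    point = rng.randrange(1, len(parent_a))  # 1..n-1, so each child mixes both parents
    child_a = parent_a[:point] + parent_b[point:]
    child_b = parent_b[:point] + parent_a[point:]
    return child_a, child_b

# Hypothetical population of m = 3 chromosomes with n = 4 genes each.
rng = random.Random(0)
population = [["g1", "g2", "g3", "g4"],
              ["g5", "g6", "g7", "g8"],
              ["g9", "g10", "g11", "g12"]]
accuracies = [(0.90, 0.80), (0.95, 0.70), (0.85, 0.90)]  # made-up (ATR, ATV) pairs
scores = [fitness(atr, atv) for atr, atv in accuracies]
parents = reproduce(population, scores, keep=2)
child_a, child_b = single_point_crossover(parents[0], parents[1], rng)
```

Because ATV carries the larger weight, the third chromosome ranks first here even though its training accuracy is the lowest, which is the behaviour Eq. (2) is meant to encourage.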
The single-point crossover approach was used. The crossover point is randomly generated, and two randomly selected chromosomes are crossed at this point.

2.6 Mutation
To increase the possibility of finding the optimal solution, a mutation phase is applied. Let P and p be the mutation probabilities of each chromosome and each gene, respectively. Every chromosome generates a random number R, and if R > P the chromosome is added to the mutation pool. Every gene in these chromosomes also generates a random number r, and if r > p the gene is replaced with another randomly selected gene from the F-gene pool.

2.7 Multi-Task Support Vector Sample Learning (MTSVSL)
Multi-Task Support Vector Sample Learning (MTSVSL) combines two methodologies [1] to improve the classification accuracy obtained from gene expression data: the Support Vector Sample (SVS) method and the Multi-Task Learning (MTL) method. With this approach, a classifier learns two tasks. The main task is "which kind of sample is this?" and the second task is "is this sample a support vector sample?". The samples are categorized into four classes, namely
1. The sample belongs to class 1 and is a support vector sample
2. The sample belongs to class 2 and is a support vector sample
3. The sample belongs to class 1 but is not a support vector sample
4. The sample belongs to class 2 but is not a support vector sample

2.8 Support Vector Sampling Technique (SVS)
A binary SVM [1,8,9] attempts to find a hyperplane that maximizes the margin between two classes (+1/-1). Let

$$\{(X_i, Y_i) \mid X_i \in R^n,\ Y_i \in \{-1, +1\},\ i = 1, 2, \ldots, l\} \tag{3}$$

be the gene expression data with positive and negative class labels. The SVM learning algorithm should find the maximum-margin separating hyperplane W · X + b = 0, where W is the n-dimensional normal vector, perpendicular to the hyperplane, and b is the bias. The SVM decision function is shown in formula (4), where the α_i are positive real numbers and φ is the mapping function:

$$W^T \varphi(X) + b = \sum_{i=1}^{l} \alpha_i Y_i\, \varphi(X_i)^T \varphi(X) + b \tag{4}$$

Only the φ(X_i) with α_i > 0 are used, and these points are the support vectors. The support vectors lie close to the separating hyperplane. Here 0 < α_i < C, where C is the penalty parameter of the error term. If α_i is zero, the corresponding sample has no influence on the hyperplane.

2.9 Multi-Task Learning (MTL) method
The principal goal of multi-task learning [1] is to improve the performance of a classifier. Multi-task learning can be considered an inductive transfer mechanism, in which the inductive transfer leverages additional sources of information to improve learning performance within the current task. Variables that were not used as the initial inputs may contain useful information. Instead of discarding these variables, MTL obtains the inductive transfer benefit from them by using them as extra outputs.
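The four-class relabelling of Section 2.7 can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes the indices of the support vector samples have already been obtained from a trained SVM (the samples with α_i > 0), and the function names are hypothetical.

```python
def mtsvsl_labels(class_labels, support_indices):
    """Map each sample to one of the four MTSVSL classes.

    class_labels: list of 1 or 2 (the main task).
    support_indices: indices of the samples that are support vectors (the second task).
    """
    support = set(support_indices)
    combined = []
    for idx, label in enumerate(class_labels):
        if label not in (1, 2):
            raise ValueError(f"unexpected class label {label!r} at index {idx}")
        # Classes 1-2 are support vector samples; classes 3-4 are the remaining samples.
        combined.append(label if idx in support else label + 2)
    return combined

def as_two_task_targets(combined):
    """Split a four-class label back into (main class, is support vector) outputs."""
    return [((c - 1) % 2 + 1, c <= 2) for c in combined]

# Hypothetical example: samples 0 and 1 are support vectors.
labels = mtsvsl_labels([1, 2, 1, 2], support_indices=[0, 1])
targets = as_two_task_targets(labels)
```

The second function shows the multi-task view of the same information: the support-vector flag becomes an extra output alongside the main class label.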
The Back Propagation Neural Network (BPNN) is modeled as the MTL network and learns the tasks.

2.10 Identified Problems
From our literature survey, it is identified that the performance of the Multi-Task Support Vector Sample Learning (MTSVSL) technique is improved as compared with Back Propagation Neural Networks alone. However, the learning technique of MTSVSL has failed to select more interpretable genes and is hence unable to improve the classification accuracy further. That is, the wrapper of this system leads to poor gene classification. This is the major drawback. To overcome this problem, this paper sets out to improve the performance of the wrapper.

3. ENHANCED MTSVSL
As stated in the previous section, the Multi-Task Support Vector Sample Learning (MTSVSL) technique has two methodologies, namely the Support Vector Sample (SVS) technique and the Multi-Task Learning (MTL) technique. These are combined to improve the classification accuracy on gene expression data. The main objective of this work is to improve the performance of MTSVSL. That is, this work improves the performance of MTL by using Counter Propagation Neural Networks.

3.1 The Principle of Counter Propagation Neural Networks
The counter-propagation network is a combination of a portion of the Kohonen self-organizing map [10] and the Grossberg outstar structure [10]. During learning, pairs of the input vector X and output vector Y are presented to the input and interpolation layers, respectively. These vectors propagate through the network in a counterflow manner to yield the competition weight vectors and interpolation weight vectors. Once these weight vectors become stable, the learning process is complete. The output vector Y' of the network corresponding to the input vector X is then computed. The vector Y' is intended to be an approximation of the output vector Y, i.e. Y' ≈ Y = f(X). The equations of the network are described briefly as follows.

Let $U_j = [u_{ji}]$ be the arbitrary initial competition weight vector for the j-th neuron in the competition layer, where $u_{ji}$ is the weight connecting the j-th neuron in the competition layer to the i-th neuron in the input layer. The Euclidean distance between the input vector X and the competition weight vector $U_j$ of the j-th neuron is calculated as

$$d_j = \lVert X - U_j \rVert = \sqrt{\sum_{i=1}^{m} (x_i - u_{ji})^2} \tag{5}$$

Once the distance $d_j$ has been calculated for each neuron, the neuron with the shortest Euclidean distance to X is selected as the winning neuron. As a result of the competition, the output of the winning neuron is set to unity and the outputs of the other neurons are set to zero. Thus, the output of the j-th neuron in the competition layer can be expressed as

$$z_j = \begin{cases} 1.0 & \text{if } d_j < d_h \text{ for all } h \neq j \\ 0.0 & \text{otherwise} \end{cases} \tag{6}$$

The weight $u_{ji}$ connecting the j-th neuron in the competition layer to the i-th neuron in the input layer is adjusted based on the Kohonen learning rule, that is

$$u_{ji}(p+1) = u_{ji}(p) + \beta\,\bigl(x_i - u_{ji}(p)\bigr)\, z_j \tag{7}$$

where β is the learning coefficient and p is the iteration number. After the competition weight vector $U_j$ stabilizes, the interpolation layer starts to learn the desired output vector Y by adjusting the interpolation weight vector. Let $V_k = [v_{kj}]$ be the arbitrary initial interpolation weight vector for the k-th neuron in the interpolation layer, where $v_{kj}$ is the weight connecting the k-th neuron in the interpolation layer to the j-th neuron in the competition layer. The weight $v_{kj}$ is adjusted based on the Grossberg learning rule, that is

$$v_{kj}(p+1) = v_{kj}(p) + \gamma\,\bigl(y_k - v_{kj}(p)\bigr)\, z_j \tag{8}$$

where γ is the learning coefficient. This is repeated until the interpolation weight vector $V_k$ converges to a preset value. The output vector Y' of the network corresponding to the input vector X can be calculated using a weighted summation function. The k-th component $y'_k$ of the output vector Y' can be expressed as

$$y'_k = \sum_{j} v_{kj}\, z_j \tag{9}$$

As the foregoing discussion shows, the counter-propagation network functions as a look-up table. The learning process associates the input vector with the corresponding output vector based on two well-known algorithms, namely the Kohonen self-organizing map, for finding the most similar training vector, and the Grossberg outstar map, for projecting the corresponding output vector. Once the network is trained, the application of an input vector can quickly produce the corresponding output vector. This constitutes the enhanced MTL.
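The two-phase learning of Eqs. (5) to (9) can be sketched in a few lines of plain Python. This is a toy illustration under stated assumptions rather than the tool used in this work: the class name, the fixed epoch counts, the default learning coefficients, and the explicitly supplied initial competition weights (the text allows arbitrary ones) are all choices made here for a deterministic example.

```python
import math

class CounterPropagationNetwork:
    def __init__(self, competition_weights, n_outputs):
        # U[j][i]: weight from input neuron i to competition neuron j.
        self.U = [list(row) for row in competition_weights]
        # V[k][j]: weight from competition neuron j to interpolation neuron k.
        self.V = [[0.0] * len(self.U) for _ in range(n_outputs)]

    def winner(self, x):
        """Eqs. (5)-(6): index of the competition neuron closest to x."""
        distances = [math.sqrt(sum((xi - u) ** 2 for xi, u in zip(x, row)))
                     for row in self.U]
        return distances.index(min(distances))

    def train(self, samples, beta=0.5, gamma=0.5, epochs=50):
        # Phase 1, Eq. (7): the Kohonen rule moves only the winner toward x.
        for _ in range(epochs):
            for x, _ in samples:
                j = self.winner(x)
                self.U[j] = [u + beta * (xi - u) for xi, u in zip(x, self.U[j])]
        # Phase 2, Eq. (8): the Grossberg rule moves the winner's outgoing weights toward y.
        for _ in range(epochs):
            for x, y in samples:
                j = self.winner(x)
                for k, yk in enumerate(y):
                    self.V[k][j] += gamma * (yk - self.V[k][j])

    def predict(self, x):
        """Eq. (9): with a one-hot z, the output is the winner's column of V."""
        j = self.winner(x)
        return [row[j] for row in self.V]

# Hypothetical two-cluster data with one-hot targets.
samples = [([0.0, 0.1], [1.0, 0.0]), ([0.1, 0.0], [1.0, 0.0]),
           ([0.9, 1.0], [0.0, 1.0]), ([1.0, 0.9], [0.0, 1.0])]
net = CounterPropagationNetwork([[0.2, 0.2], [0.8, 0.8]], n_outputs=2)
net.train(samples)
output = net.predict([0.05, 0.05])
```

Because z is one-hot, prediction reduces to reading out the stored column for the winning neuron, which is the look-up-table behaviour noted above.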

4. EXPERIMENTAL RESULTS AND DISCUSSIONS
We have developed an MTSVSL tool with NetBeans, configured with BioWeka 0.6.1. As shown in Figure 2, it consists of two modules. The first module is a filtering module, where GAGS is implemented. In this module, the chromosome size can be fixed. The second module is the wrapper module, where MTSVSL and EMTSVSL have been implemented. As shown in Figure 3, the SVM with BPNN classifies the cancer pattern from the dataset.

Figure 2. GAGS based MTSVSL tool with BioWeka 0.6.1
Figure 3. MTSVSL SVM classification

For the experimental study, leukaemia cancer pattern datasets are considered, and the number of top genes is taken as 100 and 150. The confusion matrices and their accuracy and error rates are shown in Figures 4 to 7.

Figure 4. Confusion matrix of MTSVSL for leukaemia cancer pattern (Top Genes: 100)
Figure 5. Confusion matrix of EMTSVSL for leukaemia cancer pattern (Top Genes: 100)
Figure 6. Confusion matrix of MTSVSL for leukaemia cancer pattern (Top Genes: 150)

From Figure 4, it is noted that the existing GAGS based MTSVSL obtains a classification accuracy of 93.8033% and an error rate of 0.06197 for the leukaemia cancer pattern with 100 top genes. The proposed GAGS based EMTSVSL technique achieves a classification accuracy of 95.8443% and an error rate of 0.04156, as shown in Figure 5. The proposed work therefore performs better than the existing system. The same experiment is repeated with 150 top genes, as shown in Figures 6 and 7, and the proposed work again outperforms GAGS based MTSVSL.
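For reference, the relation between a confusion matrix and the two reported measures can be sketched as follows. The counts below are hypothetical and are not taken from the figures; the function name is also an assumption. The error rate is the complement of the accuracy, so an accuracy of 93.8033% corresponds to an error rate of about 0.06197.

```python
def accuracy_and_error_rate(confusion):
    """Return (accuracy, error_rate) as fractions for a square confusion matrix.

    confusion[i][j] is the number of samples of true class i predicted as class j.
    """
    total = sum(sum(row) for row in confusion)
    if total == 0:
        raise ValueError("confusion matrix has no samples")
    correct = sum(confusion[i][i] for i in range(len(confusion)))
    accuracy = correct / total
    return accuracy, 1.0 - accuracy

# Hypothetical counts, not taken from the paper's figures.
example = [[45, 2],
           [1, 24]]
acc, err = accuracy_and_error_rate(example)
print(f"accuracy = {100 * acc:.4f}%, error rate = {err:.5f}")
```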

Figure 7. Confusion matrix of EMTSVSL for leukaemia cancer pattern (Top Genes: 150)

5. CONCLUSION
Microarray technologies have had an enormous impact on cancer genome research. To predict cancer classes, GAGS was used to select target genes in the diagnosis of cancer, and the MTSVSL learning technique, based on a Back Propagation Neural Network and a linear Support Vector Machine, was implemented for classification. To improve its classification accuracy, this paper proposed an efficient enhanced MTSVSL (EMTSVSL). The experimental results establish that the proposed technique achieves higher classification accuracy with a lower error rate than the existing MTSVSL technique. The leukaemia cancer pattern was used for the experimental study.

REFERENCES
[1]. Austin H. Chen and Jen-Chieh Hsu, Exploring novel algorithms for the prediction of cancer classification, International Conference on Software Engineering and Data Mining (SEDM), ISBN: 978-1-4244-7324-3, pp. 378-383, 2010.
[2]. Statnikov A, Aliferis CF, Tsamardinos I, Hardin D, Levy S, A comprehensive evaluation of multicategory classification methods for microarray gene expression cancer diagnosis, Bioinformatics, vol. 21, pp. 631-643, 2005.
[3]. Ramaswamy S. et al., Multiclass cancer diagnosis using tumour gene expression signatures, Proc. Natl Acad. Sci. USA, 98, pp. 15149-15154, 2001.
[4]. Greer BT, Khan J, Diagnostic classification of cancer using DNA microarrays and artificial intelligence, Ann N Y Acad Sci, vol. 1020, pp. 49-66, 2004.
[5]. Ramirez L, Durdle NG, Raso VJ, Hill DL, A support vector machines classifier to assess the severity of idiopathic scoliosis from surface topology, IEEE Trans. Inf. Technol. Biomed., vol. 10, no. 1, pp. 84-91, Jan. 2006.
[6]. Y. Wang, I.V. Tetko, M.A. Hall, E. Frank, A. Facius, K.F.X. Mayer, and H.W. Mewes, Gene Selection from Microarray Data for Cancer Classification: A Machine Learning Approach, Computational Biology and Chemistry, vol. 29, no. 1, pp. 37-46, 2005.
[7]. Rhodes et al., Oncomine 3.0: Genes, Pathways, and Networks in a Collection of 18,000 Cancer Gene Expression Profiles, Neoplasia, vol. 9, no. 2, pp. 166-180, 2007.
[8]. Yukyee Leung and Yeungsam Hung, A Multiple Filter Multiple Wrapper to gene selection and microarray data classification, IEEE/ACM Transactions on Computational Biology and Bioinformatics, vol. 7, no. 1, January-March 2010.
[9]. Minghao Piao, Jong Bum Lee, Khalid E.K. Saeed, and Keun Ho Ryu, Discovery of significant classification rules from incrementally inducted decision tree ensemble for diagnosis of disease, 2009.
[10]. S.C. Juang, Y.S. Tarng, and H.R. Lii, A comparison between the back-propagation and counter-propagation networks in the modeling of the TIG welding process, Journal of Materials Processing Technology, pp. 54-63, 1998.