Machine learning for HIV-1 protease cleavage site prediction
Pattern Recognition Letters 27 (2006)

Alessandra Lumini, Loris Nanni *
DEIS, IEIIT CNR, Università di Bologna, Viale Risorgimento 2, Bologna, Italy

Received 17 November 2004; received in revised form 16 May 2005. Available online 2 May 2006. Communicated by L. Goldfarb.

Abstract

Recently, several works have approached the HIV-1 protease specificity problem by applying a number of classifier creation and combination methods, known as ensemble methods, from the field of machine learning. However, it is still difficult for researchers to choose the best method due to the lack of an effective comparison. For the first time we have made an extensive study of methods for feature extraction, feature transformation and multiclassifier systems (MCS) on the HIV-1 protease problem. In this work we report an experimental comparison of several learning systems coupled with different feature representations. We confirm previous results stating that linear classifiers obtain higher performance than non-linear classifiers using orthonormal encoding, but we also show that, using the Karhunen-Loève transform, the performance of neural networks is comparable to that of linear support vector machines. Finally, we propose a new hierarchical approach that, for the first time, combines ideas derived from machine learning methodologies with a knowledge base for this particular problem. This approach proves to be a successful attempt to obtain a drastic error reduction with respect to the performance of linear classifiers: the error rate decreases from 9.1% using the linear SVM to 6.6% using our new hierarchical classifier based on pattern rules. © 2006 Elsevier B.V. All rights reserved.

Keywords: HIV-1 protease; Karhunen-Loève transform; Hierarchical approach

1. Introduction

HIV-1 protease (Beck et al., 2000) is an enzyme in the AIDS virus that is essential to its replication.
The chemical action of the protease takes place at a localized active site on its surface. HIV-1 protease inhibitor drugs are small molecules that bind to the active site in HIV-1 protease and stay there, so that the normal functioning of the enzyme is prevented. Understanding and predicting HIV-1 protease cleavage sites in proteins is a very important topic, since cleaved substrates are also templates for the synthesis of tightly binding, chemically modified inhibitors. The standard paradigm for protease-peptide interaction is the lock and key model. In this model a sequence of amino acids fits as a key to the active site in the protease, which is eight residues long in the HIV-1 protease case. In order to design effective HIV protease inhibitors, accurately identifying cleavable peptides of eight residues is crucial. The number of potential solutions is 20^8, as there are 20 amino acids. This makes an accurate and rapid method for predicting HIV protease cleavage (Chou, 1993a,b,c; Cai and Chou, 1998; Chou et al., 1993, 1996; Chou and Zhang, 1992) very helpful, since an exhaustive experimental search is impossible. The interested reader can see (Chou, 1996) for a good review.

(* Corresponding author. E-mail addresses: alumini@deis.unibo.it (A. Lumini), lnanni@deis.unibo.it (L. Nanni).)

In order to approach this problem it is important to know that in HIV-1 protease only one class (the uncleaved category) is shift invariant; the other class is not. Shift invariance means that a category remains unchanged if a
pattern is shifted left or right by one position. For instance, the peptide DDFGRCELAAAMKRHGLHL is not cleaved by HIV-1 protease, which means, due to the shift invariance, that all the octamers DDFGRCEL, ..., MKRHGLHL belong to the same uncleaved category. On the contrary, the cleaved category is not shift invariant, because the cleaving occurs at one specific site and not in nearby sites. A machine learning algorithm is one that can learn from experience (observed examples) with respect to some class of tasks and a performance measure. Machine learning methods are suitable for molecular biology data due to the learning algorithm's ability to construct classifiers/hypotheses that can explain complex relationships in the data. Recently, several works have approached the HIV-1 protease specificity problem by applying techniques from machine learning. In (Cai et al., 1998; Thompson et al., 1995) the authors used a standard feedforward multilayer perceptron (MLP) to solve this problem, achieving an error rate of 12%. In (Cai and Chou, 1998) the authors confirm the results of (Narayanan et al., 2002; Thompson et al., 1995) using the same data and the same MLP architecture, showing that a decision tree was not able to predict the cleavage as well as the MLP. Recently, in (Cai et al., 1998), Support Vector Machines (SVM) have been adopted to predict the cleavage. In (Rögnvaldsson and You, 2003) the authors showed that HIV-1 protease cleavage is a linear problem and that the best classifier for this problem is the linear SVM (L-SVM). Multiclassifier systems (Dietterich, 2000; Masulli and Valentini, 2000; Mayoraz and Moreira, 1997) integrate several data-driven models for the same problem, with the aim of obtaining a better composite global model with more accurate and reliable estimates.
In addition, modular approaches often decompose a complex problem into sub-problems whose solutions are simpler to understand, as well as to implement, manage and update. Some works have combined the output of various classifiers in bioinformatics problems: in (Tan and Gilbert, 2003) the authors compared three ensemble methods (stacking, bagging and boosting), showing that combined methods perform better than the individual learners. In this paper we dispute these results: we believe that the behavior reported in (Tan and Gilbert, 2003) was due to the low performance of the single classifiers adopted (even if they used a larger training set, obtained by ten-fold cross-validation). In this work we first perform a comparison of several machine learning approaches applied to the HIV-1 protease problem; then we show how to develop a new hierarchical classifier (HC) architecture by merging ideas derived from the study of machine learning methodologies with a knowledge base for this particular problem. The experiments show that the HIV-1 problem can be effectively solved using our hierarchical classifier: this approach (HC) yields an error rate of 6.6%, which is much lower than the best previous approaches (9.1% using the linear SVM). Even though some of the rules used (Rögnvaldsson and You, 2003) were found by looking at all the patterns of the dataset, so that the performance of HC is partially biased, in our opinion it is very interesting to show that by merging ideas derived from machine learning methodologies with a knowledge base for this particular problem the error rate is drastically reduced. In fact, using the Q-statistic we show that the dependence between a machine learning classifier such as LDC and a set of mining rules based on pattern motifs is lower than that obtained for any combination of machine learning classifiers.
This means that the two approaches, based on different methodologies, are sufficiently different that they can be coupled to improve the accuracy.

2. Methods

In this section a brief description is given of the feature extraction methodologies, feature transformations, classifiers and ensemble methods combined and tested in this work.

2.1. Feature extraction (FE)

Feature extraction is a process that extracts a set of features from the original pattern representation through some functional mapping. The most used features in this field are the following.

Peptide sequences. A protein sequence is made from combinations of variable length of the 20 amino acids P = {A, C, D, ..., V, W, Y}. A peptide (small protein) is denoted by P = P4 P3 P2 P1 P1' P2' P3' P4', where Pi is an amino acid belonging to P. The scissile bond is located between positions P1 and P1'. As in (Rögnvaldsson and You, 2003), P3 and P4' are not used.

Orthonormal encoding (OE). This is the standard procedure (Rögnvaldsson and You, 2003) to map the sequence P to a sparse orthonormal representation. Each amino acid Pi is represented by a 20-bit vector with 19 bits set to zero and one bit set to one, and each amino acid vector is orthogonal to all the other amino acid vectors. Pi can take on any one of the twenty amino acid values.

n-grams (NG). The n-grams or k-tuples (Wu et al., 1992) are pairs of values (v_i, c_i), where v_i is the feature and c_i is the count of this feature in a protein sequence, for i = 1, ..., 20^n. These features are all the possible combinations of n letters from the set P. The 6-letter exchange group is another commonly used piece of information. The 6-letter group contains six combinations of the letters: A = {H, R, K}, B = {D, E, N, Q}, C = {C}, D = {S, T, P, A, G}, E = {M, I, L, V} and F = {F, Y, W}. Each set of n-gram features from a protein sequence can be scaled using:

x' = x / (L - n + 1)
where x represents the count of a generic gram feature, L is the length of the protein sequence and n is the size of the n-gram features.

2.2. Feature transformation (FT)

Feature transformation is a process through which a new set of features is created from an existing one represented in a vector space R^N.

Karhunen-Loève transform (KL) (Duda et al., 2001). This transform projects high-dimensional data onto a lower-dimensional subspace in a way that is optimal in a sum-squared sense. It is known that KL is the best linear transform for dimensionality reduction. In this paper, we use KL to reduce the original dataset to 50 dimensions.

Independent component analysis (ICA) (Duda et al., 2001). This transform seeks the directions in feature space that show the independence of signals.

KernelPCA (Scholkopf et al., 1998). Each feature vector is first projected from the input space to a high-dimensional feature space by a non-linear map, then a pattern in the high-dimensional space is reduced to a lower dimensionality by KL. This transform is useful when the feature space is non-linear.

2.3. Classifiers

A classifier is a component that uses the feature vector provided by the feature extraction or transformation to assign a pattern to a class.

Linear discriminant classifier (LDC) (Duda et al., 2001). The linear discriminant analysis method consists of searching for linear combinations of selected variables which provide the best separation between the considered classes.

Linear SVM (L-SVM) (Duda et al., 2001). The goal of this two-class classifier is to establish the equation of a hyperplane that divides the training set leaving all the points of the same class on the same side, while maximizing the distance between the two classes and the hyperplane.
Multilayer perceptrons (MLP) (Duda et al., 2001). Multilayer perceptrons are supervised feedforward neural networks trained with the standard back-propagation algorithm. With one or two hidden layers, they can approximate virtually any input-output map, so they are widely used for pattern classification.

Edit distance classifier (EDC) (Levenshtein, 1965). The edit distance of two strings, s1 and s2, is defined as the minimum number of point mutations required to change s1 into s2, where a point mutation is a change, insertion or deletion of a letter. The edit distance is coupled with a nearest-neighbor classifier in order to classify a new pattern.

2.4. Pattern motifs

Studying the peptide sequences, we can note that particular combinations of amino acids influence the cleaving/non-cleaving decision (e.g., if the third amino acid of the peptide sequence is Glutamine, there is a low probability that this peptide is a cleavage site). Pattern motifs (Rögnvaldsson and You, 2003) can be used to create a rule-based classifier.

2.5. Multiclassifier systems (MCS)

Multiclassifier systems combine different approaches to solve the same problem. They combine, by a decision rule, the outputs of various classifiers trained using different datasets. Typical methods for multiclassifiers are the following.

Bagging. Bagging (Breiman, 1996) was among the first methods proposed for ensemble creation. Given a training set S, it generates M new training sets S_1, ..., S_M by randomly picking elements from S; each new set S_i is used to train exactly one classifier. Hence an ensemble of individual classifiers is obtained from the M new training sets.

Random subspace. In the random subspace method (Houle et al., 1998) each individual classifier uses only a subset of all the features for training and testing.

Decision rule (Kittler et al., 1998). Several decision rules can be used to determine the final class from an ensemble of classifiers; the most used are the Vote rule, Max rule, Min rule, Mean rule and Sum rule.
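To make the feature extraction of Section 2.1 concrete, the sketch below (our own illustration, not code from the paper; the octamer and function names are chosen for this example) builds the sparse 160-dimensional orthonormal encoding of an octamer and the scaled n-gram counts x' = x / (L - n + 1):

```python
import numpy as np
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard amino acids

def orthonormal_encode(octamer):
    """Orthonormal encoding: each residue becomes a 20-bit vector with a
    single 1, so an 8-residue peptide maps to a sparse 160-dim vector."""
    vec = np.zeros(len(octamer) * 20)
    for i, aa in enumerate(octamer):
        vec[i * 20 + AMINO_ACIDS.index(aa)] = 1.0
    return vec

def scaled_ngram_counts(sequence, n=1):
    """n-gram counts over all 20^n possible grams, scaled by the number
    of windows in the sequence: x' = x / (L - n + 1)."""
    grams = ["".join(p) for p in product(AMINO_ACIDS, repeat=n)]
    counts = {g: 0 for g in grams}
    for i in range(len(sequence) - n + 1):
        counts[sequence[i:i + n]] += 1
    windows = len(sequence) - n + 1
    return np.array([counts[g] / windows for g in grams])

v = orthonormal_encode("SQNYPIVQ")  # example octamer
print(v.shape, int(v.sum()))        # (160,) 8
```

For n = 1 the scaled counts of any sequence sum to one, which keeps feature magnitudes comparable across sequences of different lengths.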
3. Results and discussion

In this section we perform an empirical comparison of several classification methods obtained by coupling the different approaches described above for performing the HIV-1 task. Results are reported only for the combinations which present a high accuracy. Then we discuss the results yielded by the experiments in order to design a hierarchical classifier. The performances are compared using two measures: the error rate to evaluate the accuracy, and Yule's Q statistic (Yule, 1900) to quantify the independence of the classifiers. For two classifiers D_i and D_k the Q statistic is

Q_{i,k} = (ad - bc) / (ad + bc)

where a is the probability of both classifiers being correct, d is the probability of both classifiers being incorrect, b is the
probability that the first classifier is correct and the second is incorrect, and c is the probability that the second classifier is correct and the first is incorrect. Q varies between -1 and 1. For statistically independent classifiers, Q_{i,k} = 0. Classifiers that tend to recognize the same patterns correctly will have Q > 0, and those which commit errors on different patterns will have Q < 0. All the tests have been conducted on the following dataset using 2-fold cross-validation:

HIV dataset. The dataset contains 362 octamer protein sequences, each of which needs to be classified as an HIV protease cleavable site or uncleavable site. On this dataset, we performed 10 tests, each time randomly resampling the learning and test sets (containing respectively half of the patterns), but maintaining the distribution of the patterns in the two classes. The results reported refer to the average classification accuracy achieved throughout the 10 experiments.

3.1. Accuracy

We report some useful tests on the error rate aimed at comparing the quality of the various methods on the HIV-1 protease problem. Table 1 lists the tests whose results are reported in Fig. 1; we perform each test using both n-grams and orthonormal encoding as feature extraction. The absence of the feature transformation step indicates that the classification task is performed starting from the original features.

Table 1. Tests made for the HIV-1 protease problem

Short name | Feature transformation | Classifier
KLDC  | KL         | LDC
ILDC  | ICA        | LDC
KeLDC | Kernel PCA | LDC
KeSVM | Kernel PCA | L-SVM
KSVM  | KL         | L-SVM
ISVM  | ICA        | L-SVM
L-SVM | (none)     | L-SVM
MLP   | (none)     | MLP
KMLP  | KL         | MLP

The graphs in Figs. 1 and 2 report, respectively, the classification error rates given by the various classifiers using the two different feature extraction methods, and the results obtained using MCSs.
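The Q statistic defined above can be estimated from the two classifiers' per-pattern correctness vectors; the following sketch (ours, not from the paper) does exactly that:

```python
import numpy as np

def q_statistic(correct_i, correct_k):
    """Yule's Q = (ad - bc) / (ad + bc), where a, b, c, d are the joint
    probabilities of (both correct), (only i correct), (only k correct)
    and (both incorrect), estimated from boolean correctness vectors."""
    ci = np.asarray(correct_i, dtype=bool)
    ck = np.asarray(correct_k, dtype=bool)
    a = np.mean(ci & ck)     # both classifiers correct
    b = np.mean(ci & ~ck)    # only classifier i correct
    c = np.mean(~ci & ck)    # only classifier k correct
    d = np.mean(~ci & ~ck)   # both classifiers incorrect
    return (a * d - b * c) / (a * d + b * c)

# Two classifiers whose errors fall on disjoint patterns give Q = -1
print(q_statistic([1, 1, 0, 1], [0, 1, 1, 1]))  # -1.0
```

Identical correctness vectors give Q = 1, while the value is 0 when the two classifiers err independently of each other.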
We can note that the MCS techniques do not obtain a considerable increase in performance with respect to a single classifier; this behavior can be explained by the analysis of the error independence among the classifiers. As concerns the classifiers used in the MCSs, we adopt a variable set of classifiers in each experiment, chosen in order to maximize performance. Random subspace is tested using KL as feature transformation (to reduce the features to a 50-dimensional space) and LDC as classifier. Bagging-LDC is tested using KL as feature transformation and LDC as classifier. Bagging-MLP is tested using KL and MLP. The performances reported are the best obtained by varying the number of dimensions retained and the number of classifiers in the random subspaces, and the number of classifiers used in bagging. The decision rules are evaluated by combining the following five methods described in Table 1: L-SVM, KLDC, KSVM, ILDC, ISVM. These results confirm, as already stated, that using orthonormal encoding as feature extractor the HIV-1 protease cleavage site specificity problem can be solved efficiently by linear models. The confidence limits of the tests reported in Fig. 1 are approximately ±1.5%. In addition, these results prove that, using a KL feature transformation, the performances of non-linear and linear models are similar to each other. We argue that combining a linear transformation and a non-linear classifier can effectively handle this problem, where one class is shift variant while the other is shift invariant. Another interesting result is the low performance of the non-linear transformations (KernelPCA and ICA).

[Fig. 1. Error rate for different classifiers built on orthonormal (left) and n-grams (right) feature spaces.]

[Fig. 2. Error rate for different MCSs built on the orthonormal encoding feature space.]

3.2. Independence of classifiers

Table 2 and Fig. 3 show some useful tests on the error independence between two classifiers. Only the most interesting combinations (evaluated considering the accuracy of each classifier) are reported, for the sake of space.

Table 2. Tests made to study the error independence

Short name | Method 1 (FE, FT, Classifier) | Method 2 (FE, FT, Classifier)
A | OE, (none), L-SVM | OE, KL, L-SVM
B | OE, (none), L-SVM | OE, ICA, L-SVM
C | OE, KL, L-SVM     | OE, ICA, L-SVM
D | NG, (none), L-SVM | NG, KL, L-SVM
E | NG, (none), L-SVM | NG, ICA, L-SVM
F | NG, KL, L-SVM     | NG, ICA, L-SVM
G | NG, (none), L-SVM | OE, ICA, LDC
H | NG, (none), L-SVM | OE, KL, LDC
I | OE, (none), L-SVM | OE, KL, LDC
L | OE, (none), L-SVM | OE, ICA, LDC
M | OE, (none), L-SVM | OE, KL, MLP

[Fig. 3. Error independence for the different classifier combinations A-M.]

3.3. Analysis of results

From the analysis of these results we can draw the following conclusions:
- The n-grams feature space gains an error independence slightly larger than orthonormal encoding, even if the performance of the single classifiers is lower.
- The best trade-off between accuracy and error independence is given by combination M.

Considering these results, it seems very improbable to improve the performance with an ensemble of parallel classifiers. Therefore we design a hierarchical classifier which can exploit the good performance of both LDC and MLP. Moreover, given the low error independence of all the machine learning methods, we insert in our classifier some rules based on pattern motifs and an edit distance classifier.

4. A new hierarchical classifier

There are many ways to build a hierarchical multiclassifier: for example, by using each level to distinguish between one class and the others, or using only a subset of the input features at each level, or using at each step a classifier with rejection to classify patterns with high confidence and forwarding the rejected patterns to the next level (Giusti et al., 2002). In this work, we develop a hierarchical structure in which each step is constituted by a module able to classify only a fraction of the patterns: the rejected patterns are given as input to the following steps.
The classifier is composed of four steps: edit distance + cleavage rule, LDC, cleavage/non-cleavage rule, and MLP. In Fig. 4 a graphical description of the proposed system is given; in the following, each step is described in detail. For the determination of the optimal parameters of the algorithm, one third of the patterns of the training set is randomly selected and used as a validation set.

[Fig. 4. Hierarchical classifier schema: step 1, orthonormal encoding + edit distance classifier + cleavage rule; step 2, KL + LDC; step 3, cleavage/non-cleavage rule; step 4, MLP. Patterns rejected at each step are forwarded to the next one.]
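The reject-and-forward control flow of Fig. 4 can be sketched as a generic cascade (our illustration; the stage rules below are toy placeholders, not the paper's actual classifiers):

```python
def hierarchical_classify(pattern, stages):
    """Pass a pattern through a cascade of classifiers with rejection:
    each stage returns a class label, or None to reject, in which case
    the pattern is forwarded to the next stage.  The final stage must
    always return a label (it classifies without rejection)."""
    for stage in stages[:-1]:
        label = stage(pattern)
        if label is not None:
            return label
    return stages[-1](pattern)

# Toy stages standing in for the four steps of Fig. 4.
stage1 = lambda p: "uncleaved" if p.count("K") >= 2 else None  # confident rule
stage4 = lambda p: "cleaved"                                   # final fallback
print(hierarchical_classify("SQKKPIVQ", [stage1, stage4]))  # uncleaved
print(hierarchical_classify("SQNYPIVQ", [stage1, stage4]))  # cleaved
```

Because each stage only commits on patterns it is confident about, the overall error rate is driven by the early, highly reliable stages, while the final stage absorbs the residual hard cases.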
Giusti et al. (2002) proved that the error probability for a hierarchical system, given a rejection threshold at each level, can be expressed as the sum of the optimal Bayes error and the error rate of each classifier (related to the patterns not rejected). This result means that, in principle, the optimal Bayes error can still be obtained even if all the stand-alone classifiers are not optimal.

4.1. Edit distance classifier + cleavage rule

The edit distance classifier gives good performance for patterns belonging to the shift invariant class, while it is not reliable when it assigns a pattern to the shift variant class. For example, given a training set of patterns belonging to both classes, if a new pattern is near (with respect to the edit distance) to a pattern of the uncleaved class (shift invariant) we can reasonably assume that it belongs to the same class; on the contrary, if the new pattern is near to a pattern of the cleaved class, we cannot make any assumption with a high degree of certainty. Starting from this consideration, we design a classifier that assigns to the uncleaved class the patterns classified as uncleaved by the EDC, while rejecting the others. The error rate of the EDC, if used without rejection to classify all the patterns, is approximately 84.20%. If we reject all the patterns assigned to the cleaved class, it is able to classify 62.70% of the patterns with an error rate of only 4.4%. A possible method to further reduce the error rate of the edit distance classifier is to reject the patterns, classified as uncleaved by the EDC, that satisfy this rule:

(xxx(nyla)xxxx & (!xxxkxxxx | !xx(fkq)xxxxx | !xxxxxcxx | !xxxxxxkx))

The rationale of this rule is the following: from a statistical study on the training set, we have noted that a cleaved pattern with high similarity to an uncleaved one often contains the motif xxx(nyla)xxxx (xxx(nyla)xxxx means that the fourth amino acid must be N, Y, L or A).
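The two ingredients of this first step can be sketched as follows (our illustration, assuming the motif notation used above: 'x' matches any residue and '(nyla)' one of N, Y, L, A; the example sequences are hypothetical). It pairs a Levenshtein edit distance with a motif-to-regex translator for the rejection rule:

```python
import re

def edit_distance(s1, s2):
    """Levenshtein distance: the minimum number of point mutations
    (substitutions, insertions, deletions) turning s1 into s2."""
    m, n = len(s1), len(s2)
    d = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m + 1):
        d[i][0] = i
    for j in range(n + 1):
        d[0][j] = j
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            cost = 0 if s1[i - 1] == s2[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,          # deletion
                          d[i][j - 1] + 1,          # insertion
                          d[i - 1][j - 1] + cost)   # substitution
    return d[m][n]

def motif_to_regex(motif):
    """'x' matches any residue; '(nyla)' matches one of N, Y, L, A;
    any other letter matches that residue literally."""
    parts = []
    for token in re.findall(r"\([a-z]+\)|[a-zA-Z]", motif):
        if token.startswith("("):
            parts.append("[" + token[1:-1].upper() + "]")
        elif token in ("x", "X"):
            parts.append(".")
        else:
            parts.append(token.upper())
    return "^" + "".join(parts) + "$"

def matches(motif, octamer):
    return re.match(motif_to_regex(motif), octamer) is not None

print(matches("xxx(nyla)xxxx", "SQNYPIVQ"))  # True: fourth residue is Y
print(matches("xxxkxxxx", "SQNYPIVQ"))       # False: fourth residue is not K
```

A pattern classified as uncleaved by the edit-distance nearest neighbour would then be rejected when the composite rule above fires, i.e. when `matches` is true for the xxx(nyla)xxxx motif but false for at least one of the uncleaved-class motifs.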
This rule partially matches a rule shown in (Tozser et al., 2000). To avoid rejecting too many patterns, we use some motifs (Rögnvaldsson and You, 2003) that characterize the uncleaved class. These motifs are:

xxxkxxxx
xx(fkq)xxxxx
xxxxxcxx
xxxxxxkx

By coupling these rules with the EDC we reject a further 20.6% of the patterns previously classified: this allows us to reduce the error rate to 1.1% on the accepted patterns (which are 49.83% of the total).

4.2. Multiclassifier with a modified mean rule

We train an LDC classifier using patterns represented by orthonormal encoding as feature extractor and reduced to a 50-dimensional space by a KL transform. The patterns whose confidence is lower than a prefixed threshold are rejected. The value of the threshold is set experimentally (the same value in ALL the tests). In this step 36.5% of the patterns are classified, with an error rate of 5.75%.

4.3. Cleavage/non-cleavage rule

The third step consists of some rules proposed in (Rögnvaldsson and You, 2003): if a pattern contains one of the motifs shown in Table 3, it is classified as cleaved; if it contains one of the motifs shown in Table 4, it is assigned to the uncleaved class; otherwise it is rejected to the next step. Using these rules 5.64% of the patterns can be classified, with an error rate of 1.96%.

Table 3. Pairs of amino acids and positions that influence the cleaving decision
xxxfxexx, xxxyxexx, xxxlxexx, xxxfxqxx, xxvxxexx, xxxxpexx, xxvfxxxx, xxxfpxxx, xxixxexx, xxxmxexx, xxaxxexx, xxafxxxx, xxxxxexk, FxxxxExx

Table 4. Single amino acids and positions that influence the cleaving decision
xxxkxxxx, xxxxxsxx, xxxxxkxx, xxxpxxxx, xxxxcxxx, Cxxxxxxx, xxyxxxxx

4.4. MLP

The patterns rejected by the previous steps are finally classified with an MLP classifier based on patterns represented by orthonormal encoding as feature extractor and reduced to a 50-dimensional space by a KL transform.
In this last step all the remaining patterns (about 10.39%) are classified without rejection, with an error rate of 39%.

5. Results of the hierarchical structure

Finally, in Fig. 5 we compare the classification error rates obtained by our hierarchical method (HC) and by the systems proposed in (Rögnvaldsson and You, 2003): an MLP coupled with a KL dimensionality reduction (KL + MLP) and a linear SVM classifier (L-SVM). In Table 5 the classification performance of each step of the new hierarchical classifier is summarized: the local error rate is evaluated considering only the patterns effectively classified at each step, while the global error rate is the cumulative error obtained considering all the patterns classified up to that step; analogously, by local classified we mean the percentage of the whole set of patterns classified at each step, while global classified is the cumulative percentage of classified patterns at each step. It is interesting to note that for HC about 10% of the patterns can be considered difficult, since they generate the larger part of the total error rate. The greatest advantage in the use of HC is a very low error rate, with the same rejection rate as a stand-alone L-SVM or a stand-alone MLP (as shown in Table 6). That is, if we classify the 89.95% of patterns with higher confidence, the error rate of L-SVM is 6% while for HC it is 3.1%.

[Fig. 5. Error rates on the HIV-1 dataset for HC, KL+MLP, L-SVM and HC2.]

5.1. Validation tests

In this section we report some experiments to validate our idea of constructing a hierarchical classifier architecture by merging ideas derived from machine learning methodologies with a knowledge base for this particular problem. A first test has been conducted in order to evaluate the independence between a machine learning classifier such as LDC and the set of mining rules based on pattern motifs detailed in Section 4.3. The error independence between these two methods is evaluated by the Q-statistic on the fraction of patterns effectively classified by the rules; the result is 0.9, lower than the values obtained for any combination of classifiers reported in Table 2 and Fig. 3. This means that the two approaches, based on different methodologies, are sufficiently different that they can be coupled in a hierarchical classifier. As a further proof, we tested a simple two-level hierarchical classifier composed only of the cleavage/non-cleavage rule at the first level and KL + LDC at the second level: the error rate of this method was 7.6%, which is higher than HC, but significantly lower than the error rates of all the MCSs built on the orthonormal encoding feature space reported in Fig. 2.
[Table 5. Error rate and number of patterns rejected in each step by HC (columns: Steps, Global error rate (%), Local error rate (%), Global classified (%), Local classified (%)).]

[Table 6. Classification performance, with a rejection rate, using L-SVM or MLP (columns: Method, Error rate (%), Global classified (%)).]

6. Conclusion

The problem addressed in this paper is to recognize, given a sequence of amino acids, HIV-1 protease cleavage sites. We showed, by an empirical comparison of several classification methods, that by coupling a linear transform with a non-linear classifier low error rates can be obtained. Moreover, we introduced, for the first time, a method that combines ideas derived from machine learning methodologies with a knowledge base for this particular problem. Our experiments showed, by means of the Q-statistic, that the combination of a machine learning classifier and rules based on pattern motifs can be interesting. Finally, we illustrated how to obtain a combined method with a very low error rate for the HIV-1 protease specificity problem, starting from an exhaustive study of several classification methodologies. This approach is very important in the field of bioinformatics, where methods taken from machine learning are often applied without a deep study of the problem. In particular, in this work we proposed a hierarchical classifier that combines linear and non-linear classifiers and rules based on pattern motifs. Our hierarchical classifier is composed, at each level, of classifiers that are highly independent of each other, taken in order of the best classification rate on the patterns not rejected. The major advantage of the proposed approach is its low error rate, better than the other stand-alone methods proposed in the literature. The major disadvantage is that our system cannot help to understand the relationships in the data.
References

Beck, Z.Q., Hervio, L., Dawson, P.E., Elder, J.E., Madison, E.L., 2000. Identification of efficiently cleaved substrates for HIV-1 protease using a phage display library and use in inhibitor development. Virology.
Breiman, L., 1996. Bagging predictors. Machine Learn.
Cai, Y.D., Chou, K.C., 1998. Artificial neural network model for predicting HIV protease cleavage sites in protein. Adv. Eng. Software 29.
Cai, Y.D., Yu, H., Chou, K.C., 1998. Using neural network for prediction of HIV protease cleavage sites in proteins. J. Protein Chem. 17.
Chou, J.J., 1993a. A formulation for correlating properties of peptides and its application to predicting human immunodeficiency virus protease-cleavable sites in proteins. Biopolymers 33.
Chou, J.J., 1993b. Predicting cleavability of peptide sequences by HIV protease via correlation-angle approach. J. Protein Chem. 12, 291.
Chou, K.C., 1993c. A vectorized sequence-coupling model for predicting HIV protease cleavage sites in proteins. J. Biol. Chem. 268.
Chou, K.C., 1996. Review: Prediction of HIV protease cleavage sites in proteins. Anal. Biochem. 233.
Chou, K.C., Zhang, C.T., 1992. Diagrammatization of codon usage in 339 HIV proteins and its biological implication. AIDS Res. Human Retroviruses 8.
Chou, K.C., Zhang, C.T., Kezdy, F.J., 1993. A vector approach to predicting HIV protease cleavage sites in proteins. Proteins: Struct., Funct., Genet. 16.
Chou, K.C., Tomasselli, A.L., Reardon, I.M., Heinrikson, R.L., 1996. Predicting HIV protease cleavage sites in proteins by a discriminant function method. Proteins: Struct., Funct., Genet. 24.
Dietterich, T.G., 2000. Ensemble methods in machine learning. In: Kittler, J., Roli, F. (Eds.), Multiple Classifier Systems. First International Workshop, MCS 2000, Lecture Notes in Computer Science. Springer-Verlag, Cagliari, Italy.
Duda, R., Hart, P., Stork, D., 2001. Pattern Classification. Wiley, New York.
Giusti, N., Masulli, F., Sperduti, A., 2002. Theoretical and experimental analysis of a two-stage system for classification. IEEE Trans. PAMI 24 (7).
Houle, G., Aragon, D., Smith, R., Kimura, D., 1998. A multilayered corroboration-based check reader. In: Hull, J., Taylor, S. (Eds.), Document Analysis Systems.
Kittler, J., Hatef, M., Duin, R.P.W., Matas, J., 1998. On combining classifiers. IEEE Trans. Pattern Anal. Machine Intell. 20 (3).
Levenshtein, V.I., 1965. Binary codes capable of correcting deletions, insertions and reversals. Doklady Akademii Nauk SSSR 163 (4).
Masulli, F., Valentini, G., 2000. Comparing decomposition methods for classification. In: Howlett, R.J., Jain, L.C. (Eds.), KES 2000, Fourth International Conference on Knowledge-Based Intelligent Engineering Systems & Allied Technologies. IEEE, Piscataway, NJ.
Mayoraz, E., Moreira, M., 1997. On the decomposition of polychotomies into dichotomies.
In: The XIV International Conference on Machine Learning, Nashville, TN, July.
Narayanan, A., Wu, X., Yang, Z., 2002. Mining viral protease data to extract cleavage knowledge. Bioinformatics 18, S5-S13.
Rögnvaldsson, T., You, L., 2003. Why neural networks should not be used for HIV-1 protease cleavage site prediction. Bioinformatics.
Schölkopf, B., Smola, A., Müller, K.R., 1998. Nonlinear component analysis as a kernel eigenvalue problem. Neural Comput. 10 (5).
Tan, A.C., Gilbert, D., 2003. An empirical comparison of supervised machine learning techniques in bioinformatics. In: Proceedings of the First Asia Pacific Bioinformatics Conference (APBC 2003), 19.
Thompson, T.B., Chou, K.C., Zheng, C., 1995. Neural network prediction of the HIV-1 protease cleavage sites. J. Theoret. Biol. 177.
Tozser, J., Zahuczky, G., Bagossi, P., Louis, J., Copeland, T., Oroszlan, S., Harrison, R., Weber, T., 2000. Comparison of the substrate specificity of the human T-cell leukemia virus and human immunodeficiency virus proteinases. Eur. J. Biochem. 267.
Wu, C.H., Whitson, G., McLarty, J., Ermongkonchai, A., Chang, T.C., 1992. PROCANS: Protein classification artificial neural system. Protein Sci.
Yule, G.U., 1900. On the association of attributes in statistics. Philos. Trans. A 194.