Data Transformation and Model Selection by Experimentation and Meta-Learning

Pavel B. Brazdil
LIACC, FEP - University of Porto
Rua Campo Alegre, Porto, Portugal
pbrazdil@ncc.up.pt

Research in the area of ML/data mining has led to a proliferation of many different algorithms. In the area of classification, Michie et al. (1994), for instance, describe about two dozen such algorithms. Previous work has shown that there does not exist a single best algorithm which is suited for all tasks. It is thus necessary to have a way of selecting the most promising model type. This process is often referred to as model selection. An interesting question arises as to what kind of method, or methodology, we should adopt to do that. Previous approaches can be divided basically into two groups. The first one includes methods based on experimentation, and the second one, methods which employ meta-knowledge. Our aim here is to review both of these approaches in some detail and examine how they could be extended to encompass also the data transformation phase which often precedes learning.

1 Model Selection by Experimentation or Using Meta-Knowledge?

1.1 Model Selection by Experimentation

Model selection by experimentation works, as the name suggests, by evaluating the possible alternatives experimentally on the given problem. In the context of classification one would normally consider a set of possible classifiers and try to obtain reliable error estimates, which is usually done using cross-validation (CV) (Schaffer, 1993). This approach has a number of advantages. First, it is quite general and applicable in many different situations. The method is, as Schaffer (1993) has demonstrated, quite reliable. Given a certain confidence level, the approach does indeed identify the best possible candidate
and errs as expected. The disadvantage of this approach is that it is time consuming, since it is necessary to evaluate all algorithms, some of which can be quite slow. Various proposals have been presented for speeding up this process. One possibility is to pre-select some algorithms using certain criteria and then limit the experimentation to this subset. Some people have suggested that we should preferably use algorithms which behave rather differently from one another. One criterion for deciding this is to examine whether the algorithms lead to uncorrelated errors (Ali and Pazzani, 1996). Another possibility is to try to reduce the number of cycles of cross-validation without affecting the reliability of the result. Moore and Lee (1994) have proposed a technique referred to as racing, which permits the evaluation of those algorithms which appear to be far behind the others to be terminated early. Yet another option is to exploit meta-knowledge, which will be briefly reviewed in the next section.

1.2 Model Selection Using Meta-Knowledge

Meta-knowledge captures our knowledge about which ML algorithms should perform well in which situation. This knowledge can be either theoretical or of experimental origin, or a mixture of both. The rules described by Brodley (1993), for instance, captured the knowledge of experts concerning the applicability of certain classification algorithms. The meta-knowledge of Brazdil et al. (1994) and Gama and Brazdil (1995) was of experimental origin. The objective of the meta-rules generated with the help of learning systems was to capture certain relationships between the measured dataset characteristics (such as the number of attributes, number of cases, skew, kurtosis, etc.) and the error rate. As was demonstrated by the authors, this meta-knowledge can be used to predict the errors of individual algorithms with a certain degree of success.
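As a concrete illustration of model selection by experimentation (Section 1.1), the following sketch estimates each candidate's error rate by k-fold cross-validation and keeps the candidate with the lowest estimate. The candidate learners, and the representation of a learner as a function from training data to a prediction function, are illustrative assumptions, not taken from any of the cited studies.

```python
def cross_val_error(fit, data, k=10):
    """Mean error rate of learner `fit` over k folds of `data`,
    where `data` is a list of (x, y) pairs and `fit(train)` returns
    a prediction function."""
    errors = []
    for i in range(k):
        test = data[i::k]                                   # every k-th example
        train = [ex for j, ex in enumerate(data) if j % k != i]
        predict = fit(train)
        wrong = sum(1 for x, y in test if predict(x) != y)
        errors.append(wrong / len(test))
    return sum(errors) / k

def select_model(candidates, data, k=10):
    """Return (name, estimated error) of the candidate with the
    lowest cross-validation error -- model selection by experimentation."""
    scored = {name: cross_val_error(fit, data, k)
              for name, fit in candidates.items()}
    return min(scored.items(), key=lambda kv: kv[1])
```

The cost concern raised above is visible here: every candidate is trained k times, which is exactly what pre-selection and racing try to avoid.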
One advantage of this approach, when compared to model selection based on experimentation, is that it does not really need extensive experiments. This is because meta-knowledge captures certain regularities of the situations encountered in the past. The disadvantage is that the meta-knowledge acquired need not be totally applicable to a new situation and, in consequence, this method tends to be somewhat less reliable than model selection based on experimentation. As neither solution is ideal, this suggests that we may gain by combining the two approaches. Model selection by meta-knowledge can be used to pre-select a subset of promising algorithms, and then experimentation can be used to identify the best candidate. This method requires that we define the criteria for pre-selecting the set of candidate algorithms. A good criterion will somehow strike a balance between the reliability of the outcome and the amount of experimentation we are prepared to undertake. Pre-selecting fewer algorithms has the advantage that there is less work to be done, but on the other hand, we may get a sub-optimal result.

2 Different Approaches to Model Selection by Meta-Knowledge

There are many different ways in which we can approach the problem. Our aim in this section is to describe certain options we can take when addressing it. Basically we need to decide:

- Whether the meta-knowledge should express knowledge concerning pairs of algorithms or a larger group;
- What the reference point is for the comparisons of error rates;
- Whether the meta-knowledge should be easily updateable;
- Whether the predictions should be qualitative (e.g. Ai is applicable) or quantitative (the error rate of Ai is E%);
- Whether or not we want to condition the predictions on dataset characteristics.

Let us now analyze each of the points above in some detail.

2.1 Which is the Best Reference Point?

The first important decision is whether we should consider pairs of algorithms or generalize the study to N algorithms. The meta-rules of Aha (1992) were oriented towards pairs of algorithms (e.g. IB1, C4). The objective of the meta-rules was to define conditions under which one algorithm (e.g. IB1) achieves better results and hence is preferable to another (e.g. C4). The first major comparative study of a set of 22 classification algorithms was carried out under the StatLog project (Michie et al., 1994). The fact that a number of algorithms were analyzed together provided a reason to establish a kind of common reference point for all comparisons involving error rates. Gama and Brazdil (1995), for instance, considered three kinds of reference points in their study and evaluated them experimentally: the best error rate achieved by one of the algorithms, the mean error rate of all algorithms (or a weighted mean), and the error rate associated with the majority class prediction. We note that the first two reference points depend on the set of algorithms under consideration. That is, if we introduce new algorithms into the set, or if we eliminate some existing ones from consideration, we have to, at least in principle, repeat all steps that depend on this reference point. This of course complicates the task of updating the existing meta-knowledge as soon as new algorithms become available. The third reference point does not suffer from this disadvantage: the error rate associated with the majority class depends entirely on the dataset under consideration.
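The majority-class reference point can be computed from the dataset alone, which is what makes it independent of the algorithm set. A minimal sketch (the function names are ours, not from the cited studies):

```python
from collections import Counter

def majority_class_error(labels):
    """Error rate of the default rule that always predicts the most
    frequent class -- a reference point that depends only on the dataset."""
    most_common_count = Counter(labels).most_common(1)[0][1]
    return 1.0 - most_common_count / len(labels)

def normalized_error(error_rate, labels):
    """An algorithm's error expressed relative to the majority-class
    reference point; values below 1 mean it beats the default rule."""
    return error_rate / majority_class_error(labels)
```

Because the reference point never changes when algorithms are added or removed, errors normalized this way remain comparable as the algorithm pool evolves.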
2.2 Should the Predictions of Meta-Knowledge be Qualitative or Quantitative?

Another important issue is whether we want the prediction concerning error to be qualitative or quantitative. A qualitative prediction would simply divide the algorithms into two groups: those with low error rates, which we could identify as applicable, and the remaining ones, which include both the algorithms with unacceptably high error rates and the algorithms which failed to run. Quantitative predictions are concerned with predicting the actual error rate (or an error which has been normalized in some way). The question concerning the form of the meta-knowledge is closely related to this issue. If we are interested in obtaining only qualitative predictions, then the meta-knowledge can be represented in the form of rules or cases. If we are interested in quantitative predictions, then we need to use some kind of regression model, although qualitative predictions can also be converted to quantitative ones (i.e. by associating a numeric value with each class).
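To make the distinction concrete, here is a minimal sketch; the rule, its thresholds, and the numeric values attached to each class are purely illustrative, not drawn from any of the cited studies.

```python
def qualitative_prediction(n_cases, n_attributes):
    """A rule-form meta-prediction: label an algorithm 'applicable' or
    'non-applicable' from simple dataset characteristics (thresholds
    are hypothetical)."""
    if n_cases > 1000 and n_attributes < 50:
        return "applicable"
    return "non-applicable"

# Converting the qualitative prediction to a quantitative one by
# associating a (hypothetical) numeric error value with each class:
CLASS_TO_ERROR = {"applicable": 0.15, "non-applicable": 0.45}

def quantitative_prediction(n_cases, n_attributes):
    """Numeric error estimate derived from the qualitative class."""
    return CLASS_TO_ERROR[qualitative_prediction(n_cases, n_attributes)]
```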
2.3 Conclusions of a Previous Comparative Analysis

Let us review the results of the experimental analysis carried out by Gama and Brazdil (1995), who collected test results of about 20 algorithms on more than 20 datasets. Each dataset was characterized using 18 different measures (such as the number of attributes, number of cases, skew, kurtosis, etc.). The authors considered and evaluated the three reference points discussed earlier. Besides, the following forms of meta-knowledge were considered:

- rules (generated by C4.5 (Quinlan, 1993));
- instances (a version of IB1 (Aha et al., 1991));
- linear regression equations (generated by a linear discriminant procedure);
- piecewise linear regression equations (linear regression equations with restricted applicability, generated by Quinlan's (1993b) M5.1).

A separate experiment was conducted for each of the 3 reference points and each of the 4 forms of meta-knowledge. There were thus 12 separate experiments in total. In each experiment the predictive power of the meta-knowledge was evaluated using a leave-one-out method. Let us analyze one such experiment for the sake of clarity. Suppose the aim is to evaluate, for instance, the scheme involving the normalization method based on majority class prediction and meta-knowledge in the form of piecewise linear regression equations. In each step of the leave-one-out method, one dataset was set aside for evaluation. The remaining data was normalized with respect to the chosen reference point and supplied to the learning system to construct the model (i.e. piecewise linear regression equations in this case). The prediction was then denormalized and stored with the actual value. After all cycles of the leave-one-out method had terminated, these pairs of values were used to calculate measures characterizing the quality of predictions, such as NMSE. So, essentially, the authors evaluated the possibility of obtaining reliable predictions with the help of meta-level models.
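The leave-one-out evaluation scheme just described can be sketched as follows. The meta-model is left abstract and passed in as a train/predict pair, and normalization is assumed to have been applied to the targets already; this is a simplification of the actual experimental set-up.

```python
def loo_nmse(meta_train, meta_predict, examples):
    """Leave-one-out evaluation of a meta-level model.

    `examples` is a list of (dataset_characteristics, normalized_error)
    pairs, one per dataset.  Each cycle holds one dataset out, fits the
    meta-model on the rest, and records the (prediction, actual) pair.
    Returns NMSE: values below 1 mean the meta-model is better than
    always predicting the mean of the actual values."""
    preds, actuals = [], []
    for i, (x, y) in enumerate(examples):
        rest = examples[:i] + examples[i + 1:]
        model = meta_train(rest)             # learn from all other datasets
        preds.append(meta_predict(model, x)) # predict for the held-out one
        actuals.append(y)
    mean = sum(actuals) / len(actuals)
    sse = sum((p - a) ** 2 for p, a in zip(preds, actuals))
    ref = sum((mean - a) ** 2 for a in actuals)
    return sse / ref
```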
This analysis showed that meta-level models were indeed quite useful, although some set-ups were more successful than others. Instance-based models (more exactly, 3-NN) provided more reliable predictions than some of the other model types (particularly rules and linear regression equations). Piecewise linear regression equations also achieved quite good predictions overall. The best reference point was the one related to majority class prediction. These results have a quite interesting implication. The method that provides the most reliable predictions (IBL + majority class as the reference point) enables us to construct a system which is easily extensible. The system can easily accommodate new algorithms, which can arise at any time. The new results can just be added to the existing instances and used immediately afterwards in decision making. There is no need to carry out extensive meta-level learning, which is an advantage. This strategy was incorporated in the system Calg (Gama, 1996). The only disadvantage is that the meta-knowledge in this form does not really provide a comprehensible model.

3 Using Meta-Knowledge to Guide Experimentation

Let us consider whether meta-knowledge can also be used to guide the process of experimentation itself. But would this guidance really be useful?
The answer is affirmative, if we want to avoid unnecessary work. If pre-selection is done on the basis of the performance of the individual algorithms only, we cannot guarantee that the final subset does not include algorithms which are minor variants of one another. For practical reasons it is not really worth trying them all. What kind of meta-knowledge could be useful here? One interesting and practical possibility is to use statements of the form:

pf(Ai(Di) >> Aj(Di) | Di ∈ DatasetPool)

which enables us to describe the frequency with which algorithm Ai performs significantly better than algorithm Aj on the given datasets. Here, "Ai(Di) >> Aj(Di)" is used as a shorthand for "algorithm Ai performs significantly better (considering a given confidence level, say 95%) than algorithm Aj". We can use this representation to express the fact that the algorithm Ltree, for instance, leads to significantly better results than C4.5 in 10 out of 22 cases by:

pf(Ltree(Di) >> C4.5(Di) | Di ∈ UCIdatasets) = 10/22

The algorithm Ltree is a decision tree type algorithm which can introduce new terms with the help of constructive induction (Gama, 1997). The frequency can be used to estimate the probability that one algorithm performs better than another. It can help to resolve the problem we discussed earlier: if Aj' is a variant of Aj which does not really bring any benefits, then presumably the frequency of observing a significant improvement is zero. To express this we can use:

pf(Aj'(Di) >> Aj(Di) | Di ∈ DatasetPool) = 0

4 Using Meta-Knowledge to Guide Pre-Processing and Model Selection

Previous studies have shown that pre-processing, such as the elimination of irrelevant features or the discretization of numeric values, can often bring about substantial improvements. Langley and Iba (1993), for instance, have demonstrated that the performance of an IBL classifier can be substantially improved by eliminating irrelevant features.
Kohavi and John (1997) have verified that similar improvements can be obtained also with Naive Bayes and ID3 classifiers. Some classification algorithms (e.g. Naive Bayes) achieve better performance if the numeric features are discretized first (Dougherty et al., 1995). A question arises whether the system proposed in the previous section can be extended to cover also the pre-processing stage. Our view is that this can indeed be done. Let us consider, for instance, one result presented in (Dougherty et al., 1995): at the 95% confidence level, Naive Bayes with entropy discretization is better than C4.5 on five datasets and worse on two (there were 16 datasets in total). This statement can be expressed in the form of the following two meta-level facts:

pf(naive-bayes(entropy-discr(Di)) >> C4.5(Di) | Di ∈ UCIdata-DKS) = 5/16
pf(naive-bayes(entropy-discr(Di)) << C4.5(Di) | Di ∈ UCIdata-DKS) = 2/16

The fact that the backward feature selection investigated by Kohavi and John (1997) led to improvements on 4 datasets can be expressed as follows:

pf(c4.5(back-feature-select(Di)) >> C4.5(Di) | Di ∈ UCIdata-KJ) = 4/14
pf(c4.5(back-feature-select(Di)) << C4.5(Di) | Di ∈ UCIdata-KJ) = 0/14
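Such meta-level facts lend themselves to a simple computational treatment. The sketch below shows one way of estimating pf frequencies from per-dataset results and of ranking candidate (pre-processing, algorithm) combinations against a baseline. The fact-store layout, error figures, and significance test are our illustrative assumptions; only the 5/16 and 4/14 frequencies come from the cited results.

```python
def pf_frequency(errors_i, errors_j, significantly_better):
    """Estimate pf(Ai >> Aj | Di in pool) as (wins, total): on how many
    datasets Ai is judged significantly better than Aj.  The test is
    left abstract (a paired test at a chosen confidence level would be
    one concrete choice)."""
    wins = sum(1 for d in errors_i
               if significantly_better(errors_i[d], errors_j[d]))
    return wins, len(errors_i)

# Facts recorded from the text, keyed by (candidate, baseline):
FACTS = {
    ("naive-bayes+entropy-discr", "c4.5"): (5, 16),
    ("c4.5+back-feature-select", "c4.5"): (4, 14),
}

def rank_against(baseline, facts):
    """Candidates ordered by estimated probability of significantly
    beating `baseline` -- a simple way to prioritize experiments."""
    scored = [(cand, wins / total)
              for (cand, base), (wins, total) in facts.items()
              if base == baseline]
    return sorted(scored, key=lambda cv: cv[1], reverse=True)
```

A guided search would try the top-ranked combination first and fall back on experimentation only to break ties among the remaining candidates.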
5 Conclusion

Our proposal is to use IBL meta-knowledge to perform pre-selection of promising algorithms and then use the representation described above to guide the process of conducting experiments and evaluating candidate algorithms. The search for the best combination of pre-processing method and model type can be seen as a kind of heuristic search. The meta-knowledge rules capture the results of previous experience and are used to avoid probable pitfalls in the future. Our plan is to evaluate the effectiveness of this method.

Acknowledgments

Gratitude is expressed for financial support under PRAXIS XXI project ECO and the Plurianual support attributed to LIACC.

References

[1] Aha D. (1992): Generalizing from Case Studies: A Case Study, in ML92, Proceedings of the 9th Machine Learning Conference, D. Sleeman and P. Edwards (eds.), Morgan Kaufmann Publ.
[2] Aha D., Kibler D., Albert M. (1991): Instance-based Learning Algorithms, in Machine Learning, Vol. 6, No. 1, Kluwer Academic Publ.
[3] Ali K. and Pazzani M. (1996): Error Reduction through Learning Multiple Descriptions, in Machine Learning, Vol. 24, Kluwer Academic Publ.
[4] Blum A., Langley P. (1997): Selection of Relevant Features and Examples in Machine Learning, Artificial Intelligence, Vol. 97, Nos. 1-2, Elsevier.
[5] Brodley C. (1993): Addressing the Selective Superiority Problem: Automatic Algorithm/Model Class Selection, in Proceedings of the 10th Machine Learning Conference, Morgan Kaufmann.
[6] Brazdil, P. (1994): Analysis of Results, Chapter 10 in Michie D. et al. (eds.), Machine Learning, Neural and Statistical Classification, Ellis Horwood.
[7] Brazdil, P., Gama J. and Henery B. (1994): Characterizing the Applicability of Classification Algorithms, in ECML-94, Proceedings of the European Conference on Machine Learning, F. Bergadano and L. de Raedt (eds.), Springer-Verlag.
[8] Dougherty J., Kohavi R. and Sahami M.
(1995): Supervised and Unsupervised Discretization of Continuous Features, in Proceedings of the 12th Machine Learning Conference, Morgan Kaufmann Publ.
[9] Gama J. (1997): Probabilistic Linear Tree, in Proceedings of the 14th Machine Learning Conference (ICML-97), Morgan Kaufmann.
[10] Gama J., Brazdil P. (1995): Characterization of Classification Algorithms, in C. Pinto-Ferreira, N. Mamede (eds.), Progress in Artificial Intelligence, LNAI 990, Springer-Verlag.
[11] Kohavi R. and John G. (1997): Wrappers for Feature Subset Selection, Artificial Intelligence, Vol. 97, Nos. 1-2, Elsevier.
[12] Michie D., Spiegelhalter D., Taylor C. (1994): Machine Learning, Neural and Statistical Classification, Ellis Horwood.
[13] Moore A. and Lee M. (1994): Efficient Algorithms for Minimizing Cross Validation Error, in ML-94, Proceedings of the 11th Machine Learning Conference, Morgan Kaufmann Publ.
[14] Quinlan R. (1993): C4.5: Programs for Machine Learning, Morgan Kaufmann Publ.
[15] Quinlan R. (1993b): Combining Instance-based and Model-based Learning, in Proceedings of the 10th Machine Learning Conference, Morgan Kaufmann Publ.
[16] Schaffer C. (1993): Selecting a Classification Method by Cross-Validation, in Machine Learning, Vol. 13, No. 1, Kluwer Academic Publ.
Author's response to reviews Title: A robustness study of parametric and non-parametric tests in Model-Based Multifactor Dimensionality Reduction for epistasis detection Authors: Jestinah M Mahachie John
More informationA STUDY OF AdaBoost WITH NAIVE BAYESIAN CLASSIFIERS: WEAKNESS AND IMPROVEMENT
Computational Intelligence, Volume 19, Number 2, 2003 A STUDY OF AdaBoost WITH NAIVE BAYESIAN CLASSIFIERS: WEAKNESS AND IMPROVEMENT KAI MING TING Gippsland School of Computing and Information Technology,
More informationDecisions and Dependence in Influence Diagrams
JMLR: Workshop and Conference Proceedings vol 52, 462-473, 2016 PGM 2016 Decisions and Dependence in Influence Diagrams Ross D. hachter Department of Management cience and Engineering tanford University
More informationMeasurement and meaningfulness in Decision Modeling
Measurement and meaningfulness in Decision Modeling Brice Mayag University Paris Dauphine LAMSADE FRANCE Chapter 2 Brice Mayag (LAMSADE) Measurement theory and meaningfulness Chapter 2 1 / 47 Outline 1
More informationFast Affinity Propagation Clustering based on Machine Learning
www.ijcsi.org 302 Fast Affinity Propagation Clustering based on Machine Learning Shailendra Kumar Shrivastava 1, Dr. J.L. Rana 2 and Dr. R.C. Jain 3 1 Samrat Ashok Technological Institute Vidisha, Madhya
More informationGenerative Adversarial Networks.
Generative Adversarial Networks www.cs.wisc.edu/~page/cs760/ Goals for the lecture you should understand the following concepts Nash equilibrium Minimax game Generative adversarial network Prisoners Dilemma
More informationMotivation Empirical models Data and methodology Results Discussion. University of York. University of York
Healthcare Cost Regressions: Going Beyond the Mean to Estimate the Full Distribution A. M. Jones 1 J. Lomas 2 N. Rice 1,2 1 Department of Economics and Related Studies University of York 2 Centre for Health
More informationUNIVERSITY of PENNSYLVANIA CIS 520: Machine Learning Midterm, 2016
UNIVERSITY of PENNSYLVANIA CIS 520: Machine Learning Midterm, 2016 Exam policy: This exam allows one one-page, two-sided cheat sheet; No other materials. Time: 80 minutes. Be sure to write your name and
More informationEvent Classification and Relationship Labeling in Affiliation Networks
Event Classification and Relationship Labeling in Affiliation Networks Abstract Many domains are best described as an affiliation network in which there are entities such as actors, events and organizations
More informationPREDICTION OF BREAST CANCER USING STACKING ENSEMBLE APPROACH
PREDICTION OF BREAST CANCER USING STACKING ENSEMBLE APPROACH 1 VALLURI RISHIKA, M.TECH COMPUTER SCENCE AND SYSTEMS ENGINEERING, ANDHRA UNIVERSITY 2 A. MARY SOWJANYA, Assistant Professor COMPUTER SCENCE
More informationA Pitfall in Determining the Optimal Feature Subset Size
A Pitfall in Determining the Optimal Feature Subset Size Juha Reunanen ABB, Web Imaging Systems P.O. Box 94, 00381 Helsinki, Finland Juha.Reunanen@fi.abb.com Abstract. Feature selection researchers often
More informationMachine Learning Statistical Learning. Prof. Matteo Matteucci
Machine Learning Statistical Learning Pro. Matteo Matteucci Statistical Learning Outline o What Is Statistical Learning? Why estimate? How do we estimate? The trade-o between prediction accuracy & model
More informationTowards Learning to Ignore Irrelevant State Variables
Towards Learning to Ignore Irrelevant State Variables Nicholas K. Jong and Peter Stone Department of Computer Sciences University of Texas at Austin Austin, Texas 78712 {nkj,pstone}@cs.utexas.edu Abstract
More informationDealing with Missing Values in Neural Network-Based Diagnostic Systems
Dealing with Missing Values in Neural Network-Based Diagnostic Systems P. K. Sharpe 1 & R. J. Solly The Transputer Centre University of the West of England Coldharbour Lane Frenchay Bristol BS16 1QY Abstract
More informationBayesian Confidence Intervals for Means and Variances of Lognormal and Bivariate Lognormal Distributions
Bayesian Confidence Intervals for Means and Variances of Lognormal and Bivariate Lognormal Distributions J. Harvey a,b, & A.J. van der Merwe b a Centre for Statistical Consultation Department of Statistics
More informationModeling Sentiment with Ridge Regression
Modeling Sentiment with Ridge Regression Luke Segars 2/20/2012 The goal of this project was to generate a linear sentiment model for classifying Amazon book reviews according to their star rank. More generally,
More informationAQCHANALYTICAL TUTORIAL ARTICLE. Classification in Karyometry HISTOPATHOLOGY. Performance Testing and Prediction Error
AND QUANTITATIVE CYTOPATHOLOGY AND AQCHANALYTICAL HISTOPATHOLOGY An Official Periodical of The International Academy of Cytology and the Italian Group of Uropathology Classification in Karyometry Performance
More informationA Biased View of Perceivers. Commentary on `Observer theory, Bayes theory,
A Biased View of Perceivers Commentary on `Observer theory, Bayes theory, and psychophysics,' by B. Bennett, et al. Allan D. Jepson University oftoronto Jacob Feldman Rutgers University March 14, 1995
More informationImproving the Accuracy of Neuro-Symbolic Rules with Case-Based Reasoning
Improving the Accuracy of Neuro-Symbolic Rules with Case-Based Reasoning Jim Prentzas 1, Ioannis Hatzilygeroudis 2 and Othon Michail 2 Abstract. In this paper, we present an improved approach integrating
More informationAnnotation and Retrieval System Using Confabulation Model for ImageCLEF2011 Photo Annotation
Annotation and Retrieval System Using Confabulation Model for ImageCLEF2011 Photo Annotation Ryo Izawa, Naoki Motohashi, and Tomohiro Takagi Department of Computer Science Meiji University 1-1-1 Higashimita,
More informationTHE USE OF MULTIVARIATE ANALYSIS IN DEVELOPMENT THEORY: A CRITIQUE OF THE APPROACH ADOPTED BY ADELMAN AND MORRIS A. C. RAYNER
THE USE OF MULTIVARIATE ANALYSIS IN DEVELOPMENT THEORY: A CRITIQUE OF THE APPROACH ADOPTED BY ADELMAN AND MORRIS A. C. RAYNER Introduction, 639. Factor analysis, 639. Discriminant analysis, 644. INTRODUCTION
More informationFeature Selection for Classification of Music According to Expressed Emotion
Feature Selection for Classification of Music According to Expressed Emotion Pasi Saari Master s Thesis Music, Mind & Technology December 2009 University of Jyväskylä UNIVERSITY OF JYVÄSKYLÄ !"#$%&"'$()"'*+,*%-+)
More informationBIOC2060: Purication of alkaline phosphatase
BIOC2060: Purication of alkaline phosphatase Tom Hargreaves December 2008 Contents 1 Introduction 1 2 Procedure 2 2.1 Lysozyme treatment......................... 2 2.2 Partial purication..........................
More informationFrom: AAAI Technical Report SS Compilation copyright 1995, AAAI ( All rights reserved.
From: AAAI Technical Report SS-95-03. Compilation copyright 1995, AAAI (www.aaai.org). All rights reserved. MAKING MULTIPLE HYPOTHESES EXPLICIT: AN EXPLICIT STRATEGY COMPUTATIONAL 1 MODELS OF SCIENTIFIC
More informationMammogram Analysis: Tumor Classification
Mammogram Analysis: Tumor Classification Literature Survey Report Geethapriya Raghavan geeragh@mail.utexas.edu EE 381K - Multidimensional Digital Signal Processing Spring 2005 Abstract Breast cancer is
More informationTechnical Specifications
Technical Specifications In order to provide summary information across a set of exercises, all tests must employ some form of scoring models. The most familiar of these scoring models is the one typically
More informationSubjective randomness and natural scene statistics
Psychonomic Bulletin & Review 2010, 17 (5), 624-629 doi:10.3758/pbr.17.5.624 Brief Reports Subjective randomness and natural scene statistics Anne S. Hsu University College London, London, England Thomas
More informationReview: Logistic regression, Gaussian naïve Bayes, linear regression, and their connections
Review: Logistic regression, Gaussian naïve Bayes, linear regression, and their connections New: Bias-variance decomposition, biasvariance tradeoff, overfitting, regularization, and feature selection Yi
More informationContributions to Brain MRI Processing and Analysis
Contributions to Brain MRI Processing and Analysis Dissertation presented to the Department of Computer Science and Artificial Intelligence By María Teresa García Sebastián PhD Advisor: Prof. Manuel Graña
More informationPooling Subjective Confidence Intervals
Spring, 1999 1 Administrative Things Pooling Subjective Confidence Intervals Assignment 7 due Friday You should consider only two indices, the S&P and the Nikkei. Sorry for causing the confusion. Reading
More informationDoing Quantitative Research 26E02900, 6 ECTS Lecture 6: Structural Equations Modeling. Olli-Pekka Kauppila Daria Kautto
Doing Quantitative Research 26E02900, 6 ECTS Lecture 6: Structural Equations Modeling Olli-Pekka Kauppila Daria Kautto Session VI, September 20 2017 Learning objectives 1. Get familiar with the basic idea
More informationApproximately as appeared in: Learning and Computational Neuroscience: Foundations. Time-Derivative Models of Pavlovian
Approximately as appeared in: Learning and Computational Neuroscience: Foundations of Adaptive Networks, M. Gabriel and J. Moore, Eds., pp. 497{537. MIT Press, 1990. Chapter 12 Time-Derivative Models of
More informationInternational Journal of Pharma and Bio Sciences A NOVEL SUBSET SELECTION FOR CLASSIFICATION OF DIABETES DATASET BY ITERATIVE METHODS ABSTRACT
Research Article Bioinformatics International Journal of Pharma and Bio Sciences ISSN 0975-6299 A NOVEL SUBSET SELECTION FOR CLASSIFICATION OF DIABETES DATASET BY ITERATIVE METHODS D.UDHAYAKUMARAPANDIAN
More informationConfidence Intervals On Subsets May Be Misleading
Journal of Modern Applied Statistical Methods Volume 3 Issue 2 Article 2 11-1-2004 Confidence Intervals On Subsets May Be Misleading Juliet Popper Shaffer University of California, Berkeley, shaffer@stat.berkeley.edu
More informationHebbian Plasticity for Improving Perceptual Decisions
Hebbian Plasticity for Improving Perceptual Decisions Tsung-Ren Huang Department of Psychology, National Taiwan University trhuang@ntu.edu.tw Abstract Shibata et al. reported that humans could learn to
More informationUnsupervised Measurement of Translation Quality Using Multi-engine, Bi-directional Translation
Unsupervised Measurement of Translation Quality Using Multi-engine, Bi-directional Translation Menno van Zaanen and Simon Zwarts Division of Information and Communication Sciences Department of Computing
More informationCognitive modeling versus game theory: Why cognition matters
Cognitive modeling versus game theory: Why cognition matters Matthew F. Rutledge-Taylor (mrtaylo2@connect.carleton.ca) Institute of Cognitive Science, Carleton University, 1125 Colonel By Drive Ottawa,
More informationChapter 1 Data Types and Data Collection. Brian Habing Department of Statistics University of South Carolina. Outline
STAT 515 Statistical Methods I Chapter 1 Data Types and Data Collection Brian Habing Department of Statistics University of South Carolina Redistribution of these slides without permission is a violation
More informationCSE 258 Lecture 1.5. Web Mining and Recommender Systems. Supervised learning Regression
CSE 258 Lecture 1.5 Web Mining and Recommender Systems Supervised learning Regression What is supervised learning? Supervised learning is the process of trying to infer from labeled data the underlying
More informationSubgroup Discovery for Test Selection: A Novel Approach and Its Application to Breast Cancer Diagnosis
Subgroup Discovery for Test Selection: A Novel Approach and Its Application to Breast Cancer Diagnosis Marianne Mueller 1,Rómer Rosales 2, Harald Steck 2, Sriram Krishnan 2,BharatRao 2, and Stefan Kramer
More informationIdentifying Parkinson s Patients: A Functional Gradient Boosting Approach
Identifying Parkinson s Patients: A Functional Gradient Boosting Approach Devendra Singh Dhami 1, Ameet Soni 2, David Page 3, and Sriraam Natarajan 1 1 Indiana University Bloomington 2 Swarthmore College
More informationScientific Journal of Informatics Vol. 3, No. 2, November p-issn e-issn
Scientific Journal of Informatics Vol. 3, No. 2, November 2016 p-issn 2407-7658 http://journal.unnes.ac.id/nju/index.php/sji e-issn 2460-0040 The Effect of Best First and Spreadsubsample on Selection of
More informationComparative study of Naïve Bayes Classifier and KNN for Tuberculosis
Comparative study of Naïve Bayes Classifier and KNN for Tuberculosis Hardik Maniya Mosin I. Hasan Komal P. Patel ABSTRACT Data mining is applied in medical field since long back to predict disease like
More informationPattern Recognition Based Prediction of the Outcome of Radiotherapy in Cervical Cancer Treatment
Pattern Recognition Based Prediction of the Outcome of Radiotherapy in Cervical Cancer Treatment. By Mohammad Yasar and Vimala Nunavath Supervisor Ole Christoffer Granmo This Master s Thesis is carried
More informationAN INFORMATION VISUALIZATION APPROACH TO CLASSIFICATION AND ASSESSMENT OF DIABETES RISK IN PRIMARY CARE
Proceedings of the 3rd INFORMS Workshop on Data Mining and Health Informatics (DM-HI 2008) J. Li, D. Aleman, R. Sikora, eds. AN INFORMATION VISUALIZATION APPROACH TO CLASSIFICATION AND ASSESSMENT OF DIABETES
More informationArtificial intelligence and judicial systems: The so-called predictive justice. 20 April
Artificial intelligence and judicial systems: The so-called predictive justice 20 April 2018 1 Context The use of so-called artificielle intelligence received renewed interest over the past years.. Stakes
More informationAn Integration of Rule Induction and Exemplar-Based Learning for Graded Concepts
Machine Learning, 21,235-267 (1995) 1995 Kluwer Academic Publishers, Boston. Manufactured in The Netherlands. An Integration of Rule Induction and Exemplar-Based Learning for Graded Concepts JIANPING ZHANG
More information(CORRELATIONAL DESIGN AND COMPARATIVE DESIGN)
UNIT 4 OTHER DESIGNS (CORRELATIONAL DESIGN AND COMPARATIVE DESIGN) Quasi Experimental Design Structure 4.0 Introduction 4.1 Objectives 4.2 Definition of Correlational Research Design 4.3 Types of Correlational
More information