On the Combination of Collaborative and Item-based Filtering
Manolis Vozalis and Konstantinos G. Margaritis
University of Macedonia, Dept. of Applied Informatics, Parallel Distributed Processing Laboratory, Egnatia 156, P.O. 1591, 54006, Thessaloniki, Greece

Abstract. In the past, there has been discussion about an approach that combines the best of the Item-based and the User-based (classic Collaborative Filtering) worlds, by first identifying a reasonably large neighborhood of similar users and then using this subset to derive the item-based recommendation model. We have taken this brief outline and developed a fully functioning hybrid method. In this paper we first describe the execution steps of the algorithm and then proceed with extended experiments. The first part of our experiments checks various parameter combinations in order to understand the algorithm's behavior. The second part compares the hybrid approach mainly to plain Item-based Filtering, as far as utility and performance are concerned.

keywords: personalization, prediction, machine learning

1 Introduction

Recommender Systems were introduced as a computer-based intelligent technique to deal with the problem of information and product overload. They can be utilized to efficiently provide personalized services in most e-business domains, benefiting both the customer and the merchant. The two basic entities which appear in any Recommender System are the user and the item. A user is a person who utilizes the Recommender System, providing opinions about various items and receiving recommendations about new items from the system. The goal of Recommender Systems is to generate suggestions about new items for a particular user. The process is based on the input provided, which most of the time is expressed in the form of that user's ratings, and on the filtering algorithm, which is applied to that input.
Let $m$ be the number of users, $U = \{u_1, u_2, \ldots, u_m\}$, and $n$ the number of items, $I = \{i_1, i_2, \ldots, i_n\}$. To execute the experiments of this work we used the original GroupLens data set. The data set consists of ratings assigned by 943 users to 1682 movies. All ratings follow the 1 (bad) to 5 (excellent) numerical scale, and each user was required to express an opinion on at least 20 movies in order to be considered. The initial data set was used as the basis to generate five distinct splits into training and test data. For each split, 80% of the original set was included in the training data and 20% in the test data. The test sets were disjoint in all cases. In our experiments those sets will be referred to as file=1, file=2, ..., file=5.

Author contact: {mans,kmarg}@uom.gr, URL: {mans,kmarg}

2 The Hybrid Algorithm

Karypis [1] briefly discussed an approach that combines the best of the Item-based and the User-based (classic Collaborative Filtering) worlds, by first identifying a reasonably large neighborhood of similar users and then using this subset to derive the item-based recommendation model. We have taken this brief outline and developed a fully functioning hybrid method, whose execution steps are described next.

The first step is the Information Representation, which does not differ from what we know for both Collaborative and Item-based Filtering. Its purpose is simple: to represent the data in an organized manner. To achieve that, we only require an $m \times n$ user-item matrix, $R$, where element $r_{ij}$ holds the rating that user $u_i$ (row $i$ of $R$) gave to item $i_j$ (column $j$ of $R$), or simply a value of 1 if user $u_i$ purchased item $i_j$, and 0 otherwise.

At this point we reach the User Similarity Computation step, which can be described as the contribution of Collaborative Filtering to this hybrid approach. The aim of this step is to create a neighborhood of the users most similar to a selected active user $u_a$. We can achieve that by applying Pearson Correlation Similarity as follows:

$$
sim_{ai} = corr_{ai} = \frac{\sum_{j=1}^{l} (r_{aj} - \bar{r}_a)(r_{ij} - \bar{r}_i)}{\sqrt{\sum_{j=1}^{l} (r_{aj} - \bar{r}_a)^2 \, \sum_{j=1}^{l} (r_{ij} - \bar{r}_i)^2}}
$$

where $r_{aj}$ and $r_{ij}$ are the ratings that item $i_j$ has received from users $u_a$ and $u_i$, $\bar{r}_a$ and $\bar{r}_i$ are the averages of the ratings of users $u_a$ and $u_i$, and the summations run over the $l$ items that both users have rated.
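As a rough illustration of the User Similarity Computation step, the Pearson formula can be sketched in a few lines of Python. The dictionary-based rating layout and the function name `pearson_similarity` are assumptions made for this sketch, as is the choice to take each user's mean over all of that user's ratings; the paper does not prescribe an implementation.

```python
from math import sqrt


def pearson_similarity(ratings_a, ratings_i):
    """Pearson correlation between two users over their co-rated items.

    ratings_a, ratings_i: dicts mapping item id -> rating (hypothetical layout).
    Returns (similarity, number_of_common_items).
    """
    common = set(ratings_a) & set(ratings_i)
    if not common:
        return 0.0, 0
    # Assumption: user means are taken over all ratings of each user.
    mean_a = sum(ratings_a.values()) / len(ratings_a)
    mean_i = sum(ratings_i.values()) / len(ratings_i)
    num = sum((ratings_a[j] - mean_a) * (ratings_i[j] - mean_i) for j in common)
    den_a = sum((ratings_a[j] - mean_a) ** 2 for j in common)
    den_i = sum((ratings_i[j] - mean_i) ** 2 for j in common)
    if den_a == 0 or den_i == 0:
        # No variance on the common items: correlation is undefined, report 0.
        return 0.0, len(common)
    return num / sqrt(den_a * den_i), len(common)
```

Returning the number of common items alongside the similarity is a design convenience: it is the quantity a threshold on commonly rated items would need.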
Now we can select the $w$ users that have the highest similarity to the active user, $u_a$, thus generating its neighborhood, AN. The size of this user neighborhood is important, since it is the basis for the Item Similarity Computation executed in the following step. A small user neighborhood would be inadequate for any kind of item similarity computation, leading to poor results. On the other hand, a very wide user neighborhood would make the hybrid filtering approach look very much like plain Item-based Filtering. Still, we do not simply pick the best $w$ correlates, as expressed by the highest Pearson Correlation Similarity values. We also require that the selected users and the active user have rated in common a number of items that is higher than a specified threshold, known as the Common Item Threshold. This threshold ensures that a possibly high correlation between the active user and a second random user is based on an adequate number of commonly rated items.

The following step is the Item Similarity Computation. The basic idea in this step is to first isolate the users who have rated two items $i_j$ and $i_k$ and then apply a similarity computation technique to determine the similarity of those items. Various ways to compute that similarity have been proposed. We will be using the Adjusted Cosine Similarity approach, which, as shown in previous experiments [2] [3], performs better than Cosine-based Similarity or Correlation-based Similarity. This is the contribution of Item-based Filtering to the Hybrid approach.

Still, there is a small but crucial difference in the way item similarities are computed in the hybrid approach, compared to plain Item-based Filtering. The similarity between two items $i_j$ and $i_k$ should be calculated only if there exist users who have rated both of those items. While in plain Item-based Filtering those users can be drawn from the set of all $m$ available users, in the hybrid approach we search for them only in the active user neighborhood, AN, generated in the previous step. Thus, the formula for the Adjusted Cosine Similarity of items $i_j$ and $i_k$ in the hybrid approach becomes:

$$
sim_{jk} = adjcorr_{jk} = \frac{\sum_{i=1}^{q} (r_{ij} - \bar{r}_i)(r_{ik} - \bar{r}_i)}{\sqrt{\sum_{i=1}^{q} (r_{ij} - \bar{r}_i)^2 \, \sum_{i=1}^{q} (r_{ik} - \bar{r}_i)^2}}
$$

where $r_{ij}$ and $r_{ik}$ are the ratings that items $i_j$ and $i_k$ have received from user $u_i$, while $\bar{r}_i$ is the average of user $u_i$'s ratings. The summations over $i$ are calculated only for those $q$ users, where $q \le w$, who have expressed their opinions on both items. Those users must be selected strictly from the active user neighborhood, AN.
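The two steps just described, forming AN under the Common Item Threshold and computing Adjusted Cosine Similarity only over users in AN, might look roughly as follows. This is a minimal sketch: the function names, the dictionary layout of the ratings, and the `(similarity, common item count)` tuples are all assumptions introduced here for illustration.

```python
from math import sqrt


def select_user_neighborhood(similarities, w, common_item_threshold):
    """Pick the w most similar users that pass the Common Item Threshold.

    similarities: dict user id -> (similarity to active user, common item count).
    A user qualifies only if the common item count is strictly higher than
    the threshold, following the wording of the text.
    """
    eligible = [(sim, user) for user, (sim, n_common) in similarities.items()
                if n_common > common_item_threshold]
    eligible.sort(key=lambda pair: pair[0], reverse=True)
    return [user for _, user in eligible[:w]]


def adjusted_cosine(ratings, neighborhood, item_j, item_k):
    """Adjusted cosine similarity of two items, restricted to users in AN.

    ratings: dict user id -> {item id -> rating} (hypothetical layout).
    Returns (similarity, q), where q counts the neighborhood users
    who rated both items.
    """
    raters = [u for u in neighborhood
              if item_j in ratings[u] and item_k in ratings[u]]
    num = den_j = den_k = 0.0
    for u in raters:
        mean_u = sum(ratings[u].values()) / len(ratings[u])
        dev_j = ratings[u][item_j] - mean_u
        dev_k = ratings[u][item_k] - mean_u
        num += dev_j * dev_k
        den_j += dev_j ** 2
        den_k += dev_k ** 2
    if den_j == 0 or den_k == 0:
        return 0.0, len(raters)
    return num / sqrt(den_j * den_k), len(raters)
```

Passing the full user list as `neighborhood` reduces `adjusted_cosine` to the plain Item-based computation, which makes the "small but crucial difference" between the two methods explicit.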
Once we have calculated the similarities between all items in the initial user-item matrix, $R$, the next step in the filtering procedure is to isolate the $l$ items, $i_k$, with $k = 1, 2, \ldots, l$, that share the greatest similarity with the active item $i_j$, for which we want a prediction, and form its neighborhood of items, IN. Again, we do not just pick the best $l$ correlates, as expressed by the highest proximity measure values. We also require that the selected items and the active item have been rated by a number of common users that is higher than a specified threshold, known as the Common User Threshold. This threshold ensures that a possibly high correlation between the active item and a second random item is based on an adequate number of commonly rating users.

Now we can proceed with the Prediction Generation, which is the same for both plain Item-based Filtering and the hybrid approach we are discussing. The most common way to achieve it is through a weighted sum. Briefly, this method generates a prediction on item $i_j$ for active user $u_a$ by computing the sum of the ratings given by the active user on items belonging to the neighborhood of $i_j$. Those ratings are weighted by the corresponding similarity, $sim_{jk}$, between item $i_j$ and item $i_k$, with $k = 1, 2, \ldots, l$, taken from neighborhood IN:

$$
pr_{aj} = \frac{\sum_{k=1}^{l} sim_{jk} \cdot r_{ak}}{\sum_{k=1}^{l} |sim_{jk}|}
$$

3 Experimental Results

In this section we evaluate the utility of the hybrid filtering method. We first provide a brief description of the experiments we executed and then present their results. It is necessary to note that while classic Collaborative Filtering and plain Item-based Filtering each have two changing parameters (size of the user/item neighborhood and common item/user threshold, respectively), the hybrid approach has four free parameters, all of which can be altered during experiment execution: user neighborhood size along with common item threshold in the stage of user neighborhood formation, and item neighborhood size along with common user threshold in the stage of item neighborhood formation. Value combinations of those parameters are used extensively in the following experiments.

3.1 Comparing different User Neighborhood sizes for various Common Item and User Thresholds

As experiments in both Collaborative and Item-based Filtering have shown [3], neighborhood sizes and common threshold values have a serious impact on the utility of the corresponding filtering algorithm. With this experiment we wanted to monitor the impact of user neighborhood size and of the common item and common user thresholds on the hybrid filtering approach, while also comparing it against the impact of the same factors in Collaborative and Item-based Filtering. For this reason we kept the item neighborhood size fixed at 60 throughout the experiment. Regarding the selection of user neighborhood sizes, we had to keep in mind that an adequate number of users must exist in the user neighborhood for the successful generation of the item neighborhood in the subsequent stage of the hybrid approach. Thus, we avoided very low user neighborhood sizes, which would perform poorly.
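The four free parameters named above lend themselves to a simple grid enumeration. The sketch below is only illustrative: `HybridParams` and `parameter_grid` are hypothetical names, and the list of user neighborhood sizes is an assumption, since the text does not enumerate every size that was tested. The thresholds 10, 20, 30 and the fixed item neighborhood size of 60 follow the text.

```python
from itertools import product
from typing import NamedTuple


class HybridParams(NamedTuple):
    """One setting of the four free parameters of the hybrid approach."""
    user_neighborhood_size: int
    common_item_threshold: int
    item_neighborhood_size: int
    common_user_threshold: int


def parameter_grid(user_sizes, cits, item_sizes, cuts):
    """Enumerate every combination of the four parameter value lists."""
    return [HybridParams(un, cit, inn, cut)
            for un, cit, inn, cut in product(user_sizes, cits, item_sizes, cuts)]


grid = parameter_grid(
    user_sizes=range(100, 501, 50),  # assumption: illustrative sizes only
    cits=(10, 20, 30),               # common item thresholds from the text
    item_sizes=(60,),                # item neighborhood fixed at 60
    cuts=(10, 20, 30),               # common user thresholds from the text
)
```

Each `HybridParams` tuple would then drive one run of the algorithm on a given data split.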
The results from this experiment for a single data split (file=1) are displayed in the following set of three figures. Figure 1 corresponds to the mean absolute error (MAE) and coverage results for Common Item Threshold = 10. We note that the user neighborhood size (u-n) affects the accuracy of the results: as the number of users in the neighborhood increases, the error decreases, reaching its minimum for u-n=400. Yet, for bigger user neighborhoods (u-n>400) the error stays fixed or increases. Coverage starts from very low values when the user neighborhood is small and, as the user neighborhood keeps increasing, reaches adequate values for common user thresholds cut=10 and cut=20, and satisfactory values for cut=30. We can conclude that when common item threshold=10, for all common user thresholds tested, the best MAE and coverage values are achieved when the user neighborhood includes 400 users.
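To make the two reported measures concrete, the sketch below pairs the weighted-sum prediction of Section 2 with MAE and coverage. Two points are assumptions of this sketch rather than statements from the paper: coverage is taken to be the fraction of test pairs for which a prediction could be generated, and the weighted sum skips neighborhood items the active user has not rated. The function names are hypothetical.

```python
def weighted_sum_prediction(active_user_ratings, item_neighborhood):
    """Weighted-sum prediction for one {user-item} pair.

    active_user_ratings: dict item id -> rating given by the active user.
    item_neighborhood: list of (item id, similarity to the target item).
    Returns None when no neighborhood item was rated by the active user.
    """
    num = den = 0.0
    for item_k, sim_jk in item_neighborhood:
        if item_k in active_user_ratings:  # assumption: unrated items are skipped
            num += sim_jk * active_user_ratings[item_k]
            den += abs(sim_jk)
    return num / den if den > 0 else None


def mae_and_coverage(predictions, actuals):
    """Mean absolute error over generated predictions, plus coverage.

    predictions: list of floats, or None where no prediction was possible.
    actuals: list of true ratings, aligned with predictions.
    """
    errors = [abs(p - a) for p, a in zip(predictions, actuals) if p is not None]
    coverage = len(errors) / len(actuals) if actuals else 0.0
    mae = sum(errors) / len(errors) if errors else None
    return mae, coverage
```

Under this reading, stricter thresholds shrink the neighborhoods, more predictions come back as `None`, and coverage drops even when the error on the remaining pairs is low.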
Fig. 1. Error and Coverage for Common Item Threshold=10

Figures 2 and 3 correspond to the mean absolute error (MAE) and coverage results for Common Item Threshold = {20, 30}. Both error and coverage behave much as in Figure 1 (cit=10). Specifically, as the user neighborhood size increases, MAE gets lower while coverage gets higher. Still, there are a couple of significant differences. The error reaches similarly low values as for cit=10, but the same is not true for coverage. This time, coverage values are lower, ranging from adequate (coverage=75.56% for cut=10, cit=20) to unacceptable (coverage=42% for cut=30, cit=30). Furthermore, the user neighborhood size after which we observe no significant changes in the behavior of MAE and coverage shifts to lower values. Specifically, this user neighborhood size is 300 when cit=20, and even lower, equal to 200, when cit=30.
Fig. 2. Error and Coverage for Common Item Threshold=20

Comparing the overall behavior displayed for the tested Common Item Threshold values (cit={10,20,30}), the following conclusions can be reached:

The best MAE value (lowest error=0.7997) is achieved for common item threshold=30, common user threshold=10 and user neighborhood=350. Still, this error value is accompanied by a coverage of 61.33%. Because of this unacceptable coverage value, we have to reject those parameter settings.

For common item threshold=10 the errors are not as low as in the case of common item threshold=30. Specifically, the lowest MAE value observed is 0.8462 for common user threshold=10 and user neighborhood=400. This error value, however, is accompanied by a satisfactory coverage of 87.78%.
Fig. 3. Error and Coverage for Common Item Threshold=30

For common item threshold=20 the coverage values lie somewhere between the previous two cases. Furthermore, the errors observed do not improve on either of those cases. As a matter of fact, they are very similar to the errors of common item threshold=10, accompanied by lower coverage. Consequently, we are forced to reject this parameter setting.

Taking all those points into account, we conclude that the best performing cases are achieved for common item threshold=10. If we wish to single out one best case, that would be common item threshold=10, common user threshold=10 and user neighborhood=400.
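The selection logic used in this comparison, rejecting settings with unacceptable coverage and then preferring the lowest error, can be written down as a small filter. The sketch uses only the two MAE and coverage pairs quoted in the text. The function name `pick_best` and the numeric coverage floor of 0.80 are assumptions, since the paper does not state a numeric cut-off for "acceptable" coverage.

```python
def pick_best(results, min_coverage):
    """Return the lowest-MAE setting whose coverage meets the floor, or None."""
    acceptable = [r for r in results if r["coverage"] >= min_coverage]
    if not acceptable:
        return None
    return min(acceptable, key=lambda r: r["mae"])


# The two settings whose MAE and coverage are quoted in the text (file=1).
results = [
    {"cit": 30, "cut": 10, "un": 350, "mae": 0.7997, "coverage": 0.6133},
    {"cit": 10, "cut": 10, "un": 400, "mae": 0.8462, "coverage": 0.8778},
]

# Assumption: 0.80 is a hypothetical coverage floor chosen for illustration.
best = pick_best(results, min_coverage=0.80)
```

With this floor the cit=30 setting is filtered out despite its lower error, which mirrors the reasoning above.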
3.2 Comparing the Hybrid Approach to Item-based Filtering

The hybrid approach we have been discussing can be considered an extension of the plain Item-based Filtering algorithm. We call it an extension since plain Item-based Filtering is enhanced by adding an intelligent way to create the pool of users from which those contributing to the construction of the item neighborhood are selected. Based on this assumption, an experiment that compares the performance of Item-based Filtering against that of the hybrid approach shows how those related algorithms contrast.

For the Hybrid part of this experiment, we kept the item neighborhood fixed at 60 items. The user neighborhood size was set to 400, which, as the previous experiment showed, gave the best performance across the combinations of the remaining parameters. The changing parameters were the Common Item Threshold and the Common User Threshold. For the Item-based part of the experiment, the user neighborhood cannot change (it is fixed and includes all the users), and consequently there is no Common Item Threshold. The item neighborhood was fixed at 60 items, in accordance with the Hybrid approach. The single varying parameter was the Common User Threshold. The results from this experiment for two data splits (file=1,4) can be found in Table 1 and Figure 4.

Table 1. Comparing Accuracy in Hybrid and Item-based Filtering
[Columns: hyb cit=10, hyb cit=20, hyb cit=30, item-based; one row per Common User Threshold (cut). The cell values are not recoverable from this transcription.]

Starting from the errors table, we can see that as the Common User Threshold gets bigger, MAE also increases in all cases. Item-based Filtering seems to have an average performance for cut=10, the best performance for cut=20, and the worst performance for cut=30. As for the three cases of the hybrid approach that we tested, once again we get the best overall accuracy when common item threshold=30.
On the other hand, when common item threshold=10 we get the worst accuracy for cut=10, and average performance for cut=20 and cut=30. In conclusion, Item-based Filtering and the Hybrid approach with common item threshold=10 do not have the lowest error values among the cases we tested, but their accuracy values are directly comparable.

Moving to coverage, this is an area where Item-based Filtering is clearly superior to the Hybrid approach, with values above 85% in all cases. The Hybrid approach for common item threshold=30, which performed better than Item-based Filtering as far as error was concerned, gives coverage results that rank as the worst among the three hybrid cases tested and are unacceptably low when compared to the coverage values of Item-based Filtering. The best coverage observed for the hybrid method was achieved when common item threshold=10. Those coverage results lead us to the conclusion that the performance of the hybrid approach can be contrasted to that of Item-based Filtering, for both accuracy and coverage, only when common item threshold=10.

Fig. 4. Hybrid Approach vs. Item-based Filtering: Coverage

Our final comparison concerned Item-based Filtering and hybrid filtering with common item threshold=10, for which the accuracy and coverage experiments generated comparable results. We wanted to take into consideration a metric that evaluates performance, namely execution time. Our experiments showed that Item-based Filtering clearly has the best performance, its execution time being around 7 minutes for all common user thresholds we tested, cut={10,20,30}. For the same common user thresholds, the hybrid approach with cit=10 was disappointing, with execution times in the range of 25 minutes. One factor helps explain this result. In Item-based Filtering there is a single item neighborhood for each item, $i_k$. It is calculated once and used in prediction generation for all {user-item} pairs that involve item $i_k$ as their active item. In the Hybrid approach, on the other hand, there is a different item neighborhood for each {user-item} pair including item $i_k$, since each active user, $u_a$, has a different user neighborhood, AN, which affects how the neighborhood of item $i_k$ is generated.

4 Conclusions

In this work we have introduced a Hybrid filtering algorithm that combines ideas from the areas of Collaborative and Item-based Filtering, but can be better viewed as an extension of Item-based Filtering. The basic characteristic of this approach is that it localizes the item-based techniques to a wide user neighborhood that is created by applying collaborative filtering. Our first set of experiments aimed to display the behavior of the hybrid filtering approach for various combinations of its changing parameters and to reach some optimal settings. The next step was to take an implementation of the hybrid approach that uses those optimal parameter settings and contrast it with the Item-based Filtering algorithm. This comparison gave disappointing results concerning the performance of the hybrid approach. Accuracy results were close, but Item-based Filtering was clearly superior in coverage. The findings were similar when comparing the time requirements of the two approaches.

Based on these results, we can conclude that Item-based Filtering, which utilizes a global selection of users in its prediction generation, works better than the Hybrid approach, which presumably utilizes a localized, more personalized selection of users in its prediction generation. Two factors that may contribute to this unexpected difference in performance are: (a) the lack of adequate data, which would otherwise assist in the generation of better user neighborhoods, and (b) the effectiveness of the selected similarity metrics, which are probably not able to locate true user relations with sparse data. As a result, in our future experiments we intend to utilize a number of statistical techniques (e.g. dimensionality reduction methods) or machine learning techniques (e.g. artificial neural networks) and explore how they assist the recommendation process.

References

1. Karypis, G.: Evaluation of item-based top-n recommendation algorithms. In: CIKM (2001)
2. Sarwar, B.M., Karypis, G., Konstan, J.A., Riedl, J.T.: Item-based collaborative filtering recommendation algorithms. In: 10th International World Wide Web Conference (WWW10), Hong Kong (2001)
3. Vozalis, E.G., Margaritis, K.G.: Recommender systems: An experimental comparison of two filtering algorithms. In: Proceedings of the 9th Panhellenic Conference in Informatics - PCI (2003)
4. Breese, J.S., Heckerman, D., Kadie, C.: Empirical analysis of predictive algorithms for collaborative filtering. In: Fourteenth Conference on Uncertainty in Artificial Intelligence, Madison, WI (1998)
5. Sarwar, B.M., Karypis, G., Konstan, J.A., Riedl, J.T.: Analysis of recommendation algorithms for e-commerce. In: Electronic Commerce (2000)
6. Herlocker, J., Konstan, J.A., Borchers, A., Riedl, J.T.: An algorithmic framework for performing collaborative filtering. In: The 1999 Conference on Research and Development in Information Retrieval (1999)
7. Vozalis, E., Margaritis, K.G.: Analysis of recommender systems algorithms. In: Proceedings of the Sixth Hellenic-European Conference on Computer Mathematics and its Applications - HERCMA (2003)
More informationIncorporating Game-theoretic Rough Sets in Web-based Medical Decision Support Systems. Abstract
Incorporating Game-theoretic Rough Sets in Web-based Medical Decision Support Systems JingTao Yao and Nouman Azam Department of Computer Science University of Regina, Canada [jtyao,azam200n]@cs.uregina.ca
More informationMeasurement and meaningfulness in Decision Modeling
Measurement and meaningfulness in Decision Modeling Brice Mayag University Paris Dauphine LAMSADE FRANCE Chapter 2 Brice Mayag (LAMSADE) Measurement theory and meaningfulness Chapter 2 1 / 47 Outline 1
More informationRating prediction on Amazon Fine Foods Reviews
Rating prediction on Amazon Fine Foods Reviews Chen Zheng University of California,San Diego chz022@ucsd.edu Ye Zhang University of California,San Diego yez033@ucsd.edu Yikun Huang University of California,San
More informationKeywords Missing values, Medoids, Partitioning Around Medoids, Auto Associative Neural Network classifier, Pima Indian Diabetes dataset.
Volume 7, Issue 3, March 2017 ISSN: 2277 128X International Journal of Advanced Research in Computer Science and Software Engineering Research Paper Available online at: www.ijarcsse.com Medoid Based Approach
More informationActive Sites model for the B-Matrix Approach
Active Sites model for the B-Matrix Approach Krishna Chaithanya Lingashetty Abstract : This paper continues on the work of the B-Matrix approach in hebbian learning proposed by Dr. Kak. It reports the
More informationON SHELF AVAILABILITY ALIGNMENT PROJECT 2011 ASIA PAC SURVEY RESULTS
ON SHELF AVAILABILITY ALIGNMENT PROJECT 2011 ASIA PAC SURVEY RESULTS Introduction The ECR Asia Pacific OSA working group conducted an online survey between July and September 2011 aimed at gaining insights
More informationAffect in Virtual Agents (and Robots) Professor Beste Filiz Yuksel University of San Francisco CS 686/486
Affect in Virtual Agents (and Robots) Professor Beste Filiz Yuksel University of San Francisco CS 686/486 Software / Virtual Agents and Robots Affective Agents Computer emotions are of primary interest
More informationImportance of Cell Stock Concentration for Accurate Target Cell Recovery
TECHNICAL NOTE Importance of Cell Stock Concentration for Accurate Target Cell Recovery INTRODUCTION 10x Genomics Single Cell Protocols require suspensions of viable, single cells (Single Cell Protocols
More informationArtificial intelligence (and Searle s objection) COS 116: 4/29/2008 Sanjeev Arora
Artificial intelligence (and Searle s objection) COS 116: 4/29/2008 Sanjeev Arora Artificial Intelligence Definition of AI (Merriam-Webster): The capability of a machine to imitate intelligent human behavior
More informationDirect memory access using two cues: Finding the intersection of sets in a connectionist model
Direct memory access using two cues: Finding the intersection of sets in a connectionist model Janet Wiles, Michael S. Humphreys, John D. Bain and Simon Dennis Departments of Psychology and Computer Science
More informationUnit 2 Boundary Value Testing, Equivalence Class Testing, Decision Table-Based Testing. ST 8 th Sem, A Div Prof. Mouna M.
Unit 2 Boundary Value Testing, Equivalence Class Testing, Decision Table-Based Testing ST 8 th Sem, A Div 2017-18 Prof. Mouna M. Naravani 19-02-2018 Dept. of CSE, BLDEACET, Vijarapur 2 Boundary Value Testing
More informationLOW-RANK DECOMPOSITION AND LOGISTIC REGRESSION METHODS FOR LINK PREDICTION IN TERRORIST NETWORKS CSE 293 MS PROJECT REPORT, FALL 2010.
LOW-RANK DECOMPOSITION AND LOGISTIC REGRESSION METHODS FOR LINK PREDICTION IN TERRORIST NETWORKS CSE 293 MS PROJECT REPORT, FALL 2010 Eric Doi ekdoi@cs.ucsd.edu University of California, San Diego ABSTRACT
More informationPNN -RBF & Training Algorithm Based Brain Tumor Classifiction and Detection
PNN -RBF & Training Algorithm Based Brain Tumor Classifiction and Detection Abstract - Probabilistic Neural Network (PNN) also termed to be a learning machine is preliminarily used with an extension of
More informationAn empirical evaluation of text classification and feature selection methods
ORIGINAL RESEARCH An empirical evaluation of text classification and feature selection methods Muazzam Ahmed Siddiqui Department of Information Systems, Faculty of Computing and Information Technology,
More informationUsing Perceptual Grouping for Object Group Selection
Using Perceptual Grouping for Object Group Selection Hoda Dehmeshki Department of Computer Science and Engineering, York University, 4700 Keele Street Toronto, Ontario, M3J 1P3 Canada hoda@cs.yorku.ca
More informationPositive and Unlabeled Relational Classification through Label Frequency Estimation
Positive and Unlabeled Relational Classification through Label Frequency Estimation Jessa Bekker and Jesse Davis Computer Science Department, KU Leuven, Belgium firstname.lastname@cs.kuleuven.be Abstract.
More informationInternational Journal of Computer Science Trends and Technology (IJCST) Volume 5 Issue 1, Jan Feb 2017
RESEARCH ARTICLE Classification of Cancer Dataset in Data Mining Algorithms Using R Tool P.Dhivyapriya [1], Dr.S.Sivakumar [2] Research Scholar [1], Assistant professor [2] Department of Computer Science
More informationMeasurement. 500 Research Methods Mike Kroelinger
Measurement 500 Research Methods Mike Kroelinger Levels of Measurement Nominal Lowest level -- used to classify variables into two or more categories. Cases placed in the same category must be equivalent.
More informationWELCOME! Lecture 11 Thommy Perlinger
Quantitative Methods II WELCOME! Lecture 11 Thommy Perlinger Regression based on violated assumptions If any of the assumptions are violated, potential inaccuracies may be present in the estimated regression
More informationSome Thoughts on the Principle of Revealed Preference 1
Some Thoughts on the Principle of Revealed Preference 1 Ariel Rubinstein School of Economics, Tel Aviv University and Department of Economics, New York University and Yuval Salant Graduate School of Business,
More informationVIEW AS Fit Page! PRESS PgDn to advance slides!
VIEW AS Fit Page! PRESS PgDn to advance slides! UNDERSTAND REALIZE CHANGE WHY??? CHANGE THE PROCESSES OF YOUR BUSINESS CONNECTING the DOTS Customer Focus (W s) Customer Focused Metrics Customer Focused
More informationCOLLABORATIVE filtering (CF) is an important and popular
IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, VOL. 26, NO. X, XXXXXXX 2014 1 Typicality-Based Collaborative Filtering Recommendation Yi Cai, Ho-fung Leung, Qing Li, Senior Member, IEEE, Huaqing
More informationCocktail Preference Prediction
Cocktail Preference Prediction Linus Meyer-Teruel, 1 Michael Parrott 1 1 Department of Computer Science, Stanford University, In this paper we approach the problem of rating prediction on data from a number
More informationHow Does Analysis of Competing Hypotheses (ACH) Improve Intelligence Analysis?
How Does Analysis of Competing Hypotheses (ACH) Improve Intelligence Analysis? Richards J. Heuer, Jr. Version 1.2, October 16, 2005 This document is from a collection of works by Richards J. Heuer, Jr.
More informationCHAPTER ONE CORRELATION
CHAPTER ONE CORRELATION 1.0 Introduction The first chapter focuses on the nature of statistical data of correlation. The aim of the series of exercises is to ensure the students are able to use SPSS to
More informationMeasuring the Effects of Interruptions on Task Performance in the User Interface
Measuring the Effects of Interruptions on Task Performance in the User Interface Brian P. Bailey, Joseph A. Konstan, and John V. Carlis University of Minnesota Department of Computer Science and Engineering
More informationA Social Curiosity Inspired Recommendation Model to Improve Precision, Coverage and Diversity
A Social Curiosity Inspired Recommendation Model to Improve Precision, Coverage and Diversity Qiong Wu, Siyuan Liu, Chunyan Miao, Yuan Liu and Cyril Leung Joint NTU-UBC Research Centre of Excellence in
More informationThe Effect of Guessing on Item Reliability
The Effect of Guessing on Item Reliability under Answer-Until-Correct Scoring Michael Kane National League for Nursing, Inc. James Moloney State University of New York at Brockport The answer-until-correct
More informationSUPPLEMENTARY INFORMATION. Table 1 Patient characteristics Preoperative. language testing
Categorical Speech Representation in the Human Superior Temporal Gyrus Edward F. Chang, Jochem W. Rieger, Keith D. Johnson, Mitchel S. Berger, Nicholas M. Barbaro, Robert T. Knight SUPPLEMENTARY INFORMATION
More informationncounter Data Analysis Guidelines for Copy Number Variation (CNV) Molecules That Count NanoString Technologies, Inc.
ncounter Data Analysis Guidelines for Copy Number Variation (CNV) NanoString Technologies, Inc. 530 Fairview Ave N Suite 2000 Seattle, Washington 98109 www.nanostring.com Tel: 206.378.6266 888.358.6266
More informationType II Fuzzy Possibilistic C-Mean Clustering
IFSA-EUSFLAT Type II Fuzzy Possibilistic C-Mean Clustering M.H. Fazel Zarandi, M. Zarinbal, I.B. Turksen, Department of Industrial Engineering, Amirkabir University of Technology, P.O. Box -, Tehran, Iran
More informationSupplementary materials for: Executive control processes underlying multi- item working memory
Supplementary materials for: Executive control processes underlying multi- item working memory Antonio H. Lara & Jonathan D. Wallis Supplementary Figure 1 Supplementary Figure 1. Behavioral measures of
More informationGrounding Ontologies in the External World
Grounding Ontologies in the External World Antonio CHELLA University of Palermo and ICAR-CNR, Palermo antonio.chella@unipa.it Abstract. The paper discusses a case study of grounding an ontology in the
More informationFoundations of AI. 10. Knowledge Representation: Modeling with Logic. Concepts, Actions, Time, & All the Rest
Foundations of AI 10. Knowledge Representation: Modeling with Logic Concepts, Actions, Time, & All the Rest Wolfram Burgard, Andreas Karwath, Bernhard Nebel, and Martin Riedmiller 10/1 Contents Knowledge
More informationData Mining in Bioinformatics Day 4: Text Mining
Data Mining in Bioinformatics Day 4: Text Mining Karsten Borgwardt February 25 to March 10 Bioinformatics Group MPIs Tübingen Karsten Borgwardt: Data Mining in Bioinformatics, Page 1 What is text mining?
More informationCSE 255 Assignment 9
CSE 255 Assignment 9 Alexander Asplund, William Fedus September 25, 2015 1 Introduction In this paper we train a logistic regression function for two forms of link prediction among a set of 244 suspected
More informationEvaluation of Gene Selection Using Support Vector Machine Recursive Feature Elimination
Evaluation of Gene Selection Using Support Vector Machine Recursive Feature Elimination Committee: Advisor: Dr. Rosemary Renaut Dr. Adrienne C. Scheck Dr. Kenneth Hoober Dr. Bradford Kirkman-Liff John
More informationIntroduction to Preference and Decision Making
Introduction to Preference and Decision Making Psychology 466: Judgment & Decision Making Instructor: John Miyamoto 10/31/2017: Lecture 06-1 Note: This Powerpoint presentation may contain macros that I
More informationA Framework for Medical Diagnosis using Hybrid Reasoning
A Framework for Medical using Hybrid Reasoning Deepti Anne John, Rose Rani John Abstract The traditional method of reasoning was rule-based reasoning (). It does not use past experiences to reason. Case-based
More informationLung Tumour Detection by Applying Watershed Method
International Journal of Computational Intelligence Research ISSN 0973-1873 Volume 13, Number 5 (2017), pp. 955-964 Research India Publications http://www.ripublication.com Lung Tumour Detection by Applying
More informationSOME NOTES ON STATISTICAL INTERPRETATION
1 SOME NOTES ON STATISTICAL INTERPRETATION Below I provide some basic notes on statistical interpretation. These are intended to serve as a resource for the Soci 380 data analysis. The information provided
More informationCHAPTER - 6 STATISTICAL ANALYSIS. This chapter discusses inferential statistics, which use sample data to
CHAPTER - 6 STATISTICAL ANALYSIS 6.1 Introduction This chapter discusses inferential statistics, which use sample data to make decisions or inferences about population. Populations are group of interest
More informationSAMPLING ERROI~ IN THE INTEGRATED sysrem FOR SURVEY ANALYSIS (ISSA)
SAMPLING ERROI~ IN THE INTEGRATED sysrem FOR SURVEY ANALYSIS (ISSA) Guillermo Rojas, Alfredo Aliaga, Macro International 8850 Stanford Boulevard Suite 4000, Columbia, MD 21045 I-INTRODUCTION. This paper
More informationGenetically Generated Neural Networks I: Representational Effects
Boston University OpenBU Cognitive & Neural Systems http://open.bu.edu CAS/CNS Technical Reports 1992-02 Genetically Generated Neural Networks I: Representational Effects Marti, Leonardo Boston University
More informationFMEA AND RPN NUMBERS. Failure Mode Severity Occurrence Detection RPN A B
FMEA AND RPN NUMBERS An important part of risk is to remember that risk is a vector: one aspect of risk is the severity of the effect of the event and the other aspect is the probability or frequency of
More informationMultilayer Perceptron Neural Network Classification of Malignant Breast. Mass
Multilayer Perceptron Neural Network Classification of Malignant Breast Mass Joshua Henry 12/15/2017 henry7@wisc.edu Introduction Breast cancer is a very widespread problem; as such, it is likely that
More informationOne-Way Independent ANOVA
One-Way Independent ANOVA Analysis of Variance (ANOVA) is a common and robust statistical test that you can use to compare the mean scores collected from different conditions or groups in an experiment.
More informationUsing Statistical Intervals to Assess System Performance Best Practice
Using Statistical Intervals to Assess System Performance Best Practice Authored by: Francisco Ortiz, PhD STAT COE Lenny Truett, PhD STAT COE 17 April 2015 The goal of the STAT T&E COE is to assist in developing
More informationBelow, we included the point-to-point response to the comments of both reviewers.
To the Editor and Reviewers: We would like to thank the editor and reviewers for careful reading, and constructive suggestions for our manuscript. According to comments from both reviewers, we have comprehensively
More informationGenetic Algorithm based Feature Extraction for ECG Signal Classification using Neural Network
Genetic Algorithm based Feature Extraction for ECG Signal Classification using Neural Network 1 R. Sathya, 2 K. Akilandeswari 1,2 Research Scholar 1 Department of Computer Science 1 Govt. Arts College,
More informationOECD QSAR Toolbox v.4.2. An example illustrating RAAF scenario 6 and related assessment elements
OECD QSAR Toolbox v.4.2 An example illustrating RAAF scenario 6 and related assessment elements Outlook Background Objectives Specific Aims Read Across Assessment Framework (RAAF) The exercise Workflow
More informationDiscovering Meaningful Cut-points to Predict High HbA1c Variation
Proceedings of the 7th INFORMS Workshop on Data Mining and Health Informatics (DM-HI 202) H. Yang, D. Zeng, O. E. Kundakcioglu, eds. Discovering Meaningful Cut-points to Predict High HbAc Variation Si-Chi
More informationEvolutionary Programming
Evolutionary Programming Searching Problem Spaces William Power April 24, 2016 1 Evolutionary Programming Can we solve problems by mi:micing the evolutionary process? Evolutionary programming is a methodology
More informationCHAPTER 3 RESEARCH METHODOLOGY
CHAPTER 3 RESEARCH METHODOLOGY 3.1 Introduction 3.1 Methodology 3.1.1 Research Design 3.1. Research Framework Design 3.1.3 Research Instrument 3.1.4 Validity of Questionnaire 3.1.5 Statistical Measurement
More informationA hybrid approach for identification of root causes and reliability improvement of a die bonding process a case study
Reliability Engineering and System Safety 64 (1999) 43 48 A hybrid approach for identification of root causes and reliability improvement of a die bonding process a case study Han-Xiong Li a, *, Ming J.
More informationIncorporation of Imaging-Based Functional Assessment Procedures into the DICOM Standard Draft version 0.1 7/27/2011
Incorporation of Imaging-Based Functional Assessment Procedures into the DICOM Standard Draft version 0.1 7/27/2011 I. Purpose Drawing from the profile development of the QIBA-fMRI Technical Committee,
More information