Semantic Pattern Transformation
1 Semantic Pattern Transformation IKNOW 2013 Peter Teufl, Herbert Leitold, Reinhard Posch
2 Our Background
Topics:
- Mobile device security
- Cloud security
- Security consulting for public institutions (Austria)
- IT security research
- IT security lectures
A-SIT e-government
3 Why does he talk about Knowledge Discovery?
How does IT security relate to knowledge discovery?
- E-government, e-participation: document analysis, Twitter etc.
- Intrusion detection systems (network traffic analysis)
- Malware detection (network traffic, mobile phones)
- Mobile application analysis (metadata, market descriptions)
- Mobile application security (hot topic, BYOD, etc.)
4 What to expect?
- Motivation for the Semantic Pattern Transformation
- Basic concepts, techniques
- How does it work?
- Evaluation?
- Applications, results, current topics!
5 Environment
- Arbitrary features
- No a priori knowledge
- Heterogeneous domains
[Figure labels: Supervised learning, Anomaly detection, Text analysis, Android market descriptions, Semantic search, Clustering, terms, flexible histograms, new numbers, Visualization, deployment domains, Extracting knowledge]
6 Process...
- KDD process after Fayyad et al.: different processing steps, from defining the goals to extracting the desired knowledge
- Machine learning algorithms are often used within KDD
- However, the complete machine learning process is quite similar to KDD
KDT: Domain-specific data set, Knowledge discovery goals, Target data set, Preprocessing, Data extraction, Data mining method, Data mining algorithm, Data mining
ML-KDT: Domain-specific data set, Machine learning goals, Instance extraction, Feature selection and construction, Instance selection, Machine learning algorithm, Preprocessing, Algorithm application
[Figure labels: Knowledge extraction, Knowledge processing, Interpretation]
7 Machine Learning: ADAPTATION COMPLEXITY?
Process steps (rated by their dependence on domain data and goals: high, medium, low): Domain-specific data set, Machine learning goals, Instance extraction, Feature selection and construction, Instance selection, Algorithm selection, Preprocessing, Algorithm application, Interpretation
- Assuming an arbitrary data set (e-participation, Android Market applications)
- Further assuming a knowledge discovery goal, e.g. unsupervised clustering
- Then: we need to adapt the process steps listed above
- And: we need to adapt this setup when the data changes, even when the knowledge discovery goals remain the same!
- Android Market applications vs. text documents vs. network traffic vs. malware detection?
8 TOWARDS A SEMANTIC REPRESENTATION
Finding a new representation...
- The new representation is called Semantic Patterns
- Key properties:
  - Still a vector representation (compatible with the old representation)
  - Not the feature values themselves, but their semantic relations are represented
  - All values have the same meaning and feature type (activation)
- Transformation from raw data into Semantic Patterns: Semantic Pattern Transformation
9 SEMANTIC PATTERN TRANSFORMATION
The Semantic Pattern Transformation is arranged in five layers:
- Layer 1 - Feature extraction
- Layer 2 - Associative network - Node generation
- Layer 3 - Associative network - Link generation
- Layer 4 - Spreading activation (SA)
- Layer 5 - Analysis (machine learning, semantic search etc.)
Layer 5 analysis covers: semantic relations, semantic development over time, unsupervised clustering, feature value relevance, anomaly detection, pattern similarity, supervised learning
[Figure: a data set of relation instances (FROM, TO, TIME) is mapped via features (SF, DF) to network nodes (SV, MV) and finally to patterns P1 to P4]
10 SPT: Layer 1 - Feature extraction
Country  Exports               Unemployment rate  Fertility rate
C1       coffee                20%                5
C2       cacao                 20%                5
C3       coffee, cacao         20%                5
C4       machinery             5%                 2
C5       chemicals             5%                 2
C6       chemicals, machinery  5%                 2
C7       chemicals, cacao      20%                missing data
C8       missing data          20%                5
C9       coffee, cacao         missing data       missing data
- Extract the features and their values, and determine the feature type (categorical, distance-based)
- Categorical: Exports
- Distance-based: Unemployment rate, Fertility rate
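The feature extraction step can be sketched in a few lines of Python. This is only an illustration under assumptions: the dictionary layout, the names `DATA`, `FEATURE_TYPES` and `extract_feature_values`, and the use of `None` for missing data are hypothetical and do not come from the slides.

```python
# Hypothetical in-memory form of the example table; None marks missing data.
DATA = {
    "C1": {"exports": ["coffee"], "unemployment": 20.0, "fertility": 5.0},
    "C3": {"exports": ["coffee", "cacao"], "unemployment": 20.0, "fertility": 5.0},
    "C7": {"exports": ["chemicals", "cacao"], "unemployment": 20.0, "fertility": None},
    "C9": {"exports": ["coffee", "cacao"], "unemployment": None, "fertility": None},
}

# Feature types as stated on the slide: Exports is categorical,
# the two rates are distance-based.
FEATURE_TYPES = {
    "exports": "categorical",
    "unemployment": "distance",
    "fertility": "distance",
}


def extract_feature_values(instance):
    """Return the (feature, value) pairs of one instance, skipping missing data."""
    pairs = []
    for feature, value in instance.items():
        if value is None:
            continue  # missing data contributes no feature value
        if FEATURE_TYPES[feature] == "categorical":
            # a categorical feature may hold several values (e.g. two exports)
            pairs.extend((feature, v) for v in value)
        else:
            pairs.append((feature, value))
    return pairs
```

An instance with missing data simply yields fewer pairs, so no imputation is needed at this stage.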
11 SPT: Layer 2 - Node generation
Data set: the nine-country example table (C1 to C9) from slide 10.
- Distance-based feature values: map value ranges to single nodes
- Categorical feature values: one node for each value
Associative network nodes: 5%, 20%, 5, 2, coffee, cacao, machinery, chemicals
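A minimal sketch of node generation follows. The slide does not say how the value ranges are determined, so the fixed `bin_edges` parameter used here is an assumption made purely for illustration, as are the function name `node_for` and the node label format.

```python
from bisect import bisect_right


def node_for(feature, value, feature_type, bin_edges=None):
    """Map one feature value to the label of its network node.

    Categorical values get one node per value. Distance-based values are
    mapped to the node of the range they fall into; the ranges are given
    by bin_edges (an assumed, fixed binning, not the slides' method).
    """
    if feature_type == "categorical":
        return f"{feature}={value}"
    idx = bisect_right(bin_edges, value)
    lo = bin_edges[idx - 1] if idx > 0 else float("-inf")
    hi = bin_edges[idx] if idx < len(bin_edges) else float("inf")
    return f"{feature} in [{lo}, {hi})"
```

With a single edge at 10.0, unemployment rates of 19% and 20% share one node, while 5% lands in another.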
12 SPT: Layer 3 - Link generation
Data set: the nine-country example table (C1 to C9) from slide 10.
[Figure: associative network over the nodes coffee, cacao, machinery, chemicals, 20%, 5%, 5 and 2. Feature values that occur together in an instance, e.g. (coffee, 20%, 5) or (chemicals, cacao, 20%), are connected by links, and each link carries a weight.]
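Link generation can be illustrated with a simple co-occurrence count. The slide only states that links carry a weight. Using the number of instances in which two nodes occur together as that weight is an assumption of this sketch, and the names `INSTANCES`, `build_links` and `weight` are hypothetical.

```python
from collections import Counter
from itertools import combinations

# Nodes activated by each instance of the example table (missing data omitted).
INSTANCES = {
    "C1": ["coffee", "20%", "5"],
    "C2": ["cacao", "20%", "5"],
    "C3": ["coffee", "cacao", "20%", "5"],
    "C4": ["machinery", "5%", "2"],
    "C5": ["chemicals", "5%", "2"],
    "C6": ["chemicals", "machinery", "5%", "2"],
    "C7": ["chemicals", "cacao", "20%"],
    "C8": ["20%", "5"],
    "C9": ["coffee", "cacao"],
}


def build_links(instances):
    """Count, for every node pair, the instances in which both nodes occur."""
    counts = Counter()
    for nodes in instances.values():
        for a, b in combinations(sorted(set(nodes)), 2):
            counts[(a, b)] += 1
    return counts


def weight(links, a, b):
    """Look up the undirected link weight between two nodes (0 if unlinked)."""
    return links[tuple(sorted((a, b)))]
```

In this sketch, coffee and cacao are linked through C3 and C9, whereas coffee and machinery never co-occur and therefore have weight 0.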
13 SPT: Layer 4 - Spreading activation
Creating a Semantic Pattern, in this case for coffee and cacao:
- Set the activation value of the two nodes to 1.0
- Spread this activation value to neighboring nodes via the weighted links
[Figure: associative network with the nodes coffee and cacao activated at 1.0]
14 SPT: Layer 4 - Spreading activation
- Typically, one would create Semantic Patterns for all instances within the data set, e.g. a pattern for C1 by activating coffee, 20% and 5
- However, we can also create patterns for feature values, e.g. coffee
Data set: the nine-country example table (C1 to C9) from slide 10.
15 SPT: Layer 4 - Spreading activation
- After SA, each node in the network has an activation value
- By representing the nodes and their activation values as a vector, we gain a Semantic Pattern
[Figure: associative network with an activation value per node (e.g. 0.08, 1.15) and the resulting vector over coffee, cacao, machinery, chemicals, 20%, 5%]
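The following sketch shows one way spreading activation over such a network could look. It is not the authors' implementation: the row-normalized co-occurrence weights, the `decay` factor, the fixed number of `steps` and all function names are assumptions chosen for illustration.

```python
from collections import defaultdict
from itertools import combinations

# Nodes activated by each instance of the example table (missing data omitted).
INSTANCES = {
    "C1": ["coffee", "20%", "5"],
    "C2": ["cacao", "20%", "5"],
    "C3": ["coffee", "cacao", "20%", "5"],
    "C4": ["machinery", "5%", "2"],
    "C5": ["chemicals", "5%", "2"],
    "C6": ["chemicals", "machinery", "5%", "2"],
    "C7": ["chemicals", "cacao", "20%"],
    "C8": ["20%", "5"],
    "C9": ["coffee", "cacao"],
}
NODE_ORDER = ["coffee", "cacao", "machinery", "chemicals", "20%", "5%", "5", "2"]


def build_network(instances):
    """Build link weights as row-normalized co-occurrence counts (assumed scheme)."""
    counts = defaultdict(lambda: defaultdict(float))
    for nodes in instances.values():
        for a, b in combinations(set(nodes), 2):
            counts[a][b] += 1.0
            counts[b][a] += 1.0
    return {
        node: {nb: c / sum(nbrs.values()) for nb, c in nbrs.items()}
        for node, nbrs in counts.items()
    }


def spread_activation(network, seeds, decay=0.5, steps=2):
    """Activate the seed nodes with 1.0 and spread along the weighted links."""
    activation = {node: 0.0 for node in network}
    for seed in seeds:
        activation[seed] = 1.0
    frontier = {seed: 1.0 for seed in seeds}
    for _ in range(steps):
        incoming = defaultdict(float)
        for node, act in frontier.items():
            for nb, w in network[node].items():
                incoming[nb] += act * w * decay  # activation weakens per hop
        for node, act in incoming.items():
            activation[node] += act
        frontier = incoming
    return activation


def semantic_pattern(network, seeds, **kwargs):
    """Return the activation values as a vector in a fixed node order."""
    activation = spread_activation(network, seeds, **kwargs)
    return [activation[node] for node in NODE_ORDER]
```

Calling `semantic_pattern(build_network(INSTANCES), ["coffee", "cacao"])` yields an eight-dimensional vector in which nodes close to coffee and cacao (such as 20%) receive more activation than distant ones (such as 5%).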
16 [Figures: unsorted Semantic Patterns for "Export: Cacao", "Export: Coffee" and "Fertility: 2", each plotted over the nodes coffee, cacao, machinery, chemicals, 20%, 5%, 5, 2]
- Each feature value is represented by a semantic fingerprint
- Allows for an instant analysis of semantic relations to other feature values
- Operations: sort, mean, variance, adding, subtracting
Data set: the nine-country example table (C1 to C9) from slide 10.
17 SPT: Layer 5 - Analysis
- Calculating the distance between two patterns (Euclidean distance, cosine similarity)
- For unsupervised clustering, semantic-aware search algorithms
Keyword search for coffee:
C1  coffee                20%           5
C3  coffee, cacao         20%           5
C9  coffee, cacao         missing data  missing data
Semantic-aware search for coffee:
C9  coffee, cacao         missing data  missing data
C1  coffee                20%           5
C3  coffee, cacao         20%           5
C2  cacao                 20%           5
C8  missing data          20%           5
C7  chemicals, cacao      20%           missing data
C5  chemicals             5%            2
C6  chemicals, machinery  5%            2
C4  machinery             5%            2
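The two distance measures named above, and a ranking built on them, can be sketched as follows. The function names and the idea of ranking by cosine similarity to a query pattern are illustrative assumptions, and the slides do not state which measure produced the result list shown.

```python
import math


def euclidean(p, q):
    """Euclidean distance between two patterns of equal length."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def cosine_similarity(p, q):
    """Cosine similarity between two patterns (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(p, q))
    norm = math.sqrt(sum(a * a for a in p)) * math.sqrt(sum(b * b for b in q))
    return dot / norm if norm else 0.0


def rank_by_similarity(query, patterns):
    """Return the pattern names ordered from most to least similar to the query."""
    return sorted(
        patterns,
        key=lambda name: cosine_similarity(query, patterns[name]),
        reverse=True,
    )
```

Because every instance has a pattern over the same nodes, such a ranking can also return instances that do not contain the query term itself, which is the difference from the keyword search.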
18 SPT: Layer 5 - Analysis
- Machine learning: apply any machine learning algorithm to the Semantic Patterns
  - Unsupervised clustering
  - Supervised learning
  - Semantic-aware search
- Knowledge discovery: semantic relations, anomaly detection, feature relevance, arbitrary simple operations (mean, variance etc.)
- Visualization
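Since all pattern components share the same meaning (activation), simple operations such as mean and variance can be applied component-wise. The sketch below illustrates this with hypothetical function names. It uses the population variance, which is an arbitrary choice for the example.

```python
from statistics import fmean, pvariance


def mean_pattern(patterns):
    """Component-wise mean of a list of equally long patterns."""
    return [fmean(column) for column in zip(*patterns)]


def variance_pattern(patterns):
    """Component-wise population variance of a list of equally long patterns."""
    return [pvariance(column) for column in zip(*patterns)]
```

A mean pattern over a group of instances summarizes the group, and components with high variance point to feature values on which the group members disagree.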
19 Machine Learning: Benefits?
Process steps (rated by their dependence on domain data and goals: high, medium, low): Domain-specific data set, Machine learning goals, Instance extraction, Feature selection and construction, Instance selection, Algorithm selection, Preprocessing, Algorithm application, Interpretation
- Application in heterogeneous domains regardless of the nature of the data
- Except for Layer 1, we do not need any manual setup for the layers
- Regardless of the analyzed data, the Semantic Patterns always use the same model
- This means: regardless of the deployed knowledge discovery method, we can always use the same methods for knowledge extraction!
20 Comparing the two models
- Semantic Patterns: [Figures: unsorted mean patterns for C1, C2, C3 and for C4, C5, C6 over the nodes coffee, cacao, machinery, chemicals, 20%, 5%, 5, 2. Table: one pattern per country C1 to C9 over the same nodes; values not preserved]
- Value-centric feature vectors: [Table: one row per country C1 to C9 with the columns Coffee, Cacao, Machinery, Chemicals, Unemployment rate, Fertility rate; most values not preserved]
- Same model: an Android application, a country or a document... the activation values always have the same meaning
21 Evaluation
- 26 data sets from the UCI machine learning repository
- Supervised: SVM
- Unsupervised: EM and k-means
- Application to raw data and to Semantic Patterns
- SP parameters: D=0.5, Comb=E, Norm=L, MDL=1.5, σ=0.2
[Table columns: Data set, Label, Inst, DF, SF, Classes, SVM (N), SVM (NN), SVM (P), KM (N), KM (NN), KM (P), EM (NN), EM (P), with totals for the categorical, mixed and numerical groups; numeric entries not preserved]
Data sets (label):
- Categorical: Breast Cancer (BC), Dermatology (DE), KR vs. KP (KR), Lymph (LY), Mushroom (MU), Soybean (SO), Splice (SP), Vote (VO), Zoo (ZO)
- Mixed: Anneal (AN), Colic (CO), Credit-A (CA), Credit-G (CG), Heart-C (HC), Heart-H (HH), Hepatitis (HE)
- Numerical: Breast-w (BW), Diabetes (DI), Glass (GL), Heart-Statlog (HS), Ionosphere (IO), Iris (IR), Segment (SE), Sonar (SO), Vehicle (VE), Vowel (VO)
22 DOES IT WORK?
Applications described in several publications:
- E-participation analysis (Egyptian revolution, Fukushima, Mitmachen): text documents
- Intrusion detection: event correlation
- RDF data analysis (semantic web)
- WiFi privacy (analyzing captured s)
- Android Market application analysis
23 Current Project
Android application security
- Container applications for BYOD (require encryption, secure communication, key derivation functions, root checks etc.)
- Manual analysis is cumbersome
Semantic Patterns
- Extract Dalvik VM code and features (opcodes, methods, local variables etc.)
- Apply the Semantic Patterns technique
- Clustering, supervised learning, anomaly detection etc.
24 Current Project
25 Current Project
- Also works directly on the phone...
- Detecting SMS catchers/sniffers
- More fine-grained detection: asymmetric cryptography, symmetric cryptography
26 Outlook
- Publish the Java API... basically a converter from arbitrary feature vectors to Semantic Patterns (e.g. in/out in ARFF format)
- Deep learning...
27 Thx!
28 [Table: K-Means and EM results for the categorical data sets (BC, DE, KR, LY, MU, SO, SP, VO, ZO, Total). Raw data (N, NN; one entry "Not available") versus Semantic Patterns with D = 0.0, 0.1, 0.3, 0.5, 0.7 for the combinations Comb=E Norm=L, Comb=S Norm=L, Comb=E Norm=S and Comb=S Norm=S; numeric entries not preserved]
29 [Table: K-Means and EM results for the mixed data sets (AN, CO, CA, CG, HC, HH, HE, Total). Raw data (N, NN; one entry "Not available") versus Semantic Patterns with σ = 0.0, 0.2, 0.4, 0.6, 0.8 for D=0.0 MDL=2.0, for D=0.5 with MDL=1.0, 1.5, 2.0, 3.0, and for D=0.7 (MDL values not preserved); numeric entries not preserved]
30 [Table: K-Means and EM results for the numerical data sets (BW, DI, GL, HS, IO, IR, SE, SO, VE, VO, Total). Raw data (N, NN; one entry "Not available") versus Semantic Patterns with σ = 0.0, 0.2, 0.4, 0.6, 0.8 for D=0.0 MDL=1.5, for D=0.5 with MDL=1.0, 1.5, 2.0, 3.0, and for D=0.7 (MDL values not preserved); numeric entries not preserved]
31 [Table: effect of missing data (0%, 10%, 50%, 90%) on raw data and on Semantic Patterns, using Euclidean (Euc) and cosine (Cos) distance, for the categorical, mixed and numerical data sets with group totals; numeric entries not preserved]
32 [Table: per data set comparison of EUC (N), EUC (NN) and COS (NN) for RAW, Baseline and Semantic Patterns, for the categorical, mixed and numerical data sets with group totals; numeric entries not preserved]
More informationLung Cancer Diagnosis from CT Images Using Fuzzy Inference System
Lung Cancer Diagnosis from CT Images Using Fuzzy Inference System T.Manikandan 1, Dr. N. Bharathi 2 1 Associate Professor, Rajalakshmi Engineering College, Chennai-602 105 2 Professor, Velammal Engineering
More informationAbstracts. 2. Sittichai Sukreep, King Mongkut's University of Technology Thonburi (KMUTT) Time: 10:30-11:00
The 2nd Joint Seminar on Computational Intelligence by IEEE Computational Intelligence Society Thailand Chapter Thursday 23 rd February 2017 School of Information Technology, King Mongkut's University
More informationArtificial Neural Networks (Ref: Negnevitsky, M. Artificial Intelligence, Chapter 6)
Artificial Neural Networks (Ref: Negnevitsky, M. Artificial Intelligence, Chapter 6) BPNN in Practice Week 3 Lecture Notes page 1 of 1 The Hopfield Network In this network, it was designed on analogy of
More informationA Review on Arrhythmia Detection Using ECG Signal
A Review on Arrhythmia Detection Using ECG Signal Simranjeet Kaur 1, Navneet Kaur Panag 2 Student 1,Assistant Professor 2 Dept. of Electrical Engineering, Baba Banda Singh Bahadur Engineering College,Fatehgarh
More informationDescribe what is meant by a placebo Contrast the double-blind procedure with the single-blind procedure Review the structure for organizing a memo
Please note the page numbers listed for the Lind book may vary by a page or two depending on which version of the textbook you have. Readings: Lind 1 11 (with emphasis on chapters 10, 11) Please note chapter
More informationCHAPTER 3 PROBLEM STATEMENT AND RESEARCH METHODOLOGY
64 CHAPTER 3 PROBLEM STATEMENT AND RESEARCH METHODOLOGY 3.1 PROBLEM DEFINITION Clinical data mining (CDM) is a rising field of research that aims at the utilization of data mining techniques to extract
More informationPortable Retina Eye Scanning Device
Portable Retina Eye Scanning Device Engineering Science Department Sonoma State University Students: Cristin Faria & Diego A. Espinosa Faculty Advisor: Dr. Sudhir Shrestha Industry Advisor: Ben Valvodinos
More informationStill important ideas
Readings: OpenStax - Chapters 1 11 + 13 & Appendix D & E (online) Plous - Chapters 2, 3, and 4 Chapter 2: Cognitive Dissonance, Chapter 3: Memory and Hindsight Bias, Chapter 4: Context Dependence Still
More informationKnowledge Discovery and Data Mining I
Ludwig-Maximilians-Universität München Lehrstuhl für Datenbanksysteme und Data Mining Prof. Dr. Thomas Seidl Knowledge Discovery and Data Mining I Winter Semester 2018/19 Introduction What is an outlier?
More informationIntroduction to MVPA. Alexandra Woolgar 16/03/10
Introduction to MVPA Alexandra Woolgar 16/03/10 MVP...what? Multi-Voxel Pattern Analysis (MultiVariate Pattern Analysis) * Overview Why bother? Different approaches Basics of designing experiments and
More informationKINOMAP FITNESS. Version Android KINOMAP FITNESS
Version 1.1 - Android With Kinomap Fitness, ride more than 100,000 km of geolocated videos all over the world uploaded by users themselves. Try to follow their rhythm under the same conditions at the time
More informationA scored AUC Metric for Classifier Evaluation and Selection
A scored AUC Metric for Classifier Evaluation and Selection Shaomin Wu SHAOMIN.WU@READING.AC.UK School of Construction Management and Engineering, The University of Reading, Reading RG6 6AW, UK Peter Flach
More informationAutomatic Context-Aware Image Captioning
Technical Disclosure Commons Defensive Publications Series May 23, 2017 Automatic Context-Aware Image Captioning Sandro Feuz Sebastian Millius Follow this and additional works at: http://www.tdcommons.org/dpubs_series
More informationCancer Cells Detection using OTSU Threshold Algorithm
Cancer Cells Detection using OTSU Threshold Algorithm Nalluri Sunny 1 Velagapudi Ramakrishna Siddhartha Engineering College Mithinti Srikanth 2 Velagapudi Ramakrishna Siddhartha Engineering College Kodali
More informationPredicting Sleep Using Consumer Wearable Sensing Devices
Predicting Sleep Using Consumer Wearable Sensing Devices Miguel A. Garcia Department of Computer Science Stanford University Palo Alto, California miguel16@stanford.edu 1 Introduction In contrast to the
More informationSURVEY ON OUTLIER DETECTION TECHNIQUES USING CATEGORICAL DATA
SURVEY ON OUTLIER DETECTION TECHNIQUES USING CATEGORICAL DATA K.T.Divya 1, N.Senthil Kumaran 2 1Research Scholar, Department of Computer Science, Vellalar college for Women, Erode, Tamilnadu, India 2Assistant
More informationKeywords Artificial Neural Networks (ANN), Echocardiogram, BPNN, RBFNN, Classification, survival Analysis.
Design of Classifier Using Artificial Neural Network for Patients Survival Analysis J. D. Dhande 1, Dr. S.M. Gulhane 2 Assistant Professor, BDCE, Sevagram 1, Professor, J.D.I.E.T, Yavatmal 2 Abstract The
More informationPredicting Breast Cancer Survival Using Treatment and Patient Factors
Predicting Breast Cancer Survival Using Treatment and Patient Factors William Chen wchen808@stanford.edu Henry Wang hwang9@stanford.edu 1. Introduction Breast cancer is the leading type of cancer in women
More informationData mining in forensics: a text mining approach to proling criminals
UDC 004.912 Vatter C., Mozgovoy M. Data mining in forensics: a text mining approach to proling criminals 1. Introduction. Data mining is the process of processing and analyzing large amounts of data and
More informationMRI Image Processing Operations for Brain Tumor Detection
MRI Image Processing Operations for Brain Tumor Detection Prof. M.M. Bulhe 1, Shubhashini Pathak 2, Karan Parekh 3, Abhishek Jha 4 1Assistant Professor, Dept. of Electronics and Telecommunications Engineering,
More informationClassification of Thyroid Disease Using Data Mining Techniques
Volume 119 No. 12 2018, 13881-13890 ISSN: 1314-3395 (on-line version) url: http://www.ijpam.eu ijpam.eu Classification of Thyroid Disease Using Data Mining Techniques Sumathi A, Nithya G and Meganathan
More informationInternational Journal of Computer Science Trends and Technology (IJCST) Volume 5 Issue 1, Jan Feb 2017
RESEARCH ARTICLE Classification of Cancer Dataset in Data Mining Algorithms Using R Tool P.Dhivyapriya [1], Dr.S.Sivakumar [2] Research Scholar [1], Assistant professor [2] Department of Computer Science
More informationAnnotation and Retrieval System Using Confabulation Model for ImageCLEF2011 Photo Annotation
Annotation and Retrieval System Using Confabulation Model for ImageCLEF2011 Photo Annotation Ryo Izawa, Naoki Motohashi, and Tomohiro Takagi Department of Computer Science Meiji University 1-1-1 Higashimita,
More informationNature Neuroscience: doi: /nn Supplementary Figure 1. Behavioral training.
Supplementary Figure 1 Behavioral training. a, Mazes used for behavioral training. Asterisks indicate reward location. Only some example mazes are shown (for example, right choice and not left choice maze
More informationKNN CLASSIFIER AND NAÏVE BAYSE CLASSIFIER FOR CRIME PREDICTION IN SAN FRANCISCO CONTEXT
KNN CLASSIFIER AND NAÏVE BAYSE CLASSIFIER FOR CRIME PREDICTION IN SAN FRANCISCO CONTEXT Noora Abdulrahman and Wala Abedalkhader Department of Engineering Systems and Management, Masdar Institute of Science
More informationthe best of care Managing diabetes with the FORA Diamond MINI and tools from Discovery Health Medical Scheme
the best of care 2014 Managing diabetes with the FORA Diamond MINI and tools from Discovery Health Medical Scheme contents What this document is about This document gives an overview of the FORA Diamond
More information