CHAPTER 3 PROBLEM STATEMENT AND RESEARCH METHODOLOGY

3.1 PROBLEM DEFINITION

Clinical data mining (CDM) is an emerging field of research that applies data mining techniques to extract patterns from biological and clinical data. Oncogenomics is one of the key research areas in CDM and aims at applying high-throughput technologies to characterize genes associated with cancer. The major hurdles associated with this task are:

(i) the need to transform oncogenic data for processing by computational methods;
(ii) the ability to extract interpretable and valid patterns from the processed, voluminous data;
(iii) the need to improve prediction of cancer from data of diverse nature using the extracted patterns.

Since analysis of oncogenic data is a labor- and resource-intensive task, computational methods were investigated for faster and more efficient analysis of oncogenic data, piloting a potential area of cancer research named Computational Oncogenomics. This research focused on utilizing data mining methods to analyze and process the stated oncogenic data for two purposes: detection of oncogene patterns, oncoprotein patterns, oncoprotein mutations and oncoprotein interactions from the biological data, and detection of cancer-cause/symptom patterns from the clinical data comprising patient records, laboratory investigations and image-based features.

Based on an exploration of the existing research issues (Kusiak et al, 2001; Kriegel et al, 2007; Hu, 2011; Huang et al, 2011) in the sphere of data mining methodologies and their utilization in the field of pattern discovery from oncogenic data, the following research objectives were formulated.

3.2 RESEARCH OBJECTIVES

The aim of this research was to investigate and explore the utilization of data mining methodologies in detecting oncogene patterns (gene expression data), oncoprotein properties (lung cancer tumor data), oncomutation patterns (P53 mutation data) and oncoprotein interactions (HIV1-human PPI data), and in identifying cancer-cause/cancer-symptom patterns (clinical data), by formulating novel feature selection and predictive techniques. In view of this, the following objectives were articulated:

Microarray-based gene expression data was characterized by a large number of oncogene attributes but comparatively very few instances. This data required categorization of the contributory oncogenes according to the specific cancer sub-types, and it contained more than two target classes. Hence, the gene expression data required a suitable feature selection algorithm to extract a minimal and optimal set of oncogenes that improved cancer prediction accuracy across the diverse gene expression cancer sub-types.

Detection of oncoprotein properties for drug design had so far relied on extensive data cleaning strategies and had reported low prediction accuracy. This required a computationally efficient feature selection technique that could eliminate the need for the data cleaning procedures while generating high cancer prediction accuracy with an optimal set of protein properties for drug design. Lung cancer was stated to be the leading cause of cancer death around the world, and hence lung cancer tumor data was targeted for drug therapy in this research.

Detection of oncoprotein properties/mutations from P53 transcriptional activity data was a serious hurdle due to the heavy imbalance of records and the massive data size. This required the formulation of an embedded supervised machine learning technique to detect a minimal and optimal set of oncogenic structural properties from P53 mutations for prediction of P53 transcriptional activity.

Detection of oncomutation patterns by predicting P53 transcriptional activity from amino-acid substitutions raised a new research issue: categorizing the oncomutations as hot-spot cancer, strong rescue and weak rescue mutants. This led to the formulation of a genetic mutant marker extraction methodology that could categorize the P53 mutants from amino-acid substitutions. The methodology needed to be computationally efficient and accurate in processing massive data.

Discovery of novel oncoprotein interaction patterns was a challenging task due to the absence of established non-interacting protein pairs. Methodologies devised thus far failed to identify many novel interactions. Since HIV-1 proteins were treated as oncoproteins in this research, the objective was to predict novel HIV1-human protein-protein interactions (PPIs) through an association rule mining methodology that could capture the maximum number of novel HIV1-human PPIs with the least loss of information.

Research on oncogenic clinical data for detection of cancer-cause/symptom patterns posed several issues in terms of the diverse nature of the data (continuous/discrete), multi-class categorization and the biased nature of the class distribution. This led to an investigation on the utilization of the proposed prediction method to predict cancer-cause/symptom patterns from oncogenic clinical data, in order to identify the most efficient and diagnostically accurate method. The need for a clinical data classifier to predict the nature of oncogenic data was also identified.

The formulation of the research objectives led to the design of a suitable research methodology to explore and investigate the research issues and achieve the stated objectives.

3.3 RESEARCH METHODOLOGY

The basic research methodology is stated to involve identifying the problem, followed by formulating appropriate techniques to handle the defined problem. Analysis of the collected data led this research to focus on two categories of oncogenic data:

(i) biological data for detection of oncogene and oncoprotein patterns, oncomutation patterns and oncoprotein interactions;
(ii) clinical data for detection of cancer-cause/cancer-symptom patterns.

Following this, data mining techniques were explored to analyze and process the stated oncogenic data. The research methodology involved the following phases:

(i) data collection and pre-processing;
(ii) data analysis and processing;
(iii) performance evaluation of the proposed methodologies.

Data collection and pre-processing is a prerequisite for analyzing oncogenic data. Data collection required identification of authenticated data from publicly available repositories, namely the UCI repository, the NCBI database, the KEGG database and AI labs. This was followed by analysis of the collected data, which involved an investigation of the existing feature selection and classification algorithms to evaluate their performance in retrieving the relevant oncogenic features for cancer prediction from diverse types of oncogenic data. The subsequent process involved the development of improved and computationally efficient feature selection and classification techniques to yield enhanced cancer prediction accuracy with a minimal and optimal set of oncogenic features. In addition, association rule mining techniques were to be investigated to mine valid and potentially useful association rules. The extracted rules may be utilized to identify novel and previously unknown oncogenic patterns with adequate justification. The developed data mining framework could be utilized to design a clinical data classifier to predict the nature of unknown oncogenic data.

3.4 SUMMARY

This chapter outlined the definition of the problem, on the basis of which the research objectives were articulated to handle the challenges. It also concisely presented the diverse oncogenic biological and clinical data used for detection of novel oncogene, oncoprotein, oncomutation, oncoprotein interaction and cancer-cause/symptom patterns. A research methodology to address the identified objectives was also given in this chapter. The next chapter details the formulated pattern discovery framework for detecting the most significant oncopatterns in biological and clinical data and their contribution to oncogenic pattern discovery.
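
The three phases listed in Section 3.3 can be illustrated with a minimal sketch. It is not the feature selection or classification technique formulated in this research. The synthetic three-class data (many attributes, few instances), the Fisher-style filter score, the nearest-centroid classifier and every function name below are assumptions chosen only to make the workflow concrete. Features are scored on the training split alone, so that the held-out samples do not influence which features are selected.

```python
"""Illustrative workflow: data preparation, feature selection plus
classification, and evaluation. All data and names are hypothetical."""
import random
from statistics import mean, pvariance


def make_toy_data(n_per_class=20, n_features=50, n_informative=3, seed=0):
    """Synthetic stand-in for expression data with three target classes.

    Only the first n_informative features carry class information.
    """
    rng = random.Random(seed)
    rows, labels = [], []
    for label in range(3):
        for _ in range(n_per_class):
            row = [rng.gauss(0.0, 1.0) for _ in range(n_features)]
            for j in range(n_informative):
                row[j] += 4.0 * label  # class-dependent shift
            rows.append(row)
            labels.append(label)
    return rows, labels


def stratified_split(labels, test_fraction=0.25, seed=1):
    """Hold out the same fraction of every class (guards against class bias)."""
    rng = random.Random(seed)
    train, test = [], []
    for c in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == c]
        rng.shuffle(idx)
        cut = max(1, int(len(idx) * test_fraction))
        test.extend(idx[:cut])
        train.extend(idx[cut:])
    return train, test


def fisher_scores(rows, labels):
    """Filter score per feature: variance of class means / mean within-class variance."""
    classes = sorted(set(labels))
    scores = []
    for j in range(len(rows[0])):
        groups = [[r[j] for r, y in zip(rows, labels) if y == c] for c in classes]
        between = pvariance([mean(g) for g in groups])
        within = mean(pvariance(g) for g in groups)
        scores.append(between / (within + 1e-12))
    return scores


def fit_centroids(rows, labels, features):
    """Nearest-centroid model restricted to the selected features."""
    centroids = {}
    for c in sorted(set(labels)):
        members = [r for r, y in zip(rows, labels) if y == c]
        centroids[c] = [mean(r[j] for r in members) for j in features]
    return centroids


def predict(centroids, row, features):
    """Assign the class whose centroid is closest in squared Euclidean distance."""
    x = [row[j] for j in features]
    return min(
        centroids,
        key=lambda c: sum((a - b) ** 2 for a, b in zip(x, centroids[c])),
    )


def run_sketch(k=3):
    """Run all three phases and return (selected features, accuracy, recall)."""
    rows, labels = make_toy_data()
    train, test = stratified_split(labels)
    tr_rows = [rows[i] for i in train]
    tr_labels = [labels[i] for i in train]

    # Feature selection uses the training split only.
    scores = fisher_scores(tr_rows, tr_labels)
    ranked = sorted(range(len(scores)), key=scores.__getitem__, reverse=True)
    selected = sorted(ranked[:k])
    centroids = fit_centroids(tr_rows, tr_labels, selected)

    # Evaluation: overall accuracy and per-class recall on held-out samples.
    per_class = {c: [0, 0] for c in sorted(set(labels))}  # [correct, total]
    for i in test:
        per_class[labels[i]][1] += 1
        if predict(centroids, rows[i], selected) == labels[i]:
            per_class[labels[i]][0] += 1
    accuracy = sum(v[0] for v in per_class.values()) / len(test)
    recall = {c: v[0] / v[1] for c, v in per_class.items()}
    return selected, accuracy, recall


if __name__ == "__main__":
    selected, accuracy, recall = run_sketch()
    print("selected features:", selected)
    print("accuracy:", accuracy)
    print("per-class recall:", recall)
```

Per-class recall is reported alongside accuracy because, with a biased class distribution of the kind described for the clinical and P53 data, overall accuracy alone can hide poor performance on a minority class.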


More information

Probability-Based Protein Identification for Post-Translational Modifications and Amino Acid Variants Using Peptide Mass Fingerprint Data

Probability-Based Protein Identification for Post-Translational Modifications and Amino Acid Variants Using Peptide Mass Fingerprint Data Probability-Based Protein Identification for Post-Translational Modifications and Amino Acid Variants Using Peptide Mass Fingerprint Data Tong WW, McComb ME, Perlman DH, Huang H, O Connor PB, Costello

More information

FORECASTING TRENDS FOR PROACTIVE CRIME PREVENTION AND DETECTION USING WEKA DATA MINING TOOL-KIT ABSTRACT

FORECASTING TRENDS FOR PROACTIVE CRIME PREVENTION AND DETECTION USING WEKA DATA MINING TOOL-KIT ABSTRACT FORECASTING TRENDS FOR PROACTIVE CRIME PREVENTION AND DETECTION USING WEKA DATA MINING TOOL-KIT Ramesh Singh National Informatics Centre, New Delhi, India Rahul Thukral Department Of Computer Science And

More information

Comparative Analysis of Machine Learning Algorithms for Chronic Kidney Disease Detection using Weka

Comparative Analysis of Machine Learning Algorithms for Chronic Kidney Disease Detection using Weka I J C T A, 10(8), 2017, pp. 59-67 International Science Press ISSN: 0974-5572 Comparative Analysis of Machine Learning Algorithms for Chronic Kidney Disease Detection using Weka Milandeep Arora* and Ajay

More information

Ultrasonic Phased Array Inspection of Turbine Components

Ultrasonic Phased Array Inspection of Turbine Components ECNDT 2006 - Th.2.6.2 Ultrasonic Phased Array Inspection of Turbine Components Waheed A. ABBASI, Michael F. FAIR, SIEMENS Power Generation, Pittsburgh, USA Abstract. The advent and proliferation of Ultrasonic

More information

Comparative study of Naïve Bayes Classifier and KNN for Tuberculosis

Comparative study of Naïve Bayes Classifier and KNN for Tuberculosis Comparative study of Naïve Bayes Classifier and KNN for Tuberculosis Hardik Maniya Mosin I. Hasan Komal P. Patel ABSTRACT Data mining is applied in medical field since long back to predict disease like

More information

IDENTIFICATION OF OUTLIERS: A SIMULATION STUDY

IDENTIFICATION OF OUTLIERS: A SIMULATION STUDY IDENTIFICATION OF OUTLIERS: A SIMULATION STUDY Sharifah Sakinah Syed Abd Mutalib 1 and Khlipah Ibrahim 1, Faculty of Computer and Mathematical Sciences, UiTM Terengganu, Dungun, Terengganu 1 Centre of

More information

Effective Values of Physical Features for Type-2 Diabetic and Non-diabetic Patients Classifying Case Study: Shiraz University of Medical Sciences

Effective Values of Physical Features for Type-2 Diabetic and Non-diabetic Patients Classifying Case Study: Shiraz University of Medical Sciences Effective Values of Physical Features for Type-2 Diabetic and Non-diabetic Patients Classifying Case Study: Medical Sciences S. Vahid Farrahi M.Sc Student Technology,Shiraz, Iran Mohammad Mehdi Masoumi

More information

HIV Drug Resistance South Africa, How to address the increasing need? 14 Apr. 2016

HIV Drug Resistance South Africa, How to address the increasing need? 14 Apr. 2016 HIV Drug Resistance South Africa, How to address the increasing need? 14 Apr. 2016 1 Thus the HIV DR needs to focus on prevention and then diagnostic capacity to 1 st provide VL monitoring for early &

More information

Ethiopia. Targeted Tuberculosis Case Finding Interventions in Six Mining Shafts in Remote Districts of Oromia Region in Ethiopia PROJECT CONTEXT

Ethiopia. Targeted Tuberculosis Case Finding Interventions in Six Mining Shafts in Remote Districts of Oromia Region in Ethiopia PROJECT CONTEXT Technical BRIEF Photo Credit: Challenge TB Targeted Tuberculosis Case Finding Interventions in Six Mining Shafts in Remote Districts of Oromia Region in Ethiopia PROJECT CONTEXT Ethiopia is the second-most

More information

Visualizing Cancer Heterogeneity with Dynamic Flow

Visualizing Cancer Heterogeneity with Dynamic Flow Visualizing Cancer Heterogeneity with Dynamic Flow Teppei Nakano and Kazuki Ikeda Keio University School of Medicine, Tokyo 160-8582, Japan keiohigh2nd@gmail.com Department of Physics, Osaka University,

More information

QUALITY ASSURANCE GUIDELINES FOR LATENT PRINT EXAMINERS

QUALITY ASSURANCE GUIDELINES FOR LATENT PRINT EXAMINERS QUALITY ASSURANCE GUIDELINES FOR LATENT PRINT EXAMINERS Preamble SWGFAST recognizes the importance and significance of establishing Quality Assurance protocols and procedures for friction ridge examination.

More information

DIRECT IDENTIFICATION OF NEO-EPITOPES IN TUMOR TISSUE

DIRECT IDENTIFICATION OF NEO-EPITOPES IN TUMOR TISSUE DIRECT IDENTIFICATION OF NEO-EPITOPES IN TUMOR TISSUE Eustache Paramithiotis PhD Vice President, Biomarker Discovery & Diagnostics 17 March 2016 PEPTIDE PRESENTATION BY MHC MHC I Antigen presentation by

More information

Global Trends in Early Infant Diagnosis of HIV

Global Trends in Early Infant Diagnosis of HIV Global Trends in Early Infant Diagnosis of HIV Integrating Point-of-Care Testing into the National EID Program: The Case of Malawi 18 th International Conference on AIDS and STIs in Africa 1 December 2015

More information

OncoPPi Portal A Cancer Protein Interaction Network to Inform Therapeutic Strategies

OncoPPi Portal A Cancer Protein Interaction Network to Inform Therapeutic Strategies OncoPPi Portal A Cancer Protein Interaction Network to Inform Therapeutic Strategies 2017 Contents Datasets... 2 Protein-protein interaction dataset... 2 Set of known PPIs... 3 Domain-domain interactions...

More information

THE UMD TP53 MUTATION DATABASE UPDATES AND BENEFITS. Pr. Thierry Soussi

THE UMD TP53 MUTATION DATABASE UPDATES AND BENEFITS. Pr. Thierry Soussi THE UMD TP53 MUTATION DATABASE UPDATES AND BENEFITS Pr. Thierry Soussi thierry.soussi@ki.se thierry.soussi@upmc.fr TP53: 33 YEARS AND COUNTING STRUCTURE FUNCTION RELATIONSHIP OF WILD AND MUTANT TP53 1984

More information

Sequential, Multiple Assignment, Randomized Trials

Sequential, Multiple Assignment, Randomized Trials Sequential, Multiple Assignment, Randomized Trials Module 2 Experimental Design and Analysis Methods for Developing Adaptive Interventions: Getting SMART Daniel Almirall, Ahnalee Brincks, Billie Nahum-Shani

More information

Diagnosis of Breast Cancer Using Ensemble of Data Mining Classification Methods

Diagnosis of Breast Cancer Using Ensemble of Data Mining Classification Methods International Journal of Bioinformatics and Biomedical Engineering Vol. 1, No. 3, 2015, pp. 318-322 http://www.aiscience.org/journal/ijbbe ISSN: 2381-7399 (Print); ISSN: 2381-7402 (Online) Diagnosis of

More information

Automated Medical Diagnosis using K-Nearest Neighbor Classification

Automated Medical Diagnosis using K-Nearest Neighbor Classification (IMPACT FACTOR 5.96) Automated Medical Diagnosis using K-Nearest Neighbor Classification Zaheerabbas Punjani 1, B.E Student, TCET Mumbai, Maharashtra, India Ankush Deora 2, B.E Student, TCET Mumbai, Maharashtra,

More information

Experimental Design for Immunologists

Experimental Design for Immunologists Experimental Design for Immunologists Hulin Wu, Ph.D., Dean s Professor Department of Biostatistics & Computational Biology Co-Director: Center for Biodefense Immune Modeling School of Medicine and Dentistry

More information

A scored AUC Metric for Classifier Evaluation and Selection

A scored AUC Metric for Classifier Evaluation and Selection A scored AUC Metric for Classifier Evaluation and Selection Shaomin Wu SHAOMIN.WU@READING.AC.UK School of Construction Management and Engineering, The University of Reading, Reading RG6 6AW, UK Peter Flach

More information

What do we know about HIV trial design for adolescents?

What do we know about HIV trial design for adolescents? What do we know about HIV trial design for adolescents? Sinéad Delany-Moretlwe, MBBCh PhD, DTM&H International Workshop on HIV & Adolescence, Cape Town, October 2018 Outline Why include adolescents in

More information

Detection of Cognitive States from fmri data using Machine Learning Techniques

Detection of Cognitive States from fmri data using Machine Learning Techniques Detection of Cognitive States from fmri data using Machine Learning Techniques Vishwajeet Singh, K.P. Miyapuram, Raju S. Bapi* University of Hyderabad Computational Intelligence Lab, Department of Computer

More information

Funnelling Used to describe a process of narrowing down of focus within a literature review. So, the writer begins with a broad discussion providing b

Funnelling Used to describe a process of narrowing down of focus within a literature review. So, the writer begins with a broad discussion providing b Accidental sampling A lesser-used term for convenience sampling. Action research An approach that challenges the traditional conception of the researcher as separate from the real world. It is associated

More information

Structural Variation and Medical Genomics

Structural Variation and Medical Genomics Structural Variation and Medical Genomics Andrew King Department of Biomedical Informatics July 8, 2014 You already know about small scale genetic mutations Single nucleotide polymorphism (SNPs) Deletions,

More information

Detecting Anomalous Patterns of Care Using Health Insurance Claims

Detecting Anomalous Patterns of Care Using Health Insurance Claims Partially funded by National Science Foundation grants IIS-0916345, IIS-0911032, and IIS-0953330, and funding from Disruptive Health Technology Institute. We are also grateful to Highmark Health for providing

More information

A Roadmap for Improving Epilepsy Therapy Through Integrated Advanced Technologies (The Knowledge Project) December 3, 2011

A Roadmap for Improving Epilepsy Therapy Through Integrated Advanced Technologies (The Knowledge Project) December 3, 2011 A Roadmap for Improving Epilepsy Therapy Through Integrated Advanced Technologies (The Knowledge Project) December 3, 2011 Tracy Glauser, M.D. Director, Comprehensive Epilepsy Center Cincinnati Children

More information

Big Image-Omics Data Analytics for Clinical Outcome Prediction

Big Image-Omics Data Analytics for Clinical Outcome Prediction Big Image-Omics Data Analytics for Clinical Outcome Prediction Junzhou Huang, Ph.D. Associate Professor Dept. Computer Science & Engineering University of Texas at Arlington Dept. CSE, UT Arlington Scalable

More information

Semantic Pattern Transformation

Semantic Pattern Transformation Semantic Pattern Transformation IKNOW 2013 Peter Teufl, Herbert Leitold, Reinhard Posch peter.teufl@iaik.tugraz.at Our Background Topics Mobile device security Cloud security Security consulting for public

More information

Predictive and Similarity Analytics for Healthcare

Predictive and Similarity Analytics for Healthcare Predictive and Similarity Analytics for Healthcare Paul Hake, MSPA IBM Smarter Care Analytics 1 Disease Progression & Cost of Care Health Status Health care spending Healthy / Low Risk At Risk High Risk

More information

INTERVIEWS II: THEORIES AND TECHNIQUES 5. CLINICAL APPROACH TO INTERVIEWING PART 1

INTERVIEWS II: THEORIES AND TECHNIQUES 5. CLINICAL APPROACH TO INTERVIEWING PART 1 INTERVIEWS II: THEORIES AND TECHNIQUES 5. CLINICAL APPROACH TO INTERVIEWING PART 1 5.1 Clinical Interviews: Background Information The clinical interview is a technique pioneered by Jean Piaget, in 1975,

More information