Yeast Cells Classification Machine Learning Approach to Discriminate Saccharomyces cerevisiae Yeast Cells Using Sophisticated Image Features.

Similar documents
COMPARATIVE STUDY ON FEATURE EXTRACTION METHOD FOR BREAST CANCER CLASSIFICATION

Classification of mammogram masses using selected texture, shape and margin features with multilayer perceptron classifier.

Classification of Benign and Malignant Vertebral Compression Fractures in Magnetic Resonance Images

Improved Intelligent Classification Technique Based On Support Vector Machines

Data complexity measures for analyzing the effect of SMOTE over microarrays

Brain Tumour Detection of MR Image Using Naïve Beyer classifier and Support Vector Machine

Detection of Glaucoma and Diabetic Retinopathy from Fundus Images by Bloodvessel Segmentation

On Shape And the Computability of Emotions X. Lu, et al.

Predicting Breast Cancer Survivability Rates

COMPUTERIZED SYSTEM DESIGN FOR THE DETECTION AND DIAGNOSIS OF LUNG NODULES IN CT IMAGES 1

THE data used in this project is provided. SEIZURE forecasting systems hold promise. Seizure Prediction from Intracranial EEG Recordings

Nature Neuroscience: doi: /nn Supplementary Figure 1. Behavioral training.

Basic Statistics 01. Describing Data. Special Program: Pre-training 1

Derivative-Free Optimization for Hyper-Parameter Tuning in Machine Learning Problems

AUTOMATIC DIABETIC RETINOPATHY DETECTION USING GABOR FILTER WITH LOCAL ENTROPY THRESHOLDING

IDENTIFICATION OF MYOCARDIAL INFARCTION TISSUE BASED ON TEXTURE ANALYSIS FROM ECHOCARDIOGRAPHY IMAGES

Mammogram Analysis: Tumor Classification

3. Model evaluation & selection

A comparative study of machine learning methods for lung diseases diagnosis by computerized digital imaging'"

Title. Author(s)Ishihara, Kenta; Ogawa, Takahiro; Haseyama, Miki. CitationComputers in biology and medicine, 84: Issue Date

Intelligent Edge Detector Based on Multiple Edge Maps. M. Qasim, W.L. Woon, Z. Aung. Technical Report DNA # May 2012

CHAPTER 9 SUMMARY AND CONCLUSION

4. Model evaluation & selection

Early Detection of Lung Cancer

Review. Imagine the following table being obtained as a random. Decision Test Diseased Not Diseased Positive TP FP Negative FN TN

Machine learning II. Juhan Ernits ITI8600

Review: Logistic regression, Gaussian naïve Bayes, linear regression, and their connections

SVM-based Discriminative Accumulation Scheme for Place Recognition

Understandable Statistics

Computer-Aided Quantitative Analysis of Liver using Ultrasound Images

A Learning Method of Directly Optimizing Classifier Performance at Local Operating Range

Ecological Statistics

AUTOMATIC BRAIN TUMOR DETECTION AND CLASSIFICATION USING SVM CLASSIFIER

Fusing Generic Objectness and Visual Saliency for Salient Object Detection

NIH Public Access Author Manuscript Conf Proc IEEE Eng Med Biol Soc. Author manuscript; available in PMC 2013 February 01.

Fibrosis detection from ultrasound imaging. The influence of necroinflammatory activity and steatosis over the detection rates.

Mammogram Analysis: Tumor Classification

Three-dimensional textural features of conventional MRI improve diagnostic classification of childhood brain tumours

Effect of Feedforward Back Propagation Neural Network for Breast Tumor Classification

Computer-aided diagnosis of subtle signs of breast cancer: Architectural distortion in prior mammograms

AP Statistics. Semester One Review Part 1 Chapters 1-5

Conditional Distributions and the Bivariate Normal Distribution. James H. Steiger

arxiv: v1 [cs.cv] 26 Mar 2016

Extraction of Texture Features using GLCM and Shape Features using Connected Regions

Modifying ROC Curves to Incorporate Predicted Probabilities

International Journal of Advance Engineering and Research Development EARLY DETECTION OF GLAUCOMA USING EMPIRICAL WAVELET TRANSFORM

Radiologists detect 'gist' of breast cancer before overt signs appear April 2018

Automated Brain Tumor Segmentation Using Region Growing Algorithm by Extracting Feature

Performance evaluation of the various edge detectors and filters for the noisy IR images

Performance Evaluation of Machine Learning Algorithms in the Classification of Parkinson Disease Using Voice Attributes

Automatic Detection of Diabetic Retinopathy Level Using SVM Technique

Edge detection. Gradient-based edge operators

Application of Artificial Neural Network-Based Survival Analysis on Two Breast Cancer Datasets

Automatic Hemorrhage Classification System Based On Svm Classifier

Investigation of multiorientation and multiresolution features for microcalcifications classification in mammograms

International Journal Of Advanced Research In ISSN: Engineering Technology & Sciences

Comparing Multifunctionality and Association Information when Classifying Oncogenes and Tumor Suppressor Genes

Predictive performance and discrimination in unbalanced classification

EXTRACT THE BREAST CANCER IN MAMMOGRAM IMAGES

CLASSIFICATION OF BREAST CANCER INTO BENIGN AND MALIGNANT USING SUPPORT VECTOR MACHINES

Conclusions and future directions

Comparison Classifier: Support Vector Machine (SVM) and K-Nearest Neighbor (K-NN) In Digital Mammogram Images

A New Approach for Detection and Classification of Diabetic Retinopathy Using PNN and SVM Classifiers

Automatic Classification of Perceived Gender from Facial Images

Supplementary Materials

UNIVERSITY of PENNSYLVANIA CIS 520: Machine Learning Final, Fall 2014

HHS Public Access Author manuscript Mach Learn Med Imaging. Author manuscript; available in PMC 2017 October 01.

Chapter 1: Exploring Data

EEG signal classification using Bayes and Naïve Bayes Classifiers and extracted features of Continuous Wavelet Transform

Assessing Functional Neural Connectivity as an Indicator of Cognitive Performance *

Detection of Mild Cognitive Impairment using Image Differences and Clinical Features

INTRODUCTION TO MACHINE LEARNING. Decision tree learning

Facial Expression Biometrics Using Tracker Displacement Features

PMR5406 Redes Neurais e Lógica Fuzzy. Aula 5 Alguns Exemplos

LOCATING BRAIN TUMOUR AND EXTRACTING THE FEATURES FROM MRI IMAGES

Cover Page. The handle holds various files of this Leiden University dissertation

I. INTRODUCTION III. OVERALL DESIGN

Cancer Cells Detection using OTSU Threshold Algorithm

MEM BASED BRAIN IMAGE SEGMENTATION AND CLASSIFICATION USING SVM

Statistics 202: Data Mining. c Jonathan Taylor. Final review Based in part on slides from textbook, slides of Susan Holmes.

Enhanced Detection of Lung Cancer using Hybrid Method of Image Segmentation

Texture analysis in Medical Imaging: Applications in Cancer

ISSN (Online): International Journal of Advanced Research in Basic Engineering Sciences and Technology (IJARBEST) Vol.4 Issue.

Methods for comparing scanpaths and saliency maps: strengths and weaknesses

EARLY STAGE DIAGNOSIS OF LUNG CANCER USING CT-SCAN IMAGES BASED ON CELLULAR LEARNING AUTOMATE

Comparative Evaluation of Color Differences between Color Palettes

Classification of Thyroid Nodules in Ultrasound Images using knn and Decision Tree

KHAZAR KHORRAMI CANCER DETECTION FROM HISTOPATHOLOGY IMAGES

Data Mining and Knowledge Discovery: Practice Notes

DETECTING DIABETES MELLITUS GRADIENT VECTOR FLOW SNAKE SEGMENTED TECHNIQUE

An approach to classification of retinal vessels using neural network pattern recoginition

Classification of benign and malignant masses in breast mammograms

Automatic Diagnosis of Ovarian Carcinomas via Sparse Multiresolution Tissue Representation

ECG Beat Recognition using Principal Components Analysis and Artificial Neural Network

Research Article Abdominal Tumor Characterization and Recognition Using Superior-Order Cooccurrence Matrices, Based on Ultrasound Images

List of Figures. List of Tables. Preface to the Second Edition. Preface to the First Edition

Segmentation of Tumor Region from Brain Mri Images Using Fuzzy C-Means Clustering And Seeded Region Growing

Computer-aided diagnosis psoriasis lesion and other skin lesions using skin texture features and combination 1

A contrast paradox in stereopsis, motion detection and vernier acuity

Methods for Predicting Type 2 Diabetes

Transcription:

Yeast Cells Classification Machine Learning Approach to Discriminate Saccharomyces cerevisiae Yeast Cells Using Sophisticated Image Features. Mohamed Tleis Supervisor: Fons J. Verbeek Leiden University October 15, 2015 IB-2015 11 th Integrative BioInformatics International Symposium

Mohamed Tleis (Leiden University) Yeast Cells Classification 1 / 40 Table of Contents 1 Introdution 2 Features 3 Classification 4 Results 5 Conclusion

Mohamed Tleis (Leiden University) Yeast Cells Classification 2 / 40 Section 1 Introdution

Mohamed Tleis (Leiden University) Yeast Cells Classification 3 / 40 Saccharomyces cerevisiae Originally isolated from grapes skin. Intensively studied eukaryotic model. Used to understand gene behaviour, under stress response. Saccharomyces cerevisiae...

Mohamed Tleis (Leiden University) Yeast Cells Classification 4 / 40 BMH1-GFP under high stress 50mM NaCl media

Mohamed Tleis (Leiden University) Yeast Cells Classification 5 / 40 Image Analysis Platform Has the following components: Segmentation Module. Measurement Module. Statistics and Data Visualization Module. GUI.

Mohamed Tleis (Leiden University) Yeast Cells Classification 6 / 40 Image Analysis Workflow

Introdution Features Classification Results Conclusion Yeast Cells Image Modality Images Acquired by Zeiss LSM5 Exciter. 2-Channels Bright-Field GFP- Protein Mohamed Tleis (Leiden University) Yeast Cells Classification 7 / 40

Introdution Features Classification Results Conclusion Yeast Cells Segmentation Segmentation on Bright-Field Channels. Resulted masks used to measure all channels. Mohamed Tleis (Leiden University) Yeast Cells Classification 8 / 40

Mohamed Tleis (Leiden University) Yeast Cells Classification 9 / 40 Hough Transform To Detect Circles Detect Geometrical Circles. Using 3D cube-like Accumulator. r θ Center(x, y) A(x, y, r) Threshold to estimate cell locations. T = 2πr {2πr α + p} r x p = β (r max r min ) r index y

Mohamed Tleis (Leiden University) Yeast Cells Classification 10 / 40 Dynamic Programming

Mohamed Tleis (Leiden University) Yeast Cells Classification 11 / 40 Measurement Module Subtle patterns not easy to be extracted. Sometimes it s not possible to see differences in cell groups. We need an automatic system to extract hidden features.

Mohamed Tleis (Leiden University) Yeast Cells Classification 12 / 40 Machine Learning Extraction Technqiues First-Order Histogram Texture Measurement Moment Invariants Co-occurrence Matrix Wavelet-Based Textures 10-K Cross Validation Feature Selection Scaling Sampling Sophisticated Features Supervized Classification Classifiers Evaluation Model Election Classification Model

Mohamed Tleis (Leiden University) Yeast Cells Classification 13 / 40 Section 2 Features

Mohamed Tleis (Leiden University) Yeast Cells Classification 14 / 40 Feature, Texture & Extraction Techniques A feature is a representation/attribute of an image. Texture : is the visual effect produced by spatial distribution of variations. is a rich Source of visual Information. Feature Extraction is locating pixels with distinctive characteristics.

Mohamed Tleis (Leiden University) Yeast Cells Classification 15 / 40 First-Order Histogram An Image as a function f (x, y). h(i) = N 1 x=0 M 1 y=0 δ(f (x, y), i), (1) δ(j, i) = { 1, j = i 0, j i (2)

Mohamed Tleis (Leiden University) Yeast Cells Classification 16 / 40 Features based on First-Order Histogram Features Size Total Intensity Intensity Standard Deviation Perimeter Circularity Vacuole Size Membrane Features Description The number of pixels occupied by the cell. Sum of intensity values of pixels occupied by the cell. The standard deviation from the mean (intensity/pixel) of the intensity values at each pixel. Cell perimeter. The circularity of detected shapes. Circularity = 4πSize perimeter 2. (3) Estimation of the vacuole size. Size, total Intensity, Intensity standard deviation.

Textures based on First-Order Histogram Textures Description Variance Measure of intensity contrast. µ 2 (z) = L 1 i=0 (z i m) 2.P(z i ) Relative Smoothness Zero for constant intensities. R(z) = 1 1 1+σ 2 (z) Skewness Indication of the skewness of the histogram. µ 3 (z) = L 1 i=0 (z i m) 3.P(z i ) Uniformity Entropy Has a maximum value when intensity levels are equal. U(z) = L 1 i=0 P2 (z i ) A measure of variability, is zero for constant images. e(z) = L 1 i=0 P(z i ).log 2 P(z i ) Mohamed Tleis (Leiden University) Yeast Cells Classification 17 / 40

Mohamed Tleis (Leiden University) Yeast Cells Classification 18 / 40 Moment Invariants An image moment: Is a certain particular weighted average, i.e. moment of pixel intensities. Computed based on the information from shape and interior region. Useful descriptor after segmentation. Simple Properties from low order moments. Invariant to translation, scale and rotation. frequently used as features for: Image Processing. Remote sensing. Shape recognition. Classification.

Mohamed Tleis (Leiden University) Yeast Cells Classification 19 / 40 Hu s Set of Moment Invariants hu = {Φ 1, Φ 2, Φ 3, Φ 4, Φ 5, Φ 6, Φ 7 }. Are widely Known set of seven invariants. Φ 1 & Φ 2 are based on second order moments. Φ 3... Φ 7 are based on third order moments. More effective when fused with other techniques.

Co-occurrence Matrix Simple texture attributes can not characterize cells. Similar texutres agree in their second-order statistics. Second Order statistics: Are given by pairs of pixels. Have good discrimination rates. Important in automated image analysis. Features derived from co-occurrence matrix. Co-occurrence Matrix: For Image f(x,y) with L discrete levels (Dimension L x L). The (i, j) th is # of times that f(x1,y1) = i and f(x2,y2) = j. where (x2,y2) = (x1,y1) + (d cos θ, d sin θ). Mohamed Tleis (Leiden University) Yeast Cells Classification 20 / 40

Mohamed Tleis (Leiden University) Yeast Cells Classification 21 / 40 Co-occurrence Matrix Derived Features Features used for texture discrimination. Angular Second Moment. Correlation. Intertia. Absolute Value. Inverse Difference. Entropy. Maximum Probability.

Mohamed Tleis (Leiden University) Yeast Cells Classification 22 / 40 Multi-Scale Features Methods to calculate multi-scale features: Wigner Ditributions. Has interference terms between components. Gabor Transform. Non-orthogonal > Redundant features. Wavelet Transforms.

Mohamed Tleis (Leiden University) Yeast Cells Classification 23 / 40 Section 3 Classification

Mohamed Tleis (Leiden University) Yeast Cells Classification 24 / 40 Classification Dataset of 1440 yeast cell instances. 14-3-3 proteins with GFP in 50mM vs. 0mM NaCl Measure all features per individual cell instance Construct a contigency table to represent dispositions of the set of instances. Evaluate 23 different linear and non-linear classifiers.... including: decision trees, naive Bayes, least-square linear preictors, SVM, etc...

Mohamed Tleis (Leiden University) Yeast Cells Classification 25 / 40 Imbalanced Dataset & Sampling Techniques Unequal distribution between classes. Sampling Techniques improves classifier accuracy. UnderSampling. OverSampling. SMOTE.

Mohamed Tleis (Leiden University) Yeast Cells Classification 26 / 40 Data Scaling, i.e. Normalization Applied at data pre-processing. Some Algorithms will not work without Normalization. Normalization Techniques: UL. x i =, i = 1, 2,...d, xi x MV. x i = (xi µ) σ, i = 1, 2,., d,

Mohamed Tleis (Leiden University) Yeast Cells Classification 27 / 40 Feature (Attribute) Selection To optimally reduce feature space. Advantages: Improves the prediction performance. Provides faster and more cost effective classifiers. Provides a better understanding of the underlying process that generated the data. Reduces overfitting. Reduces training time. Avoid selecting redundant and irrelevant features. Selected Algorithms: Information Gain (IG). Correlation Feature Selection (CFS). Principal Component Analysis (PCA).

Mohamed Tleis (Leiden University) Yeast Cells Classification 28 / 40 Evaluation metrics, ROC and AUC ROC curve is a 2D graphical plot. AUC = 1, Perfect 1 > AUC 0.9, Excellent. 0.9 > AUC 0.8, Good. 0.8 > AUC 0.7, Fair. 0.7 > AUC 0.6, Poor. Sensitivity (TPR) 1-Specificity (FPR)

Mohamed Tleis (Leiden University) Yeast Cells Classification 29 / 40 Classifiers Evaluated 23 different classifiers. Using Weka, R and rweka

Mohamed Tleis (Leiden University) Yeast Cells Classification 30 / 40 Section 4 Results

Mohamed Tleis (Leiden University) Yeast Cells Classification 31 / 40 AUC vs. A min

Mohamed Tleis (Leiden University) Yeast Cells Classification 32 / 40 Power of Sampling

Mohamed Tleis (Leiden University) Yeast Cells Classification 33 / 40 Normalization and Feature Selection

Mohamed Tleis (Leiden University) Yeast Cells Classification 34 / 40 SVM : SMO

Mohamed Tleis (Leiden University) Yeast Cells Classification 35 / 40 Analysis and AUC value of Logistic Classifier

Mohamed Tleis (Leiden University) Yeast Cells Classification 36 / 40 Analysis and AUC value of C4.5 Classifier

C4.5 Logistic Mohamed Tleis (Leiden University) Yeast Cells Classification 37 / 40 Performance of Classifiers using second and up-to third order invariant moment features

Mohamed Tleis (Leiden University) Yeast Cells Classification 38 / 40 Section 5 Conclusion

Conclusion A machine learning approach can discriminate yeast cells cultivated under different stress levels. A feature set is powerful in predicting cell groups, combined features from 1st-order histogram, moment invariants, Co-occurrence matrix and Wavelet-based texture features. Using SMOTE for data sampling, MV for data normalization and IG for feature selection. As future work: Classify different cell strains and conditions in a high-volume HTS studies. Use developmental techniques to create optimal classifier. Mohamed Tleis (Leiden University) Yeast Cells Classification 39 / 40

Acknowledgement Correspondence : {m.tleis, f.j.verbeek}@liacs.leidenuniv.nl Supervisor : Dr. Ir. Fons J. Verbeek (section Imaging & BioInformatics, LIACS) Contributors: