NMF-Density: NMF-Based Breast Density Classifier

Lahouari Ghouti and Abdullah H. Owaidh
King Fahd University of Petroleum and Minerals, Department of Information and Computer Science, KFUPM Box 1128, Dhahran 31261, Saudi Arabia.

Abstract. The amount of tissue in the breast, commonly characterized by the breast density, is highly correlated with breast cancer. In fact, dense breasts carry a higher risk of developing breast cancer. At the same time, breast density influences mammographic interpretation because it decreases the sensitivity of breast cancer detection: both cancerous regions and dense tissue appear as white areas in mammograms. This paper introduces new features to improve the classification of breast density in digital mammograms according to the commonly used radiological lexicon (BI-RADS). These features are extracted from the non-negative matrix factorization (NMF) of mammograms and classified using machine learning classifiers. The classification performance of the proposed features is assessed using ground-truth mammographic data. Simulation results show that the proposed features significantly outperform existing density features based on principal component analysis (PCA) by achieving higher classification accuracy.

1 Introduction

Digital mammography is an efficient means of breast cancer detection. Computer-aided diagnosis (CAD) of breast cancer has emerged as a support tool for radiologists in the early detection of breast cancer. Recent medical studies have revealed a strong correlation between mammographic density and the risk of breast cancer [6]. However, breast density can also negatively influence the decision of radiologists, since the detection of cancerous tumors can be obstructed by dense tissue [6]. This paper aims at providing a mechanism for the systematic classification of breast densities according to the BI-RADS lexicon.
It is hoped that this automated mechanism will facilitate the work of radiologists in the mammographic evaluation for breast cancer detection. The proposed density classification scheme introduces new features based on the non-negative matrix factorization (NMF) technique.

In the literature, several approaches have been proposed for the automated classification of breast density. These approaches can be classified into: 1) matrix factorization; 2) global histogram; and 3) texture analysis methods. Matrix factorization techniques factorize the mammogram images into a product of several factor images according to specific constraints. Consequently, the mammographic images, known for their high dimensionality, undergo a drastic dimensionality reduction where only dominant features are kept. Oliver et al. [6] proposed a two-class breast density classification. Image segmentation is used as a pre-processing step. Then, features are extracted using principal component analysis (PCA) and linear discriminant analysis (LDA) techniques to classify the mammogram images into fatty and dense types. Features extracted using 2D-PCA are proposed by De Oliveira et al. [3] to build a two-class (fatty and dense) content-based image retrieval (CBIR) system. A support vector machine (SVM) with Gaussian kernels classifies image features represented by the first four principal components (PCs). Reported results indicate that 2D-PCA outperforms the standard PCA in terms of classification accuracy. Using the same features proposed in [3], Deserno et al. [4] consider four density classes according to the BI-RADS lexicon using a similar classifier. De Oliveira et al. [1] propose a CBIR system, called MammoSVD, where image features are extracted using the singular value decomposition (SVD) algorithm. It is noteworthy that the MammoSVD system is a binary classifier (fatty and dense tissue) based on an SVM learning machine. The SVD-based features provide a good characterization of the mammographic texture. The MammoSVD system achieves 90% classification accuracy. In [2], a four-class model, called MammoSVx, is proposed where features are represented using the 25 largest singular values of the SVD of the mammogram images. Using an SVM learning model with a polynomial kernel against a mammographic database containing 10,000 images, MammoSVx achieves a classification accuracy of 82.14%.

This paper is organized as follows. Section 2 provides a detailed description of the NMF technique. The proposed scheme for the classification of breast density is discussed in Section 3, where the features extracted from mammographic images using the NMF factorization are also introduced. Section 4 gives a summary of the performance analysis carried out to assess the accuracy of the proposed breast density classification scheme.

Research supported by King Abdulaziz City for Science and Technology through NSTP Project No. 11-MED1612-04.
The paper concludes with Section 5, where conclusions and directions for future work are given.

2 Non-Negative Matrix Factorization (NMF)

NMF is an unsupervised learning approach that leads to parts-based image representations. Such representations are generated using additive combinations of the original images [footnote 1]. Also, the non-negativity constraint imposed on the factorization allows for more realistic extracted image factors [5]. Given a non-negative input image, $A \in \mathbb{R}^{m \times n}$, the NMF yields the following factorization:

$$A \approx W V \qquad (1)$$

where the columns of $W \in \mathbb{R}^{m \times r}$ and the columns of $V \in \mathbb{R}^{r \times n}$ represent the NMF bases and their encoding coefficients, respectively. Image approximation is achieved using ranks satisfying $(m + n)\, r < m n$. Because the NMF does not allow negative entries in $W$ and $V$, it has found several applications, including face recognition and gene expression analysis. Fig. 1 reveals the power of the NMF factorization in terms of the locality of the extracted features. The NMF bases are well localized, unlike the PCA ones, which gives NMF-based features more discriminating capability.

Footnote 1: Other factorization methods such as PCA and independent component analysis (ICA) yield subtractive combinations leading to holistic image representations.

ESANN 2014 proceedings, European Symposium on Artificial Neural Networks, Computational Intelligence

Fig. 1: NMF versus PCA decomposition. First 16 NMF bases (left). First 16 PCA bases (right).

The factorization given by Eq. 1 defines the following optimization problem: given a non-negative image $A \in \mathbb{R}^{m \times n}_{\geq 0}$, find non-negative approximations $W \in \mathbb{R}^{m \times r}_{\geq 0}$ and $V \in \mathbb{R}^{r \times n}_{\geq 0}$ such that $r < \min(m, n)$. This non-convex constrained optimization problem is defined as follows:

$$f(W, V) = \| A - W V \|_F^2 = \sum_{i,j} \left( A_{ij} - (W V)_{ij} \right)^2 \qquad (2)$$

The squared Frobenius norm, $\| \cdot \|_F^2$, is used to measure the approximation error. Other common objective functions include the well-known Kullback-Leibler divergence:

$$D(A \,\|\, W V) = \sum_{i,j} \left( A_{ij} \log \frac{A_{ij}}{(W V)_{ij}} - A_{ij} + (W V)_{ij} \right) \qquad (3)$$

The objective functions in Eqs. 2 and 3 can be minimized using different algorithms, including multiplicative updates, gradient descent and alternating least squares [5]. The multiplicative updates proposed by Lee and Seung [5] for solving Eq. 2 are given by:

$$W \leftarrow W \odot \frac{A V^{T}}{W V V^{T}}, \qquad V \leftarrow V \odot \frac{W^{T} A}{W^{T} W V} \qquad (4)$$

where $\odot$ and the fraction bars denote element-wise multiplication and division, respectively.

3 Proposed NMF-Based Breast Density Classification System

The proposed NMF-based breast density classification system is shown in Fig. 2. This system consists of three main building blocks:

1. Preprocessing and segmentation: The preprocessing step is crucial for successful and error-free mammographic interpretation. This step includes noise removal and contrast enhancement. The pectoral muscle, visible in MLO views, is segmented out, enabling the extraction of the image region of interest (ROI). In the case of the experimental database used in this paper, the extracted ROIs contain 300 × 300 pixels. A sample mammographic image and its preprocessed version are shown in Fig. 3. Preprocessing and segmentation visibly improve the quality of the image prior to inspection by radiologists.

2. Feature extraction and selection: Mammogram images having the same density annotation are grouped into a large mammogram image, which is decomposed using the NMF factorization given by Eq. 1. Features are then extracted by retaining only the first few factors. The efficiency of the NMF factorization is illustrated in Fig. 1, where only 16 NMF factors are retained.

3. Machine learning-based classification: Given their universal classification capabilities, support vector machines (SVMs) are proposed to classify the breast density classes (binary or multi-class). As such, a CBIR system can be developed based on the breast density categorization. In general, the SVM classifier finds the linear decision boundary (or hyperplane) that successfully separates data pertaining to two given classes. Moreover, this hyperplane maximizes the separating distance (margin) between the two classes. A greater separating distance yields higher classification performance.

Fig. 2: Proposed Block-Based Breast Density Classification Scheme.

Fig. 3: Raw mammogram image (left). Preprocessed and segmented sample (middle). 300 × 300 ROI area (right).
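The multiplicative updates of Eq. 4, on which the feature extraction step relies, can be sketched in a few lines of NumPy. This is only an illustrative sketch, not the authors' implementation: the function name `nmf_multiplicative`, the random initialization, the fixed iteration count and the small constant `eps` (added to avoid division by zero) are all assumptions made for the example.

```python
import numpy as np

def nmf_multiplicative(A, r, n_iter=200, eps=1e-9, seed=0):
    """Approximate a non-negative matrix A (m x n) as W @ V with W, V >= 0.

    Hypothetical helper: uses the multiplicative updates of Eq. 4 for the
    squared Frobenius objective of Eq. 2.
    """
    rng = np.random.default_rng(seed)
    m, n = A.shape
    # Strictly positive random initialization (an assumption of this sketch).
    W = rng.random((m, r)) + eps
    V = rng.random((r, n)) + eps
    errors = []
    for _ in range(n_iter):
        # Element-wise multiply/divide; eps guards against division by zero.
        W *= (A @ V.T) / (W @ V @ V.T + eps)
        V *= (W.T @ A) / (W.T @ W @ V + eps)
        # Track the approximation error of Eq. 2.
        errors.append(np.linalg.norm(A - W @ V, "fro") ** 2)
    return W, V, errors

# Toy example: a random non-negative 30 x 20 "image" and a rank-5 approximation.
# The rank satisfies (m + n) * r = 250 < m * n = 600.
A = np.random.default_rng(1).random((30, 20))
W, V, errors = nmf_multiplicative(A, r=5)
```

Because every update multiplies a non-negative entry by a non-negative ratio, `W` and `V` remain non-negative throughout, which is what produces the additive, parts-based representation described in Section 2.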

4 Simulation Results and Discussion

To assess the capability of the proposed NMF-based breast density classification, the Mammographic Image Analysis Society (MIAS) database is used. This freely available public database contains 322 mammographic images (50 micron pixel edge) taken in medio-lateral oblique (MLO) view [7]. All images, reduced to 1024 × 1024 pixels, were annotated by experienced radiologists and classified into three distinct density categories: 1) fatty (106 images); 2) fatty-glandular (104 images); and 3) dense-glandular (112 images). Training and testing experiments were conducted using 70% and 30%, respectively, of the full mammogram and ROI images [footnote 2]. Breast densities are also classified according to the BI-RADS lexicon. This lexicon describes the breast density, along with the lesion features and classification.

Table 1 summarizes the classification results of the conducted experiments. The second and third columns of this table show the mean accuracy obtained in the testing phase (30% of the data) for full mammograms and ROIs, respectively. Simulation results reported for the PCA-based features are reproduced from [7].

Features                     Full Mammogram    Patches (ROIs)
PCA (first 5 components)     50.62             55.62
PCA (first 10 components)    50.31             57.81
NMF (first 5 factors)        72.43             77.84
NMF (first 10 factors)       75.34             83.19

Table 1: Mean accuracy (%) of PCA- and NMF-based density classification schemes using full mammogram and ROI images.

As can be observed, NMF yields higher classification accuracy thanks to its parts-based factorization. Also, the use of ROIs allows higher classification accuracy than the use of full mammograms. In fact, the use of ROIs with NMF yields the best classification accuracy, since the extracted local features are not biased towards the background regions, which are usually dominant in mammographic images.
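The experimental protocol described above (NMF features, an SVM classifier and a 70%/30% split) can be sketched with scikit-learn as follows. Several things here are assumptions made purely for illustration: the data are small synthetic stand-ins rather than MIAS ROIs, each image is flattened into a row and a shared basis is learned (a simplification of the per-class grouping described in Section 3), and the RBF kernel and the `nndsvda` initialization are arbitrary choices. The number of retained factors, 10, mirrors the last row of Table 1.

```python
import numpy as np
from sklearn.decomposition import NMF
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

rng = np.random.default_rng(0)

# Synthetic stand-in for ROI data: 90 tiny 16 x 16 "ROIs" in 3 density classes.
# A real experiment would load the 300 x 300 MIAS ROIs instead.
n_per_class, side = 30, 16
X = np.vstack([
    rng.random((n_per_class, side * side)) * (0.4 + 0.3 * c)  # brighter = denser
    for c in range(3)
])
y = np.repeat(np.arange(3), n_per_class)

# 70% training / 30% testing, stratified by density class.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0
)

# Learn 10 non-negative factors on the training set only,
# then encode both sets with the learned basis.
nmf = NMF(n_components=10, init="nndsvda", max_iter=500, random_state=0)
F_train = nmf.fit_transform(X_train)
F_test = nmf.transform(X_test)

# SVM on the NMF encodings (kernel choice is an assumption of this sketch).
clf = SVC(kernel="rbf", gamma="scale").fit(F_train, y_train)
accuracy = clf.score(F_test, y_test)
print(f"test accuracy: {accuracy:.2f}")
```

Fitting the factorization on the training portion only keeps the test images out of the learned basis, so the reported accuracy is not inflated by information leaking from the test set.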
Finally, it should be noted that increasing the number of retained factors does not always yield higher accuracies. This is reminiscent of the over-fitting problem often encountered in artificial neural networks.

Footnote 2: Pre-processed full mammogram and ROI images are available at: https://github.com/welber/cad mammography/tree/master/mias 300 300.

5 Conclusions

In this paper, we proposed a new classification scheme for breast density using non-negative matrix factorization (NMF) and support vector machines. Unlike other factorization techniques, NMF enables the extraction of local mammographic features thanks to the non-negativity constraint imposed. Also, the iterative solution of the NMF makes it an excellent candidate for mammogram images, which are usually characterized by high resolution and depth. The performance of the NMF-based breast density classification scheme is assessed using a standard, annotated and publicly available mammographic database. As reported in this paper, the proposed classification scheme not only achieves higher classification accuracy but also properly handles the invariance in mammogram images due to the breast density. Finally, the proposed density classification is based on local (or parts-based) features, which yield sparse structures when the sparsity constraint is imposed on the NMF factors. Future work includes a detailed investigation of the classification capability of the proposed algorithm versus industry-acclaimed density estimation tools such as the VolparaDensity (previously known as Cumulus) tool. Also, extending the proposed scheme to classify all BI-RADS density types will be considered.

Acknowledgment

The authors would like to acknowledge the support provided by King Abdulaziz City for Science and Technology (KACST) through the Science & Technology Unit at King Fahd University of Petroleum & Minerals (KFUPM) for funding this work through project No. 11-MED1612-04 as part of the National Science, Technology and Innovation Plan.

References

[1] J. E. E. de Oliveira, G. Camara-Chavez, A. de Araujo, and T. M. Deserno. MammoSVD: A content-based image retrieval system using a reference database of mammographies. In 22nd IEEE International Symposium on Computer-Based Medical Systems, pages 1-4, 2009.
[2] J. E. E. de Oliveira, G. Camara-Chavez, A. de Araujo, and T. M. Deserno. Content-based image retrieval applied to BI-RADS tissue classification in screening mammography. World Journal of Radiology, 3(1):24-31, 2011.
[3] J. E. E. de Oliveira and A. de Araujo. MammoSysLesion: A content-based image retrieval system for mammographies. In 17th International Conference on Systems, Signals and Image Processing (IWSSIP 2010), pages 408-411, 2010.
[4] T. M. Deserno, M. Soiron, J. E. E. de Oliveira, and A. de Araujo. Towards computer-aided diagnostics of screening mammography using content-based image retrieval. In 24th Conference on Graphics, Patterns and Images (Sibgrapi 2011), pages 1754-1760, 2011.
[5] D. D. Lee and H. S. Seung. Learning the parts of objects by non-negative matrix factorization. Nature, 401(6755):788-791, 1999.
[6] A. Oliver, X. Lado, E. Perez, J. Pont, J. Denton, E. Freixenet, and J. Marti. Statistical approach for breast density segmentation. Journal of Digital Imaging, 23(5):55-65, 2009.
[7] W. R. Silva and D. Menotti. Classification of mammograms by the breast composition. In International Conference on Image Processing, Computer Vision, and Pattern Recognition (ICPV 2012), pages 1-6, July 2012.