Building an Ensemble System for Diagnosing Masses in Mammograms

Yu Zhang, Noriko Tomuro, Jacob Furst, Daniela Stan Raicu
College of Computing and Digital Media, DePaul University, Chicago, IL 60604, USA
{jzhang2, tomuro, jfurst, draicu}@cs.depaul.edu

ABSTRACT

Purpose. A suspicious mass (region of interest, ROI) in a mammogram may be classified as malignant or benign using mass shape features. An ensemble system was built for this purpose and tested.

Methods. Multiple contours were generated from a single ROI by using various parameter settings of the image enhancement functions applied before segmentation. For each segmented contour, the mass shape features were computed. For classification, the dataset was partitioned into four subsets based on patient age (young/old) and ROI size (large/small). We built an ensemble learning system consisting of four single classifiers, where each classifier is a specialist trained specifically for one of the subsets. Each specialist is also the optimal classifier for its subset, selected from several candidate classifiers through a preliminary experiment. In this scheme, the final diagnosis (malignant or benign) of an instance is the classification produced by the classifier trained for the subset to which the instance belongs.

Results. The ensemble system was tested on masses from the Digital Database for Screening Mammography (DDSM) from the University of South Florida and achieved 72% overall accuracy. This ensemble of specialist classifiers performed better than the best single classifier trained without partitioning (56%).

Conclusion. An ensemble classifier for mammography-detected masses may provide superior performance to any single classifier in distinguishing benign from malignant cases.

Keywords: Mass Classification, Mass Segmentation, CADx, Ensemble Learning

1 Introduction

Breast cancer is the second leading cause of cancer-related deaths for women in the U.S. after lung cancer [1]. At present, the most effective method for the early detection of breast cancer is mammography screening. Many Computer-Aided Diagnosis (CADx) systems have been developed as a second opinion to assist radiologists in detecting or diagnosing abnormalities in mammography screening. Masses and microcalcifications are the two most common types of abnormalities associated with breast cancer. The research presented in this paper is part of an ongoing project to develop an image-based CADx system that classifies suspicious masses in mammograms as malignant or benign.

For radiologists, the shape and margin of masses are the two most important criteria in distinguishing malignant from benign masses [2]. In a CADx system for diagnosing masses in mammograms, segmentation separates a mass from its background and captures the shape and boundary of the mass. After segmentation, the contour of a mass is identified, and the shape features and spiculation level can be computed for classifying the mass as benign or malignant [3].

In this research, we present an ensemble system for classifying a suspicious mass region in a mammogram as malignant or benign by using mass shape features. Our CADx system processes a suspicious mass region (region of interest, ROI) in four stages: 1) mass segmentation, 2) feature extraction, 3) feature selection, and 4) classification. Figure 1 depicts the schematic framework of our approach.

In our CADx system, multiple mass contours are extracted from each ROI image by applying multiple segmentations. Each segmentation uses a different image enhancement function, defined by a particular combination of parameter values. We call each such segmentation a weak segmentor, since no single set of image enhancement parameter values produces the optimal segmentation for all images. Then, for each segmented contour, we compute the mass shape features for classification. Finally, for classification, we partition the dataset into four subsets based on patient age (young/old) and ROI pixel size (large/small), and build an ensemble learning system consisting of four single classifiers, where each classifier is trained specifically for one of the subsets. Thus, the final diagnosis (malignant or benign) of an instance is the classification produced by the optimal classifier trained for the subset to which the instance belongs.

Fig. 1. Overall Framework of Our Approach

This paper is organized as follows: Section 2 reviews mass segmentation, mass CADx systems and ensemble learning; Section 3 describes the dataset, multiple segmentation, shape feature extraction and ensemble learning methods; Section 4 presents the experimental results; and Section 5 discusses the conclusions and future work.

2 Mass Segmentation and Mass CADx System

2.1 Mass Segmentation

Masses are thickenings of breast tissue which appear as lesions, with sizes ranging from 3 mm to 30 mm. The shape and margin of masses are two important criteria for distinguishing malignant from benign masses. Usually a mass with a poorly defined shape is more likely to be malignant than a well-circumscribed mass. The margin is the border of a mass. Ill-defined margins or spiculated lesions are much more likely to be malignant [2].

In a CADx system for classifying a mass, segmentation is an essential step: it separates a mass from its background and captures the contour (boundary) of the mass from a suspicious area. Then, the shape and spiculation features of the detected contour can be computed for mass classification. Previous studies have shown that improving the mass segmentation can significantly improve the accuracy of mass diagnosis [4].

Many mass segmentation methods for mammogram images have been developed. There are two common mass segmentation approaches: 1) region-based methods and 2) edge-based methods. In region-based methods, mass regions are iteratively grown by comparing all neighboring pixels and including the pixels that are similar to the respective regions. The similarity can be measured with intensity or pixel texture features. In edge-based methods, segmentation is commonly done by techniques based on edge detection.

Jiang et al. proposed a mass segmentation method that obtains initial segmented regions by a threshold based on the principle of maximum entropy [4]. Kupinski et al. developed two methods for segmenting lesions in mammographic images: a radial gradient index (RGI) algorithm and a probabilistic algorithm [5]. Mencattini et al. proposed a modified region-based mass segmentation procedure which optimized the similarity criteria [6]. Xu et al. developed a mass segmentation algorithm for two types of mass models. In their system, the Canny edge information of each pixel was computed, and the region growing technique was applied to merge adjacent regions [7]. Yuan et al. developed a method for mass lesion segmentation using a geometric active contour model [8]. Byrd et al. presented a comprehensive analysis evaluating the performance of three existing digital mammography segmentation algorithms against the manual segmentation results produced by two radiologists [9].

Mammogram images have varied intensity contrast ranges and different noise levels. Also, patients have different breast density levels. These variability factors make it difficult to find one optimal segmentor that fits all mass images [10]. To alleviate this problem, we built multiple weak segmentors for each ROI by using various image enhancements. The results showed that using multiple weak segmentors is an effective method to generate a strong mass segmentation for mammograms [11].

2.2 CADx Systems for Mass Diagnosis in Mammograms

With varied feature extraction, feature selection and classification methods, many CADx systems have been developed to assist radiologists in diagnosing masses as benign or malignant. Domínguez et al. applied two segmentation methods to obtain two sets of mass contours, and the simplified contours were used to extract features [12]. In their CADx system, three classifiers (a Bayesian classifier, Fisher's linear discriminant, and a Support Vector Machine (SVM)) were used to classify masses as benign or malignant. In the work of Delogu et al., a gradient-based segmentation was applied, and mass shape, size and intensity features were computed [13]. Finally, a neural network (NN) was applied to classify the mass type. Sampat et al. computed Beamlet transform features, and applied these features to a K-Nearest Neighbor (K-NN) classifier to predict mass BI-RADS shape categories [14]. Ghosh et al.
computed three categories of features (statistical, structural and grey level dependency) from suspicious areas, and applied a genetic algorithm for feature selection. In their CADx system, the selected features were fed to a NN classifier to diagnose suspicious areas in mammograms [15]. Zhang et al. compared several classification and feature selection models for mass classification [16].

2.3 Ensemble Learning

An ensemble of classifiers is a set of individually trained classifiers whose predictions are combined to classify new instances. Previous research has shown that ensemble learning often achieves better classification accuracy than the individual classifiers that make up the ensemble [17, 18].

The most commonly used ensemble learning methods include bagging, stacked generalization (stacking) and boosting. In the bagging method, multiple subsets of instances are formed, and each of these subsets is used to train a classifier (using the same classification algorithm for each subset). Finally, an aggregated predictor is generated from those classifiers [19]. Stacking uses several base-level classifiers to generate multiple predictions, and combines them into a final classification by using a meta-level classifier [20, 21]. The boosting method repeatedly runs a classification algorithm and generates a sequence of weak classifiers. In each iteration, a classifier is trained with greater weights assigned to the instances that were not correctly classified in the previous iteration, and lower weights assigned to those that were correctly classified [22].

In our previous research, we developed a content-based classification system which used BI-RADS features for classifying masses [23]. In that research, the ensemble learning partitioned the whole dataset into subsets based on the patient age and the mass shape category. For each subset, we tested several classifiers and selected the classifier which produced the highest accuracy as the optimal base classifier for the subset. In this research, we apply a similar ensemble learning approach to our image-based CADx system.
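To make the bagging description above concrete, the following is a minimal sketch in plain Python. It is an illustration only and not the system built in this paper: the base learner (a one-feature decision stump), the function names `train_stump` and `bagging`, and the default of 11 models are all assumptions chosen to keep the example short.

```python
import random
from collections import Counter


def train_stump(samples):
    """Fit a one-feature threshold classifier (decision stump).

    samples: list of (feature_tuple, label) pairs with labels 0 or 1.
    The stump is found by exhaustive search over features and thresholds.
    """
    best = None
    n_features = len(samples[0][0])
    for j in range(n_features):
        for x, _ in samples:
            t = x[j]
            # Try both orientations: (label if <= t, label if > t).
            for left, right in ((0, 1), (1, 0)):
                errors = sum(
                    (left if xi[j] <= t else right) != yi for xi, yi in samples
                )
                if best is None or errors < best[0]:
                    best = (errors, j, t, left, right)
    _, j, t, left, right = best
    return lambda x: left if x[j] <= t else right


def bagging(samples, n_models=11, seed=0):
    """Train n_models stumps on bootstrap resamples and combine by majority vote."""
    rng = random.Random(seed)
    models = []
    for _ in range(n_models):
        # Bootstrap: draw len(samples) instances with replacement.
        boot = [rng.choice(samples) for _ in samples]
        models.append(train_stump(boot))

    def predict(x):
        votes = Counter(model(x) for model in models)
        return votes.most_common(1)[0][0]

    return predict
```

Each model sees a different resample of the training instances, and the aggregated predictor is the majority vote. Boosting differs in that the resampling (or weighting) of each round depends on the errors of the previous round.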
3 Data Description and Methodology

3.1 Data Description and Mass ROI Extraction

In this work, all mass ROI images were extracted from the Digital Database for Screening Mammography (DDSM) from the University of South Florida. DDSM is the largest publicly available resource for the mammogram analysis research community. In DDSM images, BI-RADS information is annotated for each abnormal region [24].

In DDSM, mammogram images were digitized by different scanners with different resolutions. For data consistency, all images in this research were collected from the same scanner type and resolution. We used all mammograms from the LUMISYS scanner type, because the largest number of cases in DDSM were digitized by this type of scanner.

In DDSM, a suspicious region (ROI) is marked by experienced radiologists, and chain codes recorded in an overlay file indicate the location of the ROI in the mammogram. For each suspicious mass, we extracted a rectangular image as a mass ROI, which includes the suspicious mass and its surrounding area. In our study, we removed instances with extreme digitization artifacts (e.g. incorrectly ordered scan lines) and instances of extremely large size (over 2000 x 2000 pixels). We also removed instances with mixed BI-RADS descriptors and those ROI images which displayed only a portion of a mass. After removing those instances, a total of 543 mass ROI images were left for this study, of which 272 were benign and 271 were malignant. Figures 2 and 3 show the distribution of the mass BI-RADS shape and margin features respectively.

Fig. 2. Mass BI-RADS shape distribution

Fig. 3. Mass BI-RADS margin distribution

3.2 Building Multiple Mass Segmentors

Mammogram images have varied intensity contrast ranges and noise levels. These variability factors make it difficult to select a single segmentor (one setting of parameters) that produces the optimal segmentation results for all images. To address this problem, we applied multiple segmentations, which we call weak segmentors, to each ROI image. For each mass ROI, k enhanced images are generated by applying gamma corrections and Gaussian filters with k different parameter settings. Then, from each enhanced image, we compute the energy descriptor of each pixel and extract an energy texture image. Finally, we use an edge-based segmentation method to detect the mass contour in each energy texture image, so that k segmentation results are generated for each mass ROI [10]. The segmentation results are evaluated as successful or unsuccessful by the overlapping ratio. For a successful segmentation, the boundary of the detected region is used as the mass contour.

Table 1 shows examples of enhanced images and segmentations generated by three weak segmentors (using gamma corrections γ = 1, 2, 5, and Gaussian filter σ = 5) for the same ROI. In the segmentation examples, the green line is the mass contour identified by our edge-based segmentation, while the red line is the mass outline marked by the radiologist. Also note that the three weak segmentors individually produced overall successful segmentation rates of 66%, 73% and 77% respectively (with respect to the whole image set).

Table 1. Multiple Weak Segmentors

Weak Segmentor                  Segmentor 1     Segmentor 2     Segmentor 3
Image Enhancement               γ = 1, σ = 5    γ = 2, σ = 5    γ = 5, σ = 5
Successful Segmentation Rate    66%             73%             77%
Enhanced Image                  [image]         [image]         [image]
Segmentation Result             [image]         [image]         [image]
Segmentation Evaluation         Successful      Unsuccessful    Successful
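The enhancement step that feeds each weak segmentor can be sketched as follows. This is a hedged illustration, not the authors' implementation: the paper does not give the exact formulas, so the sketch assumes the standard power-law form of gamma correction on intensities normalized to [0, 1], a separable Gaussian filter truncated at three standard deviations with replicated borders, and gamma correction applied before smoothing. All function names are hypothetical, and the energy texture and edge-based segmentation steps are not shown.

```python
import math


def gamma_correct(image, gamma, max_value=255.0):
    """Power-law gamma correction: out = max * (in / max) ** gamma."""
    return [[max_value * (p / max_value) ** gamma for p in row] for row in image]


def gaussian_kernel(sigma):
    """Normalized 1-D Gaussian kernel truncated at 3 sigma (an assumption)."""
    radius = int(3 * sigma)
    weights = [math.exp(-(i * i) / (2.0 * sigma * sigma))
               for i in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def smooth_rows(image, kernel):
    """Convolve every row with the kernel, replicating border pixels."""
    radius = len(kernel) // 2
    out = []
    for row in image:
        n = len(row)
        new_row = []
        for x in range(n):
            acc = 0.0
            for k, w in enumerate(kernel):
                xx = min(max(x + k - radius, 0), n - 1)
                acc += w * row[xx]
            new_row.append(acc)
        out.append(new_row)
    return out


def transpose(image):
    return [list(col) for col in zip(*image)]


def gaussian_filter(image, sigma):
    """Separable 2-D Gaussian smoothing: rows first, then columns."""
    kernel = gaussian_kernel(sigma)
    horizontal = smooth_rows(image, kernel)
    return transpose(smooth_rows(transpose(horizontal), kernel))


def weak_segmentor_inputs(image, gammas=(1, 2, 5), sigma=5):
    """Return one enhanced image per gamma value (k = len(gammas))."""
    return {g: gaussian_filter(gamma_correct(image, g), sigma) for g in gammas}
```

With the default arguments, the sketch produces the three parameter settings listed in Table 1 (γ = 1, 2, 5 with σ = 5), one enhanced image per weak segmentor.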

3.3 Mass Feature Extraction

The shape and margin of masses are two important criteria for distinguishing malignant from benign masses. In this step, for each successfully segmented contour, we compute the mass shape features which measure the properties of a mass. The following 14 shape features are computed: area, convex, perimeter, circularity, compactness, solidity, convex, roughness, equivalent diameter, elongation, major axis length, minor axis length, eccentricity and extent [25]. Then, for each ROI image, we concatenate the shape features from the k weak segmentors and represent each mass instance by a total of 14 x k shape features:

$\{\{f_{1,1}, f_{1,2}, \ldots, f_{1,n}, \ldots, f_{1,14}\}, \{f_{2,1}, f_{2,2}, \ldots, f_{2,n}, \ldots, f_{2,14}\}, \ldots, \{f_{k,1}, f_{k,2}, \ldots, f_{k,n}, \ldots, f_{k,14}\}\}$

where $f_{i,j}$ ($1 \le i \le k$, $1 \le j \le 14$) denotes the value of the jth shape feature produced by the ith segmentor. For unsuccessful segmentations, no shape features can be computed, and their values are set to a default value so that they have no influence on classification.

3.4 Ensemble Learning

Previous research has shown that ensemble learning often achieves better classification accuracy than individual classifiers. In this research, we propose an ensemble learning scheme that partitions a dataset into several subsets and develops an optimal classifier for each subset. By applying the best classification algorithm to each subset, we expect the overall classification accuracy for the whole dataset to improve. In our previous research, we used ensemble learning with data partitioned by patient age and mass BI-RADS shape feature, and achieved better performance than the best classification with no data partitioning [23]. In this study, using a similar approach, we used the mean patient age and the mean ROI size as splitting thresholds and partitioned the data into four subsets, which are displayed in Table 2.
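The mean-threshold partitioning described above can be sketched as follows. This is an illustrative sketch only: the dictionary keys `age` and `roi_pixels` and the function name are hypothetical, and instances exactly at a threshold are assigned to the "old" or "large" group, matching the "<" and ">=" conditions used for the subsets.

```python
from statistics import mean


def partition_by_age_and_size(instances):
    """Split instances into four subsets using mean age and mean ROI size.

    instances: list of dicts with (hypothetical) keys "age" and "roi_pixels".
    Returns the subsets keyed by (age_group, size_group) and both thresholds.
    """
    age_threshold = mean(inst["age"] for inst in instances)
    size_threshold = mean(inst["roi_pixels"] for inst in instances)

    subsets = {
        ("young", "small"): [],
        ("young", "large"): [],
        ("old", "small"): [],
        ("old", "large"): [],
    }
    for inst in instances:
        age_group = "young" if inst["age"] < age_threshold else "old"
        size_group = "small" if inst["roi_pixels"] < size_threshold else "large"
        subsets[(age_group, size_group)].append(inst)
    return subsets, age_threshold, size_threshold
```

At prediction time, the same two thresholds would route a new instance to the classifier trained for its subset.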
Table 2. Partitioning into Four Subsets Based on Patient Age and Mass ROI Size

Data Subset                  Patient Age (years)   Mass ROI Size (pixels)   Instances
Young Age, Small ROI Size    < 57                  < 643200                 184
Young Age, Large ROI Size    < 57                  >= 643200                87
Old Age, Small ROI Size      >= 57                 < 643200                 172
Old Age, Large ROI Size      >= 57                 >= 643200                100

Then, for each subset, we performed feature selection to remove potentially irrelevant or redundant features in order to improve classification. To select attributes, we first computed the Information Gain (IG) of all features and ranked them from high to low for each subset. IG is a measure of purity based on entropy, and indicates the amount of information an attribute gives: a larger IG means the attribute is more informative [26]. Then, we removed those features which had very low IG (close to 0), since such features are likely to be nearly irrelevant for classification.

After feature selection, we selected an optimal classifier for each data subset. To do so, we ran three classification algorithms (Decision Tree, SVM and K-NN) and selected the one which produced the best accuracy as the optimal classifier for the subset. In this way, every selected classifier is a specialist trained specifically for a given population of instances that have certain characteristics, and the ensemble of those specialists forms a system which could diagnose masses more accurately over all instances in the whole dataset than one general classifier or an ensemble of general classifiers.
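As an illustration of the information-gain ranking used in the feature selection step above, the sketch below computes IG as the class entropy minus the expected class entropy after splitting on a feature. It assumes the feature has already been discretized into a small number of values; how the continuous shape features were discretized is not described here, so that step is left out. The function names and the `min_gain` cutoff are hypothetical.

```python
import math
from collections import Counter, defaultdict


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def information_gain(values, labels):
    """IG = H(class) - sum over feature values v of P(v) * H(class | v)."""
    groups = defaultdict(list)
    for v, y in zip(values, labels):
        groups[v].append(y)
    n = len(labels)
    remainder = sum(len(g) / n * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


def select_features(columns, labels, min_gain=1e-6):
    """Rank features by IG (high to low) and drop those with IG near zero.

    columns: dict mapping feature name -> list of discretized values.
    """
    ranked = sorted(
        ((information_gain(vals, labels), name) for name, vals in columns.items()),
        reverse=True,
    )
    return [name for gain, name in ranked if gain > min_gain]
```

A feature that splits the classes perfectly has an IG equal to the class entropy, while a feature whose values are independent of the class has an IG of zero and would be removed.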

Note that we chose those three candidate algorithms because they have diverse characteristics. For example, Decision Tree is based on information gain; SVMs are known to be robust to noise; and K-NN decides the classification based on local information. Also note that in this study, we used the well-known machine learning tool Weka [27] to build the classification models, and its 10-fold cross-validation option to do the training and testing for all data (sub)sets.

4 Results and Discussion

In our experiment, we built three weak segmentors from different image enhancements (gamma corrections γ = 1, 2, 5, with a Gaussian filter using σ = 5 for all three gamma values). Those segmentors achieved successful segmentation rates of 66%, 73% and 77% respectively. A total of 42 mass shape features (3 segmentors x 14 shape features) were computed for each mass instance. Then, we partitioned the data into four subsets (Young age - small ROI, Young age - large ROI, Old age - small ROI and Old age - large ROI) as described in the previous section.

For feature selection, we computed the IG of all features in all subsets. In the Young age - large ROI subset, three shape features (solidity, eccentricity and elongation) from two segmentors were selected for classification. In the other three subsets, all features had the same IG value, so we kept all 42 features for classification.

For each subset, we applied three single classifiers (Decision Tree, SVM and K-NN) to find the optimal classifier. Table 3 shows the classification accuracies. Column (a), Overall Accuracy, gives the average accuracies weighted by the proportion of instances in each subset. Column (b) gives the classification accuracies without dataset partitioning. The ensemble learning, measured by its weighted classification accuracy, performed better than the best single classifier (SVM) with no data partitioning (72% vs. 56%), and the difference was statistically significant (p < 0.05).
Note that the overall accuracy is largely pulled down by the poor performance on the Old age - small ROI subset, whereas the other three groups achieved considerably better classification accuracies. This result is similar to our previous mass BI-RADS feature study [23], where the Old age - regular shape subset had the worst classification performance. In our future work, we plan to investigate the data distribution and the classifications made by the classifiers for the Old age - small ROI size subset.

Table 3. Classification Accuracies of Datasets Partitioned by Age and ROI Size

Subsets                   Young Age     Young Age       Old Age       Old Age       Overall Accuracy     Accuracy
                          Small ROI     Large ROI       Small ROI     Large ROI     With Partition (a)   No Partition (b)
Segmentor                 γ = 1, 2, 5   γ = 1, 2        γ = 1, 2, 5   γ = 1, 2, 5   -                    γ = 1, 2, 5
Number of Features        3x14 shape    2x3 shape       3x14 shape    3x14 shape    -                    3x14 shape
Classifier / Instances    184           87              172           100           543                  543
Decision Tree             73%           74%             54%           78%           68%                  55%
SVM                       76%           64%             62%           83%           71%                  56%
K-NN (k=5)                71%           66%             53%           81%           66%                  51%
K-NN (k=15)               76%           64%             53%           82%           68%                  51%
The Optimal Classifiers   SVM, K-NN     Decision Tree   SVM           SVM           -                    SVM
The Best Accuracy         76%           74%             62%           83%           72%*                 56%

* Weighted accuracy computed from the best classifiers.
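The weighted overall accuracy in column (a) can be reproduced from the subset sizes and the per-subset best accuracies in Table 3. The short sketch below is only a worked check of that arithmetic; the variable and function names are hypothetical, and because the table entries are rounded to whole percentages, the result can differ slightly from a value computed on the unrounded accuracies.

```python
# Subset sizes and best per-subset accuracies, as listed in Table 3.
subset_sizes = {
    "young_small": 184,
    "young_large": 87,
    "old_small": 172,
    "old_large": 100,
}
best_accuracy = {
    "young_small": 0.76,
    "young_large": 0.74,
    "old_small": 0.62,
    "old_large": 0.83,
}


def weighted_accuracy(sizes, accuracies):
    """Average the subset accuracies, weighted by the number of instances."""
    total = sum(sizes.values())
    return sum(sizes[name] * accuracies[name] for name in sizes) / total


overall = weighted_accuracy(subset_sizes, best_accuracy)
```

From the rounded table entries, `overall` comes out at about 0.725, in line with the reported 72% and well above the 56% obtained by the best single classifier without partitioning.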

In this study, the classification accuracies were used to measure the performance of the ensemble learning system. Sensitivity measures how reliably a system makes positive (malignant) identifications, and specificity measures how well a system can make negative (benign) identifications. For the ensemble learning, after data partitioning, some subsets became unbalanced (for example, in the Old age - large ROI subset, the majority of training instances are malignant). An unbalanced dataset could lead to poor specificity or sensitivity of classification. In our future work, we will perform data balancing for each subset to improve sensitivities and specificities. In addition to classification accuracy, we also plan to use ROC (receiver operating characteristic) analysis to evaluate the ensemble learning system.

5 Conclusions and Future Work

In our proposed CADx system for classifying masses in mammograms, multiple weak segmentors are built. From the segmented mass contours, we construct the mass shape feature sets. Then, we build an ensemble learning system, which partitions the whole dataset into four categories by patient age and ROI size. In this study, the ensemble system achieved 72% overall accuracy. These preliminary results show that our ensemble learning system improved the overall diagnostic accuracy for classifying masses in mammograms from 56% to 72%.

In this experiment, we found that masses in the Old age - small ROI size subset have much lower classification accuracies than the other groups. In our future work, we need to further investigate the segmentations and features in this group to improve the classification.

Currently, only 66% to 77% of ROI images were successfully segmented by each of the weak segmentors. In our future work, we will investigate other segmentation methods, such as region-based segmentation, as alternatives for those ROI images which could not be successfully segmented by our edge-based segmentation.
In this study, we only computed mass shape features for classification. In our future work, for the proposed CADx system, we plan to add mass texture and spiculation features. We also plan to investigate a different ensemble learning model to generate the final diagnosis for a suspicious mass.

References

1. National Cancer Institute (2010) American Cancer Society Cancer Facts & Figures 2010. http://www.cancer.org.
2. Winchester DJ, Winchester DP, Hudis CA, Norton L (2007) Breast Cancer (Second Edition). Springer, New York
3. Cheng HD, Shi XJ, Min R, Hu LM, Cai XP et al (2006) Approaches for Automated Detection and Classification of Masses in Mammograms. Pattern Recognition. doi:10.1016/j.patcog.2005.07.006
4. Jiang L, Song E, Xu X, Ma G, Zhang B (2008) Automated Detection of Breast Mass Spiculation Levels and Evaluation of Scheme Performance. Acad Radiol. doi:10.1016/j.acra.2008.07.015
5. Kupinski MA, Giger ML (1998) Automated Seeded Lesion Segmentation on Digital Mammograms. IEEE Transactions on Medical Imaging. doi:10.1109/42.730396
6. Mencattini A, Rabottino G, Salmeri M, Lojacono R, Colini E (2008) Breast Mass Segmentation in Mammographic Image by an Effective Region Growing Algorithm. Advanced Concepts for Intelligent Vision Systems Conference. doi:10.1007/978-3-540-88458-3_86
7. Xu W, Xia S, Xiao M, Duan H (2005) A Model-based Algorithm for Mass Segmentation in Mammograms. Engineering in Medicine and Biology 27th Annual Conference. doi:10.1109/IEMBS.2005.1616987
8. Yuan Y, Giger ML, Li H, Suzuki K, Sennett C (2007) A Dual-stage Method for Lesion Segmentation on Digital Mammograms. Med Phys 34:4180-4193. doi:10.1118/1.2790837
9. Byrd K, Zeng J, Chouikha M (2005) Performance Assessment of Mammography Image Segmentation Algorithms. 34th Applied Imagery and Pattern Recognition Workshop. 152-157
10. Zhang Y, Tomuro N, Furst JD, Raicu DS (2010) Image Enhancement and Edge-based Mass Segmentation in Mammogram. 2010 SPIE Medical Imaging Conference. doi:10.1117/12.844492
11. Zhang Y, Tomuro N, Furst JD, Raicu DS (2011) Multiple Weak Segmentors for Strong Mass Segmentation in Mammogram. 2011 SPIE Medical Imaging Conference. doi:10.1117/12.877450
12. Domínguez AR, Nandi AK (2009) Toward Breast Cancer Diagnosis Based on Automated Segmentation of Masses in Mammograms. Pattern Recognition. doi:10.1016/j.patcog.2008.08.006
13. Delogu P, Fantacci M, Kasae P, Retico A (2007) Characterization of Mammographic Masses Using a Gradient-based Segmentation Algorithm and a Neural Classifier. Computers in Biology and Medicine. doi:10.1016/j.compbiomed.2007.01.009
14. Sampat MP, Markey MK, Bovik AC (2005) Computer-Aided Detection and Diagnosis in Mammography. Elsevier Academic Press
15. Ghosh R, Ghosh M, Yearwood J (2004) A Modular Framework for Multicategory Feature Selection in Digital Mammography. European Symposium on Artificial Neural Networks. 175-180
16. Zhang P, Verma B, Kumar K (2005) Neural vs. Statistical Classifier in Conjunction with Genetic Algorithm Based Feature Selection. Pattern Recognition Letters. doi:10.1016/j.patrec.2004.09.053
17. Opitz D, Maclin R (1999) Popular Ensemble Methods: An Empirical Study. Journal of Artificial Intelligence Research. doi:10.1613/jair.614
18. Dzeroski S, Zenko B (2004) Is Combining Classifiers with Stacking Better than Selecting the Best One? Machine Learning. doi:10.1023/B:MACH.0000015881.36452.6e
19. Breiman L (1996) Bagging Predictors. Machine Learning. 24:123-140
20. Wolpert DH (1992) Stacked Generalization. Neural Networks. 5:241-259
21. Ting K, Witten I (1999) Issues in Stacked Generalization. Journal of Artificial Intelligence Research. 10:271-289
22. Freund Y, Schapire R (1996) Experiments with a New Boosting Algorithm. Proceedings of the 13th International Conference on Machine Learning. 148-156
23. Zhang Y, Tomuro N, Furst JD, Raicu DS (2009) Using BI-RADS Descriptors and Ensemble Learning for Classifying Masses in Mammograms. Medical Content-based Retrieval for Clinical Decision Support (MCR-CDS). doi:10.1007/978-3-642-11769-5_7
24. Heath M, Bowyer K, Kopans D, Moore R, Kegelmeyer WP (2001) The Digital Database for Screening Mammography. Proceedings of the 5th International Workshop on Digital Mammography. 212-218
25. Choras R (2008) Shape and Texture Feature Extraction for Retrieval Mammogram in Databases. Information Tech. Biomedicine. 47:121-128
26. Quinlan R (1993) C4.5: Programs for Machine Learning. Morgan Kaufmann Publishers, San Mateo, CA
27. Witten I, Frank E (2005) Data Mining: Practical Machine Learning Tools and Techniques (2nd edition). Morgan Kaufmann