International Journal of Biomedical Signal Processing, (1), 2011, pp. 43-47

CLASSIFYING MAMMOGRAPHIC MASSES INTO BI-RADS SHAPE CATEGORIES USING VARIOUS GEOMETRIC SHAPE AND MARGIN FEATURES

B. Surendiran and A. Vadivel
Department of Computer Applications, National Institute of Technology, Trichy, India
E-mail: surendiran@gmail.com, vadi@nitt.edu

Abstract: According to the Breast Imaging Reporting and Data System (BI-RADS), benign and malignant masses can be differentiated by their shape, size and density, which is how radiologists read mammograms. According to the BI-RADS mass shape characteristics, benign masses tend to be round, oval or lobular in shape, while malignant masses are lobular or irregular. Measuring regular and irregular shapes mathematically is difficult, since no single measure differentiates all shapes. In this paper, each mass is described by a shape feature vector consisting of 17 shape and margin features. The masses are classified into 4 categories: round, oval, lobular or irregular. Classifying masses into these 4 categories is considerably harder than classifying them as benign vs. malignant or normal vs. abnormal, since only shape characteristics can be used to discriminate the 4 categories. Experiments have been conducted on standard images from the Digital Database for Screening Mammography (DDSM), and the masses are classified using a Neural Network (NN) classifier. A total of 4 DDSM database mammograms are considered for the experiment.

Keywords: Digital Mammogram; Geometrical Shape and Margin features; Regular and Irregular masses; BI-RADS Categories; Computer Aided Diagnosis.

I. INTRODUCTION

Breast cancer is one of the most common causes of cancer death among women. The most popular diagnostic tool for identifying breast cancer is the mammogram, an X-ray of the breast that can show the presence of abnormal cell growth in the breast.
Breast cancer death rates have been dropping steadily since 1990, owing to earlier detection and better treatments [1]. Although various methods have been proposed for early detection and screening of breast cancer, mammography is considered one of the most effective. Computer Aided Detection (CAD) systems have been developed to aid radiologists in diagnosing cancer from digital mammograms. Several studies have reported that CAD improves the breast cancer diagnostic accuracy rate by 14.% [2].

In India, breast cancer accounts for 3% of all female cancers, followed by cervical cancer (17.5%), in metropolitan cities such as Mumbai, Calcutta and Bangalore. According to the Indian Council of Medical Research (ICMR), one in women in India is likely to suffer from breast cancer during her lifetime. Although the incidence is lower in India than in developed countries, the burden of breast cancer in India is alarming. According to B. Niranjan Naik of the cancer specialty Dharamshila hospital, almost 75,000 new cases of breast cancer are detected in India every year, and 0-5% of cancer cases in Indian women are breast cancer. Early detection is the key to fighting breast cancer. Earlier, breast cancer was found mostly in women aged above 45, but with changing times and lifestyles, younger women are also at risk. It is estimated that around 50,000-70,000 women in India die every year due to breast cancer.

Literature Survey

According to the BI-RADS system, masses can be characterized by shape, size, margins (borders) and density [3]. Benign masses are round or oval in shape and have smooth, circumscribed margins. Malignant masses have an irregular shape and ill-defined, microlobulated or spiculated margins. It has been observed that shape and margin characteristics can be used effectively for classifying masses as benign or malignant. Some of the known approaches that classify abnormalities according to the BI-RADS system using shape and margin features have given accurate results [4, 5].

Researchers have proposed and used various features for classifying the masses in mammograms. Statistical measures such as variance, uniformity, smoothness and third moment [6, 7, 8, 9] have been used for classifying masses as benign vs. malignant or normal vs. abnormal. These statistical measures are based on the gray values or the histogram of the mammogram masses. However, the gray values of a mammogram tend to change because of over-enhancement or the presence of noise.

Most of the recent works have concentrated on classifying masses using shape features [10, 11, 12]. In [10], Flores et al. used various geometric shape features for classifying regions as normal vs. calcifications and obtained better results. In [11], Sun et al. used multi-resolution features for classifying a region as normal or abnormal. In [12], Rojas et al. used various shape features for classifying masses as benign or malignant and found them effective in discriminating between the two. In [13], the Beamlet transform is used to classify masses as oval, round, lobular or irregular, but achieves a classification rate of only 46%.

This work is organized as follows. In Section II, we present techniques for feature extraction using shape properties. The experimental results, along with comparisons to existing methods, follow, and the final section concludes the paper.

II. MASS SHAPE FEATURE EXTRACTION

A. Mass Shape Characteristics

According to the BI-RADS mass shape characteristics, benign masses tend to be round, oval or lobular in shape, while malignant masses are lobular or irregular. Measuring regular and irregular shapes mathematically is difficult, since no single measure differentiates all shapes. The masses are classified into 4 categories: round, oval, lobular or irregular. Classifying masses into these 4 categories is considerably harder than classifying them as benign vs. malignant, since only shape characteristics can be used to discriminate the 4 categories.

Figure 1: Shape Characteristics of Masses

Fig. 1 shows the mammogram mass shapes specified by the BI-RADS system. Benign masses have round or oval shapes with circumscribed margins; they are circular and convex in shape. Malignant masses have irregular, non-convex shapes with ill-defined, microlobulated or spiculated margins.

B. Shape Properties

For the experiments we have considered mammograms from the DDSM database [14]. The ground truth available with each mammogram is used for measuring the classification rate. A total of 4 malignant masses are considered, out of which 11 are round, 8 are oval, 55 are lobular and 130 are irregular masses. Fig. 2 shows sample benign and malignant masses extracted from DDSM database mammograms. In Table I, we present the 19 shape properties used for extracting features from the mammogram masses [15, 16]. For each ROI, the shape-based properties are extracted and assembled into a feature vector. The shape feature vector is $SFV = \{mammogram, p_t, mass\_type\}$, where $t = 1, \dots, 19$ and $p_t$ is the $t$-th shape property. The mass_type is either B or M, denoting Benign and Malignant respectively.
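As a rough illustration of how the basic measurements behind such a feature vector might be obtained, the sketch below computes area, perimeter, maximum radius and minimum radius from a segmented mass and packs them into an SFV-like record. It is not the authors' implementation. The assumptions are: the mass is given as a binary NumPy mask, an edge pixel is a mass pixel with at least one 4-connected background neighbour, the "center" is the centroid of the mass, and the function and key names (`basic_shape_properties`, `shape_feature_vector`) are hypothetical.

```python
import numpy as np

def basic_shape_properties(mask):
    """Area, perimeter, max radius and min radius of a binary mass mask.

    Assumptions (not from the paper): edge pixels are mass pixels with at
    least one 4-connected background neighbour; the centre is the centroid.
    """
    mask = np.asarray(mask, dtype=bool)
    # Pad with background so masses touching the image border are handled.
    padded = np.pad(mask, 1, constant_values=False)
    # Interior pixels have all four 4-connected neighbours inside the mass.
    interior = (padded[1:-1, 1:-1]
                & padded[:-2, 1:-1] & padded[2:, 1:-1]
                & padded[1:-1, :-2] & padded[1:-1, 2:])
    edge = mask & ~interior

    rows, cols = np.nonzero(mask)
    centre_row, centre_col = rows.mean(), cols.mean()
    edge_rows, edge_cols = np.nonzero(edge)
    distances = np.hypot(edge_rows - centre_row, edge_cols - centre_col)

    return {
        "area": int(mask.sum()),        # total pixels in the mass
        "perimeter": int(edge.sum()),   # total pixels on the edge
        "r_max": float(distances.max()),
        "r_min": float(distances.min()),
    }

def shape_feature_vector(mammogram_id, properties, mass_type):
    """Pack the properties into an SFV-like record (hypothetical layout)."""
    if mass_type not in ("B", "M"):
        raise ValueError("mass_type must be 'B' (benign) or 'M' (malignant)")
    return {"mammogram": mammogram_id,
            "properties": dict(properties),
            "mass_type": mass_type}

# Toy example: a 5x5 square "mass" inside a 7x7 image.
toy_mask = np.zeros((7, 7), dtype=bool)
toy_mask[1:6, 1:6] = True
toy_props = basic_shape_properties(toy_mask)
toy_sfv = shape_feature_vector("toy_case", toy_props, "B")
```

A real pipeline would take the mask from the DDSM ground-truth outline; the toy square is only there to make the sketch self-checking.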
Table I: Shape Properties

| S.no | Shape Feature | Expression | Impact |
|---|---|---|---|
| 1 | Area (A) | Total pixels in mass | The number of pixels in the mass. |
| 2 | Perimeter (P) | Total pixels in edge border of mass | The number of pixels in the boundary of the mass. To differentiate regular masses from irregular masses. |
| 3 | Max Radius (Rmax) | Maximum distance between center and edge of the mass | To differentiate regular masses from irregular masses. For irregular polygon shapes, the max radius will be high compared to regular masses with the same area. |
| 4 | Min Radius (Rmin) | Minimum distance between center and edge of the mass | Minimum distance between the center of the mass and its boundary. To differentiate regular masses from irregular masses. |
| 5 | Euler Number (EULN) | Number of connected components $-$ Number of holes | The Euler number is the difference between the number of disjoint regions and the number of holes in an image. Around 0 for round/oval masses. Varying values for irregular shapes. |
| 6 | Modified Eccentricity (Ect) | $1 - \frac{MinRadius}{MaxRadius}$ | Eccentricity is the measure of elongation of a mass. For identifying elliptical characteristics of a mass. |
| 7 | Equivdiameter (Eqd) | $\sqrt{\frac{4 \cdot Area}{\pi}}$ | Computes the diameter of the circle that has the same area as the region. For differentiating round/oval masses from irregular/lobular masses. |
| 8 | Modified Elongatedness (En) | $\frac{Area}{(2 \cdot MaxRadius)^2}$ | It is the ratio of the minimum dimension to the maximum dimension of the rectangle. Max value 1 for a square. Useful for differentiating regular oval masses from irregular masses. |
| 9 | Entropy (Entpy) | $-\sum p \cdot \log_2(p)$ | A measure of randomness. Measures the amount of disorder. Measures the amount of spiculation present in the mass. Useful for identifying irregular masses. |
| 10 | Modified Circularity1 (C1) | $\frac{Area}{\pi \cdot MaxRadius^2}$ | Circularity ratio represents how similar a mass is to a circle. Useful for differentiating circular/oval masses from irregular masses. |
| 11 | Modified Circularity (C) | $\frac{MinRadius}{MaxRadius}$ | Circularity ratio represents how similar a mass is to an ellipse. Useful for differentiating circular/oval masses from irregular masses. |
| 12 | Compactness (CN) | $\frac{2\sqrt{Area \cdot \pi}}{Perimeter}$ | Degree of deviation of the mass from a perfect circle. Compactness is independent of linear transformations. Measures the degree of roughness of a region. |
| 13 | Modified Dispersion (Dp) | $\frac{MaxRadius}{Area}$ | Dispersion measures the irregularity of the mass. For identifying irregular shape characteristics. |
| 14 | Thinness Ratio (TR) | $\frac{4 \pi \cdot Area}{Perimeter^2}$ | Used to distinguish a circle from other masses. This measure differentiates a circle from other shapes. |
| 15 | Std Deviation of mass (SD) | $\sqrt{\frac{1}{n}\sum_{i=1}^{n}(x_i - \bar{x})^2}$ | A measure of average contrast of the mass. A measure of the dispersion of a set of data from its mean. The more spread apart the data, the higher the deviation. |
| 16 | Standard Deviation of Edge (Esd) | $\sqrt{\frac{1}{m}\sum_{i=1}^{m}(y_i - \bar{y})^2}$ | A measure of average contrast of the mass boundary. A measure of the dispersion of a set of data from its mean. The more spread apart the data, the higher the deviation. |
| 17 | Modified Shape Index (SI) | $\frac{Perimeter}{2 \cdot MaxRadius}$ | Relates to surface curvature. This property provides information about margin characteristics. For differentiating skinny masses from regular masses. |

$p$ is the histogram counts, $n$ is the number of pixels in the mass, $\bar{x}$ is the mean of the mass, $\bar{y}$ is the mean of the edge, and $m$ is the number of pixels in the edge.
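The ratio-type features in Table I can all be derived from four basic measurements (area, perimeter, max radius, min radius). The sketch below is one possible reading of those expressions, not the authors' code. It assumes the usual textbook forms where the printed expressions are ambiguous: a square root in the equivalent diameter and compactness, squared terms in elongatedness, circularity and thinness ratio, and a modified eccentricity of 1 - Rmin/Rmax. For entropy it assumes the histogram counts are first normalised to probabilities. All function names are hypothetical.

```python
import math

def derived_shape_features(area, perimeter, r_max, r_min):
    """Ratio features of Table I from four basic measurements (assumed forms)."""
    return {
        "Ect": 1.0 - r_min / r_max,                     # assumed reading
        "Eqd": math.sqrt(4.0 * area / math.pi),         # equivalent diameter
        "En": area / (2.0 * r_max) ** 2,                # elongatedness
        "C1": area / (math.pi * r_max ** 2),            # similarity to a circle
        "C": r_min / r_max,                             # similarity to an ellipse
        "CN": 2.0 * math.sqrt(area * math.pi) / perimeter,  # compactness
        "Dp": r_max / area,                             # dispersion
        "TR": 4.0 * math.pi * area / perimeter ** 2,    # thinness ratio
        "SI": perimeter / (2.0 * r_max),                # shape index
    }

def histogram_entropy(counts):
    """Entropy in bits; counts are normalised to probabilities (assumption)."""
    total = float(sum(counts))
    probabilities = [c / total for c in counts if c > 0]
    return -sum(p * math.log2(p) for p in probabilities)

def population_std(values):
    """Standard deviation with a 1/n normalisation, as in rows 15 and 16."""
    n = len(values)
    mean = sum(values) / n
    return math.sqrt(sum((v - mean) ** 2 for v in values) / n)

# Sanity check on an ideal circle of radius 10 (continuous geometry).
circle = derived_shape_features(area=math.pi * 10 ** 2,
                                perimeter=2 * math.pi * 10,
                                r_max=10.0, r_min=10.0)
```

For an ideal circle, C1, C, CN and TR all evaluate to 1 and Ect to 0, which matches the role these features play in separating round masses from irregular ones. Under these forms TR is the square of CN, so the two carry the same ranking information on any single mass.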
Figure 2: Sample Masses

Properties such as circularity, convex area, thinness ratio, equiv-diameter, eccentricity and compactness are used to measure shape characteristics. Similarly, for margin characteristics, properties such as entropy, shape index and standard deviation of edge are used.

C. Shape Property Values

Table II shows a few of the 19 extracted shape property values for oval, round, lobular and irregular masses.

Table II: Sample Feature Values

| Type | Mass | Area | Eqd | C | CN | TR | SD |
|---|---|---|---|---|---|---|---|
| Irregular | Irr1 | 1598 | 45.1 | 0.79 | 1.13 | 1.8 | 8.0 |
| Irregular | Irr2 | 1480 | 43.4 | 0.76 | 1.0 | 1.43 | 7. |
| Irregular | Irr3 | 1745 | 47.1 | 0.64 | 0.8 | 0.68 | 8.9 |
| Irregular | Irr4 | 133 | 5.1 | 0.66 | 0.78 | 0.60 | 8.7 |
| Lobular | LOB1 | 43 | 55.6 | 0.53 | 0.39 | 0.15 | 6.1 |
| Lobular | LOB2 | 5716 | 85.3 | 0.58 | 0.71 | 0.50 | 8.6 |
| Lobular | LOB3 | 1948 | 49.8 | 0.6 | 0.46 | 0.1 | 6.3 |
| Lobular | LOB4 | 3916 | 70.6 | 0.17 | 0.9 | 0.08 | 7.6 |
| Oval | OV1 | 57 | 7.0 | 0.68 | 1.15 | 1.31 | 5.6 |
| Oval | OV2 | 597 | 7.6 | 0.6 | 0.66 | 0.44 | 5.4 |
| Oval | OV3 | 940 | 34.6 | 0.7 | 0.39 | 0.16 | 4.7 |
| Oval | OV4 | 1596 | 45.1 | 0.55 | 0.63 | 0.40 | 7.1 |
| Round | RO1 | 1157 | 38.4 | 0.48 | 0.87 | 0.75 | 6.3 |
| Round | RO2 | 978 | 35.3 | 0.3 | 0.40 | 0.16 | 3.9 |
| Round | RO3 | 115 | 38.3 | 0.19 | 0.50 | 0.5 | 5.7 |
| Round | RO4 | 001 | 50.5 | 0.66 | 0.60 | 0.36 | 6.1 |

III. EXPERIMENTAL RESULTS

We used the SPSS PASW Modeler package to apply the neural network classifier [17]. The shape property vector, together with mass_type, is given as input to the NN classifier. The experimental results for classifying the masses as oval, round, lobular or irregular are shown in Table III and in Fig. 3. Here O, R, L and I represent oval, round, lobular and irregular mass lesion types.

Table III: Neural Network - Variable Importance

| Mass type | Important Features | Accuracy |
|---|---|---|
| I, O | SD, Rmin, C, Euln, TR, C1 | 99.16% |
| O, R | ECT, CN, Esd, C, SD, SI, TR | 98.3% |
| L, O | C, Rmin, SD, ECT, En, C1 | 99.1% |
| L, R | C, SD, Esd, CN, ECT, Dp, Eqd | 98.31% |
| I, L, R | Rmin, ECT, Peri, SI, C, CN | 99.16% |
| L, O, R | C, SD, CN, TR, Rmin, ECT | 98.3% |
| I, L, O, R | Rmin, SI, C, SD, Euln, Peri | 97.48% |

From Table III, it can be observed that modified features such as C1, C, Dp, SI, CN, ECT and SD, and basic shape features such as Perimeter and Rmin, are vital compared to the other features.

Figure 3: Classification result using Neural Network

IV. CONCLUSION

The proposed geometric shape and margin features based on min radius and max radius are found to be effective in classifying masses as round, oval, lobular and irregular. The accuracy obtained shows that these 17 features have high discriminating power for separating regular shapes from irregular shapes. As future work, we will try to construct a fuzzy membership function using these geometric shape and margin features.
Table IV: Performance Comparison

| Method | Features | Accuracy |
|---|---|---|
| Beamlet Transform [13] | 11 Beamlet features | ILOR: 46%; RO: 76%; RL: 7% |
| Proposed Approach | 19 Shape and Margin features | ILOR: 97.48%; RO: 98.3%; RL: 98.31% |

ACKNOWLEDGMENT

The work done by Dr. A. Vadivel is supported by a research grant from the Department of Science and Technology, India, under Grant SR/FTP/ETA-46/07 dated 5th October, 2007.

REFERENCES

[1] American Cancer Society (ACS): Learn about breast cancer, available at http://www.cancer.org (accessed May 2010).
[2] American Journal of Roentgenology (AJR): Computer-aided Detection Improves Early Breast Cancer Identification, http://www.ajronline.org (accessed Feb 5 2010).
[3] American College of Radiology (ACR): Breast Imaging Reporting and Data System (BI-RADS), 3rd ed. Reston, Va: American College of Radiology (1998).
[4] Markey M. K., Lo J. Y., Tourassi G. D., Floyd C. E.: Cluster analysis of BI-RADS descriptions of biopsy-proven breast lesions, In: Medical Imaging: Image Processing, Proceedings of SPIE, Vol. 4684, pp. 363-370 (2002).
[5] Mehul P. Sampat, Alan C. Bovik, Mia K. Markey: Classification of mammographic lesions into BI-RADS shape categories using the Beamlet Transform, Medical Imaging: Image Processing, Proc. of the SPIE, Vol. 5747, pp.16-5 (2005).
[6] Vibha L., Harshavardhan G. M., Pranaw K., Deepa Shenoy P., Venugopal K. R., Patnaik L. M.: Classification of Mammograms Using Decision Trees, In: 10th International Database Engineering and Applications Symposium (IDEAS 06), pp. 63-66, IEEE (2006).
[7] Sheshadri H. S., Kandaswamy A.: Breast Tissue Classification Using Statistical Feature Extraction of Mammograms, Medical Imaging and Information Sciences, 3(3), pp. 105-107 (2006).
[8] Xinsheng Zhang: Boosting Twin Support Vector Machine Approach for MCs Detection, Asia-Pacific Conference on Information Processing, pp. 149-15 (2009).
[9] Xinsheng Zhang, Xinbo Gao, Ying Wang: MCs Detection with Combined Image Properties and Twin Support Vector Machines, JCP 4(3): 15-1 (2009).
[10] Beatriz A. Flores, Jesus A. Gonzalez: Data Mining with Decision Trees and Neural Networks for Calcification Detection in Mammograms, In: Third Mexican International Conference on Artificial Intelligence, Proceedings, LNCS, Springer, pp. 3-41 (2004).
[11] Sun Y., Babbs C., Delp E.: Normal Mammogram Classification Based on Regional Analysis, In: Proceedings of the IEEE Midwest Symposium on Circuits and Systems, Vol. 45, pp. 375-378 (2002).
[12] A. Rojas and A. K. Nandi: Detection of masses in mammograms via statistically based enhancement, multilevel-thresholding segmentation, and region selection, Computerized Medical Imaging and Graphics, Volume 3, Issue 4, June 2008, pp. 304-315, ISSN 0895-6111.
[13] Sampat M. P., Bovik A. C., Markey M. K.: Classification of mammographic lesions into BI-RADS shape categories using the Beamlet Transform, Medical Imaging: Image Processing, Proc. of the SPIE, Vol. 5747, pp.16-5 (2005).
[14] Chris Rose, Daniele Turi, Alan Williams, Katy Wolstencroft, Chris J. Taylor: Web Services for the DDSM and Digital Mammography Research, pp. 376-383 (2003).
[15] Surendiran B., Sundaraiah Y., Vadivel A.: Classifying Digital Mammogram Masses using Univariate ANOVA Discriminant Analysis, International Conference on Advances in Recent Technologies in Communication and Computing (ARTCom 2009), IEEE Xplore, Oct 2009. (Best Paper Award).
[16] B. Surendiran, A. Vadivel: Mammogram Mass Classification using Various Geometric Shape and Margin Features for Early Detection of Breast Cancer, International Journal of Medical Engineering and Informatics, Inderscience, 2011. (Accepted).
[17] SPSS: SPSS Data Mining, Statistical Analysis Software, Predictive Analysis, Predictive Analytics, Decision Support Systems, http://www.spss.co.in/