Model-Based Detection of Spiculated Lesions in Mammograms


Medical Image Analysis (1998) volume 3, number ?, pp 1-23
© Oxford University Press

R. Zwiggelaar (1), T.C. Parr (1), J.E. Schumm (1), I.W. Hutt (1), C.J. Taylor (1), S.M. Astley (1) and C.R.M. Boggis (2)

(1) Wolfson Image Analysis Unit, University of Manchester, Oxford Road, Manchester M13 9PT, UK
(2) Greater Manchester Breast Screening Service, Withington Hospital, Manchester M13 9PT, UK

Abstract

Computer-aided mammographic prompting systems require the reliable detection of a variety of signs of cancer. In this paper we concentrate on the detection of spiculated lesions in mammograms. A spiculated lesion is typically characterised by an abnormal pattern of linear structures and a central mass. Statistical models have been developed to describe and detect both these aspects of spiculated lesions. We describe a generic method of representing patterns of linear structures, which relies on the use of factor analysis to separate the systematic and random aspects of a class of patterns. We model the appearance of central masses using local scale-orientation signatures based on recursive median filtering, approximated using principal component analysis. For lesions of 16 mm and larger the pattern detection technique results in a sensitivity of 80% at .14 false positives per image, whilst the mass detection approach results in a sensitivity of 80% at .23 false positives per image. Simple combination techniques result in an improved sensitivity and specificity close to that required to improve the performance of a radiologist in a prompting environment.

Keywords: mammogram, spiculated lesions, oriented line patterns, central mass detection

Received ?; revised ?; accepted ?

1. INTRODUCTION

Over the past decade, mammographic screening programmes have been established in many countries around the world. The UK Breast Screening Programme alone generates 1.5 million mammograms per annum.
Potential malignancies can be detected from subtle abnormalities in radiographic appearance, but it is known that radiologists fail to detect a significant proportion of these abnormalities (Dean, 1996). It has been shown that their performance would improve if they were prompted with the locations of possible abnormalities (Chan et al., 1990; Astley et al., 1993; Kegelmeyer et al., 1994; Hutt, 1996). Computer-aided prompting systems require the reliable detection of a variety of mammographic signs of cancer. In this study we concentrate on the detection of spiculated lesions, which are often characterised by an abnormal pattern of radiating linear structures (spicules) and a central mass. An example of a spiculated lesion is shown in Figure 1; both the spicules and the central mass have been annotated by an experienced radiologist.

(Current address: Division of Computer Science, University of Portsmouth, Mercantile House, Hampshire Terrace, Portsmouth PO1 2EG, UK; reyer@sis.port.ac.uk)

To detect both aspects of spiculated lesions our approach comprises two parts: the detection of the abnormal pattern of linear structures and the detection of the associated central mass. Both techniques can be used on their own for the detection of spiculated lesions, but better results might be expected when the two approaches are combined. In addition, the differing target features of the techniques facilitate the detection of lesions in the absence of either feature; a lesion can still be detected when either the central mass or the abnormal pattern is absent. Thus, the techniques complement each other. They may also be useful for the detection of other mammographic signs of cancer such as architectural distortions, which are effectively abnormal patterns of lines without a central mass, and other focal lesions, which have no associated abnormal pattern of

linear structures.

Figure 1. Typical example of a mammogram containing a spiculated lesion. The spicules and the associated central mass are annotated in the right image.

We have tackled both detection tasks using a model-based approach. It is difficult to define, ab initio, precisely how changes in appearance relate to the probability of a clinically significant abnormality. We have thus chosen, in both cases, to adopt an approach based on choosing a complete but uncommitted representation for the relevant aspect of appearance, coupled with statistically based learning.

To detect abnormal patterns of linear structures we have built a statistical model to give a compact representation of arbitrarily complex orientation fields. The statistical model is based on factor analysis, which is used to separate the pattern information from the image noise. The model parameters are used to classify regions as normal or abnormal on the basis of training data.

Similarly, we have developed a method for describing the local grey-level structure, using directional recursive median filtering to obtain a scale-orientation signature at each pixel. To detect abnormal masses we have built a statistical model, based on principal component analysis, capable of representing arbitrary scale-orientation signatures. The model parameters are used to classify pixels as normal or abnormal on the basis of training data.

The techniques for the detection of the abnormal pattern of linear structures and the central mass are both generic and could be used to detect other types of patterns and blobs in images. The novel use of factor analysis to separate image structure and noise may also be of more general application. Results for each approach are compared with the most promising published results (Kegelmeyer et al., 1994; Karssemeijer, 1995; Miller and Ramsey, 1996; Petrick et al., 1996; Zouras et al., 1996).

The general layout of the paper is as follows.
In section 2 we discuss the accuracy required for a mammographic prompting system to improve the performance of a radiologist and propose an accuracy measure, based on experimental evidence, that is particularly relevant to data sets in which the proportion of abnormalities is higher than the screening average. We also describe, briefly, how different detection techniques can be compared through the use of receiver operating characteristics (Metz, 1996). The statistical methods on which both approaches are based are discussed briefly in section 3, whilst sections 4 and 5 give in-depth descriptions of the methods developed for the detection of the abnormal patterns of linear structures and central masses respectively, together with the results obtained. The outputs of the two techniques can be combined to achieve detection performance which is better than either method on its own; this is discussed in section 6. General discussion and concluding remarks can be found in sections 7 and 8.

2. PROMPTING IN MAMMOGRAPHY

The effectiveness of mammographic screening can be improved by drawing the position of potential abnormalities

to the attention of the radiologist (prompting) (Astley et al., 1993). In a prompting environment two types of error are possible: false positive (FP) and false negative (FN) prompts. Both types of false prompts should be kept to a minimum; false negative prompts are potentially life-threatening for the patient, whilst a high number of false positive prompts may cause the radiologist to lose confidence in the prompting system and possibly impair detection performance by diverting attention away from genuine abnormalities. Although it may be desirable, we are unlikely, given the current state of the art, to create a prompting system with a sensitivity of 100% (zero FN) and perfect specificity (zero FP). We explore below the circumstances in which an imperfect prompting system can still lead to an improvement in radiologists' performance.

2.1. ROC and FROC analysis

The performance of both radiologists and computer-based detection techniques can be assessed using receiver operating characteristic (ROC) and free response operating characteristic (FROC) curves (Metz, 1996). An ROC curve (e.g. Figure 2) indicates the true positive rate (sensitivity) as a function of the false positive rate (1 − specificity). When no useful discrimination is achieved the true positive rate is always equal to the false positive rate. As the accuracy increases the ROC curve moves closer to the upper-left corner, where a higher sensitivity corresponds to a lower false positive rate. A measure commonly derived from an ROC curve is the area A_z under the curve, which is an indication of the overall sensitivity and specificity of the observer, though similar values of A_z can result from differently shaped ROC curves. Free response operating characteristic (FROC) curves plot sensitivity as a function of the number of false positive detections per image.
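As an illustration of the ROC construction described above, the following minimal sketch sweeps a threshold over a list of suspicion scores and estimates the area under the resulting curve with the trapezoidal rule. The function names and the 0/1 label encoding are hypothetical, and the trapezoidal area is only an empirical stand-in for A_z, which is often obtained from a fitted curve instead.

```python
import numpy as np

def roc_points(scores, labels):
    """ROC operating points obtained by sweeping a threshold over `scores`.

    scores: larger values mean "more suspicious".
    labels: 1 for abnormal cases, 0 for normal cases (hypothetical encoding).
    Tied scores are treated as separate thresholds in this sketch.
    """
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=int)
    ranked = labels[np.argsort(-scores, kind="stable")]  # most suspicious first
    true_pos = np.cumsum(ranked)        # detections as the threshold is lowered
    false_pos = np.cumsum(1 - ranked)   # false alarms as the threshold is lowered
    tpr = np.concatenate(([0.0], true_pos / ranked.sum()))
    fpr = np.concatenate(([0.0], false_pos / (1 - ranked).sum()))
    return fpr, tpr

def area_under_roc(fpr, tpr):
    """Trapezoidal area under the empirical ROC curve."""
    return float(np.sum(np.diff(fpr) * (tpr[1:] + tpr[:-1]) / 2.0))

if __name__ == "__main__":
    # Toy data: two abnormal and two normal cases, one of each ranked "wrongly".
    fpr, tpr = roc_points([0.9, 0.8, 0.7, 0.6], [1, 0, 1, 0])
    print("area under ROC:", area_under_roc(fpr, tpr))
```

An FROC curve can be built in the same way by replacing the false positive rate on the horizontal axis with the number of false positive detections divided by the number of images.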
Variations on FROC analysis take into account localisation, which is important in mammography since automated systems may classify large areas as lesion; an extreme example would be the classification of the whole of the breast as being a lesion. Without a measure of localisation the standard FROC curve can present an optimistic view of the accuracy of a system.

2.2. Prompting Requirements

Previous authors have demonstrated that an imperfect prompting system can still produce an improvement in radiologists' performance. It is, however, difficult to conclude from these results the minimum level of prompting performance which will still have a beneficial effect. To assess the requirements of a mammographic prompting system we performed a series of experiments (Hutt, 1996) which are summarised here.

The experiments used 1 pairs of mammograms taken during routine breast screening. Of these, 2 mammograms contained subtle malignant abnormalities of various types. Thus, 10% of the images contained an abnormality. This is an order of magnitude higher than the rate at which malignant abnormalities are typically found in breast screening in the UK (Dean, 1996) but was still believed to be low enough to represent a realistic approximation to the real screening situation. The remaining mammograms were either normals or contained benign abnormalities. The output of a prompting system was simulated: its sensitivity was fixed at 90% and three different FP/TP prompt ratios (1.0, 1.5 and 2.0) were examined. The truth data were provided by an expert radiologist with access to the patient records.

All 1 mammogram pairs were presented to each of 3 experienced radiologists from 11 UK screening centres, divided into three groups, one for each of the three prompting conditions. The mammograms were presented under standard viewing conditions with the addition of a paper hardcopy of the mammogram showing the position of any prompt.
Each radiologist took part in a prompted and an unprompted experiment. The radiologists were asked to give the location of any abnormality in the mammograms and to give an indication of the severity of the case by using a six point rating scale ranging from normal to malignant.

The detection performance of the radiologists under the different prompting conditions was investigated using ROC analysis. In Figure 2, unprompted performance is compared with the three prompting conditions. The results indicate that prompting with an FP/TP prompt ratio of 2.0 does not significantly improve the performance of radiologists (as compared with the unprompted case). An FP/TP ratio of 1.5 leads to a small but significant improvement, whilst a ratio of 1.0 results in a large improvement (in this case the value of A_z increases from 0.83 for the unprompted condition to 0.92 for the prompted condition). Hutt (1996) suggests that it is the FP/TP prompt ratio which is significant rather than the absolute number of FP prompts, and that the FP/TP prompt ratio should be smaller than 1.5.

The results are consistent with work reported by Kegelmeyer et al. (1994), who used a ratio of 0.66 resulting in an increase in sensitivity of almost 1%. The results of Chan et al. (1990) show improved performance with an FP/TP ratio as high as 9.2. However, the unprompted performance in Chan's study may have been adversely affected by a number of factors, including the use of laser printed images rather than the original films, the presence of a vinyl overlay for each image, and particularly the use of an unrealistically short time limit to force errors.

Under normal screening conditions the proportion of women recalled by radiologists in the UK is 7%, which includes both women with suspected abnormalities and recalls for technical reasons. Of all the recalls 14% have

apparent abnormalities associated with a spiculated lesion or an ill-defined mass (Nicholson et al., 1990). This corresponds to roughly 1% of the women screened (7% x 14%), or about 1 woman in 100. Assuming single view screening (two mammograms per woman), there is on average 1 abnormality per 200 mammograms associated with a spiculated lesion or an ill-defined mass. To improve radiologists' performance, Hutt's results suggest that an automated system in a screening environment should not exceed the rate of 1.5 false positive prompts per 200 mammograms, or 0.0075 FP per image.

Figure 2. ROCs (true positive fraction against false positive fraction) resulting from FP/TP prompt ratios of 2.0, 1.5 and 1.0, in comparison to the unprompted control condition (the dotted graph).

3. STATISTICAL MODELLING

In this section we review, briefly, the statistical methods which underpin the two detection algorithms. Statistical modelling is used to provide a robust description of the variability in the mammographic data. Generally our aim is to separate the systematic and noise variation in the data, leading to robust classification.

3.1. Principal Component Analysis

Principal component analysis (PCA) is a well documented statistical approach to data dimensionality reduction (Jolliffe, 1986). The principal components of a population of observation vectors are the characteristic vectors of the covariance matrix (C) constructed from the population. Projecting the data onto its principal components generally results in a compact and meaningful representation in which the first few characteristic vectors describe the major modes of data variation. The characteristic values provide the variances of the principal components. Data dimensionality reduction is achieved by ignoring those principal components which have zero or small characteristic values. Observation vectors can be approximated from a PCA model using (1).
$$x_i \approx P b_i + m \tag{1}$$

where $x_i$ is the $i$th observation vector, $m$ is the mean observation over the population, $P$ is the matrix of the most significant characteristic vectors and $b_i$ is a vector of lower dimensionality than $x_i$ (hence the approximation sign). The weights of the principal components ($b_i$) for an observation ($x_i$) can be estimated (since $P^{-1} = P^T$) by

$$b_i = P^T (x_i - m) \tag{2}$$

Observation vectors can be reconstructed by substituting (2) into (1).

3.2. Factor Analysis

Factor analysis provides a technique for separating systematic variation from specific (random) variation (Anderson, 1984). It explains the interdependence of a set of variables, in terms of the factors, without regard to the observed variability (in contrast, principal component analysis explains the observed variability). In factor analysis an observation vector is considered as being made up of a part which is peculiar to the observation (called specific or error) and a part which is a function of some fundamental variables of the population. The factor model is given in (3).

$$x_i = L f_i + U_i + m \tag{3}$$

where the observation vectors, $x_i$, are modelled by a linear combination of factor scores ($f_i$), a specific factor ($U_i$), and the average observation over the population ($m$). $L$ is a matrix of coefficients which are called factor loadings and describe the common properties of the population of observations. Since it is impossible to distinguish between measurement errors and the specific factors, the vector ($U_i$) is often simply termed the error. Comparing (1) and (3), $L$ and $f_i$ in a factor model are analogous to $P$ and $b_i$ in a PCA model (see footnote a). A factor model is trained by constructing the covariance matrix ($\Sigma$) of the population and solving (4) via maximum likelihood estimation (Johnson and Wichern, 1982) for $L$ and $\Psi$.

$$\Sigma = L L^T + \Psi \tag{4}$$

The solution assumes $f$ and $U$ are normally distributed. The error covariance matrix ($\Psi$) is positive definite and diagonal.
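Equations (1) and (2) can be sketched in a few lines. This is a minimal illustration under stated assumptions rather than the implementation used in the paper: observations are stored as rows of a NumPy array, and the function names `fit_pca`, `project` and `reconstruct` are hypothetical.

```python
import numpy as np

def fit_pca(X, n_modes):
    """Estimate the mean m, the matrix P of the n_modes most significant
    characteristic vectors (as columns) and their characteristic values.

    X has one observation vector per row.
    """
    m = X.mean(axis=0)
    C = np.cov(X - m, rowvar=False)             # covariance matrix of the population
    values, vectors = np.linalg.eigh(C)         # eigh returns ascending order
    order = np.argsort(values)[::-1][:n_modes]  # keep the largest variances
    return m, vectors[:, order], values[order]

def project(x, m, P):
    """Equation (2): weights of the principal components for observation x."""
    return P.T @ (x - m)

def reconstruct(b, m, P):
    """Equation (1): approximate the observation from its weights."""
    return P @ b + m

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Toy population that lies exactly in a 2-D subspace of a 5-D space.
    X = rng.normal(size=(50, 2)) @ rng.normal(size=(2, 5)) + 3.0
    m, P, variances = fit_pca(X, n_modes=2)
    error = np.abs(reconstruct(project(X[0], m, P), m, P) - X[0]).max()
    print("largest reconstruction error:", error)
```

Because the columns of P are orthonormal, the transpose acts as the inverse on the retained subspace, which is the property equation (2) relies on.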
The common factor scores ($f_i$) for an example observation ($x_i$) can be obtained from the weighted least squares estimator given by (5).

$$f_i = (L^T \Psi^{-1} L)^{-1} L^T \Psi^{-1} (x_i - m) \tag{5}$$

(Footnote a: note, however, that unlike $P$ the columns of $L$ are not unit vectors.)

A reconstruction of the systematic part of the observation vector can be obtained by substituting the estimated common

factor scores vector ($f_i$) into equation (3) and setting the error vector ($U_i$) to zero.

3.3. Probability Images

The objective of the work described here is to detect abnormalities, that is to classify each pixel as belonging to a normal or abnormal region. Since any method is likely to be imperfect it is useful to explore a range of compromises between false negative errors (poor sensitivity) and false positive errors (poor specificity). This can be achieved conveniently by constructing a probability image (Astley et al., 1993; Kegelmeyer et al., 1994; Karssemeijer, 1995). The starting point is an $n$-dimensional observation vector, $x_i$, for each pixel $i$, describing properties relevant to the classification task. The probability density function of the observation vectors for each class is assumed (in our experiments) to be normal. For each class, $\omega_j$ (e.g. normal or abnormal), the mean, $m_{\omega_j}$, and covariance, $C_{\omega_j}$, of the observation vectors are estimated from a training set of images in which every pixel has been annotated with the appropriate class by an expert. The probability density of obtaining an observation vector $x_i$ for a pixel of class $\omega_j$ is given by

$$p(x_i \mid \omega_j) = \frac{1}{(2\pi)^{n/2}\, |C_{\omega_j}|^{1/2}} \exp\left(-\frac{d_{ij}}{2}\right) \tag{6}$$

where $d_{ij}$, the Mahalanobis distance to the class mean, is given by

$$d_{ij} = (x_i - m_{\omega_j})^T C_{\omega_j}^{-1} (x_i - m_{\omega_j}). \tag{7}$$

Applying Bayes' theorem, a probability image for class $\omega_j$ (e.g. abnormal) is found by calculating, for each pixel,

$$p(\omega_j \mid x_i) = \frac{p(x_i \mid \omega_j)\, p(\omega_j)}{\sum_k p(x_i \mid \omega_k)\, p(\omega_k)}, \tag{8}$$

where $k$ covers all classes. Detection can be performed by thresholding the resulting probability image. Different values of the threshold will result in different compromises between true positive and false positive errors. The detection performance as the threshold is varied can be summarised conveniently in Receiver Operating Characteristic (ROC) or Free ROC (FROC) curves (Metz, 1996).

4. DETECTING ABNORMAL PATTERNS OF LINEAR STRUCTURES

Previous attempts to detect automatically the abnormal patterns of linear structures associated with spiculated lesions have generally targeted the features of known importance, such as the concurrency, angular spread and radial distance of radiating linear structures (Astley et al., 1993; Kegelmeyer et al., 1994; Karssemeijer, 1995). It is difficult to relate much of the published research to clinical practice, since error rates are generally quoted for selected datasets with a higher proportion of abnormalities than would be found in the screening population. By the same token, the level of prompting accuracy required to improve performance in a screening environment may well be very different from that required to produce an improvement in an artificial experiment.

Existing methods fail to exploit fully the available image evidence. We have therefore developed a generic representation that spans all possible patterns of oriented lines, yet remains uncommitted to a specific type of pattern. Our representation places no emphasis on the features known to be important, but clearly incorporates them. The representation is used in conjunction with factor analysis (section 3.2) to provide a compact model of the oriented line patterns which occur in mammograms. We describe below the details of the method and demonstrate, using synthetic data, the ability to model realistically complex patterns.

We have applied the technique to the detection of spiculated lesions by training models using data extracted from spiculated lesions and applying them to both normal and abnormal tissue patterns. We present results for a set of 129 mammograms containing 29 spiculated lesions, all from the Prompting Radiologists In Screening Mammography (PRISM) database (Royal Observatory, 1996).
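The classification step that these models feed into is the probability image of section 3.3. As a hedged sketch of equations (6) to (8), assuming the per-pixel observation vectors are held in a NumPy array of shape (rows, cols, n) and that class means, covariances and priors are supplied in dictionaries (all names here are hypothetical):

```python
import numpy as np

def gaussian_density(X, mean, cov):
    """Equation (6): normal density for every row of X, shape (num_pixels, n)."""
    n = X.shape[1]
    diff = X - mean
    # Equation (7): (x - m)^T C^{-1} (x - m) evaluated for every pixel at once.
    d = np.einsum("ij,ij->i", diff @ np.linalg.inv(cov), diff)
    norm = (2.0 * np.pi) ** (n / 2.0) * np.sqrt(np.linalg.det(cov))
    return np.exp(-0.5 * d) / norm

def probability_image(features, class_params, priors, target):
    """Equation (8): posterior probability of class `target` at each pixel.

    features:     (rows, cols, n) array of observation vectors.
    class_params: dict mapping class name -> (mean, covariance).
    priors:       dict mapping class name -> prior probability p(class).
    No guard against numerical underflow is included in this sketch.
    """
    rows, cols, n = features.shape
    X = features.reshape(-1, n)
    joint = {name: gaussian_density(X, mean, cov) * priors[name]
             for name, (mean, cov) in class_params.items()}
    total = sum(joint.values())
    return (joint[target] / total).reshape(rows, cols)

if __name__ == "__main__":
    # Toy two-class example with made-up class statistics.
    params = {"normal": (np.zeros(2), np.eye(2)),
              "abnormal": (np.full(2, 4.0), np.eye(2))}
    priors = {"normal": 0.5, "abnormal": 0.5}
    feats = np.array([[[0.0, 0.0], [2.0, 2.0], [4.0, 4.0]]])
    prob = probability_image(feats, params, priors, "abnormal")
    print(prob)
    print(prob > 0.5)   # detection by thresholding the probability image
```

Sweeping the threshold in the last line over the range 0 to 1 produces the family of operating points from which an ROC or FROC curve is drawn.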
The images used and details of the PRISM database are given in Appendix A.

4.1. Detecting Linear Structures

We extract linear structures from digitised mammograms by applying a multi-scale oriented line detector; this provides a line-strength, orientation and scale at each pixel. Various techniques are available for the detection of linear structures (Dixon and Taylor, 1979; Cerneaz and Brady, 1995; Karssemeijer, 1995, 1996), but a direct comparison has shown the adopted approach to be superior for our purposes (Zwiggelaar et al., 1996). Briefly, for each pixel a linear neighbourhood is tested at multiple orientations to find the orientation with the highest total grey-level. The mean background grey-level over a local neighbourhood at this orientation is subtracted from the mean grey-level along the line, to produce a measure of line strength (see Figure 3). The process is performed over the image, resulting in an orientation and line strength measure at each pixel. Scale is obtained by constructing a Gaussian pyramid (Burt, 1984) of the original image and applying the operator to each level in the pyramid. For each pixel in the original image, the pyramid is scanned to find the scale that results in the maximum value of line strength at that pixel. This scale is taken as the scale for the pixel and the

line-strength and orientation are taken at that scale. We apply our linear structure detector directly to the digitised mammogram, though it would probably be worth investigating preprocessing to normalise the data. Various approaches have been described previously and appear to give good results, including adaptive noise equalization (Karssemeijer, 1994), the h_int method (Highnam et al., 1996), and rank-order filtering (Zamperoni, 1990).

Figure 3. Operation of the oriented line detector at one scale. The orientation with maximum grey-level sum s is used to calculate the difference between the mean grey-level along the line (l) and the local neighbourhood mean grey-level (m). [Diagram labels: pixel i; grey level sum s for oriented line; mean line grey level l; mean local grey level m; output at pixel i = l - m.]

Figure 4. Examples of oriented patterns extracted from a spiculated lesion. The orientation of each pixel is represented by a small dark line. The light grey lines represent the extensions of the lines to the limits of the windows. The lesion centre is marked with a black dot.

Figure 5. The construction of an observation vector from an oriented pattern window. The spatial location of oriented patterns within the window is represented by position i within the window and vector. [Diagram panels: oriented pattern (i = 1..n); double unit vector representation; observation vector.]

4.2. Representing Patterns of Linear Structures

A generic representation of oriented line patterns is obtained by constructing observation vectors from the orientation values of pixels in a square window scanned over the image. Thus any pattern of lines with the same window size can be represented. Figure 4 provides examples of oriented patterns extracted from a spiculated lesion. Building statistical models of orientation data involves the calculation of mean angles. Ambiguous results are obtained if the orientations are represented as simple angles (Mardia, 1972).
In the simplest scenario, calculating the angle of a linear structure results in two values 180 degrees apart, due to the periodic nature of orientation data. To avoid such problems the orientation of each pixel was multiplied by a factor of two and represented as a 2D unit vector. A simple normalised angle would give a discontinuity where the angle wraps around, whereas the use of a unit vector avoids this problem as the components are continuous on $[-1, 1]$. Angle summation using cartesian vectors behaves correctly with, for example, vectors representing lines at right angles summing to zero.

Each oriented pattern window is represented by a single observation vector comprising the concatenated values of the 2D unit vectors corresponding to the orientations of the pixels within the window (Figure 5). The number of elements in the observation vector is thus equal to twice the number of sampled pixels in the window, with the position of each oriented line coded by position in the vector.

Additional, practical considerations must be taken into account when constructing observation vectors. A suitable window size must be used to detect lesions. Ideally the window should cover a substantial proportion of the lesion to be detected, to enable the abnormal pattern (distorted by the carcinoma) to be successfully modelled. However, lesion size is highly variable and will not be known before detection. In our experiments the window size was set to 21.5 mm square ( pixels), which is comparable to the largest of the lesions in the test/training set.

A more significant consideration is the size of the observation vector. Training a factor model (see section 3.2) requires the construction of a covariance matrix with $(2n)^2$ elements, where $n$ is the number of oriented pixels in the pattern. For a window size of $w \times w$ pixels and with samples taken on a square grid with step size $d$, the number of elements $m$ in the covariance matrix is given by (9).

$$m = (2n)^2 = \left[ 2 \left( \frac{w}{d} + 1 \right)^2 \right]^2 \tag{9}$$
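The doubled-angle representation and the size estimate of equation (9) can be checked with a short sketch. The function names are hypothetical, orientations are assumed to be given in radians, and the memory figure assumes 8 bytes per covariance element, which is an assumption of this sketch rather than something stated in the text.

```python
import numpy as np

def observation_vector(orientations):
    """Concatenate the doubled-angle 2D unit vectors for a window of orientations."""
    theta = np.asarray(orientations, dtype=float).ravel()
    return np.column_stack((np.cos(2 * theta), np.sin(2 * theta))).ravel()

def mean_orientation(orientations):
    """Mean orientation in [0, pi), computed from the doubled-angle unit vectors."""
    theta = np.asarray(orientations, dtype=float)
    c, s = np.cos(2 * theta).mean(), np.sin(2 * theta).mean()
    return (0.5 * np.arctan2(s, c)) % np.pi

def covariance_elements(w, d):
    """Equation (9): m = (2n)^2 with n = (w/d + 1)^2 sampled pixels."""
    n = (w // d + 1) ** 2
    return (2 * n) ** 2

if __name__ == "__main__":
    # Lines at 5 and 175 degrees are nearly parallel; a naive mean gives 90 degrees.
    print(np.rad2deg(mean_orientation(np.deg2rad([5.0, 175.0]))))
    # Lines at right angles map to opposite unit vectors, which sum to zero.
    v = observation_vector(np.deg2rad([0.0, 90.0])).reshape(-1, 2)
    print(np.round(v.sum(axis=0), 12))
    for step in (1, 32):
        m = covariance_elements(512, step)
        print(step, m, m * 8 / 1e6, "megabytes at 8 bytes per element")
```

The doubled-angle mean of the two nearly horizontal lines comes out as horizontal (0 or, equivalently, 180 degrees), which is the behaviour the representation is designed to give.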

For $w = 512$ and $d = 1$, $m = 2.77 \times 10^{11}$, requiring approximately 2,216 gigabytes of memory. A more practical compromise was achieved by setting $d = 32$, giving a requirement for approximately 2.67 megabytes of memory.

Using a grid sampling step greater than unity introduces the additional problem of how to sample the data. We adopted a maximum local line-strength strategy. After application of the multiscale directional line operator to the mammogram, non-maximal suppression (Canny, 1986) was applied to the line-strength image. The resulting image was thinned to produce a skeleton image containing only those pixels falling along the centre lines of linear structures detected in the mammograms. The $(d \times d)$ neighbourhood around each sample point was searched for the skeleton pixel with the maximum line-strength value; the corresponding pixel's orientation provided the orientation for the grid point. To deal with the problem of missing data, without introducing bias, the orientation of the grid point was drawn from a random distribution if no skeleton pixels were found.

4.3. Statistical Models

To allow an investigation of the properties of various statistical models of oriented line patterns, synthetic patterns were generated to train and evaluate the models. An example pattern is shown in Figure 6a. All experiments with synthetic patterns described in this paper used patterns with 81 oriented lines arranged on a 9 x 9 grid. A structured pattern was generated by focusing the lines on a point within the window. Systematic variation of this structure was achieved by altering the orientations of the lines such that the pattern became focused on different pixels. It is important to emphasise that these synthetic focal patterns were merely used as a tool to evaluate the models and that neither the representation nor the statistical models were specialised to deal with focal patterns.
An observation vector was constructed by concatenating the 2D unit vector values corresponding to the orientations of the lines in a given synthetic pattern. A training population of patterns was generated by focusing the pattern at each of the points in the window in turn. In some experiments the patterns were corrupted with either (or both) of two types of random noise: a proportion of the lines were randomly oriented (Figure 6b) and a proportion of the lines were randomly perturbed (Figure 6c). This achieved corruption of patterns in a manner that was representative of real data but simple to control.

Figure 6. (a) A synthetic focused line pattern. (b) Same pattern with 5% of the lines randomly oriented. (c) 5% of lines with orientations randomly perturbed by up to 2 degrees.

PCA Model

A PCA model was trained on observation vectors from a population of focused line patterns containing no corrupted data. The resulting model provided a compact description in the form of (1) in which the first ten principal components described 90% of the observed variability in the training population of patterns. Figure 7 demonstrates the effect of varying the weights in $b_i$ corresponding to the first 5 principal components. The first two components (or modes of variation) describe horizontal and vertical translation of the focal point. This was expected, since the major variation in the training set was translation of the focal point. Table 1 shows that over half of the total variance is described by the first two components. The remaining modes describe pattern rotation, focus and skew with decreasing variance.

We investigated the ability of a PCA model to reconstruct (using (1) and (2)) the underlying structure of a pattern that had been substantially corrupted by noise as described in the previous section. Half of the lines in the pattern were oriented randomly whilst the remaining half had their orientations randomly perturbed by up to 2 degrees.
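The synthetic data described here can be generated along the following lines. This is a sketch under stated assumptions, not the authors' generator: the grid coordinates, the sign convention for orientation, the uniform noise distributions and the parameter values in the usage example are all illustrative choices, and the function names are hypothetical.

```python
import numpy as np

def focused_pattern(focus, grid=9):
    """Orientations in [0, pi) of lines on a grid x grid lattice, each aligned
    with the direction towards `focus` = (row, col).

    The orientation at the focus itself is undefined and is returned as 0.
    """
    rows, cols = np.mgrid[0:grid, 0:grid]
    return np.mod(np.arctan2(rows - focus[0], cols - focus[1]), np.pi)

def corrupt(theta, frac_random, max_perturb, rng):
    """Give a fraction of the lines a random orientation and perturb the rest
    by up to +/- max_perturb radians (uniform noise is an assumption)."""
    flat = theta.ravel().copy()
    order = rng.permutation(flat.size)
    k = int(round(frac_random * flat.size))
    flat[order[:k]] = rng.uniform(0.0, np.pi, size=k)
    flat[order[k:]] += rng.uniform(-max_perturb, max_perturb, size=flat.size - k)
    return np.mod(flat, np.pi).reshape(theta.shape)

def training_population(grid=9):
    """One doubled-angle observation vector per focal point in the window."""
    vectors = []
    for r in range(grid):
        for c in range(grid):
            theta = focused_pattern((r, c), grid).ravel()
            vectors.append(np.column_stack((np.cos(2 * theta),
                                            np.sin(2 * theta))).ravel())
    return np.array(vectors)

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    clean = focused_pattern((4, 4))
    noisy = corrupt(clean, frac_random=0.5, max_perturb=np.deg2rad(20.0), rng=rng)
    print(training_population().shape)   # 81 patterns, 2 x 81 elements each
    print(np.rad2deg(noisy[4]).round(1))
```

A population built this way can be passed to any PCA or factor analysis routine that accepts one observation vector per row.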
Figure 8 demonstrates the successful reconstruction of the underlying structure from a single corrupted pattern using only the first five principal components of the model.

The evaluation of the PCA model continued by training a model using corrupted patterns and determining the ability of the resulting model to reconstruct the underlying structure in corrupted patterns. Figure 9 provides an example of reconstruction for the same corrupted pattern as Figure 8. It can be seen that the model trained with corrupted patterns fails to reconstruct the underlying structure successfully. This is caused by the principal component analysis attempting to explain all of the observed variability in the training population as a systematic effect, transferring noise variance into the model modes.

Factor Model

It was clear from the results of the PCA model, trained using un-corrupted patterns, that we could successfully model oriented patterns. In the presence of noise, however, PCA fails to separate the underlying, systematically varying, structured patterns from the random variation added to each observation vector. Instead we applied Factor Analysis (see section 3.2)

8 R. Zwiggelaar et al. component 1 2 3 4 5 6 7 8 9 1 variance (%) 28.3 26.3 11.7 8.3 6.6 2.3 2.1 2. 1.5 1.

Percentage of total variance represented by the first 1 principal components.


which models the separation without the need for a priori knowledge of the underlying structure. A factor model was trained on corrupted patterns. The resulting model provided a compact description using 5 common factors. Variation of the common factors resulted in sensible pattern modes of variation even though the training population contained a substantial amount of random corruption. The first five common factors of the factor model describe horizontal and vertical translation of the focal point, rotation, focus and skew (Figure 10), similar to the first 5 modes of variation of the PCA model trained with un-corrupted patterns (Figure 7). This demonstrates that the factor model has, to a significant extent, separated the systematically varying structured patterns from the random variation. Although the separation of systematic from specific variation is not perfect, the factor model successfully reconstructs the underlying structure from examples of patterns in the presence of substantial noise corruption (Figure 11).

Figure 7. First five modes of variation (from top to bottom) of the PCA model trained with un-corrupted patterns, with the mean pattern in the centre and ±2 sd on either side.

Figure 8. (a) Original structured pattern. (b) Corrupted pattern. (c) Reconstruction using PCA model trained on un-corrupted patterns.

Figure 9. (a) Original structured pattern. (b) Corrupted pattern. (c) Reconstruction using PCA model trained on corrupted patterns.

To investigate the effect of increasing levels of pattern corruption on a factor model we trained models with different degrees of corruption, ranging from purely systematic to completely random.
Figure 12 shows a plot of the magnitudes of the first five factor loading vectors (columns in L) and of the error variances against the proportion of randomly oriented lines. The systematic part of the model behaves as theory suggests: the quantity of systematic variation reduces as the random proportion increases; hence the variances of the common factors reduce and the error variance increases.


Figure 12. Vector magnitudes (% of total magnitude) versus proportion of corrupted lines (free random line proportion), plotted for the error vector and for loading vectors 1 to 5.

Figure 10. Variation of the first five common factors (from top to bottom) of the factor model, with the mean pattern in the centre and ±2 sd on either side.

Figure 11. (a) Original structured pattern. (b) Corrupted pattern. (c) Reconstruction using factor model trained on corrupted patterns.

Detecting Spiculated Lesions

The use of corrupted synthetic patterns demonstrated the ability of factor analysis to separate systematic variation from specific, random variation. This encouraged us to investigate the use of our generic representation coupled with factor analysis to discriminate between patterns obtained from mammographic abnormalities and patterns obtained from normal tissue. This required us to make the assumption that the scores for the common and specific factors in the mammographic data were normally distributed. We did not formally investigate the validity of this assumption. Initially, to gain an insight into the technique's classification capabilities, we trained a factor model to discriminate between lesion pixels and non-lesion pixels. The technique was then applied to a large data set of mammograms to determine the overall lesion detection performance.

Pixel Classification

Sixteen high resolution mammograms (50 microns/pixel) containing spiculated lesions (images 144, 145, 148, 175, 178, 179, 181, 184, 186, 188, 190, 191, 193, 195, 198, 199) were obtained from the MIAS database (Suckling et al., 1994), which provides annotations of lesion positions and extent by an expert radiologist. A factor model was trained on 16 spiculated lesions using a window size of pixels and a grid sampling step of 32 pixels.
All the images were first processed using the multi-scale line operator described in section 4.1. A training population was built by constructing observation vectors from windows centred at each pixel in the annotated lesions. Table 2 shows the variance explained by each of the common factor loading vectors (columns in L) and the error vector. The comparatively large error magnitude was expected, as mammograms are extremely complex images with only limited systematic pattern. Figure 13 demonstrates the effect of the first five common factors. The first two common factor loading vectors describe the majority of the systematic variation. Variation of these two common


factors leads to an alteration in the general orientation of the line patterns. This effect is caused by the training data containing eight images of the left breast and eight of the right. Many linear structures in the breast are directed towards the nipple, so the expected orientation of these lines is ±45 degrees depending on whether they are extracted from a left or right mammogram. The first two common factors describe this variation. The remaining common factors demonstrate systematic pattern variation of a more complex nature involving mixtures of rotation, translation, focus and skew.

Figure 13. Variation of the first five common factors (from top to bottom) of a factor model trained from 16 spiculated lesions, with the mean pattern in the centre and ±2 sd on either side.

Initially, a simple classification experiment was performed by randomly sampling 1 lesion points and 4 non-lesion points from each of the 16 mammograms. Ten factor scores were obtained for each sample position and used for linear classification on a leave-one-mammogram-out basis. Using canonical discriminant analysis (Jolliffe, 1986), 74% of the points were correctly classified. Plotting the factor scores revealed that the lesion and non-lesion points tended to form compact clusters in feature space with sizes proportional to the size of the lesion from which they were sampled.

This observation led to a further experiment. The lesion annotation circle was divided into five concentric circles with equally spaced radii. Forty pixels were randomly sampled from each of the five rings between the circles, so a total of 200 points were sampled from within the original lesion annotation. The experiment proceeded by successively reducing the area labelled as lesion one ring at a time, and performing classification at each step.
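The ring sampling used in this experiment can be sketched in Python as follows. This is an illustration under stated assumptions rather than the procedure actually used: the function names are hypothetical, points are drawn uniformly by area within each ring, and points from a removed ring are relabelled as non-lesion (the text does not say whether they were relabelled or discarded).

```python
import math
import random


def ring_edges(lesion_radius, n_rings=5):
    """Radii of the equally spaced concentric circles, from 0 to the
    annotated lesion radius."""
    return [lesion_radius * k / n_rings for k in range(n_rings + 1)]


def sample_ring(centre, r_inner, r_outer, n_points, rng):
    """Draw points between two circles, uniformly with respect to area."""
    cx, cy = centre
    points = []
    for _ in range(n_points):
        # Sampling r^2 uniformly gives a uniform density over the ring area.
        r = math.sqrt(rng.uniform(r_inner ** 2, r_outer ** 2))
        phi = rng.uniform(0.0, 2.0 * math.pi)
        points.append((cx + r * math.cos(phi), cy + r * math.sin(phi)))
    return points


def sample_lesion(centre, lesion_radius, per_ring=40, n_rings=5, seed=0):
    """Sample per_ring points from each ring; ring 1 is the outermost."""
    rng = random.Random(seed)
    edges = ring_edges(lesion_radius, n_rings)
    rings = {}
    for k in range(n_rings):
        inner, outer = edges[n_rings - k - 1], edges[n_rings - k]
        rings[k + 1] = sample_ring(centre, inner, outer, per_ring, rng)
    return rings


def lesion_labels(rings, n_removed):
    """Label points 1 (lesion) or 0 (non-lesion) after the n_removed
    outermost rings have been taken out of the lesion area (assumption:
    removed rings are relabelled, not discarded)."""
    labelled = []
    for ring_no, pts in rings.items():
        label = 0 if ring_no <= n_removed else 1
        labelled.extend((p, label) for p in pts)
    return labelled


rings = sample_lesion((0.0, 0.0), 50.0)   # 5 rings x 40 points
step_two = lesion_labels(rings, 2)        # two outer rings removed
```

Calling `lesion_labels` with `n_removed` from 0 to 4 gives the sequence of labellings on which classification would be repeated.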
In addition to canonical discriminant analysis, k-nearest neighbour classification was performed since the lesion/non-lesion factor scores were not simply clustered. Figure 14 gives the resulting ROC and percentage correct classification graphs. Figure 14b demonstrates the improvement in total percentage correct classification with reduced lesion area. However, Figure 14a shows that this improvement was caused by an increase in correctly classified non-lesion points. In practice, the percentage of correctly classified lesion points reduced as the area labelled as lesion reduced. This corresponds to an increase in specificity at the expense of sensitivity.

Lesion Classification

Although our initial experiment demonstrated the ability to discriminate between lesion and non-lesion pixels, we were really interested in combining the pixel results to detect whole lesions. To measure the lesion detection performance of the technique, a set of 129 mammograms digitised to a resolution of 42 microns/pixel was used (Royal Observatory, 1996). The set comprised a chronological sequence of 29 screening mammograms containing spiculated lesions and 100 normal mammograms (for details see Appendix A). For each spiculated lesion a detailed annotation by an expert radiologist was obtained in the form of a polygon delineating the extent of the central mass of the lesion. A tenth of the pixels within each annotation were randomly selected as the centres of oriented pattern windows to provide observation vectors for training the factor model.
Thus the observation vectors used for training the factor model were based on abnormal patterns only, but the model was applied both to the normal mammograms and to those containing abnormalities. This was done on a leave-one-image-out basis for the normals and on a leave-one-patient-out basis for the abnormals. Because the dataset contains both the left and the right breast mammograms for each normal subject, there might be an optimistic bias in the FP results for the normals.

Table 2. Error and common factor magnitudes (% of total magnitude). Columns: component, magnitude (%), cumulative (%).

Figure 14. Pixel classification results using canonical discriminant analysis and k-nearest neighbour (k=1) for 5 concentric circles where circle number 1 is the largest (equal to the annotated lesion extent). (a) ROC curve (true positive fraction versus false positive fraction). (b) Total correct percentage classification versus circle number.

A covariance matrix ($C_{\omega_1}$) and a mean vector ($m_{\omega_1}$) of factor scores ($f_i$) for lesion patterns were constructed from the training set of abnormal images on a leave-one-mammogram-out basis. A factor model was built from the training set with one image omitted. The factor model was then applied to patterns extracted from windows centred on the lesion points in the omitted image to produce a set of factor score vectors, $f_i$. The process was repeated omitting each image in turn until a set of factor score vectors had been extracted from all the mammograms in the training set. The resulting set of vectors was used to obtain unbiased estimates of the covariance matrix ($C_{\omega_1}$) and mean ($m_{\omega_1}$) for the factor scores of lesion patterns.

A covariance matrix ($C_{\omega_2}$) and mean ($m_{\omega_2}$) representing the distribution of normal patterns were estimated by building a factor model from the whole of the training set (i.e. all the observations from the mammograms with abnormalities) and using it to extract patterns randomly positioned within 3 normal mammograms. Thus, two unbiased estimated covariance matrices ($C_{\omega_j}$, $j = 1, 2$) and mean estimated factor score vectors ($m_{\omega_j}$, $j = 1, 2$) were obtained from equal sized populations, representing the distribution of lesion ($j = 1$) and normal ($j = 2$) patterns respectively. The experiment proceeded on a leave-one-mammogram-out basis.
A factor model was applied to a mammogram by extracting observation vectors from oriented patterns centred at each pixel within the breast area on a .7 mm grid. For each observation, a vector of 1 factor scores ($f_i$) was estimated using (5). Following the procedure outlined in section 3.3, a probability image was obtained for each of the 129 mammograms. Regions classified as lesion were extracted by thresholding the probability images after Gaussian smoothing (σ = 1.1 mm) to reduce the effects of noise. Figure 15 shows an example of applying the technique to a single mammogram. The mammogram used for this illustration contains a fairly obvious lesion which is clearly highlighted in the probability image.

Two approaches were used to investigate the performance of the technique. The first has been used previously (Brake and Karssemeijer, 1996) and enabled a comparison with other published techniques. In this, the fraction of true positives and the number of false positives per image were determined whilst varying a threshold applied to the probability images; this resulted in a set of Free Response Operating Characteristic (FROC) curves, each with a minimum region size threshold, below which detected regions were ignored. For each detected region a true positive detection was recorded if it at least partially overlapped a lesion annotation, otherwise


a false positive was recorded. Other overlap criteria have been discussed in the literature (Brake and Karssemeijer, 1996): Kegelmeyer et al. (1994) use a 5% overlap between the detected and annotated region, and Karssemeijer and Brake (1996) determine a false positive based on the overlap between the position of the maximum probability value within a region and the annotated region.

Figure 15. Example of applying the oriented pattern technique to a single mammogram. (a) Original mammogram; an arrow indicates the position of a lesion. (b) Line strength image after application of the multi-scale directional line operator. (c) Skeleton image. (d) Probability image.

Our second approach to evaluation relates to the production of prompts. Since a radiologist will lose faith in an inaccurate system (Hutt, 1996), the size of the prompt, which relates to the localisation accuracy of the algorithm, will affect the ability of the system to aid the radiologist. Whilst the use of traditional detection FROC curves enables published methods to be compared, the traditional FROC curve, when viewed in isolation, can give an over-optimistic estimate of performance. Ideally the area of the detected region should be taken into account: a large detected region (in the extreme case the whole of the breast) where only a small lesion exists still counts as a true positive. For this reason, the sensitivity of our technique for a single minimum region size was plotted against the false positive area divided by the area of a circular prompt with a fixed diameter. This effectively measures the number of prompts that will fit into a falsely detected region. We

present results for a notional prompt 2 mm in diameter, which is consistent with that used in our previous experiments designed to determine the efficacy of prompting (Astley et al., 1993; Hutt, 1996). Figure 16 shows the FROC detection and FROC prompt size curves for the whole data set. At a sensitivity of 8% the number of false positives per image is 1.5, .22 and .14 for minimum region sizes of 8 mm, 12 mm and 16 mm respectively (Figure 16a). For this type of abnormality our target is .75 FP per image (see section 2.2). The results indicate that for lesions with a minimum size of 16 mm we approach the required performance though there are still double the number of FP per image compared to our target. Since the detection of cancers with a central mass of approximately 1 cm leads to an improvement in patient prognosis (Dean, 1996), a minimum region size of 12 mm diameter was used to produce a prompt size FROC curve (Figure 16b). At 1% sensitivity, the equivalent of 1.1 false positive prompts were obtained. An operating point can be selected to achieve 92.5% sensitivity with only .2 false prompts per image or 62% at .75 false prompts per image.

Figure 16. (a) Detection and (b) prompt size FROC curves resulting from the application of the oriented pattern technique to a set of 129 mammograms containing 29 spiculated lesions, for minimum region sizes of 8 mm, 12 mm and 16 mm. Axes: true positive fraction versus (a) false positives per image and (b) FP prompts per image.

5. DETECTING THE CENTRAL MASS

We describe a method for detecting the masses associated with spiculated lesions, based on directional recursive median filtering. Our approach, which can be used to detect various types of structure in images, generates a scale-orientation signature at each pixel. These signatures can be used directly for pixel classification.
A more robust description is given, however, by a principal component model that reduces dimensionality and suppresses local noise effects. Detection results are presented using receiver operating characteristics (ROC) and free-response ROC (FROC) curves (Metz, 1996). The results are compared with other methods (Miller and Ramsey, 1996; Petrick et al., 1996; Zouras et al., 1996) and suggestions for possible improvements are made.

Recursive Median Filtering

The Recursive Median Filter (RMF) is one of a class of filters, known as sieves, that remove image peaks or troughs of less than a chosen size (Bangham et al., 1996). They are closely related to morphological operators (Soille et al., 1996). By applying sieves of increasing size to an image, then taking the difference between the output images from adjacent size sieves, it is possible to isolate image features of a specific size. In 1-D, for every position at which the RMF is centred, the output is the median grey-level within the local neighbourhood (determined by the filter size).

A simple example of recursive median filtering applied to a 1-D signal f(x) is shown in Figure 17. From top to bottom, the original signal and the filtered results at two scales are shown. The second graph shows the signal f(x) after all the structures of scale equal to 1 pixel have been removed from the original signal. The third graph shows the signal f(x) after all the structures of scales smaller than 8 pixels have been removed from the original signal. The extracted information at a number of scales is shown in the bottom image of Figure 17, where the intensity corresponds to the grey-level change at each pixel at each of a number of scales. This information can be regarded as a scale signature at a pixel level containing all the local structure information available, over the range of scales at

which the RMF was applied. In this case, scales of 1 to 64 pixels (on a log 1.5 basis) were used, with the smallest scale at the bottom and the largest scale at the top of the image. For every position x this results in a 1-D scale signature which describes the local behaviour of f(x). This is known as a granulometry (Serra, 1994).

Figure 17. 1-D RMF example. The original signal f(x) is shown at the top, the second graph shows the filtered signal with structures of scale one removed and the third graph shows the filtered signal with structures of scale smaller than eight removed. The image at the bottom shows the 1-D scale signature for each x where the grey-levels represent the change in the signal f(x) between scales with the smallest scale at the bottom and the largest scale at the top of the image (zero change is represented by mid grey; positive values are lighter, and negative values darker).

Directional Recursive Median Filtering

For 2-D images, a 1-D RMF can be applied at any chosen angle, by covering the image with lines at this angle, ensuring that every pixel belongs only to one line (Press et al., 1992). By performing 1-D Directional Recursive Median Filtering (DRMF) at several orientations, a scale-orientation signature can be built for each pixel. The signature is a 2-D array in which the columns represent measurements for the same orientation, the rows represent measurements for the same scale, and the values in the array represent the change in grey-level at the pixel, resulting from applying a filter at the scale and orientation corresponding to the position in the array. The grey-level changes are measured with respect to the image filtered at the next smaller scale at the same orientation.

Examples Using Synthetic Data

Figure 18 shows scale-orientation signatures for pixels located on synthetically generated structures.
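As a concrete illustration of the 1-D recursive median filtering and scale signature described in the two preceding subsections, the following Python sketch filters a short signal at increasing scales and records the grey-level change at each scale. It is a simplified reading of the method, not the authors' implementation: the function names are hypothetical, a filter of scale s is taken to use a window of 2s + 1 samples, the filters are applied in cascade, and the signal ends are padded by edge replication.

```python
from statistics import median


def recursive_median(signal, scale):
    """1-D recursive median filter with a window of 2*scale + 1 samples.

    Outputs already computed to the left of the current position are fed
    back into the window; positions outside the signal are clamped to the
    nearest end (edge replication, an assumption).
    """
    n = len(signal)
    out = list(signal)
    for i in range(n):
        window = []
        for j in range(i - scale, i + scale + 1):
            j = min(max(j, 0), n - 1)
            window.append(out[j])   # filtered values for j < i, raw otherwise
        out[i] = median(window)
    return out


def scale_signature(signal, scales):
    """Cascade the filter over increasing scales.

    Returns (rows, residual): rows[k][x] is the change at position x caused
    by filtering at scales[k], measured against the signal filtered at the
    next smaller scale; residual is what remains after the largest scale.
    """
    previous = list(signal)
    rows = []
    for s in scales:
        current = recursive_median(previous, s)
        rows.append([p - c for p, c in zip(previous, current)])
        previous = current
    return rows, previous


# A one-sample peak and a three-sample plateau on a flat background.
f = [0, 0, 5, 0, 0, 0, 3, 3, 3, 0, 0, 0, 0]
rows, residual = scale_signature(f, [1, 2, 3])
signature_at_x7 = [row[7] for row in rows]   # scale signature of one sample
```

In this toy signal the isolated peak appears only in the smallest-scale row and the plateau only in the scale 3 row. A scale-orientation signature is obtained by repeating the same procedure along image lines at each orientation.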
The response to a circular binary blob results in a signature which has non-zero values only at one scale, which is the same for all orientations (Figure 18a). For a binary line the resulting signature is highly scale and orientation dependent, with the minimum scale related to the width of the line, and the maximum scale related to the length of the line (Figure 18b). When structures are more realistic, such as Gaussian lines or blobs, the signatures become slightly more complicated, but the overall shape remains similar (Figure 18c,d). For blob-like structures the centre pixel gives a very characteristic signature, where the scales at which information is present in the signature are related to the diameter of the structure. This is also true for pixels on the backbone of linear structures, for which the minimum scale in the signatures is related to the width of the linear structure, the maximum scale is related to its length, and the orientation at which this maximum occurs indicates the direction of the linear structure. Although the signatures for non-centre pixels are not identical, they are usually very similar. This local stationarity property is useful for pixel classification.

Detecting Spiculated Lesions

We present results for a set of 56 mammograms from the PRISM database (Royal Observatory, 1996), of which 28 contained spiculated lesions annotated by an expert radiologist, and 28 were normal mammograms. This is a subset of the mammograms used in the previous section. The number of normals was reduced for storage reasons: for every scale and orientation an image the size of the mammogram has to be stored, so typically 132 times (12 orientations and 11 scales) the size of a mammogram is needed as storage space, although the sparseness of the data might allow a lower factor to be achieved. A typical example of a mammogram containing a mass associated with a spiculated lesion is shown in Figure 1.
Note the other blob structures which are present in the mammogram and are very similar in appearance to the abnormal mass associated with the lesion. To avoid bias, classification experiments were performed on a leave-one-image-out basis (see remarks in section 4.4.2), with the means and covariances used in calculating probability images being determined from all the signatures, except those from a given mammogram, and then applied to the signatures from the excluded mammogram. This was repeated for each mammogram in turn and the results were averaged. To reduce computation time, both the normal and abnormal signatures were extracted

from the 28 mammograms containing the abnormalities. For the mammograms containing abnormalities these were used on a leave-one-image-out basis, while for the normal mammograms the whole data set was used.

Figure 18. Some synthetic examples of multi-scale DRMF signatures, where the larger four images show (a) a binary blob, (b) a binary linear structure, (c) a Gaussian blob and (d) a Gaussian linear structure. The four smaller images are the scale-orientation signatures for the centre pixel of each image, where scale is on the vertical axis and orientation on the horizontal (the background grey-level is zero, i.e. only positive values are present in the DRMF signatures).

A similar statistical modelling approach was taken as for the detection of the abnormal patterns of linear structures. However, for the detection of the central mass we show that the use of PCA gives satisfactory results. The expected improvement from the use of factor analysis did not warrant the time that would have been needed to determine the optimal number of factors used in the modelling (as all experiments were performed on a leave-one-image-out basis).

Directional Recursive Median Filtering

The original mammograms were digitised at 42 µm pixel spacing. They were first reduced in size by Gaussian subsampling to give a pixel spacing of 336 µm. DRMF was applied at 12 orientations and for all scales from 1 to 64 pixels. Scales 1 to 64 were then divided into 11 bins on a log 1.5 basis with the 12th bin covering scales larger than 64 pixels.
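The binning of scales can be made concrete with a short Python sketch. The exact binning rule is not given in the text; the rule below, which assigns a scale s to bin floor(log s / log 1.5), is an assumption chosen because it happens to divide scales 1 to 64 into exactly 11 bins. The function names are hypothetical.

```python
import math


def scale_bin(scale, base=1.5):
    """0-based index of the logarithmic bin containing a filter scale."""
    return int(math.floor(math.log(scale) / math.log(base)))


def bin_scales(max_scale=64, base=1.5):
    """Group the integer scales 1..max_scale into logarithmic bins."""
    bins = {}
    for s in range(1, max_scale + 1):
        bins.setdefault(scale_bin(s, base), []).append(s)
    return bins


bins = bin_scales()
n_orientations = 12
# One mammogram-sized plane per (orientation, scale bin) pair.
n_planes = n_orientations * len(bins)
```

Under this rule the smallest bins hold a single scale each and the largest bin holds scales 58 to 64; 12 orientations times 11 bins gives the 132 mammogram-sized planes mentioned earlier as the storage requirement.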
This resulted in a 12 by 12 scale-orientation signature for each pixel, if the residual values were taken into account, or in an 11 by 12 scale-orientation signature if the residual values were not included (in our experiments the latter approach was used).

Principal Component Analysis

A principal component model was trained on an equal number of normal and abnormal scale-orientation signatures (on a leave-one-image-out basis). Approximately 22 signatures were used to train each PCA model. Slight differences occurred between the numbers of signatures used as these were based on the number of pixels within each annotated abnormality. The abnormal signatures were taken from the annotated central mass while the normal signatures were randomly selected from the same strips (excluding the abnormality) across the mammograms.

The first five principal components explained, cumulatively, approximately 30%, 42%, 50%, 55% and 59% of the training set variance. To explain a cumulative variance of 85%, 90%, 95% and 99%, the number of PCs required were 24, 33, 49 and 86 respectively. The large number of principal components needed to explain 99% of the variation implies that there is a significant degree of independent variation of the values in individual cells in the signature, which may be considered as noise.

The mean signature and the effect of the first three principal components of the scale-orientation signatures are shown in Figure 19. The mean signature contains values close to zero (all smaller than two). The first three principal components encode changes in the signatures, with more pronounced effects at the larger scales. The first principal component codes large blob-like structures which are either positive (+3sd) or negative (−3sd) with respect to their local grey-level. The second and third principal components show more complicated patterns with changes in both scale and orientation.
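The relationship between cumulative explained variance and the number of principal components retained can be sketched as follows. The function name and the eigenvalues are hypothetical and serve only to show the calculation; they are not the values of the model described here.

```python
def components_needed(eigenvalues, targets):
    """For each target fraction, the smallest number of leading components
    whose cumulative share of the total variance reaches that target."""
    total = sum(eigenvalues)
    ordered = sorted(eigenvalues, reverse=True)
    needed = {}
    for target in targets:
        running = 0.0
        for k, value in enumerate(ordered, start=1):
            running += value
            if running / total >= target:
                needed[target] = k
                break
    return needed


# Hypothetical eigenvalue spectrum (variance along each principal component).
toy_eigenvalues = [4.0, 3.0, 2.0, 1.0]
needed = components_needed(toy_eigenvalues, [0.5, 0.8, 0.95])
```

Applied to the eigenvalues of the signature covariance matrix with targets of 0.85, 0.90, 0.95 and 0.99, the same calculation yields the kind of component counts quoted above.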
Figure 20 shows scale-orientation signature reconstructions for various mammogram pixels using different numbers of principal components. It appears that a small number of PCs are sufficient to reconstruct the general structure of the scale-orientation signatures whilst reducing the level of noise. All 86 PCs are, however, required to give a detailed reconstruction of the original signatures.

Probability Images

A covariance matrix ($C_{\omega_j}$) and a mean vector ($m_{\omega_j}$) of reduced dimension vectors from the PCA were constructed for two classes, i.e. mass ($j = 1$) and non-mass pixels ($j = 2$). As above, the dataset that was used contained signatures extracted from abnormal images only; the abnormal signatures were taken from the annotations and the normal signatures were randomly selected from the same strips (excluding the abnormality) across the mammograms. A common covariance matrix was built from the two classes and used to determine the Mahalanobis distance to the two class means for each pixel in the mammogram. A common covariance matrix was used instead of the individual covariance matrices as the non-class signatures tended to be

sparse and the resulting matrix not well behaved. From the Mahalanobis distance to the two classes, the class probability densities were obtained and used to produce a probability image (see section 3.3) for every mammogram in the dataset. Results were obtained using different numbers of principal components.

Figure 19. The mean signature and the effect of the first three principal components (±3 standard deviations) on the scale-orientation signatures; positive values are light, negative values dark.

Figure 20. Some examples of scale-orientation signatures as extracted from a mammogram (full) and their reconstruction from a limited number of principal components (1-24 PCs, 1-33 PCs and 1-86 PCs); positive values are light, negative values dark. The pixels for which signatures are shown were positioned on the edge of an abnormal mass (a & c), at the centre of an abnormal mass (b) and on the background (d), all taken from a line across the mass shown in Figure 1.

A mammogram and the resulting probability image, using the full signatures, are shown in Figure 21. By comparing the mass annotation and the detected mass-like regions in Figure 21, we can see that the annotated mass has been detected. It is also clear, however, that other regions have been highlighted; in particular, there are a number of linear structures which also have a high probability value. A probability image for the same mammogram based on the first 33 PCs is shown in Figure 22b. Here too the annotated mass has been detected, but the linear structures which were detected in Figure 21 are not present.
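The probability computation described in this subsection can be illustrated with a small, dependency-free Python sketch. Section 3.3 is not reproduced here, so the details are assumptions: the two classes are modelled as Gaussians sharing one pooled covariance matrix, priors are taken as equal, and all function names and the toy 2-D data are hypothetical.

```python
import math


def mean_vector(rows):
    return [sum(col) / len(rows) for col in zip(*rows)]


def pooled_covariance(class_rows):
    """Class means plus a common covariance: within-class scatter summed
    over the classes, divided by (number of samples - number of classes)."""
    means = [mean_vector(rows) for rows in class_rows]
    d = len(means[0])
    total = [[0.0] * d for _ in range(d)]
    n = 0
    for rows, mean in zip(class_rows, means):
        n += len(rows)
        for row in rows:
            diff = [x - m for x, m in zip(row, mean)]
            for a in range(d):
                for b in range(d):
                    total[a][b] += diff[a] * diff[b]
    dof = n - len(class_rows)
    return means, [[v / dof for v in row] for row in total]


def solve(matrix, rhs):
    """Solve matrix * x = rhs by Gaussian elimination with partial pivoting
    (the matrix is assumed to be non-singular)."""
    d = len(rhs)
    aug = [list(matrix[i]) + [rhs[i]] for i in range(d)]
    for col in range(d):
        pivot = max(range(col, d), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, d):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, d + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * d
    for i in reversed(range(d)):
        tail = sum(aug[i][j] * x[j] for j in range(i + 1, d))
        x[i] = (aug[i][d] - tail) / aug[i][i]
    return x


def mahalanobis_sq(x, mean, cov):
    """Squared Mahalanobis distance (x - mean)^T cov^-1 (x - mean)."""
    diff = [a - b for a, b in zip(x, mean)]
    return sum(a * b for a, b in zip(diff, solve(cov, diff)))


def mass_probability(x, means, cov):
    """Probability of class 1 (mass) under equal priors. With a shared
    covariance the Gaussian normalising constants cancel."""
    d1 = mahalanobis_sq(x, means[0], cov)
    d2 = mahalanobis_sq(x, means[1], cov)
    z = min(0.5 * (d1 - d2), 700.0)   # clamp to avoid overflow in exp
    return 1.0 / (1.0 + math.exp(z))


# Toy 2-D "reduced signatures" for mass and non-mass pixels.
mass = [(2.0, 2.0), (3.0, 2.0), (2.0, 3.0), (3.0, 3.0)]
non_mass = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
means, cov = pooled_covariance([mass, non_mass])
p_mid = mass_probability((1.5, 1.5), means, cov)    # halfway between means
p_mass = mass_probability((2.5, 2.5), means, cov)   # at the mass mean
```

Evaluating `mass_probability` at every pixel's reduced signature would give a probability image of the kind shown in Figures 21 and 22.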


Figure 21. (a) Original mammogram (Figure 1) and (b) the probability image based on the full signatures.

Figure 22. (a) Original mammogram (Figure 1) and (b) the probability image based on the first 33 PCs.

Pixel Classification

Sets of probability images, based on various numbers of PCs and the full signatures, were used to produce ROC curves for pixel classification (see Figure 23). The classification was based on all the pixels within the breast area (a simple
