Use of Artificial Color filtering to improve iris recognition and searching

Pattern Recognition Letters 26 (2005) 2244-2251
www.elsevier.com/locate/patrec
doi:10.1016/j.patrec.2005.03.032

Jian Fu a,*, H. John Caulfield b, Seong-Moo Yoo c, Venkata Atluri a

a Department of Computer Science, Alabama A&M University, 4900 Meridian Street, Normal, AL 35762, United States
b Alabama A&M University Research Institute, P.O. Box 313, Normal, AL 35762, United States
c Department of Electrical and Computer Engineering, University of Alabama in Huntsville, Huntsville, AL 35899, United States
* Corresponding author. Fax: +1 256 372 5578. E-mail address: jian.fu@email.aamu.edu (J. Fu).

Received 27 May 2004; received in revised form 29 November 2004
Available online 3 June 2005
Communicated by H. Wechsler

Abstract

Iris recognition and searching are attractive in biometrics for many reasons. The spatial patterns have been studied and recognized effectively for several years. They are more complex than fingerprints. We suggest here that the relatively new field of Artificial Color filtering can provide an orthogonal discriminant to the spatial pattern discriminant. We also show how to combine results from the two discriminants in such a way as to improve performance of the combined system over either part, something that has been troubling until now.

© 2005 Elsevier B.V. All rights reserved.

Keywords: Artificial Color; Bayesian analysis; Iris recognition; Biometrics; Support Vector Machine

1. Introduction

Irises are complex multilayer structures that derive their color from pigmentation, from scattering, and from interactions between layers. In a very real sense, they have no color. They have many colors in different parts of the iris. Absorption and scattering in such a multilayer system make the color very complex to describe (Keating, 1988). Yet, as all enrollment forms suggest, iris color (usually called eye color, as the rest of the eye is either white or black) is a recognized discriminant among people.

Yet iris recognition methods do not use color. They use structure (shape). There has been considerable and quite successful work in spatial recognition of irises. Some of that work can be found in (Boles and Boashash, 1998; Daugman, 1993, 2003; Daugman and Downing, 2001; Wildes, 1997). These and other works have made iris recognition an attractive and viable approach to biometrics. The thesis defended in this paper is that color can supplement spatial pattern recognition to make the system even more reliable.

Daugman (2004) has shown that either conjunctive or disjunctive combining of decisions made by two discriminants makes one of the two types of errors (false positives and false negatives) worse. In that case, why not use the better of the two methods and forget the weaker one? That runs contrary to everything ever taught in Bayesian analysis (Gelman et al., 1995), where it is universally understood that the a posteriori result (the one obtained after using new information) is more reliable than the a priori estimate made before the new measurements were taken into account. All new information should be unambiguously helpful. This suggests that Boolean combination of decisions is not likely to be the best way to combine results from two discriminators. Two alternative combination methods are suggested here. Both make the a posteriori result superior to the a priori result (the result of using the spatial discriminator alone).

The goal of this paper is modest. We aim to show that color can be a means to improve the recognition of irises beyond what has been achieved by shape alone. To do that, we have two burdens of proof. First, we must show that combining two different approaches to iris recognition can conceivably improve performance over the better of the two methods being combined. Second, we must show that color recognition of irises provides significant discrimination on its own, so it is worth combining with spatial pattern recognition. Neither proposition is obvious.
2. Artificial Color filtering for irises

Natural Color is a discriminant computed by our brains and attributed to the object using data obtained through measurements employing multiple overlapping spectral sensitivity curves. If we substitute the word "computers" for the word "brains" in the previous sentence, we describe Artificial Color. It has numerous advantages that have been explored in detail elsewhere (Caulfield, 2003; Fu et al., 2004). Here we are concerned with filters analogous to the color filters photographers use on their cameras. A filter transmits or attenuates according to the spectral content of each pixel. A Boolean Artificial Color Filter transmits fully (T = 1) or attenuates fully (T = 0) at any pixel according to whether or not the spectrum is recognized as belonging to a certain class. Boolean logic operations on Boolean Artificial Color Filters result in new Artificial Color Filters (Caulfield et al., 2004).

To make an Artificial Color Filter, one must train a discriminator that assigns a pixel to either the class of interest or to some other class. Then the Artificial Color Filter is 1 where the pixel belongs to the class of interest and 0 elsewhere. Multiplying the image by that filter causes only the identified pixels to pass.

The filters we show were designed to recognize one iris spectral pattern and distinguish it from the others (see footnote 1). We used a simple set of nine iris images, as shown in Fig. 1, which were recorded by Prof. John Daugman and used with his permission. The irises in Fig. 1 are indexed from 1 to 9 from left/top to right/bottom. So, for example, one filter sought out iris 3 but not irises 1, 2, 4, 5, 6, 7, 8, and 9. Those filters were not powerful enough to allow for both significant variations in recorded iris color patterns and at the same time significant variations about the other iris colors. Accordingly, we used all nine Artificial Color Filters for every iris. The count of the number of pixels assigned a value 1 in those nine filters was viewed as an identifying vector. We assumed that the measured pattern of nine numbers for a given iris was the mean of a multivariate normal distribution with a diagonal covariance matrix. The standard deviation was taken to vary from component to component. With those assumptions, we could use noisy versions of a given iris to classify it. For this limited number of irises and the reasonable standard deviations assumed, excellent recognition of the nine irises from their color patterns was obtained.

Of great importance here is the pattern recognition method used. We employ a powerful generalizer called Margin Setting (Caulfield, 1974). This allows good selectivity even in the case of a simple commercial "color" camera using three bands. We place color in quotation marks because there is no color in the world for a camera to record. All it records in this case is the image viewed through spectrally overlapping RGB filters. The Artificial Color Filter will decide for each RGB set whether it should be assigned T = 1 (the pixel belongs to the set of interest) or T = 0 (it does not belong to that set).

Footnote 1. There is no known large database of colored irises, for the simple reason that until now there seemed little hope of using color for iris discrimination. Lacking that, we attempted to gain some sense of what might occur with large training and test sets by assuming that the nine samples we had were the means of a nine-mode Gaussian distribution with various variances about each mean.

Fig. 1. Nine iris photographs.

3. Margin setting

The goal of statistical pattern recognition is to assign new inputs (ones not in the available training set) to their proper categories as reliably as possible, based on what can be learned from the labeled examples that comprise the training set. Rather than use that long task description, workers in this area simply say that the goal is strong generalization.

As the new data are unknown, we must make some assumptions about them:

1. New data from some class are likely to cluster around the old data. Thus, it is desirable to separate the classes, as represented in a measurement or discriminator hyperspace, by as much as possible. The separation is called the margin. It measures how far borderline cases ("support vectors") would have to change to be misclassified by a given decision surface in that hyperspace. Think of it as a margin of error for new data.
2. Simple separation surfaces are better than complex ones, as they represent gross trends, not details that may not be very general.

A detailed review of this matter would take us far away from the primary objective, and those details are readily available (Caulfield and Heidary, in press; Heidary and Caulfield, 2005, in press).

Perhaps the most widely used generalization tool is the Support Vector Machine (Burges, 1998). It finds the separation surface of a given complexity that gives maximum margin for the points lying near the separation surface. Points far from that surface are not important, as new data could vary greatly around them and still be properly classified.

Margin Setting takes a new approach. It seeks very simple separation surfaces (usually linear) and finds a subset of the training set that is classified by that surface with a preselected margin. Usually that means that there will be members of the training set that cannot be separated that well. Those members are removed from the first training set to become the members of a new training set, for which we seek a new separation surface while still imposing the same margin. Appendix A gives a more detailed description of this new algorithm.

Maximum discrimination was not necessary for our goal, however. All we sought to show was that color could achieve discrimination among iris images. That we have done.

4. Demonstration experiment

One of our goals in this paper is merely to show that iris color discrimination is easy and powerful. We have not run enough cases to say how powerful it is. But the demonstration provided here suggests that it will be valuable, especially when used in conjunction with the much-better-developed spatial pattern recognition of irises.

For each of the irises in Fig. 1, we trained an Artificial Color Filter to recognize that iris and discriminate against the other eight. We chose 20 pixels at random from the non-pupil (iris) part of each image. We then filtered all nine iris images with all nine Artificial Color Filters. The resulting images were very interesting, but were only the penultimate goal. Fig. 2(a)-(c) shows the images produced by the Artificial Color Filters designed for the irises in the top row.

In order to obtain a number useful for discrimination, we simply counted the number of matches for each iris, i.e., the number of places in each iris image that are identified as belonging to the target iris and not any of the others. Table 1 shows the raw hit data, that is, the number of pixels with transmission 1 for a particular Artificial Color Filter with a particular iris image being filtered. Table 2 shows the same results normalized on a row-by-row basis so that the diagonal values in the matrix are 1.

The number of pixels surviving the Artificial Color Filter operation is called the count for that image using that filter. Making the counts for the nine Artificial Color Filters the nine components of a discrimination vector, and making the assumptions noted earlier on how discrimination vectors are distributed, we could then estimate the probability that each iris was iris 1, the probability that each iris was iris 2, and so forth. Those results are given in Table 3. Clearly, these results are encouraging. To allow for variations beyond those given, we assumed that each measurement was the mean of an independent Gaussian probability distribution.
That is obviously false, but even this simplified and simplifying assumption leads to the desired goal: the unmistakable proof that iris color can be a good discriminant when treated as we have treated it.

Fig. 2. Images filtered by the Artificial Color Filters designed for the three irises in the top row of Fig. 1.

Table 1
Number of pixel matches for irises 1-9

Iris no.       1       2       3       4       5       6       7       8       9
1          48806   11106   24092    8783    7279    7720    5677   34877   27027
2           5057   41300   25853    7536   27495    7754   23351    4828   31670
3          25620   23204   48149   21500   31818   15220   23891    8345   31379
4           5577   17305    5884   39504   34373   38067   15962    3488   14406
5          17395    4319    1867    4726   23160    5461   13155   26358   19160
6           1087   17594    2187   19761   10675   38023    4529    5476    3864
7          24319   13032    6039    2555   21064    2812   16811   11378   26956
8          41504     771    1311    1860   11732     753    7756   33175   18404
9          24712    5647    3860    1332    8153     395    7253   36670   20948

Table 2
Numbers of matches for irises 1-9, normalized on a row-by-row basis to the diagonal component

Iris no.        1        2        3        4        5        6        7        8        9
1           1.000   0.2276   0.4936   0.1800   0.1491   0.1582   0.1163   0.7146   0.5538
2          0.1224    1.000   0.6260   0.1825   0.6657   0.1877   0.5654   0.1169   0.7668
3          0.5321   0.4819    1.000   0.4465   0.6608   0.3161   0.4962   0.1733   0.6517
4          0.1412   0.4381   0.1489    1.000   0.8701   0.9636   0.4041  0.08829   0.3647
5          0.7511   0.1865  0.08061   0.2041    1.000   0.2358   0.5680   1.1381   0.8273
6         0.02859   0.4627  0.05752   0.5197   0.2808    1.000   0.1191   0.1440   0.1016
7           1.447   0.7752   0.3592   0.1520    1.253   0.1673    1.000   0.6768   1.6035
8          1.2511  0.02324  0.03952  0.05607   0.3536  0.02270   0.2338    1.000   0.5548
9          1.1796   0.2696   0.1843  0.06359   0.3892  0.01886   0.3462   1.7505    1.000

Table 3
Probabilities that each iris was iris 1, for different values of α

α      Iris 1       2       3       4       5       6       7       8       9
2.0    1.0000  0.0368  0.0313  0.0026  0.3423  0.0743  0.2998  0.6195  0.6911
1.0    1.0000  0.0000  0.0000  0.0000  0.0137  0.0000  0.0081  0.1473  0.2282
0.2    1.0000  0.0000  0.0000  0.0000  0.0000  0.0000  0.0000  0.0000  0.0000

When calculating the probabilities, the multivariate normal (Gaussian) distribution equation is employed:

$$p(x) = \frac{1}{(2\pi)^{N/2}\,|V|^{1/2}} \exp\left[-\frac{(x^{T} - m^{T})\,V^{-1}\,(x - m)}{2}\right] \qquad (1)$$

where x is the measured data vector, m is its mean, N is the number of components (here N = 9), V is the covariance matrix, |V| is the determinant of V, and V^{-1} is its inverse. x has components x_1, x_2, ..., x_9. In our case, we have only one sample of the x vector, so we simply

1. Use our one vector as m, as a guess of the mean. Its components are m_1, m_2, ..., m_9.
2. Assume V is diagonal.
3. Assume that σ = αm for each dimension.

α is a measure of the expected variability of new measurements from the means we measured. Clearly, if we assume large variability (α = 2), distinguishability degrades, and if we assume low variability (α = 0.2), distinguishability is almost perfect. For even moderate assumed variability (α = 1), distinguishability is remarkably good. The table supporting those claims for iris 1 is shown as Table 3. Similar results occurred for all nine irises.
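The three steps above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' code: it takes each row of Table 1 as the nine-component count vector of one iris, sets V = diag((α m_k)^2), and reports the density of Eq. (1) divided by its peak value at x = m, so that the normalizing constant drops out. The names `TABLE1`, `normalize_rows`, and `relative_likelihood` are hypothetical.

```python
import math

# Hit counts from Table 1. Assumption: row i is the count vector for iris i + 1.
TABLE1 = [
    [48806, 11106, 24092, 8783, 7279, 7720, 5677, 34877, 27027],
    [5057, 41300, 25853, 7536, 27495, 7754, 23351, 4828, 31670],
    [25620, 23204, 48149, 21500, 31818, 15220, 23891, 8345, 31379],
    [5577, 17305, 5884, 39504, 34373, 38067, 15962, 3488, 14406],
    [17395, 4319, 1867, 4726, 23160, 5461, 13155, 26358, 19160],
    [1087, 17594, 2187, 19761, 10675, 38023, 4529, 5476, 3864],
    [24319, 13032, 6039, 2555, 21064, 2812, 16811, 11378, 26956],
    [41504, 771, 1311, 1860, 11732, 753, 7756, 33175, 18404],
    [24712, 5647, 3860, 1332, 8153, 395, 7253, 36670, 20948],
]


def normalize_rows(counts):
    """Divide every row by its own diagonal entry (the Table 2 normalization)."""
    return [[value / row[i] for value in row] for i, row in enumerate(counts)]


def relative_likelihood(x, m, alpha):
    """Eq. (1) with a diagonal V and sigma_k = alpha * m_k, divided by its value at x = m."""
    squared_distance = sum(((xk - mk) / (alpha * mk)) ** 2 for xk, mk in zip(x, m))
    return math.exp(-0.5 * squared_distance)


if __name__ == "__main__":
    mean_iris_1 = TABLE1[0]
    for alpha in (2.0, 1.0, 0.2):
        scores = [relative_likelihood(row, mean_iris_1, alpha) for row in TABLE1]
        print(alpha, " ".join(f"{s:.4f}" for s in scores))
```

Under these assumptions the values for α = 2.0 come out at roughly 0.037 for iris 2 and 0.62 for iris 8, close to the corresponding Table 3 entries. That agreement suggests, but does not confirm, that Table 3 was computed in this way.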

Including all of the off-diagonal terms in the covariance matrix would be the most precise thing we could do, but it would make the calculations substantially more difficult. Our approximation of the covariance matrix by a diagonal matrix is obviously false, but even this simplified and simplifying assumption leads to the desired goal: the unmistakable proof that iris color can be a good discriminant when treated as we have treated it. The results would have been much better if we had not made the incorrect but mathematically convenient assumption that the covariance matrix is diagonal in the multivariate normal distribution formula. A full covariance matrix would use, rather than ignore, the expected correlations in the discrimination. Maximum discrimination was not necessary for our goal, however. All we sought to show was that color could achieve discrimination among iris images. That we have done.

5. Modifications required to apply Artificial Color filtering to iris recognition

What the preceding results show quite conclusively is that iris color is in itself a potentially powerful biometric. But the method we used to do that is not extendable to the problem of discriminating your iris from everyone else's. The reason is that we do not have everyone else's irises to discriminate against. We must instead train to recognize your iris against unknown other irises. One way to do this is to learn to distinguish your iris from those in a data set much larger than 9. Another way is to modify Margin Setting so that it can deal with the single-category case. That has been done (Heidary and Caulfield, 2004), but it was not applied here.

6. Beyond Boolean combinatorics

Daugman (2004) shows in his analysis that performing either a Boolean OR or a Boolean AND of results from two discriminators of unequal power will produce some good and also some bad effects.
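A short worked example may make the good and bad effects concrete. It assumes, purely for illustration, that the two tests err independently. The symbols $a_1, a_2$ (false-accept probabilities of tests 1 and 2) and $r_1, r_2$ (false-reject probabilities) are introduced here and do not appear in the original.

```latex
\begin{align*}
\text{OR rule (accept if either test accepts):}\quad
  & a_{\mathrm{OR}} = a_1 + a_2 - a_1 a_2, & r_{\mathrm{OR}} &= r_1 r_2,\\
\text{AND rule (accept only if both accept):}\quad
  & a_{\mathrm{AND}} = a_1 a_2, & r_{\mathrm{AND}} &= r_1 + r_2 - r_1 r_2.
\end{align*}
```

Since $a_1 + a_2 - a_1 a_2 \ge \max(a_1, a_2)$ and $r_1 + r_2 - r_1 r_2 \ge \max(r_1, r_2)$, the OR rule lowers false rejects but raises false accepts above those of the better test alone, and the AND rule does the reverse. For instance, with $a_1 = 0.01$ and $a_2 = 0.10$, the OR rule gives $a_{\mathrm{OR}} = 0.109$, worse than either test used alone.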
We use his argument (which is certainly correct) as a starting point for a different approach. That is, we show that Boolean logical operations are inappropriate as a way to combine biometric results, so Daugman's results are neither surprising nor troubling.

The reason Boolean logic leads to counterintuitive results is that it does not apply well to the problem. In Boolean logic, the assumption is that the discriminant has either recognized the individual (Boolean 1) or not (Boolean 0). Immediately, there is a problem. If a discriminator has already decided by producing a 1 or a 0, then no improvement is possible even in principle. Why seek to improve a decision that is already certain? Or, more appropriately for this paper, why attribute certainty to a decision we know to be uncertain?

All of this suggests that we allow for uncertainties. One way to do this is to use fuzzy rather than Boolean logic. This is quite easy to do in statistical pattern recognition. What is normally done is to construct a decision hyperspace that represents the measurements of features as distances along orthogonal axes. Each set of measurements is a point or vector (depending on how we choose to view it) in that hyperspace. Then learning from examples in a training set produces a decision boundary. A new instance also becomes a vector in that hyperspace. Depending on what side of the decision surface the point falls on, we assign it to the corresponding set. Yet it is both true and obvious that we can be much more confident of vectors lying far from the decision surface than we can be of those lying very close to it. Readers familiar with modern statistical pattern recognition, e.g. the Support Vector Machine, will find these comments familiar.

It is easy to define a fuzzy set to replace the crisp decision in the manner of Fig. 3. This allows the result to represent the fact that there may be some residual a posteriori uncertainty.
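One possible fuzzy set of this kind can be sketched as follows. The shape is an assumption made for illustration: membership is 1 up to a margin-reduced radius (1 - χ) r0, falls linearly beyond it, and reaches 0 at a maximum distance R. The linear ramp and the names `membership`, `crisp_decision`, `fuzzy_and_min`, and `fuzzy_and_product` are hypothetical. The two combination functions are the standard minimum and product conjunctions, included only to show how graded outputs from two classifiers could be merged.

```python
def membership(distance, r0, chi, r_max):
    """Graded class membership as a function of distance from a prototype.

    Assumed shape: 1 inside (1 - chi) * r0, 0 beyond r_max, linear in between.
    """
    inner = (1.0 - chi) * r0
    if not 0.0 <= chi < 1.0 or r_max <= inner:
        raise ValueError("need 0 <= chi < 1 and r_max > (1 - chi) * r0")
    if distance <= inner:
        return 1.0
    if distance >= r_max:
        return 0.0
    return (r_max - distance) / (r_max - inner)


def crisp_decision(distance, r0):
    """The Boolean alternative: 1 inside the radius, 0 outside, nothing in between."""
    return 1 if distance <= r0 else 0


def fuzzy_and_min(mu_a, mu_b):
    """Minimum conjunction; equals the Boolean AND when both inputs are 0 or 1."""
    return min(mu_a, mu_b)


def fuzzy_and_product(mu_a, mu_b):
    """Product conjunction; also equals the Boolean AND on 0/1 inputs."""
    return mu_a * mu_b


if __name__ == "__main__":
    # Two hypothetical classifiers report distances 0.9 and 1.4 for the same person.
    mu_spatial = membership(0.9, r0=1.0, chi=0.2, r_max=2.0)
    mu_color = membership(1.4, r0=1.0, chi=0.2, r_max=2.0)
    print(mu_spatial, mu_color, fuzzy_and_min(mu_spatial, mu_color))
```

Unlike `crisp_decision`, the graded value keeps the information that a sample near the boundary is less certain than one near the prototype, which is what a later combination step needs.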
Only if the possibility of improvement is allowed for can we hope to achieve it. If multiple biometric classifiers are each represented in terms of membership values in fuzzy sets, many obvious ways to combine the results for increased certainty (higher membership values) are possible. A fuzzy AND (there are many such fuzzy t-norms that reduce to the Boolean AND) is one approach. We prefer to view the fuzzy membership values as vectors in a new hyperspace in which we can train a new discriminant surface that is clearly going to be improved over either dimension alone. Our intuitive, Bayesian-informed sense that using new information will certainly improve the performance over the situation before the new information is taken into account is now restored. In Bayesian terms, the a posteriori estimate should be better than the a priori estimate.

Fig. 3. The fuzzy set is defined by its margin χ and its maximum distance R (axes: μ, membership in the class, versus D, distance from prototype). This, of course, is just one of infinitely many ways to fuzzify the decision, and not necessarily the best one.

7. Conclusions

1. While the data set used is limited, our results analyzing it show that Artificial Color can offer a powerful iris discriminant.
2. Once the filters are trained, Artificial Color hit counts can be made very rapidly, and their patterns can be used to recognize irises quite reliably.
3. These can be used to provide a numerical value orthogonal to (independent of) the iris spatial pattern recognition number(s) for combination in Bayes-inspired methods, to obtain a combined (a posteriori) recognition signal that is superior to the spatial-pattern-only (a priori) approach, just as Bayesians would expect.
4. Artificial Color can be used to narrow the space searched for spatial patterns if the task is identification of an unknown person rather than simply the verification of a single asserted identification. It may be a preferable item to search on, as it is so easy and fast.

Appendix A. Filter design methods

The simplest Artificial Color Filter is a software map of regions that are judged to belong to the proper class and not to other classes. The map has a value 1 at those picture elements (pixels) where the sought-after class is indicated and a value 0 elsewhere.
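Such a map can be sketched in Python as follows. This is a minimal illustration, assuming images stored as nested lists of (R, G, B) tuples and a caller-supplied per-pixel rule. The predicate passed as `belongs_to_class` is a hypothetical stand-in for a trained discriminator, not the Margin Setting classifier itself, and all function names are illustrative.

```python
def make_filter(image, belongs_to_class):
    """Build the 0/1 map: 1 where the pixel's (R, G, B) triple is assigned to the class."""
    return [[1 if belongs_to_class(pixel) else 0 for pixel in row] for row in image]


def apply_filter(image, mask):
    """Multiply image and map pixel by pixel; rejected pixels become (0, 0, 0)."""
    return [
        [tuple(channel * t for channel in pixel) for pixel, t in zip(img_row, mask_row)]
        for img_row, mask_row in zip(image, mask)
    ]


def hit_count(mask):
    """Number of pixels with transmission 1, i.e. the 'count' used for discrimination."""
    return sum(sum(row) for row in mask)


def filter_and(mask_a, mask_b):
    """Boolean AND of two maps, which is itself a valid 0/1 map."""
    return [[a & b for a, b in zip(row_a, row_b)] for row_a, row_b in zip(mask_a, mask_b)]


def filter_not(mask):
    """Boolean NOT of a map."""
    return [[1 - t for t in row] for row in mask]


if __name__ == "__main__":
    # Toy 2 x 2 image and a toy rule ("red dominates blue and is bright").
    image = [[(120, 80, 40), (30, 60, 200)], [(110, 90, 50), (20, 20, 20)]]
    mask = make_filter(image, lambda px: px[0] > px[2] and px[0] > 100)
    print(mask, hit_count(mask))
    print(apply_filter(image, mask))
```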
Multiplying the scene pixel-by-pixel with the filter leaves only the pixels deemed to be in the class at non-zero values.

The classification operation using Margin Setting consists of a sequence of S classifier design stages. We can prescribe the maximum S or continue adding stages until some stopping criterion is achieved. In this study, we fix various values of S. Classification in each stage is carried out by utilizing the concept of a prototype sphere of influence. That is, we use a method described below to select prototypes for each set. In the recognition stage, we calculate the distance of a new vector or point from the prototype. If that distance is less than or equal to a threshold value, we assign the new data to the corresponding class. Thus our task is to pick a good prototype and a good threshold.

To select the prototype, we use a very simple immunological approach. That is, we start with numerous random points or vectors. Each point is viewed as a potential prototype for the class of the nearest member of the training set. We now rank order the potential prototypes for each class. We do that by finding the nearest member of another set. That distance is called the zero-margin radius. The figure of merit of the potential prototype is given by the number of training set members inside the zero-margin radius around that prototype. Then we choose new potential prototypes based on the old ones. We choose a fixed number of these. Each is the result of two random events. First, we select one of the prototypes from the prior generation randomly from a probability distribution function governed by the figures of merit of the various prototypes. Then the prototype is mutated by choosing a perturbation from a normal distribution centered at the selected prototype. This continues until no improvement in figure of merit is achieved. The highest scoring prototype is selected.

Next comes the setting of the margin. Calling the zero-margin radius r_0, we set an x-percent margin by setting

$$r_x = (1 - 0.01x)\,r_0$$

The reduction of the radius makes false positives less likely, as it requires the new vector to lie closer to the old one. The attendant increase in unclassifiable data is somewhat like false negatives. It can be fought by increasing the number of generations. Obviously, there are tradeoffs to be made. We have found one satisfactory for our purposes. It is not likely to be optimum in any meaningful sense.

After we decrease the radius, we remove from the training set all members not falling within the x-percent radii for the various prototypes. That process is repeated for each new stage. Note, of course, that each stage uses as the training set only those samples not classified in earlier stages.

In classification, we test new data on the first-stage classifiers. If those classifiers indicate a class, we accept it. If not, we go to the second-stage classifiers, and so forth through all S stages. Three things can happen on labeled test data:

1. The new data point can be classified correctly.
2. The new data point can be classified incorrectly.
3. The data point may remain unclassified.

For any particular application, the costs of the two types of imperfections (misclassification and non-classification) may be different.

References

Boles, W.W., Boashash, B., 1998. A human identification technique using images of the iris and wavelet transform. IEEE Trans. Signal Process. 46, 1185-1188.
Burges, C.J., 1998. A Tutorial on Support Vector Machines for Pattern Recognition. Kluwer Academic Publishers.
Caulfield, H.J., 1974. Holographic spectroscopy. Opt. Eng. 13, 481-482.
Caulfield, H.J., 2003. Artificial Color. Neurocomputing 51, 463-465.
Caulfield, H.J., Heidary, K., in press. Exploring Margin Setting for Good Generalization in Multiple Class Discrimination. Pattern Recognition.
Caulfield, H.J., Fu, J., Yoo, S.-M., 2004. Artificial Color Image Logic. Inform. Sci. 167 (1-7).
Daugman, J., 1993. High confidence visual recognition of persons by a test of statistical independence. IEEE Trans. Pattern Anal. Machine Intell. 15, 1148-1161.
Daugman, J., 2003. The importance of being random: Statistical principles of iris recognition. Pattern Recognition 36 (2), 279-291.
Daugman, J., 2004. Combining Multiple Biometrics. Available from: <http://www.cl.cam.ac.uk/users/jgd1000/combine/combine.html>.
Daugman, J., Downing, C., 2001. Epigenetic randomness, complexity, and singularity of human iris patterns. Proc. Roy. Soc. 268, 1737-1740.
Fu, J., Caulfield, H.J., Pulusani, S.R., 2004. Artificial Color Vision: A preliminary study. J. Electron. Imaging 13 (3), 553-558.
Gelman, A., Carlin, J.B., Stern, H.S., Rubin, D.B., 1995. Bayesian Data Analysis. Chapman & Hall.
Heidary, K., Caulfield, H.J., in press. Discrimination among similar looking, noisy color paths using margin setting. Visual Comm. Image Represent. J.
Heidary, K., Caulfield, H.J., 2005. Application of supergeneralized matched filters (SGMFs) to target classification. Appl. Optics 44 (1), 47-54.
Keating, M.P., 1988. Geometric, Physical, and Visual Optics. Butterworths.
Wildes, R.P., 1997. Iris recognition: An emerging biometric technology. Proc. IEEE 85, 1348-1363.
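Supplementary sketch for Appendix A. The staged Margin Setting procedure can be illustrated with a deliberately simplified Python version. It makes two assumptions that depart from the description in the appendix: candidate prototypes are the training points themselves, scanned exhaustively, rather than random points refined by mutation and selection, and distances are Euclidean. All function names are hypothetical.

```python
import math


def zero_margin_radius(proto, label, data):
    """Distance from a prototype to the nearest training point of any other class."""
    return min(math.dist(proto, x) for x, y in data if y != label)


def figure_of_merit(proto, label, data):
    """Number of same-class training points strictly inside the zero-margin radius."""
    r0 = zero_margin_radius(proto, label, data)
    return sum(1 for x, y in data if y == label and math.dist(proto, x) < r0)


def train_stage(data, margin_pct):
    """One design stage: a (prototype, radius, label) sphere for each class present."""
    labels = sorted({y for _, y in data})
    if len(labels) < 2:
        return []  # no other class left, so no zero-margin radius can be defined
    spheres = []
    for label in labels:
        # Simplification: candidates are the class's own training points.
        candidates = [x for x, y in data if y == label]
        proto = max(candidates, key=lambda p: figure_of_merit(p, label, data))
        radius = (1 - 0.01 * margin_pct) * zero_margin_radius(proto, label, data)
        spheres.append((proto, radius, label))
    return spheres


def covered_by(x, spheres):
    """Label of the first sphere containing x, or None if no sphere contains it."""
    for proto, radius, label in spheres:
        if math.dist(proto, x) <= radius:
            return label
    return None


def train(data, max_stages, margin_pct):
    """Design up to max_stages stages; each stage trains on what earlier stages left over."""
    stages, remaining = [], list(data)
    for _ in range(max_stages):
        spheres = train_stage(remaining, margin_pct)
        if not spheres:
            break
        stages.append(spheres)
        remaining = [(x, y) for x, y in remaining if covered_by(x, spheres) is None]
        if not remaining:
            break
    return stages


def classify(x, stages):
    """Try the stages in order; return the first label found, or None (unclassified)."""
    for spheres in stages:
        label = covered_by(x, spheres)
        if label is not None:
            return label
    return None


if __name__ == "__main__":
    data = [((0, 0), "a"), ((1, 0), "a"), ((0, 1), "a"), ((2, 2), "a"), ((8, 8), "a"),
            ((10, 10), "b"), ((11, 10), "b"), ((10, 11), "b"), ((3, 3), "b")]
    stages = train(data, max_stages=3, margin_pct=20)
    for point in [(0.5, 0.5), (10.5, 10.5), (7.5, 7.5), (20, -20)]:
        print(point, classify(point, stages))
```

In this toy data set the two outliers, (8, 8) and (3, 3), are left unclassified by the first stage and are picked up by the second, while a point far from every prototype remains unclassified, which is the third of the three outcomes listed in the appendix.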