CAN UNCLASSIFIED. Jeffery K. Hovis. Contract Report. September 2017


CAN UNCLASSIFIED

Assessment of the Next Generation of Colour Vision Tests for Pilots and Aircrew

Jeffery K. Hovis
Ali Almustanyir

Prepared By: School of Optometry and Vision Science, University of Waterloo
PWGSC Contract Number: W
Contractor's date of publication: September 2017
Technical Authority: Mackenzie G. Glaholt, x2137
Terms of Release: This document is approved for Public release.

Defence Research and Development Canada
Contract Report DRDC-RDDC-2017-C191
September 2017

IMPORTANT INFORMATIVE STATEMENTS

Disclaimer: This document is not published by the Editorial Office of Defence Research and Development Canada, an agency of the Department of National Defence of Canada but is to be catalogued by DSTKIM, the national repository for Defence S&T documents. Her Majesty the Queen in Right of Canada (Department of National Defence) makes no representations or warranties, express or implied, of any kind whatsoever, and assumes no liability for the accuracy, reliability, completeness, currency or usefulness of any information, product, process or material included in this document. Nothing in this document should be interpreted as an endorsement for the specific use of any tool, technique or process examined in it. Any reliance on, or use of, any information, product, process or material included in this document is at the sole risk of the person so using it or relying on it. Canada does not assume any liability in respect of any damages or losses arising out of or in connection with the use of, or reliance on, any information, product, process or material included in this document.

This document was reviewed for Controlled Goods by Defence Research and Development Canada (DRDC) using the Schedule to the Defence Production Act.

Her Majesty the Queen in Right of Canada (Department of National Defence), 2017
Sa Majesté la Reine en droit du Canada (Ministère de la Défense nationale), 2017

Assessment of the Next Generation of Colour Vision Tests for Pilots and Aircrew

Jeffery K. Hovis
Ali Almustanyir
School of Optometry and Vision Science
University of Waterloo
Waterloo, ON

Submitted September 1, 2017

Technical Report prepared under PWGSC Contract Number: W
Canadian Institute for Military and Veteran Health Research
Task 12: Colour Vision Assessment and Operational Requirements for Military Aircrew. Deliverable Reference 6; Task 5.a.4.

Scientific Authority: Mackenzie G. Glaholt
Defence Research and Development Canada
1133 Sheppard Avenue W.
Toronto, ON, M3K 2C x2137
Mackenzie.Glaholt@forces.gc.ca

Disclaimer: The scientific or technical validity of this Contract Report is entirely the responsibility of the Contractor and the contents do not necessarily have the approval or endorsement of the Department of National Defence of Canada.

Her Majesty the Queen in Right of Canada, as represented by the Minister of National Defence, 2017
Sa Majesté la Reine (en droit du Canada), telle que représentée par le ministre de la Défense nationale, 2017

List of Abbreviations:

LandC (MT): Monocular Threshold Landolt C
LandC (MS): Monocular Screening Landolt C
LandC (B): Binocular Landolt C
LandC: Landolt C
CAD (MS): Monocular Screening CAD
CAD (B): Monocular CAD
CAD: Colour Assessment and Diagnosis test
SNU
CCT (Area): (Cambridge) Binocular Cone Contrast Test using the area of the ellipse
CCT (Angle)/CCT (Ang): (Cambridge) Binocular Cone Contrast Test using the angle of orientation of the ellipse
CCT (B): Binocular Cambridge Cone Contrast Test
CCT Ell: Binocular Ellipse of the Cambridge Cone Contrast Test
CCT Tri: Monocular Trivector of the Cambridge Cone Contrast Test
AC1: Gwet's Agreement Coefficient 1
SPP2: Standard Pseudoisochromatic Plates Part 2
HRR: Hardy, Rand, Rittler pseudoisochromatic plates
RCCT: Rabin Cone Contrast Test
HWA: Holmes-Wright A Lantern
PreF: Predictive Fail
PreP: Predictive Pass
PAPI: Precision Approach Path Indicator
RCAF: Royal Canadian Air Force
USAF: United States Air Force
C index: Confusion Index
S index: Scatter Index
PF: Pass Fail
FP: Fail Pass
RG: Red-green
BY: Blue-yellow
ROC curve: Receiver Operating Characteristic curve
M P B F: Monocular Pass Binocular Fail
M F B P: Monocular Fail Binocular Pass
nm: nautical miles

ABSTRACT

The Royal Canadian Air Force (RCAF) revised the colour testing protocol approximately 25 years ago; however, most of the colour vision tests used in the test battery were developed either prior to, or during, WWII. With the emergence of affordable computer-based colour vision tests, the question arises as to whether the current suite of colour vision tests reflects best practices. Determining the clinical utility of these newer computer tests is the focus of this report.

Methods. The tests evaluated were the Ishihara (38-plate edition), Standard Pseudoisochromatic Plates Part 2 (SPP2), Farnsworth-Munsell D15 (D15), Holmes-Wright Lantern Type A (HWA), Hardy, Rand, Rittler 4th edition (HRR), ColorDx Extended Adult Screening Series, ColorDx D15, Colour Assessment and Diagnosis (CAD), Rabin Cone Contrast (RCCT), Cambridge Colour Vision Test (CCT), and Landolt C Cone Contrast (LandC).

Results. With respect to detecting red-green colour vision defects, the majority of tests had a very high level of agreement with the gold standard Rayleigh colour match. The two exceptions were the LandC and CAD colour vision screening tests. The false positive rates for these two tests ranged from 15% to 40%. The CCT misclassified a large percentage of protan colour defectives as deutans, whereas the other threshold tests had excellent agreement with the anomaloscope as to the type of red-green defect. The agreement between most tests with respect to normal versus abnormal colour vision was very good. However, if the criterion is relaxed on one test to allow individuals with a mild red-green colour vision defect to pass, then the agreement between tests decreases. In terms of detecting blue-yellow defects, two problems in using the Moreland colour match as the gold standard confounded the determination of the tests' clinical validity in screening for blue-yellow defects. The first was that subjects had a difficult time making a colour match because of the saturation differences between the mixture of primaries and the reference light. The second was the unexpected finding that colour defectives had a different midpoint than colour normals. Nevertheless, because of the low frequency of blue-yellow defects, most tests had a high level of agreement with the Moreland equation as to the absence of a blue-yellow defect. Again, the two exceptions were the LandC and CAD colour vision screening tests. With respect to monocular versus binocular viewing, testing each eye individually increases the number of false positives, but the proportion of false positives is usually less than 8%.

Conclusions. The Ishihara and SPP2 could be replaced by the HRR, RCCT or ColorDx Extended Adult Screening Series. However, the entire testing protocol could be replaced by the ColorDx Screening Series and ColorDx D15. The CAD would be another option to consider if one wishes to replace the D15 and have a quantified colour discrimination metric. The LandC and CAD may not be interchangeable when the criterion on one computer test is relaxed to allow more colour defectives to pass. This suggests that correlations between tests and real-world colour-related tasks may be highly dependent on the test and task if some colour defectives can carry out the task successfully.

Table of Contents

1. Introduction
  Colour vision deficiency
  Congenital colour vision deficiency
  Acquired colour vision deficiency
  Study Scope
Methods
  Testing Sequence
  Individual Test Procedures
  Anomaloscope
  Pseudoisochromatic Plate Tests
  Ishihara
  HRR
  SPP2
  ColorDx Extended Adult Test
  Cambridge Colour Vision Test
  Rabin Cone Contrast Test
  CAD Test
  Landolt C
  Holmes-Wright Type A Lantern
  D15 Arrangement Tests
  Farnsworth-Munsell D15
  ColorDx D15
  Subjects
Results
  Anomaloscope
  Rayleigh Equation
  Moreland Match
  Screening Tests
  Agreement with the Anomaloscope
  Red-Green Defects
  Blue-Yellow Defects
  Monocular Failures
  Repeatability
  Monocular Threshold Tests
  Agreement with the Anomaloscope
  Red-Green Defects
  Blue-Yellow Defects
  Monocular Failures
  Repeatability
  Binocular Threshold Tests
  Agreement with the Anomaloscope
  Red-Green Defects
  Blue-Yellow Defects
  Repeatability
  Monocular and Binocular Agreement
  Comparison with the D15
  Comparison with the Holmes-Wright Type A (HWA)
  Comparison with the CAD Pilot Criterion
  Time to complete
Discussion
  Detecting Red-Green Colour Vision Defects
  Detecting Blue-Yellow Colour Vision Defects
  Monocular vs Binocular Testing
  Agreement between monocular and binocular threshold
  Comparisons with the D15
  Comparisons with the HWA
  Comparison with the CAD pilot criterion
Conclusions
References
I. Appendix 1. Description and review of colour vision tests evaluated in this study
II. Appendix 2. Screening questionnaire
III. Questionnaire
IV. Appendix 3. Cut-off scores that maximize the sensitivity for failing the Farnsworth D15

1. Introduction

Military aviation has undergone an amazing evolution, from the first Wright Brothers military aircraft purchased by the United States Army in 1909 to fifth-generation fighter aircraft and unmanned aerial vehicles (US Army Aviation Museum, 2017). Although there have been major technological advancements related to flight, vision testing of aircrew and aircrew candidates remains rooted in testing equipment and protocols developed 70 to 100 years ago. One example is colour vision testing. The Royal Canadian Air Force (RCAF) did revise the testing protocols approximately 25 years ago; however, most of the tests used to evaluate colour vision were developed either prior to, or during, WWII. Given the emergence of affordable computer-based colour vision tests, the question is raised as to whether the current suite of colour vision tests reflects best practices. Determining the clinical utility of these newer tests is the focus of this report.

The current RCAF testing protocol is to screen the colour vision of each aircrew candidate using the Ishihara 38-plate edition colour vision test (Ishihara) and the Standard Pseudoisochromatic Plates Part 2 (SPP2). The Ishihara test is a very effective screening test for identifying individuals with congenital red-green defects (NRC, 1981; Birch & McKeever, 1993; Birch, 1997; Birch, 2010). The SPP2 test was added to confirm the presence of a red-green colour vision defect and to screen for acquired blue-yellow colour vision defects. Colour vision defects that are a result of a disease or disorder tend to be blue-yellow (Pokorny et al., 1979). The SPP2 can also screen for congenital blue-yellow defects, but these are extremely rare (Delpero et al., 2005). Both tests are administered to each eye separately. If the candidate passes both tests, then they are classified as CV1 (colour normal). Individuals who fail either screening test are assessed further with the Farnsworth-Munsell D15 (D15). This test is designed to determine whether colour-defective individuals can distinguish colours that differ by an easily noticeable amount. Colour-defective individuals who pass the D15 are classified as CV2 (colour-defective, but safe) and are considered to have adequate colour discrimination to carry out colour-related tasks in aviation. Colour defectives who fail the D15 are classified as CV3 (colour-defective, unsafe) and are disqualified from aircrew positions. Additional testing may occur depending on the medical profile of the candidate. For example, a candidate who fails just the blue-yellow screening test will likely be assessed further to determine whether the defect is acquired or congenital, regardless of the results on the D15 test.

This report provides a brief review of the types of colour vision deficiencies before describing the current study.

Colour vision deficiency. A colour vision deficiency is defined as having difficulty distinguishing colours, having colour matches that are outside the normal range, or both. The severity of the deficiency can range from mild to severe depending on the nature of the underlying problem. Colour vision deficiencies are divided into congenital and acquired colour vision deficiencies (Pokorny et al., 1979). Congenital defects are inherited and are independent of any other problem with a person's visual system. The severity of the defect remains relatively stable throughout life. Acquired colour vision deficiencies are associated with an underlying disease or disorder. The underlying problem may be inherited, but some other visual function, such as visual acuity, is also affected. Acquired colour vision defects can progress, or regress, depending on the underlying condition. Acquired defects are less prevalent than congenital colour vision defects in the young adult population (Delpero et al., 2005; Schneck et al., 2014).

Congenital colour vision deficiency. Congenital colour vision deficiencies are classified as either red-green or blue-yellow based on the colours that are likely to be confused. Red-green defects are X-linked recessive and are relatively common, affecting approximately 8% of Caucasian males and 0.5% of Caucasian females. The prevalence is lower in other ethnic groups (Delpero et al., 2005; Birch, 2012). Congenital blue-yellow defects are autosomal dominant with incomplete penetrance (Deeb, 2004). These defects are very rare, with a prevalence of approximately 0.1% in the general population (Pokorny et al., 1979; Delpero et al., 2005). Individuals with a red-green defect confuse colours along the red-green colour axis (red, orange, yellow and green), whereas individuals with a blue-yellow defect confuse colours along the blue-yellow colour axis (more accurately described as violet, grey and yellow-green).

Within these two broad categories, the defect is divided into dichromatic and anomalous trichromatic based on the number of primary colours required to match a coloured light. Dichromats require only two primaries to make a match. Anomalous trichromats require three primaries, but the amounts of the primaries are different from those used by the colour normal population. On average, the colour discrimination of dichromats is worse than the colour discrimination of anomalous trichromats (Pokorny et al., 1979).

Red-green colour defectives are classified further based on whether the M cone or L cone photopigment is missing, or replaced by other photopigments that have spectral absorption characteristics intermediate between the colour normal M and L cones. Deuteranomalous individuals have an altered M cone photopigment that has a peak absorption closer to the L cone. Deuteranopes are missing the M cone photopigment. Protanomalous individuals have an altered L cone photopigment that has a peak absorption closer to the M cone. Protanopes are missing the L cone photopigment (Deeb, 2004). The dichromatic blue-yellow defect is referred to as tritanopia and the blue-yellow anomalous trichromatic defect is referred to as tritanomaly. In both cases, there is a problem with the S cones. In tritanopia, the S cone pigment is non-functioning, whereas in tritanomaly there are partially functioning S cones (Deeb, 2004).

Because it is usually not possible to perform colour matches in the clinic, anomalous trichromats usually cannot be distinguished from dichromats there. For that reason, deuteranomalous individuals and deuteranopes are classified clinically as deutans, protanopes and protanomalous individuals are classified as protans, and tritanopes and tritanomalous individuals are classified as tritans. Many colours confused by protan individuals are similar to the colours confused by deutan individuals (Pokorny et al., 1979). One of the major differences between these two subtypes is that protans have a decreased brightness sensitivity to red lights, whereas deutans have a normal brightness sensitivity to red lights.

Acquired colour vision deficiency. Acquired colour vision deficiencies are more common in the elderly because visual disorders increase with age (Schneck et al., 2014). Glaucoma, cataract, macular degeneration and diabetes are common causes of acquired colour vision defects (Pokorny et al., 1979). There are three types of acquired colour vision defects. The most common type is the Type III blue-yellow defect. This defect is associated with macular degeneration, glaucoma, diabetes, nuclear cataract, and some optic nerve disorders (Pokorny et al., 1979). The second type is the Type I red-green defect. This type of defect is usually a result of a photoreceptor/retinal pigment epithelium disease. Individuals with this type of defect tend to require more red in a mixture of red plus green to match a yellow reference. The third type is the Type II acquired red-green defect. It is associated with optic nerve diseases such as optic atrophies and optic neuritis (Pokorny et al., 1979). Patients with this type of defect tend to require more green in a mixture of red plus green to match a yellow reference.

Study scope. A number of different colour vision tests are evaluated in this study. These tests are listed in Table 1. The main questions addressed are:
a) the utility of each test to screen for colour vision defects. This is determined based on the test's specificity (ability to identify colour normals correctly), sensitivity (ability to identify colour defectives correctly), overall agreement with the gold standard anomaloscope test, and time to complete the test.
b) repeatability of the tests in terms of pass/fail.
c) agreement between monocular and binocular test outcomes for certain tests.
d) agreement with current RCAF colour vision tests.
e) agreement with other occupational-based tests.
This information will be used to help select the next generation of colour vision tests used by the RCAF.

Table 1. Colour vision tests selected for this study and status.

Test                                              Status
------------------------------------------------  ------------------------------------------------
Ishihara (38-plate edition)                       Current Canadian Forces screening test
Standard Pseudoisochromatic Plates Part 2 (SPP2)  Current Canadian Forces screening test
Farnsworth-Munsell D15 (D15)                      Current Canadian Forces and Transport Canada Civil Aviation functional test
Holmes-Wright Lantern Type A (HWA)                Transport Canada Civil Aviation and Maritime alternative functional test
Hardy, Rand, Rittler 4th edition (HRR)            California Peace Officer Standards and Training colour vision test
ColorDx Extended Adult Screening Series           Recently released computerized screening test
ColorDx D15                                       Recently released computerized D15
Colour Assessment and Diagnosis (CAD)             United Kingdom Civil Aviation Authority functional colour vision test
Rabin Cone Contrast (RCCT)                        United States Air Force colour vision test
Cambridge Colour Vision Test (CCT)                Primarily used for clinical vision research
Landolt C Cone Contrast (LandC)                   Under development by the United States Air Force
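The screening metrics named in the study scope (sensitivity, specificity and overall agreement with the anomaloscope) can be illustrated with a minimal Python sketch. The class name, the function names and the counts are hypothetical and are not taken from the report. Overall agreement is shown here as a raw proportion, which is a simplification and is not the Gwet's AC1 coefficient listed in the abbreviations.

```python
from dataclasses import dataclass


@dataclass
class ScreeningCounts:
    """Outcomes of one screening test against the anomaloscope (hypothetical layout)."""
    true_pos: int   # colour defective on the anomaloscope, failed the test
    false_neg: int  # colour defective on the anomaloscope, passed the test
    true_neg: int   # colour normal on the anomaloscope, passed the test
    false_pos: int  # colour normal on the anomaloscope, failed the test


def sensitivity(c: ScreeningCounts) -> float:
    """Proportion of colour defectives that the test identifies correctly."""
    return c.true_pos / (c.true_pos + c.false_neg)


def specificity(c: ScreeningCounts) -> float:
    """Proportion of colour normals that the test identifies correctly."""
    return c.true_neg / (c.true_neg + c.false_pos)


def overall_agreement(c: ScreeningCounts) -> float:
    """Raw proportion of subjects on whom the test and the anomaloscope agree."""
    total = c.true_pos + c.false_neg + c.true_neg + c.false_pos
    return (c.true_pos + c.true_neg) / total


# Made-up counts, for illustration only.
example = ScreeningCounts(true_pos=45, false_neg=5, true_neg=90, false_pos=10)
print(sensitivity(example), specificity(example), overall_agreement(example))
```

The false positive rate quoted for a screening test is one minus its specificity, so a specificity of 0.90 corresponds to a 10% false positive rate.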

2. Methods

Table 1 lists the tests included in this study and whether they are used currently as a colour vision test, or are under development. Appendix 1 contains a description of each test evaluated in the study.

Testing Sequence. The study started with an overview of the research project and obtaining informed consent. Next, the subject filled in a questionnaire (Appendix 2) to determine if they met the inclusion criteria. Individuals who reported problems with colours were asked additional questions about their colour vision. These results will not be included in this study. Monocular distance visual acuities were measured using a Bailey-Lovie logMAR chart. Acuities at 100 cm and 40 cm were measured using a Reduced Snellen letter acuity chart.

The general procedure was a repeated-measures, within-subject design. The colour vision tests were arranged into 10 different stations. These were the LandC, CCT, printed pseudoisochromatic plates, CAD, anomaloscope (alternating between the Moreland equation and Rayleigh equation as the first test), D15, HWA, ColorDx D15, ColorDx Screening Series and RCCT. The order of the test stations was determined using a predetermined random block design. The order of the printed pseudoisochromatic plates was also determined by a random block design. Because of the large number of permutations, no two subjects had the exact same testing sequence. However, within a station, the sequence of monocular and binocular testing was identical across subjects. Except for the anomaloscope, if the test was viewed monocularly, then the right eye was tested before the left eye. Our preference would have been to alternate which eye was tested first, but several of the programs only allow a right eye, left eye testing sequence, and so we adopted that testing order for consistency. If the test was viewed both monocularly and binocularly, then monocular testing was performed first. The tests were administered in the reverse order at the second session.

Individual Test Procedures. Unless otherwise stated, illuminance values were measured using a Minolta T-1 Illuminance Meter and luminances were measured using the Minolta CS-100 spot meter (Konica Minolta, Ramsey, NJ).

Anomaloscope. The Oculus HMC anomaloscope (OCULUS Optikgeräte GmbH, Wetzlar, Germany) tests for red-green defects using the Rayleigh equation, and for blue-yellow defects using the Moreland equation. Figure 1 shows the anomaloscope. The subject views the stimulus through the eyepiece. The stimulus consists of two adjacent semicircles. The top one displays the mixture of the two primaries. The top dial controls the ratio of the two colours in the mixture. The bottom semicircle displays the reference stimulus, and the bottom dial controls its brightness. The diameter of the circular field is 2 degrees for the Rayleigh match. A magnifying lens was added to the eyepiece for the Moreland equation in order to increase the field size to 4 degrees. The experimenter can also control the anomaloscope settings through a computer interface (Oculus, ver ). Figure 2 shows the computer display for controlling the Rayleigh match and Figure 3 shows the control screen for the Moreland match. Subjects viewed the test monocularly starting with their preferred eye. For both tests, the stimulus was presented for 5 seconds followed by a white adaptation field for 3 seconds in order to maintain a neutral adaptation state.

Figure 1. The anomaloscope. The black arrow points toward the top dial, which controls the colour of the top field; the bottom dial controls the brightness of the bottom field.

Figure 2. Screenshot of the control window for the Rayleigh test.

Figure 3. Screenshot of the control window for the Moreland test.

In principle, the subject's task is to adjust the relative amounts of the two primaries in the top half and the brightness of the bottom reference field so that the two halves of the circle look identical in colour. This procedure was not followed because colour defectives tended to make highly variable settings. The actual procedure for the Rayleigh match was the following. The initial scale settings used in the procedure followed the accompanying instructions; however, the values used for the final classification of the defect were based on our colour normal data, as will be described later.

The participant was presented with a typical colour normal match. (For colour normals and some dichromats, the two fields could look identical.) This presentation familiarized the subjects with the stimulus and task. Next, subjects viewed different mixtures of the red and green primaries. The mixtures varied from 100% green (0 on the scale) to 100% red (70 on the scale) in 10-unit steps. Subjects were asked to match the brightness of the top field by adjusting the brightness of the bottom reference light. After each brightness match, they were asked whether the top and bottom fields looked identical in both hue and brightness. There were three possible outcomes.

First, the subject reported that there was a colour match for all presentations (the full range of mixture settings was accepted). This result indicated that the subject was a red-green dichromat. Protanopia and deuteranopia were diagnosed based on the brightness match to the top field when it was set at 70 units. Protanopes adjust the brightness of the bottom yellow light to a value of 10 or less on the yellow brightness scale. Deuteranopes have a brightness match setting that is within the range of colour normals (i.e. 15 to 25).

Second, the subject matched at least one setting. These subjects were diagnosed as trichromats. Normal versus anomalous trichromacy was diagnosed by determining the range of acceptable matches. The lower limit of the range was determined by decreasing the mixture of the primaries in the top field in steps of 1 unit from either the match point for normals or the lower setting point of the acceptable matches for the anomalous trichromat. The subject adjusted the yellow light's brightness to make a brightness match and was then asked if the two halves looked identical. This process continued until the colours no longer matched. The upper limit of the range was then determined using the same procedure. For trichromats, the diagnosis depended on the actual range of acceptable matches. If the range of acceptable matches was below 35, or the extent of the matches below 35 was greater than the extent in the normal or protanomalous range, then the person was diagnosed as deuteranomalous and the severity was determined by the width of the acceptable match range. If the range of acceptable matches was above 45, or the extent of the matches above 45 was greater than the extent in the normal or deuteranomalous range, then the person was diagnosed as protanomalous and the severity was determined by the width of the acceptable match range. If the range was between 35 and 45, or the extent of the matches was greatest in this range, then the person was diagnosed as a normal trichromat.

Third, if the subject could not make a match, then they were asked to adjust the red-green values to make a match to the yellow reference. Next, they adjusted the brightness of the yellow field to match the top field, and repeated these steps until the two fields looked identical. If the match was in the normal range (41±6), subjects were diagnosed as normal. The value of 41 is the mean midpoint (rounded to the nearest integer) for the colour normals in this study and the ±6 is 3 standard deviations from the mean.
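The Rayleigh decision rules described above can be sketched in Python as follows. This is an illustrative reading of the text, not the authors' software. The function names are hypothetical, and three choices are assumptions that the report does not state: how to treat a single match point, how to break ties between equal extents, and how to label a dichromat whose brightness setting falls between 10 and 15.

```python
GREEN_LIMIT = 35.0  # matches below this value lie in the deuteranomalous region
RED_LIMIT = 45.0    # matches above this value lie in the protanomalous region
SCALE_MIN, SCALE_MAX = 0.0, 70.0  # 100% green to 100% red, as described in the text


def _overlap(lo: float, hi: float, a: float, b: float) -> float:
    """Length of the intersection of the intervals [lo, hi] and [a, b]."""
    return max(0.0, min(hi, b) - max(lo, a))


def classify_trichromat(lo: float, hi: float) -> str:
    """Classify a trichromat from the limits of the acceptable match range."""
    if not (SCALE_MIN <= lo <= hi <= SCALE_MAX):
        raise ValueError("expected 0 <= lo <= hi <= 70")
    if lo == hi:
        # Assumption: a single match point is classified by where it falls.
        if lo < GREEN_LIMIT:
            return "deuteranomalous"
        if lo > RED_LIMIT:
            return "protanomalous"
        return "normal trichromat"
    below = _overlap(lo, hi, SCALE_MIN, GREEN_LIMIT)
    normal = _overlap(lo, hi, GREEN_LIMIT, RED_LIMIT)
    above = _overlap(lo, hi, RED_LIMIT, SCALE_MAX)
    largest = max(below, normal, above)
    # Assumption: ties resolve toward normal, then deuteranomalous.
    if normal == largest:
        return "normal trichromat"
    if below == largest:
        return "deuteranomalous"
    return "protanomalous"


def classify_dichromat(yellow_brightness_at_70: float) -> str:
    """Classify a subject who accepted the full mixture range as a match."""
    if yellow_brightness_at_70 <= 10:
        return "protanope"
    if 15 <= yellow_brightness_at_70 <= 25:
        return "deuteranope"
    # Assumption: the text gives no rule for other brightness settings.
    return "unclassified dichromat"


print(classify_trichromat(38, 44))
print(classify_trichromat(20, 50))
print(classify_dichromat(8))
```

The width of the match range, `hi - lo`, is the quantity the text uses to grade the severity of an anomalous trichromat; the grading bands are not given in this section, so the sketch does not assign them.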

If the match was in the green area (0 to 35 on the scale) or in the red area (45 to 70 on the scale), the subject was diagnosed as an anomalous trichromat (deuteranomalous or protanomalous, respectively). The range of acceptable matches was then measured as described previously.

For the Moreland test, the experimenter placed the mixture of the violet and green primaries slightly outside the Normal area outlined on the control screen. The subject's task was to adjust the top field to match the bottom field as closely as possible, then adjust the brightness of the bottom field to make a brightness match, and repeat until the two fields looked nearly identical. This was repeated 4 times. The first match was discarded and the average of the remaining 3 was calculated. There was never a true colour match because the bottom field always appeared more saturated relative to the top field. The range of acceptable matches was found by a bracketing technique. The examiner set the blue-green ratio either below or above the match point by 10 units. The subject was asked to adjust the brightness of the bottom field to match the top field and report whether the two fields were an acceptable match. Depending on their response, the blue-green ratio was set 5 units closer to the colour match or 5 units further away. This was repeated in successively smaller steps as the experimenter converged on the range of acceptable matches. The same procedure was used to find the range of acceptable matches on the opposite side of the match point. Diagnosis was based on the normal settings, which will be presented later.

The range of the nonpreferred eye was tested to determine if there was an appreciable difference between eyes. This was done by setting the red-green or blue-green ratio to the value at each end of the range of the preferred eye. The subject was asked to make a brightness match. If the two fields were identical after the brightness match, then the ratio was changed by 5 units to determine if the range was larger and the procedure was repeated. If there was no match, then the ratio was changed by 5 units to determine if the range was smaller. A change in the range of more than 10 units in total was considered an important difference.

Pseudoisochromatic Plate Tests. All printed pseudoisochromatic plate tests were illuminated with an Illuminant C fluorescent lamp (X-Rite, Grand Rapids, MI). The illuminance on the tests was 1400 lux (±5%) in the horizontal plane. All tests were viewed from approximately 60 cm. Subjects were allowed to view each page of the Ishihara, HRR and SPP2 for approximately 5 sec before the next plate was presented. All pseudoisochromatic plates were viewed monocularly. Because the ColorDx test scores each response as correct or incorrect, any response that was different from the normal response was recorded as an error on the other pseudoisochromatic plate tests. Table 2 summarizes the failure and classification criteria for the individual pseudoisochromatic plate tests. The next sections provide more information about the testing procedures.

Ishihara. The first 25 plates were administered to all subjects. This included the demonstration, transformation, vanishing, hidden, and diagnostic plates; however, pass/fail was based on the number of errors on the transformation and vanishing figures, and the diagnostic responses were only evaluated if the subject failed the screening series (Birch, 1997). The failure criterion was 3 or more errors on the screening plates. If two figures were reported on the diagnostic plates, then the subject was asked which was more distinct. Classification was based on the column (protan vs deutan) with the most errors or the majority of the less distinct figures.

HRR. Subjects were informed that there could be two, one, or no geometric figures on each page. All the plates were presented to each subject. Any error on the blue-yellow screening plates indicated a blue-yellow defect, and more than one error on the red-green screening plates was a red-green failure. Classification was based on the HRR score sheet shown in Figure 4. The type of red-green defect was based on the column with the fewest errors. The severity of the defect was determined by how far the subject progressed through the diagnostic plates before there were no errors. An individual who only made errors on the red-green screening plates and none on the diagnostic plates was classified as very mild. Errors on plates 11-15, but none on the rest of the diagnostic plates, resulted in a mild classification. A subject who made errors on plates 11-18, but none on plates 19 and 20, was classified as having a medium (or moderate) defect, and errors on the remaining diagnostic plates resulted in a severe classification.

Figure 4. HRR score sheet.
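The HRR red-green severity grading can be sketched as a short Python function. This is a hedged illustration only: the function name and argument layout are hypothetical, and the rule that any error on plates 19 or 20 gives a severe grade is an assumption inferred from the moderate criterion rather than a rule quoted from the score sheet. The sketch also assumes that a subject who passes the screening plates is not graded.

```python
from typing import Iterable

MILD_PLATES = set(range(11, 16))      # plates 11-15
MODERATE_PLATES = set(range(16, 19))  # plates 16-18
SEVERE_PLATES = {19, 20}              # assumption: errors here give a severe grade


def hrr_red_green_severity(screening_errors: int,
                           diagnostic_error_plates: Iterable[int]) -> str:
    """Grade a red-green defect from HRR responses.

    screening_errors: number of errors on the red-green screening plates.
    diagnostic_error_plates: plate numbers (11-20) on which an error was made.
    """
    errors = set(diagnostic_error_plates)
    if not errors <= (MILD_PLATES | MODERATE_PLATES | SEVERE_PLATES):
        raise ValueError("red-green diagnostic plates are numbered 11 to 20")
    if screening_errors <= 1:
        # Failure requires more than one error on the screening plates.
        return "pass"
    if errors & SEVERE_PLATES:
        return "severe"
    if errors & MODERATE_PLATES:
        return "moderate"
    if errors & MILD_PLATES:
        return "mild"
    return "very mild"


print(hrr_red_green_severity(3, []))        # screening errors only
print(hrr_red_green_severity(4, [11, 17]))  # errors reach plates 16-18
```

Checking the most advanced plate group first mirrors the description that severity depends on how far the subject progresses through the diagnostic plates before making no errors.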

19 Colour Assessment SPP2. Subjects were informed that there could be two, one, or no numbers on any test plate. Any error on a red green test figure, including the comparison figure on the 3 rd test plate (Hovis, et al, 1990), was a failure. Any blue yellow error beyond missing the one faint blue yellow test figure on the first test plate was a failure as was any scotopic error ColorDx Extended Adult Test. Version 5.0 (Konan Medical, Irvine, CA) was installed on a Microsoft Surface Pro (Model number 1631) with Windows 10 operating system (Microsoft, Bellevue, WA). The program can be installed on any Mac, Android or PC device. The Surface Pro was selected because the United States Navy was considering evaluating the ColorDx using the Surface Pro at the time of this study 1. The viewing distance was approximately 40 cm away from the subject in a dim room (1 lux). The monitor was calibrated using a Spyder colorimeter (ver 4.5.4; Datacolor, Lawrenceville, NJ) to a white reference of 6500 K correlated colour temperature. This calibration was carried out every 30 days. The test image was presented within a white background that a luminance of 90 cd/m 2. The average luminance of the grey background for the diagnostic plates was 46 cd/m 2. Each test figure was presented for 2 sec within a white background. After the number disappears, the subject enters the number that they saw or N if they did not see a figure using the keyboard or the touchscreen. They have 12 seconds to enter their response. The red green screening plates are presented before the tritan test plates. If the subject failed the red green screening portion, then the red green diagnostic series follows the blue yellow test plates. The order of the red green screening figures is random. The blue yellow screening/diagnosis figures and the red green diagnostic plates start with the most saturated colours. The saturation progressively decreases with subsequent presentations. 
A failure on the blue-yellow screening plates was 3 or more errors, and a failure on the red-green screening plates was 5 errors. The type and severity of the red-green defect were based on the maximum number of errors made on the protan and deutan diagnostic plates. The severity of the blue-yellow defect was also based on the total number of errors. Table 2 lists the number of errors used to classify the severity of the defect.

1 Capt. Matthew Ringer, personal communication.

Table 2. Scoring criteria for the pseudoisochromatic plate tests.

Ishihara (38 plate)
  Failure criterion: >3 errors on plates 1 to 17
  Classification criterion for red-green defects: Majority of errors or less distinct figures on the diagnostic plates in protan or deutan columns
  Severity criterion: NA

HRR 4th edition
  Failure criterion: Blue-yellow: any error on screening plates. Red-green: >1 error on screening plates; no errors on the diagnostic plates
  Classification criterion for red-green defects: Fewest number of errors in protan or deutan columns
  Severity criterion, red-green:
    Very mild: Any red-green error on the screening plates and no errors on plates
    Mild: Any error on plates and no errors on the rest of diagnostic plates
    Moderate: Any error on plates and no errors on plates 19 and 20
    Severe: Any error on plates
  Severity criterion, blue-yellow:
    Mild: Any blue-yellow error on the screening plates and no errors on plates
    Moderate: Any error on plates and no errors on plates 23 and 24
    Severe: Any error on plates

SPP2
  Failure criterion: Any error on red-green or scotopic test figures. Any blue-yellow error other than missing the faint figure on the first test plate
  Classification criterion for red-green defects: NA
  Severity criterion: NA

ColorDx
  Failure criterion: >3 errors on the blue-yellow plates or >5 errors on the red-green plates
  Classification criterion for red-green defects: Majority of errors on protan vs deutan diagnostic plates
  Severity criterion, red-green (defect with the majority of errors):
    Mild: >5, but less than 17 errors
    Moderate: >17, but <28 errors
    Severe: >28 errors
  Severity criterion, blue-yellow:
    Mild: >4, but less than 7
    Moderate: >7, but less than 9
    Severe: >9

Cambridge Colour Vision Test. The CCT program (ver 2.3; Cambridge Research Systems, Ltd) was installed on a PC (ASUS Intel Pentium 4 CPU 3.00 GHz) with the Windows XP operating system. The test figure was presented on a 21 inch CRT monitor (SONY model GDM F520). The monitor was calibrated every 30 days using a ColorCAL colorimeter (Serial # ; Konica Minolta CO.LTD) to a white reference of 6500 K correlated colour temperature. The subject viewed the monitor from a distance of 313 cm. At this distance, the Landolt C subtended 4.3° and the gap subtended 1.0°. The luminance noise ranged from 8 to 18 cd/m². The Landolt C was presented for 4 seconds. The Trivector test was performed first. The discrimination ellipse was then measured using 8 colours that were spaced equally around a grey background in the CIE u′v′ chromaticity space. The room illumination was approximately 1 lux for both tests. Subjects indicated the location of the gap using one of 4 buttons on a response box. If they could not see the ring or identify the location of the gap, then they were instructed not to respond, and the next pattern would appear after a few seconds. Thresholds were measured by varying the average saturation of the C using a staircase procedure. The last 6 reversals were averaged to obtain the threshold value measured in the CIE u′v′ chromaticity diagram. Table 3 summarizes the scoring criteria for this test and the other computer-based tests capable of measuring thresholds. In the Trivector mode, thresholds that exceeded u′v′ units along the protan or deutan axis or u′v′ units along the tritan axis were considered a failure. Classification of the defect was based on the highest threshold value. The pass/fail values for the elliptical parameters were determined using the receiver operating characteristic (ROC) curve analysis described in the Results section.

Rabin Cone Contrast Test. The RCCT was the commercial version supplied by INNOVA Systems (Burr Ridge, IL).
It was displayed on an eleven inch Acer laptop (Model number: Q1VZC) using the Windows 7 operating system. The monitor was calibrated using a Spyder colorimeter (Express ver 4.5.4) to a white reference of 6500 K correlated colour temperature. This version of the program required calibration on a weekly basis. The luminance of the grey background was 19.3 cd/m². The test was viewed from approximately 60 cm in dim (1 lux) room lighting. The test began by having the subject position themselves and/or the screen so that the screen was perpendicular to their line of sight. A 9 cm viewing tube that had a dot at each end was used to align the screen. Subjects viewed a target on the screen through the tube and adjusted their position and/or the screen angle so that only one dot was superimposed on the alignment target on the screen. The program presented a single coloured letter in the center of the screen for 5 seconds. The subject used a mouse to select their response from the key displayed on the screen. The test started with a practice trial to ensure the participant understood the task. The order of testing was L cone sensitivity, M cone sensitivity and S cone sensitivity, with the right eye tested before the left eye. Each sensitivity measure started with the highest cone contrast and used a staircase procedure to obtain the subject's maximum sensitivity. Cone contrast sensitivity was measured on a relative scale of 0 to 100 for

each stimulus. One hundred is the highest cone sensitivity (lowest contrast) and 0 indicates that the subject could not see the maximum contrast. A sensitivity that is less than 75 for any cone type is abnormal. The type of red-green defect was based on the minimum cone sensitivity. The criteria for screening, classification, and severity of the defect are listed in Table 3.

CAD Test. The Z.X version of this test was installed on a Toshiba laptop (model number: TECRA R950 1EJ) with the Windows 8 Pro operating system. The stimuli were presented on a NEC monitor (242 W BK). The monitor was calibrated every 30 days using an LMT photometer (GOSSEN Germany). The average luminance of the grey background was 24 cd/m². At a viewing distance of 1.4 m, the side of the square stimulus subtended 1.6 degrees. The room illumination was dim (1 lux). This test also has two options. The first is a screening test for both red-green and blue-yellow defects. The second classifies and quantifies the severity of the defect. The subject's task for both tests was to identify the direction in which the square was traveling. The square could travel in one of 4 diagonal directions. To obtain the best results, the subject was instructed to maintain fixation on the center of the square and not to track the moving target. If they were unsure of the direction, then subjects were encouraged to make their best guess since a response was necessary to continue the test. If, for any reason, they needed to repeat the same presentation, the experimenter would click on the Represent button in the measurement window. We selected the air traffic controller protocol, which uses a very stringent pass/fail criterion. The test began with a practice session with an easily noticeable stimulus to make sure that subjects understood the task. Next, the screening test was administered. This test presented suprathreshold (~2 SNU) targets that could be confused with grey.
In order to pass, the percent correct had to be greater than 66.67% for all directions. However, the number of presentations for each direction varied according to the following algorithm. If the first three responses to a particular vector were correct, then the results were acceptable and the presentation of that stimulus ended. If two errors were made, then that was counted as a failure and the presentation of that stimulus ended. If one error was made within the first three presentations of an individual colour, then 3 more presentations were added for that colour. If the percentage correct was greater than 66.6% (i.e. only one mistake), then the result was a pass for that colour. If the percent correct was 66.6% or less on any colour, then the person failed the screening test. The program specifies whether the failure was red-green, blue-yellow or both. Although the threshold program essentially measures a discrimination ellipse, the threshold vectors near the protan and deutan axes were averaged, as were the vectors near the blue-yellow axis. These averages are expressed in SNU. Table 3 lists each criterion. We used the most stringent failure criterion and it was not age corrected. The program also classifies the defect as protan, deutan or unclassified red-green based on the directions that have the highest thresholds.
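The presentation rule for a single colour direction in the screening mode can be sketched as follows. This is a minimal illustration of the rule as described in the text, not the CAD program's own code; the function names are hypothetical, and stopping immediately at the second error is an assumption about how the presentation ends.

```python
def screen_direction(responses):
    """Apply the screening rule to one colour direction.

    `responses` holds booleans in presentation order (True = correct).
    Returns (passed, presentations_used).
    Assumption: the sequence stops as soon as a second error occurs.
    """
    errors = 0
    for shown, correct in enumerate(responses, start=1):
        if not correct:
            errors += 1
        if errors == 2:
            # Two errors: at best 4 of 6 correct (66.7%), which is a failure.
            return False, shown
        if shown == 3 and errors == 0:
            # First three responses all correct: accept and stop.
            return True, shown
        if shown == 6:
            # One error in six presentations (83% correct): pass.
            return True, shown
    raise ValueError("not enough responses to reach a decision")


def passes_screening(responses_by_direction):
    """The screening test is passed only if every direction is passed."""
    return all(screen_direction(r)[0] for r in responses_by_direction.values())


if __name__ == "__main__":
    print(screen_direction([True, True, True]))
    print(screen_direction([True, False, True, True, True, True]))
    print(screen_direction([True, False, True, True, False]))
```

Under this reading, a direction is decided after three, at most six, presentations, and a single failed direction fails the whole screening test.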

Landolt C. The Landolt C program (ver 1.1.0) was run on a desktop (Lenovo Intel CORE i5) with the Windows 7 Professional operating system. The stimulus was presented on a NEC monitor (Model 232 W BK). The monitor was calibrated every 30 days using an X-Rite (Version EODIS3 i1) Display Pro colorimeter. The luminance of the grey background was 69 cd/m². The current program is divided into a screening portion and a threshold portion. Ideally, if the person passed the screening portion, then there would be no need to measure the individual's thresholds. However, because the program is still under development, we measured the thresholds for each cone class regardless of the screening test results. The Ψ adaptive threshold procedure allows one to measure just the threshold with a fixed slope of the psychometric function, or to determine both the threshold and the psychometric function slope (Kontsevich & Tyler, 1999). If the slope is fixed, the threshold variability asymptotes near twenty presentations, but there is a risk that the threshold could be biased. If the slope is also a free parameter, then the program estimates both the threshold and slope, but minimizing the variability of the threshold estimate has the highest priority. Between 40 and 100 trials, the priority shifts to minimizing the variability of the slope estimate. After approximately 100 trials, it begins to minimize the variability of the threshold again (Kontsevich & Tyler, 1999). Based on the behavior of the adaptive procedure, the time required to run more than 30 trials for each cone contrast, and preliminary data from the USAF and our pilot studies, we elected to use a fixed slope of 2.6 for the L and M cone thresholds, a fixed slope of 1.9 for the S cone threshold, and 20 presentations for each cone threshold for the monocular trials. Both the slope and threshold were determined using 30 trials for the binocular trials.
The subject viewed the test from 1 m away from the screen, and their task was to press the keyboard direction key corresponding to the direction of the gap in the C. At this distance, the Landolt C subtended 1.4 degrees and the gap was 0.3 degrees. The room illumination was dim (1 lux). The subject performed this test 3 times in the following order: right eye, left eye, and binocularly. The order of presentations for all tests was randomized between the L cone, M cone, and S cone stimuli. Each monocular test began with the screening test. The Landolt C had a fixed contrast of 1.66 log contrast for the L cone and M cone and 0.55 log contrast for the S cone. These values are approximately 3 standard deviations from the USAF colour normal mean thresholds. There were 8 presentations of each contrast. Two or more errors on any cone stimulus was a failure. The failure criteria for the monocular threshold trials were greater than 1.65 log contrast for the L and M cones and 0.43 log contrast for the S cone. These are 3.6 standard deviations from the USAF colour normal means. The criteria for the binocular trials were greater than 1.8 log contrast for the L and M cones and 0.67 log contrast for the S cone.
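The angular sizes quoted for the computer-based tests follow from the viewing distance and the physical size of the stimulus. The sketch below uses the standard visual-angle relation, size = 2 d tan(θ/2); the function names are hypothetical and the physical sizes it returns are derived here for illustration, not taken from the report.

```python
import math


def stimulus_size(angle_deg, distance):
    """Physical size that subtends `angle_deg` at `distance` (same units as distance)."""
    return 2 * distance * math.tan(math.radians(angle_deg) / 2)


def visual_angle_deg(size, distance):
    """Visual angle in degrees subtended by `size` viewed from `distance`."""
    return math.degrees(2 * math.atan(size / (2 * distance)))


if __name__ == "__main__":
    # Landolt C test: 1.4 degree ring and 0.3 degree gap viewed from 100 cm.
    print(f"ring diameter: {stimulus_size(1.4, 100):.2f} cm")
    print(f"gap width:     {stimulus_size(0.3, 100):.2f} cm")
```

For the Landolt C test, this gives a ring of roughly 2.4 cm with a gap of roughly 0.5 cm at the 1 m viewing distance.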

Table 3. Scoring criteria for the various tests that are capable of measuring threshold.

Cambridge Trivector (monocular)
  Failure criterion: Trivector > u′v′ unit in the protan or deutan direction or > u′v′ unit in tritan direction
  Classification criterion for red-green defects: Direction with the highest threshold
  Severity criterion: Based on threshold values

Cambridge Ellipses
  Failure criterion: Red-green: Elliptical area >1.75*10 4, elliptical angle >128.8°. Blue-yellow: Elliptical area >2.62*10 4, elliptical angle >149°
  Classification criterion for red-green defects: Angle of ellipse
  Severity criterion: Magnitude of the area

Rabin Cone Contrast Test
  Failure criterion: Sensitivity value <75 for any cone type
  Classification criterion for red-green defects: Classification based on the minimum cone sensitivity
  Severity criterion: Based on sensitivity value. Mild: >55, but <74. Moderate: >40, but <54. Severe: <40

CAD
  Failure criterion: Screening mode: percentage correct <66.6% for any colour. Threshold mode: red-green SNU >1.77 OR blue-yellow SNU >1.75
  Classification criterion for red-green defects: Protan, deutan, or unclassified red-green based on the directions that have the highest thresholds
  Severity criterion: Threshold values

Landolt C test
  Failure criterion: Monocular: screening, more than one error on an individual cone stimulus; threshold >1.65 for the L and M cone and >0.43 for the S cone. Binocular: threshold >1.81 for the L and M cone and >0.67 for the S cone
  Classification criterion for red-green defects: Highest cone contrast threshold
  Severity criterion: Highest cone contrast value
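As an illustration of how the Rabin Cone Contrast Test row of Table 3 could be applied, the following sketch classifies one eye from its three sensitivity scores. The function names are hypothetical. The table leaves some scores unassigned (for example 54 to 55 and exactly 40), so the band edges used here are an assumption, as are the mapping of the lowest L cone score to protan and the lowest M cone score to deutan and the handling of ties.

```python
def rcct_severity(score):
    """Severity band for a cone sensitivity score on the 0 to 100 scale.

    Assumed band edges: >=75 normal, 55-74 mild, 40-54 moderate, <40 severe.
    """
    if score >= 75:
        return "normal"
    if score >= 55:
        return "mild"
    if score >= 40:
        return "moderate"
    return "severe"


def classify_rcct(l_score, m_score, s_score):
    """Pass/fail, red-green type and red-green severity for one eye."""
    passed = min(l_score, m_score, s_score) >= 75
    if min(l_score, m_score) >= 75:
        rg_type = None                      # no red-green failure
    elif l_score < m_score:
        rg_type = "protan"                  # L cone sensitivity is lowest
    elif m_score < l_score:
        rg_type = "deutan"                  # M cone sensitivity is lowest
    else:
        rg_type = "unclassified"            # tie: assumption, not in the table
    return {
        "passed": passed,
        "red_green_type": rg_type,
        "red_green_severity": rcct_severity(min(l_score, m_score)),
    }


if __name__ == "__main__":
    print(classify_rcct(90, 45, 95))
```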

Holmes-Wright Type A Lantern. The test was viewed from 6 m. The test was illuminated with 180 lux in a plane parallel to the floor at table height (Holmes & Wright, 1982). The test started with the brightness set on DEMO, and examples of red, green and white lights were shown to the subject. Subjects could review the lights if they wished. The brightness was then changed to HIGH and 27 pairs of the test lights were presented. The starting positions for the second and third runs were randomly varied. There was no time limit for the presentation, but the subjects were encouraged to respond within 10 seconds. Table 4 lists the failure criterion.

D15 Arrangement Tests

Farnsworth-Munsell D15. All the loose caps were removed from the box and arranged randomly on the table in front of the subject. Subjects were asked to select the coloured cap most similar to the previous one placed in the box and to place it next in the box. They were allowed to rearrange the caps once they were placed in the box. The test was administered three times without any feedback. The test was illuminated with Illuminant C at 1400 lux (±5%). Scoring was based on both visual inspection and the Color Difference Vector analysis (Vingrys & King-Smith, 1988). A major crossing was defined as a difference between adjacent cap numbers that was greater than 2. Table 4 lists the pass/fail criteria for the Color Difference Vector analysis. The angular subtense of the caps was 1.5° at a 40 cm viewing distance.

ColorDx D15. The test was part of the ColorDx test suite. It was displayed using the same Surface Pro as the ColorDx test. The individual test colours were presented as coloured disks in the middle of the screen. The angular subtense of the disks was 2.9° at a 40 cm viewing distance. The subject's task was to select the coloured disk that was most similar to the last filled rectangle at the top of the screen and drag it to the first empty rectangle. The subject was allowed to rearrange the order.
Although the ColorDx D15 displays the subject's results drawn on the traditional score sheet, the color difference vector results are also given. For this reason, the ColorDx D15 was evaluated using only the color difference vector parameters listed in Table 4. The ColorDx D15 was also administered 3 times without any feedback.
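The visual-inspection scoring of the Farnsworth-Munsell D15 can be illustrated with a short sketch. It counts major crossings using the definition given above (a difference greater than 2 between adjacent cap numbers) and applies the Table 4 criterion of more than one major crossing on two of the three trials. The function names are hypothetical, and numbering the fixed reference cap as 0 and including it in the sequence is an assumption.

```python
def count_major_crossings(arrangement, reference_cap=0):
    """Count adjacent cap pairs whose numbers differ by more than 2.

    `arrangement` is the subject's order of caps 1..15; the fixed reference
    cap (assumed to be numbered 0) is placed at the start of the sequence.
    """
    sequence = [reference_cap] + list(arrangement)
    return sum(1 for a, b in zip(sequence, sequence[1:]) if abs(a - b) > 2)


def fails_visual_inspection(trials):
    """Fail if more than one major crossing occurs on at least two of three trials."""
    failed_trials = sum(1 for t in trials if count_major_crossings(t) > 1)
    return failed_trials >= 2


if __name__ == "__main__":
    perfect = list(range(1, 16))
    minor_swap = [1, 2, 3, 4, 5, 6, 8, 7, 9, 10, 11, 12, 13, 14, 15]
    crossing = [1, 15, 2, 3, 14, 4, 13, 5, 12, 6, 11, 7, 10, 8, 9]
    for name, order in [("perfect", perfect), ("minor swap", minor_swap), ("crossing", crossing)]:
        print(name, count_major_crossings(order))
```

A transposition of neighbouring caps produces no major crossing under this definition, whereas an arrangement that repeatedly jumps across the hue circle produces many.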

Table 4. Scoring criteria for the D15 and Holmes-Wright Lantern Tests.

Farnsworth-Munsell D15
  Failure criterion: Visual inspection: >one major crossing on two out of three trials. Colour Difference Vector: C index >1.78
  Classification criterion for red-green defects: Visual inspection: comparing error pattern to score sheet. Colour Difference Vector angle size: Deutan −65 < Angle < 0; Protan 0 < Angle < 25; Scotopic 25 < Angle < 50; Tritan Angle 65 OR Angle 65
  Severity criterion: NA

ColorDx D15
  Failure criterion: Colour Difference Vector: C index >1.78
  Classification criterion for red-green defects: Colour Difference Vector angle size: Deutan −65 < Angle < 0; Protan 0 < Angle < 25; Scotopic 25 < Angle < 50; Tritan Angle 65 OR Angle 65
  Severity criterion: NA

Holmes-Wright A
  Failure criterion: >2 errors on 3 runs (27 pairs)
  Classification criterion for red-green defects: NA
  Severity criterion: NA

2.3. Subjects

Sixty colour normal subjects and 68 subjects with a red-green colour vision defect participated in the study. Posters, social media, and newspaper advertisements were used to recruit subjects. Inclusion criteria were: between 17 and 60 years old; no known vision problems other than a colour vision problem or a corrected refractive error; visual acuity of at least 6/6 in the better eye and 6/9 in the other eye at 6 m; 6/24 in the better eye and 6/30 in the other eye at 100 cm; and 6/12 in the better eye and 6/15 in the other eye at 40 cm, with or without spectacles or contact lenses. The age and acuity criteria are the RCAF requirements for aircrew. The mean age was 26.4 yrs (SD ±9.5) for the colour normal group and 27.7 yrs (SD ±10.4) for the colour defective group. The colour normal group was 50% male and 50% female, whereas the colour defective group was 89.7% male and 10.3% female. Colour vision was classified according to the Rayleigh colour match. No tinted contact lenses or spectacles were allowed. This study

received ethics clearance through the Office of Research Ethics at the University of Waterloo (ORE 20996) and the Human Research Ethics Committee at DRDC Toronto Research Centre (Protocol Amendment 3). Subjects were asked to return within approximately 10 to 15 days to repeat the tests. Ninety-three percent of the colour normals and 86% of the colour defectives participated in both sessions.

3. Results

The results begin with the anomaloscope findings. These results are usually considered to be the gold standard as to the nature of a colour vision defect. Next, the different tests are compared with the anomaloscope, followed by the repeatability of each test. The tests are grouped based on their primary purpose: screening for defects, measuring monocular thresholds, and measuring binocular thresholds. The last two sections are comparisons with the functional tests and the time to complete each test. The primary purpose of colour vision tests is to determine whether the candidate has normal colour vision or a colour vision defect. Some of the tests can classify the type of red-green defect and the severity of any colour vision defect. In evaluating clinical validity, there are two primary outcomes. One is the overall agreement with the anomaloscope results as to whether the person has normal colour vision (colour normal) or a colour vision defect (colour defective). However, because perfect agreement with the anomaloscope is unlikely, one usually also wants to know why the agreement was not perfect. This is the other primary outcome, and the sensitivity and specificity values provide this information. A test with a high specificity indicates that nearly all the colour normals (identified by the anomaloscope) pass the test, and a test with a high sensitivity indicates that nearly all the colour defectives (identified by the anomaloscope) fail the test.
In contrast, a test with a reduced specificity indicates that a number of colour normals are failing the test (i.e. false positives), and a test with a reduced sensitivity indicates that the test is missing a number of individuals with a colour vision defect (i.e. false negatives). In addition to the clinical validity, the study also reports on the repeatability of the tests with respect to the pass/fail results. Tests with poor repeatability on different days would be undesirable. Unless stated otherwise, all statistical decisions are based on the 95 percent confidence intervals. If the confidence interval of one test result does not contain the mean value of the other test, then the difference is statistically significant.

Anomaloscope

Although the anomaloscope is considered the gold standard, each instrument should be calibrated based on colour matches made by colour normals (Pokorny et al., 1979). The Oculus anomaloscope manual does give normal values for both the Rayleigh and Moreland colour matches.

Nevertheless, these values may not be valid for an individual instrument because variations in temperature and input voltages can alter the red-green, and possibly the blue-green, ratio required to match the reference light (Jägle et al., 2005; Jordan & Mollon, 1993). Although this problem was more of a concern with older anomaloscopes, we have had problems applying the manufacturer's norms to another Oculus anomaloscope in a previous study. This other anomaloscope was designed under the assumption that the AC voltage in North America was 120 volts when it is actually 110 volts. This design flaw altered the mean colour normal match by more than 5 units. Because of potential variations in instruments, we used our colour normal data to establish the normal limits. This procedure and definition of colour normal is circular, but the variation in colour normal red-green matches and the range of acceptable matches are relatively small, so that identifying statistical outliers (i.e. colour defectives) and excluding them in establishing normal limits is straightforward (Pokorny et al., 1979). However, if a subject had failed multiple red-green colour vision tests in our study and had red-green colour matches within the normal range, then they were not included in determining the colour normal limits. One subject fell into this category.

Rayleigh Equation. The mean red-green midpoint for the colour normals was (SD ±1.98). The average range of acceptable matches was 4.12 units (SD ±1.91). Although the mean midpoint setting was only slightly larger than the value of 40 given in the manual, the difference was statistically significant. To establish the normal range of midpoint settings, we used ±3 standard deviations from the mean and then rounded to the nearest integer. This results in a normal range of. There were no colour normal individuals with a large difference between eyes.
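The rule used to set the normal limits (mean ±3 standard deviations, rounded to the nearest integer) can be written as a short sketch. The function name is hypothetical, and the mean of 41.0 used in the example is a placeholder chosen for illustration only; it is not the study's value. Only the standard deviation of 1.98 is taken from the text.

```python
def normal_limits(mean, sd, k=3):
    """Lower and upper normal limits: mean +/- k standard deviations, rounded."""
    return round(mean - k * sd), round(mean + k * sd)


if __name__ == "__main__":
    # Placeholder mean (assumption); SD of the Rayleigh midpoint from the text.
    print(normal_limits(41.0, 1.98))
```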
Table 5 lists the frequencies of the different types of red-green colour defectives in our sample for the first and second sessions, along with the expected frequencies for a Caucasian sample (Birch, 2012). The table shows that the proportion of protanopes at both sessions was higher than expected and the proportion of deuteranomalous individuals was lower. The frequencies in our sample were significantly different from the expected values (χ² = 16.5; df = 3; p = for the first session; χ² = 17.8; df = 3; p = 0.005 for the second session). This difference was likely due to the recruiting process. Although the distribution of the different colour defectives was different from the expected values, the pass rate for the Farnsworth D15 was 42% (95% confidence interval 31% to 54%), which was similar to the 45% to 53% passing frequency from previous studies (Atchison et al., 1991; Hovis et al., 2004; Birch, 2008). Three subjects in the colour defective group had atypical anomaloscope results. One was a female who was deuteranopic in her preferred eye and deuteranomalous in her nonpreferred eye. The second was also a female, who was deuteranomalous in her preferred eye and had a normal midpoint in her nonpreferred eye with a large range of acceptable matches. This latter result is referred to as colour amblyopia (Pokorny et al., 1979). Both subjects were classified based on their preferred eye results. The third subject was a male who had a normal midpoint and range, but failed all the other colour vision tests with a deutan defect. Although atypical, similar anomaloscope results have been reported (Barbur et al., 2008). For consistency, he was classified as normal based on the anomaloscope results. Thus, our specificity values relative to the anomaloscope can never equal 1.0.
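The comparison of observed and expected frequencies is a chi-square goodness-of-fit test. The sketch below recomputes the statistic from the first-session counts and the expected percentages given in Table 5; the function names are hypothetical, and the closed-form tail probability applies only to 3 degrees of freedom. It is offered as a check on the arithmetic under these assumptions, not as the authors' analysis script.

```python
import math


def chi_square_gof(observed, expected_percent):
    """Chi-square goodness-of-fit statistic for observed counts."""
    n = sum(observed)
    total = sum(expected_percent)
    expected = [n * p / total for p in expected_percent]
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


def chi2_tail_df3(x):
    """Upper-tail probability of a chi-square variable with 3 degrees of freedom."""
    return math.erfc(math.sqrt(x / 2)) + math.sqrt(2 * x / math.pi) * math.exp(-x / 2)


if __name__ == "__main__":
    # Order: deuteranopia, deuteranomalous, protanopia, protanomalous.
    first_session = [8, 31, 19, 9]
    expected = [14.0, 59.5, 12.2, 14.3]
    stat = chi_square_gof(first_session, expected)
    print(f"chi-square = {stat:.1f}, p = {chi2_tail_df3(stat):.4f}")
```

With the first-session counts, the statistic comes out at approximately 16.5, in line with the value reported above, and most of it arises from the excess of protanopes.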

Table 5. Frequency of the types of red-green colour defectives in this study and the expected values for a Caucasian population (Birch, 2012).

Type of defect: percentage in the first session (n); percentage in the second session (n); expected percentage
Deuteranopia: 11.9% (8); 11.3% (7); 14.0%
Deuteranomalous: 46.3% (31); 45.2% (24); 59.5%
Protanopia: 28.4% (19); 30.6% (17); 12.2%
Protanomalous: 13.4% (9); 12.9% (8); 14.3%

Moreland Match. Most subjects found this colour matching task very challenging. The reason for the difficulty was that the reference field appeared more saturated than the mixture of the two primaries, so that subjects could obtain only a hue and brightness match, but not a match in saturation. The difference in appearance of the two fields is a characteristic of this Oculus anomaloscope model. Table 6 lists the mean midpoint and range of acceptable matches for the two groups of subjects. The mean value for the colour normal group was similar to the value of 52.5 reported by Rufer et al. (2012) for subjects between 20 and 39 yrs old. The colour defective group, however, required more green to make a match than the colour normal group. This difference was statistically significant (t test; t = 5.37; p = 0.001). The difference between the matching ranges was not statistically significant (t test; t = 0.67; p = 0.505). The higher match settings were present for both protans and deutans. The deutan mean value was 59.9 (SD ) and the protan mean was (SD ±9.48). The difference between these means was not statistically significant (t test; t = 1.73; p = 0.088). Using ±3 standard deviations from the normal mean midpoint as the normal limits, 1.6% (n=1) of the colour normals, 16% (n=6) of the deutans and 25% (n=7) of the protans were outside of these limits.
None of the colour normals had a matching range outside of ±3 standard deviations from the normal mean range; however, there was one deutan and one protan who had a range of acceptable matches outside the normal limits. Thus, 17.9% of the deutans and 28.6% of the protans had either an abnormal range or midpoint on the Moreland. In terms of the nonpreferred eye, 2 colour normal subjects had a major shift (~30 units) in their range of acceptable matches towards the green primary. Because the range of acceptable matches was similar for each subject, it is possible that the change was due to a change in their criteria as to the appearance

of an acceptable match. The one colour normal and the colour defectives who had abnormal results with their preferred eye had similar results with their nonpreferred eye.

Table 6. Moreland mean match settings and ranges for the two subject groups.

Subject group: match setting (standard deviation); range of acceptable matches (standard deviation)
Colour normal: 53.9 (5.95); 17.8 (13.84)
Colour defective: 61.9 (10.09); 19.5 (15.78)

3.2. Screening Tests

Agreement with the Anomaloscope. The agreement with the anomaloscope results and the between-session agreement were calculated using the AC1 coefficient of agreement (Gwet, 2008; Wongpakaran, 2013). This index uses a different method to account for chance agreements compared with the κ coefficient recommended by Working Group 41 (1981). As a result, the AC1 agreement index is similar to the percentage of agreement because it is more robust to asymmetrical marginal totals than the κ coefficient. For example, if only a few subjects pass either test and most fail both tests, the percentage of agreement and the AC1 coefficient would both be high because of the large number of subjects who failed both tests. The κ coefficient, however, would be considerably lower because the total number of subjects who passed either test was low. Similar to the κ coefficient, the AC1 values can vary from −1 to 1, with −1 indicating complete disagreement, 0 meaning that any agreement is due to chance, and 1 indicating perfect agreement. The agreement values were calculated using AgreeStat version (Advanced Analytics, Gaithersburg, MD, USA). The current RCAF screening protocol is to test each eye separately in order to detect acquired colour vision defects and for medicolegal purposes. If a candidate failed a test with just one eye, s/he would have to be evaluated further to determine the cause of the asymmetry. A candidate who failed the test with both eyes would be further assessed to determine the severity of the defect.
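The contrast between the AC1 and κ coefficients with asymmetrical marginal totals can be illustrated numerically. The sketch below implements the two-category forms of Cohen's κ and Gwet's AC1 for a 2×2 pass/fail table; the function name is hypothetical and the counts in the example are invented for illustration, not taken from the study. The study's values were obtained with AgreeStat, not with this code.

```python
def agreement_indices(both_pass, a_pass_b_fail, a_fail_b_pass, both_fail):
    """Return (percent agreement, Cohen's kappa, Gwet's AC1) for a 2x2 table."""
    n = both_pass + a_pass_b_fail + a_fail_b_pass + both_fail
    p_obs = (both_pass + both_fail) / n
    pass_a = (both_pass + a_pass_b_fail) / n    # pass rate on test A
    pass_b = (both_pass + a_fail_b_pass) / n    # pass rate on test B
    # Kappa: chance agreement from the product of the marginal proportions.
    pe_kappa = pass_a * pass_b + (1 - pass_a) * (1 - pass_b)
    # AC1 (two categories): chance agreement from the mean pass rate.
    pi = (pass_a + pass_b) / 2
    pe_ac1 = 2 * pi * (1 - pi)
    kappa = (p_obs - pe_kappa) / (1 - pe_kappa)
    ac1 = (p_obs - pe_ac1) / (1 - pe_ac1)
    return p_obs, kappa, ac1


if __name__ == "__main__":
    # Invented counts: most subjects fail both tests, few pass either.
    p_obs, kappa, ac1 = agreement_indices(2, 4, 4, 90)
    print(f"agreement = {p_obs:.2f}, kappa = {kappa:.2f}, AC1 = {ac1:.2f}")
```

For these invented counts, the percentage agreement is 0.92 and AC1 is about 0.91, whereas κ is only about 0.29, which is the behaviour described in the text.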
Because both cases require further evaluation, a failure in either eye was considered an overall failure of the test. Most of the tests screen for both blue-yellow and red-green defects and so there may be multiple combinations of failure outcomes. For the purpose of this study, a pass indicates that the subject passed all portions of the test with each eye, a red-green failure indicates that they had a red-green defect in one eye or both eyes, and a blue-yellow failure indicates that they had a blue-yellow defect in

one or both eyes. A mixed failure indicates that the subject failed both the red-green portion and the blue-yellow portion of the test in one or both eyes. That is, they could have a red-green and blue-yellow defect in only one eye, in both eyes, or they could have a red-green defect in one eye and a blue-yellow defect in the other eye. For the SPP2, a mixed defect also indicates that they had a scotopic failure on the test. Individuals with a mixed defect were considered to have a red-green defect when compared with the Rayleigh match and a blue-yellow defect when compared with the Moreland match. Figure 5 shows the agreement between the various tests and the two anomaloscope equations, and Figure 6 shows the sensitivity and specificity of the tests for each type of defect. The 95 percent confidence intervals for these two parameters were calculated using the method proposed by Agresti and Coull (1998). The red-green results are presented first.

Red-Green Defects. The agreement with the Rayleigh equation as to whether the person had normal colour vision or a red-green colour vision defect is above 0.80 for most tests, with the ColorDx having the highest value of. The exception was the CAD (MS) portion, where the agreement was approximately 0.6. The CAD (MS) agreement value was significantly lower than those of the other screening tests. The AC1 values for the other screening tests were not statistically different.

Figure 5. The AC1 coefficient of agreement values for the various screening tests (vertical axis: AC1 coefficient of agreement; horizontal axis: test; series: red-green defect and blue-yellow defect). Ishihara is the 38 plate edition; HRR is the 4th edition of the Hardy, Rand, Rittler test; SPP2 is the Standard Pseudoisochromatic Plates, Part 2; ColorDx is the pseudoisochromatic plates portion; CAD (MS) is the CAD screening performed monocularly; LandC (MS) is the screening portion of the Landolt C test performed monocularly. Error bars are the 95 percent confidence intervals.
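The sensitivity, specificity, and Agresti-Coull intervals referred to above can be sketched as follows. The function names are hypothetical, the counts in the example are invented for illustration, and truncating the interval to the range 0 to 1 is a common convention assumed here rather than something stated in the report.

```python
import math


def sensitivity_specificity(true_pos, false_neg, true_neg, false_pos):
    """Sensitivity = defectives who fail; specificity = normals who pass."""
    sensitivity = true_pos / (true_pos + false_neg)
    specificity = true_neg / (true_neg + false_pos)
    return sensitivity, specificity


def agresti_coull_interval(successes, n, z=1.96):
    """Agresti-Coull confidence interval for a proportion (z = 1.96 for 95%)."""
    n_adj = n + z * z
    p_adj = (successes + z * z / 2) / n_adj
    half_width = z * math.sqrt(p_adj * (1 - p_adj) / n_adj)
    return max(0.0, p_adj - half_width), min(1.0, p_adj + half_width)


if __name__ == "__main__":
    # Invented counts: 68 colour defectives and 60 colour normals by the anomaloscope.
    sens, spec = sensitivity_specificity(true_pos=65, false_neg=3, true_neg=57, false_pos=3)
    print(f"sensitivity = {sens:.3f}, 95% CI = {agresti_coull_interval(65, 68)}")
    print(f"specificity = {spec:.3f}, 95% CI = {agresti_coull_interval(57, 60)}")
```

A false positive rate is simply one minus the specificity, so a specificity of 0.85 corresponds to 15% of colour normals failing the test.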
Figure 6 shows the sensitivity and specificity of the various screening tests. Both values for the Ishihara, HRR and ColorDx were greater than Although the specificity of the SPP2 was very good, its sensitivity was below This indicated that the SPP2 had more false negatives than the other

tests. In contrast, the sensitivity of the CAD (MS) and LandC (MS) was very good, but they both had lower specificity. That is, both tests failed a larger proportion of colour normals relative to the other tests. The CAD (MS) specificity value was approximately 0.6, which translated to a false positive rate of 40%, and the specificity of the LandC (MS) was 0.85, which translated to a false positive rate of 15%.

Figure 6. The sensitivity and specificity values for each screening test using the Rayleigh equation as the standard for red-green defects and the Moreland equation as the standard for the blue-yellow defects (vertical axis: sensitivity or specificity; horizontal axis: test; series: sensitivity for red-green defect, specificity for red-green defect, sensitivity for blue-yellow defect, specificity for blue-yellow defect). Ishihara is the 38 plate edition; HRR is the 4th edition of the Hardy, Rand, Rittler test; SPP2 is the Standard Pseudoisochromatic Plates, Part 2; ColorDx is the pseudoisochromatic plates portion; CAD (MS) is the CAD screening performed monocularly; LandC (MS) is the screening portion of the Landolt C test performed monocularly. Error bars represent the 95 percent confidence interval.

Blue-Yellow Defects. Figure 5 shows the agreement values with the Moreland equation as to whether or not the subject had a blue-yellow defect. The somewhat unexpected result was that, although the AC1 value for the Ishihara test was low, it was significantly greater than zero, whereas the agreement value for the CAD (MS) screening was not significantly different from chance agreement. The reason that the Ishihara, which tests only for red-green defects, uncovered blue-yellow defects was that the majority of individuals who had abnormal settings on the Moreland equation also had a red-green defect.

Figure 6 shows the underlying reasons for the varied agreement. The high agreement for the HRR, ColorDx, and SPP2 was due to the high specificity. The sensitivity of these tests, however, was extremely low in that they were only able to identify one subject who had an abnormal result on the Moreland equation. This was a different person on each test. Because the number of abnormal results on the Moreland was small, the low sensitivity did not have a large effect on the level of agreement. The lower agreement for the LandC (MS) and the CAD (MS) was due to a low specificity, even though their sensitivities were higher than the printed tests. That is, both the LandC (MS) and CAD (MS) had a relatively large number of false positives.

Monocular Failures. The main reason to test colour vision monocularly is to determine if there is an asymmetry between the eyes. An asymmetry could be an indication of an acquired colour vision defect. Because each eye is tested in the monocular case, there is an increased probability of a failure and thus an increase in the number of false positives. For the colour normal group, the percentage who failed the Ishihara, HRR, SPP2 and ColorDx with just one eye ranged from 0% for the Ishihara to 3.3% for the HRR. Although infrequent, a failure in one eye occurred more often than a failure in both eyes in the colour normal group. For the HRR, the monocular failures accounted for approximately 2/3 of the failures in the colour normal group. None of these individuals failed more than one of these tests. The failures on the SPP2 and HRR were on the red green screening figures. The ColorDx monocular failure was on the blue yellow portion of the test.

There was a higher number of false positives on the CAD (MS) and LandC (MS), so it was not surprising that the number of monocular failures for the colour normals on these two tests was also higher. Twenty seven percent (n=16) of the colour normals failed the CAD with just one eye.
This represents 42% of the total failures (n=38). Unlike the other tests, the monocular failures comprised a minority of the overall failures in the colour normal group. The types of failures were approximately equal across red green, blue yellow or mixed (blue yellow and red green) outcomes. The percentage of monocular failures for the colour normals on the LandC (MS) was lower at 15% (n=9). Similar to the printed screening tests, the monocular failures comprised the majority of the total failures in the colour normal group. Sixty nine percent of the total failures (n=13) in this group were monocular. Sixty seven percent of the monocular failures were red green, with 83% of these failures on the M cone stimulus.

All the colour defective subjects who failed (or passed) the red green portions of these tests had the same outcome for each eye for these tests. The percentage of blue yellow failures that accompanied the red green failures on the HRR, SPP2 and ColorDx varied from 6% to 7% (n=4 to 5). Approximately half of these subjects had a blue yellow defect in both eyes and only 2 subjects had a blue yellow defect on more than one of these tests. Both of these latter subjects had a deuteranomalous defect.

Similar to the colour normal data, the percentages of red green colour defective subjects having a blue yellow defect on the CAD (MS) and LandC (MS) were higher. The percentage of blue yellow screening failures on the CAD (MS) was 55.9% (n=38), with 71% of these subjects having a blue yellow defect in both eyes and the remaining 29% having the defect in just one eye. The percentage of the red green colour defectives with a blue yellow defect on the LandC (MS) was 21% (n=14), with 86% of these subjects having the defect in one eye and 14% having the defect in both eyes.

Repeatability. Figure 7 shows the between session agreement for the screening tests and Figure 8 shows the frequencies of the between session discrepancies. The repeatability of the Ishihara, HRR, SPP2 and ColorDx tests in terms of pass/fail was excellent in screening for both red green and blue yellow (omitting the Ishihara test) defects. If there was a discrepancy, then it was more likely to be an improvement to passing at the second session, and these individuals were all colour normal.

The between session agreement for the CAD (MS) and LandC (MS) was significantly lower for both red green and blue yellow defects. The percentages of the two types of discrepancies on the red green screening figures were approximately equal for both tests. Within the group that had different results on the red green screening portion for either of these two tests, only three had a colour vision defect. Two of the colour defective subjects passed the CAD (MS) on the first session, but failed it on the second session. The third colour defective subject failed the LandC (MS) on the first session, but passed it on the second session. The majority of discrepancies on the blue yellow screening tests were individuals who improved to a passing performance at the second session. For the CAD (MS), 57% of the group had normal colour vision, whereas the percentage of the colour normals in this group for the LandC (MS) was 33%. The percentage of monocular failures was similar to values in the first session.

Figure 7. The between session level of agreement for the selected tests.
Ishihara is the 38 plate edition; HRR is the 4th edition of the Hardy, Rand, Rittler test; SPP2 is the SPP2; ColorDx is the pseudoisochromatic plates portion; CAD (MS) is the CAD screening performed monocularly; LandC (MS) is the screening portion of the Landolt C test performed monocularly.

[Figure 7 plot: "Screening Repeatability". Vertical axis: AC1 Coefficient of Agreement. Horizontal axis: Tests (Ishihara, HRR, SPP2, ColorDx, CAD (MS), LandC (MS)). Series: Red-Green Defects, Blue-Yellow Defects.]

Figure 8. Between session discrepancies for the selected tests. PF indicates that the subject passed the test on the first session, but failed at the second session. FP indicates that the subject failed the test on the first session, but passed at the second session. Ishihara is the 38 plate edition; HRR is the 4th edition of the Hardy, Rand, Rittler test; SPP2 is the SPP2; ColorDx is the pseudoisochromatic plates portion; CAD (MS) is the CAD screening performed monocularly; LandC (MS) is the screening portion of the Landolt C test performed monocularly.

[Figure 8 plot: "Screening Session Discrepancies". Vertical axis: Proportion of Subjects. Horizontal axis: Test (Ishihara, HRR, SPP2, ColorDx, CAD (MS), LandC (MS)). Series: Red-Green Defects PF, Red-Green Defects FP, Blue-Yellow Defects PF, Blue-Yellow Defects FP.]

3.3. Monocular Threshold Tests

Agreement with the Anomaloscope. This set of tests measures, or in the case of the RCCT estimates, monocular thresholds for colours that are confused with the grey background. The three coloured stimuli presented by the LandC (MT) and RCCT are supposed to be identical, whereas the CCT Tri colours are slightly different from the other two tests. Figure 9 shows the AC1 agreement with each colour match and the sensitivity and specificity of each test.

Red Green Defects. In terms of detecting red green colour vision defects, all three tests performed similarly with nearly identical high levels of agreement, sensitivity and specificity. The levels of agreement for all three tests were marginally higher than the Ishihara. The agreement value of the LandC (MT) was significantly higher than the Ishihara, whereas the agreement values for the other two tests were statistically identical to it. The reason for the slightly higher levels of agreement for the threshold tests was that both the sensitivity and specificity were marginally higher than the Ishihara test.

Blue Yellow Defects. The blue yellow agreement values for all three tests were significantly greater than zero; however, this result was due to the high specificity (i.e. low number of false positives) and a low number of failures. None of the tests had sufficient sensitivity to identify the subjects who were outside the normal limits for the Moreland match. No subject passed the LandC (MS) and failed the LandC (MT).
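The pattern described above, where agreement stays well above zero even though sensitivity is poor, can be illustrated with a small calculation. The sketch below assumes that the AC1 values are Gwet's AC1 coefficient computed from a 2x2 pass/fail table of a test against the anomaloscope. The function name and all counts are hypothetical and are not the study's data.

```python
def gwet_ac1(both_fail, test_only_fail, standard_only_fail, both_pass):
    """Gwet's AC1 for two binary (pass/fail) classifications of the same subjects."""
    n = both_fail + test_only_fail + standard_only_fail + both_pass
    if n == 0:
        raise ValueError("at least one subject is required")
    # Observed proportion of subjects on whom the two methods agree.
    observed = (both_fail + both_pass) / n
    # Mean proportion classified as "fail" across the two methods.
    fail_by_test = (both_fail + test_only_fail) / n
    fail_by_standard = (both_fail + standard_only_fail) / n
    pi = (fail_by_test + fail_by_standard) / 2
    # Chance agreement for two categories under Gwet's model.
    chance = 2 * pi * (1 - pi)
    return (observed - chance) / (1 - chance)


# Hypothetical group of 120 subjects: 10 are abnormal on the standard,
# and the test detects only 1 of them while failing 2 normal subjects.
ac1 = gwet_ac1(both_fail=1, test_only_fail=2, standard_only_fail=9, both_pass=108)
sensitivity = 1 / (1 + 9)
specificity = 108 / (108 + 2)
print(round(ac1, 2), round(sensitivity, 2), round(specificity, 2))
```

With these made-up counts the sensitivity is only 0.1, yet the coefficient remains high because nearly all subjects pass both methods, which mirrors the explanation given for the blue yellow results.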

[Figure 9 plots: two panels, both titled "Monocular Threshold Tests". Left panel vertical axis: AC1 Coefficient of Agreement; series: Red-Green Defect, Blue-Yellow Defect. Right panel vertical axis: Sensitivity or Specificity; series: Sensitivity for Red-Green Defect, Specificity for Red-Green Defect, Sensitivity for Blue-Yellow Defect, Specificity for Blue-Yellow Defect. Horizontal axis on both panels: Tests (CCT Tri, RCCT, LandC (MT)).]

Figure 9. The AC1 coefficient of agreement values along with the sensitivity and specificity values for the various monocular threshold tests. CCT Tri is the Cambridge Trivector colour vision test, RCCT is the Rabin Cone Contrast Test and LandC (MT) is the Landolt C threshold test. Error bars are the 95 percent confidence intervals.

Monocular Failures. In terms of red green failures, one colour normal passed the RCCT with one eye and had a protan defect with the other eye. A second colour normal failed the LandC (MT) and the CCT Tri with each eye. He had a blue yellow defect in the same eye on both tests and a mixed red green/blue yellow defect in the other eye. Thus, the pass/fail discrepancy between eyes for the colour normal subjects ranged from 0% to 1.6% across tests.

Five colour defectives had a pass/fail difference between eyes on the RCCT. Four of the subjects had a deutan result in the eye that failed and the fifth subject had a protan defect. Thus, 6% of the colour defectives passed the RCCT with one eye. None of the colour defectives passed the other screening tests with just one eye.

The percentage of blue yellow failures on the monocular threshold tests ranged from 0.7% (n=1) for the RCCT, through 4% (n=5) for the LandC (MT), to 8.5% (n=11) for the CCT Tri. However, over 98% of the blue yellow failures were in the colour defective group for each test and over 73% of failures were in one eye only, which was similar to the screening test results.

The major source of between eye differences was the CCT Tri classification of the red green defect. Thirty one percent (n=21) of the colour defectives had a different classification between eyes on the CCT Tri. Sixty seven percent of these discrepancies were a protan defect in one eye and a deutan defect in the other eye. The remaining 33% had a blue yellow defect in one eye along with a red green defect.

Repeatability. Figure 10 shows the between session coefficients of agreement and between session discrepancies. The monocular threshold tests show near perfect agreement in terms of passing and failing the red green section. Given the high level of agreement between sessions for the red green portion of the test, the percent of discrepancies was low. In most cases, the discrepancies between sessions were only 1 or 2 subjects.

The between session discrepancies varied for the blue yellow section of the tests because of the low number of failures in general. The small number of subjects who failed the blue yellow section also accounts for the large confidence intervals. The large proportion of subjects who had a blue yellow defect on the RCCT at the first session and none at the second represents only one subject, who had a deuteranopic defect. The CCT Tri and LandC (MT) also showed a higher proportion of subjects who did better at the second session in terms of passing the blue yellow section of the test. Of the 11 subjects who failed the CCT Tri blue yellow portion at the first session, 8 subjects passed it on the second session. Seven of the 8 subjects also had a red green defect. Of the 5 subjects who failed the LandC (MT) blue yellow section, 3 passed the section at the second session. All 3 were in the colour defective group.

[Figure 10 plots: two panels. Left panel "Monocular Threshold Repeatability"; vertical axis: AC1 Coefficient of Agreement; series: Red-Green Defect, Blue-Yellow Defect. Right panel "Monocular Threshold Session Discrepancies"; vertical axis: Proportion of Subjects; series: Red-Green Defects PF, Red-Green Defects FP, Blue-Yellow Defects PF, Blue-Yellow Defects FP. Horizontal axis on both panels: Tests (CCT Tri, RCCT, LandC (MT)).]

Figure 10. The between session level of agreement and between session discrepancies for the selected tests. CCT Tri is the Cambridge Trivector colour vision test, RCCT is the Rabin Cone Contrast Test and LandC (MT) is the Landolt C threshold test.
Error bars are the 95 percent confidence intervals.

Binocular Threshold Tests

Agreement with the Anomaloscope. The CAD and the Landolt C have scoring criteria for binocular versions of their tests and there are normal limits for the CCT elliptical length for each decade of life (Paramei and Oakley, 2014). Because their CCT values were not compared to colour defective data, Receiver Operator Characteristic (ROC) analyses (Sigmaplot ver 11.0; Systat Software Inc, Chicago, IL) were performed using the ellipse length, area, and angle as test parameters. This analysis determines whether the test parameter under evaluation can separate the colour defectives from the colour normals and, if so, the specificity and sensitivity of the test for various cut off scores are evaluated to determine the appropriate pass/fail criterion. For the angle analysis, 180 degrees was added to all angles less than 45 degrees so that the angles of the red green colour defectives were all of the same order of magnitude. The optimum cut off score was based on the maximum sum of the sensitivity and specificity. If there were multiple criteria with the same maximum value, then the cut off score with the highest sensitivity was selected.

Figure 11 shows the ROC curves for the CCT elliptical length, area, and elliptical angle in separating red green colour defectives from colour normals. The 95% confidence intervals for area under the ROC curve for the elliptical length and area were both 0.91 to The 95% confidence interval for the angle was 0.88 to All three parameters performed similarly in that the areas under the curves are essentially equal based on the confidence intervals. The circles on each ROC curve represent the location where the sum of the sensitivity and specificity was at its maximum. The cut off scores that gave these points were > for the elliptical length, >1.76*10 4 for the area, and >128° for the angle. These points show that the sensitivities and specificities for the three parameters were also similar; however, because the area under the curve was marginally larger for the elliptical area, we selected area over length as one index of the subject's colour discrimination. Note that the elliptical length cut off value of was equal to the upper limit of Paramei and Oakley's (2014) values for colour normal 30 to 39 year olds.

[Figure 11 plot: "ROC Curve". Axes: Sensitivity and Specificity. Legend: Ellipse Length, A = 0.95; Ellipse Area, A = 0.95; Ellipse Angle.]

Figure 11. Receiver Operator Characteristic curve. The curves are the sensitivity and specificity values for the three elliptical parameters in separating colour normals from red green colour defectives using the Rayleigh colour match as the standard test for different pass fail criteria. The circles represent the location where the sum of the sensitivity and specificity was at its maximum. A is the area under the ROC curve.
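The cut off selection rule described above (maximum sum of sensitivity and specificity, with ties resolved in favour of the higher sensitivity) and the angle adjustment can be sketched as follows. This is a minimal illustration under stated assumptions, not the Sigmaplot procedure used in the study: the function names are hypothetical, a subject is assumed to fail when the score exceeds the cut off, and only observed scores are tried as candidate cut offs.

```python
def adjust_angle(angle_deg):
    """Add 180 degrees to ellipse angles below 45 degrees, as described in the text."""
    return angle_deg + 180 if angle_deg < 45 else angle_deg


def best_cutoff(scores, is_defective):
    """Return (cutoff, sensitivity, specificity) maximising sensitivity + specificity.

    A subject fails when score > cutoff. Ties on the sum are broken by
    choosing the candidate with the higher sensitivity.
    """
    positives = sum(1 for d in is_defective if d)
    negatives = len(is_defective) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("both colour defective and colour normal subjects are required")
    best = None
    for cutoff in sorted(set(scores)):
        true_fail = sum(1 for s, d in zip(scores, is_defective) if d and s > cutoff)
        true_pass = sum(1 for s, d in zip(scores, is_defective) if not d and s <= cutoff)
        sensitivity = true_fail / positives
        specificity = true_pass / negatives
        key = (sensitivity + specificity, sensitivity)
        if best is None or key > best[0]:
            best = (key, cutoff, sensitivity, specificity)
    return best[1], best[2], best[3]


# Hypothetical scores (arbitrary units), not the study's data.
scores = [1, 2, 3, 10, 11, 12]
defective = [False, False, False, True, True, True]
print(best_cutoff(scores, defective))
```

Exact floating point ties are assumed here; a production analysis would normally compare the sums within a small tolerance.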

A similar analysis was performed for the Moreland Match, but only the area and angle were compared in order to be consistent with the parameters used in the red green colour vision defects analysis. The cut off values were >2.62 *10 4 for the area and >149° for the angle. The areas under the ROC curve were 0.72 (95% Confidence Interval 0.62 to 0.82) for the angle and 0.78 (95% Confidence Interval 0.68 to 0.87) for the area, which indicates that both parameters were nearly equal in their capability to separate the blue yellow colour vision defects from colour normals.

Red Green Defects. Figure 12 shows agreement values with each anomaloscope colour match along with the sensitivity and specificity of each test. All three tests performed similarly with nearly identical high levels of agreement, sensitivity, and specificity. The elliptical angle of the CCT ellipse had a slightly higher level of agreement compared with the area and this was due to a slightly higher level of specificity. The level of agreement for all three tests was marginally higher than the Ishihara test; however, the levels of agreement for the threshold tests were not significantly different from each other or the Ishihara test.

[Figure 12 plots: two panels, both titled "Binocular Threshold Tests". Left panel vertical axis: AC1 Coefficient of Agreement; series: Red-Green Defect, Blue-Yellow Defect. Right panel vertical axis: Sensitivity or Specificity; series: Sensitivity for RG Defect, Specificity for RG Defect, Sensitivity for BY Defect, Specificity for BY Defect. Horizontal axis on both panels: Test (CAD (B), CCT (Area), CCT (Angle), LandC (B)).]

Figure 12. The AC1 coefficient of agreement values with colour matching equations along with the sensitivity and specificity values for the binocular threshold tests. The CAD (B) is the CAD test, CCT (Area) is the area of the Cambridge discrimination ellipse, CCT (Angle) is the angle of the Cambridge discrimination ellipse in u′v′ chromaticity space and LandC (B) is the Landolt C threshold test viewed binocularly. Error bars are the 95 percent confidence intervals.

Blue Yellow Defects. The blue yellow agreement values for all three tests were significantly greater than zero but lower than the red green agreement values. The relatively high values of agreement for the CAD (B) and LandC (B) were a result of the high specificity (i.e. low number of false positives). Neither of these tests had sufficient sensitivity to identify the subjects who were outside the normal limits for the Moreland match. Compared to the other tests, the sensitivity of the CCT elliptical area and angle were surprisingly good. Of course, there was a corresponding reduction in specificity, which lowered the overall agreement.

The result that the cut off value of the elliptical area is larger than the cut off value for the red green defects indicated that subjects who had abnormal results on the Moreland equation were more likely to have larger ellipses on the Cambridge test. The cut off value for the angle was actually further away from the blue yellow axis located at approximately 90° and closer to the angle value of the red green defect ellipses. This was likely a result of more protans having a blue yellow defect than deutans or colour normals on the Moreland equation, and their elliptical angles tended to be greater than the deutan and colour normal angles. None of the subjects who passed the Landolt C screening failed the Landolt C threshold test, and none of the subjects who passed the CAD screening failed the CAD (B) threshold test.

Repeatability. Figure 13 shows the between session agreement and percentages of the discrepancies for the binocular threshold tests. The between session agreement for all the tests was very good, with the CAD (B) and LandC (B) red green portion having perfect between session agreement. The between session agreement for these two tests was significantly greater than the other tests. The few CCT (Area) discrepancies for the red green defects were approximately equally divided between the colour defectives and colour normals. The majority of discrepancies for the CCT (Angle) were in the colour normal group. The total number of blue yellow discrepancies for the CCT (Area) was 11 subjects, with 91% from the colour defective group. The blue yellow discrepancies for the CCT (Angle) represent 7 subjects, with 57% from the colour normal group.
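The between session discrepancy proportions reported in this section (PF for pass then fail, FP for fail then pass) amount to a simple tally over paired outcomes. The sketch below shows one way to compute them; the function name and the example outcomes are hypothetical and are not taken from the study.

```python
def session_discrepancies(first_passed, second_passed):
    """Return (PF proportion, FP proportion) for paired pass/fail outcomes.

    PF: passed at the first session and failed at the second.
    FP: failed at the first session and passed at the second.
    """
    if len(first_passed) != len(second_passed) or not first_passed:
        raise ValueError("two non-empty outcome lists of equal length are required")
    n = len(first_passed)
    pf = sum(1 for a, b in zip(first_passed, second_passed) if a and not b)
    fp = sum(1 for a, b in zip(first_passed, second_passed) if not a and b)
    return pf / n, fp / n


# Hypothetical outcomes for 8 subjects (True = passed).
first = [True, True, False, False, True, False, True, True]
second = [True, False, True, False, True, True, True, True]
print(session_discrepancies(first, second))
```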

[Figure 13 plots: two panels. Left panel "Binocular Threshold Repeatability"; vertical axis: AC1 Coefficient of Agreement; series: Red-Green Defect, Blue-Yellow Defect. Right panel "Binocular Threshold Session Discrepancies"; vertical axis: Proportion of Subjects; series: Red-Green Defects PF, Red-Green Defects FP, Blue-Yellow Defects PF, Blue-Yellow Defects FP. Horizontal axis on both panels: Tests (CAD (B), CCT (Area), CCT (Angle), LandC (B)).]

Figure 13. The between session level of agreement and between session discrepancies for the selected tests. The CAD (B) is the CAD test, CCT (Area) is the area of the Cambridge discrimination ellipse, CCT (Angle) is the angle of the Cambridge discrimination ellipse in u′v′ chromaticity space and LandC (B) is the Landolt C threshold test viewed binocularly. Error bars are the 95 percent confidence intervals.

Monocular and Binocular Agreement

As outlined in the introduction, monocular testing is incorporated to screen for acquired colour vision defects, whereas binocular viewing is often used to evaluate the candidate's functional colour vision. This strategy could lead to a number of different permutations regarding testing protocols, but four that may be of interest would be to compare the LandC (MS) screening, LandC (MT) threshold and the LandC (B) with each other. The other comparison would be the monocular CCT Tri and CCT Area results. These comparisons could help in establishing a testing protocol should either the LandC or CCT be adopted. For example, if the candidate fails the LandC (MS) with the same defect in each eye, then do the monocular and binocular thresholds agree as to whether the candidate has a colour vision defect? If so, then only one of the threshold tests is required in order to confirm the presence of the defect and estimate its severity.

The comparison between tests will be on a pass/fail basis for two reasons. The first is that the LandC (MT) and LandC (B) have different cut off scores to take into account the differences between binocular and monocular thresholds. The second reason is that the monocular CCT Tri measured thresholds for only 3 colours, whereas the binocular CCT ellipse measured thresholds for 8 colours.

Figure 14 shows the agreement values and discrepancies between the various test comparisons from the first session. The agreement for detecting red green colour vision defects is generally good. The exception is the agreement between the LandC monocular screening and the binocular threshold. This level of agreement was significantly lower than the other tests. The reason for the lower agreement was that 25% of the 86 subjects who failed the screening test passed the binocular threshold test. Eighty two percent of this group had normal colour vision. The discrepancies on the other tests for red green defects were low. In terms of testing for red green colour vision defects, the LandC monocular and binocular threshold tests are equivalent, as are the Cambridge Trivector and discrimination ellipse, with the area of the ellipse having slightly better agreement with the Trivector test than the elliptical angle.

The agreement between tests in detecting blue yellow defects is more varied. The LandC (MS) failed a large proportion of subjects who passed the binocular version. This result was expected because of the high number of false positives for the LandC (MS). Both the agreement and the proportion of discrepancies for the blue yellow defects were higher than the values for the red green defects because 20 subjects failed the blue yellow monocular screening, and so the level of agreement is weighted by a large percentage of subjects who passed both tests. Of the 20 subjects who failed the blue yellow screening, 19 passed the binocular version and 68% of these subjects were colour defective.

The high level of agreement and high proportion of individuals who failed the monocular LandC threshold, but passed the binocular threshold, was also due to the small number (n=6) who failed either one of the blue yellow threshold tests. Four of the 6 subjects improved to a passing performance on the binocular test and 3 of these subjects were colour defective. Taking into account that the discrepancies between the LandC monocular and binocular thresholds were due to a relatively small number of subjects, the two tests could be interchangeable in testing for blue yellow defects.

The CCT (Area) and CCT (Angle) criteria for identifying blue yellow defects were essentially identifying the red green colour defectives who had a protan defect, large ellipses or both. As a result, there were 58 subjects who passed the Trivector blue yellow test, but failed based on one of the elliptical parameters. Ninety one percent were in the colour defective group. This result accounts for the relatively high proportion of subjects who passed the monocular blue yellow version of the test, but failed the binocular test. Three subjects failed the Trivector blue yellow test, but passed on the elliptical parameters. Two of the subjects were colour normal. Using the CCT elliptical parameters to test specifically for blue yellow defects may not be useful with criteria based on the Moreland equation, because these parameters identify individuals with red green colour vision defects that have larger ellipses and angles closer to a protan defect.

[Figure 14 plots: two panels. Left panel "Agreement Between Monocular and Binocular"; vertical axis: AC1 Coefficient of Agreement; series: Red-Green Defect, Blue-Yellow Defect. Right panel "Monocular vs. Binocular Discrepancies"; vertical axis: Proportion of Subjects; series: Red-Green M P B F, Red-Green M F B P, Blue-Yellow M P B F, Blue-Yellow M F B P. Horizontal axis on both panels: Tests (LandC (S) & (B), LandC (MT) & (B), CCT Tri & (Area), CCT Tri & (Angle)).]

Figure 14. Agreement and discrepancies for selected test comparisons. LandC (S) & (B) is the comparison between the Landolt C monocular screening and binocular threshold. LandC (MT) & (B) is the comparison between the Landolt C monocular and binocular thresholds. CCT Tri & (Area) is the comparison between the monocular Cambridge Trivector test and the area of the ellipse for binocular viewing. CCT Tri & (Angle) is the comparison between the monocular Cambridge Trivector test and the angle of the ellipse for binocular viewing.

Comparison with the D15

One of the factors to consider in revising the colour vision testing protocol is whether the newer colour tests could be adequate substitutes for the D15. The D15 is the current test that determines whether a candidate with a colour vision defect has adequate colour discrimination to perform his/her duties safely. In this study, the D15 was compared with the ColorDx D15, the CAD (B), CCT (Area), CCT Tri, LandC (B), LandC (MT), and RCCT. In evaluating the agreement between the D15 and the other tests, the AC1 coefficient of agreement, sensitivity, specificity, and predictive pass/fail were calculated. The predictive pass (PreP) value is the probability that a person who passes one of the newer tests will pass the D15. The predictive fail (PreF) is the probability that a person who fails one of the newer tests will fail the D15.
Only the red green colour defectives (including the one who was classified as normal by the anomaloscope) were included in this analysis because just the candidates who fail the screening test would be tested with the D15. The parameters evaluated were the ColorDx D15 C index; the CAD (B) red green threshold; the highest of the CCT Tri red green thresholds and the average of the protan/deutan vectors; the CCT (Area); the highest of the L or M cone thresholds and the average of the L and M cone thresholds for both the LandC (MT) and LandC (B); and the lowest of the RCCT L or M cone thresholds and the average of the L and M cone thresholds. The different values for the L and M cone thresholds or sensitivities were averaged in order to be consistent with the CAD red green SNU, which is the average of the protan and deutan vectors. The monocular data from the CCT Tri, LandC (MT) and RCCT were averaged between eyes.

With the exception of the ColorDx D15, ROC analyses were performed using each test parameter to determine the cut off score that would give the best agreement with the D15 by using the maximum of the sum of the sensitivity and specificity. If there were multiple criteria with the same sum, the cut off with the highest sensitivity was selected. The ColorDx D15 was evaluated using the suggested failing score of >1.78. All data are from the first session.

Table 7 lists the results. The cut off point was based on the maximum sum of the sensitivity and specificity value computed from the ROC analysis. Appendix 3 contains the results when the sensitivity was maximized. The results are arranged in descending order according to the AC1 agreement value. The area under the ROC curve and the AC1 agree that the various tests can separate colour defectives who pass or fail the D15 better than chance. Nevertheless, the ColorDx D15 is significantly better than all the rest. This is not surprising because the ColorDx D15 is designed to replicate the Farnsworth D15. The CAD (B) is the next best. The rest of the tests have similar results with moderate levels of agreement.

In general, the PreF values are good to excellent, with the ColorDx D15 having a perfect and significantly higher PreF. This means that if the person fails the ColorDx D15, they will almost certainly fail the D15. High PreF values indicate the specificity of the test is very good. The reason for the lower AC1 values for the other tests is the decrease in the PreP. Lower PrePs indicate that if a person passes one of the computer based tests, there is still a reasonable probability that they will fail the D15. For example, the PreP value for the LandC (B) using the highest threshold criterion is 0.74. This means 26% of the people whose highest threshold is lower than (i.e. better than) 1.048 will still fail the D15. A low PreP indicates that the sensitivity of the test relative to the D15 is low.

Table 7. Results of the ROC and agreement analyses for the comparison between the D15 and selected tests.
Test | ROC Area (95% CI) | Cut off point | AC1 (95% CI) | PreP (95% CI) | PreF (95% CI)
ColorDx D15 | NA | | (0.77 to 0.99) | (0.71 to 0.95) | (0.9 to 1)
CAD (B) | (0.81 to 0.97) | | (0.54 to 0.88) | (0.63 to 0.92) | (0.72 to 0.94)
LandC (B) Maximum Threshold | 0.75 (0.63 to 0.87) | | (0.32 to 0.74) | 0.74 (0.52 to 0.87) | 0.76 (0.6 to 0.85)
RCCT Minimum Sensitivity | (0.61 to 0.86) | | (0.3 to 0.72) | (0.52 to 0.84) | (0.64 to 0.89)
CCT Tri Maximum Threshold | (0.64 to 0.89) | | (0.31 to 0.73) | (0.5 to 0.83) | (0.62 to 0.87)
CCT Tri Avg | (0.64 to 0.89) | | (0.26 to 0.69) | (0.48 to 0.78) | (0.66 to 0.92)
RCCT Average Sensitivity of L and M cones | 0.69 (0.55 to 0.82) | | (0.24 to 0.68) | 0.67 (0.48 to 0.81) | 0.76 (0.61 to 0.86)
LandC (B) Average of L and M cone Thresholds | 0.76 (0.65 to 0.88) | | (0.23 to 0.67) | 0.65 (0.47 to 0.79) | 0.78 (0.63 to 0.89)
LandC (MT) Maximum Threshold | 0.78 (0.67 to 0.89) | | (0.23 to 0.67) | 0.63 (0.44 to 0.82) | 0.82 (0.42 to 0.96)
CCT (Area) * | (0.59 to 0.85) | | (0.22 to 0.66) | (0.53 to 0.87) | (0.61 to 0.86)
LandC (MT) Average of L and M cone Thresholds | 0.72 (0.59 to 0.84) | | (0.065 to 0.53) | 0.55 (0.4 to 0.68) | 0.83 (0.64 to 0.93)
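The predictive pass and predictive fail values in Table 7 follow directly from a 2x2 table of outcomes on a newer test against the D15. The sketch below shows the arithmetic under the definitions given in the text; the function name and the counts are hypothetical and are not the study's data.

```python
def predictive_values(pass_both, pass_new_fail_ref, fail_new_pass_ref, fail_both):
    """Return (PreP, PreF) for a newer test against a reference test.

    PreP: proportion of those passing the newer test who pass the reference.
    PreF: proportion of those failing the newer test who fail the reference.
    """
    passed_new = pass_both + pass_new_fail_ref
    failed_new = fail_new_pass_ref + fail_both
    if passed_new == 0 or failed_new == 0:
        raise ValueError("the newer test needs at least one pass and one fail")
    return pass_both / passed_new, fail_both / failed_new


# Hypothetical counts for 68 colour defective subjects.
pre_p, pre_f = predictive_values(pass_both=20, pass_new_fail_ref=7,
                                 fail_new_pass_ref=5, fail_both=36)
# The share of subjects who pass the newer test but still fail the reference is 1 - PreP.
print(round(pre_p, 2), round(1 - pre_p, 2), round(pre_f, 2))
```

Read this way, a PreP of 0.74 corresponds to the 26% of passing subjects who would still fail the D15, as in the LandC (B) example in the text.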

Comparison with the Holmes Wright Type A (HWA)

The HWA lantern test determines whether a person has adequate colour vision to identify small lights used in aviation and in the maritime industry. We used the ROC analysis described previously to find the optimum cut off value for the same individual test parameters and calculated the level of agreement and predictive values. Only the colour defective subjects are included in the analysis, because the HWA test is administered only to individuals who fail a colour vision screening test. Just five colour defectives passed the HWA. One colour normal failed the HWA at the first session, but not the second session. He is not included in this analysis.

Table 8 lists the results of the ROC analysis, cut off points, AC1 values, PreP, and PreF, and 95% confidence intervals for the various tests. The D15 is included for reference. ROC analyses were performed using each test parameter to determine the cut off score that would give the best agreement with the HWA by using the maximum of the sum of the sensitivity and specificity. If there were multiple criteria with the same sum, the cut off with the highest sensitivity was selected.

Despite the small number of colour defectives who passed the HWA, the area under the ROC curves was significantly greater than chance and the level of agreement was statistically identical to 1.0 in many cases. The notable exception is the D15, which had a moderate level of agreement with the HWA. The PreP is also equal to 1.0 for many tests, although the precision of the value is low because of the small number of subjects who passed the HWA. Except for the maximum threshold of the CCT Tri and the D15, the values for the rest of the tests are similar. Examining the cut off values for the maximum threshold/minimum sensitivity criteria shows that the values are only slightly different from the criteria used to determine normal/abnormal colour vision.
This indicates that a person must have near normal colour discrimination in order to pass the HWA.
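The cut off selection rule described above (maximise the sum of sensitivity and specificity, breaking ties in favour of the higher sensitivity) can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions, not the software used in the study: the function name `best_cutoff`, the convention that a score above the cut off counts as a failure, and the sample data are all hypothetical.

```python
from fractions import Fraction


def best_cutoff(scores, failed_reference):
    """Return (cutoff, sensitivity, specificity) maximising sens + spec.

    Assumed convention (hypothetical): a score greater than the cut off
    is a failure on the new test. Ties in sens + spec go to the cut off
    with the higher sensitivity.
    """
    n_fail = sum(failed_reference)
    n_pass = len(failed_reference) - n_fail
    if n_fail == 0 or n_pass == 0:
        raise ValueError("need at least one pass and one fail on the reference")

    best = None
    for cutoff in sorted(set(scores)):
        # Reference failures that the new test also fails at this cut off.
        true_fail = sum(1 for s, f in zip(scores, failed_reference) if f and s > cutoff)
        # Reference passes that the new test also passes at this cut off.
        true_pass = sum(1 for s, f in zip(scores, failed_reference) if not f and s <= cutoff)
        sens = Fraction(true_fail, n_fail)
        spec = Fraction(true_pass, n_pass)
        key = (sens + spec, sens)  # exact fractions make tie detection reliable
        if best is None or key > best[0]:
            best = (key, cutoff, sens, spec)
    _, cutoff, sens, spec = best
    return cutoff, float(sens), float(spec)


# Hypothetical thresholds and reference outcomes (True = failed the reference test).
scores = [1.0, 1.2, 1.5, 3.0, 4.0, 5.0]
failed = [False, False, False, True, True, True]
cutoff, sens, spec = best_cutoff(scores, failed)
```

Exact fractions are used instead of floats so that two cut offs with the same sum are recognised as a genuine tie before the sensitivity tie-break is applied.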

Table 8. Results of the ROC and agreement analyses for the comparison between the HWA and selected tests.

Test | ROC Area (95% CI) | Cut off point | AC1 (95% CI) | PreP (95% CI) | PreF (95% CI)
LandC (MT) Average of L and M cone Thresholds | 0.92 (0.72 to 1.08) | ... | 0.98 (0.95 to 1.00) | 1.00 (0.51 to 1) | 0.98 (0.92 to 1.00)
LandC (MT) Maximum Threshold | 0.90 (0.78 to 1.06) | ... | 0.98 (0.95 to 1.00) | 1.00 (0.51 to 1) | 0.98 (0.92 to 1.00)
LandC (B) Average of L and M cone Thresholds | 0.93 (0.80 to ...) | ... | 0.98 (0.95 to 1.00) | 1.00 (0.51 to 1) | 0.98 (0.92 to 1.00)
LandC (B) Maximum Threshold | ... (0.88 to 1.04) | ... | 0.98 (0.95 to 1.00) | 1.00 (0.51 to 1) | 0.98 (0.92 to 1.00)
CCT Tri Average of L and M cone Thresholds | 0.94 (0.83 to 1.06) | ... | 0.98 (0.95 to 1.00) | 1.00 (0.51 to 1) | 0.98 (0.92 to 1.00)
CAD (B) | 0.87 (0.65 to 1.10) | ... | 0.97 (0.92 to 1.00) | 0.80 (0.38 to 0.96) | 0.98 (0.92 to 1.00)
RCCT Average Sensitivity of L and M cones | 0.82 (0.48 to 1.16) | ... | 0.95 (0.89 to 1.00) | 0.67 (0.30 to 0.90) | 0.98 (0.91 to 1.00)
RCCT Minimum Sensitivity | 0.89 (0.69 to 1.09) | ... | 0.95 (0.89 to 1.00) | 0.67 (0.30 to 0.90) | 0.98 (0.91 to 1.00)
CCT (Area) | 0.98 (0.94 to 1.01) | ... | 0.93 (0.85 to 1.00) | 0.56 (0.27 to 0.81) | 1.00 (0.94 to 1.00)
CCT Tri Maximum Threshold | 0.97 (0.90 to 1.03) | ... | 0.80 (0.69 to 0.94) | 0.33 (0.15 to 0.58) | 1.00 (0.93 to 1.00)
D15 | NA | NA | 0.42 (0.18 to 0.65) | 0.14 (0.057 to 0.31) | 0.98 (0.87 to 1.00)

3.8. Comparison with the CAD Pilot Criterion

A number of civilian aviation authorities have adopted the CAD test to determine whether pilots have sufficient colour discrimination to qualify for an unrestricted commercial pilot's license. In the United Kingdom, candidates who fail the Ishihara screening are tested with the CAD. The failure criteria are a red green threshold greater than 6 SNUs for deutans and greater than 12 SNUs for protans. In this study, we used the ROC analysis to determine whether the other computer based tests would be comparable to the CAD using its pilot pass/fail criterion.
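As a small illustration, the CAD pilot pass/fail rule quoted above can be written as a lookup of the red green limit by defect type. This is a sketch only: the function name, the dictionary and the string labels are hypothetical, and only the 6 SNU and 12 SNU limits come from the text.

```python
# SNU limits quoted in the text: a red green threshold above the limit is a failure.
CAD_PILOT_LIMITS_SNU = {"deutan": 6.0, "protan": 12.0}


def passes_cad_pilot_criterion(defect_type, rg_threshold_snu):
    """Return True if the red green threshold is within the pilot limit.

    `defect_type` must be "deutan" or "protan" (hypothetical labels).
    """
    try:
        limit = CAD_PILOT_LIMITS_SNU[defect_type]
    except KeyError:
        raise ValueError(f"unknown defect type: {defect_type!r}")
    return rg_threshold_snu <= limit
```

Because the limit depends on the defect type, any misclassification of a protan as a deutan (or the reverse) changes which limit is applied to the same threshold.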
Similar to the HWA, only a few subjects, one protan and seven deutans, met the CAD pilot criterion. Because of the small number of protan subjects in our sample, no ROC analysis was performed on this group. To be included in the analysis of the deutan group, subjects had to be diagnosed as having a deutan defect by the anomaloscope. The one exception was the colour defective subject who had normal settings on the anomaloscope, but failed all of the other tests and was classified as a deutan. Subjects also had to be classified as a deutan, colour normal or unclassified on the computer tests. Colour normal and unclassified subjects were treated as deutan defects. Although there were no discrepancies between the CAD (B), LandC (MT), LandC (B) and RCCT with respect to this classification, there were many discrepancies with the CCT Tri. Twenty one percent (n=8 deutans and n=6 protans) had different

classifications for each eye on the CCT Tri. In addition, the CCT Tri misclassified 64% of the protan subjects as deutans in both eyes. Although none of the subjects in this last group had red green thresholds below 12 SNU on the CAD, this does raise concerns about the ability of the CCT Tri to classify the nature of the red green defect when that information is important. Because of the large number of misclassifications, the CCT Tri was not evaluated.

The parameters examined in the remaining tests were the highest of the L or M cone thresholds and the average of the L and M cone thresholds for both the LandC (MT) and LandC (B), and the lowest of the RCCT L or M cone sensitivities and the average of the RCCT L and M cone sensitivities. The data from the monocular tests (CCT Tri, LandC (MT), and RCCT) were averaged between eyes. Table 9 lists the results of the ROC and agreement analyses. Only the LandC (MT) maximum threshold and the LandC (B) maximum threshold had an area under the ROC curve that was significantly above 0.5. Nevertheless, we calculated the AC1 level of agreement because the statistical power might be different between the procedures. The cut off values were based on sensitivity and specificity values calculated for different pass/fail criteria. If the AC1 value was not significantly different from zero, then no further calculations were performed.

Although the agreement between the CAD pilot criterion and the RCCT and some of the LandC test parameters was significantly better than zero, it was only moderate. The PreF values of around 0.80 were good, but they indicate that 20% of the people who fail the LandC or RCCT would pass the CAD pilot criterion. The PreP values, however, are lower. Except for the average of the LandC monocular L and M cone thresholds, 50% of the people (or less) who pass the other tests will meet the CAD pilot criterion.

Table 9. Results of the ROC and agreement analyses for the comparison between the CAD pilot criterion and the selected tests.
Test | ROC Area (95% CI) | Cut off point | AC1 (95% CI) | PreP (95% CI) | PreF (95% CI)
LandC (MT) Average of L and M cone Thresholds | 0.68 (0.46 to 0.90) | ... | 0.80 (0.63 to 0.97) | 0.75 (0.30 to 0.95) | 0.86 (0.71 to 0.94)
RCCT Average Sensitivity of L and M cones | 0.51 (0.22 to 0.80) | ... | 0.72 (0.51 to 0.93) | 0.5 (0.19 to 0.81) | 0.85 (0.70 to 0.94)
RCCT Minimum Sensitivity | 0.57 (0.32 to 0.82) | ... | 0.72 (0.51 to 0.93) | 0.5 (0.19 to 0.81) | 0.85 (0.70 to 0.94)
LandC (B) Maximum Threshold | 0.75 (0.57 to 0.93) | ... | 0.52 (0.23 to 0.80) | 0.41 (0.22 to 0.64) | 0.96 (0.79 to 0.99)
LandC (MT) Maximum Threshold | 0.76 (0.58 to 0.95) | ... | 0.36 (0.042 to 0.67) | 0.35 (0.18 to 0.57) | 0.95 (0.76 to 0.99)
LandC (B) Average of L and M cone Thresholds | 0.67 (0.46 to 0.88) | ... | ... (0 to 0.34) | NA | NA

3.9. Time to complete

One of the factors to consider in test selection is the amount of time required to complete the test. Assuming that the clinical performance is essentially identical for a number of tests, the test that requires less time to administer may be preferred. Figure 15 shows the mean completion times for the

various tests. The tests were divided into screening, threshold and miscellaneous. The grouping is based on the tests' capabilities and the likely purpose of each. The time to complete the monocular Landolt C test includes both the screening portion and threshold portion for the monocular trial, and the time to complete the binocular trials is just the threshold portion.

Figure 15. Mean time to complete various colour vision tests. Error bars are 1.0 standard deviation. The tests are: Ishihara is the 38 plate edition; HRR is the 4th edition of the Hardy, Rand, Rittler test; SPP2 is the SPP2; ColorDx is the pseudoisochromatic plates portion of the ColorDx; CAD (M) is the CAD screening performed monocularly; CAD (B) is the CAD threshold test; CCT Tri is the Cambridge Trivector; CCT Ell is the Cambridge Ellipse measurements; RCCT is the Rabin Cone Contrast Test; LandC (M) is the monocular Landolt C test, which includes a screening and threshold portion; LandC (B) is the binocular Landolt C; Moreland is the Moreland colour match; Rayleigh is the Rayleigh colour match; ColorDx D15 is the D15 test in the ColorDx program; D15 is the Farnsworth Munsell D 15; and HWA is the Holmes Wright Lantern.

As expected, most of the screening tests had the shortest completion times. The exception was the ColorDx Adult Extended Series. The longer ColorDx completion time for both groups was a result of subjects entering their responses as opposed to the experimenter crossing out incorrect responses on a score sheet. The longer completion time for the colour defective group was a result of the increased number of figures that they viewed. Although most of the colour defective group viewed only 5 of the

screening plates, they had to view the additional 64 diagnostic plates. Compared with the colour normal subjects, the colour defective group viewed and responded to approximately 40 additional images. The longer completion times for the colour normals relative to the colour defectives on the CAD screening was likely a result of making just one error on the first 3 presentations of one or more stimuli. This resulted in an additional three presentations of the corresponding colours. On the other hand, the colour defectives made two errors on the initial 3 presentations for multiple colours. In this case, there were no subsequent presentations of those colours and the test was finished sooner.

For the threshold tests, the CAD required the longest time to complete and the RCCT took the least. The longer completion time for the CAD was probably because thresholds were determined for 16 different colours, whereas the CCT ellipse program measured thresholds for eight colours and the rest of the tests measured thresholds for only three colours. However, the LandC (M) and RCCT are both monocular tests, so the time to complete is equivalent to measuring thresholds for 6 colours. The shorter completion time for the RCCT was likely a result of the test only estimating the threshold based on relatively large changes in cone contrast, and so this test had the fewest presentations and shortest completion time.

The anomaloscope took the longest time to complete. The 20 min required to perform the anomaloscope is typical given that the test included performing both red green and blue green colour matches and verifying that the other eye had similar ranges of acceptable matches. For the D15 test, the times to complete the paper and computer versions were similar. The longer completion times for the colour defectives occurred because the colour differences between many D15 caps are subtle for this group and it took them longer to make their judgements.
The time to complete the Holmes Wright Lantern was similar to the screening tests.

4. Discussion

4.1. Detecting Red Green Colour Vision Defects

In terms of determining whether, or not, a candidate has a red green colour vision defect, the majority of the tests evaluated in this study had very good agreement with the anomaloscope. This result occurred for tests that just screened for a colour vision defect or for tests that measured thresholds. The level of agreement for most tests was comparable to the Ishihara in terms of screening. This finding is in agreement with previous studies (Belcher, et al, 1958; Birch, 1997; Seshadri et al, 2005; Barbur, et al, 2006, Cole, et al, 2006; Rabin et al, 2004; 2011; Rodriguez Carmona et al, 2012; Almustanyir & Hovis, 2015; Shinomori et al, 2016; Walsh et al, 2016).

One exception to these general findings was the LandC (MT). It had a marginally higher, but significantly different, level of agreement with the anomaloscope than the Ishihara test. The other two exceptions were the CAD (MS) and LandC (MS) screening tests. Both tests had lower levels of agreement because of a 15% to 40% false positive rate (i.e., colour normals failed) on the red green screening portion. These individuals would then have to be evaluated further, perhaps with

the threshold test. Further testing on 15% to 40% of the candidate pool is considerably higher than the 0% to 8% false positive rate for the other red green colour vision tests and the 8% prevalence rate for red green colour vision defects in the male population. For the LandC threshold test, only one colour normal failed the monocular M cone threshold component, and that was only when viewing with his left eye. He also failed the screening portion. Thus, the false positive rate for the LandC (MT) is comparable to the other screening tests, which confirms that the LandC (MS) has an unacceptably high false positive rate.

Recall that we selected the most stringent pass fail criteria for the CAD (B) test; therefore, we would anticipate a higher false positive rate for this test. Although there were 4 colour normals who failed the CAD (B) threshold test and the CAD (MS), the 6.7% false positive rate was considerably lower than the CAD (MS) rate and similar to the other tests. All 4 subjects had a red green threshold less than 3.0 SNUs, which suggests that the pass fail criterion could be relaxed if necessary.

The false positive rate on the LandC (MS) and CAD (MS) could be reduced by increasing the chromatic contrasts of the screening stimuli. Both tests do offer this option. The CAD offers two different screening contrasts; we used the lower contrast, and the updated version may offer age adjusted stimuli. The contrast of the LandC screening target can be set to any value. However, we are uncertain as to what would be an acceptable alternative. The USAF set the screening contrast at a value equal to 3 standard deviations above the mean colour normal monocular thresholds for each cone stimulus (footnote 2). If we used the same rule, then our values would be 1.70 for the L cone stimulus and 1.64 for the M cone stimulus. Both of these values are close to the actual value of 1.66 log contrast used for the LandC (MS).
Perhaps the contrast should be based on a value greater than the mean threshold. One possibility would be to use an anchor that is one standard deviation higher than the colour normal mean and select a contrast for the screening that is 3 standard deviations greater than that value. Based on our data, that value would be about 1.5 log contrast for both the L and M cone stimuli. It is difficult to determine whether the increase in the contrast would affect the ability to detect red green defects.

Given the high false positive rate, one would be tempted to drop the screening portion from the LandC and CAD testing protocol. However, it is possible that there is a learning/practice/priming effect that occurs by presenting the screening test first and then measuring thresholds, although the effect does not necessarily transfer between eyes. The number of errors made on the LandC (MS) L or M cone screening stimuli was not significantly different between eyes (paired t test, two tailed; L cone, t=0.57, p=0.57; M cone, t=0.13, p=0.89) on the first session. For the CAD (MS), there was some weak evidence of a carry over of the effect. Thirty one subjects failed the screening with their right eye, but 29% of these subjects passed the screening with their left eye. Practice may be necessary for the CAD because subjects had the tendency to indicate where the square first appeared and not the direction of its movement when they first started the test.

2 James Gaska; Personal communication.

The between session results also suggest that any practice effect for the screening portion is short lived. The repeatability data for both the CAD (MS) and LandC (MS) showed that the proportion of subjects who passed the first session and failed the second was higher than the proportion who did better on the second session. For the CAD (MS), this difference was significant based on the 95% confidence intervals. However, the LandC (MT) and CAD (B) red green threshold test pass/fail outcomes for the colour normal group were highly repeatable. Only one colour normal subject in each test had a different outcome at the second session. Their performance improved to a pass in both cases. All the colour defectives had the same outcome for the red green thresholds at both sessions.

In our study, the LandC and CAD threshold tests started after the screening portion. We are uncertain as to whether skipping the screening portion will have any impact on the threshold data, but the relatively large number of colour normal subjects who failed the LandC and CAD screening tests and passed the threshold tests suggests that subjects should have a fairly extensive practice session if the screening test is not administered.

The repeatability of the other tests showed that they were very good in identifying red green colour vision defects on separate sessions. The two extremes were the CAD (B) and LandC (B), which had perfect between session agreement in screening for red green defects, and the CCT (Area) and CCT (Angle), which had the lowest repeatability. The between session discrepancies for these last two tests were more likely to be individuals who improved to a passing performance at the second session.

Detecting Blue Yellow Colour Vision Defects

The Moreland equation in the Oculus anomaloscope was problematic because of the difference in saturation between the mixture of the primaries and test mixture. Subjects usually complained that this was a difficult task.
In fact, one colour normal subject could not obtain an acceptable match on the first trial and had to be coached to accept what she considered to be close to a match. The result that 2 colour normals had a midpoint change 30 units between eyes also suggests that the task was challenging and caused subjects to change their criteria as to what constitutes a match. Despite the difficulty in making the match, the mean value of the colour normal midpoint was similar to the value reported by Rufer et al (2012) for subjects between 20 and 39 yrs old.

An unexpected result was the significant difference in the midpoints for the red green colour defectives. They required more of the green primary in the mixture in order to make a match. This shift in the midpoint was more likely to occur with protan subjects. We are uncertain as to why this result occurred. The increase in the green required to match the reference is suggestive of a progressive cone dystrophy (Hèrmes, et al, 1989). However, 13 of the 14 subjects who had abnormal results were from the colour defective group. It is unlikely that the 13 red green colour defective subjects had a progressive dystrophy because they all had corrected acuities of at least 6/6 in their better eye and 6/9 in the other eye. The above findings raise the question as to whether the Moreland equation should be used as the gold standard, at least when individuals with a red green colour vision defect are included in the sample.

Despite the problems with the Moreland equation, the AC1 level of agreement between the Moreland equation and most of the tests was surprisingly good. This occurred because a relatively small number of subjects failed any blue yellow test, so the percentages that passed both the test itself and the Moreland equation were large and there were very few false positives. That is, the AC1 value indicates there was a high level of agreement between tests in that there was no blue yellow defect. On the other hand, assuming that the Moreland was correct, the sensitivity of most tests was quite poor. The result that the Ishihara test, which only tests for red green defects, had a level of agreement with the Moreland equation that was above chance was because nearly all the subjects with abnormal results on the Moreland equation had a red green colour vision defect. The relatively high sensitivity of the CCT (B) parameters in detecting blue yellow defects was also a result of the red green colour defectives having abnormal results on the Moreland equation. Individuals who had abnormal results on the Moreland equation were more likely to have the larger discrimination ellipses, an angle typical of a protan colour vision defect, or both.

Table 10 illustrates the lack of correspondence between the tests and the Moreland equation as to whether the person has a blue yellow defect. It shows which subjects failed the blue yellow portion of the various tests at the first session. If a test was viewed monocularly, then a failure in either eye was an overall failure because there can be a difference between eyes if the defect is acquired. The CAD (MS) and LandC (MS) tests were excluded because of the relatively high number of false positives. None of the subjects passed these two screening tests and then failed the corresponding threshold tests. The CCT (B) parameters were excluded because they are essentially identifying protan subjects with larger ellipses.
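The point that a high AC1 can coexist with poor sensitivity when failures are rare can be illustrated numerically. The sketch below assumes that AC1 refers to Gwet's first order agreement coefficient for two methods and two outcomes, where chance agreement is 2π(1−π) and π is the mean of the two marginal pass proportions. The function names and the 2×2 counts are hypothetical and are not the study data.

```python
def gwet_ac1(both_pass, test_pass_ref_fail, test_fail_ref_pass, both_fail):
    """Gwet's AC1 for two methods with pass/fail outcomes (2x2 counts)."""
    n = both_pass + test_pass_ref_fail + test_fail_ref_pass + both_fail
    observed = (both_pass + both_fail) / n
    # Mean of the two marginal proportions of "pass".
    pi_pass = ((both_pass + test_pass_ref_fail) + (both_pass + test_fail_ref_pass)) / (2 * n)
    chance = 2 * pi_pass * (1 - pi_pass)  # never exceeds 0.5, so 1 - chance > 0
    return (observed - chance) / (1 - chance)


def sensitivity(test_pass_ref_fail, both_fail):
    """Proportion of reference failures that the test also fails."""
    return both_fail / (test_pass_ref_fail + both_fail)


# Hypothetical counts: most subjects pass both; few reference failures are detected.
ac1 = gwet_ac1(both_pass=111, test_pass_ref_fail=12, test_fail_ref_pass=3, both_fail=2)
sens = sensitivity(test_pass_ref_fail=12, both_fail=2)
print(round(ac1, 2), round(sens, 2))
```

With these made-up counts the agreement coefficient is high while the test detects only a small fraction of the reference failures, which is the pattern the paragraph above describes.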
Both the CAD (B) and CCT Tri are more likely to have a blue yellow failure, although only 3 subjects failed both tests and only one subject failed both tests and the Moreland equation. This was subject 125, a 24 yr old deuteranomalous male. Note that he did not fail any other blue yellow tests and his S cone contrast threshold on the LandC (MT) was slightly lower (i.e., better) than the mean threshold values for both the colour normals and colour defectives. Subject 34 was one of the few subjects to show a failure across more than 3 tests. He was a 54 year old deuteranomalous individual, which suggests that age could be contributing to the blue yellow failures. However, his Moreland settings were within the normal range of young adults. In addition, only 3 other individuals were over 40 yrs, so it appears unlikely that age was a major factor in the blue yellow failures.

It is possible that the discrepancies between the Moreland equation and the other colour vision tests could be related to differences in the density of the macular pigmentation. The Moreland equation is relatively independent of differences in the macular pigmentation across subjects, whereas the other tests may be affected by differences in macular pigmentation density. However, if that were the case, then we would have expected to have approximately equal frequencies of blue yellow failures in both the colour normals and colour defectives. Table 10 shows that is clearly not the case. It seems unlikely that colour defectives would have a higher density of macular pigmentation relative to colour normals, and so some other factor is likely responsible for the higher frequencies of blue yellow failures in the colour defective group.

In terms of the frequency of the blue yellow failures across both groups of subjects, our blue yellow failure rate on the commercial version of the ColorDx was 3.1%, which was lower than the 8.2% failure rate reported for a prototype of the ColorDx (Almustanyir & Hovis, 2015).
However, the 3.1%

blue yellow failures on the HRR in this study were slightly larger than the 1.6% reported previously for a similar age group (Almustanyir & Hovis, 2015). The blue yellow failure rate of 9% for the CAD (B) and 6% on the CCT Tri were higher than the 1 to 2% shown for the CAD and 3.4% shown for the CCT Tri in earlier studies (Barbur et al, 2006; Shinomori, et al, 2014). However, Shinomori, et al excluded statistical outliers and so it is possible that their blue yellow failure rate could have been higher. We are uncertain as to why our rates are higher, but the result that the majority of the blue yellow failures were in the colour defective group suggests that generally lower colour discrimination can result in a few individuals being diagnosed with both blue yellow defects and congenital red green defects regardless of the test. It is possible that because many colour differences in their everyday lives are small and colour defectives often make errors in judging colours, they have adopted a more conservative criterion in responding as to whether the stimulus was discernable in order to increase the probability they are correct. This results in a small percentage of colour defectives having both a red green and blue yellow defect. The number of blue yellow failures could depend on the test or, because the order of testing was different across subjects, the order of the tests.

Table 10. Summary of which individuals failed the blue yellow portion of the selected tests. Blue rectangles indicate that the subject failed that test.

*40 yrs or older
Columns: Subject; Red green colour vision defect (N/Y); Blue Yellow Test Failed (SPP2, HRR, ColorDx, RCCT, LandC (MT), LandC (B), CCT Tri, CAD (B), Moreland)
Subjects without a red green colour vision defect (N): 6, 9, 13, 32, 36, 54, 60, 76, 79*
Subjects with a red green colour vision defect (Y): 30, 34*, 48, 49, 73, 77, 86, 87*, 88, 89, 91, 93, 95, 96, 100, 101, 107, 108, 110, 111, 113*, 120, 121, 122, 123, 124, 125, 126, 127, 129, 130, 132, 133, 134

Although the agreement between the individual tests and the Moreland equation was inadequate, the repeatability of the pass/fail results for most tests was similar to the repeatability values for red green colour vision defects. This high level of between session agreement should be interpreted cautiously. The high level of agreement occurred because of a high specificity and because a large number of subjects passed the test at both sessions. For the small number of subjects who had between session discrepancies, a higher proportion failed the first session and passed the second than passed the first session and failed the second. This suggests that there was a learning or practice effect across many of the tests, particularly for the blue yellow portions. The exception is the CAD (MS), where the proportion who failed the first session and passed the second session was slightly lower than the proportion who passed the first session, but failed the second. We are uncertain as to why that result occurred. It did not carry over to the threshold results because the proportion of subjects who failed the first session and passed the second session was statistically greater than the proportion who did worse on the second session.

Table 10 suggests that the CCT Tri and CAD (B) may have an unacceptable false positive rate for blue yellow colour vision deficiencies. However, it may be possible to adjust the pass/fail criteria so that the number of false positives is lower if the tests had other useful features, such as high agreement with a colour related task or other colour vision test. For example, the number of blue yellow failures for the CAD (B) would be reduced by 50% if the failure criterion for a blue yellow defect was increased from 1.78 SNU to 2.5 SNU. Using an age adjusted criterion does not appear to help for this group of subjects because the number of failures under 40 yrs was relatively large.
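The effect of relaxing a failure criterion, as in the CAD (B) example above, can be checked by recounting failures under each criterion. The sketch below is illustrative only: the function name and the list of blue yellow thresholds are hypothetical, chosen so that the reduction happens to be 50%. Only the 1.78 SNU and 2.5 SNU criteria come from the text.

```python
def count_failures(thresholds_snu, criterion_snu):
    """Count subjects whose threshold exceeds the failure criterion."""
    return sum(1 for t in thresholds_snu if t > criterion_snu)


# Hypothetical blue yellow thresholds in SNU (not the study data).
thresholds = [0.9, 1.2, 1.5, 1.9, 2.1, 2.6, 3.0]

current = count_failures(thresholds, 1.78)  # criterion quoted in the text
relaxed = count_failures(thresholds, 2.5)   # relaxed criterion quoted in the text
reduction = 1 - relaxed / current
print(current, relaxed, reduction)
```

The same recount could be run separately for subjects under and over 40 yrs to see whether an age adjusted criterion would remove many failures.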
The overall poor sensitivity in most cases and poor specificity relative to the Moreland equation raises the issue as to whether the Moreland colour match is an appropriate gold standard. If it is not an appropriate standard, then the question arises as to which test should be the gold standard for blue yellow defects. Until there is a consensus in the area, it is likely that age related norms for individual tests will serve as the reference. Our results show, however, that it may be necessary to have separate norms for individuals with red green colour vision defects.

Monocular vs Binocular Testing

The primary reason given for testing monocularly is to screen for acquired colour vision deficiencies. The severity of acquired defects is frequently different between eyes. Testing each eye may also be important in establishing a baseline in order to assess changes in the visual system due to an injury or disease that may occur during the soldier's career. The CAD (MS) and LandC (MS), as standalone tests, would be unacceptable for this purpose because of the relatively high number of false positives and relatively lower repeatability. The HRR, SPP2, ColorDx, and RCCT could be suitable for screening if one is willing to accept that approximately 3% of the candidates (pooled across both groups) would have a difference between the two eyes and require additional assessments beyond functional testing.

The between eye discrepancies in terms of pass/fail or a mixed blue yellow/red green defect on the LandC (MT) occurred in 3.1% (n=4) of the subjects. This was similar to the percentage found in the above screening tests. One was a 51 yr old colour normal who had a blue yellow defect in the right eye and a deutan/blue yellow defect in the left eye. He was classified as normal on the LandC (B) at both sessions and only the left eye blue yellow defect was repeatable on the second session LandC (MT). The remaining 3 subjects were in the colour defective group. One subject was 54 yrs old and the other two were under 35 yrs. They had a mixed red green/blue yellow defect in one eye and a red green defect in the other eye. None of the subjects had a blue yellow defect on either the LandC (B) tests or the second session LandC (MT) trials. This suggests that there was a small learning/practice effect.

Interestingly, all three colour defective subjects also failed a blue yellow section on either the printed screening tests or the RCCT. Unfortunately, it was a different test for each of the 3 subjects. If the subjects had failed the same blue yellow screening test, then one possible testing strategy would be to use that screening test to test each eye individually and, if there was a difference between eyes, measure the thresholds monocularly for these individuals and measure the binocular thresholds for candidates who had the same screening outcome in each eye. It is also possible that some of the blue yellow false positives could be reduced if the LandC cut off values were adjusted for age.

For some unknown reason, the CCT Tri gave different diagnoses as to the type of red green defect in each eye and misdiagnosed a large percentage of protans as having a deutan deficiency. There were also a number of subjects with a protan defect in one eye and a deutan defect in the other eye. This makes the classification of any defect problematic.
Incorrect classification could be important if there are different pass/fail criteria for protans and deutans, as in the case of the CAD pilot criterion. The problems with misclassifying protans as deutans and the between eye differences in the protan deutan diagnosis occurred at nearly the same frequency at the second session. The large number of discrepancies for the type of defect in each eye was not consistent with Shinomori, et al (2016), who reported nearly perfect agreement with the anomaloscope as to the type of defect if the thresholds were below the maximum value. They also performed the test monocularly, but only used the eye with the better acuity. We are unsure as to why there was a difference between the two experiments.

If the issue of the type of red green deficiency is ignored, then 8.6% (n=11) of all subjects had a blue yellow defect and 66% of these subjects had the defect in only one eye. This gives a between eye discrepancy for blue yellow defects of 5.5%, which is similar to the other tests evaluated in this study. Nevertheless, the high rate of misdiagnosis of protan defects and discrepancies between eyes in terms of the type of red green defect make this test unsuitable for colour vision testing. Measuring the ellipses may provide some useful information with respect to overall colour discrimination, but whether that information is useful for establishing aircrew standards is uncertain.

Agreement between monocular and binocular threshold

In terms of detecting red green colour vision defects, the LandC (MT) and LandC (B) are equivalent. The cut off scores of the LandC (B) take into account binocular summation effects on thresholds. The small number of discrepancies between the tests was due to a small number of colour normals and colour defectives who failed the monocular test with one eye and passed the binocular version. Again, this result is suggestive of a

learning/practice effect. The overall agreement for blue yellow deficiencies is also good because very few subjects failed both tests.

The CCT monocular and binocular tests are also equivalent in terms of pass/fail for red green defects. The agreement is lower for blue yellow defects. This lower agreement is expected because the binocular criteria for a blue yellow defect essentially identified red green colour defective subjects with larger ellipses or an angle more typical of a protan defect, and these individuals were more likely to have abnormal Moreland settings. It is possible these individuals with the larger ellipses actually had a wider ellipse that encroached onto the blue yellow dimension. Two results suggest that this was not the case. The first is the result that the proportion of subjects who passed the CCT Tri, which tests specifically along the blue yellow axis, but failed on the elliptical parameters was higher than the proportion who failed on the CCT Tri blue yellow vector. The second result is that the cut off value for the elliptical angle was further away from the blue yellow axis and close to the red green axis. Both of the findings indicate that it was the individuals with the larger ellipses located near the protan axis who were more likely to have an abnormal Moreland result, and not individuals with larger, more symmetrical ellipses.

Comparisons with the D15

The current RCAF colour vision requirement will accept individuals who have a colour vision defect and pass the Farnsworth D15. If one of the newer tests replaces the D15, then the new test should have very good agreement with the D15 and high predictive values for passing and failing the D15. Except for the ColorDx D15, the agreement with the other computer based tests was only moderate. The PreF values ranged from 0.76 to 0.87. This means that 13% to 24% of the candidates who fail one of the newer computer tests could pass the D15.
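The predictive values used in this comparison can be computed from a 2×2 table of outcomes on the new test and the reference test. The sketch below follows the way the text reads them (PreP is the proportion of candidates passing the new test who also pass the reference, PreF is the proportion failing the new test who also fail the reference). The function name and the counts are hypothetical and are not the study data.

```python
def predictive_values(both_pass, test_pass_ref_fail, test_fail_ref_pass, both_fail):
    """Return (PreP, PreF) from 2x2 counts of new test vs reference test."""
    pre_p = both_pass / (both_pass + test_pass_ref_fail)
    pre_f = both_fail / (both_fail + test_fail_ref_pass)
    return pre_p, pre_f


# Hypothetical counts for a new test compared against a reference test.
pre_p, pre_f = predictive_values(
    both_pass=18, test_pass_ref_fail=4, test_fail_ref_pass=5, both_fail=33
)
# 1 - PreP: share of those passing the new test who would fail the reference.
# 1 - PreF: share of those failing the new test who would pass the reference.
print(round(pre_p, 2), round(pre_f, 2))
```

Reading the complements this way matches the interpretation in the text, where a PreF of 0.87 corresponds to 13% of candidates who fail the new test passing the D15.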
The PreP values are marginally lower, ranging from 0.55 to 0.82. That is, between 18% and 45% of the candidates who pass one of the newer computer tests would fail the D15. A low PreP means that the computer tests have a low sensitivity (more false negatives). A low PreF means that the tests have a low specificity (more false positives). Of the newer tests, the CAD (B) would be a good option because its PreP is 0.82 and its PreF is 0.87, but because the values are not perfect, 31% of the individuals taking the CAD (B) would be misclassified. The ColorDx D15, which is supposed to be a replication of the D15, had excellent agreement with the D15. Its PreF value was 100%, which was significantly greater than the rest of the binocular and monocular test values. The ColorDx D15 PreP was above the rest of the tests and was significantly greater than the LandC (B) and the monocular test values, but not statistically different from the CAD (B), CCT (Area) and maximum threshold for the LandC (B) PreP values. The slightly lower PreP indicates that the ColorDx D15 is slightly less sensitive than the D15. One possible reason for the lower sensitivity is that the circles on the ColorDx are twice the size of those on the Farnsworth D15. Colours are slightly easier to identify for larger objects (Cole et al, 2006b). The 4 subjects who failed the D15 but passed the ColorDx D15 had no more than 4 major crossings on the D15. Lowering the C-index value for a failure on the ColorDx D15 did not improve the agreement because the specificity decreased with only marginal improvement in the sensitivity of the ColorDx D15. One dichromat passed at least one of the D15 versions. This individual was the female

with a deuteranopic defect in her preferred eye and a deuteranomalous defect in her non-preferred eye. She also passed the ColorDx D15. The excellent agreement of the ColorDx screening program with the anomaloscope and the very good agreement between the ColorDx D15 and the D15 make the ColorDx a very good replacement for the current test battery of the Ishihara, SPP2 and D15. The ColorDx program reduces the problems that could occur when the test administrator interprets an atypical response. The ColorDx screening test eliminates these potential problems by requiring the candidate to enter their responses from the choice of possible answers using the keyboard or mouse. The response is scored as either correct or incorrect. The red-green screening test plates are randomized so that it is harder to memorize the test. The fact that no other plates are shown after 5 errors makes memorization more difficult still. It may be easy to integrate into an electronic medical records system and it does not require special lighting. The one caveat is that if the program is installed on a Windows Surface Pro, the monitor has to be calibrated. The greenish appearance of the display out of the box indicated that the white point was not 6500 K.

Comparisons with the HWA

The HWA determines whether individuals with a colour vision defect can identify aviation signal lights that are viewed from 1 nautical mile (Holmes & Wright, 1982; Cole & Vingrys, 1982). Although still in limited use, the HWA is no longer manufactured. The low pass rate for the colour defective group is consistent with previous studies showing that individuals with a mild defect have the highest probability of passing, especially if multiple runs of the nine pairs of lights are always presented (Vingrys & Cole, 1983; Birch, 2008b; Hovis, 2008).
The fact that the cut-off value for the LandC tests was only 0.20 log units higher than the recommended pass/fail value also supports the conclusion that only individuals with mild defects will be able to pass the HWA. The agreement with most of the tests was very good, with values for the LandC (MT), LandC (B) and CAD (B) significantly better than the rest of the tests. All the tests have a very good PreF value, indicating that if the candidate fails one of the colour vision tests, then there is over a 90% chance that they will fail the HWA. This agrees with the study by Cole and Vingrys (1983). The PreP values for the LandC tests are very good, although the uncertainty is relatively large given the small number of subjects who passed the HWA. The lower PrePs for the other tests, including the D15, indicate that individuals who pass one of these other tests still have a very high probability of failing the HWA. The low PreP for the D15 is in agreement with the previous study by Cole and Vingrys (1983). The D15 is designed to separate those individuals with a moderate to severe defect from those with a milder defect. However, the HWA fails a large percentage of the mild to moderate cases and so the D15 PreP should be relatively low. The cut-off score for the RCCT was very close to the recommended pass/fail score of 75, indicating that the person has to have near normal red-green colour vision to pass the HWA according to the RCCT. Changing the cut-off score to 75 results in similar conclusions. The AC1 agreement value remains the same to the second decimal place, the PreP increases to 0.75 (95% Confidence Interval of 0.30 to 0.95), but the PreF value drops from 0.98 to 0.97 (95% Confidence Interval 0.89 to 0.99).
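The width of the confidence interval on a predictive value depends strongly on how few subjects fall in the relevant cell. The sketch below shows one common way of computing a 95% interval for a proportion from a small sample, the Wilson score interval. This section does not state which interval method was used, so the choice of method here is an assumption, and the counts (3 of 4) are hypothetical values chosen only to show how wide the interval becomes with very few subjects.

```python
import math


def wilson_interval(successes, n, z=1.959964):
    """Wilson score confidence interval for a binomial proportion.

    z defaults to the two-sided 95% normal quantile.
    """
    p = successes / n
    denom = 1 + z ** 2 / n
    centre = (p + z ** 2 / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z ** 2 / (4 * n ** 2)) / denom
    return centre - half_width, centre + half_width


# Hypothetical example: 3 of 4 candidates who passed one test also passed the other.
low, high = wilson_interval(3, 4)
print(f"proportion = 0.75, 95% interval = {low:.2f} to {high:.2f}")
```

With only four subjects in the cell, the interval spans most of the range from 0 to 1, which is why a high point estimate based on a handful of passes should be read cautiously.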

Nevertheless, the conclusion that the person has to have normal to near normal colour vision on the RCCT agrees with the findings that the HWA is a challenging test for colour defective individuals.

Comparison with the CAD pilot criterion

Assuming that the task analysis shows that identifying Precision Approach Path Indicator (PAPI) lights is a colour-critical task in military aviation as it is in civilian aviation (Civil Aviation Authority, 2006b), the CAD could be used to determine whether a candidate's colour vision is adequate to perform the task. The CAD has been validated against a simulated PAPI setup (Civil Aviation Authority, 2009). Note that the CAD pilot criterion is actually less stringent than the pass/fail criterion used to determine normal versus red-green defective colour vision. If the colour vision tests are strongly correlated with each other, then the other tests may be equivalent to the CAD using the civilian aviation criterion. Unfortunately, our results show that this may not be the case. Although the LandC (MT) average of the L and M cone thresholds, the RCCT sensitivities and the LandC (B) maximum threshold had a reasonable PreF for failing the CAD pilot criterion, the PreP was only 0.50 for most of these tests. The exception was the LandC (MT) average of the L and M cone thresholds, which had a PreP of 0.75; however, this value was not statistically different from the other values. A PreP of 0.50 indicates that 50% of the deutan colour defectives who pass the other computer-based colour vision tests will not meet the CAD criterion. The average of the L and M monocular cone thresholds has the highest level of agreement; however, examining the confidence intervals indicates that the RCCT could give similar predictions. The unexpected finding was that the LandC (B) average of the L and M cone thresholds had only chance agreement with the CAD pilot criterion.
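Agreement in this report is summarized with the AC1 coefficient (Gwet, 2008). The sketch below shows the standard two-test, two-outcome form of AC1 alongside Cohen's kappa for comparison. The counts are hypothetical and are not study data; they were chosen to mimic a situation where few subjects pass either test, which is where the two coefficients differ most.

```python
def agreement_coefficients(both_pass, a_pass_b_fail, a_fail_b_pass, both_fail):
    """Return (AC1, kappa) for two tests that each give a pass/fail outcome."""
    n = both_pass + a_pass_b_fail + a_fail_b_pass + both_fail
    observed = (both_pass + both_fail) / n

    # Marginal proportion of "pass" for each test.
    pass_a = (both_pass + a_pass_b_fail) / n
    pass_b = (both_pass + a_fail_b_pass) / n

    # Gwet's AC1: chance agreement based on the average pass proportion.
    mean_pass = (pass_a + pass_b) / 2
    chance_ac1 = 2 * mean_pass * (1 - mean_pass)
    ac1 = (observed - chance_ac1) / (1 - chance_ac1)

    # Cohen's kappa: chance agreement based on the product of the marginals.
    chance_kappa = pass_a * pass_b + (1 - pass_a) * (1 - pass_b)
    kappa = (observed - chance_kappa) / (1 - chance_kappa)
    return ac1, kappa


# Hypothetical table: 100 subjects, most of whom fail both tests.
ac1, kappa = agreement_coefficients(both_pass=5, a_pass_b_fail=5,
                                    a_fail_b_pass=5, both_fail=85)
print(f"AC1 = {ac1:.2f}, kappa = {kappa:.2f}")
```

Both coefficients start from the same observed agreement (0.90 in this example) but correct for chance differently, so kappa is much lower than AC1 when one outcome dominates (Wongpakaran et al, 2013).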
This result was unexpected because the M cone thresholds for the LandC (B) test are linearly correlated with the CAD deutan thresholds (r² = 0.37; p < 0.001). Nevertheless, the correlation is only modest. The small number of subjects in our study (i.e. 8 deutans and 1 protan) who passed the CAD pilot criterion likely contributed to the low agreement between tests. However, the number of subjects who passed the HWA was also small and the agreement between the newer tests and the HWA was much better. Figure 16 shows a possible explanation for the difference in performance. The figure shows the distribution of LandC (B) average L and M contrast thresholds for subjects who passed or failed the HWA and CAD pilot tests. Note that the HWA results include the protans, whereas the CAD data are only the deutans. First, the figure reiterates that only a small number of colour defective subjects actually passed both tests. Second, although there is overlap in the pass/fail distributions for both tests, the majority of subjects who passed the HWA had contrast thresholds that were below the distribution of thresholds for colour defectives who failed the HWA, whereas there was more overlap between the distributions for the CAD outcomes. The reason for the monocular tests providing better agreement than their binocular counterparts is uncertain. One would expect the monocular and binocular agreement values to be similar. It is possible that the discrepancy is a result of the small number of subjects who passed the CAD. Obviously more data are needed, but it appears that as the pass/fail criterion is relaxed, or the test becomes easier for colour defectives to pass, the variability between colour defective thresholds becomes an issue, so that predicting pass/fail results using another threshold test becomes less precise. When the tests are evaluated on a normal

versus colour defective criterion, the variability in the colour defective thresholds is not an issue because nearly all the colour defective data are outside of the colour normal range.

Figure 16. Dot histogram showing the distributions of the LandC (B) average of the L and M cone thresholds for subjects who passed or failed the HWA lantern and the CAD using 6 SNUs as the cut-off score. The solid horizontal line is the cut-off based on passing the HWA. P-CAD is passed CAD, F-CAD is failed CAD, P-HWA is passed HWA and F-HWA is failed HWA. Axis labels: Log Contrast; Test Outcome.

5. Conclusions

In terms of selecting an alternative colour vision test battery, no one test stands out as being superior to the rest. Nevertheless, some tests may be slightly better than others. Using the Moreland equation to diagnose blue-yellow defects is problematic because of the difficulty in making a match, and because individuals with red-green colour vision defects require more of the green primary to make the match than colour normals. The Cambridge colour vision test may not be suitable if it is important to diagnose the type of red-green defect. It misclassified a number of protans as deutans and there were a number of between-eye differences as to the type of red-green defect. Both the Landolt C and CAD screening tests produce a large number of false positives, with respect to both red-green and blue-yellow defects. However, performing these tests may be necessary to ensure that the subject is very familiar with the test before measuring thresholds. If there were a need to replace the Ishihara and SPP2 in the current test battery, then either the HRR or the RCCT would be a suitable replacement. The HRR had a slightly higher sensitivity in screening for

red-green defects and it can screen for blue-yellow defects. However, the HRR requires special lighting to ensure proper colour rendition of the test figures. This is an important consideration because one of the red-green screening figures is relatively faint, and so it is possible to get more false positives if improper lighting is used or the candidate is older. The RCCT is a quick test that also screens for blue-yellow and red-green defects, but it is more expensive and requires regular calibration. It is computerized, so special lighting is not required, and it reduces the need to interpret the results if a candidate gives an atypical response. The D15 would still be used to assess the candidate's functional vision. The entire test battery could also be replaced with the ColorDx computerized test suite. The Extended Adult version screens for both red-green and blue-yellow defects and it has a computerized version of the D15 that is comparable to the original version. No special lighting is required and there is no need to interpret atypical responses. The one drawback is that the computer monitor will have to be calibrated regularly in order to ensure that the proper colours are displayed. The LandC test, which is under development, may be useful in the future to screen for colour vision defects and measure discrimination thresholds. The monocular and binocular L cone and M cone threshold tests have excellent agreement with the anomaloscope in screening for red-green colour vision defects. However, the LandC test has only moderate agreement with the D15 and the CAD pilot criterion, and so the consequences of adopting it as a functional colour vision test to replace the D15 will require further discussion. The CAD would be an unsuitable screening test, but it is very good at determining colour vision status based on threshold measurements.
The advantage of the CAD over the Landolt C is that it has better agreement with the D15 and it has already been evaluated against a civilian aviation colour-related task. Both the CAD and the Landolt C require an in-depth practice session. If the test battery includes tests for blue-yellow defects, then expect a small number of blue-yellow failures, ranging from 1% to 6%, particularly in the colour defective group. These failures are likely to be false positives and, if so, are usually not repeatable. The CAD had a higher number of blue-yellow failures, but these could be reduced by relaxing the criterion for a blue-yellow defect. Although the percentages of false positives are still small, testing each eye individually could potentially double the percentage of false positives on a test. This increase in false positives will have to be weighed against the probability of detecting a visual disease or disorder with other tests, and against medicolegal concerns regarding detecting and quantifying a change in the candidate's vision due to injury or disease. The monocular and binocular versions of the Landolt C are comparable in terms of passing or failing because the number of false positives on the monocular version is relatively low. This result means that only one version of the test is necessary. Based on our results, the newer computer threshold tests may not be interchangeable when the criterion on one computer test is relaxed to allow more colour defectives to pass. The variability between colour defectives' results can reduce the agreement between tests to unacceptable levels. This suggests that correlations between tests and real-world colour-related tasks may be highly dependent on the test and task if a proportion of colour defectives can carry out the task successfully.
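The statement that testing each eye individually could roughly double the false positive percentage can be checked with a short calculation. The sketch below assumes that a false positive in one eye is independent of a false positive in the other, which is a simplifying assumption and not a finding of this study; the function name is hypothetical and the rates are taken from the 1% to 6% range quoted above.

```python
def any_eye_false_positive(rate_per_eye, eyes=2):
    """Probability that at least one eye gives a false positive,
    assuming the eyes fail independently at the same rate."""
    return 1 - (1 - rate_per_eye) ** eyes


for rate in (0.01, 0.03, 0.06):
    combined = any_eye_false_positive(rate)
    print(f"per-eye rate {rate:.0%} -> either-eye rate {combined:.1%}")
```

For small rates the combined value is just under twice the per-eye rate, because the chance of both eyes giving a false positive at the same time is very small.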

References

Atchison, D. A., Bowman, K. J., & Vingrys, A. J. (1991). Quantitative scoring methods for D15 panel tests in the diagnosis of congenital color vision deficiencies. Optom Vis Sci, 68(1).
Agresti, A., & Coull, B. A. (1998). Approximate is better than "exact" for interval estimation of binomial proportions. The American Statistician, 52(2).
Almustanyir, A., & Hovis, J. K. (2015). Military Research ColorDx and Printed Color Vision Tests. Aerospace Medicine and Human Performance, 86(10).
Barbur, J. L., Harlow, A. J., & Plant, G. T. (1994). Insights into the Different Exploits of Colour in the Visual Cortex. Proc R Soc Lond B, 258(1353).
Barbur, J. L., Rodriguez-Carmona, M., Harlow, J. A., Mancuso, K., Neitz, J., & Neitz, M. (2008). A study of unusual Rayleigh matches in deutan deficiency. Visual Neuroscience, 25(3).
Barbur, J. L., Rodriguez-Carmona, M., & Harlow, J. A. (2006). Establishing the statistical limits of normal chromatic sensitivity. Paper presented at the CIE Proceedings Expert Symposium 75 Years of the CIE Standard Observer, Ottawa, May 2006.
Belcher, S. J., Greenshields, K. W., & Wright, W. D. (1958). Colour vision survey: using the Ishihara, Dvorine, Boström and Kugelberg, Boström, and American Optical Hardy Rand Rittler tests. The British Journal of Ophthalmology, 42(6), 355.
Birch, J., & McKeever, L. M. (1993). Survey of the accuracy of new pseudoisochromatic plates. Ophthalmic and Physiological Optics, 13(1).
Birch, J. (1997). Efficiency of the Ishihara test for identifying red-green colour deficiency. Ophthalmic Physiol Opt, 17(5).
Birch, J. (2008a). Pass rates for the Farnsworth D15 colour vision test. Ophthalmic and Physiological Optics, 28(3).
Birch, J. (2008b). Performance of colour deficient people on the Holmes-Wright lantern (type A): consistency of occupational colour vision standards in aviation. Ophthalmic Physiol Opt, 28(3).
Birch, J. (2010). Identification of red-green colour deficiency: sensitivity of the Ishihara and American Optical Company (Hard, Rand and Rittler) pseudo-isochromatic plates to identify slight anomalous trichromatism. Ophthalmic and Physiological Optics, 30(5).
Birch, J. (2012). Worldwide prevalence of red-green color deficiency. Journal of the Optical Society of America A, 29(3).

Civil Aviation Authority, Safety Regulation Group. Minimum color vision requirement for professional flight crew Part 1: The use of color signals and the assessment of color vision requirements in aviation. Paper 2006/ a.
Civil Aviation Authority, Safety Regulation Group. Minimum color vision requirements for professional flight crew Part 2: Task analysis. Paper 2006/ b.
Civil Aviation Authority, Safety Regulation Group. Minimum Colour Vision Requirements for Professional Flight Crew: Recommendations for new colour vision standards. Paper 2009/
CIE. International recommendations for color vision requirements for transport. Report 143. International Commission on Illumination, CIE technical report, Vienna, Austria.
Cole, B. L., & Vingrys, A. J. (1982). A Survey and Evaluation of Lantern Tests of Color Vision. Am J Optom Arch Am Acad Optom, 59(4), 346.
Cole, B. L., & Vingrys, A. J. (1983). Who fails lantern tests? Doc Ophthalmol, 55(3).
Cole, B. L., Lian, K. Y., & Lakkis, C. (2006a). The new Richmond HRR pseudoisochromatic test for colour vision is better than the Ishihara test. Clin Exp Optom, 89.
Cole, B. L., Lian, K. Y., Sharpe, K., & Lakkis, C. (2006b). Categorical color naming of surface color codes by people with abnormal color vision. Optom Vis Sci, 83(12).
Deeb, S. S. (2004). Molecular genetics of colour vision deficiencies. Clin Exp Optom, 87(4-5).
Delpero, W. T., O'Neill, H., Casson, E., & Hovis, J. (2005). Aviation-relevant epidemiology of color vision deficiency. Aviat Space Environ Med, 76(2).
Farnsworth, D. (1947). The Farnsworth Dichotomous Test for Color Blindness Panel D-15. New York: The Psychological Corp.
Gwet, K. L. (2008). Computing inter-rater reliability and its variance in the presence of high agreement. Br J Math Stat Psychol, 61.
Haskett, M. K., & Hovis, J. K. (1987). Comparison of the Standard Pseudoisochromatic Plates to the Ishihara color vision test. Optom Vis Sci, 64.
Hèrmes, D., Roth, A., & Borot, N. (1989). The two equation method results in retinal and optic nerve disorders. In Drum, B., & Verriest, G. (Eds.), Colour Vision Deficiencies. Kluwer Academic Publishers, Dordrecht.

Holmes, J. G., & Wright, W. D. (1982). A new colour perception lantern. Color Res Appl, 7.
Hovis, J. K., Cawker, C. L., & Cranton, D. (1990). Normative data for the Standard Pseudoisochromatic Plates Part 2. J Am Optom Assoc, 61(12).
Hovis, J. K., Cawker, C. L., & Cranton, D. (1996). Comparison of the Standard Pseudoisochromatic Plates Parts 1 and 2 as screening tests for congenital red-green color vision deficiencies. J Am Optom Assoc, 67(6).
Hovis, J. K., Ramaswamy, S., & Anderson, M. (2004). Repeatability indices for the Farnsworth D-15 test. Visual Neuroscience, 21(03).
Hovis, J. K. (2008). Repeatability of the Holmes-Wright type A lantern color vision test. Aviat Space Environ Med, 79(11).
Jägle, H., Pirzer, M., & Sharpe, L. T. (2005). The Nagel anomaloscope: its calibration and recommendations for diagnosis and research. Graefe's Archive for Clinical and Experimental Ophthalmology, 243(1).
Jordan, G., & Mollon, J. D. (1993). The Nagel anomaloscope and seasonal variation of colour vision. Nature, 363(6429).
Kontsevich, L. L., & Tyler, C. W. (1999). Bayesian adaptive estimation of psychometric slope and threshold. Vision Research, 39(16).
Moreland, J. D., & Kerr, J. (1979). Optimization of a Rayleigh-type equation for the detection of tritanomaly. Vision Res, 19.
National Research Council (US) Committee on Vision. (1981). Procedures for Testing Color Vision: Report of Working Group 41. Washington (DC): National Academies Press (US). Retrieved from
Paramei, G. V., & Oakley, B. (2014). Variation of color discrimination across the life span. Journal of the Optical Society of America A, 31(4), A375-A384.
Pokorny, J., Smith, V. C., Verriest, G., & Pinckers, A. J. L. G. (Eds.) (1979). Congenital and Acquired Color Vision Defects. Grune and Stratton, New York.
Rabin, J. (2004). Quantification of color vision with cone contrast sensitivity. Vis Neurosci, 21(3).
Rabin, J., Gooch, J., & Ivan, D. (2011). Rapid Quantification of Color Vision: The Cone Contrast Test. Investigative Ophthalmology & Visual Science, 52(2).

Regan, B., Reffin, J., & Mollon, J. (1994). Luminance noise and the rapid determination of discrimination ellipses in colour deficiency. Vision Res, 34.
Rodriguez-Carmona, M., O'Neill-Biba, M., & Barbur, J. L. (2012). Assessing the Severity of Color Vision Loss with Implications for Aviation and Other Occupational Environments. Aviation, Space, and Environmental Medicine, 83(1).
Rüfer, F., Sauter, B., Klettner, A., Göbel, K., Flammer, J., & Erb, C. (2012). Age-corrected reference values for the Heidelberg multi-color anomaloscope. Graefe's Archive for Clinical and Experimental Ophthalmology, 250(9).
Shinomori, K., Panorgias, A., & Werner, J. S. (2016). Discrimination thresholds of normal and anomalous trichromats: Model of senescent changes in ocular media density on the Cambridge Colour Test. Journal of the Optical Society of America A, 33(3), A65-A76.
Schneck, M. E., Haegerstrom-Portnoy, G., Lott, L. A., & Brabyn, J. A. (2014). Comparison of Panel D-15 Tests in a large older population. Optom Vis Sci, 91(3).
Seshadri, J., Christensen, J., Lakshminarayanan, V., & Bassi, C. J. (2005). Evaluation of the new web-based "Colour Assessment and Diagnosis" test. Optom Vis Sci, 82(10).
United States Army Aviation Museum wright militaryflyer/ Last Accessed Mar 3,
Vingrys, A. J., & Cole, B. L. (1983). Validation of the Holmes-Wright lanterns for testing colour vision. Ophthalmic and Physiological Optics, 3(2).
Vingrys, A. J., & King-Smith, P. E. (1988). A quantitative scoring technique for panel tests of color vision. Investigative Ophthalmology & Visual Science, 29(1).
Vu, B. L., Easterbrook, M., & Hovis, J. K. (1999). Detection of color vision defects in chloroquine retinopathy. Ophthalmology, 106(9).
Walsh, D. V., Robinson, J., Jurek, G. M., Capó-Aponte, J. E., Riggs, D. W., & Temme, L. A. (2016). A Performance Comparison of Color Vision Tests for Military Screening. Aerospace Medicine and Human Performance, 87(4).
Wongpakaran, N., Wongpakaran, T., Wedding, D., & Gwet, K. L. (2013). A comparison of Cohen's Kappa and Gwet's AC1 when calculating inter-rater reliability coefficients: a study conducted with personality disorder samples. BMC Medical Research Methodology, 13(1), 61.

Appendix 1. Description and review of colour vision tests evaluated in this study

Anomaloscope. The anomaloscope, which requires the person to carry out a red-green colour match, is the gold standard for determining the type and severity of a red-green defect. The test requires the individual to adjust the proportion of monochromatic red and green lights in a mixture so that the mixture matches the appearance of a reference monochromatic yellow light (Pokorny et al, 1979). The ratio of the red and green lights determines whether the person has normal or anomalous trichromatic red-green vision, and the range of acceptable matches determines the severity of the defect. Deuteranomalous individuals will require more green in the mixture with red to match yellow, whereas protanomalous individuals will require more red in the mixture. Their range of acceptable matches may be large, but anomalous trichromats do not accept all possible red-green settings as a match to the yellow reference. In contrast, red-green dichromats will accept all the possible mixtures of the red and green primaries as an acceptable match to yellow. Deuteranopes are distinguished from protanopes based on their brightness matches to the various mixtures of the red and green lights. Protanopes' brightness match settings systematically decrease as the red-green ratio increases, whereas deuteranopes have relatively constant brightness match settings for different red-green ratios.

Tritan, or blue-yellow, defects are also diagnosed based on colour matches. The one commercially available anomaloscope for diagnosing tritan defects uses the Moreland equation. The subject is required to adjust the relative amounts of blue and green monochromatic lights in a mixture to match a desaturated blue-green (cyan) reference. The blue and green primaries were selected to minimize intersubject variability in the matches due to differences in macular pigment density (Moreland & Kerr, 1979).
The reduction in the between-subject variability would reduce the number of false positives that could occur due to a higher macular pigment density. However, the colours used in the Moreland equation do not fall on a tritanopic line of confusion and so the Moreland equation is limited in its ability to distinguish between tritanomaly and tritanopia (Moreland & Kerr, 1979). Although a tritanope may exhibit an extended matching range, they may not accept the entire range of settings as a match. Accepting the entire range of settings for the blue and green primaries as a match is diagnostic for dichromatic colour vision.

Pseudoisochromatic Tests. One of the most common formats used in colour vision screening tests is the pseudoisochromatic format. This format uses a figure of one colour within a background of another colour. The background and figure colours are selected so that they appear identical to a person with a congenital colour vision deficiency. There are variations in brightness, saturation and hue within the figure and background colours. This noise helps to ensure that the person identifies the figure based on their ability to distinguish the dominant hue of the figure from the dominant hue of the background. The pseudoisochromatic figure designs are vanishing, transformation, hidden or diagnostic. Vanishing plates are designed so that the colour normal sees a figure and the colour defective does not report any figure. Transformation plates are designed so that a colour normal reports one number and a colour defective reports a different number. The hidden plates are the opposite of the vanishing plates. For these plates, the colour normals should not report any figure, whereas the colour defectives do report

a number. Diagnostic plates are usually presented only to those who failed the screening portion of the test. These plates are a vanishing design used to identify a red-green defect as either protan or deutan. There are usually two different coloured figures on a plate. One coloured figure should be missed by protans and the other figure should be missed by deutans. The diagnostic plates may also estimate the severity of the defect by systematically varying the average colour difference between the figure and background.

Ishihara Test. This test is the most widely used colour vision test for detecting red-green deficiencies; however, it does not screen for blue-yellow defects. The 38 plate edition is considered the gold standard for red-green colour vision screening (CIE, 2001). This edition has 25 plates with numerals of one colour embedded in a background of a different colour and 13 plates that have a path between two Xs defined by one colour within a background of another colour (Birch, 1997). These latter plates are designed for individuals who are unfamiliar with numbers. The numeric plates are divided into demonstration (plate no. 1), transformation (2-9), vanishing (10-17), hidden digit (18-21), and diagnostic (22-25). Figure 1 shows the Ishihara test booklet and demonstration plate.

Figure 1. The Ishihara test booklet and an example of the demonstration plate on the right.

The test has a high sensitivity and specificity, with near perfect agreement with the anomaloscope as to whether the person is colour normal or colour defective (NRC, 1981; Birch & McKeever, 1993; Birch, 1997). Nevertheless, there is general agreement that the hidden plates are inefficient in terms of screening and the responses to these plates can usually be ignored (Belcher et al, 1958; Haskett & Hovis, 1987; Birch, 1997; Rodriguez-Carmona et al, 2012). Birch recommended counting errors on just the transformation and vanishing plates (i.e. the first 17 plates) and using a failure criterion of 4 or more errors. Using this testing protocol, the sensitivity and specificity are 98.7% and 94.1%, respectively.
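The scoring rule described above can be written as a short function. This is a minimal sketch of the rule as stated in the text (count errors on plates 2 to 17 of the 38 plate edition and fail at 4 or more errors); the function name and the way responses are recorded are hypothetical and do not represent any scoring software.

```python
def ishihara_screening_fails(misread_plates, first_plate=2, last_plate=17,
                             failure_criterion=4):
    """Apply the screening rule described in the text.

    misread_plates: plate numbers the candidate read incorrectly.
    Only transformation (2-9) and vanishing (10-17) plates are counted;
    the demonstration plate (1), hidden digit plates (18-21) and
    diagnostic plates (22-25) are ignored.
    """
    counted = {p for p in misread_plates if first_plate <= p <= last_plate}
    return len(counted) >= failure_criterion


# Three counted errors plus errors on hidden digit plates: passes the screening.
print(ishihara_screening_fails([3, 5, 12, 18, 19, 20]))  # False
# Four errors on transformation and vanishing plates: fails the screening.
print(ishihara_screening_fails([3, 5, 12, 16]))          # True
```

Ignoring the hidden digit plates keeps responses to plates that are inefficient for screening from changing the pass/fail outcome.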

The diagnostic plates attempt to diagnose the defect as protan or deutan based on whether the person misses the pink figures (i.e., protan) or purple figures (i.e., deutan) in the grey background. One of the difficulties in interpreting these plates is that patients report that both the protan and deutan test figures are missing or that both of the figures are present. Birch reported that 83.2% of protans and 94.1% of deutans were classified correctly based on missing one of the two figures, or on identifying which figure was more distinct than the other if two numbers were reported (Birch, 1997).

Hardy, Rand and Rittler 4th Edition Test (HRR). This test screens for blue-yellow and red-green colour vision defects. It also attempts to grade the severity of the defect. It contains 24 plates that present either one or two coloured geometric shapes (X, triangle, and O) embedded in a background of grey dots. The test contains 4 demonstration plates, 6 screening plates (2 for tritan defects and 4 for red-green defects) and 14 plates designed to determine the type and severity of a red-green defect or the severity of a tritan defect. Figure 2 shows the HRR booklet and one of the demonstration plates. The type of red-green defect is determined by whether the patient misses the protan or deutan figures. The saturation of the coloured figures on the diagnostic plates increases as one proceeds, so that individuals with a milder defect only miss figures on the initial diagnostic plates and individuals with a severe defect miss nearly all the diagnostic figures that correspond to their type of defect. The severity of the defect is graded qualitatively as mild, medium or strong (i.e. severe).

Figure 2. The HRR test booklet and an example of the demonstration plate on the right.
According to a study by Cole et al (2006a), the sensitivity and specificity of the HRR for red-green colour vision defects are 1.0 and 0.96, respectively, when the failure criterion is two or more errors on the screening test figures. Eighty-six percent of the subjects were classified correctly as protan or deutan, which was similar to the percentages reported by Birch for the Ishihara. The severity on the HRR

70 Colour Assessment 61 was correlated with the severity diagnosed by the anomaloscope; however, 55% of individuals with a mild defect on the anomaloscope (a range of matches less than 20) were classified as medium or strong on the HRR and 37% of the dichromats were classified as medium(cole et al, 2006). Nevertheless, Birch does not consider the HRR 4 th edition to have adequate sensitivity (Birch, 2010). She reported that the HRR had a lower sensitivity with of 0.93 compared with Ishihara test, which had a sensitivity of There is little information available as to the effectiveness of the HRR in screening for bluesensitivity than the Farnsworth Munsell D15 in detecting colour vision defects due to hydroxychloroquine toxicity ( Vu, yellow defects. However, based on the original printing of the test, it should be more et al, 1999). Standard Pseudoisochromatic Plate Part 2. The SPP2 also screens for red green, blue yelloacquired colour vision defects, but it is reasonably good in screening for congenital red green defects (Hovis, et al, 1996) and it is more and scotopic colour vision deficiencies. It was originally designed as a test for sensitive in detecting acquired blue yellow defects than the HRR or Farnsworth D15 (Vu et al, 1999). Figure 3 shows the SPP2 booklet and one of the demonstration plates. The test plates present two vanishing figures or a vanishing figure and comparison figure. The comparison figure was included to help detect subtle colour vision defects. There are 12 plates.. The first and second plates are demonstration and 3 to 12 are the screening plates. The firstt test plate has a very faint number and it is recommended that this figure should be excluded from the test because nearly everyone misses it (Hovis et al, 1990, 1996). Hovis et al (1990) recommended a pass fail criterion based on the patient s age to control age related changes in colour vision. 
The SPP22 is comparable to the HRRR in detecting blue yellow colour vision defects resulting from hydroxychloroquine toxicity. Figure 3. The SPP2 test booklet and a demonstration n plate on the right.
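The sensitivity and specificity values quoted for these plate tests follow from a two-by-two tally of screening results against the anomaloscope diagnosis. The sketch below is only an illustration in Python: the function name is hypothetical, and the counts are invented so that the result matches the 1.0 and 0.96 reported for the HRR. They are not data from Cole et al (2006).

```python
def sensitivity_specificity(true_pos, false_neg, true_neg, false_pos):
    """Return (sensitivity, specificity) from a 2x2 tally.

    true_pos  : colour defectives who fail the screening test
    false_neg : colour defectives who pass the screening test
    true_neg  : colour normals who pass the screening test
    false_pos : colour normals who fail the screening test
    """
    sensitivity = true_pos / (true_pos + false_neg)
    specificity = true_neg / (true_neg + false_pos)
    return sensitivity, specificity


# Hypothetical counts, chosen for illustration only.
sens, spec = sensitivity_specificity(true_pos=100, false_neg=0,
                                     true_neg=96, false_pos=4)
print(f"sensitivity = {sens:.2f}, specificity = {spec:.2f}")
```

A sensitivity of 1.0 means that no colour defective passed the screening plates, while a specificity of 0.96 means that 4% of the colour normals failed them.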

ColorDx. The ColorDx test is a new computerized colour vision test that screens for red-green and blue-yellow colour vision deficiencies, classifies the type of red-green defect and diagnoses the severity of the defect. The test plates are similar in design to the Ishihara plates. Figure 4 shows an example of one of the screening plates. There are 25 images that are used to screen for red-green defects. These are followed by 12 screening/classification plates for the tritan defect. The red-green screening test ends once a total of 5 errors are made and the program switches to the blue-yellow series before proceeding to the red-green diagnostic series. The tritan plates screen for a blue-yellow defect and classify the severity by presenting test figures which vary systematically in their saturation. Individuals who miss the more saturated colours are classified as having a more severe defect. The red-green diagnostic plates are administered next if the person failed the red-green screening plates. Half of the diagnostic plates have figure colours that a protan could miss and the other half have figures that a deutan could miss. Similar to the tritan plates, the saturation of the diagnostic figures changes systematically in order to classify the severity of the defect. The sensitivity and specificity of a prototype of the ColorDx were [value missing] and 0.99 respectively (Almustanyir & Hovis, 2015).

Figure 4. The ColorDx test with an example of a screening plate.

Cambridge test. The Cambridge colour vision test (CCT) is one of the computer-based colour vision tests evaluated in this study. The test consists of a C-shaped test figure that varies in hue and saturation relative to the reference grey background. The location of the gap in the C varies randomly from trial to trial in one of four directions: up, down, right, and left. The subject's task is to indicate the position of the gap. Figure 5 shows the Cambridge test and opening screen. Both the figure and background colour vary randomly in luminance to ensure that only differences in hue are used to identify the target.

Figure 5. Picture of the Cambridge Colour Vision test. Section A is the display CRT monitor with an example of the stimulus. Section B shows the response box, which receives the subject's responses, and the graphics card tower, which generates the stimuli.

There are two different testing protocols available. One is the Trivector and the other measures full discrimination ellipses. The Trivector test measures discrimination thresholds for three different colours from a grey reference. The colours were selected so that deutans have difficulty distinguishing one colour from the grey background, protans the second colour, and tritans the third colour. This test could be used to screen and estimate the severity of the defect. The full version measures chromatic thresholds for colours that are equally spaced around the grey reference. Thus, the full test would be considered a test of general colour discrimination. The threshold data are fitted with an ellipse. The area of the ellipse is an index of a person's general chromatic discrimination ability, and the orientation of the ellipse shows the colour axis along which their discrimination is worst. Age-related norms for the ellipses have been published (Paramei & Oakley, 2014).

Different pass/fail criteria for the Trivector test have been evaluated (Regan et al, 1994; Shinomori et al, 2016). Nevertheless, applying the manufacturer's recommended failure criterion of a score of more than 100 for the deutan and protan test colours to the data of Shinomori et al (2016) gives a sensitivity and specificity of 100% and 94% respectively for subjects between 18 and 60 yrs. The value of 100 represents a colour difference from the grey reference of 0.01 in the u'v' chromaticity diagram. Using a score greater than 150 results in slightly lower sensitivity and specificity values (Regan et al, 1994).
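Because a Trivector score of 100 corresponds to a distance of 0.01 from the grey reference in the u'v' diagram, one score unit corresponds to 0.0001 in u'v'. The following Python sketch shows that conversion together with the "more than 100" criterion. The function names are hypothetical, and treating a score above the cut-off on either the protan or the deutan axis as a failure is an assumption made here, since the text does not say how the two axes are combined.

```python
UV_PER_SCORE_UNIT = 1.0 / 10000.0  # a score of 100 corresponds to 0.01 in u'v'


def trivector_score_to_uv(score):
    """Convert a Trivector score to a distance in the u'v' diagram."""
    return score * UV_PER_SCORE_UNIT


def fails_red_green_criterion(protan_score, deutan_score, cutoff=100.0):
    """Assumed rule: fail if either red-green axis exceeds the cut-off."""
    return protan_score > cutoff or deutan_score > cutoff


# Hypothetical scores for two observers.
print(trivector_score_to_uv(100))             # distance at the cut-off
print(fails_red_green_criterion(45, 60))      # both axes below 100
print(fails_red_green_criterion(45, 320))     # deutan axis above 100
```

Passing `cutoff=150.0` applies the alternative criterion mentioned above without changing the rest of the code.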

Rabin Cone Contrast sensitivity Test (RCCT). The principle of the RCCT is similar to the Trivector in that discrimination thresholds from a grey reference are estimated using colours that should be missed by each of the three types of colour vision deficiencies. The differences are that i) letters are used as the stimuli, ii) no luminance or chromatic noise is present, and iii) each colour is selected so that only one cone type is modulated as the saturation of the letter changes. That is, the test measures the discrimination threshold for the S cone (i.e. detects a tritan defect), M cone (i.e. detects a deutan defect) and L cone (i.e. detects a protan defect) (Rabin 2004). Figure 6 shows the RCCT.

Figure 6. The cone contrast test with the demonstration target (letter H).

In Rabin's two studies (2004, 2011), the RCCT had perfect agreement with the anomaloscope with respect to whether the person had normal colour vision or a red-green defect. That is, the sensitivity and specificity were both 1.0. In a more recent study, Walsh et al (2016) reported marginally lower values. The sensitivity of the RCCT was 0.97 for both the right and left eye and the specificity was 0.97 for the right eye and 0.96 for the left eye.

Colour Assessment and Diagnosis (CAD) test. The CAD test can screen for colour vision deficiencies, measure chromatic discrimination around a grey reference, or both (Barbur et al, 1994, 2006). The CAD test consists of a grey background and a coloured stimulus that is seen within a background of dynamic luminance contrast noise. Figure 7 shows the CAD test with a screen shot of the coloured stimulus. The small individual squares making up the background and the stimulus change their luminance every 50 ms so that the display looks as if it is scintillating. As the individual squares oscillate in luminance, the coloured stimulus moves in one of the four diagonal directions (i.e., bottom left to top right, bottom right to top left, top left to bottom right, or top right to bottom left). The subject's task is to press the appropriate button to indicate the corresponding direction of movement. A four-alternative forced-choice procedure is used to determine the observer's chromatic detection threshold in a specific direction within the CIE xy chromaticity diagram (Barbur et al, 2006; Rodriguez-Carmona et al, 2012). There are two general testing protocols available. One is a fast screening test and the other measures thresholds for colours confused by protans, deutans and tritans. The data from the threshold test could also be used to determine threshold ellipses around the grey reference. The test allows for different pass/fail criteria depending on the specific application of the results.

Figure 7. Picture of the CAD test. Section (A) presents the coloured square on the grey background (the test target). This target moves in one of four possible diagonal directions (bottom left to top right, bottom right to top left, top left to bottom right, and top right to bottom left). The participant indicates where the target ends by pressing the corresponding button shown in (B). Section (C) presents the laptop used to run the test.

The fast screening version screens for both red-green and blue-yellow colour vision defects. The moving square is presented at a predetermined chromatic contrast for each of 16 different colours. The stimuli were selected to bracket the colours that would be confused with the grey background by the three different types of colour vision defects. There are 6 colours to screen for deutan defects, 6 colours to screen for protan defects and 4 colours to screen for tritan defects. The threshold test measures the chromatic thresholds for each of these colours. Values are given in units relative to the median value for colour normals (referred to as Standard Normal Units or SNU) or as the vector distance in the 1931 xy chromaticity diagram. Figure 8 shows the normative data from 238 normal observers plotted in the 1931 xy CIE diagram (Barbur et al, 2006). The black cross in the centre indicates the chromaticity coordinates of the grey background. The dark grey ellipse represents the median threshold value, or better, and the lighter grey ellipse denotes the region below the 97.5th percentile. Any threshold measurement that falls beyond the larger grey ellipse would be abnormal. The blue, red, and green lines represent the test colours for the tritan, protan and deutan respectively.

Figure 8. CAD threshold data plotted in the 1931 xy CIE diagram from 238 normal observers. The dark grey ellipse represents the values that are at the median or lower and the light grey ellipse represents the thresholds below the 97.5th percentile of the colour normals (Barbur et al, 2006). The green, red, and blue lines are the test colours for the deutan, protan, and tritan confusion lines respectively.

Barbur et al (2006) used the CAD test to assess the colour vision of 250 individuals with a congenital red-green defect. The blue-yellow thresholds overlapped substantially with the colour normal data and the difference in these thresholds was not statistically significant. However, there was a clear difference in the red-green thresholds. If a CAD score greater than 1.88 SNU was used to identify a person with a red-green defect, analysis of their Figure 3 showed that both the sensitivity and specificity of the test were equal to 1.0. Seshadri et al (2005) also reported sensitivity and specificity values over 90% for the web-based version. However, Walsh et al (2016) reported specificity and sensitivity values near 0.85 for the CAD threshold test.

Landolt C colour vision test (LandC). One of the disadvantages of the RCCT is that the range of contrasts is limited so that the test cannot measure colour normal thresholds. In order to provide this capability, the United States Air Force (USAF) is developing an enhanced version of the RCCT. Figure 9 shows the LandC colour vision test. The program displays Landolt Cs oriented up, down, left or right. The subject has to indicate which of the four orientations is presented. Similar to the CAD, the LandC may be used to screen for colour vision deficiencies or measure chromatic thresholds. The screening program presents each of the three colours at a suprathreshold contrast multiple times. More than one incorrect response on any letter would be a failure. The threshold program varies the contrast of the C using the Ψ adaptive procedure to determine the individual's thresholds (Kontsevich & Tyler, 1999).

Figure 9. The Landolt C test. Section (A) shows the test monitor, section (B) the test field with the crosshair and section (C) the letter C. The crosshairs help with fixation at the centre of the display, but disappear when the stimulus is presented.
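The CAD section above expresses thresholds in Standard Normal Units, that is, relative to the median threshold of colour normals, and quotes a red-green criterion of more than 1.88 SNU. A minimal Python sketch of that scaling follows. The function names and the numerical thresholds in the example are hypothetical; the real normative medians come from Barbur et al (2006) and are not reproduced in this report.

```python
RED_GREEN_CUTOFF_SNU = 1.88  # criterion quoted in the text above


def to_snu(threshold_xy, normal_median_xy):
    """Express a chromatic threshold relative to the colour-normal median."""
    if normal_median_xy <= 0:
        raise ValueError("the normal median threshold must be positive")
    return threshold_xy / normal_median_xy


def exceeds_red_green_cutoff(threshold_snu, cutoff=RED_GREEN_CUTOFF_SNU):
    """True when the red-green threshold is above the cut-off."""
    return threshold_snu > cutoff


# Hypothetical xy distances, for illustration only.
observer_snu = to_snu(threshold_xy=0.008, normal_median_xy=0.004)
print(observer_snu, exceeds_red_green_cutoff(observer_snu))
```

By construction an observer whose threshold equals the normal median scores 1 SNU, so the 1.88 criterion flags thresholds nearly twice the median.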

Farnsworth-Munsell D15. The D15 colour vision test was introduced to distinguish colour normals and those with a mild colour vision deficiency from those individuals with a moderate to severe colour vision deficiency (Farnsworth, 1947). Individuals who fail the D15 are more likely to encounter problems in making colour judgments in their everyday life or at work. The subject's task is to arrange the coloured caps in order of similarity, each time placing in the box the cap that is most similar in colour to the cap placed previously. Figure 10 shows the D15 test with a colour normal order and a colour defective order along with the score sheets. The numbers on the score sheet represent the cap numbers. Connecting lines are drawn between the caps in the order in which they were arranged. The connecting lines form a smooth curve if the caps are in perfect order and form a series of approximately parallel lines if there are major mistakes. Major mistakes are referred to as major crossings. The test is usually scored by visual inspection of the score sheet. Traditionally, 2 or more major crossings is a failure.

The D15 can also be scored using Colour Difference Vector analysis (Vingrys & King-Smith, 1988). Three parameters are calculated: the Confusion index (C index), the Specificity index (S index), and the angle. The C index indicates the severity of the defect. It is correlated with the number of crossings and the total error score. The S index provides a measurement of how regularly the crossings are oriented. A low S index is an indication of a random arrangement. The angle gives a measurement of the type of the defect, with protan angles larger than zero and deutan angles smaller than zero (Vingrys & King-Smith, 1988).

Figure 10. The Farnsworth-Munsell D15 colour test. (A) Colour normal D15 test arrangement and results drawn on the sheet. The red arrow indicates the reference cap. (B) D15 test arrangement of a subject with a moderate to severe protan colour vision deficiency and the corresponding score sheet.
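The traditional D15 rule described above (2 or more major crossings is a failure) can be sketched in Python as follows. The report does not define a major crossing numerically, so the sketch assumes that a step between consecutively placed caps counts as a major crossing when their cap numbers differ by at least `MAJOR_STEP`. That threshold, the function names and the example arrangements are all hypothetical and serve only to show how the rule is applied.

```python
MAJOR_STEP = 6  # hypothetical threshold; not defined in the report


def count_major_crossings(order, reference_cap=0, major_step=MAJOR_STEP):
    """Count steps between consecutive caps that span at least major_step caps."""
    sequence = [reference_cap] + list(order)
    return sum(1 for a, b in zip(sequence, sequence[1:])
               if abs(a - b) >= major_step)


def fails_d15(order, max_allowed_crossings=1):
    """Traditional rule: 2 or more major crossings is a failure."""
    return count_major_crossings(order) > max_allowed_crossings


# Illustrative arrangements (cap numbers 1 to 15, reference cap 0).
perfect = list(range(1, 16))
one_crossing = [1, 2, 3, 4, 5, 6, 7, 15, 14, 13, 12, 11, 10, 9, 8]
many_crossings = [15, 1, 14, 2, 13, 3, 12, 4, 11, 5, 10, 6, 9, 7, 8]

for name, order in [("perfect", perfect),
                    ("one crossing", one_crossing),
                    ("many crossings", many_crossings)]:
    print(name, count_major_crossings(order), fails_d15(order))
```

Setting `max_allowed_crossings=0` gives the stricter criterion in which any crossing is treated as a failure.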

The pass rate for colour defectives ranges from 45% to 53% if one major crossing is allowed (Atchison et al, 1991; Hovis et al, 2004; Birch, 2008); however, it is possible for a small number of dichromats (i.e. 3%) to pass the D15 using this criterion. The percentage of dichromats passing is reduced to 1.5% when any crossing is treated as a failure (Birch, 2008). Hovis et al (2004) determined the repeatability of the D15 using 116 red-green colour-defective subjects. They reported that if the failure criterion was 2 or more major crossings, the repeatability of the D15 was good, with a kappa (κ) coefficient of 0.84, but this was less than the κ value of 0.96 calculated from Farnsworth's data (Hovis, 2004). The reason for the difference was that Farnsworth included a large number of colour normals, which would improve the repeatability of the test because colour normals rarely, if ever, fail the test.

ColorDx D15. The ColorDx program also has a computerized version of the D15. Figure 11 shows the test. The program requires the subject to drag a coloured circle up to the top of the screen in order to use that colour to fill one of the empty rectangles. The colour selected should be the one that is most similar to the previous rectangle filled. The colours in the rectangles may be rearranged. To our knowledge, there are no studies reporting on its validity.

Figure 11. The ColorDx D15 test. The first blue rectangle on the top right is the reference rectangle.

Holmes-Wright Type A. The Holmes-Wright Type A (HWA) colour vision test was designed to mimic navigation lights used in the aviation industry (Holmes & Wright, 1982). Transport Canada Civil Aviation Medicine uses the HWA as an alternative to the D15. The test consists of three lights (red, green, and white) that are presented in pairs. The colours were selected to fall within the boundaries for aviation green, white and red. Thus, the test has face validity with the task, unlike the Farnsworth Lantern, which does not use test colours corresponding to actual signal colours (Cole & Vingrys, 1982). Figure 12 shows the Holmes-Wright Type A. The individual lights subtend a visual angle of 0.9 min of arc at the recommended viewing distance of 6 m. The intensity of the lights at the corneal plane using the high brightness setting is 5.4 μlux (Vingrys & Cole, 1983).

Figure 12. The Holmes-Wright Type A test.

Colour defectives have difficulty passing this test (Vingrys & Cole, 1983; Birch, 2008b; Hovis, 2008). However, dichromats can pass the test if the passing criterion includes stopping the test when there is a perfect score on the first run of the nine pairs of lights. This is a problem because the dichromats (and other colour defectives) cannot pass the test on repetition. The likely cause is too few presentations of the different test lights combined with some strategic guessing (Hovis, 2008). In order to avoid this problem, Hovis recommended that there should be multiple runs of the nine pairs of lights and that the failure score be based on the total number of errors. He recommended allowing no more than 2 errors on 27 pairs to minimize the probability of a person passing the test in the first session and failing in the second. This criterion would also ensure that nearly all the colour normals would pass on the first and second sessions.
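Two of the numbers in the HWA description lend themselves to a short worked example: the physical size implied by a light that subtends 0.9 min of arc at 6 m, and the recommended criterion of no more than 2 errors over 27 pairs (three runs of nine). The Python sketch below uses hypothetical function names. The computed diameter is derived arithmetic rather than a figure stated in the report.

```python
import math


def subtended_size_mm(angle_arcmin, distance_m):
    """Size of an object that subtends angle_arcmin at distance_m, in mm."""
    angle_rad = math.radians(angle_arcmin / 60.0)
    return 2.0 * distance_m * 1000.0 * math.tan(angle_rad / 2.0)


def passes_hwa(errors_per_run, max_total_errors=2):
    """Recommended rule: total errors over all runs must not exceed 2."""
    return sum(errors_per_run) <= max_total_errors


print(f"{subtended_size_mm(0.9, 6.0):.2f} mm")  # size of each light at 6 m
print(passes_hwa([0, 1, 1]))  # 2 errors over three runs of nine pairs
print(passes_hwa([0, 2, 1]))  # 3 errors over three runs of nine pairs
```

Basing the decision on the total over all three runs, rather than stopping after a perfect first run, is what limits the chance of passing through strategic guessing.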

Appendix 2. Screening questionnaire

I. Questionnaire

Subject number ...    Age ...    Gender ...

VA (D): RE ... LE ... (need 6/6 in better eye and 6/9 in other)
VA (100 cm): RE ... LE ... (need 1.6 M (6/24) in better eye and 2 M (6/30) in other)
VA (N 40 cm): RE ... LE ... (need 0.8 M (6/12) in better eye and 1 M (6/15) in other)

Each question is answered Yes or No.

1 Are you being treated for, or do you have, any of the following: Glaucoma; Optic Neuritis; Multiple Sclerosis; Diabetes?
2 Other than wearing spectacles or contact lenses and/or having a colour vision problem, are you aware of any other vision problems?
4 Do you use any of these medications? (circle) Chloroquine, Hydroxychloroquine (Plaquenil) or Digitalis (Digoxin)
5 Have you had cataract surgery?
6 Do you have a colour vision defect? (if the answer to 6 is NO please proceed to 7)
6a At what age did you first become aware of the problem with colour judgments? Age [ ]
6b Who informed you that you have a colour vision defect? ...
6c Do you have any difficulty with colour judgments in your daily activities? If yes, what is the major problem?
6d Did your colour vision problem affect your career choice?
7 Do you have any difficulty identifying the colour of traffic lights?
8 Do you have any problem finding traffic signal lights at night when there are numerous other street lights surrounding the traffic signal or in the background?
9 At some intersections, there are crosswalk signals for pedestrians. The hand symbol is a reddish orange and indicates that it is unsafe to cross. What colour is the man-figure which indicates that it is safe to cross?

APPENDIX 3. Cut-off scores that maximize the sensitivity for failing the Farnsworth D15

In the main report, the CAD, Cambridge Trivector, Rabin Cone Contrast and Landolt C (binocular viewing) computer-based tests were compared with the Farnsworth D15 (D15) in order to determine whether these newer tests could be an adequate replacement for the D15. For this comparison, the D15 was set as the standard test and a Receiver Operating Characteristic (ROC) analysis was used to determine the score on each computer-based test that maximized the sum of its sensitivity and specificity relative to the D15. The analysis showed that the CAD had a significantly higher level of agreement with the D15 than the rest of the tests, although the CAD agreement was not perfect. Table 1 at the end of this appendix summarizes those results. The CAD and D15 disagreed on approximately 30% of the subjects.

Because the agreement between the computer-based threshold tests and the D15 was based on optimizing the sum of the sensitivity and specificity, the question was raised as to how the agreement and predictive values would change if the sensitivity was maximized. That is, the pass/fail scores of the computer-based threshold tests were set so that nearly everyone who failed the D15 also failed the computer-based test. The CAD and Landolt C threshold data in Figures 1 and 2 illustrate the differences between the two options.

Figure 1 shows the distribution of CAD scores for the colour-defective subjects who passed or failed the D15 along with the CAD cut-off score of 20, which maximizes the sum of the specificity and sensitivity, and a cut-off score of 13.5, which maximizes the sensitivity without a severe reduction in the specificity. Ideally, one wants to maximize the number of individuals who passed the D15 (solid symbols) falling below the horizontal line, while simultaneously maximizing the number of individuals who failed the D15 (open symbols) falling above the horizontal line.

Figure 2 shows the dot histogram results for the Landolt C (binocular) using the highest of either the L or M cone threshold or the average of the L and M cone thresholds. The figure shows that the reason for the lower level of agreement between the Landolt C and D15 was that a larger number of subjects who passed the D15 had Landolt C threshold values that fell within the range of the threshold values of subjects who failed the D15. For the pass/fail criterion that maximized the sum of the sensitivity and specificity, both the sensitivity and specificity were lower relative to the CAD. For the criterion that maximized the sensitivity, the specificity decreased, resulting in a large number of false positives on the Landolt C.

Table 1 lists the agreement values and the predictive pass and fail values for the two cut-off scores for the CAD and other tests. The general trend is that as the cut-off score was changed to increase the sensitivity, the predictive value of the computer-based test for passing the D15 increased. That is, a higher percentage of the subjects who passed the computer-based test also passed the D15. However, the predictive value of the computer-based tests for failing the D15 decreased, indicating that a lower percentage of subjects who failed the computer-based test also failed the D15. Thus, the more stringent pass/fail criterion translates into more false positives for a given test. This means that a larger number of colour-defective candidates who currently meet the CV2 requirement would be disqualified from aircrew positions. Although the false positives increase with the stricter criterion, there was also a slight decrease in the number of false negatives so that the number of discrepancies was reduced slightly. For the CAD, the percentage of discrepancies using the stricter criterion decreases from 31% to 25%.
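The two ways of choosing a cut-off score discussed in this appendix can be sketched in Python as below, with the D15 as the standard and a failure on the computer-based test defined as a threshold above the cut-off. This is an illustration under stated assumptions, not the analysis used in the report: the function names, the minimum specificity of 0.5 and the threshold values are all hypothetical, and only observed scores are tried as candidate cut-offs.

```python
def sens_spec(failed_d15, passed_d15, cutoff):
    """Sensitivity and specificity of 'score > cutoff' for a D15 failure."""
    sensitivity = sum(s > cutoff for s in failed_d15) / len(failed_d15)
    specificity = sum(s <= cutoff for s in passed_d15) / len(passed_d15)
    return sensitivity, specificity


def cutoff_max_sum(failed_d15, passed_d15):
    """Cut-off that maximizes sensitivity + specificity."""
    candidates = sorted(set(failed_d15) | set(passed_d15))
    return max(candidates,
               key=lambda c: sum(sens_spec(failed_d15, passed_d15, c)))


def cutoff_max_sensitivity(failed_d15, passed_d15, min_specificity):
    """Cut-off with the highest sensitivity among those meeting min_specificity."""
    candidates = sorted(set(failed_d15) | set(passed_d15))
    eligible = [c for c in candidates
                if sens_spec(failed_d15, passed_d15, c)[1] >= min_specificity]
    # Ties in sensitivity are broken by the higher specificity.
    return max(eligible, key=lambda c: sens_spec(failed_d15, passed_d15, c))


def predictive_values(failed_d15, passed_d15, cutoff):
    """Return (predictive value for passing, predictive value for failing)."""
    tp = sum(s > cutoff for s in failed_d15)   # fail both tests
    fp = sum(s > cutoff for s in passed_d15)   # fail test, pass D15
    fn = len(failed_d15) - tp                  # pass test, fail D15
    tn = len(passed_d15) - fp                  # pass both tests
    pv_fail = tp / (tp + fp) if tp + fp else float("nan")
    pv_pass = tn / (tn + fn) if tn + fn else float("nan")
    return pv_pass, pv_fail


# Hypothetical thresholds for colour defectives, grouped by D15 outcome.
failed = [12, 18, 22, 25, 30, 35]
passed = [3, 5, 8, 11, 14, 16, 21]

for label, cut in [("max sum", cutoff_max_sum(failed, passed)),
                   ("max sensitivity", cutoff_max_sensitivity(failed, passed, 0.5))]:
    print(label, cut, sens_spec(failed, passed, cut),
          predictive_values(failed, passed, cut))
```

With these made-up data, moving to the sensitivity-maximizing cut-off raises the predictive value for passing the D15 and lowers the predictive value for failing it, which is the trend described above.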

Figure 1. D15 vs CAD. Dot histogram showing the distribution of CAD threshold values (SNU) for those subjects who passed or failed the Farnsworth D15. The solid line is the CAD score which maximizes the sum of the sensitivity and specificity and the dashed line maximizes the sensitivity with a specificity value greater than 0.

Figure 2. Dot histogram showing the distribution of Landolt C (binocular) highest of the L or M cone thresholds and average of the L and M cone thresholds for those subjects who passed or failed the Farnsworth D15. The solid line is the Landolt C score which maximizes the sum of the sensitivity and specificity and the dashed line maximizes the sensitivity with a specificity value greater than 0.


More information

Evaluation of Next-Generation Vision Testers for Aeromedical Certification of Aviation Personnel

Evaluation of Next-Generation Vision Testers for Aeromedical Certification of Aviation Personnel Federal Aviation Administration DOT/FAA/AM-09/13 Office of Aerospace Medicine Washington, DC 20591 Evaluation of Next-Generation Vision Testers for Aeromedical Certification of Aviation Personnel Van B.

More information

Perceptual Learning of Categorical Colour Constancy, and the Role of Illuminant Familiarity

Perceptual Learning of Categorical Colour Constancy, and the Role of Illuminant Familiarity Perceptual Learning of Categorical Colour Constancy, and the Role of Illuminant Familiarity J. A. Richardson and I. Davies Department of Psychology, University of Surrey, Guildford GU2 5XH, Surrey, United

More information

3/16/2018. Perimetry

3/16/2018. Perimetry Perimetry The normal visual field extends further away from fixation temporally and inferiorly than superiorly and nasally. From the center of the retina this sensitivity decreases towards the periphery,

More information

VISION. Software for the colour blind. INFORMATION PACK Version 3 08 January Product of Vision Technology Ltd. All rights reserved.

VISION. Software for the colour blind. INFORMATION PACK Version 3 08 January Product of Vision Technology Ltd. All rights reserved. VISION Software for the colour blind INFORMATION PACK Version 3 08 January 2018 INFORMATION PACK Version 3, 08 January 2018 CONTENTS 1. What does Vision do?...1 2. What is colour blindness and which are

More information

Opponent theory PSY 310 Greg Francis. Lecture 18. Trichromatic theory

Opponent theory PSY 310 Greg Francis. Lecture 18. Trichromatic theory PSY 310 Greg Francis Lecture 18 Reach that last 1%. Trichromatic theory Different colors are represented as a pattern across the three basic colors Nicely predicted the existence of the three cone types

More information

ACQUIRED COLOUR VISION DEFECTS IN OPTIC NERVE DISORDERS- A COMPARISON BETWEEN ISHIHARA S TEST AND ROTH 28-HUE TEST DR. TEENA MARIET MENDONCA

ACQUIRED COLOUR VISION DEFECTS IN OPTIC NERVE DISORDERS- A COMPARISON BETWEEN ISHIHARA S TEST AND ROTH 28-HUE TEST DR. TEENA MARIET MENDONCA ACQUIRED COLOUR VISION DEFECTS IN OPTIC NERVE DISORDERS- A COMPARISON BETWEEN ISHIHARA S TEST AND ROTH 28-HUE TEST By DR. TEENA MARIET MENDONCA Dissertation Submitted to the Rajiv Gandhi University of

More information

2. METHODS. 2.1 Apparatus

2. METHODS. 2.1 Apparatus Pupillary light reflex associated with melanopsin and cone photorecetors Sei-ichi Tsujimura, 1 Katsunori Okajima, 2 1 Faculty of Sciences and Engineering, Kagoshima University, Japan 2 Faculty of Environment

More information

The lowest level of stimulation that a person can detect. absolute threshold. Adapting one's current understandings to incorporate new information.

The lowest level of stimulation that a person can detect. absolute threshold. Adapting one's current understandings to incorporate new information. absolute threshold The lowest level of stimulation that a person can detect accommodation Adapting one's current understandings to incorporate new information. acuity Sharp perception or vision audition

More information

Further studies supporting the identity of congenital tritanopia and hereditary dominant optic atrophy

Further studies supporting the identity of congenital tritanopia and hereditary dominant optic atrophy Further studies supporting the identity of congenital tritanopia and hereditary dominant optic atrophy Alex E. Krill, Vivianne C. Smith, and Joel Pokorny Dominant inherited optic atrophy is usually a stationary

More information

Colour vision screening: a critical appraisal of the literature New Zealand Health Technology Assessment

Colour vision screening: a critical appraisal of the literature New Zealand Health Technology Assessment Colour vision screening: a critical appraisal of the literature New Zealand Health Technology Assessment Authors' objectives To provide an evidence-based review evaluating colour vision screening through

More information

Recognition Memory for Colored and Black-and-White Scenes in Normal and Color Deficient Observers (Dichromats)

Recognition Memory for Colored and Black-and-White Scenes in Normal and Color Deficient Observers (Dichromats) Recognition Memory for Colored and Black-and-White Scenes in Normal and Color Deficient Observers (Dichromats) Serge Brédart 1 *, Alyssa Cornet 1, Jean-Marie Rakic 2 1 Department of Psychology, University

More information

Traffic Sign Detection and Identification

Traffic Sign Detection and Identification University of Iowa Iowa Research Online Driving Assessment Conference 2013 Driving Assessment Conference Jun 19th, 12:00 AM Traffic Sign Detection and Identification Vaughan W. Inman SAIC, McLean, VA Brian

More information

The Perception of Static Colored Noise: Detection and Masking Described by CIE94

The Perception of Static Colored Noise: Detection and Masking Described by CIE94 The Perception of Static Colored Noise: Detection and Masking Described by CIE94 Marcel P. Lucassen, 1 * Piet Bijl, 1 Jolanda Roelofsen 2 1 TNO Human Factors, Soesterberg, The Netherlands 2 Cognitive Psychology,

More information

Framework for Comparative Research on Relational Information Displays

Framework for Comparative Research on Relational Information Displays Framework for Comparative Research on Relational Information Displays Sung Park and Richard Catrambone 2 School of Psychology & Graphics, Visualization, and Usability Center (GVU) Georgia Institute of

More information

7. Sharp perception or vision 8. The process of transferring genetic material from one cell to another by a plasmid or bacteriophage

7. Sharp perception or vision 8. The process of transferring genetic material from one cell to another by a plasmid or bacteriophage 1. A particular shade of a given color 2. How many wave peaks pass a certain point per given time 3. Process in which the sense organs' receptor cells are stimulated and relay initial information to higher

More information

Assessing the effects of dynamic luminance contrast noise masking on a color discrimination task

Assessing the effects of dynamic luminance contrast noise masking on a color discrimination task A178 Vol. 33, No. 3 / March 2016 / Journal of the Optical Society of America A Research Article Assessing the effects of dynamic luminance contrast noise masking on a color discrimination task JOÃO M.

More information

City, University of London Institutional Repository

City, University of London Institutional Repository City Research Online City, University of London Institutional Repository Citation: Rauscher, F. G., Chisholm, C. M., Edgar, D. F and Barbur, J. L. (2013). Assessment of novel binocular colour, motion and

More information

Predictive Validity of the Aviation Lights Test for Testing Pilots With Color Vision Deficiencies

Predictive Validity of the Aviation Lights Test for Testing Pilots With Color Vision Deficiencies DOT/FAA/AM-4/14 Office of Aerospace Medicine Washington, DC 2591 Predictive Validity of the Aviation Lights Test for Testing Pilots With Color Vision Deficiencies Nelda J. Milburn Henry W. Mertens Civil

More information

Do You See What I See?

Do You See What I See? Undergraduate Review Volume 6 Article 19 2010 Do You See What I See? Rachel Mullins Follow this and additional works at: http://vc.bridgew.edu/undergrad_rev Part of the Other Psychology Commons Recommended

More information

Temporal Feature of S-cone Pathway Described by Impulse Response Function

Temporal Feature of S-cone Pathway Described by Impulse Response Function VISION Vol. 20, No. 2, 67 71, 2008 Temporal Feature of S-cone Pathway Described by Impulse Response Function Keizo SHINOMORI Department of Information Systems Engineering, Kochi University of Technology

More information

Vision Research. Very-long-term and short-term chromatic adaptation: Are their influences cumulative? Suzanne C. Belmore a, Steven K.

Vision Research. Very-long-term and short-term chromatic adaptation: Are their influences cumulative? Suzanne C. Belmore a, Steven K. Vision Research 51 (2011) 362 366 Contents lists available at ScienceDirect Vision Research journal homepage: www.elsevier.com/locate/visres Very-long-term and short-term chromatic adaptation: Are their

More information

Spectral sensitivity and color discrimination changes in glaucoma and glaucoma-suspect patients

Spectral sensitivity and color discrimination changes in glaucoma and glaucoma-suspect patients Spectral sensitivity and color discrimination changes in glaucoma and glaucoma-suspect patients Anthony J. Adams, Rosemary Rodic, Roger Husted,* and Robert Stamper Color vision changes may occur early

More information

Vision Seeing is in the mind

Vision Seeing is in the mind 1 Vision Seeing is in the mind Stimulus: Light 2 Light Characteristics 1. Wavelength (hue) 2. Intensity (brightness) 3. Saturation (purity) 3 4 Hue (color): dimension of color determined by wavelength

More information

DRIVING HAZARD DETECTION WITH A BIOPTIC TELESCOPE

DRIVING HAZARD DETECTION WITH A BIOPTIC TELESCOPE DRIVING HAZARD DETECTION WITH A BIOPTIC TELESCOPE Amy Doherty, Eli Peli & Gang Luo Schepens Eye Research Institute, Mass Eye and Ear, Harvard Medical School Boston, Massachusetts, USA Email: amy_doherty@meei.harvard.edu

More information

A contrast paradox in stereopsis, motion detection and vernier acuity

A contrast paradox in stereopsis, motion detection and vernier acuity A contrast paradox in stereopsis, motion detection and vernier acuity S. B. Stevenson *, L. K. Cormack Vision Research 40, 2881-2884. (2000) * University of Houston College of Optometry, Houston TX 77204

More information

Seeing Color. Muller (1896) The Psychophysical Axioms. Brindley (1960) Psychophysical Linking Hypotheses

Seeing Color. Muller (1896) The Psychophysical Axioms. Brindley (1960) Psychophysical Linking Hypotheses Muller (1896) The Psychophysical Axioms The ground of every state of consciousness is a material process, a psychophysical process so-called, to whose occurrence the state of consciousness is joined To

More information

HOW DOES PERCEPTUAL LOAD DIFFER FROM SENSORY CONSTRAINS? TOWARD A UNIFIED THEORY OF GENERAL TASK DIFFICULTY

HOW DOES PERCEPTUAL LOAD DIFFER FROM SENSORY CONSTRAINS? TOWARD A UNIFIED THEORY OF GENERAL TASK DIFFICULTY HOW DOES PERCEPTUAL LOAD DIFFER FROM SESORY COSTRAIS? TOWARD A UIFIED THEORY OF GEERAL TASK DIFFICULTY Hanna Benoni and Yehoshua Tsal Department of Psychology, Tel-Aviv University hannaben@post.tau.ac.il

More information

Appendix: Instructions for Treatment Index B (Human Opponents, With Recommendations)

Appendix: Instructions for Treatment Index B (Human Opponents, With Recommendations) Appendix: Instructions for Treatment Index B (Human Opponents, With Recommendations) This is an experiment in the economics of strategic decision making. Various agencies have provided funds for this research.

More information

Color Vision Measured With Pseudoisochromatic Plates at Five-and-a-Half Years in Eyes of Children From the CRYO-ROP Study

Color Vision Measured With Pseudoisochromatic Plates at Five-and-a-Half Years in Eyes of Children From the CRYO-ROP Study Color Vision Measured With Pseudoisochromatic Plates at Five-and-a-Half Years in Eyes of Children From the CRYO-ROP Study VelmaDobson* Graham E. Quiring Israel AbramovX Robert], Hardy,% Betty Tung,% R.

More information

Behavioural Processes

Behavioural Processes Behavioural Processes 95 (23) 4 49 Contents lists available at SciVerse ScienceDirect Behavioural Processes journal homepage: www.elsevier.com/locate/behavproc What do humans learn in a double, temporal

More information

CANTAB Test descriptions by function

CANTAB Test descriptions by function CANTAB Test descriptions by function The 22 tests in the CANTAB battery may be divided into the following main types of task: screening tests visual memory tests executive function, working memory and

More information

Definition Slides. Sensation. Perception. Bottom-up processing. Selective attention. Top-down processing 11/3/2013

Definition Slides. Sensation. Perception. Bottom-up processing. Selective attention. Top-down processing 11/3/2013 Definition Slides Sensation = the process by which our sensory receptors and nervous system receive and represent stimulus energies from our environment. Perception = the process of organizing and interpreting

More information

Structure of the eye and retina

Structure of the eye and retina 1 of 10 3/6/2012 1:06 PM Syllabus pdf file Course Schedule Structure of the eye and retina 2 of 10 3/6/2012 1:06 PM In-class demo: do Virtual Lab activity 3-6 (Visual Path in the Eyeball) Focusing, changes

More information

Reliability and validity of the GymAware optical encoder to measure displacement data

Reliability and validity of the GymAware optical encoder to measure displacement data Reliability and validity of the GymAware optical encoder to measure displacement data Study details Organization: Kinetic Performance Technology 8/26-28 Winchcombe Crt Mitchell ACT 2911 Australia Site

More information

= add definition here. Definition Slide

= add definition here. Definition Slide = add definition here Definition Slide Definition Slides Sensation = the process by which our sensory receptors and nervous system receive and represent stimulus energies from our environment. Perception

More information

WHITE PAPER. Efficient Measurement of Large Light Source Near-Field Color and Luminance Distributions for Optical Design and Simulation

WHITE PAPER. Efficient Measurement of Large Light Source Near-Field Color and Luminance Distributions for Optical Design and Simulation Efficient Measurement of Large Light Source Near-Field Color and Luminance Distributions for Optical Design and Simulation Efficient Measurement of Large Light Source Near-Field Color and Luminance Distributions

More information

Adapting internal statistical models for interpreting visual cues to depth

Adapting internal statistical models for interpreting visual cues to depth Journal of Vision (2010) 10(4):1, 1 27 http://journalofvision.org/10/4/1/ 1 Adapting internal statistical models for interpreting visual cues to depth Anna Seydell David C. Knill Julia Trommershäuser Department

More information

OPTO 5320 VISION SCIENCE I

OPTO 5320 VISION SCIENCE I OPTO 5320 VISION SCIENCE I Monocular Sensory Processes of Vision: Color Vision Mechanisms of Color Processing . Neural Mechanisms of Color Processing A. Parallel processing - M- & P- pathways B. Second

More information

Natural Scene Statistics and Perception. W.S. Geisler

Natural Scene Statistics and Perception. W.S. Geisler Natural Scene Statistics and Perception W.S. Geisler Some Important Visual Tasks Identification of objects and materials Navigation through the environment Estimation of motion trajectories and speeds

More information

University of Groningen

University of Groningen University of Groningen Effect of compensatory viewing strategies on practical fitness to drive in subjects with visual field defects caused by ocular pathology Coeckelbergh, Tanja Richard Maria IMPORTANT

More information

COLOUR CONSTANCY: A SIMULATION BY ARTIFICIAL NEURAL NETS

COLOUR CONSTANCY: A SIMULATION BY ARTIFICIAL NEURAL NETS OLOUR ONSTANY: A SIMULATION BY ARTIFIIAL NEURAL NETS enrikas Vaitkevicius and Rytis Stanikunas Faculty of Psychology, Vilnius University, Didlaukio 47, 257 Vilnius, Lithuania e-mail: henrikas.vaitkevicius@ff.vu.lt

More information

The effects of subthreshold synchrony on the perception of simultaneity. Ludwig-Maximilians-Universität Leopoldstr 13 D München/Munich, Germany

The effects of subthreshold synchrony on the perception of simultaneity. Ludwig-Maximilians-Universität Leopoldstr 13 D München/Munich, Germany The effects of subthreshold synchrony on the perception of simultaneity 1,2 Mark A. Elliott, 2 Zhuanghua Shi & 2,3 Fatma Sürer 1 Department of Psychology National University of Ireland Galway, Ireland.

More information

ASSESSMENT OF DRIVING-RELATED SKILL (ADReS)

ASSESSMENT OF DRIVING-RELATED SKILL (ADReS) ASSESSMENT OF DRIVING-RELATED SKILL (ADReS) There are 3 key functions for safe driving: Vision, Cognition, and Motor Function. The ADReS assesses these 3 functions. As occupational therapists, we can perform

More information

Evaluating color deficiency simulation and daltonization methods through visual search and sample-to-match: SaMSEM and ViSDEM

Evaluating color deficiency simulation and daltonization methods through visual search and sample-to-match: SaMSEM and ViSDEM Evaluating color deficiency simulation and daltonization methods through visual search and sample-to-match: SaMSEM and ViSDEM Joschua Thomas Simon-Liedtke a and Ivar Farup a and Bruno Laeng b a The Norwegian

More information

ID# Exam 1 PS 325, Fall 2004

ID# Exam 1 PS 325, Fall 2004 ID# Exam 1 PS 325, Fall 2004 As always, the Skidmore Honor Code is in effect. Read each question carefully and answer it completely. Multiple-choice questions are worth one point each, other questions

More information

Development of a new loudness model in consideration of audio-visual interaction

Development of a new loudness model in consideration of audio-visual interaction Development of a new loudness model in consideration of audio-visual interaction Kai AIZAWA ; Takashi KAMOGAWA ; Akihiko ARIMITSU 3 ; Takeshi TOI 4 Graduate school of Chuo University, Japan, 3, 4 Chuo

More information

glaucoma and ocular hypertension

glaucoma and ocular hypertension British Journal of Ophthalmology, 1980, 64, 852-857 Colour vision in patients with chronic simple glaucoma and ocular hypertension D. POINOOSAWMY, S. NAGASUBRAMANIAN, AND J. GLOSTER From the Glaucoma Unit,

More information

Modifiers and Retransmitters (Secondary Light Sources)

Modifiers and Retransmitters (Secondary Light Sources) Vision and Light Vision Generators Transmitters (Light Sources) Modifiers and Retransmitters (Secondary Light Sources) Receivers Decoder Encoders Interpreter (Eyes) (Brain) Sun, Discharge lamps, fluorescent

More information

X-Linked Incomplete Achromatopsia with more than One Class of Functional Cones

X-Linked Incomplete Achromatopsia with more than One Class of Functional Cones X-Linked Incomplete Achromatopsia with more than One Class of unctional Cones Vivianne C. Smith,* Joel Pokorny,* J. W. Dellemaat. Cozjjnsen,^: W. A. Houtman, and L.. Went ive affected males in the fifth

More information

Carmen Barnhardt, O.D., M.S., a Sandra S. Block, O.D., M.Ed., b Beth Deemer, O.D., a Amy Jo Calder, O.D., a and Paul DeLand, Ph.D.

Carmen Barnhardt, O.D., M.S., a Sandra S. Block, O.D., M.Ed., b Beth Deemer, O.D., a Amy Jo Calder, O.D., a and Paul DeLand, Ph.D. Optometry (2006) 77, 211-216 Color vision screening for individuals with intellectual disabilities: A comparison between the Neitz Test of Color Vision and Color Vision Testing Made Easy Carmen Barnhardt,

More information

Selective changes of sensitivity after adaptation to simple geometrical figures*

Selective changes of sensitivity after adaptation to simple geometrical figures* Perception & Psychophysics 1973. Vol. 13. So. 2.356-360 Selective changes of sensitivity after adaptation to simple geometrical figures* ANGEL VASSILEV+ Institu te of Physiology. Bulgarian Academy of Sciences.

More information

Observer variability experiment using a four-primary display and its relationship with physiological factors

Observer variability experiment using a four-primary display and its relationship with physiological factors Observer variability experiment using a four-primary display and its relationship with physiological factors Yuta Asano 1,MarkD.Fairchild 1,LaurentBlondé 2 1 Munsell Color Science Laboratory, Rochester

More information

Introduction to Physiological Psychology

Introduction to Physiological Psychology Introduction to Physiological Psychology Vision ksweeney@cogsci.ucsd.edu cogsci.ucsd.edu/~ksweeney/psy260.html This class n Sensation vs. Perception n How light is translated into what we see n Structure

More information

Note: This is an outcome measure and will be calculated solely using registry data.

Note: This is an outcome measure and will be calculated solely using registry data. Measure #303 (NQF 1536): Cataracts: Improvement in Patient s Visual Function within 90 Days Following Cataract Surgery National Quality Strategy Domain: Person and Caregiver-Centered Experience and Outcomes

More information

Mean Observer Metamerism and the Selection of Display Primaries

Mean Observer Metamerism and the Selection of Display Primaries Mean Observer Metamerism and the Selection of Display Primaries Mark D. Fairchild and David R. Wyble, Rochester Institute of Technology, Munsell Color Science Laboratory, Rochester, NY/USA Abstract Observer

More information

Examining Effective Navigational Learning Strategies for the Visually Impaired

Examining Effective Navigational Learning Strategies for the Visually Impaired Examining Effective Navigational Learning Strategies for the Visually Impaired Jeremy R. Donaldson Department of Psychology, University of Utah This study focuses on navigational learning strategies and

More information

Performance and Saliency Analysis of Data from the Anomaly Detection Task Study

Performance and Saliency Analysis of Data from the Anomaly Detection Task Study Performance and Saliency Analysis of Data from the Anomaly Detection Task Study Adrienne Raglin 1 and Andre Harrison 2 1 U.S. Army Research Laboratory, Adelphi, MD. 20783, USA {adrienne.j.raglin.civ, andre.v.harrison2.civ}@mail.mil

More information

UNCLASSIFIED AD NUMBER LIMITATION CHANGES

UNCLASSIFIED AD NUMBER LIMITATION CHANGES TO: UNCLASSIFIED AD NUMBER ADA071931 LIMITATION CHANGES Approved for public release; distribution is unlimited. FROM: Distribution authorized to U.S. Gov't. agencies only; Test and Evaluation; 06 JUN 1978.

More information

Note: This is an outcome measure and will be calculated solely using registry data.

Note: This is an outcome measure and will be calculated solely using registry data. Quality ID #303 (NQF 1536): Cataracts: Improvement in Patient s Visual Function within 90 Days Following Cataract Surgery National Quality Strategy Domain: Person and Caregiver-Centered Experience and

More information

STANDARD AUTOMATED PERIMETRY IS A GENERALLY

STANDARD AUTOMATED PERIMETRY IS A GENERALLY Comparison of Long-term Variability for Standard and Short-wavelength Automated Perimetry in Stable Glaucoma Patients EYTAN Z. BLUMENTHAL, MD, PAMELA A. SAMPLE, PHD, LINDA ZANGWILL, PHD, ALEXANDER C. LEE,

More information

VISUAL FIELDS. Visual Fields. Getting the Terminology Sorted Out 7/27/2018. Speaker: Michael Patrick Coleman, COT & ABOC

VISUAL FIELDS. Visual Fields. Getting the Terminology Sorted Out 7/27/2018. Speaker: Michael Patrick Coleman, COT & ABOC VISUAL FIELDS Speaker: Michael Patrick Coleman, COT & ABOC Visual Fields OBJECTIVES: 1. Explain what is meant by 30-2 in regards to the Humphrey Visual Field test 2. Identify the difference between a kinetic

More information

CHANGES IN VISUAL SPATIAL ORGANIZATION: RESPONSE FREQUENCY EQUALIZATION VERSUS ADAPTATION LEVEL

CHANGES IN VISUAL SPATIAL ORGANIZATION: RESPONSE FREQUENCY EQUALIZATION VERSUS ADAPTATION LEVEL Journal of Experimental Psychology 1973, Vol. 98, No. 2, 246-251 CHANGES IN VISUAL SPATIAL ORGANIZATION: RESPONSE FREQUENCY EQUALIZATION VERSUS ADAPTATION LEVEL WILLIAM STEINBERG AND ROBERT SEKULER 2 Northwestern

More information

Fundamentals of Psychophysics

Fundamentals of Psychophysics Fundamentals of Psychophysics John Greenwood Department of Experimental Psychology!! NEUR3045! Contact: john.greenwood@ucl.ac.uk 1 Visual neuroscience physiology stimulus How do we see the world? neuroimaging

More information

Color Repeatability of Spot Color Printing

Color Repeatability of Spot Color Printing Color Repeatability of Spot Color Printing Robert Chung* Keywords: color, variation, deviation, E Abstract A methodology that quantifies variation as well as deviation of spot color printing is developed.

More information

CAN WE PREDICT STEERING CONTROL PERFORMANCE FROM A 2D SHAPE DETECTION TASK?

CAN WE PREDICT STEERING CONTROL PERFORMANCE FROM A 2D SHAPE DETECTION TASK? CAN WE PREDICT STEERING CONTROL PERFORMANCE FROM A 2D SHAPE DETECTION TASK? Bobby Nguyen 1, Yan Zhuo 2 & Rui Ni 1 1 Wichita State University, Wichita, Kansas, USA 2 Institute of Biophysics, Chinese Academy

More information