Correlating Trust with Signal Detection Theory Measures in a Hybrid Inspection System

Xiaochun Jiang
Department of Industrial and Systems Engineering
North Carolina A&T State University
1601 E Market St, Greensboro, NC 27411

Mohammad T. Khasawneh, Sittichai Kaewkuekool, Shannon R. Bowling, Brian J. Melloy, Anand K. Gramopadhye
Department of Industrial Engineering
Clemson University
Clemson, SC 29634

Abstract

Signal detection theory provides a precise language and graphic notation for analyzing decision making in the presence of uncertainty, and it has been widely used in visual inspection to evaluate system performance. Recent studies have also found that trust in automation plays an important role in visual inspection. In particular, trust strongly influences the use of automation in a hybrid inspection system, in which humans and machines work cooperatively. This study attempts to correlate trust with signal detection theory measures in a hybrid inspection system. The results indicate that the two are correlated.

Keywords
Signal Detection Theory, Automation, Trust, Hybrid Inspection

1. Introduction

Customer awareness regarding product quality and increased incidences of product liability litigation have increased the importance of the inspection process in manufacturing industries [1]. To remain competitive, manufacturers can accept only extremely low defect rates, often measured in parts per million. This situation requires almost perfect inspection performance in the search for nonconformities in a product, and the two functions central to inspection, visual search and decision making [2], have been shown to be the primary determinants of inspection performance [3]. If inspection is to be successful, it is critical that these functions be performed effectively and efficiently. Since consumers demand that a product be free of defects, 100% inspection has to be applied in order to achieve this zero-defect quality [4].
Unfortunately, while the need for error-free detection is important, human inspectors are less than 100% reliable [5]. To overcome this deficiency and to remove errors from the system, many companies are moving towards automated systems designed for 100% inspection. This growing interest in automated visual inspection has resulted in the development of faster, more efficient image processing equipment, and advances in computer technology, sensing devices, image processing, and pattern recognition have made automated systems not only better but also less expensive. As a result of this trend, some of the tasks previously performed by humans can now be allocated to computers, so the role of inspectors has changed from that of an active controller to that of a supervisor [6]. Given this change, it becomes critical to understand the role of both humans and computers in visual inspection, especially since the availability of computer-based systems and optically sophisticated microprocessor-based devices has led designers to automate the various functions of the inspection task on the assumption that this will eliminate human errors from the inspection process. However, as Hou et al. [4] pointed out, humans and computers each have their own advantages and disadvantages. Humans have the innate ability to recognize patterns, make rational decisions, and quickly adapt to new situations. An automated system cannot surpass the superior decision-making ability of the human inspector; thus, human inspectors are better suited when complex decision making is involved [7]. However, they are limited in their computational ability and short-term memory. Computers, on the other hand, are good at computation, memory storage, and retrieval, but are poor at detecting signals in noise and have very little capacity for creative or inductive functions [8]. Therefore, neither an entirely human nor a purely automated system may fully achieve the desired performance in an inspection task. It is possible, though, that superior performance could be achieved by a hybrid inspection system in which search and decision-making tasks are allocated to humans, machines, or both. Hou et al. [4] proposed the seven alternative hybrid inspection systems listed in Table 1, with Alternative 7, the most complicated and flexible, chosen for use in the current study.

Table 1. Allocation alternatives in hybrid inspection task

Alternative   Search             Decision-making    System Mode
1             Human              Computer           Hybrid
2             Computer           Human              Hybrid
3             Human              Human + Computer   Hybrid
4             Computer           Human + Computer   Hybrid
5             Human + Computer   Human              Hybrid
6             Human + Computer   Computer           Hybrid
7             Human + Computer   Human + Computer   Hybrid

To measure inspection system performance, Signal Detection Theory (SDT) is often used to model the decision-making process in an inspection task [9]. SDT provides a precise language and graphic notation for analyzing decision making in the presence of uncertainty and has been widely used in visual inspection to evaluate system performance. An inspector gathers data from each observation and decides whether a particular item was sampled from a distribution of conforming items or a distribution of nonconforming items. Because of the continuous variation in noise underlying both of these distributions, some conforming items will be classified as nonconforming and some nonconforming items as conforming.
In this research, the simplest version of SDT is considered, which assumes that the signal (nonconforming) and noise (conforming) distributions are normal with equal variances (Figure 1). SDT defines inspection performance using two parameters: sensitivity and response criterion.

1. Sensitivity (d′): This refers to the ability of the inspector to discriminate between a conforming and a nonconforming item and is a function of the overlap of the two distributions shown in Figure 1. It can be expressed as d′ = z(p1) + z(p2), where z(p1) and z(p2) are the standard normal deviates corresponding to p1 and p2.

2. Response criterion (β): This refers to an inspector's response bias, i.e., the tendency of the inspector to call an item conforming or nonconforming. Given that an inspector has to call an item conforming or nonconforming, there are four possible outcomes:

   Hit: Saying the item is nonconforming when it is nonconforming
   False alarm: Saying the item is nonconforming when it is conforming
   Miss: Saying the item is conforming when it is nonconforming
   Correct rejection: Saying the item is conforming when it is conforming

Referring to Figure 1, if the response criterion were placed where the two distributions cross, beta would equal one, indicating a neutral system. If the response criterion were shifted to the right, beta would increase; the inspector would call an item nonconforming less often and hence would have fewer hits, but would also have fewer false alarms, indicating a conservative system. If the response criterion were shifted to the left, beta would decrease; the inspector would call an item nonconforming more often and hence would have more hits, but would also have more false alarms, indicating a risky system.

Figure 1. Signal detection theory

To address the issue of best system performance [6], though, it is critical to study trust in hybrid inspection systems, because human trust in automation can directly impact inspection quality and overall inspection performance. In response to this need, a trust questionnaire was developed to determine the effects of the level of trust an operator has in hybrid inspection systems [10]. As shown in Figure 2, this questionnaire incorporated the four dimensions of trust (competence, predictability, reliability, and faith) derived from the multidimensional construct developed by Muir [11] and used them to determine which were the best predictors of trust. Using this questionnaire, a study [12] was conducted to measure trust in an inspection system, and the results indicated that measuring trust in a hybrid inspection environment with a trust questionnaire is feasible.

Figure 2. Screen shot of the trust questionnaire

Since trust can be measured using a questionnaire, the next step is to apply it to a hybrid inspection system, with trust between the human and the machine becoming a primary focus in evaluating inspection performance. Since human trust can be affected by the accuracy of the machine, it is hypothesized that the influence of trust can be investigated by manipulating the types of errors made by the system. The objective of this research is to correlate trust in a hybrid inspection system with the system's response criterion.
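As a concrete illustration (not part of the original study), the two SDT measures can be computed from a hit proportion and a false-alarm proportion. The function below is a minimal sketch under the equal-variance Gaussian model described above; the equivalent form z(hit) − z(false alarm) matches the paper's z(p1) + z(p2) when p2 is taken as the correct-rejection rate, since z(1 − FA) = −z(FA).

```python
import math
from statistics import NormalDist

def sdt_measures(hit_rate: float, false_alarm_rate: float):
    """d' and beta under the equal-variance Gaussian SDT model."""
    z_hit = NormalDist().inv_cdf(hit_rate)
    z_fa = NormalDist().inv_cdf(false_alarm_rate)
    # Sensitivity: separation between the signal and noise distributions
    d_prime = z_hit - z_fa
    # Response criterion: ratio of signal to noise likelihoods
    # (standard normal densities) evaluated at the criterion point
    beta = math.exp((z_fa ** 2 - z_hit ** 2) / 2)
    return d_prime, beta

# A symmetric criterion (hit rate mirrors false-alarm rate) gives
# beta = 1, i.e., a neutral system:
d, b = sdt_measures(0.8, 0.2)
print(round(d, 3), round(b, 3))  # 1.683 1.0
```

Shifting the criterion right (fewer hits, fewer false alarms) drives beta above 1, matching the conservative system described in the text; shifting it left drives beta below 1, matching the risky system.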

2. Methodology

2.1 Subjects
The subjects were six students, both graduate and undergraduate, enrolled at Clemson University, between the ages of 18 and 28. Students can be used as subjects in lieu of inspectors because, as Gallwey and Drury [13] have shown, minimal differences exist between inspectors and student subjects on simulated tasks. The subjects were screened for 20/20 vision, corrected if necessary, and were paid $5.00/hour for their time.

2.2 Stimulus Material
The task was a simulated visual inspection task of printed circuit board (PCB) inspection implemented on a Pentium III computer with a 19" high-resolution (1024 x 768) monitor. The input devices were a Microsoft standard keyboard and a Microsoft one-button mouse. The task consisted of inspecting simulated PCB images, which were developed using Adobe Photoshop 5.5. The inspector searched the PCB boards for six categories of defects: missing components, wrong components, inverted components, misaligned components, trace defects, and board defects. The first four categories could occur on any of the individual components (resistors, capacitors, transistors, and integrated circuits).

2.3 Inspection Task
A hybrid inspection system in which the human and computer share both search and decision making was used in this study. In this system, both the computer and the human searched for defects and made the decision on the board, with the human having the final say about whether to accept or override the computer's search or decision-making results [14]. During visual search, PCB boards containing 1, 2, 3, or no defects were presented to the subjects, whose task was to locate all potential defects and name them. After locating a defect, they clicked the mouse on it and chose its name from a dropdown box listing all possible defects. At the same time, the computer performed the same search task. However, subjects could override the computer if they did not agree with its search results.
Then, the computer made its conformance decision and the subjects made their final conformance decisions, either agreeing or disagreeing with the computer. Once the board was classified, the image of the next board was presented to the subjects. Each inspection task consisted of 48 randomly ordered PCB boards: 12 each of zero-defect, single-defect, two-defect, and three-defect boards. Figure 3 shows a typical decision-making response by the computer and the human inspector's decision to override this decision.

Figure 3. Screenshot from a hybrid inspection system

2.4 Experimental Design
This study used a single-factor (response criterion) within-subject design. The three levels of the response criterion were conservative (high false alarms / low misses), neutral (equal false alarms and misses), and risky (high misses / low false alarms). The sensitivities of the three systems were designed to be very close (d′ equal to 0.65); hence, the current study focused only on the relationship between trust and the response criterion. The response criteria for the three systems were set at 1.16 for the conservative system, 1.0 for the neutral system, and 0.86 for the risky system. Two Latin squares with different orders were used to cancel out order effects, and all treatments were randomly assigned to the three Latin letters.

2.5 Procedure
The study took place over a seven-day period. Day One was devoted to training the subjects, and during the next six days data were collected on the criterion tasks. A more detailed explanation of the activities conducted on each day follows.

On Day One, each subject was required to complete a consent form and a demographics questionnaire. Following this step, instructions were read to the subjects to ensure their understanding of the experiment. Next, all were trained and given three separate tests before beginning the experiment. After completion of defect training, the subjects underwent training sessions on defect matching, single-defect inspection, and multiple-defect inspection. Following each session, the subjects were administered a test, and only those who secured a minimum score were allowed to proceed to the next step.

1. Defect matching: PCBs with a marked single defect were displayed on the screen, and subjects classified it by choosing the correct name from a dropdown box. They were provided with immediate feedback about the correctness of their responses.

2. Single-defect training: PCBs with a single defect were displayed on the screen, and subjects located and then classified it by choosing the name from a dropdown box. They were provided with immediate feedback on their search performance using speed and accuracy measures.

3. Multiple-defect training: PCBs with 1, 2, or 3 defects were displayed on the screen, and subjects first visually searched for and then classified them. They were provided with immediate feedback on their performance using speed and accuracy measures.

On Day Two, to develop a baseline of each subject's trust in the system, each was administered a criterion task with 24 PCB boards to inspect using a perfect hybrid inspection system, i.e., a system that did not make any errors. On Day Three, subjects were required to inspect 48 PCB boards under pre-assigned experimental conditions and to fill out a trust questionnaire on completion of the task. From Day Four to Day Seven, the subjects followed the same procedure and were assigned the other two experimental conditions. On completion of the study, each subject was debriefed.

3. Results and Analysis

3.1 Subjective Ratings of Trust
To evaluate each subject's trust in the system, subjective ratings of each trust component as well as of overall trust were solicited. A continuous rating scale from 0 to 100 was used in the study; as the subject dragged the scroll bars, the score was displayed automatically (Figure 2).

3.2 Correlation Analysis
As shown in Table 2, the results indicate that the trust components as well as overall trust have a positive correlation with the system response criterion: the larger the response criterion, the more trust the inspectors have in the automation. Clearly, the inspectors' trust in automation was affected by the SDT measure, response criterion. A possible explanation is that a larger response criterion means a more conservative system [15]. Although a conservative system makes fewer hits, it also makes fewer false alarms. False alarms are errors of commission, which inspectors cannot overlook if they are paying attention, whereas misses are errors of omission, which inspectors are likely to overlook if they are not paying attention [15].
Therefore, an inspector's trust in a computer with a risky bias may be affected more than it would be for a computer with a conservative or neutral bias.

Table 2. Results of the correlation analysis

Trust component                        Competence  Predictability  Faith   Reliability  Overall
Correlation with response criterion    .5154       .6013           .4077   .5704        .5607
p-value                                (<.05)      (<.05)          (<.05)  (<.05)       (<.05)

Another interesting finding is that all four trust components are positively correlated with the system response criterion, which further indicates that the trust questionnaire is a useful tool for measuring trust in a hybrid inspection environment.

4. Conclusion

This study used a trust questionnaire to measure inspectors' trust in automation in three hybrid inspection systems with different response criteria. To explore the relationship between the SDT measure, response criterion, and trust in automation, a correlation analysis was conducted. The results showed that the trust measures were positively correlated with the system response criterion, indicating that the signal detection theory measure, response criterion, has a great influence on inspectors' trust in automation.

References

1. Thapa, V. B., Gramopadhye, A. K., and Melloy, B. J., 1996, Evaluation of different training strategies to improve decision-making performance in inspection, The International Journal of Human Factors in Manufacturing, 6(3), 243-261.
2. Drury, C. G., 1992, Inspection performance, in Handbook of Industrial Engineering (second edition), G. Salvendy (ed.), John Wiley and Sons, New York.
3. Sinclair, M. A., 1984, Ergonomics of quality control, workshop document, International Conference on Occupational Ergonomics, Toronto.
4. Hou, T., Lin, L., and Drury, C. G., 1993, An empirical study of hybrid inspection systems and allocation of inspection functions, International Journal of Human Factors in Manufacturing, 351-367.
5. Chin, R., 1988, Automated visual inspection: 1981 to 1987, Computer Vision, Graphics and Image Processing, 41, 346-381.
6. Jiang, X., Gramopadhye, A. K., Melloy, B., and Grimes, L., 2003, Evaluation of best system performance: human, automated, and hybrid inspection systems, International Journal of Human Factors in Manufacturing (in press).
7. Drury, C. G., and Sinclair, M. A., 1983, Human and machine performance in an inspection task, Human Factors, 25, 391-399.
8. Kantowitz, B. H., and Sorkin, R. D., 1987, Allocation of functions, in Handbook of Human Factors, John Wiley and Sons, New York.
9. Jiang, X., Srinivasan, A., Gramopadhye, A. K., and Ferrell, W. G., 2002, Modeling errors in sampling inspection: effect of degraded performance, Quality Engineering, 15(1), 67-74.
10. Master, R., Bingham, J., Jiang, X., Gramopadhye, A. K., and Melloy, B. J., 2001, Measurement of trust in hybrid inspection systems, Proceedings of the 10th Annual Industrial Engineering Research Conference, May 20-22, Dallas, Texas, in press.
11. Muir, B. M., 1994, Trust in automation Part 1: theoretical issues in the study of trust and human intervention in automated systems, Ergonomics, 37, 1905-1922.
12. Jiang, X., Gramopadhye, A. K., Melloy, B. J., and Grimes, L., Measuring trust in a hybrid inspection system, International Journal of Industrial Ergonomics, in review.
13. Gallwey, T. J., 1982, Selection tests for visual inspection on a multiple fault type task, Ergonomics, 25(11), 1077-1092.
14. Jiang, X., Bingham, J., Master, R., Gramopadhye, A. K., and Melloy, B., 2002, A visual inspection simulator for hybrid environments, International Journal of Industrial Engineering: Theory, Applications and Practice, 9(2), 123-132.
15. Sanders, M. S., and McCormick, E. J., 1993, Human Factors in Engineering and Design, McGraw-Hill, New York, 120-121.