ACT-R Workshop Schedule

Opening: ACT-R from CMU's Perspective
9:00-9:45   Overview of ACT-R -- John R. Anderson
9:45-10:30  Details of ACT-R 6.0 -- Dan Bothell

Break: 10:30-11:00

Presentations 1: Architecture
11:00-11:30 Functional constraints on architectural mechanisms -- Christian Lebiere
11:30-12:00 Retrieval by Accumulating Evidence in ACT-R -- Leendert van Maanen
12:00-12:30 A mechanism for decisions in the absence of prior reward -- Vladislav D. Veksler

Lunch: 12:30-1:30

Presentations 2: Extensions
1:30-2:00   ACT-R forays into the semantic web -- Lael J. Schooler
2:00-2:30   Making Models Tired: A Module for Fatigue -- Glenn F. Gunzelmann
2:30-3:00   Acting outside the box: Truly embodied ACT-R -- Anthony Harrison
3:00-3:30   Interfacing ACT-R with different types of environments and with different techniques: Issues and Suggestions -- Michael J. Schoelles

Break: 3:30-4:00

Panel: 4:00-5:30 Future of ACT-R from a non-CMU Perspective
Danilo Fum, Kevin A. Gluck, Wayne D. Gray, Niels A. Taatgen, J. Gregory Trafton, Richard M. Young

ACT-R/GPD
Vladislav Dan Veksler

Purpose
Propose a mechanism in ACT-R for making decisions in the absence of prior reward. It is not meant to replace the current ACT-R reward-based decision mechanism, but rather to complement it.

Current ACT-R model of choice
The Central Executive / Procedural Module chooses based on prior reward (Reinforcement Learning; RL).

Current ACT-R model of choice
The ACT-R model of human choice is based on Reinforcement Learning (RL):
- a formal model of human/animal trial-and-error behavior
- predicts human choice based on prior reward/punishment
- psychological and biological evidence (e.g. Holroyd & Coles, 2002)
However, much of human choice is based on other information.
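For contrast with what follows, here is a minimal Python sketch of the reward-based side: ACT-R 6 learns production utilities with a difference-learning rule, U_i <- U_i + alpha * (R - U_i), so only productions that have experienced reward move off their defaults. The parameter values and the string production names below are illustrative assumptions, not part of the talk.

def update_utility(U, production, reward, alpha=0.2):
    """ACT-R 6 style difference learning: U_i <- U_i + alpha * (R - U_i).
    alpha and the default utility of 0.0 are illustrative assumptions."""
    u = U.get(production, 0.0)
    U[production] = u + alpha * (reward - u)
    return U[production]

# Only the rewarded production moves; productions for a new goal
# (e.g. goal=purse) keep their default utility until they are rewarded.
U = {}
update_utility(U, "goal=key => goto right-arm", reward=10.0)  # U becomes 2.0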

The 2-goal problem
An agent is tasked with achieving some goal, A. Then the agent is tasked with achieving B, in the same environment. RL would perform no better on the 2nd task than on the 1st. Humans learn their environment while achieving A, thus reducing their time to achieve B.

The 2-goal problem
(Maze illustration: the agent chooses between competing productions.)
when goal = key, goto left-arm
when goal = key, goto right-arm (+)
when goal = purse, goto left-arm (?)
when goal = purse, goto right-arm (?)
Humans make the correct choice >50% of the time (Stevenson, 1954; Quartermain & Scott, 1960).

ACT-R can use declarative information in decisions
An ACT-R model can be written to perform this task, e.g. by storing all attended items as declarative chunks:
[:location left :item key]
[:location left :item purse]
[:location right :item candy]
...
However, there are no architectural constraints for doing this, and no system-level prediction for how humans make decisions in the absence of reward.
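As a plain illustration of such a hand-written model (the names below are hypothetical, not part of GPD or ACT-R), the slide's chunks can be rendered as simple records:

# The slide's chunks, e.g. [:location left :item key], as plain records.
declarative_memory = [
    {"location": "left",  "item": "key"},
    {"location": "left",  "item": "purse"},
    {"location": "right", "item": "candy"},
]

def recall_location(item):
    """Retrieve where an item was attended; None if it was never encoded."""
    matches = [c["location"] for c in declarative_memory if c["item"] == item]
    return matches[-1] if matches else None

assert recall_location("purse") == "left"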

SNIF-ACT
Pirolli & Fu (2003): an ACT-R model of web browsing (SNIF-ACT).
- In web browsing, new links are encountered with no prior reward.
- Choose-link production utilities are based on associative knowledge.
(Diagram: Perception and Memory, with the goal "Cheap DVD Player".)
Limitations:
- limited to web-browsing type tasks
- no associative learning*
* Associative knowledge comes from the PMI engine (Pointwise Mutual Information; Turney, 2001). PMI predicts the strength of association between words based on co-occurrence.

Voicu & Schmajuk
Voicu & Schmajuk (2002): a model of navigation. Similar to SNIF-ACT, decisions are based on spreading activation from the goal.
- simulates qualitative effects of latent learning, shortcut, and detour behavior
- limited to single-goal navigation tasks
(Diagram: ENV, Perception, Memory, GOAL.)

Goal-Proximity Decision Mechanism (GPD)
The utility of a choice is predicted based on its associative strength to the current goal: the inherent value of the goal spreads to the options. GPD acts as a complement to the RL mechanism in ACT-R.

Goal-Proximity Decision Mechanism (GPD)
Given goal G, and a choice between options A and B:
- retrieve A or B from memory
- the option with the higher association strength to G is more likely to be retrieved
- association strengths reflect experienced item proximity
(Example experience sequence: ... X A J B C G ...)
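A minimal sketch of this retrieval step, under stated assumptions: a softmax over association strengths stands in for ACT-R's activation noise, so the option with the higher strength to G wins most, but not all, of the time. The S dictionary layout and the temperature value are assumptions.

import math
import random

def gpd_choose(goal, options, S, temperature=0.25):
    """Noisy retrieval: options compete on their association strength
    to the goal, S[(goal, option)]; softmax noise is an assumed stand-in
    for ACT-R's activation noise."""
    weights = [math.exp(S.get((goal, o), 0.0) / temperature) for o in options]
    return random.choices(options, weights=weights, k=1)[0]

With equal strengths this reduces to chance; any learned advantage for one option pushes choice above 50%, the pattern reported for the two-goal data.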

Implementing GPD in ACT-R
(Architecture diagram: the Goal and Imaginal buffers spread activation into the Declarative Module's chunks; the Procedural Module's productions provide centrally-controlled information flow among the Goal, Imaginal, Retrieval, Visual, and Manual buffers; the Visual and Manual modules connect to the Environment.)

Implementing GPD in ACT-R
Given goal G, and a choice between options A and B:
- retrieve A or B from memory (+retrieval>)*
- the option with the higher association strength to G is more likely to be retrieved
- association strengths reflect experienced item proximity
- episodic buffer: a list of recently attended chunks; the association strength between two chunks is incremented in proportion to their proximity in the episodic buffer (in an error-driven fashion)
* may be better to implement this at the system level (goal module?)

Associative Learning
Given a new episode, j:
- for each episode i in the episodic buffer (a FIFO queue, e.g. B X C A G):
  - decrease the activation of i, a_i, by э
  - increase S_ji
- push episode j into the episodic buffer, with a_j = 1

where S_ji(n) is the strength of association between j and i at time n, ΔS_ji(n) is the change in S_ji at time n, a_i is the activation of i at time n, β is the learning rate, and э is the activation decay.

* may be better to use ACT-R activation (which includes decay)

(The final build sets this Associative Learning update side by side with the corresponding Reinforcement Learning update.)
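The slides give the learning loop but not the exact increment, so the sketch below assumes an error-driven delta rule, ΔS_ji = β * a_i * (1 - S_ji), in which the decayed activation a_i carries the proximity signal; the buffer size and parameter values are likewise assumptions.

from collections import defaultdict, deque

class EpisodicAssociativeMemory:
    """Sketch of GPD's episodic-buffer associative learning."""

    def __init__(self, beta=0.1, decay=0.2, buffer_size=5):
        self.beta = beta                         # learning rate (β)
        self.decay = decay                       # activation decay (э)
        self.buffer = deque(maxlen=buffer_size)  # FIFO episodic buffer
        self.a = {}                              # a_i for buffered episodes
        self.S = defaultdict(float)              # S[(j, i)] association strengths

    def encode(self, j):
        """Process a new episode j against everything in the buffer."""
        for i in self.buffer:
            self.a[i] = max(0.0, self.a[i] - self.decay)  # decrease a_i by э
            # error-driven increase of S_ji, gated by proximity (a_i);
            # the exact form is an assumption (see above)
            self.S[(j, i)] += self.beta * self.a[i] * (1.0 - self.S[(j, i)])
        self.buffer.append(j)                    # push j; FIFO drops the oldest
        self.a[j] = 1.0                          # a_j = 1

# Encoding ... X A J B C G leaves G most strongly associated with the
# episodes experienced closest to it (C, then B), as GPD requires.
mem = EpisodicAssociativeMemory()
for episode in ["X", "A", "J", "B", "C", "G"]:
    mem.encode(episode)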

Experiment: Friday at 14:00

RMSE for 2-choice (left) and 3-choice (right) mazes
2-choice maze: Random RMSE = 39.70; RL RMSE = 21.91; GPD RMSE = 3.16; Ideal Performer RMSE = 6.21
3-choice maze: Random RMSE = 45.79; RL RMSE = 18.29; GPD RMSE = 7.95; Ideal Performer RMSE = 16.34
(Plots: performance, average correct, by trial number, for Human, Random, RL, GPD, and Ideal Performer.)

GPD versus RL
GPD is not meant to replace RL: it is obvious that much of human choice is based on reward/punishment. GPD is meant to be a complement to RL. How GPD and RL interact is a topic for future research.
(Plot: performance, average correct, by trial number, for Human, Random, RL, GPD, and Ideal Performer.)

Second Life Simulations

Second Life
GPD may perform better than RL in early stages (prior to reward), but associations between all the objects become confusing after enough exploration, and RL eventually outperforms GPD.
Caveat: the results may be specific to the given task; selective attention and further parameter-space exploration may help GPD.

ACT-R/GPD
How GPD may be integrated with the current ACT-R decision mechanism: GPD may be the appropriate mechanism prior to reward, but once there is reward, RL may take over. In other words, GPD may be useful for approximating what the reward value would be, before that reward is actually experienced.
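Read as code, one illustrative arbitration (not something the talk commits to): fall back on GPD association strengths until some option has experienced reward, then let RL's learned utilities take over. The dictionary layouts follow the earlier sketches and are assumptions.

def choose(goal, options, S, U):
    """GPD-then-RL arbitration. U maps options that have experienced
    reward to learned utilities; S maps (goal, option) pairs to GPD
    association strengths."""
    rewarded = {o: U[o] for o in options if o in U}
    if rewarded:
        # prior reward exists: RL takes over
        return max(rewarded, key=rewarded.get)
    # no prior reward yet: GPD approximates the eventual reward value
    return max(options, key=lambda o: S.get((goal, o), 0.0))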

Autonomy
The GPD model was not altered between tasks.

Future Research
- How GPD and RL may interact
- GPD implementation in an ACT-R module
- Using the ACT-R activation equation in place of episodic activation
- Associative learning implementation in an ACT-R module

Questions?