Time Experiencing by Robotic Agents


Michail Maniadakis (1), Marc Wittmann (2) and Panos Trahanias (1)

(1) Foundation for Research and Technology - Hellas, ICS, Greece
(2) Institute for Frontier Areas of Psychology and Mental Health, EAP, Germany

Abstract. Biological organisms perceive and act in the world based on spatiotemporal experiences and interpretations. However, artificial agents consider mainly the spatial relationships that exist in the world, typically ignoring its temporal aspects. In an attempt to direct research interest towards the fundamental issue of time experiencing, the current work explores two temporally different versions of a robotic rule switching task. An evolutionary process is employed to design a neural network controller capable of accomplishing both versions of the task. A systematic exploration of the neural network dynamics revealed a self-organized time perception capacity in the agent's cognitive system that significantly facilitates the accomplishment of the tasks, through modulation of the supplementary behavioural and cognitive processes.

1 Introduction

Sensing the flow of time is fundamental for intelligent organisms [1]. Especially for humans, who live in large societies with complex daily schedules, time perception is essential for almost every activity we engage in. Subjective time is an intrinsic indicator of how long external events should last, often functioning as an error signal leading to specific action selection. Despite the essential role of time in natural cognition, current endeavours in developing intelligent robots are by no means directed towards encompassing time perception in the systems' repertoire of capacities. The majority of robotic research focuses on task accomplishment, such as navigating to reach goals or moving objects around, with only superficial investigation of time perception issues (for example, tracking systems monitor speed changes across time, but this is far from considering the flow of time).
The inability of artificial systems to experience the temporal characteristics of events hinders their understanding of the real, dynamic world. In order to explore how time perception is involved in artificial cognition, the current study explores two time-differentiated versions of a behavioural-rule switching task. In particular, a simulated robotic agent has to consider unpredictably changing reward signals in order to switch between response rules, choosing the one that is considered correct at a given time [2]. We investigate the behaviour of the agent over a sequence of trials, assessing its ability to successfully switch among rules. We study two temporally different versions of the aforementioned task, exploring how rule switching interacts with perceiving the temporal characteristics of the given problem. In the first version, all trials have equal predefined durations, while in the second version the temporal length of trials is determined dynamically based on the agent's behaviour. We evolve Continuous Time Recurrent Neural Network (CTRNN) controllers capable of accomplishing both versions of the rule switching task. Subsequently, we study the mechanisms self-organized in the CTRNN. We observe that the controller monitors the temporal characteristics of the task, a process that plays an important role in rule following/switching. In the following sections, we describe the experimental setup followed in the present study, the obtained CTRNN results, and how the latter compare to the time perception processes of the human brain.

2 Experimental Setup

The current study is an extension of our previous works [2, 3], addressing rule switching in a mobile-robot interpretation of the classical Wisconsin Card Sorting (WCS) task [4].

The Neural Network Controller. We use a CTRNN model [5] to investigate the role of duration perception and how it interacts with rule switching. Following our previous work [2], which showed that bottleneck configurations are more effective in rule switching tasks than fully connected CTRNNs, the current work employs a bottlenecked network. Intuitively, the loose separation of the network facilitates the development of low-level and high-level skills in the corresponding parts of the CTRNN. The details of the input-output connectivity are similar to [3] and are omitted here due to space limitations.

Mobile Robot Rule Switching Task. The investigated task is inspired by the rat version of WCS, exploring rodents' rule switching [6]. We assume that a mobile robotic agent is located at the bottom of a T-maze environment (see Fig. 1). At the beginning of a trial, a light cue appears at either the left or the right side of the robot. Depending on the light side, the robot has to move to the end of the corridor, making a 90° turning choice towards the left or right. The side of the light is linked to the choice of the robot according to two different cue-response rules.
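The CTRNN dynamics referenced above follow Beer's model [5], in which each neuron i integrates tau_i * dy_i/dt = -y_i + sum_j w_ji * sigma(y_j + theta_j) + I_i. As a rough illustration only (the paper's bottlenecked connectivity, parameter ranges and integration step are not reproduced here; all values below are placeholder choices), a forward-Euler simulation of such a network might look like:

```python
import numpy as np

class CTRNN:
    """Minimal continuous-time recurrent neural network (Beer-style dynamics).

    Fully connected for brevity; the paper instead uses a bottlenecked
    connectivity whose weights and biases are set by evolution.
    """

    def __init__(self, n, dt=0.1, rng=None):
        rng = rng or np.random.default_rng(0)
        self.dt = dt                             # Euler integration step
        self.w = rng.normal(0.0, 1.0, (n, n))    # w[j, i]: synapse from j to i
        self.theta = rng.normal(0.0, 1.0, n)     # neural biases
        self.tau = np.ones(n)                    # neuron time constants
        self.y = np.zeros(n)                     # internal neuron states

    def step(self, external_input):
        """Advance the network one Euler step; return firing rates."""
        sigma = 1.0 / (1.0 + np.exp(-(self.y + self.theta)))   # logistic output
        dydt = (-self.y + self.w.T @ sigma + external_input) / self.tau
        self.y = self.y + self.dt * dydt
        return sigma
```

Repeatedly calling `step` with sensor readings as `external_input` yields the slow internal dynamics that the later analysis inspects.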
The first is called the Same-Side (SS) rule, implying that the robotic agent should turn left if the light source appeared at its left side, and right if the light source appeared at its right side. The second is called the Opposite-Side (OS) rule, implying that the robot should turn to the side opposite the light. The capacity of the agent to follow each rule is evaluated by testing sequences of the above described trials. For example, assume that a human experimenter selects rule SS and asks the agent to follow it for several trials. Based on the side of the light cue, the experimenter provides a reward at the side of the T-maze to which the robot should turn (see Fig. 1). Every time the robot gives a correct response, it reaches the target location, driving into a reward area that indicates it is following the right rule. At a random time (unknown to the robotic agent), the experimenter changes the rule considered correct, positioning rewards according to the OS rule. Thus, the robot, not being aware of this change, will give an incorrect response and be unable to get a reward. This is an indication that the adopted rule is no longer correct. The agent needs to discover this rule change, switching its response strategy according to the new rule. This makes the agent receive rewards again, indicating that the correct rule is followed.
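The cue-response mapping of the two rules is simple enough to state directly. A sketch (function and variable names are ours, not from the paper):

```python
def rewarded_side(light_side, rule):
    """Return the T-maze arm ('left' or 'right') that the experimenter baits,
    given the cue side and the active rule:
    'SS' (Same-Side) rewards the cue side, 'OS' (Opposite-Side) the other arm.
    """
    if rule == "SS":
        return light_side
    if rule == "OS":
        return "left" if light_side == "right" else "right"
    raise ValueError(f"unknown rule: {rule!r}")


def trial_outcome(light_side, rule, robot_choice):
    """True iff the robot's turn matches the currently rewarded side."""
    return robot_choice == rewarded_side(light_side, rule)
```

For example, `trial_outcome("right", "OS", "left")` is `True`: with the cue on the right and the OS rule active, the left arm is the rewarded one.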

Fig. 1: A schematic representation of the response rules (Same-Side, SS, and Opposite-Side, OS). The robot always starts from the bottom of the T-maze. Light cues are shown as double circles; target locations are marked in the figure, while the reward corresponds to the gray area.

At some later time the experimenter will switch the rule again, and so on. The above described rule following/switching process is repeated 10 times.

Trial Duration. In order to investigate artificial mechanisms of duration perception, we have implemented two temporally different versions of the underlying rule switching task. In the first version we let the controller dynamically specify the duration of trials (i.e. each trial ends as soon as the robot reaches the target location, at a distance of 10 units). We call this version of rule switching Dynamic Trial Duration (DTD). In the second version, all trials have the same duration throughout the task (i.e. a trial ends after a predefined number of steps, irrespective of whether the agent has reached the target location). This version is called Static Trial Duration (STD). In order to avoid the possibility that the DTD timing self-organizes into a form identical to STD, we have explicitly differentiated the two scenarios by letting DTD trials last at most 170 steps, while STD trials last exactly 190 steps.

Evolutionary Procedure. We use a Genetic Algorithm (GA) to explore how rule switching capacity and duration perception self-organize in CTRNN dynamics. In short, we use a population of artificial chromosomes encoding CTRNN controllers (their synaptic weights and neural biases). Each candidate solution encoding a complete CTRNN is tested on tasks examining the ability of the network to switch between rules in both the DTD and the STD setup.

3 Results

We have evolved CTRNN controllers by running ten different GA processes. Five of the evolutionary procedures converged successfully, configuring CTRNNs capable of rule switching.
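The evolutionary procedure above can be sketched as follows. The paper does not specify its GA operators, population size or selection scheme, so this is only a generic truncation-selection GA over a flat real-valued chromosome of CTRNN weights and biases, with a toy fitness function standing in for the actual DTD/STD task evaluation:

```python
import numpy as np

def evolve(fitness, genome_len, pop_size=50, generations=100,
           elite_frac=0.2, mutation_sd=0.1, rng=None):
    """Bare-bones GA: each chromosome is a flat vector that would decode into
    CTRNN synaptic weights and biases. Truncation selection plus Gaussian
    mutation, with elitism so the best solution is never lost.
    """
    rng = rng or np.random.default_rng(0)
    pop = rng.normal(0.0, 1.0, (pop_size, genome_len))
    n_elite = max(1, int(elite_frac * pop_size))
    for _ in range(generations):
        scores = np.array([fitness(g) for g in pop])
        elite = pop[np.argsort(scores)[::-1][:n_elite]]      # keep the best
        parents = elite[rng.integers(0, n_elite, pop_size)]  # resample parents
        pop = parents + rng.normal(0.0, mutation_sd, (pop_size, genome_len))
        pop[:n_elite] = elite                                # elitism
    scores = np.array([fitness(g) for g in pop])
    return pop[int(np.argmax(scores))]

# toy check: maximize -||g - 3||^2, whose optimum is the all-threes vector
best = evolve(lambda g: -np.sum((g - 3.0) ** 2), genome_len=4)
```

In the paper the fitness call would instead decode the chromosome into a CTRNN and score full DTD and STD rule-switching sessions.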
Interestingly, the results obtained from the statistically independent evolutionary procedures exhibit common internal dynamics, which are discussed below using one representative solution as a working example. The rule-switching performance of the agent during DTD and STD is demonstrated in Fig. 2. During trials 1-4 the agent follows the SS rule, successfully acquiring rewards. In trial 5 the experimenter changes the rule to OS. The agent, unaware of this change, fails to obtain the reward for one trial in the DTD case and for two consecutive trials in the STD case. In subsequent trials the agent successfully follows OS. The rule is changed again in trial 15, where the agent misses the reward for one trial in both the DTD and the STD case. This time the agent switches quickly back to SS, receives a reward in trial 16, and continues responding according to SS for the rest of the trials.

Fig. 2: The response of the agent in 22 consecutive trials covering three phases, for the DTD and STD setups. The robot initially follows the SS rule, then switches to OS, and back to SS.

After investigating the internal dynamics of the controller across trials, we observed much slower dynamics in the higher part of the bottlenecked CTRNN compared to the fast fluctuations of the lower-level neurons. This finding is similar to [3], implying that the higher part of the controller is mainly involved in encoding the rule currently adopted by the robot, while the lower part is responsible for applying the rules while also handling environmental interaction (e.g. wall avoidance). In the following we restrict our discussion to the higher-level neurons of the CTRNN, exploring how rule manipulation interacts with time perception. In order to place the activation differences between the DTD and STD cases within a general and systematic framework, we conducted a Principal Component Analysis (PCA) on the activity of the higher-level neurons. The first two principal components of higher-level activity during the DTD task, with the robot turning left or right following either OS or SS, are shown in Fig. 3. Note that the agent has to memorize rules when passing from one trial to the next. The values of the principal components at the beginning and end of trials provide insight into the rule-encoding approach. In Fig. 3 we can easily observe that the second principal component (PC2) supports memorizing the currently adopted rule: for OS trials PC2 starts and ends at relatively low values, while for SS trials it starts and ends at relatively high values. Thus, PC2 differentiates OS and SS by tracking the currently adopted rule between consecutive trials.
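The switching behaviour visible in Fig. 2 amounts to a simple reward-driven policy: keep the current rule while rewards arrive, flip it after a miss. Below is an idealized one-trial-switch replay of a session; this is our own simplification (the evolved CTRNN needed two trials to switch in the STD case), with all names ours:

```python
def rewarded_side(light_side, rule):
    """Arm baited by the experimenter: 'SS' rewards the cue side, 'OS' the opposite."""
    return light_side if rule == "SS" else ("left" if light_side == "right" else "right")

def simulate_session(true_rules, cues):
    """Act on a believed rule each trial; flip the belief after any unrewarded trial."""
    belief = "SS"
    outcomes = []
    for true_rule, cue in zip(true_rules, cues):
        choice = rewarded_side(cue, belief)                   # act on current belief
        rewarded = (choice == rewarded_side(cue, true_rule))  # did we reach the bait?
        if not rewarded:                                      # a miss signals a rule change
            belief = "OS" if belief == "SS" else "SS"
        outcomes.append(rewarded)
    return outcomes
```

With the true rule switching from SS to OS at trial 5, this idealized agent misses exactly one reward and then tracks the new rule, matching the DTD pattern described above.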
Turning to the STD version of the task, the first two principal components of higher-level activity for all possible cases are depicted in Fig. 4. We can easily observe that the two rules are now differentiated based on PC1. In particular, for both the left and the right turnings of OS, PC1 starts and ends at rather high values, while for the SS rule, PC1 starts and ends at low values.
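The PCA used above projects the higher-level neuron activity onto the directions of greatest variance. A minimal SVD-based sketch (the paper does not describe its PCA implementation; the timesteps-by-neurons matrix layout is our assumption):

```python
import numpy as np

def principal_components(activity, k=2):
    """Project a (timesteps x neurons) activity matrix onto its first k
    principal components and return the (timesteps x k) component scores.
    """
    centered = activity - activity.mean(axis=0)       # remove per-neuron mean
    u, s, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:k].T                        # scores along PC1..PCk
```

Applied to the recorded higher-level activity of each trial, the first two columns of the result correspond to the PC1/PC2 traces plotted in Figs. 3 and 4.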

Fig. 3: The activity of the first two principal components in a set of multiple DTD trials. Principal component 1 (PC1) is shown as a solid line and principal component 2 (PC2) as a dotted line. Rules SS and OS are clearly separated along PC2.

Fig. 4: The activity of the first two principal components in a set of multiple STD trials. PC1 is shown as a solid line and PC2 as a dotted line. Rules SS and OS are now clearly separated along PC1.

Overall, CTRNNs encode the temporal characteristics of tasks in the principal components of neural activity, a mechanism that plays an important role in memorizing the rule currently adopted by the artificial agent.

Discussion. In the human brain, imaging studies suggest that time perception may emerge from the integration of interoceptive afferent activity (body sensations) in the insular cortex [7, 8], an area of the brain that is also strongly involved in emotional awareness and in complex decision making, and that therefore shares its neural substrate with other behavioural and cognitive capacities. Similarly, in our study the primitive time perception ability of the CTRNNs is integrated with the rule following/switching capacity. Interestingly, there is an open neuroscience debate regarding the existence or non-existence of two distinct neural mechanisms for processing sub-second and supra-second time intervals [9]. Although this is not the main topic of the present study, the mechanism self-organized in our work may bridge the two opposing arguments, suggesting that sub-second and supra-second mechanisms may rely on different principal components of the same overall system, and are therefore partially but not fully segregated.

4 Conclusions

In the field of artificial cognitive systems, time perception remains a largely unexplored issue. The current study shows that considering time may significantly support artificial cognition. Our work is an early attempt towards a systematic exploration of the time perception capacity in the context of autonomous intelligent systems. In the future, we plan to explore additional aspects of time perception, focusing on duration estimation and duration calculus, considering the neurocognitive mechanisms underlying human skills in the temporal domain.

References

[1] C.R. Gallistel and J. Gibbon. Time, rate, and conditioning. Psychological Review, 107(2):289-344, 2000.
[2] M. Maniadakis and J. Tani. Acquiring rules for rules: neuro-dynamical systems account for meta-cognition. Adaptive Behavior, 17(1):58-80, 2009.
[3] M. Maniadakis, P. Trahanias, and J. Tani. Explorations on artificial time perception. Neural Networks, 22:509-517, 2009.
[4] K.W. Greve, T.R. Stickle, J. Love, K.J. Bianchini, and M.S. Stanford. Latent structure of the Wisconsin Card Sorting Test: a confirmatory factor analytic study. Archives of Clinical Neuropsychology, 20:355-364, 2005.
[5] R.D. Beer. On the dynamics of small continuous-time recurrent neural networks. Adaptive Behavior, 3(4):471-511, 1995.
[6] D. Joel, I. Weiner, and J. Feldon. Electrolytic lesions of the medial prefrontal cortex in rats disrupt performance on an analog of the Wisconsin Card Sorting Test, but do not disrupt latent inhibition: implications for animal models of schizophrenia. Behavioural Brain Research, 85:187-201, 1997.
[7] A.D. Craig. Emotional moments across time: a possible neural basis for time perception in the anterior insula. Phil. Trans. Royal Society B, 364:1933-1942, 2009.
[8] M. Wittmann. The inner experience of time. Phil. Trans. Royal Society B, 364:1955-1967, 2009.
[9] M. Wittmann and V. van Wassenhove. The experience of time: neural mechanisms and the interplay of emotion, cognition and embodiment. Phil. Trans. Royal Society B, 364:1809-1813, 2009.