A Stochastic and Competitive Hebbian Learning Mechanism through Spike-Timing Dependent Plasticity to Lift Hard Weight Constraints on Hebbian Synapses


Nagaoka University of Technology
Graduate School of Engineering
Information Science and Control Engineering

A Stochastic and Competitive Hebbian Learning Mechanism through Spike-Timing Dependent Plasticity to Lift Hard Weight Constraints on Hebbian Synapses

A thesis submitted in partial fulfillment of the requirements for the degree of Doctor of Engineering

August 3

Submitted by: Fernando Subha Danushika
Supervised by: Prof. Koichi Yamada


DEDICATED
To my daughter Rathnayake Suheli Nuthara

Acknowledgment

This thesis would not have been possible without the help, support, guidance and patience of my supervisor, Professor Koichi Yamada. His enlightening guidance, inspiring instruction and logical way of thinking have been of great value in completing this research to a good standard. I would also like to express my sincere thanks to the examiners of the dissertation, Prof. Fukumura, Prof. Yukawa, Prof. Marasinghe and Prof. Takahashi, for their constructive feedback to improve the dissertation. I would also like to take this opportunity to express my gratitude to the professors who helped my research during my master's degree, Prof. Marasinghe, Prof. Nakamura and Prof. Matsuzaki, for their indispensable assistance at both the academic and the personal level. At the same time, I would like to extend my gratitude to all the professors of the Department of Management and Information Systems for their valuable guidance and support of my academic studies. I would also like to express my sincere gratitude to the Japanese Government for awarding me the Monbukagakusho scholarship for five years; without this generous helping hand, it would have been impossible for me to give my full commitment to the research and complete it successfully. It is my great pleasure to recall the help of Dr. Madurapperuma and Prof. Karunananda, who guided me to this great opportunity. Finally, I owe my loving thanks to my family and my parents. They have lost a lot due to my research abroad; without their encouragement and understanding it would have been impossible for me to finish this work. I also wish to extend my warmest thanks to all those who have helped me at the academic and personal levels.

Table of Contents

List of Figures
List of Tables
Nomenclature

1 Introduction
   Motivation
   Problem in Brief
   Aim and Objectives of the Research
   Approach
   Why a Hebbian Learning Mechanism and Why not a Learning Algorithm?
   Structure of the Thesis
   Summary

2 Literature Survey
   Generations of ANN
   Learning in ANN
   Hebbian Learning
   Weight Normalization
   Homeostatic Plasticity
   BCM Theory
   Perceptron Learning
   Multi-Layer Perceptron
   Delta Learning
   Back-Propagation Learning Algorithm
   Winner-Take-All Learning
   Kohonen Networks
   Hopfield Networks
   Autoassociative Memory
   Bidirectional Autoassociative Memory
   Stochastic Networks
   Deep-Belief Networks
   Boltzmann Machines
   Restricted Boltzmann Machines
   Contrastive Divergence Learning
   Belief Nets
   Greedy Learning
   Wake-Sleep Algorithm
   STDP Networks
   Spike-Timing Dependent Plasticity
   Networks with Leaky-Integrate-and-Fire Neurons
   Networks with Stochastic Inhibited Neurons
   Unanswered Hypothesis in Neuroscience
   Model of Hedonistic Synapse
   Problem in Brief
   Summary

3 Background Theory
   Neurons
   Neuronal Functionality
   Chemical Synapses
   Functionality at Synapses
   Synaptic Plasticity
   Short-term Plasticity
   Facilitation
   Pre-synaptic Depression as a Depletion
   Post-synaptic Depression as Receptor Desensitization
   Long-term Plasticity through STDP
   Homeostatic Synaptic Plasticity
   Synaptic Redistribution
   Physiological Exploration of Hebbian Neurons
   Stent anti-Hebbian Postulate
   Lisman anti-Hebbian Postulate
   Levy and Desmond's Rule of Synaptic Plasticity
   Summary

4 Mathematical Formulation of the Proposed Mechanism
   Stochastic and Competitive Hebbian Learning Mechanism through STDP
   Neuronal Model
   Mathematical Formulation of Plasticity Mechanisms
   Short-Term Plasticity
   Behavioral Rules on the Modeled Synapses
   Homeostatic Synaptic Plasticity
   Long-term Plasticity as STDP
   Binning the Process at Modeled Synapses
   Defining a Synapse's Activity using Bin Activity
   Estimating Mean and Variance at a Synapse
   Learning based on STDP and STP
   Stochasticity and Dynamicity of Modeled Synapses and Competitiveness
   Summary

5 Evaluation of the Proposed Hebbian Learning Mechanism
   Determining the Biological Feasibility of the Proposed Learning Mechanism
   Experimental Procedure
   Stimuli
   Parameters
   Structure of the Experiment
   Evaluation of Hebb's Postulate and Threshold Mechanism
   Neuron A at Correlated Phases Supported Hebb's Postulate
   Neuron A at Uncorrelated Training Sessions Supported Hebb's and anti-Hebbian Postulates
   Neuron A at Uncorrelated Testing Sessions Supported Stent's and Lisman's Postulates
   New Threshold Mechanism Simulated Homeostatic Synaptic Plasticity Mechanism
   Evaluation of Synaptic Redistribution with Release Probability of Modeled Synapses
   Neuron A Supported Synaptic Redistribution and Adhered to Release Probability Rules
   Evaluating the Proposed Mechanism for Controlling Instability of Hebbian Neurons
   Experimental Procedure
   Experiment Structure and External Stimuli
   Evaluation of Stability of Hebbian Neurons
   Case 1: When both neurons were externally fed by Poisson inputs
   Case 2: Determining appropriate bin size when one neuron was externally fed by Poisson inputs
   Case 3: Balancing the excitation of Hebbian neurons through STDP and STP
   Case 4: Overall Evaluation of the Proposed Hebbian Learning Mechanism
   Summary

6 Conclusion
   Intuition
   Discussion
   Major Contributions to the Research Domain
   Pros and Cons of the Proposed Learning Mechanism
   Problems Encountered
   Future Works
   Summary

References
Appendix A
Appendix B
Appendix C

List of Figures

Figure 2.1 Supervised learning vs Unsupervised learning
Figure 2.2 A pre-synaptic Hebbian neuron and post-synaptic neuron
Figure 2.3 Sigmoid function σ(u_j(t)) defines firing rate of a neuron as a function of its potential u_j(t)
Figure 2.4 Sigmoid function showing upper and lower bounds of the target firing rate
Figure 2.5 Perceptron learning in single layer network
Figure 2.6 Perceptron learning in three layer feed-forward network
Figure 2.7 XOR problem in single layer network
Figure 2.8 XOR problem in three layer network
Figure 2.9 Feed-forward network with three layers
Figure 2.10 Kohonen Map
Figure 2.11 A Hopfield neural network
Figure 2.12 Bidirectional autoassociative memories
Figure 2.13 A general Boltzmann machine
Figure 2.14 A restricted Boltzmann machine
Figure 2.15 Learning rule of sigmoid belief nets
Figure 2.16 A belief net with one hidden layer
Figure 2.17 A belief net with greedy learning algorithm
Figure 3.1 General structure of a neuron
Figure 3.2 Structure at a chemical synapse
Figure 3.3 Process of neurotransmitter release at a chemical synapse
Figure 3.4 Relative time duration of each synaptic plasticity phenomenon
Figure 3.5 Synaptic facilitation
Figure 3.6 Synaptic depression
Figure 3.7 The depletion model of synaptic depression
Figure 3.8 The role of homeostatic plasticity
Figure 3.9 Firing rate variation of a given neuron over time
Figure 3.10 Homeostatic plasticity in cortical networks
Figure 4.1 Structure of a modeled neuron
Figure 4.2 Structure of the modeled neural network
Figure 4.3 Three main stages of synapses in the information transmission process
Figure 4.4 Transition status of modeled synapses in the process of neurotransmitter release
Figure 4.5 Binning the process at a synapse
Figure 5.1 Structure of the network and neurons
Figure 5.2 Signal passing between neurons
Figure 5.3 Distribution of (a) the number of active-receptors, (b) the number of active-transmitters, of neuron A at stage of exp
Figure 5.4 Distribution of (a) the number of active-receptors, (b) the number of active-transmitters, of neuron A at stage of exp
Figure 5.5 Distribution of (a) the number of active-receptors, (b) the number of active-transmitters, of neuron A at stage of 3 exp
Figure 5.6 Distribution of (a) the number of active-receptors, (b) the number of active-transmitters, of neuron A at stage of exp
Figure 5.7 Distribution of (a) the number of active-receptors, (b) the number of active-transmitters, of neuron A at stage of exp
Figure 5.8 Distribution of (a) the number of active-receptors, (b) the number of active-transmitters, of neuron A at stage of 3 exp
Figure 5.9 Variation of the number of active-transmitters in each neuron over the time during stage α
Figure 5.10 Variation of the number of active-receptors in neurons A and B, and receptor-group R_BA during stage α
Figure 5.11 Variation of the number of active components in neurons A and B during stage γ over the time
Figure 5.12 Variation of the number of active components in neurons A and B during stage β over the time
Figure 5.13 Variation of the number of active-transmitters in each neuron during stage α
Figure 5.14 Variation of the number of active-receptors in each neuron and receptor-groups during stage α
Figure 5.15 Variation of the number of active components in neurons A and B during stage γ over the time
Figure 5.16 Variation of the number of active components in neurons A and B during stage β over the time
Figure 5.17 Distribution of active-transmitters and their release probability of neuron A
Figure 5.18 Distribution of the median of synaptic weights at each Poisson frequency λ
Figure 5.19 Distribution of the median of synaptic weights at each synapse on different Poisson frequency λ
Figure 5.20 Distribution of the average median of synaptic weights at each neuron at Poisson frequency λ
Figure 5.21 Network structure with ten synaptic connections
Figure 5.22 Distribution of the weights and release probability of neurons at Poisson inputs with mean rate Hz
Figure 5.23 Distribution of the weights and release probability of neurons at Poisson inputs with mean rate 4 Hz
Figure 5.24 Distribution of the median of synaptic weights at each synapse at Poisson inputs
Figure 5.25 Distribution of the coefficient of variation (CV) at each synapse at Poisson inputs with Hz
Figure 5.26 Distribution of the coefficient of variation (CV) at each synapse at Poisson inputs with 4 Hz
Figure 5.27 Network structure with ten synaptic connections
Figure B.1 Distribution of (a) the number of active-receptors, (b) the number of active-transmitters, of neuron B at stage of exp
Figure B.2 Distribution of (a) the number of active-receptors, (b) the number of active-transmitters, of neuron B at stage of exp
Figure B.3 Distribution of (a) the number of active-receptors, (b) the number of active-transmitters, of neuron B at stage of 3 exp
Figure B.4 Distribution of (a) the number of active-receptors, (b) the number of active-transmitters, of neuron B at stage of exp
Figure B.5 Distribution of (a) the number of active-receptors, (b) the number of active-transmitters, of neuron B at stage of exp
Figure B.6 Distribution of (a) the number of active-receptors, (b) the number of active-transmitters, of neuron B at stage of 3 exp
Figure C.1 Distribution of the synaptic weights and release probability P_r at each synapse on Poisson frequency λ = 10 Hz
Figure C.2 Distribution of the synaptic weights and release probability P_r at each synapse on Poisson frequency λ = 15 Hz
Figure C.3 Distribution of the synaptic weights and release probability P_r at each synapse on Poisson frequency λ = 20 Hz
Figure C.4 Distribution of the synaptic weights and release probability P_r at each synapse on Poisson frequency λ = 25 Hz
Figure C.5 Distribution of the synaptic weights and release probability P_r at each synapse on Poisson frequency λ = 30 Hz
Figure C.6 Distribution of the synaptic weights and release probability P_r at each synapse on Poisson frequency λ = 35 Hz
Figure C.7 Distribution of the synaptic weights and release probability P_r at each synapse on Poisson frequency λ = 40 Hz

List of Tables

Table 5.1 Values for the parameters of neurons
Table 5.2 Structure of a stage
Table 5.3 Signal processing time step value (t) of the neurons
Table 5.4 Order of the stages in each experiment
Table 5.5 Behavior of neuron A at correlated phase
Table 5.6 Behavior of neuron A at uncorrelated phase
Table 5.7 Asynchrony detection at post-synaptic neuron B
Table 5.8 Neuron B at uncorrelated testing session

Nomenclature

ANN    Artificial Neural Network
CD     Contrastive Divergence
CTRNN  Continuous Time Recurrent Neural Network
DBN    Deep Belief Network
HSP    Homeostatic Synaptic Plasticity
LTD    Long-Term Depression
LTP    Long-Term Plasticity
MCMC   Markov Chain Monte Carlo
mEPSC  miniature Excitatory Post-Synaptic Current
MLP    Multi-Layer Perceptron
RBM    Restricted Boltzmann Machine
STDP   Spike-Timing Dependent Plasticity
STP    Short-Term Plasticity

Introduction

Despite the exponential increase in computing power in recent decades, one of the main goals of computer scientists, developing intelligent machines more advanced than human beings, is yet to be realized. The explosive growth of interest and the early successes in the field of intelligent machines have led many experts from different disciplines to join hands with computer scientists in order to enhance the power of these machines. Today the field of computer science is nourished by well-established theories from a variety of disciplines such as Engineering, Mathematics, Neuroscience, Physiology and Psychology. The main goal of these scientists is to unwrap the learning process inside the animal brain and to find hybrid technology that can model it efficiently in a computational environment. This attraction of multidisciplinary scientists has reshaped the field of Artificial Neural Networks (ANN), one of the main technologies developed to realize learning on machines. With this advancement, many techniques and approaches have been introduced to improve the effectiveness and performance of learning in ANN. Significant examples include the integration of new information coding techniques 3, 4 such as spike-timing dependent plasticity (STDP) 5 9; the integration of biologically supported plasticity mechanisms, such as short-term plasticity (STP) 4, 3 and homeostatic synaptic plasticity (HSP) 4 7, with existing learning theories; the merging of existing learning mechanisms into new, efficient learning algorithms, such as Boltzmann machines with the wake-sleep algorithm 8, 9 and contrastive divergence (CD) learning 9; and the introduction of new network models such as deep belief networks (DBN), networks with inhibitory neurons 4 and networks with stochastic neurons 5, 6.

Following this broader interest in ANN, STDP and its computational applications have drawn much attention from neuroscientists, because STDP has been accepted as a key coding mechanism that biological neurons use to process information. The key interest of scientists in this research area is to understand the internal interaction of STDP with other plasticity mechanisms, such as long-term plasticity (LTP) and STP, that are significant in the animal learning process. The findings of this line of work have been evaluated against the neurobiological plausibility of the behaviors generated by these computational approaches when learning takes place on machines. In that sense, only a few researchers have succeeded in defining new learning algorithms or proposing new network architectures that are supported by recent neuroscience findings. Neural networks with inhibitory neurons and computational network models with stochastic neurons are the most significant applications that complement the recent findings. Many applications of these two mechanisms came with proposals to control the unbounded excitation in Hebbian learning by introducing inhibitory or stochastic neurons 7. However, these proposals are still

at a preliminary level and need extensive research before they can be fully applied in the field. Aligned with these findings, this thesis presents a novel Hebbian learning mechanism which explains how STDP and STP, with the background support of HSP, in a complex network model can help to lift the hard weight constraints defined on Hebbian neurons.

The Hebbian learning algorithm, based on Donald Hebb's postulate 8, 9, has drawn tremendous attention compared to other learning algorithms in ANN due to its biological feasibility. This attention has brought an era in which scientists have been trying to combine STDP with Hebb's postulate in order to alleviate its overwhelming problem, called node saturation. Node saturation is the greatest inefficiency of Hebbian-learning-based neural network models: the network loses sensitivity for extremely low or highly correlated activities between neurons 3. Although many mechanisms have been considered in conventional neural network models, such as weight normalization 3, 3 and threshold updating mechanisms like the BCM theory 33, many of these techniques deteriorate the learning tasks or remain limited at the application level due to their biological infeasibility. Therefore the problem of Hebbian neuron saturation has not been fully answered, and much effort has gone into finding an answer to this long-standing problem through an appropriate integration of STDP.

The thesis presents a novel Hebbian learning mechanism to control the excitation of Hebbian plasticity without applying hard constraints on the weights. The idea introduces a large number of transmitter-receptor connections into the modeled neurons so that the excitation of Hebbian neurons can be limited. These transmitter-receptor connections in the proposed network behave similarly to pre-synaptic clefts and post-synaptic receptors in biology and help neurons to lose or gain strength by making or releasing these stochastic connections before the network gets saturated. Moreover, the stochasticity and dynamicity of these connections have been mathematically modeled by adhering to the biological behaviors of STDP, STP and HSP. This mathematical modeling defines a Hebbian learning mechanism which allows the proposed network to stabilize itself while remaining sensitive to the mean rates of the inputs, so that Poisson inputs with higher mean rates stabilize at lower weight values than Poisson inputs with lower mean rates. This behavior helps the network to move to a depressed state for excited inputs and to an excited state for inhibited inputs. The results further confirm the possibility of equally exciting neurons in a complex neural network when the time decay constants of STDP potentiation and depression lie near the range of the magnitude of the mean rate of the applied Poisson inputs 34, 37, 38. Intriguingly, this stability of the modeled network is not the result of a random process; instead it has been proven to contain significant characteristics of the biological learning process, namely Hebb's postulate and Stent's anti-Hebbian postulate 39, HSP 4, synaptic redistribution 4 and the excited behavior of neurons

at low firing frequencies 4. Compared to other network models, no hard constraints were defined in our Hebbian learning mechanism, and excitation was controlled merely by an integration of STDP and STP 38. The findings of the research therefore push the validity of the STDP assumption one step forward: STDP can control the excitation of Hebb's postulate when it is appropriately integrated with other plasticity mechanisms such as STP and HSP. Thus, the findings of this research help to remove the boundary conditions or weight constraints applied to Hebbian-learning-based neural networks and provide a biologically feasible computational environment for computer scientists to examine the underlying behaviors in depth when learning takes place.

1.1 Motivation

Many correlation-based learning algorithms are primarily based on Hebb's postulate. In the simple Hebbian learning algorithm, the synaptic strength between two neurons increases if their activity is correlated and decreases otherwise 8. This mathematical interpretation of Hebb's postulate allows boundless growth or weakening of the synaptic strength between the two neurons 3. Even though Hebbian learning has been witnessed in biological experiments on long-term plasticity, how it balances the excitation of neurons is still poorly understood. Recent biological findings support the fact that STDP is a key mechanism in how neurons process information in the brain; it is a form of long-term plasticity that depends on the relative timing of pre-synaptic and post-synaptic action potentials 6 (action potentials are the short, sudden increases in voltage that biological neurons use to send information to other neurons; the post-synaptic neuron is the one that receives the signal). The applicability of STDP has been evaluated in a variety of computational environments, especially to balance the excitation of a Hebbian post-synaptic neuron by generating competition between synapses. Furthermore, a comprehensive mathematical exploration of STDP on stochastic Hebbian neural networks has emphasized the possibility of converging the weight distribution into a stable weight matrix for Poisson inputs 9. However, many learning algorithms based on STDP depend on the constraints applied to the weight-update algorithm, which ultimately limit the performance of learning and/or deteriorate the learning tasks. Neuroscientists have cited many possible mechanisms to remove this inefficiency. One approach is to introduce dynamic, multiple, stochastic synaptic connections between modeled neurons in which synapses are able to adjust their own probability of neurotransmitter release (Pr) 4, 43. Updating Pr according to the history of the short-term activity of synapses seems to be a key factor for successful learning in neural networks. It can also be considered an elegant way of introducing activity-dependent modifications to synapses and, consequently, of generating competition between synapses. Even though advanced research has been conducted in many forms under this hypothesis,

such as modeling networks using stochastic neurons 34 36, spiking neurons 5, 6 and inhibitory neurons 4, the specifics of a biologically plausible model of the plasticity mechanisms that can account for the observed synaptic patterns have remained elusive.

1.2 Problem in Brief

The main focus of this thesis is to investigate a biologically feasible, stochastic and competitive Hebbian learning mechanism to control the instability of Hebbian neurons, using a proper integration of STDP with other plasticity mechanisms and without defining hard constraints on the learning algorithm.

1.3 Aim and Objectives of the Research

The aim of this research is to introduce a biologically plausible, stochastic and competitive Hebbian learning mechanism that lifts the hard constraints on the weight learning algorithm of Hebbian neurons.

Objectives:
1. To analyze the excitation of Hebbian neurons in mathematical terms.
2. To evaluate the strengths and limitations of the approaches taken in the literature to control this excitation.
3. To critically explore the neuro-scientific findings related to Hebb's postulate.
4. To analyze the significant plasticity mechanisms underlying Hebbian learning in the brain.
5. To define those significant biological plasticity mechanisms mathematically.
6. To integrate the identified mechanisms and neuronal features with STDP.
7. To evaluate the behavior of Hebbian neurons and the stability of the network by conducting a series of experiments.

1.4 Approach

A fully connected neural network was developed with stochastic neurons, in which each neuron consisted of thousands of computational units. These computational units were categorized as transmitters and receptors according to the role they played in the network. A unit was called a transmitter if it transmitted signals to other neurons and a receptor if it received signals into the neuron. The receptors of a given neuron were clustered into receptor groups. According to the excitation and inhibition of the modeled neuron, these computational units updated their states dynamically from an active to an inactive state or vice versa. An active transmitter of the pre-synaptic neuron and an active receptor of the corresponding receptor group of the post-synaptic neuron formed a dynamic and stochastic synaptic connection that

simulated the process of a single biological synapse. Only when the connected computational units were in the active state did the connection mediate the successful transmission of signals between neurons. A transmitter at the pre-synaptic neuron (the pre-synaptic neuron is the one that transmits the signal) can be imagined as a synaptic vesicle which releases a single neurotransmitter at a time, and a modeled receptor can be considered a post-synaptic receptor at the synaptic cleft. Under these features, the excitation of a modeled neuron at a particular synapse in our network was determined by a function of the number of active transmitters in the pre-synaptic neuron, the transmitters' release probability, and the number of active receptors at the corresponding receptor group of the post-synaptic neuron. The model proposed in this thesis 38 differs from others due to the computational power granted to the modeled receptors and transmitters, so that the behavior of a single neuron is determined by the collective activities of these dynamic stochastic components.

A series of experiments was conducted to analyze the stability of the modeled network for a range of Poisson inputs by changing the flow of the internal information, and subsequently the underlying process that stabilized the network was critically evaluated against recent neuroscience findings. The biological fidelity and stability of the modeled network were verified in detail using two methods: first, by varying the firing rates of neurons with sigmoid-correlated and uncorrelated inputs 4, and second, by applying Poisson inputs in the range of 10 Hz to 40 Hz with a uniform mean firing rate. In both cases the network successfully reached stability while retaining sensitivity to the correlation structure of the applied inputs. The latter evaluation was conducted in two stages: first, both neurons were fed by Poisson inputs and the network stability was evaluated 44; second, only one neuron was fed by Poisson inputs and the stability was evaluated 38. In each stage it was shown that the proposed Hebbian learning mechanism drove the network to stability. Furthermore, the underlying process that supported the stabilization of the network had many significant biological characteristics, namely the properties of Hebb's postulate and Stent's anti-Hebbian postulate 4, the properties of HSP 4, 45, behavior similar to synaptic redistribution 38, and operation at lower firing frequencies similar to excited biological neurons 4. Thus the proposed Hebbian learning mechanism has lifted the hard weight constraints that were put on Hebbian-based neural networks and has postulated the possibility of removing these constraints from the learning theory if the structure of the neural network listens and adjusts itself according to synaptic fluctuations in the form of STDP, STP and HSP, rather than to general fluctuations that occur at the neuronal level.
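Purely to make the structure described in this section concrete, the sketch below illustrates, under assumed and simplified conventions, how the excitation carried across one modeled synapse could be computed from the active transmitters, their release probabilities and the active receptors of the target receptor group. The class layout, the unit counts and the simple expected-value form of the drive are illustrative choices of this sketch, not the model developed in chapter 4.

```python
import numpy as np

rng = np.random.default_rng(0)

class ModeledNeuron:
    """Toy neuron made of many binary computational units (illustrative only)."""
    def __init__(self, n_transmitters=1000, n_receptors=1000, n_groups=4):
        # Each unit is either active (1) or inactive (0).
        self.transmitters = rng.integers(0, 2, n_transmitters)
        self.release_prob = np.full(n_transmitters, 0.5)   # assumed release probability P_r
        # Receptors are clustered into receptor groups, one group per pre-synaptic neuron.
        self.receptor_groups = np.array_split(rng.integers(0, 2, n_receptors), n_groups)

def synaptic_drive(pre: ModeledNeuron, post: ModeledNeuron, group: int) -> float:
    """Excitation delivered by `pre` onto receptor group `group` of `post`.

    Assumed here to be the expected number of successful releases among the
    active transmitters, scaled by the fraction of active receptors in the group.
    """
    active_t = pre.transmitters.astype(bool)
    expected_release = pre.release_prob[active_t].sum()   # stochastic release, in expectation
    active_r_fraction = post.receptor_groups[group].mean()
    return expected_release * active_r_fraction

A, B = ModeledNeuron(), ModeledNeuron()
print(synaptic_drive(A, B, group=0))
```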

1.5 Why a Hebbian Learning Mechanism and Why not a Learning Algorithm?

In this research we have purposely used the term Hebbian learning mechanism rather than Hebbian learning algorithm. This section clarifies that choice of term. The main objective of this research is to stabilize Hebbian neurons without defining hard constraints. With this objective, it aimed to develop a novel neural structure and supportive underlying mechanisms that control this instability when Hebbian learning takes place. The learning algorithm practiced by the neurons in this network is the Hebbian learning algorithm, as demonstrated in chapter 5. The issue arises when signals pass between neurons according to the Hebbian learning algorithm and synaptic weights are updated according to their interactions: these neurons tend either to become over-excited or over-depressed, making the entire network either unstable or saturated. Thus, as defined in Hebb's postulate, from which the Hebbian learning algorithm is derived, this research focused on the underlying metabolic changes that take place when neurons practice Hebbian learning, in order to remove this inefficiency. Through a critical literature survey it was identified that these metabolic changes are none other than the plasticity mechanisms that underlie the learning process. The focus was then to integrate these plasticity mechanisms mathematically into Hebbian neurons. This attempt resulted in a novel neural structure and a weight-updating algorithm that responds closely to the integrated plasticity mechanisms. Although the developed weight-updating algorithm is a derivative of the standard Hebbian learning algorithm, the main contribution of this research to the field is the successful integration of the plasticity mechanisms Hebb alluded to into the Hebbian learning algorithm. This integration resulted in a stochastic, competitive and Hebbian-supportive learning mechanism which successfully controlled the instability of Hebbian neurons. Most importantly, it has been shown that the developed plasticity mechanism exhibits many significant features of the theory it was extracted from, such as synaptic redistribution, Stent's anti-Hebbian postulate and the homeostatic synaptic plasticity process. Therefore the novelty of the research is the integration of the plasticity mechanisms into Hebb's postulate, which allowed our neurons to be competitive and stochastic while firing in a feasible operational range when Hebbian learning takes place. This integration of plasticity mechanisms with the novel neural structure that supports Hebb's postulate and removes its inefficiency is collectively termed the stochastic and competitive Hebbian learning mechanism.

1.6 Structure of the Thesis

The rest of the thesis is structured as follows: chapter 2 reviews the recent trends and unresolved issues in the domain. Chapter 3 describes the related background theories and chapter 4 mathematically formulates our approach. Finally, chapter 5 evaluates the overall findings of the research and chapter 6 concludes the findings with a note on further work. Appendix A lists the journal and conference publications of the research.

1.7 Summary

The deterioration of learning tasks under Hebbian learning, caused by the application of biologically infeasible hard constraints on the learning algorithm, remains a significant open problem for researchers in ANN. The key suggestion of neuroscientists is to handle this excitation at the synaptic level rather than the neuronal level. Mathematically attending to STDP, STP and other forms of plasticity might be a productive effort that could yield a feasible solution to this long-standing problem. The thesis presents a stochastic and competitive Hebbian learning mechanism that complies with this very hypothesis and shows that the proposed mechanism helps to eliminate the weight boundary conditions on Hebbian neurons when learning takes place. The next chapter reviews the literature and state of the art in detail.

Literature Survey

The previous chapter briefly explained the importance of addressing the issue discussed in this thesis. This chapter starts by providing background on the domain of ANN and then briefly reviews the state of the art from an engineering perspective rather than a biological perspective. Finally, a more elaborate analysis of the main issue discussed in the thesis is presented by critically reviewing others' significant contributions to the subject. The theoretical neuro-scientific background that supported the successful implementation of the proposed Hebbian learning mechanism is discussed in the next chapter.

2.1 Generations of ANN

One of the greatest efforts of mankind to implement artificial brain systems in the field of engineering has succeeded with the birth of artificial neural systems. Today it is one of the fastest-growing technologies for realizing learning on machines, and its performance has been evaluated in a variety of computing environments, ranging from small personal computers to supercomputers in industrial research labs. Generally, research in ANN is classified into three generations 46, 47.

The neural networks of the first generation are based on McCulloch-Pitts threshold neurons 48. The McCulloch-Pitts neural model is also known as a linear threshold gate. It is a neuron with a set of inputs and one output. The linear threshold gate simply classifies the set of inputs into two different classes, so the output it generates is binary: these neurons emit a high output if the sum of the weighted incoming signals (the sum of the product of each input with the weight associated with that input) is greater than a preset threshold value. They have been successfully applied in multi-layer perceptron networks (see section 2.2.2) and Hopfield nets (see section 2.2.5). These neurons are capable of computing any boolean function with a multi-layer perceptron that has a single hidden layer, and they are called universal for digital computing 46, 47.

The neurons in second-generation neural networks use an activation function (a function that describes the output behavior of a neuron) to calculate their output instead of a threshold function. Networks of the second generation therefore take analog inputs and generate analog outputs using activation functions suited to the application. The sigmoid function and the hyperbolic tangent function are the most commonly used activation functions in the literature. Second-generation neurons are commonly used in feed-forward and recurrent neural networks, are capable of approximating any analog function arbitrarily well, and are called universal for analog computing 48.
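As a concrete illustration of the difference between the first two generations, the following minimal sketch contrasts a McCulloch-Pitts threshold gate with a second-generation sigmoid neuron; the weights and threshold are arbitrary example values.

```python
import math

def mcculloch_pitts(inputs, weights, threshold):
    """First generation: linear threshold gate with a binary output."""
    net = sum(x * w for x, w in zip(inputs, weights))   # weighted sum of incoming signals
    return 1 if net > threshold else 0

def sigmoid_neuron(inputs, weights, bias=0.0):
    """Second generation: graded (analog) output through a sigmoid activation."""
    net = sum(x * w for x, w in zip(inputs, weights)) + bias
    return 1.0 / (1.0 + math.exp(-net))

x = [1, 0, 1]
w = [0.4, 0.9, -0.2]
print(mcculloch_pitts(x, w, threshold=0.1))   # -> 1 (binary class decision)
print(sigmoid_neuron(x, w))                   # -> ~0.55 (analog firing rate)
```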

The neurons of the third generation are more biologically grounded than the neural models of the previous generations, because they have been developed according to recent biological findings and are much closer to what happens inside real neurons in the brain. Spiking neurons such as leaky-integrate-and-fire neurons, stochastic neurons and inhibited stochastic neurons (see section 2.3) are the significant neural models of the third generation and have mainly been applied where time plays a major role in the inputs. These neurons are very fast, can easily be integrated into real dynamic environments, and have the ability to compute any function that a second-generation network can compute 46, but many of these neural models are still at the research level, seeking a proper integration with the plasticity mechanisms found in biology.

2.2 Learning in ANN

Learning rules in neural networks can be classified into two categories based on whether a teacher is used to train the network. If the network is trained using a teacher signal it is called supervised learning; otherwise it is called unsupervised learning. In supervised learning, training takes place as an error-reduction mechanism in each step, where the error is determined by the difference between the network output and the desired output. Networks that practice unsupervised learning are called self-organizing because they automatically adjust themselves towards the target output without a teacher. Unsupervised learning is used when, for a given input, the exact numerical output that a network should produce is unknown.

Moreover, learning rules in neural networks can be classified as either incremental learning or batch learning, according to how training is conducted. Batch learning takes place when the network weights are adjusted in a single training step; therefore, in batch learning, the complete set of input/output training data is needed to determine the weights, and feedback information produced by the network itself is not involved in developing the network. In contrast, learning with feedback in each step is called incremental learning. In sum, learning algorithms in neural networks generally fall under either supervised or unsupervised learning and mainly operate as incremental learning algorithms. The basic process of learning in a neural network is depicted in fig. 2.1. Both learning mechanisms generalize the applied inputs either into patterns or into clusters. The rest of the section discusses the learning algorithms generally used in applications of ANN, together with their pros and cons. Many advanced or hybrid versions of these learning algorithms can be found in the literature and have shown impressive performance on the trained task; however, as discussed under each learning algorithm, the inherited structural features of their parent algorithms (whether positive or negative) are still present in most of these updated versions.

Figure 2.1: Supervised learning vs Unsupervised learning

In the learning algorithms of ANN, a neuron is an adaptive element whose weight values with other neurons are modified depending on the input it receives and the output it generates. If the network is trained with a supervised learning algorithm, the weight updates are also associated with the error information calculated from a teacher signal and the generated output. Under an unsupervised learning algorithm, no error information is associated with the weight modifications.

2.2.1 Hebbian Learning

The Hebbian unsupervised learning algorithm is one of the most widely used in ANN and has become a hot topic in the research field 5. It has drawn much attention from scientists because the postulate it is based on has been supported by neurobiological findings. The Hebbian learning algorithm was derived by mathematicians from the postulate made by Hebb. Hebb hypothesized that a biological neuron induces long-lasting cellular changes in another neuron's stability through repetitive activity, and he further added that these repetitive activities are essential when an association occurs in perception, learning and intelligence. This postulate is also known under the names Hebb's synapses, Hebbian neurons, etc. In Hebb's own words, the interactivity of two biological neurons is postulated as: "When an axon of cell A is near enough to excite a cell B and repeatedly or persistently takes part in firing it, some growth process or metabolic changes take place in one or both cells such that A's efficiency, as one of the cells firing B, is increased" 8.

Although this phenomenon has been witnessed in biological neurons when learning takes place, the postulate does not state when A's efficiency is decreased. Further, the character of the metabolic changes that occur inside neuron A is not clearly defined, even though it is the main factor improving A's efficiency. Additionally, the interpretation of the phrase "as one of the cells firing B" is still debated with respect to the relative importance of the locus of the neuron for synaptic potentiation 3. However, current biological findings have either fully or partially answered many of the cited issues in this theory. Among them, it has been identified that the metabolic changes are none other than the plasticity mechanisms that help a neuron adjust to its feasible range 5. These plasticity mechanisms are now being combined with Hebbian neurons and have helped to open the third generation of ANN.

This famous postulate has been rephrased in the sense that modifications of synaptic transmission efficacy are driven by the correlated firing activity of the pre-synaptic and post-synaptic neurons 5. Although Hebb's postulate mentions no synaptic depression, it was introduced when the postulate was mathematically formulated by computer scientists 9. Most of the present correlation-based learning algorithms are generally called Hebbian learning algorithms, and they reduce synaptic transmission efficacy when the neural activities are not correlated. Since the Hebbian hypothesis is about modifications of synaptic transmission efficacy according to the correlated firing activity of the pre-synaptic and post-synaptic neurons, it has been summarized simply as "cells that fire together, wire together". Even though the mathematical interpretation of Hebbian neurons added the missing reduction of synaptic efficacy between neurons, it could not fully answer how to avoid and control the resulting unbounded excitation and inhibition. Many techniques have been introduced to avoid this issue of node saturation 49, such as weight normalization 3, 3, threshold updating mechanisms 33 and the induction of homeostatic plasticity mechanisms 5, 5. Even though these mechanisms controlled the over-excitation or over-inhibition of Hebbian neurons, they also deteriorated the learning tasks 9, 3 (see the subsections below for a detailed analysis of these mechanisms). Even with these limitations, many learning algorithms have been derived from Hebb's hypothesis, some of which are: continuous-time recurrent neural networks (CTRNNs) 5, rate-based Hebbian learning 9, and spike-based learning 9, 47. Moreover, aligned with current biological findings, novel neural models have been introduced for Hebbian neurons by mathematically integrating biological plasticity mechanisms to avoid the problem of node saturation. Some of the significant neural models are stochastic neurons 5, 6, 5, leaky-integrate-and-fire neurons 34 36, 53, 54 and inhibitory neurons 7, 4; see section 2.3 for more details of these mechanisms with their pros and cons.

The basic version of the Hebbian learning rule for a single neuron states that if the cross-product of the input and the output is positive, the value of the corresponding weight is increased; otherwise the value is decreased. If X = [x_1, x_2, ..., x_j, ..., x_n] are the inputs to the i-th neuron and W_i = [w_i1, w_i2, ..., w_in]^T is the corresponding weight vector, then the learning signal r is equal to the i-th neuron output o_i, as shown in fig. 2.2 and defined in eq. (2.1):

r = o_i = f(W_i^T X)    (2.1)

Here, f(.) is the threshold function, and the increment ΔW_i of the weight vector becomes:

ΔW_i = c f(W_i^T X) X    (2.2)

c is a constant, and eq. (2.2) can be written for the component w_ij simply as:

Δw_ij = c o_i x_j    (2.3)

Figure 2.2: A pre-synaptic Hebbian neuron and post-synaptic neuron

According to eq. (2.3), if the cross-product of the input and the output, i.e. the correlation term o_i x_j, is positive, the value of the corresponding weight component w_ij is increased; otherwise the value is decreased. Therefore, frequent input patterns can drive unconstrained weight components to higher values and may produce ever larger outputs under the Hebbian learning algorithm in the training phase 4. Though this learning algorithm is biologically feasible, it causes unconstrained growth or unconstrained reduction of the weight components, which leads to overly large or negligible outputs. Weight normalization, threshold updating mechanisms such as the BCM theory, and the induction of homeostatic plasticity as a controlling mechanism are the significant proposals that have been made to control this excitation in conventional neural models. The next paragraphs review these techniques.

2.2.1.1 Weight Normalization

Weight normalization in Hebbian learning algorithms can be divided into two methods, multiplicative and subtractive. In the first, each synapse (a synapse is the chemical connection between two neurons; its strength is usually denoted by w in ANN; see the next chapter for more details) decays at a rate proportional to its current strength; this is called multiplicative decay. In the second, each synapse decays at a fixed rate independent of its strength; this is called subtractive decay. Mathematically, multiplicative constraints lead the inputs to develop graded strengths, while subtractive constraints lead synapses to saturate at either the maximal or the minimal allowed value and result in the selection of a few best-correlated inputs 3.
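A minimal sketch of eq. (2.3) combined with the two decay schemes just described; the learning rate, decay rate, choice of f(.) and input statistics are illustrative assumptions of this sketch.

```python
import numpy as np

rng = np.random.default_rng(1)
c, decay = 0.01, 0.001
w = rng.normal(0.0, 0.1, size=5)             # weights of one post-synaptic neuron

def hebbian_step(w, x, norm=None):
    o = float(np.tanh(w @ x))                # o_i = f(w^T x), any bounded f(.) works here
    w = w + c * o * x                        # eq. (2.3): delta w_ij = c * o_i * x_j
    if norm == "multiplicative":
        w -= decay * w                       # decay proportional to current strength
    elif norm == "subtractive":
        w -= decay * np.sign(w)              # fixed decay, independent of strength
    return w

for _ in range(1000):
    x = rng.poisson(2.0, size=5).astype(float)   # correlated, always-positive inputs
    w = hebbian_step(w, x, norm="multiplicative")
print(w)   # without `norm`, the same loop drives w toward unbounded growth
```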

However, these user-defined weight constraints significantly affect the dynamic behavior of the neural network they are applied to and limit the performance of learning 3, 3.

2.2.1.2 Homeostatic Plasticity

Homeostatic plasticity 5 (a biological process which helps to maintain the firing frequency of a neuron in a feasible range by changing its morphological and electrical properties) is a mechanism that has been applied in subtractive form to control the excitation in CTRNNs. A CTRNN can either be perceived as an artificial neural network in which neurons are interconnected by fixed weights that establish the synaptic connections, or as a dynamical system in which the behavior of each node of the network is described by a differential equation. This architectural flexibility has enabled CTRNNs to provide not only the general features of ANN but also the benefits of non-linear dynamics, namely the capability to maintain autonomous oscillations and the ability to approximate the output of any dynamic system when correctly parameterized 5, 5, 55. Therefore, CTRNNs are common in autonomous agents as a substrate for the evaluation of dynamic behavior 5, 55. In a CTRNN, each neuron in the network is described by a differential equation, eq. (2.4), that gives the change of the potential of the i-th neuron over time. Here u_i is the internal state of the i-th neuron, τ_i is its time constant, I_i is its external input (or threshold), w_ij is the weight of the connection from the j-th neuron to the i-th neuron, and σ in eq. (2.5) is the output function which defines the relationship between the potential and the neuron's firing rate 55, 56:

du_i(t)/dt = -u_i(t)/τ_i + Σ_{j=1..n} w_ij σ(u_j(t)) + I_i    (2.4)

σ(u_j(t)) = 1 / (1 + exp(-u_j(t)))    (2.5)

The sigmoid shape of the transfer function in eq. (2.5) can drive the neuron's firing rate into a saturated level when its potential is very low or very high. If the neuron's potential is in range B (cf. fig. 2.3), a change in input changes the neuron's potential and produces a corresponding change in the firing rate; neurons with potentials fluctuating in range B are therefore not saturated. If the potential of a neuron lies in range A or C, no significant fluctuation in the firing rate can be seen, because ranges A and C lie on the flat tails of the sigmoid function: a change in input may change the neuron's potential, but there will be a negligible change in its firing rate. Neurons with potentials fluctuating in range A or C are therefore called saturated 5, 55. Saturated neurons play no role in the network dynamics, since they give a constant output for any input; they do not oscillate and they act as barriers to signal propagation 5.

Figure 2.3: Sigmoid function σ(u_j(t)) defines the firing rate of a neuron as a function of its potential u_j(t)

In order to avoid node saturation in CTRNNs, the process of homeostatic synaptic plasticity has been defined as a synaptic scaling mechanism 5: if the neuron firing rate z = σ(u_j(t)) from eq. (2.5) is higher than the user-defined upper limit HU (cf. fig. 2.4), or lower than the user-defined lower limit HL, a plastic facilitation ρ is calculated as defined in eq. (2.6) and used to update the magnitude of the weight components so as to bring the neuron's firing rate back into bounds. The rate of change ẇ of a weight component is then determined by the time constant τ_w, the absolute value of the weight |w|, and the plastic facilitation ρ, as in eq. (2.7). Even though this adaptation improved signal propagation within the network, it damaged the learning and evolving of the CTRNN 5.

ρ = (HL - z)/HL  if z < HL;    ρ = 0  if HL ≤ z < HU;    ρ = (HU - z)/HU  if HU ≤ z    (2.6)

τ_w ẇ = ρ |w|    (2.7)

Figure 2.4: Sigmoid function showing upper and lower bounds of the target firing rate (source: Williams, Homeostatic Plasticity Improves Continuous-Time Recurrent Neural Networks as a Behavioral Substrate)
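The following sketch integrates eqs. (2.4) to (2.7) with a simple Euler step; the network size, time step, time constants and firing-rate bounds HL and HU are illustrative values only.

```python
import numpy as np

rng = np.random.default_rng(2)
n, dt = 5, 0.1
u   = rng.normal(0.0, 1.0, n)        # membrane potentials u_i
W   = rng.normal(0.0, 1.0, (n, n))   # weights w_ij
I   = np.zeros(n)                    # external inputs I_i
tau, tau_w = 1.0, 50.0
HL, HU = 0.2, 0.8                    # lower / upper bounds on the target firing rate

def sigma(u):                        # eq. (2.5)
    return 1.0 / (1.0 + np.exp(-u))

for _ in range(1000):
    z = sigma(u)
    du = -u / tau + W @ z + I                     # eq. (2.4)
    u += dt * du
    # eq. (2.6): plastic facilitation rho, zero inside the homeostatic bounds
    rho = np.where(z < HL, (HL - z) / HL,
          np.where(z > HU, (HU - z) / HU, 0.0))
    # eq. (2.7): weight change proportional to rho and |w| of the post-synaptic neuron
    W += dt * (rho[:, None] * np.abs(W)) / tau_w
print(sigma(u))
```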

2.2.1.3 BCM Theory

The BCM theory is one of the major theories in neuroscience explaining synaptic activity in the brain as a temporal competition between input patterns. In this theory, the dynamics of a neuron are described by a variable giving the instantaneous post-synaptic firing frequency. A neuron in the BCM theory is thus a device that performs spatial integration: it integrates the signals arriving at all parts of the neuron. The neuronal output at time t is a function of the input and the synaptic efficacies at t, and is independent of past history. The most simplified version of the BCM theory assumes that the integrative power of a neuron can be described linearly, as in eq. (2.8); the vector form of eq. (2.8) is written in eq. (2.9):

c(t) = Σ_j m_j(t) d_j(t)    (2.8)

c(t) = M(t) · D(t)    (2.9)

where c(t) is the output at time t, m_j(t) is the efficacy of the j-th synapse onto the post-synaptic neuron at time t, and d_j(t) is the j-th component of the input at time t. This output may be generated through a mixture of excited neurons (neurons which generate excited outputs that increase the correlated synaptic efficacy) and inhibited neurons (neurons which generate inhibited outputs that decrease the correlated synaptic efficacy). Therefore the resulting modification may be of either sign, i.e. excitatory or inhibitory, depending on whether the net effect φ(c(t)) is positive or negative, as defined in eq. (2.10):

φ(c(t)) < 0 if c(t) < θ_M   and   φ(c(t)) > 0 if c(t) > θ_M    (2.10)

where θ_M is the threshold defined in terms of the average output of the neuron, c̄(t) = m(t) · d averaged over time, as in eq. (2.11). The time average is taken over a period T preceding t, much longer than the membrane time constant τ, so that c̄(t) evolves on a much slower time scale than c(t):

θ_M(c̄) = (c̄ / c_0)^p c̄    (2.11)

where c_0 and p are fixed positive constants. In sum, synaptic inputs that drive post-synaptic firing above the threshold value result in an increase of synaptic strength, while synaptic inputs that drive post-synaptic firing below the threshold value result in a decrease of synaptic strength. Even though biological evidence supports a threshold that slides according to the sensitivity of the neuron to the input it receives, sliding the threshold merely on the basis of post-synaptic activity, as defined in the BCM theory, is not directly supported in biology 57. In addition, the BCM theory considers instantaneous post-synaptic firing frequencies for its threshold-updating mechanism rather than the effect of spike arrival times at the synapses. According to the latest biological findings, the generation of long-lasting changes at synapses depends mainly on the spike arrival time at the synapses 4 and the probability of neurotransmitter release at those synapses 43. Therefore, the present conventional approaches, such as weight normalization and the BCM theory, cannot adequately answer the issue of node saturation in the Hebbian learning algorithm without deteriorating the learning tasks.
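A sketch of the simplified rate-based BCM update with a sliding threshold computed from a running average of the post-synaptic output; the learning rate, the averaging window, the values of p and c_0, and the quadratic form chosen for φ are assumptions of this illustration rather than prescriptions of the theory.

```python
import numpy as np

rng = np.random.default_rng(3)
m = rng.uniform(0.0, 0.1, size=10)    # synaptic efficacies m_j
eta, p, c0 = 1e-3, 2.0, 1.0
c_avg = 0.0                           # running average of the output c(t)

for t in range(5000):
    d = rng.poisson(1.0, size=10).astype(float)      # input vector d(t)
    c = float(m @ d)                                  # eq. (2.8): c(t) = sum_j m_j d_j
    c_avg = 0.99 * c_avg + 0.01 * c                   # slow average over a window T >> tau
    theta_M = (c_avg / c0) ** p * c_avg               # eq. (2.11): sliding threshold
    phi = c * (c - theta_M)                           # sign of phi follows eq. (2.10)
    m += eta * phi * d                                # potentiate above, depress below theta_M
    m = np.clip(m, 0.0, None)                         # keep efficacies non-negative
print(m, theta_M)
```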

2.2.2 Perceptron Learning

Supervised learning is further divided into two approaches, which use reinforcement or error correction. In reinforcement learning, after each input-output example the produced output of the network is evaluated against the desired output; if the network does not produce the desired output, the weights of the network are updated, and this correction to the weights is made using the input vector only. In contrast, when learning with error correction, the magnitude of the error, i.e. the gap between the desired output and the produced output, and the input vector jointly determine the magnitude of the corrections to the weights. The perceptron learning algorithm is supervised learning with reinforcement, but some derivatives of perceptron learning use error correction and are called corrective learning 58. The general single-layer perceptron is shown in fig. 2.5 (a single feed-forward network with three input neurons and one output neuron). For the perceptron learning algorithm, the learning signal is the difference between the desired and actual neuronal response, i.e. r = d_i - o_i. Therefore the increment of the weight vector w becomes as given in eqs. (2.12) and (2.13):

Figure 2.5: Perceptron learning in a single layer network

Δw_i = c [d_i - sgn(w^T x)] x    (2.12)

Δw_i = c [d_i - o_i] x    (2.13)

This rule is applicable only for a binary neuronal response, and under this rule the weights are adjusted if and only if o_i is incorrect. The most interesting feature of perceptron learning is that it converges to a weight vector w* that gives the correct response for all training patterns, provided the input vectors originate from two linearly separable classes, as defined below. The Perceptron Convergence Theorem states that if there is a weight vector w* such that f(w* · p(q)) = t(q) for all q, then for any starting vector w, the perceptron learning rule will converge to a weight vector (not necessarily unique and not necessarily w*) that gives the correct response for all training patterns, and it will do so in a finite number of steps.
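A sketch of the perceptron rule of eq. (2.13) on an arbitrary linearly separable toy problem; the data, learning rate and number of epochs are illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(4)
# Toy linearly separable problem: the class label is the sign of the first input.
X = rng.normal(0.0, 1.0, (100, 3))
d = np.where(X[:, 0] > 0, 1, -1)      # desired responses d_i in {-1, +1}
w = np.zeros(3)
c = 0.1

for epoch in range(20):
    for x, target in zip(X, d):
        o = 1 if w @ x > 0 else -1    # binary neuronal response o_i = sgn(w^T x)
        w += c * (target - o) * x     # eq. (2.13): weights change only when o_i is wrong

errors = sum(1 for x, t in zip(X, d) if (1 if w @ x > 0 else -1) != t)
print(w, errors)   # convergence theorem: errors reach 0 in finitely many steps
```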

Linearly separable classes can be distinguished by a perceptron. Let X_1 be the subset of training vectors belonging to C_1 and let X_2 be the set of training vectors belonging to C_2; then X_1 ∪ X_2 is the complete training set X. Given the sets of vectors X_1 and X_2 to train the perceptron, the training process involves the adjustment of the weight vector w such that C_1 and C_2 are linearly separated, that is, there exists some w such that w^T p > 0 for every input vector p ∈ C_1 and w^T p < 0 for every input vector p ∈ C_2.

2.2.2.1 Multi-Layer Perceptron

Multi-Layer Perceptron (MLP) is a term generally used to describe any feed-forward network. (A feed-forward network has a layered structure: each layer consists of units which receive their inputs from neurons in the layer immediately below and send their outputs to neurons in the layer immediately above; there are no connections among the neurons within a layer.) The concept of the MLP was introduced by adding two or more layers to a single-layer perceptron network in order to solve the XOR (exclusive OR) problem. Figure 2.6 shows a feed-forward network with three layers and fig. 2.7 demonstrates the XOR problem graphically. As shown in fig. 2.7, the points in the Cartesian product cannot be linearly separated with the single-layer perceptron learning algorithm. However, as fig. 2.8 shows, with a hidden layer the points in the Cartesian product can now be separated, even if further points are added to the same class. In an MLP with more than three layers, the first layer draws linear boundaries, the second layer combines the boundaries and the third layer generates arbitrarily complex boundaries; simply put, the second layer learns local knowledge whilst the third layer learns global knowledge 6. A concrete illustration is sketched below.

Figure 2.6: Perceptron learning in a three layer feed-forward network
Figure 2.7: XOR problem in a single layer network
Figure 2.8: XOR problem in a three layer network
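To make the XOR argument concrete, the sketch below hard-codes one possible two-layer solution out of threshold units; the particular hidden-unit weights (an OR unit and a NAND unit feeding an AND unit) are a hand-picked example rather than a trained network.

```python
def step(x):                                 # simple threshold unit
    return 1 if x > 0 else 0

def xor_mlp(x1, x2):
    """Two hidden threshold units (OR and NAND) followed by an AND output unit."""
    h1 = step(x1 + x2 - 0.5)                 # OR
    h2 = step(-x1 - x2 + 1.5)                # NAND
    return step(h1 + h2 - 1.5)               # AND of the two -> XOR

for a in (0, 1):
    for b in (0, 1):
        print(a, b, "->", xor_mlp(a, b))     # 0, 1, 1, 0: not achievable with one layer
```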

2.2.3 Delta Learning

The generalized delta learning rule uses as its learning signal the product of the difference between the desired output and the generated output and the derivative of the generated output, i.e. r = [d_i - f(w_i^T x)] f'(w_i^T x). The delta learning rule is therefore a supervised learning algorithm. The increment of the weight vector is defined in eq. (2.14):

Δw_i = η [d_i - f(w^T x)] f'(w^T x) x    (2.14)

where o = f(w^T x) is the output generated by the network, so eq. (2.14) can be written as in eq. (2.15):

Δw_i = η (d_i - o_i) f'(w^T x) x    (2.15)

For a linear activation function f(x) = x, eq. (2.14) gives the standard delta rule as in eq. (2.16):

Δw_i = η (d_i - o_i) x    (2.16)

The delta learning rule can be applied only to a two-layer network, i.e. an input layer and an output layer, but is not applicable to neural networks with more than two layers. To overcome this problem, i.e. to teach networks with hidden layers under a supervised learning algorithm, the back-propagation learning algorithm was introduced. It is a special version of the delta learning rule, and it updates the weight components of the hidden layers backward, starting from the output layer.

2.2.3.1 Back-Propagation Learning Algorithm

Although the back-propagation learning algorithm can be applied to networks with any number of layers, it has been shown that one layer of hidden neurons suffices to approximate any function to arbitrary precision, provided that the activation functions of the hidden neurons are non-linear. In many applications of the back-propagation learning algorithm, a feed-forward network with a single layer of hidden neurons is used with the sigmoid activation function f(x) = 1/(1 + exp(-x)). Applying the back-propagation learning algorithm involves two phases. In the first phase the input x is presented and propagated forward through the network to compute the output values o_k of the output neurons. During the second phase the error signal is passed backward to each neuron in the network and the appropriate weight changes are calculated. For example, consider the network in fig. 2.9. If the gradient descent approach is used to update the weights, the objective of learning is to modify the weight matrices so as to reduce the sum of squared errors E = (1/2) Σ_k (d_{p,k} - o_{p,k})^2 for each input pattern p. (Gradient descent uses an error function E(w_ij) that measures how far the current network is from the desired one; the partial derivatives ∂E/∂w_ij then tell in which direction to move in weight space to reduce the error.) E and the forward pass can be written as in eqs. (2.17) to (2.19):

E = (1/2) Σ_k (d_k - o_k)^2    (2.17)

o_k = f(net_k^(2)) = f( Σ_j x_j^(1) w_{k,j}^(1,2) )    (2.18)

x_j^(1) = f(net_j^(1)) = f( Σ_i x_i w_{j,i}^(0,1) )    (2.19)

Figure 2.9: Feed-forward network with three layers

Here $w^{(0,1)}_{ji}$ denotes the weight on the connection from the $i$-th neuron in the input layer to the $j$-th neuron in the hidden layer, and $w^{(1,2)}_{kj}$ denotes the weight on the connection from the $j$-th neuron in the hidden layer to the $k$-th neuron in the output layer. By the chain rule, the partial derivative of $E$ with respect to a hidden-layer weight can be written as in eq. (2.20):
\[
\frac{\partial E}{\partial w^{(0,1)}_{ji}} = \sum_k \frac{\partial E}{\partial o_k}\, \frac{\partial o_k}{\partial \mathrm{net}^{(2)}_k}\, \frac{\partial \mathrm{net}^{(2)}_k}{\partial x^{(1)}_j}\, \frac{\partial x^{(1)}_j}{\partial \mathrm{net}^{(1)}_j}\, \frac{\partial \mathrm{net}^{(1)}_j}{\partial w^{(0,1)}_{ji}} \tag{2.20}
\]
Combining the weight-updating mechanism of the delta learning rule for the output units with the chain rule in eq. (2.20) gives the back-propagation weight-update rules for the output units and the hidden units in eqs. (2.21) and (2.22) respectively, where $\delta_k = (d_k - o_k)\, f'(\mathrm{net}^{(2)}_k)$ and $\mu_j = \sum_k (\delta_k\, w^{(1,2)}_{kj})\, f'(\mathrm{net}^{(1)}_j)$:
\[
\Delta w^{(1,2)}_{kj} = \eta\, \delta_k\, x^{(1)}_j \tag{2.21}
\]
\[
\Delta w^{(0,1)}_{ji} = \eta\, \mu_j\, x_i \tag{2.22}
\]
In general, the back-propagation learning algorithm has great representational power, since it can approximate any function using the gradient-descent approach. It is also easy to apply, easy to implement, and has good generalization power. However, it takes a long time to converge, and its output-generating mechanism is a black box because the hidden nodes and the learned weights do not have clear semantics. Moreover, it only guarantees to reduce the total error to a local minimum, and the error might not be reduced to zero.

2.2.4 Winner-Take-All Learning

The winner-take-all learning rule is an unsupervised and competitive learning rule [6]. It was developed by Kohonen and is mainly used for clustering and for extracting statistical properties of data. The weights are modified only for the neuron with the highest output value; the weights of the remaining neurons are left unchanged. The increment of the weight vector $W_m = [w_{m1}\; w_{m2}\; \ldots\; w_{mn}]$ of the winning neuron is therefore:
\[
\Delta W_m = \alpha\,(x - w_m) \tag{2.23}
\]
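As a toy illustration of eq. (2.23) (the learning rate, the normalization step, and all names here are assumptions of this sketch, not part of the thesis), a winner-take-all update can be written as:

```python
import numpy as np

def winner_take_all_step(W, x, alpha=0.05):
    """One winner-take-all update (eq. 2.23): only the winning row of W moves.

    W : (num_neurons, input_dim) weight matrix, rows are neuron weight vectors.
    x : input vector (assumed normalized, as in Kohonen-style training).
    """
    m = np.argmax(W @ x)            # winner = neuron with the largest net input
    W[m] += alpha * (x - W[m])      # move the winner's weights towards x
    return W

# toy usage: cluster a few normalized inputs with three competing neurons
rng = np.random.default_rng(0)
W = rng.normal(size=(3, 4))
for _ in range(200):
    x = rng.normal(size=4)
    x /= np.linalg.norm(x)          # normalize, as Kohonen networks assume
    W = winner_take_all_step(W, x)
```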

2.2.4.1 Kohonen Networks

Kohonen networks in their simplest form consist of an input layer and an output layer. The output layer, called the Kohonen layer, also has a special internal structure. In a Kohonen network each neuron in the input layer is connected to every neuron of the output layer, and in addition there are interconnections among all the neurons within the Kohonen layer. Each neuron in this layer excites the neurons in its surround to some degree and inhibits the neurons that are far away (a centre-surround arrangement); the most active neuron is therefore called the winning neuron. The active input neurons that represent a pattern strengthen their connections with the winning neuron by increasing the corresponding weights over time, cf. fig. 2.10.

Figure 2.10: Kohonen map

In Kohonen networks all the input data are normalized, so the length of each input vector becomes equal to unity. The activation function can be either unipolar or continuous. For example, if the inputs are binary, say a five-element vector $X$, the net input $\mathrm{net} = \sum_{i=1}^{5} x_i w_i = X W^T$ takes its maximum value when the weights are identical to the input pattern, i.e. $W = X$. The weights of the winning neuron in the Kohonen layer are then updated as defined by the winner-take-all algorithm. Even though Kohonen networks are good at extracting statistical properties of the data, they have a few drawbacks: important information about the length of the input vector is lost during the normalization process, and the clustering depends on the order in which the patterns are applied.

2.2.5 Hopfield Networks

In contrast to feed-forward networks, in recurrent networks the neuronal outputs can be connected back to their inputs, so signals in the network can circulate continuously. The single-layer recurrent neural network was analyzed by Hopfield and is called the Hopfield network. In a Hopfield neural network every neuron is connected to every other neuron in the network; such networks are termed auto-associative networks, cf. fig. 2.11. In the network shown in fig. 2.11 the weights are given by a symmetrical square matrix $W$ with zero elements on the diagonal, i.e. $w_{ij} = 0$ for $i = j$. The stability of the system is analyzed using the energy function $E = -\frac{1}{2}\sum_{i=1}^{N}\sum_{j=1}^{N} w_{ij}\, v_i\, v_j$. It has been proved that during training the energy $E$ of the network decreases and the system converges to stable points.

Figure 2.11: A Hopfield neural network

This is especially true when the values of the system outputs are updated in asynchronous mode. If Hebbian learning, $w_{ij} = w_{ji} = (2v_i - 1)(2v_j - 1)$, is implemented in a Hopfield network, the network takes on a number of properties.

(a) An input signal can be stored in the network simply as a pattern of activated and non-activated neurons.
(b) The internal connections of the network ensure that when a new input is presented, much more goes on in the network than just the activation of the neurons receiving those inputs. It has been demonstrated that reception of an input changes the activation state of the network for some time before it reaches a stable state. The stable state is the memory of the input pattern and is also called an attractor; Hopfield networks are therefore sometimes called attractor networks.
(c) A network can store more than one pattern in this way and therefore produce more than one attractor.
(d) Even if the input is only similar to a stored input, the activation pattern of the network will converge on the closest attractor. In other words, the network judges similarity and thereby generalizes across a set of input patterns; it recognizes input patterns even if they are only partially present.
(e) Nonetheless, the maximum number of input patterns that can be stored in this way is equal to about 13 percent of the total number of neurons in the network.

2.2.5.1 Autoassociative Memory

Autoassociative memory was developed by Hopfield by extending the concept of his networks. The same network structure as in fig. 2.11 is used, but with bipolar neurons whose outputs are equal to -1 or +1.

In this network the input patterns $p_m$ are stored in the weight matrix $W$ using the auto-correlation function defined in eq. (2.24):
\[
W = \sum_{m=1}^{M} p_m\, p_m^T - M I \tag{2.24}
\]
where $M$ is the number of stored patterns and $I$ is the identity matrix. Using this formula, new patterns can be added to and removed from the memory. When such a memory is exposed to a binary bipolar pattern, then after training the network converges to the closest stored pattern or to its complement. Like Hopfield networks, the associative memory has a limited storage capacity, and when the number of stored patterns is large and close to the memory capacity, the network has a tendency to converge to a state that was not stored. These states are additional minima of the energy function.

2.2.5.2 Bidirectional Autoassociative Memory

The concept of associative memory was extended to bidirectional associative memory [6]. As shown in fig. 2.12, these associative memories are able to store and associate pairs of patterns $a_m$ and $b_m$. Bidirectional autoassociative memories are two-layer networks in which the output of each neuron in the second layer connects directly to the inputs of the neurons in the first layer. The weight matrix of the second layer is $W^T$, where $W$ is the weight matrix of the first layer. $W$ is calculated as a sum of correlation matrices as defined in eq. (2.25):
\[
W = \sum_{m=1}^{M} a_m\, b_m^T \tag{2.25}
\]
where $M$ is the number of stored patterns, and $a_m$ and $b_m$ are the stored vector pairs.

Figure 2.12: Bidirectional autoassociative memories

2.3 Stochastic Networks

The networks discussed so far make it possible to retrieve an entire memory from a small sample of itself, and they operate in a deterministic way based on particular interconnections between neurons and a particular set of initial conditions: the network always exhibits the same dynamical behavior. This is mainly because the rules operating on the neurons in those networks have no probabilistic specification. These types of networks are well suited to problems such as pattern completion.

However, they cannot update or adjust their states dynamically according to feedback from the external environment. Networks that grant a probabilistic specification to their neurons are generally called stochastic networks.

Stochastic networks in ANN are broadly classified into two classes. Stochastic networks in the first category have been developed to perform tasks similar to those performed by conventional neural networks, such as pattern recognition and pattern completion; the best examples are the deep-belief networks developed for character recognition. Stochastic networks in the second category, on the other hand, are still at the research level and have been developed to encode time perturbations, such as sine waves, Poisson inputs, or any other inputs whose form changes with time. Examples of such neuronal models are leaky-integrate-and-fire neurons and stochastically inhibited neurons. The most significant feature of these neuronal models is that they use the spike-timing-dependent plasticity mechanism to encode their information and thereby accommodate long-lasting changes to the network models. The learning mechanism presented in this thesis belongs to this category. The next couple of subsections review these two categories of stochastic networks thoroughly, together with their significant examples.

2.3.1 Deep-Belief Networks

Deep belief networks are derivatives of Boltzmann networks and belong to the first category of stochastic networks; that is, they are generally used for object or pattern recognition tasks. These stochastic networks encode a problem as the initial state of the network, and knowledge and constraints are stored in their connections. The general procedure for training these networks is first to show the network a real-world distribution of patterns and then to try to develop a network that is capable of generating the same distribution of patterns on its own. However, these networks cannot settle into a stable state, because the states of their neurons are always changing even if the inputs do not change. By letting the network run freely for a long time and recording the states it passes through, a probability distribution over these states can be constructed. The probability distribution after the network reaches thermal equilibrium is the distribution that does not change over time even though the states of the stochastic neurons keep changing. The final goal of this type of network is to make this thermal-equilibrium distribution similar to the real-world distribution by changing the connection weights in the network. Since the connection weights are updated based on the activation of pairs of units, i.e. the probability distribution corresponding to the activation of every unit in the network, these networks are not able to capture any structure of the probability distribution that is higher than second order, because pairwise statistics capture only the mean and the covariance. To overcome this problem, the activations of subsets of units are considered as the patterns; the networks then become capable of capturing higher-order regularities in the distribution. The units involved in the patterns are called visible and the others are called hidden [6].

The next couple of paragraphs briefly introduce Boltzmann machines and then move on to deep belief networks.

2.3.1.1 Boltzmann Machines

A Boltzmann machine is an extension of simple stochastic associative networks, e.g. Hopfield networks, to include hidden units, cf. fig. 2.13. For example, consider a Boltzmann machine and a set of patterns $\alpha$ with real probability distribution $P^+_\alpha$. For each unit in these pattern vectors there is a corresponding visible unit in the Boltzmann machine. In addition to these visible units, some hidden units are added to the network to capture higher-order structure of the real-world distribution. Each unit in the Boltzmann machine computes its energy difference using eq. (2.26) and then updates its state according to the stochastic update rule given in eq. (2.27). After long training, the system reaches a low-temperature thermal equilibrium. Then, by averaging the activities of all the visible units in this free-running mode (for example, averaging the activity $s_i$ of the $i$-th unit), the probability distribution over the visible units $P^-_\alpha$ can be determined. The goal in Boltzmann machines is thus to make this measured $P^-_\alpha$ as close as possible to $P^+_\alpha$. To measure the gap between the two distributions, the Kullback-Leibler [63] distance between them can be calculated as in eq. (2.28):
\[
\Delta E_i = E_{s_i=0} - E_{s_i=1} = \sum_j w_{ij}\, s_j \tag{2.26}
\]
\[
p_i(s_i = 1) = \frac{1}{1 + e^{-\Delta E_i / T}} \tag{2.27}
\]
\[
G = G(P^+_\alpha \,\|\, P^-_\alpha) = \sum_\alpha P^+_\alpha \ln\!\left(\frac{P^+_\alpha}{P^-_\alpha}\right) \tag{2.28}
\]

Figure 2.13: A general Boltzmann machine. The top layer represents a vector of stochastic binary hidden units h, and the bottom layer represents a vector of stochastic binary visible variables v.

Here $\Delta E_i$ is the energy difference of the $i$-th unit between it being inactive and active; the $i$-th unit is in the active state with the probability defined in eq. (2.27), where $T$ is the temperature. Since it is not possible to directly estimate the probability distribution of the hidden units in order to obtain the full network probability distribution needed to calculate the weight updates, the learning procedure of Boltzmann machines consists of two phases [6]. In phase +, the visible units are clamped to the values of a particular pattern and the network is allowed to settle into low-temperature thermal equilibrium; the weights between any two units are then incremented whenever both are in the active state. This phase is repeated a large number of times, with each pattern $\alpha$ presented according to $P^+_\alpha$. In phase -, the network is allowed to run

freely, without any clamping of the inputs, and the activities of all the units are sampled; the weights between any two units that are both on are then decremented. This phase is called unlearning. Alternating the two phases at an appropriate frequency, this learning procedure reduces the cross-entropy between $P^+_\alpha$ and $P^-_\alpha$ by following the gradient defined in eq. (2.29):
\[
\frac{\partial G}{\partial w_{ij}} = -\frac{1}{T}\Big( \langle s_i s_j \rangle^+ - \langle s_i s_j \rangle^- \Big) \tag{2.29}
\]
Boltzmann machines are able to encode regularities in the data of greater than second order, which makes them very powerful. They also provide a very good Bayesian measure of how good a particular model or internal representation is. Nonetheless, they are extremely slow because of the many nested loops involved in the learning procedure: all patterns must be run in both the clamped and the unclamped phases before a single weight update can be obtained.

2.3.1.2 Restricted Boltzmann Machines

The restricted Boltzmann machine (RBM) is the immediate advanced version of the Boltzmann machine. In contrast to Boltzmann machines, an RBM does not have any connections between neurons in the same layer, cf. fig. 2.14. The variable $v$ is the input vector and the variable $h$ represents the hidden units that correspond to the unobserved features [64]. An RBM defines a joint probability $p(v,h)$ over both the observed units $v$ and the unobserved units $h$. This joint distribution is then marginalized over the hidden units to give a distribution over the visible units only, as in eq. (2.30). Here again the probability distribution is defined by an energy function $E$, which is defined over binary vectors $(v,h)$ as in eq. (2.31).

Figure 2.14: A restricted Boltzmann machine. The top layer represents a vector of stochastic binary hidden units h, the bottom layer represents a vector of stochastic binary visible variables v, and there are no connections between units in the same layer.

\[
p(v) = \frac{\sum_h e^{-E(v,h)}}{Z} \tag{2.30}
\]
\[
E(v,h) = -\sum_i a_i v_i - \sum_j b_j h_j - \sum_{i,j} w_{ij}\, v_i h_j \tag{2.31}
\]
where $Z = \sum_{u,g} e^{-E(u,g)}$ is the partition function, $a_i$ and $b_j$ are the biases associated with the input variables $v_i$ and the hidden variables $h_j$ respectively, and $w_{ij}$ is the weight associated with the connection

between $v_i$ and $h_j$. The rest of the computation is done using the Gibbs sampling propagation rule. The idea behind this sampler is that it is much easier to compute the sequence of conditional distributions $p(v \mid h)$ and $p(h \mid v)$ than to obtain the marginal distribution $p(v)$ by integrating the joint distribution $p(v,h)$. The sampler starts with some initial value $h^{0}$ for $h$ and obtains $v^{0}$ by generating a random variable from the conditional distribution $p(v \mid h = h^{0})$ as defined in eq. (2.32). The sampler then uses $v^{0}$ to generate a new value of $h$ from the conditional distribution $p(h \mid v = v^{0})$ as defined in eq. (2.33). Repeating this process $k$ times generates a Gibbs sequence of length $k$. After a sufficient burn-in period, the effect of the initial sample values is removed and the chain converges to a stationary distribution, which is the target distribution we are trying to simulate [5, 64].
\[
p(v \mid h) = \prod_i p(v_i \mid h) \tag{2.32}
\]
\[
p(h \mid v) = \prod_j p(h_j \mid v) \tag{2.33}
\]
The factors $p(v_i \mid h)$ and $p(h_j \mid v)$ are calculated as shown in eqs. (2.34) and (2.35) respectively, where $\mathrm{sigm}(x) = 1/(1 + e^{-x})$:
\[
p(v_i = 1 \mid h) = \mathrm{sigm}\Big(a_i + \sum_j h_j w_{ij}\Big) \tag{2.34}
\]
\[
p(h_j = 1 \mid v) = \mathrm{sigm}\Big(b_j + \sum_i v_i w_{ij}\Big) \tag{2.35}
\]
Although Gibbs sampling works well in RBMs and eases the inference process, in practice it does not always mix well. To improve on these inefficiencies and to train an RBM as a probabilistic model, the natural criterion is to maximize the log-likelihood. Contrastive divergence learning uses such an approach to improve the efficiency of RBM learning.

2.3.1.3 Contrastive Divergence Learning

Contrastive divergence is an approximation of the maximum-likelihood learning algorithm. That is, we are given a probability model $p(x;\Theta) = \frac{1}{Z(\Theta)} f(x;\Theta)$, where $\Theta$ denotes the model parameters, $Z(\Theta) = \int f(x;\Theta)\,dx$ is the partition function, and the training data are $X = \{x_k\}_{k=1}^{K}$. The aim of contrastive divergence is to find the $\Theta$ that maximizes the likelihood of the training data, $p(X;\Theta) = \prod_{k=1}^{K} \frac{1}{Z(\Theta)} f(x_k;\Theta)$, or equivalently the $\Theta$ that minimizes the negative log-likelihood, which is called the energy and is denoted by $E(X;\Theta)$, as defined in

eq. (2.36):
\[
E(X;\Theta) = -\log p(X;\Theta)
= -\log \prod_{k=1}^{K} \frac{1}{Z(\Theta)} f(x_k;\Theta)
= K \log Z(\Theta) - \sum_{k=1}^{K} \log f(x_k;\Theta) \tag{2.36}
\]
Under maximum-likelihood learning, $\partial E(X;\Theta)/\partial \Theta = 0$ at the minimum, and the gradient can be expanded as in eqs. (2.37) and (2.38):
\[
\frac{\partial E(X;\Theta)}{\partial \Theta} = K\,\frac{\partial \log Z(\Theta)}{\partial \Theta} - \sum_{k=1}^{K} \frac{\partial \log f(x_k;\Theta)}{\partial \Theta} \tag{2.37}
\]
\[
\frac{1}{K}\,\frac{\partial E(X;\Theta)}{\partial \Theta} = \frac{\partial \log Z(\Theta)}{\partial \Theta} - \left\langle \frac{\partial \log f(x;\Theta)}{\partial \Theta} \right\rangle_X \tag{2.38}
\]
where $\langle \cdot \rangle_X$ denotes the expectation over the data distribution. Under gradient-descent learning, if a fixed step of size $\eta$ is taken in the direction of steepest descent, eqs. (2.39) and (2.40) give the corresponding parameter updates (absorbing the factor $K$ into the learning rate $\eta$):
\[
\Theta_{t+1} = \Theta_t - \eta\, \frac{\partial E(X;\Theta)}{\partial \Theta}\bigg|_{\Theta_t} \tag{2.39}
\]
\[
\Theta_{t+1} = \Theta_t - \eta \left( \frac{\partial \log Z(\Theta_t)}{\partial \Theta_t} - \left\langle \frac{\partial \log f(x;\Theta_t)}{\partial \Theta_t} \right\rangle_X \right) \tag{2.40}
\]
Since $Z(\Theta) = \int f(x;\Theta)\,dx$, this integral is generally analytically intractable, and it is not easy to calculate either $\partial \log Z(\Theta)/\partial \Theta$ or $\partial E(X;\Theta)/\partial \Theta$. However, a useful substitution exists, as shown in eq. (2.41):
\[
\frac{\partial \log Z(\Theta)}{\partial \Theta}
= \frac{1}{Z(\Theta)} \frac{\partial Z(\Theta)}{\partial \Theta}
= \frac{1}{Z(\Theta)} \frac{\partial}{\partial \Theta}\int f(x;\Theta)\,dx
= \frac{1}{Z(\Theta)} \int \frac{\partial f(x;\Theta)}{\partial \Theta}\,dx
= \frac{1}{Z(\Theta)} \int f(x;\Theta)\, \frac{\partial \log f(x;\Theta)}{\partial \Theta}\,dx
= \int p(x;\Theta)\, \frac{\partial \log f(x;\Theta)}{\partial \Theta}\,dx
= \left\langle \frac{\partial \log f(x;\Theta)}{\partial \Theta} \right\rangle_{p(x;\Theta)} \tag{2.41}
\]
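The expectation over $p(x;\Theta)$ appearing in eq. (2.41) is precisely the term that contrastive divergence approximates by running a short Markov chain started at the data, as derived next. As a purely illustrative sketch of what this looks like for an RBM (the CD-1 variant; the function names, batch handling, and learning rate are assumptions of this example, not the thesis's code), one update step could be:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def cd1_update(W, a, b, v0, eta=0.05, rng=np.random.default_rng()):
    """One CD-1 step for a binary RBM with weights W, visible biases a, hidden biases b.

    v0 : a batch of training vectors, shape (batch, n_visible).
    Approximates the intractable model expectation in the log-likelihood
    gradient with a single Gibbs step started at the data.
    """
    # positive phase: sample hidden units given the data
    ph0 = sigmoid(v0 @ W + b)
    h0 = (rng.random(ph0.shape) < ph0).astype(float)
    # one Gibbs step: reconstruct visibles, then recompute hidden probabilities
    pv1 = sigmoid(h0 @ W.T + a)
    v1 = (rng.random(pv1.shape) < pv1).astype(float)
    ph1 = sigmoid(v1 @ W + b)
    # parameter updates: data statistics minus reconstruction statistics
    W += eta * (v0.T @ ph0 - v1.T @ ph1) / len(v0)
    a += eta * (v0 - v1).mean(axis=0)
    b += eta * (ph0 - ph1).mean(axis=0)
    return W, a, b
```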

Using eq. (2.41), eq. (2.40) can now be written as in eq. (2.42):
\[
\Theta_{t+1} = \Theta_t - \eta \left( \left\langle \frac{\partial \log f(x;\Theta_t)}{\partial \Theta_t} \right\rangle_{p(x;\Theta_t)} - \left\langle \frac{\partial \log f(x;\Theta_t)}{\partial \Theta_t} \right\rangle_X \right) \tag{2.42}
\]
Here $\langle \partial \log f(x;\Theta_t)/\partial \Theta_t \rangle_X$ can be estimated numerically from the training data. To estimate the first term of eq. (2.42), samples must be drawn from $p(x;\Theta)$. Since $Z(\Theta)$ is unknown, samples cannot be drawn directly from a cumulative distribution curve; however, Markov chain Monte Carlo (MCMC) methods can turn random samples into samples from a proposed distribution without knowing $Z(\Theta)$, using the Metropolis algorithm [65]. Let the training data $X$ be the starting point of the MCMC sampling, let $X^0_\Theta$ denote the training data, $X^n_\Theta$ the data after $n$ cycles of MCMC, and $X^\infty_\Theta$ the samples from the proposed distribution with parameter $\Theta$. According to the MCMC process, the parameter-update equation can then be written simply as in eq. (2.43):
\[
\Theta_{t+1} = \Theta_t - \eta \left( \left\langle \frac{\partial \log f(x;\Theta_t)}{\partial \Theta_t} \right\rangle_{X^\infty_{\Theta_t}} - \left\langle \frac{\partial \log f(x;\Theta_t)}{\partial \Theta_t} \right\rangle_{X^0_{\Theta_t}} \right) \tag{2.43}
\]
When the number of MCMC cycles per iteration is made small, say $n = 1$, eq. (2.43) can be written as in eq. (2.44):
\[
\Theta_{t+1} = \Theta_t - \eta \left( \left\langle \frac{\partial \log f(x;\Theta_t)}{\partial \Theta_t} \right\rangle_{X^1_{\Theta_t}} - \left\langle \frac{\partial \log f(x;\Theta_t)}{\partial \Theta_t} \right\rangle_{X^0_{\Theta_t}} \right) \tag{2.44}
\]
One MCMC cycle is enough to move the data from the target distribution towards the proposed distribution, and to indicate in which direction the proposed distribution should move to better model the training data, although a biased distribution may be generated because of the dimensionality reduction [9].

2.3.1.4 Belief Nets

A belief net is a directed acyclic graph composed of stochastic variables and is a derivative of the RBM. Compared with an RBM, belief nets make it possible to sample effectively from the true posterior over the hidden units even when the nets have millions of parameters. In an RBM the hidden units are conditionally independent given the visible states, so an unbiased sample from the posterior distribution can be obtained quickly when a data vector is given. This is achieved by generating unbiased samples from the posterior distribution over the hidden units given the visible (observed) units, and by restricting the connectivity between the stochastic units in a special way to make learning easier, cf. fig. 2.16. As shown in the figure, a belief network has a special architecture: it contains an RBM with symmetric connections between its neurons. Importantly, the connections between the visible units and the hidden units are directed, whereas the connections between the top layer and the layer below it are undirected, which forms an associative memory. With this architecture, applying CD learning and greedy step-wise training (see section 2.3.1.5), belief networks can easily sample from their true posterior distribution over the hidden units.

For example, the learning rule for sigmoid belief nets is given in eq. (2.45), cf. fig. 2.15. The network tries to maximize the log probability of each unit, so that each unit's binary state in a sample from the posterior would be generated by the sampled binary states of its parents.

Figure 2.15: Learning rule of sigmoid belief nets
\[
p_i = p(s_i = 1) = \frac{1}{1 + \exp\!\big(-\sum_j s_j w_{ji}\big)} \tag{2.45}
\]
\[
\Delta w_{ji} = \epsilon\, s_j\, (s_i - p_i), \quad \text{where } \epsilon \text{ is the learning rate.}
\]
After ignoring the bias terms in eq. (2.31), the energy of a joint configuration can be defined as in eq. (2.46), where $v_i$ is the binary state of visible unit $i$, $h_j$ is the binary state of hidden unit $j$, $w_{ij}$ is the weight between units $i$ and $j$, and the energy corresponds to the configuration $v$ on the visible units and $h$ on the hidden units:
\[
E(v,h) = -\sum_{i,j} v_i\, h_j\, w_{ij} \tag{2.46}
\]
\[
-\frac{\partial E(v,h)}{\partial w_{ij}} = v_i\, h_j
\]
The probability of a joint configuration over both the visible and the hidden units depends on the energy of that configuration compared with the energies of all other joint configurations, as defined in eq. (2.47); the probability of a configuration of the visible units is the sum of the probabilities of all the joint configurations that contain it, as defined in eq. (2.48), with $Z = \sum_{u,g} e^{-E(u,g)}$:
\[
p(v,h) = \frac{e^{-E(v,h)}}{Z} \tag{2.47}
\]
\[
p(v) = \frac{\sum_h e^{-E(v,h)}}{Z} \tag{2.48}
\]

Figure 2.16: A belief net with one hidden layer

2.3.1.5 Greedy Learning

The idea behind greedy learning is to allow each model in a sequence of models to receive a different representation of the data. Each model performs a non-linear transformation on its input vectors and produces output vectors that are used as the input for the next model in the sequence. Figure 2.17 shows a multilayer generative model, e.g. a belief net, in which the top two layers interact via undirected connections and all other connections are directed. There are no intra-layer connections and all layers have the same number of units.

Therefore, the top two layers form an associative memory. The layers below have directed top-down generative connections that are used to map a state of the associative memory to an image. There are also directed bottom-up recognition connections that are used to infer a factorial representation in one layer from the binary activities in the layer below. In the greedy learning algorithm, during initial learning the recognition connections are tied to the generative connections, as shown in fig. 2.17. The greedy learning algorithm applies the wake-sleep algorithm between two consecutive layers (see section 2.3.1.6) and moves hierarchically from bottom to top. The greedy learning algorithm is:

1. Learn $W_0$ assuming all the weight matrices are tied.
2. Freeze $W_0$ and use $W_0^T$ to infer a factorial approximate posterior distribution over the states of the variables in the first hidden layer.
3. Keeping all the higher weight matrices tied to each other, but untied from $W_0$, learn an RBM model of the higher-level data that is produced by using $W_0^T$ to transform the original data.

Figure 2.17: A belief net with the greedy learning algorithm

2.3.1.6 Wake-Sleep Algorithm

The wake-sleep algorithm is an unsupervised learning algorithm for a multilayer neural network. Training is divided into two phases, wake and sleep. In the wake phase, neurons are driven by the recognition connections (connections running from what would normally be considered an input towards what is normally considered an output), while the generative connections (connections running from what would normally be considered an output towards what is normally considered an input) are modified to increase the probability that they would reconstruct the correct activity in the layer below (closer to the sensory input). In the sleep phase the process is reversed: neurons are driven by the generative connections, while the recognition connections are modified to increase the probability that they would produce the correct activity in the layer above (further from the sensory input). This strategy allows higher-level neurons to communicate their needs to lower-level neurons, whilst also being easy to implement in layered networks of stochastic binary neurons that have activation states of 0 or 1 and turn on with a probability that is a smooth non-linear function of the total input they receive, as in eq. (2.49):
\[
p(s_j = 1) = \frac{1}{1 + \exp\!\big(-b_j - \sum_i s_i w_{ij}\big)} \tag{2.49}
\]

Here $s_i$ and $s_j$ are the binary activities of units $i$ and $j$, $w_{ij}$ is the weight on the connection from $i$ to $j$, and $b_j$ is the bias of unit $j$. For example, consider the model shown in fig. 2.17, a three-layer neural network. Activities in the bottom layer represent the inputs, and activities in the higher layers learn to represent the causes of the inputs. The bottom-up recognition connections convert the input into an internal representation. The binary state of a hidden unit that was actually used to generate an image can then be used as its desired state when learning the bottom-up recognition weights. Given the top-down generative weights, the recognition weights can be learned, and given the recognition weights, the generative weights can be learned. It is therefore possible to learn both sets of weights by starting with small random values and alternating between the two phases of learning.

The learning task thus consists of two phases. In the wake phase, the recognition weights are used to drive the units bottom-up, and the binary states of units in adjacent layers are used to train the generative weights. In the sleep phase, the top-down generative connections are used to drive the network, and the binary states of units in adjacent layers can then be used to learn the bottom-up recognition connections. During the wake phase, a generative weight $g_{kj}$ is changed by $\Delta g_{kj} = \epsilon\, s_k\,(s_j - p_j)$, where unit $k$ is in the layer above unit $j$, $\epsilon$ is a learning rate, and $p_j$ is the probability that unit $j$ would turn on if it were being driven by the current states of the units in the layer above using the current generative weights. During the sleep phase, a recognition weight $w_{ij}$ is changed by $\Delta w_{ij} = \epsilon\, s_i\,(s_j - q_j)$, where $q_j$ is the probability that unit $j$ would turn on if it were being driven by the current states of the units in the layer below using the current recognition weights [8, 9, 5, 66].

2.3.2 STDP Networks

Spike-timing dependent plasticity (STDP) based networks belong to the second category of stochastic networks, which are still at the research level and have been developed to encode time perturbations such as sine waves and Poisson signals. Although applications of STDP can be found in pattern recognition [53, 54] and in generating competition between neurons [34-36], it has mainly been applied to control the instability of Hebbian neurons. This second type of STDP network has introduced a variety of neuronal models in which the information in the inputs is encoded in terms of spike arrival times rather than the average number of spikes within a given period. Among those neural models, the leaky-integrate-and-fire neuron model [34, 35] and stochastically inhibited neurons [4] are the most significant approaches made so far. The next couple of sections review the state of these networks thoroughly, together with their major applications.

2.3.2.1 Spike-Timing Dependent Plasticity

As per recent literature, STDP [7] has been recognized as a key mechanism of how information is processed in the brain. STDP is a form of long-term plasticity that depends merely on the relative timing of pre-synaptic and post-synaptic action potentials [6].

Although the process and the role of STDP in information passing in some areas of the human brain during the developmental stages are still not clear [37, 67], it has been shown that average-case versions of the perceptron convergence theorem hold for STDP in simple models of spiking neurons, for both uncorrelated and correlated Poisson input spike trains. Furthermore, it has been shown that STDP not only changes the weight of synapses but also modulates the initial release probability of dynamic synapses [9]. The weight learning algorithm of STDP can be defined as in eq. (2.50):
\[
\Delta w(t_{pre} - t_{post}) =
\begin{cases}
A_+ \exp\!\big(-(t_{post} - t_{pre})/\tau_+\big) & \text{if } t_{pre} < t_{post} \\
-A_- \exp\!\big(-(t_{pre} - t_{post})/\tau_-\big) & \text{if } t_{pre} \ge t_{post}
\end{cases} \tag{2.50}
\]
where a spike occurs at the pre-synaptic terminal at $t_{pre}$ and the post-synaptic neuron fires at $t_{post}$. STDP is described by the weight window function $\Delta w(t_{pre} - t_{post})$, which determines how the strength of a synapse is modified by a pair of action potentials: a pre-synaptic spike occurring at $t_{pre}$ and a post-synaptic spike occurring at $t_{post}$. The parameters $\tau_+$ and $\tau_-$ determine the temporal ranges of the two sides of the window function, while $A_+ > 0$ and $A_- > 0$ determine the size of the changes induced by the corresponding spike pairings. The first case in eq. (2.50) occurs when the pre-synaptic spike precedes the post-synaptic spike and produces a long-lasting strengthening called long-term potentiation. The second case occurs when the post-synaptic spike precedes the pre-synaptic spike and produces a long-lasting weakening called long-term depression. Long-term potentiation and long-term depression are considered the main substrates for learning and memory. See the next chapter for the biological explanation of STDP, long-term potentiation, and long-term depression.

2.3.2.2 Networks with Leaky-Integrate-and-Fire Neurons

STDP networks are mainly built from integrate-and-fire neurons. These neuronal models generate a spike whenever the neuron's membrane potential $u$ (the difference in electrical potential between the interior and the exterior of the cell) crosses some threshold $\vartheta$, as defined in eq. (2.51). The moment at which the threshold is crossed is called a firing time $t^{(f)}$. Immediately after $t^{(f)}$, the membrane potential is reset to a new value $u_r < \vartheta$, as defined in eq. (2.52). The leaky-integrator neuron model (a neuron that integrates its input but continually leaks a small amount of it over time), with membrane time constant $\tau_m$, membrane potential $u$, resistance $R$, and input current $I$, is described by eq. (2.53):
\[
t^{(f)}:\; u(t^{(f)}) = \vartheta \;\text{ and }\; \frac{du(t)}{dt}\bigg|_{t=t^{(f)}} > 0 \tag{2.51}
\]
\[
\lim_{t \to t^{(f)},\, t > t^{(f)}} u(t) = u_r \tag{2.52}
\]
\[
\tau_m \frac{du}{dt} = -u + R\, I(t) \tag{2.53}
\]
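As an illustration only (the parameter values, units, and function names are assumptions of this sketch and are not taken from the thesis or the cited studies), the STDP window of eq. (2.50) and a forward-Euler step of the leaky integrate-and-fire dynamics of eqs. (2.51)-(2.53) might be written as:

```python
import numpy as np

def stdp_dw(t_pre, t_post, A_plus=0.01, A_minus=0.012, tau_plus=20.0, tau_minus=20.0):
    """Pair-based STDP window (eq. 2.50): potentiation if pre precedes post, else depression."""
    if t_pre < t_post:
        return A_plus * np.exp(-(t_post - t_pre) / tau_plus)
    return -A_minus * np.exp(-(t_pre - t_post) / tau_minus)

def lif_step(u, I, dt=0.1, tau_m=10.0, R=1.0, theta=1.0, u_reset=0.0):
    """One Euler step of tau_m du/dt = -u + R*I (eq. 2.53); returns (new u, spiked?)."""
    u = u + dt * (-u + R * I) / tau_m
    if u >= theta:                 # threshold crossing defines the firing time t^(f)
        return u_reset, True       # reset to u_r < theta after the spike
    return u, False
```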

STDP has been tested in a variety of computational settings with leaky-integrate-and-fire neurons, especially to balance the excitation of Hebbian neurons by introducing synaptic competition and to identify repeating patterns in continuous spike trains [53, 54]. These experimental studies on synaptic competition using STDP have been conducted in two forms: an additive form and a multiplicative form.

In the additive form, for example in the study of Song et al. [34], the membrane potential of the integrate-and-fire neuron model was determined by eq. (2.54). When the membrane potential reached a threshold value of -54 mV the neuron fired a spike, and the membrane potential was then reset to -70 mV. The increase of the synaptic conductance after the arrival of a spike at an excitatory synapse $a$ is defined in eq. (2.55), and the change of the synaptic conductance after the arrival of a spike at an inhibitory synapse is defined in eq. (2.56). The corresponding leakage of the synaptic conductances at excitatory and inhibitory synapses is defined in eqs. (2.57) and (2.58) respectively:
\[
\tau_m \frac{dV}{dt} = V_{rest} - V + g_{ex}(t)\,(E_{ex} - V) + g_{in}(t)\,(E_{in} - V) \tag{2.54}
\]
\[
g_{ex}(t) \rightarrow g_{ex}(t) + \bar{g}_a \tag{2.55}
\]
\[
g_{in}(t) \rightarrow g_{in}(t) + \bar{g}_{in} \tag{2.56}
\]
\[
\tau_{ex} \frac{dg_{ex}}{dt} = -g_{ex}, \quad \text{with } \tau_{ex} = 5\ \text{ms and } \bar{g}_{max} = 0.015 \tag{2.57}
\]
\[
\tau_{in} \frac{dg_{in}}{dt} = -g_{in}, \quad \text{with } \tau_{in} = 5\ \text{ms and } \bar{g}_{in} = 0.05 \tag{2.58}
\]
where $\tau_m = 20$ ms, $V_{rest} = -70$ mV and $E_{in} = -70$ mV. Further, a function $M(t)$ was defined to decrease synaptic strength each time the post-synaptic neuron fired a spike: $M(t)$ was decremented by $A_-$ as in eq. (2.50). If excitatory synapse $a$ received a spike at time $t$, its maximal conductance was modified according to $\bar{g}_a(t) \rightarrow \bar{g}_a(t) + M(t)\,\bar{g}_{max}$, under the boundary condition that if $\bar{g}_a < 0$ then $\bar{g}_a$ was set to $0$. On the other hand, $P_a(t)$ was defined to increase the strength of synapse $a$: every time synapse $a$ received an action potential, $P_a(t)$ was incremented by $A_+$ as in eq. (2.50), and if the post-synaptic neuron fired a spike at time $t$, $\bar{g}_a$ was modified according to $\bar{g}_a(t) \rightarrow \bar{g}_a(t) + P_a(t)\,\bar{g}_{max}$, under the boundary condition that if $\bar{g}_a > \bar{g}_{max}$ then $\bar{g}_a$ was set to $\bar{g}_{max}$. Initially $M = 0$ and $P_a = 0$, and $M(t)$ and $P_a(t)$ decayed exponentially over time as defined in eqs. (2.59) and (2.60):
\[
\tau_- \frac{dM}{dt} = -M \tag{2.59}
\]
\[
\tau_+ \frac{dP_a}{dt} = -P_a \tag{2.60}
\]
where $\tau_- = \tau_+ = 20$ ms, $A_-/A_+ = 1.05$ and $A_+ = 0.005$. Hence, although in this approach the synapses competed against each other to control the timing of post-synaptic firing, and synapses with strong temporal correlations were strengthened as clusters, it was assumed that synaptic strength does not scale synaptic efficacy, which is not biologically supported, and hard constraints were used to define the efficacy boundaries, which limited the learning performance.

In the multiplicative form, synaptic scaling was introduced separately into the synaptic weights as a function of post-synaptic activity [35, 36]. For example, in the study of van Rossum and Turrigiano [35], the weights on the connections between neurons were modeled by assuming that the amount of potentiation is inversely proportional to the present synaptic weight, while depression is independent of the weight. The weight-updating plasticity rules were therefore $\Delta w_p = c_p \exp(-\delta t/\tau_{STDP})$ for potentiation and $\Delta w_d = c_d \exp(-\delta t/\tau_{STDP})$ for depression, where $c_p$ and $c_d$ are the average amounts of relative potentiation and depression after one pairing, respectively. Synaptic competition was introduced as an external, activity-dependent scaling mechanism. It adjusted the synaptic weights so as to regulate the post-synaptic activity, which was measured by $a(t)$ as defined in eq. (2.61); $a(t)$ increases with every post-synaptic spike and decays exponentially between spikes. Since the scaling was considered in multiplicative form and independent of pre-synaptic activity, the weights were updated at every time step according to eq. (2.62), with $\beta = 4 \times 10^{-5}\ \mathrm{s}^{-1}\mathrm{Hz}^{-1}$ and $\gamma = 10^{-7}\ \mathrm{s}^{-2}\mathrm{Hz}^{-1}$:
\[
\tau \frac{da(t)}{dt} = -a(t) + \sum_i \delta(t - t_i), \quad \text{where } \tau \text{ is of the order of seconds and } t_i \text{ are the spike times} \tag{2.61}
\]
\[
\frac{dw(t)}{dt} = \beta\, w(t)\,[a_{goal} - a(t)] + \gamma\, w(t) \int_0^t [a_{goal} - a(\tilde{t})]\, d\tilde{t} \tag{2.62}
\]
Although the network reached a stable equilibrium for Poisson inputs, because of the reduced competition between synapses all synapses stabilized to a similar equilibrium even for moderately strong correlations in the input spikes. It can therefore be said that many similar STDP-based applications for controlling the excitation of Hebbian neurons depend on user-defined constraints in the weight algorithm, which ultimately limit the learning performance.

2.3.2.3 Networks with Stochastic Inhibited Neurons

To alleviate this limitation of hard weight constraints in the learning process, another significant approach discussed in the literature is to remove the correlations in the input spike trains by using recurrent neural networks. The results of Einevoll et al. claim that the correlations in the spike inputs can be reduced by recurrent network dynamics. The experiment was conducted on two types of recurrent neural networks: one with purely inhibitory neurons and one with mixed inhibitory-excitatory neurons. At low firing frequencies, response fluctuations were reduced in the recurrent network with inhibitory neurons when compared with a feed-forward network with inhibitory neurons. Moreover, in the case of homogeneous excitatory and inhibitory sub-populations, negative feedback helps to suppress the population rate in both recurrent

and feed-forward networks. Because inhibitory feedback effectively suppresses pairwise correlations and population-rate fluctuations in recurrent neural networks, they suggested using inhibitory neurons to de-correlate the input spike trains.

Moving one step further by combining the underlying concepts in [7, 3], that is, using nonlinear temporally asymmetric Hebbian plasticity (i.e. STDP) together with the recent experimental observation of STDP at inhibitory synapses, Luz and Shamir [4] have discussed the stability of Hebbian plasticity in feed-forward networks with $N_E$ excitatory synapses $\{w_i^E\}_{i=1}^{N_E}$ and $N_I$ inhibitory synapses $\{w_i^I\}_{i=1}^{N_I}$ converging on a single post-synaptic neuron. The post-synaptic response $\rho_{post}(t)$ was modeled as a delayed linear sum of the inputs as defined in eq. (2.63), where $\epsilon$ is a small positive constant, $\rho_j^{E/I}(t) = \sum_i \delta(t - t_j^{E/I,i})$ is the spike train of the $j$-th excitatory/inhibitory neuron, and $\{t_j^{E/I,i}\}$ are the spike times of that neuron:
\[
\rho_{post}(t + \epsilon) = \frac{1}{N_E} \sum_{j=1}^{N_E} w_j^E\, \rho_j^E(t) - \frac{1}{N_I} \sum_{j=1}^{N_I} w_j^I\, \rho_j^I(t) \tag{2.63}
\]
With the assumption that the mean firing rates are equal for all pre-synaptic neurons, $\langle \rho_j^X \rangle = r$ for all $j \in \{1 \ldots N_X\}$, $X \in \{I, E\}$, and by allowing instantaneous correlations between the inputs whilst keeping the excitatory and inhibitory inputs mutually uncorrelated, the excitatory and inhibitory synapses followed STDP under Hebbian plasticity as defined in eq. (2.64) (which has the same form as eq. (2.50)), with $\mu \in [0, 1]$ and $\alpha > 0$:
\[
f_+(w) = (1 - w)^{\mu}, \qquad f_-(w) = \alpha\, w^{\mu}, \qquad \Delta w = \pm f_\pm(w)\, e^{-|\Delta t|/\tau_\pm} \tag{2.64}
\]
The findings of Luz and Shamir [4] support the view that temporally asymmetric Hebbian STDP of inhibitory synapses is responsible for balancing the transient feed-forward excitation and inhibition. Using these STDP rules, the stochastic weights of the inhibitory synapses were defined so as to generate negative feedback, and they stabilized into a uni-modal weight distribution. The approach was tested on two forms of network structure: a feed-forward inhibitory synaptic population, and a feed-forward population of both inhibitory and excitatory synapses. The former structure converged to a uniform solution for correlated input spikes, whereas the latter destabilized and the excitatory synaptic weights segregated according to the correlation structure of the input spike trains. Even though the proposed model, in the presence of inhibitory neurons, makes learning more sensitive to the correlation structure, the stability of the network still needs to be validated when correlations between the excitatory and the inhibitory synapses are present.
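A minimal sketch of the weight-dependent update of eq. (2.64) (the step-size parameter lam and all numerical values are assumptions of this example; Luz and Shamir's actual simulations are not reproduced here) shows how such soft bounds confine a weight to [0, 1] without hard clipping:

```python
import numpy as np

def soft_bound_stdp(w, dt, lam=0.005, alpha=1.1, mu=0.5, tau=20.0):
    """Weight-dependent (soft-bound) STDP update in the spirit of eq. (2.64).

    w   : synaptic weight, assumed to lie in [0, 1]
    dt  : t_post - t_pre in ms; dt > 0 means the pre-synaptic spike came first
    lam : overall step size (an assumption of this sketch, not part of eq. 2.64)
    Potentiation scales with (1 - w)**mu and depression with alpha * w**mu,
    so the weight is pushed away from its bounds instead of being clipped.
    """
    if dt > 0:
        return w + lam * (1.0 - w) ** mu * np.exp(-abs(dt) / tau)
    return w - lam * alpha * w ** mu * np.exp(-abs(dt) / tau)
```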

2.4 Unanswered Hypothesis in Neuroscience

The specifics of a biologically plausible model of plasticity that could control Hebbian neuronal excitation and account for the observed synaptic patterns have remained elusive, despite the rapid growth in the applicability of STDP to controlling the excitation of Hebbian neurons. Although these applications have effectively controlled the excitation or depression of Hebbian synapses, the use of boundary conditions and other biologically infeasible assumptions has made them inflexible for dynamic adjustment. Furthermore, these approaches mainly model learning behavior at the neuronal level, so the probability that associated plasticity mechanisms such as STP and HSP can contribute to this controlling process is minimal. These abstracted neural models have prevented computer scientists from evaluating the behavior of neurons in terms of their release probability and as an interaction of other plasticity mechanisms in computational environments.

Therefore, to obtain a biologically plausible model and remove the instability in Hebbian plasticity, many mechanisms have been discussed in recent findings. One remarkable suggestion is to combine STDP with multiple dynamic and stochastic synaptic connections, which enable neurons to contact each other simultaneously through multiple synaptic communication pathways that are highly sensitive to dynamic updates and that stochastically adjust their states according to the activity history. Furthermore, the strength of these individual connections between neurons is necessarily a function of the number of synaptic contacts, the probability of neurotransmitter release, and the post-synaptic depolarization [4]. These synapses are further capable of adjusting their own probability of neurotransmitter release $p_r$ according to the history of short-term activity [4, 43], which provides an elegant way of introducing activity-dependent modifications to synapses and of generating competition between synapses. Based on this hypothesis, many approaches have been proposed that model the behavior of synapses stochastically [5, 6]. Among these approaches, the study of Seung [6] is significant because, in his hedonistic proposal, he has tried to explain how neurobiological findings can be adopted by stochastic gradient learning. Other researchers [5, 68] have also developed synaptic models to study the behavior of synapses under plasticity mechanisms, but a model has not yet been found that can simulate the biological plasticity mechanisms effectively and explain how these plasticity mechanisms could control the excitation of Hebbian neurons when learning takes place.

2.4.1 Model of Hedonistic Synapse

In Seung's study [6], synapses were modeled as hedonistic; that is, they responded to a global reward signal by increasing their probability of release or of failure, depending on which action immediately preceded the reward. The main argument of this study is that if each synapse in the network behaves hedonistically, then the entire network as a whole learns to increase its average reward by generating appropriate corrective actions.

The study incorporated the facts that biological synapses are driven by pre-synaptic spikes and that the efficacy of synaptic transmission varies dynamically over time owing to short-term facilitation and depression. These hedonistic synapses followed two basic rules: $p_r$ is increased if reward follows a release, and decreased if reward follows a failure. The second rule was obtained by generalizing the first rule to negative reinforcement, called punishment, so that $p_r$ is decreased if punishment follows a release and increased if punishment follows a failure. To adopt these rules, a hedonistic synapse needs to maintain a record of its recent releases and failures.

A synapse was modeled with two states, available (A) and refractory (R). When the synapse was available, a pre-synaptic spike could release a vesicle (an A $\rightarrow$ R transition) with the probability defined in eq. (2.65):
\[
p = \frac{1}{1 + e^{-q - c}} \tag{2.65}
\]
where $q$ is the release parameter and $c$ models the calcium dynamics at the pre-synaptic terminal; $c$ jumps by $\Delta c$ for each pre-synaptic spike and otherwise decays exponentially, $dc/dt = -c/\tau_c$. After releasing a vesicle, the synapse enters the refractory state. The synapse recovers (R $\rightarrow$ A) with rate $1/\tau_r$; while a synapse is refractory it cannot release a vesicle. Furthermore, the synapse maintains a trace of its recent actions in terms of $e(t)$, a hypothetical quantity which jumps by $\Delta e$ at an available synapse in response to a pre-synaptic spike, as defined in eq. (2.67); there are no jumps of $e(t)$ while the synapse is refractory. During the intervals between pre-synaptic spikes it decays exponentially, as defined in eq. (2.66):
\[
\frac{de}{dt} = -\frac{e}{\tau_e} \tag{2.66}
\]
\[
\Delta e =
\begin{cases}
1 - p & \text{on release} \\
-p & \text{on failure}
\end{cases} \tag{2.67}
\]
The time constant $\tau_e$ sets the time scale over which the synapse remembers its past actions. The quantity $e(t)$ is called an eligibility trace, because it signifies whether the synapse is eligible for reinforcement by the reward signal $h(t)$. Plasticity is therefore determined by the product of the reward signal and the eligibility trace, as defined in eq. (2.68):
\[
\frac{dq}{dt} = \eta\, h(t)\, e(t) \tag{2.68}
\]
where $\eta > 0$ is a learning rate. By segmenting the learning process into distinct episodes that are statistically independent, convergence to the maximum of the expected reward follows from stochastic approximation theory, provided that the learning rate $\eta \rightarrow 0$ as $t \rightarrow \infty$.

The average reward for an ergodic Markov chain was defined as in eq. (2.69), either as a time average or as an average over the equilibrium distribution $\pi_\theta(X)$ for an input $X$. For a fixed $\theta$ it was shown that the time average is an estimate of the gradient of the average reward:
\[
H(\theta) = \lim_{T \to \infty} \frac{1}{T} \sum_{t=1}^{T} h(X_t) = \sum_X h(X)\, \pi_\theta(X) \tag{2.69}
\]
Although it was shown that these hedonistic synapses can be trained to perform a desired computation by proper administration of reward, biological synapses that behave in this hedonistic manner have not yet been identified in the brain. There is also a possibility that the network ends up increasing its average reward without the synapses implementing the correct collective mechanism. Moreover, real applications of these synapses have not yet been fully developed or thoroughly discussed in the field of ANN.

2.5 Problem in Brief

As presented in this chapter, in the early stages of ANN the aim of researchers was to develop learning algorithms or network models that generalize to any learning task. Today, however, the main intention of computer scientists is to develop neural networks that are efficient at specific learning tasks. For example, deep-belief networks have been developed especially for image processing tasks, and stochastic networks have mainly been trained to learn patterns in the form of time-varying perturbations. The most important features of the latter case are that these learning algorithms are a mixture of many conventional learning algorithms and that they come with their own network structures. Furthermore, the current trend in ANN is to integrate neurobiological findings into these structures and learning algorithms, because this is believed to be the concrete basis for the development of efficient learning algorithms in the future. The hypothesis discussed in section 2.4 is thought to be the main key to such successful learning algorithms and network structures.

Therefore, the aim of this research is to develop a stochastic and competitive Hebbian learning mechanism that supports controlling the instability of Hebbian neurons using the stated hypothesis. The solution presented in this thesis differs from other approaches for two main reasons. The first is its unique network structure, which simulates synaptic behavior at its ground level by using the concepts of transmitters and receptors. The second is the Hebbian learning mechanism, which integrates short-term plasticity and spike-timing dependent plasticity, in terms of long-term plasticity, into the weight-updating mechanism to stabilize the Hebbian neurons. Furthermore, the proposed learning mechanism demonstrates that when learning takes place it adheres to the plasticity mechanisms described in neurobiology, such as synaptic redistribution, the excited behavior of neurons at low firing frequencies, and homeostatic synaptic plasticity, in addition to Hebbian plasticity and Stent's anti-Hebbian plasticity.

2.6 Summary

Among the many conventional learning algorithms, the Hebbian learning algorithm is thought to be the most biologically feasible learning algorithm in ANN. Although many attempts have been made to solve the issue of instability in Hebbian neurons, such as weight normalization or threshold manipulation, they deteriorated the learning task while stability was being maintained. The current trend in ANN is to address this issue through STDP and related plasticity mechanisms by proposing neuronal models for specialized learning tasks. However, many of the proposed models are still at the research level and still apply hard weight constraints to the Hebbian neurons. The postulate of adapting STP and STDP into learning algorithms on a biologically supported network structure is the main hypothesis used in this thesis to remove the inefficiency of Hebbian neurons. The next chapter discusses the related biological background that supported the development of our stochastic and competitive Hebbian learning mechanism.

53 3 Background Theory The previous chapter highlighted the issues arisen when attempts were made to control the instability of Hebbian neurons. It further emphasized the importance of developing a Hebbian learning mechanism composed of both a learning algorithm and a cooperative plasticity mechanism to eliminate this instability of Hebbian neurons. Here we discuss the necessary neurobiological and psychological theories that provide a basis for the development of our novel learning mechanism to solve the identified problem. This discussion further comprehensively touches the necessary biological theories in order to give a better understanding of the biological phenomena related to the problem. 3. Neurons Neurons are the basic signaling units in the nervous system. Neurons take information, make a decision about it, then change their activity accordingly, and pass it along to other neurons. Generally a neuron consists of a cell body called soma, dendrites, an axon, and is covered by a cell membrane which consists of lipid and protein molecules. This lipid allows a little passing of water but not to ions. Between the opened and closed states of ion channels molecular transition occurs. The dendrites and axon extend away from the cell body. Dendrites take inputs from other neurons at locations called synapses. This information is processed inside the cell body and passed along the axon terminal to other neurons. Neurons hand over information to other neurons at synapses which are located at the end of axon-terminals 69, cf. fig. 3.. Action potentials take the signals as a means of voltage and amplitude change from one neuron to another along the axon. The distinct feature in action potentials is that their amplitude and duration do not depend on the amplitude and the duration of the stimulus that they were initiated. One action potential should be entirely completed to start another action potential, and after each action potential there is a refractory period, a time frame, in which another action potential cannot be initiated 7. At the end of an axon, axon terminals can be found, which is the location where synapses are situated. The synapses are the exact locations at which information is really handed over to another neuron. We can find two types of synapses in the nervous system. Those are chemical synapses and electrical synapses 7. In electrical synapses, information transmission is mediated by a flow of current while in chemical synapses information is transmitted by means of a neurotransmitter release, cf. fig. 3.. Chemical synapses are the most common in the nervous system. Therefore, in this research we have concentrated only on the process of chemical synapses 7. At chemical synapses, neurotransmitter release is evoked by the entry of Ca + into axon terminals and neurotransmitter release to occur Ca + must present at the time of pre-synaptic terminal depolarization and A depolarization is a change in neuron membrane that makes membrane more positive. 4

54 Figure 3.: General structure of a neuron Generally, a neuron consists of cell body, dendrites, and axon. At the end of axon, axon terminals can be found where synapse structure is located Figure 3.: Structure at a chemical synapse At chemical synapse, information is transmitted by releasing neurotransmitters from pre-synaptic neuron to postsynaptic neuron a large depolarization of pre-synaptic neurons result in generating action potentials 7. Therefore, the neurotransmitter release process at chemical synapses can be summarized as, first, pre-synaptic depolarization, second, Ca + entry to the nerve terminal and third, the release of neurotransmitters from pre-synaptic terminals, cf. fig Conversely neurotransmitter release can be reduced by removing Ca + from the terminals or by inserting blocking ions such as magnesium, cadmium, nickel, and manganese into terminals. This depolarization of pre-synaptic terminals releases the neurotransmitters and it is followed by the post-synaptic depolarization or hyper-polarization based on the characteristics of post-synaptic receptors 7. Therefore, a given neuron can receive inputs from excitatory, inhibitory and modularity synapses Neuronal Functionality Our neurons are subjected to two complementary processes: the need to change and the need to stabilize 49. Neural adaption to external environment brings changes to its morphological, electrical, and synaptic prop- 4

55 Figure 3.3: Process of neurotransmitter release at a chemical synapse (a) Arrival of action potentials to pre-synaptic axon terminals, (b) Entry of Ca + to pre-synaptic axon terminals, and (c) neurotransmitter release from vesicles at the chemical synapse erties. This is achieved by losing and gaining the functional synapses and by constant opening and closing of ion channels that change the electrical and morphological properties of neurons 5. Therefore, a combined action of ion-channels, cell-membrane, and synaptic inputs defines the activity of a neuron 6. These changes that occur in the form of activity dependent plasticity allow the neurons to pass information to other neurons and get refined by the experiences 49. While adapting to environmental changes, neurons also need to maintain its stability without getting into saturation. To avoid the saturation, neurons are supposed to have a way of normalizing its synaptic gain while maintaining the relative strengths in all synapses 7. Such a normalizing mechanism is collectively called homeostatic (synaptic) plasticity 5, 49, see section 3..3 for more details. Even though homeostatic plasticity should guarantee the stability of neural activity, it should not be a barrier to the activity dependent modifications of neurons Chemical Synapses Individual connection between two neurons comprises of multiple synaptic connections. The number of synaptic connections in a given neuronal connection depends on the type of neuronal connection. Further, the number of synaptic contacts in a given neuronal connection defines the neuronal function and its regulation. Many neuronal connections are composed of multiple unreliable synaptic contact points, and the probability of successful neurotransmitter release, i.e p r, is variable and adjustable at each single synapse level of Plasticity is activity dependent changes in the probability of respond to the action potentials by a post-synaptic receptors. 4

56 pre-synaptic neuron. The probability of neurotransmitter release at each synapse is also known as release probability. And single axon terminals contributing to a connection can have release probabilities that are diverse and that can change over time. i.e. synapses are functionally heterogeneous. This heterogeneity of p r at a single synapse depends on three variables: (i) The number of release ready vesicles at pre-synaptic axon terminal. (ii) The Ca + concentrations in the pre-synaptic terminals. (iii) The molecular coupling between Ca + and vesicle fusion. The release probability does not only define the reliability of synaptic transmission but also does change with the short-term activity history of the synapse, thus shaping the way in which a connection dynamically adapts to inputs. The short-term activity dependent p r < means that synaptic strength is dynamic and the synapses can act as filters to the input pattern of the pre-synaptic action potentials. For example, prolonged stimulation or depolarization of the post-synaptic target can suppress neurotransmitter release at pre-synaptic neuron 4, 43. Homeostatic plasticity can also change p r in hippocampal cells for example increasing dendritic depolarization homeostatically decrease p r. Further, equalizing the activity in dendritic significantly reduces the variability in p r. Moreover, p r homeostatically adapts to the synaptic density of each dendritic branch, i.e. the more synapses one axon makes on a dendritic branch, the lower p r of each synapse (i.e. there is an inverse relationship between p r and the number of contacts in the connection). The spatial organization of the synapses over the neuron is also important when making these contacts because some synapses occur across the whole dendritic tree while others target a specific region of the post-synaptic cell. The synapses that mediate these connections can functionally be very different even if they belong to the same axon and different post-synaptic target. The post-synaptic influence on release probability may affect during synaptognesis, (see section 3.4.3), or through retrograde regulation, (see section 3.), after synapse formation. Release probability is modulated by long-term plasticity and the type of the long-term plasticity depends on the post-synaptic target. For example, for high-frequency stimulation of mossy fibers to pyramidal cells cause the long-term potentiation and mossy fibers to inter-neuron causes the long-term depression. Although adjustments can be induced by pre-synaptic terminal itself, most rely on a feedback loop from the dendritic target, suggesting that p r is highly controlled by the identity and activity of the post-synaptic cell, 4, 43. Moreover, many factors affect how the post-synaptic synapse responds to these pre-synaptic action potentials :. Receptor desensitization at the post-synaptic side: long duration of exposure to the neurotransmitters inactivate the receptors and decreases the probability of the receptor responding to the pre-synaptic neurotransmitters.. The post-synaptic receptor which has been activated by the signal also affects the post-synaptic response. 43

3.1.3 Functionality at Synapses

Generally, when a brief train of stimuli is applied to a pre-synaptic neuron, during the stimuli it either increases (synaptic facilitation) or decreases (synaptic depression) the amplitude of the post-synaptic potential. This amplitude change can last even after the stimuli have ended. According to the duration of their existence, these changes have been classified into different phenomena. Facilitation is the change of amplitude that increases the post-synaptic potential; it appears within a couple of milliseconds and lasts during the stimuli. However, it is possible to see an increase in post-synaptic efficacy even after the stimuli are over; this is called augmentation and lasts only for hundreds of milliseconds. Although repetitive occurrences of trains of stimuli result in post-synaptic depression, after hundreds of milliseconds recovery occurs to depolarize the post-synaptic potential. Synaptic depression can also be observed when a tetanus is applied to the pre-synaptic nerve terminals. (A tetanus is a relatively long, high-frequency train of stimuli.) However, such repetitive activation of post-synaptic synapses can produce more persistent changes that last from hours to days; they are called long-term potentiation if the changes increase the post-synaptic potentiation, and long-term depression (LTD) if they decrease it 7. The relative time duration of these changes is shown in fig. 3.4. Figures 3.5 and 3.6 show the effects of synaptic facilitation and synaptic depression in the visual cortex.

Figure 3.4: Relative time duration of each synaptic plasticity phenomenon

3.2 Synaptic Plasticity

These activity dependent changes in synaptic transmission, collectively known as synaptic plasticity, have been classified under three categories:

1. Long-term plasticity:- involves changes that last from hours to days, and is believed to be the main substrate for learning and memory formation 7.

Figure 3.5: Synaptic facilitation. Figure 3.6: Synaptic depression. Synaptic facilitation occurring in the visual cortex brightens vision at the stimulated site, while depression blurs the vision.

2. Short-term plasticity:- occurs over milliseconds to seconds and allows neurons to perform critical computational functions.

3. Homeostatic plasticity:- allows neurons and synapses to maintain their excitability at a certain level despite the changes brought about by experience dependent synaptic plasticity.

Thus, the short-term activity dependence of $p_r$ means that synaptic strength is dynamic and can act as a filter on the input pattern of pre-synaptic action potentials, while the long-term activity dependence of $p_r$ provides an effective means of changing the strength and dynamics of a connection. The same $p_r$ modulated by long-term plasticity can be regulated by homeostatic plasticity, which increases or decreases the size of the vesicle pools and recycles the vesicles 4. The information flow through these synapses can be classified into two categories based on its direction 7:

1. Feed forward:- the typical process of information flow, in which pre-synaptic neurons send signals to post-synaptic targets. Here the information flow is affected only by pre-synaptic activity.

2. Bidirectional:- the pre-synaptic neuronal activity depends on the feedback given by the post-synaptic neuron through the release of retrograde messengers.

Based on the information flow at the synapses, two types of short-term synaptic plasticity mechanisms have been identified:

1. Feed forward plasticity:- facilitation and depression come under this category; low $p_r$ values favor facilitation while high $p_r$ values favor depression. This plasticity lasts from milliseconds to seconds.

2. Feedback plasticity:- the retrograde messengers are released by post-synaptic receptors to regulate the neurotransmitter release in the pre-synaptic terminals.

The dynamic behavior of synapses has been characterized as that of filters with a wide range of properties:

1. Low-pass filters:- synapses with a high initial $p_r$ function as low-pass filters.

2. Band-pass filters:- synapses with an intermediate initial $p_r$ function as band-pass filters.

3. High-pass filters:- synapses with a low initial $p_r$ work as high-pass filters.

Finally, pre-synaptic inhibition turns low-pass filters into band-pass filters and band-pass filters into high-pass filters. Therefore, the roles of the synapses control the firing frequency of neurons, while neuron firing sets the states of the synapses. The overall gain of this complex process is the ability to learn. Learning is the process of acquiring new information, and memory is the outcome of the learning process. Memory is divided into two categories, short-term memory and long-term memory. Short-term memory retains data over a couple of minutes, whereas long-term memory retains data over years. Repetitive recall of an item in short-term memory can move the item to long-term memory.

3.2.1 Short-term Plasticity

Many forms of short-term plasticity mechanisms have been identified in the hippocampus. The main disparity between these mechanisms is the time course over which they induce changes to synaptic connections. This time course varies from a couple of milliseconds to seconds or hours. Here we focus only on STP mechanisms that induce changes within one second or less. Such a short-term increase of synaptic strength within hundreds of milliseconds is called facilitation, and the corresponding short-term depression is called depletion.

Facilitation

Facilitation often builds and decays with a time course that can be approximated with an exponential decay, 3. Facilitation is partially caused by Ca²⁺ remaining in the axon terminal after the conditioning stimulus. This remaining Ca²⁺ is called residual Ca²⁺; elevating this residual Ca²⁺ enhances the synaptic strength, and preventing the increase of residual Ca²⁺ eliminates the short-term enhancement. Furthermore, in addition to residual Ca²⁺, the extracellular Ca²⁺ that flows into the neuron after the arrival of an action potential is also a main factor that determines the magnitude of the short-term synaptic enhancement. An increase of the external Ca²⁺ that flows into the synapse increases the short-term enhancement, and a decrease of external Ca²⁺ decreases the amount of the enhancement. Moreover, this Ca²⁺ entry and the magnitude of residual Ca²⁺ are stochastic random variables, and facilitation depends on these random variables given that the resting Ca²⁺ at the axon terminal is negligible. Additionally, the short-term synaptic enhancement produced by facilitation reflects an increase in $p_r$, an increase in the available quanta in the readily releasable pool, or an increase in the number of release sites capable of releasing a quantum. The amount of Ca²⁺ that flows into a synapse due to the arrival of an action potential is on the order of micromolar concentrations.

Pre-synaptic Depression as a Depletion

The most widely accepted account of short-term depression is depletion: a decrease in the release of neurotransmitters that reflects the depletion of the readily releasable pool of vesicles available in the pre-synaptic neuron. According to the depletion model, in its very simplest form the process can be explained as depicted in fig. 3.7. A synaptic connection contains a store of R readily releasable vesicles. An action potential releases a fraction F of this store, so the store is transiently depleted by FR vesicles. The immediately following stimulation therefore has only R − FR readily releasable vesicles available to release. If we assume that the released fraction of the readily releasable pool remains unchanged for the second action potential, then the second action potential will transiently deplete (R − FR)F readily releasable vesicles. It is usually assumed, and observed in vivo, that there is a mono-exponential recovery of the readily releasable vesicles. Repeated stimulation results in depletion of the readily releasable pool, which induces synaptic depression at the synapses. In this research we neglect the effect of Ca²⁺ on depletion and assume that it depends merely on the activity history of the synapse. Although the estimate of the functional size of the readily releasable pool depends on the brain region and animal, on average it is on the order of ten vesicles for hippocampal synapses, 3, and it refills with a time constant of about 800 ms under short-term depression.

Figure 3.7: The depletion model of synaptic depression (a depot store is mobilized into the releasable store; an action potential releases quanta, and the store is depleted by the quanta released)
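To make the depletion bookkeeping above concrete, the following short Python sketch simulates the simplest form of the model under stated assumptions: a pool of R release-ready vesicles, a fixed release fraction F per action potential, and mono-exponential recovery toward the full pool with a recovery time constant tau_rec_ms (the ~800 ms figure quoted above is used only as an illustrative default). The parameter values and spike times are illustrative, not taken from the thesis.

```python
import math

def simulate_depletion(spike_times_ms, R=10.0, F=0.5, tau_rec_ms=800.0):
    """Simplest depletion model of short-term synaptic depression.

    spike_times_ms : sorted pre-synaptic spike times (ms)
    R              : size of the readily releasable pool (vesicles)
    F              : fraction of the available pool released per spike
    tau_rec_ms     : mono-exponential recovery time constant (ms)
    Returns the number of vesicles released at each spike.
    """
    available = R          # vesicles currently ready for release
    last_t = None
    released_per_spike = []
    for t in spike_times_ms:
        if last_t is not None:
            # mono-exponential recovery of the pool between spikes
            dt = t - last_t
            available = R - (R - available) * math.exp(-dt / tau_rec_ms)
        released = F * available          # each spike releases a fraction F of the pool
        available -= released             # transient depletion of the pool
        released_per_spike.append(released)
        last_t = t
    return released_per_spike

# A 20 Hz train shows progressively weaker responses (synaptic depression).
print(simulate_depletion([0, 50, 100, 150, 200]))
```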

Post-synaptic Depression as Receptor Desensitization

Receptor desensitization is a significant mechanism in the post-synaptic neuron that induces a use-dependent decrease in synaptic strength. In general terms, desensitization occurs when a receptor decreases its response to a signaling molecule under prolonged exposure to that signal. That is, long exposure to neurotransmitters inactivates the receptor and decreases the probability of the receptor responding to the neurotransmitters. The receptor thereby becomes uncoupled from its signaling cascade, and thus the biological effect of receptor activation is decreased. Desensitization has been monitored in hippocampal neurons, where the amount of receptor desensitization is regulated by the post-synaptic potential.

3.2.2 Long-term Plasticity through STDP

One of the most significant challenges in neuroscience is to identify the cellular processes that underlie learning and memory formation. Learning can be described as a mechanism by which new information is acquired about the world, and memory as a mechanism by which that knowledge is retained. The past decade has seen remarkable progress in understanding the changes which accompany plasticity in synaptic connections, i.e. long-term plasticity. With the findings of Bi and Poo 8, the long-standing question of how neurons adapt at the molecular level to long-term plasticity was partially but well answered. Their observations introduced a new information encoding mechanism to neuroscience and further explored how long-term plasticity is encoded on a synaptic connection in terms of synaptic strength. This information encoding mechanism is now called spike-timing dependent plasticity (STDP) and is being vigorously used in analyzing neuronal behavior when learning takes place. Their observations supported the central fact that information is encoded on the synapses according to the times at which spikes occur in the connected neurons. That is, when a spike occurs in the pre-synaptic neuron before a spike occurs at the corresponding post-synaptic neuron, the synaptic strength between those two neurons increases, inducing long-lasting changes to the synapses called long-term potentiation. When this phenomenon is reversed, that is, when a spike at the post-synaptic neuron is followed by a spike at the pre-synaptic neuron, the synaptic strength between the two neurons is decreased, inducing long-lasting changes to the synapses called long-term depression. Furthermore, potentiation is induced when a post-synaptic spike occurs within a time window of about 20 ms after the onset of a pre-synaptic spike. Similarly, depression is induced when a pre-synaptic spike peaks within a time window of about 20 ms after the onset of a post-synaptic spike. The ability to induce potentiation or depression decreases as the absolute value of the time difference increases, and outside about 40 ms no modification to synaptic strength occurs. Moreover, the changes in synaptic strength produced by long-term plasticity mechanisms such as long-term potentiation and long-term depression depend on the initial synaptic strength, so that strong synapses are less potentiated than weak synapses, whereas long-term depression shows no correlation with the initial synaptic strength. Although they highlighted the necessity of monitoring the Ca²⁺ ions that flow into the synaptic neurons as a result of the induction of synaptic changes, how short-term plasticity, as a process governed by the flow of Ca²⁺ ions, affects the process of long-term modification was not clearly observed and explained. Although it has also been confirmed that STDP by itself is not sufficient to bring long-term modifications to synapses 37, STDP is now treated as the main underlying mechanism for inducing long-term changes to synapses.

This main observation, that STDP induces long-term plasticity, has been mathematically defined in eq. (3.1):

$w(t_{pre} - t_{post}) = \begin{cases} A_{+} \exp\!\big(-(t_{post} - t_{pre})/\tau_{+}\big) & \text{if } t_{pre} < t_{post} \\ -A_{-} \exp\!\big(-(t_{pre} - t_{post})/\tau_{-}\big) & \text{if } t_{pre} \ge t_{post} \end{cases}$ (3.1)

where a spike occurring at the pre-synaptic terminal at $t_{pre}$ reaches the post-synaptic neuron at $t_{post}$. STDP is described by the weight window function $w(t_{pre} - t_{post})$, which determines how the strength of a synapse is modified by a pair of action potentials: a pre-synaptic spike occurring at $t_{pre}$ and a post-synaptic spike occurring at $t_{post}$. The parameters $\tau_{+}$ and $\tau_{-}$ determine the temporal ranges of the two sides of the window function, while $A_{+} > 0$ and $A_{-} > 0$ determine the size of the changes induced by the corresponding spike pairings. The first part of the equation defines long-term potentiation, when a pre-synaptic spike precedes a post-synaptic spike, whilst the second part defines long-term depression, where a post-synaptic spike is followed by a pre-synaptic spike. According to Bi and Poo 8, $\tau_{+}$ and $\tau_{-}$ are almost equal to each other, at about 20 ms 34, and it has further been shown that, in order to generate stable competitive synaptic modifications, the integral of the weight window in eq. (3.1) should be negative, which confirms that uncorrelated spikes produce an overall weakening of synaptic strength. For the integral of the weight window to be negative, $A_{-}\tau_{-} > A_{+}\tau_{+}$ is required.
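As a quick illustration of eq. (3.1), the sketch below evaluates the weight window for a single spike pair. The parameter values are illustrative only (they follow the commonly used setting $\tau_+ = \tau_- = 20$ ms with $A_-\tau_-$ slightly larger than $A_+\tau_+$ so that the integral of the window is negative); they are not the values fitted in this thesis.

```python
import math

def stdp_window(t_pre, t_post, a_plus=0.005, a_minus=0.00525,
                tau_plus=20.0, tau_minus=20.0):
    """Weight change w(t_pre - t_post) of eq. (3.1) for one spike pair (times in ms)."""
    if t_pre < t_post:
        # pre before post: long-term potentiation
        return a_plus * math.exp(-(t_post - t_pre) / tau_plus)
    # post before (or at) pre: long-term depression
    return -a_minus * math.exp(-(t_pre - t_post) / tau_minus)

# Pre 10 ms before post -> potentiation; post 10 ms before pre -> depression.
print(stdp_window(0.0, 10.0), stdp_window(10.0, 0.0))
```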

3.2.3 Homeostatic Synaptic Plasticity

Just as we need to maintain our body temperature and cholesterol level at certain levels, neurons also need to maintain their activity at a certain level to prevent damage. This activity level is so narrow that it has become a set point, which lets a neuron remain at a constant activity level while still being able to change and evolve. Although the properties of this set point are not clear, it has been evaluated under three possibilities 49: first, all neurons might be required to settle at the same point; if so, all diversity would be abolished, which should not be the case. Second, it may be a range of activity levels that defines a safety barrier for neurons. The third possibility is more specific: each neuron has its own set point (a very narrow range), allowing diversity of activity among neurons. The process that helps neurons to maintain their activity level within a narrow range is known as homeostatic plasticity, which dynamically updates neuronal synaptic strengths in the correct direction to stabilize the neural activity by creating synaptic competition, cf. fig. 3.8. Without homeostatic plasticity, activity dependent plasticity such as long-term potentiation, long-term depression, short-term potentiation and short-term depression could take a neuron to over-excited or over-depressed/quiescent states 4, cf. fig. 3.9. Homeostatic plasticity creates competition between neurons by bringing overall changes to the synaptic strengths of a given neuron: by changing the post-synaptic receptor clustering, by modulating the pre-synaptic neurotransmitter release, and by regulating the number of functional synapses 4, 6.

Figure 3.8: The role of homeostatic plasticity

Figure 3.9: Firing rate variation of a given neuron over time (source: Burrone, J. and Murthy, V.N., 2003. Synaptic gain control and homeostasis.)

For example, for excitatory neurons, synaptic strength is scaled down to regulate the neuronal activity, and for inhibitory neurons, synaptic strength is scaled up to regulate the neuronal activity. To scale down, it increases the number of functional synapses of the post-synaptic neuron and allows the pre-synaptic neuron and post-synaptic neuron to contact each other and form trial synapses. Conversely, to scale up, homeostatic plasticity decreases the number of functional synapses of the post-synaptic neuron, strengthens and stabilizes the synaptic connection, and at the same time eliminates inappropriate connections 5, 7. In terms of electrical property manipulations, when the firing rate of a neuron is low, homeostatic plasticity reduces the intracellular Ca²⁺ ionic concentration in order to increase the neuron's firing rate 5. To accomplish these tasks, homeostatic plasticity occurs at the level of the individual neuron as a response to the post-synaptic activity. The speed of the homeostatic plasticity process is also important: if it occurred rapidly, neurons would not get enough time to pass information to others; if it occurred more slowly, neurons would reach the excitation or quiescence state 5. The effect of homeostatic plasticity in a cortical network of the human brain can be depicted as shown in fig. 3.10. As shown in fig. 3.10, if the activity of neuron A is raised, homeostatic plasticity regularly raises the activity of C and lowers the activity of B so that the overall activity of neuron A is driven toward stability. Conversely, if the activity of A is reduced, homeostatic plasticity regularly increases the activity of B and lowers the activity of C in order to increase the overall activity of A 5. A similar phenomenon has also been observed at the neuromuscular junction: when post-synaptic receptor activity is experimentally decreased, the pre-synaptic neurotransmitter release automatically increases 6.

Figure 3.10: Homeostatic plasticity in cortical networks. A and B are two pyramidal neurons that make excitatory outputs to other pyramidal neurons and also onto the inhibitory neuron C. The inhibitory neuron feeds inhibition back onto the pyramidal neurons.

3.3 Synaptic Redistribution

As per recent biological findings, a synaptic connection between two neurons is strengthened by increasing the number of post-synaptic receptor channels or by increasing the probability of neurotransmitter release at the pre-synaptic neuron. This functional behavior at synapses is altered if long-term plasticity interacts with short-term depression 3. Short-term depression is an activity-dependent reduction of neurotransmitters in the readily releasable pool of pre-synaptic neurons 3. When long-term plasticity interacts with short-term depression, the effect is called synaptic redistribution. This synaptic redistribution increases the probability of neurotransmitter release pre-synaptically, and thereby increases the efficiency of signal transmission between neurons while decreasing the pre-synaptic readily releasable pool size. Therefore, the high-frequency-dependent increase in synaptic response for the first few spikes in a spike train is caused by the redistribution of the available synaptic efficacy, not by an increase of synaptic efficacy at the steady state. Thus, high-frequency-dependent redistribution of synaptic efficacy could represent a mechanism to change the content, rather than the gain, of signals conveyed between neurons 7, and it would be produced by mechanisms such as post-synaptic changes or by adding or removing receptors to or from the connection. Redistribution of synaptic efficacy potentially occurs at any synapse that is activated at a faster rate than required for complete recovery of the synaptic efficacy. However, at low firing frequencies of the pre-synaptic neuron, an increase of synaptic efficacy at steady state has been observed. This increase of synaptic efficacy depends on short-term plasticity factors such as the probability of neurotransmitter release and the time constant of recovery 7. Conversely, Okatan and Grossberg 73 suggest that the pairing of Hebbian neurons needed to reach the steady state might be frequency-dependent, and as a result, the time taken to reach the

steady state at high pre-synaptic firing frequencies may be longer than at lower pre-synaptic firing frequencies; therefore, even though there is a steady-state increase of the synaptic efficacy at high pre-synaptic firing frequencies, it is not well observed due to the effect of short-term synaptic plasticity factors on the synaptic efficacy. Lisman and Spruston 37 have further added that this increase or decrease of synaptic efficacy at steady state is not based merely on spike arrival times at the synapses but also on the level of post-synaptic depolarization, the rate of synaptic inputs and the phase of the synaptic input relative to the ongoing frequency oscillations. Nevertheless, the role of synaptic redistribution has not yet been fully understood in biology, especially when it interacts with various plasticity mechanisms.

3.4 Physiological Exploration of Hebbian Neurons

The development of many correlation based learning algorithms in neural networks is basically guided by Hebb's postulate, which describes how the activities of a pre-synaptic neuron and a post-synaptic neuron affect each other when their association accounts for learning 8. However, unconstrained strengthening of synaptic connectivity and the lack of reference to synaptic depression have diminished its value as a learning algorithm in neural networks. Although the lack of reference to synaptic depression in Hebb's postulate has been addressed by neural network scientists by introducing a correlation term into the relationships between neurons, the proposed computational mechanisms to induce depression could not handle it without explicit boundary definitions. Nevertheless, the same synaptic depression has also been discussed in physiology as a complementary statement to Hebb's postulate. Stent's anti-Hebbian postulate 74 and Lisman's anti-Hebbian postulate 75 are two such significant postulates that emphasize how synaptic depression can be introduced to Hebb's neurons.

3.4.1 Stent anti-Hebbian Postulate

Stent, in his postulate, provides a physiological mechanism that explains how the inactivation of one cell affects an activated cell and subsequently reduces the corresponding synaptic strength. Stent's anti-Hebbian postulate can be quoted as follows:

When the pre-synaptic axon of cell A repeatedly and persistently fails to excite the postsynaptic cell B while cell B is firing under the influence of other pre-synaptic axons, metabolic changes take place in one or both cells such that A's efficiency, as one of the cells firing B, is decreased.

As per Stent, neuron A's activity is decreased when it fails to excite the post-synaptic cell B. This occurs when cell A fails to synchronize its activity with the other pre-synaptic neurons of the post-synaptic cell B.

Stent's explanation of A's synchrony can be quoted as follows:

The activity of the synapse of cell A upon cell B is manifestly asynchronous with the activity of synapses of other cells converging on cell B if most of the impulses that arise in cell B occur while the synapse of cell A is inactive.

He further added that this asynchrony between the pre-synaptic neuron and the post-synaptic neuron is detected post-synaptically.

3.4.2 Lisman anti-Hebbian Postulate

Lisman called Stent's synaptic depression a post-not-pre anti-Hebbian process, because it reduces synaptic strength when the pre-synaptic input is inactive but the post-synaptic cell is active due to other pre-synaptic inputs. He discussed another possibility of synaptic depression in Hebb's postulate, called the pre-not-post anti-Hebbian process. Lisman's synaptic depression occurs 75:

When the pre-synaptic input is active but the post-synaptic cell is not active because of inadequate excitation by other inputs or too much inhibition by other neurons.

For example, consider a cell that takes several groups of inputs. Assuming that groups of inputs need to be active together to fire the cell, if only the post-not-pre anti-Hebbian rule is operative, the active firing of one group makes all the other groups weaken. So in Stent's anti-Hebbian postulate, there is competition between active groups of inputs, and a cell represents the most active group. On the other hand, if only the pre-not-post anti-Hebbian rule is operative, then there is no competition between groups of inputs and the cell fires whenever any group of inputs is active.

3.4.3 Levy and Desmond's Rule of Synaptic Plasticity

Levy and Desmond proposed elemental rules of synaptic plasticity which are coherent and biologically feasible. Their four rules can be quoted as follows 76:

1. Convergent co-activity increases synaptic efficacy at active synapses, i.e. Hebb's rule.

2. Pre-synaptic inactivity during post-synaptic activity decreases synaptic efficacy at the inactive synapse, i.e. Stent's rule.

3. The receptivity of a post-synaptic target for new innervation varies as an inverse function of its activity, i.e. the post-synaptic growth rule.

4. An afferent's appetite for axonal growth and competitiveness for claiming an available post-synaptic site is dynamically regulated, increasing with heightened levels of activity

and decreasing with lowered levels of activity, i.e. the axonal growth rule of the pre-synaptic neuron.

The third of these elemental synaptic rules provides the basis for controlling the dynamics of a cell by limiting the maximum number of synapses that a cell can receive as a function of the post-synaptic neuron's activity. Thus the maximum number of synapses permitted increases as the post-synaptic excitation decreases. This rule was later supported by the findings of Turrigiano et al. 77. Their findings supported the fact that raising firing rates decreased the strength of miniature excitatory post-synaptic currents (mEPSCs, the post-synaptic signals produced in response to the spontaneous release of a single vesicle), while decreasing firing rates increased the strength of mEPSCs. Moreover, the increased or decreased firing produces regulatory responses that return firing rates to the control level. This bidirectional regulation of mEPSC amplitude is likely to contribute to the homeostatic regulation of firing rates. They further added that activity regulates this excitatory post-synaptic current by changing the receptor number or function in a multiplicative manner. This could mainly occur through the insertion or elimination of receptors, or through the conversion of existing receptors between active and inactive states. However, this insertion or conversion should occur in proportion to the existing number of functional receptors. This receptor insertion and elimination process can generate synaptic competition, and this phenomenon could be viewed as an outcome of the homeostatic regulation of post-synaptic activity 4.

On the other hand, the fourth of these four elemental rules describes the biological phenomenon of synaptogenesis 5. Synaptogenesis is an activity dependent stabilization of synaptic connections in which pre-synaptic neurons extend their axons over long distances and form contacts with their post-synaptic targets. When these connections are made, refinement of the connections takes place, so that appropriate connections are held, stabilized and strengthened whilst inappropriate connections are lost. This growth and withdrawal of post-synaptic targeted connections is regulated by the activity of the pre-synaptic neuron, so that strengthening of the pre-synaptic neuron halts growth and stabilizes pre- and post-synaptic structures, while weakening of the pre-synaptic neuron allows pre- and post-synaptic elements to continue to grow and contact other, more desirable patterns.

3.5 Summary

This chapter comprehensively discussed the structure of neurons and the role of synapses in the signal transmission process. In particular, the role played by a synapse is crucial in signal processing, and it indirectly depends on the neuron firing rate. According to the duration of the changes that they bring to neurons, the forms of neuronal plasticity are categorized under three phenomena: short-term plasticity, long-term plasticity and

homeostatic plasticity. Short-term and long-term plasticity mechanisms are activity dependent modifications that change the synaptic efficacy over the short term and the long term respectively. Conversely, homeostatic plasticity is a collective mechanism which stabilizes the neural activity when a neuron falls below or rises above the feasible firing range, allowing neurons to stabilize whilst adapting to changes. Further, physiological contributions that introduce synaptic depression to Hebbian neurons, such as Stent's and Lisman's anti-Hebbian postulates, were discussed, and the behavioral response of neurons when such a depression occurs in real neurons was explored under Levy and Desmond's elemental rules. The next chapter presents our approach to developing a stochastic and competitive learning mechanism through STDP to control the instability of Hebbian neurons.

4 Mathematical Formulation of the Proposed Mechanism

In the previous chapter we thoroughly evaluated the behavior of both neurons and synapses as the primary elements of signal processing in animal brains. It was identified that neurons and synapses depend on each other's activity and work together to accomplish the adaptation and stability of neurons. The mechanism that helps to maintain stability whilst adapting to external stimuli is called synaptic plasticity. This chapter mathematically formulates these plasticity mechanisms and introduces a novel cooperative neuronal structure and a network model. Thus, the main objective of this chapter is to present the theoretical formulation of a stochastic and competitive Hebbian learning mechanism through STDP to control the instability of Hebbian neurons, providing reasonable justifications for selecting such approaches for neuronal modeling.

4.1 Stochastic and Competitive Hebbian Learning Mechanism through STDP

In the process of learning, the role of associative plasticity mechanisms together with the cooperative metabolic structure of neurons is not yet fully discovered. However, it is obvious that the underlying process of learning is not the outcome of a simple process happening inside neurons but the output of a collective mechanism of small plasticity processes and a variety of ionic interactions. Intriguingly, the behavior of learning is not a static and deterministic process but a dynamic and stochastic one. Therefore one could consider a learning task as a complex behavior generated from the interactions of many simple parts and processes. The knowledge emerging through the interactions of these small elements can be regarded as weighted memory formation on the synaptic connections. Thus, throughout this chapter you will see our attempt to model a Hebbian learning algorithm as a complex behavior resulting from the interactions of plasticity mechanisms at the synaptic level. Further, you will notice our theoretical attempts to stabilize the network activity without defining hard constraints on the learning process, and how the emergent behavior has been generalized to weighted memory formation on synaptic connections. Finally, in section 5 you will find the answer to the question of how this memory responds to the sensitivity of the correlation of Poisson inputs whilst being stabilized within a feasible range. If learning is treated as a complex behavior generated from the interactions of small elements, the theory of complex system modeling guides how such mechanisms can be analyzed and evaluated 78. According to the theory, it is first required to understand how the simple parts work separately; second, how these small parts interact with each other; and third, how all these interactions combine to accomplish the learning task. Our approach is an attempt to align with this theory of complexity. Here we have concentrated only on how the behavior of the computational elements which act as receptors and

transmitters is adjusted and modified by the underlying plasticity mechanisms. Under appropriately constrained interactions the system has been allowed to evolve. Finally, as shown and discussed in section 5, the evolved and matured network has learned to cluster the inputs according to either their mean firing rate or the correlation structure of the applied inputs while being stabilized in a feasible range.

4.2 Neuronal Model

A fully connected network with m neurons is constructed. A neuron is modeled as a central unit which consists of thousands of computational units that switch from active to inactive or vice versa according to the excitation or inhibition of the attached neuron. The computational units attached to a neuron are classified into two groups based on the role they play for the neuron: a computational unit that transmits a signal from the attached neuron to other neurons is called a transmitter, and a computational unit that receives signals for the attached neuron from other neurons is called a receptor. Furthermore, the receptors attached to a neuron are clustered into groups so that transmitters from the pre-synaptic neuron can contact the post-synaptic neuron simultaneously through multiple synaptic connections. Figure 4.1 shows the structure of our model neuron A with n receptor groups and a transmitter-set. Moreover, the transmitters in our pre-synaptic neurons are similar to the synaptic vesicles in real neurons, each with a single neurotransmitter; see section 3.1.2. The states of these computational units, either active or inactive, are modeled using a two-state stochastic process as explained in the next few sections. Only when the units are in active states are they able to successfully transmit or receive signals to or from other neurons. The term successful is intentionally used to indicate that, even when the sender is in an active state, the signal might not reach the other end successfully if the receiving party does not satisfy the behavioral rules defined in section 4.3.2. Furthermore, transmitters in pre-synaptic neurons contact the receptors of a particular receptor group of the post-synaptic neuron, forming a synapse between the two neurons; see fig. 4.2. Through multiple receptor groups of post-synaptic neurons, pre-synaptic transmitters are allowed to make multiple synaptic connections in parallel, forming dynamic and stochastic synapses. As depicted in fig. 4.2, each receptor group R of the post-synaptic neuron and the transmitter-set T of the pre-synaptic neuron jointly measure the excitation at the attached synapse W and balance the excitation using a threshold θ, as discussed in the corresponding subsections of section 4.3.
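The following Python sketch illustrates one possible way to represent this neuronal structure: a neuron owning one transmitter-set and several receptor groups, with each computational unit carrying an active/inactive state. All class and attribute names here are illustrative assumptions, not code from the thesis.

```python
from dataclasses import dataclass, field

@dataclass
class Unit:
    """A single computational unit (receptor or transmitter) with a binary state."""
    active: bool = True

@dataclass
class Neuron:
    """A model neuron: one transmitter-set plus one receptor group per pre-synaptic neuron."""
    n_transmitters: int
    receptor_group_sizes: dict            # pre-synaptic neuron id -> group size
    transmitters: list = field(default_factory=list)
    receptor_groups: dict = field(default_factory=dict)

    def __post_init__(self):
        self.transmitters = [Unit() for _ in range(self.n_transmitters)]
        self.receptor_groups = {pre: [Unit() for _ in range(size)]
                                for pre, size in self.receptor_group_sizes.items()}

    def active_fraction(self, units):
        return sum(u.active for u in units) / len(units)

# A tiny fully connected network of three neurons (ids 0, 1, 2).
neurons = {i: Neuron(n_transmitters=100,
                     receptor_group_sizes={j: 50 for j in range(3) if j != i})
           for i in range(3)}
print(neurons[0].active_fraction(neurons[0].transmitters))
```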

Figure 4.1: Structure of a modeled neuron. A neuron consists of a large number of small computational units which are either receptors or transmitters. These units are either in active states or inactive states at a given time, according to the excitation of the whole neuron and the network.

Figure 4.2: Structure of the modeled neural network

4.3 Mathematical Formulation of Plasticity Mechanisms

This section comprehensively formulates the three main plasticity mechanisms in the human brain which are highly significant when learning takes place: short-term plasticity, long-term plasticity and homeostatic synaptic plasticity. Through STP we expect to develop short-term dynamics on our synapses; repetitive recall of these short-term dynamics will help LTP to make concrete changes on the synapses, while HSP helps neurons to smooth all these changes while maintaining stability.

4.3.1 Short-Term Plasticity

According to biology, post-synaptic receptors are clustered (see section 3.1.3) to specify synaptic connections between two individual neurons. A deeper analysis of the synaptic roles has shown (in section 3.1.3) that a synapse can generally pass through three major stages as a filter in the information transmission life cycle: low-pass filter, band-pass filter and high-pass filter. Figure 4.3 shows the transition of synapses as filters from one stage to another. As shown in sub-figure (b), with time the response of synapses can change from low-pass filters to high-pass filters when they continuously open for neurotransmitter release. However, if the synapses stay as high-pass filters for a longer time period, they gradually return to band-pass filters, and later to low-pass filters. Sub-figure (a) shows this variation over time, and we

can summarize this behavior of the synapses with two states, i.e. active and inactive states. Therefore, as shown in sub-figure (a), a given synapse can transit from active to inactive or vice versa as a result of responding to the information transmission between neurons. This behavior of synapses is modeled by assigning two states, active and inactive, to our modeled synapses, and they are allowed to transit from one state to the other as a response to the feedback given by a collective plasticity mechanism.

Figure 4.3: Three main stages of synapses in the information transmission process. The response of synapses in the information transmission process varies according to the time duration over which they have been opening up for neurotransmitter release. The amount of neurotransmitter release decreases when we move from low-pass filters to high-pass filters, i.e. from (a) to (b). On the other hand, synapses that have started as high-pass filters can later move to low-pass filters if the number of action potentials received at those synapses increases over time, i.e. from (c) to (d). However, if these synapses are opened up for neurotransmitter release for a longer time duration, they again move gradually to the high-pass filter stage.

When defining the process of the dynamic stochastic synapses we have focused only on the properties and mechanisms of use-dependent short-term plasticity that vary on time scales from a few milliseconds to several minutes. Therefore, use-dependent activity is introduced to our modeled network using short-term plasticity, namely facilitation and depletion; see section 3.2.1. When defining the role of transmitters, it has been assumed that facilitation depends only on the external Ca²⁺ ions that flow into the synapse after the arrival of an action potential and on the residual Ca²⁺ ion concentration that the synapse already has. Depletion has no dependence on Ca²⁺ concentrations and merely depends on the use-activity of the synapse. The signal release probability at a transmitter in a synapse is then adopted from the model proposed by Maass and Zador 79, which determines the signal release probability $p_r$ as a function of the Ca²⁺ influx to the synapse, the vesicle depletion and the signal arrival time at the transmitter. Only the influx of Ca²⁺ after the arrival of neurotransmitters at the receptors of the post-synaptic neuron is considered when determining the states of the post-synaptic receptors. If $P_S(t_i)$ is the probability that a signal is released by a transmitter S at time $t_i$, the train $t = \{t_1, t_2, \dots, t_n, \dots\}$ consists of the exact times at which signals arrive at S, and $S(t)$ consists of the sequence of times at which S has

successfully released signals. The map $t \mapsto S(t)$ at S forms a stochastic process with two states, i.e. release (R) for $t_i \in S(t)$ and failure of release (F) for $t_i \notin S(t)$. The probability $P_S(t_i)$ in eq. (4.1) describes the signal release probability of S at time $t_i$ as a function of the facilitation $C(t)$ in eq. (4.2) and the depletion $V(t) > 0$ in eq. (4.3) at time t. $C_0$ and $V_0$ are the facilitation and depression constants respectively. Maass and Zador 79 allowed S to release the received signal at time $t_i$ if $P_S(t_i)$ exceeded a fixed value. We have updated this rule by introducing a new threshold θ (see section 4.3.3), so that if $P_S(t_i) > \theta$, the transmitter S is allowed to release the received signal; we call this the active state. Receptors in the post-synaptic neuron are modeled following the same approach of Maass and Zador 79, except that they are not involved in the process of vesicle depletion. Therefore, the states of the receptors are determined by setting the depletion $V(t)$ in eq. (4.1) to unity.

$P_S(t_i) = 1 - e^{-C(t_i)\,V(t_i)}$ (4.1)

The function C(t) models the facilitation phenomenon (see eq. (4.2)) while V(t) models the depletion (see eq. (4.3)) at the synapse S at time t.

$C(t) = C_0 + \sum_{t_i < t} c(t - t_i)$ (4.2)

$C_0 \ge 0$, and $c(s) = \alpha\, e^{-s/\tau_C}$ models the response of C(t) to a pre-synaptic spike that reached synapse S at time $t - s$. Here $\alpha > 0$ is the magnitude of the response and $\tau_C > 0$ is the time decay constant of the facilitation phenomenon.

$V(t) = \max\Big(0,\; V_0 - \sum_{t_i:\, t_i < t \,\wedge\, t_i \in S(t)} v(t - t_i)\Big)$ (4.3)

$V_0 > 0$, and $v(s) = e^{-s/\tau_V}$ models the response of V(t) to a preceding release of the same synapse S at time $t - s < t$. Here $\tau_V$ is the synaptic time decay constant of the depletion phenomenon.
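A minimal sketch of eqs. (4.1)-(4.3), following the Maass and Zador style dynamic stochastic synapse described above. The parameter values (C0, V0, alpha, time constants, theta) and the spike times are illustrative placeholders, and the θ comparison reflects the active/inactive rule introduced in this section rather than the original stochastic sampling.

```python
import math

def release_probability(t, pre_spike_times, release_times,
                        C0=0.1, V0=1.0, alpha=0.3, tau_c=50.0, tau_v=300.0):
    """P_S(t) = 1 - exp(-C(t) * V(t)) of eq. (4.1); times in ms."""
    # Facilitation C(t), eq. (4.2): driven by all earlier pre-synaptic spikes.
    C = C0 + sum(alpha * math.exp(-(t - ti) / tau_c)
                 for ti in pre_spike_times if ti < t)
    # Depletion V(t), eq. (4.3): driven only by earlier successful releases.
    V = max(0.0, V0 - sum(math.exp(-(t - ti) / tau_v)
                          for ti in release_times if ti < t))
    return 1.0 - math.exp(-C * V)

# A transmitter is treated as "active" at time t when P_S(t) exceeds the threshold theta.
theta = 0.1
p = release_probability(120.0, pre_spike_times=[10, 60, 100], release_times=[10])
print(p, p > theta)
```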

4.3.2 Behavioral Rules on the Modeled Synapses

Signal passing within the network is fully controlled by behavioral rules, which are defined based on the biological phenomena of how synapses respond to signals (see section 3.1.3). As shown in fig. 4.4, biological synapses switch their states only at points a and c; at all the other points, i.e. points b and d, they do not change their states. Therefore, the response of synapses depends on their current states and on how long they have been open to neurotransmitter release. According to our model, once a signal is received by a receptor, the signal is propagated to a randomly selected transmitter in the same neuron according to the following behavioral rules.

Figure 4.4: Transition status of modeled synapses in the process of neurotransmitter release. If $P_S(t_i) > \theta$, a computational unit S, i.e. either a transmitter or a receptor, is in the active state (A); otherwise it is in the inactive state (I). Therefore, once a signal is sent to S, it can either remain in the same state or switch to the other state. When a signal is received by S at a point like b, S does not change its active state. Similarly, at a point like d, it remains in its inactive state. On the other hand, at a point like a, when S receives a signal it switches to the inactive state, while at a point like c it switches from the inactive state to the active state as a response.

Rule 1: When a receptor receives a signal from the corresponding pre-synaptic neuron at time step t, the signal is propagated within the network according to the following conditions.

Cond. 1: Once a received signal is applied to a receptor, if the receptor is updated to the inactive state then the received signal is dropped; otherwise the signal is propagated to a randomly selected transmitter of the same neuron.

Cond. 2: Once a transmitter of a particular neuron receives a signal at time step t, the signal is transmitted to a randomly selected receptor of a randomly selected receptor group of the post-synaptic neuron if the updated state of the transmitter is active; otherwise the received signal is dropped.

The above behavioral rule defines the underlying mechanism of signal transmission between the pre-synaptic neuron and the post-synaptic neuron; i.e., only when the related computational units of the two neurons are active is the signal successfully transmitted. Therefore, the number of active receptors in a receptor group of the post-synaptic neuron and the number of active transmitters in the pre-synaptic neuron jointly define the efficacy at a given synapse.
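The rule and its two conditions can be summarized in a short sketch of one propagation step. The function names and the way states are updated here are illustrative assumptions; the state update itself would come from the release-probability/threshold test sketched above.

```python
import random

def propagate(t, receptor, neuron_transmitters, post_receptor_groups, update_state):
    """One propagation step following Rule 1 and Conditions 1-2.

    update_state(unit, t) -> bool  re-evaluates a unit and returns True if it is
    active after the update (e.g. via the P_S(t) > theta test sketched earlier).
    """
    # Cond. 1: if the receiving receptor is updated to the inactive state, drop the signal.
    if not update_state(receptor, t):
        return None
    # Otherwise propagate to a randomly selected transmitter of the same neuron.
    transmitter = random.choice(neuron_transmitters)
    # Cond. 2: if the transmitter is updated to the inactive state, drop the signal.
    if not update_state(transmitter, t):
        return None
    # Otherwise transmit to a randomly selected receptor of a randomly selected
    # receptor group of the post-synaptic neuron.
    group = random.choice(list(post_receptor_groups.values()))
    return random.choice(group)   # the receptor that receives the forwarded signal
```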

4.3.3 Homeostatic Synaptic Plasticity

When mathematically modeling the behavior of the homeostatic plasticity mechanism, we are influenced mainly by biological observations and suggestions made by neuro-biologists, in addition to the process of homeostatic plasticity itself. It has been biologically observed that the most important feature of receptor accumulation during synaptic scaling under homeostatic plasticity is that all synapses in the cluster gain strength in proportion to the excitation or quiescence, so that if the synapses are excited, homeostatic plasticity adds the same number of receptors to all the synapses to regulate the synaptic strength 5. To accomplish such sensitivity, as per neuro-biologists, this is possible only if the neuron can sense its own activity level and adjust its intrinsic properties to drive its activity level toward a set point value, maintaining its stability in the face of correlation based learning rules 4. Therefore, our approach defines homeostatic plasticity as an anti-Hebbian plasticity mechanism that senses both the local active receptor density and the active transmitter density of each individual neuron at its ground level. The value of θ is then adjusted by the modeled homeostatic plasticity mechanism according to the deviation that the neuron has made from the overall neural activity. To accomplish this task, threshold values are defined for each receptor group and for the set of transmitters. The Hebbian learning algorithm (see chapter 2) increases a weight component when the cross-product of input and output is positive. Therefore, to model homeostatic plasticity as an anti-Hebbian mechanism, we increase (decrease) the corresponding threshold values of the receptor groups or transmitter-set when the cross-product is positive (negative), so that when neuronal activity is driving toward a saturated state, the corresponding threshold values of the most affected receptor groups or transmitter-set are adjusted to maintain stability. Following the Hebbian learning algorithm, if $x_{BA}$ is the output and $\theta_{BA}$ the threshold value of the receptor group $R_{BA}$ of neuron B, and $T_A$ is the transmitter-set of neuron A with output $o_A$ and corresponding threshold value $\theta_A$, then

$x_{BA} = w_{BA}\, o_A$ (4.4)

where $w_{BA}$ is the weight value of the connection between neuron B and neuron A. In our approach, instead of increasing the value of the weight, we increase the threshold value of the corresponding group. The threshold value of the receptor group $R_{BA}$ can now be defined as in eq. (4.5):

$x_{BA} = \theta_{BA}\, o_A$ (4.5)

We can rearrange eq. (4.5) as defined in eq. (4.6):

$\theta_{BA} = f\!\left(\dfrac{x_{BA}}{o_A}\right)$ (4.6)

Here $f(x) = 1 - e^{-x}$ and $x_{BA} = ActR_{BA} / R_{BA}$ ($ActR_{BA}$ is the number of active receptors in $R_{BA}$ and $R_{BA}$ is the number of receptors in $R_{BA}$). Similarly, $o_A = ActT_A / T_A$. A similar process is used to determine the threshold value of $T_A$, as given in eq. (4.7):

$\theta_A = o_A\, x; \qquad x = (x_{AB} + x_{AC} + x_{AD})$ (4.7)

where $x_{AB}$, $x_{AC}$ and $x_{AD}$ are the outputs of $R_{AB}$, $R_{AC}$ and $R_{AD}$ respectively. Equation (4.7) can be rewritten to assign exponential behavior to the threshold value, as shown in eq. (4.8):

$\theta_A = f\big(o_A\,(x_{AB} + x_{AC} + x_{AD})\big); \quad \text{where } f(x) = (1 - e^{-x})$ (4.8)

Therefore, the generalized mathematical model of homeostatic plasticity is as follows. If $R_{ij}$ is the $j$th receptor group of the $i$th neuron, $x_{ij}$ is the output of $R_{ij}$, $T_i$ is the transmitter-set of the $i$th neuron and $o_i$ is the output of $T_i$, then the threshold value $\theta_i$ of $T_i$ is determined as defined in eq. (4.9):

$\theta_i = f\big(o_i\,(x_{i1} + x_{i2} + \dots + x_{ik})\big)$ (4.9)

where k is the number of receptor groups in neuron i and $f(x) = 1 - e^{-x}$. The threshold value of $R_{ij}$, i.e. $\theta_{ij}$, can be similarly defined as in eq. (4.10), where $x_{ij} = ActR_{ij} / R_{ij}$ and $o_i = ActT_i / T_i$:

$\theta_{ij} = f\!\left(\dfrac{x_{ij}}{o_j}\right)$ (4.10)
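A small sketch of the threshold updates in eqs. (4.9) and (4.10), using $f(x) = 1 - e^{-x}$ as reconstructed above. The activity values passed in are assumed to be the active fractions $ActR_{ij}/R_{ij}$ and $ActT_i/T_i$; variable names and the example values are illustrative.

```python
import math

def f(x):
    """Squashing function assumed for the threshold updates: f(x) = 1 - exp(-x)."""
    return 1.0 - math.exp(-x)

def transmitter_threshold(o_i, receptor_group_outputs):
    """theta_i of eq. (4.9): o_i is the active fraction of the neuron's transmitter-set,
    receptor_group_outputs are x_i1..x_ik for its k receptor groups."""
    return f(o_i * sum(receptor_group_outputs))

def receptor_group_threshold(x_ij, o_j):
    """theta_ij of eq. (4.10): x_ij is the active fraction of receptor group R_ij,
    o_j the active fraction of the pre-synaptic neuron's transmitter-set."""
    return f(x_ij / o_j)

# Example: a neuron with three receptor groups; higher joint activity raises theta,
# which in turn makes units harder to keep active (the anti-Hebbian direction).
print(transmitter_threshold(0.8, [0.7, 0.6, 0.9]))
print(receptor_group_threshold(0.7, 0.8))
```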

4.3.4 Long-term Plasticity as STDP

Weight updating on the synaptic connections of neural network algorithms reflects the formation of long-term memory at the synapses. This formation of long-term memory in real neurons is mainly due to the LTP mechanism in the brain. LTP has recently been characterized in the form of STDP, where the signal arrival times at the target neurons and the absolute time gaps relative to the initiating neurons determine the form of LTP. Therefore, as discussed in section 3.2.2, we model LTP on our network in the form of STDP.

Binning the Process at Modeled Synapses

The process at the modeled synapses, where transmitters from the pre-synaptic neuron contact the receptors in a particular receptor group of the post-synaptic neuron, is binned in order to analyze the excitation at the corresponding synapse. A bin is an array of n columns which stores the data of a given synapse over n successive time steps. A single cell of a bin contains the data at a time step t, namely: the number of active transmitters in the pre-synaptic neuron, the number of active transmitters in the post-synaptic neuron, the number of active receptors in the corresponding receptor group of the post-synaptic neuron, and the mean release probability of the transmitters in the pre-synaptic neuron. Let $C_i$ be the $i$th cell of the $k$th bin; the time gap between two consecutive cells is set to 5 ms as in eq. (4.11). This allows us to define the time represented by each cell in a bin, measured from its first cell, as in eq. (4.12), cf. fig. 4.5. This arrangement of the bin is necessary in our model to satisfy

the condition $(t_{c_1} = 0\text{ ms}) < (\tau_{+} = \tau_{-} = 20\text{ ms}) < (t_{c_7} = 30\text{ ms})$, where $\tau_{+}$ and $\tau_{-}$ are the membrane constants for long-term potentiation and long-term depression.

$t_{c_{i+1}} - t_{c_i} = 5\text{ ms}, \qquad i = 1, \dots, n-1$ (4.11)

$t_{c_1} = 0, \qquad t_{c_i} = 5\,(i - 1)$ (4.12)

Figure 4.5: Binning the process at a synapse

Let $AT_{pre} = \{AT_{pre,1}, AT_{pre,2}, \dots, AT_{pre,n}\}$ be random variables of the number of active transmitters in the pre-synaptic neuron at the successive n time steps of a bin; similarly, let $AT_{post} = \{AT_{post,1}, AT_{post,2}, \dots, AT_{post,n}\}$ be random variables of the number of active transmitters in the post-synaptic neuron, and $AR_{post,s} = \{AR_{post,s,1}, AR_{post,s,2}, \dots, AR_{post,s,n}\}$ be random variables of the number of active receptors in the receptor group s corresponding to synapse s in the $k$th bin, $B_k$. Since the activity of the pre-synaptic transmitters and of the receptors in receptor group s are not independent, we define the mean $\mu_{B_k,s}$ and variance $\sigma^2_{B_k,s}$ of the $k$th bin on synapse s as in eqs. (4.13) and (4.14):

$\mu_{B_k,s} = \mu_{AT_{pre}} + \mu_{AR_{post,s}}$ (4.13)

$\sigma^2_{B_k,s} = \mathrm{Var}\big(AT_{pre} + AR_{post,s}\big)$ (4.14)

where $\mu_{AT_{pre}}$ and $\sigma^2_{AT_{pre}}$ are the mean and variance of $AT_{pre}$, and similarly $\mu_{AR_{post,s}}$ and $\sigma^2_{AR_{post,s}}$ are the mean and variance of $AR_{post,s}$. The mean and variance of both $AT_{pre}$ and $AR_{post,s}$ are estimated using maximum likelihood estimators, so that $\mu_{B_k,s}$ in eq. (4.13) can be expressed as in eq. (4.15) with the sample means $\overline{AT}_{pre} = \frac{1}{n}\sum_{j=1}^{n} AT_{pre,j}$ and $\overline{AR}_{post,s} = \frac{1}{n}\sum_{j=1}^{n} AR_{post,s,j}$, and $\sigma^2_{B_k,s}$ in eq. (4.14) can be defined as in eq. (4.16) with the sample variances $S^2_{AT_{pre}} = \frac{1}{n}\sum_{j=1}^{n}\big(AT_{pre,j} - \overline{AT}_{pre}\big)^2$ and $S^2_{AR_{post,s}} = \frac{1}{n}\sum_{j=1}^{n}\big(AR_{post,s,j} - \overline{AR}_{post,s}\big)^2$. The covariance of $AT_{pre}$ and $AR_{post,s}$ is defined in eq. (4.17).

$\hat{\mu}_{B_k,s} = \overline{AT}_{pre} + \overline{AR}_{post,s}$ (4.15)

$\hat{\sigma}^2_{B_k,s} = S^2_{AT_{pre}} + S^2_{AR_{post,s}} + 2\,\mathrm{Cov}\big(AT_{pre}, AR_{post,s}\big)$ (4.16)

$\mathrm{Cov}\big(AT_{pre}, AR_{post,s}\big) = \dfrac{1}{n}\sum_{j=1}^{n}\big(AT_{pre,j} - \overline{AT}_{pre}\big)\big(AR_{post,s,j} - \overline{AR}_{post,s}\big)$ (4.17)
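A brief sketch of the bin statistics in eqs. (4.13)-(4.17), using the maximum likelihood (1/n) estimators as stated above. The input lists are illustrative placeholders for the per-cell counts recorded in one bin.

```python
def bin_statistics(at_pre, ar_post_s):
    """Estimate mu_{B_k,s} and sigma^2_{B_k,s} (eqs. 4.15-4.17) for one bin.

    at_pre    : active-transmitter counts of the pre-synaptic neuron, one per cell
    ar_post_s : active-receptor counts of receptor group s, one per cell
    """
    n = len(at_pre)
    mean_pre = sum(at_pre) / n
    mean_post = sum(ar_post_s) / n
    # Maximum likelihood (1/n) sample variances and covariance.
    var_pre = sum((x - mean_pre) ** 2 for x in at_pre) / n
    var_post = sum((y - mean_post) ** 2 for y in ar_post_s) / n
    cov = sum((x - mean_pre) * (y - mean_post)
              for x, y in zip(at_pre, ar_post_s)) / n
    mu_hat = mean_pre + mean_post                    # eq. (4.15)
    sigma2_hat = var_pre + var_post + 2.0 * cov      # eq. (4.16), assuming +2*Cov
    return mu_hat, sigma2_hat

# Seven cells per bin (5 ms spacing spans 0-30 ms, covering the 20 ms STDP constant).
print(bin_statistics([12, 14, 15, 13, 11, 10, 9], [30, 33, 35, 34, 31, 28, 27]))
```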


More information

A computational account for the ontogeny of mirror neurons via Hebbian learning

A computational account for the ontogeny of mirror neurons via Hebbian learning A computational account for the ontogeny of mirror neurons via Hebbian learning Graduation Project Bachelor Artificial Intelligence Credits: 18 EC Author Lotte Weerts 10423303 University of Amsterdam Faculty

More information

Modeling of Hippocampal Behavior

Modeling of Hippocampal Behavior Modeling of Hippocampal Behavior Diana Ponce-Morado, Venmathi Gunasekaran and Varsha Vijayan Abstract The hippocampus is identified as an important structure in the cerebral cortex of mammals for forming

More information

Spiking Inputs to a Winner-take-all Network

Spiking Inputs to a Winner-take-all Network Spiking Inputs to a Winner-take-all Network Matthias Oster and Shih-Chii Liu Institute of Neuroinformatics University of Zurich and ETH Zurich Winterthurerstrasse 9 CH-857 Zurich, Switzerland {mao,shih}@ini.phys.ethz.ch

More information

Recognition of English Characters Using Spiking Neural Networks

Recognition of English Characters Using Spiking Neural Networks Recognition of English Characters Using Spiking Neural Networks Amjad J. Humaidi #1, Thaer M. Kadhim *2 Control and System Engineering, University of Technology, Iraq, Baghdad 1 601116@uotechnology.edu.iq

More information

Applied Neuroscience. Conclusion of Science Honors Program Spring 2017

Applied Neuroscience. Conclusion of Science Honors Program Spring 2017 Applied Neuroscience Conclusion of Science Honors Program Spring 2017 Review Circle whichever is greater, A or B. If A = B, circle both: I. A. permeability of a neuronal membrane to Na + during the rise

More information

CAS Seminar - Spiking Neurons Network (SNN) Jakob Kemi ( )

CAS Seminar - Spiking Neurons Network (SNN) Jakob Kemi ( ) CAS Seminar - Spiking Neurons Network (SNN) Jakob Kemi (820622-0033) kemiolof@student.chalmers.se November 20, 2006 Introduction Biological background To be written, lots of good sources. Background First

More information

Why is dispersion of memory important*

Why is dispersion of memory important* What is memory* It is a web of connections Research has shown that people who lose their memory also lose the ability to connect things to each other in their mind It is these connections that let us understand

More information

STDP between a Pair of Recurrently Connected Neurons with Asynchronous and Synchronous Stimulation AZADEH HASSANNEJAD NAZIR

STDP between a Pair of Recurrently Connected Neurons with Asynchronous and Synchronous Stimulation AZADEH HASSANNEJAD NAZIR STDP between a Pair of Recurrently Connected Neurons with Asynchronous and Synchronous Stimulation AZADEH HASSANNEJAD NAZIR Master of Science Thesis Stockholm, Sweden 2012 STDP between a Pair of Recurrently

More information

Computational Neuroscience. Instructor: Odelia Schwartz

Computational Neuroscience. Instructor: Odelia Schwartz Computational Neuroscience 2017 1 Instructor: Odelia Schwartz From the NIH web site: Committee report: Brain 2025: A Scientific Vision (from 2014) #1. Discovering diversity: Identify and provide experimental

More information

Information Processing During Transient Responses in the Crayfish Visual System

Information Processing During Transient Responses in the Crayfish Visual System Information Processing During Transient Responses in the Crayfish Visual System Christopher J. Rozell, Don. H. Johnson and Raymon M. Glantz Department of Electrical & Computer Engineering Department of

More information

Cerebral Cortex. Edmund T. Rolls. Principles of Operation. Presubiculum. Subiculum F S D. Neocortex. PHG & Perirhinal. CA1 Fornix CA3 S D

Cerebral Cortex. Edmund T. Rolls. Principles of Operation. Presubiculum. Subiculum F S D. Neocortex. PHG & Perirhinal. CA1 Fornix CA3 S D Cerebral Cortex Principles of Operation Edmund T. Rolls F S D Neocortex S D PHG & Perirhinal 2 3 5 pp Ento rhinal DG Subiculum Presubiculum mf CA3 CA1 Fornix Appendix 4 Simulation software for neuronal

More information

Using stigmergy to incorporate the time into artificial neural networks

Using stigmergy to incorporate the time into artificial neural networks Using stigmergy to incorporate the time into artificial neural networks Federico A. Galatolo, Mario G.C.A. Cimino, and Gigliola Vaglini Department of Information Engineering, University of Pisa, 56122

More information

Consciousness as representation formation from a neural Darwinian perspective *

Consciousness as representation formation from a neural Darwinian perspective * Consciousness as representation formation from a neural Darwinian perspective * Anna Kocsis, mag.phil. Institute of Philosophy Zagreb, Croatia Vjeran Kerić, mag.phil. Department of Psychology and Cognitive

More information

ANALYSIS AND CLASSIFICATION OF EEG SIGNALS. A Dissertation Submitted by. Siuly. Doctor of Philosophy

ANALYSIS AND CLASSIFICATION OF EEG SIGNALS. A Dissertation Submitted by. Siuly. Doctor of Philosophy UNIVERSITY OF SOUTHERN QUEENSLAND, AUSTRALIA ANALYSIS AND CLASSIFICATION OF EEG SIGNALS A Dissertation Submitted by Siuly For the Award of Doctor of Philosophy July, 2012 Abstract Electroencephalography

More information

How has Computational Neuroscience been useful? Virginia R. de Sa Department of Cognitive Science UCSD

How has Computational Neuroscience been useful? Virginia R. de Sa Department of Cognitive Science UCSD How has Computational Neuroscience been useful? 1 Virginia R. de Sa Department of Cognitive Science UCSD What is considered Computational Neuroscience? 2 What is considered Computational Neuroscience?

More information

Processing of Logical Functions in the Human Brain

Processing of Logical Functions in the Human Brain Processing of Logical Functions in the Human Brain GRMELA ALEŠ AGCES Ltd. AGCES, Levského 3221/1, Praha 4 CZECH REPUBLIC N. E. MASTORAKIS Military Institutions of University Education, Hellenic Naval Academy,

More information

Different inhibitory effects by dopaminergic modulation and global suppression of activity

Different inhibitory effects by dopaminergic modulation and global suppression of activity Different inhibitory effects by dopaminergic modulation and global suppression of activity Takuji Hayashi Department of Applied Physics Tokyo University of Science Osamu Araki Department of Applied Physics

More information

Evaluating the Effect of Spiking Network Parameters on Polychronization

Evaluating the Effect of Spiking Network Parameters on Polychronization Evaluating the Effect of Spiking Network Parameters on Polychronization Panagiotis Ioannou, Matthew Casey and André Grüning Department of Computing, University of Surrey, Guildford, Surrey, GU2 7XH, UK

More information

Structure of a Neuron:

Structure of a Neuron: Structure of a Neuron: At the dendrite the incoming signals arrive (incoming currents) At the soma current are finally integrated. At the axon hillock action potential are generated if the potential crosses

More information

ANN predicts locoregional control using molecular marker profiles of. Head and Neck squamous cell carcinoma

ANN predicts locoregional control using molecular marker profiles of. Head and Neck squamous cell carcinoma ANN predicts locoregional control using molecular marker profiles of Head and Neck squamous cell carcinoma Final Project: 539 Dinesh Kumar Tewatia Introduction Radiotherapy alone or combined with chemotherapy,

More information

Synapses and synaptic plasticity. Lubica Benuskova Lecture 8 How neurons communicate How do we learn and remember

Synapses and synaptic plasticity. Lubica Benuskova Lecture 8 How neurons communicate How do we learn and remember Synapses and synaptic plasticity Lubica Benuskova Lecture 8 How neurons communicate How do we learn and remember 1 Brain is comprised of networks of neurons connected and communicating via synapses ~10

More information

A General Theory of the Brain Based on the Biophysics of Prediction

A General Theory of the Brain Based on the Biophysics of Prediction A General Theory of the Brain Based on the Biophysics of Prediction Christopher D. Fiorillo KAIST Daejeon, Korea April 7, 2016 Why have we not understood the brain? Scientists categorize the world into

More information

Cognitive & Linguistic Sciences. What is cognitive science anyway? Why is it interdisciplinary? Why do we need to learn about information processors?

Cognitive & Linguistic Sciences. What is cognitive science anyway? Why is it interdisciplinary? Why do we need to learn about information processors? Cognitive & Linguistic Sciences What is cognitive science anyway? Why is it interdisciplinary? Why do we need to learn about information processors? Heather Bortfeld Education: BA: University of California,

More information

Computing with Spikes in Recurrent Neural Networks

Computing with Spikes in Recurrent Neural Networks Computing with Spikes in Recurrent Neural Networks Dezhe Jin Department of Physics The Pennsylvania State University Presented at ICS Seminar Course, Penn State Jan 9, 2006 Outline Introduction Neurons,

More information

2Lesson. Outline 3.3. Lesson Plan. The OVERVIEW. Lesson 3.3 Why does applying pressure relieve pain? LESSON. Unit1.2

2Lesson. Outline 3.3. Lesson Plan. The OVERVIEW. Lesson 3.3 Why does applying pressure relieve pain? LESSON. Unit1.2 Outline 2Lesson Unit1.2 OVERVIEW Rationale: This lesson introduces students to inhibitory synapses. To review synaptic transmission, the class involves a student model pathway to act out synaptic transmission.

More information

Investigation of Physiological Mechanism For Linking Field Synapses

Investigation of Physiological Mechanism For Linking Field Synapses Investigation of Physiological Mechanism For Linking Field Synapses Richard B. Wells 1, Nick Garrett 2, Tom Richner 3 Microelectronics Research and Communications Institute (MRCI) BEL 316 University of

More information

Independent Component Analysis in Neuron Model

Independent Component Analysis in Neuron Model Independent Component Analysis in Neuron Model Sheng-Hsiou Hsu shh078@ucsd.edu Yae Lim Lee yaelimlee@gmail.com Department of Bioengineering University of California, San Diego La Jolla, CA 92093 Abstract

More information

Artificial Neural Networks

Artificial Neural Networks Artificial Neural Networks Torsten Reil torsten.reil@zoo.ox.ac.uk Outline What are Neural Networks? Biological Neural Networks ANN The basics Feed forward net Training Example Voice recognition Applications

More information

What is Anatomy and Physiology?

What is Anatomy and Physiology? Introduction BI 212 BI 213 BI 211 Ecosystems Organs / organ systems Cells Organelles Communities Tissues Molecules Populations Organisms Campbell et al. Figure 1.4 Introduction What is Anatomy and Physiology?

More information

New Ideas for Brain Modelling

New Ideas for Brain Modelling IOSR Journal of Engineering (IOSRJEN) ISSN (e): 2250-3021, ISSN (p): 2278-8719 Vol. 06, Issue10 (Oct. 2016), V1 PP 13-17 www.iosrjen.org New Ideas for Brain Modelling Kieran Greer 1 1 Distributed Computing

More information

DIGITIZING HUMAN BRAIN: BLUE BRAIN PROJECT

DIGITIZING HUMAN BRAIN: BLUE BRAIN PROJECT DIGITIZING HUMAN BRAIN: BLUE BRAIN PROJECT Diwijesh 1, Ms. Pooja Khanna 2 1 M.tech CSE, Amity University 2 Associate Professor, ASET ABSTRACT Human brain is most complex and unique creation of nature which

More information

Sound Quality Evaluation of Hermetic Compressors Using Artificial Neural Networks

Sound Quality Evaluation of Hermetic Compressors Using Artificial Neural Networks Purdue University Purdue e-pubs International Compressor Engineering Conference School of Mechanical Engineering 2006 Sound Quality Evaluation of Hermetic Compressors Using Artificial Neural Networks Claudio

More information

COURSE: NURSING RESEARCH CHAPTER I: INTRODUCTION

COURSE: NURSING RESEARCH CHAPTER I: INTRODUCTION COURSE: NURSING RESEARCH CHAPTER I: INTRODUCTION 1. TERMINOLOGY 1.1 Research Research is a systematic enquiry about a particular situation for a certain truth. That is: i. It is a search for knowledge

More information

Heterogeneous networks of spiking neurons: self-sustained activity and excitability

Heterogeneous networks of spiking neurons: self-sustained activity and excitability Heterogeneous networks of spiking neurons: self-sustained activity and excitability Cristina Savin 1,2, Iosif Ignat 1, Raul C. Mureşan 2,3 1 Technical University of Cluj Napoca, Faculty of Automation and

More information

Biomimetic Cortical Nanocircuits: The BioRC Project. Alice C. Parker NSF Emerging Models of Technology Meeting July 24, 2008

Biomimetic Cortical Nanocircuits: The BioRC Project. Alice C. Parker NSF Emerging Models of Technology Meeting July 24, 2008 Biomimetic Cortical Nanocircuits: The BioRC Project Alice C. Parker NSF Emerging Models of Technology Meeting July 24, 2008 The BioRC Project Team and Support Alice Parker, PI and Chongwu Zhou, Co-PI Graduate

More information

Neural Networks. Nice books to start reading:

Neural Networks. Nice books to start reading: Neural Networks Overview: - Anatomy of Neuronal Networks - Formal Neural Networks - Are they realistic? - Oscillations and Phase locking - Mapping problem: Kohonen Networks Nice books to start reading:

More information

Lecture 22: A little Neurobiology

Lecture 22: A little Neurobiology BIO 5099: Molecular Biology for Computer Scientists (et al) Lecture 22: A little Neurobiology http://compbio.uchsc.edu/hunter/bio5099 Larry.Hunter@uchsc.edu Nervous system development Part of the ectoderm

More information

The Ever-Changing Brain. Dr. Julie Haas Biological Sciences

The Ever-Changing Brain. Dr. Julie Haas Biological Sciences The Ever-Changing Brain Dr. Julie Haas Biological Sciences Outline 1) Synapses: excitatory, inhibitory, and gap-junctional 2) Synaptic plasticity, and Hebb s postulate 3) Sensory maps and plasticity 4)

More information

LESSON 3.3 WORKBOOK. Why does applying pressure relieve pain?

LESSON 3.3 WORKBOOK. Why does applying pressure relieve pain? Postsynaptic potentials small changes in voltage (membrane potential) due to the binding of neurotransmitter. Receptor-gated ion channels ion channels that open or close in response to the binding of a

More information

A general error-based spike-timing dependent learning rule for the Neural Engineering Framework

A general error-based spike-timing dependent learning rule for the Neural Engineering Framework A general error-based spike-timing dependent learning rule for the Neural Engineering Framework Trevor Bekolay Monday, May 17, 2010 Abstract Previous attempts at integrating spike-timing dependent plasticity

More information

Memory, Attention, and Decision-Making

Memory, Attention, and Decision-Making Memory, Attention, and Decision-Making A Unifying Computational Neuroscience Approach Edmund T. Rolls University of Oxford Department of Experimental Psychology Oxford England OXFORD UNIVERSITY PRESS Contents

More information

Inference Methods for First Few Hundred Studies

Inference Methods for First Few Hundred Studies Inference Methods for First Few Hundred Studies James Nicholas Walker Thesis submitted for the degree of Master of Philosophy in Applied Mathematics and Statistics at The University of Adelaide (Faculty

More information

Imperfect Synapses in Artificial Spiking Neural Networks

Imperfect Synapses in Artificial Spiking Neural Networks Imperfect Synapses in Artificial Spiking Neural Networks A thesis submitted in partial fulfilment of the requirements for the Degree of Master of Computer Science by Hayden Jackson University of Canterbury

More information

Running PyNN Simulations on SpiNNaker

Running PyNN Simulations on SpiNNaker Introduction Running PyNN Simulations on SpiNNaker This manual will introduce you to the basics of using the PyNN neural network language on SpiNNaker neuromorphic hardware. Installation The PyNN toolchain

More information

Dynamics of Hodgkin and Huxley Model with Conductance based Synaptic Input

Dynamics of Hodgkin and Huxley Model with Conductance based Synaptic Input Proceedings of International Joint Conference on Neural Networks, Dallas, Texas, USA, August 4-9, 2013 Dynamics of Hodgkin and Huxley Model with Conductance based Synaptic Input Priyanka Bajaj and Akhil

More information

University of Cambridge Engineering Part IB Information Engineering Elective

University of Cambridge Engineering Part IB Information Engineering Elective University of Cambridge Engineering Part IB Information Engineering Elective Paper 8: Image Searching and Modelling Using Machine Learning Handout 1: Introduction to Artificial Neural Networks Roberto

More information

LESSON 3.3 WORKBOOK. Why does applying pressure relieve pain? Workbook. Postsynaptic potentials

LESSON 3.3 WORKBOOK. Why does applying pressure relieve pain? Workbook. Postsynaptic potentials Depolarize to decrease the resting membrane potential. Decreasing membrane potential means that the membrane potential is becoming more positive. Excitatory postsynaptic potentials (EPSP) graded postsynaptic

More information

The Re(de)fined Neuron

The Re(de)fined Neuron The Re(de)fined Neuron Kieran Greer, Distributed Computing Systems, Belfast, UK. http://distributedcomputingsystems.co.uk Version 1.0 Abstract This paper describes a more biologically-oriented process

More information

COMPLEX NEURAL COMPUTATION WITH SIMPLE DIGITAL NEURONS. Andrew Thomas Nere. A dissertation submitted in partial fulfillment of

COMPLEX NEURAL COMPUTATION WITH SIMPLE DIGITAL NEURONS. Andrew Thomas Nere. A dissertation submitted in partial fulfillment of COMPLEX NEURAL COMPUTATION WITH SIMPLE DIGITAL NEURONS by Andrew Thomas Nere A dissertation submitted in partial fulfillment of the requirements for the degree of Doctor of Philosophy (Electrical Engineering)

More information

Active Control of Spike-Timing Dependent Synaptic Plasticity in an Electrosensory System

Active Control of Spike-Timing Dependent Synaptic Plasticity in an Electrosensory System Active Control of Spike-Timing Dependent Synaptic Plasticity in an Electrosensory System Patrick D. Roberts and Curtis C. Bell Neurological Sciences Institute, OHSU 505 N.W. 185 th Avenue, Beaverton, OR

More information

Oscillatory Neural Network for Image Segmentation with Biased Competition for Attention

Oscillatory Neural Network for Image Segmentation with Biased Competition for Attention Oscillatory Neural Network for Image Segmentation with Biased Competition for Attention Tapani Raiko and Harri Valpola School of Science and Technology Aalto University (formerly Helsinki University of

More information

Computational cognitive neuroscience: 2. Neuron. Lubica Beňušková Centre for Cognitive Science, FMFI Comenius University in Bratislava

Computational cognitive neuroscience: 2. Neuron. Lubica Beňušková Centre for Cognitive Science, FMFI Comenius University in Bratislava 1 Computational cognitive neuroscience: 2. Neuron Lubica Beňušková Centre for Cognitive Science, FMFI Comenius University in Bratislava 2 Neurons communicate via electric signals In neurons it is important

More information

Cholinergic suppression of transmission may allow combined associative memory function and self-organization in the neocortex.

Cholinergic suppression of transmission may allow combined associative memory function and self-organization in the neocortex. Cholinergic suppression of transmission may allow combined associative memory function and self-organization in the neocortex. Michael E. Hasselmo and Milos Cekic Department of Psychology and Program in

More information

Active Sites model for the B-Matrix Approach

Active Sites model for the B-Matrix Approach Active Sites model for the B-Matrix Approach Krishna Chaithanya Lingashetty Abstract : This paper continues on the work of the B-Matrix approach in hebbian learning proposed by Dr. Kak. It reports the

More information

Neuroscience 201A (2016) - Problems in Synaptic Physiology

Neuroscience 201A (2016) - Problems in Synaptic Physiology Question 1: The record below in A shows an EPSC recorded from a cerebellar granule cell following stimulation (at the gap in the record) of a mossy fiber input. These responses are, then, evoked by stimulation.

More information

Shadowing and Blocking as Learning Interference Models

Shadowing and Blocking as Learning Interference Models Shadowing and Blocking as Learning Interference Models Espoir Kyubwa Dilip Sunder Raj Department of Bioengineering Department of Neuroscience University of California San Diego University of California

More information

Yuriy Belov, Sergiy Тkachuk, Roman Iamborak

Yuriy Belov, Sergiy Тkachuk, Roman Iamborak International Journal "Information Theories & Applications" Vol.12 57 Bibliography [1] Z.L.Rabinovich. About mechanisms of thinking and intellectual computers // Cybernetics and system analysis, 1993,

More information

Synaptic Transmission: Ionic and Metabotropic

Synaptic Transmission: Ionic and Metabotropic Synaptic Transmission: Ionic and Metabotropic D. Purves et al. Neuroscience (Sinauer Assoc.) Chapters 5, 6, 7. C. Koch. Biophysics of Computation (Oxford) Chapter 4. J.G. Nicholls et al. From Neuron to

More information

Artificial Neural Network : Introduction

Artificial Neural Network : Introduction Artificial Neural Network : Introduction Debasis Samanta IIT Kharagpur dsamanta@iitkgp.ac.in 23.03.2018 Debasis Samanta (IIT Kharagpur) Soft Computing Applications 23.03.2018 1 / 20 Biological nervous

More information

The Function of Nervous Tissue *

The Function of Nervous Tissue * OpenStax-CNX module: m46531 1 The Function of Nervous Tissue * OpenStax This work is produced by OpenStax-CNX and licensed under the Creative Commons Attribution License 3.0 By the end of this section,

More information

SUPPLEMENTARY INFORMATION

SUPPLEMENTARY INFORMATION doi:10.1038/nature10776 Supplementary Information 1: Influence of inhibition among blns on STDP of KC-bLN synapses (simulations and schematics). Unconstrained STDP drives network activity to saturation

More information

Zoo400 Exam 1: Mar 25, 1999

Zoo400 Exam 1: Mar 25, 1999 Zoo400 Exam 1: Mar 25, 1999 NAME: There is only 1 best answer per question. (1 pt each) A large dendrite is 1mm long and has a diameter of 3.2 µ. Calculate the following using the assumption that the dendrite

More information

Neuron Phase Response

Neuron Phase Response BioE332A Lab 4, 2007 1 Lab 4 February 2, 2007 Neuron Phase Response In this lab, we study the effect of one neuron s spikes on another s, combined synapse and neuron behavior. In Lab 2, we characterized

More information

The Perceptron: : A Probabilistic Model for Information Storage and Organization in the brain (F. Rosenblatt)

The Perceptron: : A Probabilistic Model for Information Storage and Organization in the brain (F. Rosenblatt) The Perceptron: : A Probabilistic Model for Information Storage and Organization in the brain (F. Rosenblatt) Artificial Intelligence 2005-21534 Heo, Min-Oh Outline Introduction Probabilistic model on

More information

Neurons and neural networks II. Hopfield network

Neurons and neural networks II. Hopfield network Neurons and neural networks II. Hopfield network 1 Perceptron recap key ingredient: adaptivity of the system unsupervised vs supervised learning architecture for discrimination: single neuron perceptron

More information

Artificial Neural Networks to Determine Source of Acoustic Emission and Damage Detection

Artificial Neural Networks to Determine Source of Acoustic Emission and Damage Detection Artificial Neural Networks to Determine Source of Acoustic Emission and Damage Detection Mehrdad Shafiei Dizaji 1, Farzad Shafiei Dizaji 2 1 Former graduate, Department of Civil Engineering, Sharif University

More information

Adaptive leaky integrator models of cerebellar Purkinje cells can learn the clustering of temporal patterns

Adaptive leaky integrator models of cerebellar Purkinje cells can learn the clustering of temporal patterns Neurocomputing 26}27 (1999) 271}276 Adaptive leaky integrator models of cerebellar Purkinje cells can learn the clustering of temporal patterns Volker Steuber*, David J. Willshaw Centre for Cognitive Science,

More information

Mike Davies Director, Neuromorphic Computing Lab Intel Labs

Mike Davies Director, Neuromorphic Computing Lab Intel Labs Mike Davies Director, Neuromorphic Computing Lab Intel Labs Loihi at a Glance Key Properties Integrated Memory + Compute Neuromorphic Architecture 128 neuromorphic cores supporting up to 128k neurons and

More information

Analysis of spectro-temporal receptive fields in an auditory neural network

Analysis of spectro-temporal receptive fields in an auditory neural network Analysis of spectro-temporal receptive fields in an auditory neural network Madhav Nandipati Abstract Neural networks have been utilized for a vast range of applications, including computational biology.

More information

The Function of Nervous Tissue (Chapter 9) *

The Function of Nervous Tissue (Chapter 9) * OpenStax-CNX module: m62094 1 The Function of Nervous Tissue (Chapter 9) * Ildar Yakhin Based on The Function of Nervous Tissue by OpenStax This work is produced by OpenStax-CNX and licensed under the

More information

Axon initial segment position changes CA1 pyramidal neuron excitability

Axon initial segment position changes CA1 pyramidal neuron excitability Axon initial segment position changes CA1 pyramidal neuron excitability Cristina Nigro and Jason Pipkin UCSD Neurosciences Graduate Program Abstract The axon initial segment (AIS) is the portion of the

More information