Neuromorphic Self-Organizing Map Design for Classification of Bioelectric-Timescale Signals


Johan Mes, Ester Stienstra, Xuefei You, Sumeet S. Kumar, Amir Zjajo, Carlo Galuzzi and Rene van Leuken
Circuits and Systems Group, Delft University of Technology, The Netherlands
j.c.mes@student.tudelft.nl, amir.zjajo@ieee.org, sumeetskumar@ieee.org, t.g.r.m.vanleuken@tudelft.nl
Department of Data Science and Knowledge Engineering, Maastricht University, The Netherlands
c.galuzzi@maastrichtuniversity.nl

Abstract: The Self-Organizing Map (SOM) is a recurrent neural network topology that realizes competitive learning for the unsupervised classification of data. In this paper, we investigate the design of a spiking neural network-based SOM for the classification of bioelectric-timescale signals. We present novel insights into the architectural design space, its inherent trade-offs, and the critical requirements for designing and configuring neurons, synapses and learning rules to achieve stable and accurate behaviour. We perform this exploration using high-level architectural simulations and, additionally, through the full-custom implementation of components.

I. INTRODUCTION
Artificial Neural Networks (ANN) are an effective means of recognizing and classifying patterns within real-time signals. Conventional ANNs rely on arrays of neurons, each comprising a non-linear summing function, an activation level and an activation function. Neurons are interconnected through synapses, which act as weights, effectively magnifying or suppressing the effect of the pre-synaptic neuron on the post-synaptic neuron. When the aggregated inputs to a neuron match its activation level, the neuron fires and generates an output using its activation function. Spiking Neural Networks (SNN) are a class of ANNs which rely on the precise timing of neuron activation to encode information [1]. When presented with an input signal, SNN neurons respond with a specific temporal sequence of voltage spikes, or spike trains. By adapting synaptic weights, neurons in subsequent layers become selective to specific temporal sequences, enabling classification and even transformation.

Although SNNs have been extensively studied in the past, the challenges related to their use in the classification and processing of real-time signals are less understood. The efficacy and accuracy of the classification are impacted by the choice of neuron model, synapse architecture, learning rule, and input encoding, which together form a complex, non-trivial design space. Prior art in this domain presents limited insights, often in a stand-alone manner for each design choice. For instance, neurons, synapses, interconnects and learning rules are examined independently in [2]–[4]. Various neuron models target different design goals ranging from biophysical accuracy [5] to simplicity [6] and computational tractability [7], with a number of component-level [8], [9] and system-level implementations [10]. However, configuring them within a practical classification use-case is far from trivial. Similarly for synapses, the choice of learning rules [11]–[13] is often determined by the ability to configure and deploy them in a stable manner within the SNN architecture. A few notable papers provide insight into the tuning of specific parameters, such as the learning window size needed to achieve competitive learning in networks [14].
However, since application requirements drive architectural design decisions, these parameters are also impacted by the type of learning behaviour realized by the network topology [15]–[17]. In this paper, we explore a practical design trajectory for an SNN-based system for the processing and classification of bioelectric-timescale signals. The proposed system implements the Self-Organizing Map (SOM) [17] topology to realize competitive unsupervised learning within the SNN. We present novel insights into the architectural trade-offs inherent in the design of such SOM classifiers and the critical requirements for neurons, synapses and learning rules to achieve stable and accurate classification behaviour, and we detail practical experiences from the training and testing of such a system. These insights are delivered on the basis of high-level architectural explorations and a subsequent full-custom UMC65nm implementation.

II. ARCHITECTURE
The design space of SNNs is evaluated in the context of a classifier for bioelectric-timescale signals. These signals occupy a low frequency range (well below 1 kHz), have small amplitudes on the order of millivolts, and can be noisy, as in the case of electrocardiograms (ECG) and electroencephalograms (EEG). Sampled analog signal values are input to the encoding stage for conversion into spike trains. Each spike train stimulates one or more neurons in the hidden layers of the SNN, corresponding to an output class. The classifier is illustrated in Figure 1. The characteristics of the output spike train are influenced by a number of factors: the SNN topology, neuron model, synapses, learning rules, and input encoding. In this section, we examine each of these factors.

A. Self-Organising Map (SOM) Topology
The behaviour of SNNs is determined primarily by the topology in which neurons are interconnected. In addition, the relative population of excitatory and inhibitory connections also impacts the stability and convergence of the learned synaptic weights. The SOM [17] topology is a variant of the k-means clustering algorithm, and yields spatially distinct classification of inputs through its intrinsic Winner Takes All (WTA) behaviour [18], [19]. The topology of connections between excitatory and inhibitory neurons produces competition in the SOM, resulting in unsupervised learning within the network. Figure 2 illustrates the synaptic weight matrix of a trained SOM, where excitatory connections are light coloured and inhibitory connections are dark coloured. The feedback loops in the topology are evident from the pattern of facilitation and depression observed in the matrix. Although similar behaviour can be elicited even from a feed-forward network with lateral inhibition, our experience suggests that achieving stability is non-trivial. This is primarily due to the criticality of the excitation-inhibition ratio and its dependence on the topology and the initial weight distributions.

Network delays also play an important role across topologies. In the SOM, synaptic delays facilitate the formation of polychronous groups, i.e. distinct clusters of neurons that fire together in response to a specific input stimulus [20]. Polychronous behaviour is a function of the input spike timing and the delay patterns in the network. In other topologies using recurrent connections, the network delay influences the precise time at which a spiking neuron causally stimulates itself. Used as an additional parameter, the delay can yield configurable variations in spiking behaviour from a homogeneous array of neurons.

Fig. 1. Architecture of the SNN-based classifier. Real-time analog inputs are converted to m spike trains by the input encoder. These m spike trains stimulate m input neurons in the hidden SNN layer. Also illustrated is the toroidal grid of n x n neurons organized as an SOM. The output of this network is the spike timing and frequency information of the population. Grid neurons are fully connected through synaptic weights scaled according to the Mexican hat function.

Fig. 2. Synaptic weight matrix of a trained SOM. Colors lighter than grey denote an excitatory connection, while colors darker than grey denote an inhibitory connection. The closer to pure black or pure white, the higher the absolute conductivity. The input neurons are connected with plastic synapses to the grid neurons. The result of learning is clearly seen in the specific pattern of potentiated connections linking the input neurons to the SOM grid. These connections were initialized with a uniform random distribution before training.
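As a concrete illustration of the lateral connectivity sketched in Fig. 1, the following listing builds a fixed lateral weight matrix for an n x n toroidal grid using a Mexican hat profile (short-range excitation, longer-range inhibition). This is a minimal NumPy sketch; the amplitudes and widths are placeholder assumptions, not the parameters of the paper's implementation.

```python
import numpy as np

def mexican_hat_lateral_weights(n=10, a_exc=1.0, sigma_exc=1.0,
                                a_inh=0.5, sigma_inh=3.0):
    """Lateral weights for an n x n toroidal SOM grid.

    Positive entries act as excitatory connections, negative entries as
    inhibitory ones (difference-of-Gaussians "Mexican hat"). All parameter
    values are illustrative assumptions, not taken from the paper.
    """
    coords = np.array([(r, c) for r in range(n) for c in range(n)])
    # Toroidal (wrap-around) distance between every pair of grid neurons.
    d = np.abs(coords[:, None, :] - coords[None, :, :])
    d = np.minimum(d, n - d)
    dist2 = (d ** 2).sum(axis=-1).astype(float)
    w = (a_exc * np.exp(-dist2 / (2 * sigma_exc ** 2))
         - a_inh * np.exp(-dist2 / (2 * sigma_inh ** 2)))
    np.fill_diagonal(w, 0.0)  # no self-connections
    return w

lateral_w = mexican_hat_lateral_weights(n=10)
print(lateral_w.shape)  # (100, 100): fully connected 10 x 10 grid
```

The sign of each entry then determines whether a connection acts as excitation or inhibition, which corresponds to the light/dark pattern visible in Fig. 2.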
B. Neuron model
The activation behaviour of SNN neurons can be described using a number of models from the literature. Two effective models from a hardware-design perspective are the Integrate-and-Fire (IF) model [6] and the Izhikevich model [7]. The IF model consists of an integrator, accumulating synaptic outputs and generating a spike when the aggregated value exceeds a certain threshold. Following a spike event, the model causes the neuron to hyperpolarize for a short duration known as the refractory period. This prevents the neuron from spiking again, in response to identical synaptic outputs, for the duration of refraction. This behaviour introduces non-linearity into the spiking response of neurons, and is critical in realizing network transfer functions that fit non-linear inputs. However, refraction limits the maximum spiking rate: a refractory period of 1 ms yields a saturating spike rate of 1 kHz, which is sufficient for the processing of bioelectric-timescale signals such as ECGs. Neuron implementations in CMOS can use techniques such as [21] to enable configurable refractory periods, as well as configurable voltage thresholds for spiking neurons. An additional means of realizing a non-linear response is the use of spike frequency adaptation, which can be achieved by using output spikes as negative feedback on the membrane capacitance [10], or by adapting the voltage threshold in response to output spikes [22].
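The IF behaviour described above (integration toward a threshold, an absolute refractory period, and spike-driven adaptation acting as negative feedback) can be sketched in a few lines. The time constants, threshold and adaptation step below are illustrative assumptions, not the parameters of the full-custom cell.

```python
import numpy as np

def lif_with_refraction(i_in, dt=1e-4, tau_m=20e-3, r_m=1e8,
                        v_th=1.0, v_reset=0.0, t_ref=1e-3,
                        adapt_step=0.0, tau_adapt=100e-3):
    """Leaky integrate-and-fire neuron with an absolute refractory period and
    a simple spike-frequency-adaptation current (all values are illustrative)."""
    v, i_adapt, ref_left = v_reset, 0.0, 0.0
    spike_times = []
    for k, i_k in enumerate(i_in):
        i_adapt *= np.exp(-dt / tau_adapt)   # adaptation current decays
        if ref_left > 0:                     # inside the refractory period
            ref_left -= dt
            v = v_reset
            continue
        v += (-v + r_m * (i_k - i_adapt)) * dt / tau_m
        if v >= v_th:                        # threshold crossing -> spike
            spike_times.append(k * dt)
            v = v_reset
            ref_left = t_ref                 # 1 ms refraction caps the rate at 1 kHz
            i_adapt += adapt_step            # spike-driven negative feedback
    return np.array(spike_times)

# Constant 15 nA drive for 2 s; the refractory period bounds the rate below 1 kHz.
spikes = lif_with_refraction(np.full(20000, 15e-9))
print(len(spikes), "spikes in 2 s")
```

With the 1 ms refractory period the spike rate saturates below 1 kHz regardless of the drive, reproducing the hard bound discussed above.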

The Izhikevich model, on the other hand, is a mathematical model that reproduces the complex spiking behaviour of biologically accurate neuron models such as Hodgkin-Huxley [5]. The response of the Izhikevich model varies based on the encoding type. When used with rate coding (average firing rate), the model exhibits a linear input-to-output relationship. However, with temporal coding (precise spike timing), the model exhibits a non-linear relationship between firing rate and spike timing. Thus, for rate coding with Izhikevich neurons, non-linear transfer functions can be realized only if the requisite non-linearity is introduced through the topology or in the synapses.

C. Synapses and Learning rules
Learning in SNNs refers to the modification of synaptic weights in response to inputs. Consequently, learning requires two elements: synapses, and an algorithm to update their weights. In this paper, we evaluate two learning algorithms, Spike Timing Dependent Plasticity (STDP) and Triplet STDP (TSTDP), both variants of the classical Hebbian learning rule [23]. STDP uses the difference in firing times of the pre- and postsynaptic neurons to determine the extent to which the synaptic weight is modified. The weight change Δw is given as:

Δw+ = f+(w) · A+ · exp(−Δt / τ+),   Δt > 0    (1)
Δw− = f−(w) · A− · exp(Δt / τ−),    Δt < 0    (2)

The parameters A+ and A− define the absolute amplitudes, and τ+ and τ− the widths, of the potentiation and depression STDP learning windows, respectively. Δt is the time difference between the postsynaptic spike and the presynaptic spike along a synaptic connection, defined as positive when the postsynaptic spike occurs later than the presynaptic spike. The function f(w) relates the weight change to the current synaptic weight, yielding a non-linear weight update function. Despite this, STDP suffers from the performance-limiting ping-pong effect: when the interspike interval approaches the duration of the STDP learning window, simultaneous potentiation and depression occur in the synapse, which leads to unreliable learning behaviour for high-frequency spike trains. This effect is mitigated in TSTDP, which relies on two spike pairs per postsynaptic spike in determining the difference in spike time [13]. TSTDP specifies the synaptic weight changes as:

Δw+ = f+(w) · (A2+ · exp(−Δt / τ+) + A3+ · exp(−Δt2 / τy)),   Δt > 0    (3)
Δw− = f−(w) · (A2− · exp(Δt / τ−) + A3− · exp(−Δt3 / τx)),    Δt < 0    (4)

As with STDP, Δt in these equations is positive when the postsynaptic spike occurs after a presynaptic spike. The parameters A2+ and A2− are the absolute amplitudes of the synaptic weight changes. An extra term is added to take a second spike pair into account: for potentiation this is Δt2, while for depression it is a different pair Δt3, as defined in [13]. The amplitudes of these triplet terms are A3+ and A3−, respectively, while their widths are governed by τy and τx. Note that setting A3+ and A3− to zero reduces this set of equations to that of basic STDP. The combination of the two pair terms allows for more reliable learning even at high spike frequencies, and is especially useful in rate-coded and rate-temporal hybrid systems where spike frequency constraints limit system behaviour. CMOS implementations of TSTDP exhibit additional non-linearity in the weight update function [24], and this is reported as being biologically accurate [11] as compared to pair-based STDP. However, TSTDP also has higher hardware costs, requiring the storage of two spike times as opposed to one for STDP.
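To make the update rules of Eqs. (1)-(4) concrete, the sketch below evaluates the pair-based and triplet weight changes for a single synapse. The saturating weight dependence f(w) and all amplitudes and time constants are placeholder assumptions; as noted above, setting the triplet amplitudes to zero recovers pair-based STDP.

```python
import numpy as np

def f_plus(w, w_max=1.0):    # assumed saturating weight dependence
    return w_max - w

def f_minus(w):
    return w

def stdp_dw(w, dt, a_p=0.010, a_m=0.012, tau_p=0.020, tau_m=0.020):
    """Pair-based STDP, Eqs. (1)-(2). dt = t_post - t_pre in seconds;
    depression is returned with a negative sign so it can be added to w."""
    if dt > 0:
        return f_plus(w) * a_p * np.exp(-dt / tau_p)
    return -f_minus(w) * a_m * np.exp(dt / tau_m)

def tstdp_dw(w, dt, dt2, dt3=0.0,
             a2p=0.005, a3p=0.006, a2m=0.007, a3m=0.0,
             tau_p=0.020, tau_m=0.020, tau_y=0.100, tau_x=0.100):
    """Triplet STDP, Eqs. (3)-(4). dt2 / dt3 are the extra pair intervals;
    a3p = a3m = 0 reduces this rule to pair-based STDP."""
    if dt > 0:
        return f_plus(w) * (a2p * np.exp(-dt / tau_p)
                            + a3p * np.exp(-dt2 / tau_y))
    return -f_minus(w) * (a2m * np.exp(dt / tau_m)
                          + a3m * np.exp(-dt3 / tau_x))

w = 0.5
print(stdp_dw(w, +0.005))              # potentiation: post spike 5 ms after pre
print(tstdp_dw(w, +0.005, dt2=0.020))  # the triplet term adds extra potentiation
```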
At the circuit level, synaptic weights are converted to synaptic currents through an integrator. For large arrays, the overheads imposed by the integrator can limit the achievable integration density. The differential-pair integrator (DPI) circuit [3] supports linear filtering, allowing the linear summation of multiple currents from identical synapses. The use of the DPI yields area savings and enables higher integration densities for SNNs, while the use of the sub-threshold operating region for the transistors yields currents in the picoampere range. Drive issues are overcome through the use of scaling blocks for the charge-phase response amplitude, eliminating the need for extra pulse-extender circuits.

Multiple options exist for the storage of the actual synaptic weights. Traditional capacitive storage employs bulky capacitors to lower the effect of leakage. While this is mitigated by the use of digital memories [25], these require current-mode ADCs and DACs for each neuron, adding complexity. Similarly, floating-gate memories [26] offer an effective means of long-term synaptic weight storage due to their non-volatility; however, the precise programming of synaptic weights with these is challenging. In addition, the synapse can incorporate further mechanisms for information storage. One of these is bistability [27]–[29], which, due to its low area and low power consumption, is a comparably efficient storage medium. During spiking events, synaptic weights drift towards one of two voltage rails depending on their value relative to the bistability threshold. In the absence of spikes, the weights are held constant. These dynamics lend great robustness to the state stored in the synapses against stochastic background events [12]. Furthermore, neural networks with two-state synapses have been shown to be effective for pattern classification tasks [30].

D. Input Encoding
Information presented to the SNN can be coded either as a firing rate (rate coding) or in the precise timing of spikes (temporal coding). Rate coding varies the frequency of the spike trains based on the magnitude of the input signal. Except for the initial spike in a train, the timing of the individual spikes carries no information. Consequently, this form of coding is incompatible with STDP learning rules that rely on spike timing. Temporal coding, on the other hand, relies on the precise timing of spikes to encode information. The latency of spike generation from the onset of an input stimulus varies inversely with the magnitude of the input signal: the higher the input value, the sooner the spike.
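A minimal sketch of the two encodings described in this subsection. For temporal (latency) coding, larger inputs produce earlier spikes; for rate coding, the spike count in a window scales with the input. The linear mappings, window length and maximum rate are assumptions for illustration, since the paper does not specify the encoder's exact transfer function.

```python
import numpy as np

def latency_encode(values, t_window=0.1, t_min=0.001, v_max=1.0):
    """Temporal (latency) coding: the higher the input value, the sooner the spike.
    Returns one spike time per channel, measured from stimulus onset."""
    values = np.clip(np.asarray(values, dtype=float), 0.0, v_max)
    return t_window - (t_window - t_min) * values / v_max

def rate_encode(value, t_window=0.1, f_max=200.0, v_max=1.0, seed=0):
    """Rate coding: the spike frequency, not the precise timing, carries the value."""
    rng = np.random.default_rng(seed)
    rate = f_max * min(max(value, 0.0), v_max) / v_max
    n_spikes = rng.poisson(rate * t_window)
    return np.sort(rng.uniform(0.0, t_window, n_spikes))

print(latency_encode([0.2, 0.9]))  # the 0.9 input spikes earlier than the 0.2 input
print(rate_encode(0.9))            # roughly 18 spikes expected in the 100 ms window
```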

For temporally coded systems, in addition to STDP, network delay modifications can also be used as a form of learning [31]. As in the case of the SOM, the delay influences which neurons respond to the spike train resulting from a presented input stimulus. The population of neurons that spike first forms a population temporal code, as illustrated in Figure 3. A short comparative summary of both encoding schemes is presented in Table I. In terms of efficiency, temporal codes offer a higher information capacity than rate codes for a given amount of energy.

Fig. 3. Example of population temporal coding in a SOM. The input neurons encode a value using temporal coding and need only a handful of spikes to encode any transmitted value. The grid neurons respond to this input after a certain number of input neurons have fired. The number of input-neuron spikes needed for activation determines the latency of the network, and can be modified by potentiation or depression of the synapses linking the input and grid neurons.

Fig. 4. Synaptic weight change over time for all connections from one input neuron to the SOM grid neurons. The initial weight distribution is uniform between 0.8p and 1.2p, and the maximum allowed conductivity is 2p. Due to the use of a saturating weight dependence function f(w), synaptic weights drift towards one of two limit values of either low or high conductivity. Note that all synaptic connections eventually participate in learning due to the correct selection of the initial weights.

TABLE I
COMPARISON OF ENCODING SCHEMES
Property    Spikes/Word   Energy/Word   WC Link Delay
Rate        n             n · E_spike   2 / min(f_f(value))
Temporal    1             E_spike       max(f_φ(value))

E. Training and Testing
The SNN is operated in two modes: training and testing. In the training mode, synaptic plasticity is enabled, allowing the network to adapt its synaptic weights according to the input signal. During the training phase, the SNN is exposed to data representative of what it could encounter in the testing phase. In this latter phase, the SNN is used for the classification of data, with synaptic plasticity disabled. During training, it is essential that the whole set of training data is presented to the classifier at least once; multiple iterations over this data improve the accuracy of the training. The ordering of the training data also impacts the learned function. For instance, to prevent the classifier from learning correlations between consecutive training data items, the training data sequence can be randomized across each iteration. The presentation of data to the SNN and the classification occur in two distinct stages: active and pause. During the active stage, input data is presented to generate representative spike trains. During the pause stage, these spikes are allowed to propagate through the network, and thus generate a classification output. The duration of these stages is determined by the topology and, thus, the latency of the network.
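The training and testing procedure above can be summarized as a small driver loop: shuffle the training set on every iteration, present each item during an active stage with plasticity enabled, then insert a pause stage so that activity can propagate and settle before the next item. The network interface used here (set_plasticity, present, pause) is a hypothetical sketch of such a driver, not an API defined in the paper.

```python
import random

def train(network, train_items, iterations=10, t_active=0.1, t_pause=0.2):
    """Training mode: plasticity enabled, randomized presentation order,
    and an active/pause stage per item (durations are assumed values)."""
    network.set_plasticity(True)
    for _ in range(iterations):
        order = list(train_items)
        random.shuffle(order)            # avoid learning inter-item correlations
        for item in order:
            network.present(item, duration=t_active)  # generate input spike trains
            network.pause(duration=t_pause)           # let spikes propagate and settle

def test(network, test_items, t_active=0.1, t_pause=0.2):
    """Testing mode: plasticity frozen; the classification is read from the
    output spike timing / rate returned by present()."""
    network.set_plasticity(False)
    outputs = []
    for item in test_items:
        outputs.append(network.present(item, duration=t_active))
        network.pause(duration=t_pause)
    return outputs
```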
III. EXPERIMENTAL RESULTS
This section reports experimental results for an SNN-based signal classifier, and presents insights gained from its design and configuration. We used a MATLAB simulation setup for high-level architectural exploration, and subsequently designed the relevant components in the UMC65nm low-leakage technology node.

A. Network Topology
The initial and maximum weights of the synapses determine the ability of the network to adapt to the input data. Especially in the case of the synapses between the input and grid neurons in Figure 1, a skewed distribution of weights results in the network evolving such that multiple output neurons respond to the same class of inputs. It is essential, therefore, for this distribution to be sufficiently spread out. The range of initial weights in the distribution also impacts the final outcome. In general, it is observed that the lower limit of the weight range must be adequately high to facilitate the stimulation of a sufficient number of neurons, yet prevent the activation of connections that are never re-used in the future. This behaviour is shown in Figure 4. Similarly, the upper limit of the range should be sufficiently low to prevent over-training of synapses on the early training data.
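A small sketch of the bounded uniform initialization discussed above for the plastic input-to-grid synapses. The bounds mirror the range quoted in the caption of Fig. 4, while m = 10 input neurons and a 10 x 10 grid are assumed for concreteness; the point is only that the lower bound is high enough to recruit grid neurons and the upper bound stays well below the saturation limit.

```python
import numpy as np

def init_input_to_grid_weights(n_inputs=10, n_grid=100,
                               w_low=0.8e-12, w_high=1.2e-12, seed=0):
    """Uniform random initial weights for the plastic input-to-grid synapses.
    Bounds follow the range in Fig. 4 (0.8p to 1.2p) and are illustrative."""
    rng = np.random.default_rng(seed)
    return rng.uniform(w_low, w_high, size=(n_inputs, n_grid))

w_init = init_input_to_grid_weights()
print(w_init.min(), w_init.max())   # all weights start inside the chosen band
```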

B. Neurons
According to the Universal Approximation Theorem, non-linearity is a critical requirement for neuron models to be able to approximate arbitrary input functions using a network of those neurons [32]. Figure 5(a)-(b) reports the transfer functions of the IF [6] and Izhikevich [7] model neurons. For rate-coded systems, simulations show that the Izhikevich model [7] has a linear input-to-output relationship across all configurations. The IF model with a refractory period, on the other hand, exhibits a non-linear, saturating exponential input-to-output relationship for both rate-to-rate and rate-to-timing transformations. For rate-to-timing transformations, the Izhikevich model can also be used due to its non-linear timing-to-frequency relationship.

Fig. 5. (a) Simulation of the Integrate-and-Fire (IF) model where the output spike rate and time to first spike are recorded for each input. This simulation clearly shows the non-linear rate-to-rate and rate-to-timing transfer functions. The refractory period was set to 1 ms, enforcing a hard upper bound of 1 kHz on the spike frequency. (b) Simulation of the Izhikevich RS model where the output spike rate and time to first spike are recorded for each input. This simulation clearly shows the linear behaviour of the rate-to-rate and the non-linear rate-to-timing transfer functions. Seven other Izhikevich models [7] showed comparable results. The x-axis spans a dimensionless abstraction of input current, relevant to the Izhikevich model.

Fig. 6. (a) Configurable refractory periods of the neuron cell. (b) Non-linear spiking behaviour.

We implemented a full-custom neuron cell based on a variant of the IF model [10] in UMC65nm, exhibiting non-linear behaviour through its use of leak conductances, as well as spike frequency adaptation through a negative feedback loop. The refractory period of the cell is configured through the voltage V_rfr, which controls the leakage current in the reset stage. Increasing the leakage current results in a shortening of the refractory period, as illustrated in Figure 6(a). The simulated spiking behaviour of the extracted layout is shown in Figure 6(b). The non-linearity in spike timing is 68%, with spike complexes ranging in duration from 0.86 ms to 1.26 ms.
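The transfer-function characterization of Fig. 5(a) can be reproduced by sweeping the input current and recording, for each level, the steady spike rate and the time to first spike. The sketch below uses a simple leaky IF neuron with a 1 ms refractory period; the current range and membrane parameters are arbitrary assumptions, so only the qualitative saturating shape is meaningful.

```python
import numpy as np

def lif_spike_times(i_const, t_sim=1.0, dt=1e-4, tau_m=20e-3, r_m=1e8,
                    v_th=1.0, t_ref=1e-3):
    """Spike times of a leaky IF neuron (1 ms refraction) driven by a constant current."""
    v, ref_left, spikes = 0.0, 0.0, []
    for k in range(int(t_sim / dt)):
        if ref_left > 0:
            ref_left -= dt
            continue
        v += (-v + r_m * i_const) * dt / tau_m
        if v >= v_th:
            spikes.append(k * dt)
            v, ref_left = 0.0, t_ref
    return spikes

# Rate-to-rate and rate-to-timing transfer functions, as in Fig. 5(a).
for i_in in np.linspace(1.2e-8, 8e-8, 5):   # arbitrary current sweep
    s = lif_spike_times(i_in)
    rate = len(s)                            # spikes per second over a 1 s run
    t_first = s[0] if s else float("nan")    # time to first spike
    print(f"{i_in:.1e} A -> {rate:4d} Hz, first spike at {t_first:.4f} s")
```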

C. Synapses and Learning rules
Figure 7 contrasts the synaptic weight changes of STDP and TSTDP synapses. The ping-pong effect of STDP is observed as small inverted spikes in the weight change trace at specific input frequencies, for the learning window used. Predictable synaptic weight changes are observed only up to a peak input frequency set by the learning window period. Furthermore, weight changes are observed to flatten out at higher frequencies due to the simultaneous potentiation and depression of the synapses. TSTDP, on the other hand, does not suffer from these pathologies. For pure temporal coding, basic STDP is a viable solution as long as the maximum spike frequencies are within the limits imposed by the learning window period.

Fig. 7. Comparison of weight change behaviour as a function of input frequency, for an IF neuron with STDP and TSTDP synapses. For this simulation, the first neuron produces spikes for one second at each input frequency, and the resulting weight change over the simulated time step is recorded. This result clearly shows the unreliable frequency dependence of STDP caused by the ping-pong effect at frequencies beyond the limit set by the learning window period. The sharp drop-off is caused by the 1 ms refractory period of the IF neurons, which causes the second neuron to ignore subsequent input spikes after it has fired.

Fig. 8. Pair-based learning window of the TSTDP circuit. Triangular marks are extracted from the TSTDP circuit, while the red curve shows the exponential fit of these data points. The generated curve matches the classic learning window reported in [11].

An important factor in the stability of any STDP approach is the configuration of the maximum facilitation amplitudes A+ and A−. In Figure 8, these parameters impact the area of the weight update curves during potentiation and depression. It is observed that stable learning is realized when the aggregate area of depression exceeds that of potentiation in the weight update function, as shown in the figure. On the contrary, weaker depression results in the extreme potentiation of synaptic weights and the eventual shorting of outputs to inputs. This behaviour prevents the realization of any practical network transfer function. These indications hold true even for TSTDP. TSTDP synapses, although well suited for temporal coding, can additionally support rate-coded systems due to the wide frequency range over which they produce weight changes, as observed in Figure 8. Basic STDP, on the other hand, constrains the maximum usable spike frequency to the limit imposed by the learning window used.

In addition to the individual synaptic parameters, the general learning rate of the network also impacts stability. Within the supported range, a low learning rate yields weight stabilization at sub-optimal minima, while a high learning rate causes over-learning of early presented data and results in the shorting of outputs to inputs. For network design, it is important to ensure network function stability over time. Although the reported evidence on the relation between limit stability and weight dependence can be contradictory [33], simulations such as the one in Figure 4 show that saturating weight dependencies are able to quickly generate strong connections and, at the same time, quickly depress unwanted connections and stabilize.
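For exponential learning windows, the stability condition stated above (the aggregate depression area exceeding the aggregate potentiation area) reduces to comparing A− · τ− against A+ · τ+, since the area under A · exp(−|Δt| / τ) is A · τ. A small check, with assumed amplitudes and widths:

```python
def stdp_window_areas(a_p, tau_p, a_m, tau_m):
    """Areas under the potentiation and depression branches of an exponential
    STDP learning window (the integral of A * exp(-|dt| / tau) over dt)."""
    return a_p * tau_p, a_m * tau_m

a_p, tau_p = 0.010, 0.020   # assumed potentiation amplitude and width
a_m, tau_m = 0.012, 0.020   # assumed depression amplitude and width
area_p, area_m = stdp_window_areas(a_p, tau_p, a_m, tau_m)
print("depression area:", area_m, "potentiation area:", area_p)
print("stable configuration (depression dominates):", area_m > area_p)
```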
Another method of ensuring stability over time is the use of bistability mechanisms [12].

Fig. 9. Synaptic weight change due to the temporal difference between post-spikes in a triplet. Third-order spike interactions are observed when the temporal difference is below 3 ms. Thereafter, weight modification is dominated by the spike pairs (kept at a constant 2 ms throughout the experiment).

Figure 9 reports the weight change induced in the synapse as a function of the temporal difference between two post-spikes in a triplet. The influence of spike pairs is negated in this analysis by using a fixed 2 ms temporal difference in all experimental runs. As observed, the closer the two post-spikes are, the larger the potentiation effected in the synapse. Such third-order spike interactions can be observed for temporal differences under 3 ms. Beyond this, the impact of the third-order spike interactions wanes, leaving the base potentiation caused by the 2 ms spike pairs, shown as the flat portion of the curve.
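Returning to the bistability mechanism mentioned at the start of this subsection, its weight dynamics can be sketched as a drift toward one of two rails during spiking activity, with the direction chosen by comparing each weight against the bistability threshold, and the weight held constant otherwise. The rails, threshold and drift rate are placeholder assumptions.

```python
import numpy as np

def bistable_drift(w, spiking, w_low=0.0, w_high=1.0, w_thr=0.5, rate=0.05):
    """One update step of bistable weight storage (illustrative constants).
    During spiking events weights drift toward the high rail if above the
    bistability threshold and toward the low rail otherwise; without spikes
    the weights are held constant."""
    w = np.asarray(w, dtype=float)
    if not spiking:
        return w
    target = np.where(w > w_thr, w_high, w_low)
    return w + rate * (target - w)

w = np.array([0.35, 0.62, 0.55, 0.12])
for _ in range(100):                 # sustained spiking activity
    w = bistable_drift(w, spiking=True)
print(np.round(w, 3))                # each weight settles near one of the rails
```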

Fig. 10. (a) SOM temporal response for an input signal varying in amplitude from 0 to 1 V in steps of 0.1 V. The output grid depicts the time to spike from the onset of the input stimulus. Black denotes no activity, while brighter values denote shorter spike times, i.e. faster spike generation. (b) SOM rate-coded response for the same signal. Brighter values denote higher spiking frequencies. Note that valuable information is encoded both in the initial spike time and in the spike rate.

D. Training and Testing
The non-zero response latency of the SNN means that each input to the classifier must be succeeded by a pause interval. In this interval, neurons are silenced and the network is allowed to present its full response to the input. For high-latency, multi-layer or feedback SNNs, it is essential that spiking activity has finished propagating through the entire network before a new training item is presented. Otherwise, the interaction between the spike responses to new and old data results in the incorrect modification of synaptic weights. In order to prevent the learning of such correlations, one can fully randomize the order of the input data provided, which ensures that no effect of previous data lingers in the system when the next input is provided. Alternatively, a hard reset can be executed after each exposure of a training item, erasing all membrane voltages and learning windows but preserving synaptic weights and connectivity. This, however, tends to suppress spiking behaviour in the network. The exposure time itself plays an important role in determining the extent of synaptic facilitation. The presentation of an input for an extended duration results in over-learning of that particular input. This manifests as the maximization of synaptic weights along specific paths, essentially creating an all-pass function from the input to the output.

Training is generally performed until the aggregate weight change of all synaptic connections in the network stabilizes. We observe three types of aggregate weight change behaviour over time for the network: convergent decrease, convergent stable, and random movement. In Figure 4, the aggregate weight change of all synaptic connections is high at the start of the simulation, decreasing steadily as training progresses and the weights stabilize. In the remaining two cases, the network fails to find a stable end point of graph connectivity. There are two ways in which we varied the total training time while observing similar network responses: increasing the exposure (and, if needed, pause) time per item, and increasing the number of iterations over the training set. For temporal coding, varying the training time per item means an integer number of repeated generations of the temporal sequence, while for spike rate coding it involves varying the amount of time for which a spike train of a certain frequency is applied. As with ordered exposure of training items, over-learning and short-circuiting have been observed to occur for extended exposures.

E. SOM Response
Figure 10 illustrates the temporal and rate-coded responses of the SOM classifier realized in this work, showing the distinct responses for each input and the grouping behaviour for neighbouring values. The developed SOM classifies the input signal based on amplitude into one of the output classes. Figure 10(a) depicts the temporally coded spatial response for each output class.
Despite its latency benefit, this type of coding provides little basis to discriminate between spatially similar responses, as observed for item 3 among others. Figure 10(b), on the other hand, provides a distinct rate code corresponding to the dominant output class, alongside the spatial response. This allows even spatially similar responses to be distinguished from one another. Clearly observed in this figure is the initial simultaneous spiking of the response neurons to a specific input, and their subsequent feedback activity causing more widespread activation of the neurons around them.
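Reading a decision out of the grid response discussed above can combine both codes: the earliest-spiking neurons give the spatial/temporal code of Fig. 10(a), while the per-neuron spike counts give the rate code of Fig. 10(b) used to separate spatially similar responses. A small sketch, assuming the response is available as a list of (neuron id, spike time) events; the event format and the choice of k are assumptions.

```python
import numpy as np

def decode_response(spike_events, t_window=0.2, k=5):
    """Decode a SOM grid response given (neuron_id, spike_time) events.
    Returns the k earliest-spiking neurons (population temporal code), the
    neuron with the highest spike count (dominant rate code), and all counts."""
    counts, first_spike = {}, {}
    for nid, t in spike_events:
        if t > t_window:
            continue
        counts[nid] = counts.get(nid, 0) + 1
        first_spike[nid] = min(first_spike.get(nid, np.inf), t)
    earliest = sorted(first_spike, key=first_spike.get)[:k]
    dominant = max(counts, key=counts.get) if counts else None
    return earliest, dominant, counts

events = [(42, 0.011), (43, 0.012), (42, 0.030), (43, 0.034), (42, 0.055), (7, 0.090)]
print(decode_response(events))   # ([42, 43, 7], 42, {42: 3, 43: 2, 7: 1})
```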

IV. CONCLUSIONS
In this paper, we explored a practical design trajectory for a spiking neural network-based SOM applied to the classification of bioelectric-timescale signals. We examined the architectural trade-offs in the selection of neuron models, learning rules and synapse architectures, and highlighted the role of non-linearity, synaptic weight distributions and the configuration of critical parameters in achieving stability. Furthermore, we provided insights into common performance pathologies, and detailed strategies to mitigate them during the configuration and training of the architecture. These insights were developed on the basis of high-level architectural explorations and the subsequent full-custom implementation of components.

REFERENCES
[1] W. Maass, "Networks of spiking neurons: The third generation of neural network models," Neural Networks, vol. 10, no. 9, pp. 1659–1671, 1997.
[2] G. Indiveri, B. Linares-Barranco, T. J. Hamilton, A. van Schaik, R. Etienne-Cummings, T. Delbruck, S.-C. Liu, P. Dudek, P. Häfliger, S. Renaud et al., "Neuromorphic silicon neuron circuits," Frontiers in Neuroscience, vol. 5, p. 73, 2011.
[3] C. Bartolozzi and G. Indiveri, "Synaptic dynamics in analog VLSI," Neural Computation, vol. 19, no. 10, pp. 2581–2603, 2007.
[4] M. R. Azghadi, N. Iannella, S. F. Al-Sarawi, G. Indiveri, and D. Abbott, "Spike-based synaptic plasticity in silicon: design, implementation, application, and challenges," Proceedings of the IEEE, vol. 102, no. 5, pp. 717–737, 2014.
[5] A. L. Hodgkin and A. F. Huxley, "A quantitative description of membrane current and its application to conduction and excitation in nerve," The Journal of Physiology, vol. 117, no. 4, pp. 500–544, 1952.
[6] W. Gerstner and W. M. Kistler, Spiking Neuron Models: Single Neurons, Populations, Plasticity. Cambridge University Press, 2002, Section 4.1.1.
[7] E. Izhikevich, "Simple model of spiking neurons," IEEE Transactions on Neural Networks, vol. 14, no. 6, pp. 1569–1572, 2003.
[8] R. Wang et al., "A compact aVLSI conductance-based silicon neuron," in Biomedical Circuits and Systems Conference (BioCAS), 2015 IEEE, Atlanta, GA, October 2015, pp. 1–4.
[9] J. Wijekoon and P. Dudek, "Compact silicon neuron circuit with spiking and bursting behavior," Neural Networks, vol. 21, pp. 524–534, 2008.
[10] N. Qiao et al., "A reconfigurable on-line learning spiking neuromorphic processor comprising 256 neurons and 128k synapses," Frontiers in Neuroscience, vol. 9, no. 141, 2015.
[11] G. Q. Bi and M. M. Poo, "Synaptic modifications in cultured hippocampal neurons: Dependence on spike timing, synaptic strength, and postsynaptic cell type," The Journal of Neuroscience, vol. 18, no. 24, December 1998.
[12] J. M. Brader, W. Senn, and S. Fusi, "Learning real-world stimuli in a neural network with spike-driven synaptic dynamics," Neural Computation, vol. 19, 2007.
[13] J.-P. Pfister and W. Gerstner, "Triplets of spikes in a model of spike timing-dependent plasticity," The Journal of Neuroscience, vol. 26, no. 38, September 2006.
[14] S. Song, K. Miller, and L. Abbott, "Competitive Hebbian learning through spike-timing-dependent synaptic plasticity," Nature Neuroscience, 2000.
[15] M. Oster and S.-C. Liu, "A winner-take-all spiking network with spiking inputs," in Proceedings of the 2004 11th IEEE International Conference on Electronics, Circuits and Systems (ICECS 2004), Dec 2004, pp. 203–206.
[16] W. Maass, T. Natschläger, and H. Markram, "Real-time computing without stable states: A new framework for neural computation based on perturbations," Neural Computation, 2002.
[17] T. Kohonen, "Self-organized formation of topologically correct feature maps," Biological Cybernetics, vol. 43, no. 1, pp. 59–69, 1982.
[18] T. Rumbell, S. L. Denham, and T. Wennekers, "A spiking self-organizing map combining STDP, oscillations, and continuous learning," IEEE Transactions on Neural Networks and Learning Systems, vol. 25, no. 5, pp. 894–907, May 2014.
[19] B. Ruf and M. Schmitt, Self-Organizing Maps of Spiking Neurons Using Temporal Coding. Boston, MA: Springer US, 1998, pp. 509–514.
[20] E. M. Izhikevich, "Polychronization: computation with spikes," Neural Computation, vol. 18, no. 2, pp. 245–282, 2006.
[21] A. van Schaik, "Building blocks for electronic spiking neural networks," Neural Networks, vol. 14, pp. 617–628, 2001.
[22] S. Mihalas and E. Niebur, "A generalized linear integrate-and-fire neural model produces diverse spiking behaviors," Neural Computation, vol. 21, no. 3, pp. 704–718, March 2009.
[23] D. Hebb, The Organization of Behavior. New York: Wiley & Sons, 1949.
[24] M. R. Azghadi, S. Al-Sarawi, D. Abbott, and N. Iannella, "A neuromorphic VLSI design for spike timing and rate based synaptic plasticity," Neural Networks, vol. 45, pp. 70–82, 2013, Neuromorphic Engineering: From Neural Systems to Brain-Like Engineered Systems.
[25] S. Moradi and G. Indiveri, "An event-based neural network architecture with an asynchronous programmable synaptic memory," IEEE Transactions on Biomedical Circuits and Systems, vol. 8, no. 1, pp. 98–107, 2014.
[26] P. Hasler, "Continuous-time feedback in floating-gate MOS circuits," IEEE Transactions on Circuits and Systems II: Analog and Digital Signal Processing, vol. 48, no. 1, pp. 56–64, 2001.
[27] G. Indiveri, E. Chicca, and R. Douglas, "A VLSI array of low-power spiking neurons and bistable synapses with spike-timing dependent plasticity," IEEE Transactions on Neural Networks, vol. 17, no. 1, pp. 211–221, Jan 2006.
[28] S. Mitra, S. Fusi, and G. Indiveri, "Real-time classification of complex patterns using spike-based learning in neuromorphic VLSI," IEEE Transactions on Biomedical Circuits and Systems, vol. 3, no. 1, pp. 32–42, 2009.
[29] E. Chicca, D. Badoni, V. Dante, M. D'Andreagiovanni, G. Salina, L. Carota, S. Fusi, and P. Del Giudice, "A VLSI recurrent network of integrate-and-fire neurons connected by plastic synapses with long-term memory," IEEE Transactions on Neural Networks, vol. 14, no. 5, pp. 1297–1307, 2003.
[30] D. J. Amit and S. Fusi, "Learning in neural networks with material synapses," Neural Computation, vol. 6, no. 5, pp. 957–982, 1994.
[31] E. Geoffrois, J.-M. Edeline, and J.-F. Vibert, Learning by Delay Modifications. Boston, MA: Springer US, 1994, pp. 133–138.
[32] K. Hornik, "Approximation capabilities of multilayer feedforward networks," Neural Networks, vol. 4, no. 2, pp. 251–257, 1991.
[33] A. Morrison, M. Diesmann, and W. Gerstner, "Phenomenological models of synaptic plasticity based on spike timing," Biological Cybernetics, vol. 98, no. 6, pp. 459–478, 2008.