Introduction

Neural interfacing electrode arrays are key devices for understanding the dynamics of the Central Nervous System (CNS) and for the development of neural prostheses for rehabilitation in cases of severe paralysis1,2,3,4,5,6,7,8,9. Studies have shown that the decoding accuracy of neural activity increases with the number of electrodes or cells recorded1,5,8,10. Therefore, strong efforts have been made to develop neural probes with high numbers of electrodes11,12,13,14,15 that produce large amounts of data. A key problem has become how to exploit the richness of these recordings, which is difficult to access using conventional methods. In particular, whether specific spatiotemporal patterns exist in multichannel neural data may not be obvious to determine, and the patterns underlying specific neural functions may be difficult to apprehend. To this end, fully unsupervised approaches that could extract patterns of neural dynamics from the large flow of data produced by neural implants would bring an invaluable perspective to better understand the brain dynamics that underlie behavior and to identify non-obvious behaviorally relevant neural features. This is all the more important because studying the activity of multiple individual neurons matters not only for the advancement of neuroprostheses but also for understanding how information is encoded by the CNS16,17. In this context, spike-sorting, which consists of isolating individual neuron activity from raw neural recordings, is a critical preprocessing step in analyzing neural data18,19,20. Several algorithms have been developed for this purpose but, due to their power consumption, they cannot be envisioned to be directly embedded into brain implants.
Therefore, the long-term objective of seamlessly integrating this neural data preprocessing step directly into implantable devices calls for new algorithms that are both efficient and online-compatible, while also being low in power consumption.

Over the past decade, formal deep neural networks (DNNs) have attracted unprecedented interest for learning patterns within large amounts of data21,22,23,24,25. This second generation of artificial neural networks (ANNs) is now extensively used and ubiquitous across many applications. Yet, despite their capabilities, they face two main drawbacks for embedding in neural interfaces. Firstly, their reliance on mainly supervised learning techniques such as backpropagation26 requires labeled datasets, posing a significant obstacle in applications where such labeled data is either limited or difficult to obtain, as in the case of large-scale neural recordings. Secondly, the computational demands of these networks often call for specialized hardware such as GPUs or TPUs to optimize their numerous parameters, which must be maintained in memory and learned through the minimization of a global loss function. Additionally, although state-of-the-art DNN architectures are constructed to handle sequences of information25,27,28, they inherently lack a true notion of time. Moreover, although self-supervised learning has been proposed to let DNNs encode input features automatically, further supervised decoding steps are required to match these embeddings to specific patterns. As a consequence, beyond their lack of energy efficiency, DNNs are not suitable candidates for fully unsupervised pattern extraction from ongoing multichannel temporal data in which ground truths are unknown. Therefore, alternative approaches are needed to eventually embed automated, very-low-power neural processing algorithms into future intelligent neural implants for real-time identification and extraction of complex features from large-scale neural recordings.

In this respect, spiking neural networks (SNNs) are neuromorphic ANNs that model the membrane potential of their neural elements29,30,31. SNN neurons communicate through discrete action potentials (spikes) and are connected by dynamic synapses. These spikes are sparse in time and thus provide a sparse computing framework for learning. SNNs typically rely on smaller numbers of parameters and integrate biomimetic plasticity rules found in living neural networks, such as spike-timing-dependent plasticity (STDP)32,33 or other post-synaptic rules. This third generation of ANNs is thus radically different from DNNs, as learning becomes local to each synapse and neural element, based on the dynamics of pre- and post-synaptic neurons. This removes the need for large global memory storage and energy-consuming global minimization. Based on their local learning rules, SNNs can self-configure in a fully unsupervised way, solely based on their inputs, to automatically recognize patterns hidden in the data34,35,36; and while they are also capable of supervised learning through surrogate gradient backpropagation37,38,39,40, they require less data for training than traditional DNNs41,42,43. Finally, SNNs are compatible with very-low-power neuromorphic hardware emulating spiking neurons44,45,46,47,48 and with novel materials integrating resistive memories that are well suited to emulate artificial plastic synapses at ultra-low power49,50,51,52,53.

A number of previous studies have shown that STDP-based SNN architectures could be used for pattern or object recognition within static images in supervised54,55,56, semi-supervised57, or fully unsupervised34,58,59,60 ways. Other architectures have been developed for supervised learning of patterns within a temporal signal such as speech61. Such a paradigm has further been extended to the supervised recognition or decoding of dynamic patterns within time-varying multivariate data such as EMG62, olfactory signals63, respiratory signals64, tactile braille reading65, EEG66, or intracortical data67,68. Toward unsupervised learning and classification of multivariate patterns, hybrid approaches have been developed, consisting of a self-organizing SNN trained in an unsupervised way to represent an audio input, followed by a supervised step to classify each representation into a sound class69,70. These networks classically have a feed-forward structure with several fully-connected layers, resulting in large numbers of parameters65, which can be reduced to some extent using convolutional layers55,69. As for the task of spike-sorting, although it has seen a significant rise in interest in recent years71,72,73, methods typically begin with a detection step, followed by often computationally expensive feature extraction and clustering steps. This approach may introduce latencies that limit real-time applicability but, more importantly, remains highly limiting from the perspective of embedding these methods at very low power in neural implants, particularly in high-channel-count recording devices. Given their unique features, SNNs thus constitute promising candidates for future very-low-power, fully unsupervised neural signal processing embedded within cortical implants. However, SNN architectures are typically dependent on the application for which they are designed and lack the versatility to answer the needs of a wide range of different applications.
In this context, fully unsupervised extraction and classification of time-varying patterns in multichannel neural recordings using frugal SNNs remains an unsolved problem.

In a previous study, we proposed an attention-based SNN to extract and automatically classify action potential shapes from single-channel extracellular neural signals74. The same SNN was later implemented in hardware using low-power FPGAs, demonstrating the network’s online-classification capabilities75. The next challenge, which we address in the present study, is to automatically process multiple neural signals simultaneously, where the activity of individual neurons is captured by multiple nearby electrodes, a situation corresponding to the problem of extracting multivariate temporal patterns in a fully unsupervised way76. In this respect, previous SNN works have tackled the specific case of time-varying visual scenes. In particular, efficient, fully unsupervised learning of spatio-temporal patterns corresponding to moving objects has been demonstrated using SNNs77,78. When searching for spatio-temporal patterns, the temporal information must somehow be captured by the network. In one of these studies, data from AER cameras could be automatically processed to count cars passing in each lane of a highway77. In this case, spatiotemporal patterns typically had similar dynamics, such as their duration in time and how fast they moved across pixels. Moreover, this strategy was based on the order of spikes within the input, so that the output neurons fired after only a few spikes of a pattern had been emitted35,36, without waiting for the whole pattern to end. As a consequence, such an approach typically fails when different spatiotemporal patterns are nested, for example when one pattern is exactly the beginning (along space, time, or both) of another, longer one, so that the two differ only by their endings. In another study78, feedforward connections between neurons with different membrane dynamics were used to achieve memory at different time-scales in order to learn temporal patterns with different dynamics.
An alternative strategy, which we employed in a previous study to automatically classify temporal patterns corresponding to different action potential shapes in an extracellular neural signal, was to use several synapses with different delays between two neurons74. The drawback of these approaches is the multiplication of the number of neurons and/or synapses. An alternative could be to segment the flow of temporal data into fixed-size frames and process each frame like an image; this, however, prevents fully online processing of the temporal data without any prior on the length of the patterns to search for. Another recently proposed approach is to learn synaptic delays79, but this requires supervised learning. To overcome these limitations, there is thus a need for very frugal architectures able to recognize multivariate temporal patterns in a fully unsupervised way and compatible with online processing of neural data streams.

Toward this goal, we propose here an SNN architecture dedicated to automatic multivariate temporal pattern extraction that contrasts sharply with classical architectures based on Leaky-Integrate-and-Fire (LIF) neurons80. Among the wide range of existing models of spiking neurons80,81,82,83,84, we used a variant of Low-Threshold Spiking (LTS) neurons84, whose dynamics automatically adapt to the temporal durations of input patterns without the need to multiply the number of synapses. A single layer of such neurons is connected to input spike trains by synaptic weights, and the network learns in a fully unsupervised manner through biological learning rules, namely STDP and Intrinsic Plasticity (IP). We show that, with only a handful of neurons, this strategy is efficient at recognizing highly overlapping multivariate temporal patterns, first on simulated data, then on Mel cepstral representations of speech sounds and on multichannel neural data, and finally that it can be used to perform spike sorting on multichannel synthetic and real single-unit neural data. These results are thus a step toward highly frugal SNNs for unsupervised learning of complex multivariate temporal patterns in multichannel neural data.

Results

Ideally, an SNN developed for unsupervised pattern identification and recognition should process incoming data fed sequentially and emit one output spike each time a specific pattern occurs in the data, with each different pattern eliciting spikes from a different neuron so as to allow direct inference without any further supervised step. The initial step of this procedure is to encode the continuous input data into spike trains, enabling SNNs to leverage the inherent event-driven nature of neuronal computation and capture temporal dependencies within the data. To this end, we considered two types of encoding methods. In the first one, the original data was quantized using a column of sensory receptive fields that generated spikes when the signal fell within the fields (Fig. 1a), a strategy we previously employed to encode extracellular signals for spike sorting using SNNs74. Five spikes were generated for each input value in order to increase the robustness of the encoding with respect to small fluctuations of the input. Using this approach, the resulting spike trains directly reflected the shape of the original data (Fig. 1b). Original audio data was decomposed into 24 continuous Mel cepstral signals (Fig. 1c, see Methods for details). Similarly, the multiunit activity of the neural data was binned and smoothed for each electrode, resulting in 30 continuous signals (Fig. 1d). In both cases, these input data were normalized between 0 and 1 and encoded into 24 spike trains using receptive fields (Fig. 1e,f). This initial encoding resulted in spike trains that encoded both background noise and relevant patterns. In order to mitigate the influence of noise on the STDP learning process, a short-term plasticity (STP) rule from our previous work was introduced for each input spike train, a mechanism that weakens input synaptic weights all the more as the presynaptic activity is high (see Methods)74,85.
With this strategy, only those spike trains encoding the peaks and troughs in the original data were retained (Fig. 1g,h). These final encoding spike trains were then passed as input into the network to be processed.
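As an illustration, the receptive-field encoding above can be sketched in a few lines of Python. This is a minimal sketch under the stated figures (20 quantization fields plus two extra spikes on each side, giving 5 spikes per timestep and 24 spike trains); the function and parameter names are ours, not from the authors' code.

```python
import numpy as np

def receptive_field_encode(signal, n_fields=20, halo=2):
    """Encode a 0-1 normalized signal into spike trains using
    quantization receptive fields (illustrative sketch).

    Each timestep activates the field containing the value plus
    `halo` fields on each side, i.e. 5 spikes per timestep and
    n_fields + 2 * halo = 24 spike trains with these defaults.
    Returns a binary matrix of shape (n_trains, len(signal)).
    """
    n_trains = n_fields + 2 * halo
    spikes = np.zeros((n_trains, len(signal)), dtype=np.uint8)
    for t, v in enumerate(signal):
        # Index of the central receptive field, shifted by `halo`
        # so the side spikes never fall outside the matrix.
        center = min(int(v * n_fields), n_fields - 1) + halo
        spikes[center - halo:center + halo + 1, t] = 1
    return spikes
```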

Fig. 1: Receptive-field-based encoding pipeline.

a Each signal was normalized and converted into 24 spike trains using quantization receptive fields. At each timestep, depending on the value of the signal, a spike was generated by one of 20 receptive fields equally spanning the 0–1 range of input values. Two additional spikes were generated both above and below the central spike, making a total of five spikes per timestep (receptive fields in red) and 24 spike trains per signal. b Example of the normalized Mel 1 corresponding to eleven French vowels and encoded into 24 spike trains. c Audio data decomposed into 24 Mel cepstral coefficients. d Multiunit neural data (middle) from embryonic mouse hindbrain spinal cord on the microelectrode array (MEA, left) was binned and smoothed to extract spike envelopes on each channel (right). e, f Initial encoding spike trains for audio and spike envelope of neural data, respectively. g, h Final encoding spike trains after short-term plasticity (STP) for audio and spike envelope of neural data, respectively. STP eliminated spikes corresponding to noise and only retained spikes indicative of a pattern. Some residual spikes corresponding to noise can be seen across some neural data channels, as these channels were very noisy.

In the case of spike sorting data, we used a different encoding technique that was more robust to noise and avoided the use of STP. Moving forward, in the context of spike-sorting, we shall use the term ‘simulated action potentials’ for action potentials simulated in the spike-sorting data, ‘real action potentials’ for action potentials of real retina neurons in the real spike-sorting data, and ‘artificial spikes’ for artificial action potentials produced by the LTS neurons. As illustrated in Fig. 2a,b, a delta encoding technique inspired by the AER technique86 was employed, which proved to be more robust for noisier data (see Methods). We tested two types of spike-sorting datasets: simulated data, and real data made available to test spike sorting algorithms. For the simulated data, an 8×8 MEA grid was placed in the vicinity of 6 neurons and artificial activity was simulated (Fig. 2c,d). The real spike-sorting datasets were taken from an open-source database87 of retinal ganglion cell recordings in mice, recorded with a 16×16 MEA (Fig. 2f,g). Each real dataset includes a juxtacellular recording of one retinal neuron, which provides the ground truth for the time occurrences of the action potentials of that particular neuron. For each dataset, we considered a subset of 8×8 electrodes surrounding this neuron. In all cases, data was encoded using the same delta-encoding technique to obtain the final encoding spike trains (Fig. 2e,h).
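The delta encoding can be sketched as follows, assuming a fixed threshold δ and k lagged comparisons per timestep; the names and default values here are illustrative, and the exact procedure is given in the Methods.

```python
import numpy as np

def delta_encode(signal, k=5, delta=0.05):
    """Delta (AER-inspired) encoding sketch: for each lag j in 1..k,
    emit a spike on train j-1 when signal[t] - signal[t-j] > delta,
    and on train k+j-1 when the change is below -delta.
    Returns a binary matrix of shape (2*k, len(signal))."""
    n = len(signal)
    spikes = np.zeros((2 * k, n), dtype=np.uint8)
    for t in range(1, n):
        for j in range(1, min(k, t) + 1):
            change = signal[t] - signal[t - j]
            if change > delta:
                spikes[j - 1, t] = 1        # positive-change trains
            elif change < -delta:
                spikes[k + j - 1, t] = 1    # negative-change trains
    return spikes
```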

Fig. 2: Spike-sorting data encoding pipeline.

a A sample sine wave is encoded. Spike trains 1–5 encode the positive delta changes between the current timestep \(t\) and previous timesteps \(t-1\) through \(t-k\), where \(k=5\). Similarly, spike trains 6–10 encode the negative delta changes between \(t\) and \(t-1\) through \(t-k\). A spike is generated in the respective spike train at time \(t\) if the signal change exceeds the defined threshold \(\delta\). b An example of a real action potential encoded with the same technique but with a \(k\) value of 10, resulting in 20 spike trains. c Visualization of the multielectrode array (MEA) grid and neuron positions. The filled blue circles represent the somata (cell bodies) of the six generated neurons. The red circles correspond to the electrodes of the 8×8 MEA grid. d Simulated MEA spike-sorting data with a zoomed-in region to illustrate simulated action potentials. e Examples of encoding spike trains for each of the six ground truths; each channel of the filtered data was encoded using the delta technique to obtain 1280 spike trains for the 64 channels. f Illustration of the 16×16 microelectrode array (MEA) grid used to record retinal ganglion cells in mice87, from which we considered an 8×8 subgrid around the electrode closest to the ground truth (GT) neuron. g Real MEA spike-sorting data with a zoomed-in region illustrating real action potential waveforms corresponding to the ground truth, spreading across multiple electrodes. h Each channel was encoded in the same way as the simulated data, giving 1280 final spike trains.

Shown in Fig. 3a is the final network architecture, which consisted of a single layer of LTS neurons connected to the input spike trains by negative synaptic weights (between -1 and 0) initialized randomly according to a uniform distribution. The LTS neurons processed the spikes from the input spike trains (presynaptic spikes) and sporadically produced output spikes (postsynaptic spikes) according to the dynamics of their membrane potentials. A Winner-Take-All (WTA) mechanism implemented across the LTS neurons ensured that at most one neuron emitted a spike at any given timestep. Learning in the network happened through biological learning rules, namely STDP and Intrinsic Plasticity (IP). For each postsynaptic spike emitted by a given neuron in the network, the STDP rule strengthened the synaptic weights between this neuron and all the spike trains that had a presynaptic spike within a certain coincidence time window, chosen based on the maximum length of the patterns to be searched for in the data. Another, lateral STDP rule governing lateral inhibition weakened the synaptic weights between all the other neurons and the same spike trains, thereby inhibiting other neurons from learning the same pattern (see Methods). The lateral STDP update was much weaker than the principal STDP update, considering that patterns may share common spiking activity. Additionally, for each postsynaptic spike output by a given neuron, the IP rule adapted the threshold of the neuron based on the size of the pattern learnt, giving the network the ability to differentiate between nested patterns. Figure 3b demonstrates the evolution of the membrane potential of an LTS neuron in the presence of multiple input spike trains. Each input spike inhibited the potential of the LTS neuron for a duration determined by the neuron’s membrane time constant.
Once the stimulus ended, the neuron, by the nature of its model, generated a rebound. At the network level, the neuron with the highest inhibition produced the steepest rebound and was accordingly chosen to emit an output spike (see Methods). The network thus took in multiple spike trains representative of the original data and processed these spikes to produce sporadic output spike trains that corresponded to patterns in the input data, all while learning through biological learning rules in a fully unsupervised manner.
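To make the inhibition-then-rebound behavior concrete, the following toy Python model caricatures a single LTS unit: inhibitory input hyperpolarizes the membrane, and once input ceases, the hyperpolarization is converted into a depolarizing rebound that can cross the firing threshold. The constants and the rebound rule are illustrative simplifications of ours; the actual neuron model is defined in the Methods.

```python
class LTSNeuron:
    """Toy low-threshold-spiking neuron (illustrative caricature).

    Negative input currents (from the negative synaptic weights)
    hyperpolarize the leaky membrane; when input stops, part of the
    hyperpolarization is flipped into a depolarizing rebound, so a
    spike is emitted only after the driving pattern has ended.
    """

    def __init__(self, tau=20.0, rebound_gain=1.5, threshold=5.0):
        self.tau = tau                    # membrane time constant (timesteps)
        self.rebound_gain = rebound_gain  # fraction of inhibition turned into rebound
        self.threshold = threshold        # firing threshold (adapted by IP in the paper)
        self.v = 0.0                      # membrane potential

    def step(self, input_current):
        # Leaky integration of the (non-positive) input current.
        self.v += (-self.v / self.tau) + input_current
        if input_current == 0.0 and self.v < 0.0:
            # No input this timestep: convert the accumulated
            # hyperpolarization into a depolarizing rebound.
            self.v = -self.v * self.rebound_gain
        if self.v >= self.threshold:
            self.v = 0.0  # reset after the output spike
            return 1
        return 0
```

Driving the neuron with a constant inhibitory current and then releasing it produces exactly one rebound spike, mirroring the single-output-spike-per-pattern behavior described above.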

Fig. 3: Network architecture.

a Inputs encoded as spike trains were passed into the network, consisting of only a handful of Low-Threshold Spiking (LTS) neurons. Learning happened through Spike-Timing-Dependent Plasticity (STDP) and Intrinsic Plasticity (IP) rules that enabled the LTS neurons to modulate their input synaptic weights and thresholds, respectively. A Winner-Take-All (WTA) mechanism in the LTS layer chose the neuron with the steepest rebound among the neurons that had crossed their respective thresholds to produce an output spike (see Methods), thereby ensuring that at most one neuron emitted a spike at any given timestep. b Illustration of the working mechanism of an LTS neuron in a simple case of 250 input spike trains (top: black = 0 or ‘no spike’; white = 1 or ‘spike’) feeding a single LTS neuron. The voltage of the LTS neuron (middle) is inhibited by the incoming example stimulus for a duration determined by the time constant of the neuron. Upon the end of the incoming stimulus, the neuron generates a rebound potential and emits a spike (bottom) when it crosses a threshold. With every postsynaptic spike, the voltages of all neurons were reset to 0. This behavior is ideal, as the neurons waited for a pattern to end and produced only a single output spike per pattern.
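The plasticity rules described above can be sketched as weight and threshold updates applied on each postsynaptic spike. Here we assume that "strengthening" means driving the winner's (negative) weights toward -1, i.e. stronger inhibition, which under the steepest-rebound WTA favors that neuron for the learned pattern; the sign convention, learning rates, and names are our illustrative assumptions, not the paper's exact rules.

```python
import numpy as np

def stdp_update(weights, recent_input, winner, lr=0.05, lateral_lr=0.005):
    """Sketch of the unsupervised update on a postsynaptic spike.

    weights: (n_neurons, n_inputs) array of values in [-1, 0].
    recent_input: indices of inputs that spiked within the
    coincidence window. winner: index of the neuron that fired.
    """
    w = weights.copy()
    # Principal STDP: pull the winner's active synapses toward -1
    # (assumed to mean stronger inhibition, hence a steeper rebound).
    w[winner, recent_input] += lr * (-1.0 - w[winner, recent_input])
    # Lateral STDP: weakly push the same synapses of all other
    # neurons toward 0 so they do not learn the same pattern.
    others = np.arange(w.shape[0]) != winner
    w[np.ix_(others, recent_input)] += lateral_lr * (0.0 - w[np.ix_(others, recent_input)])
    return np.clip(w, -1.0, 0.0)

def ip_update(threshold, n_active_inputs, gain=0.1):
    """Toy Intrinsic Plasticity: nudge the winner's threshold toward
    a value proportional to the size of the learned pattern."""
    return threshold + gain * (n_active_inputs - threshold)
```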

In order to establish a baseline for the network’s classification capabilities, the network was initially tested with artificial patterns that mimicked spike trains obtained from encoding spectrograms (Fig. 4). In this approach, artificial patterns mimicking unique frequency characteristics across 240 spike trains were repeatedly shown to the network. Starting with basic patterns and increasing their complexity iteratively helped us better troubleshoot the classification performance of the network. The most basic example of artificial patterns included four non-overlapping patterns (Fig. 4a). The network was trained on 50 repetitions ( = epochs) of these four patterns; it identified the four unique patterns, and each pattern was learned by a unique LTS neuron right from the beginning (Fig. 4b). Once all the input spikes were passed through the network, the output spike trains produced by the network were matched with the truth spike trains to obtain truth-output pairs and compute an evolving f-score along the learning process. For these simplest patterns, the network had a perfect f-score of 1 from the beginning, which can be attributed to the random initialization of the input weights to the LTS neurons and the absence of common spike trains between patterns.
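The truth-output matching and f-score can be computed, for instance, with a greedy coincidence-window matching like the sketch below; the paper's exact matching procedure is described in its Methods and may differ.

```python
def spike_fscore(truth_times, output_times, window=5):
    """Match output spikes to ground-truth spikes within a
    coincidence window (in timesteps) and compute the f-score.
    Greedy one-to-one matching; an illustrative sketch."""
    unmatched = list(truth_times)
    tp = 0
    for t_out in output_times:
        hit = next((t for t in unmatched if abs(t - t_out) <= window), None)
        if hit is not None:
            unmatched.remove(hit)  # each truth spike matches at most once
            tp += 1
    fp = len(output_times) - tp
    fn = len(unmatched)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0.0:
        return 0.0
    return 2 * precision * recall / (precision + recall)
```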

Fig. 4: Classification performance on artificial patterns.

a Case of four patterns that do not overlap in terms of frequency characteristics across 240 spike trains. b Four of the ten Low-Threshold Spiking (LTS) neurons in the network were able to identify and learn each of the patterns. The f-score was 1 throughout the training. c Case of patterns with two being nested inside two others. d Two LTS neurons each quickly encoded two overlapping patterns. Then, after about 40 presentations of the patterns, Intrinsic Plasticity helped the neurons adapt their thresholds so as to differentiate between the small and big patterns, and the f-score eventually reached 1.

In both neural and vocal datasets, it is not uncommon to encounter patterns that are embedded within each other. Therefore, we tested the network with another example of four patterns (also repeated 50 times), where two were subsets of two others in terms of frequency characteristics (Fig. 4c). At the beginning of the training, two LTS neurons each learned two input patterns, one nested inside the other (between epochs 7 and 38, see Fig. 4d). However, as learning continued, IP helped the neurons adapt their thresholds, thereby preventing them from spiking for the smaller patterns, which could then be learnt by two other neurons (the f-score eventually evolved to 1 as learning progressed).

After having established a baseline for the network’s classification capabilities with the artificial patterns, we then tested the network on real speech data consisting of eleven French vowels repeated 50 times by a native French speaker (Fig. 5a, top). The network, with eleven LTS neurons, was trained on 20 epochs of spike trains encoded from this data, and at the end of the training, the output spike trains and the truth spike trains were matched to obtain the best truth-output pairs. After a learning period of as few as 20 epochs, the output spikes produced by the network became coherent with respect to the ground truth sequence of produced vowels (Fig. 5a, bottom). The f-score obtained on the classification performance of the network on the final epoch was 0.92 (see also the corresponding confusion matrix shown in Fig. 5b). The performance remained stable after the 20th epoch if the data continued to be fed to the network. At the end of training, each LTS neuron had learned a unique vowel, as reflected by the final weights of the neuron, which were strong for the Mels corresponding to this vowel (Fig. 5c). Figure 5d illustrates the evolution of the synaptic weights of neuron 1 (see cyan spikes in the bottom raster of Fig. 5a and the first column of Fig. 5c) from random initial values to final values.

Fig. 5: Unsupervised recognition of vowels.

a Eleven French vowels were repeated several times and the encoding spike trains corresponding to one repetition are visualized. At the beginning of training, the output spikes produced by the network were random with respect to ground truth, whereas, at the end of training, each pattern was learned by a unique Low-Threshold Spiking (LTS) neuron. b Confusion matrix computed on the final epoch of training. c Final weights of the eleven LTS neurons after training. d Evolution of the weights of neuron 1 through time.

In a next step, we tested whether the network could also automatically identify simple spatiotemporal patterns in multielectrode array neural data. Neural activity was recorded in a whole embryonic hindbrain and spinal cord preparation, previously shown to exhibit rhythmic propagating waves of activity88,89. The multiunit spiking activity envelopes of all channels were encoded through the encoding mechanism illustrated in Fig. 1h. This preparation exhibited a short and a long spiking pattern that repeated in time. The short pattern had spiking activity propagating caudo-rostrally across the lower channels, covering the lower thoracic and lumbar/sacral region of the spinal cord, whereas the long pattern had spiking activity propagating rostro-caudally across all channels. The network was trained on 10 epochs of the encoded spike trains with five output neurons. After 5 epochs, two unique output neurons had learnt the short and long patterns (Fig. 6a), producing consistent and coherent output spikes with respect to the ground truth (f-score = 1). After completion of the 10 epochs, the final weights of the LTS neurons (Fig. 6b) confirmed the learning of the short pattern by neuron 3 and the long pattern by neuron 4, with, in both cases, strong weights reflecting the corresponding patterns.

Fig. 6: Unsupervised recognition of rhythmic activity patterns in multichannel neural data.

a Neural data consisted of 3 short and 9 long spiking patterns over a period of 1 h. The network was trained on 10 repetitions of this recording. As training progressed, output neurons 3 and 4 learnt to identify the short and long patterns, respectively. The f-score reached 1 after 5 epochs. b Final weights for each Low-Threshold Spiking (LTS) neuron confirming the learning, with neurons 3 and 4 having strong weights for the inputs corresponding to the short and long patterns, respectively.

In a final step, we evaluated the extent to which the proposed SNN could be used to perform fully unsupervised spike sorting. First, we considered simulated spiking data over 64 channels (see Methods). These spike trains encoded 6 action potential shapes from 6 different simulated neurons firing randomly in time for a total duration of 20 s. The network was trained on 10 epochs of the data but proved very quick to learn, with an f-score of ~0.9 after processing only 4 seconds of data in the first epoch (Fig. 7a). The f-score of the network plateaued around 0.92 from the second epoch onward. Figure 7b further illustrates each of the six ground truth spike trains and their matched output neuron spike trains. The trains of simulated action potentials were reconstructed with an f-score above 0.95 (0.97 on average) and a precision above 0.98 (0.99 on average) for the five ground truths with the highest SNR (mean SNR across channels between 1.67 and 2.33; best SNR on the channel with the highest amplitude between 14.5 and 32.2). For the 6th simulated neuron, which had the lowest SNR (mean SNR of 1.28 and best SNR of 11.0), the classification f-score was 0.66, but the precision remained high (0.94), indicating that although several action potentials were missed, when the occurrence of an action potential was predicted, this prediction was reliable. In a second step, we considered 7 real retina datasets made available to evaluate spike sorting algorithms (see Methods)87. Each dataset contains 5 minutes of data; we trained the network on one epoch of each of these recordings and evaluated its performance with an f-score computed over the last minute of the recordings. The SNN managed to perform the spike sorting task for the 6 out of 7 datasets with a high mean SNR for the ground truth action potential (Fig. 7c). For these 6 recordings, the ground truth and reconstructed spike trains are illustrated in Fig. 7d.
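For reference, one common way to compute such a per-channel SNR is the peak absolute amplitude of the unit's mean waveform over a robust (MAD-based) estimate of the channel noise, as sketched below; the paper's exact SNR definition is given in its Methods and may differ.

```python
import numpy as np

def waveform_snr(mean_waveform, noise_trace):
    """One plausible per-channel SNR (illustrative, not necessarily
    the paper's definition): peak absolute amplitude of the unit's
    mean spike waveform divided by a robust noise estimate.

    The median absolute deviation divided by 0.6745 approximates the
    standard deviation of Gaussian noise while being insensitive to
    the spikes themselves.
    """
    mad = np.median(np.abs(noise_trace - np.median(noise_trace)))
    noise_std = mad / 0.6745
    return np.max(np.abs(mean_waveform)) / noise_std
```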

Fig. 7: Unsupervised spike-sorting on simulated and real neural data.

a Evolution of f-scores with epochs of training on the simulated dataset. The inset shows the evolution of the f-score within the first epoch. b The six ground truth (GT) simulated action potential trains (blue) are shown together with the spike trains of their matched output Low-Threshold Spiking (LTS) neurons (orange) for the final epoch of the training. c F-score as a function of the mean Signal-to-Noise Ratio (SNR) computed across 64 electrodes for each ground truth waveform of the 7 real spike-sorting datasets. d The ground truth real action potential trains (blue) are shown together with the spike trains of their matched output LTS neurons (orange) for the last 5 s of each of the 6 datasets for which the SNN managed to perform spike-sorting with an f-score above 0.8.

These results were further compared to several spike-sorting algorithms gathered in the SpikeForest framework71. Out of the 10 classifiers presented in the SpikeForest framework, only 6 had reported classifications for all the recordings of the real spike-sorting dataset: HerdingSpikes2, IronClust, Kilosort, MountainSort4, SpykingCircus and Tridesclous. We thus compared our network with the worst and the best performances of these 6 classifiers, using the accuracy, precision and recall metrics as defined by SpikeForest. As illustrated in Fig. 8a, the classification accuracy and recall obtained with our SNN-based approach were overall inferior to those of the best spike-sorter but superior to the worst one for two datasets. The precision of the SNN classification was, however, generally comparable to that of the best spike-sorter and even higher in one case. To understand why the SNN failed for one dataset, we varied the δ threshold used in the encoding step (see Methods) and found that classification could be improved with values tuned appropriately for each dataset (Fig. 8b), indicating that the encoding approach plays an important role in overall SNN performance.

Fig. 8: Comparison of the SNN classification performance with respect to 6 spike-sorting algorithms.

For all methods, three metrics are considered: accuracy, precision and recall. For each recording and metric, the present SNN-based approach (red) is compared to the worst (blue) and best (green) performances across the 6 classifiers from SpikeForest. a Original data encoding for the SNN approach identical for all datasets. b Data encoding optimized for each dataset.

Discussion

The primary objective of this study was to propose and explore the capabilities of a minimal SNN architecture for unsupervised pattern classification in multichannel temporal data. In this work, we presented a very frugal single-layer SNN containing a very limited number of artificial neurons, yet capable of learning patterns in continuous streams of data through fully unsupervised biological learning rules. The same network architecture with few LTS neurons could classify multivariate patterns in four different types of data, namely simulated artificial patterns, audio data containing eleven different vowels, propagating waves of multiunit activity in an embryonic spinal cord, and simulated and real spike-sorting datasets. Although simplistic, the artificial data served as a baseline for the classification capabilities of our SNN, and especially to validate that the SNN could distinguish between two patterns, one lying fully inside the other, a separation that online unsupervised STDP-based SNN classifiers could not achieve when the size of the expected patterns remained unknown. Here, our algorithm achieved this separation thanks to the use of intrinsic plasticity. Moreover, the use of LTS neurons allowed the network to automatically wait for the end of each pattern before emitting a spike, without the need for computationally expensive delay synapses. The analysis of vowel classification revealed that the network remained robust across multiple instances of identical vowels, despite possible variability in their pronunciation. Classification on the spike envelopes of multiunit neural data was a proof of concept of fully unsupervised pattern recognition within multielectrode array data. Finally, the same network architecture was also able to extract action potential waveforms in a fully unsupervised and online-compatible mode in simulated and real spike-sorting datasets, with only a handful of neurons for several electrodes.

Throughout the presented simulations, we found that the quality of the data encoding was crucial for the classification performance of the SNN. We used two different types of encoding. The receptive field encoding method followed by STP ensured that the input spike trains contained spikes only during the presence of a pattern for the audio and multiunit neural data. In the case of single-unit neural data (spike-sorting data), this method yielded too few spikes given the rapid change in signal value of action potential waveforms, and the delta encoding method was found to be more robust. We evaluated the SNN performance on the spike-sorting task with a rule setting the δ threshold at each delay as a multiple of the median absolute deviation (MAD) of the signal changes at that delay. The same multiplier was used for all experimental recordings, which we found was not optimal, since better decoding accuracies could be obtained after optimizing the multiplier for each dataset. Developing a more robust encoding method could help the network further improve its classification capabilities.

The choice of LTS neurons as the spiking neuron model made it possible to use a single layer of only a few neurons, because each neuron waits for the pattern to end and produces one postsynaptic spike after the end of the pattern. As a result, one neuron produced one output spike per pattern, allowing easy inference. We showed that the combination of the LTS neurons and several biological plasticity rules resulted in each pattern requiring just one neuron to be learned. Existing research in the domain of unsupervised or even supervised classification with SNNs often involves multiple convolutional layers of LIF neurons to extract features from input data, thus typically involving several hundred thousand parameters that need to be learned77,78, precluding highly energy-efficient neuromorphic pattern learning. Such networks sometimes also involve an additional post-hoc classifier at the readout layer to classify the aforementioned features69,70. The learning paradigm also differs, as training is usually done non-sequentially by splitting the continuous data into frames of fixed duration, as opposed to the continuous learning in our case. Finally, handling temporal patterns has typically been solved using multiple synapses with different delays53,74,79. Here, thanks to the dynamics of LTS neurons, we propose a simple solution that avoids multiple synaptic delays and works sequentially, in a fully unsupervised way, on continuously incoming temporal data over multiple channels simultaneously.

A significant feature of the proposed algorithm and SNN architecture is its ability to discriminate between highly overlapping patterns, and in particular between patterns where one is completely included within another. This was possible thanks to the use of an IP rule that allowed the LTS neurons to adapt their thresholds to the size of the patterns. The network further depends on only a few critical parameters. The first is the LTS neuron time constant, which determines the inertia of the neuron before generating a spiking rebound and should thus be chosen according to the inter-spike intervals within the input spike trains and the expected interval between patterns. The second important parameter is the STDP lookback window, which determines the maximum duration of the patterns being searched for. If too short, it might prevent learning long patterns; if too long, exceeding the minimum interval between patterns, it might aggregate patterns. A limitation of the current implementation, however, is that patterns that are identical except for their duration cannot be learned separately. This could be addressed by implementing learnable time constants, whereby the neurons adjust not only their threshold but also the duration for which they remain inhibited.

With the advent of very dense neural probes generating very large amounts of neural data, there is currently no technological solution to perform spike-sorting directly within the probe, as this would require algorithms compatible with very low-power hardware. In this context, our results on the spike-sorting datasets aim at bridging this gap. The frugal SNN proposed in this study is a first proof of concept of fully unsupervised and online-compatible recognition of spatiotemporal neural patterns within multielectrode array data that could eventually be implemented in low-power hardware. In an earlier study74, we proposed a more complex SNN with two processing layers connected by many synapses and an attention mechanism, which could only process data from a single channel. Our current study proposes a significantly simpler and much more frugal single-layer architecture, yet one capable of processing multiple channels simultaneously. Beyond spike-sorting, the proposed SNN architecture could also be used to extract other types of spatiotemporal neural patterns. Indeed, neural data consist of two types of signals: spiking activity, reflecting the emission of action potentials, and local field potentials (LFPs), slower signals reflecting all other types of transmembrane neural currents and in particular synaptic activity. Here, we showed the network's capability to also handle slowly propagating neural data with the example of the propagating waves of activity in the developing embryonic spinal cord. This may open perspectives for the automatic detection and classification of other types of slow neural patterns collected by brain implants, a key topic for instance in epilepsy48. Eventually, such a frugal SNN processing scheme could benefit intelligent brain implants embedding fully unsupervised extraction of LFP and spiking activity for versatile applications.

Methods

Network architecture

LTS neurons

As illustrated in Fig. 3a, the network consisted of a single layer of a few LTS neurons connected to the input spike trains through negative synaptic weights, initialized randomly according to a uniform distribution and clipped to [-1, 0] at all times. LTS neurons, which are a type of Integrate-and-Fire (IF) neuron, have the property of being inhibited during the presence of a stimulus and of generating a rebound after the end of the stimulus (Fig. 3b). Therefore, as the spike trains were passed through the network, incoming currents (spikes × weights) due to the presence of a pattern hyperpolarized all the LTS neurons. Once the incoming currents stopped at the end of the pattern, the LTS neurons generated a potential rebound. A Winner-Take-All (WTA) mechanism then selected, among the neurons that had crossed their respective thresholds, the one with the steepest rebound to generate a postsynaptic spike. The LTS neurons were modeled by the following equations:

$${\tau }_{m}\frac{{dV}}{{dt}}=-V+q+g{I}_{{stim}}$$
(1)
$$\frac{{\tau }_{m}}{\varepsilon }\frac{{dq}}{{dt}}=-q+f\left(V\right)$$
(2)
$$\text{with }f\left(V\right)=\left\{\begin{array}{ll}{\alpha }_{n}V & \text{if }V < 0\\ {\alpha }_{p} & \text{if }V\ge 0\end{array}\right.$$
(3)

where \(V\) is the LTS neuron potential, \(q\) is an adaptation variable that triggers the rebound after inhibition, \({\tau }_{m}\) is the membrane time constant, chosen depending on the type of data, \(\varepsilon\) is a constant that makes \(q\) vary more slowly than \(V\), \({I}_{{stim}}\) is the stimulus current (spikes × weights) at the timestep, and \(g\) is a constant. Whenever the network produced a postsynaptic spike, both \(V\) and \(q\) were reset to 0 for all neurons. Table 1 lists the LTS neuron parameters used for the different types of data the network was tested on.

Table 1 Core parameters of the LTS neurons for the different types of data tested

The membrane time constant \({\tau }_{m}\) was chosen according to the size of the patterns expected in the input data. The artificial patterns and vowels data contained patterns that lasted about 500 ms on average. Neural data, on the other hand, contained patterns that lasted several seconds. The parameter \(\varepsilon\) was chosen according to the inter-pattern interval in the data.
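For illustration, the dynamics of Eqs. (1)-(3) can be sketched with a simple forward-Euler integration. The parameter values below, including a negative \({\alpha }_{n}\) so that inhibition charges the adaptation variable and produces a rebound, are assumptions of this sketch and not the values of Table 1:

```python
def simulate_lts(stim, tau_m=30.0, eps=0.2, g=1.0,
                 alpha_n=-2.0, alpha_p=60.0, dt=1.0, threshold=20.0):
    """Forward-Euler integration of Eqs. (1)-(3) for one LTS neuron.

    stim: list of input currents I_stim per timestep (negative while a
    pattern is present, since synaptic weights are negative).
    Returns the membrane potential trace and the output spike times.
    """
    V, q = 0.0, 0.0
    trace, spikes = [], []
    for t, I in enumerate(stim):
        f = alpha_n * V if V < 0 else alpha_p          # Eq. (3)
        V += dt / tau_m * (-V + q + g * I)             # Eq. (1)
        q += dt * eps / tau_m * (-q + f)               # Eq. (2)
        trace.append(V)
        if V >= threshold:                             # rebound reaches threshold
            spikes.append(t)
            V, q = 0.0, 0.0                            # reset after the spike
    return trace, spikes
```

Driving the neuron with a constant negative current followed by silence keeps the potential below zero during the stimulus and lets the rebound cross the threshold only after the stimulus ends.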

Plasticity rules

Learning took place whenever a postsynaptic spike was output by the network. At the occurrence of every postsynaptic spike, the following plasticity rules enabled the network to learn:

Classical STDP strengthened the synapses connecting the neuron that generated a postsynaptic spike to the input spike trains that exhibited spiking activity within a certain pre-time window, thereby implementing Long-Term Potentiation (LTP). It also weakened the synapses connecting the same postsynaptic neuron to the input spike trains that did not exhibit any spiking activity within that window, thereby implementing Long-Term Depression (LTD). We chose to implement a simple version of this rule, defined as follows:

$$\varDelta {w}_{{ij}}=\left\{\begin{array}{ll}{w}_{{LTP}}, & \text{if }\exists \,{t}_{i}\in {S}_{i}\text{ such that }{t}_{j}-{T}_{{STDP}} < {t}_{i}\le {t}_{j}\\ {w}_{{LTD}}, & \text{if }\nexists \,{t}_{i}\in {S}_{i}\text{ satisfying }{t}_{j}-{T}_{{STDP}} < {t}_{i}\le {t}_{j}\end{array}\right.$$
(4)

where \({w}_{{ij}}\) is the synapse connecting input spike train \(i\) and the LTS neuron \(j\) that spiked, \({t}_{i}\) is the time of occurrence of the presynaptic spike and \({t}_{j}\) is the time of occurrence of the postsynaptic spike, \({S}_{i}\) is the set of presynaptic spike times for the input spike train i, \({T}_{{STDP}}\) is the duration of the window preceding \({t}_{j}\) that determines the relevant temporal context for STDP, \({w}_{{LTP}}\) = -0.1 as we use negative weights, and \({w}_{{LTD}}\) = 0.06. \({T}_{{STDP}}\) was set to 500 ms for the artificial patterns and the vowel data, 8 seconds for the neural data, and 3.5 ms for the spike-sorting data.
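As a minimal sketch, the update of Eq. (4) with the above \({w}_{{LTP}}\), \({w}_{{LTD}}\) values and the [-1, 0] weight clipping can be written as follows (the dictionary bookkeeping, and the spike-sorting value of \({T}_{{STDP}}\) as default, are choices of this sketch):

```python
def classical_stdp(weights, pre_spikes, t_post, T_STDP=3.5,
                   w_ltp=-0.1, w_ltd=0.06):
    """Eq. (4): synapses from inputs that spiked in the lookback window
    (t_post - T_STDP, t_post] are potentiated (pushed more negative,
    since weights are inhibitory); all others are depressed.
    `weights` maps input index -> weight, `pre_spikes` maps input
    index -> list of presynaptic spike times. Weights stay in [-1, 0]."""
    for i in weights:
        active = any(t_post - T_STDP < t <= t_post
                     for t in pre_spikes.get(i, []))
        w = weights[i] + (w_ltp if active else w_ltd)
        weights[i] = min(0.0, max(-1.0, w))
    return weights
```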

Lateral STDP, a second STDP rule, governed lateral inhibition between LTS neurons. It weakened the synapses connecting all neurons other than the postsynaptic neuron that spiked to the input spike trains that exhibited spiking activity within the same pre-time window. This prevented multiple neurons from learning the same pattern. The update was, however, much weaker than that of the classical STDP rule, keeping in mind that patterns might share common spiking activity. For the postsynaptic neuron j that spiked, the lateral STDP rule is defined as follows:

$$\varDelta {w}_{{ij}}={w}_{{potentiation}},\,\text{if }\exists \,{t}_{i}\in {S}_{i}\text{ such that }{t}_{j}-{T}_{{STDP}} < {t}_{i}\le {t}_{j}$$
(5)

For all other postsynaptic neurons \(k\ne j\), the lateral STDP rule is defined as:

$$\varDelta {w}_{{ik}}={w}_{{inhibition}},\,\forall k\in N,k\ne j,\,\text{if }\exists \,{t}_{i}\in {S}_{i}\text{ such that }{t}_{j}-{T}_{{STDP}} < {t}_{i}\le {t}_{j}$$
(6)

where \({w}_{{ik}}\) is the synapse connecting input spike train \(i\) and each non-spiking postsynaptic neuron \(k\), \(N\) is the set of all postsynaptic neurons, \({w}_{{inhibition}}\,\)= 0.0002 and \({w}_{{potentiation}}\,\)= −0.001. This formulation drove the network towards a more selective and refined connectivity pattern based on the temporal spiking relationships.
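The two lateral updates of Eqs. (5) and (6) can be sketched together with the stated constants; the data layout (one weight list per neuron) is an assumption of this sketch:

```python
def lateral_stdp(weights, pre_spikes, j_winner, t_post, T_STDP=3.5,
                 w_potentiation=-0.001, w_inhibition=0.0002):
    """Eqs. (5)-(6): for every input active in the lookback window,
    slightly potentiate the winner j's synapse (Eq. 5) and slightly
    depress (make less negative) the same synapse onto every other
    neuron (Eq. 6). `weights[k]` is the weight list of neuron k."""
    n_inputs = len(next(iter(weights.values())))
    for i in range(n_inputs):
        if not any(t_post - T_STDP < t <= t_post
                   for t in pre_spikes.get(i, [])):
            continue                                   # input i inactive
        for k, w_k in weights.items():
            dw = w_potentiation if k == j_winner else w_inhibition
            w_k[i] = min(0.0, max(-1.0, w_k[i] + dw))  # clip to [-1, 0]
    return weights
```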

Intrinsic Plasticity, unlike STDP, is a form of plasticity implemented on the neurons rather than on the synapses connecting the spike trains to the neurons. It helped neurons adapt their thresholds to the size of the pattern learned. The thresholds of all output neurons were initialized at a low value to promote learning at the beginning of training; as training progressed, each neuron increased its threshold Th according to the size of the pattern learned, until reaching an equilibrium threshold indicative of that size. Every time a postsynaptic LTS neuron j emitted a spike, its threshold Thj was decreased as follows:

$${{Th}}_{j}\leftarrow {{Th}}_{j}-{F}^{\Delta {Th}}*{{Th}}_{j}$$
(7)

where \({F}^{\Delta {Th}}\) is a factor by which the threshold is decreased (see Table 2).

Table 2 Threshold update parameters of the LTS neurons for the different types of data tested

For each pre-synaptic spike received from the input spike train i within a coincidence time window before the post-synaptic spike of the LTS neuron j, the threshold of the LTS neuron j was increased by a value that was obtained by multiplying the synaptic weight \({w}_{{ij}}\) between the input train i and the LTS neuron j by the factor \({\Delta {Th}}_{{pair}}\) :

$${{Th}}_{j}\leftarrow {{Th}}_{j}+{\sum}_{i\in S}{{w}_{{ij}}*\Delta {Th}}_{{pair}}$$
(8)

where S is the set of spike trains with pre-synaptic spikes occurring within the coincidence window. The thresholds of all neurons were initialized at 20 and then were clipped between [20,3500] at all times.
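Combining Eqs. (7) and (8), the threshold update on a postsynaptic spike can be sketched as follows. The values of \({F}^{\Delta {Th}}\) and \({\Delta {Th}}_{{pair}}\) below are placeholders rather than the Table 2 values, and \({\Delta {Th}}_{{pair}}\) is taken negative so that multiplying by the negative synaptic weights yields the threshold increase described above (an assumed sign convention):

```python
def update_threshold(th_j, weights_j, active_inputs,
                     f_dth=0.01, dth_pair=-0.05,
                     th_min=20.0, th_max=3500.0):
    """Eqs. (7)-(8): on a postsynaptic spike of neuron j, the threshold
    is first decreased multiplicatively (Eq. 7), then increased once per
    coincident presynaptic input by w_ij * dTh_pair (Eq. 8), and kept
    within the clipping bounds [20, 3500] stated in the text."""
    th_j -= f_dth * th_j                                         # Eq. (7)
    th_j += sum(weights_j[i] * dth_pair for i in active_inputs)  # Eq. (8)
    return min(th_max, max(th_min, th_j))
```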

Encoding

The process of transforming multichannel data into spike trains is a pivotal step in training SNNs for learning tasks. The effectiveness of this encoding directly influences the network's ability to classify and interpret the data, since the encoding method determines how well the temporal and spatial dynamics of the data are captured and represented as spikes. Here, we encoded each channel of the data into a collection of spike trains while retaining the original geometry of the data. As shown in Fig. 1a, each channel was normalized and discretized into 20 receptive fields: the continuous signal, ranging from 0 to 1, was divided into 20 equal intervals representing the sensitivity of each receptive field. At each timestep, depending on the value of the signal, a spike was encoded by the field corresponding to the signal value. Two additional spikes were encoded on each side of the central spike (two above and two below), making a total of five spikes per timestep. There were therefore 24 spike trains representing each channel of the data. The artificial patterns did not require an encoding step as they already represented the final spike trains ready to be passed into the network.
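The receptive-field scheme above can be sketched for one normalized channel as follows (the binary-list layout and function name are choices of this sketch):

```python
def encode_receptive_fields(signal, n_fields=20, halo=2):
    """Receptive-field encoding of one normalized channel (values in
    [0, 1]). Each sample activates its receptive field plus `halo`
    neighbors on either side, i.e. 2*halo + 1 = 5 spikes per timestep
    across n_fields + 2*halo = 24 spike trains."""
    n_trains = n_fields + 2 * halo
    trains = [[0] * len(signal) for _ in range(n_trains)]
    for t, v in enumerate(signal):
        field = min(int(v * n_fields), n_fields - 1)   # interval index 0..19
        for k in range(2 * halo + 1):                  # central field +/- halo
            trains[field + k][t] = 1
    return trains
```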

Short-term plasticity

To ensure that learning by the LTS neurons was not driven by the background noise of all channels, we implemented a mechanism called Short-Term Plasticity (STP) that quickly suppressed all the spike trains corresponding to noise or silence. After the unwanted spike trains were suppressed, the retained spike trains were the ones encoding relevant signal information. To implement STP, we assigned a weight \({w}_{{STP}}\) to every spike train. This weight, initialized to 1 for all spike trains, represents the probability that the spike train encodes a signal. The input spike trains were subjected to STP before training and, as they were processed through time, the weights of the spike trains encoding noise quickly decreased. Once the weight of any spike train fell below 0.75, we stopped STP, mapped the weights of all spike trains below a certain threshold to 0, and set the others to 1. This threshold was 0.92 for the vowels and 1 for the neural data. Furthermore, for each group of 24 spike trains corresponding to a given signal, we checked whether at least 60% of the spikes were retained after STP; if not, the remaining spikes were also mapped to 0 in order to clean up residual spikes potentially corresponding to noise. STP is governed by the following equations:

$$\frac{d{w}_{{STP}}}{{dt}}=\frac{1}{{\tau }_{{stp}}}\left(1-{w}_{{STP}}\right)$$
(9)
$$\frac{d{w}_{{STP}}}{{dt}}=\frac{1}{{\tau }_{{\mbox{stp}}}}\left(1-{w}_{{STP}}\right)-{w}_{{STP}}*{f}_{d}$$
(10)

where \({\tau }_{{stp}}\) = 2000 ms is the STP time constant and \({f}_{d}\) = 0.003 is the depression factor. The first of these two equations is the weight update rule for spike trains that do not spike in the current timestep, and the second is the rule for spike trains that do. After STP, only the spikes belonging to the spike trains encoding relevant data beyond noise were retained (see Fig. 1g,h).
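A minimal sketch of this STP mechanism with the stated constants; the per-timestep discretization and the return format are choices of this sketch:

```python
def run_stp(spike_trains, tau_stp=2000.0, f_d=0.003, dt=1.0,
            stop_below=0.75, keep_above=0.92):
    """Eqs. (9)-(10): every spike train carries a weight w_STP relaxing
    toward 1 (Eq. 9) and depressed whenever the train spikes (Eq. 10).
    STP stops once any weight drops below `stop_below`; trains whose
    final weight is below `keep_above` are then suppressed."""
    w = [1.0] * len(spike_trains)
    for t in range(len(spike_trains[0])):
        for i, train in enumerate(spike_trains):
            dw = (1.0 - w[i]) / tau_stp                # recovery, Eq. (9)
            if train[t]:
                dw -= w[i] * f_d                       # depression, Eq. (10)
            w[i] += dt * dw
        if min(w) < stop_below:
            break                                      # stop STP
    keep = [wi >= keep_above for wi in w]
    return keep, w
```

A train that spikes on every timestep (noise) is rapidly depressed, while a sparsely spiking train keeps its weight near 1 and is retained.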

Vowel data

The vowels were recorded with a microphone (SHURE Beta 58 A) and the Audacity software at a sampling rate of 44.1 kHz. A native French male speaker was asked to repeat eleven French vowels 50 times. The recorded audio was subjected to a frequency transform using the SPTK library to obtain 25 Mel cepstral coefficients. The first Mel coefficient, which mostly reflects the amplitude of the sound and is thus not specific to the vowel pronounced, was discarded. The other 24 Mels were normalized between 0 and 1, smoothed, quantized and encoded as spike trains (see Fig. 1c,e,g) into an array of binary values. Prior to encoding, we chose to smooth the Mels with a sliding 2nd-order Butterworth filter below 5 Hz to make the network more robust to different occurrences of the same pattern. Unlike the artificial patterns, which had spikes only during the duration of each pattern and none before or after, the vowels' spike trains contained spikes corresponding to noise/silence across all Mels. Therefore, the encoded spike trains were first subjected to STP to eliminate spikes corresponding to noise and retain only those corresponding to peaks and troughs. These spike trains were then passed into the network as input.

Multiunit neural data

Neural data were reused from a previously published study89. They corresponded to rhythmic activity waves propagating across a whole embryonic OF1 mouse hindbrain-spinal cord preparation at stage E13, laid down on a 60-channel microelectrode array (Ayanda Biosystems, Lausanne, Switzerland) arranged as 4 columns of 15 microelectrodes (Fig. 1d, left). The detailed procedure to acquire these data has been described previously89 and was in accordance with protocols approved by the European Community Council and conformed to National Institutes of Health guidelines for the care and use of laboratory animals. In short, after dissection and removal of the meninges, the neural tissue was maintained on the electrode array with a custom net and continuously superfused with aCSF (in mM: 113 NaCl, 4.5 KCl, 2 CaCl2·2H2O, 1 MgCl2·6H2O, 25 NaHCO3, 1 NaH2PO4·H2O, and 11 D-glucose) at a rate of 2 ml/min. Neural data were acquired at 10 kHz using a MEA1060 amplifier from Multi Channel Systems (MCS), with ×1200 gain and 1-3000 Hz bandpass filters, connected to two synchronized Power 1401 acquisition systems (Cambridge Electronic Design Ltd, Cambridge, UK). Each channel was then bandpass filtered between 200 Hz and 2 kHz to retain high-frequency components (Fig. 1d, middle). Once filtered, we extracted multiunit activity by computing the mean and standard deviation of each channel and considering as spikes those datapoints at least 3 standard deviations above or below the mean. Each channel was then downsampled by a binning factor of 100, with each bin replaced by the total number of spikes it contained. Finally, a Gaussian kernel (n = 501, σ = 51 time bins) was convolved with each channel to obtain smoothed spike envelopes of the original neural data (Fig. 1d, right). These spike envelopes were then normalized between 0 and 1, encoded as spike trains and subjected to STP in a manner similar to the vowels.
The final spike trains obtained after STP were then passed into the network for learning.
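The multiunit preprocessing chain above (3-SD thresholding, binning by 100, Gaussian smoothing, normalization) can be sketched as follows; the function name and the final min-max normalization are choices of this sketch:

```python
import numpy as np

def spike_envelope(channel, bin_size=100, kernel_n=501, sigma=51.0):
    """Sketch of the multiunit pipeline: threshold the filtered channel
    at mean +/- 3 SD, bin the detected spikes, smooth the counts with a
    Gaussian kernel and normalize the envelope to [0, 1]."""
    x = np.asarray(channel, dtype=float)
    spikes = (np.abs(x - x.mean()) > 3.0 * x.std()).astype(float)
    n_bins = len(spikes) // bin_size                   # binning factor 100
    counts = spikes[:n_bins * bin_size].reshape(n_bins, bin_size).sum(axis=1)
    t = np.arange(kernel_n) - kernel_n // 2
    kernel = np.exp(-t ** 2 / (2.0 * sigma ** 2))      # Gaussian kernel
    env = np.convolve(counts, kernel / kernel.sum(), mode="same")
    rng = env.max() - env.min()
    return (env - env.min()) / rng if rng > 0 else env
```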

Spike-sorting data

For the simulated spike-sorting data, we used a script to simulate neural activity from a fixed set of neurons that were positioned randomly within a 3D volume that represented a layer of neural tissue overlaid on a multielectrode array (MEA) (see Fig. 2c). The electrical activity of each neuron was modeled using random cosine-Gaussian waveforms, which are commonly employed to capture the combined effects of periodic oscillations and spatially localized activation90. These waveforms were defined by the equation:

$$w(t)=A\cdot \cos \left(\frac{2\pi t}{{t}_{s}}+\phi \right)\cdot \exp \left(-{\left(\frac{2.3548t}{{t}_{g}}\right)}^{2}\right)$$

where \(A\) represents the amplitude of the waveform, \({t}_{s}\) is the time period of the cosine component, \(\phi\) is the phase shift, and \({t}_{g}\) is the width of the Gaussian envelope. This equation describes a cosine signal modulated by a Gaussian function, where the exponential term governs the rate of decay of the waveform over time.
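The waveform equation translates directly to code; the default parameter values below are arbitrary illustrative choices:

```python
import numpy as np

def cosine_gaussian(t, A=1.0, t_s=1.0, phi=0.0, t_g=1.0):
    """Cosine-Gaussian waveform w(t): a cosine of period t_s and phase
    phi, modulated by a Gaussian envelope of width t_g
    (2.3548 = 2*sqrt(2*ln 2), the usual FWHM conversion factor)."""
    return A * np.cos(2.0 * np.pi * t / t_s + phi) \
             * np.exp(-(2.3548 * t / t_g) ** 2)
```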

The real spike-sorting dataset was a publicly available MEA dataset dedicated to the validation of spike-sorting algorithms87. The dataset was obtained from simultaneous loose-patch and multielectrode array (MEA) recordings of mouse retinal ganglion cells. The MEA consists of a 16 × 16 grid of microelectrodes with an inter-electrode spacing of 30 μm, offering high-density extracellular recordings in which individual action potentials were typically detected by several neighboring electrodes. We considered an 8 × 8 subgrid of electrodes surrounding the juxtacellular neuron for which the ground truth was known (see Fig. 2f). Both the simulated and real spike-sorting data were sampled at 20 kHz and then filtered between 300 Hz and 3 kHz. Once filtered, the spike-sorting datasets were encoded by a delta encoding technique inspired by the previously proposed Address-Event Representation (AER) technique86. At each timestep \(t\), each signal is compared to its past values at \(t-i\) for \(i\) from \(1\) to \(k\). The signal is encoded into \(2k\) spike trains: \(k\) for positive and \(k\) for negative deltas. If the signal increased by at least a threshold \(\delta\) between \(t-i\) and \(t\) (i.e., signal\([t]\) − signal\([t-i]\ge \delta\)), a spike was generated at time \(t\) in the corresponding positive delta spike train. If the signal decreased by at least \(\delta\) between \(t-i\) and \(t\) (i.e., signal\([t-i]\) − signal\([t]\ge \delta\)), a spike was generated at time \(t\) in the corresponding negative delta spike train (see Fig. 2a,b). The threshold \(\delta\) was different for each channel and each \(i\) value. For each channel, the differences between all timesteps and their \({i}^{{th}}\) previous timestep were calculated. Then, the median of the absolute values of these differences was computed and multiplied by a global multiplier to finally obtain the \(\delta\) thresholds for each channel and each \(i\) value.
The value of this multiplier was 10 for the simulated dataset and 25 for the real dataset, indicative of the average SNR of each dataset. This encoding technique did not require the STP step as it produced spikes only in the presence of patterns. The delta-encoded spike trains were then passed into the network for learning.
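The delta encoding with per-delay median-based thresholds can be sketched for one channel as follows (the NumPy layout and function name are choices of this sketch):

```python
import numpy as np

def delta_encode(signal, k=3, multiplier=10.0):
    """Delta encoding sketch: each sample is compared to its i-th
    previous sample (i = 1..k); the threshold for delay i is the median
    absolute i-step difference scaled by a global multiplier. Returns
    2k binary trains: k positive-delta then k negative-delta trains."""
    x = np.asarray(signal, dtype=float)
    trains = np.zeros((2 * k, len(x)), dtype=int)
    for i in range(1, k + 1):
        diff = x[i:] - x[:-i]                      # signal[t] - signal[t-i]
        delta = multiplier * np.median(np.abs(diff))
        trains[i - 1, i:] = diff >= delta          # positive deltas
        trains[k + i - 1, i:] = -diff >= delta     # negative deltas
    return trains
```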

Inference and evaluation

To assess the classification performance of the network, we first matched the truth spike trains with the output spike trains to obtain truth-output pairs. To perform this matching, we convolved all truth and output spike trains with a Gaussian kernel (n = 31, σ = 3 time steps) and then computed the cross-correlation between each truth spike train and each output spike train. In the case of spike-sorting data, we performed the matching between the ground truth spike trains and the output neurons' spike trains by maximizing the number of correct hits.
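The Gaussian-smoothing and cross-correlation matching can be sketched as follows; the greedy per-truth-train argmax is a simplification of this sketch, not necessarily the exact assignment used:

```python
import numpy as np

def match_pairs(truth_trains, output_trains, kernel_n=31, sigma=3.0):
    """Matching sketch: smooth every truth and output spike train with a
    Gaussian kernel (n = 31, sigma = 3 time steps), then pair each truth
    train with the output train whose cross-correlation peak is highest."""
    t = np.arange(kernel_n) - kernel_n // 2
    kernel = np.exp(-t ** 2 / (2.0 * sigma ** 2))

    def smooth(s):
        return np.convolve(np.asarray(s, dtype=float), kernel, mode="same")

    outs = [smooth(o) for o in output_trains]
    return {i: int(np.argmax([np.correlate(smooth(tr), o, mode="full").max()
                              for o in outs]))
            for i, tr in enumerate(truth_trains)}
```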

For each pair of truth-output, the f-score was computed as:

$${F}_{{ij}}=\frac{2*{H}_{{ij}}}{{T}_{i}+{O}_{j}}$$

where \({T}_{i}\) was the number of spikes of the \(i\)th truth spike train, \({O}_{j}\) was the number of spikes emitted by the \(j\)th output neuron, and \({H}_{{ij}}\) was the number of output spikes coinciding with a truth spike within a coincidence window. The coincidence window was 400 ms for the artificial patterns and vowel data, 2.5 s for the multiunit neural data, and 4 ms for the spike-sorting data. These values corresponded to the time needed by an LTS neuron to generate its rebound and cross its threshold. We also computed a global f-score across all truth neurons and all output neurons as:

$$F=\frac{2*H}{T+O}$$

where \(T\) was the total number of truth spikes, \(O\) was the total number of output spikes and \(H\) was the total number of hits. In the case of the vowels, a confusion matrix was also computed to evaluate the classification performance of the model.
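The pairwise f-score with a coincidence window can be sketched as follows; the greedy one-to-one matching of hits is an assumption of this sketch:

```python
def f_score(truth_spikes, output_spikes, window=4.0):
    """Pairwise f-score F_ij = 2*H_ij / (T_i + O_j): H_ij counts output
    spikes falling within the coincidence window of a still-unmatched
    truth spike (each truth spike counted at most once)."""
    truth = sorted(truth_spikes)
    used = [False] * len(truth)
    hits = 0
    for o in sorted(output_spikes):
        for i, t in enumerate(truth):
            if not used[i] and abs(o - t) <= window:
                used[i] = True                 # each truth spike matches once
                hits += 1
                break
    denom = len(truth) + len(output_spikes)
    return 2.0 * hits / denom if denom else 0.0
```

With the 4 ms window used for spike-sorting data, a perfect one-to-one correspondence yields an f-score of 1.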

When comparing our SNN to other spike-sorting algorithms of the SpikeForest framework, we used the metrics used on the SpikeForest platform71 (https://spikeforest.flatironinstitute.org/metrics).

Reporting summary

Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.