Introduction

The energy consumption of traditional digital circuits is linked to advancements in scaling technology and the manipulation of supply voltages1. As the progression of Moore’s law begins to stagnate, the search for novel computing methodologies has intensified. Among these, analog computing has re-emerged as a significant area of interest. The interest in analog computing is driven by the need for energy-efficient solutions in continuous-time signal processing, a domain where analog techniques excel due to their lower power requirements and inherent parallelism. These attributes position analog computing as a transformative technology, particularly in applications constrained by power availability, such as edge computing devices and biomedical instrumentation2.

Historically, analog computing was primarily utilized in circuits designed for specific applications, such as front-end circuits interfacing with sensors3. Examples include neural interfaces4, biosensing readouts5, and low-power readouts for MEMS6. Recent work has expanded the use of analog circuits for computation, demonstrating their applicability to basic arithmetic operations such as addition and multiplication, nonlinear computations, and complex operations such as solving differential equations7,8. This study employs translinear circuits9 to showcase various analog operations while consuming very little power. Field-Effect-Transistor-based translinear circuits operating in the subthreshold domain have shown high energy efficiency10.

Although analog computing is noted for its energy efficiency, it suffers from mismatch and variation issues. Mismatch refers to the time-independent random variations in physical properties of identically designed devices11. Various techniques are employed to mitigate mismatches, including using larger devices, common centroid layouts, and symmetrical designs. From a circuit design perspective, strategies such as chopping, correlated double sampling, and auto zeroing are used to reduce mismatch12.

In this study, we utilize programmable translinear circuits to mitigate the variations typically associated with device fabrication. Specifically, we deploy Floating Gate (FG)-based Multiple Input Translinear Element (MITE) devices, enabling a broad spectrum of analog operations13. This research demonstrates essential analog functions such as multiplication, division, and squaring operations. We also demonstrate complex analog computations such as neural decoding. Traditionally, decoding neural data in neural implants has required multiple ADCs in the front-end or a multiplexed ADC system to digitize multiple neural channels, leading to significant power consumption, particularly in devices that employ several hundred electrodes14. Our approach, in contrast, uses MITE-based adaptive circuits to decode broadband intracortical neural data from non-human primates performing reach-to-grasp tasks. By integrating analog computations directly into the front end, we eliminate the need for an ADC for each neural channel, significantly enhancing processing efficiency and reducing system complexity. Figure 1 shows the overall concept of neural decoding and the use of analog computation for mapping neural data to kinematics. The analog adaptive filtering technique presented in this paper requires neither an ADC per electrode nor a DSP during inference. The mixed-signal system presented here could therefore remove, or substantially reduce, the number of ADCs needed to infer kinematics from neural data.

Fig. 1: Overview of the mixed signal system presented in this paper with a comparison to the digital processing chain.
figure 1

The mixed signal system only requires the digital processor to be active at the time of weight updates in the analog adaptive filter, minimizing the overall power consumption. Further, the proposed system uses fewer ADCs than a fully digital processor. The brain is adapted from “Human Brain.png” by Injurymap, Wikimedia Commons, CC BY 4.0. Modifications were made.

Neural data exhibits significant day-to-day variability, which presents a substantial challenge for maintaining consistent decoding accuracy15. To address this, we have developed an on-chip non-linear learning algorithm that dynamically adjusts the weights on the MITE devices. This capability enables quick calibration against the daily variations in neural data, significantly improving the resilience and accuracy of neural decoding. Such adaptive learning is crucial for the long-term, reliable operation of neural implants, as it compensates for both variations in neural data and mismatches inherent in analog circuits. All circuits were prototyped and evaluated using a Field Programmable Analog Array (FPAA)16, fabricated using a 350 nm CMOS process, which demonstrates the feasibility and effectiveness of our approach.

The paper is organized as follows. Section “Floating Gate Multiple Input Trans-linear Element (FG MITE)” describes the Floating Gate MITE and shows how this element is used in analog computations such as addition and multiplication. Section “Analog Adaptive Filtering” presents an adaptive filtering system using the MITE.

Floating Gate Multiple Input Trans-linear Element (FG MITE)

Floating-gate-based Multiple Input Translinear Elements (MITEs) are Field-Effect Transistors (FETs) whose polysilicon gate is surrounded by an insulator such as silicon dioxide. Because the gate is electrically isolated, no DC path exists from any input terminal to the actual gate of the transistor. Multiple inputs are coupled into the floating node via capacitors, as seen in Fig. 2a. Each of these inputs linearly affects the transconductance of the device. In this study, two-input MITEs are used, with inputs coupled via capacitors C1 and C2, while the third input V3 is held constant.

Fig. 2: Overview of floating Gate MITE device and I-V relationships.
figure 2

a Symbol for a p-channel N-input Floating Gate Multiple Input Trans-linear Element (FG MITE). b Id vs Vg relationship of the FG device with different charges stored on the device.

To capture the non-linear behavior of the FG-based MITE devices, this study uses the EKV model17, which accurately models all regions of operation, both subthreshold and above threshold, and provides a smooth transition between the two. The general form of the EKV model for a floating-gate MITE device with n inputs is given by (2), where VFG is given by (1).

The model allows designing algorithms in software while accounting for the non-linearity observed in the MITE elements. Further, it enables comparing the hardware results with the ideal results obtained in software:

$${V}_{FG}\propto {V}_{F{G}_{prog}}+{C}_{1}{V}_{1}/{C}_{T}+{C}_{2}{V}_{2}/{C}_{T}+\ldots +{C}_{n}{V}_{n}/{C}_{T}$$
(1)
$$\begin{array}{rcl}{I}_{sd}&=&{I}_{th}W/L\left({\ln }^{2}\left(1+{e}^{(\kappa ({V}_{b}-{V}_{FG}-{V}_{TP})-({V}_{b}-{V}_{s})+\sigma ({V}_{b}-{V}_{d}))/2{U}_{T}}\right)\right.\\ &&\left.-{\ln }^{2}\left(1+{e}^{(\kappa ({V}_{b}-{V}_{FG}-{V}_{TP})-({V}_{b}-{V}_{d})+\sigma ({V}_{b}-{V}_{s}))/2{U}_{T}}\right)\right)\end{array}$$
(2)

Figure 2b shows measurements performed on a MITE device and their fit to equation (2) for different programming voltages (\({V}_{F{G}_{prog}}\)). In these equations, VTP represents the threshold voltage of a pMOS transistor, Ith denotes a device-specific current parameter, and W and L are the width and length of the pMOS device, respectively. The thermal voltage is represented by UT, while CT denotes the total capacitance at the floating node, as illustrated in Fig. 2a. The input voltages V1, V2, and V3 are coupled through capacitors C1, C2, and C3, respectively. The stored charge on the floating node is represented by \({V}_{F{G}_{prog}}\). A detailed description of each device parameter can be found in refs. 13,18,19.
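As a behavioral reference, Eqs. (1) and (2) can be transcribed into software; every parameter value below (κ, UT, VTP, Ith, σ, capacitances) is an illustrative assumption rather than an extracted device parameter:

```python
import numpy as np

# Sketch of the FG MITE EKV model from Eqs. (1)-(2); placeholder parameters.
KAPPA = 0.7      # gate coupling coefficient (assumed)
U_T = 0.0258     # thermal voltage at room temperature [V]
V_TP = 0.7       # pMOS threshold voltage magnitude [V] (assumed)
I_TH = 1e-7      # device-specific current parameter [A] (assumed)
SIGMA = 0.01     # drain coupling coefficient sigma (assumed)

def v_fg(v_prog, v_inputs, c_inputs, c_total):
    """Floating-gate voltage, Eq. (1): programmed charge plus capacitive coupling."""
    return v_prog + sum(c * v for c, v in zip(c_inputs, v_inputs)) / c_total

def i_sd(v_fg_val, v_s, v_d, v_b, w_over_l=1.0):
    """Source-drain current of a pMOS FG device, Eq. (2) of the EKV model."""
    def half(v_x, v_y):
        arg = (KAPPA * (v_b - v_fg_val - V_TP) - (v_b - v_x) + SIGMA * (v_b - v_y)) / (2 * U_T)
        # log1p(exp(.)) is the numerically safe form of ln(1 + e^arg)
        return np.log1p(np.exp(np.clip(arg, -500, 500))) ** 2
    return I_TH * w_over_l * (half(v_s, v_d) - half(v_d, v_s))
```

A fitting routine can sweep `v_fg_val` against measured Id-Vg data (as in Fig. 2b) to extract the device parameters.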

Programming FG MITE

The stored charge on the floating node can be precisely controlled by adding electrons to the floating node via hot-electron injection and removing trapped electrons using Fowler-Nordheim tunneling, as described by ref. 20. This process affects \({V}_{F{G}_{prog}}\).

In Fowler-Nordheim tunneling, the voltage across the tunneling capacitance Ctun is increased. This raises the electric field across the oxide, allowing trapped electrons to escape through the barrier. As a result, the net charge stored on the floating gate becomes more positive, which for a pMOS device reduces its conductance.

Hot-electron injection in MOSFETs occurs when a sufficiently large source-to-drain voltage is applied while an adequate current flows through the device21. This process adds electrons to the floating gate, making the net stored charge more negative and thereby increasing the conductance of the pMOS device. The silicon dioxide insulator acts as an energy barrier that traps the electrons on the polysilicon floating gate, ensuring long-term, non-volatile storage: the trapped electrons remain on the floating node without external power.

Hot-electron injection is widely used for fine programming due to its lower variability and higher resolution compared to Fowler-Nordheim tunneling. While tunneling is used to erase the FPAA device globally, hot-electron injection is utilized for data storage on the FG MITE device, which can be employed in analog circuits.

In this work, the weights are fine-programmed using hot-electron injection, chosen for its efficiency and the ability to program individual floating gate devices effectively.

Basic arithmetic operation

Addition

Current-mode addition can be performed by simply connecting the drain terminals of the PMOS devices together, as shown in Fig. 3a. Each PMOS device should have Vsd > 0.2 V to remain in subthreshold saturation, where the drain current becomes independent of Vsd. If an application needs to source the summation current, an n-channel current mirror can be added at the output, with the summed current driving the bias branch of the n-mirror.

Fig. 3: Overview of current domain addition and subtraction circuits.
figure 3

a Addition circuit using FG PMOS; the number of additions can be increased by adding more FG PMOS devices to the same node. b Subtraction utilizes an n-mirror and FG PMOS devices.

Subtraction

Similar to the addition circuit, multiple current sources can be connected at the bias and output terminals of the n-channel current mirror, as shown in Fig. 3b. The resulting output current equals the difference between the bias current and the current fed into the output terminal.

Multiplication in current domain

Multiplication is a crucial operation in analog circuits, particularly in signal processing and machine learning applications. FG MITE devices enable multiplication in the form I ∝ I1 × I2, which is essential for the weighted multiplications often required in neural network computations.

The general form of the EKV model for an FG MITE, given in (2), can be simplified for a two-input FG MITE under the conditions VFG < VTP and Vsd > 4UT, i.e., the subthreshold saturation region. In this model, α denotes the proportionality constant, Vprog corresponds to the stored charge, VDD represents the source voltage, and \({w}_{i}=\frac{{C}_{i}}{{C}_{T}}\), where Ci is the input capacitance of the i-th input.

$${I}_{sd}={I}_{th}\frac{W}{L}{e}^{\frac{\kappa ({V}_{DD}-\alpha ({w}_{1}{v}_{1}+{w}_{2}{v}_{2}+{V}_{prog}))}{{U}_{t}}}$$
(3)

Since the parameters α and VDD are constant for a device, (3) can be rewritten by combining καwi as Wi and \({I}_{th}\frac{W}{L}{e}^{\frac{\kappa ({V}_{DD}-\alpha {V}_{prog})}{{U}_{t}}}\) as Io(Vprog):

$${I}_{1}={I}_{o}({V}_{pro{g}_{1}}){e}^{\frac{-\kappa ({W}_{1}{v}_{1}+{W}_{x}{v}_{x})}{{U}_{t}}}$$
(4)
$${I}_{1}={I}_{o}({V}_{pro{g}_{1}},{v}_{1}){e}^{\frac{-\kappa ({W}_{x}{v}_{x})}{{U}_{t}}}$$
(5)

Three FG MITEs are connected as shown in Fig. 4a: two MITEs serve as inputs for the multiplication, and the output is produced by a third MITE whose gates are connected to the input FG MITE blocks. The current through the two input MITEs is given by (5), and the output current by (6). Substituting the current equations from (5) into (6) yields the multiplication form Iout ∝ I1 × I2, as shown in (7). This output current can be used directly in multiplication calculations and, together with the addition circuit, to perform multiply-accumulate operations.

Fig. 4: Overview of multiplication circuit and its results.
figure 4

a Multiplication circuit realization using two input floating gates. b Experimental results from multiplication circuit.

Figure 4b presents results from the hardware, showing the linearity of the multiplication for input currents I1 and I2 in the range from 10 nA to 50 nA. The precision of the multiplication results is limited by the driving and measuring circuits.

$${I}_{out}={I}_{o}({V}_{pro{g}_{out}}){e}^{\frac{-\kappa ({W}_{x}{v}_{x}+{W}_{y}{v}_{y})}{{U}_{t}}}$$
(6)
$${I}_{out}=\frac{{I}_{o}({V}_{pro{g}_{out}})}{{I}_{o}({V}_{pro{g}_{1}},{v}_{1})\times {I}_{o}({V}_{pro{g}_{2}},{v}_{2})}\times ({I}_{1}\times {I}_{2})$$
(7)
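The translinear relation above can be checked numerically with the simplified exponential model of Eqs. (4)-(6); the constants below (κ, UT, W, Io) are illustrative assumptions, not measured values:

```python
import numpy as np

# Numerical check of the translinear multiplication of Eqs. (5)-(7): the
# output MITE couples both input gate voltages, so its exponent is the sum
# of the input exponents and its current tracks I1 * I2.
KAPPA, U_T, W = 0.7, 0.0258, 0.1
I_O = 1e-8  # assumed pre-exponential current I_o(V_prog)

def mite_current(*gate_voltages):
    """Subthreshold FG MITE current for the coupled gate voltages, cf. Eq. (4)."""
    return I_O * np.exp(-KAPPA * W * sum(gate_voltages) / U_T)

v_x, v_y = 0.3, 0.5             # gate voltages produced by the two input MITEs
i_1 = mite_current(v_x)         # Eq. (5): first input MITE
i_2 = mite_current(v_y)         # Eq. (5): second input MITE
i_out = mite_current(v_x, v_y)  # Eq. (6): output MITE couples both gates
```

For identical devices, `i_out` equals `i_1 * i_2 / I_O`, the scaled product of Eq. (7).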

Division in current domain

Division is the inverse operation of multiplication, performed to find the quotient of two numbers. Analog current-domain division can be achieved with the circuit topology shown in Fig. 5a. The circuit can be analyzed similarly to the multiplication circuit; the simplified relationship between the input currents I1 and I2 and the output current is given in (11). Figure 5b shows the measurement results obtained from the circuit in Fig. 5a for I1 ∈ [20, 30] and I2 ∈ [50, 70]. These results show that linear division is achievable with MITE devices.

$${I}_{1}={I}_{o}{({V}_{pro{g}_{1}})}^{{\prime} }{e}^{\frac{-\kappa (W{v}_{x}+W{v}_{y})}{{U}_{t}}}$$
(8)
$${I}_{2}={I}_{o}{({V}_{pro{g}_{2}},{v}_{2})}^{{\prime} }{e}^{\frac{-\kappa (W{v}_{x})}{{U}_{t}}}$$
(9)
$${I}_{out}={I}_{o}{({V}_{pro{g}_{out},{v}_{1}})}^{{\prime} }{e}^{\frac{-\kappa (W{v}_{y})}{{U}_{t}}}$$
(10)
$${I}_{out}=\frac{{I}_{o}{({V}_{pro{g}_{2}},{v}_{2})}^{{\prime} }\times {I}_{o}{({V}_{pro{g}_{out}},{v}_{1})}^{{\prime} }}{{I}_{o}{({V}_{pro{g}_{1}})}^{{\prime} }}\frac{{I}_{1}}{{I}_{2}}$$
(11)
Fig. 5: Overview of division circuit and its results.
figure 5

a Division circuit realization with two input FG MITE elements. b Experimental results of the current division operation.
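The division relation of Eqs. (8)-(11) can be checked with the same simplified exponential model; again, all constants are illustrative assumptions:

```python
import numpy as np

# Numerical check of current-domain division, Eqs. (8)-(11): the output MITE
# couples only v_y, so its current tracks I1/I2 up to a device constant.
KAPPA, U_T, W = 0.7, 0.0258, 0.1
I_O = 1e-8  # assumed pre-exponential current

def mite_current(*gate_voltages):
    return I_O * np.exp(-KAPPA * W * sum(gate_voltages) / U_T)

v_x, v_y = 0.4, 0.6
i_1 = mite_current(v_x, v_y)  # Eq. (8): couples both v_x and v_y
i_2 = mite_current(v_x)       # Eq. (9): couples only v_x
i_out = mite_current(v_y)     # Eq. (10): couples only v_y
```

For identical devices, `i_out` equals `I_O * i_1 / i_2`, the scaled quotient of Eq. (11).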

Square

Square and square-root operations are other useful computations that can be readily approximated with analog MITE devices. A digital counterpart would require floating-point operations to produce reasonably accurate results, demanding higher power consumption and complex dedicated circuits. A straightforward way to compute squares is through multiplication; however, to reduce the circuit count, the power-computation circuit can instead be realized by connecting FG MITEs as shown in Fig. 6a.

$${I}_{1}={I}_{o}{({V}_{pro{g}_{1}})}^{{\prime} }{e}^{\frac{-\kappa (W{v}_{1}+W{v}_{y})}{{U}_{t}}}$$
(12)
$${I}_{out}={I}_{o}{({V}_{pro{g}_{out}})}^{{\prime} }{e}^{\frac{-\kappa 2\times (W{v}_{y})}{{U}_{t}}}$$
(13)
$${I}_{out}=a\times {I}_{o}^{2}$$
(14)

The measured square values from the hardware implementation are shown in Fig. 6b, together with a curve fitted to a second-order polynomial of the form a(x − b)2 − c, with coefficients a ≈ 1.089 × 10−9, b ≈ 6.49A, and c ≈ 1.523 × 10−7A. With this circuit configuration, a square-root operation can be performed by simply interchanging the inputs and outputs. Further, using n-input MITEs instead of the two-input MITEs used in this work enables the computation of various powers rather than being limited to squares.

Fig. 6: Overview of the square circuit and its results.
figure 6

a Circuit realization of FG MITES based current square. b Experimental results of current squaring and curve fit on the experimental data.
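The squaring relation of Eqs. (12)-(14) follows from the doubled gate coupling and can be checked numerically; constants are illustrative assumptions:

```python
import numpy as np

# Numerical check of the square law, Eqs. (12)-(14): the output MITE couples
# v_y through both of its inputs, doubling the exponent, so I_out follows
# I_1 squared up to a device-dependent constant.
KAPPA, U_T, W = 0.7, 0.0258, 0.1
I_O = 1e-8  # assumed pre-exponential current

v_y = np.linspace(0.1, 0.5, 20)                   # sweep of the input gate voltage
i_1 = I_O * np.exp(-KAPPA * W * v_y / U_T)        # Eq. (12), with v_1 held fixed
i_out = I_O * np.exp(-KAPPA * 2 * W * v_y / U_T)  # Eq. (13): doubled coupling
```

Across the sweep, `i_out` equals `i_1**2 / I_O`, the scaled square of Eq. (14); swapping input and output in the same model yields the square root.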

One of the advantages of FG MITE devices is that they perform general computations without requiring a different specialized circuit for each of the computations shown. Further, these devices can be individually programmed by changing the charge on the floating node, which reduces mismatch and allows them to function as memory.

Analog adaptive filtering

Adaptive filters are essential in signal processing and control systems due to their ability to adjust their response to changes in signal characteristics. These systems are particularly useful when signal properties are non-stationary or unknown. Unlike fixed filters, which have constant parameters, adaptive filters can dynamically update their coefficients to optimize an error criterion in real-time. Figure 7a presents a high-level block diagram of a general adaptive filter.

Fig. 7: Overview of the analog adaptive filter and its components.
figure 7

a High-level block diagram of an adaptive filter. b Multi-input adaptive filter with m inputs and n time-delayed taps for each of the inputs. c Hardware realization of a multi-input time-delayed adaptive filter.

In this study, we focus on the adaptive filter system illustrated in Fig. 7b, where the output is given by linear combinations of convolution kernels as described by the following equation:

$$y(t)=\sum _{i=1}^{m}\sum _{j=1}^{n}{W}_{i,j}{x}_{i,j}(t)$$
(15)

During the adaptive filter learning phase, the objective function used to update the coefficients is given by:

$${\mathbb{E}}[e(t)]={\mathbb{E}}\left[{\left(d(t)-\sum _{i=1}^{m}\sum _{j=1}^{n}{W}_{i,j}{x}_{i,j}(t)\right)}^{2}\right]$$
(16)

We utilize the Least Mean Square (LMS) algorithm, which iteratively updates Wi,j with a learning rate μ to minimize the error function in (16):

$${W}_{k+1}={W}_{k}+2\mu {e}_{k}X$$
(17)
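The LMS recursion of Eq. (17) can be sketched as a behavioral software model; the single-channel tapped-delay-line below is an illustrative simplification of the multi-input filter of Fig. 7b, with weights as plain floats rather than FG charges:

```python
import numpy as np

# Behavioral sketch of the LMS update in Eqs. (15)-(17) for a
# single-channel tapped-delay-line filter.
def lms_identify(x, d, n_taps=3, mu=0.01, n_epochs=1):
    w = np.zeros(n_taps)
    for _ in range(n_epochs):
        for k in range(n_taps - 1, len(x)):
            x_k = x[k - n_taps + 1:k + 1][::-1]  # current and delayed taps
            e_k = d[k] - w @ x_k                 # instantaneous error, cf. Eq. (16)
            w = w + 2 * mu * e_k * x_k           # LMS weight update, Eq. (17)
    return w

# Usage check: identify a known 3-tap FIR system from its input/output.
rng = np.random.default_rng(0)
x = rng.standard_normal(2000)
true_w = np.array([0.5, -0.3, 0.1])
d = np.convolve(x, true_w)[:len(x)]  # desired signal d(t)
w_hat = lms_identify(x, d, n_taps=3, mu=0.01, n_epochs=3)
```

With a noise-free desired signal, the recovered weights converge to the true FIR coefficients, illustrating the error-descent behavior the hardware loop emulates.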

The charge storage capability of FG-based MITEs makes them particularly suitable as both weight storage elements and multiplicative components in an analog adaptive filter. In this study, we implement a hardware adaptive filter, as shown in Fig. 7c, utilizing an FG MITE at each tap. Currently, system inputs are provided via built-in Digital-to-Analog Converters (DACs); in the final design, these will be replaced by analog front-end circuits that preprocess sensor inputs. Each input channel is passed through several low-pass filters, which introduce time delays into the input signal. The delayed inputs and the outputs from each low-pass filter are then fed into an array of weight storage cells. Each weight storage cell comprises two FG MITEs and an n-channel current mirror. This configuration allows precise adjustment of the weight of each input by selectively increasing the current through the FG MITE connected to the appropriate n-mirror branch via hot-electron injection. The accumulated output currents from all weight storage cells are connected to a RAMP ADC, which charges a capacitor and triggers when the voltage across the capacitor reaches a specific value. The RAMP ADC has 10 bits of resolution and a measurement range of 18-38 nA. The resulting signal is digitized by an MSP430-based microcontroller, which processes the current output from the adaptive filter.

The weight adaptation process begins by measuring the output of the adaptive filter and evaluating the error relative to the expected output. Based on the inputs to each FG MITE, an error value is generated for each device. The weight adaptation algorithm, presented in Algorithm 1, describes the steps used to update the weights using the hot-electron injection process described in Section “Programming FG MITE”. During the hot-electron injection fine-tuning, each FG MITE device is selected in turn; the tunneling terminal is held at 5 V, the input terminals at 3.5 V, and 1 μs pulses of 5 V VDS are applied according to the number of pulses determined by Algorithm 1.

For additional details on the supporting analog circuits used in this work, including the OTA-C (Supplementary Fig. 1) and the RAMP ADC (Supplementary Fig. 2), see the Supplementary information.

Algorithm 1

Analog Adaptive Filter Coefficient Updating Algorithm

i ← N ▷ Number of Samples

while i > 0 do

    Y[i] ← Measure ADC

    i ← i − 1

end while

i ← C × T ▷ Number of Channels (C), Number of Taps (T)

e ← 2 × μ × (D − Y) × X ▷ Desired Signal (D), Learning Rate (μ), Input Vector (X)

while i > 0 do

    if e[i] < 0 then Select switches connected to (−)

    else Select switches connected to (+)

    end if

    while \(\left\vert e[i]\right\vert\, > \,0\) do ▷ Inject switch with a 1 μs pulse

        \(e[i]\leftarrow \left\vert e[i]\right\vert -1\)

    end while

    i ← i − 1

end while
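Algorithm 1 can be transcribed into a runnable sketch; `measure_adc` and `inject_pulse` are hypothetical stand-ins for the RAMP ADC read and the 1 μs injection pulse, and the scalar error term below is a simplification of the sample-wise comparison in the hardware loop:

```python
import numpy as np

# Runnable sketch of Algorithm 1. The pulse count per weight is the integer
# magnitude of the scaled error, mirroring the inner loop that decrements
# |e[i]| once per injection pulse.
def update_weights(measure_adc, inject_pulse, x, d, mu, n_samples):
    y = np.array([measure_adc() for _ in range(n_samples)])  # first while-loop
    e = 2 * mu * (d - y.mean()) * x                          # per-weight error term
    for i, e_i in enumerate(e):
        polarity = '-' if e_i < 0 else '+'                   # switch selection
        for _ in range(int(abs(e_i))):                       # inject until |e[i]| reaches 0
            inject_pulse(i, polarity)
    return e

# Usage check with stub hardware functions.
pulses = []
e = update_weights(lambda: 1.0, lambda i, p: pulses.append((i, p)),
                   x=np.array([3.0, -2.0]), d=2.0, mu=0.5, n_samples=4)
```

With these stub values the error vector is [3.0, −2.0], so the sketch fires three positive pulses on weight 0 and two negative pulses on weight 1.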

Neural decoding using adaptive filter technique

This study employs an analog adaptive filter composed of FG MITEs to perform neural decoding. Neural decoding is the process of inferring kinematic variables (such as velocity or displacement) from the information contained in neural signals, such as action potentials. In this work, we use time-domain adaptive filtering to decode neural signals recorded from the motor cortex of a macaque monkey during an instructed delayed reach-to-grasp task, as described in ref. 22.

The dataset includes recordings from two non-human primates, a female and a male. For this study, we focus on a single trial involving the male primate and decode the y component of displacement. The neural data was acquired using a Utah array with 10 × 10 electrodes. Given the significant overlap of information between multiple electrodes, reducing the input data to a lower-dimensional space can substantially decrease the complexity and cost of processing. To achieve this, we apply Principal Component Analysis (PCA), an unsupervised algorithm that transforms the original input vectors into a new coordinate system based on their principal components. PCA identifies the directions (principal components) in which the data varies the most and projects the data onto these axes, effectively reducing the dimensionality while preserving as much variance as possible.

This study uses PCA to transform the input neural signals into their principal components, thereby reducing the number of inputs to the adaptive filter. By focusing on the most significant components, PCA simplifies the data and helps retain the most relevant information for decoding. This reduction in input dimensionality leads to more efficient processing, enabling the adaptive filter to perform neural decoding more effectively. Figure 8 illustrates the signal processing chain used for decoding neural data into kinematic information. This study performs PCA offline using a software platform while the adaptive filter is implemented on analog hardware. Figure 7c shows the overall block diagram of the adaptive filter implemented on the hardware.
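The offline PCA projection step can be sketched with a plain SVD; the data below is synthetic, and the channel count is only an assumption for illustration:

```python
import numpy as np

# Sketch of the offline PCA step: project multi-channel neural data onto its
# leading principal components before it reaches the adaptive filter.
def pca_reduce(data, n_components=4):
    """data: (samples, channels) array; returns its top-n_components projection."""
    centered = data - data.mean(axis=0)
    # SVD of the centered data yields the principal axes in the rows of vt,
    # ordered by explained variance.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:n_components].T

rng = np.random.default_rng(1)
lfp = rng.standard_normal((500, 96))       # synthetic stand-in for Utah-array channels
reduced = pca_reduce(lfp, n_components=4)  # keep four principal components
```

The four retained component streams then play the role of the filter inputs delivered through the on-chip DACs.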

Fig. 8: Overall signal flow of the neural decoding system: electrophysiological (ECoG) signal acquisition from macaque and inference of hand kinematics using an adaptive filter.
figure 8

The brain is adapted from “Human Brain.png” by Injurymap, Wikimedia Commons, CC BY 4.0. Modifications were made. a Local Field Potential (LFP) obtained from the Utah array. b Most significant principal components extracted from the LFP dataset. c The dataset after low-pass filtering with a 6th-order Butterworth filter (25 Hz cutoff) and normalization by z-score. d Results from the proposed adaptive filter.

To determine the optimal number of principal components that would provide high accuracy while minimizing the number of channels, this study modeled the non-linear components, such as the FG-based MITE, in Matlab using the EKV equation (Eq. (2)). Figure 9a shows the Pearson correlation coefficient between the predicted and reference kinematics (y-displacement) for different numbers of principal components. The results indicate that accuracy improves with an increasing number of principal components up to a point, specifically around four components. Beyond this point, the accuracy gains become marginal while the complexity and power consumption of the system increase significantly. Therefore, selecting four principal components strikes a balance between achieving high accuracy and keeping the number of analog components and the power budget manageable. The four principal components were passed through low-pass filters to add time delays; the weighted sum of these delayed signals forms a temporal convolution filter. Figure 9a also presents the Pearson correlation coefficient between the predicted and reference kinematics as the size of the convolution kernel increases. However, a larger convolution kernel raises the power consumption and the complexity of the analog circuit. Simulation results suggest that four principal components with three time-delayed taps offer a good correlation coefficient for the current experiment while keeping power consumption and complexity minimal. Figure 9b further illustrates the predicted and reference y-displacement over time, demonstrating the effectiveness of this approach.

Fig. 9: Overall result comparison of proposed hardware adaptive filter with software estimations.
figure 9

a Prediction correlation from adaptive filtering versus the number of principal components included, and correlation versus the number of time delays added (while keeping the PCA count at 4). b Predicted displacement from adaptive filtering with different numbers of principal components. c Results from the hardware filter and the Matlab model against the reference signal. The hardware filter achieves ρ = 0.69.

These selected principal components were inputs to the adaptive filter via on-chip DACs. The adaptive filter was configured with four inputs, each having three taps, as depicted in Fig. 7c. A programmable transconductance amplifier implemented the delay elements between each tap. The weights of the adaptive filter were realized using FG-based MITE blocks, which allowed for precise control and non-volatile storage of the weights. The output currents from all MITE blocks across the four channels were summed onto a single wire, and this aggregated current was sensed using a RAMP ADC. The weights were then adapted based on the weight update algorithm described in Algorithm 1.

Figure 9c presents the output of the hardware filter, comparing it with the adaptive filter implemented in Matlab and the reference displacement. Initially, with a random set of weights, the adaptive filter achieved a Pearson correlation coefficient of 0.07. The correlation improved significantly as the weights were iteratively updated using the algorithm implemented on the MSP430 microprocessor. The adaptation algorithm took 232 iterations in the hardware to reach maximum accuracy. The final Pearson correlation coefficient reached 0.69, indicating a strong alignment between the predicted and actual kinematics, thereby validating the effectiveness of the adaptive filtering approach.

The power consumption of each FG MITE and LPF was around 3 nW and 13 nW, respectively. The 4-channel, 3-tap version of the adaptive circuit shown in Fig. 7c consumes ~140 nW of static power from a 2.5 V supply. The RAMP ADC used in this work consumes 10 μW. The power of the interfacing circuits was not included in these measurements. Table 1 compares systems that perform neural decoding; it shows that digital systems tend to use significant power, which makes them harder to implant given thermal and power-delivery constraints.

Table 1 Comparison of multiple neural decoding systems

Of the works presented in Table 1, the analog neural decoding of head direction presented in ref. 23 performs very similar processing with a different circuit realization; that approach consumes ~300 μW per channel in SPICE simulations.

Discussion

This study primarily focused on computations with CMOS-compatible, non-volatile FG MITE elements and showed an adaptive filter implementation that uses the computations discussed as building blocks. The adaptive filter presented is beneficial for applications that require low-power processing of non-stationary signals and for applications where the filter coefficients are difficult to predetermine. Further, on-chip learning provides robustness against the mismatches and variations of analog systems. Without it, the exponential current-voltage relationship turns slight voltage mismatches into significant current variations across devices, making such solutions harder to scale.

The proposed adaptive filtering approach can be extended by increasing the number of channels and time-delayed taps in the computation. More channels could support systems with higher electrode counts, such as Neuropixels 2.0, and more time-delayed elements could capture more complex temporal relationships.

Further, on-chip principal component computation could be realized with an FG crossbar array scaled to a large number of electrodes, similar to the memristor implementation presented in ref. 24. The PCA components currently used in this work could also be replaced by the input signals directly, or by an analog vector-matrix multiplier providing a weighted sum of the input signals.