Abstract
Neural reuse can drive organisms to generalize knowledge across various tasks during learning. However, existing devices mostly focus on architectures rather than network functions and therefore cannot mimic neural reuse. Here, we demonstrate a rationally designed device based on ferroionic CuInP2S6 that accomplishes the neural reuse function through dynamic allocation of the ferro-ionic phase. It allows for dynamic refresh and collaborative work between volatile and non-volatile modes to support the entire neural reuse process. Notably, ferroelectric polarization can remain consistent even after undergoing the refresh process, providing a foundation for the shared functionality across multiple tasks. By implementing neural reuse, the classification accuracy of neuromorphic hardware can improve by 17%, while the power consumption is reduced by 40%; in multi-task scenarios, its training speed is accelerated by 2200%, while its generalization ability is enhanced by 21%. Our results are promising for building refreshable hardware platforms, based on the ferroelectric-ionic combination, that can accommodate more efficient algorithms and architectures.
Introduction
Brain-inspired computing, as an emerging computing paradigm, aims to develop an efficient computing system by learning from the structure and functions of biological nervous systems. Within this field, neuromorphic devices based on memristors are regarded as important components for constructing brain-like computing systems. They can utilize distinct memristive mechanisms to mimic various forms of biological plasticity. However, these prototype devices primarily focus on implementing the functionalities of individual neurons or synapses1. Thus, they cannot emulate some important network-level properties, particularly the reusability of neural circuits in biology. According to neural reuse theory2,3, established neural circuits can be redeployed to different uses without losing their original functions (Fig. 1a). This intrinsic mechanism drives organisms to generalize knowledge across various tasks during learning, rather than starting anew each time. Achieving this property in neuromorphic hardware is therefore expected to enable adaptable repurposing of existing components across various tasks, allowing more tasks to be performed with greater resilience and lower resource consumption (Fig. 1b).
a Schematic of neural reuse, where shared original neural circuits are specialized across multiple tasks through redeployment (blue arrow) and restoration (red arrow) processes. b Schematic hardware implementation of neural reuse, encompassing the refresh and restore process, along with the cyclic configuration between source model and task model. c Schematic of the reconfiguration strategy and the refresh strategy. It encompasses characteristics of switching between volatility and non-volatility, as well as the allocation of them across diverse tasks. d Schematic of the refreshable memristor proposed in this work. The left and right sides of the diagram represent the energy barriers that need to be crossed for ferroelectric polarization reversal and ion migration, respectively. The dynamic allocation of order-disorder phases in CuInP2S6 allows for the circular configuration of two distinct memristive mechanisms within a single material.
However, it poses a significant challenge for memristor-based neural networks. As shown in Fig. 1b, it requires the network to accommodate two distinct yet related functions (refresh from source model to task model) while ensuring their respective stability and mutability. This seemingly contradictory yet interrelated property bears a resemblance to the volatile and nonvolatile characteristics exhibited by memristors. This led us to contemplate integrating the complete memristive dynamics process of the memristor into a unified framework. Specifically, the stable source model and the tunable task model can be constructed based on the non-volatile and volatile states of a single device, respectively. Then, through cyclical configuration between the different working modes, refresh and restoration operations are accomplished (Fig. 1b).
Given this, a memristor reconfiguration strategy appears to be a possible solution. However, as illustrated in Fig. 1c, current research on reconfiguration solutions has primarily focused on controlling the conversion between volatility and non-volatility, which is leveraged to handle different types of tasks separately1,4,5,6,7,8. Thus, despite possessing switchable working modes, a single device typically can perform only one mode within one task. This restriction, which divides application scenarios based on memristive properties, has hindered the sharing of general knowledge (represented by non-volatility) across different tasks. Therefore, there is a pressing need to explore a targeted refresh strategy, wherein the volatility and non-volatility of a single device can collaborate on the same task, and its non-volatility can be utilized across multiple tasks. As shown in Fig. 1c, this requires the memristor to possess two distinctive characteristics: first, non-volatility always appears earlier and serves as the foundation of volatility; second, non-volatility remains consistent both before and after mode switching. The former ensures that the non-volatile function can be refreshed with seamless conductance changes, while the latter ensures its continued utility across multiple tasks.
Although this is a significant challenge, the growing demand for innovative memristor materials and mechanisms is met by the blooming of two-dimensional (2D) layered materials, whose atomically thin nature and intriguing physical properties provide a promising material family9. Among them, the exceptional current manipulation capabilities of van der Waals ferroionic CuInP2S6 (CIPS) position it as a highly promising candidate (Fig. 1d). Within this typical room-temperature order-disorder ferroelectric semiconductor, the displacive instability of copper (Cu) leads to two intriguing properties of CuInP2S6: ferroelectricity, characterized by off-center ordering10,11, and ionic conduction, characterized by long-range migration disordering12,13,14. For ferroelectricity, its electric dipoles primarily arise from the displacement of Cu+ ions within the crystal lattice. Below the Curie temperature, spontaneous ordering of electric dipoles produces macroscopic polarization that can be switched by an external electric field10. This polarization exhibits excellent retention and has found widespread applications in devices such as ferroelectric diodes10 and ferroelectric tunnel junctions15. On the other hand, Cu+ ions also exhibit ionic conduction through incoherent, long-range motion16,17. This behavior is facilitated by, and can itself generate, atomic disorder18. Because it typically requires an electric field to be sustained, it forms the basis of the volatile memory. Intriguingly, the ordered site serves as both the initial site for this reversible ion motion and its stable site after relaxation, which bears a great resemblance to the mechanism of neural reuse. Given that ferroelectric polarization and ion migration in CIPS share a common origin, as both are essentially reflections of the order-disorder dynamics of Cu+ ions18, this dual memristive mechanism is expected to be highly integrated in space and to perform refresh without reconfiguration constraints.
Here, we propose the concept of the refreshable memristor and construct the device based on a vertical structure with a MoS2/CIPS/MoS2 configuration. Electrical characterization validates its capacity to execute multi-state operations in both ferroelectric polarization and ion migration working modes. Moreover, the ferroelectric state can remain consistent before and after multiple redeployments, which is completely different from existing reconfiguration solutions and aligns seamlessly with the crucial requirement for neural reuse. Furthermore, transfer learning, by leveraging parameter sharing to apply general knowledge across various downstream tasks19,20, can be regarded as a concrete application of the concept of neural reuse in algorithms. This renders transfer learning well-suited to evaluating the device’s capability to mimic neural reuse. Simulation results based on realistic refreshable device properties show that our device characteristics are sufficient to fulfill the requirements of neural reuse. Compared to the randomly initialized model, repurposing the well-trained source model can yield an accuracy improvement of over 17% while requiring only 60% of the cost for the same duration of training. In multi-task scenarios, neural reuse can accelerate the training speed for new tasks by up to 22 times, effectively enhancing its generalization ability by over 21%. These results indicate that leveraging the distinctive refresh capability of our devices can mimic the neural reuse mechanism, thereby narrowing the gap in cognitive ability between memristor-based neural networks and biological nervous networks. Crucially, with this network-level property as the inner driving force, the refreshable device empowers memristor-based neural networks to accommodate more efficient algorithms and architectures, thereby substantially expanding the upper limit of their applications.
Results
Refreshable device performance with dual memristor mechanisms
Due to the characteristic properties of ferroionic materials, in addition to electron carriers, Cu+ ions also play a role in the electrical transport of CIPS materials. To detect electrical effects arising from different motions of Cu+ ions, a schematic illustration of the memristor design is presented in Fig. 2a. The van der Waals (vdW) ferroelectric material CIPS (82 nm thick, with structural information and piezoresponse force microscopy characterization shown in Supplementary Figs. 1 and 2, respectively) and few-layer MoS2 are vertically assembled into a semiconductor-ferroelectric-semiconductor (S-FE-S) architecture. The top semiconductor layer is additionally equipped with h-BN and graphite, so that it simultaneously constitutes a metal-insulator-semiconductor field-effect transistor (MISFET) (Supplementary Fig. 3a). Leveraging the gate-tunable carrier density of few-layer MoS2, a specific band alignment between MoS2 and CIPS can be realized through gate-programming, which broadens the capability of tuning the device’s electrical characteristics. In addition, to ensure the integration of neural reuse within memristor-based neural networks, we investigated the device from the following three aspects. (1) The device can work under two memristive mechanisms with different conductance retention characteristics. (2) Both mechanisms possess the capability to emulate the plasticity of biological synapses (Fig. 2b), including both long-term potentiation (LTP) and long-term depression (LTD)21. (3) The non-volatile device state can be further refreshed by configuring it to the volatile mode, and it can be restored to its source state when reconfigured back.
a Schematic of the vertical heterojunction integrating the functionality of the MoS2 / CuInP2S6 / MoS2 memristor and MIS-FET. b Schematic of neurons and chemical synapses. c The energy band diagram of the device in both high resistance state (HRS) and low resistance state (LRS), depicting ferroelectric polarization mode (upper) and ion migration mode (lower), respectively. d The demonstration of the memory windows of a typical S-FE-S memristor. The red arrows indicate the current measured via applying the voltage pulse train (shown in the inset). e The LTP and LTD measured under the ferroelectric polarization working mode. Here, the 40 set (30 reset) pulses are organized into a group, with the voltage ranging from 7 V to 14.5 V (−5 V to −6.6 V) in fixed intervals of 0.5 V (−0.1 V) between distinct groups. Each pulse had a duration of 0.01 s, followed by an interval of 0.4 s with no voltage. Finally, each state was assessed as shown in Supplementary Fig. 4. f Retention characteristics of 16 distinct ferroelectric polarization states. The different states were obtained by the same excitation pulses as in (e). g I–V characteristic of the device while applying a fixed gate voltage of −4 V. The blue arrows denote the switching direction. h The LTP and LTD measured under the ion migration working mode. The bias voltage was programmed by a series of set pulses (−1.5 V, 0.2 s) followed by a series of reset ones (0.22 V, 0.4 s) and each excitation pulse is accompanied by a fixed read pulse (−0.6 V, 0.1 s). i The evolution trend of current with time in different states under a fixed read pulse sequence. Here, distinct ion migration states were achieved utilizing the same pulse pattern as in (h), but with varying numbers. The inset is the replotted segments within the range of 45 s.
Because the source model is the foundation of neural reuse and requires a non-volatile memristive device, we first examined the performance of our device working in ferroelectric polarization mode. As illustrated in the upper panel of Fig. 2c, when the Fermi level of the semiconducting MoS2 layer is situated in the conduction band, the capability for electron transport between the source and drain is regulated by ferroelectric polarization, inducing a change in memristive performance. The memory window of a typical MoS2/CIPS/MoS2 vdW ferroelectric memristor, obtained by applying a pulse train to tune the polarization switching, is shown in Fig. 2d. The I–V curve shows a counter-clockwise hysteresis loop as a function of the applied voltage pulse train, consistent with ferroelectric polarization switching. The ferroelectric memristor shows an on/off ratio of up to \(10^{4}\) upon polarization switching. In addition, examining the transfer properties of the top-layer MoS2 (Supplementary Fig. 3) confirms that, under varying ferroelectric polarization states, the MoS2 serving as the device electrode is doped with different degrees of induced charge. It also implies that the domain switching ratio of the ferroelectric polarization in the channel can be tuned to modify the average barrier height of the channel, thereby obtaining more conductance states. This makes it possible for the device to emulate synaptic plasticity when working in ferroelectric polarization mode.
Upon the application of a series of designed voltage pulses, the device successfully emulates long-term potentiation (LTP) and long-term depression (LTD), featuring a broad range of adjustable states with good cycling stability (Fig. 2e). Moreover, these distinct states all exhibit superior non-volatility. Figure 2f demonstrates retention measurements within 120 s for 16 distinct conductance states (a single cell with 4 bits of memory capacity, obtained by varying pulse amplitudes). The resistance states are clearly separable from one another and exhibit good stability. Additionally, Supplementary Fig. 5 shows retention of up to \(10^{3}\) s for five representative resistance states in the device, with a switching ratio of more than two orders of magnitude. Like retention, cyclic endurance is also a critical parameter for electronic synapses. Supplementary Fig. 6a demonstrates the good switching capability of the device over at least \(10^{4}\) cycles of erase and write operations when working in ferroelectric polarization mode. The on/off ratio is stable at three orders of magnitude.
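The pulse protocol quoted in the Fig. 2e caption can be enumerated to check how many amplitude groups it contains. The following is a minimal sketch that uses only the values stated in the caption (set amplitudes from 7 V to 14.5 V in 0.5 V steps, reset amplitudes from −5 V to −6.6 V in −0.1 V steps, 40 set pulses per group). The function name `amplitude_groups` and the reading that each set-amplitude group addresses one retained state are illustrative assumptions, not part of the measurement scripts.

```python
import math


def amplitude_groups(start_v, stop_v, step_v):
    """Amplitudes from start_v to stop_v (inclusive) in steps of step_v."""
    count = round((stop_v - start_v) / step_v) + 1
    return [round(start_v + i * step_v, 2) for i in range(count)]


# Values quoted in the Fig. 2e caption.
SET_PULSES_PER_GROUP = 40
set_amplitudes = amplitude_groups(7.0, 14.5, 0.5)      # volts
reset_amplitudes = amplitude_groups(-5.0, -6.6, -0.1)  # volts

# One entry per pulse: the full potentiation train, group by group.
set_train = [v for v in set_amplitudes for _ in range(SET_PULSES_PER_GROUP)]

# Assumption: one set-amplitude group corresponds to one retained state,
# so the number of groups fixes the equivalent bit depth of a single cell.
bits_per_cell = math.log2(len(set_amplitudes))

print(len(set_amplitudes), len(set_train), bits_per_cell)
```

Under this reading, the number of set-amplitude groups equals the 16 states retained in Fig. 2f, which corresponds to the 4-bit capacity quoted in the text.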
To provide the refresh function for tunable downstream tasks, the device needs an additional, more variable memristive mechanism. We therefore evaluated the device performance in ion migration mode. As illustrated in the lower panel of Fig. 2c, when the applied gate voltage decreases the carrier density in the MoS2 layer, the current flow in the channel almost vanishes, even though the ferroelectric polarization has reduced an extra barrier for electron migration. This deliberate restriction on the electron transport induced by ferroelectric polarization can highlight the electrical property changes caused by ion migration. Migrated Cu+ ions gradually accumulate in the contact region of CIPS and MoS2, which lowers the potential barrier of CIPS in this region and significantly regulates the device conductance. Here, to promote sufficient migration of Cu+ ions, a DC voltage sweep was applied to validate this mechanism (Raman spectroscopy characterization shown in Supplementary Fig. 7). Figure 2g depicts the hysteretic I–V curve of the device when a fixed voltage of −4 V is applied to the gate. During the bias voltage sweep from 0 to 3 V (from 0 to −3 V) with a step of 0.01 V, the current remains nearly constant at 40 pA for biases smaller than 1.5 V (−1 V), and then increases exponentially from 1.5 to 3 V (from −1 V to −3 V). This trend in the evolution of conductance with the electric field can be attributed to the accumulation or dispersion of Cu+ ions at both ends of the channel, resulting in a reduction or increase of the interfacial barrier, respectively. This causes the device to exhibit current rectification behavior like that of p-n junctions or Schottky barriers, as observed in our device (Supplementary Fig. 8).
This synergistic contribution of Cu+ ion migration and the interfacial Schottky barrier can be further variably regulated by programming the bias voltage to the pulse form (Supplementary Fig. 9). In addition, this capability of electrical characteristics can be further expanded through gate-programming (Supplementary Fig. 10), allowing specific band alignment between MoS2 and CIPS to meet the varying requirements of the device’s conductance level for different tasks.
This dynamic regulation capability of the device conductance creates favorable conditions for emulating synaptic plasticity. As shown in Fig. 2h, the consistent trend of device conductance evolution over 20 cycles indicates the good applicability of the ion migration working mode in emulating synaptic behavior. Furthermore, with a more elaborate design of excitation pulses for the ion migration working mode, the device exhibits exceptional stability for over 120 cycles of LTP-LTD conductance changes (Supplementary Fig. 6b). Unlike stable macroscopic polarization, when the external electric field is smaller than the built-in electric field resulting from the ion migration, migrated Cu+ ions tend to spontaneously return to their original lattice sites, leading the conductance to relax to its initial state22. As shown in Fig. 2i, a series of pulses was initially applied to induce Cu+ ion migration and set the device to different low-resistance states. Subsequently, a series of pulses with a period of 1.5 s, where 0 V is applied for 1.4 s and −0.6 V is applied for 0.1 s, was used to detect the evolution trend of the device conductance. As anticipated, the influence of migrated ions on the barrier noticeably decayed when no bias voltage was applied and eventually disappeared after 45 seconds. Notably, the effect of interface defects on the volatile state can be considered negligible (Supplementary Fig. 11).
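The read sequence used to monitor relaxation in Fig. 2i can be written out explicitly. The sketch below only encodes the timing stated in the text (1.5 s period, 0 V for 1.4 s, then −0.6 V for 0.1 s) over the 45 s window in which the volatile state decays. The function name `read_schedule` and the tuple layout are illustrative choices, and no relaxation time constant is assumed because none is given.

```python
def read_schedule(window_s, rest_s=1.4, read_s=0.1, read_v=-0.6):
    """List (start_time_s, voltage_v, duration_s) segments of the read train."""
    period = rest_s + read_s
    n_periods = round(window_s / period)
    segments = []
    for k in range(n_periods):
        t0 = k * period
        segments.append((round(t0, 3), 0.0, rest_s))             # unbiased rest
        segments.append((round(t0 + rest_s, 3), read_v, read_s))  # read pulse
    return segments


segments = read_schedule(45.0)
read_times = [t for t, v, _ in segments if v != 0.0]

# Fraction of each period during which the device is biased for reading.
duty_cycle = 0.1 / 1.5

print(len(read_times), read_times[0], read_times[-1])
```

With this timing the device is unbiased for most of each period, so the recorded current traces the spontaneous relaxation rather than a field-sustained state.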
Thus far, selective control of the device’s working mode has been achieved through precise pulse engineering. Short, high-amplitude pulses predominantly influence ferroelectric polarization, while longer, lower-amplitude pulses primarily drive ion migration. Beyond differences in on/off ratio and retention capabilities, the opposite shifts in the transfer curves provide further experimental evidence distinguishing the two mechanisms (Supplementary Fig. 12). In addition, the tunability of the dual memristive mechanisms has also been verified in devices with varying CIPS thicknesses, as shown in Supplementary Fig. 13. The above results indicate that our device can work adaptably in either the ferroelectric polarization mode or the ion migration mode, and that both memristive mechanisms, despite their different retention characteristics, are well suited to emulating LTP-LTD characteristics. Overall, our device shows great potential for integrating neural reuse with memristor-based neural networks.
Examination of refresh ability
A refreshable memristor should possess two key properties: refresh and restoration. To confirm the refresh capability of the device, Fig. 3a illustrates the emulation of synaptic potentiation and depression behaviors achieved by cyclically configuring the device to ion migration mode after every two induced ferroelectric polarizations. Across at least 4 cycles of redeployment, the device exhibits consistent conductance changes in its respective working modes and well-defined independence between different working modes. This suggests that the extent of ferroionic motion can be precisely regulated solely by adjusting the amplitude and period of the excitation pulse, enabling the realization of both volatile and non-volatile working modes within a single device. Crucially, the experimental results demonstrate that redeployment does not compromise the performance of the device in any given working mode.
a Evolution of the device current when cyclically working under the ion migration (blue dots) and ferroelectric polarization (red dots) mode. Red arrows mark the stages of induced ferroelectric polarizations. The inset illustrates the configuration of examinations within one cycle, including polarization, ion migration, ion relaxation and polarization, sequentially. b Evolution of the device current undergoing repeated cycles of ion migration (light blue dots, obtained utilizing 60 pulses, each including a 0.2 s set voltage at −1.5 V followed by a 0.1 s read voltage at −0.6 V.) and spontaneous relaxation (deep blue dots, obtained utilizing 60 pulses, each with a duration of 3.4 s at a voltage of 0 V, followed by a 0.1 s read voltage at −0.6 V.). The varying ferroelectric polarization states were determined via 20 pulses with a period of 3 s, where 0 V is applied for 2.8 s and −0.6 V is applied for 0.2 s (deep red dots). The subsequent ferroelectric polarization states (light blue dots) were determined by 100 pulses with a period of 0.6 s, where 0 V is applied for 0.5 s and −0.6 V is applied for 0.1 s. c The normalization of the data marked by blue dots in (b). d The dependence of the maximum current (\({I}^{60}\)) induced by the ion migration process on the corresponding initial current (\({I}_{{ferr}}\)) induced by ferroelectric polarization. e The energy band diagram of the device during ion migration process at distinct ferroelectric polarization states, where a is the distance between two adjacent, equivalent lattice sites, E is the electric field and e is charge.
This refresh ability was further examined for its applicability in neural reuse across multiple tasks, with a focus on assessing the restoration capability of the device before and after the refresh. As illustrated in Fig. 3b, after undergoing repeated cycles of ion migration and spontaneous relaxation, the various current states programmed via the ferroelectric polarization remain consistently stable. It suggests that our device meets a crucial requirement for the integration of neural reuse with memristor-based neural networks, that is, the non-volatile state can be refreshed during repurposing and restored afterward. Figure 3c shows the normalization of all ion migration and spontaneous relaxation processes presented in Fig. 3b. Notably, the nearly coincident normalization results indicate a nearly uniform regulation in the interfacial Schottky barrier caused by long-range ion migration, even across varying polarization-induced states. More interestingly, as illustrated in double natural logarithmic coordinates shown in Fig. 3d, the maximum current \({I}^{60}\) (the current after 60 voltage pulses excitation) induced by the ion migration process exhibits a linear dependence on its corresponding initial current \({I}_{{ferr}}\) induced by ferroelectric polarization (See Supplementary Text 1 and Supplementary Fig. 14 for the detailed explanation). This suggests that once the conductance state under the ferroelectric polarization working mode is determined, it is feasible to deduce the conductance evolution process after switching the device into ion migration working mode. It should be noted that these experimental results rely on a premise, that the disordered ferroionic has adequately reverted to its ordered state after undergoing long-range motion. This restoration can occur either spontaneously or be accelerated by the application of the voltage pulse (Supplementary Fig. 15).
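A linear dependence in double natural logarithmic coordinates corresponds to a power law, \({I}^{60}={e}^{b}{({I}_{{ferr}})}^{k}\), where k and b are the slope and intercept of the fitted line. The sketch below shows how such a fit could be obtained and then used to predict the refreshed current from a programmed ferroelectric state. The current values and the exponent and prefactor used to generate them are hypothetical placeholders, not measured data, and the function names are illustrative.

```python
import math


def fit_loglog(i_ferr, i_60):
    """Least-squares line ln(i_60) = slope * ln(i_ferr) + intercept."""
    xs = [math.log(v) for v in i_ferr]
    ys = [math.log(v) for v in i_60]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict_i60(i_ferr, slope, intercept):
    """Current expected after the ion-migration pulses, given the fit."""
    return math.exp(intercept) * i_ferr ** slope


# Hypothetical currents in amperes, generated from an assumed power law
# (exponent 0.9, prefactor 5.0) purely to exercise the fit.
i_ferr_demo = [1e-10, 3e-10, 1e-9, 3e-9, 1e-8]
i_60_demo = [5.0 * v ** 0.9 for v in i_ferr_demo]

slope, intercept = fit_loglog(i_ferr_demo, i_60_demo)
print(slope, math.exp(intercept), predict_i60(2e-9, slope, intercept))
```

Once the slope and intercept are known for a given pulse form, a single read of the ferroelectric state is enough to estimate where the conductance will arrive after switching to the ion migration mode.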
The above phenomena can be understood through the fine allocation of ion migration and ferroelectric polarization. Owing to the relatively short duration (0.2 s) and low voltage (−1.5 V) of the pulse excitation, the quantity of migrated Cu+ ions is quite restricted18,23. Nevertheless, even a small quantity of migrated Cu+ ions can substantially influence the conductance of the device23. However, this influence may not be pronounced enough to exhibit an evident coupling effect with ferroelectric polarization, particularly when the migrated ions can be adequately restored to their ordered state18. At the same time, the alteration of the energy barrier required for ion migration in that region depends directly only on the electric field24 (Fig. 3e). Because the form of the excitation pulse remained constant throughout all tests, the extent of ion migration within the channel remained nearly uniform despite variations in polarization-induced states25.
The results above collectively suggest that our device is not constrained by temporal evolution and can achieve reliable working mode switching within a single device. This characteristic holds significant potential for spatial integration and refresh ability. Moreover, these redeployment and restoration capabilities make it possible for memristor-based neural networks to be cyclically configured between the reusable source model and the tunable downstream task model. It paves the way for seamlessly incorporating neural reuse into the memristor-based neural network platform.
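The refresh and restoration behavior summarized above can be pictured with a toy behavioral model of a single cell. This is a sketch under strong assumptions: the volatile contribution is treated as a purely multiplicative gain on the ferroelectric current, relaxation is treated as complete, and `gain_per_pulse` is a hypothetical parameter. None of these are fitted to the device, and the class name `RefreshableCell` is illustrative.

```python
from dataclasses import dataclass


@dataclass
class RefreshableCell:
    ferro_current: float          # non-volatile level set by polarization (A)
    ionic_gain: float = 1.0       # volatile multiplier from migrated ions
    gain_per_pulse: float = 1.05  # hypothetical growth factor per set pulse

    def ion_pulse(self, count=1):
        """Refresh: ion-migration pulses raise the volatile gain."""
        self.ionic_gain *= self.gain_per_pulse ** count

    def relax(self):
        """Restore: assume the ions fully return to their ordered sites."""
        self.ionic_gain = 1.0

    def read(self):
        return self.ferro_current * self.ionic_gain


# Two cells programmed to different (hypothetical) ferroelectric states.
cells = [RefreshableCell(1e-9), RefreshableCell(4e-9)]
source_levels = [c.read() for c in cells]

for c in cells:
    c.ion_pulse(60)               # redeploy to a task with 60 pulses
task_levels = [c.read() for c in cells]

for c in cells:
    c.relax()                     # return to the source model
restored_levels = [c.read() for c in cells]

print(source_levels, task_levels, restored_levels)
```

In this picture the source model lives in `ferro_current` and the task model in `ionic_gain`, so cycling between them never overwrites the shared non-volatile state.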
Neural reuse based on device simulation
Transfer learning enables the application of general knowledge to various downstream tasks19,26. This process mirrors the concept of neural reuse and is well-suited to reveal the promising applications of our device in this area. For simulating neural reuse via transfer learning, a residual network27 modeled on ResNet-18 (Fig. 4a) was employed. Leveraging realistic device properties, this simulation enabled an assessment of the classification performance of the source model after redeployment, comparing it with the randomly initialized model after completing training (implemented on the CIFAR-10 dataset; see Methods for details of the simulation process). As anticipated, the well-trained source model serves as a reliable initialization point for redeployment, effectively mitigating overfitting. As shown in Fig. 4b, the well-trained source model demonstrates higher accuracies in the classification task across ten categories compared to the randomly initialized model. Upon closer examination of the learning curves (Fig. 4c), it becomes evident that reusing the source model resulted in faster learning (Supplementary Fig. 16) and superior accuracy (> 17%). Moreover, it also demonstrates significantly enhanced generalization ability (> 21%, Supplementary Figs. 17, 18, and Text 2). As the well-trained source model converged much earlier during the redeployment process, both power consumption and weight change range were approximately 40% lower overall (Supplementary Fig. 19) across 39,100 iterations (Fig. 4d). In addition, Fig. 4e illustrates the number of images required for the model to reach 80% accuracy for the first time during the training process. It reveals the substantial time and data savings achieved through neural reuse for diverse downstream tasks. In particular, on the CIFAR-10 dataset, it accelerates training by over 22 times (details provided in Supplementary Text 3).
Finally, the error rates for different models were investigated when pixels were subjected to Gaussian noise corruption. The results show that, even when input signals are corrupted by noise strengths within 0.25, the source model maintains better recognition performance (Fig. 4f).
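A noise-robustness test of this kind can be sketched as follows. The sketch assumes that pixel values are normalized to [0, 1], that the noise strength denotes the standard deviation of zero-mean Gaussian noise, and that corrupted pixels are clipped back into range. These conventions are assumptions, since the exact definition of noise strength is not stated here. The classifier `toy_predict` and the tiny image set are placeholders, not the ResNet-18 simulation.

```python
import random


def add_gaussian_noise(pixels, strength, rng):
    """Add zero-mean Gaussian noise (std = strength) and clip to [0, 1]."""
    return [min(1.0, max(0.0, p + rng.gauss(0.0, strength))) for p in pixels]


def error_rate(predict, images, labels):
    """Fraction of images whose predicted label differs from the true one."""
    wrong = sum(1 for img, y in zip(images, labels) if predict(img) != y)
    return wrong / len(labels)


def toy_predict(image):
    """Placeholder classifier: bright images are class 1, dark ones class 0."""
    return 1 if sum(image) / len(image) > 0.5 else 0


rng = random.Random(0)
images = [[0.1] * 16, [0.9] * 16, [0.2] * 16, [0.8] * 16]
labels = [0, 1, 0, 1]

# Sweep noise strengths up to the 0.25 limit mentioned in the text.
for strength in (0.0, 0.05, 0.15, 0.25):
    noisy = [add_gaussian_noise(img, strength, rng) for img in images]
    print(strength, error_rate(toy_predict, noisy, labels))
```

Running the same sweep on two models, one randomly initialized and one reusing the source weights, gives the kind of comparison plotted in Fig. 4f.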
a Schematic of transfer learning on a residual convolutional neural network. All cuboids represent feature maps, with the red one signifying the input layer. The green arrow throughout all convolutional layers represents the main line of feedforward operations, while the purple arrows symbolize the skip connections. Each of them goes through a kernel (depicted as a gray cuboid) to match the size of its preceding map. b Heat map of the confusion matrix obtained by testing the network on the test set of CIFAR-10. Each label contains 1000 samples, and the color depth corresponds to the number of correctly predicted labels. c Learning curves during the training process. The light-colored curves record the accuracy per iteration, while the dark-colored curves represent the average accuracy in each epoch. d Weight updates across 39,100 iterations. The data of the randomly initialized model and the well-trained source model are plotted on identical axes. The region above (below) the horizontal axis represents the total amount of positive synaptic updates (negative synaptic updates). e Comparison of input data consumption. The values represent the numbers of images consumed by the model to first achieve an accuracy of 80%. f The accuracy of two models under varying strengths of Gaussian noise corruption.
Performance of neural reuse in multiple downstream tasks
To accurately assess the advantage of our devices in scenarios involving neural reuse between multiple downstream tasks, a comparative experiment was designed, as shown in Fig. 5a. It comprises two distinct schemes. In Scheme 1, the initialization of each downstream task adheres to a fixed distribution of the source weights. That is, the device naturally relaxes from the ion migration state to the ferroelectric state after each task, ensuring a consistent initial state for each neural reuse. In Scheme 2, each downstream task employs the trained weights from the preceding task as its initial weights. That is, the device remains in ion migration working mode after each task. Additionally, three color image datasets with partially similar features, namely CIFAR-10, CIFAR-100, and Oxford 102 Flowers, were selected to represent three different downstream tasks in all schemes.
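The difference between the two schemes lies only in how each task is initialized, which can be made explicit in a short sketch. Here `run_task_flow`, `toy_train`, and the three-element weight vector are hypothetical stand-ins for the actual network and training loop. Only the initialization rule of each scheme follows the description above.

```python
def run_task_flow(source_weights, task_names, train, scheme):
    """Return the initial and trained weights of every task under one scheme."""
    if scheme not in (1, 2):
        raise ValueError("scheme must be 1 or 2")
    history = []
    init = list(source_weights)
    for name in task_names:
        trained = train(list(init), name)
        history.append({"task": name, "init": list(init), "trained": trained})
        if scheme == 2:
            # Scheme 2: the device stays in ion migration mode, so the next
            # task starts from the weights the previous task ended with.
            init = list(trained)
        # Scheme 1: relaxation restores the ferroelectric (source) state,
        # so init is left unchanged for the next task.
    return history


def toy_train(weights, task_name):
    """Placeholder for training: shifts every weight by a task-specific amount."""
    shift = 0.01 * len(task_name)
    return [w + shift for w in weights]


SOURCE = [0.2, -0.1, 0.5]
TASKS = ["CIFAR-10", "CIFAR-100", "Oxford 102 Flowers"]

scheme1 = run_task_flow(SOURCE, TASKS, toy_train, scheme=1)
scheme2 = run_task_flow(SOURCE, TASKS, toy_train, scheme=2)

for step1, step2 in zip(scheme1, scheme2):
    print(step1["task"], step1["init"], step2["init"])
```

In Scheme 1 every task sees the same source weights, whereas in Scheme 2 the initial weights drift with the task history.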
a Processes of two schemes for neural reuse in multiple downstream tasks. Arrows with solid line represent neural reuse on the task flow, and arrows with dotted line represent the ion relaxation process. b Boxplots generated by the loss values calculated on the corresponding dataset. It reflects the measurement level of experimental data, which repeatedly recorded loss in all subsamples received randomly. c Learning curves within 20 epochs for each task. It shows the evolution trend of the learning speed and accuracy in each task flow under the two schemes. d RMSE for two schemes. The RMSE on each scheme downstream task is calculated on its matched test set to reveal the effect after transferring features in varied schemes.
To observe the training effect of the model on the same task under different schemes, the loss values were first assessed. As depicted in Fig. 5b, a lower range of loss values in Scheme 1 indicates a more profound level of convergence. This also suggests that the well-trained source model has effectively prepared a highly optimized and efficient general model for each downstream task. In addition, in multi-task scenarios, the model is typically expected to maintain high accuracy on each new task. Therefore, learning curves for both schemes were recorded, as illustrated in Fig. 5c. The results indicate that Scheme 1 yields an overall improvement in training speed and accuracy across various tasks. Moreover, since the root-mean-square error (RMSE)28 tends to be more sensitive to larger errors, disparities in RMSE values can provide an intuitive basis for comparing generalization abilities. Hence, the prediction quality on each task was evaluated by calculating the RMSE on its respective test set. As illustrated in Fig. 5d, Scheme 1 ensures that the model maintains a higher recognition accuracy for previously unseen samples in new tasks.
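The sensitivity of RMSE to larger errors can be illustrated with a small numerical example. The two prediction sets below are invented for illustration and have the same mean absolute error. How the targets are encoded for the classification tasks in this work is not specified here, so the sketch uses generic numeric targets.

```python
import math


def rmse(predictions, targets):
    """Root-mean-square error between two equally long sequences."""
    squared = [(p - t) ** 2 for p, t in zip(predictions, targets)]
    return math.sqrt(sum(squared) / len(squared))


def mae(predictions, targets):
    """Mean absolute error, shown only for comparison."""
    return sum(abs(p - t) for p, t in zip(predictions, targets)) / len(targets)


targets = [0.0, 0.0, 0.0, 0.0]
evenly_wrong = [1.0, 1.0, 1.0, 1.0]   # four small errors
one_outlier = [0.0, 0.0, 0.0, 4.0]    # one large error, same total error

print(mae(evenly_wrong, targets), rmse(evenly_wrong, targets))
print(mae(one_outlier, targets), rmse(one_outlier, targets))
```

Both cases have a mean absolute error of 1.0, but the RMSE rises from 1.0 to 2.0 when the error is concentrated in one sample, which is why RMSE differences highlight poorly generalized predictions.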
Discussion
In this work, we have demonstrated the integration of neural reuse with memristor-based neural networks through a variable configuration between the ferroelectric polarization and ion migration. This implies that the division of application scenarios based on memristor properties is no longer a necessity. Instead, they can be integrated into a progressive and more powerful framework, which is lacking in most reported reconfiguration strategies.
In addition, through careful modulation of the order-disorder phase in CIPS, the proposed device not only unlocks dual working modes driven by distinct memristive mechanisms but also ensures that the ferroelectric polarization remains consistent after repeated cycles of ion migration and relaxation. Leveraging the refresh ability of the real device, further simulations indicate that integrating neural reuse can improve the training efficiency and generalization ability of memristor-based neural networks, whether compared with a randomly initialized model or in specific multi-task scenarios. This work opens up the possibility of implementing neural reuse in memristor-based neural network platforms, thereby equipping the hardware with a more capable learning mechanism.
Methods
Sample preparation and characterization
High-quality single crystals of CIPS and MoS2 were synthesized by chemical vapor transport and the flux method, respectively29,30. The PFM and Switching Spectroscopy PFM (SS-PFM) measurements were performed on an Asylum Research Cypher S system. An X-ray diffractometer (DX-27) was used to characterize the phase structure of bulk CIPS at 30 mA and 40 kV using monochromatic Cu Kα radiation. Raman spectra were acquired using a confocal Raman spectrometer (HORIBA LabRAM Odyssey) with a 50× objective lens (NA = 0.5) at room temperature in air. The power of the 532 nm excitation laser was kept below 5 mW. Thin flakes were obtained by mechanical exfoliation from the as-grown bulk crystals onto heavily doped silicon substrates with a 285 nm SiO2 oxidation layer.
The vertical stacks of vdW-layered compounds were assembled using the dry-transfer method31. The process includes two steps: first, a poly(bisphenol A carbonate) (PC) film-covered polydimethylsiloxane stamp on a glass slide was used at 120 °C to pick up the flakes layer by layer; then, the stack was released onto a SiO2 (285 nm)/Si (p+ doped) substrate at 175 °C, followed by rinsing in chloroform.
Device fabrication and electrical measurement
The electrodes were patterned using digital micro-mirror device lithography, followed by the deposition of 8/50 nm Cr/Au films using an electron beam evaporator. Electrical characterization was conducted using an FS-Pro 380 semiconductor device analyzer, with the devices placed in a vacuum probe station (approximately 10⁻² Torr).
During the measurement, the top MoS2 layer was grounded, while the bias voltage was applied to the bottom MoS2 layer. Additionally, the gate voltage was applied to the top graphite layer.
Neural network simulation
In this study, the residual neural network consists of 18 layers (ResNet-18), including one input layer, four residual layers, and a fully connected (FC) layer. Specifically, every residual layer contains a pair of basic residual blocks, each of which includes two 3×3 convolutional kernels with an identical number of channels. The pre-trained weights are stored in the residual blocks, while fine-tuning is conducted across the entire network. For the network parameters, each synaptic weight is encoded through a differential pair of two memristive devices1,32,33 with the same conductance distribution, thereby constraining the weight range to [−1,1]. The effective weights (Supplementary Fig. 20) are obtained from the difference between the two conductances and are updated by backpropagation (Supplementary Text 4). Furthermore, the source data for the pre-trained weights of the ResNet-18 model are officially provided by PyTorch. Those data are mapped onto the device conductance under the ferroelectric polarization working mode. Subsequently, fine-tuning is implemented in the ion migration mode (Fig. 3 and Supplementary Fig. 15).
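The differential-pair encoding can be illustrated with a short sketch. The conductance bounds `G_MIN` and `G_MAX`, the arbitrary units, and the particular mapping (one device of the pair held at the minimum conductance) are assumptions made for illustration; the actual conductance distribution and mapping used in this work are described in the Supplementary Information.

```python
from typing import Tuple

G_MIN = 1.0   # hypothetical minimum device conductance (arbitrary units)
G_MAX = 11.0  # hypothetical maximum device conductance (arbitrary units)


def encode_weight(w: float) -> Tuple[float, float]:
    """Map a weight in [-1, 1] to a conductance pair (g_pos, g_neg).

    Assumed mapping: positive weights raise g_pos, negative weights raise
    g_neg, and the other device of the pair stays at G_MIN.
    """
    if not -1.0 <= w <= 1.0:
        raise ValueError("weight must lie in [-1, 1]")
    span = G_MAX - G_MIN
    g_pos = G_MIN + max(w, 0.0) * span
    g_neg = G_MIN + max(-w, 0.0) * span
    return g_pos, g_neg


def decode_weight(g_pos: float, g_neg: float) -> float:
    """Effective weight: conductance difference normalized by the range."""
    return (g_pos - g_neg) / (G_MAX - G_MIN)


# Because both devices share the same conductance range, the effective
# weight is bounded by -1 (g_pos at minimum, g_neg at maximum) and +1.
w_max = decode_weight(G_MAX, G_MIN)
w_min = decode_weight(G_MIN, G_MAX)
w_half = decode_weight(*encode_weight(0.5))
```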
Three datasets commonly used in deep learning are involved in this work, namely CIFAR-10, CIFAR-100, and Oxford 102 Flowers34,35. CIFAR-10 and CIFAR-100 are publicly available in the PyTorch framework. Oxford 102 Flowers can be accessed at https://www.robots.ox.ac.uk/~vgg/data/flowers/102/. Taking CIFAR-10 as an illustration, it comprises color images spanning 10 categories of distinct objects. For data augmentation and pixel intensity stabilization, images are horizontally flipped at random and their RGB color channels are standardized using the mean and standard deviation derived from millions of images in ImageNet36. The preprocessing operations on CIFAR-100 are identical. In contrast, the images in Oxford 102 Flowers are resized to 224 by 224 pixels owing to their high resolution. To make effective use of the small amount of data in Oxford 102 Flowers (which has only 10 images per label in the training set), additional measures including random vertical crop, random rotation, and center crop are taken to augment the dataset. In the scenario involving input image corruption, each original image is superimposed with Gaussian noise37 \(N(0,\sigma)\) controlled by the standard deviation (\(\sigma\)). In particular, the images remain unchanged when \(\sigma=0\).
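The flip, standardization, and noise steps can be sketched in plain Python on a tiny image stored as rows of RGB tuples. The per-channel constants are the ImageNet statistics commonly used with PyTorch models; the flip probability of 0.5, the order of operations, and the absence of clipping after noise are assumptions, since they are not stated here.

```python
import random
from typing import List, Tuple

Pixel = Tuple[float, float, float]
Image = List[List[Pixel]]  # rows of RGB pixels with values in [0, 1]

IMAGENET_MEAN = (0.485, 0.456, 0.406)  # commonly used ImageNet channel means
IMAGENET_STD = (0.229, 0.224, 0.225)   # commonly used ImageNet channel stds


def random_horizontal_flip(image: Image, rng: random.Random, p: float = 0.5) -> Image:
    """Mirror each row with probability p (p = 0.5 is an assumption)."""
    if rng.random() < p:
        return [row[::-1] for row in image]
    return [list(row) for row in image]


def standardize(image: Image) -> Image:
    """Subtract the channel mean and divide by the channel std."""
    return [
        [tuple((c - m) / s for c, m, s in zip(px, IMAGENET_MEAN, IMAGENET_STD))
         for px in row]
        for row in image
    ]


def add_gaussian_noise(image: Image, sigma: float, rng: random.Random) -> Image:
    """Superimpose zero-mean Gaussian noise with standard deviation sigma."""
    return [
        [tuple(c + rng.gauss(0.0, sigma) for c in px) for px in row]
        for row in image
    ]


rng = random.Random(0)
image: Image = [[(0.2, 0.4, 0.6), (0.485, 0.456, 0.406)]]
clean = add_gaussian_noise(image, 0.0, rng)   # sigma = 0 leaves the image unchanged
noisy = add_gaussian_noise(image, 0.1, rng)
flipped = random_horizontal_flip(image, rng, p=1.0)
normalized = standardize(image)
```

A pixel equal to the channel means maps to zero after standardization, which is a quick sanity check on the constants.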
Reporting summary
Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.
Data availability
All data needed to evaluate the findings of this study are available within the Article. Source data are provided with this paper.
Code availability
The code used for simulation and data plotting is available from the corresponding authors upon request.
References
Mou, X. et al. Analog memristive synapse based on topotactic phase transition for high-performance neuromorphic computing and neural network pruning. Sci. Adv. 7, eabh0648 (2021).
Anderson, M. L. Neural reuse: a fundamental organizational principle of the brain. Behav. Brain Sci. 33, 245–266 (2010).
Pulvermuller, F. Neural reuse of action perception circuits for language, concepts and communication. Prog. Neurobiol. 160, 1–44 (2018).
Wang, T. et al. Reconfigurable neuromorphic memristor network for ultralow-power smart textile electronics. Nat. Commun. 13, 7432 (2022).
John, R. A. et al. Reconfigurable halide perovskite nanocrystal memristors for neuromorphic computing. Nat. Commun. 13, 2074 (2022).
Yan, X. et al. A low-power reconfigurable memristor for artificial neurons and synapses. Appl. Phys. Lett. 122, 4 (2023).
Zhou, Y. et al. A Reconfigurable Two‐WSe2‐Transistor Synaptic Cell for Reinforcement Learning. Adv. Mater. 34, 2107754 (2022).
Pan, C. et al. Reconfigurable logic and neuromorphic circuits based on electrically tunable two-dimensional homojunctions. Nat. Electron. 3, 383–390 (2020).
Cao, G. M. et al. 2D Material Based Synaptic Devices for Neuromorphic Computing. Adv. Funct. Mater. 31, 2005443 (2021).
Liu, F. et al. Room-temperature ferroelectricity in CuInP2S6 ultrathin flakes. Nat. Commun. 7, 1–6 (2016).
Maisonneuve, V., Cajipe, V., Simon, A., Von Der Muhll, R. & Ravez, J. Ferrielectric ordering in lamellar CuInP2S6. Phys. Rev. B 56, 10860 (1997).
Vysochanskii, Y. M., Stephanovich, V., Molnar, A., Cajipe, V. & Bourdon, X. Raman spectroscopy study of the ferrielectric-paraelectric transition in layered CuInP2S6. Phys. Rev. B 58, 9119 (1998).
Maisonneuve, V. et al. Ionic conductivity in ferroic CuInP2S6 and CuCrP2S6. Ferroelectrics 196, 257–260 (1997).
Sun, Y. et al. Internal ion transport in ionic 2D CuInP2S6 enabling multi-state neuromorphic computing with low operation current. Mater. Today 66, 9–16 (2023).
Wu, J. et al. High tunnelling electroresistance in a ferroelectric van der Waals heterojunction via giant barrier height modulation. Nat. Electron. 3, 466–472 (2020).
Zhong, Z. et al. Robust Threshold-Switching Behavior Assisted by Cu Migration in a Ferroionic CuInP2S6 Heterostructure. ACS Nano 17, 12563–12572 (2023).
Zhu, H. et al. Highly Tunable Lateral Homojunction Formed in Two-Dimensional Layered CuInP2S6 via In-Plane Ionic Migration. ACS Nano 17, 1239–1246 (2023).
Neumayer, S. M. et al. Alignment of Polarization against an Electric Field in van der Waals Ferroelectrics. Phys. Rev. Appl. 13, 064063 (2020).
Pan, S. J. & Yang, Q. A survey on transfer learning. IEEE Trans. Knowl. Data Eng. 22, 1345–1359 (2009).
Weiss, K., Khoshgoftaar, T. M. & Wang, D. A survey of transfer learning. J. Big Data 3, 1–40 (2016).
Thomas, A. Memristor-based neural networks. J. Phys. D: Appl. Phys. 46, 093001 (2013).
Chen, J. et al. Mimicking Neuroplasticity via Ion Migration in van der Waals Layered Copper Indium Thiophosphate. Adv. Mater. 34, 2104676 (2021).
Neumayer, S. M. et al. Ionic Control over Ferroelectricity in 2D Layered van der Waals Capacitors. ACS Appl. Mater. Interfaces 14, 3018–3026 (2022).
Shewmon, P. G. Diffusion in solids. (Springer, New York, 1989).
Jiang, X. et al. Manipulation of current rectification in van der Waals ferroionic CuInP2S6. Nat. Commun. 13, 574 (2022).
Zhuang, F. et al. A comprehensive survey on transfer learning. Proc. IEEE 109, 43–76 (2020).
He, K., Zhang, X., Ren, S. & Sun, J. Deep residual learning for image recognition. In Proc. IEEE conference on computer vision and pattern recognition. 770–778 (IEEE, 2016).
Chai, T. & Draxler, R. R. Root mean square error (RMSE) or mean absolute error (MAE). Geosci. Model Dev. Discuss. 7, 1525–1534 (2014).
Maisonneuve, V., Evain, M., Payen, C., Cajipe, V. & Molinie, P. Room-temperature crystal structure of the layered phase CuIInIIIP2S6. J. Alloy. Compd. 218, 157–164 (1995).
Cevallos, F. A. et al. Liquid Salt Transport Growth of Single Crystals of the Layered Dichalcogenides MoS2 and WS2. Cryst. Growth Des. 19, 5762–5767 (2019).
Pizzocchero, F. et al. The hot pick-up technique for batch assembly of van der Waals heterostructures. Nat. Commun. 7, 11894 (2016).
Li, C. et al. Efficient and self-adaptive in-situ learning in multilayer memristor neural networks. Nat. Commun. 9, 2385 (2018).
Yao, P. et al. Fully hardware-implemented memristor convolutional neural network. Nature 577, 641–646 (2020).
Krizhevsky, A. & Hinton, G. Learning multiple layers of features from tiny images (University of Toronto, 2009).
Hiary, H., Saadeh, H., Saadeh, M. & Yaqub, M. Flower classification using deep convolutional neural networks. IET Comput. Vis. 12, 855–862 (2018).
Deng, J. et al. Imagenet: A large-scale hierarchical image database. In Proc. IEEE conference on computer vision and pattern recognition 248–255 (IEEE, 2009).
Dodge, S. & Karam, L. Understanding how image quality affects deep neural networks. In 2016 eighth international conference on quality of multimedia experience (QoMEX) 1–6 (IEEE, 2016).
Acknowledgements
The authors thank T. Zhou and Y. Hu from the Analysis and Testing Center at the University of Electronic Science and Technology of China for technical support. This work was supported by the National Key Research & Development Program (2020YFA0309200), the National Natural Science Foundation of China (No. 62074025, 12161141015, 62274024), Sichuan Science and Technology Program (No. 2024GJHZ0301, 2024NSFSC1002, 2024NSFSC0232) and Sichuan Province Key Laboratory of Display Science and Technology.
Author information
Contributions
F.L. supervised the project. F.L., J.C., and Z.W. conceived the idea. J.C. fabricated the memristor devices and performed the electrical measurements. Z.W. performed the simulations of transfer learning. F.Y. and R.B. assisted with the device fabrication and electrical measurements. Q.L. and R.B. grew the single crystals. E.P., Y.Z., and X.L. conducted the XRD and PFM measurements. J.C. and Q.Z. conducted the Raman characterization. F.L., J.C., Z.W., Q.Z., Q.L., and L.D. analyzed the data. J.C., Z.W., and F.L. wrote the manuscript with input from all the authors. All authors discussed the results.
Ethics declarations
Competing interests
The authors declare no competing interests.
Peer review
Peer review information
Nature Communications thanks Bilu Liu and the other, anonymous, reviewer(s) for their contribution to the peer review of this work. A peer review file is available.
Additional information
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.
About this article
Cite this article
Chen, J., Wen, Z., Yang, F. et al. Refreshable memristor via dynamic allocation of ferro-ionic phase for neural reuse. Nat Commun 16, 702 (2025). https://doi.org/10.1038/s41467-024-55701-0
DOI: https://doi.org/10.1038/s41467-024-55701-0