Bayesian neural network with unified entropy source and synapse weights using 3D 16-layer Fe-diode array

Huang, Yuanquan; Wu, Qiqiao; Gong, Tiancheng; Yang, Jianguo; Luo, Qing; Liu, Ming

doi:10.1038/s41467-025-63302-8

Article
Open access
Published: 28 August 2025

Nature Communications volume 16, Article number: 8063 (2025)

Abstract

Edge artificial intelligence systems must run at high frequencies to meet intensive computational demands, yet the noise density of most traditional entropy sources decays with frequency. This work shows that the physical properties of Fe-diode devices are well suited to edge systems that operate at high frequencies and under large temperature changes. The noise density of the Fe-diode can be tuned by the amplitude of the read voltage and remains stable at high frequencies and under temperature fluctuations. A Bayesian neural network with Fe-diode devices is experimentally implemented in high-speed, high-density silicon-based chips. This hierarchical Bayesian neural network is demonstrated on a 3D 16-layer Fe-diode array with a unified entropy source and 4-state synapses. The system achieves high area efficiency, a wide working temperature range, low-energy in situ training, and high recognition accuracy.

Introduction

Machine learning (ML) examines large datasets to develop prediction models and has become more popular than traditional statistical approaches^1,2,3,4. The truthfulness and accuracy of the data are of great importance for machine learning. However, because of the inherent variability, partial observation, or lack of information in real dynamic systems^5,6, the datasets collected by machines are finite, and introducing randomness to improve model generalization is critical. Traditional deep learning models are often data-driven black boxes that cannot be used freely in error-sensitive domains. For example, in autonomous driving and medical diagnostics, prediction errors can be extremely costly and can raise ethical issues. In practice, the captured data rarely cover all cases, so the training dataset cannot be comprehensive, and this can lead to errors in rare cases. In addition, conventional deep learning models lack transparency and interpretability because of their complexity, which makes it difficult to find the cause of errors and to debug the models.

To address these issues, Bayesian neural networks (BayesNN) provide an alternative path to quantify uncertainty and explain loss functions. Bayesian neural networks achieve non-deterministic inference by randomizing the synaptic weights. Two approaches to weight randomization dominate: Gaussian-distributed weights^7,8 and Monte Carlo (MC) dropout of weights^9,10. When information and conditions are limited, the Bayesian method helps to predict the near-true probability of an event by updating its estimate step by step. The posterior distribution inferred from previous data is used to calculate the prediction variance or entropy and thus to assess the credibility of the probabilistic model.
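As a rough illustration of how randomized weights turn into an uncertainty estimate, the following Python sketch draws Gaussian weights for a toy single-layer classifier, averages the class probabilities over many stochastic forward passes, and reports the entropy of the averaged prediction. The function names, the toy sizes, and the values of `mu` and `sigma` are assumptions made for illustration and do not come from this work.

```python
import math
import random


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [v / total for v in exps]


def sample_forward(x, mu, sigma, rng):
    """One stochastic forward pass: every weight is drawn from N(mu_ij, sigma^2)."""
    logits = []
    for row in mu:  # one row of mean weights per output class
        logits.append(sum(rng.gauss(m, sigma) * xi for m, xi in zip(row, x)))
    return softmax(logits)


def predictive_entropy(x, mu, sigma, n_samples=200, seed=0):
    """Average class probabilities over n_samples passes and return (mean_p, entropy)."""
    rng = random.Random(seed)
    mean_p = [0.0] * len(mu)
    for _ in range(n_samples):
        p = sample_forward(x, mu, sigma, rng)
        mean_p = [a + b / n_samples for a, b in zip(mean_p, p)]
    entropy = -sum(p * math.log(p) for p in mean_p if p > 0)
    return mean_p, entropy
```

A high entropy of the averaged prediction signals that the sampled networks disagree or are individually unsure; the entropy is bounded above by the logarithm of the number of classes.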

Gaussian weight models are usually hard to implement with simple hardware. In CMOS circuits, Gaussian sampling needs to be implemented using linear feedback shift registers (LFSRs) and combinational logic gates¹¹. To minimize the hardware cost, some works have used the conductance of resistive devices to represent the weights and exploited noise fluctuations to implement the Gaussian weights^7,8. Lin et al. presented a Bayesian neural network that exploits the intrinsic stochasticity of RRAMs⁷. They fitted the analog conductance distribution of multiple RRAM devices to a Gaussian distribution and used it to represent the weights. Dalgaty et al. proposed resistive memory-based inference hardware for Bayesian neural networks⁸. They trained a Bayesian neural network ex situ and then transferred the resulting software model in a single programming step to an array of 16,384 resistive memory devices. However, the noise fluctuations of RRAM, which are low-frequency noise, decrease as the frequency and temperature increase^{12,13,14,15,16}. This intrinsic issue is determined by the nature of the device itself and greatly restricts its application at high frequencies and temperatures. W. Schottky discovered that shot noise¹⁷, which has a high noise density regardless of frequency and temperature, can be generated by the current flowing through a diode. Thus, a diode with storage capability is exactly what is needed for Bayesian neural networks.
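For context, a digital Gaussian sampler of the kind mentioned above can be sketched as follows. This is a generic construction (a 16-bit Fibonacci LFSR with the commonly used taps 16, 14, 13, 11, followed by a sum of 12 uniform values), not the circuit of ref. 11; the seed, the word width, and the sum-of-uniforms approximation are all assumptions.

```python
def lfsr16(state):
    """One step of a 16-bit Fibonacci LFSR (polynomial x^16 + x^14 + x^13 + x^11 + 1)."""
    bit = ((state >> 0) ^ (state >> 2) ^ (state >> 3) ^ (state >> 5)) & 1
    return (state >> 1) | (bit << 15)


def lfsr_uniform(state):
    """Advance 16 steps so that all 16 bits are fresh, then scale the word to (0, 1)."""
    for _ in range(16):
        state = lfsr16(state)
    return state, state / 65536.0


def lfsr_gaussian(state, mu=0.0, sigma=1.0):
    """Approximate N(mu, sigma^2) by summing 12 uniforms (variance 12 * 1/12 = 1)."""
    total = 0.0
    for _ in range(12):
        state, u = lfsr_uniform(state)
        total += u
    return state, mu + sigma * (total - 6.0)


# Usage: thread the LFSR state through successive calls.
state = 0xACE1  # any non-zero 16-bit seed (assumed value)
state, z = lfsr_gaussian(state)
```

Each sample costs 192 shift steps in this sketch, and the output is pseudo-random and periodic, which is part of the motivation for a physical entropy source.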

In this work, through a systematic investigation of the noise behavior of the Fe-diode, shot noise in the high resistance state (HRS), with its frequency-independent high noise density and temperature stability, is selected as the entropy source. Moreover, a hierarchical BayesNN with a unified entropy source and weights is constructed on a 3D 16-layer Fe-diode array. The top layer of the 3D Fe-diode array generates high-density shot noise and serves as a high-quality entropy source. Each of the other layers provides multi-level conductance states for the synapse weights. Unifying the synapse weights and the entropy source in the same array through 3D integration yields low energy consumption and high area efficiency. Based on these advantages, a low-energy, high-area-efficiency BayesNN system with a wide working temperature range (−40 °C to 125 °C) is experimentally demonstrated, and in situ training on edge devices is achieved with 92.4% MNIST recognition accuracy and 0.9997 minimum entropy.

Results

Figure 1a depicts the structure of the 3D 16-layer Fe-diode. The fabrication process flow of the Fe-diode (Supplementary Fig. 1) comprises five main steps¹⁸: (i) multiple TiN/SiO₂ layers are deposited by PVD and PECVD; (ii) a dry etch splits the plane electrode; (iii) the trench is filled with SiO₂; (iv) the hole is etched; and (v) the functional layer is deposited. Because the structure stacks vertically, implementing multi-layer 3D Fe-diodes is straightforward. Compared with conventional planar Fe-diodes^{19,20,21,22,23,24,25}, this 3D configuration is more area-efficient and enables ultra-dense storage without significantly complicating the manufacturing process. In addition, our 3D Fe-diodes are designed to be fully compatible with existing CMOS technology, allowing large-scale and cost-effective production.

**Fig. 1: Device structure of 16-layer 3D Fe-diode.**

The device’s structure was analyzed using cross-sectional transmission electron microscopy (TEM), which confirmed the vertical arrangement of the 16-layer Fe-diode device (Fig. 1b). In contrast to simpler devices, our Fe-diode integrates 16 layers of ferroelectric and electrode materials, creating a highly structured multi-layer device. Each layer’s precise thickness enhances the electrical performance and scaling potential, allowing for a uniform and reliable device operation. The vertical stacking of these layers addresses scaling limitations found in planar configurations, enabling a more compact design with improved energy efficiency. The zoomed-in high-resolution cross-sectional TEM image illustrates the vertical structure of a single Fe-diode (Fig. 1c).

The advantages of the Hf-based 3D 16-layer Fe-diode lie in its potential for high-density integration, enabling a selector-free cross-point architecture that simplifies circuit design. Unlike traditional 1T1C and FeFET structures^26,27,28, the Fe-diode provides inherent non-linearity and enhances read performance^{29,30,31,32,33}. In addition, the compatibility of Hf-based materials with existing fabrication processes allows for improved miniaturization and integration^{34,35,36,37,38,39}. Importantly, the Fe-diode also offers significant energy efficiency with fast operation speed and low read current, as its operational mechanism reduces power consumption during read and write cycles²⁹. This 16-layer configuration not only demonstrates the scalability of the technology but also enhances its performance capabilities. Its ability to achieve efficient data storage while minimizing complexity and energy use makes it suitable for low-energy-consumption and high-area-efficiency memory.

The multilevel storage characteristics of the 3D Fe-diode array are illustrated by the statistical results obtained from the 8 kb 3D Fe-diode array (Supplementary Figs. 2 and 3), as depicted in Fig. 1d, e. Figure 1d shows the multilevel conductance states, which represent the synapse weights essential for neuromorphic computing applications. In Fig. 1e, States 1, 2, and 3 are situated in the low resistance state (LRS), while State 4 is in the high resistance state (HRS). This distinct separation between LRS and HRS is consistent with the observations in Fig. 1d, confirming the effectiveness of the multilevel state design.

Figure 1f shows the classification of noise in electronic devices. Random telegraph noise (RTN), which is caused by a single trap, is difficult to activate and control. Thermal noise is frequency independent, but its density is very small and it is easily affected by the test environment. Although 1/f noise has a high noise density, that density decreases as the frequency increases⁴⁰.

The S/I² characteristics of the Fe-diode in the high resistance state (HRS) under various reading voltages are shown in Fig. 1g. In the high V_read region, the presence of 1/f noise is notable, suggesting fluctuations that are dependent on frequency. As V_read decreases, however, a frequency-independent high-density noise, characterized by slope (β) = 0, becomes the dominant feature. The observed noise (β = 0) is primarily linked to shot noise⁴¹.
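To make the slope criterion concrete, the sketch below estimates β from a measured spectrum by a least-squares fit of log S against log f, with the convention S ∝ 1/f^β, and labels the result. The function names and the tolerance of 0.2 are assumptions chosen for illustration; a flat spectrum (β ≈ 0) is consistent with shot noise but does not by itself distinguish it from thermal noise.

```python
import math


def spectral_slope(freqs, psd):
    """Return beta for S(f) ~ 1/f**beta from a least-squares fit in log-log space."""
    xs = [math.log10(f) for f in freqs]
    ys = [math.log10(s) for s in psd]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return -sxy / sxx  # negative sign: a falling spectrum gives positive beta


def classify_noise(beta, tol=0.2):
    """Coarse label for a fitted slope; the tolerance is an assumed value."""
    if abs(beta) < tol:
        return "white (shot- or thermal-like)"
    if abs(beta - 1.0) < tol:
        return "1/f"
    return "other"
```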

The experimental results indicate that the multilevel states of the 3D Fe-diode are uniform, which is crucial for reliable data storage. Consistent resistance levels across multiple states improve the read accuracy and data integrity of the array. Furthermore, the uniform distribution of resistance states allows precise tuning of synaptic weights, making the 3D Fe-diode array suitable for high-density memory applications and advanced computing systems.

In short, the 16-layer Fe-diode configuration offers several advantages over single-layer or fewer-layer designs. First, the larger number of layers provides a higher density of ferroelectric tunnel junctions, which allows more compact integration and improved device scalability, both crucial for high-area-efficiency applications. Second, the 16-layer structure improves the overall energy efficiency of the device: stacking multiple layers optimizes the effective channel resistivity and reduces power consumption and heat generation compared with simpler architectures. The configuration also combines the benefits of the individual layers, giving better electrical characteristics and more reliable operation, which benefits low-energy-consumption applications. Third, the 16-layer Fe-diode remains compatible with standard CMOS fabrication processes, which eases its integration into existing semiconductor technology, makes large-scale production feasible, and helps lower manufacturing costs. Together, these properties make the 16-layer Fe-diode design a suitable choice for low-energy-consumption and high-area-efficiency applications.

Ideal entropy source by noise in 3D Fe-diode

The temperature dependence of different types of noise is shown in Fig. 2a. The density of thermal noise and 1/f noise is affected by temperature and frequency, respectively⁴⁰. Thus, an ideal entropy source requires a high-density noise that is independent of both temperature and frequency.

**Fig. 2: Entropy Source by Noise in 3D Fe-diode.**

The characteristics of shot noise in Fig. 2b can be analyzed using the Fano factor, F = S/(2qI), where S is the current noise power spectral density, I is the current, and q is the elementary charge; in terms of the normalized spectrum, F = (S/I²)·I/(2q). This factor expresses how much the noise is enhanced or suppressed relative to the full shot-noise level 2qI⁴². The calculated value of F at a frequency of 10² Hz in the low V_read region is displayed in Fig. 2c. The change in noise behavior with V_read can be attributed to the underlying conduction mechanism, which in the HRS is governed by trap-assisted tunneling (TAT), depicted in Fig. 2d. The shot noise is generated by the TAT process⁴³. Additional context is provided in Supplementary Fig. 4. At low V_read, the conduction is dominated by TAT through fast traps, which facilitates efficient charge transport. As V_read increases, however, slow traps become significantly involved, influencing the trapping and detrapping processes and altering the overall noise characteristics⁴⁴. Consequently, shot noise is predominantly observed only at lower V_read levels. These findings indicate that the shot noise observed in the HRS at low V_read is a suitable candidate for an entropy source because of its frequency-independent high noise density.
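A small numerical sketch of the Fano factor may help. It uses the convention S = 2qF·I, which the text applies to the shot-noise power spectral density, and accepts either the raw spectrum S (in A²/Hz) or the normalized spectrum S/I² (in 1/Hz). The function names and the example current are assumptions for illustration, not measured values.

```python
Q_E = 1.602176634e-19  # elementary charge in coulombs (exact SI value)


def fano_factor(s_i, current):
    """F = S / (2 q I), with S in A^2/Hz and I in A."""
    return s_i / (2.0 * Q_E * abs(current))


def fano_from_normalized(s_over_i2, current):
    """Same quantity computed from the normalized spectrum S / I^2 (units 1/Hz)."""
    return s_over_i2 * abs(current) / (2.0 * Q_E)


# Full shot noise at an assumed current of 1 nA gives F = 1 by construction.
i_read = 1e-9
s_full = 2.0 * Q_E * i_read
print(fano_factor(s_full, i_read))
```

Values of F above 1 indicate noise enhanced beyond the full shot-noise level, and values below 1 indicate suppression.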

Moreover, the temperature dependence of the shot noise was measured. The normalized I_T power spectral densities (S_IT/I_T²) of the 3D Fe-diode in the HRS at −40, 25, and 125 °C are shown in Supplementary Fig. 5a–c, respectively. The behavior of S_IT/I_T² changes from 1/f noise to shot noise as the read bias (V_read) decreases, regardless of T. Because the TAT current depends mainly on the applied voltage V rather than on the temperature T, the power spectral density of the shot noise (S = 2qF·I) is almost constant. Therefore, in the low-voltage region, the shot noise is stable with temperature, as shown in Fig. 2e (Supplementary Figs. 6, 7). In short, shot noise in the HRS at low V_read is suitable as an ideal entropy source owing to its high noise density, which is independent of frequency and temperature.

In contrast, the noise behavior of the Fe-diode cells in the low resistance state (LRS) reveals a different profile. Unlike the HRS, the LRS exhibits only 1/f noise without any indication of shot noise, and the magnitude of this noise remains nearly constant, as illustrated in Fig. 2f, g. This consistent behavior aligns with the Schottky emission mechanism, suggesting a stable conduction environment under these conditions. The electrical test results further corroborate this mechanism, as shown in Fig. 2h, i.

Surprisingly, we found that the physical properties of the Fe-diode device are exactly what is needed for edge systems with high frequencies and drastic temperature changes. The noise density of the Fe-diode can be modified by the amplitude of the read voltage and remains stable at high frequencies and temperature fluctuations. With Fe-diode devices, Bayesian neural networks with an entropy source can be implemented in high-speed, high-density silicon-based chips. The mean and standard deviation of the weight distribution of the neural network are independently tunable. The mean is represented by the conductance of the Fe-diode, while the standard deviation can be fine-tuned by the Fe-diode-based entropy source. For entropy sources, different read voltages need to be used to obtain different densities of noise. For weight reads, a constant read voltage is required. We can adjust their mean and standard deviation separately without affecting each other. Thus, hierarchical Bayesian neural network models can be easily implemented. In the hierarchical model³, we can assume that the parameters are chosen from a normal distribution with mean and standard deviation, which are the hyperparameters. In this work, these hyperparameters can be represented by the physical properties of the Fe-diode device.

Figure 3a shows the structure of the hierarchical Bayesian neural network in this work. The network is divided into five layers with 784, 480, 240, 120, and 10 neurons. The number of neurons in the input layer corresponds to the number of pixels in each image of the dataset. The dataset used for the experiments is MNIST, with an image dimension of 28 × 28. Example MNIST images of the digits “1”, “2”, “4”, and “8” are shown at the top of Fig. 3a. When training the network model, we update not only the weight values after each epoch but also the standard deviation of the weights. To capture the correlation of weights in the same layer, the weights at the same hierarchy share the same standard deviation. Limiting the network parameters in such a hierarchical manner provides a profile representation of the Bayesian neural network. As shown in Fig. 3a, σ1, σ2, σ3, and σ4 are the standard deviations of the weights connecting layer i and layer i + 1 for i = 1, 2, 3, and 4, respectively. The visualization of the mapping of weights and standard deviations onto the 16-layer arrays is shown in Supplementary Fig. 8.
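The hierarchical parameterization can be sketched in a few lines of Python. The layer sizes are taken from the text; the helper names and the software Gaussian generator (standing in for the device noise) are assumptions for illustration. Each weight matrix has its own means but a single standard deviation shared by the whole layer.

```python
import random

LAYER_SIZES = [784, 480, 240, 120, 10]  # neurons per layer, as given in the text


def weight_counts(sizes):
    """Number of weights between each pair of consecutive layers."""
    return [n_in * n_out for n_in, n_out in zip(sizes, sizes[1:])]


def sample_layer(mu, sigma, rng):
    """Draw one realization of a weight matrix: w_ij ~ N(mu_ij, sigma^2), sigma shared."""
    return [[rng.gauss(m, sigma) for m in row] for row in mu]


def sample_network(mu_layers, sigmas, rng):
    """Sample every layer with its own shared standard deviation (one sigma per layer)."""
    return [sample_layer(mu, s, rng) for mu, s in zip(mu_layers, sigmas)]
```

For the sizes above, `weight_counts(LAYER_SIZES)` gives 376,320, 115,200, 28,800, and 1,200 weights, or 521,520 in total, while only four standard deviations are needed.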

**Fig. 3: The hierarchical BayesNN system based on 3D 16-layer Fe-diode array.**

Figure 3b shows the workflow for in situ training using the Fe-diode array. Energy-efficient in situ training is achieved by backpropagation with expectation maximization. For better training, the initial conductance is usually set to a random moderate value. Then, we input the image pixels and labels from the MNIST training set into the network model and perform forward inference and backpropagation. Immediately after that, the weights and hierarchical deviations represented by the Fe-diode array are updated according to the backpropagation algorithm. The details of this step are shown in Fig. 3c. The conductance G_target of the target device is calculated and programmed into the corresponding device of the Fe-diode array. The standard deviation σ_n of the weights connecting layer n and layer n + 1 is also calculated, and the noise magnitude of the weights at each hierarchy is then tuned to match the calculated result (Supplementary Fig. 9). The noise amplitude is tuned by fine-tuning the read voltage of the Fe-diode device. The Bayesian neural network in this work contains four different standard deviations, so only four different read voltages need to be set. To implement the entire network in one 3D array, we crop the images and downsize the network (Supplementary Figs. 10 and 11).
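One way to picture the weight-update step is sketched below: a gradient step produces a new target weight, which is then quantized onto a pair of 4-state devices whose conductance difference encodes the signed value. The four states per device and the use of positive and negative bit lines come from this work; the plain gradient step, the linear level spacing, the clipping range `w_max`, and all function names are assumptions, and the actual programming scheme may differ.

```python
N_LEVELS = 4  # conductance states per device (4-state synapse, from the text)


def sgd_step(w, grad, lr=0.1):
    """Plain gradient-descent update of one weight (assumed update rule)."""
    return w - lr * grad


def weight_to_levels(w, w_max=1.0):
    """Quantize a signed weight to (pos_level, neg_level); level 0 = lowest conductance."""
    w = max(-w_max, min(w_max, w))  # clip to the representable range
    steps = round(w / w_max * (N_LEVELS - 1))
    return (steps, 0) if steps >= 0 else (0, -steps)


def levels_to_weight(pos_level, neg_level, w_max=1.0):
    """Inverse mapping: the weight is proportional to the level difference."""
    return (pos_level - neg_level) * w_max / (N_LEVELS - 1)


# Usage: update a weight, then find the device levels that would be programmed.
w_new = sgd_step(0.5, grad=-1.2, lr=0.1)
pos, neg = weight_to_levels(w_new)
```

With this mapping a device pair represents seven distinct weight values, so the quantization error is one contribution to the hardware non-idealities.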

MNIST is a simple digit recognition task. To validate that our technique can scale to larger and more complex tasks, we also evaluated it on the Fashion-MNIST and CIFAR-10 datasets, obtaining inference accuracies of 88.16% (MLP) and 88.31% (VGG-11), respectively (Supplementary Figs. 12 and 13).

To verify the true randomness of the noise of the Fe-diode devices, i.e., that the noise on different weights is mutually independent and has no time correlation, we extracted the noise of devices at different locations in the array and, using the underlying random number generator, generated random bits for the NIST and autocorrelation function (ACF) tests. The NIST SP800-22 test results were obtained from 5 M random bits generated by the Fe-diode devices¹⁸. The NIST 800-90B test results show that the minimum entropy is 0.99965 (Supplementary Table 1). The ACF test results at a 95% confidence level (within ±0.014) verify that the shot noise is an independent variable¹⁸.
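The two quantities quoted above can be illustrated with simplified estimators. The sketch computes a plug-in most-common-value estimate of the min-entropy per bit and the sample autocorrelation at a given lag, together with the usual 95% white-noise band of ±1.96/√N. This is not the full NIST SP 800-90B or SP 800-22 procedure, and the function names are assumptions.

```python
import math


def min_entropy_mcv(bits):
    """Plug-in min-entropy per bit: -log2 of the most common value's frequency."""
    ones = sum(bits)
    p_max = max(ones, len(bits) - ones) / len(bits)
    return -math.log2(p_max)


def autocorrelation(bits, lag):
    """Sample autocorrelation coefficient of a bit sequence at the given lag."""
    n = len(bits)
    mean = sum(bits) / n
    var = sum((b - mean) ** 2 for b in bits) / n
    cov = sum((bits[i] - mean) * (bits[i + lag] - mean) for i in range(n - lag)) / n
    return cov / var


def acf_bound_95(n):
    """Half-width of the 95% band expected for uncorrelated data of length n."""
    return 1.96 / math.sqrt(n)
```

Under this white-noise assumption a band of ±0.014 corresponds to N = 19,600 samples, since 1.96/√19600 = 0.014; the sample count actually used for the ACF test is not stated here.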

Figure 3d shows the setup of the hierarchical BayesNN system based on a 3D 16-layer Fe-diode array (more details in Supplementary Fig. 14). The die micrograph and SEM image of the test chip are shown in Fig. 3e–g. As seen from the yellow and green waveforms on the oscilloscope, the noise of the Fe-diode device can be fine-tuned (more details in Supplementary Fig. 15). Noise from the measurement equipment, including the oscilloscope and ADCs, is shielded so that it does not affect the network.

Elemental mapping confirms the precise distribution of various materials within the device¹⁸, including the ferroelectric layers, electrodes, and any intermediate spacers, revealing clear boundaries between different materials. An elemental line scan indicates the thicknesses of individual layers¹⁸, highlighting the device’s fine structural precision with reduced dimensions, leading to a highly optimized Fe-diode design.

Figure 4 shows the training results of the Bayesian neural network based on the Fe-diode array. Figure 4a–c show the conductance evolution of the Fe-diodes in the output layer at the initial, intermediate, and final states, respectively. We implement in situ training on the 3D 16-layer Fe-diode array and validate it against the MNIST dataset; the recognition accuracy increases over the training epochs. The training accuracy depends strongly on the randomness of the noise: the better the randomness, the larger the minimum entropy and the closer it is to 1.

**Fig. 4: The conductance of devices at the BayesNN’s output layer. Each BL connects 16 layers, using a total of 160 BLs (80 positive, 80 negative) to store the output layer weights.**

Figure 4d compares the minimum entropy and MNIST recognition accuracy of the Fe-diode-based entropy source and a CMOS latch-based entropy source (Supplementary Fig. 16) at different temperatures. As the temperature decreases, the thermal and RTN noise densities in CMOS devices decline, which lowers the randomness of the CMOS latch-based entropy source. In contrast, thanks to the wide working temperature range of the Fe-diode, its minimum entropy remains stable from −40 °C to 125 °C. The recognition accuracy also remains good over this wide temperature range.

Figure 4e shows the training accuracy over epochs. The hardware-based Bayesian neural network achieves a recognition accuracy of 93.2%, closely following the accuracy of the software simulation, with a 3–5% gap. This gap comes from hardware non-idealities and data accuracy differences. Figure 4f, g show the certain and uncertain results generated by our Bayesian neural network, respectively. Adding noise to the network effectively reduces the error rate in recognizing outliers. The reduction in error rate by BayesNN with noise is 5.38% (MNIST), 8.91% (Fashion-MNIST), and 8.69% (CIFAR-10) (Supplementary Table 2).
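A simple way to separate certain from uncertain outputs is to threshold the entropy of the class probabilities averaged over repeated noisy inferences, as in the sketch below. The normalized-entropy criterion, the threshold of 0.5, and the function name are assumptions for illustration; the decision rule used in this work is not specified here.

```python
import math


def decide(mean_probs, max_norm_entropy=0.5):
    """Return (label, is_certain, normalized_entropy) for averaged class probabilities."""
    k = len(mean_probs)
    h = -sum(p * math.log(p) for p in mean_probs if p > 0)
    h_norm = h / math.log(k)  # 0 = fully confident, 1 = uniform over k classes
    label = max(range(k), key=lambda i: mean_probs[i])
    return label, h_norm <= max_norm_entropy, h_norm
```

Inputs flagged as uncertain can be rejected or passed to a fallback, which is how an uncertainty estimate lowers the error rate on outliers.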

Discussion

In summary, we have experimentally demonstrated a Bayesian neural network implementation based on a 3D 16-layer Fe-diode array, in which the accuracy achieved by in situ training of the network is close to that of the software simulation. BayesNNs have five major requirements: (1) a high-quality entropy source; (2) multi-level conductance states; (3) temperature stability; (4) low energy consumption; and (5) high area efficiency. To meet them, we construct a hierarchical BayesNN with a unified entropy source and weights on the 3D 16-layer Fe-diode array: a single layer of the array generates high-density shot noise as a high-quality entropy source, and each of the other layers provides multi-level conductance states for the synapse weights.

Table 1 summarizes the performance of state-of-the-art works. Owing to the low programming energy of Fe-diodes and the 3D structure, the proposed Bayesian neural network realizes a low-energy (25 fJ/program), high-area-efficiency (0.06F²/state) system that supports in situ training. A wide working temperature range (−40 °C to 125 °C), 92.4% MNIST recognition accuracy, and 0.9997 minimum entropy are achieved.
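The text does not spell out how the area-efficiency figure is obtained. One back-of-the-envelope reading, assuming a 4F² cross-point footprint per vertical pillar position (an assumption, not a value from this paper) shared by 16 stacked layers with 4 states per cell, gives a number of the same size:

```python
def area_per_state_f2(footprint_f2, n_layers, states_per_cell):
    """Footprint per stored state, in units of F^2, for a vertically stacked array."""
    return footprint_f2 / (n_layers * states_per_cell)


# Assumed 4F^2 footprint; 16 layers and 4 states per cell are from the text.
print(area_per_state_f2(4.0, 16, 4))
```

This evaluates to 0.0625 F² per state, which rounds to the quoted 0.06F²/state; other accounting conventions could lead to the same rounded figure.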

Table 1 Benchmark of the proposed Bayesian neural network system with other reported works

Methods

Device fabrication

Fabrication of planar metal–ferroelectric–metal (MFM) capacitors: Planar MFM capacitors were fabricated on p-doped Si (100) substrates. Initially, 30 nm TiN bottom electrodes (BEs) were deposited using physical vapor deposition (PVD). Next, Zr-doped HfO₂ films were deposited at 260 °C using thermal atomic layer deposition (ALD) with Hf[N(C₂H₅)CH₃]₄ and Zr[N(C₂H₅)CH₃]₄, applying a 1:1 precursor cycle ratio to achieve 50% Zr content in the HfO₂ layers. Similarly, 30 nm TiN top electrodes (TEs) were deposited by PVD. The amorphous Zr:HfO₂ (HZO) films were crystallized through rapid thermal annealing in an N₂ environment at 400 °C for 30 s. The top electrode was etched for piezoresponse force microscopy (PFM) testing, and a SiO₂ thin film was deposited on the HZO film. After optical lithography and etching, a 2 μm diameter hole was created down to the HZO film. The TiN top electrode was deposited via ALD, so that the Fe-diode cell size is defined by the contact area between TiN and HZO.

3D integration of the Fe-diode: To prepare a 16-layer 3D vertical memory with Fe-diode cells, multiple layers of TiN (20 nm) and SiO₂ (30 nm) were deposited using PVD and PECVD, respectively. Patterning and a single-step etch were employed to create stacked wordlines (WLs) with smooth sidewalls. After the trench was filled with SiO₂, a 500 nm hole was etched down to the bottom SiO₂. An Hf_0.5Zr_0.5O₂ bilayer was deposited on the sidewalls via ALD, followed by sputtering of TiN/W to fill the hole as the pillar electrode (BL). Each horizontal WL was sequentially opened through selective etching, and the memory cell area was defined by the TiN bottom electrode thickness (20 nm) and the hole perimeter.
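The cell-area definition above can be turned into a rough number. The sketch assumes a circular hole and treats the quoted 500 nm as its diameter (the text does not say whether it is a diameter), and it ignores the thickness of the deposited ferroelectric bilayer. The function names are illustrative.

```python
import math


def sidewall_cell_area_nm2(electrode_thickness_nm, hole_diameter_nm):
    """Cell area = electrode thickness x hole perimeter, for a circular hole."""
    return electrode_thickness_nm * math.pi * hole_diameter_nm


def nm2_to_um2(area_nm2):
    """Convert an area from square nanometres to square micrometres."""
    return area_nm2 * 1e-6


area = sidewall_cell_area_nm2(20.0, 500.0)
print(area, nm2_to_um2(area))
```

Under these assumptions the cell area is about 3.14 × 10⁴ nm², or roughly 0.031 μm².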

Characterizations

The DC and current noise of a self-selective cell were tested by an Agilent B1500A semiconductor parameter analyzer connected to the experimental device. The pulse measurements were performed using the HV-SPGU module of Agilent B1500A. During the electrical measurement, the W top electrode was biased, while the TiN bottom electrode was grounded.

Data availability

The data that support the findings of this study are provided as a Source Data file. Source data are provided with this paper.

References

Song et al. nDNA-prot: identification of DNA-binding proteins based on unbalanced classification. BMC Bioinform. 15, 1–10 (2014).
Gupta, S. et al. Machine-learning prediction of cancer survival: a retrospective study using electronic administrative records and a cancer registry. BMJ Open 4, e004007 (2014).
LeCun, Y., Bengio, Y. & Hinton, G. Deep learning. Nature 521, 436–444 (2015).
Vinuesa, R. et al. The role of artificial intelligence in achieving the Sustainable Development Goals. Nat. Commun. 11, 1–10 (2020).
Ghahramani, Z. Probabilistic machine learning and artificial intelligence. Nature 521, 452–459 (2015).
Abdar, M. et al. A review of uncertainty quantification in deep learning: techniques, applications and challenges. Inf. Fusion 76, 243–297 (2021).
Lin, Y. et al. Bayesian neural network realization by exploiting inherent stochastic characteristics of analog RRAM. In Proceedings 2019 IEEE International Electron Devices Meeting (IEDM) 14.6.1–14.6.4 (2019).
Dalgaty, T. et al. Ex situ transfer of bayesian neural networks to resistive memory-based inference hardware. Adv. Intell. Syst. 3, 2000103 (2021).
Li, Y. & Gal, Y. Dropout inference in Bayesian neural networks with alpha-divergences. In Proceedings International Conference on Machine Learning (2017).
Mae, Y., Kumagai, W. & Kanamori, T. Uncertainty propagation for dropout-based Bayesian neural networks. Neural Netw. 144, 394–406 (2021).
Dorrance, R. et al. An energy-efficient Bayesian neural network accelerator with CiM and a time-interleaved Hadamard digital GRNG using 22-nm finFET. IEEE J. Solid State Circuits 58, 2826–2838 (2023).
Rao, M. et al. Thousands of conductance levels in memristors integrated on CMOS. Nature 615, 823–829 (2023).
Yi, W. et al. Quantized conductance coincides with state instability and excess noise in tantalum oxide memristors. Nat. Commun. 7, 11142 (2016).
Ambrogio, S. et al. Impact of low-frequency noise on read distributions of resistive switching memory (RRAM). In Proceedings 2014 IEEE International Electron Devices Meeting (IEDM), 14.4.1–14.4.4 (2014).
Veksler, D. et al. Methodology for the statistical evaluation of the effect of random telegraph noise (RTN) on RRAM characteristics. In Proceedings 2012 International Electron Devices Meeting, 9.6.1–9.6.4 (2012).
Yu, S. et al. Understanding the conduction and switching mechanism of metal oxide RRAM through low frequency noise and AC conductance measurement and analysis. In Proceedings 2011 International Electron Devices Meeting. 12.1.1–12.1.4 (2011).
Schottky, W. Über spontane Stromschwankungen in verschiedenen Elektrizitätsleitern. Ann. der Phys. 57, 541–567 (1918).
Gong, T. et al. First Demonstration of a Bayesian machine based on unified memory and random source achieved by 16-layer stacking 3D Fe-diode with high noise density and high area efficiency. In Proceedings 2023 International Electron Devices Meeting (IEDM), 1–4 (2023).
Blom, P. W. M., Wolf, R. M., Cillessen, J. F. M. & Krijn, M. P. C. M. Ferroelectric Schottky diode. Phys. Rev. Lett. 73, 2107–2110 (1994).
Choi, T., Lee, S., Choi, Y., Kiryukhin, V. & Cheong, S.-W. Switchable ferroelectric diode and photovoltaic effect in BiFeO₃. Science 324, 63–66 (2009).
Jiang, A. Q. et al. A resistive memory in semiconducting BiFeO₃ thin-film capacitors. Adv. Mater. 23, 1277–1281 (2011).
Maksymovych, P. et al. Polarization control of electron tunneling into ferroelectric surfaces. Science 324, 1421–1425 (2009).
Pintilie, L., Stancu, V., Trupina, L. & Pintilie, I. Ferroelectric Schottky diode behavior from a SrRuO₃-Pb(Zr_0.2Ti_0.8)O₃-Ta structure. Phys. Rev. B 82, 085319 (2010).
Pintilie, L., Vrejoiu, I., Hesse, D., LeRhun, G. & Alexe, M. Ferroelectric polarization-leakage current relation in high quality epitaxial Pb(Zr, Ti)O₃ films. Phys. Rev. B 75, 104103 (2007).
Hong, S. et al. Large resistive switching in ferroelectric BiFeO₃ nano-island based switchable diodes. Adv. Mater. 25, 2339–2343 (2013).
Müller, J. et al. Ferroelectric hafnium oxide: a CMOS-compatible and highly scalable approach to future ferroelectric memories. In Proceedings 2013 IEEE International Electron Devices Meeting (IEDM) 10.18.1–10.18.14 (2013).
Polakowski, P. et al. Ferroelectric deep trench capacitors based on Al: HfO₂ for 3D nonvolatile memory applications. In Proceedings 2014 IEEE 6th International Memory Workshop (IMW) 1–4 (2014).
Luo, Q. et al. Hybrid 1T e-DRAM and e-NVM realized in One 10 nm node Ferro FinFET device with charge trapping and domain switching effects. In Proceedings 2018 IEEE International Electron Devices Meeting (IEDM) 2.6.1–2.6.4 (2018).
Luo, Q. et al. A highly CMOS compatible hafnia-based ferroelectric diode. Nat. Commun. 11, 1391 (2020).
Wang, H. et al. Direct observation of room-temperature out-of-plane ferroelectricity and tunneling electroresistance at the two-dimensional limit. Nat. Commun. 9, 3319 (2018).
Mo, F., Tagawa, Y., Saraya, T., Hiramoto, T. & Kobayashi, M. Scalability study on ferroelectric-HfO₂ tunnel junction memory based on non-equilibrium Green function method with self-consistent potential. In Proceedings 2018 IEEE International Electron Devices Meeting (IEDM) 16.13.11–16.13.14 (2018).
Max, B., Hoffmann, M., Slesazeck, S. & Mikolajick, T. Ferroelectric tunnel junctions based on ferroelectric-dielectric Hf_0.5Zr_0.5O₂/Al₂O₃ capacitor stacks. In Proceedings 2018 48th European Solid-State Device Research Conference (ESSDERC) 142–145 (2018).
Max, B., Hoffmann, M., Slesazeck, S. & Mikolajick, T. Direct correlation of ferroelectric properties and memory characteristics in ferroelectric tunnel junctions. IEEE J. Electron Devices Soc. 7, 1175–1181 (2019).
Böscke, T., Müller, J., Bräuhaus, D., Schröder, U. & Böttger, U. Ferroelectricity in hafnium oxide thin films. Appl. Phys. Lett. 99, 102903 (2011).
Müller, J. et al. Ferroelectricity in simple binary ZrO₂ and HfO₂. Nano Lett. 12, 4318–4323 (2012).
Khan, A. I., Keshavarzi, A. & Datta, S. The future of ferroelectric field-effect transistor technology. Nat. Electron. 3, 588 (2020).
Starschich, S., Schenk, T., Schroeder, U. & Boettger, U. Ferroelectric and piezoelectric properties of Hf_1−xZr_xO₂ and pure ZrO₂ films. Appl. Phys. Lett. 110, 182905 (2017).
Müller, J., Polakowski, P., Mueller, S. & Mikolajick, T. Ferroelectric hafnium oxide based materials and devices: Assessment of current status and future prospects. ECS J. Solid State Sci. Technol. 4, N30–N35 (2015).
Liu, C. et al. Role of oxygen vacancies in electric field cycling behaviors of ferroelectric hafnium oxide. In Proceedings 2018 IEEE International Electron Devices Meeting (IEDM) 16.4.1–16.4.4 (2018).
Koo, R. H. et al. Effect of carrier transport process on tunneling electroresistance in ferroelectric tunnel junction. IEEE Electron Device Lett. 44, 164–167 (2022).
Landauer, R. Solid-state shot noise. Phys. Rev. B 47, 16427–16432 (1993).
Lee, J. et al. Noise model of gate-leakage current in ultrathin oxide MOSFETs. IEEE Trans. Electron Devices 50, 2499–2506 (2003).
Houng, M. P., Yeong, H. W. & Chang, W. J. Current transport mechanism in trapped oxides: A generalized trap-assisted tunneling model. J. Appl. Phys. 86, 1488–1491 (1999).
Shin, W. et al. Investigation of low-frequency noise characteristics of ferroelectric tunnel junction: From conduction mechanism and scaling perspectives. IEEE Electron Device Lett. 43, 958–961 (2022).
Yao, P. et al. Fully hardware-implemented memristor convolutional neural network. Nature 577, 641–646 (2020).
Dalgaty, T. et al. In situ learning using intrinsic memristor variability via Markov chain Monte Carlo sampling. Nat. Electron. 4, 151–161 (2021).

Acknowledgements

This work was supported in part by the National Natural Science Foundation of China under Grants No. 62488101 and No. 62425407, and in part by the Youth Innovation Promotion Association of the Chinese Academy of Sciences (No. 2023127). Y. Huang and Q. Wu contributed equally to this work.

Author information

These authors contributed equally: Yuanquan Huang, Qiqiao Wu.

Authors and Affiliations

State Key Laboratory of Fabrication Technologies for Integrated Circuits, Institute of Microelectronics, Chinese Academy of Sciences, Beijing, China
Yuanquan Huang, Tiancheng Gong, Jianguo Yang, Qing Luo & Ming Liu
Laboratory of Microelectronics Devices and Integrated Technology, Institute of Microelectronics, Chinese Academy of Sciences, Beijing, China
Yuanquan Huang, Tiancheng Gong, Jianguo Yang, Qing Luo & Ming Liu
University of Chinese Academy of Sciences, Beijing, China
Yuanquan Huang
Frontier Institute of Chip and System, Fudan University, Shanghai, China
Qiqiao Wu & Ming Liu
Zhangjiang Laboratory, Shanghai, China
Qiqiao Wu & Jianguo Yang

Authors

Yuanquan Huang
Qiqiao Wu
Tiancheng Gong
Jianguo Yang
Qing Luo
Ming Liu

Contributions

Y. Huang and Q. Wu contributed equally to this work. Q. Luo, J. Yang and T. Gong designed the project and provided guidance throughout the research. Y. Huang fabricated the samples and conducted the electrical experiments. T. Gong and Y. Huang proposed and analyzed the shot noise and multilevel storage. Q. Wu highlighted the application in Bayesian neural networks. Y. Huang and Q. Wu wrote and revised the manuscript under the guidance of T. Gong, J. Yang, Q. Luo and M. Liu. All authors contributed to discussions on the results.

Corresponding authors

Correspondence to Tiancheng Gong, Jianguo Yang or Qing Luo.

Ethics declarations

Competing interests

The authors declare no competing interests.

Peer review

Peer review information

Nature Communications thanks Hyungjin Kim and the other anonymous reviewer(s) for their contribution to the peer review of this work. A peer review file is available.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.

About this article

Cite this article

Huang, Y., Wu, Q., Gong, T. et al. Bayesian neural network with unified entropy source and synapse weights using 3D 16-layer Fe-diode array. Nat Commun 16, 8063 (2025). https://doi.org/10.1038/s41467-025-63302-8

Received: 12 November 2024
Accepted: 11 August 2025
Published: 28 August 2025
Version of record: 28 August 2025
DOI: https://doi.org/10.1038/s41467-025-63302-8
