Introduction

The development of Internet of Things (IoT) has inspired a substantial increase in the demand for computing and storage at the edge1,2. However, the localized data processing in edge devices, in turn, induces high vulnerability to hardware security threats such as hacking, information leaking and identity disguising1,2,3. To thwart these threats, creditable and lightweight security solutions are urgently desired, where physical unclonable functions (PUFs) have emerged as a promising hardware security primitive3,4,5,6,7. PUFs map the input challenges into unique and unpredictable responses by exploiting the manufacturing variations or the intrinsic stochasticity in devices5,6,7,8,9,10. An input challenge and its corresponding response are known as a challenge-response pair (CRP), which can serve as the secret key in applications such as device authentication3 and key generation9.

Figure 1a shows the workflow of IoT device authentication using PUFs. The PUF chips are initially fabricated and integrated into IoT devices. Then, the manufacturer registers and stores the unique CRPs of the PUFs in trusted cloud servers as the fingerprints of these devices. After being delivered by the supply chain where malicious attacks might be launched, the IoT devices become untrusted10,11,12. Therefore, to authenticate the identity, the local PUF chip produces responses according to the specific challenges offered by the cloud server, which are subsequently compared with the pre-stored ones11,12,13. Besides, in cases such as owner change or device loss, the CRPs of the same PUF should be updated for security (i.e., PUF reconfiguration)14,15,16.

Fig. 1: FeFET-based strong PUF overview.
figure 1

a Conceptual workflow for PUF-based IoT device authentication. After manufacturing, each PUF chip registers unique CRPs, which are pre-stored on trusted cloud servers. The stored CRPs are compared with the ones generated at the edge for authentication. Besides, reconfigurability is essential for updating CRPs to enhance security. b To meet the low-power requirements of IoT devices, nonvolatile FeFETs are integrated with capacitors to reduce static power with energy-efficient operations. c Unlike conventional designs, where the D2D variation may override the C2C variation and degrade reconfigurability, the FeFET C2C variation is exploited throughout both registration and reconfiguration to ensure high reconfigurability. d The proposed 2FeFET-1C PUF cell achieves low-power and accurate charge-domain computations to provide a massive CRP space for numerous IoT devices.

To ensure the security of the above workflow, PUF CRPs should satisfy several basic properties, including uniformity, uniqueness, and repeatability3,4,5,6,7,8,9,10,11,12,13,14,15,16,17. Uniformity measures the CRP randomness, and uniqueness requires different PUFs to register unique CRPs as device fingerprints. Meanwhile, the CRPs of one PUF should also be repeatable to keep the device identity unchanged. In addition to these basic properties, the practical application of PUFs in IoT devices imposes additional requirements on enhanced features, as shown in Fig. 1b–d13,14,15,16,17. Desirable features include low power consumption to meet the resource constraints at the edge, high reconfigurability to improve security, and large CRP space to identify numerous IoT devices.

Based on the CRP space, PUFs can be categorized into weak PUFs and strong PUFs. Weak PUFs have a limited number of CRPs that are linear or polynomial to the number of PUF cells, while strong PUFs can support an exponentially large CRP space3,4,5,6. Therefore, in addition to realizing weak PUF applications, strong PUFs can take advantage of the large CRP space for deployment in advanced cryptographic protocols, such as device authentication12 and multi-party computation18. Another main difference between weak and strong PUFs lies in the threat models. Due to the limited CRP space, challenges and responses of weak PUFs are generally encrypted to prevent exposure. However, the interface of strong PUFs is typically assumed to be publicly accessible3,12, which makes them highly vulnerable to machine learning (ML) attacks19,20,21,22.

Conventional silicon-based strong PUFs include the arbiter PUF (APUF) and its multiple variants23,24,25,26. While APUFs have demonstrated good performance and mature integration, the exploited process variations are almost determined once fabricated. Consequently, low-cost reconfiguration is challenging for APUFs. Emerging devices with intrinsic stochasticity offer opportunities for reconfigurable PUF designs27. Representative works include resistive random-access memory (RRAM)-based PUF28,29,30,31,32, spin-transfer torque magnetic random-access memory (STT-MRAM)-based PUF33,34, graphene field-effect transistor (GFET)-based PUF35, etc. The inherent cycle-to-cycle (C2C) variation of these emerging devices enables low-cost reconfiguration, which opens an era for reconfigurable PUF designs.

Despite the various impressive PUFs, achieving desirable PUF features remains a pivotal challenge. (i) Existing PUFs based on emerging devices exploit resistive nonvolatile memories to avoid static leakage, but the energy efficiency remains limited as the resistive load and low on/off ratio of these memory devices hinder low-power operations36. (ii) As shown in Fig. 1c, prior reconfigurable PUFs leverage device-to-device (D2D) variation for registration and C2C variation for reconfiguration28,29. However, D2D variation is generally more significant and may cause some PUF cells to generate unchangeable CRPs, severely compromising reconfigurability. (iii) Most strong PUF designs require energy-consuming current accumulation or additional post-processing circuits to expand the CRP space28,29,30. The overheads of current accumulation and post-processing become even more expensive for the variants of these designs with enhanced ML attack resilience.

To cope with the challenges, we explore the all-around utilization of the ferroelectric field-effect transistor (FeFET) C2C variation and introduce the charge-domain computing-in-memory (CiM) to strong PUFs. In contrast to resistive devices such as RRAMs and STT-MRAMs, the utilized nonvolatile FeFETs have a capacitive write load and a higher on/off ratio thanks to the transistor-like structure and operations shown in Fig. 1b, facilitating ultra-low power36. Besides, instead of leveraging D2D and C2C variation in CRP registration and reconfiguration, respectively37,38,39, our PUF exploits the C2C variation originated from the inherent stochasticity of FeFETs40 throughout the workflow, as shown in Fig. 1c. The entropy source of both registration and reconfiguration is completely homogeneous to eliminate the negative impact of D2D variation and achieve record-high reconfigurability. In response generation, the proposed 2FeFET-1C PUF array shown in Fig. 1d aggregates the results from all cells through capacitive coupling, achieving an exponentially large CRP space without additional circuits. The operating mechanism of charge-domain CiM eliminates the static power and enables energy-efficient and accurate computation.

In this work, we demonstrate a high-reconfigurability and low-power FeFET-based strong PUF empowered by the above methods. The FeFETs and capacitors are integrated in a 28 nm high-κ-metal-gate (HKMG) process for verification with the advantage of low footprints and good CMOS compatibility41. 9 × 7 FeFET arrays are fabricated in the same process for measurements. Experimental measurements indicate great performances of our PUF, including near-ideal uniformity (50.00%), uniqueness (inter-chip Hamming Distance, HDinter, 49.98%,) and repeatability (intra-chip Hamming Distance, HDintra, 0.00%). Thanks to the proposed device-circuit co-design, our PUF significantly outperforms the state-of-the-art PUFs with record-high reconfigurability (inter-reconfiguration Hamming Distance, HDreconfig, 50.02%) and ultra-low 1.89fJ per bit readout energy. The proposed PUF also demonstrates high robustness against parameter variations, including write voltage, device dimension and temperature, and strong resilience to ML attacks. These metrics highlight the promising prospect of our PUF in IoT security solutions.

Results

FeFET characteristics

Figure 2a illustrates the widely studied metal-ferroelectric-insulator-semiconductor (MFIS) structure of FeFETs, which can be considered as a MOSFET but with an FE layer replacing the conventional gate insulator. The discovery of doped HfO2-based ferroelectric materials in 2011 triggered opportunities to enable FeFETs with low operating voltage, good CMOS compatibility, and high scalability42. Recent reports have shown the FeFETs featuring a memory window of 1.8 V at 2 V write voltage43 and the large-scale 3D integration of FeFETs44. In this work, FeFETs and capacitors are integrated at 28 nm HKMG technology node, as shown in Fig. 2b. The fabricated test chip and the 2FeFET-1C cell circuit are shown in Fig. 2c, d, respectively, with the experimental setup illustrated below. The following results in Fig. 2e–g are all extracted from the measurements on fabricated FeFET arrays.

Fig. 2: FeFET device basics.
figure 2

a FeFET structure and TEM image. b Layout of 2FeFET-1C PUF cells, each consisting of two FeFETs and one capacitor. c Micro-graph of the fabricated chip. d The proposed 2FeFET−1C PUF cell. The challenging bit is input via BL1 and BL2, generating a local result at node X, which is coupled to SL via the capacitor. e \({I}_{{{{\rm{DS}}}}}\)-\({V}_{{{{\rm{GS}}}}}\) curves of one FeFET across 100 write cycles, showing response to -5 V/2.8 V/3.6 V/4.5 V writes pulses for writing the FeFET to various \({V}_{{{{\rm{TH}}}}}\) states. The nonvolatility and ultra-high on/off ratio enable FeFETs with low-power operations. Consistent C2C variation is observed across all write pulses. f Comparison of C2C and D2D variation in FeFETs. The C2C variation extracted from 8 FeFETs over 100 cycles shows significantly less variance than the D2D variation across 63 FeFETs. Therefore, using D2D variation for registration would degrade reconfigurability. g The FeFET C2C variation is evaluated under different write pulses and device dimensions. The measured C2C variation remains stable and significant, which enhances the robustness of the proposed FeFET-based PUF.

The FE layer consists of multiple ferroelectric domains, each in an independent positive or negative polarization state. The polarization states of these domains collectively affect the transfer characteristics of FeFETs, resulting in a shift of threshold voltage (\({V}_{{{{\rm{TH}}}}}\)). By applying positive or negative write pulses to the FE layer, the FeFET state can be programmed to the low-\({V}_{{{{\rm{TH}}}}}\) (LVT) or high-\({V}_{{{{\rm{TH}}}}}\) (HVT) state, respectively, without involving DC currents. As shown in Fig. 2e, the resulting FeFET \({I}_{{{{\rm{DS}}}}}\)-\({V}_{{{{\rm{GS}}}}}\) curve exhibits nonvolatile hysteretic characteristics, and a read voltage (\({V}_{{{{\rm{R}}}}}\)) within the memory window can distinguish between the two memory states. The combination of nonvolatility and capacitive write enables FeFETs to operate with ultra-low power. Besides, the ultra-high on/off ratio of FeFETs facilitates low charging latency and reliable differential voltage in charge-domain computing. Therefore, FeFETs can be leveraged as an ideal candidate for low-power PUF implementations at the edge.

In addition to the low-power advantages, recent research has shown that due to the random polarization nucleation event, the polarization switching of a single ferroelectric domain is an abrupt and stochastic process regulated by the electric field in the domain40. Meanwhile, applying voltage to the FeFET gate also induces stochastic charge trapping and de-trapping within the gate dielectric and affects \({V}_{{{{\rm{TH}}}}}\)45. Both mechanisms enable FeFETs with significant C2C variation40,46,47, which can be exploited as the PUF entropy source. As shown in Fig. 2e, the \({I}_{{{{\rm{DS}}}}}\)-\({V}_{{{{\rm{GS}}}}}\) curve of one FeFET varies across cycles when the same write pulse is applied. Compared with 4.5 V/-5.0 V write voltages, a moderate write voltage can partially switch the FE layer polarization to provide more significant C2C variation. Therefore, we select 2.8 V/3.2 V/3.6 V as the write voltage to measure C2C variation.

Besides, FeFETs also exhibit D2D variation due to external process variations48. However, compared to the C2C variation, D2D variation spans a significantly wider range, as shown in Fig. 2f. As a result, the impact of D2D variation may dominate the device characteristics. For instance, the \({V}_{{{{\rm{TH}}}}}\) of Device 1 is consistently lower than that of Device 5 across the measured cycles. In such cases, determining the PUF cell state by comparing the \({V}_{{{{\rm{TH}}}}}\) of these two devices in two cycles would lead to minimal state changes. This compromises the uniqueness of CRPs across different reconfiguration cycles. Therefore, we are inspired to exploit the FeFET C2C variation as the all-around entropy source of our PUF throughout both CRP registration and reconfiguration. Detailed discussions about the selection of the entropy source are provided in Supplementary Note 3.

Figure 2g shows the robustness of FeFET C2C variation against different write voltages and device dimensions. Each data point represents the \({V}_{{{{\rm{TH}}}}}\) distribution of one FeFET across 100 write cycles. The experimental results demonstrate high stability in FeFET C2C variation, which is desirable for robust and reliable PUF designs.

Proposed FeFET-based strong PUF

To fully exploit the unique FeFET characteristics, we propose the FeFET-based strong PUF based on 2FeFET-1C cells. Each PUF cell consists of two FeFETs (T1 and T2) connected to a capacitor. By applying \({V}_{{{{\rm{DD}}}}}\) to T1 and GND to T2, the relative \({V}_{{{{\rm{TH}}}}}\) difference between the two FeFETs can be measured at node X (\({V}_{{{{\rm{X}}}}}\)). For strong PUF implementation, as illustrated in Fig. 3a, b, the upper plates of the capacitors in all PUF cells are connected via a shared sense line (SL). The SL is further connected to a sense amplifier (SA) with two capacitors (Cc1 and Cc2) via switches. Voltages applied to the FeFETs are output from the decoders and buffers. The results from the PUF cells are accumulated through the charge-domain CiM, and the SA outputs the response back to the decoders.

Fig. 3: Proposed FeFET-based strong PUF.
figure 3

a The cell-by-cell registration starts with resetting and programming one FeFET twice. The \({V}_{{{{\rm{TH}}}}}\) of two cycles is compared to rewrite the cell state (\(S\)) into the FeFETs. b The input challenge (\(C\)) and \(S\) are XORed at node X. Through capacitive coupling, \({V}_{{{{\rm{SL}}}}}\) averages the \({V}_{{{{\rm{X}}}}}\) of all PUF cells and is compared with \({V}_{{{{\rm{ref}}}}}\) to generate the response.

Figure 3a illustrates the cell-by-cell registration workflow of our PUF, which exploits the FeFET C2C variation as the entropy source all-around for improved reconfigurability. All FeFETs are initially reset to HVT. For the selected PUF cell, one FeFET (e.g., T1) is first weakly programmed to \({V}_{{{{\rm{TH}}}}}\) using a positive write pulse. Then \({V}_{{{{\rm{DD}}}}}\) and GND are applied to T1 and T2, respectively, and \({V}_{{{{\rm{R}}}}}\) is applied on both FeFETs to measure the \({V}_{{{{\rm{TH}}}}}\) using the divided voltage at X1 (\({V}_{{{{{\rm{X}}}}}_{1}}\)). With node X of unselected PUF cells floating, \({V}_{{{{{\rm{X}}}}}_{1}}\) is subsequently coupled to \({V}_{{{{\rm{SL}}}}}\) and recorded on Cc1. In the second cycle, T1 is reset and written to produce a new \(V^{{\prime}}_{{{{\rm{TH}}}}}\). Similarly, \(V^{{\prime}}_{{{{\rm{TH}}}}}\) is measured as \(V^{{\prime}}_{{{{{\rm{X}}}}}_{1}}\), and the SA compares \({V}_{{{{{\rm{X}}}}}_{1}}\) and \(V^{{\prime}}_{{{{{\rm{X}}}}}_{1}}\) to output the state of the PUF cell (\(S\)). This state is generated from the inherent entropy source of FeFETs and is thereby random and unpredictable, which can be used for CRP registration. The reset process is aimed at initializing the FeFET to the same state before programming to ensure that the obtained C2C variation is not affected by the original states of FeFETs. Besides, resetting FeFETs to HVT also blocks the current flow during read to save energy. Figure 4a shows the measured distribution of \({V}_{{{{\rm{X}}}}}\) and \(V^{{\prime}}_{{{{\rm{X}}}}}\) difference (\(\Delta {V}_{{{{\rm{X}}}}}\)) for 27 PUF cells, which reflects the FeFET C2C variation. The Gaussian-like distribution with the mean value close to zero also indicates uniform randomness of the entropy source.

Fig. 4: Evaluation of the proposed workflow.
figure 4

a The Gaussian-like distribution of \({V}_{{{{\rm{X}}}}}\) difference in two cycles (\(\Delta {V}_{{{{\rm{X}}}}}\)) ensures uniform randomness of the PUF cell states. b \(\Delta {V}_{{{{\rm{X}}}}}\) map and cell states before and after rewrite. c After rewrite, the distinct \({V}_{{{{\rm{X}}}}}\) separation between PUF cells with different outputs enhances repeatability. d \({V}_{{{{\rm{R}}}}}\) is set to 1.2 V to maximize \(\Delta {V}_{{{{\rm{X}}}}}\) for C2C variation extraction. e \({V}_{{{{\rm{X}}}}}\) is the XNOR result of the input challenge (\(C\)) and the cell state (\(S\)), where the on/off ratio is maximized at 1.2 V read voltage. f \({V}_{{{{\rm{SL}}}}}\) shows high linearity and can be sensed with an appropriate \({V}_{{{{\rm{ref}}}}}\). \({V}_{{{{\rm{DD}}}}}\) is set to 0.5 V to save energy and provide a sufficient sense margin. Source data are provided as a Source Data file.

The PUF cell state is subsequently rewritten into the two FeFETs in a complementary form to ensure reliable response generation. During the rewrite operation, the SA first outputs the comparison result (\(S\)) to the decoder. Based on \(S\), the decoder determines the appropriate write voltages, which are applied to the FeFETs via the buffers. If \(S\) is ‘0’ (\({V}_{{{{{\rm{X}}}}}_{1}}\) < \(V^{{\prime}}_{{{{{\rm{X}}}}}_{1}}\)), T1 and T2 are rewritten to HVT and LVT, respectively, otherwise T1 is written to LVT and T2 to HVT. Through the rewrite, the PUF cell states (\(S\)) are digitalized to replace the original analog \(\Delta {V}_{{{{\rm{X}}}}}\) values (illustrated in Fig. 4b). This in-situ storage scheme, which uses a pair of complementary FeFETs, significantly enhances the repeatability and the \({V}_{{{{\rm{X}}}}}\) reliability against FeFET variations, as shown in Fig. 4c. The above steps constitute the complete CRP registration workflow for one PUF cell and can be iteratively applied to all cells, where \({V}_{{{{\rm{DD}}}}}\) is set to 0.5 V to save energy and \({V}_{{{{\rm{R}}}}}\) is set to 1.2 V to maximize the measured \(\Delta {V}_{{{{\rm{X}}}}}\) for accurate C2C variation extraction, as shown in Fig. 4d. The \(\Delta {V}_{{{{\rm{X}}}}}\)-\({V}_{{{{\rm{R}}}}}\) curves under other write voltages are provided in Supplementary Note 4.

To generate a response bit, a challenge vector is input to the PUF cells complementarily via the decoders, and the FeFETs are read out, as shown in Fig. 3b. If the input challenge bit (\({C}_{i}\)) matches the PUF state (\({S}_{i}\)), the LVT FeFET is on the same side as the \({V}_{{{{\rm{DD}}}}}\) input, charging \({V}_{{{{{\rm{X}}}}}_{i}}\) to \({V}_{{{{\rm{DD}}}}}\). Conversely, if \({C}_{i}\) and \({S}_{i}\) are complementary, \({V}_{{{{{\rm{X}}}}}_{i}}\) is grounded by an on-state FeFET connected to GND. Therefore, as shown in Fig. 4e, the PUF cell outputs the XNOR results (\({V}_{{{{{\rm{X}}}}}_{i}}\)) between \({C}_{i}\) and \({S}_{i}\), with the on/off ratio maximized at 1.2 V \({V}_{{{{\rm{R}}}}}\). The \({V}_{{{{{\rm{X}}}}}_{i}}\) is subsequently averaged on SL through charge redistribution among identical capacitors. As shown in Fig. 4f, the resulting \({V}_{{{{\rm{SL}}}}}\) lies highly linear with the number of PUF cells outputting \({V}_{{{{\rm{DD}}}}}\) and holds sufficient sense margin at 0.5 V \({V}_{{{{\rm{DD}}}}}\). Finally, the SA compares \({V}_{{{{\rm{SL}}}}}\) with a preset reference voltage (\({V}_{{{{\rm{ref}}}}}\)) to produce the response bit. By setting \({V}_{{{{\rm{ref}}}}}\) to 0.5\({V}_{{{{\rm{DD}}}}}\), the \({V}_{{{{\rm{SL}}}}}\) levels are uniformly translated into binary response bits.

Through the proposed charge-domain in-memory computing, all PUF cells contribute to the final response output, enabling a CRP space of 2n for an \(n\)-bit challenge vector, which fulfills the requirements of a strong PUF. Besides, the charge-domain computing offers significant advantages in energy efficiency and computation accuracy compared to conventional current-domain CiM-based PUFs: (i) Current-domain CiMs accumulate the current outputs from the PUF cells. However, the accumulated current (typically ~μA) is energy-consuming and prone to device variations. In contrast, the charge-domain CiM performs computation through dynamic capacitive coupling. The HVT FeFET effectively blocks the current flow within each PUF cell thanks to the FeFET ultra-high on/off ratio. Therefore, the proposed response generation scheme almost eliminates the static currents and achieves ultra-low readout energy. (ii) The capacitor variations are typically smaller and more controllable than the variations of CMOS transistors and FeFETs. The charge redistribution further narrows the variations of the output voltage (\({V}_{{{{\rm{SL}}}}}\)), as demonstrated in prior works49,50. Moreover, the differential storage also minimizes the impact of FeFET variations. These benefits of the charge-domain CiM collectively enable our PUF with high computation accuracy, improving both performance and reliability. Detailed analysis and comparison with existing current-domain CiM-based PUFs are provided in Supplementary Note 5.

The CRP reconfiguration workflow mirrors the registration process and can be performed by repeating the same steps. Due to the FeFET C2C variation, the measured \({V}_{{{{\rm{TH}}}}}\) and \(V^{{\prime}}_{{{{\rm{TH}}}}}\) in the new reconfiguration cycle differ from those in the previous cycle, resulting in the updated PUF cell states. By exploiting this homogeneous entropy source throughout the workflow, our PUF avoids the negative impact of D2D variation and significantly improves the reconfigurability performance.

Experimental and simulation setup

To demonstrate the CMOS compatibility and integration capability of the proposed design, we fabricate the 2FeFET-1C PUF cells at a 28 nm HKMG technology node, as shown in Fig. 2b, c. The experimental measurements are performed on 9×7 AND-type FeFET arrays fabricated with the same process, which imitates the 2FeFET-1C cell-level operations and provides benefits on data extraction. In the test structure, the drain and source of two adjacent FeFETs are connected by an additional jumper, with the probe serving as the capacitive load to emulate the 2FeFET-1C cell (see Supplementary Note 1 for details). With the write and read conditions of the FeFETs set identical to those of the 2FeFET-1C cells (shown in Supplementary Note 2), the test structure operates equivalently to the proposed design and allows accurate measurement of \({V}_{{{{\rm{X}}}}}\) in each cell. Besides, the AND-type FeFET array supports large array sizes up to 64 kb and scaled FeFET dimensions as small as 80 nm × 30 nm41, providing a more representative evaluation of large-scale PUF performance under realistic manufacturing conditions. Using the AND-type FeFET array also helps understand the behavior of the internal node X, which would be challenging to observe in the 2FeFET-1C chip.

The basic simulation setup employs a commercial 28 nm CMOS process and mimics the FeFETs with the measured \({V}_{{{{\rm{X}}}}}\) values instead of SPICE models. Since \({V}_{{{{\rm{X}}}}}\) is derived directly from FeFET experimental measurements, it comprehensively captures all necessary characteristics of FeFETs. Moreover, \({V}_{{{{\rm{X}}}}}\) is a stable voltage division between the two FeFETs, driven strongly by the DC voltage and minimally affected by the feedback from capacitors. Consequently, \({V}_{{{{\rm{X}}}}}\) remains highly stable during both the read process and the charge-domain computing. The subsequent charge-domain computing only involves this stabilized \({V}_{{{{\rm{X}}}}}\), along with capacitors and SAs. Therefore, by incorporating the experimentally measured \({V}_{{{{\rm{X}}}}}\) in the simulation of charge-domain CiM, the setup convincingly reflects the impact of underlying physics and non-idealities in FeFETs on the PUF array. Additional factors, including capacitor non-idealities, SA offset, thermal noise, and parasitic effects, are also included in the simulation. Based on this basic setup, the basic and robustness metrics of our PUF are evaluated. The performance metrics such as energy and latency, which are challenging to extract from the test structure, are evaluated using a widely adopted compact FeFET SPICE model51. The utilized model describes the FeFET as a modulated ferroelectric capacitor connecting to the MOSFET gate, which mimics the structure of the fabricated FeFET devices. After calibration based on experimental measurements, this model shows the FeFET behavior consistent with the measured results. This modeling method is able to capture the impact of both the ferroelectric capacitor and the internal MOSFET gate capacitor for energy and latency evaluations51.

For registration and reconfiguration, the fabricated FeFETs are initially reset to HVT by applying a −5 V erase voltage to the gate terminals for 500 ns and subsequently written using a 500 ns pulse with a moderate voltage (e.g., 2.8 V). Then, the \({V}_{{{{\rm{X}}}}}\) of all PUF cells is measured in parallel by inputting 0.5 V \({V}_{{{{\rm{DD}}}}}\) and GND to the two FeFETs in each cell. This reset, write, and measurement process is repeated for two cycles, and the obtained \({V}_{{{{\rm{X}}}}}\) is random due to the C2C variation of the FeFET devices. The \({V}_{{{{\rm{X}}}}}\) values are subsequently used to simulate the comparison in the proposed workflow. Based on the comparison result, the fabricated FeFETs are either erased to HVT with a −5 V 500 ns write pulse or programmed to LVT using a 4.5 V 500 ns write pulse to store the PUF cell states. As a result, the PUF cells are registered based on the experimentally measured C2C variation.

For response generation, the challenge bits are first input to the PUF cells through corresponding voltages, and the resulting \({V}_{{{{\rm{X}}}}}\) is experimentally measured. Then, these measured voltages are used in the simulation to drive the capacitors connected to the SL to simulate the capacitive coupling in the proposed charge-domain CiM. The response bit is generated by comparing \({V}_{{{{\rm{SL}}}}}\) with \({V}_{{{{\rm{ref}}}}}\) through the SA. Note that the impact of FeFET non-idealities is included in the measured \({V}_{{{{\rm{X}}}}}\). As shown in Fig. 5a, to generate an \(m\)-bit response, \(m\) challenge vectors are input sequentially, and the outputs are combined into a bitstream. The response bitstream length (i.e., \(m\)) is determined by the number of challenge vectors instead of the challenge vector length (i.e., PUF cell number, \(n\)).

Fig. 5: Basic metrics, reconfigurability and performance evaluation.
figure 5

a A response vector is generated by inputting multiple challenge vectors in our experimental measurements. b Entropy and HW show high uniformity. c HDinter and HDintra (i.e., BER) results highlight the high uniqueness and repeatability of our PUF. d The measured HDreconfig indicates near-ideal uniqueness across reconfiguration cycles. e The correlation matrix of the responses over 10 reconfiguration cycles further demonstrates good reconfigurability. The diagonal entries are plotted as 0 for visual clarity. f Layout of the proposed 2FeFET-1C cell with the MOM capacitor integrated on top of FeFETs via the BEOL process. The cell area is 1 μm2 using 200 nm × 100 nm FeFETs and a 0.5fF capacitor at 28 nm node. g The readout energy peaks when half of the PUF cells output \({V}_{{{{\rm{DD}}}}}\) due to the maximum charge redistribution. Increasing challenge vector length, \({V}_{{{{\rm{DD}}}}}\) or \({V}_{{{{\rm{R}}}}}\) all leads to higher energy consumption. Source data are provided as a Source Data file.

Basic metrics, reconfigurability, and performance

Figure 5b, c shows the basic metrics of our PUF, including uniformity, uniqueness, and repeatability (calculation methods provided in Supplementary Note 6). 100-bit responses are produced by sequentially inputting 100 random 27-bit challenge vectors. As shown in Fig. 5b, across 10 experimentally measured reconfiguration cycles, the response entropy is consistently close to the ideal value of 1, with an overall Hamming Weight (HW) of 50.00%. This demonstrates near-ideal uniform randomness (50%) of our PUF, which is further validated by successfully passing all test items in the NIST randomness test suite (shown in Supplementary Note 7).

Figure 5c shows the HDinter and HDintra results to evaluate the uniqueness and repeatability of our PUF. 10 PUF instances and 1000 repetitive evaluations are investigated. The measured HDinter is tightly distributed around the ideal value (50%), with a mean value of 49.98% and a standard deviation of 6.96%, highlighting the high randomness of FeFET C2C variation as the entropy source. Thanks to the FeFET ultra-high on/off ratio after rewrite and the charge-domain computation mode, the HDintra (or bit error rate, BER) is 0, indicating 100% repeatability.

Figure 5d, e presents the experimental reconfigurability analysis of our PUF (calculation method shown in Supplementary Note 8). Across 10 reconfiguration cycles, identical challenges are input to generate 10 responses (100-bit vectors each). Attributed to the homogeneous entropy source in registration and reconfiguration enabled by FeFET C2C variation, the measured HDreconfig distribution is similar to the HDinter, as shown in Fig. 5d. The mean value is 50.02% (ideally 50%) with a standard deviation of 6.77%, indicating near-ideal uniqueness across reconfiguration cycles. Figure 5e shows the correlation coefficients of the responses, with diagonal entries plotted as 0 for clarity. Most coefficients are close to 0, and the maximum observed coefficient is below 0.25. This further demonstrates the high reconfigurability of our PUF.

Figure 5f shows the layout of the proposed 2FeFET-1C cell. Thanks to the scaled dimensions of FeFETs, the cell achieves a low footprint of 1 μm2 with 200 nm × 100 nm FeFETs and a 0.5fF capacitor at 28 nm node. The MOM capacitor is implemented with multiple metal layers and integrated on top of FeFETs via the back-end-of-line (BEOL) process to save area. The layout also takes the routing into account. Overall, our proposed PUF shows comparable density with previous PUF designs30,35.

Figure 5g shows the energy consumption of response generation in our PUF. Due to the capacitive coupling in the charge-domain CiM, the number of PUF cells with \({V}_{{{{\rm{X}}}}}\) = \({V}_{{{{\rm{DD}}}}}\) varies with different challenge inputs, which affects the energy consumed by charge redistribution. When half of the PUF cells output \({V}_{{{{\rm{DD}}}}}\), \({V}_{{{{\rm{SL}}}}}\) approaches 0.5\({V}_{{{{\rm{DD}}}}}\) and causes extensive charge redistribution, resulting in the maximum readout energy. As the capacitance of the capacitors in the PUF cells increases, more charging energy is consumed in the response generation. Besides, increasing the number of PUF cells expands the CRP space and enhances the security performance. Results show that even with 4fF capacitors and 257-bit challenge vectors, the readout energy of the PUF array remains below 40fJ per bit. Considering the massive CRP space of 2257, this metric is highly energy-efficient.

The impact of voltages on the readout energy is also evaluated. Higher \({V}_{{{{\rm{DD}}}}}\) and \({V}_{{{{\rm{R}}}}}\) raise the energy consumption of both the capacitors and the parasitic capacitance. In this work, as mentioned in the previous section, we set \({V}_{{{{\rm{DD}}}}}\) and \({V}_{{{{\rm{R}}}}}\) to 0.5 V and 1.2 V, respectively, to minimize energy while ensuring reliability. For 1fF capacitors (which enable accurate computation as shown below), the readout energy of our PUF with 27 cells is up to 1.89fJ per bit, significantly outperforming the state-of-the-art strong PUFs28,29,30,31,32,33,34,35,39. The ultra-low readout energy is attributed to the low-power characteristics of FeFETs and the proposed charge-domain CiM. A detailed breakdown of the readout energy is provided in Supplementary Note 9.

The latency of response generation is primarily determined by the charging time, which is less than 1 ns for 1fF capacitors thanks to the high on/off ratio of FeFETs. The overall response generation latency is ~2 ns and is comparable to the prior work29. The throughput can be further improved by generating multiple response bits in parallel. For registration and reconfiguration, the write operations dominate the latency, which consists of a 500 ns write duration and a 2 s delay required after each programming for charge de-trapping. Note that the charge de-trapping delay is not required when resetting the FeFETs to HVT, as negative voltages do not induce charge trapping52. Besides, the de-trapping time can be significantly reduced through suitable de-trapping electrical pulses52, interfacial layer engineering53, and pipelining with sequential FeFET writes.

Despite the time-consuming write operations, registration and reconfiguration are typically less frequent in real-world PUF applications3. These operations are usually required in response to insufficient CRP space or malicious attacks. However, the proposed FeFET-based strong PUF features an exponential CRP space, which significantly reduces the case of running out CRPs, while the malicious repeated reconfiguration is beyond the scope of this work. Moreover, the registration and reconfiguration workflow can be parallelized with response generation by using a redundant PUF instance to perform ping-pong mode operations. For example, one PUF instance can be registered while the other generates responses simultaneously, with the roles exchanged in the next cycle. This ping-pong mode mitigates the explicit latency at a low cost. The pre-conditioning process is required only once during the lifetime of a FeFET and is not considered in the evaluation.

Robustness

Robustness against parameter variations is an essential aspect of PUFs in practical applications. To evaluate the robustness of our PUF against write voltage, device dimension, and temperature, we perform extensive system-level experiments on the proposed PUF. The default setup includes 2.8 V write voltage, 450 nm × 450 nm device dimension, and 25 °C. Figure 6a–c shows the HW, HDinter, and HDintra for different write voltages (3.2 V, 3.6 V), device dimensions (200 nm × 200 nm, 200 nm × 100 nm), and temperatures (55 °C, 85 °C), respectively. Detailed results are provided in Supplementary Note 10. For the write voltage and device dimension analysis, 1000 repetitive evaluations are performed to calculate the HDintra, while for temperature analysis, the same devices and CRPs are evaluated for 1000 times at different temperatures. In comparison to the default setup, the HW and HDinter show little degradation but still lie around the ideal value of 50%, and the HDintra is 0.7% at 85 °C. This high uniqueness and repeatability are empowered by the intrinsic robustness of FeFET C2C variation and the co-optimizations of the proposed circuit design.

Fig. 6: Robustness evaluation.
figure 6

Our PUF demonstrates high robustness against parameter variations, including (a) write voltage, (b) device dimension, and (c) temperature. For write voltage and device dimension, 1000 repetitive tests are performed to calculate HDintra. For temperature, the same devices are used to generate 1000 CRPs for HDintra evaluation. d Measured FeFET \({V}_{{{{\rm{TH}}}}}\) at 85 °C shows good retention. e The metrics indicate strong robustness of our PUF over time. Source data are provided as a Source Data file.

Since the robustness of the proposed PUF over time depends on the nonvolatile performance of the FeFETs, we also evaluate the retention of the FeFETs and its impact on the PUF performance. As shown in Fig. 6d, at a high temperature of 85 °C, the \({V}_{{{{\rm{TH}}}}}\) of the HVT FeFETs slightly decreases over time, while the \({V}_{{{{\rm{TH}}}}}\) of the LVT FeFETs remains stable. However, this \({V}_{{{{\rm{TH}}}}}\) variation over time has a negligible impact on the overall performance of the proposed PUF and induces 0% HDintra after 66956 s, as shown in Fig. 6e. This is because the reduction in the FeFET on/off ratio simultaneously increases the \({V}_{{{{\rm{X}}}}}\) for cells outputting ‘0’ and decreases the \({V}_{{{{\rm{X}}}}}\) for cells outputting ‘1’. Besides, the ultra-high FeFET on/off ratio also significantly narrows the \({V}_{{{{\rm{X}}}}}\) variations (detailed in Supplementary Note 5). Therefore, the averaging computation enabled by capacitor coupling effectively cancels the impact and ensures high robustness over time.

The utilized charge-domain CiM may induce computation errors, which compromises response uniformity, uniqueness, and repeatability. Therefore, we analyze the impact of various non-ideal factors on the charge-domain CiM computation accuracy and the PUF performance. The capacitor mismatch is evaluated based on the experimental data from prior works54,55, and the thermal noise is estimated as detailed in Supplementary Note 5.

Figure 7a, b shows the computation error rates for different challenge vector lengths (i.e., PUF cells) and capacitor mismatch, evaluated using the worst-case challenge inputs (with equal number of cells outputting ‘1’ and ‘0’). More bits in a challenge vector narrow the interval between voltage levels, leading to overlaps due to capacitor mismatch. This effect reduces the sense margin and degrades the accuracy. However, the proposed charge-domain computing significantly mitigates the impact of capacitor variations. As shown in Fig. 7a, b, our PUF achieves 0% computation error for 129-bit vectors using 1fF capacitors with 0.8% mismatch. The HDinter remains within an acceptable range of around 50%, while the HDintra exhibits an undegraded 0% BER, demonstrating high scalability. Further theoretical analysis of variations is provided in Supplementary Note 5.

Fig. 7: Computation accuracy evaluation.
figure 7

a Increasing challenge vector length narrows the voltage interval and increases the computation error rate. b Using capacitors with larger capacitance reduces mismatch and improves computation accuracy. c The averaging computation cancels the impact of the capacitor temperature drift, enabling high accuracy. d The reliable charge-domain computing mitigates the impact of \({V}_{{{{\rm{X}}}}}\) variations and ensures less degraded PUF performance. e The computation error rate increases as \({V}_{{{{\rm{DD}}}}}\) decreases due to the reduced sense margin. f Both higher or lower \({V}_{{{{\rm{R}}}}}\) narrows the on/off ratio of \({V}_{{{{\rm{X}}}}}\) and limits the performance of the proposed PUF. Source data are provided as a Source Data file.

Figure 7c, d shows the impact of temperatures and \({V}_{{{{\rm{X}}}}}\) variations. As shown in Fig. 7c, temperature has a negligible impact on the proposed charge-domain computing attributed to the low-temperature sensitivity of the MOM capacitors utilized in the design56. Meanwhile, the averaging computation cancels the impact of temperature drift as the capacitors are under almost the same temperature. The \({V}_{{{{\rm{X}}}}}\) variations minimally affect the uniqueness while increasing the BER. However, thanks to the differential FeFET storage and FeFET ultra-high on/off ratio, \({V}_{{{{\rm{X}}}}}\) is highly stable with a C2C variation of only 3 mV in the experimental measurements and enables 0% BER.

Figure 7e, f shows the impact of \({V}_{{{{\rm{DD}}}}}\) and \({V}_{{{{\rm{R}}}}}\). Decreasing \({V}_{{{{\rm{DD}}}}}\) results in an increasing trend in the computation error rate due to the reduced sense margin. For \({V}_{{{{\rm{R}}}}}\), deviations from the optimal value, either higher or lower, compromise the \({V}_{{{{\rm{X}}}}}\) on/off ratio and limit the performance, which is consistent with the observations in Fig. 4e. Therefore, we utilize 0.5 V \({V}_{{{{\rm{DD}}}}}\) and 1.2 V \({V}_{{{{\rm{R}}}}}\) to achieve a balanced tradeoff between energy efficiency and computation accuracy.

Resilience to ML attacks

The publicly accessible interface of strong PUFs induces the vulnerability of ML attacks. Typical attack models include logistic regression (LR)19,57, support vector machines (SVM)8, boosting algorithms57, and deep learning (DL)58. To evaluate the resilience of our PUF, we utilize multiple models, including LR, linear SVM, SVM with radial basis function kernel, multi-layer perception (MLP), XGBoost, and generative adversarial networks (GAN). These models cover most of the existing ML attacks and have been shown to attack various strong PUFs successfully20.

As illustrated in Fig. 8a, an XOR PUF structure is introduced to enhance the ML attack resilience of our PUF26. In a \(k\)-XOR PUF, the output response bits of \(k\) PUFs are XORed to produce the final output response bit. The nonlinearity provided by the additional XOR operation obfuscates the output response bit, significantly increasing the number of CRPs required to train the ML models, which improves the resilience to the ML attacks.

Fig. 8: ML attack resilience evaluation.
figure 8

a XOR PUF structure is introduced to improve resilience against ML attacks. The response bits generated by multiple independent PUF instances are XORed to introduce nonlinearity into the final output. b 8-XOR APUF still shows vulnerability under the investigated attacks with enough CRP samples. c The 8-XOR version of our PUF exhibits almost invulnerable resilience with a maximum accuracy of 52.1% (ideally 50%) even using 108 training samples. d The ML attack resilience of our PUF can be further enhanced by integrating more PUF cells, benefitting from the high scalability of the proposed charge-domain computing. Source data are provided as a Source Data file.

Figure 8b, c shows the maximum ML attack accuracy among all the investigated attack models for APUF and our PUF. The CRPs of the APUF are generated by the pypuf toolbox59, and the CRPs of our PUF are calculated from the extracted experimental data. Both types of basic PUFs show vulnerability as the CRP training set size increases. However, when utilizing the 8-XOR structure, the arbiter PUFs remain vulnerable (99.8% attack accuracy) to LR and MLP attacks. In comparison, the 8-XOR version of our PUF obtains near-ideal ML attack resilience (52.1% accuracy) in all the attack models even when 108 CRPs are used for training. This robust resilience is attributed to the nonlinear XNOR computation in the CiM-based response generation scheme and the additional XOR structure. Compared with the previous NVM-based strong PUFs28,29,39, the proposed PUF shows comparable resilience to ML attacks. However, thanks to the proposed device-circuit co-optimizations, including the use of FeFETs, differential voltage, and charge-domain CiM, the enhanced version of our PUF achieves ultra-high energy efficiency (15.67fJ per bit), variation resilience, and scalability, significantly outperforming the existing strong PUFs with similar resilience. Detailed ML attack configurations and results are provided in Supplementary Note 11.

Figure 8d analyzes the impact of the PUF cell number on the ML attack resilience. The MLP attack results show that increasing the number of PUF cells expands the required training set size. Since our PUF demonstrates high scalability, more PUF cells can be integrated to further enhance the resilience to ML attacks. Furthermore, the high reconfigurability of our PUF also enables frequent reconfiguration to limit the number of CRP samples available to the attacker. Both characteristics provide more design space to improve the resilience of our PUF against ML attacks.

Discussion

In this work, we demonstrate a FeFET-based strong PUF with high reconfigurability and ultra-low power. Uniquely, the PUF exploits the FeFET C2C variation and the charge-domain CiM to achieve significant improvements in reconfigurability and energy efficiency compared with the state-of-the-art strong PUFs. The summary in Table 1 highlights the great promise of FeFETs in IoT devices. We believe these innovations would open a design space for future reconfigurable and low-power PUF designs. Moreover, the proposed 2FeFET-1C cells can be fabricated with good CMOS compatibility, paving the way for future macro-level integration. The high robustness against parameter variations and strong resilience to ML attacks also make the proposed PUF a promising and feasible security solution in practical implementations.

Table 1 Comparison with prior works

Methods

Device fabrication

In this paper, the AND-type FeFET arrays are fabricated using GlobalFoundries’ 28 nm HKMG technology node. The platform has been demonstrated to provide simultaneous integration of CMOS devices together with the FeFETs on the same chip. In the FeFETs, 10 nm thick silicon-doped HfO2 is used as the FE layer with a 1 nm interface layer of SiO2.

Electrical characterization

The electrical characterization of the PUF is performed on AND-type FeFET arrays, which consist of 9 wordlines (WLs) connecting the gates row-wise and 7 bitlines/sourcelines (BLs/SLs) connecting the drain and source contacts column-wise. All FeFETs are preconditioned for 50 program-erase cycles with write pulses of 4.5 V 500 ns and −5 V 500 ns. The FeFET read operation is performed by applying a voltage ramp from −0.2 V to 1.8 V in increments of 100 mV while measuring the currents at the drain electrode biased at 100 mV with the bulk and source grounded. All the unselected WLs are biased at −0.5 V to suppress leakage currents.

The PUF cell is constructed by shorting the SL of the left FeFET to the BL of the right FeFET using externally connected jumper wires (Supplementary Fig. 1). In this way, 27 PUF cells are constructed over the 9 WLs. The jumpers are first removed, and the whole array is reset by a −5 V erase voltage applied for 500 ns at the gate electrodes, which erases all FeFETs to HVT. Afterward, the right FeFETs in all cells are weakly programmed, e.g., written with a pulse of 2.8 V for 500 ns. To not disturb the erased state of the left FeFETs, the BLs/SLs of the right and left FeFETs are set to 0 V and 1.7 V, respectively. Furthermore, the unselected WLs are raised to 0.9 V to minimize potential disturbs on the half-selected FeFETs that share the 1.7 V BLs and SLs.

The PUF-specific readout is performed by connecting the jumpers. After challenges are input to the PUF cells, the shorted BL/SL contacts (node X) are set to high ohmic mode and \({V}_{{{{\rm{X}}}}}\) is measured. After comparing the \({V}_{{{{\rm{X}}}}}\) of two cycles, both FeFETs in the PUF cell are reset, and the FeFETs corresponding to the comparison result are rewritten to LVT/HVT by applying the program voltage of 4.5 V for 500 ns or −5 V erase voltage for 500 ns. To reduce the interference, the unselected BLs and SLs are inhibited to 2.6 V, and the WLs are raised to 1.3 V.