Introduction

Counterfeiting is a significant problem worldwide and is responsible for serious economic losses in a wide range of everyday transactions1,2. It can even be life-threatening when counterfeit goods such as fake medicines are passed off as genuine3,4. To tackle this critical issue, techniques such as watermarks and fluorescence labels have been developed and are used on banknotes all over the world. However, these conventional approaches are now at risk from a resurgence of counterfeiting, because their deterministic fabrication processes are prone to forgery5. To this end, emerging physical unclonable function (PUF) systems, based on the non-predictable responses of integrated circuits6 or random patterns of micro/nanostructures5, serve as an effective solution for unforgeable anticounterfeiting. Thanks to recent advances in nanotechnology and optical cryptography7, PUF labels have been successfully developed from a large number of optical nanomaterials, including plasmonic nanoparticles8,9,10, surface-enhanced Raman spectroscopy nanoparticles11, quantum dots12, Mie-resonant silicon nanoparticles13, upconverting nanoparticles14, and metasurfaces15. In general, these nanomaterials provide a variety of optical signals, such as photoluminescence, scattering, and Raman signals, which can be tailored to carry authentic information for encryption/decryption.

To achieve unbreakable encryption for PUF labels, it is crucial to have a large enough encoding capacity, which represents the theoretical maximum number of unique tags5. There are two main ways of enhancing encoding capacity5: (1) increasing the pixel number of the encoded image, and/or (2) high-dimensional encoding. Compared with the former, high-dimensional encoding requires a relatively short readout time and is therefore regarded as the more promising solution. However, to date, there have been only a few attempts to create high-dimensional (≥3D) encoded PUF labels9,11,16,17. In particular, light has been intensively investigated as a carrier of multi-dimensional encrypted information, owing to its abundant degrees of freedom such as polarization, phase, wavelength, and frequency18. Among these, polarization has been explored extensively in applications such as three-dimensional display technology19, optical communication20, optical storage21, optical encryption22, super-resolution imaging23,24, and orientation measurements23,24. Among the various polarization-sensitive optical emitters (e.g., fluorescent molecules25, plasmonic nanorods26, upconverting nanorods27, carbon nanotubes28, and defects in 2D materials29), nitrogen-vacancy (NV) centers, a kind of photoluminescent defect hosted in diamond crystals, have been hailed as among the most promising candidates for anticounterfeiting labels due to their high contrast in polarization modulation (~89% for a single NV center30), unlimited photostability31,32, and cost-effective mass production33, not to mention the properties of the diamond host itself34,35.

In a practical PUF anticounterfeiting system, the desired authentication method should have a low false-positive rate, low time consumption, and noise tolerance. However, prevailing techniques fall short of these criteria. First, owing to its broad applicability and low false-positive rate5, the point-by-point comparison method (evaluated by the similarity index11 or Hamming distance36) is widely used11,37,38,39,40,41. However, this method shows poor noise tolerance because of its sensitivity to pixel-level intensity information42, which is easily corrupted by common noise sources. Second, contemporary artificial intelligence (AI)-driven methods9,12,16,43 can achieve a low false-positive rate and be noise-resilient by learning robust feature representations with deep neural networks. Unfortunately, these approaches frame authentication as an image classification problem, which incurs substantial time costs in the training phase. Specifically, learning classifiers12 typically requires collecting numerous training samples for a single PUF label and necessitates retraining on data from all PUF labels whenever a new PUF label is produced43. Another AI-enhanced technique, deep metric learning44, directly learns the optimal standard of comparison between data based on deep image features, enabling it not only to authenticate unseen objects45 but also to be noise-robust46. Therefore, metric learning is emerging as a promising candidate to overcome the drawbacks of both the point-by-point comparison and classification methods.

This paper presents the first demonstration of high-dimensional (3D) encoding for diamond-based PUF labels, based on our previously reported linear polarization modulation (LPM) of NV centers in fluorescent nanodiamonds (FNDs)47, as shown in Fig. 1a. With a readout time of around 7.5 s, we achieved an encoding capacity of \({9}^{10\times 1024}\) (\({10}^{9771}\)) for digitized images, with high distinguishability, reproducibility, and long-term stability verified via a point-by-point comparison method. Moreover, we redefine the authentication problem as a metric learning task and propose a deep metric learning algorithm for robust authentication based on comparing the similarity of abstracted deep features, as illustrated in Fig. 1b. Our method is well motivated and amply satisfies practical application requirements. Specifically, a noise resilience evaluation demonstrates that our metric learning method effectively addresses the noise sensitivity inherent in the point-by-point comparison method, which is commonly encountered during an end user’s readout. In addition, compared with previous AI-driven methods that formulate authentication as a deep classification task, our reformulation exhibits two clear benefits: (1) a reduced training data requirement (only two sets of data are needed per label), and (2) the capacity to authenticate data from unseen labels, rather than necessitating retraining of the whole system once a new label is added43. These dual efficiencies significantly enhance the potential of our method for deployment in large-scale commercial settings.

Fig. 1: Extract deep features from 3D anticounterfeiting FNDs via metric learning.

a Schematic illustration of obtaining 3D anticounterfeiting information based on LPM curves of the FNDs with random orientations. b Schematic illustration of extracting deep features from 3D encoded information via a metric learning network.

Results

LPM curves of FNDs enable 3D anticounterfeiting

Diamond provides promising material properties for fabricating PUF labels, including high photostability, long-term stability, and tolerance to physical stress. Specifically, both the Raman signal of diamond and the fluorescent signal of NV centers are emitted continuously without blinking or bleaching31,32,38, which provides the basis for reproducible optical readout results. In addition, high hardness35 and chemical inertness34 make diamond-based labels tolerant to physical stress and long-term storage, respectively. However, despite the huge potential of existing diamond-based PUF labels38,39,48, the much-desired high-dimensional (>2D) encoding has not yet been achieved. Here we propose a method to achieve a 3D encoded diamond-based PUF label, based on the LPM47 curves of FNDs with random orientations.

Fabricated via an electrostatic adsorption approach (see section “Methods” for details), our PUF label is composed of FNDs with both high density and good dispersion on the cover glass (see the SEM image in Fig. 2b). Owing to these two characteristics, fluorescent images of FND PUF labels contain 200–450 bright spots (BSs) over a 30 × 30 µm area, most of which are close to the diffraction-limited size; a typical example is given in Fig. 2a. For the 3D anticounterfeiting information of FND PUF labels, the large number of BSs provides the basis for the distinguishability of the encoded images (see Supplementary Notes 1 for a detailed analysis), and the diffraction-limited size of the BSs is essential for sensitive optical readout (see below). Further sample information, including performance optimization data, can be found in Supplementary Notes 6, and the characteristics of the FND distribution within BSs are shown in Fig. S6.

Fig. 2: Characterization of our FND PUF labels.

a A typical wide-field fluorescent image of FNDs. Inset: an enlarged view of six marked bright spots. b A representative SEM image of FNDs. c LPM curves corresponding to the six marked bright spots in (a). \({I}_{\beta }\): the fluorescent intensity. \(\beta\): the linear polarization direction of the excitation laser. d Histogram of the LPM contrast distribution among all identified bright spots in (a). All scale bars are 2 µm.

In the fluorescent images of our FND PUF labels, diffraction-limited BSs with LPM curves provide the foundation for obtaining three-dimensional anticounterfeiting information. Specifically, based on the polarization-selective optical excitation of NV centers30, the LPM curves describe the relationship between the polarization direction of the linearly polarized excitation laser (\(\beta\)) and the fluorescent intensity of the FNDs (\({I}_{\beta }\)) (Fig. 1a). In the actual measurement, \(\beta\) is changed at a constant speed by rotating a half-wave plate mounted on an electric rotation stage, and wide-field fluorescent images of the PUF label are taken at \(\beta\) intervals of 6° (see section “Methods” for more details). To accurately extract the fluorescent signal of each diffraction-limited BS (around 10 pixels × 10 pixels in size), \({I}_{\beta }\) is calculated as the total signal within a 13 pixels × 13 pixels window matched to the identified position. Experimental results show that the LPM curves of the identified diffraction-limited BSs can be well fitted (solid lines in Fig. 2c) via Eq. 1 (ref. 47), with a coefficient of determination usually larger than 0.85.

$${I}_{\beta }={A}_{1}-{A}_{2}{\cos }^{2}\left(\alpha -\beta \right)$$
(1)

where \({A}_{1},{A}_{2},\alpha\) are fitting parameters (\({A}_{1} \, > \, 0,{A}_{2} \, > \, 0\)), and \({I}_{\beta }\) and \(\beta\) are the measured inputs. Examples of six fitted LPM curves are shown in Fig. 2c. Note that these LPM curves exhibit different LPM contrast values (\(\frac{{A}_{2}}{{A}_{1}}\)) and LPM phases (the fitted value of \(\alpha\)), corresponding to the different orientations of the FNDs. Therefore, based on fluorescent images acquired at different \(\beta\) values, one can obtain 3D encoded information comprising the LPM contrast values, the LPM phases, and the positions of the diffraction-limited BSs.
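To make this fitting step concrete, the following is a minimal sketch using SciPy’s curve_fit on synthetic data; the paper’s own analysis uses MATLAB’s fit(), and the initial guess, bounds, and noise level below are illustrative assumptions.

```python
import numpy as np
from scipy.optimize import curve_fit

def lpm_model(beta, a1, a2, alpha):
    """Eq. 1: I_beta = A1 - A2 * cos^2(alpha - beta); angles in radians."""
    return a1 - a2 * np.cos(alpha - beta) ** 2

# beta sampled every 6 degrees, as in the measurement protocol
beta = np.deg2rad(np.arange(0, 180, 6))

# synthetic curve standing in for the summed 13 x 13 pixel signal of one BS
rng = np.random.default_rng(0)
intensity = lpm_model(beta, 1000.0, 300.0, np.deg2rad(40.0)) \
            + rng.normal(0.0, 10.0, beta.size)

p0 = [intensity.max(), np.ptp(intensity), 0.0]   # illustrative initial guess
popt, _ = curve_fit(lpm_model, beta, intensity, p0=p0,
                    bounds=([0.0, 0.0, -np.pi], [np.inf, np.inf, np.pi]))
a1, a2, alpha = popt

contrast = a2 / a1              # LPM contrast value (A2/A1)
phase_deg = np.rad2deg(alpha)   # LPM phase

# coefficient of determination; the paper reports values usually > 0.85
ss_res = np.sum((intensity - lpm_model(beta, *popt)) ** 2)
ss_tot = np.sum((intensity - intensity.mean()) ** 2)
r_squared = 1.0 - ss_res / ss_tot
```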

Achieving a sensitive optical readout of the above 3D anticounterfeiting information requires a sufficiently high LPM contrast value. We define a sufficiently high LPM contrast value as one larger than 15%, i.e., 10 times the fluorescent intensity error (around 1.5%) observed in long-term detection (Fig. S1). Experimental results show that diffraction-limited BSs usually have LPM contrast values larger than 15%, whereas the contrast values of larger BSs are likely to fall below 15% (Fig. S2). Therefore, the crucial point in obtaining sufficiently high LPM contrast values is a high probability of finding diffraction-limited BSs. Our PUF label meets this requirement well (see the description of Fig. 2a above), yielding a high proportion of sufficiently high contrast values. A typical example is given in Fig. 2d: 215 of 306 identified BSs have LPM contrast values larger than 15%.

Anticounterfeiting performance of 3D encoded images

Based on the above 3D anticounterfeiting information, we then propose a 3D encoding scheme to obtain digitized images. Using the classical and widely used authentication method of point-by-point comparison5, we tested the digitized images for distinguishability, reproducibility, long-term stability, and stability under sonication.

Digitized results were obtained from the optical images of FND PUF labels under different \(\beta\) values, at a pixel resolution of 32 × 32 (see section “Methods” for details). To effectively capture the two dimensions of LPM contrast value and LPM phase, a feasible method is to use the relative change of \({I}_{\beta }\) across different \(\beta\) values. To reflect this relative change, we convert the photon number in each image pixel to a contrast value via Eq. 2:

$${{\mbox{contrast}}}_{\beta=n}=\frac{{{\mbox{counts}}}_{\beta=n}-{{\mbox{counts}}}_{\beta=0}}{{{\mbox{counts}}}_{\beta=0}}$$
(2)

where \({{\mbox{contrast}}}_{\beta=n}\) and \({{\mbox{counts}}}_{\beta=n}\) denote the contrast value and photon number of an image pixel, respectively, under the condition \(\beta=n\). In the digitization process, nine contrast levels are set according to Table S3, designed based on the precision and range of the contrast values. An example of the digitized images is given in Fig. 3a: there are three encoding dimensions, namely contrast levels, polarization angles, and pixel positions. Calculated via the inset formula5,11 in Fig. 3a, the encoding capacity of our PUF label reaches \({9}^{10\times 1024}\) (\({10}^{9771}\)), which is much larger than the commonly suggested minimum encoding capacity (\({10}^{300}\))5. In addition, the readout time for these digitized images is just 7.5 s. Therefore, we achieved a sufficiently large encoding capacity as the basis for unbreakable encryption within a relatively short readout time.
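As a quick arithmetic check of the quoted capacity, the exponent form follows the formula inset in Fig. 3a (levels raised to the power of pixels × contrast images); the short Python computation below, with our own variable names, reproduces \({10}^{9771}\).

```python
import math

levels = 9        # contrast levels (Table S3)
pixels = 32 * 32  # pixel resolution of the digitized images
images = 10       # contrast images, one per nonzero beta (beta = 0 is the Eq. 2 reference)

log10_capacity = pixels * images * math.log10(levels)
print(f"capacity = 9^{pixels * images} ~ 10^{round(log10_capacity)}")  # -> 10^9771
```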

Fig. 3: Anticounterfeiting performance for high-dimensional encoded FND PUF label.

a Digitized results of wide-field fluorescent images obtained under different laser polarization directions (\(\beta\)). Inset: the formula used to calculate the encoding capacity. Image resolution: 32 pixels × 32 pixels. b Authentication results for two groups of digitized readout results of 300 PUF labels. Left panel: heat map showing the pairwise match; the color bar represents the similarity index. Right panel: histogram showing the statistics of similarity indexes among digitized readout results for different PUF labels (red bars) and the same PUF labels (blue bars). c Histogram showing the statistics of similarity indexes among digitized results for repeated readouts of the same PUF label. d Long-term stability curves for the readout results of 3D anticounterfeiting information.

To demonstrate the feasibility of the above encoding method in distinguishing different PUF labels, we applied the widely used authentication method of point-by-point comparison5. Specifically, when two groups of digitized images are compared pixel by pixel, the ratio of identical pixels is recorded as the similarity index11. If there is an evident gap between the similarity indexes of the same labels and those of different labels, a threshold value within the gap can be chosen to correctly distinguish the PUF labels. In our authentication process, the similarity indexes were calculated between two groups of digitized images for 300 PUF labels. The results are shown in Fig. 3b: the similarity indexes of the same labels are always higher than 76%, while those of different labels are always lower than 70%. Therefore, we can choose a threshold value of 75% (within the 70–76% gap) to successfully distinguish all 300 PUF labels.
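For concreteness, the point-by-point comparison amounts to counting identical pixels across the two stacks of digitized images; the minimal Python sketch below shows the whole decision rule (the array shape is our assumption, and the 75% threshold is the one chosen above).

```python
import numpy as np

def similarity_index(a: np.ndarray, b: np.ndarray) -> float:
    """Fraction of identical pixels between two digitized readouts.

    a, b: integer arrays of the same shape, e.g. (10, 32, 32), holding
    the digitized contrast levels for each polarization angle.
    """
    return float(np.mean(a == b))

def authenticate(a: np.ndarray, b: np.ndarray, threshold: float = 0.75) -> bool:
    """Accept the pair as the same label if the index clears the threshold."""
    return similarity_index(a, b) >= threshold
```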

Moreover, based on the above threshold value, we evaluated the reproducibility, long-term stability, and stability under sonication of the digitized readout results of our PUF label. First, for reproducibility, we calculated the similarity indexes among 10 groups of digitized images of the same PUF label. The authentication results for 20 PUF labels in Fig. 3c show satisfactory reproducibility: the similarity indexes are always well above the threshold value (75%) (see Fig. S3 for specific examples of the reproducibility of one label). Then, long-term stability was tested based on repeated readouts of nine PUF labels over a period of around 159 days. For each readout date, similarity indexes were calculated between the digitized readout results on that date and those on the first day. The authentication results confirmed satisfactory long-term stability: the similarity indexes always remain clearly above the threshold value (75%) (Fig. 3d). Finally, stability under sonication was tested by comparing authentication results for the same eight labels under two conditions: (1) with sonication and (2) without sonication. The test results (Fig. S7) show satisfactory stability: sonication has little influence on the similarity index distribution, and different labels can still be successfully distinguished via the threshold value (75%).

Deep metric learning for authenticating noise-affected digitized images

For practical authentication of anticounterfeiting labels, the system’s tolerance to common noise sources is crucial. Specifically, in the practical authentication process, manufacturers provide images taken under ideal laboratory conditions, but end users may supply images taken in environments saturated with various additional noise sources. Even if the optical signal from a PUF label is stable, these noise sources influence the readout process, which might prevent authentication algorithms such as point-by-point comparison from working42.

To this end, simulating the actual authentication process, we conducted a noise resilience evaluation of the point-by-point comparison method with FND PUF labels. In this evaluation, images of PUF labels taken under two different optical readout conditions are used for authentication based on the similarity index, as shown in the left panel of Fig. 4a. Specifically, our experiment mirrors the image capture process of both the manufacturer and the end users: an optical readout of 150 PUF labels was conducted under ideal laboratory conditions and under laboratory conditions contaminated with three kinds of noise sources, respectively (refer to Supplementary Notes 3 for details). These noise sources, namely background light, sample drift, and out-of-focus imaging, are all common in the readout process. Digitized images corresponding to the two optical readout conditions show evident differences, with a typical example in the right panel of Fig. 4a. As shown in Fig. 4b, c, the experimental results reveal an overlap between the similarity index distributions for images of the same label and images of different labels, which implies that it is impossible to find an appropriate threshold to distinguish the PUF labels. Similar negative results have been reported in noise resilience evaluations of other PUF labels based on point-by-point comparison42 and were attributed to the widespread limitation of this method, i.e., its sensitivity to some common noise sources. This highlights the need for a more robust authentication algorithm.

Fig. 4: Authentication of noise-affected digitized images via point-by-point comparison.

a Digitized images (right panel) of the same FND PUF label corresponding to different optical readout conditions (left panel). b, c Authentication results for the digitized images of 150 FND PUF labels captured under the two optical readout conditions in (a). Heat map (b) showing the pairwise match; the color bar reflects the similarity index. Histogram (c) displaying the statistics of similarity indexes among digitized readout results for the different FND PUF labels (red bars) and the same FND PUF labels (blue bars).

To develop a more robust, noise-tolerant algorithm, our key inspiration was that it is hard to match the digitized images of a PUF label by eye in the presence of noise (as in Fig. 4a), whereas we can readily recognize natural images by eye even when they are mixed with noise49. The reason is that most noise in daily life affects information at the pixel level, while we have seen many natural images and have been “trained” to recognize them based on high-level semantic information50. Thus, to solve the challenge in the noise resilience evaluation, the critical factor is to show our machine as many PUF labels as possible at the training stage (i.e., prior information). It is essential to provide the model with training data that teach it to discern the high-level information crucial for distinguishing samples and enhancing noise resilience.

Given the profound capacity of deep learning51,52,53 to learn prior information and the ability of convolutional neural networks (CNNs) to extract deep patch-level features from images, we propose to exploit these features for our authentication system. In particular, we propose the use of deep metric learning44 for anticounterfeiting authentication. Metric learning excels at identifying essential differences between data instances, making it well suited for authentication tasks. By learning a distance metric that reflects the intrinsic similarities and dissimilarities among instances, metric learning has demonstrated effectiveness in real-world applications such as robust face verification in the wild54, variation-tolerant face recognition46,55, and person re-identification56.

The core concept of metric learning, as shown in Fig. 5a, is to enable accurate clustering of images based on their content, even when they are subjected to noise or distortions. In the original image space depicted on the left side of Fig. 5a, an image \({I}_{X}\) moves away from its original position when affected by noise, resulting in the image \({I}_{X'}\). Consequently, if we measure distance with the point-by-point comparison, \({I}_{X'}\) may end up closer to another image \({I}_{Y}\) than to \({I}_{X}\), which could cause an incorrect match. To tackle this issue, our metric learning framework utilizes a neural network trained to extract noise-resistant features, allowing images with similar content to be accurately grouped together, as depicted on the right side of Fig. 5a. In other words, the high-level information of the digitized image is extracted and represented as a deep key in the metric space. In this way, we can compare the similarity of digitized images by calculating a similarity score (see section “Methods” for more details) in the metric space, instead of performing a point-by-point comparison in the original image space.
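The comparison step in the metric space can be summarized in a few lines. Below is a minimal PyTorch sketch assuming a trained encoder network; flattening the feature maps into one vector per image before taking the cosine similarity is one simple design choice, not necessarily the exact operation detailed in Supplementary Notes 8.

```python
import torch
import torch.nn.functional as F

@torch.no_grad()
def similarity_score(encoder: torch.nn.Module,
                     x: torch.Tensor, y: torch.Tensor) -> float:
    """Compare two digitized readouts in the learned metric space.

    x, y: batches of shape (1, C, H, W) holding the digitized images.
    The flattened CNN feature maps act as the "deep key" of each image.
    """
    fx = encoder(x).flatten(start_dim=1)
    fy = encoder(y).flatten(start_dim=1)
    return F.cosine_similarity(fx, fy, dim=1).item()
```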

Fig. 5: Metric learning for the authentication of noise-affected digitized images.

a Schematic diagram showing the design and insight of our metric learning framework. CNN: convolutional neural network. b Schematic diagram showing the training process of CNN in one loop. c, d Authentication results for the 150 pairs of digitized images used in Fig. 4b, c. Heat map (c) showing the pairwise match; the color bar reflects the similarity score. Histogram (d) displaying the statistics of similarity scores among digitized readout results for the different FND PUF labels (red bars) and the same FND PUF labels (blue bars).

To achieve this goal, the key innovation of metric learning lies in its unique training strategy. As shown in Fig. 5b, a deep neural network is trained to bring features belonging to the same PUF label (i.e., \({F}_{X}\) and \({F}_{X'}\)) closer together while pushing features from different PUF labels (i.e., \({F}_{Y}\) and \({F}_{X'}\)) further apart. Before training, features from the same label might be far apart in the metric space (or too close to those of a different label), contrary to the desired outcome. In such cases, the loss function penalizes the network, forcing it to adjust its parameters in the appropriate direction via back-propagation. After several iterations, the network learns how to extract the key information from images, resulting in a metric space where features of the same label are close together and features of different labels are farther apart. Compared with the similarity index based on point-by-point comparison, the patch-level structural features typically extracted by neural networks are more resilient to noise.

We demonstrate the robustness of our method against a variety of noise sources commonly encountered during the readout process. Notably, during training we do not assume any prior knowledge about the noise that may be present in practical use, and we train only on PUF data acquired under the ideal condition; this avoids introducing biased evaluation results. For validation, we use the same 150 pairs of PUF images employed in Fig. 4b. Importantly, the PUF labels used for training are completely different from those in the test set, preventing any overfitting issue in our test results (see section “Methods” for more details). As depicted in Fig. 5c, d, even under these challenging conditions, our method distinguishes between pairs of different PUF labels and pairs of identical PUF labels with 100% precision. There is a significant gap (~13%–23%) between the similarity score distributions of intra-class and inter-class PUF labels. By contrast, authentication with point-by-point comparison is prone to confusion (see Fig. 4b, c). These results demonstrate that our algorithm recognizes PUF labels better than the point-by-point comparison method in the presence of the investigated noise sources. Additionally, we provide validation results of our method under ideal conditions in Fig. S4, which show even more distinct decision boundaries. This further proves the robustness and reliability of our metric learning-based approach for accurate PUF label authentication, whether under ideal conditions or in real-life noise environments.

Characteristics of the metric learning method compared to prior AI-driven methods

Prior to our work, there have been some AI-driven methods9,12,43 for PUF label authentication. These methods typically formulate the problem as a discriminative image classification task, which learns to map a data pattern to a category. In contrast, we redefine the problem as a deep metric learning problem, focusing on learning a similarity measurement between two samples using deep features to assess whether they come from the same label. Here, we analyze the differences between our metric learning-based authentication method and classification-based authentication methods in the training and testing stages.

First, our method is more data-efficient during the training phase. Classification methods learn to map a data pattern to a category by predicting the probability of each category given an input. Since the mapping differs across categories, this requires a large amount of training data for each category (Fig. 6a); if there is not enough data for a category, the model is prone to overfitting the training data and performs poorly during testing57 (see Supplementary Notes 9 for an experimental demonstration). Our metric learning, on the other hand, focuses on learning a similarity measurement that evaluates whether a given pair of readout results is similar, i.e., predicting a similarity score for a pair of readout results. This task can be accomplished with pairs or triplets of samples (a reference sample, a positive sample from the same category, and a negative sample from a different category). Our training stage requires only two groups of readout results from each label (Fig. 6a), because the similarity measurement can be shared among different PUF labels57: with \(N\) readouts from PUF labels, our method derives \({C}_{N}^{2}\) unique pairs for training, leading to a much larger number of training pairs than classification methods (i.e., \({C}_{N}^{2}\gg N\)), as illustrated by the sketch below. In sum, our method needs much less training data, thereby saving the considerable time that would otherwise be consumed in repeated readouts of PUF labels.
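A minimal sketch of this pairing scheme (variable and function names are ours): with only two readouts per label, every pair of readouts still yields a supervised training example, positive when both come from the same label and negative otherwise.

```python
from itertools import combinations

def make_training_pairs(readouts):
    """readouts: list of (label_id, digitized_image) tuples,
    e.g. two readouts per PUF label. Returns (img_a, img_b, target)
    triples with target = 1.0 for same-label pairs, else 0.0."""
    pairs = []
    for (id_a, img_a), (id_b, img_b) in combinations(readouts, 2):
        pairs.append((img_a, img_b, 1.0 if id_a == id_b else 0.0))
    return pairs

# N readouts give C(N, 2) = N*(N-1)/2 pairs:
# e.g. 240 labels x 2 readouts = 480 readouts -> 114,960 training pairs
```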

Fig. 6: Comparison of classification and metric learning methods for PUF label authentication.

a Schematic diagram showing the difference in the amount of data required during the training process. b Schematic diagram showing the difference in the authentication process and the capability to authenticate unseen PUF labels, where the classifier cannot evaluate unseen class labels based on predicted class probabilities. “Key” denotes the anticounterfeiting information from a single readout of a PUF label.

Second, our method can effectively authenticate new PUF labels unseen during training (Fig. 5c, d), a capability that classification-based methods lack9,12,43. As shown in Fig. 6b, classification methods fix the number of categories (e.g., 10) during the training phase. In this case, the model predicts the probabilities of 10 categories and utilizes the highest one to determine the class to which a PUF label belongs. Consequently, if a provider manufactures a new 11th PUF label, the network will still only predict probabilities of 10 categories. Under these circumstances, the method would either predict all 10 probability values to be low, thereby deeming the label to be false, or one probability might be high, leading to an incorrect classification of the label (see Fig. S14). Although the features learned in the penultimate layer can indeed be utilized for unseen label authentication in a classifier, similar to the operation of deep metric learning in the testing phase, the efficacy of this approach remains unsatisfactory. This is supported by studies such as DeepFace58 and ArcFace59, which indicate that metric learning is still necessary, whether explicitly or implicitly, to facilitate the learning of discriminative features for unseen label inference through pairwise feature comparisons. A classification model trained using basic SoftMax loss, akin to recent approaches in the PUF authentication field, exhibits poor performance in similarity comparison for unseen labels when using the method described above. We confirmed these findings through additional experiments, as detailed in Supplementary Notes 9 (Experiment 3). By contrast, our method compares whether two readout results of PUF labels are similar, and the learned similarity metric can be applied for new PUF labels. Consequently, even if we only use digitized images of certain PUF labels (such as PUF1-PUF10) during the training stage, we can still compare two new readout results (for instance, from PUF11 and PUF12), as shown in Fig. 6b. The experimental demonstration can be found in Figs. 5c, d and S4, where all our experimental evaluations are on unseen labels. Therefore, our method is well-suited for use in real-world business situations where new PUF labels are continually being created, while the classification technique’s requirement for retraining makes it inconvenient to use in such conditions.

Discussion

Combining the advantages of the FND sample and our 3D encoding scheme, LPM of FNDs is a promising route to a practical PUF label that satisfies the most common requirements in terms of both commercial and anticounterfeiting performance. Specifically, common commercial requirements5 include (1) low-cost and scalable fabrication and (2) convenient and fast readout. For our FND PUF label, the estimated cost of a working label is considerably lower than 0.19 USD (Table S1). Its simple fabrication method using a mature commercial sample (see section “Methods” for details) offers the opportunity for scalable fabrication, and the strong optical signal of FNDs (Fig. S5) provides the basis for fast optical readout (in our scheme, the integration time for an image is 50 ms, and the total readout time is 7.5 s). In addition, common requirements for the anticounterfeiting performance of PUF labels include (1) high encoding capacity for unbreakable encryption5, (2) reproducible authentication results11, and (3) labels that remain precisely authenticable over a long time. The authentication results (Fig. 3) prove that our FND PUF label meets these requirements well: an encoding capacity as high as \({10}^{9771}\) with satisfactory distinguishability, reproducible authentication across 10 repeated readouts of the same labels, and stable anticounterfeiting information over a period of around 159 days.

Two other core points about the FND PUF label should be stressed here. Firstly, compared with three representative 3D encoded labels (Table S2), our 3D encoding scheme shows two advantages: (1) a higher encoding capacity under the same image pixel conditions, and (2) simplified label fabrication requiring only one type of “ink”, i.e., FNDs. Secondly, a remaining challenge is achieving a cost-effective and user-friendly readout device, which is crucial for practical usage. Fortunately, the rapid development of portable microscopy points toward a promising solution to this obstacle (see Supplementary Notes 5 for details).

Building on the similarity score extracted from the deep features of two sets of digitized results, we propose a metric learning authentication method showing better noise resilience and higher training efficiency than prior AI-driven methods. Specifically, unlike a previous classification method12 that uses the same artificially created “noise” or disruptions during both the training and testing phases, our CNN is trained on data read out under ideal laboratory conditions, yet demonstrates robust noise tolerance in evaluations (Fig. 5c, d). This suggests that our method is better able to handle data readout in a variety of real-world scenarios. In addition, in contrast to the classification approach9,12,43, in which a large amount of training data is needed for each PUF label and retraining is required whenever new labels are introduced, our method requires only two sets of readout results for each PUF label and obviates the need for retraining when encountering new objects, thereby saving considerable time. Our metric learning approach is flexible and can accommodate various PUF shapes owing to its learning-based nature. We believe that our approach will offer valuable insights into the PUF authentication field, encouraging real-world implementations and inspiring future research.

Our authentication method also satisfies common practical requirements: sufficiently high authentication speed and availability for product traceability. First, in our authentication method, the time to compare digitized images with one set of stored objects is 1.38 ms, which is sufficiently fast for practical use with a well-matched design of the authentication process (see Supplementary Notes 4 for a detailed analysis). Second, in many previously established product traceability algorithms11,14,37, traditional point-by-point comparison methods play the crucial role of determining whether two groups of encoded images belong to the same label. Our deep metric learning method can seamlessly take over this role and thus aligns well with these algorithms.

Methods

Experimental apparatus

All fluorescent images of the FND PUF labels were taken with a customized wide-field fluorescence microscope. In the excitation optical path, the linear polarization direction of a continuous 532 nm laser is changed via a half-wave plate (WPH10M-532, Thorlabs) mounted on an electric rotation stage (PT-GD62, PDV). The half-wave plate together with a polarizer is used to set and verify the initial laser polarization direction (\(\beta=0\)), ensuring a consistent starting polarization. The excitation laser is then focused on the back-focal plane of an oil immersion objective (NA 1.45, UPLXAPO100XO, Olympus) to illuminate the sample. The sample position can be finely adjusted via a nanopositioning stage (P561.3CD, Physik Instrumente). In the detection optical path, the fluorescence signal, filtered with a long-pass filter (FELH0650, Thorlabs), is detected via a water-cooled EMCCD (iXon Ultra 897, Andor) with a field of view of around 30 × 30 μm.

Fabrication of FND PUF label

The PUF label is fabricated from FND-COOH containing ensemble NV centers (BR100, FND Biotech, Inc.) through electrostatic adsorption. Specifically, cover slides are activated by plasma for 10 min (200 W) and then immersed in a solution of 3-aminopropyltriethoxysilane (APTES, Sigma) in ethanol (v/v, 5%). After reacting for 24 h at room temperature, the cover slides are taken out and washed with ethanol and water, respectively. After that, 0.02 mg/mL FND solution is drop-cast onto the resulting positively charged cover glass and incubated in a refrigerator for 3 h. Next, the samples are washed with DI water and dried in air. Finally, a PDMS layer is coated onto the cover glass as a protection layer. Supplementary Notes 7 shows the influence of the PDMS layer on the readout results of the three encoding dimensions.

Particle location code

We use the peak location function (pkfind()) and the spatial bandpass filter function (bpass()) from the MATLAB particle location code by Daniel Blair and Eric Dufresne (https://site.physics.georgetown.edu/matlab/index.html). The spatial bandpass filter function is used to filter noise from the wide-field fluorescent images; the peak location function is then used to identify and locate the BSs. Unless otherwise stated, the wide-field fluorescent images shown in the main text and Supporting Information have been processed with the spatial bandpass filter.
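For readers without MATLAB, a rough, simplified Python analogue of the two functions is sketched below, using a difference-of-Gaussians bandpass and a local-maximum search; the parameter values mirror those quoted in “Measurement of LPM curves”, but the implementation details of bpass()/pkfind() differ from this simplified version.

```python
import numpy as np
from scipy import ndimage

def bandpass(img, noise_sigma=1.0, feature_size=10):
    """Difference-of-Gaussians analogue of bpass(): suppress pixel noise
    and remove the slowly varying background."""
    img = img.astype(float)
    return np.clip(ndimage.gaussian_filter(img, noise_sigma)
                   - ndimage.gaussian_filter(img, feature_size), 0.0, None)

def locate_peaks(img, threshold=3000, size=10):
    """Analogue of pkfind(): local maxima above `threshold`, separated by
    roughly `size` pixels (values mirror the LPM measurement settings)."""
    local_max = ndimage.maximum_filter(img, size=size) == img
    return np.argwhere(local_max & (img > threshold))  # (row, col) of BSs
```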

Measurement of LPM curves

First, wide-field fluorescent images of the FND PUF label are taken with a 6° step in the linear polarization direction of the excitation laser, 50 ms integration time, and around 10 mW laser power. Then, with the peak location function, we identify and locate BSs using a threshold value of 3000 counts and a spot size of 10 pixels. Next, the total signal within the 13 pixels × 13 pixels window matched to the location of each identified BS is calculated as its fluorescent intensity. Finally, the LPM curve is recorded as the relationship between the fluorescent intensity and the linear polarization direction of the excitation laser, and is fitted via the curve fitting function (fit()) in MATLAB based on Eq. 1.

Optical readout and digitization

First, with a 512 × 512 pixel resolution, 50 ms integration time, and around 10 mW laser power, wide-field fluorescent images of the FND PUF label are taken at excitation laser polarization directions of 0, 36, 48, 60, 72, 84, 96, 108, 120, 132, and 144°. Next, by accumulating the total signal of each 16 pixels × 16 pixels block into a new pixel, we reduce the pixel resolution to 32 × 32. Then, contrast values for each pixel are calculated via Eq. 2. Finally, the contrast values are digitized as shown in Table S3. On a Lenovo Xiaoxin Pro16 laptop with an i5-13500H CPU, it takes ~0.16 s to encode a set of readout results.
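The binning, contrast calculation (Eq. 2), and digitization steps can be summarized as follows. This is an illustrative Python sketch: the quantization edges stand in for the actual nine-level scheme of Table S3, which is not reproduced here.

```python
import numpy as np

def bin_image(img512):
    """Sum each 16 x 16 pixel block: 512 x 512 counts -> 32 x 32 counts."""
    return img512.reshape(32, 16, 32, 16).sum(axis=(1, 3))

def digitize_readout(stack):
    """stack: (11, 512, 512) counts at beta = 0, 36, 48, ..., 144 degrees.
    Returns a (10, 32, 32) array of integer contrast levels."""
    binned = np.stack([bin_image(f) for f in stack]).astype(float)
    ref = binned[0]                       # beta = 0 reference frame
    contrast = (binned[1:] - ref) / ref   # Eq. 2, one image per nonzero beta
    # placeholder bin edges; the actual nine levels are defined in Table S3
    edges = np.linspace(-0.4, 0.4, 8)
    return np.digitize(contrast, edges)   # values 0..8, i.e., nine levels
```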

Training of metric learning

As shown in Fig. S13, the network used is a Siamese network, an architecture in which two inputs are processed concurrently by the same CNN to extract their deep features. When provided with two digitized images X and X’, the Siamese network extracts their corresponding features as follows:

$$F={\mbox{CNN}}\left(X\right),\quad F'={\mbox{CNN}}\left(X'\right).$$

Within our metric learning framework, the objective is to maximize the cosine similarity between F and F’ if X and X’ belong to the same PUF label, and to minimize it if they belong to different PUF labels. To achieve this, the CNN must extract the most notable characteristics of a PUF label that differentiate it from other labels. These features are often patch-level, preserve robust structural information, and exhibit greater resilience to noise. To train the network, we calculate the cosine similarity of the feature maps and use the focal loss to update the CNN. The model employs the Adam60 optimizer with an initial learning rate of 0.0001 and is trained with a batch size of 32 for up to 10,000 iterations. Further details regarding the network architecture, cosine similarity calculation, and loss function can be found in Supplementary Notes 8.
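As a schematic of one training iteration (not the exact loss formulation, which is given in Supplementary Notes 8), the following PyTorch sketch pairs the cosine similarity of the Siamese features with a binary focal loss; mapping the cosine value to a (0, 1) match probability via (cos + 1)/2 is our illustrative choice.

```python
import torch
import torch.nn.functional as F

def focal_loss(p, target, gamma=2.0, eps=1e-7):
    """Binary focal loss on a predicted match probability p in (0, 1)."""
    p = p.clamp(eps, 1.0 - eps)
    pt = torch.where(target > 0.5, p, 1.0 - p)
    return (-(1.0 - pt) ** gamma * pt.log()).mean()

def train_step(cnn, optimizer, x, x_prime, target):
    """One Siamese update. x, x_prime: batches of digitized images;
    target[i] = 1 if pair i comes from the same PUF label, else 0."""
    f = cnn(x).flatten(start_dim=1)
    f_prime = cnn(x_prime).flatten(start_dim=1)
    cos = F.cosine_similarity(f, f_prime, dim=1)
    p = (cos + 1.0) / 2.0        # map [-1, 1] to a (0, 1) match probability
    loss = focal_loss(p, target)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()

# e.g. optimizer = torch.optim.Adam(cnn.parameters(), lr=1e-4); batch size 32
```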

Our training and testing datasets are completely distinct. For training, we utilized 240 pairs of readout results under ideal conditions from 240 different PUF labels. For testing, we employed 60 pairs of readout results under ideal conditions from 60 unique PUF labels for standard testing (i.e., results in Fig. S4) and 150 pairs under noisy conditions from 150 distinct PUF labels for noise robustness testing (i.e., results in Fig. 5c, d). Notably, these 450 PUF labels are entirely separate from one another.