Efficient hardware error correction with hybrid on-offline configuration algorithm for optical processor

Zhao, Zichao; Zhu, Huihui; Liang, Qishen; Ma, Haoran; Fu, Ziyi; Jiang, Xingyi; Chen, Bei; Wang, Yuehai; Chen, Tian; Shi, Yuzhi; Yang, Jianyi

doi:10.1038/s42005-025-02247-2

Open access
Published: 09 August 2025

Communications Physics volume 8, Article number: 323 (2025)

Abstract

Photonic neural networks (PNNs) have emerged as a promising platform for high-speed, parallel, and low-latency computing by harnessing the linear propagation of optical signals. However, scaling up PNNs faces significant challenges due to hardware errors caused by fabrication variations and environmental factors. Traditional approaches, such as offline error correction and online training, either rely on complex control systems or suffer from local optima convergence issues, resulting in limited scalability and efficiency. Here, we propose a hybrid on-offline configuration (HOOC) algorithm for programmable optical processors. This innovative approach combines offline initial value presetting with online perturbed optimization iteration algorithm, enabling precise and highly efficient error correction. We benchmark the algorithm’s performance in complex-valued matrix configuration and classification tasks, demonstrating robust error correction capabilities, including high reconstruction fidelity (≥98%), rapid convergence (≤10 iterations), and reduced dependence on detection devices. Furthermore, numerical simulations of high-order coherent processors demonstrate that our HOOC algorithm effectively avoids local optima, a common limitation of the conventional in-situ training method, thus simultaneously improving the scalability and robustness. These results underscore the viability and efficiency of the HOOC algorithm for scalable and robust PNN implementations, paving the way for scalable optical computing in artificial intelligence applications.

Introduction

Driven by the rapid development of deep neural networks to address computation-intensive tasks in various applications^1,2, numerous types of computing hardware have emerged as accelerators for artificial intelligence^{3,4,5,6,7,8,9}. Among these frontier schemes, photonic neural networks (PNNs) have attracted a great deal of attention for their advantages of high speed, high parallelism, and low latency^10,11,12. By leveraging the linear propagation of optical signals in optical devices, various fundamental operations in neural networks can be efficiently performed in the optical domain, including dot-product¹³, matrix multiplication¹⁴, convolution¹⁵, and Fourier transforms¹⁶. This capability enables PNNs to deliver computational power surpassing that of their electrical counterparts. Meanwhile, photonic integrated circuits (PICs), as CMOS-compatible hardware platforms, facilitate compact on-chip integration of optical devices while enabling large-scale, cost-effective manufacturing of photonic chips^17,18. These advancements have also ensured the practical implementation of PNN models¹⁹. To date, several studies have demonstrated large-scale, multilayer PNNs based on programmable on-chip optical processors^14,20,21,22, showcasing their potential in applications such as image recognition^13,23, natural language processing²⁴, signal processing^25,26, and complicated model solving²⁷.

However, scaling up PNNs remains challenging due to the precise encoding requirements of interferometers, which rely on analog operations²⁸. Unlike digital circuits, which exhibit some tolerance to manufacturing errors²⁹, integrated photonic chips, as analog devices, are more susceptible to errors arising from fabrication process variations^30,31 and temperature fluctuation^32,33. Current PNN implementations typically rely on pre-trained neural networks, which require offline simulation of the ideal model on classical computing equipment. However, discrepancies between the ideal model and actual PICs introduce significant hardware errors that accumulate as the on-chip circuits scale up. While several local error correction methods have been developed to improve matrix accuracies^{34,35,36,37,38}, addressing device imbalance errors³⁴, waveguide losses^28,36, and thermal crosstalk^36,37,38, these approaches typically require high-precision multiple sampling and complex control systems, presenting substantial experimental challenges^35,36,39.

To overcome the limitations of offline error correction methods, significant efforts have been made to develop online weight configuration approaches for PNNs⁴⁰. One promising solution is in-situ backpropagation⁴¹, which dynamically adjusts weights in real time by transmitting bidirectional optical signals through the mesh on the chip. However, methods that rely solely on in-situ training require real-time monitoring of the signal strength at each node and a complex peripheral electrical system⁴². Alternative online training methods, namely physics-agnostic algorithms such as the genetic algorithm⁴³, particle swarm optimization⁴⁴, and the in-situ perturbed optimization iteration (POI) method^36,45, update on-chip weights without relying on gradients or prior knowledge of the chip. While these methods eliminate the need for gradient-based optimization, they are prone to local optima and face challenges in converging to global optima⁴⁶. This defect arises from random initialization strategies that may position the initial parameters within the basin of attraction of suboptimal solutions, increasing the possibility of convergence to local minima. While scaling the population size can partially mitigate this issue, it introduces significant computational inefficiency due to the repeated iterations. Pruning strategies and enhanced initialization techniques can also partially alleviate the risk of local optima^24,47. However, iterative optimization procedures starting from random initial configurations still demand significant computational overhead and prolonged training durations to achieve convergence towards application-specific target matrices³⁶. These problems grow exponentially in high-dimensional spaces⁴⁸ and highlight the critical need for an accurate and efficient training method for PNNs.

Here, we demonstrate a hybrid on-offline configuration (HOOC) algorithm for programmable optical processors. By combining offline initial value presetting with an online POI algorithm, this approach achieves precise and highly efficient hardware error correction for programmable PNNs. Specifically, offline presetting requires only approximate initial values, eliminating the need for high-precision measurement equipment typically used in traditional offline correction methods. The online calibration process, based on inverse design and Frobenius norm normalization, effectively addresses various hardware errors such as insertion loss, imbalance, and crosstalk. Together, the hybrid configuration enhances device characterization efficiency and avoids the local extrema issues commonly associated with traditional physics-agnostic algorithms. The algorithm is also applicable to complex-valued matrix configurations and can be extended to other symmetric topological structures.

To experimentally validate the proposed method, we design and fabricate two photonic chips with Clements^49,50 and FFT meshes^51,52. We demonstrate several proof-of-principle tasks, including complex-valued matrix configuration and the classification of the Iris and Vowel datasets. The HOOC algorithm exhibits exceptional performance across all tasks. For complex-valued matrix configuration, the algorithm achieves a fidelity greater than 0.98 for 4th-order matrices with fewer than 10 optimization iterations. In contrast to traditional online error correction methods, which typically reach a steady state by around 100 round trips and are prone to becoming trapped in local optima, our algorithm demonstrates a significantly faster convergence rate and achieves a superior convergence value. Given that matrix configuration is a core operation in neural networks, this algorithm can be directly applied to PNNs for classification tasks. For the Iris dataset, the measured accuracies on the Clements and FFT-based meshes converged to 91.18 ± 0.98% and 72.55 ± 1.96%, respectively, demonstrating excellent agreement with the numerical testing accuracies. Similarly, for the Vowel dataset, the accuracies on the two meshes converged to 83.21 ± 1.96% and 72.98 ± 3.54%, respectively. These results demonstrate the viability and effectiveness of the HOOC algorithm.

Results

Concept and principle

In this work, the HOOC algorithm is employed to achieve high-fidelity optical circuit performance on imperfect hardware platforms. The schematic of the algorithm, depicted in Fig. 1, consists of two primary components: offline preset and online calibration. As shown in Fig. 1a, the offline preset is a forward design process used for circuit initialization, in which the core operation is the mapping between neural network and physical hardware. In this phase, the connections between layers, represented by the weight matrix W in a typical electronic neural network, are decomposed using singular value decomposition (SVD) as W = UΣV^†, where U and V are unitary matrices, and Σ is a diagonal matrix. This decomposition is then encoded into programmable optical architectures, such as the Clements mesh^49,50 and FFT mesh^51,52, which are composed of Mach-Zehnder interferometer (MZI) units (see Supplementary Note 1 for detailed matrix decomposition methods). Subsequently, the phase shifters in MZIs are calibrated by adjusting the voltage supplied to thermo-optic heaters, and the theoretical parameters derived from matrix decomposition (e.g., θ_i) are encoded into the optical processor upon power activation, while the output signals are collected by monitors and transmitted to the computer. Notably, the calibration curve in the offline preset process does not require high precision; instead, sampling a small set of voltage values is sufficient to complete the calibration. This approach significantly reduces the calibration time, which was previously a time-consuming task.
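The MZI unit that the decomposition is encoded into can be illustrated with a short numerical sketch. The transfer matrix below follows one common convention for an ideal MZI (two 50:50 beam splitters, an internal phase θ and an external phase φ); the exact convention used on the fabricated chips is given in Supplementary Note 1 and may differ, so the function `mzi` and its parameter names are assumptions made for illustration only.

```python
import cmath
import math

def mzi(theta, phi):
    """2x2 transfer matrix of an ideal MZI (assumed convention, not the chip's).

    theta: internal phase shift, sets the power splitting ratio
    phi:   external phase shift, sets the relative phase of the first input
    """
    g = 1j * cmath.exp(1j * theta / 2)            # global phase factor
    s, c = math.sin(theta / 2), math.cos(theta / 2)
    e = cmath.exp(1j * phi)
    return [[g * e * s, g * c],
            [g * e * c, -g * s]]

def matmul(a, b):
    """Plain matrix product for nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def dagger(a):
    """Conjugate transpose."""
    return [[a[j][i].conjugate() for j in range(len(a))]
            for i in range(len(a[0]))]

def is_unitary(u, tol=1e-12):
    """Check that U^dagger U equals the identity within a tolerance."""
    p = matmul(dagger(u), u)
    n = len(u)
    return all(abs(p[i][j] - (1 if i == j else 0)) < tol
               for i in range(n) for j in range(n))

bar = mzi(math.pi, 0.0)    # theta = pi: each input stays in its own waveguide
cross = mzi(0.0, 0.0)      # theta = 0: each input is routed to the other waveguide
```

In this idealized model every setting of (θ, φ) gives a unitary matrix; the hardware errors discussed in this work are precisely the deviations from this picture.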

**Fig. 1: Architecture of the HOOC algorithm.**

After completing the offline preset process, the initial values are encoded into the optical processor, and the monitored outputs are subsequently used for online calibration. Figure 1b illustrates the forward propagation process in the coherent optical processor and its associated electrical equipment. Previous works have provided a theoretical foundation to mitigate hardware errors caused by beam splitter imbalances by adjusting the phase shift values³⁴. Building on this insight, we develop an online calibration algorithm designed to effectively address various hardware errors in optical processors, including device insertion loss, beam splitter imbalances, and thermal crosstalk. Figure 1c conceptually shows the online calibration process of our HOOC algorithm. To efficiently optimize the configuration of the on-chip MZI mesh, we employ the POI algorithm, a type of inverse design algorithm, to compensate for discrepancies between the practical matrices and the target matrices. During each iteration, the phase shift value θ_i is perturbed to compute the forward difference $\nabla \mathcal{L}(\theta_{i})$, which represents the loss gradient with respect to the weight matrix (see Supplementary Note 2 for further details on the updating process). To ensure that the calibration process can cover errors caused by device loss, the loss function $\mathcal{L}$ of the perturbation optimization algorithm is defined as

$$\mathcal{L}=\frac{1}{n^{2}}\sum_{i=1}^{n}\sum_{j=1}^{n}\left|\hat{\mathbf{A}}_{\mathrm{MZI},ij}-\hat{\mathbf{A}}_{\mathrm{target},ij}\right|^{2}, \tag{1}$$

$$\hat{\mathbf{A}}=\frac{\mathbf{A}}{\|\mathbf{A}\|_{\mathrm{F}}},\quad \|\mathbf{A}\|_{\mathrm{F}}=\sqrt{\sum_{i=1}^{n}\sum_{j=1}^{n}|\mathbf{A}_{ij}|^{2}}, \tag{2}$$

where $\hat{\mathbf{A}}$ is the matrix obtained by normalizing A using the Frobenius norm, A_target is the target matrix, and A_MZI is the matrix implemented by the MZI mesh. During each round of online calibration of the Nth-order MZI mesh, each row of the Nth-order identity matrix is encoded as the input signal, and the corresponding output vector is collected to reconstruct the Nth-order matrix realized by the MZI mesh. This approach allows the online calibration algorithm to accurately estimate the forward deviation of the MZI mesh while accounting for the device loss, thereby providing an accurate gradient direction for perturbation optimization. Since the online calibration process begins with a relatively rough initial value obtained from the offline presets, online calibration can quickly approach the target matrix within a few iterations. Additionally, by adopting Frobenius norm normalization, the global proportional relationships between matrix elements are preserved, preventing any single element from dominating the whole structure. By utilizing global matrix-level information rather than relying solely on individual device calibration, the robustness of the system is significantly enhanced. As a result, our HOOC algorithm does not over-rely on highly accurate control systems. It is worth noting that the normalization and loss function calculation methods described above are equally applicable to complex-valued matrices, as $\hat{\mathbf{A}}$ in Equation (1) may assume complex values.
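The normalization, the loss function, and the identity-probing reconstruction described above can be sketched in a few lines. This is a minimal illustration of Equations (1) and (2), not the authors' code: the `chip` callable stands in for the hardware and assumes that complex output amplitudes are available (as provided by the balanced detection units), and the matrices are made-up examples.

```python
import math

def frobenius_norm(a):
    """Frobenius norm of a (possibly complex) matrix, Eq. (2)."""
    return math.sqrt(sum(abs(x) ** 2 for row in a for x in row))

def normalize(a):
    """Divide every element by the Frobenius norm, Eq. (2)."""
    f = frobenius_norm(a)
    return [[x / f for x in row] for row in a]

def loss(a_mzi, a_target):
    """Mean squared element-wise distance between normalized matrices, Eq. (1)."""
    n = len(a_target)
    am, at = normalize(a_mzi), normalize(a_target)
    return sum(abs(am[i][j] - at[i][j]) ** 2
               for i in range(n) for j in range(n)) / n ** 2

def reconstruct(chip, n):
    """Probe with each one-hot input; the k-th output vector is column k."""
    cols = [chip([1.0 if i == k else 0.0 for i in range(n)]) for k in range(n)]
    return [[cols[k][i] for k in range(n)] for i in range(n)]

# Hypothetical target, and a hypothetical chip that realizes it with a uniform
# amplitude attenuation of 0.2 (standing in for insertion loss).
target = [[3, 1j], [1j, 3]]
hidden = [[0.6, 0.2j], [0.2j, 0.6]]

def chip(x):
    """Stand-in for the hardware: returns the output amplitudes for input x."""
    return [sum(hidden[i][j] * x[j] for j in range(len(x)))
            for i in range(len(hidden))]

measured = reconstruct(chip, 2)
uniform_loss_penalty = loss(measured, target)   # essentially zero
```

Because both matrices are normalized before comparison, a uniform attenuation does not contribute to the loss; only changes in the relative structure of the matrix elements do.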

Experimental setup and validation

Figure 2a illustrates the schematic of the testing system used to validate the HOOC algorithm experimentally. The testing system comprises several key components: a tunable continuous-wave light source, an integrated optical processor chip, a photon detector, and a near-field projecting system. The process begins with continuous-wave laser light at a wavelength of 1550 nm, which is directed through fiber arrays into a polarization controller that adjusts the input light mode to ensure alignment with the transmission mode, and then enters the integrated optical processor chip. The resulting output photons are then collected by a fiber array and transmitted to the photon detector. To enhance the efficiency of rapid phase shifter characterization during the offline preset stage, a near-field projecting system is added. Light from the designed monitoring ports on the chip passes through a Fourier lens system and is captured by a shortwave infrared (SWIR) camera for calibration. During this process, detected photons are converted into photocurrents and collected by a data acquisition (DAQ) board. The programmable phase shifters are controlled by a classical computer, which also manages the updating process of the POI algorithm. The chip is packaged optically and electrically, with electrodes mounted to a printed circuit board via wire bonding, as demonstrated in Fig. 2b.

**Fig. 2: Schematic of the experimental setup and photonic chips.**

We fabricate two types of programmable photonic chips with programmable MZI meshes to experimentally validate the HOOC algorithm. The micrographs of the fabricated chips, based on the FFT-based scheme⁵¹ and the Clements scheme⁴⁹, are shown in Fig. 2c, d, respectively. The FFT-based scheme integrates a fast Fourier transform in the optical domain using butterfly-style MZI meshes with a single column of MZIs, enabling the realization of a one-layer neural network with a 4th-order circulant matrix connection. On the other hand, the Clements-based scheme incorporates two 4th-order unitary matrices and one diagonal matrix, enabling the multiplication of arbitrary unitary matrices. For the Clements structure, 8 MZI units are added into both the top and bottom optical paths to compensate for inter-path insertion loss variations while simultaneously providing additional monitor ports. For the FFT-based scheme, 1% tap ports are symmetrically introduced into all propagating paths, serving as power monitors. Additionally, on-chip balanced detection units are included to extract the real and imaginary parts of the output signals, which are essential for demonstrating the HOOC algorithm in the complex-valued domain. A detailed description of the on-chip balanced detection unit is provided in Supplementary Note 2.

To evaluate the performance of our HOOC algorithm and validate its design for hardware error correction, we first investigate the reconfigurability of the arbitrary transformation by generating and implementing random complex-valued matrices within two MZI-based meshes. The process begins with the calibration of phase shifters in the MZIs during the offline preset stage, accomplished through near-field detection. Figure 3a displays the near-field image captured by the SWIR camera, with the spot areas corresponding to the monitoring grating couplers highlighted by red dotted boxes. The image within the red dotted box is further converted into a normalized intensity distribution for phase shifter calibration, as illustrated in Fig. 3b. The calibration results using near-field detection are represented by gray dots in Fig. 3c. For comparison, the calibration curve of a complete 2π modulation cycle using the traditional output light detection method is plotted as a blue line in Fig. 3c. It is evident that the data obtained from the SWIR camera closely matches the transmission values from the traditional method but requires significantly less calibration time and fewer sampling points, thereby enhancing the efficiency of the offline presetting process. In addition, although the calibration of all heaters is necessary to account for fabrication-induced phase variations, the availability of multiple monitoring ports enables parallel calibration, preserving overall efficiency. The resulting calibration curves are stored and subsequently used for arbitrary matrix encoding. Further details on the offline calibration procedure are provided in Supplementary Note 3.

**Fig. 3: Experimental results of the matrix configuration with the HOOC algorithm.**

Next, we validate the HOOC algorithm by applying it to reconstruct weight matrices upon programmable optical processors. The target matrices used for validation are numerically generated random normed matrices with 4 × 4 modes, constructed without theoretical approximations. These matrices serve as ideal benchmarks to assess the reconstruction accuracy on Clements-based and FFT-based MZI meshes, as shown in Fig. 3d, f. In these figures, color bars represent experimental results, while opaque bars with solid-dashed borders denote theoretical results. Significant errors between the uncorrected matrix and the target matrix are observed in both the real and imaginary parts. However, after applying online calibration, the corrected matrix shows a substantial reduction in errors, converging closely to the target matrix in both its real and imaginary components, demonstrating the effectiveness of our correction method. In Supplementary Note 4, we provide a detailed procedure of the online calibration based on the POI algorithm.
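To make the idea of starting an online perturbative calibration from an offline preset concrete, the following toy sketch calibrates a single simulated MZI. It is a generic forward-difference gradient descent, not the POI update rule of Supplementary Notes 2 and 4: the MZI convention, the hidden phase offsets, the uniform loss, the perturbation size `delta`, and the step size `lr` are all hypothetical choices made for illustration.

```python
import cmath
import math

def mzi(theta, phi):
    """Ideal 2x2 MZI transfer matrix (assumed convention)."""
    g = 1j * cmath.exp(1j * theta / 2)
    s, c = math.sin(theta / 2), math.cos(theta / 2)
    e = cmath.exp(1j * phi)
    return [[g * e * s, g * c], [g * e * c, -g * s]]

def normalize(a):
    """Frobenius normalization, Eq. (2)."""
    f = math.sqrt(sum(abs(x) ** 2 for row in a for x in row))
    return [[x / f for x in row] for row in a]

def loss(a_mzi, a_target):
    """Loss of Eq. (1) on Frobenius-normalized matrices."""
    n = len(a_target)
    am, at = normalize(a_mzi), normalize(a_target)
    return sum(abs(am[i][j] - at[i][j]) ** 2
               for i in range(n) for j in range(n)) / n ** 2

HIDDEN_OFFSET = (0.3, -0.2)   # hypothetical phase errors unknown to the controller
HIDDEN_LOSS = 0.8             # hypothetical uniform amplitude transmission

def measure_chip(params):
    """Stand-in for the hardware: applies the hidden phase errors and loss."""
    theta, phi = params
    u = mzi(theta + HIDDEN_OFFSET[0], phi + HIDDEN_OFFSET[1])
    return [[HIDDEN_LOSS * x for x in row] for row in u]

def calibrate(target, preset, iters=40, delta=1e-3, lr=2.0):
    """Perturb each phase, estimate the slope of the loss, step downhill."""
    params = list(preset)
    history = [loss(measure_chip(params), target)]
    for _ in range(iters):
        base = loss(measure_chip(params), target)
        grad = []
        for i in range(len(params)):
            bumped = list(params)
            bumped[i] += delta                      # perturb one phase shifter
            grad.append((loss(measure_chip(bumped), target) - base) / delta)
        params = [p - lr * g for p, g in zip(params, grad)]
        history.append(loss(measure_chip(params), target))
    return params, history

target_phases = (1.1, 0.4)                 # phases from the ideal decomposition
target = mzi(*target_phases)
params, history = calibrate(target, preset=target_phases)   # offline preset as start
```

In this toy model the applied phases drift towards the values that cancel the hidden offsets, and the loss history decreases from its uncorrected value. The numbers of iterations and the convergence behaviour here are properties of the toy problem and should not be compared with the experimental results.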

As a result of the matrix deployment, the fidelity $F=|\mathrm{Tr}(\hat{\mathbf{A}}_{\mathrm{MZI}}^{\dagger}\hat{\mathbf{A}}_{\mathrm{target}})|^{2}$ is calculated, and its distribution, along with the corresponding loss function, is plotted as a function of iteration rounds in Fig. 3e, g for the Clements and FFT-based MZI meshes, respectively. The matrix fidelity based solely on the offline preset, F_unco, is 0.79 and 0.71, respectively. After applying the online calibration method, these values improve to 0.98 ± 0.06 and 0.98 ± 0.08 within just 10 iterations. Simultaneously, the matrix losses decrease from 5.26 and 3.51 to 0.70 ± 0.18 and 0.69 ± 0.30 after the same number of iterations, respectively. This improvement stems from correcting static component errors, including beam splitter imbalance ratios and inter-path insertion loss variations. In addition, local thermal crosstalk causes functional discrepancies between the on-chip device performance during actual operation and the measured results obtained during the offline presetting stage. Such a performance-limiting factor in our thermally programmable matrix processor can also be compensated by employing the online calibration. In our experiments, the fabricated chips for the Clements and FFT-based schemes contain 48 and 28 on-chip phase shifters, respectively. Given the multi-channel voltage source with an update rate of 100 Sa s⁻¹ in our experimental system, the online calibration process requires ~1 s per iteration for the Clements-based chip and ~0.6 s for the FFT-based chip. The 50 iterations in Fig. 3e, g take ~50 s and ~30 s, respectively, with each achieving a fidelity exceeding 0.97 within 10 iterations. These results demonstrate that the HOOC algorithm can reconstruct the matrix with high accuracy and efficiency, and that it can further be applied to more complicated classification tasks.
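The fidelity defined above can be evaluated directly from a measured and a target matrix. The sketch below is a plain implementation of that formula using Frobenius-normalized matrices; the example matrices are made up for illustration and are not experimental data.

```python
import cmath
import math

def normalize(a):
    """Frobenius normalization, Eq. (2)."""
    f = math.sqrt(sum(abs(x) ** 2 for row in a for x in row))
    return [[x / f for x in row] for row in a]

def fidelity(a_mzi, a_target):
    """F = |Tr(A_mzi^dagger A_target)|^2 for Frobenius-normalized matrices."""
    am, at = normalize(a_mzi), normalize(a_target)
    n = len(at)
    # Tr(A^dagger B) equals the sum over all elements of conj(A_ij) * B_ij.
    trace = sum(complex(am[i][j]).conjugate() * at[i][j]
                for i in range(n) for j in range(n))
    return abs(trace) ** 2

target = [[1, 1j], [1j, 1]]

# Same matrix with a uniform attenuation and a global phase: fidelity stays at 1.
factor = 0.5 * cmath.exp(0.3j)
attenuated = [[factor * x for x in row] for row in target]

# A matrix orthogonal to the target under the trace inner product: fidelity 0.
orthogonal = [[1, -1j], [-1j, 1]]

f_same = fidelity(attenuated, target)
f_orth = fidelity(orthogonal, target)
```

One property worth noting is that this fidelity is insensitive to a global phase and a uniform attenuation, whereas the loss in Equation (1) is insensitive only to the attenuation.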

Classification tasks

In this section, the HOOC algorithm is further applied to optical neural networks, demonstrating its capability to enhance classification performance across various tasks, such as the Iris and Vowel datasets, as illustrated in Fig. 4a. For the Iris dataset, which comprises four features (sepal and petal lengths and widths) and three categories (setosa, versicolor, and virginica), the entire dataset of 150 samples is split into a training set of 99 samples and a test set of 51 samples. A one-layer neural network with a weight matrix W_4×3 (Fig. 4b) is trained on both Clements-based and FFT-based MZI meshes. The numerical testing accuracies for the two MZI meshes are 92.16% and 74.51%, respectively. During the offline preset stage, the testing samples are experimentally evaluated on both hardware chips with classification accuracies below 75%. Subsequently, the online calibration of the HOOC algorithm is employed to correct hardware errors, updating the real and imaginary parts of the trained weight matrix through iterative optimization on the two MZI meshes, as depicted in Figs. 4d, e, g, h. After error correction, the classification accuracies improve to 91.18 ± 0.98% and 72.55 ± 1.96%, respectively, demonstrating convergence towards the values obtained in the numerical simulation.

**Fig. 4: Schematic and experimental results of the HOOC algorithm in the optical neural network.**

A more complicated task, the Vowel task, is further used to validate the HOOC algorithm’s ability to mitigate hardware errors and improve classification performance. A two-layer PNN model integrated with the HOOC algorithm is employed for this task. As shown in Fig. 4c, the PNN model consists of a hidden layer W_8×8 and an output layer W_8×4. During matrix encoding, the hidden layer weight matrix is partitioned into four 4 × 4 blocks. The Vowel dataset includes 11 categories, each characterized by 10 features. To manage this complexity, a subset of 4 categories (“hid”, “hId”, “hEd”, and “hAd”) and the first eight features are selected, with the dataset divided into a training set of 196 samples and a test set of 144 samples. The training process of the HOOC algorithm on the two chips is illustrated in Figs. 4f, i, respectively. The hidden layer W_8×8 is implemented in the optical domain with our HOOC algorithm, while the output layer W_8×4 is computed on the electronic computer. The fidelity of the trained weight matrices, shown in Fig. 4j and summarized in Table 1, demonstrates significant improvement after applying the HOOC algorithm. Upon completion of the HOOC process, the experimental testing accuracy increases from 52.08% to 83.21 ± 1.96% for the Clements-based MZI mesh and from 53.47% to 72.98 ± 3.54% for the FFT-based MZI mesh.
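The partition of the 8 × 8 hidden-layer matrix into four 4 × 4 blocks can be illustrated as a blocked matrix-vector product. The sketch below only shows the arithmetic of the partition; how the blocks are scheduled on the 4th-order chips and how the partial results are summed is not specified here, so the sequential loop and the helper names are assumptions.

```python
import random

def matvec(m, v):
    """Matrix-vector product for nested lists."""
    return [sum(m[i][j] * v[j] for j in range(len(v))) for i in range(len(m))]

def split_blocks(w, b=4):
    """Partition a matrix into a grid of b x b blocks (row-major)."""
    return [[[row[c:c + b] for row in w[r:r + b]]
             for c in range(0, len(w[0]), b)]
            for r in range(0, len(w), b)]

def blocked_matvec(w, x, b=4):
    """Compute W x using only b x b matrix-vector products."""
    y = []
    for block_row in split_blocks(w, b):
        acc = [0j] * b
        for k, block in enumerate(block_row):
            # One pass through a 4x4 processor with the matching input slice.
            part = matvec(block, x[k * b:(k + 1) * b])
            acc = [a + p for a, p in zip(acc, part)]
        y.extend(acc)
    return y

rng = random.Random(0)
w = [[complex(rng.uniform(-1, 1), rng.uniform(-1, 1)) for _ in range(8)]
     for _ in range(8)]
x = [complex(rng.uniform(-1, 1), rng.uniform(-1, 1)) for _ in range(8)]

y_blocked = blocked_matvec(w, x)
y_direct = matvec(w, x)
```

The blocked result agrees with the direct 8 × 8 product up to floating-point rounding, which is why a 4th-order processor suffices for the hidden layer at the cost of four passes.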

Table 1 Summary of the measured fidelity

Performances under high-order coherent processor

As illustrated in the above sections, our HOOC algorithm has been experimentally validated to effectively correct hardware errors in both 4th-order Clements-based and FFT-based MZI meshes. To investigate the scalability of the HOOC algorithm further, this section presents numerical simulations of the two MZI meshes under fabrication imperfections. The primary sources of hardware errors in an MZI mesh include: (1) Insertion loss (α_BS) and imbalance (β_MMI) of the beam splitter; (2) Insertion loss (α_Cross) and crosstalk (β_Cross) of the cross coupler; (3) Insertion loss of the waveguide (α_WG); (4) Metal absorption loss (α_metal) and (5) environmental fluctuations. Among them, the metal absorption loss (α_metal) for each phase shifter is set to 0.06 ± 0.1 dB, according to the simulation results based on the geometric dimensions of the titanium nitride (TiN) heaters (300 μm × 3 μm) in our fabricated chip⁵³. Other losses are estimated from experimental measurements and summarized in Table 2. In Supplementary Note 5, we provide a detailed description of the characterizations for the on-chip devices. In addition, to account for environmental fluctuations in practical conditions, we introduce a 5% random noise to the phase shift values.
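For readers who wish to set up a comparable simulation, the two basic ingredients are a conversion from insertion loss in dB to a field amplitude factor and a noise model for the phases. The sketch below is a minimal illustration: the dB-to-amplitude conversion is standard, while interpreting the 5% noise as a uniform relative perturbation of each phase is an assumption, since the distribution is not specified in this section.

```python
import random

def db_to_amplitude(loss_db):
    """Field amplitude transmission for a power loss given in dB."""
    return 10 ** (-loss_db / 20)

def db_to_power(loss_db):
    """Power transmission for a power loss given in dB."""
    return 10 ** (-loss_db / 10)

def noisy_phase(theta, rel_noise=0.05, rng=random):
    """Phase with a relative perturbation (assumed uniform in +/- rel_noise)."""
    return theta * (1 + rng.uniform(-rel_noise, rel_noise))

# Mean metal absorption loss per phase shifter quoted in the text: 0.06 dB.
t_metal = db_to_amplitude(0.06)      # about 0.9931 in field amplitude

rng = random.Random(1)
phases = [noisy_phase(1.0, rng=rng) for _ in range(1000)]
```

Per-device losses from Table 2 would enter a mesh simulation as such amplitude factors multiplying the corresponding matrix elements.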

Table 2 Summary of hardware parameters in the simulation

Using these parameters, we conduct three sets of simulations for circuit sizes N = 4, 8, 16 on both Clements-based and FFT-based MZI meshes. Each set involves 200 randomly generated complex-valued matrices for hardware configuration. The simulation results are evaluated based on matrix fidelity, as shown in Figs. 5a, b. For N = 4, both MZI meshes achieve a matrix fidelity exceeding 0.97 with the HOOC algorithm. While the fidelity of both pre-correction and post-calibration matrices decreases for larger N, a significant improvement in fidelity is consistently observed after calibration, underscoring the scalability and feasibility of our approach. Notably, our simulations assume a minimum detectable optical power of −30 dBm (corresponding to the 7-bit resolution of our control system), demonstrating that the scheme maintains robust performance without requiring ultra-high-precision external electronics. Furthermore, a comparative analysis between our HOOC algorithm and traditional online error correction methods^43,45, detailed in Supplementary Note 6, demonstrates that the HOOC algorithm achieves both a faster convergence rate and a higher convergence value. Meanwhile, our HOOC algorithm is robust against trapping in local optima, which ensures its scalability for large-scale circuits containing a large number of physical parameters. In Supplementary Note 6, we also provide a detailed simulation analysis of the performance of our HOOC algorithm under varying levels of random initial phase noise.

**Fig. 5: Simulated fidelity for different orders of MZI meshes with hardware errors.**

Conclusion

In this work, we propose and demonstrate an HOOC algorithm for the efficient configuration of programmable optical processors. By combining the advantages of offline presetting and online calibration, this approach integrates rapid offline characterization with the POI algorithm. The offline presetting stage provides a time-efficient method for initializing optical processor values without requiring the high-precision device characterization used in previous works. The online calibration stage employs the POI algorithm to achieve global matrix optimization. Utilizing Frobenius normalization, our calibration algorithm effectively mitigates errors caused by various hardware imperfections, including insertion loss, imbalance, and crosstalk. This addresses a key limitation of traditional calibration algorithms, which are typically restricted to handling a single type of error. Furthermore, by starting with offline-preset initial values, our online calibration avoids the local extrema issues commonly encountered in traditional physics-agnostic algorithms.

We validate the HOOC algorithm through extensive experiments and simulations. Two photonic chips, based on Clements and FFT-based MZI meshes, are fabricated and integrated into a closed-loop testing system. Through a series of experiments, we demonstrate the algorithm’s outstanding performance in diverse scenarios, including complex-valued matrix configuration and the implementation of PNNs for classification tasks such as Iris and Vowel datasets. Specifically, the algorithm achieves high fidelity (≥98%) in matrix reconstruction with remarkably few iterations (≤10 iterations), while experimental accuracies for classification tasks closely match numerical testing results. Simulation experiments further validate the algorithm’s applicability to large-scale on-chip optical processors, highlighting its reliability and scalability.

Although we demonstrate the HOOC algorithm on Clements-based and FFT-based MZI meshes, its applicability extends to other on-chip symmetric topological structures. For asymmetric designs, such as the Reck⁵⁴ and Diamond⁵⁵ configurations, the algorithm can be adapted by compensating for propagation path loss to mitigate optical power attenuation. Moreover, the offline pre-configuration and online calibration strategies of our algorithm are highly versatile in programmable photonics. By modifying the decomposition strategy and loss function variables, our method can be extended to various applications, including optical interconnects⁵⁶, microwave photonics⁵⁷, and topological beamforming⁵⁸, among others. This adaptability highlights the potential of our algorithm to enhance functionality in systems supporting hundreds of modes, offering a promising pathway for scaling up programmable photonics and enabling transformative applications across diverse industries.

Methods

Fabrication and packaging

The 4th-order Clements-based and FFT-based MZI meshes are fabricated on the silicon-on-insulator (SOI) platform with a 220-nm-thick silicon top layer and a 2-μm-thick buried oxide. A thin layer of titanium nitride (TiN) is then deposited to serve as the resistive material for the heaters, followed by the patterning of a thin aluminum film to form electrical connections to the electrodes and heaters. To minimize power consumption, isolation trenches are etched around the TiN phase shifters through the SiO₂ top cladding and Si substrate. The entire fabrication process is carried out using CMOS-compatible techniques. A detailed analysis of the average power consumption per MZI unit and the loss characteristics of each unit is provided in Supplementary Note 5. For optical packaging, UV-curable glue is employed to attach the fiber array to the chip, with index-matched oil applied to enhance coupling efficiency. The measured coupling loss is ~3.5 dB per facet. For electrical packaging, a two-layer wire-bonding technique is utilized to connect the electrical pads on the chip to the corresponding pads on the PCB.

Experimental setup

The light source is a Santec-570 tunable laser, providing C-band input light. Next, an erbium-doped fiber amplifier (EDFA) is connected to the laser, amplifying the optical power up to 20 dBm. A polarization controller is employed to optimize the coupling efficiency of the light source to the photonic chip. Input and output signals are coupled using a 12-channel fiber array with a uniform spacing of 127 μm. The encoding of input signals and matrix configurations is achieved using an NI-PCI-6704 multi-channel voltage source, controlled via MATLAB, which updates voltages at a rate of ~100 Sa s⁻¹. Light emitted from the monitor ports of the photonic chip is captured using a YM-SC640 shortwave infrared camera. To capture the output optical signals below 14 dBm, eight photodetectors (LSIPD-LD50, 0.9 mA mW⁻¹) convert the optical output signals into electrical signals, which are then transformed into sampling voltages using 62-kΩ resistors. A data acquisition board (DAQ, NI-USB-6210, 250 kS s⁻¹) is utilized to collect the sampling voltages for further processing.
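As a rough worked example of the detection chain, the sampling voltage for a given optical power follows from the quoted responsivity (0.9 mA mW⁻¹) and sampling resistor (62 kΩ). The sketch below assumes an ideal linear photodetector and ignores dark current, saturation, and the input impedance of the DAQ board; the function names are illustrative.

```python
def dbm_to_mw(power_dbm):
    """Convert optical power from dBm to mW."""
    return 10 ** (power_dbm / 10)

def sampling_voltage(power_dbm, responsivity_ma_per_mw=0.9, resistor_kohm=62.0):
    """Voltage across the sampling resistor for an ideal linear photodetector."""
    current_ma = dbm_to_mw(power_dbm) * responsivity_ma_per_mw
    return current_ma * resistor_kohm          # mA * kOhm = V

# At the -30 dBm minimum detectable power assumed in the simulations:
v_min = sampling_voltage(-30.0)                # about 0.0558 V
```

Under these assumptions, an optical power of −30 dBm (1 μW) corresponds to a photocurrent of 0.9 μA and a sampling voltage of roughly 56 mV.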

Data availability

The data that support the findings of this study are available from the corresponding author upon request.

Code availability

The code that supports the findings of this study is available from the corresponding author upon request.

Acknowledgements

This work was supported by the National Key Research and Development Program of China (2024YFE0211800); National Natural Science Foundation of China (U22A2082); Ningbo Science and Technology Program (2023Z073); “Leading Goose” R&D Program of Zhejiang Province (2024C01112); “Vanguard” R&D Program of Zhejiang Province (2025C01043); Jinhua Science and Technology Program (2024-1-021); the National Natural Science Foundation of China (No. 62205246 and 62475192); the National Key Research and Development Program of China (No. 2023YFF0613600); the Fundamental Research Funds for the Central Universities; and Shanghai Pilot Program for Basic Research.

Author information

Authors and Affiliations

College of Information Science and Electronic Engineering, Zhejiang University, Hangzhou, Zhejiang, China
Zichao Zhao, Huihui Zhu, Qishen Liang, Haoran Ma, Ziyi Fu, Xingyi Jiang, Yuehai Wang & Jianyi Yang
Jinhua Institute of Zhejiang University, Jinhua, 321000, China
Zichao Zhao & Jianyi Yang
ZJU-Hangzhou Global Scientific and Technological Innovation Center, Zhejiang University, Hangzhou, Zhejiang, China
Huihui Zhu
Key Laboratory of Optoelectronic Materials and Devices, Institute of Semiconductors, Chinese Academy of Sciences, Beijing, China
Bei Chen
The College of Electronic and Optical Engineering, Nanjing University of Posts and Telecommunications, Nanjing, 210023, China
Bei Chen
Key Laboratory of Advanced Optoelectronic Quantum Architecture and Measurements of Ministry of Education, School of Physics, Beijing Institute of Technology, Beijing, China
Tian Chen
Institute of Precision Optical Engineering, School of Physics Science and Engineering, Tongji University, Shanghai, China
Yuzhi Shi

Contributions

Z.C.Z., H.H.Z., Q.S.L., H.R.M., Y.H.W., and J.Y.Y. jointly conceived the idea. Z.C.Z., Q.S.L., Z.Y.F., and B.C. performed the numerical simulations and theoretical analysis. Z.C.Z., Q.S.L., H.R.M., Z.Y.F., and X.Y.J. did the experiments. Z.C.Z., H.H.Z., Q.S.L., H.R.M., and A.Q.L. were involved in the discussion and data analysis. Z.C.Z., Q.S.L., H.R.M., and Z.Y.F. prepared the manuscript. H.H.Z., Y.H.W., T.C., Y.Z.S., and J.Y.Y. supervised and coordinated all the work. All authors commented on the manuscript.

Corresponding authors

Correspondence to Huihui Zhu, Yuzhi Shi or Jianyi Yang.

Ethics declarations

Competing interests

The authors declare no competing interests.

Peer review

Peer review information

Communications Physics thanks Silvia Ortin and the other, anonymous, reviewer(s) for their contribution to the peer review of this work. [A peer review file is available.]

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary information

Transparent Peer Review file (PDF)

Supplemental Material (PDF)

LaTeX source file for the main text (ZIP)

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

About this article

Cite this article

Zhao, Z., Zhu, H., Liang, Q. et al. Efficient hardware error correction with hybrid on-offline configuration algorithm for optical processor. Commun Phys 8, 323 (2025). https://doi.org/10.1038/s42005-025-02247-2

Received: 17 February 2025
Accepted: 29 July 2025
Published: 09 August 2025
Version of record: 09 August 2025
DOI: https://doi.org/10.1038/s42005-025-02247-2