Introduction

The rapid evolution of electronic technology is approaching a critical turning point, marked by two dominant trends. On the one hand, progress on the very large-scale integration (VLSI) front has been slowing down, deviating from Moore’s law predictions1 due to the fundamental physical limits of FET technology, including thermal and quantum effects2. On the other hand, the rise of cutting-edge applications, such as data-driven artificial intelligence (AI)3 and quantum computing4, has led to unprecedented demands for integration scale and computational power. As an example, the computational requirements for training state-of-the-art AI systems have been doubling approximately every 3.5 months5.

As traditional electronic systems reach their physical limits, alternative paradigms are vital to sustain innovation in computational capabilities. Among other solutions, photonic integrated circuits (PICs) show great promise in facing these challenges and are rapidly becoming economically viable6. Although less mature than traditional CMOS-based integrated electronics, PICs offer a series of advantages inherent to photonic platforms, such as increased parallelism (exploiting wavelength, polarization, and mode division multiplexing), reduced power consumption, ultrahigh operating speeds, and compatibility with the silicon industry6,7,8,9, all of which could open the path to new breakthroughs in the aforementioned applications. Indeed, multiple examples of photonic quantum devices have been proposed10,11 and numerous implementations of photonic accelerators for AI are reported in the literature12,13,14,15,16. Moreover, with the advances in the co-integration of photonic components and electronic tuning elements17, there are many examples of programmable photonic devices based on Mach-Zehnder interferometer (MZI) meshes that can be employed in multiple applications, ranging from optical computing to photonic quantum computing18,19, with a strategy similar to CMOS multipurpose FPGA devices.

In this general context, the goal of this work is to develop a comprehensive circuit model for MZI-based meshed topologies that can be employed for reliable simulations of photonic computing or neuromorphic circuits based on this technology. In particular, we include not only propagation effects and losses, but also parasitic phenomena, such as thermal crosstalk, which, if not properly accounted for, can significantly impact the performance of a photonic processor or the accuracy of a trained photonic neural network (PNN)20,21. Our methodology offers a computationally efficient alternative to full-scale multi-physics simulations, making it feasible to model and optimize larger photonic networks without compromising accuracy.

In order to benchmark the proposed model, we first apply it to the description of a 3 × 3 mesh of MZIs that can be employed as a programmable photonic unit21,22. The simulated results obtained with the proposed model are compared with the measurements performed on the actual device to highlight the accuracy of the model itself. The proposed methodology can be easily generalized to model and program arbitrarily-sized MZI-based mesh devices. After the characterization of the 3 × 3 reference device, in order to highlight the versatility of our model, we show a possible application to the offline programming of the same circuit: using the validated model, we are able to determine the MZI voltages needed to implement various user-defined logic functions, also demonstrating the resilience to fluctuations of control signals.

Methods

Reference circuit and technology

The device that we have considered for this analysis is a mesh of nine interconnected 2 × 2 MZIs, with three input ports and three output ports. Its layout is shown in Fig. 1a. Each MZI, whose structure is represented in Fig. 1b, comprises two 2 × 2 multi-mode interferometers (MMIs) connected by two  ~267 μm long arms. The effective refractive index on the internal arms is regulated by means of a voltage-controlled thermal phase shifter on the upper arm, implemented with a titanium strip.

Fig. 1: Masks of the reference circuit.

a Mask of the total reference circuit. The waveguide structure is shown in blue, the Ti heaters in red. The grid size is 50 μm × 50 μm; the overall size of the shown region is approximately 1900 μm × 320 μm. The black dashed rectangle indicates the area where our thermal analysis will be performed. b Detail of a single MZI from the circuit’s mask. The input and output 2 × 2 MMIs are highlighted by the black dashed rectangles on the left and right of the image. The Ti strip used for thermal tuning with the electrical pads (the red structure) covers the center portion of the upper waveguide. The grid size is 10 μm × 10 μm; the overall size of the depicted region is approximately 500 μm × 65 μm.

The meshed topology shown in Fig. 1a is part of a larger PIC, developed at the Technical University of Denmark and used as a 7 × 7 reconfigurable optical switch for C-band operations22. The PIC is designed on a silicon-on-insulator (SOI) platform, with a buried aluminum mirror produced via flip bonding22. A detailed schematic of the device layers is reported in Fig. 2a. The cross sections of the Si waveguides and the Ti heaters are 0.5 μm × 0.25 μm and 1.8 μm × 0.1 μm, respectively, as shown in Fig. 2b. Moreover, the heaters are 100 μm long, with additional 40 μm × 20 μm pads used to apply the driving voltage, without introducing additional heating thanks to their large cross section and correspondingly low resistance. From Fig. 2, it can be seen that the Ti heaters and the Si waveguides are separated by a 1 μm layer of SiO2, used to mitigate the absorption losses that arise from the proximity between the Si waveguide and the metal plate.

Fig. 2: Photonic integrated circuit platform technology.

a Schematic of the vertical structure of the PIC. Each color is associated with a different material: Si in orange, benzocyclobutene (BCB) in lilac, SiO2 in yellow, Al in blue, and Ti in gray. b Schematic of the Si waveguides (orange) and Ti heaters (gray).

All MZIs were designed to work in the cross state at 0 V and in bar state at 2 V. However, it can be observed experimentally that devices 6 and 8 display opposite behavior23.

As can be seen in Fig. 1a, the nine MZIs of the circuit are interconnected by means of a series of bent waveguides and optical crossings. For an accurate description of the phase change and the losses accumulated between consecutive MZIs, the length of each optical connection must be considered (listed in Supplementary Table 1). The circuit also contains four optical crossings that will be included in our model by means of their insertion loss (IL), assuming negligible optical crosstalk24.

Device modeling

The goal of this section is to capture the physics of the reference 3 × 3 circuit, developing a general methodology of analysis that can be applied to any meshed topology.

The analysis begins by defining the behavioral model for a single MZI. This involves two main elements: (a) a model for light propagation and (b) a model for thermal effects. The latter describes how the changes in temperature, caused by the voltage applied to the heaters, affect the waveguide effective index neff and it must account for spurious effects, such as thermal crosstalk between neighboring MZIs.

For the first point, we employ the traditional transmission matrix-based formulation to describe the propagation of the field in the device. The transmission matrices for the input and output MMIs can be expressed exploiting Coupled Mode Theory25:

$${T}_{{{{{\rm{MMI}}}}}_{{{{\rm{in}}}}}}=\left[\begin{array}{cc}{\alpha }_{{{{{\rm{MMI}}}}}_{{{{\rm{in}}}}}}\sqrt{{\gamma }_{{{{\rm{in}}}}1}}&j{\alpha }_{{{{{\rm{MMI}}}}}_{{{{\rm{in}}}}}}\sqrt{1-{\gamma }_{{{{\rm{in}}}}2}}\\ j{\alpha }_{{{{{\rm{MMI}}}}}_{{{{\rm{in}}}}}}\sqrt{1-{\gamma }_{{{{\rm{in}}}}1}}&{\alpha }_{{{{{\rm{MMI}}}}}_{{{{\rm{in}}}}}}\sqrt{{\gamma }_{{{{\rm{in}}}}2}}\end{array}\right]$$
(1)
$${T}_{{{{{\rm{MMI}}}}}_{{{{\rm{out}}}}}}=\left[\begin{array}{cc}{\alpha }_{{{{{\rm{MMI}}}}}_{{{{\rm{out}}}}}}\sqrt{{\gamma }_{{{{\rm{out}}}}1}}&j{\alpha }_{{{{{\rm{MMI}}}}}_{{{{\rm{out}}}}}}\sqrt{1-{\gamma }_{{{{\rm{out}}}}2}}\\ j{\alpha }_{{{{{\rm{MMI}}}}}_{{{{\rm{out}}}}}}\sqrt{1-{\gamma }_{{{{\rm{out}}}}1}}&{\alpha }_{{{{{\rm{MMI}}}}}_{{{{\rm{out}}}}}}\sqrt{{\gamma }_{{{{\rm{out}}}}2}}\end{array}\right]$$
(2)

where \({\alpha }_{{{{{\rm{MMI}}}}}_{{{{\rm{in}}}}}}\) and \({\alpha }_{{{{{\rm{MMI}}}}}_{{{{\rm{out}}}}}}\) are the insertion losses of the input and output MMIs, respectively, and \({\gamma }_{{{{\rm{in}}}}i}={\gamma }_{{{{\rm{out}}}}i}={\gamma }_{i}\) are the corresponding splitting ratios, defined as the ratio between the output power at port i and the input power at port i, in the absence of losses. For a shorter notation, we define four new quantities \({\Gamma }_{11}=\sqrt{{\gamma }_{1}}\), \({\Gamma }_{12}=\sqrt{1-{\gamma }_{2}}\), \({\Gamma }_{21}=\sqrt{1-{\gamma }_{1}}\), and \({\Gamma }_{22}=\sqrt{{\gamma }_{2}}\).

Then, we can describe the propagation through the two arms of the MZI with the following diagonal transfer matrix, which takes into account the Ti heater that can modify the effective refractive index:

$${T}_{{{{\rm{prop}}}}}=\left[\begin{array}{cc}{\xi }_{{{{\rm{m}}}}}{\epsilon }_{+}{e}^{j(\frac{2\pi }{\lambda }{n}_{{{{\rm{eff,1}}}}}(T){L}_{{{{\rm{h}}}}}+\delta \varphi )}&0\\ 0&{\alpha }_{{{{\rm{b}}}}}^{2}{e}^{-{\alpha }_{{{{\rm{prop}}}}}L}{\epsilon }_{-}{e}^{j(\frac{2\pi }{\lambda }{n}_{{{{\rm{eff,2}}}}}(T){L}_{{{{\rm{h}}}}}-\delta \varphi )}\end{array}\right]$$
(3)

where \({\xi }_{{{{\rm{m}}}}}={\alpha }_{{{{\rm{b}}}}}^{2}{\alpha }_{{{{\rm{m}}}}}{e}^{-{\alpha }_{{{{\rm{prop}}}}}L}\), αm is a metal absorption factor, αb is a bending radiation factor (which appears as a squared quantity since each arm includes two bends, as can be seen in Fig. 1b), αprop is the propagation loss coefficient of the waveguide, L is the total length of the arm, Lh is the heater length, and λ is the signal wavelength. The terms \({\epsilon }_{\pm }={e}^{j(\frac{2\pi }{\lambda }{n}_{{{{{\rm{eff}}}}}_{{{{\rm{0}}}}}}(L-{L}_{{{{\rm{h}}}}})\pm \delta \varphi )}\), with \({n}_{{{{{\rm{eff}}}}}_{{{{\rm{0}}}}}}\) the effective refractive index at room temperature T = T0 = 293 K, introduce the optical phase accumulated in the portion of the arms that is not covered by the electrode. The remaining quantities introduced in Eq. (3) are discussed below.

The effective refractive indices on the two arms of the MZI neff,1 and neff,2 are functions of the temperature T due to the action of the thermal phase shifters. The temperature of the upper arm, placed directly below the metal pad, is modified, but due to the lack of insulation trenches22 and the limited distance between the two waveguides, the temperature of the lower arm is also affected. Even though the latter variation is smaller than the temperature change in the upper waveguide, this thermal crosstalk can significantly affect the behavior of the single MZI and of the whole device. In fact, thermal crosstalk is one of the main limitations to the large-scale integration of devices in PICs26 and can strongly affect the accuracy of PNNs21 or the programmability of meshed topologies19.

Moreover, the transmission matrix presented in Eq. (3) formally takes into account losses due to metal absorption resulting from the proximity of the upper waveguide to the Ti heater and the metallic pad. Owing to the additional SiO2 layer between the waveguides and the heaters discussed in the “Methods” section, the absorption factor is set to αm = 1, but it has been included to offer a general description that can be used on technological platforms where this type of loss is not mitigated.

The term δφ represents a phase offset and is introduced to better describe the real behavior of each MZI. Indeed, this parameter allows us to account for possible fabrication uncertainties that could result in spurious neff shifts and, consequently, in different working points for each MZI. This term, together with the presence of unbalanced splitting ratios, allows us to capture the spurious optical transmission on the opposite port that can be observed when devices are in bar or cross state27 without having to implement more sophisticated models, where, for instance, optical crosstalk is modeled statistically28.

The matrices \({T}_{{{{{\rm{MMI}}}}}_{{{{\rm{in}}}}}}\), Tprop, and \({T}_{{{{{\rm{MMI}}}}}_{{{{\rm{out}}}}}}\) can be used to compute the total transfer matrix of the MZI by means of a matrix multiplication, corresponding to the cascade of the constituent blocks29:

$$T={T}_{{{{{\rm{MMI}}}}}_{{{{\rm{out}}}}}}{T}_{{{{\rm{prop}}}}}{T}_{{{{{\rm{MMI}}}}}_{{{{\rm{in}}}}}}$$
(4)

Finally, the fields \({E}_{1}^{{{{\rm{out}}}}}\) and \({E}_{2}^{{{{\rm{out}}}}}\) at the two output ports of the single MZI can be computed as functions of the input fields by multiplying the input field components by the transmission matrix obtained:

$$\left[\begin{array}{c}{E}_{1}^{{{{\rm{out}}}}}\\ {E}_{2}^{{{{\rm{out}}}}}\end{array}\right]=T\left[\begin{array}{c}{E}_{1}^{{{{\rm{in}}}}}\\ {E}_{2}^{{{{\rm{in}}}}}\end{array}\right]=\left[\begin{array}{cc}{T}_{11}&{T}_{12}\\ {T}_{21}&{T}_{22}\end{array}\right]\left[\begin{array}{c}{E}_{1}^{{{{\rm{in}}}}}\\ {E}_{2}^{{{{\rm{in}}}}}\end{array}\right]$$
(5)
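The cascade of Eqs. (1)–(5) can be sketched numerically as follows. This is a minimal illustration rather than the simulator used in this work: the function names are hypothetical, the two MMIs are assumed identical (a single insertion-loss factor and a single pair of splitting ratios), and the whole phase and amplitude factor of each arm are lumped into one argument each instead of being split into heated and unheated sections as in Eq. (3).

```python
import numpy as np


def mmi_matrix(alpha, gamma_1, gamma_2):
    """Transmission matrix of a 2x2 MMI in the form of Eqs. (1)-(2)."""
    return alpha * np.array(
        [
            [np.sqrt(gamma_1), 1j * np.sqrt(1.0 - gamma_2)],
            [1j * np.sqrt(1.0 - gamma_1), np.sqrt(gamma_2)],
        ]
    )


def mzi_matrix(phi_upper, phi_lower, alpha_mmi=1.0, gamma_1=0.5, gamma_2=0.5,
               loss_upper=1.0, loss_lower=1.0):
    """Total MZI matrix T = T_MMI,out @ T_prop @ T_MMI,in, as in Eq. (4).

    phi_upper / phi_lower are the total phases of the two arms (rad);
    loss_upper / loss_lower are the corresponding amplitude factors.
    Assumption of this sketch: input and output MMIs are identical.
    """
    t_mmi = mmi_matrix(alpha_mmi, gamma_1, gamma_2)
    t_prop = np.diag([loss_upper * np.exp(1j * phi_upper),
                      loss_lower * np.exp(1j * phi_lower)])
    return t_mmi @ t_prop @ t_mmi


def output_powers(t, e_in):
    """Apply Eq. (5) and return the optical power at the two output ports."""
    e_out = t @ np.asarray(e_in, dtype=complex)
    return np.abs(e_out) ** 2


# Light injected into input port 1 only, ideal lossless 50:50 MMIs.
p_equal_phase = output_powers(mzi_matrix(0.0, 0.0), [1.0, 0.0])
p_pi_shift = output_powers(mzi_matrix(np.pi, 0.0), [1.0, 0.0])
print("equal arm phases:", np.round(p_equal_phase, 6))  # power exits from port 2 (cross)
print("pi phase shift  :", np.round(p_pi_shift, 6))     # power exits from port 1 (bar)
```

In this idealized setting, equal arm phases route the power to the opposite port and a π phase difference routes it to the same port; unbalanced splitting ratios or unequal arm losses can be explored by changing the keyword arguments.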

Through straightforward matrix calculations, we can expand Eq. (5) to obtain two expressions that can be efficiently evaluated numerically. In this case, we retrieve the following output field equations:

$${E}_{1}^{{{{\rm{out}}}}}={\xi }_{1}{\Gamma }_{11}^{2}{p}_{1}{E}_{1}^{{{{\rm{in}}}}}-{\xi }_{2}{\Gamma }_{12}{\Gamma }_{21}{p}_{2}{E}_{1}^{{{{\rm{in}}}}}+j{\xi }_{1}{\Gamma }_{11}{\Gamma }_{12}{p}_{1}{E}_{2}^{{{{\rm{in}}}}}+j{\xi }_{2}{\Gamma }_{12}{\Gamma }_{22}{p}_{2}{E}_{2}^{{{{\rm{in}}}}}$$
(6)
$${E}_{2}^{{{{\rm{out}}}}}=j{\xi }_{1}{\Gamma }_{21}{\Gamma }_{11}{p}_{1}{E}_{1}^{{{{\rm{in}}}}}+j{\xi }_{2}{\Gamma }_{22}{\Gamma }_{21}{p}_{2}{E}_{1}^{{{{\rm{in}}}}}-{\xi }_{1}{\Gamma }_{21}{\Gamma }_{12}{p}_{1}{E}_{2}^{{{{\rm{in}}}}}+{\xi }_{2}{\Gamma }_{22}^{2}{p}_{2}{E}_{2}^{{{{\rm{in}}}}}$$
(7)

where, for brevity’s sake, the following quantities were introduced:

$${\xi }_{1}={\alpha }_{{{{{\rm{MMI}}}}}_{{{{\rm{in}}}}}}{\alpha }_{{{{{\rm{MMI}}}}}_{{{{\rm{out}}}}}}{\alpha }_{{{{\rm{m}}}}}{\alpha }_{{{{\rm{b}}}}}^{2}{e}^{-{\alpha }_{{{{\rm{prop}}}}}L}$$
$${\xi }_{2}={\alpha }_{{{{{\rm{MMI}}}}}_{{{{\rm{in}}}}}}{\alpha }_{{{{{\rm{MMI}}}}}_{{{{\rm{out}}}}}}{\alpha }_{{{{\rm{b}}}}}^{2}{e}^{-{\alpha }_{{{{\rm{prop}}}}}L}$$
$${p}_{1}={e}^{j(\frac{2\pi }{\lambda }{n}_{{{{\rm{eff}}}}}(T({V}_{{{{\rm{in}}}}})){L}_{{{{\rm{h}}}}}+\delta \varphi )}{e}^{j(\frac{2\pi }{\lambda }{n}_{{{{{\rm{eff}}}}}_{{{{\rm{0}}}}}}(L-{L}_{{{{\rm{h}}}}})+\delta \varphi )}$$
$${p}_{2}={e}^{j(\frac{2\pi }{\lambda }{n}_{{{{{\rm{eff}}}}}_{{{{\rm{0}}}}}}L-\delta \varphi )}$$

At this point, in cases where ξ1 = ξ2 and \({\Gamma }_{ij}=\sqrt{0.5},\,i,j=1,2\), we would be able to exploit the prosthaphaeresis formulae to analytically obtain a closed-form solution proportional to \(\cos (\Delta \phi )\), where Δϕ is the phase difference between the two arms of the MZI. However, with the presented formulation, where unbalanced MMI splitting ratios and metal absorption loss αm are considered, it is not possible to easily obtain this result analytically, but we can still expect the sinusoidal-like behavior inherent in the physics of MZIs.
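As an illustration of the balanced case, consider only input port 1 excited (\({E}_{2}^{{\rm{in}}}=0\)) and write \({p}_{1}={e}^{j{\phi }_{1}}\), \({p}_{2}={e}^{j{\phi }_{2}}\), where ϕ1 and ϕ2 are shorthand symbols introduced here for the total phases of the two arms. With ξ1 = ξ2 = ξ and \({\Gamma }_{ij}=\sqrt{0.5}\), the terms of Eqs. (6) and (7) that multiply \({E}_{1}^{{\rm{in}}}\) reduce to:

$${E}_{1}^{{\rm{out}}}=\frac{\xi }{2}\left({p}_{1}-{p}_{2}\right){E}_{1}^{{\rm{in}}},\qquad {E}_{2}^{{\rm{out}}}=j\frac{\xi }{2}\left({p}_{1}+{p}_{2}\right){E}_{1}^{{\rm{in}}}$$

$$\frac{{\left|{E}_{1}^{{\rm{out}}}\right|}^{2}}{{\left|{E}_{1}^{{\rm{in}}}\right|}^{2}}=\frac{{\xi }^{2}}{2}\left(1-\cos \Delta \phi \right),\qquad \frac{{\left|{E}_{2}^{{\rm{out}}}\right|}^{2}}{{\left|{E}_{1}^{{\rm{in}}}\right|}^{2}}=\frac{{\xi }^{2}}{2}\left(1+\cos \Delta \phi \right),\qquad \Delta \phi ={\phi }_{1}-{\phi }_{2}$$

Under these idealized assumptions, all the transmitted power exits from the opposite (cross) port at Δϕ = 0 and from the same (bar) port at Δϕ = π.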

To estimate the values of the various parameters previously described, we simulated the structure of the constituent components in RSoft™ CAD30, employing the Finite-Difference Time-Domain (FDTD) method. In this way, we obtained the bending radiation term for the curved sections of the MZIs, the IL for the waveguide crossings, and the IL and the coupling factors of the MMIs. The values of these parameters at different wavelengths are reported in Supplementary Table 2. The MZI parameters depend weakly on the wavelength over the considered range, implying a wide band of operation for the device. Note that, in these simulations, we neglected all non-linear effects (e.g., two-photon absorption), assuming that the input power levels are always sufficiently low31.

At this point, we can introduce a model for the thermally-controlled MZIs, including thermal crosstalk.

The main thermal effect to be included is, of course, the thermal control of the MZIs by means of Ti microheaters. A voltage applied to the Ti strip heats it up through the Joule effect and, consequently, increases the temperature in the waveguide underneath. This induces a change in the neff of the waveguide and a subsequent phase difference between the two arms of the interferometer. Following a well-known approach32, we can express the dependence of the effective refractive index on the temperature T through a first-order Taylor expansion:

$${n}_{{{{\rm{eff}}}}}(T)={n}_{{{{\rm{eff}}}}}({T}_{0})+{\left.\frac{{{{\rm{d}}}}{n}_{{{{\rm{eff}}}}}}{{{{\rm{d}}}}T}\right| }_{{T}_{0}}(T-{T}_{0})$$
(8)

The derivative of neff with respect to temperature is calculated from the neff(T) curves obtained with RSoft™ simulations of a single waveguide at different temperatures (see Supplementary Note 2). This derivative amounts to \(1.9832\times {10}^{-4}\,{{\rm{K}}}^{-1}\) at λ = 1550 nm, which is in line with the values reported in the literature for waveguides with similar cross sections33. Since the neff change is caused by the temperature difference ΔT = T − T0 induced by Vin, we need a simple way to relate the temperature increase in the waveguide to the voltage applied to the Ti strip. In order to integrate this into a larger simulation framework and achieve an accurate representation of thermal effects without relying on complex analytical or numerical models, we chose to use targeted COMSOL Multiphysics® simulations of a simplified system. This approach, with the appropriate strategy, can be generalized to represent more complex configurations, such as the one under study.

In particular, in COMSOL, we simulated a 3D system made of six parallel Si waveguides deposited on a 1 mm × 1 mm × 200 mm SiO2 substrate and covered by a SiO2 cladding, on top of which three Ti microheater strips are located, as indicated by the dashed rectangle in Fig. 1a. We assume that, in the circuit under test (Fig. 1a), thermal effects are relevant only between devices having the same position y and are negligible otherwise, since the distance between the arms is much smaller than the distance between different MZIs in the y direction; we can therefore estimate the thermal crosstalk by computing the total spatial variation of the temperature due to three vertically stacked MZIs. For the simulations, we considered the Joule effect for the heating of the electrodes, heat conduction in solids for the propagation of the temperature in the geometry, and the linear resistivity model for the Ti strips. The simulation incorporated the following boundary conditions (BCs): the top surface of the chip exchanges heat with the surrounding air by convection (Robin BCs34), the sides are treated as adiabatic without external heat exchange (homogeneous Neumann BCs34,35), and the substrate is maintained at ambient temperature by an ideal Peltier cell (Dirichlet BCs34,35). In Fig. 3a, we show a screenshot of the simulated system in COMSOL Multiphysics®. Fig. 3b shows the structure of the considered layers in the transverse plane.

Fig. 3: COMSOL Multiphysics® simulations.

The results refer to the portion of the device indicated by the black rectangle in Fig. 1a. a 3D view of the simulated domain and heat map obtained when a voltage of 2 V is applied to the three electrodes; the straight lines indicate the upper and lower arms of MZI 1, 2, and 3. b Heat map in the transverse plane in the middle of the electrodes (y = 0), when 2 V are applied to the microheater of MZI 2. c Spatial distribution of the temperature variation when various voltages Vin,2 are applied to the microheater of MZI 2. The vertical black lines indicate the positions of the six waveguides. d Temperature in the waveguides vs. voltage applied to the microheater of MZI 2; the temperature variations of MZI 1 and MZI 3 are approximately the same.

With this strategy, we are able to compute the temperature variation as a function of position for a single MZI as the applied Vin changes. Fig. 3c represents the temperature difference from room temperature (20 °C) when a voltage Vin,2 is applied to the central heater while the other ones are grounded. For instance, let us consider Vin,2 = 2 V: in this case, we observe that the temperature in the waveguide below that heater increases by ~52 °C, but also that, in the lower arm of the same MZI, the temperature increases by ~16 °C (Fig. 3c, d). This results in a reduced thermal tuning efficiency, as the optical path variation is proportional to the difference in temperature between the two arms (ΔTMZI,2 = 36 °C). Even more importantly, there is also a significant temperature difference between the waveguides of MZI 1 (ΔTMZI,1 = 5 °C) and of MZI 3 (ΔTMZI,3 = 2 °C), despite the fact that both have grounded heaters: this is thermal crosstalk. Although it could be mitigated with a larger separation between the waveguides of the same MZI or between different MZIs, or with insulation trenches, these solutions would imply either lower integration density or increased fabrication complexity.
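To put these temperature differences in perspective, the short sketch below converts them into an arm-to-arm phase difference using the first-order expansion of Eq. (8). It is a rough estimate only: it assumes that the temperature is uniform along the 100 μm heated section, that no phase is accumulated outside it, and that the derivative quoted for λ = 1550 nm applies; the function name is hypothetical.

```python
import math

DN_DT = 1.9832e-4        # d(n_eff)/dT in 1/K at 1550 nm (value quoted in the text)
WAVELENGTH = 1550e-9     # signal wavelength in m
HEATER_LENGTH = 100e-6   # heater length L_h in m


def differential_phase(delta_t_arms):
    """Phase difference (rad) between the two arms for a given temperature
    difference (K), assuming a uniform temperature along the heated section."""
    delta_n = DN_DT * delta_t_arms          # first-order expansion, Eq. (8)
    return 2.0 * math.pi / WAVELENGTH * delta_n * HEATER_LENGTH


for name, delta_t in [("MZI 2 (driven at 2 V)", 36.0),
                      ("MZI 1 (grounded)", 5.0),
                      ("MZI 3 (grounded)", 2.0)]:
    phase = differential_phase(delta_t)
    print(f"{name}: {phase:.3f} rad = {phase / math.pi:.2f} pi")
```

Under these assumptions the driven MZI accumulates a phase difference of roughly 0.9π, which is compatible with the nominal cross-to-bar switching at 2 V, while the shifts induced on the grounded neighbors are about an order of magnitude smaller but not negligible.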

The COMSOL simulations can be employed to compute the temperature variation when the lateral MZIs are turned on separately or when multiple MZIs are turned on at the same time, which is fundamental to describe a realistic use of the device. A brute-force approach would require one simulation for each set of voltages applied to the three electrodes, which would be excessively time-consuming. Instead, we decided to exploit the curves computed for a single heater (Fig. 3c). For each MZI k = 1, 2, 3, we select the curve at the corresponding Vin,k and shift it in the x direction (for the lateral MZIs). The three contributions are summed to approximate the complete spatial temperature distribution. The same approach is used for the two other groups of MZIs present in the device.
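The shift-and-sum procedure can be sketched as follows. The single-heater profile used here is a placeholder and not the simulated data: its Lorentzian shape, its width, the quadratic dependence on voltage, and the heater and waveguide coordinates are all assumptions chosen only for illustration. In practice, the placeholder would be replaced by an interpolation of the tabulated single-heater curves.

```python
import numpy as np


def single_heater_profile(x_um, voltage):
    """Placeholder for the single-heater curves: temperature rise (K) at
    lateral position x_um, with the heater centered at x = 0.
    Shape, width (8 um) and V^2 scaling (13 K/V^2) are illustrative only."""
    peak = 13.0 * voltage ** 2
    return peak / (1.0 + (np.asarray(x_um, dtype=float) / 8.0) ** 2)


def superposed_profile(x_um, heater_centers_um, voltages):
    """Sum of single-heater profiles, each shifted to its heater position."""
    x_um = np.asarray(x_um, dtype=float)
    total = np.zeros_like(x_um)
    for center, voltage in zip(heater_centers_um, voltages):
        total += single_heater_profile(x_um - center, voltage)
    return total


# Hypothetical coordinates (um): three heaters and six waveguides.
heater_centers = [-60.0, 0.0, 60.0]
waveguides = np.array([-60.0, -50.0, 0.0, 10.0, 60.0, 70.0])

# Temperature rise sampled at the waveguides for 1 V, 2 V, 1 V.
delta_t = superposed_profile(waveguides, heater_centers, [1.0, 2.0, 1.0])
for x, dt in zip(waveguides, delta_t):
    print(f"x = {x:6.1f} um  ->  dT = {dt:5.1f} K")
```

The sampled values of the temperature rise are what would then be inserted into Eq. (8), one per waveguide, to obtain the corresponding neff variations.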

Figure 4 depicts the spatial distribution of the temperature variation with 1 V, 2 V, and 1 V applied to the three heaters, respectively, compared to the actual COMSOL simulation of the system represented in Fig. 3a: the results obtained with our procedure accurately reproduce the COMSOL thermal simulations. In this way, we can sample the temperature change ΔT at the positions corresponding to each waveguide and use these values in Eq. (8) to compute the neff variation. Approximating the distribution as the sum of three contributions is valid because the heat sources are far from the borders of the simulation box in the x direction; otherwise, the adiabatic BCs would influence the temperature profiles and affect the result.

Fig. 4: Example of spatial distribution of temperature variation.

Spatial distribution of temperature variation for Vin,1 = 1 V, Vin,2 = 2 V, and Vin,3 = 1 V applied simultaneously to the three heaters, simulated in COMSOL Multiphysics® (solid blue line) and reconstructed using our method (dashed red line).

With this description, we are able to create a model that can rapidly compute the response of a meshed MZI-based topology, including multiple effects that would otherwise require time-consuming multi-physics simulations.

Results and discussion

Validation with experiments

In this section, we validate the model previously described by comparing the simulated results with measurements of the actual device. For this purpose, we employ a set of measured output-to-input power ratios. These power ratios are measured as follows: for each pair of input-output ports, a broadband signal is injected into the input port and each MZI is switched gradually, by sweeping its input voltage from 0 V to 2 V in steps of 0.1 V, while all the other MZIs are grounded. Figure 5a reports an example of raw measured data21,23: amplified spontaneous emission (ASE) is injected into input port 1 and measured at output port 2 with an optical spectrum analyzer (OSA)23, for various values of Vin,1 applied to MZI 1. The flat measured responses confirm the wideband properties of the device. To have a reference that is more robust to noise and to simplify the subsequent analysis, we averaged the spectra over the 1540 nm–1555 nm range (gray box in Fig. 5a), leading to the power ratio curve shown in Fig. 5b.
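The averaging step can be illustrated with the sketch below, which reduces one spectrum per applied voltage to a single point of the power ratio curve. The spectra are synthetic (a made-up linear trend with added noise), the function name is hypothetical, and averaging directly in decibels is an assumption of this sketch, since the averaging domain is not specified here.

```python
import numpy as np


def band_average_db(wavelength_nm, power_ratio_db, lo_nm=1540.0, hi_nm=1555.0):
    """Average a power-ratio spectrum (dB) over the window [lo_nm, hi_nm].
    Averaging in dB (rather than in linear units) is an assumption."""
    wavelength_nm = np.asarray(wavelength_nm, dtype=float)
    power_ratio_db = np.asarray(power_ratio_db, dtype=float)
    in_band = (wavelength_nm >= lo_nm) & (wavelength_nm <= hi_nm)
    if not in_band.any():
        raise ValueError("no spectral samples inside the averaging window")
    return power_ratio_db[in_band].mean()


# Synthetic example: one flat, noisy spectrum per voltage (0 V to 2 V, 0.1 V steps).
rng = np.random.default_rng(0)
wavelengths = np.linspace(1530.0, 1565.0, 351)
voltages = np.linspace(0.0, 2.0, 21)
levels_db = -18.0 + 6.5 * voltages          # made-up trend, not measured data
curve = [
    band_average_db(wavelengths, level + 0.3 * rng.standard_normal(wavelengths.size))
    for level in levels_db
]
print(np.round(curve, 2))
```

Each entry of `curve` corresponds to one voltage step, so the list plays the role of a single-MZI segment of the averaged power ratio curve.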

Fig. 5: Power ratio spectral measurements.

a Experimental spectra at different voltages Vin applied to MZI 1 for the Pout,2/Pin,1 ratio. The gray window represents the range of wavelengths that were averaged to create the dataset21,23. b Resulting Pout,2/Pin,1 power ratio curve associated with MZI 1, obtained by averaging the spectra in the gray box.

By repeating the process and applying an input voltage to each MZI in sequence, we obtain other averaged curves similar to Fig. 5b. These curves can be concatenated into a single trace such as the one reported in Fig. 6. The concatenated curve PdB has been scaled by mapping its maximum value Pmax (−4.8 dB) to 1 and the value Pref measured with all driving voltages at zero (−17.7 dB) to 0:

$${P}_{{{\rm{scaled}}}}=\frac{{{{\rm{P}}}}_{{{\rm{dB}}}}-{{{\rm{P}}}}_{{{\rm{ref}}}}}{{{{\rm{P}}}}_{{{\rm{max}}}}-{{{\rm{P}}}}_{{{\rm{ref}}}}}$$
(9)
Fig. 6: Power ratio curve and associated optical path.

Example of measured Pout/Pin curve (input port 1, output port 2) and the associated optical path in the circuit. Both simulated and measured curves have been normalized to account for additional sources of loss that are not present in our model (e.g., measurement setup losses) and to simplify subsequent analysis. The left y axis represents the measured power ratio values (in decibel); the right y axis represents the same quantity scaled with Eq. (9).

This scaling operation allows for easier comparison with our simulation results, as experimental measurements may include additional optical losses (e.g., measurement setup losses) that are not accounted for in our model.
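A minimal sketch of the scaling of Eq. (9) is given below, using the Pmax and Pref values quoted for the input 1 to output 2 trace; the function name and the sample values are illustrative.

```python
import numpy as np


def scale_power_ratio(p_db, p_max_db, p_ref_db):
    """Eq. (9): map p_ref_db to 0 and p_max_db to 1 (all values in dB)."""
    return (np.asarray(p_db, dtype=float) - p_ref_db) / (p_max_db - p_ref_db)


# Values quoted for the input 1 -> output 2 trace.
P_MAX_DB, P_REF_DB = -4.8, -17.7
samples_db = np.array([-17.7, -11.25, -4.8, -20.0])
scaled = scale_power_ratio(samples_db, P_MAX_DB, P_REF_DB)
print(scaled)
```

Since the mapping is linear in decibels, the midpoint between Pref and Pmax maps to about 0.5, and any sample below Pref (such as the last one) maps to a negative value.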

This kind of measured data is interesting because it gives clear evidence of the effect of thermal crosstalk on the response of the circuit: in Fig. 6, three main contributions to the Pout/Pin curve can be identified, corresponding to the three MZIs located on the optical path from input 1 to output 2, namely MZIs 1, 4, and 8. First, all curves start from the same Pout/Pin value, corresponding to the case with all MZIs grounded (indicated by the red marker in Fig. 6). In this condition, MZI 1 is in the cross state, so Pin,1 is mostly routed to MZI 5, except for a small portion due to the non-ideal behavior of the MZIs, as already discussed in the previous section. Since MZIs 5 and 7 are in the cross state as well, the majority of Pin,1 reaches output 3. When a voltage is applied to MZI 1, this device switches to the bar state, routing Pin,1 to output 2. Similarly, power is routed away from output 2 as MZI 4 goes from cross to bar state and MZI 8 from bar to cross (as stated in the “Methods” section, MZIs 6 and 8 are in the bar state when grounded23). However, it is also evident from the experiments that MZI 2, although not on the direct light path connecting input 1 to output 2, has an effect: due to the action of the heater of MZI 2 on the waveguides of MZI 1, the latter is pushed even further into the cross state, thus diverting power away from output 2. This is one instance of thermal crosstalk and, since its effect is evident even when a single MZI is driven with a single input, it is expected to have an even larger impact when the circuit is used at full capacity.

In order to improve the match with the experiments by accounting for process variations, the phase correction terms δφ introduced in Eq. (3) are now adjusted for each MZI. This can be done with an optimization procedure, for example, using the Particle Swarm Optimization (PSO) method36. PSO is an optimization algorithm based on the social interaction between agents called “particles”, which move within an N-dimensional solution space (here N = 9, the number of δφ parameters to be tuned), with the goal of minimizing an error metric (called “fitness”)37. The fitness measures the quality of the solution found by each particle and, for this particular application, it was calculated as the mean squared error (MSE) between the experimental power ratios (the target of the optimization) and the ones obtained by simulating the circuit with the set of δφ parameters found by each particle, at each iteration of the algorithm. Thanks to the movement rules of the particles37, the algorithm is able to converge to a solution that minimizes the fitness, yielding a set of nine δφ parameters that accurately matches the experimental target.
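The fitting loop can be sketched with a textbook global-best PSO, shown below on a toy problem. This is not the implementation used in this work: the hyperparameters (swarm size, inertia, acceleration coefficients) are generic assumptions, and the forward model is a hypothetical stand-in with three independent cosine-squared responses instead of the full nine-MZI circuit simulator.

```python
import numpy as np


def pso_minimize(fitness, n_dims, bounds, n_particles=30, n_iters=200,
                 inertia=0.7, c_cognitive=1.5, c_social=1.5, seed=0):
    """Textbook global-best PSO; returns (best_position, best_fitness)."""
    rng = np.random.default_rng(seed)
    lo, hi = bounds
    pos = rng.uniform(lo, hi, size=(n_particles, n_dims))
    vel = np.zeros_like(pos)
    best_pos = pos.copy()                                    # per-particle best
    best_fit = np.array([fitness(p) for p in pos])
    g_idx = np.argmin(best_fit)
    g_pos, g_fit = best_pos[g_idx].copy(), best_fit[g_idx]   # swarm best

    for _ in range(n_iters):
        r1 = rng.random(pos.shape)
        r2 = rng.random(pos.shape)
        vel = (inertia * vel
               + c_cognitive * r1 * (best_pos - pos)
               + c_social * r2 * (g_pos - pos))
        pos = np.clip(pos + vel, lo, hi)
        fit = np.array([fitness(p) for p in pos])
        improved = fit < best_fit
        best_pos[improved] = pos[improved]
        best_fit[improved] = fit[improved]
        g_idx = np.argmin(best_fit)
        if best_fit[g_idx] < g_fit:
            g_pos, g_fit = best_pos[g_idx].copy(), best_fit[g_idx]
    return g_pos, g_fit


VOLTAGES = np.linspace(0.0, 2.0, 21)
TRUE_OFFSETS = np.array([0.3, -0.5, 1.0])    # made-up phase offsets (rad)


def toy_power_curves(offsets):
    """Stand-in for the circuit simulator: one cos^2 curve per MZI, concatenated."""
    drive_phase = np.pi * VOLTAGES ** 2 / 4.0    # reaches pi at 2 V
    return np.concatenate([np.cos((drive_phase + d) / 2.0) ** 2 for d in offsets])


target = toy_power_curves(TRUE_OFFSETS)          # plays the role of the measurements


def mse_fitness(offsets):
    """Mean squared error between simulated and target power curves."""
    return np.mean((toy_power_curves(offsets) - target) ** 2)


best_offsets, best_mse = pso_minimize(mse_fitness, n_dims=3, bounds=(-np.pi, np.pi))
print("recovered offsets:", np.round(best_offsets, 3), "MSE:", best_mse)
```

For the real device, `toy_power_curves` would be replaced by the circuit model evaluated over all concatenated voltage sweeps, and `n_dims` would be set to 9.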

The fitting parameters obtained with this procedure are reported in Table 1. Note that for MZI 6 and MZI 8 we obtained values close to ±180°, consistent with the experimental evidence that these two devices are in the bar state when Vin = 0 V, showing opposite behavior with respect to the other MZIs23. These fitting parameters are then used in our model to reproduce the Pout/Pin measurements. In Fig. 7 we report all the possible combinations of input and output ports, with solid blue lines representing the measured data20 and the circled red lines representing the simulated results with the phase corrections of Table 1. The curves are normalized with Eq. (9), employing, for each combination of input and output ports, the corresponding experimental values of Pmax and Pref. From the comparisons it is clear that our model, with the optimized phase correction terms, is able to closely match the experimental evidence: the overall behavior is well reproduced, meaning that our model correctly captures the thermal crosstalk, which can be beneficial to compensate for it or to take it into account for specific applications. Small discrepancies are still present between the predictions and the references. For example, in Fig. 7a, for MZI 4, the simulation produces a lower peak power. This and other similar cases can be ascribed to additional effects present in the real device (e.g., fabrication tolerances), but also to measurement uncertainties, especially for the transfer function minima. In Fig. 7i the trends of MZI 2 and MZI 8 predicted by the simulator do not match the experimental evidence. However, it should be noted that this is the only example in which, experimentally, the peaks and the floor differ by ~30 dB. Moreover, the lowest value of the Pout,3/Pin,3 curve is −73 dB, which could be limited by the noise floor of the OSA used for the measurements.

Table 1 Phase parameters δφ extracted with PSO to fit the experimental data
Fig. 7: Extraction of the phase parameters δφ.

Comparison of the results obtained with the presented model (red circles) and the averaged measured curves (blue lines), for each set of input/output ports. Two consecutive vertical black lines indicate a 0 V–2 V span for a single device. For all 9 images, the curves are concatenated and normalized in the same way as in Fig. 6.

For the purpose of validating the model, we effectively created a digital twin of the device in Fig. 1a. The same methodology, which starts with an accurate description of the individual building blocks followed by a targeted analysis of their parasitic interactions (in our case, dominated by thermal crosstalk effects), can be easily extended to more complex photonic devices based on, but not limited to, MZI meshes.

Applications to photonic computing

In this section, we use the device digital twin to determine the optimal driving conditions of the MZIs to implement user-defined logic functions with 3 optical inputs, and we discuss the sensitivity of the output to fluctuations of the operating voltages. This ability of the model to explore the implementation of user-defined logic functions aligns with the growing demand for programmable photonic circuits in high-speed computing applications.

In this context, one possible strategy to program a PIC consists in the use of a software-defined procedure to find suitable “weights” (control voltages) to implement the desired functionality. This approach is akin to the so-called “offline training” methods for PNNs, where the backward propagation is performed on a traditional computer and the weights are applied a posteriori on the chip38. However, the effectiveness of offline methods can be drastically reduced by unforeseen fabrication variations39, affecting the behavior of devices supposed to operate identically, while the use of error correction techniques could be extremely challenging38. To overcome this limitation, it is possible to use “online training” techniques, where an optimization algorithm is directly executed on the chip to find the best control signal for each device, automatically accounting for manufacturing defects38,39,40; this approach is often “physics agnostic”40 for better adaptability.

Our model is inherently physics-informed and can be employed in support of an offline training procedure, while providing multiple key advantages with respect to both in-situ training and traditional offline approaches. It enables rapid investigation of a large parameter space, allowing us to evaluate up to approximately 1 × 10⁴ different configurations per second per core on a modern workstation for the considered 3 × 3 device. This computational efficiency makes it possible to generate very large datasets or to run advanced optimization algorithms to find, for a specific device, the ideal control signal (e.g., in terms of robustness to voltage fluctuations or of minimum operating power consumption). This advantage is evident considering that approximately 40 hours were needed to perform 5000 measurements on the 3 × 3 reference circuit23. Moreover, the capability of predicting the behavior of a single MZI provides a reliable framework for circuit design and pre-deployment validation, leading to a more systemic optimization, which may take into account robustness against fabrication variations and electrical noise while minimizing thermal crosstalk or power consumption. Proper characterization of the device at the design stage helps reduce the need for costly iterative testing. When addressing real components, the main drawback of offline training is that each device must be characterized in detail, mainly because of the intrinsic fabrication uncertainties. With our model, this characterization results automatically from the tuning of the δφ parameters to mimic the operation of the reference circuit under test, thus effectively overcoming the main drawback of canonical offline methods.
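To put the two figures quoted above side by side, the following minimal sketch compares the time per experimental measurement with the time per simulated configuration. It uses only the numbers given in the text and assumes, for the sake of illustration, that one measurement is comparable to one simulated configuration; the variable names are ours.

```python
# Back-of-the-envelope comparison using only the figures quoted in the text.
# Assumption: one measurement is treated as comparable to one simulated configuration.
measurement_hours = 40        # approximate duration of the measurement campaign
n_measurements = 5000         # measurements performed in that time
model_rate_per_core = 1e4     # configurations per second per core (upper estimate)

seconds_per_measurement = measurement_hours * 3600 / n_measurements
seconds_per_simulation = 1 / model_rate_per_core
speedup = seconds_per_measurement / seconds_per_simulation

print(f"experiment: {seconds_per_measurement:.1f} s per measurement")
print(f"model:      {seconds_per_simulation:.0e} s per configuration")
print(f"ratio:      about {speedup:.1e} per core")
```

Under this assumption, each measurement takes 28.8 s, so the model is faster by a factor of roughly 3 × 10⁵ per core.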

Building upon this physics-informed offline approach, we now use the identified model parameters to analyze the behavior of the device under practical operating conditions and to evaluate its performance in executing logic functions.

We assume that signals at 1550 nm are applied at the input ports; however, due to the wideband properties of the device, other wavelengths could be considered in a WDM scenario.

Before discussing the technique that we propose to efficiently find the required voltages, we need to address the conversion of the analog optical signals into digital 0s (false) and 1s (true). In practice, it is possible to avoid analog to digital conversion using novel techniques41; however, we decided to adopt an intensity-based approach, similar to the one used in electronics, converting the analog optical signal into a digital one thanks to a threshold for the output powers, thus separating lower power levels (corresponding to logic 0s) and higher power levels (corresponding to logic 1s). To reduce the effect of noise when dealing with output power values close to the threshold, we decided to set two separate thresholds, for the false and true levels, respectively.

These thresholds are estimated as follows: first, a dataset with 6 × 10⁶ entries is created by randomizing the input voltages of the 9 MZIs and computing the power at each output port, when the 2³ possible combinations of the digital input signals are applied. Due to the computational efficiency of the model, the generation of this dataset requires less than 6 minutes on an Intel® i9 12th generation workstation. At each p-th output port, we compute the median tm,p of the output power: the actual thresholds are defined as t0,p = 0.85 tm,p and t1,p = 1.15 tm,p. Therefore, for each p-th output port, the output power pp will be considered a logic 0 if pp < t0,p and a logic 1 if pp > t1,p. Figure 8 contains a visualization of the probability density function (pdf) of the dataset and the thresholds for the three different output ports.
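The threshold rule described above can be sketched in a few lines. This is an illustrative example, not the code used for the paper: the function names are ours, powers are assumed to be on a linear scale, and the dataset is filled with random placeholder numbers rather than with the output of the device model.

```python
import random
import statistics

def logic_thresholds(powers, low=0.85, high=1.15):
    """Return (t0, t1) for one output port from a list of linear-scale powers."""
    t_m = statistics.median(powers)
    return low * t_m, high * t_m

def to_logic(power, t0, t1):
    """Map a power to 0, 1, or None when it falls in the undefined band."""
    if power < t0:
        return 0
    if power > t1:
        return 1
    return None

# Placeholder data: random numbers standing in for simulated output powers.
# Rows are dataset entries, columns are the three output ports.
rng = random.Random(0)
dataset = [[rng.random() for _ in range(3)] for _ in range(10_000)]

# One (t0, t1) pair per output port.
thresholds = [logic_thresholds([row[p] for row in dataset]) for p in range(3)]
```

Powers falling between t0,p and t1,p are deliberately left unclassified (None here), which is what gives the two-threshold scheme its margin against noise.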

Fig. 8: Power datasets generated for the definition of the logic thresholds.

Visualization of the probability density function of the dataset for the three output ports and the associated logic thresholds (black dashed lines). The green background represents the values considered logic 1s, the blue background the values considered logic 0s. The powers on the x-axis are expressed in decibels for better graphical clarity.

Once the logic thresholds have been defined, we can test the capabilities of the reference circuit as a programmable logic gate by means of the proposed model. After choosing the desired logic functions (potentially including don't-care (X) terms), we need to find the proper set of nine Vin voltages that allows the device to produce the correct truth tables. This could be achieved with a properly trained Machine Learning agent42 or using an optimization routine36. We opted for the latter and, to speed up the computation, we preliminarily searched our 6 × 10⁶ entry dataset for the combination of voltages that best approximates the desired truth table.
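The preliminary dataset search can be illustrated as follows. This is a hedged sketch of one possible scoring scheme, not the authors' implementation: the data layout (a list of (voltages, powers) pairs), the function names, and the choice of counting mismatched bits are all assumptions made for the example. Don't-care terms are encoded as None and are skipped when scoring.

```python
def mismatches(powers, target, thresholds):
    """Count target bits that one dataset entry fails to reproduce.

    powers[k][p]  : power at output port p for the k-th input combination
    target[k][p]  : 0, 1, or None for a don't-care (X) term
    thresholds[p] : (t0, t1) pair for output port p
    """
    count = 0
    for row_powers, row_target in zip(powers, target):
        for power, want, (t0, t1) in zip(row_powers, row_target, thresholds):
            if want is None:      # don't-care: any output is accepted
                continue
            got = 0 if power < t0 else 1 if power > t1 else None
            if got != want:       # wrong bit, or power in the undefined band
                count += 1
    return count

def best_entry(dataset, target, thresholds):
    """Return the (voltages, powers) entry with the fewest mismatches."""
    return min(dataset, key=lambda entry: mismatches(entry[1], target, thresholds))
```

An entry with zero mismatches already implements the requested truth table; otherwise the best entry can serve as the starting point for the optimization routine.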

Table 2 contains the logic functions that have been tested. In multiple cases, the solution is not unique and multiple sets of voltages allow the implementation of the same desired functions. Moreover, the same set of functions can be obtained on multiple permutations of the outputs; for instance, with reference to case 2, it is possible to obtain the logic AND on port 1 and the logic OR on port 2, and vice versa. The second to last column of the table indicates whether a combination of Vin was found capable of producing the requested functions, either as listed in the table or with permutations of the output ports. It should be noted that not all the cases analyzed can be successfully implemented. As an example, it is not possible to negate the 3 inputs at the same time on the 3 outputs (case 9): if all input signals are 0s, no optical power enters the device, so no power can be obtained at any output. However, the simple 3 × 3 device allows us to implement basic logic functions (AND, OR, XOR, NAND, NOR, sum of products and product of sums), to negate the signals at ports 1 and 2 when a logical 1 is applied to port 3, to program half and full adders (between port 1 and port 2, with carry-in on port 3), and to compute the two's complement of the 2-bit and 3-bit numbers at the input. Finally, it is possible to obtain a set-up in which we compute the logic AND of the signals at input ports 1 and 2 if the signal at port 3 is true, or the logic OR otherwise, using the optical signal at port 3 to decide which operation must be performed. These results show the versatility of the device.

Table 2 List of functions tested on the reference device

In order to validate the robustness of our findings with respect to uncertainties on the applied voltages, we performed a series of Monte Carlo simulations. For each successful case listed in Table 2, we ran 10⁶ simulations applying random perturbations to the 9 nominal voltages previously determined. The perturbations are generated uniformly in the range ±5% of Vnom; for each run, we verify whether the same truth table is obtained. For the cases in which ±5% of Vnom did not always produce the correct output, we also tested ±2% and, if necessary, ±1% of Vnom, still compatible with standard electronic equipment. The rightmost column of Table 2 contains the maximum tolerance that yields correct truth tables in all 10⁶ cases, despite the perturbation on the input voltages. For all the working logic functions a robustness range has been found, which suggests not only that the device can be programmed to perform the tested operations, but also that it is stable enough to maintain the result despite noisy fluctuations of the electrical control signals.
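The Monte Carlo procedure can be sketched as below. This is a minimal illustration under stated assumptions: `model` stands for any callable that maps a list of voltages to a truth table, `toy_model` is a hypothetical stand-in with no physical meaning, and the default number of runs is reduced from the 10⁶ used in the text to keep the example fast.

```python
import random

def is_robust(model, v_nom, reference, tolerance, n_runs=10_000, seed=0):
    """Return True if every perturbed run reproduces the reference truth table.

    Each voltage is scaled by a factor drawn uniformly in [1 - tol, 1 + tol].
    """
    rng = random.Random(seed)
    for _ in range(n_runs):
        perturbed = [v * (1 + rng.uniform(-tolerance, tolerance)) for v in v_nom]
        if model(perturbed) != reference:
            return False
    return True

def max_tolerance(model, v_nom, tolerances=(0.05, 0.02, 0.01), **kwargs):
    """Largest tested tolerance that always preserves the nominal result, or None."""
    reference = model(v_nom)
    for tol in tolerances:   # tested from the widest to the narrowest range
        if is_robust(model, v_nom, reference, tol, **kwargs):
            return tol
    return None

# Hypothetical stand-in for the device model: one output bit per voltage.
def toy_model(voltages):
    return tuple(int(v > 1.0) for v in voltages)

print(max_tolerance(toy_model, [1.5, 0.5]))
```

Testing the tolerances from widest to narrowest mirrors the procedure in the text, where ±2% and ±1% are tried only when ±5% fails.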

Conclusions

We proposed a method to develop a comprehensive model describing MZI-based meshed photonic topologies. The model includes effects which are essential for the proper description of the circuit, accounting for physical properties and the fabrication variations, and it accurately captures parasitic effects such as thermal crosstalk, a key limitation in densely integrated photonic circuits. To validate the predictions of the model, we compared the simulated results with the experimental data from a real 3 × 3 mesh of MZIs: the excellent agreement highlights the effectiveness of our approach even in the presence of strong thermal crosstalk.

Subsequently, the validated model was used to determine the control voltages to operate the reference device as a programmable logic circuit to implement a set of user-defined logic functions. Furthermore, we assessed the robustness of these logic operations against applied voltage fluctuations, which confirmed the reliability of the proposed approach.

This work highlights the need for accurate modeling of integrated circuits for photonic computing applications and offers a foundation for the scalable design and optimization of PICs for next-generation telecommunications and high-performance computing.