Comprehensive model of MZI-based circuits for photonic computing applications

Marchisio, Andrea; Da Ros, Francesco; Curri, Vittorio; Carena, Andrea; Bardella, Paolo

doi:10.1038/s42005-025-02176-0


Article
Open access
Published: 03 July 2025


Andrea Marchisio ORCID: orcid.org/0009-0002-8576-8424¹,
Francesco Da Ros²,
Vittorio Curri¹,
Andrea Carena¹ &
…
Paolo Bardella¹

Communications Physics volume 8, Article number: 277 (2025)


Abstract

Photonic computing leverages the intrinsic advantages of photonic integrated circuits, including enhanced parallelism through wavelength, polarization, and mode division multiplexing, reduced power consumption, ultra-high operational speeds, and compatibility with silicon technology. We present a comprehensive circuit model for Mach-Zehnder interferometer (MZI)-based meshed topologies that accurately predicts the behavior of fabricated devices and can be used for the efficient design of such devices. Our proposed model incorporates both essential physical effects and parasitic phenomena, such as thermal crosstalk, that significantly influence device performance, thus enabling more realistic and accurate predictions of the device behavior, especially in densely integrated photonic circuits. By validating the model against the measured data of a fabricated device, we demonstrate its ability to reproduce the experimental evidence with high accuracy. Finally, we showcase the use of our approach in practical photonic computing scenarios, employing our model to program the MZI control voltages to implement specific logic functions on the reference device.


Introduction

The rapid evolution of electronic technology is approaching a critical turning point, marked by two dominant trends. On the one hand, progress on the very large-scale integration (VLSI) front has been slowing down, deviating from Moore’s law predictions¹ due to the fundamental physical limits of FET technology, including thermal and quantum effects². On the other hand, the rise of cutting-edge applications, such as data-driven artificial intelligence (AI)³ and quantum computing⁴, has led to unprecedented demands for integration scale and computational power. As an example, the computational requirements for training state-of-the-art AI systems have been doubling approximately every 3.5 months⁵.

As traditional electronic systems reach their physical limits, alternative paradigms are vital to sustain innovation in computational capabilities. Among other solutions, photonic integrated circuits (PICs) show great promise in facing these challenges and are rapidly becoming economically viable⁶. Although PICs are a less mature technology than traditional CMOS-based integrated electronics, they offer a series of advantages inherent to photonic platforms, such as increased parallelism (exploiting wavelength, polarization, and mode division multiplexing), reduced power consumption, ultrahigh operating speeds, and compatibility with the silicon industry^6,7,8,9, all of which could open the path to new breakthroughs in the aforementioned applications. Indeed, multiple examples of photonic quantum devices have been proposed^10,11 and numerous implementations of photonic accelerators for AI are reported in the literature^12,13,14,15,16. Moreover, with the advances in the co-integration of photonic components and electronic tuning elements¹⁷, there are many examples of programmable photonic devices based on meshes of Mach-Zehnder interferometers (MZIs) that can be employed in multiple applications, ranging from optical computing to photonic quantum computing^18,19, with a strategy similar to CMOS multipurpose FPGA devices.

In this general context, the goal of this work is to develop a comprehensive circuit model for MZI-based meshed topologies that can be employed for reliable simulations of photonic computing or neuromorphic circuits based on this technology. In particular, we include not only propagation effects and losses, but also parasitic phenomena, such as thermal crosstalk, which, if not properly accounted for, can significantly impact the performance of a photonic processor or the accuracy of a trained photonic neural network (PNN)^20,21. Our methodology offers a computationally efficient alternative to full-scale multi-physics simulations, making it feasible to model and optimize larger photonic networks without compromising accuracy.

In order to benchmark the proposed model, we first apply it to the description of a 3 × 3 mesh of MZIs that can be employed as a programmable photonic unit^21,22. The simulated results obtained with the proposed model are compared with the measurements performed on the actual device to highlight the accuracy of the model itself. The proposed methodology can be easily generalized to model and program arbitrarily-sized MZI-based mesh devices. After the characterization of the 3 × 3 reference device, in order to highlight the versatility of our model, we show a possible application to the offline programming of the same circuit: using the validated model, we are able to determine the MZI voltages needed to implement various user-defined logic functions, also demonstrating the resilience to fluctuations of control signals.

Methods

Reference circuit and technology

The device that we have considered for this analysis is a mesh of nine interconnected 2 × 2 MZIs, with three input ports and three output ports. Its layout is shown in Fig. 1a. Each MZI, whose structure is represented in Fig. 1b, comprises two 2 × 2 multi-mode interferometers (MMIs) connected by two ~267 μm long arms. The effective refractive index on the internal arms is regulated by means of a voltage-controlled thermal phase shifter on the upper arm, implemented with a titanium strip.

**Fig. 1: Masks of the reference circuit.**

The meshed topology shown in Fig. 1a is part of a larger PIC, developed at the Technical University of Denmark and used as a 7 × 7 reconfigurable optical switch for C-band operations²². The PIC is designed on a silicon-on-insulator (SOI) platform, with a buried aluminum mirror produced via flip bonding²². A detailed schematic of the device layers is reported in Fig. 2a. The cross sections of the Si waveguides and the Ti heaters are 0.5 μm × 0.25 μm and 1.8 μm × 0.1 μm, respectively, as shown in Fig. 2b. Moreover, the heaters are 100 μm long, with additional 40 μm × 20 μm pads used to apply the driving voltage, without introducing additional heating thanks to their large cross section and correspondingly low resistance. From Fig. 2, it can be seen that the Ti heaters and the Si waveguides are separated by a 1 μm layer of SiO₂, used to mitigate the absorption losses that arise from the proximity between the Si waveguide and the metal plate.

**Fig. 2: Photonic integrated circuit platform technology.**

All MZIs were designed to work in the cross state at 0 V and in the bar state at 2 V. However, it can be observed experimentally that devices 6 and 8 display the opposite behavior²³.

As can be seen in Fig. 1a, the nine MZIs of the circuit are interconnected by means of a series of bent waveguides and optical crossings. For an accurate description of the phase change and the losses accumulated between consecutive MZIs, the length of each optical connection must be considered (listed in Supplementary Table 1). The circuit also contains four optical crossings that will be included in our model by means of their insertion loss (IL), assuming negligible optical crosstalk²⁴.

Device modeling

The goal of this section is to capture the physics of the reference 3 × 3 circuit, developing a general methodology of analysis that can be applied to any meshed topology.

The analysis begins by defining the behavioral model for a single MZI. This involves two main elements: (a) a model for light propagation and (b) a model for thermal effects. The latter describes how the changes in temperature, caused by the voltage applied to the heaters, affect the waveguide effective index n_eff and it must account for spurious effects, such as thermal crosstalk between neighboring MZIs.

For the first point, we employ the traditional transmission matrix-based formulation to describe the propagation of the field in the device. The transmission matrices for the input and output MMIs can be expressed exploiting Coupled Mode Theory²⁵:

$${T}_{{{{{\rm{MMI}}}}}_{{{{\rm{in}}}}}}=\left[\begin{array}{cc}{\alpha }_{{{{{\rm{MMI}}}}}_{{{{\rm{in}}}}}}\sqrt{{\gamma }_{{{{\rm{in}}}}1}}&j{\alpha }_{{{{{\rm{MMI}}}}}_{{{{\rm{in}}}}}}\sqrt{1-{\gamma }_{{{{\rm{in}}}}2}}\\ j{\alpha }_{{{{{\rm{MMI}}}}}_{{{{\rm{in}}}}}}\sqrt{1-{\gamma }_{{{{\rm{in}}}}1}}&{\alpha }_{{{{{\rm{MMI}}}}}_{{{{\rm{in}}}}}}\sqrt{{\gamma }_{{{{\rm{in}}}}2}}\end{array}\right]$$

(1)

$${T}_{{{{{\rm{MMI}}}}}_{{{{\rm{out}}}}}}=\left[\begin{array}{cc}{\alpha }_{{{{{\rm{MMI}}}}}_{{{{\rm{out}}}}}}\sqrt{{\gamma }_{{{{\rm{out}}}}1}}&j{\alpha }_{{{{{\rm{MMI}}}}}_{{{{\rm{out}}}}}}\sqrt{1-{\gamma }_{{{{\rm{out}}}}2}}\\ j{\alpha }_{{{{{\rm{MMI}}}}}_{{{{\rm{out}}}}}}\sqrt{1-{\gamma }_{{{{\rm{out}}}}1}}&{\alpha }_{{{{{\rm{MMI}}}}}_{{{{\rm{out}}}}}}\sqrt{{\gamma }_{{{{\rm{out}}}}2}}\end{array}\right]$$

(2)

where ${\alpha }_{{{{{\rm{MMI}}}}}_{{{{\rm{in}}}}}}$ and ${\alpha }_{{{{{\rm{MMI}}}}}_{{{{\rm{out}}}}}}$ are insertion losses for input and output MMIs, respectively, γ_ini = γ_outi ≜ γ_i are the corresponding splitting ratios, defined as the ratio between the output power at port i and the input power at port i, in absence of losses. For a shorter notation, we define four new quantities ${\Gamma }_{11}=\sqrt{{\gamma }_{1}}$, ${\Gamma }_{12}=\sqrt{1-{\gamma }_{2}}$, ${\Gamma }_{21}=\sqrt{1-{\gamma }_{1}}$, and ${\Gamma }_{22}=\sqrt{{\gamma }_{2}}$.

Then, we can describe the propagation through the two arms of the MZI with the following diagonal transfer matrix, taking into account the Ti heater that can modify the effective refractive index.

$${T}_{{{{\rm{prop}}}}}=\left[\begin{array}{cc}{\xi }_{{{{\rm{m}}}}}{\epsilon }_{+}{e}^{j(\frac{2\pi }{\lambda }{n}_{{{{\rm{eff,1}}}}}(T){L}_{{{{\rm{h}}}}}+\delta \varphi )}&0\\ 0&{\alpha }_{{{{\rm{b}}}}}^{2}{e}^{-{\alpha }_{{{{\rm{prop}}}}}L}{\epsilon }_{-}{e}^{j(\frac{2\pi }{\lambda }{n}_{{{{\rm{eff,2}}}}}(T){L}_{{{{\rm{h}}}}}-\delta \varphi )}\end{array}\right]$$

(3)

where ${\xi }_{{{{\rm{m}}}}}={\alpha }_{{{{\rm{b}}}}}^{2}{\alpha }_{{{{\rm{m}}}}}{e}^{-{\alpha }_{{{{\rm{prop}}}}}L}$, α_m is a metal absorption factor, α_b is a bending radiation factor (which appears as a squared quantity since each arm includes two bends, as can be seen in Fig. 1b), α_prop is the propagation loss coefficient of the waveguide, L is the total length of the arm, L_h is the heater length, and λ is the signal wavelength. The terms ${\epsilon }_{\pm }={e}^{j(\frac{2\pi }{\lambda }{n}_{{{{{\rm{eff}}}}}_{{{{\rm{0}}}}}}(L-{L}_{{{{\rm{h}}}}})\pm \delta \varphi )}$, with ${n}_{{{{{\rm{eff}}}}}_{{{{\rm{0}}}}}}$ the effective refractive index at room temperature T = T₀ = 293 K, introduce the optical phase accumulated in the portion of the arms that is not covered by the electrode. The remaining quantities introduced in (3) will now be discussed.

The effective refractive indices on the two arms of the MZI n_eff,1 and n_eff,2 are functions of the temperature T due to the action of the thermal phase shifters. The temperature of the upper arm, placed directly below the metal pad, is modified, but due to the lack of insulation trenches²² and the limited distance between the two waveguides, the temperature of the lower arm is also affected. Even if the latter variation is smaller than the temperature change in the upper waveguide, this thermal crosstalk can significantly affect the behavior of the single MZI and the whole device. In fact, thermal crosstalk is one of the main limitations for the large integration of devices in PICs²⁶ and can strongly affect the accuracy of PNNs²¹ or the programmability of meshed topologies¹⁹.

Moreover, the transmission matrix presented in (3) formally takes into account losses due to metal absorption as a result of the proximity of the upper waveguide to the Ti heater and the metallic pad. Due to the inclusion of the additional SiO₂ layer between the waveguides and heaters discussed in the “Methods” section, the absorption coefficient is set to α_m = 1, but it has been included to offer a general description that can be used on technological platforms where this type of loss is not mitigated.

The term δφ represents a phase offset and is introduced to better describe the real behavior of each MZI. Indeed, this parameter allows us to accommodate for possible fabrication uncertainties that could result in spurious n_eff shifts and, consequently, in different working points for each MZI. This term, together with the presence of unbalanced splitting ratios, allows us to capture the spurious optical transmission on the opposite port that can be observed when devices are in bar or cross state²⁷ without having to implement more sophisticated models, where, for instance, optical crosstalk can be modeled statistically²⁸.

The matrices ${T}_{{{{{\rm{MMI}}}}}_{{{{\rm{in}}}}}}$, T_prop, and ${T}_{{{{{\rm{MMI}}}}}_{{{{\rm{out}}}}}}$ can be used to compute the total transfer matrix of the MZI by means of a matrix multiplication, corresponding to the cascade of the constituent blocks²⁹:

$$T={T}_{{{{{\rm{MMI}}}}}_{{{{\rm{out}}}}}}{T}_{{{{\rm{prop}}}}}{T}_{{{{{\rm{MMI}}}}}_{{{{\rm{in}}}}}}$$

(4)

Finally, the fields ${E}_{1}^{{{{\rm{out}}}}}$ and ${E}_{2}^{{{{\rm{out}}}}}$ at the two output ports of the single MZI can be computed as functions of the input fields by multiplying the input field components by the transmission matrix obtained:

$$\left[\begin{array}{c}{E}_{1}^{{{{\rm{out}}}}}\\ {E}_{2}^{{{{\rm{out}}}}}\end{array}\right]=T\left[\begin{array}{c}{E}_{1}^{{{{\rm{in}}}}}\\ {E}_{2}^{{{{\rm{in}}}}}\end{array}\right]=\left[\begin{array}{cc}{T}_{11}&{T}_{12}\\ {T}_{21}&{T}_{22}\end{array}\right]\left[\begin{array}{c}{E}_{1}^{{{{\rm{in}}}}}\\ {E}_{2}^{{{{\rm{in}}}}}\end{array}\right]$$

(5)
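The cascade of Eqs. (1)-(5) can be sketched numerically as follows. This is a minimal illustration, not the authors' implementation: it assumes identical input and output MMIs, and the function and argument names (`mmi_matrix`, `mzi_matrix`, `loss_upper`, `loss_lower`) are hypothetical, with the arm losses lumped into two scalar field factors.

```python
import numpy as np

def mmi_matrix(gamma1, gamma2, alpha=1.0):
    """2x2 MMI transmission matrix in the form of Eqs. (1)-(2)."""
    return alpha * np.array([
        [np.sqrt(gamma1), 1j * np.sqrt(1.0 - gamma2)],
        [1j * np.sqrt(1.0 - gamma1), np.sqrt(gamma2)],
    ])

def mzi_matrix(phi_upper, phi_lower, gamma1=0.5, gamma2=0.5,
               alpha_mmi=1.0, loss_upper=1.0, loss_lower=1.0):
    """Total MZI matrix T = T_MMI_out @ T_prop @ T_MMI_in, as in Eq. (4).

    phi_upper / phi_lower are the total optical phases of the two arms;
    loss_upper / loss_lower lump the field attenuation of each arm
    (hypothetical argument names, not taken from the paper).
    """
    t_mmi = mmi_matrix(gamma1, gamma2, alpha_mmi)
    t_prop = np.diag([loss_upper * np.exp(1j * phi_upper),
                      loss_lower * np.exp(1j * phi_lower)])
    return t_mmi @ t_prop @ t_mmi

def output_powers(t, e_in):
    """Output power at the two ports for the input field vector e_in (Eq. (5))."""
    return np.abs(t @ np.asarray(e_in, dtype=complex)) ** 2

# Light on input 1 only, ideal lossless and balanced MZI:
# zero phase difference gives the cross state, pi gives the bar state.
p_cross = output_powers(mzi_matrix(0.0, 0.0), [1.0, 0.0])
p_bar = output_powers(mzi_matrix(np.pi, 0.0), [1.0, 0.0])
```

In this idealized limit the power is fully routed to output 2 for equal arm phases and to output 1 for a phase difference of π; unbalanced splitting ratios or unequal arm losses leave a residual power on the opposite port.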

Through straightforward matrix calculations, we can expand Eq. (5) to obtain two expressions that can be efficiently evaluated numerically. In this case, we retrieve the following output field equations:

$${E}_{1}^{{{{\rm{out}}}}}={\xi }_{1}{\Gamma }_{11}^{2}{p}_{1}{E}_{1}^{{{{\rm{in}}}}}-{\xi }_{2}{\Gamma }_{12}{\Gamma }_{21}{p}_{2}{E}_{1}^{{{{\rm{in}}}}}+j{\xi }_{1}{\Gamma }_{11}{\Gamma }_{12}{p}_{1}{E}_{2}^{{{{\rm{in}}}}}+j{\xi }_{2}{\Gamma }_{12}{\Gamma }_{22}{p}_{2}{E}_{2}^{{{{\rm{in}}}}}$$

(6)

$${E}_{2}^{{{{\rm{out}}}}}=j{\xi }_{1}{\Gamma }_{21}{\Gamma }_{11}{p}_{1}{E}_{1}^{{{{\rm{in}}}}}+j{\xi }_{2}{\Gamma }_{22}{\Gamma }_{21}{p}_{2}{E}_{1}^{{{{\rm{in}}}}}-{\xi }_{1}{\Gamma }_{21}{\Gamma }_{12}{p}_{1}{E}_{2}^{{{{\rm{in}}}}}+{\xi }_{2}{\Gamma }_{22}^{2}{p}_{2}{E}_{2}^{{{{\rm{in}}}}}$$

(7)

where, for brevity’s sake, the following quantities were introduced:

$${\xi }_{1}={\alpha }_{{{{{\rm{MMI}}}}}_{{{{\rm{in}}}}}}{\alpha }_{{{{{\rm{MMI}}}}}_{{{{\rm{out}}}}}}{\alpha }_{{{{\rm{m}}}}}{\alpha }_{{{{\rm{b}}}}}^{2}{e}^{-{\alpha }_{{{{\rm{prop}}}}}L}$$

$${\xi }_{2}={\alpha }_{{{{{\rm{MMI}}}}}_{{{{\rm{in}}}}}}{\alpha }_{{{{{\rm{MMI}}}}}_{{{{\rm{out}}}}}}{\alpha }_{{{{\rm{b}}}}}^{2}{e}^{-{\alpha }_{{{{\rm{prop}}}}}L}$$

$${p}_{1}={e}^{j(\frac{2\pi }{\lambda }{n}_{{{{\rm{eff}}}}}(T({V}_{{{{\rm{in}}}}})){L}_{{{{\rm{h}}}}}+\delta \varphi )}{e}^{j(\frac{2\pi }{\lambda }{n}_{{{{{\rm{eff}}}}}_{{{{\rm{0}}}}}}(L-{L}_{{{{\rm{h}}}}})+\delta \varphi )}$$

$${p}_{2}={e}^{j(\frac{2\pi }{\lambda }{n}_{{{{{\rm{eff}}}}}_{{{{\rm{0}}}}}}L-\delta \varphi )}$$

At this point, in cases where ξ₁ = ξ₂ and ${\Gamma }_{ij}=\sqrt{0.5},\,i,j=1,2$, we would be able to exploit the prosthaphaeresis formulae to analytically obtain a closed-form solution proportional to $\cos (\Delta \phi )$, where Δϕ is the phase difference between the two arms of the MZI. However, with the presented formulation, where unbalanced MMI splitting ratios and metal absorption loss α_m are considered, it is not possible to easily obtain this result analytically, but we can still expect the sinusoidal-like behavior inherent in the physics of MZIs.
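For reference, the balanced limit mentioned above can be written out explicitly. As a sketch, assume ξ₁ = ξ₂ = ξ and Γ_ij = 1/√2, light on input 1 only (E₂^in = 0), and define Δϕ through p₁ = p₂e^{jΔϕ}; the symbol ξ without subscript is introduced here only for this illustration. The output fields then reduce to

$$E_{1}^{\rm out}=\frac{\xi}{2}\left(p_{1}-p_{2}\right)E_{1}^{\rm in},\qquad E_{2}^{\rm out}=j\frac{\xi}{2}\left(p_{1}+p_{2}\right)E_{1}^{\rm in}$$

and, since |p₁| = |p₂| = 1, the corresponding power ratios are

$$\frac{|E_{1}^{\rm out}|^{2}}{|E_{1}^{\rm in}|^{2}}=\frac{\xi^{2}}{2}\left[1-\cos (\Delta \phi )\right],\qquad \frac{|E_{2}^{\rm out}|^{2}}{|E_{1}^{\rm in}|^{2}}=\frac{\xi^{2}}{2}\left[1+\cos (\Delta \phi )\right]$$

With Δϕ = 0 all the transmitted power exits from port 2 (cross state), while Δϕ = π routes it to port 1 (bar state).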

To estimate the values of the various parameters previously described, we simulated the structure of the constituent components in RSoft™ CAD³⁰, employing the Finite-Difference Time-Domain (FDTD) method. In this way, we obtained the bending radiation term for the curved sections of the MZIs, the IL for the waveguide crossings, and the IL and the coupling factors of the MMIs. The values of these parameters at different wavelengths are reported in Supplementary Table 2. The MZI parameters depend weakly on the wavelength over the considered range, implying a wide band of operation for the device. Note that, in these simulations, we neglected all non-linear effects (e.g., two-photon absorption), assuming that the device always operates at sufficiently low input power levels³¹.

At this point, we can introduce a model for the thermally-controlled MZIs, including thermal crosstalk.

The main thermal effect to be included, of course, consists in the thermal control of the MZIs by means of Ti microheaters. A voltage is applied to the Ti strip that will heat up as a result of the Joule effect and, consequently, will increase the temperature in the waveguide underneath. This induces a change of the n_eff of the waveguide and a subsequent phase difference between the two arms of the interferometer. Following a well-known approach³², we can express the dependence of the effective refractive index with respect to temperature T introducing a first-order Taylor expansion:

$${n}_{{{{\rm{eff}}}}}(T)={n}_{{{{\rm{eff}}}}}({T}_{0})+{\left.\frac{{{{\rm{d}}}}{n}_{{{{\rm{eff}}}}}}{{{{\rm{d}}}}T}\right| }_{{T}_{0}}(T-{T}_{0})$$

(8)

The derivative of n_eff with respect to temperature is calculated starting from the n_eff(T) curves at different temperatures obtained with the RSoft™ simulations of a single waveguide (see Supplementary Note 2). This derivative amounts to 1.9832 × 10⁻⁴ K⁻¹ at λ = 1550 nm, which is in line with the values reported in literature for waveguides with similar cross sections³³. Since the n_eff change is caused by the temperature difference ΔT = T − T₀ induced by V_in, we need a simple way to relate the increase in temperature in the waveguide to the applied voltage on the Ti strip. In order to integrate this into a larger simulation framework and achieve an accurate representation of thermal effects without relying on complex analytical or numerical models, we chose to use targeted COMSOL Multiphysics® simulations of a simplified system. This approach, with the appropriate strategy, can be generalized to represent more complex configurations, such as the one under study.

In particular, in COMSOL, we simulated a 3D system made of 6 parallel Si waveguides deposited on a 1 mm × 1 mm × 200 mm SiO₂ substrate and covered by a SiO₂ cladding, on top of which 3 Ti microheater strips are located, as indicated by the dashed rectangle in Fig. 1. We assume that, in the circuit under test (Fig. 1a), thermal effects are relevant only for devices having the same position y and are otherwise negligible, since the distance between the arms is much smaller than the distance between different MZIs in the y direction; we can thus estimate the thermal crosstalk by computing the total spatial variation of the temperature due to three vertically stacked MZIs. For the simulations, we considered the Joule effect for the heating of the electrodes, convection between solids for the propagation of the temperature in the geometry, and the linear resistivity model for the Ti strips. The simulation incorporated the following boundary conditions (BCs): the top surface of the chip exchanges heat with the surrounding air by convection (Robin BCs³⁴), the sides are treated as adiabatic without external heat exchange (homogeneous Neumann BCs^34,35), and the substrate is maintained at ambient temperature by an ideal Peltier cell (Dirichlet BCs^34,35). In Fig. 3a, we show a screenshot of the simulated system in COMSOL Multiphysics®. Fig. 3b shows the structure of the considered layers in the transverse plane.

**Fig. 3: COMSOL Multiphysics® simulations.**

With this strategy, we are able to compute the temperature variation with respect to position for the single MZI, when changing the applied V_in. Fig. 3c represents the temperature difference from room temperature (20 °C) when a voltage V_in,2 is applied to the central heater while the other ones are grounded. For instance, let us consider V_in,2 = 2 V: in this case, we observe that the temperature in the waveguide below that heater increases by ~52 °C, but also that, in the lower arm of the same MZI, the temperature variation is ~16 °C (Fig. 3c, d). This equates to a reduced thermal tuning efficiency, as the optical path variation is proportional to the difference in temperature between the two arms (ΔT_MZI,2 = 36 °C). Even more importantly, there is also a significant temperature variation between the waveguides of MZI 1 (ΔT_MZI,1 = 5 °C) and MZI 3 (ΔT_MZI,3 = 2 °C), despite the fact that both have grounded heaters: this is thermal crosstalk. Despite the possibility of mitigating it with a larger separation between the waveguides of the same MZI or between different MZIs, or with insulation trenches, these solutions would imply either lower integration density or increased fabrication complexity.
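The numbers quoted above can be turned into phase shifts with Eq. (8). The following sketch uses only values stated in the text (dn_eff/dT at 1550 nm, the 100 μm heater length, and the simulated temperature rises); the function names are hypothetical, and the phase is evaluated over the heated length only, which is an assumption of this illustration.

```python
import math

DN_DT = 1.9832e-4        # dn_eff/dT at 1550 nm, in 1/K (value quoted in the text)
WAVELENGTH = 1550e-9     # signal wavelength, in m
HEATER_LENGTH = 100e-6   # heater length L_h, in m

def delta_n_eff(delta_t):
    """Effective index change from the first-order expansion of Eq. (8)."""
    return DN_DT * delta_t

def arm_phase_difference(dt_upper, dt_lower):
    """Phase difference (rad) between the two arms over the heated length."""
    dn = delta_n_eff(dt_upper) - delta_n_eff(dt_lower)
    return 2.0 * math.pi / WAVELENGTH * dn * HEATER_LENGTH

# MZI 2 driven at 2 V: upper arm +52 K, lower arm +16 K (36 K difference).
dphi_driven = arm_phase_difference(52.0, 16.0)
# Same heating if no heat leaked into the lower arm (idealized comparison).
dphi_ideal = arm_phase_difference(52.0, 0.0)
# Grounded neighbors: only the 5 K and 2 K arm-to-arm differences matter.
dphi_mzi1 = arm_phase_difference(5.0, 0.0)
dphi_mzi3 = arm_phase_difference(2.0, 0.0)
```

Under these assumptions the 36 K difference corresponds to roughly 0.92π, close to the π shift needed to switch between cross and bar states, while the 5 K crosstalk on MZI 1 alone already amounts to about 0.4 rad.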

The COMSOL simulations can be employed to compute the temperature variation when the lateral MZIs are turned on separately or when multiple MZIs are turned on at the same time, which is fundamental to describe a realistic use of the device. A trivial approach would require one simulation for each set of voltages applied to the three electrodes, but this solution would be, of course, excessively time consuming. Instead, we decided to exploit the curves computed for a single heater (Fig. 3c). For each MZI k = 1, 2, 3, we select the curve at the correct V_in,k and shift it in the x direction (for the lateral MZIs). The three contributions are summed to approximate the complete spatial temperature distribution. The same approach is used for the two other groups of MZIs present in the device.
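The shift-and-sum procedure described above can be sketched as follows. The single-heater curve used here is a hypothetical analytic stand-in (a Lorentzian whose peak scales with V², chosen so that 2 V gives about 52 K), not the COMSOL data; the heater positions and all function names are also assumptions made for illustration.

```python
import numpy as np

def single_heater_profile(x, voltage, peak_per_v2=13.0, width=3e-6):
    """Hypothetical stand-in for one simulated curve: temperature rise (K)
    versus lateral position x (m) for a heater centred at x = 0.
    The Lorentzian shape and the V^2 scaling are assumptions."""
    return peak_per_v2 * voltage**2 / (1.0 + (x / width) ** 2)

def total_temperature_rise(x, voltages, heater_centres):
    """Sum the single-heater curves, each shifted to its heater position."""
    x = np.asarray(x, dtype=float)
    total = np.zeros_like(x)
    for v, x0 in zip(voltages, heater_centres):
        total += single_heater_profile(x - x0, v)
    return total

# Three heaters at hypothetical positions, driven at 1 V, 2 V and 1 V.
centres = np.array([-50e-6, 0.0, 50e-6])
x = np.linspace(-100e-6, 100e-6, 2001)
delta_t = total_temperature_rise(x, [1.0, 2.0, 1.0], centres)
```

In practice the stand-in function would be replaced by an interpolation of the simulated single-heater curves at the required voltage, and `delta_t` would be sampled at the waveguide positions.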

Figure 4 depicts the spatial distribution of the temperature variation with 1 V, 2 V, and 1 V applied to the three heaters, respectively, compared to the actual COMSOL simulation of the system represented in Fig. 3a: it is evident that the results obtained with our procedure accurately reproduce the COMSOL thermal simulations. In this way, we can sample the temperature change ΔT in the positions corresponding to each waveguide and use these values in Eq. (8) to compute the n_eff variation. This approximation, based on the sum of three contributions, holds because the heat sources are far from the box borders in the x direction; otherwise, the adiabatic BCs would no longer be a valid assumption and would affect the result.

**Fig. 4: Example of spatial distribution of temperature variation.**

With this description, we are able to create a model that can rapidly compute the response of a meshed MZI-based topology, including multiple effects that would otherwise require time-consuming multi-physics simulations.

Results and discussion

Validation with experiments

In this section, we will validate the model previously described by comparing the simulated results with measurements of the actual device. For this purpose we employ a set of measured output-input power ratios. These power ratios are measured as follows: for each pair of input-output ports, a broadband signal is injected into one of the input ports and each MZI is switched gradually, by spanning its input voltage from 0 V to 2 V in steps of 0.1 V, while all the other MZIs are grounded. Figure 5a reports an example of raw measured data^21,23: amplified spontaneous emission (ASE) is injected into input port 1 and measured with an optical spectrum analyzer (OSA) at output port 2²³, for various values of V_in,1 applied to MZI 1. The flat measured responses confirm the wideband properties of the device. To have a reference that is more robust to noise and to simplify the subsequent analysis, we averaged the spectra over the 1540 nm-1555 nm range (gray box in Fig. 5a), leading to the power ratio curve shown in Fig. 5b.

**Fig. 5: Power ratio spectral measurements.**

By repeating the process and applying an input voltage to each MZI in sequence, we obtain other averaged curves similar to Fig. 5b. These curves can be concatenated into a single trace such as the one reported in Fig. 6. The concatenated curve P_dB has been scaled by mapping its maximum value P_max (−4.8 dB) to 1 and the value P_ref measured with all driving voltages set to zero (−17.7 dB) to 0:

$${P}_{{{\rm{scaled}}}}=\frac{{P}_{{{\rm{dB}}}}-{P}_{{{\rm{ref}}}}}{{P}_{{{\rm{max}}}}-{P}_{{{\rm{ref}}}}}$$

(9)

**Fig. 6: Power ratio curve and associated optical path.**

This scaling operation allows for easier comparison with our simulation results, as experimental measurements may include additional optical losses (e.g., measurement setup losses) that are not accounted for in our model.
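As a small worked example, the scaling of Eq. (9) can be applied to the two reference values quoted in the text; the function name and the intermediate sample value are hypothetical.

```python
import numpy as np

def scale_power_ratio(p_db, p_ref_db, p_max_db):
    """Eq. (9): map P_ref to 0 and P_max to 1 (all quantities in dB)."""
    p_db = np.asarray(p_db, dtype=float)
    return (p_db - p_ref_db) / (p_max_db - p_ref_db)

# P_ref = -17.7 dB and P_max = -4.8 dB are the values quoted in the text;
# -11.25 dB is an arbitrary midpoint added for illustration.
scaled = scale_power_ratio([-17.7, -11.25, -4.8], p_ref_db=-17.7, p_max_db=-4.8)
```

Because the scaling is applied to values in dB, the result is a normalized logarithmic quantity rather than a linear power fraction.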

This kind of measured data is interesting because it gives clear evidence of the effect of thermal crosstalk on the response of the circuit. Considering Fig. 6, three main contributions to the P_out/P_in curve can be identified, corresponding to the three MZIs that are located on the optical path from input 1 to output 2, namely MZIs 1, 4, and 8. First, all curves start from the same P_out/P_in value, corresponding to the case with all MZIs grounded (indicated by the red marker in Fig. 6). In this condition, MZI 1 is in the cross state, so P_in,1 is mostly routed to MZI 5, except for a small portion due to the non-ideal behavior of the MZIs, as already discussed in the previous section. Since MZIs 5 and 7 are in the cross state as well, the majority of P_in,1 reaches output 3. When a voltage is applied to MZI 1, this device switches to the bar state, routing P_in,1 to output 2. Similarly, power is routed away from output 2 as MZI 4 goes from the cross to the bar state and MZI 8 from bar to cross (as stated in the “Methods” section, MZIs 6 and 8 are in the bar state when grounded²³). However, it is also evident from the experiments that MZI 2, although not on the direct light path connecting input 1 to output 2, has an effect: due to the action of the heater of MZI 2 on the waveguides of MZI 1, the latter is pushed further into the cross state, thus drawing power away from output 2. This is one instance of the effect of thermal crosstalk and, since it is evident even when driving a single MZI with a single input, it will have an even larger impact when a circuit is used at full capacity.

In order to improve the match with the experiments by accounting for process variations, the phase correction terms δφ introduced in Eq. (3) are now adjusted for each MZI. This can be done with an optimization procedure, for example, using the Particle Swarm Optimization (PSO) method³⁶. PSO is an optimization algorithm based on the social interaction between agents called “particles”, which move within an N-dimensional solution space (N = 9 is the number of δφ parameters to be tuned), with the goal of minimizing an error measurement (called “fitness”)³⁷. This fitness parameter is a measurement of the quality of the solution found by each particle and, for this particular application, it was calculated as the mean squared error (MSE) between the experimental power ratios (target of the optimization) and the ones obtained by simulating the circuit with the set of δφ parameters found by each particle, at each iteration of the algorithm. Thanks to the movement rules of the particles³⁷, the algorithm is able to converge to a solution that minimizes fitness, which allows us to obtain a set of nine δφ parameters giving us an accurate match of the experimental target.
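The fitting loop described above can be illustrated with a minimal global-best PSO. This is a sketch under stated assumptions, not the authors' code: the swarm settings (30 particles, inertia 0.7, acceleration coefficients 1.5) are generic choices, the circuit is replaced by a toy model of three independent ideal MZIs whose drive phase is assumed proportional to V², and the "true" offsets are invented for the example.

```python
import numpy as np

def pso(fitness, dim, bounds=(-np.pi, np.pi), n_particles=30, n_iter=200,
        inertia=0.7, c_personal=1.5, c_global=1.5, seed=0):
    """Minimal global-best PSO; returns (best_position, best_fitness)."""
    rng = np.random.default_rng(seed)
    lo, hi = bounds
    pos = rng.uniform(lo, hi, size=(n_particles, dim))
    vel = np.zeros_like(pos)
    best_pos = pos.copy()                              # personal bests
    best_fit = np.array([fitness(p) for p in pos])
    g = np.argmin(best_fit)
    g_pos, g_fit = best_pos[g].copy(), best_fit[g]     # global best
    for _ in range(n_iter):
        r1 = rng.random(pos.shape)
        r2 = rng.random(pos.shape)
        vel = (inertia * vel
               + c_personal * r1 * (best_pos - pos)
               + c_global * r2 * (g_pos - pos))
        pos = np.clip(pos + vel, lo, hi)               # keep particles in bounds
        fit = np.array([fitness(p) for p in pos])
        improved = fit < best_fit
        best_pos[improved] = pos[improved]
        best_fit[improved] = fit[improved]
        g = np.argmin(best_fit)
        if best_fit[g] < g_fit:
            g_pos, g_fit = best_pos[g].copy(), best_fit[g]
    return g_pos, g_fit

# Toy target: three independent ideal MZI responses with unknown phase offsets.
volts = np.linspace(0.0, 2.0, 21)             # 0 V to 2 V in 0.1 V steps
drive_phase = np.pi * (volts / 2.0) ** 2      # assumed: phase proportional to V^2

def simulate(delta_phi):
    """Toy model: normalized output power of each MZI versus voltage."""
    return 0.5 * (1.0 - np.cos(drive_phase[None, :] + delta_phi[:, None]))

true_offsets = np.array([0.3, -0.5, 2.9])     # hypothetical values
target = simulate(true_offsets)

def mse(delta_phi):
    """Fitness: mean squared error between simulated and target curves."""
    return np.mean((simulate(delta_phi) - target) ** 2)

fitted_offsets, fit_error = pso(mse, dim=3)
```

For the real device, `simulate` would be replaced by the full mesh model and `dim` set to 9, with the measured power ratios as the target.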

The fitting parameters obtained with this procedure are reported in Table 1. Note that for MZI 6 and MZI 8 we obtained values close to ±180°, consistent with experimental evidence that these two devices are in the bar state when V_in = 0 V, showing opposite behavior with respect to the other MZIs²³. These fitting parameters are then used in our model to reproduce the P_out/P_in measurements. In Fig. 7 we report all the possible combinations of input and output ports, with solid blue lines representing the measured data²⁰ and the circled red lines representing the simulated results with the phase corrections of Table 1. The curves are normalized with Eq. (9), employing, for each combination of input and output ports, the corresponding experimental values of P_max and P_ref. From the comparisons it is clear that our model, with the optimized phase correction terms, is able to closely match the experimental evidence: the overall behavior is well reproduced, meaning that our model correctly captures the thermal crosstalk, which is useful both to compensate for it and to take it into account in specific applications. Small discrepancies are still present between the predictions and the references. For example, in Fig. 7a, for MZI 4, it is evident that the simulation produced a lower peak power. This and other similar cases can be ascribed to additional effects present in the real device (e.g., fabrication tolerances), but also to the measurement uncertainties, especially for the transfer function minima. In Fig. 7i the trends of MZI 2 and MZI 8 predicted by the simulator do not match the experimental evidence. However, it should be noted that this is the only example in which, experimentally, the peaks and the floor have a difference of ~30 dB. Moreover, the lowest value of the P_out,3/P_in,3 curve is −73 dB, which could be limited by the noise floor of the OSA used for the measurements.

Table 1 Phase parameters δφ extracted with PSO to fit the experimental data


**Fig. 7: Extraction of the phase parameters δφ.**

For the purpose of validating the model, we effectively created a digital twin of the device in Fig. 1a. The same methodology, which starts with an accurate description of the individual building blocks followed by a targeted analysis of their parasitic interactions (in our case, dominated by thermal crosstalk effects), can be easily extended to more complex photonic devices, based on, but not limited to, MZI meshes.

Applications to photonic computing

In this section, we use the device digital twin to determine the optimal driving conditions of the MZI to implement user-defined logic functions with 3 optical inputs, also discussing the sensitivity of the output to fluctuations of operating voltages. This ability of the model to explore the implementation of user-defined logic functions aligns with the growing demand for programmable photonic circuits in high-speed computing applications.

In this context, one possible strategy to program a PIC consists in the use of a software-defined procedure to find suitable “weights” (control voltages) to implement the desired functionality. This approach is akin to the so-called “offline training” methods for PNNs, where the backward propagation is performed on a traditional computer and the weights are applied a posteriori on the chip³⁸. However, the effectiveness of offline methods can be drastically reduced by unforeseen fabrication variations³⁹, affecting the behavior of devices supposed to operate identically, while the use of error correction techniques could be extremely challenging³⁸. To overcome this limitation, it is possible to use “online training” techniques, where an optimization algorithm is directly executed on the chip to find the best control signal for each device, automatically accounting for manufacturing defects^38,39,40; this approach is often “physics agnostic”⁴⁰ for better adaptability.

Our model is inherently physics-informed and can be employed in support of an offline training procedure, while providing several key advantages over both in-situ training and traditional offline approaches. It enables rapid investigation of a large parameter space, allowing us to evaluate up to approximately 1 × 10⁴ different configurations per second per core on a modern workstation for the considered 3 × 3 device. This computational efficiency makes it possible to generate very large datasets or to run advanced optimization algorithms to find, for a specific device, the ideal control signal (e.g., in terms of robustness to voltage fluctuations or of minimum operating power consumption). This advantage is evident considering that approximately 40 hours were needed to perform 5000 measurements on the 3 × 3 reference circuit²³. Moreover, the capability of predicting the behavior of a single MZI provides a reliable framework for circuit design and pre-deployment validation, leading to a more systemic optimization, which may take into account robustness against fabrication variations and electrical noise, as well as the minimization of thermal crosstalk or power consumption. Proper characterization of the device at the design stage helps reduce the need for costly iterative testing. When addressing real components, the main drawback of offline training is that each device must be characterized in detail, mainly because of the intrinsic fabrication uncertainties. With our model, this characterization follows automatically from tuning the δφ parameters to mimic the operation of the reference circuit under test, thus effectively overcoming the main drawback of canonical offline methods.
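To put the quoted figures side by side, the following back-of-the-envelope sketch compares the time per experimental measurement with the time per simulated configuration. It uses only the numbers stated above; since the model throughput is an upper estimate for a single core, the resulting ratio should be read as an order of magnitude rather than as a measured speed-up.

```python
# Figures quoted in the text.
measurements = 5000        # experimental points on the 3 x 3 reference circuit
lab_hours = 40             # approximate time needed to acquire them
model_rate = 1e4           # configurations per second per core (upper estimate)

lab_seconds = lab_hours * 3600
seconds_per_measurement = lab_seconds / measurements   # time per lab point
model_seconds = measurements / model_rate              # same count, simulated
ratio = lab_seconds / model_seconds                    # order-of-magnitude gain

print(f"lab:   {seconds_per_measurement:.1f} s per measurement")
print(f"model: {model_seconds:.1f} s for {measurements} configurations")
print(f"ratio: about {ratio:.1e}")
```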

Building upon this physics-informed offline approach, we now use the identified model parameters to analyze the behavior of the device under practical operating conditions and to evaluate its performance in executing logic functions.

We assume that signals at 1550 nm are applied at the inputs; however, owing to the wideband properties of the device, other wavelengths could be considered in a WDM scenario.

Before discussing the technique that we propose to efficiently find the required voltages, we need to address the conversion of the analog optical signals into digital 0s (false) and 1s (true). In practice, it is possible to avoid analog-to-digital conversion using novel techniques⁴¹; however, we decided to adopt an intensity-based approach, similar to the one used in electronics, converting the analog optical signal into a digital one by means of a threshold on the output powers, thus separating lower power levels (corresponding to logic 0s) from higher power levels (corresponding to logic 1s). To reduce the effect of noise when dealing with output power values close to the threshold, we decided to set two separate thresholds, for the false and true levels, respectively.

These thresholds are estimated as follows: first, a dataset with 6 × 10⁶ entries is created by randomizing the input voltages of the 9 MZIs and computing the power at each output port when the 2³ possible combinations of the digital input signals are applied. Owing to the computational efficiency of the model, the generation of this dataset requires less than 6 minutes on a 12th-generation Intel® i9 workstation. At each p-th output port, we compute the median t_m,p of the output power: the actual thresholds are defined as t_0,p = 0.85 ⋅ t_m,p and t_1,p = 1.15 ⋅ t_m,p. Therefore, for each p-th output port, the output power p_p will be considered a logic 0 if p_p < t_0,p and a logic 1 if p_p > t_1,p. Figure 8 contains a visualization of the probability density function (pdf) of the dataset and the thresholds for the three different output ports.
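The threshold rule described above can be sketched in a few lines. The function names `logic_thresholds` and `to_logic` are illustrative and do not come from the authors' code, and the sample powers are uniform random placeholders rather than output of the device model. Powers that fall between the two thresholds are returned as `None`, which is one possible way to flag an undetermined level.

```python
import random
from statistics import median

def logic_thresholds(port_powers, low=0.85, high=1.15):
    """Return (t_0, t_1) for one output port from its sampled powers."""
    t_m = median(port_powers)        # median power t_m,p over the dataset
    return low * t_m, high * t_m     # t_0,p and t_1,p

def to_logic(power, t_0, t_1):
    """Map an output power to 0, 1, or None when it lies in the guard band."""
    if power < t_0:
        return 0
    if power > t_1:
        return 1
    return None

# Placeholder dataset for a single port: NOT produced by the MZI model.
rng = random.Random(0)
samples = [rng.random() for _ in range(10_001)]
t_0, t_1 = logic_thresholds(samples)
print(to_logic(0.0, t_0, t_1), to_logic(2.0 * t_1, t_0, t_1))
```

In the actual procedure the same computation would be repeated independently for each of the three output ports.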

**Fig. 8: Power datasets generated for the definition of the logic thresholds.**

Once the logic thresholds have been defined, we can test the capabilities of the reference circuit as a programmable logic gate by means of the proposed model. After choosing the desired logic functions (potentially including don't-care (X) terms), we need to find the proper set of nine V_in voltages that allows the device to produce the correct truth tables. This could be achieved with a properly trained Machine Learning agent⁴² or using an optimization routine³⁶. We opted for the latter and, to speed up the computation, we preliminarily searched our 6 × 10⁶ entry dataset for the combination of voltages that best approximates the desired truth table.
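A minimal sketch of such a preliminary dataset search is given below, assuming each dataset entry stores the nine voltages together with the digitized truth table they produce. Don't-care terms are encoded as `None` in the target and are skipped when counting mismatches. The names `mismatch` and `best_candidate`, the scoring rule, and the two hand-made entries are all illustrative assumptions; the text does not specify how the closest entry is selected.

```python
from itertools import product

ROWS = list(product((0, 1), repeat=3))   # the 2^3 digital input combinations

def mismatch(target, observed):
    """Count specified target bits that the observed table does not match."""
    return sum(
        1
        for t_row, o_row in zip(target, observed)
        for t, o in zip(t_row, o_row)
        if t is not None and t != o      # None in the target = don't care
    )

def best_candidate(dataset, target):
    """Return the (voltages, table) entry closest to the target table."""
    return min(dataset, key=lambda entry: mismatch(target, entry[1]))

# Target: AND of inputs 1 and 2 on port 1, don't care on ports 2 and 3.
target = [(a & b, None, None) for a, b, c in ROWS]

# Two hand-made entries standing in for the large simulated dataset.
dataset = [
    ((0.0,) * 9, [(a | b, 0, 0) for a, b, c in ROWS]),
    ((1.0,) * 9, [(a & b, 1, None) for a, b, c in ROWS]),
]
voltages, table = best_candidate(dataset, target)
print(voltages, mismatch(target, table))
```

The selected entry would then serve as the starting point for the optimization routine rather than as the final answer.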

Table 2 contains the logic functions that have been tested. In multiple cases, the solution is not unique and several sets of voltages implement the same desired functions. Moreover, the same set of functions can be obtained on multiple permutations of the outputs; for instance, with reference to case 2, it is possible to obtain the logic and on port 1 and the logic or on port 2, and vice versa. The second-to-last column of the table indicates whether a combination of V_in was found capable of producing the requested functions, either as listed in the table or with permutations of the output ports. It should be noted that not all the cases analyzed can be successfully implemented. As an example, it is not possible to negate the 3 inputs at the same time on the 3 outputs (case 9): if all input signals are 0s, it is not possible to obtain any power at any output. However, the simple 3 × 3 device allows us to implement basic logic functions (and, or, xor, nand, nor, sum of products and product of sums), to negate the signals at ports 1 and 2 when a logical 1 is applied to port 3, to program half and full adders (between port 1 and port 2, with carry-in on port 3), and to compute the two’s complement of the 2-bit and 3-bit numbers at the input. Finally, it is possible to obtain a set-up in which we compute the logic and of the signals at input ports 1 and 2 if the signal at port 3 is true, or the logic or otherwise, using the optical signal at port 3 to decide which operation must be performed. The results show the great versatility of this device.
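The target truth tables for some of these cases can be written down directly, together with a simple necessary condition suggested by the observation on case 9: since no power can appear at any output when all inputs are 0, a target that requires a logic 1 in the all-zero row cannot be implemented. In the sketch below the assignment of functions to output ports is a hypothetical choice, `None` marks a don't-care term, and `passive_feasible` checks only this necessary condition, not whether suitable voltages actually exist.

```python
from itertools import product

ROWS = list(product((0, 1), repeat=3))   # (in1, in2, in3); ROWS[0] == (0, 0, 0)

def truth_table(fn):
    """Evaluate fn on all 8 input combinations."""
    return [fn(a, b, c) for a, b, c in ROWS]

# Full adder of inputs 1 and 2 with carry-in on input 3: (sum, carry, X).
full_adder = truth_table(
    lambda a, b, c: (a ^ b ^ c, (a & b) | (c & (a ^ b)), None)
)
# Input 3 selects the operation: AND if it is 1, OR otherwise.
and_or_select = truth_table(
    lambda a, b, c: ((a & b) if c else (a | b), None, None)
)
# Simultaneous negation of the three inputs (case 9 in the text).
negate_all = truth_table(lambda a, b, c: (1 - a, 1 - b, 1 - c))

def passive_feasible(table):
    """False if the all-zero input row demands a logic 1 on any port."""
    return all(bit != 1 for bit in table[0])

for name, table in [("full adder", full_adder),
                    ("and/or select", and_or_select),
                    ("negate all", negate_all)]:
    print(name, passive_feasible(table))
```

Functions such as nand and nor pass this check only when one input acts as a constant logic 1 and the rows where it is 0 are treated as don't-care terms.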

Table 2 List of functions tested on the reference device


In order to validate the robustness of our findings with respect to uncertainties in the applied voltages, we performed a series of Monte Carlo simulations. For each successful case listed in Table 2, we ran 10⁶ simulations applying random perturbations to the 9 nominal voltages previously determined. The perturbations are drawn uniformly in the range ±5% of V_nom; for each run, we verify whether the same truth table is obtained. For the cases in which ±5% of V_nom did not always produce the correct output, we also tested ±2% and, if necessary, ±1% of V_nom, which is still compatible with standard electronic equipment. The rightmost column of Table 2 contains the maximum tolerance that yields correct truth tables in all 10⁶ runs, despite the perturbation of the input voltages. For all the working logic functions a robustness range was found, which suggests not only that the device can be programmed to perform a wide range of operations, but also that it is stable enough to maintain the result despite noisy fluctuations of the electrical control signals.
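The Monte Carlo procedure can be sketched as follows. Here `evaluate` stands for any callable that maps a list of voltages to a digitized truth table; in the paper this role is played by the device model, which is not reproduced here, so a deliberately trivial `toy_evaluate` is used as a placeholder. The function names and the reduced number of runs are assumptions made to keep the example fast.

```python
import random

def is_robust(evaluate, v_nom, reference, tol, runs, rng):
    """True if every perturbed run reproduces the reference truth table."""
    for _ in range(runs):
        # Independent uniform perturbation within +/- tol of each voltage.
        v = [x * (1.0 + rng.uniform(-tol, tol)) for x in v_nom]
        if evaluate(v) != reference:
            return False
    return True

def max_tolerance(evaluate, v_nom, tolerances=(0.05, 0.02, 0.01),
                  runs=10_000, seed=0):
    """Largest tested tolerance for which all runs succeed, else None."""
    rng = random.Random(seed)
    reference = evaluate(v_nom)
    for tol in tolerances:               # tested from widest to narrowest
        if is_robust(evaluate, v_nom, reference, tol, runs, rng):
            return tol
    return None

def toy_evaluate(voltages):
    """Placeholder, NOT the MZI model: output flips below 97% of nominal."""
    return tuple(int(v > 0.97) for v in voltages)

print(max_tolerance(toy_evaluate, [1.0] * 9, runs=2000))
```

With the placeholder, a ±2% perturbation can never cross the 0.97 boundary, while a ±5% perturbation crosses it with high probability in each run, so the sketch mimics the fallback from ±5% to ±2% described above.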

Conclusions

We proposed a method to develop a comprehensive model describing MZI-based meshed photonic topologies. The model includes effects which are essential for the proper description of the circuit, accounting for physical properties and the fabrication variations, and it accurately captures parasitic effects such as thermal crosstalk, a key limitation in densely integrated photonic circuits. To validate the predictions of the model, we compared the simulated results with the experimental data from a real 3 × 3 mesh of MZIs: the excellent agreement highlights the effectiveness of our approach even in the presence of strong thermal crosstalk.

Subsequently, the validated model was used to determine the control voltages to operate the reference device as a programmable logic circuit to implement a set of user-defined logic functions. Furthermore, we assessed the robustness of these logic operations against applied voltage fluctuations, which confirmed the reliability of the proposed approach.

This work highlights the need for accurate modeling of integrated circuits for photonic computing applications and offers a foundation for the scalable design and optimization of PICs for next-generation telecommunications and high-performance computing.

Data availability

All experimental and simulated data are available upon reasonable request from the authors.

Code availability

The MZI model and the circuit-level simulations have been implemented in MATLAB® by the authors. The PSO algorithm was implemented in-house as well. All custom code is available upon reasonable request from the authors.

References

Schaller, R. Moore’s law: past, present and future. IEEE Spectr. 34, 52–59 (1997).
Waldrop, M. More than Moore. Nat. News 530, 144 (2016).
LeCun, Y., Bengio, Y. & Hinton, G. Deep learning. Nature 521, 436–44 (2015).
Ladd, T. et al. Quantum computers. Nature 464, 45–53 (2010).
de Lima, T. F. et al. Machine learning with neuromorphic photonics. J. Lightwave Technol. 37, 1515–1534 (2019).
Streshinsky, M. et al. The road to affordable, large-scale silicon photonics. Opt. Photon. N. 24, 32–39 (2013).
De Marinis, L., Cococcioni, M., Castoldi, P. & Andriolli, N. Photonic neural networks: A survey. IEEE Access 7, 175827–175841 (2019).
Teng, M. et al. Miniaturized silicon photonics devices for integrated optical signal processors. J. Lightwave Technol. 38, 6–17 (2020).
Dabos, G. et al. Neuromorphic photonic technologies and architectures: scaling opportunities and performance frontiers. Opt. Mater. Express 12, 2343–2367 (2022).
Zhong, H.-S. et al. Quantum computational advantage using photons. Science 370, 1460–1463 (2020).
Maring, N. et al. A versatile single-photon-based quantum computing platform. Nat. Photonics 18, 1–7 (2024).
Tait, A. N., Nahmias, M. A., Shastri, B. J. & Prucnal, P. R. Broadcast and weight: An integrated network for scalable photonic spike processing. J. Lightwave Technol. 32, 4029–4041 (2014).
Peng, H.-T., Nahmias, M. A., de Lima, T. F., Tait, A. N. & Shastri, B. J. Neuromorphic photonic integrated circuits. IEEE J. Sel. Top. Quantum Electron 24, 1–15 (2018).
Nakajima, M., Tanaka, K. & Hashimoto, T. Scalable reservoir computing on coherent linear photonic processor. Commun. Phys. 4, 20 (2021).
Shen, Y. et al. Deep learning with coherent nanophotonic circuits. Nat. Photonics 11, 441–446 (2017).
Wang, J., Rodrigues, S. P., Dede, E. M. & Fan, S. Microring-based programmable coherent optical neural networks. Opt. Express 31, 18871–18887 (2023).
Bogaerts, W. et al. General-purpose programmable photonic chips. In 2021 IEEE Photonics Society Summer Topicals Meeting Series (SUM), 1–2 (2021).
Clements, W. R., Humphreys, P. C., Metcalf, B. J., Kolthammer, W. S. & Walmsley, I. A. Optimal design for universal multiport interferometers. Optica 3, 1460–1465 (2016).
Perez, D. et al. Multipurpose silicon photonics signal processor core. Nat. Comm. 8, 636 (2017).
Cem, A. et al. Comparison of models for training optical matrix multipliers in neuromorphic PICs. In 2022 Optical Fiber Communications Conference and Exhibition (OFC), 1–3 (2022).
Cem, A., Yan, S., Ding, Y., Zibar, D. & Da Ros, F. Data-driven modeling of Mach-Zehnder Interferometer-based optical matrix multipliers. J. Lightwave Technol. 41, 5425–5436 (2023).
Ding, Y. et al. Reconfigurable SDM switching using novel silicon photonic integrated circuit. Sci. Rep. 6, 39058 (2016).
Cem, A. Modeling Photonic Integrated Circuits for Optical Computing using Machine Learning. Ph.D. thesis, Technical University of Denmark (2023).
McMahon, P. L. The physics of optical computing. Nat. Rev. Phys. 5, 717–734 (2023).
Huang, W.-P. Coupled-mode theory for optical waveguides: an overview. J. Opt. Soc. Am. A 11, 963–983 (1994).
Shekhar, S. et al. Roadmapping the next generation of silicon photonics. Nat. Comm. 15, 751 (2024).
Marchisio, A. et al. Optimization of 3x3 neuromorphic photonic network for programmable Boolean operations. In SPIE Photonic West: Physics and Simulation of Optoelectronic Devices XXXII, vol. 12880, 191104–8 (2024).
Shafiee, A., Banerjee, S., Chakrabarty, K., Pasricha, S. & Nikdast, M. Analysis of optical loss and crosstalk noise in MZI-based coherent photonic neural networks. J. Lightwave Technol. 42, 4598–4613 (2024).
Najjar Amiri, A., Vit, A. D., Gorgulu, K. & Magden, E. S. Deep photonic network platform enabling arbitrary and broadband optical functionality. Nat. Comm. 15, 1432 (2024).
RSoft™. RSoft™ Photonic Device Tools. [Online; accessed 2-December-2024]. https://www.synopsys.com/photonic-solutions/rsoft-photonic-device-tools/rsoft-products.html.
Zhang, Y. et al. Non-degenerate two-photon absorption in silicon waveguides: analytical and experimental study. Opt. Express 23, 17101–17110 (2015).
Bahadori, M. et al. Thermal rectification of integrated microheaters for microring resonators in silicon photonics platform. J. Lightwave Technol. 36, 773–788 (2018).
Schmid, J. H. et al. Temperature-independent silicon subwavelength grating waveguides. Opt. Lett. 36, 2110–2112 (2011).
Hahn, D. & Özişik, M. Heat conduction fundamentals. In Heat Conduction, chap. 1, 1–39 (John Wiley & Sons, New York, 2012).
Yepez, P. A. K., Scholz, U., Caspers, J. N. & Zimmermann, A. Novel measures for thermal management of silicon photonic optical phased arrays. IEEE Photonics J. 11, 1–15 (2019).
Marchisio, A., Ghillino, E., Curri, V., Carena, A. & Bardella, P. Particle swarm optimization-assisted approach for the extraction of VCSEL model parameters. Opt. Lett. 49, 125–128 (2024).
Kennedy, J. & Eberhart, R. Particle swarm optimization. In Proceedings of ICNN’95 - International Conference on Neural Networks, vol. 4, 1942–1948 (1995).
Xue, Z. et al. Fully forward mode training for optical neural networks. Nature 632, 280–286 (2024).
Zhang, W. et al. Online training and pruning of photonic neural networks. In 2023 IEEE Photonics Conference (IPC), 1–2 (2023).
Zhang, H. et al. Efficient on-chip training of optical neural networks using genetic algorithm. ACS Photonics 8, 1662–1672 (2021).
Wu, Y. et al. Integrated photonic modular arithmetic processor. Photon. Res. 12, 2676–2690 (2024).
Khan, I. et al. A machine learning-based model for characterizing stationary-and-dynamic behavior of VCSEL. In CLEO: Fundamental Science, JW2A–141 (2023).

Acknowledgements

A.M.'s Ph.D. scholarship is funded by the European Union Next-GenerationEU and by the Italian National Recovery and Resilience Plan (PNRR) through the Italian Ministry of University and Research (MUR) under grant D.M.352/2022. The authors thank L. Tunesi from Politecnico di Torino for his guidance with COMSOL simulations and for the fruitful discussions, E. Ghillino and C. Mavidis from Synopsys Photonic Solutions for their guidance and help during the work, and Y. Ding from Technical University of Denmark for providing the reference device.

Author information

Authors and Affiliations

Dipartimento di Elettronica e Telecomunicazioni, Politecnico di Torino, Torino, Italy
Andrea Marchisio, Vittorio Curri, Andrea Carena & Paolo Bardella
Department of Electrical and Photonics Engineering, Technical University of Denmark, Lyngby, Denmark
Francesco Da Ros

Contributions

All authors participated in the conceptual phase of the work; A.M. carried out the theoretical modeling, the literature review, and performed the simulations, with the support of P.B.; F.D.R. provided the experimental measurements; A.C., V.C., and P.B. supervised and coordinated the work; all authors contributed to the writing of the manuscript and revised it.

Corresponding author

Correspondence to Andrea Marchisio.

Ethics declarations

Competing interests

The authors declare no competing interest.

Peer review

Peer review information

Communications Physics thanks Hassan Rahbardar Mojaver and the other, anonymous, reviewer(s) for their contribution to the peer review of this work.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

About this article

Cite this article

Marchisio, A., Da Ros, F., Curri, V. et al. Comprehensive model of MZI-based circuits for photonic computing applications. Commun Phys 8, 277 (2025). https://doi.org/10.1038/s42005-025-02176-0

Received: 20 December 2024
Accepted: 04 June 2025
Published: 03 July 2025
Version of record: 03 July 2025
DOI: https://doi.org/10.1038/s42005-025-02176-0