Introduction

Deep neural networks are powerful computational tools that have been applied in various fields, including image recognition1, time-series forecasting2, and natural language processing3. Although the computation of deep neural networks has been supported by progress in integrated circuit (IC) technology, they face concerns regarding energy consumption and processing time. In recent years, optical neural networks (ONNs) have attracted attention4,5,6; one such approach is diffractive neural networks (DNNs). DNNs use diffractive optical elements to physically implement neural networks, enabling them to achieve fast inference with ultra-low energy consumption7,8,9. Each pixel of the diffractive layer acts as a neuron that modulates the phase of incident light, and connections between neurons in each layer are implemented through the diffraction of light. Although increasing the number of hidden layers improves the performance of DNNs, the use of multiple layers introduces alignment complexity, which makes their implementation challenging. Therefore, achieving high performance using DNNs with simpler architectures remains an important challenge10,11,12,13,14.

The common method used to implement diffractive layers is to introduce optical path length differences by three-dimensional (3D) printing or etching technology7,10,15. However, achieving compact DNNs that operate in the visible wavelength region requires submicron-sized neurons, which are difficult to fabricate by etching. In addition, once the device is fabricated, the diffractive layer pattern is fixed; thus, fabrication errors cannot be compensated for and the functionality cannot be reconfigured on demand for different tasks. Several reconfigurable DNNs have been developed using optically active materials such as liquid crystals (LCs)14,16,17 or phase-change materials (PCMs)18,19,20. LC-based phase modulation works by controlling the orientation of LC molecules, which exhibit birefringence. However, their response time is relatively long (in the millisecond range) and their volatility leads to static power consumption. Although some groups have reported an LC array with a pixel pitch of 1 μm21,22, further miniaturization is hindered by crosstalk from electric field leakage between adjacent pixels. The material states of PCMs can be switched between crystalline and amorphous phases with an optical or electric pulse, enabling the modulation of light by differences in the refractive index between the two phases. Although PCMs are nonvolatile, their high absorption in the visible wavelength region limits their use to the mid-infrared region23.

We have studied DNNs using the magneto-optical (MO) effect in magnetic materials24,25. The MO effect rotates the polarization plane and induces ellipticity when linearly polarized light passes through or is reflected from an MO material. Magnetic domains can be used as neurons because the rotation angle and ellipticity are proportional to the projection of the magnetization along the light’s wavevector k. The magnetic domain pattern can be reconfigured using a thermomagnetic recording technique, and submicron domains have been experimentally demonstrated26,27. This magnetic switching is extremely fast and highly durable; thus, MO-based DNNs are promising candidates for practical implementation28. Leveraging these advantages, MO materials have also been explored in various optical applications such as holograms29,30 and MO-based spatial light modulators (SLMs)28,31. The drawback of MO media is their small degree of phase modulation, whereas LCs can achieve phase shifts greater than 360°. Bismuth-substituted iron garnet, which is known for its large Faraday effect (the MO effect in transmission geometry), exhibits a Faraday rotation of 30°/µm32. However, a large phase shift (e.g., 90°) is not feasible because of the trade-off between transmittance and the Faraday effect. If MO-DNNs can overcome the limitation imposed by weak phase modulation, they could provide superior performance.

In the present study, we focus on a diffraction scheme caused by the MO effect. As shown in Fig. 1, when incident light is linearly polarized along the x-axis, the MO effect from upward and downward magnetic domains generates a positive and negative y-component of the electric field, respectively. In the far field, the interference of the y-components leads to a 90° difference in the polarization angle between the zeroth- and first-order diffracted light33,34. This phenomenon, which occurs only in diffraction based on the MO effect, suggests that the performance of MO-DNNs can be increased by inserting a polarizer to eliminate the zeroth-order light.
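The polarization separation between diffraction orders can be checked with a minimal numerical sketch. It assumes an idealized one-dimensional stripe-domain grating with equal up and down widths, a purely real Faraday rotation (no ellipticity or absorption), and a plain discrete Fourier transform as a stand-in for far-field diffraction. The function name `far_field_orders` and its parameters are illustrative and do not come from the study.

```python
import cmath
import math


def far_field_orders(theta_deg, period=8, n_periods=4):
    """Return {order_bin: (|Ex|, |Ey|)} for a stripe-domain grating.

    Assumption: x-polarized unit input; each domain rotates the field by
    +theta (up) or -theta (down), so Ex = cos(theta) everywhere and
    Ey = +/- sin(theta) depending on the domain.
    """
    theta = math.radians(theta_deg)
    n = period * n_periods
    # +1 for the first half of each period (up domain), -1 for the second half
    signs = [1 if (i % period) < period // 2 else -1 for i in range(n)]
    ex = [math.cos(theta)] * n
    ey = [s * math.sin(theta) for s in signs]

    def dft_mag(field, k):
        total = sum(v * cmath.exp(-2j * math.pi * k * i / n)
                    for i, v in enumerate(field))
        return abs(total) / n

    # Bin 0 is the zeroth order; bin n_periods is the grating fundamental
    return {k: (dft_mag(ex, k), dft_mag(ey, k)) for k in (0, n_periods)}


orders = far_field_orders(theta_deg=5.0)
zeroth, first = orders[0], orders[4]
print("zeroth order |Ex|, |Ey|:", zeroth)
print("first order  |Ex|, |Ey|:", first)
```

In this toy model the zeroth order carries only the x-component and the first order carries only the y-component, which is why a polarizer that blocks x removes the zeroth order while passing the diffracted signal.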

In the present study, we evaluate the performance of MO-DNNs consisting of a single hidden layer with and without a polarizer operating at 532 nm wavelength. The Modified National Institute of Standards and Technology (MNIST) dataset is used as a benchmark classification task. For the physical implementation, we use a bismuth, gallium-substituted garnet film as an MO film and experimentally demonstrate handwritten digit classification using the MO-DNN with a polarizer. In addition, we rewrite the magnetic domain pattern to demonstrate different classification tasks in the same system and verify the effectiveness of the reconfigurable MO-DNN.

Fig. 1

Schematic of image classification using single-layer magneto-optical diffractive neural network (MO-DNN) with polarizer.

Results and discussion

MO-DNN models

We evaluated the performance of three MO-DNN models for classifying MNIST handwritten digits (Fig. 2). All three MO-DNN models share identical conditions, differing only in their output detection configurations. The input light was linearly polarized along the x-axis, and its intensity distribution corresponded to the input image. The MNIST images were resized from 28 × 28 pixels to 200 × 200 pixels and used as the input images. The single hidden layer consisted of a magnetic film containing 200 × 200 magnetic domains with a size of 1 × 1 µm². The first model directly detected light intensity as the output signal. The second model detected the polarization angle of the output light. The third model detected intensity after the light passed through a polarizer that eliminated the x-component of the electric field. Ten designated regions corresponding to digits 0 to 9 were placed on the output plane, and the models were trained so that the region corresponding to the input digit showed the maximal signal. The calculation of light propagation and the training method are provided in the sections “Calculation method for light propagation” and “Training method,” respectively. The distance between the input and hidden layers d1 was set to 6.5 mm, and the distance between the hidden and output layers d2 was set to 1.0 mm. These distances were selected because they provided the highest accuracy among the various interlayer distances we tested.
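As a rough illustration of how the three detection schemes differ, the sketch below computes each readout for a single output pixel described by a Jones vector (Ex, Ey). The polarization angle is taken as the orientation angle of the polarization ellipse, which is a standard definition but an assumption here, since the text does not state which definition was used. The function name `readouts` is hypothetical.

```python
import math


def readouts(ex, ey):
    """Return the three output signals for one pixel with Jones vector (ex, ey)."""
    # Model (a): total intensity, no polarizer
    intensity = abs(ex) ** 2 + abs(ey) ** 2
    # Model (b): orientation angle of the polarization ellipse (assumed definition)
    angle = 0.5 * math.atan2(2 * (ex * ey.conjugate()).real,
                             abs(ex) ** 2 - abs(ey) ** 2)
    # Model (c): intensity after a polarizer that removes the x-component
    through_polarizer = abs(ey) ** 2
    return intensity, angle, through_polarizer


# Example: x-polarized unit field rotated by a 5 degree Faraday rotation
theta = math.radians(5.0)
total, angle, filtered = readouts(math.cos(theta), math.sin(theta))
print(total, math.degrees(angle), filtered)
```

For a purely rotated linear field, model (a) returns the full intensity regardless of the rotation, model (b) returns the rotation angle, and model (c) returns only the power in the modulated y-component.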

Fig. 2

Schematic of three MO-DNN models. (a) MO-DNN directly detecting intensity as an output signal. (b) MO-DNN detecting the polarization angle of output light as an output signal. (c) MO-DNN detecting intensity through a polarizer.

Simulation results for MNIST classification

Figure 3 shows the classification accuracy of the three MO-DNN models on the testing data for a Faraday rotation of 5°, together with the numerically simulated images. The classification accuracy of the MO-DNN without a polarizer was 10.60%. As evident from the output image of this model, the diffracted light that originated from the unmodulated x-component was dominant compared with the diffracted light that originated from the modulated y-component; thus, this model was unable to classify the digit images. This insufficient training performance mainly resulted from the small Faraday rotation, which produced weak optical modulation and vanishing gradients. When the Faraday rotation is sufficiently large, the MO-DNN without a polarizer can also be trained successfully, as demonstrated in our previous study24. In the MO-DNN that detects the polarization angle, the accuracy improved to 80.49%. In the output image, the absolute value of the polarization angle was small in areas with high light intensity and large in areas with low light intensity. We speculate that the unmodulated x-component was still dominant in high-intensity areas, limiting the performance of this model. In the third model, which incorporated a polarizer, the classification accuracy increased to 97.88%. The polarizer removed the electric field component parallel to the input light, so that only the y-component, rotated by ±90° due to the Faraday effect, contributed to the output signal. This selective use of the modulated component increased its proportion in the output, thereby enabling highly accurate classification. This performance is comparable to that achieved by conventional DNNs that use a phase modulation of 2π14. We confirmed that no overfitting occurred in this study, as shown in Figure S1 in the Supplementary Materials.
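To give a sense of scale for why the unmodulated component dominates without a polarizer, the short calculation below estimates the share of optical power carried by the modulated y-component. It assumes an ideal, lossless rotation of an x-polarized field by the Faraday angle, so the power splits as cos² and sin² of that angle; absorption and ellipticity are ignored.

```python
import math

for theta_deg in (1, 2, 3, 4, 5):
    theta = math.radians(theta_deg)
    modulated = math.sin(theta) ** 2      # power in the y-component
    unmodulated = math.cos(theta) ** 2    # power left in the x-component
    print(f"{theta_deg} deg: modulated fraction = {modulated:.5f}, "
          f"ratio to unmodulated = {modulated / unmodulated:.5f}")

fraction_at_5_deg = math.sin(math.radians(5)) ** 2
```

Under this assumption the modulated component holds less than 1% of the power at 5°, which is consistent with the observation that direct intensity detection is swamped by the x-component.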

Fig. 3

(a) Classification accuracy for MNIST handwritten digits using three MO-DNN models when Faraday rotation was set to 5°. (b) Simulated images of each layer and output signal distribution for three MO-DNNs when digit “2” was input.

We also investigated the classification accuracy as a function of the Faraday rotation by varying the Faraday rotation from 1° to 5° as shown in Fig. 4. For the MO-DNN without a polarizer, no change in accuracy was observed in this range because the Faraday rotation was too small. When the MO-DNN detected the polarization angle, the accuracy decreased with decreasing Faraday rotation. By contrast, the classification accuracy of the model with a polarizer was independent of the Faraday rotation and maintained high accuracy even at a Faraday rotation of 1°. These results can be explained by the MO diffraction properties—specifically, by the fact that the first-order diffracted light is polarized 90° with respect to the zeroth-order light.
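One way to see why the polarizer model is insensitive to the Faraday rotation is that the y-component scales uniformly with sin θ, so a max-normalized output pattern does not change. The sketch below illustrates this under simplifying assumptions: a fixed binary domain pattern, a real Faraday rotation, and a discrete Fourier transform standing in for the actual propagation between layers. The names and the 16-element toy input are hypothetical.

```python
import cmath
import math
import random


def normalized_y_intensity(theta_deg, amplitudes, signs):
    """Max-normalized intensity of the y-component after a linear propagation."""
    theta = math.radians(theta_deg)
    n = len(amplitudes)
    # y-field just after the hidden layer: input amplitude times +/- sin(theta)
    ey = [a * s * math.sin(theta) for a, s in zip(amplitudes, signs)]
    # DFT used as a stand-in for any linear propagation operator (assumption)
    out = [abs(sum(v * cmath.exp(-2j * math.pi * k * i / n)
                   for i, v in enumerate(ey))) ** 2 for k in range(n)]
    peak = max(out)
    return [v / peak for v in out]


rng = random.Random(0)
amplitudes = [rng.random() for _ in range(16)]
signs = [rng.choice((-1, 1)) for _ in range(16)]

pattern_1deg = normalized_y_intensity(1.0, amplitudes, signs)
pattern_5deg = normalized_y_intensity(5.0, amplitudes, signs)
max_diff = max(abs(a - b) for a, b in zip(pattern_1deg, pattern_5deg))
print("largest difference between normalized patterns:", max_diff)
```

The two normalized patterns agree to numerical precision, whereas the absolute signal level, and therefore the optical loss at the polarizer, still depends on the rotation angle.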

Fig. 4

Classification accuracy for MNIST handwritten digits using three MO-DNNs, plotted as function of Faraday rotation.

To validate the scalability of the MO-DNNs, we examined the dependence of the classification accuracy on the number of neurons using the single-layer MO-DNN with a polarizer. We selected the Fashion-MNIST dataset as the classification task because it is a more challenging benchmark than MNIST handwritten digit classification. The number of neurons was set to N × N, where N was varied as 28, 50, 100, 200, 300, and 400. The corresponding classification accuracies were 70.20%, 80.47%, 85.11%, 87.02%, 88.29%, and 88.74%, respectively (Fig. 5). The accuracy increased rapidly from 70.20% to 85.11% as N increased from 28 to 100, and then gradually saturated at approximately 89% for N ≥ 200. This result suggests that, although increasing the number of neurons improves accuracy, the benefit of fabricating larger hidden layers shows diminishing returns beyond N = 200, especially when the increased implementation cost is considered.
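The diminishing returns can be quantified directly from the reported accuracies. The snippet below computes the accuracy gain per 10,000 added neurons for each step in N; the per-10,000 normalization is an illustrative choice and not a metric used in the study.

```python
sizes = [28, 50, 100, 200, 300, 400]
accuracy = [70.20, 80.47, 85.11, 87.02, 88.29, 88.74]  # percent, as reported

gains_per_10k = []
for i in range(1, len(sizes)):
    added_neurons = sizes[i] ** 2 - sizes[i - 1] ** 2
    gain = accuracy[i] - accuracy[i - 1]
    gains_per_10k.append(gain / added_neurons * 1e4)
    print(f"N {sizes[i - 1]} -> {sizes[i]}: +{gain:.2f} points "
          f"for {added_neurons} extra neurons")

print(gains_per_10k)
```

The gain per added neuron falls at every step, dropping by roughly an order of magnitude at each of the first two steps, which supports treating N = 200 as a reasonable operating point.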

Fig. 5

Classification accuracy for Fashion-MNIST dataset using single-layer MO-DNN with polarizer, plotted as function of number of neurons N × N.

Experimental demonstration

We experimentally validated the performance of the MO-DNN consisting of a single hidden layer and a polarizer. The magnetic domain pattern of the hidden layer was designed by pre-training using simulation. We reconfigured the MO-DNN by rewriting the magnetic domain pattern and performed two inference tasks: one for MNIST handwritten digit classification and another for Fashion-MNIST classification (Fig. 6(a)). We built an experimental setup to enable both hidden layer reconfiguration and image classification using the MO-DNN (Fig. 6(c)). We fabricated the hidden layer by recording a magnetic domain pattern on the MO film using a laser-scanning thermomagnetic recording technique27. The input images were generated by a liquid-crystal-on-silicon SLM (LCoS-SLM), and the output images were captured by a CMOS camera. Details of this setup are provided in the section titled “Optical setup.”

For the MO film, we used a bismuth, gallium-substituted iron garnet thin film with the composition of Y0.5Bi2.5Fe4GaO12 (Bi, Ga: YIG); this film exhibits perpendicular magnetic anisotropy and a Faraday rotation of −3.3° at a wavelength of 532 nm35. The fabrication process for the film is described in the section “Magneto-optical film.” In our previous work, we recorded a magnetic domain pattern with a width of 1 μm on a Bi, Ga: YIG film27. However, the neuron (magnetic domain) size used in this experiment was 4 × 4 µm² because of the 8 μm pixel pitch of the LCoS-SLM and the alignment requirements. The number of neurons was 112 × 112. The magnetic domain patterns optimized by training and the experimentally recorded domain patterns are shown in Fig. 6(b). The recorded patterns closely matched the designed patterns. These recorded domain patterns were used as the hidden layer for inference.

Fig. 6

(a) Schematic of image classification using reconfigurable MO-DNN. (b) Magnetic domain patterns obtained from training simulations for handwritten digit classification and fashion item classification, along with corresponding recorded magnetic domain patterns. (c) Optical setup integrating magnetic domain pattern recording system and image classification system for MO-DNN.

Figures 7(a) and 7(b) show representative input images, output images, and classification results obtained from simulations and experiments, respectively, for the handwritten digit classification. The projected input images and acquired output images agree well with the simulation results, which confirms that the calculation of light propagation including the Faraday effect was accurate. A total of 500 images randomly selected from the testing dataset were used to evaluate the classification accuracy in the simulations and experiments. Confusion matrices for the simulation and experimental results show the distribution of correctly and incorrectly identified digits (Figs. 7(c) and 7(d)). The classification accuracy for the MNIST handwritten digit images reached 94.8% in the simulation and 83.4% in the experiment.

Fig. 7

(a) Simulated and (b) experimental inference results using reconfigurable MO-DNN for MNIST handwritten digits. Confusion matrices for (c) simulation and (d) experimental results based on test images.

Figure 8 shows the simulation and experimental results for the Fashion-MNIST classification. The generated input image exhibited vertical interference fringes, and the intensity distributions were less homogeneous compared with those of the original images. Nevertheless, the experimental output results were in good agreement with the simulation results. The classification accuracy reached 87.4% in the simulation and 71.0% in the experiment. These results confirm that different classification tasks can be successfully performed by reconfiguring the MO-DNN.

Here, we discuss the accuracy differences between the simulation and the experiment. During the training phase, the light intensity outside the designated target regions was ignored due to the use of cross-entropy loss as the loss function. As a result, the light intensity at the output plane tends to be higher outside of each target region in both the handwritten digit and the fashion item classification cases. We speculate that slight differences in the output images between the simulation and experiment, which were caused by the fill factor of the LCoS-SLM, alignment errors, and the inhomogeneous input images, substantially reduced the classification accuracy. The robustness could be improved by combining the cross-entropy loss with mean squared error36.
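The point that cross-entropy on region sums ignores light outside the target regions can be illustrated with a toy example. The sketch assumes a 4 × 4 output plane with two single-pixel class regions and omits the max-normalization step described in the Methods; the layout and the function name `region_loss` are hypothetical.

```python
import math


def region_loss(image, regions, label):
    """Softmax cross-entropy over the summed intensity in each class region."""
    signals = [sum(image[r][c] for r, c in region) for region in regions]
    peak = max(signals)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in signals]
    return -math.log(exps[label] / sum(exps))


# Two class regions on a toy 4x4 output plane (hypothetical layout)
regions = [[(0, 0)], [(3, 3)]]
clean = [[0.0] * 4 for _ in range(4)]
clean[0][0], clean[3][3] = 1.0, 0.2

# Same image with bright stray light outside every class region
stray = [row[:] for row in clean]
stray[1][2] = 5.0

loss_clean = region_loss(clean, regions, label=0)
loss_stray = region_loss(stray, regions, label=0)
print(loss_clean, loss_stray)
```

Both losses are identical, so nothing in this objective discourages stray light. Adding a mean-squared-error term against a target image that is zero outside the regions would penalize it.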

Fig. 8

Inference results and confusion matrices in simulation and experimental results for fashion item classification.

Table 1 summarizes a comparison between our reconfigurable MO-DNN and previously reported DNNs. Despite using weak phase modulation, our model achieved performance comparable to that of multilayer DNNs with a phase modulation of π or 2π. Because it consists of a single layer with small neurons, the MO-DNN can be an extremely compact device. Table 2 summarizes the characteristics of three materials: LC, PCM, and MO. Unlike LCs, MO materials and PCMs are nonvolatile and can therefore perform computing with zero static power consumption. Switching a magnetic domain by thermomagnetic recording required 192 nJ per pixel, which is comparable to the switching energy of PCMs23. In this study, reconfiguring the magnetic domain pattern required 1 s because of laser scanning. However, it may be possible to achieve GHz switching using pattern exposure because magnetization switching on the nanosecond order has been experimentally confirmed in a garnet film with a composition similar to that of Bi, Ga: YIG37. Thermomagnetic recording offers a large number of switching cycles28, making it a suitable mechanism for realizing a reconfigurable hidden layer.

Table 1 Comparison among various diffractive neural networks.
Table 2 Comparison of the performance of various reconfigurable photonic materials.

Conclusion

In this study, we focused on the diffraction behavior based on the MO effect, which is characterized by a 90° polarization difference between the zeroth- and first-order light. We proposed an MO-DNN model incorporating a polarizer and evaluated its performance in MNIST classification tasks. The model achieved 98% classification accuracy for the MNIST handwritten digit dataset, and this high accuracy was maintained even when the Faraday rotation was as small as 1°. We confirmed that high performance can be realized without relying on large phase modulation in an MO-DNN. In addition, we physically implemented a reconfigurable MO-DNN by recording a magnetic domain pattern in a Bi, Ga: YIG film and carrying out optical image classification. The system’s reconfigurability was validated through successful switching between two classification tasks. Although the use of a polarizer causes optical loss due to the small Faraday rotation angle of the magnetic film, such loss could be reduced in the future through advances in magnetophotonic crystals that enhance magneto-optical effects39. Overall, this work has demonstrated the feasibility of the MO-DNN as a reconfigurable and energy-efficient photonic computing device operating in the visible wavelength region. Owing to its compactness and compatibility with image sensors, such a reconfigurable MO-DNN holds strong potential for integration with cameras as an edge-sensing device capable of real-time operation and low power consumption.

Methods

Calculation method for light propagation

The MO effect arises from the difference in the complex refractive indices \({N}_{\pm}\) for right- and left-circular polarization in a magnetic medium:

$$\begin{array}{c}\varDelta\:N={N}_{+}-{N}_{-}=\varDelta\:n+i\varDelta\kappa=\left({n}_{+}-{n}_{-}\right)+i\left({\kappa}_{+}-{\kappa}_{-}\right).\end{array}$$
(1.1)

Here, subscripts + and – denote right- and left-circular polarization, respectively. In the present study, the incident linearly polarized light is decomposed into right- and left-circularly polarized light. Propagation and phase modulation are calculated separately, and the circular components are subsequently recombined at the output layer. Linearly polarized light \({\overrightarrow{E}}_{x}\) can be expressed as the sum of two circularly polarized lights of equal amplitude and phase:

$$\begin{array}{c}{\overrightarrow{E}}_{x}={\overrightarrow{E}}_{+}+{\overrightarrow{E}}_{-}\end{array}$$
(1.2)

Free-space propagation is calculated using the band-limited angular spectrum method based on the Rayleigh–Sommerfeld diffraction formula40. In the hidden layer, the phase of the electric field is modulated by the Faraday effect. As the right- and left-circularly polarized components propagate through a magnetic medium of thickness d, they acquire the phase factors \(\text{exp}\left(i\omega\frac{{N}_{+}}{c}d\right)\) and \(\text{exp}\left(i\omega\frac{{N}_{-}}{c}d\right)\), respectively:

$$\begin{array}{c}{\overrightarrow{E}}_{\pm}=\frac{{E}_{\pm}}{\sqrt{2}}\left(\genfrac{}{}{0pt}{}{1}{\pm\:i}\right)\text{exp}\left(i\overrightarrow{k}\cdot\overrightarrow{r}-i\omega\left(t-\frac{{N}_{\pm}}{c}d\right)\right)\end{array}$$
(1.3)

where c is the speed of light and \(\omega\) is the angular frequency. Because \({N}_{\pm}=N\pm\varDelta\:N/2\), Eq. (1.3) can be rewritten as

$$\begin{array}{c}{\overrightarrow{E}}_{\pm}=\frac{{E}_{\pm}}{\sqrt{2}}\left(\genfrac{}{}{0pt}{}{1}{\pm\:i}\right)\text{exp}\left(i\overrightarrow{k}\cdot\overrightarrow{r}-i\omega\left(t-\frac{N}{c}d\mp\frac{\varDelta\:N}{2c}d\right)\right).\end{array}$$
(1.4)

The complex Faraday rotation \({{\Phi}}_{\text{F}}\) is defined as

$$\begin{array}{c}{{\Phi}}_{\text{F}}={\theta}_{\text{F}}+i{\eta}_{\text{F}}=-\frac{\omega\:d}{2c}\varDelta\:N,\end{array}$$
(1.5)

where θF denotes the Faraday rotation and ηF represents the Faraday ellipticity. By omitting the common phase term from the electric field in Eq. (1.4) and expressing the remainder in terms of \({{\Phi}}_{\text{F}}\), we obtain

$$\begin{array}{c}{\overrightarrow{E}}_{\pm}=\frac{{E}_{\pm}}{\sqrt{2}}\left(\genfrac{}{}{0pt}{}{1}{\pm\:i}\right)\text{exp}\left(\mp\:i{{\Phi}}_{\text{F}}\right).\end{array}$$
(1.6)
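As a quick consistency check on Eq. (1.6), the sketch below recombines the two circular components for an x-polarized unit input and confirms that the result is the input rotated by the Faraday angle. It assumes equal circular amplitudes chosen so that the components sum to (1, 0) before modulation, and the example call uses a purely real rotation; the function name `faraday_output` is illustrative.

```python
import cmath
import math


def faraday_output(theta_deg, eta_deg=0.0):
    """Recombine the circular components of Eq. (1.6) into (Ex, Ey)."""
    phi = complex(math.radians(theta_deg), math.radians(eta_deg))
    amp = 1 / math.sqrt(2)  # E+ = E-, so the unmodulated sum is (1, 0)
    ex = 0j
    ey = 0j
    for sign in (+1, -1):
        phase = cmath.exp(-sign * 1j * phi)      # exp(-/+ i Phi_F)
        ex += amp / math.sqrt(2) * phase         # first Jones component: 1
        ey += amp / math.sqrt(2) * (sign * 1j) * phase  # second: +/- i
    return ex, ey


ex, ey = faraday_output(theta_deg=5.0)
print(ex, ey)
```

With a real rotation θ the output is (cos θ, sin θ), and reversing the magnetization (θ to −θ) flips only the sign of the y-component, matching the behavior of upward and downward domains described in the Introduction.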

The polarizer oriented at angle \(\varphi\) is represented by the Jones matrix,

$$\begin{array}{c}{P}_{\pm}\left(\varphi\right)=\frac{1}{2}\left\{\left(\begin{array}{cc}{\text{cos}}^{2}\varphi&\frac{1}{2}\text{sin}\left(2\varphi\right)\\\frac{1}{2}\text{sin}\left(2\varphi\right)&{\text{sin}}^{2}\varphi\end{array}\right)\pm\:i\left(\begin{array}{cc}-\frac{1}{2}\text{sin}\left(2\varphi\right)&-{\text{sin}}^{2}\varphi\\{\text{cos}}^{2}\varphi&\frac{1}{2}\text{sin}\left(2\varphi\right)\end{array}\right)\right\}\end{array}$$
(1.7)

The electric field distribution after propagation and transmission through the polarizer is normalized by its maximum value.
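A simple sanity check on Eq. (1.7) is that the two matrices sum to the ordinary Jones matrix of a linear polarizer at angle φ, because the imaginary terms cancel. The sketch below verifies this for φ = 90°, the orientation that blocks the x-component; the function names are hypothetical, and the basis convention behind the ± split is not spelled out in the text, so only the sum is checked.

```python
import math


def polarizer(phi):
    """Standard Jones matrix of a linear polarizer at angle phi (radians)."""
    c, s = math.cos(phi), math.sin(phi)
    return [[c * c, c * s], [c * s, s * s]]


def circular_terms(phi):
    """P+ and P- as written in Eq. (1.7); sin(2*phi)/2 equals cos*sin."""
    c, s = math.cos(phi), math.sin(phi)
    real = polarizer(phi)
    imag = [[-c * s, -s * s], [c * c, c * s]]
    p_plus = [[0.5 * (real[i][j] + 1j * imag[i][j]) for j in range(2)]
              for i in range(2)]
    p_minus = [[0.5 * (real[i][j] - 1j * imag[i][j]) for j in range(2)]
               for i in range(2)]
    return p_plus, p_minus


phi = math.pi / 2  # polarizer axis along y
p_plus, p_minus = circular_terms(phi)
total = [[p_plus[i][j] + p_minus[i][j] for j in range(2)] for i in range(2)]
reference = polarizer(phi)
max_error = max(abs(total[i][j] - reference[i][j])
                for i in range(2) for j in range(2))
print(max_error)
```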

Training method

To train the MO-DNN models, we used supervised learning based on the gradient descent method. The sum of the signal intensities within each class region was calculated from the output images. A softmax function was applied to these signals, and the cross-entropy loss was used as the loss function. Note that the model does not include any nonlinear activation functions. The Faraday rotation for each neuron was treated as a trainable weight. Because the weights took only the binary values \(\pm{\theta}_{\text{F}}\), we adopted a method used in binary neural networks41. During forward propagation, the binary Faraday rotation was used; by contrast, the SignSwish function \({\text{S}\text{S}}_{\beta}\), which is a differentiable function, was used during backpropagation:

$${\text{S}\text{S}}_{\beta}\:\left(x\right)=2\sigma\left(\beta\:x\right)\left[1+\beta\:x\left\{1-\sigma\left(\beta\:x\right)\right\}\right]-1,$$

where σ is the sigmoid function and \(\beta=5\). The model was trained using the Adam optimizer with a learning rate of 0.0001. A total of 60,000 images from the MNIST dataset were used for training, and 10,000 images were used for testing. The training was conducted over 70 epochs with a batch size of 50. The implementation was carried out in Python ver. 3.9.16 using TensorFlow ver. 2.9.2.
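The binarization scheme can be sketched in plain Python as follows. This is an illustrative reading, not the authors' TensorFlow code: the forward pass maps a latent real weight to ±θF with a hard sign, while the backward pass uses the slope of the SignSwish function as a surrogate gradient. The slope is estimated here by a central difference purely for simplicity, and the function names are hypothetical.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def sign_swish(x, beta=5.0):
    """SignSwish as defined in the text: 2*s*(1 + beta*x*(1 - s)) - 1."""
    s = sigmoid(beta * x)
    return 2.0 * s * (1.0 + beta * x * (1.0 - s)) - 1.0


def binarize(weight, theta_f):
    """Forward pass: latent weight -> +theta_f or -theta_f."""
    return theta_f if weight >= 0 else -theta_f


def surrogate_grad(weight, beta=5.0, eps=1e-6):
    """Backward pass: numerical slope of SignSwish (stand-in for autodiff)."""
    return (sign_swish(weight + eps, beta)
            - sign_swish(weight - eps, beta)) / (2 * eps)


for w in (-0.5, -0.1, 0.0, 0.1, 0.5):
    print(w, binarize(w, theta_f=5.0), round(surrogate_grad(w), 4))
```

The surrogate gradient is largest near zero, where its slope equals β, and fades for weights far from the decision boundary, so updates concentrate on neurons whose binary state is still uncertain.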

Magneto-optical film

A Y0.5Bi2.5Fe4GaO12 (Bi, Ga: YIG) film was prepared by the metal-organic decomposition (MOD) method42. An MOD solution with a Y: Bi: Fe: Ga ratio of 0.5:2.5:4.0:1.0 (Kojundo Chemical Laboratory, YBiFeGa-04(0.5/2.5/4/1)) was spin-coated onto a Gd3Ga5O12(111) substrate at 3000 rpm for 30 s, dried at 100 °C for 30 min, and pre-annealed at 430 °C for 30 min. This process was repeated five times. The sample was then annealed in a furnace at 700 °C for 3 h. The process from spin-coating to annealing was repeated to obtain a thicker film. The magnetic and optical properties of the sample were measured using a multichannel MO spectrometer43. Figure 9 shows the Faraday rotation and transmittance spectrum and the Faraday hysteresis loop at a wavelength of 515 nm. The Bi, Ga: YIG film showed a Faraday rotation of − 3.3° at a wavelength of 532 nm and a perpendicular magnetic anisotropy with a coercivity field HC of 300 Oe.

Fig. 9

Characteristics of Y0.5Bi2.5Fe4GaO12 (Bi, Ga: YIG) films prepared by MOD method. (a) Faraday rotation and transmittance spectrum. (b) Faraday rotation hysteresis at wavelength of 515 nm.

Optical setup for MO-DNN experiments

A green laser with a wavelength of 532 nm (Kochi Toyonaka Giken, GSHG-2050 F) was used as the light source for the MO-DNN. Input images were generated using an LCoS-SLM (Santec, SLM-210). Amplitude modulation was implemented by combining the SLM with a polarizing beam splitter and a half-wave plate. The magnetic domain pattern was recorded using a thermomagnetic recording technique with laser scanning27. Irradiation with a violet laser (Thorlabs, LP405C1) induced localized heating, which reduced the coercivity field of the film. Under an applied bias magnetic field, the magnetization was recorded selectively in the irradiated regions. The laser beam was deflected using a two-axis galvanometer mirror (Thorlabs, GVS202). A scanning lens (Thorlabs, CLS-SL) and a tube lens (Thorlabs, TTL200-A) were used to make the laser spot size on the focal plane uniform. The laser beam was focused on the film by an objective lens (Mitutoyo, M Plan Apo 10×). The laser power, laser pulse width, and applied bias magnetic field were 12 mW, 16 µs, and 30 Oe, respectively. The output images were acquired using a microscope setup consisting of the objective lens (Mitutoyo, M Plan Apo 10×), an imaging lens, and a CMOS camera (Baumer, VCXU.2-50MP). The polarizer was inserted between the objective lens and the imaging lens.
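The recording parameters above are consistent with the switching energy of 192 nJ per pixel quoted in the discussion, as the short calculation below shows. The totals for a full 112 × 112 pattern assume that every pixel receives exactly one pulse, which is an upper-bound assumption made here for illustration and not a figure stated in the text.

```python
laser_power_w = 12e-3     # 12 mW
pulse_width_s = 16e-6     # 16 microseconds
n_side = 112              # neurons per side in the experiment

energy_per_pixel_j = laser_power_w * pulse_width_s
n_pixels = n_side ** 2

# Upper bounds, assuming one pulse per pixel (illustrative assumption)
total_energy_j = energy_per_pixel_j * n_pixels
total_pulse_time_s = pulse_width_s * n_pixels

print(f"energy per pixel: {energy_per_pixel_j * 1e9:.1f} nJ")
print(f"pixels: {n_pixels}")
print(f"total optical energy: {total_energy_j * 1e3:.2f} mJ")
print(f"total pulse time: {total_pulse_time_s:.3f} s")
```

Under this assumption the summed pulse duration is a fraction of the 1 s reconfiguration time, which suggests that scanning overhead, not the exposure itself, sets the rewrite time in the present setup.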