Abstract
We present a physics-guided deep learning framework to address common limitations in Confocal Laser Scanning Microscopy (CLSM), including diffraction-limited resolution, noise, and undersampling under low laser power conditions. The optical system’s point spread function and the primary CLSM image degradation mechanisms (photon shot noise, dark current noise, motion blur, speckle noise, and undersampling) are explicitly incorporated into the model as physics-based constraints. A convolutional autoencoder is trained with a custom loss function that integrates these optical degradation processes, ensuring that the reconstructed images adhere to physical image formation principles. The model is evaluated on simulated CLSM datasets generated from experimentally observed CLSM noise characteristics. Statistical comparisons, including intensity histograms, spatial frequency distributions, and structural similarity metrics, confirm that the synthetic dataset closely matches real CLSM data. The proposed approach is compared with traditional image reconstruction methods, including Richardson-Lucy deconvolution, non-negative least squares, and total variation regularization. Results indicate that the physics-constrained autoencoder improves structural detail recovery while maintaining consistency with known CLSM imaging physics. This study demonstrates that physics-guided deep learning can provide an alternative computational approach to CLSM enhancement, complementing existing optical correction methods. Future work will focus on further validation using experimental CLSM acquisitions.
Introduction
Recent progress in microscopy demands higher imaging speeds to accommodate dynamic biological studies1, such as cell biology2, 3D genome organization3, developmental biology4, single-cell optical imaging and microscopy5, mass spectrometry imaging6, and high-throughput experiments7. At the same time, resolution enhancements often come with increased noise because fewer photons are available per pixel, especially under low-illumination conditions. These challenges call for methods that enable efficient data acquisition without compromising image quality.
Confocal laser scanning microscopy (CLSM) offers high-resolution imaging with optical sectioning and detailed 3-D reconstructions of porous media8, corneal surfaces9, human cementocytes10, orthopedic research11, breast cancer12, cell biology13, and in optical microscopy of various materials14.
However, under practical conditions, CLSM images are often degraded by diffraction-limited resolution, which is fundamentally dictated by the system’s point spread function (PSF)15,16. The PSF imposes intrinsic limits on spatial resolution and introduces blurring artifacts17,18, and its precise characterization remains essential for understanding imaging performance across various modalities19. Additionally, dark current and speckle noise further degrade CLSM image quality, making it challenging to maintain clarity under low-light and/or power-sensitive conditions. Undersampling or non-uniform illumination worsens these issues further.
Traditional approaches to improving CLSM image quality rely on post-acquisition techniques such as deconvolution and denoising, which often address the symptoms of degradation but are unable to incorporate the underlying physical principles of image formation. Hardware-based solutions, including advanced detectors or adaptive optics, provide direct improvements but are often costly and not universally accessible.
Deep learning has demonstrated remarkable success in image restoration, with both general-purpose frameworks20,21 and application-specific designs, such as those tailored for endoscopic videos22 and tomographic imaging23. However, most existing methods are constructed as broadly applicable models, often without explicitly incorporating the underlying physics of the imaging process. Physics-guided computational methods offer an alternative by incorporating optical system constraints directly into imaging models. Unlike standard deep learning approaches, which primarily optimize visual fidelity, physics-guided models integrate physical degradation mechanisms such as PSF convolution, photon shot noise, motion blur, dark current, speckle noise, and undersampling into the training process. This ensures that image reconstruction remains consistent with the physical principles of CLSM.
In the current work, we developed a physics-guided autoencoder for CLSM image restoration that incorporates the imaging system’s point spread function (PSF), diffraction effects, noise mechanisms, and sampling constraints directly into its design. This study addresses the imaging challenges posed by noisy conditions in CLSM of various biosystems, where prolonged light exposure and/or high-intensity lasers can damage or, in some cases, destroy the samples. Preserving sample integrity therefore requires imaging under low-light, low-laser-power conditions, which in turn causes undersampling, increased noise levels, and reduced image resolution. By integrating these physical constraints into an autoencoder, we aim to provide a robust solution for restoring high-quality CLSM images while minimizing the impact on delicate biological samples. We simulated common CLSM imaging degradations, including photon shot noise, motion blur, speckle noise, and undersampling, and used them to train a physics-constrained neural network that restores high-quality CLSM images from degraded inputs. Our method reduces the need for additional hardware or modifications to existing CLSM systems, focusing instead on software-based improvements to image quality. The proposed method is tested on varied synthetic CLSM datasets to demonstrate its capability to restore spatial resolution and reduce noise while maintaining consistency with the physical principles of CLSM imaging.
Materials and methods
Common noise types in confocal laser scanning microscopy
Confocal laser scanning microscopy (CLSM) imaging faces inherent limitations due to physical degradations caused by optical diffraction, noise sources, and undersampling. These degradations are amplified under conditions of low laser power, which is essential to prevent photodamage to delicate samples.
The imaging resolution in CLSM is fundamentally limited by diffraction, characterized by the point spread function (PSF)16, which defines the intensity distribution in the focal plane. For an ideal circular aperture, the PSF follows the Airy disk profile24,25:

\(PSF\left(r\right)={\left(\frac{2{J}_{1}\left(\pi Dr/\lambda\right)}{\pi Dr/\lambda+\varepsilon}\right)}^{2}\)

where \({J}_{1}\) is the first-order Bessel function of the first kind, \(D\) is the numerical aperture, \(r\) is the radial distance in the focal plane, \(\lambda\) is the laser wavelength, and \(\varepsilon\) is a small constant introduced to avoid division by zero.
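The Airy-disk profile can be evaluated numerically; below is a minimal sketch, in which the grid size, pixel pitch, and the explicit handling of the center pixel (where the ratio tends to its limit of 1) are illustrative choices rather than values from the paper.

```python
import numpy as np
from scipy.special import j1  # first-order Bessel function of the first kind

def airy_psf(size=15, na=1.4, wavelength=500e-9, pixel_size=50e-9, eps=1e-9):
    """Sampled Airy-disk PSF on a size x size grid, normalized to unit energy."""
    ax = (np.arange(size) - size // 2) * pixel_size
    xx, yy = np.meshgrid(ax, ax)
    r = np.hypot(xx, yy)
    v = np.pi * na * r / wavelength          # dimensionless radial coordinate
    # the limit of 2*J1(v)/v as v -> 0 is 1, so treat the center pixel explicitly
    amp = np.where(v > 0, 2.0 * j1(v) / (v + eps), 1.0)
    psf = amp ** 2
    return psf / psf.sum()
```

Normalizing the kernel to unit energy keeps total image intensity unchanged when it is used for convolution.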
The lateral resolution of CLSM is then24,25

\(\Delta{r}_{lateral}\approx\frac{0.4\lambda}{NA}\)

Similarly, the axial resolution (which defines depth discrimination) is given as

\(\Delta{z}_{axial}\approx\frac{1.4\lambda n}{{NA}^{2}}\)

where \(NA\) is the numerical aperture and \(n\) is the refractive index of the immersion medium.
Several other noise sources degrade CLSM images. Photon shot noise arises from the quantum nature of light: the number of detected photons fluctuates statistically, which limits the signal-to-noise ratio at low photon counts. For an optical signal \(I\left(x,y\right)\) measured in photon counts per pixel, the arrival of photons follows a Poisson process. At moderate to high photon flux, this Poisson distribution can be approximated by a Gaussian distribution with variance equal to the signal intensity, so the intensity affected by photon shot noise is modeled as

\({I}_{shot}\left(x,y\right)=I\left(x,y\right)+\sqrt{I\left(x,y\right)}\,{R}_{random}\)

where \(I\left(x,y\right)\) is the true intensity for a pixel at \(\left(x,y\right)\) and \({R}_{random}\) is a Gaussian random variable with zero mean and unit variance.
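This Gaussian approximation to shot noise can be sketched in a few lines; the function name and the final clipping to non-negative counts are our implementation choices.

```python
import numpy as np

def add_shot_noise(img, seed=None):
    """Gaussian approximation to Poisson shot noise: I + sqrt(I) * N(0, 1).
    Assumes img holds non-negative photon counts per pixel."""
    rng = np.random.default_rng(seed)
    noisy = img + np.sqrt(np.clip(img, 0.0, None)) * rng.standard_normal(img.shape)
    return np.clip(noisy, 0.0, None)  # photon counts cannot be negative
```

Because the noise standard deviation equals the square root of the signal, brighter pixels receive proportionally less relative noise, which is the hallmark of shot-noise-limited imaging.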
Similarly, thermally activated carriers within the photodetector contribute a background signal even in the absence of illumination. This dark current is independent of the signal and can be modeled as an additive zero-mean Gaussian process, so the intensity \(I\left(x,y\right)\) for a pixel at \(\left(x,y\right)\) affected by dark current noise is expressed mathematically as

\({I}_{dark}\left(x,y\right)=I\left(x,y\right)+{\sigma}_{dark}\,{R}_{random}\)

where \({\sigma}_{dark}\) is the dark current noise strength.
Speckle arises from coherent interference of backscattered laser light and manifests as multiplicative intensity fluctuations, typically modeled as a random modulation of the underlying intensity. For the same pixel, this noise is modeled as multiplicative, with \({\sigma}_{speckle}\) as the speckle noise strength:

\({I}_{speckle}\left(x,y\right)=I\left(x,y\right)\left(1+{\sigma}_{speckle}\,{R}_{random}\right)\)
Mechanical instability, drift, or sample motion during the raster scan introduces spatially coherent blur. This is modeled as a deterministic convolution of the true image with a one-dimensional linear kernel \({K}_{blur}\) of fixed length and direction:

\({I}_{blur}\left(x,y\right)=I\left(x,y\right)*{K}_{blur}\)

where ‘*’ denotes 2D convolution. The kernel is chosen to emulate a 5-pixel linear blur at random angles.
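A directional blur kernel of this kind can be constructed by rotating a horizontal line kernel; the rotation-and-renormalize approach below is one common implementation, not necessarily the authors' exact one.

```python
import numpy as np
from scipy import ndimage

def motion_blur(img, length=5, angle_deg=0.0):
    """Convolve the image with a 1-D linear kernel of the given length and angle."""
    k = np.zeros((length, length))
    k[length // 2, :] = 1.0                           # horizontal line through the center
    k = ndimage.rotate(k, angle_deg, reshape=False, order=1)
    k = np.clip(k, 0.0, None)
    k /= k.sum()                                      # preserve total intensity
    return ndimage.convolve(img, k, mode="reflect")
```

Renormalizing the rotated kernel keeps the mean image intensity unchanged, so the blur redistributes photons rather than adding or removing them.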
Fluctuations in laser power may also introduce noise in CLSM images. They are represented by a spatially random but globally multiplicative scaling factor, expressed mathematically as

\({I}_{laser}\left(x,y\right)=I\left(x,y\right)\left(1+{\sigma}_{laser}\,{R}_{random}\right)\)

where \({\sigma}_{laser}\) is the laser fluctuation strength.
The undersampling of a CLSM image can be simulated using a binary mask \(M\left(x,y\right)\) with pixel omission probability \(p\), defined as

\(M\left(x,y\right)=\begin{cases}0&\text{with probability }p\\1&\text{with probability }1-p\end{cases}\)

The affected intensity is then

\({I}_{under}\left(x,y\right)=I\left(x,y\right)\,M\left(x,y\right)\)
To ensure reproducibility of our synthetic data generation, we detail below the specific degradation parameters used in the simulation pipeline, which were selected to reflect typical noise characteristics observed in experimental CLSM systems. The point spread function (PSF) was modeled using an Airy disk approximation with a numerical aperture (NA) of 1.4 and an emission wavelength λ = 500 nm. Photon shot noise was introduced via Poisson-distributed counts with a scaling factor of 0.1, while Gaussian-distributed additive noise with a standard deviation σ = 0.05 was used to simulate dark current. Speckle noise was applied as multiplicative Gaussian noise with σ = 0.05. Motion blur was introduced by convolving the image with a horizontal linear kernel of size 5 pixels. Additionally, multiplicative Gaussian noise with σ = 0.05 was used to emulate laser power fluctuations. Undersampling was simulated using a random binary mask that removed 20% of the pixels, corresponding to an undersampling ratio of 0.2. Finally, to simulate optical attenuation, an exponential decay term with an attenuation coefficient of 1.20 was applied to the convolved image. These parameter settings form the basis of our degradation model and were held constant throughout training. These details are also summarized in Table 1.
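As a sketch of how these parameters combine into one degradation pipeline, the function below applies them in sequence to an image normalized to [0, 1]. The ordering of the steps and the depth-wise attenuation geometry are our assumptions, and a Gaussian blur stands in for the Airy-disk PSF convolution.

```python
import numpy as np
from scipy import ndimage

def degrade(img, seed=None):
    """Sketch of the degradation pipeline with the parameters quoted above.
    Step ordering and attenuation geometry are assumptions; a Gaussian blur
    stands in for the Airy-disk PSF convolution."""
    rng = np.random.default_rng(seed)
    h, _ = img.shape
    x = ndimage.gaussian_filter(img, sigma=1.0)                 # PSF blur (stand-in)
    x = x * np.exp(-1.20 * np.linspace(0.0, 1.0, h))[:, None]   # attenuation, coeff 1.20
    x = rng.poisson(x / 0.1) * 0.1                              # shot noise, scale 0.1
    x = x + rng.normal(0.0, 0.05, x.shape)                      # dark current, sigma 0.05
    x = x * (1.0 + rng.normal(0.0, 0.05, x.shape))              # speckle, sigma 0.05
    x = ndimage.convolve(x, np.full((1, 5), 0.2), mode="reflect")  # 5-px motion blur
    x = x * (1.0 + rng.normal(0.0, 0.05, x.shape))              # laser fluctuation
    mask = rng.random(x.shape) >= 0.2                           # drop 20% of pixels
    return np.clip(x * mask, 0.0, 1.0)
```

Pairing each clean training image with `degrade(image)` yields the degraded/ground-truth pairs used for supervised training.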
Adaptive physics autoencoder model
To mitigate these degradations, we developed an autoencoder model whose training is constrained by the physics of confocal laser scanning microscopy and all the noise types described in the previous section. An autoencoder is a type of neural network with an encoder-decoder structure. The encoder maps a noise-degraded image \(X\) into a compressed latent space \(Z\):

\(Z={f}_{\theta}\left(X\right)\)
The latent-space representation \(Z\) encodes the image while filtering out noise, subject to the constraints imposed by the PSF defined earlier, which governs the spatial resolution. The encoder consists of four convolutional layers, which gradually extract features while reducing the image size. Each layer applies a convolution operation followed by a LeakyReLU activation function, allowing the model to detect patterns while suppressing noise. The detailed architecture of the encoder is shown in Table 2, while the decoder is described in Table 3.
The decoder then reconstructs a clean, denoised image \(\widehat{X}\) from \(Z\):

\(\widehat{X}={g}_{\varphi}\left(Z\right)\)
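To make the encoder-decoder structure concrete, here is a minimal PyTorch sketch. The channel counts, kernel sizes, and strides are our assumptions, chosen so that a 1 × 256 × 256 input maps to a 64 × 64 × 256 latent; the authors' exact layer specification is given in Tables 2 and 3.

```python
import torch
import torch.nn as nn

class PhysicsAutoencoder(nn.Module):
    """Sketch of a 4-layer convolutional encoder with a transpose-convolution
    decoder. Layer hyperparameters are illustrative assumptions."""
    def __init__(self):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(1, 32, 3, stride=1, padding=1),    nn.LeakyReLU(0.2),
            nn.Conv2d(32, 64, 3, stride=2, padding=1),   nn.LeakyReLU(0.2),
            nn.Conv2d(64, 128, 3, stride=2, padding=1),  nn.LeakyReLU(0.2),
            nn.Conv2d(128, 256, 3, stride=1, padding=1), nn.LeakyReLU(0.2),
        )
        self.decoder = nn.Sequential(
            nn.ConvTranspose2d(256, 128, 4, stride=2, padding=1), nn.LeakyReLU(0.2),
            nn.ConvTranspose2d(128, 64, 4, stride=2, padding=1),  nn.LeakyReLU(0.2),
            nn.Conv2d(64, 1, 3, padding=1), nn.Sigmoid(),  # intensities in [0, 1]
        )

    def forward(self, x):
        z = self.encoder(x)        # latent representation, (N, 256, 64, 64)
        return self.decoder(z)
```

The final Sigmoid matches the normalized [0, 1] intensity convention used throughout the paper.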
Here \(\theta\) and \(\varphi\) are the encoder and decoder parameters, respectively. The training objective minimizes the Mean Squared Error (MSE) between reconstructed images \(\widehat{X}\) and ground truth images \(X\). For \(N\) training samples, the primary loss term is

\({\mathcal{L}}_{MSE}=\frac{1}{N}\sum_{i=1}^{N}{\Vert {\widehat{X}}_{i}-{X}_{i}\Vert}^{2}\)
To ensure the reconstructed image adheres to the photon conservation law in CLSM, a photon loss between reconstructed and ground truth images is introduced:

\({\mathcal{L}}_{photon}=\left|\sum_{x,y}\widehat{X}\left(x,y\right)-\sum_{x,y}X\left(x,y\right)\right|\)
The morphological preservation of features in CLSM images was ensured by an edge loss, defined via the Laplacian operator \({\nabla}^{2}\) as

\({\mathcal{L}}_{edge}={\Vert {\nabla}^{2}\widehat{X}-{\nabla}^{2}X\Vert}^{2}\)
The total loss function was defined as a weighted sum of the pixel-wise reconstruction loss and the physics-consistency penalties,

\({\mathcal{L}}_{total}={\mathcal{L}}_{MSE}+{\lambda}_{1}{\mathcal{L}}_{photon}+{\lambda}_{2}{\mathcal{L}}_{edge}\)

with scalar hyperparameters \({\lambda}_{1}\) and \({\lambda}_{2}\) governing the relative contributions of each term. \({\lambda}_{1}={\lambda}_{2}=0.1\) were used to moderately favor physics-guided consistency during training.
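A compact NumPy sketch of such a composite loss follows. The precise functional forms of the photon and edge terms below (a total-intensity mismatch and a Laplacian edge map) are our reading of the text, not the authors' exact definitions.

```python
import numpy as np
from scipy.ndimage import laplace

def physics_loss(recon, target, lam1=0.1, lam2=0.1):
    """MSE + photon-conservation penalty + Laplacian edge penalty.
    The exact forms of the photon and edge terms are assumptions."""
    mse = np.mean((recon - target) ** 2)
    photon = abs(recon.sum() - target.sum()) / target.size   # total-intensity mismatch
    edge = np.mean((laplace(recon) - laplace(target)) ** 2)  # edge-map mismatch
    return mse + lam1 * photon + lam2 * edge
```

In a deep learning framework the same three terms would be written with differentiable tensor operations so gradients can flow to the network parameters.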
To simulate real-world imaging conditions encountered in confocal laser scanning microscopy (CLSM), the training dataset was augmented with a series of physically motivated degradations. These included convolution with a point spread function (PSF) modeled as an Airy disk, followed by the sequential application of photon shot noise (Poisson-distributed), additive Gaussian noise representing dark current, and multiplicative Gaussian noise simulating speckle effects. Motion blur was introduced by convolving each image with a linear kernel of 5 pixels at random orientations. Additional multiplicative noise was applied to emulate laser power fluctuations, and random undersampling was implemented by removing 20% of pixels using a binary mask. These augmentation steps were applied consistently across the dataset to train the model under a diverse range of degradation scenarios.
The model was trained using the Adam optimizer with a fixed learning rate of 0.001 and a batch size of 4. All training was conducted on a standard CPU workstation to maximize reproducibility, requiring approximately 6–8 h for 300 epochs on a dataset of ~ 180 images (256 × 256 pixels). When executed on a single mid-range GPU (NVIDIA RTX 3050), total training time reduced to approximately 1 h. The model architecture is compact, comprising roughly 5 million trainable parameters, and achieves inference times under 0.1 s per image on GPU. These characteristics make the model suitable for real-time or near-real-time deployment in microscopy workflows, even in resource-constrained environments.
Figure 1 illustrates the Adaptive Physics Autoencoder for CLSM image restoration. On the left, noisy, degraded CLSM images are input to the encoder, which extracts features using multiple convolutional layers with LeakyReLU activation. The image is compressed into a latent space (64 × 64 × 256) capturing its core structure. The decoder, composed of transpose convolution layers, progressively reconstructs the image while reducing noise and restoring fine details. A final convolutional layer with Sigmoid activation ensures correct intensity scaling. The output on the right shows a restored CLSM image with improved clarity and structural preservation.
Unless otherwise stated, all image intensities are normalized to the range [0, 1] and are unitless. Wavelengths are in nanometers (nm), spatial resolution is expressed in micrometers (µm), and kernel sizes and image dimensions are given in pixels. Standard deviations (σ) for noise models are unitless and refer to normalized image scales. Peak Signal-to-Noise Ratio (PSNR) is reported in decibels (dB), and the Structural Similarity Index (SSIM) is dimensionless. All units used are summarized in Table 4.
Model training
The model was trained to reconstruct clean images by minimizing the total loss function defined by Eq. (7). Training used the Adam optimizer with a learning rate of 0.001 for 1000 epochs with a batch size of 4 images. The model learns to restore CLSM images by continuously improving its ability to remove noise while maintaining physically meaningful structures.
Model evaluation
The reconstructed images were evaluated using the Structural Similarity Index (SSIM), Peak Signal-to-Noise Ratio (PSNR), and intensity profile matching.
- SSIM measures structural similarity between the reconstructed and original images:

\(SSIM\left(X,\widehat{X}\right)=\frac{\left(2{\mu}_{X}{\mu}_{\widehat{X}}+{c}_{1}\right)\left(2{\sigma}_{X\widehat{X}}+{c}_{2}\right)}{\left({\mu}_{X}^{2}+{\mu}_{\widehat{X}}^{2}+{c}_{1}\right)\left({\sigma}_{X}^{2}+{\sigma}_{\widehat{X}}^{2}+{c}_{2}\right)}\)

where \(\mu\) denotes the mean values, \({\sigma}^{2}\) the variances, \({\sigma}_{X\widehat{X}}\) the covariance between the images, and \({c}_{1}\), \({c}_{2}\) small stabilizing constants.
- Peak Signal-to-Noise Ratio (PSNR) measures the pixel-wise accuracy of the reconstructed image:

\(PSNR=10\,{\log}_{10}\left(\frac{{\left(max.I\right)}^{2}}{MSE}\right)\)

where \(max.I\) is the maximum possible intensity value.
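Both metrics can be computed directly; the sketch below uses a single-window SSIM for brevity (library implementations such as scikit-image apply sliding local windows), and the constants \(c_{1}=10^{-4}\), \(c_{2}=9\times10^{-4}\) follow the common choice for unit dynamic range.

```python
import numpy as np

def psnr(x, ref, max_i=1.0):
    """PSNR in dB for images on a [0, max_i] scale."""
    mse = np.mean((x - ref) ** 2)
    return np.inf if mse == 0 else 10.0 * np.log10(max_i ** 2 / mse)

def ssim_global(x, y, c1=1e-4, c2=9e-4):
    """Single-window SSIM computed over the whole image."""
    mx, my = x.mean(), y.mean()
    vx, vy = x.var(), y.var()
    cov = np.mean((x - mx) * (y - my))
    return ((2 * mx * my + c1) * (2 * cov + c2)) / \
           ((mx ** 2 + my ** 2 + c1) * (vx + vy + c2))
```

Identical images give SSIM = 1 and infinite PSNR; degradations lower both scores.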
The performance of the autoencoder was also compared against traditional image restoration methods.
- Richardson-Lucy deconvolution iteratively updates the estimate according to

\({X}^{\left(k+1\right)}={X}^{\left(k\right)}\left[\left(\frac{I}{PSF*{X}^{\left(k\right)}}\right)*{PSF}^{\dagger}\right]\)

where ‘*’ represents convolution and \({PSF}^{\dagger}\) is the flipped point spread function.
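The Richardson-Lucy update rule can be implemented in a few lines; the flat initialization and the small denominator guard below are implementation choices.

```python
import numpy as np
from scipy.signal import fftconvolve

def richardson_lucy(image, psf, n_iter=30):
    """Richardson-Lucy iteration: X_(k+1) = X_k * [(I / (PSF * X_k)) * PSF_flipped]."""
    x = np.full_like(image, 0.5)
    psf_flipped = psf[::-1, ::-1]
    for _ in range(n_iter):
        denom = fftconvolve(x, psf, mode="same")
        ratio = image / np.clip(denom, 1e-12, None)
        x = np.clip(x * fftconvolve(ratio, psf_flipped, mode="same"), 0.0, None)
    return x
```

Because the update is multiplicative, the estimate stays non-negative, which matches the physics of photon counts.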
- Non-Negative Least Squares (NNLS) solves \(\underset{X}{\text{min}}{\Vert PSF*X-I\Vert}^{2}\) subject to \(X\ge 0\)20.
- Total Variation (TV) regularization minimizes

\(\underset{X}{\text{min}}{\Vert PSF*X-I\Vert}^{2}+\lambda\,TV\left(X\right)\)

where \(\lambda\) controls the smoothness of the reconstructed image.
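A minimal gradient-descent sketch for this objective with a smoothed TV term is shown below; the step size, iteration count, and smoothing constant ε are our choices, and production code would typically use a dedicated solver (e.g. a Chambolle-type algorithm) instead.

```python
import numpy as np
from scipy.signal import fftconvolve

def tv_deconvolve(image, psf, lam=0.01, step=0.1, n_iter=100, eps=1e-6):
    """Gradient descent on ||PSF * X - I||^2 + lam * TV(X), with smoothed TV."""
    x = image.copy()
    psf_flipped = psf[::-1, ::-1]
    for _ in range(n_iter):
        resid = fftconvolve(x, psf, mode="same") - image
        data_grad = 2.0 * fftconvolve(resid, psf_flipped, mode="same")
        gx = np.diff(x, axis=1, append=x[:, -1:])      # forward differences
        gy = np.diff(x, axis=0, append=x[-1:, :])
        norm = np.sqrt(gx ** 2 + gy ** 2 + eps)
        div = (np.diff(gx / norm, axis=1, prepend=(gx / norm)[:, :1]) +
               np.diff(gy / norm, axis=0, prepend=(gy / norm)[:1, :]))
        x = x - step * (data_grad - lam * div)         # TV gradient is -div(grad/|grad|)
    return x
```

Larger \(\lambda\) suppresses noise more aggressively at the cost of smoothing fine structure, which is the trade-off noted in the comparisons below.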
Results and discussion
The individual effect of each noise type introduced in Sect. 2.1, simulated on an example image, is displayed in Fig. 2, with the corresponding intensity profiles shown in Fig. 2h. Real-life CLSM imaging, however, may contain a combination of several or all of these noise types.
Common degradation mechanisms in confocal laser scanning microscopy (CLSM). (a) Point spread function optical diffraction and blurring. (b) Photon shot noise. (c) Dark current noise. (d) Speckle noise. (e) Motion blur. (f) Laser fluctuation noise. (g) Undersampling artifacts. (h) Intensity profiles corresponding to each noise type.
Figure 3 displays the sequential degradation of an example image. Following this, the training data was augmented by applying CLSM-specific noise types (PSF convolution, photon shot noise, motion blur, speckle noise, and undersampling), so that the model learns degradation characteristics aligned with experimentally observed CLSM conditions.
While our synthetic degradations are physically grounded, real CLSM images often contain unmodeled effects such as illumination drift, photobleaching, and sample heterogeneity, which may affect model performance under real conditions. To address this, we are initiating collaborations to acquire paired experimental datasets for validation and fine-tuning. Our framework supports transfer learning and can be adapted to real-world data once such training sets are available, making it a scalable starting point for practical deployment.

The training progression of the Adaptive Physics Autoencoder for restoring CLSM images is shown in Fig. 4. Starting from the degraded input (Network Input), the reconstructed images progressively converge toward the Ground Truth, with gradual recovery of fine structural details as training proceeds. This is complemented quantitatively by normalized intensity profiles that show improved fidelity in reconstructions, with closer alignment to the Ground Truth over successive epochs. Quantitative metrics, i.e., the Structural Similarity Index (SSIM) and Peak Signal-to-Noise Ratio (PSNR), also indicate consistent improvements in image quality throughout training, reaching optimal performance at 300 epochs.
Performance on CLSM images of lipid droplet morphology in adipocytes
We first demonstrated the ability of the proposed Adaptive Physics Autoencoder to improve the degraded CLSM images of lipid droplets in a gel matrix. Figure 5 shows the visualization of the recovery performance of different algorithms.
The Network Input represents the noisy, degraded image, while the ground truth shows the high-resolution target. The network output demonstrates the reconstructed image compared to the results from the Richardson-Lucy (RL) and Non-Negative Least Squares (NNLS) deconvolution algorithms. Enlarged Regions of Interest (ROIs) highlight the superior restoration of droplet morphology and surrounding features by the proposed network. Quantitative evaluations using SSIM and PSNR further validate the performance improvement achieved by the proposed method.
The network output images are inferred from the degraded images, i.e., the network input image. We compare the visualized reconstruction performance of the proposed network with widely used image deconvolution algorithms, including the non-negative least squares (NNLS) algorithm and the RL algorithm.
For a fair comparison, the degraded images were up-sampled by bicubic interpolation before being deconvolved. It can be seen that the resolution of the images processed by the two deconvolution algorithms is somewhat improved compared to the input images.
The reconstructed confocal images of the network output, however, present a much finer structure than the deconvolution results. From the enlarged images of the white dotted line frame shown in the bottom row of Fig. 5, we can see that the network output image has reconstructed the circular droplet structures with a more obvious outline and the surrounding network structure closer to the real image.
Furthermore, we quantified the performance of the images generated by the different algorithms using the SSIM and peak signal-to-noise ratio (PSNR) indices. The quantitative results (bottom row) show that the SSIM of our algorithm is approximately 0.98 and the PSNR exceeds 36 dB. The experiment was repeated with 20 images, yielding similar results.
Performance on CLSM images of neuronal networks in cerebral organoids
The proposed Adaptive Physics Autoencoder was also applied to the CLSM synthetic images of neuronal networks of cerebral organoids to assess its capability in reconstructing structurally complex and densely connected systems.
Figure 6 illustrates the comparative reconstruction performance of the network alongside traditional deconvolution methods. The network’s output is directly inferred from degraded inputs, while the other approaches, the RL and non-negative least squares (NNLS) deconvolution algorithms, require preprocessing, including bilinear interpolation to match resolution requirements. Unlike the RL and NNLS methods, which partially enhance the image resolution but struggle to reconstruct fine neuronal details, the network output excels in restoring the complexity of the neuronal architecture, including well-defined filaments and continuous network structures. ROIs in the bottom row further emphasize the network’s ability to recover delicate connections and accurately represent the topology of neuronal systems. Quantitative metrics support these visual observations, with the network achieving an SSIM of approximately 0.98 and a PSNR of 35.88 dB, significantly outperforming the deconvolution-based methods.
Performance on CLSM images of sparse fibrillar structures
The evaluation of the proposed Adaptive Physics Autoencoder was extended to synthetic CLSM images of sparse fibrillar structures. Figure 7 displays the reconstruction performance of our model compared to traditional deconvolution methods. While RL and NNLS algorithms result in slight enhancements to the resolution of the input images, they struggle to accurately reconstruct the sparse fibrillar features. On the other hand, the network output demonstrates significant improvements, with well-defined fibrillar structures and enhanced contrast closely resembling the Ground Truth.
Quantitative analysis also confirms these visual findings. The SSIM and PSNR values of the network output are markedly higher than those of RL and NNLS, with an SSIM of approximately 0.94 and a PSNR of 35.25 dB.
Performance comparison with traditional reconstruction methods
The proposed Adaptive Physics Autoencoder is further evaluated by comparing its performance to additional image reconstruction methods, including Total Variation (TV) regularization, Wiener filtering, and Wavelet denoising, on simulated CLSM images of spherical structures embedded in a gel matrix, resembling microplastic particles. Figure 8 demonstrates the reconstruction performance of all methods using low-resolution, noisy inputs (Network Input) as the starting point, with the Ground Truth serving as the ideal reference.
While TV regularization, Wiener filtering, and Wavelet denoising improve the input image quality to some extent, they struggle to reconstruct fine details and introduce unwanted artifacts or excessive smoothing. On the other hand, the network output achieves better reconstruction of the spherical structures and preserves their boundaries and overall morphology closely matching the Ground Truth. The ROIs show the superior performance of the network in restoring high-quality CLSM images.
Comparison of proposed model with traditional methods on simulated confocal laser scanning microscopy (CLSM) images of spherical structures embedded in a gel matrix, resembling microplastic particles, comparing the reconstruction performance of the proposed Adaptive Physics Autoencoder with traditional methods.
Ablation study on physics-guided loss
To quantitatively and qualitatively assess the role of Physics-guided loss terms in the training process, we performed an ablation study by training our Physics-guided model under two conditions: (1) using mean squared error (MSE) alone as the loss function, and (2) using the full Physics-guided composite loss incorporating MSE, photon-consistency penalty, and Laplacian-based edge preservation. The goal was to isolate the contribution of the physics constraints to reconstruction fidelity across training epochs.
Results of this study on lysosome-like structures are displayed in Fig. 9. Visual comparisons over training epochs (5 to 300) reveal that both versions of the model progressively reconstruct finer structural details. However, the physics-guided variant consistently achieves sharper, more accurate restorations earlier in training and with greater structural consistency in localized regions (magnified insets). This advantage is particularly evident in the normalized intensity profiles, where the physics-augmented model more closely tracks the ground truth peaks and valleys across the pixel domain.
Quantitative trends in SSIM and PSNR further support this finding. As shown in the comparison plots in Fig. 9, the model trained with physics penalties exhibits a steeper and more consistent improvement trajectory in both metrics. At 300 epochs, it reaches an SSIM of ~0.93 and a PSNR exceeding 33 dB, both higher than the corresponding values for the MSE-only model (~0.80 SSIM, ~31.5 dB PSNR). Notably, the performance gap begins to widen substantially beyond 150 epochs, indicating that the added physical constraints guide the network toward more faithful and data-consistent solutions.
These results validate the efficacy of Physics-guided regularization in accelerating convergence and enhancing fidelity in image reconstruction, thereby demonstrating the non-trivial impact of physical priors in microscopy-specific restoration tasks.
Ablation study comparing training with physics-guided loss versus MSE-only loss on lysosome-like structures. The top panel shows reconstructed outputs at selected epochs (5-300), intensity profiles, and SSIM and PSNR progression using the physics-constrained loss, while the bottom panel displays the same outputs and metrics for MSE-only training. Physics-based training shows superior structural recovery and faster convergence across all metrics.
Conclusions
Our method reduces equipment costs, ensures consistency with real imaging conditions, and removes humans from the loop, a step towards self-driving labs. Comparisons with widely used deconvolution algorithms and other reconstruction methods demonstrated its ability to recover fine structural details, validated both qualitatively and quantitatively through network output versus ground truth comparisons and SSIM and PSNR metrics. In summary, the Adaptive Physics Autoencoder represents a step forward in interpretable deep learning by integrating physics-guided models with deep learning for confocal microscopy imaging, real-time denoising of biological and medical imaging, and self-driving labs.
Data availability
The datasets used in this paper are available upon reasonable request from the corresponding author. Specific details about data sources and preprocessing steps are described in the Materials and Methods section.
References
McLeod, E. & Ozcan, A. Unconventional methods of imaging: computational microscopy and compact implementations. Rep. Prog. Phys. 79 (7), 076001 (2016).
Fischer, R. S., Wu, Y., Kanchanawong, P., Shroff, H. & Waterman, C. M. Microscopy in 3D: a biologist’s toolbox. Trends Cell Biol. 21 (12), 682–691 (2011).
Jerkovic, I. & Cavalli, G. Understanding 3D genome organization by multidisciplinary methods. Nat. Rev. Mol. Cell Biol. 22 (8), 511–528 (2021).
Cutrale, F., Fraser, S. E. & Trinh, L. A. Imaging, visualization, and computation in developmental biology. Annual Rev. Biomedical Data Sci. 2 (1), 223–251 (2019).
Stender, A. S. et al. Single cell optical imaging and spectroscopy. Chem. Rev. 113 (4), 2469–2527 (2013).
Buchberger, A. R., DeLaney, K., Johnson, J. & Li, L. Mass spectrometry imaging: a review of emerging advancements and future insights. Anal. Chem. 90 (1), 240 (2017).
Hauser, M. et al. Correlative super-resolution microscopy: new dimensions and new opportunities. Chem. Rev. 117 (11), 7428–7456 (2017).
Shah, S., Crawshaw, J. & Boek, E. Three-dimensional imaging of porous media using confocal laser scanning microscopy. J. Microsc. 265 (2), 261–271 (2017).
Zhivov, A., Stachs, O., Stave, J. & Guthoff, R. F. In vivo three-dimensional confocal laser scanning microscopy of corneal surface and epithelium. Br. J. Ophthalmol. 93 (5), 667–672 (2009).
Scivetti, M., Pilolli, G. P., Corsalini, M., Lucchese, A. & Favia, G. Confocal laser scanning microscopy of human cementocytes: analysis of three-dimensional image reconstruction. Annals Anatomy-Anatomischer Anzeiger. 189 (2), 169–174 (2007).
Jones, C. W., Smolinski, D., Keogh, A., Kirk, T. & Zheng, M. Confocal laser scanning microscopy in orthopaedic research. Prog. Histochem. Cytochem. 40 (1), 1–71 (2005).
Liu, S., Weaver, D. L. & Taatjes, D. J. Three-dimensional reconstruction by confocal laser scanning microscopy in routine pathologic specimens of benign and malignant lesions of the human breast. Histochem. Cell Biol. 107 (4), 267–278 (1997).
Wright, S. J. et al. Introduction to confocal microscopy and three-dimensional reconstruction. Methods Cell. Biol. 38, 1–45 (1993).
Teng, X., Li, F. & Lu, C. Visualization of materials using the confocal laser scanning microscopy technique. Chem. Soc. Rev. 49 (8), 2408–2425 (2020).
Braat, J. J., van Haver, S., Janssen, A. J. & Dirksen, P. Assessment of optical systems by means of point-spread functions. Progress Opt. 51, 349–468 (2008).
Rossmann, K. Point spread-function, line spread-function, and modulation transfer function: tools for the study of imaging systems. Radiology 93 (2), 257–272 (1969).
Xie, X., Chen, Y., Yang, K. & Zhou, J. Harnessing the point-spread function for high-resolution far-field optical microscopy. Phys. Rev. Lett. 113 (26), 263901 (2014).
Stallinga, S. & Rieger, B. Accuracy of the Gaussian point spread function model in 2D localization microscopy. Opt. Express. 18 (24), 24461–24476 (2010).
Karataev, P. et al. First observation of the point spread function of optical transition radiation. Phys. Rev. Lett. 107 (17), 174801 (2011).
Su, J., Xu, B. & Yin, H. A survey of deep learning approaches to image restoration. Neurocomputing 487, 46–65 (2022).
Tai, Y., Yang, J., Liu, X. & Xu, C. (eds) Memnet: A persistent memory network for image restoration. In Proceedings of the IEEE international conference on computer vision (2017).
Ali, S. et al. A deep learning framework for quality assessment and restoration in video endoscopy. Med. Image. Anal. 68, 101900 (2021).
Wang, G., Ye, J. C. & De Man, B. Deep learning for tomographic image reconstruction. Nat. Mach. Intell. 2 (12), 737–748 (2020).
Born, M. & Wolf, E. Principles of Optics: Electromagnetic Theory of Propagation (Elsevier, 2013).
Richards, B. & Wolf, E. Electromagnetic diffraction in optical systems, II. Structure of the image field in an aplanatic system. Proc. Royal Soc. Lond. Ser. Math. Phys. Sci. 253 (1274), 358–379 (1959).
Acknowledgements
The authors declare that no external funding was received for this work.
Author information
Authors and Affiliations
Contributions
Z.A. conceived the idea, generated the simulated CLSM datasets, designed and implemented the deep learning model, performed training and data analysis, and wrote the original draft. J.S., and A.H., assisted in developing and scripting the deep learning network and contributed to writing. U.S., and T.Q. contributed to data analysis and manuscript writing. A.S., Z.E.K., S.M., S.P., O.A.R., and B.A supported data analysis, validation, and manuscript revision. W.M. supervised the project as principal investigator and contributed to manuscript writing and revision. All authors reviewed and approved the final manuscript.
Corresponding author
Ethics declarations
Competing interests
The authors declare no competing interests.
Additional information
Publisher’s note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.
About this article
Cite this article
Ahmad, Z., Shabeer, J., Hidayat, A. et al. Enhanced confocal microscopy with physics-guided autoencoders via synthetic noise modeling. Sci Rep 16, 4842 (2026). https://doi.org/10.1038/s41598-025-34839-x