Efficient real-world image denoising using multi-scale gaussian pyramids

Rani, Asha; Bhogal, Rosepreet Kaur

doi:10.1038/s41598-025-23942-8

Article
Open access
Published: 17 November 2025

Asha Rani¹ &
Rosepreet Kaur Bhogal¹

Scientific Reports volume 15, Article number: 40086 (2025)

Abstract

The field of image denoising has undergone significant advancements over the years. Recently, Convolutional Neural Networks (CNN) based denoising methods have shown remarkable performance in image denoising. Most of these adopt single-scale features, which may have limitations in denoising real-world images. Real-world noise is complex and non-Gaussian in nature. The multi-scale strategy of the Gaussian pyramid (GP) facilitates the attenuation of noise while preserving image details. Additionally, this multiscale architecture inherently reduces the data’s dimensionality, resulting in decreased computational complexity. Over the past few decades, this method has been employed for image denoising; however, its application to real-world images remains computationally challenging. In this study, we implemented the GP method for denoising X-ray, MRI, non-medical images, and SIDD datasets. Furthermore, its denoising performance is compared with the wavelet transforms (Coiflet4, Haar, Daubechies, and Symlets). Quantitatively, GP achieves a significant improvement in PSNR, SSIM, and computational complexity compared to the wavelet method. PSNR of 36.8024 dB, SSIM of 0.9428, and computational complexity of 0.0046 s have been achieved, thereby offering an effective and practical solution for real-world image applications.

Introduction

Noise in images is a common problem. It is the unwanted signal that is generally introduced during the acquisition, transmission, and/or reconstruction of an image. Though the noise cannot be altogether eliminated, it can be reduced at acquisition time¹. Possible ensuing image processing tasks, such as video processing, image analysis, and tracking, are adversely affected; therefore, image denoising plays a crucial role in modern image processing systems.

Image denoising plays a pivotal role in many computer vision and image analysis tasks, including object recognition, medical imaging, remote sensing, and surveillance. The main objective of denoising is to enhance image quality by removing noise while preserving important structural details such as textures, edges, and contours². Noise can arise from various sources, including sensor limitations, transmission errors, or environmental interference during image acquisition. The choice of denoising technique depends on the specific application and the type of noise³. The key factors, such as edge preservation, artifact introduction, and computational efficiency, must be carefully considered. Overall, image denoising remains an active area of research, with ongoing efforts to improve performance and adaptability to various noise models⁴.

Real-world noise, which occurs in images captured under practical conditions using imaging devices such as smartphones, digital cameras, or medical scanners, is complex and unpredictable⁵. In the medical domain, modalities including X-ray, CT, and MRI are affected by noise that is signal-dependent, non-Gaussian, spatially variant, and more structured⁶. In the past decades, a large number of noise modelling methods have been proposed to remove Additive White Gaussian Noise (AWGN) and Mixture of Gaussian (MoG) noise⁷. Despite achieving competitive results, these are not adaptive enough to denoise real-world images⁶. Some existing methods, such as mean filtering, median filtering, and wavelet thresholding, offer simple and computationally efficient solutions⁸. However, these compromise image details⁹. More advanced methods like non-local means¹⁰, Block-Matching and 3D filtering (BM3D)¹¹, and dictionary learning¹² provide improved performance by exploiting spatial and statistical redundancies, but may still face challenges with multilevel real-world noise.

Advancements in deep learning led to more sophisticated approaches¹³. Still, one notable constraint in many existing methods is that they generally focus on deeper and larger Convolutional Neural Networks (CNNs)¹⁴ where a large number of network parameters are to be learned to represent the noise features. In that case, a trade-off between computational complexity (CC) and denoising quality must be established.

Real-world noise often appears at multiple scales, and effectively addressing it across varying scales remains a significant challenge. Multi-scale approaches have shown competitive performance compared to state-of-the-art denoisers¹⁵,¹⁶. Each layer captures features or noise components at different scales, which can then be isolated and removed from coarse to fine levels. The Gaussian Pyramid (GP) framework has shown great potential in addressing real-world noise¹⁷. This framework employs a three-stage process, including noise estimation, denoising, and feature fusion, to effectively handle real-world noise¹⁸. This decomposition process facilitates noise attenuation at coarser levels while preserving fine details at higher resolutions with improved CC¹⁹. This denoising method exhibits significant advantages in terms of accuracy, efficiency, and information preservation compared to other multiscale techniques, such as wavelet transforms.

Given the wide range of denoising techniques available in the literature, this study delivers a rigorous comparative evaluation of the GP-based denoising method against wavelet variants, namely Coiflet4 (coif4), Daubechies (db4), Haar, and Symlets (sym4), across diverse X-ray, MRI, non-medical, and SIDD datasets. It provides a fine-grained analysis of filter selection and justifies the 5-layer architecture, while quantifying performance using PSNR, SSIM, MSE, RMSE, VIF, FOM, MAE, computational efficiency, and standard deviation (SD), along with statistical tests, including the paired t-test and the Wilcoxon signed-rank test. The findings offer practical guidance for real-world deployment, highlighting scenarios where GP achieves an optimal balance between structural fidelity and processing cost. Most existing image denoising studies are evaluated on the SIDD and other natural image benchmarks. In addition to these datasets, our work also focuses on medical image datasets, where preserving fine details is critical for accurate diagnosis.

Objective of the paper

The objectives of this research are to:

Implement a GP-based approach for denoising real-world images.
Evaluate the performance of this approach against wavelet transforms (coif4, db4, Haar, sym4).
Demonstrate the potential of GP in both preserving structural details and reducing noise.

Literature survey

In recent years, a large amount of work has been proposed on image denoising. Fixed noise removal from images has been well-studied; however, limited work has been done on real-world image denoising. Real-world noise is variant and random in nature, which may not be identified efficiently by using a single noise level. Therefore, image denoising remains a significant challenge in real-world images.

Noises in real-world images

Images taken in real-world situations often encounter various types of noise, including Gaussian, salt-and-pepper, quantization, and Poisson noise. Each type affects the image differently²⁰. Figure 1 illustrates that salt-and-pepper noise has a notable effect and occurs randomly, likely due to sensor issues or transmission errors. Gaussian noise also has a significant impact, creating a statistical distribution due to sensor thermal noise and electronic circuit fluctuations. Poisson noise and quantization noise tend to have less impact and originate during image generation and digitization. Figure 2 illustrates real-world image noise types based on their signal dependency levels, which stem from physical processes. Photon shot noise is signal-dependent, whereas dark current and read-out noise are signal-independent. Fixed pattern noise arises across the sensor array because of non-uniformity in pixel response. This variability of noise profiles challenges the assumption of fixed noise in denoising models and emphasizes the necessity for denoising methods that account for signal dependency across the entire dynamic range.

Image denoising methods

Image denoising is a fundamental process in computer vision for enhancing image quality by reducing noise. In general, mathematically, the image denoising problem can be modelled as

$$y = x + n$$

(1)

where y is the noisy image, x is the unknown clean image, and n represents the noise. This noise can be estimated by various methods. The purpose of the denoising method is to decrease the noise without compromising the important details of the image. These can be categorized as classical and advanced methods.

Classical image denoising methods

Various methods have been proposed for image denoising, such as spatial, transform, and statistical domain filtering. Most of these methods are based on linear or statistical models, which may limit their ability to capture the complexities of real-world images. Table 1 presents the image denoising methods based on their domain, noise handling capability, edge preservation, and CC. While mean and median filters²¹ have good computational efficiency, they offer limited performance in retaining important image details. Transform domain filters, such as the Discrete Cosine Transform (DCT)²² and Wavelet Transform (WT)²³, provide enhanced performance for high-quality and multi-level noise while maintaining moderate edge preservation. Statistical methods, including the Wiener filter²² and Non-Local Means (NLM)²⁴, demonstrate strong adaptability to various noise types while maintaining excellent edge retention, albeit at the cost of increased computational complexity. Trade-offs among noise reduction, edge preservation, and computational demands must therefore be considered.

Table 1 Classical image denoising methods.

Advanced image denoising techniques

With the enhancement in deep learning (DL), there has been a shift toward data-driven image denoising methods, such as CNN and other machine learning-based methods. These denoising approaches employ cutting-edge computational models to reduce noise and preserve key structural details. Multiple advanced image denoising methods include model-based methods, DL approaches, variational and generative models, hybrid and multiscale strategies.

Model-based approach involves optimization concepts (such as total variation method and sparse coding) using noise priors²⁶. They are versatile and interpretable, but require human intervention, which limits their scalability. They often generate noise artifacts, impeding their generalization to real-world scenarios.
Learning-based denoising approaches leverage CNNs to model noise distribution. They are effective; however, they require large-scale datasets and often lack interpretability. While these approaches are good at preserving image details, they tend to affect interpretability. So, it becomes challenging to debug or optimize them for better performance in specific applications²⁷. Technologii et al. proposed a denoiser using CNNs, which has shown notable enhancements over classical techniques, albeit at the cost of CC²⁸.
Generative models such as Generative Adversarial Networks (GANs)²⁹ and variational autoencoders (VAEs)³⁰ generate clean images by learning the image distribution.
Multiscale Techniques: Multiscale representations have emerged as key in the image denoising task because of their ability to capture image structures at different resolutions. While WT has been extensively used for this purpose, GP provides a simpler yet powerful alternative, which involves iterative low-pass filtering and down-sampling, allowing efficient noise suppression across scales while preserving structural information²⁷.

Literature review

Over the course of several decades, extensive research has been dedicated to developing robust and well-structured techniques for image denoising. Many of these targeted fixed noise patterns, typically modeled as Gaussian noise³¹. However, these methods are not flexible and adaptive enough to address the complex and spatially varying characteristics of real-world noise². To address this, Xu et al.³² proposed a prior learning approach that requires human intervention, whereas Majed et al.¹¹ proposed a blind denoising technique to remove fixed Gaussian noise. Wavelet-based methods depend on the proper selection of a threshold and assumptions about noise for effective denoising³³. Some studies have proposed enhanced threshold functions to address denoising, leading to an increase in computational cost³⁴. Several model-driven methods have been explored. Buades et al. proposed a non-local means filter, which leverages the self-similarity of features in the image for denoising²⁵. Xu et al. further introduced a patch grouping-based algorithm to reduce redundancy between similar patches²⁵,³⁵. However, it shows limitations for spatially variant noise, which often occurs in real-world images. Xiao et al. extended NLM to a multiscale framework for denoising³⁶. Panigrahi et al. proposed a multiscale NLM approach that combines curvelet domain processing with NLM filtering to minimize artifacts, achieving a PSNR of 30.526 dB and an SSIM of 0.896 at a noise density of 30³⁷. With the advent of deep learning, CNN-based approaches have gained prominence in image denoising due to their ability to learn complex noise patterns and achieve high PSNR³⁸. Model-based²⁶ and learning-based techniques using a pattern learning approach, such as DnCNN³⁸, Noise2Void³⁹, RIDNet⁴⁰, and autoencoders⁴¹, have shown significant efficacy. Another neural network-based approach employed a dual-attention mechanism, achieving promising results⁴².
Hybrid methods combining GP, CNN, and DNN have been developed⁴³,⁴⁴,⁴⁵,⁴⁶,⁴⁷, which offer enhanced performance for image denoising at the cost of higher computational complexity. To further improve denoising, Zhang et al.⁴⁸ proposed a framework integrating GP decomposition with a conditional GAN. Additionally, multilevel frameworks were developed by Lam et al.⁴⁹, Ma et al.⁵⁰, Zhong et al.⁵¹, and others¹¹,⁵². Asem Khmag also proposed a fast and accurate denoising method that integrates pulse-coupled neural networks, wavelet filtering, and regularization of the Perona–Malik equation, achieving improvements in PSNR and SSIM of 0.85–1.54 dB and 0.0132–0.1521, respectively⁵³. All these methods require aggressive training, which may limit their computational efficiency.

Table 2 lists the image denoising methods, along with their corresponding PSNR values and references. Figure 3 illustrates the performance of all these image denoising methods. This bar chart displays the denoising results of all these methods achieved on both synthetic and real-world image datasets. As shown in Fig. 3 and Table 2, GP presents a promising approach for image denoising, with reported PSNR values of 48.48 dB⁵⁴ and 39.77 dB⁵⁰. A hybrid method combining residual learning image denoising (RLID), direct image denoising (DID), GAN, and CNN achieved the highest PSNR of 59.33 dB¹⁹, whereas classical and standalone methods achieved 31 and 32 dB²²,¹³. In another GAN-based denoising method integrated with semi-soft thresholding, an improvement of 2.24 dB in PSNR was achieved relative to state-of-the-art studies on the BSD68 and Waterloo Exploration datasets⁵⁵.

Table 2 Reported PSNR values of different image denoising methods from existing studies with references.

To understand the prevailing methodological trends in multiscale image denoising, a comprehensive review of the publications is conducted. Figure 4 presents the frequency of studies using multi-scale techniques, including CNN, GP, Deep Neural Networks (DNN), and GAN, employed in the literature on image denoising. This distribution suggests a strong research interest in CNNs and GPs due to their effectiveness in capturing spatial hierarchies and reducing noise across different scales.

To assess the dissemination pattern of research on the GP method, a source-wise analysis of published papers was done. Figure 5 (bar chart) presents the source-wise distribution of GP-based research publications across major publication sources. Out of 28 papers, 13 are conference-based and 17 are journal-based. Table 3 presents literature spanning from 2013 to 2024, covering a diverse set of denoising techniques, including CNNs, DNNs, GANs, GP, and hybrid methods. GP-based approaches have been consistently used since 2018, with a notable increase in 2024, in addition to hybrid methods that combine CNN and autoencoders. Recent trends show an increasing interest in multi-model and hierarchical strategies for enhancing denoising performance.

Table 3 Literature span of Image denoising techniques.

While GP has previously been employed in denoisers, these works have predominantly focused on synthetic Gaussian noise or domain-specific applications. They often lack the adaptivity required for diverse and complex real-world noise and for maintaining image structures, necessitating integration with advanced techniques such as neural networks to overcome these limitations⁴⁸. Some assume a fixed Gaussian noise model. Zhang & Lam⁴⁹ and Zhao et al.⁵⁴ used a multi-level structure and reported good results. However, these works focus on synthetic datasets, such as Kodak, BSD65, and SIDD, with simulated noise, which limits their generalisability to practical scenarios. Chihaoui et al. proposed a Mask, Inpaint, and Denoise (MID) framework using unsupervised learning. The masking and inpainting process involves multiple rounds of training and iterations, which is likely to increase the computational complexity⁵⁸.

GP methods, though computationally efficient for multiscale representation, often induce blurring and loss of fine structures. This may limit their effectiveness in detail-sensitive tasks¹⁹. They are often inadequate for variable and complex noise arising from sensor imperfections⁶¹ and often face challenges under non-uniform noise distributions, as in underwater imaging⁶². Hence, hybrid approaches integrating DL and GP are recommended for achieving robust image denoising.

Existing literature mainly favors wavelet or deep learning methods, leaving a gap in the systematic evaluation of GP-based techniques. Moreover, current studies using this multiscale approach often focus on specific modalities. Additionally, comparative research on GP-based denoising across multiple imaging modalities, including MRI, X-ray, Non-medical images, and the SIDD datasets, remains limited, as most studies concentrate on a single modality. This study aims to address this gap by implementing a GP-based denoising framework and benchmarking its performance against the wavelet method using real-world image datasets.

Methodology

To assess the effectiveness of multiscale denoising strategies, this work implements a GP method and compares its performance with WT techniques, specifically Haar, Daubechies, Coiflet-4, and Symlets transforms. While both methods aim to enhance image quality by isolating noise components, they employ distinct mathematical models for image decomposition. Noise is commonly characterized by a mean value of zero and a variance σ², and it can be mathematically represented as:

$$I_{n} \left( {x,y} \right) = I\left( {x,y} \right) + N\left( {x,y} \right)$$

(2)

where I(x,y) is the clean image and N(x,y) represents the noise, which may follow a Gaussian or real-world distribution depending upon the imaging conditions. I_n(x,y) is the noisy version of the image⁸, and (x,y) are pixel coordinates in the image. The aim of denoising is to estimate the original clean image I from the noisy observation I_n. Denoising methods can be parametric or non-parametric. Classical denoising methods are fast, interpretable, and effective for simple or known noise, while struggling with complex or real-world noise. Learning-based methods achieve state-of-the-art performance and can generalize to complex noise patterns; however, they are limited in computational efficiency and interpretability. The motivation behind this study is to evaluate the effectiveness of multi-resolution strategies rooted in spatial-domain smoothing versus frequency-domain decomposition for noise suppression. The detailed explanations of these multiscale denoising approaches are as follows.
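
As a brief illustration of the additive model in Eq. (2), the sketch below generates a noisy image whose noise variance grows with intensity, a simple stand-in for the signal-dependent noise discussed earlier. The Gaussian approximation and the `gain` and `read_sigma` values are assumptions chosen for illustration; they are not taken from this study.

```python
import numpy as np

def add_signal_dependent_noise(clean, gain=0.01, read_sigma=0.02, seed=0):
    """Return (I_n, N) with I_n = I + N and Var[N] = gain * I + read_sigma**2.

    `gain` and `read_sigma` are hypothetical parameters for illustration only.
    """
    rng = np.random.default_rng(seed)
    clean = np.asarray(clean, dtype=np.float64)
    sigma = np.sqrt(gain * clean + read_sigma ** 2)    # per-pixel noise std
    noise = rng.standard_normal(clean.shape) * sigma   # zero-mean N(x, y)
    return clean + noise, noise

# Horizontal intensity ramp in [0, 1] used as a toy clean image I(x, y).
clean = np.tile(np.linspace(0.0, 1.0, 64), (64, 1))
noisy, noise = add_signal_dependent_noise(clean)
```

Brighter columns of `noisy` carry stronger noise than darker ones, which is the behaviour a fixed-variance Gaussian model cannot represent.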

Gaussian pyramid

A GP is a hierarchical and multi-resolution representation of an image. The process involves repeatedly applying Gaussian smoothing to generate images at progressively smaller scales. Hence, it is valuable for real-world image denoising, where multiple levels help in separating noise from image content.

Gaussian pyramid construction

A GP consists of progressively down-sampled versions of an image, where each level is obtained by convolving the previous level with a Gaussian kernel Gσ(x,y) to reduce high-frequency components and suppress noise, and then down-sampling the result⁶³. The kernel is defined by:

$$G_{\sigma } \left( {x,y} \right) = \frac{1}{{2\pi \sigma^{2} }}e^{{ - \frac{{x^{2} + y^{2} }}{{2\sigma^{2} }}}}$$

(3)

where σ is the standard deviation of the Gaussian kernel, which controls the level of blurring, and (x,y) are the spatial coordinates of the kernel centered around zero. These define the position of a pixel relative to the center of the Gaussian filter kernel. Figure 6 illustrates that after blurring, the image is downsampled to create its reduced-resolution version. The smoothing and down-sampling steps are repeated iteratively until the desired number of pyramid levels is achieved.
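
The kernel in Eq. (3) can be sampled on a discrete grid as in the following minimal sketch. The function name `gaussian_kernel`, the odd `size` requirement, and the final normalization to unit sum are assumptions made here so that filtering preserves mean brightness.

```python
import numpy as np

def gaussian_kernel(size=5, sigma=1.0):
    """Sample the 2D Gaussian of Eq. (3) on a size x size grid centred at zero.

    `size` is assumed to be odd so that the kernel has a single centre pixel.
    """
    half = size // 2
    ax = np.arange(-half, half + 1, dtype=np.float64)
    xx, yy = np.meshgrid(ax, ax)
    kernel = np.exp(-(xx ** 2 + yy ** 2) / (2.0 * sigma ** 2))
    kernel /= 2.0 * np.pi * sigma ** 2
    # Renormalise: the sampled, truncated kernel does not sum exactly to one.
    return kernel / kernel.sum()

kernel = gaussian_kernel(size=5, sigma=0.7)
```

A larger `sigma` spreads the weights away from the centre and therefore blurs more strongly.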

Mathematically, the downsampled image at level l + 1 is computed from the image at level l as:

$$I_{l + 1} \left( {x,y} \right) = \mathop \sum \limits_{i = - k}^{k} \mathop \sum \limits_{j = - k}^{k} G\left( {i,j} \right)I_{l} \left( {2x + i,2y + j} \right)$$

(4)

where I_l(x,y) is the image at level l, G(i,j) is the Gaussian filter kernel, and (x,y) are pixel coordinates in the image.
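
The reduction step of Eq. (4) can be sketched with NumPy alone, as below. The helper names, the reflect padding at the borders, and the default kernel parameters are assumptions for illustration; a library routine such as OpenCV's pyramid functions may differ in kernel and border handling.

```python
import numpy as np

def gaussian_kernel(size=5, sigma=1.0):
    half = size // 2
    ax = np.arange(-half, half + 1, dtype=np.float64)
    xx, yy = np.meshgrid(ax, ax)
    kernel = np.exp(-(xx ** 2 + yy ** 2) / (2.0 * sigma ** 2))
    return kernel / kernel.sum()

def smooth(image, kernel):
    """Filter `image` with `kernel`, using reflect padding (an assumption)."""
    k = kernel.shape[0] // 2
    h, w = image.shape
    padded = np.pad(image, k, mode="reflect")
    out = np.zeros((h, w), dtype=np.float64)
    for i in range(kernel.shape[0]):
        for j in range(kernel.shape[1]):
            out += kernel[i, j] * padded[i:i + h, j:j + w]
    return out

def build_gaussian_pyramid(image, levels=5, size=5, sigma=1.0):
    """Level 0 is the input; each further level is blurred, then halved."""
    kernel = gaussian_kernel(size, sigma)
    pyramid = [np.asarray(image, dtype=np.float64)]
    for _ in range(levels - 1):
        blurred = smooth(pyramid[-1], kernel)
        pyramid.append(blurred[::2, ::2])  # keep samples at (2x, 2y)
    return pyramid

image = np.random.default_rng(0).random((64, 64))
pyramid = build_gaussian_pyramid(image, levels=5)
shapes = [level.shape for level in pyramid]
```

Each level holds roughly a quarter of the pixels of the level before it, which is where the reduction in data volume comes from.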

Multiscale denoising process

At each level of the pyramid, a denoising technique is applied to reduce noise. At the finest level (Level 0), a bilateral filter with a neighborhood parameter of 5 and an intensity parameter of 7 is used to attenuate high-frequency noise while maintaining edge sharpness and fine texture details. This level corresponds to the original resolution of the image, ensuring that high-frequency details such as edges and boundaries remain intact during denoising. Intermediate levels (Levels 1 and 2) are processed using median filtering (MF) with a kernel size of 3 × 3, which is effective in mitigating localized and impulsive noise. Downsampling in the GP compresses the image, causing scattered noise to cluster and become more detectable. Applying MF at lower resolutions avoids blurring delicate textures and edges in the original image. At the coarsest levels (Levels 3 and 4), Gaussian filtering (GF) with a standard deviation of 0.7 and a kernel size of 5 × 5 is applied. These levels represent low-frequency and spatially homogeneous components, and GF here smooths broad intensity variations without affecting finer details, thereby contributing to a smoother and more coherent reconstruction. Five levels are used in the GP because this depth captures a broad range of noise, from fine textures (top layers) to coarse structures (bottom layers), enabling effective noise separation across scales. Fewer levels may not capture the full range of noise scales present in real-world images, while deeper pyramids (with more than five levels) increase memory usage and processing time.
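
To illustrate why median filtering suits impulsive noise at the intermediate levels, the following sketch implements a 3 × 3 median filter with NumPy and applies it to a toy pyramid level. The edge-replicating border and the test pattern are assumptions for illustration; this is not the study's implementation.

```python
import numpy as np

def median_filter_3x3(image):
    """3 x 3 median filter with edge-replicated borders (an assumption)."""
    image = np.asarray(image, dtype=np.float64)
    h, w = image.shape
    padded = np.pad(image, 1, mode="edge")
    # Stack the nine shifted views of each neighbourhood, then take the
    # per-pixel median across them.
    windows = np.stack(
        [padded[i:i + h, j:j + w] for i in range(3) for j in range(3)]
    )
    return np.median(windows, axis=0)

level = np.full((8, 8), 0.5)
level[3, 4] = 1.0  # isolated "salt" pixel
level[5, 2] = 0.0  # isolated "pepper" pixel
filtered = median_filter_3x3(level)
```

Because each outlier is a minority within its 3 × 3 neighbourhood, the median replaces it with the surrounding value instead of smearing it, as a mean filter would.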

Image reconstruction

Following the application of denoising filters at each level of the GP, the reconstruction process commences from the coarsest level and progresses to the finest level. Once denoising is completed at each level, the denoised image is upsampled and combined with the denoised image from the next finer level, preserving the fine details. Using multiple levels ensures that fine details are maintained while reducing unwanted noise.

The mathematical expression for up-sampling and interpolation is:

$$I_{l + 1} \left( {x,y} \right) = \mathop \sum \limits_{i = - k}^{k} \mathop \sum \limits_{j = - k}^{k} G\left( {i,j} \right)I_{l} \left( {\frac{x}{2} + i,\frac{y}{2} + j} \right)$$

(5)

where missing pixels are estimated using interpolation techniques.
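
A coarse-to-fine collapse of the denoised levels might look like the sketch below. The text does not state how adjacent levels are combined, so the weighted average controlled by `weight` is a hypothetical choice, and nearest-neighbour upsampling is used as a simplification of the Gaussian interpolation in Eq. (5).

```python
import numpy as np

def upsample2(image, shape):
    """Nearest-neighbour upsampling by a factor of two, cropped to `shape`."""
    up = np.repeat(np.repeat(image, 2, axis=0), 2, axis=1)
    return up[:shape[0], :shape[1]]

def reconstruct(denoised_levels, weight=0.5):
    """Collapse denoised pyramid levels from the coarsest to the finest.

    `weight` is a hypothetical blending factor, not taken from the paper.
    """
    result = denoised_levels[-1]
    for finer in reversed(denoised_levels[:-1]):
        coarse_up = upsample2(result, finer.shape)
        result = weight * finer + (1.0 - weight) * coarse_up
    return result

levels = [np.full((8, 8), 2.0), np.full((4, 4), 2.0), np.full((2, 2), 2.0)]
restored = reconstruct(levels)
```

A `weight` closer to 1 favours detail from the finer level, while a value closer to 0 favours the smoother coarse estimate.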

Wavelet transform

Wavelet methods employ orthogonal basis functions to decompose the image into approximation and detail coefficients. The core principle involves the decomposition of an image into multiple sub-bands using discrete WT, thereby isolating different frequency components at different levels.

Figure 7 illustrates the denoising process by using the wavelet transform method, comprising all three primary stages. The noisy image is first decomposed into approximation and detail coefficients through multi-level wavelet decomposition. Subsequently, noise is suppressed by applying soft or hard thresholding to the detail coefficients. Finally, the image is reconstructed using the inverse WT.

Wavelet decomposition

The noisy image is decomposed into four sub-bands as follows:

$$W\left( {I_{n} } \right) = \left\{ {LL, LH, HL, HH} \right\}$$

(6)

where LL is the approximation. LH, HL, and HH are the vertical, horizontal, and diagonal details, respectively.

Wavelet functions can be represented as

$$W_{j,k} = \mathop \sum \limits_{x,y} I_{n} \left( {x,y} \right)\psi_{j,k} \left( {x,y} \right)$$

(7)

$$V_{j,k} = \mathop \sum \limits_{x,y} I_{n} \left( {x,y} \right)\Phi_{j,k} \left( {x,y} \right)$$

(8)

with ψ being the wavelet function and Φ being the scaling function.
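
For the Haar family, the decomposition of Eq. (6) reduces to sums and differences over 2 × 2 blocks, as in this minimal NumPy sketch. The function names are hypothetical, the input is assumed to have even height and width, and the assignment of the labels LH and HL to particular difference directions is a convention that varies between libraries.

```python
import numpy as np

def haar_dwt2(image):
    """Single-level orthonormal 2D Haar transform (even-sized input assumed)."""
    image = np.asarray(image, dtype=np.float64)
    a, b = image[0::2, 0::2], image[0::2, 1::2]
    c, d = image[1::2, 0::2], image[1::2, 1::2]
    ll = (a + b + c + d) / 2.0  # approximation
    lh = (a + b - c - d) / 2.0  # difference between the two rows of a block
    hl = (a - b + c - d) / 2.0  # difference between the two columns
    hh = (a - b - c + d) / 2.0  # diagonal difference
    return ll, lh, hl, hh

def haar_idwt2(ll, lh, hl, hh):
    """Inverse of `haar_dwt2`."""
    h, w = ll.shape
    image = np.empty((2 * h, 2 * w), dtype=np.float64)
    image[0::2, 0::2] = (ll + lh + hl + hh) / 2.0
    image[0::2, 1::2] = (ll + lh - hl - hh) / 2.0
    image[1::2, 0::2] = (ll - lh + hl - hh) / 2.0
    image[1::2, 1::2] = (ll - lh - hl + hh) / 2.0
    return image

image = np.random.default_rng(0).random((8, 8))
ll, lh, hl, hh = haar_dwt2(image)
restored = haar_idwt2(ll, lh, hl, hh)
```

Without thresholding, the inverse transform returns the input, so any change in the output of a wavelet denoiser comes from the modification of the detail coefficients.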

Thresholding

The noise is suppressed by applying soft or hard thresholding to the detail coefficients.

Hard thresholding

$$\hat{W}_{j,k} = \left\{ {\begin{array}{*{20}c} {W_{j,k} ,} & {\left| {W_{j,k} } \right| \ge T} \\ {0,} & {\left| {W_{j,k} } \right| < T} \\ \end{array} } \right.$$

(9)

A coefficient is kept unchanged if its magnitude is equal to or greater than the threshold T; otherwise, it is set to zero.

Soft thresholding

$$\hat{W}_{j,k} = sign\left( {W_{j,k} } \right).\max \left( {\left| {W_{j,k} } \right| - T,0} \right)$$

(10)

It shrinks the magnitude of large coefficients by T and discards small ones.
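
Both thresholding rules are one-liners in NumPy. The sketch below follows Eqs. (9) and (10); the function names and the sample coefficients are chosen here for illustration.

```python
import numpy as np

def hard_threshold(coeffs, t):
    """Keep coefficients with |W| >= t and set the rest to zero (Eq. 9)."""
    coeffs = np.asarray(coeffs, dtype=np.float64)
    return np.where(np.abs(coeffs) >= t, coeffs, 0.0)

def soft_threshold(coeffs, t):
    """Shrink every coefficient towards zero by t (Eq. 10)."""
    coeffs = np.asarray(coeffs, dtype=np.float64)
    return np.sign(coeffs) * np.maximum(np.abs(coeffs) - t, 0.0)

w = np.array([-3.0, -0.5, 0.0, 1.0, 2.5])
hard = hard_threshold(w, 1.0)  # [-3.0, 0.0, 0.0, 1.0, 2.5]
soft = soft_threshold(w, 1.0)  # [-2.0, 0.0, 0.0, 0.0, 1.5]
```

Hard thresholding preserves the amplitude of surviving coefficients but is discontinuous at T, whereas soft thresholding is continuous at the price of biasing large coefficients towards zero.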

Reconstruction

Finally, the image is reconstructed by applying the inverse Wavelet Transform (IWT) to the thresholded coefficients

$$\hat{I}\left( {x,y} \right) = IWT\left( {\left\{ {LL, \widehat{LH}, \widehat{HL}, \widehat{HH}} \right\}} \right)$$

(11)

In this study, we employed orthogonal wavelet families, Haar, Daubechies, Coiflet-4, and Symlets to analyze the denoising performance at various scales.

Previous work has demonstrated that GP is effective in applications such as texture analysis, image compression, and multi-scale denoising. However, its application in real-world image denoising remains relatively underexplored, particularly in the presence of complex noise patterns found in real images.

Evaluation metrics

The quality of an image is determined by both objective and subjective evaluation. For subjective evaluation, the image has to be observed by a human expert. The human visual system is highly complicated; therefore, objective evaluation is preferred to measure the image quality. Various metrics are available for the objective evaluation of an image denoising method. Some of these are mean squared error (MSE), peak signal-to-noise ratio (PSNR), and structural similarity index measure (SSIM). CC also plays an important role in evaluating the system’s performance.

Mean squared error (MSE)

Mean square error is the average squared difference between the estimated values and the actual values. It is given by

$$MSE = \frac{1}{mn}\mathop \sum \limits_{i = 0}^{m - 1} \mathop \sum \limits_{j = 0}^{n - 1} \left[ {I\left( {i,j} \right) - I_{n} \left( {i,j} \right)} \right]^{2}$$

(12)

where I(i,j) is the noise-free image and I_n(i,j) is the noisy image of size m x n.

Root mean squared error (RMSE)

RMSE is a measure of the average error between the estimated and actual values. It is directly related to MSE.

$$RMSE = \sqrt {MSE}$$

(13)

Mean absolute error (MAE)

MAE evaluates the pixel-wise average absolute difference between the reference and denoised image and is less sensitive to large outliers.

$$MAE = \frac{1}{MN}\mathop \sum \limits_{i = 1}^{M} \mathop \sum \limits_{j = 1}^{N} \left| {I\left( {i,j} \right) - K\left( {i,j} \right)} \right|$$

(14)

where I(i,j) is the pixel value of the reference image and K(i,j) is the pixel value of the denoised image, both of size M × N.

Peak signal-to-noise ratio (PSNR)

PSNR is the ratio between the maximum possible power of a signal and the power of corrupting noise that affects the fidelity of its representation⁶⁴. Because many signals have a very wide dynamic range, PSNR is generally represented as a logarithmic quantity using the decibel scale. This is linked to the commonly known mean squared error as follows:

$$PSNR = 10\;log_{10} \left( {\frac{{MAX_{i}^{2} }}{MSE}} \right)$$

(15)

where MAX_i is the maximum possible pixel value of the image. When the pixels are represented using 8 bits per sample, the maximum value is 255. So, PSNR can also be represented as

$$PSNR = 10\;log_{10} \left( {\frac{{255^{2} }}{MSE}} \right)$$

(16)
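
The pixel-wise metrics of Eqs. (12) to (16) can be computed directly with NumPy, as sketched below. The function names are chosen here, and `max_value=255.0` assumes 8-bit images.

```python
import numpy as np

def mse(ref, test):
    ref = np.asarray(ref, dtype=np.float64)
    test = np.asarray(test, dtype=np.float64)
    return float(np.mean((ref - test) ** 2))

def rmse(ref, test):
    return float(np.sqrt(mse(ref, test)))

def mae(ref, test):
    ref = np.asarray(ref, dtype=np.float64)
    test = np.asarray(test, dtype=np.float64)
    return float(np.mean(np.abs(ref - test)))

def psnr(ref, test, max_value=255.0):
    """PSNR in dB; returns infinity when the two images are identical."""
    err = mse(ref, test)
    if err == 0.0:
        return float("inf")
    return float(10.0 * np.log10(max_value ** 2 / err))

ref = np.zeros((4, 4))
test = np.full((4, 4), 5.0)  # constant error of 5 grey levels
scores = (mse(ref, test), rmse(ref, test), mae(ref, test), psnr(ref, test))
```

For this toy pair the MSE is 25, the RMSE and MAE are both 5, and the PSNR is 10·log10(255²/25), about 34.15 dB.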

Structural similarity index measure (SSIM)

SSIM is a method for predicting the perceived quality of images⁶⁴. SSIM is used for measuring the similarity between two images. It calculates changes in the luminance, contrast, and structure difference between them. The difference with other techniques, such as MSE or PSNR, is that these approaches estimate absolute errors. Structural information refers to the concept that pixels exhibit strong interdependencies, particularly when they are spatially close. These dependencies carry important information about the structure of the objects in the visual scene. The SSIM index is calculated on various windows of an image⁸.

The measure between two windows x and y of common size N × N is represented as follows

$$SSIM\left( {x,y} \right) = \frac{{\left( {2\mu_{x} \mu_{y} + c_{1} } \right)\left( {2\sigma_{xy} + c_{2} } \right)}}{{\left( {\mu_{x}^{2} + \mu_{y}^{2} + c_{1} } \right)\left( {\sigma_{x}^{2} + \sigma_{y}^{2} + c_{2} } \right)}}$$

(17)

where σ_x² and σ_y² are the variances of x and y, respectively, and σ_xy is the covariance of x and y. μ_x and μ_y are the averages of x and y. c₁ = (k₁L)² and c₂ = (k₂L)² are two variables to stabilise the division with a weak denominator. L is the dynamic range of the pixel values, and k₁ and k₂ are 0.01 and 0.03, respectively.
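
The single-window SSIM can be sketched as follows, using the standard form in which the variances are summed in the denominator. This is a simplification: practical implementations typically slide a weighted window over the image and average the local scores, and the function name and `dynamic_range` default are assumptions.

```python
import numpy as np

def ssim_window(x, y, dynamic_range=255.0, k1=0.01, k2=0.03):
    """SSIM between two equally sized windows, computed over all pixels."""
    x = np.asarray(x, dtype=np.float64)
    y = np.asarray(y, dtype=np.float64)
    c1 = (k1 * dynamic_range) ** 2
    c2 = (k2 * dynamic_range) ** 2
    mu_x, mu_y = x.mean(), y.mean()
    var_x, var_y = x.var(), y.var()              # population variances
    cov_xy = np.mean((x - mu_x) * (y - mu_y))    # population covariance
    numerator = (2.0 * mu_x * mu_y + c1) * (2.0 * cov_xy + c2)
    denominator = (mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2)
    return float(numerator / denominator)

x = np.arange(64, dtype=np.float64).reshape(8, 8) * 4.0
same = ssim_window(x, x)             # identical windows
inverted = ssim_window(x, 255.0 - x) # contrast-inverted window
```

Identical windows score 1, while the contrast-inverted window scores far lower because its covariance with the original is negative.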

Visual information fidelity (VIF)

VIF is a full-reference quality metric that evaluates how much of the visual information in the original image is retained in a noisy or processed image. It is based on mutual information theory and varies from 0 (no useful information) to 1 (perfect fidelity). It correlates better with visual perception than PSNR.

Figure of merit (FOM)

FOM is often used to evaluate how well the edges are retained after processing the image. It varies from 0 (poor edge match) to 1 (perfect edge preservation).

Computational complexity (CC)

CC refers to the time and memory resources required to process an image, remove noise, and preserve important details. It is a key factor, especially for real-world applications. It affects runtime speed, memory usage, and energy consumption. Different denoising algorithms vary in complexity based on their mathematical operations, the image size, and the type of noise. GP and WT are both multi-resolution approaches for image denoising, but they differ in CC, efficiency, and effectiveness.
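
One hedged way to obtain a per-image processing time is to average wall-clock time over repeated runs. The helper below is only a sketch: `mean_runtime` and its `denoise` argument are hypothetical names, and the timing protocol actually used in this study is not described in the text.

```python
import time


def mean_runtime(denoise, images, repeats=5):
    """Average wall-clock seconds per image over several repeated passes."""
    start = time.perf_counter()
    for _ in range(repeats):
        for img in images:
            denoise(img)  # result is discarded; only the elapsed time matters
    elapsed = time.perf_counter() - start
    return elapsed / (repeats * len(images))
```

Repeating the pass reduces the influence of one-off effects such as cache warm-up on the reported figure.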

Standard deviation (SD)

Standard deviation is the statistical measure that indicates the dispersion of a dataset from its mean value. It provides insight into the algorithm stability and residual error characteristics. Performance metrics such as PSNR or SSIM are reported as mean with SD values. SD indicates the consistency across the dataset, and the mean value represents the average effectiveness.

Mathematically, it is represented as

$$\sigma = \sqrt {\frac{1}{N}\mathop \sum \limits_{i = 1}^{N} (x_{i} - \mu )^{2} }$$

(18)

where σ is the standard deviation, x_i is each individual parametric value, N is the number of samples, and µ is the mean value.
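
As a small illustration of reporting a metric as mean with SD, the sketch below uses the population form of the standard deviation, with the 1/N factor under the square root. The input values are made up for the example and are not results from this study.

```python
import statistics


def mean_and_sd(values):
    """Return (mean, population standard deviation) of a list of scores."""
    return statistics.mean(values), statistics.pstdev(values)


# Hypothetical per-image scores, used only to exercise the function.
example_scores = [2, 4, 4, 4, 5, 5, 7, 9]
mu, sigma = mean_and_sd(example_scores)
```

`statistics.stdev` would instead apply the N − 1 (sample) normalisation; which one is appropriate depends on whether the images are treated as the whole population or as a sample.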

Paired t-test

A paired t-test is performed to evaluate whether the difference between two denoising methods on the same set of images is statistically significant. Paired t-tests are performed on the PSNR values obtained from the two methods as follows:

$$t = \frac{{\bar{d}}}{{\frac{s}{\sqrt n }}},\quad \bar{d} = \frac{1}{n}\mathop \sum \limits_{i = 1}^{n} \left( {x_{1,i} - x_{2,i} } \right)$$

(19)

where x_{1,i} and x_{2,i} are the values obtained by the two methods on the i-th image, d̄ is the mean of the paired differences, s is the standard deviation of these differences, and n is the sample size, which is 10 in this study. A high t-value and a low p-value imply that method A significantly outperforms method B.
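
The paired t statistic can be sketched as the mean of the per-image differences divided by its standard error. The sketch assumes that s is the sample standard deviation of the differences (N − 1 normalisation), which is the usual convention but is not stated in the text; `paired_t` is a hypothetical name.

```python
import math
import statistics


def paired_t(scores_a, scores_b):
    """Paired t statistic for two methods evaluated on the same images."""
    diffs = [a - b for a, b in zip(scores_a, scores_b)]
    n = len(diffs)
    mean_diff = statistics.mean(diffs)
    sd_diff = statistics.stdev(diffs)  # sample SD of the paired differences
    return mean_diff / (sd_diff / math.sqrt(n))
```

The p-value is then read from a t distribution with n − 1 degrees of freedom, that is, 9 degrees of freedom for the 10 images used here.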

Experimental setup

To evaluate the performance of the GP and WT-based image denoising approaches, experiments were conducted on real-world noisy images. All methods were implemented under identical conditions. Quantitative and qualitative metrics were employed to analyze both noise reduction capability and detail preservation. The denoising process was implemented in-house using Python 3.12, leveraging the OpenCV library (for image processing), the NumPy library (for numerical operations), PyWavelets, and scikit-image. All experiments were conducted on an AMD Ryzen 7 processor with Radeon graphics (2900 MHz, 8 cores) and 16 GB RAM.

To validate the efficiency of GP and WT methods, datasets that capture the characteristics of real-world images are utilized. We have used medical images (MRI and X-Ray), non-medical images, and SIDD datasets.

MRI dataset

The Brain Tumor MRI Dataset, created by Masoud Nickparvar in 2021, is a collection of Magnetic Resonance Imaging (MRI) scans labeled for the presence or absence of brain tumors, mostly in JPEG/PNG format. It comprises 7023 human brain MRI images of varying sizes, aggregated from three different datasets, namely Figshare, SARTAJ, and Br35H. The images are further classified into 4 classes: glioma, meningioma, no tumor, and pituitary⁶⁵.

X-ray dataset

This dataset, created by Paul Mooney in 2018, comprises chest X-ray images (anterior–posterior) that have been selected from retrospective cohorts of pediatric patients. There are 5856 X-ray images (JPEG) in 2 categories (Pneumonia/Normal). Of these, 3883 images depict pneumonia (2538 bacterial and 1345 viral) and 1349 are normal; the test folder contains a further 234 normal images and 390 pneumonia images (242 bacterial and 148 viral) from 624 patients⁶⁶,⁶⁷. No synthetic noise has been added; noise levels depend on the X-ray machine, patient movement, and exposure conditions.

Non-medical images dataset

This dataset, created by Dibakar Sil in 2018, comprises natural images of 9 distinct categories. The images are corrupted by nine distinct types of noise: additive Gaussian noise, lognormal noise, uniform noise, exponential noise, Poisson noise, salt-and-pepper noise, Rayleigh noise, speckle noise, and Erlang noise⁶⁸.

Smartphone image denoising dataset (SIDD)

The SIDD dataset, created by Rajat Gupta in 2020, comprises 160 pairs of noisy/ground-truth images taken under different lighting conditions with five smartphone cameras: Google Pixel, iPhone 7, Samsung Galaxy S6 Edge, Motorola Nexus 6, and LG G4. The authors have provided a dataset of real noisy images with high-quality ground truth. The noise is complex and signal-dependent⁵⁶.

Results and discussion

This section analyzes the performance of the GP image denoising method and compares it with the wavelet methods Coif4, Haar, db4, and Sym4. Both GP and WT are multiresolution approaches. Performance is evaluated using eight standard metrics: PSNR, SSIM, MSE, RMSE, MAE, VIF, FOM, and CC. The performance of GP is compared with the findings of WT methods, BM3D, and learning-based approaches, including DnCNN, RIDNet, and Noise2Void. The datasets have images with varying spatial resolutions. Denoising is performed at the original resolution, and the outputs are resized back to their original dimensions. For a comprehensive evaluation, the results are presented and discussed dataset-wise.

X–ray dataset

For the X-ray dataset, GP exhibits superior performance compared to WT. Table 4 presents the quantitative results for image denoising using GP and WT methods. GP achieves a higher PSNR of 36.8023 dB, indicating minimal distortion and effective preservation of intensity levels, which is essential for retaining diagnostically important information in X-ray images. We used a five-layer GP to decompose the image at various resolutions. Bilateral, median, and Gaussian filters have been applied at different levels to suppress noise across multiple scales. The bilateral filter (BF) applied at the finest GP level is sensitive to edges and textures, removing high-frequency noise. Intermediate levels utilize median filters to remove salt-and-pepper noise without blurring anatomical structures such as bones and tissues, while Gaussian filters at higher levels eliminate low-frequency and spatially uniform background noise.
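
The five-level decomposition described above can be sketched in pure Python. This is an illustrative sketch rather than the authors' implementation: it assumes a 5-tap binomial kernel as the Gaussian approximation, border clamping, and a particular assignment of filters to levels (`FILTER_BY_LEVEL`). The text only states that the bilateral filter is used at the finest level, median filters at intermediate levels, and Gaussian filters at higher levels, so the exact level boundaries and all names here are hypothetical.

```python
# 5-tap binomial kernel, a common approximation of a Gaussian (assumption).
KERNEL = (1 / 16, 4 / 16, 6 / 16, 4 / 16, 1 / 16)

# Hypothetical mapping of pyramid level to filter type (level 0 is the finest).
FILTER_BY_LEVEL = {0: "bilateral", 1: "median", 2: "median",
                   3: "gaussian", 4: "gaussian"}


def blur_rows(img):
    """Convolve every row with KERNEL, clamping indices at the borders."""
    width = len(img[0])
    out = []
    for row in img:
        new_row = []
        for c in range(width):
            acc = 0.0
            for k, weight in enumerate(KERNEL):
                cc = min(max(c + k - 2, 0), width - 1)
                acc += weight * row[cc]
            new_row.append(acc)
        out.append(new_row)
    return out


def transpose(img):
    return [list(col) for col in zip(*img)]


def pyr_down(img):
    """Blur along rows and columns, then keep every second row and column."""
    blurred = transpose(blur_rows(transpose(blur_rows(img))))
    return [row[::2] for row in blurred[::2]]


def gaussian_pyramid(img, levels=5):
    """Return [level 0 (input), level 1, ...] with `levels` entries."""
    pyramid = [img]
    for _ in range(levels - 1):
        pyramid.append(pyr_down(pyramid[-1]))
    return pyramid
```

Each level holds roughly a quarter of the pixels of the level below it, which is where the reduction in data dimensionality comes from.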

Table 4 Quantitative results of X-ray image denoising using soft thresholding, in terms of PSNR, SSIM, MSE, RMSE, VIF, FOM, MAE, and processing time metrics.

Multilevel filtering therefore addresses each type of noise effectively. Additionally, the SSIM value of 0.9428 is high. SSIM assesses luminance, contrast, and structural correlations, making it sensitive to edge clarity and the spatial arrangement of pixels. This demonstrates the method’s ability to preserve details. Furthermore, the MSE value of 15.646 is lower than that of the wavelet methods, indicating minimal pixel-level errors. The reduction in MSE alongside enhanced perceptual quality substantiates the robustness of the GP method for medical image denoising. Its localized, level-specific approach also offers a balance between accuracy and speed. Wavelet methods Coif4, db4, Haar, and Sym4 show PSNR values of 25.7377, 25.6400, 26.616, and 26.109 dB, respectively. Wavelet methods tend to introduce artifacts that impair pixel accuracy. Although WT effectively addresses high and low-frequency components, it often struggles to preserve edges. This leads to blurring of anatomical boundaries, reducing SSIM to 0.788. Higher values of VIF (0.7717) and FOM (0.992) also demonstrate the effectiveness of GP in preserving perceptual quality. Wavelet-based denoising employs predefined basis functions such as Haar and Daubechies, which may not be optimal for real-world images. A higher MSE between the original and denoised images indicates the presence of artifacts. Unlike GP, wavelet methods do not utilize a scale-specific filter. The computation time of 0.0046 s for GP is also notable, demonstrating that this method has a higher processing speed than all four WT methods. Table 5 shows the t-test results and the Wilcoxon test, which further confirm that GP performs significantly better than WT methods, with a t-value exceeding 19, a p-value of less than 0.0001, and a W value of 0.
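
To make the meaning of the W value concrete, the sketch below computes the Wilcoxon signed-rank statistic as the smaller of the positive and negative rank sums. It assumes the common convention of discarding zero differences and giving tied absolute differences their average rank; the function name is hypothetical and the exact variant used in the study is not stated.

```python
def wilcoxon_w(scores_a, scores_b):
    """Wilcoxon signed-rank statistic W = min(W+, W-) for paired scores."""
    diffs = [a - b for a, b in zip(scores_a, scores_b) if a != b]
    order = sorted(range(len(diffs)), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * len(diffs)
    i = 0
    while i < len(order):
        # Find the run of tied absolute differences starting at position i.
        j = i
        while (j + 1 < len(order)
               and abs(diffs[order[j + 1]]) == abs(diffs[order[i]])):
            j += 1
        average_rank = (i + j) / 2 + 1  # ranks are 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    w_minus = sum(r for r, d in zip(ranks, diffs) if d < 0)
    return min(w_plus, w_minus)
```

W is zero exactly when every non-zero difference has the same sign, that is, when one method scores higher than the other on every image in the sample.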

Table 5 t-test and Wilcoxon test analysis for X-ray Dataset.

Figure 8 shows the variation of all parametric values across different X-ray images. These images, captured under different conditions (such as sensor noise, exposure, and varying radiation levels), contain complex, multi-level noise. Images with large uniform regions (like bone scans) tend to have relatively higher PSNR. PSNR also varies depending on the image content.

The lower SD value of 1.11 for WT indicates that these methods aggressively smooth the image intensities and also eliminate fine details. GP, with high PSNR and SSIM, preserves the structural details of the image, resulting in a higher SD.

Figure 9 shows a visual comparison of the original images and denoised images by the GP and WT methods, which further supports the superiority of the GP method, as Fig. 9b presents better visual performance relative to the WT methods.

MRI dataset

In the MRI dataset, the GP method again yielded higher PSNR and SSIM values, indicating its robustness and effectiveness in noise suppression. Table 6 shows that the GP method exhibits a PSNR of 35.2776 dB and an SSIM of 0.9601 for the MRI dataset. BF, MF, and GF have been applied at different levels to reduce noise at various scales. As discussed in the X-ray image analysis, the high PSNR is due to the multi-scale denoising performed in GP. In contrast, wavelet methods Coif4, db4, Haar, and Sym4 show PSNR values of 26.1197, 26.1154, 26.6431, and 26.1282 dB, respectively, indicating limited performance. The SSIM of 0.96013 indicates strong preservation of anatomical structures, and the higher values of VIF and FOM demonstrate the improved perceptual quality of GP compared to WT methods. Conversely, wavelet methods produced SSIM values ranging from 0.7763 to 0.7785, suggesting lower perceptual quality. With an MSE of 19.96, GP outperforms the others, while wavelet methods show higher MSE values, indicating the presence of noise artifacts. The low computational time of 0.00728 s provided by the GP denoising method confirms that it is highly efficient for denoising MRI images. Figure 10 shows the performance of denoising methods across 8 different parameters. Figure 10a presents the lower SD value of 1.29 for WT, indicating that these methods aggressively smooth the image intensities and also suppress the fine details. Figure 11 shows the visual comparison of original images and denoised images using GP and WT variants. Figure 11b also indicates that GP delivers better visual quality than the wavelet methods.

Table 6 Quantitative results of MRI image denoising using soft thresholding, in terms of PSNR, SSIM, MSE, RMSE, VIF, FOM, MAE, and processing time metrics.

Table 7 presents the paired t-test results and Wilcoxon test analysis, which also indicate that GP significantly outperforms WT-based methods, with a t-value exceeding 14, a p-value less than 0.0001, and W equals zero for 10 images. WT achieves lower SD by aggressively smoothing image intensities; however, it tends to over-smooth intensity variations and can also remove fine details and edges. On the other hand, GP retains multilevel structures. This demonstrates that GP performs more consistently than WT, which has an SD of 1.33 across the dataset.

Table 7 t-test and Wilcoxon test analysis for MRI Dataset.

The lower SSIM of X-ray images (0.9428) compared to MRI images (0.96013) using the GP method is because MRI images have higher soft tissue contrast and a broader grayscale dynamic range than X-ray images. Additionally, X-ray images, especially chest X-rays, contain many sharp anatomical edges (e.g., ribs and bone margins), which make them more prone to distortion or blurring during denoising. In contrast, MRI images generally have smooth anatomical transitions, making them less vulnerable to structural degradation and resulting in higher SSIM values. Noise suppression and signal retention result in higher PSNR in some images. Images with less contrast between anatomical structures tend to have lower PSNR. In smooth regions (e.g., white matter in MRI), even small noise can decrease PSNR, while in textured areas (e.g., grey matter), the same level of smoothing may be statistically beneficial but more visually or regionally significant. Therefore, PSNR may vary across images.

Figures 8a and 10a show that GP results in lower SD on MRI because these images contain large, homogeneous regions like soft tissues, and GP effectively removes random fluctuations in these areas, which reduces the residual intensity spread (SD). In contrast, X-rays have high-frequency structures (ribs and edges) and Poisson-like noise that GP intentionally preserves, leading to higher SD.

Non-medical images dataset

In this dataset, GP outperformed WT by achieving higher PSNR and SSIM scores, demonstrating its ability to reconstruct images accurately and reliably. Table 8 shows that the GP method achieves a significantly higher PSNR of 25.04 dB compared to WT methods, which have PSNR values ranging from 21.89 to 22.062 dB. This difference demonstrates the method’s ability to denoise images more effectively. The SSIM values of GP (0.61133) and WT (0.60138) are comparable, as SSIM is less sensitive to fine pixel-wise variation, while VIF and FOM highlight GP’s strength in retaining fine details. In medical images, where anatomical fidelity is responsive to gray-level variations, GP achieves a clear improvement in SSIM. PSNR shows a notable difference between the two methods because it is highly sensitive to pixel-wise errors. Lower PSNR and SSIM values are observed for non-medical images due to their complex textures and high-frequency details, which make denoising more challenging than for medical images, which tend to have more uniform structures with fewer textures.

Table 8 Quantitative results of Non-medical Image denoising using soft thresholding in terms of PSNR, SSIM, MSE, RMSE, VIF, FOM, MAE, and processing time.

Figure 12 exhibits the parameter values for different images. Although wavelet transforms (Haar, db4, and sym4) are well-suited for multi-resolution analysis, they appear less effective for real-world image denoising when compared to the GP method. Variations in PSNR values across different non-medical images occur because these images vary significantly in content, such as skies, walls, and water. Images with smooth regions tend to have higher PSNR, while fine textures and edges are more susceptible to noise suppression, resulting in lower PSNR values. Figure 12(a) shows that for non-medical images, the SD variation in GP and WT is small (1.28), as compared to medical images (2.13), because their complex and diverse intensity patterns overshadow the impact of variance reduction. Conversely, in medical images with large homogeneous regions, denoisers have a stronger effect on pixel variance, resulting in a noticeable gap between SD values for these two methods.

Table 9 displays the t-test and Wilcoxon test values for the NMI dataset, which also support the finding that GP performs significantly better than WT methods, with a t-value exceeding 11, a p-value of less than 0.0001, and a W score of zero. Figure 13 shows the visual comparison of the denoised images, which further supports the findings. Figure 13(b) shows that the GP-based method generated denoised images with finer structure and fewer artifacts compared to WT.

Table 9 t-test and Wilcoxon test analysis for NMI Dataset.

SIDD dataset

In this dataset, GP maintained its consistency in reconstruction capability with a higher PSNR. Table 10 shows that GP achieves higher pixel-level accuracy, reflected in higher PSNR (34.9333 dB) and lower MSE and RMSE, while WT exhibits higher SSIM and VIF for the SIDD dataset. Therefore, for natural images, GP is more effective at reducing reconstruction error and computational complexity, whereas WT emphasizes the preservation of global and perceptual content. The high variations in PSNR across the SIDD dataset images are due to heterogeneous noise levels, diverse scene content, and preprocessing differences. Figure 14 exhibits the performance analysis of GP and WT methods using PSNR, SSIM, MSE, RMSE, VIF, FOM, MAE, and computational complexity.

Table 10 Quantitative results of SIDD Image denoising using soft thresholding, in terms of PSNR, SSIM, MSE, RMSE, VIF, FOM, MAE, and processing time.

To validate the results, statistical analysis is also performed using t-tests and the Wilcoxon signed-rank test, which confirmed the superiority of GP over wavelet-based methods. Table 11 shows paired t-tests with a t-value of 4.131 and a p-value of 0.0026, along with a Wilcoxon test showing a W score of 2.0 and a p-value of 0.00585, supporting the robustness of GP compared to WT. In Fig. 14a, the lower SD (2.57) of GP demonstrates that it is highly sensitive to scene variability, achieving higher accuracy in smooth areas but large fluctuations in texture-rich regions. Figure 15 presents the visual comparison for this dataset. GP produced cleaner images with preserved information and minimal artifacts.

Table 11 t-test and Wilcoxon test analysis for SIDD Dataset.

Table 12 presents the effect of soft and hard thresholding used in wavelet methods. The improved PSNR (ranging from 0.5 to 1.0 dB) and SSIM (ranging from 0.18 to 0.20) using soft thresholding are attributable to its gradual shrinkage function applied to coefficients, whereas in hard thresholding, coefficients below a threshold are set to zero, directly resulting in discontinuities in the image. This improved reconstruction ability comes at the cost of increased computational time due to the additional processing required for coefficient adjustment. In contrast, hard thresholding offers faster execution at the expense of compromised image quality, as it eliminates fine details along with noise.
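
The two thresholding rules compared in Table 12 can be sketched as follows. This is a minimal illustration on individual coefficients, assuming the standard definitions of hard and soft thresholding; the function names are hypothetical, and the threshold selection rule used in the study is not shown.

```python
import math


def hard_threshold(coeff, threshold):
    """Keep a coefficient unchanged if it exceeds the threshold, else zero it."""
    return coeff if abs(coeff) > threshold else 0.0


def soft_threshold(coeff, threshold):
    """Zero small coefficients and shrink the rest towards zero by threshold."""
    shrunk = max(abs(coeff) - threshold, 0.0)
    return math.copysign(shrunk, coeff)


def apply_threshold(coeffs, threshold, rule=soft_threshold):
    """Apply the chosen rule to every coefficient in a list."""
    return [rule(c, threshold) for c in coeffs]
```

The hard rule jumps from 0 to the full coefficient value at the threshold, which is the discontinuity mentioned above, whereas the soft rule is continuous because surviving coefficients are reduced by the threshold.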

Table 12 Comparison of soft and hard thresholding for Wavelet Transform–based denoising.

This work highlights the strength of a GP framework for real-world image denoising.

Table 13 presents a comparative analysis of GP and WT-based methods against the benchmarked learning-based denoising methods, including DnCNN, RIDNet, and Noise2Void. The results show that GP outperforms WT, producing higher PSNR and SSIM with less computational complexity. Results are competitive with these state-of-the-art learning-based methods. All of these require extensive training datasets, have high computational costs, and are more resource-intensive. In contrast, the GP method is lightweight, training-free, and adaptable across diverse datasets. For medical images, it achieves a PSNR of more than 35 dB with an SSIM of 0.96013, comparable to RIDNet (0.9526) and DnCNN (0.8635) and higher than Noise2Void (27.71), indicating that GP not only suppresses noise but also maintains structural integrity. This demonstrates that GP is an interpretable, efficient, and reliable alternative for real-world scenarios where both accuracy and interpretability are important.

Table 13 Comparison of Gaussian Pyramid, Wavelet Transform, BM3D, and DL–based denoising methods in terms of PSNR, SSIM, computational time, and the dataset used.

The GP method effectively reduces high-frequency noise while preserving edges and finer details. It is also computationally efficient and free of iterative optimization or DNN inference, making it suitable for real-time applications such as medical diagnosis, satellite imaging, underwater imaging, video surveillance, and robotics vision, where expedited noise reduction is necessary to improve workflow. As efficiency is a crucial characteristic of filters and transforms, it can be implemented even on resource-constrained platforms, such as embedded clinical image consoles and embedded systems.

Challenges and limitations

Despite the promising results of the GP-based image denoising technique in terms of PSNR, SSIM, MSE, and computational cost for X-ray and MRI images, several challenges remain that require further attention. The performance of the GP denoising method depends strongly on the choice of filters at the various pyramid levels. Suboptimal settings can either over-smooth diagnostically important details at finer levels or leave residual noise at coarser levels. Although GP is less complex than deep learning techniques, its multiscale processing and reconstruction still require iterative operations, such as downsampling, filtering, and upsampling, which increase processing time compared to simple spatial filters. The effectiveness of the GP method is influenced by the content of the image; for example, high-contrast or textured images may retain more noise, while homogeneous regions such as soft tissues in MRI or skies and lakes in non-medical images may benefit more. Medical images can also contain motion artifacts. More complex artifacts, such as Poisson noise from low-dose acquisition, streaks caused by motion, or coil-related intensity heterogeneity, are only partially reduced, indicating a need for hybrid noise models. Unlike deep learning methods, GP is manually parameterized, which limits its ability to generalize across heterogeneous datasets.

Future directions

While the GP method shows significant improvements over traditional techniques, researchers continue to seek further enhancements. Exploring hybrid networks that combine GP with deep learning frameworks, such as CNNs or GANs, is promising. Developing adaptive and context-aware methods that can manage non-uniform noise distributions is recommended. Domain-specific applications, including medical and underwater imaging, along with multiscale and multimodal approaches, complemented by standardized benchmarks, offer promising directions for better denoising performance. Several aspects of GP decomposition remain open for future research; for example, replacing fixed filters with adaptive ones could enhance adaptability. VAEs can be integrated at various layers of the pyramid to boost flexibility in image denoising. Future studies should focus on optimizing deep learning models, especially for unsupervised learning and real-time processing in large-scale image tasks. GP is favourable in terms of CC, whereas the Kalman filter excels in denoising efficiency; thus, combining GP with the Kalman filter could enhance adaptability and perceptual quality across different imaging modalities.

The GP denoising framework has low complexity and is not based on training; hence, it is easy to implement for real-time processing on system-on-chips, such as digital camera or smartphone image processing pipelines. Its hierarchical multi-scale structure provides excellent noise suppression and fine structural detail preservation capability, resulting in good image quality under low-light and high-noise-level conditions. Further studies may investigate the direct incorporation of the GP approach into on-device imaging platforms to demonstrate practical usability.

Data availability

No datasets were generated or analysed during the current study.

Code availability

The code developed and used in this study is developed by the authors. It is not publicly available as it is being further developed for ongoing research. Reasonable requests for access to the code can be considered by contacting the corresponding author.

References

Yapici, A. & Akcayol, M. A. A review of image denoising with deep learning. In 2021 2nd International Informatics and Software Engineering Conference (IISEC), 1–6. IEEE (2021). https://doi.org/10.1109/IISEC54230.2021.9672379
Gupta, H., Chauhan, H., Bijalwan, A. & Joshi, K. A review on image denoising (2019). http://ssrn.com/link/ICAESMT-2019.html=xyz.
Mafi, M. et al. A comprehensive survey on impulse and Gaussian denoising filters for digital images. Signal Process. 157, 236–260. https://doi.org/10.1016/j.sigpro.2018.12.006 (2019).
Perry, S. Denoising of Photographic Images and Video (Springer International Publishing, 2018). https://doi.org/10.1007/978-3-319-96029-6.
Akhade, K., Ghodekar, S., Kapse, V., Raykar, A. & Wadhvane, S. A survey on image denoising techniques. Int. J. Innov. Sci. Res. Technol. 9 (2024).
Chen, C., Xiong, Z., Tian, X., Zha, Z. J. & Wu, F. Real-world image denoising with deep boosting. IEEE Trans. Pattern Anal. Mach. Intell. 42(12), 3071–3087. https://doi.org/10.1109/TPAMI.2019.2921548 (2020).
Russo, F. A method for estimation and filtering of Gaussian noise in images. IEEE Trans. Instrum. Meas. 52(4), 1148–1154. https://doi.org/10.1109/IMTC.2002.1007214 (2003).
Fan, L., Zhang, F., Fan, H. & Zhang, C. Brief review of image denoising techniques. Vis. Comput. Ind. Biomed. Art https://doi.org/10.1186/s42492-019-0016-7 (2019).
Sharma, A. & Sunkaria, R. K. Convolution neural network based image denoising: A review. In 2022 2nd International Conference on Advance Computing and Innovative Technologies in Engineering (ICACITE), 2094–2097. IEEE (2022). https://doi.org/10.1109/ICACITE53722.2022.9823935
Taassori, M. & Vizvári, B. Enhancing medical image denoising: A hybrid approach incorporating adaptive Kalman filter and non-local means with Latin square optimization. Electronics https://doi.org/10.3390/electronics13132640 (2024).
El Helou, M. & Susstrunk, S. Blind universal Bayesian image denoising with Gaussian noise level learning. IEEE Trans. Image Process. 29, 4885–4897. https://doi.org/10.1109/TIP.2020.2976814 (2020).
Iqbal, A. & Seghouane, A. K. An α-divergence-based approach for robust dictionary learning. IEEE Trans. Image Process. 28(11), 5729–5739. https://doi.org/10.1109/TIP.2019.2922074 (2019).
Bian, S., He, X., Xu, Z. & Zhang, L. Hybrid dilated convolution with attention mechanisms for image denoising. Electronics https://doi.org/10.3390/electronics12183770 (2023).
Tian, C., Xu, Y., Zuo, W., Lin, C. W. & Zhang, D. Asymmetric CNN for image superresolution. IEEE Trans. Syst. Man Cybern. Syst. 23, 1–13. https://doi.org/10.1109/TSMC.2021.3069265 (2021).
Li, S., Chen, Y., Jiang, R. & Tian, X. Image denoising via multi-scale gated fusion network. IEEE Access 7, 49392–49402. https://doi.org/10.1109/ACCESS.2019.2910879 (2019).
Kong, Z., Deng, F., Zhuang, H., Yu, J., He, L. & Yang, X. A comparison of image denoising methods. Preprint at https://arxiv.org/abs/2304.08990 (2023).
Li, S., Hao, Q., Kang, X. & Benediktsson, J. A. Gaussian pyramid-based multiscale feature fusion for hyperspectral image classification. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 11(9), 3312–3324 (2018). https://doi.org/10.1109/JSTARS.2018.2856741
Guo, S., Yan, Z., Zhang, K., Zuo, W. & Zhang, L. Toward convolutional blind denoising of real photographs. Proc. IEEE Conf. Comput. Vis. Pattern Recognit. (CVPR) https://doi.org/10.1109/CVPR.2019.00181 (2019).
Khan, A., Jin, W., Haider, A., Rahman, M. & Wang, D. Adversarial Gaussian denoiser for multiple-level image denoising. Sensors https://doi.org/10.3390/s21092998 (2021).
Boyat, A. K. & Joshi, B. K. A review paper: Noise models in digital image processing. Signal Image Process. 6(2), 63–75. https://doi.org/10.5121/sipij.2015.6206 (2015).
Anchal, Budhiraja, S., Goyal, B., Dogra, A. & Agrawal, S. An efficient image denoising scheme for higher noise levels using spatial domain filters. Biomed. Pharmacol. J. 11(2), 625–634. https://doi.org/10.13005/bpj/1415 (2018).
Lee, Y. K. & Ding, J. J. Efficient color image denoising using DWT-based noise estimation and adaptive Wiener filter. Proc. 2024 8th Int. Conf. Imaging, Signal Processing and Communications (ICISPC), 47–51, (2024). https://doi.org/10.1109/ICISPC63824.2024.00016
Parse, T. A., Awasthi, T., Yadav, D. & Joshi, P. QAAD: Quality aware adaptive denoising. Proc. 11th Int. Conf. Signal Processing and Integrated Networks (SPIN), 180–186, (2024). https://doi.org/10.1109/SPIN60856.2024.10511652
Lebrun, M., Buades, A. & Morel, J. M. A nonlocal Bayesian image denoising algorithm. SIAM J. Imaging Sci. 6(3), 1665–1688. https://doi.org/10.1137/120874989 (2013).
Buades, A., Coll, B. & Morel, J. M. A non-local algorithm for image denoising. Proc. IEEE Comput. Soc. Conf. Comput. Vis. Pattern Recognit. (CVPR), 60–65, (2005). https://doi.org/10.1109/CVPR.2005.38
Chen, F., Zhang, L. & Yu, H. External patch prior guided internal clustering for image denoising. Proc. IEEE Int. Conf. Comput. Vis. (ICCV), 603–611, (2015). https://doi.org/10.1109/ICCV.2015.76
Sajid, M. & Khurshid, K. Satellite image restoration using RLS adaptive filter and enhancement by image processing techniques. Proc. 2015 Symp. Recent Advances in Electrical Engineering (RAEE), (2015). https://doi.org/10.1109/RAEE.2015.7352750
Technologii, C. H., Poc, S. & Multime, G. A. Deep learning for image denoising. J. Comput. Sci. Technol. 7(3), 171–180. https://doi.org/10.14257/ijsip.2014.7.3.14 (2013).
Ma, R., Li, S., Zhang, B. & Li, Z. Generative adaptive convolutions for real-world noisy image denoising. Proc. AAAI Conf. Artificial Intelligence (2022). https://www.aaai.org
Chatterjee, S. et al. Variational autoencoder based imbalanced COVID-19 detection using chest X-ray images. New Gener. Comput. 41(1), 25–60. https://doi.org/10.1007/s00354-022-00194-y (2023).
Ma, H. & Nie, Y. Directional weighted mean filter and improved adaptive anisotropic diffusion model. J. Imaging Sci. https://doi.org/10.1155/2018/IDxxxxxx (2018).
Xu, J., Zhang, L. & Zhang, D. External prior guided internal prior learning for real-world noisy image denoising. IEEE Trans. Image Process. 27(6), 2996–3010. https://doi.org/10.1109/TIP.2018.2811546 (2018).
Fathi, A. & Naghsh-Nilchi, A. R. Efficient image denoising method based on a new adaptive wavelet packet thresholding function. IEEE Trans. Image Process. 21(9), 3981–3990. https://doi.org/10.1109/TIP.2012.2200491 (2012).
Mafi, M., Tabarestani, S., Cabrerizo, M., Barreto, A. & Adjouadi, M. Denoising of ultrasound images affected by combined speckle and Gaussian noise. IET Image Process. 12(12), 2346–2351. https://doi.org/10.1049/iet-ipr.2018.5292 (2018).
Xu, J., Zhang, L., Zuo, W., Zhang, D. & Feng, X. Patch group based nonlocal self-similarity prior learning for image denoising. Proc. IEEE Int. Conf. Comput. Vis. (ICCV), 244–252, (2015). https://doi.org/10.1109/ICCV.2015.36
Feng, W., Qiao, P., Xi, X. & Chen, Y. Image denoising via multiscale nonlinear diffusion models. SIAM J. Imaging Sci. 10(3), 1234–1257. https://doi.org/10.1137/16M1093707 (2017).
Panigrahi, S. K., Gupta, S. & Sahu, P. K. Curvelet-based multiscale denoising using non-local means and guided image filter. IET Image Process. 12(6), 909–918. https://doi.org/10.1049/iet-ipr.2017.0825 (2018).
Zhang, K., Zuo, W., Chen, Y., Meng, D. & Zhang, L. Beyond a Gaussian denoiser: Residual learning of deep CNN for image denoising. IEEE Trans. Image Process. 26(7), 3142–3155. https://doi.org/10.1109/TIP.2017.2662206 (2017).
Krull, A., Buchholz, T. O. & Jug, F. Noise2Void – Learning denoising from single noisy images, accessed April 5, 2019. Preprint at http://arxiv.org/abs/1811.10980 (2019).
Anwar, S. & Barnes, N. Real image denoising with feature attention. Proc. IEEE Int. Conf. Comput. Vis. https://doi.org/10.1109/ICCV.2019.00325 (2019).
Nazir, N., Sarwar, A. & Saini, B. S. Recent developments in denoising medical images using deep learning: An overview of models, techniques, and challenges. Micron 180, 103615. https://doi.org/10.1016/j.micron.2024.103615 (2024).
Ma, R., Li, S., Zhang, B., Fang, L. & Li, Z. Flexible and generalized real photograph denoising exploiting dual meta attention. IEEE Trans. Cybern. 53(10), 6395–6407. https://doi.org/10.1109/TCYB.2022.3170472 (2023).
Zheng, D., Tan, S. H., Zhang, X., Shi, Z., Ma, K. & Bao, C. An unsupervised deep learning approach for real-world image denoising. arXiv preprint arXiv:xxxx.xxxxx (2025).
Abuya, T. K., Rimiru, R. M. & Okeyo, G. O. An image denoising technique using wavelet-anisotropic Gaussian filter-based denoising convolutional neural network for CT images. Appl. Sci. 13(21), 12069. https://doi.org/10.3390/app132112069 (2023).
Li, P., Li, G., Hu, D. & Zhang, S. Research on feature extraction and center localization recognition of circular hole based on Gaussian pyramid hierarchical clustering and Hough circle detection. Proc. IEEE 2nd Int. Conf. Image Process. Comput. Appl. (ICIPCA), 626–630, (2024). https://doi.org/10.1109/ICIPCA61593.2024.10708992
Sundarrajan, M., Choudhry, M. D., Biju, J., Krishnakumar, S. & Rajeshkumar, K. Enhancing low-light medical imaging through deep learning-based noise reduction techniques. Indian J. Sci. Technol. 17(34), 3567–3579. https://doi.org/10.17485/IJST/v17i34.2489 (2024).
Song, Y., Zhu, Y. & Du, X. Grouped multi-scale network for real-world image denoising. IEEE Signal Process. Lett. 27, 2124–2128. https://doi.org/10.1109/LSP.2020.3039726 (2020).
Ma, R., Zhang, B. & Hu, H. Gaussian pyramid of conditional generative adversarial network for real-world noisy image denoising. Neural Process. Lett. 51, 2669–2684. https://doi.org/10.1007/s11063-020-10215-w (2020).
Zhang, S. & Lam, E. Y. Denoising for photon-limited imaging via a multi-level pyramid network. Proc. IEEE TENCON https://doi.org/10.1109/TENCON55691.2022.9977646 (2022).
Ma, R., Hu, H., Xing, S. & Li, Z. Efficient and fast real-world noisy image denoising by combining pyramid neural network and two-pathway unscented Kalman filter. IEEE Trans. Image Process. 29, 3927–3940. https://doi.org/10.1109/TIP.2020.2965294 (2020).
Zhong, T., Li, F., Zhang, R., Dong, X. & Lu, S. Multiscale residual pyramid network for seismic background noise attenuation. IEEE Trans. Geosci. Remote Sens. 60, 1–14. https://doi.org/10.1109/TGRS.2022.3217887 (2022).
Wang, Z., Cun, X., Bao, J., Zhou, W., Liu, J. & Li, H. Uformer: A general U-shaped transformer for image restoration. Proc. IEEE Conf. Comput. Vis. Pattern Recognit. (CVPR), 17662–17672, (2022). https://doi.org/10.1109/CVPR52688.2022.01716
Khmag, A. Natural digital image mixed noise removal using regularization Perona-Malik model and pulse coupled neural networks. Soft Comput. 27, 15523–15532. https://doi.org/10.1007/s00500-023-09148-y (2023).
Zhao, Y., Jiang, Z., Men, A. & Ju, G. Pyramid real image denoising network. Proc. IEEE Visual Commun. Image Process. (VCIP), 1–4, (2019). https://doi.org/10.1109/VCIP47243.2019.8965754
Khmag, A. Additive Gaussian noise removal based on generative adversarial network model and semi-soft thresholding approach. Multimed. Tools Appl. 82, 7757–7777. https://doi.org/10.1007/s11042-022-13569-6 (2023).
Zhang, D. & Zhou, F. Self-supervised image denoising for real-world images with context-aware transformer. IEEE Access 11, 14340–14349. https://doi.org/10.1109/ACCESS.2023.3243829 (2023).
Li, L., Wei, W., Yang, L. et al. CT-Mamba: A hybrid convolutional state space model for low-dose CT denoising. Preprint, https://arxiv.org/abs/2411.07930 (2024).
Chihaoui, H. & Favaro, P. Unsupervised real-world denoising: sparsity is all you need. Preprint, https://arxiv.org/abs/2503.21377 (2025).
Yin, L., Gao, W. & Liu, J. Deep convolutional dictionary learning denoising method based on distributed image patches. Electronics 13(7), 1266. https://doi.org/10.3390/electronics13071266 (2024).
Zhang, S., Liu, C., Zhang, Y., Liu, S. & Wang, X. Multi-scale feature learning convolutional neural network for image denoising. Sensors 23(18), 7713. https://doi.org/10.3390/s23187713 (2023).
Majeed Zangana, H. & Mustafa, F. M. From classical to deep learning: A systematic review of image denoising techniques. J. Ilmiah Comput. Sci. 3(1), 50–65. https://doi.org/10.58602/jics.v3i1.36 (2024).
Huo, C., Zhang, D. & Yang, H. An underwater image denoising method based on high-frequency abrupt signal separation and hybrid attention mechanism. Sensors 24(14), 4578. https://doi.org/10.3390/s24144578 (2024).
Oliveira, F. D. V. R., Gomes, J. G. R. C., Fernandez-Berni, J., Carmona-Galan, R., del Rio, R. & Rodriguez-Vazquez, A. Gaussian pyramid: Comparative analysis of hardware architectures. IEEE Trans. Circuits Syst. I Regul. Pap. 64(9), 2308–2321 (2017). https://doi.org/10.1109/TCSI.2017.2709280
Goyal, B., Dogra, A., Agrawal, S., Sohi, B. S. & Sharma, A. Image denoising review: From classical to state-of-the-art approaches. Inf. Fusion 55, 220–244. https://doi.org/10.1016/j.inffus.2019.09.003 (2020).
Celik, M. & Inik, O. Development of hybrid models based on deep learning and optimized machine learning algorithms for brain tumor multi-classification. Expert Syst. Appl. 238, 122159. https://doi.org/10.1016/j.eswa.2023.122159 (2024).
Mooney, P. Chest X-ray images (pneumonia). Kaggle dataset https://doi.org/10.17632/rscbjbr9sj.2 (2018).
Kermany, D. S. et al. Identifying medical diagnoses and treatable diseases by image-based deep learning. Cell 172(5), 1122-1131.e9. https://doi.org/10.1016/j.cell.2018.02.010 (2018).
Sil, D., Dutta, A. & Chandra, A. Convolutional neural networks for noise classification and denoising of images. Proc. IEEE TENCON, 447–451, (2019). https://doi.org/10.1109/TENCON.2019.8929277
Cui, Y., Shi, M. & Jiang, J. Multi-scale detail–noise complementary learning for image denoising. Appl. Sci. 14(16), 7044. https://doi.org/10.3390/app14167044 (2024).
Zhang, D., Zhou, F., Jiang, Y. & Fu, Z. MM-BSN: Self-supervised image denoising for real-world with multi-mask based on blind-spot network. Preprint at https://arxiv.org/abs/2304.01598 (2023).

Acknowledgements

We acknowledge the extensive body of existing work on image denoising and GP methods, which provided valuable insights and guidance throughout this research.

Funding

This study received no funding.

Author information

Authors and Affiliations

School of Electronics and Electrical Engineering, Lovely Professional University, Phagwara, 144411, India
Asha Rani & Rosepreet Kaur Bhogal

Authors

Asha Rani
Rosepreet Kaur Bhogal

Contributions

A.R. conceptualized the study, conducted the experiments, and drafted the manuscript. R.K.B. provided critical feedback, supervised the research process, and contributed to the revision and finalization of the manuscript.

Corresponding author

Correspondence to Asha Rani.

Ethics declarations

Competing interests

The authors declare no competing interests.

Ethical approval

This article does not involve human participants, animal subjects, or any sensitive data requiring ethical approval.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

About this article

Cite this article

Rani, A., Bhogal, R.K. Efficient real-world image denoising using multi-scale gaussian pyramids. Sci Rep 15, 40086 (2025). https://doi.org/10.1038/s41598-025-23942-8

Received: 11 June 2025
Accepted: 09 October 2025
Published: 17 November 2025
Version of record: 17 November 2025
DOI: https://doi.org/10.1038/s41598-025-23942-8
