BDL: transformer-based super-resolution network for degraded underground coal mine images

Hu, Tao; Qiu, Jinbo; Cheng, Xiang

doi:10.1038/s41598-026-48248-1

Article
Open access
Published: 13 April 2026

Tao Hu^1,2,
Jinbo Qiu^1,2 &
Xiang Cheng^3

Scientific Reports (2026)

We are providing an unedited version of this manuscript to give early access to its findings. Before final publication, the manuscript will undergo further editing. Please note there may be errors present which affect the content, and all legal disclaimers apply.

Abstract

Underground coal mine images often suffer from severe blurring and low-resolution degradation due to harsh lighting, dust, and machinery motion, which hinder accurate visual inspection and automated analysis. This study proposes a transformer-based super-resolution (SR) network that integrates local convolution with adaptive interaction mechanisms for effective local–global feature modeling. The network employs a hierarchical architecture consisting of shallow feature extraction, cascaded spatial and channel transformer blocks, and a reconstruction module. Each transformer block incorporates a bidirectional adaptive interaction module (BAIM) to fuse convolutional local features with transformer-based global representations through adaptive reweighting in both spatial and channel dimensions. A dual-group feedforward network (DGFN) decouples channel feature preservation from spatial information enhancement, while cross-group interactions ensure balanced channel modeling and spatial perception without information loss. Additionally, a local convolution block (LCB) with SE-based channel weighting is used to restore fine-grained details. Extensive experiments on both a dedicated coal mine dataset and public benchmarks demonstrate that the proposed method consistently outperforms existing state-of-the-art (SOTA) SR approaches. Specifically, for ×2 super-resolution, it achieves a PSNR/SSIM of 32.07/0.9688 on the coal mine dataset, improving over the previous best by 0.59 dB and 0.0036, respectively. For ×4 super-resolution, it attains 28.10/0.8836, surpassing the previous best by 0.24 dB and 0.0013. Similar improvements are observed on public datasets, confirming the method’s effectiveness in both general and challenging industrial scenarios.
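The PSNR figures above are easier to interpret with the metric's definition at hand. The following is a minimal sketch of the standard PSNR formula, assuming 8-bit images (peak value 255) and using made-up toy pixel values; the abstract does not state the evaluation protocol (for example, color channel or border cropping), so this is not the authors' evaluation code. The helper names `psnr` and `mse_ratio_for_gain` are hypothetical. The second helper converts a PSNR gain in dB into the corresponding relative mean squared error, under the assumption that both methods are scored against the same peak value.

```python
import math


def psnr(reference, estimate, peak=255.0):
    """Peak signal-to-noise ratio in dB between two equally sized images.

    Images are nested lists: rows of pixel intensities.
    """
    ref_flat = [p for row in reference for p in row]
    est_flat = [p for row in estimate for p in row]
    if len(ref_flat) != len(est_flat) or not ref_flat:
        raise ValueError("images must be non-empty and the same size")
    # Mean squared error over all pixels.
    mse = sum((r - e) ** 2 for r, e in zip(ref_flat, est_flat)) / len(ref_flat)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(peak ** 2 / mse)


def mse_ratio_for_gain(gain_db):
    """Factor by which MSE shrinks for a given PSNR gain (same peak value)."""
    return 10.0 ** (-gain_db / 10.0)


# Toy 2x2 images (illustrative values only, not from the paper's dataset).
hr = [[52, 55], [61, 59]]
sr = [[54, 55], [60, 57]]
print(round(psnr(hr, sr), 2))

# PSNR gains quoted in the abstract, expressed as relative MSE.
print(round(mse_ratio_for_gain(0.59), 3))  # x2 gain over the previous best
print(round(mse_ratio_for_gain(0.24), 3))  # x4 gain over the previous best
```

Under these assumptions, the reported 0.59 dB gain at ×2 corresponds to roughly 13% lower mean squared error than the previous best, and the 0.24 dB gain at ×4 to roughly 5% lower.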

Data availability

The datasets used and/or analysed during the current study are available from the corresponding author on reasonable request.

Funding

This work was supported by the Science Foundation of China Coal Technology and Engineering Group Shanghai Company Ltd (No. 02062235825 J).

Author information

Authors and Affiliations

State Key Laboratory of Intelligent Coal Mining and Strata Control, Shanghai, 200030, China
Tao Hu & Jinbo Qiu
China Coal Technology and Engineering Group Shanghai Co., Ltd, Shanghai, 200030, China
Tao Hu & Jinbo Qiu
The Chinese Antarctic Center of Surveying and Mapping, Wuhan University, Wuhan, 430079, China
Xiang Cheng

Contributions

T.H.: Methodology, software, writing—original draft preparation, visualization, funding acquisition. J.Q.: Conceptualization, experimentation, data curation, supervision. X.C.: Validation, writing—review and editing, project administration. All authors have read and agreed to the published version of the manuscript.

Corresponding authors

Correspondence to Tao Hu or Xiang Cheng.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

About this article

Cite this article

Hu, T., Qiu, J. & Cheng, X. BDL: transformer-based super-resolution network for degraded underground coal mine images. Sci Rep (2026). https://doi.org/10.1038/s41598-026-48248-1

Received: 13 February 2026
Accepted: 07 April 2026
Published: 13 April 2026
DOI: https://doi.org/10.1038/s41598-026-48248-1