Abstract
Underground coal mine images often suffer from severe blurring and low-resolution degradation due to harsh lighting, dust, and machinery motion, which hinder accurate visual inspection and automated analysis. This study proposes a transformer-based super-resolution (SR) network that integrates local convolution with adaptive interaction mechanisms for effective local–global feature modeling. The network employs a hierarchical architecture consisting of shallow feature extraction, cascaded spatial and channel transformer blocks, and a reconstruction module. Each transformer block incorporates a bidirectional adaptive interaction module (BAIM) to fuse convolutional local features with transformer-based global representations through adaptive reweighting in both spatial and channel dimensions. A dual-group feedforward network (DGFN) decouples channel feature preservation from spatial information enhancement, while cross-group interactions ensure balanced channel modeling and spatial perception without information loss. Additionally, a local convolution block (LCB) with SE-based channel weighting is used to restore fine-grained details. Extensive experiments on both a dedicated coal mine dataset and public benchmarks demonstrate that the proposed method consistently outperforms existing state-of-the-art (SOTA) SR approaches. Specifically, for ×2 super-resolution, it achieves a PSNR/SSIM of 32.07/0.9688 on the coal mine dataset, improving over the previous best by 0.59 dB and 0.0036, respectively. For ×4 super-resolution, it attains 28.10/0.8836, surpassing the previous best by 0.24 dB and 0.0013. Similar improvements are observed on public datasets, confirming the method’s effectiveness in both general and challenging industrial scenarios.
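The results above are reported as PSNR in dB. As a reading aid, the following is a minimal sketch of how PSNR is conventionally computed and of what a given dB gain means in terms of mean squared error. It assumes 8-bit grayscale images stored as nested lists and a peak value of 255. The function names `psnr` and `mse_ratio_for_gain` are illustrative, and the evaluation protocol actually used in the study (color space, border cropping) is not stated in the abstract and may differ.

```python
import math


def psnr(reference, restored, peak=255.0):
    """Return PSNR in dB between two equally sized grayscale images.

    Images are nested lists of pixel intensities (an assumption made for
    this sketch); `peak` is the maximum possible pixel value.
    """
    if len(reference) != len(restored) or any(
        len(row_ref) != len(row_res)
        for row_ref, row_res in zip(reference, restored)
    ):
        raise ValueError("images must have the same shape")

    squared_errors = [
        (a - b) ** 2
        for row_ref, row_res in zip(reference, restored)
        for a, b in zip(row_ref, row_res)
    ]
    if not squared_errors:
        raise ValueError("images must not be empty")

    mse = sum(squared_errors) / len(squared_errors)
    if mse == 0:
        return math.inf  # identical images: PSNR is unbounded
    return 10.0 * math.log10(peak ** 2 / mse)


def mse_ratio_for_gain(gain_db):
    """Return the factor by which MSE shrinks for a PSNR gain of `gain_db` dB."""
    return 10.0 ** (-gain_db / 10.0)
```

Under this definition, the reported 0.59 dB gain at ×2 corresponds to `mse_ratio_for_gain(0.59)`, roughly 0.87, that is, about 13% lower mean squared error than the previous best. The 0.24 dB gain at ×4 corresponds to roughly 5% lower mean squared error.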
Data availability
The datasets used and/or analysed during the current study are available from the corresponding author on reasonable request.
Funding
This work was supported by the Science Foundation of China Coal Technology and Engineering Group Shanghai Company Ltd (No. 02062235825 J).
Contributions
T.H.: methodology, software, writing (original draft preparation), visualization, funding acquisition. J.Q.: conceptualization, experimentation, data curation, supervision. X.C.: validation, writing (review and editing), project administration. All authors have read and agreed to the published version of the manuscript.
Ethics declarations
Competing interests
The authors declare no competing interests.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Hu, T., Qiu, J. & Cheng, X. BDL: transformer-based super-resolution network for degraded underground coal mine images. Sci Rep (2026). https://doi.org/10.1038/s41598-026-48248-1
DOI: https://doi.org/10.1038/s41598-026-48248-1


