Abstract
Transformer-based models have shown strong performance in multivariate long-sequence time series forecasting. The Informer model, with its probabilistic sparse self-attention mechanism, offers lower computational complexity than comparable models. However, Informer struggles to capture correlations between variables and multi-scale local features, which limits its prediction accuracy. To improve accuracy and further reduce computational complexity, this paper proposes a novel Informer-based model named TACDformer. First, a multi-scale feature distillation layer is designed; it extracts the multi-scale local features that the self-attention mechanism has difficulty capturing and performs adaptive downsampling while fully preserving temporal and inter-sequence dependencies. Second, the model adopts an encoder-only architecture, avoiding the negative effects that a stacked decoder and historical time series fed as decoder inputs have on the prediction results. Third, because a simple linear mapping cannot fully exploit the features produced by the encoder, an output layer based on temporal attention and channel attention is designed; this layer improves model performance while reducing computational complexity. Finally, reversible instance normalization is introduced into the Informer framework: normalizing the data at the input stage and denormalizing at the output stage reduces the prediction bias caused by differences in data distribution. To validate the effectiveness of TACDformer, extensive experiments were conducted on seven benchmark datasets spanning four domains: electricity, finance, energy, and weather.
The results show that TACDformer outperforms eight baseline models, including TimeMixer and iTransformer, on the MAE and MSE metrics. Compared with Informer, TACDformer reduces computational complexity by 33.2% and memory usage by 63.4%, and shortens running time by 77.2%. Code is available at: https://github.com/hclxk-hzy/TACDformer.
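The reversible instance normalization step described in the abstract (normalize each input instance, then denormalize the model's output) can be sketched as follows. This is a minimal illustration with hypothetical function names; it omits the learnable affine parameters of full RevIN and is not the authors' implementation:

```python
import numpy as np

def revin_normalize(x, eps=1e-5):
    """Normalize each instance per channel over the time axis.

    x: array of shape (batch, time, channels).
    Returns the normalized series and the statistics needed to invert it.
    """
    mean = x.mean(axis=1, keepdims=True)
    std = x.std(axis=1, keepdims=True)
    return (x - mean) / (std + eps), (mean, std)

def revin_denormalize(y, stats, eps=1e-5):
    """Restore the original scale and offset using the saved statistics."""
    mean, std = stats
    return y * (std + eps) + mean

# Round trip: denormalizing the normalized series recovers the input,
# so a forecast produced in the normalized space can be mapped back
# to the original data distribution.
x = np.random.randn(2, 8, 3) * 5.0 + 2.0
x_norm, stats = revin_normalize(x)
x_back = revin_denormalize(x_norm, stats)
```

In the full method, the statistics are computed on the input window and reused to denormalize the predicted window, which is what counteracts distribution shift between training and test data.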
Data availability
Data will be made available on request.
References
Mahalakshmi, G., Sridevi, S. & Rajaram, S. A survey on forecasting of time series data. In 2016 International Conference on Computing Technologies and Intelligent Data Engineering (ICCTIDE’16), 1–8 (IEEE, 2016).
Zhang, Y. & Yan, J. Crossformer: Transformer utilizing cross-dimension dependency for multivariate time series forecasting. In The Eleventh International Conference on Learning Representations (2023).
Jan, F., Shah, I. & Ali, S. Short-term electricity prices forecasting using functional time series analysis. Energies 15(9), 3423 (2022).
Zeng, P. et al. Muformer: A long sequence time-series forecasting model based on modified multi-head attention. Knowl.-Based Syst. 254, 109584 (2022).
Mendis, K., Wickramasinghe, M. & Marasinghe, P. Multivariate time series forecasting: A review. In Proceedings of the 2024 2nd Asia Conference on Computer Vision, Image Processing and Pattern Recognition, 1–9 (2024).
Yu, R., Zheng, S., Anandkumar, A. & Yue, Y. Long-term forecasting using tensor-train rnns (2018).
Hewamalage, H., Bergmeir, C. & Bandara, K. Global models for time series forecasting: A simulation study. Pattern Recognit. 124, 108441 (2022).
Lea, C., Flynn, M. D., Vidal, R., Reiter, A. & Hager, G. D. Temporal convolutional networks for action segmentation and detection. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 156–165 (2017).
Wu, H., Hu, T., Liu, Y., Zhou, H., Wang, J. & Long, M. Timesnet: Temporal 2d-variation modeling for general time series analysis. arXiv preprint arXiv:2210.02186 (2022).
Vaswani, A. et al. Attention is all you need. Advances in neural information processing systems 30 (2017).
Amin, S. U., Jung, Y., Fayaz, M., Kim, B. & Seo, S. Enhancing pine wilt disease detection with synthetic data and external attention-based transformers. Eng. Appl. Artif. Intell. 159, 111655 (2025).
Haider, Z. A. et al. A comprehensive approach for image quality assessment using quality-centric embedding and ranking networks. Pattern Recognit. 173, 112890 (2025).
Wen, Q. et al. Transformers in time series: A survey. arXiv preprint arXiv:2202.07125 (2022).
Li, S. et al. Enhancing the locality and breaking the memory bottleneck of transformer on time series forecasting. Advances in neural information processing systems 32 (2019).
Liu, S. et al. Pyraformer: Low-complexity pyramidal attention for long-range time series modeling and forecasting. In International Conference on Learning Representations (2021).
Cirstea, R.-G. et al. Triformer: Triangular, variable-specific attentions for long sequence multivariate time series forecasting–full version. arXiv preprint arXiv:2204.13767 (2022).
Wu, H., Xu, J., Wang, J. & Long, M. Autoformer: Decomposition transformers with auto-correlation for long-term series forecasting. Advances in neural information processing systems 34, 22419–22430 (2021).
Zhou, T. et al. Fedformer: Frequency enhanced decomposed transformer for long-term series forecasting. In International Conference on Machine Learning, 27268–27286 (PMLR, 2022).
Ul Amin, S. et al. Eadn: An efficient deep learning model for anomaly detection in videos. Mathematics 10(9), 1555 (2022).
Nie, Y. A time series is worth 64 words: Long-term forecasting with transformers. arXiv preprint arXiv:2211.14730 (2022).
Liu, Y. et al. itransformer: Inverted transformers are effective for time series forecasting. arXiv preprint arXiv:2310.06625 (2023).
Yu, C., Yan, G., Yu, C., Liu, X. & Mi, X. Mriformer: A multi-resolution interactive transformer for wind speed multi-step prediction. Inf. Sci. 661, 120150 (2024).
Zhou, H. et al. Informer: Beyond efficient transformer for long sequence time-series forecasting. In Proceedings of the AAAI Conference on Artificial Intelligence, vol. 35, 11106–11115 (2021).
Zhong, X. et al. Efficient seizure detection by lightweight informer combined with fusion of time-frequency-spatial features. Appl. Intell. 55(7), 643 (2025).
Wu, B., Lin, J., Lv, S.-X. & Wang, L. End-to-end multidimensional interpretable tourism demand combined forecasting model based on feature fusion. Appl. Intell. 55(12), 1–32 (2025).
Huang, X., Chen, N., Deng, Z. & Huang, S. Multivariate time series anomaly detection via dynamic graph attention network and informer. Appl. Intell. 54(17–18), 7636–7658 (2024).
Quanwei, T., Guijun, X. & Wenju, X. Mgmi: A novel deep learning model based on short-term thermal load prediction. Appl. Energy 376, 124209 (2024).
Zheng, Q. et al. Application of complete ensemble empirical mode decomposition based multi-stream informer (ceemd-msi) in pm2.5 concentration long-term prediction. Expert Syst. Appl. 245, 123008 (2024).
Mukhtorov, D., Baltayev, J., Muksimova, S., Umirzakova, S. & Cho, Y.-I. Standards-aligned ai validation and certification platform for trustworthy modeling. IEEE Access 13, 216302–216317 (2025).
Kim, T. et al. Reversible instance normalization for accurate time-series forecasting against distribution shift. In International Conference on Learning Representations (2021).
Song, H. et al. Cmkd-net: A cross-modal knowledge distillation method for remote sensing image classification. Adv. Space Res. 75(12), 8515–8534 (2025).
Song, H. et al. Symmetrical learning and transferring: Efficient knowledge distillation for remote sensing image classification. Symmetry 17(7), 1002 (2025).
Zhang, Y. et al. Solar array power prediction of long endurance stratospheric aerostat using a hybrid model based on blur informer. Solar Energy 287, 113121 (2025).
Dong, X., Li, D., Wang, W. & Shen, Y. Bwo-caformer: An improved informer model for aqi prediction in Beijing and Wuhan. Process Saf. Environ. Prot. 195, 106800 (2025).
Huo, J., Bian, W. & Chang, C. Ultra-short-term wind power forecasting based on the mifcformer model and a critical low wind speed region power revision strategy. Digit. Signal Process. 168, 105521 (2025).
Liu, B., Li, Z., Li, Z. & Chen, C. Home appliance load forecasting based on improved informer. In 2023 3rd International Conference on Intelligent Communications and Computing (ICC), 74–79 (IEEE, 2023).
Ruan, G. et al. A deep learning model for predicting the state of energy in lithium-ion batteries based on magnetic field effects. Energy 304, 132161 (2024).
Zhang, Y. et al. Achieving high precision and balanced multi-energy load forecasting with mixed time scales: a multi-task learning model with stacked cross-attention. Energy AI 21, 100561 (2025).
Ul Amin, S., Kim, B., Jung, Y., Seo, S. & Park, S. Video anomaly detection utilizing efficient spatiotemporal feature fusion with 3d convolutions and long short-term memory modules. Adv. Intell. Syst. 6(7), 2300706 (2024).
Jannu, C. & Vanambathina, S. D. An overview of speech enhancement based on deep learning techniques. Int. J. Image Graph. 25(01), 2550001 (2025).
Jannu, C., Burra, M., Vanambathina, S. D. & Parisae, V. Real-time single channel speech enhancement using triple attention and stacked squeeze-tcn. Comput. Intell. 41(1), 70016 (2025).
Sun, B., Chen, X., Shen, T. & Ma, L. Enhancing long-term load forecasting with convolutional informer-based hybrid model. Eng. Appl. Artif. Intell. 161, 112051 (2025).
Liu, Z., Tan, Z. & Wang, Y. A mconvtcn-informer deep learning model for soc prediction of lithium-ion batteries. J. Energy Storage 129, 117092 (2025).
Li, C. et al. Cnn-informer: A hybrid deep learning model for seizure detection on long-term eeg. Neural Netw. 181, 106855 (2025).
Kumar, T., Mileo, A. & Bendechache, M. Facesaliencyaug: Mitigating geographic, gender and stereotypical biases via saliency-based data augmentation. Signal Image Video Process. 19(2), 129 (2025).
Kumar, T., Mileo, A. & Bendechache, M. Keeporiginalaugment: Single image-based better information-preserving data augmentation approach. In IFIP International Conference on Artificial Intelligence Applications and Innovations, 27–40 (Springer, 2024).
Kumar, T., Mileo, A. & Bendechache, M. Saliency-based metric and facekeeporiginalaugment: A novel approach for enhancing fairness and diversity. Multimed. Syst. 31(3), 153 (2025).
Salam, A. & El Hibaoui, A. Comparison of machine learning algorithms for the power consumption prediction: Case study of Tetouan city. In 2018 6th International Renewable and Sustainable Energy Conference (IRSEC), 1–5 (IEEE, 2018).
Lai, G., Chang, W.-C., Yang, Y. & Liu, H. Modeling long-and short-term temporal patterns with deep neural networks. In The 41st International ACM SIGIR Conference on Research & Development in Information Retrieval, 95–104 (2018).
Candanedo, L. M., Feldheim, V. & Deramaix, D. Data driven prediction models of energy use of appliances in a low-energy house. Energy Build. 140, 81–97 (2017).
Zeng, A., Chen, M., Zhang, L. & Xu, Q. Are transformers effective for time series forecasting? In Proceedings of the AAAI Conference on Artificial Intelligence, vol. 37, 11121–11128 (2023).
Wang, S. et al. Timemixer: Decomposable multiscale mixing for time series forecasting. arXiv preprint arXiv:2405.14616 (2024).
Funding
This work was supported by the Key Research and Development Program Project of Xinjiang Uygur Autonomous Region (No. 2023B01032), the Research Project of Xinjiang Sky-Ground Integrated Intelligent Computing Technology Laboratory (No. 2025A05-1), and the Tianshan Talent Training Project-Xinjiang Science and Technology Innovation Team Program (No. 2023TSYCTD0012).
Author information
Authors and Affiliations
Contributions
Zeyu Hu wrote the main manuscript. Yuan Jia and Wu Le conducted the data collection. Congbing He and HuiHui Fan created the figures and tables. All authors reviewed the manuscript.
Corresponding author
Ethics declarations
Competing interests
The authors declare no competing interests.
Additional information
Publisher’s note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.
About this article
Cite this article
Hu, Z., Jia, Y., Le, W. et al. TACDformer: an improved informer-based model for accurate multivariate long-term time series forecasting. Sci Rep (2026). https://doi.org/10.1038/s41598-026-46529-3
Received:
Accepted:
Published:
DOI: https://doi.org/10.1038/s41598-026-46529-3