Abstract
This paper explores how artificial intelligence (AI) can drive the reform and innovation of music teaching. A Dilated Convolutional Neural Network (DCNN) is combined with an attention mechanism to build an audio recognition model, the Multi-Branch Fusion Network Based on Dilated Convolution and Attention Mechanism (MBFN-DCAM), which aims to improve the accuracy and efficiency of music audio recognition. In horizontal comparison experiments, MBFN-DCAM outperforms six representative state-of-the-art (SOTA) models, including Audio Spectrogram Transformer (AST) and ResNet-50, reaching a recognition accuracy of 95.65% ± 0.35% (p < 0.01). In a randomized controlled trial (n = 60), feedback provided by the model significantly improves students’ pitch accuracy and raises their scores on the Music Self-Efficacy Scale (MES). The model's inference latency on Jetson Nano is 82.4 ms, which supports its deployment in resource-constrained environments. The paper thus provides an efficient, robust, and educationally effective technical approach to intelligent music teaching evaluation.
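As a rough illustration of the two ideas named in the abstract, the sketch below applies one kernel at several dilation rates and fuses the resulting branches with softmax attention weights. It is not the authors' MBFN-DCAM: the function names, the shared kernel, and the mean-absolute-activation attention score are all hypothetical choices made for this example.

```python
import math


def dilated_conv1d(signal, kernel, dilation):
    """Same-length 1D dilated convolution (cross-correlation) with zero padding."""
    k = len(kernel)
    span = (k - 1) * dilation  # distance between the first and last kernel tap
    pad_left = span // 2
    padded = [0.0] * pad_left + list(signal) + [0.0] * (span - pad_left)
    return [
        sum(kernel[j] * padded[i + j * dilation] for j in range(k))
        for i in range(len(signal))
    ]


def receptive_field(kernel_size, dilation):
    """Number of input samples covered by one dilated kernel."""
    return (kernel_size - 1) * dilation + 1


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_branches(signal, kernel, dilations):
    """Run one branch per dilation rate and fuse them with attention weights.

    Assumption: each branch is scored by its mean absolute activation.
    A trained model would learn these scores instead.
    """
    branches = [dilated_conv1d(signal, kernel, d) for d in dilations]
    scores = [sum(abs(v) for v in b) / len(b) for b in branches]
    weights = softmax(scores)
    fused = [
        sum(w * b[i] for w, b in zip(weights, branches))
        for i in range(len(signal))
    ]
    return fused, weights


if __name__ == "__main__":
    frame = [1, 2, 3, 4, 5]
    fused, weights = fuse_branches(frame, [1, 1, 1], dilations=[1, 2, 4])
    print("branch weights:", weights)
    print("fused output:", fused)
```

With a kernel of size 3, raising the dilation from 1 to 4 widens the receptive field from 3 to 9 samples without adding parameters, which is the usual motivation for dilated branches in audio models.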
Data availability
The datasets used and/or analysed during the current study are available from the corresponding author, Ningning Shi, on reasonable request via e-mail: [2003248@hlju.edu.cn](mailto:2003248@hlju.edu.cn).
Acknowledgments
This work was supported by the Research Project on the Protection and Contemporary Value of Heilongjiang Manchu Music, 2025 Basic Research Business Expenses of Provincial Universities in Heilongjiang Province.
Author information
Contributions
Chang Liu: Conceptualization, methodology, software, validation, formal analysis, investigation, resources, data curation, writing—original draft preparation. Ningning Shi: methodology, software, validation, formal analysis. Shen Jiang: writing—review and editing, visualization, supervision, project administration, funding acquisition.
Ethics declarations
Competing interests
The authors declare no competing interests.
Ethics statement
The studies involving human participants were reviewed and approved by the Ethics Committee of the College of Arts, Heilongjiang University (Approval Number: 2021.398472). The participants provided written informed consent to participate in this study. All methods were performed in accordance with relevant guidelines and regulations.
Additional information
Publisher’s note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.
About this article
Cite this article
Liu, C., Shi, N. & Jiang, S. Artificial intelligence technology for music teaching reform mode under DCNN algorithm. Sci Rep (2026). https://doi.org/10.1038/s41598-026-45027-w
DOI: https://doi.org/10.1038/s41598-026-45027-w