Abstract
This study aims to improve the effectiveness and outcomes of youth football training by applying deep learning and artificial intelligence (AI) technologies. First, the relevant dimensions of deep learning and the key training techniques of deep learning convolutional neural networks (CNNs) are analyzed. Second, a key point detection model for youth football training is constructed based on deep learning CNNs. Finally, interviews are conducted with five technology companies and thirty sports teachers to analyze the application scenarios of AI in campus football training. The results show that key point visibility poses little difficulty for the youth football training key point detection model, and the model provides relatively accurate results. The model achieved an accuracy of over 90% in key point prediction, and the prediction errors for key points related to foot placement and curve positioning all remained below 15%. Both companies and schools regard policy, cognitive, attitude, and technological factors as key to applying AI in campus football. Hardware facilities and other related factors also affect the application of AI in campus football. The results offer a practical reference for the intelligent development of youth football training and can promote the high-quality development of youth football in China.
Introduction
Research background and motivations
The training of youth football plays a crucial role in cultivating future football talents. However, traditional training methods often rely on coaches’ subjective experience and observations, lacking objective and quantifiable data support1. Furthermore, due to the large and complex training data, traditional data analysis methods often fail to explore the information fully. Therefore, the intelligent analysis and application of youth football training data using deep learning and artificial intelligence (AI) technologies have significant research and practical value2,3.
Deep learning, an important branch of AI, is based on simulating the structure and function of the human brain’s neural network and achieves intelligent decision-making and prediction through learning from a large amount of data4. Youth football training data contains abundant information, such as player movements and match statistics, which can be utilized to train deep learning models. Through deep learning techniques, researchers can construct models to automatically analyze and understand patterns and regularities within youth football training data, assisting coaches and players in gaining better insights and optimizing training methods5,6. The integration of football and AI, as well as deep learning, has also garnered national attention, igniting new possibilities. Just as smartwatches have become essential products for the general public, there is a strong interest in the combination of football and AI and deep learning among both adults and youth7,8.
AI and deep learning play a vital role in talent development, improving students’ physical fitness and enhancing their football skills and tactics9. Integrating campus football with AI and deep learning is not only a current trend but also an important avenue for future development. This integration provides students with comprehensive football training and personalized guidance, helping them achieve better results in the field of football and cultivating more professionals with specialized skills10.
Research objectives and methodology
This study aims to conduct an in-depth investigation into the practical application of AI and deep learning in campus football, analyzing their potential impacts and value. By utilizing AI techniques, this study aims to provide key technologies for data analysis in campus football and offer practical guidance for its development. Additionally, through specific case studies, it will explore development strategies for the application of AI and deep learning in campus football, providing guidance for its practical promotion and implementation.
This study proposes a novel keypoint detection model based on deep learning for youth soccer training. Previous studies have explored various machine learning methods for identifying motion activities. For example, Cuperman et al.11 used accelerometer data to extract motion statistics and employed deep learning models to recognize soccer-related activities. Their findings indicate that deep learning models can accurately, robustly, and rapidly identify soccer-related activities, including jogging, sprinting, passing, shooting, and jumping. By combining convolutional layers with bidirectional Long Short-Term Memory (LSTM) layers, their model achieved an accuracy of 98.3%. However, despite this high accuracy in activity recognition, their approach still has limitations, including potential issues with the model’s generalization ability and the cost and convenience of data collection. Compared with the study by Cuperman et al., the proposed model uses convolutional neural networks (CNNs) and heatmap regression techniques to enable real-time and precise detection of key points in soccer training. This enhances training efficiency and provides coaches and players with insights to improve their technical and tactical abilities. Furthermore, the study offers a fresh perspective on youth soccer training through an in-depth analysis of large-scale video data, addressing some of the shortcomings in existing studies. These innovations contribute to the advancement of the field of soccer training. This study aspires to inject new vitality into the development of campus football, raise its level of development, and contribute to cultivating future football talents. It is also significant for innovating and improving youth football training methods.
The technical roadmap of this study is illustrated in Fig. 1.
The structural arrangement of this study encompasses the following key sections. The introduction section initially provides the study’s background and motivation, followed by formulating research questions and delineating the principal research objectives. The literature review section offers a comprehensive overview of cutting-edge research and existing works in the relevant field, thereby furnishing the theoretical foundation for subsequent research endeavors. The methodology section explains the adopted methods and techniques, encompassing the deep learning CNNs model and pertinent parameter configurations. The experimental design and performance assessment section describes the data collection process, hardware and software environment configurations, and model performance evaluation. This includes an analysis and presentation of test results, encompassing accuracy and positional error, as well as interview survey findings with industry practitioners and educators, highlighting the research’s pivotal discoveries. Finally, the results section summarizes the contributions while delineating the study limitations and prospects for the future.
This study has both research significance and practical application value. First, it addresses a research gap within the domain of soccer training by integrating deep learning CNNs and AI technologies into practical soccer training. These technologies enable a more comprehensive analysis and evaluation of players’ technical movements, body postures, and match performances, thereby providing coaches with more precise feedback and training recommendations. Second, the study underscores the practical application potential of AI and deep learning in soccer training. Research on 360-degree panoramic Virtual Reality (VR) soccer instructional videos and AI-based K-means algorithms demonstrates how VR technology and deep learning can enhance the quality and effectiveness of soccer teaching. This has practical implications for improving players’ skill levels and nurturing the next generation of soccer athletes. Additionally, the study covers critical technologies such as keypoint detection, simultaneous motion estimation, and soccer field line detection. These technologies are applicable to soccer training and can also be employed in other sports disciplines and in motion analysis, which broadens the applicability of the study. Lastly, by examining the application of AI and deep learning in soccer training, this study aims to offer references and guidance for developing and enhancing intelligent sports training systems. This has the potential to improve training efficiency for coaches and players, promote the development of soccer, and possibly influence sports training more widely.
Literature review
As an emerging technology in the modern era, AI has been widely applied in various fields, becoming an important means to propel the development of China. The rapid advancement of AI technology, combined with its close integration with daily life, has significantly enhanced China’s strength, promoted social development, and improved residents’ living standards12. Katipoğlu (2023) outlined the development process of AI and divided it into five stages, including three peaks and two troughs. By analyzing the current application status of AI from multiple perspectives, its feasibility and potential have been discovered, driving progress in the nation, society, and people’s lives13. Huang et al. (2021) pointed out through their research that with the popularity of AI and the emergence of intelligent terminals, data has become a new type of resource that can be integrated into an information database and extracted and searched based on specific needs, ultimately outputting effective information that meets the demands14. Data extraction and search are crucial data analysis steps in big data technology, as various AI algorithms orderly and systematically organize the data. Araz et al. (2020) emphasized the importance of data information and proposed methods to fully develop technological application capabilities, integrate data, and optimize data analysis15. They introduced the development process of data technology and analyzed the prospects and trends of big data technology based on the current domestic and international development status.
Najafabadi et al. (2015) conducted an analysis of the concept of deep learning and pointed out that due to limitations in data volume and computing power, deep learning did not achieve significantly better results than other traditional machine learning methods16. However, Wani et al. (2022) participated in an image recognition competition using deep learning methods and decisively outperformed the second-place contestant who used traditional machine learning methods17. Deep learning began to receive widespread attention through this competition and rapidly developed. Nowadays, remarkable achievements have been made in fields such as image classification, image detection, computer vision, and image generation through deep learning.
Furthermore, in AI-based soccer training, Li et al. (2022) investigated strategies to enhance the quality of soccer instruction in the context of mobile internet by leveraging meta-universe-enabled 360-degree panoramic VR soccer instructional videos and an AI-based K-means algorithm18. They proposed an optimized 360-degree panoramic VR soccer instructional video delivery strategy based on K-means. The results indicated that students could intuitively analyze actions and improve instructional quality through learning from 360-degree panoramic VR soccer instructional videos. The reconstruction of the soccer training environment facilitates the integration of soccer instruction with intelligent learning. Zhang et al. (2021) emphasized the application of AI in the training of soccer athletes, particularly addressing the issue of simultaneous motion estimation19. Their study employed a recognition algorithm with a multi-layer decision tree identifier and successfully achieved accurate identification of the simultaneous movements of soccer athletes in sports training. This is of significant importance for improving soccer training and elevating the skill levels of athletes. Zhou et al. (2022) conducted research on the training of soccer athletes and the detection of soccer robots using deep learning technology, aiming to enhance the performance of intelligent soccer training20. They integrated AI and spatial flow networks, constructing an action recognition system based on CNNs. Additionally, they proposed a fully convolutional network-based soccer field line detection model, providing robust support for soccer training. This study offers crucial references and guidance for the development of intelligent sports training systems. The main findings of the aforementioned literature and relevant areas are summarized in Table 1.
The literature analysis above reveals a research gap: existing work has not extensively explored the specific applications and benefits of AI in soccer training, nor has it examined how deep learning and AI technologies can enhance the practical outcomes of soccer training. While some studies have focused on the application of AI in sports, in-depth research within the domain of soccer training remains to be conducted. Furthermore, although research indicates that deep learning has not consistently surpassed traditional machine learning methods in certain contexts, these studies have not yet ventured into the domain of soccer training. Therefore, the research question of this study is how to leverage deep learning CNNs and AI technology to address this gap, enhance the practical efficiency of soccer training, and provide coaches and athletes with improved training tools and methodologies.
Research methodology
Application areas of AI in campus soccer training
The development of campus soccer has achieved preliminary results, with an increasing number of soccer coaches and referees and a greater variety of campus soccer events. These are significant progressions that have emerged after the introduction and implementation of campus soccer. With the increase in on-campus soccer training activities, more schools have become characterized as campus soccer schools and actively carry out on-campus soccer training. This has led to increased student participation, a more vibrant campus soccer atmosphere, improved physical fitness among students, and the cultivation of their character and qualities21. Interviews with thirty physical education teachers revealed that the application of AI in campus soccer training is mainly focused on health monitoring, load monitoring, injury prevention, and technical and tactical abilities. More than half of the physical education teachers expressed their willingness to use relevant technologies in these scenarios. However, only a few physical education teachers chose to adopt AI technology in remote training scenarios. In addition, interviews with five companies showed that these companies primarily apply AI technology in the areas of health monitoring, load monitoring, injury prevention, and technical and tactical abilities in campus soccer training. There is almost no application of related technologies in remote training scenarios. In summary, the application areas of AI in campus soccer training mainly include health monitoring, load monitoring, injury prevention, and technical and tactical abilities. These areas are the primary domains where AI plays a role in campus soccer training (Fig. 2).
First, health monitoring plays a crucial role in campus soccer training22. Traditional health monitoring methods mainly rely on regular check-ups conducted by doctors and medical equipment in hospitals to collect individuals’ health records, including personal information, physical condition, genetic factors, and lifestyle. However, with the advancement of technology, modern health monitoring has become more intelligent and efficient. In particular, the introduction of wearable technology has made health monitoring more diverse and flexible, moving away from being singular, cumbersome, and fixed. Wearable devices can provide real-time monitoring of physiological data, enabling individuals to understand their health status. These devices can record key indicators such as heart rate, sleep quality, and activity levels, providing valuable information for players and coaches. Although some argue that single indicators cannot fully reflect one’s health condition, these data are still very helpful in assessing physical fitness.
In campus soccer training, health monitoring plays a crucial role. Coaches can require players to wear smart devices to collect their physiological data in real-time during training sessions. Through data analysis and interpretation, coaches can gain better insights into players’ physical condition and training effectiveness, enabling them to make more scientifically informed training plans. For example, by monitoring heart rate, coaches can control training intensity to avoid excessive fatigue and the occurrence of sports injuries23,24. Furthermore, health monitoring can help identify potential health issues in advance, such as abnormal heart rate or excessive fatigue, allowing prompt actions to be taken. Therefore, health monitoring has broad application prospects in campus soccer training. Wearable technology can provide real-time monitoring of the physical condition, creating a safer and more scientifically driven training environment for players, promoting their healthy development, and enhancing their overall qualities. The application scenarios of wearable devices are illustrated in Fig. 3.
Second, load monitoring refers to the real-time observation of external pressure and stimuli exerted on the body. In sports, load primarily occurs during matches and training sessions25,26. By monitoring athletes’ loads during training and matches, their physical condition can be observed in real time, ensuring the effectiveness of training and match performance. Monitoring load helps identify factors that affect athletes’ performance, enabling coaches to understand athletes’ load and prepare for future plans and arrangements27. With the introduction of wearable device technology and data analysis techniques, load monitoring has become more efficient, convenient, and rapid.
However, the field of soccer currently faces a series of challenges and issues. Insufficient multi-party communication and exchange have led to a decline in parental support, insufficient enthusiasm for learning among students, and a negative attitude from schools. In order to address these problems, advanced technology has been introduced to monitor athletic load, utilizing the Internet of Things (IoT), cloud computing, and big data technologies combined with the latest wearable devices. The load monitoring system plays a crucial role in the development of soccer. It can monitor each student’s real-time heart rate data and load and transmit the data to a central server for analysis through IoT connectivity and cloud computing processing. Such a system can also generate detailed analysis reports, providing schools with comprehensive and integrated physical fitness assessment methods. It is worth emphasizing that the load monitoring system incorporates the latest wearable devices, making data collection more convenient and accurate. Students only need to wear these devices, and the system can record their heart rate and load in real time without requiring additional operations or interventions. This reduces the burden on students and teachers and improves the reliability and availability of the data. Through the load monitoring system, schools can better understand students’ physical condition and training load, promptly identify potential issues, and take appropriate measures. Additionally, through the generated analysis reports, schools can conduct comprehensive evaluations and provide personalized guidance and training plans to enhance students’ overall physical fitness and soccer skills. In summary, the load monitoring system effectively addresses the issues in soccer development with its advanced technology and convenient operation. 
Through the application of IoT, cloud computing, and big data technologies, combined with the latest wearable devices, this system offers schools comprehensive physical fitness assessment methods, promoting the improvement of students’ overall qualities and fostering a positive cycle of soccer development.
The main goal of the sports data collection platform is to provide strong support for teaching management, improve the quality of classroom teaching, and enhance students’ learning outcomes. The platform allows real-time sports data collection and possesses multiple powerful features. After the training session, the platform automatically generates comprehensive data reports. These reports include students’ physical fitness data and provide detailed analysis and interpretation. Coaches can formulate more precise goals and plans for subsequent training based on the trends in physical fitness changes during the training course. In addition to the data reports, the platform offers a range of practical tools. Coaches can easily manage and organize courses using the platform’s teaching management function. They can develop personalized training plans, track students’ progress, and provide timely feedback and guidance. The platform also supports the sharing and exchange of teaching resources, allowing coaches to learn from each other and improve their teaching skills. Furthermore, the platform has intelligent assistant features. It can automatically analyze students’ sports data and provide personalized suggestions and improvement strategies. This way, students can comprehensively understand their physical fitness status and undergo targeted training to improve their learning outcomes. In conclusion, the sports data collection platform provides a basis for teaching management and offers powerful tools and support for coaches. Coaches can scientifically set training goals and provide personalized guidance through automatically generated data reports and intelligent assistant features. This will help improve the quality of classroom teaching and student learning outcomes, promoting the progress of the entire education system. 
Additionally, the data collection platform can also provide an assessment of teaching effectiveness, assisting coaches in adjusting and improving teaching methods for better results.
Third, injury prevention. In sports, injuries are a persistent concern for athletes, and the risk of injury is constantly present. When athletes are injured, timely treatment and recovery become crucial. For campus soccer, the reasons players suffer sports injuries are summarized in Table 2.
Due to frequent injuries and damages suffered by players, parental support has declined, and schools have adopted a negative attitude, resulting in the sluggish development of campus football. However, this dilemma also presents a rare opportunity for the application of AI technology. Key technologies in AI, such as advanced wearable devices, big data analysis, and visual capture technology, can be applied in the context of sports injury prevention, thereby improving the accuracy and timeliness of injury prevention efforts within campuses. In the application scenario of injury prevention, key technologies in AI play a significant role. The development of campus football requires the cultivation of outstanding young players; however, the frequent occurrence of injuries severely affects talent reserves. Therefore, it is crucial to utilize key technologies for injury prevention. These key technologies can be applied through various approaches. For example, technologies like video motion capture or wearable devices can capture players’ movement patterns, trajectories, joint angles, rotations, heel coordinates, and other kinematic data and physical health indicators. Subsequently, by utilizing third-party software for data analysis and processing, players’ physical indicators can be obtained, and in-depth analysis of visualized data can be conducted from medical and kinematic perspectives. Finally, the players’ physical health status can be assessed, their running status can be evaluated for normalcy, and measures such as treatment can be determined. Through such preventive measures, the deterioration of players’ injuries and the occurrence of sudden death incidents can be avoided in a timely manner, greatly alleviating concerns for parents and schools. The specific process is illustrated in Fig. 4.
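As an illustration of the kinematic quantities mentioned above, the following minimal sketch computes a joint angle from three 2D keypoints (for example hip, knee, and ankle). The function name `joint_angle` and the sample coordinates are hypothetical and are not taken from the study; the sketch only shows the standard dot-product formula for the angle between two segments.

```python
import math


def joint_angle(a, b, c):
    """Return the angle at keypoint b, in degrees, between segments b->a and b->c."""
    v1 = (a[0] - b[0], a[1] - b[1])
    v2 = (c[0] - b[0], c[1] - b[1])
    n1 = math.hypot(*v1)
    n2 = math.hypot(*v2)
    if n1 == 0 or n2 == 0:
        raise ValueError("coincident keypoints: angle is undefined")
    cos_theta = (v1[0] * v2[0] + v1[1] * v2[1]) / (n1 * n2)
    # Clamp to [-1, 1] to guard against floating-point rounding.
    cos_theta = max(-1.0, min(1.0, cos_theta))
    return math.degrees(math.acos(cos_theta))


# Hypothetical image coordinates for hip, knee, and ankle.
hip, knee, ankle = (0.0, 0.0), (0.0, 1.0), (1.0, 1.0)
knee_angle = joint_angle(hip, knee, ankle)
```

Tracking such an angle frame by frame is one way the joint data captured by video or wearable devices could be turned into an indicator for later medical or kinematic review.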
Fourth, football skills and tactical abilities. In football, players at different levels undergo training involving various aspects, methods, and focuses, including physical fitness, speed, explosiveness, football skills, and football tactics. Regardless of the specific training aspect, learning follows the coach’s approach and methodology. However, traditional teaching ideologies and outdated instructional forms are a major problem that can lower training levels and consequently reduce students’ learning outcomes. AI technology is now widely applied in football, bringing innovation to teaching and training models and creating new teaching methods and forms. For football skills and tactics, visual motion capture technology and big data analysis technology have been incorporated into some products and systems and applied in the training process to enhance players’ technical proficiency and tactical competence. These technologies provide coaches and players with more resources and tools to analyze and understand the key aspects of technical and tactical abilities. Visual motion capture technology can precisely capture and record the details of players’ movements, helping them improve the accuracy and effectiveness of their technical actions. Big data analysis technology can extract valuable information from large datasets, offering coaches insights into player performance, trends, and potential areas for improvement. Together, these technologies facilitate personalized, precise, and efficient teaching and training, thus enhancing players’ technical and tactical abilities. The specific methods of technology application are presented in Table 3.
The key technology of deep learning CNNs training
CNNs are a type of deep learning model specifically designed to process image data. They can automatically extract features from input images and excel at capturing spatial hierarchies in the data. In keypoint detection tasks for soccer training, the objective is to identify critical locations (such as foot positions, joint locations, and ball positions) from video frames or images. Through multiple convolutional and pooling layers, CNNs efficiently extract spatial features, recognizing localized patterns such as moving limbs or ball trajectories. By progressively building more abstract feature representations, CNNs demonstrate a significant advantage in these tasks. Compared to other neural networks, such as fully connected networks (FCNs), CNNs are better equipped to capture local features and spatial relationships within images. Soccer actions are typically highly dynamic, involving rapid player movements, complex postures, and variable ball trajectories. CNNs can capture localized, position-invariant features through convolutional layers, effectively identifying posture changes during motion. For instance, CNNs can automatically learn information such as foot positions, joint angles, and ball trajectories through convolution operations, which is an essential capability for keypoint detection tasks. In fields like human pose estimation and motion analysis, CNNs have proven highly effective for keypoint detection. Advanced convolutional networks such as OpenPose and High-Resolution Network (HRNet) have achieved exceptional performance in this domain. By leveraging multi-layer convolution and feature fusion, these models can accurately detect the positions of various body joints. Consequently, CNN-based models can quickly and accurately identify key movements and positions in youth soccer training, facilitating personalized technical analysis and training feedback.
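To make the convolution and pooling operations described above concrete, here is a minimal pure-Python sketch. The toy image, the vertical-edge kernel, and the function names `conv2d_valid` and `max_pool2d` are illustrative assumptions rather than part of the model in this study. As in most deep learning libraries, the "convolution" is implemented as a cross-correlation without kernel flipping.

```python
def conv2d_valid(image, kernel):
    """Slide the kernel over the image (no padding, stride 1) and sum the products."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[i + di][j + dj] * kernel[di][dj]
                for di in range(kh) for dj in range(kw))
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


def max_pool2d(fmap, size=2):
    """Downsample by keeping the maximum of each non-overlapping size x size window."""
    out_h = len(fmap) // size
    out_w = len(fmap[0]) // size
    return [
        [
            max(fmap[i * size + di][j * size + dj]
                for di in range(size) for dj in range(size))
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


# Toy 6 x 6 image: dark left half, bright right half (a vertical edge).
image = [[0, 0, 0, 1, 1, 1] for _ in range(6)]
# Hand-written vertical-edge kernel; in a CNN these weights are learned.
kernel = [[-1, 0, 1] for _ in range(3)]

features = conv2d_valid(image, kernel)   # 4 x 4 feature map
pooled = max_pool2d(features, size=2)    # 2 x 2 after pooling
```

The feature map responds only where the window straddles the edge, which is the sense in which a convolutional layer picks up localized patterns regardless of where they appear in the frame.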
While other types of deep learning models, such as recurrent neural networks (RNNs) and FCNs, can also be applied to keypoint detection, they typically underperform CNNs in image data processing. RNNs, while suitable for sequential data such as time-series predictions, are relatively weak in extracting spatial information and learning multi-scale features in images. FCNs often rely on manual feature extraction and are computationally intensive, making them unsuitable for processing complex image data. In youth soccer training, player movements are often unstable, and players’ physical characteristics and movement patterns can vary significantly, making accurate modeling with traditional methods challenging. CNNs, with their powerful learning capabilities and adaptability to complex patterns, offer high accuracy and robustness in addressing these challenges. Moreover, CNNs can learn important features directly from data, even when training samples are limited—a critical advantage when working with young players who may have fewer training samples available. Therefore, CNNs are the most suitable choice for addressing the challenges of complex movements and keypoint detection in youth soccer training. First, CNNs consist of multiple layers arranged in sequential order, including convolutional layers, pooling layers, and fully connected layers28. The specific composition of the network structure is shown in Table 4.
Based on the information provided in the table, the structure of CNNs is illustrated in Fig. 5.
CNNs employ strategies such as local connections, weight sharing, and downsampling to effectively reduce the number of network parameters, enabling the construction of deeper structures32. By utilizing convolutional layers and extracting features using convolutional kernels, multiple stacked convolutional layers continuously abstract and iterate on low-level features, simulating the process of object recognition in the human visual cortex. Compared to traditionally handcrafted feature extraction algorithms, CNNs demonstrate superior performance.
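The effect of local connections and weight sharing on parameter count can be checked with simple arithmetic. The sketch below compares a fully connected layer with a convolutional layer; the input size (64 × 64 × 3), the 32 output units or channels, and the 3 × 3 kernel are hypothetical values chosen only for illustration.

```python
def dense_params(in_h, in_w, in_c, out_units):
    """Weights plus biases when every input value connects to every output unit."""
    return in_h * in_w * in_c * out_units + out_units


def conv_params(kernel_size, in_c, out_c):
    """Weights plus biases when one small kernel per output channel is shared across positions."""
    return kernel_size * kernel_size * in_c * out_c + out_c


# Hypothetical layer sizes, not taken from the study.
dense = dense_params(64, 64, 3, 32)  # 64 * 64 * 3 * 32 + 32
conv = conv_params(3, 3, 32)         # 3 * 3 * 3 * 32 + 32
```

Under these assumptions the fully connected layer needs 393,248 parameters and the convolutional layer 896. Note that the convolutional count does not depend on the image size at all.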
Furthermore, training CNNs involves several key techniques. One complete pass over all training data is referred to as an epoch. Typically, an epoch is divided into multiple batches, and the number of data samples in each batch is known as the batch size. To make efficient use of memory access speed, it is common to choose a batch size that is a multiple of 8, such as 64 or 128. The training process for each batch can be divided into four steps: forward propagation, loss computation, backward propagation, and weight updating.
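The epoch, batch, and four-step structure can be sketched on a deliberately tiny problem. The example below fits a one-variable linear model rather than a CNN so that the gradients can be written by hand. The model, the synthetic data, and all hyperparameter values are assumptions made for illustration and do not reflect the configuration used in this study.

```python
import random


def train(data, epochs=300, batch_size=8, lr=0.05, seed=0):
    """Mini-batch gradient descent for y = w * x + b with an MSE loss."""
    rng = random.Random(seed)
    data = list(data)
    w, b, loss = 0.0, 0.0, None
    for _ in range(epochs):                        # one epoch = one pass over all data
        rng.shuffle(data)
        for start in range(0, len(data), batch_size):
            batch = data[start:start + batch_size]
            n = len(batch)
            # Step 1: forward propagation.
            preds = [w * x + b for x, _ in batch]
            # Step 2: loss computation (mean square error over the batch).
            loss = sum((p - y) ** 2 for p, (_, y) in zip(preds, batch)) / n
            # Step 3: backward propagation (gradients of the loss w.r.t. w and b).
            grad_w = sum(2 * (p - y) * x for p, (x, y) in zip(preds, batch)) / n
            grad_b = sum(2 * (p - y) for p, (_, y) in zip(preds, batch)) / n
            # Step 4: weight updating.
            w -= lr * grad_w
            b -= lr * grad_b
    return w, b, loss


# Synthetic noise-free samples drawn from y = 2x + 1.
samples = [(k / 10, 2.0 * (k / 10) + 1.0) for k in range(32)]
w, b, final_loss = train(samples)
```

With 32 samples and a batch size of 8, each epoch consists of four batches. A CNN follows the same loop, with the forward and backward passes handled by the network and an automatic differentiation framework.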
In the forward propagation step, a batch of data is passed through the network to obtain the output. Loss computation calculates the error from a predefined loss function and the data labels. The loss function, which is chosen by the designer, measures the discrepancy between the output and the labels33. In regression problems, the most commonly used loss function is the mean square error (MSE), as shown in Eq. (1):
In Eq. (1), Q represents the labels of the data, W represents the predicted values of the network, and X represents the number of data samples.
For classification problems, the most commonly used loss function is cross-entropy loss, which has both discrete form (Loss) and continuous form (Loss’), as shown in Eqs. (2) and (3):
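As a minimal numerical sketch of the two losses, the code below assumes the standard textbook forms: MSE as the mean of the squared differences between labels Q and predictions W over X samples, and discrete cross-entropy as the negative sum of target probability times the logarithm of predicted probability. The function names and the clipping constant `eps` are assumptions for illustration.

```python
import math

def mse(labels, preds):
    """Mean squared error: (1/X) * sum((Q_i - W_i)^2)."""
    assert len(labels) == len(preds) and labels
    return sum((q - w) ** 2 for q, w in zip(labels, preds)) / len(labels)

def cross_entropy(target_probs, pred_probs, eps=1e-12):
    """Discrete cross-entropy: -sum(q_k * log(p_k)) over classes k."""
    # eps guards against log(0) when a predicted probability is exactly zero.
    return -sum(q * math.log(max(p, eps)) for q, p in zip(target_probs, pred_probs))

print(mse([1.0, 2.0, 3.0], [1.0, 2.5, 2.0]))      # (0 + 0.25 + 1) / 3
print(cross_entropy([0, 1, 0], [0.2, 0.7, 0.1]))  # -log(0.7)
```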
To meet the requirements of key point detection in youth football training scenes, this study systematically adapts the classical convolutional neural network architecture. Under a hierarchical kernel configuration strategy, the shallow layers use small 3 × 3 kernels to capture the fine-grained morphology of joint points, and the kernel size grows to 7 × 7 as the network deepens so that macro-scale dynamics, such as the evolution of an athlete’s overall posture and the trajectory of the ball, can be modeled. A dynamic adaptive pooling structure replaces the traditional fixed pooling window: it adjusts the pooling region automatically according to the dimensions of the incoming feature map. This improves the robustness of the algorithm to multi-resolution input data and keeps performance stable across the different acquisition equipment and shooting angles found on a training ground.
To cope with complex action patterns in football training, such as rapid changes of direction, sudden stops, and sharp turns, the scheme builds a multi-scale collaborative feature learning mechanism. Parallel convolution paths extract spatio-temporal features at different receptive fields, and a cross-layer feature interaction module fuses fine-grained local information with global motion patterns. For regularization, dropout is combined with L2 weight decay and with rotated and scaled samples generated by data augmentation, which reduces the risk of overfitting in complex scenes. During training, a multi-task joint learning framework optimizes key point coordinate regression and action classification together. A loss function that up-weights difficult samples strengthens feature discrimination for fast-moving subjects, yielding a football training analysis solution that balances accuracy and real-time performance.
Key point detection model for youth soccer training
Annotation of soccer training data. After the key points were defined, the available images were screened and annotated. Ultimately, 27,987 annotated images were obtained and divided into a training set of 24,315 images and a test set of 3,672 images. The dataset is denoted {N1, N2, …, Nm}, and the position annotations of image Nm are denoted as \(\left\{{x}_{1}^{m},{y}_{1}^{m},\dots ,{x}_{U}^{m},{y}_{U}^{m}\right\}\), where U represents the number of key points in the dataset. It is important to note that over 35% of the images in the dataset do not contain any key points. These images include extreme close-ups, audience shots, or advertising shots. Therefore, the effective number of images that can be used for network training and convergence is significantly smaller than the total number of images in the training set, so images need to be collected and annotated continuously. Additionally, most images that do contain key points have fewer than 11 of them, which also reflects the characteristics and distribution of the dataset. Figure 6 presents the statistics of the number of key points in a single image.
Key point detection model. The study adopted a hierarchical refinement approach using cascaded networks34 and introduced keypoint heatmaps as the supervisory information. A cascaded structure of deep CNNs was designed for both detection and regression tasks. An effective heatmap method was used to construct the ground truth. The overall architecture of the model is illustrated in Fig. 7.
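A heatmap ground truth of the kind mentioned above can be sketched as follows. The sketch assumes a 2-D Gaussian centred on each annotated key point, which is a common construction but is not confirmed as the one used in this study; the function names, grid size, and `sigma` are illustrative.

```python
import math

def gaussian_heatmap(height, width, cx, cy, sigma=2.0):
    """Return a height x width grid that peaks at 1.0 on the key point (cx, cy)."""
    return [
        [math.exp(-((x - cx) ** 2 + (y - cy) ** 2) / (2.0 * sigma ** 2))
         for x in range(width)]
        for y in range(height)
    ]

def argmax_2d(heatmap):
    """Recover (x, y) of the strongest response, i.e. the predicted key point."""
    best = max(
        ((value, x, y) for y, row in enumerate(heatmap) for x, value in enumerate(row)),
        key=lambda item: item[0],
    )
    return best[1], best[2]

# One key point at column 7, row 3 of a small 9 x 12 grid.
hm = gaussian_heatmap(height=9, width=12, cx=7, cy=3)
print(argmax_2d(hm))
```

Regressing toward such a map, rather than toward raw coordinates, gives the network a spatially smooth target around each key point.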
The model in this study consists of two levels. The first level is the detection network, which extracts the initial positions of the key points and generates key point heatmaps. The second level is the regression network, which utilizes the idea of heatmap regression35. The input to this network includes the original image and the key point heatmaps generated by the first-level network. The results are obtained after feature fusion. The regression is performed using the constructed heatmaps as labels. The performance of both the detection and regression networks was verified through different experiments conducted in this study. In this model, detected key points encompass various aspects, including the player’s body, joints, movements, and the soccer ball’s position, field lines, and markers. Key body points of the player include those at the head, neck, shoulders, arms, waist, knees, ankles, and other body parts. These key points can be used to track the player’s postures and movements, such as headers, passes, and shots. Joint key points of the player include elbow joints, knee joints, ankle joints, etc., which are used for accurately capturing and identifying the player’s motion actions, such as bending, stretching, and rotating. Key points related to the soccer ball’s position encompass the central coordinates of the soccer ball and possible additional reference points, enabling tracking of the ball’s location and movement on the field. This is vital for analyzing interactions between players and the ball. Field lines and marker key points are utilized to detect lines, boundaries, and markers on the soccer field, such as the centerline, boundary lines, goalposts, penalty spots, etc. This aids in understanding field layouts and rules.
Action key points comprise a series of action-related key points, such as passing actions, shooting actions, dribbling actions, and more. By tracking these key points, the technical movements of the player can be analyzed and assessed. Accurate capture of these key points allows researchers and coaches to understand the player’s performance and provide enhanced guidance and training recommendations.
This study adopts the CNNs-based deep learning architecture for the keypoint detection task. CNNs were selected as the primary deep learning architecture for this experiment due to its outstanding performance in image processing, particularly its ability to automatically learn features and patterns from images. Several key factors were considered in the design of the CNNs architecture. First, the convolutional layers are the core of the CNNs architecture and are responsible for extracting local features from the input image. In this study, three convolutional layers were designed, each containing multiple convolutional filters. The first convolutional layer uses smaller convolutional kernels (such as 3 × 3 or 5 × 5) to capture detailed features in the image, such as edges and corners. The second and third convolutional layers employ larger convolutional kernels (such as 7 × 7 or larger) to capture larger-scale features, such as body posture and ball position. This progressively increasing kernel size design helps the model extract more abstract features at different scales, thereby enhancing its adaptability to complex patterns. To reduce computational burden and improve the model’s generalization ability, a max-pooling layer was added after each convolutional layer. The pooling layer reduces the size of the feature map by selecting the maximum value from the pooling region. For example, a 2 × 2 pooling window halves the size of the feature map. Max-pooling not only reduces the dimensions of the feature map but also preserves important spatial features, helping the model extract more distinctive key information. In the final part of the model, fully connected layers were designed to map the features extracted by the convolutional and pooling layers to the final output space. 
The output dimension of the fully connected layer matches the number of keypoints, allowing each keypoint to be predicted through regression and classification, thus enabling precise keypoint detection. To enhance the model’s ability to fit non-linear relationships and improve its generalization performance, the Rectified Linear Unit (ReLU) activation function was used. ReLU effectively alleviates the vanishing gradient problem and accelerates network convergence. Furthermore, to prevent overfitting, Dropout regularization was incorporated. Dropout randomly drops some neuron connections during training, reducing the model’s reliance on specific nodes and thus improving its robustness. In terms of optimization, the Adam optimizer was used. This algorithm performs excellently on large-scale datasets and can dynamically adjust the learning rate, improving training efficiency and stability. For the loss function, cross-entropy loss was selected, as it is suitable for classification tasks and effectively measures the difference between the model’s outputs and the true labels, guiding the network to update its parameters. Overall, the design of this CNNs architecture takes into full consideration the specific challenges of keypoint detection in youth football training. Football movements are often fast and irregular, and the keypoints in the images may be somewhat blurred. Through multi-layer feature extraction from deep convolutional layers, the model can progressively extract fine-grained and large-scale features from images, effectively addressing complex patterns in motion. The introduction of max-pooling enhances computational efficiency, while the ReLU activation function and Dropout regularization ensure efficient training and excellent generalization ability of the model.
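The building blocks named above (ReLU, 2 × 2 max pooling, and Dropout) can be sketched on a toy feature map. The sketch assumes the inverted-dropout variant, in which surviving activations are rescaled during training; the text does not state which variant was used, and the feature map values are made up for illustration.

```python
import random

def relu(feature_map):
    """Element-wise max(0, v)."""
    return [[max(0.0, v) for v in row] for row in feature_map]

def max_pool_2x2(feature_map):
    """2 x 2 max pooling with stride 2; height and width are assumed even."""
    h, w = len(feature_map), len(feature_map[0])
    return [
        [max(feature_map[y][x], feature_map[y][x + 1],
             feature_map[y + 1][x], feature_map[y + 1][x + 1])
         for x in range(0, w, 2)]
        for y in range(0, h, 2)
    ]

def dropout(values, rate, rng):
    """Inverted dropout: zero each value with probability `rate`, rescale the rest."""
    keep = 1.0 - rate
    return [v / keep if rng.random() < keep else 0.0 for v in values]

fmap = [[-1.0, 2.0, 0.5, -3.0],
        [4.0, -2.0, 1.5, 0.0],
        [0.0, 1.0, -1.0, -2.0],
        [3.0, -5.0, 2.5, 6.0]]
pooled = max_pool_2x2(relu(fmap))          # 4 x 4 map halved to 2 x 2
dropped = dropout([1.0] * 100, rate=0.5, rng=random.Random(0))
print(pooled)
```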
The pseudo-code for the model is depicted in Fig. 8.
Experimental design and performance evaluation
Datasets collection
In order to meet the project requirements, the experiment collected football training video resources from the internet. These data were collected from real football training videos. A total of three crawls were conducted, resulting in 90 football match videos with a total duration of approximately 85 h. In order to ensure video quality, a resolution of 720p was selected. After obtaining the video resources, OpenCV was used to extract frames from the videos, resulting in a series of images. In order to ensure the diversity of images, a frame extraction interval of every 10 s was set, resulting in approximately 28,000 original images with a size of 720 × 1280. These image resources meet the requirements of the subsequent project. This data set is self-collected.
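The sampling arithmetic behind the frame extraction can be sketched as follows. The study used OpenCV to read the videos; the sketch below covers only the choice of frame indices, and the frame rate of 25 fps and the function name are assumptions for illustration.

```python
def sampled_frame_indices(total_frames, fps, interval_s=10.0):
    """Indices of the frames kept when sampling one frame every `interval_s` seconds."""
    step = max(1, round(fps * interval_s))
    return list(range(0, total_frames, step))

# A 60 s clip at an assumed 25 fps keeps frames 0, 250, 500, 750, 1000, 1250.
indices = sampled_frame_indices(total_frames=60 * 25, fps=25)

# Upper bound on the image count for about 85 h of video sampled every 10 s.
max_images = int(85 * 3600 / 10)
print(indices, max_images)
```

The upper bound of 30,600 frames is of the same order as the roughly 28,000 original images reported above.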
Furthermore, a human-body parsing experiment was conducted to validate the model’s performance. The models compared in the experiment included the youth soccer training keypoint detection model proposed in this study, local–global long short-term memory (LG-LSTM), Whole-Body Human Pose Estimation (WSHP), Pyramid Scene Parsing Network (PSPNet), Pose Guided Person Image Generation (PGN), DeepLab V2, and others. The dataset used for this experiment was the PASCAL-Person-Part dataset. The PASCAL-Person-Part dataset comprises a total of 3533 images, categorizing human body parts and the background into seven classes, including background, head, torso, upper arms, lower arms, upper legs, and lower legs. The dataset link is: PASCAL-Part Dataset (roozbehm.info). Accuracy and the mean Intersection over Union (mIoU) across all categories were employed as evaluation criteria. The calculation of mIoU is described by Eq. (4):
In Eq. (4), \(n\) represents the number of categories in the dataset, counting from 0 onwards. Therefore, \(n+1\) signifies the total number of categories. \({M}_{ii}\) denotes the count of true positives where the true value is i, and it is predicted as i, \({M}_{ij}\) represents false positives and \({M}_{ji}\) signifies false negatives. Accuracy is a metric for assessing the degree of match between the model’s predicted results and the actual labels36. The calculation of accuracy is shown in Eq. (5):
In Eq. (5), TP represents True Positive, TN represents True Negative, FP represents False Positive, and FN represents False Negative. The higher the accuracy, the closer the model’s predicted results align with the actual situation. The calculation of Recall is given by Eq. (6):
Equation (6) measures the proportion of targets successfully detected by the model to the total number of true targets, indicating the model’s success in detecting targets among all actual targets37. Additionally, Average Precision (AP) assesses the precision of the model at various confidence thresholds. Firstly, calculate precision and recall at different confidence thresholds. Then, based on the calculated precision and recall, plot the Precision-Recall Curve. Finally, compute the area under this curve, which represents the AP38. The calculation of AP is given by Eq. (7):
In Eq. (7), \(P(r)\) represents precision at a recall rate of r, and \(\Delta r\) indicates the change in recall rate.
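The four evaluation metrics can be sketched in a few lines. The sketch assumes a confusion matrix whose entry `[i][j]` counts items of true class i predicted as class j (IoU is symmetric in false positives and false negatives, so the index convention does not change the result), and it approximates the area under the precision-recall curve by a Riemann sum; all numbers are toy values.

```python
def mean_iou(confusion):
    """Mean over classes of TP / (TP + FP + FN), from a square confusion matrix."""
    n_classes = len(confusion)
    ious = []
    for i in range(n_classes):
        tp = confusion[i][i]
        fn = sum(confusion[i]) - tp                                # true i, predicted other
        fp = sum(confusion[j][i] for j in range(n_classes)) - tp   # predicted i, true other
        denom = tp + fp + fn
        # Assumption: a class that never appears contributes an IoU of 0.
        ious.append(tp / denom if denom else 0.0)
    return sum(ious) / n_classes

def accuracy(tp, tn, fp, fn):
    return (tp + tn) / (tp + tn + fp + fn)

def recall(tp, fn):
    return tp / (tp + fn)

def average_precision(points):
    """Sum of P(r) * delta_r over (recall, precision) pairs sorted by recall."""
    ap, prev_recall = 0.0, 0.0
    for r, p in points:
        ap += p * (r - prev_recall)
        prev_recall = r
    return ap

cm = [[8, 2], [1, 9]]                      # toy two-class confusion matrix
print(mean_iou(cm))                        # (8/11 + 9/12) / 2
print(accuracy(tp=9, tn=8, fp=2, fn=1))    # 17 / 20
print(recall(tp=9, fn=1))                  # 9 / 10
print(average_precision([(0.5, 1.0), (1.0, 0.5)]))
```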
Detailed information regarding the training and testing sets of both datasets is presented in Table 5.
Experimental environment
In terms of hardware, the experimental server used in this study was equipped with two NVIDIA GeForce GTX 1080 graphics cards, each with a memory size of 10,000 MB. All model training was conducted on this hardware configuration. As for the software, the research server employed a 64-bit Ubuntu 16.04 operating system. The graphics driver version was 390.67, and the CUDA version was 9.1.85. These hardware and software configurations provide the necessary support and foundation for the experiments.
Parameters setting
This study used CNNs for the keypoint detection task. Several factors were considered in selecting the parameters of the CNNs model to ensure its effectiveness in image feature extraction and keypoint detection. Firstly, for the design of the convolutional layers, three convolutional layers were used, with the size of the convolutional kernels progressively increasing from 3 × 3 for the smaller kernels to 5 × 5 and 7 × 7 for the larger ones. This design aims to capture fine-grained features in the image, such as edges and corners, with the smaller kernels, while the larger kernels help extract larger-scale features, such as motion posture and ball position. Additionally, the stride for all convolutional layers was set to 1 to prevent loss of fine details in the image, and the “same padding” strategy was used to avoid excessive shrinking of the image size, preserving important information. The number of convolutional filters increases gradually from 32 to 128 to avoid high computational complexity while extracting more complex features. The pooling layer design aims to reduce the feature map size, decrease computational load, and enhance the model’s generalization ability. After each convolutional layer, a 2 × 2 max-pooling layer with a stride of 2 was applied to downsample the feature maps while retaining key image information. The fully connected layers, located at the end of the network, map the extracted features to the output space. Two fully connected layers were designed, containing 512 and 256 neurons, respectively, to enhance the model’s representational power and prevent overfitting. The number of neurons in the output layer corresponds to the number of keypoints, ensuring precise processing of each keypoint’s regression or classification task. For the activation function, the ReLU function was used because it effectively alleviates the vanishing gradient problem and accelerates network convergence. 
To further prevent overfitting, a Dropout regularization method was applied in the fully connected layers, randomly dropping some neurons during training to enhance the model’s generalization ability. Regarding the optimizer, the Adam optimizer was chosen, as it combines the advantages of AdaGrad and RMSProp, adapting the learning rate to provide better training performance and convergence. The loss function selected was the cross-entropy loss function, which is well-suited for classification tasks and effectively measures the difference between the model’s predictions and the true values, guiding model optimization. All parameter selections were validated through multiple experiments and hyperparameter tuning, using cross-validation to ensure the model’s stability and robustness across different training and validation datasets. This process carefully selected hyperparameters, including learning rate, batch size, and dropout rate, to achieve optimal model performance.
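The effect of these settings on feature map size and parameter count can be traced with a short calculation. The sketch assumes a 3-channel 720 × 1280 input and 64 filters in the middle layer; the text gives only the range of 32 to 128 filters, so both values are assumptions.

```python
def conv_same_out(size, stride=1):
    """Spatial size after a 'same'-padded convolution: ceil(size / stride)."""
    return -(-size // stride)

def pool_out(size, window=2, stride=2):
    """Spatial size after pooling without padding."""
    return (size - window) // stride + 1

def conv_params(kernel, c_in, c_out):
    """Weights plus biases of one convolutional layer."""
    return kernel * kernel * c_in * c_out + c_out

def trace_shapes(height, width, channels, layers):
    """layers: list of (kernel_size, n_filters); each conv is followed by 2 x 2 max pooling."""
    rows = []
    for kernel, n_filters in layers:
        height, width = conv_same_out(height), conv_same_out(width)
        params = conv_params(kernel, channels, n_filters)
        height, width = pool_out(height), pool_out(width)
        channels = n_filters
        rows.append((kernel, height, width, channels, params))
    return rows

# Kernel sizes 3, 5, 7 follow the text; the filter count 64 is an assumed middle value.
shapes = trace_shapes(720, 1280, 3, [(3, 32), (5, 64), (7, 128)])
for row in shapes:
    print(row)
```

Under these assumptions the last feature map is 90 × 160 with 128 channels, so a full-resolution input would leave a very large vector for the 512-neuron fully connected layer; the text does not say whether inputs were resized beforehand.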
Regarding the settings for the second-level network, 50 training epochs were performed using the mini-batch Adam algorithm for parameter optimization. The initial learning rate was set to 0.001, and a step decay strategy was applied during training: the learning rate was multiplied by a decay factor of 0.1 at the 25th, 40th, and 50th epochs. When constructing the Ground Truth for the discriminator, the “true” and “false” heat maps need to be defined. This involves two hyperparameters, ω and μ. ω represents the threshold value of pixel intensity, and μ represents the threshold value of the number of pixels greater than that threshold. These two parameters do not need to be adjusted simultaneously in the training process. Instead, an appropriate ω is first selected, and then μ is fine-tuned. In this study, ω was set to 0.4, and the optimal μ value of 75 was obtained through experimental tuning. The selection of these parameters is crucial for the accuracy of the training process and results. Hyperparameter tuning was also carried out to ensure the stability and reliability of the model. The key hyperparameters examined in the experiments are the learning rate, batch size, and weight decay. The learning rate was adjusted first: a range of values was tried, starting from small values and progressively increasing, while the model’s performance on both the training and validation sets was monitored, and the learning rate giving the best balance between the two was selected. The results of the hyperparameter tuning are presented in Table 6. According to Table 6, the optimal parameter selection for the model is a learning rate of 0.01, a batch size of 64, and a weight decay of 0.1. 
The model exhibits high accuracy and low loss values on both the training and validation sets, demonstrating relatively strong performance and robust generalization capabilities.
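The learning rate schedule and the ω/μ rule for the second-level network can be sketched as follows. Two points are assumptions: that each decay takes effect from the listed epoch onward (epochs counted from 1), and that a heat map is labeled “true” when at least μ pixels exceed ω; the text fixes the values but not these conventions.

```python
def step_lr(epoch, base_lr=0.001, milestones=(25, 40, 50), gamma=0.1):
    """Learning rate in effect during `epoch` (1-based) under step decay."""
    n_decays = sum(1 for m in milestones if epoch >= m)
    return base_lr * gamma ** n_decays

def is_true_heatmap(heatmap, omega=0.4, mu=75):
    """Label a heat map 'true' when at least `mu` pixels exceed intensity `omega`."""
    strong = sum(1 for row in heatmap for v in row if v > omega)
    return strong >= mu

for epoch in (1, 24, 25, 40, 50):
    print(epoch, step_lr(epoch))

bright = [[0.5] * 10 for _ in range(10)]   # 100 pixels above omega
faint = [[0.3] * 10 for _ in range(10)]    # no pixel above omega
print(is_true_heatmap(bright), is_true_heatmap(faint))
```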
Performance evaluation
This study opts not to use pre-trained CNNs models, based on three considerations.
1. Specificity of the Dataset: This study focuses on keypoint detection in youth soccer training, a domain whose characteristics differ from those of general image classification tasks such as those addressed by ImageNet. Images from soccer training are distinguished by dynamic scenes, rapid movements, and complex motion patterns. While pre-trained CNNs models, such as ResNet and VGGNet, perform exceptionally well in general image recognition tasks, they may not effectively capture the critical features specific to the soccer training domain.
2. Custom Architecture Design: To better accommodate the keypoint detection tasks in youth soccer training, a customized CNNs architecture was designed and trained from scratch. By tailoring the convolutional layer structures and pooling strategies to the task’s specific requirements, the model is more capable of learning features relevant to soccer training rather than relying on generic features learned by pre-trained models. This approach enables the model to adapt more effectively to rapidly changing motion scenarios, complex movement patterns, and the diverse characteristics of players across different age groups.
3. Limitations of Pre-trained Models: Although pre-trained models can offer computational convenience and enhance performance through transfer learning, their features may differ significantly from those required for this study’s task-specific data. This disparity could introduce unnecessary biases, potentially undermining the model’s generalization ability and suitability for the task. Consequently, training the model from scratch ensures that it learns the features most relevant to youth soccer training.
In summary, the decision to forego pre-trained models and train the specialized CNNs from the ground up was driven by the need to capture domain-specific features, improve adaptability to complex motion patterns, and avoid biases introduced by generic features. This approach ensures that the model is finely tuned to the specific demands of keypoint detection in youth soccer training.
This study conducts a series of detailed designs to ensure the reliability and replicability of the experiments. The main elements of the experimental design are as follows:
1. Independent Variable: PASCAL-Person-Part data.
2. Dependent Variables: Accuracy and performance of the keypoint detection model.
3. Control Variables: Data Preprocessing: Preprocessing of the PASCAL-Person-Part dataset is carried out to ensure data quality and consistency. Training Dataset: Images, including keypoint annotations, are extracted from the PASCAL-Person-Part dataset using OpenCV to ensure the adequacy and quality of the training data. Hardware and Software Environment: The experiments are conducted on a server equipped with two NVIDIA GeForce GTX 1080 graphics cards, running on a 64-bit Ubuntu 16.04 operating system, ensuring consistency in hardware and software environments. Data Quality: Strict quality control measures, including data cleaning and denoising, are applied to the PASCAL-Person-Part dataset to ensure the accuracy and reliability of the data. Data Annotation: Precise annotation of keypoints in the PASCAL-Person-Part dataset is performed to ensure that the model learns sufficient information for accurate keypoint detection.
4. Randomization Procedures: To avoid experimental biases and enhance replicability, randomization procedures are adopted. Data Selection: Random selection of images and keypoint annotations from the PASCAL-Person-Part dataset is carried out to ensure diversity and uniformity in the training data. Training Dataset: Random selection of images during the image data extraction process is implemented to ensure a uniform distribution and diversity of image data.
First, model performance analysis. The accuracy and position error on the test set are analyzed, as shown in Fig. 9:
In Fig. 9, keypoint predictions are relatively poor in the vicinity of the penalty area, primarily because key points are densely distributed there. However, the key point prediction errors for foot positions and curve positions are relatively small. Therefore, this study further conducted a regional statistical analysis of coordinate errors in the test dataset. These analytical results suggest that keypoint visibility is not a significant challenge, as the model provides fairly accurate results. In summary, the model predicts key point visibility well, indicating that it captures key points reliably, especially in critical areas on the soccer field. This result holds significant implications for soccer training and skill improvement.
On the PASCAL-Person-Part dataset, the results of human body parsing classification for each model are presented in Table 7 and Fig. 10. The comparison models in Table 7 (LG-LSTM, WSHP, PSPNet, PGN, etc.) are drawn from recent research in related fields, and each is implemented from the model architecture given in the sources listed below. During the experiment, these models are reproduced and compared fairly on the same dataset (PASCAL-Person-Part) with the same training strategies (such as data augmentation and optimizer settings). LG-LSTM: based on Sun et al.’s (2022) framework for temporal action recognition39. WSHP: a whole-body pose estimation network based on Jung et al. (2022)40. PSPNet: the scene parsing network of Yuan et al. (2022) is adopted41. PGN: a reproduction of the pose-guided generation model of Bodaghi et al. (2018)42.
In Fig. 10, the mIoU for the proposed soccer training keypoint detection model is 73.78%. Compared to the LG-LSTM, WSHP, PSPNet, and PGN models, the mIoU of this study’s model is higher by 27.29%, 9.16%, 8.34%, and 7.88%, respectively, in the task of human body parsing. These figures read as relative gains rather than percentage-point differences: against the LG-LSTM mIoU of 57.96%, for example, 73.78% is a relative improvement of 27.29%. The results indicate that this study’s model performs better in human body parsing tasks. The recall and average precision curves for these models are depicted in Fig. 11.
In Fig. 11, this study’s model exhibits a recall rate of 84.2% and an average precision rate of 84.6%. Compared to the other models, the highest improvements are 24.2% and 18.8%, respectively, while the lowest improvements are 8.8% and 6.0%, respectively. The data suggests that this study’s proposed CNNs-based soccer training keypoint detection model achieves higher detection accuracy. When comparing the detection speeds of these five models, the results are depicted in Fig. 12.
In Fig. 12, the detection speed of this study’s model is 35 frames per second (fps). The LG-LSTM model achieves the fastest detection speed at 55 fps but has a relatively lower mIoU of only 57.96%.
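The relation between the two reported mIoU values and the stated improvement over LG-LSTM can be checked with a short calculation. The function names are illustrative; only the two mIoU figures come from the text.

```python
def relative_gain(new, baseline):
    """Relative improvement of `new` over `baseline`, in percent."""
    return (new - baseline) / baseline * 100.0

def point_gain(new, baseline):
    """Absolute difference in percentage points."""
    return new - baseline

print(round(relative_gain(73.78, 57.96), 2))  # 27.29
print(round(point_gain(73.78, 57.96), 2))     # 15.82
```

The 27.29% improvement over LG-LSTM therefore corresponds to a gap of 15.82 percentage points in mIoU.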
To ensure the accuracy of the deep learning model, the experiment pays special attention to the design and results of baseline evaluations. This study chooses two main baseline evaluation methods: traditional computer vision methods and simple machine learning models. Traditional computer vision methods are employed as baseline evaluations to contrast with the performance of the deep learning model. The PASCAL-Person-Part dataset is utilized, and classical computer vision algorithms such as edge detection and corner detection are applied. By comparing the results with those of the deep learning model, the experiment assesses the relative advantages of deep learning in keypoint detection tasks. Furthermore, this study introduces some simple machine learning models, such as support vector machines and decision trees, as another means of baseline evaluation. The selection of these models is based on their simplicity and widespread application in image processing tasks. The experiment uses the same PASCAL-Person-Part dataset and applies these simple models for keypoint detection. By comparing their performance with the deep learning model, a more comprehensive assessment of the effectiveness of deep learning in this task is achieved. Table 8 presents key indicators from baseline evaluations based on different methods. The table reveals that the proposed model outperforms in all metrics, achieving an accuracy of 94.6%, AP of 84.6%, recall of 84.2%, and F1 score of 89.2%. This indicates that the proposed deep learning model exhibits high accuracy and performance in human parsing tasks. While the simple machine learning models show improvement over traditional computer vision methods, they still fall short of the proposed model. They slightly surpass traditional methods in accuracy, AP, recall, and F1 score but still lag behind the proposed model. This suggests that deep learning models have better performance and application prospects in human parsing tasks. 
The baseline evaluation results of the proposed deep learning model on the PASCAL-Person-Part dataset demonstrate its significant advantages in human parsing tasks, which are crucial for improving the accuracy and efficiency of human pose detection and analysis.
This study focuses on the performance of the deep learning model in keypoint detection for youth soccer training. To verify the differences in model prediction accuracy and their practical significance, the experiment employs Analysis of Variance (ANOVA) as the primary statistical analysis method. Additionally, Cohen’s d is applied to measure the effect size, evaluating the practical differences in the effectiveness of different training techniques. Specifically, the model’s prediction accuracy and position errors on all test sets are used as input data for ANOVA. Through ANOVA analysis, the experiment can determine whether there is a significant difference in keypoint prediction accuracy in different regions, such as penalty and non-penalty areas. Furthermore, to ensure the statistical persuasiveness of the results, the experiment also calculates the statistical power. Statistical power analysis helps the experiment assess whether the current sample size is sufficient to detect the actual effects, avoiding Type II errors—accepting the null hypothesis (i.e., no effect) erroneously due to a small sample size. Table 9 presents the results of ANOVA analysis and Cohen’s d calculation.
The results in Table 9 demonstrate a significant difference in keypoint prediction accuracy between penalty and non-penalty areas (P = 0.009), with a Cohen’s d value of 0.62. By Cohen’s conventional benchmarks (0.5 for a medium effect and 0.8 for a large one), this is a medium effect size. The statistical power is 0.88, above the commonly accepted standard of 0.8, suggesting that the sample size is sufficiently large to reveal the actual effects. To validate the model’s generalization ability, the experiment employed a fivefold cross-validation technique. This technique randomly divides the entire dataset into five equal parts, with each part serving as the test set in turn while the remaining parts act as the training set, which gives a more reliable estimate of how well the model generalizes across different data. Although the model demonstrates high overall prediction accuracy, its performance is relatively lower in densely populated areas such as the penalty box. ANOVA analysis reveals that the prediction errors for keypoints in the penalty box are notably higher. This is primarily due to the high player density, complex movements, and rapid actions in this region, which increase the difficulty of keypoint detection. In particular, while the prediction errors for foot positions and curved trajectories within the penalty box are relatively low, there remain certain deviations that could affect the precision required for soccer training. Considering the differences in player movements across various regions, a region-adaptive technique was employed to adjust the model’s learning weights. This allows the model to more accurately identify keypoints in high-density areas such as the penalty box. During training, the dataset was augmented with examples simulating densely populated scenarios, such as players moving rapidly or executing complex overlapping runs within the penalty area. These enhancements aim to improve the model’s performance in high-density regions. 
In practical soccer training, coaches can leverage the model’s performance insights in such regions to refine training priorities. Specifically, they can focus on enhancing players’ precision and rapid response abilities in critical areas like the penalty box. By implementing these strategies, the prediction accuracy of the model in dense keypoint regions can be effectively improved, thereby increasing its practical value in soccer training applications.
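The statistical quantities used in this analysis can be sketched as follows. The sketch assumes Cohen’s d with a pooled standard deviation and a plain one-way ANOVA F statistic, and it builds the five folds by shuffling and striding; the accuracy values are made-up toy numbers, not results from this study.

```python
import math
import random

def mean(xs):
    return sum(xs) / len(xs)

def sample_var(xs):
    m = mean(xs)
    return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)

def cohens_d(a, b):
    """Cohen's d using the pooled standard deviation of the two groups."""
    na, nb = len(a), len(b)
    pooled = math.sqrt(((na - 1) * sample_var(a) + (nb - 1) * sample_var(b)) / (na + nb - 2))
    return (mean(a) - mean(b)) / pooled

def one_way_f(groups):
    """F statistic of a one-way ANOVA: between-group over within-group mean square."""
    all_values = [v for g in groups for v in g]
    grand = mean(all_values)
    k, n = len(groups), len(all_values)
    ss_between = sum(len(g) * (mean(g) - grand) ** 2 for g in groups)
    ss_within = sum((v - mean(g)) ** 2 for g in groups for v in g)
    return (ss_between / (k - 1)) / (ss_within / (n - k))

def k_fold_indices(n_samples, k=5, seed=0):
    """Shuffle sample indices and split them into k folds of near-equal size."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]

# Hypothetical per-fold accuracies for two pitch regions (illustration only).
non_penalty = [0.95, 0.93, 0.96, 0.94, 0.92]
penalty = [0.90, 0.91, 0.88, 0.92, 0.89]
d = cohens_d(non_penalty, penalty)
f_stat = one_way_f([non_penalty, penalty])
folds = k_fold_indices(10, k=5)
print(d, f_stat, folds)
```

For two groups the F statistic equals the squared t statistic, so `f_stat` and `d` are linked by F = d² · n₁n₂ / (n₁ + n₂); this gives a quick consistency check when both are reported.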
Additionally, to boost the model’s generalization capability, this study introduced additional datasets and data augmentation techniques. Specifically, apart from the original PASCAL-Person-Part dataset, the study conducted additional validation on an independent dataset suitable for training detection models in sports analysis. This dataset is collected from the 2017 UEFA Super Cup match between Real Madrid and Manchester United (highlight reel), and the dataset link is: https://www.kaggle.com/datasets/sadhliroomyprime/football-semantic-segmentation. Data augmentation methods such as rotation, scaling, and cropping are applied during the preprocessing stage to enhance the model’s adaptability and generalization across different scenes and poses. Through testing on this independent dataset, the experiment confirms the effectiveness and generalization ability of the model. Table 10 presents the results of fivefold cross-validation and independent dataset validation:
Compared with the baseline and other state-of-the-art models, the proposed model demonstrates superior performance in both accuracy and average position error. Particularly noteworthy is the independent dataset validation, where the proposed model maintained high accuracy and low position error on previously unseen data, showing strong generalization ability and practical application potential.
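The two metrics discussed above, accuracy and average position error, can be computed as in the following sketch. It assumes that position error is the Euclidean distance between predicted and ground-truth keypoints and that a prediction counts as correct when it falls within a distance threshold. The study does not state either definition, so both are assumptions, as are the function names and the sample coordinates.

```python
import math


def mean_position_error(pred, truth):
    """Average Euclidean distance between paired predicted and true keypoints."""
    distances = [math.dist(p, t) for p, t in zip(pred, truth)]
    return sum(distances) / len(distances)


def accuracy_within(pred, truth, threshold):
    """Fraction of predictions within `threshold` of the paired true keypoint.

    The threshold criterion is an assumed definition of a correct prediction.
    """
    hits = sum(1 for p, t in zip(pred, truth) if math.dist(p, t) <= threshold)
    return hits / len(pred)


# Illustrative pixel coordinates (not data from the study).
pred = [(0.0, 0.0), (3.0, 4.0), (10.0, 10.0)]
truth = [(0.0, 0.0), (0.0, 0.0), (10.0, 16.0)]

mpe = mean_position_error(pred, truth)           # distances are 0, 5 and 6
acc = accuracy_within(pred, truth, threshold=5)  # the third prediction is too far
print(f"mean position error: {mpe:.2f}, accuracy: {acc:.2f}")
```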
To further validate the model’s effectiveness in real-world scenarios, this study conducted on-site tests in actual soccer training environments. These tests covered various weather conditions, training grounds, and time periods to assess the model’s performance in complex environments. The key performance indicators obtained by the proposed model in on-site tests are presented in Table 11. The results indicate that the proposed model maintains high accuracy across different environmental conditions and exhibits stability in predicting keypoint positions. Even in the face of challenging weather and lighting conditions, the model consistently provides reliable results, demonstrating its feasibility and practicality in real soccer training scenarios.
Second, interviews were conducted with five technology companies and 30 sports teachers to understand how AI and deep learning technologies are applied in campus football. To gather feedback from these participants, a questionnaire based on a scoring scale was designed to quantify each aspect of their responses. For instance, respondents rated technical performance, user experience, accuracy, and real-time responsiveness on a 5-point scale. These ratings supported an overall evaluation of performance across the technical dimensions. The quantitative data were analyzed statistically, using the mean, standard deviation, and correlation analysis, to identify relationships and differences among the feedback dimensions. For open-ended questions or descriptive feedback, the responses were categorized into thematic groups, such as “model accuracy,” “real-time response,” and “ease of operation.” Analyzing this qualitative feedback provided deeper insights into the specific needs of respondents and the practical application scenarios of the technology. Text analysis techniques, such as topic modeling and word frequency analysis, were employed to automate the analysis of large volumes of textual feedback and to identify the most frequently mentioned themes and recommendations for technical improvements. The results are summarized in Fig. 13.
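The quantitative steps described above (mean, standard deviation, correlation, and word frequency analysis) can be sketched as follows. The ratings, comments, and stop-word list are invented for illustration and are not the survey data. Topic modeling is not shown.

```python
import math
import re
from collections import Counter
from statistics import mean, stdev


def pearson(xs, ys):
    """Pearson correlation coefficient between two equally long rating lists."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


def word_frequencies(comments, stopwords=frozenset({"the", "is", "a", "and", "to", "of"})):
    """Count lower-cased words across free-text comments, skipping stop words."""
    words = []
    for comment in comments:
        words += [w for w in re.findall(r"[a-z]+", comment.lower()) if w not in stopwords]
    return Counter(words)


# Hypothetical 5-point ratings from five respondents.
accuracy_scores = [4, 5, 3, 4, 5]
realtime_scores = [3, 5, 2, 4, 4]

avg = mean(accuracy_scores)
sd = stdev(accuracy_scores)  # sample standard deviation
r = pearson(accuracy_scores, realtime_scores)
print(f"accuracy rating: mean={avg:.2f}, sd={sd:.2f}, r with real-time rating={r:.2f}")

# Hypothetical open-ended comments.
comments = ["The model accuracy is good", "Real-time response is slow", "Accuracy matters most"]
freq = word_frequencies(comments)
print(freq.most_common(3))
```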
The interviews with the five technology companies and 30 sports teachers assessed their understanding of how AI and deep learning technologies are applied in campus football. The results indicate that all five technology companies apply wearable devices and big data analysis technologies in campus football, with four of them also utilizing video analysis technology and positioning systems. Only one technology company offers VR technology products for campus football. Among the teachers, the majority have some understanding of the application of wearable devices and big data analysis technologies in campus football, and over half also have some knowledge of video analysis technology. Teachers generally have some understanding of positioning systems in campus football, whereas only a few know about VR technology. Together, these responses help characterize the current status and potential of AI and deep learning technologies in campus football.
Third, in youth football training, the application of deep learning and AI is influenced by various factors. The main influencing factors were identified through expert interviews and surveys of teachers and coaches, including policy, technological, hardware infrastructure, and cognitive and attitudinal factors. The specific details are shown in Fig. 14.
As Fig. 14 shows, policy considerations are regarded as the most crucial factor in both technology companies and campus communities, with all five technology companies unanimously identifying policy factors as key. Furthermore, four experts emphasized the importance of cognitive and attitudinal factors in the application of AI in campus soccer, and one expert also recognized the significance of technological factors. In contrast, some physical education teachers and coaches view technological factors as the primary influence, while several teachers consider cognitive and attitudinal factors key. Some teachers also mentioned policy factors, but only a few mentioned hardware infrastructure or other factors. Therefore, policy, cognitive and attitudinal, and technological factors play crucial roles in the application of AI and deep learning technologies in campus soccer, whereas the impact of hardware infrastructure and other factors is relatively limited. Understanding these factors is important for promoting the application of AI in campus soccer.
Additionally, data on the number of requests for 360-degree VR soccer videos were collected to understand users’ interest and demand for these videos. The results are presented in Fig. 15.
In Fig. 15, the most popular video type among users is match highlights, accounting for 25% of the total requests. This suggests that users prefer to watch highlights and key moments of important soccer matches. Requests for goal-highlight videos are also high, at 18.8%, indicating user interest in spectacular goals. Requests for training-skill tutorials, an educational content type, are lower but still have an audience, accounting for 8.3% of the total requests. These data reflect the diversity of 360-degree VR soccer videos, with different types of videos appealing to different audiences. When designing and promoting 360-degree VR soccer videos, content can be prioritized according to user preferences and interests, for example by producing more match highlights and goal highlights. Furthermore, these data can provide valuable insights into user preferences and market trends for soccer-related brands and platforms.
To further validate the computational efficiency of the proposed deep learning-based keypoint detection model for youth soccer training, this section compares it against other optimized models under similar hardware configurations, analyzing performance during training and inference. According to the test results, computation on a GTX 1080 GPU was significantly faster than on the other two configurations, especially when processing larger datasets, with notable improvements in both inference speed and training time. A comparison of training times on the training and testing datasets is presented in Table 12. The GTX 1080 GPU achieved the shortest training and inference times while also significantly reducing memory usage. Table 12 also shows that the hardware configuration (such as the GPU model) mainly affects training speed and has no significant correlation with model accuracy. For example, the accuracy difference between the GTX 1080 and GTX 1060 is less than 0.1%, and the improvement in validation performance is mainly due to model optimization.
To compare models across hardware configurations, this study evaluates its results against several published deep learning models that used similar hardware. For instance, the action recognition model proposed by Tsai et al. (2020) required 20 h of training on an NVIDIA GTX 1080 GPU, with training times exceeding 25 h on similar GPUs such as the GTX 1070 or GTX 1060. Moreover, its inference time and memory usage were relatively high43. In contrast, the model proposed in this study achieves significantly improved computational efficiency through optimized algorithms that reduce unnecessary computations and memory access. Specifically, during image processing and keypoint detection, the proposed model employs optimization strategies within the CNNs framework, such as batch normalization and weight sharing, enabling a more efficient training process. The comparison of computational performance across different hardware demonstrates that the proposed model offers a clear advantage in computational efficiency. During both training and inference, it performs high-precision keypoint detection in shorter timeframes while maximizing hardware resource utilization. Combined with the details of the training and test sets in Table 5, these results show that the proposed model optimizes memory usage and computation time, making large-scale dataset processing significantly more efficient.
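Batch normalization, mentioned above as one of the optimization strategies, standardizes each feature over a mini-batch and then applies a learnable scale and shift. The following is a minimal sketch of the standard training-time formula for a single feature. It is not the authors’ implementation, and the default values of `gamma`, `beta`, and `eps` are illustrative.

```python
import math


def batch_norm(batch, gamma=1.0, beta=0.0, eps=1e-5):
    """Normalize one feature across a mini-batch, then scale and shift.

    y_i = gamma * (x_i - mean) / sqrt(var + eps) + beta
    The batch variance is the biased (population) variance, as is usual
    during training.
    """
    n = len(batch)
    batch_mean = sum(batch) / n
    batch_var = sum((x - batch_mean) ** 2 for x in batch) / n
    return [gamma * (x - batch_mean) / math.sqrt(batch_var + eps) + beta for x in batch]


# Illustrative activations for one feature over a mini-batch of four samples.
normalized = batch_norm([1.0, 2.0, 3.0, 4.0])
print([round(v, 3) for v in normalized])
```

After normalization the activations have approximately zero mean and unit variance, which tends to stabilize and speed up training.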
To comprehensively evaluate the model’s ability to detect keypoints in youth soccer training, additional analyses focused on specific keypoints, such as foot positions and ball trajectories, using precision and recall as evaluation metrics. The study also categorized model performance by age group and skill level to provide more targeted training recommendations for young players. Table 13 shows the precision and recall results for these keypoints. The table shows how the model performs on different keypoints, and in particular that it identifies keypoints effectively in scenes where the foot touches the ball. Although ball trajectory detection performs slightly worse, the model still predicts large-angle trajectories accurately. These results provide a basis for accurate positioning and action capture in youth football training.
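Precision and recall for a keypoint type can be computed by matching predicted points to ground-truth points. The sketch below assumes a greedy nearest-neighbour match with a distance threshold `max_dist`. The study does not state its matching criterion, so this rule, the function name, and the coordinates are assumptions.

```python
import math


def keypoint_precision_recall(predictions, ground_truth, max_dist):
    """Greedy one-to-one matching of predicted to ground-truth keypoints.

    A prediction is a true positive if the nearest still-unmatched
    ground-truth point lies within max_dist (an assumed criterion).
    """
    unmatched = list(ground_truth)
    true_pos = 0
    for p in predictions:
        best = min(unmatched, key=lambda g: math.dist(p, g), default=None)
        if best is not None and math.dist(p, best) <= max_dist:
            true_pos += 1
            unmatched.remove(best)  # each ground-truth point matches at most once
    false_pos = len(predictions) - true_pos
    false_neg = len(ground_truth) - true_pos
    precision = true_pos / (true_pos + false_pos) if predictions else 0.0
    recall = true_pos / (true_pos + false_neg) if ground_truth else 0.0
    return precision, recall


# Illustrative foot-position detections in pixel coordinates.
predicted = [(10.0, 10.0), (52.0, 50.0), (200.0, 200.0)]
actual = [(11.0, 10.0), (50.0, 50.0), (90.0, 90.0), (120.0, 40.0)]

precision, recall = keypoint_precision_recall(predicted, actual, max_dist=5.0)
print(f"precision={precision:.2f}, recall={recall:.2f}")
```

Here two of the three predictions match a true keypoint (precision 2/3), and two of the four true keypoints are found (recall 1/2).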
To further refine the model’s performance and provide personalized training feedback for young athletes, the model was evaluated across different age groups and skill levels. Table 14 shows the performance of the model in different age groups (6–8, 9–12, 13–16, and over 16 years old). Precision and recall improve significantly with age, especially in the 13–16 age group. This indicates that as players’ skills mature, the model predicts foot position and ball trajectory more reliably and maintains high precision even for fast and complex movements.
Table 15 shows the performance of the model among players with different skill levels (beginner, intermediate, and advanced). For beginners, detection accuracy is relatively low: the model captures most foot movements and ball trajectories but makes some errors in the details. For intermediate players, performance is balanced between foot position and ball trajectory detection. For advanced players, the model maintains very high precision and recall, even in complex and fast movements.
To accommodate real-time application scenarios, the design of the model must consider the trade-off between speed and accuracy. Real-time applications typically require systems to complete data processing and feedback within milliseconds. Therefore, the model’s computational efficiency and response speed are critical factors. However, high-precision models often require more complex calculations and greater resources, which can lead to computation delays. The impact of different hardware configurations on the balance between speed and accuracy is shown in Table 16. There is a significant difference in performance between CPU and GPU when processing high-precision models. The GPU demonstrates a clear acceleration effect, completing computational tasks in a shorter time, while the CPU is slower. On embedded devices, lightweight models offer a more balanced trade-off between speed and accuracy, making them suitable for real-time feedback applications that are sensitive to delays.
Based on the results in Table 16, for training and analysis scenarios that require high precision, it is recommended to use GPUs with stronger computational power and employ the full deep learning model. While this may lead to extended processing times, it ensures high precision in training data and motion analysis. For applications requiring real-time feedback in sports and training, it is recommended to use a lightweight network architecture and rely on GPUs or embedded devices to balance accuracy and speed. Although the precision of the lightweight model slightly decreases, it offers significant advantages in processing speed and response time, meeting the real-time feedback requirements.
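The recommendation above amounts to choosing the most accurate configuration that still meets a latency budget. The sketch below illustrates that selection rule. The configuration names, latencies, and accuracies are hypothetical placeholders and are NOT the values reported in Table 16.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Config:
    name: str
    latency_ms: float  # time to process one frame
    accuracy: float


def pick_config(configs, latency_budget_ms):
    """Return the most accurate config within the latency budget, or None.

    Ties in accuracy are broken in favour of the lower latency.
    """
    feasible = [c for c in configs if c.latency_ms <= latency_budget_ms]
    if not feasible:
        return None
    return max(feasible, key=lambda c: (c.accuracy, -c.latency_ms))


# Hypothetical numbers for illustration only (not taken from Table 16).
candidates = [
    Config("full model, CPU", 400.0, 0.91),
    Config("full model, GPU", 45.0, 0.91),
    Config("lightweight model, embedded", 30.0, 0.86),
]

realtime_choice = pick_config(candidates, latency_budget_ms=33.0)   # about 30 frames per second
offline_choice = pick_config(candidates, latency_budget_ms=1000.0)  # offline analysis
print(realtime_choice.name, "|", offline_choice.name)
```

Under these assumed numbers, a tight real-time budget selects the lightweight model, while a relaxed budget selects the full model on a GPU.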
To comprehensively evaluate the proposed CNNs model, multiple comparison models were selected for the experiment, including state-of-the-art methods in the field as well as some classic benchmark models. Table 17 presents the performance of each comparison model in the keypoint detection task.
In Table 17, the proposed CNNs model performs best across multiple metrics, including accuracy, recall, and F1 score, particularly excelling in the keypoint detection task with an accuracy of 0.91, recall of 0.88, and an F1 score of 0.89. This indicates that the model offers high precision and stability in the context of youth soccer training, effectively capturing keypoint locations, especially in fast-moving and complex scenarios. In contrast, the performance of the classic benchmark CNNs model is relatively weaker, with an accuracy of 0.84, recall of 0.80, and an F1 score of 0.82. While this model still performs reasonably well in some scenarios, it lags significantly behind the more advanced CNNs models, especially in recall, suggesting potential instances of missed detections. The traditional machine learning method performs the worst, with an accuracy of only 0.77, recall of 0.75, and an F1 score of 0.76. This indicates that traditional machine learning methods have limited ability to handle complex movements and dynamic scenes, unable to fully leverage deep features in images, resulting in poorer performance compared to deep learning-based approaches. The deep learning-based keypoint detection method (OpenPose) performs relatively well, with an accuracy of 0.89, recall of 0.85, and an F1 score of 0.87. Although this method also shows strong detection capabilities, it slightly falls short compared to the advanced CNNs model, possibly due to differences in network structure or training strategies. Finally, the performance of the Simple CNN is better than the traditional machine learning method but still does not match the performance of other deep learning methods. With an accuracy of 0.80, recall of 0.78, and an F1 score of 0.79, this model is able to capture some features, but its depth and complexity are insufficient to handle higher-level features. 
These results indicate that the proposed model not only outperforms traditional methods in terms of performance but also effectively addresses the keypoint detection task in youth soccer training.
This study compares the performance of human experts and AI models in the keypoint detection task, thoroughly exploring the differences between the two and analyzing the reasons behind these differences. A comparison and analysis of the results between human and AI applications are shown in Table 18.
In Table 18, although the AI model performs comparably to human experts in most standard scenarios, its performance is relatively poorer in complex environments, such as fast movement or action occlusion. Furthermore, the real-time capability of the AI model is significantly superior to that of human experts. The AI can provide rapid feedback in a short time, while humans require more time for judgment and reaction. The AI model is trained on large datasets and can make predictions by learning features extracted from the data. In contrast, human experts rely more on intuition, experience, and an understanding of movement patterns when detecting key points. For example, although the AI model can accurately detect key points in standard scenarios through data learning, it often struggles to adapt and adjust in complex or unfamiliar scenarios, leading to a decrease in accuracy. Human experts, on the other hand, exhibit strong adaptability in dynamic and changing environments. For instance, in fast movements or under complex occlusion conditions, humans can infer the location of key points using contextual information, even when parts of the image are obscured or distorted. The AI model, however, relies heavily on image clarity and data consistency. When the input environment changes, the accuracy of the model may decline, especially in unfamiliar situations. While the AI model can achieve fast and accurate keypoint detection in standardized environments, its performance is limited in extreme conditions, such as intense lighting, complex movements, or severe occlusion. The model tends to rely on features present in its training dataset, and in cases of unseen situations or significant changes, the AI’s performance may be biased. In contrast, humans can compensate for these limitations through contextual awareness, accumulated experience, and rapid adaptation to the environment. 
Human experts are capable of making subjective judgments and adjusting actions based on real-time feedback. For example, in live matches, sports coaches or trainers not only rely on visual information but also consider the athlete’s condition, technical movements, and environmental changes. This flexibility is something current AI models find challenging to fully replicate. Although AI outperforms humans in certain aspects, such as real-time processing and data handling, AI models still face challenges in complex and dynamic scenarios. Human experts, with their experience and contextual judgment, demonstrate greater adaptability in such environments, while AI is better suited to standardized environments with sufficient data.
The performance comparison between the pre-trained models and the proposed model is shown in Table 19. The proposed model is superior to the pre-trained models in precision, recall, and F1 score. Specifically, the pre-trained ResNet50 model achieves a precision of 89.2%, a recall of 86.5%, and an F1 score of 87.8%, and the pre-trained VGG16 model achieves a precision of 85.7%, a recall of 83.1%, and an F1 score of 84.4%. In contrast, the proposed model achieves a precision of 91.0%, a recall of 88.0%, and an F1 score of 89.5%. This shows that the proposed model has higher precision and recall in keypoint detection tasks and that its overall performance is better than that of the pre-trained models.
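The F1 score is the harmonic mean of precision and recall, F1 = 2PR / (P + R). As a quick consistency check, the sketch below recomputes F1 from the precision and recall values quoted above. The dictionary keys are labels chosen here for readability.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Precision and recall as quoted in the text for Table 19.
reported = {
    "ResNet50 (pre-trained)": (0.892, 0.865),
    "VGG16 (pre-trained)": (0.857, 0.831),
    "Proposed model": (0.910, 0.880),
}

f1_by_model = {name: round(f1_score(p, r), 3) for name, (p, r) in reported.items()}
for name, f1 in f1_by_model.items():
    print(f"{name}: F1 = {f1:.3f}")
```

The recomputed values (0.878, 0.844, and 0.895) agree with the F1 scores of 87.8%, 84.4%, and 89.5% stated in the text.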
Finally, the advantages and disadvantages of the methods proposed in this study are summarized. The main advantages include high accuracy, practicality, and comprehensiveness. The study demonstrates that the model achieves good accuracy on the test dataset and can accurately detect key points in soccer training, including key movements and positions of soccer players. Such a model can be practically applied in soccer training, contributing to improved training efficiency and quality. By accurately detecting player movements, coaches can provide better guidance and enhance training programs. This study comprehensively combines deep learning CNNs and AI technology, offering a comprehensive solution for soccer training, including action recognition and position detection. The disadvantages include data requirements, computational complexity, and dependence on technical infrastructure. Deep learning models typically require a substantial amount of training data to achieve optimal performance. Insufficient soccer training data may limit the model’s performance. Deep learning models often have high computational complexity, demanding substantial computational resources and time for training and deployment. The successful application of the model may rely on the technical infrastructure of schools or soccer training facilities, including camera equipment and computing resources. In some environments, this could be a limiting factor. In conclusion, the approach proposed in this study holds great potential in soccer training but still needs to address certain challenges, such as data requirements and computational complexity. As technology continues to advance and data accumulates, these issues may gradually be resolved.
Discussion
This study developed a key point detection model for youth football training, which provides accurate results and enhances training efficiency. This is consistent with the findings of Sun and Ma (2021), whose research proposed an object recognition model that can detect the position information of key points, providing relative positional information for player identification and tracking in football video understanding44. It effectively resolves errors caused by rapid changes in video angles and provides necessary positional information for automated commentary and tactical generation.
This study identified policy, cognitive, attitudinal, and technological factors as important influencers of AI application in campus football, followed by hardware infrastructure and other factors. This is in line with the findings of Ahsan et al. (2022), which suggested that the optimal application of AI wearable devices in campus football includes smart vests, smart armbands, smart wristbands, motion patches, and football chest belts45. Additionally, Chidambaram et al. (2022) indicated that AI devices, through direct contact with players’ bodies, utilize sensors and big data technology to collect various performance and capability data of players, such as average speed, goals scored, playing time, high-intensity exercise duration, sprint counts, total running distance, and high-speed running distance46. These findings provide a theoretical basis for future experimental directions in research.
In addition, the proposed soccer training keypoint detection model has industrial significance in several respects. Firstly, it enhances soccer training efficiency through real-time feedback. The model can monitor player movements in real time and provide feedback, which is valuable for immediate improvement and adjustments before, during, and after matches. The model can assist coaches and players in conducting soccer training more effectively. By automatically detecting player movements and tactical errors so that they can be corrected, it can enhance the effectiveness of training, enabling players to improve their skill levels more rapidly. Secondly, it offers personalized training and reduces training costs. The model can provide personalized training recommendations based on each player’s performance and needs, making training more precise and tailored. Traditional soccer training often requires substantial human and time resources, and this model can reduce the demand for human resources, thus lowering training costs. Additionally, by collecting and analyzing a large amount of training data, the model can help coaches and teams develop better training plans and tactical strategies, thereby enhancing the team’s competitiveness. Lastly, the application of AI and deep learning to soccer training represents the forefront of the sports technology field. The development of this field is expected to drive the growth of the sports technology industry, including hardware devices, software applications, and media communication. In summary, the proposed soccer training keypoint detection model has the potential to improve the efficiency and quality of soccer training, reduce costs, promote the development of sports technology, and provide better training and competitive advantages for soccer players and teams. This is of significant industrial importance for the soccer and sports technology fields.
In addition to the aforementioned significance, the model can promote intelligent sports training for adolescents, inspiring their interest and learning. Intelligent sports training can help them understand soccer matches more comprehensively and prepare for them more effectively. For young people interested in soccer, this study may spark an interest in science and technology. They can learn how deep learning and AI are applied to real-world problems, which could inspire their future academic and career development. Using modern technology to enhance sports training can positively impact the development of sports for the younger generation.
Conclusion
Research contribution
This study proposes a deep learning CNNs-based model for detecting key points in youth soccer training. Through the collection and analysis of the dataset, this model can provide accurate results, thereby improving the efficiency of soccer training. Furthermore, this study analyzes the application scenarios of AI technology in youth soccer training and finds that companies and schools consider policy, cognitive, attitudinal, and technological factors as key factors for applying AI in campus soccer. Additionally, hardware facility factors and other related factors also influence the application of AI in campus soccer.
In conclusion, this study successfully applied deep learning CNNs and AI technology to enhance the practical efficiency of soccer training. By analyzing soccer training videos and athlete data, a key point detection model was developed that could accurately identify key movements and the positions of soccer players. The contribution of this study lies in filling a research gap in the field of soccer training, providing coaches and athletes with better training tools and methods. This study offers important references and guidance for the development of intelligent soccer training systems, laying a solid foundation for future research efforts.
Future works and research limitations
There are certain limitations in this study. Firstly, the majority of images in the dataset contain fewer than ten key points, so the number of images available for each key point is significantly lower than the total number of images in the dataset. Therefore, future work will involve further collection of annotated data. Additionally, the number of companies and teachers participating in the survey and interviews was relatively small. Therefore, future research will expand the number of participants to obtain more comprehensive results.
Data availability
The datasets used and/or analyzed during the current study are available from the corresponding author Chao Fu on reasonable request via e-mail 23872157@qq.com.
References***
Butcher, J. & Beridze, I. What is the state of artificial intelligence governance globally? RUSI J. 164(5–6), 88–96 (2019).
Cioffi, R., Travaglioni, M., Piscitelli, G., Petrillo, A. & De Felice, F. Artificial intelligence and machine learning applications in smart production: Progress, trends, and directions. Sustainability-Basel 12(2), 492 (2020).
Goralski, M. A. & Tan, T. K. Artificial intelligence and sustainable development. Int. J. Manag. Educ. Oxf. 18(1), 100330 (2020).
Guo, W. Explainable artificial intelligence for 6G: Improving trust between human and machine. IEEE Commun. Mag. 58(6), 39–45 (2020).
Mehonic, A. et al. Memristors—From in-memory computing, deep learning acceleration, and spiking neural networks to the future of neuromorphic and bio-inspired computing. Adv. Intell. Syst.-Ger. 2(11), 2000085 (2020).
Wang, P., Fan, E. & Wang, P. Comparative analysis of image classification algorithms based on traditional machine learning and deep learning. Pattern Recognit. Lett. 141, 61–67 (2021).
Mauer, M. A. D. et al. Automated age estimation of young individuals based on 3D knee MRI using deep learning. Int. J. Legal Med. 135, 649–663 (2021).
Kelly, A. L., Williams, C. A., Cook, R., Sáiz, S. L. J. & Wilson, M. R. A multidisciplinary investigation into the talent development processes at an English Football Academy: a machine learning approach. Sports 10(10), 159 (2022).
İçen, M. The future of education utilizing artificial intelligence in Turkey. Hum. Soc. Sci. Commun. 9(1), 1–10 (2022).
Kuleto, V. et al. Exploring opportunities and challenges of artificial intelligence and machine learning in higher education institutions. Sustainability-Basel 13(18), 10424 (2021).
Cuperman, R., Jansen, K. M. & Ciszewski, M. G. An end-to-end deep learning pipeline for football activity recognition based on wearable acceleration sensors. Sensors 22(4), 1347 (2022).
Qian, J. Research on artificial intelligence technology of virtual reality teaching method in digital media art creation. J. Internet Technol. 23(1), 125–132 (2022).
Katipoğlu, O. M. Prediction of streamflow drought index for short-term hydrological drought in the semi-arid Yesilirmak Basin using Wavelet transform and artificial intelligence techniques. Sustainability-Basel 15(2), 1109 (2023).
Huang, W., Ren, J., Yang, T. & Huang, Y. Research on urban modern architectural art based on artificial intelligence and GIS image recognition system. Arab. J. Geosci. 14, 1–13 (2021).
Araz, O. M., Choi, T. M., Olson, D. L. & Salman, F. S. Role of analytics for operational risk management in the era of big data. Decis. Sci. 51(6), 1320–1346 (2020).
Najafabadi, M. M. et al. Deep learning applications and challenges in big data analytics. J. Big Data-Ger. 2(1), 1–21 (2015).
Wani, J. A. et al. Machine learning and deep learning based computational techniques in automatic agricultural diseases detection: Methodologies, applications, and challenges. Arch. Comput. Method E 29(1), 641–677 (2022).
Li, H., Cui, C. & Jiang, S. Strategy for improving the football teaching quality by AI and metaverse-empowered in mobile internet environment. Wirel. Netw., 1–10 (2022).
Zhang, B., Lyu, M., Zhang, L. & Wu, Y. Artificial intelligence-based joint movement estimation method for football players in sports training. Mob. Inf. Syst. 2021, 1–9 (2021).
Zhou, D., Chen, G. & Xu, F. Application of deep learning technology in strength training of football players and field line detection of football robots. Front. Neurorobot. 16, 867028 (2022).
Zhao, H. & Meng, H. Research on the connotation and development strategy of Chinese campus football culture from the perspective of the integration of sports and education. Open J. Soc. Sci. 9(3), 157–163 (2021).
Hoftiezer, L. et al. From population reference to national standard: new and improved birthweight charts. Am. J. Obstet. Gynecol. 220(4), 383.e1-383.e17 (2019).
Momartin, S. et al. Capoeira Angola: An alternative intervention program for traumatized adolescent refugees from war-torn countries. Torture 29(1), 85–96 (2019).
Firek, W., Płoszaj, K. & Czechowski, M. Pedagogical function of referees in youth sport: Assessment of the quality of referee–player interactions in youth soccer. Int. J. Environ. Res. Public Health 17(3), 905 (2020).
Perrey, S. Training monitoring in sports: it is time to embrace cognitive demand. Sports 10(4), 56 (2022).
Rahlf, A. L. et al. A machine learning approach to identify risk factors for running-related injuries: study protocol for a prospective longitudinal cohort trial. BMC Sports Sci. Med. Rehab. 14(1), 1–11 (2022).
Scantlebury, S. et al. Navigating the complex pathway of youth athletic development: Challenges and solutions to managing the training load of youth team sport athletes. Strength Cond. J. 42(6), 100–108 (2020).
Lei, F., Liu, X., Dai, Q. & Ling, B. W. K. Shallow convolutional neural network for image classification. SN Appl. Sci. 2, 1–8 (2020).
Han, H. Residual learning based CNN for gesture recognition in robot interaction. J. Inf. Process Syst. 17(2), 385–398 (2021).
Ullah, A. et al. Secure healthcare data aggregation and transmission in IoT—A survey. IEEE Access 9, 16849–16865 (2021).
Janke, J., Castelli, M. & Popovič, A. Analysis of the proficiency of fully connected neural networks in the process of classifying digital images. Benchmark of different classification algorithms on high-level image features from convolutional layers. Expert Syst. Appl. 135, 12–38 (2019).
Zhang, X. et al. Asymmetric cross-attention hierarchical network based on CNN and transformer for bitemporal remote sensing images change detection. IEEE Trans. Geosci. Remote Sens. 61, 1–15 (2023).
Gajowniczek, K. et al. Semantic and generalized entropy loss functions for semi-supervised deep learning. Entropy 22(3), 334 (2020).
Elharrouss, O. et al. Refined edge detection with cascaded and high-resolution convolutional network. Pattern Recognit. 138, 109361 (2023).
Gogić, I., Ahlberg, J. & Pandžić, I. S. Regression-based methods for face alignment: A survey. Signal Process. 178, 107755 (2021).
Stoeve, M., Schuldhaus, D., Gamp, A., Zwick, C. & Eskofier, B. M. From the laboratory to the field: IMU-based shot and pass detection in football training and game scenarios using deep learning. Sensors 21(9), 3071 (2021).
Şah, M. & Direkoğlu, C. Review and evaluation of player detection methods in field sports: Comparing conventional and deep learning based methods. Multimed. Tools Appl. 82(9), 13141–13165 (2023).
Jin, G. Player target tracking and detection in football game video using edge computing and deep learning. J. Supercomput. 78(7), 9475–9491 (2022).
Sun, H., Chen, R., Liu, T. et al. LG-LSTM: Modeling LSTM-based interactions for multi-agent trajectory prediction. In Proc. 2022 IEEE Int. Conf. Multimedia Expo (ICME) 1–6 (IEEE, 2022).
Jung, Y. et al. A comprehensive review of thermal potential and heat utilization for water source heat pump systems. Energy Build. 266, 112124 (2022).
Yuan, W., Wang, J. & Xu, W. Shift pooling PSPNet: Rethinking PSPNet for building extraction in remote sensing images from entire local feature pooling. Remote Sens. 14(19), 4889 (2022).
Bodaghi, A. & Shahidzadeh, M. Synthesis and characterization of new PGN based reactive oligomeric plasticizers for glycidyl azide polymer. Propellants Explos. Pyrotech. 43(4), 364–370 (2018).
Tsai, J. K. et al. Deep learning-based real-time multiple-person action recognition system. Sensors 20(17), 4758 (2020).
Sun, C. & Ma, D. SVM-based global vision system of sports competition and action recognition. J. Intell. Fuzzy Syst. 40(2), 2265–2276 (2021).
Ahsan, M. et al. Smart clothing framework for health monitoring applications. Signals 3(1), 113–145 (2022).
Chidambaram, S. et al. Using artificial intelligence-enhanced sensing and wearable technology in sports medicine and performance optimisation. Sensors 22(18), 6920 (2022).
Author information
Contributions
Shaowei Liao: Conceptualization, methodology, software, validation, formal analysis, investigation, resources, data curation, writing—original draft preparation. Chao Fu: writing—review and editing, visualization, supervision, project administration, funding acquisition.
Ethics declarations
Competing interests
The authors declare no competing interests.
Ethics statement
The studies involving human participants were reviewed and approved by the Ethics Committee of the School of Physical Education, Xinyu University (Approval Number: 2022.6542023). All participants provided written informed consent to participate in this study. All methods were performed in accordance with relevant guidelines and regulations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.
About this article
Cite this article
Liao, S., Fu, C. The optimization of youth football training using deep learning and artificial intelligence. Sci Rep 15, 8190 (2025). https://doi.org/10.1038/s41598-025-93159-2