Introduction

As competitive sports training becomes more advanced, traditional evaluation methods increasingly fail to meet the demands for individualized, real-time, and dynamically adaptive assessments required by elite sports teams. For instance, in volleyball training, athletes’ physiological indicators exhibit high volatility and nonlinear functional responses. Indicators such as heart rate, lactate, and explosive power often show short-term phase lag, delayed changes, and asymmetric coupling, which pose challenges to function monitoring and load regulation1,2. Therefore, it is necessary to construct an intelligent evaluation model with dynamic response and structure perception capabilities.

In recent years, artificial intelligence technologies have been increasingly applied to modeling sports training data, including Back Propagation Neural Networks (BPNN), Long Short-Term Memory (LSTM) time-series networks, and Support Vector Machines (SVMs). However, these methods often struggle to capture the complex nonlinear relationships among indicators and are less effective in addressing high-dimensional feature redundancy3,4. Consequently, developing an evaluation model with both a flexible response structure and high compression adaptability has become an important research focus5.

This study introduces a Variable-Structure Multi-Kernel Generalized Regression Neural Network (VSGRNN). Building on the generalized regression neural network (GRNN), the proposed model incorporates a multi-kernel fusion mechanism, integrating Gaussian, Matern, and radial basis kernel functions to expand its capacity for modeling nonlinear relationships. In addition, a dynamic smoothing factor adjustment mechanism based on local gradient sensitivity is employed to improve the model’s stability under training perturbations. For feature construction, the model combines the Structural Embedding and Encoding Mechanism (SEEM) with the Multi-Scale Linear Compression (MSLC) framework, enabling automatic screening and semantic reconstruction of redundant physiological features. This approach enhances computational efficiency and adaptability for deployment.

The proposed method was tested using six consecutive weeks of training data from a high-level university men’s volleyball team. A multi-dimensional indicator input matrix was constructed, and the VSGRNN’s performance was systematically compared with various mainstream models. Experimental results demonstrated that VSGRNN achieved superior global prediction accuracy, improved response to local perturbations, greater tolerance to structural compression, and shorter real-time feedback delays. These findings highlight the model’s strong potential for engineering applications and practical promotion. Overall, this study provides a novel approach for the intelligent evaluation of physical function in volleyball-specific training, offering both theoretical foundations and methodological support for the broader use of structure-adaptive modeling in sports data analysis.

Literature review

In recent years, monitoring and evaluating changes in physical function in volleyball and other specialized sports training have attracted increasing attention6,7. Wang et al. (2023) developed a training load evaluation system using heart rate variability (HRV) and lactate concentration as core indicators. Although effective for detecting phasic fatigue states, this system employed a static evaluation model that lacked responsiveness to time-series fluctuations8. Salim et al. (2024) created a volleyball training system integrating inertial measurement units, pressure-sensitive floors, and machine learning. This system enabled automatic action recognition, real-time feedback, and a highly interactive training environment, significantly enhancing the intelligence level of sports performance monitoring9. Wang et al. (2025) introduced a fuzzy comprehensive evaluation method for periodic function scoring, but its heavy subjectivity and limited robustness in handling outliers restricted its practical value10.

With the rise of intelligent algorithms in sports science, techniques such as neural networks, ensemble learning, and time-series modeling have been widely applied to predicting and diagnosing athletes’ physical functions11,12. Chen et al. (2023) used a multi-layer perceptron to identify fatigue status in adolescent basketball players, accurately predicting recovery within 24 h after training13. Sattaburuth and Piriyasurawong (2022) applied an LSTM network to heart rate prediction in football training, demonstrating the effectiveness of time-series modeling for slowly varying physiological states14. Imperiali et al. (2025) employed the XGBoost algorithm to construct a function scoring model, which eliminated redundant indicators through feature importance ranking and enabled automated annotation of training load levels15. However, most existing methods rely on single-kernel functions or static structures, limiting their ability to adjust response pathways in real-time to sudden fluctuations during training16,17.

In recent years, several scholars have proposed neural network models with strong structural convergence and noise resistance, mainly applied to solving dynamic nonlinear equations and time-varying matrix inversion problems. For instance, Li et al. (2020) developed a finite-time convergent and noise-rejection zeroing neural network (FTNRZNN) for robust dynamic equation solving18. Zhang et al. (2022) presented a structurally simplified unified gradient neural network (GNN) that achieved fast and stable inversion of time-varying matrices while improving noise tolerance19. Ying et al. (2025) further designed a neural differential structure incorporating an adaptive noise learning mechanism to handle dynamic disturbance responses in constrained optimization problems20. These works provide effective methods for addressing high-dimensional, complex structural problems. However, their focus primarily lies in mathematical problem-solving and robotic control tasks, resulting in limited adaptability for evaluating human physiological functions, which involve highly redundant features and multi-disturbance physiological time series. By contrast, the VSGRNN model proposed in this study targets multi-scale disturbance modeling and elastic structural response tailored to physiological data. It combines multi-kernel nonlinear mapping with a structural embedding and compression mechanism to accommodate the real-time demands and individual heterogeneity characteristic of volleyball training scenarios.

Although existing studies have made progress in indicator modeling and the application of intelligent algorithms, several limitations remain. These include limited capability to model nonlinear coupling among multiple indicators and to respond effectively to local perturbations. Additionally, there is a lack of a comprehensive framework that balances accuracy, compression efficiency, and real-time feedback. Therefore, there is an urgent need to develop an intelligent evaluation model with structural flexibility and stable output performance, capable of adapting to the high variability and heterogeneity present in volleyball training scenarios.

Research model

Bottlenecks in identifying nonlinear patterns in training monitoring

In the long-duration and highly variable environment of volleyball-specific training, traditional machine learning methods often struggle with weak nonlinear identification when modeling changes in athletes’ physical functions21,22. This issue is especially pronounced during instantaneous fluctuations in training load—such as sprints or rapid offense-defense transitions—or periodic oscillations like alternating phases of fatigue and recovery. Under these conditions, models tend to produce unstable outputs and exhibit delayed responses23,24.

For example, when the BPNN is used to model multi-dimensional physiological indicators (such as lactate concentration, explosive load, and HRV), the model often suffers from unstable, highly nonlinear local fitting. The network mapping is given by Eq. (1)25.

$$\widehat{y}(x)=f\left(W_{2}\cdot\sigma\left(W_{1}x+b_{1}\right)+b_{2}\right)$$
(1)

Here, σ denotes the activation function, \(W_{1}\) and \(W_{2}\) are weight matrices, and \(b_{1}\) and \(b_{2}\) are bias terms. Although this expression has some nonlinear fitting capacity, under multivariable interference (such as a sudden change in training intensity) \(\widehat{y}(x)\) is prone to excessive oscillation. This problem, known as local over-fitting, makes it difficult to extract physically meaningful patterns26.

Meanwhile, SVMs often depend on the mapping capabilities of kernel functions when modeling complex training response processes27. However, in the presence of locally redundant features—such as repeated heart rate recordings during recovery periods—the high-dimensional mapping can actually weaken the model’s ability to capture essential trends28,29. The kernel function transformation is illustrated in Eq. (2):

$$K(x_{i},x_{j})=\exp\left(-\gamma\,\|x_{i}-x_{j}\|^{2}\right)$$
(2)

When γ is poorly chosen or the samples overlap densely, the kernel response tends to flatten, which leads to response lag and blind spots in training monitoring30.
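The flattening effect described for Eq. (2) can be illustrated with a minimal sketch. The sample points and the three γ values below are hypothetical and chosen only for illustration; they are not taken from the study's data.

```python
import math

def rbf_kernel(xi, xj, gamma):
    """Eq. (2): K(xi, xj) = exp(-gamma * ||xi - xj||^2)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(xi, xj))
    return math.exp(-gamma * sq_dist)

# Hypothetical standardized samples: one close neighbour and one distant sample.
query = [0.0, 0.0]
near = [0.1, 0.1]
far = [1.0, 1.0]

for gamma in (0.01, 1.0, 100.0):
    k_near = rbf_kernel(query, near, gamma)
    k_far = rbf_kernel(query, far, gamma)
    # A small gap between k_near and k_far means the kernel barely
    # distinguishes neighbours from distant samples (a flat response).
    print(f"gamma={gamma:>6}: near={k_near:.4f} far={k_far:.4f} gap={k_near - k_far:.4f}")
```

With a very small γ both responses stay close to 1, and with a very large γ both collapse toward 0, so in either case the kernel separates near and far samples poorly compared with a moderate γ.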

A multi-kernel variable-structure GRNN modeling strategy

To provide a clearer illustration of the overall architecture and data flow of the proposed VSGRNN model, Fig. 1 presents a structural flowchart of the model, highlighting key components such as feature construction, kernel function fusion, gradient adjustment, and local weighted prediction.

Fig. 1 Model architecture flowchart.

Figure 1 illustrates the overall architecture of the VSGRNN model. It integrates structural feature construction, multi-scale compression, dynamic kernel fusion, and gradient-aware adjustment techniques to achieve robust and adaptive prediction in complex training scenarios.

Unlike the traditional GRNN, whose structure is relatively static, VSGRNN introduces three heterogeneous kernel functions in the kernel layer: the Gaussian kernel (Eq. (3)), the radial basis kernel (Eq. (4)), and the Matern kernel (Eq. (5)):

$$K_{G}(x,x_{i})=\exp\left(-\frac{\|x-x_{i}\|^{2}}{2\sigma^{2}}\right)$$
(3)
$$K_{R}(x,x_{i})=\frac{1}{1+\alpha\|x-x_{i}\|^{2}}$$
(4)
$$K_{M}(x,x_{i})=\left(1+\frac{\sqrt{3}\,\|x-x_{i}\|}{l}\right)\exp\left(-\frac{\sqrt{3}\,\|x-x_{i}\|}{l}\right)$$
(5)

σ, α, and l are the shape parameters of the respective kernel functions. The kernels are not combined with equal weights. Instead, a dynamic weight \(\omega_{k}\) is computed from the local gradient sensitivity distribution to construct a multi-kernel fusion function, as shown in Eq. (6):

$$K_{min}(x,x_{i})=\sum_{k=1}^{3}\omega_{k}(x)\cdot K_{k}(x,x_{i})$$
(6)
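A minimal sketch of Eqs. (3) to (6) follows. The text does not give the formula that maps local gradient sensitivity to the weights \(\omega_{k}\), so the sketch takes the weights as a precomputed input and assumes they sum to 1. The default values for `alpha` and `length` are hypothetical; `sigma=0.15` follows the initial smoothing factor reported in the parameter settings.

```python
import math

def sq_dist(x, xi):
    """Squared Euclidean distance between two feature vectors."""
    return sum((a - b) ** 2 for a, b in zip(x, xi))

def gaussian_kernel(x, xi, sigma):
    """Eq. (3)."""
    return math.exp(-sq_dist(x, xi) / (2.0 * sigma ** 2))

def radial_kernel(x, xi, alpha):
    """Eq. (4)."""
    return 1.0 / (1.0 + alpha * sq_dist(x, xi))

def matern_kernel(x, xi, length):
    """Eq. (5): Matern form with smoothness 3/2."""
    r = math.sqrt(3.0) * math.sqrt(sq_dist(x, xi)) / length
    return (1.0 + r) * math.exp(-r)

def fused_kernel(x, xi, weights, sigma=0.15, alpha=1.0, length=1.0):
    """Eq. (6): weighted sum of the three kernels.

    `weights` = (w_gaussian, w_radial, w_matern) is assumed to be supplied
    by the gradient-sensitivity step, which is not specified here.
    """
    w_g, w_r, w_m = weights
    return (w_g * gaussian_kernel(x, xi, sigma)
            + w_r * radial_kernel(x, xi, alpha)
            + w_m * matern_kernel(x, xi, length))

# Usage: uniform weights, as at the start of training.
k = fused_kernel([0.2, 0.1], [0.3, 0.0], weights=(1 / 3, 1 / 3, 1 / 3))
print(f"fused kernel response: {k:.4f}")
```

Because each kernel equals 1 at zero distance, weights that sum to 1 keep the fused response in the same range as the individual kernels.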

To avoid redundant kernel responses in regions where training samples are dense, a smoothing factor control mechanism based on local gradient perturbation is introduced into the model31,32. Specifically, the perturbation gradient of the j-th input feature in the local neighborhood is defined by Eq. (7):

$$\nabla_{j}=\frac{1}{n}\sum_{i=1}^{n}\left|\frac{\partial y_{i}}{\partial x_{ij}}\right|$$
(7)

Then, the smoothing factor of the kernel response function is adjusted for this disturbance according to Eq. (8):

$$\sigma_{j}^{*}=\frac{\sigma_{0}}{1+\beta\nabla_{j}}$$
(8)

\(\sigma_{0}\) is the initial smoothing factor and β is the adjustment coefficient. This strategy makes the response of high-gradient features “sharper”, while flat regions keep a broader, generalized response, yielding adaptive compression in gradient-sensitive regions33.
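Eqs. (7) and (8) can be sketched as follows. The partial derivatives are assumed to be supplied by some earlier step (for example finite differences), which the text does not specify. The value `beta = 2.0` and the two derivative lists are hypothetical; `sigma0 = 0.15` follows the reported parameter setting.

```python
def perturbation_gradient(partials_j):
    """Eq. (7): mean absolute partial derivative of the output
    with respect to feature j over the n local samples."""
    return sum(abs(g) for g in partials_j) / len(partials_j)

def adaptive_sigma(sigma0, beta, grad_j):
    """Eq. (8): shrink the smoothing factor as the local gradient grows."""
    return sigma0 / (1.0 + beta * grad_j)

sigma0 = 0.15   # initial smoothing factor reported in the parameter settings
beta = 2.0      # hypothetical adjustment coefficient

# Hypothetical local derivatives for a flat feature and a steep feature.
flat_partials = [0.01, -0.02, 0.015]
steep_partials = [1.5, -2.0, 2.5]

for name, partials in (("flat", flat_partials), ("steep", steep_partials)):
    grad = perturbation_gradient(partials)
    print(f"{name}: gradient={grad:.3f}, sigma*={adaptive_sigma(sigma0, beta, grad):.4f}")
```

The steep feature receives a much smaller smoothing factor than the flat one, which is the sharpening behaviour the paragraph describes.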

Finally, the model output is not an equal-weight average of all sample responses. Instead, an exponentially decaying local weighting mechanism is introduced, as shown in Eq. (9):

$$\widehat{y}(x)=\frac{\sum_{i=1}^{N}y_{i}\cdot\exp\left(-\lambda\cdot d(x,x_{i})\right)\cdot K_{min}(x,x_{i})}{\sum_{i=1}^{N}\exp\left(-\lambda\cdot d(x,x_{i})\right)\cdot K_{min}(x,x_{i})}$$
(9)

\(d(x,x_{i})\) is the distance in the input space and λ is the attenuation factor. This mechanism effectively introduces dual attention, combining spatial distance attention with adaptive kernel attention, to suppress interference from distant redundant samples and keep the response spatially localized34.
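The weighted estimate in Eq. (9) can be sketched as below. Two assumptions are made for illustration: \(d\) is taken to be the Euclidean distance, and a single Gaussian kernel stands in for the fused kernel so that the block stays self-contained. The training points, `lam`, and `sigma` are hypothetical.

```python
import math

def euclidean(x, xi):
    """Assumed form of the input-space distance d(x, xi)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, xi)))

def gaussian_kernel(x, xi, sigma=0.5):
    """Stand-in for the fused kernel of Eq. (6)."""
    return math.exp(-euclidean(x, xi) ** 2 / (2.0 * sigma ** 2))

def predict(x, train_x, train_y, kernel, lam):
    """Eq. (9): kernel regression with an extra exponential distance decay."""
    num = 0.0
    den = 0.0
    for xi, yi in zip(train_x, train_y):
        weight = math.exp(-lam * euclidean(x, xi)) * kernel(x, xi)
        num += yi * weight
        den += weight
    if den == 0.0:
        raise ValueError("all sample weights underflowed to zero")
    return num / den

# Hypothetical one-dimensional training set.
train_x = [[0.0], [1.0], [2.0]]
train_y = [10.0, 20.0, 30.0]
print(predict([1.0], train_x, train_y, gaussian_kernel, lam=1.0))
print(predict([0.4], train_x, train_y, gaussian_kernel, lam=1.0))
```

Because the output is a normalized weighted average, every prediction stays between the smallest and largest training target, and increasing `lam` concentrates the weight on the nearest samples.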

Feature configuration mechanism and embedded dimension reduction design

Let the original feature matrix be \(X\in\mathbb{R}^{n\times m}\), where n is the number of samples and m is the number of original indicators. A mapping is generated by defining a mutual exclusion tensor, as shown in Eq. (10):

$$\mathcal{T}_{ijk}=\varphi(x_{ij},x_{ik})=\frac{|x_{ij}-x_{ik}|}{x_{ij}+x_{ik}+\epsilon}$$
(10)

ε is a perturbation constant used to prevent the denominator from approaching zero. This mapping encodes the relative difference between any two indicators as a mutual exclusion coefficient within the tensor, reflecting their “relative expressive ability.” This allows the subsequent compression process to identify variable groups with strong mutual exclusion and high representativeness35.
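A minimal sketch of the tensor in Eq. (10) follows. It assumes non-negative indicator values (for example after min-max scaling), because the denominator is a plain sum and could approach zero or turn negative for signed inputs. The sample row and the value of `eps` are hypothetical.

```python
def mutual_exclusion_tensor(X, eps=1e-6):
    """Eq. (10): T[i][j][k] = |x_ij - x_ik| / (x_ij + x_ik + eps).

    X is a list of n samples, each a list of m non-negative indicators.
    The result has shape n x m x m.
    """
    tensor = []
    for row in X:
        m = len(row)
        tensor.append([
            [abs(row[j] - row[k]) / (row[j] + row[k] + eps) for k in range(m)]
            for j in range(m)
        ])
    return tensor

# Hypothetical sample with three scaled indicators.
X = [[1.0, 3.0, 4.0]]
T = mutual_exclusion_tensor(X)
for j, line in enumerate(T[0]):
    print(j, [round(v, 3) for v in line])
```

Each slice `T[i]` is symmetric with a zero diagonal, so an indicator never counts as exclusive of itself.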

Once the embedded tensor is constructed, a multi-scale linear compression (MSLC) framework is applied to reconstruct and compress the structural tensor. Unlike Principal Component Analysis (PCA), MSLC does not rely on the covariance matrix to extract principal components. Instead, it introduces a scale window function matrix Ws to map the feature tensor into a multi-scale projection space36,37, as shown in Eq. (11):

$$\mathcal{T}^{(s)}=\mathcal{T}\cdot W_{s}$$
(11)

\(s\in\{s_{1},s_{2},s_{3}\}\) indexes layers with different compression scales, \(W_{s}\in\mathbb{R}^{m\times d_{s}}\), and \(d_{s}<m\). Different scale windows retain different levels of discriminative information, enabling dynamic reconstruction from coarse to fine granularity.

Finally, the reconstructed tensors at multiple scales are merged into a fused spatial representation, as shown in Eq. (12):

$$\mathcal{Z}=\underset{s}{\oplus}\,\text{ReLU}\left(\mathcal{T}^{(s)}\right)$$
(12)

⊕ denotes the concatenation (splicing) operation, and ReLU activation is used to suppress nonlinear redundant terms.
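Eqs. (11) and (12) can be sketched as a projection of the last tensor axis followed by concatenation across scales. The text does not specify how \(W_{s}\) is built, so `window_matrix` below is a hypothetical choice in which each column averages a contiguous block of indicators. The example tensor and the scales `(3, 2, 1)` are also hypothetical.

```python
def window_matrix(m, d):
    """Hypothetical scale-window matrix W_s (m x d, d < m):
    column c averages a contiguous block of the m indicators."""
    W = [[0.0] * d for _ in range(m)]
    for c in range(d):
        start, stop = c * m // d, (c + 1) * m // d
        for k in range(start, stop):
            W[k][c] = 1.0 / (stop - start)
    return W

def project(T, W):
    """Eq. (11): contract the last axis of T (n x m x m) with W (m x d)."""
    d = len(W[0])
    return [[[sum(row[k] * W[k][c] for k in range(len(row))) for c in range(d)]
             for row in sample]
            for sample in T]

def relu(v):
    return v if v > 0.0 else 0.0

def mslc(T, scales):
    """Eq. (12): concatenate ReLU(T @ W_s) along the last axis for each scale."""
    m = len(T[0])
    projected = [project(T, window_matrix(m, d)) for d in scales]
    return [[[relu(v) for P in projected for v in P[i][j]]
             for j in range(m)]
            for i in range(len(T))]

# Hypothetical 1 x 4 x 4 structural tensor.
T = [[[0.0, 0.2, 0.4, 0.6],
      [0.2, 0.0, 0.2, 0.4],
      [0.4, 0.2, 0.0, 0.2],
      [0.6, 0.4, 0.2, 0.0]]]
Z = mslc(T, scales=(3, 2, 1))
print(len(Z), len(Z[0]), len(Z[0][0]))  # samples, indicators, fused width
```

With non-negative inputs and averaging windows the ReLU leaves the values unchanged. It would only take effect if \(W_{s}\) contained negative entries, for example if it were learned.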

Experimental design and performance evaluation

Dataset collection

The training sample data in this study were collected from a volleyball-specific experimental platform at a national sports institute. The subjects included 30 male elite athletes from the school team’s main lineup. Data collection spanned a continuous six-week training camp. Each week featured a fixed basic training load, special simulated matches, and periodic fatigue recovery cycles, ensuring that the indicators exhibited sufficient dynamic evolution.

A total of 19 data dimensions were recorded, covering two main categories: exercise physiology and physical fitness structure. The monitored indicators are detailed in Table 1.

Table 1 Monitoring indicators.

Due to the multi-source heterogeneity and varying sampling frequencies among indicators—where some devices use continuous sensing and others intermittent detection—the overall data exhibit the following characteristics:

  1. Strong structural fluctuations and significant non-stationarity between time segments. Typical physiological perturbations, such as abnormal lactate spikes and delayed heart rate drift, occur on certain training days.

  2. Asymmetric cross-correlation structures among indicators. For example, a positive correlation between VE and TRIMP rapidly weakens or even reverses during fatigue periods. The overall data distribution deviates from Gaussian, showing pronounced heavy tails.

To ensure data quality for model input, all data undergo a standardized preprocessing pipeline:

  • Sampling frequency unification: Data from different sources are aligned by timestamps and resampled using a sliding window with a 30-second interval.

  • Noise filtering: Discrete wavelet transform combined with soft-thresholding is applied to denoise the signals.

  • Standardization: All input variables are normalized using Z-score standardization.

  • Missing value imputation: For occasional missing segments in HRV and lactate indicators, third-order local linear interpolation is used to reconstruct the data at the segment level, preserving the original trend patterns.

The final training sample matrix is \(X\in\mathbb{R}^{T\times 19}\), where T = 6450 represents the number of sample frames. This corresponds to data collected from 30 athletes over six consecutive weeks with a 30-second sampling interval.
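Two of the preprocessing steps, resampling to a 30-second grid and Z-score standardization, can be sketched as follows. The exact windowing scheme is not specified, so averaging readings within consecutive 30-second bins is an assumption. Wavelet denoising and missing-value interpolation are omitted, and the timestamps and heart-rate readings are hypothetical.

```python
from statistics import mean, pstdev

def resample_30s(timestamps, values, window=30.0):
    """Average raw readings into consecutive `window`-second bins.
    Bins without any reading are skipped (they would need imputation)."""
    bins = {}
    for t, v in zip(timestamps, values):
        bins.setdefault(int(t // window), []).append(v)
    return [mean(bins[b]) for b in sorted(bins)]

def z_score(series):
    """Standardize a series to zero mean and unit (population) variance."""
    mu = mean(series)
    sd = pstdev(series)
    if sd == 0.0:
        return [0.0] * len(series)
    return [(v - mu) / sd for v in series]

# Hypothetical heart-rate readings with irregular timestamps (seconds).
timestamps = [0.0, 10.0, 20.0, 30.0, 40.0, 65.0]
heart_rate = [120.0, 124.0, 128.0, 140.0, 144.0, 150.0]

frames = resample_30s(timestamps, heart_rate)
print(frames)
print([round(z, 3) for z in z_score(frames)])
```

In practice the standardization statistics would be computed on the training split only and reused for validation data, so that no information leaks across the split.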

To further validate the applicability of the VSGRNN model beyond volleyball team data, the publicly available PAMAP2 (Physical Activity Monitoring 2) dataset was introduced for generalization testing and comparative analysis, thereby supplementing the external robustness evidence of the evaluation framework.

Experimental environment

The specific software configuration and hardware deployment are shown in Table 2.

Table 2 Software configuration and hardware deployment.

The experimental environment was designed with the following priorities:

  • Stability: Ensuring no crashes, memory leaks, or GPU deadlocks occur during multiple training cycles.

  • Reproducibility: Locking all dependency versions and providing deployment images to facilitate replication of results and ease of engineering adoption.

  • Debugging flexibility: Supporting multi-level training log tracking at both batch and epoch levels, with immediate annotation of any abnormal gradients.

  • Resource isolation: Completely separating the operating environments of each model to prevent cross-interference.

Parameter settings

The main parameter configuration and control logic of each model are shown in Table 3.

Table 3 Main parameter configuration and control logic of each model.

During parameter initialization, sensitivity tests on the smoothing factor σ revealed that when σ₀ is less than 0.1, the network output exhibits high-frequency noise and is easily misled by distant disturbance samples. Conversely, when σ₀ exceeds 0.2, the response becomes overly flat, resulting in a loss of discrimination ability. Therefore, σ₀ was set to 0.15 and is adaptively adjusted through a local gradient feedback mechanism.

At the start of training, the kernel fusion coefficients ω are initialized uniformly. After several iterations, however, they shift toward the Matern kernel (kernel 3), indicating its superior fitting capability in non-stationary regions.

In the MSLC module, incorporating varying compression ratios (dₛ) endows VSGRNN with a multi-scale response mechanism. Specifically, when handling cross-fluctuation regions such as HRV and lactate levels, the lower-dimensional compression layers provide effective feature filtering. Meanwhile, the higher-dimensional windows preserve complex structural relationships to maintain prediction stability.

Performance evaluation

To systematically assess the performance of the proposed VSGRNN in evaluating physical functions during volleyball training, five model comparison groups were established: the proposed VSGRNN (featuring multi-kernel fusion and smooth self-tuning mechanisms), Lightweight Artificial Neural Network (LightANN), LSTM with Attention, Extreme Gradient Boosting (XGBoost), and Tabular Data Network (TabNet). All models were trained and tested under identical data partitions and training protocols. Their performance was evaluated across five dimensions: global fitting accuracy, nonlinear disturbance response, structural compression adaptability, response delay stability, and overall multi-dimensional score.

Global fitting performance analysis

On the full validation set, VSGRNN achieved the best overall prediction accuracy. In particular, it maintained stable errors when predicting lactate rise segments and heart rate buffering periods during non-stationary phases. Three mainstream evaluation metrics were used: goodness of fit (R²), which measures how much of the observed variation the predictions capture; root mean square error (RMSE), which measures the absolute magnitude of the prediction error; and symmetric mean absolute percentage error (SMAPE), which measures relative error symmetrically with respect to predicted and observed values. Figure 2 presents the global prediction performance of each model on the validation set.
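The three metrics can be computed as in the sketch below. The text does not state which SMAPE variant was used, so the common form with denominator (|y| + |ŷ|)/2, expressed in percent, is an assumption. The example values are hypothetical.

```python
import math

def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    if ss_tot == 0.0:
        raise ValueError("R^2 is undefined for constant targets")
    return 1.0 - ss_res / ss_tot

def rmse(y_true, y_pred):
    """Root mean square error, in the units of the target."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))

def smape(y_true, y_pred):
    """Symmetric MAPE in percent (assumed variant). Pairs where both
    values are zero contribute zero error."""
    total = 0.0
    for t, p in zip(y_true, y_pred):
        denom = abs(t) + abs(p)
        if denom > 0.0:
            total += 2.0 * abs(p - t) / denom
    return 100.0 * total / len(y_true)

# Hypothetical observed and predicted function scores.
y_true = [60.0, 65.0, 70.0, 80.0]
y_pred = [62.0, 64.0, 71.0, 77.0]
print(f"R2={r2_score(y_true, y_pred):.3f} "
      f"RMSE={rmse(y_true, y_pred):.3f} "
      f"SMAPE={smape(y_true, y_pred):.2f}%")
```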

Fig. 2 Global prediction performance of each model on the validation set.

Figure 2 shows that VSGRNN outperforms all other models across the three evaluation metrics. With an R² of 0.927, it demonstrates strong capability in fitting complex nonlinear changes. Its RMSE is only 1.68, indicating minimal prediction error. The SMAPE stands at 8.21%, reflecting the smallest prediction fluctuations in segments with significant changes, such as lactate levels, heart rate, and explosive power. LSTM with Attention ranks second with an R² of 0.884. It slightly surpasses TabNet in short-term trend prediction but has higher RMSE and SMAPE values than VSGRNN, showing mild overfitting on highly heterogeneous samples. TabNet and XGBoost perform moderately: TabNet benefits from its ability to learn tabular features but is limited by the absence of deep convolutional fusion; XGBoost tends to deviate when faced with strong nonlinear coupling and imbalanced samples. LightANN performs the worst. Although it offers fast inference, it struggles to capture asymmetric changes among indicators, with an SMAPE near 15%—almost double that of VSGRNN.

Evaluation of local disturbance and nonlinear response capability

Two typical local scenarios were constructed for this evaluation: (1) the lactate surge segment, occurring 30 min after peak exercise intensity, during the rapid lactate rise phase; and (2) segments with intense HRV fluctuations, recorded during training periods that alternate between high- and low-intensity exercise and produce asymmetric physiological responses. For these non-stationary sample regions, five structured micro-scale indicators were defined to assess model response performance, as detailed in Table 4.

Table 4 Five structured microscale indicators.

Figure 3 compares the prediction deviation and response indices of each model over the local disturbance intervals.

Fig. 3 Comparison between prediction deviation and response index of local disturbance interval. Note: PRDB, LPDB, TIR, and ADF (left vertical axis) are metrics where lower values indicate better performance. The LVS metric (right vertical axis) is plotted on a secondary axis, with higher values indicating stronger sensitivity to local fluctuations.

As shown in Fig. 3, VSGRNN maintains a Peak Response Deviation Bound (PRDB) of 6.7%, which is significantly lower than LightANN’s 14.6%. This demonstrates VSGRNN’s excellent peak-tracking ability without suffering from peak-blunting issues. For Local Perturbation Drift Bound (LPDB), VSGRNN records a value of 1.02, markedly lower than other models, effectively avoiding sudden oscillation errors. Regarding the Trend Inversion Rate (TIR), VSGRNN achieves a low trend error rate of 4.3%, outperforming TabNet (7.4%) and XGBoost (9.0%), both of which frequently misjudge the direction of edge fluctuations. With a Local Variation Sensitivity (LVS) score of 0.91, VSGRNN surpasses XGBoost and LightANN (both below 0.75), as these latter models tend to linearly smooth local fluctuations, losing important details. In terms of Adaptive Drift Following (ADF), VSGRNN’s drift delay rate is only 3.5%, better than LSTM + Attention’s 5.7%, demonstrating its superior sensitivity not only to abrupt changes but also to slow-varying indicator trends.

Compressive mapping and structural adaptability verification

To further assess the model’s adaptability to feature dimension compression for practical deployment, three compression ratios are tested under the MSLC mechanism: retaining 70% (light compression), 50% (medium compression), and 30% (heavy compression) of the original feature dimensions. For each compression level, four evaluation metrics are calculated: RMSE, Structure Compression Error Tolerance (SCET), Computation Time Gain (CTG), and Memory Footprint Drop Rate (MFDR). Since LSTM + Attention and XGBoost lack a unified dimension compression mechanism, they are excluded from the MSLC compression evaluation group in this section. Figure 4 presents the error tolerance and resource gain results under the different compression ratios.

Fig. 4 Error tolerance and resource benefit evaluation under different compression ratios. a is VSGRNN; b is TabNet; c is LightANN.

In Fig. 4, VSGRNN shows only a 2.1% increase in RMSE when 70% of the feature dimensions are retained (light compression), and error growth remains below 8% even when only 30% are retained (heavy compression). This stability surpasses that of TabNet (12.4% error increase) and LightANN (17.1%). These results indicate that VSGRNN’s kernel response function effectively suppresses error propagation and tolerates structural truncation well. The Structure Compression Error Tolerance (SCET) changes in sync with RMSE, confirming that the model’s compression process is a “continuous contraction” rather than an “abrupt collapse.” Regarding computational efficiency, VSGRNN achieves a computation time gain (CTG) of 33.6% and 46.1% at the medium and heavy compression levels, respectively, as the shortened feature pathways reduce the inference burden. The memory footprint drop rate (MFDR) reaches 55.6%, facilitating deployment on edge computing devices and portable terminals. By comparison, TabNet exhibits an error increase of about 7.2% at the medium compression level (50% retained), with only modest gains in CTG and MFDR, indicating considerable performance degradation under dimension compression. LightANN’s errors nearly double at the heavy compression level (30% retained), reflecting its shallow architecture’s sensitivity to input dimension changes and lack of elastic adjustment capability.

Inference delay and response jitter evaluation

In practical volleyball training applications, physical function evaluation systems require rapid response and stable inference. This study assesses the real-time performance of different models based on three key metrics:

  • Average Prediction Time (APT): The inference time per single sample, measured in milliseconds.

  • Max Response Jitter (MRJ): The maximum deviation in response time for individual samples during batch inference.

  • Cold Start Latency (CSL): The total time from model loading to readiness for inference.

All models were tested on a unified GPU platform, each running 1,000 inference tasks. The results, averaged over these runs, are presented in Table 5.
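The three latency metrics could be measured along the lines of the sketch below. It assumes that MRJ is the largest absolute deviation of a single-sample inference time from the mean, which is one reading of the definition above. The dummy model and `load_dummy_model` are hypothetical stand-ins. GPU inference would additionally need device synchronization before each timestamp, which is not shown.

```python
import time

def measure_latency(load_model, samples):
    """Return APT, MRJ, and CSL in milliseconds.

    load_model: zero-argument callable returning a callable model.
    samples:    iterable of single-sample inputs.
    """
    start = time.perf_counter()
    model = load_model()                      # cold start: loading to readiness
    csl_ms = (time.perf_counter() - start) * 1000.0

    times_ms = []
    for sample in samples:
        t0 = time.perf_counter()
        model(sample)
        times_ms.append((time.perf_counter() - t0) * 1000.0)

    apt_ms = sum(times_ms) / len(times_ms)
    mrj_ms = max(abs(t - apt_ms) for t in times_ms)  # assumed jitter definition
    return {"APT_ms": apt_ms, "MRJ_ms": mrj_ms, "CSL_ms": csl_ms}

def load_dummy_model():
    """Hypothetical stand-in for model loading."""
    return lambda sample: sum(sample) / len(sample)

# 1,000 inference tasks on 19-dimensional inputs, mirroring the test setup.
samples = [[float(i % 7)] * 19 for i in range(1000)]
print(measure_latency(load_dummy_model, samples))
```

`time.perf_counter` is used because it is a monotonic, high-resolution clock intended for measuring short durations.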

Table 5 Comparison of model inference delay and response stability.

As shown in Table 5, VSGRNN maintains high prediction accuracy while keeping inference delay within 24.1 ms, second only to the minimalist LightANN. The MRJ is just 3.8 ms, indicating extremely low variability during inference and making the model well-suited for stable output in continuous sampling scenarios. Although VSGRNN’s CSL is not as low as LightANN’s, it still offers faster loading times compared to other complex models. In contrast, despite LSTM + Attention’s strength in time-series modeling, it suffers from high inference delays and long loading times, which limit its suitability for edge deployment. TabNet and XGBoost deliver moderate performance in both loading speed and inference time but demonstrate weaker control over response fluctuations.

Generalization capability validation based on a public dataset

To further evaluate the adaptability and generalization performance of the VSGRNN model across non-specific training populations, this study incorporates the PAMAP2 public dataset as an external validation platform. This dataset includes multi-channel sensor data from nine participants performing various physical activities, such as walking, running, and stair climbing. It contains typical physiological and motion indicators, including accelerometer, gyroscope, and heart rate data, providing strong generality and dynamic diversity. To align with the model’s input structure, six physiological parameters closely related to training response were selected, and temporal window inputs were constructed to simulate fluctuations in exercise function segments. Five comparative models—VSGRNN, LSTM + Attention, XGBoost, TabNet, and LightANN—were evaluated under the same training strategy as used with the volleyball team data: sliding window, standardized preprocessing, Z-score normalization, and an 80/20 train-validation split. Since the PAMAP2 dataset does not directly provide fatigue scoring labels, a “load estimation metric” was constructed based on heart rate and activity intensity for regression prediction tasks. The results are presented in Table 6:

Table 6 Comparative performance of models on the PAMAP2 public Dataset.

Based on Table 6, VSGRNN achieves the highest R² of 0.891, outperforming all other models, indicating strong structural expressiveness in dynamic multi-channel environments. Its RMSE is controlled below 1.92, approximately 14.7% lower than XGBoost. The SMAPE of 9.34% further demonstrates its stable advantage in prediction symmetry. LSTM + Attention ranks second, showing certain strengths in capturing temporal sequences but slightly weaker compression performance on high-dimensional heterogeneous features compared to VSGRNN. TabNet and XGBoost perform moderately, while LightANN remains the weakest model, indicating limited generalization capability. These results suggest that although VSGRNN was primarily designed based on volleyball-specific data, its structural elasticity and multi-kernel fusion mechanism provide strong transferability. The model maintains stable performance on general sports datasets and shows good cross-task adaptability. This further validates VSGRNN’s robustness and engineering deployment potential in non-specific data environments.

Discussion

VSGRNN outperforms comparative models in multiple aspects because its structural mechanisms closely align with the dynamic characteristics of volleyball training data. Traditional neural networks and ensemble models have rigid architectures when processing multi-source heterogeneous physiological signals. In contrast, VSGRNN’s multi-kernel fusion strategy dynamically adjusts local response behaviors, enabling stable predictions in scenarios characterized by uneven feature activation intensities and nonlinear indicator correlations. The dynamic adjustment mechanism for the smoothing factor effectively mitigates the overfitting issue common in traditional GRNN models within high-gradient regions. This grants the model enhanced “local punishment” and “boundary repair” capabilities during high-perturbation intervals, explaining its superior performance on micro-scale indicators such as peak prediction and trend following.

The synergistic design of the MSLC compression strategy combined with structural embedded mapping allows the model to preserve discriminative feature expression pathways after dimensionality reduction, thereby minimizing performance loss. Unlike LSTM-based architectures that rely on long-term sequence memory, VSGRNN achieves rapid fitting through localized kernel responses, excelling in inference delay and jitter control. Its “local regulation plus kernel structural elasticity” architecture satisfies the dual requirements of sports evaluation systems for real-time performance and interpretability, providing strong theoretical support and practical feasibility for deployment on mobile platforms and edge devices.

The testing results on the PAMAP2 public dataset further validate VSGRNN’s generalization ability across non-specific sports scenarios. This dataset includes a variety of common physical activities and multimodal physiological sensor data, which differ significantly from the volleyball-specific training environment. Without any modifications to the model structure or core parameters, VSGRNN consistently outperforms traditional models on multiple evaluation metrics, demonstrating that its structural elasticity and multi-kernel fusion mechanism offer robust cross-domain adaptability.

Conclusion

Research contribution

This study proposed an enhanced VSGRNN for the intelligent evaluation of physical functions in volleyball training. The model integrated a heterogeneous combination of Gaussian, radial basis, and Matern kernels with a local gradient-driven dynamic smoothing factor adjustment mechanism, improving responsiveness to highly fluctuating samples. At the feature input level, a collaborative framework combining structural embedding mapping and multi-scale linear compression was developed to suppress high-dimensional data redundancy and reduce deployment costs. Comparative experiments with four mainstream models demonstrated that VSGRNN achieved superior performance across multiple dimensions, including prediction accuracy, structural compression adaptability, and inference delay control, highlighting its engineering feasibility and practical potential for large-scale deployment.

Future work and research limitations

Although the VSGRNN model exhibited strong performance across various metrics, several limitations remained. The model relied heavily on structured input data and faced challenges in directly processing unstructured information, such as subjective evaluation categories (e.g., fatigue perception scores). Furthermore, its stability in micro-sample environments required additional validation. Future research may focus on two directions: (1) developing highly elastic model architectures tailored to non-Euclidean index distributions by integrating Transformer or graph-based structures, and (2) applying knowledge distillation and model pruning techniques to further enhance the lightweight performance of VSGRNN for mobile and edge deployment.