Abstract
Predicting an athlete’s fatigue and stamina plays a vital role in improving overall sports performance. Identifying an athlete’s facial expressions on track and field from images is still a challenging task, because complex backgrounds and poor environmental lighting affect the recognition of expressions during play. Existing methods detect athletes’ facial expressions with RGB and traditional night-vision cameras, which operate only when a minimum level of lighting is available; these cameras do not function in low light (< 30%) or in complete darkness. Moreover, existing systems do not predict the fatigue, pain and stamina of a player on the ground in a dark environment. In this paper, facial thermal images of athletes are acquired during play and enhanced through the proposed HEOP preprocessing method. The proposed ECOC-MCSVM method then classifies the fatigue, pain and stamina of the sportsperson using facial biomarkers such as cheek raising, lip spreading, tongue position, jaw dropping and nose wrinkling. The prediction levels are optimized using Bayesian optimized Multiple Polynomial Regression analysis (BO-MPR). The proposed ECOC-MCSVM method achieves an accuracy of 97.69% for fatigue, pain and stamina prediction and is validated against existing methodologies.
Introduction
A sportsperson’s emotional status is identified by observing their facial expressions, which arise from changes in facial muscle movements. In the field of Face Emotion Recognition (FER)1,2, facial features are the predominant indicators for predicting an athlete’s facial expressions. Facial expressions and body language are the quickest and most effective means of conveying information to other people. The psychologists Paul Ekman and Wallace V. Friesen3,4 identified six universally acknowledged facial expressions: happy, angry, fearful, disgusted, sad and surprised. FER is applied in various fields4 such as human–computer interaction, market research, mental health monitoring5,6 and security systems7. Medical applications of FER include mental health monitoring, pain assessment, neurological disorders, autism spectrum disorders, and stress and fatigue monitoring8. In the sports domain, FER analysis supports performance monitoring, pre-competition preparation, post-game analysis, injury management and rehabilitation. Fatigue monitoring plays a significant role in shaping the trajectory of a sportsperson’s career. In general, fatigue is monitored through physiological markers such as sleep patterns, heart rate variation and hormonal abnormalities, which are observed immediately after the game8. The primary vital signs, namely heart rate (pulse), respiratory rate (number of breaths), blood pressure (mmHg) and body temperature (°F or °C), are used for predicting health issues in a sportsperson.
Research gap analysis
In sports analytics, fatigue is monitored using wearable and portable electronic gadgets9. The fitness trackers used by sportspersons are lightweight and portable10. These wearable devices take the form of fitness bands9, smart watches10, neckbands10,11 and smartphone apps1,12, which track the sleep patterns and activity of the sportsperson. Cardiovascular strain is measured with heart rate13,14,15 monitors during play, and the movement and cardiac rhythm of the sportsperson are tracked with actigraphy devices14,15,16. Health conditions are also monitored by observing physiological data through expensive non-invasive devices such as Electroencephalogram (EEG) and Photoplethysmography (PPG) systems. The above techniques measure the fatigue and stamina of a sportsperson using RGB images. Moreover, wearable sensors on the sportsperson’s body vibrate in various situations17,18 such as catching the ball, sweating, running and general body movement during play. The resulting false sensor readings lead to inaccurate prediction of the sportsperson’s fatigue, stamina and pain.
Problem statement
Recently, thermal image processing1,19 has emerged as an alternative to expensive medical imaging modalities20 such as MRI, CT and ultrasound. The thermal face signature of a sportsperson16,17,21 is captured from the heat radiation emitted by the face during play. Thermal cameras are independent of light intensity and can be used in dark environments19,20, whereas visible-light cameras need a minimum lighting level for image acquisition. Thermal images, however, need effective pre-processing to enhance their quality. Images acquired in the infrared band (7 to 14 µm) contain noise19 caused by environmental conditions (fog, mist, fire, low lighting and complete darkness). This noise degrades image quality, so an efficient de-noising technique is required. Existing methods detect only the facial emotions of the sportsperson; they do not predict the fatigue, stamina and pain of an athlete from thermal images. In this research work, thermal facial images are used to extract facial biomarker features such as lowering eyebrows, cheek raising, lip tightening, jaw dropping and mouth stretching. Using these biomarkers, the fatigue, stamina and pain of the sportsperson are predicted during play on the ground. The biomarker features for predicting an athlete’s pain are jaw drop, eye tightening, lip tightening and mouth stretching. For fatigue prediction, features such as eye shrinking, frowning, tongue out and visible teeth are used. Stamina is predicted using features such as cheek raising and lip spreading.
Contributions
1. To analyze the facial thermal biomarkers of sportspersons, irrespective of environmental conditions such as day and night, using the proposed Thermal Facial Sports Person (TFSP) dataset. To enhance, de-noise and normalize the facial thermal images using the proposed Histogram Equalization, Order Statistics Filter and Power Law Transform (HEOP) pre-processing algorithm.
2. To extract features of thermal biomarkers using the Block Processing based Temperature Detection (BPTD) technique and to analyze the facial temperature variation of the sportsperson on the field. To classify the fatigue, pain and stamina of the sportsperson from the extracted facial biomarkers using the proposed Error-Correcting Output Codes based Multi-class Support Vector Machine (ECOC-MCSVM) algorithm.
3. To optimize the classification model with Bayesian optimized Multiple Polynomial Regression (BO-MPR) analysis. The proposed ECOC-MCSVM model is validated by comparing it with existing algorithms.
The remainder of this paper is organized as follows. The "Introduction" section introduces FER and thermal imaging. The "Related works" section provides a detailed review of existing FER methodologies, along with their pros and cons. The proposed architecture is explained in the "Proposed methodology" section. The “Discussion” section provides a detailed analysis of the experimental setup and the results of the proposed model. Finally, the conclusion of this research article is given in the last section.
Related works
Researchers have explored different strategies for feature extraction, emotion modeling, and training and testing on FER datasets, yet achieving an optimum solution is still a challenging task. This literature survey provides a detailed review of the existing methodologies, datasets and cameras used in FER. The facial emotions of a sportsperson are predicted by selecting appropriate biomarkers. Biomarkers refer to facial geometry such as the width of the nose, interpupillary distance and jaw line shape, and to facial texture variation in skin color and tone. They also include facial landmark variations at the eye corners, tip of the nose, mouth corners and contour of the face. By analyzing the position and movement of these biomarkers, the probability of the six basic expressions (anger, disgust, fear, happiness, sadness and surprise) is estimated in FER4,5,6,9. Facial expressions are also identified through vital physiological parameters such as heart rate and breath rate2,14. Brick et al.2 predicted the heart rate of participants running on a treadmill, using a Polar RS400 sports watch to track their speed and distance. The Facial Action Coding System (FACS)12 predicts facial expressions through facial muscle movement. Facial muscles such as the frontalis, orbicularis oculi, zygomaticus major, risorius, platysma and depressor anguli oris are used for emotion classification. The Facial Emotion Recognition (FER) 2013 dataset is widely used in recent research to identify facial expressions. It was launched in the International Conference on Machine Learning (ICML) 2013 challenge and has nearly 35,000 labeled grayscale images captured from different individuals. It is suitable for deep learning models due to its scalability and reliability. Cohn-Kanade (CK+)4,5,9, JAFFE5,10, RAF-DB5, FER-20134,5,6 and AffectNet7 are open-source FER datasets.
The CK+ dataset has nearly 593 grayscale images and the JAFFE dataset has 213 grayscale images. The RAF-DB and AffectNet datasets have 8040 and 450,000 grayscale images respectively, and the FER-2013 dataset has 35,000 grayscale images. FER is also applied in the gaming industry for developing game avatars17. FER datasets are created with RGB (Red, Green and Blue) cameras such as the Sony Alpha ILCE 640013 and the Logitech HD Pro C920 webcam15. RGB images contain noise due to insufficient light, bad illumination and complex backgrounds during acquisition. To overcome this problem, thermal cameras are used, as they are independent of illumination and work in dark environments. Thermal cameras acquire the image by observing the heat radiation from the face instead of reflected light19. Very few thermal datasets, such as USTC–NVIE20, are available for facial emotion recognition, and the demand for them is high.
Lalitha et al. proposed the Histogram of Oriented Gradients (HOG) technique for analyzing human emotions. HOG divides an image into smaller regions and analyzes the intensity and direction of change through their histograms14. Local Binary Pattern (LBP), in contrast, extracts features and describes the texture of the image by comparing neighboring pixels. Unwanted signals in the images are removed using noise filters19. De-noising techniques such as spatial domain filtering (mean, median and Gaussian filters) and frequency domain filtering (Fourier transform and wavelet transform filters)10,17,18,20,21 are widely used for enhancing pixel quality. Fast median filtering and a Butterworth filter have been used to remove noise from facial images of basketball players16 before predicting their emotions.
Medical image processing using AI has created milestones in medical diagnosis and treatment planning4 by utilizing ML models such as Support Vector Machine (SVM)5,19, K-Nearest Neighbors19, Decision Tree5 and neural networks7. Deep learning algorithms such as Artificial Neural Networks (ANN)12, Convolutional Neural Networks (CNN)1,7,10,11,40, Recurrent Neural Networks (RNN), Long Short-Term Memory (LSTM) networks8,10 and Multilayer Perceptrons (MLP)9,10 are used for FER, object detection and medical image segmentation. The majority of these neural networks are pre-trained on ImageNet12 due to its reliability and scalability. Since training deep layers is a tedious task22, pre-trained networks such as Residual Network (ResNet)6, SqueezeNet7,12 and VGG10 are also used to identify facial emotions.
Santana et al. proposed a deep face model with ResNet13 to observe the correlation between runners’ facial expressions. The fatigue condition of the runners is detected with the Py-Feat toolkit and their movement is tracked with YOLO (You Only Look Once). The main drawback is that image quality is compromised by environmental factors such as low illumination and complex backgrounds. The health status of a person exercising on an indoor bike ergometer was investigated by Timmi et al.15. A Shimmer3 ECG device is attached to the participants and their heart rate is measured using a Shimmer3 Inertial Measurement Unit (IMU)23 fixed at the lumbar region. Such external wearable devices are expensive and make it difficult to monitor the movement of a sportsperson playing on the ground. Xie et al.16 predicted the fatigue condition of basketball players through bioelectric signals. The players are grouped into mild fatigue, moderate fatigue, severe fatigue and no fatigue based on their facial expressions and classified with a Weighted Support Vector Machine (WSVM). Device compatibility, user acceptance and environmental influences are common disadvantages of this model. Zhao et al.18 used WSVM algorithms to predict the disgust and happiness of sportspersons. The WSVM algorithm integrates spatial–temporal motion LBP with Gabor multi-orientation features to predict the emotions of players on the ground. However, this model could not distinguish between anger and sadness due to occlusion, rotation and illumination in the facial image. Sinhal et al. proposed a three-stage SVM model14 for classifying facial expressions with an accuracy of 89%. Its major drawback is that it requires an effective pre-processing technique with low computational time.
The Convolutional Neural Network (ConvNet) deep learning architecture6,22,24,25,26,40,41 extracts features through Oriented FAST and Rotated BRIEF (ORB) and classifies anger, disgust, fear, happiness, neutrality, sadness and surprise with an accuracy of 92.05%. Galic et al.25 emphasized the use of advanced filtering techniques such as CLAHE, wavelet de-noising and anisotropic diffusion for analyzing medical images. Medical image processing is also applied to facial images for diagnosing diseases such as Parkinson’s disease, depression and epilepsy. Hence it is evident that an effective pre-processing method is mandatory for precise classification.
Mutanu et al.27 proposed a model for reducing eye strain based on facial expressions (fatigue, glare and squint). Excessive screen time is the major reason for these eye issues. The FER2013 and CK+ images are cropped to a resolution of 48 × 48 pixels and converted to grayscale for processing. Faces are detected in the image using the Viola-Jones algorithm and classified using a VGGNet model with 9 convolution layers and 3 max pooling layers. The classification model is further optimized using the ADAM optimizer and fine-tuned using the Rectified Linear Unit (ReLU) activation function. The model is trained for 35 epochs and achieves an accuracy of approximately 77%. Kang et al.28 assessed neck pain from the estimated images using an ensemble method. The dataset was created by capturing typing, gaming and video-watching tasks for 30 min at different head pose angles (yaw, pitch, roll) using a Logitech C920 camera with a resolution of 1920 × 1080 pixels. Among 38 participants, this bagging ensemble model classifies 17 as having neck pain and 21 as healthy, achieving an accuracy of 87%. Sumedh Khodke et al.29 proposed a hybrid model (genetic algorithm and ML-based fitness function) for selecting NFL players. The player’s rating, salary and age are taken as attributes and classified using the hybrid model, which is optimized using a roster optimization algorithm. Similarly, Saif et al.30 segmented the lips using a convolutional neural network (CNN) to enhance a visual speech recognition system. The UOTletters dataset30 used for this experiment comprises 30 speakers uttering 1560 words. Emotions such as tiredness, sadness and frustration are detected from the facial expressions of the speakers. However, this model is sensitive to noise and environmental conditions, and it requires a robust dataset.
Facial recognition thus has wide application in sports, and some of the recent methodologies are compared, with their pros and cons, in Table 1.
It is inferred from Table 1 that the majority of existing research works have used datasets such as FER 2013, CK+ and JAFFE for their experiments. Although machine learning models such as Support Vector Machine (SVM), Convolutional Neural Network (CNN) and Random Forest (RF) predict the emotions and physical conditions of sports players, they still cannot handle issues such as noisy images and imbalanced datasets. Moreover, the majority of existing models need an enhanced pre-processing module to predict the facial emotions of a sportsperson despite external factors (sensor noise, environmental light and distortion). In the existing studies, generic emotion and stress analysis of sportspersons is carried out in indoor1,8,10,20 and outdoor2,13,15,34,41,42 environments. Thermal imaging is used more widely in medical image processing than in facial emotion recognition. Only a few recent research works have employed thermal images1,10,13,20 for analyzing the facial expressions of sportspersons. The proposed research work focuses on the outdoor prediction of the fatigue, stamina and pain of the sportsperson. Among the sports domains, running2,13,15,34 is the most frequently chosen sport in existing methods. Thermal image based facial recognition is an emerging field and its use is anticipated to expand in future research.
Inferences from literature survey
It is inferred that several existing methodologies fall short in classifying images affected by external factors such as lighting conditions and complex backgrounds. Most existing systems are expensive and do not classify the emotions and fatigue of a sportsperson during play. Moreover, existing systems do not classify a sportsperson’s fatigue, stamina and pain using thermal images; they only identify facial expressions. The FER datasets27,28 are classified using deep learning architectures that perform well only when trained on large datasets. Diverse datasets help to identify minute variations in facial expression, leading to precise classification, whereas constrained datasets result in lower generalization and reliability of the classification models. Moreover, variations in lighting, illumination and background introduce noise into the image. The complex patterns of facial expressions (fatigue, strain, pain and discomfort) demand diverse datasets to avoid misclassification. To address these problems, thermal images are used in this study to reduce lighting complexity and noise.
Proposed methodology
Image acquisition
A team of 5 members collected facial images of sportspersons with a thermal camera, both on the ground and during resting time. The participants (sports players) were briefed about the research work and the images were collected after obtaining appropriate ethical consent from them. The running track length is about 400 m. The sportspersons are in the age group of 20 to 30 years. The field study was carried out by the sports coordinator of “Jaya Sakthi Engineering College, Chennai-602024” and all experiments were done according to the relevant guidelines and protocols.
Extracting the facial features of a sportsperson is challenging due to unwanted noise signals, environmental lighting conditions and temperature fluctuations. Thermal cameras, however, are capable of capturing the sportsperson’s face irrespective of the environmental lighting condition. The facial images are acquired using a HIKMICRO Mini2 USB thermal camera with 256 × 192 IR resolution, a 25 Hz frame rate and a 50° wide angle. This thermal camera is capable of capturing images over a temperature range of −4 °F to 622 °F. It has flexible measurement settings such as 3-dimension presets with automatic Center Spot, Hot Spot and Cold Spot recognition, and 15 color palettes for revealing even minute facial temperature variations. In the proposed model, the Ironbow palette is selected for the image acquisition task and its justification is shown in Fig. 1a.
During outdoor acquisition, the HIKMICRO Mini2 USB thermal camera was mounted on a stabilized tripod at a fixed height (1.4 m) and a downward tilt angle (10°–15°) to ensure a standardized viewpoint (Fig. 1b). The distance between the mounted camera and the players was kept at 0.8 to 1.2 m for an optimal field of view. This controlled setup facilitates accurate ROI extraction and tracking for classifying fatigue, stamina and pain. The proposed Thermal Facial Sports Person (TFSP) dataset consists of 500 thermal images of female sportspersons and 500 thermal images of male sportspersons. Of each set of 500 images, 250 facial images are captured in the running state and the remaining 250 in the rest state, for both male and female players. Nearly 70% of the images are used for training and the remaining 30% for testing the proposed classification model.
The overall methodology of the proposed architecture is shown in Fig. 1c. After image acquisition, the images are pre-processed using the proposed HEOP algorithm to improve pixel clarity. The facial biomarkers (cheeks, lips, eyes, tongue, jaws and mouth) of the sportsperson are analyzed and segmented for classification. The proposed ECOC-MCSVM classification model predicts whether the player’s condition is fatigue, pain or stamina. The model is further optimized using the proposed BO-MPR model.
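The 70/30 partition described above can be sketched as follows. This is a minimal illustration, not the authors' procedure: it assumes the split is stratified by sex and activity state so that each of the four groups keeps the same ratio, and the function name `stratified_split` and the label layout are hypothetical.

```python
import random

def stratified_split(labels, train_fraction=0.7, seed=0):
    """Split sample indices into train/test lists, keeping every label group at the same ratio."""
    rng = random.Random(seed)
    groups = {}
    for index, label in enumerate(labels):
        groups.setdefault(label, []).append(index)
    train, test = [], []
    for label in sorted(groups):
        members = groups[label]
        rng.shuffle(members)
        cut = round(train_fraction * len(members))
        train.extend(members[:cut])
        test.extend(members[cut:])
    return train, test

# Hypothetical label layout mirroring the TFSP description:
# 250 images for each (sex, state) combination, 1000 images in total.
labels = [(sex, state)
          for sex in ("female", "male")
          for state in ("running", "rest")
          for _ in range(250)]

train_idx, test_idx = stratified_split(labels)
print(len(train_idx), len(test_idx))  # 700 300
```

Stratifying is a design choice made here so that running and resting images of both sexes appear in the training and the test portion; a plain random split would also satisfy the stated 70/30 ratio.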
HEOP—preprocessing of facial thermal images
The proposed Histogram Equalized Order statistics and Power law function (HEOP) pre-processing method enhances the performance of the proposed ECOC-MCSVM model by minimizing noise, improving generalization to unseen data and avoiding overfitting. HEOP improves the contrast of the image and redistributes the pixel intensities evenly. The proposed HEOP pre-processing method consists of cascaded filters: CLAHE, an order statistics filter and a power law transform. The number of pixels at each intensity level of the thermal image is calculated as in Eq. (1).
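The equation itself is not reproduced in this copy. Based only on the symbol definitions that accompany it, Eq. (1) presumably has the standard image-histogram form below; this is a reconstruction under that assumption rather than the authors' exact notation.

\[
H(i) = \sum_{x}\sum_{y} \delta\big(I(x,y),\, i\big), \qquad
\delta(a,b) = \begin{cases} 1, & a = b \\ 0, & a \ne b \end{cases}
\]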
where \(H(i)\) is the number of pixels with intensity \(i\), \(I\left(x,y\right)\) is the intensity level at pixel coordinates \((x,y)\) of the thermal image, and the Kronecker delta function \(\delta\) takes the value 0 or 1. The acquired thermal image has temporal variations due to weather conditions such as sunlight exposure and rain, and due to technical artifacts (sensor and calibration noise). HEOP removes uneven illumination in the image and preserves local details. The output of the HEOP pre-processing at each stage is shown in Table 2. Its performance is validated using Peak Signal-to-Noise Ratio (PSNR) and Mean Squared Error (MSE). The proposed HEOP algorithm is compared with the traditional mean filter, and it is observed that the proposed system removes noisy pixels from the thermal image more effectively than the traditional mean filter. The workflow of the proposed HEOP pre-processing method is explained with its pseudo code in Table 3.
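For reference, the two validation metrics named above can be computed as in the following minimal sketch. It uses the standard definitions of MSE and PSNR and assumes 8-bit images stored as lists of rows; the sample pixel values are made up for illustration.

```python
import math

def mse(reference, processed):
    """Mean squared error between two equally sized 2-D images (lists of rows)."""
    total, count = 0.0, 0
    for ref_row, out_row in zip(reference, processed):
        for r, p in zip(ref_row, out_row):
            total += (r - p) ** 2
            count += 1
    return total / count

def psnr(reference, processed, max_value=255):
    """Peak signal-to-noise ratio in dB; infinite when the images are identical."""
    error = mse(reference, processed)
    if error == 0:
        return math.inf
    return 10 * math.log10(max_value ** 2 / error)

# Made-up 2 x 2 example: two pixels differ by 10 grey levels.
reference = [[100, 100], [100, 100]]
processed = [[100, 110], [90, 100]]
print(mse(reference, processed))             # 50.0
print(round(psnr(reference, processed), 2))  # 31.14
```

A lower MSE and a higher PSNR indicate that the processed image is closer to the reference.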
Histogram Equalization (HE) redistributes the pixel intensities of the thermal image and enhances its clarity. The order statistics filter removes noisy pixels and yields a smooth image with fine details. The kernel size (k) is chosen as 3 × 3 because it reduces noise without degrading the edges; its empirical validation is given in Table 4. The power law function with exponent \(\gamma\) is used to control the brightness of the thermal image and suppress the background information. The value of \(\gamma\) is tuned over the range 0.4–1 to find the optimal value for enhancing image quality. From these observations, \(\gamma\) is chosen as 0.8 because it highlights the darker regions without degrading pixel quality.
The Ironbow palette of the HIKMICRO thermal camera works in both day and night environments irrespective of environmental hindrances. In hot weather, the facial temperature in the thermal image is sometimes uniformly warm, which makes it difficult to distinguish the thermal facial biomarkers. This problem is addressed by the proposed HEOP model, which enhances the pixel intensities of the thermal image through cascaded filters. The order statistics filter removes salt and pepper noise from the acquired thermal image without losing its finer details. The darker regions of the thermal image are enhanced without overexposure using the power law transform. The \(\gamma\) value adjusts the brightness and contrast of the thermal image and suppresses the background, highlighting the facial features of the sportsperson. The proposed methodology prioritizes the enhancement of both global and local contrast of the facial thermal image and its empirical validation is shown in Table 5.
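The three cascaded stages can be sketched in dependency-free Python as below. This is an illustration under stated assumptions, not the authors' implementation from Table 3: global histogram equalization stands in for CLAHE, the order statistics filter is taken to be a 3 × 3 median with edge replication, and all function names and the sample image are hypothetical. The 3 × 3 kernel and the gamma value 0.8 come from the text.

```python
from statistics import median

def equalize_histogram(image, levels=256):
    """Global histogram equalization of an 8-bit image given as a list of rows."""
    hist = [0] * levels
    for row in image:
        for value in row:
            hist[value] += 1
    total = sum(hist)
    cdf, running = [], 0
    for count in hist:
        running += count
        cdf.append(running)
    cdf_min = next(c for c in cdf if c > 0)
    if total == cdf_min:  # flat image: nothing to stretch
        return [row[:] for row in image]
    scale = (levels - 1) / (total - cdf_min)
    lut = [max(0, round((c - cdf_min) * scale)) for c in cdf]
    return [[lut[v] for v in row] for row in image]

def median_filter_3x3(image):
    """3 x 3 median filter (one kind of order statistics filter) with edge replication."""
    height, width = len(image), len(image[0])

    def pixel(y, x):
        return image[min(max(y, 0), height - 1)][min(max(x, 0), width - 1)]

    return [[median(pixel(y + dy, x + dx) for dy in (-1, 0, 1) for dx in (-1, 0, 1))
             for x in range(width)]
            for y in range(height)]

def power_law(image, gamma=0.8, levels=256):
    """Power law (gamma) transform on intensities normalized to [0, 1]."""
    top = levels - 1
    return [[round(top * (v / top) ** gamma) for v in row] for row in image]

def heop(image, gamma=0.8):
    """Cascade: equalize, then median-filter, then apply the power law."""
    return power_law(median_filter_3x3(equalize_histogram(image)), gamma)

# Made-up 4 x 4 patch with one salt-noise pixel (255).
patch = [[50, 52, 51, 53],
         [52, 255, 50, 51],
         [51, 50, 52, 53],
         [53, 51, 50, 52]]
for row in heop(patch):
    print(row)
```

In this patch the isolated 255 value is removed by the median stage, because a single outlier can never be the median of nine neighbours.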
Block processing based temperature detection (BPTD) for facial biomarkers
In the Block Processing based Temperature Detection (BPTD) algorithm, the HEOP pre-processed thermal image (256 × 192 pixels) is divided into 12 small blocks (64 × 64 pixels each) as shown in Fig. 2. The image matrix is treated as a set of square blocks and the corresponding operations are performed by traversing each individual block. These blocks cover the entire thermal image without any overlap. The temperature of the facial features is analyzed in each block using the corresponding biomarkers. The sportsperson’s eyes, lips, teeth, jaws and nose are selected as the facial biomarker parameters and the corresponding temperature is measured. The temperature of the eyes lies approximately in the range 30–38 °C under normal conditions, and the corresponding histogram is generated. The temperature ranges of all the biomarkers are analyzed in the same way to predict the sportsperson’s stamina, fatigue and pain condition. By segmenting the thermal image into 12 blocks, BPTD isolates the facial ROI and analyzes the temperature distribution patterns of the facial regions block by block rather than over the entire image. It can therefore distinguish minute temperature variations in thermal facial biomarkers such as the eyes, lips, teeth, jaws, tongue and nose. Outdoor images are often affected by sunlight reflections, which result in noisy images. This problem is addressed by the proposed HEOP pre-processing model together with BPTD, which focuses only on the blocks within the human temperature range and ignores irrelevant regions. Traditional methods using RGB images rely on visual features such as contours and edges. The BPTD model relies on thermal gradients instead, which makes it robust to occlusion and lighting conditions. The inter-subject variability of the sportspersons is handled by both the hardware setup and the normalization techniques.
The thermal facial biomarkers are identified using their respective temperature gradients. The thermal face is geometrically aligned by locating the regions with consistent thermal intensity patterns. The corresponding ROI of each facial thermal image is transformed into a common spatial coordinate system by an affine transformation (rotation, scaling and translation). The extracted ROI is then resized to a fixed input size of 64 × 64 using bilinear interpolation for classification.
The facial temperature of the sportsperson is monitored both during play in the field and during resting time, irrespective of lighting conditions. Players from three different sports (running, cricket and hockey) are analyzed for stamina, fatigue and pain conditions. The facial temperature variations of the three kinds of sportspersons under the two conditions (playing and resting) are depicted in Fig. 3.
The facial biomarkers are selected based on the corresponding protective sports equipment (PSE), as shown in Table 6. In cricket, players wear a helmet, so the thermal image blocks comprising the eyes and tongue are selected for stamina, fatigue and pain measurement. For hockey players, the jaws, teeth and tongue are selected, because hockey players run continuously and the eyes and cheeks cannot be seen properly. For athletes (runners), the eyes, nose, teeth, jaws and mouth blocks are considered for stamina, fatigue and pain measurement. During running, the temperature is high in the facial regions and fatigue is measured from the eyes, nose, teeth, jaws and mouth blocks, with an average temperature of 38 °C. Similarly, pain is measured from the mouth, eyes and jaws blocks, with an average temperature of 36 °C. Stamina is measured from the cheek, lip and jaw blocks, with an average temperature of 33 °C. It is observed that persons running on the track have a higher temperature than players at rest. For cricket players, the temperature during fatigue (37 °C) is about as high as during pain (36.5 °C). Cricket players with stamina have a moderate temperature at resting time (34 °C), and during play the facial temperature is around 37 °C. For hockey players, the temperature range is high during stamina, fatigue and pain conditions, because hockey players run throughout the game. For all players (cricket, hockey and running), the temperature off the field is somewhat lower than on the field. From Fig. 3, it is observed that, irrespective of the sport, the facial temperature lies in the range 31–36 °C for stamina, 32–37 °C for fatigue and 33–38 °C for pain. The measured temperature values are fed to the proposed ECOC-MCSVM algorithm for classifying the stamina, fatigue and pain of sportspersons.
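The block traversal described above can be sketched as follows. This is a minimal illustration, assuming that a per-pixel temperature map in °C (192 rows by 256 columns) is already available; the function names, the synthetic map and the 30–38 °C selection window are assumptions made for the example, not the authors' exact procedure.

```python
def split_into_blocks(image, block=64):
    """Yield ((block_row, block_col), tile) for non-overlapping square tiles."""
    height, width = len(image), len(image[0])
    for top in range(0, height - block + 1, block):
        for left in range(0, width - block + 1, block):
            tile = [row[left:left + block] for row in image[top:top + block]]
            yield (top // block, left // block), tile

def block_mean(tile):
    """Mean temperature of one tile."""
    values = [v for row in tile for v in row]
    return sum(values) / len(values)

def facial_blocks(temperature_map, low=30.0, high=38.0):
    """Keep only the blocks whose mean temperature lies in [low, high] degrees Celsius."""
    selected = {}
    for position, tile in split_into_blocks(temperature_map):
        mean_temp = block_mean(tile)
        if low <= mean_temp <= high:
            selected[position] = mean_temp
    return selected

# Synthetic temperature map: 25 degree background with a 35 degree region
# covering the two central blocks of the middle row.
temperature_map = [[25.0] * 256 for _ in range(192)]
for y in range(64, 128):
    for x in range(64, 192):
        temperature_map[y][x] = 35.0

print(len(list(split_into_blocks(temperature_map))))  # 12
print(facial_blocks(temperature_map))                 # {(1, 1): 35.0, (1, 2): 35.0}
```

A 256 × 192 image yields 4 × 3 = 12 tiles of 64 × 64 pixels, which matches the block count stated in the text.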
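The final resizing step can be illustrated with a small bilinear interpolation routine. This sketch covers only the resize, not the affine alignment, and it assumes the corner-aligned sampling convention (the corner pixels of input and output coincide); imaging libraries may use a different convention. The function name is hypothetical.

```python
def resize_bilinear(image, out_height=64, out_width=64):
    """Resize a 2-D image (list of rows) with corner-aligned bilinear interpolation."""
    in_height, in_width = len(image), len(image[0])
    y_scale = (in_height - 1) / (out_height - 1) if out_height > 1 else 0.0
    x_scale = (in_width - 1) / (out_width - 1) if out_width > 1 else 0.0
    result = []
    for j in range(out_height):
        y = j * y_scale
        y0 = int(y)
        y1 = min(y0 + 1, in_height - 1)
        fy = y - y0
        row = []
        for i in range(out_width):
            x = i * x_scale
            x0 = int(x)
            x1 = min(x0 + 1, in_width - 1)
            fx = x - x0
            # Interpolate along x on the two bracketing rows, then along y.
            top = image[y0][x0] * (1 - fx) + image[y0][x1] * fx
            bottom = image[y1][x0] * (1 - fx) + image[y1][x1] * fx
            row.append(top * (1 - fy) + bottom * fy)
        result.append(row)
    return result

# A 2 x 2 patch upsampled to 3 x 3: new values are averages of their neighbours.
print(resize_bilinear([[0, 10], [20, 30]], 3, 3))
# [[0.0, 5.0, 10.0], [10.0, 15.0, 20.0], [20.0, 25.0, 30.0]]

# A hypothetical 80 x 100 ROI brought to the fixed 64 x 64 classifier input size.
roi = [[float(x + y) for x in range(100)] for y in range(80)]
fixed = resize_bilinear(roi)
print(len(fixed), len(fixed[0]))  # 64 64
```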
Proposed ECOC-MCSVM based classification
The BPTD extracted temperature features are trained in the proposed Error-Correcting Output Codes based Multi-class Support Vector Machine (ECOC-MCSVM) model for classification of stamina, fatigue, and pain. This hybrid model encodes the unique arbitrary value for each class to distinguish it from other classes. The proposed method classifies three health parameter (stamina, fatigue, and pain) based on facial expressions of sportsperson. The proposed MCSVM is multiclass classifier, yet it handles the multiclass problems into set of binary SVM classifiers. The detailed flow of the proposed ECOC- MCSVM classification model is shown in Fig. 4. In the proposed ECOC- MCSVM model, 70% of 500 thermal images are used for training the model, while 30% of images are used for testing. Initially the raw thermal images are pre-processed using HEOP technique, and then BPTD is applied for temperature features extraction.
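The decoding half of an error-correcting output code can be sketched as follows. The six-bit code matrix is a hypothetical example chosen for illustration (each one-vs-rest split appears twice, giving a pairwise Hamming distance of 4); the source does not state which coding design or how many binary SVMs are used. The outputs of the binary classifiers are taken as given here rather than produced by trained SVMs.

```python
# Hypothetical code matrix: one row (codeword) per class, one column per binary learner.
CODE_MATRIX = {
    "stamina": (1, 0, 0, 1, 0, 0),
    "fatigue": (0, 1, 0, 0, 1, 0),
    "pain":    (0, 0, 1, 0, 0, 1),
}

def hamming(a, b):
    """Number of positions in which two bit sequences differ."""
    return sum(x != y for x, y in zip(a, b))

def decode(bits, code_matrix=CODE_MATRIX):
    """Return the class whose codeword is closest to the binary learner outputs."""
    return min(code_matrix, key=lambda label: hamming(bits, code_matrix[label]))

print(decode((0, 1, 0, 0, 1, 0)))  # fatigue (exact codeword)
print(decode((0, 1, 0, 0, 0, 0)))  # fatigue (one learner output flipped)
```

Because the codewords differ pairwise in four positions, one wrong binary decision still decodes to the correct class, which is the error-correcting property that motivates the ECOC design.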
The extracted features are trained in the proposed model for classification with sufficient data so that it doesn’t falls into the overfitting or underfitting. A good classification model should have balance between its bias and variation so the training and testing datasets are divided in proper ratio. In this proposed ECOC- MCSVM model, K fold cross validation is used to divide dataset into k-folds, where ‘k-1’ folds are used for training and ‘1’ fold for validation. In this proposed ECOC- MCSVM model the value for k is assigned as ‘5’ and their performance is measured in terms of accuracy. Table 7 compares the athletic person’s pain using proposed facial biomarkers across the manual measurement of pain using Graphic Rating Scale (GRS). The pain of sportsperson is measured based on the athlete’s feedback and the rating scale is between zeros to eight. Likewise, Table 8 compares the athletic person’s stamina using Stamina model. Similarly, Table 9 compares the fatigue condition of sportsperson across the proposed and manual measurement using Fatigue Assessment Scale (FAS). Bayesian Optimized- Multiple Polynomial Regression analysis (BO-MPR) predicts the relationship between the stamina, fatigue and pain conditions of the sportsperson using the linear equations as in Eq. 2.
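Eq. 2 is not reproduced in this copy. From the variable definitions that accompany it, it presumably has the standard multiple regression form below, with \(\epsilon\) denoting the error term; this is a reconstruction under that assumption.

\[
Y = \beta_{0} + \beta_{1} X_{1} + \beta_{2} X_{2} + \dots + \beta_{n} X_{n} + \epsilon
\]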
Here,\(Y\) is the dependent variable, \({X}_{1},{X}_{2},{\dots ,X}_{n}\) are the independent variables, \({\beta }_{0},{\beta }_{1,}{\beta }_{2},{\beta }_{n}\) are the coefficients and \(\in\) is the error term. This proposed BO-MPR optimized model predicts the health status of the sportsperson using the polynomial equations. The corresponding facial biomarkers are selected as the independent variables to calculate the dependent variables such as stamina, fatigue and pain. The proposed BO-MPR polynomial model analyzes the facial features of the sportsperson and predicts the stamina, fatigue and pain condition as in Table 10. The significance of the co-efficient in polynomial regression is estimated using the parameters such as ’s-statistics’ and ‘p- statistics’ values . An effective classification model has low p-value and high t-statistics value to reduce the error rate. The proposed model has higher t-statistics value and low p-value (< 0.05). Hence, it is witnessed that the proposed model identifies the stamina, fatigue and pain of the sportsperson precisely. Based on the generated coefficients (\({\beta }_{0},{\beta }_{1,}{\beta }_{2},{\beta }_{n}\)) the corresponding histograms and fitting plots for pain is shown in Fig. 5, stamina is shown in Fig. 6 and fatigue is shown in Fig. 7.
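To make the regression step concrete, the sketch below fits a second-degree polynomial in two predictors by ordinary least squares. It is an illustration under stated assumptions: the two predictors stand in for two facial biomarker features, the data are synthetic and noise-free, the choice of polynomial terms is hypothetical, and the Bayesian optimisation stage of BO-MPR is not shown.

```python
def design_row(x1, x2):
    """Polynomial terms for two predictors (assumed form, no interaction term)."""
    return [1.0, x1, x2, x1 * x1, x2 * x2]


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def fit(samples, targets):
    """Least-squares coefficients from the normal equations (X^T X) beta = X^T y."""
    rows = [design_row(x1, x2) for x1, x2 in samples]
    p = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * y for r, y in zip(rows, targets)) for i in range(p)]
    return solve(xtx, xty)


def predict(beta, x1, x2):
    """Evaluate the fitted polynomial at one sample."""
    return sum(b * t for b, t in zip(beta, design_row(x1, x2)))


# Synthetic data generated from known coefficients (illustrative values only).
TRUE_BETA = [2.0, 0.5, -1.0, 0.25, 0.1]
samples = [(float(a), float(b)) for a in range(4) for b in range(4)]
targets = [predict(TRUE_BETA, x1, x2) for x1, x2 in samples]
beta = fit(samples, targets)
```

On noise-free data the fit recovers the generating coefficients up to rounding error. With real measurements, the residuals of such a fit are what the t-statistics and p-values are computed from.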
The player’s facial features such as eyes, nose, lips and tongue are identified and analyzed using the histogram and the normal probability plot of the residuals. The features of players suffering from pain are observed and the corresponding residuals are shown in Fig. 5a and b. The proposed ECOC-MCSVM and BO-MPR models compare the actual and predicted values to evaluate their performance. Similarly, the stamina and fatigue conditions of the players are analyzed using the histograms and normal probability plots shown in Figs. 6a, b and 7a, b, respectively. The red diagonal line in Figs. 5b, 6b and 7b represents the ideal normal distribution and the blue points represent the residuals of the predictions. If the blue points fall near the red line, the residuals are approximately normally distributed, which indicates that the proposed models fit the player’s condition well. If they deviate from the red line, this indicates the presence of outliers, and the model needs more training and pre-processing. Since the proposed model uses effective pre-processing and normalization techniques, the residuals lie close to the red line. From the histogram and normal probability plot analysis, it is observed that the proposed BO-MPR model predicts the stamina, fatigue and pain conditions well. The data points are scattered near the fitting line, which is a sign of a well-trained model with a high accuracy rate.
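The normal probability plot described above can be reproduced with a short sketch. The residual values are made up for illustration, and the plotting positions (i − 0.5)/n are one common convention rather than the one necessarily used for Figs. 5b, 6b and 7b. Each sorted residual is paired with the standard-normal quantile it would occupy if the residuals were normally distributed, and the correlation of the pairs measures how closely the points follow a straight line.

```python
from statistics import NormalDist, mean


def normal_probability_points(residuals):
    """Pair each sorted residual with its theoretical standard-normal quantile."""
    ordered = sorted(residuals)
    n = len(ordered)
    quantiles = [NormalDist().inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]
    return list(zip(quantiles, ordered))


def straightness(points):
    """Pearson correlation of the plotted points; values near 1 suggest normal residuals."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in points)
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5


# Hypothetical residuals (actual minus predicted), for illustration only.
residuals = [-0.42, 0.10, 0.31, -0.05, 0.27, -0.18, 0.02, -0.33, 0.44, -0.11]
points = normal_probability_points(residuals)
r = straightness(points)
```

A correlation well below 1, or a few points far from the line at either end, would correspond to the outlier case discussed in the text.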
Experimental results and discussions
The proposed BO-MPR model is validated using accuracy, precision, recall and F1-score. To calculate these metrics, the True Positive (TP), True Negative (TN), False Positive (FP) and False Negative (FN) counts are obtained from the confusion matrix. The confusion matrix summarizes the performance of the classification model by comparing the actual and predicted classes. In this confusion matrix, the rows correspond to the predicted classes and the columns to the actual classes. The accuracy measures the overall correctness of the model and is calculated using Eq. (3).
\(\text{Accuracy}=\frac{TP+TN}{TP+TN+FP+FN}\) (3)
The precision is the proportion of true positives among all positive predictions, as in Eq. (4).
\(\text{Precision}=\frac{TP}{TP+FP}\) (4)
Recall is the proportion of actual positives that are correctly predicted, as in Eq. (5).
\(\text{Recall}=\frac{TP}{TP+FN}\) (5)
The F1-score balances precision and recall as their harmonic mean, as in Eq. (6).
\(\text{F1-score}=\frac{2\times \text{Precision}\times \text{Recall}}{\text{Precision}+\text{Recall}}\) (6)
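As a worked illustration of the four metrics, the sketch below derives TP, TN, FP and FN for one class from a three-class confusion matrix and then evaluates accuracy, precision, recall and F1-score. The counts are hypothetical and are not results from this study. Following the convention stated above, rows are predicted classes and columns are actual classes.

```python
LABELS = ["stamina", "fatigue", "pain"]

# Hypothetical counts: rows are predicted classes, columns are actual classes.
CONFUSION = [
    [48, 1, 1],
    [1, 47, 2],
    [1, 2, 47],
]


def per_class_counts(matrix, k):
    """One-vs-rest TP, TN, FP, FN for class index k."""
    tp = matrix[k][k]
    fp = sum(matrix[k]) - tp                 # predicted k, actually another class
    fn = sum(row[k] for row in matrix) - tp  # actually k, predicted another class
    total = sum(sum(row) for row in matrix)
    tn = total - tp - fp - fn
    return tp, tn, fp, fn


def metrics(matrix, k):
    """Accuracy, precision, recall and F1-score for class index k."""
    tp, tn, fp, fn = per_class_counts(matrix, k)
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return accuracy, precision, recall, f1


def overall_accuracy(matrix):
    """Fraction of all samples on the diagonal of the confusion matrix."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return correct / sum(sum(row) for row in matrix)


stamina_metrics = metrics(CONFUSION, LABELS.index("stamina"))
```

Note that the one-vs-rest accuracy of a single class counts true negatives from the other two classes, so it is generally higher than the overall multiclass accuracy computed from the diagonal.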
The proposed TFSP dataset is compared with other publicly available datasets, namely the running dataset34, Cricket38 and Hockey39, and the accuracy rates are plotted in Fig. 8.
The accuracy rate of the fivefold cross-validation process is comparatively high (Fig. 8) for all the datasets. The proposed TFSP dataset has the highest accuracy rate of about 97.5%, owing to its high pixel resolution irrespective of environmental conditions. The Hockey dataset39 has an accuracy of 94.8%, while the Cricket dataset38 has approximately 91.56%. Among all the datasets, the running dataset34 has the lowest accuracy (90%), which is due to the pixel variation caused by an external factor (lighting condition). The performance of the proposed ECOC-MCSVM classifier is measured using accuracy, precision, recall and F1-score, and the corresponding results are presented as a bar chart (Fig. 9). The efficiency of the proposed hybrid model is also validated in both day and night environments, as shown in Fig. 10. Comparing Figs. 9 and 10, the proposed hybrid ECOC-MCSVM model classifies the expressions of the sportsperson well irrespective of the environmental conditions. Hence, it is evident that the proposed hybrid model predicts the stamina, fatigue and pain conditions of the sportsperson on the ground with a good accuracy rate compared with existing methods. The accuracy of the proposed ECOC-MCSVM model is compared with the existing methodologies and datasets in Table 11.
From Table 7, it is inferred that the proposed TFSP dataset has the highest accuracy rate of 97.69% when compared with the existing methodologies. The proposed TFSP dataset has the highest prediction rate because the facial features remain identifiable even in a dark environment.
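The fivefold cross-validation behind the accuracy figures above can be sketched as a simple index split. This is a generic illustration, not the authors' pipeline: it assumes the split is applied to all 500 images, uses a fixed random seed, and omits stratification by class.

```python
import random


def k_fold_indices(n_samples, k=5, seed=0):
    """Yield (training, validation) index lists for k-fold cross-validation."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]  # k disjoint folds of near-equal size
    for i in range(k):
        validation = folds[i]
        training = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield training, validation


# With 500 images and k = 5, each round trains on 400 images and validates on 100.
splits = list(k_fold_indices(500, k=5))
```

Each image appears in exactly one validation fold, so averaging the accuracy over the five rounds uses every sample once for validation.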
Discussion
Illumination affects the classification of the player’s face during play, which was a major issue in existing methodologies. The performance of the proposed ECOC-MCSVM model is compared with the traditional SVM model and their F1-scores are plotted in Fig. 9. The traditional SVM model classifies the emotions of the sportsperson with an average of approximately 94% during the day and 92% at night. The proposed ECOC-MCSVM model classifies with an average of 98% in the day environment and 97% in the night environment. The hybrid model extracts the features from the thermal image regardless of the lighting condition because of the enhanced pre-processing module. The training time of the proposed model is around 0.956 s and the testing time is nearly 0.010 s. Hence, the stamina, fatigue and pain of the sportsperson are predicted with an appreciable accuracy and F1-score using the acquired TFSP dataset and the proposed ECOC-MCSVM model. Although the proposed model predicts the classes with a high percentage, it has an error rate of approximately 2.5%. This can be reduced by training the model with more data.
Conclusion
Most existing research has primarily utilized RGB image datasets to classify basic human emotions such as happiness, sadness and anger. Currently, there are only a few publicly available thermal datasets for predicting fatigue, stamina, and pain specifically for sportspersons. While thermal imaging has seen wide application in medical diagnostics, leading to the availability of many thermal datasets in healthcare, its application in the sports domain remains limited. To address this gap, our research work proposed the TFSP thermal image dataset for sportspersons, acquired using the HIKMICRO Mini2 USB Thermal Camera, and aimed at classifying fatigue, stamina, and pain conditions. Thermal imaging cameras capture the temperature gradients and heat distribution patterns across the facial regions, which are strong indicators of underlying physiological states. Our proposed system helps to monitor the health and performance conditions of athletes using thermal biomarkers. The proposed HEOP pre-processing model uses cascaded filters to deal with factors such as minimal lighting conditions and noisy pixels. The noisy pixels are reduced using an Order Statistics filter (k = 3 × 3) and the darker regions of the thermal image are enhanced with the Power Law Transform (\(\gamma = 0.8\)) to distinguish them from the background. The facial biomarker features are extracted by analyzing the temperature of each pixel with the proposed BPTD algorithm. Even minute variations in the thermal facial biomarkers (eyes, lips, teeth, jaws, tongue and nose) are detected by normalizing them using the affine transformation and bilinear interpolation process. Finally, the features are classified using the proposed ECOC-MCSVM model with an accuracy of approximately 98%. The main purpose of this research work is to classify the fatigue, pain and stamina of the sportsperson using thermal facial biomarkers irrespective of environmental hindrances.
Early prediction of stamina, fatigue and pain conditions helps coaches to manage the injuries, diet, medication, and exercise of the athlete. The proposed method promotes a healthy relationship between coaches and players, which is important for winning a game. The proposed approach extends traditional emotion-based models by incorporating thermal physiology into the detection process, making it relevant for sports monitoring and performance optimization. However, the proposed work still needs to be tested under different microclimatic conditions and on other datasets. As future work, we plan to evaluate the scalability and robustness of the model with different thermal cameras such as FLIR and SEEK Compact PRO in real-world scenarios.
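Two of the HEOP stages summarised above, the 3 × 3 order-statistics filter and the power-law transform with \(\gamma = 0.8\), can be sketched on a small grayscale array. This is an illustration under assumptions: the median is used as the order statistic, intensities are taken to lie in 0 to 255, border pixels use only the neighbours that exist, and the histogram equalization stage is omitted. The pixel values are made up.

```python
from statistics import median


def median_filter_3x3(image):
    """Replace each pixel by the median of its 3 x 3 neighbourhood."""
    h, w = len(image), len(image[0])
    out = [[0] * w for _ in range(h)]
    for r in range(h):
        for c in range(w):
            window = [image[rr][cc]
                      for rr in range(max(0, r - 1), min(h, r + 2))
                      for cc in range(max(0, c - 1), min(w, c + 2))]
            out[r][c] = median(window)
    return out


def power_law(image, gamma=0.8, max_value=255):
    """Apply s = max_value * (r / max_value) ** gamma to every pixel."""
    return [[round(max_value * (p / max_value) ** gamma) for p in row]
            for row in image]


# A dark 3 x 3 patch with one noisy bright pixel in the centre (hypothetical values).
noisy = [
    [10, 10, 10],
    [10, 255, 10],
    [10, 10, 10],
]
denoised = median_filter_3x3(noisy)
enhanced = power_law(denoised)
```

The median filter removes the isolated bright pixel, and because the exponent is below 1 the power-law step raises the intensity of the remaining dark pixels while leaving 0 and 255 unchanged.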
Data availability
Data will be made available upon request. Contact: Santhosh P K (corresponding author), email: santhoshpkphd24@gmail.com.
Abbreviations
- TFSP:
-
Thermal facial images of sports person
- HEOP:
-
Histogram equalization, order statistics filter and power law transform
- BPTD:
-
Block processing based temperature detection
- PSE:
-
Protective sports equipment
- ECOC-MCSVM:
-
Error correcting output codes based multi-class support vector machine (ECOC-MCSVM)
- BO-MPR:
-
Bayesian optimized multiple polynomial regression
References
Alqudah, M. Affective state recognition using thermal-based imaging: a survey. Comput. Syst. Sci. Eng. 37, 47–62. https://doi.org/10.32604/csse.2021.015222 (2021).
Brick, N. E., Mcelhinney, M. & Metcalfe, R. S. The effects of facial expression and relaxation cues on movement economy, physiological, and perceptual responses during running. Psychol. Sport Exerc. 34, 20–28 (2018).
Biró, A., Cuesta-Vargas, A. I. & Szilágyi, L. AI-assisted fatigue and stamina control for performance sports on IMU-generated multivariate times series datasets. Sensors. 24(1), 132. https://doi.org/10.3390/s24010132 (2024).
Cîrneanu, A.-L., Popescu, D. & Iordache, D. New trends in emotion recognition using image analysis by neural networks, a systematic review. Sensors. 23(16), 7092. https://doi.org/10.3390/s23167092 (2023).
Dagher, I., Dahdah, E. & Al Shakik, M. Facial expression recognition using three-stage support vector machines. Vis. Comput. Ind. Biomed. https://doi.org/10.1186/s42492-019-0034-5 (2019).
Debnath, T. et al. Four-layer Convnet to Facial Emotion Recognition With Minimal Epochs and the Significance of Data Diversity. https://doi.org/10.21203/rs.3.rs-511221/v1 (2021).
Huang, Z. Y. et al. A study on computer vision for facial emotion recognition. Sci. Rep. https://doi.org/10.1038/s41598-023-35446-4 (2023).
Kolosov, D., Kelefouras, V., Kourtessis, P. & Mporas, I. Contactless camera-based heart rate and respiratory rate monitoring using AI on hardware. Sensors. 23, 4550. https://doi.org/10.3390/s23094550 (2023).
Jiang, M. et al. IoT-based remote facial expression monitoring system with sEMG signal. In 2016 IEEE Sensors Applications Symposium (SAS), Catania, Italy, 1–6 https://doi.org/10.1109/SAS.2016.7479847 (2016).
Saganowski, S. et al. Emognition dataset: emotion recognition with self-reports, facial expressions, and physiology using wearables. Sci. Data 9, 158. https://doi.org/10.1038/s41597-022-01262-0 (2022).
Li, Y. A recognition method of athletes’ mental state in sports training based on support vector machine model. J. Electr. Comput. Eng. https://doi.org/10.1155/2022/1566664 (2022).
Pise, A. A. et al. Methods for facial expression recognition with applications in challenging situations. Comput. Intell. Neurosci. 2022, 9261438. https://doi.org/10.1155/2022/9261438 (2022).
Santana, O. J. et al. Facial expression analysis in a wild sporting environment. Multimed. Tools Appl. 82, 11395–11415. https://doi.org/10.1007/s11042-022-13654-w (2023).
Sinhal, R. A. et al. Use of color channels to extract heart beat rate remotely from videos. Biosci. Biotech. Res. Commun. 15(1) (2022).
Timme, S. & Brand, R. Affect and exertion during incremental physical exercise: Examining changes using automated facial action analysis and experiential self-report. PLoS ONE 15, e0228739. https://doi.org/10.1371/journal.pone.0228739 (2020).
Xie, Z. Fatigue monitoring and recognition during basketball sports via physiological signal analysis. Int. J. Inf. Syst. Model. Des. 13, 1–11. https://doi.org/10.4018/IJISMD.313581 (2022).
Zhan, C., Li, W., Ogunbona, P. & Safaei, F. A real-time facial expression recognition system for online games. Int. J. Comput. Games Technol. https://doi.org/10.1155/2008/542918 (2008).
Zhao, L., Wang, Z. & Zhang, G. Facial expression recognition from video sequences based on spatial-temporal motion local binary pattern and gabor multiorientation fusion histogram. Math. Probl. Eng. 2017, 1–12. https://doi.org/10.1155/2017/7206041 (2017).
Nancy, V. & Balakrishnan, G. Thermal image-based object classification for guiding the visually impaired. Comput. J. 64(11), 1747–1759. https://doi.org/10.1093/comjnl/bxaa097 (2019).
Siddiqui, M. F. H., Dhakal, P., Yang, X. & Javaid, A. Y. A survey on databases for multimodal emotion recognition and an introduction to the VIRI (visible and infrared image) database. Multimodal Technol. Interact. 6(6), 47. https://doi.org/10.3390/mti6060047 (2022).
Ly, L. & Weary, D. Facial expression in humans as a measure of empathy towards farm animals in pain. PLoS ONE 16, e0247808. https://doi.org/10.1371/journal.pone.0247808 (2021).
Ngo, Q. T. & Yoon, S. Facial expression recognition based on weighted-cluster loss and deep transfer learning using a highly imbalanced dataset. Sensors. 20(9), 2639. https://doi.org/10.3390/s20092639 (2020).
Han, Y. & Xu, Y. The research of emotion recognition based on multi-source physiological signals with data fusion. ITM Web Conf. 45, 01038. https://doi.org/10.1051/itmconf/20224501038 (2022).
Gao, T. et al. Sports video classification method based on improved deep learning. Appl. Sci. 14(2), 948. https://doi.org/10.3390/app14020948 (2024).
Galić, I., Habijan, M., Leventić, H. & Romić, K. Machine learning empowering personalized medicine: a comprehensive review of medical image analysis methods. Electronics 12(21), 4411. https://doi.org/10.3390/electronics12214411 (2023).
Seçkin, A. Ç., Ateş, B. & Seçkin, M. Review on wearable technology in sports: concepts, challenges and opportunities. Appl. Sci. 13(18), 10399. https://doi.org/10.3390/app131810399 (2023).
Mutanu, L., Gohil, J. & Gupta, K. Vision-autocorrect: a self-adapting approach towards relieving eye-strain using facial-expression recognition. Software 2(2), 197–217. https://doi.org/10.3390/software2020009 (2023).
Kang, J.-H. et al. Assessing non-specific neck pain through pose estimation from images based on ensemble learning. Life 13(12), 2292. https://doi.org/10.3390/life13122292 (2023).
Khodke, S. et al. Leveraging genetic algorithms to optimize team health and well-being toward sustainable game development and strategy. SoftwareX 25, 101635. https://doi.org/10.1016/j.softx.2024.101635 (2024).
Safi, M. E. & Abbas, E. I. Lip segmentation for visual speech recognition based on the convolution process. In 2023 International Conference on Engineering Applied and Nano Sciences (ICEANS), Erbil, Iraq, 102–106, https://doi.org/10.1109/ICEANS58413.2023.10630463 (2023).
Huxter, K., Atkin, A. E. & Singhal, A. Perception of emotion in the facial expressions and body language of athletes. Soc. Behav. Personal. Int. J. 51(4), 1–12 (2023).
Sarlis, V., Papageorgiou, G. & Tjortjis, C. Injury patterns and impact on performance in the NBA league using sports analytics. Computation. 12(2), 36. https://doi.org/10.3390/computation12020036 (2024).
Makhmudov, F., Turimov, D., Xamidov, M., Nazarov, F. & Cho, Y.-I. Real-time fatigue detection algorithms using machine learning for yawning and eye state. Sensors. 24(23), 7810. https://doi.org/10.3390/s24237810 (2024).
https://www.kaggle.com/datasets/mexwell/long-distance-running-dataset.
Haefeli, M. & Elfering, A. Pain assessment. Eur. Spine J. 15(Suppl 1), S17-24. https://doi.org/10.1007/s00586-005-1044-x (2006) (Epub 2005 Dec 1).
Hendriks, C., Drent, M., Elfferich, M. & De Vries, J. The Fatigue Assessment Scale: quality and availability in sarcoidosis and other diseases. Curr. Opin. Pulm. Med. 24(5), 495–503. https://doi.org/10.1097/MCP.0000000000000496 (2018).
Svartengren, M. & Hellman, T. Study protocol of an effect and process evaluation of the stamina model; a structured and time-effective approach through methods for an inclusive and active working life. BMC Public Health 18(1), 1070. https://doi.org/10.1186/s12889-018-5807-9 (2018).
https://www.kaggle.com/datasets/open-source-sports/professional-hockey-database.
Lalitha, S. D. & Thyagharajan, K. K. Micro-Facial Expression Recognition in Video Based on Optimal Convolutional Neural Network (MFEOCNN) Algorithm (2020).
Host, K., Pobar, M. & Ivasic-Kos, M. Analysis of movement and activities of handball players using deep neural networks. J. Imaging. 9(4), 80. https://doi.org/10.3390/jimaging9040080 (2023).
Cossich, V. R. A., Carlgren, D., Holash, R. J. & Katz, L. Technological breakthroughs in sport: current practice and future potential of artificial intelligence, virtual reality, augmented reality, and modern data visualization in performance analysis. Appl. Sci. 13(23), 12965. https://doi.org/10.3390/app13231296 (2023).
Acknowledgements
We acknowledge “Chase Technologies, Avadi, Chennai, Tamil Nadu, India” for their support.
Author information
Contributions
P. K. Santhosh (corresponding author): data collection, methodology formulation, writing of the research article. B. Kaarthick: planning, analysis of the study, review of the research article.
Ethics declarations
Informed consent
Informed consent was obtained from all subjects for publication of identifying images and information in an online open-access publication.
Competing interests
The authors declare no competing interests.
Additional information
Publisher’s note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.
About this article
Cite this article
Santhosh, P.K., Kaarthick, B. Fatigue and stamina prediction of athletic person on track using thermal facial biomarkers and optimized machine learning algorithm. Sci Rep 15, 25974 (2025). https://doi.org/10.1038/s41598-025-10757-w
DOI: https://doi.org/10.1038/s41598-025-10757-w












