Introduction

Drowsiness is a factor that significantly degrades the quality of the task at hand, impairing it to varying degrees. It is particularly perilous in areas of daily life where continuous attention is essential for safety. A prime example is the broad field of transportation, spanning road, sea, and air. The National Center for Statistics and Analysis (NCSA) reported 693 deaths in accidents caused by driver drowsiness in 2022 (ref. 1). In aviation, drowsiness is particularly dangerous because it can lead to large-scale crashes, as in the case of Air India Express Flight 812, in which 158 people died as a result of the pilot falling asleep (according to the Gokhale report2).

To minimize the impact of drowsiness and fatigue on transportation safety, drowsiness detection systems are being developed to notify the driver or pilot before the phenomenon occurs. Such solutions help reduce the risks associated with prolonged task performance while increasing worker productivity and efficiency by minimizing errors due to drowsiness.

Methods of detecting drowsiness

Drowsiness detection methods can be classified into three main categories3:

  1. Methods based on vehicle control parameters.

  2. Methods based on physiological parameters.

  3. Methods based on behavioral parameters.

Figure 1 presents a breakdown of the methods used for drowsiness detection and the psychophysical parameters measured for each method. Table 1 summarizes the main advantages and disadvantages of each approach.

Figure 1

Classification of drowsiness detection methods4.

Table 1 Advantages and disadvantages of drowsiness detection methods.

Methods based on vehicle control parameters

Methods for detecting drowsiness based on vehicle control parameters rely on the fact that drowsiness impairs driving performance. The strong correlation between fatigue and vehicle handling has led to several approaches for detection.

Liu, Hosking, and Lenné5 compared different methods of detecting drowsiness by analyzing vehicle control parameters. They identified two of the most effective: the standard deviation of lane position (SDLP) and steering wheel movement (SWM).

In aviation, Morris and Miller6 analyzed variations in aircraft speed, heading, altitude, and vertical speed to calculate error rates. They found that the error rate increased as pilot sleepiness progressed during simulated flights.

Research confirms that vehicle dynamics, such as lane-keeping and steering control, can serve as reliable indicators of drowsiness. For example, lane departures linked to sleep deprivation correlate strongly with fatigue levels7,8. This supports the use of vehicle performance data in drowsiness detection systems.

More recently, machine learning methods have been introduced to assess and predict drowsiness using control parameters. These systems improve upon traditional methods by analyzing real-time data9. Advanced algorithms, such as machine learning-based optimization, further enhance detection by making systems more responsive to abnormal driving behavior10. This marks a shift toward vehicle-centric detection approaches, which emphasize operational parameters rather than only physiological signals.

Despite these advances, an important limitation remains. Detection based on control parameters often occurs only after risky behaviors emerge. This can endanger both the driver and others. For this reason, predictive methods that identify drowsiness before performance declines are especially valuable.

Methods based on physiological parameters

Methods for detecting drowsiness based on physiological parameters are built on the premise that fatigue or drowsiness causes changes in physiological parameters regulated by the sympathetic and parasympathetic parts of the nervous system; by identifying these changes, safety systems can detect the state of drowsiness4. Detection increasingly relies on indicators of brain activity, particularly electroencephalography (EEG). Research indicates that EEG signals can effectively distinguish different states of alertness and fatigue through fluctuations in specific wave patterns. Notably, delta and theta rhythms are associated with reduced vigilance and increased sleepiness, making them vital for drowsiness detection11,12. According to Hu and Lodewijks13, some of the most promising methods are those based on EEG, which involve measuring the spectral power of brain waves in the alpha, beta, and theta bands.

Recent studies have increasingly applied deep learning models to EEG analysis in order to improve accuracy and robustness in fatigue detection. For example, a modified Inception-Dilated ResNet architecture proposed by Alghanim et al.14 addresses the nonstationary nature of EEG signals and enhances inter-channel feature extraction. Using spectrogram representations of EEG recordings, this hybrid model demonstrated improved performance on benchmark datasets such as Figshare and SEED-VIG. These findings highlight the potential of neural architectures to capture both temporal and spatial information from EEG data, surpassing traditional feature-engineering approaches.

Other than EEG, fatigue and drowsiness are associated with changes in the autonomic nervous system (ANS), which regulates key physiological functions such as heart rate, respiration, and ocular and skin responses. In the context of operating remote systems, including unmanned aerial vehicles (UAVs), ANS responses and physiological indicators are particularly relevant due to prolonged cognitive workload and the necessity to maintain vigilance.

Studies on UAV operators have shown that heart rate variability (HRV) parameters measured via electrocardiography (ECG) are significantly affected by the level of automation and cognitive workload during UAV missions15,16. Moreover, research on the psychophysiological state of UAV operators confirms that HRV can serve as an objective biomarker for monitoring operator condition during training and operational tasks17.

Furthermore, measurement of skin conductance (EDA) can serve as an indirect marker of sympathetic activity, exhibiting changes in response to decreased arousal and sensory-cognitive load during prolonged monitoring tasks18. In addition, multi-modal approaches combining ECG, EDA, and respiratory sensors improve the detection of drowsiness and fatigue by identifying alterations such as heart rate stabilization, reduced respiratory amplitude, and decreased tonic EDA during states of lowered alertness18. The integration of these multimodal physiological signals allows for a comprehensive assessment of UAV operators and other remote system operators, providing a promising complement or alternative to EEG-based methods in operational settings.

However, these methods require participants to wear appropriate sensors, which poses practical challenges in operational environments. Like methods based on vehicle control parameters, approaches based on physiological parameters have limitations. Wearing additional devices can reduce comfort and increase stress, activating the sympathetic nervous system and potentially affecting the reliability of measurements. A summary of the discussed parameters is provided in Table 2.

Table 2 Physiological and behavioral markers for sleepiness/fatigue assessment.

Considering these limitations, this work proposes an alternative approach based on behavioral parameters for detecting drowsiness.

Methods based on behavioral parameters

Methods based on behavioral parameters are non-invasive approaches that monitor pilot fatigue by analyzing indicators such as the eye closure time ratio, blink frequency, pupil diameter, saccadic movements (rapid eye movements), and yawning.

The relationship between drowsiness and eye movement behavior has been extensively documented, with studies indicating that decreased gaze stability and prolonged blink duration correlate with increased fatigue levels23,24. Additionally, reaction time assessments revealed that lapses in attentiveness are closely tied to drowsiness states. The Psychomotor Vigilance Test (PVT) has been utilized to pinpoint such lapses, reinforcing the link between behavioral performance and cognitive alertness during driving23,25.

Data for these parameters are analyzed in real time. Relevant features are detected on the pilot’s face and recorded to measure parameters related to drowsiness detection. In the next step, a classifier evaluates the pilot’s drowsiness state. If the pilot is drowsy, the safety system informs the vehicle operator; otherwise, the algorithm continues monitoring until drowsiness is detected26. A diagram of this algorithm is shown in Fig. 2.

Figure 2

Drowsiness detection algorithm based on behavioral parameters26. During application operation, the camera feed is read in real time, and the operator’s face is detected in the image. Based on this, selected features are analyzed to measure parameters related to drowsiness. The detected parameters include EAR, PERCLOS, MAR, yawning occurrence, and Euler head tilt angles (pitch and roll). Using these parameters, the classifier determines whether the pilot shows signs of drowsiness. If drowsiness is detected, the system generates an appropriate alert for the operator and then returns to its initial state.

A study by Morad27 found that mean pupil diameter correlates with subjective feelings of fatigue, so this parameter could potentially be used to detect drowsiness in drivers. However, one problem with recognizing drowsiness from mean pupil diameter is its high sensitivity to light intensity. For this reason, if pupil diameter is selected, its measurement must account for lighting conditions within the drowsiness detection safety system.

Some of the most characteristic symptoms of drowsiness are slow closing of the eyelids and frequent blinking of the eyes. For this reason, Wierwille and Ellsworth created a parameter named the percentage of eyelids closed (PERCLOS) to determine this relationship28. Studies have shown that the PERCLOS parameter increases as drowsiness grows, leading to reduced driver efficiency and slower responses to stimuli29. Therefore, it is one of the most reliable indicators of drowsiness30.

PERCLOS measures the percentage of time the eyelids cover the pupil, indicating slow drooping of the eyelids rather than typical blinking. The parameter is calculated by dividing the time the eyelid covers about 80% of the eye by the total measurement time. It can be expressed by equation (1).

$$\begin{aligned} PERCLOS = \displaystyle \frac{n_{close}}{N_{total}}\cdot 100\% \end{aligned}$$
(1)

where \(n_{close}\) represents the number of frames in which the eyes are closed over a predefined interval and \(N_{total}\) is the total number of frames in the same interval. A higher PERCLOS percentage indicates a higher degree of sleepiness. The typical recommended alarm threshold for the PERCLOS parameter is 15%31. However, Sommer and Golz32 note in their study that measuring only this parameter is insufficient to avoid drowsiness-related accidents.
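As a minimal sketch, equation (1) can be implemented over a per-frame boolean eye-closure signal (here assumed to be derived upstream, e.g. from an eyelid-closure threshold):

```python
def perclos(eye_closed_flags):
    """PERCLOS (eq. 1): percentage of frames with eyes closed in an interval.

    eye_closed_flags: sequence of booleans, one per frame (True = closed).
    """
    if not eye_closed_flags:
        return 0.0
    n_close = sum(eye_closed_flags)          # frames with eyes closed
    return 100.0 * n_close / len(eye_closed_flags)

# Example: 9 closed frames out of 60 -> 15.0, the typical alarm threshold
flags = [True] * 9 + [False] * 51
print(perclos(flags))  # → 15.0
```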

Another parameter that can detect drowsiness in pilots relates to saccadic movements: rapid, leaping movements of the eyeballs that shift the gaze from one point to another. They are used to scan the visual field and are essential for activities such as reading or looking around in the environment. A study by Henn, Baloh, and Hepp33, which shows a relationship between sleepiness and parameters related to saccadic movements, supports their use for this purpose. Schleicher and colleagues showed that saccadic movement duration correlates with driver and pilot sleepiness34.

Diaz-Piedra and colleagues35 conducted a study of the speed of saccadic movements in pilots before and after a helicopter flight lasting more than 2 hours. They noted that the speed of saccadic movements decreased by about 3%. They concluded that the speed of saccadic movements is a promising biomarker for detecting sleepiness.

Yawning, a common phenomenon in humans and animals, is included among the parameters correlated with sleepiness36. It is characterized by a wide mouth opening, deep inhalation, and short exhalation. The reasons for this phenomenon are varied and include lowering brain temperature37, increasing arousal38, and performing social functions39.

As fatigue increases, neck muscles relax, causing involuntary drooping or tilting of the head. These movements reflect reduced alertness and may indicate a state of lethargy40. Continuous monitoring of head tilt can therefore enable earlier detection of worsening fatigue.

In summary, behavioral parameters such as pupil diameter, eyelid closure (PERCLOS), blink rate, saccadic movements, yawning frequency, and head tilt provide reliable non-invasive indicators of drowsiness. While each parameter has its limitations when used in isolation, combining multiple indicators within an integrated system significantly improves detection accuracy. Such multimodal behavioral monitoring offers a practical and effective approach for real-time assessment of driver or pilot alertness.

Classification methods for detecting drowsiness

There are numerous ways of classifying drowsiness based on behavioral parameters. This article presents a selection of methods, most of which have been evaluated on the NTHUDDD dataset41. A broader review of machine learning systems for drowsiness detection is provided by El-Nabi42. Table 3 summarizes the key approaches.

Table 3 Drowsiness detection systems based on behavioral parameters.

Liu et al.43 propose a driver fatigue detection algorithm based on a two-stream network with multi-facial feature fusion. The approach consists of four main steps: (1) locating the eyes and mouth using multi-task cascaded convolutional networks (MTCNNs), (2) extracting static features from partial facial images, (3) extracting dynamic features from partial facial optical flow, and (4) combining static and dynamic features through a two-stream neural network for classification. By focusing on partial facial regions and fusing static and dynamic information, the method emphasizes fatigue-related cues, improving detection performance. Gamma correction is applied to enhance image contrast, particularly improving results in low-light conditions. Evaluated on the NTHUDDD dataset41, the system achieved an accuracy of 97.06%, demonstrating the effectiveness of multi-facial feature fusion combined with two-stream networks for driver fatigue detection.

Rezaee et al.44 present a real-time intelligent alarm system for detecting driver fatigue based on video sequences. The system captures the driver’s face at 15 fps and converts the images from RGB to YCbCr and HSV color spaces. The face region is segmented with high precision, and eye closure is determined using thresholding combined with facial symmetry equations. Yawning frequency is then identified via K-means clustering. Evaluated on four different video sequences totaling 35,000 frames, the system achieved an average accuracy of 93.18% and a detection rate of 92.71%. The high segmentation accuracy, low error rate, and fast processing distinguish this approach, demonstrating its potential for reducing accidents caused by driver fatigue.

Dua et al.45 present an ensemble framework combining FlowImageNet, AlexNet, VGG-FaceNet, and ResNet. By integrating diverse features related to eye blinking, yawning, and nodding, the ensemble improves robustness under varying conditions, such as changes in lighting and background. This approach achieves an accuracy of 85%, which highlights the strength of ensemble learning for generalized detection.

Guo and Markoni46 introduce a hybrid CNN–LSTM model. CNNs are used for spatial feature extraction of eyes and mouth, while a novel Time-Skip Combination LSTM (TSC-LSTM) processes temporal dependencies across multiple time intervals. This reduces prediction noise and improves stability, achieving an accuracy of 84.85% on the NTHUDDD dataset41.

Moujahid et al.47 propose a framework based on handcrafted features, including HOG, covariance, and LBP descriptors. These are extracted from pyramidal multi-level face representations, reduced via PCA, and classified with SVMs. Fusion strategies further improve robustness under difficult conditions, achieving an accuracy of 79.84%.

Finally, Wijnands et al.48 focus on mobile deployment using depthwise separable 3D CNNs optimized for smartphones. Their system integrates early fusion of spatial and temporal information. Although accuracy is lower (73.9%), the lightweight design highlights the feasibility of large-scale, cost-effective applications.

In summary, drowsiness detection systems have evolved from handcrafted feature-based methods to deep learning approaches. Two-stream and hybrid CNN–LSTM networks offer high accuracy by capturing both spatial and temporal facial features, while ensemble frameworks improve robustness under varying conditions. Handcrafted descriptors with SVMs remain competitive in challenging scenarios, and lightweight 3D CNNs enable practical real-time deployment on mobile devices. Together, these methods demonstrate the trade-offs between accuracy, robustness, and computational efficiency in behavioral-based drowsiness detection.

Methods

Program description

The purpose of the system is to detect drowsiness in real time. The input data of the application is a video feed from a live transmission. The output data includes the classification of the pilot’s drowsiness state and drowsiness-related parameters, which are displayed on the Graphical User Interface (GUI) and saved in a comma-separated values (CSV) file for archiving the application’s processed data. The detected parameters include the Eye Aspect Ratio (EAR), the Percentage of Eye Closure (PERCLOS), the Mouth Aspect Ratio (MAR), as well as the Euler head tilt angles: pitch and roll. Figure 3 presents the graphical interface of the application. The system generates visual alerts when the pilot’s drowsiness is detected.

Figure 3

The GUI contains three frames presenting information about the pilot’s drowsiness status. The left frame displays the live feed with a MediaPipe face mesh overlaid on the pilot’s face. The middle frame presents the drowsiness parameters and the pilot’s drowsiness state. The right frame shows a 3D visualization of facial points, highlighting the mouth and eye regions.

During the application execution, image frames are continuously captured from the camera’s live feed. The pilot’s face is then detected within the frame. A face mesh is applied to the detected face. Then, using that face mesh, selected drowsiness parameters are calculated. Based on the value of the parameters, the classifier infers if the pilot is showing signs of drowsiness. If drowsiness is detected, the system generates a visual alert and returns to its baseline state.
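The processing loop described above can be sketched as follows. The helper callables here are hypothetical stand-ins for the application’s actual capture (OpenCV), face-mesh detection (MediaPipe), parameter computation, classification, and alerting steps:

```python
def monitoring_loop(read_frame, detect_face_mesh, compute_parameters,
                    classify_drowsiness, raise_alert):
    """Skeleton of the real-time pipeline: capture -> face mesh ->
    drowsiness parameters -> classification -> alert.

    All five arguments are callables; names are illustrative, not the
    application's actual API.
    """
    while True:
        frame = read_frame()
        if frame is None:
            break                        # feed ended
        mesh = detect_face_mesh(frame)
        if mesh is None:
            continue                     # no face found in this frame
        params = compute_parameters(mesh)  # EAR, PERCLOS, MAR, pitch, roll
        if classify_drowsiness(params):
            raise_alert()                # visual alert, then keep monitoring
```

The loop returns to its baseline state after each alert, matching the behavior described in the text.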

Data analysis

This subsection presents methods for detecting EAR, PERCLOS, MAR, and the Euler head tilt angles. These parameters were selected as the most reliable behavioral indicators.

Other indicators, including yawning, pupil diameter, and saccadic movement speed, were considered but ultimately excluded. A model containing a yawning indicator was trained, but analysis of the feature importance plot showed that this indicator did not contribute to predicting drowsiness. Pupil diameter was excluded because the facial landmark detection technology (MediaPipe) did not provide pupil landmarks; although image processing was considered for pupil diameter detection, the results were inconsistent. Saccadic movements were also evaluated, but the extracted signals were too noisy, and even with filtering, the subtle effect of drowsiness on saccade speed rendered this feature unreliable.

Face mesh detection

In the proposed application, the MediaPipe library was used to detect 478 landmark points on the pilot’s face, following an approach similar to that presented in Lee’s work49. Figure 4 shows which points are used to compute each parameter.

Figure 4

Landmark points for calculating drowsiness parameters: (a) EAR, (b) MAR, and (c) head tilt angles (pitch and roll angles).

EAR detection

After determining the position of the face mesh, the coordinates of the key points related to the eye region were extracted. These points were then connected in pairs to form segments. Two pairs represent the degree of eye openness, whereas one represents the eye’s width. This method allows for the normalization of eye dimensions relative to the camera’s distance.

Equation (2) was then used to calculate the EAR value for each eye50, where \(P_1\) through \(P_6\) are the coordinates of selected points from the face mesh.

$$\begin{aligned} EAR = \displaystyle \frac{\left| \left| P_3-P_4 \right| \right| + \left| \left| P_5-P_6 \right| \right| }{2\cdot \left| \left| P_1-P_2 \right| \right| } \end{aligned}$$
(2)

The final EAR value is calculated as the average of the EAR values for both eyes. Following the literature51, the EAR threshold was set to 0.2: if the EAR drops below 0.2, the application registers the eyes as closed in the given frame; if the EAR is greater than or equal to 0.2, the eyes are registered as open.
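A minimal sketch of equation (2) follows; the point coordinates are illustrative 2-D values, not MediaPipe’s actual mesh indices:

```python
import math

def dist(p, q):
    """Euclidean distance between two 2-D points."""
    return math.hypot(p[0] - q[0], p[1] - q[1])

def eye_aspect_ratio(p1, p2, p3, p4, p5, p6):
    """EAR (eq. 2): (|P3-P4| + |P5-P6|) / (2 * |P1-P2|).

    P1-P2 spans the eye's width; P3-P4 and P5-P6 span its openness,
    normalizing eye dimensions with respect to camera distance."""
    return (dist(p3, p4) + dist(p5, p6)) / (2.0 * dist(p1, p2))

EAR_THRESHOLD = 0.2  # below this, the frame is registered as eyes closed

# Example: width 4, two vertical openings of 1 each -> EAR = 0.25 (open)
ear = eye_aspect_ratio((0, 0), (4, 0),
                       (1, 0.5), (1, -0.5),
                       (3, 0.5), (3, -0.5))
print(ear, ear < EAR_THRESHOLD)  # → 0.25 False
```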

PERCLOS detection

The PERCLOS value is calculated as an average value over one minute, as described in Cheng et al.’s work50. The total time the eyes were closed is determined using a moving time window method. PERCLOS is calculated via equation (1). The interval for which PERCLOS was calculated was set to 60 seconds.

MAR detection

The proposed system uses the MAR method to determine the degree of mouth opening. MAR is defined as the ratio between the height of the mouth opening and the width of the mouth, and the procedure for determining it mirrors the EAR method. The MAR value is calculated from selected key points in the mouth region using equation (3)52.

$$\begin{aligned} MAR = \displaystyle \frac{\left| \left| P_3-P_4 \right| \right| + \left| \left| P_5-P_6 \right| \right| + \left| \left| P_7-P_8 \right| \right| }{2\cdot \left| \left| P_1-P_2 \right| \right| } \end{aligned}$$
(3)
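Analogously to EAR, equation (3) can be sketched with illustrative coordinates (three vertical pairs instead of two; again, not the actual mesh indices):

```python
import math

def dist(p, q):
    """Euclidean distance between two 2-D points."""
    return math.hypot(p[0] - q[0], p[1] - q[1])

def mouth_aspect_ratio(p1, p2, p3, p4, p5, p6, p7, p8):
    """MAR (eq. 3): (|P3-P4| + |P5-P6| + |P7-P8|) / (2 * |P1-P2|).

    P1-P2 spans the mouth's width; the three remaining pairs span its
    vertical opening at different positions."""
    return (dist(p3, p4) + dist(p5, p6) + dist(p7, p8)) / (2.0 * dist(p1, p2))

# Example: width 6, three vertical openings of 2 each -> MAR = 0.5
mar = mouth_aspect_ratio((0, 0), (6, 0),
                         (1, 1), (1, -1),
                         (3, 1), (3, -1),
                         (5, 1), (5, -1))
print(mar)  # → 0.5
```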

Euler head tilt angle detection

In the proposed system, face mesh points were used to calculate the Euler angles of head tilt. Euler angles include pitch, roll, and yaw. Only pitch and roll were used because they are relevant for assessing the pilot’s drowsiness. The yaw angle was disregarded, as it does not provide useful information about drowsiness and could lead to overfitting in the trained model.

Key points from the face contour were selected to calculate these head tilt angles, and three line segments were created by pairing them. The pitch and roll angles were then calculated using trigonometric transformations, and the final values were obtained by averaging the results over all line segments. Additionally, a moving time window filter is applied to the angle values to produce a more stable result. The pitch angle was calculated using equation (4):

$$\begin{aligned} \psi = \arctan \left( \frac{y_2 -y_1}{z_2-z_1}\right) \end{aligned}$$
(4)

where:

  • \(y_1\) - y coordinate of the first point

  • \(y_2\) - y coordinate of the second point

  • \(z_1\) - z coordinate of the first point

  • \(z_2\) - z coordinate of the second point

The roll angle was computed using equation (5):

$$\begin{aligned} \theta = \arctan \left( \frac{y_2 -y_1}{x_2-x_1}\right) \end{aligned}$$
(5)

where:

  • \(y_1\) - y coordinate of the first point

  • \(y_2\) - y coordinate of the second point

  • \(x_1\) - x coordinate of the first point

  • \(x_2\) - x coordinate of the second point
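Equations (4) and (5) can be sketched as follows. `atan2` is used in place of a plain arctangent to avoid division by zero for degenerate segments, and the averaging helper mirrors the multi-segment averaging described above:

```python
import math

def pitch_deg(p1, p2):
    """Pitch (eq. 4): psi = arctan((y2 - y1) / (z2 - z1)), in degrees.

    Points are (x, y, z) tuples; atan2 avoids division by zero."""
    (_, y1, z1), (_, y2, z2) = p1, p2
    return math.degrees(math.atan2(y2 - y1, z2 - z1))

def roll_deg(p1, p2):
    """Roll (eq. 5): theta = arctan((y2 - y1) / (x2 - x1)), in degrees."""
    (x1, y1, _), (x2, y2, _) = p1, p2
    return math.degrees(math.atan2(y2 - y1, x2 - x1))

def averaged_angles(segments):
    """Average pitch and roll over several face-contour segments, as done
    before the moving-window filter. segments: list of (p1, p2) pairs."""
    pitches = [pitch_deg(a, b) for a, b in segments]
    rolls = [roll_deg(a, b) for a, b in segments]
    return sum(pitches) / len(pitches), sum(rolls) / len(rolls)
```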

Drowsiness classification

This work presents a method based on extracting selected parameters and aggregating them. The suggested method reduces the dimensionality of the input data, enables the separation of the feature detection stage from the classification stage, and minimizes the risk of model overfitting. An additional advantage of this method is the reduction of inference time and the ease of extending the model by incorporating additional parameters.

In the proposed work, the model distinguishes between two pilot states: drowsy and not drowsy. This work focuses on a random forest approach. As defined by Liu53, “Random forests are a combination machine learning algorithm that consists of a series of decision trees. Each tree casts a unit vote for the most popular class, and by combining these votes, the final classification is obtained”. Each tree must be trained on different subsets of training data and attributes.

Model training

The NTHUDDD dataset41 was used to train the model because it includes a relatively large and diverse sample of 36 individuals from various ethnic backgrounds, which minimizes the risk of overfitting to a specific population subset. Some participants wore regular glasses, others wore sunglasses, and the rest wore no eye accessories. Recordings were made both during the daytime and at nighttime, increasing the overall generality of the dataset.

The NTHUDDD dataset41 includes information about whether an individual is drowsy in specific recordings. Still, it does not provide the values of parameters such as EAR, PERCLOS, MAR, and head tilt angles (pitch and roll). Because of that, the application preprocessed the dataset so that the aforementioned parameters were computed for each frame.

A random forest model was trained on the following input parameters: EAR, MAR, and head tilt angles (pitch and roll). The scikit-learn library, which provides methods for constructing decision trees, was used to create and train the random forest model. The proposed random forest model ensembled 100 decision trees, and each decision tree was trained on a bootstrap sample consisting of 60% of the training dataset (sampled with replacement).
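Under the stated configuration (100 trees, each trained on a 60% bootstrap sample), the training setup can be sketched with scikit-learn. The feature matrix below is synthetic, standing in for the per-frame [EAR, MAR, pitch, roll] values computed from the NTHUDDD recordings:

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(42)

# Synthetic stand-in for the per-frame feature table [EAR, MAR, pitch, roll];
# the toy label rule (low first column -> drowsy) is illustrative only.
X = rng.random((500, 4))
y = (X[:, 0] < 0.2).astype(int)

# 100 trees, each fitted on a bootstrap sample (with replacement) drawn
# from 60% of the training data, matching the configuration above.
model = RandomForestClassifier(n_estimators=100, bootstrap=True,
                               max_samples=0.6, random_state=0)
model.fit(X, y)
print(f"training accuracy: {model.score(X, y):.2f}")
```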

Drowsiness classification algorithm

In the proposed work, the random forest model is employed selectively. PERCLOS is the primary determinant for classifying the drowsiness state since it is the most scientifically validated parameter for passive drowsiness detection51. If the PERCLOS value is lower than 12.5%, the system classifies the pilot as not drowsy. The pilot is classified as drowsy if the PERCLOS value exceeds 25%. If the PERCLOS values fall between 12.5% and 25%, then the random forest model determines whether the pilot is drowsy. Threshold values were adopted from Hanowski et al.’s work54. To assess their generalizability in an operational environment, functional tests were conducted on a group of five participants. The results support the suitability of the selected thresholds, though the limited sample size should be noted as a potential limitation.

For the pilot to be classified as drowsy, a certain threshold percentage of drowsiness classification needs to be exceeded within a specified time window. Because comparable methodologies are scarce in the scientific literature, the threshold values and the time window range were determined empirically. Satisfactory results were achieved using a 60-frame moving window and a critical threshold of 50% for the pilot to be alerted to potential drowsiness. Figure 5 illustrates the proposed algorithm for classifying the pilot’s drowsiness state.
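The two-stage decision rule and the 60-frame / 50% alert window can be sketched as follows; the random forest prediction is represented by a hypothetical callable:

```python
from collections import deque

def classify_frame(perclos, features, rf_is_drowsy):
    """Per-frame decision: PERCLOS thresholds first, random forest in the
    intermediate band. `rf_is_drowsy` is a callable standing in for the
    trained model's prediction on [EAR, MAR, pitch, roll]."""
    if perclos >= 25.0:
        return True                      # drowsy
    if perclos < 12.5:
        return False                     # not drowsy
    return rf_is_drowsy(features)        # 12.5% <= PERCLOS < 25%

class DrowsinessAlarm:
    """Alert when more than 50% of the last 60 frames are classified drowsy."""

    def __init__(self, window_size=60, threshold=0.5):
        self.window = deque(maxlen=window_size)
        self.threshold = threshold

    def update(self, frame_is_drowsy):
        self.window.append(frame_is_drowsy)
        return sum(self.window) / len(self.window) > self.threshold
```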

Figure 5

Proposed algorithm for drowsiness classification. Based on the calculated behavioral indicators (PERCLOS, EAR, MAR, pitch and roll angles), the algorithm proceeds as follows. If the PERCLOS value is greater than or equal to 25%, the operator is classified as drowsy. If the PERCLOS value is less than 12.5%, the operator is classified as not drowsy. For PERCLOS values between 12.5% and 25%, classification is performed using the random forest model, which determines the operator’s drowsiness state based on EAR, MAR, pitch, and roll values.

Results

The random forest model was tested on the NTHUDDD testing dataset41, which was used only after the model’s hyperparameters had been selected, and on the DROZY dataset55.

Partial dependence plots (PDPs) were generated for each feature involved in the model’s decision-making process: EAR, MAR, pitch angle, and roll angle (Fig. 6a–d). The methodology for constructing PDPs was detailed in the review56. These plots illustrate how the model’s predicted classification probability changes in response to variations in a single feature, assuming that all other parameters remain constant. In this context, output values closer to 0 indicate a “drowsy” classification, whereas values closer to 1 correspond to “not drowsy.”
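Conceptually, a PDP sweeps one feature across a grid while holding the others at their observed values and averages the model output. A minimal sketch of this procedure follows (scikit-learn’s `PartialDependenceDisplay` offers an off-the-shelf equivalent):

```python
import numpy as np

def partial_dependence(predict, X, feature_idx, grid):
    """Manual PDP: for each grid value, overwrite one feature column in a
    copy of the data, keep all other features at their observed values,
    and average the model's output over the dataset."""
    values = []
    for v in grid:
        X_mod = X.copy()
        X_mod[:, feature_idx] = v
        values.append(float(np.mean(predict(X_mod))))
    return np.array(values)
```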

Furthermore, a feature importance plot (Fig. 6e) was generated for the random forest model, illustrating each input parameter’s relative influence on the model’s final classification outcome. Higher importance indicates that a specific feature plays a larger role in the model’s decision.

A confusion matrix (Fig. 6f) was generated to evaluate the random forest model’s performance. This matrix illustrates the agreement between the model’s predictions and the ground-truth labels assigned to the samples. Each cell in the matrix represents the count of instances classified into the corresponding category (“drowsy” or “not drowsy”), with rows denoting the true labels and columns representing the predicted labels.

Figure 6

The first four subplots (a–d) present partial dependence plots (PDPs) derived from the random forest model for (a) EAR, (b) MAR, (c) pitch angle, and (d) roll angle. The x-axis represents the value of the corresponding feature, while the y-axis represents the model output, ranging from 0 to 1; values closer to 0 correspond to the drowsy state, whereas values closer to 1 correspond to the not drowsy state. Subplot (e) shows the feature importance scores for all input parameters used in the model, with higher values indicating greater importance. Subplot (f) presents the confusion matrix evaluating the classifier’s performance in drowsiness detection, where lighter colors indicate fewer classifications in a given cell and darker blue indicates more frequent classifications.

In the scientific literature42, specific metrics are used for quantitative assessment of a model, including accuracy, precision, recall, and the F1-score. Each metric serves a different purpose and examines the model from a different perspective. The model was tested on the NTHUDDD testing set41 and on the DROZY dataset55 to verify the degree of overfitting. For the random forest model, the quantitative metrics obtained on these datasets are summarized in Table 4.
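These metrics follow directly from the binary confusion counts; a minimal sketch (labels: 1 = drowsy, 0 = not drowsy):

```python
def classification_metrics(y_true, y_pred):
    """Accuracy, precision, recall and F1 for binary labels (1 = drowsy)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}
```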

Table 4 Values of the quantitative metrics for the random forest model tested on different datasets.

To verify that the application operates in real time, functional tests were performed. The application was executed for one minute while FPS measurements were recorded, and this procedure was repeated twenty times. Tests were conducted on the following hardware:

  • CPU: AMD Ryzen 5 7535HS,

  • RAM: 16 GB,

  • GPU: AMD Radeon RX 6550M.

The average FPS value was \(38.0\pm 0.7\) (standard deviation).

Discussion

Figure 6a shows the PDP for the EAR value. In the EAR range of approximately 0.10 to 0.30, a gradual transition in classification from “drowsy” to “not drowsy” was observed as EAR increased, although the change was not uniform. In the 0.10–0.15 interval, a slight upward trend was noted. Between 0.15 and 0.22, the partial dependence value remained relatively stable, followed by the most pronounced increase in the 0.22–0.30 range. Around an EAR value of 0.25, the transition from the drowsy to the not drowsy state occurred, which largely aligns with the typical EAR threshold value of 0.2 reported in the literature51. Beyond 0.30, further increases in EAR had no impact on the model’s decision.

During testing, EAR values below 0.01 were not observed, even with fully closed eyelids. Minor fluctuations observed at the EAR value of 0.01 (Fig. 6a) may result from inaccuracies in the placement of facial landmarks during the creation of the training dataset. However, these do not have a significant impact on the final classification results.

Figure 6b presents the partial dependence plot showing the relationship between the MAR parameter and the model’s decision. Two characteristic ranges can be distinguished. In the 0–0.27 interval, MAR has a negligible influence on classification, as indicated by a stable partial dependence value around 0.5. In other words, within this range, the model does not interpret minor variations in mouth opening as relevant to drowsiness detection; fluctuations in the curve may result from overfitting to specific samples in the training dataset. A notable decrease in the partial dependence value is observed only in the 0.27–0.40 range, corresponding to the classification of the drowsy state. This suggests that the model considers a more pronounced mouth opening as one of the indicators of drowsiness. This is consistent with the use of yawning as an indicator of drowsiness present in the literature42.
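As a point of reference for the EAR and MAR ranges discussed above, the commonly used eye aspect ratio formulation relates the vertical eyelid distances to the horizontal eye width (MAR is computed analogously from mouth landmarks). The landmark coordinates below are synthetic and only illustrate the geometry; they are not taken from any dataset used in this study.

```python
import math

def dist(p, q):
    return math.hypot(p[0] - q[0], p[1] - q[1])

def eye_aspect_ratio(p1, p2, p3, p4, p5, p6):
    """EAR = (|p2-p6| + |p3-p5|) / (2 * |p1-p4|), where p1 and p4 are the
    horizontal eye corners and p2, p3, p5, p6 lie on the eyelids."""
    return (dist(p2, p6) + dist(p3, p5)) / (2.0 * dist(p1, p4))

# Open eye: vertical eyelid distances comparable to the eye width.
open_ear = eye_aspect_ratio((0, 0), (1, 1), (3, 1), (4, 0), (3, -1), (1, -1))
# Nearly closed eye: eyelid points almost touching.
closed_ear = eye_aspect_ratio((0, 0), (1, 0.1), (3, 0.1), (4, 0),
                              (3, -0.1), (1, -0.1))
```

With these synthetic points the open eye yields an EAR of 0.5 and the nearly closed eye 0.05, illustrating why low EAR values drive the classifier toward the drowsy state.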

Figure 6c presents the partial dependence plot of the model with respect to the pitch parameter. Negative values indicate forward head tilt, while positive values correspond to backward tilt. Excluding observations around approximately \(2^{\circ }\), a degree of symmetry can be identified relative to \(-10^{\circ }\). Analysis of the NTHUDDD41 dataset indicates that the camera was positioned higher than the operator’s head, resulting in a systematic underestimation of pitch values as an upright posture was recorded as a slight forward tilt. This should be kept in mind if one decides to use this dataset for training machine learning models.

Taking this correction into account, it can be observed that around a pitch value of \(-10^{\circ }\) the model is more likely to classify the state as not drowsy. As the value deviates from \(-10^{\circ }\), either toward more negative or more positive angles, the probability of drowsiness classification increases, with backward head tilt leading to a slightly faster transition toward the drowsy state.

It is also worth noting the range between \(2^{\circ }\) and \(6^{\circ }\), where the partial dependence curve rises, indicating a higher probability of a not drowsy classification with more pronounced backward head tilt. This is most likely an effect of overfitting to specific training set examples in which the operator maintained such a head position without exhibiting signs of fatigue.

Furthermore, the pitch range in the NTHUDDD41 dataset spans approximately from \(-24^{\circ }\) to \(6^{\circ }\), which, assuming a systematic error of about \(10^{\circ }\), corresponds to an actual head tilt distribution within \(\pm 15^{\circ }\). This is a relatively narrow range, indicating limited representation of more extreme angles in the training data. Such a limitation may reduce the model’s ability to reliably detect drowsiness in cases where the operator’s head assumes extreme positions.

Figure 6d presents the partial dependence plot of the model with respect to the roll parameter, describing lateral head tilt. Negative values indicate tilt to the right, while positive values correspond to tilt to the left. In the range from \(-3^{\circ }\) to \(-1^{\circ }\), the model classifies the operator as more prone to drowsiness. This may be due to the fact that, in the sample frames, operators typically tilted their heads forward or backward (pitch) with only slight lateral deviation (roll). The curve also shows a degree of symmetry around \(-2^{\circ }\), which may suggest a small systematic error.

When the roll value changes by approximately \(3^{\circ }\) relative to \(-2^{\circ }\), the curve stabilizes, indicating that the model is less sensitive to further deviations along this axis. At the same time, rightward head tilt is more frequently associated by the model with a not drowsy state, which may reflect overfitting to training examples where operators exhibited drowsiness primarily through forward or backward head tilt rather than lateral tilt.

It is also worth noting that the roll range observed in the NTHUDDD41 dataset spans from approximately \(-12^{\circ }\) to \(8^{\circ }\). Considering a possible systematic error of about \(2^{\circ }\), this corresponds to an actual lateral tilt distribution of roughly \(\pm 10^{\circ }\). Such a limited range may result in insufficient representation of cases with more extreme lateral head positions, thereby restricting the model’s ability to accurately detect drowsiness in situations outside the range observed in the training data.

The feature importance plot (Fig. 6e) for the random forest model revealed that the EAR is the most influential parameter in the model’s decision-making process. The impact of the EAR on classification outcomes is approximately 6.85 times greater than that of the MAR, 11.03 times greater than that of the pitch angle, and 32.2 times greater than that of the roll angle.
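Impurity-based feature importance scores like those in Fig. 6e can be obtained from a trained scikit-learn random forest via the `feature_importances_` attribute. The synthetic data below, in which the label is driven almost entirely by the first feature, is illustrative only and does not reproduce the study's importance ratios.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(42)
n = 400
X = rng.uniform(0.0, 1.0, size=(n, 4))  # stand-ins for EAR, MAR, pitch, roll
y = (X[:, 0] < 0.25).astype(int)        # label determined by 'EAR' alone

model = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)
importances = model.feature_importances_  # impurity-based, normalized to sum to 1
```

Because the synthetic label depends only on the first feature, that feature dominates the importance vector, mirroring the dominance of EAR reported above.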

Figure 6f shows the confusion matrix of the model. It can be observed that the model is more likely to predict not drowsy operators as drowsy (false positives) than drowsy operators as not drowsy (false negatives). In the authors’ opinion, this is preferable to the alternative, as undetected drowsiness could result in an accident.

These results indicate that the model exhibits a degree of overfitting to the training set, which stems from certain dataset characteristics. This is further supported by Table 4, which shows that the model performs worse on an unseen dataset, and constitutes one of the limitations of this study. To mitigate this, the training set should be expanded with a broader variety of samples from different datasets, which would reduce bias and improve the model’s generalization capability.

Guidotti et al.56 stated that artificial neural networks function as “black boxes” and lack methods for analyzing the internal workings of the model. In contrast, random forest models, despite having lower performance scores, provide ways to inspect how the algorithm processes data and, as a result, allow the identification of certain biases in datasets, which can subsequently be addressed to improve model performance.

This study has additional limitations. Although the PERCLOS thresholds were tested for generalizability in an operational environment, the sample size was small and not fully representative. Additional data are needed to confirm the validity of the selected PERCLOS values.
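PERCLOS is commonly defined as the proportion of frames within a time window during which the eyes are closed beyond some openness threshold. The sketch below illustrates this definition; the EAR threshold of 0.25 and the sample window are illustrative, not the thresholds evaluated in this study.

```python
def perclos(ear_values, closed_threshold=0.25):
    """Fraction of frames in a window where the eyes are considered
    closed, i.e. EAR falls below the threshold."""
    closed = sum(1 for ear in ear_values if ear < closed_threshold)
    return closed / len(ear_values)

# Illustrative per-frame EAR values for one window:
window = [0.30, 0.28, 0.10, 0.08, 0.12, 0.31, 0.29, 0.09, 0.27, 0.26]
p = perclos(window)  # 4 of 10 frames below threshold -> 0.4
```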

Moreover, the current model performs only binary classification, which limits its sensitivity during the early stages of drowsiness. This restriction may delay detection and increase the risk of accidents. Drowsiness can be more effectively modeled using the Karolinska Sleepiness Scale57, and even introducing a single intermediate class could improve detection performance under real-world operational conditions.

Furthermore, the model has not been evaluated against environmental and subject-specific variables, such as lighting conditions, background clutter, ethnicity, or the presence of glasses and facial hair. Further research is required to assess the influence of these factors on model accuracy, although some hypotheses can be drawn from the present analysis. Since the model relies on numerical values of extracted parameters, its accuracy is inherently dependent on the precision with which these parameters are obtained. In particular, the accuracy of drowsiness detection is affected by the performance of Mediapipe, which combines a face detector with a 3D mesh model applied to facial geometry58. Variations in lighting and background clutter may affect the detector’s performance, while partial obstructions, such as glasses or facial hair, can reduce landmark localization accuracy. This, in turn, reduces the reliability of the derived behavioral parameters and ultimately lowers the accuracy of drowsiness detection. Additional experiments are therefore necessary to quantify the impact of these variables on model performance.

Conclusions

While artificial neural networks dominate current drowsiness detection systems in the scientific literature42,59, this work aims to highlight the benefits that stem from the use of random forest models, particularly in terms of interpretability and bias analysis, which could result in improved performance of subsequent models.

This work could be improved upon by implementing the following:

  1. 1.

    Expanding the model with additional features, such as physiological and UAV control parameters.

  2. 2.

    Training the model on data from different datasets.

  3. 3.

    Verifying the effect of environmental and subject-specific variables on model performance.

  4. 4.

    Incorporating historical data into the model’s inference process by means of recurrent neural networks (RNNs), long short-term memory (LSTM) networks, or transformer-based models.

  5. 5.

    Introducing intermediate drowsiness classes beyond the current binary classification to increase sensitivity to early-stage drowsiness.