Abstract
Although aircraft manufacturing capability has improved, human factors still play a pivotal role in flight accidents, and fatigue is a common factor in human-caused accidents. Accurate detection of pilot fatigue could therefore help improve flight safety. This article proposes a model that recognizes fatigue with a convolutional neural network (CNN) applied to flight trainees' facial features. First, videos of the flight trainees' faces are recorded during the land-air call process while the flight simulation is run. Then, sixty-eight facial feature points are detected with the Dlib package. Fatigue feature points are derived from these facial feature points to construct a face fatigue detection model called EMF. Finally, the proposed PSO-CNN algorithm is implemented to learn and train on the dataset; the network achieves a recognition rate of 93.9% on the test set and can efficiently pinpoint the flight trainees' fatigue level. The reliability of the proposed algorithm is further validated by comparison with two machine learning models.
Introduction
The fatigue detection of pilots is an important subject in aviation because it directly affects the safety of flights and passengers1. Pilots are a primary factor in ensuring flight safety2,3,4. According to flight accident investigations, about 70% of accidents are directly related to human factors, and many of these are caused by pilots operating in a fatigued state5,6. Numerous studies have found strong correlations between facial expressions and human fatigue levels7,8. Accordingly, pilots' fatigue states can be determined efficiently by monitoring changes in their expressions during flight. Two different approaches are commonly employed to detect fatigue: (1) physiological signals and (2) facial information.
When fatigue detection is based on physiological signals, electroencephalogram (EEG), electrocardiogram (ECG), electromyogram (EMG), and other measurements are employed to obtain pilots' brain, heart, and muscle signals through various kinds of sensors, so that the fatigue stages of pilots can be measured numerically9,10. Using ECG, EMG, pulse, and respiration data collected during flights, Jiang et al.11 suggested an algorithm to evaluate pilot workload and optimize the ergonomic design of the aircraft cockpit. Wang et al.12 extracted features of pilots' EEG signals and employed a support vector machine (SVM) algorithm to classify pilots' fatigue states. Xu et al.13 suggested a model based on a hybrid multi-class Gaussian process to pinpoint pilots' fatigue levels from surface electromyographic signals measured at the pilot's neck and upper-arm muscles. Hu et al.14 provided psychological insights into available non-invasive fatigue measurements of drivers and pilots by differentiating between drowsiness and mental fatigue. Du15 investigated pilot fatigue using EEG signals.
Alternatively, fatigue information can be extracted from images of the human face. Yang16 introduced a network framework based on the length and angle features of facial grid points, representing the face structure point-wise to resolve the low accuracy of facial expression detection caused by skewed face postures. Wang et al.17 proposed a method for monitoring pilot fatigue levels based on human eye detection. You18 suggested a camera-based method using machine vision and the percentage of eyelid closure over the pupil over time (PERCLOS) algorithm to derive information for fatigue detection: pilot images were collected for facial recognition, eye recognition, and eye state determination to monitor pilots' fatigue in real time. Zhang et al.19 suggested a method that detects and tracks pilots' head positions based on a cascade CNN, which can effectively monitor the pilot's head movements during cockpit simulation training. Liu et al.20 proposed a deep learning algorithm that detects driver fatigue from facial expressions, aiming to improve the accuracy and timeliness of fatigue detection: twenty-four facial features were extracted, two parameters describing drivers' fatigue states were computed, and a fuzzy inference system was used to determine the fatigue states.
Most of the available literature models and analyzes physiological data or face images. However, few studies have focused on detecting the facial feature points of pilots. In this article, the facial feature points of flight trainees were extracted while land-air calls were simulated, and a face fatigue model, called EMF, was constructed from these feature points. A particle swarm optimization-based CNN (PSO-CNN) algorithm was then proposed to build a face fatigue recognition model and was used to determine the fatigue levels of flight trainees.
Experimental design
Participants
In this study, a total of forty male students aged between 20 and 22 years (mean 21.5 years, standard deviation 0.47 years), pursuing a degree in flight technology at the College of General Aviation and Flight of Nanjing University of Aeronautics and Astronautics (NUAA), participated in the tests. All had normal or corrected-to-normal vision, showed no vestibular symptoms or neurological disorders, and received training before the flight simulations were run. Consuming alcohol or taking neurological drugs was not permitted before the experiment. Five of the subjects had sufficient sleep (more than 8 h) the night before the experiment, while five subjects did not get sufficient sleep (5–6 h).
Experimental flight subjects
The airfield traffic pattern was chosen as the simulated flight subject, based on a comprehensive literature survey and consultation with senior flight instructors that underlined its importance. The airfield traffic pattern is a pivotal component of pilot training because it comprises the maneuvers flown around an airport: in this scenario, pilots practice takeoff, climb, turning, leveling off, descent, and landing. Figure 1 depicts the schematic diagram of an airfield traffic pattern.
Procedure
Participants completed the five-section flight simulation in the Primary Flight Simulation Laboratory at NUAA, where video recordings of the subjects' faces were collected while the pilots' land-air calls were simulated during the flights.
The Cessna C172SP Skyhawk airplane was used, and Beijing Capital International Airport was chosen as the experimental setting. The airport's environmental conditions were set to clear skies with a wind velocity of 5–15 knots. The simulated flight followed the airfield traffic pattern and lasted about 12 min. Figure 2 depicts the whole process and shows a picture of a flight trainee's face.
The experiment required the flight trainees to properly maneuver the airplane and make land-air calls during the simulation. The whole experiment was videotaped, and video clips of maneuvering, land-air calls, and yawning were screened. After the experiment ended, the flight trainees were asked to report how they felt while the recorded video was replayed, to collect data on their subjective state.
Extraction of facial feature points
In this article, the open-source Dlib library is employed to derive facial feature points20. The shape_predictor_68_face_landmarks.dat model in the Dlib library is used to locate facial feature points in the recorded video of the pilots' land-air calls, yielding the coordinates of 68 facial feature points in each frame. Figure 3 depicts the 68 facial feature points of a face; the picture shown is of one of the authors.
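As a minimal sketch of this step, the snippet below detects a face in each video frame with Dlib's frontal face detector and extracts the 68 landmark coordinates with the shape_predictor_68_face_landmarks.dat model; the video path, the choice of OpenCV for decoding, and the output format are illustrative assumptions rather than the authors' implementation.

```python
# Hedged sketch: per-frame 68-point landmark extraction with Dlib and OpenCV.
import cv2
import dlib

detector = dlib.get_frontal_face_detector()
predictor = dlib.shape_predictor("shape_predictor_68_face_landmarks.dat")

def landmarks_per_frame(video_path):
    """Yield a list of 68 (x, y) tuples for the first detected face in each frame."""
    cap = cv2.VideoCapture(video_path)
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
        faces = detector(gray, 0)
        if faces:
            shape = predictor(gray, faces[0])
            yield [(shape.part(i).x, shape.part(i).y) for i in range(68)]
    cap.release()
```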
Facial feature model
EMF feature model for face fatigue recognition
The EMF face fatigue model comprises three feature dimensions: the eyes, the mouth, and the facial contour. The face fatigue detection model is constructed from these three features; the face shown in the middle is one of the authors.
Eye features
When flight trainees operate the aircraft under normal conditions, the eye aspect ratio (EAR) remains stable around a certain value, except during blinking. When yawning occurs, however, the EAR changes substantially. Figure 4a depicts the eye feature points: six feature points of the eye, denoted P1–P6, are identified. Equation (1) presents the eye aspect ratio.
Mouth features
When flight trainees operate the aircraft during land-air conversations, the mouth aspect ratio (MAR) changes. When flight trainees yawn, as depicted in Fig. 5, the MAR changes markedly. Figure 4b depicts the mouth feature points; the six analogous feature points of the mouth are denoted P1–P6. Equation (2) presents the MAR.
Facial contour features
When a flight trainee yawns during a land-air conversation, his or her facial contour changes substantially and is distinct from what occurs in normal speech. Thus, the aspect ratio of the facial contour (FAR) can reflect the distinction between these two occurrences. Figure 4c depicts the facial contour feature points; the six analogous feature points of the FAR are identified from the feature points located by the Dlib package. Equation (3) presents the FAR.
The feature points of the eyes, mouth, and facial contour reflect the fatigue characteristics of the trainees well. Thus, the EMF face fatigue recognition model is constructed from the feature points of these three regions.
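Equations (1)–(3) are not reproduced in this text, so the sketch below uses the widely known six-point aspect-ratio form for the EAR and MAR and an illustrative contour ratio for the FAR; the specific landmark indices (and whether the paper defines the ratios as height/width or width/height) are assumptions, not the authors' exact formulas.

```python
# Hedged sketch of the EMF feature ratios computed from the 68 Dlib landmarks.
import numpy as np

def aspect_ratio(p1, p2, p3, p4, p5, p6):
    """Six-point ratio in the common EAR style: sum of the two 'vertical'
    distances over twice the 'horizontal' distance."""
    vertical = np.linalg.norm(p2 - p6) + np.linalg.norm(p3 - p5)
    horizontal = 2.0 * np.linalg.norm(p1 - p4)
    return vertical / horizontal

def emf_features(landmarks):
    """Compute (EAR, MAR, FAR) for one frame; landmark index choices are illustrative."""
    pts = np.asarray(landmarks, dtype=float)
    ear = aspect_ratio(*pts[[36, 37, 38, 39, 40, 41]])   # left-eye landmarks
    mar = aspect_ratio(*pts[[48, 50, 52, 54, 56, 58]])   # mouth landmarks
    # FAR: illustrative contour ratio (jaw width over face height); the paper's
    # six contour points are not reproduced here.
    far = np.linalg.norm(pts[0] - pts[16]) / (2.0 * np.linalg.norm(pts[27] - pts[8]))
    return ear, mar, far
```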
EMF's face fatigue features
Processing of facial feature data
In this study, a 20-s video of a flight trainee's face was captured at 10 frames/s with a resolution of 640 × 360. The video was converted into 200 frame images, some of which are shown in Fig. 5. The video contains normal speech and yawning, and the change in each dimension of the EMF model was analyzed from it. The portrait in Fig. 5 is of a test flight trainee.
Characteristic changes of the EMF model
The variations of the EAR, MAR, and FAR over the above 20-s video were derived, as depicted in Fig. 6. In the figure, the X-axis is the frame index (in frames) and the Y-axis is the width-to-height ratio (unitless).
In this video, normal speech occurred from frames 60 to 90 and a yawn from frames 120 to 160, and the corresponding expression changes are visible over those frame ranges in Fig. 6. The facial feature data show that when yawning occurred there was a significant variation in the EAR, MAR, and FAR, whereas during normal speech the MAR and FAR varied but the EAR varied only insignificantly. Hence, the proposed EMF model reflects the changes in facial features well.
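Combining the two sketches above, the following snippet traces the three ratios frame by frame for a clip, in the spirit of Fig. 6; the use of matplotlib and the plotting style are assumptions, and landmarks_per_frame() and emf_features() are the illustrative helpers defined earlier.

```python
# Hedged sketch: EAR/MAR/FAR curves over the frames of one face video.
import matplotlib.pyplot as plt

def plot_emf_curves(video_path):
    curves = [emf_features(lm) for lm in landmarks_per_frame(video_path)]
    ears, mars, fars = zip(*curves)
    for series, label in ((ears, "EAR"), (mars, "MAR"), (fars, "FAR")):
        plt.plot(series, label=label)
    plt.xlabel("frame")
    plt.ylabel("aspect ratio")
    plt.legend()
    plt.show()
```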
Methodology
Experimentally collected data
Recordings from 10 subjects were selected for processing and analysis. Facial feature data were labeled as representing fatigue when yawning occurred, and as non-fatigue when normal speech or no change in facial expression occurred. A total of 766 facial feature samples were derived: 225 images of flight trainees maneuvering the aircraft in the simulation, 287 of land-air talk, and 254 of yawning, labeled as 1, 2, and 3, respectively. Table 1 presents them.
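A hedged sketch of how the 766 labeled samples might be assembled into arrays is shown below; the per-class storage of the (EAR, MAR, FAR) rows and the variable names are assumptions, with only the label scheme 1/2/3 taken from Table 1.

```python
# Hedged sketch: stacking per-class EMF feature rows and attaching labels 1-3.
import numpy as np

def build_dataset(maneuver_feats, talk_feats, yawn_feats):
    """Each argument is a list of (EAR, MAR, FAR) rows for one class."""
    X = np.vstack([maneuver_feats, talk_feats, yawn_feats]).astype("float32")
    y = np.concatenate([
        np.full(len(maneuver_feats), 1),   # maneuvering the aircraft
        np.full(len(talk_feats), 2),       # land-air talk
        np.full(len(yawn_feats), 3),       # yawning
    ])
    return X, y
```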
PSO-CNN
Particle swarm optimization
Particle swarm optimization (PSO) is initialized with a group of random particles, and the optimal solution is then determined through iteration21. Equations (4) and (5) present the fundamental update expressions of the PSO.
where N represents the total number of particles; v_i denotes the velocity of particle i; rand() is a random number in (0, 1); x_i denotes the current position of particle i, i = 1, 2, …, N; c_1 and c_2 are the learning factors, generally set to 2; and pbest and gbest are the two extremes that the particle follows (the individual and global best positions).
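Since Eqs. (4) and (5) are not reproduced in the text, the sketch below implements the standard PSO velocity and position updates that match the variable definitions above (no inertia weight, c1 = c2 = 2); treating each particle position as a single scalar (for example, a hyperparameter being tuned) is an assumption made for illustration.

```python
# Hedged sketch: one iteration of the basic PSO updates.
import random

def pso_step(x, v, pbest, gbest, c1=2.0, c2=2.0):
    """x, v, pbest are length-N lists of scalars; gbest is the global best position.
    v_i <- v_i + c1*rand()*(pbest_i - x_i) + c2*rand()*(gbest - x_i)
    x_i <- x_i + v_i"""
    for i in range(len(x)):
        v[i] = (v[i]
                + c1 * random.random() * (pbest[i] - x[i])
                + c2 * random.random() * (gbest - x[i]))
        x[i] = x[i] + v[i]
    return x, v
```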
CNN
As depicted in Fig. 7, a CNN is a kind of feed-forward neural network composed of convolutional layers, pooling layers, and fully connected layers, together with activation functions22.
The convolutional layer contains multiple convolutional kernels, each of which covers an area known as the receptive field. The pooling layer performs feature selection to reduce the number of features in the input data, and the fully connected layer nonlinearly maps the extracted features to the output.
Combination of the EMF model with PSO-CNN
In this research, the combined EMF-PSO-CNN is employed to recognize facial features. The facial feature points are first detected and the aspect ratios of the EMF model are calculated; the recognition results are then visualized. The process can recognize the video and compute the aspect ratios of the three EMF dimensions synchronously.
The PSO-CNN algorithm is then employed for training, using the convolutional, pooling, and fully connected layers described above. Figure 8 depicts the structure of the algorithm.
Network training and prediction
The training process uses Keras on top of the TensorFlow deep learning framework. The experimental environment is as follows: CPU, Intel i5-10400F at 3.2 GHz; operating system, Windows 10 64-bit; programming language, Python 3.7.7; deep learning framework, TensorFlow 2.3.0 with Keras 2.4.3.
75% of the data were allocated for training and 25% for testing. After the second fully connected layer was optimized by the PSO, the optimal number of neurons was 38. Table 2 presents the parameters of the whole network.
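Table 2 is not reproduced here, so the model sketch below is only a hedged illustration of such a network in Keras: the input is assumed to be a single (EAR, MAR, FAR) triple treated as a length-3 sequence, and all layer sizes except the PSO-selected 38 neurons in the second fully connected layer are assumptions. In practice, the PSO updates sketched after Eqs. (4)–(5) would search over this neuron count (for example, with validation accuracy as the fitness).

```python
# Hedged sketch of the CNN architecture; sizes other than n_fc2=38 are illustrative.
from tensorflow.keras import layers, models

def build_pso_cnn(n_fc2=38, n_classes=3):
    return models.Sequential([
        layers.Conv1D(32, 2, padding="same", activation="relu", input_shape=(3, 1)),
        layers.MaxPooling1D(2),
        layers.Flatten(),
        layers.Dense(128, activation="relu"),    # first fully connected layer (assumed size)
        layers.Dense(n_fc2, activation="relu"),  # second fully connected layer: PSO-optimized to 38
        layers.Dense(n_classes, activation="softmax"),
    ])
```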
Figure 9 depicts the specific structure of the PSO-CNN algorithm.
Model optimization is carried out using the RMSprop optimizer with a learning rate of 0.001, a training batch size of 16, and 500 iterations. Figure 10a depicts the final detection accuracy of the proposed algorithm on the training and test data.
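A hedged training sketch with these settings is shown below; the 75%/25% split uses scikit-learn's train_test_split, the labels 1–3 are shifted to 0–2 for the sparse cross-entropy loss, and X and y are assumed to come from the dataset sketch above rather than from the authors' code.

```python
# Hedged sketch: compiling and training the CNN with the stated hyperparameters.
from sklearn.model_selection import train_test_split
from tensorflow.keras.optimizers import RMSprop

X_train, X_test, y_train, y_test = train_test_split(
    X[..., None], y - 1, test_size=0.25, random_state=0)  # reshape to (samples, 3, 1)

model = build_pso_cnn(n_fc2=38)
model.compile(optimizer=RMSprop(learning_rate=0.001),
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
history = model.fit(X_train, y_train,
                    validation_data=(X_test, y_test),
                    batch_size=16, epochs=500)
```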
After 500 training epochs, the PSO-CNN reached an accuracy of 93.9% on the validation set.
For comparison, a CNN whose fully connected layer is not optimized by the PSO was also trained; Fig. 10b depicts its final detection accuracy on the training and test datasets.
The highest accuracy of this CNN on the validation set after 500 epochs, 89.6%, is lower than the highest accuracy of the PSO-CNN.
Comparative analysis
For further comparison, a transformer deep learning algorithm is also employed; the number of attention heads in the transformer layer is set to 4, and Table 3 presents the structure of the algorithm.
The transformer model uses the same split of 75% of the data for training and 25% for testing. The iteration number is set to 500, the loss function is categorical cross-entropy, and the optimizer is the Adam algorithm. The model accuracy is depicted in Fig. 11.
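As a hedged sketch of such a comparison model, the snippet below builds a small transformer-encoder classifier in Keras with 4 attention heads on the same (EAR, MAR, FAR) input, reusing the X_train/y_train split from the training sketch above; the model width, the single encoder block, and the use of the sparse form of the cross-entropy loss for integer labels are assumptions, not the structure in Table 3 (note that layers.MultiHeadAttention requires TensorFlow ≥ 2.4, slightly newer than the version listed earlier).

```python
# Hedged sketch: a minimal transformer-encoder classifier with 4 attention heads.
from tensorflow.keras import layers, models

def build_transformer(n_heads=4, d_model=32, n_classes=3):
    inputs = layers.Input(shape=(3, 1))
    x = layers.Dense(d_model)(inputs)                            # project to model width
    attn = layers.MultiHeadAttention(num_heads=n_heads, key_dim=d_model)(x, x)
    x = layers.LayerNormalization()(x + attn)                    # residual + norm
    ff = layers.Dense(d_model, activation="relu")(x)
    x = layers.LayerNormalization()(x + ff)
    x = layers.GlobalAveragePooling1D()(x)
    outputs = layers.Dense(n_classes, activation="softmax")(x)
    return models.Model(inputs, outputs)

transformer = build_transformer()
transformer.compile(optimizer="adam", loss="sparse_categorical_crossentropy",
                    metrics=["accuracy"])
transformer.fit(X_train, y_train, validation_data=(X_test, y_test), epochs=500)
```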
To further validate the reliability of the PSO-CNN, two conventional machine learning methods, the random forest23 and support vector machine24 algorithms, were selected for comparison. Each algorithm was run for up to 50 iterations, and the optimal recognition accuracy was recorded. Table 4 summarizes the recognition results of each algorithm.
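A hedged scikit-learn sketch of these two baselines on the same split is given below; the hyperparameters shown are defaults chosen for illustration, not the settings tuned in the paper.

```python
# Hedged sketch: random forest and SVM baselines on the flattened EMF features.
from sklearn.ensemble import RandomForestClassifier
from sklearn.svm import SVC

X_train_flat = X_train.reshape(len(X_train), -1)
X_test_flat = X_test.reshape(len(X_test), -1)

rf = RandomForestClassifier(n_estimators=100, random_state=0)
rf.fit(X_train_flat, y_train)
print("RF accuracy:", rf.score(X_test_flat, y_test))

svm = SVC(kernel="rbf", C=1.0)
svm.fit(X_train_flat, y_train)
print("SVM accuracy:", svm.score(X_test_flat, y_test))
```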
Table 4 shows that, compared with the four conventional algorithms, the optimized PSO-CNN has a higher recognition rate and better robustness, and can accurately classify the facial fatigue levels of flight trainees.
Compared with the conventional machine learning algorithms, which run slightly more slowly, the proposed algorithm is fast, although it is more constrained by hardware conditions. Overall, the proposed algorithm can be used to detect the fatigue condition of flight trainees.
Verification and validation of the algorithm
To further verify the accuracy of the model, additional data were collected to verify and validate the algorithm. Ten more flight trainees majoring in flight technology were recruited according to the same criteria; their mean age was 20.5 years with a standard deviation of 0.71 years. The facial feature point extraction process is shown in Fig. 12.
A total of 358 samples were collected in the validation test: 112 images of flight trainees maneuvering the aircraft in the simulation, 124 of land-air talk, and 122 of yawning.
The facial feature points were extracted with the EMF model, and some of the resulting data are shown in Table 5.
The trained PSO-CNN model was used to recognize the validation data, achieving an accuracy of 91.2%. Its ability to recognize the flight trainees' facial expressions well verifies the reliability of the model.
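Evaluating the trained model on such an independent set could look like the hedged snippet below, where X_val and y_val are placeholders for the 358 additional samples assembled as in the earlier dataset sketch.

```python
# Hedged sketch: scoring the trained PSO-CNN on the held-out validation data.
val_loss, val_acc = model.evaluate(X_val[..., None], y_val - 1, verbose=0)
print(f"Validation accuracy: {val_acc:.3f}")   # reported as 91.2% in the paper
```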
Conclusion
The following conclusions were obtained from the investigation of flight trainees' simulated airfield traffic patterns with land-air calls.
(1) Video recordings of the flight trainees' faces during simulated land-air calls were collected, and the facial feature points were obtained with the Dlib package.
(2) Based on the extracted facial feature points of the flight trainees, the EMF fatigue model was constructed.
(3) A PSO-CNN algorithm was constructed and used to train on and predict the fatigue features of the flight trainees' facial data from the simulation, and the prediction accuracy reached 93.9%. A comparison study against the RF and SVM algorithms was run to validate the proposed algorithm.
(4) By screening the facial feature data used to detect the fatigue levels of flight trainees' faces, the training time of the model can be effectively reduced and the recognition efficiency improved.
In future research, we plan to optimize the algorithm further to improve its accuracy when flight trainees conduct land-air calls.
Data availability
The datasets generated and analyzed during the current study are not publicly available because they relate to the privacy of individual flight trainees, but they are available from the corresponding author on reasonable request.
Code availability
The algorithms proposed in the current study can be accessed via the following web links: https://github.com/shanglei6618/Python_FACE_EMF-PSO-CNN.git.
References
Lee, S. & Kim, J. K. Factors contributing to the risk of airline pilot fatigue. J. Air Transp. Manag. 67, 197–207 (2018).
Li, Y. et al. The influence of mindfulness on mental state concerning safety among civil pilots. J. Air Transp. Manag. 84, 101768 (2020).
Liu, H. et al. Flight training evaluation based on dynamic Bayesian network and fuzzy gray theory. Acta Aeronautica et Astronautica Sinica 42(08), 250–261 (2021).
Wang, R. & Gao, Z. X. Influencing factors of civil aircraft landing safety based on flight data. J. Transp. Inf. Saf. 37(04), 27–34 (2019).
Kelly, D. & Efthymiou, M. An analysis of human factors in fifty controlled flight into terrain aviation accidents from 2007 to 2017. J. Saf. Res. 69(6), 155–165 (2019).
Sun, R. S. & Wang, P. Research on the factors influencing pilot fatigue based on structural equation modeling. J. Saf. Environ. 22(06), 3252–3258 (2022).
Xu, J. J. Research of the Train Driver Fatigue Detection and Recognition System Based on Facial Features (Southwest Jiaotong University, 2010).
Guo, H. L., Wang, N. & Guo, H. Research on fatigue driving early warning system based on multiple signal characteristics. J. Commun. 39(S1), 22–29 (2018).
Dias, N. S., Carmo, J. P., Mendes, P. M. & Correia, J. H. Wireless instrumentation system based on dry electrodes for acquiring EEG signals. Med. Eng. Phys. 34(7), 972–981 (2012).
Borghini, G. et al. Measuring neurophysiological signals in aircraft pilots and car drivers for the assessment of mental workload, fatigue, and drowsiness. Neurosci. Biobehav. Rev. 44, 58–75 (2012).
Jiang, J. Y., Sun, Y. C. & Zhang, X. Study on evaluating pilot workload based on multi-source physiological data fusion. Chin. J. Ergon. 29(03), 1–10 (2023).
Wang, F. et al. Driving fatigue detection based on EEG recognition combined with maneuvering features. Instrum. 35(2), 398–404 (2014).
Xu, B. et al. Recognition of the fatigue status of pilots using BF–PSO optimized multi-class GP classification with sEMG signals. Reliab. Eng. Syst. Saf. 199, 106930 (2020).
Hu, X. & Lodewijks, G. Detecting fatigue in car drivers and aircraft pilots by using non-invasive measures: The value of differentiation of sleepiness and mental fatigue. J. Saf. Res. 72, 173–187 (2020).
Du, P. P. Pilot Fatigue State Detection based on EEG (Zhongyuan University of Technology, 2023).
Yang, Q. Expression Recognition Based on Attention Mechanism and Length Feature of Facial Landmarks (Nanjing University of Posts and Telecommunications, 2023).
Wang, J. N., Pan, W. P. & Li, Y. H. Research on human eye location and condition recognition in pilot fatigue monitoring. Aeronaut. Comput. Techn. 46(04), 78–82 (2016).
You, Y. Research on Pilot Fatigue Monitoring Technology Based on Machine Vision (University of Electronic Science and Technology of China, 2011).
Zhang, L. M. et al. Research on pilot head position tracking method based on neural network. J. Ordnance Equipm. Eng. 42(05), 88–93 (2021).
Liu, Z., Peng, Y., Hu, W. Driver fatigue detection based on deeply-learned facial expression representation. In 2018 IEEE international conference on information and automation (ICIA). IEEE, 12 (2020).
Smets, P. & Kennes, R. The transferable belief model. Artif. Intell. 66(2), 191–234 (1994).
Krizhevsky, A., Sutskever, I. & Hinton, G. E. ImageNet classification with deep convolutional neural networks. Adv. Neural Inf. Process. Syst. 25(2), 25 (2012).
Breiman, L. Random forest. Mach. Learn. 45, 5–32 (2001).
Schölkopf, B. et al. Comparing support vector machines with Gaussian kernels to radial basis function classifiers. IEEE Trans. Signal Process. 45, 2758–2765 (1997).
Funding
This research was supported by the Key Laboratory of Brain-Machine Intelligence Technology, Ministry of Education, Nanjing University of Aeronautics and Astronautics (NJ2024029); the Joint Fund of the National Natural Science Foundation of China and the Civil Aviation Administration of China (No. U2033202); the Fundamental Research Funds for the Central Universities (No. NS2022094); the "Experimental Technology Research and Development" project of Nanjing University of Aeronautics and Astronautics (No. SYJS202207Y); the first batch of industry-university-research cooperative collaborative education projects of the Ministry of Education in 2021 (No. 202101042005); and the Nanjing University of Aeronautics and Astronautics PhD short-term visiting scholar project (No. ZDGB2021024).
Author information
Authors and Affiliations
Contributions
L.S., T.P., H.L., and Y.L. wrote the main manuscript text; H.S. and H.W. provided funding support; J.Q. and M.X. processed the data. All authors reviewed the manuscript.
Corresponding author
Ethics declarations
Competing interests
The authors declare no competing interests.
Ethics declarations
The authors confirm that all studies were conducted in accordance with the 2013 revision of the Declaration of Helsinki and that the experimental protocol was approved by the Ethics Committee of Nanjing University of Aeronautics and Astronautics.
Informed consent
Informed consent was obtained from all subjects for participation and for publication of identifiable images. The portraits in Figs. 3 and 4 are of the authors themselves, and the portraits in Figs. 2, 5, 8, and 12 are of the test flight trainees.
Additional information
Publisher's note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.
About this article
Cite this article
Shang, L., Si, H., Wang, H. et al. Research on fatigue detection of flight trainees based on face EMF feature model combination with PSO-CNN algorithm. Sci Rep 14, 20641 (2024). https://doi.org/10.1038/s41598-024-71192-x