Table 7. 10-fold CV versus LOSO CV.
From: TATPat based explainable EEG model for neonatal seizure detection
| Feature | 10-fold CV | LOSO CV |
|---|---|---|
| Definition | The dataset is randomly divided into 10 equal folds. The model is trained on 9 folds and tested on the remaining fold. This is repeated 10 times, each time with a different fold held out for testing | In leave-one-subject-out (LOSO) CV, the model is trained on all subjects except one and tested on the left-out subject. This is repeated once for each subject in the dataset |
| Use case | Commonly used for general performance evaluation, especially on smaller datasets where subject independence is not a concern | Preferred in scenarios such as EEG signal analysis or other medical data, where each subject’s data is distinct and the model must be evaluated on unseen subjects |
| Data split | Data is split randomly into 10 parts, regardless of subject boundaries | Data is split by subject, so that each subject’s data is used exactly once for testing |
| Computational time | Typically faster, since the number of folds is fixed at 10 | Slower, especially with a large number of subjects, since each subject forms one test fold |
| Bias and variance | Balances bias and variance, giving a more generalized evaluation of the model | Reduces subject-related bias but may result in higher variance, since each fold tests the model on a single subject |
| Generalization | Offers better insight into how the model generalizes across the dataset as a whole | Gives a more reliable estimate of generalization to unseen subjects, but can be less stable because of differences between individual subjects |
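
The difference in the "Data split" row can be illustrated with a small sketch. The code below is not the authors' implementation. It is a minimal, standard-library illustration under assumed toy data (5 hypothetical subjects with 8 segments each), and the helper names `k_fold_splits` and `loso_splits` are invented for this example. It counts how many splits place the same subject in both the training and the test set.

```python
import random
from collections import defaultdict


def k_fold_splits(n_samples, k=10, seed=0):
    """Yield (train_idx, test_idx) for a random k-fold split that ignores subjects."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Fold i takes every k-th shuffled index, so fold sizes differ by at most one.
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        test_idx = sorted(folds[i])
        train_idx = sorted(
            idx for j, fold in enumerate(folds) if j != i for idx in fold
        )
        yield train_idx, test_idx


def loso_splits(subject_ids):
    """Yield (held_out_subject, train_idx, test_idx), one split per subject."""
    by_subject = defaultdict(list)
    for idx, subject in enumerate(subject_ids):
        by_subject[subject].append(idx)
    for subject in sorted(by_subject):
        test_idx = by_subject[subject]
        train_idx = [i for i, s in enumerate(subject_ids) if s != subject]
        yield subject, train_idx, test_idx


def shares_subject(subject_ids, train_idx, test_idx):
    """Return True if any subject contributes samples to both train and test."""
    train_subjects = {subject_ids[i] for i in train_idx}
    test_subjects = {subject_ids[i] for i in test_idx}
    return bool(train_subjects & test_subjects)


# Hypothetical toy data: 5 subjects with 8 segments each (40 samples in total).
subject_ids = [s for s in range(5) for _ in range(8)]

kfold = list(k_fold_splits(len(subject_ids), k=10))
loso = list(loso_splits(subject_ids))

kfold_overlap = sum(shares_subject(subject_ids, tr, te) for tr, te in kfold)
loso_overlap = sum(shares_subject(subject_ids, tr, te) for _, tr, te in loso)

print(len(kfold), "10-fold splits,", kfold_overlap, "with a subject in both sets")
print(len(loso), "LOSO splits,", loso_overlap, "with a subject in both sets")
```

On this toy data every random fold shares at least one subject with its training set, because a 4-sample test fold can never contain all 8 segments of a subject, whereas no LOSO split does. The number of LOSO splits equals the number of subjects, which is why its cost grows with the cohort size. In practice, scikit-learn's `KFold` and `LeaveOneGroupOut` splitters provide the same two schemes.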