Table 7. 10-fold cross-validation (CV) versus leave-one-subject-out (LOSO) CV.

From: TATPat based explainable EEG model for neonatal seizure detection

| Feature | 10-fold CV | LOSO CV |
| --- | --- | --- |
| Definition | The dataset is randomly divided into 10 folds of approximately equal size. The model is trained on 9 folds and tested on the remaining fold. This process is repeated 10 times, each time with a different fold used for testing. | The model is trained on all subjects except one and tested on the left-out subject. This process is repeated for each subject in the dataset. |
| Use case | Commonly used for general performance evaluation, especially on smaller datasets where subject independence is not a concern. | Preferred in scenarios such as EEG signal analysis or other medical data, where each subject's data is distinct and the model must be evaluated on unseen subjects. |
| Data split | Data is split randomly into 10 parts, regardless of subject boundaries. | Data is split by subject, so each subject's data is used exactly once for testing. |
| Computational time | Typically faster, as it uses a fixed number of folds (10). | Slower, especially with a large number of subjects, as each subject forms one test fold. |
| Bias and variance | Balances bias and variance in the performance estimate, giving a general evaluation of the model. | Reduces subject-related bias but may result in higher variance, as the model is tested on one subject at a time. |
| Generalization | Indicates how well the model generalizes across the overall dataset, where segments from the same subject may appear in both training and test folds. | Provides a more reliable estimate of generalization to unseen subjects but can be less stable due to individual subject differences. |
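
The difference in data splitting and fold count described in the table can be illustrated with a minimal sketch. It is not the authors' implementation: the helper names (ten_fold_splits, loso_splits, shares_subjects) and the toy data of 5 subjects with 20 segments each are assumptions made only for illustration, and the code uses the Python standard library alone.

```python
import random
from collections import defaultdict


def ten_fold_splits(n_samples, n_folds=10, seed=0):
    """Yield (train_idx, test_idx) for a shuffled k-fold split that ignores subjects."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Fold i takes every n_folds-th shuffled index, so fold sizes differ by at most 1.
    folds = [indices[i::n_folds] for i in range(n_folds)]
    for i, test_idx in enumerate(folds):
        train_idx = [j for k, fold in enumerate(folds) if k != i for j in fold]
        yield train_idx, test_idx


def loso_splits(subject_ids):
    """Yield (held_out_subject, train_idx, test_idx), one fold per subject."""
    by_subject = defaultdict(list)
    for idx, subject in enumerate(subject_ids):
        by_subject[subject].append(idx)
    for subject, test_idx in by_subject.items():
        train_idx = [i for i, s in enumerate(subject_ids) if s != subject]
        yield subject, train_idx, test_idx


def shares_subjects(subject_ids, train_idx, test_idx):
    """True if any subject contributes samples to both the training and test sets."""
    train_subjects = {subject_ids[i] for i in train_idx}
    return any(subject_ids[i] in train_subjects for i in test_idx)


# Hypothetical toy data: 5 subjects with 20 EEG segments each.
subject_ids = [f"S{n}" for n in range(1, 6) for _ in range(20)]

kfold = list(ten_fold_splits(len(subject_ids)))
loso = list(loso_splits(subject_ids))

print("10-fold CV folds:", len(kfold))
print("LOSO CV folds:", len(loso))
print("10-fold folds sharing subjects between train and test:",
      sum(shares_subjects(subject_ids, tr, te) for tr, te in kfold))
print("LOSO folds sharing subjects between train and test:",
      sum(shares_subjects(subject_ids, tr, te) for _, tr, te in loso))
```

On this toy data the number of 10-fold folds stays at 10 regardless of how many subjects there are, while LOSO produces one fold per subject (5 here). Every 10-fold test fold contains segments from subjects that also appear in its training set, whereas no LOSO fold does. In practice, library splitters such as scikit-learn's KFold and LeaveOneGroupOut serve the same purpose.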