Table 4 Database size, its effect on accuracy, and overfitting considerations.

Database	Size	Number of samples (training set)	Number of samples (test set)	Effect on accuracy	Overfitting risk
Arabic Sign Language (ArSL) Dataset (Database 1)	15,086 images	13,926 images	290 images	The large training set allows the model to learn more robust features, improving generalization and reducing bias	Possible if the model is too complex relative to the dataset size; deep models such as DenseNet and ResNet may memorize specific features if not properly regularized
RGB Arabic Alphabets Sign Language Dataset (Database 2)	7,857 images	4,000 images	3,000 images	A larger dataset contributes to better model performance because it enables better feature extraction and generalization	Lower with larger datasets, but still possible if the model is overtrained without sufficient cross-validation or regularization
KArSL (Database 3)	75,300 samples	60,240 samples (80%)	15,060 samples (20%)	High accuracy potential owing to the large dataset; consistent signer data improves learning	Moderate: repeated samples from the same signers may limit generalization
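
The split sizes in Table 4 can be sanity-checked with a few lines of arithmetic. The sketch below is not part of the original work; the helper name `split_summary` and the dictionary layout are illustrative assumptions. It computes the training and test fractions for each database, along with the number of samples that fall in neither set.

```python
# Split sizes copied from Table 4: (total, training set, test set).
splits = {
    "Database 1 (ArSL)": (15_086, 13_926, 290),
    "Database 2 (RGB Arabic Alphabets)": (7_857, 4_000, 3_000),
    "Database 3 (KArSL)": (75_300, 60_240, 15_060),
}


def split_summary(total, train, test):
    """Return the train/test fractions of the total and the count in neither set."""
    return {
        "train_fraction": train / total,
        "test_fraction": test / total,
        "unaccounted": total - train - test,
    }


for name, (total, train, test) in splits.items():
    s = split_summary(total, train, test)
    print(
        f"{name}: train {s['train_fraction']:.1%}, "
        f"test {s['test_fraction']:.1%}, "
        f"unaccounted {s['unaccounted']}"
    )
```

For Database 3 the fractions match the 80%/20% split stated in the table. For Databases 1 and 2 the training and test counts do not sum to the listed size (870 and 857 samples remain, respectively); the table does not state how those remaining samples are used.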
