Table 1 Split of the Arabic character datasets into training (80%), validation (10%), and testing (10%) sets.

From: Integrating CNN and transformer architectures for superior Arabic printed and handwriting characters classification

Dataset             Splitting set    Number of images   Total
Arabic OCR dataset  Training set     101,610            127,027
                    Validation set   12,716
                    Testing set      12,701
AHCR dataset        Training set     13,261             16,591
                    Validation set   1,668
                    Testing set      1,662
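
The counts in Table 1 can be checked against the stated 80%/10%/10% proportions with a few lines of arithmetic. The sketch below is only a sanity check on the published numbers, not the authors' splitting procedure. The names `splits` and `split_shares` are illustrative and do not come from the article.

```python
# Image counts copied from Table 1; the dictionary layout is an assumption
# made for this illustration.
splits = {
    "Arabic OCR dataset": {"training": 101_610, "validation": 12_716, "testing": 12_701},
    "AHCR dataset": {"training": 13_261, "validation": 1_668, "testing": 1_662},
}


def split_shares(counts):
    """Return (total, {split name: percentage of total}) for one dataset."""
    total = sum(counts.values())
    shares = {name: 100 * n / total for name, n in counts.items()}
    return total, shares


for dataset, counts in splits.items():
    total, shares = split_shares(counts)
    rounded = {name: round(pct, 2) for name, pct in shares.items()}
    print(f"{dataset}: total={total}, shares (%)={rounded}")
```

The computed totals equal the Total column (127,027 and 16,591), and each share falls within about 0.1 percentage points of the nominal 80%, 10%, and 10%.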
