Table 1 Metadata of the Ancient Yi Script Handwritten Character Dataset.
Metadata Item | Description |
|---|---|
Dataset Name | Ancient Yi Script Handwritten Character Dataset |
Dataset Description | The dataset contains handwritten character samples for Yi character recognition, covering 2922 commonly used Yi glyphs, with each character written by different participants. |
Collection Time | January 2021 to December 2021 |
Collection Location | Sichuan Province and Guizhou Province, China |
Participant Information | 9845 Yi writers of different ages, genders, and occupational backgrounds |
Data Format | The data is stored in image format with a resolution of 300 DPI. Each character sample is stored as a separate image file, with filenames containing character codes and participant IDs. |
Metadata Recording | Each image file is accompanied by a JSON metadata file containing character codes, participant IDs, collection time, collection location, writing tools, and image processing steps. |
Image Preprocessing | All images undergo noise filtering, binarization, and normalization to improve the accuracy and consistency of character recognition. |
Cultural Sensitivity | The dataset carries significant cultural and linguistic value. Users of the data must respect the uniqueness of Yi culture and language and avoid misunderstanding or misrepresenting them. |
Data Sharing and Collaboration | Researchers are encouraged to collaborate with the Yi community and data providers to ensure the accuracy and social impact of research results. |
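
The Data Format and Metadata Recording rows can be illustrated with a minimal Python sketch that parses one JSON metadata record and checks it against the image filename. The table does not specify the JSON key names or the filename pattern, so the keys in `REQUIRED_KEYS`, the `<character code>_<participant ID>` filename layout, and all sample values below are hypothetical assumptions rather than the dataset's actual schema.

```python
import json
from dataclasses import dataclass
from pathlib import Path


@dataclass(frozen=True)
class SampleMetadata:
    """One record mirroring the fields listed in the Metadata Recording row."""
    character_code: str
    participant_id: str
    collection_time: str
    collection_location: str
    writing_tool: str
    processing_steps: tuple


# Hypothetical JSON key names; the real metadata files may use different keys.
REQUIRED_KEYS = (
    "character_code", "participant_id", "collection_time",
    "collection_location", "writing_tool", "processing_steps",
)


def parse_metadata(text: str) -> SampleMetadata:
    """Parse one JSON metadata document and check that all fields are present."""
    raw = json.loads(text)
    missing = [key for key in REQUIRED_KEYS if key not in raw]
    if missing:
        raise ValueError(f"metadata is missing fields: {missing}")
    return SampleMetadata(
        character_code=str(raw["character_code"]),
        participant_id=str(raw["participant_id"]),
        collection_time=str(raw["collection_time"]),
        collection_location=str(raw["collection_location"]),
        writing_tool=str(raw["writing_tool"]),
        processing_steps=tuple(raw["processing_steps"]),
    )


def parse_filename(image_path: str) -> tuple:
    """Split a '<character code>_<participant ID>.<ext>' name (assumed pattern)."""
    stem = Path(image_path).stem
    character_code, sep, participant_id = stem.partition("_")
    if not sep or not character_code or not participant_id:
        raise ValueError(f"unexpected filename pattern: {image_path}")
    return character_code, participant_id


def check_consistency(image_path: str, meta: SampleMetadata) -> bool:
    """Return True when the filename and the metadata name the same sample."""
    return parse_filename(image_path) == (meta.character_code, meta.participant_id)


# Made-up example record; none of these values come from the dataset itself.
example_json = json.dumps({
    "character_code": "C0001",
    "participant_id": "P0001",
    "collection_time": "2021-03-15",
    "collection_location": "Sichuan Province",
    "writing_tool": "pen",
    "processing_steps": ["noise filtering", "binarization", "normalization"],
})
example_meta = parse_metadata(example_json)
print(check_consistency("C0001_P0001.png", example_meta))
```

Checking the filename against the metadata record is one simple way to catch mismatched image and JSON pairs before training; a real loader would read each JSON file from disk next to its image.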