Table 1 Dataset metadata.

Metadata Item	Description
Dataset Name	Ancient Yi Script Handwritten Character Dataset
Dataset Description	The dataset contains handwritten character samples for Yi character recognition, covering a total of 2922 commonly used Yi glyphs, with each character written by a different participant.
Collection Time	January 2021 to December 2021
Collection Location	Sichuan Province and Guizhou Province, China
Participant Information	9845 Yi writers of different ages, genders, and occupational backgrounds
Data Format	The data is stored in image format with a resolution of 300 DPI. Each character sample is stored as a separate image file, with filenames containing character codes and participant IDs.
Metadata Recording	Each image file is accompanied by a JSON metadata file recording the character code, participant ID, collection time, collection location, writing tool, and image processing steps.
Image Preprocessing	All images undergo noise filtering, binarization, and normalization to improve the accuracy and consistency of character recognition.
Cultural Sensitivity	The dataset carries significant cultural and linguistic value. Users of the data must respect the uniqueness of Yi culture and language and avoid misunderstanding or misrepresenting them.
Data Sharing and Collaboration	Researchers are encouraged to collaborate with the Yi community and data providers to ensure the accuracy and social impact of research results.
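
The Data Format and Metadata Recording rows describe one image per character sample with an accompanying JSON metadata file. The table does not give the JSON key names or the file naming pattern, so the following is only a minimal sketch under stated assumptions: the key names in `REQUIRED_FIELDS`, the `SampleMetadata` class, and the convention that the metadata file shares the image's name with a `.json` extension are all hypothetical and would need to be adjusted to the actual release.

```python
import json
from dataclasses import dataclass
from pathlib import Path

# Hypothetical key names: the table lists which items are recorded,
# but not how they are spelled in the JSON files.
REQUIRED_FIELDS = (
    "character_code",
    "participant_id",
    "collection_time",
    "collection_location",
    "writing_tool",
    "processing_steps",
)


@dataclass
class SampleMetadata:
    """One metadata record for a single character image (illustrative only)."""
    character_code: str
    participant_id: str
    collection_time: str
    collection_location: str
    writing_tool: str
    processing_steps: list


def parse_metadata(text: str) -> SampleMetadata:
    """Parse a JSON string and check that every expected field is present."""
    record = json.loads(text)
    missing = [key for key in REQUIRED_FIELDS if key not in record]
    if missing:
        raise ValueError(f"missing metadata fields: {missing}")
    return SampleMetadata(**{key: record[key] for key in REQUIRED_FIELDS})


def load_sidecar(image_path: Path) -> SampleMetadata:
    """Load the metadata file for an image.

    Assumes (hypothetically) that the JSON file sits next to the image
    and has the same name with a .json extension.
    """
    sidecar = image_path.with_suffix(".json")
    return parse_metadata(sidecar.read_text(encoding="utf-8"))
```

Validating the fields at load time makes incomplete records fail early, before they reach a training or evaluation pipeline.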
