Table 1 Benchmark results using different models on the dataset.

From: Ancient Yi Script Handwriting Sample Repository

Metadata Item

Description

Dataset Name

Ancient Yi Script Handwritten Character Dataset

Dataset Description

The dataset contains handwritten samples for Yi character recognition, covering a total of 2922 commonly used Yi glyphs, with each character written by a different participant.

Collection Time

January 2021 to December 2021

Collection Location

Sichuan Province and Guizhou Province, China

Participant Information

9845 Yi writers of different ages, genders, and occupational backgrounds

Data Format

The data is stored in image format with a resolution of 300 DPI. Each character sample is stored as a separate image file, with filenames containing character codes and participant IDs.

Metadata Recording

Each image file is accompanied by a JSON metadata file containing character codes, participant IDs, collection time, collection location, writing tools, and image processing steps.

Image Preprocessing

All images undergo noise filtering, binarization, and normalization to improve the accuracy and consistency of character recognition.

Cultural Sensitivity

The dataset carries significant cultural and linguistic value. Users of the data must respect the uniqueness of Yi culture and language and avoid misunderstanding or misrepresenting them.

Data Sharing and Collaboration

Researchers are encouraged to collaborate with the Yi community and data providers to ensure the accuracy and social impact of research results.
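
The Data Format and Metadata Recording rows above describe one image file per sample, with the character code and participant ID in the filename and a JSON metadata file alongside it. The following is a minimal sketch of how such a pair might be read. The table does not specify the filename layout or the JSON key names, so the pattern `<character_code>_<participant_id>.png`, the same-stem `.json` sidecar, and the keys `character_code` and `participant_id` are all assumptions made for illustration, as are the names `Sample`, `parse_sample_name`, and `load_sample`.

```python
import json
import re
from dataclasses import dataclass
from pathlib import Path

# ASSUMPTION: filenames look like "<character_code>_<participant_id>.<ext>".
# The dataset description only says both values appear in the filename.
FILENAME_PATTERN = re.compile(
    r"^(?P<char_code>[0-9A-Za-z]+)_(?P<participant_id>[0-9A-Za-z]+)$"
)


@dataclass
class Sample:
    image_path: Path
    char_code: str
    participant_id: str
    metadata: dict


def parse_sample_name(image_path):
    """Split a filename stem into (character code, participant ID)."""
    match = FILENAME_PATTERN.match(Path(image_path).stem)
    if match is None:
        raise ValueError(f"Unexpected filename layout: {image_path}")
    return match["char_code"], match["participant_id"]


def load_sample(image_path):
    """Pair an image with its JSON metadata file, if one exists."""
    image_path = Path(image_path)
    char_code, participant_id = parse_sample_name(image_path)

    # ASSUMPTION: the metadata file shares the image's stem, e.g. X_Y.json.
    sidecar = image_path.with_suffix(".json")
    metadata = {}
    if sidecar.exists():
        metadata = json.loads(sidecar.read_text(encoding="utf-8"))

    # ASSUMPTION: hypothetical key names; cross-check them against the filename.
    expected = {"character_code": char_code, "participant_id": participant_id}
    for key, value in expected.items():
        if key in metadata and str(metadata[key]) != value:
            raise ValueError(f"{sidecar.name}: {key} does not match filename")

    return Sample(image_path, char_code, participant_id, metadata)
```

Under these assumptions, `load_sample("A1F3_0042.png")` would return a `Sample` whose `char_code` is `"A1F3"` and whose `participant_id` is `"0042"`. The cross-check between filename and metadata is a design choice, not something the dataset documentation prescribes; it is there to catch mislabeled files early.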