Table 2 Comparative summary of AraTraditions10k and related cross-lingual image annotation datasets

From: AraTraditions10k bridging cultures with a comprehensive dataset for enhanced cross lingual image annotation retrieval and tagging

| Dataset | Scale | Languages | Annotation quality | Cultural diversity | Strengths | Limitations |
| --- | --- | --- | --- | --- | --- | --- |
| COCO-CN | 20,342 images | Chinese, English | High (recommendation-assisted) | Moderate | Improved cross-lingual performance | Smaller scale, limited to Chinese and English |
| Flickr30k-CN | Smaller scale | Chinese, English | Moderate (translated captions) | Limited | Facilitates Chinese image captioning | Limited size and diversity |
| AIC-ICC | 240,000 images | Chinese | Varies (crowdsourced) | Low (biased towards human activities) | Large-scale, detailed annotations | Bias limits generalizability, variable annotation quality |
| Multi30k | Moderate scale | English, German | High | Moderate | Multilingual, detailed annotations | Limited to European languages |
| AI Challenger | 240,000 images | Chinese | High | Moderate | Extensive dataset, supports deep learning research | Focused on Chinese, may lack diverse cultural aspects |
| AraTraditions10k | 10,000 images | Arabic, English | High (professional translation, recommendation-assisted) | High | Culturally rich, advanced annotation techniques, diverse visual content | Smaller scale compared to some datasets but has high quality and cultural relevance |