Extended Data Table 1 Overview of human-centric computer vision (HCCV) datasets commonly used for fairness

From: Fair human-centric image dataset for ethical AI benchmarking

  1. This table compares the properties of 27 HCCV datasets frequently used for evaluating bias in computer vision models. Features include dataset size, collection method, availability of annotations (bounding boxes [BB], key points [KP], segmentation masks [SM]), consent details, terms of use, and demographic diversity attributes. The abbreviations are defined as follows:
     - BB: a, automatic; m, manual; F, face; O, object; P, person.
     - KP/SM: a, automatic; m, manual; v, manually verified; the integer value denotes the number of key points or landmarks, or of segmentation categories.
     - Consent: no details, consent obtained but no details provided; details, consent details provided but no explicit mention of AI; details, for AI, consent details provided, including data processing for AI fairness purposes.
     - Terms of Use: n-c, non-commercial; research, research only; eval., evaluation only; edu., educational use; revoked, authors no longer make the dataset available.
     Attributes marked with * are self-reported. (-) indicates that the relevant information was not available. The datasets compared are: MS-Celeb-1M127, YFCC100M149, Megaface150, VGGFace151, Diversity in Faces (DiF)152, Pilot Parl. Benchmark9, FRGC153, RWF154, Morph155, Adience156, BUPT-Globalface157, WIDERFACE-DEMO158, KANFace159, FairFace160, ImageNet (ILSVRC)161, CelebA86, LFWA162, MTFL163, UTKFace164, MIAP42, FACET24, MS-COCO40, VQA 2.041, Casual Conversations25, CCV226, Chicago Face Database27, Dollar Street43.