Extended Data Table 1 Overview of human-centric computer vision (HCCV) datasets commonly used for fairness
From: Fair human-centric image dataset for ethical AI benchmarking

- This table compares the properties of 27 HCCV datasets frequently used for evaluating bias in computer vision models. Features include dataset size, collection method, availability of annotations (bounding boxes [BB], key points [KP], segmentation masks [SM]), consent details, terms of use, and demographic diversity attributes.
- Abbreviations: BB (a: automatic, m: manual; F: face, O: object, P: person); KP/SM (a: automatic, m: manual, v: manually verified, with the integer denoting the number of key points/landmarks or segmentation categories); Consent (no details: consent obtained, but no details provided; details: consent details provided, but no explicit mention of AI; details, for AI: consent details provided, including data processing for AI fairness purposes); Terms of Use (n-c: non-commercial; research: research only; eval.: evaluation only; edu.: educational use; revoked: authors no longer make the dataset available). Attributes marked with * are self-reported. (-) denotes where the relevant information was not available.
- Datasets (bracketed numbers are reference citations): MS-Celeb-1M[127], YFCC100M[149], Megaface[150], VGGFace[151], Diversity in Faces (DiF)[152], Pilot Parl. Benchmark[9], FRGC[153], RWF[154], Morph[155], Adience[156], BUPT-Globalface[157], WIDERFACE-DEMO[158], KANFace[159], FairFace[160], ImageNet (ILSVRC)[161], CelebA[86], LFWA[162], MTFL[163], UTKFace[164], MIAP[42], FACET[24], MS-COCO[40], VQA 2.0[41], Casual Conversations[25], CCV2[26], Chicago Face Database[27], Dollar Street[43].