Fig. 2: Distribution of HLE questions across categories.
From: A benchmark of expert-level academic questions to assess AI capabilities

HLE consists of 2,500 exam questions in over a hundred subjects, grouped into eight high-level categories.