Nature

Extended Data Table 1: Accuracy and RMS calibration error of frontier LLMs on the text-only questions of HLE

From: A benchmark of expert-level academic questions to assess AI capabilities
