NEWS AND VIEWS
Expert-level test is a head-scratcher for AI
Nature 649, 1115-1116 (2026)
doi: https://doi.org/10.1038/d41586-025-04098-x
References
Zellers, R., Holtzman, A., Bisk, Y., Farhadi, A. & Choi, Y. in Proc. 57th Annu. Meet. Assoc. Comput. Linguist. 4791–4800 (Association for Computational Linguistics, 2019).
Hendrycks, D. et al. in 8th Int. Conf. Learn. Represent. (ICLR, 2020).
Jimenez, C. E. et al. Preprint at arXiv https://doi.org/10.48550/arXiv.2310.06770 (2024).
Cobbe, K. et al. Preprint at arXiv https://doi.org/10.48550/arXiv.2110.14168 (2021).
Center for AI Safety, Scale AI & HLE Contributors Consortium Nature 649, 1139–1146 (2026).
Collins, K. M. et al. Nature Hum. Behav. 8, 1851–1863 (2024).
Chu, J., Tenenbaum, J. B. & Schulz, L. E. Trends Cogn. Sci. 28, 628–642 (2024).
Getzels, J. W. in Frontiers of Creativity Research: Beyond the Basics (ed. Isaksen, S. G.) 88–102 (Bearly, 1987).
Competing Interests
The authors know some colleagues who participated in the HLE question generation and review process.