Fig. 1: Performance of frontier LLMs on popular benchmarks and HLE. | Nature