Extended Data Table 1 Characteristics of the 15 language models evaluated in this paper

From: General scales unlock AI evaluation with explanatory and predictive power

SFT, supervised fine-tuning; RLHF, reinforcement learning from human feedback; CoT, chain-of-thought.