Extended Data Table 2: In-distribution predictability results of 15 LLMs for the ADeLe battery using tenfold cross-validation, averaged across ten seeds
From: General scales unlock AI evaluation with explanatory and predictive power
