Extended Data Fig. 2: Model sizes.
From: World and Human Action Models towards gameplay ideation

a, Effect of model size and training compute (FLOPs) on the cross-entropy training loss (lower is better). Highlighted in green are models on the ‘efficient frontier’, that is, models that minimize training loss for a given compute budget. Note that larger model sizes become efficient at larger compute budgets. b, Fréchet Video Distance (FVD; lower is better) for a range of model sizes, each evaluated at various numbers of updates, plotted against the training loss. Note that a lower loss usually corresponds to a lower FVD. c, Transformer configurations.
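
The ‘efficient frontier’ in panel a can be illustrated with a minimal sketch. It assumes the frontier is the set of checkpoints for which no other checkpoint reaches a lower loss at equal or lower compute; the authors' exact selection procedure is not given in the caption. The `Checkpoint` type, the `efficient_frontier` function and all numeric values are hypothetical and are not taken from the paper.

```python
from typing import List, NamedTuple


class Checkpoint(NamedTuple):
    """One (model, compute, loss) measurement. Hypothetical container."""
    model: str
    flops: float  # cumulative training compute
    loss: float   # cross-entropy training loss


def efficient_frontier(checkpoints: List[Checkpoint]) -> List[Checkpoint]:
    """Return checkpoints not beaten by any checkpoint of equal or lower compute."""
    frontier = []
    best_loss = float("inf")
    # Visit checkpoints in order of increasing compute; ties are broken by
    # loss so the best checkpoint at a given budget is seen first.
    for ckpt in sorted(checkpoints, key=lambda c: (c.flops, c.loss)):
        if ckpt.loss < best_loss:
            frontier.append(ckpt)
            best_loss = ckpt.loss
    return frontier


# Made-up values for illustration only (not data from the figure).
runs = [
    Checkpoint("small", 1e18, 3.0),
    Checkpoint("small", 1e19, 2.6),
    Checkpoint("small", 1e20, 2.5),
    Checkpoint("large", 1e19, 2.9),
    Checkpoint("large", 1e20, 2.2),
]

frontier = efficient_frontier(runs)
for ckpt in frontier:
    print(f"{ckpt.model:>5}  flops={ckpt.flops:.0e}  loss={ckpt.loss}")
```

With these made-up numbers the smaller model occupies the frontier at the two lower budgets and the larger model takes over at the highest budget, which mirrors the qualitative trend described for panel a.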