Fig. 3: Consistency results.
From: World and Human Action Models towards gameplay ideation

a, Fréchet Video Distance (FVD) for a range of WHAM sizes as a function of training compute budget (FLOPs). FVD improves (that is, decreases) with larger models and larger compute budgets. b, Key frames from two example generations (one per row) from the 1.6B WHAM, each 2 min long, indicating that the 1.6B WHAM can generate gameplay that stays consistent over long time horizons.