Fig. 5: Head-to-head win rate comparison matrix between different models.

Each cell shows the win rate of the model on the y-axis (row) against the model on the x-axis (column). A win rate above 0.5 indicates that the y-axis model beats the x-axis model more often than it loses to it in direct comparisons. Color intensity encodes the win rate, with darker red indicating higher values.
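
As a rough illustration of how such a matrix can be assembled, the sketch below computes pairwise win rates from a list of head-to-head outcomes. It is not the procedure used to produce the figure: the function name `win_rate_matrix`, the `(winner, loser)` input format, the placeholder model names, and the absence of ties are all assumptions made for the example.

```python
from collections import defaultdict


def win_rate_matrix(models, outcomes):
    """Return matrix[i][j] = win rate of models[i] (row, y-axis)
    against models[j] (column, x-axis).

    Assumptions (hypothetical, not from the source):
    - `outcomes` is an iterable of (winner, loser) name pairs;
    - there are no ties;
    - cells with no comparisons, including the diagonal, are None.
    """
    wins = defaultdict(int)
    for winner, loser in outcomes:
        wins[(winner, loser)] += 1

    matrix = []
    for a in models:
        row = []
        for b in models:
            total = wins[(a, b)] + wins[(b, a)]
            if a == b or total == 0:
                row.append(None)  # no head-to-head data for this cell
            else:
                row.append(wins[(a, b)] / total)
        matrix.append(row)
    return matrix


# Placeholder data: model_a beats model_b in 3 of 4 comparisons.
models = ["model_a", "model_b"]
outcomes = [("model_a", "model_b")] * 3 + [("model_b", "model_a")]
matrix = win_rate_matrix(models, outcomes)
print(matrix)  # [[None, 0.75], [0.25, None]]
```

Under the no-ties assumption, mirrored cells sum to 1, so a cell above 0.5 always has a counterpart below 0.5 on the other side of the diagonal.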