Fig. 2: Training loss and validation loss for the GatorTron-base (345 million parameters), medium (3.9 billion parameters), and large (8.9 billion parameters) models.

a Training loss. b Validation loss. MLM masked language modeling.