Extended Data Table 3 Generalization gap for RoBERTaLARGE on GLUE datasets

From: Parameter-efficient fine-tuning of large-scale pre-trained language models

  1. Generalization gap (train performance − dev performance) for RoBERTaLARGE on GLUE datasets, averaged over multiple random seeds. ✓ denotes that the component is included in the combination and ✗ denotes that it is excluded.