Fig. 3: The power of scale of delta-tuning methods.
From: Parameter-efficient fine-tuning of large-scale pre-trained language models

a–o, We apply all delta-tuning methods to different scales of T5: T5SMALL, T5BASE and T5XXL. We report the performance of Adapter in (a–c), LoRA in (d–f), Prefix-tuning in (g–i), Last-layer tuning in (j–l), and Selective-module tuning in (m–o). The figure shows that as the scale of T5 increases, all delta-tuning methods converge faster and achieve better performance on MNLI, QNLI and SST-2.
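
To make concrete what one of the delta-tuning methods named above does, the following is a minimal sketch of LoRA applied to a single linear layer, assuming PyTorch; the class name, rank r and scaling alpha are illustrative choices and not the configuration used in the paper's experiments.

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """Wraps a frozen pretrained linear layer and adds a trainable low-rank delta."""

    def __init__(self, base: nn.Linear, r: int = 8, alpha: float = 16.0):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad = False  # the pretrained weights stay frozen
        # Low-rank factors: only these small matrices are trained (the "delta").
        self.lora_a = nn.Parameter(torch.randn(base.in_features, r) * 0.01)
        self.lora_b = nn.Parameter(torch.zeros(r, base.out_features))
        self.scaling = alpha / r

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Frozen output plus the low-rank update: base(x) + (x A) B * scaling
        return self.base(x) + (x @ self.lora_a @ self.lora_b) * self.scaling

# Usage: wrap one linear layer and run a forward pass.
layer = LoRALinear(nn.Linear(512, 512))
out = layer(torch.randn(4, 512))
print(out.shape)  # torch.Size([4, 512])
```

In this sketch only the two low-rank matrices receive gradients, which is what makes the method parameter-efficient regardless of the backbone scale (T5SMALL, T5BASE or T5XXL).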