Table 2 Results of combining different delta-tuning methods

From: Parameter-efficient fine-tuning of large-scale pre-trained language models

| Prompt | ✗ | ✗ | ✗ | ✗ | ✓ | ✓ | ✓ | ✓ |
|---|---|---|---|---|---|---|---|---|
| BitFit | ✗ | ✗ | ✓ | ✓ | ✗ | ✗ | ✓ | ✓ |
| Adapter | ✗ | ✓ | ✗ | ✓ | ✗ | ✓ | ✗ | ✓ |
| Tunable parameters | 0% | 1.75% | 0.09% | 1.84% | 0.003% | 1.76% | 0.09% | 1.85% |
| *RoBERTa-large, full data, without manual templates* |  |  |  |  |  |  |  |  |
| CoLA (Matt.) | 4.6 | **66.6±1.6** | 63.5±0.6 | 65.9±0.5 | 42.7±2.3 | 63.1±1.5 | 63.7±0.9 | 64.4±0.9 |
| SST-2 (acc) | 50.9 | **95.8±0.1** | 95.6±0.1 | 95.7±0.2 | 95.3±0.2 | 95.7±0.1 | 95.3±0.2 | 95.5±0.1 |
| MRPC (F1) | 1.4 | 92.7±0.2 | 91.9±0.4 | **93.0±0.4** | 85.4±0.5 | 92.0±0.5 | 92.2±0.5 | 92.9±0.3 |
| STS-B (Pear.) | -6.2 | **91.4±0.1** | 90.7±0.2 | 90.5±0.1 | 83.0±2.8 | 90.5±0.4 | 90.3±0.7 | 90.9±0.1 |
| QQP (F1) | 6.4 | 83.5±0.1 | 83.5±0.0 | **84.4±0.0** | 77.2±0.4 | 84.3±0.0 | 83.6±0.1 | **84.4±0.0** |
| MNLI (acc) | 34.2 | 88.6±0.2 | 88.0±0.2 | **89.0±0.1** | 77.9±2.5 | 88.9±0.1 | 88.0±0.2 | 88.9±0.1 |
| QNLI (acc) | 50.6 | 93.7±0.3 | 93.4±0.3 | 94.2±0.1 | 86.2±0.5 | 94.2±0.1 | 93.2±0.3 | **94.4±0.1** |
| RTE (acc) | 47.7 | **86.8±0.5** | 86.2±1.0 | 84.5±0.5 | 74.4±0.5 | 84.1±0.8 | 85.7±1.5 | 84.7±1.1 |
| Average | 23.7 | **87.4±0.4** | 86.6±0.4 | 87.1±0.2 | 77.7±1.2 | 86.6±0.4 | 86.5±0.6 | 87.0±0.3 |
| *RoBERTa-large, full data, with manual templates* |  |  |  |  |  |  |  |  |
| CoLA (Matt.) | 2.2 | **66.9±1.1** | 64.2±0.5 | 65.5±1.0 | 37.8±20.8 | 64.7±1.3 | 64.8±0.7 | 64.9±1.0 |
| SST-2 (acc) | 83.6 | **96.3±0.2** | 96.1±0.1 | 96.2±0.2 | 95.7±0.2 | 95.8±0.1 | 95.9±0.1 | 95.8±0.2 |
| MRPC (F1) | 61.9 | 92.2±0.4 | **92.7±0.6** | **92.7±0.2** | 84.2±0.5 | 91.8±0.2 | 92.2±0.4 | 92.0±0.4 |
| STS-B (Pear.) | -3.3 | 91.3±0.5 | 90.9±0.1 | 90.7±0.2 | 79.6±1.3 | **91.9±0.3** | 90.8±0.4 | 90.1±0.6 |
| QQP (F1) | 49.7 | 83.6±0.1 | 83.6±0.0 | **84.6±0.1** | 77.0±0.7 | 84.3±0.0 | 83.7±0.0 | 84.4±0.2 |
| MNLI (acc) | 50.9 | 88.6±0.1 | 87.7±0.1 | 88.7±0.1 | 80.2±0.2 | 88.7±0.1 | 88.0±0.1 | **88.9±0.1** |
| QNLI (acc) | 50.8 | 93.6±0.1 | 93.1±0.2 | **93.8±0.1** | 86.6±0.4 | **93.8±0.1** | 93.0±0.1 | **93.8±0.1** |
| RTE (acc) | 51.3 | **86.9±0.2** | 86.2±1.0 | 86.0±0.7 | 78.3±0.3 | 84.6±0.5 | 86.4±1.5 | 84.7±0.9 |
| Average | 43.4 | **87.4±0.3** | 86.8±0.3 | 87.3±0.3 | 77.4±3.0 | 86.9±0.3 | 86.9±0.4 | 86.8±0.4 |
| *RoBERTa-large, 16-shot, without manual templates* |  |  |  |  |  |  |  |  |
| CoLA (Matt.) | 4.6 | 19.6±9.6 | 15.1±17.0 | 17.7±11.4 | 3.5±0.6 | 21.4±11.5 | 20.8±19.6 | **21.5±13.4** |
| SST-2 (acc) | 50.9 | 92.7±0.4 | 92.7±0.6 | **93.1±0.6** | 74.9±0.6 | 91.7±0.8 | 92.2±0.5 | 91.6±0.7 |
| MRPC (F1) | 1.4 | 78.2±4.4 | 69.8±1.6 | **81.2±0.0** | 6.2±4.1 | 74.6±7.1 | 69.3±6.5 | 77.4±5.4 |
| STS-B (Pear.) | -6.2 | 66.5±2.5 | 67.5±8.0 | **71.0±2.5** | 10.7±3.5 | 63.3±1.6 | 64.7±5.6 | 69.6±8.6 |
| QQP (F1) | 6.4 | 55.9±5.8 | 55.1±6.8 | 54.6±4.2 | 52.4±1.4 | 58.3±7.2 | 55.1±4.8 | **58.5±6.1** |
| MNLI (acc) | 34.2 | 58.1±4.5 | **64.6±3.4** | 62.7±4.1 | 35.3±0.6 | 61.4±3.9 | 61.4±5.1 | 61.0±3.8 |
| QNLI (acc) | 50.6 | 60.2±3.0 | **69.7±1.9** | 59.8±1.7 | 52.8±1.0 | 60.2±4.9 | 60.9±4.0 | 61.6±7.0 |
| RTE (acc) | 47.7 | 55.0±1.6 | 54.5±0.8 | 54.9±2.9 | 50.1±0.7 | 58.2±2.5 | 54.6±2.4 | **58.7±3.4** |
| Average | 23.7 | 60.8±4.0 | 61.1±5.0 | 61.9±3.4 | 35.7±1.6 | 61.2±4.9 | 59.9±6.1 | **62.5±6.0** |
| *RoBERTa-large, 16-shot, with manual templates* |  |  |  |  |  |  |  |  |
| CoLA (Matt.) | 2.2 | **10.5±15.0** | 4.6±5.0 | 9.2±10.2 | 1.4±1.7 | 10.2±4.2 | 5.9±2.5 | 5.9±5.5 |
| SST-2 (acc) | 83.6 | **93.1±0.3** | 92.9±0.1 | 92.1±0.1 | 90.9±0.6 | 91.9±0.4 | 92.0±0.4 | 92.2±0.6 |
| MRPC (F1) | 61.9 | 77.2±1.4 | 74.5±4.9 | **81.2±0.0** | 72.1±4.4 | 76.8±1.3 | 76.1±2.4 | **81.2±0.0** |
| STS-B (Pear.) | -3.3 | 65.8±4.7 | 69.3±6.0 | 71.0±4.1 | 12.0±8.0 | 61.7±5.7 | **71.3±6.4** | 67.1±2.8 |
| QQP (F1) | 49.7 | 66.6±0.5 | 67.8±0.5 | 66.3±4.1 | 53.4±1.0 | 66.9±1.9 | **68.6±1.2** | 67.1±2.9 |
| MNLI (acc) | 50.9 | 68.0±1.4 | **69.4±3.3** | 68.9±0.4 | 53.2±2.5 | 67.1±1.8 | 67.1±2.0 | 68.1±0.3 |
| QNLI (acc) | 50.8 | 69.5±1.1 | 70.2±3.4 | 68.1±2.4 | 59.4±0.5 | 69.9±2.5 | **72.5±3.9** | 70.4±2.3 |
| RTE (acc) | 51.3 | 70.6±3.6 | 67.3±5.1 | **73.0±2.0** | 56.3±4.6 | 70.4±2.3 | 69.2±3.5 | 72.4±2.8 |
| Average | 43.4 | 65.2±3.5 | 64.5±3.5 | **66.2±2.9** | 49.8±2.9 | 64.4±2.5 | 65.3±2.8 | 65.6±2.2 |

Performance of RoBERTa-large on the GLUE datasets. We report results as mean ± standard deviation over multiple random seeds on the validation set. A tick symbol (✓) denotes that the component is included in the combination and a cross symbol (✗) denotes that it is excluded from the combination. The best performance on each dataset is highlighted in bold.
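The Average row in each block is the unweighted mean of the eight per-task scores, rounded to one decimal place. A minimal Python sketch, using the per-task means from the adapter-only, full-data, no-template column as an example (the list and function name below are illustrative, not from the paper):

```python
# Per-task means for the adapter-only column in the
# "full data, without manual templates" block:
# CoLA, SST-2, MRPC, STS-B, QQP, MNLI, QNLI, RTE.
adapter_only = [66.6, 95.8, 92.7, 91.4, 83.5, 88.6, 93.7, 86.8]

def glue_average(scores):
    """Unweighted mean over the eight GLUE tasks, one-decimal rounding."""
    return round(sum(scores) / len(scores), 1)

print(glue_average(adapter_only))  # 87.4, matching the reported Average
```

Note that the Average row mixes different metrics (Matthews correlation, accuracy, F1, Pearson correlation), so it is a coarse summary rather than a single calibrated score.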