Table 3 Performance comparison of transformer-based models across epochs.

From: Resolving passage ambiguity in machine reading comprehension using lightweight transformer architectures

| Model           | Epochs | Train Acc (%) | Valid Acc (%) | Train Loss | Valid Loss |
|-----------------|--------|---------------|---------------|------------|------------|
| Distil-BERT-MRC | 1      | 75.52         | 72.28         | 1.20       | 1.35       |
| Distil-BERT-MRC | 5      | 92.00         | 90.50         | 0.45       | 0.55       |
| Distil-BERT-MRC | 10     | 94.10         | 92.58         | 0.30       | 0.39       |
| Distil-BERT-MRC | 15     | 94.30         | 92.70         | 0.26       | 0.36       |
| RoBERTa         | 1      | 72.50         | 70.00         | 1.25       | 1.40       |
| RoBERTa         | 5      | 90.00         | 88.50         | 0.50       | 0.50       |
| RoBERTa         | 10     | 93.80         | 92.30         | 0.32       | 0.40       |
| RoBERTa         | 15     | 93.95         | 92.50         | 0.29       | 0.37       |
| XLNet           | 1      | 70.00         | 68.50         | 1.30       | 1.45       |
| XLNet           | 5      | 89.50         | 87.50         | 0.55       | 0.65       |
| XLNet           | 10     | 93.20         | 92.00         | 0.35       | 0.42       |
| XLNet           | 15     | 93.50         | 92.20         | 0.31       | 0.39       |