Table 1 Performance of the Distil-BERT-MRC model on the SQuAD 2.0, NewsQA, and Natural Questions datasets across training epochs.

From: Resolving passage ambiguity in machine reading comprehension using lightweight transformer architectures

| Model | Dataset | Epochs | Train Acc (%) | Valid Acc (%) | Train Loss | Valid Loss |
|---|---|---|---|---|---|---|
| Distil-BERT-MRC | SQuAD 2.0 | 1 | 76.10 | 68.85 | 1.18 | 1.41 |
| | | 5 | 91.30 | 89.20 | 0.49 | 0.60 |
| | | 10 | 93.50 | 91.80 | 0.33 | 0.44 |
| | | 15 | 94.10 | 93.25 | 0.27 | 0.38 |
| | NewsQA | 1 | 78.00 | 74.45 | 1.15 | 1.28 |
| | | 5 | 91.80 | 90.05 | 0.46 | 0.56 |
| | | 10 | 93.80 | 92.80 | 0.29 | 0.39 |
| | | 15 | 94.40 | 94.52 | 0.25 | 0.36 |
| | Natural Questions | 1 | 77.20 | 72.65 | 1.16 | 1.33 |
| | | 5 | 91.50 | 89.90 | 0.47 | 0.57 |
| | | 10 | 93.90 | 92.20 | 0.31 | 0.41 |
| | | 15 | 94.20 | 92.58 | 0.28 | 0.37 |
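For readers who want to work with these numbers directly, the short Python snippet below (a reader aid, not code from the paper) tabulates the reported accuracies and prints the per-epoch generalization gap, i.e. train accuracy minus validation accuracy. Across all three datasets the gap narrows from roughly 3.5–7 percentage points at epoch 1 to under 2 points by epoch 15, with NewsQA's validation accuracy slightly exceeding its train accuracy at epoch 15.

```python
# Hypothetical helper (not from the paper): tabulate the Table 1 accuracies
# and compute the train-validation gap at each reported epoch.
results = {
    "SQuAD 2.0": {1: (76.10, 68.85), 5: (91.30, 89.20),
                  10: (93.50, 91.80), 15: (94.10, 93.25)},
    "NewsQA": {1: (78.00, 74.45), 5: (91.80, 90.05),
               10: (93.80, 92.80), 15: (94.40, 94.52)},
    "Natural Questions": {1: (77.20, 72.65), 5: (91.50, 89.90),
                          10: (93.90, 92.20), 15: (94.20, 92.58)},
}

for dataset, by_epoch in results.items():
    print(dataset)
    for epoch, (train_acc, valid_acc) in sorted(by_epoch.items()):
        gap = train_acc - valid_acc  # generalization gap in percentage points
        print(f"  epoch {epoch:>2}: train {train_acc:.2f}%  "
              f"valid {valid_acc:.2f}%  gap {gap:+.2f} pp")
```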