Table 1 Results of the credibility test

| Evaluation dimension | Allusion type | Sample size (items) | Precision | Recall | F1-score |
|---|---|---|---|---|---|
| Classified by allusion type | Historical events | 185 | 0.91 | 0.89 | 0.90 |
| | Myths and legends | 128 | 0.88 | 0.86 | 0.87 |
| | Previous-generation literature | 109 | 0.85 | 0.79 | 0.82 |
| Overall model performance | All types merged | 422 | 0.89 | 0.86 | 0.87 |
| Comparative experiment (baseline model) | BiLSTM+CRF | 422 | 0.81 | 0.78 | 0.79 |
| Special evaluation of low-frequency allusions | Occurrences ≤ 5 | 63 | 0.80 | 0.75 | 0.77 |
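
The F1-score column can be cross-checked against the precision and recall columns. The sketch below assumes the F1 values are the standard harmonic mean of precision and recall, rounded to two decimals; the table itself does not state the formula or the averaging scheme. The names `f1_score`, `TABLE_1`, and `recompute_f1` are illustrative and not taken from the original work.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (standard F1 definition)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall, reported F1) copied from Table 1.
TABLE_1 = {
    "Historical events": (0.91, 0.89, 0.90),
    "Myths and legends": (0.88, 0.86, 0.87),
    "Previous-generation literature": (0.85, 0.79, 0.82),
    "All types merged": (0.89, 0.86, 0.87),
    "BiLSTM+CRF baseline": (0.81, 0.78, 0.79),
    "Low-frequency allusions": (0.80, 0.75, 0.77),
}


def recompute_f1(rows: dict) -> dict:
    """Return the F1 recomputed from each row's precision and recall, rounded to 2 decimals."""
    return {label: round(f1_score(p, r), 2) for label, (p, r, _) in rows.items()}


if __name__ == "__main__":
    recomputed = recompute_f1(TABLE_1)
    for label, (_, _, reported) in TABLE_1.items():
        status = "ok" if recomputed[label] == reported else "mismatch"
        print(f"{label}: reported={reported:.2f} recomputed={recomputed[label]:.2f} ({status})")
```

Under that assumption, every reported F1-score agrees with the value recomputed from its row, and the three per-type sample sizes (185, 128, 109) sum to the 422 items of the merged evaluation.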