Table 5. Evaluation datasets, dataset sizes, and evaluation metrics
| Task / dataset | Training | Validation | Testing | Primary metrics | Secondary metrics |
|---|---|---|---|---|---|
| Named entity recognition | | | | | |
| BC5CDR-chemical [59] | 4560 | 4581 | 4797 | | |
| NCBI-disease [60] | 5424 | 923 | 940 | | |
| Relation extraction | | | | | |
| ChemProt [55] | 19,460 | 11,820 | 16,943 | Macro F1 [90] | |
| DDI2013 [62] | 18,779 | 7244 | 5761 | Micro F1 [16] | |
| Multi-label document classification | | | | | |
| HoC [64] | 1108 | 157 | 315 | Micro F1 [86] | |
| LitCovid [56] | 24,960 | 6239 | 2500 | Macro F1 [56] | Micro F1 [56] |
| Question answering | | | | | |
| MedQA 5-option [66] | 10,178 | 1272 | 1273 | Accuracy [66] | Macro F1 [91] |
| PubMedQA [67] | 190,142 | 21,127 | 500 | Accuracy [67] | Macro F1 [91] |
| Text summarization | | | | | |
| PubMed Text Summarization^a [68] | 117,108 | 6631 | 6658 | Rouge-L [68] | |
| MS^2 ^b [50] | 14,188 | 2021 | - | Rouge-L [50] | |
| Text simplification | | | | | |
| Cochrane PLS [69] | 3568 | 411 | 480 | Rouge-L [69] | |
| PLOS Text Simplification [70] | 26,124 | 1000 | 1000 | Rouge-L [70] | |