Table 5 Distribution of seen and unseen instances in the different groups of the BioWiC35 test set.
From: A Dataset for Evaluating Contextualized Representation of Biomedical Concepts in Language Models
Instance group | Seen (#) | Seen (%) | Unseen (#) | Unseen (%) | All (#) | All (%) |
---|---|---|---|---|---|---|
Term identity | 412 | 52 | 388 | 48 | 800 | 100 |
Abbreviations | 190 | 95 | 10 | 5 | 200 | 100 |
Synonyms | 675 | 84 | 125 | 16 | 800 | 100 |
Label similarity | 179 | 90 | 21 | 10 | 200 | 100 |
All | 1457 | 73 | 543 | 27 | 2000 | 100 |
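
The percentage columns follow from the count columns by dividing each count by its row total. The sketch below recomputes them from the counts printed in the table. The names `counts`, `percent`, and `shares` are illustrative and not from the source. The source does not state which rounding rule was used; rounding halves to even is an assumption that happens to reproduce the printed values.

```python
from fractions import Fraction

# Counts copied from Table 5: group -> (seen, unseen, all)
counts = {
    "Term identity": (412, 388, 800),
    "Abbreviations": (190, 10, 200),
    "Synonyms": (675, 125, 800),
    "Label similarity": (179, 21, 200),
    "All": (1457, 543, 2000),
}


def percent(part, whole):
    # Exact rational arithmetic avoids floating-point error at .5 boundaries.
    # round() on a Fraction rounds halves to the nearest even integer
    # (assumed rounding rule, not stated in the source).
    return round(Fraction(part * 100, whole))


# group -> (seen %, unseen %), each relative to that row's total
shares = {
    group: (percent(seen, total), percent(unseen, total))
    for group, (seen, unseen, total) in counts.items()
}

for group, (seen_pct, unseen_pct) in shares.items():
    print(f"{group}: seen {seen_pct}%, unseen {unseen_pct}%")
```

Using exact fractions instead of floats matters here because several rows sit exactly on a half (for example 412 of 800 is 51.5%), where the rounding rule decides the printed value.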