Table 5 Distribution of seen and unseen instances in different groups of BioWiC35 test set.

From: A Dataset for Evaluating Contextualized Representation of Biomedical Concepts in Language Models

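| Instance group   | Seen (#) | Seen (%) | Unseen (#) | Unseen (%) | All (#) | All (%) |
|------------------|----------|----------|------------|------------|---------|---------|
| Term identity    | 412      | 52       | 388        | 48         | 800     | 100     |
| Abbreviations    | 190      | 95       | 10         | 5          | 200     | 100     |
| Synonyms         | 675      | 84       | 125        | 16         | 800     | 100     |
| Label similarity | 179      | 90       | 21         | 10         | 200     | 100     |
| All              | 1457     | 73       | 543        | 27         | 2000    | 100     |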
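
The percentage columns appear to be each count's share of its row total, rounded to a whole number. The following is a minimal Python sketch under that assumption, with the counts copied from the table. The `share` helper and the `counts` layout are hypothetical names introduced for illustration, and the rounding rule (Python's built-in `round`, which rounds halves to the nearest even integer) is an assumption rather than something the source states.

```python
# Counts copied from Table 5: group -> (seen, unseen).
counts = {
    "Term identity": (412, 388),
    "Abbreviations": (190, 10),
    "Synonyms": (675, 125),
    "Label similarity": (179, 21),
    "All": (1457, 543),
}


def share(part, total):
    """Hypothetical helper: percentage of `total`, rounded to a whole number."""
    # Assumption: Python's round() (round-half-to-even) is used for rounding.
    return round(100 * part / total)


# Compute (seen %, unseen %) for every row from its own counts.
percentages = {}
for group, (seen, unseen) in counts.items():
    total = seen + unseen
    percentages[group] = (share(seen, total), share(unseen, total))

for group, (seen_pct, unseen_pct) in percentages.items():
    print(f"{group}: seen {seen_pct}%, unseen {unseen_pct}%")
```

Each row is computed from its own seen and unseen counts, so the "All" row uses the totals as reported in the table rather than re-summing the four groups.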