Table 2 Numbers of entities annotated in PGxCorpus, by type.

From: PGxCorpus, a manually annotated corpus for pharmacogenomics

| PGxCorpus entity | Simple | Nested | Discont. | Both N&D | Total |
|---|---|---|---|---|---|
| Chemical | 1,512 | 192 | 2 | 12 | 1,718 |
| Genomic_factor | 21 | 68 | 7 | 3 | 99 |
| Gene_or_protein | 1,685 | 20 | 3 | 0 | 1,708 |
| Genomic_variation | 14 | 37 | 3 | 0 | 54 |
| Limited_variation | 237 | 537 | 98 | 47 | 919 |
| Haplotype | 15 | 112 | 4 | 6 | 137 |
| Phenotype | 282 | 330 | 60 | 27 | 699 |
| Disease | 460 | 143 | 14 | 18 | 635 |
| Pharmacodynamic_phenotype | 157 | 390 | 60 | 25 | 632 |
| Pharmacokinetic_phenotype | 31 | 109 | 14 | 6 | 160 |
| Total | 4,414 | 1,938 | 265 | 144 | 6,761 |

  1. Because nested and discontiguous (Discont.) entities are handled differently in our baseline experiments, we report separately the number of “simple” annotations, i.e. those that are neither nested nor discontiguous. “Nested” and “Discont.” refer to annotations that are only nested or only discontiguous, respectively; “Both N&D” refers to entities that are both nested and discontiguous. Each entity is counted only under its most specific type. An entity that appears several times is counted as many times as it appears.
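
The counting rules in the footnote imply that the four categories (Simple, Nested, Discont., Both N&D) do not overlap, so each row should sum to its Total and each column should sum to the Total row. The following is a minimal sketch of that arithmetic check, using the counts copied from Table 2. The names `COUNTS`, `REPORTED_TOTALS` and `check_table` are illustrative only and are not part of PGxCorpus or its tooling.

```python
# Counts copied from Table 2, as (simple, nested, discont., both N&D, total).
COUNTS = {
    "Chemical": (1512, 192, 2, 12, 1718),
    "Genomic_factor": (21, 68, 7, 3, 99),
    "Gene_or_protein": (1685, 20, 3, 0, 1708),
    "Genomic_variation": (14, 37, 3, 0, 54),
    "Limited_variation": (237, 537, 98, 47, 919),
    "Haplotype": (15, 112, 4, 6, 137),
    "Phenotype": (282, 330, 60, 27, 699),
    "Disease": (460, 143, 14, 18, 635),
    "Pharmacodynamic_phenotype": (157, 390, 60, 25, 632),
    "Pharmacokinetic_phenotype": (31, 109, 14, 6, 160),
}

# The "Total" row as reported in Table 2.
REPORTED_TOTALS = (4414, 1938, 265, 144, 6761)


def check_table(counts, reported_totals):
    """Return a list of inconsistencies found in the table (empty if none)."""
    problems = []

    # Each row: the four disjoint categories should add up to the row total.
    for name, (*parts, total) in counts.items():
        if sum(parts) != total:
            problems.append(f"row {name}: {sum(parts)} != {total}")

    # Each column: summing over entity types should give the reported total.
    column_sums = tuple(sum(column) for column in zip(*counts.values()))
    if column_sums != tuple(reported_totals):
        problems.append(f"columns: {column_sums} != {tuple(reported_totals)}")

    return problems


print(check_table(COUNTS, REPORTED_TOTALS))  # → []
```

An empty list means the row and column sums agree with the reported totals, which is consistent with every annotation being counted in exactly one category and under exactly one entity type.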