Table 7 Number of entities pre-annotated after extending PubTator annotation with the PHARE ontology.

From: PGxCorpus, a manually annotated corpus for pharmacogenomics

| PHARE entity | Discontiguous | All |
|---|---|---|
| Chemical | 430 | 87,764 |
| Disease | 0 | 29,589 |
| Gene_or_protein | 4,690 | 101,326 |
| Genomic_variation | 8,698 | 13,601 |
| Phenotype | 10,935 | 16,770 |

  1. The number of discontiguous entities is reported separately because these entities are excluded from our baseline experiments (see Section Technical Validation). No Disease entity is discontiguous, for two reasons: first, PubTator does not generate discontiguous annotations; second, the extension of annotations with PHARE, which can produce discontiguous annotations, yields Phenotype annotations rather than Disease annotations.