Table 4 Comparison of the performance of the BioPLBC model under different feature ablation scenarios on the NCBI-Disease and BC2GM datasets.

From: From biomedical knowledge graph construction to semantic querying: a comprehensive approach

| Model variant | NCBI-Disease (F1) | BC2GM (F1) |
|---|---|---|
| No BioBERT embedding | 86.20 | 81.43 |
| No part of speech embedding | 90.23 | 85.84 |
| No lexical morphological embedding | 89.79 | 84.88 |
| BioPLBC | 90.86 | 86.61 |

  1. This table shows how the model's F1 score changes when one feature (BioBERT embedding, part-of-speech embedding, or lexical morphological embedding) is removed at a time, so that the contribution of each feature can be evaluated. Removing the BioBERT embedding has the largest impact: F1 falls from 90.86 to 86.20 on NCBI-Disease and from 86.61 to 81.43 on BC2GM. Removing the part-of-speech embedding or the lexical morphological embedding has a smaller effect (drops of 0.63 to 1.73 points), but still lowers performance. The last row is the full BioPLBC model with all features, which achieves the highest F1 score on both datasets, demonstrating the effectiveness of multi-feature fusion.
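
The per-feature contributions discussed in the note can be checked by subtracting each ablated score from the full-model score. The following is a minimal sketch of that arithmetic only; the F1 values are copied from Table 4, while the names `FULL`, `ABLATED`, and `f1_drops` are illustrative and do not come from the paper.

```python
# F1 scores copied from Table 4. All identifiers here are illustrative
# assumptions, not part of the BioPLBC implementation.
FULL = {"NCBI-Disease": 90.86, "BC2GM": 86.61}

ABLATED = {
    "No BioBERT embedding": {"NCBI-Disease": 86.20, "BC2GM": 81.43},
    "No part of speech embedding": {"NCBI-Disease": 90.23, "BC2GM": 85.84},
    "No lexical morphological embedding": {"NCBI-Disease": 89.79, "BC2GM": 84.88},
}


def f1_drops(full, ablated):
    """Return, per variant and dataset, the full-model F1 minus the ablated F1."""
    return {
        variant: {ds: round(full[ds] - score, 2) for ds, score in scores.items()}
        for variant, scores in ablated.items()
    }


drops = f1_drops(FULL, ABLATED)

# Rank variants by total F1 lost across both datasets, largest first.
ranked = sorted(drops, key=lambda v: sum(drops[v].values()), reverse=True)

for variant in ranked:
    per_dataset = ", ".join(f"{ds}: -{d:.2f}" for ds, d in drops[variant].items())
    print(f"{variant}: {per_dataset}")
```

Under this ranking the BioBERT ablation comes first and the part-of-speech ablation last, which matches the ordering described in the note.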