Table 3 Performance (in %) of the complicated prompt designs, reported as precision/recall/F-score (P/R/F).

From: The influence of prompt engineering on large language models for protein–protein interaction identification in biomedical literature

 

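| Model   | Prompt   | AIMed (P/R/F)  | BioInfer (P/R/F) |
|---------|----------|----------------|------------------|
| GPT-3.5 | Prompt 1 | 41.2/93.3/57.1 | 65.3/76.2/70.3   |
| GPT-3.5 | Prompt 6 | 42.4/93.3/58.3 | 62.3/90.5/73.8   |
| GPT-4   | Prompt 1 | 40.0/93.3/56.0 | 62.3/78.6/69.5   |
| GPT-4   | Prompt 6 | 43.8/93.3/59.6 | 62.1/97.6/75.9   |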

  1. Bold values indicate highest performance scores in each comparison.
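
The F values in the table can be cross-checked against the reported precision and recall. The sketch below assumes that F is the balanced F1 score, i.e. the harmonic mean of precision and recall; the table does not state this explicitly, but every reported triple is consistent with it to within rounding. The function name `f1_score` and the 0.1 tolerance are illustrative choices, not taken from the article.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (inputs and output in %)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (P, R, F) triples copied from Table 3, keyed by (model, prompt, corpus).
TABLE_3 = {
    ("GPT-3.5", "Prompt 1", "AIMed"): (41.2, 93.3, 57.1),
    ("GPT-3.5", "Prompt 1", "BioInfer"): (65.3, 76.2, 70.3),
    ("GPT-3.5", "Prompt 6", "AIMed"): (42.4, 93.3, 58.3),
    ("GPT-3.5", "Prompt 6", "BioInfer"): (62.3, 90.5, 73.8),
    ("GPT-4", "Prompt 1", "AIMed"): (40.0, 93.3, 56.0),
    ("GPT-4", "Prompt 1", "BioInfer"): (62.3, 78.6, 69.5),
    ("GPT-4", "Prompt 6", "AIMed"): (43.8, 93.3, 59.6),
    ("GPT-4", "Prompt 6", "BioInfer"): (62.1, 97.6, 75.9),
}

# Assumed tolerance: P and R are themselves rounded to one decimal,
# so the recomputed F can differ slightly from the reported one.
TOLERANCE = 0.1

for key, (p, r, f_reported) in TABLE_3.items():
    f_computed = f1_score(p, r)
    status = "ok" if abs(f_computed - f_reported) < TOLERANCE else "MISMATCH"
    print(key, f"reported={f_reported:.1f}", f"computed={f_computed:.2f}", status)
```

Because the inputs are already rounded, small differences in the last digit are expected and do not indicate an error in the table.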