Table 3 Performance (in %) of the complicated prompt design, reported as precision/recall/F-score (P/R/F).
Model / Prompt | AIMed | BioInfer |
---|---|---|
GPT-3.5 | | |
Prompt 1 | 41.2/93.3/57.1 | 65.3/76.2/70.3 |
Prompt 6 | 42.4/93.3/58.3 | 62.3/90.5/73.8 |
GPT-4 | | |
Prompt 1 | 40.0/93.3/56.0 | 62.3/78.6/69.5 |
Prompt 6 | 43.8/93.3/59.6 | 62.1/97.6/75.9 |
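
The P/R/F triples in Table 3 can be sanity-checked with a short sketch. It assumes that F denotes the balanced F1 score, i.e. the harmonic mean of precision and recall. The table does not state this explicitly, although the reported values agree with it to within rounding. The helper names `f1_from_pr` and `parse_prf` are illustrative and do not come from the source.

```python
def f1_from_pr(precision, recall):
    """Harmonic mean of precision and recall (assumed definition of F)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def parse_prf(cell):
    """Split a 'P/R/F' table cell into three floats."""
    p, r, f = (float(x) for x in cell.strip().split("/"))
    return p, r, f


# Cells copied from Table 3, keyed by (model, prompt, dataset).
cells = {
    ("GPT-3.5", "Prompt 1", "AIMed"): "41.2/93.3/57.1",
    ("GPT-3.5", "Prompt 1", "BioInfer"): "65.3/76.2/70.3",
    ("GPT-3.5", "Prompt 6", "AIMed"): "42.4/93.3/58.3",
    ("GPT-3.5", "Prompt 6", "BioInfer"): "62.3/90.5/73.8",
    ("GPT-4", "Prompt 1", "AIMed"): "40.0/93.3/56.0",
    ("GPT-4", "Prompt 1", "BioInfer"): "62.3/78.6/69.5",
    ("GPT-4", "Prompt 6", "AIMed"): "43.8/93.3/59.6",
    ("GPT-4", "Prompt 6", "BioInfer"): "62.1/97.6/75.9",
}

for key, cell in cells.items():
    p, r, f_reported = parse_prf(cell)
    f_computed = f1_from_pr(p, r)
    # P and R are themselves rounded, so allow a small tolerance.
    status = "ok" if abs(f_computed - f_reported) < 0.1 else "mismatch"
    print(key, f"reported={f_reported:.1f}", f"computed={f_computed:.2f}", status)
```

Because the table reports precision and recall already rounded to one decimal place, the recomputed F value can differ from the reported one in the last digit; the tolerance of 0.1 accounts for this.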