Table 1. Performance of the open-source and closed-source base models, under different prompting techniques, in extracting inclusion and exclusion biomarkers from free-text clinical trial documents
From: Enhancing biomarker based oncology trial matching using large language models
Technique | Inclusion Precision ↑ | Inclusion Recall ↑ | Inclusion F2 ↑ | Exclusion Precision ↑ | Exclusion Recall ↑ | Exclusion F2 ↑ |
|---|---|---|---|---|---|---|
GPT-3.5-Turbo (0S) | 0.61 | 0.42 | 0.45 | 0.02 | 0.18 | 0.06 |
GPT-3.5-Turbo (PC) | 0.21 | 0.28 | 0.26 | 0.02 | 0.13 | 0.05 |
GPT-3.5-Turbo (1S) | 0.46 | 0.60 | 0.56 | 0.06 | 0.21 | 0.14 |
GPT-3.5-Turbo (2S) | 0.40 | 0.59 | 0.54 | 0.05 | 0.13 | 0.10 |
GPT-4 (0S) | 0.55 | 0.56 | 0.56 | 0.47 | 0.41 | 0.42 |
GPT-4 (PC) | 0.77 | 0.76 | 0.76 | 0.75 | 0.68 | 0.70 |
Hermes-2-Pro-Mistral-7B (0S) | 1.00 | 0.97 | 0.98 | 0.42 | 0.77 | 0.66 |
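
The table reports F2 without stating how it is computed. A minimal sketch, assuming the standard F-beta definition with beta = 2 (which weights recall more heavily than precision), shows how the F2 column relates to the precision and recall columns. The function name `f_beta` is illustrative and not taken from the paper. Because the published precision and recall values are rounded to two decimals, a recomputed score may differ from the reported one in the last digit.

```python
def f_beta(precision: float, recall: float, beta: float = 2.0) -> float:
    """Standard F-beta score (assumed definition; beta=2 favors recall)."""
    denom = beta**2 * precision + recall
    if denom == 0:
        # Both precision and recall are zero: define the score as 0.
        return 0.0
    return (1 + beta**2) * precision * recall / denom


# Inclusion-biomarker rows copied from Table 1: (precision, recall, reported F2)
rows = {
    "GPT-3.5-Turbo (0S)": (0.61, 0.42, 0.45),
    "GPT-4 (PC)": (0.77, 0.76, 0.76),
    "Hermes-2-Pro-Mistral-7B (0S)": (1.00, 0.97, 0.98),
}

for name, (p, r, reported) in rows.items():
    print(f"{name}: computed F2 = {f_beta(p, r):.2f}, reported = {reported:.2f}")
```

For these three rows the recomputed scores agree with the reported F2 values to two decimals, which is consistent with the assumed definition.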