Table 1. Model performance: zero-shot prompting with definitions
From: Privacy-preserving large language models for structured medical information retrieval
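 | Sensitivity (7b) | Sensitivity (13b) | Sensitivity (70b) | Specificity (7b) | Specificity (13b) | Specificity (70b) | Positive predictive value (7b) | Positive predictive value (13b) | Positive predictive value (70b) | Negative predictive value (7b) | Negative predictive value (13b) | Negative predictive value (70b) | Accuracy (7b) | Accuracy (13b) | Accuracy (70b) |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|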
Ascites | 1.00 | 0.75 | 0.95 | 0.77 | 0.99 | 0.95 | 0.16 | 0.71 | 0.44 | 1.00 | 0.99 | 1.00 | 0.78 | 0.98 | 0.95 |
Abdominal pain | 0.88 | 0.74 | 0.84 | 0.67 | 0.89 | 0.97 | 0.38 | 0.60 | 0.86 | 0.96 | 0.94 | 0.97 | 0.71 | 0.86 | 0.95 |
Shortness of breath | 0.87 | 0.42 | 0.87 | 0.77 | 0.99 | 0.96 | 0.45 | 0.86 | 0.82 | 0.96 | 0.89 | 0.97 | 0.79 | 0.88 | 0.94 |
Confusion | 0.63 | 0.59 | 0.76 | 0.89 | 0.90 | 0.94 | 0.34 | 0.34 | 0.54 | 0.96 | 0.96 | 0.98 | 0.87 | 0.87 | 0.93 |
Liver cirrhosis | 1.00 | 0.96 | 1.00 | 0.70 | 0.99 | 0.96 | 0.16 | 0.81 | 0.56 | 1.00 | 1.00 | 1.00 | 0.71 | 0.99 | 0.96 |
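
For readers who want to check how the five reported metrics relate to one another, the following is a minimal sketch of their standard definitions in terms of confusion-matrix counts. The class `ConfusionCounts`, the function `binary_metrics`, and the example counts are hypothetical and chosen only for illustration; they are not taken from the study, which does not list raw counts in this table.

```python
from dataclasses import dataclass
from math import nan


@dataclass(frozen=True)
class ConfusionCounts:
    """Counts for one binary label (hypothetical container, not from the paper)."""
    tp: int  # true positives
    fp: int  # false positives
    tn: int  # true negatives
    fn: int  # false negatives


def _ratio(numerator: int, denominator: int) -> float:
    # Return NaN instead of raising when a metric is undefined (empty denominator).
    return numerator / denominator if denominator else nan


def binary_metrics(c: ConfusionCounts) -> dict[str, float]:
    """Compute the five metrics reported in the table from raw counts."""
    total = c.tp + c.fp + c.tn + c.fn
    return {
        "sensitivity": _ratio(c.tp, c.tp + c.fn),  # share of true cases detected
        "specificity": _ratio(c.tn, c.tn + c.fp),  # share of non-cases rejected
        "ppv": _ratio(c.tp, c.tp + c.fp),          # share of positive calls that are correct
        "npv": _ratio(c.tn, c.tn + c.fn),          # share of negative calls that are correct
        "accuracy": _ratio(c.tp + c.tn, total),    # share of all calls that are correct
    }


# Illustrative counts only (assumed, not study data): a rare finding in 100 reports.
example = ConfusionCounts(tp=9, fp=3, tn=87, fn=1)
metrics = binary_metrics(example)
for name, value in metrics.items():
    print(f"{name}: {value:.2f}")
```

Because the positive predictive value divides by all positive calls, a small number of false positives can lower it sharply when true cases are rare, even if sensitivity is high. This is a general property of the definitions and may be worth keeping in mind when reading rows that pair high sensitivity with low positive predictive value.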