Table 2. Performance of Fu-LLM with different strategies for adjudication of clinical events
Adjudication item | Raw agreement, % (95% CI) | Sensitivity, % (95% CI) | Specificity, % (95% CI) | Positive predictive value, % (95% CI) | Negative predictive value, % (95% CI) |
|---|---|---|---|---|---|
Performances of finetune_qwen2_7b | |||||
Whether the information came from the participant himself/herself | 97.2 (96.1–98.2) | 97.9 (96.7–98.9) | 96.2 (94.5–97.9) | 97.4 (96.1–98.7) | 96.9 (95.2–98.6) |
Whether the participant died | 99.8 (99.5–100.0) | 100.0 (100.0–100.0) | 99.8 (99.5–100.0) | 91.7 (79.3–100.0) | 100.0 (100.0–100.0) |
Whether the participant was hospitalized<sup>a</sup> | 82.7 (80.4–85.0) | 88.9 (84.7–93.3) | 91.3 (88.9–93.6) | 79.1 (73.7–83.9) | 95.7 (94.0–97.3) |
Whether the participant underwent surgery<sup>a</sup> | 92.1 (90.3–93.8) | 95.3 (91.8–100.0) | 89.8 (87.3–92.5) | 75.1 (69.5–80.5) | 98.4 (97.0–99.4) |
Whether the participant took medication<sup>a</sup> | 96.4 (95.2–97.5) | 99.8 (99.4–100.0) | 91.0 (86.1–95.5) | 98.5 (97.6–99.3) | 98.5 (96.2–100.0) |
Total | 93.7 (93.1–94.3) | 97.5 (96.7–98.2) | 95.0 (94.2–95.8) | 93.1 (91.9–94.2) | 98.2 (97.8–98.7) |
Performances of finetune_qwen2_7b_wo_aug | |||||
Whether the information came from the participant himself/herself | 92.7 (91.2–94.3) | 95.5 (93.8–97.1) | 88.7 (85.6–91.5) | 92.5 (90.5–94.6) | 93.1 (90.2–95.3) |
Whether the participant died | 99.6 (99.2–99.9) | 100.0 (100.0–100.0) | 99.6 (99.2–99.9) | 84.6 (70.0–96.8) | 100.0 (100.0–100.0) |
Whether the participant was hospitalized<sup>a</sup> | 69.5 (66.5–72.3) | 77.9 (72.6–83.1) | 89.8 (87.0–92.2) | 73.6 (67.9–79.8) | 91.7 (89.4–94.0) |
Whether the participant underwent surgery<sup>a</sup> | 87.5 (85.5–89.5) | 87.7 (82.8–92.4) | 85.7 (82.8–88.7) | 66.4 (59.7–72.8) | 95.6 (93.7–97.4) |
Whether the participant took medication<sup>a</sup> | 94.8 (93.3–96.2) | 98.9 (98.2–99.5) | 81.4 (75.2–87.6) | 96.8 (95.4–98.0) | 92.9 (87.9–96.9) |
Total | 88.9 (88.1–89.7) | 94.4 (93.3–95.4) | 92.1 (91.1–93.1) | 89.1 (87.7–90.5) | 96.0 (95.2–96.7) |
Performances of zero_shot_qwen2_7b | |||||
Whether the information came from the participant himself/herself | 87.2 (85.0–89.0) | 91.2 (88.8–93.2) | 81.4 (77.7–85.0) | 87.8 (85.3–90.1) | 86.3 (82.6–89.3) |
Whether the participant died | 99.4 (98.9–99.8) | 100.0 (100.0–100.0) | 99.4 (98.9–99.8) | 78.6 (63.0–91.7) | 100.0 (100.0–100.0) |
Whether the participant was hospitalized<sup>a</sup> | 25.0 (22.4–27.5) | 51.4 (45.0–57.9) | 10.6 (8.0–13.1) | 17.5 (14.5–20.5) | 37.3 (29.9–44.9) |
Whether the participant underwent surgery<sup>a</sup> | 67.3 (64.3–70.2) | 76.6 (69.9–82.6) | 64.4 (60.6–68.3) | 40.9 (35.7–46.3) | 89.5 (86.3–92.3) |
Whether the participant took medication<sup>a</sup> | 82.9 (80.5–85.2) | 83.6 (81.0–85.9) | 86.9 (81.3–92.1) | 97.3 (96.1–98.4) | 48.1 (42.3–54.2) |
Total | 72.5 (71.3–73.7) | 82.1 (80.2–83.6) | 70.3 (68.5–71.9) | 65.5 (63.5–67.4) | 85.1 (83.5–86.5) |
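
For readers who want to reproduce the column definitions, the sketch below computes the five point estimates reported in Table 2 from a 2×2 confusion matrix using their standard textbook definitions. It is an illustration only: the function name `binary_metrics`, the `BinaryMetrics` container, and the example counts are hypothetical and do not come from the study, and the sketch does not attempt to reproduce the 95% confidence intervals or any item-specific scoring rules that the table footnote may define.

```python
from dataclasses import dataclass


@dataclass
class BinaryMetrics:
    """Point estimates, in percent, for one yes/no adjudication item."""
    raw_agreement: float
    sensitivity: float
    specificity: float
    ppv: float
    npv: float


def binary_metrics(tp: int, fp: int, fn: int, tn: int) -> BinaryMetrics:
    """Standard 2x2 definitions; the reference label is the human adjudication."""

    def pct(numerator: int, denominator: int) -> float:
        # Return NaN instead of raising when a margin of the table is empty.
        return 100.0 * numerator / denominator if denominator else float("nan")

    total = tp + fp + fn + tn
    return BinaryMetrics(
        raw_agreement=pct(tp + tn, total),  # share of cases where model and reference agree
        sensitivity=pct(tp, tp + fn),       # reference-positive cases the model flags
        specificity=pct(tn, tn + fp),       # reference-negative cases the model clears
        ppv=pct(tp, tp + fp),               # model-positive cases that are truly positive
        npv=pct(tn, tn + fn),               # model-negative cases that are truly negative
    )


# Hypothetical counts, chosen only to exercise the function.
m = binary_metrics(tp=90, fp=10, fn=5, tn=95)
print(f"agreement={m.raw_agreement:.1f} sens={m.sensitivity:.1f} "
      f"spec={m.specificity:.1f} ppv={m.ppv:.1f} npv={m.npv:.1f}")
```

Under these definitions, a rare positive class can pair very high sensitivity and specificity with a noticeably lower positive predictive value, because a few false positives are a large share of a small number of positive calls.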