Fig. 3: Sensitivity and specificity of reviewers for the three test sets. | npj Digital Medicine

From: Standardized patient profile review using large language models for case adjudication in observational research

Points indicate the sensitivity and specificity of each human or LLM reviewer against the gold standard. Error bars indicate 95% confidence intervals. For test set 1, the gold standard was created by external reviewers. For test sets 2 and 3, the gold standard was the majority vote of human reviewers using a leave-one-out approach. Slanted lines denote iso-AUC contours, spaced 0.1 apart.
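
The quantities in this caption can be illustrated with a minimal Python sketch. It is not the authors' code, and it rests on several assumptions the caption does not state: the 95% intervals are computed here with the Wilson score method, ties in the majority vote are resolved as negative, and the iso-AUC value of a single point is taken as the area under the two-segment ROC curve through that point, (sensitivity + specificity) / 2. All function names are hypothetical.

```python
from math import sqrt


def sensitivity_specificity(pred, gold):
    """Return (sensitivity, specificity) of binary labels `pred` against `gold`."""
    tp = sum(1 for p, g in zip(pred, gold) if p and g)
    fn = sum(1 for p, g in zip(pred, gold) if not p and g)
    tn = sum(1 for p, g in zip(pred, gold) if not p and not g)
    fp = sum(1 for p, g in zip(pred, gold) if p and not g)
    return tp / (tp + fn), tn / (tn + fp)


def wilson_interval(successes, n, z=1.96):
    """Wilson score interval for a proportion (assumed CI method; z=1.96 for 95%)."""
    phat = successes / n
    denom = 1 + z * z / n
    center = (phat + z * z / (2 * n)) / denom
    half = z * sqrt(phat * (1 - phat) / n + z * z / (4 * n * n)) / denom
    return center - half, center + half


def leave_one_out_gold(votes, held_out):
    """Majority vote of all reviewers except `held_out`.

    `votes` maps reviewer name -> list of 0/1 labels, one per case.
    Ties are resolved as negative (an assumption, not from the source).
    """
    others = [labels for name, labels in votes.items() if name != held_out]
    n_cases = len(others[0])
    gold = []
    for i in range(n_cases):
        positives = sum(labels[i] for labels in others)
        gold.append(1 if 2 * positives > len(others) else 0)
    return gold


def single_point_auc(sens, spec):
    """AUC of the ROC curve through (0,0), (1-spec, sens), (1,1)."""
    return (sens + spec) / 2


# Hypothetical votes from four reviewers on six cases.
votes = {
    "A": [1, 1, 0, 0, 1, 0],
    "B": [1, 0, 0, 0, 1, 0],
    "C": [1, 1, 0, 1, 1, 0],
    "D": [1, 1, 0, 0, 0, 0],
}
gold_for_b = leave_one_out_gold(votes, "B")
sens_b, spec_b = sensitivity_specificity(votes["B"], gold_for_b)
auc_b = single_point_auc(sens_b, spec_b)
```

Under this reading, every point lying on the same slanted line shares the same value of (sensitivity + specificity) / 2, which is why the contours appear as parallel lines in the sensitivity-specificity plane.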