Fig. 4: Overview of development and test sets used in this study. | npj Digital Medicine

From: Standardized patient profile review using large language models for case adjudication in observational research

A development set was created to guide prompt engineering. Test set 1 was also used in our prior work on KEEPER and thus provides a benchmark for consistency. Test set 2 mirrors test set 1 but uses insurance claims data. Test set 3 is a truly random sample across more diseases, chosen to enhance generalizability. The highly sensitive set demonstrates the use of LLMs to annotate a large set of patients, which allows the sensitivity and positive predictive value (PPV) of phenotype algorithms to be computed.
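
As an illustration of the last point, the sketch below shows how sensitivity and PPV could be computed once every patient in a set has an LLM annotation. It assumes the LLM labels are treated as the reference standard and uses the standard definitions, sensitivity = TP / (TP + FN) and PPV = TP / (TP + FP). The function name, argument names, and toy data are hypothetical and are not taken from the article.

```python
def sensitivity_and_ppv(algorithm_flags, llm_labels):
    """Compare phenotype-algorithm output with LLM annotations.

    Both arguments are equal-length sequences of booleans, one entry per
    patient (hypothetical layout, assumed for this sketch):
      algorithm_flags[i] -- True if the phenotype algorithm selected patient i
      llm_labels[i]      -- True if the LLM annotated patient i as a case
    Returns (sensitivity, ppv); a value is None when its denominator is zero.
    """
    if len(algorithm_flags) != len(llm_labels):
        raise ValueError("inputs must have the same length")

    pairs = list(zip(algorithm_flags, llm_labels))
    tp = sum(1 for flagged, case in pairs if flagged and case)
    fp = sum(1 for flagged, case in pairs if flagged and not case)
    fn = sum(1 for flagged, case in pairs if not flagged and case)

    sensitivity = tp / (tp + fn) if (tp + fn) else None
    ppv = tp / (tp + fp) if (tp + fp) else None
    return sensitivity, ppv


# Toy data for five patients (made up for illustration only).
flags = [True, True, False, False, True]
labels = [True, False, True, False, True]
sens, ppv = sensitivity_and_ppv(flags, labels)
print(f"sensitivity={sens:.3f}, PPV={ppv:.3f}")
```

In this toy example there are two true positives, one false positive, and one false negative, so both measures equal 2/3.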
