Fig. 7: Comparison of structured output with and without Pydantic in pathological T (pT) classification of gynecologic cancers using Qwen2.5 72B. | npj Precision Oncology


From: Real-world application of large language models for automated TNM staging using unstructured gynecologic oncology reports


A Example of conventional prompt-based structured output. The output may include unnecessary explanations or inconsistent formatting, requiring manual post-processing. B Example of output generated under Pydantic-enforced constraints, which ensure format consistency and suppress irrelevant text. C, D Confusion matrices showing the accuracy of Qwen2.5 72B in extracting the pT classification from pathology reports of gynecologic cancers (n = 951), with (C) and without (D) Pydantic-based structured decoding. The vertical axis represents the ground-truth pT values obtained via manual annotation, and the horizontal axis represents the pT values extracted by the model. Each cell shows the number of cases (n). “N.S.” denotes “Not specified.” E Summary of performance differences between the two approaches, evaluated using bootstrapped mean differences and 95% confidence intervals. The use of Pydantic improved all metrics (accuracy, precision, recall, and F1 score), and all improvements were statistically significant.
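
The bootstrapped comparison summarized in panel E can be illustrated with a minimal sketch. The caption does not state the resampling scheme, so this assumes a paired percentile bootstrap over per-case correctness flags. The function name `bootstrap_accuracy_diff`, the number of resamples, and the toy data are hypothetical and are not taken from the study.

```python
import random


def bootstrap_accuracy_diff(correct_a, correct_b, n_boot=2000, seed=0):
    """Paired percentile bootstrap for the accuracy difference (a minus b).

    correct_a and correct_b hold 1 (correct) or 0 (incorrect) for the same
    cases in the same order, one list per approach.
    """
    if len(correct_a) != len(correct_b):
        raise ValueError("both lists must cover the same cases")
    rng = random.Random(seed)
    n = len(correct_a)
    diffs = []
    for _ in range(n_boot):
        # Resample case indices with replacement, keeping the pairing intact.
        idx = [rng.randrange(n) for _ in range(n)]
        acc_a = sum(correct_a[i] for i in idx) / n
        acc_b = sum(correct_b[i] for i in idx) / n
        diffs.append(acc_a - acc_b)
    diffs.sort()
    mean_diff = sum(diffs) / n_boot
    # Percentile interval: 2.5th and 97.5th percentiles of the resampled differences.
    lower = diffs[int(0.025 * n_boot)]
    upper = diffs[int(0.975 * n_boot) - 1]
    return mean_diff, lower, upper


# Toy data only (assumption): 20 cases, not the 951 reports from the study.
with_schema = [1] * 18 + [0] * 2
without_schema = [1] * 12 + [0] * 8

mean_diff, lower, upper = bootstrap_accuracy_diff(with_schema, without_schema)
print(f"mean difference = {mean_diff:.3f}, 95% CI = [{lower:.3f}, {upper:.3f}]")
```

Under this scheme, a difference would be read as statistically significant when the 95% interval excludes zero. The same resampling loop could be applied to precision, recall, or F1 by recomputing the metric on each resample.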
