Table 4 Impact of ToolAgent on diagnostic performance
From: Early diagnosis of axial spondyloarthritis in primary care using multi-agent systems
| Model | Dataset | UNSURE rate | Sensitivity without UNSURE (95% CI) | Specificity without UNSURE (95% CI) | Accuracy without UNSURE (95% CI) | Accuracy with UNSURE (95% CI) | F1-score without UNSURE (95% CI) | Mean time (s) |
|---|---|---|---|---|---|---|---|---|
| Agents without ToolAgent | Training Dataset | 0.0279 | 0.9300 (0.8967–0.9575) | 0.7717 (0.6829–0.8515) | 0.8883 (0.8567–0.9198) | 0.8635 (0.8280–0.898) | 0.9246 (0.900–0.9472) | 55.8 |
| Agents with ToolAgent | Training Dataset | 0.0279 | 0.9228 (0.8884–0.9533) | 0.8667 (0.7975–0.9334) | 0.9083 (0.8739–0.9398) | 0.8830 (0.8530–0.9100) | 0.9373 (0.9149–0.9579) | 47.9 |
| Agents without ToolAgent | Validation Dataset | 0.0538 | 0.9380 (0.883–0.973) | 0.7234 (0.5908–0.8421) | 0.8807 (0.8295–0.9261) | 0.8333 (0.7680–0.8920) | 0.9202 (0.8794–0.9498) | 55.8 |
| Agents with ToolAgent | Validation Dataset | 0.0323 | 0.8615 (0.7969–0.9174) | 0.8000 (0.6800–0.9057) | 0.8444 (0.7889–0.9000) | 0.8172 (0.7560–0.8760) | 0.8889 (0.8413–0.9242) | 47.9 |
| Agents without ToolAgent | Testing Dataset | 0.0000 | 0.9062 (0.8077–1.0000) | 0.7368 (0.5238–0.9287) | 0.8431 (0.7451–0.9412) | / | 0.8788 (0.7869–0.9552) | 54.4 |
| Agents with ToolAgent | Testing Dataset | 0.0000 | 0.9375 (0.8485–1.0000) | 0.7368 (0.5263–0.9375) | 0.8627 (0.7647–0.9608) | / | 0.8955 (0.8136–0.9677) | 56.3 |
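
The table does not spell out how the "with UNSURE" and "without UNSURE" metrics are computed. The reported values are consistent with one reading: "without UNSURE" metrics are computed only on cases where the agents gave a definite answer, while "accuracy with UNSURE" keeps every case in the denominator and counts an UNSURE output as incorrect. The Python sketch below illustrates that reading. It is an assumption rather than the authors' code, and the label strings `POS`, `NEG`, and `UNSURE` and the function name `summarize` are hypothetical.

```python
import math

# Hypothetical label strings; the source does not specify an encoding.
POS, NEG, UNSURE = "POS", "NEG", "UNSURE"


def ratio(numerator, denominator):
    """Return numerator / denominator, or NaN when the denominator is zero."""
    return numerator / denominator if denominator else math.nan


def summarize(y_true, y_pred):
    """Compute the table's metrics under the assumed definitions.

    y_true: list of POS/NEG reference labels.
    y_pred: list of POS/NEG/UNSURE model outputs.
    """
    total = len(y_true)
    # Cases where the model committed to a diagnosis.
    decided = [(t, p) for t, p in zip(y_true, y_pred) if p != UNSURE]

    tp = sum(1 for t, p in decided if t == POS and p == POS)
    tn = sum(1 for t, p in decided if t == NEG and p == NEG)
    fp = sum(1 for t, p in decided if t == NEG and p == POS)
    fn = sum(1 for t, p in decided if t == POS and p == NEG)

    return {
        "unsure_rate": ratio(total - len(decided), total),
        "sensitivity_without_unsure": ratio(tp, tp + fn),
        "specificity_without_unsure": ratio(tn, tn + fp),
        "accuracy_without_unsure": ratio(tp + tn, len(decided)),
        # Assumption: an UNSURE output counts as an incorrect prediction.
        "accuracy_with_unsure": ratio(tp + tn, total),
        "f1_without_unsure": ratio(2 * tp, 2 * tp + fp + fn),
    }


# Toy data, not taken from the study.
example = summarize(
    [POS, POS, POS, NEG, NEG],
    [POS, NEG, UNSURE, NEG, POS],
)
```

Under these definitions, accuracy with UNSURE equals accuracy without UNSURE multiplied by (1 − UNSURE rate). The training row without ToolAgent fits this: 0.8883 × (1 − 0.0279) ≈ 0.8635. It would also explain why the testing rows, where the UNSURE rate is 0, show "/" in that column.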