Table 2 Diagnostic performances of Models A and B.
| | Dice score | Recall | Precision |
|---|---|---|---|
Model A | |||
Training | 0.866 (0.774–0.918) | 0.953 (0.851–0.992) | 0.736 (0.658–0.780) |
Validation | 0.835 (0.702–0.899) | 0.918 (0.772–0.967) | 0.710 (0.596–0.764) |
Internal test | 0.858 (0.752–0.909) | 0.944 (0.828–0.981) | 0.639 (0.729–0.773) |
External test without TF | 0.734 (0.560–0.843) | 0.807 (0.616–0.927) | 0.624 (0.476–0.716) |
External test with TF | 0.832 (0.671–0.916) | 0.915 (0.738–0.978) | 0.707 (0.570–0.779) |
Model B | |||
Training | 0.896 (0.813–0.940) | 0.970 (0.887–0.992) | 0.771 (0.700–0.808) |
Validation | 0.865 (0.745–0.923) | 0.943 (0.813–0.983) | 0.744 (0.641–0.794) |
Internal test | 0.857 (0.723–0.921) | 0.934 (0.787–0.970) | 0.737 (0.622–0.792) |
External test without TF | 0.756 (0.613–0.851) | 0.824 (0.668–0.928) | 0.650 (0.527–0.732) |
External test with TF | 0.846 (0.730–0.902) | 0.922 (0.788–0.969) | 0.727 (0.638–0.776) |
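
For reference, the three metrics reported in Table 2 can be illustrated with a minimal sketch. It assumes the standard element-wise definitions on binary masks (true positives, false positives, false negatives); the source does not state how the metrics were aggregated across cases, so this is not necessarily the authors' exact procedure. The function name `segmentation_metrics` and the convention of returning 1.0 when a denominator is zero are assumptions made for illustration.

```python
def segmentation_metrics(pred, truth):
    """Return (dice, recall, precision) for two flattened binary masks.

    `pred` and `truth` are equal-length sequences of 0/1 values.
    Hypothetical helper: element-wise definitions are assumed.
    """
    if len(pred) != len(truth):
        raise ValueError("masks must have the same length")

    # Count agreement and disagreement between prediction and ground truth.
    tp = sum(1 for p, t in zip(pred, truth) if p and t)
    fp = sum(1 for p, t in zip(pred, truth) if p and not t)
    fn = sum(1 for p, t in zip(pred, truth) if not p and t)

    # Assumed convention: an empty denominator yields 1.0.
    dice = 2 * tp / (2 * tp + fp + fn) if (2 * tp + fp + fn) else 1.0
    recall = tp / (tp + fn) if (tp + fn) else 1.0
    precision = tp / (tp + fp) if (tp + fp) else 1.0
    return dice, recall, precision


# Toy example: 2 true positives, 1 false positive, 1 false negative.
dice, recall, precision = segmentation_metrics(
    [1, 1, 1, 0, 0, 0],
    [1, 1, 0, 1, 0, 0],
)
print(round(dice, 3), round(recall, 3), round(precision, 3))
```

When metrics are computed per case and then summarized, the reported Dice score need not equal the harmonic mean of the reported recall and precision.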