Table 6 Confidence-based identification of patient groups with high predictive performance for Garden types III and IV in cross-validation on the internal dataset (N = 1,588). Confidence is the predicted probability output by the deep-learning models. The gray zone comprises patients whose confidence is below 95% for both Garden types III and IV. For individual networks, the best value for each metric is highlighted in bold. The ensemble results were obtained by combining the outputs of ResNet101, EfficientNetB4, and ResNet50, the three models with the highest total DSC.

From: Garden classification of femoral neck fracture using deep-learning algorithm

| Name | ACC (%), total | ACC (%), confidence ≥ 95% | AUC (%), total | AUC (%), confidence ≥ 95% | DSC (%), total | DSC (%), confidence ≥ 95% | Number of patients in gray zone |
| --- | --- | --- | --- | --- | --- | --- | --- |
| EfficientNetB0 | 67.0 | 70.7 | 68.7 | 69.0 | 62.7 | 63.2 | 305 (29.3%) |
| EfficientNetB2 | 67.2 | 71.7 | 71.0 | **74.5** | 63.6 | 66.4 | 279 (26.8%) |
| EfficientNetB4 | **68.6** | 74.1 | 70.0 | 70.0 | 64.9 | 66.7 | 431 (41.4%) |
| ResNet18 | 65.5 | 74.2 | 67.9 | 70.7 | 60.8 | 65.7 | 481 (46.2%) |
| ResNet50 | 66.9 | 73.1 | 69.1 | 72.2 | 64.3 | 69.0 | 395 (37.9%) |
| ResNet101 | 67.4 | 72.5 | 71.1 | 74.0 | **65.5** | 69.5 | 350 (33.6%) |
| ResNext50 | 67.1 | **75.3** | **71.2** | 74.2 | 64.0 | **70.6** | 393 (37.7%) |
| ReXNet100 | 63.4 | 69.0 | 65.1 | 66.4 | 60.0 | 63.1 | 336 (32.2%) |
| ReXNet130 | 66.6 | 72.2 | 67.8 | 68.8 | 62.5 | 65.0 | 373 (35.8%) |
| ReXNet150 | 67.9 | 75.0 | 70.4 | 71.8 | 63.9 | 68.9 | 366 (35.1%) |
| DenseNet121 | 65.5 | 72.8 | 68.1 | 70.5 | 62.6 | 67.3 | 427 (41.0%) |
| MobileNetV3 | 66.6 | 72.0 | 69.0 | 70.9 | 63.4 | 67.1 | 313 (30.0%) |
| Ensemble of top 3 | 70.4 | 81.5 | 73.9 | 74.2 | 67.6 | 73.3 | 691 (66.3%) |

  1. ACC, accuracy; AUC, area under the curve; DSC, Dice similarity coefficient.
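
The confidence gating described in the caption can be illustrated with a minimal Python sketch. It rests on several assumptions that are not stated in the source: the ensemble is formed by averaging per-class probabilities (the caption says only that the outputs were combined), each prediction is a dictionary of probabilities for Garden types III and IV, and the four toy patients and all function names are hypothetical. None of the numbers below come from the study.

```python
from statistics import mean

THRESHOLD = 0.95  # confidence cut-off named in the caption


def ensemble_average(model_probs):
    """Average per-class probabilities across models.

    Assumption: simple averaging; the source does not specify the rule.
    """
    classes = model_probs[0].keys()
    return {c: mean(p[c] for p in model_probs) for c in classes}


def split_by_confidence(probs_list, threshold=THRESHOLD):
    """Return (confident, gray) index lists.

    A patient is in the gray zone when the probability is below the
    threshold for both Garden type III and Garden type IV.
    """
    confident, gray = [], []
    for i, probs in enumerate(probs_list):
        if max(probs["III"], probs["IV"]) >= threshold:
            confident.append(i)
        else:
            gray.append(i)
    return confident, gray


def accuracy(probs_list, labels, indices):
    """Fraction of the selected patients whose top class matches the label."""
    indices = list(indices)
    if not indices:
        return float("nan")
    hits = sum(
        1 for i in indices
        if max(probs_list[i], key=probs_list[i].get) == labels[i]
    )
    return hits / len(indices)


# Toy data (hypothetical): 4 patients, each scored by 3 models.
per_patient_model_probs = [
    [{"III": 0.98, "IV": 0.02}, {"III": 0.96, "IV": 0.04}, {"III": 0.97, "IV": 0.03}],
    [{"III": 0.60, "IV": 0.40}, {"III": 0.30, "IV": 0.70}, {"III": 0.55, "IV": 0.45}],
    [{"III": 0.01, "IV": 0.99}, {"III": 0.03, "IV": 0.97}, {"III": 0.02, "IV": 0.98}],
    [{"III": 0.90, "IV": 0.10}, {"III": 0.80, "IV": 0.20}, {"III": 0.85, "IV": 0.15}],
]
labels = ["III", "III", "IV", "III"]

ensembled = [ensemble_average(m) for m in per_patient_model_probs]
confident, gray = split_by_confidence(ensembled)

total_acc = accuracy(ensembled, labels, range(len(labels)))
confident_acc = accuracy(ensembled, labels, confident)

print("gray zone patients:", gray)
print("accuracy, total:", total_acc)
print("accuracy, confidence >= 95%:", confident_acc)
```

In this toy case the two uncertain patients fall into the gray zone, and accuracy on the remaining confident subset is higher than on the full set. The table's "Total" and "Confidence ≥ 95%" columns show the same pattern.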