Table 4 Performance comparison and ablation studies on the test set of 68Ga-PSMA-11 PET/CT.

From: Mask R-CNN assisted 2.5D object detection pipeline of 68Ga-PSMA-11 PET/CT-positive metastatic pelvic lymph node after radical prostatectomy from solely CT imaging

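| WSI pretrain | Post-processing | Bagging | Metrics | Quantile windowing (−79, 141) (%) | Quantile windowing (−84, 151) (%) | Quantile windowing (−90, 169) (%) | Quantile windowing (−125, 225) (%) | Quantile windowing (−97, 310) (%) |
|---|---|---|---|---|---|---|---|---|
| | | | Sensitivity | 66.671 | 75.831 | 70.831 | 75.002 | 70.831 |
| | | | Precision | 26.743 | 28.642 | 28.844 | 25.896 | 24.000 |
| | | | F-1 Score | 38.17 | 41.453 | 41.323 | 38.498 | 35.855 |
| | | | AUC | 79.667 | 87.231 | 84.327 | 87.092 | 84.215 |
| | | | Sensitivity | 66.674 | 70.834 | 70.836 | 83.333 | 75.006 |
| | | | Precision | 31.716 | 29.825 | 32.068 | 29.573 | 30.101 |
| | | | F-1 Score | 42.987 | 41.976 | 44.146 | 43.651 | 42.968 |
| | | | AUC | 80.765 | 84.799 | 83.972 | 89.124 | 86.998 |
| | | | Best sensitivity | 62.501 | 66.671 | 66.678 | 83.333 | 66.679 |
| | | | Best precision | 47.942 | 47.943 | 47.957 | 48.945 | 47.954 |
| | | | Best F-1 Score | 48.854 | 52.382 | 52.382 | 53.077 | 52.045 |
| | | | AUC | 78.257 | 80.011 | 79.993 | 89.579 | 79.681 |
| | | | Best sensitivity | 83.351 (all windows) | | | | |
| | | | Best precision | 58.621 (all windows) | | | | |
| | | | Best F-1 Score | 60.021 (all windows) | | | | |
| | | | AUC | 90.034 (all windows) | | | | |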

  1. Bold results are the best; underlined results are the worst cases, indicating the effectiveness of the ablated strategies. Sensitivity, precision, F-1 score, and area under the curve (AUC) are used as evaluation metrics. AUC is calculated per image: the probabilities of all nodes detected in an image are examined, and the maximum of those probabilities is taken as the image score; if an image has no prediction, its image score is set to 0. Results in the last three rows are generated from “window bagging” of the model predictions over all quantile windowing ranges listed in the column headers.
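The image-based AUC described in the footnote can be sketched as follows. This is a minimal illustration rather than the authors' code: the function names (`image_scores`, `roc_auc`, `f1_score`), the dictionary layout of the detections, and the example values are all assumptions made here. AUC is computed with the standard pairwise (Mann-Whitney) formulation, and the F-1 helper uses the usual harmonic mean of sensitivity and precision, which gives approximately the 38.17 reported for the first row of the (−79, 141) column.

```python
from typing import Dict, List, Sequence


def image_scores(detections: Dict[str, List[float]],
                 image_ids: Sequence[str]) -> List[float]:
    """Image score = maximum detection probability; 0.0 if nothing was detected."""
    return [max(detections.get(image_id, []), default=0.0)
            for image_id in image_ids]


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the fraction of (positive, negative) image pairs ranked correctly.

    Ties between a positive and a negative image count as half a correct pair.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative image")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def f1_score(sensitivity: float, precision: float) -> float:
    """Harmonic mean of sensitivity (recall) and precision, in the same units."""
    total = sensitivity + precision
    return 0.0 if total == 0 else 2 * sensitivity * precision / total


# Hypothetical detections: image id -> probabilities of the detected nodes.
detections = {"img_a": [0.9, 0.4], "img_b": [0.3], "img_d": [0.2]}
image_ids = ["img_a", "img_b", "img_c", "img_d"]  # img_c has no prediction
labels = [1, 0, 0, 1]                             # 1 = image contains a positive node

scores = image_scores(detections, image_ids)      # [0.9, 0.3, 0.0, 0.2]
print(roc_auc(labels, scores))                    # 0.75
print(round(f1_score(66.671, 26.743), 2))         # 38.17
```

Assigning a score of 0 to images without predictions keeps every test image in the ranking, so a missed positive image lowers the AUC instead of being silently dropped.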