Table 1 Model performance and concordance with pathologist/surgeon annotations for the gross measurements, tissue orientation/mapping, completeness, and margin assessment tasks.

From: Intraoperative margin assessment for basal cell carcinoma with deep learning and histologic tumor mapping to surgical site

| Task | Evaluation metric | Subtask | Estimate | 95% CI lower (2.5%) | 95% CI upper (97.5%) |
|---|---|---|---|---|---|
| Gross Measurements | 3D Point Cloud: MADᵃ (cm) | L | 0.36 | 0.21 | 0.53 |
| | | W | 0.23 | 0.12 | 0.52 |
| | | H | 0.29 | 0.19 | 0.50 |
| | | Overall | 0.29 | 0.2 | 0.47 |
| | NeRF: MAD (cm) | L | 0.21 | 0.11 | 0.53 |
| | | W | 0.19 | 0.17 | 0.22 |
| | | H | 0.27 | 0.17 | 0.36 |
| | | Overall | 0.22 | 0.18 | 0.33 |
| | 3D Point Cloud: MPCᵇ (%) | L | 12.4 | 7.9 | 17.7 |
| | | W | 13.0 | 8.3 | 32.0 |
| | | H | 31.7 | 20.0 | 58.0 |
| | | Overall | 19.0 | 12.6 | 33.2 |
| | NeRF: MPC (%) | L | 11.3 | 4.5 | 14.3 |
| | | W | 11.3 | 8.3 | 16.9 |
| | | H | 34 | 6.5 | 58.3 |
| | | Overall | 18.9 | 8.2 | 28.5 |
| | 3D Point Cloud: Correlation | L | 0.848 | 0.544 | 0.973 |
| | | W | 0.833 | 0.467 | 0.971 |
| | | H | 0.812 | 0.456 | 0.955 |
| | | Overall | 0.831 | 0.602 | 0.937 |
| | NeRF: Correlation | L | 0.946 | 0.814 | 0.988 |
| | | W | 0.875 | 0.576 | 0.976 |
| | | H | 0.893 | 0.646 | 0.984 |
| | | Overall | 0.905 | 0.782 | 0.96 |
| Tissue orientation | MAD (°) | | 4.883 | 4.04 | 5.451 |
| | Proportion Correct Orientation | ≤45° | 94.7% | 92.3% | 96.6% |
| | | ≤15° | 86.0% | 82.6% | 88.9% |
| | | ≤5° | 51.3% | 46.7% | 55.7% |
| GNN: Tissue completeness | AUC | | 0.839 | 0.825 | 0.855 |
| | Macro-AUC | | 0.851 | 0.839 | 0.863 |
| GNN: Margin assessment | AUC | Original Slides | 0.967 | 0.960 | 0.979 |
| | | Follicle Removal | 0.965 | 0.959 | 0.975 |
| | Macro-AUC | Original Slides | 0.962 | 0.954 | 0.970 |
| | | Follicle Removal | 0.957 | 0.949 | 0.964 |
| CNN: Tissue completeness | AUC | | 0.923 | 0.917 | 0.930 |
| | Macro-AUC | | 0.916 | 0.907 | 0.925 |
| CNN: Margin assessment | AUC | | 0.900 | 0.887 | 0.912 |
| | Macro-AUC | | 0.877 | 0.859 | 0.893 |
| Tumor Mapping | Accuracy | | 0.992 | 0.915 | 0.999 |

  1. Macro-AUC is the AUC computed separately for each slide and then averaged across slides, giving each slide equal weight; the standard AUC is computed over subimages pooled across all slides. The 95% confidence intervals were obtained with a 1000-sample non-parametric bootstrap, resampling at the whole-slide image (WSI) level to account for variation in concordance across cases.
  2. ᵃ Median Absolute Deviation.
  3. ᵇ Median Proportion Change.
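The distinction between pooled AUC and Macro-AUC, and the WSI-level bootstrap described in the footnotes, can be illustrated with a minimal sketch. It is not the authors' code: the function names (`auc`, `pooled_auc`, `macro_auc`, `bootstrap_ci`), the data layout (one dict of subimage labels and scores per slide), the percentile form of the interval, and the toy numbers are all assumptions made for illustration.

```python
import random
from statistics import mean


def auc(labels, scores):
    """AUC as the probability that a random positive outscores a random negative (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        return None  # AUC is undefined when only one class is present
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def pooled_auc(slides):
    """Standard AUC: pool subimages from all slides, then compute one AUC."""
    labels = [y for s in slides for y in s["labels"]]
    scores = [x for s in slides for x in s["scores"]]
    return auc(labels, scores)


def macro_auc(slides):
    """Macro-AUC: one AUC per slide, averaged so each slide has equal weight."""
    per_slide = [auc(s["labels"], s["scores"]) for s in slides]
    per_slide = [a for a in per_slide if a is not None]
    return mean(per_slide) if per_slide else None


def bootstrap_ci(slides, stat, n_boot=1000, seed=0):
    """Percentile 95% CI, resampling whole slides (not subimages) with replacement."""
    rng = random.Random(seed)
    estimates = []
    for _ in range(n_boot):
        resample = [rng.choice(slides) for _ in slides]
        value = stat(resample)
        if value is not None:
            estimates.append(value)
    estimates.sort()
    lower = estimates[int(0.025 * (len(estimates) - 1))]
    upper = estimates[int(0.975 * (len(estimates) - 1))]
    return lower, upper


# Toy data (made up): three slides, each with subimage labels and model scores.
slides = [
    {"labels": [0, 0, 1, 1], "scores": [0.1, 0.4, 0.35, 0.8]},
    {"labels": [0, 1], "scores": [0.2, 0.9]},
    {"labels": [0, 1, 1], "scores": [0.6, 0.55, 0.7]},
]

print("pooled AUC:", pooled_auc(slides))
print("macro-AUC:", macro_auc(slides))
print("macro-AUC 95% CI:", bootstrap_ci(slides, macro_auc))
```

Resampling whole slides keeps subimages from the same slide together, which is why the interval reflects case-to-case variation rather than treating subimages as independent.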