Table 4. Number of nonoverlapping patches and cancer proportion obtained by splitting 200 whole-slide images (WSIs) from The Cancer Genome Atlas image library for prostate cancer (PRAD) into 512 × 512-pixel patches at 10× objective magnification (1 pixel corresponds to ~1 µm).

From: Critical evaluation of artificial intelligence as a digital twin of pathologists for prostate cancer pathology

| Finding | Random data splitting, Fold 1 | Random data splitting, Fold 2 | Random data splitting, Fold 3 |
|---|---|---|---|
| Training set (case number = 195) | | | |
| Patches | | | |
| Noncancer tissues, n (%) | 44,240 (53.99) | 43,530 (53.35) | 43,975 (54.03) |
| Prostate cancer, n (%) | 37,697 (46.01) | 38,069 (46.65) | 37,411 (45.97) |
| Total, n (%) | 81,937 (100.00) | 81,599 (100.00) | 81,386 (100.00) |
| Average cancer pixel proportion in a patch labeled with prostate cancer (median) | 78.8% (100.0%) | 78.9% (100.0%) | 78.9% (100.0%) |
| In-training optimization set (case number = 5) | | | |
| Patches | | | |
| Noncancer tissues, n (%) | 1,769 (72.0) | 2,479 (88.7) | 2,034 (67.6) |
| Prostate cancer, n (%) | 689 (28.0) | 317 (11.3) | 975 (32.4) |
| Total, n (%) | 2,458 (100.0) | 2,796 (100.0) | 3,009 (100.0) |
| Average cancer pixel proportion in a patch labeled with prostate cancer (median) | 76.7% (100.0%) | 64.9% (72.0%) | 75.7% (97.0%) |

  1. All slides were scanned at 40× objective magnification. We randomly selected 200 WSIs for model development: 195 WSIs for training and 5 WSIs for in-training optimization. These numbers were held fixed across the folds generated by random data splitting. All slides were stained with hematoxylin and eosin (H&E).
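
The fixed 195/5 split described in the footnote can be illustrated with a minimal sketch. It is not the authors' code: the slide identifiers, the seed, and the function name `split_folds` are hypothetical, and the sketch assumes each fold is an independent random shuffle of the same 200 WSIs, which the source does not state explicitly.

```python
import random


def split_folds(slide_ids, n_folds=3, n_optimization=5, seed=0):
    """Return one (training, optimization) pair of slide-ID lists per fold.

    Assumption: every fold reshuffles the full slide list independently,
    so optimization sets of different folds may overlap.
    """
    rng = random.Random(seed)  # fixed seed (hypothetical) for reproducibility
    folds = []
    for _ in range(n_folds):
        shuffled = list(slide_ids)
        rng.shuffle(shuffled)
        # The first n_optimization slides form the in-training optimization set;
        # the remaining slides form the training set.
        optimization = sorted(shuffled[:n_optimization])
        training = sorted(shuffled[n_optimization:])
        folds.append((training, optimization))
    return folds


# Hypothetical identifiers standing in for the 200 selected WSIs.
slide_ids = [f"WSI_{i:03d}" for i in range(200)]
folds = split_folds(slide_ids)

for k, (training, optimization) in enumerate(folds, start=1):
    print(f"Fold {k}: {len(training)} training, {len(optimization)} optimization")
```

Because the split is made at the slide level, the patch totals differ between folds even though the slide counts stay at 195 and 5, which matches the varying totals reported in the table.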