Table 1. Data sets used from multi-center data sources.

From: Accurate recognition of colorectal cancer with semi-supervised deep learning on pathological images

| Data source | Dataset usage | Sample preparation | Examination type (radical surgery/colonoscopy) | Population | CRC subjects | CRC slides | Non-CRC subjects | Non-CRC slides | Total subjects | Total slides |
|---|---|---|---|---|---|---|---|---|---|---|
| Xiangya Hospital (XH, Dataset-PATT) | PATT^a | FFPE^b | 100%/0% | Changsha, China | 614 | 614 | 228 | 228 | 842 | 842 |
| NCT-UMM (Dataset-PAT) | PAT^c | FFPE | NA^d | Germany | NA | NA | NA | NA | NA | 86 |
| Xiangya Hospital (XH-Dataset-PT) | PT^e | FFPE | 80%/20% | Changsha, China | 3990 | 7871 | 1849 | 2132 | 5839 | 10,003 |
| Xiangya Hospital (XH-Dataset-HAC) | PT & HAC^f | FFPE | 89%/11% | Changsha, China | 98 | 99 | 97 | 114 | 195 | 213 |
| Pingkuang Collaborative Hospital (PCH) | PT & HAC | FFPE | 60%/40% | Jiangxi, China | 50 | 50 | 46 | 46 | 96 | 96 |
| The Third Xiangya Hospital (TXH) | PT & HAC | FFPE | 61%/39% | Changsha, China | 48 | 70 | 48 | 65 | 96 | 135 |
| Hunan Provincial People’s Hospital (HPH) | PT & HAC | FFPE | 61%/39% | Changsha, China | 49 | 50 | 49 | 49 | 98 | 99 |
| Adicon clinical laboratory (ACL) | PT & HAC | FFPE | 22%/78% | Changsha, China | 100 | 100 | 107 | 107 | 207 | 207 |
| Fudan University Shanghai Cancer Center (FUS) | PT & HAC | FFPE | 97%/3% | Shanghai, China | 100 | 100 | 98 | 98 | 198 | 198 |
| Guangdong Provincial People’s Hospital (GPH) | PT & HAC | FFPE | 77%/23% | Guangzhou, China | 100 | 100 | 85 | 85 | 185 | 185 |
| Southwest Hospital (SWH) | PT & HAC | FFPE | 93%/7% | Chongqing, China | 99 | 99 | 100 | 100 | 199 | 199 |
| The First Affiliated Hospital of Air Force Medical University (AMU) | PT & HAC | FFPE | 95%/5% | Xi’an, China | 101 | 101 | 104 | 104 | 205 | 205 |
| Sun Yat-Sen University Cancer Center (SYU) | PT & HAC | FFPE | 100%/0% | Guangzhou, China | 91 | 91 | 6 | 6 | 97 | 97 |
| Chinese PLA General Hospital (CGH) | PT | FFPE | 100%/0% | Beijing, China | 0 | 0 | 100 | 100 | 100 | 100 |
| The Cancer Genome Atlas (TCGA-FFPE) | PT | FFPE | 100%/0% | U.S. | 441 | 441 | 5 | 5 | 446 | 446 |
| Total | | | | | 5881 | 9786 | 2922 | 3239 | 8803 | 13,111 |

  a. Patch-level training and test.
  b. Formalin-fixed and paraffin-embedded.
  c. Independent patch-level test.
  d. No information (NA) on the number of cancer or non-cancer subjects/slides was provided.
  e. Patient-level test.
  f. Human-AI competition.
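
The Total row can be cross-checked against the per-source counts. The following is a minimal sketch, not part of the original study: the names `ROWS`, `REPORTED_TOTAL`, `column_sums`, and `inconsistent_rows` are hypothetical, the counts are transcribed from Table 1, and NA entries are assumed to contribute zero to a column sum.

```python
# Counts transcribed from Table 1, in the order:
# (CRC subjects, CRC slides, non-CRC subjects, non-CRC slides,
#  total subjects, total slides). None marks an entry reported as NA.
ROWS = {
    "XH Dataset-PATT": (614, 614, 228, 228, 842, 842),
    "NCT-UMM Dataset-PAT": (None, None, None, None, None, 86),
    "XH-Dataset-PT": (3990, 7871, 1849, 2132, 5839, 10003),
    "XH-Dataset-HAC": (98, 99, 97, 114, 195, 213),
    "PCH": (50, 50, 46, 46, 96, 96),
    "TXH": (48, 70, 48, 65, 96, 135),
    "HPH": (49, 50, 49, 49, 98, 99),
    "ACL": (100, 100, 107, 107, 207, 207),
    "FUS": (100, 100, 98, 98, 198, 198),
    "GPH": (100, 100, 85, 85, 185, 185),
    "SWH": (99, 99, 100, 100, 199, 199),
    "AMU": (101, 101, 104, 104, 205, 205),
    "SYU": (91, 91, 6, 6, 97, 97),
    "CGH": (0, 0, 100, 100, 100, 100),
    "TCGA-FFPE": (441, 441, 5, 5, 446, 446),
}

# The Total row as printed in the table.
REPORTED_TOTAL = (5881, 9786, 2922, 3239, 8803, 13111)


def column_sums(rows):
    """Sum each of the six columns, treating NA (None) as 0 (an assumption)."""
    sums = [0] * 6
    for counts in rows.values():
        for i, value in enumerate(counts):
            sums[i] += value or 0
    return tuple(sums)


def inconsistent_rows(rows):
    """Return names of fully reported rows where CRC + non-CRC != total."""
    bad = []
    for name, counts in rows.items():
        if None in counts:
            continue  # skip rows with NA entries
        crc_subj, crc_slides, non_subj, non_slides, tot_subj, tot_slides = counts
        if crc_subj + non_subj != tot_subj or crc_slides + non_slides != tot_slides:
            bad.append(name)
    return bad


if __name__ == "__main__":
    print("Column sums match Total row:", column_sums(ROWS) == REPORTED_TOTAL)
    print("Rows with inconsistent totals:", inconsistent_rows(ROWS))
```

Under these assumptions the column sums agree with the printed Total row, and the 86 NCT-UMM slides count toward the total slide figure even though their CRC/non-CRC split is not reported.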