Table 2 Performance comparison of radiomics and deep learning models across training, testing, and multicenter validation cohorts

From: Noninvasive prediction of occult pT3a upstaging in localized ccRCC with radiogenomic insights and prognostic relevance

 

| Model | Dataset | AUC (95% CI) | Sensitivity | Specificity | Accuracy |
|---|---|---|---|---|---|
| Radiomics_Peritumor | Training Cohort | 0.992 (0.987–0.998) | 0.783 | 0.986 | 0.966 |
| | Testing Cohort | 0.709 (0.605–0.814) | 0.154 | 0.988 | 0.895 |
| | Validation Cohort Ⅰ | 0.659 (0.554–0.763) | 0.095 | 0.956 | 0.895 |
| | Validation Cohort Ⅱ | 0.633 (0.533–0.732) | 0.070 | 0.993 | 0.781 |
| | Validation Cohort Ⅲ | 0.602 (0.489–0.716) | 0.080 | 0.943 | 0.850 |
| | Validation Cohort Ⅳ | 0.685 (0.558–0.812) | 0.130 | 0.992 | 0.865 |
| Radiomics_Tumor | Training Cohort | 0.995 (0.991–0.999) | 0.783 | 0.957 | 0.966 |
| | Testing Cohort | 0.787 (0.71–0.863) | 0.154 | 0.988 | 0.869 |
| | Validation Cohort Ⅰ | 0.677 (0.552–0.801) | 0.286 | 0.931 | 0.885 |
| | Validation Cohort Ⅱ | 0.641 (0.537–0.744) | 0.116 | 0.986 | 0.786 |
| | Validation Cohort Ⅲ | 0.656 (0.54–0.773) | 0.120 | 0.928 | 0.842 |
| | Validation Cohort Ⅳ | 0.664 (0.546–0.783) | 0.130 | 0.977 | 0.853 |
| Radiomics_Combined | Training Cohort | 0.996 (0.993–0.999) | 0.750 | 0.991 | 0.967 |
| | Testing Cohort | 0.800 (0.709–0.892) | 0.154 | 0.994 | 0.899 |
| | Validation Cohort Ⅰ | 0.707 (0.588–0.826) | 0.095 | 0.985 | 0.922 |
| | Validation Cohort Ⅱ | 0.693 (0.596–0.789) | 0.140 | 0.993 | 0.797 |
| | Validation Cohort Ⅲ | 0.676 (0.568–0.784) | 0.200 | 1.000 | 0.915 |
| | Validation Cohort Ⅳ | 0.751 (0.633–0.868) | 0.130 | 1.000 | 0.872 |
| Deep learning | Training Cohort | 1.000 | 1.000 | 0.891 | 0.995 |
| | Testing Cohort | 0.818 (0.703–0.932) | 0.538 | 0.994 | 0.852 |
| | Validation Cohort Ⅰ | 0.732 (0.626–0.838) | 0.333 | 0.927 | 0.885 |
| | Validation Cohort Ⅱ | 0.764 (0.684–0.844) | 0.465 | 0.861 | 0.770 |
| | Validation Cohort Ⅲ | 0.696 (0.594–0.798) | 0.400 | 0.804 | 0.761 |
| | Validation Cohort Ⅳ | 0.720 (0.596–0.843) | 0.391 | 0.910 | 0.833 |
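
For readers less familiar with the sensitivity, specificity, and accuracy columns reported above, the following is a minimal sketch of how such values are conventionally derived from binary confusion-matrix counts. It assumes the standard definitions, with pT3a upstaging as the positive class; the table does not state how the authors computed or thresholded their predictions. The class `ConfusionCounts`, the helper functions, and the example counts are all hypothetical illustrations and are not data from the study.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Binary confusion-matrix counts (assumed positive class: pT3a upstaging)."""
    tp: int  # upstaged cases predicted as upstaged
    fn: int  # upstaged cases predicted as not upstaged
    tn: int  # non-upstaged cases predicted as not upstaged
    fp: int  # non-upstaged cases predicted as upstaged


def sensitivity(c: ConfusionCounts) -> float:
    """Fraction of true positives among all actual positives."""
    return c.tp / (c.tp + c.fn)


def specificity(c: ConfusionCounts) -> float:
    """Fraction of true negatives among all actual negatives."""
    return c.tn / (c.tn + c.fp)


def accuracy(c: ConfusionCounts) -> float:
    """Fraction of all cases classified correctly."""
    return (c.tp + c.tn) / (c.tp + c.fn + c.tn + c.fp)


def prevalence(c: ConfusionCounts) -> float:
    """Fraction of actual positives in the cohort."""
    return (c.tp + c.fn) / (c.tp + c.fn + c.tn + c.fp)


def accuracy_from_rates(sens: float, spec: float, prev: float) -> float:
    """Accuracy as a prevalence-weighted average of sensitivity and specificity."""
    return sens * prev + spec * (1.0 - prev)


# Hypothetical cohort of 100 patients with 10 upstaged cases (NOT study data).
example = ConfusionCounts(tp=2, fn=8, tn=85, fp=5)

if __name__ == "__main__":
    print(f"sensitivity = {sensitivity(example):.3f}")
    print(f"specificity = {specificity(example):.3f}")
    print(f"accuracy    = {accuracy(example):.3f}")
```

Because accuracy is a prevalence-weighted average of sensitivity and specificity, it is dominated by specificity when the positive class is rare. This is one way a row can pair a sensitivity below 0.2 with an accuracy above 0.85.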