Table 5 Discriminative performances of the machine learning-based binary classification models for prediction of kidney outcomes.

From: Deep learning-based quantitative analysis of glomerular morphology in IgA nephropathy whole slide images and its prognostic implications

| Model | Input for training | Internal validation, 97 (84:13): AUC | Internal validation: p-value ᵃ | External validation, 399 (365:34): AUC | External validation: p-value |
|---|---|---|---|---|---|
| XGB | Image features | 0.941 (0.851–1.000) | – | 0.753 (0.656–0.850) | – |
| XGB | Clinical data | 0.823 (0.680–0.967) | 0.143 ᵇ | 0.686 (0.584–0.789) | 0.167 |
| XGB | Image + clinical data | 0.902 (0.789–1.000) | 0.248 ᶜ | 0.758 (0.661–0.855) | 0.088 |
| XGB | IIgAN-PT variables | 0.878 (0.754–1.000) | 0.194 ᵇ | 0.739 (0.640–0.837) | 0.779 |
| XGB | Image + IIgAN-PT variables | 0.904 (0.791–1.000) | 0.551 ᵈ | 0.751 (0.653–0.848) | 0.795 |
| RF | Image features | 0.878 (0.754–1.000) | – | 0.761 (0.665–0.857) | – |
| RF | Clinical data | 0.842 (0.704–0.980) | 0.583 | 0.724 (0.624–0.824) | 0.450 |
| RF | Image + clinical data | 0.899 (0.784–1.000) | 0.353 | 0.782 (0.688–0.876) | 0.190 |
| RF | IIgAN-PT variables | 0.855 (0.722–0.989) | 0.675 | 0.739 (0.640–0.838) | 0.652 |
| RF | Image + IIgAN-PT variables | 0.916 (0.809–1.000) | 0.246 | 0.776 (0.681–0.871) | 0.432 |
| LR | Image features | 0.883 (0.760–1.000) | – | 0.732 (0.632–0.831) | – |
| LR | Clinical data | 0.862 (0.731–0.993) | 0.747 | 0.687 (0.585–0.789) | 0.435 |
| LR | Image + clinical data | 0.893 (0.775–1.000) | 0.512 | 0.749 (0.651–0.846) | 0.149 |
| LR | IIgAN-PT variables | 0.865 (0.736–0.995) | 0.783 | 0.717 (0.616–0.817) | 0.791 |
| LR | Image + IIgAN-PT variables | 0.916 (0.809–1.000) | 0.296 | 0.779 (0.685–0.873) | 0.177 |

  1. Values are presented as AUC (range of 95% confidence interval).
  2. AUC, area under the curve; XGB, extreme gradient boosting classifier; RF, random forest classifier; LR, logistic regression classifier; IIgAN-PT, international IgAN prediction tool.
  3. ᵃ p-values according to DeLong’s test; ᵇ p-value versus image features; ᶜ p-value versus clinical data; ᵈ p-value versus IIgAN-PT variables.
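
The footnotes report each AUC with a 95% confidence interval and cite DeLong’s test, but the table does not say how the intervals were computed. As a minimal sketch, the Python below assumes a DeLong-style variance estimate with a normal-approximation interval clipped to [0, 1]. That assumption would be consistent with several upper bounds being exactly 1.000, but it is not confirmed by the source. The function name `auc_with_delong_ci`, its parameters, and the toy scores are hypothetical and are not taken from the study.

```python
import math
from statistics import variance


def _psi(x, y):
    """Mann-Whitney kernel: 1 if the positive outscores the negative, 0.5 on a tie."""
    if x > y:
        return 1.0
    if x == y:
        return 0.5
    return 0.0


def auc_with_delong_ci(pos_scores, neg_scores, z=1.959964):
    """Return (auc, lower, upper) using DeLong's variance estimate.

    pos_scores: predicted scores for cases with the outcome (at least 2).
    neg_scores: predicted scores for cases without the outcome (at least 2).
    z: normal quantile for the interval (1.959964 gives about 95%).
    """
    m, n = len(pos_scores), len(neg_scores)
    # Structural components: one placement value per positive and per negative.
    v10 = [sum(_psi(x, y) for y in neg_scores) / n for x in pos_scores]
    v01 = [sum(_psi(x, y) for x in pos_scores) / m for y in neg_scores]
    auc = sum(v10) / m
    # DeLong variance: sample variances of the components, scaled by group size.
    var_auc = variance(v10) / m + variance(v01) / n
    half_width = z * math.sqrt(var_auc)
    # Assumption: the interval is clipped to the valid AUC range [0, 1].
    return auc, max(0.0, auc - half_width), min(1.0, auc + half_width)


# Toy scores for illustration only; these are NOT data from the study.
toy_pos = [0.9, 0.8, 0.7, 0.35]
toy_neg = [0.6, 0.4, 0.3, 0.2, 0.1]
auc, lower, upper = auc_with_delong_ci(toy_pos, toy_neg)
print(f"AUC {auc:.3f} ({lower:.3f}-{upper:.3f})")
```

With few events, as in the internal validation set, an interval of this kind is wide and often reaches the upper bound. The p-values in the table compare pairs of models on the same patients, which additionally requires the covariance between the two models' components; that step is not shown here.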