Table 3. The predictive performance of the best-performing models vs JADBio models in predicting atezolizumab response of 320 mUC patients from the merged discovery and validation datasets

From: Predicting atezolizumab response in metastatic urothelial carcinoma patients using machine learning on integrated tumour gene expression and clinical data

| Features | Best algorithm | Evaluation method | Number of features | MCC | ROC-AUC | PR-AUC |
| --- | --- | --- | --- | --- | --- | --- |
| GEP | LGBM-OMC | Nested 10-fold CV | 49 genes | 0.252 | 0.685 | 0.860 |
| GEP + clinical | CART-OMC | Nested 10-fold CV | 63 genes + TMB + TNB | 0.253 | 0.664 | 0.870 |
| GEP | JADBio-RF-SES (a) | Repeated 10-fold CV (maximum repeats = 20) | 25 genes | 0.179 | 0.625 | 0.842 |
| GEP + clinical | JADBio-RF-SES (b) | Repeated 10-fold CV (maximum repeats = 20) | 15 genes + TMB + TNB | 0.198 | 0.660 | 0.834 |

  1. (a) JADBio optimal parameters for GEP: RF (hyper-parameters: 100 trees, mean squared error (MSE) splitting criterion, minimum leaf size = 9, splits = 1, alpha = 1, and variables to split = number of variables divided by 9.0) with SES feature selection (hyper-parameters: maxK = 2, alpha = 0.1, and budget = 3 * number of variables).
  2. (b) JADBio optimal parameters for GEP + clinical: RF (hyper-parameters: 100 trees, MSE splitting criterion, minimum leaf size = 7, splits = 1, alpha = 1, and variables to split = number of variables divided by 9.0) with SES feature selection (hyper-parameters: maxK = 2, alpha = 0.1, and budget = 3 * number of variables).
  3. Two feature sets were used: gene expression profiles (GEP) and gene expression profiles integrated with clinical data (GEP + clinical). For JADBio, the out-of-sample CV predictions of its best-performing model were obtained for the patients. To allow a direct comparison, the evaluation metrics of each method were calculated from its out-of-sample CV predictions with the same script and approach. Random-level performance is 0.0 for MCC, 0.5 for ROC-AUC, 0.765 for PR-AUC when using GEP, and 0.733 for PR-AUC when using GEP + clinical.
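
As an illustration of how the three metrics and their random-level values can be computed from out-of-sample CV predictions, the following is a minimal sketch in plain Python. It is not the authors' script: the function names, the toy labels and scores, and the 0.5 decision threshold are all hypothetical, and PR-AUC is estimated here as average precision, which is one common estimator (the table notes do not say which estimator was used). It relies on the fact that a random scorer has an expected precision equal to the fraction of positive samples, which is why the random-level PR-AUC depends on the class balance.

```python
import math


def mcc(y_true, y_pred):
    """Matthews correlation coefficient from binary labels and binary predictions."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # Convention: return 0.0 (random level) when a row or column of the
    # confusion matrix is empty and the coefficient is undefined.
    return 0.0 if denom == 0 else (tp * tn - fp * fn) / denom


def roc_auc(y_true, scores):
    """ROC-AUC as the probability that a positive outranks a negative (ties count 0.5)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def average_precision(y_true, scores):
    """PR-AUC estimated as average precision, stepping over distinct score thresholds."""
    n_pos = sum(y_true)
    pairs = sorted(zip(scores, y_true), key=lambda pair: -pair[0])
    ap, tp, prev_recall, i = 0.0, 0, 0.0, 0
    while i < len(pairs):
        j = i
        # Samples with the same score cross the threshold together.
        while j < len(pairs) and pairs[j][0] == pairs[i][0]:
            tp += pairs[j][1]
            j += 1
        recall, precision = tp / n_pos, tp / j
        ap += (recall - prev_recall) * precision
        prev_recall = recall
        i = j
    return ap


def pr_auc_random_level(y_true):
    """Expected PR-AUC of a random scorer: the fraction of positive samples."""
    return sum(y_true) / len(y_true)


# Hypothetical toy data standing in for pooled out-of-sample CV predictions.
y_true = [1, 1, 0, 1, 0, 1, 1, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.55, 0.3, 0.2]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 threshold

print("MCC:", round(mcc(y_true, y_pred), 3))
print("ROC-AUC:", round(roc_auc(y_true, scores), 3))
print("PR-AUC (average precision):", round(average_precision(y_true, scores), 3))
print("Random-level PR-AUC:", round(pr_auc_random_level(y_true), 3))
```

Applying the same functions to every method's pooled predictions mirrors the idea of a shared evaluation script, so that differences in the metrics reflect the models rather than the way the metrics were computed.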