Table 1 Publications relevant to machine learning in immunotherapy response prediction

From: Informing immunotherapy with multi-omics driven machine learning

| Task | ML Model | ML-based biomarker | Cancer | Patients | Therapy | Validation method | Performance | Input | Output | Ref |
|---|---|---|---|---|---|---|---|---|---|---|
| Predict response | RF *, CNN | Yes | NSCLC | 915 | Anti-PD-(L)1 | 5-fold cross-validation | AUC (0.96–0.97) | 55 SNV locations | Response prediction (DCB, PFS, OS) | 21 |
| Predict response | SVM-RFE *, LASSO regularized LR | Yes | Metastatic BLCA | 272 | Anti-PD-L1 | 10-fold cross-validation | AUC (0.93) | TMB related genes | Responder vs. Non-responder and selected genes | 22 |
| Predict response | Multi-task linear regression using elastic net regularization | No | SKCM, STAD, BLCA, GBM | 432 | Anti-PD-(L)1 | Hold-out | AUC (0.79–0.84) | RNA-based features | Responder vs. Non-responder | 23 |
| Predict PFS | Linear SVM * | Yes | Metastatic gastrointestinal cancer | 96 | Anti-PD-(L)1 | 13-fold cross-validation | AUC (0.74) | RNA of 395 genes | DCB vs. non-DCB | 24 |
| Predict response | SVM and XGBoost | No | Pan-cancer | Not mentioned | ICI | Hold-out | Accuracy (0.88) | RNA of 2,387 genes | Responder vs. Non-responder | 25 |
| Predict response | SVM-RFE *, RF | Yes | SKCM | 212 | Anti-PD-1 | 10-fold cross-validation | AUC (0.71–0.87) | RNA + SNV + clinical features | Response prediction | 26 |
| Predict response | LR *, NN | Yes | Esophageal adenocarcinoma | 76 | ICI | Hold-out | AUC (0.88–1.00) | RNA | Responder vs. Non-responder and selected genes | 27 |
| Predict response | A joint NMF-based model * | Yes | Pan-cancer (12 cancer types) | 764 | Anti-PD-1, anti-PD-L1, anti-PD-L2, anti-CTLA4 | 5-fold cross-validation | AUC (0.74) | RNA | Responder vs. Non-responder | 28 |
| Predict response | LASSO regression *, SVM | Yes | NSCLC | 122 | Anti-PD-(L)1 | Hold-out | Significant hazard ratio differences | RNA | Responder vs. Non-responder | 29 |
| Predict response | LASSO regression * | Yes | NSCLC, UC, RCC | 366 | Anti-PD-L1 | 5-fold cross-validation | AUC (up to 0.62) | RNA | Responder vs. Non-responder and selected gene features | 30 |
| Predict response | KNN, Linear SVM, RBF-SVM, GP, RF, DT, NN, AdaBoost, NB, quadratic classifier | No | BCC | 11 | Anti-PD-1 | 5-fold cross-validation | Accuracy (0.61–0.97 from different models) | Top 2,000 highly variable genes of CD8 T cell scRNA-seq data | Responder vs. Non-responder | 32 |
| Predict response | NN * | Yes | SKCM, BCC | 43 | Anti-PD-1 | LOOCV | Accuracy (up to 1.00) | scRNA-seq data of CD8+ T cells | Responder vs. Non-responder | 33 |
| Predict response | LR | No | GEA | Not mentioned | Anti-PD-1 along with radiation therapy | 10-fold cross-validation | Accuracy (up to 1.00) | Expression of selected genes from PBMC | Responder vs. Non-responder | 34 |
| Predict response | RF | No | NSCLC | 213 | Anti-PD-1 | 5-fold cross-validation | AUC (0.76–0.83) | Circulating miRNA + clinical information | Responder vs. Non-responder | 35 |
| Predict response and identify response-related cfmiR biomarkers | RF * | Yes | Metastatic melanoma | 47 | ICI | Not mentioned | Not mentioned | 162 differentially expressed cfmiRs | Responder vs. Non-responder and selected cfmiRs | 36 |
| Predict response | LASSO regression | No | NSCLC | 78 | Anti-PD-(L)1 | 10-fold cross-validation | AUC (0.80) | Differentially methylated CpG sites | Responder vs. Non-responder | 37 |
| Predict response | LASSO regularized LR | No | Metastatic melanoma | 65 | ICI | 10-fold cross-validation | AUC (0.96) | 5,000 most variable methylated CpG sites | Responder vs. Non-responder | 38 |
| Predict response | NN | No | HNSCC | 37 + simulated patients | Anti-PD-1 | 10-fold cross-validation | AUC (0.61–0.90) | Clinical features | Responder vs. Non-responder | 39 |
| Predict response | RF *, SVM | Yes | Colorectal cancer | 25 (mice) | Anti-mouse CTLA4, anti-mouse PD-L1 | LOOCV | Not directly shown | Spectral features from Raman spectroscopy | Responder vs. Non-responder and feature contributions | 40 |
| Predict response | MIL + DeepTCR | No | Not mentioned | 43 | ICI | Monte Carlo cross-validation | AUC (0.86) | TCR sequencing data + MHC sequencing data | Responder vs. Non-responder | 41 |
| Predict response | RF | No | Pan-cancer (16 cancer types) | 1,479 | ICI | 5-fold cross-validation | AUC (up to 0.85) | Genomic features based on DNA variants, RNA, demographic and clinical data | Responder vs. Non-responder | 42 |
| Predict response | SVM, NB, RF, KNN, AdaBoost, boosted LR | No | RCC, UC, SKCM, GBM, BCC | 955 | Anti-PD-(L)1, anti-CTLA4, anti-PD-(L)1 plus anti-CTLA4 combination | 5-fold cross-validation | AUC (0.62–0.81) | Stemness features based on RNA | Responder vs. Non-responder | 43 |
| Predict response | XGBoost | No | Metastatic NSCLC | 239 | ICI | 10-fold cross-validation | AUC (0.72–0.74) | 25 variables based on blood immune cell signatures and clinical data | DCB vs. non-DCB | 44 |
| TME analysis and response prediction | LR | No | ccRCC | 172 | Anti-PD-(L)1, anti-CTLA4 | Hold-out | AUC (up to 0.93) | RNA of selected genes | Responder vs. Non-responder | 45 |
| Predict response | L2 regularized LR * | Yes | Melanoma, gastric cancer, bladder cancer | 729 | ICI | LOOCV, Monte Carlo cross-validation | AUC (0.69–0.79 in different datasets) | Network-based biomarkers + gene-based biomarkers + TME-based biomarkers | Response (Responder vs. Non-responder) and OS prediction | 46 |
| Predict response | CNN *, Attention-based multiple-instance learning | Yes | NSCLC | 345 | Anti-PD-(L)1 | 10-fold cross-validation | AUC (up to 0.80) | Radiology, pathology, genomic alteration, TMB | Risk score | 47 |
| Predict CAR T cell phenotype for immunotherapy response | NN | No | Not mentioned | NA | CAR T therapy | 10-fold cross-validation | R squared | Array of signaling motifs of a CAR costimulatory domain + initial CAR T cell count | Quantitative phenotype prediction (cytotoxicity and stemness) from a CAR motif combination | 51 |

*Machine learning models with a feature selection process. Abbreviations: SVM-RFE support vector machine recursive feature elimination, LASSO least absolute shrinkage and selection operator, LR logistic regression, BLCA bladder urothelial carcinoma, AUC area under the curve, TMB tumor mutational burden, RF random forest, CNN convolutional neural network, DCB durable clinical benefit, PFS progression-free survival, OS overall survival, SKCM skin cutaneous melanoma, STAD stomach adenocarcinoma, GBM glioblastoma multiforme, SVM support vector machine, XGBoost extreme gradient boosting, NN neural network, ICI immune checkpoint inhibitors, NMF non-negative matrix factorization, NSCLC non-small-cell lung cancer, UC urothelial carcinoma, RCC renal cell carcinoma, KNN k-nearest neighbors, GP Gaussian process, DT decision tree, NB naïve Bayes, BCC basal cell carcinoma, LOOCV leave-one-out cross-validation, GEA gastroesophageal adenocarcinoma, PBMC peripheral blood mononuclear cells, cfmiRs circulating cell-free microRNAs, HNSCC head and neck squamous cell carcinoma, MIL multiple-instance learning, TCR T-cell receptor, MHC major histocompatibility complex, ccRCC clear cell renal cell carcinoma, TME tumor microenvironment, NA not applicable, CAR chimeric antigen receptor, irAE immune-related adverse events.