Table 2 Application of machine learning technologies in immunotherapy-related tumor microenvironment analyses

From: Informing immunotherapy with multi-omics driven machine learning

| Task | ML model | Cancer | TME feature type | Input | Output | Ref |
| --- | --- | --- | --- | --- | --- | --- |
| Predict MSI status | RF-based model, SVM | Not mentioned | 808 cancer-gene panel (DNA, RNA) | 54 features based on the sequenced panel | MSI classification: MSI high vs. MSS | 59 |
| Identify gene target panel to predict TMB and response | LASSO regression | Metastatic melanoma, NSCLC | WES | Somatic mutations | Responders vs. Non-responders and selected mutations | 60 |
| Identify cancer stem-like signatures | LASSO Cox regression | Gastric cancer | RNA | RNA of 2,527 genes | Responders vs. Non-responders and selected stem-like features | 67 |
| Identify cancer stem-like signatures | Cancer stemness clustering: K-means; Cancer stemness feature selection: LASSO regression, SVM, RFB, XGBoost, LR; Response prediction: TIDE | GBM | RNA | RNA of cancer stemness-associated DEGs | Stemness subtype cluster and selected cancer stemness-associated genes | 68 |
| Identify CAF signatures | CAF subtype clustering: Consensus clustering; Gene selection: RF, DT, KNN | Melanoma, lung cancer, TNBC | RNA | Prognostic-related RNA data | CAF-subtype clustering and selected subtype-related genes | 70 |
| Identify CAF signatures | LASSO regression, RF | Melanoma | RNA | DEGs | Responders vs. Non-responders and key CAFs-related DEGs | 71 |
| Identify gene signatures and immunotherapy response prediction | TME clustering: Hierarchical clustering; Cluster feature selection: LASSO Cox regression, RF; TME cluster classification: SVM, NB, RF, NN; Risk prediction: DT | LUAD | RNA | RNA, clinicopathological traits | TME (risk) cluster classification: low vs. intermediate vs. high and their cluster related gene features | 76 |
| Identify immune-related genes from protein signatures | Immune-related gene identification: NN; Immunotherapy response: DT | Gastric cancer | PPI network data, RNA | PPMI matrix based on PPI network data, RNA | NN: Gene property classification (immune-promoted vs. immune-inhibited); DT: Response prediction (Responders vs. Non-responders) | 77 |
| Identify TIIClnc | LASSO regularized LR, Boruta, XGBoost, SVM, RF | GBM | RNA | Selected lncRNA | Regulation prediction in immune cell lines and GBM cell lines (upregulated vs. downregulated) | 78 |
| Identify TIIClnc | LASSO, Ridge, stepwise Cox, CoxBoost, RSF, Enet, plsRcox, SuperPC, GBR, survival-SVM | LGG | RNA | Filtered top expressed TIIClnc signatures | Responders vs. Non-responders and selected TIIClnc signatures | 79 |
| Identify impact of CTLA-4 blockade on antigen-specific, human T-cell responses early between neonates and adults | RF | Healthy donors | Flow cytometry | Frequencies of cytokine producers in the encountered CD4+ T-cell responses | CD4+ T cell classification (neonates vs. adults) after CTLA-4 blockade stimulation | 80 |
| Predict T cell infiltration | LR, SVM | Colorectal cancer | Histological data, 373 cancer and immune related gene panel from FoundationOne | LR: image-based features; SVM: patient’s gene expression profile | T cells and tumor cells co-localized vs. not co-localized | 81 |
| Predict TIL | Multimodal NN model | Colorectal cancer, breast cancer, lung cancer, pancreatic cancer | RNA, H&E staining images | RNA-seq + Visual texture feature extracted from H&E staining | Proportions of five immune cell types within tumors and total TIL proportions | 82 |
| Identify epigenomic signatures | RF | LUAD | DNA methylation data | iDMCs | Immunoactivity classification and selected signatures | 83 |
| TIME deconvolution | nu-SVR-based noise constrained recursive feature selection | Not mentioned | RNA | RNA | Proportions of 22 immune cell types | 84 |
| Identify tumor-associated metabolism subtypes | Cox regression with LASSO penalty | LUAD | RNA | RNA of 1,426 lipid metabolism genes and 1,638 immune-related genes | Metabolic TME subtype prediction (metabolism vs. immunoactive) | 92 |

MSI microsatellite instability, RF random forest, SVM support vector machine, MSS microsatellite stable, LASSO least absolute shrinkage and selection operator, TMB tumor mutational burden, NSCLC non-small-cell lung cancer, WES whole-exome sequencing, TIDE tumor immune dysfunction and exclusion, GBM glioblastoma multiforme, RFB random forest and Boruta, CAF cancer-associated fibroblast, DT decision tree, KNN k-nearest neighbors, TNBC triple-negative breast cancer, DEG differentially expressed gene, LUAD lung adenocarcinoma, PPI protein-protein interaction, PPMI positive pointwise mutual information, TIIClnc tumor-infiltrating immune cell-associated lncRNA, RSF random survival forest, Enet elastic net, plsRcox partial least squares regression for Cox, SuperPC supervised principal components, GBR generalized boosted regression, LGG low-grade glioma, TIL tumor-infiltrating lymphocyte, iDMCs immunophenotype-specific differentially methylated CpG sites, nu-SVR nu-support vector regression.
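
The PPMI matrix listed as the input for the gastric cancer study (ref. 77) can be illustrated with a minimal sketch. It assumes the standard pointwise-mutual-information form, PPMI(i, j) = max(0, log2(p(i, j) / (p(i) p(j)))), computed from a square matrix of co-occurrence counts. The table does not state how the cited study derived such counts from the PPI network, so the function name `ppmi_matrix` and the toy counts below are hypothetical and are not taken from that work.

```python
import math


def ppmi_matrix(counts):
    """Return the PPMI matrix for a square matrix of non-negative co-occurrence counts.

    Assumption: standard PPMI, i.e. max(0, log2(p(i,j) / (p(i) * p(j)))),
    with probabilities estimated from the counts themselves.
    """
    n = len(counts)
    total = sum(sum(row) for row in counts)
    result = [[0.0] * n for _ in range(n)]
    if total == 0:
        return result
    row_sums = [sum(row) for row in counts]
    col_sums = [sum(counts[i][j] for i in range(n)) for j in range(n)]
    for i in range(n):
        for j in range(n):
            c = counts[i][j]
            if c == 0:
                continue  # log(0) is undefined; the PPMI entry stays at 0
            # p(i,j) / (p(i) * p(j)) simplifies to c * total / (row_sum * col_sum)
            pmi = math.log2(c * total / (row_sums[i] * col_sums[j]))
            result[i][j] = max(0.0, pmi)  # clip negative associations to 0
    return result


# Toy co-occurrence counts for three hypothetical genes (illustrative only).
toy_counts = [
    [0, 4, 1],
    [4, 0, 1],
    [1, 1, 0],
]
ppmi = ppmi_matrix(toy_counts)

# A pair that co-occurs less often than chance gets a PPMI of 0.
clipped = ppmi_matrix([[10, 1], [1, 10]])
```

Clipping at zero keeps only positive associations, which gives a sparse, non-negative matrix that is convenient as input to a downstream classifier such as the NN named in the table.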