Table 1. Key AI/ML methodologies and their specific applications in ADC target identification and validation

From: Leveraging artificial intelligence in antibody-drug conjugate development: from target identification to clinical translation in oncology

| AI/ML methodology | Data sources utilized | ADC-specific challenge addressed | Specific application in target validation | Example reference/tool |
| --- | --- | --- | --- | --- |
| Deep learning (CNNs, GNNs, autoencoders) | Genomics, transcriptomics, proteomics, digital pathology images, molecular structures | Overcoming target heterogeneity and ensuring functional relevance (e.g., internalization) | Assessing antigen density and homogeneity from imaging; predicting internalization efficiency; correlating spatial expression with outcomes | RADR® [31], PandaOmics [32] |
| Natural language processing (NLP) | Scientific literature, patents, clinical trial databases, EHRs | Aggregating fragmented evidence for less-studied or novel targets | Aggregating evidence for target function and expression patterns; identifying reported associations with resistance or sensitivity | BioGPT [42], PubMedBERT [43] |
| Bayesian networks | Multi-omics data, clinical data, pathway databases | Identifying driver targets with causal relationships, beyond simple correlation | Modeling impact of target modulation on cellular pathways; assessing likelihood of on-target, off-tumor toxicity | Causal network inference platforms (e.g., packages in R/Python) |
| Support vector machines (SVMs) | Gene expression data, protein feature data | Classifying tumor vs. normal tissues with high precision for safety assessment | Predicting whether a protein is membrane-bound and accessible; classifying targets based on predicted immunogenicity | Scikit-learn, LIBSVM, various custom models |
| Random forests/gradient boosting (e.g., XGBoost) | Multi-omics data, chemical databases, preclinical data | Prioritizing targets based on a weighted combination of multiple complex features | Predicting target-drug interactions; correlating target expression with preclinical drug response; ranking targets based on weighted criteria | XGBoost, LightGBM |
| Multimodal data integration platforms | Genomics, proteomics, imaging, clinical outcomes, RWD | Building a holistic view of the target by integrating disparate data types | Validating targets by converging evidence from disparate sources; stratifying patient populations based on multi-modal target signatures | MOFA+, Owkin (MOSAIC), TCGA pan-cancer atlas |
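
The "ranking targets based on weighted criteria" entry in the table can be illustrated with a minimal sketch. This is not the method of any tool listed above: it uses a plain weighted sum in place of a trained random forest or gradient boosting model, and the target names, feature names, feature values, and weights are all hypothetical placeholders chosen only for illustration.

```python
# Minimal sketch of weighted-criteria target ranking.
# All names and numbers below are hypothetical placeholders, not real data.

# Candidate targets with illustrative features, each assumed scaled to [0, 1].
CANDIDATES = {
    "TARGET_A": {"tumor_expression": 0.9, "normal_tissue_expression": 0.2, "internalization": 0.7},
    "TARGET_B": {"tumor_expression": 0.6, "normal_tissue_expression": 0.1, "internalization": 0.9},
    "TARGET_C": {"tumor_expression": 0.8, "normal_tissue_expression": 0.7, "internalization": 0.5},
}

# Assumed weights: the negative weight penalizes expression in normal tissue,
# a rough proxy for on-target, off-tumor toxicity risk.
WEIGHTS = {
    "tumor_expression": 0.5,
    "normal_tissue_expression": -0.3,
    "internalization": 0.2,
}


def score(features, weights):
    """Return the weighted sum of one target's features."""
    return sum(weights[name] * features[name] for name in weights)


def rank_targets(candidates, weights):
    """Return (target, score) pairs sorted from highest to lowest score."""
    scored = [(name, score(feats, weights)) for name, feats in candidates.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


ranking = rank_targets(CANDIDATES, WEIGHTS)
for name, value in ranking:
    print(f"{name}: {value:.2f}")
```

In practice the weights would more plausibly be learned from labeled outcomes (for example, by a gradient boosting model) than set by hand; the fixed weights here only make the prioritization logic easy to follow.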