Table 1 Model classes, compound and kinase descriptors and training data used by the Round 2 top-performing teams and the baseline model (ref. 17).

From: Crowdsourced mapping of unexplored target space of kinase inhibitors

| Team | Algorithm type | Algorithm name | Combined models | Training strategy |
| --- | --- | --- | --- | --- |
| DMIS_DK | Deep learning, multi-target learning | Multi-task graph convolutional neural networks | 12 | Train-test split |
| AI Winter is Coming | Gradient boosting decision trees | XGBoost | 5 per target | K-fold nested cross-validation, boosting |
| Q.E.D | Kernel learning | CGKronRLS | 440 | Boosting |
| Gregory Koytiger | Deep learning, artificial neural network | Not applicable | 6 | Fixed hold-out |
| Olivier Labayle | Ridge regression | Not applicable | Not applicable | K-fold cross-validation |
| Baseline | Kernel learning | CGKronRLS | 1 | K-fold nested cross-validation |

| Team | Training data sources | Compound-protein pairs | Bioactivity types | Protein features | Chemical features |
| --- | --- | --- | --- | --- | --- |
| DMIS_DK | DrugTargetCommons, BindingDB | 953521 | Kd, Ki, IC50 | None | Molecular graphs |
| AI Winter is Coming | DrugTargetCommons, ChEMBL | 600000 | Kd, Ki, IC50, EC50, %inh, %activity | None | ECFP5, ECFP7, ECFP9, ECFP11 |
| Q.E.D | DrugTargetCommons, ChEMBL, UniProt | 60462 | Kd, Ki, EC50 | Amino acid sequences | ECFP4, ECFP6 |
| Gregory Koytiger | ChEMBL | 250000 | Kd, Ki, IC50 | Amino acid sequences | SMILES strings |
| Olivier Labayle | DrugTargetCommons, ChEMBL, UniProt | 18200 | Kd | K-mer counting | ECFP |
| Baseline | DrugTargetCommons | 44186 | Kd | Amino acid sequences | Path-based fingerprints |

  1. Even if the teams chose to combine predictions from multiple models, they had to submit only one prediction per compound-kinase pair for scoring against the measured activities. Supplementary Table 1 provides further details of all the models submitted together with method surveys and model performances in Round 2.
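
The footnote's requirement of one prediction per compound-kinase pair can be illustrated with a minimal sketch. It assumes, purely for illustration, that each model returns a dictionary keyed by (compound, kinase) and that the combined value is the unweighted mean; the table does not state how any team actually merged its models, and the function name, identifiers and activity values below are hypothetical.

```python
from collections import defaultdict
from statistics import mean


def combine_predictions(model_outputs):
    """Collapse several models' outputs into one prediction per pair.

    model_outputs: list of dicts mapping (compound_id, kinase) to a
    predicted activity value. Unweighted averaging is an assumption
    made for this sketch, not a method reported in the table.
    """
    pooled = defaultdict(list)
    for output in model_outputs:
        for pair, value in output.items():
            pooled[pair].append(value)
    # One averaged value per compound-kinase pair.
    return {pair: mean(values) for pair, values in pooled.items()}


# Hypothetical outputs from two models for the same two pairs.
model_a = {("cmpd_1", "EGFR"): 6.0, ("cmpd_2", "ABL1"): 7.5}
model_b = {("cmpd_1", "EGFR"): 7.0, ("cmpd_2", "ABL1"): 8.5}

submission = combine_predictions([model_a, model_b])
```

A weighted mean or a stacked meta-model would fit the same interface; the only constraint taken from the source is that the final submission holds a single value per pair.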