Fig. 1: DRAGONFLY architecture and workflow. | Nature Communications

From: Prospective de novo drug design with deep interactome learning

a Left: To construct the drug-target interactome graph, the targets are connected to their corresponding ligands based on reported bioactivities in the ChEMBL database (ref. 28). Specifically, a connection is established between a ligand (blue circle) and its corresponding target (orange circle) if the ligand has been reported with a bioactivity of ≤200 nM. Right: By representing allosteric and orthosteric binding sites as separate nodes (shown in green and orange, respectively), the drug-target interactome graph captures the interactions associated with each type of binding site.
b Left: During the training phase for ligand-based design, a ligand molecule (blue circle) is taken as the input to the model. The desired output molecules (brown circles) are selected based on their connection to the input molecule through a common node, indicating that they share a binding site. Right: For structure-based design, the input to the model is the binding site itself (blue circle). The desired output molecules (brown circles) are ligands that have been observed to bind to that binding site.
c The protein binding site (here: Janus Kinase 2, PDB-ID 6VNK, ref. 113) is represented as a three-dimensional (3D) graph, i.e., \(\mathcal{G}=\left(\mathcal{V},\mathcal{E},\mathcal{R}\right)\), where \(\mathcal{G}\) denotes the graph, \(\mathcal{V}\) the vertices, \(\mathcal{E}\) the edges and \(\mathcal{R}\) the positions in 3D space. All protein atoms farther than 5 Å from any atom of the bound ligand were removed, yielding a pocket-centric representation of the binding site.
d The ligands are represented as two-dimensional (2D) graphs, i.e., \(\mathcal{G}=\left(\mathcal{V},\mathcal{E}\right)\).
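The graph construction in panels a to c can be illustrated with a minimal Python sketch. It is not the authors' implementation: the record layout `(ligand_id, site_id, activity_nM)`, the function names, and the toy identifiers are assumptions made for this example. Only the 200 nM activity threshold, the pairing of ligands through a shared binding-site node, and the 5 Å pocket cutoff are taken from the caption.

```python
from collections import defaultdict
from itertools import permutations

import numpy as np

AFFINITY_CUTOFF_NM = 200.0      # panel a: edge if reported bioactivity <= 200 nM
POCKET_CUTOFF_ANGSTROM = 5.0    # panel c: keep protein atoms within 5 Å of the ligand


def build_interactome(records):
    """Map each binding-site node to the ligands connected to it.

    `records` is an iterable of (ligand_id, site_id, activity_nM) tuples.
    This layout is hypothetical and chosen only for the example.
    """
    site_to_ligands = defaultdict(set)
    for ligand_id, site_id, activity_nm in records:
        if activity_nm <= AFFINITY_CUTOFF_NM:
            site_to_ligands[site_id].add(ligand_id)
    return dict(site_to_ligands)


def ligand_training_pairs(site_to_ligands):
    """Panel b, left: ordered (input, output) ligand pairs sharing a site node."""
    pairs = set()
    for ligands in site_to_ligands.values():
        pairs.update(permutations(sorted(ligands), 2))
    return pairs


def extract_pocket(protein_xyz, ligand_xyz, cutoff=POCKET_CUTOFF_ANGSTROM):
    """Return indices of protein atoms within `cutoff` of any ligand atom."""
    protein_xyz = np.asarray(protein_xyz, dtype=float)
    ligand_xyz = np.asarray(ligand_xyz, dtype=float)
    # Pairwise distances, shape (n_protein_atoms, n_ligand_atoms).
    diff = protein_xyz[:, None, :] - ligand_xyz[None, :, :]
    dist = np.linalg.norm(diff, axis=-1)
    return np.flatnonzero(dist.min(axis=1) <= cutoff)


# Toy data with made-up identifiers and coordinates.
records = [
    ("L1", "site_A", 50.0),
    ("L2", "site_A", 200.0),   # exactly at the threshold, so it is kept
    ("L3", "site_A", 350.0),   # too weak, no edge
    ("L4", "site_B", 10.0),
]
graph = build_interactome(records)
pairs = ligand_training_pairs(graph)
pocket_idx = extract_pocket([[0, 0, 0], [4, 0, 0], [10, 0, 0]], [[0, 0, 1]])
```

Keying the dictionary by binding site rather than by protein mirrors the caption's choice to treat allosteric and orthosteric sites as separate nodes, so ligands of different sites on the same protein are never paired.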
e In the proposed approach, the node features within the graph are updated through a message passing process. This can be done using either 2D or 3D message passing, depending on the nature of the molecular representation. As a result of the subsequent pooling process, a latent space vector is obtained, which captures the essential characteristics and representations of the molecule. This condensed representation provides a compact encoding of the molecule’s features, enabling downstream analysis, prediction, or structure generation tasks. The latent space vector can be optionally concatenated with a wishlist of desired physicochemical properties for the output molecule. This allows for the incorporation of project-specific property constraints or objectives in the de novo molecular design process. MLP denotes Multilayer Perceptron, RNN denotes Recurrent Neural Network, and LSTM refers to a type of RNN with Long Short-Term Memory cell architecture.
f Workflow of the presented study including DRAGONFLY validation, DRAGONFLY application to peroxisome proliferator-activated receptor (PPAR), chemical synthesis and biological characterization. ADME denotes Absorption, Distribution, Metabolism, and Excretion, and FEP denotes Free Energy Perturbation calculations.
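The encoder path described in panel e (message passing, pooling, optional concatenation with a property wishlist) can be sketched in a few lines of NumPy. This is a simplified illustration under stated assumptions, not the DRAGONFLY model: the sum aggregation, the ReLU nonlinearity, the sum pooling, the layer sizes, and all function names are choices made for this example, and the caption does not specify them. Only the 2D case is shown; a 3D variant would also use the atom positions \(\mathcal{R}\). The LSTM decoder is omitted.

```python
import numpy as np


def message_passing_step(node_feats, adjacency, weight):
    """One simplified 2D message passing update (assumed form).

    Each node sums its neighbours' features, adds its own features,
    and passes the result through a linear map followed by a ReLU.
    """
    messages = adjacency @ node_feats            # sum over neighbours
    return np.maximum((node_feats + messages) @ weight, 0.0)


def encode_molecule(node_feats, adjacency, weights):
    """Run several message passing steps, then sum-pool to a latent vector."""
    h = np.asarray(node_feats, dtype=float)
    for w in weights:
        h = message_passing_step(h, adjacency, w)
    return h.sum(axis=0)                         # pooling over all nodes


def condition_latent(latent, wishlist=None):
    """Optionally append desired property values to the latent vector."""
    if wishlist is None:
        return latent
    return np.concatenate([latent, np.asarray(wishlist, dtype=float)])


# Toy three-atom chain with random features and weights (illustrative only).
rng = np.random.default_rng(0)
adjacency = np.array([[0, 1, 0],
                      [1, 0, 1],
                      [0, 1, 0]], dtype=float)
node_feats = rng.normal(size=(3, 4))
weights = [rng.normal(size=(4, 8)), rng.normal(size=(8, 8))]

latent = encode_molecule(node_feats, adjacency, weights)
# Hypothetical wishlist, e.g. three target property values.
conditioned = condition_latent(latent, wishlist=[350.0, 2.5, 1.0])
```

Because the pooling is a sum over nodes, the latent vector does not depend on the order in which atoms are listed, which is the property that makes a pooled graph encoding usable as a fixed-size input to a sequence decoder.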
