Abstract
Biomedical network learning offers fresh prospects for expediting drug repositioning. However, traditional network architectures struggle to quantify the relationship between micro-scale drug spatial structures and corresponding macro-scale biomedical networks, limiting their ability to capture key pharmacological properties and complex biomedical information crucial for drug screening and therapeutic discovery. Moreover, challenges such as difficulty in capturing long-range dependencies hinder current network-based approaches. To address these limitations, we introduce the Spatial Hierarchical Network, which models molecular 3D structures and biological associations in a unified network. We propose an end-to-end framework, SpHN-VDA, which integrates spatial hierarchical information through triple attention mechanisms to enhance machine understanding of molecular functionality and improve the accuracy of virus-drug association identification. SpHN-VDA outperforms leading models across three datasets, particularly excelling in out-of-distribution and cold-start scenarios. It also exhibits enhanced robustness against data perturbation ranging from 20% to 40%. It accurately identifies critical motifs for binding sites, even without protein residue annotations. Leveraging the reliability of SpHN-VDA, we identified 25 potential candidate drugs through gene expression analysis and CMap. Molecular docking experiments with the SARS-CoV-2 spike protein further corroborate the predictions. This research highlights the broad potential of SpHN-VDA to enhance drug repositioning and identify effective treatments for various diseases.

Introduction
In the contemporary era, the expeditious de novo development of new drugs to address infectious diseases such as COVID-19 is typically impractical due to prolonged cycles and exorbitant costs1. The repurposing of known drugs is often regarded as an attractive opportunity to quickly develop new therapies from which patients can greatly benefit2. With the primary goal of identifying novel indications for existing drugs, drug repositioning proves valuable in abbreviating development timelines and mitigating the risk of toxicity-related clinical effects. By narrowing down potential drug candidates, this approach is positioned as a promising alternative to the intricate process of de novo drug design3,4.
With the emergence of COVID-19, the results of many in vitro and clinical trials launched for COVID-19 or other infectious diseases have provided extensive research data for computational methodologies5,6. Many computational methods have been successful in predicting candidate drugs for COVID-19 patients; for instance, baricitinib was discovered by an expert-curated network model7. Notably, imatinib mesylate, initially designed for leukemia treatment, was later discovered to be effective in curing gastrointestinal stromal tumors8,9. As more cases of drug repositioning emerge, researchers have paid increasing attention to data-driven screening of existing drugs10,11 and to research based on the analysis of virus-drug associations (VDAs)12,13, due to the high infectivity and mutability of viral genes.
Over the past decade, deep learning methods have been regarded as revolutionary tools and have been utilized for drug repositioning, broadly classified into chemical structure-based and biomedical network-based methods14,15. The former focuses on extracting structure sequence information from drugs and viruses16,17,18. However, the performance of the mentioned methods is restricted by the oversimplified linearized molecular representation, making it challenging to explore spatial atomic information. Moreover, these methods cannot capture higher-order information on the potential interactions with heterogeneous biological entities, which is critical for discovering unidentified functions of drugs.
In contrast, the latter approach concentrates on extracting complex interaction information from the structure of biomedical networks (VDA, DTI, PPI, DDA, CMI, etc.)19,20,21,22,23,24. Presently, to capture effective representations from multiple types of nodes and diverse relationships, a commonly used method is heterogeneous graph representation learning, which has been widely utilized to obtain phenotypes of interest in biology25,26,27,28; examples include the DLCMNMF29 model based on nonnegative matrix factorization, MHGNN30 based on the DTP correlation graph, and HetioNet31 and DeepR2cov32 based on metapath learning. However, these methods, which rely on the single architecture of a biomedical network, fail to quantify the interaction between the micro-view drug spatial network and the macro-view biomedical network and lack the capability to discern the drug spatial substructures that influence activity and specific pharmacological properties. This major drawback leads to limited molecular understanding and representation. Furthermore, existing network-based methods commonly suffer from three shared issues: sharing homogeneous information among heterogeneous nodes; capturing only short-range dependence with a one-hop receptive field; and suffering feature over-smoothing as the number of layers increases33.
Additionally, the interpretability of the prediction process of complex nonlinear deep learning models remains a significant challenge34,35,36,37. This is attributed to the implicit data representations commonly used in research, which restrict their applicability in the pharmaceutical domain38. Even though some techniques, such as HampDTI39 and P-NET40, endeavor to provide meaningful interpretations by capturing the critical metapaths and biomolecules from the heterogeneous network to explain biological processes, the specific functional structures of molecules still remain invisible “black boxes”, which can impede the progress of future investigations into drug properties.
Recent studies have shown that a multiple-granularity hierarchical structure can produce a more accurate representation41,42. As a central object of study in the field of drug design, molecular 3D spatial structure analysis plays a critical role in bioactivity exploration43 and drug development44. Therefore, to precisely predict VDAs based on the effective quantification and integration of 3D spatial network information and biomedical network information, we have devised a unified “spatial hierarchical network” structure, embedding the spatial network of a single drug as a subnetwork of the VDA network to describe the hierarchical structure. This marks the first definition of such a structure, distinct from the existing community hierarchical network structure (details in the “Construction of the spatial hierarchical network” section). Specifically, we substitute the drug entity within the VDA network with the 3D structure network of the molecule, thereby describing a micro-network hierarchy at the atom level. Alternatively, we conceptualize the complete 3D network structure of the molecule as a drug node within the VDA network, thereby forming a macro-network hierarchy at the entity level. This structure models the interaction processes between atom-level drug spatial information and entity-level biomedical association information. Modeling the interaction of multi-hierarchy information rather than concatenating the features can effectively mitigate the issues highlighted in previous studies, where naive concatenation of diverse information impedes knowledge acquisition45 and leads to biased comprehension and conflicting conclusions46,47.
Then, we further describe a pipeline tailored for the spatial hierarchical network structure to predict VDA, a Spatial Hierarchical Network for Virus-Drug Associations (SpHN-VDA). This framework is built upon spatial graph neural networks and metapath graph neural networks, integrating atom-level hierarchical information into the entity-level hierarchy to model the interaction between the drug spatial network and the biomedical network. We design triple attention mechanisms to effectively learn implicit data representations within and across hierarchical information layers, thereby enabling a comprehensive machine understanding and obtaining the complete reasoning process from the 3D molecular structure to the biological association metapath. In addition, our model further alleviates three common issues inherent in network-based methods, which are discussed in “Spatial-GNN for learning atomic space representations” and “Metapath-GNN for learning drug representations and virus representations” sections. Overall, SpHN-VDA utilizes its architectural superiority to establish a robust machine understanding of VDAs, guiding and promoting the screening of candidate compounds for antiviral drugs by confident predictions. Our contributions are summarized as follows:
- To quantify biomedical network relations and encapsulate the essential pharmacological properties required for accurate VDA prediction, we devised the Spatial Hierarchical Network (SpHN). Within this framework, we embed the spatial network of individual drugs as subnetworks within the VDA network, thus delineating a unified hierarchical structure.
- SpHN-VDA seamlessly integrates molecular spatial structure information with biological functional interaction data, optimizing them cohesively to extract advanced entity features. Experiments show that this methodology not only minimizes embedding distances for drugs sharing analogous biological functions at the macro scale but also captures feature dependencies among long-range atoms at the micro level.
- SpHN-VDA employs triple attention mechanisms to understand VDAs thoroughly. It identifies crucial biological network neighbors and their key spatial structures, guiding drug screening. For instance, it accurately identifies critical motifs in atazanavir with HIV-1 binding sites, even without protein residue annotations. Interpretable analysis reveals how it enhances prediction accuracy by exploiting the synergy between different levels of information.
- SpHN-VDA exhibits superior performance across three datasets employing various splitting ratios, and demonstrates enhanced generalization capability in out-of-distribution (OOD) scenarios of emerging viruses with different cold-start ratios. It also exhibits improved robustness against data perturbation ranging from 20% to 40%. Furthermore, we illustrate the reliability of the model by showing that model accuracy increases as predicted scores approach 0 or 1, exceeding 0.85.
- Based on SpHN-VDA’s ability to express uncertainty, we choose predictions with scores exceeding 0.9 to compile a candidate list. Subsequently, 25 potential candidate drugs are identified through gene expression analysis and CMap48. Notably, thiothixene emerges as a promising drug due to its relevant biological functions and low binding energy with the SARS-CoV-2 spike protein.
- The proposed SpHN-VDA is publicly available via https://github.com/MrPhil/SpHN-VDA.
Results
SpHN-VDA introduces a spatial hierarchical network for modeling the macro-to-micro interaction information
Although numerous biomedical network-based deep learning frameworks have been applied to universal drug repurposing, the single architecture of a biomedical network fails to quantify the relationship between the micro-view drug spatial network and the macro-view biomedical network, and the spatial structures of molecules still remain invisible “black boxes”. In this study, we devised the Spatial Hierarchical Network structure, bridging the molecular 3D structure and biological association into a unified biomedical network, and developed a novel computational pipeline, SpHN-VDA, to predict the potential association between drugs and viruses, enabling efficient drug repurposing from pre-existing drugs (Fig. 1). SpHN-VDA introduces a micro perspective and a macro perspective to understand molecular spatial networks and VDA biomedical networks, respectively, and proposes two message-passing modules, namely Spatial-GNN and Metapath-GNN, to extract potent representations. Triple attention is designed to capture crucial effective motifs and specific biological associations, providing interpretation alongside accurate prediction.
a An illustrative example of the Spatial Hierarchical Network and the Community Hierarchical Network reveals their markedly distinct structures and learning processes. b The pipeline of SpHN-VDA, formulated as five phases. c The process of prior knowledge learning, whose information is extracted from the drug synergistic/antagonistic interaction network and virus sequences and is used to initialize the features. Furthermore, both the drug spatial structure and the association structure are essential for VDA prediction. Thus, SpHN-VDA contains two-level perspectives: d the micro atom-level perspective, which treats each atom of the molecule as a node and connects spatially proximate atoms to learn the atom-level features, and e the macro entity-level perspective, which treats drugs and viruses as nodes and uses the prior knowledge information and atom-level representation as node initialization to learn the entity-level features. f According to the prediction results of SpHN-VDA, we evaluate the performance in diverse scenarios, including sample splitting with multiple ratios, out-of-distribution, and perturbed datasets; provide interpretable biochemical evidence by uncovering the complete reasoning process from the 3D molecular structure to the biological association metapath; find a potential candidate drug with high confidence through further biological data analysis using gene expression analysis and CMap; and visualize the molecular docking result for further verification.
Serving as an end-to-end training model, SpHN-VDA comprises five key steps, shown in Fig. 1b (details in the “Construction of the spatial hierarchical network” section): Spatial Hierarchical Network construction, prior knowledge learning, atom-level (micro view) feature message passing, entity-level (macro view) feature message passing, and virus-drug association inference. As shown in Fig. 1c, SpHN-VDA learns the synergistic and antagonistic interactions among drugs and the sequence information of viruses as prior knowledge, which is utilized as part of the initial features of the entity-level message-passing modules. Additionally, to model the potential correlation between the molecular substructure and the virus, we specifically designed a Spatial-GNN module that extracts atom-level spatial features of drugs from 3D structure information, which expresses the physicochemical properties of molecules, and calculates the importance of each atom. Then, the atom-level feature, alongside the prior knowledge feature of drugs, is assembled and employed as the initial representation of the drug in the VDA graph (details in the “Spatial-GNN for learning atomic space representations” section). For the initial representation of the virus in the VDA graph, we also add random perturbation to the prior knowledge feature of the virus to improve the robustness of the model (details in the “Feature initialization of prior knowledge” section). In Metapath-GNN, according to multiple representative metapath (feature message passing) patterns, independent metapath graphs of homogeneity between head and tail (HoMGs) are built for drug message delivery and for virus message delivery (details in the “Spatial-GNN for learning atomic space representations” section), and the features are propagated along the edges of the corresponding metapath graph to learn high-order neighbor information and the task-specific preferences of each metapath.
In each message delivery HoMG, multihead graph attention (GAT)49 units perform feature message passing with node attention, and we also specifically design a weight decay strategy to effectively consider the contribution level of each message layer. Afterward, SpHN-VDA integrates features under diverse HoMGs, utilizing mode attention to calculate the optimal projection of drugs and viruses in feature space, regarded as informative but low-dimensional vector representations. A bilinear decoder is then applied to infer the association between drugs and viruses.
We adhere to an end-to-end paradigm and evaluate SpHN-VDA on the HDVD database and the VDA2 dataset, using random split and cold pair split strategies to demonstrate the superior performance of our model (details in the “Statistics and reproducibility” section). Since VDA prediction can be regarded as a binary classification problem, generic binary classification evaluation metrics, including AUC, AUPR, and ACC, are used.
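The final inference step can be illustrated with a minimal sketch of a bilinear decoder, which scores a drug-virus pair as a sigmoid of the bilinear form of their embeddings. The embedding size, the toy values, and the name `bilinear_score` are illustrative assumptions and are not taken from the released code.

```python
import math

def bilinear_score(z_drug, z_virus, W):
    """Return sigmoid(z_drug^T W z_virus) for one drug-virus pair.

    W is a len(z_drug) x len(z_virus) weight matrix (learned in practice,
    fixed here for illustration).
    """
    raw = sum(
        z_drug[i] * W[i][j] * z_virus[j]
        for i in range(len(z_drug))
        for j in range(len(z_virus))
    )
    return 1.0 / (1.0 + math.exp(-raw))

# Toy embeddings (hypothetical values); W is the identity matrix, so the
# raw score reduces to the dot product of the two embeddings.
z_d = [0.5, -1.0]
z_v = [1.0, 2.0]
W = [[1.0, 0.0], [0.0, 1.0]]
score = bilinear_score(z_d, z_v, W)
```

A score close to 1 would be read as a likely association and a score close to 0 as an unlikely one.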
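For reference, two of the metrics named above can be computed as in the following sketch. AUC is written in its pairwise-ranking form (the probability that a positive pair is scored above a negative pair, with ties counted as one half), and ACC uses a 0.5 decision threshold, which is an assumption made here rather than a setting reported in the text.

```python
def auc(labels, scores):
    """AUC as the fraction of positive-negative pairs ranked correctly."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def accuracy(labels, scores, threshold=0.5):
    """Fraction of pairs whose thresholded score matches the label."""
    preds = [int(s >= threshold) for s in scores]
    return sum(p == y for p, y in zip(preds, labels)) / len(labels)

# Toy labels and predicted scores (hypothetical values)
labels = [1, 1, 0, 0]
scores = [0.9, 0.4, 0.6, 0.1]
auc_value = auc(labels, scores)
acc_value = accuracy(labels, scores)
```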
SpHN-VDA achieves the best performance, generalization, and robustness in multiple evaluation scenarios
To demonstrate the ability of our model to predict VDAs, we conduct a comprehensive comparison between SpHN-VDA and four leading models, considering three crucial perspectives: (1) The overall predictive performance is evaluated under random split data with multiple ratios between positive and negative samples; (2) The generalization capacity of SpHN-VDA is assessed in terms of its ability to infer associations between drugs and unknown viruses under out-of-distribution scenarios; (3) The robustness of the model is examined against different degrees of random perturbation of VDA pairs.
First, the overall performance of SpHN-VDA is compared with four leading end-to-end baselines in Fig. 2a. For a comprehensive evaluation, based on the same data division, the random split strategy sets diverse ratios between positive and negative samples of 1:1, 1:2, 1:5, and 1:10 to construct balanced and imbalanced datasets. We selected one representative model from each of four categories of computational methods, namely DrugBAN50, DTINet51, SGCL-DTI52, and GAEMDA53, whose detailed descriptions can be found in Supplementary Note 1. Moreover, to ensure a fair comparison of model performance, we use the best default hyperparameters of the comparison models and conduct multiple experiments on the same datasets based on fivefold cross-validation.
a The AUC for VDA prediction on the HDVD dataset across 7 independent experiments (each with a different random seed) for various negative sample proportions under positive-to-negative ratios of 1:1, 1:2, 1:5, and 1:10. For each ratio, 7 distinct sets of negative samples were randomly selected. Error bars represent the mean standard deviation across the 7 independent experiments, which differs from technical replicates. Each bar graph shows the performance significance of intergroup differences. The estimated effect sizes of four ratios are 0.94, 0.79, 0.86, and 0.87, respectively. The significance of SpHN-VDA versus DTINet is shown in each case (Tukey’s HSD test: *P = 2.65 × 10⁻² for 1:1, *P = 3.84 × 10⁻² for 1:2, ****P < 1 × 10⁻⁴ for 1:5, and ****P < 1 × 10⁻⁴ for 1:10). The details of the statistical test results are reported in Supplementary Tables 11–14 and the significance test results based on the t-test are reported in Supplementary Tables 8–10. b The AUC for VDA prediction on the VDA2 dataset with 7 independent experiments (each with a different random seed) for various negative sample proportions under positive-to-negative ratios of 1:1, 1:2, 1:5, and 1:10. For each ratio, 7 distinct sets of negative samples were randomly selected. Error bars represent the mean standard deviation across the 7 independent experiments, which differs from technical replicates. The estimated effect sizes of four ratios are 0.96, 0.98, 0.99, and 0.99, respectively. The significance of SpHN-VDA versus GAEMDA is shown in each case (Tukey’s HSD test: ****P < 1 × 10⁻⁴ for 1:1, *P = 1.01 × 10⁻² for 1:2, *P = 1.88 × 10⁻² for 1:5, and nsP = 0.99 for 1:10). The details of the statistical test results are reported in Supplementary Tables 15–18 and the significance test results based on the t-test are reported in Supplementary Tables 8–10.
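The construction of balanced and imbalanced datasets can be sketched as follows. The sketch assumes that negatives are drawn uniformly from drug-virus pairs without a known association; this is a common convention but is an assumption here, and `sample_negatives` with its toy entity names is hypothetical.

```python
import random

def sample_negatives(positives, drugs, viruses, ratio, seed=0):
    """Sample ratio * len(positives) unlabeled drug-virus pairs as negatives."""
    rng = random.Random(seed)
    known = set(positives)
    # Every pair without a recorded association is a candidate negative.
    candidates = [(d, v) for d in drugs for v in viruses if (d, v) not in known]
    n_neg = min(len(candidates), ratio * len(positives))
    return rng.sample(candidates, n_neg)

# Toy entities and known associations (hypothetical)
drugs = [f"d{i}" for i in range(10)]
viruses = [f"v{j}" for j in range(5)]
positives = [("d0", "v0"), ("d1", "v1"), ("d2", "v2"), ("d3", "v3")]

# One negative set per positive-to-negative ratio (1:1, 1:2, 1:5, 1:10)
datasets = {r: sample_negatives(positives, drugs, viruses, r, seed=r)
            for r in (1, 2, 5, 10)}
```

Repeating the call with different seeds would give the distinct negative sets used across independent experiments.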
c The change in AUPR on HDVD and VDA2 datasets is illustrated under different negative proportions with 7 independent experiments (each with a different random seed). For each proportion, 7 distinct sets of negative samples were randomly selected, which differs from technical replicates. For boxplots, the center line represents the median, upper and lower edges represent the interquartile range, and the whiskers extend from the minimum to the maximum values. d Generalization evaluation for OOD scenarios portrays the distributions of AUC and AUPR under 9 independent experiments (each with a different random seed) regarding 20% of viruses as cold-start viruses. For each independent experiment, a distinct set of cold-start viruses were randomly sampled. The evaluation metrics are represented as violin plots, where the center line depicts the median and the upper and lower lines denote the interquartile range. e Under 9 independent experiments (each with a different random seed), robustness evaluation showing the best AUC of each model prediction against different ratios of random perturbation of VDA pairs where the pairs are replaced with adding or removing. For each perturbation ratio, 9 distinct sets of cold-start viruses and perturbative samples were randomly sampled, which differs from technical replicates. For boxplots, the center line represents the median, upper and lower edges represent the interquartile range, and the whiskers extend from the minimum to the maximum values. Source data are provided as a Source Data file in Supplementary Data 4.
We repeat experiments with different negative samples and record statistical scores, which can also be seen as a way of measuring the robustness of the model on VDA prediction. The bar graphs of AUC on the HDVD and VDA2 datasets for each positive-to-negative ratio are reported in Fig. 2a and b, respectively. Notably, the SGCL-DTI model exhibits limited applicability on unbalanced datasets, with AUC values below 0.5 in the 1:5 and 1:10 proportions, indicating performance inferior to random guessing. Thus, we only present the AUC of SGCL-DTI on the balanced dataset. The superior performance of SpHN-VDA is evident across all proportions. As shown in Fig. 2c, the AUPR of all methods decreased dramatically as the negative proportion increased, as the AUPR is generally more affected by the negative sample proportion. Although DTINet shows competitive AUPR on the HDVD dataset, SpHN-VDA consistently outperforms DTINet with significant superiority under the AUC criterion (Tukey’s HSD test: *P = 2.65 × 10⁻² for 1:1, *P = 3.84 × 10⁻² for 1:2, ****P < 1 × 10⁻⁴ for 1:5, and ****P < 1 × 10⁻⁴ for 1:10), which demonstrates the superiority of SpHN-VDA on both balanced and unbalanced datasets. All detailed results, including the scores of each model, are presented in Supplementary Data 1.
Second, the generalization capability is investigated by analyzing the performance in out-of-distribution (OOD) scenarios. This investigation employs the cold pair split strategy, where diverse cold-start viruses are repeatedly sampled to construct training-test samples, and statistical performance is calculated54. We display the distribution of AUC and AUPR for each model on the HDVD and VDA2 datasets in Fig. 2d. To provide a more comprehensive evaluation of the model’s performance, we conducted tests using the extra dataset VDA_ex. The detailed results and experimental analysis are reported in Supplementary Figs. 1–4. SpHN-VDA consistently outperforms the second-best model, DTINet, with significant superiority in the AUC metric on HDVD (Tukey’s HSD test: **P = 6.3 × 10⁻³) and VDA2 (Tukey’s HSD test: *P = 1.88 × 10⁻²), and the first quartile of SpHN-VDA is almost higher than the third quartile of the other methods in the violin plot. The superior performance in OOD scenarios supports the feasibility of applying the model to drug repositioning, especially for new viruses. In addition, we find that all models are more stable on VDA2 than on HDVD, as VDA2 contains more virus entities and abundant virus side information, which indicates that the dataset size impacts model stability to a certain extent. Much to our surprise, although DrugBAN is the most stable, its generalization ability is ranked fourth. This discrepancy could be attributed to the model’s neglect of entity-level feature message passing within the VDA network. It is reasonable that SpHN-VDA maintains a satisfactory generalization ability, as it is designed to recognize critical motifs and patterns of drug–virus associations, which are highly relevant to drug repositioning for unknown diseases.
Finally, to further evaluate the robustness of SpHN-VDA, the tolerance of the model against training data perturbation is evaluated. We randomly add or remove drug–virus associations with diverse probabilities based on the cold pair split strategy. This process simulates the scenario in which the VDA training data may contain mislabeled associations and omit unconfirmed ones during drug repositioning. We also repeat experiments with different cold-start viruses to avoid contingent results. According to the cold-start virus, the training and test sets are split based on the perturbed VDA network (details in the “Statistics and reproducibility” section). The results indicate that the proposed SpHN-VDA model exhibits stable performance in terms of Area Under the Curve (AUC) with perturbations ranging from 20% to 40% (see Fig. 2e). Compared to the second-best baselines on the HDVD and VDA2 datasets (i.e., DTINet and SGCL-DTI, respectively), SpHN-VDA achieves a significant improvement in performance of 0.05 and 0.0224, respectively, reflecting the strongest robustness among all leading models. Notably, although SpHN-VDA and DTINet show consistent performance in the overall evaluation (see Fig. 2a and c), SpHN-VDA is significantly more robust than DTINet, which demonstrates the advantage of SpHN-VDA. Furthermore, in the experiments, we find that varying degrees of severe overfitting and underfitting emerged widely in the baseline models under perturbation above 20%, particularly in the GAEMDA model, denoting a lack of robustness and anti-perturbation ability. In contrast, SpHN-VDA remains robust in extracting latent information from the spatial hierarchical graph even under perturbed data.
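A cold-start split of this kind can be sketched as below: a random subset of viruses is held out, and every pair involving a held-out virus goes to the test set, so these viruses are never seen during training. The 20% fraction follows the setting stated for Fig. 2d; the function name and the toy pairs are hypothetical.

```python
import random

def cold_start_split(pairs, cold_fraction=0.2, seed=0):
    """Hold out all pairs of a random subset of viruses as the test set."""
    rng = random.Random(seed)
    viruses = sorted({v for _, v in pairs})
    n_cold = max(1, round(cold_fraction * len(viruses)))
    cold = set(rng.sample(viruses, n_cold))
    # Pairs of cold-start viruses never appear in training.
    train = [p for p in pairs if p[1] not in cold]
    test = [p for p in pairs if p[1] in cold]
    return train, test, cold

# Toy drug-virus pairs over 10 viruses (hypothetical)
pairs = [(f"d{i}", f"v{i % 10}") for i in range(30)]
train, test, cold = cold_start_split(pairs, cold_fraction=0.2, seed=1)
```

Calling the function with different seeds would resample the cold-start viruses for each independent experiment.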
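The perturbation procedure can be sketched as follows. The text does not specify how additions and removals are balanced, so this sketch assumes that a fraction `rate` of known associations is removed and the same number of spurious associations is added; `perturb_associations` and the toy data are hypothetical.

```python
import random

def perturb_associations(positives, drugs, viruses, rate, seed=0):
    """Remove a fraction `rate` of known pairs and add as many spurious ones."""
    rng = random.Random(seed)
    known = set(positives)
    n_change = int(rate * len(known))
    # Removed pairs simulate unconfirmed (missing) associations.
    removed = set(rng.sample(sorted(known), n_change))
    # Added pairs simulate mislabeled associations.
    unknown = [(d, v) for d in drugs for v in viruses if (d, v) not in known]
    added = set(rng.sample(unknown, min(n_change, len(unknown))))
    return (known - removed) | added, removed, added

# Toy association network (hypothetical)
drugs = [f"d{i}" for i in range(10)]
viruses = [f"v{j}" for j in range(3)]
positives = [(f"d{i}", f"v{i % 3}") for i in range(10)]
perturbed, removed, added = perturb_associations(positives, drugs, viruses, rate=0.4)
```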
To assess the contribution of individual modules in SpHN-VDA, we conducted an ablation study to evaluate their effectiveness. The analysis of the results indicates that the micro-level molecular structure information is particularly crucial, and the model achieves optimal performance only when all designed modules are utilized. The detailed experimental results are illustrated in Supplementary Note 2 and Supplementary Table 1.
Spatial molecular structure provides effective atom representation and micro-interpretable insight
We explore the role of the micro-atom-level structure from three perspectives: (1) assessing the effectiveness of the spatial message-passing structure in capturing molecular features, (2) evaluating the ability of the model to capture the long-range dependence among the atoms, and (3) examining its capability to identify the critical effective motifs.
First, we fully explore the effectiveness of different micro atom-level structure modeling approaches, including the spatial message-passing structure (SpHN-VDA), the planar message-passing structure (SpHN-VDA_atomGCN), spatial message passing without attention (SpHN-VDA_w/o_3DAttention), and no message passing (SpHN-VDA_w/o_3DInformation). For fairness, we provide the same negative samples with a 1:1 proportion between positive and negative for each structure, and performance is displayed as average ROC curves. We utilize the proposed Spatial-GNN module and a two-layer GCN as the structures of SpHN-VDA and SpHN-VDA_atomGCN, respectively. In the case of SpHN-VDA_w/o_3DAttention, we directly employ the SchNet55 module as the backbone to extract the features of interatomic interactions. SpHN-VDA_w/o_3DInformation only uses prior knowledge as the feature, which is fed to the same subsequent macro entity-level architecture and bilinear decoder. Based on 3D information, we test the predictive performance of each structure. For SpHN-VDA and SpHN-VDA_w/o_3DAttention, the sequence concatenation and 3D Cartesian coordinate data are fully employed for each \({C}_{\alpha }\). For SpHN-VDA_atomGCN, we apply the molecular graph to learn from the atom adjacency matrix, determined by the \({C}_{\alpha }-{C}_{\alpha }\) map. As expected, we find that SpHN-VDA_atomGCN achieves relatively unsatisfactory performance on the HDVD and VDA2 datasets (average AUC = 0.7433 and 0.7887, respectively). This outcome can be attributed to the fact that the GCN predominantly learns information from the 2D molecular graph. The effective utilization of 3D molecular graph information consistently yields superior performance compared to its 2D counterpart, underscoring the efficacy of the spatial message-passing structure. More interestingly, we find that SpHN-VDA_w/o_3DAttention and SpHN-VDA_w/o_3DInformation achieve similar performance.
This observation suggests that, in the absence of an effective approach to learning 3D information, performance might be worse than neglecting 3D information, emphasizing the significance of an effective learning approach. Remarkably, Spatial-GNN demonstrates adeptness in conducting effective 3D information learning, a critical factor contributing to performance enhancement. Furthermore, the structure based on GCN lacks an innate ability to learn the relationships between atoms over long distances. Next, we analyze in detail the ability of the model to capture the long-range dependence among atomic features.
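The difference between planar and spatial neighborhoods can be illustrated with a minimal sketch that connects atoms lying within a distance cutoff in 3D. The cutoff value, the toy coordinates, and the name `spatial_edges` are assumptions for illustration; the actual edge construction of Spatial-GNN may differ.

```python
import math

def spatial_edges(coords, cutoff):
    """Connect every pair of atoms whose Euclidean distance is below `cutoff`."""
    edges = []
    for i in range(len(coords)):
        for j in range(i + 1, len(coords)):
            if math.dist(coords[i], coords[j]) < cutoff:
                edges.append((i, j))
    return edges

# A folded five-atom chain: atoms 0 and 4 are four bonds apart in the 2D
# graph but close in space (toy coordinates in angstroms, hypothetical).
coords = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (2.0, 1.4, 0.0),
          (1.5, 2.8, 0.0), (0.0, 2.8, 0.0)]
bond_edges = [(0, 1), (1, 2), (2, 3), (3, 4)]

edges_3d = spatial_edges(coords, cutoff=3.0)
# Spatial edges that are absent from the bond graph act as shortcuts.
shortcuts = [e for e in edges_3d if e not in bond_edges]
```

In this toy molecule a two-layer planar GCN cannot pass information between atoms 0 and 4, whereas the spatial graph links them directly.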
To gain a deeper insight into information captured from the atom-level structure, we scrutinize the relationships among atomic features to provide additional insights into the reliability of SpHN-VDA. Based on the message-passing mechanism, the correlation degree of atom features reflects the ability of the model capturing long-range dependence. After the model learning process, we collect each atomic feature and visualize the correlation between them to present the feature message-passing process (see Fig. 3b), and the capability of long-distance atomic interaction extraction manifests the reliability of the model56. Specifically, for the convenience of observing long-distance information message passing, we randomly select a molecule, Anidulafungin, with a length exceeding 80 after removing hydrogen atoms. For each atom embedding vector, the Pearson correlation coefficient is applied to describe the correlation between atoms. We calculate the correlation coefficient from each pairwise atomic feature, encompassing the training phases of SpHN-VDA, SpHN-VDA_atomGCN, and a randomly generated set without training. Our methodology adeptly captures atomic long-distance relationships, leveraging the 2D space curvature that brings distant atoms closer in 3D space, subsequently generating interaction edges. Our model can fully utilize the “short-cut” through a spatial message-passing structure. Nevertheless, the method based on the GCN captures only the weak relationship between long-range atoms, which supports the wisdom that SpHN-VDA_atomGCN always achieves relatively unsatisfactory performance.
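The pairwise correlation analysis described above can be sketched as follows: each atom is represented by its embedding vector, and the Pearson correlation coefficient is computed for every pair of atoms to form the matrix shown as a heatmap. The toy embeddings are hypothetical, and real embeddings would come from the trained model.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length vectors.

    Assumes neither vector is constant (otherwise the denominator is zero).
    """
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def correlation_matrix(atom_features):
    """Matrix of Pearson coefficients between all pairs of atom embeddings."""
    n = len(atom_features)
    return [[pearson(atom_features[i], atom_features[j]) for j in range(n)]
            for i in range(n)]

# Toy embeddings for three atoms (hypothetical values)
features = [[1.0, 2.0, 3.0], [2.0, 4.0, 6.0], [3.0, 2.0, 1.0]]
corr = correlation_matrix(features)
```

High off-diagonal values between atoms that are far apart in the bond graph would indicate that long-range dependence has been captured.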
a The average ROC curves based on fivefold cross-validation of VDA prediction with the different message-passing structures in the HDVD and VDA2 datasets show the performance of SpHN-VDA compared to variant methods containing SpHN-VDA_atomGCN, SpHN-VDA_w/o_3DAttention, and SpHN-VDA_w/o_3DInformation. b The correlation of the heatmap of each atomic feature under SpHN-VDA training, SpHN-VDA_atomGCN training, and random generation without training. The color of each pixel is determined by the Pearson correlation coefficient of the corresponding pairwise atom features. Red indicates a high value of the Pearson correlation coefficient, and green indicates a low value. The larger the number of high values represents the powerful ability of model capturing long-range dependence. c The binding sites for the HIV-1 protease IRM mutant (PDB id: 2FXD) with atazanavir and the predicted critical motifs of atazanavir. The contribution of these motifs is presented as a heatmap, where color depth is positively correlated to the z score. The top three motifs were confirmed to maintain the corresponding binding sites of GLY-48 and GLY-27. Source data are provided as a Source Data file in Supplementary Data 4.
The ability to correctly identify the vital effective motifs provides the basis for reliable predictions when applying the model to drug repositioning in real life. Hence, to verify whether the model provides relatively reliable biochemical evidence, we explore the capability of SpHN-VDA to identify critical motifs in drugs. Notably, SpHN-VDA can discern the importance of motifs on its own, without relying on binding site or protein residue annotations. As an example of drug repositioning, HIV-1 was treated as a cold-start virus, and atazanavir (DrugBank id: DB01072) was correctly predicted with a 100% score. The details of the candidate drug set are reported in Supplementary Table 2. To interpret the decisions made by SpHN-VDA, the binding sites for the HIV-1 protease IRM mutant (PDB id: 2FXD) with atazanavir and the predicted critical motifs of atazanavir are displayed in Fig. 3c. The atom-level attention weights are employed as the correlation measure between motifs and the medicinal function specific to the virus. We extract the attention weight for each atom and normalize it to a z score to determine atom importance. According to the importance z score of each motif, SpHN-VDA can identify more than one motif, of which only the three most crucial are marked in orange (see the left part of Fig. 3c). All the predicted motifs hit the binding site, containing GLY-48 and GLY-27, which demonstrates that our method can precisely identify the critical atoms with a corresponding potential binding site. Further motifs, though with slightly lower priority, are also identifiable, specifically those ranked 6, 24, 28, and 29 (refer to Supplementary Fig. 5 for detailed atom rankings). In addition to showing the accurate identification of the most crucial motifs, we also provide more in-depth insight into the comprehensive identification of effective motifs in Section 3.2.
Identification of the critical effective atom sheds light on SpHN-VDA decisions and effectively provides guidance for research directions to further wet experimental research for VDA investigation. Undoubtedly, our model not only achieves precise predictions but also furnishes interpretable biochemical evidence from an atom-level perspective.
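The z-score normalization of atom-level attention weights described above can be sketched as follows. This is an illustrative example under stated assumptions: the function names and the six attention weights are hypothetical, and how atom scores are aggregated into motif scores is not specified here, so the sketch ranks individual atoms only.

```python
import numpy as np

def attention_z_scores(attention_weights):
    """Standardize per-atom attention weights to z scores."""
    w = np.asarray(attention_weights, dtype=float)
    std = w.std()
    if std == 0.0:
        # All atoms equally weighted: no atom stands out.
        return np.zeros_like(w)
    return (w - w.mean()) / std

def top_atoms(attention_weights, k=3):
    """Indices of the k atoms with the highest z score, most important first."""
    z = attention_z_scores(attention_weights)
    return np.argsort(-z)[:k].tolist()

# Hypothetical attention weights for a six-atom molecule.
weights = [0.02, 0.10, 0.05, 0.40, 0.03, 0.25]
z = attention_z_scores(weights)
ranked = top_atoms(weights, k=3)
```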
Biological association metapath provides effective entity representation and macro-interpretable insight
We explore the role of the macro entity-level structure from three perspectives: (1) we explore the superiority of multiple metapath patterns with prior knowledge as initial information, (2) we assess whether metapath-based high-order neighbor message passing can effectively capture the similarity information of implicit biological functions, and (3) we explore the capability to express uncertainty, that is, the correlation between prediction score and credibility.
First, to validate the superiority of the designed entity-level feature message-passing process, we study four variants of SpHN-VDA that differ in the representation computation: without node attention (SpHN-VDA_w/o_nodeAttention), without mode attention (SpHN-VDA_w/o_modeAttention), without weight decay for each message-passing layer (SpHN-VDA_w/o_weightDecay), and without prior knowledge initialization (SpHN-VDA_w/o_priorKnowledge). To establish fair comparisons, we use the same negative samples at a 1:1 ratio, and the results are shown as average ROC curves (see Fig. 4a). According to the ‘guilt-by-association’ (GBA) principle, the representation of a particular entity can be deduced from its neighbors10. In particular, the proposed node attention and mode attention play integral roles in capturing relationships among multiple neighbors and integrating information from diverse metapath patterns through their respective importance scores. Our weight decay for each message-passing layer is inspired by the residual network structure57 to effectively consider the contribution of each message layer. Additionally, prior knowledge about synergistic and antagonistic interactions among drugs and the sequence of viruses is extracted to initialize node features. SpHN-VDA consistently achieves superior performance, demonstrating the vital role of the proposed entity-level feature message-passing module in VDA prediction. Notably, SpHN-VDA_w/o_nodeAttention attains the least favorable performance, which supports the view that crucial neighbors are more representative. While the weight decay appears to exert minimal influence on prediction, the residual information often plays a crucial role in avoiding vanishing gradients and relieving feature over-smoothing. We also find that prior knowledge can directly improve the performance of the model by reducing the learning space.
Apart from the quantitative analysis, samples under three feature initialization strategies (random, prior knowledge, and prior knowledge with added Gaussian noise) are visualized by t-SNE58 in the HDVD and VDA2 datasets to further demonstrate the effectiveness of prior knowledge. We find that the samples change from a random distribution to separate clusters when prior knowledge is used. The distribution becomes more compact as noise is added, which can promote nonlinear learning in the narrowed learning space. The clustering visualization can be found in Supplementary Fig. 6.
a The average ROC curves based on fivefold cross-validation of VDA prediction with different entity-level feature message-passing variants in the HDVD and VDA2 datasets show the performance of SpHN-VDA compared to the variants SpHN-VDA_w/o_nodeAttention, SpHN-VDA_w/o_modeAttention, SpHN-VDA_w/o_weightDecay, and SpHN-VDA_w/o_priorKnowledge. b The visualization of high-order neighbor relationships between study drugs in the HoMG, containing chlorphenoxamine-Bcx4430, amantadine-pentoxifylline, and apigenin-didanosine, where different colors represent the corresponding virus-mediated associations, specifically SARS-CoV and HIV-1. c The HoMG visualization, where the blue edges indicate the original heterogeneous network without introducing relationships of the metapath pattern, and the other edges with diverse colors denote the corresponding virus-mediated associations under the metapath pattern of Drug1\({\longrightarrow }^{treat}\)Virus1\({\longrightarrow }^{treatedby}\)Drug2. The red nodes represent drugs, while the blue nodes represent viruses. d Quantifying the correlation between node embedding distances and the inferred biological function similarity post-message passing. We compare the embedding distances of drugs with similar biological functions to those without, with and without Metapath-GNN learning, respectively. The degree of relationship is assessed through cosine distance after normalization. Our evaluation aims to ascertain the effectiveness of metapath-based high-order neighbor message passing in capturing implicit biological function similarities. e Detailed view of the prediction accuracy distribution on HDVD, where the length of blue indicates the count of the true prediction sample and the length of red indicates the count of the false prediction sample, based on fivefold cross-validation. Source data are provided as a Source Data file in Supplementary Data 4.
Subsequently, to investigate whether high-order neighbor message passing can effectively capture the similarity information of implicit biological functions, we explore the feature changes of the drugs before and after message passing. The constructed drug message delivery network is shown in Fig. 4c, where the blue edges indicate the original heterogeneous network without introducing relationships of the metapath pattern, while the other edges with diverse colors denote the corresponding virus-mediated associations under the metapath pattern of Drug1\({\longrightarrow }^{treat}\)Virus1\({\longrightarrow }^{treatedby}\)Drug2. As seen, utilizing metapath patterns can expand the information network, which is beneficial for mining deeper potential functional relationships among the nodes. Specifically, there are 46,225 drug pairs, of which 331 (0.7%) have a similarity greater than 0.5 and are used to construct the network. When the metapaths are added, the number of relationships increases by 6069, making the network roughly 20 times the size of the original. We first calculate the embedding of each drug from message passing in the HoMG and further study whether similar biomedical functions of nodes can be measured by embedding distance. We notice that many drugs have similar biological functions even though their structural similarity is very low. Chlorphenoxamine and Bcx4430 are relevant for MERS-CoV and SARS-CoV treatment; however, the similarity score of these two drugs is only 0.056 (A). Amantadine is utilized for treating dyskinesia in Parkinson’s patients, while pentoxifylline addresses intermittent claudication, with a similarity score of 0 (B). HIV-1 can be treated by apigenin and didanosine, and their similarity is 0.093 (C). In the HoMG, these drugs have corresponding ‘bridges’ connecting them to pass high-order feature messages based on biomedical function, which cannot be achieved in the original networks, as visualized in Fig. 4b.
To quantify the relationship between the embedding distance of nodes and the similarity of implied biological functions after message passing, we extract the features of the mentioned drugs with and without Metapath-GNN learning. After normalization, the relationship is measured by the cosine distance. Additionally, we conducted control experiments between the mentioned drugs and mesalazine, a drug for treating ulcerative colitis with dissimilar biomedical functions, including chlorphenoxamine-mesalazine (D), pentoxifylline-mesalazine (E), apigenin-mesalazine (F), and didanosine-mesalazine (G) (see Fig. 4d). In experimental Groups A, B, and C, the embeddings of drugs with similar actual functions are closer when the Metapath-GNN module is used, while the cosine similarity of these drug embeddings is low without the Metapath-GNN module. In control Groups D, E, F, and G, the similarity remains at a low level (<0.3) due to dissimilar drug functions. The experiments suggest that high-order neighbor information can capture the similarity information of implied biological functions and that the Metapath-GNN can embed drugs with similar functions closer together, even though the drug structures are different, which is more in line with the drug functional background.
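The embedding comparison above can be sketched as a cosine similarity computed on L2-normalized vectors. This is a minimal sketch: the function names and the three small embedding vectors are hypothetical placeholders, not features extracted from SpHN-VDA.

```python
import numpy as np

def cosine_similarity(a, b):
    """Cosine similarity of two embedding vectors after L2 normalization."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    a = a / np.linalg.norm(a)
    b = b / np.linalg.norm(b)
    return float(a @ b)

def cosine_distance(a, b):
    """Cosine distance, defined here as 1 - cosine similarity."""
    return 1.0 - cosine_similarity(a, b)

# Hypothetical embeddings: two functionally similar drugs and one control.
drug_a = [0.9, 0.1, 0.4]
drug_b = [0.8, 0.2, 0.5]
control = [-0.1, 0.9, -0.3]
sim_pair = cosine_similarity(drug_a, drug_b)
sim_control = cosine_similarity(drug_a, control)
```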
In addition, uncertainty serves as a metric to gauge the confidence of the model, particularly in complex chemical spaces22,59,60. In the field of chemistry, investigations into uncertainty serve as a supplementary tool to explicate conclusions61, which can minimize the possibility of experimental failures62 and facilitate the identification of common characteristics in molecular active structures based on prediction results63. Thus, we explore the correlation between prediction score and credibility to test whether the model has high reliability when giving a high prediction score. In SpHN-VDA, the output of our classification model is a prediction score between 0 and 1, rather than a simple prediction of a positive or negative class. The score can be interpreted as the probability of belonging to the positive class, where a probability >0.5 signifies a positive classification. We provide a detailed view of the prediction accuracy distribution on HDVD in Fig. 4e (a detailed view of VDA2 is provided in Supplementary Fig. 7), where the length of blue (red) indicates the count of true (false) prediction samples. As seen, most scores of true predictions are close to 0 or 1, which indicates that SpHN-VDA has high confidence in the prediction result. However, the distribution of false predictions appears much more uniform, especially in the range from 0.2 to 0.8, which contains approximately one-tenth of the prediction scores. Hence, for this subset, our model seems to exhibit uncertainty, with an accuracy of only 0.506. Conversely, SpHN-VDA has an accuracy higher than 0.85 for prediction scores near 1, and this high reliability can provide a discriminant measure for whether the prediction results of drug repositioning for new viruses can be accepted. The detailed count of each score range is reported in Supplementary Table 3.
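The relationship between prediction score and accuracy described above can be examined by grouping predictions into score ranges. The sketch below is illustrative only: the function name, the bin count, and the six example predictions are assumptions, and only the 0.5 decision threshold comes from the text.

```python
import numpy as np

def accuracy_by_score_bin(scores, labels, n_bins=10, threshold=0.5):
    """Return (low, high, count, accuracy) for each equal-width score bin."""
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=int)
    predicted = (scores > threshold).astype(int)
    correct = predicted == labels
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    # Digitizing against interior edges maps each score to a bin in 0..n_bins-1;
    # a score of exactly 1.0 falls into the last bin.
    bin_ids = np.digitize(scores, edges[1:-1])
    result = []
    for b in range(n_bins):
        mask = bin_ids == b
        count = int(mask.sum())
        acc = float(correct[mask].mean()) if count else None
        result.append((float(edges[b]), float(edges[b + 1]), count, acc))
    return result

# Hypothetical predictions: confident ones are right, mid-range ones are not.
scores = [0.02, 0.97, 0.55, 0.45, 0.99, 0.03]
labels = [0, 1, 0, 1, 1, 0]
bins = accuracy_by_score_bin(scores, labels, n_bins=5)
```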
Analysis of interpretation from macro to micro
To assess whether both hierarchical information can regulate each other to promote reasonable explanations, we analyzed the prediction results of the model using two independent hierarchical information sources and examined the flow process of information through these hierarchies. The analysis indicates a significant difference (contradictory conclusion) in more than half of the predictions between the atom-level module and entity-level module, which can be avoided by our unified framework. To explore the process of optimization from micro to macro, we further analyze how the model unifies different conclusions.
The top two different positive samples selected for analysis are Emodin with Hepatitis B virus and Mizoribine with HSV-1. In the first sample, the correct result can be predicted by the complete framework or atom-level module, but the entity-level module leads to incorrect conclusions. Based on the atom-level attention analysis, we discovered that the captured molecular structures in both the complete model and the atom-level module are similar (see Fig. 5a). This indicates that correctly learning crucial molecular structures can enhance prediction accuracy. Based on the node-level and mode-level attention analysis, we found that there are five different neighbors among the top ten high-order neighbors of the complete model and entity-level module under the most important biological function metapath. These are the reasons for the incorrect prediction. Upon further analysis of these ten different neighbor molecules (see Fig. 5b), it is evident that there are structural differences among them, like the number of ring structures and halogen elements, etc. This demonstrates that micro-atom-level information is a crucial factor to consider when selecting entity-level network neighbors.
a The structure capturing of Emodin with SpHN-VDA and Atom-level module. The attention scores are extracted from the corresponding models. b The selection of critical entity-level neighbors with SpHN-VDA and Entity-level module. As the Entity-level module has no attention score for the molecular structure, it uses the score from SpHN-VDA. The same score source is useful for analyzing potential neighbor selection criteria. c All atoms scores of Mizoribine with SpHN-VDA and Atom-level module. The attention scores are extracted from the corresponding models. d The docking results of Mizoribine with HSV-1 gE ectodomain (PDB: 2GIY). The figure displays six binding sites and three important molecular structures under the optimal docking structure.
In the second sample, the complete model or entity-level module can predict the correct results, whereas the atom-level module can lead to incorrect conclusions. First of all, we observed significant differences in molecular structure considerations between the complete model and the atom-level module (see Fig. 5c), and incorrect considerations have a severe impact on sample identification. Molecular docking64 experiments further demonstrated that the complete model can accurately identify critical binding sites, whereas the atom-level module may erroneously capture other structures (see Fig. 5d). This indicates that the macro entity-level information can influence the selection of atom-level spatial structure. To sum up, the analysis of the model reasoning processes reveals that the architecture of SpHN-VDA can enhance the accuracy and hierarchical information consistency, thereby ensuring a rational interpretation from micro to macro.
SpHN-VDA suggests potentially repurposable drugs for COVID-19 with interpretation from macro to micro scale
SARS-CoV-2, the seventh coronavirus capable of infecting humans, has triggered the global dissemination of COVID-19, resulting in millions of infections and numerous fatalities worldwide65. The symptoms of infected patients occur in the range of mild to severe66. Despite efforts to develop beneficial targeted therapies, definitive therapies against SARS-CoV-2 are still urgently needed67.
To effectively give suggestions for COVID-19 therapeutics based on existing drugs, SpHN-VDA makes full use of the whole VDA information to infer repurposed candidate drugs, treating SARS-CoV-2 as a cold-start virus. Specifically, we first use SpHN-VDA to calculate the probability of each drug being related to SARS-CoV-2, rank the drugs in descending order of probability, and then select high-confidence drugs with probabilities greater than 0.9 to construct a prediction list, since SpHN-VDA has high reliability when giving a high prediction score (see the “Biological association metapath provides effective entity representation and macro-interpretable insight” section). Furthermore, the gene expression profiling of peripheral blood mononuclear cells from 10 SARS-CoV-infected patients (GEO: GSE1739)68 and FFPE lung tissue from 11 SARS-CoV-2-infected patients (GEO: GSE190496)69 is used to conduct differentially expressed gene analysis. Then, according to the upregulated and downregulated genes (see Supplementary Tables 4 and 5 for details), SpHN-VDA performs Connectivity Map (CMap, https://clue.io/query)48 analysis, and candidate drugs for COVID-19 are further screened using transcriptome data. Because an opposite gene expression signature between a drug and a disease suggests that the drug is a potential treatment for the disease, drugs with CMap scores lower than the threshold of 0 can be treated as candidates for COVID-19 therapy. CMap analysis results for the two sets of differential genes are reported in Supplementary Data 2 and Supplementary Data 3. To ensure the significance of the results, we set the threshold to −1 and identified the potential COVID-19 therapeutic candidate drugs from the prediction list, which are shown in Table 1.
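The two-stage screening described above (probability greater than 0.9, then CMap score below −1) can be sketched as a simple filter. This is a simplified illustration: the function name, the drug identifiers, and all numeric values are hypothetical, and a single CMap score per drug is assumed, whereas the study uses two expression profiles.

```python
def screen_candidates(predictions, cmap_scores, prob_cutoff=0.9, cmap_cutoff=-1.0):
    """Keep drugs with a high predicted probability and a significant CMap score.

    predictions: dict mapping drug name -> predicted probability.
    cmap_scores: dict mapping drug name -> CMap score (negative = reversed signature).
    """
    # Stage 1: high-confidence predictions, ranked in descending order.
    shortlisted = [(drug, p) for drug, p in predictions.items() if p > prob_cutoff]
    shortlisted.sort(key=lambda item: item[1], reverse=True)
    # Stage 2: keep only drugs whose CMap score is below the cutoff.
    return [
        drug for drug, _ in shortlisted
        if drug in cmap_scores and cmap_scores[drug] < cmap_cutoff
    ]

# Hypothetical inputs; names and values are placeholders, not study results.
predictions = {"drug_A": 0.97, "drug_B": 0.95, "drug_C": 0.60, "drug_D": 0.99}
cmap_scores = {"drug_A": -1.4, "drug_B": -0.3, "drug_C": -1.8, "drug_D": -1.1}
candidates = screen_candidates(predictions, cmap_scores)
```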
As shown in Table 1, we divide the 25 candidate drugs into three confidence levels based on whether the CMap analysis results of each drug were significant under two gene expression profiles (see Supplementary Table 6 for all predictions), where CMap score-1 and CMap score-2 represent the drug gene expression analysis scores related to SARS-CoV and SARS-CoV-2, respectively. The score related to SARS-CoV-2 is given higher priority. Essentially, all the candidate drugs have been proven to promote therapy for COVID-19 or its complications, such as diarrhea and an excessive inflammatory response.
From the molecular structure to the biological association metapath, we perform interpretation from macro to micro in accordance with the designed triple attention modules. According to the mode-level attention, the metapath pattern of Drug1\({\longrightarrow }^{treat}\)Virus1\({\longrightarrow }^{treatedby}\)Drug2 receives the highest score, indicating a crucial information exchange relationship among drugs with the same treatment function. Then, we derive attention scores from the node-level attention module and extract interesting instances of the crucial neighboring drugs for visualization. Their crucial local structures are further scrutinized through spatial atom attention. Supplementary Fig. 8 reports the visualization results, showing that almost all of the drugs with high attention scores in the critical metapath share overlapping substructures. This supports the plausibility of similar treatment functions among the drugs and illustrates the decision process of SpHN-VDA from the macro to the micro view. Moreover, a set of biomedical function analyses is conducted to demonstrate confidence in the interpretation and the dependability of the prediction results.
Ganciclovir, used as a DNA polymerase inhibitor for treating cytomegalovirus, may contribute to decreasing the mortality of COVID-19 patients. Similarly, SpHN-VDA also infers the potential treatment capability of ganciclovir and predicts that carbonyl, amino, and hydroxyl groups are critical binding structures. Interestingly, among all high-confidence candidate drugs, only thiothixene with confidence level I lacks related evidence demonstrating therapeutic effects on COVID-19. Thiothixene holds antipsychotic pharmacological effects approved for the management of schizophrenia. Relevant studies have found that thiothixene is effective in inhibiting viral entry and virus-cell fusion70,71. Additionally, molecules with three-ring structures, such as thiothixene, have been shown to exhibit a higher affinity for ACE72. Based on these findings, we hypothesize that this drug may act on proteins involved in the viral entry process. As reported in previous studies, cell entry of SARS-CoV-2 depends on its receptor angiotensin-converting enzyme 2 (ACE2) and on host cell proteases priming ACE273. The binding of the therapeutic drug and vital functional receptor ACE2 may be an effective treatment to prevent the virus from infecting the host cell. We further explore whether thiothixene has the potential to bind to ACE2, potentially preventing SARS-CoV-2 from entering cells.
To explain the potential evidence for inferring VDA using SpHN-VDA and provide insight into the possible mechanism of action, we visualize the important prediction motifs and further clarify them through molecular docking analysis of thiothixene (CID: 5071) and SARS-CoV-2 spike protein/ACE2 (PDB ID: 6M0J) through AutoDockTools74, which is shown in Fig. 6. The docking result indicates that thiothixene binds to ACE2 via three hydrogen bonds, where the formation of hydrogen bonds is related to oxygen atoms, whose importance rankings on SpHN-VDA prediction are 5 and 9, respectively. The binding energy, whose value has an inverse relationship to the binding ability, between thiothixene and ACE2 is −2.56 kcal/mol, which further validates its applicability in treating SARS-CoV-2. As a consequence, the results of the explainable prediction of the SpHN-VDA and molecular docking demonstrate that thiothixene may alter ACE2 functional expression by binding with ACE2 to prevent the virus from entering the cell. However, our method primarily provides computational possibilities, and based on prior knowledge, we have explored only one potential therapeutic mechanism of thiothixene through molecular docking experiments. The drug may exert therapeutic effects through additional mechanisms, which would require further wet-lab studies and validation. Nonetheless, it is clear that SpHN-VDA offers valuable predictive guidance for identifying candidate compounds in drug repositioning.
Discussion
In this study, we have mainly discussed the novelty and validity of our research from various viewpoints, including hierarchical network structure, learning model framework, and result interpretability. Regarding hierarchical network structure, we have identified a limitation in biomedical field modeling using heterogeneous networks, as the single architecture of a biomedical network fails to quantify the relationship between the micro view drug spatial network and the macro view biomedical network, which causes shallow understanding and biased representation. Therefore, to enhance the modeling of biomedical networks, we introduced the Spatial Hierarchical Network structure for the first time, to the best of our knowledge, embedding the spatial network of the single drug as a subnetwork of the VDA network to describe hierarchical structure. This structure can represent multiple levels (atom and entity levels) of network structure in a hierarchical manner.
Concerning the learning framework, existing models fall short in learning biomedical networks with 3D structural information, resulting in their inability to acquire an efficient representation of essential spatial drug substructures and, consequently, reliable molecular analysis. More importantly, our experiments show that, in the absence of an effective approach to learning 3D information, the performance of using 3D information might be worse than neglecting it. Therefore, we propose the framework of the Spatial Hierarchical Network for Virus-Drug Associations (SpHN-VDA). SpHN-VDA has the capability to capture the spatial structure at the atom level, preserving pharmacologically potential properties. Additionally, it can extract the association structure at the entity level, incorporating complex heterogeneous interaction information. Moreover, the hierarchical features optimize each other to effectively learn implicit data representations within and across hierarchical information layers. This approach not only ensures closer embedding distances for drugs with similar biological functions at the macro level but also captures feature dependencies among long-range atoms at the micro level.
Regarding result interpretability, predictive interpretation is of utmost importance in the field of biomedicine. However, several existing models only provide a one-sided interpretation, and scattered knowledge can lead to conflicting conclusions. Current network-based techniques explicate the results exclusively through entity relationships, failing to capture the crucial effective motifs that commonly contribute decisively to pharmaceutical functions75. Hence, we proposed atom-level, node-level, and mode-level attention modules to establish a robust machine understanding of VDAs, which further relieves three problems of GNN methods. The framework offers a complete reasoning process from macro to micro levels by identifying the most contributing biological network neighbors and further analyzing their critical spatial structures, which provides reliable guidance for drug design and subsequent wet experimental validations76. Experimentally, SpHN-VDA displays superior performance against leading models by a significant margin for VDA prediction. Additionally, as predicted scores approach 0 or 1, model accuracy exceeds 0.85, demonstrating its reliability. It is also robust against data perturbation of 20% to 40% and shows strong generalization for emerging viruses. The results of several case studies further support the performance of the model.
The design inspiration of the SpHN-VDA originates from the intuition that the world is complex, leading to the conceptualization of problems and knowledge from a hierarchical perspective77. The human mind consistently enhances comprehension by transitioning between macro and micro perspectives. However, the SpHN-VDA exhibits three primary limitations, and we propose potential solutions for addressing these limitations in future research. (1) The influence of a more complex heterogeneous graph was not explored in depth. As the model is primarily tailored to address drug repositioning, the dataset’s scale is relatively modest, and the diversity of entity types in the heterogeneous graph is limited. This limitation may result in suboptimal performance, despite the model demonstrating superiority in performance, robustness, and generalization. Hence, we recommend that future research explore the utilization of larger datasets and apply the model to a more extensive heterogeneous graph, which holds more entities such as proteins. Moreover, the parameters of the model should be further optimized to enhance overall performance. (2) The binding site information may be beneficial for drug representation. In the process of model learning, we did not use the corresponding binding site for supervised learning, which may be the reason for high accuracy only occurring in the prediction with higher scores. Furthermore, the absence of binding site information may lead the model to excessively focus on a specific molecular group, resulting in high accuracy but incomplete predictions. In the case of drug repositioning for HIV-1, the model pays more attention to the structure containing nitrogen. To resolve this issue, it is suggested to consider introducing the protein corresponding to the virus and its binding site information in future work. It is imperative to adjust the model’s structure to facilitate convergence. 
(3) Biomedical text provides an intuitive interpretation of predictions with abundant external information. Reasonably utilizing text descriptions as an additional source of hierarchical information is crucial for understanding molecules comprehensively. In the future, we will fully integrate text information and optimize it to improve the interpretation ability of our framework continuously.
Methods
Construction of the spatial hierarchical network
The spatial hierarchical network encompasses two-level perspectives: the atom-level inside-of-drug aspect, focusing on the molecular 3D structure, and the entity-level outside-of-drug aspect, centering on virus-drug associations. These perspectives are elaborated upon separately below.
In the atom-level inside-of-drug aspect, we denote a set of atoms in a molecule spatial graph as \(Mol=\{{a}_{1},{a}_{2},\cdot \cdot \cdot ,{a}_{n}\}\), whose atomic physicochemical properties can be described with features of \({v}_{i} \in {\mathbb{R}}^{{d}_{v}}\). In addition, each atom node conducts an atom-level feature message passing through the interatomic interaction edge, identified by spatial distance, which is represented as the feature vector of \({e}_{k}\in {\mathbb{R}}^{{d}_{e}}\). Consequently, each molecular spatial graph can be defined as \({G}_{Mol}=({u}_{Mol},{V}_{Mol},{E}_{Mol},P)\), where \({u}_{Mol}\in {\mathbb{R}}^{{d}_{{u}_{Mol}}}\) is the molecular feature vector for graph \({G}_{Mol}\); \({V}_{Mol}={\{{v}_{i}\}}_{i=1:n}\) is the set of atom feature vectors; \({E}_{Mol}={\{({e}_{k},{r}_{k},{s}_{k})\}}_{k=1:m}\) indicates the set of edges from the \({s}_{k}\) atom to the \({r}_{k}\) atom, which is decided by the cutoff distance for interatomic interaction; and \(P={\{{r}_{h}\}}_{h=1:n}\) is the set of the spatial information (3D Cartesian coordinates) of each node.
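The construction of interatomic interaction edges from 3D coordinates and a cutoff distance can be sketched as follows. This is a minimal sketch under assumptions: the function name, the toy coordinates, and the cutoff value of 2.0 are hypothetical, since the cutoff used for SpHN-VDA is not stated in this section.

```python
import numpy as np

def build_spatial_edges(coords, cutoff):
    """Directed edges (sender, receiver, distance) between atoms within `cutoff`.

    coords has shape (n_atoms, 3): the 3D Cartesian coordinates of each atom.
    """
    coords = np.asarray(coords, dtype=float)
    # Pairwise Euclidean distances via broadcasting: shape (n_atoms, n_atoms).
    diff = coords[:, None, :] - coords[None, :, :]
    dist = np.linalg.norm(diff, axis=-1)
    # Keep pairs within the cutoff, excluding self-loops.
    mask = (dist <= cutoff) & ~np.eye(len(coords), dtype=bool)
    senders, receivers = np.nonzero(mask)
    return senders, receivers, dist[senders, receivers]

# Toy molecule: atoms 0 and 1 are close, atom 2 is far from both.
coords = [[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [5.0, 0.0, 0.0]]
senders, receivers, lengths = build_spatial_edges(coords, cutoff=2.0)
edge_pairs = set(zip(senders.tolist(), receivers.tolist()))
```

Because edges depend on spatial distance rather than covalent bonds, two atoms that are far apart along the bond graph can still be connected when the 3D conformation brings them within the cutoff.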
In the entity-level outside-of-drug aspect, a set of molecular spatial graphs and viruses are connected within a VDA graph \({G}_{VDA}=(V,E,N,R)\) with \({G}_{Mol}\in V\), where V, E, N, and R represent vertex, edge, vertex type, and edge type sets, respectively, and \(|N|+|R| > 2\). All the molecular spatial graphs are regarded as the refined subnetwork of the VDA graph. Specifically, \({G}_{VDA}\) is constructed from three types of networks: virus-drug association (VDA), virus-virus similarity (VVS), and drug-drug similarity (DDS) networks. The VVS is calculated with the multiple sequence alignment method MAFFT78, applied to the genomic sequences of viruses. The DDS is calculated from the Tanimoto index79 between MACCS fingerprints of pairwise drugs, utilizing the chemical structure of drugs. MACCS fingerprints are converted from SMILES by the Babel chemistry toolbox80. Assuming the MACCS fragment bit strings of drug i and drug j are denoted \(D(i)\) and \(D(j)\), the similarity score can be calculated as:
$$Si{m}_{ij}=\frac{|D(i)\cap D(j)|}{|D(i)\cup D(j)|}$$
Each score of \(Si{m}_{ij}\) can be divided into edges or nonedges in the similarity network by the threshold. Finally, we model \({G}_{Mol}\) and \({G}_{VDA}\) as a spatial hierarchical network.
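The Tanimoto similarity and the thresholding into edges can be sketched as follows. This is an illustrative example: fingerprints are represented as sets of on-bit positions, the drug names and bit positions are hypothetical, and the 0.5 threshold follows the value mentioned in the Results for the drug similarity network.

```python
def tanimoto(bits_i, bits_j):
    """Tanimoto index between two fingerprints given as sets of on-bit positions."""
    a, b = set(bits_i), set(bits_j)
    union = a | b
    if not union:
        # Two empty fingerprints share nothing measurable.
        return 0.0
    return len(a & b) / len(union)

def similarity_edges(fingerprints, threshold=0.5):
    """Undirected edges (drug_i, drug_j, score) for pairs scoring above threshold."""
    names = sorted(fingerprints)
    edges = []
    for x in range(len(names)):
        for y in range(x + 1, len(names)):
            score = tanimoto(fingerprints[names[x]], fingerprints[names[y]])
            if score > threshold:
                edges.append((names[x], names[y], score))
    return edges

# Hypothetical fingerprints; real MACCS keys have 166 defined bits.
fingerprints = {"d1": {1, 2, 3, 4}, "d2": {2, 3, 4, 5}, "d3": {10, 11}}
edges = similarity_edges(fingerprints)
```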
To note, our proposed spatial hierarchical network structure differs significantly from the community hierarchical network structure in that the feature of each node in a subnet (community) of the hierarchical network is utilized for prediction tasks directly, as done by hier2vec in constructing hierarchical social networks to describe multiple-granularity information81. Conversely, in the spatial hierarchical network, the feature of each node in a subnet (spatial graph) must first be aggregated to generate a representation of the entire subnet (entity), and then the representation of the subnet is used for prediction tasks. The most noticeable difference is that the local structure of the community hierarchical network is seen as a subnetwork, but in the proposed structure, the spatial structure of the node itself is seen as a subnetwork. The model of HIGH-PPI is a typical application of the community hierarchical network structure22. By comparison, our suggested network framework exhibits greater suitability for modeling biomedical networks. Figure 1a shows the difference between two distinct network structures.
Feature initialization of prior knowledge
The integration of prior knowledge can help reduce the learning space and improve prediction accuracy82. Thus, we introduce the sequences of viruses and the synergistic/antagonistic interactions83 among drugs to exploit potential functional information (Fig. 1c).
A model84 based on NLP with a local fusion strategy is used to learn virus sequence information. Specifically, each virus sequence is split into three-character symbols (3-mers), and GloVe is then used to generate the embedding vectors by minimizing the loss function:
where \({w}_{i}\) and \({w}_{j}\) denote the embedding vectors of symbols i and j; \({x}_{ij}\) is the number of times symbol j appears in the context of symbol i; and \(b\) denotes the bias term. The weight function \(f(\cdot )\) is defined as:
To improve the robustness and generalization capability of the proposed method and to alleviate the model’s tendency to overfit due to redundancy in sequence information, we add random perturbations to the embedding vectors, which can be represented as \((w+\varepsilon )\), where \(\varepsilon\) follows a Gaussian distribution.
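The three ingredients of this step (3-mer symbols, the GloVe weighting of co-occurrence counts, and Gaussian perturbation of the embeddings) can be sketched as follows. In the original GloVe formulation the objective is \(\sum_{i,j}f({x}_{ij}){({w}_{i}^{T}{w}_{j}+{b}_{i}+{b}_{j}-\log {x}_{ij})}^{2}\) with \(f(x)={(x/{x}_{max})}^{\alpha }\) for \(x < {x}_{max}\) and 1 otherwise; whether the study uses exactly this form is an assumption. The overlapping sliding window, the defaults `x_max=100` and `alpha=0.75` (the usual GloVe defaults), the noise scale, and all function names are likewise assumptions for illustration.

```python
import random


def kmers(sequence, k=3):
    """Slide a window of length k over the sequence (overlapping k-mers assumed)."""
    return [sequence[i:i + k] for i in range(len(sequence) - k + 1)]


def glove_weight(x, x_max=100.0, alpha=0.75):
    """Standard GloVe weighting: damp rare co-occurrences, cap frequent ones at 1."""
    return (x / x_max) ** alpha if x < x_max else 1.0


def perturb(vector, sigma, rng):
    """Add zero-mean Gaussian noise to every component of an embedding vector."""
    return [w + rng.gauss(0.0, sigma) for w in vector]


tokens = kmers("ATGGCA")   # toy sequence, not a real viral genome
rng = random.Random(0)     # fixed seed so the sketch is repeatable
noisy = perturb([0.1, -0.2, 0.3], sigma=0.01, rng=rng)
```

A small `sigma` keeps the perturbed vector close to the learned embedding, so the noise acts as a regularizer rather than replacing the sequence signal.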
We utilize SDGNN85, a GNN model for signed directed graphs, to extract the features of the synergistic/antagonistic interactions among drugs from the DDI network of enhancing and depressing interactions. SDGNN is trained with three loss functions: \({L}_{sign}\), \({L}_{direction}\) and \({L}_{triangle}\). Given nodes \(v\) and \(u\), \({L}_{sign}\) is the cross-entropy loss function used to model the sign of the edge between the nodes, which is defined as:
where \(\varepsilon\) indicates the set of signed edges and \({y}_{uv}\) denotes the ground truth of the sign. \({L}_{direction}\) measures the status between nodes \(u\) and \(v\), which can be defined as:
\({L}_{direction}(u\to v)={\sum }_{{e}_{u,v}\in \varepsilon }{({q}_{uv}-(s({z}_{u})-s({z}_{v})))}^{2}\)
\({L}_{triangle}\) aims to learn the true triangle distribution, which can be defined as:
where \({L}_{ij}=-{y}_{i,j}\,\log (s({z}_{i}^{T}{z}_{j}))-(1-{y}_{i,j})\log (1-s({z}_{i}^{T}{z}_{j}))\). By optimizing the three loss functions, the prior knowledge of the drug is extracted as the embedding vector \({u}_{Mpk}\).
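As a rough illustration of two of the terms above, the sketch below evaluates the pairwise cross-entropy \({L}_{ij}\) and the squared-gap form of \({L}_{direction}\) on toy values. It assumes that \(s(\cdot )\) inside \({L}_{ij}\) is the logistic sigmoid, and it takes the status scores \(s({z}_{u})\) and the targets \({q}_{uv}\) as precomputed numbers; the dictionaries `status` and `q` and the function names are hypothetical and do not reproduce SDGNN itself.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def dot(a, b):
    return sum(p * q for p, q in zip(a, b))


def pair_cross_entropy(z_i, z_j, y_ij):
    """L_ij: binary cross-entropy on sigmoid(z_i^T z_j) for the label y_ij in {0, 1}."""
    p = sigmoid(dot(z_i, z_j))
    return -y_ij * math.log(p) - (1 - y_ij) * math.log(1 - p)


def direction_loss(edges, status, q):
    """Sum over edges (u, v) of (q_uv - (s(z_u) - s(z_v)))^2.

    status[u] stands for the scalar s(z_u); q[(u, v)] is the target status gap.
    """
    return sum((q[(u, v)] - (status[u] - status[v])) ** 2 for u, v in edges)


# Toy values (hypothetical): one positive pair and one directed edge.
toy_l_ij = pair_cross_entropy([1.0, 0.5], [0.5, 1.0], y_ij=1)
toy_l_dir = direction_loss(
    edges=[("drug_a", "drug_b")],
    status={"drug_a": 0.9, "drug_b": 0.4},
    q={("drug_a", "drug_b"): 0.5},
)
```

The direction term vanishes when the gap between the two status scores matches the target, which is the case for the toy edge above.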
Spatial-GNN for learning atomic space representations
In the micro view, we propose a Spatial-GNN module to capture the dependence between spatially adjacent atoms and the local atomic interactions guided by chemical bonds. 3D graph neural networks (3D-GNNs) have shown great capacity for learning from spatial relationship data, leading to effective 3D graph-structured drug representations. The Spatial-GNN module, inspired by SchNet55, therefore performs message passing over the molecular 3D structure using the Cartesian coordinates of the atoms. This module captures molecular features from a micro-perspective graph through atom-level feature message passing (Fig. 1d). Specifically, Spatial-GNN computes the energy-optimized structure of each molecule in 3D space with the MMFF86 force field and constructs the 3D molecular graph from interatomic interactions based on atomic spatial positions, treating this structure as the atom-level micro-network hierarchy. The features containing the spatial information of each micro-network hierarchy (3D molecular graph) are then extracted through atom-level message passing, with information filtered by an atom-level attention mechanism.
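One simple way to turn atomic positions into a 3D molecular graph is a distance cutoff, as in SchNet-style models. The sketch below assumes this cutoff rule; the cutoff value, the toy coordinates, and the function name are hypothetical, and the coordinates are not an MMFF-optimized geometry.

```python
import math


def radius_graph(coords, cutoff):
    """Connect every ordered pair of distinct atoms whose distance is within the cutoff."""
    edges = []
    for i, p in enumerate(coords):
        for j, q in enumerate(coords):
            if i != j:
                d = math.dist(p, q)
                if d <= cutoff:
                    edges.append((i, j, d))
    return edges


# Toy coordinates in angstroms (hypothetical): two close atoms and one distant atom.
toy_coords = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (5.0, 0.0, 0.0)]
toy_edges = radius_graph(toy_coords, cutoff=2.0)
```

Each edge carries its interatomic distance, which is the quantity that a continuous-filter layer would consume during message passing.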
For the spatial message-passing process, given a molecular graph \({G}_{Mol}=({u}_{Mol},{V}_{Mol},{E}_{Mol},P)\), the 3D-graph message-passing paradigm is defined as:
where \(\varphi\) indicates the update function for the corresponding geometries; \(\rho\) represents the transfer function for aggregating features from one type of geometry to another; and \({N}_{i}\) denotes the set of incoming neighbor nodes of node \(i\).
Formally, the Spatial-GNN module constructs continuous distance features, which are produced by a continuous-filter operation55 applied to the interatomic distances. Furthermore, we add an attention mechanism to the update of the molecular feature vector, and the message update is defined as:
\({\rho }^{v\to u}\circ Atten\) denotes aggregating the atom features \({v}_{i}\) into the molecular feature \({u}_{Mol}\) with attention scores.
Finally, the drug molecular feature can be calculated and serves as an integral part of the initial feature for entity-level message-passing modules (Fig. 1e). The attention score can be optimized and read after completing the training process of the Metapath-GNN module. The flowchart of Spatial-GNN is reported in Supplementary Fig. 9.
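The two operations described in this section, expanding an interatomic distance into continuous features and pooling atom features into a molecular feature with attention, can be sketched as below. The Gaussian radial basis expansion follows the SchNet idea; the softmax normalization of the attention scores, the toy inputs, and the function names are assumptions, and in the actual model the scores are learned rather than supplied by hand.

```python
import math


def rbf_expand(distance, centers, gamma):
    """Gaussian radial basis expansion of one interatomic distance."""
    return [math.exp(-gamma * (distance - c) ** 2) for c in centers]


def softmax(scores):
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_readout(atom_features, atom_scores):
    """Pool atom vectors into one molecule vector, weighted by softmax(atom_scores)."""
    weights = softmax(atom_scores)
    dim = len(atom_features[0])
    pooled = [sum(w * f[k] for w, f in zip(weights, atom_features)) for k in range(dim)]
    return pooled, weights


# Toy inputs (hypothetical): one distance, two atoms with 2-dimensional features.
distance_features = rbf_expand(1.0, centers=[0.0, 1.0, 2.0], gamma=10.0)
mol_feature, atom_weights = attention_readout([[1.0, 0.0], [3.0, 2.0]], [0.0, 0.0])
```

Because the returned weights sum to one, they can be read directly as the relative contribution of each atom, which is what makes the atom-level attention inspectable after training.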
Metapath-GNN for learning drug representations and virus representations
We propose the Metapath-GNN module to learn VDA semantic pattern features from a macro-perspective graph through entity-level feature message passing. It treats each semantic pattern as a feature message-passing mode, which promotes the capture of heterogeneous graph structures and higher-order semantics from multiple modalities. Specifically, based on the head and tail of diverse metapaths, metapath graphs containing different interaction modes, called HoMGs, are generated; these not only capture long-range dependencies but also integrate semantically rich information from heterogeneous networks into the aggregation paradigm of GNNs. To avoid sharing homogeneous information among heterogeneous nodes, the head and tail are set to the same type. In each HoMG with a distinct mode, we introduce a node attention module based on a graph attention network (GAT)49 to capture local dependencies. Then, a weight decay coefficient is used to weight the features of each layer, which can effectively mitigate the over-smoothing of features33.
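How a metapath with the same head and tail type yields a homogeneous neighbor graph can be shown with a small sketch. It uses a drug-virus-drug pattern purely as an illustration; the metapaths actually used in the study are listed in its Supplementary Table 7 and are not reproduced here, and the function name and toy pairs are hypothetical.

```python
def metapath_neighbors(associations):
    """Drug-virus-drug pattern: two drugs become neighbors if they share a virus."""
    by_virus = {}
    for drug, virus in associations:
        by_virus.setdefault(virus, set()).add(drug)
    neighbors = {}
    for drugs in by_virus.values():
        for d in drugs:
            neighbors.setdefault(d, set()).update(drugs - {d})
    return neighbors


# Toy associations (hypothetical drug and virus identifiers).
toy_vda = [("d1", "v1"), ("d2", "v1"), ("d3", "v2")]
toy_homg = metapath_neighbors(toy_vda)
```

Drugs d1 and d2 are two hops apart in the original bipartite graph but direct neighbors in the metapath graph, which is how a single attention layer can reach longer-range dependencies.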
Formally, a metapath \(\varPhi\) is defined as \({\Phi }={N}_{1}{\to }^{{R}_{1}}{N}_{2}{\to }^{{R}_{2}}\cdots {\to }^{{R}_{l}}{N}_{l+1}\), which denotes a composite relationship \(R={R}_{1}\circ {R}_{2}\circ \cdots \circ {R}_{l}\), i.e., a metapath pattern, between the vertex types \({N}_{1}\) and \({N}_{l+1}\), which are restricted to the same type in this study to avoid sharing homogeneous information among heterogeneous nodes. Then, the metapath graph with homogeneous head and tail (HoMG) is generated from the VDA graph \({G}_{VDA}=({V}_{VDA},{E}_{VDA})\) to capture higher-order structure information and learn potential semantic information. We use three representative metapaths as the feature message-passing patterns to construct \(HoM{G}_{\varPhi }=({V}_{VDA},\varPhi )\); the details of metapath construction are outlined in Supplementary Table 7. For each drug node, Metapath-GNN updates the representation with the metapath neighbors in \(HoM{G}_{\phi \in \varPhi }\) based on multihead node-level attention, which can be defined as:
where \(K\) indicates the number of heads; \({W}_{\varPhi }^{k}\) denotes the learnable weight under the kth head; and the learnable attention score \({\alpha }_{{\varPhi }_{ij}}^{k}\) between nodes i and j is defined as:
where \({N}_{i}\) indicates the metapath neighbors of node i and \({a}^{k}\) denotes the learnable weight under the kth head. Notably, the initialization of the nodes is defined as the weighted mean of the 3D-molecular and prior knowledge information:
After node message passing based on each \(HoM{G}_{\varPhi }\), we propose mode-level attention to fuse semantic information with bias, which is defined as:
where the learnable attention score \({\beta }_{\phi }\) is defined as:
We conduct the same process to acquire higher-order semantics information of viruses and apply dual channels to independently learn the features of the drug (\({H}_{d}\)) and the virus (\({H}_{v}\)). The bilinear decoder is utilized to calculate the probability of drug and virus association, and the prediction \(\widehat{{y}_{ij}}\) between drug \(d(i)\) and virus \(v(j)\) is calculated by:
where \(Q\) indicates a learnable map matrix.
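The last two steps of this section, fusing the per-metapath embeddings with mode-level attention and scoring a drug-virus pair with a bilinear decoder, can be sketched as follows. The softmax normalization of the mode weights and the sigmoid applied to \({h}_{d}^{T}Q{h}_{v}\) are assumptions, as are the function names and the toy values; in the model, \(Q\) and the attention logits are learned.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def fuse_modes(mode_embeddings, mode_logits):
    """Weighted sum of per-metapath embeddings with softmax-normalized weights."""
    m = max(mode_logits)
    exps = [math.exp(l - m) for l in mode_logits]
    total = sum(exps)
    betas = [e / total for e in exps]
    dim = len(mode_embeddings[0])
    return [sum(b * h[k] for b, h in zip(betas, mode_embeddings)) for k in range(dim)]


def bilinear_score(h_d, Q, h_v):
    """sigmoid(h_d^T Q h_v): association score for one drug-virus pair."""
    Qh = [sum(Q[r][c] * h_v[c] for c in range(len(h_v))) for r in range(len(Q))]
    return sigmoid(sum(a * b for a, b in zip(h_d, Qh)))


# Toy values (hypothetical): two metapath embeddings and an identity map matrix.
h_drug = fuse_modes([[1.0, 2.0], [3.0, 4.0]], mode_logits=[0.0, 0.0])
score = bilinear_score(h_drug, [[1.0, 0.0], [0.0, 1.0]], [0.5, -0.5])
```

With an identity \(Q\) the decoder reduces to a sigmoid of the inner product; a full matrix lets the model relate every drug dimension to every virus dimension, which matters because the two embeddings come from independent channels.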
Model training details
Given the training sets \(\varOmega\) and \(\overline{\varOmega }\), representing known and unknown VDA entities, respectively, along with the ground truth label matrix \(Y\), Spatial-GNN and Metapath-GNN are trained in an end-to-end manner by minimizing the loss function:
where \({P}_{\Omega }(\cdot )\) is the operation of projecting the matrix onto set \(\varOmega\). To promote the generalization and stability of the model, we introduce an asymmetric coefficient \(\lambda\) to address imbalanced samples.
To enhance prediction accuracy, all hyperparameters are determined through grid search with fivefold cross-validation. For Spatial-GNN, the feature dimensions \({d}_{v}\), \({d}_{e}\) and \({d}_{{u}_{Mol}}\) are set to 16, 32, and 64, respectively. For each GAT block in Metapath-GNN, we use 2 GNN layers and 8 heads with an output dimension of 128. The details of parameter tuning are reported in Supplementary Fig. 10.
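The role of the projection and of the asymmetric coefficient can be illustrated with one plausible form of such a loss: the squared error over the known pairs plus \(\lambda\) times the squared error over the unknown pairs. This specific form, the placement of \(\lambda\), the function name, and the toy matrices are assumptions made for illustration and may differ from the objective used in the study.

```python
def asymmetric_loss(Y, Y_hat, known, lam):
    """Squared error over known pairs plus lam times squared error over the rest."""
    loss_known, loss_unknown = 0.0, 0.0
    for i, row in enumerate(Y):
        for j, y in enumerate(row):
            err = (y - Y_hat[i][j]) ** 2
            if (i, j) in known:       # projection onto the known set
                loss_known += err
            else:                     # projection onto the unknown set
                loss_unknown += err
    return loss_known + lam * loss_unknown


# Toy label and prediction matrices (hypothetical values).
toy_Y = [[1, 0], [0, 1]]
toy_Y_hat = [[0.5, 0.5], [0.5, 0.5]]
toy_loss = asymmetric_loss(toy_Y, toy_Y_hat, known={(0, 0), (1, 1)}, lam=0.2)
```

Choosing `lam` below 1 down-weights the many unlabeled pairs, so that they do not dominate the few confirmed associations.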
Description of the compared methods
We selected one representative model from each of four categories of computational methods for comparison: one 2D molecular graph-based method (DrugBAN), one matrix factorization-based method (DTINet), one graph co-contrastive learning-based method (SGCL-DTI), and one novel deep learning-based method (GAEMDA). DrugBAN applies a GNN module to the 2D molecular graph to learn and represent the potential function of critical substructures and makes predictions through a fully connected classifier. DTINet applies compact feature learning algorithms to capture the topological features of heterogeneous networks and optimizes prediction results by reconstructing the original interaction matrix. SGCL-DTI optimizes the metapath encoding features by contrasting topological views and semantic views, which are constructed by treating interaction pairs as nodes. GAEMDA fuses two kinds of similarity as node features and further reinforces the features by integrating neighbor information. A more detailed description of the methods is reported in Supplementary Note 1.
Statistics and reproducibility
In this study, we assess the statistical significance of the differences between the proposed model and the comparison baselines using one-way ANOVA. We then compare each method with our approach using Tukey’s HSD (Honestly Significant Difference) test to determine the significance levels between them. Unlike the t-test, which compares only two groups, one-way ANOVA can assess significant differences across multiple groups. This procedure identifies the differences among the compared groups, with a P-value < 0.05 considered significant.
Evaluation scenarios and datasets
We study the prediction performance of SpHN-VDA and five state-of-the-art baselines on two public datasets, HDVD and VDA2, and one extra public dataset, VDA_ex. HDVD is a web-accessible database87 of experimentally validated human drug–virus associations, assembled primarily by text mining drug–virus interaction entries from a large body of literature. After removing the drugs with missing spatial structure information, this dataset includes 34 viruses, 215 drugs and 453 confirmed human VDAs. VDA2 was derived by Shen et al.89 from the DrugVirus.info database88, which consists of various experimentally validated drug–virus resources. This dataset contains 755 drug–virus associations between 124 drugs and 69 viruses after removing the drugs with missing spatial structure information. The VDA_ex dataset is compiled from the DrugBank90, NCBI91, and PubMed92 databases. It is constructed from 93 known virus-drug associations involving 11 viruses similar to SARS-CoV-2, such as SARS-CoV, MERS-CoV, and influenza A viruses, as well as 75 small-molecule drugs for which structural information is available. All the sequence information for drugs and viruses can be collected from DrugBank90 and NCBI91, respectively.
In the evaluation experiments, we use two split strategies, random split and cold pair split, to evaluate the performance of our model. In the random split, the confirmed associations between drugs and viruses are regarded as positive samples, and negative samples are randomly sampled at positive-to-negative ratios of 1:1, 1:2, 1:5, and 1:10. To reduce the influence of chance on the results, the evaluation is repeated at each ratio with different negative samples drawn using different random seeds. Based on fivefold cross-validation, we randomly divide each experimental dataset into training, validation, and test sets at a ratio of 7:1:2. In the cold pair split, we divide the training set and test set by virus, simulating the drug repurposing scenario. Specifically, one-fifth of the viruses are randomly selected as cold-start viruses, whose pairs with drugs are used as test samples, while the pairs of the remaining viruses are used as training samples. We maintain a ratio of 1:1 between positive and negative samples when randomly sampling negative samples. Similarly, the evaluation is repeated with different cold-start viruses selected using different random seeds.
In this study, we use generic evaluation metrics for binary classification. The performance of SpHN-VDA and the baselines is evaluated by AUC (the area under the receiver operating characteristic curve), which measures the tradeoff between the true-positive rate and the false-positive rate across thresholds93; AUPR (the area under the precision-recall curve), which describes the tradeoff between recall and precision; and accuracy (ACC), the proportion of correctly predicted samples. AUC and AUPR serve as the primary metrics on both balanced and unbalanced samples, while ACC is used only as a supplementary metric because it is insensitive to class imbalance, i.e., it can remain high even when the minority class is poorly predicted.
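The cold pair split can be sketched as below: a fraction of the viruses is held out, their pairs form the test set, and negatives are sampled at a 1:1 ratio among the unconfirmed pairs of the corresponding viruses. Sampling negatives separately within the training and the cold viruses is an assumption of this sketch, and the function name, the seed, and the toy identifiers are hypothetical.

```python
import random


def cold_virus_split(positives, drugs, viruses, test_fraction=0.2, seed=0):
    """Hold out a fraction of viruses; pair each positive set with 1:1 sampled negatives."""
    rng = random.Random(seed)
    n_cold = max(1, round(len(viruses) * test_fraction))
    cold = set(rng.sample(sorted(viruses), n_cold))
    pos_set = set(positives)

    def with_negatives(pos, virus_pool):
        # Unconfirmed (drug, virus) pairs of the given viruses serve as candidates.
        candidates = [(d, v) for d in sorted(drugs) for v in sorted(virus_pool)
                      if (d, v) not in pos_set]
        neg = rng.sample(candidates, min(len(pos), len(candidates)))
        return [(p, 1) for p in pos] + [(n, 0) for n in neg]

    test_pos = [p for p in positives if p[1] in cold]
    train_pos = [p for p in positives if p[1] not in cold]
    train = with_negatives(train_pos, set(viruses) - cold)
    test = with_negatives(test_pos, cold)
    return train, test, cold


# Toy data (hypothetical identifiers): 4 drugs, 5 viruses, 5 confirmed pairs.
toy_drugs = ["d1", "d2", "d3", "d4"]
toy_viruses = ["v1", "v2", "v3", "v4", "v5"]
toy_pos = [("d1", "v1"), ("d2", "v2"), ("d3", "v3"), ("d4", "v4"), ("d1", "v5")]
train_set, test_set, cold_viruses = cold_virus_split(toy_pos, toy_drugs, toy_viruses)
```

Repeating the call with different values of `seed` gives the repeated evaluations over different cold-start viruses described above.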
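For reference, the three metrics can be computed with a few lines of plain Python. The sketch below estimates AUC as the fraction of correctly ordered positive-negative pairs, approximates AUPR by average precision (one common estimator, which ignores score ties), and computes ACC at a threshold of 0.5; the threshold, the function names, and the toy scores are assumptions for illustration.

```python
def auc(labels, scores):
    """Fraction of (positive, negative) pairs ranked correctly; ties count as 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def average_precision(labels, scores):
    """Mean of the precision values at the rank of each positive sample."""
    ranked = sorted(zip(scores, labels), key=lambda t: -t[0])
    hits, total = 0, 0.0
    for rank, (_, y) in enumerate(ranked, start=1):
        if y == 1:
            hits += 1
            total += hits / rank
    return total / hits


def accuracy(labels, scores, threshold=0.5):
    """Proportion of samples whose thresholded score matches the label."""
    return sum((s >= threshold) == (y == 1) for y, s in zip(labels, scores)) / len(labels)


# Toy predictions (hypothetical): two positives and two negatives.
toy_labels = [1, 0, 1, 0]
toy_scores = [0.9, 0.8, 0.7, 0.1]
toy_auc = auc(toy_labels, toy_scores)
toy_aupr = average_precision(toy_labels, toy_scores)
toy_acc = accuracy(toy_labels, toy_scores)
```

AUC and average precision depend only on the ranking of the scores, whereas ACC also depends on the chosen threshold, which is one reason it is treated as supplementary.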
Reporting summary
Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.
Data availability
All the SMILES information and the synergistic and antagonistic interactions of drugs were collected from DrugBank (https://go.drugbank.com/). The sequence information of viruses was collected from NCBI. In this study, additional data supporting the key findings are available within the Supplementary Information files. The relevant data are also available from the corresponding author upon reasonable request. Source data are provided with Supplementary Data 4.
Code availability
The open-source implementation of SpHN-VDA is available at the following GitHub repository (https://github.com/MrPhil/SpHN-VDA) with a DOI94 of https://doi.org/10.5281/zenodo.13881677.
References
Dickson, M. & Gagnon, J. P. Key factors in the rising cost of new drug discovery and development. Nat. Rev. Drug Discov. 3, 417–429 (2004).
Fernández-Torras, A., Duran-Frigola, M., Bertoni, M., Locatelli, M. & Aloy, P. Integrating and formatting biomedical data as pre-calculated knowledge graph embeddings in the Bioteque. Nat. Commun. 13, 5304 (2022).
Wang, R.-S. & Loscalzo, J. Repurposing drugs for the treatment of COVID-19 and its cardiovascular manifestations. Circ. Res. 132, 1374–1386 (2023).
Pushpakom, S. et al. Drug repurposing: progress, challenges and recommendations. Nat. Rev. Drug Discov. 18, 41–58 (2019).
Guy, R. K., DiPaola, R. S., Romanelli, F. & Dutch, R. E. Rapid repurposing of drugs for COVID-19. Science 368, 829–830 (2020).
Galindez, G. et al. Lessons from the COVID-19 pandemic for advancing computational drug repurposing strategies. Nat. Comput. Sci. 1, 33–41 (2021).
Smith, D. P. et al. Expert-augmented computational drug repurposing identified baricitinib as a treatment for COVID-19. Front. Pharmacol. 12, 709856 (2021).
Frantz, S. Drug discovery: playing dirty. Nature 437, 942 (2005).
McLean, S. R. et al. Imatinib binding and cKIT inhibition is abrogated by the cKIT kinase domain I missense mutation Val654Ala. Mol. Cancer Ther. 4, 2008–2015 (2005).
Bang, D., Lim, S., Lee, S. & Kim, S. Biomedical knowledge graph learning for drug repurposing by extending guilt-by-association to multiple layers. Nat. Commun. 14, 3570 (2023).
Zhang, Z. et al. Overcoming cancer therapeutic bottleneck by drug repurposing. Signal Transduct. Target. Ther. 5, 1–25 (2020).
Hua, Y. et al. Drug repositioning: progress and challenges in drug discovery for various diseases. Eur. J. Med Chem. 234, 114239 (2022).
Long, Y. et al. Heterogeneous graph attention networks for drug virus association prediction. Methods 198, 11–18 (2022).
Dotolo, S., Marabotti, A., Facchiano, A. & Tagliaferri, R. A review on drug repurposing applicable to COVID-19. Brief. Bioinform. 22, 726–741 (2021).
Chen, Z.-H., Zhao, B.-W., Li, J.-Q., Guo, Z.-H. & You, Z.-H. GraphCPIs: a novel graph-based computational model for potential compound-protein interactions. Mol. Ther. Nucleic Acids 32, 721–728 (2023).
Ren, Z.-H. et al. DeepMPF: deep learning framework for predicting drug–target interactions based on multi-modal representation with meta-path semantic analysis. J. Transl. Med. 21, 1–18 (2023).
Deepthi, K., Jereesh, A. & Liu, Y. A deep learning ensemble approach to prioritize antiviral drugs against novel coronavirus SARS-CoV-2 for COVID-19 drug repurposing. Appl. Soft Comput. 113, 107945 (2021).
Wang, Y., Zhai, Y., Ding, Y. & Zou, Q. SBSM-Pro: support bio-sequence machine for Proteins. Sci. China Inf. Sci. 67, 212106 (2024).
Ren, Z.-H. et al. A biomedical knowledge graph-based method for drug–drug interactions prediction through combining local and global features with deep neural networks. Brief. Bioinform 23, bbac363 (2022).
Wei, M.-M., Yu, C.-Q., Li, L.-P., You, Z.-H. & Wang, L. BCMCMI: a fusion model for predicting circRNA-miRNA interactions combining semantic and meta-path. J. Chem. Inf. Model. 63, 5384–5394 (2023).
Huang, Y.-a, Hu, P., Chan, K. C. & You, Z.-H. Graph convolution for predicting associations between miRNA and drug resistance. Bioinformatics 36, 851–858 (2020).
Gao, Z. et al. Hierarchical graph learning for protein–protein interaction. Nat. Commun. 14, 1093 (2023).
Chen, Z. et al. In silico prediction methods of self-interacting proteins: an empirical and academic survey. Front. Comput. Sci. 17, 173901 (2023).
Wang, X.-F. et al. A feature extraction method based on noise reduction for circRNA-miRNA interaction prediction combining multi-structure features in the association networks. Brief. Bioinform. 24, bbad111 (2023).
Ruiz, C., Zitnik, M. & Leskovec, J. Identification of disease treatment mechanisms through the multiscale interactome. Nat. Commun. 12, 1796 (2021).
Yang, J. et al. Deep learning identifies explainable reasoning paths of mechanism of action for drug repurposing from multilayer biological network. Brief. Bioinform 23, bbac469 (2022).
Zeng, X. et al. deepDR: a network-based deep learning approach to in silico drug repositioning. Bioinformatics 35, 5191–5198 (2019).
Wang, X.-F. et al. KS-CMI: a circRNA-miRNA interaction prediction method based on the signed graph neural network and denoising autoencoder. iScience 26, 107478 (2023).
Su, X. et al. A deep learning method for repurposing antiviral drugs against new viruses via multi-view nonnegative matrix factorization and its application to SARS-CoV-2. Brief. Bioinform 23, bbab526 (2022).
Li, M., Cai, X., Xu, S. & Ji, H. Metapath-aggregated heterogeneous graph neural network for drug–target interaction prediction. Brief. Bioinform 24, bbac578 (2023).
Himmelstein, D. S. et al. Systematic integration of biomedical knowledge prioritizes drugs for repurposing. eLife 6, e26726 (2017).
Wang, X. et al. DeepR2cov: deep representation learning on heterogeneous drug networks to discover anti-inflammatory agents for COVID-19. Brief. Bioinform 22, bbab226 (2021).
Song, Y., Zhou, C., Wang, X. & Lin Z. Ordered GNN: ordering message passing to deal with heterophily and over-smoothing. In The Eleventh International Conference on Learning Representations (Ithaca, 2023).
Montavon, G., Samek, W. & Müller, K.-R. Methods for interpreting and understanding deep neural networks. Digital Signal Process. 73, 1–15 (2018).
Belkoura, S., Zanin, M. & LaTorre, A. Fostering interpretability of data mining models through data perturbation. Expert Syst. Appl. 137, 191–201 (2019).
Yu, J.-L., Dai, Q.-Q. & Li, G.-B. Deep learning in target prediction and drug repositioning: Recent advances and challenges. Drug Discov. Today 27, 1796–1814 (2022).
Zhang, Y., Tiňo, P., Leonardis, A. & Tang, K. A survey on neural network interpretability. IEEE Trans. Emerg. Top. Comput. Intell. 5, 726–742 (2021).
Sun, H., Wang, G., Liu, Q., Yang, J. & Zheng, M. An explainable molecular property prediction via multi-granularity. Inf. Sci. 642, 119094 (2023).
Wang, H., Huang, F., Xiong, Z. & Zhang, W. A heterogeneous network-based method with attentive meta-path extraction for predicting drug–target interactions. Brief. Bioinform 23, bbac184 (2022).
Esser-Skala, W. & Fortelny, N. Reliable interpretability of biology-inspired deep neural networks. NPJ Syst. Biol. Appl. 9, 50 (2023).
Frolichs, K. M., Rosenblau, G. & Korn, C. W. Incorporating social knowledge structures into computational models. Nat. Commun. 13, 6205 (2022).
Zheng, J., Li, Q., Liao, J. & Wang, S. Explainable link prediction based on multi-granularity relation-embedded representation. Knowl. Based Syst. 230, 107402 (2021).
Verma, J., Khedkar, V. M. & Coutinho, E. C. 3D-QSAR in drug design-a review. Curr. Top. Med. Chem. 10, 95–115 (2010).
Yang, S.-Y. Pharmacophore modeling and applications in drug discovery: challenges and recent advances. Drug Discov. Today 15, 444–450 (2010).
Zeng, Z., Yao, Y., Liu, Z. & Sun, M. A deep-learning system bridging molecule structure and biomedical text with comprehension comparable to human professionals. Nat. Commun. 13, 862 (2022).
Schep, R. et al. Impact of chromatin context on Cas9-induced DNA double-strand break repair pathway balance. Mol. Cell 81, 2216–2230.e2210 (2021).
Sun, Y. et al. A graph neural network-based interpretable framework reveals a novel DNA fragility–associated chromatin structural unit. Genome Biol. 24, 90 (2023).
Lamb, J. et al. The Connectivity Map: using gene-expression signatures to connect small molecules, genes, and disease. Science 313, 1929–1935 (2006).
Veličković, P. et al. Graph attention networks. In International Conference on Learning Representations (Ithaca, 2018).
Bai, P., Miljković, F., John, B. & Lu, H. Interpretable bilinear attention network with domain adaptation improves drug–target prediction. Nat. Mach. Intell. 5, 126–136 (2023).
Luo, Y. et al. A network integration approach for drug-target interaction prediction and computational drug repositioning from heterogeneous information. Nat. Commun. 8, 573 (2017).
Li, Y., Qiao, G., Gao, X. & Wang, G. Supervised graph co-contrastive learning for drug–target interaction prediction. Bioinformatics 38, 2847–2854 (2022).
Li, Z., Li, J., Nie, R., You, Z.-H. & Bao, W. A graph auto-encoder model for miRNA-disease associations prediction. Brief. Bioinform 22, bbaa240 (2021).
Sun, Y., Ming, Y., Zhu, X. & Li, Y. Out-of-distribution detection with deep nearest neighbors. In International Conference on Machine Learning (PMLR, 2022).
Schütt, K. et al. Schnet: a continuous-filter convolutional neural network for modeling quantum interactions. Adv. Neural Inf. Process. Syst. 30, 992–1002 (2017).
Hou, Z., Yang, Y., Ma, Z., Wong, K.-c & Li, X. Learning the protein language of proteome-wide protein-protein binding sites via explainable ensemble deep learning. Commun. Biol. 6, 73 (2023).
He, K., Zhang, X., Ren, S. & Sun, J. Deep residual learning for image recognition. In Proc. IEEE Conference on Computer Vision and Pattern Recognition (IEEE/CVF, 2016).
Van der Maaten, L. & Hinton G. Visualizing data using t-SNE. J. Mach. Learn. Res. 9, 2579–2605 (2008).
Zhan, H., Zhu, X., Qiao, Z. & Hu, J. Graph neural tree: a novel and interpretable deep learning-based framework for accurate molecular property predictions. Anal. Chim. Acta 1244, 340558 (2023).
Zhang, Y. Bayesian semi-supervised learning for uncertainty-calibrated prediction of molecular properties and active learning. Chem. Sci. 10, 8154–8163 (2019).
Ryu, S., Kwon, Y. & Kim, W. Y. A Bayesian graph convolutional network for reliable prediction of molecular properties with uncertainty quantification. Chem. Sci. 10, 8438–8446 (2019).
De, P., Kar, S., Ambure, P. & Roy, K. Prediction reliability of QSAR models: an overview of various validation tools. Arch. Toxicol. 96, 1279–1295 (2022).
Lewell, X. Q., Judd, D. B., Watson, S. P. & Hann, M. M. Recap retrosynthetic combinatorial analysis procedure: a powerful new technique for identifying privileged molecular fragments with useful applications in combinatorial chemistry. J. Chem. Inf. Comput. Sci. 38, 511–522 (1998).
Pinzi, L. & Rastelli, G. Molecular docking: shifting paradigms in drug discovery. Int J. Mol. Sci. 20, 4331 (2019).
Valencia, D. N. Brief review on COVID-19: the 2020 pandemic caused by SARS-CoV-2. Cureus 12, e7386 (2020).
Gupta, A. et al. Extrapulmonary manifestations of COVID-19. Nat. Med 26, 1017–1032 (2020).
Murakami, N. et al. Therapeutic advances in COVID-19. Nat. Rev. Nephrol. 19, 38–52 (2023).
Reghunathan, R. et al. Expression profile of immune response genes in patients with severe acute respiratory syndrome. BMC Immunol. 6, 1–11 (2005).
Barrett, T. et al. NCBI GEO: archive for functional genomics data sets—update. Nucleic Acid Res. 41, D991–D995 (2012).
Naasani, I. J. A. P. COMPARE analysis as an efficient bioinformatic approach to accelerate repurposing of existing drugs against Covid-19 and other emerging epidemics. Authorea Preprints at https://www.techrxiv.org/doi/full/10.22541/au.159611489.95884381 (2020).
Dyall, J. et al. Repurposing of clinically developed drugs for treatment of Middle East respiratory syndrome coronavirus infection. Antimicrob Agents Ch. 58, 4885–4893 (2014).
Hosseini, F. S. & Motamedi, M.R. Mulberrofuran G, a potent inhibitor of spike protein of SARS corona virus 2. J. Pharm. Care. 9, 74–81 (2021).
Hoffmann, M. et al. SARS-CoV-2 cell entry depends on ACE2 and TMPRSS2 and is blocked by a clinically proven protease inhibitor. Cell 181, 271–280.e8 (2020).
Morris, G. M. et al. AutoDock4 and AutoDockTools4: automated docking with selective receptor flexibility. J. Comput. Chem. 30, 2785–2791 (2009).
Schenone, M., Dančík, V., Wagner, B. K. & Clemons, P. A. Target identification and mechanism of action in chemical biology and drug discovery. Nat. Chem. Biol. 9, 232–240 (2013).
Ye, Q. et al. A unified drug–target interaction prediction framework based on knowledge graph and recommendation system. Nat. Commun. 12, 6775 (2021).
Li, J. et al. Semi-supervised graph classification: a hierarchical graph perspective. In 2019 The World Wide Web Conference (ACM, 2019).
Katoh, K. & Standley, D. M. MAFFT multiple sequence alignment software version 7: improvements in performance and usability. Mol. Biol. Evol. 30, 772–780 (2013).
Zeng, X. et al. Target identification among known drugs by deep learning from heterogeneous networks. Chem. Sci. 11, 1775–1797 (2020).
O’Boyle, N. M. et al. Open Babel: an open chemical toolbox. J. Cheminform. 3, 1–14 (2011).
Fu, S., Wang, G. & Xu, J. hier2vec: interpretable multi-granular representation learning for hierarchy in social networks. Int. J. Mach. Learn. Cybern. 12, 2543–2557 (2021).
Deng, Y. et al. A multimodal deep learning framework for predicting drug–drug interaction events. Bioinformatics 36, 4316–4322 (2020).
Shi, J.-Y., Mao, K.-T., Yu, H. & Yiu, S.-M. Detecting drug communities and predicting comprehensive drug–drug interactions via balance regularized semi-nonnegative matrix factorization. J. Cheminform. 11, 1–16 (2019).
Ren, Z.-H. et al. SAWRPI: a stacking ensemble framework with adaptive weight for predicting ncRNA-protein interactions using sequence information. Front. Genet. 13, 839540 (2022).
Huang, J., Shen, H., Hou, L. & Cheng, X. SDGNN: learning node representation for signed directed networks. In Proc. AAAI Conference on Artificial Intelligence (AAAI, 2021).
Tosco, P., Stiefl, N. & Landrum, G. Bringing the MMFF force field to the RDKit: implementation and validation. J. Cheminform. 6, 1–4 (2014).
Meng, Y., Jin, M., Tang, X. & Xu, J. Drug repositioning based on similarity constrained probabilistic matrix factorization: COVID-19 as a case study. Appl. Soft Comput. 103, 107135 (2021).
Andersen, P. I. et al. Discovery and development of safe-in-man broad-spectrum antiviral agents. Int J. Infect. Dis. 93, 268–276 (2020).
Shen, L. et al. VDA-RWLRLS: an anti-SARS-CoV-2 drug prioritizing framework combining an unbalanced bi-random walk and Laplacian regularized least squares. Comput Biol. Med. 140, 105119 (2022).
Wishart, D. S. et al. DrugBank 5.0: a major update to the DrugBank database for 2018. Nucleic Acid Res. 46, D1074–D1082 (2018).
Sayers, E. W. et al. Database resources of the National Center for Biotechnology Information. Nucleic Acid Res. 49, D10 (2021).
White, J. PubMed 2.0. Med. Ref. Serv. Q. 39, 382–387 (2020).
Fawcett, T. An introduction to ROC analysis. Pattern Recognit. Lett. 27, 861–874 (2006).
Ren, Z. A spatial hierarchical network learning framework for drug repositioning allowing interpretation from macro to micro scale. https://doi.org/10.5281/zenodo.13881676 (2024).
Acknowledgements
This research was supported by the National Natural Science Foundation of China (No. 62131004, No. 62250028), the Sichuan Provincial Science Fund for Distinguished Young Scholars (2021JDJQ0025), and the Municipal Government of Quzhou (No. 2022D040).
Author information
Contributions
Z.R. and X.Z. wrote the first draft of the manuscript. Q.Z., Y.L., and H.Z. revised the manuscript to the submitted version. Z.R., Q.Z., X.Z., and Z.Y. conceived the study. Z.R. designed all the experiments and wrote the codebase of SpHN-VDA. Z.R., Q.Z., and H.Z. conducted the benchmarks and ran all of the analyses. Z.R. and H.X. collected and preprocessed all datasets. Z.R., X.Z., Y.L., and H.Z. contributed to data analysis and model discussion. Z.R., Q.Z., and Z.Y. designed the figures for the overall framework. Z.R., X.Z., and Y.L. completed the visualizations. Q.Z. supervised the research. All of the authors reviewed the manuscript and approved it for submission.
Ethics declarations
Competing interests
The authors declare no competing interests.
Peer review
Peer review information
Communications Biology thanks Xin Gao and the other, anonymous, reviewer(s) for their contribution to the peer review of this work. Primary Handling Editors: Laura Rodríguez Perez and David Favero. A peer review file is available.
Additional information
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.
About this article
Cite this article
Ren, Z., Zeng, X., Lao, Y. et al. A spatial hierarchical network learning framework for drug repositioning allowing interpretation from macro to micro scale. Commun Biol 7, 1413 (2024). https://doi.org/10.1038/s42003-024-07107-3
DOI: https://doi.org/10.1038/s42003-024-07107-3