Introduction

Background

Research on complex diseases, such as cancer, diabetes, and cardiovascular diseases, has always been a significant and challenging issue due to the interplay of genetic and environmental factors1. In the past few decades, although the development of genomics and life sciences has been rapid, new drug research remains time-consuming, costly, and characterized by a low success rate. Consequently, investigations into new drugs have begun to stagnate. According to conservative estimates, developing a new drug typically takes 10–15 years, with research and development costs nearly doubling to $2 billion, while the return on investment in research and development decreased from 10% in 2010 to 2% in 20192,3. Phase I to Phase III clinical trials are essential in drug development, typically lasting 3–7 years. Phase I trials usually last about 1 year, Phase II trials typically last 2 years, and Phase III trials may last 4 to 5 years. Due to the unpredictable side effects of drugs with novel structures, approximately 90% of experimental drugs fail Phase I clinical trials4, and 50% fail to reach the market in Phase III due to poor efficacy5,6. Some drugs that pass Phase III may still be withdrawn during Phase IV market surveillance. Therefore, the development and discovery of new drugs face significant challenges. In this context, drug repositioning becomes particularly important and urgent. Compared to innovator drugs, drug repositioning offers significant advantages in research time, funding, and success rates, as illustrated in Fig. 1. As a drug development strategy, drug repositioning presents clear cost-benefit advantages, significantly reducing the costs and risks associated with new drug development while shortening the time frame between discovery and clinical availability7.

Fig. 1 Comparison of innovator drug and drug repositioning.

Because developing new drugs for intractable diseases is complex and lengthy, the application of Artificial Intelligence (AI) technology has become increasingly important8,9,10,11,12,13,14. The application of AI in drug discovery relies on computer-aided drug design, which combines extensive chemical and biological data to establish high-quality machine learning models. This approach guides the discovery and optimization of compounds during target screening, molecular structure and chemical space analysis, ligand-receptor interaction simulation, and three-dimensional quantitative structure-activity relationship analysis of drugs15,16. In drug repositioning research, graph neural network-based methods effectively handle complex graph-structured data, such as drug-target interaction networks and biomolecule networks, capturing potential relationships between drugs and targets while demonstrating strong generalization ability. Knowledge graph-based methods using multi-source data fusion can integrate biological information from various sources, enhance data comprehensiveness, and improve the accuracy of drug repositioning. Machine learning-based methods can quickly process and analyze large-scale datasets. However, several problems arise during the research process:

  • Insufficient semantic features in entity and relationship representation: The semantic features of entity and relationship representations in datasets are insufficient. Traditional word embedding technology is designed for general natural language, making it less practical for knowledge graph embedding, while traditional translation models embed only the knowledge graph structure and capture weak semantic features. Integrating text semantic information through deep learning methods, such as convolutional neural networks, is also hampered by difficult data acquisition and high complexity.

  • Currently, the efficiency of knowledge fusion technology is low: Knowledge graphs involve massive amounts of data and complex technologies. As a key component, knowledge fusion is still not mature enough in both theoretical research and practical application. Tasks such as knowledge embedding and entity alignment do not meet the efficiency and accuracy requirements for large-scale data processing in today’s industry. Given the sparse nature of knowledge graphs and the semantic complexity of natural language, improving the efficiency of knowledge fusion remains a priority in big data intelligence.

  • Quality verification of screened drug candidates: Even the results of superior models must be verified and evaluated, and this verification serves as a standard for measuring model quality. Determining effective methods for this verification remains a challenging issue.

  • Large drug sample sets: The dataset of old drugs is extensive, with many similarities among different drugs. Drug screening results may recommend too many drugs, limiting the range of older drugs available for repositioning.

This study presents several innovations and research initiatives to address the difficulties mentioned above:

  • Introduction of attention mechanisms in translation and bilinear models: The translation model TransE is concise and effective; however, it has limitations in handling complex relations. The model is limited to the knowledge graph structure, and its semantic features for entity and relation representation are weak. The bilinear model computes the potential semantic reliability of entities and relations in vector space. It employs a global attention mechanism to assign attribute node features to entity representations and a self-attention mechanism to process label word order and relational representations, integrating text features. This approach addresses the shortcomings of insufficient semantic features in entity and relation representations and effectively enhances the quality of knowledge embeddings.

  • Integrating multiple models: In this research, several models with attention mechanisms are integrated, and their screening results are combined to improve the quality of drug screening outcomes.

  • Quality verification of screened candidate drugs: Predict and rank scores for all possible combinations of (drug, treatment, virus) triplets. Next, compare the top-ranked drugs with those currently undergoing clinical trials to identify several high-scoring candidate drugs.

  • Feature extraction does not depend on a single model: Accuracy is improved by not relying solely on one model for feature extraction. After further screening of the integrated results, the better outcomes across the different models can be identified.

In this paper, a method based on knowledge graph embedding is proposed to rapidly identify the potential efficacy of known drugs. It uses COVID-19 as an example to verify the effectiveness of the method and minimize the conversion gap between preclinical test results and clinical results. Since the knowledge graph has been successfully established, it can be easily expanded in the future to cover various categories, symptoms, disease genetic characteristics, and more. This will enable broader drug repositioning screening for both infectious and non-infectious diseases17.

Related work

Attention mechanism

The application of the attention mechanism is very extensive. In medicine, the attention mechanism is frequently used for analyzing and processing medical images. Feng Y. et al. proposed a novel medical image segmentation network, PAMSNet, which enhances image details using efficient pyramid attention and channel-space attention modules18. Di J. et al. proposed an image fusion method utilizing an improved attention mechanism and a decomposition network, achieving better texture preservation and sharper edge contours19. The attention mechanism also plays a significant role in drug repositioning. Zu J. et al. proposed a drug repositioning method that combines word vector representation with the attention mechanism, enhancing the classification accuracy of drug-target protein interaction predictions20. Tang X. F. et al. proposed a drug repositioning method based on a bilinear attention network, introducing a layer attention mechanism to combine embeddings from different graph convolutional layers, resulting in more expressive representations of drugs and diseases21.

Knowledge graph

In 2012, Google proposed the Google Knowledge Graph, from which the term “knowledge graph” is derived, and improved search engine performance through this technology. A knowledge graph is a graph-based data structure composed of points and edges. Each point represents an “entity,” and each edge represents a “relationship” between entities. In essence, a knowledge graph is a relational network that connects various types of information, facilitating better queries of complex associated information. Understanding user intent at a semantic level allows for problem analysis from a relational perspective. Knowledge Graph Embedding is a technique that converts entities and relationships in high-dimensional, sparse knowledge graph data into low-dimensional, dense vector representations, enabling the computation and inference of semantic relationships in a vector space. Graph embedding has shown significant potential in elucidating molecular mechanisms and predicting the biological activities of repurposed drugs for various diseases. Graph embedding can simulate and analyze interactions between drug molecules and biological targets, identify key drug-target interaction patterns, and enhance understanding of how these interactions affect disease occurrence, development, and treatment. It can accurately predict the biological activities of existing drugs for other diseases. These predictions not only accelerate drug discovery but also provide new insights and directions for clinical treatment, especially in the face of urgent public health challenges or diseases that are under-researched.

With the development of AI technology, knowledge graphs have significant applications in the medical fields of diagnosis and treatment, drug research and development, and knowledge management. Guo Z. Q. et al. developed an intelligent question-and-answer platform for proprietary Chinese medicines using technologies such as knowledge graphs, natural language processing for multi-label text classification, named entity recognition, and speech recognition. This platform quickly and accurately queries relevant information based on users’ questions and presents a relevant knowledge graph to assist users in understanding proprietary Chinese medicines22. Wu D. et al. developed an automatic question-and-answer system for cardiovascular diseases based on a cardiovascular disease knowledge graph to effectively answer users’ questions regarding the diagnosis of symptoms and drug recommendations23. Remy C. et al. built a knowledge graph based on patient symptoms to help doctors quickly identify potential rare diseases24. Weng H. et al. designed a framework for constructing and applying a Traditional Chinese Medicine (TCM) knowledge base based on representation learning, which is effective in knowledge discovery and aiding decision-making in diagnosis and treatment25. Fu Z. X. et al. proposed a multi-link predictive reasoning algorithm based on rules and a Markov Logic Network (MLN) with a research background in TCM Visceral Syndrome Differentiation, which provides auxiliary diagnosis for clinical practice in TCM26. Li J. et al. used knowledge graph technology to connect scattered information related to the plague, creating a comprehensive knowledge graph focused on this disease. This graph facilitates in-depth exploration of the complex pathogenesis and potential treatment methods for the plague. They also identified the value of Coptis, Rhubarb, and other varieties of traditional Chinese medicine, as well as moxibustion therapy, for the treatment and prevention of plague27. 
Lu Y. W. et al. extracted biomedical knowledge, constructed a gastric cancer knowledge graph, and used knowledge embedding vectors to predict that nine drugs could treat gastric cancer, thereby verifying the medical application value of the knowledge graph28. Lyu Y. H. et al. used the tools SemRep and Metamap, based on the Unified Medical Language System (UMLS), to obtain autism drug entity triplets and construct an autism drug entity knowledge graph. Based on the knowledge graph, 27 potential drugs for autism were screened using three semantic paths, providing a theoretical and methodological basis for drug repositioning29. Fan M. et al. proposed a storage structure and visualization technology based on knowledge graphs to visualize the relevant attributes of Tibetan medicine. This approach aims to enhance the integration of Tibetan medicine knowledge and improve data correlation, ultimately better exploring the potential value of Tibetan medicine data30. Ouzounis S. et al. proposed a data-driven approach to facilitate the reuse of diabetes drugs by integrating heterogeneous biomedical data into a unified knowledge graph31. Xi C. C. combined knowledge graph with qualitative and quantitative research methods, and compared the research of Sun Yikui, Zhao Xianke, and Zhang Jiebin through grounded theory and data mining, greatly improving the efficiency of analysis32. Wang Q. et al. constructed 643 medical records related to the treatment of coronary heart disease, featuring 144 renowned TCM doctors, to provide a methodological reference for the inheritance of their experiences33. Xiong W. P. et al. extracted information on Chinese Patent Medicine (CPM), diseases, symptoms, and other relevant data from electronic medical records, constructed a knowledge graph, and based on this graph, developed a rule base for CPM monitoring, ensuring comprehensive application of CPM to guarantee drug safety34.

Research content

Since the safety of marketed drugs has been clinically verified, their pharmacokinetic characteristics have been clearly defined, and their production processes, quality standards, and dosage forms are complete, the amount of preclinical research needed is significantly reduced compared to developing drugs from scratch. Only the main pharmacodynamics of the new indication need clarification. In the clinical research stage, if the drug dosage form and administration mode are the same as for the original indication, and if the dosage and administration time are less than or equal to those of the original indication, the clinical study for the new indication can directly enter phase IIb, generally without requiring phase I and phase IIa clinical trials. Compared to traditional new drug development, drug repositioning can effectively shorten the drug research and development cycle, reduce costs, and avoid risks, making it a very promising drug development strategy. The comparison of the development process of the innovator drug and drug repositioning is shown in Fig. 2.

Fig. 2 Comparison of the development process of innovator drugs and drug repositioning.

This research comprises three parts: the construction of a knowledge graph, the training of its embedding models, and the validation of the models' predictions.

Data and algorithms offer new opportunities for constructing knowledge graphs. Knowledge graphs serve as the foundational support for AI, making them essential. The Drug Repurposing Knowledge Graph (DRKG) is a large-scale drug repositioning knowledge graph created collaboratively by the Amazon Shanghai AI Laboratory, Amazon AI North America, the University of Minnesota, Ohio State University, and Hunan University. It holds significant reference and application value. This article constructs a knowledge graph using DRKG, which includes DrugBank and 24 million publicly available publications. A new knowledge graph is created by assembling six coronaviruses as comprehensive nodes for COVID-19. We randomly divide the dataset into training, validation, and testing sets in a ratio of 9:0.5:0.5.
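The 9:0.5:0.5 split described above can be sketched as follows. This is a minimal illustration in plain Python, not the authors' pipeline: the function name `split_triples`, the fixed seed, and the toy triples are assumptions introduced here.

```python
import random

def split_triples(triples, ratios=(0.9, 0.05, 0.05), seed=0):
    """Shuffle triples and split them into train/validation/test lists.

    The ratios (0.9, 0.05, 0.05) correspond to the 9:0.5:0.5 split in the text.
    The seed is an assumption, added only to make the sketch reproducible.
    """
    if abs(sum(ratios) - 1.0) > 1e-9:
        raise ValueError("ratios must sum to 1")
    shuffled = list(triples)
    random.Random(seed).shuffle(shuffled)
    n_train = round(len(shuffled) * ratios[0])
    n_valid = round(len(shuffled) * ratios[1])
    train = shuffled[:n_train]
    valid = shuffled[n_train:n_train + n_valid]
    test = shuffled[n_train + n_valid:]  # remainder goes to the test set
    return train, valid, test

# Toy (head, relation, tail) triples; the names are placeholders, not DRKG data.
toy = [(f"Compound::{i}", "treats", "Disease::CoV") for i in range(200)]
train, valid, test = split_triples(toy)
```

Assigning the remainder to the test set guarantees that every triple lands in exactly one partition even when the ratios do not divide the dataset evenly.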

With the development of Internet technology, data has shown explosive growth. However, due to the multi-source heterogeneous content on the Internet and its loose organizational structure, using this information efficiently has become difficult. Knowledge graphs aim to transform massive amounts of unstructured or semi-structured data into standardized, unified, reliable, and effective structured knowledge. This transformation creates a highly interconnected semantic web to support data mining and intelligent services35,36,37. A knowledge graph describes various entities, concepts, and their relationships in the real world. It can essentially be seen as a directed graph-structured network. In the graph, nodes represent entities or concepts, while edges represent the relationships between entities and other entities (or between entities and concepts). Based on this highly structured knowledge, diverse data mining applications and intelligent services can be developed. Knowledge Embedding (KE), also known as Knowledge Representation Learning (KRL), aims to learn vector representations of entities and relations38, transforming the symbolic form of knowledge into computable real-valued vectors. It is the core technology in the entire knowledge graph construction process and often provides vector input as part of knowledge fusion. In this study, the translation model TransE and the bilinear models DistMult and Rescal are selected. These models are combined with attention mechanisms to create Attranse, Attdismult, and Attrescal, which are trained on the knowledge graph obtained in our research39.

In the final validation stage, the model’s effectiveness can be demonstrated through prediction scores, cross-comparison of multiple model scores, comparative analysis with COVID-19 clinical drugs, drug-gene characteristics, and HCoV-induced enrichment analysis.

The model flowchart of this research is shown in Fig. 3:

Fig. 3 Model flow.

Methods

Knowledge graph embedding model

The embedding of the knowledge graph utilizes machine learning to represent the semantic information of research objects as dense low-dimensional vectors. This effectively addresses data sparsity and enhances knowledge fusion and reasoning performance. These models consider the collaboration and computational costs among entities. They represent entities with vectors and perform matrix transformations on these vectors or their relationships. Additionally, they propose evaluation functions to measure the correlation between entities. Graph embedding represents complex biochemical mechanisms as relationships in a low-dimensional vector space. Using graph embedding, the model predicts new drug action mechanisms based on known (drug, treatment, virus) interactions. Embedding vectors help discover potential drug roles in unexplored biochemical mechanisms, thus facilitating drug repositioning.

Attention mechanism model

The attention mechanism enables the model to learn how to allocate its attention by weighting input signals. The primary purpose of the attention mechanism is to score the various dimensions of input and weight features based on these scores, highlighting the impact of important features on downstream models.

In this study, a global attention mechanism is applied to head and tail entities, while a self-attention mechanism is utilized for relationships. In the self-attention mechanism, each element of a sequence makes attention calculations with all other elements, capturing the influence relationships between them without any additional information. Its effectiveness has been demonstrated in machine reading, text summarization, and image annotation.
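To make the self-attention computation concrete, the following sketch implements the common scaled dot-product form in plain Python. The text does not specify which variant is used, so the scaling by the square root of the dimension and the omission of learned query, key, and value projections are assumptions made for brevity.

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def self_attention(seq):
    """Scaled dot-product self-attention with identity projections (assumption).

    seq: list of n vectors of dimension d.
    Each output vector is a weighted average of all input vectors, with
    weights given by the similarity of that element to every element.
    """
    d = len(seq[0])
    out = []
    for q in seq:
        weights = softmax([dot(q, k) / math.sqrt(d) for k in seq])
        out.append([sum(w * v[i] for w, v in zip(weights, seq)) for i in range(d)])
    return out

# Three toy "label word" vectors; the values are illustrative only.
words = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
context = self_attention(words)
```

Because every element attends to every other element of the same sequence, no external query is needed, which matches the description of self-attention given above.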

Translation model

The translation model is a key method of knowledge embedding, with TransE being the most popular and representative model. This model is simple and effective, achieving good results with high performance on large-scale knowledge graphs, which are widely studied in translation models. TransE is a distance-based knowledge graph embedding method that assumes relationships between entities can be represented as translations in vector space. This model learns entity and relationship representations through translation operations and distance metrics. The distance function is typically used to measure the difference between the predicted and true vectors. It interprets the relationship in a knowledge graph triplet as the translation operation from the head entity to the tail entity in the embedded space. For a triplet (h, r, t), the head entity vector plus the relationship vector in the embedded space should be as close as possible to the tail entity vector; that is, h + r ≈ t, as shown in Fig. 4.

Fig. 4 TransE model.

The triplet evaluation function is defined as formula (1):

$$f_{{transe}} (h,r,t) = \left\| {h + r - t} \right\|_{2}^{2}$$
(1)

where h represents the head entity vector, r represents the relationship vector, and t represents the tail entity vector.
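Formula (1) can be evaluated directly, as in the sketch below. The function name `transe_score` and the toy vectors are assumptions introduced for illustration; a lower value indicates a more plausible triplet.

```python
def transe_score(h, r, t):
    """Squared L2 distance ||h + r - t||^2 from formula (1)."""
    return sum((hi + ri - ti) ** 2 for hi, ri, ti in zip(h, r, t))

# Toy 2-dimensional embeddings (illustrative values, not trained vectors).
h = [0.25, 0.5]
r = [0.5, -0.25]
t = [0.75, 0.25]

perfect = transe_score(h, r, t)          # h + r coincides with t
worse = transe_score(h, r, [0.0, 0.0])   # a tail far from h + r
```

Here `perfect` is zero because h + r equals t exactly, while the mismatched tail yields a strictly larger distance.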

Bilinear model

  1. Rescal

Rescal is a matrix factorization-based bilinear model that uses tensor decomposition to learn low-dimensional vector representations of entities and relationships in knowledge graphs. It represents the knowledge graph as a third-order tensor, T, whose first mode indexes the head entity, second mode the relationship, and third mode the tail entity. If the triplet (h, r, t) exists in the knowledge graph and the indices of h, r, and t along their respective modes are i, j, and k, then T_{i,j,k} = 1; otherwise, T_{i,j,k} = 0. Rescal obtains the representations of entities and relationships in vector space through tensor decomposition. Each entity is represented as a vector that captures its implicit semantics, and each relationship is represented as a matrix that models the interactions between the dimensions of the head and tail entities. Figure 5 illustrates the model.

Fig. 5 Rescal model.
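The third-order tensor described above can be illustrated on a toy graph. The sketch below uses nested Python lists and the index order head, relationship, tail given in the text; the helper name `build_tensor` and the entity and relationship names are assumptions. A dense tensor of this kind is only practical for very small graphs.

```python
def build_tensor(triples, entities, relations):
    """Build T with T[i][j][k] = 1 if (entity i, relation j, entity k) is a known triplet."""
    e_idx = {e: i for i, e in enumerate(entities)}
    r_idx = {r: j for j, r in enumerate(relations)}
    n, m = len(entities), len(relations)
    # Index order: head entity, relationship, tail entity.
    T = [[[0] * n for _ in range(m)] for _ in range(n)]
    for h, r, t in triples:
        T[e_idx[h]][r_idx[r]][e_idx[t]] = 1
    return T

# Toy graph with placeholder names.
entities = ["drugA", "geneB", "CoV"]
relations = ["targets", "treats"]
triples = [("drugA", "targets", "geneB"), ("drugA", "treats", "CoV")]
T = build_tensor(triples, entities, relations)
```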

The evaluation function of Rescal for triplet (h, r, t) is defined as follows:

$$f_{{rescal}} (h,r,t) = h^{T} M_{r} t$$
(2)

Here, h, t ∈ R^k represent the vectors of the head and tail entities, while M_r ∈ R^{k×k} is the matrix corresponding to the relationship r. The evaluation function uses this bilinear form to measure the semantic correlation of h and t under the relationship r.

The RESCAL model learns embedded representations of entities and relationships by optimizing algorithms during training to minimize the error between predictions and actual triplets. After training, the learned embedding vectors can be used for tasks such as relationship inference, entity classification, and link prediction in knowledge graphs.
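The bilinear evaluation function in formula (2) can be written out as a double sum over the dimensions of h and t. The sketch below is a plain-Python illustration; the function name `rescal_score` and the toy values are assumptions.

```python
def rescal_score(h, M_r, t):
    """Bilinear score h^T M_r t from formula (2).

    h, t: entity vectors of length k; M_r: k x k relationship matrix.
    Every pair of dimensions (i, j) contributes h[i] * M_r[i][j] * t[j].
    """
    return sum(
        h[i] * M_r[i][j] * t[j]
        for i in range(len(h))
        for j in range(len(t))
    )

# Toy values (illustrative only).
h = [1.0, 2.0]
t = [3.0, 4.0]
M_r = [[1.0, 0.5],
       [0.0, 2.0]]
score = rescal_score(h, M_r, t)
```

The off-diagonal entry M_r[0][1] lets the first dimension of h interact with the second dimension of t, which is the cross-dimension interaction that the full relationship matrix provides.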

  2. DistMult

The core idea of DistMult is to model the interaction between entities and relationships using the inner product, rather than simply adding or concatenating them. This approach makes the model more flexible and better able to capture multiple correlations between entities and relationships. Building on Rescal, DistMult improves model performance by restricting the relationship matrix M_r to a diagonal matrix. For the triplet (h, r, t), DistMult represents the head and tail entities as vectors h, t ∈ R^k. The model is illustrated in Fig. 6.

Fig. 6 DistMult model.

The evaluation function is defined as:

$$f_{{distmult}} (h,r,t) = h^{T} \text{diag}\left( {M_{r} } \right)t$$
(3)

Since the matrix M_r is diagonal, the evaluation function captures correlations only between matching dimensions of h and t, reducing the number of parameters per relationship from k × k in Rescal to k.
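The relationship between the two bilinear scores can be checked numerically: with a diagonal relationship matrix, the Rescal form collapses to an element-wise product. The sketch below is illustrative; the function names and toy values are assumptions.

```python
def distmult_score(h, r_diag, t):
    """DistMult score from formula (3): sum over i of h[i] * r_diag[i] * t[i]."""
    return sum(hi * ri * ti for hi, ri, ti in zip(h, r_diag, t))

def rescal_score(h, M_r, t):
    """Full bilinear score h^T M_r t, included here only for comparison."""
    return sum(
        h[i] * M_r[i][j] * t[j]
        for i in range(len(h))
        for j in range(len(t))
    )

h = [1.0, 2.0]
t = [3.0, 4.0]
r_diag = [1.0, 2.0]  # the k diagonal entries of M_r

# Embed the diagonal into a full k x k matrix to compare both forms.
k = len(r_diag)
M_diag = [[r_diag[i] if i == j else 0.0 for j in range(k)] for i in range(k)]

d_score = distmult_score(h, r_diag, t)
r_score = rescal_score(h, M_diag, t)
params_distmult, params_rescal = k, k * k
```

One consequence of the diagonal form is that the score is symmetric in h and t, so swapping the head and tail entities leaves it unchanged.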

The combination of attention mechanism and knowledge graph embedding

For the representation of entities and relationships in the knowledge graph, the model considers two kinds of features. The first is their own semantic features: for entity representation, the model uses a global attention mechanism to incorporate attribute features, and for relationship representation, it employs a self-attention mechanism to extract semantic features from relationship label words. The second is the structural features of the triplet (h, r, t).

The entity attributes and their respective characteristics are determined by the DRKG knowledge graph, which includes 97,238 entities of 13 entity types and 5,874,261 triplets belonging to 107 relationship types. Before training, a list of coronavirus (CoV) diseases in DRKG is collected, with all coronavirus diseases serving as targets. We assemble six types of coronaviruses (including SARS-CoV, MERS-CoV, HCoV-229E, and HCoV-NL63) as comprehensive nodes of coronaviruses and reconnect the links between genes and drugs. The generated knowledge graph contains four types of entities: drugs, genes, diseases, and drug-related information, along with 39 relationships, 145,179 nodes, and 15,018,067 edges.

In the embedding space, the head and tail entities obtain their attribute features through the global attention mechanism, the relationship obtains its semantic features through the self-attention mechanism, and the translation operation is then performed. We refer to this network as Attranse, as shown in Fig. 7.

Fig. 7 Attranse model.
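One possible reading of this combination is sketched below. The text does not give the exact architecture, so every design detail here is a hypothetical choice for illustration: the dot-product attention weights, the residual addition of pooled attribute features, the mean pooling of attended label words, and all function names. It should not be taken as the Attranse implementation.

```python
import math

def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def global_attention(entity, attributes):
    """Hypothetical global attention: add an attention-weighted sum of
    attribute vectors to the entity vector."""
    if not attributes:
        return list(entity)
    weights = softmax([dot(entity, a) for a in attributes])
    pooled = [sum(w * a[i] for w, a in zip(weights, attributes))
              for i in range(len(entity))]
    return [e + p for e, p in zip(entity, pooled)]

def relation_from_label(word_vectors):
    """Hypothetical relationship encoder: self-attend over the label word
    vectors, then mean-pool them into a single relationship vector."""
    d = len(word_vectors[0])
    attended = []
    for q in word_vectors:
        w = softmax([dot(q, k) / math.sqrt(d) for k in word_vectors])
        attended.append([sum(wi * v[i] for wi, v in zip(w, word_vectors))
                         for i in range(d)])
    return [sum(vec[i] for vec in attended) / len(attended) for i in range(d)]

def attranse_like_score(h, h_attrs, label_words, t, t_attrs):
    """TransE-style squared distance on the attention-enriched vectors."""
    h2 = global_attention(h, h_attrs)
    t2 = global_attention(t, t_attrs)
    r = relation_from_label(label_words)
    return sum((a + b - c) ** 2 for a, b, c in zip(h2, r, t2))

# Toy inputs (illustrative values only).
score = attranse_like_score(
    [0.1, 0.2], [[0.0, 1.0], [1.0, 0.0]],  # head entity and its attributes
    [[0.5, 0.5]],                           # relationship label word vectors
    [0.3, 0.1], [[1.0, 1.0]],               # tail entity and its attributes
)
```

The same enriched vectors could be passed to a bilinear evaluation function instead of the translation distance, which is the analogous construction for the bilinear variants.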

The bilinear models, Rescal and DistMult, capture the latent semantics of each entity in a vector representation. Each relationship is represented as a matrix that models the pairwise interactions between latent factors. Similarly, the head entity and tail entity acquire new attribute features through global attention, while relationship label words yield a relational semantic representation through a self-attention mechanism, as illustrated in Fig. 8. Finally, the matrix multiplication operation is performed. The two networks are referred to as Attrescal and Attdismult, respectively.

Fig. 8 Bilinear model.

Verification

The COVID-19 entity and the database are input into the knowledge graph embedding models described above, and the score of each drug is obtained using the evaluation functions of Attranse, Attrescal, and Attdismult. The top 100 drugs predicted by each model were cross-compared to yield the final results. Clinical data on COVID-19 trials were obtained from https://covid19-trials.com/. By comparing the top 100 drugs predicted by the model with those used in COVID-19 clinical treatments, we identify current clinical drugs in this list, which demonstrates the effectiveness of the drug repositioning method.
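The cross-comparison step can be sketched as a set intersection over the per-model top-k lists, followed by a lookup against the clinical list. Treating the cross-comparison as a strict intersection is an assumption, as are the function name `consensus_candidates` and the toy rankings, which are placeholders rather than results from this study.

```python
def consensus_candidates(rankings, clinical_drugs, k=100):
    """Keep drugs that appear in the top k of every model and flag clinical ones.

    rankings: {model_name: [drug names, best first]}.
    Returns {drug: 1 if the drug is in clinical use, else 0}.
    """
    top_sets = [set(ranked[:k]) for ranked in rankings.values()]
    shared = set.intersection(*top_sets)
    return {drug: int(drug in clinical_drugs) for drug in sorted(shared)}

# Toy rankings (placeholders, not predictions from the paper).
rankings = {
    "Attranse":   ["Ribavirin", "Dexamethasone", "DrugA", "DrugB"],
    "Attrescal":  ["Dexamethasone", "Ribavirin", "DrugB", "DrugC"],
    "Attdismult": ["Ribavirin", "DrugB", "Dexamethasone", "DrugD"],
}
clinical = {"Ribavirin", "Dexamethasone"}

top3 = consensus_candidates(rankings, clinical, k=3)
top4 = consensus_candidates(rankings, clinical, k=4)
```

Passing ranked lists rather than raw scores avoids having to assume whether a higher or lower score is better for each model.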

Additionally, this paper further employs enrichment analysis of drug gene characteristics in human cell lines, along with the transcriptomic and proteomic data induced by SARS-CoV, to more effectively validate the best candidate drugs.

First, three datasets of differential gene expression in human cell lines infected with HCoV were collected from the Gene Expression Omnibus database (https://www.ncbi.nlm.nih.gov/geo/). Specifically, two SARS-CoV transcriptome datasets were used: one from the peripheral blood of SARS-CoV-infected patients (GSE1739) and another from SARS-CoV-infected Calu-3 cells (GSE33267). A transcriptome dataset from Calu-3 cells infected with MERS-CoV (GSE122876) was also selected. Additionally, a proteome dataset specific to SARS-CoV-2 was collected from https://biochem2.com/index.php/22ibcii/pqc/130-frontpage-pqc#coronavirus. Genes and proteins with P-values less than 0.01 were defined as differentially expressed. Differential gene expression in cells treated with various drugs was retrieved from the Connectivity Map (CMap) database and used as the gene profile of each drug. The Enrichment Score (ES) for each CoV dataset is calculated as follows:

$$ES = \left\{ {\begin{array}{*{20}l} {ES_{{up}} - ES_{{down}} ,} \hfill & {\text{sgn} \left( {ES_{{up}} } \right) \ne \text{sgn} \left( {ES_{{down}} } \right)} \hfill \\ {0,} \hfill & {else} \hfill \\ \end{array} } \right.$$
(4)

ES_up and ES_down are the enrichment scores calculated for the up-regulated and down-regulated genes of the CoV gene signature dataset, respectively. They are derived from the quantities a_up/down and b_up/down, which are calculated as follows:

$$a = \mathop {\max }\limits_{{1 \le j \le s}} \left( {\frac{j}{s} - \frac{{v\left( j \right)}}{r}} \right)$$
(5)
$$b = \mathop {\max }\limits_{{1 \le j \le s}} \left( {\frac{{v\left( j \right)}}{r} - \frac{{j - 1}}{s}} \right)$$
(6)

where j = 1, 2, …, s indexes the genes in the HCoV dataset, sorted in ascending order of their position in the gene profile of the drug. The position of gene j is denoted by v(j), with 1 ≤ v(j) ≤ r, where r is the number of genes in the CMap database (12,849). If a_up/down > b_up/down, then ES_up/down = a_up/down; if a_up/down < b_up/down, then ES_up/down = −b_up/down. To quantify the significance of the ES, the calculation was repeated 100 times on randomly generated gene lists containing the same numbers of up-regulated and down-regulated genes as the CoV dataset. A drug is considered to have a significant enrichment effect if ES > 0 and P < 0.05.
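Formulas (4) to (6) and the randomization test can be sketched as follows. Several details are assumptions because the text does not fix them: the tie case a = b is resolved as −b, the P-value is taken as the fraction of random gene lists whose ES is at least the observed ES, and the function names and toy positions are illustrative.

```python
import random

def ks_score(positions, r):
    """ES_up or ES_down from formulas (5) and (6).

    positions: the values v(j) of the signature genes in the drug profile (1..r).
    """
    v = sorted(positions)  # ascending order, as required by the definition
    s = len(v)
    if s == 0:
        return 0.0
    a = max(j / s - v[j - 1] / r for j in range(1, s + 1))
    b = max(v[j - 1] / r - (j - 1) / s for j in range(1, s + 1))
    return a if a > b else -b  # tie handling (a == b) is an assumption

def sgn(x):
    return (x > 0) - (x < 0)

def enrichment_score(up_positions, down_positions, r):
    """Formula (4): ES_up - ES_down if the two have different signs, else 0."""
    es_up = ks_score(up_positions, r)
    es_down = ks_score(down_positions, r)
    return es_up - es_down if sgn(es_up) != sgn(es_down) else 0.0

def permutation_p_value(up_positions, down_positions, r, n_perm=100, seed=0):
    """Compare the observed ES with ES values of random gene lists of the same sizes."""
    rng = random.Random(seed)
    observed = enrichment_score(up_positions, down_positions, r)
    n_up, n_down = len(up_positions), len(down_positions)
    hits = 0
    for _ in range(n_perm):
        sample = rng.sample(range(1, r + 1), n_up + n_down)
        if enrichment_score(sample[:n_up], sample[n_up:], r) >= observed:
            hits += 1
    return observed, hits / n_perm

# Toy example with r = 100 genes (the text uses r = 12,849).
up = [1, 2, 3, 4, 5]
down = [96, 97, 98, 99, 100]
observed, p = permutation_p_value(up, down, r=100)
same_sign = enrichment_score([1, 2, 3], [4, 5, 6], r=100)
```

In the toy example the up-regulated genes sit at the start of the profile and the down-regulated genes at the end, so ES_up and ES_down have opposite signs and the ES is nonzero; when both gene sets cluster at the same end, formula (4) returns zero.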

Experimental results

Experimental environment

Because experimental results can be influenced by the experimental environment, its parameters are provided below. In this work, the PyTorch deep learning framework was used to train the model. The specific details of the experimental environment are presented in Table 1.

Table 1 Training environment parameters.

Dataset

DRKG is a comprehensive knowledge graph in biomedicine, encompassing six main data aspects: human genes, compounds, biological processes, drug side effects, diseases, and symptoms. DRKG extracts data from six large-scale open medical databases, including DrugBank, Hetionet, GNBR, String, IntAct, and DGIdb, as well as recent medical literature related to COVID-19, and standardizes this information. The DRKG knowledge graph contains 97,238 entities across 13 entity types and 5,874,261 triplet data across 107 relationship types. These 107 relationship types illustrate the interaction types between 17 entity type pairs, with multiple interaction types possible for the same entity pair, as shown in Fig. 9.

Fig. 9 DRKG knowledge graph structure.

The medical knowledge graph is the cornerstone of smart medicine40. However, existing knowledge graph construction technology in the medical field generally faces issues such as low efficiency, numerous restrictions, and poor scalability. Considering the characteristics of medical data, such as cross-language, strong professionalism, and complex structure, the ontology representation depicts knowledge as a network, where associated nodes (entities) are represented by a triple (entity 1, relationship, entity 2)41. The number of nodes in the knowledge graph affects the structural complexity of the network and the efficiency and difficulty of reasoning.

In this research, the construction of the knowledge graph is completed using the DRKG. DrugBank combines the structural and pharmacological data of drug molecules, including biotech drugs, with the protein sequences, structures, and modes of action of their targets. It also integrates information on the chemical structure, pharmacological effects, protein targets, physiological pathways, and drug interactions. Additionally, it links to the PDB and KEGG databases to analyze detailed drug information. Drugs with a molecular weight greater than 230 daltons that also exist in the GNBR are selected from DrugBank. For these drugs, the knowledge graph includes relationships related to drug interactions, side effects, ATC codes, mechanisms of action, pharmacodynamics, and toxicity. It also incorporates relationships between coronaviruses and genes discovered in knowledge graph experiments, involving entities such as ‘Disease::SARS-CoV2 E’, ‘Disease::SARS-CoV2 Spike’, ‘Disease::SARS-CoV2 nsp1’, and ‘Disease::SARS-CoV2 orf10’. Six types of coronaviruses (including SARS-CoV, MERS-CoV, HCoV-229E, and HCoV-NL63) are assembled as comprehensive coronavirus (CoV) nodes, and the links between genes and drugs are reconnected. The generated knowledge graph contains four types of entities: drugs, genes, diseases, and drug-related information, along with 39 relationships, 145,179 nodes, and 15,018,067 edges.

Experimental results

The model is deployed in a web environment and presents its final results as a table of the 100 top-ranked drugs. For validation, these 100 drugs were compared with drugs currently used clinically to treat COVID-19; drugs in the intersection are marked 1 and all others 0. All raw prediction scores fall within the range 0 to 1, with a lower score indicating a better prediction. To make the scores easier to read, we transformed each raw score s into 100 − 10s, which maps the range to 90–100 so that a higher score indicates a better predicted drug effect. Seven of the drugs predicted by the model were identical to clinical drugs for treating COVID-19, as shown in Table 2.
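The score transformation and the 0/1 labeling described above can be sketched as follows. This is an illustrative reconstruction under stated assumptions, not the deployed code: the function names, the drug names, and the raw scores are all hypothetical.

```python
def transform_score(raw: float) -> float:
    """Map a raw score in [0, 1] (lower is better) to 100 - 10 * raw,
    which lies in [90, 100] (higher is better)."""
    if not 0.0 <= raw <= 1.0:
        raise ValueError(f"raw score {raw} is outside [0, 1]")
    return 100.0 - 10.0 * raw


def top_k_with_labels(raw_scores, clinical_drugs, k=100):
    """Rank drugs by ascending raw score, keep the best k, and mark each
    drug 1 if it is a known clinical drug and 0 otherwise."""
    ranked = sorted(raw_scores.items(), key=lambda item: item[1])[:k]
    return [
        (drug, transform_score(raw), 1 if drug in clinical_drugs else 0)
        for drug, raw in ranked
    ]


# Hypothetical raw scores and clinical drug set, for illustration only.
raw_scores = {"drug_A": 0.25, "drug_B": 0.75, "drug_C": 0.5}
clinical_drugs = {"drug_A", "drug_B"}

table = top_k_with_labels(raw_scores, clinical_drugs, k=2)
for row in table:
    print(row)
```

Because the transformation is linear and decreasing, it preserves the ranking: the drug with the lowest raw score always receives the highest transformed score.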

Table 2 Drug prediction results.

Both the Diagnosis and Treatment Protocol for Novel Coronavirus Pneumonia (Trial sixth edition) and the Diagnosis and Treatment Protocol for Novel Coronavirus Pneumonia (Trial seventh edition) clearly indicate that ribavirin can be used for antiviral treatment. Results from the RECOVERY trial in the United Kingdom show that dexamethasone can reduce mortality in patients on invasive ventilators. Patients who received dexamethasone, hydrocortisone, or methylprednisolone experienced an estimated 20% reduction in their risk of death, according to a team at Imperial College London and ICNARC. Thai doctors have reported good clinical results using oseltamivir, an anti-influenza drug, together with lopinavir and ritonavir, which were initially used to treat AIDS. A trial of colchicine conducted by a medical team at Attikon Hospital in Athens, Greece, found it effective in treating COVID-19 patients, particularly those with severe symptoms. Professor Xia Jinglin from Zhongshan Hospital affiliated with Fudan University used thalidomide to treat severe COVID-19, and the results indicated its effectiveness. Given that the clinical manifestations of severe COVID-19 pneumonia resemble iron overload, deferoxamine may serve as a promising supportive treatment for COVID-19 pneumonia complications, and studies on its anti-COVID-19 use are ongoing42. In summary, these clinical reports support the reliability of the drug predictions, which can help identify drugs with potential therapeutic effects, support drug development, and reduce both the time and the costs involved.

Discussion

Drug repositioning research falls into two categories according to the methods used. One is drug knowledge discovery based on experimental data; the other is drug knowledge mining based on scientific data. The former relies on clinical trials and focuses on the interaction between drug molecules and cell receptors, uncovering potential drug effects by establishing clinical models43,44. The latter relies on computer technology: it takes the correlations among scientific data as its research object and discovers drug knowledge by building computational data models45,46. Compared with clinical experimental methods, AI methods applied to drug repositioning can leverage vast amounts of data and strong computing capabilities to analyze the relationship between drugs and diseases from multiple perspectives. This enhances the efficiency and success rate of drug screening, improves efficacy and safety predictions, shortens the research and development cycle, and reduces costs.

In this study, we introduced global attention mechanisms into three models (Attranse, Attrescal, and Attdismult) to incorporate attribute features into entities, and used a self-attention mechanism to provide semantic features for relation labels, yielding a drug repositioning model based on knowledge graph embedding. As a case study, the predicted drugs related to COVID-19 were analyzed and validated. A total of 8,104 drugs were scored, 32 of which are COVID-19 clinical drugs. These drugs have a molecular weight greater than 230 daltons and were selected from FDA-approved drugs in DrugBank. Among the three traditional models, TransE achieved the best results, with three clinical drugs ranked in the top 10: ribavirin (first), dexamethasone (fifth), and colchicine (ninth). With the model proposed in this study, seven drugs were ranked in the top 10, as shown in Table 2. The experimental results demonstrate a 133% improvement in prediction accuracy over traditional methods, which is consistent with the increase from three to seven top-10 hits ((7 − 3)/3 ≈ 1.33) and supports the effectiveness of this study. Based on the results and validation analysis, this research method provides a theoretical basis for drug repositioning, offers new ideas for traditional drug discovery, and supports decision-making for future clinical experiments and research.
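For orientation, the standard scoring functions of the three base models can be sketched in a few lines. This is a minimal illustration of plain TransE, DistMult, and RESCAL as commonly defined in the literature. It does not reproduce the attention-enhanced variants proposed here, and the choice of the L2 norm for TransE and the example vectors are assumptions made for illustration.

```python
import math


def transe_score(h, r, t):
    """TransE distance ||h + r - t||_2; a smaller value means a more plausible triple."""
    return math.sqrt(sum((hi + ri - ti) ** 2 for hi, ri, ti in zip(h, r, t)))


def distmult_score(h, r, t):
    """DistMult trilinear product sum_i h_i * r_i * t_i; larger means more plausible."""
    return sum(hi * ri * ti for hi, ri, ti in zip(h, r, t))


def rescal_score(h, m, t):
    """RESCAL bilinear form h^T M t with a full relation matrix M; larger means more plausible."""
    return sum(h[i] * m[i][j] * t[j] for i in range(len(h)) for j in range(len(t)))


# Hypothetical 2-dimensional embeddings, for illustration only.
head = [1.0, 2.0]
relation = [0.5, -1.0]
tail = [1.5, 1.0]

# A diagonal relation matrix makes RESCAL coincide with DistMult.
relation_matrix = [[0.5, 0.0], [0.0, -1.0]]

print(transe_score(head, relation, tail))
print(distmult_score(head, relation, tail))
print(rescal_score(head, relation_matrix, tail))
```

TransE is a translation model scored by distance, so lower is better, which matches the convention of the raw scores reported in the experimental results. DistMult and RESCAL are bilinear models scored by similarity, and DistMult is the special case of RESCAL in which the relation matrix is diagonal.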

The research has significant potential for expansion. So far, experimental verification has been conducted only on COVID-19 data. Future research can explore potential drugs for other diseases, such as Alzheimer’s disease and cancer, using the methods proposed in this paper or improved versions of them. This may require collaboration with clinical professionals to identify safer and more reliable drugs.

Conclusion

The traditional new drug development process often requires significant financial support, long research and development periods, and continuous advancements in research technology. However, most candidate drugs ultimately fail to reach the market due to safety issues or unsatisfactory efficacy. In contrast, the drug repositioning strategy explores potential new indications for existing drugs, offering a more cost-effective, time-efficient, and low-risk pathway for drug development. This study tackles the challenge of drug repositioning by incorporating attention mechanisms into translation and bilinear models such as TransE, Rescal, and DistMult, which improves the representation of entities and relationships in knowledge graphs and enables more accurate drug screening. Combining multiple attention-enhanced models also overcomes the limitations of a single model and enhances the robustness and accuracy of predictions. The experimental results, particularly the drug predictions related to COVID-19, validate the proposed method. Among the top 10 predicted drugs, seven align with those already in clinical trials. Compared to traditional methods, this model improves drug prediction accuracy by 133%, demonstrating its effectiveness and practicality. This result not only confirms the potential of our method in drug repositioning but also provides a foundation for future clinical trials.

The main contributions of this study are as follows:

  1. Introduced attention mechanisms into translation and bilinear models to enhance semantic representation.

  2. Integrated multiple models to improve prediction quality and the drug screening process.

  3. Validated candidate drugs against current clinical trial drugs, supporting the effectiveness of the proposed method.

  4. Developed a drug repositioning framework based on knowledge graph embedding, applicable to various diseases.

Although the current research focuses on COVID-19, there are several promising ways to expand it:

  1. Application to other diseases: The method proposed in this study can be extended to explore potential drugs for treating diseases such as Alzheimer’s disease, cancer, and viral infections. Future research should adapt these models to address the unique challenges posed by different diseases.

  2. Model enhancement: Future work could explore improvements to attention mechanisms, such as more complex variants of self-attention or combining them with other neural network architectures. These improvements could lead to more accurate predictions and better handling of complex drug-disease relationships.

  3. Clinical validation collaboration: Future research should prioritize collaboration with clinical professionals to assess the safety and efficacy of drugs. These partnerships are essential for translating model predictions into real-world applications and clinical trials.

  4. Expanding data sources: To further validate and improve the model’s reliability, future work could incorporate a wider range of data sources, including patient-specific data and other drug-related information. This will enhance the robustness and scalability of the method.

In summary, this study presents a promising approach for drug repositioning, emphasizing the use of attention mechanisms in knowledge graph embedding. This research has significant potential to contribute to personalized medicine and drug discovery by extending the method to other diseases and collaborating with clinical experts.