Introduction

Nipah virus (NiV) is a Pteropus bat-borne pathogen belonging to the family Paramyxoviridae and genus Henipavirus1. It can cause asymptomatic infections to acute respiratory infection, fatal encephalitis and even myocarditis2,3. It was first reported between 1998 and 1999 in Sungai Nipah, a village in Malaysia, where humans contracted NiV from pigs, the intermediate hosts of the virus4. Since 2001, Bangladesh has experienced recurring seasonal outbreaks of NiV infections, primarily linked to the consumption of raw date palm sap contaminated by bat saliva and excreta5. In India, the virus was first detected during an outbreak in Siliguri, West Bengal, in 2001, followed by another in Nadia in 2007, largely spread through nosocomial and person-to-person transmission6. In May 2018, a third outbreak occurred in two different districts of Kerala, India7. The transmission was observed to be fatal with a mortality rate of 91%. In the following years, Kerala witnessed occasional cases of NiV infections in 2019, 2021, and 20238. In 2024, India reported two fatal cases of Nipah virus in the state of Kerala. Recognizing its pandemic potential, the World Health Organization (WHO) has designated NiV a priority pathogen for research and development2. Deforestation, urbanization, and climate change significantly impact NiV transmission by altering bat habitats and increasing human-wildlife interactions. Habitat loss and urban expansion force bats into closer proximity to humans, facilitating spillover events, while climate-induced changes in bat migration and food availability enhance virus adaptability and transmission risk. The Nipah virus has been known to exist in bats for centuries but has undergone evolutionary changes in the past few decades to affect humans. These interconnected factors create conditions for NiV evolution and amplify the potential for outbreaks in human populations9.

The viral pathophysiology and biological functions surrounding infection are still unclear. Viral ribonucleocapsid surrounded by an envelope consists of the 18.2 kb negative-sense single-stranded ribonucleic acid (ssRNA) genome of the Nipah virus that encodes nine proteins out of which six are structural proteins including nucleoprotein, phosphoprotein, a matrix protein, viral fusion and glycoproteins (F and G), large polymerase and three are nonstructural proteins [interferon antagonists W and V, C protein]1,3,4,10. The G protein facilitates viral attachment by binding to host ephrin B2 or B3 receptors, while the F protein assists in the fusion of cellular and viral membranes, which disassembles the capsid, thus delivering the viral genome into the cell. Once the viral genome reaches the cytoplasm, the synthesis and the translation of viral messenger and genomic ribonucleic acid (RNAs) and subsequent production of viral proteins will occur11.

Viral RNA replication involves a two-step process, initiating with the synthesis of the complete sequence of the positive-sense copy of the genome, referred to as the antigenomic RNA, which subsequently serves as a template for the production of genomic RNA. Successful infection requires the ability to evade the host’s key innate immune response such as the alpha/beta interferon (IFN-α/β) response. The Nipah virus employs four proteins encoded by the P gene—P, V, W, and C—to counteract the host immune response. The P, V, and W proteins bind to STAT1 thereby, disrupting the Jak-STAT signaling pathway by inhibiting its phosphorylation in response to IFN, thus impairing the antiviral signaling cascade11. The NiV phosphoprotein (P) protein plays an important role in genome replication by attaching the viral polymerase to the nucleocapsid12. Its multimerization domain forms an extended, parallel, tetrameric coil characterized by an N-terminal cap and a hydrophobic core12. All phosphoproteins possess distinct structural modules that independently bind to the L polymerase, unassembled nucleoproteins, and the nucleocapsid. Thus, the phosphoprotein promotes the initiation of RNA synthesis, confers processivity to the Nipah virus polymerase (NiV-L protein), stabilizes the unassembled nucleoprotein in a soluble, RNA-free state and efficiently delivers for encapsulation of newly synthesized RNA13. The Nipah Viral RNA polymerase complex composed of the large protein and tetrameric phosphoprotein complex (NiV L-P) complex thus plays an important role in catalyzing the viral genome replication14. There are three catalytic domains (RNA-dependent RNA polymerase (RdRp), polyribonucleotidyl transferase (PRNTase), methyltransferase (MTase)), two structural domains (connecting domain (CD) and C-terminal domain (CTD)) are present in NiV L protein15. The conserved RdRp domain containing amino acid residues from 369 to 982 is organized into three subdomains forming a “fingers-palm-thumb” right-hand fold and motifs14. Motif C of the RdRp domain contains the active site residues GDNE (residues 831–834), which is responsible for catalytic activity of the enzyme16. The NiV-L protein is crucial due to the absence of RdRp in human cells, making RdRp-targeting therapies a promising strategy for antiviral treatment17. Inhibiting RdRps in different viral infections such SARS CoV, dengue, HCV (hepatitis), and influenza, is considered a critical strategy to control the viral multiplication18.

No approved drugs or vaccines are currently available to treat NiV infection. In India, as per the report issued by the Health and Family Welfare Department of the Govt. of Kerala dated 5 Sept 2021, the currently available treatment options are Ribavirin, Monoclonal antibody m102.4, Remdesivir, and Favipiravir19. Currently, monoclonal antibodies 1 F5 and m102.4, along with the small molecule Remdesivir (either alone or in combination), are the only treatments with sufficient evidence to be considered for treatment of prophylaxis and early stage of Nipah virus disease20. A previous study by V. S. Patil et al. (2024) identified Remdesivir-TP, Favipiravir-TP, and Ribavirin-TP as potential NiV RdRp inhibitors, where he concluded that Remdesivir and its derivatives were the most potent antiviral drug against Nipah infection21. To address the urgent need for effective interventions during outbreaks, drug repurposing provides a rapid and cost-effective strategy by reformulating existing drugs with established safety profiles. However, evaluating bioactive compounds for their potential as active drugs based on molecular properties remains a significant challenge due to the vast chemical dimension22. The latest deep learning models offer revolutionary solutions for designing novel molecules with required properties23. Techniques such as generative adversarial networks (GANs)24,25, geometric reinforcement-based convolutional neural networks (CNNs)26, variational autoencoders (VAEs), and graph neural networks27have shown remarkable success in designing pharmacophore systems, enabling advanced drug repurposing strategies. Furthermore, integrating multi-parametric systems into the latent dimensions of pharmacophore-based deep learning approaches—such as bioactive molecule generation (PGMG)28, PharmRL26, and PharmacoNet29 provides enhanced generalization capabilities for modeling chemical properties. These advancements hold great promise for accelerating the discovery of effective therapeutics.

Targeting the viral proteins or pathways involved in its lifecycle could enable rapid identification of drugs that can be potentially repurposed to effectively combat the viral infection30. Our study employs an attention-based deep neural network, enabling each protein residue to directly interact with all ligand features. This approach overcomes the limitations of 3D spatial relationships, which are less effective for sparse atomic data in methods like GNNs, CNNs, and other machine learning techniques towards the sparsification of graphical neural networks31,32. The DNN is designed solely based on ligand properties and the chemical characteristics of the Nipah viral protein’s active sites for drug filtering, serving as a standard in-silico approach to assess interaction and complex stability. We further explored the docking affinity and stability of the top scoring drugs with Nipah viral protein.

Methodology

Protein preparation

The structure of the NiV L-P was acquired from the RCSB with a resolution of 2.92Å (PDB ID: 9 CGI)14. The structure of the protein was preprocessed using Schrodinger Maestro Protein Preparation Wizard33,34. This process involved the removal of metal ions, hetero groups, and water molecules located beyond 5 Å, as well as the assignment of disulfide bonds, bond orders, and formal charges. Hydrogen atoms were then added to the structure. Energy minimization was carried out using the OPLS4 force field during preprocessing35. To refine the hydrogen atom placement, the Impref utility was used to optimize the positioning of the heavy atoms, while the “H-bond assignment” tool was employed to adjust the hydrogen-bonding network.

Deep learning method for pharmacophore-based ligand screening

To screen the successful ligands from the DrugBank dataset (2059 FDA-approved drugs), we employed attention based deep neural network architecture on pharmacophore-based molecular features from ligand SDF files using RDKit22. To handle flexible input formats, and directly model protein-ligand interactions we employed attention layers that align well with the 3D complex and interactive nature of protein ligand systems. We extracted the ligand properties from 3D SDF files by defining a function that includes rotatable bonds, hydrogen bond donors and acceptors, molecular weight, LogP, and estimates of acidic, basic, and hydrophobic properties for each ligand These pharmacophore descriptors are structured with their corresponding ligands for further analysis.

In order to access the binding affinity, the “GDNE” protein cavity features are derived from three-dimensional protein structures in PDB format (PDB ID: 9 CGI). The PDBParser module from BioPython is recruited to parse structural data, identifying relevant chains and residues containing the specified cavity sequence. For each residue in the cavity sequence, atomic-level properties are computed, including atom type, hydrogen bond donor/acceptor potential, and hydrophobicity. Additionally, atomic centroid coordinates are calculated to capture spatial information, enabling the alignment of cavity features with ligand pharmacophore properties.

Binding affinity calculation is computed using a weighted dot product between ligand and cavity features, ensuring dimensional consistency through truncation or padding as necessary. Affinity scores are normalized, and binary labels are generated based on an affinity threshold, classifying interactions as successful or unsuccessful.

Initializing an effective learning process from the optimal subset of the ligand to protein cavity generation designed to capture complex in-depth attributes by framing an attention based neural network is designed to predict ligand binding affinity. The model comprises three main components: ligand and protein embedding layers, a multi-head attention mechanism and a full connected output layer. The embeddings transform input features into a latent space, while the attention mechanism captures the contextual interactions between ligands and protein cavity features. The model is trained based on the binary cross-entropy loss function and the Adam optimizer. Training is conducted in batches to optimize calculations and improve generalization. The model’s performance is monitored over multiple epochs, with the loss function guiding optimization.

Each ligand is represented as a feature vector:

figure a

where fi represents the i-th feature of the ligand.

Cavity features

The cavity of the protein is represented as a feature matrix:

figure b

Where ci, j represents the j-th feature of the i-th residue in the cavity, and p is the number of residues in the cavity.

Affinity score calculation

The interaction between the ligand and the protein cavity is computed as the weighted dot product.

figure c

where:

W Є Rn×m: Weight matrix aligning ligand features with cavity features.

b: Bias term

S: Binding affinity score

If ligand and cavity dimensions differ, padding or truncation ensures dimensional consistency:

figure d

Deep neural network model architecture

The model input consists of ligand and protein feature vectors with ligand and protein dimensions, respectively. These inputs are passed through embedding layers, where the ligand features are transformed using linear ligand dimension, hidden layer, and the protein features using Linear protein dimension with hidden layer. The embedded representations are then processed by a multi-head attention mechanism with attention heads, enabling the model to capture intricate ligand-protein interactions effectively. The attention layer operates with a hidden dimension and processes data in a batch-first manner to ensure computational efficiency.

Embedding layers

The ligand vector and cavity matrix are embedded into latent spaces:

figure e
figure f

where EL Є Rd×n and EC Є Rd×m are embedding matrices.

Attention mechanism

Multihead attention computes the contextual interaction:

figure g

where:

Query3: Lembed.

Key3: Cembed.

Value3: Cembed.

The final score is computed as:

figure h

where:

A: Output of the attention mechanism.

W0: Weight matrix for the output layer.

b0 : Bias term.

σ : Sigmoid activation function.

The binary cross-entropy loss for training is defined as:

figure i

Following the attention mechanism, the output undergoes processing through a feedforward neural network designed to enhance feature representation and support robust prediction capabilities.

Training and validation

The feedforward network is designed to process ligand-protein interactions by integrating ligand pharmacophore features and RdRp protein cavity residues. The training process ensures that both molecular level features and protein ligand interactions are captured optimally. The process initiates with a linear transformation, reducing the dimensionality from 128 to 64. Since the input data consists of ligand pharmacophore features (such as hydrogen bond donors, acceptors, hydrophobic system, etc.) and RdRp cavity residues, it helps to retain essential interaction-specific information while making learning more computationally efficient. Hidden layer simplifies the data for model to learn by focusing on the highlighted part (Attention mechanism), thus the model will focus on what matters most and skip the extra, unnecessary information. Subsequent to the linear transformation, batch normalization is applied for mitigating internal covariate shifts, thus accelerating the convergence during training. This is particularly useful when handling structural variations in RdRp binding sites, as it prevents drastic changes in weight updates. To learn the non-linear binding interactions of ligand feature scores with the RdRp active site, a ReLU (Rectified Linear Unit) activation function. In order to prevent overfitting and enhance models’ generalization capabilities, a dropout layer is incorporated with a dropout rate of 0.1. This regularization technique randomly deactivates a fraction of neurons during each training iteration, thereby reducing reliance on specific neurons and promotes model robustness. The final layer of the network involves an output layer that maps the reduced feature output dimension to generate final binding affinity from the learned representations. The model is trained using Adam optimizer, known for its adaptive learning rate and efficient handling of sparse gradients. We trained the entire frames in 50 epochs.

To ensure optimal performance, we conducted hyperparameter tuning using Bayesian optimization adjusting key parameters such as the number of attention heads, hidden dimension size, and dropout rate. The combination of an attention mechanism and a structured feedforward network allows the model to effectively learn and generalize ligand-protein interactions. The model’s performance was validated through internal cross validation, leveraging feature parameter analysis and interpretability of learned molecular descriptors to assess its predictive reliability. Our team developed a similar model and was recently published36.

Ligand Preparation

3D structures of FDA-approved drugs were retrieved from DrugBank. The molecular structures of drugs from the DrugBank were optimized close to the physiological pH (7 ± 1) through the generation of different possible tautomer’s and stereoisomeric conformations, as well as assigning the correct protonation states. These optimized structures were energy-minimized using the OPLS4 force field, with all steps executed through the LigPrep module of the Schrödinger Suite. We used Remdesivir as a control for this study.

Virtual screening

Virtual screening workflow from the grid-based ligand docking using the GLIDE module was used for the drug screening37,38. The top hits obtained from the phase screen were further analyzed using the virtual screening workflow in the Glide module. The centroid of the receptor grid for the protein structure was defined based on the active site of the RdRp complex protein. The grid was defined within a cubic box of 20 × 20 × 20 Å along the X (175.72), Y (165.4), and Z (173.38) coordinates covering the ligand-binding area. Virtual screening was then performed using the phase screen output and the constructed receptor grid. Glide score was considered for ranking the ligands with NiV L-P complex, with the score for Remdesivir used as a control. The virtual screening workflow began with an initial high-throughput virtual screening (HTVS) using the lowest docking precision, followed by standard precision docking, and finally extra precision (XP) docking24,39. After each docking step, the ligand library was filtered, and the top 10% of compounds with the highest docking scores were selected for further analysis.

Estimation of binding free energy of top-performing ligands

The top hits from the XP docking were further analyzed to estimate their binding free energies using the MM-GBSA method in the Prime module of the Schrodinger Suite40. As a control, the binding free energy of the remdesivir-NiV RdRp docked complex was also calculated. The resulting value served as a reference for comparing the binding free energy values of the candidate drugs41.

The Pharmacokinetic analysis of the lead drug

The pharmacokinetic or ADMET properties (absorption, distribution, metabolism, excretion, and toxicity) were assessed using the ADMETlab 3.0 software (https://admetlab3.scbdd.com)42. This software explores the topological and predicted information of the model to estimate pharmacokinetic properties using smile strings of the compounds.

Induced-fit molecular docking (IFD)

Based on the docking scores and binding free energy, the top-ranking FDA-approved drug was selected for Induced Fit Docking (IFD) on the rigid protein43,44. Van der Waals scaling was set to 0.50 for both the receptor and ligand to generate up to 20 poses of the best-scoring ligand-bound NiV L-P protein complex. IFD aims to ensure that the ligand interacts effectively with the RdRp structure, yielding the lowest energy values. The active site residues were used as the centroid of the grid for conducting IFD to evaluate the binding affinity with the RdRp complex. The best-scoring pose was then subjected to further analysis.

Molecular dynamic simulation (MDS)

The selected protein complex with the ligand was subjected to 200 ns MDS using the Desmond module to explore the stability and affinity of the ligand bound complex45,46. The prepared structures were placed into an orthorhombic box and solvated using the TIP3P water molecules with OPLS4 force field. The box sizes for each simulation were adjusted to ensure that the protein and ligand were fully enclosed. Adequate amounts of Na+/Cl- ions were added for neutralizing the system. Temperature and pressure were optimized to 300 K and 1 atmosphere, respectively, based on the isotropic scaling and the Nose-Hoover temperature coupling47. Intermediate structures were saved every 100 ps for the calculation of the following parameters: (i) root mean square deviations (RMSD) of the protein and ligands, (ii) root mean square fluctuations (RMSF) of the protein and ligands, (iii) ligand interactions, (iv) radius of gyration (RoG), (v) solvent accessible surface area (SASA), and (vi) post MDS MM-GBSA using the thermal.py script.

Principal component analysis

The stability of the protein depends on the atomic movement. Principal Component Analysis (PCA) is used in evaluating the change of the protein-ligand complex during the simulation48. PCA focuses on identifying the most contributing factors of a system, i.e., the principal components. These principal components are the c-α atoms of the protein. The properties of the protein-ligand complex such as ligand-induced conformations, modes of ligand binding, and binding pocket were explored to understand its structural and dynamic properties, thus gaining insights into the underlying biochemistry of the complex. The trj_essential_dynamics.py script was used for the MDS trajectory-based PCA analysis.

Free energy landscape

The free energy landscape22,49,50solely relies on the probabilistic occurrences of specific data point combinations that are then converted to free energy by a simple relationship51. Principal components 1 and 2 of the MDS were considered to analyse the conformation dynamics of the unbound and ligand-bound complexes of NiV L-P proteins. Temperature was set at 300 K for calculations, with the Boltzmann constant considered for converting the probability densities into free energy units (kJ/mol). The probability density was calculated using a 2D histogram; the density of the bin’s represents the likelihood of principal components 1 and 2 of the proteins. The density was converted to a free energy landscape using the Free Energy Calculation Formula:

figure j

where:

ΔG(x, y) is the free energy at point (x, y) in the principal component space.

kB is the Boltzmann constant (often set to 1 in these simplified models as we’re dealing with relative free energy).

T is the temperature in Kelvin.

p(x, y) is the normalized probability density from the histogram.

For avoiding any computational issues with zero probabilities, we added (1e-10) to each density before taking the logarithm. A Gaussian filter with Σ = 1.0 was used to smooth the data and reduce the noise for a coherent representation of the energy basins and energy barriers.

Fig. 1
figure 1

The graphical representation of our entire study. (Created in BioRender. Raju, R. (2025) https://BioRender.com/jgtqkfu).

Results

Structure-based drug discovery is a well-known bioinformatic concept in which the 3D structure of the target and potential drugs is studied to determine binding affinity and interactions. The Cryo-EM structure of the NiV-L protein was selected for our study from Ge Yang et al. (PDB ID: 9 CGI)14. The residues corresponding to the RdRp domain of the NiV-L protein have conserved right-hand fold and motifs52. The active site is present in Motif C residues GDNE (831–834), where D832 plays a crucial role in the transcription activity of NiV-L protein14. Motif F (535–557) in RdRp finger subdomain consists of a β-sheet region that contributes to the formation of the NTP entrance channels thereby stabilizing the assembly of the L-P complex. Motif A (713–730) binds to divalent cations vital for the catalytic activity of L protein while Motif E acts as the connection to the thumb subdomain, thus contributing to the structural integrity and the selection of NTPs along with B, A and D14,15. All these motifs in L protein of Nipah play an important role in the transcription and replication of the viral genome. In addition, a recent study by Hu, Side et al. (2025), reported that the active site residues 831–833 (GDN) along with the residues R551, located near the motif C residue D832 is universally conserved throughout the Paramyxoviridae family53. Our study evaluates the binding and potential inhibition of NiV L protein by repurposing FDA-approved drugs. Graphical representation of our entire study and results is provided in Fig. 1.

Pharmacophore model generation and screening

The results of our study showcases the application of a deep learning neural network framework, designed to prioritize ligands for drug repurposing based on ligand-target interactions and the chemical properties of the protein cavity. The model effectively evaluated a set of 2059 FDA-approved drugs, in which the results were ranked based on their predicted binding affinity and relevance to the target site. We selected 500 FDA-approved drugs with the highest predicted binding score for further analysis (Supplementary Table 1). The attention network captured critical molecular features, enabling a robust assessment of ligand-target compatibility. The top-ranked ligands, such as Pentetic acid, Cangrelor, and Miltefosine, demonstrated strong binding scores, highlighting their potential as candidates for therapeutic intervention. These results underscore the power of deep learning in navigating complex chemical spaces, facilitating the identification of bioactive compounds with high precision and efficiency. Our deep learning model was validated through feature parameter analysis and performance evaluation using molecular descriptors and pharmacophore characteristics of known compounds. While we did not use a separate test dataset, we assessed the model’s predictive reliability through internal cross-validation and interpretability of learned features. As a quantitative measure, we report an AUC-ROC score of 0.998 and a PR AUC of 0.998, demonstrating the model’s ability to distinguish active from inactive compounds with high accuracy (Fig. 2A and B). This approach underscores the importance of our attention-based model in enhancing feature learning and interpretability.

Fig. 2
figure 2

The performance plot for models AUC-ROC and Precision AUC curve of the model’s ability to distinguish active from inactive compounds with high accuracy. (A) Plots the True Positive Rate (TPR) on the y-axis against the False Positive Rates (FPR) on x-axis with the area under the ROC (AUC) curve is 0.998. (B) Precision-Recall curve plotting the precision on the y-axis and Recall on the x-axis with the area under the Precision-Recall Curve is 0.998 indicating the model’s high precision and high recall.

Molecular docking-based virtual screening

We employed a virtual screening workflow for the attention-based deep learning screened 500 drugs to identify the potential lead compounds. The extra precision (XP) docking was used to select the lead compound for further analysis. Previous docking studies on the polymerase of NiV and other viruses highlighted Remdesivir as the most potent antiviral candidate which can inhibit replication through chain termination21,54,55. Based on these studies, we considered Remdesivir as control. Our results presented that the control Remdesivir had a docking score of −6.148 kcal/mol against the RdRp complex. We used the docking score for ranking the drugs from FDA-approved drug library. Notably, the top two lead compounds exhibited superior docking scores of −9.942 (Cangrelor), and − 8.223 (Cocarboxylase), outperforming control Remdesivir (−6.148 kcal/mol). Glide emodel provides a different kcal/mol score calculated based on the force field and energy of particular conformation. Cangrelor exhibited the best glide emodel score with − 108.4 kcal/mol compared to the other complexes. Based on these results, the top three lead compounds (Table 1) were selected for further analysis.

Table 1 Extra-precision (XP) Docking and MMGBSA result of the control and the top three ligands.

Molecular interactions of the docked complex

Docking also demonstrates molecular interaction with substantial hydrophobic interactions, hydrogen bonds, and pi-pi stacking interactions with key residues of the NiV L active site. The control Remdesivir only showed 5 hydrogen bonds with 3 residues, namely, ASN833, GLN830, and GLU291, with a docking score of −6.148 kcal/mol, while Cangrelor had better binding affinity with a docking score of −9.942kcal/mol. This interaction is stabilized by 8 hydrogen bonds formed with the RdRp residues GLU291, THR546, ARG551, LYS724, ASP832, ASN833, ILE890, SER892, and LYS893. The docking results showed interaction between the active site residues ASP832 and ASN833, which play a vital role in transcription and RNA synthesis56. The residues of the active site (831GDNE834) are located at β-hairpin in motif C, supported by the N-terminal domain (NTD) and finger subdomains of RdRp, which together form the tunnel guiding template RNA to the catalytic site14. GLU291, a key residue in the NTD, stabilizes the L-P complex and forms the template and NTP entry channels14. THR546 and ARG551 are also involved in the formation of NTP and template entrance channel and a major role of assembly of the L-P complex. ARG551 in particular contributes to the stability and flexibility of the RdRp’s structure, thereby enabling effective accommodation of template RNA and NTPs15,16. Similarly, binding of Cangrelor to ILE890, SER892, LYS893 residues which belong to motif E will also affect the structural integrity and the NTPs selection process14,15. LYS724 is part of motif A (713–730) of the palm subdomain. Conserved across all NNS RNA viruses, aspartate residues (Asp722 in NiV) are involved in binding divalent cations and are essential for the catalytic activity14,15,57. In the RdRp-Cangrelor complex, Cangrelor interacts with key residues such as E291, T546, R551, I890, S892, K893, and K724, which play pivotal roles in the function of RdRp. These interactions contribute to restricting access to the binding domain and are also involved in the formation of the NTP entrance channels and assembly of the L-P complex. The 2-D ligand interaction diagram is provided in Fig. 3.

Fig. 3
figure 3

2D ligand interaction diagram of the XP docking:(A) Remdesivir, (B) Cangrelor, (C) Cocarboxylase, (D) Allupurinol.

Estimation of binding free energy (MMGBSA)

The three top-scoring drugs from XP docking were subjected to MM-GBSA along with the control, Remdesivir, to estimate the binding free energy58. Remdesivir shows a binding energy of −30.37 kcal/mol, whereas Cangrelor, Allopurinol, and Cocarboxylase showed − 78.25, −29.61, and − 71.18 kcal/mol binding free energies, respectively. These analyses suggested that Cangrelor, an antiplatelet drug, had better binding free energy compared to others.

Pharmacokinetic analysis

ADMET analysis of the compound was conducted using ADMETlab 3.042. This analysis revealed mixed pharmacokinetic properties for Cangrelor, Cocarboxylase, Allopurinol, and Remdesivir. Cangrelor (HIA = 1.0) and Cocarboxylase (0.999) exhibited best absorption, while Remdesivir (0.0) and Allopurinol (0.01) had poor uptake, with all the four drugs requiring IV administration due to low bioavailability. Cangrelor had high plasma protein binding (97.94%) compared to Remdesivir (74.30%), Cocarboxylase (43.35%), and Allopurinol (2.52%). Although Cocarboxylase and Allopurinol show minimal CYP450 metabolism, their faster clearance can hinder their antiviral efficacy. Unlike Remdesivir, Cangrelor had minimal CYP450, CYP2B6, and CYP2 C8 with reduced likelihood in drug-drug interaction. In addition, Cangrelor does not interfere with major metabolic pathways, making it a suitable candidate for combination therapy as well. Unlike Remdesivir and Allopurinol, which are predicted to have both metabolic and toxicity effects, Cocarboxylase and Cangrelor lead the group with a balance between efficacy and safety. Whilst, both Cangrelor and Cocarboxylase are the most balanced candidates, the very short half-life of Cocarboxylase would necessitate frequent dosing, thus reducing the practical feasibility. Thus, Cangrelor is the most suitable for repurposing as an antiviral drug, with formulation enhancement (e.g., nanoparticle-based delivery and PEGylation) would further improve its therapeutic potential. Table 2 provides the detailed result of ADMET analysis of Remdesivir and Cangrelor.

Table 2 The Pharmacokinetic properties of Cangrelor and remidesivir based on the ADME analysis.

Induced fit

Induced fit docking was carried out between NiV L protein and Cangrelor to get an in-depth evaluation of ligand binding by exploring potential conformational changes in the ligand and target protein complex59. This approach provides insights into the variations in ligand poses and their interactions with the amino acid residues of the target protein, offering a comprehensive understanding of binding dynamics60. A maximum of 20 possible poses were generated for Cangrelor of which top five poses with the 2D interaction diagram is presented in the Table 3. The best IFD docking score for Cangrelor was − 12.30 kcal/mol which presented a pose with 11 H-bonds with 6 different residues: LYS542, ARG551, GLU799, GLN830, ASP832, LYS893. Most of the residues form a similar bond with Cangrelor as obtained while XP docking. LYS542, located on motif F, contributes to the template entrance as well as assembly of the L-P complex. By forming a hydrogen bond with this specific residue, Cangrelor blocks the entry of NTP as well as the formation of templates, which may contribute to the inhibition of RdRp, thereby establishing a negative regulation of NiV L protein14,15.

Table 3 The top 5 IFD poses of NiV L protein in complex with Cangrelor.

Molecular dynamics simulations

To validate the behavior, unbound NiV L-P protein, NiV L-P in complex with Remdesivir, and Cangrelor complexes were subjected to MDS for a period of 200 nanoseconds13. This approach can explain the importance of the structure-function relationship of the macromolecules in the protein complex for a limited time. Figure 4 provides the complete results of the MDS simulation.

Dynamic stability of the unbound NiV L-P and bound NiV L-P complexes using root mean square deviation analysis

The root mean square deviation (RMSD) of a ligand-protein complex measures the average displacement of a selected residue over a specific time point61,62. The average RMSD for the unbound NiV L-P was 3.460 (Å) (Fig. 4A), NiV L-P in complex with Remdesivir was 3.7642 (Å) (Fig. 4B), and NiV L-P in complex with Cangrelor was 3.547 (Å) (Fig. 4C). The blue line indicating RMSD values of protein remains comparatively stable throughout the 200 ns simulation time. As per the average c-alpha, RMSD of Cangrelor shows better affinity with RdRp, indicating good stability as the structure of the protein maintains its initial conformation without significant deviations which is similar to the Remdesivir-bound complex also showing minimal structural deviation.

Fig. 4
figure 4

The MDS results of the unbound NiV L protein, NiV L in complex with Remdesivir, and NiV L in complex with Cangrelor.(A) RMSD of the unbound NiV L protein. (B) RMSD of the NiV L in complex with Remdesivir. (C) RMSD of the NiV L in complex with Cangrelor. (D) RMSF of the unbound NiV L protein. (E) RMSF of the NiV L in complex with Remdesivir. (F) RMSF of the NiV L in complex with Cangrelor. (G) The contact analysis of the NiV L in complex with Remdesivir. (H) The contact analysis of the NiV L in complex with Cangrelor.

Assessing the flexibility of unbound NiV L-P and bound NiV L-P complexes using root mean square fluctuations analysis

The root mean square fluctuations (RMSF) evaluates the stability and alterations in a protein-ligand complex by analyzing local variations in amino acid residues, where lower RMSF values indicate greater stability and higher values suggest reduced stability30. A similar drug repurposing study on RdRp of SARS-CoV-2 identified Remdesivir as a potential drug candidate where the active sites experienced higher fluctuations as they adjusted to bind63. There are minor fluctuations observed in the peak comprising the active site residues, i.e., 721–1055 AA, with the average RMSF values of the different complexes as 1.442 Å for the unbound NiV L-P protein, 1.346 Å for the Remdesivir-bound NiV L complex, and 1.803 Å for the Cangrelor-NiV L complex. The fewer fluctuations in the regions of 830–834 AA suggests the strong binding of Remdesivir and Cangrelor. Both the Remdesivir and Cangrelor-bound NiV L protein complex exhibited more interactions for the residues from 500 to 900 AA during simulation. Specifically, the 831–834 residues in both Remdesivir and Cangrelor-bound NiV L protein complexes showed subsequent interactions to the ligand. The RMSF analysis during the simulation is summarized in Table 4. The drugs exhibit reduced fluctuation in the active site region, indicating their strong affinity during the simulation. Both the control and Cangrelor complexes show significant binding with limited fluctuations when compared to the unbound NiV L-P protein. Higher fluctuations were observed in the loop regions of the NiV L-P complex. Figure 4D, E and F illustrates the RMSF of unbound and drug-bound NiV L-P complex.

Table 4 Average RMSF of bound and unbound NiV L protein complexes.

Protein-ligand contact analysis

The protein-ligand contact analysis for Remdesivir and Cangrelor binding to the L-P complex of the Nipah virus revealed distinct interaction profiles. During the 200 ns MDS, Remdesivir formed interactions primarily through hydrogen bonds with the residues GLN 830, ASP 832 and LYS 893 for 59%, 56%, 41% of the simulation time respectively. Similarly, we observe hydrophobic contacts with residues ILE 890 and LYS 893 (Fig. 4G), with fewer contributions from electrostatic and water-mediated interactions. In contrast, Cangrelor demonstrated a broader range of interactions, with much stronger hydrogen bonding with residues GLU 291, LYS 724 and ASN 833, in which ASN 833 has dual hydrogen bonding with an average of 90% of the simulation time. ASP 832, ASP 722, GLU 881, LYS 893 were having significant water molecules based interactions. These water based interactions that are found at the interface of the protein interacting with the ligand often mediates the hydrogen bonding (H-bonding) interaction networks between proteins and ligands, thus making a promising contribution to the free energy compared to the void in the same positions64. Additionally, significant hydrophobic interactions (LYS 893), alongside notable electrostatic, non-covalent, and water-mediated interactions (Fig. 4H) during the 200 ns MDS suggest that Cangrelor may form more diverse and potentially stronger binding interactions with the L protein.

Post-MDS MMGBSA

The estimation of the binding free energy between ligand-bound NiV L complexes provides insights regarding the stability and affinity of the interaction. The NiV L-Remdesivir complex had − 190.79177 Kcal/mol, and the Cangrelor complex scored − 181.83591 Kcal/mol. Although the total binding free energy is higher for Remdesivir, Cangrelor exhibits stronger hydrogen bonding (−9.344872721 Kcal/mol). In addition, Cangrelor also had better solvation energy of −39.06939702 Kcal/mol compared to Remdesivir’s −14.62096985Kcal/mol, suggesting Cangrelor has better interaction with the aqueous environment around protein65. Similarly, the slightly higher lipophilic interactions further support that Cangrelor (−39.44180024 Kcal/mol) can effectively bind within the protein’s hydrophobic region than Remdesivir (−39.43035993 Kcal/mol). Remdesivir has favorable Van der Waals interaction with the NiV L protein, contributing more negatively to the total binding free energy compared to the Cangrelor, suggesting that stronger binding free energy is due to non-covalent interactions. Thus, the estimation of the total binding free energy prompts a further analysis into accessing these effects on exposure to the solvent through Solvent Accessible Surface Area (SASA). Binding energy of the MD trajectory is calculated using selected frames after the MD run. The structures were extracted for each 20 ns frames of 200 ns MD run (total 10 structures). These structures were exported and binding energy was predicted for respective complexes. Table 5 provides the detailed results of the post MDS-MMGBSA results. Remdesivir (control) showed highest average binding energy during the MD run (−190.79 kcal/mol) and the energy value was higher to the binding energy before the MD run (−30.37 kcal/mol). The binding energy value of Cangrelor before is −78.25 kcal/mol and after is −181.83 kcal/mol during the MD run. In both complexes, Van der Waals energy plays a major contributing factor. Cangrelor shows much higher binding affinity during the dynamics condition than the rigid condition.

Table 5 The results of post-MDS MMGBSA of the Ligand-bound NiV L protein.

Solvent accessible surface area and radius of gyration

The solvent-accessible surface area (SASA) and radius of gyration (rGyr) to assess the solvent exposure and structural compactness in both the unbound and ligand-bound complexes of the NiV L protein. The SASA values for the unbound NiV L-P, NiV L-P-Remdesivir, and NiV L-P-Cangrelor complex were 78,619.90 Ų, 78,195.77 Ų, and 75,602.80 Ų, respectively (Fig. 5A). The lower SASA values of the unbound NiV L-P compared to the NiV L-P-ligand complexes indicate that ligand binding can slightly affect structural changes in the viral infection, which likely leads to an increase in the SASA for the drug-bound complexes. Notably, the Cangrelor complex shows a lower SASA compared to the unbound form, suggesting that Cangrelor’s binding leads to a more tightly bound conformation at the active site. The lower value could mean fewer non-specific interactions that would potentially lead to a more precise and balanced modulation of protein activity66. The average Cangrelor SASA value was closely related to the unbound proteins, supporting the NiV L protein dynamics which might be optimized without significantly altering its solvent interaction. Thus, the subtle modulation of protein function rather than through dramatic conformational changes.

Fig. 5
figure 5

(A) Solvent Accessible Surface Area of the unbound NiV L protein, NiV L in complex with Remdesivir and NiV L in complex with Cangrelor. (B) Radius of gyration of the unbound NiV L protein, NiV L in complex with Remdesivir, and NiV L in complex with Cangrelor.

To better understand the effects of the protein’s structural flexibility, we analyzed its radius of gyration. The rGyr represents the mass-weighted RMSD between a group of atoms and their shared center of mass, providing insight into the overall dimensions of the protein. rGyr is an important parameter to ascertain the mobility and stiffness of the protein61,62. The rGyr for the unbound NiV L-P, NiV L-P-Remdesivir, and NiV L-P-Cangrelor complex were found to be 50.75 Å, 44.96 Å, and 51.05 Å, respectively (Fig. 5B). These results suggest that the binding of Cangrelor does not significantly alter the compactness of the protein. rGyr analysis reveals the firmness of the binding of Cangrelor towards NiV L67. Studies combining both in silico and in vivo approaches have examined drugs for identifying their inhibitory properties, potentially modulating the regulation of the viral infection. Although particular to Cangrelor, similar approaches have been used to examine the inhibition of other biological targets, suggesting that, depending on its pharmacological profile, Cangrelor might have comparable inhibitory effects on NiV L protein68. Table 6 provides the average values of RMSD, RMSF, SASA, rGyr and H-bonds based on the bound and unbound NiV L complexes.

Table 6 Average values of the RMSD, RMSF, SASA, rGyr and H-bonds.

Principal components analysis

MD simulation trajectories were utilized to carry out principal component analysis (PCA). PCA is widely explored to understand the effective global motions exhibited in respective protein and protein ligand complexes during simulation. The projections of motions in the phase space from the principal components of unbound NiV L-P protein, NiV L-P in complex with Remdesivir, and NiV L-P in complex with Cangrelor in the MDS trajectory were plotted using scatter plots. Figure 6A and B provides the scatter plots based on the C-alpha (Cα) of the NiV L-P-Remdesivir and NiV L-P-Cangrelor complex superimposed on the unbound NiV L-P protein complex.

During simulation in a solvent system, which exhibits structural changes as well as conformation changes over a period of time are represented within a high-dimensional vector space which corresponds to the system’s degrees of freedom (DOF) encompassing the independent parameters defining its configuration. PCA picks the crucial elements present in the trajectory with the help of a covariance matrix or a correlation matrix, which explains the effective DOF of the protein. Dynamic cross-correlation matrix plot (DCCM) of the Cα of NiV L-P (Fig. 6C), NiV L protein in complex with Remdesivir (Fig. 6D), and NiV L-P in complex with Cangrelor (Fig. 6E). Here we consider Cα atoms of the NiV L proteins as the two principal components. In order to evaluate the critical collective motions of NiV L protein with and without ligands, we used covariance matrices of the Cα atoms to calculate the principal modes using trj_essential_dynamics.py. Positive values provide correlated movements between Cα atoms, while negative values indicate an anti-correlated motion. The highest variation in the protein’s internal motion was represented using Principal component 1 (PC1) and the second highest variation by Principal component 2 (PC2). The NiV L-P-Remdesivir complex was completely superimposed in the unbound NiV L protein, and the NiV L-P-Cangrelor complex occupied most of the unbound NiV L-P protein conformation spaces; hence, there are no significant structural changes observed, thus binding of Cangrelor does not exert any conformation changes. Additionally, the DCCM reveals the residues in the NiV L protein 700–1000 exhibit higher affinity towards Cangrelor which is provided in Fig. 6E, this region is critical for the polymerase function. The stabilization of the dynamics could impair the flexibility required for RNA synthesis, thus enhancing the mechanistic basis for the inhibitory effect of Cangrelor. However, Remdesivir exerts much less influence on the correlation, supporting a weaker targeted dynamic impact. Therefore, these findings would further enhance how subtle changes to the conformation can have effective inhibition of the NiV L-P protein, providing valuable insights for developing effective antiviral compounds against NiV replication.

Fig. 6
figure 6

Scatter plot of the principal component of NiV L protein Cα atoms.(A) Scatter plot of the PCA results of the Cα atoms; black dots represent the unbound NiV L protein, and red represents the NiV L protein in complex with Remdesivir (red). (B) Scatter plot of the PCA results of the Cα atoms; black dots represent the unbound NiV L protein, and green represents the NiV L protein in complex with Cangrelor. (C) Dynamic cross-correlation matrix plot of the Cα atoms in unbound NiV L protein. (D) Dynamic cross-correlation matrix plot of the Cα atoms in NiV L protein in complex with Remdesivir. (E) Dynamic cross-correlation matrix plot of the Cα atoms in NiV L protein in complex with Cangrelor.

Free energy landscape

Based on the insights from PCA, we explored the stability of each complex through the free energy landscape22. The differences in the thermodynamic properties of the unbound and ligand-bound NiV L protein complexes are determined using FEL calculation. The MDS trajectories of the principal components 1 and 2 were used to investigate the conformational dynamics of each complex. FEL relies on the probabilistic occurrence of certain patterns of data points that are later converted to free energy values using simple relationships69. Figure 7 illustrates the free energy landscapes projected onto the first two principal components of the unbound NiV L-P (Fig. 7A), NiV L-P in complex with Remdesivir (Fig. 7B), and NiV L-P in complex with Cangrelor (Fig. 7C) for the backbone atoms of NiV L-P protein. In the free energy contour map, the shape and size of the minimal energy area (in red) suggest the complex is much more stable. Similarly, the small and centralized blue area indicates the ligand-bound complex is more stable. Similarly, the NiV L-P protein in complex with Cangrelor exhibits enhanced stability relative to its unbound state. These findings highlight the inhibitory potential of both compounds, with Remdesivir widely recognized as an effective antiviral drug for NiV infection in experimental studies. However, despite its therapeutic application, the high mortality rate associated with NiV infection emphasizes the need for novel and effective drugs. Our study reveals that Cangrelor exhibits superior binding affinity and stability as compared to Remdesivir, suggesting its potential as a more effective antiviral agent.

Fig. 7
figure 7

The free energy landscape of unbound and bound NiV L protein complexes.(A) Unbound NiV L protein. (B) NiV L protein in complex with Remdesivir. (C) NiV L protein in complex with Cangrelor.

Discussion

The Nipah virus causes one of the deadliest infections in human history, with a death rate reaching up to 90%. Since there are no therapeutics or vaccines approved for the disease, managing the disease remains the only alternative. There is a growing need for therapeutic options targeting the viral proteins. Among the multiple proteins involved in the viral life cycle, RNA polymerase4and its cofactor P protein are known to perform the most important functions of transcription and replication of the viral genome16.

With the most recent study by Yang G et al. (2024) providing the 2.9-Å cryo-electron microscopy structure of the NiV L-P complex, we can now identify potential drugs that can bind and inhibit its functioning. From the same study, out of the three domains discussed we considered the active site residues GDNE (residues 831–834) belonging to the RdRp domain to identify the potential lead candidates to control the viral replication15.

In the present study, we developed an attention-based deep learning framework for drug repurposing that combines pharmacophore based chemical properties of the ligand and the active site residues of the NiV L-P protein. Unlike conventional methods, our approach uniquely incorporates ligand-target interactions and chemical cavity properties as key constraints during the prediction process70,71. The methodology involves encoding pharmacophore features and spatial attributes derived from the target cavity into a structured representation to layered attention mechanisms to prioritize potential drug candidates based on their binding affinities and compatibility with the target cavity.

This demonstrates several advantages over existing repurposing strategies. First, the use of receptor cavity-based pharmacophore properties ensures the integration of biologically meaningful data into the model, enhancing both prediction accuracy and interpretability. This framework allows for the direct utilization of FDA-approved drugs without requiring extensive re-training or additional structural modifications. Second, the ability of the model to rank ligands based on their scores highlights its capacity to handle diverse chemical properties efficiently. Furthermore, the model’s ability to generalize across different targets without fine-tuning underscores its robustness and potential applicability in a wide range of therapeutic contexts. Despite its robust architecture, the deep learning model has certain limitations. As it is purely feature based with no training data, the model’s performance heavily relies on the quality and comprehensiveness of the input features. This dependency may lead to potential biases if the features do not adequately represent the full spectrum of ligand protein interactions. Additionally, the absence of training data limits the model’s capacity to learn from real world variations, increasing the risk of overfitting to specific feature patterns. Addressing these limitations through feature augmentation or incorporating experimental data in future work could enhance the deep learning framework.

Our study identified Cangrelor, Allopurinol, and Cocarboxylase as potent lead molecules with the best docking score as compared to the control (Remdesivir) among the 500 FDA-approved drugs against the RdRp domain of Nipah virus using molecular docking. Among these 3 top-scoring drugs, we selected Cangrelor as more potent towards NiV-L, which can be repurposed against NiV. Cangrelor is an antiplatelet drug that is used to treat cardiovascular disorders72. It has already been studied for its potential role in combating viral diseases such as SARS-CoV273,74and African swine fever virus75. Our in-silico analysis identifies Cangrelor as a potential repurposed drug candidate against NiV L protein of the L-P complex as it showed greater binding affinity, binding free energy, ligand-protein contact, hydrogen bonding, rGyr, SASA and FEL as compared to Remdesivir. Cangrelor’ s ability to effectively bind to the key residues of NiV-L protein forming hydrogen bonds with RdRp residues GLU291, THR546, ARG551, LYS724, ILE890, SER892, LYS893 including critical interactions with the active site residues ASP832 and ASN833 supports its role as a promising candidate in developing novel strategies to limit the pathogenicity of NiV. Unlike nucleoside analogs such as Remdesivir, which rely on chain termination or mutagenesis, Cangrelor can prevent nucleotide incorporation and disrupts polymerase stability at an earlier stage. These findings pave the way for further exploration of Cangrelor as a viable therapeutic option for combating Nipah virus infections.

Although NiV is primarily recognized for its neurological manifestation, recent studies have emphasized the crucial involvement in cardiovascular implications, which includes thrombotic events, vascular compromise, endothelial dysfunction and myocarditis76. This lethal cardiovascular complication may lead to morbidity and mortality in association with NiV infection. So there an extra critical care is needed to treat affected individuals with cardiac problems3. Cangrelor, an antiplatelet drug, could help treat thrombotic events linked to NiV infection while also inhibiting NiV-L polymerase activity highlighting its dual role which needs be further validated experimentally77. To the best of our knowledge, this is the first study based on the latest cryo-EM structure of the NiV L-P complex.

Even though Cangrelor exhibits relatively few off-target effects, some of these may still contribute to adverse outcomes, presenting a limitation in its clinical use78. Schrör (2012) discusses how repeated platelet receptor occupancy by P2Y₁₂ inhibitors, including Cangrelor, may lead to platelet “exhaustion” and transient adverse effects, such as dyspnea, which could be related to transfusion-related acute lung injury (TRALI)79. Kubica et al. (2019) describes how off-target effects of Cangrelor include potential improvement in endothelial function and arterial reactivity, which could influence infection-related vascular complications80. A study by Müller et al. (2015) highlights the complex interaction between platelets, inflammation, and the anti-inflammatory effects of antiplatelet drugs like Cangrelor by reducing platelet-leukocyte interactions and downregulating the release of pro-inflammatory cytokines, raising concerns about potential immunomodulatory effects81. Moreover, Marchi et al. (2024) suggest that adenosine-related drugs, including Cangrelor, may have unintentional cardiac and neurological effects due to their impact on ATP metabolism and purinergic signaling82. Addressing these challenges through further research could enable a more targeted approach to mitigate off-target effects while preserving the therapeutic benefits of Cangrelor. Therefore, the therapeutic efficacy of Cangrelor must be further verified through in vitro and in vivo targeted experiments.

Conclusion

Nipah virus threatens public health due to its higher infection fatality rate along with the lack of vaccination and effective treatment. Screening the FDA-approved drug library through the integration of pharmacophore properties with an attention-based deep learning technique allows for a rapid methodology for short-listing the drug library against the NiV L-P complex. Further, we employed various in silico approaches to propose Cangrelor as a potential drug against NiV replication. By evaluating the docking score, MMGBSA, MDS, and PCA we identified Cangrelor as a better drug compared to the control Remdesivir with the binding affinity and stability against the NiV L protein. In vivo studies using Cangrelor against the NiV L-P protein could further solidify our findings and validate its antiviral efficacy.