Introduction

Hepatitis B is a severe liver infection caused by the hepatitis B virus (HBV) and is a global health concern, particularly because of its potential to cause chronic liver disease, cirrhosis, and liver cancer1. The virus is a member of the hepadnavirus family2 and has distinct characteristics, such as strong resilience to fluctuations in humidity and temperature3. The global disease burden of viral hepatitis, including hepatitis B, increased from 980.9 thousand in 1990 to 1412.3 thousand in 2017, accounting for 97.6% of total deaths4. Hepatitis B mortality varies regionally, and the infection accounts for approximately 900,000 deaths annually from complications of cirrhosis and hepatocellular carcinoma, emphasizing the need for effective, finite-duration curative therapy5. The dynamic structural properties of HBV and the evolutionary origins of hepadnaviruses contribute to viral pathogenesis and immune interaction6. The high mutation rate of HBV presents challenges for disease control7.

HBV infects humans through a complex interaction of viral mechanisms and host immune responses8,9. Upon entering the human body, HBV targets hepatocytes10, the major cells of the liver11. The PreS1 domain of the viral large surface antigen (L-HBsAg) binds to human NTCP (sodium taurocholate co-transporting polypeptide), a receptor on hepatocytes, facilitating entry through receptor-mediated endocytosis12. Once inside the cell, the virus’s partially double-stranded DNA is converted in the nucleus into covalently closed circular DNA (cccDNA), which serves as a template for viral mRNA synthesis and establishes a persistent infection13. HBV infection is non-cytopathic, meaning it does not directly cause cell death9,14,15. However, the immune response, particularly the activation of CD8+ T cells and natural killer cells, plays a critical role in causing liver damage while attempting to control the virus16. This response can lead to chronic liver inflammation, cirrhosis, and even hepatocellular carcinoma17. The age at which HBV infection occurs significantly affects the immune response: neonates and young children often develop chronic infections because of an immature immune system, whereas in the adult liver, hepatic macrophages facilitate lymphoid organization and immune priming, promoting successful immunity18.

The core protein (HBc or Cp), also known as the hepatitis B core antigen (HBcAg), is the viral protein that forms the capsid. The capsid is a complete protein shell composed of multiple core protein monomers arranged in an icosahedral structure approximately 36 nm in diameter, which encloses and protects the viral genome. The core protein is therefore crucial for the virus’s structure and lifecycle19, and the self-assembled icosahedral capsid is essential for viral integrity and infectivity21,22. Understanding this protein’s structure is essential for developing effective therapeutic strategies against HBV21. The capsid contains the viral reverse transcriptase and pregenomic RNA (pgRNA), which are essential for DNA synthesis23. It also plays a role in the interaction of the virus with host cells, facilitating entry, uncoating, and release of genetic material. The HBV pgRNA codes for the core protein and acts as the template for reverse transcription, which produces relaxed circular DNA (rcDNA)24. The N-terminal domain drives capsid assembly, highlighting the complex interplay between the structure and function of the core protein25. The role of protein charge density in the HBV capsid adds a further level of complexity19. Nucleic acid binding is carried out by the C-terminal tail of the HBV core protein, underscoring the significance of electrostatic interactions in capsid integrity26. Studies on the molecular processes involved in HBV capsid assembly and disassembly offer important information about possible targets for antiviral treatments. Novel approaches have been facilitated by recent studies that have illuminated the significance of protein charge density in capsid stability26. Furthermore, deciphering the nuclear export and import of the HBV capsid protein helps to clarify important facets of the virus-host relationship3.

Blocking the HBV core protein inhibits capsid formation, disrupts reverse transcription, decreases viral DNA production, and limits virus spread. Disrupted capsids may also be more readily detected and cleared by the immune system. Thus, in this study, we examine the core protein of hepatitis B, including its molecular interactions, assembly dynamics, and possible therapeutic applications.

Materials and methods

Identification of hits by virtual screening

The virtual screening workflow employed the ZINC database27 and the BIMP (Indian Medicinal Plant) repository accessed through RASPD+. Using the RASPD+ protocol28, the HBV core protein (PDB ID: 6J10) was used as the target structure, with the co-crystallized ligand B40 defining the active site. A binding energy cutoff of -9.5 kcal/mol, derived from the docking energy of B40 with the HBV core protein, was applied to filter hits. Compounds satisfying this threshold were prioritized for downstream analyses.

Filtering hits based on drug-likeness

Drug-likeness assessment was performed using RDKit29 on the Jupyter Notebook platform30. Molecules incompatible with pharmaceutical requirements were excluded by applying Lipinski’s Rule of Five31, a benchmark for oral bioavailability. Compounds fulfilling all criteria and exhibiting drug-like characteristics were advanced to subsequent analyses.
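The Rule of Five filter described above can be sketched in a few lines. This is a minimal illustration, not the authors' notebook: it assumes the four descriptors have already been computed (for example with RDKit), and the `Descriptors` class, the molecule names, and the descriptor values are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Descriptors:
    """Precomputed descriptors for one molecule (values are illustrative)."""
    mol_weight: float        # molecular weight in Da
    logp: float              # octanol-water partition coefficient
    h_bond_donors: int
    h_bond_acceptors: int


def passes_rule_of_five(d: Descriptors) -> bool:
    """Return True only if all four Lipinski criteria are satisfied."""
    return (
        d.mol_weight <= 500.0
        and d.logp <= 5.0
        and d.h_bond_donors <= 5
        and d.h_bond_acceptors <= 10
    )


# Hypothetical candidates; real values would come from a cheminformatics toolkit.
candidates = {
    "mol_A": Descriptors(342.4, 2.1, 2, 5),
    "mol_B": Descriptors(612.7, 6.3, 4, 11),
}
drug_like = [name for name, d in candidates.items() if passes_rule_of_five(d)]
print(drug_like)
```

Requiring all four criteria mirrors the text's "fulfilling all criteria"; some workflows tolerate one violation, which would need a different predicate.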

Toxicity

To qualify as viable drug candidates, lead molecules must demonstrate minimal toxicity. Toxicity profiling was conducted using DataWarrior32, evaluating parameters including mutagenicity, tumorigenicity, reproductive effects, irritancy, and cytotoxicity. Only compounds showing no predicted toxicity across these metrics were retained for further investigation.

ADME

Pharmacokinetic properties were assessed using SwissADME33. Key parameters included solubility (LogS, water solubility, SKlogS buffer), absorption (GI absorption, human intestinal absorption %, MDCK permeability, Caco-2 permeability), distribution (LogP, LogD, BBB permeability, plasma protein binding %), metabolism (CYP1A2/CYP2D6/CYP2C9/CYP2C19 inhibition, CYP3A4 substrate activity), and excretion (skin permeability logKp). Compounds with favorable ADME profiles were selected for advanced studies.

Consensus molecular docking

Docking accuracy can be compromised by algorithmic biases arising from parameter overfitting or training limitations34,35,36. To mitigate these issues, a multi-algorithm consensus strategy was implemented, integrating AutoDockTools37, AutoDock Vina38, idock39, rDock40, QVina41, and PLANTS42, following protocols established in prior work43. Ligand-receptor interactions, including residue-specific contacts and bond formation, were analyzed using PLIP44. Molecular visualization and interaction mapping were performed with NGL Viewer45.

DFT calculation

DFT calculations were used to quantify electronic parameters such as the HOMO and LUMO energies, the HOMO-LUMO gap, ionization potential, and electron affinity, which are crucial for assessing molecular reactivity and stability. Molecular structures were first converted into XYZ format using Open Babel47. Calculations were then run in ORCA46 with the B3LYP functional and the def2-SVP basis set, including geometry optimization directives. The ORCA outputs were converted to Molden format with “orca_2mkl”, and IboView48,49 was used with the Molden input to display the molecular orbitals and electronic properties, including the HOMO and LUMO.
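As an illustration of the calculation setup, the sketch below builds a minimal ORCA input for a B3LYP/def2-SVP geometry optimization that reads coordinates from an XYZ file. The helper name `build_orca_input`, the file name `ligand.xyz`, and the neutral singlet charge and multiplicity are assumptions for the example; the exact input used in the study is not given in the text.

```python
def build_orca_input(xyz_file: str, charge: int = 0, multiplicity: int = 1) -> str:
    """Return the text of a minimal ORCA input file.

    The keyword line requests a geometry optimization with the B3LYP
    functional and the def2-SVP basis set; the coordinate line points ORCA
    at an external XYZ file. Charge and multiplicity default to a neutral
    singlet, which is an assumption and must be set per molecule.
    """
    lines = [
        "! B3LYP def2-SVP Opt",
        f"* xyzfile {charge} {multiplicity} {xyz_file}",
        "",  # trailing newline at end of file
    ]
    return "\n".join(lines)


orca_input = build_orca_input("ligand.xyz")  # hypothetical file name
print(orca_input)
```

Writing this string to a file such as `ligand.inp` and passing it to the ORCA executable would start the job; the resulting wavefunction file is what the Molden conversion step operates on.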

MD simulation

The dynamic behavior of the optimally docked core protein-ligand complexes was assessed by 100 ns all-atom molecular dynamics (MD) simulations. The simulations were conducted using GROMACS50,51,52 and NyroMDNotebook v1, a Jupyter notebook developed by Girinath G. Pillai53. The process involved generating topology files for the protein and ligand, defining a cubic box, solvating with the TIP3P water model54,55, adding ions to the system, equilibrating the system in the constrained NVT ensemble56, and running the NPT ensemble for 100 ps. Finally, a 100 ns production run was performed using the approach followed in our previous work57. MD trajectories were analyzed with the built-in GROMACS analysis tools, MDAnalysis58, and g_mmpbsa59 to assess the stability of each protein-ligand complex under dynamic conditions through multiple metrics: structural deviations (RMSD, RMSF), compactness (Rg), solvent accessibility (SASA), conformational sampling (PCA), interaction patterns (hydrogen bonds, protein-ligand contacts), and binding free energy (mm-PBSA).

Results

Identification of hits by virtual screening

After screening and removing duplicates, 30,318 unique compounds demonstrating receptor binding with free energies < -9.5 kcal/mol were curated for subsequent analysis.

Filtering hits based on drug-likeness

From the initial 30,318 hits, 21,436 molecules adhered to Lipinski’s Rule of Five, qualifying them as drug-like candidates.

Toxicity

Toxicity profiling eliminated 16,776 compounds, retaining 4660 molecules with no predicted mutagenic, tumorigenic, reproductive, irritant, or cytotoxic properties for further evaluation.

ADME

Computational pharmacokinetic profiling prioritized compounds with optimal solubility (LogS), high gastrointestinal absorption (HIA > 80%), moderate lipophilicity (LogP 1–4), and blood-brain barrier impermeability. Only 34 of 4660 candidates satisfied all ADME criteria, advancing to molecular docking.
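The attrition across the four filtering stages can be tabulated directly from the counts reported above. The sketch below is only bookkeeping over those numbers; the stage labels and the function name `funnel_report` are illustrative.

```python
# Compound counts after each stage, as reported in the text.
stages = [
    ("virtual screening (< -9.5 kcal/mol)", 30318),
    ("Lipinski Rule of Five", 21436),
    ("toxicity profiling", 4660),
    ("ADME profiling", 34),
]


def funnel_report(stages):
    """Return (stage, count, removed, percent retained) for each stage."""
    report = []
    previous = None
    for name, count in stages:
        removed = 0 if previous is None else previous - count
        retained_pct = 100.0 if previous is None else 100.0 * count / previous
        report.append((name, count, removed, retained_pct))
        previous = count
    return report


for name, count, removed, retained_pct in funnel_report(stages):
    print(f"{name}: {count} kept, {removed} removed, {retained_pct:.2f}% retained")
```

The toxicity row reproduces the 16,776 compounds stated as eliminated, which is a quick consistency check on the reported counts.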

Consensus molecular docking

Compared to single docking methods, consensus docking enhances the quality of docking and virtual screening results60,61,62,63. Multiple docking programs were selected deliberately to overcome the inherent limitations of individual algorithms. Each program employs distinct scoring functions and search algorithms that excel in different aspects of molecular docking: AutoDock offers force field-based scoring with genetic algorithm search, Vina provides enhanced speed and accuracy through knowledge-based potentials, idock excels in multithreaded performance, rDock incorporates solvent effects through empirical scoring, QVina optimizes speed without compromising accuracy, and PLANTS implements ant colony optimization for sampling. Because individual programs often show biases that can affect the accuracy of results, the consensus approach minimizes the systematic bias of any single program and enhances the reliability of binding pose predictions by prioritizing ligands with consistent binding conformations across methodologies. Compounds exhibiting high pose consensus in virtual screening (Table 1) demonstrated improved hit rates, supporting this multi-technique approach.

Table 1 Results of molecular docking using multiple docking programs.

The rank-by-rank (RbR) strategy64,65 optimizes consensus docking outcomes by aggregating molecular rankings from individual scoring functions. In this workflow, each algorithm assigns ranks (1 = most favorable), which are averaged to generate a unified score for cross-method comparison. The final RbR ratings, tabulated in Table 2, reflect each compound’s overall docking performance.
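The rank-by-rank procedure described above can be sketched as follows. The program labels, ligand names, and scores are hypothetical, and the sketch assumes that a more negative score is more favorable in every program and that there are no tied scores.

```python
def rank_scores(scores):
    """Rank ligands within one program: most negative score gets rank 1."""
    ordered = sorted(scores, key=scores.get)
    return {ligand: position for position, ligand in enumerate(ordered, start=1)}


def rank_by_rank(score_tables):
    """Average each ligand's rank over all docking programs."""
    per_program = [rank_scores(table) for table in score_tables.values()]
    ligands = per_program[0].keys()
    return {
        ligand: sum(ranks[ligand] for ranks in per_program) / len(per_program)
        for ligand in ligands
    }


# Hypothetical docking scores (kcal/mol-like units) from three programs.
score_tables = {
    "prog_A": {"lig1": -10.2, "lig2": -9.8, "lig3": -9.1},
    "prog_B": {"lig1": -8.7, "lig2": -9.4, "lig3": -8.1},
    "prog_C": {"lig1": -11.0, "lig2": -10.1, "lig3": -10.5},
}
consensus = rank_by_rank(score_tables)
best_first = sorted(consensus, key=consensus.get)
print(best_first, consensus)
```

Averaging ranks rather than raw scores avoids mixing scoring functions with different units and scales, which is the main reason to prefer RbR over score averaging.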

Table 2 Results of consensus molecular docking after arranging them rank-by-rank.

The final ranking provides a thorough evaluation of the expected binding efficacy of the compounds, encompassing their rankings across the different scoring methodologies. The top six molecules from the aggregated RbR outcomes were selected for further study and are shown in the radar plot in Fig. 1.

Fig. 1

Results of consensus molecular docking after RbR, depicted in a radar plot.

The interactions of the core protein with all six molecules are shown in ribbon representation and 2D depiction in Figs. 2 and 3, respectively.

Fig. 2

Ribbon representation of core protein with ligands: (a) core protein-ZINC16607363 complex; (b) core protein-ZINC02691764 complex; (c) core protein-ZINC01158358 complex; (d) core protein-ZINC02330823 complex; (e) core protein-ZINC00674395 complex; (f) core protein-ZINC00789496 complex.

Fig. 3

2D representation of core protein with ligands: (a) core protein-ZINC16607363 complex; (b) core protein-ZINC02691764 complex; (c) core protein-ZINC01158358 complex; (d) core protein-ZINC02330823 complex; (e) core protein-ZINC00674395 complex; (f) core protein-ZINC00789496 complex.

DFT

DFT calculations enabled the measurement of important electronic properties, such as the energies of the HOMO and LUMO, which are essential for understanding the reactivity and stability of molecules. The HOMO-LUMO energy gap indicates the chemical stability and excitability of the molecule. The ionization potential and electron affinity describe a molecule’s capacity to donate or accept electrons, respectively. Electronegativity represents the tendency to draw electrons towards itself within a molecule, whereas chemical potential signifies the change in the system’s energy when an electron is added under constant volume and temperature. Global hardness measures the resistance to changes in electron distribution, and global softness is its inverse; together they indicate the reactivity of molecules. The electrophilicity index quantifies the molecule’s tendency to accept electrons, indicating its nucleophilic or electrophilic nature. These metrics together provide a thorough electronic profile that is crucial for evaluating molecular interactions in drug discovery. Analyzing the DFT descriptors for the six molecules provides a comparative view of their electronic properties. Table 3 summarizes the results of the DFT calculations, and Fig. 4 depicts the optimized structures with their HOMO and LUMO.
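The global reactivity descriptors listed above are commonly derived from the frontier orbital energies. The sketch below assumes the Koopmans-type approximation (ionization potential = −E_HOMO, electron affinity = −E_LUMO) and one common set of definitions; conventions differ between papers (for example, softness is sometimes defined as 1/(2η)), and the text does not state which were used. The orbital energies in the example are hypothetical, not values from Table 3.

```python
def reactivity_descriptors(e_homo: float, e_lumo: float) -> dict:
    """Global reactivity descriptors from HOMO/LUMO energies (same unit, e.g. eV).

    Assumes the Koopmans-type approximation; definitions follow one common
    convention and may differ from those used in a given study.
    """
    ionization_potential = -e_homo
    electron_affinity = -e_lumo
    electronegativity = (ionization_potential + electron_affinity) / 2
    chemical_potential = -electronegativity
    hardness = (ionization_potential - electron_affinity) / 2
    return {
        "gap": e_lumo - e_homo,
        "ionization_potential": ionization_potential,
        "electron_affinity": electron_affinity,
        "electronegativity": electronegativity,
        "chemical_potential": chemical_potential,
        "hardness": hardness,
        "softness": 1 / hardness,
        "electrophilicity": chemical_potential ** 2 / (2 * hardness),
    }


# Hypothetical orbital energies in eV, chosen for easy arithmetic.
example = reactivity_descriptors(e_homo=-6.0, e_lumo=-2.0)
print(example)
```

Because every descriptor is a function of only two numbers, differences between ligands in these indices ultimately trace back to differences in their HOMO and LUMO energies.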

Table 3 Results of DFT calculation.

ZINC00674395 demonstrated balanced electronic properties with a moderate HOMO-LUMO gap (0.1522 eV), strong electron-donating capability, and good electron-accepting properties, suggesting optimal reactivity and stability. ZINC00789496 showed lower reactivity with a higher HOMO-LUMO gap and weaker electron-accepting tendencies, while ZINC01158358 exhibited moderate electron-donating capabilities and greater chemical stability due to its larger HOMO-LUMO gap. ZINC02330823 displayed unique characteristics with a small HOMO-LUMO gap (0.1182 eV), high ionization potential, and the highest electrophilicity index (0.236796 eV⁻¹), indicating high reactivity potential. ZINC02691764 demonstrated moderate electronic properties across all parameters, with balanced electron-donating and accepting capabilities and a moderate HOMO-LUMO gap (0.1673 eV). ZINC16607363 emerged as the most reactive compound with the highest HOMO energy and lowest LUMO energy, resulting in the smallest HOMO-LUMO gap, though this suggests potential stability concerns. Among these, ZINC00674395, ZINC02330823, and ZINC02691764 showed the most promising electronic configurations for stable protein-ligand interactions, balancing reactivity with stability requirements for drug-like molecules.

Fig. 4

Optimized structures with their HOMO and LUMO.

Molecular dynamics simulation

Molecular dynamics (MD) simulations serve as a cornerstone in the computational assessment of drug candidates, providing critical insights into the dynamic behavior of protein-ligand complexes over time. Through 100 ns MD simulations, we examined the dynamic stability of each ligand-bound protein complex. The analysis covers RMSD, RMSF, Rg, SASA, PCA, hydrogen bond formation, the protein-ligand contact timeline, and binding free energy with mm-PBSA. A simulation duration of 100 ns was selected based on several considerations. First, this timescale is sufficient to observe protein-ligand equilibration and conformational adaptation, as evidenced by the stabilization of RMSD values for all complexes well before the 100 ns mark (Fig. 5). Second, the time frame provides adequate sampling of the conformational space to meaningfully compare the binding stability of different ligands with the core protein. Third, previous studies on HBV core protein interactions have shown that key binding events occur within this timescale. Finally, the 100 ns duration balances computational feasibility and scientific rigor for a comparative study of multiple protein-ligand systems.

RMSD

RMSD values for all three ligands stabilize after an initial rise, indicating that each protein-ligand complex reaches equilibrium. Figure 5 depicts the RMSD of the protein bound to the different ligands over time. ZINC00674395 shows the highest average RMSD (0.422651 ± 0.054241 nm), suggesting a larger deviation from the initial protein structure, but its moderate standard deviation indicates consistent behavior throughout the simulation. ZINC02330823 has the lowest average RMSD (0.363595 ± 0.061406 nm), implying a conformation closer to the initial structure and potentially a more stable interaction with the protein, although its standard deviation, the highest of the three, indicates more fluctuation in its binding conformation. ZINC02691764 displays an intermediate average RMSD (0.390548 ± 0.050267 nm) with the lowest standard deviation, which could indicate a stable protein-ligand complex with minimal structural deviation over time.
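For reference, the RMSD reported here follows the usual definition, written below in its unweighted form. The symbols are introduced for illustration: N is the number of atoms in the selection, r_i(t) the position of atom i at time t after least-squares superposition onto the reference structure, and r_i^ref its reference position. Analysis tools may apply mass weighting, which the text does not specify.

```latex
\mathrm{RMSD}(t) = \sqrt{\frac{1}{N}\sum_{i=1}^{N}
  \left\lVert \mathbf{r}_i(t) - \mathbf{r}_i^{\mathrm{ref}} \right\rVert^{2}}
```

The averages and standard deviations quoted above are then taken over the RMSD(t) time series of each complex.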

Fig. 5

RMSD of the protein bound to different ligands over time.

RMSF

The RMSF plots (Fig. 6) illustrate the flexibility of individual residues in a protein when bound to three different ligands during MD simulations. RMSF values can be indicative of the strength and specificity of ligand binding. The ligand that causes lower flexibility in key active site residues may result in a more stable and specific interaction, potentially leading to a more favorable drug candidate.
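Per-residue flexibility of this kind is computed as the root-mean-square deviation of each atom from its own time-averaged position. The sketch below is a plain-Python illustration on a toy trajectory; it assumes the frames have already been superimposed on a common reference, and the function name `rmsf` and the coordinates are hypothetical.

```python
import math


def rmsf(frames):
    """Per-atom RMSF from a list of frames.

    Each frame is a list of (x, y, z) tuples, one per atom, and all frames
    are assumed to be aligned to the same reference beforehand.
    """
    n_frames = len(frames)
    n_atoms = len(frames[0])
    values = []
    for i in range(n_atoms):
        # Time-averaged position of atom i.
        mean = [sum(frame[i][k] for frame in frames) / n_frames for k in range(3)]
        # Mean squared displacement from that average.
        msd = sum(
            sum((frame[i][k] - mean[k]) ** 2 for k in range(3))
            for frame in frames
        ) / n_frames
        values.append(math.sqrt(msd))
    return values


# Toy trajectory: atom 0 is static, atom 1 oscillates along x.
toy_frames = [
    [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)],
    [(0.0, 0.0, 0.0), (3.0, 0.0, 0.0)],
]
toy_rmsf = rmsf(toy_frames)
print(toy_rmsf)
```

In a real analysis the per-atom values are typically reported per residue, for example using the C-alpha atom of each residue.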

Fig. 6

RMSF of the protein bound to different ligands.

ZINC00674395 exhibits significant fluctuations across several residues, indicating a flexible binding interaction that may allow adaptability in ligand binding but may also suggest lower binding specificity. For ZINC00674395, the average RMSF (0.206469 ± 0.087470 nm) suggests that the overall flexibility of the protein is moderate when this ligand is bound, with a relatively wide range of motion among residues, as indicated by the standard deviation. ZINC02330823 has a nearly identical average RMSF (0.207089 ± 0.130327 nm) but the highest standard deviation, indicating that certain residues are particularly flexible during the simulation. ZINC02691764 shows the lowest overall RMSF (0.180330 ± 0.109015 nm), indicating a more rigid protein-ligand complex, which could imply a strong and specific interaction. The comparative RMSF of the amino acid residues in the active site is shown in Fig. 7. This bar graph displays the RMSF values for six key amino acid residues (LEU19, PHE24, PRO25, TRP102, TYR118, and PHE122) in the active site of the HBV core protein when bound to three different ligands: ZINC00674395 (blue), ZINC02330823 (green), and ZINC02691764 (purple).

The graph reveals distinct fluctuation patterns: ZINC02330823 (green) induces notably high fluctuations at residue 19 (nearly 0.200 nm) and residue 122 (approximately 0.150 nm), suggesting this ligand may allow greater mobility in these regions. ZINC00674395 (blue) causes pronounced fluctuations at residue 24 (approximately 0.175 nm) and residue 25 (about 0.150 nm), while maintaining relatively lower fluctuations at residues 102 and 118. ZINC02691764 (purple) shows a more moderate pattern of fluctuations across most residues, with its highest value at residue 19. All three compounds induce relatively lower fluctuations at residues 102 and 118 (below 0.100 nm), suggesting these positions may be stabilized regardless of which ligand is bound. The differential RMSF patterns reveal that each inhibitor uniquely modulates the flexibility of key active site residues, with ZINC00674395 specifically restricting motion at the dimer-dimer interface residues (102, 118) while allowing controlled flexibility at other positions (24, 25), suggesting an optimal balance of rigidity and flexibility for effective capsid assembly inhibition.

Fig. 7

Comparative RMSF of amino acid residues present in the active site.

Rg

The Rg analysis for the protein-ligand complexes (Fig. 8) reveals that ZINC02691764 maintains the most compact and stable protein structure, with an average Rg of 1.732686 nm and a standard deviation of 0.015425 nm. In contrast, ZINC00674395 and ZINC02330823 display slightly less compact structures, with higher average Rg values of 1.756854 nm and 1.758062 nm, respectively, and higher standard deviations (0.018818 nm and 0.019937 nm, respectively), indicating more structural variability throughout the simulation. A compact protein structure, as indicated by a lower Rg value, can correlate with enhanced ligand binding efficiency and increased structural stability of the protein-ligand complex, which are desirable attributes in potential drug candidates.
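For reference, the radius of gyration is the mass-weighted root-mean-square distance of the atoms from the center of mass. The symbols below are introduced for illustration: m_i and r_i are the mass and position of atom i, and r_cm is the center of mass of the selected atoms.

```latex
R_g = \sqrt{\frac{\sum_{i} m_i \left\lVert \mathbf{r}_i - \mathbf{r}_{\mathrm{cm}} \right\rVert^{2}}
                 {\sum_{i} m_i}},
\qquad
\mathbf{r}_{\mathrm{cm}} = \frac{\sum_{i} m_i \mathbf{r}_i}{\sum_{i} m_i}
```

A smaller value therefore means the mass is, on average, packed closer to the center, which is the sense in which a lower Rg is read as a more compact fold.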

Fig. 8

Rg of protein bound to different ligands over time.

SASA

The SASA calculations present insights into the dynamic conformational states of the protein when bound to various ligands depicted in Fig. 9. The ligand ZINC00674395 exhibits the lowest average SASA at 93.50425 nm² with a standard deviation of ± 2.035731 nm², suggesting a more compact protein conformation with less surface area exposed to the solvent. In contrast, ZINC02330823 induces a more open protein structure, as reflected by the highest average SASA of 95.69128 nm² and a standard deviation of ± 1.926553 nm². ZINC02691764 results in a protein conformation with an intermediate average SASA of 94.76383 nm² but demonstrates the greatest conformational variability, indicated by the highest standard deviation of ± 2.434626 nm². These variations in SASA and associated standard deviations underscore the distinct influence each ligand has on the protein’s solvent exposure and conformational flexibility.

Fig. 9

SASA of the protein bound to different ligands over time.

Principal component analysis

Principal Component Analysis (PCA) was performed to elucidate the conformational space explored by the target protein upon binding with various ligands. The PCA projections onto the first two principal components (PC1 and PC2) reveal distinct conformational clusters for each ligand-protein complex (Fig. 10). Notably, the protein bound to ZINC00674395 exhibited a broader distribution along both PC1 and PC2, indicating a more extensive exploration of the conformational landscape. In contrast, complexes with ZINC02330823 and ZINC02691764 displayed comparatively constrained distributions, suggesting a more limited set of accessible conformations. Quantitatively, the projections for ZINC00674395 spanned from −3.478 to 3.119 nm on PC1 and −2.529 to 3.975 nm on PC2. For ZINC02330823, PC1 varied between −2.816 and 4.183 nm and PC2 between −3.784 and 3.371 nm. For the ZINC02691764 complex, PC1 and PC2 ranged from −2.991 to 3.419 nm and −3.073 to 3.820 nm, respectively. The extreme values are of similar magnitude for all three complexes, so the differences described above presumably reflect how the projections are distributed within these ranges (Fig. 10) rather than the end-to-end spans.

Fig. 10

PCA projection on PC1 and PC2.

These observations imply that the binding of ZINC00674395 to the target protein induces significant conformational flexibility, potentially facilitating adaptive binding or allosteric modulation. Conversely, ZINC02330823 and ZINC02691764 may stabilize specific conformations of the protein, possibly reflecting a more rigid binding mode or distinct mechanistic interaction profiles.

Free energy landscape

The free energy landscapes derived from the simulations of the protein in complex with ligands are depicted in Fig. 11. Each landscape provides a two-dimensional representation of the thermodynamic stability of the protein conformations across the sampled conformational space. The top panels of Fig. 11 display contour maps, while the bottom panels present a three-dimensional surface view, enhancing the visual distinction of energy minima.
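Landscapes of this kind are typically obtained by binning the trajectory along two collective coordinates and converting the populations to free energies with the Boltzmann relation ΔG = −k_B T ln(P/P_max). The sketch below illustrates that conversion in plain Python; the choice of PC1 and PC2 as coordinates, the bin width, the temperature of 300 K, and the sample values are all assumptions, since the text does not state them.

```python
import math
from collections import Counter

K_B = 0.0083144626  # Boltzmann constant in kJ/(mol K)


def free_energy_landscape(pc1, pc2, bin_width=0.5, temperature=300.0):
    """Map each occupied 2D bin to a relative free energy in kJ/mol.

    The most populated bin is the zero of the scale; sparsely populated
    bins get higher free energies. Empty bins are simply absent.
    """
    counts = Counter(
        (math.floor(x / bin_width), math.floor(y / bin_width))
        for x, y in zip(pc1, pc2)
    )
    max_count = max(counts.values())
    return {
        bin_index: -K_B * temperature * math.log(count / max_count)
        for bin_index, count in counts.items()
    }


# Hypothetical projections (nm): three frames in one basin, one outlier.
toy_pc1 = [0.1, 0.2, 0.3, 1.2]
toy_pc2 = [0.1, 0.1, 0.2, 1.4]
toy_fel = free_energy_landscape(toy_pc1, toy_pc2)
print(toy_fel)
```

A deep, narrow basin corresponds to many frames falling in a few neighboring bins, while a broad, shallow minimum corresponds to populations spread over many bins with similar counts.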

Fig. 11

Comparative free energy landscapes of a protein in complex with various ligands.

The ligand ZINC00674395 shows a prominent minimum, suggesting a well-defined stable conformation. The free energy basin is deep and narrow, indicative of a conformational state with a significant population during the simulation, implying a potentially high-affinity interaction with the protein. In contrast, the landscape corresponding to ZINC02330823 exhibits two distinct minima, implying the existence of two stable conformations that the protein may adopt in the presence of this ligand. Such a bimodal distribution might reflect the ligand’s ability to stabilize multiple active or inactive states of the protein or may indicate the potential for allosteric regulation. The binding of ZINC02691764 presents a landscape with a broad, shallow minimum, suggesting greater conformational flexibility and a wider range of accessible conformations.

Analysis of hydrogen bond formation

Hydrogen bond interactions play a crucial role in stabilizing protein-ligand complexes. Figure 12 illustrates the time distribution plots of the average number of hydrogen bonds formed between HBV core protein and each ligand during the 100 ns simulation period.

Fig. 12

Hydrogen bonds of protein bound to different ligands over time.

ZINC00674395 establishes an average of 3 hydrogen bonds throughout the simulation, demonstrating consistent and stable hydrogen bonding interactions. The compound maintains at least 2 hydrogen bonds for approximately 85% of the simulation time, with frequent occurrences of 3–4 hydrogen bonds. This stable hydrogen bonding network contributes significantly to its high binding affinity, as reflected in the binding free energy calculations. ZINC02691764 forms an average of 2 hydrogen bonds during the simulation. While slightly lower than ZINC00674395, this compound still exhibits relatively stable hydrogen bonding, maintaining at least 2 hydrogen bonds for approximately 76% of the simulation duration. ZINC02330823 displays the lowest hydrogen bonding capability among the three compounds, with an average of 1 hydrogen bond throughout the simulation. This compound maintains at least 2 hydrogen bonds for only about 53% of the simulation period. This reduced hydrogen bonding capacity likely contributes to its relatively lower binding affinity compared to the other two compounds. The comparative analysis of hydrogen bonding patterns correlates well with the binding free energy results, where ZINC00674395, with the most stable and numerous hydrogen bonds, also demonstrated the highest binding affinity. The consistent hydrogen bonding exhibited by ZINC00674395, particularly with key residues TRP102 and TYR118, provides further evidence for its potential as an effective HBV capsid protein inhibitor.
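The two statistics quoted for each ligand, the average number of hydrogen bonds and the fraction of the simulation with at least two bonds, can be computed from a per-frame count series as sketched below. The function name `hbond_summary` and the ten-frame series are hypothetical and are not data from the simulations.

```python
def hbond_summary(counts_per_frame, threshold=2):
    """Return (mean bonds per frame, % of frames with >= threshold bonds)."""
    n_frames = len(counts_per_frame)
    mean_bonds = sum(counts_per_frame) / n_frames
    frames_at_or_above = sum(1 for c in counts_per_frame if c >= threshold)
    occupancy_pct = 100.0 * frames_at_or_above / n_frames
    return mean_bonds, occupancy_pct


# Hypothetical hydrogen bond counts for ten trajectory frames.
toy_counts = [3, 2, 4, 1, 3, 2, 0, 3, 2, 3]
mean_bonds, occupancy_pct = hbond_summary(toy_counts)
print(mean_bonds, occupancy_pct)
```

Reporting both numbers is useful because a similar mean can arise from a steady bonding pattern or from alternation between many bonds and none, and only the occupancy distinguishes the two.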

Protein-ligand interaction timeline

The interaction timelines for ligands ZINC00674395, ZINC02330823, and ZINC02691764 reveal distinct temporal interaction patterns over the course of the 100 ns molecular dynamics simulations (Figs. 13, 14, and 15), elucidating the nuances of ligand stability and binding propensity within the active site of the target protein.

Fig. 13

Interaction timeline of ZINC00674395 bound to Cps.

ZINC00674395 demonstrates sustained and robust interactions throughout the 100 ns trajectory, with key hydrophobic contacts at LEU140 and PHE122, as well as pi-stacking and hydrogen bonding events at TRP102 and ASN136, respectively (Fig. 13). The consistency of these interactions underscores a stable binding conformation with the potential for strong ligand efficacy. The interaction timeline for ZINC02330823, however, shows that interactions are maintained robustly up to approximately 80 ns, after which a notable decrease is observed (Fig. 14). The initial persistent interactions, particularly the hydrophobic and van der Waals contacts, indicate a favorable engagement, but the loss of interactions late in the trajectory may suggest a reduction in binding stability over time, which could impact the ligand’s therapeutic reliability.

Fig. 14

Interaction timeline of ZINC02330823 bound to Cps.

ZINC02691764 displays a varied interaction profile characterized by transient contacts and a lack of sustained interactions with the active site residues (Fig. 15). The fluctuating interaction dynamics suggest an adaptable but potentially less stable ligand-protein complex, which could manifest in a lower binding affinity or efficacy in a biological context.

Fig. 15

Interaction timeline of ZINC02691764 bound to Cps.

In the context of these findings, ZINC00674395 emerges as the most promising candidate for further drug development, demonstrating a stable and persistent interaction profile conducive to high-affinity binding. ZINC02330823, despite a strong initial interaction profile, may require further studies to understand the observed reduction in interactions, which could have implications for its long-term binding stability. The dynamic and less consistent binding pattern of ZINC02691764 might necessitate structural optimization to enhance its interaction stability and potential as a therapeutic agent.

Interaction dynamics with Sankey diagram analysis

The Sankey diagram provides a visual representation of the interaction dynamics between the protein and various ligands, elucidating the relative frequency and strength of interactions with specific protein residues (Fig. 16). The width of the links correlates with the prevalence of each ligand-residue interaction, affording a clear, comparative view of binding patterns across different ligand-protein complexes. For ZINC00674395, a notable interaction density with residues Leu19, Leu30, and Gln39 is observed, indicating a concentrated interaction network within a defined region of the protein. This localized binding pattern suggests a specific binding pocket engagement, which may underpin a high-affinity interaction characteristic. Conversely, ZINC02330823 demonstrates a broader interaction profile, engaging with an extended array of residues including Phe22, Tyr118, and Phe23. The distributed nature of these interactions across the protein surface may reflect a multi-site binding modality or an induced fit mechanism, where the protein adjusts its conformation to accommodate the ligand across several contact points. Similarly, ZINC02691764 shows interactions dispersed among various residues, with significant engagement observed with Leu140, Asp22, and Ile138. The distributed interactions suggest the possibility of allosteric modulation, as the ligand influences multiple regions of the protein, potentially altering its functional state or dynamics.

Fig. 16 Sankey diagram illustrating the frequency and distribution of ligand-residue interactions for the ligands.

Free-binding energy with mm-PBSA

Because docking scores alone do not adequately capture the binding free energy in a dynamic environment, we estimated the binding free energy of each ligand-receptor complex from the MD simulations using the mm-PBSA method. Table 4 displays the binding free energy calculations based on the MD trajectories.

Table 4 The free binding energy between core protein-ligand complexes.

Ligand ZINC00674395 demonstrates the most favorable binding profile, with a total binding energy of − 100.425 ± 3.559 kJ/mol. This suggests a potent interaction with the protein, likely driven by strong van der Waals contacts and favorable desolvation effects, despite less favorable electrostatic interactions. ZINC02691764 shows an intermediate total binding energy of − 84.710 ± 5.139 kJ/mol; its relatively high electrostatic energy indicates potential for improved binding affinity through optimization of charge–charge and polar interactions. ZINC02330823 exhibits the least negative total binding energy of − 79.709 ± 5.413 kJ/mol, reflecting the lowest predicted affinity of the three ligands for the target protein. This could be attributed to less optimal van der Waals interactions or less favorable desolvation upon binding, as suggested by the individual energetic components.
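To make the comparison concrete, the following sketch orders the three total binding energies quoted above and shows how an mm-PBSA total is conventionally assembled from its components. It is a minimal illustration: the helper names `mmpbsa_total` and `rank_by_affinity` are hypothetical, and the four-term decomposition (van der Waals, electrostatic, polar solvation, non-polar solvation) is the usual textbook form, assumed here because the component values of Table 4 are not reproduced in this section.

```python
# Total binding free energies (kJ/mol) as quoted in the text: (mean, std dev).
reported = {
    "ZINC00674395": (-100.425, 3.559),
    "ZINC02330823": (-79.709, 5.413),
    "ZINC02691764": (-84.710, 5.139),
}


def mmpbsa_total(e_vdw, e_elec, g_polar, g_nonpolar):
    """Sum the usual mm-PBSA terms (all in kJ/mol) into a total binding energy.

    Assumed decomposition: van der Waals + electrostatic + polar solvation
    + non-polar (SASA) solvation. Entropic terms are not included.
    """
    return e_vdw + e_elec + g_polar + g_nonpolar


def rank_by_affinity(energies):
    """Return ligand names ordered from most to least negative mean energy."""
    return sorted(energies, key=lambda name: energies[name][0])


ranking = rank_by_affinity(reported)
for name in ranking:
    mean, sd = reported[name]
    print(f"{name}: {mean:.3f} +/- {sd:.3f} kJ/mol")
```

A more negative total corresponds to a more favorable predicted binding, so the first name printed is the strongest predicted binder.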

Considering all the data, ZINC00674395 stands out as the most promising candidate for in-vitro/in-vivo studies due to its superior binding energy profile, stable interaction with the target protein as indicated by MD simulations, and consistent engagement with key active site residues.

Discussion

The importance of capsid assembly in the HBV life cycle was highlighted by Tan et al.'s demonstration that HBV capsid proteins assemble around the pregenomic RNA (pgRNA) and viral reverse transcriptase66. Understanding the molecular process of capsid disassembly is equally essential to understanding viral infection. The work of Ghaemi et al. clarifies the steps involved in HBV capsid breakdown and offers insights into how genetic information is released into the host cell20. Capsid inhibitors are now under active development, and in silico screening of novel candidates, such as that done by Hayakawa et al., shows potential for reducing viral load67. Important structural insights are also obtained from Kim et al.'s study of the quasi-6-fold structure of the HBV capsid; for future studies and possible therapeutic uses, it is essential to optimize conditions that reduce capsid damage and maximize stability21. Sun et al.'s work on the role of protein charge density in HBV capsid formation sheds light on the electrostatic interactions between proteins and their possible therapeutic implications26. A fragment-based drug discovery approach identified novel Ciclopirox derivatives with improved binding energies, contributing to the development of a robust quantitative structure-affinity relationship (QSAR) model68. Another study highlights the identification of sulfamoylbenzamide derivatives as effective capsid assembly modulators (CAMs), showing promising pre-clinical profiles for oral administration69. The effectiveness of CAMs, particularly sulfonamide-based drugs, in inhibiting HBV replication is further supported, with some compounds surpassing clinical drugs in efficacy70. Novel HBV CAMs were discovered through structure-based virtual screening, leading to compounds with significant inhibitory effects on capsid interactions and viral replication71.
Additionally, new CAM hits were identified from a protein-protein interaction library, providing structurally unique scaffolds with potential for further antiviral development72. The study of adenoviruses demonstrates the use of a virus-like particle (VLP) vaccine incorporating an epitopic region from hexon proteins, showing high protective efficacy against adenoviral diseases in chickens73. Improvement of drug candidates’ physicochemical properties and bioavailability, such as GLS4 derivatives, is also a significant theme, with one compound exhibiting potent in vitro anti-HBV activity74. The use of scaffold hopping and pharmacophore hybrid strategies in designing heterocycle derivatives shows promise in identifying potent non-nucleoside anti-HBV agents75. High-throughput screening (HTS) assays aid in identifying anti-HBV compounds while excluding false positives76. The mechanism of action of HAPs targeting HBV capsid protein is investigated using molecular dynamics simulations, offering insights into new drug targets77. Virtual screening (VS) strategies are emphasized for their impact on accelerating HBV drug discovery and development78. The potential of HIDs and HPDs as HBV inhibitors is explored, with structure-activity relationship guiding optimization79. Finally, the structural analysis of core protein allosteric modulators (CpAMs) provides a complex perspective on antiviral drug design80, while the delineation of critical epitopes is crucial for VLP vaccine efficacy against viral infections81. These findings collectively underscore the potential of CAMs, QSAR models, HTS, VLP vaccines, and VS in advancing the HBV drug development pipeline.

Our findings both complement and extend these previous studies in several important ways. While earlier work by Hayakawa et al. (2015) employed in silico screening to identify capsid inhibitors, our approach integrates multiple computational methods to provide a more comprehensive evaluation of potential inhibitors. Unlike the fragment-based approach, we employed high-throughput virtual screening of complete drug-like molecules, which allowed us to select compounds with favorable predicted pharmacokinetic profiles from the outset. Our consensus docking methodology addresses limitations noted in previous virtual screening studies (Elmessaoudi-Idrissi et al., 2018) by minimizing system-bias effects through the integration of six different docking algorithms. Furthermore, whereas earlier work examined the mechanism of HAPs using molecular dynamics, our 100 ns simulations provide deeper insights into the dynamic behavior of protein-ligand complexes, particularly at the dimer-dimer interface critical for capsid assembly. The electronic property analysis through DFT calculations represents a novel contribution not present in previous studies, offering mechanistic explanations for the observed binding affinities. Notably, our lead compound ZINC00674395 demonstrates superior binding affinity (-100.425 ± 3.559 kJ/mol) compared to previously reported inhibitors, with a unique interaction profile that specifically restricts motion at key dimer-dimer interface residues while maintaining an optimal balance of flexibility elsewhere in the structure. This distinctive binding mode suggests a potentially more effective mechanism for disrupting capsid assembly than previously identified compounds. Each study has merits of its own. The main obstacles in drug discovery are ADME82 and toxicity83. Preclinical and clinical studies of lead compounds often end in failure84,85,86. Orally administered drugs are generally well accepted87.
Oral administration, toxicity, and ADME have not been comprehensively studied in recent research on Cps. We used extensive in silico analyses to address these gaps. After obtaining the hits, we screened them to ensure compliance with all recognized drug-likeness guidelines. To obtain reliable data, we also used multiple tools to assess drug-likeness, ADME, and toxicity. ZINC00674395 shows drug-like features suitable for oral formulation, which would facilitate oral administration, the route of drug delivery preferred by patients for its comfort. Given its predicted ADME characteristics, ZINC00674395 is expected to be readily absorbed through the gastrointestinal system and distributed to the site of action. To improve the quality of docking and virtual screening results relative to single docking approaches, consensus docking was carried out using multiple docking programs. The top conformer predicted by each docking program was analyzed to determine the best conformer. DFT studies were conducted for a detailed theoretical investigation of chemical processes at the molecular level. To understand the dynamic behavior of the molecules in a simulated physiological environment, 100 ns all-atom simulations were conducted. These simulations revealed the stability of the protein-ligand interaction, conformational changes of the protein upon ligand binding, and the energetics of the binding process, which ultimately helped us identify ZINC00674395 as the best compound among all. Taken together, the simulation, DFT, and consensus docking results clarify the mechanism of action of the potential therapeutics. The moderate HOMO-LUMO gap of ZINC00674395 implies a balance between reactivity and stability, which makes it possible to form hydrophobic interactions, van der Waals contacts, and pi-stacking with the active site of Cps.
Owing to its favorable electronic configuration, ZINC00674395 forms a stable, compact protein-ligand complex with a strong network of hydrogen bonds, which is critical for specific and strong interactions. The HOMO of ZINC00674395, localized on the aromatic rings and adjacent to the carboxylic acid moiety, suggests an electron-rich region capable of engaging in pi-stacking interactions with aromatic residues within the binding pocket, such as TRP102. The electron density distribution in the HOMO supports the observed pi-stacking interaction, which is a key stabilizing force in the ligand-protein complex. Conversely, the LUMO is distributed away from the core scaffold, towards the extremities of the ligand, particularly around electronegative atoms such as oxygen and nitrogen. This distribution indicates potential sites for nucleophilic attack and aligns with the hydrogen bond formation between ZINC00674395 and active site residues such as GLN99 and THR33. The localization of the LUMO around these functional groups enhances the ligand's capacity for forming directional hydrogen bonds, reinforcing the specificity and affinity of binding. The 100 ns molecular dynamics simulations provided critical insights into the dynamic behavior of protein-ligand interactions that static analyses alone cannot reveal. ZINC00674395 demonstrated superior binding stability, as evidenced by its consistently lower RMSF values at the key binding site residues TRP102 and TYR118, indicating restricted motion at the dimer-dimer interface critical for capsid assembly. This restricted flexibility correlates with its most favorable binding free energy, for which van der Waals interactions were the primary driving force. The RMSD trajectory of ZINC00674395, while showing the highest average value, exhibited the lowest standard deviation, suggesting a conformational shift to an optimized binding pose that remained stable throughout the simulation.
The stability of this pose is further supported by the free energy landscape analysis, which revealed a deep, narrow energy minimum for ZINC00674395, indicating a well-defined, thermodynamically favorable binding conformation. The hydrogen bond analysis demonstrates that ZINC00674395 forms an average of 3 hydrogen bonds throughout the simulation, with the interaction timeline showing persistent contacts with LEU140, PHE122, TRP102, and ASN136, which helps explain its superior binding affinity. Additionally, the Rg and SASA values suggest that ZINC00674395 induces a more compact protein conformation, potentially stabilizing the protein in a state unfavorable for capsid assembly. These dynamic simulation parameters, when integrated with the electronic properties from DFT analysis, particularly the moderate HOMO-LUMO gap, explain why ZINC00674395 achieves both high binding affinity and specificity: it balances favorable electrostatic interactions through its electronic configuration while maintaining critical stabilizing contacts at the binding interface over the simulation timeframe. Blocking the active site of the Hepatitis B virus (HBV) capsid protein with ZINC02691764 can prevent the core protein from self-assembling into the capsid. When proper capsid formation is inhibited, the reverse transcription process is disrupted, leading to a decrease in viral DNA production and in the production of infectious HBV particles, thereby limiting the spread of the virus within the host and potentially to new hosts. Disrupted capsid formation may also expose the virus to immune detection and clearance more effectively than intact viruses. The mechanism by which ZINC02691764 prevents capsid assembly can be explained through our computational analyses. The binding poses from consensus molecular docking revealed that ZINC02691764 interacts with key residues at the dimer-dimer interface (LEU19, PHE24, and TRP102), which are critical for proper capsid formation.
The MD simulation results support this finding, as shown by the interaction timeline, in which ZINC02691764 maintains transient contacts with these interface residues throughout the 100 ns simulation. These interactions likely disrupt the precise alignment required for core protein monomers to self-assemble into the functional capsid structure. Importantly, our lead compound ZINC00674395 demonstrates an even stronger potential to inhibit capsid assembly. Its superior binding affinity and more persistent interactions with critical binding pocket residues, especially TRP102 and PHE122, suggest it would more effectively block the protein-protein interactions required for capsid formation. The RMSF analysis further supports this conclusion, showing that ZINC00674395 specifically restricts motion at interface residues (102, 118) while allowing controlled flexibility at other positions. This balance of rigidity and flexibility, combined with its favorable electronic properties revealed by DFT calculations, makes ZINC00674395 a promising candidate for preventing HBV capsid formation, which would consequently disrupt viral replication and reduce the production of infectious HBV particles. Thus, ZINC00674395 may be a viable and potentially safe and effective candidate for further preclinical and clinical studies, and a possible treatment alternative for Hepatitis B.
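The idea behind consensus docking, combining the rankings of several docking programs so that no single scoring function dominates, can be sketched with a simple rank-averaging scheme. This is an illustration under stated assumptions rather than the consensus procedure used in this work, whose details are not given in this section: the function `consensus_rank`, the program labels, the ligand names, and all scores below are hypothetical, and every program is assumed to score every ligand with more negative values being better.

```python
def consensus_rank(scores_by_program):
    """Average per-program ranks for each ligand (lower average = better).

    `scores_by_program` maps a program label to {ligand: docking score}.
    Assumptions: every program scores every ligand, and a more negative
    score is better for every program.
    Returns a list of (ligand, average_rank) sorted from best to worst.
    """
    rank_sums = {}
    for scores in scores_by_program.values():
        ordered = sorted(scores, key=scores.get)  # most negative score first
        for rank, ligand in enumerate(ordered, start=1):
            rank_sums[ligand] = rank_sums.get(ligand, 0) + rank
    n_programs = len(scores_by_program)
    averages = {lig: total / n_programs for lig, total in rank_sums.items()}
    return sorted(averages.items(), key=lambda item: (item[1], item[0]))


# Toy scores from three hypothetical programs (not data from the study).
toy_scores = {
    "program_A": {"lig1": -9.1, "lig2": -8.4, "lig3": -7.0},
    "program_B": {"lig1": -8.0, "lig2": -8.6, "lig3": -6.5},
    "program_C": {"lig1": -10.2, "lig2": -9.0, "lig3": -9.5},
}

consensus = consensus_rank(toy_scores)
for ligand, avg_rank in consensus:
    print(f"{ligand}: average rank {avg_rank:.2f}")
```

Ranks are averaged instead of raw scores because different docking programs report scores on different scales; a ligand that ranks well across most programs rises to the top even if one program disagrees.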

Conclusion

In this study, we employed a comprehensive computational pipeline integrating high-throughput virtual screening, multi-parameter filtering, consensus molecular docking, density functional theory, and 100 ns molecular dynamics simulations to identify potential inhibitors of the HBV core protein. Through rigorous analysis of electronic, structural, and dynamic properties, ZINC00674395 emerged as the most promising compound, demonstrating superior binding affinity, favorable drug-like properties, and favorable predicted ADME characteristics without predicted toxicity issues. ZINC00674395 modulates the flexibility of key active site residues, particularly restricting motion at the dimer-dimer interface residues (TRP102, TYR118) while maintaining persistent interactions throughout the simulation. This balance of rigidity and flexibility, combined with its moderate HOMO-LUMO gap and strong van der Waals interactions, provides a mechanistic explanation for its potential to disrupt capsid assembly. The multi-faceted computational approach employed in this study offers advantages over single-method screening by minimizing bias and providing convergent evidence from complementary techniques. Our findings not only identify ZINC00674395 as a lead compound for HBV core protein inhibition but also provide structural and dynamic insights into the mechanism of capsid assembly disruption that could guide further optimization. This research contributes to the ongoing efforts to develop novel anti-HBV therapeutics targeting the capsid protein, with ZINC00674395 representing a promising starting point for preclinical investigation. Future studies should focus on experimental validation of these computational predictions, including in vitro binding assays, capsid assembly inhibition tests, and cellular antiviral activity assessments, potentially leading to a new class of HBV therapeutics that could address the limitations of current treatment options.