Introduction

Serological evidence of human metapneumovirus (hMPV) infection in humans dates back to the 1950s. Since the discovery of the virus in 2001, various groups and subgroups of hMPV have been shown to cause respiratory infections1,2, and severe infection in immunocompromised individuals3 poses a significant public health burden. In 2023 and early 2024, China experienced hMPV outbreaks, highlighting the potential of the virus to cause widespread respiratory illness4. Metapneumovirus, which includes hMPV, and Orthopneumovirus, which includes respiratory syncytial virus (RSV), are two important genera that cause respiratory infection in humans5. Analysis of genetic variation in the F and G proteins revealed two lineages of hMPV (lineage A and lineage B), which are further stratified into four subgroups (A1, A2, B1, and B2). These genotypes exhibit geographic and temporal variability, influencing disease severity and transmission patterns6. hMPV infection produces symptoms closely resembling those of RSV infection, including fever, cough, and dyspnoea, with severe cases leading to bronchiolitis or pneumonia7. This symptomatic overlap complicates diagnosis and therapeutic management, as conventional testing often fails to differentiate between the two pathogens1. In 2017, the World Health Organization identified hMPV as accounting for 10% of deaths among children younger than five years old. By 2018, global estimates indicated that hMPV infected around 14 million children under five, leading to 640,000 hospitalizations and more than 8,000 deaths8. Furthermore, genotype-specific variations may influence immune responses, potentially impacting vaccine design and antiviral efficacy9.

The recent outbreaks of hMPV infections across various countries highlighted the urgent need for enhanced surveillance, diagnostic tools, and the development of targeted vaccines and treatments to mitigate hMPV’s global impact. The absence of vaccines and limited treatment options contribute significantly to hospitalizations and mortality during epidemics. As of January 2025, no approved vaccine for preventing hMPV infections exists. Nevertheless, substantial advancements have been made in hMPV vaccine development. A few trials and studies have been carried out in the recent past10,11. Limited supportive therapies, including immunoglobulins and glucocorticoids, are primary treatments for hMPV, while ribavirin and IVIG are used for severe cases despite limited evidence, necessitating further trials12.

The genome (13 kb) of hMPV consists of a non-segmented RNA (negative-sense) encoding nine proteins arranged sequentially to support transcription and replication processes13. Viral RNA synthesis relies on the functions of the nucleoprotein (N), phosphoprotein (P), and large polymerase protein (L), while the matrix protein (M) facilitates viral assembly and release. The M2 protein is also crucial for regulating and facilitating RNA transcription14. The G protein mediates viral attachment, the F protein facilitates membrane fusion, and the SH protein is responsible for modulating the host immune response15,16. The F protein primarily assists in viral entry into cells. Its high level of conservation signifies its potential use as a promising candidate for vaccine formulation, as it has the potential to trigger strong neutralizing antibody responses17,18. Attachment glycoprotein, short hydrophobic protein, and matrix protein encoded by hMPV could also be potential vaccine targets19.

mRNA vaccines have revolutionized immunization with their safety, scalability, and rapid development because the synthetic mRNA is directed to host cells to produce antigens, eliciting precise immune responses without the risks of live pathogens20. The synthesis of mRNA molecules allows for the inclusion of multiple epitopes in a single construct for optimized vaccine designs that target specific pathogens. Multiepitope mRNA vaccines overcome single-antigen limitations by enhancing immunogenicity, addressing immune evasion, and minimizing antigenic escape21. The remarkable efficacy of anti-COVID-19 mRNA-based vaccines has emphasized the immense potential of innovative immunoinformatics approaches in combating a wide spectrum of infectious diseases22. Multiepitope vaccines offer new opportunities to target pathogens like hMPV.

Reverse vaccinology utilizes protein data and iterative computational analyses to identify vaccine targets, while machine learning enhances the precision of epitope prediction and vaccine design23. This research aimed to develop an efficient and thermo-stable multiepitope mRNA vaccine by integrating immunoinformatics and a reverse vaccinology approach, targeting four hMPV surface proteins (F, G, M, SH) as potential vaccine targets. The vaccine design proposed in this research could address hMPV’s genetic variability and immune evasion, offering broad protection with a reduced risk of antigenic escape. mRNA vaccine developed by reverse vaccinomics ensures adaptability, scalability, and safety, making it a promising solution for reducing hMPV-related morbidity and mortality, especially in vulnerable populations.

Materials and methods

Figure 1 details this study’s comprehensive workflow and methodological framework, providing an overview of the sequential processes and analytical approaches undertaken.

Figure 1. Schematics of the research execution workflow. CD-HIT = cluster database at high identity with tolerance, ANN = artificial neural network, NN-align = artificial neural network-based model, NetMHCII 2.3 = allele-specific method, MLA = machine learning algorithm, kNN = k-nearest neighbors (a supervised learning classifier), ACC = auto cross covariance, SVM = support vector machine, hMPV = human metapneumovirus, TLR = Toll-like receptor, MHC = major histocompatibility complex, ML = machine learning, NMA = normal mode analysis, and MDS = molecular dynamics simulation. All the tools/servers used in each step of the analysis shown in this figure are detailed in the corresponding sections of “Materials and methods”.

Target proteins, their retrieval, and protein dataset generation

The FASTA-formatted amino acid sequences of four key hMPV proteins (F, glycoprotein G, M, and SH) were sourced from the NCBI viral genome database for further analyses (accessed on 08/01/2025)24. The acquired protein sequences were evaluated for their antigenic potential using the VaxiJen server (https://www.ddg-pharmfac.net/vaxijen/VaxiJen/VaxiJen.html) with a threshold of >0.4. This threshold is widely accepted for viral antigen prediction because it balances predictive accuracy against the number of predicted antigens, and it has been reported in many studies25,26. It has been validated in multiple peer-reviewed studies to distinguish protective antigens reliably; choosing it ensures robust screening, minimizes false positives, and aligns with established benchmarks in immunoinformatics-based vaccine design. Redundancy in the protein sequences was removed with the standalone CD-HIT program27 at a 90% similarity threshold, retaining only non-paralogous sequences to minimize bias within the dataset and ensure reliability for large-scale investigations (Fig. 1a of supplementary file S1). NCBI BLASTp was then used to screen the identified proteins against human protein sequences, with parameters set to E-value ≥ 10⁻⁴ and identity < 50%, to obtain non-paralogous hMPV-specific sequences for further study.
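The screening criteria in this subsection can be summarized as a simple filter. The following is a minimal sketch, assuming the VaxiJen score and the best human BLASTp hit have already been collected for each candidate; the `Candidate` record, its field names, and `is_retained` are hypothetical and are not part of any of the cited tools.

```python
from dataclasses import dataclass
from typing import Optional

ANTIGENICITY_CUTOFF = 0.4  # VaxiJen threshold stated in the text
EVALUE_CUTOFF = 1e-4       # BLASTp E-value bound stated in the text
IDENTITY_CUTOFF = 50.0     # percent identity bound stated in the text


@dataclass
class Candidate:
    """Hypothetical record holding precomputed screening values."""
    name: str
    vaxijen_score: float
    best_human_evalue: Optional[float] = None    # None: no human hit reported
    best_human_identity: Optional[float] = None  # percent; None: no human hit


def is_retained(c: Candidate) -> bool:
    """Keep antigenic sequences that lack a significant human homolog."""
    if c.vaxijen_score <= ANTIGENICITY_CUTOFF:
        return False
    if c.best_human_evalue is None or c.best_human_identity is None:
        return True  # nothing similar found among human proteins
    return (c.best_human_evalue >= EVALUE_CUTOFF
            and c.best_human_identity < IDENTITY_CUTOFF)
```

The sketch covers only the decision logic; the scores themselves would still come from VaxiJen and BLASTp.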

Antigenic determinants (epitopes) prediction

B-cell, cytotoxic T-lymphocyte (CTL), and helper T-lymphocyte (HTL) epitopes, intended to elicit humoral and cellular immune responses, were predicted using the ABCpred server and the IEDB prediction tools.

Computational prediction of linear B-cell epitopes (LBEs)

The selected target proteins were processed on ABCpred for LBEs (http://crdd.osdd.net/raghava/abcpred/)28. A cutoff (score = 0.51) was prioritized to predict B-cell epitopes/antigenic determinants (16-mer). The server, ABCpred, uses a neural network algorithm to evaluate specificity, sensitivity, accuracy, and positive predictive value for each epitope28. The threshold score of 0.51 was selected in the ABCpred server for predicting 16-mer linear B-cell epitopes, in line with the default and validated setting reported in a study28. This score was shown to yield optimal sensitivity (75%) and specificity (63%) and has since been widely adopted in numerous immunoinformatics studies for epitope-based mRNA vaccine design29.

Computational predictive identification of CTL epitopes (CTLEs)

The CTLEs were computationally predicted with the IEDB consensus approach using ANN 4.0, which relies on artificial neural networks (ANNs) trained on experimental binding affinity data. ANN 4.0 refines its predictions using large-scale datasets, making it highly effective for identifying CTL epitopes30. Although a few recently developed epitope prediction tools are available, the IEDB prediction tools were selected for their well-established validation, compatibility with our input format, and wide acceptance in previous studies. To optimize the activation of CD8+ T lymphocytes, only highly antigenic epitopes with an IC50 value below the stringent threshold of 100 nM were selected as prime candidates for incorporation into the engineered vaccine construct. This threshold was chosen with reference to the IEDB guidelines for CTLE selection (IC50 < 50 nM for high and IC50 < 500 nM for moderate binding affinity) (http://tools.iedb.org/mhci/help/). The IC50 < 100 nM cutoff ensures high-affinity binding while balancing stringency against epitope coverage, as recommended in earlier multi-epitope vaccine design studies31,32.
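The relationship between the IEDB affinity bands and the cutoff used here can be illustrated with a short sketch. The band boundaries (50 nM and 500 nM) and the 100 nM cutoff are taken from the text; the function names and the label "low" for weaker binders are assumptions made for illustration.

```python
def binding_class(ic50_nm: float) -> str:
    """Map a predicted IC50 (nM) onto the affinity bands quoted in the text."""
    if ic50_nm < 50:
        return "high"
    if ic50_nm < 500:
        return "moderate"
    return "low"  # label assumed here for anything weaker than 500 nM


def passes_cutoff(ic50_nm: float, cutoff_nm: float = 100.0) -> bool:
    """Selection rule used in this study: IC50 strictly below 100 nM."""
    return ic50_nm < cutoff_nm


# Example: an epitope at 75 nM is a moderate binder that still passes the cutoff.
example = (binding_class(75.0), passes_cutoff(75.0))
```

The 100 nM cutoff therefore admits all high-affinity binders and only the strongest part of the moderate band.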

Computational prediction of the HTL-epitopes (HTLEs)

HTL epitopes, presented on antigen-presenting cells via MHC class II molecules, interact with CD4+ T-cell receptors to activate immune signaling mechanisms. MHC class II binding epitopes (HTLEs) were predicted using the neural network-based alignment (NN-align 2.3; NetMHCII 2.3) algorithm on the IEDB platform. High-ranking epitopes were manually selected for vaccine design with a cutoff of IC50 < 100 nM. This threshold was chosen with reference to the IEDB guidelines for HTLE selection (IC50 < 50 nM for high and IC50 < 500 nM for moderate binding affinity) (http://tools.iedb.org/mhcii/help/); it ensures high-affinity binding while preserving epitope diversity, as recommended in previous studies31. Interferon-gamma (IFN-γ)-inducing HTLEs33 and IL-4-inducing HTLEs were assessed using IFNepitope and IL-4pred, respectively34. All selected HTLEs were predicted to induce IL-4, but none was predicted to induce IFN-γ. Therefore, an additional HTLE with a high IFN-γ score (1.0) and a high IL-4 score (1.07) was derived from the fusion protein and incorporated into the final design to boost the induction of IFN-γ.

Comprehensive analysis and prioritization of predicted epitopes for the construct

The most promising LBEs, CTLEs, and HTLEs were examined for antigenicity (VaxiJen 2.0, with the cutoff of >0.4 adopted in various reports35,36), toxicity (ToxinPred; http://crdd.osdd.net/raghava/toxinpred/)37, soluble protein expression in Escherichia coli (SoluProt 1.0)38, and allergenicity (AllerTOP; https://www.ddg-pharmfac.net/allertop_test/)39. Non-toxic, non-allergenic, and soluble B-cell epitopes were selected manually based on their ABCpred scores for inclusion in the vaccine construct. Non-toxic, non-allergenic, and soluble CTLEs and HTLEs were screened based on their VaxiJen 2.0 antigenicity scores. To evaluate the potential of HTLEs to elicit IFN-γ and IL-4 responses, the IFNepitope and IL-4Pred (http://crdd.osdd.net/raghava/il4pred/) servers were employed, respectively. Eight LBEs, eight CTLEs, and eight HTLEs (two from each target protein) were selected as final candidates. Additionally, an IFN-γ-inducing epitope from the fusion protein was included in the vaccine design (Tables 1 and 2).
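The prioritization described above amounts to filtering on four properties and then keeping the two best-scoring epitopes per protein. A minimal sketch follows, assuming the server outputs have been collected into dictionaries; the keys (`protein`, `antigenicity`, `toxic`, `allergen`, `soluble`, `rank_score`) and the function name are hypothetical, and in the study the final choice was partly manual.

```python
from collections import defaultdict


def prioritize(epitopes, per_protein=2, antigenicity_cutoff=0.4):
    """Return the top-scoring epitopes per protein that pass all filters.

    `rank_score` stands for the ranking value used for that epitope class
    (ABCpred score for LBEs, VaxiJen score for CTLEs/HTLEs in the text).
    """
    kept = defaultdict(list)
    for e in epitopes:
        if (e["antigenicity"] > antigenicity_cutoff
                and not e["toxic"]
                and not e["allergen"]
                and e["soluble"]):
            kept[e["protein"]].append(e)
    return {
        protein: sorted(group, key=lambda e: e["rank_score"], reverse=True)[:per_protein]
        for protein, group in kept.items()
    }
```

With four target proteins and `per_protein=2`, this yields at most eight epitopes per class, matching the counts reported above.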

Population coverage assessment

Broad population coverage, a crucial aspect of vaccine design, was ensured as it enhances the vaccine’s effectiveness across diverse populations. Selected T-cell epitopes were evaluated by country. The population coverage tool (available on IEDB), operating with all standard parameters (default parameters), was employed to perform the analysis40. This assessment aimed to confirm that the proposed vaccine would produce effective immunity for most of the population worldwide. The tool estimates the expected population coverage by analyzing the binding affinity of each epitope to its corresponding HLA.
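To give a feel for how allele frequencies translate into coverage, the sketch below uses a deliberately simplified model: it assumes Hardy-Weinberg proportions at each HLA locus and independence between loci, so that an individual is covered if at least one of their alleles binds a selected epitope. This is an illustrative assumption and not the IEDB tool's actual computation; the function names are hypothetical.

```python
def locus_coverage(covered_allele_freqs):
    """Fraction of individuals carrying at least one covered allele at a locus.

    Under Hardy-Weinberg proportions, both alleles are uncovered with
    probability (1 - F)^2, where F is the summed frequency of covered alleles.
    """
    total = sum(covered_allele_freqs)
    if not 0.0 <= total <= 1.0:
        raise ValueError("summed allele frequency must lie in [0, 1]")
    return 1.0 - (1.0 - total) ** 2


def combined_coverage(loci):
    """Coverage across several loci, assuming the loci are independent."""
    uncovered = 1.0
    for freqs in loci:
        uncovered *= 1.0 - locus_coverage(freqs)
    return 1.0 - uncovered
```

For example, covered alleles summing to a frequency of 0.5 at a single locus give 75% coverage under these assumptions, which shows why a modest set of promiscuous epitopes can approach full coverage.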

Chimeric vaccine design and selection of the potential vaccine model

Potential immune epitopes, including those for B cells, helper T cells (HTLs), and cytotoxic T cells (CTLs), were combined using specialized amino acid linkers to generate two chimeric hMPV vaccine constructs. Each construct was conjugated separately with two distinct adjuvants, mammalian beta-defensin and ribosomal protein, yielding four vaccine constructs. Specific linkers were included to optimize protein expression, enhance bioactivity, and ensure a robust immunogenic response. LBEs were connected using GPGPG linkers, HTLEs were joined with KK linkers, and CTLEs were linked through AAY linkers41. AAY functions as a proteasomal cleavage site and offers epitope separation, GPGPG linkers maintain epitope integrity and elicit a strong HTL response, and KK acts as a cathepsin B cleavage site, crucial for lysosomal processing and effective MHC-II presentation42. These linker sequences play a crucial role in minimizing junctional epitope formation, improving protein folding, ensuring a clear separation of epitopes, and enhancing both the structural coherence and immunological function of the vaccine construct43. Moreover, the EAAAK linker primarily provides structural flexibility and separation between the adjuvant and antigenic regions to enhance the overall immunogenic profile of the vaccine construct44.
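The linker scheme can be sketched as a string-assembly routine. The linker assignments (EAAAK, GPGPG, AAY, KK) follow the text. The block order (LBEs, then CTLEs, then HTLEs) and the rule that a junction between blocks reuses the linker of the block that follows are assumptions made for illustration, as are the function and variable names.

```python
ADJUVANT_LINKER = "EAAAK"
LINKERS = {"LBE": "GPGPG", "CTLE": "AAY", "HTLE": "KK"}


def assemble_construct(adjuvant, epitopes_by_type, order=("LBE", "CTLE", "HTLE")):
    """Join adjuvant and epitopes into one amino acid sequence.

    Each epitope (except the first one after the adjuvant linker) is
    preceded by the linker assigned to its own epitope class.
    """
    sequence = adjuvant + ADJUVANT_LINKER
    first = True
    for kind in order:
        for epitope in epitopes_by_type.get(kind, []):
            if not first:
                sequence += LINKERS[kind]
            sequence += epitope
            first = False
    return sequence
```

Swapping the `order` argument gives one way to generate alternative epitope arrangements for comparison.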

Four vaccine models were assessed for their antigenicity and physicochemical properties. To ensure optimal efficacy and safety, a vaccine must exhibit stability, non-allergenicity, antigenicity, non-toxicity, and high solubility. The physicochemical attributes, antigenic potential, allergenic profile, toxicity, and expression solubility of all four constructs were assessed using ProtParam, VaxiJen v2.0 (threshold = 0.4)35, AllerTOP39, ToxinPred, and SoluProt 1.038, respectively. Based on antigenicity, solubility, toxicity, and physicochemical characteristics, the most potent of the four constructs was selected for further analyses. The most promising vaccine construct was the one that included a 45-mer mammalian beta-defensin peptide adjuvant, attached via an EAAAK linker to the N-terminus of the designed sequence.
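Two of the physicochemical parameters mentioned here have simple closed forms, sketched below for orientation. GRAVY is the mean Kyte-Doolittle hydropathy per residue, and the aliphatic index weights the mole percentages of Ala, Val, Ile, and Leu. This is an independent re-implementation for illustration, not ProtParam's code, and the function names are hypothetical.

```python
# Kyte-Doolittle hydropathy values for the 20 standard amino acids.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def gravy(sequence: str) -> float:
    """Grand average of hydropathicity: mean hydropathy over all residues."""
    return sum(KYTE_DOOLITTLE[aa] for aa in sequence) / len(sequence)


def aliphatic_index(sequence: str) -> float:
    """Aliphatic index: X(Ala) + 2.9*X(Val) + 3.9*(X(Ile) + X(Leu)),
    where X is the mole percent of each residue."""
    n = len(sequence)

    def mole_percent(aa: str) -> float:
        return 100.0 * sequence.count(aa) / n

    return (mole_percent("A")
            + 2.9 * mole_percent("V")
            + 3.9 * (mole_percent("I") + mole_percent("L")))
```

A negative GRAVY value indicates an overall hydrophilic sequence, and a higher aliphatic index is generally read as higher thermostability.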

Finally, a non-natural pan-DR epitope with the sequence AKVAAWTLKAAAC, known for activating CD4+ T cells, was introduced into the most potent vaccine construct to strengthen its potency and effectiveness. A well-designed mRNA vaccine construct typically requires a few key elements (a Kozak sequence encompassing the start codon, epitopes, a suitable adjuvant, and functional linkers) to maximize translational efficiency45. The stop codon can additionally be optimized46. Moreover, two additional components are beneficial for antigen presentation: a signal peptide, which enables the extracellular secretion of translated epitopes, and an MHC class I trafficking signal (MITD) linked to the C-terminus of the antigenic protein. The MITD directs CTLEs to the MHC class I compartment to boost antigen presentation. The 5′ region of the ORF was designed to include the tissue plasminogen activator (tPA) secretory signal sequence (UniProt ID: P00750), while the MITD (UniProt ID: Q8WV92) was integrated into the 3′ region of the ORF47,48. The instability of mRNA-based therapeutics is another challenge; to address it, elements commonly found in eukaryotic mRNAs must be added49,50. The vaccine construct was therefore designed to include four essential components: the 5′ m7G cap, the poly(A) tail, and both the 5′ and 3′ untranslated regions (UTRs). These elements were integrated to enhance mRNA stability and translational efficiency, leveraging the synergistic interplay between the poly(A) tail and the 5′ m7G cap51. The length of the poly(A) tail is critical, as excessively short or overly long tails result in inefficient translation52. Studies suggest an optimal poly(A) tail length of approximately 115–150 nucleotides for robust mRNA vaccine effectiveness53. In addition, NCA-7d was integrated into the 5′ UTR to stabilize the mRNA structure, whereas S27a + R3U was incorporated into the 3′ UTR54,55.
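The element order described in this paragraph can be sketched as follows. The sketch assumes a 5′-to-3′ layout of 5′ UTR, Kozak context, ORF (signal peptide, antigen, MITD, stop codon), 3′ UTR, and poly(A) tail; the 5′ m7G cap is a chemical modification and is therefore not represented as sequence. The function name, the default stop codon, and the default tail length are illustrative assumptions, and real UTR and coding sequences would replace the short placeholders.

```python
def build_mrna(utr5, kozak_context, coding_segments, utr3,
               poly_a_len=120, stop="UGA"):
    """Concatenate mRNA elements in 5' to 3' order.

    `coding_segments` is an ordered list of RNA strings, e.g.
    [signal peptide, antigen, MITD]; the first must begin with AUG.
    """
    if not 115 <= poly_a_len <= 150:
        raise ValueError("poly(A) length outside the 115-150 nt range cited in the text")
    orf = "".join(coding_segments) + stop
    if not orf.startswith("AUG"):
        raise ValueError("ORF must begin with the AUG start codon")
    if len(orf) % 3 != 0:
        raise ValueError("ORF length must be a multiple of three")
    return utr5 + kozak_context + orf + utr3 + "A" * poly_a_len
```

The checks only guard the reading frame and the tail length; they do not validate codon usage or UTR content.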

Secondary-level structural assessment of the construct (hMPVbeta1)

The secondary structural composition (alpha-helices, beta-sheets, and coils) of the most potent construct (hMPVbeta1) was predicted using SOPMA56 and the PSIPRED 4.0 server57, which utilizes position-specific scoring matrices (PSSM). These servers analyze key structural elements, such as transmembrane helices, topology, and domain folding patterns within the peptide sequence, to give insight into the stability and functional features of the vaccine constructs. The mRNA secondary structure was characterized using the RNAfold resource of ViennaRNA (v. 2.0)58, which predicts the centroid secondary structure and uses McCaskill’s algorithm to estimate related parameters, such as the minimum free energy (MFE)59.
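RNAfold reports predicted structures in dot-bracket notation, where matching parentheses mark paired bases and dots mark unpaired ones. The sketch below shows one way to read such a string; it is a generic stack-based parser written for illustration (function names are hypothetical) and does not call ViennaRNA.

```python
def parse_dot_bracket(structure: str):
    """Return sorted (i, j) index pairs (0-based) for a dot-bracket string."""
    stack, pairs = [], []
    for index, symbol in enumerate(structure):
        if symbol == "(":
            stack.append(index)
        elif symbol == ")":
            if not stack:
                raise ValueError("unbalanced structure: extra ')'")
            pairs.append((stack.pop(), index))
        elif symbol != ".":
            raise ValueError(f"unexpected symbol {symbol!r}")
    if stack:
        raise ValueError("unbalanced structure: extra '('")
    return sorted(pairs)


def paired_fraction(structure: str) -> float:
    """Fraction of nucleotides that take part in a base pair."""
    return 2 * len(parse_dot_bracket(structure)) / len(structure)
```

Comparing the pair lists of the MFE and centroid structures is a simple way to see where the two predictions agree.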

3D modeling and computational refinement of the most potent vaccine construct

The 3D structure of a protein or peptide provides essential insights into its stability, functionality, and interactions with other biomolecules. The 3D structure of the most potent construct was produced with the Robetta server, which utilizes a hybrid approach combining comparative modeling with homologous protein templates and de novo modeling for sequence-based predictions when templates are unavailable. This methodology supports accurate and reliable structural insights for evaluating the stability, folding, and functionality of the vaccine construct60. Galaxy Refine was used to refine the 3D structure, taking the .pdb file produced by the Robetta server as input61. After refinement, the structural quality of the model was assessed with the PROCHECK server. Moreover, the model was validated with the ProSA server62.

Molecular docking and normal mode analysis (NMA) for gaining vaccine-receptor interaction insight

To trigger a robust and targeted host immune response, the vaccine must establish precise and stable interactions with immune receptors to ensure efficient recognition and activation of immune pathways. Molecular docking analyses assessed the binding affinities and interaction dynamics between the most potent vaccine construct and host immune receptors and MHCs. The .pdb files of TLR4 (PDB ID: 4G8A), TLR2 (PDB ID: 2Z7X), MHC class I (PDB ID: 2XPG), and MHC class II (PDB ID: 1KG0) were obtained from the Protein Data Bank (PDB). These TLR and MHC .pdb files were preprocessed in PyMOL (manipulation of the chain identifiers). Following preprocessing, each receptor was docked with the vaccine construct using ClusPro 2.063. Subsequently, PDBsum, in conjunction with PyMOL scripts/commands, was employed to assess interaction events (residues and atomic-level biochemical interactions) at the interface of each complex64. The iMODS server, designed for normal mode analysis (NMA) of biomolecular structures such as protein complexes, was used with its default parameters to evaluate the intrinsic dynamic behavior, flexibility, and stability of the TLR-vaccine and MHC-vaccine complexes. By assessing the intrinsic vibrational modes of the complexes, iMODS provides insights into their molecular motions and structural flexibility65,66.

Molecular dynamics simulation (MDS) of the TLR-receptors (TLR-4 and TLR-2)-vaccine and MHCs-vaccine complexes

To investigate the stability of the vaccine prototype and of the TLR-vaccine and MHC class I/II-vaccine complexes in an aqueous environment, MDS was executed for 100 ns using GROMACS. These simulations provided detailed insights into the binding interactions between the vaccine and immune molecules. The simulation system was set up and parameterized using the AMBER99SB force field67. Each complex was solvated in a triclinic simulation box filled with explicit water molecules (SPC, Simple Point Charge, water model) to mimic the aqueous environment. The system was neutralized by adding Na+ or Cl− ions at a physiological concentration (0.15 M), maintaining ionic strength and charge neutrality. Energy minimization (5000 steps, steepest descent algorithm) was performed to relax the initial configuration of the TLR-vaccine and MHC class I/II-vaccine complexes. The system was then equilibrated to thermodynamic equilibrium (310 K and 1 bar) for 100 picoseconds (ps) under the NVT and NPT ensembles68. Temperature coupling was applied to stabilize the environment, whereas pressure coupling allowed the volume of the simulation box to adjust in response to the target pressure. The final production MDS was run for 100 nanoseconds (ns). Post-simulation analysis of the MDS trajectories determined the RMSD (root mean square deviation), RMSF (root mean square fluctuation), number of hydrogen bonds, and Rg (radius of gyration)69. In addition, Molecular Mechanics Poisson–Boltzmann Surface Area (MMPBSA) analysis, critical for assessing the stability and interaction strength of biomolecular complexes, was executed to calculate the binding free energy (ΔGbind) between the vaccine and the TLR/MHC molecules over the full 100 ns and over the last 20 ns of the simulation70.
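The trajectory metrics named above have compact definitions, sketched here in plain Python for a single frame. The RMSD function assumes the two coordinate sets are already superimposed (no fitting is performed), the radius of gyration defaults to equal masses, and the binding free energy follows the usual MM-PBSA bookkeeping of complex minus receptor minus ligand. These helpers are illustrative assumptions and are not the GROMACS analysis tools used in the study.

```python
import math


def rmsd(reference, coords):
    """Root mean square deviation between two pre-aligned coordinate sets."""
    if len(reference) != len(coords) or not reference:
        raise ValueError("coordinate sets must be non-empty and equal in length")
    squared = sum((a - b) ** 2
                  for p, q in zip(reference, coords)
                  for a, b in zip(p, q))
    return math.sqrt(squared / len(reference))


def radius_of_gyration(coords, masses=None):
    """Mass-weighted radius of gyration (equal masses if none are given)."""
    masses = masses or [1.0] * len(coords)
    total = sum(masses)
    center = [sum(m * p[k] for m, p in zip(masses, coords)) / total
              for k in range(3)]
    squared = sum(m * sum((p[k] - center[k]) ** 2 for k in range(3))
                  for m, p in zip(masses, coords))
    return math.sqrt(squared / total)


def binding_free_energy(g_complex, g_receptor, g_ligand):
    """Delta G_bind = G_complex - (G_receptor + G_ligand)."""
    return g_complex - (g_receptor + g_ligand)
```

A plateau in RMSD and a steady Rg over the production run are the usual signs that a complex has stabilized, and a more negative ΔGbind indicates stronger predicted binding.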

Execution of codon adaptation and cloning of vaccine constructs with an expression vector

The vaccine protein sequence was reverse-translated, and the resulting DNA sequence was codon-optimized with the Java Codon Adaptation Tool (JCat) to enhance and evaluate the translational efficiency and expression potential of the construct71. The GC content and codon adaptation index (CAI) of the optimized DNA were determined. The CAI value was used to predict the expression efficiency of the cloned vaccine construct: a value of 1.0 was considered ideal, while scores exceeding 0.8 were considered optimal for robust expression. The GC content (optimal within a range of 30% to 70%) was also assessed, as deviations beyond this range may adversely influence transcriptional and translational processes, thereby impacting protein expression72. The JCat-optimized vaccine construct sequence was cloned in silico into the pET28a(+) expression vector for the E. coli expression system using the SnapGene tool73.
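The two quantities assessed here can be computed directly from a DNA sequence. The sketch below defines GC content as the percentage of G and C bases and the CAI as the geometric mean of per-codon relative adaptiveness weights. The weight table passed in the example is a toy placeholder; a real calculation would use weights derived from the codon usage of the expression host, as JCat does internally. Function names are hypothetical.

```python
import math


def gc_content(dna: str) -> float:
    """Percentage of G and C bases in a DNA sequence."""
    dna = dna.upper()
    return 100.0 * (dna.count("G") + dna.count("C")) / len(dna)


def codon_adaptation_index(dna: str, weights: dict) -> float:
    """Geometric mean of relative adaptiveness over the scored codons.

    Codons missing from `weights` (for example stop codons) are skipped.
    """
    dna = dna.upper()
    codons = [dna[i:i + 3] for i in range(0, len(dna) - len(dna) % 3, 3)]
    logs = [math.log(weights[c]) for c in codons if c in weights]
    if not logs:
        raise ValueError("no codon in the sequence has a weight")
    return math.exp(sum(logs) / len(logs))


# Toy weights for two alanine codons (placeholder values, not host-derived).
TOY_WEIGHTS = {"GCG": 1.0, "GCA": 0.5}
toy_cai = codon_adaptation_index("GCGGCA", TOY_WEIGHTS)
```

A sequence built only from the most-used codon for each amino acid reaches a CAI of 1.0, which is why that value is treated as ideal.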

Immune response simulation

The immunogenic potential of the most potent vaccine construct was assessed through computational immune simulations using the C-ImmSim server, which employs position-specific scoring matrices (PSSM) to predict immune responses. C-ImmSim is a cellular-level, agent-based simulator of the interactions within the mammalian immune system. It characterizes the immune responses elicited by antigens, encompassing the activation and dynamics of B cells, T cells, and other immune cell populations74. This tool has been extensively validated and applied in numerous studies involving multi-epitope mRNA vaccine design, including SARS-CoV-2, NeoCoV, and norovirus75,76. A multiepitope vaccine against norovirus designed using this tool progressed beyond the in-silico phase and exhibited promising results during in-vivo validation77. The simulation was conducted for the initial dose and two booster doses administered at 30 and 60 days. All parameters, including the random seed (12345), simulation steps (initial dose: 200, first booster: 600, and second/final booster: 1200), and MHC class I and II alleles, were kept as default. The simulation volume was set to 10 and the antigen injection to 1000 for every dose, and the vaccine was injected without lipopolysaccharide (LPS).

Results

Target sequence retrieval and incorporation

Four target proteins, F-protein (n = 984), G-protein (n = 1510), M-protein (n = 888), and SH-protein (n = 672) were processed (Accessed on 8/1/2025). The accession numbers of the four selected virus-specific non-paralogous protein sequences for downstream study were F-protein (AN: YP_009513268.1), G-protein (AN: YP_009513272.1), M-protein (AN: MH828686.1) and SH-protein (AN: YP_009513271.1) (Fig. 1b of supplementary file S1). These proteins were processed to predict potential LBEs, CTLEs, and HTLEs.

LBEs assessment and incorporation into vaccine construct

LBEs, essential elements for vaccine effectiveness, were analyzed. The most promising epitopes were determined based on their strong interaction with B-cell receptors, a critical factor in activating humoral immune responses. Screening criteria included multiple factors, such as high binding ABCpred scores, antigenicity exceeding a threshold of 0.4, nontoxicity, and nonallergenicity. Eight LBEs (2 from each protein) were determined for incorporation into the final vaccine constructs. The binding affinity, antigenicity scores, toxicity, and allergenicity status of all the LBEs are summarized in Table 1.

Evaluation of CTL-epitopes for incorporation into vaccine construct

CTLEs were determined because CTLs (CD8+ T cells) are critical components of the defense against viral infections, and epitopes with an IC50 value < 100 nM were prioritized for further screening. Non-toxic, non-allergenic epitopes with top-ranked antigenicity values (score > 0.5) were examined for incorporation into the vaccine construct. Eight epitopes (two from each target protein) were selected for the final vaccine constructs. The characteristics of the CTLEs are summarized in Table 1.

Table 1 Immunogenic potential of the prioritized linear B-cell epitopes and promiscuous CTL epitopes.

Assessment of potential HTL-epitopes for incorporation into vaccine construct

This study analyzed 15-mer HTLEs and their corresponding MHC class II alleles using the MHC-II binding tool available on IEDB, with human designated as the host species. Epitopes with an IC50 value < 100 nM were prioritized. Non-toxic, non-allergenic, IL-4-inducing HTLEs with top-ranked antigenicity values (score > 0.5) were selected for incorporation into the vaccine construct. Eight HTLEs (two from each target protein) were selected for the final vaccine constructs. Moreover, one F-protein-derived HTLE with a high IFN-γ-inducing score (1.0) and overlapping IL-4-inducing activity (IL-4pred score = 1.07) was selected for inclusion in the vaccine constructs. Table 2 summarizes the HTLE-allele pairs, IC50 values, antigenicity scores, IL-4pred scores, IL-4- and IFN-γ-inducing status, toxicity, and allergenicity status.

Table 2 Immunogenic potential of the prioritized promiscuous MHC class II binders (overlapping with IL-4 inducers) and one IFN-gamma-inducing epitope (overlapping with IL-4 inducers).

Determination of the population coverage for the hMPVbeta1 vaccine

The population coverage analysis for the prioritized CTLEs and HTLEs indicated nearly 100% coverage worldwide, reflecting their predicted binding to the diverse HLA alleles found across different countries. This extensive population coverage suggests that the hMPVbeta1 construct could provide immunity to hMPV infection across diverse global populations (Fig. 2).

Figure 2. Country-wise population coverage analysis of the hMPVbeta1 vaccine construct.

Chimeric vaccine design and screening for the potential vaccine model

Two vaccine constructs were designed by integrating the prioritized LBEs, CTLEs, and HTLEs in two distinct combinations using appropriate linkers. Incorporating two different adjuvants (mammalian beta-defensin and ribosomal protein) into these two constructs yielded four (n = 4) vaccine constructs (hMPV-beta1, hMPV-beta2, hMPV-Ribo1, and hMPV-Ribo2), which were subjected to further analysis. All four constructs were antigenic, non-toxic, and non-allergenic. The amino acid (AA) length/molecular weight of hMPV-beta1, hMPV-beta2, hMPV-Ribo1, and hMPV-Ribo2 were 383/39633.71, 385/39668.16, 468/47913.00, and 470/47947.45, respectively. The expression solubility of hMPV-beta1 and hMPV-beta2 was higher than that of the constructs with the ribosomal protein adjuvant (hMPV-Ribo1 and hMPV-Ribo2), as summarized in Table 3. The solubility scores were 0.784 for hMPV-beta1 and 0.9 for hMPV-beta2 (SoluProt score, cutoff ≥ 0.5); however, the antigenicity of hMPV-beta1 (0.746) was higher than that of hMPV-beta2 (0.732) and of the hMPV-Ribo1 and hMPV-Ribo2 constructs (Table 3). Therefore, hMPV-beta1 was selected as the most promising model. The theoretical pI, aliphatic index (AI), grand average of hydropathicity (GRAVY), and instability index (Ii) of hMPV-beta1 were 9.92, 68.75, −0.260, and 31.24, respectively. The negative GRAVY value (cutoff = zero) indicates that the model is hydrophilic and comparatively soluble. The Ii of 31.24 suggests that the construct is stable and soluble. Moreover, the aliphatic index (68.75), which lies between 40 and 85, suggests appropriate thermostability and solubility. The physicochemical parameters, antigenicity scores, toxicity scores, solubility scores, half-life, and allergenicity status of all four models are summarized in Table 3.

Table 3 Antigenic, allergenic, toxigenic, and physicochemical characteristics of the four vaccine construct models.

Prediction of the secondary structure of the chimeric vaccine construct (hMPVbeta1)

The predicted secondary structure of hMPVbeta1 revealed that the construct comprised 37.6% α-helices, 16.97% extended strands, and 45% random coils, as depicted in Fig. 3a and in Fig. 2 of Supplementary file S1. The mRNA secondary structure was thermodynamically stable, with a low (negative) minimum free energy (MFE) of −339.5 kcal/mol, a free energy of the thermodynamic ensemble of −461.61 kcal/mol, and moderate ensemble diversity (301.36). The centroid secondary structure, the most representative structure of the ensemble, had a free energy of −337.22 kcal/mol, as represented by dot-bracket annotation (Fig. 3 of Supplementary file S1). Although slightly less stable than the MFE structure, it was still thermodynamically favorable and represents the RNA’s likely structure. Figure 3b shows only a small region of high entropy (high variability), while the other regions were blue to green (low entropy and high stability). In the mountain plot, the lines overlapped over most of the sequence, suggesting strong agreement between the MFE, partition function (PF), and centroid predictions and a stable, well-defined structure in those regions (Fig. 3c). In RNA secondary structure prediction, low entropy values mark positions in the sequence where the predicted structure is minimally variable. The positional entropy was less than 2.0 at most positions, indicating stable and well-defined secondary structures. At some positions, however, the entropy exceeded 2.0, representing flexible and less stable regions (Fig. 3c).
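To make the dot-bracket annotation and the mountain plot more concrete, the sketch below parses a dot-bracket string into base pairs and derives a simple mountain profile, taken here as the number of base pairs that enclose or include each position. The function names and the short toy structure are illustrative assumptions; the snippet does not reproduce the folding software's own plotting routine or its energy model.

```python
def parse_dot_bracket(structure):
    """Return the sorted list of (i, j) base pairs (0-based) in a dot-bracket string."""
    stack, pairs = [], []
    for i, ch in enumerate(structure):
        if ch == "(":
            stack.append(i)
        elif ch == ")":
            if not stack:
                raise ValueError(f"unmatched ')' at position {i}")
            pairs.append((stack.pop(), i))
        elif ch != ".":
            raise ValueError(f"unexpected character {ch!r} at position {i}")
    if stack:
        raise ValueError(f"unmatched '(' at position {stack[-1]}")
    return sorted(pairs)


def mountain_profile(structure):
    """Height at each position = number of base pairs enclosing or including it."""
    heights, depth = [], 0
    for ch in structure:
        if ch == "(":
            depth += 1
            heights.append(depth)
        elif ch == ")":
            heights.append(depth)
            depth -= 1
        else:
            heights.append(depth)
    return heights


def paired_fraction(structure):
    """Fraction of nucleotides that take part in a base pair."""
    return 2 * len(parse_dot_bracket(structure)) / len(structure)


toy = "((..((...))..))"  # hypothetical toy structure, not the vaccine mRNA
print(parse_dot_bracket(toy))
print(mountain_profile(toy))
print(f"paired fraction = {paired_fraction(toy):.2f}")
```

Where the profiles of two predicted structures (for example, MFE and centroid) coincide, the two predictions agree on the nesting of base pairs in that region, which is how overlapping lines in a mountain plot are read.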

Fig. 3

The predicted secondary structure of the protein and mRNA of the screened vaccine construct (hMPV-beta1): (a)-helices, sheets, turns, and coils in the secondary protein structure of the construct (hMPV-beta1), (b)-mRNA secondary structure (MFE-SE-str = minimum free energy secondary structure and c-SE-str = centroid secondary structure), (c)-mountain plot representing the minimum free energy, thermodynamic ensemble, and centroid structures of the mRNA, and entropy plot illustrating the positional entropy for each position in the mRNA. MFE (blue) = minimum free energy structure, showing the most thermodynamically stable configuration; PF (green) = partition function, representing the statistical ensemble of structures contributing to RNA stability; Centroid (red) = centroid structure, the most representative structure of the ensemble.

Prediction of 3D structures for chimeric vaccine construct (hMPV-beta1)

The chimeric vaccine construct’s stable and well-optimized 3D structure is vital for evaluating its molecular interactions with host immune receptor proteins. This structural integrity plays a significant role in ensuring proper antigen recognition and enhancing immune activation. The 3D structure of hMPV-beta1 was refined with GalaxyRefine; the top GalaxyRefine model (Rama favored 95.3%), visualized in PyMOL, is illustrated in Fig. 4a. The top five models were evaluated using three quantitative measures: Root Mean Square Deviation (RMSD), which differentiates molecular conformations; Global Distance Test-High Accuracy (GDT-HA), which assesses the closeness between the predicted protein structure and the reference structure; and the Rama favored score. The findings are summarized in Table 1 of Supplementary file S1. The ProSA Z-score of the 3D model of hMPV-beta1 was −8.38, which supported its structural stability and reliability; moreover, the position of the query model in the graph suggested that its structural quality is comparable to that of reliable experimental structures (Fig. 4b). The protein structure was further validated by analyzing the backbone dihedral angles phi (Φ) and psi (Ψ) of the amino acid residues. The Ramachandran plot placed 97.7% of residues in the favored (91.8%) or allowed (5.9%) regions, suggesting a high-quality 3D model of hMPV-beta1 (Fig. 4c). A small percentage of residues fell in the disallowed regions (owing to flexible loops or special conformations), which is expected and acceptable within limits. The overall structure was therefore reliable and suitable for docking and molecular interaction analysis. Moreover, the stability of the structure was substantiated by the knowledge-based energy as a function of sequence position (Fig. 4d). The knowledge-based energy plot revealed that most sequence positions have negative energy values, indicating that the structure is generally stable, particularly in regions with strong negative peaks. Together, the validation findings indicated that the 3D structure exhibited substantial stability, making it suitable for further analysis.
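As a minimal sketch of how the backbone dihedral angles underlying a Ramachandran plot can be computed, the Python snippet below evaluates the torsion angle defined by four atomic positions (for Φ: C(i−1), N(i), Cα(i), C(i); for Ψ: N(i), Cα(i), C(i), N(i+1)). The function names and the toy coordinates are illustrative assumptions and do not represent the validation server's implementation.

```python
import math


def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dihedral(p0, p1, p2, p3):
    """Torsion angle (degrees) about the p1-p2 bond for four 3D points."""
    b0 = _sub(p0, p1)
    b1 = _sub(p2, p1)
    b2 = _sub(p3, p2)
    norm = math.sqrt(_dot(b1, b1))
    b1 = tuple(x / norm for x in b1)  # unit vector along the central bond
    # Components of b0 and b2 perpendicular to the central bond.
    v = _sub(b0, tuple(_dot(b0, b1) * x for x in b1))
    w = _sub(b2, tuple(_dot(b2, b1) * x for x in b1))
    x = _dot(v, w)
    y = _dot(_cross(b1, v), w)
    return math.degrees(math.atan2(y, x))


# Hypothetical toy coordinates: a planar cis and a planar trans arrangement.
cis = dihedral((1, 0, 0), (0, 0, 0), (0, 0, 1), (1, 0, 1))
trans = dihedral((1, 0, 0), (0, 0, 0), (0, 0, 1), (-1, 0, 1))
print(f"cis: {cis:.1f} deg, trans magnitude: {abs(trans):.1f} deg")
```

Each residue's (Φ, Ψ) pair computed this way is one point on the Ramachandran plot; the favored, allowed, and disallowed percentages quoted above are the fractions of such points falling in the corresponding regions.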

Fig. 4

The predicted three-dimensional structure of the screened vaccine construct (hMPVbeta1) and validation of the 3-D structure: (a)-Robetta-predicted and Galaxy-refined 3-D model of the hMPVbeta1 construct, (b)-ProSA-web-based validation plot with Z-score, (c)-Ramachandran plot, and (d)-energy validation plot of the 3-D model of the hMPVbeta1 construct. The light blue region in section b of the figure corresponds to high-quality protein structures solved using X-ray crystallography, and the dark blue region represents high-quality structures solved using NMR. The two lines in section d of the figure represent the energy calculated with different window sizes (10 and 40), which smooth out the fluctuations over short or long sequence segments. RF = Ramachandran favored, FR = favored region, and AR = allowed region.

Advanced computational docking analysis of hMPVbeta1 interactions with Toll-like receptors and major histocompatibility complex molecules

A comprehensive investigation into the molecular interactions between hMPVbeta1 and human Toll-like receptors (TLRs) is essential for understanding their role in eliciting a stable, effective, and sustained immune response. Docking of the proposed chimeric vaccine with MHC molecules is likewise essential for assessing its immunogenic potential. The vaccine construct was docked with Toll-like receptor 4 (TLR4) and Toll-like receptor 2 (TLR2) to determine the receptor binding dynamics. Figure 5a and b illustrate the molecular interaction events (at the residue and atomic levels) of the TLR4-hMPVbeta1 and TLR2-hMPVbeta1 docking analyses, respectively. Figure 5 depicts the TLR-vaccine complexes, the TLR-vaccine interface, the interacting residues at the interface, and the hydrogen bonds and salt bridges with their bond lengths. For TLR2-vaccine docking, cluster 1 (51 members) out of 29 clusters, with the lowest energy score of −1056.5 kcal/mol and a weighted score of −1027.9 kcal/mol, was selected for further analysis owing to its superior average binding energy, whereas for TLR4-vaccine docking, cluster 0 (61 members), with the lowest energy score of −982.9 kcal/mol and a weighted score of −784.9 kcal/mol, was chosen for downstream analysis. For MHC class I-vaccine docking, cluster 0 (69 members), with the lowest energy score of −820.4 kcal/mol and a weighted score of −697.4 kcal/mol, was selected, whereas for MHC class II-vaccine docking, cluster 0 (52 members), with the lowest energy score of −944.5 kcal/mol and a weighted score of −799.1 kcal/mol, was prioritized for further analysis.

Fig. 5

Interaction events at the interface of the different protein-protein docked complexes. (a)-TLR4-hMPVbeta1 interaction, and (b)-TLR2-hMPVbeta1 interaction. TLR = Toll-like receptor, V = chimeric vaccine construct, chain A = TLR, chain B = vaccine’s chain, PC = polar contacts, Hbs = hydrogen bonds, SBs = salt bridges. Red balls represent the atoms of interacting residues of the vaccine construct, and purple balls represent the atoms of the interacting residues of the TLRs.

There were 36 residues of TLR4 and 30 residues of the hMPVbeta1 vaccine interacting, involving n = 8 salt bridges, n = 16 hydrogen bonds, and n = 214 non-bonded contacts with a total interface of 1610–1717 Ų, suggesting strong rigid docking interaction (Table 4). There were 30 residues of TLR2 and 27 residues of the hMPVbeta1 vaccine interacting, involving n = 17 hydrogen bonds and n = 165 non-bonded contacts with a total interface of 1508–1567 Ų, suggesting stable docking interaction (Table 4). There were 26 residues of TLR4 and 29 residues of the hMPVbeta1 vaccine interacting, involving n = 3 salt bridges, n = 15 hydrogen bonds, and n = 151 non-bonded contacts with a total interface of 1430–1367 Ų, supporting strong rigid docking interaction (Table 4). The complete profile of interacting residues with polar contacts in the TLR4-hMPVbeta1, TLR2-hMPVbeta1, and hMPVbeta1 is provided in Figs. 1 and 2 of Supplementary file S2. The interacting residues, atoms, chemical bonds, bond lengths, and interface areas of the MHC class I-hMPVbeta1 and MHC class II-hMPVbeta1 complexes are illustrated in Fig. 6a (vaccine complex with MHC class I) and Fig. 6b (vaccine complex with MHC class II), respectively. In total (chain A and chain B), n = 15 MHC class I residues and n = 22 residues of the hMPVbeta1 vaccine interacted, involving n = 3 salt bridges, n = 8 hydrogen bonds, and n = 104 non-bonded contacts (Table 4). In total (chain A and chain B), n = 33 MHC class II residues and n = 40 residues of the hMPVbeta1 vaccine interacted, involving n = 7 salt bridges, n = 13 hydrogen bonds, and n = 277 non-bonded contacts (Table 4), indicating stable and rigid binding.

Fig. 6

Interaction events at the interface of the different protein-protein docked complexes. (a)-MHC class I-hMPVbeta1 interaction and (b)-MHC class II-hMPVbeta1 interaction. V = chimeric vaccine construct. Chain A (violet) and chain B (red) are the two chains of the MHC molecules. Chain C (sand color) represents the chain of the vaccine construct. The red, blue, and grey dashes represent the salt bridges, hydrogen bonds, and non-bonded contacts between the interacting residues of the MHC-vaccine molecules.

Table 4 Description of the interface molecular events at the atomic level in the TLRs-hMPV beta1 and MHCs-hMPV beta1 docked complexes.

NMA and MDS of hMPVbeta1 with TLRs and MHCs

NMA was performed to assess the initial stability, flexibility, and dynamic behavior of the TLR4-hMPVbeta1, TLR2-hMPVbeta1, MHC-I-hMPVbeta1, and MHC-II-hMPVbeta1 complexes (Fig. 7). The deformability plot for the TLR4-hMPVbeta1 complex indicated that most of the structure was stable and rigid, with low flexibility (Fig. 7a). Peaks at certain sequence positions highlight a few localized regions of higher flexibility, which could correspond to functional or dynamic areas such as binding sites, loops, or hinges and could be critical for dynamic interactions such as binding or conformational changes. Moreover, the eigenvalue (1.00538 × 10⁻⁴) distribution indicated a balance between flexibility and stability, which is critical for biological activity and structural integrity: the low first eigenvalue indicates that the structure allows significant collective motions, which are essential for functional flexibility such as binding interactions or conformational changes.

In contrast, higher eigenvalues in later modes suggested localized rigidity, likely corresponding to stable structural regions (Fig. 7). Additionally, the deformability graphs for TLR2-hMPVbeta1 (Fig. 7b), MHC I-hMPVbeta1 (Fig. 7c), and MHC II-hMPVbeta1 (Fig. 7d) suggested that most of each structure was stable and rigid, with a few localized regions showing higher flexibility. The NMA results for the MHC class I-hMPVbeta1 and MHC class II-hMPVbeta1 complexes supported their stability and their respective roles in an adaptive immune response.

Fig. 7

NMA of TLRs-vaccine and MHCs-vaccine complexes. (a) NMA results of TLR4-hMPVbeta1, (b) NMA results of TLR2-hMPVbeta1, (c) NMA results of MHC class I-hMPVbeta1, and (d) NMA results of MHC class II-hMPVbeta1 complexes. Left-side plot: beta factor/mobility graph (flexibility/deformability value vs. atomic index). Middle plot: comparison between the Normal Mode Analysis (NMA) and PDB-derived flexibility data for the molecular complex; the overlay of NMA (red) and PDB (black) provides insight into the consistency between computational and experimentally derived structural flexibility. Right-side plot: eigenvalues obtained from NMA, plotted against the mode indices.

In the MDS, the RMSDs of all TLRs-vaccine and MHCs-vaccine complexes flattened after an initial rise (~20–30 ns), indicating conformational adjustment followed by complex stability. Thereafter, only limited-range deviations (TLR2-vaccine: ~0.5–0.6 nm, TLR4-vaccine: ~0.3–0.4 nm, MHC-I-vaccine: ~0.6–0.7 nm, and MHC-II-vaccine: ~0.5–0.53 nm) were observed, without large deviation or significant drift (Fig. 8a). The RMSF profiles of the TLRs-vaccine and MHCs-vaccine complexes exhibited only localized flexibility at the N- and C-termini, with fluctuation values ranging from ~0.15 to 1.0 nm. Otherwise, fluctuations remained broadly within a stable range (TLR2-vaccine: ~0.15–0.5 nm, TLR4-vaccine: ~0.15–0.4 nm, MHC-I-vaccine: ~0.2–0.4 nm, and MHC-II-vaccine: ~0.14–0.4 nm) without signs of significant residue-level instability (Fig. 8b). The radius of gyration (Rg) of the TLRs-vaccine and MHC-II-vaccine complexes stabilized immediately after the simulation started, whereas that of the MHC-I-vaccine complex stabilized after a minor initial fluctuation. The Rg of the TLRs-vaccine and MHCs-vaccine complexes remained within a narrow range (TLR2-vaccine: ~3.47–3.53 nm, TLR4-vaccine: ~3.1–3.14 nm, MHC-I-vaccine: ~3.39–3.44 nm, and MHC-II-vaccine: ~2.86–2.89 nm), indicating good compactness of the complexes without considerable collapse or expansion (Fig. 8c). Moreover, the hydrogen bond trajectories remained consistent over the 100 ns of simulation (TLR2–vaccine: ~650–700 bonds, TLR4–vaccine: ~680–700 bonds, MHC-I-vaccine: ~540–570 bonds, and MHC-II-vaccine: ~540–555 bonds), demonstrating sustained hydrogen bonding and structural integrity in the TLRs-vaccine and MHCs-vaccine complexes (Fig. 8d). Furthermore, the stabilized solvent-accessible surface area (SASA) (Fig. 8e) and SASA volume (Fig. 8f) during the simulation, with minor localized changes, indicated the structural compactness and conformational stability of the TLRs-vaccine and MHCs-vaccine complexes without considerable unfolding, expansion, or collapse.
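For readers less familiar with the trajectory metrics reported above, the following Python sketch shows the definitions of RMSD and radius of gyration for a set of atomic coordinates. It is a simplified illustration under stated assumptions: the function names and toy coordinates are hypothetical, no least-squares superposition onto the reference is performed before computing the RMSD (trajectory analysis tools typically fit the structures first), and the snippet is not the analysis pipeline used in this study.

```python
import math


def rmsd(coords_a, coords_b):
    """Root mean square deviation between two equally sized coordinate sets.
    Assumes the structures are already superimposed (no fitting is done here)."""
    if len(coords_a) != len(coords_b):
        raise ValueError("coordinate sets must have the same number of atoms")
    squared = sum(
        sum((a - b) ** 2 for a, b in zip(atom_a, atom_b))
        for atom_a, atom_b in zip(coords_a, coords_b)
    )
    return math.sqrt(squared / len(coords_a))


def radius_of_gyration(coords, masses=None):
    """Mass-weighted root mean square distance of atoms from the center of mass."""
    if masses is None:
        masses = [1.0] * len(coords)  # equal weights if no masses are given
    total = sum(masses)
    center = [
        sum(m * atom[k] for m, atom in zip(masses, coords)) / total
        for k in range(3)
    ]
    squared = sum(
        m * sum((atom[k] - center[k]) ** 2 for k in range(3))
        for m, atom in zip(masses, coords)
    )
    return math.sqrt(squared / total)


# Hypothetical toy coordinates in nm, for illustration only.
reference = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
frame = [(0.0, 0.0, 0.1), (1.0, 0.0, 0.1)]
print(f"RMSD = {rmsd(reference, frame):.3f} nm")
print(f"Rg = {radius_of_gyration(frame):.3f} nm")
```

Evaluated frame by frame along a trajectory, a flat RMSD curve and a narrow Rg range are what the text above interprets as conformational stability and compactness.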

Moreover, MM/PBSA energies for the full 100 ns and the last 20 ns were computed to assess the binding energy of the TLR2-, TLR4-, MHC-I-, and MHC-II-vaccine complexes. The binding free energy (ΔGbind) of the TLR2–vaccine complex estimated for the last 20 ns and the full 100 ns of simulation was −154.38 kcal/mol (complex: −20574.32 kcal/mol; TLR2: −13902.99 kcal/mol; vaccine: −6516.95 kcal/mol) and −131.62 kcal/mol (complex: −20461.75 kcal/mol; TLR2: −13863.66 kcal/mol; vaccine: −6466.47 kcal/mol), respectively. For the TLR4–vaccine complex, ΔGbind during the last 20 ns was −163.96 kcal/mol, based on complex, TLR4, and vaccine energies of −20177.21, −13491.67, and −6521.57 kcal/mol, whereas over the 100 ns simulation, ΔGbind was −161.39 kcal/mol (complex: −20131.96, TLR4: −13479.05, and vaccine: −6491.52 kcal/mol).

In addition, for the MHC-II–vaccine complex, the binding free energy for the last 20 ns was −181.70 kcal/mol, with −17486.53, −10815.34, and −6489.48 kcal/mol for the complex, MHC-II, and vaccine, respectively. Over the complete 100 ns simulation, ΔGbind was −181.70 kcal/mol, derived from −17486.53, −10815.34, and −6489.48 kcal/mol. Similarly, the MM/PBSA binding free energy of the MHC-I-vaccine complex for the last 20 ns was −103.33 kcal/mol (complex: −11767.48 kcal/mol; MHC class I: −8927.83 kcal/mol; vaccine construct: −2736.33 kcal/mol), whereas that for the entire 100 ns trajectory was −97.39 kcal/mol, derived from the complex (−11747.71 kcal/mol), MHC-I (−8907.81 kcal/mol), and vaccine (−2742.51 kcal/mol). The MM/PBSA results indicate strong and stable binding affinities of the vaccine with TLR2, TLR4, MHC-I, and MHC-II, as reflected by consistently negative ΔGbind values, suggesting effective and persistent receptor engagement throughout the simulation and supporting the vaccine’s immunogenic potential.
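The binding free energies above follow the usual end-state relation, ΔGbind = Gcomplex − (Greceptor + Gvaccine). The short Python sketch below recomputes ΔGbind from the component energies quoted in the text as an arithmetic cross-check; the function and variable names are illustrative assumptions, and the snippet is not the MM/PBSA software itself. Small differences of about 0.01 kcal/mol from the reported values are consistent with rounding of the component energies.

```python
def delta_g_bind(g_complex, g_receptor, g_ligand):
    """End-state binding free energy: G(complex) - (G(receptor) + G(ligand))."""
    return g_complex - (g_receptor + g_ligand)


# (complex, receptor, vaccine, reported dG_bind), all in kcal/mol, as quoted in the text.
reported = {
    "TLR2, last 20 ns": (-20574.32, -13902.99, -6516.95, -154.38),
    "TLR2, 100 ns": (-20461.75, -13863.66, -6466.47, -131.62),
    "TLR4, last 20 ns": (-20177.21, -13491.67, -6521.57, -163.96),
    "TLR4, 100 ns": (-20131.96, -13479.05, -6491.52, -161.39),
    "MHC-II, last 20 ns": (-17486.53, -10815.34, -6489.48, -181.70),
    "MHC-I, last 20 ns": (-11767.48, -8927.83, -2736.33, -103.33),
    "MHC-I, 100 ns": (-11747.71, -8907.81, -2742.51, -97.39),
}

recomputed = {}
for label, (g_cpx, g_rec, g_vac, g_reported) in reported.items():
    recomputed[label] = delta_g_bind(g_cpx, g_rec, g_vac)
    print(f"{label}: recomputed {recomputed[label]:.2f}, reported {g_reported:.2f} kcal/mol")
```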

Furthermore, the total GROMACS energy (TLR2-vaccine: ~−1.59 × 10⁶ kJ/mol, TLR4-vaccine: ~−1.2 to −1.23 × 10⁶ kJ/mol, MHC-I-vaccine: ~−1.59 × 10⁶ kJ/mol, and MHC-II-vaccine: ~−1.5 × 10⁶ kJ/mol) (Fig. 9a), kinetic energy (TLR2-vaccine: ~2.65 × 10⁵ kJ/mol, TLR4-vaccine: ~2.65 × 10⁵ kJ/mol, MHC-I-vaccine: ~2.65 × 10⁵ kJ/mol, and MHC-II-vaccine: ~2.5 × 10⁵ kJ/mol) (Fig. 9b), and short-range electrostatic energy (TLR2-vaccine: ~−2.48 × 10⁶ kJ/mol, TLR4-vaccine: ~−1.77 × 10⁶ kJ/mol, MHC-I-vaccine: ~−1.77 × 10⁶ kJ/mol, and MHC-II-vaccine: ~−1.74 × 10⁶ kJ/mol) (Fig. 9c) remained consistent throughout the 100 ns simulation, suggesting the stability of these macromolecular structures.

Fig. 8

Depiction of RMSD, RMSF, radius of gyration, hydrogen bond formation, SAS area, SAS volume of TLR2-vaccine, TLR4-vaccine, MHC-I-vaccine and MHC-II-vaccine (a)-RMSD, (b)-RMSF, (c)-radius of gyration, (d)-hydrogen bond formation, (e)-total solvent accessible area, and (f)-surface accessible volume.

Fig. 9

Demonstration of Gromacs total energy, kinetic energy, short-range electrostatic energy of TLR2-vaccine, TLR4-vaccine, MHC-I-vaccine, and MHC-II-vaccine (a)-Gromacs total energy, (b)-Kinetic energy, and (c)-Short-range electrostatic energy.

Assessment of the potency of the hMPVbeta1 by immunological simulation

The simulation of the initial dose/first injection predicted that the antigen count peaked around Day 1 at approximately 6.9 × 10⁵ and then declined rapidly by around Day 7, consistent with activation of the immune system. The early IgM peak (5.6 × 10³) highlighted the primary immune response, whereas the later rise in IgG levels (especially IgG1 and IgG2) (1.5 × 10³) indicated the induction of a comprehensive, robust, and specific secondary immunological response. The high levels of IgM plus IgG antibodies (7.4 × 10³) suggested effective immunological memory formation, ensuring rapid and efficient antigen clearance upon subsequent exposures (Fig. 10a). Interferon-gamma (IFN-γ) showed the most pronounced peak around Day 15 (4.2 × 10⁵). Interleukin-2 (IL-2) peaked early, around Day 10 (2 × 10⁵ ng/ml), at a level suitable for promoting T-cell proliferation and differentiation; its rapid decline afterward suggests a tightly regulated response (Fig. 10c). The cytokine and interleukin simulation results showed a coordinated immunological response, in which early activators (such as IL-2) initiated the response and IFN-γ later sustained and enhanced immune functions (Fig. 10c).

Moreover, the analysis of the second/final booster dose demonstrated the hMPVbeta1 vaccine’s capability to induce sustained primary and secondary immune responses (IRs), with rapid antigen clearance and increased antibody production after booster exposure (Fig. 10b). After the second booster dose (Day 60), the antigen count showed a rapid but smaller peak than after the primary exposure, indicating that the immune system was primed and quickly recognized and eliminated the antigen. IgM titers, which rose early during the primary response, peaked around Day 65 (0.9 × 10⁵) (Fig. 10b). Combined IgG1 + IgG2 and IgG1 levels showed a significant secondary rise after the booster, with IgG1 increasing more prominently, indicating a robust secondary response driven by memory B cells (Fig. 10b). The booster dose amplified the humoral (antibody-mediated) immune response, as supported by the sharp increase in IgG antibodies (especially IgG1 and IgG2). Following the second booster dose, the cytokine dynamics highlighted the interplay of pro-inflammatory and regulatory cytokines, with IFN-γ (4.0 × 10⁵ ng/ml) and IL-2 (4.9 × 10⁵) at levels essential for driving the immune response in a regulated manner (Fig. 10d). The simulation demonstrated the booster dose’s effectiveness in strengthening and prolonging the immunological response. The effects of the hMPVbeta1 chimeric vaccine on the cell populations involved in both arms of the IR are illustrated in Supplementary file S3 (Fig. 1: after the initial dose, Fig. 3: after the first booster dose, Fig. 4: after the second/last booster dose). Additionally, the immune response and cytokine profile following the booster doses are shown in Fig. 2 of Supplementary file S3.

Fig. 10

Immune simulation results. (a)-Simulated immune response over time (in days); the black curve shows the antigen count, which peaks rapidly around Day 1 and then declines sharply as the immune response eliminates the antigen; (b)-immune response after the second/final booster dose at Day 60; (c)-cytokine production after the initial dose; (d)-cytokine dynamics after the second (final, at Day 60) booster dose.

Codon optimization followed by gene cloning

The efficiency of vaccine construct expression is crucial for successful vaccine development. This study evaluated hMPVbeta1 protein expression in an appropriate bacterial strain (E. coli K12) (Fig. 1; Supplementary file S4). Because codon optimization is essential for efficient protein synthesis, it was performed before the protein expression analysis, and the codon usage was adapted for peak expression in the bacterial (E. coli) system. The adjusted and refined cDNA achieved a CAI of 1.0 and an optimal GC content (50.91%) (Fig. 1; Supplementary file S4). Two restriction sites (EcoRI and XhoI) were incorporated at the N-terminal and C-terminal regions of the vaccine construct (Fig. 11). Finally, the optimized hMPVbeta1 sequence was cloned in silico into the pET28a (+) expression vector using SnapGene, yielding a recombinant plasmid of 6478 base pairs (Fig. 11).
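Two of the checks described above are easy to illustrate in code: the GC content of an optimized coding sequence and a scan for EcoRI (GAATTC) or XhoI (CTCGAG) recognition sites, which should not occur inside an insert that is to be cloned with those enzymes. The Python sketch below is a hedged illustration; the function names and the short toy sequence are assumptions and do not correspond to the optimized hMPVbeta1 cDNA or to the tools used in this study.

```python
RESTRICTION_SITES = {
    "EcoRI": "GAATTC",
    "XhoI": "CTCGAG",
}


def gc_content(dna):
    """Percentage of G and C bases in a DNA sequence."""
    dna = dna.upper()
    return 100.0 * (dna.count("G") + dna.count("C")) / len(dna)


def find_sites(dna, site):
    """0-based start positions of every occurrence of a recognition site."""
    dna, site = dna.upper(), site.upper()
    return [i for i in range(len(dna) - len(site) + 1) if dna.startswith(site, i)]


def flank_with_sites(insert, five_prime="EcoRI", three_prime="XhoI"):
    """Add recognition sites to both ends, refusing inserts that already contain them."""
    for name in (five_prime, three_prime):
        if find_sites(insert, RESTRICTION_SITES[name]):
            raise ValueError(f"insert contains an internal {name} site")
    return RESTRICTION_SITES[five_prime] + insert.upper() + RESTRICTION_SITES[three_prime]


# Hypothetical toy coding sequence, for illustration only.
toy_insert = "ATGGCTGCAGCCAAAGGTCCGGGTCCGGGCTAA"
print(f"GC content = {gc_content(toy_insert):.2f}%")
print(flank_with_sites(toy_insert))
```

A GC content near 50%, as reported for the optimized sequence, falls in the range generally regarded as favorable for expression in E. coli.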

Fig. 11

Identification of restriction sites and cloning of hMPVbeta1 vaccine construct into the pET28a (+) expression vector for E. coli.

Discussion

hMPV has emerged as a major respiratory pathogen, gaining increasing attention due to its growing impact on public health5. Severe cases of hMPV predominantly affect pediatric, elderly, and immunocompromised populations, while cases of hMPV-induced pneumonia in immunocompetent adults are less frequently reported. One study reported n = 155 × 10³ pediatric hospitalizations owing to acute respiratory tract infections (ARTI) and observed that severe hMPV infections were mainly found in younger patients78. Although severe outcomes have been documented, studies on the effects of hMPV in immunocompetent adults are still limited79. Severe hMPV-related pneumonia and recent outbreaks affecting even immunocompetent adults highlight the pathogen’s capacity to cause substantial morbidity in populations previously considered at low risk, underscoring the urgent need for targeted therapies and vaccines to manage patients clinically and provide immune protection against hMPV infection.

hMPV infections may lead to severe or fatal outcomes in immunocompromised individuals, with reported pneumonia-related mortality rates ranging from 10% to 80%80. A recent study highlighted the serious impact of hMPV infections, showing that nearly half of the patients developed lower respiratory tract infections, and about one in three required intensive care. While the overall hospital death rate was around 10%, this rose to 23% for those admitted to the ICU. The findings also emphasized that patients with underlying cancers were especially at risk, as their weakened immune systems made them more vulnerable to severe and potentially fatal outcomes81. Unlike well-established vaccines for viruses such as influenza or COVID-19, the development of a vaccine for hMPV remains in its early stages82. A combination mRNA vaccine (mRNA-1653) against hMPV and Parainfluenza virus 3 (PIV3) has entered the preclinical and investigational stages and boosted antibody levels, supporting the development of mRNA-based vaccines83. Although initial research has shown some encouraging signs, no candidate has yet moved beyond preclinical testing, and there are currently no approved vaccines or specific treatments available for hMPV. Past efforts have struggled with hurdles such as the virus’s ability to escape immune detection and difficulties in generating lasting immune protection84. Moving forward, innovative strategies in hMPV vaccine development must overcome these challenges to ensure both safety and long-term effectiveness. The candidate proposed in this research could help overcome the current challenges, subject to experimental validation.

Immunoinformatics-based vaccines, enhanced by machine learning methods, offer significant advantages over conventional vaccines85. A well-known success story of reverse vaccinology in mRNA vaccine development is the Pfizer–BioNTech COVID-19 vaccine (BNT162b2), which was developed by targeting the spike protein and highlights the significance of combining immunoinformatics and reverse vaccinology to combat infectious outbreaks, including emerging zoonoses86. This approach allows the development of vaccines that focus exclusively on key target components, whereas traditional vaccines typically comprise a wider array of antigens, some of which may not be optimal for triggering a robust immune response. Furthermore, machine learning accelerates the identification of potential vaccine candidates, a critical factor in rapidly addressing emerging infectious diseases and containing outbreaks efficiently87. During the pandemic, this research method proved advantageous over traditional methods in terms of speed, adaptability, and precision in vaccine development against SARS-CoV-2; it could therefore considerably shorten the vaccine development timeline and provide safe and effective candidates, although its findings still require experimental and clinical validation88.

Four essential hMPV proteins were chosen to identify potential epitopes and devise effective vaccine constructs. The fusion (F) protein of hMPV was prioritized as the principal target because of its crucial function in viral entry into host cells through membrane fusion. Targeting and neutralizing this protein can prevent viral infection by blocking entry89. Moreover, the F protein is highly conserved across various hMPV strains, making it an optimal target for developing vaccines with broad-spectrum effectiveness, providing protection against multiple hMPV variants/subvariants90. The G protein binds to host cells and modulates the immune response, a critical process in viral infection that makes it an excellent target for vaccine development to prevent hMPV infection91. The M protein is vital for viral assembly and budding, and its conservation across hMPV strains renders it a favorable target for broad-spectrum vaccine design92. The SH protein modulates host immune responses and influences immune signaling pathways93; targeting it might enhance antiviral immunity. The G, M, and SH proteins of hMPV have thus also been explored as potential vaccine targets, although they are less commonly prioritized than the F protein94. Whereas a previous study exploited only the glycoproteins of the currently emerged genotypes A2a, A2b, and A2c to design an mRNA-based vaccine95, the candidate proposed in this research targets multiple proteins from recently emerging strains to design an mRNA vaccine that could provide broad coverage against various strains.

A total of 24 top-ranked epitopes (eight each of LBEs, CTLEs, and HTLEs; two of each type from each target protein) were incorporated into the two vaccine constructs using suitable linkers that could improve protein folding and antigen presentation, stabilize the protein, and enhance the expression of the multiepitope protein96. The human beta-defensin (hBD) and ribosomal protein adjuvants were attached to the vaccine constructs using EAAAK linkers73, and the resulting constructs were analyzed further. Human beta-defensin (hBD)90 and ribosomal protein97 adjuvants can significantly enhance vaccine-induced immunity, making them valuable vaccine components.

The physicochemical properties of the hMPVbeta1 construct were calculated as follows: a theoretical pI of 9.92, an AI of 68.75, a GRAVY of −0.260, and an instability index (Ii) of 31.24. These values suggest that the construct is suitable for consideration as a potential vaccine candidate, as reported in a previous study98. The negative GRAVY value (below the zero threshold) signified the hydrophilic behavior of the hMPVbeta1 construct and its appropriate solubility. An Ii of 31.24 (below the instability threshold of 40) suggests that the vaccine model is stable. Additionally, the aliphatic index of 68.75, falling within the ideal range of 40–85, supports the suitability of the hMPVbeta1 construct in terms of thermostability and solubility99. The hMPVbeta1 3D structure was refined and then validated to assess the quality of the construct. The Ramachandran plot for hMPVbeta1 showed that 97.7% of residues were found in the favored (91.8%) or allowed (5.9%) regions, implying that the 3D structure of the hMPVbeta1 construct was of high quality100.

Molecular docking, NMA, and MDS were used to examine the molecular interactions between the hMPVbeta1 construct and TLRs101. The hMPVbeta1 construct was also docked with MHC class I and II molecules to assess binding affinity. A substantial number of polar interactions, encompassing hydrogen bonds (Hbs) and salt bridges (SBs), along with non-bonded (NB) interactions among the interfacial residues of the hMPVbeta1 construct and the TLRs or MHCs, indicated a probable strong elicitation of both arms of the immune response, along with cytokine production102,103. Moreover, the eigenvalues from NMA of the complexes (TLR4-hMPVbeta1, TLR2-hMPVbeta1, MHC class I-hMPVbeta1, and MHC class II-hMPVbeta1) indicated that the binding interactions of the hMPVbeta1 construct with TLRs and MHCs were strong and stable, which is supported by findings from other studies31. Shiyang et al. recently proposed a multiepitope vaccine (MEV) against hMPV and reported TLRs-vaccine docking; the current study additionally reports MHCs-vaccine docking to provide deeper insight into the construct. In addition, the lowest energy score for TLR2-vaccine docking was comparable, and that for TLR4 was better, than in the previous study104. Moreover, a study published in 2025 on an MEV for hMPV targeting the G protein also reported docking of the vaccine only with TLRs, not with MHC molecules, and thus offers less comprehensive insight than the current research95. The normal mode eigenvalue reflects motion stiffness, with smaller eigenvalues suggesting more easily deformable structures, supporting the view that low eigenvalues signal favorable and stable vaccine-receptor conformations105. Overall, the favorable deformability results and low eigenvalues suggested stable interactions with localized flexibility in the TLRs/MHCs-vaccine complexes106.

MDS provides insights into the stability of molecular complexes, such as the TLRs-vaccine and MHCs-vaccine complexes, under physiological conditions107. The RMSDs of the TLRs-vaccine and MHCs-vaccine complexes stabilized at around 20–30 ns of the 100 ns simulation, suggesting structural stability and integrity of all the complexes, which is consistent with the findings of other studies90,107. In addition, compared with a recent mRNA vaccine MDS study (~100 ns MD yielding RMSDs of 0.85–0.86 nm in vaccine–TLR complexes and values within 0.2–0.7 nm after stabilization), our vaccine-TLR complexes showed better stability with lower RMSD108. RMSF, a critical stability parameter, revealed that residues remained relatively rigid, with small fluctuations indicating stable conformations over time, which corroborates the results of other studies90,108. Similarly, the MHCs-vaccine complexes remained stable for 100 ns, with appropriate RMSD and RMSF profiles. The radius of gyration, which measures the shape and size (compactness) of molecules, and the hydrogen bond count, which assesses interaction, demonstrated the compactness of and strong intermolecular interactions within the TLRs-vaccine and MHCs-vaccine complexes, which was further supported by the MM/PBSA ΔGbind observed for the TLR2/TLR4-vaccine and MHCs-vaccine complexes109,110. The GROMACS total energy, short-range Coulombic energy, and kinetic energy also indicated the structural integrity, the electrostatic interactions between the TLRs/MHCs and the vaccine, and the thermal stability of the TLRs/MHCs-vaccine complexes111. Moreover, the stable total SAS area and SAS volume of the TLRs/MHCs-vaccine complexes supported the stability of the molecular complexes needed for antigen presentation112,113. Overall, the results of NMA and MDS indicate that the vaccine construct forms robust, stable, and compact interactions with immune receptors, supporting its potential immunogenic efficacy.

An immune simulation was performed on the prioritized hMPVbeta1 vaccine construct to assess the induced cellular and humoral immune responses and thereby evaluate its immunogenic potential114. The high levels of IgG antibodies suggested effective immunological memory formation, which would support rapid and efficient antigen clearance upon subsequent exposures. Interferon-gamma (IFN-γ) showed the most pronounced peak, indicating the construct’s potential to activate cellular immunity, especially by supporting cytotoxic T-cell responses and macrophage activation115. Interleukin-2 (IL-2) peaked early, reflecting the proposed vaccine’s potential to promote T-cell proliferation and differentiation during the initial immune activation phase116. The cytokine and interleukin simulation results showed a coordinated response, in which early activators (such as IL-2) initiated the response and IFN-γ later sustained it. Moreover, the booster dose analysis demonstrated the capability of the hMPVbeta1 vaccine to induce strong primary and secondary immune responses, which is in line with previous findings117. The early rise in IgM titers during the primary response, followed by their decline as IgG titers rose, reflected isotype switching118. Combined IgG1 + IgG2 and IgG1 levels showed a pronounced secondary response after the booster, with IgG1 increasing more prominently, indicating a robust secondary response driven by memory B cells119. The cytokine production results highlighted the interplay of pro-inflammatory and regulatory cytokines, which play central roles in driving cellular immunity and ensuring a robust response to the proposed construct. The simulated immune responses to the construct indicated strong IFN-γ and IL-2 activity, which aligns with the elevated levels of these markers seen during hMPV infections120. These findings suggested that the vaccine may trigger a balanced immune response that potentially resembles the response to natural infection.
The population coverage analysis revealed approximately 100% global coverage, indicating the vaccine’s potential to target diverse HLA alleles and provide broad protection against hMPV, which corroborates the population coverage reported in another study90. The full-length hMPVbeta1 construct was cloned in silico for subsequent expression in a bacterial host (E. coli), because in silico gene cloning is imperative for predicting the outcome of the designed construct before its validation in the laboratory121.
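
To make the idea of population coverage concrete, the sketch below estimates the fraction of individuals who carry at least one HLA allele restricting an epitope in the construct. It is a simplified illustration, not the tool or algorithm used in this study: it assumes Hardy-Weinberg proportions within each locus and independence between loci, and the function names and allele frequencies are hypothetical.

```python
def locus_coverage(covered_allele_freqs):
    """Fraction of individuals carrying at least one covered allele at one locus.

    Under Hardy-Weinberg proportions, a person lacks every covered allele
    with probability (1 - p)^2, where p is the summed covered frequency.
    """
    p = sum(covered_allele_freqs)
    if not 0.0 <= p <= 1.0:
        raise ValueError("summed allele frequency must lie between 0 and 1")
    return 1.0 - (1.0 - p) ** 2

def combined_coverage(loci):
    """Coverage across several loci, treating the loci as independent."""
    uncovered = 1.0
    for freqs in loci:
        uncovered *= 1.0 - locus_coverage(freqs)
    return 1.0 - uncovered

# Hypothetical frequencies of epitope-restricting alleles at two loci.
example_loci = [
    [0.25, 0.15, 0.10],  # assumed covered alleles at a class I locus
    [0.30, 0.20],        # assumed covered alleles at a class II locus
]
print(f"estimated coverage: {combined_coverage(example_loci):.3f}")
```

Coverage approaches 100% when the epitopes bind alleles that together account for most of the allele frequency at each locus, which is the pattern the analysis above reports.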

Limitation

The in silico designed mRNA hMPVbeta1 vaccine against hMPV is promising; however, because it is based solely on computational predictive modeling, it requires experimental and clinical validation to address challenges such as stability, immunogenic efficacy, and safety when it is expressed in a mammalian system. Future work should focus on validating the construct through laboratory experiments and clinical trials.

Conclusion

The development of mRNA vaccine candidates using the reverse vaccinology approach has emerged as a highly efficient strategy, offering significant advantages in saving time, reducing costs, and minimizing laboratory hazards. In this research, machine-learning-based immunoinformatics tools and servers were employed to identify and characterize potential LBEs (n = 8), CTLEs (n = 8), and HTLEs (n = 8) derived from the critical hMPV F, G, M, and SH proteins. These epitopes were combined through appropriate linkers into two constructs, and adding two different adjuvants through a linker yielded four hMPV constructs (hMPV-beta1, hMPV-beta2, hMPV-Ribo1, and hMPV-Ribo2). Based on a comprehensive analysis of the antigenic properties, toxicity profile, allergenicity status, expression solubility, and physicochemical properties of all four designed constructs, the hMPV-beta1 vaccine construct was identified as the most promising candidate, with a predicted antigenicity score of 0.746, 383 residues, a molecular weight of 39,633.71 Da, a pI of 9.92, and favorable stability parameters (AI: 68.75, GRAVY: −0.260, Ii: 31.24). Moreover, hMPV-beta1 exhibited high solubility (score: 0.784) and high structural stability (ProSA Z-score of −8.38), comparable to that of experimental structures. Despite these promising results, further experimental confirmation through in vitro assays and in vivo studies is essential to validate its immunogenicity and therapeutic potential. This research contributes to the advancement of mRNA-based multiepitope chimeric vaccine technology and offers a potential framework for the development of early-stage preventive strategies and effective defenses against hMPV.

Supplementary materials

The following supporting information can be downloaded: Supplementary file S1 (Fig. 1a. CD-HIT clustering analysis for a target protein; Fig. 1b. Pruned virus-specific, non-paralogous target sequences for vaccine epitope prediction with their accession numbers; Fig. 2. Elements of the secondary structure of the hMPVbeta1 vaccine construct; Fig. 3. Aspects of the secondary structure of mRNA of vaccine construct (dot-bracket notation); Table 1. Five models of hMPVbeta1 generated by GalaxyRefine). Supplementary file S2 (Fig. 1. Interacting residues of the TLR4-vaccine complex; Fig. 2. Interacting residues of the TLR2-vaccine complex). Supplementary file S3 (Fig. 1. Effect of hMPVbeta1 vaccine construct on cell populations involved in humoral and cell-mediated immune response following the initial dose; Fig. 2. Result of immune response and cytokine profile after first booster dose at Day 30; Fig. 3. Effect of hMPVbeta1 vaccine construct on cell populations involved in humoral and cell-mediated immune response following the first booster dose; Fig. 4. Effect of hMPVbeta1 vaccine construct on cell populations involved in humoral and cell-mediated immune response following the second/final booster dose). Supplementary file S4 (Fig. 1. Illustration of improved DNA sequence, CAI value, GC content, and finally translated protein sequences of the hMPVbeta1 construct).