Abstract
The Plasmodium falciparum cysteine-rich protective antigen (PfCyRPA) is a promising target for a next-generation blood-stage malaria vaccine and, together with fellow PCRCR complex members, the reticulocyte binding-like homologous protein 5 (PfRh5) and the Rh5-interacting protein (PfRipr), is currently being evaluated in clinical trials. PfCyRPA is essential for merozoite invasion and appears to be highly conserved within P. falciparum parasite populations. Here, we used a targeted deep amplicon next-generation sequencing approach to assess the breadth of PfCyRPA genetic diversity in 93 P. falciparum clinical isolates from Kédougou, an area of high seasonal malaria transmission in Senegal. Our data show that the PfCyRPA wild-type reference (3D7) allele predominates in the population, while we also identify a total of 15 single nucleotide polymorphisms (SNPs). Of these, only five have previously been reported, and the majority of the SNPs were present as singletons within our sampled population. The population prevalence of these SNPs ranges from 1.1% (singletons) to 9.7% for the most prevalent SNP, V292F. We also applied a structure-based modelling approach to thread these SNPs onto PfCyRPA crystal structures and showed that these polymorphisms have different predicted functional impacts on the interactions with the binding partner PfRH5 or with neutralizing antibodies. Our predictions revealed that the majority of these SNPs have minor effects on antibody binding to PfCyRPA, while others alter its structure, stability, or interaction with PfRH5. Altogether, our findings reveal conserved PfCyRPA epitopes, which will inform downstream investigations on next-generation structure-guided malaria vaccine design.
Introduction
Malaria is caused by parasites of the genus Plasmodium and remains a major cause of morbidity and mortality, especially in Africa, which bears over 90% of the disease burden. Despite being a preventable and curable disease, malaria remains a global public health burden, with an estimated 608,000 deaths and an associated mortality rate of 14.3 deaths per 100,000 population at risk in 2022. Combined efforts in both preventive and therapeutic measures have significantly reduced the malaria burden over the last two decades. However, this fragile progress has reversed in recent years with the emergence and spread of both insecticide-resistant mosquitoes and antimalarial-resistant parasites1. This emphasizes the urgent need to accelerate the development of highly effective vaccines against the human malaria parasites2, which will further support current control measures to reduce the incidence of this disease in endemic countries and strive towards malaria elimination. Malaria vaccine development recently achieved a milestone with the WHO’s recommendation of the RTS,S/AS01 and R21/Matrix-M (R21) malaria vaccines for the prevention of P. falciparum malaria in children living in regions with moderate to high transmission1,3,4. Both RTS,S and R21 target the circumsporozoite protein, expressed at the pre-erythrocytic stage of P. falciparum, and have been extensively evaluated in clinical trials, as well as in pilot implementation for RTS,S. Primary analysis of R21 phase 3 clinical trial data showed protection of 67–75% against multiple clinical malaria episodes after a 12-month follow-up of fully vaccinated children (5–36 months)4. RTS,S, by contrast, is limited by its modest efficacy, as demonstrated in the large phase 3 clinical trial across eight African countries, where efficacy in children aged 5–17 months was 55.8% over the first year and waned to 36.3% in the same age group over 48 months of follow-up5,6.
There is an opportunity to complement these first-generation vaccines with next-generation vaccines, preferably targeting other stages of the parasite’s life cycle, to broaden the existing malaria vaccine toolbox. Such vaccines need to account for genetic diversity from the earliest stages of development. Malaria vaccine development has benefited tremendously from the publication of the Plasmodium falciparum genome7, which paved the way for malaria reverse vaccinology8. This approach enabled the prioritization of the current lead blood-stage malaria vaccine candidate, P. falciparum reticulocyte binding homolog 5 (PfRh5), the terminal member of the PCRCR complex9, which binds to the erythrocyte receptor Basigin10. In addition to PfRh5, members of this complex include the Rh5-interacting protein (PfRipr), the cysteine-rich protective antigen (PfCyRPA), the Plasmodium thrombospondin-related apical merozoite protein (PfPTRAMP), and the cysteine-rich small secreted protein (PfCSS)9,11. Of these, PfRh5 remains the most advanced antigen of the complex in clinical development, having recently completed Phase 2b clinical trials in Burkina Faso12, while PfCyRPA and PfRipr are currently being assessed in phase 1 clinical trials (NCT05385471)13. Our present study evaluates the breadth of genetic diversity of PfCyRPA and uses structural insights to predict the functional impact of such diversity, contributing to structure-guided vaccine development.
Results
Characteristics of study participants
This study was conducted in Kédougou, a southeastern region of Senegal with seasonal malaria transmission from May to November. Informed consent was obtained from study participants and/or their legal guardians. A total of 94 patients presenting with confirmed symptomatic P. falciparum infection were recruited in 2019 and 2022 from five healthcare centers in Kédougou: Bandafassi (N = 21), Camp militaire (N = 23), Dalaba (N = 33), Mako (N = 13) and Tomboronkoto (N = 4). Table 1 summarizes the demographic and parasitological characteristics of the study participants. Participants enrolled in this study were aged 2 to 67 years (Median 21.75; SD = 13.11), and there were 59 males and 35 females. We observed significantly different sex ratios, with an overall sex ratio of 1.68 in favour of males. While no significant difference was observed in the median age across sampling sites, we observed a lower proportion of children (≤ 10 years) across sites. Moreover, we observed an overall complexity of infection (COI) of 3.65 in our study population (Table 1).
Prevalence of SNPs
To determine the degree of PfCyRPA-associated genetic diversity within the population, we employed targeted deep amplicon sequencing using Illumina short-read next-generation sequencing on a NovaSeq 6000 platform. Genetic diversity was assessed using a very sensitive threshold (2% variant allele frequency), and we applied both quantitative statistical metrics and manual curation to enable accurate SNP discovery and validation. Of the 94 clinical isolates sequenced, 93 yielded sequence data; thus, a total of 93 clinical isolates were included in this analysis, and the resulting sequences were compared to that of the reference strain (3D7). Overall, 26/93 (28%) of the isolates carried at least one SNP in the PfCyRPA gene relative to the 3D7 reference, which represented the dominant allele, 67/93 (72%), within our sampled population (Fig. 1A). We identified 15 individual SNPs, of which only five have previously been reported: F41L (2.2%), V165I (2.2%), D236N (4.3%), N270T (4.3%) and V292F (9.7%). Additionally, two of the novel SNPs reported here, D236V and N338D (1.1%), occur at positions where a polymorphism was previously described, but with different amino acid substitutions. The majority of the novel SNPs were rare and identified in only a single isolate, except for R50C (2.2%), I196F (2.2%) and K211Q (2.2%), which were each found in two isolates; the previously reported D236N (4.3%), N270T (4.3%) and V292F (9.7%) were more prevalent. Overall, most of the isolates with a mutant allele carried only a single SNP, while the highest number of SNPs carried by a single isolate (320697) was four (R31H, F41L, D236N and V292F). Moreover, only one SNP (V292F) was detected at a population prevalence greater than 5%. Furthermore, we repeated the analysis after removing duplicate reads, which allowed a side-by-side comparison with the variant allele frequencies (VAFs) from our primary analysis, which retains duplicate reads. The results show only minor overall differences in the VAFs of the different SNPs (Supplementary Table 4).
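The population prevalence figures above are simple proportions of carrier isolates over the 93 sequenced isolates. The following is a minimal Python sketch of that calculation, not the authors' actual pipeline (variant analysis was performed in Geneious Prime). The function name `population_prevalence` and the isolate labels `isolate_A` and `isolate_B` are hypothetical; only isolate 320697 and its four SNPs are taken from the text.

```python
from collections import Counter

def population_prevalence(isolate_snps, n_isolates):
    """Return {SNP: percentage of isolates carrying it}, rounded to one decimal."""
    counts = Counter()
    for snps in isolate_snps.values():
        counts.update(set(snps))  # count each SNP at most once per isolate
    return {snp: round(100 * n / n_isolates, 1) for snp, n in counts.items()}

# Toy input: isolate 320697 carries the four SNPs listed in the text;
# "isolate_A" and "isolate_B" are hypothetical placeholders.
calls = {
    "320697": ["R31H", "F41L", "D236N", "V292F"],
    "isolate_A": ["V292F"],
    "isolate_B": [],
}
prevalence = population_prevalence(calls, n_isolates=93)
print(prevalence)
```

With 93 isolates, a singleton corresponds to 1/93 ≈ 1.1% and two carriers to 2/93 ≈ 2.2%, which matches the values reported for the rare SNPs.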
Variant allele frequency of individual SNPs within complex infections
Given the likelihood of mixed-genotype infections in natural malaria infections in high transmission settings, we measured the complexity of infection (COI) in our samples using msp1 and msp2 genotyping14. Overall, only 10 out of the 93 isolates reported here were from monogenomic infections, while polygenomic infections contained 2 to 11 P. falciparum genotypes per patient sample, with a mean COI of 4 genotypes per isolate. Consequently, we next sought to assess the range of variant allele frequencies within individual samples, defined as the percentage of reads carrying the SNP relative to the total number of reads mapped to the reference sequence at the SNP position. This analysis showed that the identified SNPs were distributed at varying frequencies, ranging from 2.6 to 100% within individual isolates; we therefore classified them as low frequency (< 5%), intermediate frequency (5–25%) and high frequency (> 25%) SNPs (Fig. 1B). We observed that 10 out of the 15 SNPs reported here were present at high variant frequencies within the patient sample, while only two SNPs were present at low frequencies. Of the SNPs with high variant frequencies, five were novel (D236V, N338D, D110N, I196F and R31H), while all the previously reported SNPs were present at high frequencies. Two of the novel SNPs (R50C and T37A) were present at low frequencies, while F187L, K211Q and I114V were present at intermediate frequencies (Fig. 1B). Altogether, this increases our confidence that the deep targeted sequencing approach can accurately identify emerging SNPs despite their low variant allele frequency within complex infections. Out of the 10 isolates with monogenomic infections, only three carried a mutation in pfcyrpa, and only a single SNP was identified in each of these isolates.
Of these SNPs, only I114V (400133) was present at low variant allele frequency, while V292F (400116) and N338D (400115) were both present at high variant allele frequencies. The majority of the novel SNPs described here are present within polygenomic infections (8/10).
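The within-sample variant allele frequency and its three-tier classification described above can be sketched in a few lines of Python. This is an illustrative sketch under stated assumptions rather than the authors' code: the function names are hypothetical, and placing exactly 5% and exactly 25% in the intermediate class follows the "5–25%" wording of the text.

```python
def variant_allele_frequency(variant_reads, total_reads):
    """Percentage of reads carrying the SNP at a given position."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    return 100.0 * variant_reads / total_reads

def classify_vaf(vaf_percent, call_threshold=2.0):
    """Classify a within-sample VAF (%) as low (< 5), intermediate (5-25) or high (> 25)."""
    if vaf_percent < call_threshold:
        return "below threshold"  # would not have been called as a SNP
    if vaf_percent < 5.0:
        return "low"
    if vaf_percent <= 25.0:
        return "intermediate"
    return "high"

# The lowest and one of the highest within-sample frequencies quoted in the text.
print(classify_vaf(2.6))   # low
print(classify_vaf(99.8))  # high
```

The read counts passed to `variant_allele_frequency` would come from the alignment at each SNP position; none are given in the text, so none are shown here.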
Structural modelling of SNPs
We assessed the predicted functional impact of the identified polymorphisms by threading the SNPs onto the crystal structure of PfCyRPA, a six-bladed β-propeller protein15,16. These blades, interconnected by loop regions, are each built from a four-stranded anti-parallel β-sheet15,16. The detailed interactions between PfCyRPA and its binding partners (RH5 and Ripr) or monoclonal antibodies have been reported more recently11,17. Consequently, we used a structure-guided approach to thread the identified SNPs onto the PfCyRPA crystal structure in complex with PfRH5 and previously characterized monoclonal antibodies (mAbs) (Fig. 2A,B). The threading complex was built by superimposing the structures of PfRH5 bound to its receptor Basigin (PDB ID: 4U0Q), PfCyRPA bound to PfRH5 (PDB ID: 6MPV), and PfCyRPA bound to the mAb Fab fragments Cy.003 (PDB ID: 7PI2), Cy.004 (PDB ID: 7PHW), Cy.007 (PDB ID: 7PHV) and 8A7 (PDB ID: 5TIH). The structural threading analysis revealed an even distribution of the identified SNPs between the PfCyRPA internal loops and individual blades (Table 2). Of these mutations, four (K211Q, D236N, D236V and N270T) were located within blade 4. Likewise, blade 6 contained four SNPs (R31H, T37A, F41L and N338D), while three SNPs were identified in blade 3 (V165I, F187L and I196F) and two in blade 2 (D110N and I114V). Blades 1 and 5 each contained a single mutation (Fig. 2A,B). Furthermore, our analysis showed four groups of SNPs with different predicted functional outcomes. Interestingly, 7 out of the 15 SNPs reported here are predicted to have a minor effect on PfCyRPA structure, on binding to PfRH5, or potentially on monoclonal antibody interactions, while another group of SNPs (V165I, I196F, D236N, N270T and V292F) was predicted to partially alter the structure of PfCyRPA.
D236 and N270 form hydrogen bonds with residues S233 and N218, respectively; the D236V and N270T mutations are predicted to alter PfCyRPA flexibility or stability through steric clashes and interruption of hydrogen bonds, respectively. Another group of SNPs, comprising R50C, F187L and I196F, was predicted to affect the PfCyRPA interaction with PfRH5, with R50C predicted to improve binding to PfRH5 by removing the repulsion between residues R50 and K504 (Table 2). Intriguingly, 7/15 mutations may destabilize PfCyRPA (Table 2), with a predicted change in Gibbs free energy (ΔΔG) close to or greater than 1.0, although most of these predictions indicated only a moderate effect. Most of these potentially destabilizing mutations were rare in the population: R31H (1.1%), F41L (1.1%), R50C (2.2%), I196F (2.2%), K211Q (2.2%) and N338D (1.1%), with the exception of V292F (9.7%), the most prevalent SNP in the population. A key goal of structure-guided design of next-generation blood-stage malaria vaccines is to decipher the functional implications of vaccine candidate-associated genetic diversity for epitope-paratope interactions. We identified a subset of SNPs with a potential impact on inhibitory monoclonal antibody binding, although these effects are predicted to be very mild and the SNPs are not located directly in the epitopes bound by Cy.003, Cy.004, and Cy.007. Of these SNPs, three (T37A, F41L, N338D) mapped to blade 6, which along with blade 2 has been shown to elicit the most inhibitory antibodies17, while R50C, D110N and I114V mapped to loop regions within blades 1 and 2. It is possible that these SNPs impact antibody recognition through structural changes (Fig. 2C–F), but such predictions should be functionally validated.
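For readers less familiar with the stability metric discussed above, the predicted change in folding free energy is conventionally written as the difference between the mutant and wild-type folding energies. The expression below is a sketch of that convention only; the symbols ΔG_mut and ΔG_WT are introduced here for illustration, and reading the values in kcal/mol is an assumption based on the usual FoldX output rather than something stated in the text.

```latex
\Delta\Delta G = \Delta G_{\mathrm{mut}} - \Delta G_{\mathrm{WT}}
```

Under this convention, a positive ΔΔG corresponds to a predicted loss of stability in the mutant relative to the 3D7 reference protein.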
Discussion
The P. falciparum cysteine-rich protective antigen (PfCyRPA) plays a crucial role in merozoite invasion of the human erythrocyte. This antigen has attracted particular attention as a promising vaccine candidate because it is essential18,19 and accessible to naturally derived human antibodies20. Until recently, however, the functional details of the role of PfCyRPA in erythrocyte invasion were unknown; more recent data report a critical role mediated by its lectin activity, specifically binding to 2-6-linked Neu5Ac on erythrocyte glycans21. This lectin activity, coordinated through specific amino acid residues essential for PfCyRPA binding of glycan targets, shares features between P. falciparum and its ancestor P. praefalciparum22. Preclinical studies have shown that PfCyRPA induces a broadly neutralizing antibody response16,17,23,24 and that its sequence is relatively conserved across malaria parasites25,26, suggesting that a vaccine based on this protein may offer broader protection. However, despite its general conservation, PfCyRPA has some genetic variability, which could limit the effectiveness of a vaccine if such variants allow the parasite to evade the vaccine-induced immune response. Together with PfRipr, PfCyRPA has recently entered Phase 1 clinical testing (NCT05385471); thus, a better understanding of the breadth and functional impact of PfCyRPA-associated polymorphisms on the vaccine-induced immune response is needed to prioritize, design and optimize PfCyRPA-based vaccine alleles.
This study was undertaken to assess the extent of PfCyRPA genetic diversity in P. falciparum clinical isolates from naturally infected individuals in high malaria transmission settings. Samples reported here were collected from patients diagnosed with P. falciparum infections visiting healthcare centers in Kédougou, a Southeastern region of Senegal with high seasonal malaria transmission27. A previous study by Ndwiga and colleagues reported an excess of rare variants in proteins within the PfRH5 complex, including PfCyRPA26. In this study, the authors used two sequencing strategies, namely capillary Sanger sequencing and whole genome sequencing (WGS), which respectively identified 4 and 10 PfCyRPA-associated SNPs, while only a single SNP was concomitantly discovered by both strategies26.
Longer-read strategies such as Sanger sequencing enable manual identification of genetic variation and haplotype calling; however, they are limited by both their low overall throughput and their inability to accurately identify and segregate SNPs in polygenomic infections, such as those common in high transmission settings like Kédougou. We previously reported on the high prevalence of polygenomic infections in Kédougou28. This trend was confirmed in the current study, with isolates harboring 1 to 11 genotypes and a mean COI of 4 genotypes per isolate.
To increase our chance of discovering newly emerging and rare PfCyRPA-associated variants, we opted for targeted deep amplicon sequencing using the Illumina NovaSeq 6000 sequencing technology and used a sensitive discovery threshold of 2% for variant calling. We successfully sequenced pfcyrpa amplicons from 93 isolates and reported a total of 15 SNPs, of which 10 were novel and only 5 had been reported in previous studies26,29. Interestingly, our current data showed the PfCyRPA reference allele to be the most prevalent allele, while the opposite trend was observed for PfRH528,30. This observation aligns with a previous report suggesting stronger balancing selection pressure on PfRH5 than on PfCyRPA20. Additionally, our findings matched previous reports of an excess of rare variants26, as the majority of the SNPs reported here were present as singletons (occurring in single isolates), while only three SNPs (V292F, D236N and N270T) were present in more than two isolates. This result emphasizes the power of deep amplicon sequencing strategies in identifying rare genomic variants in polyclonal infections and agrees with previous reports involving the lead blood-stage malaria vaccine candidate, PfRH528,30. Here, we report rare variants in PfCyRPA from naturally circulating P. falciparum clinical isolates. It is of utmost importance to determine whether such variants, even if rare, could result in vaccine-resistant parasites that could further expand with vaccine roll-out and subsequent selective pressure.
One limitation of the deep amplicon sequencing strategy used here is its inability to resolve individual parasite haplotypes, owing both to the short reads and to the high complexity of infection in this population. Consequently, for each identified SNP, we assessed the variant read frequency, defined as the percentage of variant reads among the total reads mapped to a given position in the PfCyRPA reference. Given the varying number of genotypes in a given isolate, as well as their respective parasitemia, this analysis enabled us to quantify the reads carrying the SNP relative to the total number of reads at its position and therefore to classify the SNPs as low (< 5%), intermediate (5–25%) or high (> 25%) frequency in each sample. Interestingly, 10 out of the 15 SNPs reported here were present at high frequencies, while 3 and 2 SNPs were present at intermediate and low frequencies, respectively. A similar observation was made in our previous study on PfRH528, which also identified a number of SNPs present at low frequencies. While all previously reported SNPs, the most prevalent in the population, were present at high variant allele frequencies, we also found 4 novel SNPs (R31H, D110N, D236V and N338D) at high frequencies, each of which was identified as a singleton. Of these novel SNPs present at high frequencies, three (R31H, D110N and D236V) occurred in polygenomic infections. Interestingly, the D236V SNP, present as a singleton with a frequency of 99.8%, emerged from an isolate with a COI of 4, which emphasizes that it is a dominant SNP even in a polygenomic infection. From our structural data, the slightly positive ΔΔG value indicates that this SNP causes a slight, though non-significant, decrease in protein stability.
Further analysis is needed to fully evaluate the impact of this SNP on protein folding, stability, and function; it may also be interesting to consider this SNP in light of potential compensatory mutations in the binding partners RH5 and Ripr.
Given the relationship between a protein’s structure and its function, we sought to investigate the impact of the identified SNPs on PfCyRPA structure and ultimately to predict their functional implications for binding to the partner protein PfRH5 or for recognition by neutralizing monoclonal antibodies with known binding epitopes. The crystal structure of PfCyRPA was solved previously15,16, while more recent studies have solved its structure in complex with neutralizing antibodies as well as with the binding partners PfRH5 and PfRipr11,17. The functional implications of naturally arising polymorphisms, however, can be very challenging to investigate within naturally circulating parasite populations, particularly in the context of high malaria transmission, where individual isolates often represent polygenomic infections.
As a primary investigation, we adopted an in silico approach based on threading the observed SNPs onto the crystal structure of PfCyRPA in complex with its binding partner PfRH5 or with neutralizing antibodies. By superimposing the identified SNPs onto the PfCyRPA structure, we were able to map their distribution and predict their impact on the antigen’s functional structure. Interestingly, in addition to the even distribution of the SNPs between the antigen’s internal loops and blades, there was at least one SNP present in each blade. Moreover, out of the 15 SNPs located within PfCyRPA blades, 7 were located within blades 3 and 4, which together form much of the PfCyRPA-PfRH5 interface11. These SNPs were predicted either to have a minor effect on the antigen’s structure or to impact its binding to PfRH5; their functional impact remains to be determined, considering that their location might not be readily accessible to neutralizing antibodies and that antibodies binding at the interface between the two antigens are not inhibitory17,31. On the other hand, while a previous study reported the predominance of conformational neutralizing epitopes within the PfCyRPA structure24, recent data have shown most of the inhibitory antibody binding epitopes to be located within blades 1 and 2 of the PfCyRPA structure17. Our structural threading analysis showed R50C to be located within blade 1, while both D110N and I114V were located within blade 2 of PfCyRPA, with both predicted to have a minor effect on PfCyRPA structure. Moreover, 4 out of the 15 SNPs reported here (R50C, F187L, I196F, N270T) were predicted to have a minor effect on antibody binding to PfCyRPA.
However, even if no SNPs have so far been found within the most critical reported epitopes of PfCyRPA, this should not prevent further investigation of the potential impact of these SNPs: even those predicted to have a minor effect could make an important contribution to overall parasite fitness. Finally, our predictions also identified SNPs that can impact the antigen’s functionality in other ways, such as removing repulsive interactions (R50C), disrupting hydrogen bonds (N270T) or causing steric clashes (D236V), while another subset of SNPs has the propensity to either alter (N338D) or introduce (D236N) glycosylation sites. Given the importance of such structural changes and their key roles in protein stability, folding and solubility, these findings warrant further investigation to confirm the functional impact of these SNPs in the context of malaria vaccinology as it relates to PfCyRPA.
The findings of this study are critically important as we consider the impact of genetic diversity on the potential efficacy of PfCyRPA as a malaria vaccine candidate; however, our study is not without limitations. The cross-sectional approach described here provides only a snapshot of pfcyrpa genetic diversity within the circulating parasite populations. The passive case detection recruitment strategy adopted here could bias sampling towards isolates that drive more prominent clinical disease, which prompts patients to seek care, while overlooking the true natural genetic diversity in isolates from the larger community. As we highlighted in our previous study, future work can address this by sampling across the clinical presentation spectrum, including active surveillance of asymptomatic cases28. Another limitation of this study is the notable difference in the number of samples from the individual sites, which did not allow a site-by-site comparison of pfcyrpa genetic diversity. This study was not powered for a thorough assessment of genetic diversity stratified by site, although this would be interesting to explore in depth, as the parasite populations circulating within each site could differ because of unique epidemiological characteristics. Such differences could result from each site’s geographical position relative to the neighboring countries with which the region shares borders (Mali and Guinea Conakry) or from site-specific activities (mining and trading). Therefore, larger and more thorough sampling across the entire region could strengthen the preliminary data reported here.
Furthermore, while our sequencing approach has advantages for deep coverage and detection of rare SNPs, it also has its own limitations: the short reads generated, combined with the high complexity of infection in this population, make it difficult to accurately resolve individual parasite haplotypes. Finally, although our structural modelling can suggest a potential functional impact of the reported SNPs, more functional studies are needed to accurately decipher the true mechanistic implications, if any, of these polymorphisms for parasite fitness and survival. These studies will be crucial to guide future structure-guided vaccine development for CyRPA and will inform which variant alleles may be most critical to consider in vaccine development. Given the working hypotheses generated from our investigations, this study will inform downstream biochemical and functional genetic approaches to evaluate the role of each SNP in PfCyRPA function and parasite survival strategies, which will ultimately inform PfCyRPA vaccine development.
Fig. 1. Population prevalence of PfCyRPA SNPs. (A) The population prevalence of each PfCyRPA-associated SNP was calculated as the percentage of clinical samples in the population (N = 93) in which the SNP was detected, using a variant allele frequency (VAF) threshold of 2%. Data are shown both as parts of a whole (left) and as a bar graph with individual SNPs ordered from left to right by decreasing population prevalence (right). SNPs to the right of the dotted vertical line represent population singletons. PfCyRPA sequencing was performed on pfcyrpa amplicons using the Illumina NovaSeq 6000 sequencing platform, and variant analysis was performed using the Geneious Prime software version 23.1.1. The graphs were plotted using the GraphPad Prism version 1.0.2 software. (B) Variant allele frequency of PfCyRPA SNPs in individual samples. Variant allele frequency was determined from the sequencing data outputs and calculated as the percentage of variant reads relative to the overall read coverage at the variant position. The data are presented as bar graphs showing the number of isolates (black dots), with the error bars representing the minimum and maximum frequencies for each SNP in complex clinical samples. The SNPs are categorized as low frequency < 5% (golden), intermediate frequency 5–25% (lavender) or high frequency > 25% (maroon), based on their respective frequency within the individual complex sample. The dotted line depicts the 2% VAF threshold.
Fig. 2. Structure-function predictions for the novel SNPs identified in CyRPA. The complete structure was obtained by superimposing the structures of PfCyRPA in complex with PfRH5 (PDB ID: 6MPV), PfRH5 bound to its ligand Basigin (PDB ID: 4U0Q) and the Fab regions of the known monoclonal antibodies 8A7 (PDB ID: 5TIH), Cy.003 (PDB ID: 7PI2), Cy.004 (PDB ID: 7PHW) and Cy.007/c12 (PDB ID: 7PHV). Both CyRPA blades and individual antibodies are color coded. (A) The location of SNPs within the BSG–RH5–CyRPA complex. The complex was constructed by superimposing the RH5 of the RH5–BSG complex (PDB ID: 4U0Q) onto the RH5–CyRPA complex (PDB ID: 6MPV). BSG and RH5 are depicted in light blue and grey, respectively. CyRPA is represented in a wheat color, while blade 1 and blade 2 are indicated in dark orange. (B) The positions of D110N and I114V relative to monoclonal antibodies (mAbs), with CyRPA blades color coded. Antibodies Cy.003, Cy.004, Cy.007, and 8A7 are represented in green, lavender, beige, and dark turquoise, respectively. (C) A detailed view of the positions of D110N and I114V relative to the mAbs. Antibodies Cy.003, Cy.004, Cy.007, and 8A7 are represented in green, lavender, blue, and cyan, respectively. (D) Structural modelling revealed that SNPs V165I, N270T, and V292F influence the conformation of CyRPA. Small red plates signify potential steric hindrances. (E) SNPs that might confer resistance to antibodies, including F41L, D110N, and I114V. (F) SNPs R50C and F187L are shown to directly interact with PfRH5.
Methods
Study sites and sample collection
This study was conducted in Kédougou, a southeastern region of Senegal with seasonal malaria transmission from May to November. Informed consent was obtained from the study participants or their legal guardians, and samples were collected following the ethical protocol approved by the National Ethics Committee of Senegal (CNERS) (SEN19/36 and SEN23/09), the regulatory board of the Senegalese Ministry of Health and the Institutional Review Board of the Yale School of Public Health (2000025417). Samples used in this study were collected through passive case detection from patients with malaria-like symptoms visiting healthcare facilities in Bandafassi, Bantaco, Camp militaire, Dalaba, Mako, and Tomboronkoto in 2019 and 2022, during the peak of the malaria transmission season (July and August). Participants who met the enrollment criteria of fever in the past 24 h, an axillary temperature ≥ 38 °C and a positive P. falciparum malaria diagnosis from a rapid diagnostic test (RDT) and/or microscopy were offered the opportunity to enroll in the study. After informed consent was obtained, a venous blood sample was drawn into EDTA vacutainers, and samples were transported at room temperature to the laboratory for processing, with no more than 6 h between draw and processing.
DNA extraction, PCR amplification and NGS library and sequencing
DNA was extracted from infected erythrocyte pellets using the ZYMO Quick-DNA Miniprep Kit (D3024) following the manufacturer’s instructions. The extracted DNA samples were eluted in 30 μl of nuclease-free water and stored at − 20 °C prior to PCR amplification. For PCR amplification, PfCyRPA-specific primers were designed using the Geneious Prime software version 23.1.1. The PfCyRPA 3D7 reference sequence (PF3D7_0423800, PlasmoDB)32 was used as the template for primer design, and amplification was performed using a classic PCR protocol with Phusion High-Fidelity DNA Polymerase (Catalog: M0530L, 50X higher fidelity than Taq). Supplemental Table 1 shows the primer pairs, and Supplemental Tables 2 and 3 show the PCR conditions and PCR program, respectively, used for PfCyRPA amplification. Following successful amplification, PfCyRPA amplicons were bead-purified (Omega), quantified using a Qubit 2.0 fluorometer and subsequently adjusted to equivalent concentrations. Sequencing library preparation was performed with the Nextera XT using unique dual indexes (UDIs), followed by a further bead purification. DNA libraries were quantified by qPCR using the Roche KAPA Library Quantification Kit. All samples were normalized to a final concentration of 4 nM. The clinical isolates, 3D7 reference DNA control, and water control were quantified, normalized and pooled into 8 sub-pools, which were further bead-purified and quantified using KAPA qPCR. The 8 sub-pools were then normalized and combined in equal quantities to form one final pool. This final pool was sent to the Yale Center for Genome Analysis (YCGA) for sequencing on an Illumina NovaSeq 6000 platform with a targeted coverage of 500,000 reads per sample.
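The normalization of each library to 4 nM is a standard dilution calculation. The Python sketch below illustrates the arithmetic only and is not the authors' protocol: the function names, the 5 µl stock volume, the 500 bp fragment length and the example concentrations are all hypothetical, and the mass-to-molarity conversion assumes the usual approximation of 660 g/mol per base pair of double-stranded DNA.

```python
def ng_per_ul_to_nM(conc_ng_per_ul, fragment_length_bp):
    """Convert a dsDNA mass concentration to molarity (assumes ~660 g/mol per bp)."""
    return conc_ng_per_ul * 1e6 / (660.0 * fragment_length_bp)

def diluent_volume_ul(stock_nM, stock_volume_ul, target_nM=4.0):
    """Volume of diluent to add so that the stock reaches target_nM (C1*V1 = C2*V2)."""
    if stock_nM < target_nM:
        raise ValueError("stock is already below the target concentration")
    final_volume_ul = stock_nM * stock_volume_ul / target_nM
    return final_volume_ul - stock_volume_ul

# Hypothetical library: 20 nM by qPCR, 5 µl taken forward.
print(diluent_volume_ul(stock_nM=20.0, stock_volume_ul=5.0))  # 20.0 µl of diluent
# Hypothetical fluorometric reading: 10 ng/µl for a 500 bp fragment.
print(round(ng_per_ul_to_nM(10.0, 500), 1))
```

Because the libraries here were quantified by qPCR, which reports molarity directly, only the dilution step would normally be needed; the mass conversion is included for fluorometric readings.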
Complexity of infection
The complexity of infection (COI) for each sample was assessed by genotyping the merozoite surface proteins (MSP) using nested PCR on DNA extracted from whole blood, as previously described14.
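As a rough illustration of how a COI value can be derived from MSP genotyping, the sketch below takes the COI of a sample to be the largest number of distinct alleles detected at any genotyped locus. This is a commonly used convention and is assumed here; the exact scoring rule of the cited protocol is not restated in this section, and the locus names and allele labels are hypothetical.

```python
def complexity_of_infection(alleles_by_locus):
    """Estimate COI as the largest number of distinct alleles at any locus.

    alleles_by_locus maps a locus name to the allele labels scored for it
    (e.g. allelic family plus band size). Assumed convention, see lead-in.
    """
    return max((len(set(alleles)) for alleles in alleles_by_locus.values()),
               default=0)


# Hypothetical genotyping result for one isolate
sample = {
    "msp1": ["K1_200bp", "MAD20_180bp", "K1_250bp"],
    "msp2": ["FC27_300bp", "FC27_300bp"],
}
coi = complexity_of_infection(sample)
```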
Data processing and polymorphism analysis
De-multiplexed forward and reverse sequencing reads obtained for each sample were individually imported into Geneious Prime, and paired sequences were obtained using the Illumina paired-end setting. Paired sequences were subsequently trimmed using the BBDuk plugin, with a minimum quality score (Q) of 30 and a minimum length of 75 base pairs, as we were expecting reads of around 150 base pairs. Trimmed sequences were aligned to the 3D7 reference sequence, which had been annotated with all known non-synonymous mutations. Single nucleotide polymorphism (SNP) annotation was performed using five iterations; the criteria for SNP calling were set to a minimum variant frequency of 0.02 (2%) and a minimum coverage of 1000 reads. The prevalence of identified SNPs was expressed as the percentage of isolates with a given SNP relative to the total number of sequenced isolates. Moreover, we defined the variant read frequency as the percentage of variant reads relative to the overall read coverage at the variant position. Sequence data and SNP analyses were performed by at least three individuals for each sample. For comparison, SNPs were also identified after removing duplicate reads, using the same approach as described above for the analysis without duplicate removal; the comparison is shown in Supplemental Table 4.
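The calling thresholds and the two summary measures defined above (variant read frequency and population prevalence) can be written out as a short sketch. The analysis in this study was carried out in Geneious Prime; the code below only restates the definitions, and the record layout, SNP label and read counts are made up for illustration.

```python
from dataclasses import dataclass

MIN_VARIANT_FREQ = 0.02  # minimum variant read frequency (2%), from the text
MIN_COVERAGE = 1000      # minimum read coverage at the position, from the text


@dataclass
class VariantCall:
    isolate: str        # hypothetical isolate identifier
    snp: str            # hypothetical SNP label
    variant_reads: int  # reads supporting the variant
    coverage: int       # total reads covering the variant position


def variant_read_frequency(call):
    """Variant reads as a fraction of the read coverage at the position."""
    return call.variant_reads / call.coverage if call.coverage else 0.0


def passes_filters(call):
    """Apply the coverage and frequency thresholds stated in the text."""
    return (call.coverage >= MIN_COVERAGE
            and variant_read_frequency(call) >= MIN_VARIANT_FREQ)


def snp_prevalence(calls, n_isolates):
    """Percentage of sequenced isolates carrying each SNP that passes filters."""
    carriers = {}
    for call in calls:
        if passes_filters(call):
            carriers.setdefault(call.snp, set()).add(call.isolate)
    return {snp: 100.0 * len(ids) / n_isolates for snp, ids in carriers.items()}


# Made-up values, for illustration only
calls = [
    VariantCall("iso01", "SNP_A", 450, 1500),  # 30% of reads: kept
    VariantCall("iso02", "SNP_A", 10, 1200),   # under 2% of reads: dropped
    VariantCall("iso03", "SNP_A", 300, 800),   # coverage under 1000: dropped
]
prevalence = snp_prevalence(calls, n_isolates=3)
```

Prevalence is computed over all sequenced isolates, not only those in which the position passed the filters, matching the definition given in the text.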
Structural modelling of PfCyRPA-associated SNPs
The structures of PfCyRPA and PfRH5 were downloaded from the Protein Data Bank (PDB, https://www.rcsb.org/). The PfRH5-PfCyRPA complex was constructed using PyMOL (PDB IDs: 4U0Q and 6MPV), and structural predictions of mAb binding were performed using PDB IDs 5TIH, 7PI2, and 7PHW. Individual FASTA files containing the amino acid sequences of PfCyRPA and of each individual novel SNP variant were generated. These amino acid sequences were threaded onto the crystal structure of PfCyRPA in complex with its binding partner PfRH5 and/or known monoclonal antibodies. PyMOL version 2.3.2 was used to predict the effect and to plot the structural location of each SNP. The structural effects of the mutant versions of the protein were evaluated in terms of biochemical properties such as hydrogen bonding patterns, steric interactions, and predicted binding affinity between the mutant version of the protein and the Basigin receptor. The changes in binding energy associated with the SNPs were predicted with FoldX.
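Generating one FASTA record per variant can be sketched as follows. This is a minimal illustration, not the script used in this study: the helper names are hypothetical, and the ten-residue sequence is a placeholder rather than the PfCyRPA sequence.

```python
import re


def apply_substitution(sequence, snp):
    """Return the sequence carrying one substitution such as 'C3W' (1-based)."""
    match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", snp)
    if match is None:
        raise ValueError(f"unrecognised SNP label: {snp}")
    ref, pos, alt = match.group(1), int(match.group(2)), match.group(3)
    if not 1 <= pos <= len(sequence):
        raise ValueError(f"position {pos} is outside the sequence")
    if sequence[pos - 1] != ref:
        raise ValueError(f"expected {ref} at position {pos}, "
                         f"found {sequence[pos - 1]}")
    return sequence[:pos - 1] + alt + sequence[pos:]


def to_fasta(name, sequence, width=60):
    """Format a single FASTA record with fixed-width sequence lines."""
    lines = [sequence[i:i + width] for i in range(0, len(sequence), width)]
    return f">{name}\n" + "\n".join(lines) + "\n"


toy_reference = "MKCDEFGHIK"  # placeholder, NOT the PfCyRPA sequence
mutant = apply_substitution(toy_reference, "C3W")
record = to_fasta("toy_C3W", mutant)
```

Checking the reference residue before substituting guards against off-by-one numbering errors, for example when the SNP positions and the sequence used for threading are numbered differently.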
Data availability
Sequencing reads associated with this study have been deposited in the NCBI SRA under BioProject accession PRJNA1109877.
References
WHO. World Malaria Report 2023, 356 (ed. World Health Organization) (World Health Organization, 2023). https://apps.who.int/iris
Draper, S. J. et al. Malaria vaccines: recent advances and new horizons. Cell. Host Microbe. 24 (1), 43–56 (2018).
Rts, S. C. T. P. Efficacy and safety of the RTS,S/AS01 malaria vaccine during 18 months after vaccination: a phase 3 randomized, controlled trial in children and young infants at 11 African sites. PLoS Med. 11 (7), e1001685 (2014).
Datoo, M. S. et al. Safety and efficacy of malaria vaccine candidate R21/Matrix-M in African children: a multicentre, double-blind, randomised, phase 3 trial. Lancet 403 (10426), 533–544 (2024).
Rts, S. C. T. P. et al. A phase 3 trial of RTS,S/AS01 malaria vaccine in African infants. N Engl. J. Med. 367 (24), 2284–2295 (2012).
Rts, S. C. T. P. et al. First results of phase 3 trial of RTS,S/AS01 malaria vaccine in African children. N. Engl. J. Med. 365 (20), 1863–1875 (2011).
Gardner, S. N. Cell cycle phase-specific chemotherapy: computational methods for guiding treatment. Cell. Cycle. 1 (6), 369–374 (2002).
Tuju, J. et al. Vaccine candidate discovery for the next generation of malaria vaccines. Immunology 152 (2), 195–206 (2017).
Scally, S. W. et al. PCRCR complex is essential for invasion of human erythrocytes by Plasmodium falciparum. Nat. Microbiol. 7 (12), 2039–2053 (2022).
Crosnier, C. et al. Basigin is a receptor essential for erythrocyte invasion by Plasmodium falciparum. Nature 480 (7378), 534–537 (2011).
Farrell, B. et al. The PfRCR complex bridges malaria parasite and erythrocyte during invasion. Nature. (2023).
Natama, H. M. et al. Safety and efficacy of the blood-stage malaria vaccine RH5.1/Matrix-M in Burkina Faso: Interim results of a double-blind, randomised, controlled, phase 2b trial in children. Lancet Infect. Dis. (2024).
Williams, B. G. et al. Development of an improved blood-stage malaria vaccine targeting the essential RH5-CyRPA-RIPR invasion complex. Nat. Commun. 15 (1), 4857 (2024).
Snounou, G. et al. Biased distribution of msp1 and msp2 allelic variants in Plasmodium falciparum populations in Thailand. Trans. R. Soc. Trop. Med. Hyg. 93 (4), 369–374 (1999).
Chen, L. et al. Structural basis for inhibition of erythrocyte invasion by antibodies to Plasmodium falciparum protein CyRPA. Elife. 6. (2017).
Favuzza, P. et al. Structure of the malaria vaccine candidate antigen CyRPA and its complex with a parasite invasion inhibitory antibody. Elife. 6. (2017).
Ragotte, R. J. et al. Heterotypic interactions drive antibody synergy against a malaria vaccine candidate. Nat. Commun. 13 (1), 933 (2022).
Sony Reddy, K. et al. Multiprotein complex between the GPI-anchored CyRPA with PfRH5 and PfRipr is crucial for Plasmodium falciparum erythrocyte invasion. PNAS 112, 1179–1184 (2015).
Volz, J. C. et al. Essential role of the PfRh5/PfRipr/CyRPA complex during Plasmodium falciparum invasion of erythrocytes. Cell. Host Microbe. 20 (1), 60–71 (2016).
Mian, S. Y. et al. Plasmodium falciparum Cysteine-Rich protective antigen (CyRPA) elicits detectable levels of invasion-inhibitory antibodies during natural infection in humans. Infect. Immun. 90 (1), e0037721 (2022).
Day, C. J. et al. The essential malaria protein PfCyRPA targets glycans to invade erythrocytes. Cell. Rep. 43 (4), 114012 (2024).
Sharp, P. M., Bibollet-Ruche, F. & Hahn, B. H. Plasmodium falciparum CyRPA glycan binding does not explain adaptation to humans. Genome Biol. Evol. 17(2). (2025).
Healer, J. et al. RH5.1-CyRPA-Ripr antigen combination vaccine shows little improvement over RH5.1 in a preclinical setting. Front. Cell. Infect. Microbiol. 12, 1049065 (2022).
Knudsen, A. S. et al. Strain-dependent inhibition of erythrocyte invasion by monoclonal antibodies against Plasmodium falciparum CyRPA. Front. Immunol. 12, 716305 (2021).
Dreyer, A. M. et al. Passive Immunoprotection of Plasmodium falciparum-infected mice designates the CyRPA as candidate malaria vaccine antigen. J. Immunol. 188 (12), 6225–6237 (2012).
Ndwiga, L. et al. The Plasmodium falciparum Rh5 invasion protein complex reveals an excess of rare variant mutations. Malar. J. 20 (1), 278 (2021).
Programme National de Lutte contre le Paludisme. Bulletin Epidemiologique Annuel 2023 du Paludisme au Senegal. (2024).
Mangou, K. et al. Structure-guided insights into potential function of novel genetic variants in the malaria vaccine candidate PfRh5. Sci. Rep. 12 (1), 19403 (2022).
Waweru, H. et al. Limited genetic variations of the Rh5-CyRPA-Ripr invasion complex in Plasmodium falciparum parasite population in selected malaria-endemic regions, Kenya. Front. Trop. Dis. 4, 1102265 (2023).
Thiam, L. G. et al. Vaccine-induced human monoclonal antibodies to PfRH5 show broadly neutralizing activity against P. falciparum clinical isolates. npj Vaccines. 9,198 (2024).
Alanine, D. G. W. et al. Human antibodies that slow erythrocyte invasion potentiate malaria-neutralizing antibodies. Cell. 178 (1), 216–228e21 (2019).
Alvarez-Jarreta, J. et al. VEuPathDB: the eukaryotic pathogen, vector and host bioinformatics resource center in 2023. Nucleic Acids Res. 52 (D1), D808–D816 (2024).
Acknowledgements
We would like to thank Souleymane Ngom from Dalaba, Lt. Dr Charles Latyr Diagne and Lamine Kane from Camp Militaire, Moctar Mansaly and Gerald Keita from Bandafassi, Safietou Sane and Astou Ndiaye from Mako, Adama Gueye from Bantaco, Der Ciss from Tomboronkoto and all the healthcare workers at these sites for their partnership with Institut Pasteur Dakar. We would also like to thank the people of Kédougou for their invaluable contributions to this work. We would like to thank the Yale Center for Genome Analysis (YCGA).
Funding
This work was supported by the Fogarty International Center of the NIH (K01 TW010496), the National Institute of Allergy and Infectious Diseases of the NIH (R01 AI168238), and G4 group funding (G45267, Malaria Experimental Genetic Approaches & Vaccines) from the Institut Pasteur de Paris and Agence Universitaire de la Francophonie (AUF) to AKB. This work has been produced with the financial assistance of the European Union (Grant no. DCI-PANAF/2020/420-028), through the African Research Initiative for Scientific Excellence (ARISE) pilot program. ARISE is implemented by the African Academy of Sciences with support from the European Commission and the African Union Commission. The contents of this document are the sole responsibility of the authors and can under no circumstances be regarded as reflecting the position of the European Union, the African Academy of Sciences, and the African Union Commission. AB and LGT are partially supported by an ARISE grant from the African Academy of Sciences (ARISE-PP-FA-056).
Author information
Contributions
A.K.B. conceived the experiments. A.K.B., L.S., and Z.S. supervised the research. A.B., L.G.T., M.N.P., S.D.S., F.D., R.L., A.C., N.G., K.M., A.J.M., A.T., B.D.S., and A.M. collected the samples. A.K.B. and A.B. assisted with geolocation. A.B., L.G.T., M.N.P., S.D.S., and F.D. conducted the experiments. Y.G., Z.S., and S.D.P. performed structure modelling. A.B., L.G.T., S.L., N.G., and A.K.B. analysed the results. A.B., L.G.T., and A.K.B. wrote the manuscript. A.B., L.G.T., A.K.B., J.L.A.N., I.V.W., Z.S., S.D.P., and L.S. reviewed and edited the manuscript. All authors reviewed and approved the final manuscript.
Ethics declarations
Competing interests
The authors declare no competing interests.
Additional information
Publisher’s note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.
About this article
Cite this article
Ba, A., Thiam, L.G., Pouye, M.N. et al. Genetic diversity in the Plasmodium falciparum next-generation blood stage vaccine candidate antigen PfCyRPA in Senegal. Sci Rep 16, 5661 (2026). https://doi.org/10.1038/s41598-026-36257-z
DOI: https://doi.org/10.1038/s41598-026-36257-z

