Introduction

Malaria is a devastating disease that continues to evade efforts towards eradication. The estimated 249 million cases and 608,000 deaths in 2022 continue the worrying trend of unmet World Health Organisation case and mortality reduction milestones1. The cause of this is multifactorial, but a major threat to our continued ability to move towards disease eradication is the growing emergence of anti-malarial drug resistance1. The principal causative agent of malaria, Plasmodium falciparum, has been observed to have at least partial resistance or altered susceptibility to almost every drug used to treat the disease2,3. Indeed, there is a desperate need for the design of new drugs or the reuse of old ones, and for better chemotherapeutic strategies to manage the evolution of resistance to them4,5.

The P. falciparum chloroquine resistance transporter (PfCRT) is a major resistance mechanism in malaria2. The protein is a 49 kDa member of the drug/metabolite transporter superfamily and features 10 transmembrane helical domains (Fig. 1)6,7. PfCRT is located in the digestive vacuole (DV) of the malaria parasite, where it transports hemoglobin-derived peptides into the parasite cytosol for nutritional purposes8,9. The peptides transported can be from 4 to 11 residues long and are chemically diverse, but there is a preference for those with a negative charge8,10. This peptide transport is essential for the parasite, and knocking out the PfCRT gene results in parasite death10. Parasites carrying PfCRT isoforms with a reduced capacity to transport peptides, or with conditionally knocked-down PfCRT expression, accumulate peptides within the DV, causing osmotic swelling of the organelle10,11,12.

Fig. 1: The structure of PfCRT and the mutations in the CQ resistant strain Dd2.

A Side view. B View from the vacuolar side of the membrane.

PfCRT has been implicated in modulating the susceptibility to a number of antimalarial compounds, with mutant isoforms able to transport a range of these drugs, including amodiaquine, piperaquine, quinine, mefloquine, lumefantrine, and in particular chloroquine (CQ)13,14,15,16. As the protein’s name suggests, PfCRT first gained notoriety as the resistance mechanism to the former frontline antimalarial, CQ. The drug’s safety profile, cost-effectiveness and high efficacy made it ideal for use in mass drug administrations against malaria17. The drug’s primary target is in the DV of the parasite, the organelle in which endocytosed hemoglobin is digested18,19. CQ blocks the formation of hemozoin crystals, leading to the build-up of toxic hemoglobin-derived heme and the subsequent death of the Plasmodium parasite. To reach its target, CQ is first either transported into the DV via PfMDR1 or PfAAT1, or diffuses through the membrane14,20,21. The acidic pH of the DV promotes the protonation of the drug, giving it a +2 charge and trapping it within the organelle via weak-base trapping21. Mutations in PfCRT confer resistance by allowing it to transport CQ out of the DV, away from its target13,22,23.

CQ resistance first evolved in 1957, and over the following decades it spread throughout malaria-endemic regions, causing an increase in mortality rates24,25,26. All CQ-resistant field isolates of PfCRT possess at least four mutations, with some having as many as ten, but all contain the K76T mutation22. This is the only mutation that is necessary (though not sufficient) for CQ resistance, and it is the major marker used to identify resistance22,27. There are two predominant mutational lineages that evolved in response to CQ pressure, the most common being exemplified by the well-characterized Dd2-PfCRT isoform. It possesses eight mutations: M74I, N75E, K76T, A220S, Q271E, N326S, I356T, and R371I22. The mutations are mostly distributed around the central cavity of the protein (Fig. 1), and shift the overall charge of the cavity from neutral to negative6,28. This has been hypothesized to be important in allowing the transport of CQ and multiple other positively charged drugs6,22,28,29.

Interestingly, as CQ use subsided due to widespread resistance, the prevalence of the K76T mutation began to decline in some populations3. This suggests that in the absence of drug pressure, CQ resistance is associated with a fitness cost, as resistance mechanisms often are30. Indeed, in growth competition experiments between transgenic parasites carrying the wildtype (CQ-sensitive) 3D7-PfCRT and those carrying Dd2-PfCRT, the CQ-resistant isoform is outcompeted31. The difference in fitness is likely explained by altered hemoglobin catabolism in the Dd2-PfCRT-containing parasites, and the protein’s reduced capacity for peptide transport8,31. Being able to transport CQ appears to interfere with this natural function of the protein, as there is a strong negative correlation between an isoform’s CQ transport rate and its transport rate of the test peptide VDPVNF8. When a positive charge is reintroduced to the cavity at residue 76 (106/1-PfCRT) or adjacent to it (E75K, 2300-PfCRT), CQ transport is abrogated and PfCRT’s ability to transport VDPVNF is restored to the levels observed in the wildtype 3D7-PfCRT isoform8. In addition, not every mutation along the likely evolutionary path from 3D7-PfCRT to Dd2-PfCRT increases CQ transport, suggesting that some of these mutations (M74I and I356T) may instead function to restore parasite fitness22,32.

We hope to understand how and why CQ resistance comes at the cost of peptide transport and what this suggests about PfCRT’s functional constraints and future evolution. To do so, here we perform molecular dynamics (MD) simulations to ascertain the binding sites and access pathways of CQ and a number of short peptides. These simulations are done with the CQ-sensitive 3D7-PfCRT and the CQ-resistant Dd2-PfCRT to elucidate the effect of the resistance-associated mutations on the interactions with the ligand. Our results allow us to determine the substrate binding sites, and to explain the polyspecific nature of peptide transport and the resulting compromise between drug resistance and parasite fitness.

Results

Chloroquine’s access to the cavity recapitulates expected PfCRT isoform relationships

The CQ-sensitive wild-type (3D7) parasites are unable to evade the toxic effects of CQ because 3D7-PfCRT is unable to transport the drug out of the DV in sufficient amounts13. In contrast, Dd2-PfCRT is able to do so, and thereby confers resistance to the parasite.

We investigated the basis of this differential transport ability of the 3D7-PfCRT and Dd2-PfCRT isoforms, and the role of Lys76 using the Dd2-76K-PfCRT isoform, by performing MD simulations of the proteins in a bilayer of 5 lipid species with explicit solvent molecules, in the presence of vacuolar CQ. The reintroduction of Lys76 in the Dd2-76K-PfCRT isoform completely abolishes CQ transport22. The 3D7 and Dd2-PfCRT isoform structures were generated by mutating the 7G8-PfCRT cryo-EM structure (list of mutations in Methods). We first ran coarse-grained simulations of this system to equilibrate the membrane and determine whether there are any specific lipid interactions. We found a cholesterol binding site and an enrichment of POPS lipids around the protein (Supplementary Fig. 1). The membrane was then backmapped to full atomistic detail and the 3D7 and Dd2-PfCRT isoforms were placed in the equilibrated membrane. All full-atom simulation systems in this study were run for at least 9 µs each, until the free energy for ligands to enter the cavity stabilized (Supplementary Fig. 2).

By determining the free energy surface (FES), which quantifies how likely one is to observe an atom of CQ at any position within the simulation system, we see that the positively charged CQ does not permeate the membrane, consistent with CQ being weak-base-trapped in the DV (Fig. 2A–C)21. There is also a favorable region around the solution-exposed vacuolar side of PfCRT that may help direct CQ into the cavity and that is similar across all isoforms. However, differences between isoforms in the FES are particularly evident in their respective binding cavities. CQ samples the cavity region in all three isoforms, but access is much more favorable in Dd2-PfCRT: the drug spends the most time in the cavity of Dd2-PfCRT (an average of 85.5 ± 3.4% of simulation frames, compared to 50.0 ± 4.9% in 3D7-PfCRT and 49.7 ± 7.0% in Dd2-76K-PfCRT) and penetrates deeper into it (Fig. 2D). 3D7-PfCRT features no minima (<0 kJ mol−1) in the cavity, consistent with its negligible CQ transport. In Dd2-PfCRT, the cavity is far more accessible and features a centrally located minimum of −3 kJ mol−1. There is a higher-energy region directly above this minimum, but it is bordered by two favorable paths (also −3 kJ mol−1) that would allow CQ to traverse from solution to the central minimum. When Lys76 is reintroduced in Dd2-76K-PfCRT, the centrally located minimum is no longer present. However, there is still a −1 kJ mol−1 minimum along the left-hand path, and a minimum of −3 kJ mol−1 near where the right-hand path would enter the cavity. This suggests that the centrally located Lys76 obstructs the potential binding site of CQ, and that the K76T mutation removes the conflicting charged interactions and potentiates the paths to the binding site. That the paths to the binding site are still partially accessible in Dd2-76K-PfCRT suggests that the other mutations in Dd2-PfCRT are responsible for the formation of these pathways.
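To make the link between sampling and free energy concrete, the following is a minimal sketch of how a gridded count of ligand-atom positions can be converted into a free energy surface by Boltzmann inversion, with bulk solution as the zero reference. It is an illustration under stated assumptions rather than the analysis pipeline used in this study: the function name, the grid layout, the 310 K temperature and the choice to reference against the mean bulk count are all hypothetical.

```python
import math

KB = 0.0083144626  # Boltzmann constant in kJ mol^-1 K^-1


def free_energy_surface(counts, bulk_cells, temperature=310.0):
    """Convert a 2D histogram of ligand-atom counts into free energies (kJ/mol).

    counts     : list of rows, each a list of non-negative counts per grid cell
    bulk_cells : list of (row, col) cells assumed to represent bulk solution
    temperature: assumed simulation temperature in kelvin (hypothetical default)

    Returns a grid of the same shape; cells that were never sampled are None.
    """
    kT = KB * temperature
    # Reference density: mean count over the cells chosen as bulk solution.
    bulk_mean = sum(counts[r][c] for r, c in bulk_cells) / len(bulk_cells)
    if bulk_mean <= 0:
        raise ValueError("bulk reference region was never sampled")
    fes = []
    for row in counts:
        # Boltzmann inversion: more sampling than bulk gives a negative free energy.
        fes.append([None if n == 0 else -kT * math.log(n / bulk_mean) for n in row])
    return fes


# Toy example with invented counts (not data from the study).
toy_counts = [[10, 10], [20, 0]]
toy_fes = free_energy_surface(toy_counts, bulk_cells=[(0, 0), (0, 1)])
```

In this toy grid the cell sampled twice as often as bulk comes out at about −1.8 kJ mol−1, which gives a sense of scale for the −1 to −3 kJ mol−1 minima discussed above.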

Fig. 2: The accessibility of CQ to the PfCRT cavity.

The free energy surfaces of CQ across the A 3D7-PfCRT, B Dd2-PfCRT, and C Dd2-76K-PfCRT systems. The surfaces are displayed across the X and Z dimensions of the simulation systems (Å), and averaged across a slice of the Y dimension that spans the width of the cavity. The system is oriented such that the vacuolar side of the protein is on the upper side of the graph, and the cytosolic side on the lower. The color indicates the free energy of an atom of CQ at that position within the simulation system, with red indicating regions where ΔG > 0 and blue regions where ΔG < 0. The free energy is relative to the average of a slice in bulk solution. The solid line represents the position of the PfCRT protein in that simulation system, and the interior dotted line the region of Lys76’s motion. For Dd2-PfCRT, the range of Lys76 is taken from the corresponding 3D7-PfCRT simulation. D The average number of simulation frames for which there is a CQ molecule in the PfCRT cavity, as a percentage of the number of frames in that replicate. Error bars represent the standard error. n = 20 simulation replicates, T-test (two-sided): p = 1.0 × 10−6, p = 0.97.

Distinct CQ binding sites exist in both 3D7 and Dd2 PfCRT isoforms

To define the most likely binding positions of CQ in the protein, we clustered the positions of CQ by RMSD using the GROMOS algorithm, across the entire set of trajectories of each simulation system. In 3D7-PfCRT, the top ten poses all sit either partially or entirely outside of the binding cavity, as would be expected from the FES, which does not show any minima in the cavity (Figs. 2A, 3A). The highest ranked pose, along with a number of others, is located around the loop connecting TM helices 9 and 10. The number of clusters at this location indicates a degree of flexibility in the binding here. The highest ranked cluster is almost twice as populated as the second most populated cluster, which indicates that 3D7-PfCRT possesses a well-defined binding site at this location.
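For readers unfamiliar with the GROMOS clustering scheme, the sketch below illustrates its logic on a precomputed pairwise RMSD matrix: the pose with the most neighbours within a cutoff becomes a cluster centre, that pose and its neighbours are removed, and the procedure repeats. This is a plain-Python illustration only; the function name, the cutoff value and the toy distances are hypothetical and are not taken from the study.

```python
def gromos_cluster(dist, cutoff):
    """Cluster poses with the GROMOS neighbour-counting scheme.

    dist   : symmetric matrix (list of lists) of pairwise RMSDs
    cutoff : RMSD threshold below which two poses count as neighbours

    Returns a list of (centre_index, member_indices), largest cluster first.
    """
    remaining = set(range(len(dist)))
    clusters = []
    while remaining:
        best_centre, best_members = None, []
        for i in sorted(remaining):
            # Neighbours of pose i among the poses not yet assigned (includes i).
            members = [j for j in sorted(remaining) if dist[i][j] <= cutoff]
            if len(members) > len(best_members):
                best_centre, best_members = i, members
        clusters.append((best_centre, best_members))
        # Remove the whole cluster before searching for the next centre.
        remaining -= set(best_members)
    return clusters


# Toy example: five poses on a line, with invented positions.
positions = [0.0, 0.1, 0.2, 5.0, 5.1]
toy_dist = [[abs(a - b) for b in positions] for a in positions]
toy_clusters = gromos_cluster(toy_dist, cutoff=0.15)

# Cluster populations relative to the top cluster, as plotted in the bar graphs.
top_size = len(toy_clusters[0][1])
relative_populations = [len(members) / top_size for _, members in toy_clusters]
```

Because each new centre is chosen only among the poses left over, cluster sizes can never increase from one rank to the next, which is why populations can be reported relative to the top cluster.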

Fig. 3: Clustering of CQ positions across all trajectories of the system, and the cluster populations relative to the top cluster.

A The relative populations of the top ten most populated clusters across the twenty 3D7-PfCRT CQ systems, and B their positions relative to the protein. Each color on the bar graph corresponds to the same colored CQ position on the protein with the most favorable binding position highlighted in blue on the structure. C Relative cluster population of Dd2-PfCRT, and their positions (D).

The cluster positions in Dd2-PfCRT are more dispersed and occur deeper in the cavity than in 3D7-PfCRT (Fig. 3B), in line with the minima seen in the FES (Fig. 2B). The positions seen in 3D7-PfCRT appear to be destabilized in Dd2-PfCRT, and instead a series of poses is seen further into the cavity, ending in the highest ranked pose (blue). The highest ranked pose aligns with the position of the minimum in the Dd2-PfCRT cavity (Fig. 2B). It is also over twice as populated as the next ranked pose. We therefore designate this position as the Dd2-PfCRT CQ binding site.

The binding sites of CQ in 3D7 and Dd2-PfCRT

To investigate how the binding site for CQ changes from 3D7-PfCRT to Dd2-PfCRT, we quantified the interaction of each residue of PfCRT with CQ as it sits in the respective binding sites of each isoform.

In the highest ranked 3D7-PfCRT pose, CQ’s di-aminoalkane ‘tail’ sits half out of the cavity and is in association with the loop connecting transmembrane helices 9 and 10 (Fig. 4A). The residues which support CQ in this position are in close proximity to the drug and feature a variety of interaction types (Fig. 4C). The most favorable interaction is a backbone hydrogen bond with Val370 at −50 kJ mol−1 (Fig. 4C, D). There is also a halogen bond between CQ’s chlorine and Glu198, electrostatic interactions between Glu372 and one of the protonated nitrogens, and cation-pi interactions between the quinoline group and Arg371 (Fig. 4C, D). This last arginine is mutated to an isoleucine in Dd2-PfCRT. There are two residues which have an unfavorable (>0 kJ mol−1) potential energy with CQ: Lys80 and Gly153. The positive charges of CQ and Lys80 repel one another. The lysine’s high, central position likely contributes to the large unfavorable region at the center of the 3D7-PfCRT cavity (Fig. 2A). Apart from Arg371, none of the residues at the Dd2-PfCRT mutation loci appear to contribute favorably or unfavorably to the 3D7-PfCRT CQ binding site, with average potential energies with CQ of near zero kJ mol−1 (Fig. 4D).

Fig. 4: The CQ binding site in 3D7-PfCRT and the interactions which support it.

A A side view of 3D7-PfCRT. B A top-down view of 3D7-PfCRT from the vacuolar side of the membrane. C A close-up of CQ in its 3D7-PfCRT binding site. CQ is shown in yellow; residues whose average potential energy with CQ is in the top 5% of those <0 kJ mol−1 are colored in blue, and those in the top 5% of those with an average potential energy >0 kJ mol−1 are colored in red. Transmembrane domain (TMD) alpha-helices are labeled. D The average potential energy between CQ at the 3D7-PfCRT binding site and a series of residues. The residues shown are those with the highest 5% of average potential energies, the lowest 5%, and residues that are mutated in Dd2-PfCRT, which are marked with an asterisk. Error bars represent the standard deviation of the potential energy. n = 8690 simulation frames in the binding site.

The binding pose of CQ in Dd2-PfCRT is located in the upper-central region of the cavity, and is in close association with transmembrane helices 4 and 9 (Fig. 5A–C). The dominant interaction is an electrostatic bond between Glu198 and the protonated nitrogen on the quinoline group, with an average potential energy of almost −180 kJ mol−1 (Fig. 5C, D). The next most favorable interaction, at −26 kJ mol−1, is a halogen bond with both Asp377 and Arg374. There are also strong hydrophobic interactions with Gly353 and Pro354, contributing −20 kJ mol−1. Depending on the conformation, additional hydrogen bonds occur between the protonated nitrogen on CQ’s tail and Gln156 or Gln352.

Fig. 5: The Dd2-PfCRT CQ binding site and the interactions which support it.

A A side view of Dd2-PfCRT and the two minimum free energy paths as calculated by MULE from bulk solution to the CQ binding site. B A top down view from the vacuolar side of the membrane, of the CQ binding site in Dd2-PfCRT. C A close-up of CQ in its Dd2-PfCRT binding site. The residues with the lowest and highest 5% average potential energies are colored blue and red, respectively. The two minimum free energy paths that CQ can take to the binding site are also displayed and each CQ position colored on a blue to red scale, with blue being low free energy. Transmembrane domain (TMD) alpha-helices are labeled. D The average potential energy between CQ at the Dd2-PfCRT binding site and a series of residues. The residues shown are those with the lowest and highest 5% average potential energies, and also those that are mutated in Dd2-PfCRT and are marked with an asterisk. Error bars represent the standard deviation of the potential energy. n = 1190 simulation frames in the binding site.

To confirm the importance of the electrostatic interaction between Glu198 and the protonated nitrogen of the quinoline moiety, we performed additional simulations of CQ at its binding site with the quinoline nitrogen deprotonated (Supplementary Fig. 3). When CQ was deprotonated, it moved out of the binding site within 20 ns, confirming the importance of this interaction in CQ’s binding to Dd2-PfCRT.

Three residues do not support CQ in its binding position: Lys80, Ser157 and Gln161. The unfavorable energies of the latter two likely arise from steric clashes, as they are brought into proximity by the intense interaction CQ has with Glu198 (Fig. 4C, D). Lys80 is unfavorable, as it is in 3D7-PfCRT, and likely contributes to the energy barrier above CQ’s binding minimum observed in the FES (Fig. 2B).

The potential energies between residues mutated in Dd2-PfCRT and CQ are all almost zero kJ mol−1 (Fig. 5D). This suggests that the Dd2-PfCRT CQ binding site also exists in 3D7-PfCRT, but is simply obstructed by Lys76 (as seen in the FES, Fig. 2A–C). The presence of distinct pathways to the CQ binding site in Dd2-PfCRT that are not present in 3D7-PfCRT (and only partially so in Dd2-76K-PfCRT) suggests that the other, non-K76T mutations may play a role in the formation of these paths (Fig. 2A–C).

Dd2-PfCRT mutations modulate binding site access

To further explore the role that PfCRT mutations may play in opening (or obstructing) access to the binding site along the two pathways, we sought to first quantitatively define the two paths to the binding site in the FES of Dd2-PfCRT CQ (Fig. 2B).

The package MULE was used to calculate the minimum free energy paths from bulk solution to the CQ Dd2-PfCRT binding site, and to define the paths’ coordinates (Figs. 2A, 5A). We next calculated the potential energy between CQ and all residues within the protein when the drug was at a point along either of these paths. To assess whether any residues in 3D7-PfCRT may obstruct access as K76 does, the paths were then superimposed onto the 3D7-PfCRT system, the analysis was repeated, and the difference in potential energy between isoforms was calculated (Fig. 6).
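The idea of a lowest-cost route across a gridded free energy surface can be illustrated with a generic shortest-path search. The sketch below is not MULE and makes no claim about how MULE works internally; it is a plain Dijkstra search, shown under the assumption that the cost of stepping into a grid cell is its free energy shifted so that the global minimum costs zero. The function name, the 4-neighbour connectivity and the toy surface are all hypothetical.

```python
import heapq


def lowest_cost_path(fes, start, goal):
    """Dijkstra search over a 2D grid of free energies (None = unsampled cell).

    The cost of stepping into a cell is its free energy shifted so the lowest
    sampled value costs zero, plus 1e-6 so that every step has a positive cost.
    Returns the list of (row, col) cells from start to goal, or None if the
    goal cannot be reached through sampled cells.
    """
    g_min = min(g for row in fes for g in row if g is not None)
    n_rows, n_cols = len(fes), len(fes[0])
    best = {start: 0.0}
    parent = {}
    heap = [(0.0, start)]
    while heap:
        cost, cell = heapq.heappop(heap)
        if cell == goal:
            # Walk back through the parents to recover the path.
            path = [cell]
            while cell in parent:
                cell = parent[cell]
                path.append(cell)
            return path[::-1]
        if cost > best.get(cell, float("inf")):
            continue  # stale heap entry
        r, c = cell
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if 0 <= nr < n_rows and 0 <= nc < n_cols and fes[nr][nc] is not None:
                new_cost = cost + (fes[nr][nc] - g_min) + 1e-6
                if new_cost < best.get((nr, nc), float("inf")):
                    best[(nr, nc)] = new_cost
                    parent[(nr, nc)] = cell
                    heapq.heappush(heap, (new_cost, (nr, nc)))
    return None


# Toy surface (kJ/mol, invented values): a high-energy column separates the
# start in the top-left corner from a minimum on the right-hand side.
toy_surface = [
    [0.0, 5.0, 0.0],
    [-1.0, 5.0, -3.0],
    [-1.0, -2.0, -3.0],
]
toy_path = lowest_cost_path(toy_surface, start=(0, 0), goal=(1, 2))
```

In the toy surface the search skirts the high-energy column rather than crossing it, analogous to the way the paths described above border the higher-energy region instead of passing through it.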

Fig. 6: The difference in CQ’s potential energy along its two paths to the binding site between Dd2-PfCRT and 3D7-PfCRT mutation loci.

The potential energy values are shown from the binding site out to solution. Gray areas represent positions along the path at which CQ was never present in any simulation frame. For clarity, Path 1 was truncated at 75 Å. The color bar range has also been restricted to −25 kJ mol−1; the largest potential energy difference between 3D7 and Dd2-PfCRT was for the Q271E mutation in Path 2, at −146 kJ mol−1.

As CQ does not stably bind in the 3D7-PfCRT cavity (Fig. 2A), we find there are many points along the paths where CQ is not found (gray areas in Fig. 6), especially close to the binding site, due to the presence of Lys76 (as evidenced by the Dd2-76K-PfCRT FES, Fig. 2C). However, there is sufficient sampling to see how some mutations alter movement along the paths. Along Path 1, we see that the R371I mutation initially makes the approach to the binding site more favorable, but as CQ goes through the 3D7-PfCRT CQ binding site, there is an increase in the potential energy (Figs. 5B, 6). There are also slightly favorable reductions in energy with I356T and K76T close to the Dd2-PfCRT binding site. In Path 2, we see that R371I again increases the potential energy on the approach to the binding site (Fig. 6). This is somewhat unexpected, as the removal of a positive charge might be expected to increase CQ’s ability to enter the cavity. However, as seen in 3D7-PfCRT, there is a supportive cation-pi interaction that would be lost upon mutation. The most striking change is that of the Q271E mutation: the introduction of the negative charge changes the potential energy along Path 2 by an average of −50 kJ mol−1. K76T has a larger effect along Path 2 than along Path 1, increasing the favorability in a region where CQ is not frequently present, which indicates that this is a particularly high-energy region. I356T slightly increases the favorability. The other Dd2-PfCRT mutations (M74I, N75E, A220S, and N326S) had no interaction with CQ at any point along either of the paths. This suggests that these residues are not involved in the process of drug access and binding, and may instead perform a function relating to the transport cycle or peptide transport.

The relative peptide accessibility of PfCRT’s binding cavity

To investigate the molecular basis of the compromise made between peptide transport and CQ transport (and the resulting fitness cost), we again performed MD simulations of 3D7-PfCRT and Dd2-PfCRT with high concentrations of one species of the peptides DPVN, PENF, or PVNF, as we did with CQ. The ability of PfCRT to transport these has previously been studied using trans-stimulation assays in the Xenopus laevis heterologous expression system8. DPVN and PENF are substrates of 3D7-PfCRT but not of Dd2-PfCRT. PVNF is a substrate of both isoforms. We additionally performed simulations of 3D7-PfCRT and the peptide PEEK, which is not a substrate of either isoform. This pattern of substrates and non-substrates was chosen to allow us to infer whether our MD simulations could reproduce relative changes in accessibility. The peptides were 4 residues in length to minimize the peptide’s conformational space, while still being of a size that is transportable by PfCRT.

The peptides DPVN, PENF, and PVNF are able to access the 3D7-PfCRT cavity for at least 65% of the simulation frames (Fig. 7A). In contrast, DPVN and PENF show a significant (p < 0.001) reduction in accessibility to the binding cavity of Dd2-PfCRT, being present in an average of 17.5 ± 6.9% and 11.9 ± 4.6% of simulation frames, respectively. Conversely, PVNF maintains its presence in the binding cavity of Dd2-PfCRT, where it is found in 60.4 ± 6.0% of simulation frames. PEEK has only low accessibility to the 3D7-PfCRT cavity, similar to that seen for DPVN and PENF in Dd2-PfCRT (Fig. 7A, B). The FES show similar trends to cavity accessibility. 3D7-PfCRT with DPVN, PENF, and PVNF features minima of −3, −3, and −4 kJ mol−1, respectively (Fig. 7C). The PVNF 3D7-PfCRT minimum is much broader and deeper than those of DPVN and PENF. For all three peptides, favorability in the Dd2-PfCRT cavity is substantially reduced (Fig. 7D). The minimum that DPVN had in 3D7-PfCRT has largely disappeared, though there is a slight minimum of −1 kJ mol−1 towards the bottom of the cavity. PENF’s access to the cavity is much more restricted and no minimum exists. PVNF’s favorability in the cavity has remained the most robust across peptides and isoforms, though it is reduced, with the lowest free energy minimum being −2 kJ mol−1 and located higher up in the cavity.
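The occupancy figures quoted above are means across replicates with a standard error, compared between isoforms with a two-sided t-test. The sketch below shows one way such numbers could be computed from per-frame "ligand in cavity" flags. It is an illustration only: the helper names are hypothetical, the text does not specify which t-test variant was used (Welch's unequal-variance form is assumed here), and the example values are invented rather than taken from the simulations.

```python
import math
import statistics


def occupancy_percent(in_cavity_flags):
    """Percentage of frames in one replicate with a ligand in the cavity."""
    return 100.0 * sum(in_cavity_flags) / len(in_cavity_flags)


def mean_and_sem(values):
    """Mean and standard error of the mean across replicates."""
    sem = statistics.stdev(values) / math.sqrt(len(values))
    return statistics.mean(values), sem


def welch_t(a, b):
    """Welch's t statistic for two independent samples (assumed test variant)."""
    var_a, var_b = statistics.variance(a), statistics.variance(b)
    return (statistics.mean(a) - statistics.mean(b)) / math.sqrt(
        var_a / len(a) + var_b / len(b)
    )


# Invented per-replicate occupancies (percent), for illustration only.
isoform_a = [10.0, 20.0, 30.0]
isoform_b = [1.0, 2.0, 3.0]
mean_a, sem_a = mean_and_sem(isoform_a)
t_stat = welch_t(isoform_a, isoform_b)
```

Turning the t statistic into a two-sided p-value requires the t-distribution; in practice a statistics library such as SciPy (`scipy.stats.ttest_ind` with `equal_var=False`) would return both at once.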

Fig. 7: The accessibility of 4-residue peptides to the binding cavities of 3D7 and Dd2-PfCRT.

A The average number of frames across simulation replicates in which a peptide is present in the binding cavity of the PfCRT isoform, as a percentage of simulation frames. The error bars reflect the standard error. n = 20 simulation replicates, T-test (two-sided): p = 1.8 × 10−5, p = 1.2 × 10−9, p = 0.1. Free energy surfaces of the simulation systems B 3D7-PfCRT PEEK, C 3D7-PfCRT DPVN, PENF, PVNF, and D Dd2-PfCRT DPVN, PENF, PVNF. The free energy values are averaged across a portion of the Y-axis that encapsulates the binding cavity region of PfCRT, as was the case in the CQ systems. FES are zeroed on the average free energy value of a slice of bulk solution. The solid line represents the position of the PfCRT protein in that simulation system, and the interior dotted line the region of Lys76’s motion. For Dd2-PfCRT, the range of Lys76 is taken from the 3D7-PfCRT simulation with the same peptide.

The relative difference in the cavity presence of each peptide between isoforms is consistent with their respective trans-stimulation assays: cavity accessibility in our simulations is predictive of whether or not the peptide is a substrate. We also note that the relatively infrequent access of the 3D7-PfCRT cavity by PEEK demonstrates that the ability of our MD simulations to reproduce differences in isoform substrate discrimination is not the result of a coincidental conformational closure or substrate incompetency in the Dd2-PfCRT structure generated during system equilibration.

Peptides exhibit diverse binding behavior

We next quantified the most likely 3D7-PfCRT peptide binding sites by clustering the peptide positions while they were in the cavity (Fig. 8A, D, G). Each of the peptides exhibited different binding behaviors; DPVN had three highly populated sites, PENF had two, and PVNF had only the one very well differentiated site. The sites between peptides all differ in position and conformation, but are all relatively central within the 3D7-PfCRT cavity (Fig. 8).

Fig. 8: The binding sites of peptides to 3D7-PfCRT.

A The relative cluster populations of the top ten most populated clusters of DPVN. B The top three 3D7-PfCRT binding poses of DPVN, viewed from the side. C The top three 3D7-PfCRT binding poses of DPVN, viewed from the top. The highest ranked pose is highlighted in orange. The residues displayed are those with the top 5% of favorable potential energies with the peptide while in the cavity. The same is repeated for PENF in (D, E, F), and for PVNF in (G, H, I).

The scattered binding positions of the peptides mean that quite a number of different residues appear to support the various binding sites. However, Lys76 forms a common interaction with every peptide at all of their highly populated binding positions (Fig. 8B, E, H). In the three DPVN sites, Lys76 forms a backbone hydrogen bond with the peptide’s valine residue, an electrostatic bond with the aspartate, and an electrostatic bond with the peptide’s C-terminus, in the poses ranked 1, 2 and 3, respectively (Fig. 8C). For the two PENF binding sites, Lys76 has an electrostatic bond with the peptide's glutamate in the first ranked pose, and with the C-terminus in the second (Fig. 8F). In PVNF, Lys76 hydrogen bonds with the asparagine side chain (Fig. 8I).

Key common interactions that support peptide binding are lost in Dd2-PfCRT

Due to the diversity of the binding positions of the peptides, we instead sought to characterize what residues were supportive (or disruptive) of the peptides being in the cavity, regardless of the bound pose.

After calculating the potential energy between each residue of PfCRT and the peptide in each system, for all frames in which the peptide was within the cavity, we identified the top 5% of favorably interacting residues for each peptide and assessed which were shared (Fig. 9A). All peptides shared favorable interactions with the residues Lys76, Lys80, His97, Phe145, Leu148, Gln156, and Leu160. There were also residues that supported cavity binding and access for only some of the peptides. Notably, Asn75, which is mutated to glutamate in Dd2-PfCRT, supports DPVN, and Arg371, which is mutated to isoleucine, supports both DPVN and PENF.
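The selection of shared residues can be sketched as a ranking step followed by a set intersection. The example below assumes that "top 5%" means the most negative 5% of the residues with a favorable (<0 kJ mol−1) average energy, rounded up to at least one residue; that reading, the function names and the residue labels and energies are all hypothetical and serve only to illustrate the logic behind a Venn diagram of this kind.

```python
import math


def top_favorable(mean_energy, fraction=0.05):
    """Residues in the most favourable `fraction` of negative average energies.

    mean_energy : dict mapping residue label -> average potential energy (kJ/mol)
    """
    # Keep only favourable (negative) energies, most negative first.
    favorable = sorted((e, res) for res, e in mean_energy.items() if e < 0)
    if not favorable:
        return set()
    n_keep = max(1, math.ceil(fraction * len(favorable)))
    return {res for _, res in favorable[:n_keep]}


def shared_residues(per_peptide_energy, fraction=0.05):
    """Residues that fall in the top fraction for every peptide."""
    sets = [top_favorable(e, fraction) for e in per_peptide_energy.values()]
    return set.intersection(*sets) if sets else set()


# Invented energies for hypothetical residues r1..r4 and two hypothetical peptides.
toy_energies = {
    "peptide_1": {"r1": -50.0, "r2": -10.0, "r3": -1.0, "r4": 5.0},
    "peptide_2": {"r1": -30.0, "r2": -2.0, "r3": -20.0, "r4": -1.0},
}
# A large fraction is used here only because the toy system has four residues.
common = shared_residues(toy_energies, fraction=0.5)
```

Residues that appear in one peptide's set but not in the intersection correspond to the peptide-specific regions of the Venn diagram.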

Fig. 9: The residue interactions that support peptide binding and access.

A A Venn diagram of the top 5% of favorable residue interactions for each peptide in the 3D7-PfCRT cavity. B The average potential energy between PfCRT residues and the peptide of interest. Residues for which the potential energy was greater than −10 kJ mol−1 were grouped in ‘other’. The residues are colored by their type, with positively charged residues in blue/purple, negatively charged in red, polar in green, and hydrophobic in light gray.

The strength of the interactions, quantified by the potential energy between the peptides and the protein residues while the peptides are in the PfCRT cavity, is shown in Fig. 9B. For all of the 3D7-PfCRT peptide systems, the positively charged residues Lys76, Lys80, and the protonated His97 are major contributors to the favorable cavity interactions, indicating the importance of charge complementarity in peptide recognition. Importantly, Lys76 is the most dominant interaction across all three of the peptides, despite the different binding positions and interaction types with the residue. Naturally then, in Dd2-PfCRT, the large energetic contribution from Lys76 is lost. Lys80 instead becomes the most favorably contributing residue; however, the total interaction energy is significantly reduced, indicative of weakened or lost binding of the peptides to this isoform (Fig. 9B).

Discussion

In this study we performed MD simulations to elucidate the atomistic basis of PfCRT-mediated CQ resistance and why this conflicts with PfCRT’s polyspecific peptide transport. Our simulations revealed that Lys76 is in close proximity to the would-be CQ binding site in PfCRT, and that the removal of the lysine (among other residue changes) allows CQ to enter and bind in the cavity. We find that the Dd2-PfCRT mutations R371I and Q271E work in concert with K76T to further enable CQ to access its binding site (Fig. 10A). In contrast, peptides rely on the positive charge of Lys76 to act as an ‘anchor point’, stabilizing their diverse binding modes (Fig. 10B). Mutation of this residue would therefore reduce peptide transport, giving rise to the fitness cost of K76T-associated CQ resistance. These results confirm long-standing hypotheses on the importance of residue charge (particularly that of Lys76) in regulating CQ binding6,22,28,29.

Fig. 10: A proposed model for CQ and peptide access and binding.

A CQ is able to bind 3D7-PfCRT, but in a position that is partially out of the cavity, which would not result in high levels of transport. This binding is supported by Arg371. In Dd2-PfCRT, the K76T mutation opens up the binding site of CQ, and together with the removal of Arg371’s positive charge and the introduction of the negative charge of Q271E, gives CQ access. Lys80 prevents the direct access of CQ from the cavity, forming the two pathways. B In 3D7-PfCRT, K76 acts as an anchor point for diverse peptide binding modes. A host of other residues also support these interactions. Upon mutation to Dd2-PfCRT, the crucial K76 interaction is lost, and a number of other charge-related mutations occur that increase the negativity of the cavity, furthering the peptides' exclusion from the cavity.

Comparison to experiments

The relative difference in cavity accessibility seen in our simulations is predictive of whether or not a ligand will be a substrate of that PfCRT isoform8. This holds not only for the differences in CQ transportability between the PfCRT isoforms (Fig. 2D), but also for the isoform-dependent discrimination between peptides (Fig. 7A). Given that differences in transportability could theoretically arise from any stage within the transport cycle, the fact that our simulations of only the open-to-DV conformation are able to accurately predict substrate discrimination is a non-trivial demonstration of their ability to capture real differences in transporter-substrate behavior.

Transmembrane domain (TMD) 9 is highly conserved in PfCRT, and the Pro354 therein has been suggested to play a role in substrate binding and transport7,28. We find that CQ’s Dd2-PfCRT binding site is in close proximity to TMDs 4 and 9, and Pro354 supports it at that position (Fig. 5C, D). However, CQ is not the natural substrate – peptides are. The residues identified in this study as being important in peptide binding are not in TMD 9 (Figs. 8, 9)8,10. This is potentially because TMD 9 is involved in coordinating the conformational cycle during transport, or the binding position of CQ is also where the peptide’s co-substrate binds8.

Additionally, we find that CQ can bind to both 3D7-PfCRT and Dd2-PfCRT, albeit at different locations, agreeing with photoaffinity labeling experiments (Figs. 4, 5)33. These experiments demonstrate that CQ is able to bind to both isoforms in proximity to the loop between TMDs 9 and 10 (residues 364 to 374). Both sites in our simulations are in proximity to this loop and have direct interactions with a residue in it. Importantly, however, these sites are in different positions and are supported by different residues: Glu372 and Val370 in 3D7-PfCRT, and Arg374 in Dd2-PfCRT. This difference helps reconcile a major point of contention regarding the expression system used to measure drug transport rates in PfCRT. Transformed Dictyostelium discoideum and X. laevis oocytes do not show any transport of CQ by 3D7-PfCRT (unless trans-stimulated by peptides), yet yeast and proteoliposomes do, albeit at lower levels than Dd2-PfCRT8,13,22,34. This was difficult to reconcile with the aforementioned photoaffinity labeling experiments and with equilibrium binding experiments, both of which demonstrated that CQ binds 3D7-PfCRT6,33. The different CQ binding sites in 3D7-PfCRT and Dd2-PfCRT, and in particular the fact that the 3D7-PfCRT site is only partially in the cavity, where it likely has a reduced transportability, emphasize that in this protein, binding does not necessarily result in transport.

PfCRT’s multiple mechanisms of polyspecificity

PfCRT is able to transport a diverse array of hemoglobin derived peptides (though with bias towards negatively charged peptides) of lengths 4 to 11 residues8,10. From our simulations, we see that peptide access and binding is facilitated by Lys76, Lys80, His97, Phe145, Leu148, Gln156, and Leu160 for all peptides tested (Fig. 9A). The distribution of chemical properties amongst the structure is notable; the positively charged residues are on one side of the cavity, and the hydrophobic residues are on the other (with an additional polar residue). These residues support a variety of different binding positions among the peptides. DPVN can move between three sites, PENF between two, and PVNF has just the single well defined binding position. The sites between peptides are diverse – the majority of the cavity surface appears to be capable of binding peptides, but all make use of the same charged anchor point: Lys76. This explains the polyspecificity of the transporter – a large range of substrates can find favorable binding positions, linking the charged anchor to any of the hydrophobic residues (Fig. 10B).

The distribution of residue chemical properties and the oscillatory behavior (Fig. 8) of the peptides between multiple sites are highly reminiscent of what has been observed in other polyspecific, multidrug resistance associated proteins as well as the organic anion and cation transporters35,36,37,38,39,40,41. These diverse protein families also feature large cavities with an abundance of hydrophobic residues and, on occasion, like-charged residues. The lack of directionality in the hydrophobic and electrostatic interactions allows a variety of orientations and binding positions to be adopted within the spacious cavities42.

The influence of electrostatic interactions in aiding polyspecific binding appears to explain the differences in binding behavior between DPVN, PENF and PVNF (Fig. 8). Both DPVN and PENF feature a residue with a negatively charged side chain, allowing them to adopt a range of conformations while interacting with the positively charged residues in the cavity. Because of this, they are able to oscillate between sites in the manner suggested by Yamaguchi et al. in the multi-site (drug) oscillation hypothesis38. In this hypothesis, the numerous low affinity binding positions allow the molecule to oscillate, jumping from site to site, while still being transported at a high rate because these oscillations all occur within the large transport cavity. PVNF, being neutral and not possessing the ‘destabilizing’ negatively charged side chain, is then better able to find a stable position within the cavity.

The charge altering mutations in Dd2-PfCRT (N75E, K76T, Q271E, and R371I) make the cavity strongly negatively charged6. The range of peptides which the isoform is able to transport is reduced, likely from reduced favorability of cavity access for negatively charged peptides (of which hemoglobin derived peptides tend to be) as seen in our simulations of Dd2-PfCRT with DPVN and PENF (Fig. 7A)8. PVNF may therefore retain accessibility to the cavity of Dd2-PfCRT on account of its neutral charge.

While the range of peptide substrates decreases for Dd2-PfCRT, there is a corresponding increase in the range of drug molecules it is able to transport. Dd2-PfCRT can transport a number of antimalarial compounds such as quinine, amodiaquine, mefloquine, lumefantrine, and piperaquine in addition to non-antimalarials such as verapamil13,14,15,16. The antimalarials all share a quinoline group and are positively charged at the pH of the DV. Initially, one might think that the increased substrate range is mediated by the drugs’ shared chemical groups having a shared binding site. However, from kinetic studies and MD simulations, we know that at least piperaquine and quinine possess binding modes distinct from that of CQ (in Dd2-PfCRT at least – some evidence exists of a shared site in 7G8-PfCRT, but this only further demonstrates the intrinsic flexibility with which PfCRT can bind drugs)6,16,43. Shifting of binding site positions upon mutation is a phenomenon that is sometimes observed in multidrug resistance proteins44,45. Thus, the binding may be dependent on the general chemical features of hydrophobicity and charge, rather than specific chemical motifs. This may also help to explain the overlap in drug-substrates that PfCRT has with PfMDR1, the P. falciparum homolog of the infamous P-gp14,46. PfMDR1 also features a large amphiphilic binding cavity (though with polar residues rather than negatively charged ones) and is associated with resistance to a number of antimalarial compounds46,47. Shared drug-substrate profiles are a phenomenon that has also been noted in bacterial multidrug resistance proteins36,48.

Here we suggest an interpretation of the evolution of Dd2-PfCRT – the mutations appropriated the underlying polyspecificity of 3D7-PfCRT, and re-tuned the range of substrates that could be transported by altering the general electrostatic environment of the cavity. The principle of this has been demonstrated in mutagenesis studies of other multidrug resistance proteins49,50,51,52. This is perhaps why the Dd2-PfCRT mutations do not actively form a new binding site for CQ, but simply allow it to enter and bind to residues which were already present in 3D7-PfCRT, co-opting the underlying polyspecific binding mechanism.

Opportunities

PfCRT is an essential protein that plays a central role in the modulation of drug susceptibility, and as such has been argued to be a potential target for antimalarial action and ‘resistance-reversal’10,53,54. We concur with this argument, and the identification of the binding modes of CQ and several peptide substrates in this study could provide the basis of the rational, structure based design of inhibitors. It has been previously suggested that substrate (i.e., peptide) mimetics may be an effective starting point for inhibitor design54. However, the polyspecific mechanism of peptide recognition identified in this study may urge caution in this approach. Mimetics that are not neutrally charged may struggle to achieve sufficient affinity to inhibit transporter function, and the conflicting cavity charges may reduce their ability to target drug transporting PfCRT isoforms. Any designed inhibitors should target the core peptide binding residues of Lys76, Lys80, His97, Phe145, Leu148, Gln156, and Leu160, as this may hinder the evolution of resistance, according to the substrate envelope hypothesis55. Functional studies of mutations at these residues should first be done to confirm our hypothesized peptide-binding region.

We observe that CQ’s binding location is blocked by Lys76 in our study, and because of this multiple stepwise mutations must occur in order for the parasite to reinstate its fitness (Fig. 2)22. While this generally does induce a significant fitness cost in CQ resistant parasites (in the absence of drug pressure), the Cam734-PfCRT (M74I, N75D, K76T, A144F, L148I, I194T, A220S, Q271E, T333S) isoform has already evolved a way to subvert this56. The A144F and L148I mutations are in (or near) the hypothesized core peptide binding region, likely reinstating peptide transport by increasing hydrophobic interactions while still altering charges with N75D, K76T, and Q271E to allow for drug transport.

Limitations and future directions

CQ’s binding in the cavity is highly sensitive to changes to charged residues22. In our study we protonated Asp329 and His97 in accordance with pKa predictions by pdb2pqr, removing a negative charge and adding a positive one. Because of this, this region is unfavorable and CQ does not explore it. In an earlier, less extensive simulation study of CQ’s binding to PfCRT, Asp329 was identified as being important in binding CQ57. Its protonation in our study would not allow for this. This highlights the need for accurate determination of residue pKa values, which could be done with constant pH MD simulations58.

Our work was not able to determine the functions of the mutations M74I, N75E, A220S, N326S, and I356T. This is not entirely unexpected, as PfCRT was only present in the open-to-vacuole conformation. The mutations may function in facilitating the protein’s conformational changes in its transport cycle, restabilizing binding of the peptide co-substrate, restoring binding of peptides in a chemical space not covered in this study, or aiding CQ transport in later stages of the transport cycle. Without resolved structures of alternative conformational states, the transport cycle could be explored with residue coevolution or deep-learning based simulation approaches, which could provide further insight into these mutations’ functions59,60.

In this study we provide insights into the molecular basis of PfCRT’s polyspecific peptide and drug binding, and how these functions are mutually incompatible. We confirm long-standing hypotheses about the central role that K76 plays in the protein's transport ability. Our simulation protocol is robust and aligns with experimental data. This approach provides a potential avenue to the rational structure-based design of PfCRT inhibitors in lieu of further resolved structures.

Methods

Protein preparation

The PfCRT cryo-EM structure was downloaded from the PDB (6UKJ)6. The structure of the missing loop between residues 67 and 77 was predicted with the DaReUS-Loop webserver61. To determine the protonation states of the protein residues at the cytosolic and vacuolar pHs, PfCRT was submitted to pdb2pqr62. The pKa values of residues on the vacuolar side of the protein were evaluated against a pH of 5.5, and those on the cytosolic side against a pH of 7.7. Accordingly, residues His97, His273, and Asp329 were protonated.
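The step from a predicted pKa to an assigned protonation state can be illustrated with the Henderson-Hasselbalch relation. The sketch below is not the pdb2pqr procedure used here, and the pKa values in it are hypothetical placeholders chosen only to show how the same residue type can be assigned differently at the vacuolar and cytosolic pH.

```python
def protonated_fraction(pka: float, ph: float) -> float:
    """Fraction of a titratable group that holds its proton at a given pH
    (Henderson-Hasselbalch relation)."""
    return 1.0 / (1.0 + 10.0 ** (ph - pka))


VACUOLAR_PH = 5.5   # value stated in the text
CYTOSOLIC_PH = 7.7  # value stated in the text

# Hypothetical pKa values for illustration only; these are NOT the
# values predicted for PfCRT residues in this study.
example_pkas = {"His (hypothetical pKa 6.5)": 6.5,
                "Asp (hypothetical pKa 6.0)": 6.0}

for label, pka in example_pkas.items():
    for side, ph in (("vacuolar", VACUOLAR_PH), ("cytosolic", CYTOSOLIC_PH)):
        frac = protonated_fraction(pka, ph)
        state = "protonated" if frac > 0.5 else "deprotonated"
        print(f"{label}, {side} pH {ph}: {frac:.2f} protonated -> {state}")
```

A group is assigned as protonated when its pKa lies above the local pH, which is why the side of the membrane a residue faces matters for the assignment.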

Coarse grained molecular dynamics simulations

To equilibrate a semi-complex bilayer surrounding the protein, we first performed coarse-grained simulations. The protein was submitted to the CHARMMGUI Martini bilayer builder and embedded in a lipid bilayer of asymmetric composition63. The membrane contained POPC, POSM, POPE, POPS, and cholesterol at a ratio of 4:4:1:0:1 in the vacuolar-facing leaflet and 1:1:3:2:2 in the cytosolic-facing leaflet. As no lipid composition information exists for the DV, we chose a membrane composition that satisfied two conditions: (1) a negative charge asymmetry on the cytosolic leaflet, to match the positive charge asymmetry on the protein, which is thought to orient the protein correctly in the membrane7; and (2) a membrane thickness of 4.2 nm, as this is the width of the DV membrane64. The human erythrocyte membrane satisfied these conditions, and so a simplified version of its composition was used as the basis of the coarse-grained membrane65. The system was solvated and Na+ and Cl- ions were added at a concentration of 150 mM, resulting in a box size of 17 × 17 × 11 nm with 31046 beads.

All coarse-grained simulations were performed using Gromacs 2021, with the Martini 2.2 forcefield and the Elnedyn22 elastic network with a force constant of 500 kJ mol-1 to maintain the protein’s structure63,66,67. The system was first energy minimized using the steep integrator for 5000 steps with the van der Waals forces scaled to 0.01, before being minimized again with unscaled van der Waals interactions. Molecular dynamics equilibration runs were then performed, with position restraints on the protein and the lipid head groups being progressively relaxed and the time step increased from 5 fs to 20 fs. After equilibration, the production run was performed for 16 μs with the Parrinello-Rahman barostat keeping the system pressure at 1 atm, and the temperature maintained at 300.65 K using the v-rescale thermostat68,69.

In preparation for fully atomistic simulations, the simulation box size was reduced. The system configurations from two random frames of the production run were reduced to dimensions of ~9.1 × 8.5 × 10.6 nm, allowing for 2 Debye lengths on either side of the protein. After size reduction, these two frames had 8603 and 8089 beads, respectively. The two resulting smaller simulation systems were then equilibrated and run for a further 15 μs until the membrane had relaxed to the new box size.
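For orientation, the size of the Debye-length margin can be estimated from the standard expression for a 1:1 electrolyte. This is a rough sketch rather than the calculation used for the box reduction: the relative permittivity of 78.5 (bulk water near room temperature) is an assumption, and the function name is illustrative.

```python
import math

EPS0 = 8.8541878128e-12      # vacuum permittivity, F/m
KB = 1.380649e-23            # Boltzmann constant, J/K
E_CHARGE = 1.602176634e-19   # elementary charge, C
NA = 6.02214076e23           # Avogadro constant, 1/mol


def debye_length_nm(ionic_strength_molar, temperature_k, rel_permittivity=78.5):
    """Debye screening length in nm.

    rel_permittivity=78.5 is an assumed value for bulk water; it is not
    taken from the study.
    """
    ionic_strength_m3 = ionic_strength_molar * 1000.0  # mol/L -> mol/m^3
    numerator = rel_permittivity * EPS0 * KB * temperature_k
    denominator = 2.0 * NA * E_CHARGE ** 2 * ionic_strength_m3
    return math.sqrt(numerator / denominator) * 1e9


# 150 mM NaCl (ionic strength 0.150 M for a 1:1 salt) at 300.65 K
lam = debye_length_nm(0.150, 300.65)
print(f"Debye length: {lam:.2f} nm; two Debye lengths: {2 * lam:.2f} nm")
```

Under these assumptions the screening length is a little under 1 nm, so a margin of two Debye lengths adds on the order of 1.5 nm of solvent on each side of the protein.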

Full atom simulations

All full atom simulations were performed with Gromacs 2021 and the Charmm36m forcefield with WYF parameters extension67,70,71. Atomic coordinates were updated with the leap-frog algorithm with a 2 fs timestep. For all production runs the temperature was kept constant at 310 K with the Nose-Hoover thermostat, and the pressure was maintained at 1 atm with the Parrinello-Rahman barostat (NPT ensemble)68,72. All systems contained Na+ and Cl- ions to a concentration of 150 mM and for charge neutralization.

Full atom system setup

The two coarse-grained simulation systems were back-mapped using the ‘Martini to All-atom Converter’ tool63. The back-mapped protein was removed from the system (as it had stability issues when run in full atomistic MD) and was replaced with the protein structure generated as described previously. Lipids clashing with the protein were removed, and ring penetrations were resolved by repositioning the atoms using VMD 1.9.373. The systems then went through a series of equilibration and short-timestep MD simulations to relax them. The protein was then mutated from the 7G8-PfCRT cryo-EM structure to 3D7-PfCRT, Dd2-PfCRT and Dd2-76K-PfCRT using the CHARMMGUI ‘PDB reader’. The mutations performed to generate 3D7-PfCRT were S72C, T76K, S220A, D326N, and L356I. To generate Dd2-PfCRT, the mutations S72C, M74I, N75E, Q271E, D326S, L356T, and R371I were performed. For Dd2-76K-PfCRT, the mutations were the same as for Dd2-PfCRT, with the addition of T76K. Disulfide bonds were added between Cys289-Cys312 and Cys301-Cys30974. Each combination of isoform and membrane configuration was run for 600 ns, until the protein RMSD and the area per lipid had stabilized.

High ligand concentration simulations

The last frames of these isoform simulations were then used as starting points for the high ligand concentration simulations. High ligand concentrations were used to encourage thorough sampling of PfCRT’s cavity by the ligands. The ligands used in this study were CQ and the peptides DPVN, PENF, PVNF and PEEK.

The 3D structure of CQ was downloaded from the PubChem database (CID: 2719), and was protonated according to a pH of 5.5 with openbabel75. The CHARMMGUI ligand reader and forcefield converter was then used to generate Charmm36m compatible forcefield parameters70,76,77. The parameters were then used to run an energy minimization and molecular dynamics simulations in vacuo to assess the stability of the parameters.

The structures of the peptides were first generated with Avogadro, then solvated and run for 60 ns of MD simulation at a temperature of 1000 K in order to generate a range of different peptide conformations.

The gmx insert-molecules command was used to add 5, 10 or 20 ligand molecules to the PfCRT-membrane systems. All systems were able to run stably with 20 molecules and no ligand aggregation was observed, so this was the number of ligands added to all subsequent systems.

Ten replicates were run for each combination of isoform (3D7-PfCRT or Dd2-PfCRT), ligand, and membrane starting configuration, with the exception of PEEK with Dd2-PfCRT; CQ was additionally simulated with Dd2-76K-PfCRT. Each replicate was produced with different ligand coordinates using a separate invocation of the gmx insert-molecules command. Each simulation system was briefly equilibrated for 7.5 ps of molecular dynamics with a 0.5 fs timestep. Each replicate was then run for at least 450 ns; the total length for each system is detailed in Supplementary Table 1.

Analysis

Convergence

To assess the convergence of our simulation results, we took the average free energy of a slice of bulk solution (spanning from 0 to 91 Å in the X and Y dimensions, and 85 to 90 Å in the Z dimension), and the average free energy of the PfCRT cavity, also defined as a rectangular prism (30 to 70, 38 to 60, and 40 to 60 Å in the X, Y and Z dimensions). The difference between the free energy of the solution and that of the cavity was tracked across simulation time until the value stabilized (or was close to stabilizing) (Supplementary Fig. 2).
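The convergence check can be sketched as follows. This is a simplified illustration, not the analysis script used in this study: it compares ligand-atom number densities in the two regions rather than averaging gridpoint free energies, it assumes energies in kJ mol-1, and the function names and synthetic coordinates are hypothetical. Only the two region definitions are taken from the text.

```python
import numpy as np

KB = 0.0083144626  # kJ mol^-1 K^-1 (unit choice is an assumption)
T = 310.0          # K

# Region bounds in Å, as given in the text: ((xlo, xhi), (ylo, yhi), (zlo, zhi))
BULK = ((0, 91), (0, 91), (85, 90))
CAVITY = ((30, 70), (38, 60), (40, 60))


def in_box(xyz, box):
    """Boolean mask of points (n, 3) lying inside a rectangular prism."""
    lo = np.array([b[0] for b in box])
    hi = np.array([b[1] for b in box])
    return np.all((xyz >= lo) & (xyz < hi), axis=-1)


def box_volume(box):
    return float(np.prod([hi - lo for lo, hi in box]))


def running_delta_g(positions, n_blocks=10):
    """Cavity-minus-bulk free energy using all frames up to each checkpoint.

    positions: array (n_frames, n_atoms, 3) of ligand atom coordinates in Å.
    """
    n_frames = positions.shape[0]
    checkpoints = np.linspace(n_frames / n_blocks, n_frames, n_blocks).astype(int)
    values = []
    for end in checkpoints:
        pts = positions[:end].reshape(-1, 3)
        dens_cav = in_box(pts, CAVITY).sum() / box_volume(CAVITY)
        dens_bulk = in_box(pts, BULK).sum() / box_volume(BULK)
        if dens_cav == 0 or dens_bulk == 0:
            values.append(np.nan)  # region not yet sampled
        else:
            values.append(-KB * T * np.log(dens_cav / dens_bulk))
    return checkpoints, np.array(values)


# Synthetic, uniformly distributed coordinates standing in for a trajectory
rng = np.random.default_rng(0)
fake_positions = rng.uniform([0, 0, 0], [91, 91, 100], size=(400, 100, 3))
checkpoints, delta_g = running_delta_g(fake_positions)
print(np.round(delta_g, 2))
```

For a real trajectory, a plateau in the returned series would correspond to the stabilization described above; for the uniform synthetic data the difference stays near zero, as expected when neither region is preferred.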

Free energy surfaces

The FES were produced with a custom python script that utilized MDAnalysis and numpy78,79. A grid was defined across the simulation system and, over the length of the trajectory, the frequency of the ligand atoms at each gridpoint was counted. This was then converted to a free energy using the equation:

$$\Delta G=-{k}_{B}\,T\,\ln \left(P\right) \tag{1}$$

Where ΔG is the Gibbs free energy of the gridpoint, kB is the Boltzmann constant, T is the temperature (310 K) and P is the probability of the atom being at a particular gridpoint.
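A minimal sketch of this conversion is given below. It is not the custom script used in this study: it takes a plain array of coordinates rather than an MDAnalysis trajectory, assumes energies in kJ mol-1, and the function name and toy data are illustrative.

```python
import numpy as np

KB = 0.0083144626  # kJ mol^-1 K^-1 (unit choice is an assumption)
T = 310.0          # K, as stated in the text


def free_energy_grid(positions, edges):
    """Convert ligand-atom occupancy on a 3D grid into free energies.

    positions: array (n_samples, 3) of atom coordinates pooled over frames.
    edges: sequence of three 1D arrays with the grid bin edges along X, Y, Z.
    """
    counts, _ = np.histogramdd(positions, bins=edges)
    prob = counts / counts.sum()
    with np.errstate(divide="ignore"):
        # Unvisited gridpoints have P = 0 and therefore infinite free energy
        return -KB * T * np.log(prob)


# Toy data: two samples in the first gridpoint, one in the second
toy_positions = np.array([[0.5, 0.5, 0.5],
                          [0.5, 0.5, 0.5],
                          [1.5, 0.5, 0.5]])
toy_edges = [np.array([0.0, 1.0, 2.0]), np.array([0.0, 1.0]), np.array([0.0, 1.0])]
dg = free_energy_grid(toy_positions, toy_edges)
print(dg[:, 0, 0])
```

In the toy example the first gridpoint is visited twice as often as the second, so it is lower in free energy by kB T ln 2.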

For display, the free energy surfaces were averaged across the Y dimension of the cavity (as defined previously), and were zeroed on the value of the slice of bulk solution. The FES were plotted with the python library matplotlib80.
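The projection and zeroing could look like the sketch below. The text does not specify how the average across Y was taken, so a plain arithmetic mean over visited gridpoints is assumed here; the function name, the index ranges and the toy grid are hypothetical.

```python
import numpy as np


def project_xz(dg, y_slice, bulk_z_slice):
    """Average a 3D free energy grid over a Y window and zero it on bulk.

    dg: array (nx, ny, nz), with np.inf at unvisited gridpoints.
    y_slice: index range of the cavity along Y.
    bulk_z_slice: index range of the bulk-solution slice along Z.
    Assumes a simple mean over finite values; other averaging schemes
    (for example Boltzmann weighting) would give different numbers.
    """
    window = np.ma.masked_invalid(dg[:, y_slice, :])
    xz = window.mean(axis=1).filled(np.nan)  # NaN where nothing was sampled
    bulk_ref = np.ma.masked_invalid(dg[:, :, bulk_z_slice]).mean()
    return xz - bulk_ref


# Toy grid: 5 everywhere, 2 in the "bulk" layers, one unvisited gridpoint
toy = np.full((4, 6, 8), 5.0)
toy[:, :, 6:] = 2.0
toy[0, 2, 0] = np.inf
surface = project_xz(toy, slice(2, 4), slice(6, 8))
print(surface)
```

The resulting two-dimensional array can be passed directly to a matplotlib function such as `imshow` or `contourf`.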

Clustering

Clustering was performed with the gromacs cluster function. The gromos clustering algorithm was used with a cutoff of 2 Å.
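For readers unfamiliar with the gromos algorithm, its logic can be sketched on a precomputed pairwise RMSD matrix. This is an illustrative reimplementation, not the gromacs code: the structure with the most neighbors within the cutoff becomes a cluster center, it and its neighbors are removed from the pool, and the procedure repeats. The toy "RMSD" matrix is built from one-dimensional points purely for demonstration.

```python
import numpy as np


def gromos_cluster(rmsd, cutoff):
    """Cluster structures from a symmetric pairwise RMSD matrix.

    Returns a list of (center_index, member_indices), largest pool first.
    """
    n = rmsd.shape[0]
    neighbours = rmsd <= cutoff          # each structure neighbors itself
    remaining = np.ones(n, dtype=bool)
    clusters = []
    while remaining.any():
        # Count neighbors that are still in the pool
        counts = (neighbours & remaining[None, :]).sum(axis=1)
        counts[~remaining] = -1          # already assigned structures cannot be centers
        centre = int(np.argmax(counts))
        members = np.flatnonzero(neighbours[centre] & remaining)
        clusters.append((centre, members.tolist()))
        remaining[members] = False
    return clusters


# Toy example: distances between 1D points stand in for RMSD values (Å)
points = np.array([0.0, 1.0, 1.5, 10.0, 10.5])
toy_rmsd = np.abs(points[:, None] - points[None, :])
clusters = gromos_cluster(toy_rmsd, cutoff=2.0)
print(clusters)
```

With a 2 Å cutoff the five toy structures fall into two clusters, one containing the first three and one containing the last two.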

Paths

The Dd2-PfCRT FES was first recalculated using the center-of-mass of CQ instead of any atom, so that the drug's position could be unambiguously defined. The MULE package was then used to calculate the minimum free energy path from bulk solution to the minimum of the FES (i.e., the CQ binding site)81. The coordinates of the path were then used to find simulation frames in which CQ was within half a grid-width of the path, producing trajectories of CQ moving along the paths. The interaction potential energy between each residue and CQ was then calculated according to the following section.
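The frame-selection step can be sketched as below. It does not reproduce the MULE path search itself, only the matching of frames to an already computed path. The use of a Euclidean distance criterion, the function name and the toy coordinates are assumptions.

```python
import numpy as np


def frames_near_path(com, path, grid_width):
    """Assign frames to path points when the ligand lies close to the path.

    com: array (n_frames, 3), ligand center-of-mass per frame.
    path: array (n_points, 3), coordinates along the minimum free energy path.
    Returns one array of frame indices per path point; frames farther than
    half a grid-width (Euclidean distance, an assumption) are discarded.
    """
    dist = np.linalg.norm(com[:, None, :] - path[None, :, :], axis=-1)
    nearest = dist.argmin(axis=1)
    within = dist.min(axis=1) <= 0.5 * grid_width
    return [np.flatnonzero(within & (nearest == i)) for i in range(len(path))]


# Toy example with a two-point path and four frames
toy_path = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0]])
toy_com = np.array([[0.1, 0.0, 0.0],
                    [0.9, 0.0, 0.0],
                    [5.0, 5.0, 5.0],
                    [0.4, 0.0, 0.0]])
selected = frames_near_path(toy_com, toy_path, grid_width=1.0)
print([s.tolist() for s in selected])
```

Concatenating the selected frames in path order gives a pseudo-trajectory of the ligand moving from solution to the binding site.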

Potential energy of interactions

To calculate the potential energy between PfCRT residues and the ligand of interest, energy groups were defined for each residue, and the gmx mdrun -rerun and gmx energy commands were used to extract the potential energies across the simulation length. The potential energy for each residue was then averaged across time.
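The time-averaging step can be sketched by parsing the .xvg output written by gmx energy. This is an illustrative helper rather than the script used in this study; the energy group names "K76" and "CQ" and the numbers in the embedded example are hypothetical, and summing the short-range Coulomb and Lennard-Jones terms for each group pair is an assumption about how the per-residue energy was defined.

```python
import io
import re
from collections import defaultdict

import numpy as np

# Matches xvg legend lines such as: @ s0 legend "Coul-SR:K76-CQ"
LEGEND = re.compile(r'@\s+s\d+\s+legend\s+"(.*)"')


def read_xvg(handle):
    """Return (legends, data) from an xvg file; column 0 of data is time."""
    legends, rows = [], []
    for raw in handle:
        line = raw.strip()
        if not line or line.startswith("#"):
            continue                      # comments and blank lines
        if line.startswith("@"):
            match = LEGEND.match(line)
            if match:
                legends.append(match.group(1))
            continue                      # other plot metadata
        rows.append([float(v) for v in line.split()])
    return legends, np.asarray(rows)


def mean_interaction_energy(handle):
    """Time-averaged interaction energy per energy-group pair."""
    legends, data = read_xvg(handle)
    means = data[:, 1:].mean(axis=0)
    totals = defaultdict(float)
    for name, value in zip(legends, means):
        pair = name.split(":", 1)[-1]     # "Coul-SR:K76-CQ" -> "K76-CQ"
        totals[pair] += float(value)
    return dict(totals)


# Hypothetical example data, not results from this study
EXAMPLE = '''# hypothetical gmx energy output
@    title "GROMACS Energies"
@ s0 legend "Coul-SR:K76-CQ"
@ s1 legend "LJ-SR:K76-CQ"
0.0  -10.0  -1.0
1.0  -20.0  -2.0
2.0  -30.0  -3.0
'''
result = mean_interaction_energy(io.StringIO(EXAMPLE))
print(result)
```

For real output, the same function can be called on an open file handle in place of the in-memory example.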

Reporting summary

Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.