Introduction

Alphaviruses are arthropod-borne viruses from the Togaviridae family that cause major epidemics around the globe. The Alphavirus genus is categorized into two groups based on genetic relatedness and symptomatology. Arthritogenic alphaviruses (predominantly Old World viruses), such as chikungunya virus (CHIKV), O’nyong-nyong virus (ONNV) and Ross River virus (RRV), cause persistent rheumatic symptoms, while encephalitic alphaviruses (predominantly New World viruses), such as Venezuelan equine encephalitis virus (VEEV) and eastern equine encephalitis virus (EEEV), lead to incapacitating neurological manifestations1,2,3,4. Although CHIKV is generally considered an arthritogenic virus, a few cases of encephalitis have been reported in neonates infected with CHIKV during recent epidemics5.

CHIKV, a prototypic arthritogenic alphavirus, is the causative agent for chronic chikungunya virus disease (CHIKVD), which is characterized by persistent debilitating arthralgia in small joints in 30–60% of infected individuals6,7,8,9. The virus primarily circulates in tropical and subtropical regions, including South/Southeast Asia, Africa, and Central and South America10. The high morbidity rate and recurrent outbreaks due to the expanding range of Aedes mosquitoes pose a significant global public health threat11,12.

The CHIKV genome is 12 kb in length and encodes four non-structural proteins (nsP1, nsP2, nsP3 and nsP4) and five structural proteins (Capsid, Envelope 1–3 [E1-3] and 6K)13. Non-structural proteins play a pivotal role in viral replication, and structural proteins are required for virion assembly14. IXCHIQ, the first FDA-approved CHIKV vaccine, contains both non-structural and structural proteins, with deletions in the nsP3 protein15.

Currently, the cause of chronic arthralgia post-CHIKV infection is poorly understood. In vivo studies have shown a limited role of CD8+ T cells in clearing alphaviral infection, potentially due to a combination of CHIKV nsP2 protein-mediated downregulation of MHC I16 and type I interferon-mediated blockade of RRV infection of dendritic cells and of antigen presentation17. However, numerous studies have alluded to the inflammatory role of CD4+ T cells. Previous studies in mice have shown extensive infiltration of CD4+ T cells in inflamed joints of infected mice18,19. In addition, the adoptive transfer of CD4+ T cells from CHIKV-infected mice into CHIKV-infected T cell receptor knock-out (TCR-/-) mice led to severe joint swelling, vascular leakage and cellular infiltration20. Our recent work has shown a significantly higher frequency of CHIKV-specific CD4+ T cells and negligible CD8+ T cell responses in the peripheral blood of individuals with chronic CHIKVD21, consistent with an association of CHIKV-specific CD4+ T cells with chronic viral arthritic disease.

However, the epitope targets of CD4+ T cell responses in chronic CHIKVD in humans are currently unknown, thereby hindering our further understanding of the role of CD4+ T cells in chronic CHIKVD and the development of effective vaccines. Few studies have experimentally described CD4+ T cell epitopes in mice during the acute phase of infection, where dominant epitopes were found in the nsP1, E1 and E2 proteins20,22. Specifically, Teo et al. discovered two epitopes in nsP1145-162 and E22800-2818 proteins, which, when transferred as epitope-specific T cell lines into TCR-/- mice, induced joint inflammation20. While several groups have developed methods to predict T cell epitopes in silico23,24,25, to our knowledge, no studies have experimentally analyzed specific T cell epitope sequences in humans suffering from chronic CHIKVD or other alphavirus infections in the context of natural infection.

In this study, we characterize the immunogenic targets of circulating CHIKV-specific CD4+ T cells in humans affected by chronic CHIKVD. We identify 123 CHIKV epitopes and define core motifs in three immunodominant regions of CHIKV in E1, nsP1 and CP proteins, which are marked by their capacity to bind to multiple HLA alleles. In addition, we design and validate a new CHIKV-specific CD4+ T cell epitope megapool, which can assist with studying CHIKV-specific T cell responses in small blood volumes. Finally, we determine CHIKV epitopes that are conserved in other arthritogenic and encephalitic alphaviruses based on sequence conservation. Overall, our work provides a useful tool to study CHIKV-specific T cell responses in the context of natural infection and to characterize effective vaccines.

Results

Experimental design for proteome-wide screen of CHIKV T cell epitopes

We designed a comprehensive peptide library to map the T cell epitopes associated with CHIKV infection in humans. Following the previously described megapool (MP) approach21,26, we synthesized 15-mer overlapping peptides for four non-structural CHIKV proteins (nsP1, nsP2, nsP3, nsP4) and five structural CHIKV proteins (Capsid or CP, E3, E2, 6K, E1) based on the sequence of the representative isolate and additional frequent variant peptides as described in the Methods section (Fig. 1A). Overall, we synthesized a total of 992 peptides, which comprised 741 peptides from the representative isolate and 251 variant peptides. These peptides were organized in ten MPs, which were also further divided into smaller Mesopools (MS) of 9-11 individual peptides.

Fig. 1: Experimental workflow for screening CD4+ T cell epitopes in CHIKV.

a Schematic representation of CHIKV proteome comprising four non-structural (nsP1, nsP2, nsP3 and nsP4) and five structural proteins (Capsid or CP, E3, E2, 6K and E1). b Workflow of epitope screening. All donors were tested in the AIM assay by stimulation with 10 megapools (MP) corresponding to each CHIKV protein (nsP1, nsP2_1, nsP2_2, nsP3, nsP4, CP, E3, E2, 6K, E1). Positive donors in the AIM assay (OX40+ CD137+ or OX40+ CD40L+) were tested in the FluoroSpot assay by stimulating with smaller pools of MP, called mesopools (MS), each of which contained 9–11 individual peptides. Each MS was deconvoluted to determine individual epitopes. Created in BioRender. Weiskopf, D. (2025) https://BioRender.com/2bdm25t. c An example of the experimental workflow for the responses of one donor to the E1 protein. The first panel shows responses in the AIM assay to all CHIKV MPs tested. The middle panel shows SFCs per million PBMCs to each MS of the E1 protein. The third panel depicts responses to individual peptides in E1-7 MS. The dotted line indicates the threshold of positivity. The blue highlighted bars depict an example of a positive response from one donor.

The library was used to map T cell epitopes recognized in 17 individuals from Colombia who displayed chronic arthritis-like symptoms on average 6.3 years post-CHIKV infection (Epitope Screening Cohort in Table 1). Infections were confirmed by measuring CHIKV-specific IgG titers (Supplementary Fig. 1A). To identify the global pattern of recognition, PBMCs from each donor were assayed ex vivo with each CHIKV protein MP (nsP1, nsP2_1, nsP2_2, nsP3, nsP4, CP, E3, E2, 6K, E1) (Supplementary Fig. 1C). As described before21, negligible CD8+ T cell responses were detected; hence, we focused our epitope identification efforts on CD4+ T cells. CHIKV-specific CD4+ T cell responses were measured in an activation-induced marker assay (AIM assay), using the upregulation of either OX40+ CD137+ or OX40+ CD40L+ markers, both of which are commonly used to measure antigen-specific CD4+ T cell responses27,28,29. Using two different AIM marker combinations allowed us to screen more comprehensively and identify more positive responses. Responses greater than the limit of sensitivity (LOS) with a stimulation index (SI) greater than two were considered positive. CHIKV-specific CD4+ T cell responses were detected in 88% (15/17) of donors. The highest frequencies of AIM+ OX40+ CD137+ responses were detected against E1 (65%), E2 (53%), CP (29%) and nsP3 (29%) proteins, with a similar trend for AIM+ OX40+ CD40L+ responses, where E1 (59%), E2 (53%), CP (41%) and nsP1 (41%) proteins were most frequently recognized (Supplementary Fig. 1C). Of 170 unique donor-protein combinations tested, a total of 47 positive responses were detected marked by upregulation of OX40+ CD137+ and 57 positive responses were detected marked by upregulation of OX40+ CD40L+ (Supplementary Fig. 1C).
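As an illustration only, the positivity rule for the AIM assay can be sketched in Python. The function names, the background-subtraction step, the definition of SI as the ratio of stimulated to unstimulated frequency, and all numeric inputs are assumptions made for this sketch; they are not taken from the Methods.

```python
def stimulation_index(stimulated_pct, unstimulated_pct, floor=1e-6):
    """Ratio of stimulated to unstimulated AIM+ frequency (assumed SI definition)."""
    # The floor avoids division by zero when the unstimulated control is empty.
    return stimulated_pct / max(unstimulated_pct, floor)


def is_aim_positive(stimulated_pct, unstimulated_pct, los, si_cutoff=2.0):
    """Call a response positive when it exceeds the LOS and has SI > si_cutoff.

    Assumption: the LOS is compared against the background-subtracted frequency.
    """
    net_response = stimulated_pct - unstimulated_pct
    si = stimulation_index(stimulated_pct, unstimulated_pct)
    return net_response > los and si > si_cutoff


# Hypothetical frequencies (% of CD4+ T cells), not data from this study.
print(is_aim_positive(0.20, 0.02, los=0.04))  # clear response
print(is_aim_positive(0.30, 0.20, los=0.04))  # above LOS but SI below 2
```

Requiring both criteria guards against calling a donor positive when the unstimulated control is itself high.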

Table 1 Characteristics of the donor cohort

Identification of CHIKV human CD4+ T cell epitopes

Donor/MP combinations that were positive in the AIM assay were restimulated in vitro to allow for epitope identification. In case of limiting cell availability, the combinations with the strongest AIM+ responses were selected for re-stimulation. After 14 days, each T cell culture was tested against the MS corresponding to the stimulating MP in an interferon gamma (IFNγ)-based FluoroSpot assay. The threshold of positivity (215 spot-forming cells per million PBMCs or SFC/10^6 PBMCs) for the FluoroSpot assay was determined as the value higher than 99% of all the measured responses following the same in vitro stimulation protocol of PBMCs from six CHIKV-seronegative donors from Nicaragua with all CHIKV mesopools (Table 1 and Supplementary Fig. 1D).
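The way such a threshold can be derived from control measurements is sketched below. This is a minimal example that assumes a nearest-rank percentile; the function name and the input values are hypothetical, and the exact percentile method used in the study is not stated here.

```python
def percentile_nearest_rank(values, pct):
    """Nearest-rank percentile for an integer pct between 1 and 100."""
    if not values:
        raise ValueError("at least one measurement is required")
    ordered = sorted(values)
    # Integer form of ceil(pct * n / 100), which avoids floating-point rounding.
    rank = (pct * len(ordered) + 99) // 100
    return ordered[max(rank, 1) - 1]


# Hypothetical SFC per million PBMCs measured in seronegative control cultures.
control_responses = list(range(1, 201))
threshold = percentile_nearest_rank(control_responses, 99)
print(threshold)
```

Responses from infected donors would then be compared against the resulting value.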

Overall, of a total of 795 donor/MS combinations tested, 293 (37%) donor/MS combinations were associated with positive responses. Cultures from these positive MS responses were further tested with individual peptides to identify specific epitopes. As a result, 123 epitopes from the representative sequence and 24 variant epitopes were identified (Supplementary Table 1). A summary of the screening strategy is shown in Fig. 1B and representative data from one donor is shown in Fig. 1C. The representative donor recognized six MPs (nsP2_1, nsP2_2, nsP3, CP, E3 and E1) in the AIM assay, which were then screened with MS. From the E1 MS, the donor recognized three MS (E1-3, E1-7 and E1-9). The E1-7 MS, among other pools, was then deconvoluted, where two epitopes from the pool were recognized by the donor.

In addition to experimental identification of epitopes, we used the IEDB NetMHCIIpan 4.1 EL method30 to perform HLA binding predictions for all synthesized peptides using the reference set of the 27 most common HLA class II alleles31, which has been shown to provide 100% coverage of the general population when combined. The IEDB-recommended percentile value of less than 20% was used as the threshold to define a binding event. Using this criterion, 744/992 (75%) of peptides were predicted to bind to at least one HLA class II allele from the reference set, and 129/744 (17%) of the predicted binders were experimentally found to be epitopes. In addition, a significantly higher number of HLA alleles were predicted to bind to experimentally defined epitopes vs non-epitopes (p < 0.0001) (Supplementary Fig. 1E).
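The binding criterion can be expressed as a short sketch, assuming the prediction output has already been loaded into a mapping from allele name to percentile rank (lower ranks indicate stronger predicted binding). The function name, the dictionary layout and the rank values are hypothetical; the allele names are used only as labels.

```python
BINDING_PERCENTILE_CUTOFF = 20.0  # percentile-rank threshold stated in the text


def predicted_binders(percentile_by_allele, cutoff=BINDING_PERCENTILE_CUTOFF):
    """Return the alleles whose predicted percentile rank is below the cutoff."""
    return sorted(
        allele for allele, rank in percentile_by_allele.items() if rank < cutoff
    )


# Made-up percentile ranks for one peptide, for illustration only.
example_ranks = {
    "HLA-DRB1*01:01": 3.2,
    "HLA-DRB1*03:01": 45.0,
    "HLA-DRB1*04:01": 19.9,
    "HLA-DRB1*07:01": 20.0,  # exactly at the cutoff, so not counted as a binder
}
binders = predicted_binders(example_ranks)
print(len(binders), "of", len(example_ranks), "alleles predicted to bind")
```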

Immunodominant proteins in CHIKV proteome recognized by CD4+ T cells

Post-deconvolution, we identified a total of 123 epitopes from the proteome of the representative isolate (Fig. 2A). These sequences have been submitted to IEDB (submission ID - 1000910; https://www.iedb.org/reference/1042263). The majority of these epitopes were from E1 (24%), nsP1 (21%) and CP (16%) proteins, with few epitopes located in nsP4 (2%) and 6K (2%), while none were identified in the E3 protein. Overall, non-structural and structural proteins accounted for 34% and 66% of total magnitude of the response, respectively, with the highest total magnitude of responses against epitopes in E1 (35%), nsP1 (21%) and CP (18%) (Fig. 2B). Notably, in addition to having the highest number of epitopes and the highest frequency of response, the E1 protein elicited over one-third of the overall magnitude of the response.

Fig. 2: Immunodominant proteins in CHIKV proteome recognized by CD4+ T cells.

a Frequency (%) of epitopes detected per CHIKV protein. n refers to the number of total epitopes identified. b The magnitude (SFC/10^6 PBMCs) of positive response to each epitope in the entire CHIKV proteome. The percentage shows the percent of the overall magnitude of responses elicited by each protein. The heatmap under the graph and the heatmap legend on the side of the graph indicate the number of donors that recognize each epitope, ranging from 0 to 9 donors. c The lower bound of the 95% confidence interval of response frequency of each epitope plotted for the non-structural (top) and structural polyprotein (bottom). Each overlapping peptide was mapped to the sequence of the representative isolate, and the response frequency for each residue was calculated using the ImmunomeBrowser tool in IEDB-AR. The lower bound of the confidence interval was plotted for each residue to visualize regions with dominant responses. The dotted line indicates the threshold of positivity (0.2). The highlighted regions depict the residue number that reaches the threshold of positivity and the associated peptide sequences.

To identify the immunodominant regions in the CHIKV proteome, we mapped the identified epitopes onto the CHIKV proteome using the standalone ImmunomeBrowser tool from the Immune Epitope Database Analysis Resource (IEDB-AR)32 and calculated the lower bound of the 95% confidence interval of the response frequency of each peptide in structural and non-structural polyproteins separately. Two regions, nsP1226-235 and E11126-1140, exceeded a threshold of positivity of 20%, while CP46-50 reached the threshold (Fig. 2C).
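To make the role of the lower confidence bound concrete, the sketch below computes one for a binomial response frequency. It uses the Wilson score interval as a stand-in; this is an assumption, since the interval formula applied by ImmunomeBrowser is not described in this section. The function name and the donor counts are hypothetical.

```python
import math


def wilson_lower_bound(responders, tested, z=1.96):
    """Lower bound of an approximate 95% Wilson score interval for a proportion."""
    if tested == 0:
        return 0.0
    p = responders / tested
    z2 = z * z
    centre = p + z2 / (2 * tested)
    margin = z * math.sqrt(p * (1 - p) / tested + z2 / (4 * tested * tested))
    # Clamp at zero to absorb tiny negative values from floating-point error.
    return max(0.0, (centre - margin) / (1 + z2 / tested))


# Hypothetical counts: 9 responders among 15 donors tested at a given residue.
lower = wilson_lower_bound(9, 15)
print(round(lower, 3), lower >= 0.2)
```

Using the lower bound rather than the raw frequency penalizes residues that were tested in only a few donors, so a region is flagged only when the evidence for frequent recognition is reasonably strong.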

Immunodominant epitopes in CHIKV proteome recognized by CD4+ T cells

Next, we focused our analysis on responses elicited by individual epitopes. The magnitude of response from epitopes varied widely, ranging from 200 to 21,200 SFCs/10^6 PBMCs (average response was 1128 SFCs/10^6 PBMCs; Fig. 3A). Interestingly, 38 epitopes accounted for 75% of the total magnitude of IFNγ response from all donors tested (Fig. 3B and Table 2).
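The ranking behind this kind of statement can be sketched as follows: epitopes are sorted by descending magnitude and counted until the cumulative sum reaches the requested share of the total. The function name and the magnitudes are hypothetical and serve only to illustrate the calculation.

```python
def epitopes_for_share(magnitudes, share):
    """Smallest number of top-ranked epitopes whose summed magnitude
    reaches the given share (0 to 1) of the total response."""
    ranked = sorted(magnitudes, reverse=True)
    target = share * sum(ranked)
    running = 0
    for count, magnitude in enumerate(ranked, start=1):
        running += magnitude
        if running >= target:
            return count
    return len(ranked)


# Made-up average magnitudes (SFC per million PBMCs) for five epitopes.
example = [50, 30, 10, 5, 5]
print(epitopes_for_share(example, 0.5), epitopes_for_share(example, 0.75))
```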

Fig. 3: Immunodominant epitopes in CHIKV recognized by CD4+ T cells.

a The magnitude of all positive responses, shown as the total number of IFNγ spot-forming cells per million PBMCs (SFC/10^6 PBMCs). Average response is at 1128 SFC/10^6 PBMCs, with responses varying from 220 to 21,200 SFC/10^6 PBMCs. Each dot refers to a CHIKV epitope (n = 123). b Percentage of the total of average response for all donors plotted as a function of the total number of epitopes. Epitopes were ranked in descending order of average magnitude, and the percentage of total magnitude was calculated based on the cumulative sum of response. Dotted lines represent the number of epitopes that account for 50, 75 and 90% of total responses. c Number of epitopes recognized by each donor (n = 15). An average of 12 epitopes were identified. d The pie chart indicates the frequency of positive donor responses for each epitope. Parentheses indicate the number of epitopes recognized by the specified number of donors. e–g SFC/10^6 PBMCs shown for individual chronic (blue line) and recovered (gray line) donors against 15-mer peptides overlapping by 14 residues sequentially spanning the immunodominant region of (e) the nsP1 (chronic: n = 3, recovered: n = 2), (f) E1 (chronic: n = 5, recovered: n = 3) and (g) CP (chronic: n = 4) regions. The black lines indicate the average SFC/10^6 PBMCs for all donors shown, and the blue shaded regions depict the region with the highest response. The pie charts on the right show the percentage of HLA alleles predicted to bind to the indicated peptide sequences (i.e., the mapped epitope and negative control peptides on the N- and C-termini) shown under the pie chart. The chi-square value (two-tailed), along with the degrees of freedom used and the resulting p-value, is reported for all peptide sequences. Each condition was tested in triplicate (technical replicates) in a FluoroSpot assay. Data are represented as mean ± SD (panels b, c) or geomean ± geometric SD (panels a, e–g).

Table 2 Epitopes that elicit 75% of the total IFNγ response

Specific epitopes were identified in 13/15 donors, with each donor recognizing on average 12 epitopes (range: 0–25 epitopes; Fig. 3C). While 93/123 (75%) of epitopes were identified in only one donor (Fig. 3D), four peptides were most frequently recognized, with two contiguous peptides, E11121-1135 (FGGVAIIKYAASKKG) and E11126-1140 (IIKYAASKKGKCAVH), being recognized by eight and nine donors, respectively (Fig. 2B). Both E11121-1135 and E11126-1140 peptides were recognized by the same eight donors. The other two commonly recognized peptides, CP41-55 (LAQLISAVNKLTMRA) and nsP1221-235 (RRGKLSIMRGKKLKP), were recognized by five donors each (Fig. 3D). All four of the commonly recognized epitopes were also among the top 38 epitopes that elicited a high magnitude of response (Table 2).

Mechanism of immunodominance for the nsP1226-235, E11126-1140 and CP46-50 regions

The data presented above pinpoint residues nsP1226-235, E11126-1140 and CP46-50 as being associated with remarkable frequency and magnitude of response. Additional experiments addressed the molecular mechanisms associated with such dominance, and whether each of these regions likely contains a single epitope encoded in the overlap between the two consecutive 15-mer peptides or multiple distinct epitopes.

Accordingly, we synthesized a series of 15-mer peptides overlapping by 14 residues and spanning residues 216–245 for nsP1, 1116–1145 for E1, and 36–65 for CP. These peptides were tested for immunogenicity with a T cell line derived by two-week in vitro restimulation of PBMCs from chronic and recovered donors (Epitope fine-mapping Cohort from Table 1). Plotting responses from all donors clearly indicated that the peptides associated with optimal response all share the KLSIMRGKKL, IIKYAASKKG and LISAVNKLTMR core regions for nsP1 (Fig. 3E), E1 (Fig. 3F) and CP (Fig. 3G), respectively. In addition, T cells from recovered individuals recognized the same core sequences for nsP1 and E1 proteins, but at a considerably lower magnitude as compared to chronic donors.
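The fine-mapping design can be illustrated with a short Python sketch. The region sequence is assembled by joining the two nsP1 control 15-mers reported in this section (DLTEGRRGKLSIMRG and KKLKPCDRVLFSVGS); treating them as contiguous is an assumption of this sketch, as are the function and variable names.

```python
# 30-residue nsP1 region, assembled from the two control 15-mers quoted in the
# text (assumed to be contiguous in the protein sequence).
NSP1_REGION = "DLTEGRRGKLSIMRG" + "KKLKPCDRVLFSVGS"
NSP1_CORE = "KLSIMRGKKL"  # core region reported for nsP1


def overlapping_peptides(sequence, length=15, step=1):
    """All peptides of the given length; step=1 means overlap by length - 1."""
    return [
        sequence[start:start + length]
        for start in range(0, len(sequence) - length + 1, step)
    ]


peptides = overlapping_peptides(NSP1_REGION)
with_core = [peptide for peptide in peptides if NSP1_CORE in peptide]
print(len(peptides), "peptides,", len(with_core), "contain the full core")
```

Because consecutive peptides differ by a single residue, the set of peptides that retain the response delimits the shared core from both sides.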

Next, we used the IEDB NetMHCIIpan 4.1 EL method30 to perform HLA binding predictions for the reference set of the 27 most common HLA class II alleles31. Predictions were performed using the 15-mer nsP1221-235 (RRGKLSIMRGKKLKP), E11123-1136 (GVAIIKYAASKKGKC) and CP42-56 (LAQLISAVNKLTMRAV) epitopes that encompassed the experimentally identified core motif, flanked by two and one additional amino acids at the N- and C-termini, respectively. The IEDB-recommended percentile value of less than 20% was used as the threshold defining a predicted binding event. As shown in Tables 3, 4 and 5, the nsP1 and E1 epitopes were predicted to bind 18/27 (67%) and 19/27 (70%) of the HLA molecules, respectively, with the frequency of predicted binding being lower at 11/27 (44%) HLA alleles for the CP epitope. For each of the epitopes, we also tested the predicted binding capacity of two negative control 15-mers (nsP1: DLTEGRRGKLSIMRG and KKLKPCDRVLFSVGS; E1: THSSDFGGVAIIKYA and ASKKGKCAVHSMTNA; CP: RQAGQLAQLISAVNK and LTMRAVPQQKPRRNR) located at the N- and C-termini of the epitope region (nsP1: Supplementary Tables 2 and 3; E1: Supplementary Tables 4 and 5; CP: Supplementary Tables 6 and 7). For the control peptides, binding was predicted for 0/27 (0%) and 2/27 (7%) HLA alleles for the N- and C-terminal nsP1 peptides, respectively (Fig. 3E), and for 3/27 (11%) and 0/27 (0%) HLA alleles for the N- and C-terminal E1 peptides, respectively (Fig. 3F). For CP, 3/27 (11%) and 1/27 (4%) HLA alleles were predicted to bind to the N- and C-terminal peptides, respectively (Fig. 3G). A chi-square test was performed to examine whether predicted binding differed significantly between the finely mapped epitope and the control peptides. For all proteins, a significant difference was found between predicted binding of peptides (nsP1, E1: p < 0.0001; CP: p = 0.0037).
In addition, we sequenced the HLA locus for each of the tested donors in our cohort using PCR amplification of genomic DNA (Supplementary Table 8). Each donor contained at least one HLA allelic variant that was predicted to be a strong binder for the peptides tested.
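The comparison of binding counts can be sketched as a chi-square test of independence on a 3 × 2 table (peptide by binder/non-binder), here using the nsP1 counts given above (18/27, 0/27 and 2/27). The table layout is an assumption, since the exact configuration of the test is not described in this section, so the sketch is not expected to reproduce the reported p-values exactly. The function names are hypothetical.

```python
import math


def chi_square_independence(table):
    """Pearson chi-square statistic and degrees of freedom for an r x c table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, dof


def p_value_two_dof(statistic):
    """Upper-tail probability of a chi-square variable with exactly 2 degrees
    of freedom, for which the survival function is exp(-x / 2)."""
    return math.exp(-statistic / 2)


# Rows: mapped epitope, N-terminal control, C-terminal control.
# Columns: alleles predicted to bind, alleles predicted not to bind (27 total).
nsp1_table = [[18, 9], [0, 27], [2, 25]]
statistic, dof = chi_square_independence(nsp1_table)
print(round(statistic, 2), dof, p_value_two_dof(statistic) < 0.0001)
```

The closed-form p-value holds only for two degrees of freedom; a general implementation would use a statistics library instead.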

Table 3 HLA binding predictions for the finely mapped epitope sequence in the nsP1 protein
Table 4 HLA binding predictions for the finely mapped epitope sequence in the E1 protein
Table 5 HLA binding predictions for the finely mapped epitope sequence in the CP protein

Altogether, these data suggest that the nsP1226-235, E11126-1140, and CP46-50 regions contain a single epitope, characterized by KLSIMRGKKL, IIKYAASKKG and LISAVNKLTMR core regions, respectively. The regions are associated with a remarkable capacity to bind many or most human HLA class II molecules frequently expressed in the worldwide population, thus explaining their dominance.

Development and validation of an epitope-based CHIKV CD4+ T cell MP

The MPs utilized so far were based on pools of overlapping peptides and variants spanning the entire sequence of the CHIKV proteome. Having now identified actual epitopes recognized by human CD4+ T cells, we leveraged this knowledge to create CHIKV-specific MPs that contain only the verified epitopes and thus encompass fewer peptides. Accordingly, we designed two MPs, structural and non-structural, that together consisted of the 123 epitopes from the representative sequence and the 24 variant epitopes identified above. The structural epitope MP, referred to as CHIKV_S, and the non-structural epitope MP, referred to as CHIKV_NS, consisted of 84 and 63 epitopes, respectively. We envision that such epitope-based MPs, by analogy with those developed for other indications such as DENV33, pertussis34, tuberculosis35 or SARS-CoV-236, could be used to study CD4+ T cell responses if only small amounts of blood are available, as is typical in pediatric and clinical studies.

To validate these epitope-based CHIKV MPs, we recruited an independent cohort of 19 CHIKV-seropositive donors with chronic CHIKVD (referred to as “chronic”) and seven CHIKV-seronegative or uninfected controls. Cohort characteristics are provided in Table 1. Infections were confirmed by measuring CHIKV-specific IgG titers (Supplementary Fig. 1A). We stimulated PBMCs with the CHIKV_S, CHIKV_NS and combined CHIKV_S + NS MPs for 24 h and measured antigen-specific CD4+ T cell responses via upregulation of OX40+CD137+. CHIKV-specific CD4+ T cell responses were not detected in uninfected controls; however, 8/19 (42%) and 5/19 (26%) of chronic CHIKVD donors responded to the individual CHIKV_S and CHIKV_NS MPs, respectively (Fig. 4A). Following stimulation with the combined CHIKV_S + NS MP (147 peptides), we detected CHIKV-specific CD4+ T cell responses in a higher proportion of donors, 11/19 (58%). In comparison, 15/17 (88%) of donors were found to be positive in the initial screening of all the overlapping peptide MPs (10 MPs; 992 peptides). Despite the loss of approximately 30% of positive responses, the 7-fold reduction in the number of peptides required for screening, and therefore the smaller blood volume needed, points to an advantage of the CHIKV_S + NS MP for identifying CHIKV-specific T cell responses in the small amounts of blood typically available in clinical and vaccine cohorts. If larger blood volumes are available, the overlapping peptide MPs can be used for a more in-depth study of CHIKV-specific T cell responses.

Fig. 4: CHIKV CD4+ T cell specific epitope megapool induces a robust response.

a Frequency of antigen-specific CD4+ T cells quantified by the AIM assay (OX40+CD137+) after 24 h stimulation with CHIKV epitope megapools (MP) consisting of epitopes from structural (S), non-structural (NS) and combined structural and non-structural (CHIKV_S + NS) proteins in 19 chronic CHIKV donors (chronic; blue dots) and seven CHIKV seronegative or uninfected donors (black dots; uninfected controls (UC)), with n representing the number of donors. % Positive refers to the percent of donors who are above the LOS (0.04), indicated by the dotted line. b Frequency of specific cytokine-producing cells (IFNγ, TNFα and IL-2) from the CHIKV-specific CD4+ T cells (AIM + OX40+CD40L+) after stimulation with the combined structural and non-structural CHIKV epitope MP (CHIKV_S + NS) in chronic CHIKV donors (n = 19). % Positive refers to the percent of donors that are above the LOS (0.003), indicated by the dotted line. c Frequency of CHIKV-specific AIM + CD4+ T cell memory subsets (AIM + OX40+CD137+) in chronic CHIKV donors (n = 19) post-stimulation with the CHIKV_S + NS epitope MP, based on the expression of CCR7 and CD45RA in AIM + CD4+ T cells as: T naïve (CCR7+CD45RA+), TCM (T central memory; CCR7+CD45RA-), TEM (T effector memory; CCR7-CD45RA-) and TEMRA (T effector memory re-expressing CD45RA; CCR7-CD45RA+). Each condition was tested once in individual experiments. Data are represented as geomean ± geometric SD (panels a and b) or as mean ± SD (panel c).

It has previously been reported that the CD4+CD8low population consists of antigen-specific CD4+ T cells with effector differentiation37. As such, we separately analyzed this population for the presence of AIM+ (OX40+ CD137+) CD4+CD8low cells in the epitope MP validation experiments (Supplementary Fig. 2B). We detected a low number of positive responses and did not identify any CHIKV-specific CD4+CD8low responses that had not already been identified using the conventional CD4+ gate.

Next, within the AIM+ (CD4+ OX40+ CD40L+) responses, we measured expression of the cytokines IFNγ, tumor necrosis factor-alpha (TNFα) and interleukin 2 (IL-2) after stimulation with the combined CHIKV_S + NS MP in the 19 chronic CHIKVD donors via the intracellular staining assay. 58% and 53% of AIM+ (OX40+ CD40L+) donors were positive for TNFα and IL-2 expression, respectively, while 42% of donors were positive for IFNγ expression (Fig. 4B). The majority of CHIKV_S + NS-specific AIM+ CD4+ T cells displayed a T effector memory phenotype (TEM: CD45RA-CCR7-), followed by a T central memory phenotype (TCM: CD45RA-CCR7+) (Fig. 4C).
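The memory subset definitions used here reduce to a two-marker lookup, sketched below with marker status passed as booleans. The function name and the boolean inputs are assumptions of this sketch; in practice, positivity for each marker is set by flow cytometry gates rather than supplied directly.

```python
def memory_subset(ccr7_positive, cd45ra_positive):
    """Assign a CD4+ T cell memory subset from CCR7 and CD45RA status."""
    if ccr7_positive and cd45ra_positive:
        return "T naive"  # CCR7+ CD45RA+
    if ccr7_positive:
        return "TCM"      # central memory, CCR7+ CD45RA-
    if cd45ra_positive:
        return "TEMRA"    # effector memory re-expressing CD45RA, CCR7- CD45RA+
    return "TEM"          # effector memory, CCR7- CD45RA-


for ccr7 in (True, False):
    for cd45ra in (True, False):
        print(ccr7, cd45ra, memory_subset(ccr7, cd45ra))
```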

Sequence conservation of CHIKV CD4+ T cell epitopes in arthritogenic and encephalitic alphaviruses

Next, we investigated to what degree CHIKV epitopes identified in this study were conserved across other arthritogenic and encephalitic alphaviruses. We selected a representative set of viruses from the alphavirus genus (Fig. 5A) and retrieved the associated protein sequences from the Virus Pathogen Resource database (ViPR; www.viprbrc.org)38. We then calculated the percentage of conservation of each CHIKV peptide using the Conservation Analysis tool in IEDB-AR39. Plotting the median percent conservation for arthritogenic (excluding CHIKV sequences; n = 9) and encephalitic alphaviruses (n = 8) for all CHIKV peptides studied herein revealed regions of high conservation in the CP, nsP4, nsP1, nsP2 and E3 proteins and regions of low conservation in the 6K, nsP3, E2 and E1 proteins in both arthritogenic and encephalitic groups (Fig. 5B). As expected on the basis of closer phylogenetic relationships, the degree of conservation of CHIKV peptides was higher for all proteins in arthritogenic alphaviruses compared to encephalitic alphaviruses. Interestingly, the nsP4 protein showed the second highest degree of conservation in both arthritogenic and encephalitic groups; however, few epitopes were experimentally identified in that region.
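One simple way to think about per-peptide conservation is sketched below: for each virus, the peptide is slid along the protein without gaps and the best percent identity is kept, and the median is then taken across viruses. This is a simplified stand-in and an assumption of this sketch; it does not reproduce the algorithm of the IEDB-AR Conservation Analysis tool. The function names and the toy sequences are hypothetical.

```python
from statistics import median


def best_percent_identity(peptide, protein):
    """Highest percent identity of the peptide over all ungapped windows."""
    width = len(peptide)
    if width == 0 or len(protein) < width:
        return 0.0
    best = 0
    for start in range(len(protein) - width + 1):
        window = protein[start:start + width]
        matches = sum(a == b for a, b in zip(peptide, window))
        best = max(best, matches)
    return 100.0 * best / width


def median_conservation(peptide, proteins):
    """Median of the best percent identities across a set of viral proteins."""
    return median(best_percent_identity(peptide, protein) for protein in proteins)


# Toy sequences standing in for homologous proteins from three other viruses.
toy_proteins = ["GGKLSIMGG", "AAKLSAMGG", "PPPPPPP"]
print(median_conservation("KLSIM", toy_proteins))
```

Taking the median across viruses keeps a single divergent virus from dominating the summary for a peptide.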

Fig. 5: Sequence conservation of CHIKV proteome in arthritogenic and encephalitic alphaviruses.

a Phylogenetic tree indicating the viral sequences used to calculate the percentage of conservation of CHIKV peptides. The tree is divided into arthritogenic (blue) and encephalitic (green) alphaviral sub-groups. The tree was created using the ITOL server (Letunic and Bork (2024) Nucleic Acids Res doi: 10.1093/nar/gkae268). b Magnitude of response and median percent conservation of CHIKV peptides in arthritogenic (blue) and encephalitic (green) alphaviruses. The left y-axis and gray bars refer to the magnitude of response of each CHIKV epitope. The right y-axis and blue and green lines refer to median percent conservation for each CHIKV peptide. The solid lines separate each CHIKV protein. The median of percent conservation for each CHIKV protein is shown in the table below for arthritogenic (except CHIKV) and encephalitic alphaviruses. c Median percent conservation of all CHIKV epitopes in arthritogenic (except CHIKV sequences; blue) and encephalitic (green) alphaviruses separated based on non-structural (nsP1, nsP2, nsP3 and nsP4) and structural (CP, E3, E2, 6K and E1) proteins. Each dot indicates a CHIKV epitope (non-structural proteins: n = 62; structural proteins: n = 75). d Median percent conservation of each epitope recognized by two or more donors in arthritogenic (except CHIKV sequences; blue) and encephalitic (green) alphaviruses separated based on non-structural (nsP1, nsP2, nsP3 and nsP4) and structural (CP, E3, E2, 6K and E1) proteins. Each dot indicates a CHIKV epitope (non-structural proteins: n = 5; structural proteins: n = 35). For (c) and (d), the dotted lines refer to a 67% threshold that has previously been shown to define cross-reactive epitopes. The median percent conservation for each subgroup is shown below. Data were analyzed for statistical significance using a two-tailed paired Wilcoxon signed-rank test (p > 0.05, ns = nonsignificant). Exact p-values for respective figure panels are detailed in the Source Data File.

For the 123 individual CHIKV epitopes identified from the representative sequence and the 24 variant epitopes identified in this study, we plotted the median percent conservation in arthritogenic and encephalitic alphaviruses (Fig. 5C). Overall, the median percent conservation for both non-structural and structural epitopes was significantly higher in the arthritogenic group (non-structural: 73%; structural: 67%) than in the encephalitic group (non-structural: 60%; structural: 47%) (p < 0.0001).

Previous studies have experimentally defined 67% conservation to be associated with cross-reactivity between SARS-CoV-2 epitopes and common-cold coronaviruses27,28. As such, we used 67% median conservation as the threshold to identify CHIKV CD4+ T cell epitopes that could potentially be cross-reactive with other alphaviruses. Overall, 65% (40/62) of all CHIKV non-structural and 43% (32/75) of structural epitopes were predicted to be potentially cross-reactive with other arthritogenic viruses and 31% (19/62) of the CHIKV non-structural and 13% (10/75) of the structural epitopes were predicted to be potentially cross-reactive with other encephalitic viruses (Fig. 5C).

We further analyzed the five epitopes in the non-structural proteins and 35 epitopes in the structural proteins that were recognized in two or more donors (Fig. 5D). The epitopes derived from non-structural proteins were predicted to be cross-reactive with other arthritogenic and encephalitic alphaviruses in 80% (4/5) and 20% (1/5) of cases, respectively, whereas epitopes derived from structural proteins were predicted to be cross-reactive with other arthritogenic and encephalitic alphaviruses in 31% (11/35) and 9% (3/35) of cases, respectively. The overall median conservation of all epitopes recognized in two or more donors was significantly higher in the arthritogenic than in the encephalitic group (p < 0.0001). The potential cross-reactivity across other representatives of the alphavirus genus could be important in the context of CHIKV infection or vaccination.

Discussion

Here, we investigated the repertoire of CHIKV-specific CD4+ T cell responses in individuals with chronic CHIKVD and defined the precise CHIKV targets of CD4+ T cells. Epitopes in E1 and nsP1 proteins accounted for nearly half of the 123 identified epitopes and more than half of the total magnitude of the response. Specifically, two regions, nsP1226-235 and E11126-1140, were identified as immunodominant and were marked by a high magnitude of response. Fine epitope mapping determined that these regions each contained a single epitope recognized in the same frame by multiple donors. The immunodominance of these two regions could be explained by the observation that their core regions are predicted to bind multiple HLA allelic variants that are frequently expressed in the world population. Beyond E1 and nsP1, CP contained the next highest number of epitopes, which accounted for 18% of the overall response. Interestingly, CP also contained a single epitope with the ability to bind to multiple HLA alleles. Immunodominance of the E1 region over the nsP1 and CP regions could potentially be explained by differences in the number of HLA alleles bound; E11126-1140 was predicted to bind a higher percentage of common HLA alleles (70%) than nsP1221-235 (67%) and CP41-55 (44%). In addition, we cannot exclude the possibility of T cell cross-reactivity with other alphaviruses that co-circulate with CHIKV in Colombia, allowing higher recognition of the E1 immunodominant region in contrast to regions in nsP1 and CP. Overall, nsP1, E1, and CP proteins were the most immunogenic, highlighting their importance for vaccine development.

While no CHIKV-specific T cell epitopes had been previously identified in humans, a few studies have analyzed CHIKV-specific T cell targets in mice. Interestingly, nine of the 26 epitopes defined in CHIKV-infected C57BL/6 mice overlap with the epitopes identified in our study, which is remarkable given the highly distinct MHC in humans and mice22. In a separate study by Teo et al., the two pathogenic CD4+ T cell targets discovered in mice, in the nsP1 (nsP1146-160 DVAIYQDVYAVHAPT) and E2 (E2331-345 FNVYKATRPYLAHCP) proteins, were found to be immunogenic in our study as well20. This suggests that murine models can be representative for studying CHIKV T cell responses relevant to human disease.

We also assessed conservation of the T cell targets of CHIKV across other alphaviruses, which is relevant in the context of potential universal alphaviral vaccines40. Our analysis revealed regions of broad conservation of the CHIKV proteome in arthritogenic and encephalitic alphaviruses. While the nsP4 protein had 83% median conservation in arthritogenic alphaviruses, only three epitopes were identified in the nsP4 antigen. On the other hand, structural proteins, specifically E2 and E1 proteins, which elicited a high magnitude of response with a high number of immunogenic epitopes, were associated with a lower degree of conservation, indicating lower potential cross-reactivity with other alphaviruses. Importantly, however, the highest degree of conservation was found in epitopes in non-structural proteins of the arthritogenic and encephalitic alphaviruses. Based on these results, the design of a universal alphavirus vaccine could rely on focusing T cell responses on the relatively few epitopes of the viral proteome that are highly conserved and also immunogenic for human T cell responses.

However, there are several limitations that need to be considered when interpreting our findings. First, by using a stringent criterion for calculating the limit of detection and sensitivity, we might have missed some positive responses in our initial AIM screening and validation experiments. Second, due to limited availability of samples, we were unable to study CD4+ T cell targets during the acute phase of CHIKV infection or test our designed CHIKV_S + NS MP in the donors from the initial screening, which would provide a more direct measurement of their performance. To characterize the generalizability of these epitope pools, we tested them in a new, unrelated cohort of CHIKV-exposed individuals that was not used to identify these epitopes. Third, our studies are limited in terms of the number of individuals studied, as all were derived from a single geographical region. Future work could focus on a more exhaustive epitope screening in additional individuals and cohorts from diverse geographical origins. Lastly, we were unable to perform experimental confirmation for in silico analysis of HLA restriction of finely mapped epitopes or cross-reactive epitopes in other alphaviruses due to limited availability of samples and technical challenges. To truly determine the regions of CHIKV that could be cross-reactive with other alphaviruses, experimental testing of epitopes in samples infected with other alphaviruses would be required.

Overall, our work provides an in-depth characterization of CHIKV CD4+ T cell epitopes in chronic CHIKVD and extends the analysis to determine conservation of the identified epitopes in arthritogenic and encephalitic alphaviruses. In addition, we designed a CHIKV-specific CD4 epitope MP, which will aid in the characterization of CD4+ T cell response with small blood volumes. Our study has broadened our understanding of CHIKV-specific T cell responses, which will enable monitoring of human CD4+ T cell responses in the context of the study of CHIKV immunopathology and in the context of alphavirus vaccine development.

Methods

Study donor cohort

We enrolled 47 donors from Departamento del Atlántico in the Caribbean region of Colombia who had been diagnosed with CHIKV during the 2014–2015 epidemic. The criteria for a positive CHIKV diagnosis included: (1) whether the symptoms matched those of CHIKV, (2) whether the patient lived in or visited a region where CHIKV had been detected using an RT-PCR assay, (3) whether other members of the patient’s household or other residents in the neighborhood had been diagnosed with CHIKV, and (4) whether infection with dengue virus, which often co-circulates with CHIKV and has a similar clinical presentation, had been ruled out through laboratory tests such as ELISA and rapid diagnostic assays. Not all patient samples were subjected to laboratory tests to confirm CHIKV infection. Local medical professionals determined whether the epidemiological link based on the criteria mentioned above was sufficient to diagnose a case as CHIKV positive. For the purpose of our studies, we confirmed infection by measuring the levels of anti-CHIKV IgG antibodies in the serum of all individuals (Supplementary Fig. 1A). Forty individuals were confirmed to be seropositive for CHIKV infection, while seven individuals were seronegative. Of the 40 seropositive individuals, 36 displayed chronic arthritis-like symptoms, such as joint pain and swelling, and the remaining four had recovered post-acute infection.

The cohort was divided into three groups – CHIKV epitope screening group, CHIKV epitope fine-mapping group and CHIKV epitope megapool (MP) validation group. PBMCs from 17 CHIKV seropositive individuals with chronic CHIKVD were used to screen for epitopes, 10 of which were also used for fine epitope mapping. Responses to core regions for chronic donors were compared to responses from four recovered donors. PBMCs from the remaining 19 CHIKV seropositive individuals with chronic CHIKVD were used to validate the MP of identified epitopes. Seven CHIKV seronegative donors were used as uninfected controls for epitope MP validation.

In addition, to determine the threshold of in vitro positive responses to CHIKV MPs, six uninfected donors were recruited from the National Blood Center of the Nicaragua Red Cross.

At the time of enrollment in the study, all individual donors provided informed consent. Subjects did not receive compensation for participation in the study. All blood and serum samples were anonymized and given a code number. In addition to collecting blood and serum samples, information regarding the date of their CHIKV diagnosis, the date of symptom onset, the date of sample collection, their age, their sex, and the specific symptoms at the time of CHIKV diagnosis and sample collection was also collected for each participant. Information regarding sex, age, current symptoms and CHIKV serostatus for each cohort is provided in Table 1.

Study approval

All samples were collected under an approved IRB protocol of La Jolla Institute for Immunology (LJI VD154) and Universidad del Norte, Colombia (Protocol number 153).

PBMC isolation

Whole blood samples were collected from CHIKV donors in Colombia. Large volumes of the donors’ blood were collected in sterile bags which contained 3.27 g citric acid, 26.3 g NaH2-citrate and 2.22 g NaHCO3 per liter (Fresenius Kabi, Fresenius HemoCare, Brasil Ltd, Brasil) and then diluted by 50% v/v with RPMI-1640 media (31800-105, Gibco, USA) containing 0.2% w/v NaHCO3. Volumes of 34 mL were gently layered into multiple sterile 50 mL tubes (430290, Corning, USA) containing 12 mL of sterile endotoxin-tested Ficoll-Paque PLUS (1.077 g/mL density) (GE Healthcare Biosciences, Sweden). After centrifugation at 500 × g for 30 min, the buffy-coat peripheral blood mononuclear cells (PBMCs) were collected from immediately above the Ficoll-Paque-plasma interface, diluted with 50% v/v RPMI 1640 medium and again centrifuged at 500 × g to pellet the PBMCs. These PBMCs were then re-suspended in large volumes of RPMI 1640 medium containing 5% v/v fetal calf serum (FCS) (F-0926, Sigma, USA) and centrifuged again at 500 × g for 30 min. After a further wash, PBMCs were re-suspended in ice-cold 10% dimethyl sulfoxide (D-2650, Sigma, USA) in FCS at approximately 20 × 10⁶ PBMCs/mL, and 1 mL aliquots in 1.5 mL polypropylene cryo-vials (5000-1020: Nalgene System 100, Thermo Scientific, USA) were slowly (−1 °C/min) frozen down to −80 °C in an isopropanol-walled freezing unit (Nalgene Mr. Frosty: C-1562 Sigma-Aldrich, USA) before being transferred to 25-vial boxes in liquid nitrogen. Whole blood samples from CHIKV uninfected donors in Nicaragua were collected and processed using established methods: samples were subjected to density gradient centrifugation using Ficoll-Paque Premium (GE Healthcare Biosciences), and PBMCs were isolated and cryopreserved in liquid nitrogen after resuspension in fetal bovine serum (Gemini Bio-Products) containing 10% dimethyl sulfoxide41.

Peptide pools

We retrieved 257 and 350 full-length CHIKV structural and non-structural polyprotein sequences, respectively, from the Virus Pathogen Resource database (ViPR; www.viprbrc.org)38 using the following query: Chikungunya virus, Gene product name: structural OR non-structural polyprotein, Remove duplicate sequences. Unresolved sequences were removed. The query was performed in November 2019. Based on the results, we generated two separate consensus alignments for structural and non-structural polyproteins.

The number of sequences available varied by geographic location. To ensure balanced representation, the number of isolates per geographical region was limited to a maximum of 10. In total, 158 structural and 61 non-structural sequences were selected. For each polyprotein, sequences were aligned using MUSCLE, and consensus sequences for structural and non-structural proteins were BLASTed to identify a representative isolate (structural protein: AQX78118.1 and non-structural protein: AQX78116.1), using the tools hosted on the ViPR website. To account for variants of the representative isolates, we identified an additional 105 and 146 amino acid variants that occurred at frequencies above 10% and differed from the sequences of the structural and non-structural representative isolates, respectively.

Based on the sequences of the representative isolates, we designed 741 15-mer peptides overlapping by 10 residues and an additional 251 15-mers corresponding to the variant sequences. Peptides were synthesized by TC peptide lab (San Diego, CA). Synthesized peptides were resuspended in dimethyl sulfoxide (DMSO) at a stock concentration of 100 mg/mL and stored at −20 °C in capped cluster tubes.
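As an illustration of the overlapping-peptide design, the following minimal Python sketch tiles a sequence into 15-mers that overlap by 10 residues. The function name `tile_peptides`, the toy sequence, and the choice to add one extra peptide flush with the C-terminus are assumptions made for this example; they are not taken from the study and will not necessarily reproduce the 741 peptides reported above.

```python
def tile_peptides(sequence, length=15, overlap=10):
    """Return (start, peptide) pairs tiling `sequence`; start is 1-based."""
    if not 0 <= overlap < length:
        raise ValueError("overlap must be non-negative and smaller than length")
    if len(sequence) <= length:
        return [(1, sequence)]
    step = length - overlap  # 5 residues for 15-mers overlapping by 10
    starts = list(range(0, len(sequence) - length + 1, step))
    # Assumption: if the regular grid stops short of the C-terminus,
    # add one final peptide that ends exactly at the last residue.
    last = len(sequence) - length
    if starts[-1] != last:
        starts.append(last)
    return [(s + 1, sequence[s:s + length]) for s in starts]


# Hypothetical 42-residue toy sequence (not a CHIKV sequence).
toy = "ACDEFGHIKLMNPQRSTVWY" * 2 + "AC"
peptides = tile_peptides(toy)
for start, pep in peptides:
    print(start, pep)
```

Adjacent peptides on the regular grid share their last and first 10 residues, which is what the overlap parameter controls.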

Individual peptides were then pooled to create 10 megapools (MP), corresponding to each CHIKV protein. The number of peptides per MP is listed in Supplementary Table 1. The nsP2 antigen was split into two pools (nsP2_1 and nsP2_2) of 117 and 81 peptides, respectively. The pools and peptides were used in the assay at 1 μg/mL. The MP approach was used as previously described26. Large numbers of different epitopes are solubilized, pooled, and re-lyophilized to avoid cell toxicity problems associated with high concentrations of DMSO typically encountered when pooling after a single solubilization step.

Individual MP were further subdivided into mesopools (MS) encompassing 9–11 peptides, each of which was deconvoluted to identify individual epitopes. The number of MS corresponding to each MP ranged from 7 to 16, depending on the MP size, with a total of 95 MS. Introducing the MS intermediate screening step allowed us to decrease the number of tests necessary to identify specific epitopes.

The peptides used in the MP and MS are 15 amino acids long and overlap by 10 residues. As a result of this design, all potentially stimulating, contiguous peptides of length 11 are included in the library. Alternative designs with overlaps of 11 or 12 residues between adjacent peptides are also often used in the literature. We used this design because class II-restricted T cells recognize a core region of 9–10 residues in length, and as long as the tested peptide contains such a region, a positive result is observed, as also exemplified by the results of the fine-mapping experiments presented herein. We also utilized peptides of 15 amino acids in length for the fine mapping rather than shorter peptides, because CD4+ T cells recognize peptides of mostly 13–17 residues in size bound to HLA class II, and the isolated 9–10 amino acid core region is usually poorly or not antigenic, because a few amino acid “overhang” on either side is required for optimal stimulation42.
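The coverage claim can be checked by brute force. The sketch below, with the hypothetical helper `uncovered_windows`, lists every contiguous window of a given size that is not fully contained in at least one library peptide. It assumes a sequence length for which the peptide grid ends exactly at the C-terminus, and the lengths 100 and 99 are arbitrary illustrative values.

```python
def uncovered_windows(seq_len, window, length=15, overlap=10):
    """Return 0-based starts of windows not contained in any tiled peptide."""
    step = length - overlap
    starts = range(0, seq_len - length + 1, step)
    missing = []
    for w in range(seq_len - window + 1):
        # A window is covered if some peptide spans it entirely.
        if not any(s <= w and w + window <= s + length for s in starts):
            missing.append(w)
    return missing


# 15-mers overlapping by 10: every 11-mer is covered, but not every 12-mer.
print(uncovered_windows(100, 11))       # []
print(len(uncovered_windows(100, 12)))  # greater than 0
# An overlap of 11 residues extends full coverage to 12-mers.
print(uncovered_windows(99, 12, overlap=11))  # []
```

In general, the longest window guaranteed to be covered is one residue longer than the overlap, which is why an overlap of 10 covers all 11-mers and therefore every 9–10 residue core.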

For epitope-based MPs, epitope sequences from the representative isolate and variants were synthesized, resuspended in DMSO post-lyophilization and divided into structural and non-structural MPs, which were used for validation of epitope MPs. Each pool was tested 6-19 times.

Activation-induced marker assay and intracellular staining assay

PBMCs were thawed in 10 mL of RPMI 1640 (Corning) supplemented with 5% human AB serum (GeminiBio), penicillin [100 IU/mL], streptomycin [100 μg/mL] (GeminiBio), and 2 mM L-glutamine (Gibco), and in the presence of benzonase (20 μL/10 mL). PBMCs were then plated at 1 × 10⁶ cells per well in 96-well U bottom plates and stimulated separately with each CHIKV-specific MP [1 μg/mL] or CHIKV structural and non-structural epitope MP [1 μg/mL]. An equimolar amount of DMSO was used for the negative control. Stimulation with phytohemagglutinin (PHA, Roche) [1 μg/mL] was used for positive controls. For samples that were tested for epitope identification, cells were stained for the detection of activation-induced markers after 24 h of stimulation.

For intracellular staining (ICS) assays, 20 h post-stimulation, Golgi-Plug and Golgi-Stop were added to the culture, in addition to anti-CD69 and anti-CD137 Abs. Cells were then washed, incubated with BD human FC block, and stained with LIVE/DEAD marker in the dark for 15 min. After incubation, cells were washed, surface stained in the dark for 30 min at 4 °C, and then fixed with 1% paraformaldehyde (Sigma-Aldrich, St. Louis, MO). Subsequently, cells were permeabilized and stained with specific antibodies in the dark for 30 min at RT. We used a 24 h stimulation time as a compromise to capture peak responses of AIM markers (OX4043 and CD13744), which provides an unbiased method to identify antigen-specific cells, regardless of the cytokine produced.

All samples were acquired on Cytek Aurora, and on average, 352,722 total events (109,616-632,000), 74,281 CD4+ T cells (5152 – 195,508) and 30,946 CD8+ T cells (2446 – 85,643) were acquired. SpectroFlo QC beads were used for quality control and to check the routine performance of Cytek Aurora. Antibodies used in the initial AIM screening assay are listed in Supplementary Table 9 and the antibodies used in the AIM/ICS validation assay are listed in Supplementary Table 10. Some markers, such as IL-10, IL-17A, CD38 and HLA-DR, were tested but were not detected and have, therefore, not been shown in the results. A representative gating strategy for the AIM assay is shown in Supplementary Fig. 1B and for the combined AIM/ICS assays in Supplementary Fig. 2.

Antigen-specific CD4+ T cells were measured as a percentage of AIM+ (OX40+CD137+ and OX40+CD40L+) CD4+ T cells. For the initial epitope screening, we further tested all positive responses from both the OX40+CD137+ and OX40+CD40L+ marker combinations to be more inclusive and comprehensive. Both combinations of activation markers were used for extensive evaluation of the epitope-derived MPs in an independent cohort; for the AIM assay, OX40 and CD137 were selected for testing the CHIKV S, NS, and S + NS MPs, while OX40 and CD40L were used for the cytokine assays.

The stimulation index (SI) was calculated by dividing the response (percentage of AIM+ cells) in the stimulated sample by that in the DMSO control. The limit of detection (LOD) was calculated as two-fold the upper 95% confidence interval of the geometric mean of all DMSO values. The limit of sensitivity (LOS) was calculated using the median two-fold standard deviation of all DMSO values. Responses < LOD (0.02%) and SI < 2, after background subtraction, were normalized to the LOD. The LOD was calculated to be 0.01% for the AIM+ CD4+CD8low population. For the initial AIM screening assay, SI > 2 and response > LOS (0.02%) was considered a positive response for antigen-specific CD4+ T cells. For the combined AIM/ICS assays, SI > 2 and response > LOS (0.04%) was considered a positive response for antigen-specific CD4+ T cells, and SI > 2 and response > LOS (0.01%) for antigen-specific CD4+CD8low T cells.
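The positivity call described above can be sketched in a few lines of Python. The function names are hypothetical, and two points are assumptions rather than details given in the text: the response compared against the LOS is taken to be the background-subtracted value, and a DMSO control of zero is treated as giving an infinite SI.

```python
def stimulation_index(stim_pct, dmso_pct):
    """Ratio of the stimulated response to the DMSO control (both in % AIM+)."""
    if dmso_pct <= 0:
        return float("inf")  # assumption: no measurable background
    return stim_pct / dmso_pct


def is_positive(stim_pct, dmso_pct, los_pct=0.02, si_cutoff=2.0):
    """Apply the SI > 2 and response > LOS criteria to one sample."""
    net = stim_pct - dmso_pct  # assumption: response is background-subtracted
    return stimulation_index(stim_pct, dmso_pct) > si_cutoff and net > los_pct


# Illustrative values only (not study data), using the screening LOS of 0.02%.
print(is_positive(0.10, 0.02))   # SI = 5, net = 0.08 -> True
print(is_positive(0.05, 0.04))   # SI = 1.25 -> False
print(is_positive(0.025, 0.01))  # SI = 2.5 but net = 0.015 -> False
```

For the combined AIM/ICS assays, the same function would be called with `los_pct=0.04` for CD4+ T cells or `los_pct=0.01` for CD4+CD8low T cells.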

T cell lines

To increase the frequency of rare antigen-specific CD4+ T cells, cells were expanded in a 14-day in vitro culture. PBMCs from AIM+ donors were cultured at 37 °C and 5% CO2 in RPMI 1640 (Corning) supplemented with 5% human serum at a density of 12 × 10⁶ cells per well in Corning flat-bottom TC-treated 6-well plates. The cells were stimulated individually with each MP [1 μg/mL] that the donor was positive for in the AIM assay. Interleukin 2 (IL-2; 10 U/mL) (Prospec) was added every 3 to 4 days until cell harvest at day 14 (for mesopool evaluation) or day 17 (for individual peptide evaluation).

Since a positive donor/MP was tested with all MS, which contained all the peptides in the respective MP, and a positive donor/MS response was tested with all the individual peptides present in the MS, TCRs specific to all peptides should have been expanded, but there could be an unavoidable loss of TCR specificities during the expansion step.

For fine epitope mapping, PBMCs were individually stimulated with either nsP1221-235 and nsP1226-240, or E11121-1135 and E11126-1140, or CP41-55 and CP46-60 peptides on day 1 and restimulated on day 15 with individual peptides that overlapped by 14 residues and covered the entire region flanking the immunodominant regions (residues 216–245 for nsP1, 1116–1145 for E1, and 36–65 for CP).

FluoroSpot assay

CHIKV-specific responses were assessed by interferon gamma (IFNγ)-based FluoroSpot assay, following in vitro expansion at day 14, to detect responses to MS or individual 15-mer peptides overlapping by 14 residues. For epitope screening, at day 17, positive pools were deconvoluted by FluoroSpot using individual CHIKV peptides (10 μg/mL) contained in the positive pool. Briefly, polyvinylidene fluoride (PVDF) plates (Millipore) were coated with anti-human IFNγ (1-D1K; Mabtech), and cells were plated in triplicate at 1 × 10⁵ cells per well. Cells were stimulated with either peptide pools (1 μg/mL) or individual peptides (10 μg/mL) in 0.1 mL of complete RPMI and incubated for 24 h at 37 °C and 5% CO2. Following stimulation, cells were discarded, and plates were incubated with biotinylated IFNγ monoclonal antibodies (MAb) (7-B6-1-BAM; Mabtech) for 2 h, washed with PBS/0.05% Tween 20 and incubated with secondary detection antibody (Anti-BAM 490) for 1 h45. After incubation and another wash with PBS/0.05% Tween 20, responses were measured on the Mabtech IRIS FluoroSpot/ELISPOT reader using the Apex software. The machine undergoes regular quality control maintenance by Mabtech. Positive responses were identified as those having > 215 spot-forming cells (SFCs) per million cells, an SI > 2, and a p-value < 0.05 when compared to unstimulated cells using a t test as described46. To determine the threshold or the detection limit of the assay (215 SFC per million cells), we measured IFNγ production in six CHIKV-seronegative donors from Nicaragua. The detection of a positive response to an epitope could be affected by the number of T cells being stimulated in the six CHIKV-seronegative donors.
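The three FluoroSpot criteria can be combined as in the following minimal sketch. The function names and the example spot counts are hypothetical. The p-value is passed in rather than computed, on the assumption that it comes from a separate t test on the triplicate wells (for example `scipy.stats.ttest_ind`), and the 215 SFC cutoff is applied here to the unsubtracted stimulated value, which the text does not specify.

```python
from statistics import mean


def sfc_per_million(spots_per_well, cells_per_well=100_000):
    """Scale the mean spot count of replicate wells to SFC per 10^6 cells."""
    return mean(spots_per_well) * 1_000_000 / cells_per_well


def fluorospot_positive(stim_spots, unstim_spots, p_value,
                        cells_per_well=100_000, sfc_cutoff=215, si_cutoff=2.0):
    """Return True if all three criteria (SFC, SI, p-value) are met."""
    stim = sfc_per_million(stim_spots, cells_per_well)
    unstim = sfc_per_million(unstim_spots, cells_per_well)
    # Assumption: zero background spots gives an infinite SI.
    si = stim / unstim if unstim > 0 else float("inf")
    return stim > sfc_cutoff and si > si_cutoff and p_value < 0.05


# Illustrative triplicates (not study data) at 1 x 10^5 cells per well.
print(sfc_per_million([40, 45, 50]))                            # 450.0
print(fluorospot_positive([40, 45, 50], [5, 6, 4], 0.001))      # True
print(fluorospot_positive([20, 20, 20], [5, 5, 5], 0.001))      # False
```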

Serologic assays

Site-specifically biotinylated CHIKV E1 DIII and Halo-tag control antigens were coupled to unique MagPlex-Avidin microspheres at a concentration of 5 μg of antigen per 10⁶ beads in assay buffer (1% BSA + phosphate-buffered saline (PBS), pH 7.4) for 1 h at 37 °C with shaking at 700 rpm as described before47. Antigen-coupled beads were washed and aliquoted at 2500 beads per antigen per well into a 96-well assay plate. Heat-inactivated (56 °C for 30 min) human serum samples diluted 1:500 in assay buffer were incubated with beads for 1 h at 37 °C with shaking at 700 rpm. After washing the beads with assay buffer, PE-conjugated goat anti-human IgG Fc secondary Ab was added at 6 μg/mL (Southern Biotech, catalog: 2014-09) and incubated for 1 h at 37 °C with shaking at 700 rpm. Beads were washed and resuspended in 100 μL assay buffer for fluorescence analysis using the Luminex 200 system.

HLA typing

HLA typing of Class II alleles of PBMCs of all donors was performed using locus-specific PCR amplification of genomic DNA by an American Society for Histocompatibility and Immunogenetics (ASHI)-accredited laboratory at Murdoch University (Western Australia) as previously described48. Specific HLA loci were amplified using sample-specific MID-tagged primers, which are listed in Supplementary Table 11. Amplified DNA products from unique MID-tagged products (up to 96 MIDs) were quantified, pooled in equimolar ratios, and subjected to library preparation using NEBNext Ultra II library prep kits (Cat#E7645L, New England Biolabs). Libraries were quantified using the Jetseq library quantitation kit (Cat# BIO-68029, Meridian Bioscience) and High-sensitivity D1000 screen tape on an Agilent 2200 Tapestation (Agilent) for concentration and size distribution. Normalized libraries were sequenced on the Illumina MiSeq platform using the MiSeq V3 600-cycle kit (2x300bp reads). MID tags were used to separate sequences, reads were quality-filtered, and alleles were called using an in-house, accredited HLA caller software pipeline, thereby minimizing the influence of sequencing errors. Alleles were called using the latest IMGT HLA allele database as the allele reference library. The E.J.P. and S.A.M. laboratory developed an algorithm that relies on periodically updated versions of the freely available International Immunogenetics Information System (RRID:SCR_012780; http://www.imgt.org) and an ASHI-accredited HLA allele caller software pipeline, the IIID HLA analysis suite (http://www.iiid.com.au/laboratory-testing/). Sample report integrity was tracked and verified using a proprietary and accredited Laboratory Information and Management System (LIMS) and HLA analysis reporting software, which performs comprehensive allele balance and contamination checks on the final dataset.

Analysis of conservation of epitopes and non-epitopes in alphaviruses

Sequences of all alphaviruses were extracted from ViPR (www.viprbrc.org)38 based on sequence names provided in the phylogenetic tree in Fig. 5A. Sequences were aligned using the online MAFFT version 7 alignment tool (https://mafft.cbrc.jp/alignment/server/index.html), and the neighbor-joining method (with default settings) was used to create a tree. The tree was visualized using the Interactive Tree Of Life (iTOL) web server49. To calculate conservation, viruses were divided into arthritogenic and encephalitic subgroups based on their main pathogenic characteristics. The IEDB epitope conservancy tool50, with a threshold of 10%, was used to determine the percent identity of each CHIKV peptide in the extracted alphaviral sequences. The overall median of percent conservation was calculated separately for encephalitic and arthritogenic alphaviruses.
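To make the conservation measure concrete, the sketch below computes the percent identity of a peptide as its best ungapped match within a target protein and then takes the median across a group of proteins. This is a simplified stand-in written for illustration, not the implementation of the IEDB conservancy tool, and it does not model the tool's threshold setting. The function names and the short toy sequences are hypothetical.

```python
from statistics import median


def best_identity(peptide, protein):
    """Highest percent identity of `peptide` over all ungapped windows."""
    k = len(peptide)
    if len(protein) < k:
        return 0.0
    best = 0
    for i in range(len(protein) - k + 1):
        window = protein[i:i + k]
        matches = sum(a == b for a, b in zip(peptide, window))
        best = max(best, matches)
    return 100.0 * best / k


def median_conservation(peptide, proteins):
    """Median of the best percent identities across a group of proteins."""
    return median(best_identity(peptide, p) for p in proteins)


# Toy sequences standing in for one subgroup of viruses (not real data).
group = ["GGACDEFGG", "GGACDKFGG", "GGGGGGGGG"]
print([best_identity("ACDEF", p) for p in group])  # [100.0, 80.0, 0.0]
print(median_conservation("ACDEF", group))         # 80.0
```

Running the same calculation separately on the arthritogenic and encephalitic sequence sets would give the two group medians that are compared in the analysis.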

Quantification and statistical analysis

Data and statistical analyses were performed using FlowJo 10.10.1 (BD Life Sciences) and GraphPad Prism 9.3.1 (GraphPad Software, San Diego, California). The statistical details of the experiments are provided in the respective figure legends and in the method details.

Reporting summary

Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.