Long-persisting SARS-CoV-2 spike-specific CD4+ T cells associated with mild disease and increased cytotoxicity post COVID-19

Liu, Guihai; Antoun, Elie; Fries, Anastasia; Yao, Xuan; Yin, Zixi; Dong, Danning; Wang, Wenbo; Wing, Peter A. C.; Dejnirattisa, Wanwisa; Supasa, Piyada; Liu, Chang; Rostron, Timothy; Waugh, Craig; Clark, Kevin; Sopp, Paul; Fry, Jeremy W.; Vendrell, Iolanda; McKeating, Jane A.; Mongkolsapaya, Juthathip; Screaton, Gavin R.; Kessler, Benedikt M.; Fisher, Roman; Ogg, Graham; Mentzer, Alexander J.; Knight, Julian C.; Peng, Yanchun; Dong, Tao

doi:10.1038/s41467-025-63711-9

Article
Open access
Published: 01 October 2025

Nature Communications volume 16, Article number: 8743 (2025)

Abstract

The recent COVID-19 pandemic left behind the lingering question as to whether new variants of concern might cause further waves of infection. Thus, it is important to investigate the long-term protection gained via vaccination or exposure to the SARS-CoV-2 virus. Here we compare the evolution of memory T-cell responses following primary infection with subsequent antigen exposures. Single-cell TCR analysis of three dominant SARS-CoV-2 spike-specific CD4⁺ T-cell responses identifies dominant public TCRα clonotypes, pairing with diverse TCRβ clonotypes, that are associated with mild disease at primary infection. These clonotypes are found at higher frequencies in pre-pandemic repertoires compared to other epitope-specific clonotypes. Longitudinal transcriptomics and TCR analysis, combined with functional evaluation, reveal that the clonotypes persisting 3–4 years post initial infection exhibit distinct functionality compared to those that were lost. Furthermore, spike-specific CD4⁺ T cells at this time point show decreased Th1 signatures and enhanced GZMA-driven cytotoxic transcriptomic profiles that were independent of TCR clonotype and associated with viral suppression. In summary, we identify common public TCRs used by immunodominant spike-specific memory CD4⁺ T cells, associated with mild disease outcome, which likely play important protective roles in subsequent viral infection events.

Introduction

Coronavirus disease 2019 (COVID-19) was a worldwide pandemic caused by severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), resulting in global morbidity and mortality. Studies in acute and convalescent individuals demonstrated that T cell responses to the virus associate with reduced disease severity^1,2, suggesting the importance of T cell responses in the control and resolution of SARS-CoV-2 infection^{3,4,5,6,7,8,9,10}.

Multiple immunodominant SARS-CoV-2 spike protein-specific T cell epitopes have been identified and are frequently detected in individuals who have recovered from infection and following vaccination^{11,12,13,14,15,16}. More broadly, T helper 1 (Th1) and T follicular helper (Tfh) cell subsets have been well studied in SARS-CoV-2 natural infection and vaccination. Spike-specific memory Tfh cells are reported to persist following natural infection¹⁷ and vaccination^15,16, and importantly, they associate with sustained anti-Spike antibody responses^12,18,19. Spike-specific Th1 cells were also observed in both infection and vaccination^20,21,22, with increased Th1 responses associating with severe COVID disease¹⁸, while vaccine-induced Th1 cell responses correlated with the frequency of CD8⁺ T cells¹². In addition to the canonical helper CD4⁺ T cell subsets, cytotoxic CD4⁺ T cells (CD4⁺ CTL) with direct effector and antiviral activity have been reported in several human viral infections^{23,24,25,26,27,28}. Cytotoxic SARS-CoV-2-reactive CD4⁺ T cells have been identified in hospitalised COVID-19 patients²⁹ and 2 years following initial infection³⁰, with a significant expansion in lung infiltrates in severe disease³¹. However, whether SARS-CoV-2-reactive CD4⁺ CTLs contribute to virus clearance or to immune-mediated pathology remains unclear.

Current technological advances combining population-specific single-cell transcriptomic profiling with T cell receptor (TCR) sequence analysis have enabled researchers to study the quality of T cell responses and their ability to control virus replication^6,32,33. However, due to the low frequency of antigen-specific T cells in the periphery and the limited availability of experimental tools for human study, there are very few studies investigating the immunodominant T cell responses along with their corresponding TCR repertoire, memory T cell establishment and their evolution following subsequent antigen exposure. As the T cell receptor repertoire is highly diverse, our knowledge of immunodominant epitope responses and their corresponding T cell receptor repertoire, along with a functional phenotype, is missing in the literature. Nevertheless, this analysis is critical for our understanding of the formation, evolution and function of the T cell memory pool arising from primary infection and the impact of subsequent antigen exposures, whether that comes from re-infection or vaccination.

In this study, we focus on three conserved immunodominant spike-specific CD4⁺ T cell epitopes (spike_166–180, spike_751–765 and spike_866–880) previously identified in individuals who had recovered from SARS-CoV-2 infection and vaccination^{13,14,15,34,35}. We examine their TCR usage and their phenotypical and functional differences between 1–3 months of convalescence and 3–4 years following initial infection by ex vivo single-cell transcriptomic and TCR repertoire analysis, along with in vitro functional evaluation. We identify dominant public TCRα clonotypes associated with mild COVID-19 disease by integrating our data with publicly available single-cell RNA sequencing (scRNA-seq) datasets from early pandemic cases. Our longitudinal analysis reveals that the clonotypes persisting after the initial infection show distinct functionality compared to clonotypes that are lost over time. Furthermore, transcriptional profiling uncovers key differences in Th1 and cytotoxicity signatures between 1–3 months and 3–4 years after initial infection. Notably, the increased cytotoxicity of CD4⁺ spike-specific T cells at 3–4 years is TCR independent, driven primarily by GZMA and associated with significant viral suppression. These findings provide critical insights into the long-term memory response to a previously unknown pathogen.

Results

Identification of immunodominant CD4⁺ T cell responses to SARS-CoV-2 spike protein

We and others previously identified three dominant SARS-CoV-2 spike protein (S) CD4⁺ T cell epitopes: S_166–180 (CTFEYVSQPFLMDLE)^13,14,16, S_751–765 (NLLLQYGSFCTQLNR)^13,15 and S_866–880 (TDEMIAQYTSALLAG)^13,34,35. The HLA-restriction of these epitopes was defined using IFN-γ ELISpot or peptide-MHC-Class II tetramer staining (Supplementary Fig. 1a–c). S_166–180-specific T cells are restricted by HLA-DPB1*04:01, while S_751–765- and S_866–880-specific T cells are restricted by HLA-DRB1*15:01. An overview of the study design is shown in Fig. 1a. Our cohort comprised 48 individuals who had recovered from COVID-19 (Supplementary Data 1), with 30 (66.7% of 45) being HLA-DPB1*04:01 positive and 17 (36.2% of 47) carrying HLA-DRB1*15:01 (Fig. 1b). Ex vivo IFN-γ ELISpot analysis using convalescent PBMC samples showed that 68% (17/25) of DPB1*04:01 individuals responded to S_166–180, while 85.7% (12/14) and 71.4% (10/14) of HLA-DRB1*15:01 positive patients showed responses to S_751–765 and S_866–880 respectively (Fig. 1c), confirming the immunodominance of these epitopes.

**Fig. 1: Characterising three immunodominant spike-specific T cell responses targeting S_166–180-DPB1*04:01, S_751–765-DRB1*15:01 and S_866–880-DRB1*15:01 epitopes in COVID-19 patients.**

To further characterise the immune response to these immunodominant epitopes, we generated 50 S_166–180-specific T cell clones from four participants, 54 S_751–765-specific T cell clones from four participants and 49 S_866–880-specific T cell clones from three participants. All clones were established from 1–3-month convalescent samples, and the TCR clonotype was evaluated. Purity of the T cell clones was confirmed with tetramer staining after each round of expansion, and functional assays were only performed when purity was >95% (Supplementary Fig. 1d–f). To assess antigen-sensitivity of the clones, T cells were co-cultured with B cell lines loaded with titrated peptide and cytokine expression was measured by intracellular cytokine staining (ICS). Antigen-sensitivity of 32 S_166–180-specific, 45 S_751–765-specific and 48 S_866–880-specific T cell clones was evaluated. All clones expressed TNF-α, IFN-γ and IL-2 following antigen stimulation (Supplementary Fig. 1g). S_866–880-specific T cells showed the highest antigen-sensitivity, with the lowest half maximal effective concentration (EC₅₀) calculated from TNF-α, IFN-γ and IL-2 production.
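The link between a lower EC₅₀ and higher antigen-sensitivity can be illustrated with a minimal sketch. It assumes a simple log-linear interpolation between the two titration points that bracket the half-maximal response; the function name, peptide concentrations and response values are invented for illustration, and the study may well have fitted a full dose-response curve instead.

```python
import math

def ec50_by_interpolation(concentrations, responses):
    """Estimate EC50 by log-linear interpolation (illustrative assumption).

    concentrations: peptide concentrations in ascending order
    responses: matching responses (e.g. % cytokine-positive cells)
    """
    top = max(responses)
    bottom = min(responses)
    half = bottom + (top - bottom) / 2.0
    points = list(zip(concentrations, responses))
    for (c_lo, r_lo), (c_hi, r_hi) in zip(points, points[1:]):
        # Use the first pair of neighbouring doses that brackets the half-maximal response.
        if r_lo <= half <= r_hi and r_hi != r_lo:
            frac = (half - r_lo) / (r_hi - r_lo)
            log_ec50 = math.log10(c_lo) + frac * (math.log10(c_hi) - math.log10(c_lo))
            return 10 ** log_ec50
    raise ValueError("half-maximal response is not bracketed by the titration")

# Hypothetical titration data for two clones (not data from the study).
doses = [0.001, 0.01, 0.1, 1.0, 10.0]
clone_a = [2.0, 10.0, 50.0, 90.0, 98.0]  # responds at lower peptide doses
clone_b = [1.0, 3.0, 12.0, 55.0, 95.0]   # needs more peptide to respond

ec50_a = ec50_by_interpolation(doses, clone_a)
ec50_b = ec50_by_interpolation(doses, clone_b)
```

In this toy example `ec50_a` is smaller than `ec50_b`, so clone A would be described as the more antigen-sensitive of the two.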

Dominant TCRα but broad TCRβ clonotypes amongst spike-specific CD4⁺ T cells

We carried out ex vivo SmartSeq2 to analyse the gene expression and the TCR of spike-specific CD4⁺ T cells from individuals at both 1–3 months and 3–4 years after infection (Supplementary Table 1). In total, our dataset comprised 702 tetramer-sorted cells from six patients during 1–3 months of convalescence (68 S_166–180-specific, 293 S_751–765-specific and 341 S_866–880-specific CD4⁺ T cells) and 1735 tetramer-sorted cells from ten patients at the 3–4-year follow-up (1135 S_166–180-specific, 266 S_751–765-specific and 334 S_866–880-specific CD4⁺ T cells). A further 379 cytokine-sorted S_166–180-specific T cells from five patients at 1–3 months of convalescence were analysed for their TCR repertoire.

Consistent with previous reports^16,29,36,37, we observed a high diversity in V gene usage for both the alpha and beta chains, with a similar broad repertoire at both timepoints (Fig. 2a, b). Interestingly, despite a relatively broad usage of TRAV genes, we identified a bias of TRAV35 used by S_166–180-specific cells (66.1% of all S_166–180-specific cells), TRAV12-1 used by S_751–765-specific cells (45.5% of all S_751–765-specific cells) and TRAV26-1 used by S_866–880-specific cells (41.5% of all S_866–880-specific cells) (Fig. 2a). Unlike the alpha chain, TRBV gene usage appears more diverse across all three epitopes, showing no specific bias (Fig. 2b). Investigating the beta chain pairing with the dominant TRAV genes shows a broad beta gene pairing for S_166–180-specific cells with negligible differences between timepoints (Fig. 2c). However, dominant TRAV genes for S_751–765- and S_866–880-specific T cells show a different preference in beta gene pairing between the timepoints: for S_751–765-specific cells, TRBV24-1 is preferential at 1–3 months whereas TRBV6-1 is dominant at 3–4 years; for S_866–880-specific T cells, TRBV20-1 pairs dominantly with TRAV26-1 at convalescence, while TRBV3-1 is preferred at 3–4 years (Fig. 2c, d). This finding was similar across all the participants and was not driven by a particular individual. Overall, we found no difference in TCR clonotype diversity between the two timepoints, other than a decrease in the TCRα clonotype diversity of S_166–180-specific T cells at 3–4 years (Supplementary Fig. 2a, b).

**Fig. 2: TCR repertoire analysis of spike-specific CD4⁺ T cells at 1–3 months and 3–4 years post infection.**

An earlier study reported that TCRs recognising the same antigen have similar TCR sequences³⁸. We therefore analysed similarity networks for the spike-specific TCR CDR3 alpha clonotypes and identified clusters of highly similar clones (Fig. 2d). A cluster defined as TRAV35-CAGXNYGGSQGNLIF, specific to S_166–180, is the largest alpha clonotype. This clonotype corresponded to the dominant TRAV gene used by the S_166–180-specific T cells, and is highly public, reported previously^16,36. S_751–765-specific (TRAV12-1-CVVNXXSSNTGKLIF) and S_866–880-specific (TRAV26-1-CIVRXANQAGTALIF) clusters were also identified, corresponding to the dominant TRAV genes used by S_751–765- and S_866–880-specific T cells, respectively. Similarity network analysis of spike-specific TCR CDR3 beta clonotypes identified smaller clusters compared to the alpha clonotypes (Supplementary Fig. 2c), and these were used for downstream analysis.
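To make the idea of a similarity network concrete, the sketch below groups CDR3α sequences into clusters and writes each cluster as a motif with `X` at variable positions, as in the notation above. It rests on assumptions that are not taken from the study: two sequences are linked only if they have the same length and differ at no more than one position, and the residues filled in at the variable positions are invented examples. The similarity metric actually used may differ.

```python
from itertools import combinations

def hamming(a, b):
    """Number of mismatched positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))

def cluster_cdr3(seqs, max_mismatch=1):
    """Group CDR3 sequences into connected components of a similarity graph."""
    seqs = sorted(set(seqs))
    parent = {s: s for s in seqs}

    def find(s):
        # Union-find lookup with path halving.
        while parent[s] != s:
            parent[s] = parent[parent[s]]
            s = parent[s]
        return s

    for a, b in combinations(seqs, 2):
        if len(a) == len(b) and hamming(a, b) <= max_mismatch:
            parent[find(a)] = find(b)

    clusters = {}
    for s in seqs:
        clusters.setdefault(find(s), []).append(s)
    # Largest cluster first.
    return sorted(clusters.values(), key=len, reverse=True)

def consensus(cluster):
    """Keep conserved residues and write variable positions as 'X'."""
    return "".join(col[0] if len(set(col)) == 1 else "X" for col in zip(*cluster))

# Hypothetical CDR3 alpha sequences (the variable residues are invented).
cdr3s = [
    "CAGANYGGSQGNLIF", "CAGGNYGGSQGNLIF", "CAGSNYGGSQGNLIF",
    "CIVRGANQAGTALIF", "CIVRSANQAGTALIF",
    "CAVSDRGSTLGRLYF",
]
clusters = cluster_cdr3(cdr3s)
motifs = [consensus(c) for c in clusters]
```

With these inputs the largest cluster collapses to the motif `CAGXNYGGSQGNLIF`, which mirrors how a cluster of near-identical sequences can be summarised by a single clonotype label.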

We next sought to identify public TCR clonotypes, defined as clonotypes shared by more than one unrelated individual. We identified 30 public alpha chains specific to S_166–180, 12 to S_751–765 and 14 to S_866–880, as well as 13 public beta chains specific to S_166–180, 8 to S_751–765 and 8 to S_866–880 (Supplementary Data 2). Comparing the proportion of public clonotypes at different timepoints, we observed an increase in the proportion of cells with a public TCRα clonotype at 3–4 years compared to 1–3 months (χ² = 55.12, p = 1.14 × 10⁻¹³ [Fig. 2e]), with 73.87% of S_166–180-specific T cells having a public TCRα clonotype at 3–4 years. We did not observe a significant correlation between the number of vaccine doses and the proportion of cells with a public TCRα clonotype at 3–4 years (p = 0.9049). Moreover, we found no evidence of preferential retention of public TCRα clonotypes in specific individuals; all participants exhibited cells with public TCRα clonotypes. In contrast, the TCRβ clonotype showed a much lower overall proportion of cells with a public TCRβ clonotype (Fig. 2f), although there is a slightly greater proportion of cells with a public TCRβ clonotype at 1–3 months compared to 3–4 years (χ² = 4.73, p = 0.03 [Fig. 2f]). To confirm the public nature of these clonotypes and eliminate the potential risk of cross-contamination, we used the observed TCR space (OTS) database³⁹, which includes 3,185,982 paired TCRs from 892 individuals reported in 13 independent COVID-19 studies. 84.1% of the alpha clonotypes and 62.1% of the beta clonotypes classified as public in our dataset were also identified in the OTS. Using the OTS, we identified a total of 230 unique public alpha clonotypes and 75 unique public beta clonotypes (Fig. 2g).
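The public-clonotype comparison can be sketched in a few lines. The example below flags clonotypes seen in more than one donor, counts cells with and without a public clonotype at each timepoint, and applies a Pearson χ² test to the resulting 2 × 2 table. The donor labels, clonotype labels and counts are hypothetical, and the choice of an uncorrected Pearson statistic with one degree of freedom is an assumption rather than a description of the study's exact procedure.

```python
import math
from collections import defaultdict

def public_clonotypes(cells):
    """Return clonotypes observed in more than one donor.

    cells: iterable of (donor, timepoint, clonotype) tuples, one per cell.
    """
    donors = defaultdict(set)
    for donor, _timepoint, clonotype in cells:
        donors[clonotype].add(donor)
    return {c for c, d in donors.items() if len(d) > 1}

def chi2_2x2(a, b, c, d):
    """Pearson chi-square statistic for [[a, b], [c, d]], no continuity correction."""
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))

def chi2_pvalue_df1(stat):
    """Upper-tail p-value of a chi-square statistic with one degree of freedom."""
    return math.erfc(math.sqrt(stat / 2.0))

# Hypothetical cells: (donor, timepoint, TCR alpha clonotype label).
cells = [
    ("P1", "early", "A"), ("P1", "early", "B"), ("P2", "early", "C"),
    ("P2", "early", "D"), ("P1", "late", "A"), ("P2", "late", "A"),
    ("P3", "late", "A"), ("P3", "late", "E"),
]
public = public_clonotypes(cells)

# Per timepoint: [cells with a public clonotype, cells without one].
table = {tp: [0, 0] for tp in ("early", "late")}
for _donor, tp, clonotype in cells:
    table[tp][0 if clonotype in public else 1] += 1

stat = chi2_2x2(*table["early"], *table["late"])
p = chi2_pvalue_df1(stat)
```

As a consistency check under the same one-degree-of-freedom assumption, `chi2_pvalue_df1(55.12)` evaluates to roughly 1.1 × 10⁻¹³ and `chi2_pvalue_df1(4.73)` to roughly 0.03, in line with the p-values reported for Fig. 2e and Fig. 2f.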

Dominant TCRα clonotypes are associated with mild COVID-19 disease during primary infection

To investigate whether these public dominant TCRα clonotypes are associated with COVID-19 disease severity, we utilised two publicly available datasets of SARS-CoV-2-reactive CD4⁺ T cells with available TCR information, identified based on the upregulation of CD154/CD137/CD69 expression following peptide pool stimulation, for 52 COVID-19 patients^29,37 (Supplementary Table 2). Cells were classified as having a dominant TCRα clonotype if their CDR3α matched the identified clonotypes, TRAV35 (S_166–180), TRAV12-1 (S_751–765) and TRAV26-1 (S_866–880) shown in Fig. 2d, whereas the rest were considered non-dominant. For both the ref. ²⁹ and the ref. ³⁷ datasets, we identified a significant association between having the dominant TCRα clonotype and mild COVID-19 disease (χ² = 226.17, p = 7.72 × 10⁻⁵⁰ [Fig. 3a] and χ² = 135.34, p = 4.08 × 10⁻³⁰ [Fig. 3b] respectively). Furthermore, since these two datasets used the same method to identify SARS-CoV-2-reactive CD4⁺ T cells, combining them allowed us to find a significant association of the dominant TCRα clonotype with the mild disease outcome (χ² = 488.98, p = 6.59 × 10⁻¹⁰⁷, Fig. 3c). To confirm this association was not driven by either HLA-DPB1*04:01- or HLA-DRB1*15:01-specific associations, we split the analysis to compare the proportion of cells with the dominant S_166–180-specific TCRα clonotypes to non-dominant in HLA-DPB1*04:01⁺ individuals, and the proportion of cells with the dominant S_751–765-specific or S_866–880-specific TCRα clonotypes to non-dominant in HLA-DRB1*15:01⁺ individuals (Supplementary Fig. 3). The dominant S_166–180-specific TCRα clonotypes in HLA-DPB1*04:01⁺ individuals are associated with mild COVID-19 disease (Supplementary Fig. 3a), while the dominant S_751–765-specific or S_866–880-specific TCRα clonotypes in HLA-DRB1*15:01⁺ participants are associated with non-hospitalised COVID-19 (Supplementary Fig. 3b). Therefore, the difference in disease severity proportion is likely not driven by any particular epitope. In summary, these data suggest that these dominant TCRα clonotypes play an important role in protecting individuals from developing severe disease.
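The dominant versus non-dominant classification can be sketched as a motif match on the V gene and CDR3α sequence. The example assumes the clonotype notation used in the text, where `X` stands for any residue and a bracketed pair such as `[Y/W]` for a choice between two residues. The helper names and the test sequences are hypothetical, and the exact matching rules applied to the public datasets may differ.

```python
import re

def motif_to_regex(motif):
    """Translate the motif notation into an anchored regular expression."""
    # '[Y/W]' becomes the character class '[YW]'.
    pattern = re.sub(r"\[([A-Z])/([A-Z])\]", r"[\1\2]", motif)
    # 'X' matches any single residue.
    pattern = pattern.replace("X", "[A-Z]")
    return re.compile("^" + pattern + "$")

# Dominant TCR alpha clonotypes named in the text, as (V gene, CDR3 motif).
DOMINANT = [
    ("TRAV35", motif_to_regex("CAGXNYGGSQGNLIF")),    # S_166-180
    ("TRAV12-1", motif_to_regex("CVVNXXSSNTGKLIF")),  # S_751-765
    ("TRAV26-1", motif_to_regex("CIVRXANQAGTALIF")),  # S_866-880
]

def is_dominant(trav, cdr3, dominant=DOMINANT):
    """True if the V gene matches and the CDR3 fits one of the dominant motifs."""
    return any(trav == gene and rx.match(cdr3) for gene, rx in dominant)

# Hypothetical cells: (V gene, CDR3 alpha sequence).
cells = [
    ("TRAV35", "CAGANYGGSQGNLIF"),    # fits the TRAV35 motif
    ("TRAV12-1", "CVVNRGSSNTGKLIF"),  # fits the TRAV12-1 motif
    ("TRAV26-1", "CAGANYGGSQGNLIF"),  # right CDR3, wrong V gene
    ("TRAV35", "CAGANYGGSQGNLI"),     # truncated CDR3
]
labels = ["dominant" if is_dominant(v, s) else "non-dominant" for v, s in cells]
```

Requiring both the V gene and the full CDR3 motif to match keeps the classification conservative: a cell with the right CDR3 but a different TRAV gene is counted as non-dominant in this sketch.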

**Fig. 3: Association between dominant TCRα clonotypes and disease severity.**

Given the highly expanded and public nature of these dominant TCRα clonotypes, we explored the possibility that these TCRs may arise from sequences that are present at a higher-than-average frequency in a naïve, pre-pandemic repertoire. Spindler et al.⁴⁰ carried out paired TCR sequencing of six healthy individuals prior to 2020 and identified 513,963 unique TCRα clonotypes, with an average of 85,660 clonotypes per individual. From this dataset, we calculated the frequency of the dominant and non-dominant TCRα clonotypes. The dominant S_166–180-specific and S_866–880-specific TCRα clonotypes appeared at a significantly higher frequency in the pre-pandemic repertoires compared to the non-dominant clonotypes (p = 0.042 and p = 0.033 for S_166–180 and S_866–880 respectively, Fig. 3d). The dominant S_751–765-specific TCRα clonotype was detected at a higher frequency than non-dominant clonotypes in the pre-pandemic repertoires, although this did not reach statistical significance.

As SARS-CoV-2-specific responses have been reported to possibly originate from cross-reactive CMV-specific T cell⁴¹ and pre-existing cross-reactive CD4⁺ T cell responses³⁷, we cross checked our TCR clonotypes against those reported in the Immune Epitope Database (IEDB) and VDJ database (VDJdb). None of the public TCRs identified in this study were previously reported to be reactive against other viruses on an HLA-matched background.

TCRα clonotypes maintained at 3–4 years show distinctive cytokine secretion

Given the longitudinal nature of our dataset, we examined whether clonotypes present during the convalescent period persisted for 3–4 years, or whether they are lost and replaced by new clonotypes, due to repeated vaccinations and infection with SARS-CoV-2 variants.

At 3–4 years, 80% of the S_166–180-specific cells retain a TCRα clonotype present at 1–3 months, predominantly the expanded and dominant public TCRα clonotype of TRAV35-CAGXNYGGSQGNLIF (Fig. 4a left); 46.2% of the S_751–765-specific cells have a TCRα clonotype found at 1–3 months, of which 65.8% are the dominant TCRα clonotypes of TRAV12-1-CVVNXXSSNTGKLIF and TRAV12-1-CVVNXGSSASKIIF (Fig. 4a middle); and 49.3% of the S_866–880-specific cells have a TCRα clonotype found at the 1–3-month timepoint, of which 50% are the dominant TCRα clonotypes of TRAV26-1-CIVRXANQAGTALIF and TRAV26-1-CIVRX[Y/W]NFNKFYF (Fig. 4a right).

**Fig. 4: Longitudinal TCRα and TCRβ analysis between 1–3 months and 3–4 years after infection.**

In contrast to the TCRα clonotypes, at 3–4 years, a lower frequency (10%) of S_166–180-specific cells have TCRβ clonotypes present at 1–3 months when compared to the two other epitope-specific T cells, with 44.4% of the S_751–765-specific cells and 45.7% of S_866–880-specific T cells having a TCRβ clonotype also found at 1–3 months (Supplementary Fig. 4a). We next looked at the TCRβ clonotypes pairing with the previously identified dominant TCRα clonotypes (Fig. 4b). Similarly, TCRβ clonotypes pairing with the dominant S_166–180-specific TCRα clonotype show very little sharing between 1–3 months and 3–4 years, compared to S_751–765- and S_866–880-specific T cells, which show expanded TCRβ clonotypes pairing with the dominant S_751–765- and S_866–880-specific TCRα clonotypes, maintained at 3–4 years.

As we identified clonotypes that were either maintained between the two timepoints sampled or not detected at the later timepoint, we interrogated our in vitro functional data of T cell clones isolated from 1–3-month samples to investigate whether the T cell clones with clonotypes maintained at the later timepoint showed differences in overall function. T cell clones were annotated as a maintained clonotype if their clonotype matched a TCR clonotype present in the single-cell TCR-seq dataset at both timepoints. Otherwise, T cell clones were categorised as either ‘1–3 months only’ or ‘3–4 years only’ if their TCR clonotype was found exclusively at one timepoint in the single-cell TCR-seq (Fig. 4c). Compared to the clones classified as ‘1–3 months only’, S_751–765-specific T cell clones with maintained clonotypes showed significantly higher secretion of GM-CSF (adjusted p = 0.0014), IL-4 (adjusted p = 0.0014), IL-5 (adjusted p = 0.0042), IL-6 (adjusted p = 0.0042), IL-13 (adjusted p = 1.04 × 10⁻⁵) and TNF-α (adjusted p = 0.005) (Fig. 4d). However, when investigating the functional avidity of the S_751–765-specific T cell clones, no significant differences were observed in the IL-2 EC₅₀, TNF-α EC₅₀ or IFN-γ EC₅₀ between the two groups (Supplementary Fig. 4b, adjusted p > 0.05), suggesting that the clonal persistence may not be driven by functional avidity.
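The annotation rule described above amounts to a set-membership check, sketched below. The clone names and clonotype labels are hypothetical, and the extra "not detected" label for clones whose clonotype appears at neither timepoint is an assumption added only to make the function total.

```python
def annotate_clones(clone_clonotypes, early, late):
    """Label each T cell clone by where its clonotype appears in the single-cell data.

    clone_clonotypes: dict mapping clone name -> clonotype
    early, late: sets of clonotypes seen at 1–3 months and at 3–4 years
    """
    labels = {}
    for clone, clonotype in clone_clonotypes.items():
        if clonotype in early and clonotype in late:
            labels[clone] = "maintained"
        elif clonotype in early:
            labels[clone] = "1–3 months only"
        elif clonotype in late:
            labels[clone] = "3–4 years only"
        else:
            # Not part of the scheme in the text; added for completeness.
            labels[clone] = "not detected"
    return labels

# Hypothetical clonotype sets from the two single-cell TCR-seq timepoints.
early_clonotypes = {"ct1", "ct2", "ct3"}
late_clonotypes = {"ct2", "ct3", "ct4"}

# Hypothetical in vitro clones and their clonotypes.
clones = {"clone_01": "ct1", "clone_02": "ct2", "clone_03": "ct4", "clone_04": "ct9"}
labels = annotate_clones(clones, early_clonotypes, late_clonotypes)
```

Functional readouts such as cytokine secretion can then be grouped by these labels before comparing, for example, "maintained" against "1–3 months only" clones.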

Taken together, these data suggest that spike-specific T cells with TCRα clonotypes detectable at both timepoints show differentiated cytokine profiles with higher expression of Th2 cytokines, compared to the T cells with TCRα clonotypes that were not detected at the 3–4-year timepoint.

Spike-specific CD4⁺ T cells show distinct transcriptional profiles over time

To investigate potential transcriptomic differences in the spike-specific T cells between the two timepoints, we analysed our scRNA-seq dataset that comprised 2213 total cells collected from two timepoints (1–3 months = 634 cells, 3–4 years = 1579 cells) across three epitope specificities (980 S_166–180-specific, 558 S_751–765-specific and 675 S_866–880-specific cells) from 13 individuals. Spike-specific T cells from the different individuals were integrated, and unsupervised clustering revealed ten distinct clusters based solely on their gene expression profiles (Fig. 5a, b). Interestingly, a unique cluster of cytotoxic T cells (CTL, cluster 7) exhibited high expression of cytotoxicity-related genes (e.g. GNLY, GZMH, NKG7 and GZMB). In addition, cells in this cluster showed elevated expression of the apolipoprotein B mRNA editing enzyme, catalytic polypeptide (APOBEC) genes APOBEC3C and APOBEC3G. Other clusters that were identified included an interferon-stimulated genes (ISGs) positive cell cluster (cluster 10, marker genes include OASL, IFIT2 and ISG15), two central memory clusters (clusters 5 and 1, with marker genes including CCR7 [cluster 5] and CXCR4 and IL7R [cluster 1]), and two clusters showing increased activation markers (clusters 4 and 6, with marker genes including FOS, DUSP1 and CD69 [cluster 4] and ANXA1 and LMNA [cluster 6]). Clusters generally comprised cells from all patients and all sequencing runs (Supplementary Fig. 5a, b), indicating the clustering analysis does not represent patient-specific subpopulations or batch effects.

**Fig. 5: scRNA-seq transcriptomic comparison of spike-specific CD4⁺ T cells at 1–3 months and 3–4 years.**

As scRNA-seq of S_166–180-specific cells at 1–3 months was conducted using cytokine-sorted cells, these cells were excluded from the longitudinal transcriptional analysis. We investigated whether there were any proportional differences in the number of cells from each timepoint across different clusters of S_751–765-specific and S_866–880-specific T cells. Although no differences were evident between these two epitope-specific cells, differences were observed when comparing 1–3-month and 3–4-year samples (Fig. 5c). At 3–4 years, the proportion of cells in clusters 1, 2 and 8 increased, while the cell populations in clusters 3, 4 and 6 declined compared to 1–3 months. This was confirmed by differential abundance testing using the miloR package (Fig. 5d, e).

To further investigate differences between memory cells from 1–3 months and 3–4 years, we generated module scores using gene lists compiled from the literature (Supplementary Table 3). Cells from 1–3 months exhibited a greater expression of genes involved in the positive regulation of Th1 differentiation (ANXA1, CCR7 and IRF1), a greater level of integrin expression (SELPLG, ITGAE and ITGA4), and genes involved in T cell activation (CD44, CD27 and TUBA1B) (p < 0.0001 for all, Fig. 5f). In contrast, cells from 3–4 years exhibited greater cytokine gene expression (IL16, IL32 and CCL4) and cytotoxicity (GZMA and GNLY) (p < 0.0001 for all, Fig. 5f). Moreover, this appears to be independent of epitope specificity (Supplementary Fig. 6a). Comparing the module scores for the S_751–765- and S_866–880-specific cells showed consistent differences, with increased Th1, Treg, integrin expression and activation signatures at 1–3 months and an increased cytotoxicity and cytokine signature at 3–4 years.
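A module score summarises the expression of a gene list in each cell as a single number. The sketch below uses a deliberately simplified stand-in: the mean expression of the module genes minus the mean of all other measured genes in the same cell. This is an assumption for illustration only; widely used single-cell toolkits subtract expression-matched control genes instead, and the expression values shown are invented.

```python
from statistics import mean

def module_score(cell, module_genes):
    """Mean expression of module genes minus mean of the remaining genes.

    cell: dict mapping gene symbol -> normalised expression value
    module_genes: set of gene symbols that define the module
    """
    in_module = [v for g, v in cell.items() if g in module_genes]
    background = [v for g, v in cell.items() if g not in module_genes]
    if not in_module or not background:
        raise ValueError("need at least one module gene and one background gene")
    return mean(in_module) - mean(background)

# Cytotoxicity genes named in the text.
CYTOTOXICITY = {"GZMA", "GNLY"}

# Hypothetical normalised expression for one early and one late cell.
early_cell = {"GZMA": 0.2, "GNLY": 0.0, "CCR7": 2.1, "IRF1": 1.5, "CD44": 1.8}
late_cell = {"GZMA": 2.4, "GNLY": 1.6, "CCR7": 0.9, "IRF1": 0.6, "CD44": 1.2}

early_score = module_score(early_cell, CYTOTOXICITY)
late_score = module_score(late_cell, CYTOTOXICITY)
```

In this toy example the late cell has the higher cytotoxicity score, which is the direction of change reported for the 3–4-year samples; comparing score distributions across many cells is what supports a statistical statement.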

Spike-specific CD4⁺ T cells at 3–4 years exhibit increased cytotoxicity signatures driven by GZMA expression, correlating with viral suppression in vitro

Given the increased cytotoxicity signature observed at 3–4 years post infection, we further investigated whether this difference is driven by TCR clonotype. We categorised the cells into dominant or non-dominant TCRα groups. S_751–765- and S_866–880-specific CD4⁺ T cells at 3–4 years consistently showed a greater cytotoxicity signature compared to cells at 1–3 months, regardless of whether the cells had a dominant or non-dominant TCRα clonotype (Fig. 6a). We confirmed this by examining clonotype sharing between cluster 7 (the CD4⁺ CTL cluster) and the remaining clusters. Our analysis revealed that TCRα clonotypes are not restricted to cluster 7 but shared amongst all clusters (Fig. 6b), indicating that the difference in cytotoxicity is likely to be independent of TCR.

**Fig. 6: Investigation of the CD4⁺ cytotoxicity signature in spike-specific cells at 3–4-year follow-up.**

We then investigated individual genes driving the cytotoxicity signature at 3–4 years, and found that the expression of GZMA is significantly upregulated at 3–4 years (Fig. 6c, p = 0.011 and p = 7.5 × 10⁻⁸ for S_751–765- and S_866–880-specific cells, respectively), whereas other cytotoxic molecules, such as GZMB (Fig. 6d, p > 0.05) and PRF1 (Fig. 6e, p > 0.05), showed no significant differences. This suggests that the cytotoxicity signature at 3–4 years is primarily driven by GZMA. Furthermore, at 3–4 years post infection, a greater proportion of cells in the cytotoxic cluster were found in S_166–180-specific T cells (CTL, χ² = 20.007, p = 4.52 × 10⁻⁵ [Fig. 6f]), with 11.02% of cells being cytotoxic T cells, compared to 3.4% for S_751–765 and 5.68% for S_866–880. Amongst these cells, S_166–180-specific CD4⁺ T cells express the highest level of GZMA at 3–4 years, compared to S_751–765- and S_866–880-specific cells (Fig. 6g, adjusted p = 1.98 × 10⁻⁶ and 2.64 × 10⁻⁶ for S_166–180 vs S_751–765 and S_166–180 vs S_866–880, respectively). Given that our dataset contained three times as many S_166–180-specific cells as S_751–765-/S_866–880-specific cells at 3–4 years, we randomly down-sampled the S_166–180-specific cells to 300 cells (matching the number of S_751–765- and S_866–880-specific cells), and still observed a higher level of GZMA expression in S_166–180-specific cells (Supplementary Fig. 6b).
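The down-sampling control can be sketched as a seeded random draw without replacement, so that the larger group is compared at the same size as the smaller ones. The expression values below are simulated purely for illustration, and the group sizes, seed and function name are assumptions rather than details taken from the study.

```python
import random
from statistics import mean

def downsample(values, n, seed=0):
    """Draw n values without replacement, using a fixed seed for reproducibility."""
    if n > len(values):
        raise ValueError("cannot draw more cells than are available")
    return random.Random(seed).sample(values, n)

# Simulated per-cell GZMA expression (not data from the study).
rng = random.Random(42)
large_group = [rng.gauss(2.0, 0.5) for _ in range(1135)]  # over-represented epitope
small_group = [rng.gauss(1.0, 0.5) for _ in range(300)]   # comparison epitope

# Match the larger group to the size of the smaller one before comparing.
matched = downsample(large_group, len(small_group))
difference = mean(matched) - mean(small_group)
```

Repeating the draw with several seeds is a simple way to confirm that a difference does not depend on which cells happened to be sampled.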

This TCR-independent, GZMA-driven cytotoxicity was further validated in in vitro T cell clones specific for the three epitopes. We assessed the cytotoxicity of 98 CD4⁺ T cell clones (33 S_166–180-specific CD4⁺ T cell clones from four individuals, 30 S_751–765-specific clones from four individuals and 35 S_866–880-specific CD4⁺ T cell clones from three individuals; all clones were derived from 1–3-month samples). Clones were categorised as cytotoxic T cell clones if their killing capacity was >10% (see Methods for further details). Their killing of target cells was confirmed to be MHC class II-dependent by using an HLA-DR blocking antibody (Supplementary Fig. 6c). T cell clones with the same TCR clonotype exhibited distinct cytotoxic capability for all three epitopes, with some clones being cytotoxic and others not, despite the same TCR clonotype (Fig. 6h). Bulk RNAseq and proteomic analysis of three cytotoxic and three non-cytotoxic S_866–880-specific CD4⁺ T cell clones showed that the cytotoxic CD4⁺ clones expressed higher GZMA at both the RNA level and the protein level, compared to non-cytotoxic clones (Fig. 6i, j, p = 4 × 10⁻⁴ and p = 0.032, respectively), with no significant differences in GZMB expression. Importantly, cytotoxicity of the T cell clones was associated with their suppression of SARS-CoV-2 replication (Fig. 6k, ρ = 0.390, p = 7.05 × 10⁻⁵), with cytotoxic T cell clones showing an overall greater level of viral suppression compared to non-cytotoxic CD4⁺ T cell clones (Fig. 6l, p = 0.00906). These results suggest that spike-specific memory CD4⁺ T cells at 3–4 years after infection retain more of a cytotoxic T cell memory, driven by GZMA, capable of controlling virus replication efficiently.

Discussion

CD4⁺ T cell memory plays an essential role in viral infections. However, the mechanisms underlying its long-term memory formation, particularly the phenotypical and functional changes, as well as the evolution of the TCR repertoire following repeated antigen exposures such as multiple vaccinations and re-infection, remain poorly understood in humans. This is largely due to the low frequency of antigen-specific T cells and the lack of suitable analytical tools. In this study, we longitudinally characterised the three most dominant spike-specific CD4⁺ T cell responses restricted by common HLA Class II alleles (S_166–180-DPB1*04:01, S_751–765-DRB1*15:01 and S_866–880-DRB1*15:01), by interrogating the ex vivo TCR repertoires and transcriptomes of antigen-specific single cells, together with in vitro functional evaluation.

We first identified significantly expanded and dominant TCRα clonotypes in 1–3-month samples that are associated with mild COVID-19 disease during the primary infection, when interrogating two studies from the early stages of the COVID-19 pandemic^29,37. Given the limited number of participants in the two studies and the absence of HLA-typing for some individuals in the Meckiff et al. study²⁹, these findings need to be further validated in larger cohorts. Nevertheless, public TCR clonotypes have been observed in the literature associated with dominant virus-specific responses⁴². Among the three epitopes, a higher frequency of S_166–180-specific T cells uses public TCRα clonotypes, with more public TCRα clonotypes being identified. The expanded public TCR clonotype used by this epitope, together with the high prevalence of HLA-DPB1*04:01 at the population level⁶, suggests that this response may contribute significantly to the CD4⁺ T cell memory response to SARS-CoV-2. Surprisingly, the dominant public TCRα clonotypes used by these three dominant spike-specific responses were found at a relatively high proportion in naïve pre-pandemic TCR repertoires compared to other epitope-specific clonotypes. Cross-reactive SARS-CoV-2-specific memory CD4⁺ T cells are readily detectable ex vivo in ~20–50% of unexposed people^43,44. The pre-existing cross-reactive T cells were induced by common cold viruses, other coronaviruses, or even other microbes, and have been reported to associate with mild COVID-19 infection^45,46,47. Although it is not known whether the dominant and public TCRα clonotypes identified in our study are T cell precursors or pre-existing cross-reactive TCRs, T cells bearing these TCRs may be able to elicit a rapid T cell response during the primary infection, contributing to mild disease outcome.

We next examined the TCR repertoire changes of these three immunodominant spike-specific T cell responses between convalescence (1–3 months after primary infection) and the 3–4-year follow-up. At 3–4 years, the majority of T cells retained the TCRα clonotypes found at 1–3 months, and the public dominant TCRα clonotypes of all three responses persisted. T cell clones bearing the clonotypes detected at both timepoints showed overall higher secretion of pro-inflammatory/Th2 cytokines such as GM-CSF, IL-4, IL-5, IL-6, IL-13 and TNF-α than clones bearing clonotypes not detected over time, suggesting that CD4⁺ T cells producing more Th2 cytokines during the resolution stage are more likely to be maintained at higher frequency in long-term memory. Given that the sequences of the S_166–180, S_751–765 and S_866–880 epitopes show very little similarity to those of seasonal coronaviruses but are highly conserved across SARS-CoV-2 variants, seasonal coronaviruses are unlikely to have contributed to repeated recall of the spike-specific CD4⁺ T cells. However, repeated stimulation by SARS-CoV-2 reinfections or multiple vaccine doses may have driven repeated recall of spike-specific T cells targeting the three epitopes, thus affecting their TCR repertoire, survival and cytokine profile at 3–4 years after primary infection.

Investigating the transcriptomes of these spike-specific cells, we unexpectedly identified a decreased Th1 signature, together with an increased cytotoxicity signature, at 3–4 years compared to convalescence at 1–3 months. Further analysis of the ex vivo single-cell data and the in vitro functional and multi-omic data of T cell clones revealed that this change in the cytotoxicity transcriptome is TCR independent and driven by increased expression of GZMA. It is unclear how cytotoxic CD4⁺ T cells develop. Gray-Gaillard et al.³⁰ suggest that infection-associated inflammation may have imprinted a cytotoxic signature on memory CD4⁺ T cells, as they show that spike-specific CD4⁺ T cells induced by infection remained enriched for transcripts related to cytotoxicity, whereas spike-specific CD4⁺ T cells induced by mRNA vaccination did not. In our cohort, the participants were recruited at the convalescent stage and then sampled again 3–4 years after primary infection. Between these two sampling points, these participants likely underwent multiple vaccinations and possible infections by SARS-CoV-2 variants. This repeated antigen stimulation may have contributed to the increased cytotoxicity signature observed at 3–4 years.

Unlike the cytotoxicity of CD8⁺ T cells, which is predominantly mediated by perforin and granzyme B^48,49, the cytotoxicity of CD4⁺ spike-specific T cells appears to be driven by increased expression of GZMA: GZMA expression was significantly higher at 3–4 years than at 1–3 months, whereas no significant difference was observed in the expression of perforin or GZMB between the two timepoints. The greater level of GZMA was confirmed at both the RNA and protein level in cytotoxic compared with non-cytotoxic S_866–880-specific T cell clones. These cytotoxic CD4⁺ T cell clones showed enhanced SARS-CoV-2 viral suppression in vitro, highlighting their direct effector function against virus replication^29,50. Although it has been shown that direct killing of infected cells by CD4⁺ CTLs accumulated in the lung may have contributed to immunopathogenesis in severe COVID-19^29,50,51, these CD4⁺ CTLs in the long-term memory pool may contribute to protective immunity upon re-encountering the virus, as previously reported in a human influenza challenge study²³.

Our study has some limitations. Firstly, our analysis was carried out in a relatively small number of individuals; the results therefore need to be validated in a larger cohort. Secondly, the analysis of ex vivo spike-specific T cells by scRNA-seq is limited by their relatively low frequency in the circulation. Using tetramer staining to sort spike-specific CD4⁺ T cells from ex vivo PBMC samples allowed the analysis of highly specific single cells, but limited the number of cells that could be analysed. Together with limited sampling depth, this may have resulted in insufficient TCR capture at each timepoint; therefore, the absence of certain TCR clonotypes at the later timepoint does not necessarily imply their loss over time. Furthermore, our study focused on only two timepoints, 1–3 months and 3–4 years after primary infection. Sampling at additional timepoints in between, particularly after each antigen stimulation (vaccination or breakthrough infection), would provide a more detailed understanding of how the long-term memory pool evolved and became established.

Taken together, in this study we have identified high-frequency public TCRs (in particular alpha chains) targeting three highly conserved immunodominant spike epitopes, which are associated with mild disease outcome and with increased cytotoxicity 3–4 years after initial infection, consistent with an important role in protecting the population from subsequent virus infection. Our data also provide new insights into the evolution of memory T cell responses arising from human primary infection, the impact of subsequent antigen exposures, and potential intrinsic T cell factors that may contribute to differences in immune memory formation.

Methods

Study participants and clinical definitions

Patients hospitalised during the SARS-CoV-2 pandemic were identified at the John Radcliffe Hospital in Oxford, UK, between March 2020 and September 2021 and recruited into the Sepsis Immunomics study. All participants were sampled at least 28 days after symptom onset during the primary infection. Ten of them, who were further sampled around 3–4 years after initial infection (Supplementary Table 1), had received multiple doses of vaccine and showed no symptoms of SARS-CoV-2 infection in the 3–6 months before sampling (Supplementary Data 1). Written informed consent was obtained from all patients. Ethical approval was given by the South Central-Oxford C Research Ethics Committee in England (ref. 19/SC/0296). Clinical definitions were as previously described⁵². In brief, the degree of severity was classified as mild, severe or critical infection, according to recommendations from the World Health Organization.

Generation of ACE2-transduced EBV-transformed B lymphoblastoid cell lines (BCLs)

EBV-transformed BCLs⁵³ and ACE2-transduced BCLs were established as described previously⁶. In brief, the cDNA for the human ACE2 gene (ENSG00000130234) was cloned into a lentiviral vector backbone (Addgene plasmid ID 17488), then co-transfected with packaging plasmids pMD2.G and psPAX2 into HEK293-TLA using PEIpro (Polyplus) to produce lentivirus. EBV-transformed BCLs were infected with ACE2-coding lentivirus, followed by cell sorting via flow cytometry to enrich ACE2-expressing B cells. B cells with stable expression of ACE2 were maintained with 0.5 μg ml⁻¹ of puromycin (Thermo Fisher Scientific). Mycoplasma testing was carried out every 4 weeks with all cell lines using the MycoAlert detection kit (Lonza).

Generation of T cell lines and clones

Short-term SARS-CoV-2-specific T cell lines were generated as described previously⁵³. Briefly, 2 × 10⁶ PBMCs were stimulated with 10 μM peptides at 37 °C for 1 h and cultured in H10 (RPMI 1640 medium with 10% human serum, 2 mM glutamine, 100 units/ml of penicillin and 100 μg/ml of streptomycin) at 2 × 10⁶ cells per well in a 24-well plate (Costar). IL-2 was added to a final concentration of 100 IU/ml on day 3. S_751–765- and S_866–880-specific T cell clones were established by sorting tetramer⁺ CD4⁺ T cells from thawed PBMCs or short-term T cell lines on days 10–14. S_166–180-specific T cell clones were generated by cell sorting with a TNF-α, IFN-γ and IL-2 secretion assay (Miltenyi). T cell clones were then expanded with irradiated allogeneic PBMCs every 2–3 weeks as described previously⁵⁴.

IFN-γ ELISpot assay

Ex vivo assays were carried out using either freshly isolated or cryopreserved PBMCs as described previously¹³. Peptides were added to 2 × 10⁵ PBMCs at a final concentration of 2 μM for 16–18 h. For in vitro ELISpot assays, autologous and allogeneic EBV-transformed BCLs were loaded with peptides and subsequently co-cultured with polyclonal T cell lines at an effector:target (E:T) ratio of 1:50 for at least 6 h; negative control wells containing BCLs and T cells without peptide were included. To quantify antigen-specific responses, the mean spot count of the control wells was subtracted from that of the sample wells, and the results were expressed as spot-forming units (SFU) per 10⁶ PBMCs. Responses were considered positive if results were at least three times the mean of the negative control wells and >25 SFU/10⁶ PBMCs. If negative control wells had >30 SFU/10⁶ PBMCs or positive control wells (phytohaemagglutinin stimulation) were negative, the results were excluded from further analysis.
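
The positivity and exclusion criteria above can be sketched as a short calculation. This is a minimal illustration rather than the analysis code used in the study: the function and argument names are hypothetical, the input is assumed to be raw spot counts per well from 2 × 10⁵ PBMCs, and it assumes that the threefold criterion is applied to the unsubtracted sample mean while the >25 SFU criterion is applied to the background-subtracted value. The positive control check is not modelled.

```python
from statistics import mean

def elispot_response(sample_spots, negative_spots, cells_per_well=2e5):
    """Return (net SFU per 10^6 PBMCs, status) for one peptide condition.

    sample_spots / negative_spots: raw spot counts from replicate wells.
    Names and the exact ordering of the checks are illustrative assumptions.
    """
    scale = 1e6 / cells_per_well             # per-well counts -> per 10^6 cells
    sample_sfu = mean(sample_spots) * scale
    negative_sfu = mean(negative_spots) * scale
    if negative_sfu > 30:                    # high background: exclude the result
        return None, "excluded"
    net_sfu = sample_sfu - negative_sfu      # background-subtracted response
    positive = sample_sfu >= 3 * negative_sfu and net_sfu > 25
    return net_sfu, "positive" if positive else "negative"
```

For example, duplicate wells with 20 and 22 spots against a background of 1 spot per well give a net response of 100 SFU/10⁶ PBMCs, which passes both criteria.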

Intracellular cytokine staining (ICS)

ICS was performed as described previously¹³. T cells were co-cultured with BCLs loaded with peptides at 37 °C for 6 h with GolgiPlug and GolgiStop (BD Biosciences). Cells were stained with Live/Dead Fixable Aqua dye (Invitrogen) followed by surface staining with CD4-PE-Cy7 (BD Biosciences). After subsequent permeabilisation with Fixation/Permeabilisation solution (BD Biosciences), cells were stained with IFN-γ-Alexa Fluor 488 (BD Biosciences), TNF-α-APC (eBioscience) and IL-2-BV421 (BioLegend). Negative controls without peptide were set up for each sample. Samples were run on an Attune NxT Flow Cytometer (software v.3.2.1) and analysed using FlowJo v.10 software (FlowJo LLC). A representative gating strategy used for the ICS assay is shown in Supplementary Fig. 7a.

Cytokine production assessment

To assess cytokine production, T cells were co-cultured with BCLs loaded with or without peptide at an E:T ratio of 2:1. After 48 h, 50 μl of supernatant was collected for cytokine detection. Cytokines, including IFN-γ, TNF-α, IL-2, IL-4, IL-6, IL-10, IL-13, RANTES and GM-CSF, were quantified using the Bio-Plex Pro Human Cytokine Assay (Bio-Rad) following the manufacturer’s instructions and run on a Bio-Plex 200 (Bio-Rad). Concentrations were analysed with Bio-Plex Manager (Bio-Rad).

CFSE-based cytotoxic T lymphocyte killing assay

EBV-transformed BCLs were labelled with 0.5 μM carboxyfluorescein succinimidyl ester (CFSE, Thermo Fisher Scientific), then loaded with 2 μM of peptide at 37 °C for 1 h. Subsequently, cells were washed, counted and co-cultured with T cells at an E:T ratio of 4:1 at 37 °C for 6 h. Samples were then stained with 7-AAD (eBioscience) and CD19-BV421 (BioLegend). To assess the MHC class II dependence of the killing, B cell lines were treated with either 40 μg/ml of anti-HLA-DR antibody (BioLegend) or isotype control (BioLegend) at room temperature for 1 h prior to being loaded with peptide. Cell death was assessed based on the presence of CFSE⁺CD19⁺7-AAD⁻ (live) cells. Negative controls containing T cells and BCLs without peptide pulse were included for each sample. Samples were run on an Attune NxT Flow Cytometer (software v.3.2.1) and analysed using FlowJo v.10 software (FlowJo LLC). A representative gating strategy used for the killing assay is shown in Supplementary Fig. 7b.

Live virus suppression assay

As described previously⁶, BCLs expressing ACE2 were infected with SARS-CoV-2 (Victoria 01/20 strain, provided by J. McKeating⁵⁵) at an MOI of 0.1 for 2 h at 37 °C. Cells were then washed and co-cultured with T cells at an E:T ratio of 4:1 at 37 °C. Control wells containing only virus-infected targets were included. Three replicates were set up for each condition. After 48 h incubation, cells were washed with PBS and lysed with RLT buffer (Qiagen), followed by RNA extraction using the RNeasy 96 Kit (Qiagen). Virus copies were quantified by real-time qPCR using a one-step RT mastermix kit (Eurogentec) and the N1 probe contained in the 2019-nCoV RUO kit (IDT), and the viral suppression rate was calculated as the reduction in viral copies when antigen-specific T cells were present.
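
The suppression calculation can be sketched as follows. The text states only that suppression is the reduction in viral copies in the presence of T cells, so the exact formula here (percentage reduction of the replicate mean relative to the virus-only control wells) is an assumption, and the function name and example values are hypothetical.

```python
from statistics import mean

def viral_suppression_percent(copies_with_t_cells, copies_virus_only):
    """Percentage reduction in viral copies relative to infected targets alone.

    Both arguments are replicate viral copy numbers from qPCR. Averaging the
    replicates before taking the ratio is an illustrative choice.
    """
    control = mean(copies_virus_only)
    if control <= 0:
        raise ValueError("control wells must contain detectable virus")
    return (1 - mean(copies_with_t_cells) / control) * 100
```

With triplicate wells averaging 200 copies in the presence of T cells and 1000 copies in the virus-only control, this gives a suppression of 80%.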

Cell sorting for scRNA-seq

S_166–180-specific CD4⁺ T cells from 1–3-month samples were sorted using a cytokine secretion assay following the manufacturer’s instructions (Miltenyi Biotec). Briefly, 3–5 × 10⁶ PBMCs were stimulated with S_166–180 peptide at a final concentration of 10 μM for 5 h. Subsequently, cells were washed and incubated with TNF-α, IFN-γ and IL-2 catching antibodies for 45 min, followed by staining with PE-conjugated TNF-α, IFN-γ and IL-2 detection antibodies, CD3-FITC, CD8-APC, CD14-PE-CF594, CD19-PE-CF594 and CD16-PE-CF594 (BD Biosciences), and CD4-BV421 (BioLegend). Before sorting, cells were stained with PI (eBioscience) to exclude nonviable cells. S_751–765- and S_866–880-specific CD4⁺ T cells from 1–3-month samples were sorted with peptide-MHC class II tetramers. In brief, 3–5 × 10⁶ cells were stained with APC-conjugated HLA-DRB1*15:01 S_751–765 and S_866–880 tetramers (ProImmune), respectively. Live/Dead Fixable Aqua dye (Invitrogen) was used to exclude nonviable cells from the analysis. Cells were washed and stained with the following surface antibodies: CD3-FITC, CD4-PE (BD Biosciences), CD14-BV510, CD19-BV510, CD16-BV510 and CD8-BV421 (BioLegend). After exclusion of nonviable/CD14⁺/CD19⁺/CD16⁺ cells, CD3⁺CD8⁻CD4⁺TNF-α⁺/IFN-γ⁺/IL-2⁺ cells or CD3⁺CD8⁻CD4⁺tetramer⁺ cells were sorted for scRNA-seq using a BD FACSAria Fusion sorter or BD FACSAria III (BD Biosciences). Single cells were directly sorted into 96-well PCR plates (Thermo Fisher Scientific) containing cell lysis buffer and stored at −80 °C for further SmartSeq2 analysis. A representative gating strategy for the tetramer and cytokine sorting of single cells is shown in Supplementary Fig. 7c, d, respectively.

Tetramer-associated magnet enrichment and cell sorting of spike epitope-specific CD4⁺ T cells

S_166–180-, S_751–765- and S_866–880-specific CD4⁺ T cells were enriched from 3–4-year samples prior to sorting for scRNA-seq, as previously described^56,57. In brief, 1.5–3 × 10⁷ PBMCs were labelled with APC- or PE-conjugated peptide-MHC class II tetramers (S_166–180 tetramer from the NIH Tetramer Core Facility, S_751–765 and S_866–880 tetramers from ProImmune) for 30 min. Enrichment was then performed with anti-APC or anti-PE microbeads using magnetic-activated cell sorting technology (Miltenyi Biotec) following the manufacturer’s instructions. Subsequently, enriched S_166–180-, S_751–765- and S_866–880-specific CD4⁺ T cells were stained with CD3-BV786 and CD8-BV510 (BioLegend), and CD4-FITC, CD14-PE-CF594, CD19-PE-CF594 and CD16-PE-CF594 (BD Biosciences). Before sorting, cells were stained with propidium iodide (PI) (eBioscience) to exclude nonviable cells. CD3⁺CD8⁻CD4⁺tetramer⁺ cells were sorted for scRNA-seq using a BD FACSAria Fusion sorter or BD FACSAria III (BD Biosciences).

SmartSeq2 scRNA-seq

ScRNA-seq of ex vivo sorted TNF-α⁺/IFN-γ⁺/IL-2⁺ or tetramer⁺ cells was performed using SmartSeq2 as described previously⁵. Reverse transcription (RT) and PCR amplification followed that protocol, except that an ISPCR primer with a biotin tag at the 5′ end was used and the number of cycles was increased to 25. Sequencing libraries were prepared using the Nextera XT Library Preparation Kit (Illumina), and sequencing was performed on an Illumina NextSeq platform with NextSeq Control Software v.4.

Deep sequencing of the TCR repertoire of T cell clones

TCR usage of T cell clones was sequenced as described⁶. Total RNA was extracted from 5 × 10⁵ cells of each clone using the RNeasy Micro Kit (QIAGEN), and 100–300 ng of RNA from each clone was used to generate full-length TCR repertoire libraries using the SMARTer Human TCR a/b Profiling Kit v2 (TAKARA) following the supplier’s instructions. After purification, libraries of all clones were pooled and sequenced using the MiSeq Reagent Kit v.3 (600 cycles) on a MiSeq (Illumina) with MiSeq Control Software v.2.6.2.1.

Cell preparation for bulk RNAseq and proteomics

Three cytotoxic and three non-cytotoxic S_866–880-specific CD4⁺ T cell clones were included in the analysis. T cells were washed thoroughly with PBS three times before RNA and protein extraction. Total RNA was extracted from 5 × 10⁵ cells using the RNeasy Micro Kit (QIAGEN) for bulk RNAseq. To extract proteins for proteomics, 1 × 10⁶ cells were lysed with 1% NP40 cell lysis buffer (Thermo Fisher Scientific), including 1X protease inhibitor cocktail (Sigma-Aldrich) and 1 mM phenylmethylsulfonyl fluoride (Thermo Fisher Scientific), on ice for 1 h, with vortexing every 10 min during the incubation. After cell lysis, the solution was centrifuged at 16,000 × g for 10 min at 4 °C. Supernatant containing the proteins was transferred into new tubes and snap frozen on dry ice, followed by storage at −80 °C for further proteomic analysis. Three replicates from each clone were included for both RNAseq and proteomics.

RNAseq library preparation

About 1 μg total RNA was sent to the Oxford Genomics Centre for total RNAseq analysis. Briefly, cDNA libraries were prepared using the NEB Ultra II Library Prep kit for Illumina (NEB) following the manufacturer's protocol and sequencing was performed on NovaSeq 6000 using a NovaSeq 6000 SP Reagent Kit v1.5 (300 cycles, Illumina).

Proteomics sample preparation

Thirty micrograms of protein were digested using S-trap micro spin columns following the manufacturer's protocol (ProtiFi). Briefly, SDS was added to the sample to a 2.5% final concentration. Samples were reduced with 10 mM DTT for 30 min and alkylated with 40 mM iodoacetamide for 30 min at room temperature in the dark. Samples were then acidified with phosphoric acid to a 1.2% final concentration, and proteins were precipitated by adding 90% methanol/100 mM TEAB buffer at a 1:7 ratio (sample:buffer). Samples were then transferred to the S-trap spin column, spun through at 4000 × g and washed four times with 150 μl of 90% methanol/100 mM TEAB (4000 × g). About 20 μl of 50 mM TEAB containing 1 μg of trypsin (1:30 trypsin:protein ratio) was added to the S-trap spin column and incubated overnight at 37 °C. Peptides were then sequentially eluted with 50 mM TEAB, 2% formic acid and 2% formic acid in a 50% acetonitrile solution. Peptides were dried using a centrifugal evaporator.

Liquid chromatography–mass spectrometry (LC-MS/MS)

Dried peptides were reconstituted in LC-MS/MS water containing 2% acetonitrile and 0.1% TFA. Thirty-three percent of the tryptic peptides were analysed by liquid chromatography tandem mass spectrometry (LC-MS/MS) using an Ultimate 3000 UHPLC (Thermo Fisher Scientific) connected to an Orbitrap Fusion Lumos Tribrid (Thermo Fisher Scientific). Briefly, peptides were loaded onto a trap column (Acclaim PepMap; 100 µm × 2 cm, nanoViper, Thermo Fisher) and separated on a 50-cm-long EasySpray column (ES903, Thermo Fisher) with a gradient of 2–35% acetonitrile in 5% dimethyl sulfoxide, 0.1% formic acid at a flow rate of 250 nl/min over 60 min. Eluted peptides were then analysed on an Orbitrap Fusion Lumos (instrument control software v3.3; Thermo Fisher). Data were acquired in data-independent mode as previously described by ref. ⁵⁸. Briefly, full scans (350–1650 m/z) were acquired in the Orbitrap at a resolution of 120,000 with a maximum injection time of 20 ms, followed by 40 DIA scan windows covering the full mass range from 361 to 1388 m/z with variable widths adjusted to the precursor density as described by ref. ⁵⁸. MS2 scans were acquired in the Orbitrap between 200 and 2000 m/z at a resolution of 30,000 with a normalised HCD collision energy of 30%.

Quantification and statistical analysis

SmartSeq2 scRNA-seq data processing

BCL files were converted to FASTQ format using bcl2fastq v2.20.0.422 (Illumina). FASTQ files were aligned to the human genome hg19 using STAR v2.6.1d. Reads were counted using featureCounts (subread v2.0.0). The resulting counts matrix was analysed in R v4.0.1 using Seurat v4.0.1.

Single-cell RNA sequencing analysis

Cells were filtered using the following criteria: minimum number of cells expressing a given gene = 3, minimum number of genes expressed per cell = 200 and maximum number of genes expressed per cell = 4000. Cells were excluded if they expressed more than 10% mitochondrial genes. Patient-specific cells were integrated using Harmony v.1.0 to remove batch effects. The AddModuleScore function (Seurat) was used to examine the expression of specific gene sets (Supplementary Table 3). A higher score indicates that the signature is more highly expressed in a particular cell than in the rest of the population. The FindMarkers function (Seurat) was used to evaluate differentially expressed genes (DEGs) between two conditions using the MAST (model-based analysis of single-cell transcriptomics) statistical test, with sequencing batch as a latent variable. miloR⁵⁹ was used to carry out differential abundance testing between samples from 1–3 months and 3–4 years.
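
The quality-control thresholds above can be illustrated with a small, self-contained sketch. The study applied these filters in Seurat (R); the Python function below is only a hypothetical equivalent operating on a plain dictionary of counts. It assumes that the mitochondrial criterion refers to the percentage of a cell's counts assigned to genes whose symbols start with "MT-", and that per-cell gene counts are computed before the gene-level filter is applied.

```python
def filter_cells(counts, min_cells=3, min_genes=200, max_genes=4000,
                 max_mito_pct=10.0):
    """Apply gene- and cell-level QC to counts = {cell_id: {gene: count}}.

    Returns a new dictionary containing only passing cells and genes.
    The 'MT-' prefix rule for mitochondrial genes is an assumption.
    """
    # Count, for every gene, the number of cells in which it is detected.
    cells_per_gene = {}
    for genes in counts.values():
        for gene, n in genes.items():
            if n > 0:
                cells_per_gene[gene] = cells_per_gene.get(gene, 0) + 1
    kept_genes = {g for g, n in cells_per_gene.items() if n >= min_cells}

    filtered = {}
    for cell, genes in counts.items():
        total = sum(genes.values())
        n_detected = sum(1 for n in genes.values() if n > 0)
        mito = sum(n for g, n in genes.items() if g.startswith("MT-"))
        mito_pct = 100.0 * mito / total if total else 0.0
        # Keep cells within the gene-count window and below the mito cut-off.
        if min_genes <= n_detected <= max_genes and mito_pct <= max_mito_pct:
            filtered[cell] = {g: n for g, n in genes.items() if g in kept_genes}
    return filtered
```

The default arguments mirror the thresholds stated in the text; smaller values can be passed when experimenting with toy data.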

TCR processing

TCR sequences were reconstructed from SmartSeq2 scRNA-seq FASTQ files using MiXCR v.3.0.13 to produce separate TRA and TRB output files for analysis. The output files were parsed into R using tcR v.2.3.2. Bulk TCR sequencing BCL files were converted to FASTQ files using bcl2fastq. TCRs were extracted using MiXCR and parsed into R as described earlier. TCRs were filtered to retain 1α1β or 2α1β for each clone.

TCR repertoire analysis

TCRs were filtered to retain 1/2α or 1β; paired αβ cells consist of 1α1β or 2α1β. Clonotypes were defined as α (CDR3α amino acid + TRAV), β (CDR3β amino acid + TRBV) or paired αβ (CDR3α amino acid + TRAV + CDR3β amino acid + TRBV). Public clonotypes were defined as clonotypes shared between two or more patients. Circos plots were plotted using circlize v.0.4.12, showing paired TRAV-TRBV. All other plots were generated using ggplot2 and ggpubr v.0.4.0.
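
The clonotype and public-clonotype definitions above can be sketched in a few lines. This is an illustration only: the function names, the underscore-joined clonotype key and the placeholder sequences in the usage note are hypothetical and are not taken from the study data.

```python
from collections import defaultdict

def alpha_clonotype(cdr3a, trav):
    """α clonotype key: CDR3α amino acid sequence plus TRAV gene."""
    return f"{cdr3a}_{trav}"

def paired_clonotype(cdr3a, trav, cdr3b, trbv):
    """Paired αβ clonotype key: CDR3α + TRAV + CDR3β + TRBV."""
    return f"{cdr3a}_{trav}_{cdr3b}_{trbv}"

def public_clonotypes(cells, min_patients=2):
    """Return clonotypes found in at least `min_patients` different patients.

    cells: iterable of (patient_id, clonotype_key) pairs, one per cell.
    """
    patients_by_clonotype = defaultdict(set)
    for patient, clonotype in cells:
        patients_by_clonotype[clonotype].add(patient)
    return {c for c, p in patients_by_clonotype.items()
            if len(p) >= min_patients}
```

For instance, a key such as `alpha_clonotype("CAAAF", "TRAV1")` (a placeholder sequence) seen in patients P1 and P2 would be returned as public, whereas a clonotype expanded in P1 alone would not, however many cells carry it.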

RNAseq analysis

FASTQ files were generated from BCL files using bcl2fastq (v2.20.0) and were aligned to the hg19 genome using the bwa-MEM algorithm (v0.7.15-r1140). Aligned reads were counted and count tables generated using featureCounts (v2.0.0) from the subread-2.0.0 package, using the hg19 annotation GTF file from ENSEMBL. Downstream analysis of the RNA counts was carried out in R (v4.2.0) using DESeq2⁶⁰ (v1.38.3). An exploratory analysis of the data was carried out by principal component analysis (PCA) after variance-stabilising transformation.

Proteomic analysis

Raw files were analysed in DIA-NN software (v8.0) as previously described⁶¹. Default settings were used as recommended. Briefly, for the library-free approach, a library was created from the human UniProt SwissProt database (downloaded 202102, containing 20,386 sequences) using deep learning. Trypsin was selected as the enzyme (1 missed cleavage), with carbamidomethylation of C as a fixed modification, oxidation of methionine as a variable modification and N-term M excision. Identification and quantification of raw data were performed against the in silico library, applying 1% FDR at the precursor level and match between runs (MBR). The DIA-NN ‘report.proteingroup’ matrix was further analysed using Perseus. Protein intensities were log2-transformed and subsequently normalised by subtracting the median intensity by column. Biological replicates were grouped, and data were prefiltered, allowing three valid numbers in at least one biological group. Missing values were then imputed following a normal distribution. Imputed and normalised data were imported into R (v4.2.0). The mass spectrometry raw data included in this paper have been deposited to the ProteomeXchange Consortium via the PRIDE partner repository⁶² with the dataset identifier PXD042469.
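
The transformation and prefiltering steps described above can be sketched as follows. These steps were performed in Perseus in the study; the Python functions below are a hypothetical re-expression for illustration, with assumed names and data layout. Imputation is deliberately left out because the parameters of the normal distribution used are not given in the text.

```python
import math
from statistics import median

def log2_median_normalise(intensities):
    """Log2-transform and median-centre each sample (column).

    intensities: {sample: {protein: raw intensity, or None/0 if missing}}.
    Missing values are returned as None.
    """
    out = {}
    for sample, proteins in intensities.items():
        logged = {p: (math.log2(v) if v else None) for p, v in proteins.items()}
        col_median = median(v for v in logged.values() if v is not None)
        out[sample] = {p: (v - col_median if v is not None else None)
                       for p, v in logged.items()}
    return out

def enough_valid(values_by_group, min_valid=3):
    """True if at least one group has >= min_valid non-missing values.

    values_by_group: {group: [values for one protein across replicates]}.
    """
    return any(sum(v is not None for v in vals) >= min_valid
               for vals in values_by_group.values())
```

A protein quantified in all three replicates of one group but in none of the other would pass `enough_valid` and be carried forward to imputation.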

Statistical analysis

EC₅₀ calculations were performed with GraphPad Prism 9; all other statistics were analysed with IBM SPSS Statistics 27. Figures were made with ggplot2 in R (v4.2.0). The Chi-squared test of independence was used to compare proportions between two groups. Normality of the data distribution was examined with the Kolmogorov–Smirnov test. The Mann–Whitney U-test was used to compare two groups, and the Kruskal–Wallis one-way ANOVA was used to compare three groups. Correlation analysis was performed using Spearman’s rank correlation coefficient. The EC₅₀ of T cell clones was calculated by nonlinear regression with variable slope (four parameters) in a dose–response–stimulation model. Statistical significance was set at p < 0.05, and all tests were two-tailed. For statistical analyses conducted in R, the MAST test was used to find DEGs between two conditions, adjusting for variation between batches, and the results were represented by volcano plots and violin plots. P values were adjusted for multiple testing using the Benjamini–Hochberg method.
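
For readers unfamiliar with the Benjamini–Hochberg adjustment mentioned above, a minimal standard-library sketch is given below. It is not the code used in the study (where the adjustment was performed within the R workflow); the function name is hypothetical.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini–Hochberg adjusted p values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])  # indices by ascending p
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p value down so that adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted
```

Each raw p value is scaled by the number of tests divided by its rank, and the running minimum ensures that a smaller raw p value never receives a larger adjusted value than a bigger one.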

TCR repertoire analysis of public datasets

Processed data were downloaded from GEO for GSE152522 (Meckiff et al.) and GSE162086 (Bacher et al.). As the Meckiff et al. data does not provide V gene usage, for Meckiff et al. cells were annotated as having a dominant TCRα if their CDR3 alpha sequence matched CIVR[A-Z]ANQAGTALIF, CIVRV[A-Z][Y/W]NFNKFYF, CVVN[A-Z][A-Z]SSNTGKLIF, CVNN[A-Z]GSSASKIIF or CA[A-Z][A-Z]NYGGSQGNLIF, allowing for any amino acid where “[A-Z]” is noted or allowing for either a 'Y' or 'W' at [Y/W]. For Bacher et al., as they provided both the CDR3 and V gene, we performed as above, but included the V gene in the matching, such that the alpha clonotype (CDR3 alpha + TRAV gene) had to match one of the following: CIVR[A-Z]ANQAGTALIF_TRAV26-1, CIVRV[A-Z][Y/W]NFNKFYF_TRAV26-1, CVVN[A-Z][A-Z]SSNTGKLIF_TRAV12-1, CVNN[A-Z]GSSASKIIF_TRAV12-1, or CA[A-Z][A-Z]NYGGSQGNLIF_TRAV35. All other cells were classed as having a non-dominant clonotype. The definition of disease severity was as reported by each manuscript.
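
The motif matching described above maps directly onto regular expressions. The sketch below is illustrative and is not the code used in the study: the function and variable names are hypothetical, "[Y/W]" is written as the regular-expression class "[YW]", and V gene names are assumed to be given without allele suffixes.

```python
import re

# Dominant TCRα motifs and their TRAV genes, as listed in the text.
DOMINANT_ALPHA = {
    "CIVR[A-Z]ANQAGTALIF": "TRAV26-1",
    "CIVRV[A-Z][YW]NFNKFYF": "TRAV26-1",
    "CVVN[A-Z][A-Z]SSNTGKLIF": "TRAV12-1",
    "CVNN[A-Z]GSSASKIIF": "TRAV12-1",
    "CA[A-Z][A-Z]NYGGSQGNLIF": "TRAV35",
}

def is_dominant_alpha(cdr3a, trav=None):
    """True if a CDR3α (and, when supplied, its TRAV gene) matches a motif.

    trav=None gives CDR3-only matching, as for data without V gene usage;
    supplying trav additionally requires the motif's V gene to match.
    """
    for motif, motif_trav in DOMINANT_ALPHA.items():
        if re.fullmatch(motif, cdr3a) and (trav is None or trav == motif_trav):
            return True
    return False
```

Using `re.fullmatch` rather than `re.search` ensures that the whole CDR3 must fit the motif, so longer sequences that merely contain a motif are not counted as dominant.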

For the prediction of HLA type, we used arcasHLA. For Bacher et al., paired-end gene expression FASTQ files for each sample were downloaded from GSE162086 and used as input for arcasHLA. For Meckiff et al., because eight samples had been multiplexed into each reaction, gene expression FASTQ files were downloaded from GSE152522, and we used vireoSNP to carry out genetic demultiplexing of the individual samples. Briefly, cellsnp-lite was used to call variants for each cell from the scRNA-seq data, using the provided list of 7.4 M human variants from the 1000 Genomes Project with minor allele frequency (MAF) >0.05. Once the genotypes were called, vireo demultiplexed each sequencing run into individual FASTQ files, which were then used as input for arcasHLA.

The proportion of cells with a dominant and non-dominant TCR alpha clonotype between different disease severities was examined using the Chi-squared (χ²) test of independence. Results are reported as: χ² (degrees of freedom, N = sample size) = χ² value, p = p value.
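
The test and reporting format above can be sketched with the standard library alone. This is a hypothetical illustration, not the code used in the study: the function names are assumptions, no continuity correction is applied, and every expected count is assumed to be non-zero.

```python
import math

def chi2_sf(x, df):
    """Upper-tail probability of the chi-squared distribution (closed form)."""
    t = x / 2.0
    if df % 2 == 0:  # even degrees of freedom: finite Poisson-type sum
        return math.exp(-t) * sum(t ** k / math.factorial(k)
                                  for k in range(df // 2))
    m = (df - 1) // 2  # odd degrees of freedom: erfc term plus correction sum
    tail = sum(t ** (k - 0.5) / math.gamma(k + 0.5) for k in range(1, m + 1))
    return math.erfc(math.sqrt(t)) + math.exp(-t) * tail

def chi2_independence(table):
    """Pearson chi-squared test on a table of observed counts (list of rows).

    Returns (statistic, degrees of freedom, N, p value).
    """
    row_tot = [sum(row) for row in table]
    col_tot = [sum(col) for col in zip(*table)]
    n = sum(row_tot)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_tot[i] * col_tot[j] / n
            stat += (observed - expected) ** 2 / expected
    df = (len(row_tot) - 1) * (len(col_tot) - 1)
    return stat, df, n, chi2_sf(stat, df)

def report(table):
    """Format the result in the style used in the text."""
    stat, df, n, p = chi2_independence(table)
    return f"χ² ({df}, N = {n}) = {stat:.2f}, p = {p:.3g}"
```

Here the rows of `table` would be the dominant and non-dominant clonotype groups and the columns the disease severities, with each entry being a cell count.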

Reporting summary

Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.

Data availability

The single-cell RNAseq data generated in this study have been deposited in the ArrayExpress database under accession code E-MTAB-14933. The mass spectrometry proteomics data have been deposited to the ProteomeXchange Consortium via the PRIDE partner repository with the dataset identifier PXD042469. Source data for all other figures are provided in a Source Data file. Source data are provided with this paper.

References

Rydyznski Moderbacher, C. et al. Antigen-specific adaptive immunity to SARS-CoV-2 in acute COVID-19 and associations with age and disease severity. Cell 183, 996–1012 e1019 (2020).
Zhou, R. et al. Acute SARS-CoV-2 infection impairs dendritic cell and T cell responses. Immunity 53, 864–877 e865 (2020).
Bergamaschi, L. et al. Longitudinal analysis reveals that delayed bystander CD8+ T cell activation and early immune pathology distinguish severe COVID-19 from mild disease. Immunity 54, 1257–1275 e1258 (2021).
Tan, A. T. et al. Early induction of functional SARS-CoV-2-specific T cells associates with rapid viral clearance and mild disease in COVID-19 patients. Cell Rep. 34, 108728 (2021).
Notarbartolo, S. et al. Integrated longitudinal immunophenotypic, transcriptional and repertoire analyses delineate immune responses in COVID-19 patients. Sci. Immunol. 6, eabg5021 (2021).
Peng, Y. et al. An immunodominant NP105-113-B*07:02 cytotoxic T cell response controls viral replication and is associated with less severe COVID-19 disease. Nat. Immunol. 23, 50–61 (2022).
Arunachalam, P. S. et al. Systems biological assessment of immunity to mild versus severe COVID-19 infection in humans. Science 369, 1210–1220 (2020).
Hadjadj, J. et al. Impaired type I interferon activity and inflammatory responses in severe COVID-19 patients. Science 369, 718–724 (2020).
Laing, A. G. et al. A dynamic COVID-19 immune signature includes associations with poor prognosis. Nat. Med. 26, 1623–1635 (2020).
Mathew, D. et al. Deep immune profiling of COVID-19 patients reveals distinct immunotypes with therapeutic implications. Science 369, eabc8511 (2020).
Low, J. S. et al. Clonal analysis of immunodominance and cross-reactivity of the CD4 T cell response to SARS-CoV-2. Science 372, 1336–1341 (2021).
Painter, M. M. et al. Rapid induction of antigen-specific CD4(+) T cells is associated with coordinated humoral and cellular immunity to SARS-CoV-2 mRNA vaccination. Immunity 54, 2133–2142 e2133 (2021).
Peng, Y. et al. Broad and strong memory CD4(+) and CD8(+) T cells induced by SARS-CoV-2 in UK convalescent individuals following COVID-19. Nat. Immunol. 21, 1336–1345 (2020).
Rowntree, L. C. et al. SARS-CoV-2-specific T cell memory with common TCRαβ motifs is established in unvaccinated children who seroconvert after infection. Immunity 55, 1299–1315.e1294 (2022).
Wragg, K. M. et al. Establishment and recall of SARS-CoV-2 spike epitope-specific CD4⁺ T cell memory. Nat. Immunol. 23, 768–780 (2022).
Mudd, P. A. et al. SARS-CoV-2 mRNA vaccination elicits a robust and persistent T follicular helper cell response in humans. Cell 185, 603–613.e15 (2022).
Dan, J. M. et al. Immunological memory to SARS-CoV-2 assessed for up to 8 months after infection. Science 371, 587 (2021).
Nelson, R. W. et al. SARS-CoV-2 epitope-specific CD4 memory T cell responses across COVID-19 disease severity and antibody durability. Sci. Immunol. 7, eabl9464 (2022).
Rodda, L. B. et al. Imprinted SARS-CoV-2-specific memory lymphocytes define hybrid immunity. Cell 185, 1588 (2022).
Sahin, U. et al. COVID-19 vaccine BNT162b1 elicits human antibody and TH1 T cell responses. Nature 586, 594 (2020).
Paolini, A. et al. Patients recovering from severe COVID-19 develop a polyfunctional antigen-specific CD4+ T cell response. Int. J. Mol. Sci. 23, 8004 (2022).
Guerrera, G. et al. BNT162b2 vaccination induces durable SARS-CoV-2-specific T cells with a stem cell memory phenotype. Sci. Immunol. 6, eabl5344 (2021).
Wilkinson, T. M. et al. Preexisting influenza-specific CD4 T cells correlate with disease protection against influenza challenge in humans. Nat. Med. 18, 274–280 (2012).
Weiskopf, D. et al. Dengue virus infection elicits highly polarized CX3CR1+ cytotoxic CD4+ T cells associated with protective immunity. Proc. Natl Acad. Sci. USA 112, E4256–E4263 (2015).
Long, H. M. et al. Cytotoxic CD4+ T cell responses to EBV contrast with CD8 responses in breadth of lytic cycle antigen choice and in lytic cycle recognition. J. Immunol. 187, 92–101 (2011).
Pachnio, A. et al. Cytomegalovirus infection leads to development of high frequencies of cytotoxic virus-specific CD4+ T cells targeted to vascular endothelium. PLoS Pathog. 12, e1005832 (2016).
Appay, V. et al. Characterization of CD4(+) CTLs ex vivo. J. Immunol. 168, 5954–5958 (2002).
Hashimoto, K. et al. Single-cell transcriptomics reveals expansion of cytotoxic CD4 T cells in supercentenarians. Proc. Natl Acad. Sci. USA 116, 24242–24251 (2019).
Meckiff, B. J. et al. Imbalance of regulatory and cytotoxic SARS-CoV-2-reactive CD4(+) T cells in COVID-19. Cell 183, 1340–1353.e1316 (2020).
Gray-Gaillard, S. L. et al. SARS-CoV-2 inflammation durably imprints memory CD4 T cells. Sci. Immunol. 9, eadj8526 (2024).
Kaneko, N. et al. Temporal changes in T cell subsets and expansion of cytotoxic CD4+ T cells in the lungs in severe COVID-19. Clin. Immunol. 237, 108991 (2022).
Wang, X. M. et al. Global transcriptomic characterization of T cells in individuals with chronic HIV-1 infection. Cell Discov. 8, 29 (2022).
Hasegawa, T. et al. Cytotoxic CD4(+) T cells eliminate senescent cells by targeting cytomegalovirus antigen. Cell 186, 1417–1431.e1420 (2023).
Karsten, H. et al. High-resolution analysis of individual spike peptide-specific CD4(+) T-cell responses in vaccine recipients and COVID-19 patients. Clin. Transl. Immunol. 11, e1410 (2022).
Tarke, A. et al. SARS-CoV-2 vaccination induces immunological T cell memory able to cross-recognize variants from Alpha to Omicron. Cell 185, 847–859.e811 (2022).
Minervina, A. A. et al. Longitudinal high-throughput TCR repertoire profiling reveals the dynamics of T-cell memory formation after mild COVID-19 infection. Elife 10, e63502 (2021).
Bacher, P. et al. Low-avidity CD4(+) T cell responses to SARS-CoV-2 in unexposed individuals and humans with severe COVID-19. Immunity 53, 1258 (2020).
Dash, P. et al. Quantifiable predictive features define epitope-specific T cell receptor repertoires. Nature 547, 89–93 (2017).
Raybould, M. I. J. et al. The observed T cell receptor Space database enables paired-chain repertoire mining, coherence analysis and language modelling. Cell Rep. 43, 114704 (2024).
Spindler, M. J. et al. Massively parallel interrogation and mining of natively paired human TCRαβ repertoires. Nat. Biotechnol. 38, 609 (2020).
Pothast, C. R. et al. SARS-CoV-2-specific CD4(+) and CD8(+) T cell responses can originate from cross-reactive CMV-specific T cells. Elife 11, e82050 (2022).
Li, H. J., Ye, C. T., Ji, G. L. & Han, J. H. Determinants of public T cell responses. Cell Res. 22, 33–42 (2012).
Mateus, J. et al. Selective and cross-reactive SARS-CoV-2 T cell epitopes in unexposed humans. Science 370, 89 (2020).
Grifoni, A. et al. Targets of T cell responses to SARS-CoV-2 coronavirus in humans with COVID-19 disease and unexposed individuals. Cell 181, 1489 (2020).
Braun, J. et al. SARS-CoV-2-reactive T cells in healthy donors and patients with COVID-19. Nature 587, 270 (2020).
Swadling, L. et al. Pre-existing polymerase-specific T cells expand in abortive seronegative SARS-CoV-2. Nature 601, 110–117 (2022).
Bartolo, L. et al. SARS-CoV-2-specific T cells in unexposed adults display broad trafficking potential and cross-react with commensal antigens. Sci. Immunol. 7, eabn3127 (2022).
Cao, X. F. et al. Granzyme B and perforin are important for regulatory T cell-mediated suppression of tumor clearance. Immunity 27, 635–646 (2007).
Salti, S. M. et al. Granzyme B regulates antiviral CD8 T cell responses. J. Immunol. 187, 6301–6309 (2011).
Kaneko, N. et al. Temporal changes in T cell subsets and expansion of cytotoxic CD4+ T cells in the lungs in severe COVID-19. Clin. Immunol. 237, 108991 (2022).
Baird, S. et al. A unique cytotoxic CD4 T cell-signature defines critical COVID-19. Clin. Transl. Immunol. 12, e1463 (2023).
COvid-19 Multi-omics Blood ATlas (COMBAT) Consortium. A blood atlas of COVID-19 defines hallmarks of disease severity and specificity. Cell 185, 916–938.e958 (2022).
de Silva, T. I. et al. The impact of viral mutations on recognition by SARS-CoV-2 specific T cells. iScience 24, 103353 (2021).
Abd Hamid, M. et al. Self-maintaining CD103(+) cancer-specific T cells are highly energetic with rapid cytotoxic and effector responses. Cancer Immunol. Res. 8, 203–216 (2020).
Wing, P. A. C. et al. Hypoxic and pharmacological activation of HIF inhibits SARS-CoV-2 infection of lung epithelial cells. Cell Rep. 35, 109020 (2021).
Alanio, C., Lemaitre, F., Law, H. K. W., Hasan, M. & Albert, M. L. Enumeration of human antigen-specific naive CD8+ T cells reveals conserved precursor frequencies. Blood 115, 3718–3725 (2010).
Heim, K. et al. Attenuated effector T cells are linked to control of chronic HBV infection. Nat. Immunol. 25, 1650–1662 (2024).
Muntel, J. et al. Comparison of protein quantification in a complex background by DIA and TMT workflows with fixed instrument time. J. Proteome Res. 18, 1340–1351 (2019).
Dann, E., Henderson, N. C., Teichmann, S. A., Morgan, M. D. & Marioni, J. C. Differential abundance testing on single-cell data using k-nearest neighbor graphs. Nat. Biotechnol. 40, 245–253 (2022).
Love, M. I., Huber, W. & Anders, S. Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2. Genome Biol. 15, 550 (2014).
Demichev, V., Messner, C. B., Vernardis, S. I., Lilley, K. S. & Ralser, M. DIA-NN: neural networks and interference correction enable deep proteome coverage in high throughput. Nat. Methods 17, 41–44 (2020).
Perez-Riverol, Y. et al. The PRIDE database and related tools and resources in 2019: improving support for quantification data. Nucleic Acids Res. 47, D442–D450 (2019).

Acknowledgements

We are grateful to all the participants for donating their samples and data for these analyses, and the research teams involved in the consenting, recruitment, and sampling of these participants. We thank K. Clark and S.-A. Clark from the WIMM Flow Cytometry facility for their help with cell sorting and the WIMM Sequencing facility for sequencing. This work is supported by the Chinese Academy of Medical Sciences (CAMS) Innovation Fund for Medical Sciences (CIFMS), China (grant number: 2024-I2M-2-001-1) (T.D., Y.P., X.Y., G.L., E.A., D.D., J.C.K., G.O., G.R.S., B.K. and R.F.); UK Medical Research Council (MRC) grant MR/Y015347/1 (T.D.), MR/R022011/1 (G.O.) and IMMPROVE - MR/Y004450/1 (T.D.); Z.Y., W.W., G.L. and C.L. were supported by China Scholarship Council. The study is also funded by the NIHR Oxford Biomedical Research Centre (G.O. and J.C.K.), Wellcome Trust Investigator Award (204969/Z/16/Z) (J.C.K.), Wellcome Trust Grants (090532/Z/09/Z and 203141/Z/16/Z) to core facilities Wellcome Centre for Human Genetics, Senior Investigator Award (G.O.) and Clinical Research Network (G.O.); Schmidt Futures (G.R.S.). The McKeating laboratory is funded by a Wellcome Investigator Award (IA) 200838/Z/16/Z. This work uses data provided by patients and collected by the NHS as part of their care and support #DataSavesLives. NIH Tetramer Facility provided S_166–180(CTFEYVSQPFLMDLE)-DPB1*04:01 monomer. Figure 1a was created with BioRender.com.

Author information

These authors contributed equally: Guihai Liu, Elie Antoun.
These authors jointly supervised this work: Alexander J. Mentzer, Julian C. Knight, Yanchun Peng, Tao Dong.

Authors and Affiliations

Chinese Academy of Medical Science (CAMS) Oxford Institute (COI), University of Oxford, Oxford, UK
Guihai Liu, Elie Antoun, Xuan Yao, Zixi Yin, Danning Dong, Wenbo Wang, Peter A. C. Wing, Wanwisa Dejnirattisa, Piyada Supasa, Chang Liu, Iolanda Vendrell, Jane A. McKeating, Juthathip Mongkolsapaya, Gavin R. Screaton, Benedikt M. Kessler, Roman Fisher, Graham Ogg, Alexander J. Mentzer, Julian C. Knight, Yanchun Peng & Tao Dong
Centre for Translational Immunology, Nuffield Department of Medicine, University of Oxford, Oxford, UK
Guihai Liu, Elie Antoun, Xuan Yao, Zixi Yin, Peter A. C. Wing, Yanchun Peng & Tao Dong
Centre for Human Genetics, Nuffield Department of Medicine, University of Oxford, Oxford, UK
Anastasia Fries, Wanwisa Dejnirattisa, Piyada Supasa, Chang Liu, Juthathip Mongkolsapaya, Gavin R. Screaton, Alexander J. Mentzer & Julian C. Knight
Sequencing Facility, MRC Weatherall Institute of Molecular Medicine, University of Oxford, Oxford, UK
Timothy Rostron
Flow Cytometry Facility, MRC Weatherall Institute of Molecular Medicine, University of Oxford, Oxford, UK
Craig Waugh, Kevin Clark & Paul Sopp
ProImmune Limited, Oxford, UK
Jeremy W. Fry
Target Discovery Institute, Centre for Medicines Discovery, Nuffield Department of Medicine, Oxford University, Oxford, UK
Iolanda Vendrell, Jane A. McKeating, Benedikt M. Kessler & Roman Fisher
Dengue Hemorrhagic Fever Research Unit, Office for Research and Development, Faculty of Medicine, Siriraj Hospital, Mahidol University, Bangkok, Thailand
Juthathip Mongkolsapaya
MRC Translational Immune Discovery Unit, MRC Weatherall Institute of Molecular Medicine, Radcliffe Department of Medicine, University of Oxford, Oxford, UK
Graham Ogg, Yanchun Peng & Tao Dong

Contributions

T.D. conceptualised the project; T.D. and Y.P. designed and supervised T cell experiments; J.C.K. supervised bioinformatic analysis; A.J.M. supervised sample collection; G.L., Y.P., X.Y., Z.Y., D.D. and W.W. performed all T cell experiments and data analysis; E.A. performed single-cell data analysis and bulk RNA-seq analysis; P.A.C.W. and J.A.M. assisted with virus infection; T.R. performed HLA typing and next-generation sequencing; I.V., B.K. and R.F. performed proteomics experiments and initial analysis; J.W.F. provided MHC class II tetramers; J.C.K., A.J.M. and A.F. established clinical cohorts and collected clinical samples and data; C.W., K.C., P. Sopp, W.D., P. Supasa, C.L., J.M. and G.R.S. provided technical assistance and critical reagents; T.D., J.C.K. and Y.P. supervised data analysis; E.A., T.D., Y.P. and G.L. wrote the original draft; J.C.K., J.A.M. and G.O. reviewed and edited the manuscript and figures.

Corresponding author

Correspondence to Tao Dong.

Ethics declarations

Competing interests

The authors declare no competing interests.

Peer review

Peer review information

Nature Communications thanks Katherine Kedzierska and the other, anonymous, reviewer(s) for their contribution to the peer review of this work. A peer review file is available.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary information

Supplementary Information

Description of Additional Supplementary Files

Supplementary Data 1–2

Reporting Summary

Transparent Peer Review file

Source data

Source Data

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

About this article

Cite this article

Liu, G., Antoun, E., Fries, A. et al. Long-persisting SARS-CoV-2 spike-specific CD4⁺ T cells associated with mild disease and increased cytotoxicity post COVID-19. Nat Commun 16, 8743 (2025). https://doi.org/10.1038/s41467-025-63711-9

Received: 28 April 2025
Accepted: 21 August 2025
Published: 01 October 2025
Version of record: 01 October 2025
DOI: https://doi.org/10.1038/s41467-025-63711-9