Abstract
Despite recent advances in T cell receptor (TCR) engineering, designing functional TCRs against arbitrary targets remains challenging due to complex rules governing cross-reactivity and limited paired data. Here we present TCR-TRANSLATE, a sequence-to-sequence framework that adapts low-resource machine translation techniques to generate antigen-specific TCR sequences against unseen epitopes. By evaluating 12 model variants of the BART and T5 model architectures, we identified key factors affecting performance and utility, revealing discordances between these objectives. Our flagship model, TCRT5, outperforms existing approaches on computational benchmarks, prioritizing functionally relevant sequences at higher ranks. Most significantly, we experimentally validated a computationally designed TCR against Wilms’ tumour antigen, a therapeutically relevant target in leukaemia, excluded from our training and validation sets. Although the identified TCR shows cross-reactivity with pathogen-derived peptides, highlighting limitations in specificity, our work represents the successful computational design of a functional TCR construct against a non-viral epitope from the target sequence alone. Our findings establish a foundation for computational TCR design and reveal current limitations in data availability and methodology, providing a framework for accelerating personalized immunotherapy by reducing the search space for novel targets.
Main
T cells are a subset of immune cells that use stochastically generated, highly specific pattern recognition receptors called T cell receptors (TCRs) to identify cells presenting ‘non-self’ peptides at the cell surface. This immune surveillance relies on a diverse repertoire of TCRs recognizing cognate peptides presented on major histocompatibility complexes (MHCs), creating a network of TCR:peptide-MHC (pMHC) specificities that collectively mediate self–non-self discrimination with single amino acid precision1. T cell-based therapies including CAR-T, engineered TCRs and TCR bispecifics have achieved durable treatment responses in chronic infections2, autoimmune diseases3 and even solid tumours4. A critical bottleneck in their development is the identification of specific and self-tolerant TCRs, which relies on laborious and low-yield in vitro TCR discovery platforms5. In silico methods to decipher the mapping between TCRs and pMHCs stand to transform precision immunotherapies by operationalizing a potent mechanism of functionally deleting cells at subprotein resolution.
Experimentally, some individual TCRs have been shown to recognize up to one million unique peptides6, and vice versa7,8. However, learning this many-to-many mapping is severely confounded by sparse and biased paired data, with most antigen-specific TCRs being identified in the context of only a few, well-studied diseases9. Current approaches in modelling antigen specificity frame the problem as a binary classification task10 with limited utility in TCR design11. Earlier generative models focused on TCR redesign given known antigen-specific repertoires12 or unconditional TCR generation that statistically approximates natural repertoires13,14 in the aggregate. Recently, conditional generation methods have shown promise, using convolutional neural network–long short-term memory (CNN–LSTM) architectures to generate TCRs for known antigens15. Although we introduced an autoregressive transformer architecture for this problem16 alongside others17 and related research has since emerged18, a deep understanding of the real-world utility of TCR generation remains elusive.
Following our initial proof of concept16, we adopted the formulation of the TCR reactivity problem as a sparse sequence-to-sequence (seq2seq) task (Fig. 1a), introducing TCRBART and TCRT5, two encoder–decoder transformer models based on the BART19 and T5 (ref. 20) architectures (Fig. 1b and Supplementary Note A.1). Here, to directly address the issues of sparse parallel data comprising source–target sequence pairs, we investigate a handful of techniques from low-resourced machine translation21 for this task. One particularly effective approach leverages the reflexive nature of sequence co-dependencies between source–target pairs by jointly learning a bidirectional mapping22,23, sharing representations and aligning latent spaces across both sequences24. However, to the best of our knowledge, these approaches have not been applied to the functional protein design space.
a, Casting antigen-specific TCR design as a seq2seq task. We make use of an encoder–decoder abstraction to process pMHC sequence information and autoregressively sample target-conditioned CDR3β sequences. b, Specific architecture of TCRBART and TCRT5. Transformer architecture juxtaposing BART and T5 encoder and decoder layers, highlighting key operations to the residual stream, inspired by Vaswani et al. (2017)60. c, Dataset creation. Given the severe data sparsity, the top-20 pMHCs from IEDB, VDJdb and McPAS (by known TCRs) were withheld as validation, whereas the remainder were used for training with allele-imputed pMHCs from MIRA. d, In silico benchmark performance of TCRT5 and publicly available methods. Overview of benchmark dataset creation (n = 14) and performance radar plot are shown with the averaged metrics across pMHCs. soNNia model’s unconditional metrics are averaged over 1,000 simulation runs. e, Illustrative diagram of the in vitro validation pipeline of the generated CDR3β sequences using NFAT-associated luciferase expression for T cell activation-induced luminescence. Panels a, c, d and e created with BioRender.com.
In this work, we systematically trained 12 TCRBART and TCRT5 variants using a handful of these low-resource techniques and assessed the fidelity of their generations. We constructed a validation dataset comprising the top-20 pMHCs with the most known cognate TCRs (Fig. 1c), forfeiting their inclusion in the training data to maximize exact sequence matches during validation. Finally, we benchmark our flagship TCRT5 model on a robust test set against existing methods (Fig. 1d) and validate our model in vitro, sampling a complementarity determining region (CDR3β) sequence not seen during training that shows functional activity against a challenging target (Fig. 1e). Our results demonstrate the potential of seq2seq modelling for generating antigen-specific TCRs and the current limitations imposed by severely constrained data (Supplementary Note A.2).
Results
For our experiments, we considered three different training schemes for TCRBART and TCRT5 each, stratified by pretraining status for a total of 12 model variants. All models were evaluated on CDR3β sequence generation (Fig. 2a). The baseline models (TCRBART-0 and TCRT5-0) were trained on pMHC→TCR generation without pretraining. Bidirectional models (TCRBART-0 (B) and TCRT5-0 (B)) were trained on both directions (pMHC↔TCR), and multitask models additionally included a masked language modelling term for both TCR and pMHC reconstructions (TCRBART-0 (M) and TCRT5-0 (M)). Similarly, six models were pretrained on reconstruction and then fine-tuned using the same learning tasks, yielding TCRBART-FT, TCRBART-FT (B), TCRBART-FT (M), TCRT5-FT, TCRT5-FT (B) and TCRT5-FT (M). All sequence sampling was done using the beam search decoding algorithm (Supplementary Note A.3), a heuristic algorithm that generates high-probability sequences and has been found effective for this task16,17,18.
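As an illustration of the common inference scheme, beam search sampling reduces to a single call to the Hugging Face generate API; the sketch below assumes a trained TCRT5 checkpoint and its amino-acid-level tokenizer are already loaded, and the helper name is ours.

    # Minimal sketch (not the released inference script): sampling candidate CDR3β
    # sequences for one pMHC with beam search. `model` and `tokenizer` are assumed
    # to be a trained TCRT5 checkpoint and its character-level tokenizer.
    import torch

    def sample_cdr3b(model, tokenizer, epitope, pseudoseq, k=100, beams=100, max_len=25):
        # Inputs follow the [PMHC]EPITOPE[SEP]PSEUDOSEQUENCE convention described in Methods.
        source = f"[PMHC]{epitope}[SEP]{pseudoseq}"
        inputs = tokenizer(source, return_tensors="pt")
        with torch.no_grad():
            outputs = model.generate(
                **inputs,
                num_beams=beams,            # beam width
                num_return_sequences=k,     # top-k beams kept as candidate CDR3βs
                max_length=max_len,
                early_stopping=True,
            )
        return [tokenizer.decode(o, skip_special_tokens=True) for o in outputs]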
a, Diagram outlining pretraining, seq2seq training/fine-tuning and their common generation scheme (inference). b, Radar plot showing the performance of TCRBART-0 and TCRT5-0 and their unconditional variants TCRBART-Unconditional/TCRT5-Unconditional against the averaged metrics over 1,000 simulations of the statistical soNNia generative model14. c, TCR-TRANSLATE accuracy metrics. Box and whisker plots show the median, quartile and individual contributions of each of the validation pMHCs (n = 20) for Char-BLEU, F1@100 and native sequence recovery. Whiskers extend to 1.5× the interquartile range. d, Fraction of pMHC F1@100 scores that remain equivalent to or greater than the baseline models (TCRBART-0 and TCRT5-0). Red line marks the 50% point, indicating no gain in performance compared with the baseline. e, Model calibration as measured by mAP across pMHCs calculated using sequence-likelihood-based rank per model. f, Bar plot of global diversity calculated as the total number of unique sequences across pMHCs (20 × 100 = 2,000 (maximum)). g, Scatter plot summarizing the model performance on accuracy and diversity metrics. Accuracy is taken as the mean F1@100 score and diversity is shown in terms of the total number of unique sequences generated (size of each data point) as well as the mean pairwise Jaccard dissimilarity scores across pMHCs (x axis). Panel a created with BioRender.com.
Conditional generation outperforms unconditional generation
Before comparing our various training techniques, we first sought to run hyperparameter optimization (Supplementary Note A.4) and then calibrate our metrics by benchmarking conditional models P(TCR∣pMHC) against unconditional generation P(TCR). To determine the advantage of input conditioning, we evaluated our baseline conditional models TCRBART-0 and TCRT5-0 on a subset of our metrics (Supplementary Note A.5). As our unconditional baseline, we used soNNia’s ‘Ppost’14, a generative model that extends the immune-specific process of generating T and B cell receptors known as V(D)J recombination to include thymic selection, sampling a TCR distribution closer to what is observed in the periphery, independent of the target antigen. Additionally, to investigate our training set composition’s impact on the validation performance, we evaluated sequences from TCRBART-0 and TCRT5-0 derived in an input-free manner (TCRBART-Unconditional and TCRT5-Unconditional). As expected, we found that conditioning on the epitope yielded gains on all metrics, except for global diversity (Fig. 2b). Surprisingly, TCRT5-Unconditional achieved non-zero F1 scores, revealing high-likelihood training CDR3β sequences in the validation set.
Multitask training increases accuracy metrics and decreases diversity metrics of generated sequences
After confirming the utility of target conditioning, we trained the additional TCRBART and TCRT5 variants and found that no single model outperformed the others on all metrics across all pMHCs (Extended Data Fig. 1a,b). For example, although sequence recoveries increased for bidirectional and multitask variants, their Char-BLEU scores decreased (Fig. 2c). For the F1 score, some models excelled on a small subset of examples and others showed marginal improvements across a broader set, reflected in the divergent mean and median scores. Reassuringly, all the training procedures maintained or improved the F1 performance for over 50% of validation pMHCs over the baseline (Fig. 2d). Using the mean average precision (mAP) to assess calibration (ranking of observed binders), we found that the bidirectional models outperformed the baseline and both outperformed the multitask variants (Fig. 2e). Diversity metrics, however, revealed a decline in unique sequences generated across pMHCs, going from the baseline models to the bidirectional and multitask ones. This was most evident for TCRBART-0 (M), which retained strong performance despite a drop of over 80% in unique generations (Fig. 2f), highlighting the importance of using both metrics to represent performance.
To holistically characterize the models, we visualized performance on a diversity–accuracy biaxial plot. Although pretraining and fine-tuning pushed the Pareto front for the TCRT5-FT variants, we observed the opposite effect in the TCRBART-FT models (Fig. 2g and Supplementary Note A.6). Since both TCRBART-FT and TCRT5-0 generated fewer than 10% of the maximum number of unique sequences, with average Jaccard dissimilarities of less than 0.5, we selected the TCRBART-0 and TCRT5-FT variants as the best BART and T5 models, respectively, and restricted further analyses to these models. Interestingly, the differences between the baseline, bidirectional and multitask models of TCRBART-0 and TCRT5-FT were less obvious. Crucially, the fact remained that the bidirectional and multitask model variants generated fewer unique sequences and still improved performance. When we examined the generated sequences, we saw that many CDR3β sequences repeatedly sampled across pMHCs were known binders to multiple validation examples (Extended Data Fig. 2).
Multitask models preferentially sample polyspecific CDR3β sequences
Although cross-reactivity is an essential component of the TCR repertoire, a recent study defines distinctly ‘polyspecific’ TCRs (Fig. 3a) as those with higher generation probabilities, specific V/J gene preferences, shared CDR3s across individuals and reactivity to multiple unrelated peptides25. Given that the multitask models improbably maintained competitive performance with a fraction of the diversity, we sought to characterize the polyspecificity of their translations. Since our models lack V/J and individual-level context, we employed an ‘ML-centric’ definition, identifying CDR3β sequences from our training set that appear in multiple disease contexts and bind more than two epitopes (n = 915). We found that not only did the multitask models generate more polyspecific CDR3β sequences (P < 0.01 for both architectures; Fisher’s exact test), but that their mean polyspecificity, as defined by the number of cognate epitopes, increased too (Fig. 3b,c). In fact, we observed a strong inverse correlation between the number of polyspecific TCRs and the unique sequences generated (Pearson’s r: –0.957), with a high level of sequence sharing between models (Fig. 3d).
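As an illustration, the ‘ML-centric’ polyspecificity filter reduces to a simple aggregation over the paired training data; the sketch below assumes a pandas DataFrame with illustrative column names.

    import pandas as pd

    def polyspecific_cdr3bs(pairs: pd.DataFrame) -> set:
        # Count distinct epitopes and distinct disease contexts per CDR3β.
        counts = pairs.groupby("cdr3b").agg(
            n_epitopes=("epitope", "nunique"),
            n_diseases=("disease", "nunique"),
        )
        # Polyspecific (ML-centric definition): seen in multiple disease contexts
        # and binding more than two distinct epitopes.
        mask = (counts["n_diseases"] > 1) & (counts["n_epitopes"] > 2)
        return set(counts.index[mask])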
a, Diagram showing polyspecific TCRs binding different, unrelated pMHCs juxtaposed against regular TCRs sharing a more conserved cross-reactivity profile. b, Scatter plot of the number of polyspecific generations as a percentage and mean polyspecificity (number of distinct peptides) of the polyspecific TCRs per model is shown. c, Distribution of TCR polyspecificity across the parallel data and model generations. Density plot of cognate peptide counts for polyspecific TCRs aggregated from the combined training and validation set (reference CDR3β sequences) and the model variants per class. d, Venn diagrams of translation overlaps for TCRBART-0 and TCRT5-FT model variants. e, TCRBART-0 and TCRT5-FT sample polyspecific and known binders with higher sequence likelihoods than those of unknown specificities. Discrete heat maps in which rows indicate individual validation pMHCs, columns indicate the translation rank and colour indicates known binding and polyspecificity status shown for TCRBART-0 and TCRT5-FT variants. Colour/intensity reflects the increasing functional utility of a particular translation as the product of its specificity and known binding status. Panel a created with BioRender.com.
To determine if the models were mimicking CDR3β sequences seen most during training at similar frequencies, we examined the translations’ rank order against potentially explanatory variables such as polyspecificity, number of cognate epitopes/alleles and training set incidence. We found that although the highly ranked sequences were more common in the training set, they also had more dissimilar known cognate epitopes, suggesting robustness in capturing polyspecificity (Extended Data Fig. 3a,b). Multitask models showed weaker correlations between the sampling frequency and both training set occurrence and generation probabilities compared with their baseline and bidirectional counterparts, suggesting that they capture polyspecificity beyond simple memorization (Extended Data Fig. 3c,d).
Since our validation set comprises highly immunogenic viral peptides known to be the targets of polyspecific TCRs25, we checked if our F1 performance could be explained solely by polyspecific generations. Although the models sampled polyspecific sequences at higher ranks, we found that baseline models sampled both more non-polyspecific sequences overall and more non-polyspecific true-positive binders (Fig. 3e). Given our desire for a model that generates CDR3β sequences for rare epitopes, we find polyspecific TCR generation a potentially misleading avenue for metric hacking, misrepresenting true usefulness. Thus, although the bidirectional and multitask models show promise in increasing accuracy through self-consistency for the receptor–ligand design problem, we note that their utility may be limited for real-world scenarios. We, therefore, select TCRT5-FT as our flagship model (Supplementary Note A.7) and, henceforth, refer to it simply as TCRT5.
TCRT5 generates real unseen antigen-specific CDR3β sequences
Having selected TCRT5 for its superior accuracy, diversity and minimal reliance on polyspecific TCRs, we proceeded to understand the model in a more qualitative manner. TCRT5 captures CDR3β lengths with a slight decrease in spread (mean, 14.6; s.d., 1.2) compared with the reference set (mean, 14.5; s.d., 2.0). However, the sampled sequences had a substantially higher generation probability as determined by OLGA26, log[pgen] (mean, –7.04; s.d., 0.85) than the reference (mean, –9.83; s.d., 2.356), indicating that TCRT5 was missing lower-probability sequences (Fig. 4a). This reduction in repertoire diversity was also captured by various sequence embedding models (Extended Data Fig. 4a–c). To determine whether this effect stemmed from our choice of decoding algorithm rather than the model’s weights, we compared the pgen values from beam search and ancestral sampling against reference CDR3β sequences and found that beam search shifted the distribution towards higher biological likelihood than ancestral and reference (Extended Data Fig. 5a). Interestingly, we found that these pgen values correlated with the model log-likelihood scores (Extended Data Fig. 5b).
a, Repertoire-level features of reference (validation target sequences) and generated CDR3β sequences. TCRT5 captures the tails of the CDR3β length distribution but preferentially samples sequences at the right tail of OLGA generation probabilities. b, Sequence logo plots showing the decrease in sequence diversity position across the generated and reference CDR3β sequences for three canonical pMHCs (GILGFVFTL (influenza A), KLGGALQAK (CMV) and YLQPRTFLL (SARS-CoV2)). c, Generated sequences experience a decrease in Shannon entropy for nearly all positions compared with the reference sequences across all pMHCs. Bar plots for individual pMHCs are overlaid on one another. d, Plot of k-mer spectrum shift, showing the JS divergence between the generated and reference sequences. Mean JS divergences for soNNia generations for 100 sequences sampled per pMHC across 100 simulations are shown for reference. Error bars mark the mean and one standard deviation across validation pMHCs (n = 20). e, Heat map of Jaccard index scores, showing the generated sequence co-occurrence across different pMHC pairs. f, TCRT5 repeats sequences across pMHCs in line with biological probabilities and is robust to training set abundance. Scatter plot visualizing the occurrence across pMHCs with OLGA pgen, polyspecificity and training set frequency. g, TCRT5 generates experimentally validated antigen-specific CDR3β sequences unseen during training. Sankey diagram showing the validity (non-zero OLGA pgen), known antigen specificity status and training set membership of the generated sequences across the validation pMHCs.
Sequence logo plots of the cognate sequences for canonical epitopes GILGFVFTL (influenza A), KLGGALQAK (cytomegalovirus) and YLQPRTFLL (SARS-CoV2) revealed a noticeable decrease in sequence diversity, particularly near the start of the sequence (Fig. 4b). This loss of entropy was quantified using the positional Δentropy value, which showed the greatest loss in entropy around position 5 (Fig. 4c), probably due to the bias of starting sequences with the ‘CASS’ motif. Additionally, we found that soNNia better matched the reference k-mer patterns at short lengths, whereas TCRT5 matched better for medium k-mer lengths with both converging at the longer k-mer lengths (Fig. 4d).
Next, to determine TCRT5’s input specificity, we computed the Jaccard index to assess the overlap between translation sequences across pMHCs (Fig. 4e) and found sequences with high similarity clustered together, such as melanoma antigens EAAGIGILTV and ELAGIGILTV, though more data are required to determine to what extent this generalizes. To check for the correlates of sampling occurrence across validation pMHCs, we compared the generation probabilities, polyspecificity and training set frequency, and found that higher generation probability sequences were more frequently sampled, with no clear correlation between the training frequency and increased sampling (Fig. 4f).
Finally, to test whether TCRT5 could generate known binders not seen during training, we analysed each generated sequence for biological validity, known specificity and training set membership and found that out of the 2,000 generations, 1,996 had non-zero generation probabilities, 181 were known binders and 7 were TCRs that were not seen during supervised training (Fig. 4g). In particular, one of these seven was not found in the pretraining set, indicating a real potential for sampling de novo TCRs. Moreover, they spanned multiple peptides, alleles and disease contexts: KLGGALQAK_A*03:01 (cytomegalovirus), LLWNGPMAV_A*02:01 (yellow fever virus), YLQPRTFLL_A*02:01 (SARS-CoV2) and YVLDHLIVV_A*02:01 (Epstein–Barr virus (EBV)), demonstrating that the performance was not localized to a single peptide or MHC. All analyses described above remained consistent when validated at a sampling depth of 1,000 sequences (Extended Data Fig. 6).
TCRT5 achieves state-of-the-art performance on sparsely validated epitopes
The goal of a conditional TCR generation model like TCRT5 is to sample TCRs against rare epitopes not seen during training, especially when few or no known TCRs exist. To better understand the utility of these models in such a real-world setting, we benchmarked against two publicly available models: ER-TRANSFORMER17 and GRATCR18. For a fair comparison, we curated a test set of high-confidence paired data from recent exports of VDJdb27, IEDB28 and the IMMREP2023 TCR-pMHC specificity competition10. We included studies after January 2023 with at least ten CDR3β sequences per epitope and a minimum edit distance (Dtrain) of 5 to any training epitope (Fig. 5a). This resulted in 14 epitopes spanning seven HLA alleles. One EBV epitope, RVRAYTYSK (HLA-A*03:01), contained 895 unique CDR3β sequences and was reserved for a simulated in silico functional design assay. The remaining 13 were used to construct a sparse benchmark evaluation set.
a, Schematic of the test set curation. Bubble plot of n = 14 test pMHCs coloured by allele, with the area corresponding to the number of unique cognate CDR3β sequences. The pMHC with the most reference TCRs was reserved for a deep in silico simulation and the remaining 13 were used for a sparse evaluation set. b, Repertoire-level features for reference CDR3β sequences and those generated by ER-TRANSFORMER (ER-TRANS), a modified ER-TRANSFORMER (ER-TRANS+), GRATCR and TCRT5 are shown as smoothed density curves. c, Benchmark metrics showing the aggregate performance of all models on the benchmark dataset (n = 13). soNNia-derived metrics are aggregated across pMHCs and 1,000 simulations to account for the stochasticity of generations. Error bars show mean ± s.d. d, Modified true-positive counts. Box and whisker plots showing the median and quartile values for benchmark pMHCs (n = 13) on exact matches, sequence recovery ≥ 90% and GIANA reference clustered. Whiskers extend to 1.5× the interquartile range. e, Network diagrams for model generations. Clusters with a known binder (red) and GIANA-clustered translations (blue) are highlighted. The number of highlighted clusters (c), the number of clustered translations (t) and the number of known binders sampled (r) are reported per pMHC. GRATCR is omitted due to zero reference-clustered sequences. f, Network diagrams for RVRAYTYSK (EBV). g, In silico simulation of RVRAYTYSK design challenge. Heat maps highlighting the rank of exact reference matches, sequence recovery ≥ 90% and GIANA reference-clustered sequences for each model are shown. For each metric, a summary bar plot counting the number of successes is shown, coloured by range. Panel a created with BioRender.com.
To maximize the likelihood of exact sequence recovery, we sampled 1,000 sequences per model per pMHC. As a sanity check, we observed that all models produced sequences with comparable lengths and OLGA-derived pgen distributions, further suggesting that beam search decoding favours common high-probability motifs irrespective of the model (Fig. 5b and Supplementary Note A.8). Since ER-TRANSFORMER frequently omitted the canonical N-terminal cysteine and C-terminal phenylalanine, we defined an ER-TRANSFORMER+ variant that appends these residues when missing, recovering more realistic sequences. Across the 13 sparse epitopes, although all conditional models outperformed an unconditional soNNia baseline, TCRT5 achieved the highest overall performance, even recovering an exact sequence match for FTDALGIDEY (A*01:01) and HPNGYKSLSTL (B*07:02) (Fig. 5c).
Given the rarity of exact matches and the functional relevance of sequence similarity, we next evaluated whether TCRT5-generated sequences were predicted to be functionally similar to known binders. We leveraged GIANA29, an unsupervised clustering model that demonstrated high cluster purity in a recent benchmark of TCR clustering methods30, and also computed the number of generated sequences with ≥90% sequence identity to known binders—a threshold we found empirically to align with improved precision–recall metrics (Supplementary Note A.9). We found that TCRT5 generated more sequences with greater than 90% sequence identity as well as more sequences that clustered with the reference sequences, with ER-TRANSFORMER+ performing comparably (Fig. 5d,e).
Finally, to assess whether TCRT5 prioritized functional sequences at higher ranks, we simulated a prospective screen by generating 1,000 sequences for RVRAYTYSK and compared them against the 895 known CDR3β sequences. Using GIANA to cluster the model outputs with known binders, we found that GRATCR, ER-TRANSFORMER+ and TCRT5 generated 6, 133 and 231 clustered sequences, respectively, corresponding to 1, 19 and 23 unique reference CDR3β sequences (Fig. 5f). Remarkably, TCRT5 also recovered eight known binders, compared with three for ER-TRANSFORMER+ and zero for GRATCR. Moreover, TCRT5 consistently generated more sequences in top-ranked positions, outperforming all baselines across all rank cut-offs (Fig. 5g). These results demonstrate that TCRT5 samples realistic TCR sequences and prioritizes functional candidates, highlighting its potential utility in real-world TCR generation scenarios.
TCRT5 validates in vitro
Next, we sought to experimentally validate TCRT5 generations against a non-viral epitope, a notable challenge given our training set composition. We selected an HLA-A*02:01 epitope derived from leukaemia-associated Wilms’ tumour antigen-1 (WT1; sequence: VLDFAPPGA; Dtrain = 4)31, a target with a strong positive control in an existing TCR-T32. To test TCR functionality from the generated CDR3β sequences, we swapped the CDR3β of the TCR-T (henceforth referred to as the WT1 TCR) with TCRT5-generated sequences. These TCR constructs were then expressed in a TCR-KO Jurkat cell line with a nuclear factor of activated T cells (NFAT) promoter upstream of the luciferase enzyme, a setup that enabled the rapid functional read-out of T cell activation via luminescence (Fig. 6a, Extended Data Fig. 7a,b and Supplementary Note A.10). To account for CDR3β length differences disrupting TCR folding, we tested two sequence sets: variable-length sequences (20 sequences sampled uniformly from 100 generations) and fixed-length sequences (the first 20 sequences with the native WT1 CDR3β length).
a, Schematic depicting the generation and functional validation of NFAT-luciferase reporter Jurkat cells containing the predicted CDR3β sequences. b, Expression of TCRαβ on CD8+ Jurkat cells following retroviral transduction (n = 3 technical replicates). c, Fold change of RLUs for the engineered Jurkat cells co-cultured with WT1 peptide-pulsed versus DMSO control-treated T2 cells (n = 3 technical replicates). d, Follow-up functional validation assay of RLU fold change for WT1 and F8 Jurkat cells (n = 12 technical replicates). e, Specificity assay. Fold change of RLUs for WT1 and F8 Jurkat cells following co-culture with WT1 peptide, minor histocompatibility antigen HA-1 or CEFX Ultra SuperStim versus DMSO-treated T2 cells (n = 12 technical replicates). All tests of significance were done using the Student’s t-test. Error bars for biological samples in b–e show the standard error of the mean. Var, variable; NS, not significant. ****P ≤ 0.0001, **P ≤ 0.01. Panel a created with BioRender.com.
Surprisingly, all of the designed CDR3β-swapped gene sequences showed structurally viable TCR expression on the cell surface as assessed by flow cytometry (Fig. 6b), highlighting the feasibility of CDR3β grafting. Of the 40 engineered TCR constructs tested, one generated sequence, F8 (CASSVGLYNEQFF) from the fixed-length set, demonstrated a substantial increase in luciferase expression over the dimethyl sulfoxide (DMSO) controls (Fig. 6c,d and Extended Data Fig. 7c). In particular, this sequence was in the pretraining corpus but absent from our fine-tuning data, indicating that TCRT5 generated a naturally occurring TCR, correctly identifying it from vast unlabelled repertoires. Although F8’s activity was lower than the established WT1 TCR positive control, it demonstrated that TCRT5 can generate sequences capable of mediating epitope-specific functional activation. To assess F8’s specificity, we tested for off-target reactivity against two controls: the related HA-1 antigen (A*02:01, VLHDDLLEA) and CEFX Ultra SuperStim (CEFX), a highly immunogenic cocktail of 80 bacteria- and virus-derived MHC class I peptides. The WT1 TCR responded only to its cognate peptide, whereas F8 did not react to HA-1 but showed similar activation levels for both WT1 and CEFX pools (Fig. 6e). Although these results demonstrate TCRT5’s ability to elicit a functional response to an out-of-distribution epitope, the identified F8 TCR’s reactivity against CEFX suggests a level of polyspecificity that highlights the need for further refinement in target selectivity and dataset construction for future work.
Discussion
In silico identification of TCRs that precisely target arbitrary pMHCs remains one of the great outstanding challenges in computational immunology9, requiring models to navigate a complex interaction network of cross-reactive TCR-pMHC specificities in the face of sparsely labelled data. Here we present TCR-TRANSLATE, a seq2seq framework adapting low-resource machine translation techniques to conditional CDR3β sequence generation, demonstrating the rapid sampling of antigen-specific repertoires in this data-sparse domain. Our systematic exploration reveals key insights about bespoke training methods, generation diversity and TCR polyspecificity, culminating in the generation of a TCR construct demonstrating functional activity against a therapeutically relevant target without post hoc optimization.
Interestingly, we found that pretraining had opposite effects on TCRT5 and TCRBART, probably due to the former’s span masking strategy being better suited to learn on CDR3β sequences given their short lengths and the standard masking rate of 15% (ref. 33). Although span masking forces models to understand higher-order k-mer motifs, token masking at 15% would mask two amino acids on average, providing minimal learning signal per epoch. Beyond these architectural findings, we observed the models’ tendency to preferentially sample polyspecific TCRs, especially in the bidirectional and multitask training regimes. This consistent trend suggests that the alignment of sequence spaces through self-consistency training may inadvertently prioritize empirically de-risked sequences or high-likelihood sequences that satisfy many input conditions, even at the cost of diversity.
Despite these inherent biases, our flagship TCRT5 model outperformed both ER-TRANSFORMER17 and GRATCR18 across all metrics on held-out epitopes. Interestingly, all models sampled sequences with high V(D)J generation probabilities via beam search, suggesting potentially limited generalizability to rare targets. This pattern is well documented in natural language, where beam search and the broader class of mode-seeking decoding algorithms are known to sample simpler subsequences34. In the TCR space, this creates an interesting paradox. Although ancestral sampling produced distributions more closely resembling the real cognate repertoire, beam search consistently outperformed it on F1 metrics16. This suggests a potential confirmation bias in our evaluation, where higher pgen sequences may be over-represented in the reference set because they are more likely to be experimentally observed, effectively rewarding models for reproducing sampling biases rather than capturing true functional diversity.
The experimental validation of the F8 TCR construct represents the successful de novo design of a functional CDR3β, not seen during training, against a therapeutically relevant non-viral target. Remarkably, all 40 generated sequences showed viable surface expression, demonstrating the feasibility of CDR3β grafting. Although the overall hit rate (1/40) is low, it represents orders-of-magnitude improvement over traditional discovery methods. In particular, the F8 sequence was absent from our training data but exists in the unannotated form in iReceptor35, indicating that TCRT5 identified a naturally occurring TCR with previously unknown specificity. The F8 TCR’s cross-reactivity with the CEFX peptide pool but not with HA-1 suggests a level of learned polyspecificity that reflects both biological reality and limitations in our training data. This behaviour reveals a fundamental tension between therapeutic utility and the underlying biology. Although polyspecificity is an evolved feature that enables broad immune surveillance, its utility in clinical applications is diminished, creating a scenario in which natural TCR properties may conflict with therapeutic requirements.
The current iteration of our study has many limitations stemming from both our approach and an innate scarcity of available data. In the data-scarce regime, we were most limited in silico by our ability to evaluate the models. Metrics based on exact sequence recovery, sequence similarity thresholds (≥90%) and clustering serve as imperfect proxies for functional specificity, providing noisy estimates of performance. Functionally, our focus on the CDR3β loop requires a scaffold TCR, neglecting the crucial contributions of the α chain and V/J genes to antigen specificity36,37. Furthermore, all sampled sequences were validated without downstream prioritization, a design choice that preserved the unbiased evaluation of TCRT5, which does not fully reflect how a generative model would be used in a therapeutic discovery pipeline. Additionally, our choice of TCRT5-FT as the flagship model and the behavioural analyses we report were based on the validation set performance. However, we argue for its necessity given the severe data sparsity to evaluate pMHCs across multiple disease contexts. Importantly, TCRT5 demonstrates consistent, monotonic improvement in performance across training checkpoints, suggesting that our final model demonstrates real learning rather than random fluctuations to its parameters.
Experimentally, our validation was limited to a single non-viral epitope in a Jurkat NFAT-luciferase system, providing a useful but limited view of TCR-mediated T cell activation read out at the level of nuclear signalling. This assay stops short of capturing the final read-outs of cytotoxicity and cytokine secretion, and it also misses high-affinity, low-activation TCRs. Although we successfully validated a TCR against WT1, our modest hit rate (fixed length, 1/20; total, 1/40) and observed cross-reactivity with the CEFX peptide pool show the model’s current limitations in capturing fine-grained specificity signals, highlighting the need for richer, non-viral data. A broader experimental sweep to include diverse epitopes, running multimer stains and cytotoxicity screens to quantify upstream and downstream biological activities, and training on newer architectures and better datasets would help generalizability before expression in primary T cells.
Contrary to traditional TCR discovery processes, which have low yield, models like TCRT5 enable rapid hypothesis generation for rare epitopes with few to no known TCRs, drastically reducing the search space of possible cognate sequences. Through benchmarking on held-out epitopes, we showed that TCRT5 outperforms available conditional generation models across all reported metrics, including exact matches, sequence clustering and prioritization of known binders at higher ranks. More importantly, we validated this performance experimentally by identifying a functional TCR (F8) against an out-of-distribution, non-viral epitope without post hoc filtering, underscoring the model’s potential for out-of-the-box real-world utility. Although challenges remain, this work represents an important step towards achieving computationally guided targeting of arbitrary peptide sequences at will. As more data become available, the performance of generative models like TCRT5 is expected to improve, moving the field closer to scalable, high-precision TCR design for personalized immunotherapies that can rapidly respond to emerging threats and individual patient needs.
Methods
Sequence representation
We adopt the same seq2seq framework introduced in ref. 16, relaxing the direction of the pMHC→TCR source–target pairs to train on pMHC→TCR and TCR→pMHC, but evaluate on the former. To represent the TCR-pMHC trimeric complex, comprising three subinteractions (TCR-peptide, TCR-MHC and peptide-MHC) as a source–target sequence pair, we made a few simplifying assumptions that allowed for a more straightforward problem formulation. First, we assume a stable pMHC complex, reducing the problem space to a dimeric interaction between TCR and pMHC. Second, we focus on the variable amino acid residues at the binding interface. For the TCR, we use the CDR3β loop, a contiguous span of 8–20 amino acids that typically make the most contact with the peptide38. Similarly, for the pMHC, we use the whole peptide and the MHC pseudo-sequence, defined in ref. 39 as a reduced, non-contiguous, string containing the polymorphic amino acids within 4.0 Å of the peptide. We opt for a single-character amino-acid-level tokenization, primarily for its interpretability40. In addition to the 20 canonical amino acids, we use standard special tokens to encode semantic information pertaining to the structure of the sequences including the start of sequence [SOS], end of sequence [EOS], masking [MASK], padding [PAD] and a separator token [SEP] to delineate the boundary between the concatenated peptide and pseudo-sequence. For TCRT5, we additionally employ the use of sequence-type tokens [TCR] and [PMHC], retained from T5’s use of task prefixes20, to designate the translation direction:
TCRBART:
[SOS]EPITOPE[SEP]PSEUDOSEQUENCE[EOS]↔[SOS]CDR3BSEQ[EOS]
TCRT5:
[PMHC]EPITOPE[SEP]PSEUDOSEQUENCE[EOS]↔[TCR]CDR3BSEQ[EOS]
Of note, this formulation is extensible to other sequence representations of both TCR and pMHC by using the [SEP] token to delineate the α- and β-chain information for CDR3, multiple CDRs, and even full-chain sequence representations. Similarly, this approach can be used for the full MHC sequence as well.
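As an illustration only (the vocabulary indices here are not the released tokenizer’s), the encoding scheme reduces to a character-level mapping:

    # Illustrative character-level tokenizer for the TCRT5-style input format;
    # the real vocabulary and ids are defined by the released tokenizer, not here.
    SPECIALS = ["[PAD]", "[SOS]", "[EOS]", "[SEP]", "[MASK]", "[TCR]", "[PMHC]"]
    AMINO_ACIDS = list("ACDEFGHIKLMNPQRSTVWY")
    VOCAB = {tok: i for i, tok in enumerate(SPECIALS + AMINO_ACIDS)}

    def encode_pmhc(epitope: str, pseudoseq: str) -> list:
        # [PMHC]EPITOPE[SEP]PSEUDOSEQUENCE[EOS]
        tokens = ["[PMHC]"] + list(epitope) + ["[SEP]"] + list(pseudoseq) + ["[EOS]"]
        return [VOCAB[t] for t in tokens]

    def encode_tcr(cdr3b: str) -> list:
        # [TCR]CDR3BSEQ[EOS]
        tokens = ["[TCR]"] + list(cdr3b) + ["[EOS]"]
        return [VOCAB[t] for t in tokens]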
Dataset construction
Core parallel corpus
Our parallel corpus comprised experimentally validated immunogenic TCR-pMHC pairs taken from publicly available databases (McPAS41, VDJdb27 and IEDB28). All data were collected before May 2023. Additionally, we used a large sample of partially labelled data derived from the MIRA42 dataset, which contained CDR3β and peptide sequences but only reported MHC information at the haplotype resolution rather than the actual presenting MHC allele. Therefore, the presenting MHC allele was inferred from the individual’s haplotype using MHCflurry 2.0’s43 top-ranked presentation score for the listed alleles. Of importance, these allele-imputed examples were not used in the evaluation. To aggregate the data spanning various sources, formats and nomenclature, we mapped the columns from each individual dataset to a common consensus schema and concatenated the data along the consensus columns. Missing values were imputed, where reasonable, based on other information for that data instance. To keep only the cytotoxic (CD8+) T cells, we retained instances in which the annotated cell type indicated CD8+ T cells or the HLA allele was of MHC class I. Once the data were aggregated and the values were imputed, we applied the following column-level standardization for each source of information:
-
CDR3β, epitope and MHC pseudo-sequence: all amino acid representations were normalized using the ‘tidytcells.aa.standardise’ function found in the tidytcells Python package44.
-
TR genes: the tidytcells package44 was once again used to standardize the nomenclature surrounding the TCR genes (for example, TRB-V and TRB-J).
-
HLA allele: HLA allele information was parsed and standardized to the HLA-[A,B,C]*XX:YY format using the ‘mhcgnomes’ package (https://github.com/pirl-unc/mhcgnomes); only parsed entities identified as alleles were retained, whereas those with serotype- or class-level resolution were filtered out. For a small number of cases in which mhcgnomes identified an allele group but was unable to find/parse protein-level information, we imputed the protein field by incrementing from ‘*01’ until a matching IMGT allele was found. Although this step has the potential of introducing differences between the imputed pseudo-sequence and the ground truth, we anticipate this source of noise to have a minor effect as the MHC pseudo-sequence is well conserved within the serotype.
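As an illustration of the MIRA allele-imputation step described above, a sketch based on MHCflurry 2.0’s presentation predictor is shown below; the wrapper function is ours and the output field names follow the MHCflurry documentation.

    from mhcflurry import Class1PresentationPredictor

    # Load the packaged pan-allele presentation model.
    predictor = Class1PresentationPredictor.load()

    def impute_presenting_allele(peptide: str, haplotype_alleles: list) -> str:
        # Score the peptide against the donor's listed alleles and return the
        # allele with the top-ranked presentation score.
        df = predictor.predict(peptides=[peptide], alleles=haplotype_alleles)
        return df.loc[df["presentation_score"].idxmax(), "best_allele"]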
Once aggregated, only entries derived from human studies with MHC class I peptides were retained. Additionally, entries with the minimal information of HLA, peptide and CDR3β were retained. No other data filtration was performed for the training and validation splits.
Training/validation split
To assess the feasibility of having the models sample antigen-specific sequences for unseen epitopes, we held out a validation set of the top-20 most target-rich pMHCs. We trained on the remaining data, further removing occurrences of the held-out epitopes presented on alternate MHC alleles to ensure a clean validation split (Fig. 1c). We retained training sequences with a low edit distance to the validation pMHCs to better understand their influence on performance. The degree to which these sequences exhibit training set similarity is reflected in Extended Data Table 1. The parallel corpus was subsequently de-duplicated to remove near duplicates (peptides with the same allele and a ≥6-mer overlap), which we found to marginally help the overall performance, in accordance with ref. 45. This resulted in a final dataset split of ~330k training sequence pairs (N = 6,989 pMHCs) and 68k validation sequence pairs (N = 20 pMHCs). A key limitation of our validation dataset is its bias towards mainly viral epitopes and a narrow HLA distribution centred on well-studied alleles.
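A minimal sketch of this near-duplicate rule (same allele and any shared k-mer of length ≥6 between peptides); helper names are ours.

    def kmers(seq: str, k: int = 6) -> set:
        return {seq[i:i + k] for i in range(len(seq) - k + 1)}

    def near_duplicate(pmhc_a, pmhc_b, k: int = 6) -> bool:
        # pMHCs are (peptide, allele) tuples; near duplicates share the allele
        # and at least one k-mer (k >= 6) between their peptides.
        (pep_a, allele_a), (pep_b, allele_b) = pmhc_a, pmhc_b
        return allele_a == allele_b and bool(kmers(pep_a, k) & kmers(pep_b, k))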
Unlabelled ‘monolingual’ data
We hypothesized that pretraining the encoder–decoder model using self-supervised methods on pMHC and TCR sequences could help boost the translation performance of the model by learning better representations for source and target sequences, as that in ref. 46, which crucially has been shown to improve performance in the low-resource setting21. For the unlabelled pMHC sequences, we used the positive MHC ligand binding assay data from IEDB (N ≈ 740k)28. For the TCR sequences, we used around (N ≈ 14M) sequences from TCRdb47, out of which around 7M CDR3β sequences were unique. For this dataset, we chose to retain duplicate CDR3β sequences as the TCRdb was amassed over multiple studies and populations; therefore, we felt that the inclusion of duplicate CDR3β sequences was reflective of convergent evolution in the true unconditional TCR distribution.
Benchmark ‘test’ data
To fairly compare TCRT5 against the external models ER-TRANSFORMER17 and GRATCR18, we looked for data that would not advantage any one model over the others. This meant finding data that were not in any training or validation set, which would have introduced leakage via model selection. Since GRATCR was fine-tuned exclusively on MIRA data, filtering for our training and validation sets would cover the GRATCR model. However, since we were not able to find the training set for ER-TRANSFORMER, we adopted a slightly more stringent data inclusion policy. To account for both our dataset and ER-TRANSFORMER’s, we aimed to find paired TCR-pMHC data from recent studies (2023 onwards) and filtered for epitopes that were at least five amino acid edits away from anything in our training set. Owing to its widespread use and well-characterized performance, the IMMREP2023 TCR specificity competition10 was used along with recent exports from VDJdb and IEDB, which were accessed on 25 March 2025 and 1 April 2025, respectively. To ensure that quality examples were taken from VDJdb, entries with a confidence score of ≥2 were chosen. After applying our filtering criteria, we were left with four pMHCs from the IMMREP2023 dataset, four pMHCs from IEDB and eight pMHCs from VDJdb. After manually examining the 16 pMHCs and validating their assay conditions, two pMHCs from VDJdb that shared the same peptide ‘RPIIRPATL’ were dropped due to their inclusion in a 2021 study. The final test set consisted of 14 epitopes; the ‘RVRAYTYSK’ epitope, which contained 895 unique CDR3β sequences, was reserved for the in silico simulation, leaving n = 13 pMHCs for the benchmark. The degree to which these sequences exhibit training set similarity is reflected in Extended Data Table 2.
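The Dtrain filter reduces to a straightforward edit-distance screen; the sketch below uses a generic dynamic-programming Levenshtein routine (not the exact code used in our pipeline).

    def levenshtein(a: str, b: str) -> int:
        # Standard dynamic-programming edit distance.
        prev = list(range(len(b) + 1))
        for i, ca in enumerate(a, 1):
            curr = [i]
            for j, cb in enumerate(b, 1):
                curr.append(min(prev[j] + 1,                 # deletion
                                curr[j - 1] + 1,             # insertion
                                prev[j - 1] + (ca != cb)))   # substitution
            prev = curr
        return prev[-1]

    def passes_dtrain_filter(epitope: str, training_epitopes, min_dist: int = 5) -> bool:
        # Keep an epitope only if it is at least min_dist edits from every training epitope.
        return all(levenshtein(epitope, t) >= min_dist for t in training_epitopes)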
Supplementary Note A.11 provides more information.
Model training
Pretraining
TCRBART was pretrained using masked amino acid modelling (BERT style48), whereas TCRT5 utilized masked span reconstruction, learning to fill in randomly dropped spans with lengths between 1 and 3. Of importance, neither model was trained on complete sequence reconstruction to reduce the possibility of memorization during pretraining. Both models were trained on unlabelled CDR3β and peptide-pseudo-sequences, simultaneously pretraining the encoder and decoder, inspired by the MASS/XLM approach49,50. Unlike MASS/XLM, we omitted per-token learned language embeddings, allowing TCRBART to learn from the size differences between CDR3β and pMHC sequences and TCRT5 to use the [TCR] and [PMHC] starting tokens. To address the imbalance in sequence types, we upsampled sequences for a 70/30 TCR/pMHC split.
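As an illustration, a simplified version of the span-corruption step for a single sequence is sketched below; the exact sentinel and target conventions of our pipeline may differ.

    import random

    def corrupt_spans(seq: str, mask_rate: float = 0.15, max_span: int = 3):
        # Mark roughly mask_rate of positions for masking using spans of length 1-3.
        chars = list(seq)
        n_to_mask = max(1, round(mask_rate * len(chars)))
        masked = [False] * len(chars)
        while sum(masked) < n_to_mask:
            span = random.randint(1, max_span)
            start = random.randrange(len(chars))
            for i in range(start, min(start + span, len(chars))):
                masked[i] = True
        # Build the corrupted input (contiguous masked spans collapse to one [MASK])
        # and collect the dropped spans as the reconstruction target.
        corrupted, targets, i = [], [], 0
        while i < len(chars):
            if masked[i]:
                j = i
                while j < len(chars) and masked[j]:
                    j += 1
                corrupted.append("[MASK]")
                targets.append(seq[i:j])
                i = j
            else:
                corrupted.append(chars[i])
                i += 1
        return "".join(corrupted), "[SEP]".join(targets)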
Direct training/fine-tuning
For the parallel data, we used the same three training protocols (baseline, bidirectional and multitask) for direct training from random initialization as well as fine-tuning from a pretrained model. This was done by extending the standard categorical cross-entropy loss function (equation (1)), favoured in seq2seq tasks for its desired effect of maximizing the conditional likelihoods over target sequences51,52. For the baseline training, we used the canonical form of the cross-entropy loss, as shown below:
$$\,{{\mathcal{L}}}_{{\rm{CE}}}=-\mathop{\sum }\limits_{t=1}^{|y|}\log {P}_{\theta }\left({y}_{t}\,|\,{y}_{ < t},x\right),\quad (1)$$
where x denotes the source sequence and y the target sequence.
The bidirectional and multitask models were trained using multiterm objectives, forming a linear combination of individual loss terms corresponding to the cross-entropy loss of each task/direction.
To mitigate the effects of model forgetting with stacking single-task training epochs, we shuffled the tasks across the epoch using a simple batch processing algorithm (Algorithm 1). After the batch was sampled, it was rearranged into one of four seq2seq mapping possibilities and trained on target reconstruction with the standard cross-entropy loss, which was used for backpropagation. In this way, we could ensure that the model was simultaneously learning multiple tasks during training. For the bidirectional model, this was straightforward as we could swap the input and output tensors during training to get the individual loss contributions of \({{\mathcal{L}}}_{{\rm{pmhc\to tcr}}}\) and \({{\mathcal{L}}}_{{\rm{tcr\to pmhc}}}\) (equation (2)). For the multitask model, the mapping possibilities are (1) pMHC→TCR, (2) TCR→pMHC, (3) masked/corrupted pMHC*→pMHC and (4) masked/corrupted TCR*→TCR, which combine to form \({{\mathcal{L}}}_{{\rm{multi}}}\) (equation (3)). These tasks and sequence mappings as seen by TCRBART and TCRT5 are summarized in Fig. 2a.
Algorithm 1
Multitask training step.
Batched input: source pMHCs, X; target TCRs, Y
Sample a ∼ Bernoulli(0.5)
if a = 1 then
Swap X and Y
Compute attention masks
end if
Sample b ∼ Bernoulli(0.5)
if b = 1 then
Set Y ← X and X ← X* (the masked/corrupted copy of X)
Compute attention masks
end if
Predict \(\hat{{\bf{Y}}}=\phi ({\bf{X}})\) and perform gradient updates on CE(\({\bf{Y}}\), \(\hat{{\bf{Y}}}\))
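A hedged PyTorch sketch of this training step is shown below, using Hugging Face seq2seq conventions; the corrupt function, batch keys and padding handling are illustrative, and the bidirectional model uses only the first branch.

    import torch

    def multitask_step(model, batch, corrupt, optimizer):
        # batch: dict of tokenized source pMHCs (X) and target TCRs (Y).
        X, Y = batch["pmhc_input_ids"], batch["tcr_input_ids"]
        # With probability 0.5, swap the direction (TCR -> pMHC instead of pMHC -> TCR).
        if torch.rand(1).item() < 0.5:
            X, Y = Y, X
        # With probability 0.5, switch to denoising: corrupted X as source, original X as target.
        if torch.rand(1).item() < 0.5:
            X, Y = corrupt(X), X
        attention_mask = (X != model.config.pad_token_id).long()
        # Hugging Face seq2seq models return the cross-entropy loss when labels are supplied
        # (padding positions in the labels would normally be set to -100 so they are ignored).
        loss = model(input_ids=X, attention_mask=attention_mask, labels=Y).loss
        loss.backward()
        optimizer.step()
        optimizer.zero_grad()
        return loss.item()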
For the purposes of comparison between models originating from different training schemes, each of the models was trained for 20 epochs, from which the checkpoint with the highest average overlap to the known TCR reference set (F1 score) was chosen. We chose this approach to characterize the models’ real-world potential under optimal conditions, as opposed to training for a fixed number of steps or even a fixed number of steps per task (Supplementary Note A.6).
Evaluation
To evaluate antigen specificity, we build our framework around sampling exact CDR3β sequences from published experimental data on well-characterized validation epitopes not seen during training. This approach has an interpretable bias compared with black-box error profiles, at the cost of potentially under-representing actual performance. We calculate sequence-similarity-based metrics beyond exact overlap to create a more robust evaluation framework, and characterize their concordances for future use on epitopes with fewer known cognate sequences. Broadly, our metrics can be summarized as evaluating the accuracy of the returned sequences, their diversity or some combination of the two. They are summarized in brief below:
Accuracy metrics
-
Char-BLEU: following BLEU-4 (ref. 53), the character-level BLEU calculates the weighted n-gram precision against the k = 20 closest reference sequences to abate the unintended penalization of accurate predictions under a large reference set. We use NLTK’s ‘sentence_bleu’ function to calculate a single translation’s BLEU score and the ‘corpus_bleu’ function to compute the BLEU score over an entire dataset.
-
Native sequence recovery: we compute the index-matched sequence overlap with the closest known binder of the same sequence length, when available. This is equivalent to one minus the length-normalized Hamming distance. For cases in which a length-matched reference did not exist, the Levenshtein distance normalized to the length of the closest reference was used instead.
-
mAP: borrowed from information retrieval, mAP measures the average precision across the ranked model predictions. Here we rank the generations by model log-likelihood scores and take the average of the precisions at the top-1, top-2, top-3… top-k ranked outputs. Then, we take the mean over the various pMHCs’ average precision values to get the mAP. This metric gauges the accuracy of the model as well as the calibration of its sequence likelihoods.
-
Biological likelihood: to assess the plausibility of model outputs independent of antigen specificity or labelled data, we compute the generation probability of predictions using OLGA, a domain-specific generative model that infers CDR3β sequence likelihood26.
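The OLGA computation follows its documented Python interface; the sketch below assumes the human TRB model files distributed with the package and uses the F8 CDR3β as an example.

    import olga.load_model as load_model
    import olga.generation_probability as pgen

    # Paths to the packaged human TCRβ model files (as in the OLGA documentation).
    params = "default_models/human_T_beta/model_params.txt"
    marginals = "default_models/human_T_beta/model_marginals.txt"
    v_anchors = "default_models/human_T_beta/V_gene_CDR3_anchors.csv"
    j_anchors = "default_models/human_T_beta/J_gene_CDR3_anchors.csv"

    genomic_data = load_model.GenomicDataVDJ()
    genomic_data.load_igor_genomic_data(params, v_anchors, j_anchors)
    generative_model = load_model.GenerativeModelVDJ()
    generative_model.load_and_process_igor_model(marginals)
    pgen_model = pgen.GenerationProbabilityVDJ(generative_model, genomic_data)

    # Generation probability of an amino acid CDR3β sequence (log10 taken downstream).
    p = pgen_model.compute_aa_CDR3_pgen("CASSVGLYNEQFF")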
Diversity metrics
-
Total unique sequences: as a measure of global diversity, we compute the number of total unique generations across the top-20 validation pMHCs as a diversity metric that captures model degeneracy and input specificity. This metric is a function of sampling depth and is dependent on the relatedness, or model-perceived relatedness, of the input epitopes in a dataset.
-
Jaccard similarity/dissimilarity index: the Jaccard index or the Jaccard similarity score is used to measure the similarity of two sets and is calculated as the size of the intersection divided by the union of the two sets. Since the Jaccard index is inversely proportional to diversity, one minus the Jaccard index is used to represent diversity between two sets.
-
Positional Δentropy: to quantify the change in diversity between the models’ outputs and the reference distribution per CDR3β position, we report \(H({q}_{i})-H({p}_{i})\) rather than the Kullback–Leibler divergence, giving a signed change in entropy between the amino acid usage distributions of the reference distribution q and the sample distribution p at position i.
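Minimal sketches of the Jaccard dissimilarity and positional Δentropy computations (helper names are ours):

    import math
    from collections import Counter

    def jaccard_dissimilarity(a: set, b: set) -> float:
        # 1 - |A ∩ B| / |A ∪ B|
        return 1.0 - len(a & b) / len(a | b)

    def positional_delta_entropy(reference: list, generated: list, position: int) -> float:
        # H(q_i) - H(p_i): signed entropy change of amino acid usage at one position,
        # where q is the reference distribution and p the generated distribution.
        def entropy(seqs):
            counts = Counter(s[position] for s in seqs if len(s) > position)
            total = sum(counts.values())
            return -sum((c / total) * math.log2(c / total) for c in counts.values())
        return entropy(reference) - entropy(generated)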
Both
-
Precision@K: borrowed from information retrieval, this metric is calculated by sampling K sequences from the model, with the key distinction that we do not account for rank. Here we count true positives as exact sequence matches to the reference target sequences, and false positives are defined, somewhat restrictively, as generated sequences that do not occur in the reference set. These quantities are combined to compute precision as follows:
$$\,\text{Precision}=\frac{\text{True Positives (TP)}}{\text{True Positives (TP)}+\text{False Positives (FP)}\,}.$$
-
Recall@K: also taken from information retrieval, this metric uses exact sequence overlap to measure the model’s ability to sample the breadth of reference sequences; the denominator is taken as the minimum of K and the total number of reference sequences to ensure that the metric ranges from 0 to 1:
$$\,\text{Recall}=\frac{\text{True Positives (TP)}}{\min(K,\text{Total Reference Sequences})}.$$
-
F1@K: the F1 score is computed as the harmonic mean of precision and recall, useful for its ability to capture a balanced picture between precision and recall:
$$\,\text{F1}=2\times \frac{\text{Precision}\times \text{Recall}}{\text{Precision}+\text{Recall}}.$$
-
k-mer spectrum shift: as used in the DNA sequence design space54, the k-mer spectrum shift measures the Jensen–Shannon (JS) divergence between the k-mer usage frequency distributions of two sets of sequences across different values of k. Here we compare the JS divergence between the distribution of k-mers derived from a pMHC’s model generations and its reference set of sequences.
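The retrieval-style metrics above reduce to a few lines; a sketch follows (treating the top-K generations as a set; helper name ours).

    def f1_at_k(generated: list, reference: set, k: int) -> float:
        # Precision@K, Recall@K and their harmonic mean on exact sequence matches.
        top_k = set(generated[:k])
        tp = len(top_k & reference)
        precision = tp / len(top_k) if top_k else 0.0
        recall = tp / min(k, len(reference))
        if precision + recall == 0:
            return 0.0
        return 2 * precision * recall / (precision + recall)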
TCRT5 data ablation
To evaluate the impact of specific training decisions, we conducted an ablation study by removing key complexities of our training and data pipelines and measuring their effects on model performance. We started with our chosen model, TCRT5, fine-tuned on the single-task TCR generation with semisynthetic MIRA42 data. Next, we retrained the model without the MIRA data for an equivalent number of steps to assess its contribution. Finally, we removed pretraining altogether, training a model on the reduced dataset from random initialization.
To avoid over-representing the performance of the model trained on MIRA data on similar validation examples, we specifically removed three pMHCs that were a single-edit distance from a MIRA example with a greater than 5% overlap in their cognate CDR3β sequences (LLLDRLNQL, TTDPSFLGRY and YLQPRTFLL) from the validation set. For all the models, we used the same checkpoint heuristic, selecting the model with the highest F1 score.
In silico benchmark
GRATCR
For running GRATCR on the test set peptides, we followed the instructions provided by the GRATCR team18 (https://github.com/zhzhou23/GRATCR). We ran the beam search decoding as provided. Since conditional likelihoods were not output by their beam implementation, the sampled sequence index was used as the translation rank. The script to sample the fine-tuned GRATCR was used as follows:
python GRA.py --data_path="./data/benchmark_peptides.csv" \
    --tcr_vocab_path="./Data/vocab/total-beta.csv" \
    --pep_vocab_path="./Data/vocab/total-epitope.csv" \
    --model_path="./model/gra.pth" --bert_path="./model/bert_pretrain.pth" \
    --gpt_path="./model/gpt_pretrain.pth" --mode="generate" \
    --result_path="./gratcr_benchmark_results.csv" --batch_size=1 --beam=1000
ER-TRANSFORMER
ER-TRANSFORMER was run using the unique amino acid model for a more direct comparison with TCRT5. We used the seq_generate method from their codebase with the default parameters shown in Code/evaluate_seq2seq_MIRA.py at https://github.com/TencentAILabHealthcare/ER-BERT/ as used by the ER-BERT team17. The translation rank was computed in the same manner as for TCRT5 using the Hugging Face infrastructure around model.generate. The code for sampling ER-TRANSFORMER, followed by an illustrative call, is shown below:
def seq_generate(input_seq, max_length, input_tokenizer, target_tokenizer, beams, k=1000):
    # Note: relies on a module-level `model` (the fine-tuned ER-TRANSFORMER) being in scope.
    # The epitope string is space-separated so that each amino acid is tokenized individually.
    input_tokenized = input_tokenizer(" ".join(input_seq),
                                      padding="max_length",
                                      max_length=max_length,
                                      truncation=True,
                                      return_tensors="pt")
    input_ids = input_tokenized.input_ids.to("cpu")
    attention_mask = input_tokenized.attention_mask.to("cpu")
    outputs = model.generate(input_ids,
                             attention_mask=attention_mask,
                             num_beams=beams,
                             num_return_sequences=k)
    output_str = target_tokenizer.batch_decode(outputs, skip_special_tokens=True)
    # Remove the inter-residue spaces and drop empty generations.
    output_str_nospace = [s.replace(" ", "") for s in output_str]
    output_str_nospace = [s for s in output_str_nospace if s != ""]
    return output_str_nospace
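An illustrative call is shown below; the epitope, max_length value and tokenizer names are placeholders rather than values taken from the ER-BERT configuration.
candidate_cdr3bs = seq_generate("GILGFVFTL",              # example epitope
                                max_length=25,             # placeholder tokenizer length
                                input_tokenizer=pep_tokenizer,
                                target_tokenizer=tcr_tokenizer,
                                beams=1000,
                                k=1000)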
Additionally, we observed that ER-TRANSFORMER performance improved greatly with a post hoc edit to the translations, simply adding a leading cysteine and a terminal phenylalanine wherever missing. Although this decreased the number of unique sequences, indicating that ER-TRANSFORMER samples sequences both with and without the required C and F, the large increase in accuracy warranted its inclusion for benchmarking. We annotate this amended model ER-TRANSFORMER+ and regard it as the fairer comparison between the methods; the edit is sketched below.
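A minimal sketch of this post hoc edit, assuming translations are provided as a rank-ordered list of strings:
def add_anchor_residues(ranked_translations):
    # Prepend the canonical leading cysteine and append the terminal phenylalanine
    # wherever missing, then drop duplicates while keeping the best (earliest) rank.
    edited, seen = [], set()
    for seq in ranked_translations:
        if not seq.startswith("C"):
            seq = "C" + seq
        if not seq.endswith("F"):
            seq = seq + "F"
        if seq not in seen:
            seen.add(seq)
            edited.append(seq)
    return edited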
Modified F1 scores
In the sparse setting, evaluating model performance by exact sequence recovery is zero inflated, penalizing generations that might well count as hits if sufficient known binders were available. To alleviate this, we used two principled criteria for calling generated sequences true positives. First, we counted generated sequences with >90% sequence recovery to a known reference CDR3β. Second, we used the GIANA 4.1 (ref. 29) clustering algorithm to cluster the generated samples together with the known reference sequences and counted as positives those generated samples that clustered with a reference sequence. GIANA was run using only CDR3β information and all default settings with the following command (a sketch of the first criterion follows the command):
python GIANA4.1.py -f cdr3b_input_file_path -v False
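For the first criterion, the relaxed true-positive call can be sketched as below; the difflib matching ratio is used here as a stand-in for the paper’s sequence-recovery computation, which is an assumption.
from difflib import SequenceMatcher

def is_relaxed_true_positive(generated, references, threshold=0.90):
    # Count a generation as a true positive if it is >90% similar to any reference CDR3beta.
    return any(SequenceMatcher(None, generated, ref).ratio() > threshold
               for ref in references)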
In vitro validation
To further evaluate the ability of TCRT5 to generate epitope-specific CDR3β sequences for sparsely validated epitopes, we attempted to experimentally characterize a list of predicted CDR3β sequences for a leukaemia-associated antigen, the HLA-A*02:01-presented WT1 epitope (VLDFAPPGA)31, grafted onto a well-characterized TCR-T construct32 using the sequence identified in ref. 55. From the list of generated CDR3β sequences, we selected 40 for in vitro validation: 20 sequences of the same length as the original CDR3β (13 AA), chosen by oversampling TCRT5 and taking the first 20 sequences of length 13, and 20 sequences of variable CDR3β length, chosen by sampling 100 sequences from TCRT5 and taking every fifth sequence starting from the first, yielding lengths of 15–17 AA (selection sketched below).
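The selection procedure can be sketched as follows; the argument names stand in for the oversampled and 100-sequence TCRT5 generations and are illustrative.
def select_for_validation(oversampled, sampled_100, original_length=13):
    # 20 generations matching the original CDR3beta length, plus 20 variable-length
    # generations taken as every fifth of 100 samples (starting from the first).
    fixed_length = [s for s in oversampled if len(s) == original_length][:20]
    variable_length = sampled_100[0::5][:20]
    return fixed_length + variable_length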
Retroviral transduction
Predicted CDR3β sequences (Extended Data Table 3) were synthesized as gBlocks (IDT, custom) and cloned into a standard SFG retroviral backbone vector56 containing the full-length WT1 TCR sequence. Sequences were codon optimized for expression in human cells, and cloned plasmids were validated by Oxford Nanopore sequencing (Plasmidsaurus). TCR retroviral supernatants were generated by co-transfecting 293T cells with the TCR-SFG, RDF and PegPam3 plasmids using GeneJuice Transfection Reagent (Sigma, 70967-5). Viral supernatants were harvested at 48 and 72 h post-transfection, snap frozen and stored at –80 °C. Transductions were performed using RetroNectin (Takara, T100A) according to the manufacturer’s recommendations.
TCR expression
TCRs were transduced into a genetically engineered Jurkat cell line (Promega, GA1182). The cell line is deficient in endogenous α and β chains (TCR-KO) and constitutively expresses both CD4 and CD8 co-receptors. Additionally, the TCR-KO Jurkats are engineered to express an NFAT-inducible luciferase reporter construct. Following transduction, TCR expression on the cell surface was evaluated by flow cytometry. Before staining, cells were incubated with 50 nM dasatinib for 30 min at 37 °C, which has been shown to improve T cell staining57. TCR-Jurkats were then labelled with the following fluorochrome-labelled monoclonal antibodies: CD8-BV421 (BioLegend, 344748) and TCRα/β-PE (IP-26, BioLegend, 984702). Samples were also stained for viability using Live/Dead Fixable Near-IR (Thermo, L10119) and run on a BD Fortessa flow cytometer (BD Biosciences). Analysis was performed with FlowJo (v. 10.10.0).
T cell activation and luminescence read-out
To assess T cell activation, 4 × 10⁵ TCR-T Jurkats were cultured in a 96-well plate for 6 h with peptide- or DMSO-pulsed T2 cells at a 10:1 effector-to-target ratio. Before co-culture, T2 cells were pulsed overnight at 1 × 10⁶ cells ml⁻¹ supplemented with 10 μM peptide. Peptides were synthesized at GenScript with >95% purity (GenScript, custom). Luciferase expression was measured using the Bio-Glo-NL assay system (Promega, J3081) according to the manufacturer’s protocol. Luminescence was measured in relative luminescent units (RLUs) using a BioTek Synergy 2 microplate reader. All reported values were normalized by subtracting the average luminescence of the media control wells, and comparisons against the peptide-null control (DMSO) are reported as fold change values. Selected TCRs were also screened against a set of control peptides: HA-1 (VLRDDLLEA), a minor histocompatibility antigen commonly targeted in leukaemia, and the CEFX Ultra SuperStim Pool MHC-I Subset (JPT, PM-CEFX-4), a mix of 80 class I bacterial and virally derived peptides known to react across a range of class I MHC alleles.
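The normalization and fold-change calculation described above can be sketched as below; the function and variable names are illustrative.
import numpy as np

def peptide_fold_change(rlu_peptide, rlu_dmso, rlu_media):
    # Subtract the mean luminescence of the media-only wells, then compare the
    # peptide-pulsed condition with the DMSO (peptide-null) control.
    background = np.mean(rlu_media)
    return (np.mean(rlu_peptide) - background) / (np.mean(rlu_dmso) - background)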
Statistics
Fisher’s exact test (one sided) was used to determine the P values P_bidxn and P_multi, quantifying the difference in the number of polyspecific TCRs sampled. This was computed using the ‘scipy.stats’ Python library. Pairwise Student’s t-tests were computed for significance testing between peptide and DMSO controls for all biological validation data.
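A sketch of these tests with scipy.stats is shown below; the 2 × 2 table layout, the illustrative counts and luminescence values, and the use of an unpaired t-test are assumptions, since only the test names are specified above.
from scipy import stats

# One-sided Fisher's exact test on a 2x2 table of (polyspecific, non-polyspecific)
# TCR counts for two model variants; the counts here are illustrative placeholders.
table = [[12, 88],   # model variant A
         [3, 97]]    # model variant B
odds_ratio, p_value = stats.fisher_exact(table, alternative="greater")

# Student's t-test between peptide-pulsed and DMSO control luminescence values
# (illustrative background-subtracted RLUs); stats.ttest_rel would be used if paired.
rlu_peptide = [5200.0, 4800.0, 5100.0]
rlu_dmso = [150.0, 170.0, 160.0]
t_stat, p_val = stats.ttest_ind(rlu_peptide, rlu_dmso)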
Data availability
All sequence data, generations and computational results used for the paper figures are available via GitHub at https://github.com/pirl-unc/tcr_translate. TCR and pMHC sequences are taken from publicly available databases (IEDB, McPAS, VDJdb and MIRA) and are provided in the source–target pair format used for seq2seq training. Wet-laboratory validation data including flow panel read-outs and luminescence plate reader output are available via Zenodo (https://doi.org/10.5281/zenodo.15724161)58. Source data are provided with this paper.
Code availability
All code used for training and evaluating TCRT5 is available via GitHub (https://github.com/pirl-unc/tcr_translate) or via Zenodo (https://doi.org/10.5281/zenodo.15068617)59. For ease of use, the model and tokenizer for TCRT5 can be downloaded from Hugging Face at https://huggingface.co/dkarthikeyan1/tcrt5_ft_tcrdb. Additionally, the pretrained TCRT5 is available via Hugging Face at https://huggingface.co/dkarthikeyan1/tcrt5_pre_tcrdb.
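For example, the fine-tuned model can be loaded as follows; the use of the standard T5 classes assumes the checkpoint is stored in the default Hugging Face seq2seq format.
from transformers import AutoTokenizer, T5ForConditionalGeneration

tokenizer = AutoTokenizer.from_pretrained("dkarthikeyan1/tcrt5_ft_tcrdb")
model = T5ForConditionalGeneration.from_pretrained("dkarthikeyan1/tcrt5_ft_tcrdb")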
References
Kalergis, A. M. et al. Single amino acid replacements in an antigenic peptide are sufficient to alter the TCR Vb repertoire of the responding CD8+ cytotoxic lymphocyte population. J. Immunol. 162, 7263–7270 (1999).
Tzannou, I. et al. Off-the-shelf virus-specific T cells to treat BK virus, human herpesvirus 6, cytomegalovirus, Epstein-Barr virus, and adenovirus infections after allogeneic hematopoietic stem-cell transplantation. J. Clin. Oncol. 35, 3547–3557 (2017).
Chung, J. B., Brudno, J. N., Borie, D. & Kochenderfer, J. N. Chimeric antigen receptor T cell therapy for autoimmune disease. Nat. Rev. Immunol. 24, 830–845 (2024).
Harrison, C. TCR cell therapies vanquish solid tumors—finally. Nat. Biotechnol. 42, 1477–1479 (2024).
Liu, Y. et al. TCR-T immunotherapy: the challenges and solutions. Front. Oncol. 11, 812799 (2022).
Wooldridge, L. et al. A single autoimmune T cell receptor recognizes more than a million different peptides. J. Biol. Chem. 287, 1168–1177 (2011).
Bentzen, A. K. et al. T cell receptor fingerprinting enables in-depth characterization of the interactions governing recognition of peptide-MHC complexes. Nat. Biotechnol. 36, 1191–1196 (2018).
Sewell, A. Why must T cells be cross-reactive? Nat. Rev. Immunol. 12, 669–677 (2012).
Hudson, D., Fernandes, R. A., Basham, M., Ogg, G. & Koohy, H. Can we predict T cell specificity with digital biology and machine learning? Nat. Rev. Immunol. 23, 511–521 (2023).
Nielsen, M. et al. Lessons learned from the IMMREP23 TCR-epitope prediction challenge. ImmunoInformatics 16, 100113 (2024).
Wu, K. et al. TCR-BERT: learning the grammar of T-cell receptors for flexible antigen-binding analyses. In Proc. 18th Machine Learning in Computational Biology meeting Vol 240, 194–229 (PMLR, 2024).
Glanville, J. et al. Identifying specificity groups in the T cell receptor repertoire. Nature 547, 94–98 (2017).
Davidsen, K. et al. Deep generative models for T cell receptor protein sequences. eLife 8, e46935 (2019).
Isacchini, G., Walczak, A. M., Mora, T. & Nourmohammad, A. Deep generative selection models of T and B cell receptor repertoires with soNNia. Proc. Natl Acad. Sci. USA 118, e2023141118 (2021).
Fast, E., Dhar, M. & Chen, B. Tapir: a T-cell receptor language model for predicting rare and novel targets. Preprint at bioRxiv https://doi.org/10.1101/2023.09.12.557285 (2023).
Karthikeyan, D., Raffel, C., Vincent, B. & Rubinsteyn, A. Conditional generation of antigen specific T-cell receptor sequences. In NeurIPS 2023 Generative AI and Biology (GenBio) Workshop 1–14 (2023).
Yang, J. et al. De novo generation of T-cell receptors with desired epitope-binding property by leveraging a pre-trained large language model. Preprint at bioRxiv https://doi.org/10.1101/2023.10.18.562845 (2023).
Zhou, Z. et al. GRATCR: epitope-specific T cell receptor sequence generation with data-efficient pre-trained models. IEEE J. Biomed. Health Inform. 29, 2271–2283 (2025).
Lewis, M. et al. BART: denoising sequence-to-sequence pre-training for natural language generation, translation, and comprehension. In Proc. 58th Annual Meeting of the Association for Computational Linguistics 7871–7880 (Association for Computational Linguistics, 2020).
Raffel, C. et al. Exploring the limits of transfer learning with a unified text-to-text transformer. J. Mach. Learn. Res. 21, 1–67 (2020).
Haddow, B. et al. Survey of low-resource machine translation. Comput. Linguist. 48, 673–732 (2022).
Niu, X., Denkowski, M. & Carpuat, M. Bi-directional neural machine translation with synthetic parallel data. In Proc. 2nd Workshop on Neural Machine Translation and Generation 84–91 (Association for Computational Linguistics, 2018).
Yang, H.-W. et al. Aligning cross-lingual entities with multi-aspect information. In Proc. 2019 Conference on Empirical Methods in Natural Language Processing and 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP) 4431–4441 (Association for Computational Linguistics, 2019).
Ding, L., Wu, D. & Tao, D. Improving neural machine translation by bidirectional training. In Proc. 2021 Conference on Empirical Methods in Natural Language Processing 3278–3284 (Association for Computational Linguistics, 2021).
Quiniou, V. et al. Human thymopoiesis produces polyspecific CD8+ α/β T cells responding to multiple viral antigens. eLife 12, e82956 (2023).
Sethna, Z. et al. OLGA: fast computation of generation probabilities of B- and T-cell receptor amino acid sequences and motifs. Bioinformatics 35, 2974–2981 (2019).
Shugay, M. et al. VDJdb: a curated database of T-cell receptor sequences with known antigen specificity. Nucleic Acids Res. 46, D419–D427 (2017).
Vita, R. et al. The Immune Epitope Database (IEDB): 2018 update. Nucleic Acids Res. 47, D339–D343 (2018).
Zhang, H., Zhan, X. & Li, B. GIANA allows computationally-efficient TCR clustering and multi-disease repertoire classification by isometric transformation. Nat. Commun. 12, 4699 (2021).
Hudson, D., Lubbock, A., Basham, M. & Koohy, H. A comparison of clustering models for inference of T cell receptor antigen specificity. ImmunoInformatics 13, 100033 (2024).
Sugiyama, H. WT1 (Wilms’ tumor gene 1): biology and cancer immunotherapy. Jpn. J. Clin. Oncol. 40, 377–387 (2010).
Chapuis, A. G. et al. T cell receptor gene therapy targeting WT1 prevents acute myeloid leukemia relapse post-transplant. Nat. Med. 25, 1064–1072 (2019).
Wettig, A., Gao, T., Zhong, Z. & Chen, D. Should you mask 15% in masked language modeling? In Conference of the European Chapter of the Association for Computational Linguistics (2022).
Eikema, B. & Aziz, W. Is MAP decoding all you need? The inadequacy of the mode in neural machine translation. In International Conference on Computational Linguistics (2020).
Corrie, B. D. et al. iReceptor: a platform for querying and analyzing antibody/B-cell and T-cell receptor repertoire data across federated repositories. Immunol. Rev. 284, 24–41 (2018).
Springer, I., Tickotsky, N. & Louzoun, Y. Contribution of T cell receptor alpha and beta CDR3, MHC typing, V and J genes to peptide binding prediction. Front. Immunol. 12, 730581 (2021).
Henderson, J., Nagano, Y., Milighetti, M. & Tiffeau-Mayer, A. Limits on inferring T-cell specificity from partial information. Proc. Natl Acad. Sci. USA 121, e2408696121 (2024).
Yu, K., Shi, J., Lu, D. & Yang, Q. Comparative analysis of CDR3 regions in paired human αβ CD8 T cells. FEBS Open Bio 9, 1450–1459 (2019).
Hoof, I. et al. NetMHCpan, a method for MHC class I binding prediction beyond humans. Immunogenetics 61, 1–13 (2009).
Dotan, E. Effect of tokenization on transformers for biological sequences. Bioinformatics 40, btae196 (2024).
Tickotsky, N., Sagiv, T., Prilusky, J., Shifrut, E. & Friedman, N. McPAS-TCR: a manually curated catalogue of pathology-associated T cell receptor sequences. Bioinformatics 33, 2924–2929 (2017).
Dines, J. N. et al. The ImmuneRACE study: a prospective multicohort study of immune response action to COVID-19 events with the immunoCODE™ open access database. Preprint at medRxiv https://doi.org/10.1101/2020.08.17.20175158 (2020).
O'Donnell, T. J., Rubinsteyn, A. & Laserson, U. MHCflurry 2.0: improved pan-allele prediction of MHC class I-presented peptides by incorporating antigen processing. Cell Syst. 11, 42–48 (2020).
Nagano, Y. & Chain, B. tidytcells: standardizer for TR/MH nomenclature. Front. Immunol. 14, 1224567 (2023).
Lee, K. et al. Deduplicating training data makes language models better. In Proc. 60th Annual Meeting of the Association for Computational Linguistics Vol 1, 8424–8445 (Association for Computational Linguistics, 2022).
Cooper Stickland, A., Li, X. & Ghazvininejad, M. Recipes for adapting pre-trained monolingual and multilingual models to machine translation. In Proc. 16th Conference of the European Chapter of the Association for Computational Linguistics: Main Volume 3440–3453 (Association for Computational Linguistics, 2021).
Chen, S.-Y., Yue, T., Lei, Q. & Guo, A.-Y. TCRdb: a comprehensive database for T-cell receptor sequences with powerful search function. Nucleic Acids Res. 49, D468–D474 (2020).
Elnaggar, A. et al. ProtTrans: toward understanding the language of life through self-supervised learning. IEEE Trans. Pattern Anal. Mach. Intell. 44, 7112–7127 (2022).
Song, K., Tan, X., Qin, T., Lu, J. & Liu, T.-Y. MASS: masked sequence to sequence pre-training for language generation. In Proc. 36th International Conference on Machine Learning Vol 97, 5926–5936 (PMLR, 2019).
Lample, G. & Conneau, A. Cross-lingual language model pretraining. In Proc. 33rd International Conference on Neural Information Processing Systems 7059–7069 (Curran Associates Inc., 2019).
Sutskever, I., Vinyals, O. & Le, Q. V. Sequence to sequence learning with neural networks. In Proc. 28th International Conference on Neural Information Processing Systems 3104–3112 (MIT Press, 2014).
Cho, K. et al. Learning phrase representations using RNN encoder-decoder for statistical machine translation. In Proc. 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP) 1724–1734 (Association for Computational Linguistics, 2014).
Papineni, K., Roukos, S., Ward, T. & Zhu, W.-J. BLEU: a method for automatic evaluation of machine translation. In Proc. 40th Annual Meeting on Association for Computational Linguistics 311–318 (Association for Computational Linguistics, 2002).
Sarkar, A., Tang, Z., Zhao, C. & Koo, P. K. Designing DNA with tunable regulatory activity using discrete diffusion. Preprint at bioRxiv https://doi.org/10.1101/2024.05.23.595630 (2024).
Schmitt, T. M., Chapuis, A. G. & Greenberg, P. D. High avidity WT1 T cell receptors and uses thereof. US patent 10,780,158 (2020).
Kim, S. H. et al. Construction of retroviral vectors with improved safety, gene expression, and versatility. J. Virol. 72, 994–1004 (1998).
Lissina, A. et al. Protein kinase inhibitors substantially improve the physical detection of T-cells with peptide-MHC tetramers. J. Immunol. Methods 340, 11–24 (2009).
Bennett, S., Reynolds, A., Karthikeyan, D., Rubinsteyn, A. & Vincent, B. In vitro validation of TCRT5. Zenodo https://doi.org/10.5281/zenodo.15724161 (2025).
Karthikeyan, D. dhuvik/tcr_translate: pre-publication release. Zenodo https://doi.org/10.5281/zenodo.15068617 (2025).
Vaswani, A. et al. Attention is all you need. In Proc. 31st International Conference on Neural Information Processing Systems 6000–6010 (Curran Associates Inc., 2017).
Zhang, P., Bang, S., Cai, M. & Lee, H. Context-aware amino acid embedding advances analysis of TCR-epitope interactions. eLife 12 (2024).
Jiang, Y., Huo, M., Zhang, P., Zou, Y. & Li, S. C. TCR2vec: a deep representation learning framework of T-cell receptor sequence and function. Preprint at bioRxiv https://doi.org/10.1101/2023.03.31.535142 (2023).
Acknowledgements
This work was supported largely by the National Science Foundation Graduate Research Fellowship DGE-2040435 (D.K.). Additional support was provided by the National Institutes of Health R37CA247676 (S.N.B., A.G.R., A.R. and B.G.V.). The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institute of General Medical Sciences or the National Institutes of Health. The UNC Flow Cytometry Core Facility (RRID: SCR_019170) is supported in part by P30 CA016086 Cancer Center Core Support Grant to the UNC Lineberger Comprehensive Cancer Center. Research reported in this publication was supported by the Center for AIDS Research (award number 5P30AI050410). The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Science Foundation or the National Institutes of Health. In addition, this work would not have been possible without numerous fruitful conversations. We would like to thank C. Raffel, first author of the original T5 paper, whose course on large language models at UNC inspired the work and whose generous expertise helped guide many of the early decision points in getting the models to do anything useful. Next, we would like to thank G. Isacchini, whose soNNia model provides a stellar comparison, and with whom we had many discussions regarding polyspecificity. We thank A. Palmer and A. Pomeroy for their suggestions and feedback on data communication and our figures. We thank W. Valdar for his feedback on our statistical methods. We thank A. Lee, S. Peterson and J. Webb for lending their creativity, expertise and help in addition to proofreading our paper. We are incredibly grateful to the numerous friends and reviewers from various conference venues including NeurIPS GenBio, AIRR-C VII, ICML AccMLBio and ICLR GEMBIO for their generous and valuable feedback, whose suggestions helped strengthen many portions of our study.
Author information
Authors and Affiliations
Contributions
D.K. conceived of and conducted the computational experiments. S.N.B. and A.G.R. designed and analysed the in vitro results. D.K., S.N.B., A.R. and B.G.V. contributed to writing the paper. A.R. and B.G.V. supervised the project and contributed to editing the paper. All authors reviewed the final paper.
Corresponding authors
Ethics declarations
Competing interests
The authors declare no competing interests.
Peer review
Peer review information
Nature Machine Intelligence thanks Justin Barton, Nicholas Borcherding, Jamie Heather and the other, anonymous, reviewer(s) for their contribution to the peer review of this work.
Additional information
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Extended data
Extended Data Fig. 1 Atomic metrics.
(a) Box and whisker plot showing median and quartile ranges for individual translation sequence recoveries (n = 100) per model per pMHC. Whiskers extend to 1.5× IQR. Dotted lines show the mean sequence recovery of 10k soNNia unconditional sequences computed on the true cognate paratope sequences. (b) Bar plot showing F1@100 score per model and pMHC. Each subplot is demarcated with the number of reference CDR3βs in the top right corner.
Extended Data Fig. 2 Multitask models sample more known validated polyspecific TCR sequences.
(a) Subset of TCRBART-0 generations across model variants that are known binders to more than one validation pMHC (may be from the same disease context). (b) Subset of TCRT5-FT generations across model variants that are known binders to more than one validation pMHC (may be from the same disease context). Each row is an individual CDR3β sequence that was generated for and found in the experimentally validated set of reference TCRs for the listed validation pMHCs.
Extended Data Fig. 3 Exploring polyspecificity vs. training set statistics across baseline, bidirectional, and multitask model variants.
(a) Heat map of ranked TCRBART-0 translations across pMHCs coloured by number of known alleles, known epitopes, training set frequency, epitope dissimilarity (measured as the reciprocal of the longest common substring (LCS)), and membership status in the 915 polyspecific TCRs. (b) Analogous heat map as panel ‘a’ but for TCRT5-FT generations. (c) Correlation plots for TCRBART-0 and TCRT5-FT model generations and training set occurrence. Line of best fit is shown in red. Pearson’s r and Spearman’s ρ are also provided for each model. (d) Correlation plots for TCRBART-0 and TCRT5-FT log[pgen] and model generation frequency. Line of best fit shown in red. Summary statistics are provided as well.
Extended Data Fig. 4 CDR3β embeddings highlight reduction in sampled TCR space.
PCA dimensionality reduction of embeddings generated by sequence-based methods is shown for: (a) TCR-BERT, (b) catELMo61 and (c) TCR2vec62. Red points indicate sequences generated by TCRT5, grey corresponds to reference translations, and blue points are soNNia-generated sequences. Reference TCRs are downsampled to 200 sequences and 100 background sequences are shown.
Extended Data Fig. 5 TCRT5 sequence likelihoods.
(a) Histograms showing the OLGA pgen values for the reference CDR3βs as well as those generated by beam search and ancestral sampling methods. (b) Correlation plots showing the model scores (model sequence likelihoods) against the biophysical OLGA pgen. Axes are log10 scaled. Red line is the best fit line with associated Pearson’s r.
Extended Data Fig. 6 TCRT5 Metrics @1000.
(a) Repertoire-level features of reference (validation target sequences) and generated CDR3βs. (b) Sequence logo plots generated from TCRT5 for the canonical GILGFVFTL (Influenza A), KLGGALQAK (CMV) and YLQPRTFLL (SARS-CoV-2) epitopes from 1000 generations instead of 100. (c) TCRT5@1000 with beam search preferentially samples sequences at the right tail of OLGA generation probabilities; bar plots for individual pMHCs are overlaid on one another. (d) K-mer spectrum shift plot showing the Jensen-Shannon divergence between generated and reference sequences for TCRT5@1000. Error bars mark the mean and 1 standard deviation across validation pMHCs (n = 20). Mean soNNia values are shown per simulated run, with 1000 generations per pMHC per run over 100 simulations. (e) Heat map of Jaccard index scores showing the generated sequence co-occurrence across different pMHC pairs at 1000 generations per pMHC. (f) Sankey diagram of TCRT5@1000 generations showing validity as measured by nonzero generation probability, known binding status and training set membership.
Extended Data Fig. 7 Supporting details for in vitro validation of predicted CDR3β sequences.
(a) Detailed schematic of Gibson cloning generated CDR3β sequences into a retroviral expression plasmid containing a WT1 TCR sequence, retrovirus generation using 293T cells, and retroviral transduction to generate TCR-Jurkat cell lines for validation studies. (b) Gating strategy used to assess TCR expression. (c) Raw relative luminescence units (RLUs) for 40 generated CDR3β sequences stimulated with WT1 peptide or DMSO (n = 3 technical replicates). Error bars show SEM. Panel ‘a’ created in BioRender.
Supplementary information
Supplementary Information
Supplementary Notes A1–A11.
Source data
Source Data Fig. 6
Luminescence data for WT1 peptide-pulse activation assay.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Karthikeyan, D., Bennett, S.N., Reynolds, A.G. et al. Conditional generation of real antigen-specific T cell receptor sequences. Nat Mach Intell 7, 1494–1509 (2025). https://doi.org/10.1038/s42256-025-01096-6