Abstract
Background
Tumour-infiltrating neoantigen-reactive CD8+ T (Neo T) cells appear to be the primary drivers of immune responses to gastrointestinal cancer in patients. However, conventional methods for identifying Neo T cells and their corresponding T cell receptors (TCRs) are time-consuming and complex.
Methods
By mapping neoantigen-reactive T cells from the single-cell transcriptomes of thousands of tumour-infiltrating lymphocytes, we developed a 26-gene machine learning model for the identification of neoantigen-reactive T cells.
Results
In both the training and validation sets, the model performed well. Most Neo T cells showed notable differences in the biological processes of amide-related signalling pathways. Analysis of potential cell-to-cell interactions, combined with spatial transcriptomic and multiplex immunohistochemistry data, indicated that Neo T cells express potent signalling molecules, including LTA, which may engage tumour cells within the tumour microenvironment and thereby exert anti-tumour effects. By sequencing CD8+ T cells in tumour samples from patients undergoing neoadjuvant immunotherapy, we found that the fraction of Neo T cells was significantly and positively associated with clinical benefit and overall survival.
Conclusion
This method expedites the identification of neoantigen-reactive TCRs and the engineering of neoantigen-reactive T cells for therapy.
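As an illustration of the approach summarised above, the following minimal sketch trains a small classifier on signature-gene expression and then reports the fraction of predicted Neo T cells in a sample. It is not the authors' model: the gene names, expression values, plain logistic regression, and 0.5 threshold are all assumptions chosen for illustration, and the actual 26-gene panel is not listed in this abstract.

```python
import math

# Hypothetical placeholder genes; the real 26-gene panel is not given in the abstract.
SIGNATURE = ["GENE_A", "GENE_B", "GENE_C"]

# Toy, made-up normalised expression values (one row per cell, one column per gene).
TRAIN_X = [
    [2.0, 1.8, 2.2], [1.9, 2.1, 1.7], [2.3, 2.0, 1.9],  # labelled neoantigen-reactive
    [0.2, 0.1, 0.3], [0.0, 0.4, 0.1], [0.3, 0.2, 0.0],  # labelled non-reactive
]
TRAIN_Y = [1, 1, 1, 0, 0, 0]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict_proba(x, w, b):
    """Probability that a cell with expression vector x is neoantigen-reactive."""
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)


def train_logistic(X, y, lr=0.5, epochs=500):
    """Fit logistic regression by batch gradient descent (stand-in for the real model)."""
    w = [0.0] * len(X[0])
    b = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * len(w)
        grad_b = 0.0
        for xi, yi in zip(X, y):
            err = predict_proba(xi, w, b) - yi
            for j, xj in enumerate(xi):
                grad_w[j] += err * xj
            grad_b += err
        w = [wj - lr * g / len(X) for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / len(X)
    return w, b


def neo_fraction(cells, w, b, threshold=0.5):
    """Fraction of cells in one sample predicted to be neoantigen-reactive."""
    calls = [predict_proba(c, w, b) >= threshold for c in cells]
    return sum(calls) / len(calls)


# One hypothetical tumour sample of four CD8+ T cells.
SAMPLE_CELLS = [[2.1, 1.9, 2.0], [1.8, 2.2, 2.1], [0.1, 0.2, 0.2], [0.3, 0.0, 0.1]]

w, b = train_logistic(TRAIN_X, TRAIN_Y)
print("Predicted Neo T fraction:", neo_fraction(SAMPLE_CELLS, w, b))
```

In practice the per-sample fraction computed this way would be the quantity compared against clinical benefit, with the classifier replaced by the published model and real single-cell expression data.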
Data availability
The single-cell datasets generated during this investigation are accessible in the Code Ocean database (https://codeocean.com/capsule/0506291/tree). Source data are provided with this paper. The ML model and all analysis code have been uploaded to GitHub (https://github.com/shizhiwen1990/Neo).
Acknowledgements
We thank Chunhong Zheng and Eric Tran for assistance in setting up the scRNA-seq data of Neo T cells. We also thank Lei Zheng and Elana J. Fertig for assistance in setting up the bulk RNA-seq data of CD8+ TILs from the neoadjuvant immunotherapy cohort.
Funding
This study was supported by the Wenzhou Medical University Talent Research Startup Project; the Key Projects Jointly Constructed by the Department of Science and Technology of the National Administration of Traditional Chinese Medicine and the Administration of Traditional Chinese Medicine of Zhejiang Province (No. GZY-ZJ-KJ-24088); the Basic Scientific Research Project of Wenzhou Medical University (No. KYYW202107); and the Wenzhou Key Laboratory of Cancer Pathogenesis and Translation, School of Laboratory Medicine and Life Sciences, Wenzhou Medical University.
Author information
Contributions
Designing research studies: Zhiwen Shi and Hongwei Sun. Clinical sample collection: Zhengliang Du and Geer Chen. Collecting data: Zhiwen Shi, Xiao Han, Zhengliang Du and Hongwei Sun. Analysing data: Zhiwen Shi, Xiao Han, Tonglei Guo, Fei Xie and Hongwei Sun. Multiplex immunohistochemistry: Geer Chen and Zhengliang Du. Preparing the manuscript: Zhiwen Shi and Hongwei Sun. Grammar check: Tonglei Guo and Xiao Han. Supervision: Zhiwen Shi. Funding acquisition: Zhiwen Shi and Hongwei Sun. Weiyue Gu provided the venue and hardware for the relevant experiments. The authors read and approved the final manuscript.
Ethics declarations
Competing interests
The authors declare no competing interests.
About this article
Cite this article
Sun, H., Han, X., Du, Z. et al. Machine learning for the identification of neoantigen-reactive CD8+ T cells in gastrointestinal cancer using single-cell sequencing. Br J Cancer 131, 387–402 (2024). https://doi.org/10.1038/s41416-024-02737-0
DOI: https://doi.org/10.1038/s41416-024-02737-0