Abstract
The capability to profile the landscape of antigen-binding affinities of a vast number of antibodies (B cell receptors, BCRs) will provide a powerful tool to reveal biological insights. However, experimental approaches for detecting antibody–antigen interactions are costly and time-consuming and can only achieve low-to-mid throughput. In this work, we developed Cmai (contrastive modeling for antigen–antibody interactions) to predict binding between antibodies and antigens at a scale suited to high-throughput sequencing data. We devised a biomarker based on the output from Cmai to map the antigen-binding affinities of BCR repertoires. We found that the abundance of tumor antigen-targeting antibodies is predictive of immune-checkpoint inhibitor (ICI) treatment response. We also found that, during immune-related adverse events (irAEs) caused by ICI, humoral immunity is preferentially responsive to intracellular antigens from the organs affected by the irAEs. We used Cmai to construct a BCR-based irAE risk score, which predicted the timing of the occurrence of irAEs.
Data availability
The UTSW irAE participant cohort BCR sequence data can be accessed on the Database for Actionable Immunology website (https://dbai.biohpc.swmed.edu/) (refs. 4, 63 and 64), and clinical characteristics are available in Supplementary Table 4. The raw RNA-seq data from which we derived the BCR sequences are available from the Gene Expression Omnibus (GSE296826). The structure data were downloaded from the PDB under accession numbers 6VKM, 8F76 and 3N43. The training and validation data of the BCR V and CDR encoders are shown in Supplementary Table 1. Source data are provided with this paper.
Code availability
The source codes, training/validation data and trained model parameters for Cmai are available from GitHub (https://github.com/ice4prince/Cmai).
References
Berzofsky, J. A. An Ia-restricted epitope-specific circuit regulating T cell–B cell interaction and antibody specificity. Surv. Immunol. Res. 2, 223–229 (1983).
Sabhnani, L. et al. Developing subunit immunogens using B and T cell epitopes and their constructs derived from the F1 antigen of Yersinia pestis using novel delivery vehicles. FEMS Immunol. Med. Microbiol. 38, 215–229 (2003).
Zhang, J. et al. Modulation of nonneutralizing HIV-1 gp41 responses by an MHC-restricted TH epitope overlapping those of membrane proximal external region broadly neutralizing antibodies. J. Immunol. 192, 1693–1706 (2014).
Zhu, J. et al. BepiTBR: T–B reciprocity enhances B cell epitope prediction. iScience 25, 103764 (2022).
Zhang, Z. et al. Interpreting the B-cell receptor repertoire with single-cell gene expression using Benisse. Nat. Mach. Intell. 4, 596–604 (2022).
Wang, S.-S. et al. Tumor-infiltrating B cells: their role and application in anti-tumor immunity in lung cancer. Cell. Mol. Immunol. 16, 6–18 (2019).
Garaud, S. et al. Tumor infiltrating B-cells signal functional humoral immune responses in breast cancer. JCI Insight 5, e129641 (2019).
Lechner, A. et al. Tumor-associated B cells and humoral immune response in head and neck squamous cell carcinoma. Oncoimmunology 8, 1535293 (2019).
Carmi, Y. et al. Allogeneic IgG combined with dendritic cell stimuli induce antitumour T-cell immunity. Nature 521, 99–104 (2015).
Burger, J. A. & Wiestner, A. Targeting B cell receptor signalling in cancer: preclinical and clinical advances. Nat. Rev. Cancer 18, 148–167 (2018).
Sterner, R. C. & Sterner, R. M. CAR-T cell therapy: current limitations and potential strategies. Blood Cancer J. 11, 69 (2021).
Setliff, I. et al. High-throughput mapping of B cell receptor sequences to antigen specificity. Cell 179, 1636–1646 (2019).
Shiakolas, A. R. et al. Efficient discovery of SARS-CoV-2-neutralizing antibodies via B cell receptor sequencing and ligand blocking. Nat. Biotechnol. 40, 1270–1275 (2022).
Asrat, S. et al. TRAPnSeq allows high-throughput profiling of antigen-specific antibody-secreting cells. Cell Rep. Methods 3, 100522 (2023).
Mohan, D. et al. PhIP-seq characterization of serum antibodies using oligonucleotide-encoded peptidomes. Nat. Protoc. 13, 1958–1978 (2018).
Mason, D. M. et al. Optimization of therapeutic antibodies by predicting antigen specificity from antibody sequence via deep learning. Nat. Biomed. Eng. 5, 600–612 (2021).
Liu, G. et al. Antibody complementarity determining region design using high-capacity machine learning. Bioinformatics 36, 2126–2133 (2020).
Nimrod, G. et al. Computational design of epitope-specific functional antibodies. Cell Rep. 25, 2121–2131 (2018).
Shan, S. et al. Deep learning guided optimization of human antibody against SARS-CoV-2 variants with broad neutralization. Proc. Natl Acad. Sci. USA 119, e2122954119 (2022).
Saka, K. et al. Antibody design using LSTM based deep generative model from phage display library for affinity maturation. Sci. Rep. 11, 5852 (2021).
Hie, B. L. et al. Efficient evolution of human antibodies from general protein language models. Nat. Biotechnol. 42, 275–283 (2024).
Akbar, R. et al. In silico proof of principle of machine learning-based antibody design at unconstrained scale. MAbs 14, 2031482 (2022).
Abramson, J. et al. Accurate structure prediction of biomolecular interactions with AlphaFold 3. Nature 630, 493–500 (2024).
Xu, J. L. & Davis, M. M. Diversity in the CDR3 region of VH is sufficient for most antibody specificities. Immunity 13, 37–45 (2000).
Schroeder, H. W. & Cavacini, L. Structure and function of immunoglobulins. J. Allergy Clin. Immunol. 125, S41–S52 (2010).
Bever, C. S. et al. VHH antibodies: emerging reagents for the analysis of environmental chemicals. Anal. Bioanal. Chem. 408, 5985–6002 (2016).
Jaffe, D. B. et al. Functional antibodies exhibit light chain coherence. Nature 611, 352–357 (2022).
Atchley, W. R., Zhao, J., Fernandes, A. D. & Drüke, T. Solving the protein sequence metric problem. Proc. Natl Acad. Sci. USA 102, 6395–6400 (2005).
Baek, M. et al. Accurate prediction of protein structures and interactions using a three-track neural network. Science 373, 871–876 (2021).
Lu, T. et al. Tumor neoantigenicity assessment with CSiN score incorporates clonality and immunogenicity to predict immunotherapy outcomes. Sci. Immunol. 5, eaaz3199 (2020).
Lu, T. et al. Netie: inferring the evolution of neoantigen–T cell interactions in tumors. Nat. Methods 19, 1480–1489 (2022).
Jiang, F. et al. GTE: a graph learning framework for prediction of T-cell receptors and epitopes binding specificity. Brief. Bioinformatics 25, bbae343 (2024).
Huang, Y., Zhang, Z. & Zhou, Y. AbAgIntPre: a deep learning method for predicting antibody–antigen interactions based on sequence information. Front. Immunol. 13, 1053617 (2022).
Yuan, Y., Chen, Q., Mao, J., Li, G. & Pan, X. DG-Affinity: predicting antigen–antibody affinity with language models from sequences. BMC Bioinformatics 24, 430 (2023).
Chappert, P. et al. Human anti-smallpox long-lived memory B cells are defined by dynamic interactions in the splenic niche and long-lasting germinal center imprinting. Immunity 55, 1872–1890 (2022).
Raybould, M. I. J. et al. Thera-SAbDab: the therapeutic structural antibody database. Nucleic Acids Res. 48, D383–D388 (2020).
Martin, A. L. et al. Olfactory receptor OR2H1 is an effective target for CAR T cells in human epithelial tumors. Mol. Cancer Ther. 21, 1184–1194 (2022).
Dunbar, J. et al. SAbDab: the structural antibody database. Nucleic Acids Res. 42, D1140–D1146 (2014).
Bolotin, D. A. et al. Antigen receptor repertoire profiling from RNA-seq data. Nat. Biotechnol. 35, 908–911 (2017).
UniProt Consortium UniProt: the universal protein knowledgebase in 2023. Nucleic Acids Res. 51, D523–D531 (2023).
Zhang, E. et al. Roles and mechanisms of tumour-infiltrating B cells in human cancer: a new force in immunotherapy. Biomark. Res. 11, 28 (2023).
Xu, Y., Mao, Y., Lv, Y., Tang, W. & Xu, J. B cells in tumor metastasis: friend or foe? Int. J. Biol. Sci. 19, 2382–2393 (2023).
Qin, Y. et al. Tumor-infiltrating B cells as a favorable prognostic biomarker in breast cancer: a systematic review and meta-analysis. Cancer Cell Int. 21, 310 (2021).
Ni, Z. et al. Tumor-infiltrating B cell is associated with the control of progression of gastric cancer. Immunol. Res. 69, 43–52 (2021).
Zhang, Z. et al. Yin–yang effect of tumor infiltrating B cells in breast cancer: from mechanism to immunotherapy. Cancer Lett. 393, 1–7 (2017).
Shen, M., Wang, J. & Ren, X. New insights into tumor-infiltrating B lymphocytes in breast cancer: clinical impacts and regulatory mechanisms. Front. Immunol. 9, 470 (2018).
Sjöberg, E. et al. A minority-group of renal cell cancer patients with high infiltration of CD20+ B-cells is associated with poor prognosis. Br. J. Cancer 119, 840–846 (2018).
Minici, C., Testoni, S. & Della-Torre, E. B-lymphocytes in the pathophysiology of pancreatic adenocarcinoma. Front. Immunol. 13, 867902 (2022).
Wang, X. et al. PD-1-expressing B cells suppress CD4+ and CD8+ T cells via PD-1/PD-L1-dependent pathway. Mol. Immunol. 109, 20–26 (2019).
Thibult, M.-L. et al. PD-1 is a novel regulator of human B-cell activation. Int. Immunol. 25, 129–137 (2013).
Zhu, J. et al. Mapping cellular interactions from spatially resolved transcriptomics data. Nat. Methods 21, 1830–1842 (2024).
Riaz, N. et al. Tumor and microenvironment evolution during immunotherapy with nivolumab. Cell 171, 934–949 (2017).
Carithers, L. J. & Moore, H. M. The Genotype-Tissue Expression (GTEx) Project. Biopreserv. Biobank. 13, 307–308 (2015).
Uhlén, M. et al. Tissue-based map of the human proteome. Science 347, 1260419 (2015).
Jumper, J. et al. Highly accurate protein structure prediction with AlphaFold. Nature 596, 583–589 (2021).
Lin, Z. et al. Evolutionary-scale prediction of atomic-level protein structure with a language model. Science 379, 1123–1130 (2023).
Saravanan, V. & Gautham, N. Harnessing computational biology for exact linear B-cell epitope prediction: a novel amino acid composition-based feature descriptor. OMICS 19, 648–658 (2015).
Jespersen, M. C., Peters, B., Nielsen, M. & Marcatili, P. BepiPred-2.0: improving sequence-based B-cell epitope prediction using conformational epitopes. Nucleic Acids Res. 45, W24–W29 (2017).
Biswas, S. et al. Targeting intracellular oncoproteins with dimeric IgA promotes expulsion from the cytoplasm and immune-mediated control of epithelial cancers. Immunity 56, 2570–2583 (2023).
Springer, I., Tickotsky, N. & Louzoun, Y. Contribution of T cell receptor alpha and beta CDR3, MHC typing, V and J genes to peptide binding prediction. Front. Immunol. 12, 664514 (2021).
Cai, M., Bang, S., Zhang, P. & Lee, H. ATM-TCR: TCR–epitope binding affinity prediction using a multi-head self-attention model. Front. Immunol. 13, 893247 (2022).
Moris, P. et al. Current challenges for unseen-epitope TCR interaction prediction and a new perspective derived from image classification. Brief. Bioinformatics 22, bbaa318 (2021).
Lu, T. et al. Deep learning-based prediction of the T cell receptor–antigen binding specificity. Nat. Mach. Intell. 3, 864–875 (2021).
Zhang, Z., Xiong, D., Wang, X., Liu, H. & Wang, T. Mapping the functional landscape of T cell receptor repertoires by single-T cell transcriptomics. Nat. Methods 18, 92–99 (2021).
Peng, X. et al. Characterizing the interaction conformation between T-cell receptors and epitopes with deep learning. Nat. Mach. Intell. 5, 395–407 (2023).
Ye, J., Ma, N., Madden, T. L. & Ostell, J. M. IgBLAST: an immunoglobulin variable domain sequence analysis tool. Nucleic Acids Res. 41, W34–W40 (2013).
Soto, C. et al. PyIR: a scalable wrapper for processing billions of immunoglobulin and T cell receptor sequences using IgBLAST. BMC Bioinformatics 21, 314 (2020).
Saul, L. et al. IgG subclass switching and clonal expansion in cutaneous melanoma and normal skin. Sci. Rep. 6, 29736 (2016).
Li, X. et al. Comparative analysis of immune repertoires between Bactrian camel’s conventional and heavy-chain antibodies. PLoS ONE 11, e0161801 (2016).
Lupo, C., Spisak, N., Walczak, A. M. & Mora, T. Learning the statistics and landscape of somatic mutation-induced insertions and deletions in antibodies. PLoS Comput. Biol. 18, e1010167 (2022).
Bowers, P. M. et al. Nucleotide insertions and deletions complement point mutations to massively expand the diversity created by somatic hypermutation of antibodies. J. Biol. Chem. 289, 33557–33567 (2014).
Martin, A. et al. Olfactory receptor OR5V1 is an effective target for CAR T cells in ovarian cancer (207). Gynecol. Oncol. 166, S116 (2022).
Binder, J. X. et al. COMPARTMENTS: unification and visualization of protein subcellular localization evidence. Database (Oxford) 2014, bau012 (2014).
Zhang, R. et al. Nuclear localization of STING1 competes with canonical signaling to activate AHR for commensal and intestinal homeostasis. Immunity 56, 2736–2754 (2023).
Acknowledgements
The funders had no role in study design, data collection and analysis, decision to publish or preparation of the manuscript. This study was funded by the National Institutes of Health (5R01CA258584, to T.W.; 1R01AI190103, to T.W., J.W. and J. Huang; 1U01AI156189, to D.G.; R38HL150214, to M.G.), the Cancer Prevention Research Institute of Texas (RP190208, to T.W.; RP230363, to T.W. and J. Huang), the American Cancer Society–Melanoma Research Alliance Team (MRAT-18-114-01-LIB, to D.G.), the V Foundation Robin Roberts Cancer Survivorship Award (DT2019-007, to D.G.) and the Welch Foundation (I-1944, to X.B.).
Author information
Contributions
Conceptualization, T.W. Data collection and curation, F.J.F., M.S.V., D.M.Y., J.L., Y.X., C.L., I.R., C.Z., J.E.D., J. Homsi, S.R., S.Y., M.E.G., D.H., Y.G., Y.X., D.E.G., B.S., K.W., A.M., D.H., Y.G., C.L., P.R., J.C., Y.X. and T.W. Formal analysis, B.S., K.W., S.N., J.Y., Y.G. and X.B. Funding acquisition, T.W., D.E.G., J. Huang, J.W. and Tu.W. Investigation, B.S., K.W., S.N., J.Y. and Y.G. Methodology, B.S., S.N., J. Huang and T.W. Visualization, B.S., K.W., S.N. and T.W. Project administration, T.W. Software, B.S., K.W., S.N. and J.Y. Supervision, T.W. Writing—original draft, all authors. Writing—review and editing, all authors.
Ethics declarations
Competing interests
T.W. reports personal fees from Merck. D.G. has received research funding from Astra-Zeneca, BerGenBio, Karyopharm and Novocure, has stock ownership in Gilead, Medtronic and Walgreens, holds consulting or advisory board positions in Astra-Zeneca, Catalyst Pharmaceuticals, Daiichi-Sankyo, Elevation Oncology, Janssen Scientific Affairs, Jazz Pharmaceuticals, Regeneron Pharmaceuticals and Sanofi and is the cofounder and chief scientific officer of OncoSeer Diagnostics. The remaining authors declare no competing interests.
Peer review
Peer review information
Nature Cancer thanks the anonymous reviewers for their contribution to the peer review of this work.
Additional information
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Extended data
Extended Data Fig. 1 The design of the Cmai model.
(a) Detailed model structure of Cmai. (b) The input and reconstructed Atchley factor matrices of example BCR Vh sequences. The Vh amino acids were first converted to Atchley factors, then encoded numerically by the Vh VAE, and finally reconstructed by the Vh decoder.
Extended Data Fig. 2 Diversity of the model training data.
(a) Numbers of binding BCRs for antigens that fall into each interval. We counted the number of binding BCRs for each unique antigen in our training set. We calculated several quantiles based on these numbers across all the antigens (shown in the figure). We then calculated and plotted the median number of binding BCRs for the antigens that fall into each interval formed by these quantiles. For example, 99-100% means the top 1% of antigens with the most binding BCRs, and 0-90% means the bottom 90% of antigens with the fewest binding BCRs. (b) VDJ gene usage of the background human BCR sequences.
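The interval summary described for panel (a) can be illustrated with a small Python sketch. This is not the authors' code: the function name, the rank-based way of cutting the intervals and the 90-99% interval are assumptions made for illustration (the legend names only the 0-90% and 99-100% intervals), and the counts are toy values.

```python
from statistics import median


def median_per_interval(counts, cuts=(0, 90, 99, 100)):
    """Median count inside each rank-based percentile interval.

    counts: number of binding BCRs per antigen (one value per antigen).
    cuts:   percentile boundaries; the default values are an assumption.
    """
    ordered = sorted(counts)
    n = len(ordered)
    result = {}
    for lo, hi in zip(cuts, cuts[1:]):
        # Convert percentile boundaries to positions in the sorted list.
        start = round(n * lo / 100)
        stop = round(n * hi / 100)
        members = ordered[start:stop]
        if members:  # skip intervals that contain no antigen
            result[f"{lo}-{hi}%"] = median(members)
    return result


# Toy counts (hypothetical): 100 antigens with 1..100 binding BCRs each.
toy_counts = list(range(1, 101))
summary = median_per_interval(toy_counts)
```

Cutting by rank keeps every antigen in exactly one interval; a different quantile definition would shift the boundaries slightly on small data sets.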
Extended Data Fig. 3 Association between tumor antigen expression and Cmai BCR binding scores.
(a) The average BCR binding scores are higher for extracellular tumor antigens that are more highly expressed than for extracellular tumor antigens that are lowly expressed, in each TCGA cohort. N = 822 unique antigens. Sample size (number of patients) in each cohort is: ALL = 64, STES = 30, STAD = 25, BRCA = 26, LUSC = 31, SKCM = 25, LUAD = 24, OV = 21, HNSC = 36, ESCA = 34, KIRC = 19, THCA = 20, TGCT = 22, BLCA = 25, CESC = 29, COAD = 24, PAAD = 38, SARC = 19, KIRP = 19, DLBC = 19, MESO = 21, THYM = 24, PRAD = 21, UVM = 16, CHOL = 31, UCS = 16, LAML = 20, PCPG = 21, KICH = 18, LGG = 19, ACC = 24, GBM = 22. For the boxplot, box boundaries represent interquartile ranges, whiskers extend to the most extreme data point, which is no more than 1.5 times the interquartile range, and the line in the middle of the box represents the median. (b) The correlation between BCR binding scores and tumor antigen expression is higher for extracellular antigens than for intracellular antigens. The “all” group in (a) refers to all 32 TCGA cohorts. Numbers of putative extracellular antigens (N) are: STES = 30, STAD = 25, BRCA = 26, LUSC = 31, SKCM = 25, LUAD = 24, OV = 21, HNSC = 36, ESCA = 34, KIRC = 19, THCA = 20, TGCT = 22, BLCA = 25, CESC = 29, COAD = 24, PAAD = 38, SARC = 19, KIRP = 19, DLBC = 19, MESO = 21, THYM = 24, PRAD = 21, UVM = 16, CHOL = 31, UCS = 16, LAML = 20, PCPG = 21, KICH = 18, LGG = 19, ACC = 24, and GBM = 22. Numbers of putative intracellular antigens (N) are: STES = 30, STAD = 29, BRCA = 22, LUSC = 37, SKCM = 40, LUAD = 24, OV = 22, HNSC = 42, ESCA = 36, KIRC = 19, THCA = 19, TGCT = 64, BLCA = 31, CESC = 33, COAD = 23, PAAD = 25, SARC = 20, KIRP = 20, DLBC = 19, MESO = 19, THYM = 25, PRAD = 21, UVM = 25, CHOL = 25, UCS = 39, LAML = 34, PCPG = 43, KICH = 20, LGG = 32, ACC = 26, and GBM = 33. For (a) and (b), the individual cancer types are ranked as in Fig. 4c.
Extended Data Fig. 4 The predictive values of the Cmai binding scores.
(a–d) Patients in each cohort were dichotomized at the median Cmai BCR binding score. (a) All TCGA patients (N = 2,625 patients); (b) TCGA KIRC patients (N = 62 patients); (c) TCGA MESO patients (N = 18 patients); and (d) TCGA PAAD patients (N = 40 patients). Kaplan–Meier survival curves for the two resulting groups were compared using the log-rank test. Hazard ratios (HR) and 95% confidence intervals (CI) were obtained from Cox proportional hazards regression, with statistical significance evaluated using the likelihood ratio test.
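A median split followed by a two-group log-rank comparison can be sketched in plain Python as follows. This is an illustrative sketch only, not the analysis code used for the figure: the function names and toy data are hypothetical, patients whose score equals the median are assigned to the low group by assumption, and the Cox regression step is not reproduced.

```python
from math import erfc, sqrt
from statistics import median


def split_by_median(scores):
    """True for patients whose score is strictly above the cohort median."""
    cutoff = median(scores)
    return [s > cutoff for s in scores]


def logrank_test(times, events, in_high):
    """Two-group log-rank test; returns (chi-square statistic, P value).

    times:   follow-up time per patient.
    events:  1 if the event was observed, 0 if censored.
    in_high: group indicator, for example from split_by_median().
    """
    obs_minus_exp = 0.0
    variance = 0.0
    event_times = sorted({t for t, e in zip(times, events) if e})
    for t in event_times:
        # Patients still at risk just before time t, by group.
        n_high = sum(1 for ti, g in zip(times, in_high) if ti >= t and g)
        n_low = sum(1 for ti, g in zip(times, in_high) if ti >= t and not g)
        n_all = n_high + n_low
        # Events occurring exactly at time t.
        d_all = sum(1 for ti, e in zip(times, events) if ti == t and e)
        d_high = sum(1 for ti, e, g in zip(times, events, in_high)
                     if ti == t and e and g)
        obs_minus_exp += d_high - d_all * n_high / n_all
        if n_all > 1:  # the variance term is undefined for a single patient
            variance += (n_high * n_low * d_all * (n_all - d_all)
                         / (n_all ** 2 * (n_all - 1)))
    if variance == 0:
        return 0.0, 1.0
    chi2 = obs_minus_exp ** 2 / variance
    # Survival function of a chi-square distribution with 1 degree of freedom.
    return chi2, erfc(sqrt(chi2 / 2))


# Toy cohort (hypothetical): higher scores go with longer survival.
toy_scores = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0]
toy_times = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
toy_events = [1] * 10
toy_groups = split_by_median(toy_scores)
toy_chi2, toy_p = logrank_test(toy_times, toy_events, toy_groups)
```

For real analyses a dedicated survival library would normally be preferred; the hand-written version is shown only to make the arithmetic behind the comparison explicit.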
Extended Data Fig. 5 Binding strength comparisons of BCRs with edit distances of 3 in the “continuous” training cohort.
For each cohort, we investigated pairs of BCRs with an edit distance of 3 in the heavy chains and compared their binding strengths.
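Selecting pairs at a fixed edit distance can be sketched in Python as below. The sketch assumes that "edit distance" means the Levenshtein distance (substitutions, insertions and deletions each cost 1), which the legend does not state; the function names and toy sequences are hypothetical and are not real heavy chains.

```python
from itertools import combinations


def edit_distance(a, b):
    """Levenshtein distance between two sequences (row-by-row dynamic programming)."""
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            current.append(min(
                previous[j] + 1,               # deletion from a
                current[j - 1] + 1,            # insertion into a
                previous[j - 1] + (ca != cb),  # substitution or match
            ))
        previous = current
    return previous[-1]


def pairs_at_distance(sequences, distance=3):
    """Index pairs (i, j), i < j, whose sequences differ by exactly `distance` edits."""
    return [
        (i, j)
        for (i, a), (j, b) in combinations(enumerate(sequences), 2)
        if edit_distance(a, b) == distance
    ]


# Toy sequences (hypothetical, far shorter than real heavy chains).
toy_heavy_chains = ["CARDYW", "CAKEFW", "CARDYW"]
pairs = pairs_at_distance(toy_heavy_chains)  # [(0, 1), (1, 2)]
```

The all-pairs comparison is quadratic in the number of sequences, so a large repertoire would need some form of pre-filtering (for example by length) before applying it.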
Extended Data Fig. 6 Sensitivity in the choices of threshold values for calculating the BCR binding scores.
(a) The numbers of BCRs obtained from all TCGA tumor samples. (b) Scatterplots and correlations between the BCR binding scores calculated with slightly different threshold values, on the irAE cohort. “***” indicates a Pearson correlation test P value < 0.001. N = 4,375 BCR binding scores.
Source data
Source Data Fig. 1
Statistical source data.
Source Data Fig. 2
Statistical source data.
Source Data Fig. 3
Statistical source data.
Source Data Fig. 4
Statistical source data.
Source Data Fig. 5
Statistical source data.
Source Data Extended Data Fig. 1
Statistical source data.
Source Data Extended Data Fig. 2
Statistical source data.
Source Data Extended Data Fig. 3
Statistical source data.
Source Data Extended Data Fig. 4
Statistical source data.
Source Data Extended Data Fig. 5
Statistical source data.
Source Data Extended Data Fig. 6
Statistical source data.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Song, B., Wang, K., Na, S. et al. Profiling antigen-binding affinity of B cell repertoires in tumors by deep learning predicts immune-checkpoint inhibitor treatment outcomes. Nat Cancer 6, 1570–1584 (2025). https://doi.org/10.1038/s43018-025-01001-5
DOI: https://doi.org/10.1038/s43018-025-01001-5