ImmunoStruct enables multimodal deep learning for immunogenicity prediction

A preprint version of the article is available at bioRxiv.

Abstract

Epitope-based vaccines are promising therapeutic modalities for infectious diseases and cancer, but identifying immunogenic epitopes is challenging. Most prediction methods only use amino acid sequence information, and do not incorporate wide-scale structure data and biochemical properties across each peptide–major histocompatibility complex (MHC). We present ImmunoStruct, a deep learning model that integrates sequence, structural and biochemical information to predict multi-allele class I peptide–MHC immunogenicity. By leveraging a multimodal dataset of 26,049 peptide–MHCs, we demonstrate that ImmunoStruct improves immunogenicity prediction performance and interpretability beyond existing methods, across infectious disease epitopes and cancer neoepitopes. We further show strong alignment with in vitro assay results for a set of SARS-CoV-2 epitopes, as well as strong performance in peptide–MHC-based survival prediction for patients with cancer. Overall, this work also presents an architecture that incorporates equivariant graph processing and multimodal data integration for a long-standing challenge in immunotherapy.
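
As a rough illustration of what integrating sequence, structural and biochemical information can mean at the input level, the sketch below builds one feature vector per peptide from three toy modalities. It is not the ImmunoStruct implementation: the function names, the example peptide, the coordinates and the net-charge descriptor are all assumptions made for illustration, and plain concatenation stands in for the learned encoders described in the paper. Pairwise distances are used for the structural part because they do not change when the structure is rotated, a simpler property than the equivariant graph processing named in the abstract.

```python
import math

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def one_hot(peptide):
    """Sequence modality: flattened one-hot encoding, 20 values per residue."""
    vec = []
    for aa in peptide:
        vec.extend(1.0 if aa == ref else 0.0 for ref in AMINO_ACIDS)
    return vec


def pairwise_distances(coords):
    """Structure modality: distances between all residue pairs (i < j)."""
    n = len(coords)
    return [math.dist(coords[i], coords[j])
            for i in range(n) for j in range(i + 1, n)]


def net_charge(peptide):
    """Toy biochemical modality (assumption): +1 for K/R, -1 for D/E."""
    return float(sum((aa in "KR") - (aa in "DE") for aa in peptide))


def fuse(peptide, coords):
    """Concatenate the three modality vectors into one feature vector."""
    return one_hot(peptide) + pairwise_distances(coords) + [net_charge(peptide)]


def rotate_z(coords, theta):
    """Rotate 3D points about the z axis by theta radians."""
    c, s = math.cos(theta), math.sin(theta)
    return [(c * x - s * y, s * x + c * y, z) for x, y, z in coords]


# Hypothetical 4-residue peptide with made-up coordinates (not predicted structures).
peptide = "KLDE"
coords = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (3.8, 3.8, 0.0), (0.0, 3.8, 1.0)]

features = fuse(peptide, coords)
rotated = fuse(peptide, rotate_z(coords, 0.7))

# The fused vector is unchanged (up to rounding) when the structure is rotated.
same = all(math.isclose(a, b, abs_tol=1e-9) for a, b in zip(features, rotated))
print(len(features), same)
```

Here the fused vector has 80 sequence values, 6 distances and 1 charge value. A real model would replace each hand-written function with a trained encoder and learn how to weight the modalities.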

Fig. 1: Overview of ImmunoStruct architecture and training.
Fig. 2: Cancer–wild-type contrastive learning leverages the wild-type counterpart to better organize the latent space.
Fig. 3: ImmunoStruct improves immunogenicity prediction on the IEDB dataset of infectious disease.
Fig. 4: ImmunoStruct improves immunogenicity prediction on the CEDAR dataset of cancer neoepitopes.
Fig. 5: Experimental validation of ImmunoStruct predictions.

Data availability

The infectious disease data were obtained from IEDB at https://iedb.org/. The cancer neoepitope data were obtained from CEDAR at https://cedar.iedb.org/. The cancer survival data were obtained as previously described (ref. 46). Data are freely available via GitHub at https://github.com/KrishnaswamyLab/ImmunoStruct.

Code availability

The source code for the ImmunoStruct model and the inference scripts are available under an open-source license via GitHub at https://github.com/KrishnaswamyLab/ImmunoStruct and via Zenodo at https://doi.org/10.5281/zenodo.17535443 (ref. 68).

References

  1. Gupta, M. et al. Recent advances in cancer vaccines: challenges, achievements, and futuristic prospects. Vaccines 10, 2011 (2022).

  2. Fan, T. et al. Therapeutic cancer vaccines: advancements, challenges, and prospects. Signal Transduct. Target. Ther. 8, 450 (2023).

  3. Blass, E. & Ott, P. A. Advances in the development of personalized neoantigen-based therapeutic cancer vaccines. Nat. Rev. Clin. Oncol. 18, 215–229 (2021).

  4. Bjerregaard, A.-M. et al. An analysis of natural T cell responses to predicted tumor neoepitopes. Front. Immunol. 8, 1566 (2017).

  5. Katsikis, P. D., Ishii, K. J. & Schliehe, C. Challenges in developing personalized neoantigen cancer vaccines. Nat. Rev. Immunol. 24, 213–227 (2024).

  6. Ott, P. A. et al. A phase Ib trial of personalized neoantigen therapy plus anti-PD-1 in patients with advanced melanoma, non-small cell lung cancer, or bladder cancer. Cell 183, 347–362 (2020).

  7. Wells, D. K. et al. Key parameters of tumor epitope immunogenicity revealed through a consortium approach improve neoantigen prediction. Cell 183, 818–834 (2020).

  8. Sahin, U. et al. Personalized RNA mutanome vaccines mobilize poly-specific therapeutic immunity against cancer. Nature 547, 222–226 (2017).

  9. Pishesha, N., Harmand, T. J. & Ploegh, H. L. A guide to antigen processing and presentation. Nat. Rev. Immunol. 22, 751–764 (2022).

  10. Tynan, F. E. et al. The immunogenicity of a viral cytotoxic T cell epitope is controlled by its MHC-bound conformation. J. Exp. Med. 202, 1249–1260 (2005a).

  11. Wu, P. et al. Mechano-regulation of peptide-MHC class I conformations determines TCR antigen recognition. Mol. Cell 73, 1015–1027 (2019).

  12. Weber, J. K. et al. Unsupervised and supervised AI on molecular dynamics simulations reveals complex characteristics of HLA-A2-peptide immunogenicity. Brief. Bioinforma. 25, bbad504 (2024).

  13. Reynisson, B., Alvarez, B., Paul, S., Peters, B. & Nielsen, M. NetMHCpan-4.1 and NetMHCIIpan-4.0: improved predictions of MHC antigen presentation by concurrent motif deconvolution and integration of MS MHC eluted ligand data. Nucleic Acids Res. 48, W449–W454 (2020).

  14. Shao, X. M. et al. High-throughput prediction of MHC class I and II neoantigens with MHCnuggets. Cancer Immunol. Res. 8, 396–408 (2020).

  15. O’Donnell, T. J., Rubinsteyn, A. & Laserson, U. MHCflurry 2.0: improved pan-allele prediction of MHC class I-presented peptides by incorporating antigen processing. Cell Syst. 11, 42–48 (2020).

  16. Albert, B. A. et al. Deep neural networks predict class I major histocompatibility complex epitope presentation and transfer learn neoepitope immunogenicity. Nat. Mach. Intell. 5, 861–872 (2023).

  17. Saotome, K. et al. Structural analysis of cancer-relevant TCR-CD3 and peptide-MHC complexes by cryoEM. Nat. Commun. 14, 2401 (2023).

  18. Jiang, D. et al. NeoaPred: a deep-learning framework for predicting immunogenic neoantigen based on surface and structural features of peptide–human leukocyte antigen complexes. Bioinformatics 40, btae547 (2024).

  19. Vita, R. et al. The Immune Epitope Database (IEDB): 2018 update. Nucleic Acids Res. 47, D339–D343 (2018).

  20. Koşaloğlu-Yalçın, Z. et al. The Cancer Epitope Database and Analysis Resource (CEDAR). Nucleic Acids Res. 51, D845–D852 (2023).

  21. Jumper, J. et al. Highly accurate protein structure prediction with AlphaFold. Nature 596, 583–589 (2021).

  22. Gfeller, D. et al. Improved predictions of antigen presentation and TCR recognition with MixMHCpred2.2 and PRIME2.0 reveal potent SARS-CoV-2 CD8+ T-cell epitopes. Cell Syst. 14, 72–83 (2023).

  23. Kim, J. Y., Bang, H., Noh, S.-J. & Choi, J. K. DeepNeo: a webserver for predicting immunogenic neoantigens. Nucleic Acids Res. 51, W134–W140 (2023).

  24. Satorras, V. G., Hoogeboom, E. & Welling, M. E(n) equivariant graph neural networks. In Proc. International Conference on Machine Learning 9323–9332 (PMLR, 2021).

  25. Kipf, T. N. & Welling, M. Semi-supervised classification with graph convolutional networks. In Proc. 5th International Conference on Learning Representations (ICLR, 2017).

  26. Vaswani, A. et al. Attention is all you need. In Proc. Advances in Neural Information Processing Systems 5998–6008 (NIPS, 2017).

  27. Xiao, X. et al. HGTDP-DTA: hybrid graph-transformer with dynamic prompt for drug-target binding affinity prediction. In Proc. International Conference on Neural Information Processing 340–354 (Springer, 2024).

  28. Kingma, D. P. Auto-encoding Variational Bayes. In Proc. International Conference on Learning Representations (ICLR, 2014).

  29. Liao, D. et al. RNAGenScape: property-guided optimization and interpolation of mRNA sequences with manifold Langevin dynamics. Preprint at https://doi.org/10.48550/arXiv.2510.24736 (2025).

  30. Liu, C. et al. DiffKillR: killing and recreating diffeomorphisms for cell annotation in dense microscopy images. In Proc. IEEE International Conference on Acoustics, Speech and Signal Processing 1–5 (IEEE, 2025).

  31. Sun, X. et al. Geometry-aware generative autoencoders for warped Riemannian metric learning and generative modeling on data manifolds. In Proc. International Conference on Artificial Intelligence and Statistics (PMLR, 2024).

  32. Osorio, D., Rondón-Villarreal, P. & Torres, R. Peptides: a package for data mining of antimicrobial peptides. R J. 7, 4–14 (2015).

  33. Zhuang, F. et al. A comprehensive survey on transfer learning. Proc. IEEE 109, 43–76 (2020).

  34. Richman, L. P., Vonderheide, R. H. & Rech, A. J. Neoantigen dissimilarity to the self-proteome predicts immunogenicity and response to immune checkpoint blockade. Cell Syst. 9, 375–382 (2019).

  35. Łuksza, M. et al. A neoantigen fitness model predicts tumour response to checkpoint blockade immunotherapy. Nature 551, 517–520 (2017).

  36. Balachandran, V. P. et al. Identification of unique neoantigen qualities in long-term survivors of pancreatic cancer. Nature 551, 512–516 (2017).

  37. Perry, J. S. A. & Hsieh, C.-S. Development of T-cell tolerance utilizes both cell-autonomous and cooperative presentation of self-antigen. Immunol. Rev. 271, 141–155 (2016).

  38. Ghorani, E. et al. Differential binding affinity of mutated peptides for MHC class I is a predictor of survival in advanced lung cancer and melanoma. Ann. Oncol. 29, 271–279 (2018).

  39. Moon, K. R. et al. Visualizing structure and transitions in high-dimensional biological data. Nat. Biotechnol. 37, 1482–1492 (2019).

  40. Tadros, D. M., Eggenschwiler, S., Racle, J. & Gfeller, D. The MHC motif atlas: a database of MHC binding specificities and ligands. Nucleic Acids Res. 51, D428–D437 (2023).

  41. Sidney, J., Peters, B., Frahm, N., Brander, C. & Sette, A. HLA class I supertypes: a revised and updated classification. BMC Immunol. 9, 1 (2008).

  42. Nguyen, A. T., Szeto, C. & Gras, S. The pockets guide to HLA class I molecules. Biochem. Soc. Trans. 49, 2319–2331 (2021).

  43. Tynan, F. E. et al. T cell receptor recognition of a ‘super-bulged’ major histocompatibility complex class I-bound peptide. Nat. Immunol. 6, 1114–1122 (2005b).

  44. Tynan, F. E. et al. A T cell receptor flattens a bulged antigenic peptide presented by a major histocompatibility complex class I molecule. Nat. Immunol. 8, 268–276 (2007).

  45. Ding, Y.-H. et al. Two human T cell receptors bind in a similar diagonal mode to the HLA-A2/Tax peptide complex using different TCR amino acids. Immunity 8, 403–411 (1998).

  46. Borch, A. et al. IMPROVE: a feature model to predict neoepitope immunogenicity through broad-scale validation of T-cell recognition. Front. Immunol. 15, 1360281 (2024).

  47. Rappaport, A. R. et al. A shared neoantigen vaccine combined with immune checkpoint blockade for advanced metastatic solid tumors: phase 1 trial interim results. Nat. Med. 30, 1013–1022 (2024).

  48. Chen, J.-L. et al. Structural and kinetic basis for heightened immunogenicity of T cell vaccines. J. Exp. Med. 201, 1243–1255 (2005).

  49. Lu, D. et al. KRAS G12V neoantigen specific T cell receptor for adoptive T cell therapy against tumors. Nat. Commun. 14, 6389 (2023).

  50. Poole, A. et al. Therapeutic high affinity T cell receptor targeting a KRASG12D cancer neoantigen. Nat. Commun. 13, 5333 (2022).

  51. Nusrat, F. et al. The clinical implications of KRAS mutations and variant allele frequencies in pancreatic ductal adenocarcinoma. J. Clin. Med. 13, 2103 (2024).

  52. Weiskopf, D. et al. Comprehensive analysis of dengue virus-specific responses supports an HLA-linked protective role for CD8+ T cells. Proc. Natl Acad. Sci. USA 110, E2046–E2053 (2013).

  53. Hinton, G. E. & Salakhutdinov, R. R. Reducing the dimensionality of data with neural networks. Science 313, 504–507 (2006).

  54. Liu, C. et al. ImageFlowNet: forecasting multiscale trajectories of disease progression with irregularly-sampled longitudinal medical images. In Proc. IEEE International Conference on Acoustics, Speech and Signal Processing 1–5 (IEEE, 2025).

  55. Liu, C. et al. CUTS: a deep learning and topological framework for multigranular unsupervised medical image segmentation. In Proc. International Conference on Medical Image Computing and Computer-Assisted Intervention 155–165 (Springer, 2024).

  56. Paszke, A. et al. PyTorch: an imperative style, high-performance deep learning library. In Proc. Advances in Neural Information Processing Systems 32, 8026–8037 (NIPS, 2019).

  57. Chen, T., Kornblith, S., Norouzi, M. & Hinton, G. A simple framework for contrastive learning of visual representations. In Proc. International Conference on Machine Learning 1597–1607 (PMLR, 2020).

  58. Zbontar, J., Jing, L., Misra, I., LeCun, Y. & Deny, S. Barlow twins: self-supervised learning via redundancy reduction. In Proc. International Conference on Machine Learning 12310–12320 (PMLR, 2021).

  59. Loshchilov, I. & Hutter, F. Decoupled weight decay regularization. In Proc. International Conference on Learning Representations (ICLR, 2019).

  60. Loshchilov, I. & Hutter, F. SGDR: stochastic gradient descent with warm restarts. In Proc. International Conference on Learning Representations (ICLR, 2017).

  61. Sette, A. & Sidney, J. Nine major HLA class I supertypes account for the vast preponderance of HLA-A and -B polymorphism. Immunogenetics 50, 201–212 (1999).

  62. Mirdita, M. et al. ColabFold: making protein folding accessible to all. Nat. Methods 19, 679–682 (2022).

  63. Jamasb, A. et al. Graphein—a Python library for geometric deep learning and network analysis on biomolecular structures and interaction networks. In Proc. Advances in Neural Information Processing Systems 35, 27153–27167 (NIPS, 2022).

  64. Zaidi, N. et al. Role of in silico structural modeling in predicting immunogenic neoepitopes for cancer vaccine development. JCI Insight 5, (2020).

  65. Slota, M., Lim, J.-B., Dang, Y. & Disis, M. L. ELISpot for measuring human immune responses to vaccines. Expert Rev. Vaccines 10, 299–306 (2011).

  66. Yang, F. et al. Validation of an IFN-gamma ELISpot assay to measure cellular immune responses against viral antigens in non-human primates. Gene Ther. 29, 41–54 (2022).

  67. Nelde, A. et al. SARS-CoV-2-derived peptides define heterologous and COVID-19-induced T cell recognition. Nat. Immunol. 22, 74–85 (2021).

  68. Givechian, K. B. et al. ImmunoStruct: ImmunoStruct release. Zenodo https://doi.org/10.5281/zenodo.17535443 (2025).

Acknowledgements

This work was supported by the National Science Foundation (grant nos. NSF Career 2047856, NSF IIS 2473317 and NSF DMS 2327211) (S.K.), the National Institutes of Health (grant nos. NIH 1R01GM130847-01A1 and NIH 1R01GM135929-01) (S.K.) and The Colton Center for Autoimmunity at Yale University (S.K.).

Author information

Contributions

K.B.G., A.I. and S.K. identified the research problem and designed this work. K.B.G. collected and cleaned the IEDB and CEDAR data. K.B.G., J.F.R., C.L., E.Y., R.Y., A.I. and S.K. conceived the experiments. K.B.G., J.F.R., C.L. and E.Y. developed ImmunoStruct. C.L. and K.B.G. conceived, designed and developed the cancer–wild-type contrastive learning. K.B.G., J.F.R., C.L. and E.Y. trained and evaluated ImmunoStruct. K.B.G., K.G. and E.C. performed the experimental validation on SARS-CoV-2. K.G. and J.F.R. performed the clinical validation. K.B.G. and S.T. conducted the peptide–TCR contact analysis. K.B.G., J.F.R. and C.L. performed the data analysis. K.B.G., C.L. and S.T. produced the figures. All authors participated in the discussion and wrote the paper.

Corresponding authors

Correspondence to Akiko Iwasaki or Smita Krishnaswamy.

Ethics declarations

Competing interests

S.K. is the chief scientific officer of Latent Alpha and cofounder of Ascent Bio. A.I. is a member of the board of directors of Roche Holding Ltd and of Genentech. A.I. cofounded RIGImmune, Xanadu Bio and PanV. R.Y. is an Amazon Scholar. E.C. is cofounder of Neomabs Biotechnologies Inc. The other authors declare no competing interests.

Peer review

Peer review information

Nature Machine Intelligence thanks the anonymous reviewer(s) for their contribution to the peer review of this work. Peer reviewer reports are available.

Supplementary information

Supplementary Information

Supplementary Figs. 1–5 and Supplementary Tables 1–5.

Reporting Summary

Peer Review File

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Givechian, K.B., Rocha, J.F., Liu, C. et al. ImmunoStruct enables multimodal deep learning for immunogenicity prediction. Nat Mach Intell 8, 70–83 (2026). https://doi.org/10.1038/s42256-025-01163-y

  • DOI: https://doi.org/10.1038/s42256-025-01163-y
