Abstract
Venous thromboembolism (VTE) is a leading cause of preventable death among patients undergoing systemic treatment for cancer. Studies suggest that treatment strategies such as direct oral anticoagulant administration can significantly reduce the likelihood of VTE, so identifying patients at high risk is critical. Leveraging electronic health records (EHRs) from the U.S. Veterans Affairs (VA) healthcare system, we developed a transformer model to predict VTE risk in 80,808 cancer patients following the initiation of systemic treatment. The model uses longitudinal diagnostic codes, laboratory values, and demographic data. It dynamically predicts VTE risk at quarterly (3-month) intervals over the year following initiation of systemic treatment, with performance improving progressively across quarters (AUC: 0.68–0.77). The model performs similarly on an external validation cohort of 9,752 patients from the Harris Health System (HHS) (AUC: 0.68–0.74). By improving its predictions as a patient’s history evolves, this dynamic model surpasses prior static risk scores and better supports actionable decisions deeper into the treatment course.
Acknowledgements
The authors thank the referees for helpful comments during the revision process. The authors also thank Catherine Dorece for her help throughout the submission process.
Ethics declarations
Competing interests
J.M.G., J.L., N.R.F. report research funding to institution from Merck and Bayer.
Additional information
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
He, T., Zheng, C., La, J. et al. A deep learning model to dynamically predict cancer-associated thromboembolism in large-scale healthcare systems. npj Digit. Med. (2026). https://doi.org/10.1038/s41746-026-02730-2
DOI: https://doi.org/10.1038/s41746-026-02730-2


