Abstract
Background
Accurate diagnosis of hematologic malignancies from peripheral blood smears (PBSs) requires integrating cellular morphology and composition across numerous white blood cells. Existing computational approaches predominantly automate single-cell classifications and do not provide holistic, slide-level diagnostic predictions.
Methods
We present a framework that employs a high-performance cell-based encoder (DeepHeme) for feature extraction, integrated with our weakly supervised, attention-based multiple instance learning (MIL) model, termed CAREMIL (Cell AggRegation, Explainable, Multiple Instance Learning). Through comprehensive evaluations of leading image encoders and MIL architectures, the combination of DeepHeme and CAREMIL demonstrated superior performance on disease classification tasks. CAREMIL functions as a robust aggregation mechanism, consistently outperforming established slide-level MIL methods (gated MIL and Dual-stream MIL Network) across multiple encoder types. The most pronounced performance gains were observed with out-of-domain encoders, including ImageNet-pretrained and open-source pathology foundation models (UNI2 and Virchow2).
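As a rough illustration of attention-based MIL aggregation, the sketch below pools per-cell embeddings into one slide-level embedding, following the general attention pooling formulation of Ilse et al. (listed in the References). It is not the CAREMIL architecture. The function name `attention_pool`, the weights `V` and `w`, and the toy embeddings are all hypothetical. In practice the weights would be learned from slide-level labels and the embeddings would come from a cell encoder such as DeepHeme.

```python
import math

def attention_pool(cell_embeddings, V, w):
    """Attention-based MIL pooling in the style of Ilse et al. (2018).

    cell_embeddings: list of K vectors (one per cell), each of length D.
    V: hidden projection, L rows of length D (hypothetical learned weights).
    w: scoring vector of length L (hypothetical learned weights).
    Returns (slide_embedding, attention_weights).
    """
    scores = []
    for h in cell_embeddings:
        # Hidden representation of one cell: tanh(V h).
        hidden = [math.tanh(sum(v_j * h_j for v_j, h_j in zip(row, h))) for row in V]
        # Unnormalised attention score: w^T tanh(V h).
        scores.append(sum(w_i * u_i for w_i, u_i in zip(w, hidden)))

    # Softmax over cells, shifted by the max score for numerical stability.
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    attention = [e / total for e in exps]

    # Slide embedding = attention-weighted sum of the cell embeddings.
    dim = len(cell_embeddings[0])
    slide = [sum(a * h[d] for a, h in zip(attention, cell_embeddings)) for d in range(dim)]
    return slide, attention

# Toy bag of three "cells" with 2-D embeddings (illustrative values only).
bag = [[0.1, 0.0], [0.2, 0.1], [2.0, 1.5]]
V = [[1.0, 0.0], [0.0, 1.0]]
w = [1.0, 1.0]

slide_embedding, attention = attention_pool(bag, V, w)
# Index of the cell that received the largest attention weight.
top_cell = max(range(len(bag)), key=lambda k: attention[k])
```

Because the attention weights sum to one across the cells of a slide, they can be read as a relative ranking of cells, which is the property that makes this family of models inspectable at the cell level.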
Results
CAREMIL combined with DeepHeme achieves the highest diagnostic accuracy across acute myeloid leukemia (AML), myelodysplastic syndromes (MDS), and hairy cell leukemia (HCL), with AUROCs of 0.999, 0.891, and 0.945, respectively, and successfully identifies AML even in cases with minimal or absent circulating blasts. Attention values assigned by CAREMIL highlight diagnostically relevant cells and reveal disease-specific morphometric patterns, enabling biological interpretability and case-level insights. The framework remains resilient to individual cell misclassifications and does not require explicit cell-level supervision.
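For readers less familiar with the metric, the following minimal sketch computes an AUROC as the probability that a randomly chosen positive slide receives a higher score than a randomly chosen negative one. The labels and scores are made up for illustration and are not study data. The disease-versus-rest framing is an assumption, since the abstract does not state how the per-disease AUROCs were computed.

```python
def auroc(labels, scores):
    """AUROC as the probability that a positive slide outscores a negative one.

    Ties count as half a win (Mann-Whitney U formulation).
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUROC needs at least one positive and one negative slide")

    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))

# Made-up slide-level scores for one disease-vs-rest task (not study data).
labels = [1, 1, 1, 0, 0, 0, 0]
scores = [0.92, 0.80, 0.35, 0.40, 0.20, 0.10, 0.05]
value = auroc(labels, scores)
```

In this toy example, 11 of the 12 positive-negative pairs are ordered correctly, so `value` is 11/12.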
Conclusions
These findings establish CAREMIL as an effective and interpretable MIL framework for hematologic slide diagnosis, extendable to bone marrow aspirates, cytology, and other liquid biopsy specimens, supporting a shift toward quantitative, morphology-informed hematologic diagnostics.
Plain Language Summary
Computational models can be used to analyse images of cells and tissues taken from people with cancer. Most models are designed for images of solid tissue and do not base their analysis on the appearance of individual cells. As a result, they work less well on liquid samples such as blood. We developed a system to detect blood cancers from images of blood and found it worked better than models previously developed for solid tissue analysis. Our computational model could also be used by other researchers to discover additional diseases detectable from blood samples or be expanded to enable population-scale blood cancer screening.
Data availability
For follow-up instructions, please contact Chad Vanderbilt (vanderbc@mskcc.org) or Gregory Goldgof (goldgofg@mskcc.org) to initiate a data transfer agreement (DTA) with the MSKCC Legal Department. The DTA process takes 6–9 months and is at the discretion of the legal department. No protected health information will be shared under any circumstances. The source data for Fig. 3 are in Supplementary Data 1. The source data for Fig. 4 are in Supplementary Data 2.
Code availability
The code for CAREMIL is available on GitHub and has been archived on Zenodo (ref. 37).
References
Kimura, K. et al. Automated diagnostic support system with deep learning algorithms for distinction of Philadelphia chromosome-negative myeloproliferative neoplasms using peripheral blood specimen. Sci. Rep. 11, 3367 (2021).
Foucar, K. et al. Concordance among hematopathologists in classifying blasts plus promonocytes: a bone marrow pathology group study. Int. J. Lab. Hematol. 42, 418–422 (2020).
Döhner, H., Weisdorf, D. J. & Bloomfield, C. D. Acute myeloid leukemia. N. Engl. J. Med. 373, 1136–1152 (2015).
Pelcovits, A. & Niroula, R. Acute myeloid leukemia: a review. Rhode Isl. Med. J. 103, 38–40 (2020).
Wintrobe, M. M. Clinical hematology. Acad. Med. 37, 78 (1962).
Chase, M. L. et al. Consensus recommendations on peripheral blood smear review: defining curricular standards and fellow competency. Blood Adv. 7, 3244–3252 (2023).
Campo, E. et al. The international consensus classification of mature lymphoid neoplasms: a report from the clinical advisory committee. Blood 140, 1229–1253 (2022).
Sidhom, J.-W. et al. Deep learning for distinguishing morphological features of acute promyelocytic leukemia. Blood 136, 10–12 (2020).
Matek, C., Schwarz, S., Spiekermann, K. & Marr, C. Human-level recognition of blast cells in acute myeloid leukaemia with convolutional neural networks. Nat. Mach. Intell. 1, 538–544 (2019).
Wang, J. Deep learning in hematology: from molecules to patients. Clin. Hematol. Int. 6, 19–42 (2024).
Campanella, G. et al. Clinical-grade computational pathology using weakly supervised deep learning on whole slide images. Nat. Med. 25, 1301–1309 https://www.nature.com/articles/s41591-019-0508-1 (2019).
Lu, M. Y. et al. Data-efficient and weakly supervised computational pathology on whole-slide images. Nat. Biomed. Eng. 5, 555–570 (2021).
Landau, M. S. & Pantanowitz, L. Artificial intelligence in cytopathology: a review of the literature and overview of commercial landscape. J. Am. Soc. Cytopathol. 8, 230–241 (2019).
Sun, S. et al. DeepHeme, a high-performance, generalizable deep ensemble for bone marrow morphometry and hematologic diagnosis. Sci. Transl. Med. 17, eadq2162 (2025).
Matek, C., Krappe, S., Münzenmayer, C., Haferlach, T. & Marr, C. Highly accurate differentiation of bone marrow cell morphologies using deep neural networks on a large image dataset. Blood https://www.sciencedirect.com/science/article/pii/S0006497121013975 (2021).
Song, A. H. et al. Morphological Prototyping for Unsupervised Slide Representation Learning in Computational Pathology. In 2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 11566–11578 (2024).
Ilse, M., Tomczak, J. & Welling, M. Attention-based deep multiple instance learning. In Proc. 35th International Conference on Machine Learning, 2127–2136 (PMLR, 2018).
Gadermayr, M. & Tschuchnig, M. Multiple instance learning for digital pathology: a review of the state-of-the-art, limitations & future potential. Comput. Med. Imaging Graph. 112, 102337 (2024).
Javed, S. A. et al. Additive MIL: intrinsically interpretable multiple instance learning for pathology. Adv. Neural Inf. Process. Syst. 35, 20689–20702 (2022).
Deng, R. et al. Cross-scale multi-instance learning for pathological image diagnosis. Med. Image Anal. 94, 103124 (2024).
Manescu, P. et al. Detection of acute promyelocytic leukemia in peripheral blood and bone marrow with annotation-free deep learning. Sci. Rep. 13, 2562 (2023).
Reis, D., Kupec, J., Hong, J. & Daoudi, A. Real-Time Flying Object Detection with YOLOv8. Preprint at https://doi.org/10.48550/ARXIV.2305.09972 (2023).
Sun, S. et al. DeepHeme, a high-performance, generalizable deep ensemble for bone marrow morphometry and hematologic diagnosis. Sci. Transl. Med. 17, eadq2162 (2025).
Kraus, O. Z., Ba, L. J. & Frey, B. Classifying and segmenting microscopy images using convolutional multiple instance learning. Bioinformatics 32, i52–i59 (2016).
Carmichael, I. et al. Incorporating intratumoral heterogeneity into weakly-supervised deep learning models via variance pooling. In International Conference on Medical Image Computing and Computer-Assisted Intervention, 387–397 (Springer, 2022).
Mingote, V., Miguel, A., Ortega, A. & Lleida, E. Class token and knowledge distillation for multi-head self-attention speaker verification systems. Digit. Signal Process. 133, 103859 (2023).
Devlin, J., Chang, M.-W., Lee, K. & Toutanova, K. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding. Preprint at https://doi.org/10.48550/arXiv.1810.04805 (2019).
Dosovitskiy, A. et al. An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale. Preprint at https://doi.org/10.48550/arXiv.2010.11929 (2021).
Chen, R. J. et al. Towards a general-purpose foundation model for computational pathology. Nat. Med. 30, 850–862 (2024).
Zimmermann, E. et al. Virchow2: Scaling Self-Supervised Mixed Magnification Models in Pathology. Preprint at https://doi.org/10.48550/arXiv.2408.00738 (2024).
Li, B., Li, Y. & Eliceiri, K. W. Dual-stream multiple instance learning network for whole slide image classification with self-supervised contrastive learning. In 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 14313–14323 (IEEE, 2021).
Srisuwananukorn, A., Salama, M. E. & Pearson, A. T. Deep learning applications in visual data for benign and malignant hematologic conditions: a systematic review and visual glossary. Haematologica 108, 1993–2010 (2023).
Araújo, D. J. et al. Key Patches Are All You Need: A Multiple Instance Learning Framework For Robust Medical Diagnosis. In 2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW) 5231–5240 (IEEE, Seattle, WA, USA, 2024).
Gaube, S. et al. Non-task expert physicians benefit from correct explainable AI advice when reviewing X-rays. Sci. Rep. 13, 1383 (2023).
Khoury, J. D. et al. The 5th edition of the World Health Organization classification of haematolymphoid tumours: myeloid and histiocytic/dendritic neoplasms. Leukemia 36, 1703–1719 (2022).
Wong, B., Hong, S. & Yi, M. Rethinking Pre-Trained Feature Extractor Selection in Multiple Instance Learning for Whole Slide Image Classification. 5 (2025). https://doi.org/10.1109/ISBI60581.2025.10981015.
Singi, S. et al. CAREMIL: Interpretable multiple instance learning for hematologic diagnosis [Code]. Zenodo https://doi.org/10.5281/zenodo.18784189 (2026).
Acknowledgements
This research was supported by funding from the Cancer Center Support Grant from the NIH/NCI (P30CA008748), the Warren Alpert Foundation through the Warren Alpert Center for Digital and Computational Pathology at Memorial Sloan Kettering Cancer Center, and the MSK Technology Development Fund. We extend our gratitude to the MSK Hematopathology and Digital Pathology Services for their contributions.
Author information
Contributions
Conceptualization: Si.S., G.M.G., and C.V. Methodology: Si.S., Sh.S., Z.Y., R.G., D.C.W., N.K., S.N., N.S., J.G.V.C., B.F., E.S.Y., Ar.S., C.C.J., I.C., G.M.G., and C.V. Experiments: Si.S., Sh.S., G.M.G., and C.V. Data curation and acquisition: G.M.G., I.S.I., K.H.B., J.B., L.W., M.Z., L.B., M.F., M.Y., W.X., La.M., M.R., A.D., and O.A. Manuscript writing: Si.S., Sh.S., I.C., G.M.G., and C.V. Funding acquisition: G.M.G., C.V., and O.A. Review and approval of manuscript: Si.S., Sh.S., Z.Y., R.G., D.C.W., K.H.B., D.D., L.W., N.K., S.N., N.S., J.G.V.C., B.F., S.P., E.S.Y., A.K., A.S., A.M., J.B., I.S.I., C.C.J., An.C., L.B., D.K., B.K., M.F., Al.C., M.Y., S.I.M., M.Z., S.M., O.A., La.M., W.X., M.R., O.L., A.D., I.C., C.V., and G.M.G.
Ethics declarations
Competing interests
L.B. reports stock ownership in Exact Sciences. M.Y. reports consulting for Janssen Research and Development. S.M. reports equity interest in Daboia Consulting LLC and professional services for Janssen Pharmaceuticals, Medical Case Management Group, North American Thrombosis Forum, and Physicians’ Education Resource. W.X. reports research support from Stemline Therapeutics. M.R. reports serving as a scientific advisory board member with equity support at Auron Pharmaceutical, research funding from Celularity, Roche-Genentech, Beat AML, and NGM, and travel funding from BD Biosciences. O.L. reports consulting fees from Janssen Biotech and Hologic, and support for professional activities from the American Society of Cytopathology. A.D. reports consulting for Seattle Genetics, Takeda, EUSA Pharma, AbbVie, Peerview, Physicians’ Education Resource, Incyte, and Loxo, as well as research support from Roche and Takeda. C.V. reports equity interest and intellectual property rights in Paige.AI, Inc., and a consulting and advisory role for Paige.AI, Inc. G.M.G. reports equity interest in HemeAI, Inc. The following authors declare no competing interests: Si.S., Sh.S., Z.Y., R.G., D.W., K.B., D.D., L.W., N.K., S.N., N.S., J.C., B.F., S.P., E.Y., A.K., Ar.S., A.M., J.B., I.I., C.C.J., Al.C., Ai.S., D.K., B.K., M.F., An.C., S.M., M.Z., O.A., La.M., and I.C.
Peer review
Peer review information
Communications Medicine thanks Hanxiang Ma, Tao Chen and the other, anonymous, reviewer(s) for their contribution to the peer review of this work. A peer review file is available.
Additional information
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Singi, S., Sun, S., Yin, Z. et al. Interpretable multiple instance learning for hematologic diagnosis from peripheral blood smears. Commun Med (2026). https://doi.org/10.1038/s43856-026-01558-x
DOI: https://doi.org/10.1038/s43856-026-01558-x