Interpretable multiple instance learning for hematologic diagnosis from peripheral blood smears

Singi, Siddharth; Sun, Shenghuan; Yin, Zhanghan; Gupta, Riya; Webb, Dylan C.; Bilal, Khawaja H.; Dilip, Deepika; Wang, Linlin; Kumar, Neeraj; Nanda, Swaraj; Sanchez, Nicolas; Cleave, Jacob G. Van; Fried, Brenda; Paulsen, Sean; Yan, Ethan S.; Kamali, Ali; Sarkar, Argho; Manzo, Allyne; Baik, Jeeyeon; Isgor, Irem S.; Colorado-Jimenez, Cesar; Cardillo, Anthony; Boiocchi, Leonardo; Syed, Aijazuddin; Kim, David; Kezlarian-Sachs, Brie; Fenelus, Maly; Chan, Alexander; Yabe, Mariko; McCash, Samuel I.; Zhu, Menglei; Mantha, Simon; Ardon, Orly; McVoy, Lauren; Xiao, Wenbin; Roshal, Mikhail; Lin, Oscar; Dogan, Ahmet; Carmichael, Iain; Vanderbilt, Chad; Goldgof, Gregory M.

doi:10.1038/s43856-026-01558-x

Article
Open access
Published: 15 April 2026

Communications Medicine (2026)

We are providing an unedited version of this manuscript to give early access to its findings. Before final publication, the manuscript will undergo further editing. Please note there may be errors present which affect the content, and all legal disclaimers apply.

Abstract

Background

Accurate diagnosis of hematologic malignancies from peripheral blood smears (PBSs) requires integrating cellular morphology and composition across numerous white blood cells. Existing computational approaches predominantly automate single-cell classifications and do not provide holistic, slide-level diagnostic predictions.

Methods

We present a framework that employs a high-performance cell-based encoder (DeepHeme) for feature extraction, integrated with our weakly supervised, attention-based multiple instance learning (MIL) model, termed CAREMIL (Cell AggRegation, Explainable, Multiple Instance Learning). Through comprehensive evaluations of leading image encoders and MIL architectures, the combination of DeepHeme and CAREMIL demonstrated superior performance on disease classification tasks. CAREMIL functions as a robust aggregation mechanism, consistently outperforming established slide-level MIL methods (gated MIL and Dual-stream MIL Network) across multiple encoder types. The most pronounced performance gains were observed with out-of-domain encoders, including ImageNet-pretrained and open-source pathology foundation models (UNI2 and Virchow2).
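The attention-based aggregation step described above can be illustrated with a minimal sketch of attention pooling in the style of Ilse et al. (2018), which is cited in the reference list. This is not the CAREMIL architecture itself: the function name `attention_pool`, the projection matrix `V`, the scoring vector `w`, and the toy embeddings are all assumptions made for illustration, and a real model would learn `V` and `w` from slide-level labels.

```python
import math

def attention_pool(cell_embeddings, V, w):
    """Attention-based MIL pooling in the style of Ilse et al. (2018).

    cell_embeddings: list of K vectors (one per cell), each of length D
    V: hypothetical projection matrix, given as rows of length D
    w: hypothetical scoring vector with one entry per row of V
    Returns (slide_embedding, attention_weights).
    """
    scores = []
    for h in cell_embeddings:
        # Project the cell embedding and apply tanh, then score with w
        hidden = [math.tanh(sum(v_i * h_i for v_i, h_i in zip(row, h))) for row in V]
        scores.append(sum(w_j * t_j for w_j, t_j in zip(w, hidden)))

    # Softmax over cells (shifted by the max for numerical stability)
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    attention = [e / total for e in exps]

    # Slide-level embedding: attention-weighted sum of cell embeddings
    dim = len(cell_embeddings[0])
    slide = [sum(a * h[d] for a, h in zip(attention, cell_embeddings)) for d in range(dim)]
    return slide, attention

# Toy bag of three "cells" with 2-D embeddings (made-up numbers)
cells = [[0.1, 0.0], [0.2, 0.1], [2.0, 1.5]]
V = [[1.0, 0.0], [0.0, 1.0]]
w = [1.0, 1.0]
slide_vec, attn = attention_pool(cells, V, w)
```

In this sketch the attention weights sum to one across the cells of a slide, so each weight can be read as that cell's relative contribution to the slide-level representation.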

Results

CAREMIL combined with DeepHeme achieves the highest diagnostic accuracy across acute myeloid leukemia (AML), myelodysplastic syndromes (MDS), and hairy cell leukemia (HCL), with AUROCs of 0.999, 0.891, and 0.945, respectively, and successfully identifies AML even in cases with minimal or absent circulating blasts. Attention values assigned by CAREMIL highlight diagnostically relevant cells and reveal disease-specific morphometric patterns, enabling biological interpretability and case-level insights. The framework remains resilient to individual cell misclassifications and does not require explicit cell-level supervision.
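For readers less familiar with the AUROC metric reported above, the following minimal sketch computes it from its rank-based definition: the probability that a randomly chosen positive slide receives a higher score than a randomly chosen negative slide, with ties counted as one half. The function name `auroc` and the labels and scores are illustrative assumptions and are not data from the study.

```python
def auroc(labels, scores):
    """AUROC via pairwise comparison of positive and negative scores.

    labels: list of 0/1 ground-truth labels (1 = disease present)
    scores: list of model scores, higher meaning more likely positive
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # ties count as half
    return wins / (len(pos) * len(neg))

# Made-up slide-level labels and scores for illustration only
labels = [1, 1, 0, 0, 0]
scores = [0.9, 0.4, 0.5, 0.2, 0.1]
value = auroc(labels, scores)  # 5 of 6 positive/negative pairs are ordered correctly
```

The pairwise loop is quadratic in the number of slides; it is chosen here for clarity rather than efficiency.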

Conclusions

These findings establish CAREMIL as an effective and interpretable MIL framework for hematologic slide diagnosis, extendable to bone marrow aspirates, cytology, and other liquid biopsy specimens, supporting a shift toward quantitative, morphology-informed hematologic diagnostics.

Plain Language Summary

Computational models can be used to analyse images of cells and tissues taken from people with cancer. Most models are designed for images of solid tissue and do not base their analysis on the appearance of individual cells. This means they do not work so well on liquid samples such as blood samples. We developed a system to detect blood cancers from images of blood and found it worked better than models previously developed for solid tissue analysis. Our computational model could also be used by other researchers to discover additional diseases detectable from blood samples or be expanded to enable population-scale blood cancer screening.

Data availability

For follow-up instructions, please contact Chad Vanderbilt (vanderbc@mskcc.org) or Gregory Goldgof (goldgofg@mskcc.org) to initiate a data transfer agreement (DTA) with the MSKCC Legal Department. The DTA process will take 6–9 months and is at the discretion of the legal department. No protected health information will be shared under any circumstances. The source data for Fig. 3 are in Supplementary Data 1. The source data for Fig. 4 are in Supplementary Data 2.

Code availability

The code for CAREMIL is available on GitHub and has been archived on Zenodo³⁷.

References

Kimura, K. et al. Automated diagnostic support system with deep learning algorithms for distinction of Philadelphia chromosome-negative myeloproliferative neoplasms using peripheral blood specimen. Sci. Rep. 11, 3367 (2021).
Foucar, K. et al. Concordance among hematopathologists in classifying blasts plus promonocytes: a bone marrow pathology group study. Int. J. Lab. Hematol. 42, 418–422 (2020).
Döhner, H., Weisdorf, D. J. & Bloomfield, C. D. Acute myeloid leukemia. N. Engl. J. Med. 373, 1136–1152 (2015).
Pelcovits, A. & Niroula, R. Acute myeloid leukemia: a review. Rhode Isl. Med. J. 103, 38–40 (2020).
Wintrobe, M. M. Clinical hematology. Acad. Med. 37, 78 (1962).
Chase, M. L. et al. Consensus recommendations on peripheral blood smear review: defining curricular standards and fellow competency. Blood Adv. 7, 3244–3252 (2023).
Campo, E. et al. The international consensus classification of mature lymphoid neoplasms: a report from the clinical advisory committee. Blood J. Am. Soc. Hematol. 140, 1229–1253 (2022).
Sidhom, J.-W. et al. Deep learning for distinguishing morphological features of acute promyelocytic leukemia. Blood 136, 10–12 (2020).
Matek, C., Schwarz, S., Spiekermann, K. & Marr, C. Human-level recognition of blast cells in acute myeloid leukaemia with convolutional neural networks. Nat. Mach. Intell. 1, 538–544 (2019).
Wang, J. Deep learning in hematology: from molecules to patients. Clin. Hematol. Int. 6, 19–42 (2024).
Campanella, G. et al. Clinical-grade computational pathology using weakly supervised deep learning on whole slide images. Nat. Med. 25, 1301–1309 https://www.nature.com/articles/s41591-019-0508-1 (2019).
Lu, M. Y. et al. Data-efficient and weakly supervised computational pathology on whole-slide images. Nat. Biomed. Eng. 5, 555–570 (2021).
Landau, M. S. & Pantanowitz, L. Artificial intelligence in cytopathology: a review of the literature and overview of commercial landscape. J. Am. Soc. Cytopathol. 8, 230–241 (2019).
Sun, S. et al. DeepHeme, a high-performance, generalizable deep ensemble for bone marrow morphometry and hematologic diagnosis. Sci. Transl. Med. 17, eadq2162 (2025).
Matek, C., Krappe, S., Münzenmayer, C., Haferlach, T. & Marr, C. Highly accurate differentiation of bone marrow cell morphologies using deep neural networks on a large image dataset. Blood https://www.sciencedirect.com/science/article/pii/S0006497121013975 (2021).
Song, A. H. et al. Morphological Prototyping for Unsupervised Slide Representation Learning in Computational Pathology. In 2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 11566–11578 (2024).
Ilse, M., Tomczak, J. & Welling, M. Attention-based deep multiple instance learning. In Proc. 35th International Conference on Machine Learning, 2127–2136 (PMLR, 2018).
Gadermayr, M. & Tschuchnig, M. Multiple instance learning for digital pathology: a review of the state-of-the-art, limitations & future potential. Comput. Med. Imaging Graph. 112, 102337 (2024).
Javed, S. A. et al. Additive MIL: intrinsically interpretable multiple instance learning for pathology. Adv. Neural Inf. Process. Syst. 35, 20689–20702 (2022).
Deng, R. et al. Cross-scale multi-instance learning for pathological image diagnosis. Med. Image Anal. 94, 103124 (2024).
Manescu, P. et al. Detection of acute promyelocytic leukemia in peripheral blood and bone marrow with annotation-free deep learning. Sci. Rep. 13, 2562 (2023).
Reis, D., Kupec, J., Hong, J. & Daoudi, A. Real-Time Flying Object Detection with YOLOv8. https://doi.org/10.48550/ARXIV.2305.09972 (2023).
Sun, S. et al. DeepHeme, a high-performance, generalizable deep ensemble for bone marrow morphometry and hematologic diagnosis. Sci. Transl. Med. 17, eadq2162 (2025).
Kraus, O. Z., Ba, L. J. & Frey, B. Classifying and segmenting microscopy images using convolutional multiple instance learning. Bioinformatics 32, i52–i59 (2016).
Carmichael, I. et al. Incorporating intratumoral heterogeneity into weakly-supervised deep learning models via variance pooling. In International Conference on Medical Image Computing and Computer-Assisted Intervention, 387–397 (Springer, 2022).
Mingote, V., Miguel, A., Ortega, A. & Lleida, E. Class token and knowledge distillation for multi-head self-attention speaker verification systems. Digit. Signal Process. 133, 103859 (2023).
Devlin, J., Chang, M.-W., Lee, K. & Toutanova, K. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding. Preprint at https://doi.org/10.48550/arXiv.1810.04805 (2019).
Dosovitskiy, A. et al. An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale. Preprint at https://doi.org/10.48550/arXiv.2010.11929 (2021).
Chen, R. J. et al. Towards a general-purpose foundation model for computational pathology. Nat. Med. 30, 850–862 (2024).
Zimmermann, E. et al. Virchow2: Scaling Self-Supervised Mixed Magnification Models in Pathology. Preprint at https://doi.org/10.48550/arXiv.2408.00738 (2024).
Li, B., Li, Y. & Eliceiri, K. W. Dual-stream multiple instance learning network for whole slide image classification with self-supervised contrastive learning. In 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 14313–14323 (IEEE, 2021).
Srisuwananukorn, A., Salama, M. E. & Pearson, A. T. Deep learning applications in visual data for benign and malignant hematologic conditions: a systematic review and visual glossary. Haematologica 108, 1993–2010 (2023).
Araújo, D. J. et al. Key Patches Are All You Need: A Multiple Instance Learning Framework For Robust Medical Diagnosis. In 2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW) 5231–5240 (IEEE, Seattle, WA, USA, 2024).
Gaube, S. et al. Non-task expert physicians benefit from correct explainable AI advice when reviewing X-rays. Sci. Rep. 13, 1383 (2023).
Khoury, J. D. et al. The 5th edition of the World Health Organization classification of haematolymphoid tumours: myeloid and histiocytic/dendritic neoplasms. Leukemia 36, 1703–1719 (2022).
Wong, B., Hong, S. & Yi, M. Rethinking Pre-Trained Feature Extractor Selection in Multiple Instance Learning for Whole Slide Image Classification. 5 (2025). https://doi.org/10.1109/ISBI60581.2025.10981015.
Singi, S. et al. CAREMIL: Interpretable multiple instance learning for hematologic diagnosis [Code]. Zenodo https://doi.org/10.5281/zenodo.18784189 (2026).

Acknowledgements

This research was supported by funding from the Cancer Center Support Grant from the NIH/NCI (P30CA008748), the Warren Alpert Foundation through the Warren Alpert Center for Digital and Computational Pathology at Memorial Sloan Kettering Cancer Center, and the MSK Technology Development Fund. We extend our gratitude to the MSK Hematopathology and Digital Pathology Services for their contributions.

Author information

These authors jointly supervised this work: Chad Vanderbilt, Gregory M. Goldgof

Authors and Affiliations

Department of Pathology and Laboratory Medicine, Memorial Sloan Kettering Cancer Center, New York, NY, USA
Siddharth Singi, Zhanghan Yin, Riya Gupta, Dylan C. Webb, Khawaja H. Bilal, Neeraj Kumar, Swaraj Nanda, Nicolas Sanchez, Jacob G. Van Cleave, Brenda Fried, Sean Paulsen, Ethan S. Yan, Ali Kamali, Argho Sarkar, Allyne Manzo, Jeeyeon Baik, Irem S. Isgor, Cesar Colorado-Jimenez, Anthony Cardillo, Leonardo Boiocchi, Aijazuddin Syed, David Kim, Brie Kezlarian-Sachs, Maly Fenelus, Alexander Chan, Mariko Yabe, Samuel I. McCash, Menglei Zhu, Orly Ardon, Lauren McVoy, Wenbin Xiao, Mikhail Roshal, Oscar Lin, Ahmet Dogan, Chad Vanderbilt & Gregory M. Goldgof
Bakar Computational Health Sciences Institute, University of California, San Francisco, CA, USA
Shenghuan Sun
Department of Statistics, University of California, Berkeley, CA, USA
Zhanghan Yin, Dylan C. Webb & Nicolas Sanchez
New York Medical College, Valhalla, NY, USA
Deepika Dilip
Department of Laboratory Medicine, University of California, San Francisco, CA, USA
Linlin Wang
Department of Medicine, Memorial Sloan Kettering Cancer Center, New York, NY, USA
Simon Mantha
School of Data Science and Society; Department of Pathology and Laboratory Medicine, University of North Carolina-Chapel Hill, Chapel Hill, NC, USA
Iain Carmichael
Department of Pathology and Laboratory Medicine, Weill Cornell School of Medicine, New York, NY, USA
Gregory M. Goldgof
Halvorsen Center for Computational Oncology, Memorial Sloan Kettering Cancer Center, New York, NY, USA
Gregory M. Goldgof

Contributions

Conceptualization: Si.S., G.M.G., and C.V. Methodology: Si.S., Sh.S., Z.Y., R.G., D.C.W., N.K., S.N., N.S., J.G.V.C., B.F., E.S.Y., Ar.S., C.C.J., I.C., G.M.G., and C.V. Experiments: Si.S., Sh.S., G.M.G., and C.V. Data curation and acquisition: G.M.G., I.S.I., K.H.B., J.B., L.W., M.Z., L.B., M.F., M.Y., W.X., La.M., M.R., A.D., and O.A. Manuscript writing: Si.S., Sh.S., I.C., G.M.G., and C.V. Funding acquisition: G.M.G., C.V., and O.A. Review and approval of the manuscript: Si.S., Sh.S., Z.Y., R.G., D.C.W., K.H.B., D.D., L.W., N.K., S.N., N.S., J.G.V.C., B.F., S.P., E.S.Y., A.K., A.S., A.M., J.B., I.S.I., C.C.J., An.C., L.B., D.K., B.K., M.F., Al.C., M.Y., S.I.M., M.Z., S.M., O.A., La.M., W.X., M.R., O.L., A.D., I.C., C.V., and G.M.G.

Corresponding author

Correspondence to Siddharth Singi.

Ethics declarations

Competing interests

L.B. reports stock ownership in Exact Sciences. M.Y. reports consulting for Janssen Research and Development. S.M. reports equity interest in Daboia Consulting LLC and professional services for Janssen Pharmaceuticals, Medical Case Management Group, North American Thrombosis Forum, and Physicians’ Education Resource. W.X. reports research support from Stemline Therapeutics. M.R. reports serving as a scientific advisory board member with equity support at Auron Pharmaceutical, research funding from Celularity, Roche-Genentech, Beat AML, and NGM, and travel funding from BD Biosciences. O.L. reports consulting fees from Janssen Biotech and Hologic, and support for professional activities from the American Society of Cytopathology. A.D. reports consulting for Seattle Genetics, Takeda, EUSA Pharma, AbbVie, Peerview, Physicians’ Education Resource, Incyte, and Loxo, as well as research support from Roche and Takeda. C.V. reports equity interest and intellectual property rights in Paige.AI, Inc., and a consulting and advisory role for Paige.AI, Inc. G.M.G. reports equity interest in HemeAI, Inc. The following authors declare no competing interests: Si.S., Sh.S., Z.Y., R.G., D.W., K.B., D.D., L.W., N.K., S.N., N.S., J.C., B.F., S.P., E.Y., A.K., Ar.S., A.M., J.B., I.I., C.C.J., Al.C., Ai.S., D.K., B.K., M.F., An.C., S.M., M.Z., O.A., La.M., and I.C.

Peer review

Peer review information

Communications Medicine thanks Hanxiang Ma, Tao Chen and the other, anonymous, reviewer(s) for their contribution to the peer review of this work. A peer review file is available.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary information

Transparent Peer Review file (PDF)

Supplemental Information (PDF)

Description of Additional Supplementary Files (DOCX)

Supplementary Data 1 (CSV)

Supplementary Data 2 (CSV)

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

About this article

Cite this article

Singi, S., Sun, S., Yin, Z. et al. Interpretable multiple instance learning for hematologic diagnosis from peripheral blood smears. Commun Med (2026). https://doi.org/10.1038/s43856-026-01558-x

Received: 08 July 2025
Accepted: 13 March 2026
Published: 15 April 2026
DOI: https://doi.org/10.1038/s43856-026-01558-x