Artificial intelligence differentiates prefibrotic primary myelofibrosis with thrombocytosis from essential thrombocythemia using digitized bone marrow biopsy images

Srisuwananukorn, Andrew; Loscocco, Giuseppe Gaetano; Dolezal, James M.; Kuykendall, Andrew T.; Santi, Raffaella; Zhang, Ling; Singh, Avani M.; Guglielmelli, Paola; Vannucchi, Alessandro Maria; Salama, Mohamed E.; Pearson, Alexander T.; Hoffman, Ronald

doi:10.1038/s41375-026-02893-7

Download PDF

Article
Open access
Published: 20 February 2026

Artificial intelligence differentiates prefibrotic primary myelofibrosis with thrombocytosis from essential thrombocythemia using digitized bone marrow biopsy images

Leukemia (2026)Cite this article

Subjects

Abstract

Prefibrotic primary myelofibrosis (prePMF) and essential thrombocythemia (ET) are distinct myeloproliferative neoplasms (MPNs) with overlapping clinical features, often leading to diagnostic uncertainty. We developed an artificial intelligence (AI) framework with human interpretability to distinguish prePMF from ET using digitized hematoxylin and eosin-stained bone marrow biopsy (BMB) slides. Trained on an initial cohort of MPN patients with thrombocytosis, the AI model achieved an AUROC of 0.89 and accuracy of 92.3%. To assess the image features guiding predictions, we generated synthetic images which potentially exaggerate disease-specific morphologies. In a blinded survey, hematopathologists reviewed both real and AI-generated images. While human experts frequently agreed with AI predictions on diagnosis with real images, diagnostic discordance reached up to 88% for AI-generated ET images despite being correctly predicted by AI. We further quantified marrow cellularity and adiposity in the real and generated images, which revealed a higher proportion of fat content in all ET images (42.0%) compared to prePMF (28.9%). These findings suggest that AI can utilize morphological cues distinct from current established diagnostic criteria, such as proportion of adiposity to distinguish types of MPNs. Thus, an AI-assisted diagnostic tool underscores the potential of AI to augment histopathologic evaluation and allow identification of more specific subpopulations of forms of MPNs.

How artificial intelligence might disrupt diagnostics in hematology in the near future

Article Open access 08 June 2021

Enhancing diagnostic accuracy of multiple myeloma through ML-driven analysis of hematological slides: new dataset and identification model to support hematologists

Article Open access 15 May 2024

Machine learning assisted real-time deformability cytometry of CD34+ cells allows to identify patients with myelodysplastic syndromes

Article Open access 18 January 2022

Introduction

Primary myelofibrosis (PMF) is a clonal stem cell disorder that is recognized as a distinct entity defined by the World Health Organization (WHO) as a Philadelphia chromosome-negative myeloproliferative neoplasm (MPN), which also includes polycythemia vera (PV) and essential thrombocythemia (ET). PMF is further subclassified into “prefibrotic” (prePMF) and overtly fibrotic PMF. Overt MF may also occur during the clinical course of a subset of ET and PV patients, which is termed post-ET and post-PV MF, respectively [1]. The WHO emphasizes the role of megakaryocyte morphology in the classification of the MPNs. Histologic examination of the peripheral blood smear and an adequate bone marrow biopsy (BMB) are indispensable for making the proper diagnosis of PMF. PrePMF was first described as a distinct clinical entity in 1976 by Thiele and colleagues and formally introduced in the 2001 and 2008 WHO diagnostic criteria [2,3,4]. Then in 2016 and 2022, the WHO provided further guidelines to pathologically distinguish prePMF with thrombocytosis from ET and overt PMF based on megakaryocytic atypia and marrow hypercellularity in the absence of reticulin fibrosis greater than grade 1 [5, 6]. Clinically, prePMF patients may present with isolated thrombocytosis or borderline leukocytosis and is frequently misdiagnosed as ET [7]. Conversely, a subset of ET cases exhibit bone marrow features resembling prePMF. A review of more than 1100 patients initially diagnosed as having ET using the 2008 WHO MPN diagnostic criteria revealed that nearly one in five cases met histopathological criteria for prePMF rather than ET after reassessment of marrow histopathology [8]. Although both disorders are associated with an increased risk of developing thrombotic events, prePMF carries a significantly higher risk and shorter latency period of transformation to overt PMF, leukemia, and death [7,8,9,10,11,12]. The 2022 WHO and the recent International Consensus Consortium (ICC) MPN diagnostic criteria each define the pathological diagnostic marrow features to differentiate between these two types of MPNs, inherently requiring the subjective assessment of megakaryocyte morphology and fibrosis grading using hematoxylin & eosin (H&E) and reticulin-stained bone marrow biopsies, respectively [13].

As a result of the subjective nature of bone marrow assessments and similar clinical phenotypes, prior reports have identified high rates of discordance among expert pathologists to distinguish ET from prePMF with a modest kappa statistic of 0.474 after participation in a specific MPN education module [14]. At best, diagnostic consensus by groups of pathologists from multiple centers with known expertise in MPN histopathology achieved consensus 88% of the time [15].

The combined recent technical advances within digital pathology and artificial intelligence (AI) offer potential scalable solutions for diagnostic and prognostic assessment across hematological neoplasms [16, 17]. Empiric evidence has shown that AI algorithms may exhibit high degrees of accuracy and reproducibility in image recognition tasks, occasionally surpassing human performance [18]. Much of this progress has been enabled by modern AI architectures including deep convolutional neural networks (DCNNs), transformers, and generative adversarial networks, which have been applied to tasks such as tissue segmentation, cell classification, and morphologic pattern recognition [19, 20]. More recently, foundation models have emerged as promising approach for improving model performance, particularly in data-scarce scenarios. These foundation models are pre-trained on large-scale image datasets, including diverse histopathologic whole slide images, and can be adapted for a wide range of downstream applications. In digital pathology, foundation models previously trained on hundreds of thousands to millions of whole slide images enable improved disease classification, genomic prediction, survival modeling, and other clinically relevant inferences which often achieve near or superhuman performance [21]. Specifically for prePMF and ET differentiation, Ryou et al. developed an AI-derived system termed the “continuous indexing of fibrosis” that closely associated the burden and heterogeneity of fibrosis preferentially with a diagnosis of prePMF. The development of their system focuses on known relevant features such as fibrosis and megakaryocyte morphology [22].

Conversely, there is currently a lack of AI frameworks capable of identifying novel morphological features that would permit accurate distinction of prePMF from ET. For these purposes, publicly available objective-agnostic foundation models pre-trained on large, heterogenous histopathology images offer a promising method for improving diagnostic accuracy. By leveraging knowledge encoded from millions of prior images, foundation models can transfer generalizable representations of histologic patterns to novel or specialized tasks, such as prePMF/ET classification, even when labeled data is scarce. To further understand automatically derived extracted digital biomarkers, our group has shown that the generation of synthetic histopathology can identify human-interpretable morphology not previously characterized for known disease states [23,24,25].

In this study, we developed an AI model capable of distinguishing prePMF with thrombocytosis from ET on H&E-stained bone marrow biopsy images, with demonstrated potential utility across patient cohorts and institutions. With only knowledge of the marrow images and diagnosis label, our AI model was not specifically trained on known diagnostic features for MPN subtyping.

Methods

We performed a retrospective, multi-institutional effort to develop and validate an AI algorithm to differentiate prePMF from ET. The custom-written code was developed with the Slideflow package written in the Python language, version 3.7 [26]. Computational analyses were performed on High Performance Computing Clusters (HPC) housed within the Icahn School of Medicine at Mount Sinai and the Ohio Supercomputer Center [27].

This research was approved by the Institutional Review Board of Mount Sinai Hospital and The Ohio State University including the study design for algorithmic development of retrospective patient data and pathology images. A waiver for consent was obtained. All experiments were in accordance with the Declaration of Helsinki. The reporting of data is in accordance with AI-specific guidelines including CLAIM and TRIPOD + AI [28, 29].

Patient Cohorts and Histopathology Digitization

Patients with prePMF and ET were identified at The University of Florence, Italy between 6/2007 and 5/2023, and Moffitt Cancer Center, Florida, USA between 1/2020 and 12/2022. Patients were identified by querying key diagnostic terminology within pathology reports. The diagnosis was revised according to the 2022 WHO and ICC classifications and verified by local expert hemato-pathologists.

De-identified diagnostic H&E-stained BMB core slides were digitized at 40x magnification by Aperio AT2 whole slide scanners (Leica Biosystems). Justification for sample size estimation is available in Supplemental Methods.

Development of complementary AI algorithms for prePMF or ET diagnostic prediction

An AI classification model, hereafter named the prePMF/ET classifier, for automated H&E-stained BMB image analysis to establish a diagnosis of prePMF or ET was developed. The training cohort entailed patients treated at the University of Florence. The model was validated for performance with patients treated at H. Lee Moffitt Cancer Center, Florida, United States.

We developed the prePMF/ET classifier using the open-source software package Slideflow with a two-stage approach: feature extraction with a histopathology-informed pre-trained foundation model, followed by attention-based multiple instance learning [26]. The chosen foundation model, RetCCL, was previously trained on 32,000 whole slide images across various cancer subtypes and further underwent re-training upon the described MPN cohorts for diagnosis prediction [30]. We additionally implemented an attention mechanism, which allows the AI model to assign a numeric score to each image region based on how strongly it contributes to the final diagnostic prediction. In effect, attention highlights the areas within the bone marrow biopsy that the model deems most informative for distinguishing prePMF from ET, thereby improving interpretability and enabling visual inspection of the model’s decision-making process [31]. Empirically, attention scores have been shown to improve classification tasks and simultaneously provide a means for qualitative interpretation. For final binary classification, the numeric threshold cutoff to assign a diagnosis was calculated at the point of maximal Youden’s index with the Florence cohort.

To provide additional human interpretation of the prePMF/ET classifier, we developed a second AI system to generate synthetic histology images highly representative of each disease state. We trained a class-conditioned generative adversarial network (cGAN) to exaggerate the subtle morphological features of H&E-stained images of prePMF and ET. We collectively refer to this second explanatory AI system as the prePMF/ET cGAN (Fig. 1). Further technical details regarding the development of both AI systems are provided in Supplemental Methods.

**Fig. 1: Overview of the current artificial intelligence (AI) strategy for classification and explanation of myeloproliferative neoplasm (MPN) disease prediction.**

Comparing AI parameters used and hemato-pathologists interpretation of histo-morphologic features

To assess whether the AI-based prePMF/ET classifier performs similarly to human assessment and to understand what relevant parameters AI was using, three hematopathologists (R.S., L.Z., M.E.S.) viewed and assigned a diagnosis for 80 real or generated image tiles that were correctly predicted by the prePMF/ET classifier. During this exercise the allowed diagnostic conclusions included prePMF, ET, abstaining from an answer, or other. The 80 images comprised of real images with highest attention scores extracted from the Moffitt cohort, as well as generated images provided by the prePMF/ET cGAN. The synthetic images were selected with highest attention scores and predicted as the intended diagnosis by the prePMF/ET classifier (i.e., “class-concordant” between classifier and generator). In total, the 80-image survey included 20 images each across real prePMF, generated prePMF, real ET, and generated ET.

The three pathologists were blinded to the source of images (real vs generated) and the true diagnostic labels. Each pathologist’s classification was compared to the prePMF/ET classifier by percent agreement. The proportion of images in agreement was compared using the Fisher’s exact tests.

Quantifying proportion of adipose tissue and cellularity

We calculated proportion of adipose tissue within each image of the prior 80-image survey by utilizing a custom-trained pixel-level classifier within the open-source QuPath software, version 0.5.1 [32]. We trained a model in a similar fashion as Palomaki et al. [33] Within the QuPath graphical user interface upon a single real image, we manually annotated regions of adipose tissue or cellularity to train an artificial neural network for pixel-level classification. Training was iterated with additional manually annotated regions until the classification was deemed adequate by visual inspection. The proportion of pixels within each image classified as adipose tissue or cellularity was compared using the Student’s t-test. Of note, areas of artifact, background, or cortical bone were excluded when calculating these proportions. Further hyperparameters for the pixel-level classifier are provided in Supplemental Methods.

Results

AI accurately differentiates prePMF from ET using Bone Marrow Biopsy Images

The prePMF/ET classifier was trained on patient specimens from the University of Florence (100 prePMF / 100 ET cases) and externally validated on additional specimens from the Moffitt Cancer Center (6 prePMF / 20 ET cases). Baseline clinical information for the training cohort is provided in Table 1.

Table 1 Clinical and laboratory variables at baseline of the University of Florence cohort.

Full size table

Using hyperparameters chosen by prior AI histopathology research, the initial models were trained upon 5-fold cross-validation of the Florence cohort to determine preliminary efficacy and consideration of hyperparameter modifications [34]. The classifier models exhibited a mean area under receiver operator curve (AUROC) of 0.90 with a standard deviation of 0.04. The mean average precision (AP) across 5 folds was 0.91 (Fig. 2a, b).

**Fig. 2: Performance of the prePMF/ET classifier.**

For external cohort assessment, a final classifier model was trained on the full Florence cohort and validated with the Moffitt cohort. After external validation, the classifier achieved AUROC of 0.88 and AP of 0.81, numerically greater than the baseline AP of 0.23 (Fig. 2c, d). To assess the stability of the training process, four additional replicate models were trained with the full Florence cohort and performed similarly with external validation (Supplementary Fig. 2). Finally, we determined the threshold value for binary classification to be the point of maximizing Youden’s Index, which balances sensitivity and specificity when comparing potential threshold values. The resultant fully trained prePMF/ET classifier exhibited a sensitivity of 66.6%, specificity of 100%, and accuracy of 92.3% for prediction of prePMF.

AI vs Expert Pathologists rely on different histopathological features for assigning the diagnosis of prePMF or ET

With the prePMF/ET classifier model, we aimed to interrogate what morphological features within the images might influence the performance of the model and compare them with human assessments in order to identify if AI and expert pathologists were referencing similar morphological characteristics to render a diagnosis. In this endeavor, we trained a class-conditioned generative adversarial network (cGAN) to generate synthetic images. We previously had shown that generated images can identify new morphological characteristics associated with disease states which may inform human interpretation and thereby improve diagnostic performance [23]. The cGAN was trained upon image tiles extracted at 40x magnification from the training cohort. The fully trained prePMF/ET cGAN generated high-quality synthetic images representing prePMF and ET H&E BMB images (Fig. 3).

**Fig. 3: Example image tiles for hematopathologist interpretation survey.**

To assess the degree of agreement with pathologists and to potentially identify automatically-derived relevant morphological features, a survey of highly representative images— all of which were correctly predicted by the prePMF/ET classifier—were assigned to three expert hematopathologists to determine a diagnosis from a single image tile. The survey of 80 images included real and generated representations of prePMF and ET. Both the prePMF/ET classifier and the hematopathologists predicted a diagnosis without access to information about the clinical context or ability to view additional images from the same “patient”. As such, the expert hematopathologists were allowed to abstain from assigning a diagnosis if it was felt that the provided morphological features in the single image were insufficient to render a diagnosis.

The overall agreement between each of the three hematopathologists with the prePMF/ET classifier was 45.0%; greater agreement was observed with real images (60.8% vs. 29.1%, p < 0.001) (Supplemental Table 3). Hematopathologists abstained or assigned an alternative diagnosis after reviewing 32.5% of images. Assignment of an MPN diagnosis by the hematopathologists was more common with real images (80.0% vs. 55.0%, p < 0.001). In the subset of images where hematopathologists did not abstain, hematopathologists agreed with the AI-based prePMF/ET classifier more frequently with real images compared to generated images (76.0% vs. 53.0%, p < 0.01) (Fig. 4a, b).

**Fig. 4: Comparative analysis of hematopathologist assessments of real and generated images.**

In the subset of real images, the frequency of the hematopathologists abstaining from assigning a diagnosis as well as accuracy of assignment was similar between the real images of prePMF or ET marrow biopsies (Fig. 5c, d). However, in the subset of all generated images, hematopathologists less frequently assigned a diagnosis on generated ET images (MPN diagnosis assigned in 68.3% for prePMF vs 41.7% in ET, p < 0.01). Similarly, the agreement between the hematopathologists and the AI-based prePMF/ET classifier was low for generated ET images (78.0% for prePMF vs. 12.0% for ET images, p < 0.001) (Fig. 4e, f). Given that a correct diagnosis was predicted by the prePMF/ET classifier for all samples within the 80-image survey, these findings imply that the generated ET images contain morphological features sufficient for AI but not for expert pathologists to render a diagnosis.

**Fig. 5: Analysis of high attention areas in bone marrow images.**

Human identification of morphology associated with AI prediction

Identification of novel marrow morphologic features associated with the predicted diagnosis was performed at two effective magnification levels. At lower magnification, heatmaps of attention scores produced by the prePMF/ET classifier on the Moffitt cohort allowed for visual assessment of tissue-level image features associated with diagnosis prediction. Interpretation of heatmaps revealed the prePMF/ET classifier’s reliance on areas of hematopoiesis (Fig. 5). Importantly, a qualitative assessment of all heatmaps revealed a lack of high attention to areas of cortical bone, blood clot, or image background/artifactual content across the whole slide images.

At a higher resolution, real and generated image tiles at 40x magnification from the previously described image survey were manually reviewed for cellular morphological features potentially shared between prePMF and ET. Importantly, we did not observe obvious artifactual content such as markings, preparation issues, digitization errors, or other nonbiological image features. Upon initial review, we hypothesized that the proportion of adipose tissue was higher in ET images, and conversely degree of cellularity was higher in prePMF images. Quantification of cellularity and adipose tissue was further performed with a pixel-level classifier trained on manually annotated images (Fig. 6).

**Fig. 6: Quantification and comparison of adipose tissue and cellularity on 80 real and generated images.**

In the full survey of 80 real and generated images, the proportion of adipose tissue was determined by the pixel-level classifier to be higher in the ET images (42.0% vs. 28.9%, p < 0.001). The trend of higher proportion of adipose tissue in ET was seen in both the subset of real images (38.2% in ET vs. 31.0% in prePMF, p < 0.01) and subset of generated images (45.7% vs. 26.8%, p < 0.001). As cellularity is inversely correlated to adipose tissue, similar conclusions can be stated that the prePMF images exhibit higher proportions of cellularity (Fig. 6b-d).

To further quantify if degree of cellularity is preserved when generating synthetic images, we compared the proportions of adipose/cellularity within each disease. In the subset of ET images, there was a higher proportion of adipose tissue within generated compared to real images (45.7% vs. 38.2%, p < 0.001). However, a significant difference in adipose tissue proportion within the prePMF images was not observed (26.8% in generated vs 31.0% in real, p = 0.1) (Fig. 6e, f). These results imply that the generative AI model exaggerates the degree of adipose tissue as a potentially relevant image biomarker for ET but not for prePMF. A representative video of class interpolation between generated ET and prePMF images is provided in Supplemental Video 1.

Discussion

We present a novel approach using AI to distinguish prePMF from ET based solely on digitized H&E bone marrow biopsy images, which is a challenging diagnostic dilemma for MPN patients with thrombocytosis. Amongst clinicians and hematopathologists, concerns surround the ability of the histomorphologic features to consistently distinguish prePMF from ET, which have led in the past to doubts by some clinicians and pathologists of the actual existence of prePMF as a distinct type of MPN due to the known limited reproducibility of the application of the diagnostic criteria [35]. Due to the close resemblance of the clinical presentations of these two entities, the initial standard of care for patients with these disorders has been routinely chosen by practicing physician to be similar. This practice might be faulty since there are important biological and clinical features which indicate that ET and prePMF are distinct entities and that patients with these disorders might benefit from different treatment approaches.

To address the diagnostic reproducibility issues, several reports have developed models to distinguish ET from prePMF with readily available clinical data. PrePMF patients more frequently present with lower hemoglobin values, higher levels of lactate dehydrogenase, increased white blood cell counts, and palpable splenomegaly. Logistic regression or stepwise decision trees developed from these clinical parameters can identify defined ET with nearly 80% accuracy or 90% positive predictive value with the older 2008 WHO diagnostic criteria [36,37,38]. New approaches to digital pathology using AI offer additional methods for improved classification of MPN subtypes. Ryou and co-workers first developed an automated AI methodology to objectively quantify fibrosis using routine reticulin-stained BMBs from MPN patients [22]. Their AI algorithm was trained on manually annotated fibrosis images to predict and characterize fibrosis patterns with unseen BMB images. Using an ensemble method of algorithms assessing fibrosis patterns and megakaryocyte morphology, the combined model accurately differentiated prePMF from ET with an internal validation cohort, and furthermore, their combined model was evaluated with ET patients enrolled in the Primary Thrombocythemia-1 (PT-1) clinical trial carried out in the United Kingdom between 1997 and 2012 [39]. The authors determined that the marrow fibrosis distribution identified a subset of ET patients with higher risk for transformation to overt myelofibrosis; given that PT-1 inclusion criteria enrolled patients prior to the formal recognition of prePMF, these findings support the possibility of unintentional enrollment of prePMF patients on to clinical trials, thereby compromising the potential ability to achieve disease-specific endpoints.

By contrast to the AI approach by Ryou and coworkers, our model does not rely on specific preconceived visual features, and as such provides a complementary assessment compared to features delineated by the WHO 2022 and ICC 2022 criteria which emphasize megakaryocyte morphology and the degree of fibrosis. Our model exhibited sustained performance with biopsy specimens procured across academic sites within a multi-institutional collaboration, but future iterations of this algorithm should include real-world biopsies with heterogenous stain quality, preparatory techniques, and clinical manifestations to further interrogate the generalizability of this diagnostic task. We further provided justification of the AI prediction by utilizing a novel generative AI method to identify subtle morphological features associated with each MPN. Generated images of H&E BMBs from these patients also suggested a higher abundance of adipocytes as a variable that could be used to identify ET and conversely cellularity for prePMF. We note that our generative AI model selectively produced images of greater adipose proportion for ET images but not for prePMF. While the identification of novel morphologies by generative AI still requires blinded prospective human evaluation prior to being accepted as a diagnostic standard, it is still noted that the observed difference of adipose content by our generative AI model is in line with earlier investigations noting lower bone marrow fat fraction as determined by T1-weighted magnetic resonance imaging in overt MF patient compared to ET [40]. In addition, the mechanism for decreased adipose tissue in prePMF may mirror the known signaling pathways in overt myelofibrosis cells, where increased levels of the cytokine lipocalin-2 have been shown to diminish adipogenesis of marrow stromal cells ex vivo [41].

In a human vs AI interpretability experiment involving real and generated images, we however noted that expert hematopathologists frequently disagreed with the AI-based model with an overall agreement of 45.0%, and human expert performance upon generated ET images achieved only 12.0% in cases which AI predicted correctly. In spite of this seemingly low performance, it should be noted that expert pathologists were challenged to assign a diagnosis from a single high-magnification image. Similarly, Madelung et al. reported individual agreement for MPN diagnosis with six hematopathologists compared to an expert ground-truth pathologist ranging between 38% and 52% despite having available the full patient biopsy with multiple stains and selected clinical information [14]. These data suggest that full patient bone marrow biopsy images with integrated clinical parameters are necessary for hematopathologists to accurately subclassify MPNs by current diagnostic criteria. Taken together, these findings imply that the AI algorithms and expert pathologists might identify distinctly different morphologies when evaluating H&E bone marrow biopsies. Though we have shown the potential importance of assessing marrow adipocytes in establishing a diagnosis of ET, further work is needed to characterize these subtle visual differences identified by AI.

Overall, the contributions by our group and others developing AI algorithms do not fully represent a substitute for histopathological analysis; these investigations only contribute to increasing the suspicion of a prePMF diagnosis in a patient with a working clinical diagnosis of ET. We envision the potential of an AI tool for automated MPN diagnosis as a valuable aid particularly for providers at centers with lower volumes of MPN patients. Additionally, given its high specificity for prePMF and high sensitivity for ET, future extensions of our proposed AI-based MPN classifier when trained on data from patients with a platelet count > 450 × 10⁹/L may enhance clinical trial enrollment of ET patients by excluding prePMF patients, which would facilitate more accurate assessment of the clinical activity of novel therapeutic agents.

Limitations of the current study include the small sample size of the evaluation cohorts and retrospective nature. Prospective evaluation associating AI prediction with MPN outcomes is warranted. The additional benefit of including other histology stains, such as reticulin, and clinical or molecular data combined with H&E images for AI prediction also warrants additional investigation.

In summary, we have developed a highly accurate AI model that distinguishes prePMF from ET with automated bone marrow analysis. Our model was developed on an open-source framework that could be readily deployed in a rapid and affordable manner for patients with these possible MPN diagnoses.

Data availability

The model for classification of prefibrotic primary myelofibrosis and essential thrombocythemia is available at https://github.com/andrewsris/prePMFvsET. For data request inquiries of the whole slide images, please contact the corresponding author (email: Andrew.srisuwananukorn@osumc.edu).

References

Barosi G, Mesa RA, Thiele J, Cervantes F, Campbell PJ, Verstovsek S, et al. Proposed criteria for the diagnosis of post-polycythemia vera and post-essential thrombocythemia myelofibrosis: a consensus statement from the International Working Group for Myelofibrosis Research and Treatment. Leukemia. 2008;22:437–8.
Article CAS PubMed Google Scholar
Thiele J, Georgii A, Vykoupil KF. Ultrastructure of chronic megakaryocytic-granulocytic myelosis. Blut. 1976;32:433–8.
Article CAS PubMed Google Scholar
Vardiman JW, Harris NL, Brunning RD. The World Health Organization (WHO) classification of the myeloid neoplasms. Blood. 2002;100:2292–302.
Article CAS PubMed Google Scholar
Vardiman JW, Thiele J, Arber DA, Brunning RD, Borowitz MJ, Porwit A, et al. The 2008 revision of the World Health Organization (WHO) classification of myeloid neoplasms and acute leukemia: rationale and important changes. Blood. 2009;114:937–51.
Article CAS PubMed Google Scholar
Arber DA, Orazi A, Hasserjian R, Thiele J, Borowitz MJ, Le Beau MM, et al. The 2016 revision to the World Health Organization classification of myeloid neoplasms and acute leukemia. Blood. 2016;127:2391–405.
Article CAS PubMed Google Scholar
Khoury JD, Solary E, Abla O, Akkari Y, Alaggio R, Apperley JF, et al. The 5th edition of the World Health Organization Classification of Haematolymphoid Tumours: Myeloid and Histiocytic/Dendritic Neoplasms. Leukemia. 2022;36:1703–19.
Article PubMed PubMed Central Google Scholar
Rumi E, Boveri E, Bellini M, Pietra D, Ferretti VV, Sant’Antonio E, et al. Clinical course and outcome of essential thrombocythemia and prefibrotic myelofibrosis according to the revised WHO 2016 diagnostic criteria. Oncotarget. 2017;8:101735–44.
Article PubMed PubMed Central Google Scholar
Barbui T, Thiele J, Passamonti F, Rumi E, Boveri E, Ruggeri M, et al. Survival and disease progression in essential thrombocythemia are significantly influenced by accurate morphologic diagnosis: an international study. J Clin Oncol. 2011;29:3179–84.
Article PubMed Google Scholar
Guglielmelli P, Pacilli A, Rotunno G, Rumi E, Rosti V, Delaini F, et al. Presentation and outcome of patients with 2016 WHO diagnosis of prefibrotic and overt primary myelofibrosis. Blood. 2017;129:3227–36.
Article CAS PubMed Google Scholar
Gisslinger H, Jeryczynski G, Gisslinger B, Wolfler A, Burgstaller S, Buxhofer-Ausch V, et al. Clinical impact of bone marrow morphology for the diagnosis of essential thrombocythemia: comparison between the BCSH and the WHO criteria. Leukemia. 2016;30:1126–32.
Article CAS PubMed Google Scholar
Finazzi G, Carobbio A, Thiele J, Passamonti F, Rumi E, Ruggeri M, et al. Incidence and risk factors for bleeding in 1104 patients with essential thrombocythemia or prefibrotic myelofibrosis diagnosed according to the 2008 WHO criteria. Leukemia. 2012;26:716–9.
Article CAS PubMed Google Scholar
Passamonti F, Rumi E, Arcaini L, Boveri E, Elena C, Pietra D, et al. Prognostic factors for thrombosis, myelofibrosis, and leukemia in essential thrombocythemia: a study of 605 patients. Haematologica. 2008;93:1645–51.
Article PubMed Google Scholar
Arber DA, Orazi A, Hasserjian RP, Borowitz MJ, Calvo KR, Kvasnicka HM, et al. International Consensus Classification of Myeloid Neoplasms and Acute Leukemias: integrating morphologic, clinical, and genomic data. Blood. 2022;140:1200–28.
Article CAS PubMed PubMed Central Google Scholar
Madelung AB, Bondo H, Stamp I, Lovgreen P, Nielsen SL, Falensteen A, et al. WHO classification 2008 of myeloproliferative neoplasms: a workshop learning effect-the Danish experience. APMIS. 2015;123:787–92.
Article PubMed Google Scholar
Thiele J, Kvasnicka HM, Mullauer L, Buxhofer-Ausch V, Gisslinger B, Gisslinger H. Essential thrombocythemia versus early primary myelofibrosis: a multicenter study to validate the WHO classification. Blood. 2011;117:5710–8.
Article CAS PubMed Google Scholar
Srisuwananukorn A, Salama ME, Pearson AT. Deep learning applications in visual data for benign and malignant hematologic conditions: a systematic review and visual glossary. Haematologica. 2023;108:1993–2010.
Article PubMed PubMed Central Google Scholar
Srisuwananukorn A, Krull JE, Ma Q, Zhang P, Pearson AT, Hoffman R. Applications of artificial intelligence to myeloproliferative neoplasms: a narrative review. Expert Rev Hematol. 2024;17:1–9.
He K, Zhang X, Ren S, Sun J. Delving Deep into Rectifiers: Surpassing Human-Level Performance on ImageNet Classification. 2015 IEEE International Conference on Computer Vision (ICCV) 2015 1026–34 https://doi.org/10.1109/iccv.2015.123.
Perez-Lopez R, Ghaffari Laleh N, Mahmood F, Kather JN. A guide to artificial intelligence for cancer researchers. Nat Rev Cancer. 2024;24:427–41.
Article CAS PubMed Google Scholar
Rashidi HH, Pantanowitz J, Chamanzar A, Fennell B, Wang Y, Gullapalli RR, et al. Generative Artificial Intelligence in Pathology and Medicine: A Deeper Dive. Mod Pathol. 2025;38:100687.
Article PubMed Google Scholar
Campanella G, Chen S, Singh M, Verma R, Muehlstedt S, Zeng J, et al. A clinical benchmark of public self-supervised pathology foundation models. Nat Commun. 2025;16:3640.
Article CAS PubMed PubMed Central Google Scholar
Ryou H, Sirinukunwattana K, Aberdeen A, Grindstaff G, Stolz BJ, Byrne H, et al. Continuous Indexing of Fibrosis (CIF): improving the assessment and classification of MPN patients. Leukemia. 2023;37:348–58.
Article CAS PubMed Google Scholar
Dolezal JM, Wolk R, Hieromnimon HM, Howard FM, Srisuwananukorn A, Karpeyev D, et al. Deep Learning Generates Synthetic Cancer Histology for Explainability and Education, 2022 November 01, 2022:[arXiv:2211.06522 p.]. Available from: https://ui.adsabs.harvard.edu/abs/2022arXiv221106522D.
Hieromnimon HM, Dolezal J, Doytcheva K, Howard FM, Kochanny S, Zhang Z, et al. Building digital histology models of transcriptional tumor programs with generative deep learning for pathology-based precision medicine. Genome Medicine. 2025;4:38.
Google Scholar
Hieromnimon HM, Trzcinska A, Wen FT, Howard FM, Dolezal JM, Dyer E, et al. Analysis of AI foundation model features decodes the histopathologic landscape of HPV-positive head and neck squamous cell carcinomas. Oral Oncol. 2025;163:107207.
Article PubMed Google Scholar
Dolezal JM, Kochanny S, Dyer E, Ramesh S, Srisuwananukorn A, Sacco M, et al. Slideflow: deep learning for digital histopathology with real-time whole-slide visualization. BMC Bioinformatics. 2024;25:134.
Article PubMed PubMed Central Google Scholar
Center OS. Ohio Supercomputer Center. Ohio Supercomputer Center; 1987.
Mongan J, Moy L, Kahn CE. Checklist for Artificial Intelligence in Medical Imaging (CLAIM): A Guide for Authors and Reviewers. Radiology: Artificial Intelligence. 2020;2:e200029.
PubMed PubMed Central Google Scholar
Collins GS, Dhiman P, Andaur Navarro CL, Ma J, Hooft L, Reitsma JB, et al. Protocol for development of a reporting guideline (TRIPOD-AI) and risk of bias tool (PROBAST-AI) for diagnostic and prognostic prediction model studies based on artificial intelligence. BMJ Open. 2021;11:e048008.
Article PubMed PubMed Central Google Scholar
Wang X, Du Y, Yang S, Zhang J, Wang M, Zhang J, et al. RetCCL: Clustering-guided contrastive learning for whole-slide image retrieval. Med Image Anal. 2023;83:102645.
Article PubMed Google Scholar
Ilse M, Tomczak J, Welling M. Attention-based deep multiple instance learning. Proceedings of the 35th International Conference on Machine Learning; 10-15 July 2018; Stockholm, Sweden: PMLR; 2018;80:2127–36.
Bankhead P, Loughrey MB, Fernandez JA, Dombrowski Y, McArt DG, Dunne PD, et al. QuPath: Open source software for digital pathology image analysis. Sci Rep. 2017;7:16878.
Article PubMed PubMed Central Google Scholar
Palomaki VA, Koivukangas V, Merilainen S, Lehenkari P, Karttunen TJ. A Straightforward Method for Adipocyte Size and Count Analysis Using Open-source Software QuPath. Adipocyte. 2022;11:99–107.
Article CAS PubMed PubMed Central Google Scholar
Dolezal JM, Srisuwananukorn A, Karpeyev D, Ramesh S, Kochanny S, Cody B, et al. Uncertainty-Informed Deep Learning Models Enable High-Confidence Predictions for Digital Histopathology. Nature Communications 2022;13:6572.
Barbui T, Thiele J, Vannucchi AM, Tefferi A. Problems and pitfalls regarding WHO-defined diagnosis of early/prefibrotic primary myelofibrosis versus essential thrombocythemia. Leukemia. 2013;27:1953–8.
Article CAS PubMed Google Scholar
Carobbio A, Finazzi G, Thiele J, Kvasnicka HM, Passamonti F, Rumi E, et al. Blood tests may predict early primary myelofibrosis in patients presenting with essential thrombocythemia. Am J Hematol. 2012;87:203–4.
Article PubMed Google Scholar
Schalling M, Gleiss A, Gisslinger B, Wolfler A, Buxhofer-Ausch V, Jeryczynski G, et al. Essential thrombocythemia vs. pre-fibrotic/early primary myelofibrosis: discrimination by laboratory and clinical data. Blood Cancer J. 2017;7:643.
Article PubMed PubMed Central Google Scholar
Lekovic D, Bogdanovic A, Sobas M, Arsenovic I, Smiljanic M, Ivanovic J, et al. Easily Applicable Predictive Score for Differential Diagnosis of Prefibrotic Primary Myelofibrosis from Essential Thrombocythemia. Cancers (Basel). 2023;15:4180.
Harrison CN, Campbell PJ, Buck G, Wheatley K, East CL, Bareford D, et al. Hydroxyurea compared with anagrelide in high-risk essential thrombocythemia. N Engl J Med. 2005;353:33–45.
Article CAS PubMed Google Scholar
Rozman C, Cervantes F, Rozman M, Mercader JM, Montserrat E. Magnetic resonance imaging in myelofibrosis and essential thrombocythaemia: contribution to differential diagnosis. Br J Haematol. 1999;104:574–80.
Article CAS PubMed Google Scholar
Lu M, Xia L, Liu YC, Hochman T, Bizzari L, Aruch D, et al. Lipocalin produced by myelofibrosis cells affects the fate of both hematopoietic and marrow microenvironmental cells. Blood. 2015;126:972–82.
Article CAS PubMed PubMed Central Google Scholar

Download references

Acknowledgements

A.S. reports support provided by the ECOG-ACRIN Paul Carbone Fellowship Award, Conquer Cancer Foundation Young Investigator Award, and the Clinical and Translational Science Award (UM1TR004548). In addition, this research was supported in part through the computational and data resources and staff expertise provided by Scientific Computing and Data at the Icahn School of Medicine at Mount Sinai and supported by the Clinical and Translational Science Awards (CTSA) grant UL1TR004419 from the National Center for Advancing Translational Sciences, the Myeloproliferative Neoplasm Research –Consortium (MPN-RC) funded by the National Cancer Institute PO1 5P01CA108671-16, AIRC MYNERVA Project (#21267), MIUR (PRIN2022, #20229B28PEW), and private donors George and Bodil Gellman.

These data were partially presented as an oral presentation at the 2023 American Society of Hematology (ASH) annual meeting in San Diego, CA, USA.

Author information

These authors contributed equally: Alexander T. Pearson, Ronald Hoffman.

Authors and Affiliations

Division of Hematology, Department of Internal Medicine, The Ohio State University Comprehensive Cancer Center, Columbus, OH, USA
Andrew Srisuwananukorn
Department of Experimental and Clinical Medicine, ACTIVATE Center, University of Florence, Florence, Italy
Giuseppe Gaetano Loscocco, Paola Guglielmelli & Alessandro Maria Vannucchi
CRIMM, Center for Research and Innovation of Myeloproliferative Neoplasms, Azienda Ospedaliero-Universitaria Careggi, Florence, Italy
Giuseppe Gaetano Loscocco, Paola Guglielmelli & Alessandro Maria Vannucchi
Geisinger Cancer Institute, Danville, PA, USA
James M. Dolezal
Department of Malignant Hematology, H. Lee Moffitt Cancer Center, Tampa, FL, USA
Andrew T. Kuykendall & Avani M. Singh
Pathology Section, Department of Health Sciences, University of Florence, Florence, Italy
Raffaella Santi
Department of Pathology, H. Lee Moffitt Cancer Center & Research Institute, Tampa, FL, USA
Ling Zhang
Department of Pathology, University of Texas Health Science Center, San Antonio, USA
Mohamed E. Salama
Sonic Healthcare, Austin, TX, USA
Mohamed E. Salama
Department of Medicine, Section of Hematology/Oncology, The University of Chicago, Chicago, IL, USA
Alexander T. Pearson
Chan Zuckerberg Biohub Chicago, The University of Chicago, Chicago, IL, USA
Alexander T. Pearson
Department of Medicine, Division of Hematology and Medical Oncology, Icahn School of Medicine at Mount Sinai, New York, NY, USA
Ronald Hoffman

Authors

Andrew Srisuwananukorn
View author publications
Search author on:PubMed Google Scholar
Giuseppe Gaetano Loscocco
View author publications
Search author on:PubMed Google Scholar
James M. Dolezal
View author publications
Search author on:PubMed Google Scholar
Andrew T. Kuykendall
View author publications
Search author on:PubMed Google Scholar
Raffaella Santi
View author publications
Search author on:PubMed Google Scholar
Ling Zhang
View author publications
Search author on:PubMed Google Scholar
Avani M. Singh
View author publications
Search author on:PubMed Google Scholar
Paola Guglielmelli
View author publications
Search author on:PubMed Google Scholar
Alessandro Maria Vannucchi
View author publications
Search author on:PubMed Google Scholar
Mohamed E. Salama
View author publications
Search author on:PubMed Google Scholar
Alexander T. Pearson
View author publications
Search author on:PubMed Google Scholar
Ronald Hoffman
View author publications
Search author on:PubMed Google Scholar

Contributions

AS conceived the project, performed all analyses, and drafted manuscript. JMD provided programming expertise. GGL, ATK, AMS, PG, and AMV identified patient cohorts and drafted manuscript. RS and LZ performed experiments and drafted manuscript. MES conceived the project, performed experiments, and drafted manuscript. ATP and RH conceived the project and drafted the manuscript.

Corresponding author

Correspondence to Andrew Srisuwananukorn.

Ethics declarations

Competing interests

The authors declare no competing interests.

Ethics approval and consent to participate

This research was approved by the Institutional Review Board of Mount Sinai Hospital (STUDY-21-00459) and The Ohio State University (2024C0106). Due to the retrospective nature of the work, a waiver for consent was obtained. All experiments were in accordance with the Declaration of Helsinki.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary information

Supplemental Materials

Supplemental Video 1

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

Reprints and permissions

About this article

Cite this article

Srisuwananukorn, A., Loscocco, G.G., Dolezal, J.M. et al. Artificial intelligence differentiates prefibrotic primary myelofibrosis with thrombocytosis from essential thrombocythemia using digitized bone marrow biopsy images. Leukemia (2026). https://doi.org/10.1038/s41375-026-02893-7

Download citation

Received: 20 August 2025
Revised: 22 December 2025
Accepted: 09 February 2026
Published: 20 February 2026
Version of record: 20 February 2026
DOI: https://doi.org/10.1038/s41375-026-02893-7

Artificial intelligence differentiates prefibrotic primary myelofibrosis with thrombocytosis from essential thrombocythemia using digitized bone marrow biopsy images

Subjects

Abstract

Similar content being viewed by others

How artificial intelligence might disrupt diagnostics in hematology in the near future

Enhancing diagnostic accuracy of multiple myeloma through ML-driven analysis of hematological slides: new dataset and identification model to support hematologists

Machine learning assisted real-time deformability cytometry of CD34+ cells allows to identify patients with myelodysplastic syndromes

Introduction

Methods

Patient Cohorts and Histopathology Digitization

Development of complementary AI algorithms for prePMF or ET diagnostic prediction

Comparing AI parameters used and hemato-pathologists interpretation of histo-morphologic features

Quantifying proportion of adipose tissue and cellularity

Results

AI accurately differentiates prePMF from ET using Bone Marrow Biopsy Images

AI vs Expert Pathologists rely on different histopathological features for assigning the diagnosis of prePMF or ET

Human identification of morphology associated with AI prediction

Discussion

Data availability

References

Acknowledgements

Author information

Authors and Affiliations

Contributions

Corresponding author

Ethics declarations

Competing interests

Ethics approval and consent to participate

Additional information

Supplementary information

Supplemental Materials

Supplemental Video 1

Rights and permissions

About this article

Cite this article

Search

Quick links

Subjects

Abstract

Similar content being viewed by others

How artificial intelligence might disrupt diagnostics in hematology in the near future

Enhancing diagnostic accuracy of multiple myeloma through ML-driven analysis of hematological slides: new dataset and identification model to support hematologists

Machine learning assisted real-time deformability cytometry of CD34+ cells allows to identify patients with myelodysplastic syndromes

Introduction

Methods

Patient Cohorts and Histopathology Digitization

Development of complementary AI algorithms for prePMF or ET diagnostic prediction

Comparing AI parameters used and hemato-pathologists interpretation of histo-morphologic features

Quantifying proportion of adipose tissue and cellularity

Results

AI accurately differentiates prePMF from ET using Bone Marrow Biopsy Images

AI vs Expert Pathologists rely on different histopathological features for assigning the diagnosis of prePMF or ET

Human identification of morphology associated with AI prediction

Discussion

Data availability

References

Acknowledgements

Author information

Authors and Affiliations

Contributions

Corresponding author

Ethics declarations

Competing interests

Ethics approval and consent to participate

Additional information

Supplementary information

Supplemental Materials

Supplemental Video 1

Rights and permissions

About this article

Cite this article

Share this article

Search

Quick links