A noninvasive machine learning model using a complete blood count for screening of primary vitreoretinal lymphoma

Li, Shengjie; Cao, Jiazhen; Li, Danhui; Ren, Jun; Wu, Jianing; Li, Yingzhu; Zhang, Mengyu; Hu, Henggui; Song, Yunxiao; Cheng, Jie; Guan, Ming; Cao, Wenjun

doi:10.1038/s41467-025-65693-0

Article
Open access
Published: 27 November 2025

Shengjie Li ORCID: orcid.org/0000-0002-6443-740X^1,2^na1,
Jiazhen Cao³^na1,
Danhui Li⁴^na1,
Jun Ren¹,
Jianing Wu¹,
Yingzhu Li¹,
Mengyu Zhang⁵,
Henggui Hu⁵,
Yunxiao Song⁶,
Jie Cheng ORCID: orcid.org/0000-0001-6983-5389⁷,
Ming Guan ORCID: orcid.org/0000-0002-8796-2653³ &
…
Wenjun Cao ORCID: orcid.org/0000-0001-6383-9012^1,2

Nature Communications volume 16, Article number: 10667 (2025)

Abstract

Primary vitreoretinal lymphoma (PVRL) is a rare and aggressive intraocular malignancy that is frequently misdiagnosed because of its nonspecific early manifestations and the lack of effective screening tools. We conduct a multicentre case–control study including 255 PVRL patients and 292 controls to develop a machine learning–based screening model using complete blood count data. A six-feature random forest model demonstrates high diagnostic accuracy in the discovery cohort (area under the curve [AUC] = 0.85) and validates across all cohorts (AUC = 0.80–0.83), outperforming intraocular biomarkers such as the interleukin-10/interleukin-6 ratio (AUC = 0.65–0.78). Model performance is further validated in a hospital-based prospective cohort (n = 100,526), where 38 PVRLs are identified among 66 individuals classified as high risk, and 2 additional cases are identified among 83,610 individuals classified as low risk, yielding a sensitivity of 95.0%, specificity of 99.97%, positive predictive value (PPV) of 57.6%, and negative predictive value of 99.99%. In the community cohort (n = 515,326), 22 individuals are flagged as high risk, 13 of whom are confirmed as having PVRL (PPV = 59.1%). This study presents a noninvasive and scalable blood-based screening strategy for detection of PVRL, with a web application enabling timely triage and population-level risk stratification.

Introduction

Primary vitreoretinal lymphoma (PVRL) is a rare yet potentially aggressive intraocular malignancy that is characterized predominantly by B-cell lymphomas^1,2. The estimated annual incidence of PVRL is approximately 50 cases, and there is evidence suggesting an increasing trend in its occurrence^3,4. Owing to its infrequency, PVRL is often misdiagnosed or inadequately managed and is frequently mistaken for other intraocular conditions⁵. This misidentification can result in significant diagnostic delays, sometimes extending up to 21 months from the initial presentation⁵. The diagnosis of PVRL is further complicated by multiple factors, such as the small volume of vitreous humor, the low cellularity of lymphoma cells—which frequently coexist with inflammatory cells—and the challenges associated with maintaining cellular integrity during sample collection^6,7. More importantly, given that PVRL can lead to permanent vision loss and has a high risk of central nervous system (CNS) relapse⁸, timely and accurate screening is crucial for improving patient outcomes and effectively managing this aggressive malignancy.

Although standardized diagnostic protocols for PVRL exist, the rarity and fragility of lymphoma cells in the vitreous often hinder prompt detection, making timely diagnosis challenging⁹. For example, (1) cytological examination offers morphological evidence that supports a diagnosis of PVRL¹; (2) polymerase chain reaction is utilized to detect monoclonality through rearrangements in immunoglobulin heavy chain or light chain genes, which serve as critical diagnostic indicators¹⁰; (3) mutations in genes such as MYD88 and CD79B have been linked to vitreoretinal lymphomas and may improve diagnostic accuracy^11,12; and (4) elevated levels of interleukin (IL)-10 compared with IL-6 are also significant, as B-cell lymphomas typically produce high IL-10 levels, making these ratios valuable markers for diagnosis^12,13. However, all of these methods require invasive intraocular sampling, and their effectiveness is constrained by the need to obtain a substantial number of viable cells, which is often limited by the low yield of intact neoplastic cells from small volumes of vitreous fluid⁶. Moreover, these methods are geared primarily towards diagnosing established disease rather than enabling early screening. Therefore, there is an urgent need for a rapid, accurate, and practical screening method for this malignancy.

The complete blood count (CBC) is one of the most frequently ordered clinical tests across nearly all medical contexts, offering valuable and timely insights into a wide range of disease processes¹⁴. Because blood cells continuously interact with various tissues and organs, CBC is a powerful diagnostic tool. Its key advantages include its low cost, accessibility, high consistency, and widespread use in primary healthcare, making it an essential component of routine medical evaluation¹⁵. Its application in clinical practice is extensive, and some tests (e.g., lymphocytes, basophils, and hemoglobin) have demonstrated significant diagnostic and prognostic relevance for lymphoma^16,17,18,19,20. However, no research has been conducted to screen for PVRL using a CBC.

In this work, we combine CBC parameters with machine learning (ML) algorithms to screen for PVRL. We conduct a multicentre case–control study to develop a machine learning–based screening model for PVRL diagnosis from CBC data, and validate its performance in large-scale clinical cohorts.

Results

The study design is illustrated in Fig. 1A. No significant differences in age or sex were observed between the PVRL and normal control groups across the discovery cohort and validation cohorts 1–3 (Tables S1 and S2, P > 0.05). Approximately 50% of the CBC parameters differed significantly between PVRL patients and controls in both the discovery cohort and validation cohort 1 (P < 0.05, Table S1). Except for the basophil count (P = 0.049), no significant differences were noted between the discovery cohort and validation cohort 1 for key parameters (Table S3).

**Fig. 1: Study design and performance of 12 machine learning (ML) models in the discovery cohort.**

Development of screening models based on all features

All 30 CBC features were used to train models across 12 ML algorithms. The random forest (RF), XGBoost, TabNet, decision tree (DT), generalized linear model (GLM), gradient boosting (GBM), LightGBM, naïve Bayes, and AdaBoost models demonstrated superior performance (Table S4; Figs. 1B and S1), with areas under the receiver operating characteristic curve (AUCs) significantly greater than those of the other models (P < 0.05, AUC range for others: 0.42–0.67).

Area under the precision-recall curve (AUPRC) analysis further confirmed the superior performance of these models (Fig. S2). Comprehensive performance metrics, including sensitivity, specificity, positive predictive value (PPV), negative predictive value (NPV), accuracy, and F1 score, are summarized in Fig. 1C and Table S5. On the basis of these results, the top-performing nine algorithms were selected for further model development.

Identification of the final model based on six features

Feature selection was applied in the discovery cohort to enhance clinical applicability. We ranked all 30 features by their mean SHapley Additive exPlanations (SHAP) values and incrementally built models by adding features one by one in descending order. Each model’s performance was compared to the full 30-feature model using the DeLong test. As shown in Fig. 2A, the RF model using the top 6 features achieved an AUC slightly higher than that of the full 30-feature model. In contrast, the model with only the top 5 features showed a significant decrease in AUC (P < 0.05). Furthermore, the RF model with these 6 features outperformed models built using other algorithms regardless of the number of features included.

**Fig. 2: Feature selection and final model identification in the discovery cohort.**

The RF model, incorporating six features, platelet distribution width (PDW), monocyte%, platelet large cell ratio (PLCR), monocyte count, hemoglobin (HG), and basophil count, retained near-optimal discriminatory ability, as reflected by global SHAP analysis (Fig. 2B) and feature distribution/correlation analysis (Fig. 2C).

The 6-feature RF model achieved an AUC of 0.85 (Fig. 2D) and an AUPRC of 0.84, with a PPV of 0.76, NPV of 0.71, accuracy of 0.73, and F1 score of 0.75 (Table S6). The confusion matrix (Fig. 2E) revealed a sensitivity of 0.81 and specificity of 0.75.

External validation and comparative performance of the final RF model versus IL-6, IL-10, and the IL-10/IL-6 ratio

In validation cohort 1, the 6-feature RF model achieved an AUC of 0.80 (Fig. 3A), an AUPRC of 0.84 (Fig. 3B), and an accuracy of 0.68 (Fig. 3C). The sensitivity and specificity were 0.60 and 0.75, respectively. Similar results were obtained in validation cohort 2 (AUC 0.80, AUPRC 0.69; Fig. 3D–F). In validation cohort 3, which included PVRL-CNS patients, the model achieved an AUC of 0.83 and an AUPRC of 0.72, with 0.79 sensitivity and 0.79 specificity (Fig. 3G).

**Fig. 3: Validation of the final random forest (RF) model in three independent cohorts.**

In the differentiation cohort distinguishing PVRL from uveitis, the RF model maintained high performance (AUC 0.81, AUPRC 0.76; Fig. 3H–J), confirming its robustness across diverse clinical settings.

As summarized in Tables S7 and S8, no significant differences in age or sex distribution were observed between the PVRL patients and controls across all cohorts. In aqueous humor, PVRL patients presented significantly elevated IL-10 levels and IL-10/IL-6 ratios and decreased IL-6 levels (all P < 0.05). The AUCs for IL-6, IL-10, and IL-10/IL-6 were 0.65, 0.66, and 0.77, respectively (Figs. S3A, B and 3K), and those for the PVRL-CNS were 0.67, 0.75, and 0.78, respectively (Figs. S3E, F and 3K). In the vitreous humor, similar trends were observed (P < 0.05), with AUCs of 0.69, 0.74, and 0.77 (Figs. S3C, D and 3L) and 0.72, 0.74, and 0.74 for the PVRL-CNS (Figs. S3G, H and 3L). The RF model consistently outperformed IL-6, IL-10, and the IL-10/IL-6 ratio across all analyses (DeLong test, P < 0.05).

Model clinical utility, relevance, and interpretability

Decision curve analysis (DCA) demonstrated that the 6-feature RF model provided consistent net clinical benefit across a broad range of threshold probabilities (Fig. S4A–E). The calibration curves further confirmed the model’s robust screening performance (Fig. S4F–H).

To evaluate parameter dynamics during treatment, 29 PVRL patients were followed for six months (study design: Fig. S5A). Fifteen patients were classified as nonresponders, and 14 were classified as responders. In nonresponders, PDW, HG, and the PLCR significantly decreased, whereas the basophil count, monocyte count, and monocyte percentage increased (P < 0.05; Fig. S5B). Conversely, in responders, PDW, HG, and the PLCR increased, whereas the monocyte count and monocyte percentage decreased (P < 0.05; Fig. S5C).

SHAP analysis elucidated model decision-making: the global SHAP summary plot ranked feature importance (Fig. 4A), and local SHAP explanations provided patient-specific interpretability (Fig. 4B, C). The probability of PVRL was calculated according to the 6-feature RF model. The optimal cutoff value was determined to be 0.85 based on the Youden index. Probability values of 0.85 or greater denote high risk, and values of less than 0.85 denote low risk (Fig. 4D). To enhance clinical adoption, an interactive web application was developed (https://primary-vitreoretinal-lymphoma-prediction-app.streamlit.app/), enabling real-time risk prediction based on input of the six-feature values (Fig. 4E).

**Fig. 4: Interpretability of the six-feature random forest (RF) model.**

Hospital-based prospective cohort study validation of the final RF model

A total of 100,526 participants aged 18–92 years were enrolled for PVRL screening (Fig. 5A). Of these, 94,935 met the eligibility criteria. The detailed distributions of the categorized diseases are provided in Table S9. The participants’ data were entered in real time into an online web application hosted at https://primary-vitreoretinal-lymphoma-prediction-app.streamlit.app/ for PVRL risk assessment. A scatter plot (Fig. 5B) of the predicted probabilities based on the RF model is shown for all included individuals. On the basis of the screening results, 77 individuals were identified as high risk for PVRL (predicted probability ≥ 0.85). After excluding 11 individuals for various reasons, 66 high-risk individuals were referred to the Department of Ophthalmology for further evaluation. Among them, 38 cases were confirmed as PVRL by vitreous biopsy. Among the remaining 28 patients, 8 were diagnosed with diffuse large B-cell lymphoma, 14 had other ocular or systemic conditions, and 6 were diagnosed with mucosa-associated lymphoid tissue lymphoma. Among the 94,858 individuals classified as low risk (predicted probability less than 0.85), 12,248 were excluded for various reasons, and 83,610 low-risk individuals were referred to the Department of Ophthalmology & Otorhinolaryngology for further evaluation. Only 2 cases of PVRL were identified among them at the Eye and ENT Hospital. The final RF model demonstrated a sensitivity of 95.0%, a specificity of 99.97%, a PPV of 57.6%, and an NPV of 99.99%.
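The reported screening metrics follow directly from the counts given above. The following is a minimal sketch of that arithmetic, assuming the 28 unconfirmed high-risk referrals are treated as false positives and the 83,608 low-risk individuals without PVRL as true negatives; the names `ConfusionCounts` and `screening_metrics` are illustrative and do not come from the authors' code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    tp: int  # high risk, PVRL confirmed
    fp: int  # high risk, PVRL not confirmed
    fn: int  # low risk, PVRL later identified
    tn: int  # low risk, no PVRL identified


def screening_metrics(c: ConfusionCounts) -> dict:
    """Standard screening metrics computed from a 2x2 confusion table."""
    return {
        "sensitivity": c.tp / (c.tp + c.fn),
        "specificity": c.tn / (c.tn + c.fp),
        "ppv": c.tp / (c.tp + c.fp),
        "npv": c.tn / (c.tn + c.fn),
    }


# Counts taken from the hospital-based cohort description:
# 66 high-risk referrals with 38 confirmed, 83,610 low-risk with 2 PVRL cases.
hospital = ConfusionCounts(tp=38, fp=66 - 38, fn=2, tn=83_610 - 2)
metrics = screening_metrics(hospital)

for name, value in metrics.items():
    print(f"{name}: {100 * value:.3f}%")
```

Under these assumptions the sketch reproduces the sensitivity (38/40), PPV (38/66), and specificity (83,608/83,636) reported above; the NPV evaluates to 83,608/83,610, which agrees with the reported 99.99% if that figure was truncated rather than rounded.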

Community-based cross-sectional study validation of the final RF model

A total of 515,326 participants aged 40–88 years were enrolled in the PVRL screening program (Fig. 5C). Among them, 511,786 met the eligibility criteria, and their data were entered in real time into an online web application hosted at https://primary-vitreoretinal-lymphoma-prediction-app.streamlit.app/ for PVRL risk assessment. A scatter plot (Fig. 5D) of the predicted probabilities based on the RF model is shown for all included individuals. Based on the model’s output, 22 individuals were identified as being at high risk for PVRL.

These high-risk participants were referred to the Department of Ophthalmology for further evaluation, resulting in the confirmation of 13 PVRL cases. Owing to the cross-sectional nature of the community-based study, follow-up confirmation was not conducted for participants in the low-risk group. The final RF model demonstrated a PPV of 59.1%.

Discussion

In this study, a six-feature-based RF ML model using complete blood count parameters was successfully developed and validated, and a noninvasive screening tool for PVRL was established. The model demonstrated robust screening accuracy in the discovery cohort, three independent validation cohorts, and the diagnostic differentiation cohort, significantly outperforming conventional biomarkers in the vitreous humor/aqueous humor, such as the IL-10/IL-6 ratio. Longitudinal monitoring further validated the biological relevance of the selected features, with dynamic blood parameter changes aligning with treatment responses. More importantly, the performance of the PVRL screening model was validated through two large sample cohort studies. These findings address a critical unmet need in PVRL screening.

PVRL diagnosis is often delayed because of nonspecific symptoms and the reliance on invasive procedures, such as vitreous biopsy^21,22. Our model overcomes these limitations by leveraging the CBC, making this model cost-effective, widely accessible, and minimally invasive. The six selected CBC parameters, namely, PDW, monocyte percentage, PLCR, monocyte count, HG, and basophil count, are routinely measured in clinical practice, making this tool particularly valuable for primary care settings¹⁴. The deployment of a freely accessible web application enhances real-time risk assessment, enabling rapid decision-making in resource-limited environments or when ocular sampling is unavailable. Longitudinal analysis revealed that nonresponders had decreases in PDW, HG, and PLCR alongside increases in posttreatment monocyte and basophil counts, mirroring the model’s feature trends and reinforcing their association with disease activity. Further research is needed to elucidate the underlying mechanisms driving these associations.

While the IL-10/IL-6 ratio in the vitreous humor/aqueous humor remains a diagnostic cornerstone for PVRL, its performance (AUC: 0.65–0.78) was significantly inferior to that of our blood-based model (P < 0.05). This disparity may arise from the inherent limitations of ocular sampling: low cellular yield, rapid degradation of biomarkers, and technical variability. In contrast, blood parameters capture systemic immune responses and tumor–host interactions, offering a more comprehensive profile. Furthermore, standardized blood testing minimizes operational variability, increasing reproducibility across clinical settings.

Furthermore, Gozzi et al.²³ developed a classification model using Python and XGBoost; in their model, 87% of eyes were correctly diagnosed as PVRL or uveitis (including Fuchs uveitis, sarcoidosis uveitis, Behçet uveitis, and uveitis of unknown origin). Consistent with that study, which demonstrated that radiomic analysis of anterior segment optical coherence tomography images can noninvasively distinguish PVRL from uveitis of various etiologies, our findings further support the potential of machine learning–based approaches in assisting the differential diagnosis.

The multicentre case‒control design, coupled with validation in two large-sample cohorts, reinforces the model’s generalizability and clinical reliability. However, several limitations should be acknowledged. First, the relatively small sample size of the PVRL-CNS cohort constrains the model’s predictive power for CNS involvement. Second, owing to the cross-sectional design of the community-based study, follow-up confirmation was not performed for individuals in the low-risk group. Consequently, the model’s sensitivity may be overestimated, whereas its specificity and positive predictive value may be underestimated. Third, some patients classified as low risk by our model did not undergo gold standard biopsy-based confirmation of PVRL status, particularly in the hospital-based and community-based validation cohorts. Consequently, some patients categorized as low risk may in fact have had undiagnosed PVRL, representing potential false negatives. This inherent limitation in diagnostic verification could have led to an underestimation of the true prevalence of PVRL and may have affected the reported performance metrics of our predictive model. Last, a limitation of this study is that, in both the hospital-based prospective cohort and the community-based cross-sectional screening program, PVRL screening was restricted to individuals aged over 40 years. Although the model was developed and validated in cohorts with a broader age range (≥18 years), its performance in younger (<40-year-old) populations remains to be extensively evaluated in large-scale, population-based settings.

By integrating machine learning models with routine blood test data, this study proposes a noninvasive and high-accuracy screening tool for PVRL. Nevertheless, it should be emphasized that any CBC-based machine learning approach for PVRL screening must be confirmed by positive pathological findings from a vitreous biopsy. In clinical practice, such a tool has the potential to markedly reduce diagnostic delays, facilitate timely intervention, and ultimately improve outcomes in patients with this aggressive malignancy. Future research should aim to elucidate the underlying biological mechanisms and evaluate the model’s applicability in therapeutic monitoring, thereby maximizing its transformative potential in ophthalmic oncology.

Methods

Ethical considerations

This study received ethical approval from the institutional review boards of the participating institutions: Huashan Hospital of Fudan University (2023-515), Xuhui Central Hospital (2018025), the Eye and ENT Hospital of Fudan University (2020[2020013]), and Wanbei Coal-Electricity Group General Hospital (WBZY-LLWYH-2024-21). The study was conducted in accordance with the Declaration of Helsinki and the Ethical Guidelines for Medical and Health Research Involving Human Participants. Given that the multicentre case–control study was noninterventional and retrospective in nature, the institutional review boards waived the requirement for informed consent. In the prospective hospital-based screening phase, all participants provided written informed consent, signed before sample collection and study enrollment. Sex of participants was recorded as male or female according to hospital registration data, which were based on self-reported information at the time of admission. Due to the limited sample size, no sex- and/or gender-based analyses were performed.

Study design and participants

The PVRL screening model was developed through a multicentre, case‒control study in which patients were systematically identified and categorized based on predefined diagnostic criteria. The study was conducted in 4 hospitals across China between January 1, 2016, and June 30, 2024. The discovery cohort comprised PVRL patients from the Eye and ENT Hospital of Fudan University. The first validation cohort included PVRL patients from Wanbei Coal-Electricity Group General Hospital and Xuhui Central Hospital of Fudan University, and the second validation cohort included PVRL patients from Huashan Hospital of Fudan University. The third validation cohort included PVRL patients with CNS involvement from Huashan Hospital of Fudan University. Healthy controls were recruited from the health examination centres of the same hospitals where the PVRL patients were enrolled. A one-to-one exact matching strategy was applied, with cases and controls matched on age and sex, to ensure comparable demographic characteristics between groups and to minimize potential confounding effects. Finally, 100 PVRL participants and 117 normal controls were included in the discovery cohort (Fig. S6A); 42 PVRL participants and 60 normal controls were included in the first validation cohort (Fig. S6B); 36 PVRL participants and 42 normal controls were included in the second validation cohort (Fig. S6C); and 77 PVRL-CNS participants and 73 normal controls were included in the third validation cohort (Fig. S6D).

Owing to the heterogeneous and often nonspecific clinical presentation of PVRL, it is frequently misdiagnosed as uveitis. To address this, a diagnostic differentiation cohort was established, consisting of 155 patients with PVRL and 158 with uveitis. Notably, the 155 PVRL patients overlapped with those in the previously described validation cohorts 1, 2, and 3. The uveitis patients were recruited from the Eye and ENT Hospital of Fudan University. Among the 158 patients with uveitis, 132 had endogenous, non-infectious causes, such as Fuchs uveitis, sarcoidosis uveitis, and Behçet uveitis; 26 had exogenous, infectious causes, including bacterial endophthalmitis–associated uveitis (n = 7) and herpetic uveitis (n = 19).

A follow-up cohort was established to assess dynamic changes in CBC parameters during PVRL treatment. In total, 29 PVRL patients were included, with overlap with the previously described PVRL patients from the Eye and ENT Hospital of Fudan University.

To further evaluate the diagnostic performance of CBC tests and interleukins in aqueous/vitreous humor for PVRL, multiple cohorts from the Eye and ENT Hospital and Huashan Hospital were included. Notably, these PVRL participants are also part of the previously described cohorts from the Eye and ENT Hospital and Huashan Hospital of Fudan University.

The performance of the PVRL screening model was validated through two large sample cohort studies: one hospital-based prospective cohort study and one community-based cross-sectional study.

A prospective screening program was initiated at the Eye and ENT Hospital of Fudan University. In brief, all patients presenting to the hospital were sequentially enrolled from October 2024 to May 2025. CBC tests were performed for all eligible participants. Individuals who screened positive for PVRL using the established screening model were further evaluated and confirmed by histopathological examination of biopsy samples. As a tertiary referral center for PVRL, the Eye and ENT Hospital of Fudan University receives a substantial number of patients with suspected disease. As a result, the prevalence of PVRL in this prospective screening program is significantly higher than that in the general population.

In parallel, a community-based cross-sectional screening program was launched in Xuhui District, Shanghai, in July 2024. Residents of Xuhui District were sequentially invited to participate through local community health service centres. CBC tests were conducted for all eligible participants, and those who screened positive for PVRL were referred for confirmatory diagnosis via histopathological analysis of biopsy samples. Due to the low prevalence of PVRL and its higher incidence in the elderly population, we limited this screening program to individuals aged over 40 years to enrich the prevalence of PVRL within the community-based cross-sectional screening cohort. As a result, the prevalence of PVRL in this community-based cross-sectional screening program is significantly higher than that in the general population.

The detailed study design and participant information are provided in the Supplementary Materials. The study design is illustrated in Fig. 1A.

Diagnostic criteria for PVRL

The diagnosis of PVRL was established based on the 2016 World Health Organization classification of lymphoid neoplasms²⁴. All patients with PVRL or CNS lymphoma (CNSL) involvement underwent biopsy procedures. PVRL was diagnosed based on positive pathological findings from vitreous biopsy, following established standards for vitreous sampling and biopsy-based diagnostic criteria^9,25. Undiluted vitreous samples were obtained via dry vitrectomy and processed for cytological analysis. A positive pathological finding was defined as follows: conventional smear cytology demonstrating large lymphoid cells with irregularly shaped nuclei, multiple prominent nucleoli, and scant basophilic cytoplasm, typically accompanied by small reactive T lymphocyte infiltration, consistent with large lymphomatous cells.

CNSL was diagnosed when a positive pathological finding was obtained from CNS tissue biopsy. Patients in whom CNSL was ruled out at the time of PVRL diagnosis were classified as having isolated PVRL. Patients diagnosed simultaneously with CNSL were classified as having PVRL with concurrent CNS involvement. In accordance with the guidelines of the International Primary CNS Lymphoma Collaborative Group²⁶ and the European Association of Neuro-Oncology²⁷, all patients underwent 2-deoxy-2-[F-18]fluoro-D-glucose positron emission tomography/computed tomography (FDG PET/CT) and/or bone marrow aspiration to exclude systemic lymphoma involvement.

Model development

In total, 30 features were employed to develop the diagnostic models. These features included: neutrophil count, neutrophil percentage, red blood cell count, thrombocytocrit, platelet count (PLT), PDW, HG, eosinophil count, eosinophil percentage, basophil count, basophil percentage, mean platelet volume, lymphocyte count, lymphocyte percentage, hematocrit, monocyte count, monocyte percentage, PLCR, white blood cell count, red blood cell distribution width—standard deviation, red blood cell distribution width—coefficient of variation, mean corpuscular volume, mean corpuscular hemoglobin concentration, mean corpuscular hemoglobin, platelet-to-lymphocyte ratio (PLT/lymphocyte count), neutrophil-to-lymphocyte ratio, lymphocyte-to-monocyte ratio, PLT × neutrophil-to-lymphocyte ratio, PLT × neutrophil × monocyte-to-lymphocyte ratio, and neutrophil × monocyte × PLT-to-lymphocyte ratio.
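Several of the 30 features are ratios derived from primary CBC counts. The sketch below shows how such derived indices could be computed, assuming the conventional definitions (the text itself only spells out PLT/lymphocyte count for the platelet-to-lymphocyte ratio). The function name and dictionary keys are illustrative, and the two triple-product indices are left out because their exact form is not specified in the text.

```python
def derived_ratios(plt: float, neutrophils: float,
                   lymphocytes: float, monocytes: float) -> dict:
    """Compute ratio features from primary CBC counts.

    All inputs are absolute counts in the same unit (e.g. 10^9/L).
    Definitions other than PLR are the conventional ones and are an
    assumption here, not a statement of the authors' implementation.
    """
    if lymphocytes <= 0 or monocytes <= 0:
        raise ValueError("lymphocyte and monocyte counts must be positive")
    nlr = neutrophils / lymphocytes
    return {
        "PLR": plt / lymphocytes,        # platelet-to-lymphocyte ratio
        "NLR": nlr,                      # neutrophil-to-lymphocyte ratio
        "LMR": lymphocytes / monocytes,  # lymphocyte-to-monocyte ratio
        "PLT_x_NLR": plt * nlr,          # PLT x neutrophil-to-lymphocyte ratio
    }


# Illustrative values only; not patient data.
example = derived_ratios(plt=250.0, neutrophils=4.0, lymphocytes=2.0, monocytes=0.5)
print(example)
```

Guarding against zero denominators matters in practice because a zero count would otherwise produce an infinite or undefined feature value before imputation.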

Initially, features exhibiting a correlation coefficient exceeding 0.8 were removed to reduce multicollinearity. In this study, all included variables showed correlations below 0.8. To handle missing values, the median imputation method was employed²⁸. Twelve ML models—AdaBoost, DT, GLM, GBM, K-nearest neighbors, LightGBM, multilayer perceptron classifier, naïve Bayes, RF, support vector machine, TabNet, and XGBoost—were utilized, each leveraging its unique strengths to increase model diversity. To optimize the predictive models, we employed a two-stage hyperparameter tuning strategy. First, a grid search was used to define multiple candidate hyperparameter configurations. A nested 5-fold cross-validation was then performed for each candidate configuration in the discovery cohort. Specifically, the discovery cohort was evenly split into 5 folds, with 1 fold used as the validation set and the remaining four as the training set. Each candidate configuration was applied consistently across all 5 folds, meaning that the cross-validation was used to evaluate the performance of pre-specified hyperparameters rather than to generate different hyperparameters for each fold. The same hyperparameter settings were used to train the model on each of the five training folds and to evaluate it on the corresponding validation fold, resulting in five AUC values for each candidate configuration. Second, the mean-optimal configuration was identified by calculating the mean AUC across all folds for each candidate configuration. The configuration achieving the highest mean AUC was selected as the final hyperparameter set. Minor manual fine-tuning (e.g., adjustment of learning rate or max depth) was performed only around this final configuration to enhance stability. The final selection criterion remained the highest mean performance from the 5-fold cross-validation, ensuring the adoption of a robust and generalizable hyperparameter configuration. 
The final model, trained with the optimized hyperparameters, was then applied to the validation cohorts to generate the reported results. The final hyperparameters of these twelve ML models are shown in Table S10. Each method was chosen for its balance of simplicity, which allows for interpretable results, and power, which enables the modeling of complex linear and nonlinear interactions among input features. Modeling was conducted in Python (v 3.11). The code is publicly available via GitHub–Zenodo and can be accessed at https://zenodo.org/records/17189239.
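The selection rule described above (evaluate every pre-specified configuration on the same 5 folds, then keep the one with the highest mean AUC) can be sketched in plain Python. This is an illustration of the selection logic only, not the authors' code: `k_fold_indices`, `select_config`, and `toy_score` are hypothetical names, and `toy_score` is a stand-in that returns a made-up AUC instead of fitting a model.

```python
import random
from statistics import mean


def k_fold_indices(n_samples, k=5, seed=0):
    """Shuffle sample indices once and deal them into k near-equal folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def select_config(configs, n_samples, score_fold, k=5, seed=0):
    """Return the configuration name with the highest mean score over k folds.

    score_fold(config, train_idx, val_idx) should fit a model on train_idx
    and return its AUC on val_idx. The same folds are reused for every
    configuration so that the mean scores are directly comparable.
    """
    folds = k_fold_indices(n_samples, k, seed)
    mean_scores = {}
    for name, config in configs.items():
        fold_scores = []
        for i, val_idx in enumerate(folds):
            train_idx = [j for f, fold in enumerate(folds) if f != i for j in fold]
            fold_scores.append(score_fold(config, train_idx, val_idx))
        mean_scores[name] = mean(fold_scores)
    best_name = max(mean_scores, key=mean_scores.get)
    return best_name, mean_scores


def toy_score(config, train_idx, val_idx):
    """Stand-in scorer: a made-up base AUC plus a small fold-dependent offset."""
    fold_offset = 0.01 * (min(val_idx) % 3)
    return config["stub_auc"] + fold_offset


# Hypothetical grid; the stub_auc values are invented for illustration.
candidate_configs = {
    "depth_3": {"max_depth": 3, "stub_auc": 0.80},
    "depth_5": {"max_depth": 5, "stub_auc": 0.84},
    "depth_9": {"max_depth": 9, "stub_auc": 0.82},
}
best_name, mean_scores = select_config(candidate_configs, n_samples=217,
                                       score_fold=toy_score)
print(best_name)
```

In a real pipeline `score_fold` would train the classifier with the given hyperparameters on the training indices and compute the AUC on the held-out fold; 217 is used here only because it matches the size of the discovery cohort.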

Model performance evaluation

The models were assessed using various metrics, including sensitivity, specificity, PPV, NPV, F1 score, and accuracy. Their classification capabilities were further evaluated using the area under the receiver operating characteristic curve and the area under the precision‒recall curve. To compare the AUC values across different ML models, the DeLong test²⁹, a nonparametric method, was utilized.
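For readers less familiar with the AUC, it equals the probability that a randomly chosen case receives a higher predicted score than a randomly chosen control, with ties counted as one half. A minimal pure-Python sketch of that definition follows; the function name `roc_auc` and the example values are illustrative and not taken from the study.

```python
def roc_auc(labels, scores):
    """AUC via pairwise comparison of positive and negative scores.

    labels: iterable of 0/1 (1 = case), scores: predicted probabilities.
    Quadratic in sample size, which is fine for illustration.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a concordant pair
    return wins / (len(pos) * len(neg))


# Illustrative values only: 8 of the 9 case-control pairs are ranked correctly.
demo_labels = [1, 1, 1, 0, 0, 0]
demo_scores = [0.9, 0.8, 0.4, 0.5, 0.3, 0.2]
demo_auc = roc_auc(demo_labels, demo_scores)
print(demo_auc)
```

This pairwise form is equivalent to the Mann–Whitney U statistic scaled by the number of case-control pairs, which is also the quantity whose variance the DeLong test estimates when comparing two AUCs.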

Feature selection and model explanation

Feature selection was performed in the discovery cohort using the SHAP method³⁰ to reduce the number of predictors while maintaining optimal model performance. We applied a sequential forward-selection procedure guided by SHAP importance and statistical evaluation as follows: (1) ranking by SHAP importance: a model was trained on the full discovery cohort to compute mean SHAP values for all 30 features, generating a global importance ranking; (2) iterative model building: starting from the top-ranked feature, models were built incrementally by adding one feature at a time in descending SHAP order (top 1, top 2, …, up to all 30 features); (3) performance evaluation: the AUC for each model was calculated using 5-fold cross-validation to obtain stable estimates; (4) statistical comparison: each model’s performance was compared to that of the full 30-feature model using the DeLong test²⁹, a nonparametric method; (5) final selection: to balance parsimony and predictive performance, we selected the model with the fewest features whose performance was statistically non-inferior or superior to that of the full 30-feature model. The SHAP method supplies both global and local explanations for the model. The global explanation delivered consistent and precise attribution values for each feature, emphasizing the connections between the input features and PVRL. Conversely, the local explanation demonstrated specific predictions for individual PVRL cases on the basis of their respective input data.
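Steps (2) to (5) of this procedure amount to a search for the shortest prefix of the importance ranking that is not worse than the full model. The sketch below shows that control flow under loud assumptions: `evaluate_auc` and `is_noninferior` are hypothetical callbacks, the fixed-margin comparison is a stand-in for the DeLong test used in the study, and the feature names and AUC values are invented for illustration.

```python
def select_fewest_features(ranked_features, evaluate_auc, is_noninferior):
    """Return the shortest top-k prefix whose AUC is non-inferior to the full model.

    ranked_features: feature names sorted by descending importance.
    evaluate_auc(features) -> cross-validated AUC for that feature subset.
    is_noninferior(auc_subset, auc_full) -> bool (statistical test stand-in).
    """
    full_auc = evaluate_auc(ranked_features)
    for k in range(1, len(ranked_features) + 1):
        subset = ranked_features[:k]
        subset_auc = evaluate_auc(subset)
        if is_noninferior(subset_auc, full_auc):
            return subset, subset_auc
    return list(ranked_features), full_auc


# Invented ranking and AUCs, keyed by number of features used.
ranking = ["f1", "f2", "f3", "f4", "f5", "f6", "f7", "f8"]
toy_auc_by_size = {1: 0.60, 2: 0.68, 3: 0.74, 4: 0.79,
                   5: 0.81, 6: 0.85, 7: 0.85, 8: 0.85}


def toy_evaluate_auc(features):
    """Stand-in for 5-fold cross-validated AUC of a model on `features`."""
    return toy_auc_by_size[len(features)]


def margin_noninferior(auc_subset, auc_full, margin=0.02):
    """Fixed-margin rule used here in place of a DeLong test."""
    return auc_subset >= auc_full - margin


selected, selected_auc = select_fewest_features(
    ranking, toy_evaluate_auc, margin_noninferior)
print(len(selected), selected_auc)
```

With the invented numbers above, the five-feature prefix falls outside the margin and the six-feature prefix is the first to qualify, which mirrors the shape of the result reported for the RF model without reproducing its actual statistics.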

Web-based model deployment

To facilitate clinical implementation, the final model was deployed as a web application using the Streamlit framework in Python. By inputting the relevant feature values, the application provides a probability estimate for PVRL and generates a personalized force plot, enhancing interpretability for individual cases.

Statistical analysis

All statistical analyses were conducted using GraphPad Prism (version 10) and Python (version 3.11). The normality of the data distribution was assessed via the Kolmogorov–Smirnov test. For normally distributed continuous variables, paired or independent Student’s t tests were applied, whereas the Kruskal–Wallis test was used for nonnormally distributed data. Categorical variables were analysed using the chi-square test where appropriate. Continuous variables are reported as means ± standard deviations (SDs), and categorical variables are expressed as counts and percentages. Receiver operating characteristic (ROC) curve analysis was performed to evaluate the diagnostic value of interleukins, with the AUC used to assess overall diagnostic performance. The optimal cutoff value was determined using the Youden index (sensitivity + specificity − 1), and the sensitivity and specificity at that cutoff are reported. A two-tailed P value < 0.05 was considered statistically significant.
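The Youden-index cutoff described above can be illustrated with a short Python sketch that scans every observed value as a candidate threshold and keeps the one maximising sensitivity + specificity − 1. The function name `youden_optimal_cutoff`, the rule that a sample is positive when its score is at or above the cutoff, and the toy scores and labels are all assumptions made for illustration; they are not study data.

```python
from typing import Sequence


def youden_optimal_cutoff(
    scores: Sequence[float], labels: Sequence[int]
) -> tuple[float, float, float, float]:
    """Return (cutoff, J, sensitivity, specificity) for the cutoff that
    maximises Youden's J, calling a sample positive when score >= cutoff."""
    positives = sum(labels)
    negatives = len(labels) - positives
    best: tuple[float, float, float, float] | None = None
    for cutoff in sorted(set(scores)):
        tp = sum(1 for s, y in zip(scores, labels) if y == 1 and s >= cutoff)
        tn = sum(1 for s, y in zip(scores, labels) if y == 0 and s < cutoff)
        sensitivity = tp / positives
        specificity = tn / negatives
        j = sensitivity + specificity - 1
        if best is None or j > best[1]:   # ties keep the lowest cutoff
            best = (cutoff, j, sensitivity, specificity)
    assert best is not None, "scores must not be empty"
    return best


# Toy data (assumption): eight samples, label 1 = case, 0 = control.
toy_scores = [0.1, 0.2, 0.3, 0.4, 0.6, 0.7, 0.8, 0.9]
toy_labels = [0, 0, 0, 1, 1, 0, 1, 1]

cutoff, j, sens, spec = youden_optimal_cutoff(toy_scores, toy_labels)
print(f"cutoff={cutoff}, J={j:.2f}, sensitivity={sens:.2f}, specificity={spec:.2f}")
```

For the toy data, the cutoff 0.4 captures all four cases while misclassifying one of the four controls, giving J = 0.75.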

Reporting summary

Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.

Data availability

Access to comprehensive, individual-level clinical data is restricted to protect sensitive human subject information and to comply with the terms of informed consent. Requests for access to the de-identified data can be made to the corresponding author by providing a study protocol and ethical approval documentation. Requests for raw data will be reviewed by the corresponding author, who will facilitate further communications with the cohort leaders and relevant ancillary study committees as appropriate. The corresponding author typically responds to such requests within 8 weeks. A data use agreement will be established between the requesting party and the data holder, specifying that the data may only be used for the pre-specified project described in the request and that any resulting manuscript must reference the data source. The data generated in this study are provided in the Supplementary Information and the Source Data file. Source data are provided with this paper.

Code availability

The analysis code was written in Python (version 3.11) and relies on standard open-source libraries (NumPy, Pandas, scikit-learn, XGBoost), used in compliance with their MIT and BSD licenses. All reused components retain their original license and attribution. The code for the model development and the statistical analyses is available at https://github.com/fudanRenjun/Primary-vitreoretinal-lymphoma/tree/master and has been archived in Zenodo at https://zenodo.org/records/17189239³¹. The repository is released under the MIT License and is freely accessible without restrictions.

References

Akpek, E. K. et al. Intraocular-central nervous system lymphoma: clinical features, diagnosis, and outcomes. Ophthalmology 106, 1805–1810 (1999).
Coupland, S. E., Anastassiou, G., Bornfeld, N., Hummel, M. & Stein, H. Primary intraocular lymphoma of T-cell type: report of a case and review of the literature. Graefes Arch. Clin. Exp. Ophthalmol. 243, 189–197 (2005).
Chan, C.-C. et al. Primary vitreoretinal lymphoma: a report from an international primary central nervous system lymphoma collaborative group symposium. Oncologist 16, 1589–1599 (2011).
Sagoo, M. S. et al. Primary intraocular lymphoma. Surv. Ophthalmol. 59, 503–516 (2014).
Grimm, S. A. et al. Primary intraocular lymphoma: an international primary central nervous system lymphoma collaborative group report. Ann. Oncol. 18, 1851–1855 (2007).
Melli, B. et al. Primary vitreoretinal lymphoma: current diagnostic laboratory tests and new emerging molecular tools. Curr. Oncol. 29, 6908–6921 (2022).
Huang, R. S. et al. Diagnostic methods for primary vitreoretinal lymphoma: a systematic review. Surv. Ophthalmol. 69, 456–464 (2024).
Pulido, J. S., Johnston, P. B., Nowakowski, G. S., Castellino, A. & Raja, H. The diagnosis and treatment of primary vitreoretinal lymphoma: a review. Int. J. Retin. Vitr. 4, 18 (2018).
Soussain, C., Malaise, D. & Cassoux, N. Primary vitreoretinal lymphoma: a diagnostic and management challenge. Blood 138, 1519–1534 (2021).
Langerak, A. W. et al. EuroClonality/BIOMED-2 guidelines for interpretation and reporting of Ig/TCR clonality testing in suspected lymphoproliferations. Leukemia 26, 2159–2171 (2012).
Chapuy, B. et al. Molecular subtypes of diffuse large B cell lymphoma are associated with distinct pathogenic mechanisms and outcomes. Nat. Med. 24, 679–690 (2018).
Calimeri, T. et al. Molecular diagnosis of primary CNS lymphoma in 2024 using MYD88Leu265Pro and IL-10. Lancet Haematol. 11, e540–e549 (2024).
Whitcup, S. M. et al. Association of interleukin 10 in the vitreous and cerebrospinal fluid and primary central nervous system lymphoma. Arch. Ophthalmol. 115, 1157–1160 (1997).
Horton, S. et al. The top 25 laboratory tests by volume and revenue in five different countries. Am. J. Clin. Pathol. 151, 446–451 (2019).
Foy, B. H. et al. Haematological setpoints are a stable and patient-specific deep phenotype. Nature 637, 430–438 (2025).
Li, D. et al. Prognostic significance of pretreatment red blood cell distribution width in primary diffuse large B-cell lymphoma of the central nervous system for 3P medical approaches in multiple cohorts. EPMA J. 13, 499–517 (2022).
Li, S. et al. Proposed new prognostic model using the systemic immune-inflammation index for primary central nervous system lymphoma: a prospective-retrospective multicohort analysis. Front. Immunol. 13, 1039862 (2022).
Le, M. et al. Pretreatment hemoglobin as an independent prognostic factor in primary central nervous system lymphomas. Oncologist 24, e898–e904 (2019).
Tsimberidou, A. M. et al. Assessment of chronic lymphocytic leukemia and small lymphocytic lymphoma by absolute lymphocyte counts in 2126 patients: 20 years of experience at the University of Texas M.D. Anderson Cancer Center. J. Clin. Oncol. 25, 4648–4656 (2007).
Rodday, A. M. et al. The advanced-stage Hodgkin lymphoma international prognostic index: development and validation of a clinical prediction model from the HoLISTIC consortium. J. Clin. Oncol. 41, 2076–2086 (2023).
Cassoux, N. et al. Ocular and central nervous system lymphoma: clinical features and diagnosis. Ocul. Immunol. Inflamm. 8, 243–250 (2000).
Hoffman, P. M., McKelvie, P., Hall, A. J., Stawell, R. J. & Santamaria, J. D. Intraocular lymphoma: a series of 14 patients with clinicopathological features and treatment outcomes. Eye 17, 513–521 (2003).
Gozzi, F. et al. Artificial intelligence-assisted processing of anterior segment OCT images in the diagnosis of vitreoretinal lymphoma. Diagnostics 13, 2451 (2023).
Arber, D. A. et al. The 2016 revision to the World Health Organization classification of myeloid neoplasms and acute leukemia. Blood 127, 2391–2405 (2016).
Coupland, S. E. Analysis of intraocular biopsies. Dev. Ophthalmol. 49, 96–116 (2012).
Abrey, L. E. et al. Report of an international workshop to standardize baseline evaluation and response criteria for primary CNS lymphoma. J. Clin. Oncol. 23, 5034–5043 (2005).
Hoang-Xuan, K. et al. Diagnosis and treatment of primary CNS lymphoma in immunocompetent patients: guidelines from the European Association for Neuro-Oncology. Lancet Oncol. 16, e322–e332 (2015).
Berkelmans, G. F. N. et al. Population median imputation was noninferior to complex approaches for imputing missing values in cardiovascular prediction models in clinical practice. J. Clin. Epidemiol. 145, 70–80 (2022).
Grandini, M., Bagli, E. & Visani, G. Metrics for multi-class classification: an overview. Preprint at https://doi.org/10.48550/arXiv.2008.05756 (2020).
Hu, J. et al. Identification and validation of an explainable prediction model of acute kidney injury with prognostic implications in critically ill children: a prospective multicenter cohort study. EClinicalMedicine 68, 102409 (2024).
fudanRenjun. fudanRenjun/primary-vitreoretinal-lymphoma: a noninvasive machine learning model using a complete blood count for screening of primary vitreoretinal lymphoma. Preprint at https://doi.org/10.5281/zenodo.17189239 (2025).

Acknowledgements

This work was supported by the National Natural Science Foundation of China (82302582) awarded to S.J.L., the Shanghai Municipal Health Commission Project (20224Y0317) awarded to S.J.L., and the Industry-University-Research Innovation Fund for Chinese Universities (2023JQ006) awarded to W.J.C. The sponsor or funding organization had no role in the design or conduct of this research. We gratefully acknowledge the dedicated support provided by the staff of the community health service centres in Xuhui District, whose contributions were essential to the successful completion of this research.

Author information

These authors contributed equally: Shengjie Li, Jiazhen Cao, Danhui Li.

Authors and Affiliations

Department of Clinical Laboratory, Eye & ENT Hospital, Fudan University, Shanghai, China
Shengjie Li, Jun Ren, Jianing Wu, Yingzhu Li & Wenjun Cao
Eye Institute and Department of Ophthalmology, Eye & ENT Hospital, Fudan University, Shanghai, China
Shengjie Li & Wenjun Cao
Department of Laboratory Medicine, Huashan Hospital, Fudan University, Shanghai, China
Jiazhen Cao & Ming Guan
Department of Pathology, RenJi Hospital, School of Medicine, Shanghai JiaoTong University, Shanghai, China
Danhui Li
Department of Clinical Laboratory, Anhui Wanbei Electricity Group General Hospital, Suzhou, China
Mengyu Zhang & Henggui Hu
Department of Clinical Laboratory, Shanghai Xuhui Central Hospital, Fudan University, Shanghai, China
Yunxiao Song
Department of General Practice, Shanghai Xuhui Central Hospital, Fudan University, Shanghai, China
Jie Cheng

Contributions

W.J.C., M.G., J.C., and S.J.L. conceptualized and designed this study. S.J.L., Y.Z.L., J.N.W., J.Z.C., D.H.L., and J.R. performed most of the experiments. M.Y.Z., H.G.H., and Y.X.S. performed some of the experiments. S.J.L., J.Z.C., D.H.L., and J.R. completed the acquisition and analysis of data. S.J.L., J.R., J.Z.C., D.H.L., J.C., and J.N.W. prepared the figures and performed the statistical analysis. S.J.L., J.C., and J.Z.C. wrote the original draft. W.J.C., J.C., and M.G. reviewed the manuscript and supervised the work. All the authors have read and approved the final manuscript.

Corresponding authors

Correspondence to Shengjie Li, Jie Cheng, Ming Guan or Wenjun Cao.

Ethics declarations

Competing interests

The authors declare no competing interests.

Peer review

Peer review information

Nature Communications thanks the anonymous reviewers for their contribution to the peer review of this work. A peer review file is available.

Additional information

Publisher’s note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary information

Supplementary Information

Peer Review File

Reporting Summary

Source data

Source Data

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.

About this article

Cite this article

Li, S., Cao, J., Li, D. et al. A noninvasive machine learning model using a complete blood count for screening of primary vitreoretinal lymphoma. Nat Commun 16, 10667 (2025). https://doi.org/10.1038/s41467-025-65693-0

Received: 19 June 2025
Accepted: 21 October 2025
Published: 27 November 2025
Version of record: 27 November 2025
DOI: https://doi.org/10.1038/s41467-025-65693-0