Abstract
Primary vitreoretinal lymphoma (PVRL) is a rare and aggressive intraocular malignancy that is frequently misdiagnosed because of its nonspecific early manifestations and the lack of effective screening tools. We conduct a multicentre case–control study including 255 PVRL patients and 292 controls to develop a machine learning–based screening model using complete blood count data. A six-feature random forest model demonstrates high diagnostic accuracy in the discovery cohort (area under the curve [AUC] = 0.85) and performs consistently across the validation cohorts (AUC = 0.80–0.83), outperforming intraocular biomarkers such as the interleukin-10/interleukin-6 ratio (AUC = 0.65–0.78). Model performance is further validated in a hospital-based prospective cohort (n = 100,526), where 38 PVRL cases are identified among 66 individuals classified as high risk, and 2 additional cases are identified among 83,610 individuals classified as low risk, yielding a sensitivity of 95.0%, specificity of 99.97%, positive predictive value (PPV) of 57.6%, and negative predictive value of 99.99%. In the community cohort (n = 515,326), 22 individuals are flagged as high risk, 13 of whom are confirmed as having PVRL (PPV = 59.1%). This study presents a noninvasive and scalable blood-based screening strategy for detection of PVRL, with a web application enabling timely triage and population-level risk stratification.
Introduction
Primary vitreoretinal lymphoma (PVRL) is a rare yet potentially aggressive intraocular malignancy that is characterized predominantly by B-cell lymphomas1,2. The estimated annual incidence of PVRL is approximately 50 cases, and there is evidence suggesting an increasing trend in its occurrence3,4. Owing to its infrequency, PVRL is often misdiagnosed or inadequately managed and is frequently mistaken for other intraocular conditions5. This misidentification can result in significant diagnostic delays, sometimes extending up to 21 months from the initial presentation5. The diagnosis of PVRL is further complicated by multiple factors, such as the small volume of vitreous humor, the low cellularity of lymphoma cells—which frequently coexist with inflammatory cells—and the challenges associated with maintaining cellular integrity during sample collection6,7. More importantly, given that PVRL can lead to permanent vision loss and has a high risk of central nervous system (CNS) relapse8, timely and accurate screening is crucial for improving patient outcomes and effectively managing this aggressive malignancy.
Although standardized diagnostic protocols for PVRL exist, the rarity and fragility of lymphoma cells in the vitreous often hinder prompt detection, making timely diagnosis challenging9. For example, (1) cytological examination offers morphological evidence that supports a diagnosis of PVRL1; (2) polymerase chain reaction is utilized to detect monoclonality through rearrangements in immunoglobulin heavy chain or light chain genes, which serve as critical diagnostic indicators10; (3) mutations in genes such as MYD88 and CD79B have been linked to vitreoretinal lymphomas and may improve diagnostic accuracy11,12; and (4) elevated levels of interleukin (IL)-10 relative to IL-6 are also significant, as B-cell lymphomas typically produce high IL-10 levels, making the IL-10/IL-6 ratio a valuable marker for diagnosis12,13. However, all of these methods require invasive intraocular sampling, and their effectiveness is constrained by the need to obtain a substantial number of viable cells, which is often limited by the low yield of intact neoplastic cells from small volumes of vitreous fluid6. Moreover, these methods are geared primarily towards diagnosing established disease rather than enabling early screening. Therefore, there is an urgent need for a rapid, accurate, and practical screening method for this malignancy.
The complete blood count (CBC) is one of the most frequently ordered clinical tests across nearly all medical contexts, offering valuable and timely insights into a wide range of disease processes14. Because blood cells continuously interact with various tissues and organs, CBC is a powerful diagnostic tool. Its key advantages include its low cost, accessibility, high consistency, and widespread use in primary healthcare, making it an essential component of routine medical evaluation15. Its application in clinical practice is extensive, and some tests (e.g., lymphocytes, basophils, and hemoglobin) have demonstrated significant diagnostic and prognostic relevance for lymphoma16,17,18,19,20. However, no research has been conducted to screen for PVRL using a CBC.
In this work, we combine CBC parameters with machine learning (ML) algorithms to screen for PVRL. We conduct a multicentre case–control study to develop a machine learning–based screening model for PVRL diagnosis from CBC data, and validate its performance in large-scale clinical cohorts.
Results
The study design is illustrated in Fig. 1A. No significant differences in age or sex were observed between the PVRL and normal control groups across the discovery cohort and validation cohorts 1–3 (Tables S1 and S2, P > 0.05). Approximately 50% of the CBC parameters differed significantly between PVRL patients and controls in both the discovery cohort and validation cohort 1 (P < 0.05, Table S1). Except for the basophil count (P = 0.049), no significant differences were noted between the discovery cohort and validation cohort 1 for key parameters (Table S3).
A Schematic representation of the study design, illustrating the development and validation of a ML model for primary vitreoretinal lymphoma (PVRL) diagnosis using complete blood count parameters. B Receiver operating characteristic (ROC) curves comparing the diagnostic performance of 12 machine learning models on the basis of complete blood count data. C Summary of key performance metrics for the 12 machine learning models, including area under the curve (AUC), sensitivity, and specificity. RF random forest, DT decision tree, GLM generalized linear model, GBM gradient boosting, KNN K nearest neighbor, PDW platelet distribution width, PLCR platelet large cell ratio, HG hemoglobin, CNS central nervous system, PPV positive predictive value, NPV negative predictive value.
Development of screening models based on all features
All 30 CBC features were used to train models across 12 ML algorithms. The random forest (RF), XGBoost, TabNet, decision tree (DT), generalized linear model (GLM), gradient boosting (GBM), LightGBM, naïve Bayes, and AdaBoost models demonstrated superior performance (Table S4; Figs. 1B and S1), with areas under the receiver operating characteristic curve (AUCs) significantly greater than those of the other models (P < 0.05, AUC range for others: 0.42–0.67).
Area under the precision-recall curve (AUPRC) analysis further confirmed the superior performance of these models (Fig. S2). Comprehensive performance metrics, including sensitivity, specificity, positive predictive value (PPV), negative predictive value (NPV), accuracy, and F1 score, are summarized in Fig. 1C and Table S5. On the basis of these results, the top-performing nine algorithms were selected for further model development.
Identification of the final model based on six features
Feature selection was applied in the discovery cohort to enhance clinical applicability. We ranked all 30 features by their mean SHapley Additive exPlanations (SHAP) values and incrementally built models by adding features one by one in descending order. Each model’s performance was compared to the full 30-feature model using the DeLong test. As shown in Fig. 2A, the RF model using the top 6 features achieved an AUC slightly higher than that of the full 30-feature model. In contrast, the model with only the top 5 features showed a significant decrease in AUC (P < 0.05). Furthermore, the RF model with these 6 features outperformed models built using other algorithms regardless of the number of features included.
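The incremental procedure described above can be sketched as follows. This is a minimal illustration rather than the study's code: the function names, the toy importance values, and the `toy_auc` scoring callback are all hypothetical. A real pipeline would compute mean absolute SHAP values from the fitted model, retrain the model for each feature subset, and compare AUCs with the DeLong test instead of the fixed tolerance used here.

```python
from typing import Callable, Dict, List, Sequence, Tuple


def rank_features(mean_abs_shap: Dict[str, float]) -> List[str]:
    """Order features by mean absolute SHAP value, largest first."""
    return sorted(mean_abs_shap, key=mean_abs_shap.get, reverse=True)


def forward_auc_curve(
    ranked: Sequence[str],
    score_subset: Callable[[Sequence[str]], float],
) -> List[Tuple[int, float]]:
    """Score the top-1, top-2, ..., top-k subsets in ranking order."""
    return [(k, score_subset(ranked[:k])) for k in range(1, len(ranked) + 1)]


def smallest_adequate_subset(
    curve: Sequence[Tuple[int, float]],
    full_auc: float,
    tolerance: float,
) -> int:
    """Return the smallest k whose AUC is within `tolerance` of the full model.

    The tolerance is a stand-in for a formal comparison such as the DeLong test.
    """
    for k, auc in curve:
        if full_auc - auc <= tolerance:
            return k
    return curve[-1][0]


# Toy importances (hypothetical numbers, for illustration only).
toy_shap = {"PDW": 0.09, "monocyte%": 0.07, "PLCR": 0.05, "HG": 0.02}
ranked = rank_features(toy_shap)


def toy_auc(subset: Sequence[str]) -> float:
    """Hypothetical scorer: AUC grows with the importance captured."""
    return 0.5 + sum(toy_shap[f] for f in subset)


curve = forward_auc_curve(ranked, toy_auc)
best_k = smallest_adequate_subset(curve, full_auc=curve[-1][1], tolerance=0.03)
print(ranked[:best_k])
```

Passing the scorer as a callback keeps the selection logic independent of the learning algorithm, so the same loop could be reused for each candidate model.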
A Area under the receiver operating characteristic curve (AUC) values for different feature subsets selected using the SHapley Additive exPlanations (SHAP) method, along with feature ranking scores. Source data are provided as a Source data file. B SHAP summary bar plot highlighting the relative importance of features in the random forest (RF) model. C Illustrating the distribution and correlation of the six selected features. D Receiver operating characteristic (ROC) curves demonstrating the diagnostic performance of the RF model when the six selected features were used for primary vitreoretinal lymphoma (PVRL) detection. E Confusion matrix heatmap visualizing the classification performance of the six-feature RF model in diagnosing PVRL. DT decision tree, GLM generalized linear model, GBM gradient boosting, PDW platelet distribution width, PLCR platelet large cell ratio, HG hemoglobin, PLT platelet count, MPV mean platelet volume, HCT hematocrit, RDWSD red blood cell distribution width—standard deviation, RDWCV red blood cell distribution width—coefficient of variation, RBC red blood cell count, WBC white blood cell count, MCHC mean corpuscular hemoglobin concentration, PIV pan-immune inflammation value.
The RF model, incorporating six features, platelet distribution width (PDW), monocyte%, platelet large cell ratio (PLCR), monocyte count, hemoglobin (HG), and basophil count, retained near-optimal discriminatory ability, as reflected by global SHAP analysis (Fig. 2B) and feature distribution/correlation analysis (Fig. 2C).
The 6-feature RF model achieved an AUC of 0.85 (Fig. 2D) and an AUPRC of 0.84, with a PPV of 0.76, NPV of 0.71, accuracy of 0.73, and F1 score of 0.75 (Table S6). The confusion matrix (Fig. 2E) revealed a sensitivity of 0.81 and specificity of 0.75.
External validation and comparative performance of the final RF model versus IL-6, IL-10, and the IL-10/IL-6 ratio
In validation cohort 1, the 6-feature RF model achieved an AUC of 0.80 (Fig. 3A), an AUPRC of 0.84 (Fig. 3B), and an accuracy of 0.68 (Fig. 3C). The sensitivity and specificity were 0.60 and 0.75, respectively. Similar results were obtained in validation cohort 2 (AUC 0.80, AUPRC 0.69; Fig. 3D–F). In validation cohort 3, which included PVRL-CNS patients, the model achieved an AUC of 0.83 and an AUPRC of 0.72, with 0.79 sensitivity and 0.79 specificity (Fig. 3G).
A Receiver operating characteristic (ROC) curves demonstrating the diagnostic performance of the six-feature RF model for primary vitreoretinal lymphoma (PVRL) detection in validation cohort 1. B Precision‒recall (PR) curves illustrating the precision‒recall performance of the six-feature RF model for PVRL detection in validation cohort 1. C Confusion matrix heatmap visualizing the classification performance of the six-feature RF model for PVRL detection in validation cohort 1. D ROC curves used to evaluate the performance of the RF model in detecting PVRL and PVRL with central nervous system involvement (PVRL-CNS) in validation cohorts 2 and 3. E PR curves displaying the precision‒recall performance of the six-feature RF model for PVRL/PVRL-CNS detection in validation cohorts 2 and 3. F Confusion matrix heatmap depicting the classification performance of the six-feature RF model for PVRL detection in validation cohort 2. G Confusion matrix heatmap illustrating the classification performance of the six-feature RF model for PVRL-CNS detection in validation cohort 3. H ROC curves used to evaluate the performance of the RF model in distinguishing PVRL from uveitis. I PR curves displaying the precision‒recall performance of the six-feature RF model for distinguishing PVRL from uveitis. J Confusion matrix heatmap illustrating the classification performance of the six-feature RF model for distinguishing PVRL from uveitis. K ROC curves illustrating the diagnostic performance of the IL-10/IL-6 ratio in aqueous humor for detecting PVRL and PVRL with central nervous system involvement (PVRL-CNS). L ROC curves illustrating the diagnostic performance of the IL-10/IL-6 ratio in the vitreous humor for detecting PVRL and PVRL-CNS. PPV positive predictive value, NPV negative predictive value, AUC area under the curve.
In the differentiation cohort distinguishing PVRL from uveitis, the RF model maintained high performance (AUC 0.81, AUPRC 0.76; Fig. 3H–J), confirming its robustness across diverse clinical settings.
As summarized in Tables S7 and S8, no significant differences in age or sex distribution were observed between the PVRL patients and controls across all cohorts. In aqueous humor, PVRL patients presented significantly elevated IL-10 levels and IL-10/IL-6 ratios and decreased IL-6 levels (all P < 0.05). The AUCs for IL-6, IL-10, and IL-10/IL-6 were 0.65, 0.66, and 0.77, respectively (Figs. S3A, B, and 3K), and those for the PVRL-CNS were 0.67, 0.75, and 0.78, respectively (Figs. S3E, F, and 3K). In the vitreous humor, similar trends were observed (P < 0.05), with AUCs of 0.69, 0.74, and 0.77 (Figs. S3C, D, and 3L) and 0.72, 0.74, and 0.74 for the PVRL-CNS (Figs. S3G, H, and 3L). The RF model consistently outperformed IL-6, IL-10, and the IL-10/IL-6 ratio across all analyses (DeLong test, P < 0.05).
Model clinical utility, relevance, and interpretability
Decision curve analysis (DCA) demonstrated that the 6-feature RF model provided consistent net clinical benefit across a broad range of threshold probabilities (Fig. S4A–E). The calibration curves further confirmed the model’s robust screening performance (Fig. S4F–H).
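For readers unfamiliar with DCA, the quantity plotted on a decision curve is the net benefit at a threshold probability p_t, conventionally defined as TP/n − (FP/n) × p_t/(1 − p_t). The sketch below illustrates that standard definition only; the function names and the toy labels and probabilities are hypothetical and are not taken from the study data.

```python
from typing import Sequence


def net_benefit(y_true: Sequence[int], y_prob: Sequence[float], pt: float) -> float:
    """Net benefit of acting on predictions >= pt (standard DCA definition)."""
    if not 0.0 < pt < 1.0:
        raise ValueError("threshold probability must lie strictly between 0 and 1")
    n = len(y_true)
    tp = sum(1 for y, p in zip(y_true, y_prob) if p >= pt and y == 1)
    fp = sum(1 for y, p in zip(y_true, y_prob) if p >= pt and y == 0)
    return tp / n - (fp / n) * (pt / (1.0 - pt))


def net_benefit_treat_all(y_true: Sequence[int], pt: float) -> float:
    """Reference strategy in which everyone is treated as positive."""
    prevalence = sum(y_true) / len(y_true)
    return prevalence - (1.0 - prevalence) * (pt / (1.0 - pt))


# Toy data (hypothetical): 1 = PVRL, 0 = control.
labels = [1, 1, 1, 0, 0, 0, 0, 0]
probs = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2, 0.1, 0.1]

nb_model = net_benefit(labels, probs, pt=0.5)
nb_all = net_benefit_treat_all(labels, pt=0.5)
print(f"model: {nb_model:.3f}, treat all: {nb_all:.3f}")
```

A model is considered useful at a given threshold when its net benefit exceeds both the treat-all reference and the treat-none reference, whose net benefit is zero.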
To evaluate parameter dynamics during treatment, 29 PVRL patients were followed for six months (study design: Fig. S5A). Fifteen patients were classified as nonresponders, and 14 were classified as responders. In nonresponders, PDW, HG, and the PLCR significantly decreased, whereas the basophil count, monocyte count, and monocyte percentage increased (P < 0.05; Fig. S5B). Conversely, in responders, PDW, HG, and the PLCR increased, whereas the monocyte count and monocyte percentage decreased (P < 0.05; Fig. S5C).
SHAP analysis elucidated model decision-making: the global SHAP summary plot ranked feature importance (Fig. 4A), and local SHAP explanations provided patient-specific interpretability (Fig. 4B, C). The probability of PVRL was calculated according to the 6-feature RF model. The optimal cutoff value was determined to be 0.85 based on the Youden index. Probability values of 0.85 or greater denote high risk, and values of less than 0.85 denote low risk (Fig. 4D). To enhance clinical adoption, an interactive web application was developed (https://primary-vitreoretinal-lymphoma-prediction-app.streamlit.app/), enabling real-time risk prediction based on input of the six-feature values (Fig. 4E).
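The cutoff selection and the resulting risk rule can be illustrated with a short sketch. It assumes the usual definition of the Youden index, J = sensitivity + specificity − 1, evaluated at each observed probability; the function names and the toy data are hypothetical, and only the 0.85 default in `risk_group` comes from the text.

```python
from typing import Sequence, Tuple


def youden_cutoff(y_true: Sequence[int], y_prob: Sequence[float]) -> Tuple[float, float]:
    """Return (cutoff, J) maximizing J = sensitivity + specificity - 1.

    Candidate cutoffs are the observed probabilities; a case is called
    positive when its probability is >= the cutoff.
    """
    n_pos = sum(y_true)
    n_neg = len(y_true) - n_pos
    best_cutoff, best_j = 1.0, -1.0
    for cutoff in sorted(set(y_prob)):
        tp = sum(1 for y, p in zip(y_true, y_prob) if y == 1 and p >= cutoff)
        tn = sum(1 for y, p in zip(y_true, y_prob) if y == 0 and p < cutoff)
        j = tp / n_pos + tn / n_neg - 1.0
        if j > best_j:
            best_cutoff, best_j = cutoff, j
    return best_cutoff, best_j


def risk_group(prob: float, cutoff: float = 0.85) -> str:
    """Apply the rule stated in the text: >= cutoff is high risk."""
    return "high risk" if prob >= cutoff else "low risk"


# Toy data (hypothetical): 1 = PVRL, 0 = control.
labels = [1, 1, 1, 0, 0, 0]
probs = [0.95, 0.90, 0.80, 0.70, 0.30, 0.10]

cutoff, j = youden_cutoff(labels, probs)
print(cutoff, j, risk_group(0.91), risk_group(0.40))
```

The Youden index weights sensitivity and specificity equally; a screening program that prioritizes one over the other might choose a different operating point.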
A SHapley Additive exPlanations (SHAP) summary bar plot ranking the importance of the six selected features in the RF model. B Local explanation analysis illustrating the prediction for non-primary vitreoretinal lymphoma (PVRL) participants. C Local explanation analysis illustrating the prediction for PVRL participants. D RF model-predicted probabilities for positive cases (n = 252) and negative cases (n = 292). Each dot represents one individual, and no technical replicates were used. The box-and-whisker plots display the distribution of predicted probabilities. The central box represents the interquartile range (IQR), spanning from the first quartile (Q1, 25th percentile) to the third quartile (Q3, 75th percentile). The horizontal line within the box indicates the median (Q2, 50th percentile). The whiskers extend to the minimum and maximum values within 1.5 × IQR from Q1 and Q3, respectively, while data points beyond this range are considered outliers. E Clinical application interface: upon entering actual values of the six features, the tool automatically predicts PVRL risk. PDW platelet distribution width, PLCR platelet large cell ratio, HG hemoglobin.
Hospital-based prospective cohort study validation of the final RF model
A total of 100,526 participants aged 18–92 years were enrolled for PVRL screening (Fig. 5A). Of these, 94,935 met the eligibility criteria. The detailed distributions of the categorized diseases are provided in Table S9. The participants’ data were entered in real time into an online web application hosted at https://primary-vitreoretinal-lymphoma-prediction-app.streamlit.app/ for PVRL risk assessment. A scatter plot (Fig. 5B) of the predicted probabilities based on the RF model is shown for all included individuals. On the basis of the screening results, 77 individuals were identified as high risk for PVRL (predicted probability ≥ 0.85). After excluding 11 individuals for various reasons, 66 high-risk individuals were referred to the Department of Ophthalmology for further evaluation. Among them, 38 cases were confirmed as PVRL by vitreous biopsy. Among the remaining 28 patients, 8 were diagnosed with diffuse large B-cell lymphoma, 14 had other ocular or systemic conditions, and 6 were diagnosed with mucosa-associated lymphoid tissue lymphoma. Among the 94,858 individuals classified as low risk (predicted probability < 0.85), 12,248 were excluded for various reasons, and 83,610 low-risk individuals were referred to the Department of Ophthalmology & Otorhinolaryngology for further evaluation. Among these low-risk individuals, only 2 cases of PVRL were identified at the Eye and ENT Hospital. The final RF model demonstrated a sensitivity of 95.0%, a specificity of 99.97%, a PPV of 57.6%, and an NPV of 99.99%.
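The reported metrics can be reproduced from the counts given above. The sketch below is illustrative: the helper name `screening_metrics` is hypothetical, and it assumes that the 28 unconfirmed high-risk referrals count as false positives and that the 83,608 low-risk individuals without a PVRL diagnosis count as true negatives.

```python
from typing import Dict


def screening_metrics(tp: int, fp: int, fn: int, tn: int) -> Dict[str, float]:
    """Standard 2x2 screening metrics, returned as proportions."""
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
    }


# Counts as reported for the hospital-based cohort.
high_risk, confirmed_high = 66, 38    # high-risk referrals, biopsy-confirmed PVRL
low_risk, pvrl_in_low = 83_610, 2     # low-risk referrals, PVRL found among them

metrics = screening_metrics(
    tp=confirmed_high,
    fp=high_risk - confirmed_high,    # assumption: all other high-risk cases are false positives
    fn=pvrl_in_low,
    tn=low_risk - pvrl_in_low,        # assumption: all other low-risk cases are true negatives
)

for name, value in metrics.items():
    print(f"{name}: {value:.3%}")
```

Under these assumptions the sensitivity, specificity, and PPV round to the reported 95.0%, 99.97%, and 57.6%; the NPV evaluates to roughly 99.998%, in line with the reported 99.99% if that figure was truncated rather than rounded.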
A The screening process of the hospital-based prospective cohort study. B A scatter plot of the predicted probabilities based on the random forest (RF) model for all included individuals in the hospital-based prospective cohort study. C The screening process of the community-based cross-sectional study. D A scatter plot of the predicted probabilities based on the RF model for all included individuals in the community-based cross-sectional study. PVRL primary vitreoretinal lymphoma.
Community-based cross-sectional study validation of the final RF model
A total of 515,326 participants aged 40–88 years were enrolled in the PVRL screening program (Fig. 5C). Among them, 511,786 met the eligibility criteria, and their data were entered in real time into an online web application hosted at https://primary-vitreoretinal-lymphoma-prediction-app.streamlit.app/ for PVRL risk assessment. A scatter plot (Fig. 5D) of the predicted probabilities based on the RF model is shown for all included individuals. Based on the model’s output, 22 individuals were identified as being at high risk for PVRL.
These high-risk participants were referred to the Department of Ophthalmology for further evaluation, resulting in the confirmation of 13 PVRL cases. Owing to the cross-sectional nature of the community-based study, follow-up confirmation was not conducted for participants in the low-risk group. The final RF model demonstrated a PPV of 59.1%.
Discussion
In this study, a six-feature-based RF ML model using complete blood count parameters was successfully developed and validated, and a noninvasive screening tool for PVRL was established. The model demonstrated robust screening accuracy in the discovery cohort, three independent validation cohorts, and the diagnostic differentiation cohort, significantly outperforming conventional biomarkers in the vitreous humor/aqueous humor, such as the IL-10/IL-6 ratio. Longitudinal monitoring further validated the biological relevance of the selected features, with dynamic blood parameter changes aligning with treatment responses. More importantly, the performance of the PVRL screening model was validated through two large sample cohort studies. These findings address a critical unmet need in PVRL screening.
PVRL diagnosis is often delayed because of nonspecific symptoms and the reliance on invasive procedures, such as vitreous biopsy21,22. Our model overcomes these limitations by leveraging the CBC, making this model cost-effective, widely accessible, and minimally invasive. The six selected CBC parameters, namely, PDW, monocyte percentage, PLCR, monocyte count, HG, and basophil count, are routinely measured in clinical practice, making this tool particularly valuable for primary care settings14. The deployment of a freely accessible web application enhances real-time risk assessment, enabling rapid decision-making in resource-limited environments or when ocular sampling is unavailable. Longitudinal analysis revealed that nonresponders had decreases in PDW, HG, and PLCR alongside increases in posttreatment monocyte and basophil counts, mirroring the model’s feature trends and reinforcing their association with disease activity. Further research is needed to elucidate the underlying mechanisms driving these associations.
While the IL-10/IL-6 ratio in the vitreous humor/aqueous humor remains a diagnostic cornerstone for PVRL, its performance (AUC: 0.65–0.78) was significantly inferior to that of our blood-based model (P < 0.05). This disparity may arise from the inherent limitations of ocular sampling: low cellular yield, rapid degradation of biomarkers, and technical variability. In contrast, blood parameters capture systemic immune responses and tumor–host interactions, offering a more comprehensive profile. Furthermore, standardized blood testing minimizes operational variability, increasing reproducibility across clinical settings.
Gozzi et al.23 developed a classification model using Python and XGBoost; in their model, 87% of eyes were correctly diagnosed as PVRL or uveitis (including Fuchs uveitis, sarcoidosis uveitis, Behçet uveitis, and uveitis of unknown origin). Consistent with that work, which demonstrated that radiomic analysis of anterior segment optical coherence tomography images can noninvasively distinguish PVRL from uveitis of various etiologies, our findings further support the potential of machine learning–based approaches in assisting the differential diagnosis.
The multicentre case‒control design, coupled with validation in two large-sample cohorts, reinforces the model’s generalizability and clinical reliability. However, several limitations should be acknowledged. First, the relatively small sample size of the PVRL-CNS cohort constrains the model’s predictive power for CNS involvement. Second, owing to the cross-sectional design of the community-based study, follow-up confirmation was not performed for individuals in the low-risk group. Consequently, the model’s sensitivity may be overestimated, whereas its specificity and positive predictive value may be underestimated. Third, some patients classified as low risk by our model did not undergo gold standard biopsy-based confirmation of PVRL status, particularly in the hospital-based and community-based validation cohorts. Consequently, some patients categorized as low risk may in fact have had undiagnosed PVRL, representing potential false negatives. This inherent limitation in diagnostic verification could have led to an underestimation of the true prevalence of PVRL and may have affected the reported performance metrics of our predictive model. Last, a limitation of this study is that, in both the hospital-based prospective cohort and the community-based cross-sectional screening program, PVRL screening was restricted to individuals aged over 40 years. Although the model was developed and validated in cohorts with a broader age range (≥18 years), its performance in younger (<40-year-old) populations remains to be extensively evaluated in large-scale, population-based settings.
By integrating machine learning models with routine blood test data, this study proposes a noninvasive and high-accuracy screening tool for PVRL. Nevertheless, it should be emphasized that any CBC-based machine learning approach for PVRL screening must be confirmed by positive pathological findings from a vitreous biopsy. In clinical practice, such a tool has the potential to markedly reduce diagnostic delays, facilitate timely intervention, and ultimately improve outcomes in patients with this aggressive malignancy. Future research should aim to elucidate the underlying biological mechanisms and evaluate the model’s applicability in therapeutic monitoring, thereby maximizing its transformative potential in ophthalmic oncology.
Methods
Ethical considerations
This study received ethical approval from the institutional review boards of the participating institutions: Huashan Hospital of Fudan University (2023-515), Xuhui Central Hospital (2018025), the Eye and ENT Hospital of Fudan University (2020[2020013]), and Wanbei Coal-Electricity Group General Hospital (WBZY-LLWYH-2024-21). The study was conducted in accordance with the Declaration of Helsinki and the Ethical Guidelines for Medical and Health Research Involving Human Participants. Given that the multicentre case–control study was noninterventional and retrospective in nature, the institutional review boards waived the requirement for informed consent. In the prospective hospital-based screening phase, all participants provided written informed consent, signed before sample collection and study enrollment. Sex of participants was recorded as male or female according to hospital registration data, which were based on self-reported information at the time of admission. Due to the limited sample size, no sex- and/or gender-based analyses were performed.
Study design and participants
The PVRL screening model was developed through a multicentre case‒control study in which patients were systematically identified and categorized based on predefined diagnostic criteria. The study was conducted in 4 hospitals across China between January 1, 2016, and June 30, 2024. The discovery cohort comprised PVRL patients from the Eye and ENT Hospital of Fudan University. The first validation cohort included PVRL patients from Wanbei Coal-Electricity Group General Hospital and Xuhui Central Hospital of Fudan University, and the second validation cohort included PVRL patients from Huashan Hospital of Fudan University. The third validation cohort included PVRL patients with CNS involvement from Huashan Hospital of Fudan University. Healthy controls were recruited from the health examination centres of the same hospitals where the PVRL patients were enrolled. Controls were matched to cases on age and sex using a one-to-one exact matching strategy to ensure comparable demographic characteristics between groups and to minimize potential confounding effects. Finally, 100 PVRL participants and 117 normal controls were included in the discovery cohort (Fig. S6A); 42 PVRL participants and 60 normal controls were included in the first validation cohort (Fig. S6B); 36 PVRL participants and 42 normal controls were included in the second validation cohort (Fig. S6C); and 77 PVRL-CNS participants and 73 normal controls were included in the third validation cohort (Fig. S6D).
Owing to the heterogeneous and often nonspecific clinical presentation of PVRL, it is frequently misdiagnosed as uveitis. To address this, a diagnostic differentiation cohort was established, consisting of 155 patients with PVRL and 158 with uveitis. Notably, the 155 PVRL patients overlapped with those in the previously described validation cohorts 1, 2, and 3. The uveitis patients were recruited from the Eye and ENT Hospital of Fudan University. Among the 158 patients with uveitis, 132 had endogenous, non-infectious causes, such as Fuchs uveitis, sarcoidosis uveitis, and Behçet uveitis; 26 had exogenous, infectious causes, including bacterial endophthalmitis–associated uveitis (n = 7) and herpetic uveitis (n = 19).
A follow-up cohort was established to assess dynamic changes in CBC parameters during PVRL treatment. In total, 29 PVRL patients were included, with overlap with the previously described PVRL patients from the Eye and ENT Hospital of Fudan University.
To further evaluate the diagnostic performance of CBC tests and interleukins in aqueous/vitreous humor for PVRL, multiple cohorts from the Eye and ENT Hospital and Huashan Hospital were included. Notably, these PVRL participants are also part of the previously described cohorts from the Eye and ENT Hospital and Huashan Hospital of Fudan University.
The performance of the PVRL screening model was validated through two large sample cohort studies: one hospital-based prospective cohort study and one community-based cross-sectional study.
A prospective screening program was initiated at the Eye and ENT Hospital of Fudan University. In brief, all patients presenting to the hospital were sequentially enrolled from October 2024 to May 2025. CBC tests were performed for all eligible participants. Individuals who screened positive for PVRL using the established screening model were further evaluated and confirmed by histopathological examination of biopsy samples. As a tertiary referral center for PVRL, the Eye and ENT Hospital of Fudan University receives a substantial number of patients with suspected disease. As a result, the prevalence of PVRL in this prospective screening program is significantly higher than that in the general population.
In parallel, a community-based cross-sectional screening program was launched in Xuhui District, Shanghai, in July 2024. Residents of Xuhui District were sequentially invited to participate through local community health service centres. CBC tests were conducted for all eligible participants, and those who screened positive for PVRL were referred for confirmatory diagnosis via histopathological analysis of biopsy samples. Because PVRL has a low prevalence and a higher incidence in the elderly population, we limited this screening program to individuals aged over 40 years to enrich the prevalence of PVRL within the cohort. As a result, the prevalence of PVRL in this community-based cross-sectional screening program is significantly higher than that in the general population.
The detailed study design and participant information are provided in the Supplementary Materials. The study design is illustrated in Fig. 1A.
Diagnostic criteria for PVRL
The diagnosis of PVRL was established based on the 2016 World Health Organization classification of lymphoid neoplasms24. All patients with PVRL or CNS lymphoma (CNSL) involvement underwent biopsy procedures. PVRL was diagnosed based on positive pathological findings from vitreous biopsy, following established standards for vitreous sampling and biopsy-based diagnostic criteria9,25. Undiluted vitreous samples were obtained via dry vitrectomy and processed for cytological analysis. A positive pathological finding was defined as follows: conventional smear cytology demonstrating large lymphoid cells with irregularly shaped nuclei, multiple prominent nucleoli, and scant basophilic cytoplasm, typically accompanied by small reactive T lymphocyte infiltration, consistent with large lymphomatous cells.
CNSL was diagnosed when a positive pathological finding was obtained from CNS tissue biopsy. Patients in whom CNSL was ruled out at the time of PVRL diagnosis were classified as having isolated PVRL. Patients diagnosed simultaneously with CNSL were classified as having PVRL with concurrent CNS involvement. In accordance with the guidelines of the International Primary CNS Lymphoma Collaborative Group26 and the European Association of Neuro-Oncology27, all patients underwent 2-deoxy-2-[F-18]fluoro-D-glucose positron emission tomography/computed tomography (FDG PET/CT) and/or bone marrow aspiration to exclude systemic lymphoma involvement.
Model development
In total, 30 features were employed to develop the diagnostic models. These features included: neutrophil count, neutrophil percentage, red blood cell count, thrombocytocrit, platelet count (PLT), PDW, HG, eosinophil count, eosinophil percentage, basophil count, basophil percentage, mean platelet volume, lymphocyte count, lymphocyte percentage, hematocrit, monocyte count, monocyte percentage, PLCR, white blood cell count, red blood cell distribution width—standard deviation, red blood cell distribution width—coefficient of variation, mean corpuscular volume, mean corpuscular hemoglobin concentration, mean corpuscular hemoglobin, platelet-to-lymphocyte ratio (PLT/lymphocyte count), neutrophil-to-lymphocyte ratio, lymphocyte-to-monocyte ratio, PLT × neutrophil-to-lymphocyte ratio, PLT × neutrophil × monocyte-to-lymphocyte ratio, and neutrophil × monocyte × PLT-to-lymphocyte ratio.
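The ratio-based features at the end of this list are derived from the directly measured counts. The following sketch shows how such indices might be computed; it is not the study's code. The function name, the dictionary keys, the input values, and the assumption that all counts share the same unit (10^9/L) are hypothetical. The last two composite features in the list appear, as written, to describe the same product, so the sketch computes it once.

```python
from typing import Dict


def derived_cbc_indices(
    neutrophils: float,
    lymphocytes: float,
    monocytes: float,
    platelets: float,
) -> Dict[str, float]:
    """Compute ratio features from absolute counts (assumed unit: 10^9/L)."""
    if lymphocytes <= 0 or monocytes <= 0:
        raise ValueError("lymphocyte and monocyte counts must be positive")
    nlr = neutrophils / lymphocytes
    return {
        "PLR": platelets / lymphocytes,      # platelet-to-lymphocyte ratio
        "NLR": nlr,                          # neutrophil-to-lymphocyte ratio
        "LMR": lymphocytes / monocytes,      # lymphocyte-to-monocyte ratio
        "PLT_x_NLR": platelets * nlr,        # PLT x neutrophil-to-lymphocyte ratio
        # neutrophil x monocyte x PLT divided by lymphocyte count
        "PLT_x_neut_x_mono_per_lymph": platelets * neutrophils * monocytes / lymphocytes,
    }


# Hypothetical input values, for illustration only.
example = derived_cbc_indices(
    neutrophils=4.0, lymphocytes=2.0, monocytes=0.5, platelets=250.0
)
for name, value in example.items():
    print(f"{name}: {value:g}")
```

Guarding against zero denominators matters in practice, because a lymphocyte or monocyte count reported as zero would otherwise raise a division error or yield an undefined feature.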
Features with a correlation coefficient exceeding 0.8 were first screened for removal to reduce multicollinearity; all included variables showed correlations below 0.8. Missing values were handled by median imputation28. Twelve ML models (AdaBoost, DT, GLM, GBM, K-nearest neighbors, LightGBM, multilayer perceptron classifier, naïve Bayes, RF, support vector machine, TabNet, and XGBoost) were used, each leveraging its own strengths to increase model diversity. To optimize the predictive models, we employed a two-stage hyperparameter tuning strategy. First, a grid search was used to define multiple candidate hyperparameter configurations, and a nested 5-fold cross-validation was performed for each candidate configuration in the discovery cohort. Specifically, the discovery cohort was split evenly into 5 folds, with 1 fold used as the validation set and the remaining 4 as the training set. Each candidate configuration was applied unchanged across all 5 folds: the same hyperparameter settings were used to train the model on each of the five training sets and to evaluate it on the corresponding validation fold, yielding five AUC values per configuration. The cross-validation therefore evaluated pre-specified hyperparameters rather than generating different hyperparameters for each fold. Second, the mean AUC across all folds was calculated for each candidate configuration, and the configuration with the highest mean AUC was selected as the final hyperparameter set. Minor manual fine-tuning (e.g., adjustment of learning rate or max depth) was performed only around this final configuration to enhance stability. The final selection criterion remained the highest mean performance in the 5-fold cross-validation.
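The preprocessing and tuning steps described above can be sketched as follows. This is an illustrative outline on synthetic data, not the study code: the data set, the column names, the small random forest grid, and the choice to fit the median imputer inside each training fold are all assumptions. It evaluates a fixed grid of candidate configurations on the same 5 folds and keeps the one with the highest mean AUC.

```python
import numpy as np
import pandas as pd
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.impute import SimpleImputer
from sklearn.model_selection import GridSearchCV, StratifiedKFold
from sklearn.pipeline import Pipeline

# Synthetic stand-in for the discovery cohort (not study data).
X_arr, y = make_classification(n_samples=200, n_features=8, n_informative=4,
                               n_redundant=0, random_state=0)
X = pd.DataFrame(X_arr, columns=[f"f{i}" for i in range(8)])
X["f_dup"] = X["f0"] * 1.01             # deliberately collinear column
rng = np.random.default_rng(0)
X = X.mask(rng.random(X.shape) < 0.05)  # inject roughly 5% missing values

# 1) Correlation filter: drop the later member of any pair with |r| > 0.8.
corr = X.corr().abs()                   # pairwise-complete correlations
upper = corr.where(np.triu(np.ones(corr.shape, dtype=bool), k=1))
to_drop = [col for col in upper.columns if (upper[col] > 0.8).any()]
X_filtered = X.drop(columns=to_drop)

# 2) Median imputation, fitted within each training fold via the pipeline.
pipe = Pipeline([
    ("impute", SimpleImputer(strategy="median")),
    ("rf", RandomForestClassifier(random_state=0)),
])

# 3) Pre-specified candidate configurations, each scored on the same 5 folds.
param_grid = {"rf__n_estimators": [50, 100], "rf__max_depth": [3, None]}
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
search = GridSearchCV(pipe, param_grid, scoring="roc_auc", cv=cv)
search.fit(X_filtered, y)

best_params = search.best_params_       # configuration with highest mean AUC
best_mean_auc = search.best_score_      # mean AUC over the 5 validation folds
```

`GridSearchCV` ranks candidates by their mean validation score, which matches the selection rule stated in the text; the per-fold AUC values are available in `search.cv_results_`.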
The final model, trained with the optimized hyperparameters, was then applied to the validation cohorts to generate the reported results. The final hyperparameters of these twelve ML models are shown in Table S10. Each method was chosen for its balance of simplicity, which allows for interpretable results, and power, which enables the modeling of complex linear and nonlinear interactions among input features. Modeling was conducted in Python (v3.11). The code is publicly available via GitHub–Zenodo at https://zenodo.org/records/17189239.
Model performance evaluation
The models were assessed using various metrics, including sensitivity, specificity, PPV, NPV, F1 score, and accuracy. Their classification capabilities were further evaluated using the area under the receiver operating characteristic curve and the area under the precision‒recall curve. To compare the AUC values across different ML models, the DeLong test29, a nonparametric method, was utilized.
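For reference, the threshold-based metrics listed above all follow from the four cells of a confusion matrix. The sketch below (function name hypothetical) applies the standard definitions to the counts implied by the prospective-cohort figures quoted in the Abstract: 66 individuals classified as high risk, of whom 38 had PVRL, and 83,610 classified as low risk, of whom 2 had PVRL.

```python
def screening_metrics(tp, fp, fn, tn):
    """Standard confusion-matrix metrics for a binary screening test."""
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
        "accuracy": (tp + tn) / (tp + fp + fn + tn),
        "f1": 2 * tp / (2 * tp + fp + fn),
    }


# Cell counts derived from the high-risk and low-risk groups in the Abstract.
m = screening_metrics(tp=38, fp=66 - 38, fn=2, tn=83_610 - 2)
```

With these counts the sensitivity is 38/40 and the PPV is 38/66, consistent with the 95.0% and 57.6% reported in the Abstract.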
Feature selection and model explanation
Feature selection was performed in the discovery cohort using the SHAP method30, with the aim of reducing the number of predictors while maintaining optimal model performance. We applied a sequential forward-selection procedure guided by SHAP importance and statistical evaluation as follows: (1) ranking by SHAP importance: a model was trained on the full discovery cohort to compute mean SHAP values for all 30 features, generating a global importance ranking; (2) iterative model building: starting from the top-ranked feature, models were built incrementally by adding one feature at a time in descending SHAP order (top 1, top 2, …, up to all 30 features); (3) performance evaluation: the AUC for each model was calculated using 5-fold cross-validation to obtain stable estimates; (4) statistical comparison: each model’s AUC was compared with that of the full 30-feature model using the DeLong test29, a nonparametric method; (5) final selection: to balance parsimony and predictive performance, we selected the model with the fewest features whose performance was statistically non-inferior or superior to that of the full 30-feature model. The SHAP method supplies both global and local explanations for the model. The global explanation provides consistent attribution values for each feature, highlighting the relationships between the input features and PVRL. The local explanation accounts for the prediction in an individual PVRL case on the basis of that case’s input data.
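The five-step selection loop can be outlined as below. This is a simplified sketch with two explicit substitutions, both assumptions made to keep the example dependency-light: the global ranking uses the random forest's impurity-based importance in place of mean SHAP values, and the DeLong non-inferiority comparison is replaced by a fixed tolerance on the mean cross-validated AUC. The data are synthetic and all names are hypothetical.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import StratifiedKFold, cross_val_score

# Synthetic stand-in for the discovery cohort (not study data).
X, y = make_classification(n_samples=300, n_features=10, n_informative=3,
                           n_redundant=0, random_state=1)


def cv_auc(X_sub, y):
    """Mean AUC over 5 stratified folds for a random forest on X_sub."""
    cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
    model = RandomForestClassifier(n_estimators=50, random_state=0)
    return cross_val_score(model, X_sub, y, scoring="roc_auc", cv=cv).mean()


# (1) Global ranking. Stand-in: impurity importance instead of mean SHAP.
ranker = RandomForestClassifier(n_estimators=50, random_state=0).fit(X, y)
order = np.argsort(ranker.feature_importances_)[::-1]

# (2)-(3) Build top-1, top-2, ..., top-all models and score each by CV AUC.
aucs = [cv_auc(X[:, order[:k]], y) for k in range(1, X.shape[1] + 1)]

# (4)-(5) Stand-in for the DeLong test: accept the smallest model whose mean
# AUC is within a fixed tolerance of (or above) the full-model AUC.
TOLERANCE = 0.01
full_auc = aucs[-1]
n_selected = next(k for k, auc in enumerate(aucs, start=1)
                  if auc >= full_auc - TOLERANCE)
selected_features = order[:n_selected]
```

A fixed tolerance is a cruder criterion than a formal test on paired ROC curves; it is used here only to show the control flow of the procedure.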
Web-based model deployment
To facilitate clinical implementation, the final model was deployed as a web application using the Streamlit framework in Python. By inputting the relevant feature values, the application provides a probability estimate for PVRL and generates a personalized force plot, enhancing interpretability for individual cases.
Statistical analysis
All statistical analyses were conducted using GraphPad Prism (version 10) and Python (version 3.11). The normality of the data distribution was assessed via the Kolmogorov–Smirnov test. For normally distributed continuous variables, paired or independent Student’s t tests were applied, whereas the Kruskal–Wallis test was used for nonnormally distributed data. Categorical variables were analysed using the chi-square test where appropriate. Continuous variables are reported as means ± standard deviations (SDs), and categorical variables are expressed as counts and percentages. ROC curve analysis was performed to evaluate the diagnostic value of interleukins, and the AUC was calculated to assess overall diagnostic performance. The optimal cutoff value was determined using the Youden index (sensitivity + specificity − 1), and the sensitivity and specificity at this cutoff were also reported. A two-tailed P value < 0.05 was considered statistically significant.
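As a worked illustration of the Youden index, the sketch below scans every observed value as a candidate cutoff and keeps the one maximising sensitivity + specificity − 1. The function name, the rule that a sample is positive when its score is at or above the cutoff, and the toy values are assumptions for illustration, not study data.

```python
def youden_cutoff(scores, labels):
    """Return (cutoff, J, sensitivity, specificity) maximising Youden's J.

    A sample is called positive when score >= cutoff; labels are 1 or 0.
    """
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    best = None
    for cutoff in sorted(set(scores)):
        tp = sum(1 for s, l in zip(scores, labels) if s >= cutoff and l == 1)
        tn = sum(1 for s, l in zip(scores, labels) if s < cutoff and l == 0)
        sens, spec = tp / n_pos, tn / n_neg
        j = sens + spec - 1
        if best is None or j > best[1]:
            best = (cutoff, j, sens, spec)
    return best


# Toy marker values and disease labels (made up).
scores = [0.1, 0.2, 0.3, 0.4, 0.6, 0.7, 0.8, 0.9]
labels = [0, 0, 1, 0, 0, 1, 1, 1]
cutoff, j, sens, spec = youden_cutoff(scores, labels)
```

For these toy values the maximum is J = 0.75 at a cutoff of 0.7, where sensitivity is 0.75 and specificity is 1.0.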
Reporting summary
Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.
Data availability
Access to comprehensive, individual-level clinical data is restricted to protect sensitive human subject information and to comply with the terms of informed consent. Requests for access to the de-identified data can be made to the corresponding author by providing a study protocol and ethical approval documentation. Requests for raw data will be reviewed by the corresponding author, who will facilitate further communications with the cohort leaders and relevant ancillary study committees as appropriate. The corresponding author typically responds to such requests within 8 weeks. A data use agreement will be established between the requesting party and the data holder, specifying that the data may only be used for the pre-specified project described in the request and that any resulting manuscript must reference the data source. The data generated in this study are provided in the Supplementary Information/Source data file. Source data are provided with this paper.
Code availability
The analysis code was written in Python (version 3.11) and relies on standard open-source libraries (NumPy, Pandas, scikit-learn, XGBoost), used in compliance with their MIT and BSD licenses. All reused components retain their original license and attribution. The code for the model development and the statistical analyses is available at https://github.com/fudanRenjun/Primary-vitreoretinal-lymphoma/tree/master and has been archived in Zenodo at https://zenodo.org/records/17189239 (ref. 31). The repository is released under the MIT License and is freely accessible without restrictions.
References
Akpek, E. K. et al. Intraocular-central nervous system lymphoma: clinical features, diagnosis, and outcomes. Ophthalmology 106, 1805–1810 (1999).
Coupland, S. E., Anastassiou, G., Bornfeld, N., Hummel, M. & Stein, H. Primary intraocular lymphoma of T-cell type: report of a case and review of the literature. Graefes Arch. Clin. Exp. Ophthalmol. 243, 189–197 (2005).
Chan, C.-C. et al. Primary vitreoretinal lymphoma: a report from an international primary central nervous system lymphoma collaborative group symposium. Oncologist 16, 1589–1599 (2011).
Sagoo, M. S. et al. Primary intraocular lymphoma. Surv. Ophthalmol. 59, 503–516 (2014).
Grimm, S. A. et al. Primary intraocular lymphoma: an international primary central nervous system lymphoma collaborative group report. Ann. Oncol. 18, 1851–1855 (2007).
Melli, B. et al. Primary vitreoretinal lymphoma: current diagnostic laboratory tests and new emerging molecular tools. Curr. Oncol. 29, 6908–6921 (2022).
Huang, R. S. et al. Diagnostic methods for primary vitreoretinal lymphoma: a systematic review. Surv. Ophthalmol. 69, 456–464 (2024).
Pulido, J. S., Johnston, P. B., Nowakowski, G. S., Castellino, A. & Raja, H. The diagnosis and treatment of primary vitreoretinal lymphoma: a review. Int. J. Retin. Vitr. 4, 18 (2018).
Soussain, C., Malaise, D. & Cassoux, N. Primary vitreoretinal lymphoma: a diagnostic and management challenge. Blood 138, 1519–1534 (2021).
Langerak, A. W. et al. EuroClonality/BIOMED-2 guidelines for interpretation and reporting of Ig/TCR clonality testing in suspected lymphoproliferations. Leukemia 26, 2159–2171 (2012).
Chapuy, B. et al. Molecular subtypes of diffuse large B cell lymphoma are associated with distinct pathogenic mechanisms and outcomes. Nat. Med. 24, 679–690 (2018).
Calimeri, T. et al. Molecular diagnosis of primary CNS lymphoma in 2024 using MYD88Leu265Pro and IL-10. Lancet Haematol. 11, e540–e549 (2024).
Whitcup, S. M. et al. Association of interleukin 10 in the vitreous and cerebrospinal fluid and primary central nervous system lymphoma. Arch. Ophthalmol. 115, 1157–1160 (1997).
Horton, S. et al. The top 25 laboratory tests by volume and revenue in five different countries. Am. J. Clin. Pathol. 151, 446–451 (2019).
Foy, B. H. et al. Haematological setpoints are a stable and patient-specific deep phenotype. Nature 637, 430–438 (2025).
Li, D. et al. Prognostic significance of pretreatment red blood cell distribution width in primary diffuse large B-cell lymphoma of the central nervous system for 3P medical approaches in multiple cohorts. EPMA J. 13, 499–517 (2022).
Li, S. et al. Proposed new prognostic model using the systemic immune-inflammation index for primary central nervous system lymphoma: a prospective-retrospective multicohort analysis. Front. Immunol. 13, 1039862 (2022).
Le, M. et al. Pretreatment hemoglobin as an independent prognostic factor in primary central nervous system lymphomas. Oncologist 24, e898–e904 (2019).
Tsimberidou, A. M. et al. Assessment of chronic lymphocytic leukemia and small lymphocytic lymphoma by absolute lymphocyte counts in 2126 patients: 20 years of experience at the University of Texas M.D. Anderson Cancer Center. J. Clin. Oncol. 25, 4648–4656 (2007).
Rodday, A. M. et al. The advanced-stage Hodgkin lymphoma international prognostic index: development and validation of a clinical prediction model from the HoLISTIC consortium. J. Clin. Oncol. 41, 2076–2086 (2023).
Cassoux, N. et al. Ocular and central nervous system lymphoma: clinical features and diagnosis. Ocul. Immunol. Inflamm. 8, 243–250 (2000).
Hoffman, P. M., McKelvie, P., Hall, A. J., Stawell, R. J. & Santamaria, J. D. Intraocular lymphoma: a series of 14 patients with clinicopathological features and treatment outcomes. Eye 17, 513–521 (2003).
Gozzi, F. et al. Artificial intelligence-assisted processing of anterior segment OCT images in the diagnosis of vitreoretinal lymphoma. Diagnostics 13, 2451 (2023).
Arber, D. A. et al. The 2016 revision to the World Health Organization classification of myeloid neoplasms and acute leukemia. Blood 127, 2391–2405 (2016).
Coupland, S. E. Analysis of intraocular biopsies. Dev. Ophthalmol. 49, 96–116 (2012).
Abrey, L. E. et al. Report of an international workshop to standardize baseline evaluation and response criteria for primary CNS lymphoma. J. Clin. Oncol. 23, 5034–5043 (2005).
Hoang-Xuan, K. et al. Diagnosis and treatment of primary CNS lymphoma in immunocompetent patients: guidelines from the European Association for Neuro-Oncology. Lancet Oncol. 16, e322–e332 (2015).
Berkelmans, G. F. N. et al. Population median imputation was noninferior to complex approaches for imputing missing values in cardiovascular prediction models in clinical practice. J. Clin. Epidemiol. 145, 70–80 (2022).
Grandini, M., Bagli, E. & Visani, G. Metrics for multi-class classification: an overview. Preprint at https://doi.org/10.48550/arXiv.2008.05756 (2020).
Hu, J. et al. Identification and validation of an explainable prediction model of acute kidney injury with prognostic implications in critically ill children: a prospective multicenter cohort study. EClinicalMedicine 68, 102409 (2024).
fudanRenjun. fudanRenjun/primary-vitreoretinal-lymphoma: a noninvasive machine learning model using a complete blood count for screening of primary vitreoretinal lymphoma. Preprint at https://doi.org/10.5281/zenodo.17189239 (2025).
Acknowledgements
This work was supported by the National Natural Science Foundation of China (82302582) awarded to S.J.L., the Shanghai Municipal Health Commission Project (20224Y0317) awarded to S.J.L., and the Industry-University-Research Innovation Fund for Chinese Universities (2023JQ006) awarded to W.J.C. The sponsor or funding organization had no role in the design or conduct of this research. We gratefully acknowledge the dedicated support provided by the staff of the community health service centres in Xuhui District, whose contributions were essential to the successful completion of this research.
Author information
Contributions
W.J.C., M.G., J.C., and S.J.L. conceptualized and designed this study. S.J.L., Y.Z.L., J.N.W., J.Z.C., D.H.L., and J.R. performed most experiments. M.Y.Z., H.G.H., and Y.X.S. performed partial experiments. S.J.L., J.Z.C., D.H.L., and J.R. completed the acquisition and analysis of data. S.J.L., J.R., J.Z.C., D.H.L., J.C., and J.N.W. prepared the figures and performed the statistical analysis. S.J.L., J.C., and J.Z.C. wrote the original draft. W.J.C., J.C., and M.G. reviewed the manuscript and supervised the work. All the authors have read and approved the final manuscript.
Ethics declarations
Competing interests
The authors declare no competing interests.
Peer review
Peer review information
Nature Communications thanks the anonymous reviewers for their contribution to the peer review of this work. A peer review file is available.
Additional information
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.
About this article
Cite this article
Li, S., Cao, J., Li, D. et al. A noninvasive machine learning model using a complete blood count for screening of primary vitreoretinal lymphoma. Nat Commun 16, 10667 (2025). https://doi.org/10.1038/s41467-025-65693-0
DOI: https://doi.org/10.1038/s41467-025-65693-0