Introduction

Pathological changes in the retina can cause severe vision loss or even blindness, and the accurate diagnosis of retinal lesions is extremely important1. However, the number of ophthalmologists in China is severely mismatched with the size of the population, especially in many rural and underdeveloped towns. The shortage of ophthalmologists, coupled with outdated equipment, makes it difficult to diagnose retinal diseases and some pathological changes.

The implementation of artificial intelligence (AI)2 in applications such as medical imaging, automated diagnosis, and robotic surgery has become increasingly popular in the modern medical industry3,4,5,6. Recent studies have shown that deep learning (DL) models can detect and diagnose various retinal diseases by interpreting ocular data from different diagnostic modalities, including fundus photographs, optical coherence tomography (OCT), and visual fields7. The integration of DL algorithms into ophthalmology could play a crucial role in the screening and diagnosis of eye diseases in the future, especially in areas where ophthalmologists are scarce8.

AI has the potential to revolutionize disease diagnosis and management by performing classification tasks on behalf of human experts and by rapidly reviewing immense numbers of images9. The analysis of fundus photographs based on DL algorithms has been reported for the automated screening and diagnosis of common retinal diseases, including diabetic retinopathy (DR)3,10,11, age-related macular degeneration (AMD)12,13,14,15, retinopathy of prematurity (ROP)16,17, glaucoma18,19,20, papilledema21, and anaemia22. Choi et al.23 applied a deep convolutional neural network for the automated detection of twelve retinal diseases using a small database of fundus photographs, but their model could detect fewer than 10 retinal diseases. Kim et al.24 conducted a secondary diagnosis of normal conditions and eight retinal diseases using AI tools based on fundus photographs. Cen et al.25 established a two-tier grading system for 39 diseases and conditions using various fundus image datasets. All of the methods mentioned above relied on fundus photographs.

Recently, the analysis of OCT images based on DL algorithms has been reported for the automated screening and diagnosis of retinal diseases. Compared with 2D fundus photographs, OCT provides 3D imaging of the retinal structure, which can increase the sensitivity of detecting retinal diseases at an early stage26. Kermany et al.27 developed an effective transfer learning algorithm for retinal disease diagnosis based on OCT images from 4686 patients across multiple centers; however, their system is limited to detecting only three retinal diseases. Hassan et al.28 presented a novel incremental cross-domain adaptation (cross-DA) approach that enables classification networks to retain their prior knowledge while incrementally learning new cross-modality retinopathy screening tasks, achieving the diagnosis of five retinal diseases on OCT and six on fundus photographs. However, the diagnostic accuracy for some retinal diseases with limited samples remains relatively low. In this study, we developed an AI-assisted diagnosis system based on extremely large-scale multicenter OCT data, which can detect 15 retinal anomalies (including retinal diseases and some pathological changes) and achieves an average performance level comparable to that of attending ophthalmologists.

The analysis of big retinal disease data can elucidate complex interrelationships among multiple disease factors, thereby enhancing our understanding of disease onset and progression. Healthcare data analytics has the potential to bring dramatic changes to the healthcare industry by streamlining processes and improving the quality of care29. Dias et al.30 emphasized the importance of collecting real-time data for effective medical decision support. Chen et al.31 investigated the national burden of eye diseases in China from 1990 to 2019, utilizing statistical data from the Global Health Data Exchange32; yet their study covered only six eye diseases. In this study, we developed a cloud platform, a big data center, and an AI-assisted diagnosis algorithm for retinal disease diagnosis. The system has been deployed in 207 medical institutions across 29 provinces in China. This widespread adoption underscores its significant clinical value and highlights our focus on AI-based diagnosis of retinal diseases in real clinical environments.

In summary, this study developed an AI-empowered cloud platform, AI-PORAS, which can detect 15 retinal anomalies in OCT images with an average accuracy of 93.16% on a testing dataset of 165,384 eyes with 3,551,959 OCT B-scan slices. The proposed AI diagnostic model (OCT-AI), based on a domain adaptation method, accommodates the real-world evolution of retinal anomaly categories and the variability of OCT images across different OCT scanners. AI-PORAS has been adapted to a commercial OCT scanner, the BV1000, and has been deployed in 207 medical institutions across 29 provinces in China, including remote areas where resources are limited. The deployed system had remotely diagnosed 116,717 patients for 15 retinal anomalies at the time of submission. In addition, extensive statistical analysis was conducted on the diagnosed patients in terms of age, gender, geographical location within China, and disease growth patterns, with the goal of identifying potential factors contributing to retinal anomalies and providing valuable decision-making guidance for government regulatory bodies. To the best of our knowledge, this is the first large-scale real-world study to use AI systems to simultaneously localize, classify, and statistically analyze 15 retinal anomalies in OCT images. It is also the most extensively deployed AI cloud system for OCT-based retinal disease screening, serving 207 medical institutions in 29 provinces in China. Moreover, efforts are underway to promote the deployment of AI-PORAS in more medical institutions worldwide, benefiting people across various countries and regions.

Results

Localization of retinal anomalies on the source domain testing dataset

The OCT-AI model was trained for retinal disease localization on the source domain dataset. Figure 1 shows four example OCT B-scan slices with bounding boxes placed by the OCT-AI model for the identified diseases. Mean intersection over union (IoU) values achieved by OCT-AI and by the different groups of ophthalmologists are shown in Fig. 2. The OCT-AI model achieved an average IoU of 0.44 across all 14 diseases, which was better than the performance of the attending and resident ophthalmologist groups, each of which obtained an average IoU of 0.42. The OCT-AI model outperformed or was comparable to both ophthalmologist groups in localizing 9 of the 14 anomalies and was inferior in localizing the remaining 5. All IoU values were calculated against the ground truth manually annotated by the chief ophthalmologist group. On average, OCT-AI achieved the highest IoU of 0.44, indicating that the diseased regions identified by the OCT-AI model were close to the chief ophthalmologists' annotations.
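For reference, the IoU between a predicted bounding box and a ground-truth annotation can be computed as follows. This is an illustrative sketch (not the study's implementation), assuming boxes given as (x1, y1, x2, y2) corner coordinates:

```python
def iou(box_a, box_b):
    """Intersection over union of two axis-aligned boxes (x1, y1, x2, y2)."""
    # Corners of the intersection rectangle
    ix1 = max(box_a[0], box_b[0])
    iy1 = max(box_a[1], box_b[1])
    ix2 = min(box_a[2], box_b[2])
    iy2 = min(box_a[3], box_b[3])
    # Clamp to zero when the boxes do not overlap
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0
```

An IoU of 1.0 indicates perfect overlap and 0.0 no overlap; the reported averages of 0.42–0.44 correspond to moderate but consistent agreement with the chief ophthalmologists' boxes.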

Fig. 1: Four example results of retinal anomaly detection.

Left: Annotations by the chief ophthalmologist group (ground truth). Right: Diseased regions identified by OCT-AI model. a PVD, HE, CME, ERM, and Drusen; b PVD and Drusen; c PVD, HE, and SRF; d FTMH. Red: PVD. Dark Green: ERM; Bright Green: Drusen; Blue: HE; Dark Yellow: SRF; Bright Yellow: CME; Magenta: FTMH; Cyan: IS/OS. The values represent the predicted probability of retinal anomalies.

Fig. 2: Comparison of OCT-AI performance with ophthalmologists on the source domain dataset.

a Left: average IoUs between the ground truth and the bounding boxes localized by OCT-AI and the two ophthalmologist groups on the source domain testing dataset. b Right: ROC curves and AUC values in the four-fold cross-validation on the source domain training dataset.

Performance on the source domain testing dataset

As shown in Fig. 3, OCT-AI trained on the source domain training dataset achieved the best average accuracy, precision, F1 score, FNR, and Kappa, and a slightly worse FPR, compared with the attending ophthalmologist group on the testing dataset. ROC curves and AUCs were also calculated for the competing methods, as shown in Fig. 4. The average ROC curve of OCT-AI over all 14 anomalies had an AUC of 0.8559 (Fig. 4a). Individual ROC curves for the 14 anomalies are shown in Fig. 4b. Overall, OCT-AI reached or slightly exceeded the attending ophthalmologists' expertise level in classifying the 14 retinal anomalies and performed much better than the resident ophthalmologists. In the retinal disease localization task (Fig. 2), OCT-AI achieved a slightly better average IoU (0.44) than the attending ophthalmologists (0.42) and the resident ophthalmologists (0.42), indicating that the OCT-AI system can localize the 14 retinal anomalies at an expertise level slightly above that of the attending ophthalmologists. It is worth noting that annotations by ophthalmologists are binary, so only a single point on the ROC curve can be calculated for each ophthalmologist group.

Fig. 3: Localization performance of different cohorts on source-domain test datasets.

Accuracy (a), precision (b), F1 score (c), FPR (d), FNR (e), and Kappa (f) were compared among the OCT-AI, attending ophthalmologist, and resident ophthalmologist cohorts.

Fig. 4: ROC curves and AUC values of the OCT-AI model.

Results of retinal anomaly classification on the source domain testing dataset. a Average performance over the 14 retinal anomalies. b Performance on each retinal disease.

Performances on the target domain adaptation dataset

The target domain adaptation dataset was collected with the commercial BV1000 OCT scanner, whereas the training dataset was collected with two other brands of OCT scanners (Optovue and Heidelberg), whose image characteristics differ from those of the BV1000. The four experiments described in Section OCT-AI Model Development were conducted for comparison, and all experiments used the target domain adaptation testing dataset for testing. As shown in Table 1, BV-DA-CNN achieved the best FNR and AUC. FR-CNN and YOLOv12 are direct applications of OCT-AI to the testing dataset collected by the BV1000 without adaptation, and FR-CNN had the best FPR. However, the FNR, ACC, and AUC scores of FR-CNN were much worse than those of DA-CNN, BV-FR-CNN, and BV-DA-CNN. Although YOLOv12 outperformed FR-CNN in terms of ACC and AUC (94.38% vs. 93.62%, 88.2% vs. 70.68%), FR-CNN performed better on the critical metrics of FNR and FPR. DA-CNN was not as good as BV-FR-CNN and BV-DA-CNN. BV-FR-CNN had the best ACC, and its FNR, FPR, and AUC were slightly worse than those of BV-DA-CNN. BV-DA-CNN achieved the best ROC curve, with an AUC of 0.9406, while FR-CNN was much worse than the other three methods, as indicated by its AUC of 0.7068 (Fig. 5a). The best method, BV-DA-CNN, achieved performance comparable to that on the source domain testing data, indicating that the model adaptation is effective.

Fig. 5: ROC curves and AUC values by various backbones.

a Left: values by FR-CNN, YOLOv12, DA-CNN, BV-FR-CNN, and BV-DA-CNN in the domain adaptation study. b Right: values by BV-FR-CNN and BV-DA-CNN on the external dataset.

Table 1 Results of domain adaptation dataset and external dataset

Performance metrics (AUC) were calculated from 50 repeated random subsamples (each 50% of the target domain adaptation dataset, n = 50) to substantiate the significance of the performance differences. As shown in Supplementary Fig. 1, ANOVA revealed statistically significant differences among the five methods (p < 0.0001). Paired Student's t tests further demonstrated that: 1) DA-CNN significantly outperformed both FR-CNN and YOLOv12 (p < 0.0001); 2) BV-FR-CNN showed a statistically significant improvement over FR-CNN (p < 0.0001); 3) BV-DA-CNN achieved significant gains versus BV-FR-CNN (p < 0.0001).
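The significance-testing procedure above can be sketched in SciPy as follows. The AUC values here are synthetic stand-ins for the 50 subsample measurements (the means and spreads are illustrative assumptions only, not the paper's data):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

# Hypothetical AUC values from 50 repeated 50% subsamples per method
# (illustrative assumptions only; not the study's measurements).
auc = {
    "FR-CNN":    rng.normal(0.71, 0.005, 50),
    "YOLOv12":   rng.normal(0.88, 0.005, 50),
    "DA-CNN":    rng.normal(0.91, 0.005, 50),
    "BV-FR-CNN": rng.normal(0.93, 0.005, 50),
    "BV-DA-CNN": rng.normal(0.94, 0.005, 50),
}

# One-way ANOVA across the five methods
f_stat, p_anova = stats.f_oneway(*auc.values())

# Paired Student's t tests on the same 50 subsamples
_, p1 = stats.ttest_rel(auc["DA-CNN"], auc["FR-CNN"])        # DA-CNN vs FR-CNN
_, p2 = stats.ttest_rel(auc["BV-FR-CNN"], auc["FR-CNN"])     # BV-FR-CNN vs FR-CNN
_, p3 = stats.ttest_rel(auc["BV-DA-CNN"], auc["BV-FR-CNN"])  # BV-DA-CNN vs BV-FR-CNN
print(f"ANOVA p = {p_anova:.2e}; paired t-test p-values: {p1:.2e}, {p2:.2e}, {p3:.2e}")
```

The paired test (`ttest_rel`) is the appropriate choice here because each method is evaluated on the same 50 subsamples, so per-subsample differences can be compared directly.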

Performances on the external dataset

As of submission, we had diagnosed 116,717 patients with 204,491 eyes using the AI-PORAS platform deployed across 29 provinces and municipalities in China. We utilized 20% of the dataset (39,053 eyes with 861,314 OCT B-scan slices) to retrain the OCT-AI model and the remaining 80% (165,384 eyes with 3,551,959 OCT B-scan slices) for testing. As shown in Table 1 and Fig. 5b, BV-FR-CNN achieved better performance than BV-DA-CNN. While BV-DA-CNN used a source domain dataset for domain adaptation, BV-FR-CNN did not, indicating that extremely large-scale data expansion has a more pronounced impact on model performance than domain adaptation. However, in real-world scenarios, obtaining an extremely large-scale labeled dataset at once is impractical. Therefore, before accumulating a larger-scale target domain dataset, we used the better-performing BV-DA-CNN as the backbone for OCT-AI. In the future, we plan to replace it with BV-FR-CNN, retrained on larger-scale data, which is expected to demonstrate superior performance.

Detailed results on the source domain dataset

The OCT-AI model was trained on the source domain training dataset and tested on the source domain testing dataset for the classification of multiple retinal anomalies; the performances are shown in Fig. 3. The manual annotations made by the chief ophthalmologist group were used as ground truth, and the performance of OCT-AI was compared with that of the attending and resident ophthalmologist groups. Accuracy, precision, F1 score, false positive rate (FPR), false negative rate (FNR), and the Kappa coefficient (Kappa) were used as performance metrics. In summary, the OCT-AI system achieved similar or better performance than the attending ophthalmologist group and was significantly better than the resident ophthalmologist group, especially for diseases with sufficient training data. Detailed results are presented below.
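All six per-disease metrics can be derived from a 2×2 confusion matrix. The sketch below is our illustrative reconstruction (not the study's evaluation code) of how they relate to the raw counts:

```python
def binary_metrics(tp, fp, fn, tn):
    """Per-disease metrics from confusion counts (true/false positives/negatives)."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total
    precision = tp / (tp + fp) if tp + fp else 0.0   # correct among all detections
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    fpr = fp / (fp + tn) if fp + tn else 0.0          # false alarms among disease-free cases
    fnr = fn / (fn + tp) if fn + tp else 0.0          # missed among annotated regions
    # Cohen's kappa: observed agreement corrected for chance agreement
    p_o = accuracy
    p_e = (((tp + fp) / total) * ((tp + fn) / total)
           + ((fn + tn) / total) * ((fp + tn) / total))
    kappa = (p_o - p_e) / (1 - p_e) if p_e != 1 else 0.0
    return dict(accuracy=accuracy, precision=precision, f1=f1,
                fpr=fpr, fnr=fnr, kappa=kappa)
```

For example, `binary_metrics(tp=40, fp=10, fn=10, tn=40)` yields accuracy 0.8, precision 0.8, F1 0.8, FPR 0.2, FNR 0.2, and Kappa 0.6.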

OCT-AI achieved an average accuracy of 93.12% in classifying the 14 anomalies (Fig. 3a), better than that achieved by the attending physicians (89.36%) and the resident physicians (87.24%). OCT-AI outperformed the two groups of physicians in classifying 11 out of 14 anomalies and was comparable or slightly worse in classifying three diseases, FTMH, RD and LMH. Accuracies of OCT-AI ranged from a low of 87.13% for Drusen to a high of 97.31% for PED.

Precision for a disease represents the fraction of correctly classified diseased regions among all regions identified for that disease. Figure 3b shows that 82.32% of all regions identified by OCT-AI were correct, averaged across the 14 retinal anomalies, outperforming the attending ophthalmologist group (76.21%) and the resident ophthalmologist group (76.28%). The highest precision achieved by OCT-AI was on RD (100%), and the lowest was on ERM (non-macular), with a precision of 36.65%.

The average F1 score of 0.78 achieved by OCT-AI was much higher than that of the attending physicians (0.58) and the resident physicians (0.48), as shown in Fig. 3c. F1 scores by OCT-AI across the 14 diseases ranged from a low of 0.48 on ERM (non-macular) to a high of 0.91 on both SRF and CME. OCT-AI obtained good F1 scores for the diseases with sufficient training data. For diseases with limited training data, e.g., FTMH (n = 75), RD (n = 79), IS/OS (n = 127), and LMH (n = 37), OCT-AI achieved F1 scores of 0.57, 0.51, 0.54, and 0.51, respectively, which were much lower than those of the attending ophthalmologist group. Overall, OCT-AI outperformed both ophthalmologist groups.

FPRs by OCT-AI on the 14 anomalies ranged from 0% to 11.01%, with the lowest on RD (0%) and the highest on Drusen (11.01%; Fig. 3d). The average FPR of OCT-AI was 3.69%, which was lower than that of the attending physicians (4.04%) but higher than that of the resident physicians (2.86%). The worst FPRs by OCT-AI, the attending physicians, and the resident physicians were 11.01% (Drusen), 11.97% (ERM, non-macular), and 13.63% (ERM), respectively.

FNR for a disease represents the ratio of missed diseased regions among all annotated regions for that disease. Figure 3e shows an average FNR of 21.46% for OCT-AI, which was much better than that of the attending ophthalmologist group (40.79%) and the resident ophthalmologist group (58.00%). OCT-AI performed worst on FTMH (n = 75), RD (n = 79), and IS/OS (n = 127), with FNRs of 57.94%, 65.49%, and 62.94%, respectively, where training data were limited. In contrast, the attending ophthalmologist group performed relatively well on these diseases, achieving FNRs of 2.80%, 7.96%, and 34.01%, respectively, illustrating that human experts drew on clinical experience that was independent of the training dataset. Conversely, human experts performed poorly on CNV, Drusen, and Retinoschisis (FNRs all above 66.00%), while OCT-AI performed much better because these three diseases had much more training data (n = 477, 873, and 462, respectively).

Kappa is a metric for consistency testing; higher values are better. Figure 3f shows a trend similar to that of FNR: for diseases with limited training data, the attending ophthalmologist group performed better than OCT-AI, and vice versa. On average, OCT-AI (0.76) was much better than the attending physicians (0.60) and the resident physicians (0.47). The best Kappa values achieved by OCT-AI were for diseases with relatively more training samples, e.g., SRF (Kappa = 0.88, n = 439), CME (Kappa = 0.85, n = 545), and HE (Kappa = 0.80, n = 656).

Four-fold cross-validation was conducted on the training dataset to assess the proposed OCT-AI model for the classification of retinal anomalies. The training dataset was divided into four parts; the system was trained on three of the four parts and tested on the remaining part. This procedure was repeated four times so that each part was used for testing exactly once. Performance metrics for each fold are listed in Table 2. Receiver operating characteristic (ROC) curves and area under the ROC curve (AUC) values are shown in Fig. 2b. All performance metrics across the four folds show strong internal consistency, confirming the reliability of the data annotation process.
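The fold construction can be sketched with scikit-learn as follows. This is an illustrative split over slice indices only; `train_model` and `evaluate` stand for the (hypothetical) training and evaluation routines, which are not reproduced here:

```python
import numpy as np
from sklearn.model_selection import KFold

n_slices = 25978  # size of the source domain training dataset
indices = np.arange(n_slices)

kf = KFold(n_splits=4, shuffle=True, random_state=42)
fold_sizes = []
for fold, (train_idx, test_idx) in enumerate(kf.split(indices), start=1):
    # model = train_model(indices[train_idx])        # hypothetical training call
    # metrics = evaluate(model, indices[test_idx])   # hypothetical evaluation call
    fold_sizes.append(len(test_idx))
    print(f"fold {fold}: train={len(train_idx)}, test={len(test_idx)}")
```

Each slice appears in exactly one test fold, so the four held-out parts together cover the full training dataset.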

Table 2 Results of four-fold cross-validation on the training dataset (Averaged on the 14 retinal anomalies)

Analysis of OCT-AI failure cases

FPR and FNR are critical metrics for evaluating detection model performance. As shown in Supplementary Fig. 2a, b, we used a quantitative approach based on conditional false positive and conditional false negative rates to reveal inter-anomaly misclassification correlations. 1) Conditional false positive – target false negative correlation: for all false-positive eyes of a given conditional anomaly, we calculated the false negative rate of each other target anomaly (Supplementary Fig. 2a). Specifically, all false-positive cases for each anomaly were input to the model and the false negative rates of the other anomalies were estimated (e.g., among all RD false-positive eyes, the PVD false negative rate was 18.86% and the ERM false negative rate was 15.2%). 2) Conditional false negative – target false positive correlation: for all false-negative eyes of a given conditional anomaly, we calculated the false positive rate of each other target anomaly (Supplementary Fig. 2b). Specifically, all false-negative cases for each anomaly were input to the model and the false positive rates of the other anomalies were estimated (e.g., among all RD false-negative cases, the PVD false positive rate was 32.02% and the ERM false positive rate was 13.98%).
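One way to compute such a conditional rate is sketched below. This is our reconstruction of the described procedure under the assumption that per-eye labels are stored as boolean matrices; the function name and data layout are illustrative:

```python
import numpy as np

def conditional_fn_given_fp(y_true, y_pred, cond, target):
    """Among eyes falsely predicted positive for anomaly `cond`, the false
    negative rate of anomaly `target` (present in ground truth but missed).
    y_true, y_pred: boolean arrays of shape (n_eyes, n_anomalies)."""
    # Select eyes that are false positives for the conditional anomaly
    fp_cond = y_pred[:, cond] & ~y_true[:, cond]
    sub_true, sub_pred = y_true[fp_cond], y_pred[fp_cond]
    # Among those eyes, find where the target anomaly is truly present
    present = sub_true[:, target]
    if present.sum() == 0:
        return 0.0
    # ...and count how often the model missed it
    missed = present & ~sub_pred[:, target]
    return float(missed.sum() / present.sum())
```

The symmetric quantity (conditional false negative – target false positive) follows the same pattern with the roles of `y_true` and `y_pred` swapped when selecting the conditioning eyes.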

The analysis revealed that: 1) no false-negative anomalies occurred in CE false-positive eyes, and the other false-positive anomalies exhibited a consistent impact on the CE false negative rate; 2) among all false-positive eyes, PVD had a large impact on the false negative rate of EZ (31.1%), and PED influenced the false negative rate of macular hole (MH) (30.0%); 3) in the false-negative cases of each anomaly, no false-positive CE was observed; 4) in PED false-negative cases, the false positive rates of SRF, HE, and CNV reached 74.5%, 64.7%, and 61.9%, respectively. These findings indicate that CE exhibits minimal involvement in the misdiagnosis or missed detection of other anomalies, whereas the missed detection of PED serves as a major confounding factor contributing to false positives in the detection of other retinal anomalies.

Supplementary Fig. 2c–e show misclassification cases by the OCT-AI model (green bounding boxes: ground truth; yellow bounding boxes: BV-DA-CNN predictions), in which RD was incorrectly classified as SRF. In these slices, owing to the large area involved in the detachment, it is challenging to distinguish whether the lesion is RD or SRF. Notably, such ambiguous cases also challenge clinical diagnosis, highlighting the need to enhance the model's sensitivity to subtle morphological distinctions. Supplementary Fig. 2e involves a slice with RD in which the image quality is poor. Owing to the overall low brightness (possibly caused by scanning parameters or patient cooperation issues), the detached neurosensory layer was not clearly visualized, and consequently the model failed to detect the abnormality. This result highlights the model's limitations on low-quality OCT images, which aligns with the practical difficulties faced by clinicians.

These representative case analyses reveal potential directions for model improvement in complex clinical scenarios. In the future, we will explore multi-scale feature fusion strategies based on medical foundation models to enhance the model’s ability to detect small or blurred lesions, as well as advanced image enhancement techniques to improve the model’s robustness to low-quality images.

Discussion

Millions of people worldwide are affected by retinal diseases such as diabetic retinopathy (DR)33, age-related macular degeneration (AMD)14, glaucoma34, and retinal detachment (RD). Without accurate diagnosis and timely, appropriate treatment, these retinal diseases can lead to irreversible blurred vision, metamorphopsia, visual field defects, or even blindness. However, rural and remote areas, especially in developing countries, suffer from a severe scarcity of medical resources. Consequently, patients with retinal diseases may not receive timely treatment in primary healthcare settings owing to a deficiency of ophthalmologists and equipment. In China, there is also a shortage of ophthalmologists. According to China's White Paper on Eye Health, there were 44,800 ophthalmologists nationwide by 2020, equating to 1.6 ophthalmologists per 50,000 people; in remote areas, the figure is even less than 0.6 per 50,000 people35.

In recent years, the analysis of retinal images based on DL algorithms has been reported for the automated screening and diagnosis of retinal diseases. However, most studies have shown that models trained on fundus images can detect only 10 or fewer retinal diseases23,24,36,37,38,39,40, with a few capable of detecting over 1025,41,42,43 and one capable of detecting 39 retinal diseases25, with accuracies between 30.5% and 87.42% and AUC values between 94.1% and 99.84%. The OCT slice image-based studies conducted by Kermany27, Ozdas44, and Tayal45 can detect only 3 or fewer retinal diseases, with accuracies of 96.6%, 95.7%, and 96.5%, respectively, and the study by De Fauw et al.46 can make referral recommendations for 4 retinal diseases. The ultra-widefield fundus (UWF) image-based studies performed by Sun47 and Cui48 can detect 8 and 5 retinal diseases, respectively, reaching the accuracy level of retina specialists. The study by Hassan28 showed that a deep learning algorithm trained on both fundus and OCT modalities was able to detect 6 retinal diseases in OCT and 5 in fundus images, with an average accuracy of 98.26%. Deep learning algorithms trained with OCT images provided higher accuracies than those trained with fundus images but could detect only fewer retinal diseases.

Recent advancements highlight AI's potential in diagnosing retinal diseases such as DR, AMD, and glaucoma49. However, despite deep learning models achieving expert-level performance in retinal fundus image analysis3,50,51, their clinical deployment remains limited by persistent challenges, including data heterogeneity, algorithmic transparency, and regulatory compliance50. Studies in radiology and oncology highlight that AI deployment requires rigorous multi-phase validation (e.g., retrospective testing, prospective trials) to mitigate biases52,53,54, yet these fields face similar barriers in data quality, algorithmic transparency, and ethical concerns. Likewise, ophthalmology urgently requires clinically deployed and validated AI frameworks. Pioneering studies such as De Fauw et al.46 validated the clinical applicability of AI in a tertiary hospital, while commercial solutions such as EyRIS (approved in seven countries) demonstrate partial success in DR/AMD screening55. Critically, their applications remain constrained by narrow disease coverage and localized workflows. Additionally, there are no large cloud services or big data centers powered by AI techniques that are specifically designed for clinical retinal disease diagnosis and analysis.

This paper introduces an Artificial Intelligence Cloud Platform for OCT-based Retinal Disease Screening (AI-PORAS). OCT images contain more information than fundus images, and AI-PORAS can diagnose up to 15 retinal anomalies from them. It streamlines and expedites the remote diagnosis of retinal diseases, leading to enhanced diagnostic accuracy and a reduced workload for ophthalmologists. AI-PORAS also facilitates large-scale data management for advanced data analytics. The platform has successfully diagnosed 116,717 individuals across 29 provinces, municipalities, and autonomous regions in China, and this number continues to grow. Statistical analysis of the patient dataset showed that females exhibit a slightly higher prevalence of retinal diseases, ~3.16% higher than males. The incidence rates of CA, ERM, and EZ continued to increase in 2023, a trend that warrants long-term attention. Additionally, a higher incidence rate of retinal diseases was observed in central China, while southern China exhibited the lowest incidence rates. Details of the statistical analysis of the correlations of retinal disease with gender (Supplementary Fig. 6a and Supplementary Tables 4, 5), age (Supplementary Figs. 6b, 7 and Supplementary Tables 6–8), trends (Supplementary Fig. 8 and Supplementary Table 9), and region (Supplementary Fig. 9 and Supplementary Tables 10–13) are presented in Statistical Analysis on the External Dataset in the Supplementary Information. AI-PORAS has demonstrated significant clinical value, is being adopted by an increasing number of healthcare institutions, and has high potential for further promotion to more countries and regions.

Different types of OCT scanners are available on the market, and the resulting OCT images may have different characteristics that could affect the performance of OCT-AI models. Additionally, AI system development typically spans several years, and the understanding of retinal diseases is constantly updated with new research findings; therefore, the categories of diseases automatically identified by the model must be adjusted accordingly. For example, our historically collected source domain dataset was obtained from Optovue and Heidelberg OCT scanners and covers 14 retinal anomalies, while AI-PORAS was deployed across China with the self-developed commercial OCT scanner, the BV1000, covering 15 retinal anomalies. Therefore, domain adaptation techniques across different scanners and disease categories are necessary to effectively utilize the historically accumulated source domain dataset from Optovue and Heidelberg for training the deep learning model.

The OCT-AI model, trained on 25,978 OCT B-scan slices from the source domain dataset, performed at a level comparable to attending ophthalmologists. It achieved an IoU of 0.44 for abnormality localization in OCT images and demonstrated an average accuracy of 93.12%, precision of 82.32%, F1 score of 0.78, FPR of 3.69%, FNR of 21.46%, and Kappa of 0.76 across the 14 retinal disease classifications. All metrics except FPR surpassed those of the ophthalmologists. Given the impracticality of acquiring an extremely large-scale dataset at once in a real clinical setting, domain adaptation techniques were employed to assess the generalization capability of OCT-AI. This approach enabled the model to quickly adapt to a new OCT scanner and new disease categories (BV-DA-CNN), achieving an accuracy of 96.62%, an AUC of 94.06%, an FPR of 1.59%, and an FNR of 21.94% on 243,361 OCT B-scan slices in the target domain dataset. Compared with the baseline method (FR-CNN), three metrics (accuracy, AUC, and FNR) represent improvements of 3%, 23.38%, and 45.57%, respectively. Our study demonstrated that extremely large-scale data expansion has a more significant impact on model performance than domain adaptation, as shown in Table 1. On a larger external dataset comprising 4,413,273 OCT B-scan slices, BV-DA-CNN achieved an average accuracy of 93.16%, an AUC of 93.64%, an FPR of 6.82%, and an FNR of 7.87%. Although these scores were slightly lower than those of BV-FR-CNN, which benefits more from extremely large-scale data expansion, BV-DA-CNN remains the optimal choice for early clinical deployment. All data used in this study were de-identified and did not require subject notification.

In summary, reading and diagnosing retinal anomalies in OCT images requires intensive medical training. Many remote regions lack such resources, and even primary hospitals in developed regions are unable to meet the high demand for human expertise given the unprecedented increase in diagnostic imaging volume. AI-PORAS is a potential solution to this challenge, providing remote image reading and interpretation services to hospitals in remote areas. As patient volume continues to grow, analysis of the accumulated data will offer more insightful guidance on these diseases, and the data can be used to continuously improve the performance of AI-PORAS. To the best of our knowledge, this is the first large-scale real-world study to use AI systems to simultaneously localize, classify, and statistically analyze 15 retinal anomalies in OCT images. It is also the most extensively deployed AI cloud system for OCT-based retinal disease screening, serving 207 medical institutions in 29 provinces in China.

Our future work on improving AI-PORAS includes: 1) incorporating a retinal stratification algorithm to conduct a comprehensive thickness analysis of retinal layers across diverse regions, nationalities, and age groups, which could significantly enhance the understanding of eye health and provide valuable guidance and decision-making suggestions for healthcare professionals in delivering more effective and personalized care to patients; 2) integrating patient factors such as gender, region, and age group into the algorithm design to increase the specificity of AI-PORAS; our objective is to validate the efficacy of OCT-AI in a clinical setting that mimics real-world scenarios, which not only enhances trust in AI technology within the medical domain but also furnishes valuable insights and guidance for future clinical applications; and 3) moving beyond the current design, which is a systematic refinement of standard frameworks rather than a fundamental network redesign: future work will explore attention mechanisms, develop new foundation model-based frameworks, and explore hybrid architectures combining Transformers and CNNs to further optimize small-lesion detection performance.

Our study has limitations. First, the current statistical analysis is conducted from four perspectives: region, gender, age, and trends; more subtle and potentially valuable information remains to be explored in depth, such as the underlying correlations between the incidences of certain diseases. Second, the OCT-AI model was trained to detect 15 common retinal anomalies, so quite a few retinal diseases still cannot be handled by our system, partly because these anomalies are rare and there are not enough annotated images for training. Finally, accurately localizing and classifying all diseases in OCT images is challenging, and there are large variations among human experts, which need to be addressed to develop a robust OCT-AI model.

Methods

Study design

We developed the Artificial Intelligence Cloud Platform for OCT-based Retinal Disease Screening (AI-PORAS) for retinal disease diagnosis, as illustrated in Fig. 6. AI-PORAS consists of four key components: a cloud server powered by an in-house AI-assisted diagnosis model known as OCT-AI, an expert team of twelve ophthalmologists, OCT terminals at 207 medical institutions, and the Big Data Analysis Platform (Supplementary Fig. 3). Terminal OCT scanners across China capture images together with essential patient data, including age, gender, and region, and transfer the data to the cloud server for further processing. In the cloud server, OCT-AI first provides auxiliary diagnostic results, which are then transmitted to the expert team for confirmation and possible correction to reach a consensus diagnosis. The diagnostic results are subsequently transmitted back to the terminals for clinical use. Additionally, the Big Data Analysis Platform performs a comprehensive statistical analysis of the patient pool, including disease growth trends, patient ages, genders, and regions.

Fig. 6: Architecture of AI-PORAS.

AI-PORAS consists of four key components: a cloud server powered by an in-house developed AI-assisted diagnosis model known as OCT-AI, an expert team of twelve ophthalmologists, OCT terminals at 207 medical institutions, and the Big Data Analysis Platform.

Prior to the deployment of AI-PORAS, the OCT-AI system was pretrained with the source domain dataset shown in Supplementary Table 1, which consists of 30,716 OCT B-scan slices collected from 4243 eyes at five Chinese hospitals using two OCT scanners (Optovue and Heidelberg). The trained OCT-AI system was then adapted to a commercial OCT scanner, the BV1000 made by Big Vision, using the domain adaptation dataset (243,361 OCT B-scan slices from 2948 eyes) shown in Supplementary Table 2 and a domain adaptation approach. The AI-PORAS platform, equipped with the BV1000 OCT terminal scanner, has been deployed at 207 medical institutions across China and, by the time of manuscript submission, had diagnosed 116,717 patients (204,491 eyes, 4,413,273 OCT B-scan slices), as shown in Supplementary Table 3; we refer to this as the external dataset. Diagnostic results on the source domain, target domain adaptation, and external datasets, as well as statistical analysis results on the external dataset after the platform's deployment, are reported in this study.

Our source domain dataset was collected using commercial OCT scanners (Optovue and Heidelberg). Subsequently, we developed the BV1000 scanner for the remotely deployed AI-PORAS system. However, the BV1000 has characteristics that differ from those of the two commercial OCT scanners, necessitating adaptation of the trained model to the newly developed scanner. Therefore, the BV1000 was used to collect the domain adaptation dataset to adapt AI-PORAS to the BV1000 before its deployment. Domain adaptation methods can minimize image-level shifts, including image style and illumination, as well as instance-level shifts, such as object appearance and size56. This allows full utilization of the historically accumulated source domain dataset collected with the Optovue and Heidelberg scanners, effectively simulating the effect of dataset expansion.

According to a multicenter study covering all 31 provinces of mainland China57, fifteen retinal anomalies are commonly observed both in the general population and in ophthalmology outpatient departments across multiple institutions. The anomalies investigated in this study include Posterior Vitreous Detachment (PVD), Retinal Pigment Epithelial Detachments (PED), Drusen, Choroidal Atrophy, Epiretinal Membrane (ERM), Ellipsoid Zone Disruption (EZ), Hard Exudate (HE), Retinoschisis (RC), Subretinal Fluid (SRF), Cystoid Macular Edema (CME), Retinal Hemorrhage (RH), Choroidal Neovascularization (CNV), Macular Hole (MH), Choroidal Excavation (CE), Retinal Detachment (RD) and Retinal Atrophy (RA). Notably, the included pathologies represent hallmark features of major blinding diseases: Drusen, PED, CNV, and SRF are pathognomonic features of AMD, which affected 196 million people globally in 2020 and is projected to affect 288 million by 204058; CME, SRF, HE, and RH are commonly observed in Diabetic Retinopathy (DR), particularly in Diabetic Macular Edema (DME); and CME, SRF, and RH are also typical manifestations of Retinal Vein Occlusion, the second-leading retinal vascular blinding disease after DR59. This systematic inclusion of high-incidence and high-risk blinding retinal anomalies ensures the clinical generalizability and diagnostic significance of our AI model.

Expert team and data annotation

The OCT-AI model first locates potential abnormal regions and marks a bounding box for each identified region. It then classifies the tissue in each bounding box as one of the retinal anomalies. Thus, a ground truth annotation includes two parts: the location of the bounding box and the disease type of the tissue within it. To balance a comprehensive evaluation of model accuracy with practical deployment efficiency, we designed a specialized fine-grained annotation protocol for the source domain data and implemented a simplified yet efficient annotation workflow for the target domain data.

To precisely annotate the source domain training and testing datasets, 9 clinical ophthalmologists were recruited and divided into three groups by level of experience: chief ophthalmologists, attending ophthalmologists, and resident ophthalmologists. Each group, consisting of a group leader and two members, operated as follows: each member first annotated all the OCT B-scan slices independently; the group leader then reviewed all the annotations, resolved inter-annotator discrepancies, and confirmed the final annotations. The final annotations by the chief ophthalmologist group served as ground truth for computing performance metrics, while those by the other two groups were used in a comparative study. Each OCT B-scan slice was annotated as none, one, or more of the 14 retinal anomalies. Examples of annotated OCT B-scan slices are shown in Supplementary Fig. 4.

To efficiently annotate the target domain adaptation dataset, 3 OCT specialists with experience equivalent to that of the chief ophthalmologist group were recruited. Two group members first manually annotated the entire dataset, including the locations of the bounding boxes and their corresponding disease types. The team leader then reviewed all the annotations, resolved inter-annotator discrepancies, and confirmed the final annotations. The final annotations were used as ground truth for training and domain adaptation. Each OCT B-scan slice was annotated as none, one, or more of the 15 retinal anomalies. It is worth noting that the target domain adaptation dataset was annotated for 15 anomalies, some of which differed from the 14 anomalies annotated for the source domain dataset.

The collection and annotation of the source domain dataset lasted about two years, while the collection and annotation of the target domain adaptation dataset occurred more recently over a relatively short period. During this time, the expert team's understanding of the diseases evolved with data accumulation, leading to annotation differences between the datasets. Accordingly, an AI model based on a domain adaptation method was designed to adapt to such dynamic changes in real clinical scenarios.

Standardization protocols of diverse imaging platforms

Despite variations in hardware configurations among different OCT scanners, including but not limited to the splitting ratio of the fiber optic couplers (e.g., 1/9 or 2/8) and the light source (broadband low-coherence sources for spectral-domain OCT (SD-OCT) versus wavelength-swept lasers for swept-source OCT (SS-OCT)), all systems share the fundamental imaging principle of low-coherence interferometry60. Through complex signal analysis (Fourier or inverse Fourier transform) of the coherent light signals, these devices reconstruct cross-sectional fundus images. This universal physical mechanism ensures consistency in the core imaging process across the major commercial systems (Optovue, Heidelberg, BV1000). Achieving inter-device consistency therefore primarily relies on standardized acquisition protocols. Accordingly, in this study, all OCT imaging platforms were required to ensure the visibility of key fundus structures and to center the imaging on the macula.

In addition, OCT devices from different manufacturers exhibit significant variations in imaging resolution, speckle noise intensity, and other factors, which introduces a “domain shift” to the model (Supplementary Fig. 5). No explicit means were implemented on the imaging platforms themselves to enforce consistency across different devices. Therefore, we designed the BV-DA-CNN network with a fully supervised domain adaptation approach to minimize inter-device domain discrepancies. This enables the model to learn robust features from source devices (e.g., Optovue and Heidelberg) and apply them to target devices (e.g., BV1000) for disease detection.

To further ensure cross-device image consistency, we implemented a standardized preprocessing pipeline compliant with the input requirements of the maskrcnn-benchmark framework61. Specifically, all input images (shorter edge: \(x\), longer edge: \(y\)) underwent uniform resizing: \(x\) was first resized to a predefined minimum dimension \({S}_{\min }\) while maintaining the original aspect ratio. If this operation resulted in \(y\) exceeding a maximum threshold of \({S}_{\max }\) pixels, we instead constrained \(y\) to \({S}_{\max }\) pixels, proportionally adjusting \(x\). This approach preserved the original aspect ratio while ensuring scale consistency of the network inputs, effectively mitigating the impact of variations in acquisition parameters or inherent resolution. More critically, the proposed fully supervised domain adaptation method achieved consistent lesion representation through feature space alignment, effectively reducing the discrepancies between similar lesions captured by different imaging platforms at the feature level.
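The resizing rule above can be sketched as follows. The default values \(S_{\min}=800\) and \(S_{\max}=1333\) are the maskrcnn-benchmark defaults and are assumed here; the text does not state the thresholds actually used.

```python
def resize_dims(x, y, s_min=800, s_max=1333):
    """Compute resized (shorter, longer) edge lengths per the rule in the
    text: scale so the shorter edge x becomes s_min, unless the longer
    edge y would then exceed s_max, in which case scale so that y becomes
    exactly s_max. Aspect ratio is preserved either way. The default
    thresholds are the maskrcnn-benchmark values (an assumption)."""
    scale = s_min / x
    if y * scale > s_max:
        scale = s_max / y
    return round(x * scale), round(y * scale)
```

For example, a 480 x 640 B-scan is scaled up so its shorter edge reaches 800, while a very elongated 400 x 1600 image is instead capped so its longer edge equals 1333.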

OCT-AI model development

The proposed OCT-AI model advances the Faster R-CNN62 with an innovative domain adaptation framework tailored for OCT-based AI diagnostics. Lesions in OCT images typically exhibit multi-layer structures with blurred boundaries and irregular shapes, presenting high diversity. Compared to state-of-the-art detection architectures such as the YOLO series (e.g., YOLOv5/YOLOv8/YOLOv10) and the transformer-based DETR series (e.g., DETR, RT-DETR), Faster R-CNN demonstrates significant advantages in detecting medical lesions due to its RoIAlign mechanism for fine-grained feature extraction62,63. Furthermore, to address domain shifts arising from heterogeneous OCT scanners, we designed a novel domain adaptation framework with four specialized strategies: FR-CNN, DA-CNN, BV-FR-CNN and BV-DA-CNN.

We trained the OCT-AI model for retinal abnormality localization and classification. The model first identifies abnormal regions indicated by bounding boxes and then classifies each region as one of the annotated anomalies using a fully connected (FC) layer, as shown in the green box in Fig. 7. OCT-AI, similar to Faster R-CNN, structurally integrates region proposal, feature extraction, bounding box regression, and classification for fast object detection. The feature extractor can be any image classification backbone, typically pretrained on ImageNet64, such as VGG65, ResNet66, Inception V1-V467, or DenseNet68. We chose ResNet-10166 because its residual architecture makes training very effective.

Fig. 7: Diagram of the domain adaptive faster R-CNN model.

The green box contains the basic Faster R-CNN model, and the blue boxes are the modifications made to the Domain Adaptive Faster R-CNN in BV-DA-CNN. RPN Region proposal network, FC Fully connected, ROI Region of interest.

In our study, the source domain dataset was collected using Optovue and Heidelberg OCT scanners, while the target domain adaptation dataset was collected with the commercial BV1000 scanner, which has different specifications from those of the Optovue and Heidelberg systems. Additionally, it is worth noting that the target domain adaptation dataset was annotated for 15 anomalies, some of which differ from the 14 anomalies annotated in the source domain dataset. The OCT-AI model was designed based on domain adaptation to accommodate changes in the understanding of retinal anomaly categories and the variability of OCT images across different OCT scanners. This approach also enables the OCT-AI model to fully leverage historical source domain data before acquiring larger-scale target domain data. We first trained and tested OCT-AI with the source domain dataset collected by the Optovue and Heidelberg scanners, and then adapted the trained model using the target domain adaptation dataset collected by the BV1000, before deploying it to the real system equipped with the BV1000.

We designed four domain adaptation strategies; all experiments used the domain adaptation dataset for testing. 1) FR-CNN: we trained the Faster R-CNN model with the re-annotated source domain training dataset and then directly applied the trained model to the testing split of the target domain adaptation dataset. 2) DA-CNN: we adapted FR-CNN using the Domain Adaptive Faster R-CNN56 method with the unannotated training images in the domain adaptation dataset, making this method semi-supervised. The Domain Adaptive Faster R-CNN56 model employs two domain-adaptive classification components to align image-level and instance-level feature representations, as shown in the red box of Fig. 7, and uses consistency regularization to improve robustness at both levels. 3) BV-FR-CNN: we trained Faster R-CNN directly with the annotated training images in the target domain adaptation dataset. 4) BV-DA-CNN: we adapted the trained FR-CNN with the annotated training images in the domain adaptation dataset, making this method fully supervised. Notably, in BV-DA-CNN, we made the following modifications to DA-CNN: we (I) added multi-label classification to the image-level classifier to enhance overall anomaly prediction, as shown in the blue box of Fig. 7, (II) changed the instance-level classifier from binary classification to multi-class classification, and (III) removed the consistency regularization of the original Domain Adaptive Faster R-CNN, since both domains had labels for training.
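To make modifications (I)-(III) concrete, the following is a minimal numeric sketch of how the BV-DA-CNN objective could compose its terms: a multi-label image-level loss, a multi-class instance-level loss, and no consistency regularizer. The trade-off weight `lam` and the exact averaging are our assumptions, not details taken from the paper.

```python
import math

def multilabel_bce(probs, labels):
    """Image-level multi-label loss (modification I): one sigmoid output
    per anomaly class, averaged binary cross-entropy."""
    return -sum(y * math.log(p) + (1 - y) * math.log(1 - p)
                for p, y in zip(probs, labels)) / len(probs)

def multiclass_ce(probs, label):
    """Instance-level loss (modification II): multi-class cross-entropy
    over a per-proposal class distribution."""
    return -math.log(probs[label])

def bv_da_cnn_loss(det_loss, img_probs, img_labels,
                   inst_probs, inst_labels, lam=0.1):
    """Hypothetical composition of the BV-DA-CNN objective: detection
    loss plus weighted image- and instance-level terms; the consistency
    regularizer of the original DA Faster R-CNN is dropped (modification
    III). `lam` is an assumed trade-off weight."""
    img_term = multilabel_bce(img_probs, img_labels)
    inst_term = sum(multiclass_ce(p, l)
                    for p, l in zip(inst_probs, inst_labels)) / len(inst_probs)
    return det_loss + lam * (img_term + inst_term)
```

This sketch only illustrates the loss structure; in the actual model, the per-class probabilities would come from the image-level and instance-level classifier heads shown in Fig. 7.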

During the training of the OCT-AI object detection model, we employed a balanced 1:1 sampling ratio when selecting foreground and background proposals in the Region Proposal Network (RPN). The same sampling strategy was applied in the detection head responsible for fine-grained classification to maintain the balance between foreground and background. Although no additional sampling techniques were applied for inter-class lesion balancing, data augmentation techniques such as random rotation, vertical flipping, and horizontal flipping were used to increase the diversity of the data distribution and mitigate overfitting.
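As an illustration, the 1:1 foreground/background proposal sampling can be sketched as below. The fallback behavior when one class is scarce (letting the other class fill the remainder) is a common convention that we assume here; the text does not specify it.

```python
import random

def sample_balanced(fg_idx, bg_idx, batch, rng=random):
    """Sample proposal indices at a 1:1 foreground/background ratio, as
    used in the RPN and detection head. If one side has fewer proposals
    than half the batch, the other side fills the remainder (an assumed
    convention)."""
    n_fg = min(len(fg_idx), batch // 2)
    n_bg = min(len(bg_idx), batch - n_fg)
    return rng.sample(fg_idx, n_fg) + rng.sample(bg_idx, n_bg)
```

With plentiful proposals on both sides, a batch of 8 yields 4 foreground and 4 background samples; with a single foreground proposal, the batch is completed with 7 background samples.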

AI-PORAS system deployment

Retinal anomalies have become more prevalent in recent years, making it challenging to diagnose them relying solely on outpatient doctors, especially in rural areas where medical resources are limited. There is an urgent need for an efficient alternative diagnostic tool that can provide quick screening and early diagnosis of retinal disease to alleviate the uneven distribution of medical resources. For this purpose, we deployed our AI-PORAS platform at 207 medical institutions nationwide in China, through strategic partnerships or public health screenings, to store, view, process, and accelerate the screening and diagnosis of retinal anomalies, as shown in Fig. 6. Given that the BV-DA-CNN model demonstrated the best performance (Table 1) before larger-scale target domain data became available, it was selected as the backbone for the OCT-AI model, and the trained BV-DA-CNN was uploaded to our self-built private cloud. This enables remote users to transmit data through the network and obtain results. After a patient is examined, their clinical information and images are uploaded to the cloud server, which archives the data and requests the deployed OCT-AI model to perform the analysis. Once the analysis is completed, medical personnel log into the platform through a browser to review the original OCT data and the AI analysis results, and then write a report for the patient.

Implementation details

The system was implemented in PyTorch and trained with the Stochastic Gradient Descent (SGD) optimizer for 180,000 iterations on a deep learning server with 8 GeForce GTX 1080Ti Graphics Processing Units (GPUs). The initial learning rate was set to 0.0025 and the weight decay to 0.0001. To improve generalization, the training data were augmented by random flipping, random cropping, and noise addition.
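For reference, a single SGD update with the stated hyperparameters behaves as below. This is a bare update rule, not the training code: momentum and any learning rate schedule are omitted, and the actual training presumably used PyTorch's built-in optimizer.

```python
def sgd_step(params, grads, lr=0.0025, weight_decay=0.0001):
    """One plain SGD update with L2 weight decay, using the reported
    initial learning rate (0.0025) and weight decay (1e-4); momentum
    and scheduling are omitted for brevity."""
    return [p - lr * (g + weight_decay * p) for p, g in zip(params, grads)]
```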

Performance metrics

Intersection over Union (IoU) between the predicted and ground truth bounding boxes was used to assess the performance of disease localization. For retinal disease classification, the false negative rate (FNR, also known as the missed diagnosis rate), false positive rate (FPR, also known as the misdiagnosis rate), accuracy (ACC), precision (PRE), Kappa coefficient, and F1 score were used as performance metrics. In addition, the receiver operating characteristic (ROC) curve and the area under the ROC curve (AUC) were calculated for model comparison. These metrics are defined as:

$$FNR=\frac{FN}{FN+TP},FPR=\frac{FP}{FP+TN},ACC=\frac{TN+TP}{TN+TP+FP+FN}$$
(1)
$$PRE=\frac{TP}{TP+FP},F1score=\frac{2\times PRE\times Recall}{PRE+Recall}$$
(2)
$$Kappa=\frac{2\times (TP\times TN-FN\times FP)}{(TP+FP)\times (FP+TN)+(TP+FN)\times (FN+TN)}$$
(3)

where \({Recall}=\frac{{TP}}{{TP}+{FN}}\), and \({TP},{TN},{FP},{FN}\) denote true positive, true negative, false positive, and false negative, respectively.
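The metrics in Eqs. (1)-(3), together with the IoU used for localization, can be computed directly from the confusion-matrix counts and box coordinates. The (x1, y1, x2, y2) corner convention for boxes is an assumption; the paper does not specify its box encoding.

```python
def iou(box_a, box_b):
    """Intersection over Union of two boxes given as (x1, y1, x2, y2)
    corner coordinates (convention assumed)."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return inter / (area_a + area_b - inter)

def classification_metrics(tp, tn, fp, fn):
    """FNR, FPR, ACC, PRE, F1, and Kappa exactly as in Eqs. (1)-(3)."""
    pre = tp / (tp + fp)
    recall = tp / (tp + fn)
    return {
        "FNR": fn / (fn + tp),
        "FPR": fp / (fp + tn),
        "ACC": (tn + tp) / (tn + tp + fp + fn),
        "PRE": pre,
        "F1": 2 * pre * recall / (pre + recall),
        "Kappa": 2 * (tp * tn - fn * fp)
                 / ((tp + fp) * (fp + tn) + (tp + fn) * (fn + tn)),
    }
```

For instance, two unit-overlap 2 x 2 boxes give IoU = 1/7, and counts TP = 40, TN = 50, FP = 10, FN = 0 give ACC = 0.9 and Kappa = 0.8.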