High-resolution fundus images for ophthalmomics and early cardiovascular disease prediction

Guo, Na; Fu, Wanjin; Li, Heng; Zhang, Haoyun; Li, Tiantian; Zhang, Wei; Zhong, Xing; Pan, Tianrong; Sun, Fuchun; Gong, Ajuan

doi:10.1038/s41597-025-04930-z

Download PDF

Data Descriptor
Open access
Published: 03 April 2025

High-resolution fundus images for ophthalmomics and early cardiovascular disease prediction

Na Guo ORCID: orcid.org/0000-0001-5985-5669¹^na1,
Wanjin Fu²^na1,
Heng Li³,
Haoyun Zhang⁴,
Tiantian Li⁵,
Wei Zhang²,
Xing Zhong⁶,
Tianrong Pan⁶,
Fuchun Sun⁷ &
…
Ajuan Gong⁶

Scientific Data volume 12, Article number: 568 (2025) Cite this article

8325 Accesses
3 Citations
1 Altmetric
Metrics details

Subjects

An Author Correction to this article was published on 25 April 2025

This article has been updated

Abstract

Cardiovascular diseases (CVDs) remain the foremost cause of mortality globally, emphasizing the imperative for early detection to improve patient outcomes and mitigate healthcare burdens. Carotid intima-media thickness (CIMT) serves as a well-established predictive marker for atherosclerosis and cardiovascular risk assessment. Fundus imaging offers a non-invasive modality to investigate microvascular pathology and systemic vascular health. However, the paucity of high-quality, publicly available datasets linking fundus images with CIMT measurements has hindered the progression of AI-driven predictive models for CVDs. Addressing this gap, we introduce the China-Fundus-CIMT dataset, comprising bilateral high-resolution fundus images, CIMT measurements, and clinical data—including age and gender—from 2,903 patients. Our experiments with multimodal models reveal that integrating clinical information substantially enhances predictive performance, yielding AUC-ROC increases of 3.22% and 7.83% on the validation and test sets, respectively, compared to unimodal models. This dataset constitutes a vital resource for developing and validating AI-based early screening models for CVDs using fundus images and is now accessible to the research community.

Multimodal deep learning of fundus abnormalities and traditional risk factors for cardiovascular risk prediction

Article Open access 02 February 2023

Optimizing retinal images based carotid atherosclerosis prediction with explainable foundation models

Article Open access 30 September 2025

Association of carotid wall layers with atherosclerotic plaques and cardiac hypertrophy in hypertensive subjects

Article 15 June 2021

Background & Summary

Cardiovascular diseases account for approximately 18.6 million deaths annually, ranking as one of the leading causes of mortality worldwide^1,2,3. Atherosclerosis, the underlying pathology of many CVDs, often progresses silently until significant vascular stenosis or occlusion precipitates severe clinical events such as myocardial infarction or stroke⁴. Early diagnosis and timely intervention are therefore critical to halt disease progression, reduce the risk of acute events, improve patient prognoses, and alleviate pressures on healthcare systems.

Carotid intima-media thickness (CIMT) is a key predictive biomarker for assessing atherosclerosis and cardiovascular risk, with alterations in CIMT reflecting changes in vascular health. Studies have demonstrated that a 0.1 mm increase in CIMT correlates with a 1.12-fold escalation in mortality risk^5,6. Accurate measurement and monitoring of CIMT are essential for early detection of cardiovascular diseases and for guiding clinical interventions. However, traditional B-mode ultrasound methods are constrained by high equipment costs, the necessity for specialized expertise, time-intensive procedures, and limited accessibility, impeding their widespread adoption in routine CIMT screening.

Emerging research indicates that the structure and function of retinal vessels are significantly influenced by systemic diseases such as atherosclerosis, hypertension, diabetic nephropathy and neurocardiovascular conditions^7,8,9,10, rendering them sensitive indicators of systemic arterial health. Anatomically, the central retinal artery—a branch of the internal carotid artery within the cranium—shares a close relationship with the carotid artery. As depicted in Fig. 1, hemodynamic changes in the carotid artery can directly affect the morphology and function of retinal vessels, manifesting as characteristic pathological alterations, including retinal arterial narrowing, reduced microvascular density, venous dilation, arteriovenous crossing compression, cotton wool spots, punctate hemorrhages, and hard exudates. These manifestations are particularly evident in cases of carotid atherosclerosis or stenosis^11,12. Consequently, abnormalities in the microvascular structures observed in fundus images can serve as imaging biomarkers for carotid artery lesions, providing early warning indicators for carotid and other cardiovascular diseases¹³. The detection and identification of retinal microvascular abnormalities offer a promising avenue for the precise assessment of early cardiovascular risk.

Fundus images, rich in detail and complex features, present challenges for lesion recognition and analysis due to their high-dimensional nature. Traditional diagnostic methods are limited in their ability to detect subtle lesions, such as retinal arterial narrowing and decreased microvascular density, especially when multiple lesions coexist. Deep learning technologies have robust capabilities for analyzing large volumes of image data, allowing it to accurately capture these minute and complex lesion features, thereby providing more precise evidence for clinical diagnosis. In the context of early lesion detection and prediction, deep learning exhibits immense potential, not only compensating for the shortcomings of traditional methods but also offering new possibilities for early disease intervention and prevention^14,15.

The rapid advancement of deep learning has catalyzed the emergence of “ophthalmomics,” a research domain that leverages fundus images to predict systemic diseases. Utilizing the powerful feature recognition capabilities of deep learning, ophthalmomics has achieved significant progress in predicting various systemic diseases, including stroke¹⁶, Alzheimer’s disease^17,18, myocardial infarction¹⁹, coronary heart disease²⁰, diabetic nephropathy^21,22, atherosclerosis^23,24, hypertension⁸, anemia²⁵, diabetes²⁶, and hyperlipidemia²⁷, as shown in Fig. 2. These developments underscore the substantial potential of fundus imaging in systemic disease prediction. Nevertheless, research specifically focusing on predicting CIMT using fundus images remains scarce. The lack of publicly available, high-quality datasets on systemic diseases hampers the study of early diagnostic techniques for CIMT and impedes the exchange and advancement within the field of ophthalmomics.

To address these challenges, we have established a publicly accessible dataset comprised of high-quality bilateral fundus images, CIMT measurements, and comprehensive clinical information, including age and gender. This dataset provides a multimodal resource comprising high-quality bilateral fundus images, CIMT measurements, and demographic information (e.g., age and gender). It helps address the current scarcity of publicly available multimodal datasets for cardiovascular research. Researchers can utilize this dataset to evaluate the performance, robustness, and generalizability of artificial intelligence (AI) models in various clinical scenarios. Ultimately, this dataset may support the future development of non-invasive early screening tools for cardiovascular diseases, especially in settings with limited resources. One limitation of this dataset is the lack of detailed medical history, such as hypertension and diabetes, which may significantly influence the accuracy of CIMT predictions. Future studies will incorporate these clinical variables to further improve predictive performance.

Methods

Dataset collection

This study conducted a retrospective analysis of patient data from April 2015 to February 2024, collected from the Department of Endocrinology at the Second Affiliated Hospital of Anhui Medical University. This dataset predominantly comprises individuals of Chinese ethnicity from Anhui Province, China. The inclusion criteria were as follows: (1) inpatients in the Department of Endocrinology; (2) patients who underwent both fundus imaging and carotid ultrasound during hospitalization; and (3) based on ultrasound measurements, carotid intima-media thickness (CIMT) < 0.9 mm was classified as the normal group, while CIMT ≥ 0.9 mm was classified as the thickened group. The exclusion criteria included: (1) patients with poor image quality who could not provide valid research data. Importantly, participants with ocular diseases, including retinal or vitreous disorders, were not excluded if the image quality was adequate for analysis; (2) patients lacking images of both eyes; and (3) patients without detailed CIMT measurement records.

In this study, a CIMT threshold of ≥0.9 mm was chosen based on previous clinical evidence from a Chinese cohort, which demonstrated a significant relationship between this threshold and elevated cardiovascular risk, including the incidence of new-onset hypertension²⁸. While recognizing the variability of CIMT thresholds across ethnicities and age groups²⁹, our selection aligns closely with validated research specific to our studied population.

Fundus images were captured using a Canon CR-2 PLUS AF non-mydriatic digital fundus camera. CIMT measurements were performed using a Siemens ACUSON S2000 ultrasound diagnostic system equipped with an L16 probe (frequency range of 5–12 MHz). Measurements were taken at the distal wall of the common carotid artery, 1–2 cm proximal to the carotid bifurcation, focusing on flat areas to ensure a clear intima-media interface. CIMT data were provided by the hospital information center and included patient ID, name, gender, and ultrasound diagnostic results.

The study was exempted from informed consent from the Ethics Committee of the Second Affiliated Hospital of Anhui Medical University (Ethics Approval Number: YX2023-2011(F1)). All data were derived from previous clinical diagnoses and treatments, posing no greater risk than daily life or routine physical examinations. The exemption did not compromise the rights or health of the participants. All data were fully anonymized, contained no personal privacy information, and had no commercial interests.

The final dataset comprised 2,903 cases of bilateral fundus images, collectively referred to as the China-Fundus-CIMT dataset. The dataset’s normal group included 849 cases (568 males and 281 females), while the thickened group comprised 2,054 cases (1,294 males and 760 females). Overall, the proportion of males was significantly higher than females (Fig. 3a). Statistical analysis of age and gender across the groups revealed that CIMT in the thickened group was significantly higher than in the normal group (p < 0.001; Fig. 3b). Additionally, the mean age of the thickened group was significantly higher than that of the normal group for both males and females (p < 0.0001; Fig. 3c and Table 1). Further analysis using a heatmap of age-specific CIMT distribution indicated a progressive increase in CIMT with advancing age (Fig. 3d).

Table 1 Age and Gender Statistics Across Different Groups in the China-Fundus-CIMT Dataset.

Full size table

Data preprocessing

Image cropping and renaming

Ensuring high data quality is pivotal for maximizing model performance. To meet rigorous quality standards, we removed black regions in fundus images containing patient information and left/right eye identifiers. A Python script was employed to crop all images to a standardized resolution of 512 × 512 pixels. Images were systematically renamed based on eye identifiers: left-eye images were labeled as “PatientID_L.png” and right-eye images as “PatientID_R.png.” This preprocessing step ensures consistency and standardization, establishing a solid foundation for subsequent model training.

Data quality control

A total of 17,908 color fundus images were collected from patients at our hospital. All images were independently assessed by two experienced clinical professionals according to standardized quality assessment criteria. The evaluation focused on four key aspects: (1) clarity of retinal structures, including the optic disc, macula, and retinal vasculature; (2) exposure adequacy, ensuring neither overexposure nor underexposure; (3) absence of obstructions or artifacts, such as reflections, light bands, or motion blur; and (4) completeness of the retinal field coverage. Images were classified as low quality if they exhibited significant deficiencies in any of these criteria. Any discrepancies between the two reviewers were resolved through consensus with a senior ophthalmologist. Representative examples of both high-quality and low-quality images are presented in Fig. 4a. Based on this manual classification, the dataset was partitioned into three subsets: a training set comprising 6,618 high-quality and 9,890 low-quality images; a validation set including 500 high-quality and 500 low-quality images; and a test set comprising 200 high-quality and 200 low-quality images used for further evaluation of model performance.

An automated fundus image quality classification model was subsequently developed using the SE-ResNeXt deep neural network architecture. Prior to training, all images were resized to a standardized resolution and underwent data augmentation techniques, including random rotations, horizontal and vertical flipping, and brightness adjustments, to enhance the model’s generalization capability. To address class imbalance between high- and low-quality images, a weighted cross-entropy loss function was applied during training. The model was optimized using the Adam algorithm, with an initial learning rate of 1e-3, reduced by a factor of 0.1 every 20 epochs, over a total of 200 epochs with a batch size of 32. Model performance was evaluated on both the validation and test datasets. The model achieved an average accuracy of 94.10% (Fig. 4b) and an AUC of 98.51% (Fig. 4c) on the validation set, and an average accuracy of 93.75% (Fig. 4d) with an AUC of 98.77% (Fig. 4e) on the test set.

Fundus images were initially evaluated using a quality assessment model. Images that met the quality standards were retained directly, whereas those failing the initial evaluation underwent processing with an image restoration model³⁰. Further details about the restoration model can be found at https://github.com/liamheng/Annotation-free-Fundus-Image-Enhancement. The restored images were subsequently re-evaluated using the quality assessment model. Images that still did not meet the quality criteria after restoration were discarded. The detailed workflow for image quality evaluation and restoration is illustrated in Fig. 5.

Preprocessing of ultrasound report data

To extract CIMT information from ultrasound reports, we employed regular expressions for automated text parsing. Two researchers then independently verified the extracted data to ensure accuracy. The processed data were systematically recorded in an Excel file containing the following fields: patient ID, gender, age, and CIMT measurements.

In line with prior research, a CIMT value of ≥0.9 mm is commonly used as the threshold for defining carotid intima-media thickening. However, potential measurement variability arising from ultrasound equipment limitations and operator subjectivity may introduce errors. To address this, we performed a separate analysis to assess the impact of including 0.9 mm values in the thickened group. This additional evaluation enhanced the reliability of the dataset and strengthened the robustness of our conclusions.

Dataset construction

The China-Fundus-CIMT dataset was constructed by matching CIMT measurements obtained from ultrasound reports with quality-assessed fundus image data using patient IDs. To meet the requirements of the study, the dataset was further divided into training, validation, and test sets. The validation set comprised 100 subjects from the normal group and 100 from the thickened group, while the test set included 50 subjects from each group. The training set consisted of 699 subjects in the normal group and 1,904 in the thickened group.

To ensure compatibility with neural network models, the data were standardized as follows: gender information was encoded using One-Hot encoding, and age data were both standardized and normalized. The preprocessed data were saved in JSON format to maintain consistency and facilitate efficient data loading. These steps provide a structured and high-quality foundation for model training and analysis.

Data Records

The China-Fundus-CIMT multimodal dataset is publicly available on the Figshare³¹ platform to facilitate researcher access. All fundus images are stored in a single directory and have been standardized to a resolution of 512 × 512 pixels. Associated clinical and imaging metadata are organized in a JSON file, where the primary key corresponds to the patient ID. This file encompasses multiple fields (detailed in Table 2), offering a structured and comprehensive resource to support multimodal data analysis and model development.

Table 2 Explanation of Each Field in the Fundus_CIMT Dataset json File.

Full size table

Technical Validation

To systematically evaluate the role of fundus images in predicting CIMT thickening, we conducted a series of experiments, starting with a baseline model based on individual fundus images. We comprehensively analyzed factors influencing model performance, including various image processing strategies (e.g., inclusion or exclusion of boundary values) and the incorporation of multimodal clinical information such as age and gender. The model architecture is depicted in Fig. 6. By comparing model performance under different experimental conditions, we sought to elucidate the potential of fundus images and their associated clinical data in predicting CIMT thickening, and offer valuable insights and a scientific foundation for future advancements in this field.

Impact of CIMT boundary values on model performance

The accuracy of CIMT measurements can be influenced by limitations in ultrasound precision and the subjective judgment of operators, potentially introducing variability that affects model stability and the reliability of predictions. To systematically investigate the impact of boundary values on model performance, this study utilized the data_info.json file for data extraction and constructed two experimental models: one with and the other without boundary values. This approach aims to provide a comprehensive assessment of how boundary values influence model performance.

In the first model, all data, including boundary values, were incorporated. In the second model, boundary values were excluded during the data preprocessing stage, and only more stable measurement data were used for training. A comparative analysis of the two models was conducted, focusing on their accuracy, stability, and adaptability to boundary cases. This study highlights the effects of boundary values and measurement variability on model performance, offering critical insights for refining model training strategies and improving overall robustness.

Impact of multimodal clinical information on model performance

Considering the statistically significant differences in age and gender between the thickened and normal groups, this study explores the influence of incorporating multimodal clinical information on model predictive performance using data from the China-Fundus-CIMT.json file. Building on the baseline model, statistically significant clinical variables, such as age and gender, were added as supplementary input features. These features were fused with image data at the fully connected layer to construct and validate a multimodal fusion model. The performance of this multimodal model is compared with an unimodal model, which uses only image data.

The comparison focuses on key aspects such as prediction accuracy, model robustness, and adaptability to intergroup feature differences. This research aims to provide a comprehensive evaluation of the benefits of multimodal information fusion in enhancing prediction precision and generalization capacity, offering valuable guidance for model optimization and practical applications in related fields.

Experimental setup and methods

All experiments were conducted on a Windows 11 system equipped with a GTX 4090 graphics card with 24 GB of memory, using PyTorch 2.3 and CUDA 12.1 for model training and inference. To address class imbalance, a weighted binary cross-entropy loss function was employed. The class weights were determined according to the inverse of the class sample sizes, ensuring balanced learning during model optimization.

In the unimodal model, a Siamese SE-ResNeXt neural network was used to extract features from bilateral fundus images. The SE-ResNeXt architecture was selected based on our previous study³², where it consistently outperformed traditional ResNet in predicting carotid intima-media thickness (CIMT) from fundus images. The integration of the Squeeze-and-Excitation (SE) module further enhances the network’s ability to recalibrate channel-wise feature responses, allowing it to focus on clinically relevant vascular structures. Compared with other advanced architectures such as EfficientNet and Vision Transformers, SE-ResNeXt offers a better trade-off between performance and computational efficiency on this dataset.

For the multimodal approach, age variables were processed through one-hot encoding, normalized, and passed into a fully connected layer with 128 neurons. This output was then fused with the model’s fully connected layer to form the multimodal input. The training process consisted of two stages: in the first stage, the Adam optimizer was used with an initial learning rate of 0.001, decayed by a factor of 10 every 20 epochs; in the second stage, training resumed from the best-performing model of the first stage, with the learning rate reset to 0.00001 and similarly reduced every 20 epochs. There are 100 epochs for each stage. To improve generalization, data augmentation techniques such as horizontal and vertical flipping, random rotations (0°–20°), and adjustments to color and saturation were applied. Model performance was evaluated using metrics including confusion matrix, accuracy, precision, recall, and F1 score, providing a comprehensive assessment of its classification capabilities³².

Validation results

Validation results: impact of cimt boundary values on model performance

The experimental results indicate that CIMT boundary values have a limited impact on the overall performance of the unimodal model. For the model including boundary values, the overall prediction accuracy on the validation set was 77.50% (Table 3), with a prediction accuracy of 66.00% for the normal group and 89.00% for the thickened group (Fig. 7a). On the test set, the overall accuracy was 75.00%, with a prediction accuracy of 70.00% for the normal group and 80.00% for the thickened group (Fig. 7d). In contrast, the model excluding boundary values achieved an overall accuracy of 77.41% on the validation set, with a prediction accuracy of 69.00% for the normal group and 87.21% for the thickened group (Fig. 7b). On the test set, the overall accuracy remained 71.73%, where the prediction accuracy for the normal group was 64.00%, and 80.95% for the thickened group (Fig. 7e). Additionally, the model including boundary values achieved AUC-ROC values of 82.79% on the validation set (Fig. 7g) and 82.58% on the test set (Fig. 6h). For the model excluding boundary values, the AUC-ROC values were 83.65% on the validation set (Fig. 7g) and 81.00% on the test set (Fig. 7h). These findings demonstrate that the inclusion or exclusion of boundary values results in minimal variations in model performance and has no significant effect on overall prediction accuracy. This highlights the robustness of the unimodal model and suggests that the treatment of boundary values has a limited influence on its predictive capabilities.

Table 3 The results of predicting whether CIMT is thickened using single-modal and multi-modal models.

Full size table

Impact of multimodal clinical information on predictive performance

The multimodal model that takes into consideration clinical information such as age and gender achieved significantly better predictive performance compared to the unimodal model. On the validation set, the overall prediction accuracy of the multimodal model was 80.50%, 2.5 percent higher than the unimodal model (Table 3). Specifically, the prediction accuracy for the normal group improved by 5 percent to 71.00%, while the thickened group improved by 1 percent to 90.00% (Fig. 7a,c). Additionally, the AUC-ROC value for the validation set was 86.01%, 3.22 percent higher than the unimodal model (Fig. 7g). On the test set, the multimodal model achieved an overall prediction accuracy of 81.00%, 6 percent higher than the unimodal model (Table 3). The test set prediction accuracy for the normal group increased by 4 percent to 74.00%, while the thickened group improved by 3.06 percent to 88.00% (Fig. 7d,f). The AUC-ROC value for the test set reached 90.41%, 7.83 percent higher than the unimodal model (Fig. 7h). In summary, integrating clinical information (e.g., age and gender) greatly enhances the predictive accuracy and generalization capability of the model. These findings underscore the potential of multimodal models in CIMT prediction and provide a strong foundation for further research in this area.

Usage Notes

The China-Fundus-CIMT dataset comprises high-quality bilateral fundus images, CIMT measurements, and clinical information such as age and gender. To facilitate research, we provide an automated processing pipeline that makes the dataset well-suited for studies on the early prediction of cardiovascular diseases and related medical image analyses. The dataset is available in standard image and json formats, ensuring compatibility with various open-source tools and deep learning frameworks. Researchers can efficiently load, preprocess, and train models using this dataset, enabling streamlined workflows for advancing research efforts.

Code availability

The complete codebase utilized in this study is publicly accessible on GitHub (https://github.com/gongajuan/Fundus_CIMT_2903). This repository encompasses comprehensive implementatins for data preprocessing, quality control, image augmentation, as well as unimodal and multimodal model construction, training, and evaluation. With its modular design, the codebase facilitates efficient reproducibility and scalability, offering a robust foundation for advanced multimodal analysis of fundus images and clinical data.

Change history

25 April 2025
A Correction to this paper has been published: https://doi.org/10.1038/s41597-025-05009-5

References

Mensah, G. A., Fuster, V., Murray, C. & Roth, G. A. Global Burden of Cardiovascular Diseases and Risks, 1990–2022. J AM COLL CARDIOL 82, 2350 (2023).
Article PubMed PubMed Central Google Scholar
Roth, G. A. et al. Global Burden of Cardiovascular Diseases and Risk Factors, 1990–2019: Update From the GBD 2019 Study. J AM COLL CARDIOL 76, 2982 (2020).
Article PubMed PubMed Central Google Scholar
Song, P. et al. Global and regional prevalence, burden, and risk factors for carotid atherosclerosis: a systematic review, meta-analysis, and modelling study. The Lancet Global Health 8, e721 (2020).
Article PubMed Google Scholar
Buldak, L., Cardiovascular Diseases-A Focus on Atherosclerosis, Its Prophylaxis, Complications and Recent Advancements in Therapies. INT J MOL SCI 23 (2022).
Ling, Y. et al. Varying Definitions of Carotid Intima-Media Thickness and Future Cardiovascular Disease: A Systematic Review and Meta-Analysis. J AM HEART ASSOC 12, e31217 (2023).
Article Google Scholar
Timóteo, A. T., Mota Carmo, M., Soares, C. & Ferreira, R. C. Has carotid intima‐media thickness prognostic impact in patients with high cardiovascular risk? A long‐term cohort study. Echocardiogr-J Card 36, 125 (2019).
Article Google Scholar
Abdollahi, M. et al. Artificial intelligence in assessing cardiovascular diseases and risk factors via retinal fundus images: A review of the last decade. Wiley Interdisciplinary Reviews: Data Mining and Knowledge Discovery 14, e1560 (2024).
Google Scholar
Hanssen, H., Streese, L. & Vilser, W. Retinal vessel diameters and function in cardiovascular risk and disease. PROG RETIN EYE RES 91, 101095 (2022).
Article CAS PubMed Google Scholar
Germanèse, C. et al. A retinal Oct-Angiography and Cardiovascular STAtus (RASTA) dataset of swept-source microvascular imaging for cardiovascular risk assessment. DATA 8, 147 (2023).
Article Google Scholar
Germanese, C. et al., Artificial intelligence-based prediction of neurocardiovascular risk score from retinal swept-source microvascular imaging: the RASTA dataset (2024).
Arnould, L. et al. Using Artificial Intelligence to Analyse the Retinal Vascular Network: The Future of Cardiovascular Risk Assessment Based on Oculomics? A Narrative Review. Ophthalmol Ther 12, 657 (2023).
Article PubMed Google Scholar
De Boever, P., Louwies, T., Provost, E., Int, P. L. & Nawrot, T. S., Fundus photography as a convenient tool to study microvascular responses to cardiovascular disease risk factors in epidemiological studies. Jove-J Vis Exp e51904 (2014).
Ting, D. S. W. et al. Artificial intelligence and deep learning in ophthalmology. Brit J Ophthalmol 103, 167 (2019).
Article Google Scholar
Li, T. et al. Applications of deep learning in fundus images: A review. Med Image Anal 69, 101971 (2021).
Article PubMed Google Scholar
Li, Y. et al. Predicting systemic diseases in fundus images: systematic review of setting, reporting, bias, and models’ clinical availability in deep learning studies. EYE 38, 1246 (2024).
Article PubMed Google Scholar
Pachade, S. et al., Detection of Stroke with Retinal Microvascular Density and Self-Supervised Learning Using OCT-A and Fundus Imaging. J CLIN MED 11 (2022).
Ashayeri, H., Jafarizadeh, A., Yousefi, M., Farhadi, F. & Javadzadeh, A. Retinal imaging and Alzheimer’s disease: a future powered by Artificial Intelligence. Graef Arch Clin Exp 262, 2389 (2024).
Article Google Scholar
Wisely, C. E. et al. Convolutional neural network to identify symptomatic Alzheimer’s disease using multimodal retinal imaging. Brit J Ophthalmol 106, 388 (2022).
Article Google Scholar
Diaz-Pinto, A. et al. Predicting myocardial infarction through retinal scans and minimal personal information. Nat Mach Intell 4, 55 (2022).
Article Google Scholar
Huang, F., Lian, J., Ng, K., Shih, K. & Vardhanabhuti, V. Predicting CT-based coronary artery disease using vascular biomarkers derived from fundus photographs with a graph convolutional neural network. Diagnostics 12, 1390 (2022).
Article PubMed PubMed Central Google Scholar
Joo, Y. S. et al. Non-invasive chronic kidney disease risk stratification tool derived from retina-based deep learning and clinical factors. Npj Digit Med 6, 114 (2023).
Article PubMed PubMed Central Google Scholar
Tan, Y., Ma, Y., Rao, S. & Sun, X. Performance of deep learning for detection of chronic kidney disease from retinal fundus photographs: A systematic review and meta-analysis. Eur J Ophthalmol 34, 502 (2024).
Article PubMed Google Scholar
Al-Absi, H. R., Islam, M. T., Refaee, M. A., Chowdhury, M. E. & Alam, T. Cardiovascular disease diagnosis from DXA scan and retinal images using deep learning. Sensors-Basel 22, 4310 (2022).
Article ADS CAS PubMed PubMed Central Google Scholar
Son, J. et al. Predicting high coronary artery calcium score from retinal fundus images with deep learning algorithms. Transl Vis Sci Techn 9, 28 (2020).
Article Google Scholar
Mitani, A. et al. Detection of anaemia from retinal fundus images via deep learning. Nat Biomed Eng 4, 18 (2020).
Article PubMed Google Scholar
Chikumba, S., Hu, Y. & Luo, J. Deep learning-based fundus image analysis for cardiovascular disease: a review. Ther Adv Chronic Dis 14, 384258471 (2023).
Article Google Scholar
Inouye, K., Petrosyan, A., Moskalensky, L. & Thankam, F. G. Artificial intelligence in therapeutic management of hyperlipidemic ocular pathology. EXP EYE RES 245, 109954 (2024).
Article CAS PubMed Google Scholar
Zhang, L. et al. The association between carotid intima-media thickness and new-onset hypertension in a Chinese community-based population. BMC Cardiovasc Disor 19, 1 (2019).
Article Google Scholar
Saba, L. et al. Global perspective on carotid intima-media thickness and plaque: should the current measurement guidelines be revisited? INT ANGIOL 38, 451 (2019).
PubMed Google Scholar
Li, H. et al. A generic fundus image enhancement network boosted by frequency self-supervised representation learning. Med Image Anal 90, 102945 (2023).
Article PubMed Google Scholar
Guo, N., Fu, W., Pan, T., Gong, A. & Sable, C. High-resolution fundus images for ophthalmomics and early cardiovascular disease prediction, China Fundus Carotid Intima-Media Thickness dataset. Figshare https://doi.org/10.6084/m9.figshare.27907056 (2024).
Gong, A., Fu, W., Li, H., Guo, N. & Pan, T. A Siamese ResNeXt network for predicting carotid intimal thickness of patients with T2DM from fundus images. Front Endocrinol 15, 1364519 (2024).
Article Google Scholar

Download references

Acknowledgements

This study was supported by the Scientific Research Project of Higher Education Institutions in Anhui Province (2024AH050790) and the National Natural Science Foundation of China Youth Fund (Grant No. 62103224).

Author information

These authors contributed equally: Na Guo, Wanjin Fu.

Authors and Affiliations

School of Computer and Communication Engineering, University of Science and Technology Beijing, Beijing, China
Na Guo
Department of Clinical Pharmacology, The Second Affiliated Hospital of Anhui Medical University, Hefei, China
Wanjin Fu & Wei Zhang
The Department of Computer Science and Engineering, Southern University of Science and Technology, Shenzhen, China
Heng Li
University of Southern California, Physics department, California, USA
Haoyun Zhang
College of Information and Electrical Engineering, China Agricultural University, Beijing, China
Tiantian Li
Department of Endocrinology, The Second Affiliated Hospital of Anhui Medical University, Hefei, China
Xing Zhong, Tianrong Pan & Ajuan Gong
Department of Computer Science and Technology, Tsinghua University, Beijing, China
Fuchun Sun

Authors

Na Guo
View author publications
Search author on:PubMed Google Scholar
Wanjin Fu
View author publications
Search author on:PubMed Google Scholar
Heng Li
View author publications
Search author on:PubMed Google Scholar
Haoyun Zhang
View author publications
Search author on:PubMed Google Scholar
Tiantian Li
View author publications
Search author on:PubMed Google Scholar
Wei Zhang
View author publications
Search author on:PubMed Google Scholar
Xing Zhong
View author publications
Search author on:PubMed Google Scholar
Tianrong Pan
View author publications
Search author on:PubMed Google Scholar
Fuchun Sun
View author publications
Search author on:PubMed Google Scholar
Ajuan Gong
View author publications
Search author on:PubMed Google Scholar

Contributions

Guo Na and Wanjin Fu: Responsible for image processing, data analysis, manuscript drafting, and revisions. Heng Li: Developed the quality control model for fundus images and the restoration model for image enhancement. Haoyun Zhang, Wei Zhang, and Tiantian Li: contributed to data curation, formal analysis, and methodology. Xing Zhong and Tianrong Pan: Conducted data collection and screening, along with image quality assessment. Ajuan Gong and Fuchun Sun: Provided critical guidance on the study design, oversight of the research process, and comprehensive revisions to the manuscript.

Corresponding author

Correspondence to Ajuan Gong.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.

Reprints and permissions

About this article

Cite this article

Guo, N., Fu, W., Li, H. et al. High-resolution fundus images for ophthalmomics and early cardiovascular disease prediction. Sci Data 12, 568 (2025). https://doi.org/10.1038/s41597-025-04930-z

Download citation

Received: 27 December 2024
Accepted: 31 March 2025
Published: 03 April 2025
Version of record: 03 April 2025
DOI: https://doi.org/10.1038/s41597-025-04930-z

Subjects

Abstract

Similar content being viewed by others

Multimodal deep learning of fundus abnormalities and traditional risk factors for cardiovascular risk prediction

Optimizing retinal images based carotid atherosclerosis prediction with explainable foundation models

Association of carotid wall layers with atherosclerotic plaques and cardiac hypertrophy in hypertensive subjects

Background & Summary

Methods

Dataset collection

Data preprocessing

Image cropping and renaming

Data quality control

Preprocessing of ultrasound report data

Dataset construction

Data Records

Technical Validation

Impact of CIMT boundary values on model performance

Impact of multimodal clinical information on model performance

Experimental setup and methods

Validation results

Validation results: impact of cimt boundary values on model performance

Impact of multimodal clinical information on predictive performance

Usage Notes

Code availability

Change history

25 April 2025

References

Acknowledgements

Author information

Authors and Affiliations

Contributions

Corresponding author

Ethics declarations

Competing interests

Additional information

Rights and permissions

About this article

Cite this article

Share this article

Search

Quick links