Big data in ophthalmology
Modern medical research relies on a range of study designs to answer questions about disease risk, progression, and treatment response. Traditional epidemiologic studies excel in control and precision, while Big Data approaches leverage large-scale, multi-centre datasets to enable broader analyses across diverse populations. Beyond just size, Big Data captures real-world clinical practice patterns and population-level trends that reflect actual healthcare delivery, while enabling advanced statistical methods for complex analyses.
Glaucoma exemplifies how studies can reach conflicting conclusions, and how “Big Data” might help reconcile them. Take the association between statin use and glaucoma as a case in point. Initial longitudinal studies by Stein et al. suggested that statins confer a modest protective effect against open-angle glaucoma (OAG), with long-term use associated with reduced glaucoma risk [1]. Yet a subsequent pooled analysis of three large cohorts (over 130,000 participants) found no significant association between statin exposure and incident primary OAG (POAG) [2]. More recently, an analysis of the National Institutes of Health (NIH) All of Us dataset even reported higher glaucoma prevalence among statin users, particularly in certain subgroups (e.g. adults aged 60–69 with hyperlipidaemia) [3]. Faced with such disparate findings (one study suggesting a preventive benefit, another neutrality, and yet another potential harm), clinicians are left with considerable uncertainty about how to interpret and apply this evidence.
Our group’s recent study adds a new dimension to this debate [4]. By analysing a large multi-centre electronic health record (EHR) network, we found that the statin–glaucoma relationship may depend on a patient’s self-reported racial and ethnic background. In our cohort of over 300,000 patients with hyperlipidaemia, statin use was associated with a significantly lower risk of ocular hypertension and OAG in non-Hispanic White and Black patients, whereas in Asian and Hispanic patients the protective effect was minimal or only evident with longer-term use [4]. In other words, the impact of statins on glaucoma risk was not uniform across populations. These results suggest a unifying hypothesis: earlier studies might have disagreed because they examined different populations. For instance, the predominantly White cohort studied by Stein et al. showed benefit from statins [1], whereas the null findings of Kang et al. could reflect a more mixed population or a different exposure duration [2]. Meanwhile, the All of Us-based study, which drew on a diverse sample, noted an apparent harm signal that could relate to unmeasured confounders (such as cholesterol levels or healthcare-seeking behaviour) or to specific subpopulations in which statins do not help [3]. It is important to note that Lee et al. [3] assessed glaucoma risk without differentiating between its subtypes, whereas both our study [4] and that of Stein et al. [1] specifically examined the development of open-angle glaucoma as the primary outcome.
The notion that a treatment’s effect can vary by demographics is well precedented in medicine. For example, the efficacy of certain blood pressure medications differs by race: Black patients tend to respond better to diuretics and calcium-channel blockers, whereas beta-blockers are slightly less effective on average in Black patients than in White patients [5]. Similarly, the efficacy of statins seems to vary across populations. Genetic polymorphisms, such as those in ABCG2 and ABCA1, alter statin metabolism and plasma concentrations, particularly in East Asian individuals. Thus, if a protective effect of statins exists only in some ethnic groups (or under certain conditions), studies lacking those groups could reach different conclusions than studies enriched for them. Recognising this possibility is the first step toward resolving the conflict.
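The arithmetic behind this argument can be illustrated with a minimal Python sketch. It assumes two hypothetical subgroups (group_A and group_B), invented subgroup risk ratios and baseline risks, and equal statin uptake in every subgroup, so that the crude cohort-level risk ratio depends only on cohort composition. None of these numbers come from the cited studies; the function name pooled_risk_ratio is likewise an illustrative choice.

```python
def pooled_risk_ratio(mix, baseline_risk, subgroup_rr):
    """Crude risk ratio (statin users vs non-users) in a mixed cohort.

    mix           -- subgroup -> share of the cohort (shares must sum to 1)
    baseline_risk -- subgroup -> glaucoma risk among non-users
    subgroup_rr   -- subgroup -> true risk ratio for statin use in that subgroup

    Assumption: statin uptake is identical in every subgroup, so subgroup
    membership modifies the effect but does not confound it.
    """
    if abs(sum(mix.values()) - 1.0) > 1e-9:
        raise ValueError("cohort shares must sum to 1")
    risk_unexposed = sum(mix[g] * baseline_risk[g] for g in mix)
    risk_exposed = sum(mix[g] * baseline_risk[g] * subgroup_rr[g] for g in mix)
    return risk_exposed / risk_unexposed


# Hypothetical inputs: statins protective in group_A only.
true_rr = {"group_A": 0.8, "group_B": 1.0}
baseline = {"group_A": 0.02, "group_B": 0.02}

# Two cohorts that differ only in composition.
cohort_mostly_a = {"group_A": 0.9, "group_B": 0.1}
cohort_mostly_b = {"group_A": 0.2, "group_B": 0.8}

rr_mostly_a = pooled_risk_ratio(cohort_mostly_a, baseline, true_rr)
rr_mostly_b = pooled_risk_ratio(cohort_mostly_b, baseline, true_rr)

print(f"Cohort enriched for group_A: pooled RR = {rr_mostly_a:.2f}")
print(f"Cohort enriched for group_B: pooled RR = {rr_mostly_b:.2f}")
```

Under these assumptions the same underlying biology yields a pooled risk ratio of about 0.82 in the first cohort and about 0.96 in the second, a difference that could be read as "protective" in one study and "null" in another once sampling error is added.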
The past decade has seen the emergence of several large-scale data platforms in medicine and ophthalmology. The Veterans Affairs Million Veteran Program (MVP), for instance, has enrolled over 900,000 United States (U.S.) veterans to create one of the world’s largest biobanks [6]. The United Kingdom (UK) Biobank, another landmark resource, follows half a million adults with deep phenotyping, including comprehensive health questionnaires, physical exams, blood biomarkers, multimodal imaging, and genome-wide genotyping for every participant [7]. Likewise, the NIH All of Us Research Program is building a cohort of one million Americans with an explicit emphasis on diversity (over 50% of participants are from racial or ethnic minorities), linking EHR, genomic data, surveys, and wearable device data to capture a rich array of health determinants [8].
Understanding the inherent characteristics and limitations of these datasets is also crucial. For example, the Intelligent Research in Sight (IRIS) Registry primarily captures data from community-based ophthalmology practices, while the Sight Outcomes Research Collaborative (SOURCE) draws primarily from hospital systems [9, 10]. Although initiatives like All of Us deliberately prioritise demographic diversity, systematic gaps persist across all major ophthalmic datasets [8]. Rural populations, individuals with limited healthcare access, and certain socioeconomic groups remain chronically underrepresented. One possible solution is federated learning networks that connect underrepresented healthcare settings, allowing their patient populations to contribute to large-scale analyses and broadening population coverage without the barriers of traditional data sharing agreements. Addressing these gaps by including underserved populations could significantly enhance the external validity and broad applicability of big data findings.
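The core idea of a federated network, that patient-level records stay at each site and only summaries travel, can be sketched in a few lines of Python. This is a deliberately simplified illustration rather than a description of how IRIS, SOURCE, or any real network operates: it pools site-level counts with a Mantel-Haenszel risk ratio stratified by site, which is a basic form of federated analysis and not model training. The names SiteCounts, summarise_site, and mantel_haenszel_rr, and all counts, are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SiteCounts:
    """Aggregate counts that a site is willing to share (no patient rows)."""
    exposed_cases: int
    exposed_total: int
    unexposed_cases: int
    unexposed_total: int


def summarise_site(records):
    """Run locally at a site.

    records -- iterable of (on_statin, developed_glaucoma) boolean pairs.
    Only the returned counts leave the site.
    """
    ec = et = uc = ut = 0
    for on_statin, developed_glaucoma in records:
        if on_statin:
            et += 1
            ec += int(developed_glaucoma)
        else:
            ut += 1
            uc += int(developed_glaucoma)
    return SiteCounts(ec, et, uc, ut)


def mantel_haenszel_rr(sites):
    """Run at the coordinating centre: risk ratio stratified by site."""
    numerator = denominator = 0.0
    for s in sites:
        n = s.exposed_total + s.unexposed_total
        if n == 0:
            continue  # an empty site contributes nothing
        numerator += s.exposed_cases * s.unexposed_total / n
        denominator += s.unexposed_cases * s.exposed_total / n
    if denominator == 0:
        raise ValueError("no unexposed cases: risk ratio is undefined")
    return numerator / denominator


# Hypothetical counts from an urban hospital and a small rural clinic.
urban = SiteCounts(exposed_cases=10, exposed_total=100,
                   unexposed_cases=20, unexposed_total=100)
rural = SiteCounts(exposed_cases=5, exposed_total=50,
                   unexposed_cases=10, unexposed_total=50)

pooled_rr = mantel_haenszel_rr([urban, rural])
print(f"Pooled risk ratio across sites: {pooled_rr:.2f}")
```

Stratifying by site keeps differences in baseline risk between sites from distorting the pooled estimate. A production network would additionally need harmonised phenotype definitions, small-cell suppression, and governance agreements, none of which are modelled here.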
While Big Data holds immense promise in resolving controversies such as the statin–glaucoma relationship, it is not without limitations. Misclassification, residual confounding, and overreliance on administrative codes can all undermine data accuracy. Additionally, race-stratified findings must be interpreted cautiously, given the complex interplay of genetics, environment, and social determinants. Table 1 summarises key limitations of Big Data studies and emerging solutions, including phenotype standardisation, federated data structures, and approaches that strengthen causal inference, such as Mendelian randomisation and pragmatic trials.
Looking forward, future efforts should focus on improving population representation, harmonising clinical definitions, and integrating novel analytical frameworks that better account for bias and support causal conclusions. By complementing traditional epidemiologic approaches with well-designed Big Data studies, researchers can generate more reliable, equitable, and clinically actionable insights. As the field continues to evolve, a thoughtful application of these resources will be essential to guide evidence-based care. With accuracy, inclusivity, and innovation, Big Data can help ophthalmology move beyond uncertainty toward more precise and personalised disease prevention.
References
1. Stein JD, Newman-Casey PA, Talwar N, Nan B, Richards JE, Musch DC. The relationship between statin use and open-angle glaucoma. Ophthalmology. 2012;119:2074–81.
2. Kang JH, Boumenna T, Stein JD, Khawaja A, Rosner BA, Wiggs JL, et al. Association of statin use and high serum cholesterol levels with risk of primary open-angle glaucoma. JAMA Ophthalmol. 2019;137:756–65.
3. Lee SY, Paul ME, Coleman AL, Kitayama K, Yu F, Pan D, et al. Associations between statin use and glaucoma in the All of Us Research Program. Ophthalmol Glaucoma. 2024;7:563–71.
4. Elhusseiny AM, Eleiwa TK, Dihan QA, Chauhan MZ, Al’Aref SJ, Lee RK. Racial and ethnic differences in the association between statin use and the risk of ocular hypertension and open-angle glaucoma. Am J Ophthalmol. 2025;276:117–25.
5. Johnson JA. Ethnic differences in cardiovascular drug response: potential contribution of pharmacogenetics. Circulation. 2008;118:1383–93.
6. Gaziano JM, Concato J, Brophy M, Fiore L, Pyarajan S, Breeling J, et al. Million Veteran Program: a mega-biobank to study genetic influences on health and disease. J Clin Epidemiol. 2016;70:214–23.
7. Sudlow C, Gallacher J, Allen N, Beral V, Burton P, Danesh J, et al. UK Biobank: an open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLoS Med. 2015;12:e1001779.
8. All of Us Research Program Investigators, Denny JC, Rutter JL, Goldstein DB, Philippakis A, Smoller JW, et al. The “All of Us” Research Program. N Engl J Med. 2019;381:668–76.
9. Bernstein IA, Fernandez KS, Stein JD, Pershing S, Wang SY. Big data and electronic health records for glaucoma research. Taiwan J Ophthalmol. 2024;14:352–9.
10. Pershing S, Lum F. The American Academy of Ophthalmology IRIS Registry (Intelligent Research In Sight): current and future state of big data analytics. Curr Opin Ophthalmol. 2022;33:394–8.
Funding
The Bascom Palmer Eye Institute is supported by NIH Center Core Grant P30EY014801 and a Research to Prevent Blindness Unrestricted Grant. RKL is supported by the Walter G. Ross Foundation. This work was partly supported by the Camiener Foundation Glaucoma Research Fund and the Gutierrez Family Research Fund.
Author information
Contributions
MSS was responsible for conceptualising the study, drafting the initial manuscript, and interpreting results. MA contributed to study design, performed literature review, and assisted in drafting the manuscript. MMK contributed to study design, data interpretation, and manuscript preparation. AME conducted the literature review, assisted with data interpretation, and contributed to writing the manuscript. RKL supervised the study, provided critical revisions of the manuscript, and gave final approval for submission.
Ethics declarations
Competing interests
The authors declare no competing interests.
Additional information
All authors attest that they meet the current ICMJE criteria for authorship.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Sayed, M.S., Ayoubi, M., Khodeiry, M.M. et al. The power of many: multi-database approach to understand the relationship between statins and glaucoma. Eye 39, 2843–2845 (2025). https://doi.org/10.1038/s41433-025-04003-w