Introduction

Geographic atrophy (GA) secondary to age-related macular degeneration (AMD) is a progressive and irreversible degenerative disease of the macula, the prevalence of which sharply increases after age 60. Aside from age, genetics, smoking, and dietary factors are additional known risk factors1,2. GA is characterized by progressive loss of retinal pigment epithelium (RPE) cells, which are responsible for supplying nutrients to and removing waste from the photoreceptors3,4. The loss of RPE cells in the macula leads to photoreceptor death, which culminates in progressive decline of central vision, as well as difficulties with light adaptation and reduced color perception and contrast sensitivity, which are especially pronounced in low light conditions1,2,5.

Diagnosis of GA and monitoring of its progression are typically performed using a combination of clinical examination and retinal imaging technologies, such as optical coherence tomography (OCT) and fundus autofluorescence (FAF)6,7. These imaging technologies allow visualization of the areas of atrophy and monitoring of their progression1,2. However, retinal imaging does not reveal the molecular processes driving or influencing disease progression. This knowledge gap has made it challenging to develop therapies that can slow or halt vision loss in patients with this disease.

Analysis of the aqueous humor (AH) proteome could be helpful to shed light on the processes underlying retinal disease. AH in the anterior chamber plays a vital role in maintaining the health and the function of the eye8. Its chemical composition is principally similar to plasma, but much more dilute in protein concentration8. It is continuously produced by the ciliary body, and is drained through the trabecular meshwork and Schlemm’s canal8. This process is important for the maintenance of the intraocular pressure (IOP), as well as for supplying nutrients to the cornea and the lens, and removing their metabolic waste8. While AH does not come directly in contact with the retina, it is thought that retinal proteins can still reach the AH via diffusion between the vitreous and aqueous humor compartments9,10,11,12. Other mechanisms that can also contribute to the AH proteome are disruptions in blood-retinal and blood-aqueous barriers13,14.

Elucidating the composition of the AH proteome is important for determining the presence of proteins that could play a pivotal role in disease pathology. Due to the significantly lower protein concentration in the AH, proteomic profiling has been challenging15. Nonetheless, unbiased mass spectrometry approaches have identified hundreds of proteins in AH15,16,17,18,19,20,21,22,23, and have been useful in comparative studies for biomarker discovery. For example, data-dependent acquisition (DDA)-based mass-spectrometry identified eight proteins that were increased in the AH from intermediate AMD patients, some of which were related to drusen (extracellular deposits characteristic of AMD) biology and extracellular matrix remodeling18. Liquid chromatography-tandem mass spectrometry (LC-MS/MS) profiling in AH from patients with neovascular AMD receiving anti-VEGF therapy identified biomarkers related to drug response, including proteins involved in lipid metabolism, oxidative stress, angiogenesis, and immune biology19,24. Overall, mass spectrometry has been a useful tool for augmenting the understanding of the AH proteome and its contribution to various eye diseases.

More recently, alternative proteomics platforms have been developed that have distinct advantages over mass spectrometry approaches in identifying low abundant proteins, such as cytokines25. For example, an aptamer-based technology has been used to profile the AH26 of diabetic retinopathy (DR) patients. By integrating single cell expression data, this study was able to determine that most proteins are expressed in the retina26, which provides supporting evidence that AH in the anterior chamber can inform biology about the retina. Another alternative proteomics platform utilizes proximity extension assay technology to determine relative abundance of proteins in biological fluids27,28,29. This method from Olink allows measurement in small sample volumes of over 1000 pre-specified proteins, and it has recently been successfully employed to profile the proteome of AH in uveal melanoma, neovascular AMD, diabetes mellitus, diabetic retinopathy, and diabetic macular edema patients9,30,31,32.

Here, we collect AH samples from GA secondary to AMD, intermediate AMD, and non-AMD control subjects (Fig. 1a). We apply the Olink proximity extension assay technology (Fig. 1b) to identify proteins that are differentially abundant in AH of AMD subjects. Then, using a variety of gene expression data from donor eye samples, we further refine a number of candidate biomarkers associated with GA (Fig. 1c). We identify 82 proteins significantly different in GA. Many of these proteins are expressed in the retina, especially the retinal pigment epithelium cells. Interestingly, analysis of AH from diabetic retinopathy patients reveals both shared and unique AH biomarkers with GA patients. Together, these results provide a framework for identifying and developing candidate biomarkers in AH for discovery of molecular processes driving retinal diseases, identifying new drug targets, and as surrogates informing of disease modification during drug development.

Fig. 1: Study schema.

a Aqueous humor (AH) collection from the anterior chamber of the eye. b Schema on the high-throughput proteomics platform. For simplicity, only one protein target (gray) is shown but each of the 13 Olink Target 96 panels used in this study simultaneously detects 92 protein targets. c Bioinformatics strategy to refine putative geographic atrophy (GA) biomarkers. Number in parentheses indicates number of proteins remaining after successive analysis. AMD age-related macular degeneration, DR diabetic retinopathy, qPCR quantitative PCR. Figure created with templates from BioRender.com.

Methods

Patient population

We enrolled 30 non-AMD control, 30 intermediate AMD, and 28 GA secondary to AMD subjects in this study. Approval from institutional review boards and ethics committees was obtained by each of the 7 participating institutions before the study started (Cumberland Valley, MD, USA; Deep Blue Retina, MS, USA; Pacific Eye, CA, USA; Rand Eye, FL, USA; University Retina, IL, USA; Wagner Retina, VA, USA; Wolfe Eye, IA, USA). Patients provided informed consent before enrollment. The inclusion criteria for the non-AMD control subjects were no history of AMD, glaucoma, diabetic retinopathy, diabetic macular edema, or uveitis in either the study eye or the fellow eye. For the intermediate AMD subjects, the inclusion criteria were a study eye diagnosis of intermediate AMD, defined by large drusen greater than 125 μm and/or pigmentary abnormalities, and a fellow eye diagnosis of either intermediate AMD as defined for the study eye or GA secondary to AMD with no evidence of prior choroidal neovascularization. For GA subjects, the inclusion criteria were a diagnosis of GA secondary to AMD in both the study eye and the fellow eye with no evidence of prior choroidal neovascularization, and a GA lesion in the study eye between 1 disc area (2.54 mm2) and 7 disc areas (17.78 mm2).

Aqueous humor and plasma sample collection

AH samples (approximately 100 μL) were collected from the anterior chamber of the eye of each subject. Paired plasma samples were collected on the same day. Samples were frozen immediately on dry ice after collection, shipped to Genentech on dry ice, and stored at −70 °C. They were then thawed on ice, subaliquoted, and stored at −70 °C until Olink and immunoassay analyses.

Olink processing

Aliquots (25 μL) of AH samples from all subjects were sent to Olink (Watertown, MA, USA) for analysis on 13 Olink Target 96 panels. The panels were Cardiometabolic, Cardiovascular II, Cardiovascular III, Cell Regulation, Development, Immune Response, Inflammation, Metabolism, Neurology, Neuro Exploratory, Oncology II, Oncology III, and Organ Damage. To avoid plate positional effects, the 88 samples were manually randomized and run by Olink on a single plate, with no dilution, using 1 μL of aqueous humor per panel. Olink then provided Normalized Protein eXpression (NPX), Limit of Detection (LOD), and source data in the form of Ct values for each protein assay for each sample. The complete control and correction factor source data for the calculation of NPX from Ct values for each protein assay for each sample was also provided by Olink.

Immunoassay analysis

Aliquots (20 μL) of paired AH and plasma samples from all subjects were measured for MCP-1, CXCL10, IL-6, and ICAM1 using the Ella 4-plex immunoassay platform from ProteinSimple (Bio-Techne, Minneapolis, MN, USA), with cartridges purchased from ProteinSimple. Recombinant analyte controls were prepared at low and high concentrations (Bio-Techne catalog numbers 894948, 894949, 896256, and 898093 for MCP-1, CXCL10, IL-6, and ICAM1, respectively), and were run on each plate. Samples were analyzed at the minimum required dilution (MRD) of 1/5 in sample diluent (SD13; Bio-Techne catalog number 896098). The lower limits of quantitation for each analyte after taking into account the 1/5 dilution factor were 7.6 pg/mL (MCP-1), 3.0 pg/mL (CXCL10), 20.5 pg/mL (ICAM1) and 1.4 pg/mL (IL-6).

Total protein analysis

Total protein levels in AH were determined, at a dilution factor of 100, using the ThermoFisher NanoOrange™ Protein Quantitation Kit, catalog number N-6666 (Invitrogen/Molecular Probes, Waltham, MA), and read using an EnVision Xcite reader (PerkinElmer, Waltham, MA) with the built-in NanoOrange program. Assay range was 0.156 µg/mL to 10 µg/mL.

Olink data processing and normalization techniques

Both Ct and NPX values were obtained from Olink and imported into R (v4.2.0) from CSV files. We performed NPX normalization using the normalizeBetweenArrays() function in the limma package (v3.52.2). For median adjustment, we used the parameter 'method = "scale"', and for quantile normalization, we used the parameter 'method = "quantile"'. To test the suitability for quantile normalization, we employed the R package quantro (v1.30) with 2000 permutations to generate the null distribution for testing statistical significance. The qsmooth package (v1.12) with default parameters was also used to evaluate the relative weights for quantile normalization.

Differential protein abundance analysis

The limma package (v3.52.2) function lmFit() was used to fit a linear model for each protein. The linear models were moderated using the eBayes() function with the parameter 'robust = TRUE', which uses the average variability across all proteins to adjust the variability of individual proteins.

We used the limma package to perform differential protein abundance analysis similar to previous studies33,34. Limma has also been widely used in a broad range of data types, including qPCR, microarray and next-generation sequencing. Since Olink resembles a single-channel microarray or qPCR, we used quantile normalized NPX values as input for limma analysis.

For building a linear model on disease severity, we coded control, intermediate AMD, and GA patients as 0.1, 0.5, and 1 respectively. We then used limma analysis on severity with age as a covariate across all proteins.
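The severity model described above can be sketched for a single protein as follows. This is a minimal illustration in Python rather than the authors' R code: it fits an ordinary least squares model of NPX on the severity score with age as a covariate, and it omits the empirical Bayes variance moderation that eBayes() adds in limma. The function names and the toy data are hypothetical; only the severity coding is taken from the text.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_severity_model(npx, severity, age):
    """Least squares fit of npx ~ intercept + severity + age (one protein)."""
    design = [[1.0, s, a] for s, a in zip(severity, age)]
    # Normal equations: (X^T X) beta = X^T y
    xtx = [[sum(row[i] * row[j] for row in design) for j in range(3)]
           for i in range(3)]
    xty = [sum(row[i] * y for row, y in zip(design, npx)) for i in range(3)]
    intercept, severity_slope, age_slope = solve_linear_system(xtx, xty)
    return {"intercept": intercept, "severity": severity_slope, "age": age_slope}


# Severity coding as stated in the text; ages and NPX values are toy data.
severity = [0.1, 0.5, 1.0, 0.1, 0.5, 1.0]
age = [68.0, 74.0, 81.0, 72.0, 79.0, 77.0]
npx = [1.0 + 2.0 * s + 0.05 * a for s, a in zip(severity, age)]
fit = fit_severity_model(npx, severity, age)
```

In this toy example the NPX values are generated without noise, so the fitted severity and age coefficients recover the values used to simulate them. In a real analysis the severity coefficient would be tested for each protein and the resulting p-values corrected for multiple testing.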

Gene expression re-analysis

Bulk RNA-seq differential gene expression

Processed bulk RNA-seq read counts from 85 donors were downloaded from Zenodo Data35 (https://doi.org/10.5281/zenodo.7532115) and subsetted for only genes coding for proteins found on the Olink platform. Then the reads were adjusted based on DESeq2 estimated sizeFactor36 and weighted by their mean-variance relationship as estimated by voom. Differential gene expression was performed using the limma empirical Bayes analysis pipeline described in the R package limma37. Significant differences were defined as genes with adjusted p-value less than 0.05 and absolute fold change greater than 1.5. p-values were adjusted using the false discovery rate (FDR) method following the Benjamini-Hochberg procedure.
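The Benjamini-Hochberg adjustment mentioned above can be illustrated with a short Python sketch. It is not the code used in the study (which relied on R); the function name and the example p-values are hypothetical, and the sketch assumes a plain list of raw p-values with no missing entries.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    n = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(n), key=lambda i: p_values[i])
    adjusted = [0.0] * n
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(n, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * n / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical raw p-values for four genes.
raw_p = [0.01, 0.04, 0.03, 0.005]
adj_p = benjamini_hochberg(raw_p)
```

Each raw p-value is multiplied by the number of tests divided by its rank, and a running minimum from the largest rank downward ensures that a smaller raw p-value never receives a larger adjusted value.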

Single nuclei pseudobulk analysis

Processed single nuclei read counts for the retina were downloaded from Zenodo Data35 (https://doi.org/10.5281/zenodo.7532115) which contains 13 snRNA-seq samples (six from AMD groups and seven from controls). Read counts from the anterior segment from 8 donors were imported from the Gene Expression Omnibus (GEO) accession GSE199013. Both datasets were subsetted for only genes coding for proteins found on the Olink platform. Individual pseudobulk samples were generated for each sample by aggregating counts across all cells using the aggregateAcrossCells() function in the scater package (v1.24). Differential expression analysis was performed on pseudobulk samples using the voom-limma method38.
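The pseudobulk step can be pictured with the following Python sketch, which sums per-cell counts into one count vector per sample. It only illustrates the idea: the study used aggregateAcrossCells() in R, summation is assumed here as the aggregation statistic, and the function name, cell identifiers, and counts are hypothetical.

```python
from collections import defaultdict


def pseudobulk(cell_counts, cell_to_sample):
    """Sum per-cell gene counts into one count vector per sample.

    cell_counts maps a cell ID to a {gene: count} dictionary, and
    cell_to_sample maps each cell ID to the sample it came from.
    """
    totals = defaultdict(lambda: defaultdict(int))
    for cell_id, gene_counts in cell_counts.items():
        sample = cell_to_sample[cell_id]
        for gene, count in gene_counts.items():
            totals[sample][gene] += count
    return {sample: dict(genes) for sample, genes in totals.items()}


# Hypothetical toy data: three cells from two donors.
cells = {
    "cell1": {"SMOC2": 2, "IL6": 0},
    "cell2": {"SMOC2": 1, "IL6": 3},
    "cell3": {"SMOC2": 5},
}
cell_to_sample = {"cell1": "donor_A", "cell2": "donor_A", "cell3": "donor_B"}
bulk = pseudobulk(cells, cell_to_sample)
```

The same function yields cell-type-specific pseudobulk profiles if the grouping key combines sample and cell type, for example by mapping each cell to a (sample, cell type) tuple.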

Data visualization

All data were plotted using the ggplot2 (v3.5.0) library. Plotted statistical elements (boxplots, trend lines, statistical significance annotations) were generated using default parameters (see Code Availability statement for access to original code).

Reporting summary

Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.

Results

Proteomic analysis of aqueous humor samples

AH samples were collected from 30 non-AMD control, 30 intermediate AMD, and 28 GA secondary to AMD subjects, with inclusion criteria as described in Methods. The three cohorts were balanced for sex and race (Table 1). The median ages were 70.1, 76.6, and 80.5 years for the non-AMD control, intermediate AMD, and GA cohorts, respectively. The lower median age in the non-AMD control cohort reflects the typical challenge of finding age-matched volunteers with normal retinas. The median age of the GA patients in this study was similar to that in phase 3 clinical trials39.

Table 1 Study cohort demographics

Proteomic profiling of AH samples was performed using the Olink Target 96 proximity extension assay, as described in Methods. Data were generated for a total of 1196 analytes (92 analytes per panel with 13 panels) in the 88 samples. Some analytes, such as IL-6, were found in multiple panels, and hence there were 1161 unique proteins analyzed. Quality assessment indicated that all samples were of sufficient quality for downstream analysis (Supplementary Fig. 1).

The Olink Target 96 platform measures relative protein abundance using qPCR and converts Ct values to NPX values. To determine the accuracy of NPX measurements, we compared NPX values with a select number of qualified quantitative immunoassays tested on the same AH samples. For these select analytes, we observed high correlation with immunoassay measurements (Supplementary Fig. 2). This has also been observed in other studies comparing NPX values with quantitative immunoassays40,41.

AMD aqueous samples are associated with increased total protein levels

When we compared the distributions of NPX values across diseases, we observed samples from AMD patients tended to have higher median NPX values (Fig. 2a). For example, the median NPX in GA samples was significantly higher compared to non-AMD control samples (p = 0.038, Student t-test). These differences may in part be attributed to the age difference between groups (R2 = 0.10, Fisher’s Exact p = 0.003, Fig. 2b). To determine whether the differences in median NPX values in GA samples were related to protein abundance, we measured the total protein concentration in the same AH samples (Fig. 2c). Interestingly, we found a significant positive correlation between total protein abundance and median NPX (R2 = 0.73, Fig. 2d), which could potentially be due to a compromised blood-retina barrier. This trend was independently consistent for each disease group (Supplementary Fig. 3a). However, individual assay NPX values have varying degrees of positive correlation with total protein (Supplementary Fig. 3b,c), indicating most proteins, measured individually, would be otherwise confounded by the effect of total protein.
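The R2 values reported above summarize how much of the variance in one measurement is explained by a linear relationship with another. A minimal Python sketch of that calculation is given below; the function name and the numbers are hypothetical and are not the study data.

```python
def r_squared(x, y):
    """Squared Pearson correlation between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy ** 2 / (sxx * syy)


# Hypothetical median NPX and total protein values for four samples.
median_npx = [1.0, 2.0, 3.0, 4.0]
total_protein = [1.0, 3.0, 2.0, 4.0]
r2 = r_squared(median_npx, total_protein)
```

For a simple linear regression with one predictor, this squared correlation equals the coefficient of determination of the fitted line.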

Fig. 2: Median Normalized Protein eXpression (NPX) is associated with total protein levels.

a Boxplot of median NPX values across diseases. b Scatterplot representing the relationship between age (x-axis) and median NPX (y-axis). Each point on the plot corresponds to an individual sample. The blue line indicates the linear trend. c Boxplot of total protein measured in aqueous samples across diseases. d Scatterplot representing the relationship between median NPX (x-axis) and total protein (y-axis). The blue line indicates the linear trend. Boxplots indicate median and interquartile ranges (IQR) and the whiskers extend from the box to the smallest and largest values within 1.5 times the IQR from the quartiles. N  = 30 non-AMD controls, N = 30 iAMD samples, and N = 28 GA samples. P-values calculated using the Student t-test. GA geographic atrophy, iAMD intermediate age-related macular degeneration.

NPX normalization enables compositional comparisons across disease

We assume that the relative distribution of proteins between control and GA should not change, with the exception of a minority of disease-relevant proteins. Because the unbalanced distribution of NPX values between cohorts may skew the results, we asked whether NPX values require further normalization prior to downstream analysis to mitigate the potential for total protein to drive differential abundance. One method to remove the effect of total protein is to scale all the samples to have the same median NPX (also known as median-adjusted NPX) (Supplementary Fig. 4). This method centers median NPX values across samples by applying a constant correction factor for all proteins in each sample and, importantly, preserves relative NPX rank orders within each sample. Interestingly, when comparing the mean-variance relationship in raw (Fig. 3a) and median-adjusted data (Supplementary Fig. 4b), we found an overall lower level of dispersion in the median-adjusted data (Fig. 3b), indicating that the dataset benefited from median adjustment through reduced technical noise.
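The idea of median adjustment can be sketched in a few lines of Python. The sketch assumes an additive shift on the log2 NPX scale and uses the mean of the per-sample medians as the common target; both are assumptions made for illustration, and the limma implementation used in the study may differ in detail. The function name, sample names, and values are hypothetical.

```python
from statistics import median


def median_adjust(npx_by_sample):
    """Shift each sample so that all samples share the same median NPX.

    npx_by_sample maps a sample ID to a list of NPX values (log2 scale),
    one per assay, in the same assay order for every sample.
    """
    sample_medians = {s: median(v) for s, v in npx_by_sample.items()}
    # Common target: the mean of the per-sample medians (an assumption).
    target = sum(sample_medians.values()) / len(sample_medians)
    return {
        s: [x - sample_medians[s] + target for x in values]
        for s, values in npx_by_sample.items()
    }


# Hypothetical toy data: three assays measured in two samples.
raw = {"control_01": [1.0, 2.0, 3.0], "GA_01": [2.5, 3.5, 4.5]}
adjusted = median_adjust(raw)
```

Because every value in a sample is shifted by the same constant, the rank order of assays within that sample is unchanged, which is the property highlighted in the text.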

Fig. 3: Quantile normalization reduces variance in Normalized Protein eXpression (NPX) values.

a Mean-variance plots showing the average uncorrected NPX value (x-axis) and the dispersion (y-axis) across all samples. Solid line represents the loess trend line and the grey areas denote the 95% confidence interval. Each point is an assay. b Loess trend lines from the different normalized methods. Black = uncorrected; orange = median adjusted; red = quantile normalized. sqrt(sigma): square root of the standard deviations of NPX values.

We next tested whether a more extensive method, quantile normalization, would remove additional bias and further enhance data quality. Like median-adjusted NPX, quantile normalized data preserves relative NPX rank orders. After applying quantile normalization to our dataset, we found an even lower level of assay variability across the entire data range compared to median-adjusted normalization (Fig. 3b and Supplementary Fig. 4c). In addition, quantile normalization further reduces directional bias in differential protein abundance analysis (below) compared to both uncorrected and median-corrected NPX values (Supplementary Fig. 5). Therefore, quantile normalization appears to further reduce sources of noise and bias, and was used for all subsequent analyses. Note that we also performed the same analysis with uncorrected or median-corrected NPX values and found that the majority of signals were preserved (Supplementary Fig. 5b). To illustrate the effect of data normalization, we also provide examples of proteins that were significant in only the unnormalized or only the normalized dataset (Supplementary Fig. 6).
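Quantile normalization can be sketched as follows. This Python illustration is not the limma implementation used in the study: it assumes complete data with no ties or missing values, and the function name and numbers are hypothetical.

```python
def quantile_normalize(samples):
    """Quantile-normalize samples given as equal-length lists of NPX values.

    Each inner list holds one sample's values in a shared assay order.
    Ties and missing values are not handled in this sketch.
    """
    n_assays = len(samples[0])
    sorted_samples = [sorted(s) for s in samples]
    # Reference distribution: mean of the k-th smallest value across samples.
    reference = [
        sum(s[k] for s in sorted_samples) / len(samples)
        for k in range(n_assays)
    ]
    normalized = []
    for s in samples:
        # Assay indices ordered from the smallest to the largest value.
        order = sorted(range(n_assays), key=lambda i: s[i])
        out = [0.0] * n_assays
        for rank, i in enumerate(order):
            out[i] = reference[rank]
        normalized.append(out)
    return normalized


# Hypothetical toy data: two samples, three assays each.
toy_samples = [[5.0, 2.0, 3.0], [4.0, 1.0, 6.0]]
toy_normalized = quantile_normalize(toy_samples)
```

After normalization every sample has exactly the same set of values (the reference distribution), while the ordering of assays within each sample is preserved.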

Differential protein abundance analysis in GA aqueous humor samples

We performed pairwise analyses between non-AMD control, intermediate AMD, and GA AH samples using quantile normalized Olink data to identify disease-specific changes. We found that the greatest change was between GA and non-AMD control samples, with 87 assays showing differential abundance. Because some proteins were measured on multiple panels, these 87 assays correspond to 82 unique proteins with significant differential abundance (FDR less than 0.05 and fold change greater than 1.5). Of these 82 proteins, 59 are increased and 23 are decreased in GA (Fig. 4a, Supplementary Data 1). Many of the differentially abundant proteins are found in the cardiovascular panels defined by Olink (e.g., Cardiovascular II/III, Cardiometabolic, Supplementary Fig. 7). In addition, we identified proteins involved in inflammation, such as IL-6, CXCL9, and CCL25.
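The step from significant assays to unique proteins can be illustrated with a small Python sketch. It assumes that the fold-change threshold is applied symmetrically on the log2 scale (an absolute log2 fold change above log2(1.5)), which is an interpretation rather than a detail stated in the text. The function name, the panel assignments, and all values are hypothetical.

```python
import math


def unique_significant_proteins(assays, alpha=0.05, min_fold_change=1.5):
    """Collapse significant assays to unique proteins, split by direction.

    assays is a list of (protein, panel, log2_fold_change, fdr) tuples.
    """
    cutoff = math.log2(min_fold_change)
    increased, decreased = set(), set()
    for protein, _panel, log2_fc, fdr in assays:
        if fdr < alpha and abs(log2_fc) > cutoff:
            (increased if log2_fc > 0 else decreased).add(protein)
    return increased, decreased


# Hypothetical results; a protein may appear on more than one panel.
toy_assays = [
    ("IL-6", "panel_A", 1.2, 0.001),
    ("IL-6", "panel_B", 1.1, 0.002),
    ("SMOC2", "panel_C", 0.9, 0.010),
    ("PROTEIN_X", "panel_D", -0.8, 0.030),
    ("PROTEIN_Y", "panel_E", 0.3, 0.001),  # below the fold-change cutoff
    ("PROTEIN_Z", "panel_F", 1.0, 0.200),  # FDR above alpha
]
up, down = unique_significant_proteins(toy_assays)
```

Here four assays pass both thresholds but they map to three unique proteins, mirroring how a protein measured on several panels is counted once.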

Fig. 4: Differential protein abundance analysis reveals changes in geographic atrophy (GA) aqueous humor (AH) samples.

a-c Volcano plots visually represent the change in protein expression levels and its statistical significance. The x-axis displays the log2 fold change in protein expression, indicating upregulated values to the right and downregulated values to the left. The y-axis represents the negative logarithm (base 10) of the false discovery rate (FDR), emphasizing proteins with higher statistical significance towards the top. iAMD intermediate age-related macular degeneration.

Interestingly, we identified four proteins, SMOC2, NT-proBNP, TNFRSF13B, and CXCL9, that increased in abundance in GA when compared to intermediate AMD, all of which except CXCL9 showed further increase when compared to non-AMD controls (Fig. 4b). Building a linear model of disease severity across non-AMD control, intermediate AMD, and GA confirmed these three proteins were significantly associated with disease progression (Supplementary Data 2). Likewise, pairwise comparison between non-AMD control and intermediate AMD revealed four significantly altered proteins, CFHR5, PIgR, SPINK1, and CCN5 (Fig. 4c), all of which except CFHR5 were also linked to increasing disease severity.

Candidate biomarkers associated with geographic atrophy

To identify biologically relevant candidate GA biomarkers, we cross-referenced gene expression datasets from both non-AMD and AMD donor eye samples from bulk RNA-seq42 as well as single-nuclei RNA-seq43. First, in bulk RNA-seq samples from non-AMD RPE and retinal layers42, 50 out of the 82 (61%) differentially abundant proteins were expressed in at least one of the layers, indicating potential ocular sources for these proteins. In addition, 44% of the proteins that are increased in GA aqueous humor are expressed in RPE cells, whereas approximately 20% are expressed in the retina (Fig. 5a). SMOC2 and MYOC are among the genes with the highest expression in the posterior parts of the eye (Supplementary Fig. 8). These two genes also exhibited higher expression in the peripheral retina compared to the macula in non-AMD donor eyes, while others such as IL6, CCL14, and CD300LG are more highly expressed in the macula, where GA lesions form and progress (Supplementary Fig. 9). Conversely, proteins that decrease in GA samples are ubiquitously expressed in both RPE and retina layers (Fig. 5b and Supplementary Fig. 10).

Fig. 5: Gene expression of proteins that change in geographic atrophy (GA) aqueous humor (AH) in bulk and single nuclei RNA-seq datasets.

a-d Barplots indicating the percent of proteins changed in GA AH that are expressed in bulk RNA-seq (a, b) and single-nuclei RNA-seq (c, d) datasets. CSJ corneo-scleral junction, ETM eye trabecular meshwork, NK natural killer cells, RGC retinal ganglion cells, RPE retinal pigment epithelium. For bulk RNA-seq data: N = 112 RPE non-Macula; N = 105 RPE Macula; N = 115 Retina non-Macula; N = 101 Retina Macula samples. For sNuc-seq: N = 7 control donors.

To further characterize the cellular compartments that may express these genes, we leveraged single-nuclei RNA-sequencing datasets from both non-AMD donor retina43 and non-AMD donor anterior segment44 (Fig. 5c and Supplementary Fig. 11a). Specifically, in the anterior segment, cells from the trabecular meshwork and corneo-scleral junction appear to be most enriched for proteins with increased abundance in GA, whereas amacrine, astrocyte, and fibroblast cells are the enriched populations in the posterior of the eye. In contrast, proteins that are decreased in abundance in GA samples are expressed ubiquitously across all cell types (Fig. 5d and Supplementary Fig. 11b), consistent with the bulk RNA-sequencing analysis. Many of the proteins are expressed in multiple cell types, which we list in Supplementary Data 3.

To determine whether changes in protein composition in the AH could reflect changes in the retina, we examined gene expression data from well-phenotyped AMD donor eye samples42. Differential expression analysis of GA and control donor eyes revealed that only two genes, SMOC2 and FABP4, are upregulated in the macula region of GA patients (Supplementary Fig. 12). Proteins that were reduced in GA aqueous humor samples did not appear to have any consistent gene expression changes in the retina.

Finally, we extended our study to cross-reference a dataset comparing differentially abundant proteins in AH of patients with diabetic retinopathy (DR) compared to controls (Supplementary Data 4). This analysis revealed that the median NPX levels are also elevated in DR (Fig. 6a), suggesting elevated total protein, which also requires normalization for downstream analysis. We identified 81 significantly increased and 43 significantly decreased proteins in DR (FDR less than 0.05 and fold change greater than 1.5) (Fig. 6b). When compared to the protein abundance changes seen in GA versus control, we identified a number of shared differentially abundant proteins (p < 2.2e-16, Fisher’s Exact Test), including REG1A, DEFA1, MMP12, and ANGPTL4 (Fig. 6c and Supplementary Data 5). In addition, we also identified some disease-specific protein changes, such as SMOC2 and IL-6 for GA, as well as CHIT1 and IGFBP-1 for DR (Fig. 6c). However, unlike SMOC2 and IL-6, CHIT1 and IGFBP-1 are not expressed in the retina (Supplementary Fig. 8). While previous studies have shown an IL-6 increase in DR45,46, our data do not suggest an increase of IL-6 in DR beyond what can be accounted for by total protein (Supplementary Fig. 13). Overall, this analysis revealed both disease-specific protein changes and unexpected commonalities, which will require future investigation.
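The overlap test mentioned above can be pictured as a hypergeometric tail probability. The Python sketch below computes a one-sided Fisher's exact p-value for observing at least a given number of shared proteins; whether the study used a one-sided or two-sided test is not stated, so the one-sided form is an assumption. The function name and the counts are hypothetical and are not the study's numbers.

```python
from math import comb


def overlap_p_value(n_total, n_a, n_b, n_both):
    """One-sided Fisher's exact p-value for an overlap of at least n_both.

    n_total: number of proteins tested in both analyses
    n_a, n_b: number of significant proteins in analysis A and in analysis B
    n_both: number of proteins significant in both analyses
    """
    max_overlap = min(n_a, n_b)
    # Count the ways to draw n_b proteins with k of them among the n_a hits.
    favorable = sum(
        comb(n_a, k) * comb(n_total - n_a, n_b - k)
        for k in range(n_both, max_overlap + 1)
    )
    return favorable / comb(n_total, n_b)


# Hypothetical counts: 10 proteins tested, 3 hits in each analysis, all shared.
p_all_shared = overlap_p_value(n_total=10, n_a=3, n_b=3, n_both=3)
```

A small p-value indicates that the two lists share more proteins than expected if significant proteins were drawn independently in each analysis.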

Fig. 6: Comparison of differential protein abundances between geographic atrophy (GA) and diabetic retinopathy (DR) aqueous humor (AH) samples against controls.

a Boxplot of median Normalized Protein eXpression (NPX) values between DR (N = 18) and control (N = 22). Each dot is a sample. P-values calculated using the Student t-test. Boxplots indicate median and interquartile ranges (IQR) and the whiskers extend from the box to the smallest and largest values within 1.5 times the IQR from the quartiles. b Volcano plot of change in protein abundance levels in DR compared to control. Each dot is a protein and red color indicates statistical significance. c 4-way plot of two separate differential protein abundance analyses. X-axis shows log2 fold change (LogFC) from the GA vs control comparison; y-axis shows log2 fold change from the DR vs control comparison. Each dot is an assay. Dots labeled in purple are significantly different (false discovery rate (FDR) < 0.05 & fold change > 1.5) in both analyses. Red and green represent significant changes in only GA (red) or DR (green) analysis76.

In the totality of our analysis, SMOC2 is one of the top pathophysiological biomarker candidates in AH (Fig. 7a). SMOC2 protein abundance in the AH can in small part be explained by age (R2 = 0.15, p = 8.9e-5, Fig. 7b). Meanwhile, SMOC2 expression is increased in the macula of AMD donor eyes compared to controls (Fig. 7c). In addition, sNuc-seq revealed that SMOC2 is consistently upregulated in AMD donor eyes, particularly in mural and RPE cells (Fig. 7d, e). Together, the bulk RNA-seq and sNuc-seq data suggest that elevated levels of SMOC2 in AH could originate from the retina and that SMOC2 is a potential biomarker for AMD.

Fig. 7: SMOC2 is increased in geographic atrophy (GA) aqueous humor (AH) and upregulated in age-related macular degeneration (AMD) retina.

a Boxplot of AH SMOC2 Normalized Protein eXpression (NPX) values across diseases. Boxplots indicate median and interquartile ranges (IQR) and the whiskers extend from the box to the smallest and largest values within 1.5 times the IQR from the quartiles. b Scatterplot representing the relationship between age (x-axis) and SMOC2 NPX (y-axis). The blue line indicates the linear trend. c Boxplot of SMOC2 reads per kilobase of transcript per million reads mapped (RPKM) values from bulk RNA-seq donor eye samples. d Scatterplot representing the average expression (x-axis) and fold change between AMD and control samples (y-axis). Each dot is a cell type. e Example boxplots from pseudobulk sNuc-seq samples in Mural (left) and RPE (right) showing differential expression of SMOC2 in AMD. P-values calculated using the Student t-test. iAMD intermediate AMD, RGC retinal ganglion cell, RPE retinal pigment epithelium, NEO neovascular AMD, AREDS Age-Related Eye Disease Studies. For Olink data: N = 30 non-AMD controls, N = 30 iAMD samples, and N = 28 GA samples. For bulk RNA-seq: N = 36 normal macula; N = 24 AREDS2/3 macula; N = 18 NEO macula; N = 10 GA macula; N = 31 normal periphery; N = 19 AREDS2/3 periphery; N = 14 NEO periphery; N = 7 GA periphery. For sNuc-seq: N = 6 AMD donors; N = 7 control donors.

Next, we identified IL-6 as a significant biomarker with differential abundance in GA AH samples. IL-6 is found on four different panels in the Olink platform and is reproducibly significant in all four assays (Fig. 8a). IL-6 is among the 10 proteins with the highest fold change between non-AMD control and GA (Fig. 4). Using a qualified immunoassay, we confirmed the significantly higher abundance of IL-6 in GA samples compared to non-AMD control, even after removing three high-expressing outliers (p = 0.0006 and p = 0.002, respectively; Fig. 8b). IL6 is expressed in the retina from donor eyes, although with high variability (Supplementary Fig. 8). However, sNuc-seq was unable to detect IL6 expression. Comparison with IL-6 levels in paired plasma samples collected on the same day as the AH samples indicates no correlation between systemic and AH IL-6 levels (R2 = 0.003, Fig. 8d). In addition, we rule out an effect of age on IL-6 levels (R2 = 0.07) (Fig. 8c). Together, these data suggest that IL-6 could be another candidate AH biomarker for GA progression.

Fig. 8: IL-6 is differentially expressed between control and geographic atrophy (GA) in aqueous humor (AH).

a Boxplot of IL-6 Normalized Protein eXpression (NPX) values across multiple Olink panels. b Boxplot of IL-6 protein levels measured by immunoassay. c Scatterplot representing the relationship between age (x-axis) and IL-6 immunoassay measurement in AH (y-axis). d Scatterplot representing the relationship between IL-6 in the plasma (x-axis) and in AH (y-axis). P-values calculated using the Student t-test. iAMD intermediate age-related macular degeneration, QN quantile normalized. Boxplots indicate median and interquartile ranges (IQR) and the whiskers extend from the box to the smallest and largest values within 1.5 times the IQR from the quartiles. N  = 30 non-AMD controls, N = 30 iAMD samples, and N = 28 GA samples.

Discussion

Little is known about the molecular mechanisms driving GA progression, making development of effective therapeutics challenging. Our study created a framework to identify candidate AH biomarkers associated with GA using Olink proteomics. Such biomarkers could be used in a variety of clinical applications, for example, assessing on-target activity of new therapeutics or predicting patient populations who might respond to drug intervention. Furthermore, biomarkers have the potential to act as an early indicator of transition from intermediate AMD to GA, allowing early intervention with approved therapeutics, or provide important insights into the onset and progression of GA, and most importantly, in the pathophysiology of the disease with the potential to offer hints at new therapeutic targets. Overall, identifying retinal disease biomarkers in AH samples has potential to support drug development and improve disease management over current standards of care.

As part of the quality control analysis of the Olink data, we observed a strong correlation between uncorrected median NPX values and total protein levels in matched samples. Indeed, most individual analytes correlate positively with total protein, as expected, which makes it difficult to distinguish true biological drivers of disease from passenger proteins that simply track the overall level of total protein. This highlights a potential limitation of past and future AH studies that focus on individual analytes or a small, predefined set of proteins without accounting for total protein levels. Therefore, data normalization is critical for properly understanding proteomic compositional changes in disease samples.
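The quality control check described above can be illustrated with a minimal Python sketch. It assumes that the correlation is a Pearson correlation between the per-sample median NPX and total protein on its measured scale; the type of correlation is not specified here, so this choice is an assumption. The sample identifiers, the NPX values, the total protein values, and the helper `pearson_r` are hypothetical and are not study data.

```python
from math import fsum, sqrt
from statistics import median


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length (at least 2)")
    mean_x = fsum(x) / len(x)
    mean_y = fsum(y) / len(y)
    sxy = fsum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = fsum((a - mean_x) ** 2 for a in x)
    syy = fsum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / sqrt(sxx * syy)


# Hypothetical NPX values per sample (one value per analyte); not study data.
npx_by_sample = {
    "S1": [1.0, 2.0, 3.0],
    "S2": [2.0, 3.0, 4.5],
    "S3": [3.5, 4.0, 6.0],
}
# Hypothetical total protein per matched sample (arbitrary units).
total_protein = {"S1": 0.2, "S2": 0.5, "S3": 0.6}

sample_ids = sorted(npx_by_sample)
median_npx = [median(npx_by_sample[s]) for s in sample_ids]
protein = [total_protein[s] for s in sample_ids]

# A value close to 1 would indicate that median NPX tracks total protein.
r = pearson_r(median_npx, protein)
print(f"Pearson r between median NPX and total protein: {r:.3f}")
```

Because NPX is reported on a log2 scale, correlating against log-transformed total protein may be more appropriate in practice; that choice is left open in this sketch.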

Quantile normalization reduced data variability and improved sensitivity for detecting differentially abundant proteins in our Olink proteomics dataset. Presumably, this is because it removes sources of technical variation, such as total protein and others that are unknown. We show that normalization improves our power and sensitivity for detecting protein abundance changes, which reduces the number of false negatives in our study. However, the effect of normalization on false positives remains unclear, given that we do not have a truth dataset to compare against. Because normalization removes unwanted variation from the data, it would be expected to provide a more accurate analysis and fewer misinterpretations of outliers as significant findings.
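For readers unfamiliar with the technique, the following Python sketch shows the general idea of quantile normalization. It is not the implementation used in this study. It assumes that every sample contains the same analytes, that there are no missing values, and that ties are broken by position. The function name `quantile_normalize` and the input values are hypothetical.

```python
def quantile_normalize(samples):
    """Quantile-normalize a list of samples (each a list of analyte values).

    Every sample is mapped onto the same reference distribution: the mean,
    across samples, of the values at each rank. Ties are broken by position
    and missing values are not handled (assumptions of this sketch).
    """
    n_analytes = len(samples[0])
    if any(len(s) != n_analytes for s in samples):
        raise ValueError("all samples must contain the same analytes")

    # Reference distribution: mean of the k-th smallest value across samples.
    sorted_samples = [sorted(s) for s in samples]
    reference = [
        sum(s[k] for s in sorted_samples) / len(samples)
        for k in range(n_analytes)
    ]

    normalized = []
    for s in samples:
        # Analyte indices ordered from the lowest to the highest value.
        order = sorted(range(n_analytes), key=lambda i: s[i])
        out = [0.0] * n_analytes
        for rank, idx in enumerate(order):
            out[idx] = reference[rank]
        normalized.append(out)
    return normalized


# Hypothetical NPX-like values: three samples x four analytes (not study data).
samples = [
    [5.0, 2.0, 3.0, 4.0],
    [4.0, 1.0, 4.5, 2.0],
    [3.0, 4.0, 6.0, 8.0],
]
normalized = quantile_normalize(samples)
for row in normalized:
    print([round(v, 3) for v in row])
```

By construction, every sample has the same distribution after normalization, and therefore the same median, while the rank order of analytes within each sample is preserved. A sample-wide shift, such as one caused by higher total protein, can therefore no longer drive differences between samples.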

Different proteomics assays operate under distinct assumptions, and the most suitable normalization technique might vary. Further investigation with other proteomics assays would be valuable to determine the broader applicability of quantile normalization. Alternative normalization techniques that we did not explore in this study, such as loess normalization or regression analysis47, could also be implemented, and we would anticipate similar gains in reducing bias.

This type of data normalization might be specific to affinity-based platforms, which measure fixed sample volumes, in contrast to mass spectrometry, which uses fixed total protein amounts (that is, total protein is normalized upfront). It is also possible that aqueous humor and other matrices with low overall protein abundance relative to plasma are more susceptible to being driven by total protein concentration. For example, in DR patients with matched aqueous, vitreous, and serum samples, only AH and vitreous humor, but not serum, showed significant median NPX differences between control and DR9. Future studies that measure total protein in matched samples will help illuminate the general relationship between total protein and median NPX.

In our differential protein abundance analysis, we observed 82 significantly differentially abundant proteins in GA compared to non-AMD control. Far fewer were observed in intermediate AMD compared to control. One possible explanation is the high heterogeneity of the clinical condition in the intermediate stage of the disease. It is also possible that the sample size in this study was not large enough to identify statistically significant differences. While our pairwise comparisons did not identify many significant changes in aqueous humor for intermediate AMD, our linear model for disease severity identified a larger number (n = 53) of significantly altered proteins, suggesting that these proteins are in fact elevated, albeit with effect sizes too small to pass our prespecified thresholds (FDR < 0.05 and fold change > 1.5).
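The combined threshold can be made concrete with a short Python sketch. It assumes that the FDR is controlled with the Benjamini-Hochberg procedure and that the fold-change cutoff is applied to the absolute log2 fold change, which is natural because NPX is reported on a log2 scale; neither choice is stated in this section, so both are assumptions. The protein names, p-values, and fold changes are hypothetical and are not study results.

```python
from math import log2


def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


def select_hits(proteins, p_values, log2_fold_changes, fdr=0.05, fold_change=1.5):
    """Proteins passing both the FDR and the absolute fold-change cutoff."""
    q_values = benjamini_hochberg(p_values)
    min_log2fc = log2(fold_change)  # fold change of 1.5 is about 0.585 in log2
    return [
        name
        for name, q, lfc in zip(proteins, q_values, log2_fold_changes)
        if q < fdr and abs(lfc) > min_log2fc
    ]


# Hypothetical inputs for illustration only (not study results).
proteins = ["P1", "P2", "P3", "P4"]
p_values = [0.001, 0.01, 0.03, 0.5]
log2_fc = [1.2, 0.3, -0.9, 2.0]

hits = select_hits(proteins, p_values, log2_fc)
print(hits)
```

In this toy input, the hypothetical protein P2 is statistically significant after adjustment but is excluded because its fold change is below 1.5. This is the situation described above for proteins detected by the disease severity model but not by the pairwise comparison.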

By cross-referencing gene expression datasets, we determined that a number of differentially abundant proteins between control and GA are likely of ocular origin. This is consistent with conclusions from another aqueous humor proteomics study that compared its findings with tissue gene expression26. While other studies have previously performed expression analysis of AMD donor eyes48,49,50,51,52, the study we selected is unique in that it was powered to distinguish between the macular and peripheral retina in addition to separating the retina and RPE/choroid layers42. How regional differences in protein expression between macula and peripheral retina translate into differences in biomarker findings is an exciting topic53. AMD is, generally speaking, not a local pathology, as evidenced by choroidal sclerosis, RPE alterations, and drusen outside the macula. Nevertheless, relevant disease pathology is limited to the macula. DR, in contrast, is a disease affecting the entire retina. It is thus surprising that biomarker levels in the macular and the panretinal disease share similarities. Possibly, the concept of a local macular disease deserves to be reconsidered.

Our analysis uncovered notable proteins that may be useful as pathophysiology biomarkers for AMD in AH. One such protein is SMOC2 (SPARC-Related Modular Calcium Binding 2), which is a member of the SPARC (Secreted Protein Acidic and Rich in Cysteine/osteonectin/BM-40) family54. This protein family is known for its involvement in cell-matrix interactions, modulating cell adhesion, migration, proliferation, and differentiation, and in the development and remodeling of the extracellular matrix (ECM)55,56,57. Interestingly, SMOC2 was previously identified as a gene upregulated in AMD donor eyes53 and, more recently, as a biomarker for Alzheimer’s disease and related tauopathies58,59,60.

SMOC2 is secreted into the ECM, where it is involved in ECM deposition, organization, and remodeling54. Its role in ECM organization could potentially influence age-related tissue changes, affecting tissue elasticity and function. Additionally, SMOC2 is thought to play a role in the regulation of angiogenesis61, exhibiting anti-angiogenic activity, and is involved in tissue regeneration62, wound healing63, and fibrosis64,65,66.

GA is associated with changes in the retinal ECM, particularly in Bruch’s membrane. The remodeling of Bruch’s membrane affects retinal structure, function, and integrity, likely contributing to GA progression1,2,67. By modulating ECM organization, SMOC2 might affect how retinal cells interact with their microenvironment, which is crucial in response to injury. Additionally, chronic inflammation and oxidative damage are key features of GA1,2. While SMOC2’s direct involvement in retinal inflammation is not well characterized, its modulation of signaling pathways through interactions with growth factors and receptors in the ECM might influence cellular responses to the oxidative stress that arises from the retina’s high metabolic activity. Given the compelling evidence of SMOC2 as a biomarker associated with GA pathophysiology in our study, and its known functions relevant to GA-associated biological processes, further research is warranted to understand the role of SMOC2 in GA.

Another GA pathophysiology candidate biomarker identified in our study is IL-6, a prominent pro-inflammatory cytokine known to play a critical role in immune response68,69. IL-6 is implicated in many inflammatory and auto-immune conditions due to its ability to stimulate immune cell proliferation, differentiation, and activation. In the context of GA, IL-6 may contribute to the chronic inflammation thought to be one of the drivers of disease progression70,71.

IL-6 can modulate complement proteins72, leading to the formation of membrane attack complexes (MACs) that damage retinal cells71,73. By activating the complement system, IL-6 may exacerbate retinal cell injury and death, particularly affecting the RPE cells and photoreceptors. This aligns with the understanding that complement activation plays a significant role in AMD pathogenesis1,2,71.

Additionally, IL-6 is linked to oxidative stress74,75, a condition characterized by an imbalance between reactive oxygen species (ROS) production and antioxidant defenses. The retina is highly susceptible to oxidative stress due to its high metabolic activity and exposure to light. IL-6 may induce oxidative stress by promoting the production of ROS, leading to cellular stress responses that result in dysfunction and death of RPE cells, ultimately causing the loss of photoreceptors in GA.

As an inflammatory cytokine, IL-6 could also result from or contribute to the formation and growth of drusen, which are extracellular deposits that accumulate between the RPE and Bruch’s membrane1,2. Drusen are prominent precursors of GA and are associated with inflammation in the retina. IL-6 may facilitate drusen formation by influencing the recruitment and activation of immune cells, further promoting inflammatory processes.

While we found IL6 expression in bulk RNA-seq retina samples, sNuc-seq was unable to detect IL6, which could be a limitation of the technology. sNuc-seq can have lower sensitivity for detecting low-abundance or transiently expressed genes, especially cytokines like IL6. Given the lack of correlation between plasma and AH IL-6 levels, the likely sources of AH IL-6 are local rather than systemic and include endothelial cells, monocytes, RPE, and cells in the anterior segment of the eye. This suggests that IL-6 production is localized within the ocular environment and may reflect local inflammatory responses.

In summary, IL-6 is known to be a central mediator of inflammation and may play a significant role in retinal pathology by promoting chronic inflammation, activating the complement system, inducing apoptosis, and increasing oxidative stress. These combined effects likely accelerate GA progression by amplifying inflammatory and degenerative processes in retinal tissues. Therefore, measuring IL-6 concentration in AH, alongside SMOC2, could potentially provide informative biomarkers for GA progression. Monitoring these biomarkers could enhance early detection, enable better tracking of disease progression, and contribute to the development of targeted therapeutic strategies.

A key question yet to be answered is the cause and source of elevated total protein levels in disease conditions. One possible source could be from blood due to a compromised blood-retina or ciliary body barrier contributing to higher protein content in the anterior chamber. However, a proteomic profiling study from paired aqueous humor and serum samples found these two matrices are highly distinct in patients with DR9. Therefore, peripheral proteins did not appear to be a major contributor to the aqueous humor proteome in DR. In GA, it is also possible that the loss of the RPE layer might allow the influx of blood proteins from the choriocapillary network into the retina, which might then diffuse to the AH. Future proteomic analysis of paired AH and serum samples from GA patients could help address this question. Currently, because the source of the total protein in GA AH is unknown, it is still unclear whether total protein content itself indicates a biological signal.

In this study, we used a bioinformatic filtering strategy to prioritize and highlight biomarkers for AMD, specifically SMOC2 and IL-6. Nevertheless, other significant proteins we identified should not be overlooked for future study. For example, gene expression levels do not always align with protein levels, so the absence of notable gene expression changes in donor eyes does not necessarily imply the same for protein levels. Furthermore, we excluded changes common to GA and DR aqueous humor samples compared to control, but these shared alterations could still be informative for GA pathophysiology and warrant further study. Conversely, candidate biomarkers that pass multiple filters, despite increasing our overall confidence, still require additional confirmation. Finally, while this study focused on confounding effects from total protein, other covariates that we were unable to adjust for, such as pseudophakic status or systemic and ocular treatment prior to AH collection32, could also confound the analysis.

Another limitation of this study is the protein coverage of the Olink Target 96 platform. While this platform was able to assay 1161 unique proteins, the individual proteins were originally pre-selected to cover known biological pathways of interest. Future work should customize an AH-specific panel to assay proteins that are expected to be enriched in the eye.