Introduction

The symptom-based tools currently used for stroke recognition in the field and emergency department offer limited accuracy, and stroke is estimated by some studies as the single most misdiagnosed condition in emergency medicine settings1. As such, the identification and development of reliable stroke-associated blood biomarkers has true potential to reduce life-threatening diagnostic delays and improve patient outcomes. Several biomarker discovery strategies have been employed over the past three decades in attempts to identify such biomarkers2. For example, countless studies have proposed candidate biomarkers or biomarker panels which exhibit alterations in blood as a result of the stroke-induced inflammatory response3; however, given the secondary nature of this mechanism, they have largely been hindered by limited specificity4,5,6. Arguably, a more promising strategy is to target molecules directly released into the blood by damaged brain tissue. Much of the work carried out along these lines has focused on brain-originating proteins7; however, these proteins often circulate at extremely low levels, which creates significant analytical challenges in terms of clinical implementation given constraints associated with point-of-care or stat laboratory testing8.

Two recent investigations have employed differing approaches in novel attempts to identify plasma circular RNAs (circRNAs) that are released from brain tissue into peripheral circulation during ischemic events (Table 1). Liu et al. carried out next-generation RNA sequencing of penumbral tissue isolated from a murine model of stroke to identify circRNAs that are responsive to ischemia, and reported significant upregulation of circOGDH9. In a follow-up analysis of clinical specimens, they further reported that plasma measures of circOGDH were elevated during the acute phase of human ischemic stroke, and demonstrated the ability to discriminate between stroke patients and neurologically normal controls with 82% sensitivity and 97% specificity. Meanwhile, Jiang et al. performed next-generation RNA sequencing of brain-originating exosomes isolated from the plasma of ischemic stroke patients using immunoprecipitation, and reported a set of 23 circRNAs that were assumed to be neural in origin10. In a follow-up experiment, they further reported that the collective direct plasma measures of a subset of 5 of these 23 candidates (circFUNDC1, circCALR, circVPS13C, circUSP48, and circCDC14A) were elevated in a second set of ischemic stroke patients relative to neurologically normal controls, and demonstrated the ability to differentiate between the two with 95% sensitivity and 81% specificity.

Table 1 Candidate plasma circRNAs detected in stroke that have been reported as originating from the brain.

If the 24 candidate circRNAs collectively identified by Liu et al. and Jiang et al. truly do originate from the brain, this would represent a potentially significant advance towards identifying a clinically viable acute stroke biomarker; the fact that these circRNAs should be resistant to digestion by plasma RNases and can be detected with nucleic acid amplification methods that are up to 1,000 times more sensitive than the immunoassay techniques typically used to detect proteins11 circumvents one of the biggest technical challenges that has limited the clinical applicability of brain-originating proteins. However, limitations associated with the respective experimental workflows deployed for biomarker discovery in these two investigations raise the possibility that some or all of the copies of these circRNAs that were detected in plasma had originated from non-neural tissues and not the brain as originally claimed, which would greatly diminish their diagnostic utility. In the case of Liu et al., circOGDH expression was not characterized in any non-neural tissues, making it unclear how restricted its expression is to the brain. In the case of Jiang et al., the plasma membrane proteins that were targeted for isolation of brain-derived exosomes raise the possibility that the exosome preparations that were used for downstream circRNA profiling contained contaminating material from circulating white blood cells. Specifically, it was reported that immunoprecipitation was carried out using a mix of antibodies against Excitatory amino acid transporter 1 (EAAT1), Transmembrane protein 119 (TMEM119), and L1 cell adhesion molecule (L1CAM) to capture exosomes and ultimately circRNAs assumed to originate respectively from astrocytes, microglia, and neurons. However, a review of human single cell mRNA sequencing data compiled as part of the Human Protein Atlas project12, alongside corroborating experimental reports13,14,15, suggests that these proteins are expressed in several non-neural cell populations found in the blood including granulocytes and monocytes (Fig. 1).

Fig. 1

Transcriptional expression levels of EAAT1, TMEM119, and L1CAM in a wide range of human cell populations. The average transcriptional expression levels of EAAT1, TMEM119, and L1CAM in 80 unique human cell types as quantified by single cell RNA sequencing. Data originate from over 600,000 individual cells isolated from 31 normal tissue types sourced from a pool of over 200 human donors, and were directly retrieved from the Human Protein Atlas. Expression levels represent log2 transcripts per million (TPM) values normalized between cells using Trimmed mean of M values (TMM) normalization.

In light of this uncertainty, examining the normal body-wide expression profiles of these circRNAs could identify alternate tissue sources that could serve as diagnostic confounds, and also help clarify the probability that they truly originate from the brain. Thus, in the work presented here, we leveraged publicly available RNA sequencing data originating from a large cohort of human donors to assess the expression levels of the 24 candidate plasma circRNAs collectively detected by Liu et al. and Jiang et al. in 31 distinct tissue types including the brain and blood.

Methods

All analyses were carried out with R version 4.34 (R project for statistical computing) using data retrieved from circAtlas version 3.0, a compendium of circRNA expression data generated from over 1,000 peri-mortem and post-mortem adult normal human specimens via bulk next-generation RNA sequencing16. The genomic coordinates and full length sequences for all 745,034 human circRNAs annotated in circAtlas were respectively downloaded as .BED and .FASTA files. The genomic coordinates associated with each of the 24 candidate circRNAs of interest collectively detected by Liu et al. and Jiang et al. were manually aggregated from the two original reports and unified in terms of Genome Reference Consortium hg38 (NCBI accession GCF_000001405.40) using the liftOver() function of the ‘rtracklayer’ package17. These unified genomic coordinates were used to search for matching circRNAs annotated in circAtlas. In a small number of instances where a perfect match could not be identified, the blast() function of the ‘rBLAST’ package18 was used to identify the annotated circRNA with the next closest alignment based on the full length sequences, which was used as a surrogate (Table 1). The estimated expression levels of these circRNAs of interest in all available tissue types with the exception of placenta were then retrieved, alongside those of 500 additional circRNAs randomly selected to serve as biological background using the sample() function of base R (Supplementary Table 1). Specifically, for each circRNA, the tissue-level mean counts of backsplice junction spanning reads normalized per million were individually downloaded as .tsv files and merged into a single data frame for downstream expression analysis.
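As an illustration of the coordinate-matching step described above, the following is a minimal sketch in Python rather than the R workflow used in this work. It assumes a six-column BED layout (chromosome, start, end, name, score, strand), which may not match the actual circAtlas download, and all records, circRNA names, and candidate labels are hypothetical. Candidates without a perfect coordinate match are returned as None, marking them for the sequence-based surrogate search.

```python
import csv
import io


def load_bed_index(bed_text):
    """Index BED records by (chrom, start, end, strand) -> annotated circRNA name.

    Assumes six tab-separated columns: chrom, start, end, name, score, strand.
    """
    index = {}
    for row in csv.reader(io.StringIO(bed_text), delimiter="\t"):
        if not row or row[0].startswith("#"):
            continue  # skip blank lines and comment lines
        chrom, start, end, name, _score, strand = row[:6]
        index[(chrom, int(start), int(end), strand)] = name
    return index


def match_candidates(candidates, index):
    """Map each candidate label to its annotated name, or None if no perfect match."""
    return {label: index.get(coords) for label, coords in candidates.items()}


# Hypothetical annotation records and candidate coordinates (illustration only).
bed_text = (
    "chr1\t1000\t2000\thsa-demoA_0001\t0\t+\n"
    "chr2\t500\t900\thsa-demoB_0001\t0\t-\n"
)
candidates = {
    "candidate_1": ("chr1", 1000, 2000, "+"),  # perfect match
    "candidate_2": ("chr2", 505, 900, "-"),    # start differs, so no perfect match
}
matches = match_candidates(candidates, load_bed_index(bed_text))
```

Keying the lookup on the full coordinate tuple, including strand, keeps the match strict, so near misses are deferred to the alignment step instead of being silently accepted.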

The degree of brain enrichment was calculated for each circRNA in terms of simple fold difference by dividing the observed mean expression level in brain tissue by the weighted mean expression level observed across all non-neural tissue types using simple scripts. The brain enrichment observed for circOGDH, as well as the median brain enrichment observed across the subset of 23 candidate circRNAs described by Jiang et al., were then statistically compared to the median brain enrichment observed across the set of background circRNAs respectively via either one sample or two sample Mann–Whitney U test using the wilcox.test() function of the ‘stats’ package. Furthermore, the proportions of circRNAs respectively exhibiting maximal expression in either brain tissue or blood were tabulated within both the subset of 23 candidate circRNAs described by Jiang et al. and the set of background circRNAs using basic scripts. The observed proportions were then statistically compared via Fisher’s exact test using the fisher.test() function of the ‘stats’ package. P-values of less than 0.05 were considered statistically significant. Figures were generated via the ‘ggplot2’ package and stylized using Adobe Illustrator version 24.0.1 (Adobe Incorporated).
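The brain enrichment calculation and the tabulation of the tissue of maximal expression can be sketched as follows. This is a Python illustration under stated assumptions, not the scripts used in this work: the weights are assumed to be per-tissue sample counts (the basis of the weighting is not specified above), every non-brain key is treated as a non-neural tissue, and all tissue names and values are hypothetical.

```python
def weighted_mean(values, weights):
    """Weighted arithmetic mean of values."""
    total_weight = sum(weights)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive number")
    return sum(v * w for v, w in zip(values, weights)) / total_weight


def brain_enrichment(expression, sample_counts, brain_key="brain"):
    """Fold difference: brain expression / weighted mean of all other tissues.

    Assumption: weights are per-tissue sample counts, and every key other
    than brain_key is a non-neural tissue.
    """
    others = [tissue for tissue in expression if tissue != brain_key]
    background = weighted_mean(
        [expression[t] for t in others],
        [sample_counts[t] for t in others],
    )
    if background == 0:
        return float("inf")  # expressed in brain only
    return expression[brain_key] / background


def tissue_of_max_expression(expression):
    """Tissue with the single highest mean expression level."""
    return max(expression, key=expression.get)


# Hypothetical normalized backsplice junction read counts and sample counts.
expression = {"brain": 8.0, "blood": 1.0, "muscle": 4.0}
sample_counts = {"brain": 10, "blood": 30, "muscle": 10}

fold = brain_enrichment(expression, sample_counts)
top_tissue = tissue_of_max_expression(expression)
```

In this toy case the weighted non-brain mean is (1.0 × 30 + 4.0 × 10) / 40 = 1.75, so the fold difference is 8.0 / 1.75, and brain is the tissue of maximal expression.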

Results

The summarized expression levels of circOGDH and the set of 23 candidate circRNAs described by Jiang et al. in all 31 tissue types included in our analysis can be found in Fig. 2. With respect to circOGDH, its single highest expression levels were observed in skeletal muscle, with meaningful expression in numerous non-neural tissues including stomach and spleen. Nonetheless, circOGDH did still display modestly heightened expression in brain tissue; its expression levels were 4.21 fold higher in brain tissue relative to the aggregate of other tissue types, suggesting that circulating copies could at least in part originate from the brain. However, this degree of brain enrichment was only marginally higher than the median brain enrichment observed across the 500 circRNAs we selected at random to serve as a proxy for biological background, and it is clear that there are numerous alternate tissue sources that could represent diagnostic confounds.

Fig. 2

Expression levels of the 24 candidate circRNAs in a wide range of human tissues. The average expression levels of the single candidate circRNA described by Liu et al. and the 23 candidate circRNAs described by Jiang et al. in 31 distinct normal human tissues as quantified by bulk RNA sequencing, compared to those of 500 background circRNAs selected at random. Expression levels represent counts of backsplice junction spanning reads normalized per million, and are visually presented unity scaled between 0 and 1 to facilitate comparison. For each individual circRNA, the tissue of maximum expression is indicated, along with the degree of brain enrichment in terms of the fold difference observed between brain and non-brain tissues. Furthermore, the proportions of circRNAs within each set of circRNAs that respectively exhibit maximal expression in either brain tissue or blood are also indicated, along with the median brain enrichment. (a) The proportions of brain expressed and blood expressed circRNAs amongst the candidates were statistically compared to background using Fisher’s exact test. Median brain enrichment amongst the candidates was statistically compared to background using either (b) one-sample or (c) two-sample Mann–Whitney U test. *Statistically significant compared to background.

Of the 23 candidate circRNAs detected by Jiang et al., only 1 (4.3%) displayed its single highest expression levels in brain tissue relative to all other tissue types included in our analysis, while 17 (73.9%) exhibited their single highest expression levels in blood. Comparatively, of the 500 randomly selected background circRNAs, 166 (33.2%) exhibited their highest expression levels in brain, while only 20 (4.0%) exhibited their highest expression levels in blood. Based on this background estimate, the probability of observing the proportion of blood expressed circRNAs present amongst the candidates described by Jiang et al. by random chance would be less than 10^-15. Given that nearly all of the RNA in whole blood is contributed by white blood cells, these collective observations strongly suggest that a majority of the nucleic acid detected by Jiang et al. did not originate from the brain, and in fact instead originated from contaminating leukocytes or leukocyte-derived material in their exosome and plasma preparations. Nonetheless, it is possible that circulating copies of the single circRNA for which we observed maximal expression in brain tissue, circGRK3, do at least in part originate from the brain; however, circGRK3 was only enriched in brain tissue 4.19 fold relative to the aggregate of other tissue types included in our analysis, suggesting a wide range of potentially confounding alternate sources.
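The comparison of blood-expressed proportions reported above (17 of 23 candidates versus 20 of 500 background circRNAs) can be checked with a small standalone implementation of Fisher's exact test. This is a Python sketch using only the standard library, not the R code used in this work, and the 2x2 layout of the table (blood-maximal versus all other circRNAs, candidates versus background) is an assumption about how the counts were arranged.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test p-value for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of every table with the same
    margins that is no more likely than the observed table.
    """
    row1, row2, col1 = a + b, c + d, a + c
    total = row1 + row2
    denom = comb(total, row1)

    def prob(k):
        # Probability that the top-left cell equals k given fixed margins.
        return comb(col1, k) * comb(total - col1, row1 - k) / denom

    lowest = max(0, row1 - (total - col1))
    highest = min(row1, col1)
    p_observed = prob(a)
    tolerance = 1 + 1e-7  # guard against floating-point ties
    return sum(
        p for p in (prob(k) for k in range(lowest, highest + 1))
        if p <= p_observed * tolerance
    )


# Rows: candidates, background; columns: blood-maximal, not blood-maximal.
p_value = fisher_exact_two_sided(17, 6, 20, 480)
print(f"p = {p_value:.3g}")
```

Exact integer binomial coefficients are used so that the very small tail probabilities are not lost to intermediate rounding before the final division.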

Discussion

Our collective results clearly indicate that nearly all of the 24 candidate circRNAs described by Liu et al. and Jiang et al. have alternate tissue sources that could represent diagnostic confounds if they were to be targeted for acute stroke recognition, and further suggest there is a high probability that many do not originate from the brain as originally believed.

With respect to circOGDH, while we did observe heightened expression in brain tissue, we observed meaningful expression in numerous non-neural tissues, most notably in gastrointestinal and skeletal muscle tissues. Given the large total mass of the gastrointestinal system and skeletal muscle, it is likely they contribute a significant proportion of the circOGDH found in circulation just with cell death occurring from normal physiologic cell turnover19,20. Given that the mass of affected brain tissue in an ischemic event is extremely small in comparison, especially early in pathology or in cases such as lacunar infarction, any circOGDH released from the brain may only yield marginal increases in circulating levels which could be difficult to discern. Furthermore, the widespread non-neural expression we observed suggests a high risk for false positives in instances of non-neurological injury if circOGDH were to be deployed for real-world stroke recognition. As a point of comparison, the cardiac troponins, which are now standard of care for triage of suspected acute myocardial infarction, are arguably the single most successful clinical use example of plasma tissue damage biomarkers21; their expression is enriched up to 2,000 fold in cardiac tissue relative to non-cardiac tissues22, whereas here, circOGDH was only enriched 4.21 fold in brain tissue relative to non-brain tissues.

With respect to the 23 candidate circRNAs described by Jiang et al., our results overwhelmingly suggest that it is likely that a majority originate from white blood cells, and not the brain as originally believed. Given the limited cell-type specificity of the antigens they targeted in their exosome enrichment strategy13,14,15, along with the high levels of blood expression we observed here, it is most likely that the results of their investigation were largely driven by the presence of contaminating leukocytes or leukocyte-derived material in their exosome and plasma preparations. As such, we strongly suspect that the differences in the plasma levels they reported for these circRNAs between ischemic stroke patients and neurologically normal controls were simply the result of the well-documented increase in circulating neutrophil and monocyte counts that is known to occur in stroke patients relative to healthy individuals23,24,25. This mechanism becomes more likely when considering the modest fold differences that were reported, which collectively ranged from only 1.3 to 6.5, as they mirror the magnitude of the increases that have been observed in counts of said cell populations during the acute phase of stroke. Given that these differences in circulating cell counts are absent or significantly attenuated between patients suffering stroke and those suffering stroke mimicking conditions commonly present in the suspected stroke population26, it is unlikely that plasma measures of these particular circRNAs would offer diagnostic value in a real world stroke recognition scenario. In that regard, this likely effect highlights the need to recruit true stroke mimics as controls in future biomarker discovery investigations, as opposed to neurologically normal or healthy individuals, as is currently the most common practice.

In terms of limitations, given that our analysis was carried out using data from normal tissues, it cannot account for any potential stroke-induced alterations in tissue expression levels, such as those that may occur in the brain. Nonetheless, from this perspective, it is important to note that a majority of the most well validated brain-originating blood biomarkers of neurological damage such as glial fibrillary acidic protein (GFAP) and neurofilament light chain (NfL) were originally proposed based on analyses of normal tissue expression alone, and said information is certainly useful in identifying threats to diagnostic specificity27. Furthermore, it is also important to note that in the experiments used by Liu et al. to compare circRNA expression between the penumbra and adjacent non-ischemic tissue in murine stroke, circOGDH was one of the most robustly differentially regulated circRNAs transcriptome-wide, demonstrating a 4-fold increase. These results are highly consistent with those of a similar study carried out by Mehta et al. that performed high-throughput differential expression profiling of brain tissue in murine stroke at early timepoints and reported fold changes that fall well under 4 for an overwhelming majority of successfully detected circRNAs28. If these observations generalize to human pathology, stroke-associated alterations of this magnitude in any tissue type would be unlikely to meaningfully alter the overall body-wide expression profiles we observed here, or our broader conclusions regarding potentially confounding tissue sources, especially given the prior context regarding the known limitations associated with the biomarker discovery workflows used to originally characterize the circRNAs in question.

While targeting circulating brain-originating circRNAs as a potential source of acute stroke biomarkers is certainly an intriguing and promising strategy, our collective results suggest that further investigation using more rigorous biomarker discovery workflows is needed to produce viable candidates with a higher degree of brain specificity.