Introduction

Breast cancer is now the most common form of cancer worldwide and the leading cause of cancer-related death in women1. Identification of biomarkers and new therapeutic targets for early detection, risk stratification and treatment is crucial to reducing breast cancer mortality. The complex systems underlying breast cancer development and progression call for a systems epidemiology approach that integrates clinical, molecular, and multi-omics data, to develop more tailored prevention and treatment strategies through precision medicine2,3,4.

MicroRNAs (miRNAs) are small, endogenous, non-coding RNA (ncRNA) molecules that post-transcriptionally regulate gene expression5,6. Depending on their target genes, miRNAs can function as either oncogenes or tumor suppressors to influence various hallmarks of cancer, such as the ability to evade growth suppressors, sustain proliferative signaling, resist cell death, activate migration and metastasis, or induce angiogenesis7. MiRNAs are frequently dysregulated in human cancers, and abnormal expression levels of miRNAs can be detected in solid tumors or body fluids, making them potential biomarkers for diagnosis, prognosis and prediction of treatment response8,9. Furthermore, miRNAs hold promise as therapeutic targets or tools7,8.

A candidate miRNA in this context is miR-20a-5p, a member of the miR-17-92 cluster. This cluster has a dual role in cancer, its members having demonstrated both tumor promoter and tumor suppressor functions in several cancer types10. Previous studies have demonstrated the dysregulation of miR-20a-5p in breast cancer, particularly triple negative breast cancer (TNBC). MiR-20a-5p has shown significantly higher expression levels in breast cancer tissue compared to normal breast tissue11. Significantly higher expression levels have also been found in TNBC tissue compared to normal breast tissue and non-TNBC tissue12, and in TNBC cells compared to HER2-positive13 and luminal A14 breast cancer cells. Further, high miR-20a-5p expression levels have been found in the TNBC cell line MDA-MB-231 and in the exosomes derived from this cell line15.

MiR-20a-5p has been linked to growth and proliferation of breast cancer cells. Bai et al. found that miR-20a-5p overexpression promoted cell growth and proliferation both in vitro and in vivo16, while Zhao et al. found the opposite effect in vitro17. Similarly, two studies demonstrated that miR-20a-5p overexpression led to increased migration and invasion of TNBC cells in vitro15,16, while another study found that the migrative and invasive capabilities were impaired17. Further, in vitro studies have suggested a promoting effect of miR-20a-5p on apoptosis17 and angiogenesis18.

Recent studies have suggested miR-20a-5p as a putative prognostic or predictive biomarker. A panel of eight miRNAs, including miR-20a-5p, was identified as a signature associated with tumor recurrence and decreased survival in TNBC patients12. In two independent patient cohorts of metastatic breast cancer patients, Rinnerthaler et al. found that low miR-20a-5p expression in breast cancer tissue predicted a greater benefit from bevacizumab-containing therapy, being significantly associated with longer progression-free and overall survival19.

Although previous studies suggest that miR-20a-5p exerts a role in breast cancer development and progression, substantial discrepancies among research findings warrant further investigation into the functional roles and biomarker potential of miR-20a-5p in breast cancer.

Research on the relationship between miR-20a-5p and established breast cancer risk factors, such as lifestyle and reproductive factors, is currently limited. Adopting the systems epidemiology perspective, investigating this relationship can help us understand the effect of these factors on gene expression and their contribution to the initiation and progression of breast cancer.

The emerging understanding that miRNAs may possess both cytoplasmic and nuclear function20,21,22, as well as their involvement in the tumor microenvironment23, highlights the importance of visualizing their tissue- and subcellular localization. However, none of the most commonly used techniques for miRNA quantification provide information on miRNA localization within the tissue or cell24.

In this study, we evaluated the expression profile of miR-20a-5p in 313 surgical specimens from breast cancer patients within the Clinical and Multi-omic (CAMO) cohort, which is part of the Norwegian Women and Cancer (NOWAC) cohort. The overall aim was to elucidate the role of miR-20a-5p in breast cancer and enhance our understanding of how miR-20a-5p expression is associated with breast cancer biology and its potential role as a biomarker for prognosis or targeted therapy. Specifically, three objectives were defined: (a) to assess the spatial and subcellular expression profile of miR-20a-5p in breast cancer tissues, which has not been previously addressed in the literature, (b) to quantify the association of miR-20a-5p expression with patient and tumor characteristics, including demographics, molecular subtypes, and clinicopathological features, and (c) to investigate the effects of miR-20a-5p on key aspects of breast cancer cell behavior, including proliferation, migration, and invasion.

To our knowledge, this is the first study to comprehensively evaluate the expression level of miR-20a-5p in various tissue- and subcellular compartments, explore its associations with breast cancer risk factors and clinicopathological features, and examine its functions in vitro.

Materials and methods

Study population

Our study includes a subset of participants from the CAMO cohort, described in detail elsewhere25. Nested within the NOWAC cohort26, the CAMO cohort consists of 388 women diagnosed with breast cancer in North Norway before 2013. For these women, we have detailed information on demographic, anthropometric, lifestyle, reproductive and clinicopathological parameters. The data was retrieved from questionnaires, medical records, national registries and histopathological analyses of tumor tissue.

From the initial 388 CAMO participants, 69 were excluded for technical reasons, such as missing formalin-fixed paraffin-embedded (FFPE) tissue blocks, tumors that were too small, fragmentation during TMA construction due to a high adipose tissue fraction, or non-scorable tumor cores. Lastly, those who had received neoadjuvant treatment were excluded (n = 6), resulting in a final cohort of 313 participants with scorable tissue cores. Among these participants, 309 had tissue cores with scorable stroma, 312 had scorable cytoplasm and 312 had scorable nuclei (Fig. 1). A total of 259 (82.7%) participants answered the questionnaires as part of the NOWAC cohort before their breast cancer diagnosis, while 54 (17.3%) patients gave this information after diagnosis.

Fig. 1

Study population. Inclusion of participants from the CAMO cohort to this study. TMA tissue microarray.

Definitions and recoding of variables

Tumors were graded based on gland formation, nuclear pleomorphism, and mitotic count as part of routine diagnostic assessment. Expression of estrogen receptor (ER), progesterone receptor (PR) and human epidermal growth factor receptor 2 (HER2) was evaluated in needle biopsies as previously described25. The Ki67 expression was analyzed in tumor tissue slides from the surgical specimens and reported as the percentage of positive cancer cell nuclei in the most proliferative parts of the tumors. Molecular subtyping of tumors was done based on the surrogate markers ER, PR, HER2 and Ki67, according to recommendations by the St. Gallen International Expert Consensus25,27,28. Information on tumor diameter, lymph node metastasis, distant metastasis and relapse was manually curated from medical records.

Age at menarche, parity, body mass index (BMI), smoking, alcohol consumption, menopausal status, and family history of breast cancer in mother or sister were self-reported in NOWAC questionnaires. Parity was categorized as nulliparous and parous, BMI was grouped into three categories: < 25, 25–30 and > 30, and smoking status of the participants was set to ever or never. In case of missing data on age at menopause, women were classified as pre- or postmenopausal using an age cut-off of 53 years.
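The recoding rules above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are hypothetical, the handling of values exactly on a boundary (BMI of 25 or 30, age of exactly 53 years) is not specified in the text and is chosen here arbitrarily, and the age used for the fallback is assumed to be age at diagnosis.

```python
def bmi_category(bmi):
    """Group BMI into the three categories used in the text."""
    # Assumption: values of exactly 25 and 30 fall in the middle category.
    if bmi < 25:
        return "<25"
    if bmi <= 30:
        return "25-30"
    return ">30"


def parity_category(number_of_children):
    """Dichotomize parity as nulliparous versus parous."""
    return "nulliparous" if number_of_children == 0 else "parous"


def menopausal_status(age_at_diagnosis, known_status=None, cutoff=53):
    """Return a known status, else fall back on the age cut-off."""
    if known_status is not None:
        return known_status
    # Assumption: women aged exactly 53 are counted as postmenopausal.
    if age_at_diagnosis >= cutoff:
        return "postmenopausal"
    return "premenopausal"


if __name__ == "__main__":
    # Hypothetical participant, for illustration only.
    print(bmi_category(27.4), parity_category(2), menopausal_status(49))
```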

MicroRNA expression in tumor tissue

TMA construction

The methodology has previously been reported in detail29. Briefly, representative tumor areas were selected on histological slides from the primary tumors by two pathologists (LM and LTRB). A map with coordinates for each patient was made before collecting tissue cores. Several replicate 0.6 mm tissue cores were transferred from each paraffin donor block to a recipient block using a tissue arraying instrument (Beecher Instruments, Silver Spring, MD) and 4 μm sections were prepared (Microm microtome HM355S, Microm, Walldorf, Germany).

In situ hybridization

Labeling of miR-20a-5p by ISH was performed in the Ventana Discovery Ultra instrument (Ventana Medical Inc, Marana, AZ, USA). Double‐digoxigenin (DIG) labeled miRCURY LNA detection- and control probes from Exiqon AS, Denmark were used.

Adequate sensitivity of the ISH method and minimal RNA degradation were confirmed by a control probe targeting U6, a small nuclear RNA component of the spliceosome. A scramble miRNA negative control probe indicated no unspecific staining from reagents or tissues. MiR-20a-5p expression in tissues other than breast cancer was also confirmed by a multi-tissue TMA control. Reagents and probes used are shown in Supplementary Table S1.

Briefly, TMA slides were baked at 60 °C overnight and then transferred to the Discovery Ultra for ISH staining. After deparaffinization, heat retrieval and denaturation, probes were hybridized to tissue RNA targets. Stringency washing and blocking of unspecific binding were performed. The RNA-bound probes were then detected immunologically by binding to alkaline phosphatase-conjugated anti-DIG and visualized by substrate enzymatic reactions. Finally, the slides were counterstained, dehydrated through an increasing gradient of ethanol solutions to xylene, and mounted. Details of the optimized ISH protocol are shown in Supplementary Table S2.

Semiquantitative scoring

The TMAs were digitized using a Panoramic 250 Flash III slide scanner (3DHistech, Budapest, Hungary), and uploaded to the bioimage analysis software QuPath version 0.1.2. In scorable cores, the staining density was scored in a four-tiered ordinal scale (0 = negative, 1 = weak, 2 = moderate and 3 = strong). Cores with representative scoring values are shown in Fig. 2. Each tissue core received one score for each of the three different compartments: tumor stromal fibroblasts, cancer cell cytoplasm and cancer cell nuclei. All samples were anonymized and independently scored by one pathologist (LM) and two researchers (EST and ABD), who were blinded to the scores of the other researchers and the patients’ outcomes. In cases where there was a score discrepancy greater than 1, the slides were re-examined until a consensus was reached. A mean score for each compartment was calculated from all cores of the patient and all examiners. A predetermined scoring value of 2 was used as a cutoff to dichotomize the mean scoring value as high or low. To assess scoring in cancer cell cytoplasm and nucleus combined, the dichotomized scoring categories were combined as either high/high, low/low or mixed.
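The aggregation steps described above (consensus check, mean score, dichotomization and the combined cytoplasm/nucleus category) can be sketched as follows. This is a minimal illustration, not the authors' scoring pipeline: the function names and the example scores are hypothetical, and the text does not state whether a mean score of exactly 2 counts as high or low, so the sketch assumes that it counts as high.

```python
from statistics import mean

CUTOFF = 2.0  # predetermined cutoff stated in the text


def needs_consensus(examiner_scores):
    """True when the scores for one core differ by more than 1."""
    return max(examiner_scores) - min(examiner_scores) > 1


def mean_score(core_scores):
    """Mean over all cores and all examiners for one compartment.

    core_scores: list of cores, each a list with one 0-3 score per examiner.
    """
    return mean(score for core in core_scores for score in core)


def dichotomize(score, cutoff=CUTOFF):
    """Label a mean score as high or low."""
    # Assumption: a mean exactly equal to the cutoff is labeled "high".
    return "high" if score >= cutoff else "low"


def combined_category(cytoplasm_label, nucleus_label):
    """Combine the two dichotomized labels: high/high, low/low or mixed."""
    if cytoplasm_label == nucleus_label:
        return f"{cytoplasm_label}/{nucleus_label}"
    return "mixed"


if __name__ == "__main__":
    # Hypothetical patient with two cores, each scored by three examiners.
    cytoplasm_cores = [[1, 2, 1], [2, 2, 1]]
    nucleus_cores = [[2, 3, 2], [3, 2, 2]]
    cyto = dichotomize(mean_score(cytoplasm_cores))
    nuc = dichotomize(mean_score(nucleus_cores))
    print(cyto, nuc, combined_category(cyto, nuc))
```

Averaging over every individual score, rather than over per-core means, gives the same result here because each core is scored by the same number of examiners.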

Fig. 2

Semiquantitative scoring of miR-20a-5p in tumor tissue. A panel of representative tissue cores with scoring of miR-20a-5p stained by in situ hybridization (ISH) in tumor stromal fibroblasts (stroma), cancer cell cytoplasm (cytoplasm) and cancer cell nuclei (nucleus). Cores were given scores of 0–3 based on the intensity of the staining in each tissue- and subcellular compartment.

Functional in vitro studies

Cell lines and culture

The functional properties of miR-20a-5p were evaluated in three different breast cancer cell lines: SK-BR-3 (ATCC® HTB-30), MDA-MB-231 (ATCC® HTB-26) and MCF-7 (ATCC® HTB-22), all derived from metastatic sites (pleural effusions). SK-BR-3 is a HER2-positive cell line, characterized by HER2 overexpression and the absence of ER and PR. MDA-MB-231 is a triple-negative cell line, lacking ER, PR, and HER2 expression. MCF-7 represents the luminal A subtype, exhibiting ER and PR positivity, with low or undetectable levels of HER230.

To reduce the risk of significant changes to the cells due to mutations during the passages, we plated SK-BR-3 cells below passage 20, MDA-MB-231 cells below passage 15 and MCF-7 cells below passage 10. The cells (2 × 10^5 cells/ml) were cultured in Opti-MEM I (1×) medium without phenol red (catalog# 11058-021, GIBCO, RF, UK), supplemented with 5% fetal bovine serum (FBS) (catalog# S0415, Biochrom, Berlin, Germany) and 1% penicillin-streptomycin (catalog# 15140-148, Gibco, NY, USA), in a humidified atmosphere of 5% CO2 and 95% air, at 37 °C, for 72 h. The culture medium was then replaced by serum-free medium 24 h before the experiments. At the start of the experiments, the cells were 85–90% confluent.

Cell transfection

Cells were transiently transfected with hsa-miR-20a-5p Pre-miR™ miRNA Precursor (catalog# AM17100, Thermo Fisher Scientific, USA), alongside the Cy3™ Dye-Labeled Pre-miR Negative Control #1 (catalog# AM17120, Thermo Fisher Scientific, USA) using the transfection reagent Lipofectamine™ RNAiMAX (catalog# 13778075, Thermo Fisher Scientific, USA). Transfected Cy3™ Dye-Labeled Pre-miR Negative Control emits fluorescent light upon UV-light exposure. The transfection efficiency, assessed by fluorescence microscopy, ranged from 80 to 95%.

MTT assay for proliferation

Cells were cultured at 5 × 10^3 cells/well in 96-well plates and then transfected with either hsa-miR-20a-5p Pre-miR™ miRNA Precursor or Cy3™ Dye-Labeled Pre-miR Negative Control. At 0, 1, 2 and 3 days after transfection, cells were treated with 12 mM of [3-(4,5-dimethylthiazol-2-yl)-2,5-diphenyltetrazolium bromide] (MTT, 5 mg/ml) (catalog# M6494, Invitrogen, OR, USA) and incubated for 4 h at 37 °C. The resulting formazan crystals were solubilized by incubating the cells in 0.01 M HCl/SDS (catalog# 28312, Thermo Scientific, IL, USA) at 37 °C overnight and then quantified spectrophotometrically by measuring the absorbance at 570 nm in the CLARIOstar® plate reader (BMG Labtech, Ortenberg, Germany). Three different experiments with four parallel wells were performed for each cell line (Supplementary Information, Fig. S1).

Wound healing assay for migration

Cells were cultured at 2 × 10^5 cells/well in 24-well plates, washed with phosphate buffered saline (PBS) and incubated in a serum-free culture medium containing mitomycin C (10 µg/L), which blocks DNA replication to avoid cell proliferation. The cell monolayer was scraped with a sterile 200 µl pipette tip to create a “wound”, and then washed to remove detached cells and debris. After 4 h, the cells were transfected with either hsa-miR-20a-5p Pre-miR™ miRNA Precursor or Cy3™ Dye-Labeled Pre-miR Negative Control and incubated for 24 h at 37 °C. To measure wound closure, photographs of the same areas of the wound were taken at 0 and 24 h. An inverted optical microscope (Nikon Eclipse TS100) was used to capture images, which were further analyzed by Micrometrics SE Premium 4 software. To determine the extent of cell migration during the 24-h incubation period, the areas occupied by migrated cells were quantified by subtracting the background levels at 0 h.

Transwell assay for invasion

Cells (2 × 10^5) in serum-free culture medium were seeded in ThinCert® chambers (Greiner Bio-one, Kremsmünster, Austria) with polyethylene terephthalate membranes (8 µm pore size) pre-coated with 50 µl of phenol red-free Matrigel (Gibco). The chambers were placed in 24-well plates containing culture medium with 5% FBS in the lower chamber. Cells in the upper chambers were transfected with either hsa-miR-20a-5p Pre-miR™ miRNA Precursor or Cy3™ Dye-Labeled Pre-miR Negative Control and incubated for 48 h at 37 °C. The chambers were washed with 10 mM PBS, fixed in 4% paraformaldehyde for 30 min, and stained with 0.2% crystal violet for 10 min. Non-invading cells from the upper membrane surface were removed with a cotton swab. Invaded cells on the lower membrane surface were photographed using an inverted optical microscope (Nikon Eclipse TS100). Images of three random microscope fields were captured in duplicate. To quantify invasion, the number of cells on the membranes was counted using ImageJ software (National Institutes of Health, Bethesda, MD, USA).

Statistical methods

We calculated interobserver reliability between miR-20a-5p scores by applying two-way random effects models with absolute agreement definition. The correlations between miR-20a-5p expression levels in different compartments were calculated using Spearman’s correlation coefficients.
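As an illustration of the reliability model named above, the following sketch computes an intraclass correlation coefficient for a two-way random effects model with absolute agreement from the usual ANOVA mean squares. It is a sketch under assumptions rather than a reproduction of the Stata analysis: the text does not say whether single-measure or average-measure coefficients were reported, so the single-measure form is assumed, and the function name and example scores are hypothetical.

```python
def icc_two_way_random_absolute(ratings):
    """Single-measure ICC, two-way random effects, absolute agreement.

    ratings: list of rows, one row per scored subject (e.g. tissue core),
    each row holding one score per rater. All rows need the same length.
    """
    n = len(ratings)       # number of subjects
    k = len(ratings[0])    # number of raters
    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    # Two-way ANOVA sums of squares without replication.
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))

    # Rater variance stays in the denominator because absolute agreement,
    # not mere consistency, is being assessed.
    denominator = ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    return (ms_rows - ms_error) / denominator


if __name__ == "__main__":
    # Hypothetical 0-3 scores from three raters on five cores.
    scores = [[0, 1, 0], [2, 2, 3], [1, 1, 1], [3, 3, 2], [2, 1, 2]]
    print(round(icc_two_way_random_absolute(scores), 3))
```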

For normally distributed data, the independent samples t-test was used to compare means between two independent groups, and one-way ANOVA for three or more groups. For variables where the assumption of normality was not met, the Mann–Whitney U test was used to compare medians between two independent groups, and the Kruskal–Wallis test for three or more groups. The chi-squared test was used to check for associations between categorical variables. Logistic regression was used to model associations of miR-20a-5p expression levels in different compartments with breast cancer risk factors and clinicopathological parameters. The results from the logistic regression analyses were presented as odds ratios (OR) with 95% confidence intervals (CI). In the absence of prior knowledge regarding potentially confounding factors, no adjustments were made in the statistical analysis.
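For an unadjusted logistic regression with a single binary predictor, the estimated OR equals the cross-product ratio of the corresponding 2 × 2 table, and the Wald 95% CI follows from the standard error of the log OR. The sketch below illustrates that special case only; the function name and the counts are hypothetical and are not taken from the study data.

```python
import math


def odds_ratio_wald_ci(a, b, c, d, z=1.96):
    """Odds ratio and Wald confidence interval from a 2 x 2 table.

    a: exposed with outcome      b: exposed without outcome
    c: unexposed with outcome    d: unexposed without outcome
    All four counts must be greater than zero.
    """
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR), as obtained from the logistic model.
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


if __name__ == "__main__":
    # Hypothetical counts, for illustration only.
    estimate, lower, upper = odds_ratio_wald_ci(10, 20, 5, 40)
    print(f"OR {estimate:.2f}, 95% CI {lower:.2f}-{upper:.2f}")
```

A CI that excludes 1 corresponds to a Wald test that is significant at the 5% level, which is how the ORs in the Results can be read.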

We examined the relationship between miR-20a-5p expression levels and anthropometric, lifestyle and reproductive factors, using smoking, alcohol, parity, BMI, and menopausal status at diagnosis as predictor variables, with miR-20a-5p expression as the outcome. In analyzing the relationship between miR-20a-5p expression levels and clinicopathological factors, we used miR-20a-5p expression as the predictor variable for all analyses except tumor grade and molecular subgroup, which were used as the predictor variables with miR-20a-5p expression as the outcome.

Restricted cubic splines with four knots were used to assess possible non-linear relationships between stromal, cytoplasmic and nuclear miR-20a-5p expression and the outcome variables relapse, lymph node metastasis and distant metastasis. For all three compartments, the knots were positioned at the 5th, 35th, 65th and 95th percentiles of the mean miR-20a-5p scoring values, following Harrell’s recommended percentiles31. A Wald-type test was used to assess whether the coefficients of the second and third spline terms were equal to zero.
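To make the spline construction concrete, the sketch below builds a restricted cubic spline basis in the truncated power form, with knots at the 5th, 35th, 65th and 95th percentiles. With four knots the basis has three terms: the linear term plus two non-linear terms, so testing the two non-linear coefficients jointly against zero is a test of non-linearity. This is an illustration under assumptions: the function names are hypothetical, the percentile interpolation rule may differ from the one used by Stata, the scaling of the non-linear terms is a common convention rather than something stated in the text, and the knots are assumed to be distinct.

```python
from statistics import quantiles


def harrell_knots(values):
    """Knots at the 5th, 35th, 65th and 95th percentiles of the values."""
    cuts = quantiles(values, n=100, method="inclusive")  # 99 cut points
    return [cuts[p - 1] for p in (5, 35, 65, 95)]


def rcs_basis(x, knots):
    """Restricted cubic spline basis for one value of x.

    Returns [x, s_1, ..., s_(k-2)] for k knots. The non-linear terms are
    zero below the first knot and linear in x above the last knot.
    """
    def cubed_positive_part(u):
        return max(u, 0.0) ** 3

    last, second_last = knots[-1], knots[-2]
    gap = last - second_last
    scale = (last - knots[0]) ** 2  # keeps terms on the scale of x
    terms = [x]
    for knot in knots[:-2]:
        term = (
            cubed_positive_part(x - knot)
            - cubed_positive_part(x - second_last) * (last - knot) / gap
            + cubed_positive_part(x - last) * (second_last - knot) / gap
        )
        terms.append(term / scale)
    return terms


if __name__ == "__main__":
    # Hypothetical mean scoring values between 0 and 3.
    scores = [i / 10 for i in range(31)]
    knots = harrell_knots(scores)
    print(knots)
    print(rcs_basis(2.0, knots))
```

The resulting columns would then enter a logistic regression model in place of the raw score.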

For the functional in vitro studies, differences between transfected and control experiments were analyzed using independent samples t-tests.

Statistical analyses were done in STATA/MP version 17.0 (Stata Corp, College Station, TX, USA), GraphPad Prism 9 (GraphPad Software, Boston, MA, USA) and Microsoft Excel (Microsoft Office 365, Microsoft Corp., Redmond, WA, USA). A p-value below 0.05 was considered statistically significant for all analyses.

Language revision

ChatGPT version 4o (OpenAI Inc, San Francisco, CA, USA) was used to improve grammar, sentence structure, and readability of minor parts of the manuscript. All AI-generated suggestions were critically evaluated to ensure that the original meaning of the text was preserved. Final language revision was done by a native English speaker.

Results

Patient characteristics

Patient characteristics for CAMO cohort participants have previously been described in detail25. Characteristics for the subset of patients used in this study are presented in Table 1. In short, the median age at diagnosis was 56 years, the most common tumor grade was grade 2 (42.3%), and most tumors were of the luminal A subtype (62.2%). A total of 40 (12.8%) tumors were of the basal-like subtype. Median tumor diameter was 16 mm, and 32.0% of women underwent mastectomy. A total of 258 (82.4%) tumors were hormone receptor (HR) positive, and 38 (12.2%) were HER2 positive. At diagnosis, 30.2% had lymph node metastasis. At any point during follow-up, 25 (8.0%) experienced locoregional relapse and 34 (11.0%) experienced distant metastasis. The mean follow-up time was 165.6 months.

Table 1 Patient characteristics.

MiRNA expression in tumor tissue

The scoring agreement between the three researchers was as follows: stroma 67%; cytoplasm 80%; and nucleus 78%. We observed a high positive correlation between scoring values in nucleus and stroma (rs = 0.71), cytoplasm and stroma (rs = 0.67), and cytoplasm and nucleus (rs = 0.67). The proportion of tumors classified as having high expression of miR-20a-5p was as follows: stroma 14.9%; cytoplasm 23.4%; and nucleus 57.7% (Table 2). The expression of miR-20a-5p was high in all compartments in 9% of the tumors.

Table 2 MiR-20a-5p expression levels in different compartments.

Associations of miR-20a-5p expression with breast cancer risk factors and clinicopathological features

MiR-20a-5p expression levels were assessed in relation to demographic, anthropometric, lifestyle, reproductive and clinicopathological factors (Tables 3, 4, 5, 6, 7).

Table 3 Associations of miR-20a-5p expression levels with breast cancer risk factors.
Table 4 Associations of miR-20a-5p expression levels with clinicopathological features.
Table 5 Logistic regression analysis of miR-20a-5p expression levels in stroma, in relation to clinicopathological parameters and breast cancer risk factors.
Table 6 Logistic regression analysis of miR-20a-5p expression levels in cytoplasm, in relation to clinicopathological parameters and breast cancer risk factors.
Table 7 Logistic regression analysis of miR-20a-5p expression levels in nucleus, in relation to clinicopathological parameters and breast cancer risk factors.

For tumor stromal fibroblasts (Table 5), we observed a significantly lower median age at diagnosis in women with high expression of miR-20a-5p compared to the group with low expression (p = 0.009). Stromal miR-20a-5p expression was significantly associated with menopausal status (p = 0.001), Ki67 expression (p = 0.041), and relapse (p = 0.013). Logistic regression revealed an association between stromal miR-20a-5p expression and menopausal status (OR 2.81, 95% CI 1.48–5.33), indicating that women who were premenopausal at diagnosis had almost three times higher odds of having high stromal miR-20a-5p expression compared to postmenopausal women. A high Ki67 expression was associated with two-fold higher odds of having a high stromal miR-20a-5p expression, compared to low Ki67 expression (OR 2.02, 95% CI 1.02–4.01). Notably, we found three-fold higher odds of relapse associated with high compared to low stromal miR-20a-5p expression (OR 3.02, 95% CI 1.22–7.49).

For cancer cell cytoplasm (Table 6), we observed a significantly lower median age at diagnosis (p = 0.049) in women with high expression of miR-20a-5p compared to the group with low expression. Cytoplasmic miR-20a-5p expression was significantly associated with menopausal status (p = 0.016) and tumor grade (p = 0.047). Women who were premenopausal at diagnosis had higher odds of having a high cytoplasmic miR-20a-5p expression, compared to postmenopausal women (OR 1.95, 95% CI 1.13–3.37). Although not significant at the 5% level, our results suggest that a high cytoplasmic miR-20a-5p expression may be associated with a two-fold increase in odds of having a basal-like subtype (OR 2.00, 95% CI 0.96–4.17). Similarly, suggestive associations were observed between high cytoplasmic miR-20a-5p expression and high Ki67 expression (OR 1.74, 95% CI 0.97–3.11) and, conflictingly, lower odds of lymph node metastasis (OR 0.57, 95% CI 0.31–1.06).

For cancer cell nucleus (Table 7), we observed an association with BMI (p = 0.033) and lymph node metastasis (p = 0.038). Logistic regression revealed an inverse association between tumor size and nuclear miR-20a-5p expression (OR 0.81, 95% CI 0.67–0.99), indicating that women with smaller tumors had increased odds of a high nuclear miR-20a-5p expression. There was also an inverse relationship between nuclear miR-20a-5p expression and odds of lymph node metastasis (OR 0.60, 95% CI 0.37–0.97), indicating that women with a high nuclear miR-20a-5p expression had 40% lower odds of lymph node metastasis.

The combined expression of miR-20a-5p in cancer cell nucleus and cytoplasm was significantly associated with menopausal status (p = 0.028).

The associations between the miR-20a-5p scoring values from each tissue- and subcellular compartment and the study outcomes relapse, lymph node metastasis and distant metastasis, estimated from restricted cubic splines models are presented in Fig. 3. We identified a non-linear association between stromal miR-20a-5p expression level and odds of relapse (p = 0.029, Fig. 3). For scoring values up to 2, there was a lower OR for relapse compared to a scoring value of 0. For scores higher than 2, there was an increasing OR for relapse with higher stromal miR-20a-5p expression.

Fig. 3

Restricted cubic splines for miR-20a-5p expression levels in different compartments, in relation to clinical outcomes. Spline regression models for stromal, cytoplasmic, and nuclear miR-20a-5p expression levels (scoring values, x-axis) in relation to odds of relapse, lymph node metastasis, and distant metastasis (y-axis). Solid lines: odds ratio, dashed lines: 95% confidence interval.

Functional in vitro studies

None of the cell lines showed an increase in proliferation upon transfection with miR-20a-5p (Supplementary Information, Fig. S1), according to the proliferation assay. In contrast, wound healing assays demonstrated increased cell migration after 24 h in all cell lines overexpressing miR-20a-5p, compared to controls (Fig. 4). Finally, invasion assays demonstrated that miR-20a-5p overexpression increased invasiveness in both SK-BR-3 and MCF-7 cells, compared to negative controls (Fig. 5).

Fig. 4

Migration assays in three cell lines. (a) Wound healing assay to assess the effects of miR-20a-5p transfection on migration in the breast cancer cell lines SK-BR-3, MDA-MB-231, and MCF-7. Size bars indicate 100 µm. (b) The box plots represent the mean distance of migrated cells (µm ± SEM) in three independent experiments. *Significantly different from control.

Fig. 5

Invasion assays in three cell lines. (a) Transwell assay to assess the effects of miR-20a-5p transfection on invasiveness in the breast cancer cell lines SK-BR-3, MDA-MB-231, and MCF-7. Size bars indicate 100 µm. (b) The box plots represent the mean number of cells (± SEM) in three independent experiments. *Significantly different from control.

Discussion

In this study, we investigated the expression levels and in vitro functional role of miR-20a-5p in breast cancer. The expression level was evaluated in three separate tissue- and subcellular compartments in breast cancer surgical specimens, and we explored its associations with breast cancer risk factors and clinicopathological features. Further, three breast cancer cell lines were used to explore the effect of miR-20a-5p on proliferation, migration and invasion.

Our main findings point to a potential role of miR-20a-5p in more aggressive tumors; however, the associations vary according to cell type and subcellular compartment. Overall, the in vitro experiments support these findings.

This study demonstrates an association between stromal miR-20a-5p expression and a more aggressive cancer. Women with high stromal miR-20a-5p expression had significantly increased odds of relapse in logistic regression analysis. This association was confirmed using restricted cubic splines, which additionally revealed a non-linear trend. Specifically, ORs for relapse were low for scoring values up to 2 and increased progressively for higher scores (Fig. 3). Despite the wide CI observed in this analysis, the findings remain intriguing and warrant further investigation. The wide CI may be attributed to several factors, including the relatively small sample size, few cases of relapse, and biological variability. Moreover, compared to low expression, tumors with a high stromal miR-20a-5p expression had significantly higher odds of a high Ki67 expression. These findings are particularly intriguing, given the well-established significance of the tumor microenvironment in breast cancer progression32.

In lung cancer, it has been demonstrated that miR-20a is transferred from cancer-associated fibroblasts (CAFs) to tumor cells through exosomes33. Exosomes are small membrane-enclosed particles that can be secreted by various cell types and internalized by cancer cells, thereby facilitating intercellular communication within the tumor microenvironment34. CAF-derived, exosomal miR-20a was shown to upregulate PD-L1 and inhibit PTEN, thereby promoting proliferation and chemoresistance in lung cancer cells33. Similar mechanisms have been observed in breast cancer. CAF-derived exosomal miRNAs have been directly linked to ER-repression in breast cancer cells35, and to hormonal therapy resistance in luminal breast cancer models36. Furthermore, miR-20a-5p has been identified as highly expressed in exosomes from MDA-MB-231 breast cancer cells, and has been shown to promote osteoclast proliferation and differentiation through exosome-mediated transfer and binding to target genes, suggesting a role in bone metastasis via crosstalk between tumor cells and the bone microenvironment15. Hence, in vitro studies suggest that crosstalk between fibroblasts and tumor cells, facilitated by exosome-mediated miRNA transfer, may provide a possible explanation for the observed association between high stromal expression of miR-20a-5p and a potentially more aggressive tumor phenotype.

Interestingly, our results demonstrate that cytoplasmic expression of miR-20a-5p in cancer cells is associated with tumor grade, and that high cytoplasmic miR-20a-5p may be associated with high Ki67 and increased odds of a basal-like subtype, which is generally recognized as more aggressive and difficult to treat37. Although the latter associations were not significant, they align well with existing literature. In a previous study, we found significantly elevated levels of miR-20a-5p in high grade tumors and in triple-negative breast cancer compared to other subtypes using miRNA microarray and quantitative polymerase chain reaction (qPCR)11. Additionally, c-myc, a well-established transcriptional regulator of the miR-17-92 cluster to which miR-20a-5p belongs, has been found to be upregulated in basal-like breast cancers38,39. In contrast, our results also indicate that a high cytoplasmic miR-20a-5p may be associated with lower odds of lymph node metastasis, contradicting the hypothesis that elevated cytoplasmic miR-20a-5p expression may be associated with a more aggressive tumor phenotype. However, the associations with Ki67, basal-like subtype and lymph node metastasis did not reach statistical significance at the 5% level, and further research with larger cohorts is warranted to better understand the underlying mechanisms and validate the observed associations.

In the cancer cell nucleus, miR-20a-5p expression was inversely associated with tumor size and lymph node metastasis, suggesting a potentially less aggressive tumor phenotype in women with high nuclear miR-20a-5p expression. This contrasts with findings from the cancer cell cytoplasm, which suggested a possibly more aggressive tumor phenotype. There is increasing evidence of miRNAs that localize and have specific functions in the nucleus20,21,22,40, and that transportation across the nuclear membrane can regulate miRNA storage and function41. Considering the observed discrepancy in tumor phenotype associations between compartments, it would be interesting to explore whether nuclear miR-20a-5p may have effects that differ from those of its cytoplasmic counterpart. Previous studies have already indicated that miRNAs may have different functions depending on tissue compartment and subcellular localization21,22. Nuclear miR-20a-5p could be sequestered or in an inactive form, although this possibility has not yet been investigated to our knowledge. The nuclear functions of miRNAs in general, and miR-20a-5p in particular, remain elusive and more research on tissue- and subcellular localization and function of miR-20a-5p is needed.

In terms of associations with demographic, anthropometric, lifestyle and reproductive factors, we observed a significantly lower median age at diagnosis among women with high stromal expression of miR-20a-5p and those with high cytoplasmic expression of miR-20a-5p. Furthermore, menopausal status at diagnosis was associated with stromal and cytoplasmic miR-20a-5p expression, and with combined miR-20a-5p expression in cytoplasm and nucleus. Logistic regression analysis revealed increased odds of high stromal and cytoplasmic miR-20a-5p expression in premenopausal women compared to postmenopausal women. This observation is biologically plausible, as premenopausal women have higher levels of circulating estradiol, and estradiol induces expression of transcription factors such as c-myc42 and E2F143, which in turn regulate transcription of several miRNAs, including miR-20a-5p44,45. More generally, miR-20a-5p, amongst others, has been identified as downregulated in cellular models of aging46. Moreover, we found that nuclear miR-20a-5p expression was associated with BMI. This observation is supported by recent studies, which indicate that the inflammation caused by adipose tissue may influence miRNA expression levels, and that miRNAs may be mediators of the effect of obesity on breast cancer development and progression47,48.

Our in vitro assays demonstrated that increased expression of miR-20a-5p led to increased migration in all breast cancer cell lines and increased invasiveness in two out of three cell lines compared to controls. These results are in accordance with findings from cancer cell cytoplasm: cytoplasmic expression of miR-20a-5p in cancer cells was associated with tumor grade, and tumors with high versus low cytoplasmic miR-20a-5p expression showed a trend towards higher odds of a basal-like subtype and high Ki67, potentially indicating a more aggressive tumor type. Our findings also support those of Bai et al., who reported that overexpression of miR-20a-5p in TNBC cells led to increased migration and invasion in vitro16. Similarly, Guo et al. found that miR-20a-5p promoted migration and invasion in MDA-MB-231 cells15. In contrast, Zhao et al. reported a reduction in invasive capabilities following miR-20a-5p transfection17. Variations in assay techniques and cell culture conditions may contribute to the discrepancies between study results.

There was no increase in proliferation following miR-20a-5p transfection in vitro, seemingly contradicting our finding from cancer cell cytoplasm where high cytoplasmic expression was associated with a non-significant increase in odds of high Ki67.

Overall, the cell studies support a possible association between high miR-20a-5p expression and a more aggressive disease course.

Strengths

By combining data obtained within the natural tissue context with long follow-up time, epidemiological and clinicopathological data, and functional in vitro experiments, we comprehensively evaluate and integrate several aspects of miR-20a-5p’s expression and function in breast cancer.

One of the major strengths of our study lies in the information about miRNA localization obtained by ISH. The main techniques currently used for miRNA quantification are qPCR, microarray analysis, next-generation sequencing, Northern blotting, and isothermal amplification24, none of which provide information on the subcellular localization of miRNAs. To our knowledge, no previous studies have evaluated the expression of miR-20a-5p in tumor stromal fibroblasts, cancer cell cytoplasm and cancer cell nuclei and examined their associations with the clinical endpoints relapse, lymph node metastasis and distant metastasis, as well as other clinicopathological, reproductive and lifestyle factors. By identifying the precise locations of miRNAs, we gain valuable insights into their potential functional roles. Different subcellular locations could indicate distinct regulatory mechanisms and target interactions. The canonical understanding is that miRNAs carry out their function in the cytoplasm by targeting mRNAs to inhibit their translation or promote their degradation post-transcriptionally. However, we observed nuclear expression of mature miR-20a-5p, indicating a possible role in transcriptional regulation or other nuclear processes. Furthermore, identifying the precise locations of miRNAs may facilitate biomarker development, as miRNA expression patterns within specific tissue and subcellular compartments can serve as diagnostic or prognostic indicators. Lastly, understanding the delivery sites of miRNAs is a prerequisite for the development of effective anti-miRNA therapeutic strategies49. Importantly, we observed a high inter-observer agreement in this evaluation, underscoring the reliability of our assessment methodology and enhancing the credibility of the reported findings.

Other strengths include the long follow-up time and the use of restricted cubic splines to allow for flexible modeling of non-linear associations. By using this statistical method, we were able to visualize the complex relationship between stromal miR-20a-5p expression and relapse.
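To illustrate how restricted cubic splines allow a non-linear association to be modeled while remaining linear in the tails, the following is a minimal sketch of the commonly used truncated-power form of the spline basis. It is not the code used in this study: the function name `rcs_basis` and the knot positions in the usage note are hypothetical, and the number and placement of knots in the actual analysis are not specified here.

```python
def rcs_basis(x, knots):
    """Return one row of a restricted cubic spline basis for the value x.

    The row holds the linear term followed by len(knots) - 2 non-linear
    terms. Knot positions are supplied by the caller (an assumption here).
    """
    t = sorted(knots)
    k = len(t)
    if k < 3:
        raise ValueError("need at least three knots")

    def pos3(u):
        # Truncated cubic: u^3 when u is positive, otherwise 0.
        return u ** 3 if u > 0 else 0.0

    span = t[-1] - t[-2]            # distance between the last two knots
    scale = (t[-1] - t[0]) ** 2     # keeps all terms on the scale of x

    row = [float(x)]
    for j in range(k - 2):
        # The two correction terms cancel the cubic and quadratic parts
        # beyond the last knot, so the curve is linear in both tails.
        term = (pos3(x - t[j])
                - pos3(x - t[-2]) * (t[-1] - t[j]) / span
                + pos3(x - t[-1]) * (t[-2] - t[j]) / span)
        row.append(term / scale)
    return row
```

With hypothetical knots at 1, 2, 3 and 4, `rcs_basis(2.5, [1, 2, 3, 4])` returns three columns that could be entered as covariates in a regression model in place of a single linear expression term. Below the first knot the non-linear columns are zero, and above the last knot they increase linearly, which is what restricts the fitted curve in the tails.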

Limitations

Lack of reproducibility is a concern due to the semiquantitative scoring method and the absence of an established biologically relevant cut-off value. The subjective nature of scoring and potential inter-observer variability can affect the accuracy and reproducibility of our findings. Additionally, the use of small tissue cores to evaluate large tumors may not capture tumor tissue heterogeneity.

Although surrogate markers are convenient and accessible indicators of molecular characteristics, they do not always accurately reflect the true molecular subtype, as determined by advanced gene expression analysis50,51,52. Thus, the use of surrogate markers to classify breast cancer into molecular subgroups may lead to misclassification. However, the surrogate markers used in our study reflect the prognostic and predictive markers commonly used in clinical practice to categorize and stratify tumors, underlining the translational relevance of our observations.

Furthermore, some of our analyses were underpowered due to a low number of participants in certain subgroups, specifically nulliparous women, women with obesity, those who had tumor relapse, those who had distant metastasis, and those with cancers of some subtypes. This may have limited our ability to detect significant associations. As such, our findings warrant further investigation in larger studies.

Additionally, while BMI and physical activity have been previously validated53,54, some unknown degree of misclassification of other lifestyle and reproductive factors is likely, as the data on these factors were collected from self-administered questionnaires. There is also a time gap between the questionnaire data and clinical data collection for some participants, potentially leading to misclassification of the questionnaire responses. Moreover, some women (17.3%) completed the questionnaire after their breast cancer diagnosis, and we assumed that the reported information corresponded to that at the time of cancer diagnosis, which may not always be accurate.

Regarding the in vitro functional analyses, certain limitations arise from using cell lines originating from metastatic sites. Because these cell lines derive from cells that had already metastasized, our observations predominantly reflect the characteristics and behavior of cells in the metastatic phase, rather than in the primary tumor. Further, our cell studies do not consider the tumor-stroma interplay and the role of the tumor microenvironment in tumor development and progression.

Conclusions

While most of our results point towards an oncogenic role, some of our findings indicate that miR-20a-5p may have diverse effects based on tissue compartment and subcellular location.

The associations of stromal miR-20a-5p expression with relapse and high Ki67 expression, and of cytoplasmic miR-20a-5p with tumor grade, and possibly with high Ki67 expression and the basal-like subtype, suggest that miR-20a-5p may have oncogenic properties in the tumor microenvironment and in cancer cells. Conversely, the associations of nuclear miR-20a-5p expression with smaller tumors and with decreased odds of lymph node metastasis suggest a protective role. Thus, nuclear versus stromal and cytoplasmic miR-20a-5p expression may have opposing effects on breast cancer progression.

Taken together, our findings on miR-20a-5p, especially its differential expression in various tissue and subcellular compartments, contribute to the evolving landscape of precision medicine by identifying it as a potential biomarker for targeted therapies in breast cancer. However, establishing new biomarkers requires several phases55. The present study contributes to the first phase, preclinical exploratory studies, and our findings may help prioritize future research to bridge the gap between current knowledge and the clinical utility of miR-20a-5p in breast cancer. In the broader context, harnessing the potential of new biomarkers requires ethical, legal, and social considerations, as well as prioritization from policymakers2.

Our findings warrant further research in larger studies on miR-20a-5p’s expression levels and functions within different tissue and subcellular locations, and on its potential clinical utility as a biomarker, treatment target or treatment tool in breast cancer.