Introduction

Coronavirus disease 2019 (COVID-19), caused by severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), resulted in a pandemic that required significant global efforts to understand the virus and the clinical course of the disease [1]. Most infected individuals exhibit mild clinical manifestations resembling those of a cold, including fever, cough, malaise, myalgia, headache, and disturbances in taste and smell. However, some individuals develop severe pneumonia and acute respiratory distress syndrome (ARDS) [1,2].

The pattern of clinical manifestations has evolved throughout the pandemic, primarily due to two factors: (1) the development of vaccines against COVID-19 and (2) the emergence of new variants and subvariants. Furthermore, socio-economic disparities and sanitation inequalities have been associated with worse health outcomes [3,4]. To mitigate the public health impact of COVID-19, coordinated global actions were essential, including the implementation of non-pharmaceutical interventions and vaccine development.

In Brazil, vaccination began in January 2021, leading to a significant reduction in mortality. However, vaccine-induced protection waned over time: protection against death decreased by 50% within 20 weeks after the initial vaccination series, and protection against severe disease declined by 25% after 19 weeks [5]. Despite the introduction of vaccines and the consequent increase in vaccination coverage, SARS-CoV-2 remains a global concern because of the multiple waves of COVID-19 that occurred during the pandemic, driven largely by the emergence of new variants [6]. Furthermore, individual-related factors are crucial for disease outcomes, making it essential to understand the underlying mechanisms of the infection and their implications for host response and long-term consequences. In this context, microRNAs (miRNAs) may be critical for elucidating the diverse clinical manifestations of COVID-19 and hold potential as biomarkers for disease progression and severity [7].

Other small non-coding RNAs, such as small nuclear RNAs (snRNAs), may be similarly informative. Accurate detection and quantification of miRNAs in biological fluids, such as plasma, provide insights into the dynamics of infection and host response. These short (~ 18–25 nucleotides), single-stranded molecules regulate gene expression post-transcriptionally and may directly influence immune and inflammatory responses to infectious diseases such as COVID-19 [8]. When released into the extracellular space, miRNAs are encapsulated in microvesicles or exosomes, or bound to high-density lipoproteins or protein complexes [9]. As a result, they remain stable in various body fluids, such as serum, plasma, saliva, urine, and milk, which makes them promising non-invasive biomarkers for molecular diagnosis and prognosis [10,11]. This potential has been reported across various diseases [12,13], including cancer, diabetes [14,15], cardiovascular diseases [16], and COVID-19 [17].

Although miRNAs are stable in body fluids, their precise quantification remains sensitive to multiple external factors. The quality and quantity of the analyzed material may be affected by sample collection, storage, and RNA extraction, as well as by the performance of reverse transcription (RT) and the polymerase chain reaction (PCR). Such variation can bias miRNA quantification, compromising the use of miRNAs as molecular markers and affecting the functional evaluation of differentially expressed miRNAs. Accuracy and consistency in miRNA quantification are therefore fundamental for obtaining reliable results in studies that use these molecular markers [18,19].

The interpretation of miRNA expression data requires appropriate normalization using endogenous reference genes, a critical step that can significantly influence the results. In this context, the selection of suitable reference genes remains a challenge, particularly in plasma samples from COVID-19 patients. This study aims to evaluate the stability of five small RNAs (snRNAs and miRNAs) as potential endogenous normalizers, contributing to the validation of robust analytical methods for expression studies in clinical contexts.

Results

Selection of candidate reference small RNAs (snRNAs and miRNAs) and stability analysis

The study population consisted of 48 individuals, with a mean age of 39.02 years (median = 36.0; range, 18 to 76 years). Female participants (29; 60.42%) outnumbered male participants (19; 39.58%). Regarding clinical status, most participants did not require hospitalization (45; 93.75%), while 3 (6.25%) were hospitalized (Table 1). The initial selection of candidate small RNAs and their stability analysis were performed by evaluating the expression of five candidate genes, two small nuclear RNAs (RNU6B and snRNA U6) and three miRNAs (miR-320a, miR-342-3p, and miR-328-3p), in plasma samples from the case (n = 24) and control (n = 24) groups. One control sample was excluded from the final analysis because the plasma volume collected was insufficient. Although we acknowledge the limitation of our sample size, we chose to focus exclusively on COVID-19 patients and viral-negative controls to ensure a homogeneous population that reflects the unique molecular alterations associated with SARS-CoV-2 infection. Including samples from other respiratory diseases, while potentially enhancing generalizability, was beyond the scope of this study.

Table 1 Characteristics of the studied population (n = 48) according to the year of sample collection.

Expression analysis was based on the quantification cycle, evaluating the mean cycle threshold (Ct). All datasets of input RNA and inter-run calibrator-normalized Ct values were analyzed with the NormFinder software, which estimates intra- and inter-group variability. RNU6B, a small nuclear RNA (snRNA) commonly used as a normalizer for intracellular and circulating miRNA expression analyses, exhibited low expression levels and poor homogeneity among plasma samples: its amplification was undetectable in 95.8% of the plasma samples tested in duplicate (data not shown). This analysis also indicated that, among the candidate genes, snRNA U6 presented the most stable expression (stability value = 0.298) (Fig. 1).

Fig. 1 Analysis of endogenous normalizer stability using the NormFinder algorithm.

Additionally, RefFinder software was used to analyze the same data set. This tool combines the geNorm, BestKeeper, comparative ΔCt and NormFinder algorithms for reference gene selection. Stability analysis conducted with BestKeeper indicated that snRNA U6 presented the most stable expression in the group, followed by miR-320a, miR-328-3p and miR-342-3p, in that order (Fig. 2a). miR-342-3p thus displayed the highest stability value, suggesting that it is the least suitable normalizer among the candidates in this analysis. The comparative ΔCt method (Fig. 2b), the geNorm analysis (Fig. 2c) and the online version of NormFinder included in RefFinder (Fig. 2d) all produced the same ranking: snRNA U6 was the most stable normalization candidate, followed by miR-342-3p, miR-320a and miR-328-3p. When all algorithms were combined by calculating the mean rank of each of the four candidate internal normalizers, snRNA U6 ranked best, followed by miR-342-3p, miR-320a and miR-328-3p (Fig. 2e).
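The comparative ΔCt ranking can be illustrated with a minimal sketch. The Ct values below are invented for illustration and are not data from this study, and the function name is hypothetical. For each pair of candidates, the sketch takes the per-sample Ct difference, computes its standard deviation across samples, and scores each candidate by its mean standard deviation over all pairings, with lower scores indicating more stable expression.

```python
from itertools import combinations
from statistics import mean, stdev

def delta_ct_stability(ct):
    """Rank candidates by the comparative delta-Ct approach.

    ct maps each candidate name to a list of Ct values, one per sample,
    with samples in the same order for every candidate.
    Returns (name, mean SD) pairs sorted from most to least stable.
    """
    pair_sd = {name: [] for name in ct}
    for a, b in combinations(ct, 2):
        # Per-sample Ct difference between the two candidates
        diffs = [x - y for x, y in zip(ct[a], ct[b])]
        sd = stdev(diffs)
        pair_sd[a].append(sd)
        pair_sd[b].append(sd)
    scores = {name: mean(sds) for name, sds in pair_sd.items()}
    return sorted(scores.items(), key=lambda item: item[1])

# Illustrative Ct values only (hypothetical, not study data)
example_ct = {
    "snRNA U6": [27.0, 27.2, 26.9, 27.1],
    "miR-320a": [30.0, 31.5, 29.0, 32.0],
    "miR-328-3p": [31.0, 33.0, 30.5, 34.0],
    "miR-342-3p": [29.0, 29.8, 28.5, 30.2],
}

for name, score in delta_ct_stability(example_ct):
    print(f"{name}: mean SD of delta-Ct = {score:.3f}")
```

A candidate whose Ct moves in parallel with the others across samples yields small pairwise standard deviations, which is why this score is read as a stability measure.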

Currently, RefFinder is the only web-based platform designed to compare and assess housekeeping genes as potential reference genes. This tool consolidates four commonly used computational methods (geNorm, NormFinder, BestKeeper, and the comparative ΔCt method) into an online resource for evaluating the stability and reliability of reference genes. The stability rankings from each method are used to assign appropriate weights to the genes, and the geometric mean of these weights is calculated to determine the final ranking. Additionally, users can choose to employ a single program or a combination of the four to rank candidate reference genes [20].
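The geometric-mean aggregation described here can be sketched as follows. The per-method ranks are taken from the orderings reported for each algorithm in this section, under the assumption that each method assigns ranks 1 to 4 with no ties. The function name is hypothetical, and RefFinder's internal weighting may differ in detail.

```python
from statistics import geometric_mean

def aggregate_ranks(ranks_by_method):
    """Combine per-method ranks into one ordering via the geometric mean."""
    genes = next(iter(ranks_by_method.values())).keys()
    scores = {
        gene: geometric_mean([ranks[gene] for ranks in ranks_by_method.values()])
        for gene in genes
    }
    # Lower geometric mean of ranks means more stable overall
    return sorted(scores.items(), key=lambda item: item[1])

# Ranks implied by the reported orderings (assumption: no ties)
consensus_order = {"snRNA U6": 1, "miR-342-3p": 2, "miR-320a": 3, "miR-328-3p": 4}
ranks_by_method = {
    "BestKeeper": {"snRNA U6": 1, "miR-320a": 2, "miR-328-3p": 3, "miR-342-3p": 4},
    "delta-Ct": dict(consensus_order),
    "geNorm": dict(consensus_order),
    "NormFinder": dict(consensus_order),
}

overall = aggregate_ranks(ranks_by_method)
for gene, score in overall:
    print(f"{gene}: geometric mean rank = {score:.2f}")
```

Under these assumptions the combined ordering is snRNA U6, miR-342-3p, miR-320a and miR-328-3p, which agrees with the overall ranking reported for Fig. 2e.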

Fig. 2 Assessment of endogenous reference gene stability using the different algorithms: BestKeeper, ΔCt, geNorm and NormFinder.

Expression differences between case and control groups

Mean Ct values were compared between the two groups, because an internal reference gene should exhibit similar expression levels in diseased and healthy conditions. No significant differences were observed between the case and control groups in the mean expression of snRNA U6, miR-328-3p and miR-342-3p (all p > 0.05). However, a significant difference was identified between the two groups in the expression of miR-320a (Fig. 3).

Fig. 3 Comparative expression levels of normalization candidates in the case group (individuals with SARS-CoV-2 RNA detectable by RT-qPCR and respiratory symptoms) and the control group (individuals with SARS-CoV-2 RNA undetectable by RT-qPCR).

Discussion

The selection of a reliable internal reference gene (internal reference miRNA) is critical to ensure consistent expression across various health conditions and experimental groups. Normalization against stable reference genes is essential to remove technical variation and to ensure that the observed changes in miRNA expression are biological rather than technical or artifactual [21].

In this study, we analyzed two small nuclear RNAs (RNU6B and snRNA U6) and three miRNAs (miR-320a, miR-342-3p, and miR-328-3p) as candidate reference genes for future differential expression studies in the context of the COVID-19 pandemic. We used well-established statistical algorithms (comparative ΔCt method, geNorm, NormFinder, and BestKeeper) to evaluate their stability.

Despite the widespread adoption of normalization strategies, there is no definitive guideline or universal recommendation for normalization in all experimental conditions [18]. Typically, endogenous control strategies rely on constitutively expressed miRNAs, either individually or as a combination of several miRNAs. Alternatively, normalization may employ the global mean Cq value from all analyzed miRNAs.
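These strategies differ only in how the reference Cq is obtained. The following minimal sketch illustrates the three options for a single sample. The RNA names, Cq values and function name are hypothetical, and the reference Cq is assumed to be the arithmetic mean of the chosen controls.

```python
from statistics import mean

def delta_cq(sample_cq, target, references=None):
    """Return the normalized Cq (delta-Cq) of a target RNA in one sample.

    sample_cq maps each RNA name to its Cq in a single sample.
    references lists the endogenous controls; if None, the global mean Cq
    of all RNAs measured in the sample is used as the reference.
    """
    if references is None:
        ref_cq = mean(sample_cq.values())
    else:
        ref_cq = mean(sample_cq[name] for name in references)
    return sample_cq[target] - ref_cq

# Hypothetical Cq values for one plasma sample
sample = {"miR-x": 30.0, "miR-y": 28.0, "miR-z": 26.0}

single_ref = delta_cq(sample, "miR-x", references=["miR-z"])
combined_ref = delta_cq(sample, "miR-x", references=["miR-y", "miR-z"])
global_mean = delta_cq(sample, "miR-x")
print(single_ref, combined_ref, global_mean)
```

The same target gives a different ΔCq under each strategy, which is why the choice of normalizer has to be validated for each sample type.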

Initially, this study excluded RNU6B because of its low expression and poor homogeneity between samples, corroborating recent suggestions that it is not consistently expressed in plasma and serum samples [22]. RNU6B is a small nuclear RNA (snRNA) frequently used as a reference in miRNA quantification, although its primary function is related to the modification and maturation of ribosomal RNAs, and its expression may vary between pathophysiological conditions. It has been employed as a reference normalizer in several miRNA expression studies, including studies of plasma samples from individuals co-infected with HIV-1 and HCV [23] and from COVID-19 patients [24]. However, in other investigations that normalized target miRNA expression in plasma from tuberculosis patients, RNU6B proved to be an unstable marker owing to its high variability [25,26]. More recent work likewise found that RNU6B is not a stable marker in plasma samples, showing highly variable expression [19,27].

Similarly, snRNA U6, another frequently used reference gene, has shown variability across studies. It is an essential component of the spliceosome, involved in the removal of introns during pre-mRNA processing. It has been employed in miRNA expression analyses in various contexts, including digestive system diseases (for example, hepatocellular carcinoma, gastric carcinoma, liver cirrhosis, and hepatitis B) [28], as well as research into chronic kidney disease and nocturnal hypertension [29]. In this study, based on the quantification cycle (Cq), no significant differences were observed between the case and control groups in the average expression of snRNA U6 (p > 0.05). However, a study involving plasma from heart failure patients [30] revealed that snRNA U6 exhibited high variability, rendering it unsuitable as an endogenous control in that setting. This variability suggests that small RNAs traditionally used for normalization may not be suitable under all conditions, and that their stability should be analyzed in each context [21,31].

RNU6B and U6 are small nuclear RNAs (snRNAs) frequently employed as endogenous reference genes in miRNA expression studies, owing to their presumed constitutive expression and functional relevance in RNA processing. U6, in particular, is a core component of the spliceosome and plays an essential role in pre-mRNA splicing [32,33,34]. RNU6B, a transcript variant related to snRNA U6, has been widely used as a normalizer in RT-qPCR protocols. However, its expression has been shown to vary under different physiological and pathological conditions [32]. Moreover, studies on cancer and extracellular RNA have emphasized that even traditionally accepted reference RNAs such as RNU6B may not be appropriate for all experimental settings [35]. Recent studies emphasize that differences in the biogenesis and functions of these two RNAs may lead to variations in stability and expression under different biological contexts. Such distinctions underpin the rationale for considering these RNAs separately as normalizers in miRNA expression analyses, thereby enhancing the robustness of our results [32,33,36,37].

For miR-320a, a significant difference in expression was observed between the groups, suggesting instability. miR-320a has commonly been used as a biomarker in various diseases because of its differential expression [14,37,38,39], but not usually as a reference gene for miRNA normalization, and its expression has been shown to vary across contexts [37,40,41]. These data raise the question of whether miR-320a could serve as a differential marker in the context of COVID-19.

The analysis of the average expression of miRNAs based on cycle threshold (Ct) values [42] can be limiting, especially in studies comparing different health and disease conditions. Indeed, the exclusion of missing values, which is common in this type of analysis, can bias the results toward a favorable mean expression and mask the true heterogeneity of the data. The lack of significant differences between groups for some candidates may be attributed to variations in the concentrations of these miRNAs, which could be low; in addition, these miRNAs may exhibit greater stability in different biological fluids.

To enhance the robustness of the analysis, stability algorithms are widely used, as they provide a more comprehensive assessment of gene expression stability. These algorithms allow the selection of reference genes that better reflect the intrinsic variability of miRNAs under different health and disease conditions [18,43,44]. Another strategy involves using non-coding reference genes or synthetic RNAs added as exogenous controls. While these can normalize technical variation in RNA isolation, they do not correct for fluctuations in sample collection and cannot improve test accuracy [31]. Therefore, endogenous genes are often preferred, and algorithms such as geNorm, NormFinder, and BestKeeper are employed to select the most stable reference genes, increasing the robustness of results [16].

In this study, we assessed the stability of each candidate gene and found that snRNA U6 is a suitable reference gene for qRT-PCR analysis in plasma samples from COVID-19 patients. This conclusion was supported by the consistent stability rankings produced by the four statistical algorithms (NormFinder, ΔCt, BestKeeper, and geNorm), which all identified snRNA U6 as the most stable reference gene for this sample set [20,45].

The selection of snRNA U6 is particularly relevant because its stability has been widely reported in various biological contexts, with consistent expression across different samples and conditions [8,25,35,46,47,48,49], supporting its use as a robust internal control in studies of circulating small RNAs.

The use of these algorithms increased the reliability of the analysis, offering a comprehensive assessment of the stability of gene expression and allowing the selection of reference genes that better captured the intrinsic variability of miRNAs in different health conditions. Furthermore, although snRNA U6 is widely described as a reference normalizer in various pathologies [10,46,47], reports on the most suitable normalizers for analyses in plasma from COVID-19 patients are scarce. This study therefore presents snRNA U6 as a stable reference gene for future differential expression analyses of miRNAs in plasma from COVID-19 patients.

This study provides a novel contribution by demonstrating, for the first time, that snRNA U6 can be utilized as an endogenous reference to normalize gene expression data for circulating miRNAs obtained from plasma of COVID-19 patients. The use of multiple established algorithms ensured a rigorous analysis, underscoring the methodological strength of our approach. However, sample size is a limitation of our study, emphasizing the need for future investigations with larger and more diverse cohorts to confirm the applicability of snRNA U6 across different clinical settings.

Conclusion

In conclusion, this study provides evidence for the use of snRNA U6 as a stable reference gene for miRNA expression analysis in plasma samples from COVID-19 patients, addressing the need for reliable internal controls in this setting. The combined use of the NormFinder, BestKeeper, ΔCt, and geNorm algorithms reinforces the robustness of snRNA U6 as a reference gene in this sample set. Future studies should incorporate larger and more diverse cohorts to further validate the stability of snRNA U6 and investigate the roles of miR-320a and other candidate miRNAs in COVID-19. These findings establish a valuable foundation for future research, potentially improving the accuracy of miRNA biomarker discovery in infectious diseases and contributing to advancements in diagnostic and therapeutic strategies.

Materials and methods

Study design and ethical considerations

This study was carried out from August 2021 to August 2022 with individuals who were tested for SARS-CoV-2 by RT-qPCR at the Laboratório de Farmacogenômica e Epidemiologia Molecular (LAFEM), located at the Universidade Estadual de Santa Cruz (UESC). LAFEM/UESC worked in partnership with the Central Public Health Laboratory Professor Gonçalo Moniz (LACEN–BA), supporting routine SARS-CoV-2 diagnostics in the southern region of Bahia State (Fig. 4), which was one of the main epicenters of the COVID-19 pandemic after the capital Salvador and its metropolitan region. After a nasopharyngeal swab was collected for viral detection, peripheral venous blood was also collected from individuals who agreed to participate in the study as volunteers. Informed written consent was obtained from all study participants. This study was conducted in accordance with the relevant guidelines and regulations and was approved by the research ethics committee of the State University of Santa Cruz (Universidade Estadual de Santa Cruz), Bahia, Brazil (approval number: CAAE 38627420.3.0000.5526).

Fig. 4 Map of Brazil and the state of Bahia, highlighting the area (cities in the southern micro-region) served by LAFEM/UESC as a campaign laboratory for SARS-CoV-2 detection.

Ethical considerations

The study was conducted in accordance with the guidelines of the Declaration of Helsinki and approved by the Research Ethics Committee of the State University of Santa Cruz under registration number CAAE: 38627420.3.0000.5526. Informed consent was obtained from all participants, and all information was treated confidentially and anonymised to guarantee the privacy of the participants.

Groups definition

Individuals with SARS-CoV-2 RNA detectable by RT-qPCR and respiratory symptoms were included in the “case” group. Individuals with undetectable results for SARS-CoV-2 by RT-qPCR and without respiratory symptoms were included in the “control” group. In both groups, only individuals who consented to undergo SARS-CoV-2 testing and venipuncture for blood sample collection were included (Fig. 5).

Fig. 5 Flow diagram of the research process.

Laboratory detection of SARS-CoV-2 infection

Nasopharyngeal swab specimens were collected from all individuals and subsequently tested at LAFEM/UESC. Viral RNA was extracted using an automated Loccus EXTRACTA 32 device and the MVXA-P016 kit. The Allplex™ 2019-nCoV assay (Seegene®, Seoul, Korea) and the SARS-CoV-2 EDx assay (Bio-Manguinhos, FIOCRUZ, Brazil) were employed to detect SARS-CoV-2 RNA by RT-qPCR, in accordance with the manufacturers’ instructions, using the 7500 Fast Real-Time PCR System (Applied Biosystems™, Life Technologies, USA). The results were classified as positive (SARS-CoV-2 detected), negative (SARS-CoV-2 not detected), or inconclusive. Inconclusive results were excluded from the subsequent analysis.

Sample collection for miRNA expression analysis

Venous blood samples (4 mL) were collected using a 21-gauge needle (Becton-Dickinson, Franklin Lakes, NJ, USA) into tubes containing K2-EDTA. The plasma fraction was separated by centrifugation and 200 µL of the plasma sample was mixed with 600 µL of Trizol LS (Invitrogen) and subsequently stored at − 80 °C for a maximum of 24 h after blood collection.

RNA isolation and quantification

RNA isolation began with the addition of 200 µL of cold chloroform (~ 8 °C), followed by inversion, incubation at room temperature (5 min), and centrifugation (4 °C, 14,000 rpm, 20 min) to separate the phases (Fig. 6). The aqueous phase, with an average volume of 600 µL, was transferred to a separate collection tube. Next, 1000 µL of absolute isopropanol was added and the mixture was stored at − 20 °C for approximately 12 h. The samples were then thawed and centrifuged at 4 °C and 14,000 rpm for 20 min. The supernatant was discarded and the pellet was retained. The pellet was washed with 1000 µL of 70% ethanol and centrifuged again at 4 °C and 14,000 rpm for 20 min, after which the supernatant was discarded. The tubes containing the pellets were left to dry at room temperature for approximately 15 min. Finally, the pellet was resuspended in 20 µL of RNase-free water and stored at − 80 °C. All reagents were used at approximately 8 °C.
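Because centrifugation speed is reported here in rpm, the corresponding relative centrifugal force depends on the rotor radius, which is not stated. The sketch below applies the standard conversion RCF = 1.118 × 10⁻⁵ × r × rpm², with r in centimetres. The 8 cm radius is a hypothetical value chosen only to illustrate the calculation.

```python
def rcf_from_rpm(rpm, radius_cm):
    """Convert rotor speed (rpm) to relative centrifugal force (x g)."""
    return 1.118e-5 * radius_cm * rpm ** 2

# Hypothetical rotor radius in cm; the rotor used in the study is not stated
ASSUMED_RADIUS_CM = 8.0

rcf = rcf_from_rpm(14_000, ASSUMED_RADIUS_CM)
print(f"14,000 rpm at {ASSUMED_RADIUS_CM} cm radius is about {rcf:,.0f} x g")
```

Reproducing the protocol on a different centrifuge would require the actual rotor radius in place of the assumed one.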

RNA concentration was measured using a NanoDrop 1000 spectrophotometer (Thermo Fisher Scientific, Waltham, MA, USA). Only RNA samples with a 260/280 ratio of at least 1.8 were included in the subsequent analysis. A total of 500 ng of RNA was used for reverse transcription, employing specific primers for microRNAs and a TaqMan microRNA reverse transcription kit (Applied Biosystems) in a reaction volume of 20 µL, in accordance with the manufacturer’s instructions.

cDNA synthesis used the TaqMan MicroRNA Reverse Transcription Kit (Applied Biosystems) with stem-loop primers for snRNU6B, snRNA U6, miR-328-3p, miR-342-3p, and miR-320a (Thermo Fisher Scientific assay IDs: 001093, 001973, 000543, 002260, and 002277). The reaction was conducted with 5 µL of purified microRNA in a total volume of 20 µL, following the manufacturer’s recommended protocol, which entailed incubation at 16 °C for 30 min, 42 °C for 30 min, and 85 °C for 5 min.

Each qPCR reaction included 4.5 µL cDNA, 0.5 µL TaqMan 20X Assay, and 10 µL universal PCR master mix (Applied Biosystems) in a 20 µL final volume. Each sample was processed in duplicate on a QuantStudio 3.0 system, with thermal cycling comprising 95 °C for 3 min, followed by cycles of 95 °C for 15 s and 60 °C for 60 s.
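The listed components sum to 15 µL of the 20 µL final volume. The following sketch scales the shared reagents for a run, assuming that the remaining 5 µL per reaction is nuclease-free water, which the text does not state. The 10% pipetting overage and the function name are likewise hypothetical.

```python
# Per-reaction volumes in µL, as stated in the text
COMPONENTS_UL = {
    "cDNA": 4.5,
    "TaqMan 20X assay": 0.5,
    "universal PCR master mix": 10.0,
}
FINAL_VOLUME_UL = 20.0

def reaction_plan(n_samples, replicates=2, overage=0.10):
    """Return the reaction count and total volumes (µL) of shared reagents.

    cDNA is sample-specific and is therefore left out of the shared mix.
    overage is a hypothetical pipetting allowance, not taken from the text.
    """
    n_reactions = n_samples * replicates
    scale = n_reactions * (1 + overage)
    # Volume not accounted for by the listed components; assumed to be water
    water_ul = FINAL_VOLUME_UL - sum(COMPONENTS_UL.values())
    plan = {
        name: round(volume * scale, 2)
        for name, volume in COMPONENTS_UL.items()
        if name != "cDNA"
    }
    plan["nuclease-free water (assumed)"] = round(water_ul * scale, 2)
    return n_reactions, plan

# Example: one assay run on 47 samples in duplicate
n_reactions, plan = reaction_plan(n_samples=47)
print(n_reactions, plan)
```

Each candidate normalizer uses its own TaqMan assay, so a plan of this kind would be prepared once per assay.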

Fig. 6 Procedures and main steps of miRNA expression analyses by RT-qPCR.

Selection of candidate reference genes and stability analysis

In the initial phase of the study, we evaluated the expression of five candidate reference genes (snRNU6B, snRNA U6, miR-328-3p, miR-342-3p, and miR-320a) in plasma samples from 24 individuals in the case group (SARS-CoV-2 detectable) and 24 individuals in the control group (SARS-CoV-2 not detectable). The selection of these candidates was based on the team’s prior experience with the method [41,50], as well as a literature review supporting the use of these molecules as endogenous reference genes in miRNA expression studies [19,51].

Among them, snRNU6B and snRNA U6 belong to distinct classes of non-coding RNAs with different biological roles: snRNU6B is a small nucleolar RNA involved in ribosomal RNA processing, while snRNA U6 is a small nuclear RNA (snRNA) essential for pre-mRNA splicing as part of the spliceosome complex. The expression levels of these candidates were assessed using one-way ANOVA (p > 0.001) to determine potential differences between the groups.

Consequently, the remaining candidate, which exhibited minimal variation in expression levels between case and control groups (less than 1-fold change), was selected as a reference gene for COVID-19 plasma sample analysis. This careful selection provided a robust basis for subsequent analyses. Stability and consistency of miRNA expression were evaluated using specialized software, RefFinder [52] and NormFinder [53]. NormFinder, an Excel-based tool, applies a model-based approach that assigns stability values to candidate normalizers based on intra- and inter-group variation, with lower values indicating greater stability. RefFinder, an online application, integrates four commonly used algorithms (BestKeeper [54], comparative ΔCt [55], NormFinder [53] and geNorm), offering a comprehensive assessment to identify the most stable gene or gene pair for normalization. All statistical analyses were conducted with GraphPad Prism 5.0, considering p < 0.05 as statistically significant (t-test).