Introduction

The service provided by pollinators is essential for global ecosystems. About 87.5% of flowering plants worldwide require pollination by animals1. About 75% of agricultural crops benefit from pollinators, resulting in a 35% increase in yield and generating billions of dollars in value2,3. The honey bee (Apis mellifera) is the most important managed pollinator for crop and fruit production4,5. Native wild bees are far more numerous than managed bees6. They provide a complementary pollination service thought to surpass that of managed bees, contributing to ecological and agricultural stability7,8. The small carpenter bees are a large group of more than 300 species in the genus Ceratina (Apidae: Xylocopinae)9. Ceratina calcarata Robertson is native to eastern North America, ranging from Florida in the south to Ontario in the north and Nova Scotia in the east10. This species and the sympatric C. dupla Say can effectively pollinate many fruit, vegetable, and other crops11,12 and are among the most abundant pollinator species on recently restored land12,13. Like honey bees, native wild bees have also suffered population declines, potentially caused by habitat loss, agrochemical use, landscape alteration, and parasitism14,15.

Besides its economic and ecological importance, C. calcarata is considered a subsocial species, with prolonged maternal care and interaction between mothers and adult offspring. It also shows traits of facultative sociality such as division of labor and cooperative brood care16. The eldest daughters are dwarfed in body size and are responsible for foraging, guarding, and feeding their younger siblings17,18,19. Their roles resemble worker behavior in eusocial species, and they do not have the chance to overwinter or reproduce the following spring. Therefore, C. calcarata is an ideal model for studying the evolution and mechanisms of sociality in hymenopteran insects.

To date, many molecular and genomic approaches have been used to study the phylogeny10,20,21, adaptation22,23 and reproduction24 of Ceratina species. The reference genomes of C. calcarata25 and C. australensis26 have been sequenced and assembled. The transcriptome and metatranscriptome of C. calcarata have also been sequenced, revealing gene and microbiome regulation associated with overwintering27, maternal and sibling care18,28,29, social behavior30, and landscape adaptation31. Compared with high-throughput sequencing, real-time quantitative reverse transcription polymerase chain reaction (q-RT-PCR) provides a fast and accurate way to quantify gene expression32,33,34. It involves the reverse transcription of RNA into complementary DNA (cDNA), followed by real-time PCR amplification. The method is cost-effective when a small number of target genes are analyzed in a large number of samples. The calculation of relative expression relies on internal controls, which are housekeeping genes with constant expression levels across treatments35. However, validated reference genes with stable expression are lacking for q-RT-PCR in C. calcarata, although they are a prerequisite for studying gene expression associated with development and adaptation to agricultural landscapes. It therefore remains a challenge to study target gene expression in this bee species.

In this study, we tested the expression stability of seven commonly used reference genes: four ribosomal proteins, namely L8 (RPL8) and L32 (RPL32) of the large subunit and S5 (RPS5) and S18 (RPS18) of the small subunit; a cytoskeletal protein, β-actin (ACT); a translation elongation protein, elongation factor 1-alpha F2 (EF-1α); and a housekeeping enzyme, glyceraldehyde 3-phosphate dehydrogenase (GADPH). We used four methods to compare the stability of each gene: comparative ΔCt analysis36, NormFinder37, geNorm38, and BestKeeper39. RefFinder was then used for an integrated analysis incorporating geNorm, BestKeeper, NormFinder, and ∆Ct analysis40. Our results identify the most stable genes across landscapes and developmental stages, which can be used for gene expression analysis under similar conditions.

Materials and methods

Sample collection and total RNA extraction

C. calcarata individuals were collected from multiple sites in western Ohio in spring 2024. The sites covered three types of landscape: conventional farms with regular applications of pesticides and other agrochemicals, organic farms using natural-based pesticides, and roadside landscapes without agricultural activity. Because C. calcarata nests in raspberry (Rubus idaeus L.)17,41, raspberry stems from the previous growing season with diameters over 7 mm were cut to a length of 1.20 m, attached to bamboo sticks with twist ties, and inserted vertically about 20 cm into the ground at random locations in early May to attract C. calcarata (Supplementary figure S1).

After four weeks, the nesting individuals were collected from the stems and their life stages were visually identified as larva, pupa, or adult. Immatures were categorized as small larvae, large larvae, pre-pupae, and white-eyed pupae31. For RNA extraction, we used large larvae, which may have been either 4th or 5th instar. We did not determine the sex of the larvae, and sex differences may contribute to variation in gene expression. The samples were frozen on dry ice and stored temporarily at −80 °C. Total RNA was extracted from each individual using TRIzol and purified with the ZYMO Direct-zol RNA Miniprep Kit (Zymo Research, Irvine, CA, USA). Genomic DNA (gDNA) was removed with DNase I (Zymo Research) by in-column digestion. RNA concentrations were measured with the Qubit RNA BR Assay (Thermo Fisher Scientific, Waltham, MA). The RNA samples were stored at −80 °C until use.

Primer design, q-RT-PCR experiments, and efficiency test

The predicted coding sequences of the candidate genes were retrieved from the C. calcarata genome assembly annotation25 and confirmed by PCR. Primers were designed using the q-RT-PCR module of Primer3Plus with default settings, targeting amplicons of 90–130 bp42. Details of the primers are presented in Supplementary Table S1. Where possible, primers were designed to span exon-exon junctions to avoid amplification of gDNA. Primer specificity was confirmed by reverse transcription polymerase chain reaction (RT-PCR) followed by 1% agarose gel electrophoresis, which produced a single band of the expected size for each primer set. A single peak in the melting curve further verified the specificity of the amplification. Primer efficiency was determined using a five-point, 10-fold serial dilution of pooled cDNA as the standard. The amplification efficiencies ranged from 91.05% to 108.40%.

The first-strand cDNA for each sample was synthesized with iScript cDNA Synthesis Supermix (Bio-Rad, Hercules, CA) following the manufacturer's protocol. A mixture of oligo(dT) and random hexamers was used to prime the reaction. Samples were normalized to 1 µg total RNA per 20 µl reaction mix. Real-time PCR was conducted using PowerUp SYBR Green Mix (Thermo Fisher Scientific) on a QuantStudio 3 Real-Time PCR System (Thermo Fisher Scientific) with the following program: 50 °C for 2 min and 95 °C for 2 min; 40 cycles of 95 °C for 15 s and 60 °C for 30 s; then a 60–95 °C melting curve to confirm the specificity of amplification. Three technical replicates were run for each sample. To obtain the PCR efficiency of each primer set, q-RT-PCR was also performed on 10-fold serial dilutions of cDNA. The efficiency of each primer set was calculated from the slope of the standard curve as \(E = 10^{-1/\text{slope}}\), which corresponds to a percentage efficiency of \((E - 1) \times 100\%\).
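The efficiency calculation can be illustrated with a minimal Python sketch. It assumes that the standard curve is an ordinary least-squares fit of Ct against the log10 of the relative template amount, and that the percentage efficiency is obtained with the conventional conversion (E − 1) × 100. The function names and the dilution series below are hypothetical illustrations, not data from this study.

```python
import math


def fit_line(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def primer_efficiency(relative_amounts, ct_values):
    """Return (amplification factor E, percentage efficiency) for one primer set."""
    log_amounts = [math.log10(a) for a in relative_amounts]
    slope, _ = fit_line(log_amounts, ct_values)
    amplification = 10 ** (-1.0 / slope)          # E = 10^(-1/slope)
    return amplification, (amplification - 1.0) * 100.0


# Hypothetical five-point, 10-fold dilution series (made-up Ct values).
amounts = [1, 0.1, 0.01, 0.001, 0.0001]
cts = [15.0, 18.4, 21.8, 25.2, 28.6]
amp_factor, percent = primer_efficiency(amounts, cts)
print(f"E = {amp_factor:.3f}, efficiency = {percent:.1f}%")
```

A slope of about −3.32 corresponds to a doubling of product per cycle (E = 2, 100% efficiency); shallower or steeper slopes move the percentage above or below 100%.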

Data analysis

The cycle threshold (Ct) value of each reaction was obtained with Design & Analysis 2 (DA2) software (version 2.8.0, Thermo Fisher Scientific). Standard curves of Ct were fitted by linear regression, and the efficiency of each primer set was calculated with the formula given above. We assessed expression stability across developmental stages and agricultural landscapes using the following algorithms: the comparative ΔCt method36, NormFinder37, geNorm38, and BestKeeper39. The R package ctrlGene (version 1.0.1)43 was used to run the geNorm and BestKeeper analyses. RefFinder was used for an integrated analysis incorporating geNorm, BestKeeper, NormFinder, and ∆Ct analysis40.

Results

Primer specificity and efficiency test

The specificity of the primers was confirmed by reverse transcription PCR. Subsequent 1% agarose gel electrophoresis showed a single band within the expected size range for each primer set. A single peak in each melting curve further confirmed this result. Using a cDNA serial dilution as the standard, the primers showed efficiencies from 91.05% to 108.40%, which are within the acceptable range.

Analysis of candidate reference gene expression

The expression levels of the candidate reference genes of C. calcarata across developmental stages and landscapes are presented in Fig. 1a, b. The Ct values of the candidate reference genes ranged from 14.83 to 32.01 and varied across developmental stages (larvae vs. adults) and collection sites (conventional vs. organic vs. roadside). Among all genes, RPS18 and RPL8 showed the lowest Ct variability, whereas GADPH and ACT fluctuated more.

Fig. 1

Expression levels of different candidate reference genes in small carpenter bee across developmental stages and landscapes. The x-axis shows the tested candidate reference genes and the y-axis shows their relative expression levels (mean ± standard deviation of Ct values). (a) Adult stage samples (b) Larval stage samples. Bars represent mean values from biological replicates; error bars indicate standard deviation.

Stability of candidate reference genes

∆Ct method

The ∆Ct method compares the Ct differences between pairs of candidate genes and uses the standard deviations of these ∆Ct values to assess expression stability. In both larval and adult stages, RPL8 was the most stable of the seven genes investigated. When the data were analyzed by collection site, RPS18, RPS5 and RPL32 were the most stable genes at the organic, roadside and conventional sites, respectively (Fig. 2a-e).
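As a rough illustration of the comparative ∆Ct idea, the following Python sketch computes, for every pair of genes, the standard deviation of their Ct difference across samples, and then averages these standard deviations per gene. The gene labels (REF_A, REF_B, REF_C) and Ct values are hypothetical placeholders, and the function name is an assumption made for this example; it is not the code used in this study.

```python
from itertools import combinations
from statistics import mean, stdev


def delta_ct_stability(ct):
    """ct maps gene -> list of Ct values, one per sample, in the same sample order.

    Returns gene -> mean standard deviation of its pairwise delta-Ct values
    (lower means more stable).
    """
    pairwise_sds = {gene: [] for gene in ct}
    for gene_a, gene_b in combinations(ct, 2):
        # Delta-Ct of the pair within each sample.
        diffs = [a - b for a, b in zip(ct[gene_a], ct[gene_b])]
        sd = stdev(diffs)
        pairwise_sds[gene_a].append(sd)
        pairwise_sds[gene_b].append(sd)
    return {gene: mean(sds) for gene, sds in pairwise_sds.items()}


# Hypothetical Ct values for three placeholder genes in four samples.
example_ct = {
    "REF_A": [20.0, 20.1, 19.9, 20.0],
    "REF_B": [22.0, 22.2, 21.9, 22.1],
    "REF_C": [18.0, 19.5, 17.2, 20.1],
}
scores = delta_ct_stability(example_ct)
for gene in sorted(scores, key=scores.get):
    print(gene, round(scores[gene], 3))
```

In this made-up data set REF_C varies independently of the other two genes, so every pair that includes it has a large ∆Ct spread and it receives the highest (least stable) score.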

Fig. 2

Expression stability of candidate reference genes in small carpenter bee across different developmental stages and landscapes using ∆Ct method. The x-axis shows the candidate reference genes, and y-axis shows mean ∆Ct values, where lower values indicate higher expression stability. (a) Larval stage (b) Adult stage (c) Organic sites (d) Roadside sites (e) Conventional sites.

GeNorm analysis

GeNorm analysis uses an expression stability measure (M value), based on the average pairwise variation between genes, to assess the stability of expression levels. The M values of the candidate reference genes are presented in Fig. 3a-e. RPL8, followed by RPL32, was the most stable gene in the larval stage, whereas RPS18, followed by RPL8, was the most stable in the adult stage. In addition, RPS18, RPS5 and RPL8 were the most stable genes at the organic, roadside and conventional collection sites, respectively.
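The M value can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not the ctrlGene implementation: Ct values are converted to relative quantities with an assumed amplification factor of 2, the pairwise variation is taken as the standard deviation of log2 expression ratios across samples, and the least stable gene is removed stepwise until two genes remain. Gene labels, Ct values and function names are hypothetical.

```python
import math
from statistics import mean, stdev


def relative_quantities(ct_values, efficiency=2.0):
    """Convert Ct values to quantities relative to the sample with the lowest Ct."""
    lowest = min(ct_values)
    return [efficiency ** (lowest - ct) for ct in ct_values]


def m_values(quant):
    """quant maps gene -> relative quantities. Returns gene -> M value."""
    result = {}
    for gene_j in quant:
        variations = []
        for gene_k in quant:
            if gene_k == gene_j:
                continue
            log_ratios = [math.log2(a / b) for a, b in zip(quant[gene_j], quant[gene_k])]
            variations.append(stdev(log_ratios))  # pairwise variation of j and k
        result[gene_j] = mean(variations)         # M = average pairwise variation
    return result


def genorm_rank(ct, efficiency=2.0):
    """Stepwise exclusion of the gene with the highest M until two genes remain."""
    quant = {gene: relative_quantities(v, efficiency) for gene, v in ct.items()}
    excluded = []
    while len(quant) > 2:
        m = m_values(quant)
        worst = max(m, key=m.get)
        excluded.append((worst, m[worst]))
        del quant[worst]
    return excluded, sorted(quant)


# Hypothetical Ct values for four placeholder genes in four samples.
example_ct = {
    "REF_A": [20.0, 20.1, 19.9, 20.0],
    "REF_B": [22.0, 22.2, 21.9, 22.1],
    "REF_C": [18.0, 18.6, 17.7, 18.3],
    "REF_D": [25.0, 27.0, 24.0, 28.5],
}
excluded, final_pair = genorm_rank(example_ct)
print("excluded (least stable first):", [gene for gene, _ in excluded])
print("most stable pair:", final_pair)
```

Because the last two genes share a single pairwise variation, this kind of procedure cannot separate them, which is why geNorm-style output typically reports a best pair rather than a single best gene.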

Fig. 3

Expression stability measurement (M value) of candidate reference genes in small carpenter bee across developmental stages and landscapes using geNorm analysis. The x-axis shows the candidate reference genes, and the y-axis shows their average expression stability value (M value), where lower M indicates higher stability. (a) Larval stage (b) Adult stage (c) Organic sites (d) Roadside sites (e) Conventional sites.

BestKeeper analysis

BestKeeper uses both the coefficient of variation (CV) and the standard deviation (SD) of Ct values to determine the stability of each candidate reference gene. A reference gene is considered more stable if it has a lower CV ± SD value44. According to the BestKeeper analysis, ACT and RPS18 were the two most stable genes in the larval stage, and RPS18 and RPL8 were the two most stable genes in the adult stage. The most stable candidate reference genes were RPL32 and RPS18 in the organic area, RPS18 and RPS5 in the roadside area, and ACT and RPS18 in the conventional area (Table 1).
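The descriptive statistics behind this ranking can be sketched in Python. The example below is an approximation under stated assumptions: it uses the ordinary sample standard deviation and a CV expressed as a percentage of the mean Ct, whereas the original BestKeeper tool may compute its dispersion statistic differently. Gene labels, Ct values and the function name are hypothetical.

```python
from statistics import mean, stdev


def ct_dispersion(ct):
    """ct maps gene -> list of Ct values. Returns gene -> mean, SD and CV (%) of Ct."""
    stats = {}
    for gene, values in ct.items():
        avg = mean(values)
        sd = stdev(values)
        stats[gene] = {"mean": avg, "sd": sd, "cv_percent": 100.0 * sd / avg}
    return stats


# Hypothetical Ct values for three placeholder genes in four samples.
example_ct = {
    "REF_A": [20.0, 20.1, 19.9, 20.0],
    "REF_B": [22.0, 22.2, 21.9, 22.1],
    "REF_C": [18.0, 19.5, 17.2, 20.1],
}
stats = ct_dispersion(example_ct)
# Rank from most to least stable by SD of Ct (lower is better).
ranked = sorted(stats, key=lambda gene: stats[gene]["sd"])
for gene in ranked:
    s = stats[gene]
    print(f"{gene}: mean Ct {s['mean']:.2f}, SD {s['sd']:.3f}, CV {s['cv_percent']:.2f}%")
```

Unlike the pairwise methods, this ranking looks at each gene in isolation, so a gene whose Ct tracks overall sample input can rank poorly here while still ranking well in ∆Ct or geNorm.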

Table 1 Expression stability of candidate reference genes in small carpenter bee under different developmental stages and landscapes using BestKeeper analysis.

NormFinder analysis

NormFinder identifies the most suitable reference gene by an expression stability value, where lower values indicate more stable expression. The expression stability of candidate reference genes of C. calcarata under different developmental stages and landscapes using NormFinder is presented in Fig. 4. RPS18 and RPL32 were identified as the most stable candidate reference genes based on the NormFinder analysis.

Fig. 4

Expression stability of reference genes in small carpenter bee across different developmental stages and landscapes using NormFinder analysis. The x-axis shows the candidate reference genes, and the y-axis shows their stability value calculated by NormFinder, where lower values indicate more stable expression.

RefFinder analysis

RefFinder is an integrated analysis tool for validating reference genes that incorporates several methods, including geNorm, BestKeeper, NormFinder, and ∆Ct analysis. According to RefFinder, RPL8 and RPS18 were the two most stable genes in both the larval and adult stages. RPS18 consistently ranked among the most stable candidate reference genes across all collection sites, paired with EF-1α at the organic site, RPS5 at the roadside site and RPL8 at the conventional site (Table 2).
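One common way to integrate rankings from several methods is to take the geometric mean of each gene's rank positions, which is the general idea behind a RefFinder-style comprehensive ranking. The Python sketch below illustrates that idea only; it is not the RefFinder implementation, and the per-method rankings, gene labels and function name are hypothetical.

```python
from statistics import geometric_mean


def integrated_ranking(rankings):
    """rankings maps method -> list of genes ordered from most to least stable.

    Returns a dict gene -> geometric mean of its rank positions,
    ordered from most to least stable (lower is better).
    """
    genes = next(iter(rankings.values()))
    scores = {}
    for gene in genes:
        # Rank positions start at 1 for the most stable gene in each method.
        positions = [order.index(gene) + 1 for order in rankings.values()]
        scores[gene] = geometric_mean(positions)
    return dict(sorted(scores.items(), key=lambda item: item[1]))


# Hypothetical rankings of three placeholder genes by four methods.
example_rankings = {
    "delta_ct": ["REF_A", "REF_B", "REF_C"],
    "genorm": ["REF_A", "REF_B", "REF_C"],
    "normfinder": ["REF_B", "REF_A", "REF_C"],
    "bestkeeper": ["REF_A", "REF_C", "REF_B"],
}
combined = integrated_ranking(example_rankings)
for gene, score in combined.items():
    print(gene, round(score, 3))
```

The geometric mean dampens the effect of a single method that ranks a gene unusually low, so a gene placed first by three methods and second by one still comes out ahead.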

Table 2 Expression stability of candidate reference genes in small carpenter bee under different developmental stages and landscapes using RefFinder analysis.

Discussion

C. calcarata is considered an indicator species of healthy ecosystems and an important pollinator in natural and agricultural ecosystems45,46. The species is commonly used to study pollinator ecology, behavior, evolution and genomics17,47. In this study, we collected larval and adult C. calcarata from organic, roadside and conventional landscapes and evaluated the expression stability of seven candidate reference genes (ACT, EF-1α, GADPH, RPL8, RPL32, RPS5 and RPS18) using widely adopted analytical tools. Integrated analysis using RefFinder revealed that RPS18 consistently ranked among the two most stable genes across both developmental stages and all habitat types. Similarly, RPL8 was one of the two most stable genes in larvae, in adults and in the conventional habitat, whereas EF-1α and RPS5 were the second of the two most stable genes at the organic and roadside sites, respectively. A study of a solitary bee, Megachile rotundata, also reported RPS18 and RPL8 as stable reference genes across all life stages and under a variety of environmental conditions48. Separately, a transcriptomic study of C. calcarata reported significant variation in gene expression associated with overwintering27. These findings suggest that RPS18 and RPL8 exhibit high overall expression stability and are suitable reference genes for gene expression studies in C. calcarata across different developmental stages and habitat conditions. To our knowledge, no previous study has investigated reference gene expression stability in C. calcarata. The present findings therefore provide an important basis for future studies in C. calcarata, with broader implications for native wild bees.

The expression stability of the candidate reference genes varied across developmental stages and habitat types, suggesting that both intrinsic and environmental factors influence gene expression. Other bee species, such as Euglossa viridissima, also exhibit age-related gene expression patterns49. Differences between larvae and adults may reflect distinct physiological processes, such as growth and differentiation in larvae and reproduction, foraging or immune function in adults50. Additionally, environmental stressors such as pesticide exposure and resource availability may affect gene expression in bees51. These findings highlight the need to validate reference genes carefully across both developmental stages and ecological contexts for accurate normalization in q-RT-PCR studies.

In conclusion, the present study evaluated the expression stability of seven candidate reference genes in C. calcarata across different developmental stages and habitat types. The results demonstrate that gene stability varies with both developmental stages and environmental conditions, underscoring the importance of selecting appropriate reference genes for accurate normalization in q-RT-PCR. Our findings provide a valuable resource for future gene expression studies in wild bees and highlight the necessity of validating reference genes under specific conditions.