Comprehensive expression data for two honey bee species, Apis mellifera and Apis cerana japonica

Yokoi, Kakeru; Hatakeyama, Masatsugu; Kuwazaki, Seigo; Maeda, Taro; Yoshiyama, Mikio; Horigane-Ogihara, Mari; Matsuyama, Shigeru; Jouraku, Akiya; Bono, Hidemasa; Kimura, Kiyoshi

doi:10.1038/s41597-025-05279-z

Download PDF

Data Descriptor
Open access
Published: 02 June 2025

Comprehensive expression data for two honey bee species, Apis mellifera and Apis cerana japonica

Kakeru Yokoi ORCID: orcid.org/0000-0002-6672-5341¹,
Masatsugu Hatakeyama¹,
Seigo Kuwazaki¹,
Taro Maeda²,
Mikio Yoshiyama³,
Mari Horigane-Ogihara³,
Shigeru Matsuyama⁴,
Akiya Jouraku ORCID: orcid.org/0000-0003-3264-8293¹,
Hidemasa Bono ORCID: orcid.org/0000-0003-4413-0651^5,6 &
…
Kiyoshi Kimura^1,3,7

Scientific Data volume 12, Article number: 926 (2025) Cite this article

3365 Accesses
2 Altmetric
Metrics details

Subjects

Abstract

Comprehensive expression datasets were constructed for Apis mellifera and Apis cerana japonica. Firstly, we obtained RNA-Seq data of the samples prepared: post-oviposition day 6 to day 58 samples of A. mellifera workers (larva to adult), and day 9, 10, 12, and 13 samples of A. mellifera queen (larva to pupa), and day 9 to day 18 samples of A. cerana japonica workers (larva to adult). For A. cerana japonica, reference transcript sequence, predicted amino acid sequence, and functional annotation data were generated based on the genome sequence and RNA-Seq data. Using the transcript sequence and RNA-Seq data, comprehensive expression data for all transcripts of A. mellifera and A. cerana japonica were prepared. Hierarchical clustering analyses, the used sample preparation method, and other evaluation results ensured that both sets of expression data were reliable for use as comprehensive reference expression datasets. Therefore, these data can be used for honey bee research or comparative or evolutionary studies on insect species or social insect species at the genetic and molecular levels.

Organ-specific transcriptome analysis reveals differential gene expression in different castes under natural conditions in Apis cerana

Article Open access 28 May 2021

Identification of ecdysone receptor target genes in the worker honey bee brains during foraging behavior

Article Open access 28 June 2023

Exploring the potential of hydro alcoholic crude extract of beeswax as antibacterial antifungal antiviral antiinflammatory and antioxidant agent

Article Open access 12 September 2025

Background & Summary

Honey bees (Hymenoptera: Apidae) are industrially important as producers of honey, royal jelly, propolis, and wax. Further, honey bees are pollinators and are essential for several crops (e.g., strawberry and watermelon). They are used in biological research as representative social insects¹. A honey bee colony consists of a queen, workers (females), and drones (males). The queen and drones are mainly responsible for reproduction, whereas workers are mainly responsible for non-reproductive tasks, such as nursing larvae and foraging for nectar and pollen. There are several species of honey bees with species-specific traits. In this study, we treated two honey bee species: Apis mellifera (Western honey bee) and Apis cerana japonica (Japanese honeybee). Both species are, 10-11 mm in size, and similar in morphology (Fig. 1). A. cerana is a black bee, but A. mellifera also has a black subspecies¹. This makes it difficult to distinguish the two species in the field. However, the two species are genetically isolated from each other.

The traits of A. mellifera include abundant honey production and easy rearing compared to other wild honey bee species. Moreover, A. mellifera has been used as a model social insect species to study social behavior, memory, and learning. Because of its importance, the first genome sequence data for A. mellifera were reported in 2006². After several improvements, chromosome-level genome sequencing data were published³. Transcriptome analyses have also been performed using the genome sequence data for genetic or molecular-level insights into A. mellifera or honey bee traits or reactions, such as casting or immune reactions against pathogens^4,5,6,7, leading to the accumulation of A. mellifera RNA-Seq data in public databases. Using these RNA-Seq data, reference transcriptome data of A. mellifera with functional annotation data have been constructed⁸.

The Asian honey bee A. cerana, a wild honey bee species distributed in Asia, is utilized for beekeeping in some rural areas⁹. The traits of A. cerana are more mildness, more frequently absconding and producing lower honey compared to A. mellifera. Furthermore, A. cerana shows greater resistance to varroa mite and American foulbrood. The Japanese honey bee, A. cerana japonica, is a subspecies of A. cerana, which inhabits all areas of Japan except Okinawa and Hokkaido¹⁰. A. cerana japonica was used for beekeeping in Japan before A. mellifera were introduced. One interesting and characteristic trait of A. cerana japonica is the formation of a hot defensive bee ball against colony invaders such as the Japanese giant hornet Vespa mandarinia^11,12. To advance research using A. cerana at the genetic and molecular levels, genome sequence data of Apis cerana cerana (Korean and Chinese strains) were published in 2015 and 2018, respectively, and chromosome-level genome sequence data was published in 2020^13,14,15. The draft genome data of A. cerana japonica was reported in 2018, and chromosome-level genome data, constructed using A. cerana chromosome genome data, was published in 2023^16,17. An RNA-Seq analysis of A. cerana has been performed. For instance, genes associated with hot defensive bee balls have been identified using RNA-Seq analysis¹⁸. Additional RNA-Seq analyses will be useful for A. cerana research. Notably, remarkable differences in traits are observed between A. cerana japonica and A. mellifera even though the two species are evolutionarily very close to each other.

Considering the availabilities of genome and RNA-Seq data for A. mellifera and A. cerana japonica, reference transcriptome data (expression data) of intact honey bees at different developmental stages is required. These expression data of different developmental stages must be important because the transcriptome data can accelerate honey bee biological research at the molecular or genetic level (e.g., identification of the genes involved in the caste determination or the morphological traits of each developmental stage). However, preparing samples under completely unified conditions, which is relatively easy for model insects such as Drosophila melanogaster and Bombyx mori, is a very difficult task for honey bees, because of several reasons. First, honey bees cannot be reared from eggs to adults under artificial conditions such as in the laboratory. Honey bee larvae must be fed by nursing workers in the colony. Thus, the timing of feeding and conditions vary for each brood. Another factor is environmental conditions. Colonies are known to be placed outside. Therefore, the temperature and weather conditions in which the colonies are placed are not stable, which affects the length of each stage¹. The process of honey bee reproduction is another challenge. Queens mate with drones from their own and other colonies to oviposit the eggs of workers or new queens, leading to genetic variations in broods born from one queen. Furthermore, rearing A. cerana japonica is rather difficult because of its frequent absconding behavior. Nevertheless, comprehensive expression data for honey bees at multiple developmental stages under unified conditions are desirable. To the best of our knowledge, such a series of comprehensive expression data has not been previously published.

In this study, we prepared RNA-Seq data of intact two honey bee species (A. mellifera and A. cerana japonica) at multiple developmental stages using a sampling method with as little variation as possible (Sample preparation is described in the “Methods” section). In total, RNA-seq data from 8 A. mellifera queens, 72 A. mellifera workers, and 14 A. cerana japonica workers were obtained. In A. cerana japonica, reference transcriptome sequence data (47867 transcripts) were constructed using genome sequence data and, 99789 predicted amino acid sequences were constructed using the transcript sequence data. Functional annotation data of the reference transcriptome were constructed using fanflow4insect¹⁹ based on predicted amino acid sequence data. Next, using the RNA-Seq data and reference transcriptome data of A. mellifera and A. cerana japonica, expression values of the reference transcriptome in the prepared samples were calculated by pseudoalignment of the RNA-seq reads to the reference transcripts, using Kallisto, and comprehensive expression data of multiple developmental stages were constructed (Fig. 2). The validation results of the RNA-Seq and comprehensive expression data described below indicate the reliability of the reference transcriptome and expression data, which can contribute to honey bee research (e.g., determining target genes to create strains with beneficial traits for the beekeeping industry by genome editing) or evolutionary and comparative biology between insect species or social insect species at the genetic and molecular levels. These comprehensive expression data are the first public reference expression data (reference transcriptome data) of A. mellifera and A. cerana japonica samples.

Methods

Honey bee rearing

A. mellifera used in this study were from an Italian hybrid strain reared in a National Agricultural Research Organization (NARO) apiary; the bees were reared according to standard beekeeping practices. A healthy queen was used for sampling from a randomly selected hive.

For sampling A. cerana japonica, a swarm colony was reared in an apiary on the campus of the University of Tsukuba the year before sampling.

Sample preparation

Samples for RNA-Seq, which was conducted from June to August 2021, were prepared by using an egg-laying box of the special queen rearing equipment from Ezi Queen Systems (Ezi Queen Technology Limited, Auckland, New Zealand), from in which a queen could freely lay eggs without escaping, and in which workers could pass through the cover slits and care for the eggs and broods in the box (Fig. 3). Hereafter, “egg-laying boxes of the special queen rearing equipment from Ezi Queen Systems” will be referred to as “egg-laying boxes”. A single queen of both A. mellifera and A. cerana japonica was kept in the egg-laying box from 10 AM, and after 6 hours, the queen was released from the box. The oviposited eggs were used for RNA-seq analysis. At 10 AM after five days, sampling was started, and the obtained samples were stored at −80 °C until used. Similarly, other samples were consistently prepared at 10 AM. The A. mellifera worker day 6 samples (AmW_d6) were samples obtained at 138–144 hours post-oviposition (a detailed figure was available in figshare²⁰). Similarly, the postoviposition time of each sample can be determined by the sample name. After sealing all the brood cells in the boxes (occurring in approximately day 8 to day 9 workers¹), they were kept in incubators controlled at 35 °C. For adult samples, emerged worker bees were marked on their backs and returned to their original hive within 24 h. The age of the worker bees was determined in this manner. Sampling continued in this way until sufficient numbers were obtained. The same queen was used for sampling. The nomenclature for each sample is shown at the bottom of the supplemental figure in figshare²⁰. For example, AmW_d14_2 denotes “A. mellifera worker day 14, biological replicate 2” sample. A. mellifera worker samples were prepared from day 6 (larvae) until day 58 (adults), whereas A. mellifera queen samples were prepared from day 9 (larvae) until day 13 (pupae). A. cerana japonica worker samples were prepared from day 9 (larvae or pre-pupa) until day 18 (adults) (Fig. 4). The biological replicates of these samples varied from one to five, and biological replicate numbers of some categories did not begin with “1” and were not sequential because sufficient amounts of total RNA for RNA-Seq were not extracted from some samples. Furthermore, A. mellifera queen day 11 and A. cerana japonica worker day 11 samples were excluded from the sample collections because there were no sample from which sufficient amount and quality of total RNA for RNA-Seq were extracted. The whole individual samples prepared were used for RNA extraction.

RNA extraction

Total RNA was extracted using TRIzol Reagent (Thermo Fisher Scientific, Waltham, MA, USA) according to the product manual. Honey bee samples (whole bodies) were homogenized in 2 mL of TRIzol. The extracted RNAs were resuspended in 100 µL of RNase free water and purified using an RNeasy mini kit (Qiagen) according to the product manual. Purified RNA concentrations were quantified using a Nanodrop spectrophotometer (Thermo Fisher Scientific). According to the concentration values and total amounts of extracted RNA, one to five RNA samples (biological replicates) per category were used for RNA-Seq.

Library preparation and RNA-Seq

cDNA libraries for RNA-Seq were constructed from the extracted RNA samples using the TruSeq Stranded mRNA Library Prep Kit (Illumina, San Diego, CA, USA) following the procedures outlined in the Reference Guide. An Illumina NovaSeq 6000 was used to sequence paired-end reads. Library construction and sequencing were performed by the Macrogen Corp., Japan (Kyoto, Japan).

RNA-Seq Data analysis

The workflow of the RNA-Seq data analysis used in this study is shown in Fig. 5.

Removal of adapter sequences, quality control, and trimming of raw RNA-Seq data were performed using Trim Galore (version 0.6.7, https://www.bioinformatics.babraham.ac.uk/projects/trim_galore/) with default settings. Processed RNA-Seq data were used to construct transcriptomic sequences (only A. cerana japonica) and to calculate the expression values of the transcriptome in all prepared samples.

To construct transcriptomic sequence data of A. cerana japonica (blue arrows in Fig. 5), the processed RNA-Seq data were mapped to the genome sequences A. cerana japonica using Hisat2 (version 2.2.1)²¹ with the default settings. To construct transcriptome sequence data for each sample, each RNA-Seq data was assembled using the BAM file and gene set data from the genome data using Stringtie^22,23 (version 2.2.1) with the default settings. The constructed transcriptome sequence data were merged into a single dataset as a GTF file, and the gtf file was converted to the FASTA format using Gffread²⁴ (version 0.12.1) as the reference transcriptome sequence data of A. cerana japonica. The quality of reference transcriptome sequence data was evaluated by BUCSO (version 5.2.2)²⁵ with default setting using insecta_odb10 data. Functional annotations of each transcript were archived using the predicted amino acid sequences (predicted by TransDecoder (version 5.5.0)²⁶ with the default settings.) by Fanflow4Insects¹⁹. Protein-level annotations were performed using protein sequence data of comprehensively annotated species (Homo sapiens, Mus musculus, Caenorhabditis elegans, D. melanogaster, B. mori, Bombus terrestris, Nasonia vitripennis, and A. mellifera) and Unigene data (for details see reference article of Fanflow4Insects¹⁹).

Using the generated transcript sequence data of A. mellifera⁸ and A. cerana japonica (described above) and the RNA-Seq data, the expression levels of these transcripts in all samples of A. mellifera and A. cerana japonica were calculated by pseudoalignment of the RNA-seq reads to the reference transcripts with the default settings, using Kallisto²⁷ (version 0.44.0). Transcript per million (TPM) values in the Kallisto output files of all samples from the two honey bee species were extracted and merged into single matrix data as comprehensive expression data. All commands and scripts of the data analysis were described in a single text file (see “Code availability” section).

Hierarchical clustering analysis

To validate the constructed comprehensive expression data, hierarchical clustering analyses (average linkage and Spearman’s rank correlation) were performed using R (version 4.2.3) in Rstudio (version 2023.03.0 + 386), using the comprehensive expression data of the two species as input. Rscript of the analysis was uploaded in figshare (see “Code availability “ section).

Data Records

The raw RNA-seq data in this study were deposited in the Sequence Read Archive (SRA, ID: DRA016719 and DRA014424)^28,29. The BioProject, BioSample, and SRA accession IDs for each RNA-seq sample are described in the table file uploaded to figshare³⁰. The assembled transcript data in gtf and FASTA files, the predicted amino acid sequence FASTA file, and the functional annotation file of the A. cerana japonica transcripts are available in figsahre³¹. The TPM value matrix data of all transcripts in all samples of A. mellifera³² and A. cerana japonica were prepared³³ and the Kallisto output files for the construction of matrix data were uploaded to figshare³⁴.

Technical Validation

Sample preparation

As described in the “Methods” section, RNA-Seq samples were prepared using an egg-laying box, and a single queen was allowed to oviposit in the egg-laying box (Fig. 2). The queen was introduced and kept in the egg-laying box at 10 AM and released at 4 PM (6 hours) (Fig. 3). Therefore, the eggs in the egg-laying box were laid during the queen’s stay. After several days, samples were prepared at 10 AM, and used for RNA-Seq. Similarly, other samples were consistently prepared at 10 AM. For this sample preparation method, all samples were derived from a single queen and the post-oviposition time of the samples within the same category varied within 6 h. Considering the methods of sample preparation, despite some missing stages in the sample sets of the A. mellifera queen and A. cerana japonica worker (owing to difficulty in sample preparation), the RNA-Seq data and expression data in this study are reliable as reference expression data.

RNA-Seq data validation

To evaluate RNA-Seq data, the mapping ratios of A. cerana japonica RNA-Seq data prepared to A. cerana japonica genome sequence were shown³⁵. The mapping ratios of RNA-Seq are approximately 95%. Furthermore, the completeness of reference A. cerana japonica transcriptome sequence data was evaluated by BUCSO³⁵. The result of BUSCO showed that 99% of BUSCO genes were completely contained in the transcriptome sequence data. The two results ensured the reliability of the RNA-Seq and transcriptome sequence data.

Transcriptome data validations by hierarchical clustering analysis

To validate the reliability of the RNA-Seq and expression data, hierarchical clustering analyses were performed using the expression data of A. mellifera and A. cerana japonica as input data. The dendrograms of A. mellifera and A. cerana japonica are shown in Figs. 6, 7, respectively. The hierarchical clustering results of A. mellifera and A. cerana japonica showed that despite several exceptions, two to five biological replicates of the same condition were located in single or closely neighboring clusters, suggesting the reliability of the expression data; further, the clusters containing samples from relatively close sampling days were formed.

In A. mellifera clustering results (Fig. 6), all A. mellifera worker adults (colored in red), except AmW_d23 (immediately after becoming adults), were first separated, revealing that A. mellifera adult workers have different transcriptome profiles compared to their pupal or larval stage samples. Adult workers perform multiple tasks in colonies such as caring for broods and foraging. Considering these traits, the separation of the transcriptome profiles of A. mellifera adult workers from other samples is justified. Some of the larval worker samples (AmW_d6-9 and d11), and AmQ_d9 samples (larvae) formed one cluster (on the right side). This cluster may reflect the transcriptome profile of larval stages. The other clusters included the AmW_d10 and AmW_d12 samples (larvae); AmW_d13 to d18 (pre-pupa to pupa); and AmQ_d10, d12, and d13 (larvae and pupa). These clusters may reflect the transcriptome profiles of the pre-pupal to early pupal stages. We assumed that the AmW_d10 samples developed earlier than usual¹, although they were from the late larval stage, while AmW_d12 and AmQ_d10 samples were from the late larval stage, but their transcriptome profiles were shifted to those in the pre-pupal and pupal stages prior to the morphological changes from larva to pre-pupa. The AmQ_d12_1 sample was located at an isolated position, possibly because the developmental time differed from that of the other queen samples. Generally, the developmental duration of the queen (from egg to adult) is shorter than that of the worker¹. AmW_d19-d23 samples (pupae and adults) formed another cluster, possibly reflecting the transcriptome profile of emergence, which drastically changes the morphological traits. The inclusion of the AmW_d19 and d20 samples (mid-pupal stage) in the cluster may be attributed to the shift in transcriptome profiles from AmW_d18 samples to these samples occurring prior to the morphological changes during emergence. In A. cerana japonica, all samples were divided into three clusters (Fig. 7). One cluster included the AcW_d17 and AcW_d18 samples (pupae and adults), which may reflect the transcriptome profiles of the early adult stage or emergence. The other cluster included AcW_d12–d16 samples (pre-pupa and pupa). The third cluster included the AcW_d9 and d10 samples (larvae and pre-pupa). These clusters may reflect transcriptome profiles from the pre-pupa to pupa and the larva or early pre-pupa stages, respectively. These results reflect the biological traits of the A. cerana japonica developmental stages. In conclusion, both clustering results suggest that the transcriptome data in this study are reliable and can be used as comprehensive reference expression data (reference transcriptome data).

Code availability

The code files for transcriptome analysis shown in Fig. 5 and hierarchical clustering analysis using R as described in “Technical Validation” section are available in figshare³⁶.

References

Winston, M. The Biology of the Honey Bee. (Harvard University Press, 1991).
Weinstock, G. M. et al. Insights into social insects from the genome of the honeybee Apis mellifera. Nature 443, 931–949 (2006).
Article ADS CAS Google Scholar
Wallberg, A. et al. A hybrid de novo genome assembly of the honeybee, Apis mellifera, with chromosome-length scaffolds. BMC Genomics 20, 275 (2019).
Article PubMed PubMed Central Google Scholar
Lago, D. C., Hasselmann, M. & Hartfelder, K. Sex- and caste-specific transcriptomes of larval honey bee (Apis mellifera L.) gonads: DMRT A2 and Hsp83 are differentially expressed and regulated by juvenile hormone. Insect Mol Biol 31, 593–608 (2022).
Article CAS PubMed Google Scholar
Evans, J. D. et al. Immune pathways and defence mechanisms in honey bees Apis mellifera. Insect Mol Biol 15, 645–656 (2006).
Article CAS PubMed PubMed Central Google Scholar
Badaoui, B. et al. RNA-sequence analysis of gene expression from honeybees (Apis mellifera) infected with Nosema ceranae. PLoS One 12, e0173438 (2017).
Article PubMed PubMed Central Google Scholar
Cornman, R. S., Lopez, D. & Evans, J. D. Transcriptional Response of Honey Bee Larvae Infected with the Bacterial Pathogen Paenibacillus larvae. PLOS ONE 8, e65424 (2013).
Article ADS CAS PubMed PubMed Central Google Scholar
Yokoi, K., Wakamiya, T. & Bono, H. Meta-Analysis of the Public RNA-Seq Data of the Western Honeybee Apis mellifera to Construct Reference Transcriptome Data. Insects 13, 931 (2022).
Article PubMed PubMed Central Google Scholar
Asian Beekeeping in the 21st Century https://doi.org/10.1007/978-981-10-8222-1 (Springer, Singapore, 2018).
Yoshiyama, M. & Kimura, K. Bee Diversity and Current Status of Beekeeping in Japan. in Asian Beekeeping in the 21st Century (eds. Chantawannakul, P., Williams, G. & Neumann, P.) 223–245, https://doi.org/10.1007/978-981-10-8222-1_10 (Springer, Singapore, 2018).
Ono, M., Igarashi, T., Ohno, E. & Sasaki, M. Unusual thermal defence by a honeybee against mass attack by hornets. Nature 377, 334–336 (1995).
Article ADS CAS Google Scholar
Ugajin, A. et al. Detection of Neural Activity in the Brains of Japanese Honeybee Workers during the Formation of a “Hot Defensive Bee Ball. PLOS ONE 7, e32902 (2012).
Article ADS CAS PubMed PubMed Central Google Scholar
Park, D. et al. Uncovering the novel characteristics of Asian honey bee, Apis cerana, by whole genome sequencing. BMC genomics 16, 1 (2015).
Article PubMed PubMed Central Google Scholar
Diao, Q. et al. Genomic and transcriptomic analysis of the Asian honeybee Apis cerana provides novel insights into honeybee biology. Scientific Reports 8, 1–14 (2018).
Article ADS Google Scholar
Wang, Z.-L. et al. A Chromosome-Scale Assembly of the Asian Honeybee Apis cerana Genome. Frontiers in Genetics 11, 279 (2020).
Article CAS PubMed PubMed Central Google Scholar
Yokoi, K. et al. The draft genome sequence of the Japanese honey bee, Apis cerana japonica (Hymenoptera: Apidae). European Journal of Entomology 115 (2018).
Masuoka, Y. et al. Genome assembly reconstruction of the Japanese honey bee, Apis cerana japonica (Hymenoptera: Apidae), using homology-based assembly and nanopore long-reads. 2023.07.26.550500 Preprint at https://doi.org/10.1101/2023.07.26.550500 (2023).
Kamioka, T. et al. Genes associated with hot defensive bee ball in the Japanese honeybee, Apis cerana japonica. BMC Ecology and Evolution 22, 31 (2022).
Article CAS PubMed PubMed Central Google Scholar
Bono, H., Sakamoto, T., Kasukawa, T. & Tabunoki, H. Systematic Functional Annotation Workflow for Insects. Insects 13, 586 (2022).
Article PubMed PubMed Central Google Scholar
Yokoi, K. et al. Detailed explanation for the samples. Figshare https://doi.org/10.6084/m9.figshare.28573733 (2025).
Kim, D., Paggi, J. M., Park, C., Bennett, C. & Salzberg, S. L. Graph-based genome alignment and genotyping with HISAT2 and HISAT-genotype. Nat Biotechnol 37, 907–915 (2019).
Article CAS PubMed PubMed Central Google Scholar
Pertea, M., Kim, D., Pertea, G. M., Leek, J. T. & Salzberg, S. L. Transcript-level expression analysis of RNA-seq experiments with HISAT, StringTie and Ballgown. Nat Protoc 11, 1650–1667 (2016).
Article CAS PubMed PubMed Central Google Scholar
Pertea, M. et al. StringTie enables improved reconstruction of a transcriptome from RNA-seq reads. Nat Biotechnol 33, 290–295 (2015).
Article CAS PubMed PubMed Central Google Scholar
Pertea, G. & Pertea, M. GFF Utilities: GffRead and GffCompare. F1000Research 9, ISCB Comm J (2020).
Manni, M., Berkeley, M. R., Seppey, M., Simão, F. A. & Zdobnov, E. M. BUSCO Update: Novel and Streamlined Workflows along with Broader and Deeper Phylogenetic Coverage for Scoring of Eukaryotic, Prokaryotic, and Viral Genomes. Molecular Biology and Evolution 38, 4647–4654 (2021).
Article CAS PubMed PubMed Central Google Scholar
Haas, B. J. et al. De novo transcript sequence reconstruction from RNA-seq using the Trinity platform for reference generation and analysis. Nat Protoc 8, 1494–1512 (2013).
Article CAS PubMed Google Scholar
Bray, N. L., Pimentel, H., Melsted, P. & Pachter, L. Near-optimal probabilistic RNA-seq quantification. Nature Biotechnology 34, 525–527 (2016).
Article CAS PubMed Google Scholar
DNA Data Bank of Japan https://ddbj.nig.ac.jp/resource/sra-submission/DRA016719 (2024).
DNA Data Bank of Japan https://ddbj.nig.ac.jp/resource/sra-submission/DRA014424 (2024).
Yokoi, K. et al. ID list of Apis mellifera and Apis cerana japonica RNA-Seq data. figshare https://doi.org/10.6084/m9.figshare.27174441 (2024).
Article Google Scholar
Yokoi, K. et al. Apis cerana japonica transcript data, transcript sequence data, predicted amino acid sequence data, functional annotation data. Figshare https://doi.org/10.6084/m9.figshare.27175734 (2024).
Yokoi, K. et al. Tpm data of Apis mellifera queen and worker. Figshare https://doi.org/10.6084/m9.figshare.27157614 (2024).
Yokoi, K. et al. Tpm data of Apis cerana japonica worker. Figshare https://doi.org/10.6084/m9.figshare.27157632 (2024).
Yokoi, K. et al. Kallisto output files. Figshare https://doi.org/10.6084/m9.figshare.27175941 (2024).
Yokoi, K. et al. RNA-Seq data stats. Figshare https://doi.org/10.6084/m9.figshare.28579118 (2025).
Yokoi, K. et al. Scripts of RNA-Seq data analyses and Rscripts for Hierarchical clustering analyses. Figshare https://doi.org/10.6084/m9.figshare.27175698 (2024).

Download references

Acknowledgements

This work was supported by the Center of Innovation for Bio-Digital Transformation (BioDX) and an open innovation platform for industry-academia co-creation (COI-NEXT) of JST (COI-NEXT, JPMJPF2010) to K.Y., M.H., S.K., A.J., H.B., and K.K. Parts of Figs. 1, 2, and 5 were drawn using illustrations from TogoTV (© 2016 DBCLS TogoTV, CC-BY-4.0 https://creativecommons.org/licenses/by/4.0/deed.ja). We would like to thank Editage (www.editage.jp) for English language editing. We express thankfulness to Dr. Shotaro Mine at Institute of Agrobiological Sciences NARO for the preparations of A. mellifera queen photographs used in Fig. 3.

Author information

Authors and Affiliations

Insect Design Technology Group, Division of Insect Advanced Technology, Institute of Agrobiological Sciences, National Agriculture and Food Research Organization (NARO), 1-2 Owashi, Tsukuba, Ibaraki, 305-8634, Japan
Kakeru Yokoi, Masatsugu Hatakeyama, Seigo Kuwazaki, Akiya Jouraku & Kiyoshi Kimura
Biodiversity Conservation and Utilization Group, Division of Agroecosystem Management Research, Institute of Agro-Environmental Sciences, National Agriculture and Food Research Organization (NARO), 3-1-3 Kannondai, Tsukuba, Ibaraki, 305-8604, Japan
Taro Maeda
Smart Livestock Facilities Group, Division of Advanced Feeding Technology Research, Institute of Livestock and Grassland Science, National Agriculture and Food Research Organization (NARO), 2 Ikenodai, Tsukuba, Ibaraki, 305-0901, Japan
Mikio Yoshiyama, Mari Horigane-Ogihara & Kiyoshi Kimura
Graduate School of Life and Environmental Sciences, University of Tsukuba, 1-1-1, Tennoudai, Tsukuba, 305-8572, Japan
Shigeru Matsuyama
Laboratory of BioDX, PtBio Co-Creation Research Center, Genome Editing Innovation Center, Hiroshima University, 3-10-23 Kagamiyama, Higashi-Hiroshima, Hiroshima, 739-0046, Japan
Hidemasa Bono
Laboratory of Genome Informatics, Graduate School of Integrated Sciences for Life, Hiroshima University, 3-10-23 Kagamiyama, Higashi-Hiroshima, Hiroshima, 739-0046, Japan
Hidemasa Bono
PtBio Inc., 3-10-23 Kagamiyama, Higashi-Hiroshima, Hiroshima, 739-0046, Japan
Kiyoshi Kimura

Authors

Kakeru Yokoi
View author publications
Search author on:PubMed Google Scholar
Masatsugu Hatakeyama
View author publications
Search author on:PubMed Google Scholar
Seigo Kuwazaki
View author publications
Search author on:PubMed Google Scholar
Taro Maeda
View author publications
Search author on:PubMed Google Scholar
Mikio Yoshiyama
View author publications
Search author on:PubMed Google Scholar
Mari Horigane-Ogihara
View author publications
Search author on:PubMed Google Scholar
Shigeru Matsuyama
View author publications
Search author on:PubMed Google Scholar
Akiya Jouraku
View author publications
Search author on:PubMed Google Scholar
Hidemasa Bono
View author publications
Search author on:PubMed Google Scholar
Kiyoshi Kimura
View author publications
Search author on:PubMed Google Scholar

Contributions

K.Y., M.H., H.B. and K.K. conceived the study. M.H., S.K., T.M., M.Y., M.O.H., S.M. and K.K. prepared the RNA samples from A. mellifera and A. cerana japonica for RNA-seq. K.Y., A.J. and H.B. performed the bioinformatics data analysis and data registration. S.M. prepared the photographs of the honey bees. K.Y. wrote the original draft of the manuscript. All authors reviewed and edited the draft of the manuscript. All the authors have read and agreed to the published version of this manuscript.

Corresponding authors

Correspondence to Kakeru Yokoi or Hidemasa Bono.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

Reprints and permissions

About this article

Cite this article

Yokoi, K., Hatakeyama, M., Kuwazaki, S. et al. Comprehensive expression data for two honey bee species, Apis mellifera and Apis cerana japonica. Sci Data 12, 926 (2025). https://doi.org/10.1038/s41597-025-05279-z

Download citation

Received: 20 December 2024
Accepted: 27 May 2025
Published: 02 June 2025
Version of record: 02 June 2025
DOI: https://doi.org/10.1038/s41597-025-05279-z