Introduction

The holobiont concept has been proposed for invertebrate vectors, in which the host genome should not be considered as the only compartment at play in vector transmission1. It has been shown on several occasions that microparasites infecting vectors have major consequences on invertebrate biology and should thus be considered as an intrinsic component of the vector. These microparasites include fungi, viruses and symbiotic bacteria, all of which may drastically affect several life history traits of their invertebrate host, including its vectorial competence. For mostly technical reasons, the majority of investigations to date have explored the diversity of bacterial communities in arthropod vectors, as well as the origin of community composition2,3. Such communities include core and non-core microbes, roughly equivalent to essential and non-essential microbes, respectively. The exploration of these aspects using bacterial communities has revealed that microbiota composition can be environmentally driven, and depends on a number of variables such as sampling site, season or vertebrate host species4. By contrast, some obligate symbionts are vertically transmitted and may hence display co-evolutionary patterns with their hosts. Indeed, obligate symbionts (also referred to as mutualists) provide essential nutrients and vitamins to arthropod hosts with sometimes extremely restricted diets, such as bat flies5, ticks6 or aphids7. Endosymbionts such as vertically transmitted Wolbachia are well-known sex manipulators that favour the fitness of infected females over that of males and uninfected females. These symbionts may in turn provide their hosts with protection against predators, as established for aphids8, or against viral infection9,10, which can be used to control the transmission of arboviruses by major mosquito vectors such as Aedes albopictus or Aedes aegypti11,12.

Fleas (Insecta, Siphonaptera) are a diversified group of hematophagous ectoparasites represented worldwide by over 2500 described species and subspecies13. These arthropods occupy a wide range of habitats and hosts, and are distributed unevenly across the globe14. Although the greatest diversity of fleas is found in the Palaearctic region15, fleas are found on all continents, including in extreme environments such as Antarctica where Glaciopsyllus antarcticus has been identified in seabirds16,17. Adult fleas feed parasitically on the blood of their vertebrate hosts, specifically mammals and birds. While fleas are rarely specific to their vertebrate host, some taxa associate with a particular host group, as exemplified by Xenopsylla spp., which are found specifically on rodents18. Fleas are known vectors of pathogens of significant medical importance such as the causative agents of plague (Yersinia pestis)19,20, murine typhus (Rickettsia typhi)21 or cat scratch disease (Bartonella henselae)22.

These vectors are of major medical importance on southwestern Indian Ocean islands, especially Madagascar, which reports among the highest incidence rates of plague worldwide23,24,25. In Madagascar, 49 flea species have been documented, of which 42 are endemic26. Malagasy fleas provide an interesting biological context for understanding the drivers of bacterial community composition in these disease vectors for two important reasons: (1) the unique endemism of the Malagasy flea fauna allows the long-term evolution of their symbionts to be explored and (2) given the lack of reported host-specificity for endemic fleas from the Central Highlands, this area of the island provides the means to understand whether vertebrate hosts are a strong driver of microbiota composition in fleas.

Using a sample composed of 12 species endemic to Madagascar sampled in different locations on the island, we addressed how biotic (i.e. flea and vertebrate host species) and abiotic (sampling site and season) variables influence the composition of the microbiota of fleas. The resulting data provide new insights into mechanisms driving the composition of bacterial communities in adult fleas, which may in turn have important consequences on the epidemiology of certain zoonotic diseases, especially plague.

Materials and methods

Flea specimens included in the analysis

Five hundred seventy-seven adult fleas sampled in Madagascar (221 from Ambohitantely, 312 from Ankazomivady and 44 from Lakato) were included in the study. This sample is composed of 12 flea species endemic to Madagascar and representing 4 genera: Centetipsylla (C. madagascarensis), Paractenopsyllus (P. vauceli, P. duplantieri, P. grandidieri, P. petiti, P. rouxi and P. raxworthyi), Synopsyllus (S. robici, S. estradei and S. fonquerniei) and Dynopsyllus (D. brachypeten and D. flacourti). These specimens were previously sampled in the context of different research programs aimed at investigating the ecology of vector-borne microbes with zoonotic potential on islands of the southwestern Indian Ocean27,28. Malagasy fleas were collected on both introduced (Mus musculus, Rattus rattus, Suncus murinus) and endemic Malagasy terrestrial mammals. The endemic Malagasy terrestrial mammals include 4 endemic rodent species of the subfamily Nesomyinae (Eliurus majori, E. minor, E. tanala and Nesomys rufus) and 10 endemic tenrecs (Hemicentetes nigriceps, Microgale cowani, M. dobsoni, M. longicaudata, M. gymnorhyncha, M. parvula, M. soricoides, M. taiva, Oryzorictes hova and Setifer setosus)29. A subsample from Ambohitantely was selected for exploring the impact of season on microbial composition, as this site is located in a plague focus. This site was sampled both in winter (May 2014) and at the beginning of summer (October 2014), corresponding to the annual human plague season in Madagascar. The climatic aspects of seasonality at Ambohitantely include annual rainfall of about 1460 mm, of which 86% falls between November and April, a cold season extending from May to August, and a warm season from December to February30.

DNA extraction procedures

Fleas included in the present study were obtained from previous studies27,28. As a result, no extraction controls could be included before Illumina sequencing. DNA extraction from Malagasy fleas was detailed previously28 and followed a non-destructive procedure that maintains the integrity of insect cuticles, allowing morphological diagnosis and the production of molecular barcodes. Briefly, the individuals were placed on filter paper to remove the alcohol. Then, under a binocular microscope, an incision was made in the abdomen with a syringe needle. Each individual was then placed in 100 µl of Instagate solution and PBS solution from the InstaGene™ Matrix kit (Bio-Rad Laboratories, Hercules, California)28. DNA was pooled per location, flea species and host species. Each pool was composed of 2 to 52 DNA samples (see supplementary S1 Table for the composition of each pool).

Description of flea-associated bacteriomes through MiSeq sequencing of the V3-V4 and V4-V5 regions of 16S

In order to address the diversity of flea bacteriomes, both the V3-V4 and V4-V5 regions of the bacterial 16S rRNA were PCR amplified and sequenced on an Illumina MiSeq sequencing platform at Genoscreen (Lille, France). We used both the V3-V4 and V4-V5 regions of the bacterial 16S rDNA in order to limit PCR biases and maximize the detection of bacterial diversity. The 3’ ends of V3-forward (TACGGRAGGCAGCAG) and V4-reverse (GGACTACCAGGGTATCTAAT) or V4-forward (GTGYCAGCMGCCGCGGTAA) and V5-reverse (CCGYCAATTYMTTTRAGTTT) bacterium-specific primers were associated at the 5’ end with multiplex identifier (MID) tags, a GsFLX key, and GsFLX adapters. Each pool was independently amplified twice with distinct MID tags, allowing the individual identification of each pool, as well as of each of the two PCR duplicates. Sequences underwent a first quality control and analysis process before being demultiplexed, assembled into paired-end reads and depleted of primer sequences and low-quality reads. Both V3-V4 and V4-V5 raw datasets were first analysed using the SILVA database for the identification of OTUs. Then, the V3-V4 and V4-V5 datasets were processed separately using a pipeline implemented in Mothur31. The V3-V4 and V4-V5 datasets were depleted of sequences < 350 bp and < 300 bp, respectively. Sequences with ambiguous calls and homopolymers longer than 10 bp were removed. Chimeras were then removed and OTUs were defined by clustering at 97%. OTUs were then aligned on a customized SILVA database including the V3-V5 region of 16S rRNA of bacteria.
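The read-screening criteria stated above (minimum length per region, no ambiguous calls, no homopolymer longer than 10 bp) can be illustrated with a short Python sketch. This is not the Mothur pipeline used in the study; the function and variable names are hypothetical, and the treatment of any non-ACGT character as an ambiguous call is an assumption made for illustration.

```python
import re

# Thresholds taken from the text; all names below are illustrative assumptions.
MIN_LENGTH = {"V3-V4": 350, "V4-V5": 300}
MAX_HOMOPOLYMER = 10


def longest_homopolymer(seq: str) -> int:
    """Return the length of the longest run of one repeated character."""
    runs = re.finditer(r"(.)\1*", seq)
    return max((len(m.group(0)) for m in runs), default=0)


def passes_filters(seq: str, region: str) -> bool:
    """Apply the three screening criteria to a single read."""
    seq = seq.upper()
    if len(seq) < MIN_LENGTH[region]:
        return False  # shorter than the region-specific minimum
    if set(seq) - set("ACGT"):
        return False  # contains an ambiguous call (assumed: any non-ACGT symbol)
    return longest_homopolymer(seq) <= MAX_HOMOPOLYMER


if __name__ == "__main__":
    # Invented reads, for demonstration only.
    reads = {"ok": "ACGT" * 100, "short": "ACGT" * 80, "ambiguous": "ACGT" * 99 + "ACGN"}
    for name, read in reads.items():
        print(name, passes_filters(read, "V3-V4"))
```

A read is kept only if all three checks pass, so the order of the checks affects speed but not the outcome.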

Biodiversity analyses

Both V3-V4 and V4-V5 datasets, each consisting of 49 pools composed of fleas from a single species sampled on the same vertebrate host species, at the same study site and during the same sampling season, were analysed independently using R software (R-4.2.2) in RStudio (2022.07.2). The α-diversity was assessed using the vegan package, which allowed the evaluation of the Chao1 index, measuring the total number of unique taxa, and the inverse of Simpson’s index, estimating the diversity and evenness of communities. Higher values for these indices indicate greater diversity (species richness and abundance). The structure of β-diversity was compared among flea genera and among sampling sites. For this, permutational MANOVA (PERMANOVA) tests with 999 permutations were performed, and non-metric multidimensional scaling (NMDS) ordinations were conducted on Bray–Curtis dissimilarities, calculated from rarefied sequence counts after square root transformation and Wisconsin standardization, to produce plots. Measures of flea microbial diversity were examined in relation to flea species, host species, season, and collection site. The ANOVA test was used to examine differences in diversity measures, and the test was considered significant when p < 0.05. Of the 49 pools, 23 corresponded to fleas collected in Réserve Spéciale d’Ambohitantely in the Central Highlands. This subsample was composed of fleas from five endemic flea species (Paractenopsyllus duplantieri, Paractenopsyllus grandidieri, Paractenopsyllus petiti, Synopsyllus estradei and Synopsyllus fonquerniei) sampled on 7 endemic (Microgale cowani, Microgale dobsoni, Microgale gymnorhyncha, Microgale longicaudata, Microgale majori, Setifer setosus, Eliurus majori) and 1 introduced (Rattus rattus) mammal species.
The two (V3-V4 and V4-V5) datasets from these 23 pools were eventually analysed independently as they allowed addressing the diversity of the flea microbiota at the same site and during two different seasons.
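To make the diversity measures named in this section concrete, the following Python sketch computes a Chao1 estimate, the inverse Simpson index, and a Bray–Curtis dissimilarity on square-root-transformed, Wisconsin-standardized counts. The analyses in this study were run in R with the vegan package; this standalone sketch is only an illustration. It assumes the bias-corrected form of Chao1 and the usual double standardization (each OTU divided by its maximum, then each pool by its total), and the counts are invented.

```python
from math import sqrt


def chao1(counts):
    """Bias-corrected Chao1 richness estimate for one pool of OTU counts."""
    observed = sum(1 for c in counts if c > 0)
    singletons = sum(1 for c in counts if c == 1)
    doubletons = sum(1 for c in counts if c == 2)
    return observed + singletons * (singletons - 1) / (2 * (doubletons + 1))


def inverse_simpson(counts):
    """Inverse Simpson index: 1 / sum of squared relative abundances."""
    total = sum(counts)
    return 1.0 / sum((c / total) ** 2 for c in counts if c > 0)


def sqrt_wisconsin(table):
    """Square-root transform, then scale OTU columns by their maximum and pool rows by their total."""
    rooted = [[sqrt(c) for c in row] for row in table]
    col_max = [max(col) or 1.0 for col in zip(*rooted)]
    scaled = [[v / m for v, m in zip(row, col_max)] for row in rooted]
    return [[v / (sum(row) or 1.0) for v in row] for row in scaled]


def bray_curtis(a, b):
    """Bray-Curtis dissimilarity between two abundance vectors."""
    total = sum(x + y for x, y in zip(a, b))
    if total == 0:
        return 0.0  # two empty pools are treated as identical
    return sum(abs(x - y) for x, y in zip(a, b)) / total


if __name__ == "__main__":
    # Invented OTU counts for two hypothetical pools (rows) and six OTUs (columns).
    pools = [[10, 5, 1, 1, 2, 0], [0, 8, 0, 3, 1, 6]]
    for row in pools:
        print(chao1(row), inverse_simpson(row))
    transformed = sqrt_wisconsin(pools)
    print(bray_curtis(transformed[0], transformed[1]))
```

Chao1 rises with the number of OTUs seen only once, whereas the inverse Simpson index is dominated by the most abundant OTUs, which is why the two indices can disagree for the same pool.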

Results

Composition of flea microbiota

Using the normalised data, a total of 1,380,057 and 1,588,722 reads were obtained for the V3-V4 and V4-V5 regions, respectively. The phyla Proteobacteria (V3-V4 = 88.2%; V4-V5 = 92.4%), Firmicutes (V3-V4 = 7.1%; V4-V5 = 5.1%) and Bacteroidetes (V3-V4 = 3.0%; V4-V5 = 2.5%) accounted for more than 96.0% of the sequences for each of the studied regions (see Supplementary Fig. 1). The analysis of V3-V4 reads revealed 223 bacterial genera, of which 5 represented slightly more than 53% of all reads, namely Wolbachia (19.2%), Thorsellia (12.4%), Sphingomonas (8.5%), Rhizobium (6.8%) and Mesorhizobium (6.6%) (Fig. 1a). The V4-V5 dataset revealed 196 bacterial genera, of which 5 accounted for more than 66% of the reads, namely Wolbachia (35.8%), Rickettsia (10.2%), Rhizobium (7.8%), Sphingomonas (6.5%) and Thorsellia (6.2%) (Fig. 1b).

Season but not host species influences bacterial communities of fleas

We investigated how flea species, vertebrate host species and sampling sites influenced bacterial community composition using the Chao1 and Simpson diversity indices (see supplementary S2 Table for all the indices). The Simpson diversity index differed according to flea species, but this difference was significant for the V4-V5 dataset only (Table 1). Importantly, for both the V3-V4 and V4-V5 datasets, no significant difference was detected in bacterial community diversity when comparing a specific flea species sampled on different hosts (Table 1).

Table 1 ANOVA test P-values of Simpson diversity and Chao1 indices as a function of flea species, host species, host status (endemic or introduced), collection site and season, for V3-V4 and V4-V5 datasets.

We further addressed these patterns using a subsample composed of fleas from Ambohitantely, which were sampled at the same site during two successive seasons. Using this subset, an impact of season was observed with both indices (Chao1 and Simpson diversity indices), with bacterial diversity being much higher during the wet than the dry season, regardless of the 16S region sequenced (Table 1).

Fig. 1

Heatmap of the genera detected at an abundance > 1% in the entire dataset, for the V3-V4 (a) and V4-V5 (b) datasets, according to flea species. Abbreviations of flea genera: C = Centetipsylla, D = Dynopsyllus, P = Paractenopsyllus and S = Synopsyllus.

Fig. 2

Structure of bacterial communities depending on the flea genus using V3-V4 (a) and V4-V5 (b) data sets, and the sampling site using V3-V4 (c) and V4-V5 (d) data sets. NMDS plots were generated using Bray–Curtis distance matrices. The P-value of the PERMANOVA is indicated in the upper right corner.

Importantly, this subsample confirmed that vertebrate hosts did not control bacterial community composition, using both indices and both datasets (see Table 1, Ambo V3-V4 and Ambo V4-V5 columns). By contrast, a difference in bacterial diversity depending on flea species was detected using the Simpson index and the V4-V5 dataset only (Table 1).

Lastly, using PERMANOVA analyses on the whole sample, the bacterial community composition was significantly different between sampling sites for both the V3-V4 (χ² = 3.0653, df = 2, P < 0.001; Fig. 2c) and V4-V5 (χ² = 3.417, df = 2, P < 0.01; Fig. 2d) datasets. No significant difference in bacterial community composition was detected among flea genera or flea species for either dataset.
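For readers unfamiliar with PERMANOVA, the sketch below shows the general logic of a one-way permutation test on a dissimilarity matrix: a pseudo-F ratio is computed from squared distances, and group labels are then shuffled to build a null distribution. It is a minimal Python illustration, not the R procedure used in this study. The toy distances and site labels are invented, and the pseudo-F statistic shown here is the standard one-way form, which may differ from the test statistic reported above.

```python
import random
from itertools import combinations


def pseudo_f(dist, groups):
    """One-way pseudo-F computed from a square matrix of pairwise dissimilarities."""
    n = len(groups)
    labels = sorted(set(groups))
    ss_total = sum(dist[i][j] ** 2 for i, j in combinations(range(n), 2)) / n
    ss_within = 0.0
    for label in labels:
        members = [i for i, g in enumerate(groups) if g == label]
        ss_within += sum(dist[i][j] ** 2 for i, j in combinations(members, 2)) / len(members)
    ss_among = ss_total - ss_within
    a = len(labels)
    return (ss_among / (a - 1)) / (ss_within / (n - a))


def permanova(dist, groups, permutations=999, seed=0):
    """Return the observed pseudo-F and a permutation P-value."""
    rng = random.Random(seed)
    observed = pseudo_f(dist, groups)
    shuffled = list(groups)
    hits = 0
    for _ in range(permutations):
        rng.shuffle(shuffled)
        # Small tolerance so that relabelings equivalent to the observed one count as ties.
        if pseudo_f(dist, shuffled) >= observed - 1e-12:
            hits += 1
    return observed, (hits + 1) / (permutations + 1)


if __name__ == "__main__":
    # Invented example: six pools placed on a line, three per hypothetical site.
    positions = [0, 1, 2, 10, 11, 12]
    dist = [[abs(p - q) for q in positions] for p in positions]
    sites = ["site_A"] * 3 + ["site_B"] * 3
    print(permanova(dist, sites))
```

With only six pools there are few distinct relabelings, so the smallest attainable P-value is bounded well above zero regardless of how strongly the groups separate; pool number therefore limits the resolution of such tests.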

Discussion

The present study aimed at describing the bacterial community composition of a large diversity of endemic fleas from Madagascar and at investigating the biotic and abiotic variables driving the composition of these communities. We first explored the taxonomic composition of bacteria in fleas and then determined the effect of a number of biotic (flea genus and species, vertebrate host species) and abiotic (season, sampling site) variables on the composition of flea microbiota.

For both datasets, three phyla, namely Proteobacteria, Firmicutes and Bacteroidetes, represented over 96.0% of the obtained sequences. These results are comparable to those of previous studies conducted on the flea microbiota in North America and Uganda32,33. At the genus level, both datasets showed that Wolbachia, Thorsellia, Sphingomonas, Rickettsia and environmental bacteria (Mesorhizobium and Rhizobium) were dominant and represented over 53% of reads.

It is important to note that the study was carried out using DNA of fleas that were extracted for other purposes27,28,34, and for which no extraction control was originally included. Therefore, the presented data likely include contaminant reads resulting from DNA extraction that we could not remove from the raw sequence data. However, all fleas were extracted in the same way and in the same batch28, and contaminant DNA is thus likely homogeneous throughout the samples. Therefore, although the presented community composition includes possible extraction contaminants, we consider that differences in community composition depending on biotic or abiotic variables result from each analysed variable.

Our analyses showed that the α-diversity of the microbiota is influenced by flea species, but only for the V4-V5 dataset. By contrast, vertebrate host species and sampling sites did not significantly influence the α-diversity of flea microbiota (see Table 1). In addition, PERMANOVA analysis revealed that β-diversity was influenced by sampling site but not by flea species or genus. We then analysed fleas from the Ambohitantely site independently in order to control for the effect of sampling site, address the importance of season in community composition and further test the importance of vertebrate host in driving bacterial community composition. The diversity of the flea microbiota at this locality appeared higher during the wet than during the dry season. Importantly, the bacterial community composition was not significantly affected by vertebrate host (see Table 1), confirming the pattern detected with the whole dataset. This strongly supports the view that bacterial community composition is driven by environmental factors, as exemplified by the importance of season at this sampling site.

One of the most striking results of the present analyses is that the microbiota of fleas in general appeared to be influenced by abiotic variables such as season but not by vertebrate host species, which differs from what is reported for other host-parasite systems, such as seabird ticks4. Indeed, ticks feed on their hosts at all developmental stages and the composition of their microbiota is largely influenced by their host species35,36. The pattern observed here strongly suggests that most of the flea bacterial community is acquired by these invertebrates before they parasitize their vertebrate host. Fleas are characterized by a detritivorous off-host larval stage37,38. Larvae feed on organic matter present in their environment (e.g. excrement of arthropod or vertebrate hosts). We propose that a fraction of the microbial diversity associated with this environment, including microbes present in the larval diet, could be acquired by larval fleas and then trans-stadially transmitted to older stages.

The microbiota of arthropod vectors is increasingly recognized as controlling the vector capacity of their hosts, and is even used to block, for example, the transmission of arboviruses by the mosquito Aedes aegypti. Given the medical importance of plague in Madagascar, it is of utmost importance to address whether some habitats could favour bacterial community structures conducive to the transmission of Yersinia pestis by these vectors. Bubonic plague transmission in Madagascar is largely restricted to the Central Highlands, with the notable exception of Mahajanga, a harbour town on the northwest coast where Y. pestis, transmitted by fleas of the introduced shrew Suncus murinus, is not stably maintained but is recurrently introduced from the Central Highlands39. The distribution of bubonic plague in the Central Highlands is thought to be controlled by the distribution of the main vectors, i.e. introduced Xenopsylla cheopis and endemic Synopsyllus fonquerniei. Data presented herein strongly suggest that the environment shapes the microbial community of fleas, which might in turn have consequences on vector competence, in line with experimental infections carried out on fleas raised on different substrates32. Our data, together with previously published investigations32,33, support the view that flea microbiota is mostly structured by the environment, which in turn has consequences for Y. pestis transmission. It is thus important to address the role of the environment in plague epidemiology in Madagascar.