Abstract
Ticks are arthropod vectors capable of transmitting a wide spectrum of pathogens affecting humans and animals. However, we have relatively limited information of their genomic characteristics and the diversity of associated microbiomes. Here we used long- and short-read sequencing on 1,479 samples from 48 tick species across eight genera from China to determine their genome and associated pathogens and microbiome. Through de novo assembly, we reconstructed 7,783 bacterial genomes representing 1,373 bacterial species, of which, 712 genomes represented 32 potentially pathogenic species. Computational analysis found nutritional endosymbionts to be prevalent and highly specific to tick genera. The microbiome genome-wide association study revealed host genetic variants linked to pathogen diversity, abundance and key biological pathways essential to tick biology, including blood-feeding and pathogen invasion. These findings provide a resource for studying the host–microbe interactions within ticks, paving the way for strategies to control tick populations and tick-borne diseases.
Similar content being viewed by others
Main
Ticks are obligate blood-feeding ectoparasites that transmit pathogens to humans. As vectors, ticks are capable of spreading a diverse array of bacterial, viral and parasitic pathogens to both humans and animals1,2. The geographical expansion and global spread of certain tick populations3,4,5,6,7, along with the increasing prevalence of endemic and emerging tick-borne diseases8,9,10,11, have raised substantial concerns for public and animal health worldwide.
Ticks harbour complex microbial communities due to their widespread distribution, unique feeding behaviour, diverse range of animal hosts and remarkable adaptability to various environments12,13. Microbiome studies on certain tick species have revealed substantial bacterial diversity14,15,16,17,18,19,20,21,22, providing valuable taxonomic insights into specific tick species. However, the microbiomes of most tick species remain largely unexplored. A systematic investigation of large-scale samples covering various tick genera and species across different eco-geographies, feeding statuses and other ecological traits is urgently needed. Our previous research highlighted the multifactorial influences of ecogeographical factors and tick genetic variations on the distribution of tick-borne bacterial pathogens23. However, to fully elucidate the complex tripartite interactions among ticks, pathogens and microbiota, comprehensive studies encompassing a broad range of tick species and environmental niches are essential.
In this Article, we used both short-read (Illumina) and long-read (nanopore) sequencing technologies to conduct a large-scale hologenomic analysis of 1,479 samples (1,460 processed via Illumina sequencing and 19 via nanopore sequencing) from 48 tick species across China. Our objectives were to explore the diversity of tick-associated bacteria and the factors influencing their composition, identify potential tick-borne pathogens, investigate the role of bacterial endosymbionts and their relationships with ticks and pathogens, and elucidate the link between tick genetic variability and pathogen carriage. By leveraging this extensive dataset, we provide insights into the tripartite interactions among ticks, tick-borne pathogens and microbiota, aiming to advance strategies for the control of ticks and tick-borne infections in both humans and animals.
Results
Sequencing of 1,479 samples covering representative tick species
To comprehensively characterize tick genomes and their associated microbiomes, we collected over 16,000 individual adult ticks representing 48 tick species across 8 genera from all 31 provinces, municipalities and autonomous regions of mainland China, encompassing diverse ecological faunas (Fig. 1a). Individual ticks were pooled into 1,460 samples based on species, sex, collection site and blood-feeding status, with each pooled sample undergoing whole-genome shotgun sequencing (Extended Data Fig. 1 and Supplementary Tables 1–4). In each genus of ticks, there is a primary representative species (Fig. 1a,b).
a, Map of collection of samples. The size of the circle represents the number of tick samples collected in the area, and the colours represent different genera of ticks. Different backgrounds on the map represent different ecological faunas. b, Number of samples of different tick species and genus. Dark colours indicate female ticks, while light colours represent male ticks. The principal representative species for each tick genus are as follows: H. longicornis for Haemaphysalis, D. silvarum for Dermacentor, R. microplus for Rhipicephalus, I. persulcatus for Ixodes, H. asiaticum for Hyalomma and A. javanense for Amblyomma. c, Phylogenetic tree of tick species based on the mitochondrial genome with mite as outgroup. Different background colours represent different tick genera, while dots indicate tick species that have undergone nanopore sequencing. d, Correlation between mitochondrial dissimilarity and geographical distance (km) within each genus. The vertical axis represents the geographical distance between samples, while the horizontal axis represents the mitochondrial dissimilarity among tick species represented by the samples. The crosses represent the 99% confidence intervals for mitochondrial dissimilarity (x axis) and geographical distance (y axis), centred on the mean values calculated within each genus. The colours of different points represent different genera of ticks. The dashed line represents the Pearson’s correlation between them (r = 0.24, P = 3.06 × 10−15, two-sided, n = 1,020), with the shaded error band indicating the 95% confidence interval around the estimated mean trend based on all individual samples. e, Phylogenetic tree of 19 tick genomes obtained in this study with P. tepidariorum as outgroup based on 350 single copy orthologues. The estimated divergence time between clades is labelled on the branch nodes. Geological eras are marked on the right side of the timeline. Ma, million years ago.
To explore the potential relationship between host genetic differentiation and adaptation to ecological environments in connection with the tick microbiome, we first de novo assembled the mitochondrial genomes of all 48 tick species from the whole-genome sequencing data. Mitochondria-based phylogeny corroborated the morphological classification of each tick species (Fig. 1c and Supplementary Note 1), reinforcing the accuracy of host background information in this study. To further explore the connection between geographic distribution and genetic evolution, we analysed genetic distances among the 1,460 samples relative to their geographical separations. Pearson’s correlation revealed a positive relationship between geographical distance and genetic divergence, with greater genetic divergence observed over larger distances (r = 0.24, P = 3.06 × 10−15; Fig. 1d), suggesting genetic evolutionary variations in ticks may play a crucial role in their adaptation to diverse ecological environments24.
As nucleic acids were extracted from whole ticks, it is essential to eliminate host-derived sequence interference in the metagenomic data. Currently, among the tick species distributed in China, only six have published genome assemblies, which were shown in our previous study23. To compensate for the lack of genomic information, we pooled samples of each tick species collected from various sites for nanopore sequencing (Extended Data Fig. 1). Through de novo assembly, we successfully generated draft genomes for 19 tick species25 (Supplementary Table 5), with 13 of them being newly sequenced (Fig. 1e). The 19 tick genomes provided useful levels of genome completeness (81 ± 11%) but low contiguity (N50, the length of the shortest contig for which longer and equal length contigs cover 50% of the assembly, = 70 ± 61 kilobases (Kb)), possibly due to low coverage sequencing and high within-species variability (Supplementary Note 2). Leveraging these draft genomes, we effectively filtered out tick sequences, retaining a median of 15.6% of Illumina reads per sample for microbiome analysis. This approach also facilitated the identification of 33 circular bacterial genomes from the nanopore sequencing data, establishing a critical foundation for subsequent microbiome investigations.
Discovery of 7,783 bacterial genomes
By integrating short-read and long-read sequencing data, we used a de novo metagenome assembly pipeline26,27, and a total of 10,702 metagenome-assembled genomes (MAGs) were obtained. Subsequently, after a quality control step, we focused on 7,783 MAGs with medium quality or better, meeting criteria of at least 50% completeness and less than 10% contamination as Genomic Standards Consortium recommended28. The mean and median sequencing coverage depth per assembly were 42.3× and 28.0×. Based on the average nucleotide identity (ANI) calculation between these MAGs, 1,373 clusters were generated (Supplementary Table 6).
We selected the highest quality representative MAGs (rMAGs) from each cluster to construct a maximum likelihood phylogenetic tree. The 1,373 rMAGs were predominantly classified into eight bacterial classes (Fig. 2a). Among the tick genera analysed, we obtained 2,952 bacterial genomes from 351 Rhipicephalus samples and 2,025 bacterial genomes from Haemaphysalis. The genus Argas, sampled from only two samples, yielded 32 bacterial genomes (Fig. 2b).
a, Maximum-likelihood phylogenetic tree of the 1,373 rMAGs assembled in this study. The size of the leaf node represents the genome counts for each MAG cluster, and the colour of the leaf node represents the taxonomy of rMAGs at the class level. The triangular markers indicate that the rMAGs are supported by nanopore sequencing data. From the inner rings to the outer rings, four types of information: completeness (T1), genus of tick samples (T2), ANI with most similar genome in public (T3) and N50 (T4) are labelled. 100K, 100 thousand; 1M, 1 million. b, The counts of bacterial genomes assembled from each tick genus. High-quality (completeness ≥ 95% and contamination < 5%) genomes are indicated in dark grey, while medium-quality (completeness ≥ 50% and contamination < 10%) genomes are shown in light grey. c, The count of genomes in MAG clusters containing top 50 counts. Each column represents a MAG cluster. Top three frequent genera Coxiella, Rickettsia and Staphylococcus were marked with dots of different shapes. d, The distribution of protein similarity between proteins identified in rMAGs and the most similar proteins in the NCBI non-redundant (NR) protein sequence database is displayed. The rMAGs annotated as Gammaproteobacteria, Actinobacteria, Bacilli, Alphaproteobacteria, Rickettsiales and Legionellales are showcased. e, Overall microbial carriage similarity among ticks with sample counts over 10. In the similarity heat map, orange indicates high similarity in microbiome carriage, while blue indicates low similarity.
It is worth noting that among the top 50 clusters, genomes of Coxiella and Coxiella-like bacteria were the most predominant, followed by Rickettsia, both of which are well recognized as tick-associated bacteria (Fig. 2c). Of the 1,373 rMAGs, approximately two-thirds were tentatively classified as undefined species, as determined by ANI values below 95% (ref. 29). Protein sequences from rMAGs with ANI above 95% showed nearly 100% similarity to known bacteria in the NCBI non-redundant (NR) protein sequence database30, whereas those from the remaining 908 rMAGs were highly distinct from any known bacterial species (Fig. 2d). These findings highlight the vastly unexplored diversity of the tick-associated microbiota, emphasizing the potential for previously undescribed bacterial species yet to be characterized.
Subsequently, we identified the host- and region-specific characteristics of the tick microbiome by analysing variations in microbial communities across different tick genera, species and geographical regions. By comparing the microbial carriage similarity among 18 tick species with sample sizes exceeding 10, we identified six distinct clusters based on microbial composition patterns31. Clusters I to III appear to be predominantly influenced by geographic distribution, whereas the composition of clusters IV and V may be largely explained by their shared taxonomic lineage (Fig. 2e and Supplementary Note 3). These findings underscore both geographic and taxonomic influences in shaping tick-associated microbiomes, providing key insights into the intricate relationships between ticks, their microbial communities and environmental factors.
Characterization of tick microbiome according to ecotypes
To delve deeper into the factors shaping the tick microbiome, we performed taxonomic profiling to examine the relationships between microbial composition and ecogeographical factors, animal hosts and biological traits of ticks. We introduced the synonymous term ‘ecotype’ (ET) to categorize tick microbiomes based on their microbial abundance profiles and investigate the characteristics and relationships with these classifications. Five distinct tick microbiome ecotypes were identified (Fig. 3a), each dominated by specific bacterial taxa (Fig. 3b, Extended Data Fig. 2a and Supplementary Note 4). Significant variations in the proportions of dominant bacteria were observed across ecotypes (Extended Data Fig. 2b), with median proportions exceeding 50% in ET1, ET2 and ET3 (Extended Data Fig. 2c). Alpha diversity differed markedly among ecotypes, showing a progressive increase from ET1 to ET5 (Fig. 3c).
a, Hierarchical clustering of tick microbiome samples and the corresponding ecotype assignment, tick host information and taxonomic profiles. The sample information from top to bottom includes ecotype, parasitized statuses, engorgement, gender and tick genus. The top nine genera of highest abundance are displayed. F, female; M, male. b, Potential of Heat-Diffusion for Affinity-Based Transition Embedding (PHATE) dimension reduction results for tick microbiome samples. Dimn, dimension. The number following each ecotype designated as ETn indicates the count of representative samples within each ecotype. The average relative abundance of microbial genera is displayed for representative samples. c, Alpha diversity of samples corresponding to the five ecotypes. In each box plot, the centre line indicates the median, the edges of the box represent the first and third quartiles, and the whiskers extend to span a 1.5 interquartile range from the edges. Different letters indicate statistically significant differences at P < 0.05 by analysis of variance. d, Correlations between different ecotypes and overall sample characteristics (including tick, animal hosts, geographic locations and environmental factors) were determined using permutation tests. Five colours represent five different ecotypes, while grey represents the overall sample. The abbreviations are as follows: Tmin, mean minimum temperature; Tmean, mean temperature; RHmean, mean relative humidity; Prmean, mean daily precipitation; Prmax, maximum precipitation; Tmax, mean maximum temperature; ASD, monthly average sunshine duration; AWS, average wind speed per 2 min; AAP, average atmospheric pressure. The grey text represents environmental factors, while the black text indicates characteristics of ticks. e, The model of environmental influences on tick microbiome and significant factors affecting high-abundance genera. Link diagram on the right shows the factors influencing the specific microbial taxa using the random forest model. Panel e created with Figdraw.com.
To examine the influence of ticks, their hosts, habitats and environmental conditions on tick-associated microbial communities, we used permutational multivariate analysis of variance (PERMANOVA) based on Bray–Curtis distances between samples32. It is worth noting that environmental factors related to humidity and temperature exerted significant influences on the microbiome. Across different ecotypes, ET2 and ET4 showed similar patterns of environmental influence, particularly with respect to tick species, host, region, mean relative humidity, mean daily precipitation and maximum precipitation (Fig. 3d and Extended Data Fig. 2d), indicating shared environmental drivers or interactions shaping microbial community dynamics.
To further explore how tick species, hosts and geographical locations correlate with the ecotypes of tick microbial communities, we used chi-square tests to assess the associations between ecotypes and factors. ET1 showed a notable positive correlation with specific tick species, their free-living status and their prevalence in Northeast China and Inner Mongolia–Xinjiang region (Extended Data Fig. 2e). As shown in Fig. 3a, ET1 was characterized by a dominance of Rickettsia, which are vertically transmitted, thereby reducing the reliance on bacterial acquisition from animal hosts. This underscores the importance of free-living status as a key determinant for ET1.
In addition, to analyse the interplay between tick species, geographical regions and hosts on microbial communities, we systematically controlled for two of these factors and evaluated the contribution of the remaining factor to microbiome variation (Extended Data Fig. 3a). The results indicated that when ticks fed on cows and were parasitized, geographical regions did not substantially influence microbial communities. However, all other comparisons revealed substantial differences, highlighting substantial variability in tick microbiomes under the combined influence of these factors. Using the most abundant genus, Haemaphysalis, as a case study, we investigated the impacts of regional and host factors on microbiome composition. Host-attached samples showed a higher probability of their microbial communities being classified as ET2 or ET4, whereas free-living samples harboured greater abundances of Actinobacteria, α-proteobacteria and β-proteobacteria. Ticks parasitizing breeds of cattle or sheep were predominantly associated with bacteria from the Staphylococcaceae family (Extended Data Fig. 3c,d). Significant differences in microbial composition were observed between parasitized and free-living ticks across various tick species (Extended Data Fig. 2e). Random forest analysis further reinforced that tick species and animal hosts exerted the strongest influence on microbiome composition (Fig. 3e). Among them, Rickettsia showed a dynamic response to multiple environmental and biological factors, while both Rickettsia and Francisella exceed 60%, indicating that their distribution patterns were strongly influenced by environmental variables (Supplementary Table 7).
Taxonomic and functional diversity of tick-associated bacteria
The tick microbiome is highly complex, consisting primarily of environmental bacteria and tick-associated bacteria13. The latter typically include species from the genera Rickettsia, Coxiella, Anaplasma, Ehrlichia, Francisella and Borrelia. To explore this complexity, we extracted genomes annotated to these six genera from 1,373 rMAGs and identified 11 Rickettsia species encompassing 582 MAGs, 26 Coxiella species with 624 MAGs, 11 Anaplasma species with 95 MAGs, 8 Ehrlichia species with 30 MAGs, 3 Francisella species with 89 MAGs and 3 Borrelia species with 5 MAGs. By comparing these genomes to those of known bacteria, we identified six Rickettsia species with 111 MAGs, seven Anaplasma species with 38 MAGs, six Ehrlichia species with 20 MAGs and two Borrelia species with 4 MAGs, all of which had not been previously characterized (Extended Data Fig. 4a). While 24 out of 26 Coxiella species and all Francisella species were initially classified as undefined bacteria, based on the current understanding of tick-borne microbes33,34, we have reclassified them as Coxiella-like endosymbionts (Coxiella-LE) and Francisella-like endosymbionts (Francisella-LE).
We constructed a phylogenetic tree using the 582 rickettsial genomes obtained in this study, along with all 149 rickettsial genomes deposited in the Reference Sequence Database. The MAGs in this study belong to three of the four known Rickettsiae groups: the spotted fever group (SFG), the transitional group and the typhus group. For the MAGs in the SFG, most of them were classified into one species, yet the phylogenetic tree reveals high genetic diversity within SFG (Fig. 4a). Different tick species tended to carry different Rickettsia species or subspecies: Dermacentor silvarum, Rhipicephalus microplus and Rhipicephalus turanicus carried SFG Rickettsia, Ixodes persulcatus carried ancestral group Rickettsia, and Haemaphysalis montgomeryi primarily carried typhus group Rickettsia (Extended Data Fig. 4b). In addition to assembling genomes of four known Anaplasma species, we obtained genomes for seven previously undescribed Anaplasma species (Extended Data Fig. 4a). Phylogenetic analysis showed that three undefined species formed into a distinct branch closely related to Anaplasma phagocytophilum (Fig. 4b), a notorious human and animal pathogen. For Ehrlichia, we assembled genomes of two known species (Ehrlichia minasensis and Ehrlichia muris) and six unknown Ehrlichia species. The three previously uncharacterized species named Candidatus Ehrlichia granulatus, Candidatus Ehrlichia microplus-1 and Candidatus Ehrlichia javanensis were genetically close to Ehrlichia sp. HF, Ehrlichia canis and Ehrlichia ruminantium, respectively (Fig. 4c). The discovery of these previously undescribed species expands the genomic resources of tick-borne agents in the family Anaplasmataceae and provides insights for early warning of tick-borne pathogens.
a, Phylogenetic tree of 582 rickettsial genomes obtained in this study and 149 rickettsial genomes deposited in the Reference Sequence Database. Different colours represent different groups of Rickettsia. TRG, transitional group; TG, typhus group; AG, ancestral group. The genomes assembled in this study are represented by solid dots. b,c, The phylogenetic tree of Anaplasma (b) and Ehrlichia (c), where the size of the dots indicates the number of samples. Red dots indicate a bootstrap value of 100. Reference genomes are represented by grey dots, genomes assembled in this study are marked with turquoise dots, and previously undescribed species identified in this study are annotated in orange. d, Orthologous genes among Ehrlichia, Rickettsia and Anaplasma. Red nodes on the phylogenetic tree indicate genomes assembled in this study. In the heat map, orange denotes the presence of orthologous genes in the genome. The numbers in parentheses indicate the count of orthologous genes. e, Distribution of endosymbionts across different tick species. The thickness of the chord represents the number of positive samples. Different colours represent different genera of ticks. f, Evolutionary relationships of Coxiella MAGs assembled in this study, showcasing the identity (colour) and coverage (size) of virulence and nutritional genes, along with genomic collinearity. The calculation method for coverage is the length of the consistent region divided by the full length of the reference gene. The species names are abbreviated as follows: CEAN, Coxiella endosymbiont of Amblyomma nuttalli; CEOA, Coxiella endosymbiont of Ornithodoros amblus; CERM, Coxiella endosymbiont of R. microplus. CLE-Hy, Coxiella-like endosymbiont of Haemaphysalis hystricis; CLE-Hf, Coxiella-like endosymbiont of Haemaphysalis flava.
To compare genomic signatures among Rickettsia, Anaplasma and Ehrlichia, we identified orthologous genes based on protein similarity. It is worth noting that Rickettsia possesses a significantly higher number of unique genes compared to Anaplasma and Ehrlichia (Fig. 4d). Rickettsia-specific genes were particularly enriched in the prokaryotic defence system pathway (Extended Data Fig. 4c), primarily associated with the Type I R-M system. Further Clusters of Orthologous Groups annotation of genus-specific genes indicated that, compared to Anaplasma and Ehrlichia, Rickettsia harbours more genes involved in cellular processes and signalling (Extended Data Fig. 4d). Moreover, we assembled genomes of three Borrelia species and investigated the distribution of tick-associated bacteria across different tick species (Extended Data Fig. 4e,f).
Given that ticks are obligate blood-feeding arthropods reliant on vitamin supplementation from endosymbionts33,35, we screened clusters to identify Coxiella-LE, Francisella-LE, Arsenophonus and Candidatus Midichloria. Coxiella-LE was frequently found in Haemaphysalis, Dermacentor, Ixodes and certain soft ticks, albeit with varying prevalence rates (Fig. 4e). Francisella-LE was nearly ubiquitous across Amblyomma and Hyalomma species and occasionally detected in Haemaphysalis longicornis ticks (Supplementary Table 8). In addition, phylogenetic analysis of Rickettsia endosymbionts identified them in two Argas persicus and seven Amblyomma javanense (Extended Data Fig. 4g).
Coxiella-LE genomes clustered into three distinct clades, including one with Coxiella burnetii. The phylogenetic position of Coxiella-LE showed no clear correlation with tick species or genera (Fig. 4f). Rickettsia endosymbionts showed deletions in certain virulence genes (Extended Data Fig. 4h and Supplementary Note 5). Francisella-LE genomes clustered into a single phylogenetic clade (Extended Data Fig. 4I). It is worth noting that pathogenicity island genes were either pseudogenized or absent in the Francisella-LE genomes analysed.
Tick–pathogen–microbiome interactions based on hologenome-wide analysis
Finally, we have established a comprehensive resource for investigating genetic–microbiome associations in ticks. Correlation analysis revealed significant correlations between mitochondrial genome diversity and microbial community similarity, particularly in free-living H. longicornis ticks from Central China and D. silvarum ticks parasitizing sheep in Inner Mongolia–Xinjiang (Fig. 5a,b).
a, Spearman correlation between mitochondrial sequence divergence and microbiome community differences in ticks with samples over 200 including H. longicornis, R. microplus and D. silvarum across different sampling regions and parasitic states. The numbers within each heat map cell represent Spearman’s correlation coefficients (at least 20 pairs), and the colour intensity of the heat map reflects the statistical significance of the correlation. b, The relationship between mitochondrial sequence divergence and the similarity of tick symbiotic communities for H. longicornis in a free-living state from Central China (left) and D. silvarum parasitizing sheep from Inner Mongolia–Xinjiang regions (right). Data are presented as fitted values ± 95% confidence interval. c, H. longicornis genetic variation correlated with the relative abundance of Rickettsia. The Manhattan plot shows the results of the mGWAS. Two-sided P values were calculated using the BLINK algorithm implemented in the GAPIT R package. Bonferroni correction was applied for multiple comparisons. −log10(P values) are plotted against the position of SNPs on chromosomes. The gene annotation information corresponding to these SNPs above the threshold is labelled. Chr., chromosome number. d, The Upset plot identifies 109 shared GO terms annotated to pathogen-associated genes across three tick species. A total of 119 GO functional annotations are shared among the genes with significant SNPs in the three tick species. MF, molecular function; BP, biological process; CC, cellular component. e, Clustering results of samples based on homozygosity of SNPs and the corresponding region, animal host, parasitized state and engorgement condition. Determined to be two. D. silvarum clusters were significantly associated with sampling locations (P = 0.001, Mantel test), parasitic hosts (P = 0.001, Mantel test) and parasitic statuses (P = 0.001, Mantel test). REF indicates the nucleotide in the reference genome, whereas ALT represents the alternative nucleotide differing from the reference sequence.
These findings sparked our curiosity in exploring the relationship between genetic polymorphisms and pathogen abundance in ticks. We selected three tick species with sufficient sample sizes for analysis: H. longicornis ticks (359), R. microplus ticks (203) and D. silvarum ticks (158). A total of 71,729 high-confidence single-nucleotide polymorphisms (SNPs) were detected in H. longicornis, 20,536 in R. microplus and 55,530 in D. silvarum ticks. We conducted correlation analyses between identified SNPs and pathogen abundance using Bayesian-information and Linkage-disequilibrium Iteratively Nested Keyway (BLINK). This analysis revealed 19 SNPs significantly associated with pathogens in H. longicornis and R. microplus and 31 in D. silvarum (P < 10−6) (Supplementary Table 9).
After assessing the potential functional impact of nucleotide mutations, we identified a SNP on chromosome 4 within a gene encoding a peptidase M13 family protein in H. longicornis. This SNP may enhance the H. longicornis tick’s blood feeding capabilities. Another SNP, located on chromosome 3 of H. longicornis genome within a gene encoding RING-type domain-containing protein, is involved in ubiquitination processes (Fig. 5c). In R. microplus, a SNP was identified in the serotonin receptor-like protein gene, which is implicated in the 5-hydroxytryptamine signalling pathway (Extended Data Fig. 5a). In D. silvarum, a SNP was discovered in a gene related to juvenile hormone acid methyl transferase (Extended Data Fig. 5b).
To determine whether SNPs shared functional similarities across species, we investigated whether SNPs shared similar functions across species (Supplementary Note 6). Gene Ontology (GO) annotation identified 109 shared GO terms among the three species, primarily related to biological processes such as development, metabolism, localization, cellular processes, reproduction and stress responses (Fig. 5d). A notable relationship emerged between SNP homozygosity and pathogen abundance, with homozygous ticks generally exhibiting higher pathogen loads (Extended Data Fig. 5c–f). Clustering based on SNP homozygosity distinguished distinct sample groups (Fig. 5e and Extended Data Fig. 5g). In H. longicornis, cluster formation showed no significant correlation with sampling locations or parasitic status (Extended Data Fig. 5g), whereas in D. silvarum, clusters were significantly associated with sampling locations (P = 0.001, Mantel test), parasitic hosts (P = 0.001, Mantel test) and parasitic statuses (P = 0.001, Mantel test). These findings suggest that regional and environmental factors influence genotypic clustering in D. silvarum, potentially affecting their microbial community variations.
Discussion
In this study, we constructed a comprehensive metagenomic database encompassing over 16,000 adult ticks from 48 species across eight genera, representing diverse ecological habitats, geographic regions, and blood-feeding statuses, containing 46 tick species whose microbiome has never been profiled based on metagenomic sequencing. The expanded sampling of dominant tick species enables us to uncover the associations between microbiota composition and ecogeographical factors, host associations and key biological traits of ticks.
Approximately two-thirds of the genomes in our database belong to previously uncharacterized bacterial species, underscoring the vast and unexpected diversity of tick microbiomes. Previous studies on tick microbiome have used the amplicon-based approach targeting the highly conserved 16S ribosomal RNA region36 but were insufficient for detecting undefined bacteria. In our previous work, we conducted an initial survey of potential pathogens in ticks from China, offering deeper insights into their vector capacities23. In this study, we leveraged metagenomic sequencing and de novo assembly, enabling the identification of both known and unknown bacterial species and offering a more comprehensive perspective on tick microbiomes.
In many cases, human infections are only recognized years after the initial discovery of such potential pathogens, partly due to the markedly lower microbial loads in human blood compared to ticks37. For example, Rickettsia sibirica subspecies sibirica BJ-90 was initially detected in Dermacentor sinicus ticks from China in 1990 (ref. 38), yet it was not confirmed as a human pathogen until 22 years later39. Similarly, the 19 previously undescribed tick-associated bacterial species in this study represent potential human pathogens, underscoring the urgent need for enhanced surveillance to enable early identification and prevention of emerging tick-borne diseases.
In H. longicornis, two loci showed significant genotype-dependent regulation, with homozygous individuals demonstrating markedly higher Rickettsia loads compared to heterozygous counterparts (Extended Data Fig. 5c,d). The SNP on chromosome 3 may influence the ubiquitination pathway by modulating the expression or enzymatic activity of the RING-type ubiquitin ligase gene40, thereby regulating tick immune responses. This mechanism potentially alters Rickettsia’s capacity to evade the tick immune system, ultimately affecting pathogen load. Studies have demonstrated that ubiquitin ligases restrict the colonization of the rickettsial bacterium A. phagocytophilum in Ixodes scapularis, and their silencing confers a survival advantage to this rickettsial bacterium41. The mutation in the peptidase family M13 gene on chromosome 4 likely regulates tick vascular function42, potentially inducing vascular permeability abnormalities that disrupt tissue microenvironment homeostasis, consequently impacting Rickettsia colonization efficiency within ticks. In R. microplus, genetic variations in the 5-HT receptor gene showed a significant correlation with Rickettsia load. Previous studies demonstrate that serotonin biosynthesis regulates blood-feeding efficiency in ticks43, suggesting that 5-HT-mediated neural signalling may modulate pathogen load through critical phenotypes like haematophagy. The JHAMT gene mutation in D. silvarum likely regulates Anaplasma transmission efficiency by affecting juvenile hormone biosynthesis and subsequent developmental processes44.
Overall, we establish a comprehensive large-scale bacterial genome database in ticks, demonstrate the ecogeographical impact on microbiome composition, identify a diverse array of tick-associated bacterial species with pathogenic potential and uncover notable associations between tick genomic diversity and pathogen abundance. Our investigation into the complex interactions among ticks, microbiomes and pathogens enhances our understanding of tick-borne diseases and provides valuable insights for developing targeted tick control strategies.
Methods
Sample collection
Ticks were collected from 31 provinces, metropolises or autonomous regions of mainland China. The collection sites were selected according to their ecological environments, including coniferous forest, steppe, farmland, desert, shrubland and tropical forest. Ticks were collected by dragging a standard 1 m2 flannel flag over vegetation or from domestic or wild animals such as cattle, dogs, sheep, goats, cats, rabbits, camels, deer and boars. Domestic animals were sampled from local farms, while wild animals were sourced from two origins: wildlife rescue centres and licensed breeding facilities. Ticks were collected with forceps from the animals after obtaining consent from the responsible personnel or owners. The latitude and longitude of each collection site were recorded. Environment factor information was downloaded from the National Geomatics Center of China (https://www.ngcc.cn/). The sample collection includes both free-living and host-attached ticks, encompassing both females and males across a broad geographic range and representing a majority of tick species, ensuring the study’s sample representativeness. The identification of tick species, sex and developmental stage was conducted by an entomologist (Y.S.) based on morphological characteristics, such as the capitulum, palps, scutum, coxae and tarsi, following the genus and species identification key45. Live ticks were transported to the laboratory, and dead ticks were directly stored at −80 °C.
Metagenomic sequencing
Adult ticks were used for tick metagenomic sequencing. Tick samples were pooled according to species, sex, collection site and blood-feeding status, and the total DNA was extracted from each pool. Each individual tick was exclusively used for either Illumina or nanopore sequencing. All 1,460 adult tick samples collected from the wild were thoroughly surface sterilized (two successive washes of 70% ethanol, 30 s each), and genomic DNA for resequencing was isolated using the AllPrep DNA/RNA Mini Kit (QIAGEN, catalogue number 69504). The DNA concentration was measured using the Qubit dsDNA HS Assay Kit in a Qubit 2.0 fluorometer (Life Technologies). Sequencing libraries were constructed using the NEBNext UltraTM DNA Library Prep Kit for Illumina (NEB) following the manufacturer’s recommendations, and index barcodes were added to attribute sequences to each sample. The library preparations were sequenced on an Illumina NovaSeq platform (NovaSeq 6000 SP Reagent Kit). Each sample generated approximately 70 million paired-end reads (150 bp × 2), resulting in about 20 Gb of sequencing data.
Nanopore sequencing
Live adult ticks were used for tick nanopore sequencing. The individuals used for nanopore sequencing were independent of those used for Illumina sequencing. Each pooled sample, comprising a single tick species, contained approximately 15–30 adult ticks. High-quality DNA extraction is initially conducted using Qiagen Genomic Kit (catalogue number 13443). Subsequently, a comprehensive assessment of the DNA sample’s suitability is performed through a multi-faceted approach. This includes a visual inspection for potential contaminants or irregularities in the sample’s appearance. In addition, a 0.75% agarose gel electrophoresis is used to ascertain any degradation and determine the size of DNA fragments present. The assessment further involves the use of Nanodrop to evaluate DNA purity, ensuring an optimal optical density at 260/280 nm (OD260/280) ratio between 1.8 and 2.0 and an optical density at 260/230 nm (OD260/230) ratio falling within the range of 2.0 to 2.2. Finally, Qubit is used for precise quantification, ensuring that the DNA meets the required standards for concentration and quality. After the quality assessment of the DNA sample, BluePippin automated nucleic acid size selection system is used to recover specific fragment sizes of DNA. Subsequently, the DNA fragments obtained after size selection undergo damage repair and end repair procedures. Following magnetic bead purification, a DNA barcode is ligated to the DNA ends using a kit. After another round of magnetic bead purification, sequencing adapters provided within the ligation kit are ligated to the DNA. Finally, the constructed DNA library undergoes precise quantification using Qubit to ensure accurate quantitation of the library. After constructing the library, a specific concentration and volume of the DNA library is loaded onto a flow cell. The flow cell is then transferred to the PromethION sequencer for real-time single-molecule sequencing.
Genome assembly and genome annotation for nanopore sequencing data
We used Porechop v0.2.4 (ref. 46) to trim chimeras and remove sequence adapters for raw nanopore sequencing data. Nanofilt v2.6.0 (ref. 47) was used to filter out low-quality and short reads, while Filtlong v0.2.0 (https://github.com/rrwick/Filtlong) was applied to eliminate Nanopore sequences of poor quality and shorter than 1,000 bp, retaining 95% of the data. The clean data were used to assemble by Flye v2.8 (ref. 48) for continuity with a previous study (ref. 49). Racon v1.4.13 (ref. 50) was used for correction by integrating nanopore sequencing data with one round of polishing (Supplementary Note 7). After extracting DNA for nanopore sequencing, a portion of the DNA was also subjected to Illumina sequencing workflow to obtain reads for Illumina-reads-based genome polishing by Pilon v1.23 (ref. 51) and NextPolish v1.3.0 (ref. 52). To further improve the completeness of the genome assembly, we performed de novo assembly of the Illumina read data using MEGAHIT v1.2.9 (ref. 53) and subsequently integrated the corrected Nanopore-read assembly with the Illumina-read assembly using QuickMerge v0.3 (ref. 54). Kraken2 v2.1.0 (ref. 55) was used to identify the bacterial contigs in assembly. Contigs with fragments annotated as bacterial sequences were filtered out, representing approximately 0.1% of the assembled sequences. The quality of the assembly was assessed using QUAST v5.0.2 (ref. 56), and the completeness was evaluated using BUSCO v5.2.2 with arthropoda_odb10 (ref. 57). Repetitive sequences in each tick genome were identified based on RepeatModeler v2.0.5 (ref. 58). Gene annotation was accomplished by integrating evidence or predictions from transcriptome-, ab initio- and homology- based approaches. In the transcriptome-based approach, RNA-sequencing data for each tick species was acquired from a previous study in the National Center for Biotechnology Information (NCBI) Sequence Read Archive under Bioproject PRJNA841744. RNA-sequencing reads generated from each tick species were assembled by Trinity v2.4.0 (ref. 59) with default parameters. The assembled transcripts were aligned to each assembled genome and were used to predict gene structure by PASA v2.3.3 (ref. 60). Ab initio gene prediction was performed using Augustus v3.3 (ref. 61). Homologous protein sequences were aligned to the tick genome assemblies using TBLASTN v2.2.28+ (ref. 62), and Genomethreader v5.4.0 (ref. 63) was used to predict gene structure. Finally, we used EvidenceModeler (EVM) v1.1.1 (ref. 64) to integrate the gene models predicted by the above approaches into a nonredundant and more complete gene set.
Phylogenetic analysis and divergence time estimation for tick genome
The single-copy orthologue proteins within I. persulcatus, H. longicornis, D. silvarum, Hyalomma asiaticum, Rhipicephalus sanguineus, R. microplus, I. scapularis, Centruroides sculpturatus and Parasteatoda tepidariorum were identified using orthofinder v2.5.4 (ref. 65) with default parameters. A total of 3,150 orthologous protein sequences were identified and considered the core protein. The protein sequences annotated from genomes assembled by nanopore sequencing reads were mapped to core proteins, and aligned protein sequences were extracted for phylogenetic tree construction. Extracted protein sequences were aligned using MUSCLE v3.8.1551 (ref. 66). Gblocks v0.91b (ref. 67) was used to select conserved blocks from aligned sequences. Finally, iqtree v2.2.0 (ref. 68) was used to generate a phylogenetic tree and calculate branch support. The divergence time within the nodes of the phylogenetic tree was estimated by r8s (ref. 69).
Assembly and phylogenetic tree construction for tick mitochondrial genome
Raw Illumina reads were filtered using the AfterQC v0.9.6 (ref. 70) with default parameters. MitoZ v3.6 (ref. 71) was used to assemble mitochondrial genome, which performs de novo assembly based on all reads and subsequently filters non-mitochondrial contigs and annotates the mitochondrial scaffolds. For tick species with less than 50 samples, all available samples were used for mitochondrial genome assembly; for tick species with more than 50 samples, a random subset of 50 samples was selected. The longest mitochondrial genome sequence obtained for each species was ultimately chosen for downstream analysis. We initially selected the nucleotide sequences of 23 genes, including ATP6, ATP8, COX1, COX2, COX3, ND1, ND2, ND3, s-rRNA, trnA, trnC, trnD, trnE, trnG, trnK, trnL, trnM, trnN, trnR, trnS, trnV, trnW and trnY. For each gene listed, protein sequences were taken from all mitochondrial assemblies and subjected to multiple sequence alignment using MUSCLE v3.8.1551 (ref. 66). Gblocks v0.91b (ref. 67) was used to select conserved blocks for each alignment. Finally, iqtree v2.2.0 (ref. 68) was used to generate a phylogenetic tree and calculate branch support. Mitochondrial dissimilarity was extracted from the phylogenetic tree. The geographical distance between two samples was calculated as follows: Geographical distance = cos−1(cos(E1 − E2) × cos(N1) × cos(N2) + sin(N1) × sin(N2)) × 6,371 km, where E1 and N1 were the longitude and latitude of sample 1, and E2 and N2 were the longitude and latitude of sample 2.
Genome assembly and binning for metagenomic sequencing data
Illumina reads after quality control were aligned to the tick genomes required above using the bowtie2 v2.4.1 (ref. 72) with default parameters. The paired reads were discarded if one read matched the tick genome by using samtools v1.9 (ref. 73) with parameters -f 12. For samples with remaining sequence data greater than 10 G, we used MEGAHIT v1.2.9 (ref. 53) for assembly. For samples with less than 10 G of remaining sequence data, SPAdes v3.15.2 with --meta option74 was used for assembly. A test of this selection strategy was provided in Supplementary Note 8. Binning and genome reconstruction were accomplished by MetaBAT2 v2.2.15 (ref. 75). Assembly quality was assessed through CheckM v1.1.3 in lineage_wf workflow76, and MAGs meeting criteria of at least 50% completeness and less than 10% contamination as Genomic Standards Consortium recommended were used for downstream analysis. Prokka v1.14.6 (ref. 77) was used to identify the genes with default parameters.
MAGs clustering and phylogenetic tree construction
ANI among each genome was calculated by fastANI v1.32 (ref. 29). Louvain v0.16 (ref. 78) on Python 3.10.0 was used to cluster and identify the groups of MAGs. We assessed and scored the genomes based on factors such as genome completeness, contamination ratio, the number of contigs, circularity, presence of the 16S gene and the count of transfer RNA genes. The genome with the highest quality was selected as the representative genome. PhyloPhlAn v3.0.60 (ref. 79) was used to construct a microbial phylogenetic tree.
Taxonomically profiling and ecotype analysis
After host sequences were filtered out, species classification was performed using Kraken2 v2.1.0 (ref. 55) to determine the abundance of microbes in each sample. Based on the microbial communities at the genus level, samples were classified into five ecotypes. Hierarchical clustering (Ward D2 method) is performed based on the taxonomic profile of the microbial community at the genus level. The optimal number of clusters was determined using the elbow method. This method involves selecting the point where the change in intra-cluster distance is minimized as the cluster count is varied. This point indicates that beyond it, the trend of distance change begins to plateau, suggesting that further increases in the number of clusters might not substantially enhance the accuracy or interpretability of the clustering. Representative samples for each ecotype are defined as follows: First, low-dimensional embeddings of the samples are calculated based on isometric mapping. For each ecotype ETi, the sample that is furthest from the centre of all samples is considered the most representative and is denoted as Si,rep. The distance from this representative sample to the centre of all samples is recorded as Di,rep. For each ecotype ETi, we calculate the distance from every sample Si,j belonging to this ecotype to Si,rep, denoted as Di,j. If Di,j is less than 0.5 × Di,rep, we consider it a representative sample. Vegan v2.6-2 (ref. 80) package in R was used to calculate the Shannon index, and dimensionality reduction was performed with phateR (ref. 81). To investigate the effects of ticks, parasitizes, habitats and environmental factors on the tick-associated microbiomes, vegan v2.6-2 (ref. 80) in R was used to calculate Bray–Curtis distances between samples. Dimensionality reduction was conducted using Rtsne (ref. 82) in R, and differences in distances between groups of different ticks, hosts, habitats and environmental factors were compared using the adonis function for PERMANOVA32 on datasets with sample sizes exceeding 8 to analyse the impact of these factors on the microbiota. Significance of environmental influences on tick microbiome was calculated using A3 package, and only influencing factors with a significance level below 0.05 were displayed.
Potentially pathogenic bacteria identification and species classification
PhyloPhlAn v3.0.60 (ref. 79) was used to construct a microbial phylogenetic tree. For each genome, Kraken2 v2.1.0 (ref. 55) was used for taxonomy annotation, and genomes annotated as Rickettsia, Anaplasma, Ehrlichia, Coxiella, Francisella, Borrelia, Candidatus Midichloria and Arsenophonus were selected as tick-associated bacteria. The Core Genome Alignment Sequence Identity were estimated for each cluster to classify them into species. The Core Genome Alignment Sequence Identity ≥96.8% among genomes were considered indicative of the same species, in accordance with the criteria used especially for the order Rickettsiales83. Orthologous genes were identified using orthofinder v2.5.4 (ref. 65). Anaplasma and Ehrlichia showed a greater number of unique orthologous gene clusters compared to Rickettsia, which shared fewer orthologues with the other two genera. Clusters of Orthologous Groups annotation was accomplished by emapper v2.1.9 (ref. 84) based on eggNOG DB v5.0.2 (ref. 85). Virulence and vitamin biosynthesis protein in genomes were identified using diamond v0.9.35 (ref. 86) based on these proteins in the virulence factor database87 and NCBI.
SNP identification
Owing to their considerable sample sizes and substantial research importance of H. longicornis, R. microplus and D. silvarum, the samples from these three tick species were used for microbiome genome-wide association study (mGWAS) analysis. Reference genome and their corresponding genome annotation files (GFF) for the H. longicornis, R. microplus and D. silvarum were downloaded from the NCBI Genome database. BEDTools v2.27.1 (ref. 88) was used to extract the exonic sequences from the reference genome sequence based on GFF files, and SeqKit v2.3.0 (ref. 89) was used to deduplicate the sequences. Illumina reads after quality control were aligned to the reference genome exonic sequences using Bowtie2 v2.4.2 (ref. 72), and the generated SAM files were processed by SAMtools v1.9 (ref. 73) to generate BAM files and indices. Variant identification was performed using GATK4 v4.0.12.0 (ref. 90). The Genome Analysis Toolkit (GATK) MarkDuplicates was used to mark duplicated reads caused by PCR and cloning. Alignment summary metrics, such as alignment rate, were summarized by GATK CollectAlignmentSummaryMetrics. The deduplicated BAM files were subjected to variant calling using GATK HaplotypeCaller, and GVCF files were generated. GVCF files were converted to VCF format using GATK GenotypeGVCFs. VCF files were compressed, indexed and merged using BCFtools v0.1.16 (ref. 73) to produce a consolidated VCF file for each tick species. SNPs were filtered and marked using GATK VariantFiltration based on quality control (quality by depth (QD) < 2.0, quality score (QUAL) < 30.0, Fisher strand bias (FS) > 60.0, mapping quality (MQ) < 40.0, mapping quality rank sum (MQRankSum) < −12.5 and read position rank sum (ReadPosRankSum) < −8.0). GATK SelectVariants was then used to exclude the marked low-quality SNP from the VCF file. Further filtering of SNPs was performed using vcftools91 according to the criteria of a missing rate less than 50% (--max-missing 0.5) and a minor allele frequency greater than 5% (--maf 0.05). The filtered result files were sorted using TASSEL v5.2.40 (ref. 92) and converted from VCF format to HapMap format. After quality control, 20,547 SNPs were retained for H. longicornis, 55,541 for R. microplus and 71,740 for D. silvarum.
Microbiome GWAS analysis
To ensure more reliable results, we set thresholds based on a relative abundance greater than 0.001 in each individual sample and a prevalence greater than 15% in the respective tick species for pathogen selection. Ultimately, the candidate associated pathogens identified were Rickettsia for H. longicornis, Rickettsia for R. microplus and Rickettsia and Anaplasma for D. silvarum. Association analysis between genotypes and phenotypes was conducted using GAPIT v3.2.093 in R, and BLINK94 algorithm from the mixed linear model was used to conduct association detection. The significance threshold (P value cut-off is set to 1 × 10−6) was determined by the Bonferroni test. After completing the association analysis, SNPs located on chromosomes were retained for subsequent bioinformatics analysis. We used the emapper v2.1.984 software to functionally annotate genes identified with SNPs. In addition, we used the genetic variant annotation software SnpEff v5.295 to analyse the functional impact of SNPs obtained from mGWAS analysis, predicting the effects of these SNPs on the function of the encoded proteins in the respective genes. PLINK v1.90b496 was used to perform linkage disequilibrium analysis on SNPs. Independent genetic variant loci were selected using thresholds of LD r2 < 0.1 and P < 10−6. GAPIT93 was used to evaluate the contribution of these independent variant loci to the explanation of phenotypic variance. We performed hierarchical clustering on tick samples based on the genotypes of selected SNPs and evaluated the correlation between SNP genotypes and tick sample metadata using the Mantel test. Samples were further grouped according to genotype, and SNPs with a larger number of samples were selected for subsequent analysis. The Wilcoxon rank-sum test was used to assess the relationship between genotypes and the abundance of corresponding pathogens. All statistical analyses were conducted using R software, with P < 0.05 indicating statistical significance. Hierarchical clustering (Ward D2 method) was performed based on the SNP alleles after the genotype data were converted to character variables. During the clustering process, distance matrices between samples or SNPs are computed, and hierarchical clustering is conducted based on these distances. Silhouette analysis is used to select the optimal number of clusters.
Reporting summary
Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.
Data availability
The genome assemblies for 19 tick species in this study are available via European Nucleotide Archive (ENA) (PRJEB89832) at https://www.ebi.ac.uk/ena/browser/view/PRJEB89832 and via the National Genomics Data Center (NGDC) (PRJCA037692) at https://ngdc.cncb.ac.cn/bioproject/browse/PRJCA037692. The raw Illumina sequencing data are available via ENA (PRJEB89768) at https://www.ebi.ac.uk/ena/browser/view/PRJEB94929, and via NGDC (PRJCA017096) at https://ngdc.cncb.ac.cn/bioproject/browse/PRJCA017096 and (PRJCA002242) at https://ngdc.cncb.ac.cn/bioproject/browse/PRJCA002242. The genomes for 1,373 rMAGs in this study are available via ENA (PRJEB89768) at https://www.ebi.ac.uk/ena/browser/view/PRJEB89768 and via NGDC (PRJCA037694) at https://ngdc.cncb.ac.cn/bioproject/browse/PRJCA037694.
Code availability
The code used for the analysis is available via Zenodo at https://doi.org/10.5281/zenodo.16248308 (ref. 97).
References
Estrada-Peña, A. Ticks as vectors: taxonomy, biology and ecology. Rev. Sci. Tech. 34, 53–65 (2015).
Dantas-Torres, F., Chomel, B. B. & Otranto, D. Ticks and tick-borne diseases: a One Health perspective. Trends Parasitol. 28, 437–446 (2012).
Beard, C. B. et al. Multistate infestation with the exotic disease-vector tick Haemaphysalis longicornis - United States, August 2017-September 2018. MMWR Morb. Mortal. Wkly Rep. 67, 1310–1313 (2018).
Wormser, G. P. et al. First recognized human bite in the United States by the Asian longhorned tick, Haemaphysalis longicornis. Clin. Infect. Dis. 70, 314–316 (2020).
Zhao, L. et al. Distribution of Haemaphysalis longicornis and associated pathogens: analysis of pooled data from a China field survey and global published data. Lancet Planet. Health 4, e320–e329 (2020).
Paulauskas, A. et al. First record of Haemaphysalis concinna (Acari: Ixodidae) in Lithuania. Ticks Tick Borne Dis. 11, 101460 (2020).
Liu, J. et al. An integrated data analysis reveals distribution, hosts, and pathogen diversity of Haemaphysalis concinna. Parasit. Vectors 17, 92 (2024).
Paules, C. I., Marston, H. D., Bloom, M. E. & Fauci, A. S. Tickborne diseases - confronting a growing threat. N. Engl. J. Med. 379, 701–703 (2018).
Madison-Antenucci, S., Kramer, L. D., Gebhardt, L. L. & Kauffman, E. Emerging tick-borne diseases. Clin. Microbiol Rev. 33, e00083-18 (2020).
Rochlin, I. & Toledo, A. Emerging tick-borne pathogens of public health importance: a mini-review. J. Med. Microbiol. 69, 781–791 (2020).
Fang, L. Q. et al. Emerging tick-borne infections in mainland China: an increasing public health threat. Lancet Infect. Dis. 15, 1467–1479 (2015).
Narasimhan, S. et al. Grappling with the tick microbiome. Trends Parasitol. 37, 722–733 (2021).
Narasimhan, S. & Fikrig, E. Tick microbiome: the force within. Trends Parasitol. 31, 315–323 (2015).
Gall, C. A. et al. The bacterial microbiome of Dermacentor andersoni ticks influences pathogen susceptibility. ISEM J. 10, 1846–1855 (2016).
René-Martellet, M. et al. Bacterial microbiota associated with Rhipicephalus sanguineus (s.l.) ticks from France, Senegal and Arizona. Parasit. Vectors 10, 416 (2017).
Varela-Stokes, A. S. et al. Tick microbial communities within enriched extracts of Amblyomma maculatum. Ticks Tick Borne Dis. 9, 798–805 (2018).
Thapa, S., Zhang, Y. & Allen, M. S. Bacterial microbiomes of Ixodes scapularis ticks collected from Massachusetts and Texas, USA. BMC Microbiol. 19, 138 (2019).
Chandra, S. & Šlapeta, J. Biotic factors influence microbiota of nymph ticks from vegetation in Sydney, Australia. Pathogens 9, 566 (2020).
Lejal, E. et al. Temporal patterns in Ixodes ricinus microbial communities: an insight into tick-borne microbe interactions. Microbiome 9, 153 (2021).
Krawczyk, A. I. et al. Quantitative microbial population study reveals geographical differences in bacterial symbionts of Ixodes ricinus. Microbiome 10, 120 (2022).
Piloto-Sardiñas, E. et al. Comparison of salivary gland and midgut microbiome in the soft ticks Ornithodoros erraticus and Ornithodoros moubata. Front Microbiol 14, 1173609 (2023).
Pollet, T. et al. The scale affects our view on the identification and distribution of microbial communities in ticks. Parasit. Vectors 13, 36 (2020).
Jia, N. et al. Large-scale comparative analyses of tick genomes elucidate their genetic diversity and vector capacities. Cell 182, 1328–1340.e1313 (2020).
Ye, R. Z. et al. Virome diversity shaped by genetic evolution and ecological landscape of Haemaphysalis longicornis. Microbiome 12, 35 (2024).
Field, D. et al. The minimum information about a genome sequence (MIGS) specification. Nat. Biotechnol. 26, 541–547 (2008).
Levin, D. et al. Diversity and functional landscapes in the microbiota of animals in the wild. Science 372, eabb5352 (2021).
Pasolli, E. et al. Extensive unexplored human microbiome diversity revealed by over 150,000 genomes from metagenomes spanning age, geography, and lifestyle. Cell 176, 649–662.e620 (2019).
Bowers, R. M. et al. Minimum information about a single amplified genome (MISAG) and a metagenome-assembled genome (MIMAG) of bacteria and archaea. Nat. Biotechnol. 35, 725–731 (2017).
Jain, C., Rodriguez, R. L., Phillippy, A. M., Konstantinidis, K. T. & Aluru, S. High throughput ANI analysis of 90K prokaryotic genomes reveals clear species boundaries. Nat. Commun. 9, 5114 (2018).
Sayers, E. W. et al. Database resources of the National Center for Biotechnology Information. Nucleic Acids Res. 50, D20–d26 (2022).
Plantinga, A. M., & Wu, M. C. Beta diversity and distance-based analysis of microbiome data. in Statistical Analysis of Microbiome Data Frontiers in Probability and the Statistical Sciences (eds Datta, S. & Guha, S.) 101–127 (Springer, 2021).
Anderson, M. J. Permutational multivariate analysis of variance (PERMANOVA), in Wiley StatsRef: Statistics Reference Online pp. 1–15 (Wiley, 2017).
Duron, O. & Gottlieb, Y. Convergence of nutritional symbioses in obligate blood feeders. Trends Parasitol. 36, 816–825 (2020).
Buysse, M. & Duron, O. Evidence that microbes identified as tick-borne pathogens are nutritional endosymbionts. Cell 184, 2259–2260 (2021).
Wang, J., Gao, L. & Aksoy, S. Microbiota in disease-transmitting vectors. Nat. Rev. Microbiol. 21, 604–618 (2023).
Wu-Chuang, A. et al. Current debates and advances in tick microbiome research. Curr. Res. Parasitol. Vector Borne Dis. 1, 100036 (2021).
Parola, P. et al. Update on tick-borne rickettsioses around the world: a geographic approach. Clin. Microbiol. Rev. 26, 657–702 (2013).
Yu, X. et al. Genotypic and antigenic identification of two new strains of spotted fever group rickettsiae isolated from China. J. Clin. Microbiol. 31, 83–88 (1993).
Jia, N. et al. Rickettsia sibirica subspecies sibirica BJ-90 as a cause of human disease. N. Engl. J. Med. 369, 1176–1178 (2013).
Cai, C. et al. The RING finger protein family in health and disease. Signal Transduct. Target. Ther. 7, 300 (2022).
Shaw, D. et al. Infection-derived lipids elicit an immune deficiency circuit in arthropods. Nat. Commun. 8, 14401 (2017).
Nalivaeva, N. N., Zhuravin, I. A. & Turner, A. J. Neprilysin expression and functions in development, ageing and disease. Mech. Ageing Dev. 192, 111363 (2020).
Zhong, Z. et al. Symbiont-regulated serotonin biosynthesis modulates tick feeding activity. Cell Host Microbe 29, 1545–1557.e1544 (2021).
Pound, J. M. & Oliver, J. H. Jr. Juvenile hormone: evidence of its role in the reproduction of ticks. Science 206, 355–357 (1979).
Teng, K. & Jiang, Z. Economic Insect Fauna of China Fasc 39 Acari: Ixodidae (Science, 1991).
Bonenfant, Q., Noé, L. & Touzet, H. Porechop_ABI: discovering unknown adapters in Oxford Nanopore Technology sequencing reads for downstream trimming. Bioinform. Adv. 3, vbac085 (2023).
De Coster, W., D’Hert, S., Schultz, D. T., Cruts, M. & Van, C. Broeckhoven, NanoPack: visualizing and processing long-read sequencing data. Bioinformatics 34, 2666–2669 (2018).
Kolmogorov, M. et al. metaFlye: scalable long-read metagenome assembly using repeat graphs. Nat. Methods 17, 1103–1110 (2020).
Huang, G. et al. PandaGUT provides new insights into bacterial diversity, function, and resistome landscapes with implications for conservation. Microbiome 11, 221 (2023).
Vaser, R., Sović, I., Nagarajan, N. & Šikić, M. Fast and accurate de novo genome assembly from long uncorrected reads. Genome Res. 27, 737–746 (2017).
Walker, B. J. et al. Pilon: an integrated tool for comprehensive microbial variant detection and genome assembly improvement. PLoS ONE 9, e112963 (2014).
Hu, J., Fan, J., Sun, Z. & Liu, S. NextPolish: a fast and efficient genome polishing tool for long-read assembly. Bioinformatics 36, 2253–2255 (2020).
Li, D. et al. MEGAHIT v1.0: a fast and scalable metagenome assembler driven by advanced methodologies and community practices. Methods 102, 3–11 (2016).
Chakraborty, M., Baldwin-Brown, J. G., Long, A. D. & Emerson, J. Contiguous and accurate de novo assembly of metazoan genomes with modest long read coverage. Nucleic Acids Res. 44, e147 (2016).
Lu, J. et al. Metagenome analysis using the Kraken software suite. Nat. Protoc. 17, 2815–2839 (2022).
Gurevich, A., Saveliev, V., Vyahhi, N. & Tesler, G. QUAST: quality assessment tool for genome assemblies. Bioinformatics 29, 1072–1075 (2013).
Simão, F. A., Waterhouse, R. M., Ioannidis, P., Kriventseva, E. V. & Zdobnov, E. M. BUSCO: assessing genome assembly and annotation completeness with single-copy orthologs. Bioinformatics 31, 3210–3212 (2015).
Flynn, J. M. et al. RepeatModeler2 for automated genomic discovery of transposable element families. Proc. Natl Acad. Sci. USA 117, 9451–9457 (2020).
Grabherr, M. G. et al. Full-length transcriptome assembly from RNA-seq data without a reference genome. Nat. Biotechnol. 29, 644–652 (2011).
Haas, B. J. et al. Improving the Arabidopsis genome annotation using maximal transcript alignment assemblies. Nucleic Acids Res. 31, 5654–5666 (2003).
Stanke, M., Diekhans, M., Baertsch, R. & Haussler, D. Using native and syntenically mapped cDNA alignments to improve de novo gene finding. Bioinformatics 24, 637–644 (2008).
Camacho, C. et al. BLAST+: architecture and applications. BMC Bioinformatics 10, 421 (2009).
Gremme, G., Brendel, V., Sparks, M. E. & Kurtz, S. Engineering a software tool for gene structure prediction in higher organisms. Inf. Softw. Technol. 47, 965–978 (2005).
Haas, B. J. et al. Automated eukaryotic gene structure annotation using EVidenceModeler and the Program to Assemble Spliced Alignments. Genome Biol. 9, R7 (2008).
Emms, D. M. & Kelly, S. OrthoFinder: solving fundamental biases in whole genome comparisons dramatically improves orthogroup inference accuracy. Genome Biol. 16, 157 (2015).
Edgar, R. C. MUSCLE: a multiple sequence alignment method with reduced time and space complexity. BMC Bioinformatics 5, 113 (2004).
Talavera, G. & Castresana, J. Improvement of phylogenies after removing divergent and ambiguously aligned blocks from protein sequence alignments. Syst. Biol. 56, 564–577 (2007).
Minh, B. Q. et al. IQ-TREE 2: new models and efficient methods for phylogenetic inference in the genomic era. Mol. Biol. Evol. 37, 1530–1534 (2020).
Sanderson, M. J. r8s: inferring absolute rates of molecular evolution and divergence times in the absence of a molecular clock. Bioinformatics 19, 301–302 (2003).
Chen, S. et al. AfterQC: automatic filtering, trimming, error removing and quality control for fastq data. BMC Bioinformatics 18, 80 (2017).
Meng, G., Li, Y., Yang, C. & Liu, S. MitoZ: a toolkit for animal mitochondrial genome assembly, annotation and visualization. Nucleic Acids Res. 47, e63 (2019).
Langmead, B. & Salzberg, S. L. Fast gapped-read alignment with Bowtie 2. Nat. Methods 9, 357–359 (2012).
Danecek, P. et al. Twelve years of SAMtools and BCFtools. Gigascience 10, giab008 (2021).
Nurk, S., Meleshko, D., Korobeynikov, A. & Pevzner, P. A. metaSPAdes: a new versatile metagenomic assembler. Genome Res. 27, 824–834 (2017).
Kang, D. D. et al. MetaBAT 2: an adaptive binning algorithm for robust and efficient genome reconstruction from metagenome assemblies. PeerJ 7, e7359 (2019).
Parks, D. H., Imelfort, M., Skennerton, C. T., Hugenholtz, P. & Tyson, G. W. CheckM: assessing the quality of microbial genomes recovered from isolates, single cells, and metagenomes. Genome Res. 25, 1043–1055 (2015).
Seemann, T. Prokka: rapid prokaryotic genome annotation. Bioinformatics 30, 2068–2069 (2014).
Zhang, J., Fei, J., Song, X. & Feng, J. An improved Louvain algorithm for community detection. Math. Probl. Eng. 2021, 1485592 (2021).
Asnicar, F. et al. Precise phylogenetic analysis of microbial isolates and genomes from metagenomes using PhyloPhlAn 3.0. Nat. Commun. 11, 2500 (2020).
Dixon, P. VEGAN, a package of R functions for community ecology. J. Veg. Sci. 14, 927–930 (2003).
Moon, K. R. et al. Visualizing structure and transitions in high-dimensional biological data. Nat. Biotechnol. 37, 1482–1492 (2019).
Krijthe, J., van der Maaten, L. & Krijthe, M. J. Package 'Rtsne'. R package version 0.13. GitHub https://github.com/jkrijthe/Rtsne (2018).
Chung, M., Munro, J. B., Tettelin, H. & Dunning, J. C. hotopp, using core genome alignments to assign bacterial species. mSystems 3, e00236-18 (2018).
Cantalapiedra, C. P., Hernández-Plaza, A., Letunic, I., Bork, P. & Huerta-Cepas, J. eggNOG-mapper v2: functional annotation, orthology assignments, and domain prediction at the metagenomic scale. Mol. Biol. Evol. 38, 5825–5829 (2021).
Huerta-Cepas, J. et al. eggNOG 5.0: a hierarchical, functionally and phylogenetically annotated orthology resource based on 5090 organisms and 2502 viruses. Nucleic Acids Res. 47, D309–d314 (2019).
Buchfink, B., Xie, C. & Huson, D. H. Fast and sensitive protein alignment using DIAMOND. Nat. Methods 12, 59–60 (2015).
Liu, B., Zheng, D., Zhou, S., Chen, L. & Yang, J. VFDB 2022: a general classification scheme for bacterial virulence factors. Nucleic Acids Res. 50, D912–d917 (2022).
Quinlan, A. R. & Hall, I. M. BEDTools: a flexible suite of utilities for comparing genomic features. Bioinformatics 26, 841–842 (2010).
Shen, W., Le, S., Li, Y. & Hu, F. SeqKit: a cross-platform and ultrafast toolkit for FASTA/Q file manipulation. PLoS ONE 11, e0163962 (2016).
Heldenbrand, J. R. et al. Recommendations for performance optimizations when using GATK3.8 and GATK4. BMC Bioinformatics 20, 557 (2019).
Danecek, P. et al. The variant call format and VCFtools. Bioinformatics 27, 2156–2158 (2011).
Bradbury, P. J. et al. TASSEL: software for association mapping of complex traits in diverse samples. Bioinformatics 23, 2633–2635 (2007).
Lipka, A. E. et al. GAPIT: genome association and prediction integrated tool. Bioinformatics 28, 2397–2399 (2012).
Huang, M., Liu, X., Zhou, Y., Summers, R. M. & Zhang, Z. BLINK: a package for the next level of genome-wide association studies with both individuals and markers in the millions. Gigascience 8, giy154 (2019).
Cingolani, P. in Variant Calling: Methods and Protocols. pp. 289–314 (Springer, 2012).
Purcell, S. et al. PLINK: a tool set for whole-genome association and population-based linkage analyses. Am. J. Hum. Genet 81, 559–575 (2007).
Du, L. Genome-resolved metagenomics reveals microbiome diversity across 48 tick species. Zenodo https://doi.org/10.5281/zenodo.16248308 (2025).
Acknowledgements
We thank all members of the Tick Genome and Microbiome Consortium for help with sample collection. This study is supported by the State Key Research Development Program of China (2023YFC2305901 to W.-C.C., 2022YFC2303801 and 2021YFC2301300 to F.Z., 2022YFA1304102 to J.W. and 2019YFC1200501 to J.-F.J.) and the Natural Science Foundation of China (32025009 to F.Z., 82525059 to N.J., 32170659 to W.S. and U2002219 to J.-F.J.)
Author information
Authors and Affiliations
Consortia
Contributions
W.-C.C., F.Z., N.J. and J.W. designed and supervised the research. J.-F.J. and X.-M.C. led the sample collection. C.B., X.-D.R., Q.-C.C., C.-H.D., T.-C.Q., X.W., X.-H.H., T.-C.Y., B.-G.J., J.-Y.C., X.-R.W., L.-F.T., Y.-W.L., L.-L.D., Y.L., Y.Z., J.-G.Z., C.-C.W., W.-Q.S., L. Zhan, Z.-H.L., D.F. and L. Zhao collected samples. Y.S. led the species identification of ticks. Q.W., M.-Z.Z., T.-T.Y. and L.-Y.X. conducted species identification of ticks. Q.W., M.-Z.Z., T.-T.Y. L.-Y.X., R.-Z.Y., Y.-S.P., N.W., Z.-T.L., L.-F.L., C.L., S.-J.S., Y.-T.L., D.T., X.-Y.H., J.W., Y.-F.W., W.-Y.G., Y.-Y.L., T.X., T.-H.W., X.-Y.S. and D.-Y.Z. prepared materials for sequencing. L.-F.D. and H.F. performed genome assembly and annotation. L.-F.D., W.S. and H.F. performed microbiome analysis and interpretation. W.S. and H.F. performed hologenome-wide association analysis. L.-F.D., W.S., H.F. and N.J. wrote the manuscript for the article. L.-F.D., W.S., H.F. and J.W. prepared the figures and tables. W.-C.C., F.Z., N.J. and J.W. polished the article.
Corresponding authors
Ethics declarations
Competing interests
The authors declare no competing interests.
Peer review
Peer review information
Nature Microbiology thanks Yingjun Cui, Jason Miller and the other, anonymous, reviewer(s) for their contribution to the peer review of this work.
Additional information
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Extended data
Extended Data Fig. 1 Flowchart of study design.
Figure created with Figdraw.com.
Extended Data Fig. 2 Differences in bacteria among ecotypes and influencing factors.
(a) Differences in abundance of the nine high-abundance genera across different ecotypes. Box plots show the median (center line), the 25th and 75th percentiles (box limits), and whiskers extending to values within 1.5× interquartile range. (b) Statistical significance levels of the nine high-abundance genera across different ecotypes. Comparisons are made between the former to the latter ecotype. Red squares indicate the former is significantly higher than the latter and the blue indicate the former is significantly lower than the latter (two-sided t-test, *: P-value < 0.05, **: P-value < 0.01, ***: P-value < 0.001, P-values were adjusted by Benjamini-Hochberg method). The yellow boxes represent the enrichment of the nine high-abundance genera across different ecotypes. (c) Relative abundance (size) and coefficient of variation (color) of major bacterial species across different ecotypes. (d) Hierarchical clustering of influencing factors across different ecotypes. (e) Associations among ecotypes and various factors across categories. Red and black indicate enrichment and depletion, respectively (Chi-square test, -: P-value > 0.05, *: P-value < 0.05, **: P-value < 0.01, ***: P-value < 0.001).
Extended Data Fig. 3 Relationships between ecotypes and geography, tick species, and animal hosts.
. (a) The t-SNE dimensionality reduction distribution of tick microbiome data under three factors. Unsupervised clustering and dimensionality reduction of the microbiome taxonomic profile using t-distributed stochastic neighbor embedding (t-SNE) in the bottom right corner. Each hex cell represents a specific region of China, each row represents a specific tick host, while each column corresponds to a specific animal host, points inside each hex cell indicate tick samples, and the five background colors denote five ecotypes. A series of PERMANOVA were conducted. When comparing by row, the tick genus is fixed, and the P-value in the hex cell at the end of each row represents a fixed tick genus and geographic location of tick samples. When comparing by column, the animal host is fixed, and the P-value in the hex cell at the end of each column indicates a fixed animal host and geographic area of tick samples, comparing differences between various tick genus. (b) Detailed dimensionality reduction plot of the microbiomes using t-SNE method in comparison between free-living ticks and those parasitizing on cattle or sheep. The color of the hexagons indicates different ecotypes, while the shade of the color represents sample density. Different shapes of points indicate different geographic regions, with solid and hollow points representing free-living and cattle/sheep-parasitic samples, respectively. (c) Relative abundance of the dominant bacterial genus in free-living and cattle/sheep-parasitic ticks across different regions. Red indicates free-living ticks, while green represents samples parasitizing cattle and sheep. Box plots show the median (center line), the 25th and 75th percentiles (box limits), and whiskers extending to values within 1.5× interquartile range. Outliers beyond this range are shown as individual points. (d) Differences in microbiome at different bacterial taxonomy levels between free-living ticks and those parasitizing cattle and sheep across various geographic regions. The size of the circle indicates the relative abundance. A linear discriminant effect size analysis (LEfSe) was used at different taxonomical levels to find the genera representing parasitism difference. (e) Differences in bacterial communities between parasitic and free-living ticks across various tick species. Blue denotes bacteria enriched in free-living ticks, while red indicates bacteria enriched in parasitic ticks. The intensity of the color represents the Linear Discriminant Analysis (LDA) score.
Extended Data Fig. 4 Genomic characteristics of tick-associated bacteria.
. (a) Number of MAGs in this study. Different colors represent different genera. Each bar represents a species, with an asterisk (*) indicating genomes not present in public databases. (b) Number of rickettsial genomes, categorized by Rickettsia groups. (c) Pathways enriched in genus-specific genes within Anaplasma, Ehrlichia, and Rickettsia, and their significance levels according to Kyoto Encyclopedia of Genes and Genomes (KEGG). (d) Clusters of Orthologous Groups (COG) annotations of genus-specific genes in Anaplasma, Ehrlichia, and Rickettsia, with different colors representing different genera. J: Translation, ribosomal structure and biogenesis; A: RNA processing and modification; K: Transcription; L: Replication, recombination and repair; B: Chromatin structure and dynamics; D: Cell cycle control, cell division, chromosome partitioning; Y: Nuclear structure; V: Defense mechanisms; T: Signal transduction mechanisms; M: Cell wall/membrane/envelope biogenesis; N: Cell motility; Z: Cytoskeleton; W: Extracellular structures; U: Intracellular trafficking, secretion, and vesicular transport; O: Posttranslational modification, protein turnover, chaperones; C: Energy production and conversion; G: Carbohydrate transport and metabolism; E: Amino acid transport and metabolism; F: Nucleotide transport and metabolism; H: Coenzyme transport and metabolism; I: Lipid transport and metabolism; P: Inorganic ion transport and metabolism; Q: Secondary metabolites biosynthesis, transport and catabolism; R: General function prediction only; S: Function unknown. (e) Phylogenetic tree of Borrelia genomes. Reference genomes are represented by gray dots. (f) Distribution of pathogens across different tick species in this study. The thickness of the chord represents the number of positive samples. Different colors represent different genera of ticks and the tick species are arranged from left to right as follows: R. microplus, R. turanicus, R. pumilio, R. haemaphysaloides, R. sanguineus, I. persulcatus, I. pomerantzevi, I. ovatus, I. granulatus, I. acutitarsus, Hy. asiaticum, Hy. isaaci, Hae. montgomeryi, Hae. longicornis, Hae. concinna, Hae. japonica, Hae. qinghaiensis, Hae. hystricis, Hae. megaspinosa, Hae. tibetensis, Hae. yeni, Hae. kolonini, Hae. flava, D. silvarum, D. niveus, D. abaensis, D. nuttalli, D. reticulatus, D. everestianus, Ar. persicus, Am. javanense, Am. varanense, Am. geoemydae, Am. helvolum and Al. lahorensis. (g) Phylogenetic tree of Rickettsia primarily annotated as endosymbionts. Species names are abbreviated as follows: Rickettsia endosymbiont of Bemisia tabaci: REBT; Rickettsia endosymbiont of Diachasma alloeum: REDA; Rickettsia endosymbiont of Pyrocoelia pectoralis: REPP; Rickettsia endosymbiont of Culicoides newsteadi: RECN; Rickettsia endosymbiont of Oedothorax gibbosus: REOG; Rickettsia endosymbiont of Cardiosporidium cionae: RECC. (h) Pathogenic features of Rickettsia genomes. The size of the dots indicates coverage, while the shade of the color represents similarity. (I) Evolutionary relationships of Francisella MAGs assembled in this study, showcasing the identity (color) and coverage (size) of virulence and nutritional genes, along with genomic collinearity. The calculation method for coverage is the length of the consistent region divided by the full length of the reference gene. The species names are abbreviated as follows: Francisellaceae bacterium (FB), Francisella-like endosymbiont of Amblyomma geoemydae (FLE-Ag), Francisella endosymbiont of Amblyomma maculatum (FEAM), Francisella-like endosymbiont of Amblyomma testudinarium (FLE-At), Francisella endosymbiont of Ornithodoros moubata (FEOM), and Francisella-like endosymbiont of Hyalomma anatolicum (FLE-Ha).
Extended Data Fig. 5 mGWAS of pathogen traits linked with environmental factor.
(a) R. microplus genetic variation correlated with the relative abundance of Rickettsia. The Manhattan plot displays the results of the microbiome genome-wide association analysis. -log10(P-values) are plotted against the position of SNPs on 11 chromosomes. The significance threshold (dashed line) is set to 1×10−6. SNPs above the threshold are shown as large dots. The color of the large dots represents SNPs annotated by SnpEff with high impact or causing missense mutations. (b) D. silvarum genetic variation correlated with the relative abundance of Anaplasma. (c-f) Relationships between the SNPs homozygosity and the pathogen abundance in different genes of ticks. The pathogen load carried by homozygous individuals of tick SNPs is significantly higher than that in heterozygous individuals. A two-sided Wilcoxon rank-sum test was used for statistical analysis, and no adjustment was made for multiple comparisons. Box plots show the median (center line), the 25th and 75th percentiles (box limits), and whiskers extending to values within 1.5× interquartile range. Outliers beyond this range are shown as individual points. In the C-E, the SNPs originate from the Hae. longicornis. The SNPs in the F come from D. silvarum. (g) Clustering results of Hae. longicornis samples based on homozygosity of SNPs.
Supplementary information
Supplementary Information
Supplementary Figs. 1–4, Notes 1–8, Discussion and References.
Supplementary Tables
Supplementary Tables 1–13.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Du, LF., Shi, W., Cui, XM. et al. Genome-resolved metagenomics reveals microbiome diversity across 48 tick species. Nat Microbiol 10, 2631–2645 (2025). https://doi.org/10.1038/s41564-025-02119-z
Received:
Accepted:
Published:
Issue date:
DOI: https://doi.org/10.1038/s41564-025-02119-z