Introduction

Our world has witnessed frequent outbreaks of emerging and re-emerging infectious diseases (EIDs), with catastrophic consequences for regional and global human health, economic development, and animal husbandry1. Mounting evidence indicates that the prevention and control of EIDs cannot be achieved from the human side alone1,2. Our world has already become a community with a shared future for all life; hence, a One Health approach that involves multidisciplinary collaboration among public health, microbiology, veterinary medicine, wildlife science, ecology, etc. has been initiated as a solution2,3. It is widely recognized that most EID events originate in wildlife4,5. Thus far, virus diversity has been extensively investigated in, but overwhelmingly biased toward, known natural reservoirs and vectors such as bats, rodents, ticks, and mosquitoes6,7. By contrast, we know far less about the viruses harbored by intermediate animals, particularly the ecological node animals that connect wildlife in nature with domestic animals and humans in anthropogenic areas. These knowledge gaps significantly impede the prediction of EIDs and the design and implementation of countermeasures.

The wild boar (Sus scrofa) is an indigenous suid widely distributed in various habitats across Eurasia8. It has been introduced to North and South America as an invasive species8. With the increase in its population, the wild boar has become frequently encountered wildlife in human-dominated agricultural regions and urban communities9 and is assessed as Least Concern on the International Union for Conservation of Nature Red List of Threatened Species10, indicating that it faces no threat of extinction. Wild boars consume predominantly plants and occasionally eggs, carrion, small rodents, insects, and worms, and are also the prey of large carnivores such as wolves, leopards, and tigers8. Hence, wild boars are ecological node animals that play a complex role in various ecosystems. Due to their invasion of agricultural regions, damage to crops, conflicts with domestic animals and humans, and, importantly, transmission of diseases to other animals9,10,11, wild boars have been considered pest animals and are under active suppression in Europe and China12,13.

Wild boars are neglected natural reservoirs of many viruses. Their role in harboring and transmitting viruses has long been overshadowed by that of other animals. The few studies investigating wild boar viruses were conducted on the grounds that these animals are reservoirs for pig pathogens14,15. Indeed, there is no immunological or physiological barrier between wild boars and domestic pigs14,16; hence, pig pathogens such as classical swine fever virus (CSFV), porcine circovirus (PCV), porcine parvovirus (PPV), and African swine fever virus (ASFV) can readily spread to, and have been found in, wild boars14,15. These viruses maintain stable circulation among these free-roaming animals and become an important source of infection for domestic pigs14,15. In addition, a few zoonotic viruses, such as hepatitis E virus (HEV), influenza A virus (IAV), and Japanese encephalitis virus (JEV), have been detected in wild boars, posing a potential exposure risk to humans in close contact with them15. Nonetheless, the breadth and circulation dynamics of wild boar viruses have rarely been investigated, which is out of proportion to the population size, distribution, and ecological role of these animals.

In this study, we conducted a nationwide pan-viromic study of wild boars in China using a combination of DNA-specific multiple displacement amplification (MDA) and RNA-specific meta-transcriptomic (MTT) viromic methods. We deciphered the circulation dynamics of wild boar viruses among vectors, wildlife, domestic animals, and humans. Our results provide important insight into understanding the virus diversity of wild boars and their ecological and epidemiological relationships with other animal taxa, which will help predict potential wild boar-associated EIDs.

Results

Overview of the wild boar virome

Between 2018 and 2024, we collected 2535 liver, spleen, kidney, lung, tonsil, and submaxillary, inguinal, and mesenteric lymph node samples and 274 blood samples of 466 healthy and 50 dead wild boars from 127 locations in 26 provincial regions in China (Supplementary Fig. 1 and Supplementary Data 1). We prepared 95 single-individual- (si-) and 66 multiple-individual- (mi-) MTT libraries, 81 si- and 53 mi- MDA libraries, and 9 si- metagenomic (MTG) libraries for viromic sequencing. Among them, 60 si- and 44 mi-treatments were subjected to MTT and MDA paired library preparation and sequencing. We found ticks infesting some wild boars at sampling and collected 2 Amblyomma testudinarium ticks and 2 Dermacentor steini ticks in Anhui Province and 3 Haemaphysalis hystricis ticks in Guangdong Province. We mixed these ticks according to the location and prepared 2 MTT libraries. We recovered 9281 exogenous eukaryotic viral sequences (that is, BrCN-Virome) from these wild boar libraries, with at least 2779 being complete genomes or encompassing the complete coding region. BrCN-Virome covers 4 RNA and 5 DNA viral phyla (Fig. 1A, B), with 3724 and 3393 sequences separately assigned to known and new species within 46 known families, and the remaining 2164 classified into new families within at least 14 orders. Owing to the proclivity of the MDA technique to amplify single-stranded circular DNA molecules, the circular Rep-encoding single-stranded (CRESS) DNA viruses were overrepresented in BrCN-Virome, accounting for 49.0% (n = 4588) (Fig. 1C). Double-stranded DNA viruses were also efficiently recovered, constituting 14.5% (n = 1361) of the data set, with most being ASFV sequences (Fig. 1C). We identified 2551 full-length virus hallmark genes (VHGs) of BrCN-Virome, i.e., 243 RNA-dependent RNA polymerase- (RdRp) and 2308 major capsid protein- (MCP) sequences. At the granularity level of subgenus (i.e., sequences with an average amino acid identity of 90% [AAI90] over 80% coverage17), these VHG sequences were grouped into 1137 viral clusters (vcAAI90), with each paired library capturing 30.1 ± 25.3 (mean ± standard deviation) vcAAI90s (Fig. 1D). However, none of these vcAAI90s showed >50% positive rates in the paired libraries, and only 29 assigned to PCV2, ASFV, porcine astrovirus (PoAstV), and porcine mastadenovirus (PoAdV) appeared in 20–38% of paired libraries (Fig. 1E). This suggests the viromes of wild boars varied significantly between libraries.
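The grouping criterion described above (AAI of at least 90% over at least 80% coverage) and the per-cluster positive rate can be illustrated with a minimal Python sketch. The clustering algorithm itself is not specified in this section, so single-linkage grouping with a union-find structure is assumed here purely for illustration; the function names, sequence identifiers, pairwise hits, and library labels are all hypothetical.

```python
from collections import defaultdict


def cluster_by_aai(seq_ids, hits, min_aai=90.0, min_cov=80.0):
    """Single-linkage clustering (an assumption, not the published method):
    two sequences are linked when a pairwise hit meets both thresholds."""
    parent = {s: s for s in seq_ids}

    def find(x):
        # Find the cluster representative, shortening the path as we go.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b, aai, cov in hits:
        if aai >= min_aai and cov >= min_cov:
            root_a, root_b = find(a), find(b)
            if root_a != root_b:
                parent[root_b] = root_a

    clusters = defaultdict(set)
    for s in seq_ids:
        clusters[find(s)].add(s)
    return sorted(sorted(members) for members in clusters.values())


def positive_rate(cluster, library_of, n_libraries):
    """Fraction of libraries that contain at least one member of the cluster."""
    return len({library_of[s] for s in cluster}) / n_libraries


# Hypothetical toy data: (sequence A, sequence B, AAI in %, coverage in %).
seq_ids = ["s1", "s2", "s3", "s4"]
hits = [
    ("s1", "s2", 95.2, 91.0),  # linked
    ("s2", "s3", 92.0, 60.0),  # coverage too low, not linked
    ("s3", "s4", 90.0, 80.0),  # exactly at both thresholds, linked
]
library_of = {"s1": "libA", "s2": "libB", "s3": "libC", "s4": "libC"}

clusters = cluster_by_aai(seq_ids, hits)
rates = [positive_rate(c, library_of, n_libraries=4) for c in clusters]
```

Single linkage can chain distantly related sequences into one cluster, so a centroid-based or greedy approach may give different cluster counts on real data.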

Fig. 1: Overview of the wild boar virome.

A, B Distribution and relative abundance (shown using RPM) of RNA viruses across the 161 MTT libraries (A) and DNA viruses across the 134 MDA libraries (B). Viruses are arranged on the horizontal axes according to genome type and viral phylum. Libraries are shown on the vertical axes, which are classified into four groups based on sample type. The viral taxonomic suffixes are all trimmed. The prefix ‘Un.’ in a virus name identifies the virus cannot be assigned to a known taxon. C Composition of BrCN-Virome on the viral family level. D vcAAI90 number captured in each paired library (n = 104). The box plot illustrates the estimated median (center line), upper and lower quartiles (box limits), and whiskers (error bars) denoting the highest and lowest points within the 1.5× interquartile range of the upper and lower quartiles. E Positive rates of each vcAAI90 in these paired libraries. Source data are provided as a Source Data file.

Different viromic composition between wild boars and domestic pigs

We prepared a VHG data set of domestic pigs and clustered it with the counterpart of BrCN-Virome at the AAI50 level (i.e., resembling the viral family level17). In total, 1145 vcAAI50s were generated, with 605 specific to BrCN-Virome and 479 to domestic pigs, while only 61 were shared by the two groups (Fig. 2A). BrCN-Virome had much richer diversity in Flaviviridae, Bunyavirales, Totiviridae, Circoviridae, unclassified CRESS DNA viruses, and Papillomaviridae (Fig. 2A). Notably, compared to the 32 CRESS DNA-related vcAAI50s of domestic pigs, BrCN-Virome has expanded them to 275. In contrast, domestic pigs harbored much more diverse AstVs, caliciviruses, picornaviruses, parvoviruses, AdVs, herpesviruses, and poxviruses (Fig. 2A). Most of the shared viruses were sedoreoviruses and ASFVs (Fig. 2A). This demonstrates that virus diversity varies between wild boars and domestic pigs. The CLANS analysis showed that BrCN-Virome contributed limited genetic diversity to these RNA viruses (Fig. 2B), but consisted of rich and abundant CRESS DNA viruses and contributed many genetic novelties to the mega-cluster of Papilloma/Polyoma. Interestingly, by closely inspecting the CLANS result, we found 2 and 3 VHG sequences in the mega-clusters of Flaviviridae and Mononegavirales, respectively, loosely related to these domestic pig viruses (Fig. 2B). The former 2 were annotated to tick-related flaviviruses, while the latter 3 were carnivore-related canine distemper viruses (CDV) within the species Morbillivirus canis (Fig. 2B). This suggests viral links between wild boars and other mammals.
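The partition of vcAAI50s into wild boar-specific, domestic pig-specific, and shared clusters can be sketched as follows. This is a minimal illustration that assumes each cluster is represented by the set of host sources of its member sequences; the cluster identifiers and source labels are hypothetical, and only the three counts in the final line come from the text.

```python
from collections import Counter


def classify_clusters(cluster_sources):
    """Count clusters by host origin.

    cluster_sources maps a cluster ID to the set of sources of its members;
    the labels 'wild_boar' and 'domestic_pig' are hypothetical.
    """
    counts = Counter()
    for cluster_id, sources in cluster_sources.items():
        if sources == {"wild_boar"}:
            counts["wild_boar_only"] += 1
        elif sources == {"domestic_pig"}:
            counts["domestic_pig_only"] += 1
        elif sources == {"wild_boar", "domestic_pig"}:
            counts["shared"] += 1
        else:
            raise ValueError(f"unexpected sources for {cluster_id}: {sources}")
    return counts


# Hypothetical toy clusters.
toy = {
    "c1": {"wild_boar"},
    "c2": {"domestic_pig"},
    "c3": {"wild_boar", "domestic_pig"},
    "c4": {"wild_boar"},
}
counts = classify_clusters(toy)

# Consistency check on the reported figures: specific plus shared clusters
# should add up to the reported total of 1145 vcAAI50s.
reported_total = 605 + 479 + 61
```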

Fig. 2: Difference of virus diversity between wild boars and domestic pigs.

A Richness comparison of virus diversity on the virus family level between wild boars and domestic pigs. B Clustering analysis of BrCN-Virome and domestic pig viruses using CLANS. The two insets on the right panel are magnifications of the clusters of Flaviviridae and Mononegavirales. Taxonomic suffixes of clusters are trimmed. Abbreviations are explained in Supplementary Table 1. Animal cartoons used here are adapted from free resources designed by Freepik (https://www.freepik.com/). Source data are provided as a Source Data file.

Phylogenetic analyses reveal multiple viruses of concern

We quantified the genetic diversity of BrCN-Virome using VHG sequences and found that, at the AAI90 level, there were 815 vcAAI90s not shared with known domestic pig viruses (Fig. 3A), i.e., adding at least 815 new subgenera to the virus diversity of suids. Most of the new vcAAI90s were contributed by DNA viruses, particularly by circoviruses, which contributed 319 vcAAI90s (Fig. 3A). However, BrCN-Virome only contributed 52 new vcAAI90s to RNA viruses (Fig. 3A). We conducted family/order-level phylogenetic analyses based on the 2079 full-length VHG sequences (Fig. 3B). A total of 381 VHG sequences covering at least 19 families were closely related to known references in GenBank with more than 99% AAI, 76.9% (n = 293) of which were annotated to pig viruses, such as CSFV, PPV, PCV, AstV, AdV, and ASFV, and the rest were related to papillomavirus, group A rotavirus (RVA), Akabane virus (AKAV), CDV, Mogiana tick virus, etc., of human, bovine, carnivore, and tick origins (Fig. 3B). In addition, we discovered 21 anelloviral sequences in BrCN-Virome closely (AAI: 81.0–98.5%) related to rodent wawtorqueviruses, alongside a papillomaviral clade and 3 polyomaviral sequences exhibiting distant phylogenetic relationships (AAI: 51.4–64.5%) to human gammapapillomaviruses and human deltapolyomaviruses, respectively (Fig. 3B).

Fig. 3: Phylogenetic analyses of BrCN-Virome reveal multiple viruses of concern.

A Candidates for new virus subgenera identified in this study. B Phylogenies of VHG sequences of BrCN-Virome (tips in red) in the context of GenBank (black). C Geographic distribution of viruses of concern and their positive libraries of live and dead wild boars. Map data were retrieved from the Ministry of Natural Resources of the People’s Republic of China (http://bzdt.ch.mnr.gov.cn/). D Virus circulation among wild boars, humans, domestic animals, wildlife, and arthropods. Taxonomic suffixes of clusters are trimmed. Abbreviations are explained in Supplementary Table 1. Animal cartoons used here are adapted from free resources designed by Freepik (https://www.freepik.com/). Source data are provided as a Source Data file.

The phylogenetic analyses revealed multiple wild boar viruses that have evident or potential associations with human or livestock infectious diseases. Further analysis showed that 18 viruses of concern infecting domestic pigs, cattle, carnivores, sheep, and humans were present in wild boars across 22 provincial regions (Fig. 3C). Among them, ASFV, CSFV, PCV2, porcine epidemic diarrhea virus (PEDV), PPV1, and RVA are notorious pathogens of pigs that usually cause substantial damage to the pig industry. PCV2 and ASFV were widely distributed and present in 18 and 11 provincial regions, respectively (Fig. 3C). Notably, 97.6% (41/42) of ASFV-positive libraries and 65.0% (13/20) of CSFV-positive ones consisted of samples of dead wild boars (Fig. 3C). However, as to the remaining pig pathogens, most of their positive libraries were composed of samples of live wild boars (Fig. 3C). AKAV that causes reproductive abnormalities in herbivore livestock18 was found in wild boars in the Guangxi Zhuang Autonomous Region (Fig. 3C). We did not find wild boar HEV closely related to the human agent, but some wild boar RVA sequences showing average nucleotide identity (ANI) as high as 97.1% with the human agent were present in 7 provincial regions. CDV, a lethal virus to carnivores, was discovered in the neighboring Jilin and Liaoning provinces (Fig. 3C).

Analysis of the circulation spectrum of wild boar viruses showed that at least 24 virus species of wild boars were traced to domestic pigs, humans, ticks, birds, carnivores, cattle, sheep, and rodents (Fig. 3D), among which 18 species are responsible for various diseases of humans and other animals. Compared with those shared with other hosts, the wild boar viruses shared with domestic pigs were much more diverse, involving 16 virus species (Fig. 3D). We also noted circulation of parvoviruses between wild boars and birds (bluetails, cranes, and shrikes) and of polyomaviruses between wild boars and rodents (Fig. 3D). In particular, 4 unclassified tick-borne flaviviruses, i.e., Guangxi tick virus, Mogiana tick virus, Kindia tick virus, and Amarillovirales sp., were also related to BrCN-Virome (Fig. 3D).

Tick-related virus circulation

By querying against the mosquito and tick branches of the ZOVER database7 at the AAI90 level, we found 458 BrCN-Virome sequences of 37 species within at least 11 virus families were associated with at least 23 species of ticks and 6 species of mosquitoes (Fig. 4A). Genomoviruses were the most diverse, involving 11 virus species. Coming in second was circoviruses with five species, including PCV2 and PCV3. Some viruses within the families Flaviviridae, Peribunyaviridae, Phenuiviridae, and Rhabdoviridae were also found in BrCN-Virome, such as Rhipicephalus-associated flavi-like virus, Mogiana tick virus, Guangxi Tick virus, AKAV, Tongren Phenu tick virus 1, Ledantevirus yongjia, and Ledantevirus longquan (Fig. 4A). The H. longicornis ticks were related to 13 virus species of BrCN-Virome, with 8 being genomoviruses (Fig. 4A). The Rhipicephalus microplus ticks were associated with 8 virus species of these wild boars, all being known vector-borne viruses except a Hepeviridae sp. (Fig. 4A).

Fig. 4: Relatedness of wild boar viruses to mosquito- and tick-borne viruses.

A A total of 458 BrCN-Virome sequences of 37 species within at least 11 viral families matched up with at least 23 tick species and 6 mosquito species. Abbreviations are explained in Supplementary Table 1. B Viromic overview of the two tick libraries. C Four tick-borne viruses identified in the tick library AHST1901 were discovered in 10 wild boar libraries with different relative abundances. Animal cartoons used here are adapted from free resources designed by Freepik (https://www.freepik.com/). Source data are provided as a Source Data file.

The 2 tick MTT libraries allowed us to further confirm the circulation, from which we recovered 77 exogenous eukaryotic viral sequences representing 22 species within families Flaviviridae, Nairoviridae, Phenuiviridae, Rhabdoviridae, etc. (Fig. 4B). The two tick libraries showed markedly different viromic compositions. The Anhui library (AHST1901) was predominantly composed of flavivirus-like sequences, whereas the Guangdong one (GDST1901) was mainly associated with bunyaviruses, rhabdoviruses, etc. (Fig. 4B). We compared the two tick viromes with BrCN-Virome and found that four virus species of the AHST1901 virome were related to 10 wild boar libraries (Fig. 4C). Notably, the ticks of AHST1901 were collected from the wild boars of AnHH1905; the four AHST1901 viruses were all present in the library AnHH1905 with ANI as high as 100%. The Mogiana tick virus was distributed in all the 10 wild boar libraries with relative abundances of more than 200 RPM in certain wild boar libraries (Fig. 4C), suggesting the virus was actively replicating in these wild boars at sampling.
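Relative abundance is reported throughout as RPM (reads per million). As a minimal sketch, RPM can be computed as the number of reads mapped to a virus scaled by the total number of reads in the library. The exact denominator used here (for example, whether host or ribosomal reads were removed first) is not stated in this section, so the choice below is an assumption, and the read counts are hypothetical.

```python
def reads_per_million(mapped_reads, total_reads):
    """Reads mapped to a virus per million reads in the library.

    Assumption: total_reads is the library size used for normalization;
    the source does not state which read set serves as the denominator.
    """
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    return mapped_reads * 1_000_000 / total_reads


# Hypothetical library: 5,000 viral reads out of 20 million reads in total.
rpm = reads_per_million(5_000, 20_000_000)
exceeds_200_rpm = rpm > 200  # the abundance level mentioned in the text
```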

Viromic signatures of healthy, dead, and ASFV-killed wild boars

To investigate the causes of death of wild boars, we first inspected the distribution of viruses of concern in the 60 si-libraries of dead wild boars. Fifty-one libraries were associated with infection with these pig pathogenic viruses (Supplementary Fig. 2 and Supplementary Data 2), of which 74.5% (n = 38) were associated with single infection with ASFV (n = 15), CSFV (n = 7), or PCV2 (n = 5) or with co-infection with ASFV and PCV2 (n = 11). Among the paired treatments, 50 and 45 were tissues of dead wild boars (db) and apparently healthy individuals (hi), respectively. We then compared the viromic composition of the two library types. The db- and hi-viromes captured 691 and 436 vcANI90s, respectively. The accumulation curves of vcANI90 suggested that neither type of virome reached saturation (Fig. 5A). The hi-libraries had significantly lower within-sample virus diversity than the db-libraries (Wilcoxon rank sum test, p = 6.772e-07) (Fig. 5B). A total of 26 vcANI90s showed different abundance patterns between the two library types, of which 14 were ASFV, and the rest were AstV, alphapolyomavirus, PCV3, and unclassified CRESS DNA viruses (Fig. 5C). AstV and PCV3 tended to appear in the hi-libraries, but ASFV, alphapolyomavirus, and CRESS DNA viruses showed higher positive rates in the db-libraries. Indeed, AstV appeared in 42.2% (n = 19) of hi-libraries but only in 14.0% (n = 7) of db-libraries. On the contrary, ASFV was positive in 60.0% (n = 30) of db-libraries but only in one hi-library. These results indicate that ASFV infection was the main cause of death of the wild boars.
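The two-tailed Wilcoxon rank sum test used for the comparisons above can be sketched in plain Python. This is an illustration only: it uses the normal approximation with tie correction and no continuity correction, so p-values may differ slightly from those returned by statistical packages, and the diversity values are hypothetical. The positive-rate difference at the end uses the AstV counts given in the text (19 of 45 hi-libraries, 7 of 50 db-libraries).

```python
import math


def rank_sum_test(x, y):
    """Two-sided Wilcoxon rank sum (Mann-Whitney U) test.

    Normal approximation with tie correction, no continuity correction.
    Returns (U statistic of x, two-sided p-value).
    """
    pooled = sorted((v, g) for g, vals in enumerate((x, y)) for v in vals)
    n1, n2 = len(x), len(y)
    n = n1 + n2

    # Assign average ranks to tied values and accumulate the tie term.
    ranks = [0.0] * n
    tie_term = 0
    i = 0
    while i < n:
        j = i
        while j + 1 < n and pooled[j + 1][0] == pooled[i][0]:
            j += 1
        avg_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[k] = avg_rank
        t = j - i + 1
        tie_term += t**3 - t
        i = j + 1

    r1 = sum(r for r, (_, g) in zip(ranks, pooled) if g == 0)
    u1 = r1 - n1 * (n1 + 1) / 2
    mu = n1 * n2 / 2
    sigma = math.sqrt(n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1))))
    if sigma == 0:  # all observations identical
        return u1, 1.0
    z = (u1 - mu) / sigma
    return u1, math.erfc(abs(z) / math.sqrt(2))


# Hypothetical within-sample diversity values for two groups of libraries.
hi_diversity = [1, 2, 3, 4, 5]
db_diversity = [6, 7, 8, 9, 10]
u_stat, p_value = rank_sum_test(hi_diversity, db_diversity)

# Positive-rate difference for AstV (hi minus db), counts from the text.
astv_diff = 19 / 45 - 7 / 50
```

In the volcano-style plot of Fig. 5C, a cluster would be flagged when its p-value falls below 0.05 and the absolute positive-rate difference exceeds 0.1; the AstV difference of about 0.28 clears the second threshold.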

Fig. 5: Viromic comparison between dead wild boars (db) and healthy individuals (hi).

A Species accumulation curves of db- and hi-libraries based on vcANI90s. The center lines indicate the predicted vcAAI90 numbers with shadowed areas representing the 95% confidence intervals. B Alpha diversity comparison between hi- (n = 35) and db-libraries (n = 60). The difference was examined using the two-tailed Wilcoxon rank sum test. The p-value was determined to be 6.772e-7, indicating an extremely significant difference. Each box plot illustrates the estimated median (center line), upper and lower quartiles (box limits), and whiskers (error bars) denoting the highest and lowest points within the 1.5× interquartile range of the upper and lower quartiles. C Different viromic signatures of hi- and db-libraries. The absolute lg-transformed P-values are shown on the vertical axis, which was examined using the two-tailed Wilcoxon rank sum test to indicate the differences in the relative abundance of vcANI90s between db- and hi-libraries. The positive rate differences of vcANI90s between the two types of libraries are shown on the horizontal axis. The dashed horizontal line identifies the significant level (p = 0.05); the two dashed vertical lines indicate the positive rate difference of 0.1. D Viruses in libraries of ASFV-killed wild boars (db-ASFV) showed less abundance than that in hi-libraries. Source data are provided as a Source Data file.

We further examined the impact of ASFV infection on the viromic composition. In these ASFV-positive db-libraries (db-ASFV), ASFVs were very abundant, with an average RPM of 402.5 ± 200.5 (Fig. 5D). But unexpectedly, except for ASFV, all other viruses were sparsely distributed in ≤ 26.7% (n = 8) of the db-ASFV libraries, and few of them were more abundant than ASFV (Fig. 5D). However, viruses in the hi-libraries (the ASFV-positive one excluded) were distributed more evenly, with 18 vcANI90 clusters related to PCV2, AstV, and AdV present in ≥ 27.3% (n = 12) of hi-libraries (Fig. 5D). Particularly, viruses in hi-libraries were very abundant, with an average RPM of 495.4, much higher than viruses in the db-ASFV libraries (mean RPM = 122.8) (t test, p = 0.001289) (Fig. 5D). These suggested that the ASFV infection suppresses the replication of other viruses.

ASFV variants in wild boars

We extracted 29 full-length sequences of the B646L gene of ASFV from ASFV-positive si-libraries and conducted phylogenetic analysis with references. These sequences were divided into genotypes I and II, with 99.9–100% nt identities with their references (Fig. 6A). The majority (n = 24) of these sequences fell into genotype II (GII), and five collected in 2023 and 2024 were genotype I (GI) (Fig. 6A). Molecular detection of all individuals showed that 60% (36/60) of the dead wild boars were positive for ASFV (Fig. 6B), including the 33 that tested positive by viromic analysis. The 3 samples in which ASFV was detected by PCR but not by viromic sequencing were all classified as GII viruses with very low viral loads, as evidenced by Ct values > 35. Genotyping based on the amplicons confirmed the phylogenetic analysis (Fig. 6A, B). Moreover, we screened for recombination events of ASFV in wild boars using a method based on fragmented contigs. Fragmented contigs of all GII ASFVs closely matched the GII reference with 99.5–100% nt identities (Fig. 6C), suggesting no recombination occurred. However, those of GI viruses mapped alternately to the GI and GII references (Fig. 6D), i.e., all GI ASFVs were GI/II-recombinant viruses. Notably, we found that some fragmented ASFV contigs of the library JiXS2402 overlapped with each other (Fig. 6E), showing different identities with the reference. We then amplified the entire B646L gene from the sample and randomly picked 15 clones for Sanger sequencing. Surprisingly, these clones were very heterogeneous (Fig. 6F). The majority (n = 11) of clones were the same as the sequence assembled by viromic sequencing. Among the rest, however, one was classified into GII with 99.9% nt identity with the reference. In particular, the remaining three clones showed 98.3–99.1% nt identities with each other and with GI and GII references (Fig. 6F) and hence cannot be robustly classified into any genotype based on the genotyping criterion of ASFV19. This suggests that the individual JiXS2402 was simultaneously infected by multiple ASFV variants.
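The logic of the fragment-based recombination screen can be sketched as follows. The details of the method are not given in this section, so this is only an assumed simplification: each genome fragment is assigned to whichever reference (GI or GII) it matches more closely, and a genome whose informative fragments are split between the two references is flagged as a putative recombinant. The function names, the 0.5 percentage-point margin, and all identity values are hypothetical.

```python
def assign_fragments(fragments, min_gap=0.5):
    """Label each fragment by its closer reference.

    fragments: list of (start position, % identity to GI, % identity to GII).
    min_gap is a hypothetical margin below which a fragment is 'ambiguous'.
    Returns labels ordered by genome position.
    """
    labels = []
    for _start, id_gi, id_gii in sorted(fragments):
        if abs(id_gi - id_gii) < min_gap:
            labels.append("ambiguous")
        elif id_gi > id_gii:
            labels.append("GI")
        else:
            labels.append("GII")
    return labels


def is_putative_recombinant(labels):
    """True when informative fragments match more than one reference."""
    informative = {label for label in labels if label != "ambiguous"}
    return len(informative) > 1


# Hypothetical genomes: one uniformly GII-like, one mosaic.
gii_genome = [(0, 97.1, 99.9), (20000, 96.8, 100.0), (40000, 97.5, 99.6)]
mosaic_genome = [(0, 99.9, 97.0), (20000, 97.2, 99.8), (40000, 99.7, 99.6)]

gii_labels = assign_fragments(gii_genome)
mosaic_labels = assign_fragments(mosaic_genome)
gii_flag = is_putative_recombinant(gii_labels)
mosaic_flag = is_putative_recombinant(mosaic_labels)
```

A screen of this kind only nominates candidates; mixed infections such as the one described for JiXS2402 can produce a similar pattern, which is why clone-level sequencing is informative.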

Fig. 6: Characterization of wild boar ASFVs.

A Phylogenetic analysis of 29 full-length B646L gene sequences from ASFV-positive libraries using the maximum-likelihood method. B Health status and ASFV infection of these wild boars. GII and GI/II indicate wild boars infected with genotype II and GI/GII-recombinant ASFVs, respectively. C, D Nucleotide identities of ASFV full genomes recovered from libraries ShNS1903 (C) and JiXS2401 (D) to GI (orange-filled circles) and GII (green-filled diamonds) references based on fragmented contigs. E Fragmented ASFV contigs recovered from library JiXS2402 show different nt identities to the same genomic regions of the GII reference. F Phylogenetic analysis of the 15 sub-clones of the B646L amplicon of JiXS2402 using the neighbor-joining method. The five variants are distinguished using colored boxes. The sequence trimmed from the de novo assembled contig is in bold italics. Source data are provided as a Source Data file.

Discussion

This study presents a comprehensive investigation of the virome landscape of wild boars. Although wild boars are physiologically and genetically highly similar to domestic pigs16,20, BrCN-Virome is distinct from the virome of domestic pigs in genetic diversity. This discrepancy is likely attributable in part to the sample types and viromic techniques we used here. All samples involved here were internal solid organ tissues and blood samples, so viruses that mainly replicate in the digestive tract, such as AstV, calicivirus, etc.21, were rarely detected. MTT sequencing has been considered an unbiased way to profile the entire virome and has been widely employed to reveal virus diversity22. This has greatly expanded our knowledge of the diversity of the RNA virosphere23, which partially explains why we discovered few new RNA viruses. However, MTT is much less efficient at capturing DNA viruses than the MDA technique24. That is why we used a combination of MTT and MDA methods to offset their respective deficiencies, allowing us to recover a rich diversity of DNA viruses, particularly those with circular genomes. The traits of the wild boar itself also play an important role in shaping the viromic composition, given its deep participation in various ecosystems8. For example, wild boars freely roam in the wild and are frequently infested by arthropods, giving them a higher chance of contracting vector-borne viruses in Bunyavirales, Flaviviridae, Rhabdoviridae, etc.25 Interestingly, wawtorqueviruses have mainly been detected in rodents26, and gammapapillomaviruses and deltapolyomaviruses are primarily linked to human infections27,28, but few of these viruses have ever been discovered in suids29. The discovery of their relatives in wild boars suggests a potential host range expansion of these viruses.

In a local ecosystem, all life forms directly or indirectly interact with each other via the food chain, shared habitats, resource competition, etc., which provides ample conditions for virus circulation and cross-species transmission30. Here we decoded the virus circulation dynamics between wild boars and other species, demonstrating that the wild boar is a node species connecting humans, domestic animals, wildlife, and arthropods in the virus circulation network. Among these circulating viruses, most are related to domestic pigs, indicating that wild boars are the major natural reservoir of pig viruses. Some viruses are known to be pathogenic to pigs, humans, sheep, cattle, and carnivores31, but the rest are of uncertain pathogenicity. These pig pathogens are widely distributed in wild boars across China, and some, such as ASFV, CSFV, etc., are also lethal to wild boars31. Detection of CDV and AKAV in wild boars is a rare but concerning event, since it suggests that wild boars participate in the maintenance and cross-species transmission of pathogens of other species. In particular, CDV is lethal to carnivores and has already emerged as an extinction factor for the endangered Amur tiger32. As an important food source for these large carnivores, wild boars can readily spread viruses to them, and vice versa, posing a significant obstacle to wildlife conservation32. Wild boars are ideal natural hosts for medically important mosquitoes and ticks and hence can be considered sentinel animals for identifying potential vector-borne viruses25. Although not yet associated with any EIDs, the four tick-borne flaviviruses are potential risks of concern since they were detected in tissue samples of wild boars, indicative of their ability to replicate in mammals. Recently, new tick-borne viruses, such as Alongshan virus33, Songling virus34, Yezo virus35, Wetland virus36, and Xue-Cheng virus37, have been frequently identified as causative agents of human febrile illnesses in China, highlighting the need for a comprehensive assessment of the pathogenicity and risk to humans of the four tick-borne flaviviruses.

Our data show that virus infection is an important cause of wild boar deaths, with ASFV being the predominant agent and other pathogens, such as CSFV and PCV2, also playing a part. By comparing the viromic profiles of ASFV-positive and ASFV-negative wild boars, we found that the ASFV-positive individuals were infected with much more diverse viruses but, surprisingly, at lower abundance. Thus far, no reports have examined the impact of ASFV infection on the viral flora, but some epidemiological investigations revealed that ASFV-positive pigs had higher infection rates of PCV3 and anellovirus than ASFV-negative individuals38,39. These phenomena suggest that ASFV infection disturbs the virus community of hosts. This likely results from host immunosuppression after ASFV infection, which allows diverse viruses to invade hosts40. The lower abundance of these viruses may be attributed to the dysfunction of host cells caused by ASFV infection, which consequently interferes with virus replication41.

Since its introduction into China in 2018, the GII ASFV has had catastrophic consequences for the Chinese pig industry42. It soon spread to the wild boar population43. In 2021, low-virulent GI ASFV emerged in China44, though at low prevalence, resulting in GI and GII recombinant strains in 202245. These events show that the ASF epizootic in pigs has changed rapidly and become very complicated in China. However, knowledge of ASFV in wild boars remains limited. Our data demonstrate that GII ASFV is the predominant genotype in Chinese wild boars. Due to the limited representativeness of our samples, no GI ASFV was discovered here, but GI/II recombinant viruses were detected in wild boars in 2023 and 2024. Notably, we discovered an ASF case coinfected with five different ASFV variants, i.e., GI, GII, and three others of undetermined genotype. Co-infection with multiple ASFV variants, especially of different genotypes, has rarely been reported in either pigs or wild boars, though a dual infection with two GI ASFV variants was observed in a wild boar and a pig carcass in Sardinia, Italy46. Two possibilities, which may act together, could explain how and why such diverse and divergent ASFVs appeared simultaneously in the sample. One is the quasispecies nature of ASFV. The DNA polymerase X and the downstream DNA ligase of ASFV show low fidelity in DNA replication47,48, i.e., they have a potential mutagenic effect. The other is simultaneous infection with multiple exogenous ASFVs, as the GI and GII ASFVs have already spread in China44,45. Nevertheless, these results reveal a complicated situation of ASFV infection in Chinese wild boars, adding further challenges to the prevention and control of ASFV in China.

We must acknowledge the limitations of our data set in fully capturing the complete spectrum of wild boar viral diversity, as both sequencing coverage and sampling scale indicate that the observed virome composition has not yet theoretically saturated. Additionally, bacteriophages were excluded from the data set given that our primary focus was on analyzing eukaryotic viruses in these solid internal organs. Nevertheless, this study refreshes our understanding of the role of wild boars in virus cross-species circulation, highlighting that wild boar viruses not only significantly challenge healthy pig farming but also pose substantial threats to wildlife conservation and public health. Some countermeasures should be considered to mitigate the potential threats. Their chance to encounter domestic pigs can be cut off by enhancing biosecurity within free-range systems through fencing/physical barriers, and monitoring food/water/cleanliness and upgrading it to intensified industrialized production. Active vaccination of wildlife via the distribution of vaccine-laden baits has proved a successful strategy to combat wildlife diseases49. Wild boar viruses should be routinely monitored to understand their dynamics of evolution and distribution. In addition, the reduction of wild boar population by measures like hunting or trapping is another possible measure12.

Methods

Ethics statement

The procedures for sampling and processing wild boars were reviewed and approved by the Administrative Committee on Animal Welfare of Changchun Veterinary Research Institute (Institutional Animal Care and Use Committee Authorization, permit numbers IACUC of AMMS-11-2018-021 and IACUC of AMMS-11-2023-010).

Sample collection and processing

China’s national and local authorities have coordinated efforts to manage wild boar populations and investigate ASFV prevalence13. The sampling activities were conducted in close collaboration with the State/Local Administration of Forestry and Grassland of China. Animal sex was not considered in the study design and analysis. The sampling areas covered various landscapes, including the Northeastern Forest Zone, Northwestern Arid Zone, Southeast Hilly Zone, and Southwestern Mountains (Supplementary Fig. 1 and Supplementary Data 1). All samples were collected in the field and immediately cryo-transported to the laboratory at Changchun Veterinary Research Institute, where they were stored at −80 °C. Wild boars were examined for ectoparasites, and any that were found were removed and transported to the laboratory along with the wild boar samples. All these arthropods were ticks, and their species were morphologically identified under a stereo microscope (Nikon) based on various characters, such as the scutum, the basis capituli and palps, the anal groove, and the shape and size of the spurs on the coxae50,51. When necessary, the tick mitochondrial cytochrome c oxidase subunit I gene52 was amplified and sequenced, then subjected to a BLASTN v.2.14.0+ search against the GenBank nt database (release date: May 1, 2024) to further confirm the species identification.

We prepared a homogenate for each wild boar tissue sample. A piece approximately 6 mm in size (~0.2 g) was cut from each sample and homogenized using a tissuelyser (Jingxin, Shanghai, China) with DMEM buffer at a w/v ratio of 1:5. A few ticks were collected from wild boars at two locations and were grouped into two pools according to sampling location. Ticks in each pool were washed 3 times with sterile DMEM buffer and then homogenized using the tissuelyser in 500 μl DMEM buffer. Homogenates were centrifuged at 10,000 × g for 10 min at 4 °C. Each supernatant was divided equally into two parts, one for viromic sequencing and the other for follow-up molecular detection.

Nucleic acid extraction, library preparation and sequencing

The company names and catalog numbers for all commercial reagents are available in Supplementary Table 2.

We first assessed the MDA and MTG methods for analyzing tissue samples using 9 si-supernatants. Nucleic acid (NA) extraction, library preparation and sequencing, and bioinformatic analysis are described below. Overall, we recovered 128 contigs related to ASFV, AdV, parvovirus, CRESS DNA virus, and anellovirus using the MDA method, a far more diverse set than the 52 ASFV contigs recovered using the MTG method. The 80 ASFV contigs recovered using MDA were ~6340.0 bp in length on average, very close to the ~6468.1 bp of the MTG-recovered ASFV contigs. Therefore, we decided to use MDA to profile the DNA viromes of these samples.

For wild boar tissue samples, we prepared HTS libraries according to location, health status, and collection year, which allowed us to generate more si-libraries than mi-libraries (see Results). Equal volumes of the supernatant of each tissue sample were pipetted to give a final volume of 1 ml. Free NAs in every 260 μl of supernatant mix were digested using 4 μl DNase I (TaKaRa, Dalian, China), 1 μl RNase A (TaKaRa), and 30 μl 10 × DNase buffer at 37 °C for 1 h. The reaction was terminated by adding ethylenediaminetetraacetic acid disodium salt, and the digested mix was then subjected to RNA or DNA extraction. The total RNA was extracted using RNAiso Plus reagent (TaKaRa) following the manufacturer's protocol and further purified using magnetic beads. Ribosomal RNA was depleted from the purified RNA using a RiBo-Zero Magnetic Gold kit (Epicentre Biotechnologies, Madison, WI). The quality and quantity of RNA were determined using a Qubit 4 fluorometer (Invitrogen, Carlsbad, CA). A meta-transcriptomic library was prepared using a NEBNext Ultra directional RNA library prep kit (NEB, Ipswich, MA). The total DNA was extracted using a DNeasy Blood and Tissue kit (Qiagen, Dusseldorf, Germany) and subjected to isothermal amplification using an Illustra GenomiPhi V2 DNA amplification kit (GE, Fairfield, CT). The MDA reaction was incubated at 30 °C for 1.5 h and then inactivated at 65 °C for 10 min. The purified products were quantified and subjected to DNA library preparation using a NEBNext Ultra DNA library Prep kit. To minimize index hopping in multiplexed runs, libraries were labeled using 10-nt unique dual indexes. Libraries were quality checked using a Qsep1 Bio-sequence Analyzer (Houze Bio-Tech, Hangzhou, China), then paired-end (150 bp) sequenced on a NovaSeq 6000 sequencer (Illumina, San Diego, CA). At least 6 gigabases (Gb) of raw data were produced for each library.

For tick samples, we prepared only MTT libraries. The procedure for preparing tick MTT libraries was the same as for wild boars. For wild boar blood samples, we did not digest free NAs but instead spiked the samples with ~10⁵ BHK-21 cells and then subjected them to the preparation of MDA and MTT libraries as described above. To inspect possible cross-contamination and exogenous contamination, we also included a BHK-21 cell blank control for every 30 libraries. The five MTT and four MDA blank controls were processed in parallel with the wild boar samples.

Bioinformatic procedure for BrCN-Virome preparation

The bioinformatic procedure to prepare BrCN-Virome includes raw data filtering, de novo assembly, virus-like sequence (VLS) identification, quality control and enhancement, and taxonomic clustering and assignment. Raw FASTQ files were filtered using fastp v.0.23.4. Since some eukaryotic viruses have the capacity to integrate host-derived genetic sequences into their own genomes53, host genome was not excluded before de novo assembly. High-quality reads of each library were directly de novo assembled using megahit v.1.2.9 with the default k-mer list and a minimum length of 500 for MTT contigs and 1500 for MDA and MTG contigs54. This parameterization aligns with the fact that eukaryotic DNA viruses exhibit minimum genome lengths exceeding 1500 bp, while eukaryotic RNA viruses (including those with segmented genomes) have genomic components no shorter than 500 bp29. To recover circular sequences, MDA- and MTG-assembled contigs were converted to metagenomic assembly graphs using the megahit_toolkit contig2fastg script and then assembled using SCAPP v.0.1.4 (use_scores = False, use_gene_hits = False)55. Virus-like contigs were recognized using a combination of similarity-based and profile-based methods. The former was achieved by querying contigs against our eukaryotic viral reference database (EVRD) v.3.053 using DIAMOND BLASTX v.2.1.8 with the ultra-sensitive mode. The latter was implemented by scanning prodigal-predicted protein sequences against a collection of HMM profiles (i.e., RdRp profiles from iVirus, the virus branch of eggnog v.5.0, and RdRp-scan) using hmmscan, available in HMMER v.3.3.2 (bitscore = 30)56.

No cross-contamination was identified using the cell blank controls, but some suspicious VLSs were discovered by querying against problematic viral sequences53 and host and human genomes and hence were excluded from the preliminary VLS data set. The data set was competitively searched against the nt and nr databases of GenBank using BLASTN and DIAMOND BLASTP at the nt and aa levels, respectively. The top 25 blast hits of each VLS were output, and a VLS was considered viral based on the relative majority rule. We then excluded all bacteriophage-like and endogenous retrovirus-like sequences. The remaining VLSs were assessed using CheckV v.1.0.357, which did not reveal any contamination. We then condensed them using CD-HIT v.4.8.1 at 99% similarity over 80% coverage of the shorter sequence. We considered contigs of high quality if they i) were complete genomes or encompassed the entire coding regions as compared with their relatives, ii) had at least 10-nt-long inversely complementary termini, iii) were circular, and iv) were determined high-quality by CheckV. The final VLSs recovered from wild boar libraries were named BrCN-Virome.
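The relative majority rule applied to the top 25 hits can be illustrated with a minimal Python sketch. The category labels ("virus", "host", "bacteria") and the handling of ties (left unclassified) are assumptions made for illustration only; the text does not specify how hits were labeled or how ties were resolved.

```python
from collections import Counter

def classify_by_relative_majority(hit_labels, top_n=25):
    """Return the most frequent label among the first top_n hits.

    hit_labels: labels of BLAST hits ordered from best to worst.
    Returns None when there are no hits or when the top two labels tie
    (tie handling is an assumption, not taken from the paper).
    """
    top = hit_labels[:top_n]
    if not top:
        return None
    counts = Counter(top).most_common()
    if len(counts) > 1 and counts[0][1] == counts[1][1]:
        return None  # no relative majority
    return counts[0][0]

def is_viral(hit_labels, top_n=25):
    """A sequence is kept as viral only if 'virus' holds the relative majority."""
    return classify_by_relative_majority(hit_labels, top_n) == "virus"
```

For example, a sequence with 10 viral, 8 host, and 7 bacterial hits would be retained as viral, because "virus" is the largest category even though it is not an absolute majority.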

We assessed the overall diversity of BrCN-Virome at the species (ANI90), subgenus (AAI90), and family (AAI50) levels. BrCN-Virome nt and aa sequences were clustered at the respective similarities over 80% (for ANI90 and AAI90) or 50% (for AAI50) coverages using MMseqs2 with the coverage mode of 0. Based on the taxonomic clustering, we applied a hierarchical strategy to assign taxonomic lineages to BrCN-Virome sequences56. The taxonomic assignment was first performed at the subgenus level. If BrCN-Virome sequences and EVRD-aa references were clustered together at the AAI90 level, we took the genus name of the EVRD-aa references and assigned it to these BrCN-Virome sequences. Different BrCN-Virome sequences within a genus were differentiated using the library code and contig ID. The assignment then proceeded to the family level. Likewise, BrCN-Virome sequences within a vcAAI50 cluster were assigned the family name of the EVRD-aa references in that cluster. The remaining BrCN-Virome sequences were queried against EVRD-aa using DIAMOND BLASTP with the ultra-sensitive mode. The five blast hits with the highest bitscores for each query sequence were output; their lowest common ancestor taxon was determined using TaxonKit v.0.8.0 and assigned to the query sequence with the suffix ‘-like’.
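The final step, assigning the lowest common ancestor (LCA) of the five best hits with a ‘-like’ suffix, can be sketched in plain Python. This is a stand-in for illustration, not the TaxonKit procedure itself: it assumes each hit comes with a lineage written as a root-to-leaf list of taxon names, and the function names are hypothetical.

```python
def lowest_common_ancestor(lineages):
    """Return the deepest taxon shared by all lineages, or None.

    lineages: list of lists, each ordered from root to leaf.
    """
    if not lineages:
        return None
    lca = None
    # zip stops at the shortest lineage, so ranks are compared level by level
    for ranks in zip(*lineages):
        if len(set(ranks)) == 1:
            lca = ranks[0]
        else:
            break
    return lca

def assign_like_label(hits, top_n=5):
    """hits: list of (bitscore, lineage) tuples for one query sequence.

    Takes the top_n hits by bitscore and returns '<LCA>-like',
    or None if the hits share no taxon.
    """
    best = sorted(hits, key=lambda h: h[0], reverse=True)[:top_n]
    lca = lowest_common_ancestor([lineage for _, lineage in best])
    return f"{lca}-like" if lca else None
```

With hits from two different genera of the same family, the query would be labeled at the family level (for instance "Picornaviridae-like"), which matches the intent of using the LCA rather than the single best hit.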

Ecological and statistical analysis

Unless specified otherwise, all ecological and statistical analyses were performed using the VHG data set of BrCN-Virome.

VHG identification

VHGs are broadly conserved among diverse groups of viruses17. RdRp is a super-VHG that unites almost all RNA viruses17. RdRp domains of BrCN-Virome were identified by scanning BrCN-Virome aa sequences using hmmsearch against RdRp profiles of iVirus58 and RdRp-scan59. All RdRp candidates were aligned using MAFFT v7.470 in the E-INS-i mode against the full-length RdRp representatives we curated previously56. The RdRp catalytic motifs of these candidates were visually checked using Jalview v.2.11.2.6 at the phylum, order, or family levels. MCP is another super-VHG that is encoded by almost all DNA viruses17. We retrieved MCP sequences within the realms of Adnaviria, Duplodnaviria, Monodnaviria, and Varidnaviria from the PDB database. These MCP sequences were structurally validated and are highly reliable but lack representativeness. Hence, we used UniProt sequences to search against the MCP sequences using DIAMOND BLASTP. All matched sequences were aligned against the MCP sequences and visually checked as described for RdRp curation. Sequences that did not contain conserved motifs were excluded as false matches. Those shorter than 80% of the alignment length were also excluded. The remaining sequences were condensed together with PDB MCP sequences using cd-hit at 90% similarity over 80% coverage. The resulting sequences were ~312 aa in length and considered a curated MCP data set. MCP domains of BrCN-Virome were identified by querying BrCN-Virome sequences against the MCP data set using DIAMOND BLASTP. All MCP candidates were validated by checking conserved motifs. BrCN-Virome MCP sequences were considered full-length if they covered more than 80% of the average length of alignments. All RdRp and MCP sequences of BrCN-Virome were pooled together and condensed using cd-hit at 95% nt similarity over 80% coverage.

Viromic comparison between wild boars and domestic pigs

We first prepared a viral genomic data set of domestic pigs, which was composed of sequence data of two origins. One is Pigs_Virome, established based on 1841 healthy weaned piglets from 45 farms in 25 provincial regions of China and four viromic projects of pigs worldwide60. The other is viral sequences deposited in GenBank with a host label of pig, domestic pig, or Sus scrofa domesticus. We queried the above-curated VHGs using DIAMOND BLASTP against the viral genomic data set of domestic pigs (e-value = 1e-5, query-cover = 80). The identified VHGs of domestic pigs were clustered with the BrCN-Virome VHGs using MMseqs2 at 50% aa similarity over 50% coverage. The two VHG data sets were merged and subjected to all-versus-all alignment using BLASTP. The blast file was analyzed using CLANS v.29.05.2012 with a P value cutoff of 1e-3 in 3D mode. The CLANS analysis was based on ≥10,000 rounds of clustering, with singletons being excluded.

Identification of viruses of concern

Viruses of concern are those evidently or potentially associated with human and animal diseases. To pinpoint the risks that wild boar viruses pose to humans, domestic pigs, and wildlife, we searched BrCN-Virome for viruses of concern related, at the species level, to targets recorded in the Virus Pathogen Resource (ViPR). ViPR was retrieved from the Bacterial and Viral Bioinformatics Resource Center (BV-BRC) v.3.30.19a61, against which all BrCN-Virome sequences were queried using BLASTN (perc_identity = 90, qcov_hsp_perc = 80).

Determination of virus circulation

We define a virus circulation event as two viruses from different host species belonging to the same viral species, i.e., sharing 99% ANI over 80% coverage. The virus-carrier database we prepared previously56 and the mosquito and tick branches of the ZOVER database7 were used as the target. All BrCN-Virome sequences were searched against the target using BLASTN. All BLASTN hits with ≥90% nt similarity over query coverages of ≥80% were output for further analysis. To validate the association of wild boar viruses with tick-borne ones, we queried viral sequences of the two tick libraries against BrCN-Virome using BLASTN with an e-value cutoff of 1e-5.
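The hit-filtering step can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' script: it assumes BLASTN output was written in tabular form with the columns `qseqid sseqid pident qstart qend qlen` (all valid BLAST+ format specifiers), computes query coverage from a single HSP, and uses a hypothetical `subject_host` lookup table mapping reference accessions to host names.

```python
def circulation_candidates(blast_lines, subject_host,
                           min_ident=90.0, min_qcov=80.0,
                           query_host="Sus scrofa"):
    """Return (query, subject, host) tuples that pass the identity and
    query-coverage cutoffs and whose subject host differs from the query host.

    blast_lines: tab-separated lines with columns
                 qseqid, sseqid, pident, qstart, qend, qlen (assumed layout).
    subject_host: dict mapping subject IDs to host names (hypothetical).
    """
    events = []
    for line in blast_lines:
        if not line.strip():
            continue
        qseqid, sseqid, pident, qstart, qend, qlen = line.rstrip("\n").split("\t")
        start, end = sorted((int(qstart), int(qend)))
        qcov = 100.0 * (end - start + 1) / int(qlen)  # single-HSP coverage
        host = subject_host.get(sseqid)
        if (float(pident) >= min_ident and qcov >= min_qcov
                and host is not None and host != query_host):
            events.append((qseqid, sseqid, host))
    return events
```

Using per-HSP coverage is a simplification; hits split across several HSPs would need their aligned intervals merged before the coverage cutoff is applied.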

Viromic characterization of dead and ASFV-killed wild boars

To determine the cause of death of wild boars, we first assessed the virus diversity of 50 tissue libraries of dead wild boars (db-libraries) and 45 of apparently healthy individuals (hi-libraries). The host reads were eliminated from these libraries by mapping against the genome assembly Sscrofa 11.1 of domestic pig (accession number: GCF_000003025.6) using Bowtie2 v.2.4.1 with the very-fast end-to-end preset. The BrCN-Virome VHG sequences were indexed, and the unclassified reads from the 95 libraries were mapped against them using Bowtie2 with the sensitive end-to-end mode. Mapped reads were counted using SAMtools v.1.10. The read counts of all VHGs within a vcANI90 were summed to represent the read count of the vcANI90. The relative abundance of each vcANI90 of VHGs was calculated by dividing the read count of the vcANI90 by the number of unclassified reads of the library in millions (i.e., reads per million, RPM), with values of RPM ≤ 1 being removed to further minimize the effect of index hopping. The species saturation statuses of db- and hi-libraries were assessed using species accumulation curves, which were generated using the ‘specaccum’ function (method = random, permutations = 10,000), available in the vegan package v.2.5-7. Alpha diversities of the two types of libraries were measured using Shannon and Simpson diversity indices with the ‘diversity’ function of the vegan package. The differences in relative abundances of vcANI90s between db- and hi-libraries were compared using the Wilcoxon rank sum test. We used a difference of >0.1 in the positive rate of a vcANI90 between the two types of libraries to indicate whether the vcANI90 was prone to appearing in db- or hi-libraries. To understand the impact of ASFV infection on the viromic composition, the average RPMs of vcANI90s between the libraries of ASFV-killed wild boars (db-ASFV) and hi-libraries were compared using the two-tailed t test. These analyses were conducted in the R environment v.4.1.3.
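The abundance normalization and the two alpha-diversity indices can be made concrete with a short Python sketch. The analyses themselves were run in R with vegan; the functions below are an illustrative re-expression using the standard definitions (Shannon index with the natural logarithm, Simpson index as one minus the sum of squared proportions, which are the conventions of vegan's ‘diversity’ function). The function and variable names are hypothetical.

```python
import math

def rpm(read_count, total_reads):
    """Reads per million: mapped reads divided by library size in millions."""
    return read_count / (total_reads / 1_000_000)

def filter_rpm(cluster_counts, total_reads, min_rpm=1.0):
    """Convert per-cluster read counts to RPM and drop clusters with RPM <= min_rpm."""
    kept = {}
    for name, count in cluster_counts.items():
        value = rpm(count, total_reads)
        if value > min_rpm:
            kept[name] = value
    return kept

def shannon(abundances):
    """Shannon index H = -sum(p * ln p) over non-zero proportions."""
    total = sum(abundances)
    if total == 0:
        return 0.0
    return -sum((a / total) * math.log(a / total) for a in abundances if a > 0)

def simpson(abundances):
    """Simpson index 1 - sum(p^2)."""
    total = sum(abundances)
    if total == 0:
        return 0.0
    return 1.0 - sum((a / total) ** 2 for a in abundances)
```

As a worked example, a cluster with 50 mapped reads in a library of 10 million unclassified reads has an RPM of 5 and is retained, whereas a cluster with 10 mapped reads has an RPM of exactly 1 and is removed by the RPM ≤ 1 filter.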

Phylogenetic reconstruction

The procedures to reconstruct the order- or family-level phylogenies were the same as we described elsewhere56. Briefly, the full-length VHG representatives of BrCN-Virome, all domestic pig viruses, and EVRD-aa were separately sampled from their vcAAI90s and all-versus-all aligned using BLASTP. Sequences were retrieved from the blast hits that had an alignment length of ≥200 within the same order or family. We aligned them using MAFFT in the E-INS-i mode and trimmed the alignments using trimAl v.1.4 to remove columns containing ≥67% gaps. Alignments were visually checked using Jalview to remove sequences not containing conserved motifs. An initial maximum-likelihood tree was reconstructed using IQ-TREE v.1.6.12 with 1000 ultrafast bootstrap replicates and the automatically determined best-fit model62. Divergent sequences not belonging to the order or family were identified and removed from the alignment. The refined alignment was then subjected to another round of phylogenetic reconstruction. The phylogenetic analysis of the ASFV amplicons was conducted using MEGA-X with the neighbor-joining algorithm and 1000 bootstrap replicates. All phylogenetic trees were visualized using FigTree v.1.4.3.

Recombination analysis

Although we generated 15 near-full-length ASFV genomes (≥160 kb), we noticed that multiple contigs in a library could cover the same genic region, indicating that RDP was not suitable for predicting recombination events in this context. Therefore, we devised a method to predict recombination events based on fragmented contigs, which also allowed us to detect co-infection with different strains. All ASFV contigs (≥1.5 kb) of a library were cut into fragments of 1000 bp using the sliding function (window = 1000, step = 500) of the seqkit package v.2.4.0. These fragmented contigs were aligned against the GI (Pig/I/SD/DYI/2021, accession number: MZ945537) and GII (Pig/II/HLJ/2018, accession number: MK333180) ASFV references using BLASTN. The nt identities to the two references were respectively arranged and plotted according to the genomic loci of these fragmented contigs. If the identity plots of an ASFV against the two references intersected, recombination event(s) were predicted to have occurred in the ASFV. If the identity plots against a reference overlapped over the same genomic region(s), i.e., ASFV contigs of a si-library showed different nt identities to the same genomic region, co-infection was predicted to have occurred in the individual. We employed two approaches to validate the recombination prediction. One was mapping all ASFV reads against the two references using Bowtie2 with the very-sensitive end-to-end mode; single nucleotide variations in recombinant regions were compared using bcftools v1.10.2, and they should appear at different frequencies between recombinant regions derived from different parental strains. The other was phylogenetic analysis of recombinant regions to compare their phylogenetic distances to the two references. We used molecular detection to validate the co-infection event (see ‘Molecular validation and detection’).
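The fragment-and-compare idea can be sketched in Python. This is a simplified illustration, not the seqkit/BLASTN workflow itself: `sliding_windows` mimics cutting a contig into 1000-bp fragments with a 500-bp step (dropping a trailing partial window is an assumption of this sketch), and `crossover_positions` takes hypothetical per-fragment identity values to the GI and GII references, ordered by genomic locus, and reports where the two identity curves intersect.

```python
def sliding_windows(seq, window=1000, step=500):
    """Yield (start, fragment) pairs of fixed-size windows along seq.

    A trailing fragment shorter than the window is not emitted
    (an assumption of this sketch).
    """
    for start in range(0, len(seq) - window + 1, step):
        yield start, seq[start:start + window]

def crossover_positions(ident_gi, ident_gii):
    """Return indices of fragments at which the closer reference switches.

    ident_gi, ident_gii: per-fragment nt identities (%) to the GI and GII
    references, in genomic order. Fragments with equal identities are
    skipped, so a switch is reported at the first fragment that clearly
    favors the other reference.
    """
    flips = []
    previous_sign = 0
    for i, (gi, gii) in enumerate(zip(ident_gi, ident_gii)):
        diff = gi - gii
        sign = (diff > 0) - (diff < 0)
        if sign == 0:
            continue
        if previous_sign != 0 and sign != previous_sign:
            flips.append(i)
        previous_sign = sign
    return flips
```

A contig whose fragments are first closer to GI and then closer to GII yields one crossover index, marking a candidate breakpoint to be checked by read mapping and phylogenetic analysis as described above.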

Molecular validation and detection

PCR or RT-PCR assays were used to validate CDV and tick-borne viruses, detect ASFV and CSFV, and fill gaps. Primer pairs (Supplementary Table 3) were taken from previous publications or from World Organization for Animal Health (WOAH) and national recommendations63,64,65, or were designed based on contigs. RNA and DNA were extracted from the supernatants that were kept for molecular detection. RNA extraction was performed using the RNeasy Mini kit (Qiagen). DNA extraction was the same as described in ‘Nucleic acid extraction, library preparation and sequencing’. Complementary DNA was synthesized using the first cDNA synthesis kit (TaKaRa). PCR amplification was performed using the 2×PCR MasterMix (Tiangen) with 30 (for outer PCR), 35 (for inner PCR), or 40 (for normal PCR) thermocycles of denaturation at 94 °C for 30 s, annealing at 54 °C (or adjusted for certain primer pairs) for 30 s, and extension at 72 °C for 40 s. Distilled water was used as the negative control. Most PCR products of the expected size were directly sequenced using the Sanger method on an ABI 3730 sequencer (Comatebio, Changchun, China). To confirm the co-infection of multiple ASFV variants, amplicons were subcloned into pMD18T vectors (TaKaRa) and used to transform DH5α-competent E. coli cells (Tiangen). Fifteen to twenty clones of each amplicon were randomly picked for Sanger sequencing.

Reporting summary

Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.