Introduction

Population genetics provides essential insights into the biology and ecology of wild species. In particular, analysing population structure enables the investigation of dispersal processes, gene flow and interactions between individuals and populations1. These aspects are not only crucial for understanding species evolution but are also particularly relevant in wildlife health and conservation management2. Disease transmission in wildlife depends on multiple factors, including population structure, pathogen dynamics and environmental and host characteristics3,4. For instance, gregarious species often form fragmented social networks, with cohesive subgroups interacting primarily within their geographical range4. Such fragmentation can reduce the risk of large-scale outbreaks for low-transmissibility pathogens, but increase vulnerability to widespread epidemics caused by highly transmissible pathogens that exploit interactions across subgroups4. Analyses of genetic variability within species help explain infectious disease susceptibility and transmission dynamics, especially in free-ranging wildlife with varying social structures5. Understanding these processes may facilitate a deeper comprehension of how population structure influences the epidemiology of infectious diseases in natural ecosystems6.

In Europe, wild ungulates are the most widespread species in the natural environment7. The wild boar (Sus scrofa), in particular, has expanded rapidly owing to its remarkable adaptability and high reproductive rate8,9. This species now occupies a wide range of habitats, including anthropised areas10,11,12 where it causes significant agricultural and environmental damage13,14,15,16. Moreover, as a reservoir of multi-host pathogens, it poses a sanitary threat to humans, livestock and other wildlife species17,18,19.

Among current transboundary animal health threats, African swine fever (ASF) represents one of the most critical challenges in Europe20. The causative agent, Asfivirus haemorrhagiae21, induces a highly lethal disease in domestic pigs and wild boar, leading to significant economic losses22,23. Since its detection in Europe in 2014, genotype II of the ASF virus has been mainly detected in wild boar populations, with sporadic spillover events into domestic pig farms24. In Italy, the genotype was first identified in wild boar in north-western areas during early winter 202125. The index case occurred in the municipality of Ovada, Alessandria province (Piedmont), and the infection quickly spread across the province and regional borders, reaching the province of Genoa (Liguria) within weeks and, later, the Lombardy region in June 202326. Since then, evidence of ongoing spread and persistence has been documented, including recent incursions into pig farms26,27. In the absence of the ASF-tick vector in this geographic context, wild boar movements, density and genetic connectivity are likely key determinants of disease persistence and spread28. Understanding how population structure shapes ASF dynamics is therefore essential for the development of effective control strategies.

Despite growing recognition of the importance of wildlife disease control, population genetics remains underutilised in epizootic investigations, even though it can provide valuable insights into transmission dynamics and containment strategies29,30. For wild boar, few studies have investigated the association between genetic structure and disease emergence or spread, focusing mainly on Aujeszky’s disease and Classical Swine Fever31, animal tuberculosis32,33 and, more recently, ASF3,34,35,36,37. The present study aims to assess whether the degree of connectivity of wild boar populations has influenced the dispersion patterns of ASF in north-western Italy, providing insights that may contribute to improved disease management and control strategies. To this end, we analysed both the genetic structure of wild boar populations and the spatial dynamics of disease spread, with particular attention to areas of genetic admixture – zones where individuals from distinct population clusters interbreed.

Results

Genotyping

From the initial panel of 14 microsatellite loci, one (TGR SW24) failed to amplify. This locus was therefore excluded from subsequent analyses, leaving 13 loci for the study. No evidence of null alleles was detected, with estimated median null allele frequencies ranging from −0.009 to 0.061 across all 13 loci (Supplementary file, Table S2). Observed heterozygosity (Hobs) ranged from 0.45 to 0.6 across all hunting districts, while expected heterozygosity (He) varied between 0.46 and 0.57. Tests for Hardy-Weinberg equilibrium (HWE) indicated significant deviations at all loci (p ≤ 0.019), except for SW951 (p = 0.072). At the district level, deviations from HWE across all loci were observed in hunting areas located in southern (i.e., CNa, CNc; p < 0.01) and south-eastern Piedmont (AL1a-2a, ALa; p = 0.05). In contrast, AT2a demonstrated the highest degree of conformity with HWE (p = 0.944). Regarding private alleles, some hunting districts in south-eastern Piedmont (i.e., ALa) displayed the highest number of private alleles (n = 10) along with the greatest allelic richness (Ar = 2.8), while areas from north-central Piedmont (i.e., VC1a) exhibited the lowest allelic richness (Ar = 2.21). The highest inbreeding coefficient (Fis = 0.2378) was recorded in north-western districts (i.e., TO3a), whereas the lowest coefficient (Fis = −0.2234) was recorded in north-central areas (i.e., VC2a), with a median Fis of 0.00039 across all hunting districts. Pairwise Fst comparisons revealed the greatest genetic differentiation between VC2a and IMa (Fst = 0.22). These two populations also exhibited the highest overall levels of genetic differentiation, with Fst values exceeding 0.05 in all pairwise comparisons except between VC2a and AL1a (Fst = 0.04). Genetic parameters are displayed in the supplementary material (Tables S3, S4 and Figure S1).
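As an illustration of the per-locus quantities reported above, the following minimal Python sketch computes observed heterozygosity, expected heterozygosity and Fis for one locus in one population. It is not the software used in the study: the function name `locus_stats` and the input layout are hypothetical, and the expected heterozygosity is the simple uncorrected form (one minus the sum of squared allele frequencies), whereas dedicated packages often apply a small-sample correction.

```python
from collections import Counter


def locus_stats(genotypes):
    """Return (H_obs, H_exp, F_is) for one locus in one population.

    genotypes: list of (allele_a, allele_b) tuples, one per diploid individual.
    Hypothetical helper for illustration; H_exp is the uncorrected estimator.
    """
    n = len(genotypes)
    if n == 0:
        raise ValueError("no genotypes supplied")
    # Observed heterozygosity: share of individuals carrying two different alleles.
    h_obs = sum(1 for a, b in genotypes if a != b) / n
    # Allele frequencies are taken over the 2n gene copies.
    counts = Counter(allele for genotype in genotypes for allele in genotype)
    total = 2 * n
    # Expected heterozygosity under Hardy-Weinberg proportions.
    h_exp = 1.0 - sum((c / total) ** 2 for c in counts.values())
    # F_is > 0 indicates a heterozygote deficit, F_is < 0 an excess.
    f_is = 1.0 - h_obs / h_exp if h_exp > 0 else float("nan")
    return h_obs, h_exp, f_is


# Toy data: microsatellite allele sizes for four individuals.
balanced = [(100, 100), (100, 104), (104, 104), (100, 104)]
all_het = [(100, 104)] * 4
```

With the toy data, `locus_stats(balanced)` gives equal observed and expected heterozygosity (Fis of zero), while `locus_stats(all_het)` gives a heterozygote excess (negative Fis), the same sign convention as the values reported for TO3a and VC2a.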

Population structure

The application of the Bayesian clustering approach resulted in the assignment of study individuals to two discrete clusters (ΔK = 87.35, standard deviation 4.43), with individuals assigned to Bayesian cluster 1 largely distributed in the Piedmont region (n = 343/410; 83.7%) whereas cluster 2 individuals were more evenly distributed, with a slight predominance in the Liguria region (n = 95/168; 56.5%). Within these clusters, we observed varying degrees of admixture or co-ancestry (Fig. 1); however, the assignment probabilities indicated that wild boar were strongly associated with one of the two genetic clusters. Indeed, most individuals demonstrated a high degree of certainty of assignment, with 95% exhibiting a probability greater than 0.99 for their respective cluster population (Fig. 2). Individuals assigned to Cluster 1 exhibited a mean probability of 0.932 (median = 0.972) of belonging to this cluster, while individuals from Cluster 2 demonstrated a mean probability of 0.928 (median = 0.974). A limited number of individuals (n = 6) displayed intermediate assignment values, mainly within the range of the 5th and 95th percentiles (i.e., 0.308–0.405 for cross-population assignments). These individuals, probably reflecting admixed ancestry, were mainly concentrated in hunting districts corresponding to genetic transition zones near the border of the two regions (Fig. 1); these included hunting districts encompassing the provinces of Genoa (GE1a, GE2a), Savona (SV1a-2a), Asti (AT1a-2a), Cuneo (CNa, CNc) and part of Alessandria (ALa). Additionally, a group of admixed individuals was detected in the more remote district of Verbano-Cusio-Ossola (VVCOc), representing an exception to this pattern.

Fig. 1

Admixture levels assuming K = 2 genetic clusters. (A) Genetic structure of Ligurian populations by hunting district. (B) Genetic structure of Piedmont populations by hunting district. (C) Genetic structure of the total population, with individuals ordered by cluster membership coefficients. Each vertical bar represents one individual, and colours indicate the proportional genetic assignment to each cluster. Figure generated with R 4.3.3 (www.r-project.org).

Fig. 2

Regional (left) and provincial (right) distribution of the probabilities assigned to wild boar individuals from north-western Italy to belong to the Bayesian Cluster 1, as determined by Bayesian clustering analysis. Figure generated with R 4.3.3 (www.r-project.org).

Following a more detailed analysis of genetic differences, we observed distinct spatial patterns, with marked genetic differentiation (Mantel test, rM = 0.29; p = 0.03). Hunting districts were grouped into five separate clusters, exhibiting moderate stability (average silhouette width = 0.44; Fig. 3) and different levels of genetic divergence. Notably, wild boar populations from hunting areas in Liguria showed clear genetic separation from those in Piedmont, evidenced by a median inter-cluster distance of 4.99 (dmin/dmax: 0.003–17.08; Fig. 3A-B; Supplementary Table S5). However, the genetic variability within each region was found to be non-uniform, with Ligurian wild boar showing higher intra-regional genetic distances (dmedian = 3.79; dmin/dmax: 0.58–16.1) compared to those recorded within Piedmont (dmedian = 0.83; dmin/dmax: 0.005–6.74; Supplementary Table S5). This divergence was particularly evident along CA-Axis 1 (16.73% explained variance), which showed a strong association with hunting districts from Piedmont (Fig. 3A). Most wild boar from hunting areas in Piedmont exhibited a high degree of genetic similarity, forming two well-defined clusters: Cluster 1 included lowland hunting districts from central and southern Piedmont, as well as the southernmost Alpine districts, while Cluster 3 was mainly formed by Alpine hunting districts of northern and north-western areas (Fig. 4). Both clusters displayed minimal pairwise distances (Cluster 1: dmedian = 0.58, dmin/dmax: 0.11–1.66; Cluster 3: dmedian = 0.74, dmin/dmax: 0.01–1.84), suggesting a relatively homogeneous genetic composition (Fig. 3B). However, certain districts in Piedmont showed more differentiated patterns, with signs of isolation, as for TO3a (Fis = 0.2378). Conversely, hunting districts from south-eastern areas (i.e., ALa) appeared to share more genetic similarities with adjacent Ligurian areas (Cluster 2; within-cluster dmedian = 1.99; dmin/dmax = 0.61–2.60; Fig. 4) than with other Piedmontese hunting areas (dmedian [ALa vs. rest of Piedmont] = 5.60; dmin/dmax = 4.42–6.74). Within Liguria, we identified three main groups of hunting districts (Figs. 3A and 4): Cluster 2, which comprised the hunting districts of Genoa province (GE1a and GE2a), along with the south-eastern Piedmont hunting district (ALa); Cluster 4, which encompassed most western hunting districts of the region (SV2a-3a and IMa); and Cluster 5, which comprised the easternmost areas of Liguria (SPa). The latter cluster showed high levels of genetic differentiation compared with the other Ligurian districts (Fig. 3B). In fact, SPa showed the highest pairwise distances, with median distances ranging from 10.8 with Cluster 2 up to 14.3 with Cluster 4 (Supplementary Table S5), suggesting a distinct genetic composition.
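The Mantel statistic reported at the start of this section measures the correlation between two distance matrices and assesses it by permutation. The sketch below is a generic one-sided permutation version in plain Python, assuming a genetic and a geographic distance matrix over the same units; the function names, the number of permutations and the toy coordinates are illustrative assumptions, not the settings or software used in the study.

```python
import random
from statistics import mean


def _upper(matrix, order):
    """Upper-triangle entries of a square matrix after reordering rows and columns."""
    n = len(order)
    return [matrix[order[i]][order[j]] for i in range(n) for j in range(i + 1, n)]


def _pearson(x, y):
    """Pearson correlation of two equal-length sequences (non-zero variance assumed)."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def mantel(gen, geo, permutations=999, seed=0):
    """Return (r_M, one-sided permutation p-value) for two distance matrices."""
    rng = random.Random(seed)
    identity = list(range(len(gen)))
    geo_vec = _upper(geo, identity)
    r_obs = _pearson(_upper(gen, identity), geo_vec)
    hits = 0
    for _ in range(permutations):
        perm = identity[:]
        rng.shuffle(perm)
        # Rows and columns of one matrix are permuted together.
        if _pearson(_upper(gen, perm), geo_vec) >= r_obs:
            hits += 1
    # The observed arrangement counts as one of the permutations.
    return r_obs, (hits + 1) / (permutations + 1)


# Toy example: six units on a line, genetic distance proportional to geographic distance.
xs = [0, 1, 3, 6, 10, 15]
geo = [[abs(a - b) for b in xs] for a in xs]
gen = [[2 * d for d in row] for row in geo]
r_m, p_value = mantel(gen, geo)
```

In this toy case the correlation is perfect by construction; with real data a value such as rM = 0.29 indicates a weaker but still positive association between genetic and geographic distance.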

As previously mentioned, evidence of genetic connectivity was identified between the two regions (Figs. 3B and 4). Specifically, genetic similarities were observed between certain hunting districts of Genoa (GE) and south-eastern Piedmont (ALa). A comparable pattern was observed within SV1a, which shared genetic characteristics with southern and central hunting districts of Piedmont, displaying a median genetic distance of 0.54 (dmin/dmax = 0.13–0.95) in the CA space (Fig. 3B). Despite these similarities, we detected global but not local structures when accounting for spatial correlation (pglobal = 0.0001 and plocal = 1). In particular, the results obtained from the connectivity network used in sPCA reinforced those obtained from Bayesian clustering (i.e., two main genetic clusters), suggesting two different genetic structures (Fig. 5): one population spread in Piedmont and the other predominately distributed in the Liguria region, with some areas of transition mainly occurring in south-eastern areas of Piedmont. The sPCA results were consistent across different neighbour definitions (k = 3–15), with similar eigenvalue magnitudes and Moran’s I values (Supplementary Table S6), indicating that the detected spatial genetic structure was robust to variations in the connectivity network.
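For readers less familiar with the spatial autocorrelation measure reported alongside the sPCA, the following sketch computes Moran's I on a k-nearest-neighbour network. It is a simplified stand-in: the row-standardised weights, the function names and the toy coordinates are assumptions for illustration, and the sketch operates on a single score per unit rather than on the multilocus data an sPCA would use.

```python
import math


def knn_weights(coords, k):
    """Row-standardised weight matrix linking each point to its k nearest neighbours."""
    n = len(coords)
    w = [[0.0] * n for _ in range(n)]
    for i, (xi, yi) in enumerate(coords):
        # Sort the other points by distance; ties are broken by index.
        nearest = sorted(
            (math.hypot(xi - xj, yi - yj), j)
            for j, (xj, yj) in enumerate(coords)
            if j != i
        )
        for _, j in nearest[:k]:
            w[i][j] = 1.0 / k
    return w


def morans_i(values, w):
    """Moran's I of `values` under weight matrix `w` (non-constant values assumed)."""
    n = len(values)
    centre = sum(values) / n
    dev = [v - centre for v in values]
    s0 = sum(sum(row) for row in w)
    num = sum(w[i][j] * dev[i] * dev[j] for i in range(n) for j in range(n))
    den = sum(d * d for d in dev)
    return (n / s0) * num / den


# Toy transect of six sampling points.
coords = [(float(i), 0.0) for i in range(6)]
w1 = knn_weights(coords, k=1)
clustered = morans_i([0, 0, 0, 1, 1, 1], w1)    # neighbours alike: positive I
alternating = morans_i([0, 1, 0, 1, 0, 1], w1)  # neighbours unlike: negative I
```

Positive values correspond to what sPCA terms global structure (neighbours more similar than expected) and negative values to local structure. Repeating the calculation for several values of `k` is one simple way to check that a pattern does not depend on the neighbour definition.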

Fig. 3

Graphical visualisation of the results of the Correspondence Analysis (CA) on the hunting districts of north-western Italy: (A) Distribution of hunting districts from Piedmont and Liguria regions based on CA space; (B) Heatmap of the pairwise similarity (Mahalanobis distance in the CA space) of genetic composition across hunting districts in the Piedmont and Liguria regions. Figure generated with R 4.3.3 (www.r-project.org).

Fig. 4

Geographical representation of the clustered study hunting districts in Piedmont and Liguria. Note: Due to the reduced number of samples analysed and their high allelic variability, TO3a was excluded from this analysis. Figure generated with QGIS 3.8.3 (https://qgis.org/).

Fig. 5

Spatial interpolation of sPCA individual lagged scores of the main two genetic structures of wild boar in Piedmont (blue) and Liguria (red) regions, north-western Italy. Figure generated with R 4.3.3 (http://www.r-project.org).

Spatiotemporal and directional patterns of ASF spread

Over a period of 38 months, the spatiotemporal scan identified 16 significant high-risk clusters of ASF cases among wild boar across the study area (p < 0.05; Table 1, based on 2,414 georeferenced detections). Figure 6 illustrates the high-risk ASF clusters in chronological order, beginning with the earliest (Cluster 1), detected in December 2021, and extending to the most recent (Cluster 16), in January 2025.

Table 1 Clusters of ASF in wild boar reported in north-western Italy from 29 December 2021 to 11 March 2025, identified using the space–time permutation model of the scan statistic. Detailed information regarding each spatio-temporal cluster.

The initial ASF detection, Cluster 1, occurred in a transitional area along the Liguria-Piedmont border (i.e., between the A26 and A7 motorways), where we identified admixed individuals and zones of connectivity between the two main wild boar populations. This cluster exhibited a radius of 13.8 km, persisted for 29 days, and involved 32 wild boar. It was the only cluster identified in 2022, corresponding with the initial ASF incursion in the study area. Throughout the observed timeline, several geographic areas experienced recurrent clustering at different time points, particularly in north-eastern areas, aligning with identified routes of genetic connectivity and with multiple events detected between late 2023 and mid-2024 (Table 2). This phenomenon was further substantiated by the spatio-temporal distribution and recurrence of clusters within the same or neighbouring municipalities, characterised by events of short duration and high intensity but also by more extensive, long-duration clusters (Table 2). In 2023, five distinct clusters of varying sizes and durations were identified, including a significant event centred in the Ligurian locality of Dego (Cluster 3), which encompassed 80 cases, extended across a 23.2 km radius, and persisted for nearly a month (Table 1). The epidemic further expanded in 2024 with the detection of eight new clusters (Clusters 7–14), which exhibited moderate to high case counts, particularly in eastern areas, and spread into neighbouring territories of the Lombardy and Emilia-Romagna regions (Fig. 6). Of these, Cluster 13, located in Albareto (June 2024), was the most notable despite its comparatively brief duration of only six days (Table 1). The final phase of the study period, in early 2025, revealed two additional disease clusters (Clusters 15 and 16), which occurred along the eastern front of the study area, thereby reinforcing the temporal progression observed over the study period.
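The space–time permutation model named in the caption of Table 1 needs only case counts: the expected number of cases in a space-time cylinder is derived from the marginal totals per location and per time unit, and candidate cylinders are ranked by a likelihood ratio. The sketch below illustrates this logic under simplifying assumptions; the data layout and function names are hypothetical, the toy counts are invented for the example, and the scanning over all cylinders and the Monte Carlo significance step of a full implementation are omitted.

```python
import math


def expected_cases(counts, zones, days):
    """Expected cases in the cylinder zones x days under space-time independence.

    counts: dict mapping (zone, day) -> number of cases (hypothetical layout).
    """
    total = sum(counts.values())
    by_zone, by_day = {}, {}
    for (zone, day), c in counts.items():
        by_zone[zone] = by_zone.get(zone, 0) + c
        by_day[day] = by_day.get(day, 0) + c
    # Each zone-day cell expects (zone total * day total) / grand total cases.
    return sum(by_zone.get(z, 0) * by_day.get(d, 0) for z in zones for d in days) / total


def log_likelihood_ratio(observed, expected, total):
    """Log of the generalised likelihood ratio; zero unless observed exceeds expected."""
    if observed <= expected:
        return 0.0
    inside = observed * math.log(observed / expected)
    rest = total - observed
    outside = rest * math.log(rest / (total - expected)) if rest > 0 else 0.0
    return inside + outside


# Invented toy counts: zone A is over-represented on day 1.
toy = {("A", 1): 8, ("A", 2): 1, ("B", 1): 1, ("B", 2): 10}
mu = expected_cases(toy, zones=["A"], days=[1])
llr = log_likelihood_ratio(toy[("A", 1)], mu, sum(toy.values()))
```

Here zone A on day 1 holds 8 cases against an expectation of 4.05, so the cylinder scores a positive log likelihood ratio; in a full analysis that score would be compared with scores from randomly permuted case dates to obtain a p-value.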

Fig. 6

This figure combines the results of several analyses: admixture levels for K = 2 (yellow and blue triangles), sPCA clusters (hunting districts coloured by their assignment to one of the five identified clusters) and the spatio-temporal ASF spread, with the significant high-risk clusters shown as black circles. Based on the Bayesian analysis, the blue triangles represent wild boar primarily belonging to Bayesian cluster 1, while the yellow triangles represent wild boar from Bayesian cluster 2. Figure generated with QGIS 3.8.3 (https://qgis.org/).

Table 2 Characteristics of the spatio-temporal clusters of ASF detected in wild boar in north-western Italy between December 2021 and March 2025.

Only four of the 16 ASF clusters showed significant directional patterns: Clusters 2 (Grondona), 3 (Dego), 8 (Alice Bel Colle) and 16 (Cerano) (Table 2; Fig. 7). Clusters 2 and 16 consistently displayed westward spread patterns, as evidenced by their high concentration indexes (R̅ = 0.42 and 0.43, respectively). Conversely, Clusters 3 and 8 showed eastward and north-eastward spread patterns, respectively, although with a less pronounced tendency (R̅ = 0.24 and 0.28, respectively). The remaining clusters did not exhibit significant directionality, generally corresponding to periods of limited geographic expansion or localised persistence, often occurring within previously affected or genetically interconnected areas.
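The concentration index R̅ quoted above is the mean resultant length of a set of directions: it approaches 1 when all directions agree and 0 when they are spread evenly around the compass. The sketch below shows how R̅ and the mean direction can be computed from a list of bearings, together with a large-sample Rayleigh-type approximation for the p-value. The choice of the Rayleigh test, the function name and the toy bearings are assumptions for illustration; the text does not state which circular test was applied.

```python
import math


def circular_summary(bearings_deg):
    """Return (mean direction in degrees, mean resultant length, approximate p-value).

    bearings_deg: compass bearings in degrees (0 = north, 90 = east).
    The p-value uses the first-order approximation exp(-n * R^2) and is
    only indicative for small samples.
    """
    n = len(bearings_deg)
    c = sum(math.cos(math.radians(b)) for b in bearings_deg) / n
    s = sum(math.sin(math.radians(b)) for b in bearings_deg) / n
    r_bar = math.hypot(c, s)
    # The mean direction is not meaningful when r_bar is close to zero.
    mean_dir = math.degrees(math.atan2(s, c)) % 360.0
    p_approx = math.exp(-n * r_bar ** 2)
    return mean_dir, r_bar, p_approx


# Toy bearings: a purely westward set and an evenly spread set.
westward = circular_summary([270.0] * 10)
spread = circular_summary([0.0, 90.0, 180.0, 270.0])
```

The westward set returns a mean direction of 270 degrees with R̅ of 1, whereas the evenly spread set returns R̅ of essentially 0 and an approximate p-value near 1. Intermediate values such as 0.24 to 0.43 indicate a preferred direction with considerable scatter around it.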

Fig. 7

Directional spread patterns of ASF outbreaks in wild boar populations in north-western Italy (29 December 2021 to 11 March 2025). Each circle delineates a spatial cluster of ASF cases, with larger circles corresponding to higher case counts. Clusters highlighted by coloured points and arrows represent those with statistically significant directional patterns, suggesting non-random spread of infection. Grey circles without symbols indicate clusters with no significant directionality of disease spread. Figure generated with R 4.3.3 (http://www.r-project.org).

Discussion

A growing body of evidence indicates that differences in connectivity between wild boar populations exert a pivotal influence on the dynamics of infectious diseases in wildlife34,37,38,39,40. Our study provides a comprehensive insight into the genetic structure reflecting long-term effective connectivity between wild boar populations and the spatiotemporal dynamics of ASF outbreaks in north-western Italy. Through the integration of microsatellite-based genotyping and spatial cluster analysis, we identified two main genetic clusters corresponding broadly to the analysed regions of Piedmont (Bayesian Cluster 1) and Liguria (Bayesian Cluster 2), with areas of genetic admixture occurring along the regional border (Figs. 1 and 6). Over a 38-month period, we detected 16 ASF clusters displaying clear patterns of progressive geographic expansion27, primarily progressing in a non-random north-eastward direction. ASF clusters notably increased in frequency and spatial extent over time. The initial outbreak, detected in early winter 2021, was geographically confined to the Liguria-Piedmont border; however, subsequent clusters expanded predominantly in north-eastern areas, where interconnection patterns between the two genetic clusters were detected. Moreover, ASF clusters frequently overlapped with previously affected areas, suggesting local virus persistence and, in some cases, endemic circulation.

Only four of the sixteen identified clusters exhibited significant directionality in the circular tests. This pattern likely reflects differences in the underlying transmission dynamics. Non-significant clusters often corresponded to phases of local persistence or saturation, when ASF detections repeatedly occurred within the same or neighbouring areas, producing multidirectional or stationary spread rather than a coherent advancing front34,41. Under such conditions, limited animal movement and carcass-mediated environmental persistence can sustain local circulation without a dominant spatial trajectory42,43,44. Additionally, clusters characterised by small case numbers, short durations or compact (nearly circular) spatial footprints have inherently lower statistical power to detect a preferred direction. These factors together may explain the absence of significant directionality in several clusters, despite ongoing local transmission.

Overall, the combination of directional and non-directional clusters indicates that ASF dissemination alternated between phases of multidirectional local persistence and periods of more directional expansion, reflecting shifts between short-range and long-range transmission dynamics40,41,45. These contrasting phases mirror the dual mechanisms governing disease spread in north-western Italy: local persistence sustained by limited dispersal and environmental contamination, and long-range propagation facilitated by population connectivity. In genetically homogeneous zones, strong social cohesion and carcass-mediated persistence likely promoted within-group infection recycling, producing multidirectional or stationary patterns. Conversely, in admixed or highly connected areas, movements along ecological corridors such as the Ligurian Apennines39 amplified transmission and generated coherent spread fronts (Clusters 2, 3, 8 and 16). Similar associations between habitat connectivity and the spatial coherence of ASF spread have been reported elsewhere in Europe, where high-connectivity landscapes promoted faster or more directional epidemic progression40, while modelling studies suggest that disrupting such corridors can slow transmission pathways46.

Genetic evidence lends further support to this interpretation. Gene flow from Piedmont into the western Ligurian provinces (Savona and Genoa) is reflected by admixed individuals, with some wild boar in southern Piedmont displaying genotypes indicative of interbreeding with Ligurian populations. The recurrence of ASF outbreaks in these areas reinforces the role of population connectivity as a driver of disease spread. By contrast, peripheral provinces such as Imperia and La Spezia, which exhibited a minimal degree of admixture, remained free of ASF, suggesting that fragmented coastal landscapes and mountainous terrain may have restricted the connectivity of these populations and acted as epidemiological barriers. Such long-distance dispersal is typically driven by juvenile or subadult movement, which reduces the influence of local relatedness as distance increases3,34,47. Environmental persistence of the virus, through carcasses and contaminated substrates48,49,50,51, may therefore act as a key amplifier of this process by prolonging local infectivity and enabling indirect transmission across time and space. Together, the combined effects of social structure, carcass-mediated environmental stability, and corridor-driven dispersal appear sufficient to perpetuate long-term ASF persistence and facilitate the spatial expansion of the epidemic front43. These dual processes (family-based persistence and corridor-mediated expansion) thus offer a coherent framework linking genetic structure, landscape connectivity and the observed spatiotemporal dynamics of ASF.

The spatial configuration of ASF outbreaks observed in the present study appears to be closely linked to wild boar population structure and landscape connectivity. Comparable dynamics have been documented in Germany, where ASF spread was strongly modulated by landscape barriers between genetic groups37. In our study area, the two main genetic clusters – Piedmont and Liguria – are separated by a mosaic of natural and anthropogenic features, including rivers, mountains, and urbanised areas. While coastal zones show marked fragmentation, the inland Ligurian Apennines form a continuous corridor across the regional border, serving as a major conduit for both gene flow and viral transmission. This spatial heterogeneity likely explains why ASF persisted and expanded along interior corridors but remained absent from peripheral, isolated landscapes – a pattern consistent with other European regions, where highly connected habitat patches function as stepping stones for ASF propagation40.

These results also highlight the epidemiological relevance of genetic admixture zones, where individuals from both genetic clusters co-occur and interbreed. Such zones may be particularly vulnerable to disease spread due to increased contact rates, overlapping home ranges, and the coalescence of different social groups3,34,47. While we interpret these areas as potential transmission hubs, alternative explanations should also be considered. The genetic admixture observed may partly reflect long-term management legacies, such as historical translocations or restocking practices, that have shaped the background genetic structure of Piedmont’s wild boar populations52,53,54. By contrast, the genetic history of Ligurian populations remains less well-characterised; previous studies reported unique lineages that likely expanded from central Italian refugia, though long-distance gene flow appears limited53,54. This historical structuring may therefore have preconditioned the genetic landscape on which ASF dynamics unfolded.

Nevertheless, the spatial and temporal coincidence between admixture zones and ASF clusters suggests that ongoing connectivity, rather than past introductions alone, underlies current patterns of spread. Although uneven carcass detection or sampling intensity might contribute to apparent clustering, several lines of evidence point to a biological rather than artefactual explanation: admixture zones were identified independently of disease data, exhibited high genetic connectivity across contiguous habitats, and overlapped with ASF clusters detected months apart. These findings support a model of heterogeneous ASF expansion – facilitated by permeable corridors and constrained by long-standing ecological boundaries.

In summary, ASF dissemination in north-western Italy appears to mirror the underlying genetic and ecological connectivity of wild boar populations. Although the genetic sampling (January 2022-October 2023) only partially overlapped with the ASF surveillance period (December 2021-March 2025), this temporal difference is unlikely to affect our interpretation. Population genetic structure in wild boar is expected to remain stable over such a short timescale, reflecting long-term dispersal and connectivity processes rather than transient demographic fluctuations. Any outbreak-related changes in structure would likely become evident only in cohorts born after the study period. The inclusion of ASF surveillance data up to March 2025 was therefore intended to provide the most recent complete epidemiological context for interpreting spatial relationships between population connectivity and disease spread.

We acknowledge several limitations. Our analyses relied on global ancestry estimation, which does not distinguish recent from historical admixture; conducting local ancestry analyses in future studies would help disentangle these processes. The use of hunting-derived samples may also have introduced demographic bias, as juveniles, which are potential recent dispersers, were likely underrepresented, potentially leading to an underestimation of fine-scale connectivity. Additionally, variation in sample size across hunting districts, linked to carcass availability and detection effort, may have reduced local resolution; results from under-sampled areas should therefore be interpreted cautiously. Future work integrating local ancestry inference, balanced demographic sampling, and explicit modelling of landscape barriers will be essential to clarify how geographic and ecological features shape wild boar movement and ASF transmission. Such integrative approaches will enhance the predictive capacity of surveillance systems and support the design of spatially targeted control strategies.

Overall, our results highlight how the genetic and spatial organisation of wild boar populations directly influences ASF epidemiology. In conclusion, this study provides direct evidence that the spread of ASF in north-western Italy has been shaped by the underlying population structure and connectivity networks of wild boar. ASF clusters frequently overlapped with zones of genetic admixture and connectivity hubs, particularly along the Ligurian Apennines, where gene flow between the Piedmont and Ligurian populations was evident. These findings integrate genetic, spatial and epidemiological perspectives, showing that the interaction between social structure, environmental persistence and dispersal corridors governs both local maintenance and long-range expansion of the virus.

Disease dynamics therefore appear to be driven by two complementary mechanisms: short-range transmission within kin-related groups and longer-range dispersal along ecological corridors. Recognising these spatial and genetic drivers is essential to design effective and sustainable control measures. Management actions should be grounded in a deep understanding of population structure, movement ecology and landscape connectivity, enabling interventions that target biologically meaningful units rather than administrative boundaries.

Because ASF is a transboundary disease, effective containment requires coordination at local, regional and international levels, including policy harmonisation, real-time data sharing and cross-border collaboration. Integrating genetic monitoring into routine disease surveillance would provide critical insights into population connectivity and isolation, supporting the development of spatially informed control strategies and the prioritisation of high-risk transition zones for targeted interventions.

Methods

Study area

The study area is located in north-western Italy, encompassing an area of approximately 37,000 km2 within the regions of Piedmont and Liguria (Fig. 8). It extends between 43.8°N–46.5°N latitude and 6.5°E–9.5°E longitude, bordered by Aosta Valley and Switzerland to the north, France to the west, the Ligurian Sea to the south and Lombardy and Emilia-Romagna to the east. The area’s topography and climate exhibit significant heterogeneity. Piedmont is characterised by mountainous terrain in the north and west, the Po Valley in the centre and hilly landscapes in the south-east. The climate is predominantly humid temperate, with four distinct patterns55: (1) prealpine, the most widespread, featuring peak rainfall in spring and autumn with relatively balanced summer and winter precipitation; (2) subalpine, typical of the northernmost areas, where autumn rainfall exceeds spring levels, and summer precipitation is significantly higher than in winter; (3) subcontinental, prevalent in the north-west, where summer rainfall is nearly equal to spring and autumn peaks; and (4) sublittoral, dominant in the south-eastern part and Alta Val di Susa, with a pronounced autumn rainfall maximum, humid winters and hot summers with occasional arid tendencies. By contrast, Liguria is defined by a narrow coastal strip and a mountainous hinterland shaped by the Ligurian Apennines. The Mediterranean climate of the region is characterised by mild, occasionally humid winters along the coast, while the interior experiences colder, snowier conditions. Summers are hot and sultry along the coast, but cooler and breezy conditions prevail in the valleys. Rainfall is most abundant in spring and autumn, driven by humid southern currents that bring heavy orographic precipitation, whereas northern winds create drier, foehn-like conditions56.

Fig. 8

Geographical distribution of the study hunting districts in the Piedmont and Liguria regions. Hunting districts designated with ‘c’ correspond to Alpine wildlife zones while those marked with ‘a’ represent lowland wildlife zones, as defined by the Italian Law 157/1992. To enhance the representativeness of the number of individuals tested per area, some original hunting districts (dotted boundaries) were merged and treated as a single hunting area. This applies to the districts CNc, CNa, TOBc, VVCOc, NOa, and ALa. Figure generated with QGIS 3.8.3 (https://qgis.org/).

Sampling design and data collection

Blood and tissue samples were collected from deceased wild boar between January 2022 and October 2023 as part of the nationwide passive surveillance programme against ASF in wild boar. No live animals were sampled, nor were any animals hunted or otherwise killed specifically for the purposes of this study. The study followed a stratified random sampling design with a fixed number of units per stratum. The strata originally comprised 46 hunting districts in the Piedmont and Liguria regions. However, to improve the representativeness of areas from which only a few individuals had been sampled, certain districts were merged with adjacent ones and treated as a single hunting unit. Merging was based on both geographic contiguity and ecological similarity, considering habitat continuity, topographic features and administrative boundaries. Districts with fewer than 30 sampled individuals were merged with neighbouring districts from the same municipality that shared continuous forested or rural landscapes and belonged to the same ecological corridor. This approach aimed to minimise artificial fragmentation of genetically homogeneous areas and to ensure a sufficient sample size for reliable estimation of genetic parameters. The merging resulted in a total of 26 study hunting districts (Fig. 8).

Wild boar individuals were sampled at random, without replacement, and the group of animals from each hunting area formed the population unit for analysis. To ensure that the entire area of the two regions was uniformly represented in the analysis, the dataset from which the random sample was extracted was organised according to the hunting district delineation of the two regions. The randomisation process was executed using Stata 1857. Sample size per district ranged from 3 to 68 individuals, reflecting the availability of carcasses and surveillance intensity within each area. Although this variability is inherent to passive surveillance, the analyses performed here incorporated resampling (i.e., bootstrap) and population-level approaches designed to reduce bias associated with unequal sample sizes.
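The district-stratified draw described above can be illustrated with a minimal Python sketch. It is not the authors' Stata procedure: the record layout, the district codes and the per-stratum quota of 30 are hypothetical, chosen only to show random selection without replacement within each stratum, with small strata contributing every available individual.

```python
import random
from collections import Counter, defaultdict

def stratified_sample(records, n_per_stratum, seed=0):
    """Draw up to n_per_stratum records per district, without replacement."""
    rng = random.Random(seed)  # fixed seed so the draw is reproducible
    by_district = defaultdict(list)
    for record in records:
        by_district[record["district"]].append(record)
    sample = []
    for district in sorted(by_district):
        pool = by_district[district]
        k = min(n_per_stratum, len(pool))  # small strata contribute all they have
        sample.extend(rng.sample(pool, k))  # random.sample never repeats an item
    return sample

# Hypothetical carcass records: district codes and counts are invented.
records = ([{"id": f"A{i}", "district": "D1"} for i in range(50)]
           + [{"id": f"B{i}", "district": "D2"} for i in range(3)])
sample = stratified_sample(records, n_per_stratum=30)
counts = Counter(r["district"] for r in sample)
print(dict(counts))
```

Capping the draw at the stratum size mirrors the situation reported here, where some districts yielded only a handful of carcasses.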

DNA extraction and genotyping

Genomic DNA was extracted from blood and tissue samples using the Maxwell RSC Viral Total Nucleic Acid Purification Kit (Promega Corporation) according to the manufacturer’s instructions. Two multiplex PCR amplifications (Mix 1 and Mix 2) were performed, each optimised with a different primer annealing temperature. A total of 14 microsatellite markers were amplified: 11 recommended by the FAO58 guidelines for pigs and three additional markers according to Reiner et al.35 (Supplementary Table S1). The PCR reaction was performed in a volume of 10 µl consisting of 5 µl of 2 × Type-it Multiplex PCR Master Mix (Qiagen, Germany), 4 µl of 10 × primer mix and 1 µl (5 ng) of extracted DNA. Thermal cycling conditions consisted of an initial denaturation at 95 °C for 5 min, followed by 26 cycles of denaturation at 94 °C for 30 s, annealing at 57–60 °C for 90 s (depending on the mix) and extension at 72 °C for 30 s. To 1 µl of PCR product, 0.5 µl of GeneScan 500 LIZ, a size standard consisting of fluorophore-labelled single-stranded molecules (Applied Biosystems), and 9 µl of Hi-Di formamide were added. Genotyping was performed by capillary electrophoresis on a SeqStudio 24 Flex Genetic Analyser (Applied Biosystems), and allele sizes were determined using GeneMapper Software 6 (Applied Biosystems).

Analysis of the wild boar population structure

A preliminary quality control procedure was performed on the data obtained from the 843 genotyped individuals. To ensure a robust and consistent analysis, we excluded loci that could not be genotyped in > 1% of individuals, as well as individuals for which both alleles could not be genotyped, allowing a maximum of three such loci of the microsatellite panel per individual. Following this selection process, the final dataset consisted of genotyping data from 578 individuals, accompanied by detailed geographical information (i.e., latitude and longitude coordinates) on the location where each individual was found dead and the population to which it belonged.

Population genetic parameters were analysed using R version 4.3.359. Null allele frequencies were estimated using the “null.all” function from the PopGenReport package60 with 1000 bootstrap replicates and evaluated according to the criteria outlined by Gruber & Adamack61. According to these criteria, if the bootstrap percentile interval (upper bound: 97.5th percentile) includes zero for every locus, the frequency of null alleles does not differ significantly from zero at any locus. Hardy-Weinberg equilibrium (HWE) was assessed with an exact test based on Monte Carlo allele permutations, implemented via the “hw.test” function from the pegas package62 with 1000 iterations. Private alleles were identified with the “private_alleles” function of the poppr package63. Allelic richness, observed and expected heterozygosity, and the inbreeding coefficient (Fis) were calculated using the “divBasic” function from the diveRsity package, incorporating 1000 bootstrap replicates for Fis confidence intervals. Pairwise population comparisons for Fst (i.e., the fixation index, a measure of population differentiation) were conducted using the “fastDivPart” function from the same package, based on the Weir and Cockerham method64.
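For readers less familiar with these summary statistics, the Python sketch below shows the textbook definitions for a single locus: observed heterozygosity (Ho) as the proportion of heterozygous genotypes, expected heterozygosity (He) as one minus the sum of squared allele frequencies, and Fis = 1 − Ho/He. It is an illustration only; the estimators implemented in diveRsity may apply sample-size corrections that are not reproduced here, and the allele sizes are invented.

```python
from collections import Counter

def locus_stats(genotypes):
    """Return (Ho, He, Fis) for one locus from a list of (allele, allele) tuples."""
    n = len(genotypes)
    # Observed heterozygosity: share of individuals carrying two different alleles.
    ho = sum(a != b for a, b in genotypes) / n
    # Allele frequencies pooled over the 2n gene copies.
    allele_counts = Counter(allele for genotype in genotypes for allele in genotype)
    total = 2 * n
    # Expected heterozygosity under Hardy-Weinberg proportions (no bias correction).
    he = 1.0 - sum((count / total) ** 2 for count in allele_counts.values())
    fis = 1.0 - ho / he if he > 0 else float("nan")
    return ho, he, fis

# Invented microsatellite genotypes (allele sizes in base pairs).
balanced = [(150, 150), (150, 154), (154, 154), (150, 154)]
homozygous_only = [(150, 150), (154, 154)]
ho_bal, he_bal, fis_bal = locus_stats(balanced)
ho_hom, he_hom, fis_hom = locus_stats(homozygous_only)
print(ho_bal, he_bal, fis_bal)
print(ho_hom, he_hom, fis_hom)
```

A positive Fis, as in the second toy sample, indicates a deficit of heterozygotes relative to Hardy-Weinberg expectations.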

To evaluate the global genetic structure of the wild boar population, an admixture model with correlated allele frequencies was implemented in STRUCTURE (version 2.3.4)65,66. This model was selected because of the broad home range of the study species, which is due in part to its behavioural plasticity and capacity to adapt to anthropogenic and climatic pressure67,68,69. These characteristics could allow natural connectivity among neighbouring hunting districts in our study area (Faustini et al., 2025) and, consequently, admixture among individuals. The number of clusters (K) was tested from 1 to 10 using Markov chain Monte Carlo (MCMC) simulations of 200,000 iterations after a burn-in of 100,000 iterations, with 10 independent replicates for each K value. The optimal K values were determined using the Python (version 3.12.2) program STRUCTURE HARVESTER (version 0.7)70 based on Ln Pr(X|K) and Evanno’s ∆K method71. The STRUCTURE output was visualised using the StructuRly package 1072.
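The ∆K criterion can be sketched in a few lines of Python. The version below is a simplified illustration, not the STRUCTURE HARVESTER code: it takes the replicate Ln Pr(X|K) values for each K, computes the absolute second-order difference of the replicate means, and divides it by the standard deviation across replicates at that K. All likelihood values are invented, and the function name `evanno_delta_k` is hypothetical.

```python
from statistics import mean, stdev

def evanno_delta_k(lnp_by_k):
    """Map each interior K to |L(K+1) - 2 L(K) + L(K-1)| / sd(L(K)).

    lnp_by_k: dict {K: [Ln Pr(X|K) of each replicate run]}.
    L(.) denotes the mean over replicates; the first and last K have no delta K.
    """
    means = {k: mean(values) for k, values in lnp_by_k.items()}
    delta = {}
    for k in sorted(lnp_by_k):
        if k - 1 in means and k + 1 in means:
            second_diff = abs(means[k + 1] - 2 * means[k] + means[k - 1])
            # stdev is the sample standard deviation across replicate runs;
            # identical replicates (sd = 0) would need separate handling.
            delta[k] = second_diff / stdev(lnp_by_k[k])
    return delta

# Invented replicate log-likelihoods for K = 1..4 (three runs each).
lnp = {
    1: [-1000.0, -1002.0, -998.0],
    2: [-900.0, -901.0, -899.0],
    3: [-890.0, -895.0, -885.0],
    4: [-888.0, -880.0, -896.0],
}
delta_k = evanno_delta_k(lnp)
best_k = max(delta_k, key=delta_k.get)
print(delta_k, best_k)
```

Because ∆K is undefined at the smallest and largest K tested, it cannot by itself support K = 1, which is one reason to inspect Ln Pr(X|K) alongside it.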

To further explore the genetic structure at a finer scale, we performed a correspondence analysis (CA) based on the geographic origin of individuals (i.e., hunting district). To evaluate the similarities between hunting areas (i.e., potential connectivity patterns), the factor scores obtained from the CA were used to calculate a pairwise Mahalanobis distance matrix, which was then subjected to hierarchical clustering using the ward.D method, which aims to minimise within-group variance. To validate the consistency of the clusters identified, silhouette scores were calculated to evaluate the appropriateness of the n-cluster solution. This approach was implemented using the R packages ade473 and stats59.
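The silhouette criterion used to judge a clustering solution can be illustrated with a short Python sketch. It assumes a precomputed pairwise distance matrix and a cluster label per hunting area; the four toy "areas" and their distances are invented, and the code is not the R implementation used in the study. For each unit, a is the mean distance to the other members of its own cluster, b is the smallest mean distance to any other cluster, and the score is (b − a) / max(a, b).

```python
def silhouette_scores(dist, labels):
    """Silhouette score per unit from a distance matrix and cluster labels.

    Requires at least two clusters; singleton clusters score 0 by convention.
    """
    clusters = {}
    for index, label in enumerate(labels):
        clusters.setdefault(label, []).append(index)
    scores = []
    for i, label in enumerate(labels):
        own = [j for j in clusters[label] if j != i]
        if not own:
            scores.append(0.0)
            continue
        a = sum(dist[i][j] for j in own) / len(own)  # cohesion
        b = min(  # separation from the nearest other cluster
            sum(dist[i][j] for j in members) / len(members)
            for other, members in clusters.items() if other != label
        )
        scores.append((b - a) / max(a, b))
    return scores

# Toy example: four areas placed on a line at positions 0, 1, 10 and 11.
positions = [0, 1, 10, 11]
dist = [[abs(p - q) for q in positions] for p in positions]
good = silhouette_scores(dist, [0, 0, 1, 1])
poor = silhouette_scores(dist, [0, 1, 0, 1])
mean_good = sum(good) / len(good)
mean_poor = sum(poor) / len(poor)
print(mean_good, mean_poor)
```

Scores close to 1 indicate compact, well-separated groups, while negative values flag units that sit closer to another cluster than to their own.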

To assess genetic variability across space, we first evaluated the role of geographic distance in shaping the genetic differentiation of wild boar. Isolation-by-distance patterns were tested with a Mantel test with 999 permutations, using the ade4 package. This analysis was performed at the population level (i.e., hunting district) to reduce noise from individual-level variation and to minimise the potential influence of variable sample size. Secondly, we conducted a spatial Principal Component Analysis (sPCA) of individuals to evaluate the spatial organisation of genetic variation in our study area using the adegenet package74,75. This multivariate method jointly considers genetic variance and spatial autocorrelation, allowing the detection of both global and local genetic structures. The method requires a user-defined connectivity network among sampling points; we selected a network based on the five closest neighbours (k = 5), which ensured a fully connected yet minimally redundant graph, thereby preserving local spatial relationships without over-smoothing the spatial signal. To evaluate the robustness of this parameter choice, we repeated the analysis with k values ranging from 3 to 15. The relative magnitude and ranking of eigenvalues and Moran’s I values remained stable across all tested configurations (Supplementary Table S6), indicating that the inferred spatial genetic structure was not sensitive to network density and that k = 5 provided an optimal balance between resolution and connectivity. Finally, the statistical significance of global or local structures was assessed through a Monte Carlo procedure with 9,999 permutations76, and lagged scores from the sPCA were spatially interpolated to visualise spatial patterns of genetic variation.
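The logic of the Mantel permutation test can be shown with a minimal Python sketch. It is not the ade4 implementation: it assumes two symmetric distance matrices (genetic and geographic) over the same districts, uses the Pearson correlation of their upper triangles as the statistic, and builds the null distribution by permuting the rows and columns of one matrix together. The one-sided alternative (positive association) and the toy distances are assumptions made for illustration.

```python
import random
from math import sqrt

def _upper(matrix, order):
    """Upper-triangle entries after relabelling units according to `order`."""
    n = len(order)
    return [matrix[order[i]][order[j]] for i in range(n) for j in range(i + 1, n)]

def _pearson(x, y):
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def mantel(gen, geo, permutations=999, seed=0):
    """Return (observed r, one-sided permutation p-value)."""
    rng = random.Random(seed)
    identity = list(range(len(gen)))
    geo_vec = _upper(geo, identity)
    r_obs = _pearson(_upper(gen, identity), geo_vec)
    hits = 0
    for _ in range(permutations):
        perm = identity[:]
        rng.shuffle(perm)  # relabel districts: rows and columns move together
        if _pearson(_upper(gen, perm), geo_vec) >= r_obs:
            hits += 1
    return r_obs, (hits + 1) / (permutations + 1)

# Invented example: seven districts along a transect (positions in km).
sites_km = [0, 12, 25, 41, 60, 83, 110]
n = len(sites_km)
geo = [[abs(a - b) for b in sites_km] for a in sites_km]
# Hypothetical pairwise Fst: increases with distance plus a small perturbation.
gen = [[0.0 if i == j else 0.01 + 0.001 * geo[i][j] + 0.002 * ((i * j) % 3)
        for j in range(n)] for i in range(n)]
r_obs, p_value = mantel(gen, geo)
print(round(r_obs, 3), p_value)
```

Permuting whole rows and columns, rather than individual matrix cells, preserves the dependence among distances that share a district, which is the reason an ordinary correlation test is not appropriate here.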

Spatiotemporal and directional analysis of ASF spread

The spread of ASF in the study area over time was evaluated through analysis of the ASF surveillance data. These data encompassed all wild boar that tested positive for the ASF virus genome by real-time PCR77 in the Piedmont and Liguria regions from 29/12/2021 to 05/03/2025. A retrospective space-time permutation scan model78 was used to identify spatiotemporal ASF clusters and to study the dynamics of ASF transmission. A cylindrical scan window of variable radius and constant height was used, with the height representing the time period searched for high-risk clusters. The scanning window moved in time and space to cover every possible geographic location and time interval. To capture shorter-term variations, we explored time aggregations of seven days. The maximum spatial cluster size was defined as 50% of the total number of ASF cases and the maximum temporal cluster size was defined as 30 days. High-rate clusters were restricted to those containing at least five ASF cases. The log-likelihood ratio statistic (LLR) was employed to evaluate whether the scan window contained a clustered area, and Monte Carlo simulations (999 replicates) were used to evaluate the significance of the detected clusters at a p-value < 0.05. The number of cases within a hypothetical cluster was then compared with that expected under the null hypothesis. A cluster can exhibit a high trend if the rate of occurrence within the cluster is increasing more rapidly than outside the cluster. Likewise, a high trend may also occur if the rate within the cluster is decreasing less rapidly than outside the cluster79. Because this model analyses only ASF-positive detections and does not include the underlying population at risk, relative risk values and confidence intervals cannot be estimated. The identified clusters therefore represent significant aggregations of cases rather than areas of quantified risk.
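To make the comparison between observed and expected cases concrete, the Python sketch below evaluates a single candidate cylinder under the space-time permutation model. It assumes cases have already been tabulated by zone and time period; the table, the zone and period indices and the function name are hypothetical, and the Monte Carlo step that assigns significance is omitted. Under this model the expected count in a zone-period cell is the product of the zone total and the period total divided by the overall number of cases, so no population-at-risk data are required.

```python
from math import log

def cylinder_llr(counts, zones, periods):
    """Observed cases, expected cases and log-likelihood ratio for one cylinder.

    counts[z][d] = number of cases in zone z during period d.
    The cylinder is the set of cells {(z, d): z in zones, d in periods}.
    """
    total = sum(sum(row) for row in counts)
    zone_totals = [sum(row) for row in counts]
    period_totals = [sum(row[d] for row in counts) for d in range(len(counts[0]))]
    observed = sum(counts[z][d] for z in zones for d in periods)
    # Expected count when the timing of cases is independent of their location.
    expected = sum(zone_totals[z] * period_totals[d]
                   for z in zones for d in periods) / total
    if observed <= expected:
        return observed, expected, 0.0  # only high-rate cylinders are of interest
    llr = observed * log(observed / expected)
    if total > observed:  # avoid log(0) when the cylinder holds every case
        llr += (total - observed) * log((total - observed) / (total - expected))
    return observed, expected, llr

# Invented table: 3 zones (rows) by 4 seven-day periods (columns).
counts = [
    [8, 1, 1, 0],
    [1, 2, 2, 1],
    [1, 1, 1, 1],
]
observed, expected, llr = cylinder_llr(counts, zones=[0], periods=[0])
print(observed, expected, llr)
```

In a full scan this statistic would be maximised over all admissible cylinders and compared with the maxima obtained after randomly reshuffling case dates among locations.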

For each space-time cluster identified, the direction of disease spread was evaluated using a directional test. The directional analysis was performed using the circular and Directional R packages80,81. For each cluster, the mean direction angle was calculated, along with the cardinal orientation (N, NE, E, SE, S, SW, W, NW). The concentration index (R̅), which reflects the proportion of cases aligned in the same direction and varies between 0 (no directional concentration) and 1 (perfect alignment), was also calculated. Finally, the Rayleigh test was performed to ascertain whether the directions were random or followed a preferred direction. The null hypothesis stated that there was no association between the time at which a case was identified and its corresponding spatial location. Conversely, the alternative hypothesis suggested that the direction from one case to the next was similar for cases that occurred at approximately the same time.
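The quantities in this directional summary can be sketched in Python as follows. The example assumes that case-to-case bearings (degrees clockwise from north) have already been derived for a cluster; the bearings are invented. The Rayleigh p-value uses a standard closed-form approximation from the circular statistics literature, which may differ slightly from the value returned by the R packages cited above.

```python
from math import atan2, cos, sin, sqrt, exp, degrees, radians, hypot

COMPASS = ["N", "NE", "E", "SE", "S", "SW", "W", "NW"]

def directional_summary(bearings_deg):
    """Mean direction, compass sector, concentration index and Rayleigh p-value."""
    n = len(bearings_deg)
    # For compass bearings, cos gives the northward and sin the eastward component.
    north = sum(cos(radians(b)) for b in bearings_deg) / n
    east = sum(sin(radians(b)) for b in bearings_deg) / n
    r_bar = hypot(north, east)  # mean resultant length, between 0 and 1
    mean_dir = degrees(atan2(east, north)) % 360
    sector = COMPASS[int((mean_dir + 22.5) // 45) % 8]  # 45-degree sectors
    # Approximate Rayleigh test p-value based on the resultant length R = n * r_bar.
    resultant = n * r_bar
    p_value = exp(sqrt(1 + 4 * n + 4 * (n ** 2 - resultant ** 2)) - (1 + 2 * n))
    return mean_dir, sector, r_bar, p_value

# Invented bearings for one cluster, concentrated towards the east.
clustered = [80, 85, 90, 95, 100, 88, 92, 91]
mean_dir, sector, r_bar, p_value = directional_summary(clustered)
print(round(mean_dir, 1), sector, round(r_bar, 3), p_value)

# Bearings spread evenly around the compass show no preferred direction.
_, _, r_uniform, p_uniform = directional_summary([0, 90, 180, 270])
```

When R̅ is close to zero the mean direction itself is not meaningful, so it should only be interpreted for clusters where the Rayleigh test rejects uniformity.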

All figures illustrated in the manuscript are original and were generated with R software (version 4.3.3) and QGIS 3.8.3. Administrative boundary layers were obtained from the Italian National Institute of Statistics (ISTAT) Open Data, dataset “Confini delle unità amministrative a fini statistici, release 01/01/2018” (https://www.istat.it/notizia/confini-delle-unita-amministrative-a-fini-statistici-al-1-gennaio-2018-2/), which is distributed under a Creative Commons Attribution licence; all maps presented here are original figures generated by the authors using these open data.