Introduction

Plants play a crucial role in tackling major social and environmental challenges today, such as the provision of food, fuel and environmental stability. However, the threat posed by invasive plant pests and pathogens is substantial1,2, and continually rising due to increasing globalization and climate change3. Epidemics resulting from these pest invasions can have catastrophic impacts on human health and food security, and can contribute to environmental change. For example, after Xylella fastidiosa was detected in Italy in 2013, it reduced ecosystem services including food production, soil erosion regulation and ornamental resources by 34%, and decreased biodiversity by 28%4. This was due to its impact on high-value habitats and the genetic diversity of the host olive trees, as farmers mostly replanted resistant cultivars. Moreover, in Italy the pathogen has been projected to lead to up to 5.2 billion euros in economic losses on olive alone, and it can infect hundreds of other plant species5.

After discovering a plant pest – which we interpret broadly to refer to vectors, herbivorous insects, and pathogens6 – in a new area, the extent of the infestation must be quickly determined so control measures can be implemented to prevent further spread. In the EU, delimitation, which involves defining the boundaries of an area considered to be infested by a pest (i.e. the potential infested zone or PIZ)2, is a regulatory requirement for all quarantine pests7. It is typically conducted via surveys which generate information on the observed presence and absence of a pest within a population8. An example of a delimiting strategy involves drawing a circle around every identified infected (or infested, in the case of an insect pest) individual and supplementing each circle with a buffer zone9,10,11. To achieve eradication or containment of a pest, the chosen delimiting strategy should delimit an area that contains the infestation12.

Understanding early-stage pest spread is challenging and depends on the epidemiological characteristics of the pest, the landscape, and environmental conditions at the initial arrival site. Spatial and temporal processes of surveillance and detection (e.g. the frequency and intensity of surveys, and the sensitivity of detection methods used) also determine how long the pest has been left undetected. A lack of understanding of these complex interacting processes has contributed to a lack of consensus on how to effectively conduct delimitation. Consequently, while there have been studies that evaluated previously applied strategies13,14,15 and documents that provide guidelines on the design and implementation of delimiting strategies16,17, there is little overview of the effectiveness of different delimiting strategies in the domain of plant pests, and real-world approaches are often ad hoc and not science-based. This is concerning as failure to accurately delimit an outbreak could result in either delimiting an area that is too small, which could lead to the pest spreading further, or an area that is too large, which could lead to unnecessary costs and potential legal obstacles associated with control programs18. Effective delimitation is especially important in non-agricultural areas (e.g. plants in urban areas, or natural communities) where host populations are not intensively managed and inspected. These areas can act as important reservoirs for pests and influence the connectivity between commercial and conservation areas19,20,21,22,23,24.

To address the knowledge gap and provide scientific support for policymakers, we used an individual-based model (IBM) to simulate the spread of Candidatus Liberibacter spp. - the causal agent of Huanglongbing (HLB), also known as citrus greening. We then evaluated the performance of three delimiting strategies across various host distribution landscapes with host densities comparable to those of urban areas in Spain. The work was developed in collaboration with the European Food Safety Authority (EFSA) plant pest survey methods Working Group to inform and update the EFSA General guidelines for statistically sound and risk-based surveys of plant pests16.

We chose to model HLB as it is a pertinent example due to its economic significance and its current threat to global citrus-producing areas, including Europe. It is caused by the bacterium Candidatus Liberibacter spp., and is considered to be the most destructive citrus pathosystem in the world25 and is an EU “priority plant pest”26. Previously confined to Asia and Africa, it was first detected in the Western Hemisphere in 2004 in Brazil, where in just five years it had infected approximately four million trees27. The effects of HLB were also devastating after its introduction in Florida in 2005 where after progressing slowly in the first three years, the disease incidence quickly started doubling every year until reaching 80% by 201328 and generated losses estimated at approximately US$1 billion per year29. While HLB is currently absent in Europe, two of its vectors Trioza erytreae and Diaphorina citri have been detected in the Mediterranean Basin. T. erytreae was first detected in the island of Madeira (Portugal) in 1994, then in the Canary Islands (Spain) in 2002, and eventually on the mainland of Spain in Galicia in 201430, while D. citri was recently detected in Cyprus in 202331. Given that Spain is the fifth largest producer of citrus in the world32, the detection of T. erytreae in Spain has instigated the need for a contingency plan which includes an effective delimitation strategy. Delimitation for HLB is particularly challenging given its long asymptomatic period and delayed but rapid rate of population growth33. While most citrus trees grow in commercial groves and orchards, substantial populations are also found in municipal and residential areas, where tree density and distribution differ significantly.

Methodology

Host distribution methods

We used three different methods to simulate the spatial distribution of hosts which represented a range of landscape types (Fig. 1a). The spread of HLB through the different host landscapes was modelled with an individual-based model (IBM). The IBM was coded in R 4.3.334. We tested our model using a landscape with a similar density of citrus trees to that of Seville, a large city in Spain. It is difficult to unambiguously quantify the number of citrus trees in Seville, with estimates ranging from 25,000 bitter orange trees in 199635 to 50,000 orange trees in 202036. In line with these estimates, here we used the value quoted by Galvañ, et al.37, who estimated there are approximately 46,000 citrus trees in the city of Seville, which has an area of 141.4 km2. To approximate the tree density in Seville, the IBM generates 15,941 trees and distributes them in a 49 km2 plot.
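The target population follows from scaling the city-wide tree count to the plot area at equal density. The arithmetic can be checked with a short sketch, written here in Python for illustration (the authors' implementation is in R); the function and variable names are hypothetical.

```python
# Values quoted in the text for Seville.
CITY_TREES = 46_000      # approximate number of citrus trees in the city
CITY_AREA_KM2 = 141.4    # area of the city
PLOT_SIDE_KM = 7.0       # side of the simulated square plot (49 km^2)


def trees_for_plot(city_trees, city_area_km2, plot_side_km):
    """Scale the city-wide tree count to the simulated plot at equal density."""
    density = city_trees / city_area_km2      # trees per km^2
    return round(density * plot_side_km ** 2)


n_trees = trees_for_plot(CITY_TREES, CITY_AREA_KM2, PLOT_SIDE_KM)
print(n_trees)  # prints 15941
```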

The first method involves random placement, where x and y coordinates were drawn from a uniform probability distribution between 0 and 7000. This resulted in points with randomly generated coordinates that were uniformly distributed and exhibited no distinct pattern or clustering. The second method used the R package spatstat38 to generate a clustered host landscape, using a Poisson cluster process. The final method captured the key characteristics of urban citrus tree distribution in Seville, where trees tend to be distributed along roadsides and in parks. Firstly, to avoid artifacts at the boundaries of the plot, a Voronoi diagram that extended beyond the 7 km x 7 km plot was constructed and an augmented population size was chosen (28,000). Using an estimate of Seville’s public parks and gardens and acknowledging that not all trees are citrus, we randomly selected 8% of polygons to represent these areas. Within each polygon, we simulated a certain number of trees based on its area and distributed them randomly. The remaining trees were then distributed along the lengths of all the remaining polygons to simulate tree plantings along roads. To introduce more spatial host heterogeneity, the spacing between trees was varied by first calculating the average spacing between each tree necessary to populate the lengths of all the remaining polygons. We then randomly chose a value between 0.8 and 1.4 times of the average spacing for each polygon length and iterated this process until the difference between the simulated and augmented population sizes was less than or equal to 10. Finally, the entire plot was cropped down to 7 km x 7 km and trees were randomly removed to achieve the target population of 15,941. We also simulated landscapes with an unrealistic level of clustering (Extreme clustered landscapes) with both the Poisson cluster process and Voronoi diagram methods to fully test the robustness of the delimiting strategies. 
With the three different distribution methods and two levels of clustering, there was a total of five different host landscape types (Fig. 1a).
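The random and clustered placements can be illustrated with a minimal sketch. It is written in Python with the standard library only and is not the authors' R code: the clustered generator below is a simplified, Thomas-like stand-in for the spatstat Poisson cluster process, and `n_parents` and `cluster_sd` are hypothetical parameters that do not come from the paper. The Voronoi-based method is not reproduced.

```python
import random

PLOT_SIDE_M = 7000.0   # 7 km x 7 km plot, as in the text
N_TREES = 15_941       # target population, as in the text


def random_landscape(n, side, rng):
    """Uniform random placement: x and y drawn independently from U(0, side)."""
    return [(rng.uniform(0, side), rng.uniform(0, side)) for _ in range(n)]


def clustered_landscape(n, side, n_parents, cluster_sd, rng):
    """Simplified cluster process conditioned on exactly n points.

    Parent locations are uniform; each tree is a Gaussian offset from a
    randomly chosen parent. Points falling outside the plot are redrawn.
    """
    parents = random_landscape(n_parents, side, rng)
    trees = []
    while len(trees) < n:
        px, py = rng.choice(parents)
        x, y = rng.gauss(px, cluster_sd), rng.gauss(py, cluster_sd)
        if 0 <= x <= side and 0 <= y <= side:
            trees.append((x, y))
    return trees


rng = random.Random(42)
uniform_trees = random_landscape(N_TREES, PLOT_SIDE_M, rng)
# Hypothetical clustering parameters, chosen only for illustration.
cluster_trees = clustered_landscape(N_TREES, PLOT_SIDE_M, n_parents=150,
                                    cluster_sd=200.0, rng=rng)
print(len(uniform_trees), len(cluster_trees))
```

Reducing `cluster_sd` or `n_parents` in this sketch gives a more strongly clustered landscape, which is the kind of contrast the Extreme clustered landscapes are meant to provide.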

To ensure there was enough contrast between the host landscape types, we generated 500 realizations of each landscape type, simulated a 5-year epidemic with the IBM and calculated the disease prevalence per year and the number of susceptible and infectious hosts in the population in each year. For each realization, we ensured that the first infected host was in approximately the same position for all five landscape types. While the resulting plots (Fig. S1 and S2) showed that there was sufficient contrast between the host landscape types, it also showed that the variation in disease prevalence increased with increasingly clustered host landscapes. This variation was anticipated, as in more clustered landscapes, if the first infected host occurs within a dense cluster interconnected with other clusters, the simulated pest spreads more rapidly and infects a larger proportion of the population. Conversely, if the first infected host appears in an area with sparse hosts, the limited connectivity hinders the pest’s spread, resulting in a lower prevalence.

The individual based model

The IBM adapts an SCIR (Susceptible, Cryptic, Infectious, Removed) model39,40,41 rather than the SEIR (Susceptible, Exposed, Infectious, Recovered) model because trees newly inoculated with HLB bacteria can become infectious within two weeks42 even while the tree is asymptomatic43. Since there is no effective chemical treatment for HLB44 and control measures are applied after delimitation, we did not model the removal of infected hosts, making the IBM an SCI model. The IBM makes two key assumptions. Firstly, infected hosts only become symptomatic an average of 365 days after infection. Anecdotal observations from the field have suggested that HLB disease symptoms manifest upwards of several months to a year after infection25,45, while graft-inoculated trees in a laboratory manifested symptoms 200 days after infection43. Secondly, even though the vectors of HLB have shown preference for different citrus species46,47, we assumed that all host individuals are the same species.

The distances between all individuals were computed and the rate of disease transmission between a pair of hosts separated by distance \(d_{ij}\) was modelled with an (unnormalized) exponential dispersal kernel, \(K(d;\alpha)=\exp\left(-\frac{d}{\alpha}\right)\), where \(\alpha\) is the scale parameter. The IBM initializes by setting the state of each individual to susceptible. A single individual is selected at random to be the origin of the epidemic and its state is updated to cryptic. The simulation then runs in continuous time, with the epidemiological transitions of all individuals simulated using Gillespie’s direct algorithm48.
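The following is a minimal sketch of a direct-method Gillespie simulation for an S → C → I process on fixed host positions, written in Python for illustration rather than taken from the authors' R code. Several details are assumptions of the sketch: cryptic hosts are treated as already infectious, the cryptic period is exponentially distributed with mean `mean_cryptic`, time is measured in years, and the demonstration values of `beta` and the host count are arbitrary.

```python
import math
import random


def kernel(d, alpha):
    """Unnormalized exponential dispersal kernel K(d; alpha) = exp(-d / alpha)."""
    return math.exp(-d / alpha)


def simulate_sci(coords, alpha, beta, mean_cryptic, t_end, rng):
    """Direct-method Gillespie simulation; returns the final state of each host."""
    n = len(coords)
    state = ["S"] * n
    pressure = [0.0] * n  # infection rate currently acting on each host

    def infect(j):
        # Host j becomes cryptic and starts exerting pressure on all others.
        state[j] = "C"
        xj, yj = coords[j]
        for k, (xk, yk) in enumerate(coords):
            if k != j:
                pressure[k] += beta * kernel(math.hypot(xk - xj, yk - yj), alpha)

    infect(rng.randrange(n))       # random origin of the epidemic
    sigma = 1.0 / mean_cryptic     # rate of the C -> I transition
    t = 0.0
    while True:
        sus = [j for j in range(n) if state[j] == "S"]
        cry = [j for j in range(n) if state[j] == "C"]
        inf_rate = sum(pressure[j] for j in sus)
        total = inf_rate + sigma * len(cry)
        if total <= 0.0:
            break
        t += rng.expovariate(total)          # waiting time to the next event
        if t > t_end:
            break
        if rng.random() * total < inf_rate:  # next event is a new infection
            j = rng.choices(sus, weights=[pressure[k] for k in sus])[0]
            infect(j)
        else:                                # next event is symptom onset
            state[rng.choice(cry)] = "I"
    return state


rng = random.Random(1)
coords = [(rng.uniform(0, 1000), rng.uniform(0, 1000)) for _ in range(200)]
final = simulate_sci(coords, alpha=173.0, beta=0.05, mean_cryptic=1.0,
                     t_end=5.0, rng=rng)
print({s: final.count(s) for s in "SCI"})
```

Recomputing the susceptible and cryptic lists at every event keeps the sketch short; a production implementation would update them incrementally.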

Because data on psyllid dispersal under natural conditions were not available, we used data on the distance travelled by psyllids when flying in an artificial mill, making the conservative assumption that average dispersal distances under natural conditions would be comparable. Arakawa and Miyamoto49 estimated the mean distance travelled by psyllids as 346 m under controlled conditions. We then chose the scale parameter of the exponential dispersal kernel in our model, \(\alpha\), such that \(2\times\alpha=346\) m, invoking the standard relationship between the mean dispersal distance in two dimensions and the exponential dispersal kernel scale parameter50. With a value of 173 m for \(\alpha\), we then parameterized the baseline infection rate (\(\beta\)). In commercial trees, the prevalence of HLB can reach 50% in two years after the first infection in blocks with only young trees (0–2 years)25,51. Considering that citrus trees in a city would likely be a mix of old and young trees, we parameterized \(\beta\) with a line search to achieve a prevalence of 50% after five years on a random host landscape. We selected twenty-five values of \(\beta\) (ranging from 0.0001/yr to 0.00022/yr), obtained the median prevalence for each value from 500 iterations, plotted the results in a graph (Fig. S3) and obtained a value of 0.0001424/yr for \(\beta\). An animated GIF showing the spread of the simulated pest in a random host landscape can be found in the Supplementary Material (Fig. SA1).
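The relationship between the mean dispersal distance and the scale parameter can be checked numerically. In two dimensions, the probability of landing at distance r is proportional to the kernel value times the ring circumference, that is, to r·exp(−r/α), whose mean is 2α. The sketch below (Python, for illustration; the function name and integration settings are hypothetical) verifies this by simple midpoint-rule integration.

```python
import math


def mean_dispersal_distance_2d(alpha, r_max_factor=60.0, n_steps=200_000):
    """Mean distance under a 2D exponential kernel, by midpoint-rule integration.

    The radial density is proportional to r * exp(-r / alpha); the integral
    is truncated at r_max_factor * alpha, where the tail is negligible.
    """
    r_max = r_max_factor * alpha
    dr = r_max / n_steps
    num = den = 0.0
    for i in range(n_steps):
        r = (i + 0.5) * dr
        w = r * math.exp(-r / alpha)   # unnormalized radial density
        num += r * w
        den += w
    return num / den


MEAN_FLIGHT_M = 346.0            # mean flight distance quoted in the text
alpha = MEAN_FLIGHT_M / 2.0      # scale parameter, 173 m
print(alpha)
print(mean_dispersal_distance_2d(alpha))
```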

With the parameterized values of \(\alpha\) and \(\beta\), and on a random host distribution landscape, we estimated (a) the mean maximum spread distance of the pest in the first year (1057 ± 17 m, Fig. S23a), (b) the mean maximum spread distance of the pest per generation (738 ± 7 m, Fig. S23b), and (c) the mean number of generations after 5 years (25, Fig. S24). Each of these values was estimated with a minimum of 1,000 realizations of the IBM.

For the remainder of the paper, we will use the value of 1050 m as the true maximum yearly spread distance of the pest, and the value of 750 m as the true maximum spread distance of each generation. It is worth noting that the value of 1050 m falls within the range (1st percentile: 0.9 km/yr, 99th percentile: 40.12 km/yr) from a recent expert knowledge elicitation exercise for the spread rate of HLB in the EU26. Although 1050 m is on the lower end of the elicited range, this is because experts estimated the spread in citrus production areas with much higher host densities than in our simulation and included natural long-distance jumps of vectors which could not occur within the size of study region we consider here. Both the mean and median number of generations present after five years were twenty-five, which indicated that our simulated pest had an average of five generations per year. This is consistent with the biology of the psyllid vectors and HLB. According to Djeddour, et al.52, the psyllid vectors have about nine to ten generations per year; coupled with a latent period of HLB equal to about one generation of psyllid vectors34, this gives about five generations of infected trees per year.

Modelling the delimiting strategies

Description of the three main strategies

In practice, delimiting strategies usually involve drawing a circle around infected hosts with the radius determined by an estimate of the maximum annual spread distance of the pest. This estimate can be derived through various methods, such as expert elicitation or with empirical data, but its accuracy is rarely known. Therefore, in addition to examining the effect of host distribution landscapes, we also assess the impact of underestimating, matching, or overestimating spread rate – referred to here as the “inspector-estimated” (IE) value - on the performance of the delimiting strategies. We coded three different delimiting strategies, namely the In-to-Out strategy, the Adaptive strategy, and the Multi-foci strategy (Fig. 1b). These three strategies were identified in discussion with the EFSA plant pest survey methods Working Group leading to a recommendation for the Adaptive strategy in EFSA’s General Guidelines16. The In-to-Out and Adaptive strategies are essentially circles of varying radii drawn around the first detected infected tree. They differ in that the In-to-Out strategy always starts with surveying the immediate area around the first infected tree and then moves outward in concentric circles until no more infected trees are detected (Fig. 1b(i) and SA2). The Adaptive strategy aims to get ahead of the spread of the pest by first estimating the boundary of the epidemic and then conducting a preliminary survey in a band outside the estimated boundary (Fig. 1b(ii)). If the epidemic boundary was correctly estimated, no infected trees will be detected in the preliminary survey and the Adaptive strategy moves inwards towards the first detection and stops when a successful detection is made (Fig. 1b(ii) and SA4). If an infected tree is detected in the preliminary survey, the Adaptive strategy behaves like the In-to-Out strategy and moves outward until no more infected trees are detected (Fig. 1b(ii) and SA3). 
Unlike the other two strategies, the Multi-foci strategy considers multiple detections made in each survey round and draws a circle, of fixed radius length, around each new detection. The next survey band is defined as the encircled areas that do not overlap with the previous survey band (Fig. 1b(iii) and SA5). Like the In-to-Out strategy, the Multi-foci strategy stops when no new detections are made. To assess the impact of underestimating, matching or overestimating the true spread distances of the pest, the radius of each round of all three delimiting strategies will be calculated based on several IE values.

Fig. 1

Examples of all the different host landscape types. Each plot has the same area and number of points (15,941) (a). Diagram of how the three delimiting strategies work. The red circle represents the first circle drawn around the first infected tree based on each strategy’s calculated radius length. The band in between the red circle and dotted purple circle in the Adaptive strategy represents the band in which the preliminary survey is conducted. Red dots represent the first infected tree detected, black dots represent subsequent detections of infected trees and survey bands are indicated by blue areas (b).

Calculating radius length for the delimiting strategies

Because the radii of the In-to-Out and Adaptive strategies differ between survey rounds, and each new survey band represents an additional year or generation of spread, there are multiple ways to calculate the radii. For this paper we considered three different methods to calculate the varying radii of the In-to-Out and Adaptive strategies. The first method (Linear) assumes that the pest spreads according to the IE maximum annual spread distance (Table 1). However, this is likely an overestimate because the probability of the pest dispersing to the maximum distance in consecutive years is very low. The effects of an exponential spread that is compounded yearly can be better approximated by a gamma distribution to represent multiple rounds of exponentially distributed dispersal. This may also be an overestimation since it does not account for environmental factors or the effects of host distribution, but it provides a usable method. The simplest approach (Gamma Year) was to parameterize the shape parameter with the IE spread duration of the pest in years and use the corresponding IE maximum yearly spread distance to parameterize the rate parameter (Table 1). However, the underlying assumption of the Gamma Year method is that the pest has only one generation per year. Therefore, in the case of a polycyclic pest like HLB, where new infective units are produced within the same season, it would be more accurate to parameterize the shape parameter with the IE spread duration of the pest in generations and use the IE maximum generational spread distance to parameterize the rate parameter (Gamma Gen) (Table 1). Since the IE spread distances are treated as the 95th percentile of the pest’s annual/generational spread, the expression \(\ln(1-0.95)\) transforms this 95th percentile into a form that adjusts the radius calculation accordingly.
With the Multi-foci strategy, the radius of every circle was kept constant and was the IE maximum yearly spread distance of the pest. With the three versions of both the Adaptive and In-to-Out strategies and the Multi-foci strategy, there were a total of seven different strategies tested.
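One possible reading of the three radius calculations is sketched below in Python, for illustration only. The exact expressions are those of Table 1, which is not reproduced in this text, so the following are assumptions of the sketch: the rate of a single exponential dispersal step is set so that its 95th percentile equals the IE maximum spread distance (rate = −ln(1 − 0.95)/distance), the sum of several such steps follows a gamma distribution with integer shape, and the radius is taken as the 95th percentile of that gamma distribution. All function names are hypothetical.

```python
import math


def erlang_cdf(x, shape, rate):
    """CDF of a gamma distribution with integer shape (sum of exponentials)."""
    if x <= 0:
        return 0.0
    lx = rate * x
    term = total = 1.0
    for n in range(1, shape):
        term *= lx / n
        total += term
    return 1.0 - math.exp(-lx) * total


def erlang_quantile(p, shape, rate):
    """Invert the CDF by bisection."""
    lo, hi = 0.0, 1.0
    while erlang_cdf(hi, shape, rate) < p:
        hi *= 2.0
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if erlang_cdf(mid, shape, rate) < p:
            lo = mid
        else:
            hi = mid
    return hi


def linear_radius(years, max_yearly_dist):
    """Linear method: maximum yearly distance covered every year."""
    return years * max_yearly_dist


def gamma_radius(n_steps, max_step_dist, p=0.95):
    """Gamma Year (steps = years) or Gamma Gen (steps = generations).

    Assumption of this sketch: the radius is the p-th quantile of the gamma.
    """
    rate = -math.log(1.0 - 0.95) / max_step_dist
    return erlang_quantile(p, n_steps, rate)


# IE values matched to the true values used in the text, 3-year duration.
print(linear_radius(3, 1050.0))   # Linear
print(gamma_radius(3, 1050.0))    # Gamma Year: 3 yearly steps
print(gamma_radius(15, 750.0))    # Gamma Gen: 15 generational steps
```

Under these assumptions the Gamma Year radius is shorter than the Linear radius for the same duration, while the Gamma Gen radius is the longest of the three.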

Table 1 Summary of how each delimiting strategy calculates the length of the radius, where \(y_{i}\) is the inspector-estimated spread duration of the pest in years for the \(i\)th survey round, \(g_{i}\) is the inspector-estimated spread duration of the pest in generations for the \(i\)th survey round, \(D\) is the inspector-estimated maximum yearly spread distance of the pest, and \(\delta\) is the inspector-estimated maximum spread distance of each generation.

The number of trees surveyed per round

To calculate the number of trees that need to be surveyed to detect an infection in each survey band, all seven strategies determine the number of samples required within a band to achieve a certain confidence level (\(CL\)) that detection will occur if the pest is present above a defined design prevalence (\(DP\)). For example, if the pest is not detected at \(CL=0.8\) and \(DP=0.1\), it suggests an 80% confidence that the pest prevalence is lower than 10% of the target population. For this, we adopted the finite population equation from EFSA’s risk-based estimate of system sensitivity (RiBESS+) tool53. The equation utilizes a hypergeometric probability distribution and is based on the principles originally developed by Cannon54. It is expressed as:

$$\text{Sample size}=\frac{\left(1-(1-CL)^{\frac{1}{N\times DP}}\right)\times\left(N-0.5\left(N\times DP\times MetSens-1\right)\right)}{MetSens}$$
(1)

where \(N\) is the total number of trees within the survey band and \(MetSens\) is the sensitivity of the detection method. In practice, both \(CL\) and \(DP\) are variables determined by risk managers, balancing the trade-off between acceptable risk levels and available resources. For our tests, we used a confidence level of \(CL=0.95\) and a design prevalence of \(DP=0.01\); the number of trees depended on the area being surveyed, and we varied the sensitivity of detection with \(MetSens=0.2, 0.5, 0.8\) and \(1.0\). Finally, our tests of all seven delimiting strategies assume that each survey round, regardless of the number of trees or area that needs to be covered, is completed instantaneously on the 30th day after the conclusion of the previous survey round or the start of the delimiting strategy.
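Equation (1) can be evaluated directly. The sketch below is written in Python for illustration and is not the RiBESS+ tool itself; the function name is hypothetical, and rounding up to a whole tree and capping the result at \(N\) are choices of this sketch rather than part of Eq. (1). The band size of 1,000 trees is an arbitrary example.

```python
import math


def ribess_sample_size(n_pop, cl, dp, met_sens):
    """Finite-population sample size from Eq. (1).

    n_pop    : N, number of trees in the survey band
    cl       : CL, confidence level (e.g. 0.95)
    dp       : DP, design prevalence (e.g. 0.01)
    met_sens : MetSens, sensitivity of the detection method
    """
    n_infected = n_pop * dp  # infected trees implied by the design prevalence
    raw = ((1.0 - (1.0 - cl) ** (1.0 / n_infected))
           * (n_pop - 0.5 * (n_infected * met_sens - 1.0)) / met_sens)
    # Sketch-only choices: round up to whole trees, never exceed the band size.
    return min(n_pop, math.ceil(raw))


# CL and DP as used in the text; the band of 1,000 trees is illustrative.
for sens in (0.2, 0.5, 0.8, 1.0):
    print(sens, ribess_sample_size(1000, cl=0.95, dp=0.01, met_sens=sens))
```

In this example, a band of 1,000 trees requires 258 samples at perfect sensitivity and 517 at a sensitivity of 0.5, which illustrates how lower method sensitivity translates into higher Effort.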

Assessing the performance of the delimiting strategies.

Performance was measured with four metrics.

Capability.

  • The number of infected trees delimited by the strategies divided by the total number of infected trees present at the end of the delimiting survey. We also explored the proportion of simulations where a Capability of 1 was achieved (i.e. 100% of infected trees were delimited).

Efficiency.

  • The area of the delimited potential infested zone divided by the area of the convex hull (the area of the smallest convex polygon needed to delimit all the infected trees present in the population at the end of the delimiting strategy).

Effort.

  • The total number of trees surveyed.

Survey rounds.

  • The number of survey rounds taken to delimit a potential infested zone.
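The Capability and Efficiency metrics defined above can be sketched as follows, in Python for illustration. The sketch assumes a single circular potential infested zone (as in the In-to-Out and Adaptive strategies); the union of circles produced by the Multi-foci strategy is not handled, and all function names are hypothetical.

```python
import math


def convex_hull(points):
    """Andrew's monotone chain; returns hull vertices in counter-clockwise order."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts

    def cross(o, a, b):
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]


def polygon_area(vertices):
    """Shoelace formula for the area of a simple polygon."""
    s = 0.0
    for i in range(len(vertices)):
        x1, y1 = vertices[i]
        x2, y2 = vertices[(i + 1) % len(vertices)]
        s += x1 * y2 - x2 * y1
    return abs(s) / 2.0


def capability(infected, centre, radius):
    """Fraction of infected trees that lie inside the circular zone."""
    inside = sum(1 for x, y in infected
                 if math.hypot(x - centre[0], y - centre[1]) <= radius)
    return inside / len(infected)


def efficiency(infected, radius):
    """Area of the circular zone divided by the convex-hull area of infected trees.

    Undefined (division by zero) if the infected trees are collinear.
    """
    return math.pi * radius ** 2 / polygon_area(convex_hull(infected))


infected = [(0, 0), (2, 0), (2, 2), (0, 2), (1, 1)]
print(capability(infected, centre=(1, 1), radius=1.5))  # prints 1.0
print(efficiency(infected, radius=1.5))
```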

We assessed the performance of all seven strategies with five different scenarios of increasing complexity. The first scenario assumed a hypothetical situation where the IE duration of pest spread and distances accurately matched the true values, the epidemic’s origin was correctly identified and used as the starting point for all seven delimiting strategies, and there was no asymptomatic period for the pest. The only variable was the sensitivity of the detection method (Method Sensitivity), tested at four levels (0.2, 0.5, 0.8, and 1.0). The second scenario was similar to the first, except that a 1-year asymptomatic period was included.

The third scenario expanded on the second by starting the delimiting strategies at a random symptomatic tree in the population, rather than the epidemic’s origin. The fourth scenario further built on the third, simulating situations where the IE annual/generational spread distance of the pest was: (1) greatly underestimated (annual: 450 m/yr; generational: 350 m/gen), (2) underestimated (annual: 750 m/yr; generational: 550 m/gen), (3) matched with the true value (annual: 1050 m/yr; generational: 750 m/gen), and (4) overestimated (annual: 1350 m/yr; generational: 950 m/gen). In this scenario, the strategies assumed a 3-year spread duration (15 generations) but started surveying when the pest had spread for only 2 years (10 generations) to overestimate the duration, or for 4 years (20 generations) to underestimate it.

The fifth scenario was like the fourth but assessed the performance of the delimiting strategies in clustered and extreme clustered landscapes generated by both the Poisson cluster process and Voronoi diagram methods.

The IBM, host distribution methods and delimiting strategies were all modelled and run in R 4.3.334 and the code and parameters can be found in the Supplementary Materials (https://github.com/roguedaemon87/Supplementary-Material). To differentiate animated figures from static ones, animated figures in the Supplementary Material are labeled with the prefix SA_, while static figures are labeled with S_.

Results

In the following sections we first explore the influence of changing the different surveillance parameters on the performance of each of the three delimiting strategies, using a ‘random host landscape’ to demonstrate this. Secondly, we explore the influence of different types of clustered landscapes on the performance of the three delimiting strategies.

The influence of surveillance parameters on delimiting strategy performance on a random host landscape

The RiBESS+ equation performed well and negated the effects of changing the method sensitivity on the Capability and Efficiency of all seven strategies (i.e. as lower method sensitivities were applied, this was compensated for by RiBESS+ increasing sample size) (Fig. 2a). However, decreasing the method sensitivity resulted in increased Effort levels to delimit the PIZ (Fig. 2b). Varying the inspector-estimated (IE) yearly/generational spread distance profoundly affected the performance of all seven strategies (Fig. 3). Capability of all seven strategies was poor when the IE spread distances were lower than the true values and improved when the IE spread distance was equal to or greater than the true values (Fig. 3a). Throughout the first four scenarios, the Gamma Gen (both In-to-Out and Adaptive) strategies had the highest Capability levels, albeit at higher Effort levels and at lower Efficiency (when the IE spread distance was matched or overestimated) than the other strategies.

The Gamma Gen (both In-to-Out and Adaptive) strategies also achieved perfect Capability scores the most often, even across different host landscapes (Fig. 4a and S19), and performed well regardless of whether the IE mean number of generations per year was underestimated or overestimated (Fig. S11 to S14). However, in most scenarios, the Adaptive (Gamma Gen) strategy had a higher Efficiency (Method Sensitivity = 0.2, 0.5, 0.8, 1.0; median: 0.67, 1.00, 1.51, 2.13) than the In-to-Out (Gamma Gen) strategy (Method Sensitivity = 0.2, 0.5, 0.8, 1.0; median: 0.67, 1.23, 1.89, 2.88) (Fig. 3b), needed less Effort (Fig. 5a) and fewer survey rounds (Fig. 5b) than the In-to-Out (Gamma Gen) strategy and therefore was the overall best strategy. Changing the IE duration of the pest spread also affected the performance of all seven strategies. The Gamma Gen strategies performed well when the pest spread duration was overestimated (delimiting strategies initiated 2 years post-infection) but declined in Capability when the duration was accurately matched or underestimated (delimiting strategies initiated 4 years post-infection) (Fig. 4b). In contrast, the Gamma Year, Linear, and Multi-foci strategies showed improved performance with longer pest spread durations (Fig. 4b).

Fig. 2

The change in Capability (a) and Effort (b) with Method sensitivity of all seven delimiting strategies on a randomly distributed host landscape. The inspector-estimated (IE) duration of the pest spread, and annual/generational spread distances were matched with the true values. For each realization of the simulated epidemic, all delimiting strategies started from the same randomly selected symptomatic individual. Each boxplot was obtained from 500 iterations. Mean values are indicated with a dark-red diamond.

Fig. 3

The change in Capability (a) and Efficiency (b) scores of all seven delimiting strategies on a randomly distributed host landscape with varying inspector-estimated (IE) annual/generational spread distances. The IE duration of the pest spread was overestimated, and the Method Sensitivity was 0.5. For each realization of the simulated epidemic, all delimiting strategies started from the same randomly selected symptomatic individual. Each boxplot was obtained from 500 iterations and the values of the IE spread distances are given in the table above the graph. Mean values are indicated with a dark-red diamond.

Fig. 4

(a) The change in how often each of the seven delimiting strategies achieved a perfect Capability score of 1 with inspector-estimated (IE) annual/generational spread distances on a randomly distributed host landscape. The IE duration of pest spread was overestimated. (b) The change in Capability scores of all seven delimiting strategies on a randomly distributed host landscape with varying start times of the delimiting strategies. The inspector-estimated (IE) duration of the pest spread was 3 years, and the IE spread distance was matched with the true value (annual: 1050 m/yr; generational: 750 m/gen). For both panels, in each realization of the simulated epidemic, all delimiting strategies started from the same randomly selected symptomatic individual. Each boxplot and bar were obtained from 500 iterations and mean values for the boxplots are indicated with a dark-red diamond. Method sensitivity was 0.5.

Fig. 5

The change in Effort (a) and the number of survey rounds (b) of all seven delimiting strategies on a randomly distributed host landscape with varying inspector-estimated (IE) annual/generational spread distances. The IE duration of the pest spread was overestimated, and the Method Sensitivity was 0.5. For each realization of the simulated epidemic, all delimiting strategies started from the same randomly selected symptomatic individual. Each boxplot was obtained from 500 iterations and the values of the IE spread distances are given in the table above the graph. Mean values are indicated with a dark-red diamond.

The influence of clustered host landscapes on delimiting strategy performance

Although increasing the level of clustering resulted in greater variation in performance, the Gamma Gen strategies maintained the highest Capability scores (Fig. 6a). While the In-to-Out (Gamma Gen) strategy slightly outperformed the Adaptive (Gamma Gen) strategy in Capability, the Adaptive (Gamma Gen) strategy proved more Efficient in most scenarios, requiring fewer survey rounds and less Effort than the In-to-Out (Gamma Gen) strategy. However, when the IE spread distances were lower than the true value, the Capability scores of the Gamma Year and Multi-foci strategies improved with increasing levels of clustering and host heterogeneity, with the former surpassing the Capability scores of the Gamma Gen strategies in Extreme Voronoi Diagram host landscapes (Fig. 6b). This was because the Gamma Year strategies’ shorter radii better matched the spread of symptomatic individuals, which had been slowed and bottlenecked by the heterogeneous landscape (Fig. S21, SA6 and SA7). However, this in turn resulted in the Gamma Year strategies needing much more Effort (Fig. 7a) and many more survey rounds (Fig. 7b) than the Gamma Gen strategies to delimit the PIZ. The full plots of each scenario can be found in the Supplementary Materials (Scenario 1: Fig. S4; Scenario 2: Fig. S5; Scenario 3: Fig. S6; Scenario 4: Fig. S7–S10; Scenario 5: Fig. S15–S18).

Fig. 6

The change in the Capability scores of the delimiting strategies on various host landscape types and when the Method Sensitivity was 0.5, and the IE duration of the pest spread was overestimated. The inspector-estimated (IE) spread distances were matched with the true value (a) and greatly underestimated (b) (450 m/yr and 350 m/gen). For each realization of the simulated epidemic, all delimiting strategies started from the same randomly selected symptomatic individual. Boxplots for the Random, Poisson and Voronoi landscapes were obtained from 500 iterations, while boxplots for the Extreme Poisson and Extreme Voronoi landscapes were obtained from 200 iterations. Mean values are indicated with a dark-red diamond. The performance of the In-to-Out (Linear) and Adaptive (Linear) strategies was not assessed on clustered host landscapes.

Fig. 7

The change in Effort scores (a) and number of survey rounds (b) of each delimiting strategy on various landscape types. The Method Sensitivity was 0.5, the inspector-estimated (IE) spread distances were greatly underestimated (450 m/year and 350 m/generation) and the IE duration of the pest spread was overestimated. For each realization of the simulated epidemic, all delimiting strategies started from the same randomly selected symptomatic individual. Boxplots for the random host, Poisson and Voronoi landscapes were obtained from 500 iterations, while boxplots for the Extreme Poisson and Extreme Voronoi landscapes were obtained from 200 iterations. Mean values are indicated with a dark-red diamond.

Discussion

Delimiting strategies are essential for effective outbreak management, whether for eradication or containment. Without science-based approaches, delimited areas may be inadequately sized: too small, allowing pests to spread unchecked, or too large, making eradication economically unfeasible. In this study, we focused on the economically significant causal agent of citrus HLB to illustrate how the design of delimiting strategies depends on the pest’s epidemiology. We compared various delimiting strategies across different scenarios and host landscape types to objectively determine the best-performing strategy. On a random host landscape, the Multi-foci strategy had the worst performance. Its performance was most negatively affected when the strategy started on a random symptomatic tree and asymptomatic trees could not be detected. There were two main reasons for this. Firstly, because the IBM models the spread with an exponential dispersal kernel, the epidemic spreads from its origin and creates two fronts: an inner front of symptomatic hosts and an outer front of asymptomatic hosts. Hence, when the Multi-foci strategy begins at a random symptomatic tree on the edge of the symptomatic front, it encounters more symptomatic hosts towards the epidemic’s origin and fewer in other directions. This bias persists even with perfect method sensitivity, as fewer hosts are sampled away from the origin. At lower sensitivities, imperfect detection further reinforces this trend, leading the strategy to focus inward rather than outward. Secondly, the constant and relatively short radii of the Multi-foci strategy result in survey circles that fail to capture the full extent of the spread, especially given the presence of asymptomatic trees. An example illustrating both reasons is provided in Fig. SA8 of the Supplementary Material. For comparison, Fig. SA9 illustrates the performance of the Adaptive (Gamma Gen) strategy in the same scenario as Fig. SA8.
The In-to-Out and Adaptive strategies performed similarly in terms of Capability, but the Adaptive strategy was generally more Efficient and required fewer survey rounds and less Effort in most situations. However, the In-to-Out strategy outperformed the Adaptive strategy in terms of Capability, Effort, and Number of Survey Rounds when the pest was detected early (two years after the initial infection) (Fig. S7, S9 and S10). The strong performance of the In-to-Out strategy when the pest was detected early aligns with the findings of other research studies55,56 and can be attributed to the strategy’s initial focus on surveying the immediate area around the starting point while the epidemic was still relatively small. This allowed the In-to-Out strategy to delimit the PIZ faster and more efficiently than the Adaptive strategy. Therefore, if the target pest’s dispersal rate is low or restrained and confidence in the estimated spread duration is high, the In-to-Out strategy would be preferable to the Adaptive strategy. In practice, however, it is often unclear how long before its detection and identification a pest was first introduced to an area. This makes the Adaptive strategy the safer and more conservative option.

Of the three different versions of the Adaptive strategy, the Gamma Gen version performed the best. Compared to the Linear and Gamma Year versions, it consistently had the highest Capability levels (Fig. 3a) and even had an average Capability of > 90% when the generational spread distance was underestimated (Fig. S7). The Gamma Gen version also benefitted from earlier pest detections, as its wider survey bands were able to encompass most infected hosts (Fig. 4b). Although the relative Capability of its wider bands decreased as pest spread duration increased, it still outperformed the Gamma Year and Linear versions in terms of Capability (Fig. 4b). Additional details regarding the effect of the duration of pest spread on the delimiting strategies can be found in the Supplementary Materials. The Gamma Gen version was less efficient than the other versions due to its wider survey bands, which tended to overestimate the area of the convex hull. However, even when the generational spread distance was overestimated, it only delimited an area approximately 2 times the convex hull on average (Fig. S12). The superior performance of the Gamma Gen version stems from its suitability in tracking a polycyclic pest like HLB. Unlike the Linear and Gamma Year versions, which assume the pest has only one generation – and thus one spread event – per year, the Gamma Gen version accounts for multiple spread events annually. This was further confirmed when we analyzed how frequently each version accurately estimated the epidemic’s spread distance and moved inward (Fig. S20), as well as by comparing the probability density function values derived from the gamma distribution with a histogram of the pest’s spread distances after five years (Fig. S21 and S22). 
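As a rough illustration of why accounting for several spread events per year leads to wider survey bands, the sketch below compares 95% spread-distance quantiles of a gamma distribution with integer shape. It assumes, for illustration only, that each spread event moves the pest an exponentially distributed distance and that successive events add up along a single lineage, so that the total displacement after n events follows a gamma distribution with shape n. The per-event mean distance, the number of generations per year and all function names are hypothetical; they are not the parameter values or the code used in this study.

```python
import math


def erlang_cdf(x, shape, scale):
    """CDF of a gamma distribution with integer shape (an Erlang distribution)."""
    if x <= 0:
        return 0.0
    z = x / scale
    term = 1.0   # z**0 / 0!
    total = 1.0  # running sum of z**i / i! for i < shape
    for i in range(1, shape):
        term *= z / i
        total += term
    return 1.0 - math.exp(-z) * total


def erlang_quantile(p, shape, scale):
    """Distance below which a fraction p of the displacements fall (bisection)."""
    lo, hi = 0.0, scale
    while erlang_cdf(hi, shape, scale) < p:
        hi *= 2.0
    for _ in range(100):
        mid = (lo + hi) / 2.0
        if erlang_cdf(mid, shape, scale) < p:
            lo = mid
        else:
            hi = mid
    return hi


# Hypothetical inputs, chosen only to show the qualitative effect.
mean_step_m = 100.0        # assumed mean distance of one spread event (m)
years = 5                  # assumed duration of undetected spread
generations_per_year = 3   # assumed number of spread events per year

# One spread event per year versus several events per year,
# using the same per-event mean distance in both cases.
band_one_event = erlang_quantile(0.95, years, mean_step_m)
band_multi_event = erlang_quantile(0.95, years * generations_per_year, mean_step_m)

print(f"95% distance, one event per year: {band_one_event:.0f} m")
print(f"95% distance, {generations_per_year} events per year: {band_multi_event:.0f} m")
```

Under these assumptions, a model that allows only one spread event per year places the 95% distance much closer to the origin than a model that allows several events per year, which is consistent with the wider survey bands of the Gamma Gen version.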
Of the three versions, the Gamma Gen version was the only one that moved inwards > 90% of the time when the estimated spread distance was matched, regardless of whether the IE duration of pest spread was overestimated, matched or underestimated. Even when the IE number of generations per year was underestimated to three generations per year and the IE generational spread distance was greatly underestimated, if the IE duration of the pest spread was matched or underestimated, the Adaptive (Gamma Gen) strategy could achieve Capability levels of > 90% (Fig. S11). Therefore, the Adaptive (Gamma Gen) strategy emerged as the overall best strategy in a random host landscape.

The Adaptive (Gamma Gen) strategy remained the best option, even on clustered landscapes where all delimiting strategies showed greater variation in performance due to host distribution heterogeneity. While the In-to-Out (Gamma Year) and Adaptive (Gamma Year) strategies achieved higher levels of Capability when the IE spread distances were underestimated, the Adaptive (Gamma Gen) was more efficient and required less Effort to delimit the PIZ. It is important to note that while the difference in Capability between the Adaptive (Gamma Year) and the Adaptive (Gamma Gen) strategies averaged 0.19 in an unrealistically clustered urban landscape (i.e. Extreme Voronoi), when the IE spread distances were underestimated, the difference in Capability was closer to 0.05 in more realistic scenarios (i.e. Voronoi column in Fig. 6). The lower Capability level of the Adaptive (Gamma Gen) strategy on highly clustered landscapes was due to its wider survey bands overestimating the spread of symptomatic individuals. This issue arose only in clustered landscapes, where individuals were densely packed in patches, and the low density of individuals between patches slowed the spread of symptomatic individuals. This finding aligns with metapopulation theory and has been documented in plant disease incidence in forests and grasslands57,58,59. These results provide actionable insights for the design of pest management programs, particularly in regions where resources for monitoring and containment are limited. The ability of the Adaptive (Gamma Gen) strategy to maintain high Capability across different landscape types offers flexibility in its implementation, which could be especially valuable in regions facing logistical or financial constraints. 
While the performance of the delimiting strategies was not evaluated using actual real-world distribution data of urban citrus trees, simulated host distributions generated through the Voronoi diagram method, based on satellite images of Seville, offer a close approximation. Using Earth observation data, which in principle allows for the mapping of individual trees, could provide a valuable methodology to improve this approximation. However, this approach is far from trivial due to challenges such as noise from buildings and other nearby objects in urban areas, as well as the difficulty of accurately identifying citrus trees among other intermixed species60.

Thus far, this study has shown the importance of tailoring a delimiting strategy to the characteristics of the target pest. Considering the long asymptomatic period and polycyclic nature of HLB, our case study pest, vastly improved the performance of the Adaptive strategy. This also implies that if a different pest were simulated, for example a monocyclic pest like Crinipellis perniciosa61 or a polycyclic pest with a longer asymptomatic period like X. fastidiosa62, the other strategies may outperform the Adaptive (Gamma Gen) strategy. This underscores the importance of evaluating different delimiting strategies for their suitability in targeting specific pests, making the methodology presented in this paper valuable for future studies. Besides re-parameterizing the IBM with different dispersal kernels (e.g. inverse-power law kernel63 and exponential-power kernel50), baseline infection rates and scale parameters to simulate different pests, the delimiting strategy models can be included in large-scale spread simulations, like those of Ellis et al.64, Radici et al.65 and Nguyen et al.66, to develop effective coordinated management strategies and contingency plans. Future studies could also explore how these delimiting strategies perform under varying environmental stressors such as climate change or dispersal barriers, which can alter pest dispersal patterns. Additionally, incorporating dynamic factors like changing host availability, or explicitly modelling a highly heterogeneous vector population, into future iterations of the IBM could yield more precise strategy recommendations. Exploring mixed host landscapes, such as a clustered host population partially generated using the Poisson cluster process and partially using the Voronoi diagram method, would be an intriguing avenue for future studies and is already possible using our code.

In addition to the assumptions mentioned in the Methodology section, the IBM did not simulate the movement or distribution of a vector population. Instead, we modelled the psyllid vectors implicitly with the baseline infection rate and assumed a uniformly distributed vector population across the landscape, with constant dispersal throughout the year. In reality, local dispersal of HLB vectors tends to be greatest during the warmer months and decreases greatly during the autumn and winter months21. Vector population densities in the field are also much higher in areas with high densities of seedling trees than in areas mostly populated by mature trees67. We chose to model the vectors implicitly to ensure that the framework is generic, allowing the IBM to be adapted for other pests. This approach aligns with previous studies40,68,69,70, which have shown that models that do not explicitly represent the vectors can still be effective for HLB. Like the IBM, the delimiting strategy models are not fully realistic. For example, the strategies do not account for false positive or false negative detections; the latter is especially likely for HLB as the HLB bacterium and pathogenic bacteria are unevenly distributed in plant tissues and sampling the wrong spot in a symptomatic individual can result in false negatives in polymerase chain reaction (PCR) tests71. However, since false negatives occur approximately 20% and 16% of the time for conventional PCR methods72 and real-time PCR methods73 respectively, their effect is likely nullified by the number of samples taken in each survey round in our simulations.
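The reasoning about false negatives can be made concrete with a small calculation. The sketch below uses the false-negative rates quoted above (20% for conventional PCR and 16% for real-time PCR) and assumes, as a simplification that is not part of the IBM, that samples taken from infected hosts within a survey round give independent test results. The sample counts and function names are hypothetical.

```python
def prob_all_missed(false_negative_rate, n_samples):
    """Probability that every one of n independent infected samples tests negative."""
    return false_negative_rate ** n_samples


def prob_detected(false_negative_rate, n_samples):
    """Probability that at least one of n independent infected samples tests positive."""
    return 1.0 - prob_all_missed(false_negative_rate, n_samples)


# False-negative rates quoted in the text.
rates = {"conventional PCR": 0.20, "real-time PCR": 0.16}

# Hypothetical numbers of infected samples collected in one survey round.
for method, rate in rates.items():
    for n in (1, 3, 5, 10):
        print(f"{method}, {n} infected samples: "
              f"P(at least one positive) = {prob_detected(rate, n):.6f}")
```

Under the independence assumption, the chance that an infestation goes entirely unnoticed in a survey round falls geometrically with the number of infected samples tested, which is why false negatives are expected to have little effect when many samples are taken per round.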

Conclusion

The impact of invading plant pests can be extremely severe, and so optimizing control, monitoring and reaction strategies is imperative. Although much research has focused on improving surveillance methods, enhancing the accuracy and speed of pest detection, and increasing the efficiency of various control measures, little attention has been given to evaluating strategies for delimiting the extent of pest spread. Despite its limitations, this study highlights the importance of evaluating the suitability of delimiting strategies and adapting them to the specific characteristics of the targeted pest. In the case of polycyclic pests with a long asymptomatic period like HLB, the Adaptive (Gamma Gen) strategy outperformed the other strategies discussed in this paper. Additionally, this study presents a framework for assessing the performance of three delimiting strategies, with potential for future implementation in larger-scale research projects to enhance pest management strategies.