Introduction

In February 2024, dairy farms in Texas, New Mexico, and Kansas began to report an unidentified disease spreading through lactating herds1,2. The disease was characterized by decreased rumen activity, diarrhoea, reduced milk production, and thicker milk consistency and discoloration. In March, milk samples from these farms were confirmed via real-time PCR as being infected with highly-pathogenic avian influenza H5N13. This marked the first time that transmission of Influenza A had been identified in US cattle populations4.

Subsequent phylogenetic studies identified this strain circulating in dairy cattle as a clade 2.3.4.4b genotype first isolated from wild bird populations in late 20235. This, together with additional most-recent common ancestor studies, suggests that the initial spillover into cattle likely occurred in December of 2023 in Texas6. Histological studies demonstrated the virus’ capability to bind to epithelial cells in the mammary tissue of dairy cows7, in accordance with findings of far greater viral shedding within milk compared to nasal swabs or respiratory tissues3. These factors indicate that the repeated use of milking apparatus between individual cows during milking is a primary route of transmission8,9. This also explains why outbreaks have yet to be detected in beef cattle or dry heifers. In April, the first human spillover case from dairy cattle was reported10, with a dairy worker presenting with conjunctivitis but no respiratory symptoms, likely due to contact with infected milk during the milking process.

The dairy industry is a substantial contributor to US national economic activity, with over 9 million milk cows11 contributing to approximately 3% of US GDP12. Cattle are frequently moved between premises and across states. As a result, export of cattle has been implicated in the proliferation of H5N1 to herds nationwide3, leading to interventions on exports being introduced. When cattle are shipped interstate, they must be accompanied by an Interstate Certificate of Veterinary Inspection (ICVI) to certify that the animals are fit to travel13,14. As of April 29th 2024, cattle exported interstate have up to 30 cows in the cohort tested for H5N1 influenza15. Should any tested animal be positive, the export cannot proceed, and the origin herd must be quarantined for 30 days before being tested again. No such requirements were introduced for transfers of cattle within state borders.

As of December 9th 2024, there have been 720 cattle herd outbreaks reported by the USDA16, across 15 states, and 35 human spillover cases with cattle as the exposure source17. Prolonged outbreaks of H5N1 in a novel animal reservoir present a continuing threat of further spillover and the potential for viral reassortment. Recent structural analysis by Lin et al.18 suggests that a single glutamine to leucine mutation within this 2.3.4.4b variant would be sufficient to allow for human receptor binding. For this reason, ascertaining the true size of the current epidemic, and identifying the areas of greatest circulation, is crucial to inform public health responses for curbing transmission. In previous bovine disease outbreaks, such as bovine spongiform encephalopathy and foot-and-mouth disease in the UK, public health responses have been significantly aided by modeling studies that estimated rates of under-reporting19, characterized key epidemiological mechanisms20, and quantified the impact of control policies21. Such efforts have not yet been applied to the current bovine H5N1 epidemic in the US.

In this study, we estimate the true size of the current epidemic via a stochastic metapopulation transmission model capturing 9,308,707 milk cows distributed across 35,974 herds in the 48 continental US states, as counted in the 2022 agricultural census11. Epidemiological parameters are estimated by fitting to outbreak data via a Bayesian evidence synthesis approach22. The movement of cattle between herds and states is captured using probabilistic outputs of the US Animal Movement Model (USAMM)23 and verified using actual 2016 ICVI data14. We make the mechanistic assumption that the probability of detecting and reporting an infected herd depends on the number of infected cattle and the total population size of the herd, irrespective of the US state in which the herd resides. The model successfully simulates outbreaks for US states that have frequently reported outbreaks, such as California. We estimate the rates of under-reporting by state, by comparing the number of confirmed outbreaks with model simulated trajectories, and present the anticipated rates of positivity for cattle tested upon leaving each state over time. We further use this model to interrogate the impact of intervention methods to date on the underlying epidemiological dynamics, and quantify the extent of uncertainty in the scale of the current epidemic, highlighting the most pressing data streams to capture.

Results

The model structure and key output metrics are illustrated in Fig. 1. Data on the number of dairy herds in the United States and their respective populations are taken from the 2022 US Agricultural Census11. Each herd is modeled via Susceptible-Exposed-Infected-Recovered (SEIR) infection dynamics. Panel 1A illustrates the number of infected cattle per herd over time. Panel 1B depicts the date at which an infected herd probabilistically reports an outbreak. Panel 1C illustrates the aggregated number of herds with any infected cattle per state, and the number of new reported outbreaks. The number of new reported outbreaks is skewed by contact tracing efforts and other time-varying factors, so the reported counts are not independent data samples. Therefore, we do not fit to outbreak incidence data, but rather to the date of first detection of an outbreak in each state (panel 1D).

Fig. 1: Schematic overview of model format and outputs.

Infection spreads from the initial infected state through export of cattle. A Cattle exports are stochastically generated using trade data from the United States Animal Movement Model (USAMM)23. B At each time step, a herd has a probability of testing, and notifying of an outbreak. C We aggregate the number of herds with any infected cattle by state, and the number of newly reported outbreaks, at each date. D We fit global epidemiological parameters and an ascertainment scaling parameter via particle Markov Chain Monte Carlo simulation (pMCMC). Using the posterior distributions of these parameters, we are able to produce further model simulations herein. Full methodological details are presented in Supplementary Material Section 2.

Figure 2 plots the simulated mean and 95% credible intervals (CrI) of the date of first outbreak detection and the number of reported outbreaks for each US state. After fitting the epidemiological parameters of the model via pMCMC22,24, we generated 20,000 stochastic realizations of the model with parameter estimates drawn from the posterior distributions of the fit parameters. All model results shown are from these stochastic realizations so as to present the full stochastic range of uncertainty rather than the optimized realizations from the pMCMC fits.

Fig. 2: Model simulations.

After fitting model parameters we simulate 20,000 stochastic realizations drawing from the parameter posterior distributions. Displayed is the epidemic trajectory from these simulations for each US state. A shows the date at which the first outbreak is detected in a state, a binary outcome. 0 indicates the state has not yet reported its first outbreak. 1 indicates that it has. Model simulation thus plots the proportion of the 20,000 realizations which have simulated a reported outbreak by this date. B shows the proportion of herds in each state which report new outbreaks per week, assuming no differences in ascertainment (parameter Aasc) between states. Red points depict data. The black line depicts the model mean, lightly shaded grey region depicts the 95% credible interval (95% CrI), and the darker shaded grey region depicts the 50% CrI.

The date of first detection in panel 2A is represented as a step function, where the black line in these plots shows the proportion of simulations that have had their first outbreak reported by that date in the respective state. The shaded areas show the 95% CrI of the modeled date of first outbreak in each state. Note that for the majority of states in panel 2A, such as Washington, the upper 95% CrI bound is the final date of the simulations. This should not be interpreted as meaning that dates beyond this point lie outside of the 95% CrI.
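The step function described above can be illustrated with a minimal sketch. The function name `first_detection_curve` and the toy input are hypothetical and are not taken from the study's code. The sketch assumes each simulation is summarized by the week index of its first reported outbreak in a state, or `None` if no outbreak is reported before the simulations end.

```python
from typing import List, Optional


def first_detection_curve(first_weeks: List[Optional[int]], n_weeks: int) -> List[float]:
    """Proportion of simulations whose first reported outbreak falls on or before each week.

    first_weeks: one entry per simulation; None means no outbreak was reported
                 within the simulated period (an assumed encoding).
    """
    n_sims = len(first_weeks)
    curve = []
    for week in range(n_weeks):
        detected = sum(1 for w in first_weeks if w is not None and w <= week)
        curve.append(detected / n_sims)
    return curve


# Toy example: four simulations, one of which never reports an outbreak.
toy = [2, None, 0, 2]
print(first_detection_curve(toy, n_weeks=4))  # [0.25, 0.25, 0.75, 0.75]
```

In this toy case the curve never reaches 1 because one simulation never reports an outbreak, which is the same situation that places the upper credible bound at the final simulated date for many states.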

Panel 2B shows the proportion of dairy herds in each state reporting new outbreaks each week from December 18th 2023 to December 2nd 2024. Both panels illustrate that the majority of outbreaks are currently concentrated along the West Coast of the country. The model forecasts that states in the Midwest and Florida are the most probable next states to declare their first outbreak. This trend is due to the epidemic beginning in Texas, which exports primarily to nearby West Coast states.

The model overestimates the number of reported outbreaks in some states. For example, Texas, New Mexico, and Ohio all feature simulations whose credible interval does not contain the observed data. While our model assumes differences in outbreak detection due to differences in herd sizes by state, we do not assume further intrinsic state-varying differences in outbreak detection. In reality, differences in public health resourcing and messaging will impact outbreak detection rates. 72% of outbreaks reported as of December 9th 2024 have been in California. Because California contributes the majority of the epidemiological data, model fits are mostly tuned to the detection rates observed there. Therefore, overestimation by the model can be interpreted as under-reporting within a state relative to baseline reporting efforts in California, as seen most strongly in the case of Arizona (Fig. 2A). The simulated number of infected herds, that is, the number of herds with any infected cows on the premises, is shown in Supplementary Material Section 3.1.

Twenty-six of the 48 US states (54%) observed an outbreak of H5N1 before December 2nd 2024 in the majority of model simulations (> 50% of simulations, Table 1). Based on these probabilities, one would expect to have observed outbreaks in a mean of 27 (95% CrI 22–32) states by December 2nd 2024, assuming all states reported outbreaks equally. In actuality, only 16 states identified and reported outbreaks in this time period, indicating a high degree of under-reporting compared to the high baseline set by California.

Table 1 Reported outbreaks

We note that simulated incidence levels have a bimodal distribution. Many simulations never see H5N1 emerge in a particular state, which is why the 95% CrIs in Fig. 2 often span 0. Thus, this mean value is not the most probable outcome, but should be interpreted alongside the proportion of simulations which see no infections in particular states, as provided in Table 1. Particularly narrow 95% CrIs are seen in Fig. 2A for Texas, Ohio, New Mexico, and Kansas, due to the seeding of cases in these states as detailed in the Methods.

These results demonstrate how the composition of the dairy sector in each state has a significant impact on the overall epidemic dynamics. For example, while Florida is increasingly likely to report an outbreak (Fig. 2A), the expected proportion of herds reporting outbreaks in Florida remains low (Fig. 2B). Three mechanisms contribute to this. First, states with larger herd sizes present greater opportunities for infection to spread quickly within the respective holdings. This then poses a greater risk of contaminating neighboring herds through shared workers, equipment, grazing space, or environmental runoff. Secondly, larger population holdings are observed to import larger numbers of cattle, hence increasing the probability of infection, as only up to 30 cows are currently tested during inter-state transfer15. Thirdly, our model assumptions of ascertainment trend towards larger holdings being more likely to report outbreaks, as has been observed in real-world reporting to date3 (Fig. 3). The respective sizes of each state’s dairy industry are provided in Supplementary Material Section 1.

Fig. 3: Ascertainment rate assumptions.

A shows how the modeled baseline probability of reporting an outbreak depends on the number and proportion of infected cattle in a herd. Our model assumes that the probability that an infected herd reports an outbreak depends on the size of the holding, and the number of infected cattle on that date. B shows the mean and 95% CrI per-herd probability a herd reports an outbreak by US state, assuming every herd has 10% of its cattle infected. The credible interval captures the variation in herd sizes and the posterior distribution of the ascertainment rate parameter. C maps the mean values shown in (B).

Our model assumes that each herd that has not yet reported an outbreak has a probability of declaring an outbreak at each date. This probability depends on the absolute number of infected cattle in the herd and on the proportion of the herd that is currently infected. This functional form (Fig. 3A) was designed after discussion with veterinarians based on their experience with on-farm callouts. This baseline probability is then further scaled by an ascertainment rate model parameter, which is estimated in model fitting (Table 2). Alternate ascertainment rate assumptions are presented as sensitivity analyses in section 3.2.3 of the Supplementary Material.

Table 2 The Prior distributions and posterior intervals for all fit model parameters

We calculate the mean probability that a randomly selected herd in each state will report an outbreak, given that 10% of its animals are infected. These values ranged from 0.412 in California, a state with a greater number of large herds, to 0.092 in West Virginia (Fig. 3B, C). We see that states with a greater number of large herds are more likely to report outbreaks than other states. Correspondingly, California has reported the vast majority of outbreaks to date (Table 1).

Current federal orders require that, when exporting cattle interstate, up to 30 randomly-chosen cows from the exported cohort will be tested for H5N1, and only if all tested cattle register negative tests will the export take place15. Thus, exports of 30 or fewer cattle will have all cows tested, and exports of more than 30 cattle will have only 30 randomly selected cows tested. The results of these tests, be they positive or negative, are not currently reported to health authorities. We output from our model simulations the expected rates of export test positivity per state. This takes into account the expected number of cattle being exported.

Figure 4 shows the mean probability by state of such an export testing positive. We use the 20,000 simulation runs produced in Fig. 2 to sample 20,000 national epidemic trajectories for each herd. For each herd, and for each time point, we assume that it exports cattle, and sample how many cattle it will be exporting. We then calculate the probability of these cattle testing positive via the density of a hypergeometric distribution. Figure 4 displays the mean probability over all herds and all 20,000 stochastic realizations. The 95% CrIs are provided in Supplementary Material Section 3.1.
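The hypergeometric calculation described above can be sketched as follows. This is an illustrative reading, not the study's implementation: the function name and arguments are hypothetical, the test is assumed to be perfectly sensitive and specific, only currently infected cows are assumed to test positive, and the exported cohort is assumed to be a uniform random sample of the herd, so that the tested cows are effectively a random sample of the herd.

```python
from math import comb


def prob_export_tests_positive(herd_size: int, infected: int,
                               cohort_size: int, max_tested: int = 30) -> float:
    """Probability that at least one tested cow in an export is infected.

    Assumes tested cows are drawn at random without replacement from the herd,
    so the number of positives among them is hypergeometric.
    """
    if not 0 <= infected <= herd_size:
        raise ValueError("infected must lie between 0 and herd_size")
    if not 0 <= cohort_size <= herd_size:
        raise ValueError("cohort_size must lie between 0 and herd_size")
    n_tested = min(cohort_size, max_tested)  # up to 30 cows are tested per export
    if n_tested == 0:
        return 0.0
    # Hypergeometric probability of drawing zero infected cows among n_tested.
    p_all_negative = comb(herd_size - infected, n_tested) / comb(herd_size, n_tested)
    return 1.0 - p_all_negative


# Hypothetical herd of 1000 cows with 10 infected, exporting a cohort of 200.
print(prob_export_tests_positive(herd_size=1000, infected=10, cohort_size=200))
```

Averaging this quantity over herds and over stochastic realizations would give a state-level mean of the kind mapped in Fig. 4, under the assumptions stated above.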

Fig. 4: Probability of positive border testing.

We calculate the probability of an export of cattle out of each state testing positive from 20,000 stochastic model simulations. When moving cattle inter-state, up to 30 cattle will be tested for H5N1 per export. Panels show the state average per-herd probability that, should a herd export cattle, it would test positive at: A week beginning April 15th 2024, B week beginning August 19th 2024, and C week beginning December 2nd 2024.

Lastly, we use the model to assess the impact that interstate testing has had on the epidemic trajectory. We consider two counterfactual scenarios. Scenario 1) weaker measures—we assume no restrictions are introduced, no testing is required when exporting cattle, and thus all interstate exports proceed unabated. Scenario 2) stronger measures—we assume that the federal order was implemented 28 days earlier, on April 1st 2024, and that up to 100 cattle are tested instead of 30.

Considerable stochastic variation is seen across all scenarios, though we do see a reduction in the mean values of all infection measures under scenario 2 (stronger measures), and an increase in the mean values under scenario 1 (weaker measures), compared with the baseline scenario (Fig. 5). For the week beginning December 2nd 2024, under baseline model assumptions, the model simulates a national total of mean 120.9 new reported outbreaks (95% CrI 15–518), compared to an increased mean of 150.7 outbreaks (95% range 17–632) under the no-interventions scenario 1, and a reduced mean of 93.4 outbreaks (95% range 11–407) under the stronger measures of scenario 2.

Fig. 5: Border testing intervention counterfactuals.

A The number of new reported outbreaks weekly. B The number of herds nationally with any infected cattle. C The total number of infected cows nationally over time. Solid lines show simulation mean. Shaded regions show 95% CrI. Blue (True measures) depicts baseline model assumptions, whereby up to 30 cows in each inter-state export are tested starting from April 29th 2024. Red depicts the scenario with no border testing. Green depicts border testing of up to 100 cows from each export, implemented 28 days earlier, on April 1st 2024.

Figure 5 shows that under each scenario, the epidemic continues to grow—meaning border testing measures alone are insufficient to effectively curb the epidemic. Stronger, farm-focused intervention measures would be required to reduce transmission sufficiently to achieve control.

Sensitivity analyses

All results are also produced under four alternate modeling assumptions. Supplementary Material section 3.2.1 considers alternate likelihood assumptions. Supplementary Material section 3.2.2 infers cattle exports from exact 2016 ICVI export data. Supplementary Material section 3.2.3 considers simplified ascertainment rate assumptions—where ascertainment is proportional only to the proportion of the herd infected. Due to the relatively short time frame considered, and unclear evidence as to the extent of mortality or culling, we did not include birth-death processes within our model. Supplementary Material section 3.2.4 considers the dynamic impact of including such birth-death mechanisms. Our conclusions are unchanged in all of these sensitivity analyses.

Discussion

Our study presents the first herd-level dynamic model of highly pathogenic avian H5N1 influenza transmission in US dairy cattle across the continental United States. By synthesizing existing data on dairy herd population sizes and cattle trade patterns, we recreate the spread of the virus from an initial seeding in Texas on December 18th 2023, through to the week beginning December 2nd 2024.

The model projects that the majority of the initial national disease burden is focused within West Coast states, due to their existing trade patterns with Texas, and the size of their respective dairy industries. However, East Coast states are not without risk of currently housing infected herds, as our model suggests that a considerable degree of under-reporting is misrepresenting the true size of the epidemic. A clear result from Fig. 2 and Table 1 is that some states are particularly likely to be home to infected herds, but have yet to identify and report infections. Most notable are Arizona, Wisconsin, Indiana, and Florida. Arizona has the largest mean herd size in the country (Supplementary Material Section 1), and extensive trade connections with Texas and California (Supplementary Material Section 2.4), both states particularly burdened with infection. Wisconsin, while farther from the epidemic epicenter, has the largest number of dairy herds in the country (6216). While Florida has a modestly sized dairy sector, and is located on the East Coast, it has one of the highest mean herd sizes in the country, as its industry is predominantly made up of a few very large holdings. It also imports more cattle from Texas than its neighbors. Indiana has a high likelihood of infection due both to its very high number of dairy herds and to its frequent trading links with Wisconsin. Table 1 shows that, while it is not implausible that no infections have established within these states, the probability of this is low, with Wisconsin in particular reporting no outbreaks in only 1.9% of model simulations. In only 22 of the 48 continental US states did our model predict zero reported outbreaks in > 50% of model simulations (Table 1). Figure S20 of the Supplementary Material visualizes the herd population sizes of each state against the frequency of imports from Texas, demonstrating the relationship between herd sizes and outbreak likelihood.

The model also demonstrates how the distribution of cattle populations in each state mechanistically impacts the rate of reporting. Figure 3 shows that, due to many West Coast states housing large populations of dairy cattle in single herds, they have a higher-than-average likelihood of reporting outbreaks. This is reflected in the outbreak data. California has reported over 8 times as many outbreaks as the state with the next highest number of reported outbreaks. Our model suggests that this can be explained by the fact that the average herd size in California is significantly higher, and not necessarily due to more robust epidemiological investigation attempts in the state.

The only national intervention mandated to date is the testing of cattle exported interstate. Up to 30 cows in an exported cohort are tested for H5N1, and must test negative for the export to proceed. Figure 4A shows that, early in the epidemic, Texas was one of the only states with a non-negligible probability of cattle testing positive at export, though we note that such interventions were only brought in from April 29th 2024. By August (panel 4B), Texas had a greater than 40% mean probability of an export testing positive. By December of 2024, our model predicts that infections in Texas may have begun to decrease, and a more uniform probability of positivity is observed across the country. According to the USAMM, a mean 29,590 (IQR 922) interstate exports of dairy cattle occur every year23. Given that such testing is mandated to occur, it would be prudent to report such testing to verify against our expected positivity rates and better refine model estimates.

Our model has also demonstrated that the border-testing intervention alone, while a valuable (if unrealised) opportunity for surveillance, is insufficient to control the spread of H5N1 influenza. We explored the counterfactual scenario of stronger border testing measures, of up to 100 cows, and introduced 28 days earlier, on April 1st 2024. Despite a slight reduction in the mean number of outbreaks under this scenario, the fundamental epidemic dynamics remained unchanged, with infections and outbreaks continuing to increase as the year continued. This suggests that targeted biosecurity interventions at farm level, such as postmilking teat dipping and the use of disposable wipes for premilking teat disinfection25, and interventions between herds such as boot dips at facility entrances, clothing disinfection post-site visit, or greater emphasis on adequate personal protective equipment26 will be required (Supplementary Fig. S19). Additionally, better outreach with industrial partners should be pursued. On May 10th 2024, the U.S. Department of Agriculture (USDA) provided a total of $98 million to support biosecurity measures27,28, whereby individual farms could apply for up to $28,000 to implement protocols such as secure milk plans, disposal of infected milk, veterinarian costs, and testing costs. As of January 9th 2025, only 510 premises have applied for this additional funding29. On May 30th 2024, the USDA announced a further $824 million was being allocated to a nationwide voluntary Dairy Herd Status Pilot Program, whereby premises could apply for free routine milk surveillance. The 2022 US Agricultural Census lists 36,024 dairy farms. As of January 9th 2025, only 75 herds have enrolled for the voluntary testing program30. Evidently, voluntary measures are currently failing to see sufficient uptake.

Data availability has been poor throughout the epidemic, the only epidemiological data stream being the number of reported outbreaks. Due to a lack of uniform surveillance or testing, uncertainty surrounding state-level infection levels is large, as demonstrated in Fig. 2. Uncertainty is further compounded by the probabilistic nature of our modeled export assumptions, necessitated by a lack of precise movement data in this period. Many other jurisdictions, including the European Union, enforce mandatory identification of all premises, individual cattle, and movement of animals, often by electronic tagging methods31. The US has no such requirement. Additionally, since veterinary and public health responses are governed at the state level, individual states vary greatly in the measures, resources, and interventions they have applied to limit spread. Reported outbreak incidence data are not sufficient to reasonably quantify these state-level differences. The most valuable enhancement to current surveillance would be stratified and systematic sentinel testing for infection, with reporting of both positive and negative test results. This would allow overall assessment of infection prevalence within farms, and estimation of the proportion of herds with any level of infection, which in turn would allow better estimation of the risks of onward infection through cattle trade. A further valuable source of data would be the publication of the results of pre-export cattle testing currently being undertaken. Figure 4 shows our current estimates of the rates of positive tests at export, against which such data might be compared, if released.

While our analysis suggests that some of the earliest infected states may have passed the peak of their epidemics, Fig. 2 suggests that many more states will still be in the early stages of their epidemics. Importantly, our model also does not capture the role of either re-infection, or the emergence of new, more adapted, clades of the virus (though studies have shown that initial infection confers strong protection against reinfection32). Our analysis suggests that dairy herd outbreaks will continue to be a significant public health challenge in 2025, and that more urgent interventions are sorely needed. Early economic models of the impact of the epidemic on the US dairy sector project economic losses ranging from $14 billion to $164 billion12. Additionally, 35 human spillover cases from cattle17 have been reported to date. The longer the epidemic persists in a novel mammalian reservoir, the greater the risk of further human spillovers and viral adaptations to human hosts. Recent research suggests only minimal genetic distance separates the currently circulating clade from adaptation to human receptor binding18, and such adaptation has already occurred to improve virus replication in bovine and primary human airway cells33.

Our work is not without limitations. Most important is that, due to insufficient epidemiological data, we had to make strong assumptions about the probability of ascertainment, that is, whether or not an infected herd is identified and reported. Figure 3 outlines the implications of these assumptions, but the wide credible interval for our estimate of the ascertainment parameter Aasc reflects these data limitations. Additionally, because the US does not employ a mandatory electronic tagging system, there is no way to accurately capture the precise cattle movements for 2024. While we were provided with the 2016 ICVI data utilised in Cabezas et al.14, comparison with USAMM model simulations indicated that precise inter-state exports might vary greatly year-to-year. Therefore, assuming identical movements to 2016 could introduce significant bias into the results. Thus, we instead take the probabilistic approach, whereby the exports of cattle are probabilistically determined through model simulations according to the USAMM model23. While this introduces further uncertainty into the model, it accurately demonstrates how poor data availability regarding precise 2024 cattle movement hampers epidemic forecasting efforts. We nonetheless present model results fit using this 2016 ICVI data as a sensitivity analysis in Supplementary Material Section 3.2.2.

Additionally, our work does not consider the dynamic impact of other zoonotic reservoirs. The ongoing H5N1 epidemic in the US is also heavily impacting the poultry industry, with 662 counties reporting outbreaks as of March 3rd 202534. Modeling the disease in poultry is significantly more challenging due to the role played by wild bird migration35, and our current model does not consider spillover from other animal populations. Further work identifying farm sites which house multiple host species would be an important next step in identifying points of spillover risk between reservoir animals, presenting a risk of further genetic reassortment.

In conclusion, our model demonstrates that we cannot definitively conclude that the current number of reported outbreaks is a true representation of the scale of the current H5N1 influenza epidemic in dairy cattle. Significant under-reporting is likely, and the differences in dairy herd population distributions across states have aided in spreading disease across the west coast. Current mandatory interventions are insufficient for controlling the spread of disease, and voluntary testing and interventions are severely under-utilised. Significant increases in testing are urgently required to reduce the uncertainty of model projections and provide decision-makers with a more accurate picture of the true scale of the national epidemic.

Methods

Infection seeding

We seeded the epidemic with five infected cows in a mid-size herd in Texas, in the week beginning December 18th 2023, based on phylogenetic analyses6. For the stochastic realizations, we also seeded nine additional herds in accordance with the nine early outbreaks detailed in Caserta et al.3. The herd size, number of infected cattle, and date of seeding are consistent with the data presented in that manuscript.

Epidemiological dynamics

We construct a stochastic metapopulation SEIR model36 with 35,974 individual herds of varying population size, informed by the 2022 US Agricultural Census11. Each herd’s infection dynamics are the stochastic equivalent of the following set of ordinary differential equations (ODEs):

$$\frac{d{S}_{i}}{dt}= -\beta {S}_{i}\left(\frac{{I}_{i}}{{N}_{i}}+\alpha \frac{{I}_{-i}}{{N}_{-i}}\right),\\ \frac{d{E}_{i}}{dt}= \beta {S}_{i}\left(\frac{{I}_{i}}{{N}_{i}}+ \alpha \frac{{I}_{-i}}{{N}_{-i}}\right)-\sigma {E}_{i},\\ \frac{d{I}_{i}}{dt}= \sigma {E}_{i}-\gamma {I}_{i},\\ \frac{d{R}_{i}}{dt}= \gamma {I}_{i}.$$
(1)

Here, Si, Ei, Ii, and Ri are the number of susceptible, exposed, infected and recovered cows in herd i. Ni is the total population of herd i. β, σ, and γ are the transmission, incubation, and recovery rates respectively. α is a model parameter between 0 and 1 controlling the rate of transmission between herds in the same state. I−i and N−i are the total number of infected cattle, and the total number of all cattle, in the US state herd i resides in, not including the cattle in herd i itself. Early epidemiological surveys of farms reporting outbreaks found that transmission routes existed between herds in the same state through the shared use of equipment, staff, or the movements of wild birds37, which we capture here in the model. We assume no such forms of transmission can occur between herds in different US states.

In the stochastic analogue of the above ODEs, we calculate the number of cattle progressing between epidemiological compartments via binomial distributions, for each time step dt, as:

$${n}_{SE}^{i} \sim \, {{\rm{Binomial}}}\,\left({S}_{i},\,1-\, {{\rm{exp}}}\,\left(-\beta \left(\frac{{I}_{i}}{{N}_{i}}+\alpha \frac{{I}_{-i}}{{N}_{-i}}\right)dt\right)\right),\\ {n}_{EI}^{i} \sim \, {{\rm{Binomial}}}\,\left({E}_{i},\,1-\, {{\rm{exp}}} \,(-\sigma \,dt)\right),\\ {n}_{IR}^{i} \sim \, {{\rm{Binomial}}}\,\left({I}_{i},\,1-\, {{\rm{exp}}} \,(-\gamma \,dt)\right).$$
(2)

Here \({n}_{XY}^{i}\) is the number of cattle moved from compartment X to Y (for general X and Y), in herd i, in a time step of size dt.
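To make Eq. (2) concrete, the following is a minimal Python sketch of a single stochastic time step for one herd. It is not the cowflu implementation (which is written in C++). The function names `binomial` and `seir_step` are hypothetical, the binomial sampler is a simple standard-library stand-in, and the parameter values in the usage lines are illustrative placeholders rather than fitted estimates.

```python
import math
import random


def binomial(n, p, rng):
    """Draw from Binomial(n, p) by summing n Bernoulli trials (adequate for a sketch)."""
    return sum(1 for _ in range(n) if rng.random() < p)


def seir_step(S, E, I, R, I_state, N_state, beta, alpha, sigma, gamma, dt, rng):
    """Advance one herd by one time step of size dt, following Eq. (2).

    I_state and N_state are the infected and total cattle in the rest of the
    herd's US state, i.e. the I_{-i} and N_{-i} terms.
    """
    N = S + E + I + R
    within = I / N if N > 0 else 0.0
    between = I_state / N_state if N_state > 0 else 0.0
    force_of_infection = beta * (within + alpha * between)

    # Convert each rate into a per-step transition probability.
    p_SE = 1.0 - math.exp(-force_of_infection * dt)
    p_EI = 1.0 - math.exp(-sigma * dt)
    p_IR = 1.0 - math.exp(-gamma * dt)

    # Draw all transitions from the state at the start of the step.
    n_SE = binomial(S, p_SE, rng)
    n_EI = binomial(E, p_EI, rng)
    n_IR = binomial(I, p_IR, rng)

    return S - n_SE, E + n_SE - n_EI, I + n_EI - n_IR, R + n_IR


# Illustrative usage: a herd of 200 cows seeded with five infected animals.
rng = random.Random(1)
state = (195, 0, 5, 0)
for _ in range(10):
    state = seir_step(*state, I_state=0, N_state=1000, beta=1.5, alpha=0.05,
                      sigma=1.0, gamma=0.5, dt=1.0, rng=rng)
```

Because every transition is drawn from the compartment sizes at the start of the step, no compartment can become negative and the herd total is conserved within a step.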

After all cattle movements between epidemiological compartments are concluded, we calculate, for each herd that has yet to report an outbreak, whether it will report an outbreak in that time step. A herd reports an outbreak with probability \({P}_{i}^{\,{\mbox{outbreak}}\,}=1-{e}^{-{\phi }_{i}}\), where \({\phi }_{i}\) is

$${\phi }_{i}=\left(\frac{{I}_{i}}{{(0.7{N}_{i})}^{0.95}}+\frac{{I}_{i}}{150}\right)\,{A}^{{{\rm{asc}}}}\,dt,$$
(3)

and Aasc is a model parameter that we fit. The bracketed term to the left of Aasc in Eq. (3) is shown in the heatmap of Fig. 3A. This functional form was developed in consultation with veterinarians, based on their experience of the stage of pathogen spread at which they are typically consulted. While US states undoubtedly vary in their detection capabilities, there is insufficient outbreak data to fit unique Aasc values for each state. Assuming one national Aasc parameter allows us to identify the states in which the absence of reported outbreaks to date is driven mostly by under-reporting (Fig. 2B).
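As an illustration of how Eq. (3) behaves, the short Python sketch below evaluates the per-step reporting probability for a single herd. The function name `outbreak_report_probability` and the argument name `a_asc` are hypothetical, and the values in the usage lines are arbitrary placeholders rather than fitted estimates.

```python
import math


def outbreak_report_probability(I, N, a_asc, dt):
    """Probability that a not-yet-reported herd reports an outbreak in one step (Eq. 3)."""
    if I <= 0 or N <= 0:
        return 0.0
    phi = (I / (0.7 * N) ** 0.95 + I / 150.0) * a_asc * dt
    return 1.0 - math.exp(-phi)


# Illustrative comparison: the same ten infected cows in a small and a large herd.
p_small = outbreak_report_probability(I=10, N=100, a_asc=0.1, dt=1.0)
p_large = outbreak_report_probability(I=10, N=10000, a_asc=0.1, dt=1.0)
```

Under this functional form, a fixed number of infected cows is more likely to trigger a report in a small herd than in a large one, because the first term scales with the infected fraction, while the second term depends only on the absolute number of infected cows.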

Movement of cattle between herds

After calculating the movement between epidemiological compartments and any reporting of outbreaks, we then calculate the movement of cattle between herds. As detailed in Supplementary Material Section 2.4, we infer from the USAMM the probability, \({P}_{k}^{\,{\mbox{export}}\,}\), for each US state, k, that a herd within that state will export cattle each week. We assume the same probability for every herd in the state. We also calculate the proportion of cows in the origin herd that will be exported, \({P}_{k}^{\,{\mbox{export size}}\,}\), from the USAMM export simulations, which include cohort size and size of origin herd. Should an export of cattle occur, we also calculate the probability of each US state being its destination. This is parameterized by a movement matrix M, where element Mk,l denotes the probability that an export from state k will go to state l. This matrix describes the patterns of interstate movement, and its diagonal represents the probability of an export remaining within the same state. The exact matrix is provided as Supplementary Data. Once the destination state is determined, we randomly select the destination herd within that state, with selection probability scaled by herd population size so as to preserve herd sizes. Once an origin herd, i, and destination herd, j, are assigned, we draw the number of cattle to be exported as

$${n}_{{S}_{i}{S}_{j}} \sim \, {{\rm{Binomial}}}\,\left({S}_{i},\,{P}_{k}^{\, {{\rm{export}}} \, {{\rm{size}}}\,}\,dt\right),\\ {n}_{{E}_{i}{E}_{j}} \sim \, {{\rm{Binomial}}}\,\left({E}_{i},\,{P}_{k}^{\, {{\rm{export}}}\, {{\rm{size}}}\,}\,dt\right),\\ {n}_{{I}_{i}{I}_{j}} \sim \, {{\rm{Binomial}}}\,\left({I}_{i},\,{P}_{k}^{\, {{\rm{export}}}\, {{\rm{size}}}\,}\,dt\right),\\ {n}_{{R}_{i}{R}_{j}} \sim \, {{\rm{Binomial}}}\,\left({R}_{i},\,{P}_{k}^{\, {{\rm{export}}}\, {{\rm{size}}}\,}\,dt\right),$$
(4)

where k is the US state that origin herd i resides in. Lastly, before moving cattle between the respective compartments of herds i and j, we simulate the border testing mandate. If the model date is after April 29th 2024, we draw a random variable, X, from a hypergeometric distribution:

$$X \sim \, {{\rm{Hypergeometric}}}\,\left({n}_{{I}_{i}{I}_{j}},\,{n}_{{S}_{i}{S}_{j}}+{n}_{{E}_{i}{E}_{j}}+{n}_{{R}_{i}{R}_{j}},\,\,{{\rm{min}}}\,(30,\,{n}_{{N}_{i}{N}_{j}})\right).$$
(5)

Here, the three parameters of the hypergeometric distribution are the number of success items in the population, the number of failure items in the population, and the number of samples taken without replacement from the population, with \({n}_{{N}_{i}{N}_{j}}\) the total number of cattle in the export cohort. X is the number of infected cattle drawn. If X = 0, then no infected cattle are detected, and the export takes place. Note that a positive test prevents the export, but does not immediately register as a reported outbreak. All probabilities and a full logic flow diagram are presented in Supplementary Material Section 2. U.S. state boundaries were obtained using the maps package in R (via map_data("state")) and visualized with ggplot2.
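The export draw of Eq. (4) and the border test of Eq. (5) can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not the cowflu code: the function names are hypothetical, the hypergeometric draw is emulated by sampling without replacement from an explicit list of animals, and, as in Eq. (5), only cattle in the infected compartment count as detectable.

```python
import math
import random


def binomial(n, p, rng):
    """Draw from Binomial(n, p) by summing n Bernoulli trials (adequate for a sketch)."""
    return sum(1 for _ in range(n) if rng.random() < p)


def draw_export(S, E, I, R, p_export_size, dt, rng):
    """Draw the exported cohort from each compartment of the origin herd (Eq. 4)."""
    p = p_export_size * dt
    return tuple(binomial(x, p, rng) for x in (S, E, I, R))


def border_test(cohort, rng, max_tests=30):
    """Return X, the number of infected cows among those tested (Eq. 5)."""
    n_S, n_E, n_I, n_R = cohort
    # True marks an infected (detectable) animal, False any other animal.
    animals = [True] * n_I + [False] * (n_S + n_E + n_R)
    tested = rng.sample(animals, min(max_tests, len(animals)))
    return sum(tested)


def prob_cohort_passes(n_infected, n_other, max_tests=30):
    """Exact P(X = 0): no infected cow among the tested animals."""
    cohort = n_infected + n_other
    if cohort == 0:
        return 1.0
    n_tests = min(max_tests, cohort)
    return math.comb(n_other, n_tests) / math.comb(cohort, n_tests)


# Illustrative usage with placeholder values (not fitted or inferred quantities).
rng = random.Random(2)
cohort = draw_export(S=900, E=20, I=60, R=20, p_export_size=0.1, dt=1.0, rng=rng)
export_proceeds = border_test(cohort, rng) == 0
```

Any cohort of 30 or fewer cows is tested in full, so a single infected animal always blocks the export. In larger cohorts the chance of missing an infection grows with cohort size: for example, `prob_cohort_passes(1, 59)` is 0.5, since exactly half of a 60-cow cohort is tested.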

cowflu package

To efficiently simulate the above probabilistic model, we produced a custom R package, cowflu38, which allows the model to be simulated and fitted via the dust2 package22 in R, while the model itself is written in C++. Documentation on the use of the package and worked vignettes can be found in our GitHub repository: https://github.com/mrc-ide/cowflu. The package can be applied to any SEIR metapopulation model with custom probabilities of movement between sub-populations, subject to user-defined movement matrices.

Model fitting

Five of the above model parameters (β, α, σ, γ, and Aasc) are fit via particle Markov Chain Monte Carlo (pMCMC) methods24. We assign weakly-informative prior distributions, informed by early studies associated with the current outbreak39. We fit the model-simulated dates of first outbreak detection (as seen in Fig. 2A) to their real-world equivalents, via a likelihood function detailed in Supplementary Material section 2.5. We ran the pMCMC simulations across 16 chains of 40,000 iterations each. Model convergence statistics are presented in Supplementary Material section 2.5.

Table 2 shows the priors and posteriors for all model parameters. Note that we fit \(\frac{\beta }{\gamma }\) instead of β due to observed correlation between β and γ, so as to improve chain mixing.

Reporting summary

Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.