Introduction

Animal agriculture is highly concentrated in the United States (U.S.). The majority of poultry, cattle, and swine are held in a small subset of the largest Animal Feeding Operations (AFOs). For example, while there were over 74 million swine on farms in the U.S. in 2022, more than three-quarters of these animals were held in just the largest five percent of operations1. Similarly, over 75% of the 389 million layer chickens held on farms in the U.S. in 2022 were held in less than one percent of all operations with layer chickens2.

Subject to limited environmental regulation and reporting requirements, AFOs generate hundreds of millions of tons of manure each year3. The storage, handling, processing, and land-application of this manure, along with other practices at these operations, result in large emissions of ammonia (NH3), hydrogen sulfide (H2S), and volatile organic compounds (VOCs). Livestock activities and manure management also result in direct emissions of primary particulate matter (PM) and contribute to elevated ambient concentrations of PM through the release of pollutants that act as precursors to secondary PM formation4,5,6.

Globally, over 80% of atmospheric NH3 emissions are attributed to agriculture7. In the U.S., livestock waste is responsible for 2.7 million tons of NH3 emissions per year, or nearly half of all NH3 emissions produced nationally8. These emissions can impact biodiversity and contribute to eutrophication and soil acidification9. Once in the atmosphere, NH3 can further contribute to air pollution by reacting with aerosols to generate PM2.5, in forms such as ammonium nitrate, ammonium sulphate, and ammonium chloride10. In some regions of the U.S. with high concentrations of AFOs, nearly half of total anthropogenic PM2.5 pollution is derived from reactions between NH3 emitted from livestock and fertilizer use and other atmospheric gases11. Elevated PM concentrations can contribute to haze and acid rain, impact plant biodiversity, alter soil and water chemistry, and influence weather and climate12,13.

While H2S emissions tend to be lower in magnitude than NH3 emissions and have not been characterized as extensively, data from regions with high concentrations of AFOs in North Carolina suggest that agricultural H2S emissions can be among the top contributors to total sulfur emissions14. Animal manure and feed storage and handling also generate a diversity of VOCs, which can lead to acid deposition, smog, and atmospheric ozone formation15.

Emissions of NH3, H2S, and VOCs, and elevated ambient concentrations of PM can contribute to both chronic and acute health impacts, and communities in proximity to AFOs may be exposed to multiple overlapping sources and forms of air pollution16,17. NH3 emissions from AFOs can contribute to eye, nose, and throat irritation, headache, nausea, diarrhea, cough, nasal congestion, drowsiness, and alterations in mood for workers and surrounding communities18,19. Even at low levels of exposure, H2S emissions from AFOs can contribute to odor, nasal issues, and respiratory problems, including chest tightness and asthma20. PM2.5 is small enough to penetrate the thoracic region of the respiratory system21 and is associated with chronic obstructive pulmonary disease, acute lower respiratory infections, cerebrovascular disease, ischemic heart disease, and lung cancer22,23. Living in proximity to swine or dairy AFOs increases the risk of asthma24,25 and can even increase the likelihood of mortality26. Indeed, secondary PM2.5 formed from NH3 emissions from livestock waste handling and storage is estimated to cause 6900 premature deaths per year, and secondary PM2.5 formed from NH3 emissions from fertilizer application, including manure, is estimated to cause an additional 4900 premature deaths per year27.

Air pollution from AFOs has the greatest impact on adjacent communities which, across the U.S., are disproportionately minority populations28. For example, in North Carolina, the number of swine AFOs in census blocks with high poverty and high percentages of nonwhite residents is severalfold greater than that in wealthier or whiter census blocks29. Similar geographic patterns have been documented in other states, including Maryland, Mississippi, Wisconsin, and Iowa, as well as for other types of AFOs including dairy and poultry operations30,31,32,33,34. These spatial distributions result in disparities in exposure to air pollution, with communities of color and low-income communities exposed to greater ambient concentrations of NH3, H2S, and PM2.5 (refs. 28,35,36). Individuals with the greatest exposure to AFO air pollution tend to lack reliable access to medical care, further compounding physical and mental health impacts of AFOs28,37.

Despite the magnitude and impact of air pollution from AFOs, these emissions persist with minimal regulation in the U.S. Efforts to regulate agricultural pollution have been hindered in part by a purported lack of methodologies to estimate facility-scale emissions. Here, we provide a brief history of federal policies related to air pollution from AFOs in the U.S., and we evaluate recent progress towards the development of air pollution models to allow for improved regulation. We then provide an overview of methods for estimating emissions, and we identify key opportunities for accelerating regulation based on existing data and future sampling efforts.

Regulatory history of air pollution from industrial animal agriculture

U.S. Environmental Protection Agency (EPA) regulations define AFOs as agricultural operations where animals are kept and raised in confined situations for 45 days or more38. Multiple facilities under common ownership are considered a single operation if they adjoin each other or use a shared waste management system. Concentrated Animal Feeding Operations (CAFOs) are AFOs that meet certain size thresholds, or which have been designated as such by a regulatory agency based on factors including size, location, or waste management practices. EPA regulations also provide categorical size designations for CAFOs, with Large CAFOs referring to operations that confine at least a certain number of animals—for example, 700 mature dairy cows, 2500 swine weighing 55 pounds or more, or 30,000 laying hens or broilers in AFOs with liquid manure handling systems39.

Air pollutant emissions in the U.S. are subject to regulation under multiple statutes, including the Clean Air Act (CAA), 42 U.S. Code § 7401 et seq., and the Emergency Planning and Community Right-to-Know Act (EPCRA), 42 U.S. Code § 11001 et seq. Under the CAA, facilities that emit air pollutants in sufficient quantities may be required to obtain operating permits40. Similarly, under EPCRA, facilities that produce, use, or store certain hazardous substances, including NH3 and H2S, must report unpermitted releases of those substances above certain “reportable quantities” to state, tribal, and local authorities, which, in turn, must make information from the reports available to the public41. For both NH3 and H2S, the EPCRA reportable quantity is 100 lbs/day42.

Existing literature suggests that many AFOs generate air pollutant emissions well above relevant statutory thresholds4,5,35,43,44. However, for over two decades, EPA has functionally excused AFOs from compliance with air pollution statutes, at least partially as a result of delays in the development of an officially sanctioned methodology for estimating air pollution emissions from the sector. Beginning in 2005, EPA entered into an agreement with nearly 13,900 dairy, poultry, and swine AFOs, allowing these operations to avoid liability for past and ongoing violations of the CAA and EPCRA provided that the AFOs pay a penalty and contribute to a $15 million USD fund to cover the cost of a National Air Emissions Monitoring Study (NAEMS)45,46. The agreement did not include beef cattle facilities, and these facilities were not monitored in the NAEMS. The majority of participants in the agreement do not publicly disclose operation details such as animal inventory; however, participants include at least 1669 AFOs that meet EPA’s designation of Large CAFO and at least 71 facilities that are ten times as large as this threshold47.

EPA completed data collection for the NAEMS, which included measurements of emissions of NH3, H2S, PM, and VOCs from barns and lagoons at 25 dairy, poultry, and swine AFOs in 10 states between 2007 and 2009 (Supplementary Table 1, Fig. 1). EPA initially estimated that the consent agreement would come to an end by 2010, at which point, the agency anticipated that it would have finalized models to allow for estimation of air pollution from AFOs. Following several postponements, EPA released draft models for public comment in 2024. As of November 2025, these models have not yet been finalized, and the agreement remains in place, allowing thousands of facilities to continue to release unregulated air pollution. In addition, EPA has issued multiple rules exempting AFOs from EPCRA requirements to report emissions of hazardous substances, including NH3 and H2S, despite recent legal challenges48.

Fig. 1: Daily NH3 and H2S emissions in NAEMS versus 1000-heads of animal inventory for dairy, poultry, and swine facilities.

Dashed lines show 100 lbs NH3 or H2S per day.

Evaluating the National Air Emissions Monitoring Study and identifying data gaps

There are decades of studies quantifying emissions of NH3, H2S, PM, and VOCs from barns and lagoons at swine49, poultry50, and cattle operations51,52,53. These studies have helped characterize relationships between emissions rates and variation in environmental conditions and manure and livestock management practices. Building from these findings, the NAEMS is the largest publicly available dataset of air pollution collected through consistent methods from AFOs in the U.S. The dataset includes 16,672 daily observations of NH3 emissions ranging from −11.7 to 1719.3 kg/day (median = 8.96 kg/day; Fig. 1), 17,799 daily observations of H2S emissions ranging from −624 to 112,714 g/day (median = 225.8 g/day), 1726 daily observations of PM2.5 emissions ranging from −1329.5 to 7247.6 g/day (median = 33.3 g/day), and 12,046 daily observations of PM10 emissions ranging from −6845 to 367,744.5 g/day (median = 238.4 g/day).

The 25 operations monitored through the NAEMS represent less than 1% of AFOs participating in the consent agreement allowing them to avoid liability for past and ongoing emissions violations. Critically, EPA collected very few observations of NH3 or H2S emissions at or above thresholds for regulation (Fig. 1). In the NAEMS, 84% of NH3 emissions measurements and 99% of H2S emissions measurements are below the 100 lbs/day reportable quantity specified for hazardous substances in EPCRA. This poor data coverage of higher emissions is at least partly due to the limited scope of operations monitored through the NAEMS rather than a lack of higher emissions occurring. Several studies have observed emissions well above these values at AFOs in the U.S.4,5,35,43,44. Furthermore, the limited observations of threshold exceedances at the few moderately sized AFOs monitored within the NAEMS suggest that there are likely hundreds to thousands of exceedances of the NH3 threshold when these rates are scaled to consider larger facilities across the country.

Comparing the animal inventory at AFOs monitored in the NAEMS to data on facility sizes across the U.S. indicates poor coverage of the high end of the size distribution of AFOs. While the specific magnitude of emissions depends on multiple factors including environmental conditions and farm practices, emissions are likely to increase as a function of facility size due to greater animal inventory and activity and larger amounts of manure production and accumulation51,52,53,54. These larger facilities will be critical to monitor as they are likely to show the highest rates of regulatory threshold exceedances, and they are likely to contribute substantially to total air pollution. The largest swine facility monitored by EPA in the NAEMS held 2550 swine. In contrast, there were 3540 farms with over 5000 hogs and pigs across the country reported in the 2022 United States Department of Agriculture (USDA) Census of Agriculture55. In Duplin County, North Carolina alone, there were 344 facilities permitted to hold more than 2550 swine, including one facility with an allowable count of 48,250 in 202456. Similarly, the largest dairy facility monitored by EPA held 3623 cows. In Tulare County, California, there were 46 facilities with more than 3623 cows, including one facility permitted to hold over 10,000 mature dairy cows.

The NAEMS dataset is also limited by the omission of particular types of AFOs and gaps in coverage of the full lifecycle of farming practices contributing to each type of air pollutant. Beef cattle operations can be major sources of NH3, PM, and VOCs, due to manure handling practices and beef cattle activity on feedlots and during grazing57. Recent studies confirm that ambient concentrations of PM surrounding beef feedlots tend to be elevated compared to similar areas without AFOs, and these concentrations commonly exceed thresholds for safe human exposure28. Despite these concerns, beef AFOs were not monitored through the NAEMS. More broadly, while the NAEMS dataset covers emissions from barns and lagoons at AFOs, it does not include any observations of emissions resulting from land application of animal manure, which accounts for a significant component of NH3, H2S, and VOC emissions and PM pollution from AFOs58,59,60. Finally, EPA was unable to utilize a large proportion of the VOC data that it gathered due to instrumentation errors and poor data quality.

Identifying facilities likely to exceed regulatory thresholds

Despite the limited scope of the NAEMS dataset, these observations can be used to quantify the likelihood of a given facility exceeding regulatory thresholds. EPA has considered the potential for setting de minimis thresholds based on animal inventory to exempt small facilities from having to estimate and report emissions if they are unlikely to approach regulatory emissions thresholds61. To explore this application of data, we fit a Bayesian non-linear model to NH3 emissions values from all dairy facilities in the NAEMS and estimated the likelihood of a dairy facility exceeding the regulatory threshold of 100 lbs NH3/day depending on the number of cows (Fig. 2). According to this model and assuming that the NAEMS dataset is representative of typical conditions, a dairy facility with over 2067 cows has a 95% chance of emitting more than 100 lb NH3/day on any given day, and a dairy facility with 1173 cows has a 50% chance of exceeding this threshold on any given day.

Fig. 2: Identifying dairy facilities likely to exceed regulatory ammonia thresholds.

a Bayesian non-linear regression fit to model the relationship between all NH3 observations from dairy facilities in the NAEMS and animal inventory. Shaded green region shows 50% predictive interval. Dashed vertical green line indicates minimum inventory (1173) with 50% likelihood of exceeding 100 lbs NH3/day (dashed gray horizontal line). Shaded blue region shows 95% predictive interval. Dashed vertical blue line indicates minimum inventory (2067) with 95% likelihood of exceeding 100 lbs NH3/day. b Histogram showing the number of cows (thousand head) in each facility size class for all dairy facilities in the U.S., ranging from facilities with 1–9 head to facilities with 2500 or more head. c Histogram showing the number of dairy facilities in the U.S. in each facility size class. Predictive interval modeling details: We fit a non-linear model of the form a + b*Inventory + d*Inventory^2 to the relationship between NH3 emissions and Inventory using Markov Chain Monte Carlo (MCMC) with normally-distributed priors. To account for heteroskedasticity, we modeled the precision of the expectation using the relationship 1/(s*Inventory)^3, where a, b, d, and s are all fitted parameters. The prior for s was log-normally distributed. We ran the MCMC using the R package rjags and the software JAGS with 3 chains, a burn-in period of 1000 iterations, and 100,000 iterations following the burn-in period. We used the Gelman and Rubin diagnostic to assess convergence of chains and achieved an effective sample size of over 4700 for each parameter. We fit 50% and 95% predictive intervals for the model by sampling posterior predictions and identified the Inventory value where the lower bound of each predictive interval crosses the 100 lbs/day (45.4 kg/day) threshold.
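
The final step described in this caption, locating the inventory at which the lower bound of a predictive interval crosses the threshold, can be sketched in Python. This is a minimal illustration and not the analysis code used here: the model was fit in R with rjags and JAGS, whereas the sketch covers only the post-processing of posterior draws. The posterior draws below are placeholder values invented for demonstration, the function names are hypothetical, and the quantile that defines the lower bound is left as a parameter because it depends on how the interval is constructed.

```python
import numpy as np

THRESHOLD_KG_PER_DAY = 45.4  # 100 lbs/day, as stated in the caption


def predictive_lower_bound(inventory, draws, lower_quantile, rng):
    """Lower quantile of posterior predictive emissions at one inventory value.

    draws: array of shape (n, 4) holding posterior samples of a, b, d, s.
    """
    a, b, d, s = draws.T
    mean = a + b * inventory + d * inventory**2
    # Precision is 1/(s*Inventory)^3, so the standard deviation is (s*Inventory)^1.5.
    sd = (s * inventory) ** 1.5
    predicted = rng.normal(mean, sd)  # one predictive draw per posterior sample
    return np.quantile(predicted, lower_quantile)


def minimum_inventory_exceeding(draws, lower_quantile, grid, rng):
    """Smallest grid inventory whose predictive lower bound reaches the threshold."""
    for inventory in grid:
        bound = predictive_lower_bound(inventory, draws, lower_quantile, rng)
        if bound >= THRESHOLD_KG_PER_DAY:
            return int(inventory)
    return None  # threshold never reached on this grid


# Placeholder posterior draws (NOT the fitted values from this study).
rng = np.random.default_rng(0)
n = 5000
placeholder_draws = np.column_stack([
    rng.normal(0.0, 1.0, n),               # a: intercept, kg/day
    rng.normal(0.05, 0.005, n),            # b: kg/day per cow
    rng.normal(0.0, 1e-7, n),              # d: quadratic term
    rng.lognormal(np.log(0.002), 0.1, n),  # s: heteroskedasticity scale
])
grid = np.arange(100, 5001, 50)
print(minimum_inventory_exceeding(placeholder_draws, 0.025, grid, rng))
```

With real posterior samples exported from the fitted model in place of `placeholder_draws`, the same scan could be repeated for each interval of interest.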

Extrapolating from this analysis suggests that several hundred dairy AFOs are likely to be exceeding regulatory reporting thresholds. According to the U.S. Department of Agriculture Census of Agriculture, there were 834 dairies with 2500 or more cattle in 2022 (Fig. 2c)62. These dairies (and potentially hundreds of other types of cattle-holding facilities which were not monitored in the NAEMS) have greater than a 95% likelihood of exceeding the 100 lbs/day NH3 threshold. These facilities represent a small proportion of all dairies across the country, but they hold the majority of total milk cow inventory (Fig. 2b–c).

Fig. 3: Comparing dairy ammonia emissions observations and predictions between models.

Points show emissions observations from dairy facilities in the NAEMS. Solid line shows draft EEM for NH3 from naturally-ventilated barns at 2.3 m/s windspeed (the average windspeed observed across sites). Black dashed line shows EF derived from the NAEMS dataset through linear regression. Orange dashed line shows EF from Kavolelis86.

More detailed methods (reviewed below) will be necessary to arrive at specific quantitative estimates of emissions rates for a given facility. However, existing data can be used to quantify the likelihood of a facility generating emissions in violation of regulatory thresholds, and this type of likelihood analysis can be readily implemented to accelerate regulatory action. Regulators should require facilities with high likelihoods of threshold exceedances to conduct additional direct monitoring and use these data to advance the development of predictive models. Additionally, operations with a high likelihood of exceeding regulatory thresholds should be required to adopt practices to mitigate emissions. For example, installing control-technologies like air scrubbers, improving livestock feed to reduce excreted nutrients, altering manure storage and handling practices, and optimizing manure and fertilizer land-application practices to increase nutrient use efficiency all have the potential to reduce air pollution27,63,64,65,66,67,68,69. More broadly, regulators should also explore strategies to reduce the concentration and scale of industrial animal production to mitigate total emissions.

These thresholds can also help reduce calculation burdens for AFOs with very low likelihoods of exceeding regulatory thresholds, which may potentially be exempted from being required to estimate their emissions for reporting purposes. In 2022, there were 325,303 operations in the U.S. with fewer than 20 dairy cows, and these facilities had a very low likelihood of ever exceeding the regulatory threshold of 100 lbs NH3 on any given day70,71. While we have focused on NH3 from dairy AFOs in this analysis, this approach should be expanded to other regulated pollutants and facility types as additional supporting data becomes available.

Strategies for modeling air pollution emissions

Several modeling approaches exist to estimate air pollution rates from AFOs. These include (1) emissions factors (EFs), (2) more complex statistical models, and (3) process-based models53,57,72,73,74. Regulatory agencies in the U.S. have used a combination of each of these types of methods to quantify air pollution from agriculture and other sectors.

EFs are among the simplest statistical approaches to derive estimates of air pollution. These factors are typically calculated by assuming a linear relationship between emissions and a single variable, which is often the number of animals in a facility. For example, an EF-based calculation may assume that each animal generates a set quantity of air pollution per unit time, such that the total air pollution generated by a facility over a given period can be quantified by multiplying the EF by the total number of animals held in the facility. Thousands of EFs for NH3 from animal waste have been compiled in global, publicly-available databases75,76. In an earlier analysis of the NAEMS, EPA derived maximum EF estimates based on the highest rate of emissions observed across the NAEMS dataset to calculate the largest number of facilities potentially impacted by changes in regulation (assuming all facilities were to generate air pollution at the highest observed emissions rates)61. EFs are easy to implement as they typically only require animal inventory data; however, they fail to track the sensitivity of emissions to multiple environmental factors and farming practices.
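
As a minimal sketch of the EF calculation described above, the Python snippet below multiplies a per-animal EF by animal inventory and compares the result with the 100 lbs/day EPCRA reportable quantity. The EF of 95 g NH3/day per cow is the NAEMS-derived value for naturally ventilated dairy barns reported later in this article; the function names and the example inventory of 1000 cows are illustrative assumptions.

```python
GRAMS_PER_POUND = 453.59237  # definition of the avoirdupois pound
REPORTABLE_QUANTITY_LBS_PER_DAY = 100.0  # EPCRA reportable quantity for NH3 and H2S


def ef_emissions_lbs_per_day(inventory, ef_grams_per_animal_day):
    """Estimate daily emissions (lbs/day) as EF x number of animals."""
    if inventory < 0 or ef_grams_per_animal_day < 0:
        raise ValueError("inventory and EF must be non-negative")
    return inventory * ef_grams_per_animal_day / GRAMS_PER_POUND


def exceeds_reportable_quantity(emissions_lbs_per_day):
    """True when the estimate is above the 100 lbs/day reportable quantity."""
    return emissions_lbs_per_day > REPORTABLE_QUANTITY_LBS_PER_DAY


# Illustrative facility: 1000 cows (hypothetical inventory).
estimate = ef_emissions_lbs_per_day(1000, 95.0)
print(f"{estimate:.1f} lbs NH3/day, exceeds: {exceeds_reportable_quantity(estimate)}")
```

Because the calculation is a single multiplication, the estimate scales in direct proportion to inventory and cannot reflect weather or management differences between facilities of the same size.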

Alternatively, more complex statistical models may be developed to represent nonlinear dynamics, accommodate non-independent or non-normal data, and account for the sensitivity of air pollution emissions to multiple environmental factors and animal management practices. While these models may fit observed data more closely and account for multiple factors likely to influence emissions, they are still constrained by the quantity and quality of the data used to build the model and may not be generalizable for predicting emissions in contexts outside the bounds of the input data used for their construction.

Process-based models offer another alternative. These models predict emissions in part by representing the biophysical processes generating emissions77. Because these mechanisms are likely to hold true across contexts, they may offer greater generalizability. However, these types of models still carry multiple sources of uncertainty, including uncertainty related to model structure and parameter selection78, and they can sometimes be more challenging for users to implement. In 2013, the EPA Science Advisory Board (SAB) recommended the development of a process-based model to estimate air pollution from AFOs, and existing literature provides some examples of successful development79. For example, Leytem et al. (2018) developed a process-based model to estimate NH3 emissions from dairy lagoons in Idaho80, and Rumsey and Aneja (2014) developed a mass-transfer model to predict H2S emissions from manure at swine AFOs72. Similarly, Pinder et al. (2004) developed a process-based model to estimate NH3 emissions from dairy cows accounting for variation based on farming practices and meteorology, including temperature, wind speed, and precipitation81,82. This approach was further expanded by McQuilling and Adams (2015) to develop a set of semi-empirical process-based models to predict NH3 emissions from beef cattle, swine, and poultry operations based on nitrogen mass balance and meteorology57. EPA utilized this model in its 2014 and 2020 National Emissions Inventories to estimate NH3 emissions from livestock waste, including emissions from grazing, housing, storage, and land application8.
EPA has also developed or used process-based models in other environmental regulation contexts, including the Aquatox model, which represents the fate of chemicals in aquatic ecosystems and their impacts on various organisms83, the SERAFM model, which predicts mercury concentrations in water, sediments, and fish tissues, and the DAYCENT model, which predicts greenhouse gas emissions from agricultural soils84.

Evaluating EPA’s draft emissions estimation methodologies

EPA adopted a statistical modeling approach to develop draft Emissions Estimation Methodologies (EEMs) for AFOs based on the NAEMS dataset85. In 2024, EPA released 82 draft statistical models for NH3, H2S, PM2.5, PM10, and total suspended particulates (TSP) from various types of swine, dairy, and poultry facilities. As of November 2025, the models have not been finalized, and the consent agreement remains in place.

To develop each model, EPA first processed the data collected through the NAEMS and compared emissions across sites to identify potential sources of variation. Based on these observations and a review of related scientific literature, EPA identified environmental and facility-related variables likely to predict emissions, such as live animal weight, animal inventory, relative humidity, windspeed, and ambient temperature. EPA developed multiple potential linear mixed effects models with repeated variance spatial power covariance structures to estimate average daily emissions. Separate model regressions were evaluated for all combinations of selected parameters and natural log-transformed average daily emissions data. EPA selected significant predictors of emissions and evaluated the fit of each model based on multiple diagnostics, including Negative Twice the Log-Likelihood (-2LogL), Akaike Information Criterion (AIC), Adjusted Akaike Information Criterion (AICc), and the Schwarz Bayesian Information Criterion (BIC). Based on these and other evaluations of model fit, EPA has proposed one draft EEM for each combination of pollutant and facility type. In order to implement a draft EEM, a user is required to provide between one and four input variables reflecting meteorology or facility details. Because the models were developed for natural log-transformed emissions data, users must subsequently back-transform emissions based on parameters provided by EPA to arrive at emissions rates in units commensurate with regulatory thresholds (e.g., lbs/day).

In contrast to EFs, which are available in published literature75,76, or which could alternatively be derived from the NAEMS dataset, EPA’s more complex statistical modeling approach allows each draft EEM to account for the combination of input variables most strongly correlated with emissions rates within the NAEMS dataset for each specific pollutant and facility type. A key difference between the EFs and EPA’s draft EEMs is in the shape of the relationship between emissions and input variables. Due to EPA’s particular statistical modeling approach designed to accommodate non-normally distributed data, the EEMs often predict exponential increases in emissions rates at high input values. In many comparisons between related EFs and EEMs, the two contrasting modeling approaches intersect at two points – with EEMs predicting higher emissions than EFs before the first point of intersection, EFs predicting higher emissions than EEMs at intermediate input ranges, and then EEMs predicting potentially far larger emissions than EFs following the second point of intersection (Fig. 3).
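
One way to see why a model fit on the log scale and a linear EF can cross twice is to write both in simplified form. The expressions below are an illustrative sketch and not EPA's published equations: N is a single input such as animal inventory, f is an EF, \beta_0 and \beta_1 > 0 are hypothetical regression coefficients on the natural-log scale, and C > 0 stands in for the back-transformation factor.

```latex
\begin{align}
E_{\mathrm{EF}}(N) &= f\,N, \\
\hat{E}_{\mathrm{EEM}}(N) &= C \exp\!\left(\beta_0 + \beta_1 N\right).
\end{align}
```

Under these assumptions, the exponential curve is strictly convex and positive at N = 0, while the EF line passes through the origin, so the two can intersect at most twice. The EEM lies above the EF at small N, may fall below it at intermediate N, and exceeds it again at large N, where the gap grows without bound.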

For example, Kavolelis proposed an EF of 29 ± 9 g NH3/day per cow for dairy barns86. To allow for comparison, we calculated an EF of 95 ± 8 g NH3/day per cow through linear regression of all observations of NH3 from naturally ventilated dairy barns within the NAEMS dataset. In contrast, the EEM for NH3 from naturally-ventilated dairies shows a steep increase in emissions estimates for facilities with over 1000 cattle (Fig. 3). It will be critical to constrain these estimates and resolve divergences among these modeling approaches by collecting additional data from the largest facilities or other high emissions scenarios.

Due to EPA’s modeling approach, specific draft EEMs may demonstrate problematic behaviors outside of intermediate input ranges observed within the NAEMS. We have identified some threshold values below which predicted emissions are negative and unlikely to be accurate (Supplementary Table 2). For example, the draft EEM for dairy lagoons predicts negative NH3 and H2S emissions below 11 degrees Celsius. Some draft EEMs may also predict unreasonably high values for emissions above a certain threshold of animal inventory or other input variables. For example, we used the draft model for predicting NH3 in a naturally-ventilated dairy barn to estimate the NH3 emissions from an existing dairy operation with 10,325 cows in Tulare County, California, and the model predicted a daily emissions rate of 1.2 × 10^16 lbs NH3/day (equal to six trillion tons per day). Because the highest inventory value in the NAEMS dataset is 3653 cows, emissions predictions above this inventory value are increasingly uncertain. While additional data from these larger facilities will be necessary to constrain estimates of emissions rates, it is virtually certain that these large facilities are emitting at rates well over the regulatory threshold of 100 lbs NH3/day (Fig. 1).

To avoid unanticipated model behaviors like the ones described above, regulators should provide appropriate input ranges for the use of each model and quantify uncertainty across the range of inputs potentially observed in the U.S. In future model iterations, regulators should explore potential alternatives to accommodating non-normally distributed data that avoid these types of unintended nonlinearities and consider alternative modeling frameworks, including process-based models57.
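
The recommendation to publish appropriate input ranges could be implemented as a simple guard around any emissions model. The Python sketch below is hypothetical: `guarded_predict`, `InputRange`, and the stand-in linear model are illustrative names and not part of any EPA tool. The upper inventory bound of 3653 cows is the highest NAEMS inventory value cited above, while the lower bound of 1 is a placeholder.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class InputRange:
    """Closed interval of input values covered by the calibration data."""
    low: float
    high: float

    def contains(self, value):
        return self.low <= value <= self.high


def guarded_predict(model, inputs, valid_ranges):
    """Evaluate `model` only if every input lies inside its calibrated range."""
    for name, value in inputs.items():
        allowed = valid_ranges[name]
        if not allowed.contains(value):
            raise ValueError(
                f"{name}={value} is outside the calibrated range "
                f"[{allowed.low}, {allowed.high}]"
            )
    prediction = model(**inputs)
    if prediction < 0:
        raise ValueError("model returned a negative emissions rate")
    return prediction


# Hypothetical stand-in model: a linear EF of 0.095 kg NH3/day per cow.
def placeholder_model(inventory):
    return 0.095 * inventory


ranges = {"inventory": InputRange(low=1, high=3653)}
print(guarded_predict(placeholder_model, {"inventory": 2000}, ranges))
```

Calling the guard with an inventory of 10,325 cows would raise an error instead of returning an extrapolated value, which signals that a different method or direct monitoring is needed for that facility.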

Opportunities for increasing air pollution data, improving emissions estimation methodologies, and overcoming barriers to regulation

Identifying data gaps and sampling priorities

Existing datasets like those developed through the NAEMS can be used to optimize future air pollution sampling. Our analysis of the NAEMS dataset identifies a critical need for more observations from the largest AFOs, which are likely to account for the highest emissions and largest number of regulatory threshold exceedances. While existing data is sufficient for identifying facilities likely to exceed regulatory thresholds, increased monitoring of emissions, environmental conditions, and practices at the largest AFOs will help constrain uncertainties in specific emissions estimates.

Future monitoring efforts may use model evaluation statistics from the draft EEMs to identify sampling priorities. Accounting for model uncertainty and sensitivity to various environmental and facility inputs can help researchers identify the types of new data with the greatest potential for supporting future model improvements. Future studies should prioritize collecting additional data to support refining those models currently associated with the greatest uncertainties. For example, EPA performed an uncertainty analysis to quantify the difference between model predictions and observed emissions for each EEM85. Based on EPA’s uncertainty analysis, NH3 from larger poultry facilities, lagoons at swine facilities, and larger dairy facilities are among those most in need of additional supporting data (Supplementary Table 3).

In conjunction with uncertainty analyses, model sensitivity analyses can be used to identify priorities for specific input variables to monitor. Sensitivity analyses quantify the degree to which model outputs (e.g., emissions) vary in relation to model inputs (e.g., facility details or meteorological information). For example, several of the EEMs show the highest sensitivity to animal inventory, suggesting a need to measure a more continuous range of facility sizes representative of the distribution of AFOs in the country. However, other EEMs showed high sensitivity to environmental factors as well – for example, lagoon emissions are sensitive to windspeed and temperature, indicating potential benefit from monitoring lagoons across a broader range of meteorological conditions (Supplementary Table 3).
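
A one-at-a-time sensitivity analysis of the kind described above can be sketched as follows. This is one simple form of sensitivity analysis and is not necessarily the approach used for the draft EEMs. The model and its coefficients are hypothetical placeholders chosen only to make the example runnable; the baseline windspeed of 2.3 m/s is the average reported across NAEMS sites, and the baseline temperature of 20 degrees Celsius is an assumption.

```python
import math


def one_at_a_time_sensitivity(model, baseline, relative_step=0.10):
    """Relative change in output when each input is raised by `relative_step`."""
    base_output = model(**baseline)
    sensitivities = {}
    for name, value in baseline.items():
        # Perturb one input while holding the others at their baseline values.
        perturbed = dict(baseline, **{name: value * (1 + relative_step)})
        sensitivities[name] = (model(**perturbed) - base_output) / base_output
    return sensitivities


# Hypothetical log-linear model; coefficients are placeholders, not EPA's.
def placeholder_lagoon_model(temperature_c, windspeed_m_s):
    return math.exp(1.0 + 0.05 * temperature_c + 0.20 * windspeed_m_s)


baseline = {"temperature_c": 20.0, "windspeed_m_s": 2.3}
results = one_at_a_time_sensitivity(placeholder_lagoon_model, baseline)
for name, effect in results.items():
    print(f"{name}: {effect:+.1%}")
```

Inputs with the largest relative effect are candidates for denser monitoring, although a one-at-a-time scan does not capture interactions between inputs.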

Additional data will also be necessary to fill critical gaps in the NAEMS. For example, due to instrumentation errors, EPA was unable to develop more complex statistical models for VOCs from AFOs and relied instead on a literature review to propose simple EFs. As noted above, the NAEMS also did not include any observations of beef cattle operations or emissions from land application, both of which account for substantial contributions to air pollution from AFOs. Expanding monitoring efforts to include a broader suite of pollutants, emissions sources, and AFO categories will be necessary to develop a more comprehensive understanding of total air pollution from AFOs.

Expanding data collection methods

Air pollution emissions data were gathered for the NAEMS primarily through on-farm gas analyzers, which may be costly or challenging to implement at a larger scale. Remote sensing technologies have the potential to vastly increase emissions monitoring data and their spatial coverage87,88. In recent years, satellite remote sensing has been used to estimate emissions and concentrations of several pollutants included in the NAEMS, such as PM2.5 and NH389,90,91,92,93. These measurements have been combined with inverse modeling, oversampling, and other approaches to estimate emissions from specific sources92,94.

Future improvements in these methods will be required to address the low spatial and temporal resolution of satellite retrievals, to fill data gaps caused by clouds or other retrieval noise, and to better account for transport, deposition, and chemical conversion of the estimated pollutants94,95. However, even given these limitations, observations from satellite remote sensing can be used to identify the location of significant emission sources, such as AFOs with the highest likelihood of regulatory threshold exceedances, and these operations may be required to provide direct measurements. Alternatively or in tandem, commercially-available low-cost sensors may be used to ground-truth estimates of pollution, as EPA has done in other contexts to assess PM2.5 concentrations96.

Remote sensing can also be combined with modeling to estimate emissions of pollutants that are more difficult to measure. While satellite remote sensing shows promise for estimating NH3 and PM emissions, H2S has been more challenging to assess97. However, previous research has demonstrated that NH3 and H2S emissions from AFOs are strongly correlated14, which is corroborated by strong correlations between NH3 and H2S emissions from poultry sites in the NAEMS dataset (Supplementary Fig. 1). Remote sensing of NH3 emissions could provide additional data to model H2S emissions, as demonstrated by Leifer et al., who used remote sensing estimates of NH3 emissions in combination with an H2S EF to model H2S emissions97.

Conclusions

Despite limitations in existing data, it is virtually certain that many large AFOs generate air pollution in excess of regulatory thresholds in the U.S. Regulators and researchers may already use existing data from the NAEMS to identify facilities that are likely to be responsible for the largest share of emissions. These facilities should be required to provide additional data and adopt mitigating practices to reduce harms to human health and redress environmental injustices. While existing data is sufficient to inform improved regulation immediately, these datasets should also be used to optimize future data collection efforts and should be supplemented with emerging sampling technologies to refine modeling approaches.