Introduction

In Asia, regular emissions of nitrogen oxides (NOx) have long posed concerns for public health, due to their detrimental impact on air quality and potential contribution to respiratory illnesses1,2. Primarily originating from vehicles and industrial activities3,4, NOx emissions significantly contribute to urban air pollution, exacerbating environmental and health challenges in densely populated areas. In response to airborne hazards, Asia has seen extensive efforts to monitor air quality, leveraging a combination of ground-based monitoring stations and satellite-based observations5,6,7. The advent of satellite instruments has been particularly beneficial in addressing the information gaps arising from ground-based sites’ limited geographic coverage.

The increasing availability of observation references has led to significant advancements in air quality research, particularly in estimating and simulating NOx emission rates and ambient concentrations8,9,10,11,12. Foremost among these research methods is the use of chemical transport models (CTMs), such as the Community Multiscale Air Quality (CMAQ) model13, which has been useful in understanding the complex behaviors of air pollutants and pollution mechanisms in Asia14,15,16,17. However, this approach often encounters challenges in regions with traditionally sparse monitoring; the reliance on outdated emission inventory data – a critical input for CTMs – can compromise the accuracy of air quality simulations18,19,20. The inherent uncertainties in bottom-up inventories of NOx emissions, arising from limited knowledge of emission factors and broad extrapolations, further reduce the accuracy of these estimates21,22,23,24. To address this challenge, extensive efforts have focused on utilizing satellite observation data to update bottom-up estimates of air pollutant emissions. A number of numerical modeling studies have effectively adjusted the extent of NOx emissions in emission inventories by solving mathematical inverse problems, supported by top-down measurements from satellite instruments along sun-synchronous low earth orbits (LEO)8,9,10,11,12,25,26,27,28. Instruments such as the Ozone Monitoring Instrument (OMI) and the TROPOspheric Monitoring Instrument (TROPOMI) aboard Sentinel-5 Precursor (Sentinel-5P) have provided more current information on air pollutant loadings. The use of more up-to-date observation references has been confirmed to be effective in improving the emissions inventories, consequently enhancing the reliability of air quality simulations and laying a stronger basis for further air quality assessments.

Despite their significant contributions, the utilization of LEO satellite instruments encounters challenges in capturing the temporal dynamics of air pollutants29,30,31,32,33. The orbiting routine of the instruments typically allows for only one snapshot of air pollutant loadings at each geographic location per day, leading to limited segments of daily observations33. This often necessitates dependency on time-averaged observation data, which are typically averaged over biweekly, monthly, yearly, or even multi-year periods25,26,27,28,29,30,31,32,34,35,36. This approach, although vastly practical, may not be sufficiently granular to accurately capture the highly variable nature of NOx, which is characterized by its short lifetime both in the real world and in CTM simulations37,38,39,40. For example, NOx emissions’ diurnal variations associated with the urban commuting cycle (i.e., dual peaks during morning and evening traffic)3,41 remain underrepresented in LEO observations due to the instruments’ orbiting cycles. A shared insight among researchers is the pressing need for temporally more continuous satellite observations to effectively monitor the dynamic emission patterns42.

In response to such a limitation, there has been a growing focus on utilizing observation data from satellite instruments in geostationary earth orbits (GEO)42,43,44. GEO satellite instruments offer finer temporal resolutions, from several minutes to an hour during daylight hours, enabling more frequent and continuous observations. This advantage has been particularly well-exploited in aerosol research, where instruments like the Geostationary Ocean Color Imager (GOCI) and the Advanced Himawari Imager (AHI) have made substantial contributions to air quality modeling in East Asia45,46,47. However, a gap remains in studies concerning gaseous air pollutants, particularly in terms of better inventorying the quantity of their emissions, owing to the historical absence of instruments capable of monitoring gas-phase species in a geostationary manner. In fulfilling such a need, the launch of the Geostationary Environment Monitoring Spectrometer (GEMS) in February 2020, the first instrument of its kind to measure the loadings of both gas-phase air pollutants and aerosols44, set the stage for new research by providing enhanced monitoring capability over the Asia-Pacific region36,48,49,50. Deployed aboard the Geostationary Korean Multi-Purpose Satellite 2 (GEO-KOMPSAT-2), GEMS offers hourly measurements of trace gases, such as nitrogen dioxide (NO2), sulfur dioxide, formaldehyde, and ozone, as well as aerosol properties during daylight hours in a consecutive manner44 – a capability that was previously unattainable with instruments on LEO platforms. Despite its higher orbital altitude compared to LEO instruments, GEMS’s spatial resolution, at 3.5 km × 8 km, surpasses that of older LEO instruments, such as OMI at 13 km × 24 km at nadir, and is comparable to more recent LEO instruments like TROPOMI, which has offered a resolution of 5.5 km × 3.5 km since August 2019. This unique temporal resolution and competitive spatial resolution underscore GEMS as a young instrument with significant potential, making it worth more rigorous exploration to optimize its contributions to advancing our modeling capabilities. Furthermore, upon the forthcoming completion of the GEO satellite instrument constellation, including NASA’s Tropospheric Emissions: Monitoring of Pollution (TEMPO)42 and ESA’s Sentinel-444, there emerges a critical need for more nuanced methodologies to better exploit the unprecedented quantity and quality of observation data becoming available.

Leveraging a wealth of top-down information from GEMS, our study aimed to explore the potential of geostationary observation data in refining the bottom-up estimates of NOx emissions, ultimately enhancing the accuracy of air quality simulations. Focusing on the spring months of 2022, we evaluated the utility of GEMS tropospheric NO2 columns as top-down constraints to adjust the NOx emissions inventory in Asia. Informed by 8 to 10 observation references a day, our Bayesian inverse modeling provided hourly adjustments to the inventoried extent of hourly NOx emissions. We conducted a series of CMAQ simulations of tropospheric NO2 columns and surface NO2 concentrations in Asia and assessed the model accuracies achieved with the original and adjusted NOx emissions inventories. We then assessed the potential advantages of GEMS’s high-frequency observation data in addressing the temporal constraints of LEO platforms. This involved comparing the improvements in model performance achieved from using GEMS data and the proxy data of LEO satellite instrument observations.

Results and discussion

Simulations using the original emissions inventory

We evaluated the model accuracy achieved using the original extent of NOx emissions in the inventory. This involved comparing the tropospheric NO2 columns and surface NO2 concentrations observed during the months of March, April, and May (MAM) 2022 to those simulated by CMAQ using the a priori NOx emissions.

Figure 1a and b illustrate the monthly averages of hourly NO2 columns observed and modeled during daytime. In our study, “daytime” specifically refers to GEMS’s retrieval period from early morning to late afternoon (i.e., 8 AM to 5 PM in Korea and 7 AM to 4 PM in China; see Sect. 4.3). Compared to GEMS NO2 columns, the model tended to overestimate the columns across most regions of South Korea (hereafter referred to as Korea), with the exception of the Seoul metropolitan area (SMA), and in densely populated urban areas in China, including Beijing and Shenyang, and those across the southeast region. The model also overestimated the columns in major urban centers in Southeast Asia (e.g., Ho Chi Minh City and Hanoi in Vietnam, Bangkok in Thailand, Kuala Lumpur in Malaysia, and Singapore), in India (particularly in the northeastern region), and in Japan (except for Tokyo) (see Supplementary Fig. S1 for geographic labels). Conversely, the model generally underestimated the columns in the SMA, across the North China Plain (NCP) and the plains in the northeast region of China, and in the landlocked urban centers of northcentral and southwestern China. The model underestimated the columns in the Yangtze River Delta (YRD) in March, and then overestimated the columns in April and May. Similar patterns of monthly spatial discrepancies, but to a greater extent, were noted between the NO2 columns observed at 04:45 UTC (hereafter referred to as LEO proxy NO2 columns; see Methods) and the corresponding modeled NO2 columns (Fig. 1d and e).

Fig. 1

Monthly averages of hourly daytime tropospheric NO2 columns (molecules/cm2) observed and modeled during MAM 2022. (a) GEMS NO2 columns during daylight hours, (b) CMAQ-simulated NO2 columns during daylight hours, (c) differences (b - a), (d) LEO proxy NO2 columns at 04:45 UTC, (e) CMAQ-simulated NO2 columns at 04:45 UTC, (f) differences (e - d). The maps were created using MATLAB R2024a by MathWorks, Inc. (https://www.mathworks.com/products/matlab.html).

Figure 2 shows time series of the observed and modeled hourly surface NO2 concentrations at ground-based monitoring stations in China during MAM 2022. The model generally underestimated the concentrations during daylight hours, aligning with GEMS retrieval times, and began overestimating in the subsequent hours leading up to nighttime. A similar tendency was observed in Korea (Supplementary Fig. S2), indicating a regional modeling challenge in East Asia when using the a priori emissions. Overall, there was fair agreement between the observed and modeled hourly NO2 concentrations, with correlation coefficients (R) ranging from 0.57 to 0.64 and indices of agreement (IOA) from 0.64 to 0.74 in China (Fig. 2) and those from 0.62 to 0.82 and 0.65 to 0.71, respectively, in Korea (Supplementary Fig. S2). The extent of overestimation was noticeable throughout the months, with normalized mean biases (NMB) ranging from 7.75% to 32.59%, mainly due to nighttime overestimation. However, during the daytime, the model severely underestimated NO2 concentrations, with NMB from −13.17% to −29.31% in Korea and from −5.08% to −23.58% in China. This suggests a potential underestimation of daytime NOx emissions in Asia, as discussed in several previous studies, which led to the overall underestimation of surface NO2 concentrations in East Asia27,34,36,57. Also, it is noteworthy that these discrepancies may be related to the known limitation of the CB05 mechanism, which underestimates the recycling of alkyl nitrates back to NO2 and affects NOx chemistry, contributing to the model’s difficulty in accurately capturing daytime NO2 concentrations.

Fig. 2

Time series of hourly surface NO2 concentrations (ppb) observed and modeled at 250 MEE sites in China during MAM 2022. Each day has 24 h and begins at 00 UTC on the time axis. OBS: observed concentrations, OBS (daytime): observed concentrations during daylight hours (GEMS retrieval hours), and CMAQ (prior): modeled concentrations using the a priori emissions. R: Pearson’s correlation coefficient, IOA: index of agreement, NMB: normalized mean bias (%), and MAE: mean absolute error (ppb).
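The evaluation statistics named in this caption (R, IOA, NMB, and MAE) can be sketched as below. This is a minimal illustration, not the authors' evaluation code: the function name `evaluation_metrics` is hypothetical, and the IOA is assumed to follow Willmott's original index of agreement, which the text does not specify.

```python
import numpy as np

def evaluation_metrics(obs, mod):
    """Return R, IOA, NMB (%), and MAE for paired observed/modeled values.

    Assumption: IOA follows Willmott's original formulation; the paper
    does not state which variant was used.
    """
    obs = np.asarray(obs, dtype=float)
    mod = np.asarray(mod, dtype=float)

    # Keep only pairs where both the observation and the model value exist.
    valid = np.isfinite(obs) & np.isfinite(mod)
    obs, mod = obs[valid], mod[valid]

    obs_mean = obs.mean()
    r = np.corrcoef(obs, mod)[0, 1]                       # Pearson's R
    ioa = 1.0 - np.sum((mod - obs) ** 2) / np.sum(
        (np.abs(mod - obs_mean) + np.abs(obs - obs_mean)) ** 2
    )                                                     # index of agreement
    nmb = 100.0 * np.sum(mod - obs) / np.sum(obs)         # normalized mean bias (%)
    mae = np.mean(np.abs(mod - obs))                      # mean absolute error
    return {"R": r, "IOA": ioa, "NMB": nmb, "MAE": mae}

if __name__ == "__main__":
    # Toy values in ppb, for illustration only.
    print(evaluation_metrics([10, 20, 30, 40], [8, 18, 33, 37]))
```

A negative NMB indicates model underestimation, which is how the daytime biases in the text are read.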

Top-down estimates of NOx emissions

After the top-down adjustments to the NOx emissions inventory, we examined the subsequent changes in NOx emissions in Asia during MAM 2022. This involved comparing the a priori emissions with the a posteriori NOx emissions, which were obtained through the Bayesian inversions informed by the observation data from GEMS and LEO proxy (see Methods).

Figure 3 shows the monthly averages of hourly daytime NOx emissions, comparing the adjusted emissions to the a priori emissions. The GEMS-informed adjustment led to spatial adjustments in emission quantities, aiming to offset the model’s prior biases discussed earlier. NOx emissions generally decreased in areas where the model previously overestimated the NO2 columns, including most regions across Korea (except for the SMA), Beijing, Shenyang, the YRD (in April and May), as well as in highly populated urban areas in southeastern China, Southeast Asian countries, India, and Japan (except for Tokyo). Specifically, in Beijing and Shenyang, the decreases in NOx emissions at the pixels corresponding to the city centers were overwhelmed by the increased emissions in the surrounding greater metropolitan areas. Conversely, we observed substantial increases in NOx emissions in the SMA, the NCP, the plains in northeastern China, the YRD (in March), and urban centers in northcentral and southwestern China. In areas where the model previously underestimated and overestimated NO2 columns (Fig. 1c), we observed an average increase of 31.72% and a decrease of 22.78% in NOx emissions, respectively, during MAM (Fig. 3c). The spatial extent of areas with decreased emissions was slightly smaller than that with increased emissions by a factor of 0.998, which led to a slight increase in domain-averaged NOx emissions by 1.42%. This suggests that the localized increases in East Asia slightly outweighed the reductions elsewhere.

Although its spatial distribution was similar, the LEO-informed adjustment led to notably larger increases and decreases in NOx emissions. In areas where the model previously underestimated and overestimated NO2 columns (Fig. 1f), we noted an average increase of 35.39% and a decrease of 36.40% in NOx emissions, respectively (Fig. 3e). The total area with decreased emissions was slightly larger than that with increased emissions by a factor of 1.023, resulting in an average decrease of 17.10% in NOx emissions. This tendency towards more extensive adjustments, particularly towards decreasing the emissions, is attributed to the conventional inversion method practiced with LEO proxy NO2 columns, which relies on monthly averaged, time-specific discrepancies between observed and modeled NO2 columns (Fig. 1f) to adjust NOx emissions throughout the entire day.

After these inversions, we observed overall increases in NOx emissions in polluted regions of East Asia and decreases in relatively pristine areas of Southeast Asia, such as the Arakan Mountains (west of Myanmar), the Annamese Mountains (between Laos and Vietnam), and the northern mountainous terrain of Laos. The gradients in the emission differences seemed to reflect regions where the emissions errors exceeded the observational errors (Supplementary Fig. S3), particularly where the prior model once underestimated NO2 columns in East Asia and overestimated them in Southeast Asia. The high prior emissions in Southeast Asia can be attributed to the generalized energy use assumptions and data aggregation methods used in EDGAR, which rely on national or regional averages and standardized emission factors that may not adequately reflect the conditions in less-industrialized or rural regions51. Applying broad energy consumption data from more urban or industrial areas to these regions can distort the estimates, as NOx-emitting activities like transportation and industrial processes are less prevalent. This can be further exacerbated in CTM studies that often require emissions to be regridded from their native spatial resolution to a coarser scale, smoothing out localized details. Local bottom-up studies in Southeast Asia have reported much lower NOx emissions compared to EDGAR, with some estimates indicating that EDGAR’s projections are higher by factors of 2–3 compared to locally reported emissions52,53.
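The role played by the relative sizes of emission and observation errors can be illustrated with a textbook scalar linear-Gaussian Bayesian update. This is only a sketch under stated assumptions and not the inversion used in this study: the function name and all symbols (`x_prior`, `k`, `sigma_a`, `sigma_o`) are hypothetical, the forward model is taken to be linear, and a single grid cell with one observation is considered.

```python
def scalar_bayesian_update(x_prior, y_obs, y_mod, k, sigma_a, sigma_o):
    """Posterior emission scaling factor for one cell and one observation.

    x_prior : prior emission scaling factor (1.0 means "inventory as is")
    y_obs   : observed NO2 column
    y_mod   : column modeled with the prior emissions
    k       : assumed linear sensitivity of the column to the scaling factor
    sigma_a : prior (emission) error standard deviation
    sigma_o : observation error standard deviation
    """
    # The gain grows when the emission error dominates the observation error
    # and shrinks towards zero when the observation is comparatively uncertain.
    gain = (sigma_a ** 2 * k) / (k ** 2 * sigma_a ** 2 + sigma_o ** 2)
    return x_prior + gain * (y_obs - y_mod)

if __name__ == "__main__":
    # Toy numbers: the model underestimates the observed column (10 vs. 12).
    for sigma_o in (0.0, 5.0, 50.0):
        x_post = scalar_bayesian_update(1.0, 12.0, 10.0, 10.0, 0.5, sigma_o)
        print(f"sigma_o={sigma_o:5.1f} -> posterior scaling {x_post:.3f}")
```

In this toy setting, the adjustment is largest where the prior emission error is large relative to the observation error, and the posterior stays near the prior where the observation error dominates.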

Fig. 3

Monthly averages of hourly daytime NOx emissions (moles/s) during MAM 2022. (a) the a priori emissions, (b) the a posteriori emissions adjusted using GEMS tropospheric NO2 columns, (c) differences (b - a), (d) the a posteriori emissions adjusted using LEO proxy NO2 columns, (e) differences (d - a), and (f) differences (d - b). The maps were created using MATLAB R2024a by MathWorks, Inc. (https://www.mathworks.com/products/matlab.html).

Simulations using the posterior NOx emissions inventory

Figures 4 and 5 show the diurnal variations in hourly daytime NOx emissions and surface NO2 concentrations observed and modeled at ground-based stations in Korea and China during MAM 2022. We examined how the emissions responded to the GEMS- and LEO-informed adjustments, along with the subsequent changes in modeled NO2 concentrations and their biases against station measurements. This allowed us to assess the effectiveness of the top-down inversions in capturing the diurnal patterns of NOx emissions and refining their inventoried extent towards better simulating surface NO2 concentrations.

While the top-down adjustments were generally effective at addressing the model’s prior biases in simulating NO2 concentrations, the extent of improvement varied across daylight hours, with the best consistency during midday hours. The GEMS-informed adjustment led to overall increases in NOx emissions in Korea (Fig. 4) and China (Fig. 5), with larger increases observed during morning hours. These increases seemingly compensated for the previously underestimated daytime NO2 concentrations, reducing NMB from −17.38% to −9.64% in Korea and from −13.05% to −4.94% in China, on average, during MAM (Table 1). However, despite these upward adjustments, the model underestimation persisted, particularly during morning hours. This could be partly due to GEMS’s systematic bias under unfavorable viewing geometries in early morning and late afternoon hours when GEMS tends to underestimate NO2 columns54. If the GEMS data used in our study had not retained the low bias, the inversion could have resulted in an even larger increase in NOx emissions, potentially leading to better agreement between the modeled and observed NO2 concentrations.

We also observed some instances of overcorrection in Korea after the GEMS-informed adjustment, particularly in April during the early morning and late afternoon, where the posterior emissions led the model to overestimate NO2 concentrations despite GEMS’s inherent low bias54. These adjustments in the undesired, upward direction in these times may suggest that the observed NO2 columns used in the inversion might not have well reflected lower values, which can potentially be attributable to GEMS’s high bias at lower NO2 columns55, leading to excessive increases in NOx emissions. This pattern was also observed in China, but to a lesser extent. China showed a delayed response, with the NOx emissions beginning to increase an hour or two after the start of the day. This is likely due to GEMS’s limited coverage during its first retrieval, which does not extensively cover China at that time (Supplementary Fig. S4), leading the emissions likely to remain close to their prior state due to the absence of observational data for the adjustment. Despite the overall improvement in the emissions inventory, our results indicate that the influence of GEMS’s systematic biases seemed to have affected our top-down inversion efforts, particularly in the under- and overcorrections observed during the early morning and late afternoon.

The LEO-informed adjustment also addressed the prior underestimation but to a lesser extent. The adjustment resulted in uniform increases in emissions in both Korea and China across daytime hours, with the adjustments based on the discrepancies between observed and modeled NO2 columns at 04:45 UTC (Fig. 1f) during the inversion. These increases in NOx emissions led NMB to reduce from −17.38% to −15.18% in Korea and from −13.05% to −10.87% in China, on average, during MAM (Table 1). A possible reason for the less pronounced improvement is that, unlike earlier successful LEO-informed approaches8,11,56, which considered instrument biases prior to the inversion, our study did not rigorously incorporate bias correction efforts due to the novelty of the GEMS data product.

The month-to-month variations in NOx emissions indicate a gradual decline in the intensity of emissions from March to May, with the most noticeable reductions occurring during the afternoon hours. Note that the evening peak was beyond the scope of this analysis, as GEMS retrievals end before the onset of the evening rush hour. The seasonal decrease may be attributed to reduced heating demand as weather warms, leading to less combustion for residential and commercial purposes. In contrast, morning emissions remained relatively high, reflecting the steady influence of daily vehicular activities during rush hour periods.

Fig. 4

Monthly averaged diurnal profiles of daytime NOx emissions (moles/s) and surface NO2 concentrations (ppb) at 459 AirKorea sites in Korea during MAM 2022. NOx prior: prior emissions, NOx posterior: posterior emissions after GEMS- and LEO-informed adjustments, NOx adjustment: the extent of adjustment, NO2 Obs: observed concentrations, NO2 prior: modeled concentrations using the prior emissions, NO2 posterior: modeled concentrations using the GEMS- and LEO-informed emissions, NMB: modeled concentrations’ normalized mean bias (%) against observed concentrations.

Fig. 5

Monthly averaged diurnal profiles of daytime NOx emissions (moles/s) and surface NO2 concentrations (ppb) at 250 MEE sites in China during MAM 2022. NOx prior: prior emissions, NOx posterior: posterior emissions after GEMS- and LEO-informed adjustments, NOx adjustment: the extent of adjustment, NO2 Obs: observed concentrations, NO2 prior: modeled concentrations using the prior emissions, NO2 posterior: modeled concentrations using the GEMS- and LEO-informed emissions, NMB: modeled concentrations’ normalized mean bias (%) against observed concentrations.

Table 1 Normalized mean biases (%) of modeled surface NO2 concentrations against those observed at 459 AirKorea sites in Korea and 250 MEE sites in China during GEMS retrieval hours (local solar time) for MAM 2022. Prior: modeled concentrations using the a priori emissions, LEO: modeled concentrations after the LEO-informed adjustment, GEMS: modeled concentrations after the GEMS-informed adjustment.

Figure 6 illustrates the monthly averages of hourly daytime tropospheric NO2 columns simulated after the GEMS-informed and LEO-informed adjustments. The GEMS-informed adjustment generally remedied the model’s earlier underestimation (Fig. 1), particularly in capturing the high peaks in East Asia, resulting in a closer alignment of the modeled columns with GEMS tropospheric NO2 columns. NO2 columns decreased after the adjustment in areas where overestimation was once prevalent. We also noticed some instances of overcompensation. While effective in addressing underestimations of NO2 columns, the GEMS-informed adjustment introduced overestimations in some areas, such as the Yellow Sea, where the columns were already well-captured using the a priori emissions. Given that the inversion was not directly applied to NOx emissions over sea surfaces, the increased NO2 columns over the Yellow Sea are attributed to significant increases in nearby upwind inland areas, such as the NCP and the northeastern region of China.

The LEO-informed adjustment also addressed the model’s prior biases, but with different spatial patterns across the domain. The adjustment was generally more effective in constraining previously overestimated NO2 columns but was less effective in capturing their high peaks in areas where underestimation was once prevalent, such as the southern half of the NCP. This was considered to be caused by the inversion, which was more substantially directed towards reducing the extent of NOx emissions, as discussed earlier in Sect. 2.2. These spatial patterns closely reflected the discrepancies observed between LEO proxy at 04:45 UTC and the corresponding modeled NO2 columns (Fig. 1c and f), reaffirming the ongoing challenge of preventing over- and under-correction in emissions inventories, especially when relying solely on time-averaged observation data11,36,57,58,59,60.

Fig. 6

Monthly averages of hourly daytime tropospheric NO2 columns (molecules/cm2) modeled during MAM 2022. (a) modeled using the a priori emissions, (b) modeled after the GEMS-informed adjustment, (c) modeled after the LEO-informed adjustment. The maps were created using MATLAB R2024a by MathWorks, Inc. (https://www.mathworks.com/products/matlab.html).

Figure 7 and Supplementary Fig. S5 show time series of hourly daytime NOx emissions and surface NO2 concentrations observed and modeled in Korea and China, respectively, during MAM 2022. The GEMS-informed adjustment improved IOA from 0.77 to 0.80 in Korea and from 0.79 to 0.83 in China, on average, during MAM (Supplementary Table S1). To a lesser extent, the LEO-informed adjustment also resulted in a slight improvement in correlation by 0.01. The model previously underestimated NO2 concentrations by 19.55% in Korea during MAM, and the GEMS-informed adjustment moderated this negative bias to 11.63%. The smallest bias was achieved in April, with NMB of 0.02%, indicating a fairly close agreement between the modeled and observed concentrations. Similarly, in China, the extent of underestimation was substantially reduced from 13.08 to 4.63% during MAM, with exceptional alignments in April and May, with NMB of 0.78% and 0.79%, respectively. The LEO-informed adjustment was also generally effective in reducing the model’s prior biases but to a lesser extent. The extents of underestimation were slightly reduced to an average of 17.26% in Korea and 10.64% in China.

Fig. 7

Time series of the hourly daytime NOx emissions and surface NO2 concentrations observed and modeled at 459 AirKorea sites in Korea during MAM 2022. Note that 8 observations were made per day in March and 10 per day in April and May, and each day begins at 00 UTC on the time axis. OBS: observed concentrations, CMAQ (prior): modeled concentrations using the a priori emissions, CMAQ (posterior GEMS): modeled concentrations after the GEMS-informed adjustment, CMAQ (posterior LEO): modeled concentrations after the LEO-informed adjustment, % Valid retrieval: the percentage of valid GEMS observations per day, NOx (prior): a priori NOx emissions, and NOx (posterior GEMS): a posteriori NOx emissions after the GEMS-informed adjustment.

Figure 8 illustrates time series of daytime mean surface NO2 concentrations observed and modeled in Korea and China during MAM, aligned with the number of valid observations afforded by GEMS retrievals. While the GEMS-informed adjustment was generally more effective in moderating the model’s prior biases, a noticeable aspect was the response of the a posteriori concentrations to the availability of valid observation references. On days with minimal valid data given (i.e., valid number of retrievals averaged between 0 and 1), including March 13, 14, 17, 18, and April 13 in Korea, the GEMS-informed inversion led to minor adjustments in the corresponding NO2 concentrations due to limited top-down information (Fig. 8a). Meanwhile, the LEO-informed inversion allowed the modeled concentrations to be adjusted in a continuous manner, taking advantage of the monthly-averaged observation data applied during the inversion, but with less pronounced improvements. A similar pattern was observed sporadically between the modeled and observed concentrations in China, but to a less severe extent (Fig. 8b).

Fig. 8

Time series of the daytime mean surface NO2 concentrations observed and modeled at 459 AirKorea sites in Korea and 250 MEE sites in China during MAM 2022. Note that each day begins at 00 UTC on the time axis. OBS: observed concentrations, CMAQ (prior): modeled concentrations using the a priori emissions, CMAQ (posterior GEMS): modeled concentrations after the GEMS-informed adjustment, CMAQ (posterior LEO): modeled concentrations after the LEO-informed adjustment, and the number of valid GEMS observations per day.

Conclusion

Our study introduces the potential of geostationary observation data for adjusting an emissions inventory on an hourly basis, made possible by GEMS’s unprecedented sampling frequency. The GEMS- and LEO-informed adjustments to the NOx emissions inventory, despite both exploiting observed tropospheric NO2 columns, adopted different approaches in response to the available top-down information. The fine temporal resolution of GEMS observation data was beneficial for capturing daytime variations in NOx emissions in a top-down manner, improving the simulation accuracy of NO2 loadings in Asia to a certain extent. The inversion informed by our LEO proxy was also generally beneficial in reducing the model biases but had limitations in adjusting diurnal emission profiles. This is attributed to its reliance on limited top-down information, such as monthly-averaged discrepancies between prior simulations and observations, which may not accurately reflect the temporally dynamic nature of NOx emissions. Moreover, our LEO-informed inversion did not achieve the level of improvement showcased in earlier successful studies using actual LEO instruments, mainly due to our relatively simplistic bias correction efforts applied to the observed NO2 columns. This represents a limitation of our study and suggests the need for follow-up research as GEMS products mature and more comprehensive bias-correction guidance becomes available. Additionally, it is important to note that our model evaluations were largely confined to East Asia, due to the current availability of station measurement data.

Nevertheless, the distinct responses of NOx emissions to the GEMS- and LEO-informed inversions highlight the potential of geostationary observation data in refining the emissions with more detailed temporal nuance. Our findings emphasize the utility of geostationary observation data in air quality research and advocate for the development of more advanced strategies to maximize its use. This will enable more granular air quality analyses across Asia and potentially beyond, especially as we anticipate the complete deployment of the constellation of GEO satellite instruments across the Northern Hemisphere in the near future.

Methods

Modeling setup

Meteorology plays a critical role in air quality simulations by guiding the dispersion and transport of air pollutants. We simulated meteorological fields over the modeling domain during MAM 2022 by using the Weather Research and Forecasting (WRF) model (version 3.8)61. This enabled us to simulate hourly meteorology over a 320 × 320 grid across 35 vertical layers at a spatial resolution of 27 km. Note that we focused on the spring season as a proof of concept, as it is relatively free from extreme weather, increased energy demands, and associated emission sources that could complicate the analysis.

CMAQ serves as a forward model in the emission adjustment process (see Sect. 4.4). Using established emissions input, CMAQ simulates corresponding air pollutant concentrations three-dimensionally within the atmosphere. This enables us to align our simulations with actual observed air quality, which is an essential step for refining emissions inventories. We simulated air quality and chemistry using CMAQ (version 5.2)13 and its Decoupled Direct Method in Three Dimensions (CMAQ DDM-3D)62. CMAQ DDM-3D calculates first-order coefficients that represent the locally semi-normalized sensitivity of modeled concentrations to changes in emissions. At the same spatial resolution used for WRF simulations, we simulated hourly NO2 concentrations over a 300 × 300 grid and computed their corresponding sensitivities to NOx emissions. Building upon previous studies over Asia27,34,36, we employed CMAQ configurations that have been validated in similar contexts across the region. However, there is still a limitation with the CB05 chemical mechanism coupled with CMAQ in our study, which is known to underestimate the recycling of alkyl nitrates back to NO263,64, potentially leading to lower NO2 concentrations. Although we did not attempt to adopt other mechanisms, which would require species mapping that could introduce additional uncertainties, we implemented measures to mitigate the potential underestimation of NO2. We deliberately disabled surface nitrous acid (HONO) interactions within the model to avoid additional OH production from existing HONO. This was intended to reduce the potential for OH-driven scavenging of NOx and formation of secondary aerosols (which do not readily convert back to NO2), which could otherwise contribute to premature depletion of NOx and lower NO2 concentrations, misleading our inversion. Both the WRF and CMAQ simulations began with a 10-day spin-up from February 19. Further technical details of our modeling setup are listed in Supplementary Table S2.
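To make the meaning of a first-order semi-normalized sensitivity concrete, the sketch below applies the generic first-order Taylor interpretation: if S = E · ∂C/∂E, then a fractional emission change δ gives C ≈ C0 + S · δ. This is an illustrative assumption about how such coefficients are commonly read, not a description of CMAQ DDM-3D internals or of how the coefficients enter the inversion; the function name and the toy values are hypothetical.

```python
import numpy as np

def first_order_response(c_base, s_semi, frac_change):
    """Estimate concentrations after a fractional change in emissions.

    c_base      : base-case modeled NO2 concentrations (e.g., ppb)
    s_semi      : semi-normalized first-order sensitivities, S = E * dC/dE,
                  in the same units as c_base
    frac_change : fractional emission perturbation (0.2 means +20%)
    """
    c_base = np.asarray(c_base, dtype=float)
    s_semi = np.asarray(s_semi, dtype=float)
    # First-order Taylor expansion around the base emissions.
    return c_base + s_semi * frac_change

if __name__ == "__main__":
    # Toy values for two grid cells; not taken from the study.
    base = [10.0, 20.0]
    sens = [6.0, 15.0]
    print(first_order_response(base, sens, 0.2))   # +20% NOx emissions
    print(first_order_response(base, sens, -0.2))  # -20% NOx emissions
```

Because NOx chemistry is nonlinear, a first-order estimate of this kind is expected to hold best for modest perturbations around the base emissions.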

Emissions in Asia

Emissions inventories inform CTMs about the sources and quantities of air pollutants, enabling the simulation of their emergence, behavior, and eventual concentrations in the atmosphere. To prepare anthropogenic emissions over the modeling domain, we used the Emissions Database for Global Atmospheric Research (EDGAR) 6.165. EDGAR provides annual figures (base year: 2018) of greenhouse gas and air pollutant emissions at a 0.1° spatial resolution. We processed these emissions into a format compatible with CMAQ using the Sparse Matrix Operator Kernel Emissions (SMOKE) 4.766. This included speciation of the inventory emissions, re-gridding the emissions into a 27 km resolution grid, and assigning the annual emissions into hourly emission estimates during MAM 2022 while accounting for time zones, weekday-weekend routines, and local diurnal variations.
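The temporal allocation described above can be pictured with a toy calculation. The following is a minimal sketch of the general idea only, not SMOKE's actual implementation: the monthly, weekly, and diurnal profile values and the function name `hourly_emission` are hypothetical and chosen purely for illustration.

```python
import calendar
from datetime import datetime, timedelta

# Hypothetical temporal profiles (illustrative values, not taken from EDGAR or SMOKE).
MONTHLY_SHARE = {3: 0.085, 4: 0.083, 5: 0.084}                # share of the annual total per month
WEEKDAY_FACTOR = [1.05, 1.05, 1.05, 1.05, 1.05, 0.90, 0.85]   # Mon..Sun, averages to 1
DIURNAL_SHARE = [0.02] * 6 + [0.05] * 16 + [0.04] * 2         # local hours 0..23, sums to 1


def hourly_emission(annual_total, utc_time, utc_offset_hours):
    """Allocate an annual emission total to one model hour given in UTC.

    The profiles are looked up in local time, so the time zone decides
    which weekday factor and which diurnal share apply.
    """
    local = utc_time + timedelta(hours=utc_offset_hours)
    days_in_month = calendar.monthrange(local.year, local.month)[1]
    daily_mean = annual_total * MONTHLY_SHARE[local.month] / days_in_month
    return daily_mean * WEEKDAY_FACTOR[local.weekday()] * DIURNAL_SHARE[local.hour]


# Saturday 20:00 UTC is already Sunday 05:00 at UTC+9, so the Sunday factor applies.
example = hourly_emission(1000.0, datetime(2022, 3, 5, 20), utc_offset_hours=9)
```

In this simplified form the weekday factors average to one over a week, so monthly totals are only approximately conserved.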

To prepare biogenic and biomass burning emissions, we utilized the Model of Emissions of Gases and Aerosols from Nature (MEGAN) 3.067 and the Fire Inventory from the National Center for Atmospheric Research (FINN) 1.568, respectively. MEGAN estimates net emissions of gases and aerosols from terrestrial ecosystems, factoring in vegetation responses to meteorology. We obtained hourly biogenic emissions at a 27 km resolution, using the Reprocessed Moderate Resolution Imaging Spectroradiometer (MODIS) version 6 Leaf Area Index product69 and the Visible Infrared Imaging Radiometer Suite (VIIRS) global Green Vegetation Fraction product70, both at a 0.05° resolution, and WRF-simulated meteorology as key inputs to MEGAN. FINN provides emissions from open biomass burning, such as wildfires, agricultural fires, and prescribed burning, based on MODIS observations and fuel loading parameters. We obtained hourly biomass burning emissions over the modeling domain at a 27 km resolution. We then merged the anthropogenic, biogenic, and biomass burning emissions together to prepare the a priori emissions input for CMAQ.

Satellite data: GEMS tropospheric NO2 columns and LEO proxy

GEMS is the first UV-visible geostationary hyperspectral imager that measures the earth’s radiance and solar irradiance within the 300 to 500 nm wavelength range, aiming to monitor atmospheric gases and aerosols in the broader Asia-Pacific region. We utilized the hourly Level 2 GEMS tropospheric NO2 product (version 2.0) to gain a comprehensive overview of NO2 pollution across Asia. This dataset, available since November 2022, includes observations from November 2020 to the present. Covering longitudes from 75° E to 145° E and latitudes from 5° S to 45° N, GEMS records 8 to 10 consecutive hourly snapshots of tropospheric NO2 columns during daylight hours in MAM, at a spatial resolution of 3.5 km × 8 km44. We selected GEMS’s tropospheric NO2 columns observed from March 1 to May 31, 2022, to serve as the top-down observation references for updating the a priori emissions. Specifically, GEMS recorded 8 observations every day from 00:45 to 06:45 and at 23:45 UTC in March, with this sampling frequency increasing to 10 observations from 00:45 to 07:45, and at 22:45 and 23:45 UTC during April and May. In local hours, the full temporal span of daytime observations extends from 07:45 AM to 04:45 PM in Korea and 06:45 AM to 03:45 PM in China. Beyond tropospheric NO2 columns, we utilized several variables derived from the Level 2 data, including averaging kernel, cloud fraction, troposphere and stratosphere air mass factors, modeled pressures interpolated to GEMS layers, algorithm quality flags, and root mean square error. The averaging kernels and air mass factors were used to adjust for the vertical sensitivity of the retrievals and to mitigate the influence of initial profile assumptions (a priori profiles), following the established approach in a previous study34. To ensure the quality of the observation references, we used pixels with an algorithm quality flag of 0 (good sample) and cloud fractions smaller than 0.3. Pixels whose pressure profiles exhibited unrealistically high values (e.g., 2,000 hPa), which occasionally occurred at the edges of retrievals, were excluded from our study. We also accounted for the inherent low bias in GEMS tropospheric NO2 columns, which show an average discrepancy of −0.255 × 10^15 molecules/cm2 relative to Pandora spectrometer measurements in highly polluted grid cells (≥ 0.5 × 10^16 molecules/cm2)49, by selectively adjusting the observed values upwards.
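The screening criteria above can be summarized in a short sketch. This is an illustration under stated assumptions rather than the processing code used in the study: the array layout, the function names, the 1,100 hPa plausibility ceiling for pressure, and the form of the bias adjustment (a constant offset added to highly polluted pixels) are all hypothetical choices.

```python
import numpy as np


def screen_gems_pixels(no2_col, quality_flag, cloud_fraction, pressure_hpa,
                       max_cloud=0.3, max_pressure_hpa=1100.0):
    """Return NO2 columns with rejected pixels set to NaN.

    pressure_hpa has shape (n_pixels, n_layers); max_pressure_hpa is an
    assumed plausibility ceiling, not a value given in the text.
    """
    no2_col = np.asarray(no2_col, dtype=float)
    good = np.asarray(quality_flag) == 0                    # good-sample flag
    good &= np.asarray(cloud_fraction) < max_cloud          # mostly cloud-free
    good &= np.nanmax(np.asarray(pressure_hpa, dtype=float), axis=-1) <= max_pressure_hpa
    return np.where(good, no2_col, np.nan)


def adjust_low_bias(no2_col, threshold=0.5e16, offset=0.255e15):
    """Shift highly polluted pixels upwards by a constant offset (assumed form)."""
    no2_col = np.asarray(no2_col, dtype=float)
    return np.where(no2_col >= threshold, no2_col + offset, no2_col)


columns = [1.0e16, 2.0e15, 3.0e15, 4.0e15]           # molecules/cm2
flags = [0, 0, 1, 0]
clouds = [0.1, 0.5, 0.1, 0.2]
pressures = [[1000, 900], [1000, 900], [1000, 900], [2000, 900]]
screened = screen_gems_pixels(columns, flags, clouds, pressures)
adjusted = adjust_low_bias(screened)
```

Only the first pixel survives in this example: the others fail the cloud, quality flag, and pressure checks, respectively.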

To evaluate the effectiveness of using multi-hour observations from the GEO platform for emissions adjustment and to enable a comparison between the a posteriori emissions adjusted based on GEO and LEO observations, we created ‘LEO proxy’ data. This proxy uses GEMS’s 04:45 UTC tropospheric NO2 columns as a stand-in for corresponding LEO observations, mimicking the temporal resolution of typical LEO satellite instruments. We selected the monthly averaged LEO proxy during MAM as benchmarks for updating the a priori emissions in each month. This approach enabled us to assess the utility of temporally more continuous observation data, rather than providing a direct comparison between the utilities of GEMS and other LEO instruments.

Top-down approach for the NOx emissions adjustment

The extent of NOx emissions is not directly observable through GEMS. Therefore, to establish quantitative constraints on NOx emissions and provide adjustments accordingly, we employed a Bayesian approach for inverse modeling, suited for solving problems that are not grossly nonlinear71. Recognizing the short lifetime of NO237,38,39,40, we explicitly assumed a linear, local relationship between NOx emissions and NO2 columns. However, it is important to note that the observed NO2 columns at each retrieval time are the consequence of not only the emissions from that specific hour but also the NO2 lingering from previous hours, given the short yet still multi-hour lifetime of NOx38,40. For example, the availability of nighttime ozone and hydroxyl radicals (OH) can introduce nonlinearity between NO2 concentrations and NOx emissions27, a complexity that falls outside the scope of our study. Our method aimed to infer the most probable extent of NOx emissions by minimizing the Bayesian-informed cost function in Eq. 1.

$$J\left(x\right) = \frac{1}{2}\left(y - Fx\right)^{T} S_{o}^{-1} \left(y - Fx\right) + \frac{1}{2}\left(x - x_{a}\right)^{T} S_{e}^{-1} \left(x - x_{a}\right) \qquad (1)$$

This per-grid-cell process, based on Gaussian error assumptions, involved determining the a posteriori emissions \(\:\varvec{x}\) given multiple sources of information, including the a priori emissions \(\:{\varvec{x}}_{\varvec{a}}\), top-down observation references \(\:\varvec{y}\) (either GEMS tropospheric NO2 columns or LEO proxy NO2 columns), and their counterpart \(\:\varvec{F}\) from the forward model (CMAQ-simulated NO2 columns). Because GEMS retrievals fall 15 minutes before the top of each hour (from 00:45 to 23:45 UTC), we aligned these observations to the nearest subsequent hour in the 00:00–23:00 UTC cycle. For example, GEMS observations at 04:45 UTC were used for constraining the emissions at 05:00 UTC. \(\:{\varvec{S}}_{\varvec{e}}\) and \(\:{\varvec{S}}_{\varvec{o}}\) represent the uncertainties in emissions and observations, respectively. Emission uncertainties were set at 50%, 200%, and 100% for anthropogenic, biogenic, and biomass burning emissions, respectively, based on previous NOx inversion studies across Asia27,34,36. Observation error was sourced from the GEMS Level 2 data. The corresponding observation and emission errors, represented by RMSE, are shown in Supplementary Fig. S3.
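The alignment of retrieval times to model hours is simple enough to state in code. The following is a minimal sketch; the function name `align_to_next_hour` is hypothetical, and the handling of the 23:45 UTC retrieval (mapped to 00:00 UTC of the following day) is an assumption based on the 00:00–23:00 UTC cycle mentioned above.

```python
from datetime import datetime, timedelta


def align_to_next_hour(retrieval_utc):
    """Map a retrieval time to the nearest subsequent whole hour (UTC)."""
    floored = retrieval_utc.replace(minute=0, second=0, microsecond=0)
    if retrieval_utc == floored:
        return floored                      # already on the hour
    return floored + timedelta(hours=1)     # e.g., 04:45 -> 05:00


midday = align_to_next_hour(datetime(2022, 3, 1, 4, 45))
late = align_to_next_hour(datetime(2022, 3, 1, 23, 45))
```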

To locate the minimum of the cost function, where its first derivative vanishes, we applied the Gauss-Newton method as described in Eq. 2. This involved refining the estimate of \(\:\varvec{x}\) (with each iteration denoted as \(\:\varvec{i}\); \(\:\varvec{i}=1\) in this study), gradually advancing towards convergence of the solution. One limitation of this study is that we did not pursue extensive iterations, as our primary aim was to assess the potential of the newly available observation data rather than to maximize the accuracy of the model itself. The Jacobian matrix \(\:\varvec{K}\) describes the sensitivity relationship between NOx emissions and NO2 concentrations; it was calculated by CMAQ DDM-3D once at the beginning of the simulations and then used as a fixed matrix during the inversion. The forward model \(\:\varvec{F}\) is updated in every iteration, guiding the inversion towards more accurate outcomes.

$$\widehat{x}_{i+1} = x_{a} + S_{e}K_{i}^{T}\left(K_{i}S_{e}K_{i}^{T} + S_{o}\right)^{-1}\left[y - Fx_{i} + K_{i}\left(x_{i} - x_{a}\right)\right] \qquad (2)$$
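A single Gauss-Newton update of this form can be sketched for one grid cell and one retrieval hour. The example below is a toy under several assumptions that are not taken from the study: the state vector holds three emission sectors, the forward model is a linear stand-in for CMAQ, the Jacobian and all numerical values are invented, and the error covariances are diagonal with standard deviations equal to the stated percentages of the a priori emissions.

```python
import numpy as np


def cost(x, x_a, y, forward, S_e, S_o):
    """Cost function J(x): observation misfit plus departure from the prior."""
    r_obs = y - forward(x)
    r_pri = x - x_a
    return 0.5 * r_obs @ np.linalg.solve(S_o, r_obs) + 0.5 * r_pri @ np.linalg.solve(S_e, r_pri)


def gauss_newton_step(x_i, x_a, y, forward, K, S_e, S_o):
    """One update: x_a + S_e K^T (K S_e K^T + S_o)^-1 [y - F(x_i) + K (x_i - x_a)]."""
    innovation = y - forward(x_i) + K @ (x_i - x_a)
    return x_a + S_e @ K.T @ np.linalg.solve(K @ S_e @ K.T + S_o, innovation)


# Toy setup (all values hypothetical): anthropogenic, biogenic, biomass burning.
x_a = np.array([10.0, 2.0, 1.0])
rel_unc = np.array([0.5, 2.0, 1.0])           # 50%, 200%, 100% of the prior
S_e = np.diag((rel_unc * x_a) ** 2)
K = np.array([[0.5, 0.5, 0.5]])               # sensitivity of the column to each sector
S_o = np.array([[1.0]])                       # observation error variance
y = np.array([8.0])                           # observed column


def forward(x):
    return K @ x                              # linear stand-in for the CMAQ column


x_1 = gauss_newton_step(x_a, x_a, y, forward, K, S_e, S_o)
```

Because the sectors with larger prior uncertainty receive larger shares of the correction, the anthropogenic and biogenic terms absorb most of the mismatch in this toy case. With a linear forward model a second step returns the same estimate, whereas a nonlinear model would generally need further iterations.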

Using GEMS NO2 columns, this inversion was performed whenever the top-down constraints were available. This allowed us to constrain hourly NOx emissions corresponding to GEMS’s daytime retrieval hours, while leaving the emissions unchanged during the hours without observations. To utilize LEO proxy NO2 columns, we employed the inversion method from previous studies27,28,36, which used time-averaged LEO observation data to constrain emissions. By resolving the discrepancy between the monthly observed and modeled NO2 columns during MAM, this inversion constrained hourly NOx emissions uniformly throughout the entire day. After completing these inversions, we reran the model using the a posteriori emissions to simulate NO2 concentrations over the modeling domain.
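The contrast between the two ways of applying the constraints can be illustrated with a toy example. This is a minimal sketch that assumes each inversion result is expressed as a ratio of a posteriori to a priori emissions; the function names and ratio values are hypothetical.

```python
def apply_geo_adjustment(prior_hourly, hourly_ratio):
    """Scale only the hours that have an observation-based ratio.

    hourly_ratio maps a UTC hour (0..23) to an a posteriori / a priori ratio;
    hours without an entry keep their a priori emissions.
    """
    return [e * hourly_ratio.get(hour, 1.0) for hour, e in enumerate(prior_hourly)]


def apply_leo_adjustment(prior_hourly, monthly_ratio):
    """Scale every hour of the day by the same monthly ratio."""
    return [e * monthly_ratio for e in prior_hourly]


prior = [1.0] * 24                                    # flat a priori profile (toy)
geo = apply_geo_adjustment(prior, {1: 1.2, 5: 0.8})   # ratios for 01:00 and 05:00 UTC
leo = apply_leo_adjustment(prior, 1.1)
```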

Note that we sought to balance the complexity of NOx chemistry and transport with the practical constraints of computational resources. While these inversions assume a local relationship between NOx emissions and NO2 columns, our use of CMAQ DDM-3D partially addresses the non-locality issue by accounting for advection and diffusion, providing sensitivities of NO2 concentrations to changes in NOx emissions from specific sources across all grid cells in the model domain. However, it does not fully resolve the transport effects related to NOx’s lifetime. To further manage this complexity, we deliberately chose not to adjust nighttime emissions. This gives a “pause” in the model, allowing the previous day emissions’ influence on early morning NO2 concentrations to diminish overnight. By isolating the daily emission cycle in this way, we aimed to simplify the interpretation of the relationship between daytime emissions and observed NO2 loadings. However, we acknowledge that this approach does not fully address the carryover effects across daylight hours themselves, which likely introduced transport errors into our inversion. During daylight hours, we chose not to adjust emissions when GEMS data were unavailable, and instead examined GEMS’s valid observation count each day, allowing us to maintain the integrity of our evaluation. While techniques like rolling averages or climatological updates could fill gaps during cloudy periods, our primary focus was to assess the GEMS NO2 product’s utility in its relatively unaltered state. By not adjusting emissions during these gaps, we ensured that our evaluation distinguishes the data’s strengths and limitations without introducing additional uncertainties. This enabled a more direct comparison with the LEO-informed inversion, allowing us to identify the strengths of each approach.

Ground-truth for model evaluation

Prior to proceeding with CMAQ simulations, we assessed the accuracy of the WRF-simulated meteorological fields at ground-based weather stations in Korea. We obtained hourly measurements of 2 m air temperature and the U and V components of 10 m wind observed at 95 sites during MAM 2022, sourced from the Korea Meteorological Administration. The modeled meteorology showed fair agreement with station measurements (Supplementary Fig. S6), with IOA ranging from 0.65 to 0.93.
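For reference, the agreement statistic can be computed as follows. This sketch assumes that IOA denotes Willmott's index of agreement in its original form; the function name and the treatment of missing values (pairs containing `None` are skipped) are illustrative choices.

```python
def index_of_agreement(modeled, observed):
    """Willmott's index of agreement: 1 - sum((M-O)^2) / sum((|M-Obar| + |O-Obar|)^2)."""
    pairs = [(m, o) for m, o in zip(modeled, observed) if m is not None and o is not None]
    o_mean = sum(o for _, o in pairs) / len(pairs)
    numerator = sum((m - o) ** 2 for m, o in pairs)
    denominator = sum((abs(m - o_mean) + abs(o - o_mean)) ** 2 for m, o in pairs)
    return 1.0 - numerator / denominator


perfect = index_of_agreement([1.0, 2.0, 3.0], [1.0, 2.0, 3.0])
flat = index_of_agreement([2.0, 2.0, 2.0], [1.0, 2.0, 3.0])
```

A value of 1 indicates perfect agreement, while a model that only reproduces the observed mean scores 0 in this example.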

To evaluate the accuracy of CMAQ simulations before and after updating the NOx emissions inventory, we obtained hourly surface NO2 concentrations (in ppb) observed at ground-based monitoring stations across Asia during MAM 2022, sourced from Korea’s Ministry of Environment (AirKorea) and China’s Ministry of Ecology and Environment (MEE). To ensure the quality of AirKorea measurements, from an original count of 515 stations, we excluded those with more than 50% missing data during the validation period16,36, which resulted in a 9.93% data loss and left 459 stations. To ensure the quality of MEE measurements, we applied data filtering methods72,73,74 to the measurements at 250 control points across China. This included discarding negative concentrations and duplicate records (> 4 consecutive repeats) caused by equipment failures, resulting in a 0.43% decrease in the number of data points. Note that we converted MEE’s native measurements in µg/m3 to ppb74.
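The quality control and unit conversion steps can be sketched as below. Several details are assumptions rather than the procedures of the cited studies: the conversion uses a molar volume of 24.45 L/mol (25 °C, 1 atm), whole runs of more than four identical values are discarded, missing values are represented by `None`, and all function names are hypothetical.

```python
NO2_MOLAR_MASS = 46.0055    # g/mol
MOLAR_VOLUME_25C = 24.45    # L/mol at 25 degC and 1 atm (assumed reference conditions)


def ugm3_to_ppb(conc_ugm3, molar_volume=MOLAR_VOLUME_25C):
    """Convert an NO2 mass concentration (ug/m3) to a mixing ratio (ppb)."""
    return conc_ugm3 * molar_volume / NO2_MOLAR_MASS


def screen_records(values, max_repeats=4):
    """Set negative values and runs of more than max_repeats identical values to None."""
    cleaned = [None if (v is None or v < 0) else v for v in values]
    start = 0
    while start < len(cleaned):
        end = start
        while end + 1 < len(cleaned) and cleaned[end + 1] == cleaned[start]:
            end += 1                                    # extend the run of equal values
        if cleaned[start] is not None and end - start + 1 > max_repeats:
            for k in range(start, end + 1):
                cleaned[k] = None                       # discard the stuck-sensor run
        start = end + 1
    return cleaned


def has_enough_data(values, max_missing_fraction=0.5):
    """Keep a station unless more than half of its records are missing."""
    missing = sum(v is None for v in values)
    return missing / len(values) <= max_missing_fraction


screened = screen_records([5, -1, 7, 7, 7, 7, 7, 3])
```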