Hybrid transformer and physics-informed neural operator for correcting TEMPO NO2 biases over North America

Kayastha, Sagun Gopal; Park, Jincheol; Choi, Yunsoo

doi:10.1038/s44407-026-00056-7

Article
Open access
Published: 06 March 2026

npj Clean Air volume 2, Article number: 15 (2026)

Abstract

Uncertainty in the Air Mass Factor (AMF) causes systematic biases in satellite-retrieved nitrogen dioxide (NO₂) vertical column densities (VCDs). We introduce the first physics-informed neural network that directly refines TEMPO’s AMF to improve the conversion of its slant columns to VCDs within a self-sufficient data pipeline. Our unique Transformer-Fourier Neural Operator hybrid architecture learns the dependencies among 2D and 3D radiative transfer features that govern AMF, using a Huber loss that enforces consistency between predicted AMF and radiative transfer theory. Trained on 74,919 TEMPO-Pandora observation pairs across North America from August 2023 to December 2024, our bias correction framework improves R² from 0.58 to 0.80 and reduces RMSE by 30%, with stable performance across all seasons. By incorporating an explicit physical constraint during training rather than relying on post-hoc bias fitting, our approach complements purely data-driven learning and provides a theory-consistent correction of AMF-driven biases in satellite VCD retrievals.

Introduction

Nitrogen dioxide (NO₂), emitted by combustion sources such as traffic, power plants, and biomass burning, is a critical trace gas influencing atmospheric composition. NO₂ contributes substantially to tropospheric and stratospheric chemistry by driving acid deposition, serving as the primary precursor of tropospheric ozone (O₃), and modulating the lifetime of greenhouse gases such as methane, thereby affecting the Earth’s radiative balance^1,2,3,4. Epidemiological evidence links NO₂ exposure to respiratory and cardiovascular disease, including asthma exacerbation, chronic obstructive pulmonary disease, and lung cancer, as well as to increased premature mortality^5,6,7. Given its short atmospheric lifetime, NO₂ concentrations peak near urban emission sources^8,9, underscoring the need for precise quantification of ambient NO₂ to inform regulatory planning and public health.

To better characterize the spatiotemporal behavior of NO₂, a combination of in situ observations, numerical modeling, and satellite-based remote sensing has been widely adopted, with each offering distinct advantages and limitations^10,11,12. Ground-based air quality monitoring networks provide high-fidelity data at specific sites but are often too sparse for comprehensive regional or global assessments, especially in rural and remote areas where monitoring infrastructure is limited^13,14. Chemical transport models (CTMs), such as the GEOS-Chem¹⁵ and Community Multiscale Air Quality (CMAQ) model¹⁶, have been widely used to estimate air pollutant concentrations across vast areas in three dimensions but remain subject to uncertainties in emission inventories, meteorological inputs, and model parameterizations^17,18,19. Satellite-based remote sensing has been instrumental in overcoming the limited spatial coverage of ground-based monitoring by providing extensive observations of NO₂ loadings. Since the late 1990s, a series of polar-orbiting UV-visible spectrometers, including the Global Ozone Monitoring Experiment (GOME)²⁰, SCanning Imaging Absorption SpectroMeter for Atmospheric CHartographY (SCIAMACHY)²¹, Ozone Monitoring Instrument (OMI)²², GOME-2²³, and TROPOspheric Monitoring Instrument (TROPOMI)²⁴, have provided multi-decadal measurements of NO₂ column densities. These datasets, which typically offer near-daily global coverage, have been essential for monitoring air pollution dynamics, evaluating emission trends, and advancing atmospheric chemistry research worldwide.

Recent advances in satellite instruments have introduced new observational capabilities through geostationary platforms, complementing earlier sun-synchronous sensors by enabling continuous intra-day observations of air pollution levels²⁵. This new generation of geostationary instruments includes the Geostationary Environment Monitoring Spectrometer (GEMS)^26,27, which provides hourly measurements of columnar loadings of air pollutants, including NO₂, across Asia at 7 × 8 km resolution. The Tropospheric Emissions: Monitoring of Pollution (TEMPO)^28,29 provides hourly snapshots of air quality over North America at a resolution of ~2.1 × 4.75 km, facilitating the detailed characterization of urban emissions and regional air pollution patterns. The Sentinel-4 mission³⁰, launched by the European Space Agency in July 2025, will offer similar observation capabilities over Europe and North Africa. Such advances not only enhance monitoring capacity but also enable more precise adjustments to inventoried air pollutant emissions, beyond what ground-based monitoring alone can achieve^1,31,32,33,34. They also provide top-down observational constraints for evaluating the reliability of CTM simulations, identifying systematic biases, and guiding improvements to model inputs and parameterizations^35,36. More recently, satellite observations have been increasingly fused with ground-based measurements, CTM outputs, or land use and land cover (LULC) data through machine learning (ML) and deep learning (DL) approaches to refine estimates of surface NO₂ concentrations^37,38,39.

Despite such advances, satellite-based NO₂ retrievals have been subject to systematic errors, which are often pronounced under certain viewing geometries or instrument-specific limitations. For example, retrievals made at coarse spatial resolution smooth out sub-grid NO₂ gradients through spatial averaging, which usually leads to an underestimation of NO₂ levels in polluted areas^13,40. Furthermore, cloud contamination necessitates a data filtering process that may exclude a non-negligible portion of retrievals, resulting in information loss and potential systematic bias⁴¹. Beyond these observational constraints, the most critical source of uncertainty arises from the Air Mass Factor (AMF), which is required to convert slant column densities (SCDs) of NO₂ into vertical column densities (VCDs)^42,43. AMFs are typically calculated using radiative transfer models (RTMs) that require accurate ancillary inputs, including surface reflectance, cloud and aerosol properties, and the assumed vertical distribution of trace gases. To improve computational efficiency, these models often employ look-up tables (LUTs) that provide precomputed AMF values across a range of atmospheric and surface conditions, typically relying on simplified assumptions^44,45,46. This process can introduce or amplify uncertainties when the assumed conditions deviate from real-world atmospheric conditions, thereby propagating errors into the resulting VCDs of air pollutants. Because stratospheric AMFs are generally stable and have low uncertainty, the majority of AMF-related retrieval error arises from the tropospheric component. The influence of such uncertainties has been well documented across multiple satellite platforms, including OMI^47,48,49, GOME-2^50,51, and TROPOMI^52,53,54. For example, up to 45% of retrieval error in OMI’s tropospheric NO₂ VCD has been attributed to inaccuracies in AMF calculation and the separation process between stratospheric and tropospheric NO₂ columns⁵⁵. Ground-based evaluations of TROPOMI tropospheric NO₂ columns using Pandora spectrometer data, collected by the Pandonia Global Network (PGN), report AMF uncertainties ranging from 10% to 35%, primarily driven by unaccounted aerosol effects and directional surface reflectance anisotropy⁵⁶. Errors in AMF can also vary substantially across different radiative transfer schemes, by 31% to 42%, depending on the assumptions used to represent atmospheric conditions⁴³.
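
The dependence of the VCD on the assumed NO₂ profile described above can be illustrated with a small numerical sketch. It assumes the common layer-wise formulation in which the AMF is the partial-column-weighted mean of per-layer scattering weights, with VCD = SCD / AMF. The layer values and function names are invented for illustration and are not taken from the TEMPO retrieval.

```python
def air_mass_factor(scattering_weights, partial_columns):
    """AMF as the partial-column-weighted mean of layer scattering weights."""
    total = sum(partial_columns)
    weighted = sum(w * x for w, x in zip(scattering_weights, partial_columns))
    return weighted / total


def vertical_column(scd, amf):
    """Convert a slant column density to a vertical column density."""
    return scd / amf


# Hypothetical 4-layer atmosphere, surface layer first. Sensitivity to NO2
# is assumed to be lowest near the surface.
weights = [0.4, 0.8, 1.5, 2.0]

# Two assumed a priori profiles with the same total column (arbitrary units).
profile_surface = [6.0, 2.0, 1.0, 1.0]  # NO2 concentrated near the surface
profile_mixed = [3.0, 3.0, 2.0, 2.0]    # NO2 spread through the column

amf_surface = air_mass_factor(weights, profile_surface)
amf_mixed = air_mass_factor(weights, profile_mixed)

scd = 1.0e16  # molecules/cm2, illustrative slant column
vcd_surface = vertical_column(scd, amf_surface)
vcd_mixed = vertical_column(scd, amf_mixed)
```

With the same slant column, the surface-weighted profile yields a smaller AMF and therefore a larger VCD, which is why errors in the assumed profile shape propagate directly into the retrieved column.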

Recent studies have revealed that recalculating AMF with regionally and temporally specific inputs into RTMs can substantially reduce VCD retrieval errors. In southern China, incorporating high-resolution surface reflectance and aerosol inputs reduced errors by 25% to 30%⁵⁷. Observation-based corrections in aerosol representation, such as those using the Absorbing Aerosol Index, reduced AMF biases by 10% to 15% during intense biomass-burning events⁵⁸. Regional optimization of AMFs based on a priori knowledge of NO₂ vertical profiles, such as those derived from CTMs, improved retrieval accuracy by 15% to 25% in urban settings⁵⁹. Replacing these profiles with long-term in situ observation data reduced uncertainties in AMF by up to 20%⁶⁰. In addition to spatial representativeness, temporal mismatches between the satellite overpass time and the meteorological or profile inputs used in AMF calculation can introduce significant retrieval errors. Prior work has shown that aligning AMF inputs to the correct observation time, rather than using temporally coarse or offset fields, can meaningfully reduce these errors⁶¹. These results underscore the necessity for more accurate AMF calculations and bias-correction efforts that explicitly account for spatial and temporal variability in atmospheric conditions, thereby improving the accuracy of top-down NO₂ column retrievals.

Deterministic corrections of satellite-derived NO₂ columns have historically relied on empirical adjustments against ground-based measurements, improvements to RTMs, and refinements of a priori NO₂ profile assumptions^50,62,63. These approaches, typically involving regression-based bias correction or full-product reprocessing, have been effective at mitigating systematic errors associated with surface reflectance, aerosol, and cloud properties. However, they often struggle to fully capture the complex, multivariate dependencies that shape AMF behaviors and consequently influence NO₂ retrieval accuracy. In recent years, deep learning-based approaches have been increasingly explored for improving satellite NO₂ data, with applications including spatial resolution enhancement⁵⁶, surface-level concentration estimation^64,65, and gap-filling of missing retrievals⁶⁶. Beyond these general applications, several studies have applied machine learning and deep learning techniques to correct biases in satellite NO₂ data more directly. For example, Wu et al.⁶⁷ employed a back-extrapolation framework that leveraged a random forest model trained on satellite NO₂ columns and ground-level NO₂-related predictors to correct biases in long-term daily surface NO₂ concentrations across China. Oak et al.⁶⁸ developed a two-step bias-correction approach for GEMS tropospheric NO₂ columns, first refining the original GEMS retrievals and then applying a Light Gradient Boosting Machine (LightGBM) trained on co-located TROPOMI NO₂ columns to further reduce systematic biases in GEMS retrievals over East Asia. Ghahremanloo et al.²⁵ applied a deep convolutional neural network (Deep-CNN) to correct biases in hourly GEMS tropospheric NO₂ columns from 2021 to 2023. The model was trained on 17,879 Pandora collocations using extensive feature selection, improving Pearson’s correlation coefficient (R) between the GEMS and Pandora NO₂ columns from 0.68 to 0.88 and reducing mean absolute bias by more than 50%.

More recently, advances in deep learning have introduced a new class of architectures known as neural operators, which extend traditional neural networks beyond fixed-length vector inputs to learn mappings between spatial or spatiotemporal function fields, such as air pollutant concentrations, wind vectors, or temperature distributions. Unlike earlier neural networks that rely on discrete input-output pairs, neural operators can capture relationships between the fields that continuously vary over space and time. This makes the operators particularly well-suited for solving partial differential equations (PDEs) and modeling physics-based processes commonly encountered in scientific applications. For instance, neural operators have been applied to correcting biases in numerical weather prediction outputs, including temperatures and humidity⁶⁹, as well as forecasting spatial distributions of carbon monoxide⁷⁰. Although such applications to satellite observation data are still emerging, the potential of neural operators to address biases originating from geophysical processes makes them a compelling tool for air quality applications.

In this study, we present a physics-informed hybrid neural network that combines a transformer and a Fourier neural operator (FNO) branch to correct systematic biases in TEMPO NO₂ VCDs. Unlike prior ML-based bias-correction approaches²⁵ that directly adjust the retrieved VCD, our method predicts a correction to the AMF and then recomputes the vertical column. The transformer branch captures local interactions in 2D surface and geometric predictors, such as surface albedo and viewing geometry, while the FNO branch models global spatial patterns from vertically resolved 3D profile inputs, such as scattering weights and NO₂ shape factors. These representations are fused through a cross-attention mechanism and passed to a shared prediction head that estimates a physically meaningful AMF correction, which consequently improves the accuracy of NO₂ VCDs. The training objective includes dedicated loss terms that enforce domain-specific physical constraints, penalizing physically implausible corrections and promoting consistency with established atmospheric principles. We train the model using 74,919 collocated TEMPO-Pandora total NO₂ VCDs across North America from August 2023 to December 2024. Once fully trained, the model operates on TEMPO inputs alone, without reliance on auxiliary inputs, enabling continuous, high-frequency bias correction across the full TEMPO coverage. An extensive evaluation was conducted using 10-fold and leave-one-station-out (LOSO)^71,72 cross-validation strategies to assess model generalization across space and time. This framework provides a physically consistent approach to improving the accuracy of TEMPO total NO₂ VCDs within a near-real-time processing pipeline. We focus on the total NO₂ vertical column rather than the isolated tropospheric component. This choice is consistent with the Pandora NVS product used for training, which provides accurate total NO₂ columns.

Results

Baseline evaluation: standard TEMPO NO₂ Product vs. Pandora

To benchmark TEMPO’s NO₂ retrieval accuracy and examine associated biases, we compared the hourly variation of TEMPO NO₂ VCD and SCD against ground-based VCD measurements at 58 Pandora stations across the conterminous United States (CONUS) during the period from August 2023 to December 2024. After applying a quality control process to both TEMPO and Pandora (see Methods), we identified 74,919 matched retrieval-measurement pairs for analysis. Agreement varied substantially by site, with the coefficient of determination (R²) ranging from 0.09 to 0.60, index of agreement (IOA) from 0.52 to 0.85, mean absolute biases (MAB) from 7.7 × 10¹⁴ to 1.09 × 10¹⁶ molecules/cm², and root mean square errors (RMSE) from 1.04 × 10¹⁵ to 1.44 × 10¹⁶ molecules/cm². R² quantifies the variance explained by a linear fit; IOA standardizes the magnitude of prediction errors⁷³; MAB captures the average absolute difference; and RMSE reflects the total error, encompassing both systematic and random components. Regionally, baseline TEMPO VCD performance varies meaningfully across CONUS, with agreement strongest along the West Coast and weakest in the Midwest; errors are smallest in the Southwest, and biases are mostly positive, largest in the Southeast and Northeast, while the Mountain West shows a slight negative bias (Supplementary Table S1). Throughout this study, VCD refers to the total NO₂ vertical column. For brevity, we hereafter refer to TEMPO NO₂ VCD and SCD as TEMPO VCD and SCD, and Pandora NO₂ VCD as Pandora VCD.
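
The four agreement metrics defined above can be sketched as follows. This is a minimal illustration, not the authors' evaluation code: it assumes R² is the squared Pearson correlation (the variance explained by a linear fit) and that IOA follows Willmott's original index of agreement. The function name and sample values are hypothetical.

```python
import math


def agreement_metrics(pred, obs):
    """Return R2, IOA, MAB and RMSE for paired predictions and observations."""
    n = len(obs)
    mean_o = sum(obs) / n
    mean_p = sum(pred) / n

    # Squared Pearson correlation: variance explained by a linear fit.
    sxy = sum((p - mean_p) * (o - mean_o) for p, o in zip(pred, obs))
    sxx = sum((p - mean_p) ** 2 for p in pred)
    syy = sum((o - mean_o) ** 2 for o in obs)
    r2 = sxy ** 2 / (sxx * syy)

    # Index of agreement (assumed Willmott form): squared error scaled by
    # the "potential error" around the observed mean.
    sq_err = sum((p - o) ** 2 for p, o in zip(pred, obs))
    potential = sum(
        (abs(p - mean_o) + abs(o - mean_o)) ** 2 for p, o in zip(pred, obs)
    )
    ioa = 1.0 - sq_err / potential

    mab = sum(abs(p - o) for p, o in zip(pred, obs)) / n
    rmse = math.sqrt(sq_err / n)
    return {"r2": r2, "ioa": ioa, "mab": mab, "rmse": rmse}


# Illustrative columns in units of 1e15 molecules/cm2: the "retrieval" is
# the reference plus a constant offset of 1.
pandora = [2.0, 4.0, 6.0, 8.0]
tempo = [3.0, 5.0, 7.0, 9.0]
metrics = agreement_metrics(tempo, pandora)
```

In this toy case a constant offset leaves R² at 1 while IOA, MAB, and RMSE all register the error, which is one reason the four metrics are reported together.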

Hourly R² between TEMPO SCD and Pandora VCD, and between TEMPO VCD and Pandora VCD (Fig. 1a), exhibit a pronounced diurnal cycle when evaluated in local time, with lower correlations during the early morning hours, increasing steadily toward a broad maximum during late morning to early afternoon (approximately 09–13 local time), followed by a gradual decline toward the late afternoon. For each local hour, performance metrics were computed by converting individual TEMPO pixel observations from UTC to local solar time using longitude and aggregating all collocated TEMPO–Pandora pairs within each local-time bin. On average, TEMPO VCD shows lower correlation with Pandora VCD (mean R² ≈ 0.55) than TEMPO SCD does (mean R² ≈ 0.67), indicating that additional uncertainty is introduced during the AMF conversion process. TEMPO VCD exhibits a modest negative bias relative to Pandora VCD during the early morning hours. The bias approaches zero by late morning (approximately 10–12 local time) and becomes weakly positive during the mid-to-late afternoon, reaching a maximum of approximately (2–4) × 10¹⁴ molecules/cm² around 16–17 local time, before decreasing again toward the early evening. The grey ±1σ band (on the order of 10¹⁵ molecules/cm²) shows that hour-to-hour scatter far exceeds these mean offsets. Such diurnal variations in bias are characteristic of geostationary instruments such as TEMPO. The fixed viewing geometry combined with changing solar angles throughout the day systematically alters atmospheric path lengths and surface reflectance conditions, impacting trace gas retrievals^63,68. Specifically, the changing angles modify the atmospheric path length of sunlight, significantly impacting the calculated AMF and its sensitivity to assumptions about the NO₂ vertical profile, aerosols, and clouds^35,43. The tighter clustering of SCD correlations suggests that radiative-transfer assumptions, ancillary inputs, and profile-shape errors drive much of the diurnal variability in VCD accuracy. A portion of the residual spread also reflects inherent pixel–point representativeness differences between Pandora and TEMPO (see Methods), which contribute to baseline discrepancies independent of AMF-related retrieval bias.
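
The local-time binning described above can be sketched in a few lines. The sketch assumes mean local solar time, that is, UTC shifted by longitude/15 hours with no equation-of-time correction. The record layout, function names, and sample values are hypothetical.

```python
from collections import defaultdict


def local_solar_hour(utc_hour, longitude_deg):
    """Mean local solar time in hours; east longitudes are positive."""
    return (utc_hour + longitude_deg / 15.0) % 24.0


def bin_by_local_hour(records):
    """Group (tempo, pandora) pairs by integer local solar hour."""
    bins = defaultdict(list)
    for utc_hour, longitude_deg, tempo, pandora in records:
        hour = int(local_solar_hour(utc_hour, longitude_deg))
        bins[hour].append((tempo, pandora))
    return dict(bins)


# (UTC hour, longitude, TEMPO VCD, Pandora VCD); columns in 1e15 molecules/cm2.
records = [
    (18.0, -90.0, 7.1, 6.8),    # 12:00 local solar time
    (18.5, -75.0, 9.4, 10.2),   # 13:30 local solar time
    (20.0, -120.0, 5.0, 5.3),   # 12:00 local solar time
]
hourly = bin_by_local_hour(records)
```

Metrics such as R² or mean bias would then be computed separately on the pairs in each hourly bin.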

**Fig. 1: Hourly agreement and bias characteristics of TEMPO nitrogen dioxide (NO₂) columns against Pandora ground references across the conterminous United States.**

Further analysis reveals that the retrieval bias varies systematically across the range of NO₂ concentrations (Fig. S1). At low concentrations (<6 × 10¹⁵ molecules/cm²), TEMPO exhibits a positive mean bias of ~10–15%, indicating a slight overestimation under clean conditions. In the intermediate range of 8–12 × 10¹⁵ molecules/cm², the mean bias approaches zero, suggesting good agreement with Pandora. At higher concentrations (>15 × 10¹⁵ molecules/cm²), TEMPO increasingly underestimates, with biases reaching −20% to −30% at column amounts above ~30 × 10¹⁵ molecules/cm². This pattern indicates that TEMPO tends to overestimate NO₂ loadings in clean regimes but underestimate in polluted conditions. Our evaluation is consistent with recent work⁷⁴, which reported that TEMPO exhibits structured, environment-dependent biases and substantial site-to-site variability. That study also found systematic overestimation in clean conditions and underestimation at higher NO₂ loadings, patterns that closely match the concentration-dependent behavior we observe. Our results extend these findings by showing how these biases differ across urban, suburban, and rural environments and by quantifying the extent to which our correction framework reduces these structured errors.
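
A concentration-dependent bias of this kind is typically diagnosed by binning the matched pairs on the reference column and averaging the relative difference in each bin. The sketch below assumes the bias is expressed as (TEMPO − Pandora) / Pandora in percent. The bin edges and sample values are made up and do not reproduce Fig. S1.

```python
def mean_relative_bias_by_bin(tempo, pandora, edges):
    """Mean (TEMPO - Pandora) / Pandora in percent for each Pandora-column bin.

    Bins are half-open intervals [edges[i], edges[i + 1]); an empty bin
    yields None.
    """
    n_bins = len(edges) - 1
    sums = [0.0] * n_bins
    counts = [0] * n_bins
    for t, p in zip(tempo, pandora):
        for i in range(n_bins):
            if edges[i] <= p < edges[i + 1]:
                sums[i] += 100.0 * (t - p) / p
                counts[i] += 1
                break
    return [s / c if c else None for s, c in zip(sums, counts)]


# Hypothetical matched columns in units of 1e15 molecules/cm2.
pandora = [4.0, 5.0, 10.0, 20.0, 40.0]
tempo = [4.4, 5.5, 10.0, 18.0, 30.0]
edges = [0.0, 6.0, 15.0, float("inf")]  # clean, intermediate, polluted

bias = mean_relative_bias_by_bin(tempo, pandora, edges)
```

For these invented values the clean bin is biased high, the intermediate bin is unbiased, and the polluted bin is biased low, mirroring the qualitative pattern described in the text.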

Evaluation of the bias correction performances

We evaluated the ML model’s ability to reduce discrepancies between TEMPO and Pandora VCDs using three cross-validation approaches, each designed to evaluate a different type of data dependence. The first approach used 10-fold cross-validation (CV), where retrieval-measurement pairs are randomly split to provide a baseline estimate of model performance. The second approach was station-based group k-fold CV. In this scheme, the 58 monitoring sites were divided into six non-overlapping groups, each comprising roughly ten stations, with one group reserved for validation in each fold. This ensures that training and validation sets do not share data from the same site, providing a more rigorous test of performance at previously unseen locations. The third approach was LOSO, where all data from a single site are withheld in each iteration. This provides the strictest test of spatial generalization, as the model must predict at completely unseen locations. Both the station-based grouping and LOSO schemes are significant for air-quality datasets, where spatial structure and localized emission patterns can otherwise bias performance estimates. To evaluate model improvements relative to the uncorrected TEMPO product, we used four metrics: R², IOA, MAB, and RMSE.
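
The difference between the two station-aware schemes can be made concrete with a small index-splitting sketch. It uses only the standard library. The function names, the round-robin assignment of stations to groups, and the toy station labels are assumptions for illustration; the source does not state how stations were assigned to the six groups.

```python
import random


def loso_splits(station_ids):
    """Leave-one-station-out: each station in turn supplies the test set."""
    for held_out in sorted(set(station_ids)):
        train = [i for i, s in enumerate(station_ids) if s != held_out]
        test = [i for i, s in enumerate(station_ids) if s == held_out]
        yield held_out, train, test


def station_group_kfold(station_ids, n_groups, seed=0):
    """Group k-fold: whole stations are assigned to folds, never split."""
    stations = sorted(set(station_ids))
    random.Random(seed).shuffle(stations)
    # Round-robin assignment of shuffled stations to folds (an assumption).
    fold_of = {s: k % n_groups for k, s in enumerate(stations)}
    for fold in range(n_groups):
        train = [i for i, s in enumerate(station_ids) if fold_of[s] != fold]
        test = [i for i, s in enumerate(station_ids) if fold_of[s] == fold]
        yield train, test


# Toy dataset: one station label per retrieval-measurement pair.
station_ids = ["A", "A", "B", "C", "C", "C", "D"]
loso = list(loso_splits(station_ids))
grouped = list(station_group_kfold(station_ids, n_groups=2))
```

In both schemes no station contributes samples to the training and test sets of the same fold, which is the property that distinguishes them from the random 10-fold split.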

Figure 2 compares the bias-corrected model estimates (“Model”) and the original TEMPO VCDs (“TEMPO”) against Pandora observations under 10-fold CV. The corrected VCDs showed improved agreement with ground-based measurements. R² increases from 0.58 to 0.80 and IOA from 0.87 to 0.96, while the regression slope improves from 0.74 to 0.82, reflecting reduced systematic bias. Error magnitudes also decline. MAB decreases from 2.53 to 1.82 × 10¹⁵ molecules/cm² (−30%), and RMSE from 4.42 to 3.09 × 10¹⁵ molecules/cm² (−30%). Normalized metrics also show improvements, with NMAB dropping from 33.3% to 23.2% and NRMSE from 59.8% to 37.6%. To illustrate ML model performance at the native TEMPO Level-2 pixel resolution, we present representative hourly AMF and total NO₂ VCD comparisons for sampled overpasses on 26 January and 10 February 2025 (Figs. S4 and S5). This approach is used because the Level-2 swath geometry and pixel locations vary from hour to hour, so direct temporal averaging at native resolution is not straightforward.

**Fig. 2: 10-fold cross-validation comparison of Pandora total nitrogen dioxide vertical column density (NO₂ VCD) with (a) ML model predictions and (b) original TEMPO retrievals.**

Seasonal stratification of 10-fold CV results, spanning winter (n = 8,183), spring (n = 14,268), summer (n = 20,623), and fall (n = 31,850), confirms that the ML model effectively corrects biases across all seasons (Fig. 3). In winter, the model improves R² from 0.58 to 0.84 and IOA from 0.87 to 0.95, while reducing MAB and RMSE by 37% and 39%, respectively (NMAB and NRMSE decline by a similar margin). Spring shows comparable improvements, with R² increasing to 0.82 and IOA to 0.94, resulting in 27–34% reductions in MAB and RMSE. Summer exhibits the lowest baseline skill in the uncorrected TEMPO VCD, which may be influenced by the combined effects of more vigorous daytime vertical mixing and enhanced contributions from lightning-induced NO_x in the upper troposphere, both of which can alter the sensitivity of AMF to errors in the assumed vertical profile^75,76. Despite these factors, our model improves the summer R² from 0.53 to 0.70, IOA from 0.85 to 0.89, and reduces MAB and RMSE by 18–22%. The fall performance mirrors the winter performance, with an R² of 0.81, IOA of 0.94, and MAB and RMSE reduced by 29% and 36%, respectively. The reductions in normalized errors across all seasons demonstrate the model’s robustness to seasonal variability in meteorology, aerosol properties, surface conditions, and sampling biases.

**Fig. 3: Seasonal bias-correction performance of the machine-learning (ML) model and original TEMPO retrievals.**

Spatial generalization

Supplementary Fig. S2 shows the station-based group k-fold CV results. After bias correction, R² increases from 0.58 to 0.74, IOA from 0.87 to 0.92, and slope from 0.74 to 0.80. MAB falls from 2.53 × 10¹⁵ to 2.07 × 10¹⁵ molecules/cm² and RMSE from 4.42 × 10¹⁵ to 3.36 × 10¹⁵ molecules/cm², while NMAB and NRMSE improve from 30.8% to 25.2% and 53.7% to 40.9%, respectively. For a more stringent evaluation of spatial generalization, a LOSO-CV was performed across 58 Pandora sites. In each fold, observations from one station were withheld for testing while the model was trained on the remaining 57 stations. By preventing any station’s data from appearing in both training and evaluation, this scheme yields an unbiased estimate of performance at entirely unseen monitoring locations. Figure 4 and Supplementary Table S2 summarize the LOSO results. In the TEMPO correlation map, VCDs at many stations, particularly in the western U.S., exhibit poor correlation (R² ≈ 0.1–0.4) with the Pandora measurements. In contrast, after bias correction, the model VCDs show a substantial increase in correlation with station measurements, achieving R² of 0.5–0.9. Across all stations, mean R² rose from 0.36 to 0.50, IOA increased from 0.71 to 0.76, RMSE fell by 1.16 × 10¹⁵ molecules/cm² (from 3.79 to 2.63 × 10¹⁵), and MAB was reduced by 0.68 × 10¹⁵ molecules/cm² (from 2.59 to 1.90 × 10¹⁵). These improvements indicate the higher fidelity of the corrected VCD, reflecting a significant reduction in both systematic bias and residual error.

**Fig. 4: Spatial pattern of leave-one-site-out cross-validated R² (coefficient of determination) for Pandora nitrogen dioxide vertical column density (NO₂ VCD) comparisons.**

The regional insets in Fig. 4 highlight these improvements at a finer scale. In the western inset (blue box), our correction substantially improves the correlation with Pandora VCD at many sparsely monitored sites, where R² rose from below 0.3 to the 0.4–0.7 range. Similarly, in the northeastern inset (red box), R² at several urban and suburban stations increased from 0.4–0.6 to 0.7–0.9 after correction. The most pronounced enhancements occurred at five stations where our bias correction was particularly effective (average ΔR² ≈ +0.38). For instance, at Pandora55s1, R² and IOA increased from 0.32 to 0.77 (Δ +0.44) and from 0.73 to 0.90 (Δ +0.17), respectively, RMSE fell by 4.23 × 10¹⁵ molecules/cm² (8.68 × 10¹⁵ → 4.45 × 10¹⁵ molecules/cm²), and MAB dropped by 2.46 × 10¹⁵ molecules/cm² (5.54 × 10¹⁵ → 3.08 × 10¹⁵ molecules/cm²). Similar improvements in R² were seen at Pandora142s1 (0.38 → 0.76; Δ +0.38), Pandora170s1 (0.15 → 0.52; Δ +0.37), Pandora247s1 (0.30 → 0.64; Δ +0.34), and Pandora157s1 (0.47 → 0.81; Δ +0.34). In these regions, spanning both urban and rural settings, the bias-corrected VCDs not only captured the broad diurnal and seasonal variability but also mitigated site-specific biases arising from local emission patterns and variable viewing geometries.

A minority of stations (approximately 5%) exhibited modest declines post-correction. The largest drop occurred at Pandora68s1, where R² fell by 0.12 (0.29 → 0.17), IOA dropped by 0.29, RMSE increased by 1.61 × 10¹⁵ molecules/cm², and MAB rose by 1.64 × 10¹⁵ molecules/cm². This underperformance may be attributed to sparse training data in the station’s vicinity, unmodeled local pollution sources, or site-specific measurement noise. Additionally, the model may have struggled to generalize to atypical conditions or concentration regimes that were not well represented during training^25,49. Similarly, as Tang et al.⁷⁷ highlighted, machine learning models can underperform when applied to areas with varying environmental conditions, emphasizing the challenge of model transferability across diverse locations. Despite such isolated cases, the consistently strong results across nearly 75,000 independent retrieval–measurement pairs, particularly the gains in R² and IOA, demonstrate that physics-informed, machine-learning-based bias correction markedly enhances NO₂ VCD retrievals from geostationary observations.

To examine how local surface conditions influence bias-correction performance, we grouped stations into urban, suburban, and rural environments using the 2024 National Land Cover Database (NLCD). Each Pandora site was assigned to a land-use class based on the dominant NLCD category. The performance within each category was evaluated using the LOSO-CV, ensuring that all reported improvements represent true local-scale generalization to unseen stations rather than in-sample fitting (Fig. S3).

Urban sites (n = 26) are characterized by strong horizontal gradients in surface NO₂ at scales smaller than a TEMPO pixel: sharp contrasts between roads, industrial sources, and background areas lead to large pixel–point representativeness errors. These sites showed pronounced improvement after correction (ΔR² = +0.16; ΔRMSE = 1.07 × 10¹⁵ molecules/cm²). Suburban sites (n = 13) exhibited moderate gains (ΔR² = +0.12; ΔRMSE = 0.85 × 10¹⁵ molecules/cm²), reflecting the combined influence of both localized emission-driven gradients and broader scene-dependent retrieval factors. Rural stations (n = 19) exhibit weak horizontal emission-driven gradients within a TEMPO pixel but strong sensitivity to meteorological controls on the vertical distribution of NO₂, particularly boundary layer depth, vertical mixing, and horizontal transport, which shape the assumed NO₂ profile used in the AMF. These stations also experienced notable improvement (ΔR² = +0.17; ΔRMSE = 1.16 × 10¹⁵ molecules/cm²).

The magnitude of improvement is broadly similar across land-use classes, indicating that the ML model reduces multiple sources of AMF-related error rather than preferentially correcting a single dominant mechanism. While improvements in urban regions are consistent with partial correction of unresolved fine-scale horizontal emission gradients and surface reflectance variability, comparable gains at rural sites highlight the role of meteorology and profile-driven AMF errors that affect all environments. As a result, these stratified results demonstrate that the model improves local-scale retrieval performance across diverse surface conditions, but do not uniquely isolate the relative contributions of spatial-gradient versus meteorological controls on AMF error. Future work could disentangle these contributions more explicitly by combining land-use stratification with controlled sensitivity experiments, such as perturbations to NO₂ vertical profiles, meteorological inputs, or spatial-resolution matching, to better attribute AMF error sources.

Ablation study and component analysis

To validate the design of our hybrid ML physics-informed Transformer–FNO model (see Methods), we performed an ablation study to systematically assess the contribution of each core component (Table 1). We first optimized the physics-constraint weight (λ) across five levels: 0, 0.001, 0.0025, 0.005, and 0.0075. Subsequently, we evaluated two structural variants: MLP-Transformer, in which the FNO branch was replaced by a simple MLP, and FNO-MLP, in which the Transformer branch was substituted with an MLP.

Table 1 Ablation study results for TEMPO NO₂ bias correction


We assessed the impact of the physics-informed AMF penalty. The λ = 0 case, a purely data-driven model, establishes a baseline with R² of 0.75 and a slope of 0.68, confirming that a systematic bias remains without the physics constraint. Among the five physics weights tested, λ = 0.005 delivers the closest agreement with reference VCDs, achieving R² of 0.80 and a slope of 0.81. This is a clear improvement over the baseline model, as well as over the intermediate models at λ = 0.001 (R² = 0.77, slope = 0.76) and λ = 0.0025 (R² = 0.80, slope = 0.77). At the optimal weight of λ = 0.005, the model also attains the best error characteristics: IOA of 0.94, MAE of 1.82 × 10¹⁵ molecules/cm² with a corresponding normalized error of 22.1%, and RMSE of 2.91 × 10¹⁵ molecules/cm² (35.4%). In contrast, increasing the weight further to λ = 0.0075 resulted in a slight performance decline, with IOA of 0.92, MAE of 2.01 × 10¹⁵ molecules/cm² (24.5%), and RMSE of 3.21 × 10¹⁵ molecules/cm² (39.0%).

Using this optimal weight, we then compared the complete Transformer-FNO model against the two structural variants by evaluating their respective outputs against the Pandora VCDs. The VCDs from the MLP-Transformer show a slope of 0.75, MAE of 1.96 × 10¹⁵ molecules/cm² (23.8%), and RMSE of 2.99 × 10¹⁵ molecules/cm² (36.4%), and those from the FNO-MLP show a slope of 0.79, MAE of 1.94 × 10¹⁵ molecules/cm² (23.6%), and RMSE of 3.12 × 10¹⁵ molecules/cm² (37.9%). The degradation in performance when either component is removed confirms the necessity of the hybrid design. Collectively, these results indicate that our model’s hybrid architecture can effectively correct the retrieval bias by integrating two specialized components, each designed to address the distinct nature of the input data. The Transformer branch interprets the complex relationships within surface and geometric predictors, while the FNO branch concurrently captures the column-wide dependencies within the vertically resolved atmospheric profiles.
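The evaluation statistics reported in this section can be computed as in the following minimal sketch. It assumes that IOA is Willmott's index of agreement, that R² is the squared Pearson correlation, and that the percentage errors are normalized by the mean reference VCD; these definitions, and the function name `evaluation_metrics`, are illustrative assumptions rather than a description of the evaluation code used in this study.

```python
import numpy as np

def evaluation_metrics(pred, obs):
    """Agreement statistics between predicted and reference VCDs."""
    pred = np.asarray(pred, dtype=float)
    obs = np.asarray(obs, dtype=float)
    err = pred - obs
    obs_mean = obs.mean()

    mae = np.mean(np.abs(err))
    rmse = np.sqrt(np.mean(err ** 2))
    # Willmott index of agreement (assumed definition of IOA).
    ioa = 1.0 - np.sum(err ** 2) / np.sum(
        (np.abs(pred - obs_mean) + np.abs(obs - obs_mean)) ** 2
    )
    # Least-squares regression of predictions on the reference values.
    slope, intercept = np.polyfit(obs, pred, 1)
    r2 = np.corrcoef(obs, pred)[0, 1] ** 2
    return {
        "r2": r2,
        "slope": slope,
        "ioa": ioa,
        "mae": mae,
        "rmse": rmse,
        # Assumed normalization: percent of the mean reference value.
        "nmae_pct": 100.0 * mae / obs_mean,
        "nrmse_pct": 100.0 * rmse / obs_mean,
    }
```

Under this assumed normalization, an MAE of 1.82 × 10¹⁵ molecules/cm² at 22.1% corresponds to a mean reference column of roughly 8.2 × 10¹⁵ molecules/cm².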

Discussions

Our results demonstrate the effectiveness of incorporating a physics-based constraint within a deep learning architecture to correct biases in AMF, thereby improving the final accuracy of TEMPO NO₂ column retrievals. Our hybrid Transformer-FNO model achieves this by framing the AMF correction as an intermediate, physically constrained prediction. This approach directly addresses the primary source of systematic, concentration-dependent bias by penalizing deviations from radiative-transfer theory via a Huber loss. The success of this physics constraint, however, is dependent on the model’s ability to process the varied inputs that govern AMF. The ablation study confirms this: the Transformer branch is essential for interpreting the 2D surface and geometric predictors, while the FNO is critical for capturing column-wide dependencies within the vertically resolved 3D atmospheric profiles. The underperformance of the ablated variants demonstrates that these two components perform complementary roles. This finding validates our hybrid approach, where the physics-based constraint provides the necessary physical grounding, while the combination of the Transformer and FNO branches effectively interprets the geometric, surface, and atmospheric profile variables that collectively govern the AMF when retrieving NO₂ columns.

The robustness of our physics-constrained model is confirmed through multi-stage validation, showing a substantial reduction in both systematic bias and residual error. Ten-fold CV shows a 0.26 increase in R² and a 37% reduction in RMSE, and the more stringent LOSO CV confirms these improvements at entirely unseen monitoring stations. The model’s consistent performance across all seasons, including the summer months, further validates its reliability under a wide range of atmospheric conditions.

Beyond these metrics, our approach offers two key operational advantages for practical, large-scale deployment of this bias correction framework. First, the entire correction pipeline operates using only TEMPO data as input, eliminating the need for concurrent ground-based measurements or other auxiliary datasets. This self-sufficiency makes the method readily deployable in real-time processing streams, capable of providing continuous, high-frequency, bias-corrected NO₂ VCD across the entire geostationary swath. Second, by embedding the AMF constraint directly in the loss function, the model is explicitly guided to enforce consistency with fundamental radiative-transfer physics. This prevents the model from making physically implausible adjustments and enhances the interpretability of the correction. Together, these features deliver the robust accuracy, practical deployability, and scientific integrity required for geostationary NO₂ monitoring.

While the model effectively reduces systematic biases, it operates exclusively on TEMPO Level-2 scene-dependent inputs. It therefore inherits any uncertainty present in those variables (e.g., surface albedo, cloud parameters, aerosol treatment, or GEOS-CF–derived NO₂ profiles). Because the model can only learn from the information available in these predictors, upstream retrieval biases or missing variability propagate into the corrected columns, contributing to the slight compression of the highest and lowest VCD values and explaining why the regression slope, though improved, remains below 1.0. In this context, the observed performance gains reflect the combined reduction of multiple AMF-related error sources, rather than isolation of a single dominant correction mechanism. Residual disagreements at certain stations should also be interpreted in the context of pixel–point representativeness differences between Pandora and TEMPO (see Methods), which cannot be removed by AMF bias correction alone. Although the present study focuses on total column correction, this framework can be extended to tropospheric retrievals by incorporating predictors that better characterize vertical structure or stratospheric contributions (e.g., layer-resolved scattering weights or alternative profile sources), enabling the model to correct total and tropospheric VCDs either jointly or in sequence.

Methods

Study area and data collocation

The study covers the North America domain (10–60° N, 140–60° W), encompassing the contiguous United States, southern Canada, and northern Mexico, as shown in Fig. 5. This region lies within TEMPO’s geostationary field of regard (center at 92.85° W), enabling hourly daytime retrievals of NO₂ columns across major urban corridors, including Los Angeles, the Northeast megalopolis, and Mexico City, as well as rural background environments. TEMPO NO₂ columns were collocated with ground-based Pandora spectrometer measurements from the PGN for validation and bias correction.

**Fig. 5: Study domain showing TEMPO satellite coverage and Pandora station locations across North America.**

The Tropospheric Emissions: Monitoring of Pollution (TEMPO) data

TEMPO, launched on April 7, 2023, as the first NASA Earth Venture Instrument mission, provides hourly daytime ultraviolet–visible imaging spectroscopy of key air pollutants over North America from a geostationary orbit at 92.85° W. Covering spectral ranges of ~293–494 nm and ~538–741 nm, TEMPO retrieves columns of NO₂, ozone (O₃), formaldehyde (HCHO), and sulfur dioxide (SO₂), alongside glyoxal, water vapor, bromine monoxide, iodine monoxide, and aerosols, and measures cloud properties, foliage reflectance, and ultraviolet (UV) B flux, with data processing by the Smithsonian Astrophysical Observatory Science Data Processing Center^28,29. In this study, we use the TEMPO Level-2 NO₂ product (Version V03, provisional). Its nominal spatial resolution at the field-of-regard center (36.5° N, 100° W) is ~2.1 × 4.75 km² for NO₂ (8 × 4.75 km² for ozone profiling), enabling fine-scale monitoring of urban corridors and regional backgrounds. The NO₂ retrieval employs a 405–465 nm fitting window to derive SCDs, which are then converted to VCDs using radiative-transfer-derived air mass factors^28,29. The AMF calculation incorporates a priori NO₂ vertical profiles from a chemical transport model, surface reflectance based on climatological bidirectional reflectance distribution function (BRDF) products, and cloud parameters, including effective cloud fraction and cloud pressure. Aerosol effects are treated implicitly within the radiative-transfer framework. Only retrievals passing the recommended quality-assurance and cloud-screening flags are retained^11,28. TEMPO achieved first light on August 2, 2023, and began nominal operations in October 2023, with products publicly accessible through NASA’s Atmospheric Science Data Center. As part of a geostationary air-quality constellation, including South Korea’s GEMS and Europe’s Sentinel-4, TEMPO’s hourly coverage captures diurnal variability and rapid emission events that polar-orbiting instruments cannot resolve.

Pandora data

The Pandora spectrometer, deployed within the PGN, is a ground-based passive remote-sensing instrument that acquires hyperspectral solar-irradiance measurements in the UV-visible range to retrieve total and tropospheric column densities of NO₂, O₃, and HCHO via direct-sun and multi-axis DOAS modes^25,78,79. Although Pandora columns are inherently spatially and vertically smoothed relative to in situ probes, they serve as high-quality, independent references for satellite validation, with each observation accompanied by quality-assurance flags^80,81,82. For this study, we utilized the Level-2 NVS (L2_rnvs3p1-8) data product from 58 PGN stations, which provides total (TotCol), tropospheric (TropCol), and near-surface (SurfConc) retrievals for NO₂, O₃, and HCHO processed under standardized algorithms and quality-control procedures.

Because our model corrects total NO₂ columns, we use the direct-sun NVS TotCol measurements. Under direct-sun geometry, the AMF is close to unity, and the associated uncertainty is relatively low, typically 2–5%, as described in the PGN Data Product Readme. In contrast, the TropCol and SurfConc products require additional AMF calculations that depend on vertical profile shape, cloud fraction, surface reflectance, and viewing geometry. Because these additional AMF dependencies introduce more variability into the retrieval, the uncertainties for TropCol and SurfConc are correspondingly higher than for the direct-sun TotCol product⁷⁸. All Pandora data were obtained through the PGN portal (https://www.pandonia-global-network.org) and are fully documented in the official product readme (https://www.pandonia-global-network.org/wp-content/uploads/2023/11/PGN_DataProducts_Readme_v1-8-8.pdf).

National Land Cover Database (NLCD)

Land-use classification for the local-scale urban, suburban, and rural analysis was based on the 2024 NLCD, produced by the U.S. Geological Survey (USGS), which provides 30 m resolution land-cover information across the United States. For our analysis, stations were classified as urban (NLCD codes 23–24), suburban (21–22), or rural (all other codes) using the NLCD land-cover classes. The NLCD product was downloaded from EarthExplorer (https://earthexplorer.usgs.gov/).
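The land-use grouping above amounts to a simple lookup on the NLCD class code. The sketch below is illustrative only: the function name `classify_station` is hypothetical, and how the code is sampled at each station (for example, the single 30 m cell containing the site) is an assumption not specified here.

```python
def classify_station(nlcd_code: int) -> str:
    """Map an NLCD land-cover code to the land-use class used in the analysis."""
    if nlcd_code in (23, 24):   # developed, medium and high intensity
        return "urban"
    if nlcd_code in (21, 22):   # developed, open space and low intensity
        return "suburban"
    return "rural"              # all other NLCD codes


# Example with hypothetical codes sampled at three station locations.
station_codes = {"site_a": 24, "site_b": 21, "site_c": 41}
station_classes = {name: classify_station(code) for name, code in station_codes.items()}
```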

Data preparation and preprocessing

Our data-processing workflow integrates satellite-based NO₂ from TEMPO with ground-based NO₂ measurements collected from PGN. First, we verify the geographic locations of each Pandora station to ensure coverage within TEMPO’s field of view. Then we compile station metadata and calibration records to ensure consistency in our comparisons. We then identify, retrieve, and process the relevant TEMPO Level 2 granules corresponding spatially and temporally to each Pandora observation via NASA’s EarthData Search portal and the Atmospheric Science Data Center (ASDC) at NASA Langley Research Center (LaRC) (earthaccess v0.5.1). Our spatial matching procedure utilizes polygon-intersection methods and nearest-neighbor resampling⁷⁹ to align Pandora coordinates with TEMPO pixel boundaries. We identify the single TEMPO Level-2 pixel whose center lies closest to each Pandora station using a nearest-neighbor search, and we retain that pixel only if its center lies within 2 km of the site. This distance criterion is used solely as a screening step to ensure spatial consistency, and no spatial averaging across multiple TEMPO pixels is performed. We apply primary data-quality flags to TEMPO retrievals, retaining only pixels with cloud fraction <20% and positive tropospheric and stratospheric NO₂ column densities. Then, we compute the total NO₂ column as the sum of those two components. Because Pandora provides point-based measurements while TEMPO retrieves a ~2.1 × 4.75 km² pixel average, the two instruments differ in representativeness. Local enhancements observed by Pandora (e.g., roadway plumes or near-source gradients) may be spatially smoothed in the larger TEMPO footprint, and early/late-day slant-path geometries can intersect multiple pixels. These differences introduce representativeness error that is unrelated to AMF bias. 
Throughout this study, we therefore treat Pandora as a high-quality reference rather than a perfect “truth,” and interpret remaining discrepancies as a combination of retrieval uncertainty and pixel–point mismatch. We achieve temporal alignment using two complementary methods. First, we pair each TEMPO measurement with the nearest Pandora observation within a 15-minute window. Second, we apply a Gaussian-weighted smoothing of Pandora observations over a temporal window (σ = 5 minutes) to produce a continuous series⁸³. We then organize our spatiotemporally harmonized data into structured intermediate files, preserving measurement uncertainties and metadata for downstream analysis. We use four groups of predictors: angular variables (solar zenith angle, solar azimuth angle, viewing zenith angle, viewing azimuth angle, and relative azimuth angle); three-dimensional profiles (scattering weights, gas profile, temperature profile); two-dimensional fields (snow-ice fraction, terrain height, surface pressure, albedo, effective cloud fraction); and NO₂ observations (total NO₂ column, slant column). Next, we screen both the Pandora direct-sun total NO₂ column (TotCol) and the TEMPO total NO₂ columns for extreme values, excluding observations above the 99th percentile, and standardize all continuous meteorological and column-density variables (e.g., pressure, albedo, cloud fraction, column densities) via mean-centering and division by their standard deviation⁸⁴, applied independently to each layer of the vertical profiles. We apply min-max normalization to latitude and longitude over our domain bounds and transform angular (solar/viewing zenith and azimuth, relative azimuth) and temporal (hour of day, day of year) predictors into paired sine-cosine channels to preserve cyclic structure^83,84. We retain unscaled copies of all core inputs and apply a denormalization to the model outputs, restoring them to physical NO₂ concentrations for final predictions and loss calculations.
All preprocessing and model code were written in Python 3.10 using Xarray, Pandas, NumPy, SciPy, and PyTorch.
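Three of the preprocessing steps described above (sine-cosine encoding, standardization, and Gaussian-weighted temporal smoothing) can be illustrated with the NumPy sketch below. The function names are hypothetical, and restricting the Gaussian weights to Pandora observations within ±15 minutes of the TEMPO time is an assumption made for the example, not a documented setting.

```python
import numpy as np

def cyclic_encode(values, period):
    """Encode a cyclic variable (e.g., hour of day with period 24) as sine and cosine."""
    angle = 2.0 * np.pi * np.asarray(values, dtype=float) / period
    return np.sin(angle), np.cos(angle)

def zscore(x, axis=0, eps=1e-12):
    """Standardize along `axis`; for profiles of shape (samples, layers),
    axis=0 standardizes every vertical layer independently."""
    x = np.asarray(x, dtype=float)
    mean = x.mean(axis=axis, keepdims=True)
    std = x.std(axis=axis, keepdims=True)
    return (x - mean) / (std + eps), mean, std

def gaussian_weighted_reference(t_ref_min, v_ref, t_sat_min,
                                sigma_min=5.0, window_min=15.0):
    """Gaussian-weighted mean of reference values around a satellite time (minutes)."""
    dt = np.asarray(t_ref_min, dtype=float) - t_sat_min
    v_ref = np.asarray(v_ref, dtype=float)
    mask = np.abs(dt) <= window_min          # assumed cutoff for the window
    if not mask.any():
        return float("nan")                  # no reference observation close enough
    weights = np.exp(-0.5 * (dt[mask] / sigma_min) ** 2)
    return float(np.sum(weights * v_ref[mask]) / np.sum(weights))
```

The mean and standard deviation returned by `zscore` are the quantities needed to denormalize model outputs back to physical units.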

Hybrid deep learning architecture for NO₂ bias correction

Figure 6 illustrates our model, which processes heterogeneous inputs through two parallel branches. In the first branch, Transformer encoder layers⁸⁵ extract global contextual features from surface and satellite predictors. In the second branch, a FNO^86,87 captures large-scale atmospheric structures from vertically resolved profiles. We then fuse the two sets of features using a cross-attention module, and a final fully connected network estimates an intermediate AMF. This AMF is subsequently used to compute the VCD via the physical inversion relationship:

$${\rm{VCD}}=\frac{{\rm{SCD}}}{{\rm{AMF}}}$$

(1)

**Fig. 6: Architecture schematic of the hybrid NO₂ bias-correction machine-learning (ML) model.**

The Transformer branch processes input data ${{\rm{x}}}_{2{\rm{D}}}$ structured as a tensor with shape (batch, ${{\rm{n}}}_{2{\rm{D}}}$), where ${{\rm{n}}}_{2{\rm{D}}}$ is the number of 2D features. In contrast, the Profile Branch handles atmospheric profiles ${{\rm{x}}}_{3{\rm{D}}}$ formatted as a tensor with shape $({\rm{batch}},{{\rm{n}}}_{3{\rm{D}}},{\rm{num\; layers}})$, where ${{\rm{n}}}_{3{\rm{D}}}$ corresponds to the different profile variables, and num layers (e.g., 72) represents the number of pressure layers. The model is trained by minimizing a composite loss function primarily targeting accurate total VCD, comparing the total VCD derived from the predicted AMF (i.e., ${\rm{SCD}}/{{\rm{AMF}}}_{{\rm{pred}}}$) against the reference Pandora VCD (${{\rm{VCD}}}_{{\rm{true}}}$). A weighted physics loss term is also included to enforce greater physical consistency in the estimated AMF.

Transformer-based feature extraction

The Transformer branch extracts features from our 2D predictors: measurement angles, terrain height, and other non-profile variables. Given an input feature vector ${x}_{2d}\in {R}^{{d}_{2d}}$ (where ${d}_{2d}$ is the number of input features), the inputs undergo a linear transformation to a higher-dimensional latent representation of dimension $h$ (the embedding dimension):

$${z}_{{emb}}={W}_{{emb}}{x}_{2d}+{b}_{{emb}}$$

(2)

where ${W}_{{emb}}\in {R}^{h\times {d}_{2d}}$ and ${b}_{{emb}}\in {R}^{h}$ are learnable parameters. Following embedding, the latent representation ${z}_{{emb}}$ is processed by a stack of transformer encoder layers^85,88. Each layer utilizes a multi-head self-attention mechanism to model dependencies between features within the embedded vector. For each attention head, the representation $z$ (${z}_{{emb}}$) is linearly transformed into query (Q), key (K), and value (V) matrices:

$${\rm{Q}}={{\rm{W}}}_{{\rm{Q}}}{\rm{z}},{\rm{K}}={{\rm{W}}}_{{\rm{K}}}{\rm{z}},{\rm{V}}={{\rm{W}}}_{{\rm{V}}}{\rm{z}}$$

(3)

where ${W}_{Q},{W}_{K},{W}_{V}\in {R}^{h\times {h}_{k}}$ are projection matrices for a head dimension ${h}_{k}$. The self-attention output is computed using scaled dot-product attention:

$$\mathrm{Attention}\,({\rm{Q}},{\rm{K}},{\rm{V}})=\mathrm{softmax}\left(\frac{{\rm{Q}}{{\rm{K}}}^{{\rm{T}}}}{\sqrt{{{\rm{h}}}_{{\rm{k}}}}}\right){\rm{V}}$$

(4)

Outputs from multiple heads are concatenated and linearly projected. A residual connection is added, followed by layer normalization. Subsequently, a position-wise Feed-Forward Network (FFN), consisting of two linear layers with a non-linear activation (ReLU) in between, is applied, again followed by a residual connection and layer normalization:

$${{\rm{f}}}_{2{\rm{d}}}={\rm{LayerNorm}}\left({{\rm{z}}}_{{\rm{attn}}}+{\rm{FFN}}\left({{\rm{z}}}_{{\rm{attn}}}\right)\right)$$

(5)

The resulting feature vector ${{\rm{f}}}_{2{\rm{d}}}\in {{\rm{R}}}^{{\rm{h}}}$ encapsulates globally relevant information extracted from the auxiliary parameters. The use of Transformers is motivated by their proven ability to model complex dependencies^88,89 and their increasing adoption in remote sensing and Earth sciences⁹⁰.
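The encoder operations in Eqs. 3 to 5 can be illustrated with the following single-head NumPy sketch. It is not the trained model: the weights are random, the row-vector convention is used (so the projections have shape (h, h_k)), the output projection `W_o` and all dimensions are illustrative choices, and the four-token sequence is an assumption made only so that the attention weights are non-trivial.

```python
import numpy as np

def softmax(x, axis=-1):
    shifted = x - x.max(axis=axis, keepdims=True)   # subtract max for stability
    e = np.exp(shifted)
    return e / e.sum(axis=axis, keepdims=True)

def layer_norm(x, eps=1e-5):
    # Layer normalization without learnable scale and shift.
    mu = x.mean(axis=-1, keepdims=True)
    var = x.var(axis=-1, keepdims=True)
    return (x - mu) / np.sqrt(var + eps)

def scaled_dot_product_attention(q, k, v):
    # Eq. 4: softmax(Q K^T / sqrt(h_k)) V
    scores = q @ k.T / np.sqrt(q.shape[-1])
    return softmax(scores, axis=-1) @ v

def encoder_layer(z, p):
    # Eq. 3 in row-vector form: project tokens to queries, keys, and values.
    q, k, v = z @ p["W_q"], z @ p["W_k"], z @ p["W_v"]
    attn = scaled_dot_product_attention(q, k, v) @ p["W_o"]
    z_attn = layer_norm(z + attn)                       # residual + LayerNorm
    ffn = np.maximum(z_attn @ p["W_1"], 0.0) @ p["W_2"]  # Linear, ReLU, Linear
    return layer_norm(z_attn + ffn)                     # Eq. 5

def init_params(h, h_k, h_ffn, seed=0):
    rng = np.random.default_rng(seed)
    shapes = {"W_q": (h, h_k), "W_k": (h, h_k), "W_v": (h, h_k),
              "W_o": (h_k, h), "W_1": (h, h_ffn), "W_2": (h_ffn, h)}
    return {name: rng.normal(scale=0.1, size=s) for name, s in shapes.items()}

# Illustrative call: 4 tokens with embedding dimension h = 16.
z_emb = np.random.default_rng(1).normal(size=(4, 16))
f_2d = encoder_layer(z_emb, init_params(h=16, h_k=8, h_ffn=32))
```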

FNO-based feature extraction

The second branch of our model processes vertically resolved atmospheric profiles discretized into 72 pressure layers. These profiles, although discretized, are regarded as samples from an underlying continuous function ${\rm{u}}\left({\rm{x}}\right)$. Such a continuous formulation is necessary since calculating the AMF requires integrating the profile function continuously over altitude.

$${\rm{AMF}}=\int {\rm{W}}\left({\rm{z}}\right){\rm{S}}\left({\rm{z}}\right){\rm{c}}\left({\rm{z}}\right){\rm{dz}}$$

(6)

where $W\left(z\right)$ represents the scattering weight at altitude $z$; $S\left(z\right)$ is the shape factor, calculated as $S\left(z\right)=n\left(z\right)/\int n\left(z\right){dz}$, with $n\left(z\right)$ being the trace gas concentration at altitude $z$; and $c\left(z\right)$ is the temperature correction factor. In the retrieval of NO₂, $c\left(z\right)$ is determined by the empirical relationship⁴⁶ $c\left(z\right)=1-a\left[T\left(z\right)-{T}_{\sigma }\right]+b{\left[T\left(z\right)-{T}_{\sigma }\right]}^{2}$, with $a=0.00316$, $b=3.39\times {10}^{-6}$, ${T}_{\sigma }=220\,{\rm{K}}$, and $T\left(z\right)$ the temperature at altitude $z$.
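On a discrete vertical grid, Eq. 6 reduces to a weighted sum over layers. The sketch below assumes the gas profile is supplied as partial columns per layer, so that the layer thickness is already absorbed into each term and the shape factor is each layer's share of the total column; the function names are illustrative.

```python
import numpy as np

A_COEF = 0.00316   # coefficient a in c(z)
B_COEF = 3.39e-6   # coefficient b in c(z)
T_REF = 220.0      # reference temperature T_sigma in K

def temperature_correction(temp_k):
    """c(z) = 1 - a [T(z) - T_sigma] + b [T(z) - T_sigma]^2."""
    dt = np.asarray(temp_k, dtype=float) - T_REF
    return 1.0 - A_COEF * dt + B_COEF * dt ** 2

def reference_amf(scattering_weights, partial_columns, temp_k):
    """Discrete form of Eq. 6: sum over layers of W * S * c."""
    w = np.asarray(scattering_weights, dtype=float)
    n = np.asarray(partial_columns, dtype=float)
    shape_factor = n / n.sum()               # S sums to one over the column
    return float(np.sum(w * shape_factor * temperature_correction(temp_k)))
```

With constant scattering weights and every layer at 220 K, the correction factor is one and the AMF equals the scattering weight, which is a convenient sanity check.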

Given that AMF is defined by continuous integration through the column, the second branch employs an FNO to extract the global, long-range interactions inherent in the vertical profiles. The profiles are denoted as a function $a\left(z\right)$, defined over a vertical domain ${D}_{z}$, which includes the profile variables. In practice, this function is represented by its discretized form, a vector $a$ sampled at specific levels $\{{z}_{1},{z}_{2},\ldots ,{z}_{n}\}$. Within the FNO branch, the input tensor is first lifted internally to a higher-dimensional latent representation ${v}_{0}\left(z\right)$. This richer representation is then passed through a sequence of four Fourier layers. Within each layer, the global dependencies along the vertical dimension $z$ are first computed via the Fourier domain:

$${g}_{t}\left(z\right)={{\mathcal{F}}}_{{\mathcal{Z}}}^{-1}\left({R}_{t}\cdot {{\mathcal{F}}}_{{\mathcal{Z}}}\left({v}_{t}\right)\left({k}_{z}\right)\right)\left(z\right)$$

(7)

where ${{\mathcal{F}}}_{{\mathcal{Z}}}$ and ${{\mathcal{F}}}_{{\mathcal{Z}}}^{-1}$ represent the 1D Fast Fourier Transform and its inverse along the vertical dimension z, while ${R}_{t}$ is a learnable linear transformation applied in the frequency domain (modes ${k}_{z}$). This global component ${g}_{t}\left(z\right)$ is then combined with a parallel local linear transformation ${W}_{t}$ (1D convolution in our case) acting on the layer’s input ${v}_{t}\left(z\right)$, followed by a non-linear activation σ, to yield the updated latent profile ${v}_{t+1}\left(z\right)$:

$${v}_{t+1}\left(z\right)=\sigma \left({W}_{t}{v}_{t}\left(z\right)+{g}_{t}\left(z\right)\right)$$

(8)

After four iterations (i.e., the number of FNO layers), the final output is the latent profile representation ${v}_{T}\left(z\right)$. This structure allows the model to efficiently learn correlations across different pressure levels by integrating both global context (from the Fourier path) and local features (from the ${W}_{t}$ path). To obtain a fixed-size vector representation suitable for subsequent fusion, the resulting latent representation ${v}_{T}\left(z\right)$ is aggregated using a global average operation across the layers dimension z. This aggregation step yields a single feature vector for each sample in the batch, effectively summarizing the salient information from the vertical profiles extracted by the FNO component.
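The Fourier layer of Eqs. 7 and 8 and the subsequent averaging can be sketched in NumPy as follows. This is an illustration under stated assumptions, not the trained network: the weights are random, ReLU stands in for the unspecified activation σ, the local path is a pointwise (kernel size 1) convolution, and the latent width and number of retained modes are arbitrary example values.

```python
import numpy as np

def fourier_layer(v, R, W, modes):
    """One Fourier layer. v: (c_in, n_layers); R: (modes, c_out, c_in) complex;
    W: (c_out, c_in) pointwise weights for the local path."""
    n_layers = v.shape[-1]
    v_hat = np.fft.rfft(v, axis=-1)                       # to frequency space
    g_hat = np.zeros((R.shape[1], v_hat.shape[-1]), dtype=complex)
    # Mix channels separately for each retained low-frequency mode (Eq. 7);
    # higher modes are truncated to zero.
    g_hat[:, :modes] = np.einsum("koi,ik->ok", R, v_hat[:, :modes])
    g = np.fft.irfft(g_hat, n=n_layers, axis=-1)          # back to the vertical grid
    return np.maximum(W @ v + g, 0.0)                     # Eq. 8 with ReLU (assumed)

def fno_branch(x_3d, lift, layers, modes):
    """Lift the profiles, apply the Fourier layers, then average over the vertical."""
    v = lift @ x_3d                                       # (width, n_layers)
    for R, W in layers:
        v = fourier_layer(v, R, W, modes)
    return v.mean(axis=-1)                                # fixed-size feature vector

rng = np.random.default_rng(0)
n_3d, width, n_layers, modes = 3, 8, 72, 12
x_3d = rng.normal(size=(n_3d, n_layers))                  # stand-in profile inputs
lift = rng.normal(scale=0.3, size=(width, n_3d))
layers = [
    (rng.normal(scale=0.1, size=(modes, width, width))
     + 1j * rng.normal(scale=0.1, size=(modes, width, width)),
     rng.normal(scale=0.1, size=(width, width)))
    for _ in range(4)
]
f_3d = fno_branch(x_3d, lift, layers, modes)
```

Because the spectral weights act on Fourier modes of the whole column, each output level depends on every input level, which is the global coupling the text describes.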

Attention fusion and prediction head

After feature extraction, we merge the Transformer and FNO streams with a single-head cross-attention layer. Here, the Transformer branch output acts as the query Q, while the FNO branch outputs provide both keys K and values V. We compute the attention-weighted fusion using Eq. 4. This attention-based fusion strategy allows the model to dynamically weight and incorporate the most pertinent information captured by the FNO branch (representing vertical profile features) based on the context provided by the Transformer branch (representing other input features). Although analogous to multi-head attention, our implementation uses a single head, and the resulting fused vector is passed to the final prediction network. We then feed the fused features into a two-layer MLP (Linear → ReLU → Dropout → Linear) with a Softplus activation (β = 2) to predict the intermediate AMF values.
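A minimal NumPy sketch of the fusion and prediction steps is given below. It omits the learnable query, key, and value projections, treats dropout as the identity (as at inference time), and uses random weights; the number of key/value vectors and all dimensions are assumptions made for illustration. With a single key/value vector, the attention weight is one and the fused output is simply that value.

```python
import numpy as np

def softplus(x, beta=2.0):
    """Softplus with sharpness beta: log(1 + exp(beta * x)) / beta, always positive."""
    return np.logaddexp(0.0, beta * np.asarray(x, dtype=float)) / beta

def cross_attention(query, keys, values):
    """Single-head attention: one query vector against a set of key/value rows."""
    scores = keys @ query / np.sqrt(query.shape[-1])
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()
    return weights @ values

def predict_amf(fused, W1, b1, W2, b2):
    """Two-layer MLP (Linear, ReLU, Linear) followed by Softplus."""
    hidden = np.maximum(W1 @ fused + b1, 0.0)   # dropout omitted (inference)
    return softplus(W2 @ hidden + b2)

rng = np.random.default_rng(0)
h = 16
f_2d = rng.normal(size=h)           # stand-in for the Transformer branch output
f_3d = rng.normal(size=(5, h))      # stand-in for FNO-derived keys and values
fused = cross_attention(f_2d, f_3d, f_3d)
amf_pred = predict_amf(
    fused,
    rng.normal(scale=0.1, size=(32, h)), np.zeros(32),
    rng.normal(scale=0.1, size=(1, 32)), np.zeros(1),
)
```

The Softplus output keeps the predicted AMF strictly positive, so the division in Eq. 1 is always defined.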

Training procedure and loss function

We train the model end-to-end by minimizing a composite loss function (Eq. 10) that combines a data-fidelity term (${{\mathcal{L}}}_{{vcd}}$) with a physics-informed constraint ${{\mathcal{L}}}_{{physics}}$. The data-fidelity loss (${{\mathcal{L}}}_{{vcd}}$) is defined as the Huber loss (Eq. 9) on the residual a = VCD_pred - VCD_true, with δ = 1.0. Here, VCD_pred is computed by combining the satellite SCD with the model-predicted AMF, as shown in Eq. 1. To ensure stable optimization, both the satellite SCD (used to compute VCD_pred) and VCD_true are standardized (zero mean, unit variance), which makes the residual “a” dimensionless and prevents high-magnitude scenes from dominating the loss.

$${{\mathcal{L}}}_{\delta }\left(a\right)=\left\{\begin{array}{l}\frac{1}{2}{a}^{2}\,\,{\text{if}}\left|a\right|\le \delta \\ \delta \left(\left|a\right|-\frac{1}{2}\delta \right)\,\,\text{otherwise},\end{array}\right.$$

(9)

Concurrently, the physics loss ${{\mathcal{L}}}_{{physics}}$ also uses the Huber formulation to penalize deviations between the predicted AMF and an independent AMF reference computed via Eq. 6. The total loss is then

$${{\mathcal{L}}}_{{total}}={{\mathcal{L}}}_{{vcd}}+{{\rm{\lambda }}}_{\text{physics}}\times {{\mathcal{L}}}_{{physics}}$$

(10)

where ${{\rm{\lambda }}}_{\text{physics}}=0.005$ balances the two objectives. We determined this weight, along with the optimal number of Fourier modes, Transformer layers, hidden dimensions, and other hyperparameters, via Ray Tune⁹¹. A concise layer-by-layer specification and the full search space for hyperparameter tuning and selection are summarized in Tables S3 and S4. This specific value of ${{\rm{\lambda }}}_{\text{physics}}$ accounts for the differing ranges of the target variables, as we normalize VCD values but do not normalize AMF, thereby ensuring a balanced contribution of both terms to the composite loss.
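The composite objective of Eqs. 1, 9, and 10 can be written compactly as in the sketch below. It assumes that the SCD and reference VCD passed in are already standardized as described above, that each Huber term is averaged over the batch, and that the reference AMF comes from Eq. 6; the function names and the mean reduction are illustrative assumptions.

```python
import numpy as np

def huber(a, delta=1.0):
    """Elementwise Huber loss (Eq. 9)."""
    a = np.asarray(a, dtype=float)
    abs_a = np.abs(a)
    return np.where(abs_a <= delta, 0.5 * a ** 2, delta * (abs_a - 0.5 * delta))

def total_loss(scd, amf_pred, vcd_true, amf_ref, lam=0.005, delta=1.0):
    """L_total = L_vcd + lambda_physics * L_physics (Eq. 10)."""
    scd = np.asarray(scd, dtype=float)
    amf_pred = np.asarray(amf_pred, dtype=float)
    vcd_pred = scd / amf_pred                               # Eq. 1
    loss_vcd = np.mean(huber(vcd_pred - np.asarray(vcd_true, dtype=float), delta))
    loss_physics = np.mean(huber(amf_pred - np.asarray(amf_ref, dtype=float), delta))
    return float(loss_vcd + lam * loss_physics)
```

For example, a sample whose predicted VCD matches the reference exactly but whose AMF deviates from the reference AMF by 1.0 contributes 0.005 × 0.5 = 0.0025 to the loss, which shows how small the physics term is relative to a unit VCD residual.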

Data availability

TEMPO Level 2 products used in this study are publicly available via the NASA EarthData repository (EarthData Search) and the ASDC at NASA LaRC (https://www.earthdata.nasa.gov/data/instruments/tempo). Pandora spectrometer data are accessible through the Pandonia Global Network (PGN) data portal (https://data.ovh.pandonia-global-network.org/).

Code availability

Training and evaluation codes with sample evaluation data can be found at https://github.com/sagunkayastha/NO2_Bias_Correction.

References

Jung, J. et al. The impact of springtime-transported air pollutants on local air quality with satellite-constrained NOx emission adjustments over East Asia. J. Geophys. Res. Atmos. 127, e2021JD035251 (2022).
Lasek, J. A. & Lajnert, R. On the Issues of NOx as Greenhouse Gases: An Ongoing Discussion…. Appl. Sci. 12, 10429 (2022).
Thornhill, G. D. et al. Effective radiative forcing from emissions of reactive gases and aerosols – a multi-model comparison. Atmos. Chem. Phys. 21, 853–874 (2021).
van der A, R. J. et al. Trends, seasonal variability and dominant NOx source derived from a ten year record of NO₂ measured from space. J. Geophys. Res. Atmos. 113, D04302, https://doi.org/10.1029/2007JD009021 (2008).
Bai, K., Ma, M., Chang, N.-B. & Gao, W. Spatiotemporal trend analysis for fine particulate matter concentrations in China using high-resolution satellite-derived and ground-measured PM2.5 data. J. Environ. Manag. 233, 530–542 (2019).
Chen, Z.-Y. et al. Extreme gradient boosting model to estimate PM2.5 concentrations with missing-filled satellite data in China. Atmos. Environ. 202, 180–189 (2019).
Gurjar, B. R. et al. Human health risks in megacities due to air pollution. Atmos. Environ. 44, 4606–4613 (2010).
Fioletov, V., McLinden, C. A., Griffin, D., Zhao, X. & Eskes, H. Global seasonal urban, industrial, and background NO₂ estimated from TROPOMI satellite observations. Atmos. Chem. Phys. 25, 575–596 (2025).
Pommier, M. Estimations of NOx emissions, NO2 lifetime and their temporal variation over three British urbanised regions in 2019 using TROPOMI NO₂ observations. Environ. Sci. Atmos. 3, 408–421 (2023).
Huijnen, V. et al. Comparison of OMI NO₂ tropospheric columns with an ensemble of global and European regional air quality models. Atmos. Chem. Phys. 10, 3273–3296 (2010).
Naeger, A. R. et al. Revolutionary Air-Pollution Applications from Future Tropospheric Emissions: Monitoring of Pollution (TEMPO) Observations. (2021) https://doi.org/10.1175/BAMS-D-21-0050.1.
Wang, C., Wang, T., Wang, P. & Rakitin, V. Comparison and Validation of TROPOMI and OMI NO2 Observations over China. Atmosphere 11, 636 (2020).
Lamsal, L. N. et al. Ground-level nitrogen dioxide concentrations inferred from the satellite-borne Ozone Monitoring Instrument. J. Geophys. Res. Atmos. 113, D16308, https://doi.org/10.1029/2007JD009235 (2008).
Larkin, A. et al. A Global Land Use Regression Model for Nitrogen Dioxide Air Pollution. Environ. Sci. Technol. 51, 6957–6964 (2017).
Bey, I. et al. Global modeling of tropospheric chemistry with assimilated meteorology: Model description and evaluation. J. Geophys. Res. Atmos. 106, 23073–23095 (2001).
Byun, D. & Schere, K. L. Review of the governing equations, computational algorithms, and other components of the Models-3 Community Multiscale Air Quality (CMAQ) modeling system. Appl. Mech. Rev. 59, 51–77 (2006).
Gilliam, R. C. & Pleim, J. E. Performance assessment of new land surface and planetary boundary layer physics in the WRF-ARW. J. Appl. Meteorol. Climatol. 49, 760–774 (2010).
Harkey, M., Holloway, T., Oberman, J. & Scotty, E. An evaluation of CMAQ NO₂ using observed chemistry-meteorology correlations. J. Geophys. Res. Atmos. 120, 11,775–11,797 (2015).
Syrakov, D., Prodanova, M., Georgieva, E., Etropolska, I. & Slavov, K. Impact of NOx emissions on air quality simulations with the Bulgarian WRF-CMAQ modelling system. Int. J. Environ. Pollut. 57, 285–296 (2015).
Burrows, J. P. et al. The Global Ozone Monitoring Experiment (GOME): Mission Concept and First Scientific Results. J. Atmos. Sci. 56, 151–175 (1999).
Bovensmann, H. et al. SCIAMACHY: Mission objectives and measurement modes. J. Atmos. Sci. 56, 127–150 (1999).
Levelt, P. F. et al. The ozone monitoring instrument. IEEE Trans. Geosci. Remote Sens. 44, 1093–1101 (2006).
Munro, R. et al. The GOME-2 instrument on the Metop series of satellites: instrument design, calibration, and level 1 data processing – an overview. Atmos. Meas. Tech. 9, 1279–1301 (2016).
Veefkind, J. P. et al. TROPOMI on the ESA Sentinel-5 Precursor: A GMES mission for global observations of the atmospheric composition for climate, air quality and ozone layer applications. Remote Sens. Environ. 120, 70–83 (2012).
Ghahremanloo, M., Choi, Y. & Singh, D. Deep learning bias correction of GEMS tropospheric NO2: A comparative validation of NO2 from GEMS and TROPOMI using Pandora observations. Environ. Int. 190, 108818 (2024).
Choi, W. J. et al. Introducing the geostationary environment monitoring spectrometer. J. Appl. Remote Sens. 12, 044005 (2018).
Kim, J. et al. New Era of Air Quality Monitoring from Space: Geostationary Environment Monitoring Spectrometer (GEMS). Bull. Am. Meteorol. Soc. 101, E1–E22 (2020).
Chance, K. et al. TEMPO Green Paper: Chemistry, physics, and meteorology experiments with the Tropospheric Emissions: monitoring of pollution instrument. in Sensors, Systems, and Next-Generation Satellites XXIII vol. 11151 56–67 (SPIE, 2019).
Zoogman, P. et al. Tropospheric emissions: Monitoring of pollution (TEMPO). J. Quant. Spectrosc. Radiat. Transf. 186, 17–39 (2017).
Gulde, S. T. et al. Sentinel 4: a geostationary imaging UVN spectrometer for air quality monitoring: status of design, performance and development. in International Conference on Space Optics — ICSO 2014 vol. 10563 1158–1166 (SPIE, 2017).
Lamsal, L. N. et al. Application of satellite observations for timely updates to global anthropogenic NO_x emission inventories. Geophys. Res. Lett. 38, L05810, https://doi.org/10.1029/2010GL046476 (2011).
Martin, R. V. et al. Global inventory of nitrogen oxide emissions constrained by space-based observations of NO₂ columns. J. Geophys. Res. Atmos. 108, 4537, https://doi.org/10.1029/2003JD003453 (2003).
Park, J., Choi, Y., Jung, J., Lee, K. & Yeganeh, A. K. First top-down diurnal adjustment to NO_x emissions inventory in Asia informed by the Geostationary Environment Monitoring Spectrometer (GEMS) tropospheric NO₂ columns. Sci. Rep. 14, 24338 (2024).
Park, J., Choi, Y. & Kayastha, S. Local and transboundary contributions to NO_y loadings across East Asia using CMAQ-ISAM and a GEMS-informed emission inventory during the winter–spring transition. Atmos. Chem. Phys. 25, 4291–4311 (2025).
Judd, L. M. et al. Evaluating the impact of spatial resolution on tropospheric NO₂ column comparisons within urban areas using high-resolution airborne data. Atmos. Meas. Tech. 12, 6091–6111 (2019).
Silvern, R. F. et al. Using satellite observations of tropospheric NO₂ columns to infer long-term trends in US NO_x emissions: the importance of accounting for the free tropospheric NO₂ background. Atmos. Chem. Phys. 19, 8863–8878 (2019).
Holloway, T. et al. Satellite monitoring for air quality and health. Annu. Rev. Biomed. Data Sci. 4, 417–447 (2021).
Huang, K., Zhu, Q., Lu, X., Gu, D. & Liu, Y. Satellite-based long-term spatiotemporal trends in ambient NO₂ concentrations and attributable health burdens in China from 2005 to 2020. GeoHealth 7, e2023GH000798 (2023).
Kim, N. R. & Lee, H. J. Leveraging high-resolution satellite-derived NO₂ estimates to evaluate NO₂ exposure representativeness and socioeconomic disparities. Environ. Sci. Technol. 59, 3434–3442 (2025).
Beirle, S., Boersma, K. F., Platt, U., Lawrence, M. G. & Wagner, T. Megacity emissions and lifetimes of nitrogen oxides probed from space. Science 333, 1737–1739 (2011).
Geddes, J. A., Murphy, J. G., O’Brien, J. M. & Celarier, E. A. Biases in long-term NO₂ averages inferred from satellite observations due to cloud selection criteria. Remote Sens. Environ. 124, 210–216 (2012).
Boersma, K. F., Eskes, H. J. & Brinksma, E. J. Error analysis for tropospheric NO₂ retrieval from space. J. Geophys. Res. Atmos. 109, D04311, https://doi.org/10.1029/2003JD003962 (2004).
Lorente, A. et al. Structural uncertainty in air mass factor calculation for NO₂ and HCHO satellite retrievals. Atmos. Meas. Tech. 10, 759–782 (2017).
Nowlan, C. R. et al. Nitrogen dioxide observations from the Geostationary Trace gas and Aerosol Sensor Optimization (GeoTASO) airborne instrument: Retrieval algorithm and measurements during DISCOVER-AQ Texas 2013. Atmos. Meas. Tech. 9, 2647–2668 (2016).
Seo, S. et al. Tropospheric NO₂ retrieval algorithm for geostationary satellite instruments: applications to GEMS. Atmos. Meas. Tech. 17, 6163–6191 (2024).
van Geffen, J. et al. Sentinel-5P TROPOMI NO₂ retrieval: impact of version v2.2 improvements and comparisons with OMI and ground-based data. Atmos. Meas. Tech. 15, 2037–2060 (2022).
Laughner, J. L., Zhu, Q. & Cohen, R. C. The Berkeley High Resolution Tropospheric NO₂ product. Earth Syst. Sci. Data 10, 2069–2095 (2018).
Lin, J.-T. et al. Retrieving tropospheric nitrogen dioxide from the Ozone Monitoring Instrument: effects of aerosols, surface reflectance anisotropy, and vertical profile of nitrogen dioxide. Atmos. Chem. Phys. 14, 1441–1461 (2014).
Qin, W. et al. A geometry-dependent surface Lambertian-equivalent reflectivity product for UV–Vis retrievals – Part 1: Evaluation over land surfaces using measurements from OMI at 466 nm. Atmos. Meas. Tech. 12, 3997–4017 (2019).
Boersma, K. F. et al. Improving algorithms and uncertainty estimates for satellite NO₂ retrievals: results from the quality assurance for the essential climate variables (QA4ECV) project. Atmos. Meas. Tech. 11, 6651–6678 (2018).
Liu, S. et al. An improved total and tropospheric NO₂ column retrieval for GOME-2. Atmos. Meas. Tech. 12, 1029–1057 (2019).
Ialongo, I., Virta, H., Eskes, H., Hovila, J. & Douros, J. Comparison of TROPOMI/Sentinel-5 Precursor NO₂ observations with ground-based measurements in Helsinki. Atmos. Meas. Tech. 13, 205–218 (2020).
Tack, F. et al. Assessment of the TROPOMI tropospheric NO₂ product based on airborne APEX observations. Atmos. Meas. Tech. 14, 615–646 (2021).
Zhao, X. et al. Assessment of the quality of TROPOMI high-spatial-resolution NO₂ data products in the Greater Toronto Area. Atmos. Meas. Tech. 13, 2131–2159 (2020).
Lamsal, L. N. et al. Evaluation of OMI operational standard NO₂ column retrievals using in situ and surface-based NO₂ observations. Atmos. Chem. Phys. 14, 11587–11609 (2014).
Griffin, D. et al. High-resolution mapping of nitrogen dioxide with TROPOMI: first results and validation over the Canadian Oil Sands. Geophys. Res. Lett. 46, 1049–1060 (2019).
Mak, H. W. L., Laughner, J. L., Fung, J. C. H., Zhu, Q. & Cohen, R. C. Improved satellite retrieval of tropospheric NO₂ column density via updating of Air Mass Factor (AMF): Case Study of Southern China. Remote Sens. 10, 1789 (2018).
Cooper, M. J., Martin, R. V., Hammer, M. S. & McLinden, C. A. An observation-based correction for aerosol effects on nitrogen dioxide column retrievals using the absorbing Aerosol Index. Geophys. Res. Lett. 46, 8442–8452 (2019).
Goldberg, D. L. et al. A top-down assessment using OMI NO₂ suggests an underestimate in the NO_x emissions inventory in Seoul, South Korea, during KORUS-AQ. Atmos. Chem. Phys. 19, 1801–1818 (2019).
Choi, S. et al. Assessment of NO₂ observations during DISCOVER-AQ and KORUS-AQ field campaigns. Atmos. Meas. Tech. 13 (2020).
Laughner, J. L., Zare, A. & Cohen, R. C. Effects of daily meteorology on the interpretation of space-based remote sensing of NO₂. Atmos. Chem. Phys. 16, 15247–15264 (2016).
Bucsela, E. J. et al. A new stratospheric and tropospheric NO₂ retrieval algorithm for nadir-viewing satellite instruments: applications to OMI. Atmos. Meas. Tech. 6, 2607–2626 (2013).
Wang, C., Wang, T., Wang, P. & Wang, W. Assessment of the performance of TROPOMI NO₂ and SO₂ data products in the North China Plain: comparison, correction and application. Remote Sens. 14, 214 (2022).
Chi, Y. et al. Ground-level NO₂ concentration estimation based on OMI tropospheric NO₂ and its spatiotemporal characteristics in typical regions of China. Atmos. Res. 264, 105821 (2021).
Ghahremanloo, M., Lops, Y., Choi, Y., Mousavinezhad, S. & Jung, J. A coupled deep learning model for estimating surface NO₂ levels from remote sensing data: 15-year study over the contiguous United States. J. Geophys. Res. Atmos. 128, e2022JD037010 (2023).
Lops, Y. et al. Spatiotemporal estimation of TROPOMI NO₂ column with depthwise partial convolutional neural network. Neural Comput. Appl. 35, 15667–15678, https://doi.org/10.1007/s00521-023-08558-1 (2023).
Wu, D. et al. Quantifying uncertainty in deep spatiotemporal forecasting. Proc. 27th ACM SIGKDD Conf. Knowl. Discov. Data Min. (KDD ’21), 1841–1851 https://doi.org/10.1145/3447548.3467325 (2021).
Oak, Y. J. et al. A bias-corrected GEMS geostationary satellite product for nitrogen dioxide using machine learning to enforce consistency with the TROPOMI satellite instrument. Atmos. Meas. Tech. 17, 5147–5159 (2024).
Azizzadenesheli, K., Kovachki, N., Li, Z., Liu-Schiaffini, M., Kossaifi, J. & Anandkumar, A. Neural operators for accelerating scientific simulations and design. Nat. Rev. Phys. 6, 320–328 (2024).
Bedi, S., Tiwari, K., Prathosh, A. P., Kota, S. H. & Krishnan, N. M. A. A neural operator for forecasting carbon monoxide evolution in cities. npj Clean Air 1, 1–12 (2025).
Shen, S. et al. Enhancing global estimation of fine particulate matter concentrations by including geophysical a priori information in deep learning. ACS ES&T Air 1, 332–345 (2024).
Kayastha, S. G. et al. A deep learning framework for satellite-derived surface PM_2.5 estimation: enhancing spatial analysis in the United States. https://doi.org/10.1175/AIES-D-24-0028.1 (2024).
Willmott, C. J. et al. Statistics for the evaluation and comparison of models. J. Geophys. Res. Oceans 90, 8995–9005 (1985).
Ghahremanloo, M. et al. Comprehensive Analysis of Bias in TEMPO NO₂ Column Densities Through Pandora Observations. J. Geophys. Res. Atmos. 130, e2025JD044150 (2025).
Chatterjee, D. et al. Interpreting summertime hourly variation of NO₂ columns with implications for geostationary satellite applications. Atmos. Chem. Phys. 24, 12687–12706 (2024).
Laughner, J. L. & Cohen, R. C. Quantification of the effect of modeled lightning NO₂ on UV–visible air mass factors. Atmos. Meas. Tech. 10, 4403–4419 (2017).
Tang, D., Zhan, Y. & Yang, F. A review of machine learning for modeling air quality: Overlooked but important issues. Atmos. Res. 300, 107261 (2024).
Herman, J. et al. NO₂ column amounts from ground-based Pandora and MFDOAS spectrometers using the direct-sun DOAS technique: Intercomparisons and application to OMI validation. J. Geophys. Res. Atmos. 114, D13307, https://doi.org/10.1029/2009JD011848 (2009).
Herman, J. et al. Underestimation of column NO₂ amounts from the OMI satellite compared to diurnally varying ground-based retrievals from multiple PANDORA spectrometer instruments. Atmos. Meas. Tech. 12, 5593–5612 (2019).
Bae, K. et al. Validation of GEMS operational v2.0 total column NO₂ and HCHO during the GMAP/SIJAQ campaign. Sci. Total Environ. 974, 179190 (2025).
Kim, M.-H. et al. Assessing CALIOP-derived planetary boundary layer height using ground-based Lidar. Remote Sens. 13, 1496 (2021).
Verhoelst, T. et al. Ground-based validation of the Copernicus Sentinel-5P TROPOMI NO₂ measurements with the NDACC ZSL-DOAS, MAX-DOAS and Pandonia global networks. Atmos. Meas. Tech. 14, 481–510 (2021).
Jones, P. D. et al. Adjusting for sampling density in grid box land and ocean surface temperature time series. J. Geophys. Res. Atmos. 106, 3371–3380 (2001).
Pedregosa, F. et al. Scikit-learn: Machine learning in Python. J. Mach. Learn. Res. 12, 2825–2830 (2011).
Vaswani, A. et al. Attention is all you need. Advances in Neural Information Processing Systems 30 (2017).
Azizzadenesheli, K. et al. Neural operators for accelerating scientific simulations and design. Nat. Rev. Phys. 6, 320–328 (2024).
Li, Z. et al. Fourier neural operator for parametric partial differential equations. International Conference on Learning Representations (ICLR 2021).
Dosovitskiy, A. et al. An image is worth 16×16 words: Transformers for image recognition at scale. International Conference on Learning Representations (ICLR 2021).
Liu, Z. et al. Swin Transformer: Hierarchical Vision Transformer using Shifted Windows. 2021 IEEE/CVF International Conference on Computer Vision (ICCV), 9992–10002 https://doi.org/10.1109/ICCV48922.2021.00986 (2021).
Aleissaee, A. A. et al. Transformers in remote sensing: a survey. Remote Sens. 15, 1860 (2023).
Liaw, R., Liang, E., Nishihara, R., Moritz, P., Gonzalez, J. E. & Stoica, I. Tune: A research platform for distributed model selection and training. ICML 2018 Workshop on Automated Machine Learning (AutoML) https://arxiv.org/abs/1807.05118 (2018).

Acknowledgements

The authors acknowledge the use of the Carya Cluster and advanced support from the Research Computing Data Core at the University of Houston.

Author information

Authors and Affiliations

Department of Earth and Atmospheric Sciences, University of Houston, Houston, TX, USA
Sagun Gopal Kayastha, Jincheol Park & Yunsoo Choi

Authors

Sagun Gopal Kayastha
Jincheol Park
Yunsoo Choi

Contributions

S.G.K. conceived and designed the study, developed the computational framework, performed the simulations and formal analyses, and prepared the original manuscript draft. Y.C. contributed to manuscript review and editing. J.P. supervised the study, assisted with result interpretation, and contributed to manuscript review and editing.

Corresponding author

Correspondence to Yunsoo Choi.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Publisher’s note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary information

Supplementary material (DOCX)

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

About this article

Cite this article

Kayastha, S.G., Park, J. & Choi, Y. Hybrid transformer and physics-informed neural operator for correcting TEMPO NO₂ biases over North America. npj Clean Air 2, 15 (2026). https://doi.org/10.1038/s44407-026-00056-7

Received: 11 November 2025
Accepted: 01 February 2026
Published: 06 March 2026
Version of record: 06 March 2026
DOI: https://doi.org/10.1038/s44407-026-00056-7