Abstract
A comprehensive understanding of the atmospheric hydrological cycle is crucial for studying the transboundary impacts of climate and anthropogenic changes on global water resources. This study presents AMSSRAB, a global dataset of atmospheric moisture source-sink relationships (AMSSRs) between evaporation and precipitation locations, and the derived atmospheric basins (ABs) that regionalize the global atmospheric hydrological cycle. By integrating three atmospheric moisture tracking models (WAM2layers, UTrack, and WaterSip), AMSSRAB provides a multi-model ensemble representation of atmospheric moisture flows. The dataset provides seasonal atmospheric moisture source-sink relationships at 1° resolution over 40 years (1979–2018). The derived atmospheric basins identify quasi-independent moisture circulation systems characterized by high internal recycling. AMSSRAB demonstrates strong agreement with bias-corrected reconstruction data and previously published datasets while offering a regional perspective that complements existing evaporationshed-precipitationshed frameworks. This dataset enables researchers to investigate moisture source-sink relationships at multiple scales, analyze regional moisture recycling patterns, and examine atmospheric moisture responses to climate variability within the Earth’s atmospheric water system.
Background & Summary
The atmospheric hydrological cycle, responsible for transporting moisture evaporated from both land and oceans to distant precipitation locations, acts as a bridge connecting evaporation and precipitation over land and oceans. Oceanic evaporation, enhanced by climate change, potentially boosts the moisture contribution to land precipitation through this cycle1,2. Human-induced alterations to terrestrial surfaces, such as large-scale deforestation, afforestation, irrigation, and urbanization, can significantly modify local evaporation patterns, with potential inter-basin downwind effects on precipitation3,4,5, runoff6, and even glaciers7. These effects are transmitted via the atmospheric hydrological cycle, conveying the inter-basin impacts of human activities on the global hydrological cycle and water resources. Therefore, understanding this cycle is crucial, as it forms an important link between land, oceans, and human society.
Atmospheric moisture tracking models offer a powerful research tool for such studies. They simulate the fate of tracers tagged with isotopes or virtual moisture masses by solving the atmospheric moisture mass balance equations either embedded in weather/climate models (online models)8,9 or constructed by reanalysis datasets (offline models)10,11,12. The models offer a unique perspective on the atmospheric hydrological cycle by illustrating atmospheric moisture source-sink relationships (AMSSRs). The AMSSRs highlight the spatial locations of moisture sources (evaporation) and sinks (precipitation) and quantify the amount of moisture transported from the sources to the sinks. From this, two counterpart concepts, evaporationshed13 (referred to as ESHED hereafter) and the precipitationshed14 (referred to as PSHED hereafter), were proposed to establish an atmospheric watershed (ASHED) analysis framework for the atmospheric hydrological cycle. Recent studies using ASHED concepts have shed light on the hydrological functions of key moisture source/sink regions worldwide. Studies using the PSHED framework highlighted the buffer effects of forests on precipitation variability15,16. Cui et al.17 developed an ASHED-based leaf area index and showed that vegetation changes have increased global water availability. Pranindita et al.18 revealed PSHED shrinkage over Europe during heatwaves and noted the relatively stable moisture supply of forests to all study regions during heatwaves. Rockström et al.19 utilized ASHEDs to illustrate how a nation’s precipitation is interconnected with evaporation from foreign land and oceans, underscoring the need for comprehensive understanding of atmospheric moisture flows between nations for global water resources governance.
Recently, high-resolution datasets of AMSSRs and ASHEDs have been developed, providing essential data support for researchers studying how and to what extent global changes driven by climate or human activities destabilize and shift global water through the atmospheric hydrological cycle. Link et al.20 generated a global dataset (referred to as L20 hereafter) using the Eulerian moisture tracking model WAM2layers (water accounting model-2layer21,22) with spatial resolution of 1.5° during 2001–2018. L20 provides monthly data for all AMSSRs starting from global land to support the enumerations of all ESHEDs of 1.5° land grid cells. Tuinenburg et al.23 constructed a global dataset (referred to as T20 hereafter) using the Lagrangian model UTrack (Utrecht atmospheric moisture tracking model24) with spatial resolutions of 0.5° and 1.0° and covering 2008–2017. T20 offers monthly climatological means of AMSSRs between all 0.5°/1.0° grid cells, allowing the delineation of ESHEDs and PSHEDs. De Petrillo et al.25 developed a reconciled dataset (RECON) to address discrepancies between the T20 reconstructed evaporation/precipitation and ERA5 reanalysis data. Using an iterative proportional fitting (IPF) procedure, RECON provides bias-corrected moisture flow estimates at 0.5° resolution centred on the period 2008–2017. These datasets bring operational convenience and have been used to investigate hydrological impacts of land use services through AMSSRs for worldwide hotspot regions such as the Tibetan Plateau26 and for global water availability17,27. Researchers have also explored advanced analytical approaches enabled by these datasets. Network analysis methods, such as tipping cascade analysis28,29, have been developed and applied to study cascading moisture recycling and remote moisture transport connections. 
Zhang et al.30 built global AMSSR networks and identified the regional-scale and quasi-independent spatial structures, “atmospheric basins” (ABs), within which the moisture flow paths promote moisture evaporated internally to precipitate internally, resulting in high-strength moisture recycling and the similarity to the behaviour of water flowing within a river basin.
Despite these advances in data availability and analytical methods, model dependency remains a critical issue, as significant deviations exist in simulated AMSSRs across different atmospheric moisture tracking models23,24,31. Lack of global-scale observational evidence supported by isotopes makes thorough validation challenging. The conceptual differences and applications of various forcing data further complicate model intercomparisons. A coordinated international moisture tracking model intercomparison project has been established and is actively ongoing in an effort to systematically tackle these uncertainties32. Additionally, while previous work has demonstrated the potential of network analysis for analyzing cascading moisture recycling and identifying ABs30, further development is needed to enhance the reliability of identified ABs for broader applications, particularly by addressing the challenges of distinguishing consistent atmospheric moisture transport patterns from individual model uncertainties.
In light of these challenges, the present study develops a multi-model, global, 1° dataset of AMSSRs spanning 40 years (1979–2018) and derives ABs through network analysis. We provide the publicly available AMSSRAB dataset that: (1) integrates three atmospheric moisture tracking models with different numerical frameworks (Eulerian WAM2layers, Lagrangian UTrack and WaterSip) to reduce individual model uncertainties; (2) applies a community detection algorithm to identify ABs as quasi-independent regional moisture circulation systems; and (3) spans 40 years of seasonal data, enabling both climatological analysis and the study of seasonal-scale climate oscillations. This multi-model ensemble approach mitigates model dependency issues while providing a regional perspective that complements the ESHED-PSHED framework. The procedure for AMSSRAB production is divided into four key stages, as shown in Fig. 1: (1) implementation of atmospheric moisture tracking to develop the global AMSSR data; (2) establishment of global AMSSR networks and the AMSSR dataset; (3) identification of ABs; and (4) generation of the AB dataset. First, three widely used atmospheric moisture tracking models are utilized to generate global AMSSR data. Each model takes a different approach to the atmospheric moisture balance, which allows the model dependency issue to be considered. The second stage rearranges the AMSSR data into matrices and then conducts bias correction using an iterative fitting algorithm. The third stage feeds the AMSSR networks into a community detection algorithm that identifies network communities. The final stage mitigates uncertainties stemming from the inherent randomness of the employed community detection algorithm and from the use of multiple models.
Workflow of the development of AMSSRAB.
Methods
Atmospheric moisture tracking
Three widely used models were applied to conduct atmospheric moisture tracking: WAM2layers (version 2.4.08, https://github.com/ruudvdent/WAM2layersPython/releases/tag/v2.4.08), UTrack, and WaterSip33. These models differ in their approach to the atmospheric moisture budget. WAM2layers, used in the prior study of ABs30, solves the Eulerian mass balance equation of a tagged moisture tracer, allowing for the direct generation of AMSSR spatial maps that align with the gridded forcing data resolution. UTrack and WaterSip solve the moisture budget by tracing the Lagrangian trajectories of air parcels. They differ in their depiction of moisture uptake and loss processes along the route. UTrack uses actual evaporation and precipitation data. WaterSip relies on the changes in moisture content along the trajectories. Consequently, the two models apply different strategies and schemes for quantifying moisture contributions, with detailed descriptions available in prior studies12,24.
For consistency, the European Centre for Medium-Range Weather Forecasts Interim Reanalysis data (ERA-Interim)34 with 1° × 1° spatial resolution were used as input data for the three models. Commonly required variables include surface pressure, and model-level horizontal winds and specific humidity. WaterSip and UTrack additionally need model-level vertical velocity and air temperature. WAM2layers and UTrack also require evaporation and precipitation data, and WAM2layers further needs vertically integrated state and flux variables of atmospheric moisture, which describe the background state of the atmospheric hydrological cycle. Evaporation and precipitation were acquired as 3-hourly data, and all other variables were acquired as 6-hourly data. Input data for WaterSip and UTrack were first processed through the Lagrangian analysis tool LAGRANTO35 to generate the Lagrangian air trajectories, which were then fed into the two models for atmospheric moisture tracking.
The models were run to generate seasonal AMSSRs (DJF, December–January–February; MAM, March–April–May; JJA, June–July–August; SON, September–October–November), spanning the period from 1979MAM to 2018SON. For WaterSip and UTrack, LAGRANTO was first run to generate backward air trajectories, releasing air parcels from the centers of the input data grid cells at 10 vertical levels above the surface (at 50 hPa intervals) at 00, 06, 12, and 18 UTC during the study period, and tracing the parcels backward in time for 20 days. The output trajectories were fed into the two models with configurations similar to those in the original model description papers (WaterSip: valid endpoint relative humidity threshold at 80%, moisture uptake threshold at 0.2 g kg−1 (6 h)−1, and planetary boundary layer limitation activated12; UTrack: tracking time step at 6 h, and weighting contributions of particles released at different heights by the corresponding vertical specific humidity profiles36). Model outputs for one season contain the amount of moisture that each location along the trajectories contributes to precipitation at each grid cell. These outputs were then projected onto a 1° × 1° global grid, creating for each grid cell a 2D moisture source map with dimensions of 181 × 360 (65160 map cells) that shows the moisture sources contributing to that grid cell’s seasonal precipitation (see Fig. 2 for a demonstrative 2D source map). WAM2layers was run following a procedure similar to that described by Zhang et al.30, but confined to a 1° × 1° grid spanning latitudes from −80° to 80° and without the fine-to-coarse aggregation strategy. In this case, a seasonal 2D source map with dimensions of 161 × 360 (57960 map cells) for the precipitation of one grid cell is output directly by a single simulation round.
Demonstration of the key processes involved in the development of AMSSRAB. Units of 2D source map, its flattened version, and adjacency matrices: mm per season; colorbars of source maps and AB boundary probability are embedded within the subfigures, and the logarithmic-scale colorbar applies to all matrices; AB boundaries are randomly coloured.
Due to the abundance of tunable parameters and optional schemes inherent to the three models, conducting exhaustive sensitivity tests on various model configurations is challenging. Therefore, we adhered to the models’ standard settings where possible. For more technical details of the models, readers are referred to the model papers12,21,22,24,33,36.
Generation of AMSSR and AB data
After atmospheric moisture tracking, we obtained 65160 (or 57960 for WAM2layers) maps of 181 × 360 (or 161 × 360 for WAM2layers) 2D moisture source patterns per season. These maps were then converted into an adjacency matrix, a common representation of a network, to construct an AMSSR network. In this matrix, each element, aij, signifies the AMSSR between grid cells (or nodes in network terminology) i and j (from i to j). The j-th column a:j illustrates the moisture that each node provides to node j, equivalent to the flattened version of the 2D source map for node j. Therefore, the adjacency matrix A, with dimensions of 65160 × 65160 (or 57960 × 57960 for WAM2layers), was practically derived from column-wise stacking of the flattened 2D source maps, following the grid cell order of the flattened maps (Fig. 2).
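The column-wise stacking described above can be sketched on a toy grid. This is only an illustration under stated assumptions, not the authors' production code: the function name `build_adjacency` and the array layout (one 2D source map per sink node, flattened in row-major order) are hypothetical choices made for the example.

```python
import numpy as np

def build_adjacency(source_maps):
    """Stack flattened 2D source maps column-wise into an adjacency matrix.

    source_maps has shape (n_nodes, n_lat, n_lon); source_maps[j] is the 2D
    moisture source map of sink node j, and n_nodes must equal n_lat * n_lon.
    (Hypothetical layout, chosen for this sketch.)
    """
    source_maps = np.asarray(source_maps, dtype=float)
    n_nodes, n_lat, n_lon = source_maps.shape
    if n_nodes != n_lat * n_lon:
        raise ValueError("one source map per grid cell is expected")
    # Row-major flattening gives every map the same node order as the grid.
    flat = source_maps.reshape(n_nodes, n_lat * n_lon)
    # Column j of A must hold the flattened map of sink j, hence the transpose.
    return flat.T

# Toy 2 x 3 grid (6 nodes) filled with random non-negative values.
rng = np.random.default_rng(0)
maps = rng.random((6, 2, 3))
A = build_adjacency(maps)
```

With this layout, `A[i, j]` is the moisture that node `i` supplies to node `j`, so `A[:, j]` reproduces the flattened source map of node `j`. At the full 1° resolution a dense array of this kind would not fit in memory, so a sparse representation would be needed in practice.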
The sums of the i-th row ai: and i-th column a:i of A denote the total moisture that grid cell i emits and receives, ideally equaling the grid cell’s evaporation and precipitation respectively. However, discrepancies exist23, possibly due to model configurations, such as the predefined fixed air trajectory time length and limited model domain, as well as certain technical assumptions. To correct these biases, we adopted a matrix scaling method known as RAS37. The RAS aims to adjust the elements of an n-dimensional (n = 2 in the present case) matrix to make the marginal summations of the matrix equal to the predefined n-1 dimensional marginal totals. The core procedure of RAS is to adjust each row and column element by scaling it with a ratio of the given marginal total to the actual sum of the corresponding row or column elements. This scaling is alternated row-wise and column-wise until a predefined convergence criterion is met (Fig. 2). Practically, flattened seasonal global precipitation and evaporation maps, with the evaporation map rescaled to match global total precipitation for moisture mass balancing, serve as the predefined marginal totals for RAS. The bias-corrected adjacency matrices formed the AMSSR dataset.
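A minimal sketch of the alternating row and column scaling is given below for a small dense matrix. The function name `ras_balance`, the tolerance, the iteration cap, and the toy numbers are assumptions made for illustration; the convergence criterion actually used for the dataset is not specified here.

```python
import numpy as np

def ras_balance(A, row_totals, col_totals, tol=1e-10, max_iter=1000):
    """Alternately rescale rows and columns of A to match the given totals.

    Row totals play the role of evaporation and column totals the role of
    precipitation. tol and max_iter are illustrative values only.
    """
    A = np.array(A, dtype=float)
    row_totals = np.asarray(row_totals, dtype=float)
    col_totals = np.asarray(col_totals, dtype=float)
    for _ in range(max_iter):
        # Row step: scale each row by (target row total / current row sum).
        row_sums = A.sum(axis=1)
        r = np.divide(row_totals, row_sums,
                      out=np.zeros_like(row_sums), where=row_sums > 0)
        A *= r[:, None]
        # Column step: scale each column by (target total / current sum).
        col_sums = A.sum(axis=0)
        s = np.divide(col_totals, col_sums,
                      out=np.zeros_like(col_sums), where=col_sums > 0)
        A *= s[None, :]
        # Columns are exact after the column step, so check the rows.
        if np.max(np.abs(A.sum(axis=1) - row_totals)) < tol:
            break
    return A

# Toy example: 3 nodes, made-up evaporation and precipitation totals.
A0 = np.array([[4.0, 1.0, 1.0],
               [2.0, 3.0, 1.0],
               [1.0, 2.0, 5.0]])
precip = np.array([8.0, 5.0, 9.0])
evap = np.array([5.0, 7.0, 8.0])
# Rescale evaporation so that its total matches total precipitation.
evap = evap * precip.sum() / evap.sum()
A_bal = ras_balance(A0, evap, precip)
```

The rescaling of the evaporation totals matters: the two sets of marginal totals must have the same grand total, otherwise no matrix can satisfy both constraints and the iteration cannot converge.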
The AMSSR dataset was used to build seasonal AMSSR networks and conduct AB identification. Following Zhang et al.30, we employed Infomap, an entropy-based algorithm38, to identify flow-based communities (atmospheric moisture flows in the present case) within the network, where a community with a high regional moisture recycling ratio was designated as an AB. An AB is essentially a group of grid cells that exchange moisture predominantly with each other rather than with external regions. The regional moisture recycling ratio quantifies the proportion of moisture that is evaporated and precipitated within the entire AB region relative to the basin’s total evaporation and precipitation. Within an AB, the moisture evaporated tends to precipitate within the same basin, similar to how water flows within a terrestrial river basin. This quasi-closed behavior results in high regional moisture recycling ratios (typically exceeding 50%), indicating strong internal moisture connections. The inherent randomness in the Infomap algorithm may lead to partition results with non-negligible differences when the algorithm is executed multiple times. To address this issue, we ran Infomap 20 times for the same network with the same initial partition obtained from a pre-run of Infomap. The results of the 20 runs were polygonized to smooth out the isolated nodes and tiny communities. The boundaries of the polygons were fed to a Gaussian kernel density estimator (KDE) to calculate the probability that a given point lies on the polygon boundaries. Points with high probability are considered to be the boundaries of the identified communities. We applied a skeleton tracing algorithm (STA)39 to the KDE probability map to extract polyline data of the identified communities. The polyline data were manually adjusted by deleting the small dangles and linking the nearly continuous segments, and then resampled and smoothed. 
The refined polylines serve as the “spatially averaged” boundaries of the identified communities from the 20 runs and can smooth out the randomness inherent in Infomap.
A similar approach was employed to reduce discrepancies in community identification results across the three models. The refined polylines of the three models for the same season were fed into the KDE-STA-refinement (K-S-R) procedure, resulting in a final estimation of the boundaries of the identified communities for that season. An illustrative example of the K-S-R procedure is shown in Fig. 2. The obtained polylines were converted to polygons and then to 1° × 1° raster maps, representing the vector and raster versions of community data for AMSSR networks. For each community, we calculated the amounts of recycled, imported, and exported moisture, and the recycling ratios of evaporation and precipitation (the proportions of moisture recycled within the community with respect to its evaporation and precipitation). A community dominated by evaporation/precipitation recycling is defined as an AB.
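The budget terms of a community can be read directly from the blocks of the adjacency matrix, as the following sketch shows. The function name `community_budget` and the toy values are hypothetical, and the sketch assumes the matrix entries are additive quantities; any area weighting needed to convert depths in mm into volumes is left out.

```python
import numpy as np

def community_budget(A, labels, community):
    """Recycled, imported and exported moisture of one community.

    A[i, j] is the moisture that node i supplies to node j; labels[i] is
    the community label of node i. Entries are treated as additive, which
    is an assumption of this sketch.
    """
    A = np.asarray(A, dtype=float)
    inside = np.asarray(labels) == community
    recycled = A[np.ix_(inside, inside)].sum()    # evaporated and rained inside
    exported = A[np.ix_(inside, ~inside)].sum()   # evaporated inside, rained outside
    imported = A[np.ix_(~inside, inside)].sum()   # evaporated outside, rained inside
    evaporation = recycled + exported
    precipitation = recycled + imported
    return {
        "recycled": recycled,
        "imported": imported,
        "exported": exported,
        "evap_recycling_ratio": recycled / evaporation,
        "precip_recycling_ratio": recycled / precipitation,
    }

# Toy network: nodes 0 and 1 form community 1, node 2 forms community 2.
A = np.array([[4.0, 1.0, 0.0],
              [2.0, 3.0, 1.0],
              [1.0, 0.0, 5.0]])
budget = community_budget(A, labels=[1, 1, 2], community=1)
```

In this toy case 10 of the 11 units evaporated in community 1 also precipitate there, so both recycling ratios equal 10/11.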
We examined the stability of ABs over the study period using clustering analysis. Specifically, we constructed a dataset containing the major features of all ABs for one season. The selected features include the spatial coordinates of AB representative centroids, the spatial extents, the amounts of recycled, imported, and exported moisture, and the recycling ratios of evaporation and precipitation. We selected the hierarchical density-based spatial clustering of applications with noise (HDBSCAN)40 for its ability to filter out noise data samples (highly unstable communities in the present study). To ensure stability in the clustering results, we varied the “minimum samples” and “minimum cluster size” parameters, as noted by Zhang et al.41, within the ranges of 4–7 and 20–30, respectively. Clusters that remained stable across the various parameter settings were deemed valid. ABs associated with the same valid cluster were labelled as identical entities throughout the time period of the dataset, allowing the evolution of each persistent AB to be tracked over time.
Data Records
The AMSSRAB dataset is available from the National Tibetan Plateau/Third Pole Environment Data Center at https://doi.org/10.11888/Atmos.tpdc.30047842. The repository provides FTP access credentials for downloading the complete 505 GB dataset, along with comprehensive documentation including a data summary, file naming conventions, and usage guidelines. We recommend using Python to load and process the AMSSR and AB data. The AB vector and raster data can be quickly visualized using GIS software such as ArcGIS or QGIS.
The AMSSRAB dataset comprises two key components. First, adjacency matrices of AMSSRs are provided in the directory named AMSSR. These matrices span 40 years (1979–2018) and include seasonal multi-model averages as well as 40-year averaged seasonal matrices. Given the sparsity of these matrices (only ~20% of the elements in each matrix are non-zero values), they are stored in the form of sparse column matrices provided by SciPy (a free and open-source Python library for scientific computing) to save storage space and reduce data loading time. Each adjacency matrix is saved as a single file with the suffix “.npz”, following the naming convention Global_AMSSR_matrix_[year][season].npz for individual years (e.g., Global_AMSSR_matrix_1979MAM.npz) and Global_AMSSR_matrix_multiyear_average_[season].npz for 40-year averages, resulting in 163 (41 for MAM, JJA, and SON; 40 for DJF) “npz” files in total. We also provide geographical information of the matrices in the file AMSSR_matrix_index_lon_lat.csv, showing the 1D spatial coordinates (longitudes, latitudes) of the 65160 grid cells (AMSSR network nodes). These coordinates are ordered latitude-major from north to south, following this pattern: (−180°, 90°), (−179°, 90°), …, (178°, −90°), (179°, −90°). Correspondingly, the elements of the adjacency matrices are arranged in the same order. By indexing an adjacency matrix, e.g., the i-th row j-th column element, users can determine the AMSSR from grid cell i to j, i.e., the amount of moisture that i provides for precipitation over j. Besides facilitating climatology or inter-annual AMSSR analysis on grid-to-global scales, the storage format of the AMSSR dataset provides convenience to researchers for constructing global or regional AMSSR networks. It also simplifies the application of various network analysis tools to study the structure and dynamics of global and regional AMSSR.
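Indexing into the matrices could look like the sketch below. The helper names `node_index`, `node_lonlat`, and `load_amssr` are hypothetical; the index formula is derived only from the ordering stated above, so AMSSR_matrix_index_lon_lat.csv remains the authoritative mapping. The loader assumes the files were written with `scipy.sparse.save_npz`, which is not confirmed here.

```python
N_LON = 360   # longitudes -180, -179, ..., 179
N_LAT = 181   # latitudes 90, 89, ..., -90

def node_index(lon, lat):
    """Matrix index of the grid cell at integer (lon, lat).

    Derived from the stated ordering (north to south, then west to east
    within each latitude row); check against the CSV file before use.
    """
    lon, lat = int(lon), int(lat)
    if not (-180 <= lon <= 179 and -90 <= lat <= 90):
        raise ValueError("lon must lie in [-180, 179] and lat in [-90, 90]")
    return (90 - lat) * N_LON + (lon + 180)

def node_lonlat(index):
    """Inverse of node_index: (lon, lat) of a matrix row or column."""
    if not 0 <= index < N_LON * N_LAT:
        raise ValueError("index out of range")
    row, col = divmod(index, N_LON)
    return col - 180, 90 - row

def load_amssr(path):
    """Load one adjacency matrix, assuming it was saved with save_npz."""
    from scipy import sparse  # imported lazily; needed only for loading
    return sparse.load_npz(path)
```

For example, after `A = load_amssr("Global_AMSSR_matrix_1979MAM.npz")`, the element `A[node_index(100, 30), node_index(110, 35)]` would be the moisture that the cell at (100°E, 30°N) supplies to precipitation over the cell at (110°E, 35°N) in that season.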
Vector and raster versions of AB data are provided in the directory named AB. We store AB vector data in polygon shapefile format, with files named Global_AB_polygon_[season] for seasonal data (DJF, MAM, JJA, SON) and Global_AB_polygon_multiyear_average for climatology data. To avoid producing an excessive number of files and requiring frequent data imports, we store inter-annual AB data for each season in a single shapefile. Each of these shapefiles contains two fields: the year field, and the label field indicating which AB the polygon belongs to, where label 0 represents noise entities identified by HDBSCAN. Climatology AB vector data for all seasons are stored in one shapefile with a season field. We use NetCDF format to store AB raster data, with files named Global_AB_raster_[season].nc for seasonal data and Global_AB_raster_multiyear_average.nc for climatology. Inter-annual data for each season are stored in a single 4D NetCDF file with dimensions label, year, lat, and lon, which indicate the label type, year, latitude, and longitude of each grid cell. The raster data carry two types of labels: the label type random gives a random label for each community (including those identified as noise by HDBSCAN), and the label type clustering gives the AB labels in accordance with the label field of the vector data. The spatial resolution of the raster data is 1° × 1° for compatibility with the spatial resolution of the adjacency matrices. Climatology AB raster data for all seasons are stored in a single 3D NetCDF file with dimensions season, lat, and lon. This storage strategy for the vector and raster data allows researchers to filter easily by area of interest and time period, supporting long-term, extreme case, and climatology analysis of ABs. We recommend the vector data for qualitative exploration. 
For quantitative analyses at 1° × 1° or coarser scales, such as moisture recycling and exchange analysis of ABs (Fig. 5 of Zhang et al.30), the recommended approach is utilizing a combination of the raster data and the AMSSR adjacency data. A general overview of the entire dataset is provided in Fig. 3. Please note that due to high sparsity, the 1° × 1° adjacency matrices are aggregated to 20° × 20° for visual demonstration.
Overview of AMSSRAB dataset. For raster data, two types of labels, i.e., clustering-identified labels and randomly generated labels, are provided for each season. For polygon data, only clustering-identified labels are provided.
Technical Validation
Comparison with existing datasets
Our AMSSRAB dataset employs a multi-model ensemble averaging approach, combining three models (WAM2layers, WaterSip, and UTrack) to provide a balanced representation of atmospheric moisture flows. To validate this multi-model approach, we conducted comparative analyses with existing moisture tracking datasets (L20, T20, and RECON). The time periods of L20 and T20 cover 2001–2018 and 2008–2017 respectively. Both datasets provide monthly data, but inter-annual data are only available for L20. T20 offers two optional spatial scales (0.5° and 1°) while L20 provides a resolution of 1.5°. RECON provides annual-scale data averaged over 2008–2017 at 0.5° resolution with bias correction. We selected 2008–2017 as the comparison period and aggregated L20 and T20 to the seasonal scale. We then calculated the 10-year averages of L20 and our AMSSR data. The comparisons were conducted at grid and regional scales for L20 and T20, and at annual scale for RECON.
Figure 4 presents the comparison between our multi-model averaged AMSSR data and the existing datasets L20 and T20 across grid and regional scales. At the grid scale (Column 1), which compares individual moisture connections between every pair of 1° × 1° grid cells (3° × 3° for our data versus L20), our multi-model average shows moderate to strong correlations with both reference datasets, achieving R² values of 0.67–0.69 with T20 and 0.79–0.82 with L20 across different seasons. The scatter plot analysis reveals that T20 generally estimates higher moisture connection strengths than our multi-model average, while our dataset estimates marginally higher values than L20. The spatial correlation maps (Columns 2–3) represent how well our dataset reproduces the moisture sink patterns (where each grid cell’s evaporation precipitates) and source patterns (where the evaporation contributing to each grid cell’s precipitation originates) compared to L20 and T20. These correlation coefficients exceed 0.5 across most global regions, with lower correlations confined to polar regions and North African arid areas where moisture fluxes are minimal. At the basin scale (Column 5), we aggregated moisture flows between regional units consisting of 26 major river basins43 (identical to those used in Tuinenburg et al.23) and 10 ocean regions44, with all remaining land areas treated as an additional single region (37 regions in total). The regional-scale moisture recycling comparisons show similar bias patterns, with our estimates systematically lower than T20 but higher than L20. However, the analysis reveals distinct characteristics in the moisture connection distributions. For strong moisture connections, all datasets show high consistency, while weak connections exhibit larger uncertainties. 
Notably, weak connections show opposite bias directions, with our data tending to be higher than T20 but lower than L20, indicating that T20 tends toward stronger moisture recycling estimates, while L20 contains more numerous weak inter-regional connections. This suggests fundamental differences in how these datasets represent the spectrum of atmospheric moisture transport spanning from local recycling to long-distance moisture exchange.
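The aggregation of grid-scale flows to regional units can be sketched with a node-to-region membership matrix. The function name `aggregate_to_regions` and the toy partition are hypothetical. The sketch sums raw matrix entries, so it assumes they are additive; converting depths in mm to volumes through grid-cell areas is omitted, and whether such weighting was applied in the comparison is not stated here.

```python
import numpy as np

def aggregate_to_regions(A, region_of_node, n_regions):
    """Sum node-to-node flows into region-to-region flows.

    region_of_node[i] is the integer region id (0 .. n_regions - 1) of node
    i. The result R[p, q] is the total flow from region p to region q, and
    the diagonal R[p, p] is the moisture recycled within region p.
    """
    A = np.asarray(A, dtype=float)
    region_of_node = np.asarray(region_of_node)
    n_nodes = A.shape[0]
    # One-hot membership matrix: M[i, p] = 1 if node i belongs to region p.
    M = np.zeros((n_nodes, n_regions))
    M[np.arange(n_nodes), region_of_node] = 1.0
    return M.T @ A @ M

# Toy network: 4 nodes split into 2 regions of 2 nodes each.
A = np.arange(16, dtype=float).reshape(4, 4)
R = aggregate_to_regions(A, region_of_node=[0, 0, 1, 1], n_regions=2)
```

The aggregation conserves the total flow, so `R.sum()` equals `A.sum()`; for the full matrices a sparse membership matrix would be the natural choice.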
Comparison of multi-model averaged AMSSR data (MdlAvg) with L20 and T20. Column 1: scatter plots of each aij obtained from multi-model average versus L20 (MdlAvg-L20) and versus T20 (MdlAvg-T20), with abscissa being MdlAvg and ordinate being T20 or L20; Column 2: spatial distributions of correlation coefficients of moisture sinks (Rsink) and sources (Rsource) for MdlAvg-T20; Column 3: the same as Column 2, but for MdlAvg-L20; Column 4: latitudinal distribution of zonal averaged correlation coefficients for land and oceans; Column 5: scatter plots of regional moisture recycling amounts comparing MdlAvg with T20 and L20, with abscissa being MdlAvg and ordinate being L20 or T20. The bottom plot of Column 5 indicates the global region partition (randomly coloured). Units of abscissa and ordinate of all scatter plots are mm per season. The upper ticks of the colorbar apply to Column 1 and indicate occurrence of each scatter point, while the lower ticks apply to Columns 2 and 3. The line legends apply to Column 4.
Figure 5 presents the comparison between our multi-model averaged AMSSR data and the RECON dataset at annual scale. RECON applies an IPF correction to the T20 data. IPF rests on the same mathematical principle as our RAS bias correction, although the two corrections are applied at different temporal scales. Both methods adjust moisture flows proportionally to match evaporation and precipitation constraints. At the grid scale, our multi-model average achieves an R² value of 0.84 with RECON. The scatter plot analysis shows that RECON generally estimates higher moisture connection strengths than our multi-model average, consistent with the T20 bias pattern observed in Fig. 4. The spatial correlation maps show that correlation coefficients generally exceed 0.6 across most global regions, with weaker correlations in polar and North African desert regions, again consistent with Fig. 4. The latitudinal analysis shows consistently high correlation coefficients across low- and mid-latitudes for both land and ocean regions, with oceanic areas showing slightly better agreement. At the basin scale, the comparison achieves an R² value of 0.98, with similar bias characteristics: RECON estimates higher values for strong moisture connections, while our dataset estimates higher values for weak connections. Overall, the high consistency with RECON supports the reliability of our multi-model averaging approach and the effectiveness of matrix balancing correction methods such as IPF and RAS.
Comparison of multi-model averaged AMSSR data (MdlAvg) with RECON at annual scale. Left panels show grid-scale scatter plot and basin-level comparison, with abscissa being MdlAvg and ordinate being RECON. Right panels display spatial correlation maps for moisture sink (Rsink) and source (Rsource) patterns, with latitudinal averages for land and ocean regions. Units of scatter plots are mm per year.
To systematically assess individual model performance and uncertainties, we conducted comprehensive pairwise comparisons among all seasonal datasets, including our three models (WAM2layers, UTrack, WaterSip) and reference datasets (L20, T20), across IPCC climate reference regions45 (Fig. 6). We computed correlation coefficient maps for moisture sink (RSink) and source (RSource) patterns between model pairs for each season, then aggregated these correlations within each IPCC region. The multi-model average demonstrates enhanced performance, maintaining high correlations with all constituent models while achieving stronger agreement with reference datasets (L20 and T20), as evidenced by lower standard deviations. This confirms the ensemble approach effectively synthesizes multi-model information and enhances result reliability. Model consistency varies systematically with regional characteristics. Regions with active moisture fluxes, including tropical and subtropical monsoon areas (SAM, SAS, SCA, SEA, EAS), and low- to mid-latitude ocean regions, show exceptionally high inter-model agreement, indicating that abundant moisture flux facilitates model convergence. Conversely, substantial discrepancies occur in extreme climate regions such as polar areas (ARO, EAN, GIC, WAN) and arid zones (ARP, SAH), where minimal moisture flux amplifies relative model differences. Topographically complex regions (TIB, WCA, ECA, SWS) exhibit moderate correlations, reflecting the influence of regional-specific processes and terrain effects.
Fig. 6. Pairwise model comparison matrix across IPCC reference regions45. Each cell shows seasonal correlations for moisture sink (RSink) and source (RSource) patterns, with colors representing median correlation values. Symbols indicate standard deviation (σ) levels (× for σ > 0.2, + for 0.1 < σ ≤ 0.2). Model abbreviations: W (WAM2layers), U (UTrack), S (WaterSip), T (T20), L (L20), M (Multi-model average). The bottom map shows IPCC region locations with their acronyms: ARP (Arabian-Peninsula), CAF (Central-Africa), CAR (Caribbean), CAU (C.Australia), CNA (C.North-America), EAS (E.Asia), EAU (E.Australia), ECA (E.C.Asia), EEU (E.Europe), ENA (E.North-America), ESAF (E.Southern-Africa), ESB (E.Siberia), GIC (Greenland/Iceland), MDG (Madagascar), MED (Mediterranean), NAU (N.Australia), NCA (N.Central-America), NEAF (N.Eastern-Africa), NEN (N.E.North-America), NES (N.E.South-America), NEU (N.Europe), NSA (N.South-America), NWN (N.W.North-America), NWS (N.W.South-America), NZ (New-Zealand), RAR (Russian-Arctic), RFE (Russian-Far-East), SAH (Sahara), SAM (South-American-Monsoon), SAS (S.Asia), SAU (S.Australia), SCA (S.Central-America), SEA (S.E.Asia), SEAF (S.Eastern-Africa), SES (S.E.South-America), SSA (S.South-America), SWS (S.W.South-America), TIB (Tibetan-Plateau), WAF (Western-Africa), WCA (W.C.Asia), WCE (West&Central-Europe), WNA (W.North-America), WSAF (W.Southern-Africa), WSB (W.Siberia), ARO (Arctic-Ocean), ARS (Arabian-Sea), BOB (Bay-of-Bengal), EAN (E.Antarctica), EAO (Equatorial.Atlantic-Ocean), EIO (Equatorial.Indic-Ocean), EPO (Equatorial.Pacific-Ocean), NAO (N.Atlantic-Ocean), NPO (N.Pacific-Ocean), SAO (S.Atlantic-Ocean), SIO (S.Indic-Ocean), SOO (Southern-Ocean), SPO (S.Pacific-Ocean), WAN (W.Antarctica).
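The regional aggregation behind each cell of the comparison matrix can be sketched as follows. The function name `summarize_region` and the sample correlation values are hypothetical, and the use of the population standard deviation is an assumption (the text does not state which estimator was used). Only the median and the σ thresholds for the two symbols come from the caption.

```python
from statistics import median, pstdev


def summarize_region(correlations):
    """Return (median, sigma, symbol) for grid-cell correlations in one region.

    Symbol thresholds follow the caption: "×" for sigma > 0.2,
    "+" for 0.1 < sigma <= 0.2, and no symbol otherwise.
    Population standard deviation is an assumption of this sketch.
    """
    med = median(correlations)
    sigma = pstdev(correlations)
    if sigma > 0.2:
        symbol = "×"
    elif sigma > 0.1:
        symbol = "+"
    else:
        symbol = ""
    return med, sigma, symbol


# Hypothetical correlation values for three regions.
consistent = summarize_region([0.90, 0.85, 0.95, 0.88])
moderate = summarize_region([0.5, 0.7, 0.8, 0.6])
scattered = summarize_region([0.1, 0.5, 0.9, 0.7])
```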
The comprehensive validation analyses presented above, utilizing multiple complementary approaches and reference datasets across different spatial and temporal scales, consistently demonstrate the reliability of our multi-model ensemble methodology. The convergence of evidence from grid-scale comparisons, regional assessments, and validation against bias-corrected reconstruction data supports adopting the three-model average as our final AMSSR dataset, effectively balancing individual model strengths. Additional grid-scale comparisons between individual models are provided in supplementary materials (Figs. S1–S5) for users requiring detailed model-specific information.
Validation of AB stability
Utilizing HDBSCAN, we identified the ABs that generally persist from 1979 to 2018. The spatial distribution and moisture recycling characteristics of these persistent ABs for each season are depicted in Fig. 7. The identified ABs demonstrate remarkable persistence, with almost all having membership counts exceeding 30 throughout the study period. Certain boundaries of these ABs, such as the eastern boundaries of the AB occupying the eastern South Pacific, and the western boundaries of the AB covering southeastern South America and the South Atlantic, exhibit remarkable stability. Conversely, many of the identified ABs display considerable spatial variations in their boundaries.
Fig. 7. Spatial distribution and moisture recycling characteristics of persistent atmospheric basins (ABs) for each season (DJF, MAM, JJA, SON). Each AB for each season is labelled with a unique number, followed by its membership count in parentheses (indicating the number of years the AB was identified during the 40-year study period). The boxplots show evaporation (re, orange) and precipitation (rp, blue) recycling ratios for each numbered AB across the 40-year period, with boxes indicating interquartile ranges, median lines, and whiskers extending to the 5th and 95th percentiles. The horizontal dashed line at 50% indicates the threshold above which internal recycling dominates over external moisture exchange.
For each AB, the evaporation recycling ratio (fraction of the AB’s evaporation that precipitates within itself) and precipitation recycling ratio (fraction of the AB’s precipitation originating from its own evaporation) quantify moisture recycling intensity, where values above 50% indicate that internal recycling exceeds external moisture exchange. Most identified ABs frequently exhibit at least one type of recycling ratio exceeding 50%, with typical values ranging from 50% to 80%, validating that these regions function as semi-closed moisture circulation systems. The persistence of high recycling ratios across the 40-year study period, despite inter-annual climate variability, confirms that ABs represent stable organizational structures of the atmospheric hydrological cycle. These semi-closed systems, by preferentially recycling moisture internally, effectively partition the global atmospheric moisture field into coherent regions.
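The two recycling ratios defined above can be computed directly from a source-to-sink flow matrix. The sketch below assumes a small dense matrix held as nested lists. The function name `recycling_ratios`, the 4-cell toy matrix, and the basin membership are hypothetical and are not taken from the dataset.

```python
def recycling_ratios(flows, basin):
    """Evaporation and precipitation recycling ratios of one basin.

    flows[i][j] : moisture flow from source cell i to sink cell j
    basin       : set of cell indices belonging to the atmospheric basin
    Returns (r_e, r_p) as fractions between 0 and 1.
    """
    n = len(flows)
    # Moisture that both evaporates and precipitates inside the basin.
    internal = sum(flows[i][j] for i in basin for j in basin)
    # All moisture evaporated from basin cells, wherever it precipitates.
    evap_total = sum(flows[i][j] for i in basin for j in range(n))
    # All moisture precipitating in basin cells, wherever it evaporated.
    precip_total = sum(flows[i][j] for i in range(n) for j in basin)
    return internal / evap_total, internal / precip_total


# Hypothetical flows among four cells; cells 0 and 1 form the basin.
toy_flows = [[6.0, 2.0, 1.0, 1.0],
             [2.0, 5.0, 2.0, 1.0],
             [1.0, 1.0, 8.0, 2.0],
             [0.0, 2.0, 3.0, 7.0]]
r_e, r_p = recycling_ratios(toy_flows, {0, 1})
```

In this toy case 15 of the 20 units evaporated inside the basin also precipitate there (r_e = 0.75), and 15 of the 19 units precipitating inside it originate there (r_p ≈ 0.79), so both ratios exceed the 50% threshold.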
Usage Notes
Application demonstrations with the AMSSRAB dataset
The AMSSRAB dataset enables diverse applications from basic moisture tracking to advanced climate dynamics research. We provide three application examples demonstrating progressively more advanced uses of the dataset, from extracting moisture source-sink relationships to illustrating long-term climate responses.
Application example 1
Extracting ESHEDs and PSHEDs. We demonstrate how to extract ESHEDs and PSHEDs from the AMSSR matrix to visualize atmospheric moisture source-sink relationships (Fig. 8). The AMSSR adjacency matrix encodes moisture flows, where element a_ij represents the flow from source grid i to sink grid j. The PSHED reveals the moisture sources of a region: the columns belonging to the region are extracted and summed for each source grid i (∑_{j∈region} a_ij). The ESHED identifies the moisture sinks of a region: the rows belonging to the region are extracted and summed for each sink grid j (∑_{i∈region} a_ij). The workflow involves: (1) loading the AMSSR matrix; (2) identifying regional grid indices; (3) extracting the ESHED/PSHED through matrix operations; (4) defining ESHED/PSHED boundaries by threshold. The Tibetan Plateau JJA example demonstrates strong local recycling (>100 mm/season) as the primary moisture source. Remote moisture sources include substantial contributions from the Indian monsoon region (>10 mm/season) and additional input from westerly circulation (>1 mm/season). For moisture sinks, evaporated moisture from the Tibetan Plateau predominantly remains over the plateau itself, followed by transport to East Asia and the Northwest Pacific. The ESHED-PSHED framework enables quantification of both upstream influences on the Tibetan Plateau (e.g., how South Asian irrigation expansion affects plateau precipitation) and downstream impacts from plateau changes (e.g., how plateau vegetation shifts influence East Asian precipitation).
Fig. 8. Workflow demonstration for extracting ESHEDs and PSHEDs from the AMSSRAB dataset, using the Tibetan Plateau as an example with the multi-year average JJA (summer) AMSSR matrix. The diagram illustrates the data processing pipeline from adjacency matrix to ESHED/PSHED boundary definition. Bottom panels show spatial distributions of moisture sources and sinks (unit: mm per season), with the 90% contribution boundaries (red and blue contours) identifying core moisture source and sink regions.
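The extraction steps of application example 1 can be sketched on a toy sparse matrix. Everything named here is hypothetical: the dictionary-of-keys layout, the functions `pshed`, `eshed`, and `core_cells`, and the 4-cell flows do not reflect the dataset's actual file format or the authors' scripts. The 90% boundary is implemented under the assumption that cells are accumulated in descending order of contribution until 90% of the total is reached; the paper's own thresholding may differ in detail.

```python
def pshed(flows, region, n_cells):
    """Sources of the region's precipitation: sum of a_ij over sinks j in region."""
    out = [0.0] * n_cells
    for (i, j), value in flows.items():
        if j in region:
            out[i] += value
    return out


def eshed(flows, region, n_cells):
    """Sinks of the region's evaporation: sum of a_ij over sources i in region."""
    out = [0.0] * n_cells
    for (i, j), value in flows.items():
        if i in region:
            out[j] += value
    return out


def core_cells(contrib, fraction=0.9):
    """Cells, taken in descending order, that together reach `fraction` of the total."""
    order = sorted(range(len(contrib)), key=lambda k: contrib[k], reverse=True)
    target = fraction * sum(contrib)
    chosen, running = set(), 0.0
    for k in order:
        if running >= target:
            break
        chosen.add(k)
        running += contrib[k]
    return chosen


# Hypothetical sparse flows {(source, sink): mm per season}; cell 1 is the region.
flows = {(0, 1): 10.0, (1, 1): 100.0, (2, 1): 1.0, (3, 1): 0.5,
         (1, 0): 5.0, (1, 2): 20.0, (1, 3): 2.0, (0, 0): 50.0}
region = {1}
sources = pshed(flows, region, 4)
sinks = eshed(flows, region, 4)
core_sources = core_cells(sources)
```

Here the region's own cell and cell 0 together supply more than 90% of the region's precipitation, so they form the core source area.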
Application example 2
Temporal evolution of ABs. Building on the Tibetan Plateau example above, this case demonstrates how AMSSRAB tracks AB boundary evolution and climate responses over 40 years (Fig. 9). The Tibetan Plateau forms an independent AB during MAM and JJA, with high recycling ratios (re: 40–80%, rp: 30–50%), whereas in DJF and SON the AB extends across East Asia. Long-term trends (9-year moving averages) show that both evaporation and precipitation recycling ratios are increasing in JJA and SON, alongside general northward boundary migration. In DJF, the AB area decreases significantly, as eastern boundaries shift westward, southern boundaries move northward, and western boundaries move eastward. The dataset also captures responses to climate oscillations. During El Niño events, DJF eastern boundaries are located further east, southern boundaries further south, and precipitation recycling ratios are significantly higher compared to neutral conditions. During North Atlantic Oscillation (NAO) positive phases, MAM northern boundaries are positioned slightly more poleward. The 40-year increasing recycling trends within the Tibetan Plateau AB in JJA and SON indicate strengthening internal moisture circulation, with growing implications for regional water vulnerability. These findings demonstrate AMSSRAB’s capability to quantify regional moisture circulation responses spanning decadal trends and inter-annual climate variability.
Fig. 9. Demonstration of ABs over the Tibetan Plateau. Top row: Seasonal AB boundaries with randomly coloured years. Rows 2–4: Time series of area (10⁶ km²), evaporation recycling (re), and precipitation recycling (rp) ratios with 9-year moving averages (dashed lines); trend symbols (+/−) indicate significant trends (p-value < 0.05). Rows 5–6: North-South and East-West boundary positions displayed with separate axes for each direction, including moving averages. Rows 7–8: Boxplots of boundary positions (North, East, South, West) grouped by ENSO phases (classified using Oceanic Niño Index, https://origin.cpc.ncep.noaa.gov/products/analysis_monitoring/ensostuff/ONI_v5.php) and NAO phases (NAO+ and NAO- defined as >0.5 and <−0.5 standard deviations from mean seasonal NAO index, https://www.cpc.ncep.noaa.gov/products/precip/CWlink/pna/norm.nao.monthly.b5001.current.ascii.table); hatched areas indicate statistically significant differences (p-value < 0.05, one-way ANOVA testing whether mean values differ among climate phase categories). Bottom row: Boxplots of evaporation and precipitation recycling ratios by climate oscillation phases.
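Two of the processing steps named in this example, the 9-year moving average and the NAO phase classification at ±0.5 standard deviations, can be sketched as follows. The function names and sample values are hypothetical. The centred window and the population standard deviation are assumptions of this sketch, since the text does not specify either choice.

```python
from statistics import mean, pstdev


def moving_average(series, window=9):
    """Centred moving average; positions without a full window are None."""
    half = window // 2
    out = [None] * len(series)
    for k in range(half, len(series) - half):
        out[k] = mean(series[k - half:k + half + 1])
    return out


def nao_phase(index_values, threshold=0.5):
    """Label each seasonal NAO index value relative to the series mean.

    NAO+ / NAO- are values more than `threshold` standard deviations
    above / below the mean; everything else is labelled neutral.
    """
    mu, sd = mean(index_values), pstdev(index_values)
    phases = []
    for value in index_values:
        z = (value - mu) / sd
        if z > threshold:
            phases.append("NAO+")
        elif z < -threshold:
            phases.append("NAO-")
        else:
            phases.append("neutral")
    return phases


# Hypothetical inputs: an 11-year linear series and five NAO index values.
smoothed = moving_average([float(y) for y in range(11)])
phases = nao_phase([-2.0, -1.0, 0.0, 1.0, 2.0])
```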
Application example 3
AB variability during climate oscillations. This example demonstrates AMSSRAB’s capability to capture ENSO-driven circulation changes. The Tropical Pacific Ocean (TPO) atmospheric basin reflects moisture convergence characteristics of the Intertropical Convergence Zone (ITCZ), clearly visible in the integrated moisture transport fields (Fig. 10). The TPO basin disappears entirely during strong El Niño events (1982 SON, 1983 DJF, 1987 MAM, 1992 MAM, 1997 SON, 1998 DJF, 1998 MAM, 2015 JJA), when moisture flux anomalies reveal substantially weakened westward moisture transport that prevents TPO development, resulting in multiple smaller transient ABs emerging instead. TPO boundary boxplots show that the western boundary responds most strongly to ENSO, shifting eastward by ~20° longitude during El Niño and westward during La Niña relative to neutral conditions in three seasons (excluding MAM). These shifts correspond directly to Walker circulation modulation, with El Niño weakening westward moisture transport and La Niña enhancing it. This case demonstrates how AMSSRAB enables quantitative assessment of atmospheric moisture circulation responses to major climate oscillations.
Fig. 10. Demonstration of TPO response to ENSO events. Top row shows TPO boundaries during different ENSO phases when present, overlaid with integrated moisture transport fluxes (seasonal means when TPO present, reference arrow: 200 kg/m/s). Second row displays AB configurations during TPO absence in strong El Niño years, with moisture flux anomalies (TPO-absent minus all-season mean, reference arrow: 100 kg/m/s) showing disrupted transport patterns. Bottom panels show boxplots of TPO boundary positions (North, South, East, West) across seasons and ENSO phases (El Niño, La Niña, Neutral), with diagonal hatching indicating significant differences (p-value < 0.05, one-way ANOVA).
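The one-way ANOVA used to compare boundary positions among ENSO phases rests on the F statistic, which can be computed in a few lines. The function name `anova_f` and the western-boundary longitudes below are hypothetical and are not values from the dataset. Converting F to a p-value requires the F distribution, which this standard-library sketch omits; `scipy.stats.f_oneway` returns both quantities.

```python
def anova_f(groups):
    """One-way ANOVA F statistic for a list of sample groups."""
    all_values = [v for g in groups for v in g]
    n, k = len(all_values), len(groups)
    grand_mean = sum(all_values) / n
    group_means = [sum(g) / len(g) for g in groups]
    # Variation of group means around the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, group_means))
    # Variation of samples around their own group mean.
    ss_within = sum((v - m) ** 2
                    for g, m in zip(groups, group_means) for v in g)
    return (ss_between / (k - 1)) / (ss_within / (n - k))


# Hypothetical western-boundary longitudes (degrees east) by ENSO phase.
el_nino = [165.0, 170.0, 172.0, 168.0]
la_nina = [140.0, 145.0, 143.0, 148.0]
neutral = [150.0, 152.0, 155.0, 149.0]
f_stat = anova_f([el_nino, la_nina, neutral])
```

A large F indicates that differences among phase means are large relative to the spread within each phase, which is the condition marked by hatching in the boxplots.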
Usage considerations
The three atmospheric moisture tracking models employed in this study each contain multiple configurable parameters that influence AMSSR quantification. Selected studies from the recent literature have explored these parameter spaces across different models (Table 1), demonstrating that each model exhibits its own unique parameter sensitivities. These distinct sensitivity profiles reflect the fundamentally different numerical frameworks and physical assumptions underlying each model. These inherent differences contribute to systematic discrepancies between models that extend beyond the scope of parameter optimization within any single model. As evident in our comparisons (Figs. 4–5, Figs. S1–S5), different moisture tracking models exhibit systematic biases, as their scatter plot distributions consistently deviate from the 1:1 line. We therefore strongly encourage users of publicly available AMSSR datasets to conduct uncertainty analyses by assessing the sensitivity of their findings to data from different model sources. While our multi-model averaging approach synthesizes information from three structurally different models and maintains balanced agreement with each constituent model (Fig. 6), comprehensive resolution of model consistency issues requires the systematic international coordination exemplified by the ongoing moisture tracking model intercomparison project32.
Regarding the choice of forcing data, different AMSSR datasets have adopted various reanalysis products. While T20 adopted ERA5 with higher spatial (0.25°) and temporal (hourly) resolution, L20 and our dataset utilized ERA-Interim. Although higher-resolution forcing data can benefit Lagrangian trajectory simulations46, the practical impact remains limited. The L20 study20 found minimal differences between ERA5 and ERA-Interim across most regions. Our comparisons with ERA5-based datasets support this finding. At the seasonal scale, comparisons with T20 show correlation coefficients exceeding 0.6 across most regions for grid-level moisture patterns, and R² above 0.93 at the basin scale (Fig. 4). At the annual scale, the comparison with RECON demonstrates stronger agreement, achieving R² = 0.84 at grid scale and R² = 0.98 at basin scale (Fig. 5). These consistent results indicate that reanalysis choice has limited impact on moisture tracking accuracy when assessed at regional and seasonal scales. Zhang et al.30 further demonstrated that AB structures remain resilient to both forcing data choice (comparing ERA-Interim with MERRA-2) and spatial resolution (tests between 1.5° and 3° grids yielded adjusted mutual information of 0.84–0.88, indicating high consistency). This consistency reflects the dominance of large-scale circulation patterns in seasonal atmospheric moisture transport, whose characteristic scales far exceed typical reanalysis grid spacing and are consistently captured across different reanalysis products. Throughout the development of this 40-year AMSSRAB dataset, the established framework of ERA-Interim at 1° × 1° resolution balanced accuracy, computational efficiency, and methodological consistency.
Based on the comprehensive regional comparison presented in Fig. 6, dataset reliability varies across different geographic contexts. Regions with active moisture fluxes, including tropical and subtropical monsoon areas (South American Monsoon, South Asia, South Central America, Southeast Asia, East Asia) and low- to mid-latitude ocean regions, show exceptionally high inter-model agreement, where active atmospheric moisture exchange produces well-defined patterns that different numerical frameworks can consistently capture. Topographically complex regions (Tibetan Plateau, Central Asia, Southwest South America) exhibit moderate correlations, reflecting the influence of region-specific processes and terrain effects. The most substantial discrepancies occur in extreme climate regions such as polar areas (Arctic Ocean, Greenland/Iceland, Antarctica) and arid zones (Arabian Peninsula, Sahara), where near-zero evaporation, precipitation, and moisture fluxes create numerical instabilities during fraction allocation and bias correction, magnifying relative model differences. For regions with limited reliability, users are advised to focus on relative spatial patterns rather than absolute values and consider multi-source uncertainty analysis when necessary.
Data availability
The AMSSRAB dataset comprising atmospheric moisture source-sink relationship (AMSSR) matrices and atmospheric basin (AB) data is available at the National Tibetan Plateau/Third Pole Environment Data Center (https://doi.org/10.11888/Atmos.tpdc.300478)42. The dataset includes AMSSR adjacency matrices in sparse format and AB vector/raster data.
Code availability
All code used to construct, bias-correct, and refine the AMSSR and AB data, together with AMSSR visualization scripts and validation data, can be accessed from Zenodo (https://doi.org/10.5281/zenodo.17032120)47.
References
Findell, K. L., Keys, P. W., Berg, A. & Krasting, J. P. Rising temperatures increase importance of oceanic evaporation as a source for continental precipitation. J. Clim. 32, 14 (2019).
Gimeno, L., Nieto, R. & Sorí, R. The growing importance of oceanic moisture sources for continental precipitation. Npj Clim. Atmospheric Sci. 3, 27 (2020).
Meier, R. et al. Empirical estimate of forestation-induced precipitation changes in Europe. Nat. Geosci. 14, 473–478 (2021).
Wang, J., Feng, J. & Yan, Z. Impact of extensive urbanization on summertime rainfall in the Beijing region and the role of local precipitation recycling. J. Geophys. Res. Atmospheres 123, 3323–3340 (2018).
Xie, D. et al. Hydrological impacts of vegetation cover change in China through terrestrial moisture recycling. Sci. Total Environ. 915, 170015 (2024).
Wang-Erlandsson, L. et al. Remote land use impacts on river flows through atmospheric teleconnections. Hydrol. Earth Syst. Sci. 22, 4311–4328 (2018).
de Kok, R. J., Tuinenburg, O. A., Bonekamp, P. N. J. & Immerzeel, W. W. Irrigation as a potential driver for anomalous glacier behavior in high mountain Asia. Geophys. Res. Lett. 45, 2047–2054 (2018).
Dominguez, F., Miguez-Macho, G. & Hu, H. WRF with water vapor tracers: A study of moisture sources for the North American monsoon. J. Hydrometeorol. 17, 1915–1927 (2016).
Nusbaumer, J., Wong, T. E., Bardeen, C. & Noone, D. Evaluating hydrological processes in the Community Atmosphere Model Version 5 (CAM5) using stable isotope ratios of water. J. Adv. Model. Earth Syst. 9, 949–977 (2017).
Dirmeyer, P. A. & Brubaker, K. L. Contrasting evaporative moisture sources during the drought of 1988 and the flood of 1993. J. Geophys. Res. Atmospheres 104, 19383–19397 (1999).
van der Ent, R. J., Savenije, H. H. G., Schaefli, B. & Steele-Dunne, S. C. Origin and fate of atmospheric moisture over continents. Water Resour. Res. 46, W09525 (2010).
Sodemann, H., Schwierz, C. & Wernli, H. Interannual variability of Greenland winter precipitation sources: Lagrangian moisture diagnostic and North Atlantic Oscillation influence. J. Geophys. Res. Atmospheres 113, D03107 (2008).
van der Ent, R. J. & Savenije, H. H. G. Oceanic sources of continental precipitation and the correlation with sea surface temperature. Water Resour. Res. 49, 3993–4004 (2013).
Keys, P. W. et al. Analyzing precipitationsheds to understand the vulnerability of rainfall dependent regions. Biogeosciences 9, 733–746 (2012).
Mu, Y., Biggs, T. W. & De Sales, F. Forests mitigate drought in an agricultural region of the Brazilian Amazon: atmospheric moisture tracking to identify critical source areas. Geophys. Res. Lett. 48 (2021).
O’Connor, J. C. et al. Forests buffer against variations in precipitation. Glob. Change Biol. 27, 4686–4696 (2021).
Cui, J. et al. Global water availability boosted by vegetation-driven changes in atmospheric moisture transport. Nat. Geosci. 15, 982–988 (2022).
Pranindita, A., Wang-Erlandsson, L., Fetzer, I. & Teuling, A. J. Moisture recycling and the potential role of forests as moisture source during European heatwaves. Clim. Dyn. 58, 609–624 (2022).
Rockström, J., Mazzucato, M., Andersen, L. S., Fahrländer, S. F. & Gerten, D. Why we need a new economics of water as a common good. Nature 615, 794–797 (2023).
Link, A., van der Ent, R., Berger, M., Eisner, S. & Finkbeiner, M. The fate of land evaporation – a global dataset. Earth Syst. Sci. Data 12, 1897–1912 (2020).
van der Ent, R. J., Wang-Erlandsson, L., Keys, P. W. & Savenije, H. H. G. Contrasting roles of interception and transpiration in the hydrological cycle – Part 2: Moisture recycling. Earth Syst. Dyn. 5, 281–326 (2014).
Kalverla, P., Benedict, I., Weijenborg, C. & van der Ent, R. J. Atmospheric moisture tracking with WAM2layers v3. Geosci. Model Dev. 18, 4335–4352 (2025).
Tuinenburg, O. A., Theeuwen, J. J. E. & Staal, A. High-resolution global atmospheric moisture connections from evaporation to precipitation. Earth Syst. Sci. Data 12, 3177–3188 (2020).
Tuinenburg, O. A. & Staal, A. Tracking the global flows of atmospheric moisture and associated uncertainties. Hydrol. Earth Syst. Sci. 24, 2419–2435 (2020).
De Petrillo, E., Monaco, L., Tuninetti, M., Staal, A. & Laio, F. Cell-scale atmospheric moisture flows dataset reconciled with ERA5 reanalysis. Sci. Data 12, 629 (2025).
Li, Y. et al. Contribution of Tibetan Plateau ecosystems to local and remote precipitation through moisture recycling. Glob. Change Biol. 29, 702–718 (2023).
Hoek van Dijke, A. J. et al. Shifts in regional water availability due to global tree restoration. Nat. Geosci. 15, 363–368 (2022).
Krönke, J. et al. Dynamics of tipping cascades on complex networks. Phys. Rev. E 101, 042311, https://doi.org/10.1103/PhysRevE.101.042311 (2020).
Wunderling, N. et al. Recurrent droughts increase risk of cascading tipping events by outpacing adaptive capacities in the Amazon rainforest. Proc. Natl. Acad. Sci. 119, e2120777119 (2022).
Zhang, Y. et al. Atmospheric basins: identification of quasi‐independent spatial patterns in the global atmospheric hydrological cycle via a complex network approach. J. Geophys. Res. Atmospheres 125, e2020JD032796 (2020).
Li, Y. et al. Unraveling the discrepancies between Eulerian and Lagrangian moisture tracking models in monsoon- and westerly-dominated basins of the Tibetan Plateau. Atmospheric Chem. Phys. 24, 10741–10758 (2024).
Benedict, I., Weijenborg, C., van der Ent, R., Keune, J. & Koren, G. Moisture tracking intercomparison: time to address the uncertainty. in (Lorentz Center, Leiden, Netherlands, 2024).
Sodemann, H. The Lagrangian moisture source and transport diagnostic WaterSip V3.2. EGUsphere 1–47, https://doi.org/10.5194/egusphere-2025-574 (2025).
Dee, D. P. et al. The ERA-Interim reanalysis: configuration and performance of the data assimilation system. Q. J. R. Meteorol. Soc. 137, 553–597 (2011).
Sprenger, M. & Wernli, H. The LAGRANTO Lagrangian analysis tool – version 2.0. Geosci. Model Dev. 8, 2569–2586 (2015).
Tuinenburg, O. A., Hutjes, R. W. A. & Kabat, P. The fate of evaporated water from the Ganges basin. J. Geophys. Res. Atmospheres 117, D01107 (2012).
Trinh, B. & Phong, N. V. A Short Note on RAS Method. Adv. Manag. Appl. Econ. 3, 133–137 (2013).
Rosvall, M. & Bergstrom, C. T. Maps of random walks on complex networks reveal community structure. Proc. Natl. Acad. Sci. 105, 1118–1123 (2008).
Huang, L. Skeleton Tracing: A new algorithm for retrieving topological skeleton as a set of polylines from binary images, GitHub [code]. 5 (2021).
Campello, R. J. G. B., Moulavi, D. & Sander, J. Density-based clustering based on hierarchical density estimates. in Advances in Knowledge Discovery and Data Mining (eds Pei, J., Tseng, V. S., Cao, L., Motoda, H. & Xu, G.) vol. 7819 160–172 (Springer Berlin Heidelberg, Berlin, Heidelberg, 2013).
Zhang, Y., Huang, W. & Zhong, D. Major moisture pathways and their importance to rainy season precipitation over the Sanjiangyuan region of the Tibetan Plateau. J. Clim. 32, 6837–6857 (2019).
Zhang, Y. & Zhong, D. A dataset of global atmospheric moisture source-sink relationships and atmospheric basins based on multiple atmospheric moisture tracking models. https://doi.org/10.11888/Atmos.tpdc.300478 (2023).
Schewe, J. & Schmied, H. M. DDM30 river routing network for ISIMIP3. https://doi.org/10.48364/ISIMIP.865475 (2022).
Flanders Marine Institute. Global Oceans and Seas, version 1. https://doi.org/10.14284/542 (2021).
Iturbide, M. et al. An update of IPCC climate reference regions for subcontinental analysis of climate model data: definition and aggregated datasets. Earth Syst. Sci. Data 12, 2959–2970 (2020).
Hoffmann, L. et al. From ERA-Interim to ERA5: the considerable impact of ECMWF’s next-generation reanalysis on Lagrangian transport simulations. Atmospheric Chem. Phys. 19, 3097–3124 (2019).
Zhang, Y. & Zhong, D. Code for generating the multi-model based dataset of global atmospheric moisture source-sink relationships and atmospheric basins. Zenodo https://doi.org/10.5281/zenodo.17032120 (2025).
Zhang, C., Chen, D., Tang, Q. & Huang, J. Fate and Changes in Moisture Evaporated From the Tibetan Plateau (2000–2020). Water Resour. Res. 59, e2022WR034165 (2023).
Fremme, A., Hezel, P. J., Seland, Ø. & Sodemann, H. Model-simulated hydroclimate in the East Asian summer monsoon region during past and future climate: a pilot study with a moisture source perspective. Weather Clim. Dyn. 4, 449–470 (2023).
Cheng, T. F. & Lu, M. Global Lagrangian Tracking of Continental Precipitation Recycling, Footprints and Cascades. J. Clim. 1, 1–45 (2022).
Acknowledgements
This study was funded by the National Natural Science Foundation of China (grant no. 52209026), the Topology of Hydrosphere Project from the State Key Laboratory of Hydroscience and Engineering (grant no. sklhse-TD-2024-F01), and the Second Tibetan Plateau Scientific Expedition and Research Program (grant no. 2019QZKK0208). We also acknowledge the National Supercomputer Center in Tianjin, China, and the Beijing Super Cloud Computing Center, China, for providing high-performance computing resources that have contributed to the research.
Author information
Contributions
Y.Z. and D.Z. conceived and designed the research and provided financial support. Y.Z. wrote the codes, analysed the data, and created the figures. Y.Z. and R.C. prepared the dataset. D.X., Y.C. and Y.M. conducted the investigation. Y.Z. wrote the initial draft. D.Z., R.C., M.Z. and Y.T. contributed to editing and revising the draft. D.Z. provided resources and supervision.
Ethics declarations
Competing interests
The authors declare no competing interests.
Additional information
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.
About this article
Cite this article
Zhang, Y., Zhong, D., Cai, R. et al. A multi-model based dataset of global atmospheric moisture source-sink relationships and atmospheric basins. Sci Data 12, 1842 (2025). https://doi.org/10.1038/s41597-025-06123-0
DOI: https://doi.org/10.1038/s41597-025-06123-0