Abstract
The scale height (SH) is the height increment over which a parameter decreases to 36.8% (1/e) of its value at a given height. Here we present ERA5-SH, a gridded dataset containing the SH values of six key tropospheric parameters (PWV, WVD, Tm, ZTD, ZHD and ZWD) based on the ERA5 reanalysis from 2013 to 2022, with a temporal resolution of 1 hour and a spatial resolution of 1°. The dataset was generated using numerical integration and exponential fitting, and exhibits high reliability, with mean coefficients of determination of 0.991, 0.957, 0.980, 0.999, 0.999, and 0.995, respectively. Using globally distributed radiosonde sites as references, the mean RMSEs for the six parameters were 0.243 km, 0.189 km, 3.290 km, 0.879 km, 0.681 km, and 0.263 km, respectively. This dataset will contribute to a deeper understanding of the tropospheric vertical distribution and help improve the accuracy of atmospheric delay modeling, both of which are vital for the advancement of high-precision Earth observation technologies.
Background & Summary
The troposphere, as the layer of the atmosphere closest to the Earth’s surface, contains about 75% of the mass of the atmosphere and over 90% of the water vapor mass. When an electromagnetic signal traverses the troposphere, its speed changes and its path is bent. These changes, together with the slant distance along the signal propagation path, collectively produce what is known as tropospheric delay. Tropospheric delay significantly affects GNSS navigation and positioning, remote sensing, satellite altimetry and related applications, which hinders the advancement of high-precision Earth observation services1,2. Tropospheric delay is typically divided into two components: the hydrostatic delay and the wet delay caused by water vapor. Water vapor, being a highly dynamic constituent of the atmosphere, shows notable temporal variations in content and distribution, which makes the tropospheric wet delay difficult to characterize accurately3. Tropospheric water vapor affects the transmittance of visible and near-infrared bands and causes attenuation and delay of microwave signals, which significantly affects the scattering and absorption of radar signals4. Similarly, in the field of GNSS, the troposphere introduces delays in electromagnetic signals, thereby affecting the accuracy of positioning5. Since the composition of the troposphere changes with altitude, its effects vary significantly across different vertical layers. Accurately describing the vertical distribution of tropospheric components, particularly water vapor in the zenith direction, is crucial for improving atmospheric models and GNSS positioning accuracy. Research has shown that zenith tropospheric delay and water vapor content generally follow a negative exponential distribution with altitude6,7. The scale height concept provides a useful quantitative measure for describing this distribution.
A comprehensive analysis of scale height aids in understanding the structure and variability of the tropospheric atmosphere, facilitating the development of more precise tropospheric delay models. This study explores the use of scale height to describe the vertical structure of tropospheric parameters. We assess the feasibility of applying this method globally and publish a comprehensive dataset on the scale heights of key tropospheric parameters.
In the troposphere, six key parameters play a vital role in Earth observation, particularly for applications like GNSS meteorology. These parameters include the tropospheric zenith total delay (ZTD) and its two components, the zenith hydrostatic delay (ZHD) and zenith wet delay (ZWD), water vapor density (WVD), precipitable water vapor (PWV), and the weighted mean temperature (Tm). Table 1 provides detailed information about these six parameters. In addition, the vertical distribution of these six key tropospheric parameters closely follows a negative exponential pattern, which makes the concept of scale height well suited to describing their vertical structure.
With the development of high-precision Earth observation systems, the ZTD has become a key parameter of interest to researchers across various disciplines, including Global Navigation Satellite Systems (GNSS), microwave remote sensing, and radar detection. Accurate estimation and modeling of ZTD is essential to mitigate the effects of tropospheric delay and improve the accuracy of GNSS positioning and navigation solutions8. Based on this, a large number of tropospheric delay correction models have been established, such as the Hopfield, Saastamoinen and Black models, which require measured meteorological parameters9,10,11, and empirical models such as the GZTD, IGGtrop and GPT series12,13,14. At the same time, global gridded products such as TUW-VMF3 and GFZ-VMF3 provide ZTD, ZHD and other parameters. Li et al. conducted a global spatiotemporal assessment of these two tropospheric products, from the Vienna University of Technology (TU Wien) and the GeoForschungsZentrum Potsdam (GFZ), respectively15. Nevertheless, most existing models and products focus predominantly on the two-dimensional plane and often overlook the variation of parameters such as ZTD with elevation. Wang et al. refined the vertical model of ZTD using numerical weather models16. Taking altitude-related correction into account, Zhao et al. proposed a high-precision ZTD model17. By establishing a four-layer ZTD scale height model, Zhang et al. significantly improved the convergence speed of precision single point positioning18. Considering the elevation difference of ZTD, Zhao et al. proposed a high-precision ZTD interpolation method19. In contemporary positioning and modeling methodologies, the vertical distribution characteristics of parameters such as ZTD and ZHD are increasingly recognized as critical. Consequently, it is essential to develop a dataset that delineates the scale height of these key parameters.
Water vapor is one of the most significant and challenging parameters affecting ZTD, and high-precision monitoring of water vapor is also a key focus of GNSS meteorology20. As the vertical distribution of water vapor generally follows a negative exponential pattern, the scale height of water vapor serves as a valuable metric for characterizing its vertical structure. In the field of GNSS meteorology, the study of scale heights for key variables such as water vapor density, PWV and Tm is equally critical. Using the water vapor density scale height as a vertical constraint can improve the precision of GNSS water vapor tomography21,22. PWV plays an important role in warnings of extreme precipitation; high-precision GNSS-PWV inversion supports accurate forecasting and also helps calibrate large-scale remote sensing PWV data23. Ding et al. developed an empirical model for PWV vertical adjustment24. Tm is a key conversion factor in the GNSS-PWV inversion, and further investigation into its vertical distribution can yield more accurate PWV products25,26. Yang et al. established a refined empirical model for Tm using error compensation techniques27. Relevant research indicates that incorporating the Tm scale height can significantly improve modeling accuracy28, and high-precision Tm scale height data can lead to more accurate GNSS-PWV retrieval results29. Both water vapor density and PWV scale heights offer valuable insights into the vertical distribution of water vapor, which is closely tied to the turbulent structure of the atmosphere. By examining this vertical distribution, researchers can better understand the interaction between water vapor and atmospheric turbulence, thereby gaining deeper insights into the transport and transformation mechanisms of water vapor within the atmospheric boundary layer30.
Scale height plays a critical role in various fields, and numerous studies have confirmed its significance in enhancing model accuracy. As an indicator representing the vertical distribution characteristics of parameters, scale height also offers new perspectives for advancing tropospheric detection techniques and atmospheric modeling. However, parametric scale height products remain limited, and our understanding of scale height is still inadequate. In this study, we developed a six-parameter scale height dataset, ERA5-SH, derived from the profile values of fundamental meteorological parameters provided by ERA5. This product includes six key parameters relevant to Earth observation and employs rigorous data screening techniques to ensure its accuracy. The dataset’s accuracy was validated using data from over 400 active sounding sites worldwide. Additionally, the characteristics of ERA5-SH are analyzed in detail, with an example provided demonstrating how ZTDSH can enhance the accuracy of spatial interpolation.
Methods
Data acquisition
ERA5
The ERA5 dataset, produced by the European Centre for Medium-Range Weather Forecasts (ECMWF), is a comprehensive global meteorological dataset offering reanalysis of atmospheric and surface variables spanning from 1979 to the present. Leveraging sophisticated numerical models, data assimilation techniques, and information from diverse observational sources, ERA5 re-simulates and evaluates historical meteorological conditions31. The scale height product derived from the ERA5 reanalysis is resolved on a 360 × 181 (longitude × latitude) grid with a spatial resolution of 1° × 1° and a temporal resolution of 1 hour. This product utilizes profile data for temperature, geopotential, relative humidity, and specific humidity across 37 pressure levels ranging from 1000 hPa to 1 hPa.
It is essential to recognize that the geopotential provided by ERA5 is referenced to mean sea level (MSL) and adjusted for gravitational variations. However, the vertical coordinates typically used in GNSS and related fields are ellipsoidal heights relative to a reference ellipsoid. Therefore, geopotential heights must be converted to ellipsoidal heights for further processing. The geopotential height hd is first calculated using the formula32:
\({h}_{d}=\frac{Z}{{g}_{n}}\)
where Z is the geopotential and gn = 9.80665 m/s² is the standard gravity constant. The orthometric height horth is then derived by considering the radius of curvature of the meridian and the gravitational acceleration at the specific latitude. Finally, the ellipsoidal height hel is obtained by adding the geoid undulation, which can be calculated using the Earth Gravitational Model 2008 (EGM2008).
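The first and last steps of this conversion can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the function names are hypothetical, and the latitude-dependent conversion from geopotential height to orthometric height is left out because its exact form is not given here.

```python
STANDARD_GRAVITY = 9.80665  # m/s^2, the constant g_n quoted in the text


def geopotential_height(geopotential):
    """Geopotential height h_d (m) from the ERA5 geopotential Z (m^2/s^2)."""
    return geopotential / STANDARD_GRAVITY


def ellipsoidal_height(orthometric_height, geoid_undulation):
    """Ellipsoidal height (m) = orthometric height + geoid undulation.

    The undulation would come from a geoid model such as EGM2008;
    here it is simply passed in as a number (assumption for illustration).
    """
    return orthometric_height + geoid_undulation
```

For example, a geopotential of 9806.65 m²/s² corresponds to a geopotential height of 1000 m.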
To obtain meteorological parameters at the surface, adjustments to the vertical distribution of ERA5-provided meteorological data are required. Specifically, the vertical correction involves accounting for the elevation difference between the ground level and the isobaric surface heights in the ERA5 dataset. Linear interpolation is used for temperature and relative humidity to estimate values at the ground level. For pressure, assuming the ground layer lies between the k and k-1 th pressure levels, the interpolation is performed according to the following formula33:
where Pground, Pk-1 and Pk denote the pressure at the ground, the (k-1)th and the kth pressure levels, respectively. Hground, Hk-1 and Hk are the ellipsoidal heights at the ground, the (k-1)th and the kth pressure levels, respectively.
If the surface ellipsoidal height is lower than the lowest pressure level, the extrapolation is performed using the following formula:
where Pground and P0 denote the pressure at the ground and at the lowest pressure level, respectively. Hground and H0 are the ellipsoidal heights at the ground and at the lowest pressure level, respectively. Tv is the virtual temperature, which can be calculated as follows:
where T is the temperature and q represents the specific humidity.
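A rough Python sketch of this vertical pressure adjustment is given below. The forms used are assumptions rather than the formulas of the cited references: log-linear interpolation of pressure with height between two levels, a hypsometric extrapolation below the lowest level, and the common approximation Tv = T(1 + 0.608q). The dry-air gas constant and all function names are likewise illustrative.

```python
import math

GRAVITY = 9.80665  # m/s^2
R_DRY = 287.05     # J/(kg K), specific gas constant of dry air (assumed value)


def virtual_temperature(temp_k, q):
    """Virtual temperature (K) from temperature (K) and specific humidity (kg/kg)."""
    return temp_k * (1.0 + 0.608 * q)


def interpolate_pressure(h_ground, h_a, p_a, h_b, p_b):
    """Pressure at h_ground, assuming ln(p) varies linearly with height
    between two bracketing levels (h_a, p_a) and (h_b, p_b)."""
    frac = (h_ground - h_a) / (h_b - h_a)
    return p_a * (p_b / p_a) ** frac


def extrapolate_pressure(h_ground, h0, p0, tv):
    """Pressure at a surface below the lowest level (h0, p0), using the
    hypsometric relation with a constant virtual temperature tv (K)."""
    return p0 * math.exp(-GRAVITY * (h_ground - h0) / (R_DRY * tv))
```

Heights are in metres and pressures may be in any consistent unit, since only pressure ratios enter the expressions.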
Radiosonde
The scale height product was validated using daily data obtained from 818 sounding stations worldwide, sourced from the University of Wyoming Weather Data website (http://weather.uwyo.edu/upperair/sounding.html). These datasets encompass meteorological parameters such as temperature, pressure, and relative humidity, spanning from the Earth’s surface up to an altitude of around 30 km. It is important to note that the radiosonde data are collected at two distinct times, namely 12:00 UTC and 00:00 UTC.
Sounding data provide high vertical resolution and detection accuracy, making them valuable for atmospheric studies. However, some stations suffer from limited time resolution, significant data gaps, and insufficient detection heights, which can hinder effective use of the data. To overcome these challenges, rigorous quality control measures were applied to ensure the reliability and consistency of the sounding data, and 587 of the 818 stations were finally selected for validation. These measures include the following principles:
1. The altitude of the final valid record in the radiosonde data must be no less than 10 km to ensure sufficient vertical coverage.
2. The number of valid observation levels in the radiosonde data should be at least 20 to provide adequate vertical detail.
3. The vertical spacing between two consecutive altitude layers must not exceed 2 km to maintain a smooth vertical profile.
4. The pressure differential between any two successive levels should not exceed 200 hPa to avoid abrupt changes and ensure data consistency.
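The four principles above can be expressed as a single profile check. The sketch below is illustrative only: the function name and the input layout (valid records ordered from the surface upward) are assumptions, while the thresholds are those stated in the list.

```python
def passes_quality_control(heights_km, pressures_hpa):
    """Return True if a radiosonde profile satisfies the four screening rules.

    heights_km and pressures_hpa hold the valid records only,
    ordered from the surface upward.
    """
    n = len(heights_km)
    # Rule 2: at least 20 valid observation levels.
    if n != len(pressures_hpa) or n < 20:
        return False
    # Rule 1: the final valid record must reach at least 10 km.
    if heights_km[-1] < 10.0:
        return False
    for i in range(1, n):
        # Rule 3: consecutive levels no more than 2 km apart.
        if heights_km[i] - heights_km[i - 1] > 2.0:
            return False
        # Rule 4: successive pressure levels no more than 200 hPa apart.
        if abs(pressures_hpa[i - 1] - pressures_hpa[i]) > 200.0:
            return False
    return True
```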
Production process of ERA5-SH
The six scale height datasets derived from ERA5 are produced and validated according to the flowchart shown in Fig. 1. First, the geopotential data from ERA5 were converted into ellipsoidal heights, yielding the ellipsoidal height of each pressure level. Then, the meteorological parameters (temperature, specific humidity, etc.) at each pressure level were interpolated or extrapolated to the ellipsoidal height of the surface, producing meteorological parameter profiles that start from the surface. Subsequently, these meteorological parameters were used in numerical computations to extract the profiles of the six research parameters (ZTD, ZWD, ZHD, WVD, PWV, Tm). Finally, the six scale heights were determined by least squares parameter fitting.
To validate the accuracy of the produced ERA5-SH, the meteorological profiles from 587 sounding stations worldwide in 2022 were selected as measured data to conduct the assessment. Initially, stringent quality control measures were applied to the radiosonde data, eliminating stations with subpar data quality. Utilizing the meteorological parameter profiles provided by the remaining stations, akin to the production process of ERA5-SH data, the scale height of parameters at the sounding stations was calculated. As the ERA5-SH data is grid-based, bilinear interpolation was performed to obtain the scale height data at the sonde station. Subsequently, the accuracy of the data was assessed by comparing the interpolated ERA5-SH data with the calculated data at the sounding site.
Numerical calculation for the layered tropospheric parameters
In this section, we will outline the calculation methodology for six parameters and acquire the profile data of these parameters utilizing the ERA5 and sounding data.
The ZTD is the sum of two components, the ZHD and the ZWD34:
\({ZTD}={ZHD}+{ZWD}\)
The ZWD can be expressed as an integral of the wet refractive index (Nw) in the vertical direction:
where \({k}_{2}^{{\prime} }\approx 71.2952\) and \({k}_{3}\approx 64.79\) are the coefficients of Nw, T is the temperature in kelvin, and e is the water vapor pressure, which can be calculated by the modified Magnus formula32:
where rh represents relative humidity and Tc is temperature, measured in degrees Celsius.
Similarly, the ZHD can be expressed as an integral of the hydrostatic refractive index in the vertical direction:
where \({k}_{1}\approx 77.689\) is the coefficient of Nh and p denotes pressure.
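The vertical integration can be approximated numerically with the trapezoidal rule. The sketch below computes, for every level of a profile, the delay accumulated between that level and the profile top, which is the layered quantity later used for scale height fitting. It assumes Nh = k1·p/T with p in hPa and T in kelvin, which is a simplification introduced here for illustration; the function names are hypothetical.

```python
K1 = 77.689  # K/hPa, coefficient quoted in the text


def hydrostatic_refractivity(p_hpa, temp_k):
    """Hydrostatic refractivity N_h, assuming N_h = k1 * p / T."""
    return K1 * p_hpa / temp_k


def layered_zenith_delay(heights_m, refractivity):
    """Zenith delay (m) from each level up to the profile top.

    Uses the trapezoidal rule on refractivity (N units) sampled at
    heights_m (m, increasing); delay[i] = 1e-6 * integral from h_i to top.
    """
    n = len(heights_m)
    delay = [0.0] * n
    for i in range(n - 2, -1, -1):
        mean_n = 0.5 * (refractivity[i] + refractivity[i + 1])
        thickness = heights_m[i + 1] - heights_m[i]
        delay[i] = delay[i + 1] + 1e-6 * mean_n * thickness
    return delay
```

The same integration routine applies to the wet refractivity when computing layered ZWD values.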
The WVD can be calculated according to the following formula:
where \({R}_{v}=461.5J/\left({kg}* K\right)\) denotes the specific gas constant of water vapor.
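As an illustration, the water vapor density follows from the ideal gas law for water vapor. The sketch assumes the form ρv = e/(Rv·T) with e converted from hPa to Pa; the function name and the choice of kg/m³ as output unit are assumptions.

```python
R_V = 461.5  # J/(kg K), specific gas constant of water vapor


def water_vapor_density(e_hpa, temp_k):
    """Water vapor density (kg/m^3) from vapor pressure (hPa) and temperature (K)."""
    return (e_hpa * 100.0) / (R_V * temp_k)
```

A vapor pressure of 10 hPa at 288.15 K gives roughly 0.0075 kg/m³, that is, about 7.5 g/m³.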
The PWV can be calculated accurately by the following formula based on the stratified meteorological data:
where \({\rho }_{w}\approx 999{kg}/{m}^{3}\) is the density of liquid water, \(g\approx 9.80665m/{s}^{2}\) is the gravitational acceleration, i denotes the i th pressure level and n is the total number of layers. qi and Δpi represent the specific humidity and the pressure difference of the i th pressure level, respectively. The specific humidity can be calculated by the following formula:
The Tm can be calculated accurately by the following formula based on the stratified meteorological data35:
where Δhi denotes the thickness of the i th pressure level.
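The layered sums for PWV and Tm can be sketched as follows. These are assumed standard forms, not a reproduction of the source equations: q = 0.622e/(p − 0.378e), PWV = Σ q·Δp/(ρw·g), and Tm = Σ(e/T)Δh / Σ(e/T²)Δh, each evaluated with layer-mean values. Function names are hypothetical.

```python
RHO_WATER = 999.0  # kg/m^3, density of liquid water
GRAVITY = 9.80665  # m/s^2


def specific_humidity(e_hpa, p_hpa):
    """Specific humidity (kg/kg) from vapor pressure and total pressure (hPa)."""
    return 0.622 * e_hpa / (p_hpa - 0.378 * e_hpa)


def precipitable_water_mm(q_levels, p_levels_hpa):
    """PWV (mm) from specific humidity and pressure at successive levels."""
    total = 0.0
    for i in range(len(p_levels_hpa) - 1):
        q_mean = 0.5 * (q_levels[i] + q_levels[i + 1])
        dp_pa = abs(p_levels_hpa[i] - p_levels_hpa[i + 1]) * 100.0
        total += q_mean * dp_pa
    # Column water mass (kg/m^2) -> depth of liquid water (m) -> mm.
    return total / (RHO_WATER * GRAVITY) * 1000.0


def weighted_mean_temperature(e_levels, t_levels, heights_m):
    """Weighted mean temperature Tm (K) from layered e (hPa), T (K), h (m)."""
    num = 0.0
    den = 0.0
    for i in range(len(heights_m) - 1):
        dh = heights_m[i + 1] - heights_m[i]
        e_mean = 0.5 * (e_levels[i] + e_levels[i + 1])
        t_mean = 0.5 * (t_levels[i] + t_levels[i + 1])
        num += e_mean / t_mean * dh
        den += e_mean / t_mean ** 2 * dh
    return num / den
```

In an isothermal column Tm equals the air temperature, which gives a quick sanity check on any implementation.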
Scale height fitting
The vertical distribution of the parameters follows a negative exponential function:
\({pv}=p{v}_{s}\cdot \exp \left(-\frac{h-{h}_{s}}{{SH}}\right)\)
where SH represents the scale height, \(p{v}_{s}\) is the surface parameter value, \({pv}\) is the parameter value at a height of h, and hs is the ground ellipsoidal height.
With the profile data for each parameter obtained from the preceding numerical calculation, the scale height (SH) is determined through exponential least squares fitting using the Levenberg-Marquardt method.
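To make the fitting step concrete, the sketch below estimates SH from a profile. It deliberately swaps in a simpler technique than the Levenberg-Marquardt fit named above: a closed-form least squares fit in log space with the surface value held fixed, which needs only the standard library. It therefore weights the levels differently and requires strictly positive profile values. The function names are hypothetical.

```python
import math


def fit_scale_height(heights_km, values):
    """Scale height (km) from a profile, assuming pv = pv_s * exp(-(h - h_s) / SH).

    Log-linear least squares through the surface point (index 0);
    all values must be strictly positive.
    """
    hs, pvs = heights_km[0], values[0]
    sxx = 0.0
    sxy = 0.0
    for h, v in zip(heights_km[1:], values[1:]):
        x = h - hs
        y = math.log(v / pvs)
        sxx += x * x
        sxy += x * y
    return -sxx / sxy


def r_squared(heights_km, values, scale_height):
    """Coefficient of determination of the exponential model in value space."""
    hs, pvs = heights_km[0], values[0]
    model = [pvs * math.exp(-(h - hs) / scale_height) for h in heights_km]
    mean = sum(values) / len(values)
    ss_res = sum((v - m) ** 2 for v, m in zip(values, model))
    ss_tot = sum((v - mean) ** 2 for v in values)
    return 1.0 - ss_res / ss_tot
```

A nonlinear solver would instead minimize the residuals in value space, so the two estimates agree exactly only when the profile is perfectly exponential.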
Periodic fitting of time series
Scale height time series with annual and semi-annual cycles are modeled with trigonometric functions of the following form36:
where \({a}_{0},{a}_{1},{a}_{2},{a}_{3},{a}_{4}\) are the model coefficients.
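A five-coefficient periodic model of this kind can be fitted by ordinary least squares. The sketch below assumes the form a0 + a1·cos(2πt/365.25) + a2·sin(2πt/365.25) + a3·cos(4πt/365.25) + a4·sin(4πt/365.25), with t in days; the ordering of the cosine and sine terms and all function names are assumptions, since the original equation is not reproduced here.

```python
import math


def design_row(day):
    """Basis functions of the assumed annual plus semi-annual model."""
    w = 2.0 * math.pi * day / 365.25
    return [1.0, math.cos(w), math.sin(w), math.cos(2.0 * w), math.sin(2.0 * w)]


def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def fit_periodic(days, values):
    """Least squares estimate of [a0, a1, a2, a3, a4] via the normal equations."""
    rows = [design_row(d) for d in days]
    k = 5
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    atb = [sum(r[i] * v for r, v in zip(rows, values)) for i in range(k)]
    return solve_linear(ata, atb)
```

The series should span at least one full year so that the annual and semi-annual terms are well separated.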
Accuracy evaluation indicators
In this study, we selected four indicators, namely RMSE (Root Mean Square Error), RRMSE (Relative Root Mean Square Error), Bias and R2 (coefficient of determination), to evaluate the accuracy of ERA5-SH. These can be calculated using the following formulas:
where SH and SHR represent the ERA5-SH value and the reference value, respectively, and i and n denote the i th value and the total number of samples.
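For reference, the four indicators can be computed as below. The definitions are assumed conventional ones, in particular RRMSE taken as RMSE divided by the mean reference value and expressed in percent, and may differ in detail from the formulas used in this study. The function name is hypothetical.

```python
import math


def accuracy_metrics(sh, sh_ref):
    """RMSE, RRMSE (%), Bias and R2 of sh against the reference sh_ref."""
    n = len(sh)
    diffs = [a - b for a, b in zip(sh, sh_ref)]
    bias = sum(diffs) / n
    ss_res = sum(d * d for d in diffs)
    rmse = math.sqrt(ss_res / n)
    mean_ref = sum(sh_ref) / n
    rrmse = 100.0 * rmse / mean_ref
    ss_tot = sum((b - mean_ref) ** 2 for b in sh_ref)
    r2 = 1.0 - ss_res / ss_tot
    return {"rmse": rmse, "rrmse": rrmse, "bias": bias, "r2": r2}
```

With this definition R2 can become negative when the estimates fit the reference worse than its own mean does.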
Data Records
The ERA5-SH dataset is divided into two parts, which can be accessed via https://doi.org/10.5281/zenodo.14676025 (ZTDSH, ZHDSH, ZWDSH)37 and https://doi.org/10.5281/zenodo.14679394 (PWVSH, WVSH, and TmSH)38. Data for each parameter are stored annually in a .mat file, which contains a structure named after the corresponding parameter. The files are compressed using linear quantization: the structure includes two fields, “Scale” and “Offset,” for data decompression, and an int16-type field named “Data” for storing the data. “Data” is a three-dimensional matrix whose dimensions represent longitude (starting from 0°), latitude (from 90°N to 90°S), and time (hourly, starting from 00:00 on January 1st), respectively. Additionally, there is a field named “Max_error,” which represents the maximum compression error. Each .mat file is approximately 1 GB in size.
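A reader of these files needs two small steps: locating a grid cell and undoing the quantization. The sketch below assumes the usual linear convention, value = Data × Scale + Offset, which should be checked against the dataset documentation; it also assumes a 1° grid and zero-based indices (MATLAB indices are one higher). Loading the .mat file itself is omitted, and the function names are hypothetical.

```python
def decode_scale_height(raw_value, scale, offset):
    """Recover a physical value from a stored int16 value.

    Assumes value = raw * Scale + Offset (not confirmed by the dataset description).
    """
    return raw_value * scale + offset


def grid_indices(lon_deg, lat_deg, hours_since_year_start):
    """Zero-based (longitude, latitude, time) indices of the nearest grid node.

    Longitude starts at 0 deg and increases eastward; latitude runs
    from 90 deg N (index 0) to 90 deg S (index 180); time is hourly.
    """
    lon_idx = int(round(lon_deg % 360.0)) % 360
    lat_idx = int(round(90.0 - lat_deg))
    return lon_idx, lat_idx, int(hours_since_year_start)
```

For instance, the node at 120°E, 30°N at 00:00 on January 1st maps to indices (120, 60, 0).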
Technical Validation
Accuracy assessment
The coefficient of determination R2 reflects the goodness of the exponential fit used to obtain each scale height. Figure 2 illustrates the mean R2 of the six scale heights from 2013 to 2022. Each scale height effectively reflects the vertical distribution of the corresponding tropospheric parameter in most regions, with R2 above 0.95 overall. The R2 values for PWVSH and ZWDSH show analogous global distribution patterns, particularly in the transitional zones between continents and oceans. Indeed, the magnitude of ZWD is closely linked to PWV, a relationship that is extensively exploited in GNSS meteorology. For variables related to water vapor, the coefficients of determination are generally lower within continental-oceanic transition regions. This phenomenon can be attributed to the difference in temperature and humidity between the land and the sea. Owing to the difference in specific heat capacity between land and water, the land heats up more quickly than the sea during the day. This rapid heating causes warm, humid air to rise, leading to convection, which in turn influences the vertical distribution of water vapor. In addition, the coefficients of determination of ZHDSH vary significantly with latitude, with lower values at lower latitudes. The coefficients of determination of ZHDSH, ZTDSH and WVSH are lower in the Antarctic region, which is closely related to the high altitude and cold climatic conditions of the region. At the same time, the spatial distribution of the coefficients of determination of WVSH and TmSH in the Antarctic region shows obvious instability.
It should be noted that, unlike the other parameters, which are defined as integral values from the specified height to the top of the troposphere, water vapor density is not an integral value, so its vertical distribution is more susceptible to atmospheric activity, resulting in a relatively low coefficient of determination. However, its mean coefficient of determination still exceeds 0.85, indicating that WVSH can still reflect the vertical distribution of water vapor density to a certain extent. Meanwhile, in comparison with the parameters related to water vapor, the scale height of ZHD is more stable, with a coefficient of determination above 0.998.
Figure 3 presents the R2 box plots for the six parameters from 2013 to 2022, alongside histograms of all data over the ten-year period. Overall, the six parameters have consistently demonstrated a high level of goodness of fit over the ten years. Even for WVSH, which has the lowest fit among the parameters, the minimum R2 exceeds 0.8, while both the mean and median values are above 0.95, indicating high stability and reliability. The box plot reveals that ZHDSH does not exhibit any outliers over the past ten years, as ZHD is a variable independent of water vapor, and its vertical structure remains relatively stable across both time and space. In contrast, the other parameters related to water vapor display more outliers in their coefficients of determination over the ten-year period. This can be attributed to the rapid spatial and temporal variability of water vapor in the troposphere, leading to instability in the water vapor structure over time and space. As shown in the histogram, the distribution of R2 for all variables, except for ZHDSH, follows a normal distribution. Additionally, the histograms of variables directly related to water vapor (PWV, WVD, and ZWD) exhibit higher similarity, with the distributions of determination coefficients for PWVSH and ZWDSH being particularly similar. This finding corroborates the conclusions presented in Fig. 2. The mean R2 of the six parameters exceed 0.95, and their standard deviations are less than 0.02, indicating excellent goodness of fit and stability.
Following the quality control principles described in the Methods section, 409 high-quality radiosonde stations were selected to validate the ERA5-SH dataset. The scale heights of the six tropospheric parameters computed at each radiosonde station are taken as the reference values, and bilinear interpolation from the four nearest grid points is applied to obtain the ERA5-SH values at the corresponding station. The accuracy of this estimate is assessed by calculating RMSE, RRMSE, Bias, and R2 against the reference values. Figure 4 shows the global distribution of the validation RMSE, and it can be seen that the ERA5-SH dataset shows high precision on a global scale. Among the analyzed parameters, PWVSH and ZWDSH show stable and high accuracy globally, with RMSE values below 0.2 km at most stations. The accuracy of WVSH varies with latitude. High-latitude stations tend to have larger RMSE values, exceeding 0.6 km at certain locations. In contrast, low-latitude stations show lower RMSE values, generally within 0.2 km. In addition, TmSH exhibits high accuracy on a global scale, with most sites maintaining RMSE values within 7 km, whereas TmSH values themselves typically range between 30 and 50 km. It should be noted that ZHDSH and ZTDSH show significantly lower accuracy in Asia, primarily due to the insufficient sounding altitude of stations in this region. For reliable ZHD and ZTD estimates, a sounding altitude of over 20 km is typically required, whereas the water vapor-related variables (PWV, ZWD, WVD, Tm) can be accurately determined with a sounding altitude of approximately 15 km. Outside of this region, ZTDSH and ZHDSH also demonstrate high accuracy, with RMSE values below 0.5 km at most stations, compared with scale height values of around 7 km.
The maximum, minimum as well as the mean value of the RMSE, RRMSE, bias and correlation coefficient (R) for the six scale heights are counted and listed in Table 2. The maximum and minimum RMSE values for PWVSH are 0.719 km and 0.122 km, respectively, indicating a significant disparity that suggests some stations have poor verification results. As shown in Fig. 4, these stations are primarily located in areas with extreme climates, such as high altitudes. This trend is not exclusive to PWVSH; similar patterns are observed for other variables as well. Despite some test stations exhibiting poor verification results, the mean values for the four indices of PWVSH are 0.243 km, 13.194%, 0.185 km, and 0.919, respectively, demonstrating a generally high overall verification accuracy. For both PWVSH and ZWDSH, the average correlation coefficient exceeds 0.9, indicating the highest verification accuracy among the parameters. Conversely, the minimum correlation coefficients for ZTDSH and ZHDSH are less than 0, with mean values below 0.6, attributed to the previously mentioned insufficient sounding altitudes. Nevertheless, the maximum correlation coefficient for these parameters exceeds 0.85, suggesting that high accuracy can still be achieved at stations unaffected by sounding height limitations. For RRMSE, the mean values for WVSH and TmSH are below 10%, with maximum values under 20%, indicating that most stations exhibit high accuracy. Additionally, the minimum correlation coefficients for WVSH and TmSH are 0.542 and 0.742, respectively, reflecting their high verification accuracy. In general, a comprehensive analysis of the four precision indices across the six parameters demonstrates consistently high verification accuracy.
Furthermore, six radiosonde stations from different regions were selected to provide more comparative details: Station 3808 for ZTDSH, Station 70200 for ZHDSH, Station 45004 for ZWDSH, Station 83378 for PWVSH, Station 96441 for WVSH, and Station 52818 for TmSH. Figure 5 displays the geographical locations of these stations, along with the time series of the corresponding parameters and scatter density plots. It can be seen that the scale heights derived from radiosonde data show strong agreement with the ERA5-SH dataset, exhibiting high similarity in both overall trends and detailed variations. For PWVSH, ZWDSH, and TmSH, the verification results are particularly robust. A comparison of the time series reveals that the ERA5-SH results align closely with those from the radiosonde, both in terms of values and trends. The scatter density plots also demonstrate a strong correlation, with the R2 of the linear fits exceeding 0.9. For WVSH, since WVD is not an integral value, the verification results are relatively less stable. However, despite differences in some details, the overall consistency remains high, with a linear fit R2 greater than 0.86. In the case of ZHDSH and ZTDSH, due to the previously mentioned insufficient sounding height, there is a systematic bias between the ERA5-SH and radiosonde results. Nevertheless, the trends of the two remain highly consistent, and the scatter plots still display a strong linear correlation.
Characteristics of ERA5-SH
As shown above, the ERA5-SH dataset demonstrates high precision and stability, as indicated by the high goodness of fit of the exponential model and by validation against external radiosonde data. This section provides an in-depth analysis of the product’s characteristics, focusing on both spatial and temporal distribution patterns of the six parameters. The analysis examines these patterns from both a mean perspective and at specific moments or locations, offering a comprehensive understanding of the scale height features. Finally, an example is presented that highlights the use of the ZTDSH product to enhance the precision of spatial interpolation, especially in areas with significant elevation changes, demonstrating the practical application of the dataset in improving Earth observation accuracy.
Figure 6 illustrates the spatial distribution of ERA5-SH at 00:00 UTC on January 1, 2013. It is evident that, except for ZHD, a parameter unrelated to atmospheric water vapor, the scale heights of the remaining parameters exhibit a vortex structure influenced by atmospheric dynamics. Moreover, the scale height at the periphery of the vortex tends to be relatively high. Upon further examination of the long-term spatial distribution maps, a pronounced variability in the scale height of water vapor-related parameters is observed, displaying a distinct large-scale periodicity. In contrast, the scale height of ZHD remains relatively stable and undergoes minimal short-term fluctuations. Since ZTD is defined as the sum of ZHD and ZWD, its scale height is strongly influenced by both ZHDSH and ZWDSH. As a result, ZTDSH shows significant spatial variation with latitude (driven by ZHDSH) and exhibits a vortex structure similar to that of ZWDSH.
The temporal variation characteristics of scale height are further explored. To account for latitudinal differences, the globe was divided into six latitude regions: R1 (60°N-90°N), R2 (30°N-60°N), R3 (0°-30°N), R4 (0°-30°S), R5 (30°S-60°S), and R6 (60°S-90°S). Figure 7 illustrates the mean scale heights of the six parameters for each month in 2022 across these latitude regions. It is evident that scale heights exhibit significant differences between the Northern and Southern Hemispheres, with opposing trends over time. For instance, in the R3 region, scale heights for PWVSH and ZWDSH gradually increased from February to August, reaching peak values of 2.20 km and 2.29 km, respectively, in August. Conversely, in the R4 region, located on the opposite side of the equator, an opposing trend was observed, with minimum values of 1.59 km and 1.65 km occurring in August. Notably, extreme values of scale heights were often recorded in both hemispheres during July and August, indicating seasonal variability. ZTDSH and ZHDSH also displayed significant latitudinal differences. Due to the substantial influence of ZHD on ZTD, both parameters exhibited similar spatiotemporal distribution characteristics, with scale height in the R3 and R4 regions significantly higher than those in other high-latitude areas. Additionally, extreme climate conditions in the polar regions often result in extreme scale height values. Notable examples include WVSH in January and February in the R1 region, and TmSH in February and March in the R6 region.
To delve deeper into the mean value characteristics of scale height, Fig. 8 presents the mean value of the six parameter scale heights spanning the period from 2013 to 2022. The mean distribution of scale height exhibits pronounced geographical disparities. Specifically, the scale height of the parameter associated with water vapor displays lower values over the oceanic regions flanking both sides of the equator, while higher values are observed in proximity to the equator. In contrast, ZHD, which is decoupled from water vapor, demonstrates a relatively consistent spatial distribution, with elevated values in low-latitude areas and diminished values in high-latitude regions. Additionally, the scale height exhibits higher values near the equator, a feature that is particularly pronounced in parameters related to water vapor (e.g., PWVSH, ZWDSH). The lower scale height values observed at land-sea boundaries are primarily attributed to the temperature differences between land and sea. Furthermore, variations in scale height with respect to elevation are evident. In regions of higher elevation, such as the Qinghai-Tibet Plateau and Antarctica, scale height tends to be smaller, with an overall decrease in scale height corresponding to increased elevation.
Figure 9 shows the time series (gray dots) and box plots of ERA5-SH at a sample location (120°E, 30°N), together with summary statistics of each series such as the mean and median. Notably, the scale heights of the parameters directly linked to water vapor (PWV, WVD, ZWD) generally fall within the range of 0.8 to 4 km, with a mean value of around 2.0 km. Because ZHD is the primary component of ZTD, the scale height of ZTD is very similar to that of ZHD, mainly distributed between 7.0 km and 8.0 km. Tm is governed by temperature and decreases approximately linearly with height in the troposphere, resulting in a relatively high scale height mainly distributed between 38.0 km and 80.0 km.
Because the scale height time series exhibit annual and semi-annual periodic characteristics, formula (22) is used to fit them, and the results are depicted by the red line in Fig. 9. ZHD displays strong annual and semi-annual cycles, as it is not influenced by changes in water vapor. The remaining variables, which are associated with water vapor, fluctuate considerably but still show periodic behavior overall. The periodic characteristics also vary across locations. For instance, in the case of ZTD, at times and locations with low water vapor content, the scale height fluctuates less and displays pronounced periodic characteristics.
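A periodic fit of this kind can be sketched with ordinary least squares. Formula (22) is not reproduced in this section, so the model below (a constant plus annual and semi-annual cosine and sine terms with a 365.25-day period) is an assumed, commonly used form; the function names `fit_annual_semiannual` and `amplitudes` are hypothetical.

```python
import numpy as np


def fit_annual_semiannual(t_days, y, period=365.25):
    """Least-squares fit of y(t) = a0 + a1*cos(w) + a2*sin(w)
                                  + a3*cos(2w) + a4*sin(2w),
    with w = 2*pi*t/period (assumed model form).

    t_days : time in days (e.g. day of year or days since an epoch)
    y      : scale height series, e.g. in km
    Returns (coefficients a0..a4, fitted series).
    """
    w = 2.0 * np.pi * np.asarray(t_days, dtype=float) / period
    design = np.column_stack(
        [np.ones_like(w), np.cos(w), np.sin(w), np.cos(2 * w), np.sin(2 * w)]
    )
    coef, *_ = np.linalg.lstsq(design, np.asarray(y, dtype=float), rcond=None)
    return coef, design @ coef


def amplitudes(coef):
    """Annual and semi-annual amplitudes from the fitted coefficients."""
    return np.hypot(coef[1], coef[2]), np.hypot(coef[3], coef[4])
```

Comparing the two amplitudes with the scatter of the residuals gives a quick indication of how strongly periodic a given series is.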
Here, we present an example of height correction using ZTDSH data, which improves the accuracy of grid-to-site interpolation, particularly in regions with significant elevation changes. When ZTD is interpolated from grid points to a target position, a vertical adjustment is necessary because ZTD varies strongly with elevation. With ZTDSH data, the ZTD at each grid node can be adjusted to the target elevation using the elevation difference between the target location and the grid node. The ZTD data referenced in this study are derived from a screened ZTD dataset compiled by the Karlsruhe Institute of Technology (KIT) team in 2020, which includes 91,088,258 screened ZTD values from 12,552 GNSS stations39. We select these GNSS sites as target locations and perform bilinear interpolation based on the four nearest grid nodes surrounding each site. First, the elevation difference between each of the four grid points and the target position is calculated. Next, the ZTD at each grid point is adjusted to the target elevation according to the ZTDSH at that grid point. Finally, bilinear interpolation of the four corrected ZTD values is performed on the same elevation plane to derive the ZTD at the target position.
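The three steps above can be sketched as follows, assuming the negative-exponential profile ZTD(h) = ZTD(h0)·exp(-(h - h0)/SH) implied by the definition of scale height. The function names, the 2x2 corner layout, and the regular grid spacing `step` are illustrative assumptions rather than the authors' implementation; heights and scale heights must share the same unit (e.g. km).

```python
import numpy as np


def ztd_to_height(ztd, h_from, h_to, sh):
    """Move ZTD from height h_from to height h_to using scale height sh.

    Assumes ZTD(h) = ZTD(h0) * exp(-(h - h0) / sh); h_from, h_to and sh
    must be in the same length unit.
    """
    return ztd * np.exp(-(h_to - h_from) / sh)


def interp_ztd(lon, lat, h, lon0, lat0, step, ztd, hgt, sh):
    """Height-corrected bilinear interpolation of ZTD to a target site.

    lon, lat, h : target position and height
    lon0, lat0  : coordinates of the south-west corner of the grid cell
    step        : grid spacing in degrees
    ztd, hgt, sh: 2x2 arrays indexed [i_lat, i_lon], where [0, 0] is the
                  south-west corner and [1, 1] the north-east corner
    """
    ztd = np.asarray(ztd, dtype=float)
    hgt = np.asarray(hgt, dtype=float)
    sh = np.asarray(sh, dtype=float)

    # Steps 1-2: bring each corner value to the target height.
    adj = ztd_to_height(ztd, hgt, h, sh)

    # Step 3: bilinear interpolation on the common height plane.
    tx = (lon - lon0) / step
    ty = (lat - lat0) / step
    south = (1.0 - tx) * adj[0, 0] + tx * adj[0, 1]
    north = (1.0 - tx) * adj[1, 0] + tx * adj[1, 1]
    return (1.0 - ty) * south + ty * north
```

Setting all corner heights equal to the target height reduces this to plain bilinear interpolation, which corresponds to the uncorrected case used for comparison.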
Figure 10 presents a comparison of interpolation results for 12,552 GNSS stations worldwide. The left panel displays the interpolation results without accounting for elevation changes, while the right panel shows the results incorporating ZTDSH data. After integrating ZTDSH data, the RMSE of the interpolation decreased significantly from 50.27 mm to 18.40 mm, with the largest gains in regions with substantial elevation changes. This underscores the necessity of elevation correction for ZTD in areas with significant elevation gradients, especially at land-sea interfaces, where the RMSE can exceed 50 cm. With ZTDSH correction, the RMSE there can be reduced to less than 5 cm, a reduction in interpolation error of over 90%.
Figure 11 offers a detailed breakdown of RMSE and Bias at each site before and after elevation correction. Without elevation correction, the average RMSE of ZTD obtained through interpolation is 5.02 cm. Following elevation correction with ZTDSH, the average RMSE is reduced to 1.84 cm, indicating a significant improvement in accuracy. The stations are categorized into two groups: high altitude (>400 m) and low altitude (<400 m). Notably, the accuracy improvement is more pronounced at high altitudes.
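The per-site statistics and the altitude grouping can be reproduced along the following lines. This is a sketch under stated assumptions: the function names are hypothetical, bias is taken as the mean of interpolated minus reference ZTD, and sites at exactly 400 m (not covered by the two groups in the text) are placed in the low-altitude group.

```python
import numpy as np


def bias_rmse(est, ref):
    """Bias (mean difference) and RMSE of estimates against references."""
    d = np.asarray(est, dtype=float) - np.asarray(ref, dtype=float)
    return d.mean(), np.sqrt(np.mean(d ** 2))


def improvement_percent(rmse_before, rmse_after):
    """Relative RMSE reduction in percent."""
    return 100.0 * (rmse_before - rmse_after) / rmse_before


def group_by_altitude(site_rmse, site_height_m, threshold_m=400.0):
    """Mean RMSE of high-altitude (> threshold) and remaining sites.

    Each group is assumed to contain at least one site.
    """
    site_rmse = np.asarray(site_rmse, dtype=float)
    high = np.asarray(site_height_m, dtype=float) > threshold_m
    return {"high": site_rmse[high].mean(), "low": site_rmse[~high].mean()}
```

Applied to the overall figures quoted above, `improvement_percent(50.27, 18.40)` evaluates to roughly 63%, which is the global mean reduction; the reduction of over 90% refers to the sites with the largest elevation gradients.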
Code availability
The code for scale height estimation, analysis, and validation is available at https://github.com/HaoRuixian/ERA5-SH-dataset-for-troposphere-parameters-code-for-estimate-and-analysis.
References
Fernandes, J., Lazaro, C. & Vieira, T. On the role of the troposphere in satellite altimetry. Remote Sens Environ. 252, 112149, https://doi.org/10.1016/j.rse.2020.112149 (2020).
Balidakis, K. et al. Estimating integrated water vapor trends from VLBI, GPS, and numerical weather models: Sensitivity to tropospheric parameterization. J Geophys Res. 123, 6356–6372, https://doi.org/10.1029/2017JD028049 (2018).
Zhao, Q. et al. General method of precipitable water vapor retrieval from remote sensing satellite near-infrared data. Remote Sens Environ. 114180, https://doi.org/10.1016/j.rse.2024.114180 (2024).
Wang, L. et al. Water Vapor Retrievals from Near-infrared Channels of the Advanced Medium Resolution Spectral Imager Instrument onboard the Fengyun-3D Satellite. Adv. Atmos. Sci. 38, 1351–1366, https://doi.org/10.1007/s00376-020-0174-8 (2021).
Yang, F., Meng, X., Guo, J., Yuan, D. & Chen, M. Development and evaluation of the refined zenith tropospheric delay (ZTD) models. Satell Navig. 2, 21, https://doi.org/10.1186/s43020-021-00052-0 (2021).
Huang, L. et al. A new model for vertical adjustment of precipitable water vapor with consideration of the time-varying lapse rate. GPS Solut 27, 170, https://doi.org/10.1007/s10291-023-01506-5 (2023).
Hao, R. et al. Spatial-Temporal Variation of Water Vapor Scale Height and Its Impact Factors in different climate zones of China. Adv Space Res 74, 1576–1585, https://doi.org/10.1016/j.asr.2024.05.019 (2024).
Shi, J., Li, X., Li, L., Ouyang, C. & Xu, C. An Efficient Deep Learning-Based Troposphere ZTD Dataset Generation Method for Massive GNSS CORS Stations. IEEE Trans Geosci Remote Sens 61, 1–11, https://doi.org/10.1109/TGRS.2023.3276874 (2023).
Hopfield, H. S. Two-quartic tropospheric refractivity profile for correcting satellite data. J Geophys Res 74(18), 4487–4499, https://doi.org/10.1029/JC074i018p04487 (1969).
Saastamoinen, J. Atmospheric correction for the troposphere and stratosphere in radio ranging satellite. The Use of Artificial Satellites for Geodesy 15, 247–251, https://doi.org/10.1029/GM015p0247 (1972).
Black, H. D. & Eisner, A. Correcting satellite Doppler data for tropospheric effects. J Geophys Res-atmos 89(D2), 2616–2626, https://doi.org/10.1029/JD089iD02p02616 (1984).
Yao, Y., He, C. & Zhang, B. A new global zenith tropospheric delay model GZTD. China J of Geophysics 56, 2218–2227, https://doi.org/10.6038/cjg2013a0709 (2013).
Li, W., Yuan, Y., Ou, J., Li, H. & Li, Z. A new global zenith tropospheric delay model IGGtrop for GNSS applications. Sci Bull 57, 2132–2139, https://doi.org/10.1007/s11434-012-5010-9 (2012).
Landskron, D. & Boehm, J. VMF3/GPT3: Refined discrete and empirical troposphere mapping functions. J Geod 92, 349–360, https://doi.org/10.1007/s00190-017-1066-2 (2018).
Li, J. et al. Unraveling the Accuracy Enigma: Investigating ZTD Data Precision in TUW-VMF3 and GFZ-VMF3 Products using a Comprehensive Global GPS Dataset. IEEE Trans Geosci Remote Sens. 62, 1–1, https://doi.org/10.1109/TGRS.2024.3385228 (2024).
Wang, J. et al. Improving the vertical modeling of tropospheric delay. Geophys. Res. Lett 49(5), e2021GL096732, https://doi.org/10.1029/2021GL096732 (2022).
Zhao, Q. et al. High-precision ZTD model of altitude-related correction. IEEE J Sel Top Appl Earth Obs Remote Sens 16, 609–621, https://doi.org/10.1109/JSTARS.2022.3228917 (2022).
Zhang, S. et al. A New Four-Layer Inverse Scale Height Grid Model of China for Zenith Tropospheric Delay Correction. IEEE Access. 8, 210171–210182, https://doi.org/10.1109/ACCESS.2020.3038678 (2020).
Zhao, Q. et al. A high-precision ZTD interpolation method considering large area and height differences. GPS Solut. 28, 4, https://doi.org/10.1007/s10291-023-01547-w (2023).
Vaquero-Martínez, J. & Antón, M. Review on the role of GNSS meteorology in monitoring water vapor for atmospheric physics. Remote Sens 13(12), 2287, https://doi.org/10.3390/rs13122287 (2021).
Yang, F., Sun, Y., Meng, X., Guo, J. & Gong, X. Assessment of tomographic window and sampling rate effects on GNSS water vapor tomography. Satell Navig. 4, 7, https://doi.org/10.1186/s43020-023-00096-4 (2023).
Yang, F. et al. GNSS water vapor tomography based on Kalman filter with optimized noise covariance. GPS Solut 27, 181, https://doi.org/10.1007/s10291-023-01517-2 (2023).
Wang, Y. et al. An optimal calibration method for MODIS precipitable water vapor using GNSS observations. Atmos Res. 309, 107591, https://doi.org/10.1016/j.atmosres.2024.107591 (2024).
Ding, M., Ding, J., Peng, Z., Su, M. & Sun, T. Developments of empirical models for vertical adjustment of precipitable water vapor measured by GNSS. Adv. Space Res. https://doi.org/10.1016/j.asr.2024.08.039 (2024).
Sun, Y. et al. Evaluation of the weighted mean temperature over China using multiple reanalysis data and radiosonde. Atmos Res 285, 106664, https://doi.org/10.1016/j.atmosres.2023.106664 (2023).
Yang, F. et al. Higher accuracy estimation of the weighted mean temperature (Tm) using GPT3 model with new grid coefficients over China. Atmos Res 305, 107424, https://doi.org/10.1016/j.atmosres.2024.107424 (2024).
Yang, F. et al. Establishment and analysis of a refinement method for the GNSS empirical weighted mean temperature model. Acta Geod. et Cartogr. Sin 51(11), 2339–2345, https://doi.org/10.11947/j.AGCS.2022.20210269 (2022).
Zhang, J., Yang, L., Wang, J., Wang, Y. & Liu, X. A New Empirical Model of Weighted Mean Temperature Combining ERA5 Reanalysis Data, Radiosonde Data, and TanDEM-X 90m Products over China. Remote Sens. 16, 855, https://doi.org/10.3390/rs16050855 (2024).
Li, Q. et al. Global grid-based Tm model with vertical adjustment for GNSS precipitable water retrieval. GPS Solut 24, 73, https://doi.org/10.1007/s10291-020-00988-x (2020).
Ruf, C. S. & Beus, S. E. Retrieval of tropospheric water vapor scale height from horizontal turbulence structure. IEEE Trans. Geosci. Remote Sensing 35, 203–211, https://doi.org/10.1109/36.563258 (1997).
Hersbach, H. et al. The ERA5 global reanalysis. Q J R Meteorol Soc 146, 1999–2049, https://doi.org/10.1002/qj.3803 (2020).
Kraus, H. Die Atmosphäre der Erde: Eine Einführung in die Meteorologie. https://doi.org/10.1007/3-540-35017-9 (Springer Berlin, Heidelberg, 2004).
Böhm, J., Salstein, D., Alizadeh, M.M., Wijaya, D.D. Geodetic and Atmospheric Background. In: Böhm, J., Schuh, H. (eds) Atmospheric Effects in Space Geodesy. Springer Atmospheric Sciences. https://doi.org/10.1007/978-3-642-36932-2_1 (Springer, Berlin, Heidelberg, 2013).
Davis, J., Herring, T., Shapiro, I., Rogers, A. & Elgered, G. Geodesy by Radio Interferometry: Effects of Atmospheric Modeling Errors on Estimates of Baseline Length. Radio Sci. 20, 1593–1607, https://doi.org/10.1029/RS020i006p01593 (1985).
Sapucci, L. Evaluation of Modeling Water-Vapor-Weighted Mean Tropospheric Temperature for GNSS-Integrated Water Vapor Estimates in Brazil. J Appl Meteorol Climatol. 53, 715–730, https://doi.org/10.1175/JAMC-D-13-048.1 (2014).
Zhang, B., Yao, Y. & Xu, C. Global Empirical Model for Estimating Water Vapor Scale Height. Acta Geod Cartogr Sin 44(10), 1085–1091, https://doi.org/10.11947/j.AGCS.2015.20140664 (2015).
Hao, R. et al. ERA5-SH: A global grided scale height dataset for tropospheric parameters based on ERA5 reanalysis (Part I: ZTDSH, ZHDSH, ZWDSH) [Data set]. Zenodo. https://doi.org/10.5281/zenodo.14676025 (2025).
Hao, R. et al. ERA5-SH: A global grided scale height dataset for tropospheric parameters based on ERA5 reanalysis (Part II: PWVSH, WVSH, TmSH) [Data set]. Zenodo. https://doi.org/10.5281/zenodo.14679394 (2025).
Yuan, P. et al. An enhanced integrated water vapour dataset from more than 10 000 global ground-based GPS stations in 2020. Earth Syst. Sci. Data. 15, 723–743, https://doi.org/10.5194/essd-15-723-2023 (2023).
Acknowledgements
The authors would like to thank ERA5 and the University of Wyoming for providing the Meteorological data. We also thank the Karlsruhe Institute of Technology, Karlsruhe, Germany, for providing screened ZTD data. This study is supported by National Natural Science Foundation of China (42204022, 42074036), Science and Technology Development Plan Project of the Silk Road Economic Belt Innovation-Driven Development Pilot Zone and Urumqi-Changji-Shihezi National Innovation Demonstration Zone (2023LQY02), the Open Research Fund of State Laboratory of Information Engineering in Surveying, Mapping and Remote Sensing, Wuhan University (23P02), the Fundamental Research Funds for the Central Universities (2024ZKPYDC02, 2024CXCYY010, 2042023kf0003), China University of Mining and Technology-Beijing Innovation Training Program for College Students (202402008, 202402010).
Author information
Contributions
Ruixian Hao: Conceptualization, Methodology, Writing - Original Draft; Fei Yang: Supervision, Writing - Review & Editing, Validation; Zhicai Li: Data Curation, Visualization; Yuhao Zhang: Investigation; Lv Zhou: Data Curation; Lei Wang: Writing - Review & Editing.
Ethics declarations
Competing interests
The authors declare no competing interests.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.
About this article
Cite this article
Hao, R., Yang, F., Li, Z. et al. ERA5-SH: A global grided scale height dataset for tropospheric parameters based on ERA5 reanalysis. Sci Data 12, 381 (2025). https://doi.org/10.1038/s41597-025-04714-5