Introduction

Precipitation is the driving force of the hydrological cycle and also plays a significant role in atmospheric circulation1. Accurate observation of its spatiotemporal distribution and intensity is crucial for numerous applications, including flood and drought forecasting, soil moisture simulation, vegetation cover research, disaster prevention, and understanding the response of the rainfall-runoff process to climate change2,3,4. However, the precipitation process is influenced by a combination of factors such as atmospheric motion, geographic location, and underlying surface, resulting in complex spatial variability5. Traditional precipitation observation methods include ground rain gauge networks and radar systems. In basins with complex topography and climate, however, these methods struggle to capture the spatiotemporal distribution of precipitation accurately, because rain gauge equipment is difficult to maintain and radar systems are prone to interference. Additionally, data collected by rain gauges are mostly used within the industry, making it difficult to obtain meteorological data with high spatiotemporal precision.

With the development of remote sensing technology, satellite-based precipitation products have received extensive attention due to their continuous spatiotemporal coverage and wide range, offering new opportunities for ecohydrological modeling in data-sparse or data-deficient regions6,7. To date, various high-resolution satellite precipitation products have been released and widely used in related research, such as the Tropical Rainfall Measuring Mission (TRMM)8, the Climate Prediction Center Morphing Technique (CMORPH)9, the Precipitation Estimation from Remotely Sensed Information using Artificial Neural Networks-Climate Data Record (PERSIANN-CDR)10, CHIRPS11, the Global Precipitation Measurement Integrated Multi-Satellite Retrievals (GPM IMERG)12,13,14, and the Multi-Source Weighted-Ensemble Precipitation (MSWEP)15. In addition, reanalysis datasets, such as the China Meteorological Forcing Dataset (CMFD)16, are created from a combination of observed data and assimilated data. Different precipitation products exhibit varying accuracy and reliability in specific basins or regions because of uncertainties in, for example, their retrieval algorithms. These differences can introduce greater uncertainty into the description of precipitation spatiotemporal distribution and into hydrological process modeling, especially in areas with complex climate and topography17. Additionally, the long-term, gauge-based gridded precipitation dataset for the Chinese mainland (CHM_PRE) is a high-quality dataset with high spatial–temporal resolution that was released in 202318. However, due to its relatively recent release, the adaptability of this dataset at the basin or regional level has not yet been evaluated.

Currently, the accuracy of precipitation products is mainly assessed at two spatial scales: the grid scale, which compares observed station precipitation with satellite precipitation in the corresponding grid19, and the basin/region scale20. Grid-scale comparison better reflects the accuracy of precipitation at individual grids but cannot directly express performance at the basin scale. Basin-scale comparison primarily evaluates the overall performance of precipitation products from the perspective of the entire basin, and its statistical index values are generally higher than those at the grid scale, possibly because underestimation in some subareas offsets overestimation in others21,22. Basin-scale comparison also cannot readily reflect the performance of precipitation products in local parts of the basin. In summary, both grid-scale and basin-scale assessments have limitations, which motivates the search for an intermediate spatial scale for evaluation.

Some scholars compare the accuracy of precipitation products by dividing the study area into different zones, including climatic zones23,24,25, topographical zones26,27,28, river sub-basin zones29,30, and water resource zones31,32,33. For instance, Peng et al.34 divided China into four different climatic regions (arid, transitional, humid, and the Qinghai-Tibet Plateau) to assess the accuracy of precipitation products, finding that the performance of each product varies spatially. Zeng et al.26 categorized Sichuan into three topographical regions based on elevation to evaluate the accuracy of precipitation products at different topographical regions and grid scales. Yang et al.30 divided the Yellow River Basin into 24 sub-basins according to its river system to compare the reliability of precipitation products. Tan et al.31 used the three-level water resource zones defined by the Hai River Basin Water Conservancy Commission to divide the basin into four regional scales for assessing the accuracy of precipitation products. These zoning methods consider factors other than precipitation (such as topography, underlying surface, and water resource distribution, etc.)35,36. In fact, precipitation itself is influenced by multiple factors37,38. However, there is currently limited research on utilizing the inherent spatiotemporal characteristics of precipitation data itself for precipitation zoning and comparing the performance of precipitation products.

This study introduces an innovative precipitation zoning framework based on REOF (Rotated Empirical Orthogonal Function), enabling the first multi-scale accuracy evaluation in the upper and middle reaches of the Hanjiang River Basin (UMHRB) to transition from a basin-wide perspective to intrinsic precipitation zones. This approach not only deepens our understanding of the performance heterogeneity of mainstream precipitation products (IMERG, MSWEP, CMFD, and CHM_PRE), but more crucially, reveals a critical yet frequently overlooked phenomenon: products that perform satisfactorily at the watershed scale often exhibit significant deterioration at the precipitation zoning scale. These findings expose the constraints of traditional whole-basin assessments and emphasize the imperative for evaluating products tailored to specific geo-climatic units. Therefore, the multi-scale zoning framework developed in this study not only guides data selection for precision meteorology and drought monitoring but also serves as a valuable methodological template for precipitation product assessment in other regions.

Materials

Study area

The Hanjiang River, a first-order tributary of the Yangtze River, is bordered by the Qinling Mountains to the north and connected to the Daba Mountains to the south. It has received extensive attention because it is the water source area for the Middle Route of the South-to-North Water Transfer Project and plays a critical role in flood control in the middle reaches of the Yangtze River39,40,41. The main stem of the river spans a total length of 1577 km with a drainage area of 159,000 km². We selected the upper and middle reaches of the Hanjiang River basin (31° 25′ N to 34° 10′ N, 106° 05′ E to 112° 16′ E) as the research area, which covers a drainage area of 105,000 km². The elevation decreases from west to east. The elevation difference is substantial, with the maximum elevation reaching 3945 m and decreasing to 47 m at the outlet (Fig. 1).

Fig. 1

The location of the upper and middle reaches of the Hanjiang River basin (UMHRB), and the spatial distribution of weather stations. All of the maps in this figure and Figs. 2, 3, 4, 5 and 8 were created by ArcGIS 10.2 software (https://resources.arcgis.com/en/help/main/10.2/).

The research area experiences a northern subtropical monsoon climate, characterized by mild and humid conditions with distinct seasonal variations in precipitation. Based on data from 1975 to 2018, the long-term mean annual precipitation is 930 mm. This total is heavily dominated by the rainy season (May–October), during which monthly precipitation averages between 102 and 178 mm. The rainy season alone accounts for approximately 79.8% of the annual precipitation. Conversely, the dry season (November–April) is marked by arid conditions, with monthly precipitation averaging only 12 to 39 mm. According to relevant studies on the Yangtze River Basin and the Hanjiang River Basin42, the four seasons are defined as spring (March to May), summer (June to August), autumn (September to November), and winter (December to February).

Datasets

Four precipitation products with a spatial resolution of 0.1° are assessed: IMERG, MSWEP, CMFD, and CHM_PRE. Details about these precipitation products are listed in Table 1.

Table 1 General information of precipitation products used in this study.

IMERG

IMERG is derived from the Integrated Multi-satellite Retrievals for GPM. The Global Precipitation Measurement (GPM) mission, jointly proposed by NASA of the United States and JAXA of Japan, represents a new generation of global satellite precipitation observation programs. The algorithm is designed to more accurately capture trace precipitation and solid precipitation46. As a representative product of GPM, the IMERG V07 Final Run integrates data from the GPM Microwave Imager (GMI), all passive microwave (PMW) data from other satellites, geostationary orbit satellite infrared data, and monthly observations from ground rain gauges.

MSWEP

MSWEP is a long-term gridded precipitation product, developed by Beck in 2016. MSWEP combines various precipitation products including GSMaP-MVK, TRMM 3B42RT, PERSIANN, and SM2RAIN-ASCAT. The weights for station precipitation data are calculated based on the density of station grids, while the weights for satellite and reanalysis data products are determined by their accuracy at the station. Finally, the product is calibrated using data from 125 global flux towers.

CMFD

CMFD is a gridded near-surface meteorological dataset with high spatial–temporal resolution that was developed specifically for studies of land surface processes in China. The data were produced by fusing remote sensing products, reanalysis data, and in-situ observation data. Its record spans January 1979 to December 2018.

CHM_PRE

CHM_PRE is based on daily precipitation observations from 2839 stations within and around China since 1961. Building upon the traditional approach of constructing a dataset using a “precipitation background field precipitation ratio field”, this dataset additionally applies monthly precipitation constraints and topographical feature corrections. The accuracy of CHM_PRE was evaluated using interpolated daily precipitation data from approximately 40,000 high-density stations across China from 2015 to 2019. The assessment reveals that CHM_PRE effectively represents the spatial variability of precipitation, with a median correlation coefficient of 0.78, a median root mean square error of 8.8 mm/d, and a median Kling–Gupta Efficiency (KGE) value of 0.69 between its daily time series and the daily precipitation observations at high-density stations. The data cover the period from 1961 to 2022.

In-situ meteorological measurements

The measured precipitation data were obtained from the China Meteorological Data Service Center at a daily temporal resolution (http://data.cma.cn). To ensure data quality, the meteorological station data were filtered for completeness and temporal scale. Ultimately, daily precipitation data from 42 meteorological stations were selected for analysis (Fig. 1). This data has undergone a series of quality control measures by the National Meteorological Information Center, including tests for outliers, consistency, and non-parametric verification. In this study, a long-term series of annual precipitation data from 1975 to 2018 was used for precipitation zoning, while daily precipitation data from 2001 to 2018 were employed to validate the accuracy of the precipitation products.

Methodology

The daily precipitation statistics for the observed meteorological stations are calculated from 20:00 of the previous day to 20:00 of the current day (corresponding to 12:00 UTC to 12:00 UTC). In contrast, the daily precipitation data from precipitation products are recorded from 00:00 to 24:00 UTC. In this study, the data from the precipitation products are aggregated by summing the data from 12:00 to 24:00 UTC on the previous day and from 00:00 to 12:00 UTC on the current day to obtain new precipitation product datasets. Daily precipitation data are then generated through cumulative summation. To avoid the uncertainty introduced by interpolating meteorological station data to a raster, this study uses the grid point values as estimates from the precipitation product and compares them with the observed values from meteorological stations within the same grid.
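The re-aggregation described above can be sketched as follows. This is a minimal illustration only, assuming that a product is available at sub-daily steps keyed by the UTC start time of each step; the function name `gauge_day_totals` and the dictionary input are hypothetical and are not taken from the study.

```python
from datetime import datetime, timedelta

def gauge_day_totals(steps):
    """Sum sub-daily amounts into gauge-style days that end at 12:00 UTC.

    `steps` maps the UTC start time of each sub-daily step to its
    precipitation amount (mm). A step starting at or after 12:00 UTC is
    credited to the following calendar day, so the total labelled day D
    covers 12:00 UTC on day D-1 to 12:00 UTC on day D.
    """
    totals = {}
    for start, amount in steps.items():
        label = (start + timedelta(hours=12)).date()
        totals[label] = totals.get(label, 0.0) + amount
    return totals

# Illustrative input: 48 hourly steps of 1 mm starting at 2001-01-01 00:00 UTC.
hourly = {datetime(2001, 1, 1) + timedelta(hours=h): 1.0 for h in range(48)}
totals = gauge_day_totals(hourly)
```

With this labelling, only the middle day of the illustrative input is a complete 24-hour gauge day; the first and last labels each hold a partial 12-hour sum and would normally be discarded.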

This study evaluates the accuracy of daily precipitation products from IMERG, MSWEP, CMFD, and CHM_PRE at two spatial scales: the precipitation zoning scale and the basin scale. Additionally, the study assesses the capabilities of these products in detecting different precipitation intensities and extreme precipitation events. Rotated Empirical Orthogonal Function (REOF) analysis is employed to identify the precipitation zones.

Rotated empirical orthogonal function

The Empirical Orthogonal Function (EOF) and Rotated Empirical Orthogonal Function (REOF) are widely used methods in climatology47. They decompose meteorological element fields with spatiotemporal characteristics into a linear combination of spatial modes and time coefficients.

The EOF, also known as Principal Component Analysis (PCA), is utilized to extract the principal components from a data series48. Each identified spatial mode is associated with a corresponding time coefficient, which demonstrates how the mode’s amplitude varies over time. The spatial mode with the largest variance contribution is termed the first mode, while the one with the second-largest variance contribution is called the second mode. The variance not stored in the first mode is stored in the second mode, and so forth. As a result, the explained variance for each subsequent mode will be smaller.

The observational data matrix of a meteorological element field can be orthogonally decomposed into two parts: a spatial function V and a temporal function T:

$$X = VT$$
(1)

The REOF, also known as Rotated Principal Component Analysis, achieves a high-value separation of typical spatial modes by performing orthogonal rotation with maximum variance on several eigenvectors that meet certain criteria in EOF. This high value is only found in regions where the meteorological element field changes significantly, while the rest of the regions approach zero. REOF simplifies the spatial structure and highlights the characteristics of local regions. For more details on the primary principles, please refer to the relevant reference49,50. The number of modes is determined through the North significance error test51. This test relies on the sampling error theory of eigenvalues, and according to this theory, modes are considered insignificant when the difference between adjacent eigenvalues is less than their error range. As a data-driven statistical approach, it offers a robust framework for delineating spatially coherent regions. The utility of this method has been demonstrated in various studies52,53.
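The decomposition, the North test, and the rotation can be sketched with NumPy as below. This is a minimal sketch under stated assumptions, not the implementation used in the study: the input is an anomaly matrix with years as rows and stations as columns (the transpose of the layout in Eq. (1)), the number of independent samples in the North test is taken as the number of years, and the loadings passed to the rotation are eigenvectors scaled by the square roots of their eigenvalues. The function names are hypothetical.

```python
import numpy as np

def eof(anomalies):
    """EOF via SVD. `anomalies` has shape (n_time, n_station), column means removed.

    Returns eigenvalues (descending), spatial modes (n_station, n_mode) and
    time coefficients (n_time, n_mode), so anomalies == pcs @ modes.T.
    """
    n_time = anomalies.shape[0]
    u, s, vt = np.linalg.svd(anomalies, full_matrices=False)
    eigvals = s**2 / (n_time - 1)
    return eigvals, vt.T, u * s

def north_separated(eigvals, n_samples):
    """North rule of thumb: the sampling error of eigenvalue k is about
    eigvals[k] * sqrt(2 / N); mode k counts as separated from mode k + 1
    when the gap between them exceeds that error."""
    error = eigvals * np.sqrt(2.0 / n_samples)
    return (eigvals[:-1] - eigvals[1:]) > error[:-1]

def varimax(loadings, max_iter=100, tol=1e-8):
    """Orthogonal varimax rotation of a (n_station, n_mode) loading matrix."""
    n_rows, n_modes = loadings.shape
    rotation = np.eye(n_modes)
    criterion = 0.0
    for _ in range(max_iter):
        rotated = loadings @ rotation
        target = rotated**3 - rotated * (rotated**2).sum(axis=0) / n_rows
        u, s, vt = np.linalg.svd(loadings.T @ target)
        rotation = u @ vt                      # stays orthogonal
        if s.sum() < criterion * (1.0 + tol):  # no further improvement
            break
        criterion = s.sum()
    return loadings @ rotation, rotation

def reof(anomalies, n_keep):
    """Rotate the leading `n_keep` EOF modes (scaled by sqrt of eigenvalue)."""
    eigvals, modes, _ = eof(anomalies)
    loadings = modes[:, :n_keep] * np.sqrt(eigvals[:n_keep])
    return varimax(loadings)
```

Because the rotation is orthogonal, the total variance carried by the retained modes is unchanged; it is only redistributed among them, which is why rotated variance contributions tend to be more even than the unrotated ones.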

Evaluation indicators

Based on the aforementioned precipitation zoning results, the observed precipitation data and corresponding precipitation product data from all stations within each zone were aggregated separately, and performance metrics were then calculated from these aggregated datasets. The accuracy is evaluated using four statistical indicators: the Correlation Coefficient (CC), Nash–Sutcliffe Efficiency Coefficient (NSE), Root Mean Square Error (RMSE), and Relative Bias (BIAS). CC reveals the linear fitting relationship between the precipitation product and the measured observations. NSE is commonly used to assess the accuracy of model simulations. RMSE reflects the average error and is highly sensitive to outliers. BIAS describes systematic bias. The closer CC and NSE are to 1, and RMSE and BIAS are to 0, the more reliable the precipitation estimates are considered for a specific region or period. The Kling–Gupta Efficiency (KGE) is a comprehensive evaluation indicator that measures the overall goodness of fit between observed and estimated values; a higher value indicates superior performance, and a KGE below 0 indicates poor performance of the precipitation product. For detailed formulas, please refer to Table 2, where n is the number of samples, Gi is the gauge observation, \(\overline{G}\) is the average of Gi, Si is the product estimate, \(\overline{S}\) is the average of Si, and r is the Pearson correlation coefficient; \(\delta_{s}\) and \(\mu_{s}\) are the coefficient of variation and mean of the product estimates, respectively; and \(\delta_{o}\) and \(\mu_{o}\) are the coefficient of variation and mean of the gauge observations, respectively.
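For readers who want to reproduce the indicators, the sketch below uses their commonly cited forms. These are assumptions rather than a transcription of Table 2: BIAS is taken as the summed difference relative to the summed gauge total, in percent, and KGE is taken in the form built from the correlation, the ratio of means, and the ratio of coefficients of variation, which matches the symbols defined above. The function name `evaluate` is hypothetical.

```python
import math

def evaluate(gauge, product):
    """Return CC, NSE, RMSE, BIAS (%) and KGE for paired daily series."""
    n = len(gauge)
    g_mean = sum(gauge) / n
    s_mean = sum(product) / n
    g_ss = sum((g - g_mean) ** 2 for g in gauge)      # gauge sum of squares
    s_ss = sum((s - s_mean) ** 2 for s in product)    # product sum of squares
    cov = sum((g - g_mean) * (s - s_mean) for g, s in zip(gauge, product))
    sq_err = sum((s - g) ** 2 for g, s in zip(gauge, product))

    cc = cov / math.sqrt(g_ss * s_ss)
    nse = 1.0 - sq_err / g_ss
    rmse = math.sqrt(sq_err / n)
    bias = 100.0 * (sum(product) - sum(gauge)) / sum(gauge)

    beta = s_mean / g_mean                            # ratio of means
    cv_s = math.sqrt(s_ss / n) / s_mean               # coefficient of variation
    cv_o = math.sqrt(g_ss / n) / g_mean
    gamma = cv_s / cv_o                               # ratio of CVs
    kge = 1.0 - math.sqrt((cc - 1) ** 2 + (beta - 1) ** 2 + (gamma - 1) ** 2)
    return {"CC": cc, "NSE": nse, "RMSE": rmse, "BIAS": bias, "KGE": kge}
```

The sketch does not guard against a constant series or a zero gauge total, either of which would cause a division by zero.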

Table 2 Evaluation indicators used in this study.

To evaluate the accuracy of precipitation products, a comprehensive index P value is calculated based on four statistical indicators: CC, RMSE, BIAS, and NSE. The calculation formula for the P value is as follows54:

$$P = \omega_{1} X_{1} + \omega_{2} X_{2} + \omega_{3} X_{3} + \omega_{4} X_{4}$$
(2)

where \(X_{1} ,X_{{2}} ,X_{3} ,X_{4}\) represent CC, RMSE, BIAS and NSE; \(\omega_{1} ,\omega_{2} ,\omega_{3} ,\omega_{4}\) represent the weight with values of 0.4, 0.1, 0.3 and 0.2, respectively. A higher P value indicates better performance.

To avoid the impact of different dimensions, the four indicators are normalized as follows:

$$X = \frac{{X_{i} - X_{\min } }}{{X_{\max } - X_{\min } }}$$
(3)

$$X = \frac{{X_{\max } - X_{i} }}{{X_{\max } - X_{\min } }}$$
(4)

where \(X_{\max }\),\(X_{\min }\), and \(X_{i}\) represent the maximum, minimum, and absolute values of the statistical indicators, respectively. CC and NSE are normalized using Eq. (3), while BIAS and RMSE are normalized using Eq. (4).
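Equations (2) to (4) can be combined as in the following sketch. It assumes that the maximum and minimum in Eqs. (3) and (4) are taken across the products being compared, that the absolute value is applied to BIAS only, and that an indicator on which all products tie contributes its full weight; none of these choices is stated in the source, and the function name `comprehensive_p` is hypothetical.

```python
WEIGHTS = {"CC": 0.4, "RMSE": 0.1, "BIAS": 0.3, "NSE": 0.2}
LARGER_IS_BETTER = {"CC", "NSE"}   # normalized with Eq. (3); the rest use Eq. (4)

def comprehensive_p(scores):
    """Compute the P value of each product.

    `scores` maps a product name to a dict with keys CC, RMSE, BIAS, NSE.
    """
    p = {name: 0.0 for name in scores}
    for indicator, weight in WEIGHTS.items():
        values = {
            name: abs(m[indicator]) if indicator == "BIAS" else m[indicator]
            for name, m in scores.items()
        }
        low, high = min(values.values()), max(values.values())
        for name, value in values.items():
            if high == low:
                x = 1.0                                # assumption: tie
            elif indicator in LARGER_IS_BETTER:
                x = (value - low) / (high - low)       # Eq. (3)
            else:
                x = (high - value) / (high - low)      # Eq. (4)
            p[name] += weight * x
    return p
```

Under this min-max scheme the P value is relative: a product that is best on every indicator scores 1 and one that is worst on every indicator scores 0, regardless of the absolute size of the differences.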

Probability density function

The probability density function (PDF) is used to evaluate the performance of precipitation products at different precipitation intensities55. The PDF represents the probability of precipitation occurrence in different rainfall intervals and has been widely used in research to assess the quality of precipitation products56. In this study, we categorize daily precipitation into eight classes (mm/d): < 0.1 (non-rainfall), 0.1–1 (tiny rainfall), 1–2 (light rainfall), 2–5 (light rain), 5–10 (moderate rainfall), 10–25 (heavy rainfall), 25–50 (heavy rainfall), and > 50 (violent rainfall)56,57,58.
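The occurrence probability per intensity class can be computed as in the sketch below. The class edges follow the eight classes listed above; treating each lower edge as inclusive is an assumption, so a value of exactly 50 mm/d falls in the last class. The function name `intensity_pdf` is hypothetical.

```python
from bisect import bisect_right

EDGES = [0.1, 1.0, 2.0, 5.0, 10.0, 25.0, 50.0]           # mm/d
LABELS = ["<0.1", "0.1-1", "1-2", "2-5", "5-10", "10-25", "25-50", ">50"]

def intensity_pdf(daily):
    """Return the fraction of days falling in each intensity class."""
    counts = [0] * len(LABELS)
    for value in daily:
        # bisect_right puts a value equal to an edge into the upper class.
        counts[bisect_right(EDGES, value)] += 1
    total = len(daily)
    return {label: count / total for label, count in zip(LABELS, counts)}
```

Applying the same function to the gauge series and to each product series of a zone gives directly comparable class probabilities.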

Extreme precipitation indices

Seven commonly used extreme precipitation indices developed by the ETCCDI (Expert Team on Climate Change Detection and Indices) are employed to evaluate the capability of precipitation products in detecting extreme precipitation events59. These indices are: PRCPTOT, the annual total precipitation on days with at least 1 mm of precipitation (mm); SDII, the ratio of the annual total precipitation to the number of rainy days in a year (mm/d); Rx1day, the maximum daily precipitation amount in a year (mm); Rx5day, the maximum consecutive 5-day precipitation amount in a year (mm); R10mm, the number of days in a year with daily precipitation of at least 10 mm (d); CDD, the maximum number of consecutive days with daily precipitation less than 1 mm (d); and R95pTOT, the annual total precipitation on days with precipitation exceeding the 95th percentile (mm).
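A compact sketch of the seven indices for one year of daily data is given below. It is illustrative only and rests on several assumptions: a wet day is a day with at least 1 mm, R10mm counts days with at least 10 mm (the ETCCDI convention), CDD is computed within the single year supplied, and the 95th-percentile threshold `p95` is supplied by the caller because the reference period used to derive it is not stated in the source. The function name `extreme_indices` is hypothetical.

```python
def extreme_indices(daily, p95):
    """Compute seven ETCCDI-style indices from one year of daily totals (mm).

    `daily` needs at least five values; `p95` is the 95th-percentile
    threshold of wet-day precipitation, computed elsewhere (assumption).
    """
    wet = [p for p in daily if p >= 1.0]
    prcptot = sum(wet)
    sdii = prcptot / len(wet) if wet else 0.0

    longest = current = 0
    for p in daily:                       # longest run of dry days (< 1 mm)
        current = current + 1 if p < 1.0 else 0
        longest = max(longest, current)

    return {
        "PRCPTOT": prcptot,
        "SDII": sdii,
        "Rx1day": max(daily),
        "Rx5day": max(sum(daily[i:i + 5]) for i in range(len(daily) - 4)),
        "R10mm": sum(1 for p in daily if p >= 10.0),
        "CDD": longest,
        "R95pTOT": sum(p for p in daily if p > p95),
    }
```

Computing each index per year for the gauges and for each product yields paired annual series, which is the form needed to score an index with KGE.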

Results

Precipitation zones based on REOF

The variance contribution rates of the EOF and REOF decomposition, derived from observed meteorological data from 1975 to 2018, are displayed in Table 3. As indicated in Table 3, the variance contribution rates of the REOF spatial eigenvectors are relatively evenly distributed, with the first eigenvector having the highest variance contribution rate, explaining 21.9% of the total variance, followed by the second eigenvector, which explains 19.03% of the total variance. The variance contribution rate of the eigenvectors diminishes sequentially. The North significance test showed that the first five modes are significant. Consequently, the study domain is partitioned into five distinct precipitation zones.

Table 3 Variance contribution and cumulative variance contribution of the EOF and REOF decomposition.

Spatial distribution maps of the five REOF modes were generated using the Inverse Distance Weighting (IDW) method from the ArcGIS toolbox, and the interpolated values were classified into three classes using Jenks Natural Breaks. Figure 2 displays the five final selected REOF spatial modes, which represent five distinct precipitation loading patterns. Each of these loading patterns corresponds to a different precipitation zone. The white areas represent high-value regions of the spatial modes, indicating that these areas should be considered as separate zones.
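The interpolation step can be illustrated with a plain IDW estimator. This is a sketch of the general technique, not of the ArcGIS tool: the power of 2 and the use of all stations for every target point are assumptions, since the source does not report the settings used, and the function name `idw` is hypothetical.

```python
import math

def idw(stations, target, power=2.0):
    """Inverse-distance-weighted estimate at `target` = (x, y).

    `stations` is a list of (x, y, value) tuples, for example station
    coordinates paired with their loading on one REOF mode.
    """
    weighted_sum = weight_total = 0.0
    for x, y, value in stations:
        distance = math.hypot(x - target[0], y - target[1])
        if distance == 0.0:
            return value                  # target coincides with a station
        weight = distance ** -power
        weighted_sum += weight * value
        weight_total += weight
    return weighted_sum / weight_total
```

Evaluating this function on every cell centre of a regular grid produces the kind of continuous loading surface that is then split into classes.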

Fig. 2

REOF modal distribution of precipitation (a) REOF 1, (b) REOF 2, (c) REOF 3, (d) REOF 4, and (e) REOF 5.

The final precipitation zoning is illustrated in Fig. 3, where Arabic numerals denote the individual precipitation zones. It can be observed that Zone 1 is primarily located in the southernmost part of the basin, with 10 corresponding meteorological stations; Zone 2 is mainly near the basin outlet, with 7 meteorological stations; Zone 3 is predominantly in the northernmost part, with 10 meteorological stations; Zone 4 is mainly in the uppermost reaches, with 8 meteorological stations; and Zone 5 is primarily in the eastern part, with 7 meteorological stations. There is no overlap between the observed stations within the 5 precipitation zones.

Fig. 3

Diagram of the precipitation zones.

Performance evaluation of spatial distribution and grid comparison

Precipitation data from meteorological stations (2001–2018) were spatially interpolated to generate raster precipitation data matching the resolution of the satellite precipitation data. Fig. 4a,f shows that the multi-year average and the summer average daily precipitation both increase gradually from north to south in the study area, with precipitation mainly concentrated in the southern area.

Fig. 4

Spatial distribution of mean annual precipitation (MAP) (a–e) and June–August (SAP) (f–j) daily precipitation during the period 2001–2018.

Figure 4b–e presents the spatial distributions of the multi-year average daily precipitation amounts of IMERG, MSWEP, CMFD and CHM_PRE. All four products can well reflect the spatial variation characteristics of precipitation, that is, the precipitation gradually increases from north to south. Figure 4b shows that the IMERG product shows a certain degree of overestimation in the upstream area, while the MSWEP product is underestimated in the middle and upper reaches in Fig. 4c.

Figure 4g–j presents the spatial distributions of the summer average daily precipitation amounts of IMERG, MSWEP, CMFD, and CHM_PRE. All four precipitation products reflect the spatial variation characteristics of summer precipitation well, especially the CMFD product, whose results are basically consistent with the distribution of station-interpolated precipitation (Fig. 4i). However, the other three precipitation products failed to capture the significant rainfall in the middle reaches of the basin.

We also evaluated the spatial performance of four precipitation products (IMERG, MSWEP, CMFD, CHM_PRE) against 42 gauges in the UMHRB using four daily-scale metrics: CC, RMSE, BIAS, and NSE (Fig. 5). CHM_PRE consistently outperformed the other products. It achieved the highest CC, with 100% of its grids strongly correlated (CC > 0.8) with observations (Fig. 5m). This stands in stark contrast to CMFD, where 85% of grids had a CC > 0.6 but none exceeded 0.8 (Fig. 5i), and to IMERG and MSWEP, which had only 9% and 14% of grids above the 0.6 threshold (Fig. 5a,e). NSE displayed a spatial pattern similar to CC. CHM_PRE also yielded the lowest RMSE (0.7–3.3 mm/d), far below the minimum values for IMERG (5.8), MSWEP (5.3), and CMFD (4.2 mm/d). Furthermore, CHM_PRE exhibited minimal bias, with 97% of grids within ± 10%, compared to 71% for CMFD. While Fig. 5 offers a granular view of station and grid-scale accuracy, the significant spatial variability in performance precludes a straightforward visual evaluation of the basin’s overall accuracy.

Fig. 5

Spatial distribution of four metrics for IMERG, MSWEP, CMFD, and CHM_PRE in the UMHRB from 2001 to 2018. The first column is CC (a IMERG, e MSWEP, i CMFD, and m CHM_PRE), the second column is RMSE (b IMERG, f MSWEP, j CMFD, and n CHM_PRE), the third column is BIAS (c IMERG, g MSWEP, k CMFD, and o CHM_PRE), and the fourth column is NSE (d IMERG, h MSWEP, l CMFD, and p CHM_PRE).

Performance of daily precipitation products on precipitation zoning scale

Table 4 presents the comparison results of statistical indicators for the precipitation products across different precipitation zones and the basin scale, clearly revealing significant spatial heterogeneity in product performance. For IMERG, the optimal values of the successive indicators appear in zones 1, 2, 1, and 5, while the worst values appear in zones 2, 1, 2, and 1. For MSWEP, the optimal values appear in zones 1, 5, 1, and 5, while the worst values appear in zones 5, 2, 2, and 1. For CMFD, the optimal values appear in zones 1 and 4, 3, 1 and 5, while the worst values appear in zones 2, 1, 2, and 1. For CHM_PRE, the optimal values all appear in Zone 1, while the worst values appear in zones 2, 3 and 5, 2, 2, and 2.

Table 4 Daily precipitation from precipitation products against observations in all precipitation zones and at the basin scale.

On both the zoning and basin scales, the IMERG, CMFD, and CHM_PRE precipitation products all show varying degrees of underestimation, with CMFD exhibiting the most significant underestimation, particularly in Zone 3 (BIAS = − 10.61%). In contrast, MSWEP overestimates precipitation across the entire basin (BIAS = 0.79%). The evaluation metrics of the CHM_PRE precipitation product, including CC, NSE and RMSE, all outperform the other three products.

To evaluate the comprehensive performance of the four precipitation products across the five precipitation zones and at the basin scale, the P values were calculated (Fig. 6). For each region, the product with the highest P value is recommended as the best product. Overall, CHM_PRE is recommended for all precipitation zones and the whole UMHRB.

Fig. 6

Heatmap of P values of precipitation products against observations.

Assessment of precipitation event probability

The probability density functions (PDFs) for daily precipitation from IMERG, MSWEP, CMFD, and CHM_PRE show a consistent pattern across different intensity levels (Fig. 7). In all five precipitation zones and the UMHRB, the highest probability of precipitation occurrence is for light events (0.1–1 mm/d), followed by moderate events (2–5 mm/d). Conversely, the lowest probabilities are associated with violent rainfall (> 50 mm/d) and heavy rainfall (25–50 mm/d). Regarding product performance, IMERG, CMFD, and CHM_PRE systematically underestimate the probability of all precipitation events exceeding 1 mm/d, while MSWEP exhibits a similar underestimation for events greater than 2 mm/d in all zones and the UMHRB.

Fig. 7

The probability density function of daily precipitation events with different intensities for IMERG, MSWEP, CMFD, CHM_PRE and gauge stations in precipitation zones (a Zone 1, b Zone 2, c Zone 3, d Zone 4, and e Zone 5) and UMHRB (f).

IMERG, CMFD and CHM_PRE overestimated the probability of precipitation events ranging from 0 to 0.1 mm/d in all zones and the UMHRB, with CHM_PRE exhibiting the highest probability for this range. MSWEP, however, overestimated this probability only in Zone 3 and Zone 5. Across all zones and the UMHRB, IMERG, MSWEP, and CMFD overestimated the probability of precipitation events ranging from 0.1 to 1 mm/d, with MSWEP and CMFD exhibiting a greater overestimation than IMERG. For the precipitation range of 1 to 2 mm/d, MSWEP overestimated the probability only in Zone 1, while underestimating it in all other zones and the UMHRB.

Overall, IMERG, MSWEP and CMFD reproduce probability trends of precipitation events that are consistent with those at the gauge stations. Among them, MSWEP better reproduces the probability density distribution across precipitation intensity levels, indicating a better ability to capture the various precipitation probabilities. CHM_PRE, on the other hand, overestimates the probability of non-precipitation events (less than 0.1 mm/d) but effectively captures the probability of events in the range of 0.1 to 1 mm/d. This suggests that the performance of CHM_PRE in capturing the occurrence of precipitation events needs further improvement, especially in distinguishing between zero and non-zero precipitation.

The capability of detecting extreme precipitation events

Table 5 evaluates the performance of four precipitation products (IMERG, MSWEP, CMFD, and CHM_PRE) in detecting extreme precipitation events, as measured by KGE. At the basin scale, IMERG achieves KGE values exceeding 0.5 for all extreme precipitation indices except Rx1day and CDD. Similarly, MSWEP and CMFD also surpass the 0.5 threshold for all indices except SDII and Rx1day. Notably, CHM_PRE outperforms the other products, achieving KGE values above 0.5 for all indices and recording the highest scores across the board. An analysis at the precipitation zoning scale reveals a more complex and heterogeneous performance landscape for the four products in simulating extreme precipitation. Both IMERG and CMFD exhibit systematic deficiencies; IMERG struggles particularly with the Rx1day index across all zones, while CMFD shows widespread inadequacy, with at least two indices per zone failing to reach a KGE of 0.5, notably Rx1day and SDII. MSWEP’s performance is characterized by high spatial variability, achieving competence only within Zone 2, but demonstrating a critical failure to reproduce annual maximum daily precipitation (Rx1day) in other zones, as indicated by KGE values as low as 0.07 in Zone 5 and 0.09 in Zone 3. Conversely, CHM_PRE emerges as the most reliable product, maintaining KGE values above 0.5 for all indices across all zones. The marginally lower, yet still acceptable, KGE for the SDII index warrants further investigation and is hypothesized to be a consequence of its enhanced detection of trace precipitation events (0–1 mm), a finding supported by the PDF analysis in Fig. 7. Overall, among the four precipitation products, CHM_PRE exhibits the best capability in detecting extreme precipitation events, followed by CMFD and IMERG, with MSWEP performing the worst.

Table 5 The KGE values of IMERG, MSWEP, CMFD and CHM_PRE in reproducing the extreme precipitation indices in the 5 precipitation zones and the UMHRB.

Discussion

Importance of multi-scale assessment framework

In this study, we established a “basin–precipitation zoning” multi-scale assessment framework and comprehensively evaluated the accuracy of four precipitation products (IMERG, MSWEP, CMFD, and CHM_PRE). When only the basin-wide extreme precipitation indices are considered, all four products show good detection capability (Table 5). However, their KGE values for some indices are lower at the precipitation zoning scale. This suggests that accuracy assessment at the precipitation zoning scale is critical, as it enables a more accurate and refined evaluation of precipitation products. This contrast also underscores the limitations of traditional basin-wide averaging assessments, which may mask the spatial heterogeneity of product performance across sub-regions60. Consequently, adopting a zonal assessment framework is essential, as it provides more refined, scale-relevant, and actionable insights61. Specifically, in regions where extreme precipitation performance is weak (e.g., Zones 3 and 5), the region-dependent errors quantified in this study (such as the low KGE values of MSWEP for extreme precipitation) can be incorporated into assimilation systems in place of global error assumptions. This substitution can improve the quality of the assimilation analysis fields and enhance the reliability of precipitation products. Likewise, in areas exhibiting pronounced systematic biases in heavy precipitation, these findings provide a clear basis for parameter calibration in hydrological models or bias correction of input data, thereby improving the accuracy of runoff simulations and flood forecasting. Furthermore, this “zonal assessment” paradigm is generalizable to other complex terrains or climatic transition zones. It promotes a shift in evaluation methodology from “overall performance description” to “region-specific diagnosis,” which has methodological value for refining regional hydrometeorological research.

Comparison with and advancement beyond previous studies

Extensive research has been conducted to evaluate the performance of various precipitation products from diverse perspectives62,63,64. However, it is well established that the applicability of these datasets varies significantly across regions, spatiotemporal scales, and resolutions. This inherent variability underscores the necessity of selecting and applying precipitation data according to the specific context. Against this backdrop, the results of this study are highly consistent with previous findings. For instance, a recent evaluation of five datasets (NWAFU, HHU, CMFD, HRLT, and CHM_PRE) in the Hangjiahu Plain similarly identified CHM_PRE as the most accurate product at the monthly scale65, which corroborates our conclusions. Furthermore, previous studies have employed CHM_PRE as gauge-based reference precipitation data66. This study finds that CHM_PRE performs best in terms of RMSE, NSE, and CC at the daily scale, indicating superior data quality. However, CHM_PRE tends to overestimate the frequency of non-precipitation events. This bias is likely attributable to interpolation errors inherent to the CHM_PRE methodology, which are particularly pronounced under conditions of trace or no precipitation and are frequently diluted or obscured in area-averaged analyses. This finding provides a specific target for the future refinement of this otherwise reliable dataset.
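For reference, the daily-scale statistics named above (RMSE, NSE, and CC) can be computed from paired product and station series as in the sketch below. It uses the standard textbook definitions; the function names are illustrative and are not taken from the study.

```python
import numpy as np


def rmse(sim, obs):
    """Root mean square error between product and station values."""
    sim, obs = np.asarray(sim, dtype=float), np.asarray(obs, dtype=float)
    return float(np.sqrt(np.mean((sim - obs) ** 2)))


def nse(sim, obs):
    """Nash-Sutcliffe efficiency: 1 is perfect, 0 matches the observed mean."""
    sim, obs = np.asarray(sim, dtype=float), np.asarray(obs, dtype=float)
    return float(1.0 - np.sum((sim - obs) ** 2) / np.sum((obs - obs.mean()) ** 2))


def cc(sim, obs):
    """Pearson correlation coefficient between product and station values."""
    return float(np.corrcoef(sim, obs)[0, 1])
```

Because CC is insensitive to a constant offset while RMSE and NSE are not, a product can score well on CC and still show a substantial bias.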

Regarding the two remote sensing precipitation products (IMERG and MSWEP), MSWEP generally demonstrates superior performance across various spatiotemporal scales59. Consistent with this, our study also found that MSWEP possesses high CC values, which may be attributed to its algorithm. Specifically, its fusion scheme assigns weights to merged data based on the magnitude of CC values67. However, MSWEP also exhibits relatively high BIAS values (Table 3), indicating an overestimation tendency at both the sub-regional and basin scales. In certain sub-regions, the accuracy of MSWEP is inferior to that of IMERG. One possible reason is that MSWEP inherits deficiencies from the individual precipitation estimates during the fusion process. IMERG, on the other hand, utilizes observations from state-of-the-art sensors, including the Dual-frequency Precipitation Radar (DPR) and the conical-scanning multi-channel GPM Microwave Imager68, and additionally incorporates monthly gauge observations69. In the analysis of extreme precipitation detection capabilities, MSWEP shows limited capacity in capturing Consecutive Dry Days (CDD) and maximum 1-day precipitation (Rx1day), which aligns with previous research findings70,71. As indicated by the KGE values in Table 5, MSWEP’s KGE for Rx1day was only 0.07 in Zone 5 and 0.09 in Zone 3. The finding that a product performing well on average can fail severely in key sub-regions is a direct contribution of the zoning framework proposed in this study.
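To make the indices discussed above concrete, the sketch below computes Rx1day, SDII, and CDD from one year of daily precipitation. It assumes the commonly used ETCCDI convention of a 1 mm wet-day threshold; the threshold applied in the study is not restated here, and the function names are illustrative.

```python
import numpy as np


def rx1day(daily_mm):
    """Maximum 1-day precipitation in the series (mm)."""
    return float(np.max(daily_mm))


def sdii(daily_mm, wet_threshold=1.0):
    """Simple daily intensity index: mean precipitation on wet days (mm/day)."""
    daily = np.asarray(daily_mm, dtype=float)
    wet = daily[daily >= wet_threshold]
    return float(wet.mean()) if wet.size else float("nan")


def cdd(daily_mm, wet_threshold=1.0):
    """Longest run of consecutive days with precipitation below the threshold."""
    longest = current = 0
    for value in daily_mm:
        if value < wet_threshold:
            current += 1
            longest = max(longest, current)
        else:
            current = 0  # a wet day ends the current dry spell
    return longest
```

Note that SDII depends on how many days are classified as wet, so a product that reports many additional small amounts near the threshold can shift this index even when its large totals are accurate.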

Uncertainty analysis

We employed the REOF method with data from 42 stations (1975–2018) to delineate five precipitation zones. The resulting zoning, however, is subject to inherent uncertainties, primarily influenced by the temporal domain and spatial scale.
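The zoning idea can be illustrated with a minimal sketch. It assumes that the EOFs are obtained by singular value decomposition of station precipitation anomalies, that the rotation is varimax (a common choice for REOF, not confirmed here as the one used in the study), and that each station is assigned to the mode with its largest absolute rotated loading. All function names are hypothetical.

```python
import numpy as np


def varimax(loadings, max_iter=100, tol=1e-6):
    """Orthogonally rotate a (stations x modes) loading matrix with varimax."""
    n_rows, n_cols = loadings.shape
    rotation = np.eye(n_cols)
    criterion = 0.0
    for _ in range(max_iter):
        rotated = loadings @ rotation
        column_ss = np.diag((rotated ** 2).sum(axis=0))
        gradient = loadings.T @ (rotated ** 3 - rotated @ column_ss / n_rows)
        u, s, vt = np.linalg.svd(gradient)
        rotation = u @ vt
        new_criterion = s.sum()
        # Stop once the criterion no longer increases appreciably.
        if criterion != 0.0 and new_criterion / criterion < 1.0 + tol:
            break
        criterion = new_criterion
    return loadings @ rotation


def reof_loadings(data, n_modes):
    """Rotated EOF loadings from a (time x stations) precipitation matrix."""
    anomalies = data - data.mean(axis=0)
    _, s, vt = np.linalg.svd(anomalies, full_matrices=False)
    # Scale each leading EOF by the standard deviation it explains.
    loadings = vt[:n_modes].T * (s[:n_modes] / np.sqrt(len(data) - 1))
    return varimax(loadings)


def assign_zones(loadings):
    """Assign each station to the mode with its largest absolute loading."""
    return np.argmax(np.abs(loadings), axis=1)
```

In this sketch, the number of retained modes (`n_modes`) directly sets the maximum number of zones, which is one reason the record length and domain size discussed below affect the outcome.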

The sensitivity to the temporal domain is evident when comparing our 5-zone division (1975–2018) with a 3-zone division derived from a shorter period (2001–2018) (Table 6). The longer time series yields a more stable and detailed partition, subdividing the middle and lower reaches into finer zones. By comparing Figs. 2 and 8, we can see that the 3-zone division divides the study area into upper, middle, and lower zones, whereas the 5-zone division further subdivides the middle and lower reaches into four specific zones: East, South, Southeast, and Northeast. This dependency is echoed in other studies of Hunan Province, where different research periods led to varying zoning outcomes (e.g., Bi et al., 2021; Zhang et al., 2013).

Table 6 Contribution and cumulative contribution of the REOF decomposition during 2001–2018.
Fig. 8 REOF modal distribution of precipitation, (a) REOF 1, (b) REOF 2, and (c) REOF 3.

Furthermore, the spatial scale significantly impacts the zoning72. This is illustrated by the varying number of zones identified in regions of different sizes: four zones in the vast Heilongjiang Province (470,000 km²)73, ten in the Huai River Basin (270,000 km²)53, and five in our more compact study area. This comparison indicates that zoning results are scale-dependent. Extending the analysis to much larger domains, such as hemispheric or global scales, necessitates longer time series and can also lead to EOF modes characterized by multiple active centers74,75,76.

To mitigate the influence of confounding factors, we employed rigorous data analysis and stringent quality control measures, thereby minimizing uncertainties to the greatest extent possible with the available data. Future research should incorporate newly available, extended time series of station-based precipitation to further validate and build upon the conclusions presented here.

Conclusions

We comprehensively evaluate four precipitation products (IMERG, MSWEP, CMFD, and CHM_PRE) using data from 42 meteorological stations in the upper and middle reaches of the Hanjiang River Basin (UMHRB). The assessment focuses on three aspects: daily-scale accuracy, probability of precipitation occurrence, and capability of extreme precipitation detection, at both zonal and basin scales. The main conclusions are as follows:

  1. CHM_PRE demonstrates superior performance in daily statistical accuracy and extreme precipitation detection capability, making it the most reliable product for the region. However, it tends to overestimate the frequency of light precipitation events (< 0.1 mm/d), indicating a deficiency in monitoring light rain and suggesting that further algorithm optimization is needed to improve the simulation of precipitation occurrence probability. In contrast, although MSWEP exhibits the weakest capability in detecting extreme precipitation, it excels in reproducing the probability density distribution across various rainfall intensities, highlighting its strength in capturing the full spectrum of precipitation probabilities.

  2. The precipitation zoning analysis reveals significant spatial heterogeneity in product performance, a feature often masked by basin-scale assessments. Compared to Zone 1, all products exhibited relatively poor performance in Zone 2 (with MSWEP showing overestimation while others underestimated), yet CHM_PRE maintained its robustness across all zones. The ranking of extreme precipitation detection capability remains consistent at both zonal and basin scales: CHM_PRE is the most capable, followed by CMFD and IMERG, with MSWEP being the least effective.

  3. The “basin–precipitation zoning” multi-scale assessment framework established in this study effectively reveals the spatial heterogeneity of product performance. It specifically demonstrates that products performing well at the basin scale may exhibit significant degradation at the zonal scale, particularly in extreme precipitation detection. This underscores the necessity of conducting evaluations based on climate-consistent zones, providing a more refined reference basis for the selection and application of precipitation products.