Introduction

Hailstorms, among the most intense deep convective systems on Earth, play important roles in the hydrological cycle and the general circulation of the atmosphere. Hailstones formed in hailstorms can be damaging and affect society in many areas of the world1. Therefore, understanding how and why hailstorm occurrence has varied in the past and predicting how it will change in the future are crucial to disaster management. Recently, changes in hailstorm trends associated with anthropogenic climate warming have drawn considerable attention from research communities worldwide2. The main challenges are the spatial and temporal inadequacy of monitoring systems for such small-scale weather phenomena and the limited knowledge of the microphysics of local-scale deep convection and of the link between thermal dynamics and microphysics3,4. As a result, changes in hailstorm occurrence in a warming climate were assessed with low confidence in both the IPCC AR5 and AR6 reports4,5.

Hailstorm observations in the past have displayed considerable heterogeneity, which makes studying historical hail trends from a global perspective exceedingly challenging. Within a particular region, hail observations from various sources may show different trends. For example, observations of United States hailstorms from public reports revealed a significant increase in frequency over the latter half of the twentieth century, whereas data from meteorological stations revealed no significant trend6,7. The artificial increase in reporting is thought to be caused by population growth7. Although meteorological station observations are continuous and coherent in several countries in eastern Asia, and a robust decrease in hailstorm days and occurrence has been reported over the past several decades8,9,10,11,12,13,14,15, the coarse spacing of the meteorological stations has resulted in a lack of spatial representativeness. Most importantly, these observations are limited to the last half century, and the data period is too short to reveal the variations/changes in hailstorms associated with climate warming since the Industrial Revolution.

China is a country with a long history of >5000 years. The historical records in China extend back to the Yin dynasty 3000 years ago16. Many dynastic histories, local chronicles, agricultural records, geographic monographs, personal travelogues, and other archival materials are available. Natural disasters, such as floods, earthquakes, and hail damage, have had significant social impacts and have therefore been recorded16,17. The very first hailstorm damage record dates to 886 BC (Fig. 1a); however, regular local chronicles started in the Han dynasty at approximately 200 BC. These valuable, long-term hail damage records provide an opportunity to extend the analysis of hailstorm occurrence back to 200 BC. In this work, we introduce a long-term hail damage record dataset and analyse the variations in hailstorm occurrence after 1500, covering both the pre- and post-Industrial Revolution periods, as data after 1500 are thought to be more reliable than earlier records. We also project the future occurrence of hailstorm days at the yearly scale with a convolutional neural network model under different climate change scenarios (RCP 2.6, RCP 4.5, RCP 6.0 and RCP 8.5 from the CMIP5 multimodel ensemble mean).

Fig. 1: Long-term records of hailstorm days in China.

a Ancient hail damage records from 886 BC to 1100 (grey bars), with key historical markers at 886 BC and the Han dynasty indicated by a pink triangle and a blue star, respectively. b Ancient damage records from 1100−1948 (grey bars); the black solid line shows the 20 year smoothed time series. The beginning of the Ming dynasty is marked by a yellow diamond. c Hail damage data from 1949−2000 (grey shading); coloured lines represent CMA (China Meteorological Administration) station records for hailstorms with maximum hailstone sizes ≥2 mm (purple), ≥5 mm (yellow), ≥7 mm (orange), and ≥10 mm (blue). The grey solid line shows the 20 year smoothed hail damage data.

Results

Variations in hailstorm occurrence

Figure 1 shows the interannual variations in the number of hailstorm days derived from ancient damage records from 886 BC to 1948 AD (bars in Fig. 1a, b), current damage records from 1949−2000 (grey shading in Fig. 1c), and meteorological station records from 1949−2000 (coloured solid lines in Fig. 1c). The territory of China changed significantly prior to the Ming dynasty (which ruled China beginning in 1368 AD18). Before 1500 AD, the annual number of hailstorm damage days was typically fewer than 10, and the hail damage records were usually discontinuous (Fig. 1a, b). After 1500 AD, the hail damage records are considered relatively reliable, because the territory of China changed little after the start of the Ming dynasty. After 1949, the hailstorm days identified from hail damage records (grey shading and grey solid line in Fig. 1c) show good agreement with meteorological station observations. This agreement is particularly evident for hailstones with a diameter of 7 mm or larger (orange solid line in Fig. 1c) after 197913. These findings suggest that hail damage data are a reliable basis for studying hailstorm activity, which is consistent with the findings of previous studies17,19. Considering that the ancient and current hail damage datasets were recorded in different systems, we focus on analysing the features of hailstorm damage days on the basis of ancient hail damage data from 1500 to 1948.

The annual variations in hailstorm damage days during this period (Figs. 1b, 2) display two characteristics. First, there was a significant increasing trend from the beginning of the nineteenth century. Specifically, the number of hailstorm damage days increased from 10 days per year in the 1700s to 60 days per year in the 1940s, an average rate of increase of 3.9 days per decade. The increase during the 1900s−1940s, at an average rate of 8.3 days per decade, was more than double that during the 1850s−1890s (Fig. 2). Second, multidecadal and quasicentennial variations were obvious. A power spectrum analysis of hailstorm damage days suggested that a 90 year cycle is the dominant period, and this result is significant against a red-noise background at the 95% confidence level; moreover, a 56 year cycle can be identified as the second-most-dominant period (Fig. S4).

Fig. 2: Long-term variations in hailstorm damage days, population, and temperature from 1500 to 1948.

a Observed hailstorm damage days (light grey solid line) and population-detrended hailstorm damage days (dark grey solid line). The brown line represents the nonlinear trend component of population-detrended hailstorm damage days decomposed via the CEEMDAN method. The orange line shows the nonlinear trend, quasicentennial variability (QCV), and multidecadal variability (MDV) components. b Time series of the population (blue dotted line), temperature anomaly (light green line), and the nonlinear temperature trend component (dark green line), which is also derived from the CEEMDAN method.

Previous studies have demonstrated that population growth may be associated with the increase in the yearly number of hail reports in the United States, thus increasing the number of recorded hailstorm days7,20. To eliminate the possible artificial effects of population growth on the number of records, the population effect was first removed via a regression method21,22 (for more details, see the Methods). Figure 2 shows the population-detrended number of hailstorm days (referred to as the number of hailstorm days in the following text; dark grey solid line), which is used in the following analysis.

In general, long-term variations in the number of hailstorm days after population effects are removed still show a clear increase after the 1850s. In addition, the power spectrum analysis reveals two cycles at the quasicentennial and multidecadal scales (orange line in Fig. S4). These results suggest that hailstorm activity exhibits quasicentennial and multidecadal cycles, in addition to an increasing trend after 1500. We next explore what may have caused these changes and variations in hailstorm days.

Influence of temperature changes

Current knowledge suggests that favourable environments for hailstorm formation combine three factors: high convective available potential energy (CAPE), which provides the energy for convective circulation; abundant moisture, which provides supercooled water for hailstone growth; and strong vertical wind shear, which contributes to convective processes23,24 (although the three need not occur simultaneously). Since long-term records of wind are not available, we show the possible influence on hailstorm occurrence in China of surface temperature changes, which are related to convective instability and atmospheric moisture, the first two factors listed above.

Five long-term temperature datasets24,25,26,27,28 are widely used by the climate community5. These five datasets differ slightly dating back to 1500 (Fig. S2). As the temperature in the global Mann dataset has the highest correlation coefficient with the number of hailstorm days, the following discussion is based on the Mann dataset (Fig. 2).

Figure 2 shows that the global mean temperature changed only slightly before 1850; it increased by approximately 0.8 °C from 1850 to 1948, and most of the temperature increase occurred in the last 50 years of that period. Correspondingly, the number of hailstorm days increased from 20 to 60 days per year between 1850 and 1948, and most of the increase occurred in the last 50 years of the period (Fig. 2).

To quantify the influence of the global mean surface temperature on the variations in the number of hailstorm days in China, we further used the complete ensemble empirical mode decomposition with adaptive noise (CEEMDAN) method to obtain the features (Figs. S5 and S6), such as different timescale variabilities and/or nonlinear trends, of hailstorm days and temperature. Notably, CEEMDAN can be used to decompose the time series of a variable into different time scales without any prior subjective criteria29,30,31. The trend of hailstorm days is significantly correlated with that of the global mean temperature (Mann dataset), with a coefficient of 0.99 after 1800, indicating a high possibility that the increase in the number of hailstorm days was caused by an increased global temperature.

In addition to long-term trends, variability at different timescales was observed in the number of hailstorm days, implying that other climate factors may contribute to changes in hailstorm trends. As this research focuses on variations in the number of hailstorm days at long time scales, determining the climate factors that dominate the variations in hailstorm days at decadal and quasicentennial scales is important.

Influence of the PDO

Previous studies have suggested that multidecadal-scale precipitation and surface temperature in China are largely impacted by the Pacific Decadal Oscillation (PDO) and North Atlantic Oscillation (NAO)32,33,34,35,36,37,38,39,40. To further assess the possible influences of the PDO and NAO on hailstorm trends, we used the CEEMDAN method to analyse their variations at different time scales (Figs. S7 and S8). At the quasicentennial scale, there is a significant negative correlation between the PDO and the number of hailstorm days, especially before 1850. At the multidecadal scale, the correlation between the PDO and the number of hailstorm days is positive and significant at the 95% confidence level (Fig. 3). Furthermore, the correlation between the PDO and the number of hailstorm days is significant, with a coefficient of −0.96 (0.57) for the quasicentennial variation (multidecadal variation) if an 80−100 (50−70) year fast Fourier transform bandpass filter is applied (Fig. S9b). However, the NAO is not significantly correlated with the number of hailstorm days at either the quasicentennial scale or the multidecadal scale (Fig. 3).
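The band-pass and correlation step can be sketched as follows. This is a minimal illustration in plain Python, assuming annual data and a rectangular (keep-or-zero) mask on an unpadded discrete Fourier transform; the function names `dft`, `bandpass` and `pearson` are hypothetical, the input series is synthetic, and the windowing, padding and significance testing actually used are not described in this section.

```python
import cmath
import math


def dft(x):
    """Naive discrete Fourier transform (O(n^2); fine for a few hundred points)."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]


def bandpass(x, min_period, max_period):
    """Keep only Fourier components whose period lies in [min_period, max_period].

    Periods are in samples (years for annual data). The mean (k = 0) is removed.
    """
    n = len(x)
    spectrum = dft(x)
    for k in range(n):
        k_folded = min(k, n - k)  # fold negative frequencies onto positive ones
        period = n / k_folded if k_folded else math.inf
        if not (min_period <= period <= max_period):
            spectrum[k] = 0.0
    # Inverse transform; the symmetric mask keeps the result real.
    return [sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n)
                for k in range(n)).real / n
            for t in range(n)]


def pearson(a, b):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


# Synthetic example: a 90 year cycle plus a faster 10 year cycle.
n = 450
slow = [math.sin(2 * math.pi * t / 90) for t in range(n)]
fast = [0.5 * math.sin(2 * math.pi * t / 10) for t in range(n)]
series = [s + f for s, f in zip(slow, fast)]

filtered = bandpass(series, 80, 100)          # isolates the 90 year cycle
r = pearson(filtered, [-s for s in slow])     # anti-phased reference series
```

In this unpadded sketch, an 80−100 year band in a record of roughly 450 years spans very few Fourier bins, so the filtered series is close to a single sinusoid.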

Fig. 3: Variability in quasicentennial and multidecadal components of hailstorm day and climate indices.

Solid lines represent the quasicentennial variability (QCV) components, and dashed lines represent the multidecadal variability (MDV) components. a Hailstorm day variability, b Pacific Decadal Oscillation (PDO) variability, c North Atlantic Oscillation (NAO) variability.

Previous research indicated that quasicentennial and multidecadal oscillations were dominant in the PDO before and after the 1850s, respectively41. The quasicentennial cycle corresponds to the solar irradiance fluctuation (the Gleissberg cycle), and the multidecadal oscillation tends to be associated with global warming41. However, we cannot rule out the effects of different observational sources between the preindustrial and post-industrial periods as well as the low confidence in decadal PDO records prior to 19504.

The different relationships between the PDO and the number of hailstorm days at different time scales may be associated with different atmospheric circulations and climatic backgrounds during these periods. During the Little Ice Age (LIA; 1450−1850), when quasicentennial variations dominated both the PDO and hailstorm trends, widespread negative precipitation anomalies were found from southern to northern China when the PDO was in a positive phase39. Previous studies have demonstrated that northerly wind anomalies dominate eastern China when the PDO is in the positive phase, resulting in a relatively dry and cold environment that inhibits precipitation34,39,40, which is unfavourable for hailstorm occurrence because of the relatively low CAPE and limited amount of precipitable water (PW). Thus, the correlation between the PDO and the number of hailstorm days was negative (Fig. S10). After 1850, with increasing temperature, multidecadal variations with the PDO became dominant41; during the positive PDO phase, abnormal southerly winds transported moisture to eastern China42, promoting the occurrence of hailstorms due to relatively high CAPE and PW levels (Fig. S10). Our analysis also revealed variations in the location and intensity of the PDO before and after 1850, which may deserve further analysis in the future.

Future projections

Given that the nonlinear trend and quasicentennial and multidecadal variations in hailstorm days are influenced mainly by warming and the PDO, a one-dimensional convolutional neural network (CNN) model43,44,45,46 was developed to project the number of hailstorm days in the future. We chose the period of 1500−1948 as the training period and 1950−2099 as the prediction period. Trend and oscillation predictions were conducted separately. The model reasonably captures the changes in hailstorm trends between the 1950s and 1970s and between the 1980s and 2000s, respectively, when the PDO was in negative and positive phases (Fig. 4). This finding further suggests that the CNN model provides an accurate projection of the number of hailstorm days in China.

Fig. 4: Historical and projected trends of hailstorm days in relation to temperature anomalies and the PDO.

Annual hailstorm days are shown as light grey lines, with 50 year smoothed trends in dark grey for historical (pre-1950, left Y-axis) and current (1950−2000, right Y-axis) periods. Coloured solid lines from 1960−2000 represent projections (post-2000, right Y-axis) based on observed temperature anomalies and the Pacific Decadal Oscillation (PDO) index. Coloured solid lines from 2000−2100 show future projections under RCP scenarios: RCP-8.5 (red), RCP-6.0 (orange), RCP-4.5 (light blue), and RCP-2.6 (dark blue). Shading around projections indicates ±1 standard deviation from 200 ensemble training runs. The PDO index is shown as shading, with data from 1870 to 2013 based on HadISST1.1, and from 2014 onward derived from CMIP5 models47. Temperature anomalies are shown as dashed green lines (10 year low-pass filtered), with different shades representing the four RCP scenarios. Data from 1880 to 2000 are from NASA/GISS, and data from 2000 to 2100 are from the CMIP5 multimodel ensemble mean.

In the future, the global mean surface temperature is projected to increase by 0.8, 1.7, 2.2, and 4.0 °C under RCP 2.6, RCP 4.5, RCP 6.0 and RCP 8.5, respectively (green lines in Fig. 4). Models in the Coupled Model Intercomparison Project phase 5 (CMIP5) project a high possibility of a positive PDO from 2040 to 208047. The number of hailstorm days in China is expected to increase substantially from now to the 2060s (Fig. 4). Specifically, this number would increase from 300 days in 2017 to 620 days in 2072 under the RCP 8.5 scenario, an increase of 107%. The increases would be 76%, 75%, and 59% under the RCP 6.0, RCP 4.5 and RCP 2.6 scenarios, respectively. The CNN model also projects a decrease in the number of hailstorm days after the 2070s, presumably because of the PDO phase change. A sensitivity test was conducted using a synthetic PDO dataset in which the positive phase was manually extended until 2100. The results show that without a PDO phase change after the 2070s, the predicted decrease in hailstorm days does not occur. In all scenarios, the number of hailstorm days would not return to the value of 241 per year observed in the 20th century.
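As a check on the arithmetic, the RCP 8.5 percentage follows directly from the two values quoted above:

$$\frac{620-300}{300}\times 100\% \approx 107\%$$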

Discussion

Studies of changes in hailstorm days in the context of climate warming are largely limited by the lack of observations. In China, reliable meteorological observation stations have been established since the middle of the last century, a period not sufficient for evaluating long-term responses. The long-term hailstorm damage records from ancient China thus provide us with valuable information for analyses. Notably, analyses of long-term hailstorm damage data suggested that the number of hailstorm days increased by approximately 3.9 days per decade in China with warming from 1850−1948. In addition, the PDO modulated regional hailstorm activity at the quasicentennial (80−100 years) and multidecadal (50−70 years) scales before and after 1850, respectively. A CNN model setup based on hailstorm damage data from 1500−1948 projected that hailstorm days are expected to increase as warming continues, largely in the next 40−50 years. Our results show that the number of hailstorm days at the end of this century will not return to the levels observed in the 20th century. Therefore, the challenge of disaster prevention related to hailstorms will increase as the 21st century continues.

Uncertainty remains in our study due to the exclusion of factors such as urbanization, aerosols, and land-use changes. Urbanization can increase surface temperatures and atmospheric instability, potentially intensifying the updrafts necessary for hail formation. Aerosols influence cloud microphysics and may either suppress or invigorate hailstorms depending on their type and concentration. Land-use changes, such as deforestation or agricultural expansion, modify surface moisture and temperature gradients, thereby altering atmospheric instability and storm dynamics. Overall, these anthropogenic factors likely play a role in hailstorm occurrence, though their effects are highly dependent on local environmental contexts and feedback mechanisms. In addition, advances in communication technology and changes in observation systems over time may have influenced the frequency of recorded hail events48,49. Further research should explore how these factors contribute to trends in hailstorm occurrence.

Methods

Data

We use three hail datasets in this study: (1) Ancient hail damage records (886 BC-1948 CE; Fig. 1a, b). These data are documented in the traditional Chinese lunisolar calendar and give the year and month of hail events, while only a few entries specify the exact day. The Chinese lunisolar calendar is based on both lunar and solar motions and is designed to align seasonal variations (governed by the solar cycle) with the lunar year. Each month consists of either 29 or 30 days, resulting in a year length of ~354 or 355 days, about 11 days shorter than a solar year. To compensate for this discrepancy, a leap month is introduced into the Chinese lunisolar calendar approximately every two to three years. Over the long term, this adjustment causes the Chinese months to lag behind their corresponding Gregorian months by about one month on average50,51,52. Moreover, the ancient hail records have been reassigned to the current administrative provinces, ensuring consistency in spatial resolution with the current records. (2) Current hail damage records (1949−2000; grey shading in Fig. 1c), collected by the Civil Affairs Department of China. These records are based on the solar calendar. (3) Continuous and coherent hailstorm records from more than 2000 meteorological stations (1950−2010; coloured solid lines in Fig. 1c). This dataset is provided by the China Meteorological Administration9,13,15 and is based on the solar calendar. The meteorological stations are located in each administrative county shown in Fig. S1. In addition to the occurrence of hailstorms, the maximum size of hailstones was also regularly recorded in meteorological station reports after 198012 (Fig. 1c). Hailstone size appeared only occasionally in ancient and current hail damage reports; thus, only hail size data from meteorological station reports after 1980 were used in this study.
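The calendar figures above can be checked with a short calculation. The mean lunar (synodic) month of about 29.53 days and the solar year of about 365.24 days are standard astronomical values assumed here; they are not given in the text.

$$12 \times 29.53 \approx 354.4\ \text{days}, \qquad 365.24 - 354.4 \approx 10.9\ \text{days per year}, \qquad \frac{29.53}{10.9} \approx 2.7\ \text{years per leap month}$$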

The population datasets used to remove potential artificial effects include: (1) Historical population data (pre-1949), collected from a previous study53, with irregular time intervals. (2) Current population data (post-1949) at yearly intervals, obtained from the National Bureau of Statistics of China (https://data.stats.gov.cn).

For temperature before 1949, five historical datasets from the National Climatic Data Center (NCDC)24,25,26,27,28 are used. The spatial and temporal coverages of the data are listed in Table S1. The annual temperature time series is shown in Fig. S2. Among the five datasets, Mann’s data cover the globe, whereas the other four cover the Northern Hemisphere. For the current period (1949−2000), temperature data are obtained from NASA/GISS (https://climate.nasa.gov).

Three PDO datasets are used for different purposes. To analyse the influence of PDO on the variabilities of ancient hailstorm days, we use a reconstructed annual PDO index (1470−2000), which has been validated against other reconstructed PDO indices and observational data, confirming its reliability41,54,55,56,57,58. For the current period (1950−2013), the PDO index is obtained from HadISST 1.1. Future PDO projection is directly sourced from the CMIP5 composite model output provided in a previous study47. This future PDO index is constructed based on the statistical synthesis of selected CMIP5 simulations and validated using historical observations to ensure consistency in phase transitions and early warning signals.

The NAO data used in this paper for 1500−1949 are from the NCDC, contributed by Ortega et al.59.

Definition of hailstorm days

A hailstorm damage day is defined as a day in which there is at least one hail damage record (ancient or current damage) in a province. A hailstorm day in the current climate is defined as a day in which at least one storm is recorded in a province. It is important to note that a single hailstorm affecting multiple provinces will be counted multiple times. However, given China’s vast geographical extent, hailfall in different provinces on the same day can also be caused by different hailstorm systems. Considering such cases, we used this method to ensure consistency and reliability in defining hail days within our dataset.
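The counting rule can be sketched as follows, assuming each damage report has already been reduced to a (year, month, day, province) tuple. The function name `count_damage_days` and the example records are hypothetical and serve only to show how duplicates and multi-province events are treated.

```python
def count_damage_days(records):
    """Count hailstorm damage days per year.

    `records` is an iterable of (year, month, day, province) tuples, one per
    damage report. Repeated reports for the same day and province collapse
    into one damage day; the same day in two provinces counts twice.
    """
    unique_days = set(records)
    counts = {}
    for year, _month, _day, _province in unique_days:
        counts[year] = counts.get(year, 0) + 1
    return counts


# Invented example records (not taken from the archive).
records = [
    (1875, 6, 3, "Shanxi"),
    (1875, 6, 3, "Shanxi"),  # duplicate report: same province, same day
    (1875, 6, 3, "Hebei"),   # same day, different province: counted again
    (1875, 7, 1, "Hebei"),
    (1876, 5, 9, "Gansu"),
]
damage_days = count_damage_days(records)
```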

Data agreement analysis

The agreement between datasets was tested before the analysis. After 1949, far more hailstorm days were noted in the meteorological station reports than in the current hail damage reports (Fig. 1c). However, hail damage is recorded only when damage occurs, which is largely related to the size of the hailstones. The interannual variations in the number of hailstorm days from current hail damage reports were therefore compared with those from meteorological station records for different hailstone size thresholds. The variations in the number of hailstorm days with hailstone sizes equal to or greater than 7 mm agree well with those from damage reports after 1980 (Fig. 1c), with a correlation coefficient of 0.85. The damage caused by hailstones is related to their kinetic energy when they reach the ground, which is a function of hailstone size and can be estimated as

$${{\rm{KE}}}=0.0243{{{\rm{D}}}}^{3.55}$$
(1)

where D is the diameter of the hailstone and KE is the kinetic energy60.
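To illustrate how steeply Eq. (1) scales with size, the sketch below evaluates the relation for two diameters. The function name `hail_kinetic_energy` is hypothetical, and the units of D and KE are not stated in this section, so only the ratio between two sizes is interpreted.

```python
def hail_kinetic_energy(diameter):
    """Evaluate Eq. (1): KE = 0.0243 * D**3.55.

    Units follow the cited source (ref. 60) and are not restated here.
    """
    if diameter < 0:
        raise ValueError("diameter must be non-negative")
    return 0.0243 * diameter ** 3.55


# The prefactor cancels in a ratio, so the result depends only on the exponent
# (and is unit-independent as long as both diameters share the same unit).
ratio_10_to_7 = hail_kinetic_energy(10.0) / hail_kinetic_energy(7.0)
```

Under this relation, a hailstone with a diameter of 10 carries roughly three and a half times the kinetic energy of one with a diameter of 7.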

The numbers of hailstorm (damage) days from the three datasets all show significant seasonal variations, with one peak in the warm season (Fig. S3). Both the current hail damage record (1949−2000) and the CMA (China Meteorological Administration) records show a peak in June. However, in the ancient hail damage record, the peak appears in May, approximately one month earlier than in the current records. Since the ancient hail damage records were based on the lunisolar calendar, whose months generally lag behind the corresponding solar months by about one month, this one-month shift in peak timing is expected. The overall agreement in seasonal variation between the hail damage records and CMA observations suggests that valid comparisons between the current and historical hailstorm trends can be made. It is worth noting that differences in record locations may also play a role in seasonal cycle differences, since different regions have different seasonal patterns. However, this influence is relatively limited in our case, as the ancient hail damage data have already been reassigned to the current administrative provinces.

Detrend method

The linear regression method was used to estimate the possible influence of the population on hail damage days and remove the signals associated with artificial impacts21,22. The linear relation can be estimated as:

$${{\rm{H}}}({{\rm{t}}})={{{\rm{H}}}}_{0}+{{\rm{\beta }}}{{\rm{P}}}({{\rm{t}}})+{{\rm{R}}}({{\rm{t}}})$$
(2)

where H(t) is the number of hailstorm damage days; H0 is a constant; β is the regression coefficient; P(t) is the population; and R(t) is the residual number of hailstorm days.
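A minimal sketch of Eq. (2) using ordinary least squares is given below. The function names `fit_line` and `population_detrend` and the input values are hypothetical, and whether the constant H0 is added back to the residual series is not specified here, so the sketch returns the residual R(t) only.

```python
def fit_line(x, y):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def population_detrend(hail_days, population):
    """Return (H0, beta, residuals) for H(t) = H0 + beta * P(t) + R(t)."""
    h0, beta = fit_line(population, hail_days)
    residuals = [h - (h0 + beta * p) for h, p in zip(hail_days, population)]
    return h0, beta, residuals


# Made-up numbers for illustration only (not from the dataset).
population = [100.0, 120.0, 150.0, 200.0, 260.0]
hail_days = [12.0, 13.0, 17.0, 20.0, 27.0]
h0, beta, residuals = population_detrend(hail_days, population)
```

By construction, the residuals have zero mean and are uncorrelated with the population series, which is the property the detrending relies on.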

Decomposition method

The complete ensemble empirical mode decomposition with adaptive noise (CEEMDAN) method can be used to decompose the time series of a variable into components at various time scales. This algorithm is a development of the ensemble empirical mode decomposition (EEMD)29 and empirical mode decomposition (EMD)61 algorithms. It can overcome the problem of “mode mixing” and provide a more stable and exact reconstruction of the original signal at a lower computational cost than those methods. In this study, the standard deviation of the Gaussian random numbers used as additional noise was set to 0.2. The number of realizations was 500, and the maximum number of sifting iterations allowed was 5000. The detrended hailstorm day records, temperature anomalies, and PDO and NAO records were decomposed into several intrinsic mode functions (IMFs), and the results are shown in Tables S2, S3 and S4. According to the mean periods of these modes, we obtained five major components: the interannual variation (mean period less than 10 yr, summing IMF1, IMF2, and IMF3), the interdecadal variation (mean period between 10 and 50 yr, summing IMF4 and IMF5), the multidecadal variation (MDV; IMF6), the quasicentennial variation (QCV; IMF7), and the nonlinear trend. The five major sets of time scale decomposition results are shown in Figs. S5, S6 and S7.
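The grouping of modes by mean period can be sketched as follows. How the mean periods in Tables S2−S4 were computed is not stated here; this sketch assumes a simple peak-counting estimate, and the names `mean_period` and `timescale_label` are hypothetical. The decomposition itself is not reproduced; synthetic sinusoids stand in for IMFs.

```python
import math


def mean_period(imf):
    """Estimate the mean period of an oscillatory component, in samples.

    Uses the number of interior local maxima: length / number of peaks.
    Returns infinity for a component without peaks (for example, a trend).
    """
    peaks = sum(1 for i in range(1, len(imf) - 1)
                if imf[i - 1] < imf[i] >= imf[i + 1])
    return len(imf) / peaks if peaks else math.inf


def timescale_label(period):
    """Assign the period thresholds quoted in the text (10 and 50 years)."""
    if period < 10:
        return "interannual"
    if period <= 50:
        return "interdecadal"
    return "longer than interdecadal"


# Synthetic stand-ins for IMFs of an annual series of 450 years.
n = 450
fast_mode = [math.sin(2 * math.pi * t / 5) for t in range(n)]
slow_mode = [math.sin(2 * math.pi * t / 25) for t in range(n)]
trend = [0.01 * t for t in range(n)]

labels = [timescale_label(mean_period(m)) for m in (fast_mode, slow_mode, trend)]
```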

Future projections method

The CEEMDAN-CNN-BiLSTM model was applied to conduct future projections of the number of hailstorm days. The CEEMDAN method was first used to decompose hailstorm days, temperatures, and PDO records into different mode components. Then, the trend component and oscillation (oscillation in this section refers to the summation of the MDV and QCV) component of hailstorms were separately predicted with a one-dimensional (1D) convolutional neural network (CNN) machine learning method43,44,46,62 and a 1D CNN-based bidirectional long short-term memory (BiLSTM) parallel model with an attention mechanism63. The nonlinear trend of the global temperature anomaly was chosen as the training feature for the prediction of the hailstorm trend, whereas the PDO oscillation was used as the training feature for the prediction of the oscillation in the number of hailstorm days. Compared with 2D CNNs, 1D CNNs are preferable for processing 1D signals46. Furthermore, to better capture the hidden features of the oscillations and produce better predictions, the coupled CNN-BiLSTM-Attention model63 was applied. These methods were chosen on the basis of trial and error tests.

Data before 1949 were used to train the machine learning model. The data were separated randomly into training and test sets at a ratio of 8:2. For the 1D CNN model used in trend prediction, we designed a convolutional layer with 32 filters and a kernel size of 1, followed by a flattening layer and a fully connected dense layer. The model was compiled using the Adam optimizer with a learning rate of 0.001 and exponential decay rates β1 and β2 of 0.9 and 0.999, respectively. The loss function was the mean squared error. The model was trained for up to 1000 epochs with a batch size of 16. To prevent overfitting and improve computational efficiency, early stopping was implemented: training was terminated if the validation loss showed no improvement for 20 consecutive epochs. We also employed five-fold cross-validation to evaluate model robustness. The best-performing model from each fold was saved so that only the best model was retained.
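The early-stopping rule can be illustrated independently of any deep learning framework. The sketch below is not the training code used in this study; `early_stopping_epoch` is a hypothetical helper that takes a sequence of validation losses and reports when a patience of 20 epochs would end training.

```python
def early_stopping_epoch(val_losses, patience=20):
    """Return the 1-based epoch at which training would stop.

    Training stops once the validation loss has failed to improve on its
    best value for `patience` consecutive epochs; if that never happens,
    the last epoch is returned.
    """
    best = float("inf")
    epochs_without_improvement = 0
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best:
            best = loss
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                return epoch
    return len(val_losses)


# Synthetic loss curve: improves for 30 epochs, then plateaus at a worse value.
losses = [1.0 / e for e in range(1, 31)] + [0.05] * 100
stop_epoch = early_stopping_epoch(losses, patience=20)
```

With this synthetic curve the best loss occurs at epoch 30, so training would end 20 epochs later, well short of the 1000-epoch cap.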

For the CNN-BiLSTM-Attention model used in oscillation prediction, we first designed a 1D convolutional layer with 32 filters and a kernel size of 1, similar to the structure in the trend prediction model. Then, a BiLSTM layer consisting of 32 hidden units was used to process the features extracted by the convolutional layer, capturing temporal dependencies in both the forward and backward directions. To avoid overfitting, a dropout layer with a rate of 0.5 was applied after the BiLSTM layer. An attention mechanism was then applied to dynamically weight important time steps, enhancing interpretability. Finally, a fully connected dense layer with a linear activation function was used for regression. The fitting and optimization methods for this coupled model were the same as those used for the 1D CNN model, except that the batch size was set to 8.
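The idea of weighting time steps can be sketched with a generic softmax attention pooling. The exact attention formulation of the model (ref. 63) is not given in this section, so the dot-product scoring below, the function names `softmax` and `attention_pool`, and the example values are all assumptions made for illustration.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(hidden_states, score_weights):
    """Weight time steps by a softmax over dot-product scores, then sum.

    `hidden_states` is a list of per-time-step feature vectors (for example,
    BiLSTM outputs); `score_weights` is a vector of the same dimension that
    would be learned during training in a real model.
    """
    scores = [sum(h_j * w_j for h_j, w_j in zip(h, score_weights))
              for h in hidden_states]
    weights = softmax(scores)
    dim = len(hidden_states[0])
    context = [sum(w * h[j] for w, h in zip(weights, hidden_states))
               for j in range(dim)]
    return weights, context


# Three time steps with two features each (made-up values).
states = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
weights, context = attention_pool(states, [0.0, 0.0])
```

With zero scoring weights every time step receives the same weight, so the context vector reduces to the plain average of the hidden states; non-zero weights shift emphasis toward particular time steps.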

To reduce the random error that may be caused by randomly splitting the data, the splitting and training processes were repeated 200 times to generate 200 ensemble models for trend and/or oscillation prediction. The ensemble mean R² values were 0.47/0.50 for the trend prediction and 0.78/0.76 for the oscillation prediction for training/testing.
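The bookkeeping around the repeated splits can be sketched as below: a random 8:2 split, the R² score, and the ensemble mean and spread across members. The helper names are hypothetical, the model fitting itself is omitted, and the use of the population standard deviation for the spread is an assumption.

```python
import random
import statistics


def r_squared(observed, predicted):
    """Coefficient of determination, 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


def split_indices(n, train_fraction=0.8, rng=random):
    """Randomly split range(n) into training and test index lists."""
    idx = list(range(n))
    rng.shuffle(idx)
    cut = int(round(train_fraction * n))
    return idx[:cut], idx[cut:]


def ensemble_summary(member_predictions):
    """Per-time-step ensemble mean and standard deviation across members."""
    means = [statistics.fmean(step) for step in zip(*member_predictions)]
    stds = [statistics.pstdev(step) for step in zip(*member_predictions)]
    return means, stds


rng = random.Random(0)  # fixed seed so the example is reproducible
train_idx, test_idx = split_indices(100, rng=rng)

# Two made-up ensemble members predicting two time steps.
means, stds = ensemble_summary([[1.0, 2.0], [3.0, 2.0]])
```

In the full procedure each of the 200 splits would train its own model, and the per-year mean and standard deviation of the resulting predictions would give the central projection and its ±1 standard deviation envelope.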

Data between 1950 and 2099 were used for projection, and we assumed that the influence of population growth remained constant at the mean value between 1950 and 1980. The temperature anomaly between 1950 and 2000 was obtained from NASA/GISS data, and the data from 2000 to 2100 were from the ensemble mean model output of CMIP5. The temperature data were from CMIP5 runs with four different scenarios used (https://esgf-node.llnl.gov/search/cmip5/). The temperature anomaly and PDO index between 1950 and 2000 were first processed with CEEMDAN, and the corresponding trend and oscillation components were used for projection. The trend and oscillation predictions were subsequently summed, and the results are shown in Fig. 4. The prediction results from 1950−2000 were in good agreement with the damage records, suggesting that the CNN-BiLSTM model is capable of accurately predicting the number of future hailstorm days in China.