Introduction

Kenya’s increasing exposure to climate variability is a pressing issue, with erratic rainfall and rising temperatures significantly affecting its key sectors. Agriculture, a backbone of Kenya’s economy1,2, is particularly vulnerable, as unpredictable weather disrupts planting and harvesting cycles, reduces yields, and exacerbates food insecurity. Infrastructure also faces challenges, with extreme weather events such as floods and droughts damaging roads, bridges, and other critical systems. The cumulative effect of these climate-induced challenges undermines the country’s overall economic stability, highlighting the urgent need for robust mitigation and adaptation strategies.

The effects of climate variability are particularly evident in regions like Marsabit, where prolonged droughts and heavy rainfall lead to severe consequences. Droughts reduce water availability, hinder crop growth, and limit pastures, leading to crop failures and livestock losses, exacerbating food insecurity3,4,5. In contrast, intense rainfall causes soil erosion, farmland flooding, and infrastructure damage, imposing significant financial burdens on the government for repairs and diverting resources from development projects.

These recurring events underscore the urgent need for sustainable strategies, such as climate-resilient agricultural practices, improved water management systems, and robust infrastructure design. Investments in early warning systems and community-based adaptation measures are also critical to mitigating the impacts on vulnerable populations.

A deeper understanding of climate variables, such as rainfall and temperature, can be achieved through probability distributions, which provide valuable tools to analyze climate patterns6. Globally, researchers have identified region- and time-dependent distributions for these variables, with models such as GEV, Gamma, log-normal, and Weibull frequently recommended for climatic data. Notable studies include those by Sharma and Singh7, Dzupire et al.8, Athulya and James9, Ozonur et al.10, Ximenes et al.11, Hussain et al.12, Singirankabo and Iyamuremye13 and Agbonaye and Izinyon14. For example, Ximenes et al.11 found Gamma and Weibull to be optimal for monthly precipitation in Northeast Brazil, while Douka and Karacostas15 identified GEV and log-normal as suitable for extreme precipitation in Thessaloniki, Greece. The differences in the probability distributions between the two studies11,15 can be attributed to different geographical locations; Greece is located at \((40^\circ 37'\,\text{N},\ 22^\circ 95'\,\text{E})\) and northeast Brazil at \((34^\circ 47'\,\text{N},\ 48^\circ 45'\,\text{W})\). The two studies also covered different periods; Greece’s data comprised monthly precipitation records from 1988 to 2017, whereas the study on Northeast Brazil used hourly rainfall data from 1947 to 2003. These studies and the summary in Table 1 demonstrate the importance of selecting appropriate probability distributions for accurate climate modeling.

Table 1 Literature results of probability density functions (PDF) fitted to rainfall data.

Extensive research has also been conducted to identify the best-fitting probability distributions for temperature data. Key studies include those by Athulya and James9, Dzupire et al.8, Hasan22, Hossain23, Hussain et al.12 and Ozonur et al.10. These studies have explored various distributions, including the normal, log-normal, Gamma, and Weibull distributions. For instance, Hussain et al.12 identified the Generalized Pareto (GP), Extreme Value (EV), and GEV models as suitable for modeling temperature data. Similarly, Hasan22 employed ten continuous distributions, including the exponential, Gamma, Log-Gamma, Beta, normal, log-normal, Erlang, power function, Rayleigh, and Weibull distributions, with the Beta distribution emerging as the best fit for the temperature data.

This study aims to identify the most appropriate probability distributions for modeling monthly maximum temperatures and total monthly rainfall in Kenya. The analysis is based on a comprehensive data set covering the last 73 years, capturing the impacts of recent climatic changes. By incorporating these extensive and up-to-date data, the study ensures that the models account for evolving climate patterns. For instance, accurate descriptions of climatic data provide a better understanding of the probability distributions of maximum temperatures and total rainfall, which helps capture the frequency and intensity of climatic events, such as heat waves and heavy downpours. These models also enhance predictive capabilities by leveraging historical trends and recent shifts, improving forecasting accuracy and facilitating better preparation for future climatic scenarios. Additionally, by identifying the underlying distributions, the study supports data-driven decision-making, providing a critical foundation for risk assessment and resource allocation in agriculture, water management, and disaster response sectors.

The study makes a significant contribution to modeling climatic events through three key focus areas. First, it provides a comprehensive theoretical framework for understanding and applying statistical distributions in hydrology and climate studies. The framework offers precise definitions of commonly used distributions, facilitating their identification and application to various climatic datasets. It also includes robust parameter estimation methodologies that ensure accurate modeling of climatic variables. Furthermore, the study outlines strategies for selecting extreme values tailored to specific extreme value distributions, enabling the precise focus on significant climatic events.

Second, the research emphasizes the application of GOF tests to identify the most suitable probability distributions for climatic data. Detailed discussions on the implementation of GOF tests enhance the accuracy and reliability of the models. This methodological rigor improves the alignment of models with observed data and bolsters their credibility for practical applications in risk assessment and decision-making.

Lastly, we emphasize the significance of temporal pattern analysis through block size selection, a crucial factor in statistical modeling that directly affects how temporal patterns in climatic data are captured. We conduct a sensitivity analysis to assess the impact of varying block sizes on the GEV distribution. This analysis combines graphical methods, GOF tests, return level estimates for various periods, and confidence intervals. By examining the effect of block size on model performance and on forecasts of extremes, this analysis provides valuable insights into the stability and reliability of the GEV distribution across varying temporal resolutions.

The paper is structured as follows. The “Methods” section provides a detailed description of the data, the procedure for selecting candidate probability distributions, parameter estimation methods, and the implementation of GOF tests, including the combined approach of multiple GOF tests. The “Results and discussion” section presents summary statistics, results from the selection of candidate distributions, findings from the GOF tests, and insights from the sensitivity analysis. Finally, the “Conclusion” section summarizes the key findings and their implications for climate modeling and risk assessment.

Methods

Data

The monthly maximum temperature (Tmax) and total precipitation (Prep) data for Kenya, covering the period 1950–2022, were sourced from the World Bank Climate Change Knowledge Portal24. The precipitation data (Prep), measured in millimeters, represents the total accumulation of monthly rainfall. This provides a comprehensive measure of rainfall intensity and distribution across different months. The temperature data (Tmax), recorded in degrees Celsius, captures the highest daily maximum temperature observed each month, offering valuable insights into extreme temperature events.

Selection of candidate probability distributions

A review of existing literature identified probability distributions commonly applied in hydrological studies: exponential, Gamma, Weibull, log-normal, logistic, Gumbel, GPD, and GEV, as referenced by7,8,9,10,11,12,14,16,17,18,20. Similarly, for temperature data, these distributions, in addition to a normal distribution, were identified as suitable candidates, supported by findings from22 and other related studies. Table 2 describes each probability distribution function. These distributions were selected due to their suitability in modeling skewed, heavy-tailed, or extreme data characteristics commonly found in climatic datasets. The Cullen and Frey graph25 was used to preliminarily assess the shape characteristics of the data, guiding the selection of appropriate distributions for further analysis.

Parameter estimation

In statistical modeling, parameter estimation is essential due to the typically unknown nature of most model parameters. Commonly employed methods include the Method of Moments, L-moments, Maximum Likelihood Estimation (MLE), and LH-moments, as noted in studies by Al Mamoon and Rahman6 and Haddad and Rahman26. In this paper, we employ the MLE method for parameter estimation across the analyzed distributions, as it is one of the most widely applied and robust methods. MLE is favored for its consistency and efficiency, particularly in large samples, as it maximizes the likelihood of the observed data and often yields more reliable results compared to other methods such as Moments, L-moments, and LH-moments, particularly in terms of asymptotic properties. Research, including foundational studies by Fisher27, Zong28 and Naghettini29, has demonstrated that MLE’s variance and bias are comparatively low, thereby enhancing its suitability across a broad range of distributions. These qualities render MLE exceptionally reliable for environmental datasets, including temperature and rainfall measurements, where precision and robustness are critical.
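As a minimal illustration of the MLE principle described above, the sketch below fits two of the candidate distributions whose maximum likelihood estimates have closed forms (exponential and log-normal). The sample values and the function names `fit_exponential_mle` and `fit_lognormal_mle` are hypothetical and are not taken from the study; distributions such as the GEV or GPD require numerical optimization, which is not shown here.

```python
import math

def fit_exponential_mle(data):
    """MLE of the exponential rate: rate_hat = 1 / sample mean."""
    n = len(data)
    total = sum(data)
    rate = n / total
    # Log-likelihood of the exponential density evaluated at the MLE.
    loglik = n * math.log(rate) - rate * total
    return {"rate": rate}, loglik

def fit_lognormal_mle(data):
    """MLE of the log-normal parameters: mean and (biased) std of log(data)."""
    n = len(data)
    logs = [math.log(x) for x in data]
    mu = sum(logs) / n
    sigma = math.sqrt(sum((v - mu) ** 2 for v in logs) / n)
    # Log-likelihood of the log-normal density evaluated at the MLE.
    loglik = (-sum(logs) - n * math.log(sigma)
              - 0.5 * n * math.log(2 * math.pi) - 0.5 * n)
    return {"mu": mu, "sigma": sigma}, loglik

# Hypothetical monthly rainfall totals in mm (illustrative only).
rain_mm = [12.4, 35.9, 50.9, 63.0, 81.9, 120.5, 28.3, 44.1, 210.7, 5.2]
exp_params, exp_ll = fit_exponential_mle(rain_mm)
ln_params, ln_ll = fit_lognormal_mle(rain_mm)
print(exp_params, round(exp_ll, 3))
print(ln_params, round(ln_ll, 3))
```

The returned log-likelihoods are the quantities that enter information criteria such as AIC and BIC.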

Goodness of fit tests

The suitability of each probability distribution was assessed using a suite of GOF tests, including the Kolmogorov-Smirnov (KS), Anderson-Darling (AD), Cramer-von Mises (CvM), and Chi-Square tests. These tests evaluate the alignment between theoretical and empirical data, with KS tests focusing on overall distributional fit15,30, AD and CvM emphasizing tail behavior15,26,31,32,33, and Chi-Square examining frequency alignment19. Additional evaluation was performed using Akaike Information Criteria (AIC) and Bayesian Information Criteria (BIC) to balance model complexity and fit10,12,22,26, along with Root Mean Square Error (RMSE) to quantify predictive accuracy14.
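The sketch below shows how some of these criteria could be computed for a single fitted distribution. It is an illustration under stated assumptions rather than the study's implementation: the sample is hypothetical, the fitted model is an exponential distribution chosen for its simple closed forms, and the RMSE is assumed to compare sorted observations with fitted quantiles at the plotting positions \((i - 0.5)/n\), since the exact RMSE definition is not given in the text.

```python
import math

def ks_statistic(data, cdf):
    """Largest gap between the empirical CDF and a fitted CDF."""
    xs = sorted(data)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs):
        f = cdf(x)
        # Compare with the empirical CDF just after and just before x.
        d = max(d, (i + 1) / n - f, f - i / n)
    return d

def aic(loglik, k):
    """Akaike Information Criterion for k estimated parameters."""
    return 2 * k - 2 * loglik

def bic(loglik, k, n):
    """Bayesian Information Criterion for k parameters and n observations."""
    return k * math.log(n) - 2 * loglik

def quantile_rmse(data, quantile_fn):
    """RMSE between sorted observations and fitted quantiles (assumed definition)."""
    xs = sorted(data)
    n = len(xs)
    sq = [(x - quantile_fn((i + 0.5) / n)) ** 2 for i, x in enumerate(xs)]
    return math.sqrt(sum(sq) / n)

# Hypothetical sample and an exponential fit by maximum likelihood.
sample = [12.4, 35.9, 50.9, 63.0, 81.9, 120.5, 28.3, 44.1, 210.7, 5.2]
n = len(sample)
rate = n / sum(sample)
loglik = n * math.log(rate) - rate * sum(sample)

d_stat = ks_statistic(sample, lambda x: 1.0 - math.exp(-rate * x))
rmse = quantile_rmse(sample, lambda p: -math.log(1.0 - p) / rate)
print(d_stat, aic(loglik, 1), bic(loglik, 1, n), rmse)
```

Smaller values of the KS statistic, AIC, BIC, and RMSE indicate a closer fit, so repeating the calculation for each candidate distribution gives the inputs to a ranking.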

Comprehensive scoring methodology

The literature indicates a lack of suitable GOF tests designed to effectively distinguish between empirical and theoretical distributions34. Numerous studies have shown that the best-fit probability distribution can vary significantly between different regions, even for the same variable32. In response to these challenges, we adopt a comprehensive scoring methodology, as outlined in previous studies14,17,22,35. This method employs an integrated scoring approach that incorporates multiple GOF tests, information criteria, and graphical analyses to ensure a robust selection of the optimal probability distribution model. Each distribution model is subjected to several GOF tests, with a scoring system applied whereby the best-performing model in each test receives the top rank. To enhance the rigor of the selection process, each model’s rank is determined independently for each GOF test and then aggregated across all tests to produce a composite score, so that poorly fitting models accumulate higher totals. For graphical assessments, rankings are informed by visual inspection of density plots and quantile-quantile (Q–Q) plots, providing additional insight into the best-fitting model.
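The rank aggregation can be sketched as follows. This is a simplified illustration, assuming that rank 1 denotes the best model in each test and that the smallest composite total identifies the preferred distribution; the metric values are invented for demonstration and are not those of Table 4, and the helper names `rank_models` and `composite_scores` are hypothetical.

```python
def rank_models(scores, higher_is_better=False):
    """Return {model: rank}, rank 1 = best; tied models share the better rank."""
    ordered = sorted(scores.values(), reverse=higher_is_better)
    return {model: ordered.index(value) + 1 for model, value in scores.items()}

def composite_scores(metric_tables):
    """Sum the per-test ranks of every model across all tests."""
    totals = {}
    for scores, higher_is_better in metric_tables:
        for model, rank in rank_models(scores, higher_is_better).items():
            totals[model] = totals.get(model, 0) + rank
    return totals

# Illustrative values only (not taken from the study's tables).
ks_stat = {"GEV": 0.03, "Gamma": 0.04, "Weibull": 0.07}        # lower is better
aic_val = {"GEV": 2898.0, "Gamma": 2905.0, "Weibull": 2950.0}  # lower is better
ks_pval = {"GEV": 0.42, "Gamma": 0.20, "Weibull": 0.01}        # higher is better

totals = composite_scores([(ks_stat, False), (aic_val, False), (ks_pval, True)])
best = min(totals, key=totals.get)
print(totals, best)
```

Ranks from visual assessments of density and Q–Q plots can be added as further entries with manually assigned values.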

Table 2 Description of various probability distribution functions.

Results and discussion

This section provides statistical results from the analysis. The fitted models assume that the observations are independent and identically distributed (iid). To verify adherence to this assumption, we tested for stationarity using the Augmented Dickey-Fuller (ADF) test, randomness using the Wald-Wolfowitz runs test, and independence using the Ljung-Box test. All tests were performed at the \(5\%\) significance level. The results indicated that the data were stationary and random but exhibited autocorrelation; therefore, the data were aggregated using block analysis.
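To make the independence check concrete, the following sketch computes the Ljung-Box statistic \(Q = n(n+2)\sum_{k=1}^{h} \hat{\rho}_k^2/(n-k)\) from sample autocorrelations. The series is synthetic (a purely seasonal signal with a 12-month period, chosen to be strongly autocorrelated), and the choice of \(h = 12\) lags is an assumption; the study does not state the lag used. Under the null hypothesis of no autocorrelation, \(Q\) is compared with a chi-square distribution with \(h\) degrees of freedom, whose 5% critical value for \(h = 12\) is about 21.03.

```python
import math

def autocorrelation(x, lag):
    """Sample autocorrelation of the series x at the given lag."""
    n = len(x)
    mean = sum(x) / n
    denom = sum((v - mean) ** 2 for v in x)
    num = sum((x[t] - mean) * (x[t - lag] - mean) for t in range(lag, n))
    return num / denom

def ljung_box_q(x, max_lag):
    """Ljung-Box Q statistic using lags 1..max_lag."""
    n = len(x)
    return n * (n + 2) * sum(
        autocorrelation(x, k) ** 2 / (n - k) for k in range(1, max_lag + 1)
    )

# Synthetic monthly series with a strong annual cycle (illustrative only).
series = [26.0 + 1.5 * math.sin(2 * math.pi * t / 12) for t in range(120)]
q_stat = ljung_box_q(series, 12)
CHI2_95_DF12 = 21.026  # 5% critical value of chi-square with 12 d.f.
print(q_stat, q_stat > CHI2_95_DF12)
```

A statistic above the critical value, as for this seasonal series, points to autocorrelation and motivates working with block-aggregated values.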

Summary statistics

Table 3 shows the descriptive statistics for the monthly maximum temperature and total monthly rainfall for Kenya.

Table 3 Summary statistics for the monthly maximum temperature (\(^\circ\)C) and total monthly rainfall (mm).

The maximum temperature (Tmax) for 876 observations has an average of \(26.23 ^\circ C\) with low variability (standard deviation = 1.27) and a range from \(23.16 ^\circ C\) to \(29.97 ^\circ C\). The interquartile range \(25.32 ^\circ C\) to \(27.15 ^\circ C\) highlights a concentration around the median \(26.23 ^\circ C\), with a near-symmetrical distribution (skewness = 0.12) and a relatively flat shape (kurtosis = 2.43). The findings resonate with previous studies in1,2, which indicate that while temperature variability at the national level tends to be low due to data aggregation, an increase in temperature has been observed in most regions across the country.

In contrast, total rainfall (Prep) exhibits much higher variability, with a mean of 63.97 mm and a standard deviation of 42.72 mm, ranging from 2.46 mm to 280.32 mm. This wide range reflects the variability and extreme nature of rainfall. The quartiles (q25 = 35.90, q75 = 81.88) and a median of 50.90 indicate a right-skewed distribution (skewness = 1.46), while the high kurtosis (5.43) points to heavy tails, signifying extreme events. The findings also align with the evidence in1,2.
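For readers who wish to reproduce shape statistics of this kind, the sketch below computes moment-based skewness and kurtosis. It assumes the Pearson convention in which a normal distribution has kurtosis 3, which is consistent with a value of 2.43 being described as relatively flat; the exact estimator used in the study is not stated, and the sample is hypothetical.

```python
import math

def shape_summary(data):
    """Mean, sample standard deviation, moment skewness and Pearson kurtosis."""
    n = len(data)
    mean = sum(data) / n
    m2 = sum((x - mean) ** 2 for x in data) / n
    m3 = sum((x - mean) ** 3 for x in data) / n
    m4 = sum((x - mean) ** 4 for x in data) / n
    return {
        "mean": mean,
        "sd": math.sqrt(m2 * n / (n - 1)),  # unbiased-variance form
        "skewness": m3 / m2 ** 1.5,         # 0 for a symmetric sample
        "kurtosis": m4 / m2 ** 2,           # 3 for a normal distribution
    }

# Hypothetical right-skewed rainfall sample in mm (illustrative only).
rain_mm = [12.4, 35.9, 50.9, 63.0, 81.9, 120.5, 28.3, 44.1, 210.7, 5.2]
summary = shape_summary(rain_mm)
print(summary)
```

The squared skewness and the kurtosis are the two coordinates plotted on a Cullen and Frey graph.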

Choice of candidate distributions

For the temperature data in Fig. 1a, the Cullen and Frey graph shows that the distribution approximates the normal region with a slight platykurtic shape, identifying the normal, uniform, log-normal, Gamma, Weibull, and logistic distributions as potential candidates. Studies, such as12, have shown that extreme value distributions are suitable for modeling temperature data; therefore, these distributions were also considered potential candidates. In the rainfall data in Fig. 1b, the distribution exhibits positive skewness and high kurtosis, suggesting alignment with distributions such as log-normal, Gamma, Weibull, and exponential. Given the presence of extreme values, models that account for extreme behavior, specifically the GPD and GEV distributions, were also included in the analysis.

Fig. 1 Cullen and Frey plots for assessing the best-fit distribution of (a) maximum temperature (\(^\circ\)C) and (b) total rainfall (mm).

Model fitting was conducted using MLE for parameter estimation. For extreme value distributions, the Block Maxima (BM) and Peaks Over Threshold (POT) approaches were used to determine the block maxima and thresholds required to fit the GEV and GPD distributions, respectively. The BM approach is widely used in extreme value analysis to capture maximum events within defined time intervals, such as annual maxima, and it is commonly applied to environmental and climate data30,36,37. For the POT method, which is well-suited to modeling excesses over a specified threshold, the Mean Residual Life (MRL) plot was generated as shown in Fig. 2, and visual inspection was used to determine an appropriate threshold for each variable13,37. The blue curve in Fig. 2 represents the observed mean excess values \(e(u) = E(x_i - u \mid x_i > u)\), the red lines denote the upper and lower \(95\%\) confidence limits, and the threshold \(u\) defines the limit for identifying extreme events \((x_i: x_i > u)\)38. In Fig. 2a, a threshold in the range of 50 to 150 mm is suitable, as it provides a stable mean excess with narrower confidence intervals. This indicates that values above this threshold exhibit behavior suitable for modeling with a GPD. For temperature, the MRL plot in Fig. 2b did not suggest a clear threshold, hence the initial guess of a threshold around \(u=25\) \(^\circ\)C, where the confidence intervals remain relatively narrow, indicating reliable estimates. However, above approximately 28 \(^\circ\)C, the confidence intervals begin to widen slightly, indicating increased uncertainty in the mean excess values at higher thresholds. The GPD parameters were estimated based on observations exceeding this threshold.
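The mean excess values underlying an MRL plot can be computed as in the sketch below. The data and thresholds are hypothetical, and the function name `mean_excess` is illustrative; confidence limits, which the plot also shows, are omitted.

```python
def mean_excess(data, threshold):
    """Mean of (x - threshold) over observations above the threshold.

    Returns (mean excess, number of exceedances); the mean is None when
    no observation exceeds the threshold.
    """
    excesses = [x - threshold for x in data if x > threshold]
    if not excesses:
        return None, 0
    return sum(excesses) / len(excesses), len(excesses)

# Hypothetical monthly rainfall totals in mm (illustrative only).
rain_mm = [12.4, 35.9, 50.9, 63.0, 81.9, 120.5, 28.3, 44.1, 210.7, 5.2,
           95.3, 150.2, 72.8, 18.6, 133.4]

# Points of the MRL curve: threshold u against mean excess e(u).
for u in (25, 50, 75, 100, 125):
    e_u, n_exceed = mean_excess(rain_mm, u)
    print(u, e_u, n_exceed)
```

A threshold is usually sought in a range where the curve is approximately linear while enough exceedances remain for stable estimation, which is the trade-off described above.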

Fig. 2 Mean residual life (MRL) plots for evaluating threshold selection in (a) total rainfall (mm) and (b) maximum temperature (\(^\circ\)C).

Graphical assessments and GOF tests results

Graphical assessments

Density and Q–Q plots were generated to compare the observed data with several fitted theoretical distributions. For temperature data, the density plot in Fig. 3 shows that the GEV, Gamma, and log-normal distributions provide the best fit, capturing both the central peak and tail behavior. The normal, Weibull, and logistic distributions also perform reasonably well but exhibit slight deviations in the tails. In contrast, the uniform distribution shows significant discrepancies, particularly in the extremes, suggesting its unsuitability for modeling extreme temperature events. The Q–Q plots in Fig. 4 reveal that most distributions demonstrate deviations in the tails, with the GEV and normal distributions showing the closest adherence to the theoretical quantiles. Among the fitted distributions, the GEV, normal, log-normal, and Gamma distributions provide the best fit in that order, followed by the logistic and Weibull distributions, which exhibit moderate deviations. In contrast, the GPD and uniform distributions exhibit a substantial lack of fit, particularly at the lower and upper tails. This visual approach to identifying the best-fitting distribution is inherently subjective and, therefore, cannot be relied upon solely. To enhance robustness, these results were complemented with findings from other GOF tests to improve the reliability of distribution selection.

Fig. 3 Density plots of observed and simulated maximum temperature (\(^\circ\)C) data to assess the performance of probability distributions.

Fig. 4 Quantile–Quantile (Q–Q) plots for comparing the fit of eight probability distributions to maximum temperature (\(^\circ\)C).

Similarly, for the rainfall data in Fig. 5, the GEV, Gamma, and log-normal distributions show the closest alignment with the observed data, effectively capturing the shape and spread of the distribution. The Weibull distribution provides a moderate fit, performing well in the central range but diverging in the tails. In contrast, the exponential and GPD distributions exhibit substantial deviations, failing to represent the empirical distribution accurately, especially at the extremes. The Q–Q plots in Fig. 6 reinforce these findings, with the GEV and Gamma distributions displaying the best adherence to the theoretical quantile line, followed by the log-normal and Weibull distributions. The exponential and GPD distributions exhibit the weakest performance. These results are consistent with previous studies, such as21, which identified the GEV distribution as the most appropriate model for extreme rainfall events.

Fig. 5 Density plots of observed and simulated total rainfall (mm) data to assess the performance of probability distributions.

Fig. 6 Q–Q plots to compare the fit of six probability distributions for total rainfall (mm) data.

GOF tests

The GOF analysis in Table 4 (a) identifies the GEV distribution as the most suitable model for the maximum temperature data. The GEV distribution achieves the lowest KS (0.0297), AD (0.8890), and CvM (0.1335) statistics, accompanied by high p-values (0.4206, 0.4211, and 0.4442, respectively), indicating a strong alignment with the observed data. It also produces the lowest Chi-square statistic (3.5969, p = 0.9637) and achieves superior performance in terms of AIC (2,898.30), BIC (2,912.63) and RMSE (1.5694), highlighting its precision and efficiency. Other distributions, such as the normal, log-normal, and Gamma, provide moderate fits, with non-significant GOF statistics but higher AIC and BIC values, along with RMSE values that reflect lower accuracy than the GEV. Conversely, the Weibull, uniform, logistic, and GPD distributions exhibit poor performance, with high test statistics, low p-values, and significant deviations from the observed data. The uniform and GPD distributions show extreme misalignment, as evidenced by infinite AD statistics, high Chi-square values, and elevated RMSE scores, confirming their unsuitability for modeling maximum temperature data.

Table 4 Goodness of fit test results for temperature and rainfall distributions.

For the rainfall data in Table 4 (b), the GEV distribution also emerges as the most robust model, as reflected in the highest p-values for the tests KS (0.3487), AD (0.2753), and CvM (0.2897), indicating minimal deviation from observed data. Furthermore, the GEV achieves among the lowest AIC (8713.87) and BIC (8728.19) values, highlighting its parsimony and suitability for modeling rainfall patterns. Its superior predictive accuracy is evident from the lowest RMSE value (58.86), reinforcing its reliability. Concerning chi-square tests, the log-normal distribution was found to have the lowest chi-square value, indicating a better fit. Yuan et al.17 also had a similar finding when they used Chi-square tests to evaluate the best fit for the frequency analysis of the annual maximum hourly precipitation. In contrast, the GPD and exponential distributions perform poorly, with significant p-values, high Chi-square statistics, and elevated RMSE values, indicating substantial deviation and limited applicability for modeling rainfall data.

A comprehensive scoring method was used to further evaluate the best-fitting distributions, with findings presented in Table 5. Analysis of the temperature distributions in Table 5 (a) revealed that the GEV consistently outperformed the others, as observed in39, achieving the top overall rank with a total score of 17. This was supported by its superior performance in key tests, including the KS, AD, and CvM tests. The Gamma and log-normal distributions ranked second and third, respectively, demonstrating moderate fits across multiple metrics. However, distributions such as the Weibull, uniform, logistic, and GPD performed poorly, accumulating higher total scores and displaying suboptimal results in the density and Q–Q plots.

For rainfall distributions, the ranking analysis in Table 5 (b) confirms that the GEV distribution again emerged as the top performer, ranking first with a total score of 16. These findings are supported by Agbonaye and Izinyon14, Al Mamoon and Rahman6, Alam et al.18, Coronado-Hernández et al.36, Fadhilah et al.21, Ghosh et al.40, Ng et al.35 and Yuan et al.17. Its strength was evident across most GOF tests, where it outperformed or closely matched the best-performing distributions in each category. The Gamma distribution ranked second, showing a strong overall fit with balanced performance across metrics. The log-normal distribution followed in third place, excelling in certain tests but lagging in others, such as AIC and BIC. In contrast, the exponential and Weibull distributions demonstrated weaker fits, while the GPD distribution consistently ranked lowest.

Table 5 Goodness of fit rankings for temperature and rainfall distributions.

Sensitivity analysis

To evaluate the robustness of the GEV distribution’s fit to rainfall data, a sensitivity analysis was performed using various block sizes designed to capture diverse temporal patterns and extremes. Block size refers to the length of each group when the observations are divided into a series of independent groups38. According to Coles38, block sizes are often selected to capture a specific period. In this work, the block sizes included annual, seasonal, monthly, 5-year, 10-year, 12-month moving averages, 6-month intervals, and 4-month intervals. Annual blocks, where maximum values were extracted per year, followed the methodologies outlined in38,41. Seasonal blocks were based on quarterly aggregations, as indicated by41,42. Monthly blocks were used to capture monthly maxima, as discussed in42,43. For longer-term patterns, multi-year blocks of 5-year and 10-year intervals were established, consistent with approaches adopted in studies such as44. A 12-month moving average window assessed rolling maxima, highlighting shifts in trends. Event-based blocks focused on the most extreme events by isolating total rainfall above the 95th percentile, following the techniques used in45. For intermediate seasonality, semi-annual blocks divided each year into January–June and July–December intervals, consistent with approaches used by42,43,46. Furthermore, a regional seasonal classification for Kenya was used to account for local climatic variations, with blocks corresponding to the “Hot and Dry”, “Long Rainy”, “Cool”, and “Short Rainy” seasons, building on the framework proposed by47. For each block length, maximum values were extracted, and the GEV parameters were estimated and presented in Table 6.
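The extraction of block maxima for fixed-length blocks can be sketched as follows. The series is synthetic, with 876 values standing in for the 876 monthly observations, and the sketch assumes that an incomplete trailing block is discarded, which is one common convention and not necessarily the one used in the study. The moving-average, event-based, and regional seasonal blocks are not covered by this helper.

```python
def block_maxima(series, block_size):
    """Maximum of each consecutive, non-overlapping block of block_size values.

    An incomplete trailing block is dropped (assumed convention).
    """
    n_blocks = len(series) // block_size
    return [
        max(series[i * block_size:(i + 1) * block_size])
        for i in range(n_blocks)
    ]

# Synthetic stand-in for 73 years of monthly values (illustrative only).
monthly = [(t * 37) % 101 for t in range(876)]

block_lengths = {"quarterly": 3, "semi-annual": 6, "annual": 12,
                 "5-year": 60, "10-year": 120}
for name, size in block_lengths.items():
    maxima = block_maxima(monthly, size)
    print(name, len(maxima))
```

With 876 monthly values, annual blocks yield 73 maxima, whereas 5-year and 10-year blocks yield only 14 and 7, which illustrates how quickly the sample available for fitting the GEV shrinks as the block length grows.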

For both rainfall and temperature data, parameter estimates reveal notable differences between block sizes, particularly in the shape parameter, which defines tail behavior. For rainfall, annual, 5-year, and 10-year blocks exhibited non-significant negative shape parameters \((p > 0.05)\), indicating a Weibull class of distribution as reported in30 and uncertainty in tail estimates for these broader temporal aggregations. In contrast, mid-range blocks, such as monthly, quarterly, event-based, and seasonal, yielded significant positive shape parameters, reflecting the heavy-tailed Fréchet class of distributions with well-defined extremal patterns. This is in agreement with Moccia et al.33, although the findings of Onwuegbuche et al.48 and Singirankabo et al.37 revealed that the Gumbel is the optimal distribution. The location and scale parameters were consistently significant \((p < 0.05)\) across all block sizes, indicating reliable estimation of central tendency and variability. The event-based block for rainfall, with a high shape estimate (0.3974), suggested a heavier tail and a higher propensity for extreme rainfall events compared to other blocks. For temperature data, location and scale parameters were also consistently significant across all blocks, confirming stable estimates of central tendency and variability. However, the shape parameter was not significant for the 5-year, 10-year, and event-based models, indicating uncertainty in tail estimates, which is likely due to the limited number of data points or the irregular occurrence of extreme events. In contrast, the quarterly, monthly, and seasonal models produced significant shape parameters, suggesting that they provide more robust and reliable tail estimates for predicting rare and extreme values in both temperature and rainfall.

Table 6 ML estimates and significance of location, scale, and shape parameter for temperature and rainfall distribution.

The model diagnostic tests in Table 7 reveal that the 10-year and 5-year blocks provide the best fit for both rainfall and temperature data, achieving the lowest AIC and BIC values (e.g., AIC = 74.406 and 146.985 for rainfall), indicating strong model parsimony and minimal information loss. These longer blocks effectively capture long-term extreme trends but rely on fewer data points (n = 7 and 14), which increases the variance of, and hence the uncertainty in, the parameter estimates, as demonstrated by46. This finding aligns with studies by38,41, which emphasize the effectiveness of larger blocks in capturing long-term climatic trends by averaging out short-term fluctuations, thereby focusing on extreme patterns. Event-based and annual blocks also perform well for rainfall, with low AIC and BIC values, reflecting their stability in representing extreme events with adequate data, as supported by42. In contrast, higher-frequency blocks, such as monthly and 12-month moving average models, exhibit much higher AIC and BIC values for both rainfall and temperature, suggesting potential overfitting and inefficiency in capturing extreme patterns, a limitation also noted by43. Mid-range blocks, including quarterly, semi-annual, and seasonal, achieve moderate AIC and BIC values for both datasets, offering a balanced approach that captures seasonal variability while maintaining sufficient stability for reliable parameter estimation. This perspective is supported by studies such as15,42,46, which highlight the value of intermediate temporal scales in balancing the trade-offs between long-term trend analysis and sufficient data representation.

Table 7 Model performance metrics for maximum temperature (\(^\circ\)C) and total rainfall (mm) across different blocks.

In addition, we computed the return levels for different return periods to determine how various models estimate the extremes. The return level represents the magnitude of an event expected to be equaled or exceeded, on average, once within a specified return period38,48. The findings in Fig. 7 for temperature and rainfall data reveal distinct patterns across models when estimating extremes at various return periods. For temperature in Fig. 7a, the 10-year and 5-year models consistently produce the highest return levels, maintaining stability across increasing return periods as observed in48, indicating their robustness in estimating extreme values over longer intervals. In contrast, models with finer resolutions, such as monthly and 12-month moving averages, yield lower return levels with modest increases over time, suggesting a limited capacity to capture rare extremes. The quarterly and semi-annual models show moderate return levels, providing a balanced estimation that captures both seasonal variability and long-term trends. For rainfall in Fig. 7b, a similar pattern emerges, with the 10-year, 5-year, and seasonal models achieving the highest and most stable return levels, while finer models like monthly and 12-month moving averages display lower return levels and less pronounced growth across return periods. The event-based model exhibits high initial return levels but plateaus at longer return periods, indicating potential limitations in capturing prolonged extremes. Overall, the 10-year, 5-year, and seasonal models appear to be the most consistent for temperature and rainfall extremes.
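For a fitted GEV with location \(\mu\), scale \(\sigma\), and shape \(\xi\), the return level for a return period of \(T\) blocks is the quantile with non-exceedance probability \(1 - 1/T\). A minimal sketch of this calculation is given below, using the standard parameterization in which \(\xi > 0\) corresponds to the Fréchet class; the parameter values are hypothetical and are not those of Table 6.

```python
import math

def gev_return_level(mu, sigma, xi, T):
    """Level exceeded on average once every T blocks under a GEV(mu, sigma, xi)."""
    y = -math.log(1.0 - 1.0 / T)
    if abs(xi) < 1e-12:
        # Gumbel limit of the GEV quantile function.
        return mu - sigma * math.log(y)
    return mu - (sigma / xi) * (1.0 - y ** (-xi))

def gev_cdf(x, mu, sigma, xi):
    """GEV distribution function, used here to check the return level."""
    if abs(xi) < 1e-12:
        return math.exp(-math.exp(-(x - mu) / sigma))
    t = 1.0 + xi * (x - mu) / sigma
    if t <= 0:
        # Outside the support: below it when xi > 0, above it when xi < 0.
        return 0.0 if xi > 0 else 1.0
    return math.exp(-t ** (-1.0 / xi))

# Hypothetical GEV parameters for annual maximum rainfall in mm.
mu, sigma, xi = 100.0, 30.0, 0.1
for T in (2, 10, 50, 100):
    level = gev_return_level(mu, sigma, xi, T)
    print(T, round(level, 2), round(gev_cdf(level, mu, sigma, xi), 4))
```

Because \(T\) is counted in blocks, a 10-block return period corresponds to 10 years for annual blocks but to 100 years for 10-year blocks, which should be kept in mind when return levels from different block sizes are compared.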

Fig. 7 Return level plots for different block sizes for (a) maximum temperature (\(^\circ\)C) and (b) total rainfall (mm).

Finally, we used density plots to check how each model captures the distribution of maximum temperatures and total rainfall. In the temperature plot in Fig. 8a, the 10-year, 5-year, and event-based models display the most concentrated curves, suggesting a narrower range with more pronounced extremes. Models with higher temporal resolutions, like monthly and 12-month moving averages, exhibit wider density curves, indicating a broader distribution that captures more frequent fluctuations but is less focused on extremes. The quarterly and semi-annual models fall between these two cases, striking a balance between stability and variability. For rainfall data in Fig. 8b, a similar pattern emerges: the 10-year and 5-year models show steeper, more concentrated curves, indicating that they effectively capture rare, high-magnitude events. In contrast, finer-resolution models, such as monthly and 12-month moving averages, have flatter curves, capturing a wider range of data with less emphasis on extremes.

Fig. 8

Density plots for different block sizes for (a) Maximum temperature (\(^\circ\)C) and (b) Total rainfall (mm).
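The contrast between concentrated and wide density curves can be reproduced on synthetic data. The sketch below, which is an illustration and not the analysis code of this study, extracts block maxima for several block lengths and summarizes a kernel density estimate of each set. The helper names `block_maxima` and `density_summary`, the use of SciPy's `gaussian_kde`, and the synthetic monthly temperature series are all assumptions.

```python
import numpy as np
from scipy.stats import gaussian_kde


def block_maxima(series, block_length):
    """Split a series into consecutive blocks and keep each block's maximum.
    Trailing values that do not fill a whole block are dropped."""
    values = np.asarray(series, dtype=float)
    n_blocks = len(values) // block_length
    trimmed = values[: n_blocks * block_length]
    return trimmed.reshape(n_blocks, block_length).max(axis=1)


def density_summary(maxima, grid_size=200):
    """Summarize a Gaussian kernel density estimate of the block maxima."""
    kde = gaussian_kde(maxima)
    grid = np.linspace(maxima.min(), maxima.max(), grid_size)
    density = kde(grid)
    q25, q75 = np.percentile(maxima, [25, 75])
    return {
        "mode": grid[np.argmax(density)],   # location of the density peak
        "peak_density": density.max(),      # height of the density peak
        "iqr": q75 - q25,                   # spread of the block maxima
    }


# Hypothetical monthly maximum temperatures: seasonal cycle plus noise.
rng = np.random.default_rng(1)
months = np.arange(1200)
monthly = 30.0 + 3.0 * np.sin(2 * np.pi * months / 12) + rng.normal(0, 1, 1200)

# Block lengths in months: 1 (monthly), 12 (annual), 60 (5-year).
summaries = {b: density_summary(block_maxima(monthly, b)) for b in (1, 12, 60)}
for block_length, summary in summaries.items():
    print(block_length, {k: round(v, 2) for k, v in summary.items()})
```

In this synthetic setting, longer blocks give maxima that cluster in the upper tail, so their density is narrower and peaks at a higher value than the density of the monthly values.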

Conclusion

In this study, we have assessed various probability distributions for modeling maximum temperature and total rainfall data using a systematic and comprehensive approach that combines several GOF tests and graphical tools. In addition, we have identified the optimal block size for the GEV distribution using return levels across different periods, as well as log-likelihood, AIC, and BIC. Insights from GOF tests highlighted that the GEV, Gamma, and log-normal distributions were well-suited for both maximum temperature and total rainfall datasets, as they consistently aligned with empirical data. On the other hand, distributions such as uniform, Weibull, and logistic showed a poor fit across multiple metrics, underscoring their limitations in capturing the complexities of climatic variables. The GEV distribution emerged as the optimal model for rainfall and temperature data, consistently outperforming others in key metrics such as the AIC, BIC, and RMSE. It also demonstrated superior performance in GOF tests, including the KS, AD, and CVM tests. This strong performance affirms the robustness of the GEV distribution in modeling climatic extremes and its capacity to provide reliable insights into long-term trends.

Block size analysis revealed the effectiveness of longer temporal aggregations, such as 10-year and 5-year blocks, which produced stable and high return levels across return periods, effectively capturing long-term extreme trends. However, these longer blocks increased uncertainty in parameter estimates due to fewer data points. In contrast, intermediate blocks, such as quarterly and seasonal, struck a balance by capturing seasonal variations while maintaining stability and reliable parameter estimates with moderate AIC and BIC values. High-frequency blocks, such as monthly and 12-month moving averages, although rich in data, exhibited higher AIC and BIC values, suggesting potential overfitting and inefficiency in representing extreme values.
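The block-size comparison summarized above rests on the log-likelihood, AIC, and BIC of the GEV fit for each block length. A minimal sketch of that calculation is given below, assuming SciPy's `genextreme` for the fit. The helper names, the synthetic monthly rainfall series, and the chosen block lengths are hypothetical and are not taken from the study.

```python
import numpy as np
from scipy.stats import genextreme


def block_maxima(series, block_length):
    """Split a series into consecutive blocks and keep each block's maximum.
    Trailing values that do not fill a whole block are dropped."""
    values = np.asarray(series, dtype=float)
    n_blocks = len(values) // block_length
    trimmed = values[: n_blocks * block_length]
    return trimmed.reshape(n_blocks, block_length).max(axis=1)


def gev_information_criteria(maxima):
    """Fit a GEV by maximum likelihood and report log-likelihood, AIC, BIC."""
    params = genextreme.fit(maxima)          # (shape, location, scale)
    loglik = float(np.sum(genextreme.logpdf(maxima, *params)))
    k, n = len(params), len(maxima)          # k = 3 fitted parameters
    return {
        "n": n,
        "loglik": loglik,
        "aic": 2 * k - 2 * loglik,
        "bic": k * np.log(n) - 2 * loglik,
    }


# Hypothetical monthly rainfall totals (mm), 100 years of synthetic data.
rng = np.random.default_rng(2)
monthly_rain = rng.gamma(shape=2.0, scale=20.0, size=1200)

# Block lengths in months: 3 (quarterly), 12 (annual), 60 (5-year).
results = {b: gev_information_criteria(block_maxima(monthly_rain, b))
           for b in (3, 12, 60)}
for block_length, crit in results.items():
    print(block_length, {key: round(val, 1) for key, val in crit.items()})
```

The number of maxima `n` shrinks as the block length grows, which is the source of the wider parameter uncertainty noted for the 10-year and 5-year blocks.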

The results of this study are important for Kenya and the wider East African region, where the adopted methodology can also be applied. The comprehensive GOF tests support more reliable forecasting of temperature and rainfall, which is crucial for risk assessment and the development of climate adaptation strategies. With this knowledge, predictions of, and preparations for, catastrophic events such as floods, droughts, or rising temperatures can be improved. With better forecasts, policymakers and the government can improve water catchment infrastructure and support agricultural activities through proper planning and disaster preparedness.

However, a key limitation of this study is its focus on individual probability distributions for temperature and rainfall without explicitly addressing the interdependence between these variables. Since temperature and rainfall are inherently related, accurate risk assessments and effective climate adaptation strategies require consideration of their associations. Extensive research has been conducted on the dependence between temperature and rainfall; therefore, future studies should prioritize exploring dependence structures within a multivariate framework using the fitted probability distributions identified in this study. Advanced approaches such as copula models or joint distribution analyses could provide deeper insights into the interactions between these variables, particularly under extreme climatic conditions. Such efforts would significantly enhance the reliability of climate models and their applicability to integrated risk assessment frameworks.

To build on this work, future research should focus on applying this methodology at finer spatial scales using real datasets from various regions in Kenya. Conducting probability distribution analyses at regional levels, incorporating block size analysis, and integrating data from multiple weather stations could yield region-specific insights into seasonal rainfall patterns, further informing targeted climate adaptation strategies. From a policy perspective, the results underscore the need for data-driven strategies that take into account both the individual and the joint variability of climatic variables. Policymakers should leverage these insights to design robust adaptation measures tailored to Kenya’s specific climate challenges, such as strengthening agricultural planning, improving water resource management, and enhancing infrastructure resilience.