Main

Agriculture is vulnerable to atmospheric conditions both in the short term (weather) and in the long term (climate). The latter influences agricultural outputs through factors such as average climate conditions (expected temperature and precipitation in a given location at a given time), climate variability (changes in climate statistics across temporal and spatial scales, beyond those of individual weather events) and climate uncertainty arising from the inaccuracy of future climate predictions1.

Climate variability poses a major challenge for farmers, who frequently make agricultural decisions in advance by assuming that certain climatic conditions will occur2. Temperature and precipitation during the growing season could explain around 30% of year-to-year yield fluctuations for the six most widely grown crops worldwide3, and climate variability could account for up to 60% of yield variability in substantial areas of the global breadbaskets4. Depending on the level of development, weather changes could drive from 20% to 80% of crop yield interannual variability5. The relative contribution of precipitation and temperature to yield anomalies remains a matter of debate. In sub-Saharan Africa (SSA), expected to be the region most at risk in terms of food security by 2050 (ref. 6), crop production under climate change could be more sensitive overall to temperature changes than to precipitation changes7. However, it has been argued that the dominance of the temperature effect on maize varies regionally depending on the directionality of the yield response to both variables8. As global warming and human interventions amplify the fluctuations of natural systems beyond the previous envelope of variability9, the traditional paradigm of stationary climate–crop yield relationships is becoming obsolete10. Therefore, increasing the resilience of the agricultural sector to more frequent and intense climate-related disruptions constitutes a key challenge in feeding the growing global population.

Leveraging the increasing amount of climate and weather information to produce agricultural decision-relevant variables is necessary to implement climate-smart agriculture, which aims to transform agri-food systems to respond effectively to climate change11. However, bridging this gap is not straightforward12, as the response of crop yields to weather is complex, nonlinear and threshold type. Two complementary rather than competing approaches exist to characterize this relationship: data-driven or statistical models, relying on empirical equations, and crop growth or process-based models, able to represent the interactions between physiological processes and the environment13. Reported caveats include the following: assumption of time-invariant relationships between climate drivers and crop yield10, little attention devoted to the effects of interannual climate variability compared with those related to mean changes4, limited consideration for the influence of extreme weather events14,15, univariate analysis of the climate–yield relationship (response to isolated variables or extremes)16,17 and use of coarse spatio-temporal indicators (generally growing-season and national-level statistics), which could lead to an unsuitable generalization of the impacts of climate variables on yields across regions with diverse topography and bioclimates18.

Here we aim to explore the influence of sub-seasonal wet–dry spell patterns on maize yield variance in SSA. To this end, we have developed an approach for dealing with large spatio-temporal scales and limited data availability. These features constitute an important asset in developing regions, where data limitations severely hamper the accuracy and usefulness of crop models to produce reliable estimations of crop production at regional and national scales. We consider both the intra-growing season variability and the spatial heterogeneity of the crop response to map weather-related risks at the regional level, which allows us to integrate in the analysis the time specificity of weather-related impacts on crops and the existence of different agroclimatic zones within the region.

The paper is structured as follows. First, we characterize maize yield across SSA and related trends during the reference period (1982–2009) according to the Global Dataset of Historical Yield (GDHY) aligned version v1.2 + v1.3 (ref. 19). Then, we explore the changes in mean precipitation and temperature and in the prevalence and distribution of intra-growing season dry–wet spells during the same period (1982–2009) using the Agriculture Modern-Era Retrospective Analysis for Research and Applications (AgMERRA) climate dataset20. Next, we identify the most suitable subset of dry–wet spell patterns to explain the variability of maize production and its spatial heterogeneity. Finally, we map the changes in maize yield risk arising from climate variability and discuss their potential implications for this critical cereal in SSA.

Results

Maize yield characterization and trends

Predominant agro-ecological zones in SSA regarding maize cultivation include semi-arid, sub-humid and humid tropical lowlands, although they are unevenly distributed across Western (WA), Eastern (EA) and Southern Africa (SA) (Supplementary Fig. 1). Crop calendars differ depending on the region and agro-ecological zone: median values of the growing season range from around 100 days in the semi-arid zone of WA and the humid zone of EA to 160 days in the semi-arid and sub-humid zones of SA. Median values of the planting date vary from the beginning of April (semi-arid and sub-humid zones of EA and humid zone of WA) to mid-November in the semi-arid and sub-humid zones of SA (Supplementary Fig. 2). The analysis of the interannual variability of rainfall shifts around the planting date shows that they usually occur within a 10 day interval for most of the region (Supplementary Methods and Supplementary Fig. 17). Maize annual yields also behave differently from one agro-ecological zone to another, but they are generally low. During the reference period, mean values in the selected regions ranged from 0.9 t ha−1 to 1.4 t ha−1 in the semi-arid zone, from 1.4 t ha−1 to 1.5 t ha−1 in the sub-humid zone and from 1.5 t ha−1 to 1.8 t ha−1 in the humid zone. Most productive areas at the regional level are located in southern Nigeria (humid), the Lake Victoria area in Kenya (humid) and the sub-humid zone of South Africa. Interannual variability of maize yields is usually higher in the semi-arid zone than in the sub-humid one, whereas the humid zone shows the least variability (Supplementary Figs. 3 and 4).

Changes in maize annual yield from 1982 to 2009 were heterogeneous at the regional level. In the western part of EA (Tanzania, Uganda, Burundi, Rwanda and South Sudan), a positive trend was detected (increase of 10–30 kg ha−1 yr−1, significant for a 90% confidence level). However, the trend was negative in certain areas of Somalia and Kenya (for example, yield decrease between 10 kg ha−1 yr−1 and 20 kg ha−1 yr−1 in the Great Lakes area) but non-significant for the same confidence level. In WA, maize yield showed a positive trend at the regional level (increase of 20 kg ha−1 yr−1 on average, up to 60 kg ha−1 yr−1 in countries such as Côte d’Ivoire—where crop yield more than doubled from 1982–1991 to 2000–2009—and significant except for Senegal and Sierra Leone). The trend was also positive and significant in most parts of SA, ranging from around 10–30 kg ha−1 yr−1 in the western coast (for example, Namibia) and Mozambique to more than 100 kg ha−1 yr−1 in the most productive area at the regional level (eastern South Africa). On the contrary, Zimbabwe and northern Botswana showed a negative trend (for example, a decrease of around 20 kg ha−1 yr−1 in Zimbabwe). These trends were significant in most of SA except for certain countries such as Zambia and Namibia (Section 4 of Supplementary Fig. 5).

Mean precipitation and temperature changes during maize development

Maize water needs range from 500 mm to 800 mm, unevenly distributed across its four stages of development (50% of the maximum crop water need at the initial stage, increasing during the development stage until reaching its peak at mid-season and decreasing up to 25% of its maximum value at the late stage21). These needs depend on climatic factors such as temperature and thus change across SSA (where mean temperature during the growing season ranges from less than 20 °C in areas such as the Great Lakes or Lesotho to more than 27 °C in Somalia, South Sudan or the Sahel region). Besides, water needs are variably met under the rainfed prevailing system, as rainfall during maize development ranges from less than 300 mm in the Horn of Africa, the Sahel and some zones of Namibia and South Africa to more than 800 mm in areas such as Sierra Leone, Guinea, northern Zambia and Mozambique. Not only are precipitation and temperature spatially heterogeneous, but they also vary from one stage of development to another. Average contribution to total rainfall during the reference period oscillated between 12% and 22% for the initial stage, 27% and 29% for the development stage, 30% and 36% during the mid-season and 21% and 25% for the late stage. The initial stage was also the warmest, as temperature decreased throughout the growing season (from 0.4 °C in SA to 1.2 °C in WA on average).

Total rainfall during the growing season showed a negative trend in most of EA (except for the northwestern area) from 1982 to 2009 (average decrease of around 20 mm per decade). However, it was not significant for the 90% confidence level aside from zones such as the Horn of Africa, where total rainfall decreased more than 10% from 1982–1991 to 2000–2009. These changes were not homogeneous throughout the maize season: while the precipitation trend was negative in most of EA during the initial and development stages, it was positive during the mid and late season stages (except for some areas of Tanzania, Uganda and Kenya). In WA, the precipitation trend was positive in the Sahel (around 30 mm per decade on average) and negative in certain areas of the Gulf of Guinea and Sierra Leone (where total rainfall during the growing season decreased more than 10% from 1982–1991 to 2000–2009). However, these trends were not significant in large areas of WA. Besides, precipitation trends varied across the growing season (for example, Nigeria and Ghana showed a positive trend at its beginning and a negative trend at the late stage). In SA, the rainfall trend was positive (44 mm per decade on average, although highly variable depending on the location), except for certain areas of Mozambique and eastern South Africa. Concretely, in the most arid zone of SA, total precipitation during the maize season increased more than 20% from 1982–1991 to 2000–2009. However, this trend was only significant in Angola and northern Namibia. Mean temperature showed a positive and significant trend in most of SSA (0.15 °C per decade, 0.21 °C per decade and 0.27 °C per decade on average in WA, SA and EA, respectively), except for areas such as western Sahel and eastern South Africa (negative, non-significant trend). 
Temperature increase from 1982–1991 to 2000–2009 was heterogeneous at the regional level: from <0.25 °C in most of WA to >0.5 °C in large areas of EA and certain zones of SA (for example, Mozambique) (Fig. 1 and Supplementary Figs. 6–8).

Fig. 1: Precipitation and temperature changes during the maize growing season.

Changes in total rainfall and mean temperature during the maize growing season in WA, EA and SA from 1982 to 2009.

Changes in dry spells

Relative contribution of temperature to the onset of dry–wet spells (characterized through the standardized precipitation evapotranspiration index (SPEI) using a 10 day time window (Methods and Supplementary Methods)) was higher at the beginning of the growing season than at other developmental stages. During the initial stage, temperature could contribute over 50% to the incidence of extremely dry spells (defined using quantile 0.1 as the SPEI threshold) in the semi-arid and sub-humid zones of SA, and between 45–50% and 34–36% in the same agro-ecological zones of EA and WA, respectively. These percentages dropped to 41–45%, 39–43% and 27–33% for the same areas during the mid-season stage. Besides, temperature contribution to the onset of dry spells decreased along with their intensity: it was between 10% and 20% less for the moderately dry class compared with the extremely dry class (Fig. 2).

Fig. 2: Temperature contribution to dry spell onset during the maize growing season.

Relative contribution of temperature to the onset of dry spells by SPEI class and stage of development (initial and mid-season) in WA, EA and SA.

Changes in the incidence of dry spells during the maize growing season from 1982 to 2009 were spatially and temporally heterogeneous. In most of EA, dry spells particularly decreased (negative trend) during the mid-season and late stages (average regional decline of about 11 events per decade throughout the growing season, significant for the 90% confidence level in Somalia and eastern Kenya). However, they increased in western EA (around five events per decade on average, significant in South Sudan and Uganda). Most of WA and SA also showed a negative trend for dry spells (average regional decline of seven and ten events per decade, respectively), with the exception of certain areas (for example, Guinea and Sierra Leone—where they increased at a rate of about four events per decade—or eastern South Africa at the end of the season). These trends were significant in northern Namibia and Botswana, Zimbabwe and southern Mozambique in SA and in the northwestern part of WA (Fig. 3 and Supplementary Fig. 9).

Fig. 3: Trends in the occurrence of dry spells during the maize growing season.

Detected trends in the relative frequency of dry spells at the initial and mid-season stages of development (number of events with regard to stage duration) and significance for the 90% confidence level, using Sen’s slope and the Mann–Kendall test (two sided).

Maize yield variance and regional crop risk changes

Changes in mean temperature and total precipitation at different developmental stages were able to explain between 30% and 35% of maize yield variability at the regional level (see ‘Multiple Linear Regression’ in Methods for further reference). By contrast, variance explained by shifts in the likelihood of certain dry–wet spell patterns during crop development amounted to 50–60% at the regional level (although large differences exist depending on location—for example, from up to 75% in the Horn of Africa to around 30% near Lake Victoria) (Fig. 4). Four patterns acted as main drivers of yield variability: prevalence of dry or wet spells (from extreme to moderate ones) along the four stages (‘dry–dry–dry–dry’ and ‘wet–wet–wet–wet’, respectively) and substantially different conditions during the initial stage with regard to the rest of the season (‘dry–wet–wet–wet’ and ‘wet–dry–dry–dry’). This is coherent with our findings from the analysis of conditional probabilities (Supplementary Fig. 10), as the likelihood of alternating dry–wet conditions during the last three developmental stages was found to be low.
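The explained-variance comparison above relies on adjusted R2 from regressions on year-to-year changes. The following is a minimal sketch of the two bookkeeping steps involved: differencing an annual series and penalizing R2 for the number of predictors. The function names and the example numbers are illustrative assumptions and are not taken from the analysis itself.

```python
def first_differences(series):
    """Year-to-year changes of an annual series (one value fewer than the input)."""
    return [later - earlier for earlier, later in zip(series, series[1:])]


def adjusted_r2(r2, n_obs, n_predictors):
    """Adjusted R^2 = 1 - (1 - R^2) * (n - 1) / (n - p - 1)."""
    dof = n_obs - n_predictors - 1
    if dof <= 0:
        raise ValueError("need more observations than predictors + 1")
    return 1.0 - (1.0 - r2) * (n_obs - 1) / dof


# Assumed illustration: 28 annual values (1982-2009) give 27 differences;
# 8 predictors would correspond to precipitation and temperature at four stages.
n_differences = len(first_differences(list(range(28))))
example = adjusted_r2(0.5, n_differences, 8)
```

With few years and many predictors the penalty is substantial, which is one reason to compare predictor sets on adjusted rather than raw R2.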

Fig. 4: Maize yield explained variance.

Maize yield explained variance (adjusted R2) in WA, EA and SA by mean precipitation and temperature at different developmental stages and changes in the likelihood of combined dry/wet spells during the growing season.

Risk related to the ‘dry–dry–dry–dry’ pattern (Methods) generally declined in SSA (as it was two to four times less likely in 2000–2009 than in 1982–1991). For the ‘wet–wet–wet–wet’ pattern (whose probability increased 1.5–2.5 times during the same period), risk changes were heterogeneous at the regional level: from being prevalent in the western coast of EA in 1982–1991 to increasing in the eastern part in 2000–2009. Very localized modifications arose in WA (decrease in the rainiest areas, such as Sierra Leone and Guinea) and SA (increase in northern Botswana). Risk linked to the ‘dry–wet–wet–wet’ pattern (probability increase between 1.2 and 1.7 times from 1982–1991 to 2000–2009) substantially rose in the Horn of Africa (where maize production declined during the reference period), decreased in the rainiest areas of WA and certain areas of SA (for example, western South Africa) and slightly increased in zones such as northern Namibia. Finally, risk related to the ‘wet–dry–dry–dry’ pattern (1.5 to 2.5 times less likely in 2000–2009 than in 1982–1991) decreased in EA, WA (particularly in Burkina Faso and Nigeria) and most of SA (except for the eastern part of the region, for example, Mozambique and Zimbabwe) (Figs. 5 and 6). Overall risk decline related to prevalent dry conditions has important implications for food security at the regional level. Estimated mean yield decrease under the ‘dry–dry–dry–dry’ pattern ranged between 0.3 t ha−1 and 0.6 t ha−1, affecting 61%, 81% and 90% of the area in EA, WA and SA, respectively (Supplementary Fig. 11). Thus, annual net production losses for this pattern could be equivalent to the yearly energy requirement of 31 million people (18 million people in the case of the ‘wet–dry–dry–dry’ pattern), considering 34 Mha harvested in these regions in 2021 (ref. 22), a caloric supply of 3,560 kcal kg−1 and a daily energy need of 2,100 kcal per person. However, both ‘wet–wet–wet–wet’ and ‘dry–wet–wet–wet’ patterns resulted in net production increases, equivalent to the annual energy demand of 34 and 21 million people, respectively.
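The conversion from a production change to a number of people fed is a simple energy balance. The sketch below uses the caloric supply and daily energy need quoted above; the function names, the 365 day year and the example yield change are assumptions made for illustration only.

```python
KCAL_PER_KG_MAIZE = 3560     # caloric supply quoted in the text (kcal per kg)
KCAL_PER_PERSON_DAY = 2100   # daily energy need quoted in the text
DAYS_PER_YEAR = 365          # assumption: non-leap year


def production_change_tonnes(area_ha, yield_change_t_per_ha):
    """Net production change (t) for an area-averaged yield change (t per ha)."""
    return area_ha * yield_change_t_per_ha


def people_fed_per_year(production_tonnes):
    """Number of people whose yearly energy need matches the given production."""
    kcal = production_tonnes * 1000 * KCAL_PER_KG_MAIZE
    return kcal / (KCAL_PER_PERSON_DAY * DAYS_PER_YEAR)


# Hypothetical input: a net change of 0.2 t per ha averaged over 34 Mha.
# The 0.2 value is a placeholder, not a figure reported in this study.
example_people = people_fed_per_year(production_change_tonnes(34e6, 0.2))
```

Under that placeholder input the result is about 32 million people, which shows that area-averaged net changes of a few tenths of a tonne per hectare are enough to reach the orders of magnitude discussed above.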

Fig. 5: Changes in maize crop risk related to predominant dry spell patterns.

Changes in maize crop risk in WA, EA and SA related to changes in the likelihood of two dry–wet spell patterns during the growing season (considering 0.1–0.3 and 0.7–0.9 as SPEI quantile ranges to define ‘dry’ and ‘wet’ events, respectively): prevalence of dry spells at all stages of crop development (‘dry–dry–dry–dry’) and prevalence of wet spells during the initial stage and dry spells for the rest of the growing season (‘wet–dry–dry–dry’).

Fig. 6: Changes in maize crop risk related to predominant wet spell patterns.

Changes in maize crop risk in WA, EA and SA related to changes in the likelihood of two dry–wet spell patterns during the growing season (considering 0.1–0.3 and 0.7–0.9 as SPEI quantile ranges to define ‘dry’ and ‘wet’ events, respectively): prevalence of wet spells at all stages of crop development (‘wet–wet–wet–wet’) and prevalence of dry spells during the initial stage and wet spells for the rest of the growing season (‘dry–wet–wet–wet’).

Discussion

Our analysis highlights the inadequacy of assuming static weather–crop yield relationships across time and space in the context of agricultural decision-making. Detected changes in climatic conditions during the maize growing season were heterogeneous across SSA from 1982 to 2009. The rainfall trend was positive in most of SA (in agreement with ref. 23) and in the Sahel, hence coherent with its observed greening, for example, ref. 24, and negative in certain areas of the Gulf of Guinea and Sierra Leone, where a trend towards less frequent but more intense rainfall was detected25. In most of EA, the precipitation trend was negative at the beginning of the growing season and positive during the mid-season and late stages. This agrees with the findings of a previous study26 about the ‘Eastern African paradox’, which attributed the regional rainfall decline to a later onset and earlier cessation of the long rains. Mean temperature values rose consistently during the growing season (except for some areas where the trend was non-significant) although the magnitude of this increase was uneven at the regional level. Despite warmer maize seasons, the incidence of dry spells showed a negative trend in large areas of SSA.

According to our results, the relative contribution of temperature to the onset of dry events was not constant throughout the growing season, being most relevant at the initial stage compared with subsequent stages and increasing with the intensity of the event. This is relevant because compound extremes (particularly ‘hot droughts’ resulting from high temperatures and elevated vapour pressure deficits) have been linked to especially poor harvests (up to 30% yield losses) in several regions of the world since 2000 (ref. 27). Changes in mean temperature and total precipitation at different developmental stages could explain between 30% and 35% of the interannual yield variance in SSA (similar to previous findings3 at the global scale). However, changes in the prevalence of dry–wet spells across the growing season increased the explained variance to 50–60%. These results agree with findings concluding that extreme weather indices are as informative as, or more informative than, mean ones14 or attributing 18–43% of the explained variance at the global scale of four major crops to climate extremes28. We found that a few dry–wet spell patterns drove maize yield variability during the reference period: the prevalence of dry or wet events at all developmental stages and the occurrence of substantially different conditions during the initial stage with regard to the rest of the season. Changes in crop risk were unevenly distributed, although those associated with prevalent dry spells (which resulted in net production losses at the regional level) generally decreased during the reference period.

The statistical nature of the selected approach makes it suitable for large spatio-temporal scales and limited data availability, as it can capture the effects of having limited understanding of processes because of the absence of field and management data for validation and assessment29. This is especially relevant in developing countries and more concretely in SSA. Statistical methods may also provide estimates of warming impacts in the short term similar to those of crop models30, even if their insights are limited by the historical envelope of variability. However, the quality of available agricultural statistics in the region introduces a large source of uncertainty. For example, it has been argued that low-quality and untimely agricultural production data plausibly impede agricultural development in SSA31 and that African countries often lack the most fundamental information to shape the design of effective policies and programmes in agriculture and the rural space32. Climatic forcing data also contribute to this uncertainty, as the largest differences between datasets correspond to regions where in situ observations are sparse, inconsistent or incomplete, such as SSA. These differences are stronger for precipitation than for temperature, and for distributional characteristics and extreme events than for the mean response33. Future research to address these issues could include considering additional climate datasets and the role of teleconnections, combining the proposed methodology with crop model outputs (such as the ones from the Agricultural Model Intercomparison and Improvement Project34) under multiple climate scenarios to assess crop viability in the medium and long term, and assessing whether nonlinear regression methods outperform linear ones (for example, ref. 14).

Another potential limitation is the use of fixed crop calendars, as they overlook the possibility of shifting planting dates (the simplest adaptation strategy to tackle altered seasonal rainfall distribution and thermal conditions, for example, ref. 35). Although we show that a 10 day aggregation period is suitable for mitigating the effect of shifts in the rainy season onset around the planting date in this particular case (Supplementary Fig. 17), we recommend deriving time-varying optimal planting dates based on rainy season onset. Finally, we did not contemplate the influence of other biotic (for example, weeds, pests) and abiotic (for example, agricultural inputs, management practices or political instability) factors on maize yield. We assumed that it is limited by the low use of irrigation and fertilizers in SSA and the application of a first-differences approach (which minimizes the impact of slowly changing factors), although it should be assessed in future analyses. Despite these considerations, this approach can contribute to support agricultural stakeholders in making climate-informed decisions about crop suitability at the regional level.

Methods

Case study

SSA is expected to be the region most at risk in terms of food security by 2050. Driven by a projected 2.5-fold population growth, food demand by 2050 could increase by over 300% in relation to the period 2005–2007 and demand for cereals is expected to approximately triple6. Cereals are the main staple crops in SSA, but the region largely depends on substantial cereal imports to feed its growing population (the cereal import dependency ratio was around 20% during the period 2010–2019; ref. 22). Maize is the leading cereal in SSA, playing a role comparable to that of rice or wheat in Asia. It is mostly grown by smallholder farmers for food under rainfed conditions, across multiple agroecological zones and farming systems, and consumed by people with diverse socioeconomic levels and dietary preferences. Concretely, maize constitutes a source of food security and livelihoods for 208 million people in SSA and accounts for almost 50% and 20% of calories and protein consumed in EA and WA, respectively36. Per capita maize consumption (as a food source) in Africa is one of the highest worldwide (45.1 kg per capita per year), especially in SA (where Lesotho, Malawi, Zambia and South Africa are among the top 5 consumers along with Mexico, with averages exceeding 100 kg per capita per year37). About 40 Mha (the largest land area of all staples in SSA) is devoted to the production of maize, the primary cereal grown in half of the SSA countries and one of the top two cereals in 75% of these countries38. Suitable area for the cultivation of maize (depending on specific crop requirements compared with the dominant agroclimatic and agro-edaphic conditions) is estimated at 68% of the rainfed cropland in SSA39. During the period 2007–2017, the area occupied by maize crops in SSA increased by almost 60% (ref. 40).
However, maize yields in SSA remain low: country-average yields of rainfed maize ranged between 1.68 t ha−1 and 1.99 t ha−1 from 2007 to 2016, representing around 15–25% of country-mean water-limited yield potential for most countries41. As a result, Africa only generates 7.4% of maize global production, even if the continent hosts more than 20% of the maize-cultivated area worldwide37. Despite the socioeconomic relevance of rainfed maize in SSA, the body of research on the impacts of climate variability on crop yield at the regional level is still limited and thus prevents the implementation of relevant policy frameworks42. The low quality and quantity of agricultural statistics also hamper sectoral policy development in SSA. Key factors of data-related barriers include the lack of financial resources to generate survey or administrative data of sufficient quality and scope to inform policy, lack of human resources to collect such data in a cost-effective and sustainable manner, lack of an integrated approach for the collection of household-survey data, lack of standardization when data are collected by different agencies and institutions, limited analytical capacity and/or poor dissemination of the available data and results32.

Data sources

Data on climatic variables were extracted from the AgMERRA dataset20, which was specifically conceived for the assessment of agricultural impacts triggered by climate variability and change, and provides daily, continuous, high-resolution data over the 1980–2010 period. According to a study that examined the impact of climatic forcing data on agricultural performance through 91 climate data–crop model combinations33, the AgMERRA dataset was able to reproduce the bulk of the signal captured in the Global Gridded Crop Model Intercomparison (GGCMI) ensemble.

Global annual time series data of maize yield at 0.5° grid-cell resolution for the period 1981–2016 were obtained from the GDHY aligned version v1.2 + v1.3 (ref. 19). The GDHY is a hybrid of agricultural census statistics and satellite remote sensing, as yield data are estimated using yield statistics reported by FAO at the national level and a satellite-derived crop-specific vegetation index. Growing season data for maize at the global level (particularly planting dates and length of the growing season) were obtained from the GGCMI Phase 2 (ref. 43). The procedure applied to obtain the crop calendar used in that study is described in more detail elsewhere44. Specifically, that study used data from two global cropping calendars, the first corresponding to the global dataset of monthly irrigated and rainfed crop areas around the year 2000 (MIRCA2000, monthly resolution45) and the second proposed by the Center for Sustainability and the Global Environment (SAGE) of the University of Wisconsin-Madison (daily resolution46). The SAGE crop calendar, derived from observations of crop planting and harvesting dates from six data sources, was used where available, whereas the MIRCA2000 one was used only in regions not covered by the former (which is not the case for SSA).

Methodological framework

The selected agroclimatic index is the SPEI computed at the dekadal (10 days) timescale. SPEI values were subsequently classified and coded as alphanumeric sequences for each stage of crop development. Characterization of SPEI classes at each developmental stage involved the assessment of their relative frequency and estimation of the relative importance of temperature versus precipitation. The latter was performed by means of a random forest (RF) algorithm, considering mean 10 day precipitation and temperature as predictors and the corresponding SPEI class as the dependent variable (Supplementary Methods and Supplementary Table 3). Similarities of SPEI class sequences were then assessed with regard to reference ones (series of dry–wet spells), using a predefined scoring system. Next, these similarity measures were used as inputs of a Bayesian network (BN) to assess the likelihood of concurrent dry–wet spells at different crop growth stages and their potential impacts on yields. Both likelihood and impacts were subsequently discretized, combined and normalized through a risk matrix framework to produce regional crop risk maps (Supplementary Fig. 14). Data during the reference period were split in blocks of 10 years to detect trends and their significance regarding mean precipitation, temperature and the incidence of dry–wet spells by means of Sen’s slope47 and the Mann–Kendall test48,49. Finally, we performed two multiple linear regressions to explain maize yield variance using two different sets of predictors: changes in mean precipitation and temperature at the four stages of crop development and changes in dry–wet spell patterns across these stages (Supplementary Fig. 12).
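The trend detection step mentioned above can be sketched as follows. This is a minimal illustration, assuming an evenly spaced series without missing values. The function names are hypothetical, and the Mann–Kendall P value uses the usual normal approximation with tie and continuity corrections, which may differ in detail from the implementation used for the analysis.

```python
from collections import Counter
from itertools import combinations
from math import sqrt
from statistics import NormalDist, median


def sens_slope(values):
    """Sen's slope: median of pairwise slopes for an evenly spaced series."""
    slopes = [(values[j] - values[i]) / (j - i)
              for i, j in combinations(range(len(values)), 2)]
    return median(slopes)


def mann_kendall(values):
    """Two-sided Mann-Kendall test; returns (S, Z, p_value)."""
    n = len(values)
    # S counts concordant minus discordant pairs (i < j).
    s = sum((values[j] > values[i]) - (values[j] < values[i])
            for i, j in combinations(range(n), 2))
    # Variance of S under the null hypothesis, corrected for tied values.
    ties = Counter(values).values()
    var_s = (n * (n - 1) * (2 * n + 5)
             - sum(t * (t - 1) * (2 * t + 5) for t in ties)) / 18
    if s > 0:
        z = (s - 1) / sqrt(var_s)   # continuity correction
    elif s < 0:
        z = (s + 1) / sqrt(var_s)
    else:
        z = 0.0
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return s, z, p_value


# Significance at the 90% confidence level corresponds to p_value < 0.10.
slope = sens_slope([3, 5, 4, 7, 8, 9, 11, 10, 13, 14])
s_stat, z_stat, p_val = mann_kendall([3, 5, 4, 7, 8, 9, 11, 10, 13, 14])
```

Both statistics are rank based, so they are robust to outliers and make no assumption about the distribution of the series, which suits short and noisy records.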

SPEI computation

The SPEI50 is able to consider the roles of both precipitation and evapotranspiration in drought onset and could outperform other agroclimatic indices as a predictor of crop yield51,52,53. The standard procedure for the computation of the SPEI involves the following steps: (1) computing the effective precipitation as precipitation minus evapotranspiration, (2) selecting a suitable aggregation period depending on the purpose of the assessment, (3) fitting a three-parameter statistical distribution to the aggregated effective precipitation time series (for example, log-logistic, generalized extreme value distribution) and (4) transformation to a standardized normal distribution.

Here we computed the evapotranspiration using the Penman–Monteith equation54 as recommended in a previous study55. The selected statistical distribution was the generalized extreme value, and values of the index were limited to the range [−3, 3] to ensure reasonableness (as suggested in a previous study56). The goodness of fit of the distribution function was assessed through the Anderson–Darling and Kolmogorov–Smirnov tests (Supplementary Methods and Supplementary Table 4). The length of the aggregation period was carefully considered, as weather impacts on yields are very time specific (for example, ref. 57). While sectoral drought impact assessments usually consider time windows of several months58, sub-monthly time scales could be more informative whenever onset and end dates of dry spells are relevant59. This is especially important as climate change is driving a transition towards flash droughts (rapid-onset, sub-seasonal droughts) over 74% of global regions60. After a preliminary assessment of the suitability of several sub-monthly aggregation periods to explain yield anomalies (Supplementary Methods and Supplementary Table 5), an initial 10 day (dekadal) time window was selected. A dekadal aggregation is often used in agricultural sciences, as it allows the provision of indicators regarding varying growing seasons and crop phenological phases worldwide, more precisely than a monthly resolution would (for example, ref. 61). Next, we conducted a complementary analysis to ensure that this aggregation period was adequate to mitigate the effect of shifts in the rainy season onset around the planting date (Supplementary Methods and Supplementary Fig. 17).
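The four computation steps can be illustrated with a minimal Python sketch. It is not the procedure used in this study: evapotranspiration is taken as a given input rather than computed with the Penman–Monteith equation, dekads are approximated by consecutive 10 day blocks, and the fitted generalized extreme value distribution is replaced by an empirical (Gringorten) plotting position so that the example needs only the standard library. The function names are hypothetical.

```python
from statistics import NormalDist


def dekadal_water_balance(precip, pet, window=10):
    """Steps 1-2: daily precipitation minus evapotranspiration, summed per block.

    Assumption: a dekad is approximated by consecutive 10 day blocks;
    a trailing incomplete block is dropped.
    """
    balance = [p - e for p, e in zip(precip, pet)]
    return [sum(balance[i:i + window])
            for i in range(0, len(balance) - window + 1, window)]


def standardize(values, lo=-3.0, hi=3.0):
    """Steps 3-4, simplified: map each value to a standard normal score.

    Assumption: an empirical Gringorten plotting position stands in for
    the fitted three-parameter distribution. Scores are limited to
    [lo, hi], mirroring the [-3, 3] range mentioned in the text.
    Tied values receive distinct ranks in this simplified version.
    """
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    std_normal = NormalDist()
    index = [0.0] * n
    for rank, i in enumerate(order, start=1):
        prob = (rank - 0.44) / (n + 0.12)
        index[i] = max(lo, min(hi, std_normal.inv_cdf(prob)))
    return index
```

In practice, `values` would normally hold the water balance of the same dekad across all years of the reference period, so that each dekad is standardized against its own climatology.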

SPEI classification and class characterization

SPEI values were classified using objective drought thresholds based on quantiles62. Concretely, we considered the following classes and related quantiles: very dry (A; <0.1), dry (B; 0.1–0.3), moderately dry (C; 0.3–0.5), moderately wet (D; 0.5–0.7), wet (E; 0.7–0.9) and very wet (F; >0.9). Sequences of SPEI classes for each growing season were then extracted and divided according to maize growth stages, considering the available crop calendar data (planting dates and length of the growing season) and the length of the maize growth stages according to ref. 54. As the length of the growing season is highly variable across SSA, the length of each stage (initial, development, mid-season and late season) was computed as a percentage of the length of the growing season (Supplementary Table 1).
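As an illustration of the coding step, the sketch below maps SPEI values to the six class letters. Two choices are assumptions rather than details given in the text: SPEI values are converted to quantiles through the standard normal cumulative distribution function, and a value lying exactly on a threshold is assigned to the wetter class. The names `classify_quantile` and `encode_spei` are hypothetical.

```python
from bisect import bisect_right
from statistics import NormalDist

CLASS_LABELS = "ABCDEF"                    # very dry ... very wet
UPPER_QUANTILES = [0.1, 0.3, 0.5, 0.7, 0.9]  # class boundaries from the text


def classify_quantile(q):
    """Return the class letter for a quantile in [0, 1].

    Assumption: boundary values fall into the wetter class.
    """
    return CLASS_LABELS[bisect_right(UPPER_QUANTILES, q)]


def encode_spei(spei_values):
    """Code a series of SPEI values as a string of class letters.

    Assumption: quantiles are taken from the standard normal CDF.
    """
    cdf = NormalDist().cdf
    return "".join(classify_quantile(cdf(v)) for v in spei_values)
```

For example, `encode_spei([-2.0, -1.0, -0.3, 0.3, 1.0, 2.0])` yields one letter per dekad, which can then be sliced into the four growth stages once their lengths are known.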

SPEI classes for each stage of crop development were analysed as follows: (1) computation of precipitation and temperature statistics, (2) estimation of the relative importance of temperature versus precipitation for each class and developmental stage (using an RF algorithm and mean 10 day values of both variables to explain the SPEI class; Supplementary Table 3 and Supplementary Figs. 15 and 16), (3) computation of the relative frequency of each class at each developmental stage and (4) detection of trends in class frequency using Sen’s slope and the Mann–Kendall test.
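The trend detection in step (4) can be sketched as follows. This is a textbook form of both statistics, not necessarily the implementation used here: the Mann–Kendall variance omits the correction for tied values, and the P value relies on the normal approximation, which is only rough for very short series. Function names are hypothetical.

```python
from itertools import combinations
from math import sqrt
from statistics import NormalDist, median


def sens_slope(series):
    """Median of the slopes between all pairs of points (Sen's slope)."""
    return median((series[j] - series[i]) / (j - i)
                  for i, j in combinations(range(len(series)), 2))


def mann_kendall(series):
    """Return (S, Z, two-sided P value) of the Mann-Kendall trend test.

    Assumption: no tie correction is applied to the variance of S.
    """
    n = len(series)
    # combinations() yields (earlier, later) pairs in time order.
    s = sum((later > earlier) - (later < earlier)
            for earlier, later in combinations(series, 2))
    var_s = n * (n - 1) * (2 * n + 5) / 18
    if s > 0:
        z = (s - 1) / sqrt(var_s)
    elif s < 0:
        z = (s + 1) / sqrt(var_s)
    else:
        z = 0.0
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return s, z, p_value
```

A positive Sen's slope together with a small P value would indicate that a class has become more frequent over successive blocks of years.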

Relative contribution of temperature to the onset of dry spells

We used an RF classifier to assess the relative contribution of temperature versus precipitation to dry spell onset. Concretely, we considered 10 day mean precipitation and temperature as predictors and the corresponding SPEI class as the dependent variable. To assess the accuracy of the RF classifier, the absolute error was set to zero if the predicted SPEI class was the same as the actual one, 0.2 if the predicted class was adjacent to the actual one (for example, when the predicted class is ‘dry’ and the actual one is ‘very dry’), 0.4 if the predicted class differed by two classes from the actual one and so on, up to the maximum possible absolute error of one (which occurred when the predicted class was ‘very dry’ and the actual one was ‘very wet’ or vice versa; Supplementary Methods).
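The error scale described above amounts to the class distance divided by five. A minimal sketch, with hypothetical function names; averaging the errors over all predictions is an assumption, as the text does not state how individual errors were aggregated:

```python
CLASS_LABELS = "ABCDEF"  # very dry ... very wet


def ordinal_error(predicted, actual):
    """Absolute error in [0, 1]: 0.2 per class of separation."""
    distance = abs(CLASS_LABELS.index(predicted) - CLASS_LABELS.index(actual))
    return distance / (len(CLASS_LABELS) - 1)


def mean_ordinal_error(predicted, actual):
    """Assumption: errors are aggregated as a simple mean."""
    errors = [ordinal_error(p, a) for p, a in zip(predicted, actual)]
    return sum(errors) / len(errors)
```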

Sequence similarity assessment

This step involved the comparison of SPEI class sequences with a set of reference sequences. Six reference sequences were formed by repeating the character corresponding to each class (for example, ‘A’ for ‘very dry’) as many times as necessary to match the length of the original sequence. Similarity of the original sequence with regard to the reference ones was then assessed through a penalty matrix (equation (1)) based on the Needleman–Wunsch algorithm63. This penalty matrix assigns the minimum penalization (zero) if SPEI class characters occupying the same position in both sequences are identical (for example, both are ‘C’ corresponding to ‘moderately dry’), a penalty value of one if characters belong to adjacent classes (for example, ‘E’ (wet) in one sequence and ‘D’ (moderately wet) or ‘F’ (very wet) in the other) and so on. Maximum penalization (five) is assigned when the SPEI class in one sequence is ‘A’ (very dry) and the corresponding character in the other is ‘F’ (very wet).

$$\mathrm{Penalty}=\left(\begin{array}{cccccc}0 & 1 & 2 & 3 & 4 & 5\\ 1 & 0 & 1 & 2 & 3 & 4\\ 2 & 1 & 0 & 1 & 2 & 3\\ 3 & 2 & 1 & 0 & 1 & 2\\ 4 & 3 & 2 & 1 & 0 & 1\\ 5 & 4 & 3 & 2 & 1 & 0\end{array}\right)$$
(1)

The similarity between two sequences is computed through equation (2). Note that the divisor, 5 × n, is the maximum possible total penalization under the penalty matrix of equation (1):

$${\mathrm{Similarity}}_\mathrm{S1,S2}=1-\frac{{\sum }_{i=1}^{n}\mathrm{Penalties}}{5\times n}$$
(2)

where S1 is the first sequence, S2 is the second sequence, n is the number of elements in each sequence and the sum runs over the penalties assigned at each position i.
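Equations (1) and (2) can be sketched in a few lines. Because each entry of the penalty matrix equals the absolute difference between class positions, the matrix is not stored explicitly. Function names are hypothetical.

```python
CLASS_LABELS = "ABCDEF"  # very dry ... very wet


def penalty(a, b):
    """Entry of the penalty matrix in equation (1) for classes a and b."""
    return abs(CLASS_LABELS.index(a) - CLASS_LABELS.index(b))


def similarity(s1, s2):
    """Equation (2): one minus the total penalty over its maximum, 5 * n."""
    if len(s1) != len(s2) or not s1:
        raise ValueError("sequences must be non-empty and of equal length")
    total = sum(penalty(a, b) for a, b in zip(s1, s2))
    return 1 - total / (5 * len(s1))


def reference_similarities(sequence):
    """Similarity of a sequence to each of the six reference sequences."""
    return {label: similarity(sequence, label * len(sequence))
            for label in CLASS_LABELS}
```

For instance, the sequence ‘ABCD’ compared with the reference ‘AAAA’ accumulates penalties of 0 + 1 + 2 + 3 = 6, giving a similarity of 1 − 6/20 = 0.7.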

BN and evidence propagation

The proposed BN is composed of five nodes that represent the crop yield and the four stages of crop development and was implemented through the ‘bnlearn’ R package64. As we had six similarity measures for each stage of development, 6⁴ = 1,296 possible combinations of these elements exist (one of the six measures is chosen for each of the four stages, with repetition allowed). To ensure the robustness of the methodology, the BN is fitted the same number of times (1,296) at each location using these combinations of data (Supplementary Fig. 13). We used a discrete BN to compute the joint and conditional probability of having weather conditions similar to those of the reference at different stages of crop development. Concretely, we considered 2,400 combinations of dry–wet spells, ranging from single events (for example, dry conditions during the initial stage) to the concurrence of two, three or four events throughout the growing season. For this purpose, similarity measures were first discretized into three categories (‘high’, ‘medium’ and ‘low’) and crop yields were transformed into a binary variable (which took a value of one if the annual yield was less than the 10% quantile and zero otherwise).
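The two counts quoted above can be checked by enumeration. The sketch below is one reading that reproduces both figures, and the exact enumeration used in the study may differ: each of the four stages either carries one of the six classes or is left unconstrained (`None`), and the pattern with no condition at any stage is excluded.

```python
from collections import Counter
from itertools import product

CLASS_LABELS = "ABCDEF"
N_STAGES = 4  # initial, development, mid-season, late season

# One reference class per stage, repetition allowed: 6**4 combinations.
reference_combinations = list(product(CLASS_LABELS, repeat=N_STAGES))

# Assumption: a spell pattern fixes a class at one to four stages and
# leaves the remaining stages unconstrained (None).
options = (None,) + tuple(CLASS_LABELS)
spell_patterns = [pattern for pattern in product(options, repeat=N_STAGES)
                  if any(c is not None for c in pattern)]

# Number of patterns by how many stages are constrained.
patterns_by_size = Counter(sum(c is not None for c in pattern)
                           for pattern in spell_patterns)

print(len(reference_combinations))  # 1296
print(len(spell_patterns))          # 2400
print(dict(patterns_by_size))       # counts for 1, 2, 3 and 4 concurrent events
```

Under this reading, the 2,400 patterns break down into 24 single events, 216 pairs, 864 triples and 1,296 four-stage combinations.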

Regarding the impact assessment, we analysed changes in the marginal distribution of yields when a given piece of evidence (for example, the occurrence of a certain combination of dry–wet spells) was propagated across the BN. Although the BN structure was the same as that described above, the original continuous variables were used instead of the discrete ones. Evidence propagation was performed through the ‘BayesNetBP’ R package65, considering that the conditions of interest had similarity measures equal to one (perfect match with the corresponding reference sequences at selected developmental stages).

Risk mapping

Risk evaluation was performed through the risk matrix framework proposed in a previous study66. The risk matrix is one of the main prioritization techniques in risk management owing to its simplicity and comprehensiveness. The framework has two components: the probability or likelihood of occurrence of a certain event and the severity of the consequences resulting from its occurrence. Each component was assigned a score from 0 to 5 (where 5 represents the highest likelihood or impact and 0 the lowest), and the risk score was calculated by multiplying the scores of both components. Finally, the risk score was normalized using a minimum–maximum technique (Supplementary Fig. 14).
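A minimal sketch of the scoring and normalization follows. It takes the 0 to 5 scores as given, since the rules used to discretize likelihood and impact are not detailed here, and it assumes that normalization is applied across the set of locations being mapped. Function names are hypothetical.

```python
def risk_score(likelihood, impact):
    """Product of two component scores, each on a 0-5 scale (range 0-25)."""
    for score in (likelihood, impact):
        if not 0 <= score <= 5:
            raise ValueError("component scores must lie between 0 and 5")
    return likelihood * impact


def min_max_normalize(scores):
    """Rescale scores linearly so the minimum maps to 0 and the maximum to 1.

    Assumption: if all scores are equal, every location is assigned 0.
    """
    lowest, highest = min(scores), max(scores)
    if highest == lowest:
        return [0.0 for _ in scores]
    return [(s - lowest) / (highest - lowest) for s in scores]
```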

Multiple linear regression

We used a common approach based on first differences (for example, ref. 3) to evaluate the relationship between the 10 year time series for yield and climatic variables. Concretely, we performed two multiple linear regressions with first differences of yield as the response variable and two different sets of variables as predictors: (1) first differences of mean precipitation and temperature at the four stages of maize development and (2) natural logarithm (for example, ref. 67) of first differences of joint probability for multiple dry–wet spell patterns. For the second set, we used an iterative process of calculating P values (based on the t-test) and selecting features according to their significance level. The aim was to obtain a more parsimonious and effective model by including only the most relevant explanatory variables. The median number of variables in the final model for each point is 6–7 depending on the region. In addition, to compare models with different numbers of predictors, we considered the adjusted R² (equation (3)) instead of R², as the latter never decreases when variables are added.

$$\mathrm{Adjusted}\,R^{2}=1-\frac{\left(1-R^{2}\right)(n-1)}{n-p-1}$$
(3)

where n is the sample size and p is the number of predictors.
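For illustration, the first-difference transformation and equation (3) can be written as two small helpers (hypothetical names; the regression fitting itself is not reproduced):

```python
def first_differences(series):
    """Change between consecutive observations of a time series."""
    return [later - earlier for earlier, later in zip(series, series[1:])]


def adjusted_r2(r2, n, p):
    """Equation (3): R squared penalized for the number of predictors p."""
    if n - p - 1 <= 0:
        raise ValueError("sample size must exceed the number of predictors plus one")
    return 1 - (1 - r2) * (n - 1) / (n - p - 1)
```

As a worked case, a model with R² = 0.5, n = 11 and p = 2 has an adjusted R² of 1 − 0.5 × 10/8 = 0.375, which shows how the penalty grows when the sample is small relative to the number of predictors.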

Reporting summary

Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.