Introduction

Soybean is the largest traded agricultural commodity comprising more than 10% of the total value of global agriculture trade1. The majority of the global soybean crop is produced in the United States (US), Brazil, and Argentina which together contribute approximately 75% of the global annual soybean supply (Fig. 1a). This highly concentrated soybean production in three regions makes the global soybean supply vulnerable to regional production shocks. Climate-related shocks, in particular, can have substantial impacts, with concurrent multivariate hot and dry extremes among the most detrimental environmental factors affecting crop yields2,3. These are critically damaging when they co-occur in multiple hotspot-producing regions within a single harvesting season, which can threaten the stability of the global food system4,5. One key contribution of this work is to investigate a specific case study that combines both multivariate and spatially compound weather extremes6 with disproportionate impacts on global soybean production.

Fig. 1: Global and regional soybean production trends and 2012 anomalies.

a Total national soybean production (Mt) time series (source: FAOSTAT) split into Brazil, Argentina, United States and the rest of the world. b LOESS (locally estimated scatterplot smoothing) detrended global soybean production anomaly (Mt). Red dot indicates the year 2012. c–e Anomalies of linearly detrended 2012 soybean yield (t/ha) (source: USDA/IBGE/SIAA), summer maximum temperature (mean daily maximum temperature) (source: CRU36) and summer root zone soil moisture (% deviation from climatological mean) (source: GLEAM37) at county scale in the Americas. Summer refers to the Jul-Aug-Sep average in the northern hemisphere and the Jan-Feb-March average in the southern hemisphere.

The year 2012 stands out as one in which soybean yields failed in large parts of both the US and Southeast South America (SESA). This led to an unprecedented soybean production shock even when expressed in a relative sense, with the global production anomaly being 10% below anticipated trend levels (Fig. 1b). A 2-year persistent La Niña event in 2010–2012 favored an active South Atlantic convergence zone in austral summer 2011–2012 and a strong negative ‘horseshoe’ sea-surface temperature pattern in the North Pacific in boreal summer 2012. Both have previously been shown to be important precursors of hot and dry conditions in SESA and the US, respectively, mediated via dynamically well-understood teleconnections7,8,9,10. Analyzing how anthropogenic climate change modulates the impacts of such spatially compound harvest failures can provide relevant information for adaptive planning concerning similar events in the future. Furthermore, it enables an event-based estimation of losses or gains directly incurred due to climate change11.

As defined by the IPCC Working Group II (WGII), impact attribution involves quantifying how changes in climate-related systems cause changes in natural or human systems12. While impact attribution does not necessarily entail attributing changes to human-induced climate forcing, an increasing number of studies do consider this aspect13, which is the focus of our study. However, attributing the anthropogenic imprints to weather extremes and their subsequent impacts is challenging due to the intertwined roles of natural variability and anthropogenic climate change in driving such events14. Attribution methods typically involve comparing the observed system state, which includes anthropogenic influences, to a counterfactual baseline that excludes these human-induced forcings.

The most commonly used probabilistic approach involves comparing large ensembles of climate model experiments, specifically control simulations that exclude anthropogenic forcings and historical simulations that include them. This comparison allows for separating forced changes from internal variability15. However, this method is vulnerable to epistemic model errors. In particular, this concerns atmospheric circulation changes, which are much more uncertain than the well-understood and modeled thermodynamic changes associated with global warming16. Particularly noteworthy is the current lack of confidence in the greenhouse gas–forced response of ENSO dynamics. Recent research highlighted that current models fail to reproduce the observed strengthening of the east-west tropical Pacific sea-surface temperature (SST) gradient, which favors La Niña-like conditions17,18. These inconsistencies could lead to important errors in both attribution and projection of anthropogenic climate change impacts in regions sensitive to tropical Pacific SSTs19.

The storyline approach has been proposed as a framework to address climate change imprint on high-impact events in the context of deep uncertainty20,21. One way to do this is via climate model experiments that can realistically simulate a prescribed observed circulation anomaly under different background climatological conditions, so-called spectrally nudged atmospheric experiments22,23,24. Such conditioning removes the effects of highly uncertain circulation changes and allows one to focus exclusively on the thermodynamic implications of climate change on a particular event of interest. This type of assessment is, by definition, deterministic and does not explore, for instance, changes in the frequency of the event. Nevertheless, it avoids the generation of probabilistic statements that can be misleading due to high model uncertainty. Such spectrally nudged storyline approaches can provide key information on the implications of climate change but have not been used yet to study high-impact crop failure events.

Here, we use spectrally nudged atmospheric experiments that reproduce the anomalous circulation state and associated surface extremes of 2012 under observed factual conditions (1 °C warming), as well as under two counterfactual scenarios: pre-industrial conditions (no warming) and a 2 °C warmer world24. We first use mixed-effect statistical models (see “Methods”) to quantify the relationship between crop yields and summer temperature and soil moisture at the county level. We then use the estimated relationships to attribute 2012 soybean production deficits based on event storylines of temperature and soil moisture conditions under different levels of warming. By comparing soybean production anomalies between storylines, we quantify the extent to which anthropogenic climate change modulates the impacts of an event with the same drought-inducing atmospheric circulation conditions as observed in 2012. In other words, we ask what the impacts would have been if the 2012 conditions, along with their level of agri-technology, land management, harvested area, etc., had occurred in a different climate.

Results

Compound hot and dry conditions as key driver of crop losses

We concentrate our analysis on the US, Brazil, and Argentina given their large share of global soybean production. Production in all three countries in 2012 fell short of trend-expected levels (Fig. 1a, b). Within the US, negative yield anomalies were widespread, except for the eastern coast. In South America, adverse yield anomalies were primarily concentrated in the SESA region, encompassing the southern parts of Brazil and Argentina. Crop yield estimates were near average only in the province of Buenos Aires (south of 34 °S) and the central Brazil (CB) region (north of 20 °S) (Fig. 1c). The spatial pattern of this compound yield anomaly closely mirrors the estimates of summer temperature and soil moisture anomalies in their respective regions (Fig. 1d, e).

To quantify this climate-yield relationship, we link detrended crop yields (in tons per hectare) to detrended summer soil moisture, temperature and their interaction using three distinct mixed-effect statistical models for the US, SESA, and CB regions separately (Fig. 2). We allow estimated relationships (e.g., the sensitivity of soybeans to high temperatures) to vary at county level per region, which offers the advantage of quantifying both region-wide and county-specific effects simultaneously (Supplementary Fig. 1). This approach accommodates the potential idiosyncrasies of local conditions that might contribute to varying crop-climate sensitivities at county scale. Furthermore, explicitly accounting for the interaction between soil moisture and high temperatures was shown to lead to significant model improvements, allowing the models to capture the physiologically expected increase in crop sensitivity to temperature when soil moisture is low2,25,26.

Fig. 2: Sensitivity of soybean yield anomalies to summer soil moisture and maximum temperature anomalies.

a Yield anomalies (t/ha, color shading) as a function of summer soil moisture (%, vertical axis) and maximum temperature (°C, horizontal axis) anomalies. Contour lines represent modeled yield sensitivity based on regional fixed effect model coefficients. Grids represent observed soybean anomalies aggregated within bin intervals of 2.5% (height) and 0.25 °C (width) for soil moisture and temperature anomalies respectively. Marginal R2 considers the proportion of variance explained by the regional fixed effects only while conditional R2 additionally considers variance explained by local effects relative to the overall variance. b Out-of-sample 2012 model yield predictions at county level. The three boxes represent, from north to south, the United States (US), Central Brazil (CB), and southeast South America (SESA) regions, respectively.

Our statistical crop models explain roughly one-third of soybean variability in the US and SESA (Fig. 2a, b), in line with previous research27,28. We note considerably lower explained variability for the CB region (Supplementary Table 1). We further calculate the out-of-sample fraction of explained variability (R2) at the local and regional scale which leads to similar results, highlighting model robustness (Supplementary Fig. 2). Specifically for 2012, we report out-of-sample predictions that show a very similar spatial pattern and intensity to observed soybean yield anomalies in 2012 across the Americas (compare Figs. 1c and 2b). We find that 1 °C warmer summers, on average, lead to soybean yield losses of 0.07 t/ha in the US and 0.1 t/ha in the SESA and CB regions. Similarly, soil moisture anomalies of 10% below average conditions correspond to average yield losses of 0.1, 0.3 and 0.06 t/ha in the US, SESA, and CB regions, respectively. Notably, the combination of a 1 °C temperature increase and a 10% soil moisture deficit results in additional compound negative yield impacts of 0.03, 0.24, and 0.03 t/ha, beyond the independent effects of heat and drought alone (Fig. 2a).
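
As a rough worked example, assuming the reported average effects can simply be added (a simplification that ignores the quadratic terms and county-specific deviations of the fitted models), the total yield loss for a summer that is 1 °C warmer and 10% drier than average would be approximately:

$$\begin{aligned}
\text{US:}\quad & 0.07 + 0.10 + 0.03 = 0.20\ \text{t/ha} \\
\text{SESA:}\quad & 0.10 + 0.30 + 0.24 = 0.64\ \text{t/ha} \\
\text{CB:}\quad & 0.10 + 0.06 + 0.03 = 0.19\ \text{t/ha}
\end{aligned}$$

Under this simplification, the interaction term accounts for roughly 15% of the combined loss in the US and CB, but close to 40% in SESA.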

The compound impacts of moisture and temperature reflect distinct crop physiological stress to combined hot-dry conditions that cannot be inferred from simply the sum of moisture and temperature impacts29,30. Dry soils can make crops much more susceptible to heat stress due to a variety of physiological processes, including a lack of evaporative cooling30. Similarly, heat stress can damage a crop’s roots, making it more sensitive to drought conditions by constraining root water uptake31. While soybean sensitivity to climate variability exhibits variations at the local level due to county-specific effects such as selected cultivars or soil type, the estimated coefficients remain broadly consistent between county and regional average estimates. The effects of temperature diverge only for northern counties in the US (north of 40 °N) where higher temperatures have a limited or positive impact on soybean yields compared to largely negative effects across the US (Supplementary Fig. 1). This is in line with previous research2,32, and can be largely attributed to colder regional climatic conditions (Supplementary Fig. 3a). Lower crop sensitivity in CB has also been documented in previous research and can be attributed to a tropical humid climate which leads to a reduced frequency of hot and dry events9,33. Both these regions have, in fact, seen recent expansion in harvested area driven by growing demand for feed and concurrent favorable weather conditions34.

Estimating impacts of climate change on 2012 crop losses

We proceed to estimate yield anomalies of the 2012 conditions under a pre-industrial and plus 2 °C climate. This consists of combining storylines of summer 2012 weather conditions under different warming levels with the statistical crop models established above. For the weather input, we rely on global climate model experiments from the ECHAM6 model where large-scale vorticity and divergence in the free atmosphere are spectrally nudged toward reanalysis conditions23,35. This reproduces three ensemble members of the 2012 large-scale circulation anomaly under pre-industrial, present-day and plus 2 °C warming levels while allowing surface temperature and soil moisture to respond freely (see “Methods”). The spectrally nudged ECHAM6 model is able to reproduce the inter-annual variability in soil moisture and temperature well compared to observation-based data products (Supplementary Fig. 3). However, model biases with respect to the absolute magnitude of the respective soil moisture and temperature anomalies remain (Supplementary Fig. 4b). To avoid propagating this bias in our impact calculations, we use original CRU36 and GLEAM37 based temperature and soil moisture values as 2012 reference conditions, hereafter “factual 2012,” and apply delta changes (see “Methods”) on those anomalies to obtain our adjusted storyline datasets38. Finally, we transform yield anomalies to production anomalies by multiplying by harvested area estimates for 2012. This step accounts for the spatial pattern of yield and harvested area and the varying contribution of counties to total soybean production when aggregating across spatial scales.
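
The final aggregation step can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `production_anomaly` and the three county values are hypothetical and are not taken from the study's data.

```python
def production_anomaly(yield_anomalies, harvested_areas):
    """Aggregate county yield anomalies (t/ha) into a production anomaly (t).

    Each county contributes its yield anomaly multiplied by its harvested
    area (ha), so counties with more soybean area weigh more in the total.
    """
    if len(yield_anomalies) != len(harvested_areas):
        raise ValueError("one harvested area is needed per county")
    return sum(y * a for y, a in zip(yield_anomalies, harvested_areas))


# Hypothetical example: three counties (illustrative numbers only).
example_yields = [-0.5, -0.2, 0.1]            # t/ha
example_areas = [10000.0, 40000.0, 5000.0]    # ha
example_total = production_anomaly(example_yields, example_areas)
print(f"Regional production anomaly: {example_total:.0f} t")
```

In this toy case the second county has the smaller yield loss per hectare but the larger production loss, which is why area weighting matters when moving from county to regional scale.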

To attribute the conditional climate change effects on the 2012 event, we compare soybean production estimates between factual (1 °C warming) and pre-industrial (no warming) conditions. We find that 35% (5–95% range of 13–45%) of the global soybean production deficit in 2012 is due to historic warming (Fig. 3). Production deficits varied considerably at the regional scale, with 3.5%, 222% and −14% changes for the US, SESA and CB, respectively, highlighting the heterogeneity of climate change impacts on local crop production. We note that for the CB region, these anomalies are small in magnitude compared to the US and SESA regions, reflecting low yield sensitivity and generally favorable factual 2012 local weather conditions (Figs. 1c and 2a). The large changes recorded in the SESA region can be explained by high crop-weather sensitivities and local temperature levels above mean climatological conditions. Pre-industrial temperature anomalies in the SESA region are generally below the 1980–2014 climatological mean of around 30 °C (Supplementary Fig. 5a). However, during 2012, temperatures in the entire SESA region exceeded that climatological value (Fig. 4e and Supplementary Fig. 6). Factual warming impacted production equally through both direct temperature increases and the indirect, stronger interactive heat-drought impacts, highlighting the non-linear response of crops to compound weather conditions. Although soil moisture changes themselves are negligible for South America, higher temperatures create stronger sensitivities to similar moisture deficits, as quantified by the steeper soil moisture-crop yield response curve at warmer levels (Fig. 4b). On the other hand, warming in the US due to historic climate change is of a smaller magnitude compared to that in the SESA region. This, in combination with slightly wetter soil moisture conditions, leads to milder additional yield losses in the US (as shown in Figs. 3 and 4, and Supplementary Figs. 6–8).

Fig. 3: Soybean 2012 production anomalies in pre-industrial, factual and plus 2 °C conditions.

Per region (colored bars) and multi-regional total production (gray bars) anomalies are calculated by summing up the product of county-level yield model estimates and local harvested area estimates. Contributions of temperature, soil moisture and combined temperature and soil moisture conditions are determined following county-level model coefficient estimates. Confidence intervals combine both the upper and lower storyline ensemble member estimate and the 5-95% confidence interval in model predicted production estimate. The mean effects (dots) consider both average prediction estimates and ensemble average storyline estimates.

Fig. 4: Soybean yield anomalies in response to 2012 summer soil moisture and maximum temperature anomalies in pre-industrial, factual and plus 2 °C conditions.

Estimates for summer soil moisture (%) (a–c) and maximum temperature (°C) (d–f) anomalies are calculated by taking a harvest area weighted spatial average for these variables over the three different regions separately. Confidence intervals take into account both the upper and lower storyline ensemble member estimate and the 5–95% confidence interval in the fixed effect model coefficients. The mean effects (solid dashed lines) are based on both the average fixed effect coefficients and ensemble average storyline estimates.

Climate change effects in a plus 2 °C storyline lead to roughly a 56% (34–91%) increase in the factual 2012 production anomaly. Regional changes are 61.1%, 25.5% and 728% for the US, SESA and CB regions, respectively (the extremely high percentage change for CB reflects the very small anomaly in 2012). Although changes in soil moisture are negligible for all regions, the absolute change in regional temperature anomaly conditions is larger going from factual to plus 2 °C storylines compared to differences between pre-industrial and factual storylines (Fig. 4 and Supplementary Fig. 6). This is particularly the case for the US where historic warming leads to 0.8 °C of local warming, but future local warming is roughly twice as large (Fig. 4f). This difference may be due to the regionally varying SST warming patterns in response to pre-industrial cooling (relative to 2012) and plus 2 °C warming. Different regional SST warming patterns can lead to slightly varying circulation imprints on local temperature and precipitation between the two periods.

Discussion

Here we apply statistical approaches that model the combined effect of summer soil moisture and temperature on crop yields. This formulation is consistent with recent efforts to model yield focusing on the interaction of timely water and heat stress conditions, including synergistic impacts of compound hot and dry conditions30. Alternatively, process-based crop models can be used to dynamically simulate the response of crops to weather although these require extensive computational and calibration efforts and have been shown to underestimate the impacts of weather extremes39,40. Statistical models, on the other hand, do imply a certain degree of extrapolation when inferring impacts of weather conditions outside the training dataset41. In this study, this is particularly relevant for simulated impacts in a 2 °C warmer climate where temperature anomalies are largely unprecedented. Nevertheless, current research shows similar crop-weather sensitivities and impact estimates from process-based and statistical models, at least with warming up to +2 °C, which suggests the estimates from this work are reliable41. Statistical models also ignore the potential yield-stimulating role of CO2 fertilization, but this has been shown to be significantly less effective during hot-dry conditions42.

We attribute the impacts of climate change using a highly conditioned setup where we prescribe the circulation pattern and assume no change in other relevant drivers such as biotic stress, harvested area, crop genetics and management. Thus, any adaptation and technological development are excluded from our analyses. Changes in some of these storyline parameters can alter the impacts of such events in the future. For instance, recent studies have explored the potential to considerably expand soybean production in Europe43,44. Such changes would imply a different concentration of future soybean production, different trade networks and, therefore, different cascading impacts. Moreover, here we assumed a fixed growing season, which in practice is affected by changes in temperature, photoperiod, and intertwined farming practices. Warmer temperatures are generally associated with earlier and faster phenological growth, which can lead to a shorter growing season45. This can affect the timing of peak physiological sensitivity to weather extremes, and the effectiveness of related processes, such as the CO2 fertilization effect. Research indicates that timely adaptation, including the introduction of new cultivars and adjustment of the growing season, can increase actual crop yields by up to 9% for soybean46. Future work could refine our storylines by including these effects, which could provide quantitative information for future adaptation planning.

The current exceptional concentration of global soybean production in just three countries renders this trade network particularly vulnerable to shocks5. Our analyses indicate that the thermodynamics of global warming significantly increase the severity of joint soybean breadbasket failures in key harvesting regions across the Americas. This raises significant concerns, as soybeans are currently the largest globally traded agricultural commodity, accounting for 60% of globally traded oilseed crops47. A substantial portion of soybean production is allocated to animal feed and is experiencing growing demand due to shifts in dietary preferences toward meat products48. A notable example is China, which presently accounts for 56% of global soybean demand, making it particularly vulnerable to the escalating intensity of spatially compound soybean failures with warming49. Additionally, soybeans are used in dairy and meat replacement products, as well as food aid meals for emergency relief programs, posing potential cascading impacts on various sectors that rely on soybean50. While our study has focused on soybeans, spatially synchronized harvest failures have been documented for other staple crops, including maize and wheat51. These failures can be induced by large-scale weather patterns like ENSO, the North Atlantic Oscillation, and circumglobal wave-trains, all of which exhibit uncertain responses to warming52,53. Considering the far-reaching implications of such synchronized crop failures on global food security, the development of carefully tailored storylines could provide useful information for adaptation planning. This can help increase preparedness for devastating yet plausible far-reaching food crises. Some plausible interventions could include a de-concentration of producing regions and enhancing the resilience of the trade-storage system. Additionally, the development of multi-stress resistant cultivars or reducing dependence on soybean for some key uses could be explored.

To conclude, we present here a framework that conditionally attributes the impacts of climate change on the unprecedented global soybean production failure in 2012. We find that one-third of the production deficit in 2012 was linked to anthropogenic global warming. Further warming in a +2 °C world (above pre-industrial) has the potential to further increase the production deficit by one-half compared to factual conditions. The crop losses are primarily due to the direct impacts of warmer temperatures on crops and the indirect impacts of warmer temperatures on physiological water stress. Although we find no substantial decrease in soil moisture across storylines, interactive heat-moisture effects exacerbate 2012-like events in a warmer climate. This study illustrates how the impacts of extreme weather can amplify with climate change. Our storyline-based impact attribution study provides a blueprint for future impact attribution studies, in principle applicable for any type of impact, which could be particularly relevant for estimates on loss and damage54.

Online methods

Crop yield, temperature and soil moisture datasets

National global production estimates (tons) for the period (1980–2014) are obtained from the FAOSTAT dataset. To estimate detrended anomalies, we subtract the long-term production trend calculated based on local regressions (loess). County-level yield (t/ha) and harvested area (ha) data for Argentina, Brazil and the United States for the period (1980–2014) are obtained from governmental sources: SIAA (http://www.siia.gov.ar/, last access: 1 February 2022), IBGE (https://www.ibge.gov.br/, last access: 1 February 2022) and USDA (https://quickstats.nass.usda.gov/, last access: 1 February 2022) respectively. County-level yield data are linearly detrended to eliminate long-term effects largely due to technological improvements. The harvested area per county is used to transform yield values in tons per hectare to production in tons. Root zone soil moisture and maximum temperature variables at monthly time scale are obtained from the gridded GLEAM v3.5a and CRU v. 4.06 datasets, respectively. GLEAM is a model-based dataset that assimilates observed satellite-based soil moisture input while CRU provides maximum temperature estimates based on station observations36,37. These datasets are filtered for the period 1980–2014 and temporally averaged over summer crop-sensitive periods (Jan-Feb-March for South America, Jul-Aug-Sep for the United States). Furthermore, the data is spatially averaged and linearly detrended at county level. Temperatures are presented as (°C) anomalies and soil moisture as (%) anomalies with respect to 1980–2014 climatology.
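
The linear detrending applied to the county-level series can be illustrated with a minimal Python sketch. It assumes an ordinary least-squares trend line fitted per county; the function name `linear_detrend` and the five-year series are hypothetical and serve only to show the mechanics.

```python
def linear_detrend(years, values):
    """Return residuals of `values` after removing a least-squares linear trend."""
    n = len(years)
    mean_x = sum(years) / n
    mean_y = sum(values) / n
    sxx = sum((x - mean_x) ** 2 for x in years)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(years, values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Anomaly = observed value minus the fitted trend value for that year.
    return [y - (intercept + slope * x) for x, y in zip(years, values)]


# Hypothetical county yield series (t/ha) with a shock in the third year.
example_years = [1980, 1981, 1982, 1983, 1984]
example_yields = [2.0, 2.1, 1.6, 2.3, 2.4]
example_anomalies = linear_detrend(example_years, example_yields)
print([round(a, 2) for a in example_anomalies])
```

The residuals sum to zero by construction, so a failure year appears as a negative anomaly relative to the technology-driven upward trend.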

Spectrally nudged storyline dataset

Storylines for the three levels of warming for the year 2012 are produced using the ECHAM6 atmospheric model with T255 horizontal spectral resolution and 95 vertical levels (T255L95) nudged with NCEP R1 reanalysis data24. To simulate conditions under different levels of warming, sea-surface temperatures and greenhouse gases are altered for each of the storylines23,24. Three ensemble members per storyline are considered to robustly estimate the climate change signal in the time series of concern. Details on the storylines dataset, including the different SST and greenhouse gas levels used for the simulation, can be found in van Garderen (2022)35.

We make use of the long-term spectrally nudged simulation (ECHAM_SN) that covers the period 1980–2014 to compare our storyline dataset to the GLEAM and CRU datasets. ECHAM_SN is produced in a similar way to the storyline dataset but covers a longer period and does not include ensemble members or counterfactual simulations55. We process the ECHAM_SN soil moisture and maximum temperature variables in a similar way to the GLEAM and CRU variables: we average over the abovementioned summer periods and spatial units and then linearly detrend at county level. We find statistically significant correlations between datasets for both detrended summer soil moisture and detrended maximum temperature at county level across the entire study domain (Supplementary Fig. 3).

Statistical analysis

We use a mixed-effect regression model per region to link detrended summer maximum temperature (CRU) and root zone soil moisture (GLEAM) to detrended soybean county-level yields (governmental sources). The regional model is defined as follows:

$$\begin{aligned}\hat{y}_{c,t} ={}& \left(\beta_{0}+\beta_{0,c}\right)+\left(\beta_{1}+\beta_{1,c}\right)\mathrm{SM}_{c,t}+\left(\beta_{2}+\beta_{2,c}\right)\mathrm{SM}_{c,t}^{2} \\ &+\left(\beta_{3}+\beta_{3,c}\right)\mathrm{TX}_{c,t}+\left(\beta_{4}+\beta_{4,c}\right)\mathrm{TX}_{c,t}^{2}+\left(\beta_{5}+\beta_{5,c}\right)\mathrm{TX}_{c,t}\,\mathrm{SM}_{c,t}\end{aligned}$$
(1)

where c is a county index and t is a year index (1980–2014). \({\hat{y}}_{c,t}\) is the predicted yield anomaly in county c and year t. \({\mathrm{TX}}_{c,t}\) and \({\mathrm{SM}}_{c,t}\) represent the detrended maximum temperature and soil moisture values in county c and year t, respectively. β0 represents the regional intercept, β1 and β2 represent the soil moisture regional effect, β3 and β4 represent the maximum temperature regional effect and β5 represents the temperature-soil moisture regional interaction effect. Coefficients with subscript c accommodate different sensitivities for each county per region, where the slope is random, with county as random factor. Confidence intervals for regional and county-level coefficients, in addition to yield prediction estimates, are calculated based on bootstrap resampling with 1000 draws. For further robustness tests, we calculate out-of-sample model predictions, including for year 2012, based on a leave-one-out cross-validation scheme and compare the model performance to in-sample model fit.
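
To make the structure of Eq. 1 concrete, the following Python sketch evaluates the predicted yield anomaly for one county and year once coefficients are available. It does not fit the model (that would require a mixed-model library and the county data); the function name `predict_yield_anomaly` and all coefficient values are hypothetical placeholders, not estimates from the study.

```python
def predict_yield_anomaly(sm, tx, fixed, county=None):
    """Evaluate Eq. 1 for one county-year.

    sm     : soil moisture anomaly (%)
    tx     : maximum temperature anomaly (deg C)
    fixed  : regional coefficients [b0, b1, b2, b3, b4, b5]
    county : county-specific deviations in the same order (defaults to zeros,
             which gives the region-wide response)
    """
    if county is None:
        county = [0.0] * 6
    b = [f + r for f, r in zip(fixed, county)]
    return (b[0]
            + b[1] * sm + b[2] * sm ** 2      # soil moisture terms
            + b[3] * tx + b[4] * tx ** 2      # temperature terms
            + b[5] * tx * sm)                 # interaction term


# Hypothetical coefficients, for illustration only.
example_fixed = [0.0, 0.012, -0.0002, -0.08, -0.01, 0.004]
example_county = [0.0, 0.002, 0.0, 0.01, 0.0, 0.0]

# A summer that is 10% drier and 1 deg C warmer than average.
example_regional = predict_yield_anomaly(-10.0, 1.0, example_fixed)
example_local = predict_yield_anomaly(-10.0, 1.0, example_fixed, example_county)
print(round(example_regional, 3), round(example_local, 3))
```

With a positive interaction coefficient, the product of a positive temperature anomaly and a negative soil moisture anomaly is negative, so hot and dry conditions together reduce the predicted yield beyond the two separate effects.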

Attributing the crop production impacts of 2012 using climate storylines

The storyline time series are averaged over summer periods and spatial units of interest. To stay as close as possible to the event’s observed conditions, we use original CRU and GLEAM based temperature and soil moisture values as 2012 reference conditions and apply additive delta changes on those anomaly levels based on our storyline outputs38. Delta changes from pre-industrial to present conditions are calculated by considering all possible combinations of three ensemble members for both pre-industrial and present-day storylines in 2012, resulting in nine combinations. A similar approach is employed to calculate delta changes from the present to the plus 2 °C storyline (Supplementary Fig. 4). We then use statistical model estimated coefficients to project changes in yield anomalies resulting from changes in 2012 weather conditions due to climate change. We use individual model coefficients relating yield sensitivity to temperature, soil moisture and their interaction to estimate the contribution of each of these components to yield change. County-level yield anomalies (t/ha) are aggregated to region-wide production anomalies (tons) by multiplying local estimated yield anomalies with county-level 2012 harvested area size (ha). The 5–95% confidence intervals for production estimates are calculated based on bootstrap resampling with 1000 draws. Finally, we illustrate varying yield sensitivities to temperature and soil moisture per storyline by calculating the marginal effects of temperature and soil moisture on yield, measured specifically for respective soil moisture and temperature storyline regional average values.
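
The additive delta-change step with all ensemble-member combinations can be sketched as follows. This is a minimal illustration under stated assumptions: the helper names, the sign convention (storyline member minus present-day member, added to the observed anomaly) and the temperature values are hypothetical and chosen only to show how three members per storyline yield nine deltas.

```python
from itertools import product


def delta_changes(present_members, storyline_members):
    """All pairwise differences (storyline minus present-day) between members."""
    return [s - p for p, s in product(present_members, storyline_members)]


def apply_deltas(observed_anomaly, deltas):
    """Shift the observed (factual) anomaly by each delta."""
    return [observed_anomaly + d for d in deltas]


# Hypothetical regional temperature anomalies (deg C) for 2012.
example_present = [1.0, 1.2, 1.1]         # present-day storyline members
example_preindustrial = [0.2, 0.3, 0.1]   # pre-industrial storyline members
example_observed = 2.0                    # observed factual anomaly

example_deltas = delta_changes(example_present, example_preindustrial)
example_counterfactual = apply_deltas(example_observed, example_deltas)

mean_cf = sum(example_counterfactual) / len(example_counterfactual)
print(len(example_deltas), round(mean_cf, 2))
```

Anchoring the deltas to the observed anomaly rather than using the simulated values directly is what keeps model bias in absolute magnitudes out of the impact estimates, while the spread across the nine combinations carries the ensemble uncertainty forward.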