Background & Summary

“To understand our world’s changing climate, it is imperative that we understand how climates of the past varied. Paleoclimatic data are the language we use to look into the past to understand ourselves and ultimately our future.” — Dowsett, 20201

Climate change, referring to long-term shifts in temperatures and weather patterns, is one of the pressing topics of our time. In recent years, it has received increasing attention from both the public and scientists. Climate change, however, is not a recent phenomenon. Advances in the field of paleoclimatology, the study of ancient climates prior to the widespread availability of instrumental records, allow us to increasingly understand what the Earth’s past climate was like. One of the main themes that emerges from these studies is that climate change not only had a temporal but also a spatial dimension.

The link between temperature and precipitation fluctuations and agricultural productivity is widely recognized in modern agriculture2. Crops rely on suitable growing conditions, including optimal temperatures and adequate soil moisture and nutrients, to successfully complete their life cycles3. Consequently, alterations in temperature and precipitation patterns directly impact the suitability of land for agricultural use4. Before the industrial era, when irrigation and chemical fertilizers were not yet prevalent, these climatic influences were even more significant. This dependency was reinforced at a societal level by the fact that over half of the population was directly involved in agriculture during pre-industrial times5,6. As a result, fluctuations in climate often had profound effects on both the economic system and human well-being by directly impacting agricultural productivity7,8,9.

These effects do not occur uniformly across space. Instead, climate change also has the potential to change the geography of crop suitability4,10,11,12. The climatic, soil and topographic requirements may vary over a wide range of different geographic areas, and the climate experienced at any one location is unique to that point: at no other location will an identical climate occur. Historical examples of heterogeneous shifts in agricultural suitability are well documented as early as the 14th century13. The famine period from 1315-1322 in Northern Europe was driven by diminished crop yields due to wet and cold summers. After a shift of the Atlantic westerlies, rainfall increased in southern Europe, leading to harvest failures in the 1330s and 1340s without affecting the north in a similar fashion. The disappearance of viticulture, which occurred only in the north of Europe, is another example14.

Given that climate can vary significantly over short distances, it is crucial for researchers working with disaggregated historical data to account for the spatial nuances and significance of micro-climates15. Despite the spatial and temporal nature of climatic change, until now, quantitative historical studies have primarily relied on static indices of agricultural suitability - mainly due to the lack of indices that vary over time.

In order to overcome this limitation, we combine recent advances in paleoclimatology and construct the first time-varying index for agricultural suitability in Europe from 1500 to 2000 at a yearly level. We use a simple surface energy and water balance model as proposed by Ramankutty et al.3. Specifically, we rely on spatio-temporal data for temperature by Luterbacher et al.16 and Xoplaki et al.17, and precipitation by Pauling et al.18. Previous research has demonstrated that a combination of growing season length and moisture availability to crops effectively encapsulates pertinent characteristics to define the cold and dry boundaries of agricultural land19. The remaining components of the index emphasize the vital significance of soil potential hydrogen (pH) and carbon content in delineating agricultural boundaries, which are fundamental in determining nutrient availability for crops and maintaining soil functionality3,20,21. The temporal and spatial extent, as well as the resolution at the 0.5° × 0.5° level, are imposed by these paleoclimatic datasets.

This novel index is able to capture not just long-term trends, such as the so-called Little Ice Age (LIA), but also short-term climatic shocks. Specifically, we illustrate that the index captures negative shocks on agriculture induced by both precipitation and temperature, highlighting the importance of looking at the joint role of these features, their interaction with local soil conditions and their non-linear impacts. For example, temperatures that are both too high and too low are detrimental to crops. The same holds for precipitation as drought-like conditions are not conducive to plant growth, whereas too much rainfall during summer and autumn constituted the greatest hazard to crops in pre-industrial Europe22,23. In addition, there are potential interactions, such as, for example, the combination of cold weather and droughts, which seem to have created the most unfavorable conditions for crop cultivation in Mediterranean Europe24. Their effects are thus non-linear and interact with each other, suggesting that some functional form is needed25.

Importantly, in the construction of the index, we deliberately abstain from incorporating any information on actual historical land use and only include features that are exogenous to human activity. This is in contrast to an interesting literature with a different goal which tries to estimate and reconstruct historical cropland cover26,27,28,29,30,31. Such an approach necessarily needs to make assumptions about the level and spatial distribution of population over time - which is something that is viewed as an endogenous outcome in the types of historical studies for which our index will be useful. Irrespective of that, population estimates before the 19th century, let alone their spatial distribution, are not well-known and often based on educated guesses and assumptions32.

Furthermore, we deliberately do not aim to estimate the potential agricultural output of a particular piece of land, which varied across time and space based on a multitude of institutional and cultural factors, such as land-tenure systems, the type of crop rotation, the availability of new-world crops, the availability of horses as draft animals, and episodes of war or conflict. It is thus desirable to separate out land use and potential production and only focus on factors that are exogenously imposed by nature. Our index, therefore, aims to define the likelihood of an area being suitable for cultivation, irrespective of whether it was cultivated or not. This leaves ample room for the expert researcher of any study using our index to determine how much of the output is actually caused by the human-environment interaction and, therefore, makes it easier to alleviate concerns about environmental determinism7.

By capturing the evolution of agricultural suitability over time and space, our dataset will empower researchers to explore the interplay between climatic fluctuations and Europe’s agricultural landscape, analyze human responses at a local and regional scale, and foster a deeper understanding of the region’s historical dynamics. It will be particularly relevant for research in the area of ‘history of climate and society’ (HCS), where recent emphasis has been placed on the need to account for local and spatio-temporal heterogeneity of past climate changes33. In studies that just wish to account for (exogenous) changes in agricultural suitability at a fine-grained level, our dataset will be a useful time-varying control variable, for example, in regression analysis.

Scholars across many disciplines have long traced the cascading effects of climate and geography in shaping societies34,35. Climate and land suitability for agriculture have historically influenced crop productivity and the essential role of agriculture in providing nutrients, calories, proteins, and vitamins. These factors not only fueled labor but also shaped the organizational frameworks of pre-industrial societies, setting the stage for modern energy transitions36. This perspective illuminates the profound influence of land’s capacity to sustain crops, whether in terms of overall suitability or specific crop requirements, in elucidating persistent developmental disparities, such as state formation37, distribution of economic activity38, structural transformation39, growth40, as well as time preferences41, cultural dynamics42, and cooperation43.

Therefore, understanding changes in agricultural suitability and its implications is crucial for assessing the role of human activity as a driver of global environmental change44. In this context, our study naturally attaches to studies analyzing the spatial and temporal dynamics of agricultural suitability in current times, emphasizing its critical importance for food security and its integration into climate change research45,46,47,48.

Understanding the past human-climate interrelationship is essential for decision-makers and institutions seeking to comprehend a range of issues and take effective actions in the future. Climate variability over the last millennium provides crucial context for assessing future changes, particularly as anthropogenic effects become increasingly dominant49. Our index will aid researchers in addressing important questions regarding the impact of changes in agricultural suitability induced by climate change on people’s lives and the economy in the past. This insight can inform strategies to tackle contemporary challenges, especially in low- and middle-income countries that are often highly vulnerable to climatic shocks.

Several different time-varying indices of agricultural suitability, covering periods starting from the 1960s, have been proposed4,44,50,51. Since all of them require a rich set of input variables with spatio-temporal variation, which are essentially only available since the advent of satellite data, none of these can be extended backward to cover longer periods of time. To overcome this, researchers working with historical data typically use either a static index3,41 or, e.g., choose one cross-section of the GAEZ index52 to proxy for suitability in pre-industrial times. Alternatively, if researchers are willing to disregard geographic variability, they could just use readily available paleoclimatic time series data measured for the entire continent or parts of it. Relying only on variation in temperature or rainfall in isolation in order to capture agricultural suitability, however, is not recommended due to interactive and non-linear effects, as pointed out above. Our paper is the first one to offer an index with temporal as well as geographic variation in agricultural suitability covering the European landscape from 1500 to 2000. Although the temporal and spatial coverage of our index is limited, we hope to have illustrated the immense usefulness of the geographical component of paleoclimatic data for social science research and to spur further research in this area. Such advances would allow our approach to be extended to areas outside of Europe and periods farther back in time.

The most comparable paper to our work is the spatially explicit Old-World Drought Atlas (OWDA)53, which is a 0.5° gridded tree-ring based reconstruction of soil moisture spanning the entire common era. OWDA estimates a boreal summer self-calibrating Palmer’s Drought Severity Index (scPDSI), therefore focusing on severe drought and wetness. These precipitation-related events are usually the ones that trigger famine54, and the OWDA is thus able to capture extreme outlier events well. Especially in the North of Europe, however, crop cultivation is often found to be much more temperature sensitive23. Droughts are also typically much more spatially restricted than temperature anomalies55. Our index takes this into account and also captures more nuanced suitability fluctuations that naturally occurred due to changes in both temperature and precipitation.

The article is organized as follows: The Methods section provides an in-depth exploration of our approach, offering a schematic overview of the methodology (Fig. 1), and detailing the various building blocks that constitute our index. In the Data records section, we describe the dataset along with instructions for accessing and downloading it. Finally, we offer an overview of our index and conduct a technical validation with relevant benchmark indices.

Fig. 1
Study design. Historical temperature data are taken from Luterbacher et al.16 and Xoplaki et al.17. Historical precipitation data are from Pauling et al.18. Windspeed, humidity, elevation, and sunshine hours are taken from the Climate Research Unit (CRU) CRU v.2.0 dataset58. Soil characteristics such as carbon content and soil pH are from the Harmonized World Soil Database59.

Methods

Agricultural suitability is influenced by natural constraints, including local climate, soil characteristics, and topography, which collectively determine the availability of energy, water, and nutrients necessary for crop cultivation. Changes in temperature and precipitation patterns directly impact the suitability of land for agricultural use4. However, the specific impact of these climatic factors varies widely based on crop type, geographical location, production practices, and technological advancements56.

Our agricultural suitability index follows the approach by Ramankutty et al.3 and consists of four main building blocks: a measure for cumulative temperature exposure (growing degree days), a measure for soil moisture (aridity index), the carbon content of the soil, and its potential hydrogen value. Land suitability is then defined as the predicted value of the propensity of a given piece of land to be suitable for cultivation.

By choosing a simple surface energy and water balance model, we illustrate how existing spatio-temporal paleoclimatic data can be used to create a spatially explicit suitability index that reaches several centuries into the past. Importantly, we are relying on temporal variation only from rainfall and temperature, sources that are exogenous to human activity in Europe - at least in our historical context. Since one of the primary uses of this index is to help scholars study how climate change affected human behavior in the past, it would be far from ideal to incorporate human behavior itself into the index. It would, in fact, induce a form of circular reasoning that would make it difficult to distinguish cause from effect.

This section describes in detail how each of these blocks was constructed and how we combine them into an index at a yearly level. The time frame and the spatial extent of our index were dictated by the spatio-temporal historical data on temperature and precipitation, which are the main inputs.

Data overview

The starting point of our data construction is a uniform grid with dimensions of 0.5° × 0.5°, bounded by the following geographical coordinates: 25° W to 40° E and 35° N to 70° N. The core historical climate datasets under investigation include temperature data16,17 and precipitation data18. These datasets provide reconstructed seasonal temperature values (in degrees Celsius) and precipitation values (in millimeters) for four seasons (Autumn, Winter, Spring, and Summer) at a 0.5° × 0.5° grid resolution (approximately 55 km by 55 km measured at the equator). Our constructed initial grid aligns with these data sources.
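As an illustration of the grid described above, the following minimal sketch enumerates 0.5° cells inside the stated bounding box. The function name `cell_centres` and the choice to index cells by their centre coordinates are assumptions made for this example; the text does not state which convention the source datasets use.

```python
# Minimal sketch of a regular 0.5-degree grid over the study area.
# Assumption: cells are identified by their centre coordinates.

def cell_centres(start, stop, res=0.5):
    """Return the centre coordinates of all cells between start and stop."""
    n_cells = int(round((stop - start) / res))
    return [start + res * (k + 0.5) for k in range(n_cells)]

# 25 W to 40 E (negative values denote west), 35 N to 70 N
lons = cell_centres(-25.0, 40.0)
lats = cell_centres(35.0, 70.0)

# Every (lat, lon) pair is one grid cell, before any land/sea masking.
grid = [(lat, lon) for lat in lats for lon in lons]
```

Under these assumptions the bounding box contains 130 columns and 70 rows; cells over open sea would still need to be masked out with the coastline polygon.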

To construct a comprehensive crop suitability index, we have adopted the methodology introduced by Ramankutty et al.3 as our foundational framework. This index integrates multiple critical factors, including growing degree days, soil moisture, soil pH, and carbon content, as primary determinants of soil suitability. Due to the lack of temporal data on soil parameters, we assume that soil pH and carbon content remain constant over time, attributing temporal variability exclusively to growing degree days and the soil moisture index. Numerous inputs were indispensable for the computation of various parameters, extending beyond just temperature and precipitation data. We incorporated additional environmental variables such as potential sunshine hours, average wind speed, relative humidity, and elevation. A comprehensive description of all the data utilized in this study is provided below:

Historical temperature

In this study, we employ the European Seasonal Temperature Dataset, constructed by Luterbacher et al.16 and Xoplaki et al.17. This dataset represents a high-resolution grid (0.5° × 0.5°) capturing seasonal temperature patterns across European land areas spanning from 25° W to 40° E and 35° N to 70° N from the year 1500 to 2000. The dataset draws upon various sources, incorporating homogenized and quality-assured instrumental data series, historical records documenting sea-ice and temperature indices from past centuries, and seasonally resolved proxy temperature reconstructions sourced from Greenland ice cores and tree rings originating from Scandinavia and Siberia16,17. The dataset segments the year into four distinct seasons, each spanning three months: winter (December, January, and February), spring (March, April, and May), summer (June, July, and August), and autumn (September, October, and November). To match the resolution and grid structure, the data has been resampled by extracting the seasonal mean value for every cell to our grid covering Europe, specifically ranging from 25° W to 40° E and 35° N to 70° N.

Link to data: Historical temperature (https://crudata.uea.ac.uk/cru/projects/soap/data/recon/#paul05).

Historical precipitation

We exploit the historical precipitation dataset sourced from the work of Pauling et al.18. This gridded dataset (0.5° × 0.5°) encompasses the European landscape extending from 30° W to 40° E and from 30° N to 71° N. It provides a comprehensive view of precipitation patterns from the year 1500 to 1900, also incorporating the gridded reanalysis data spanning the years 1901 to 2000 documented by Mitchell et al.57. The construction of this dataset required the use of advanced statistical techniques, prominently employing Principal Component Regression methods. It combines an extensive array of data sources, including long instrumental precipitation records, precipitation indices rooted in historical documentation, and natural proxies such as tree-ring chronologies, ice cores, corals, and speleothems, all sensitive to precipitation signals18. Similar to the historical temperature dataset, it segments the year into four distinct seasons, each spanning three months: winter (December, January, and February), spring (March, April, and May), summer (June, July, and August), and autumn (September, October, and November). To match the resolution and grid structure, the data has been resampled by extracting the seasonal mean value for every cell to our 0.5° grid covering Europe, specifically ranging from 25° W to 40° E and 35° N to 70° N.

Link to data: Historical precipitation (https://crudata.uea.ac.uk/cru/projects/soap/data/recon/#paul05).

Other surface climate

The remaining surface climate data has been gathered from the CRU v.2.0 dataset58. This dataset is a reconstruction of a 10-minute latitude/longitude data set of mean monthly surface climate over the global land area, excluding Antarctica. In this study, we used four climate elements from this dataset: relative humidity, sunshine duration, wind speed, and elevation. These elements were interpolated from the data set of station means for the period 1961 to 1990. To match the resolution and grid structure, the data has been resampled by extracting the seasonal mean value for every cell to our 0.5° grid covering Europe, specifically ranging from 25° W to 40° E and 35° N to 70° N.

Link to data: CRU V.2.0 (https://crudata.uea.ac.uk).

Soil characteristics

Soil data parameters have been sourced from the Harmonized World Soil Database (HWSD)59, a repository renowned for its comprehensive representation of soil characteristics. The re-gridded HWSD dataset is presented in the form of files at a resolution of 0.05°, encompassing a diverse array of soil attributes. These attributes encapsulate information derived from actual soil profiles, reflecting varying stages of pedogenic evolution, land utilization, historical land use, and past disturbances. Within the HWSD, soil properties are accessible for both surface soil horizons (ranging from 0 to 30 cm) and deeper soil profiles (spanning depths of 30 to 100 cm)59. For our specific investigation, we have carefully chosen two pivotal components essential for assessing crop suitability: topsoil carbon content (T_C, in kg C m−2) and topsoil pH (T_PH_H2O, measured in H2O, −log(H+)). To match the resolution and grid structure, the data has been resampled by extracting the mean value for every cell to our 0.5° grid covering Europe, specifically ranging from 25° W to 40° E and 35° N to 70° N.

Link to data: HWSD (https://daac.ornl.gov).

Actual agricultural suitability

Our model is calibrated to the most recent and precise measure of suitability sourced from the FAO GAEZ v4 data portal. To align with the scope of our study, we constructed the actual measure of agricultural suitability by considering an average of four main types of crops historically prevalent in pre-industrial Europe: wheat, oat, rye, and barley. These crops were assessed over the time period of 1971-2000 under rainfed conditions and low input levels without considering CO2 fertilization effects. The resulting index has been normalized to a scale from 0 to 1 and then extracted to a grid with a resolution of 0.5° covering the European landscape, specifically ranging from 25° W to 40° E and 35° N to 70° N.

Link to data: FAO GAEZ v4 Data Portal (https://gaez.fao.org).

Administrative units

The polygon delineating the coastlines of Europe is taken from Natural Earth. The dataset represents land polygons, including major islands at 1:50m covering the extent of the study: 25° W to 40° E and 35° N to 70° N.

Link to data: Natural Earth, version 4.0.0. (https://www.naturalearthdata.com).

Data processing

We processed the data in two primary stages. First, all geographic datasets were resampled extracting the mean value for each input within our 0.5° grid to ensure uniform spatial resolution and grid structure. Second, specific adjustments were made to certain inputs to meet some functional form requirements, particularly for incorporation into the evapotranspiration calculations.

For instance, potential sunshine hours per day were derived using the percentage of daylight (Sun) from the CRU dataset58, capped at a maximum of 11 hours per day in accordance with the World Meteorological Organization Standard Normals (https://data.un.org). Additionally, wind speed data provided at 10 meters height (U10) in the CRU dataset58 was adjusted to a 2-meter height (U2) using a logarithmic wind profile appropriate for measurements over short grassed surfaces, with the relationship approximated as U2 = 0.75U10, following an established methodology60.

Furthermore, seasonal temperature extremes were calculated by identifying the minimum (\({T}_{\min }\)) and maximum (\({T}_{\max }\)) temperatures for each grid cell across the four seasons. Precipitation (P) in mm/year was computed as the total annual sum across these seasons. For detailed information on the variables used, their sources, the specific processing steps undertaken, and the final outputs, please refer to Table 1.
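The input adjustments described in this subsection can be sketched as follows. This is only an illustrative reading of the text: the function names are hypothetical, and the sketch assumes that the four seasonal precipitation values are seasonal totals in mm, so that their sum gives mm/year.

```python
# Illustrative pre-processing helpers (names are hypothetical).

def wind_at_2m(u10):
    """Approximate wind speed at 2 m from wind speed at 10 m (U2 = 0.75 U10)."""
    return 0.75 * u10

def sunshine_hours(sun_percent, max_hours=11.0):
    """Convert percent of maximum possible sunshine into hours per day,
    using the 11-hour maximum stated in the text."""
    return sun_percent / 100.0 * max_hours

def seasonal_summary(seasonal_temps, seasonal_precip):
    """Return (T_min, T_max, P) for one grid cell and year.

    seasonal_temps: four seasonal mean temperatures in degrees Celsius.
    seasonal_precip: four seasonal precipitation totals in mm (assumption).
    """
    return min(seasonal_temps), max(seasonal_temps), sum(seasonal_precip)

t_min, t_max, p_annual = seasonal_summary(
    [8.0, -1.5, 7.0, 17.0], [200.0, 150.0, 180.0, 220.0]
)
```

The example values are invented for demonstration and do not come from the dataset.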

Table 1 Required inputs: Sun represents the percent of maximum possible sunshine (measured as percent of daylength), multiplied by the maximum hours of sunshine per day (11) using country averages from the World Meteorological Organization Standard Normals.

Building blocks for the suitability index

To construct a comprehensive crop suitability measure, in line with the approach outlined by Ramankutty et al.3 as our foundational framework, we require four key components: growing degree days, soil moisture, soil pH, and carbon content, which collectively serve as the primary determinants of agricultural suitability. In the upcoming sections, we will provide an overview of the four parameters and their significance in evaluating crop development.

Growing Degree Days (GDD)

GDD serves as a vital metric for estimating the growth and developmental progress of plants and insects during the active growing season. It is commonly used in the agronomic literature as a measure for cumulative temperature exposure61. The concept is based on the notion that development occurs only when temperatures surpass a minimum developmental threshold, typically set at 5 °C for most European crop varieties, as reported by the European Environment Agency (EEA, https://www.eea.europa.eu).

Originally, GDD is computed by summing daily temperature values over the course of a year. However, given our dataset’s lack of daily temperature observations, we have tailored the calculation of the GDD to align with our available seasonal data. To achieve this, we have evenly distributed the weight of each season, consisting of approximately 91 days, to construct the GDD measure for each year t from 1500 to 2000. The four seasons represent averages over 3 months as follows: Winter (December, January, February), Spring (March, April, May), Summer (June, July, August), and Autumn (September, October, November). Hence, this measure, contingent on the average seasonal temperature i in year t, is defined as follows:

$$GD{D}_{t}=\mathop{\sum }\limits_{i=1}^{4}\max (0,\,91\times ({T}_{i,t}\,-\,5))\,{\rm{day}}\,{\rm{degrees}}.$$
(1)
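Equation (1) can be sketched in a few lines. The function name and argument names below are hypothetical; the base temperature of 5 °C and the 91-day season length are taken from the text.

```python
# Sketch of Eq. (1): growing degree days from four seasonal mean temperatures.

def growing_degree_days(seasonal_temps, base_temp=5.0, days_per_season=91):
    """Sum, over the four seasons, the degree days above the base temperature.

    seasonal_temps: seasonal mean temperatures (degrees Celsius) for one
    grid cell and one year; the order of the seasons does not matter.
    """
    if len(seasonal_temps) != 4:
        raise ValueError("expected exactly four seasonal temperatures")
    return sum(
        max(0.0, days_per_season * (temp - base_temp))
        for temp in seasonal_temps
    )

# Hypothetical cell: autumn 9, winter 1, spring 8, summer 17 degrees Celsius.
print(growing_degree_days([9.0, 1.0, 8.0, 17.0]))  # 1729.0
```

In this invented example the winter season contributes nothing because its mean temperature lies below the 5 °C threshold, while summer alone contributes 91 × 12 = 1092 day degrees.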

Aridity Index (AI)

We employ the AI as a proxy for soil moisture. The AI is a straightforward and widely accepted measure of aridity, rooted in the assessment of long-term climatic water deficits62. Beyond its role in forecasting drought and flood patterns, such indices are recognized for their utility in gauging moisture availability, crucial for the potential growth of reference crops and various vegetation types63.

Aridity can be expressed as a generalized function of the ratio of precipitation over potential evapotranspiration (PET, or ET0 in our case).

$$AI=\frac{P}{E{T}_{0}}$$
(2)
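A minimal sketch of Eq. (2) follows. It assumes that precipitation and reference evapotranspiration are expressed in the same unit over the same period (for example mm/year); the text does not spell out how ET0 is aggregated, so this is an assumption, as is the function name.

```python
# Sketch of Eq. (2): aridity index as precipitation over reference ET.

def aridity_index(precip, et0):
    """Return P / ET0; both arguments must share unit and period (assumption)."""
    if et0 <= 0:
        raise ValueError("ET0 must be positive")
    return precip / et0

# Hypothetical cell with 600 mm of rain and 800 mm of reference ET per year.
ai = aridity_index(600.0, 800.0)
```

Values below 1 indicate that atmospheric water demand exceeds precipitation, while values above 1 indicate a water surplus.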

Evapotranspiration (ET) constitutes a pivotal phenomenon within the realm of biology, with particular significance in the study of crop water requirements. Among the different measures of potential water loss, reference evapotranspiration (ET0) is considered a more suitable indicator for estimating potential evapotranspiration compared to temperature-based metrics due to its utilization of an energy balance approach64.

ET0 can be computed according to the Penman-Monteith equation, which calculates the rate of ET from a hypothetical reference crop characterized by a standard crop height of 12 cm, a consistent canopy resistance of 70 s m−1, and an albedo of 0.23, closely resembling the rate of ET from an extensive expanse of green grass65,66. To compute ET0, the model needs several input parameters: potential sunshine hours per day SD (hours), wind speed U2 (m s−1), mean daily relative humidity RH (%), latitude L (deg), elevation A (m), minimum temperature Tmin (°C), maximum temperature Tmax (°C) and mean temperature Ta (°C).

To ensure uniformity and relevance, we undertook specific data transformations. Notably, we converted the potential sunshine dataset from CRU, originally expressed as a percentage, into potential sunshine hours by imposing a maximum of 11 hours of daylight, using country averages from the World Meteorological Organization Standard Normals. Furthermore, we adjusted the wind speed data, initially measured at a height of 10 meters, to the standard measurement height of 2 meters. This adjustment was made using a logarithmic wind speed profile for measurements conducted above a short grassed surface60, resulting in an approximate equivalence of U2 = 0.75U10. All necessary inputs are comprehensively described in Table 1.

We provide below a step-by-step derivation of the different parameters necessary to compute ET0 following the method described by the FAO GAEZ framework67.

1) Latent heat of vaporization (λ, in MJ kg−1) represents the amount of energy required to change a unit mass of liquid into vapor at a given mean air temperature (Ta).

$$\lambda =2.501-0.002361{T}_{a}$$
(3)

2) Atmospheric pressure (P, in kPa) is the pressure exerted by the weight of the earth’s atmosphere. It is computed from the elevation above sea level in meters (A). At higher altitudes, where atmospheric pressure is lower, evaporation tends to occur more readily.

$$P=101.3{\left(\frac{293-0.0065A}{293}\right)}^{5.256}$$
(4)

3) Psychrometric constant (γ, in kPa°C−1) serves as a bridge between the partial pressure of water vapor in the air and the air temperature. Given that atmospheric pressure (P) changes with altitude, the psychrometric constant (γ) also varies accordingly: at higher altitudes, water evaporates more readily and boils at lower temperatures due to the decrease in atmospheric pressure. The constant also depends on the latent heat of vaporization (λ).

$$\gamma =0.0016286\left(\frac{P}{\lambda }\right)$$
(5)
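Steps 1 to 3 chain together, since the psychrometric constant needs both the pressure and the latent heat. The sketch below transcribes Eqs. (3) to (5); the function names are hypothetical and the example inputs (sea level, 20 °C) are chosen only for illustration.

```python
# Sketch of Eqs. (3)-(5); function names are illustrative.

def latent_heat(t_mean):
    """Latent heat of vaporization (MJ/kg) at mean air temperature (deg C)."""
    return 2.501 - 0.002361 * t_mean

def atmospheric_pressure(elevation):
    """Atmospheric pressure (kPa) at a given elevation above sea level (m)."""
    return 101.3 * ((293.0 - 0.0065 * elevation) / 293.0) ** 5.256

def psychrometric_constant(pressure, lam):
    """Psychrometric constant (kPa per deg C) from pressure and latent heat."""
    return 0.0016286 * pressure / lam

lam = latent_heat(20.0)
pressure = atmospheric_pressure(0.0)
gamma = psychrometric_constant(pressure, lam)
```

For these inputs the psychrometric constant comes out at roughly 0.067 kPa °C−1, and it decreases with elevation because pressure does.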

4) Aerodynamic resistance (ra) determines the transfer of heat and water vapor from the evaporating surface into the air according to wind speed measurement at 2 m (U2). Assuming a constant crop height of 0.12 m and a standardized height for wind speed at 2 m, temperature, and humidity at 2 m, the aerodynamic resistance ra for the grass reference surface can be approximated as follows:

$${r}_{a}=\frac{208}{{U}_{2}}$$
(6)

5) Crop canopy resistance (rc) relates to the resistance offered by the crop canopy to the transfer of water vapor from the leaves to the atmosphere. It represents the extent to which the crop canopy limits the evapotranspiration process. Setting the average daily stomata resistance of a single leaf (Rl) equal to 100 and the leaf area index of the reference crop (LAI) to 2.88 (ref. 67), rc can be computed according to the following equation:

$${r}_{c}=\frac{{R}_{l}}{0.5{\rm{LAI}}}$$
(7)

6) Modified psychrometric constant (γ*, in kPa°C−1) can be computed using previous steps.

$${\gamma }^{\star }=\gamma \left(1+\frac{{r}_{c}}{{r}_{a}}\right)$$
(8)
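The two resistances and the modified psychrometric constant of Eqs. (6) to (8) can be sketched as below. Function names are hypothetical; the default values Rl = 100 and LAI = 2.88 are those given in the text, and the example wind speed and γ are invented.

```python
# Sketch of Eqs. (6)-(8); function names are illustrative.

def aerodynamic_resistance(u2):
    """Aerodynamic resistance of the grass reference surface; u2 in m/s."""
    if u2 <= 0:
        raise ValueError("wind speed at 2 m must be positive")
    return 208.0 / u2

def canopy_resistance(r_leaf=100.0, lai=2.88):
    """Canopy resistance from leaf stomata resistance and leaf area index."""
    return r_leaf / (0.5 * lai)

def modified_psychrometric_constant(gamma, r_c, r_a):
    """Modified psychrometric constant (kPa per deg C)."""
    return gamma * (1.0 + r_c / r_a)

r_a = aerodynamic_resistance(2.0)   # hypothetical wind speed of 2 m/s
r_c = canopy_resistance()
gamma_star = modified_psychrometric_constant(0.067, r_c, r_a)
```

With the default values, the canopy resistance evaluates to about 69.4, which matches the rounded reference-crop value of 70 used in the Penman-Monteith definition.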

7) Saturation vapor pressure (ea, in kPa) represents the maximum amount of water vapor that air can hold for given minimum and maximum temperatures Tmin and Tmax. At higher temperatures, the saturation vapor pressure increases because warmer air can hold more moisture, while at lower temperatures, the saturation vapor pressure decreases.

$${e}_{ax}=0.6108\exp \left(\frac{17.27{T}_{\max }}{237.3+{T}_{\max }}\right)$$
(9)
$${e}_{an}=0.6108\exp \left(\frac{17.27{T}_{\min }}{237.3+{T}_{\min }}\right)$$
(10)
$${e}_{a}=0.5({e}_{ax}+{e}_{an})$$
(11)
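Equations (9) to (11) apply the same exponential expression to the maximum and minimum temperature and then average the results. A minimal sketch, with hypothetical function names:

```python
import math

# Sketch of Eqs. (9)-(11); function names are illustrative.

def saturation_vapor_pressure(temp):
    """Saturation vapor pressure (kPa) at a given temperature (deg C)."""
    return 0.6108 * math.exp(17.27 * temp / (237.3 + temp))

def mean_saturation_vapor_pressure(t_min, t_max):
    """Average of the saturation vapor pressures at T_max and T_min (kPa)."""
    e_ax = saturation_vapor_pressure(t_max)
    e_an = saturation_vapor_pressure(t_min)
    return 0.5 * (e_ax + e_an)

e_a = mean_saturation_vapor_pressure(5.0, 20.0)  # invented temperatures
```

Because the curve is convex, averaging the two pressures gives a larger value than evaluating the expression once at the mean temperature.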

8) Vapor pressure at dew point (ed, in kPa), i.e. at the temperature at which water vapor begins to condense into water, can be computed using relative humidity (RH, in %) and the saturation vapor pressures eax and ean.

$${e}_{d}=\frac{RH}{100}\times \frac{0.5}{\left(\frac{1}{{e}_{ax}}+\frac{1}{{e}_{an}}\right)}$$
(12)

9) Slope vapor pressure curve (ϑ, in kPa°C−1) gives the relationship between saturation vapor pressure (eax and ean) for given temperatures Tmax and Tmin:

$${\vartheta }_{x}=\frac{4096{e}_{ax}}{{(237.3+{T}_{max})}^{2}}$$
(13)
$${\vartheta }_{n}=\frac{4096{e}_{an}}{{(237.3+{T}_{min})}^{2}}$$
(14)
$$\vartheta ={\vartheta }_{x}+{\vartheta }_{n}$$
(15)
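The sketch below transcribes Eqs. (13) to (15) as printed, including the constant 4096 and the summation of the two slope terms. Function names are hypothetical, and the saturation vapor pressure helper repeats Eqs. (9) and (10) so that the block is self-contained.

```python
import math

# Sketch of Eqs. (13)-(15), transcribed as printed in the text.

def saturation_vapor_pressure(temp):
    """Saturation vapor pressure (kPa) at a given temperature (deg C)."""
    return 0.6108 * math.exp(17.27 * temp / (237.3 + temp))

def slope_component(temp):
    """Slope of the vapor pressure curve (kPa per deg C) at one temperature."""
    return 4096.0 * saturation_vapor_pressure(temp) / (237.3 + temp) ** 2

def slope_vapor_pressure_curve(t_min, t_max):
    """Combined slope term as defined in Eq. (15)."""
    return slope_component(t_max) + slope_component(t_min)

slope = slope_vapor_pressure_curve(5.0, 20.0)  # invented temperatures
```

Each component grows with temperature, so warm cells contribute a steeper vapor pressure curve than cold ones.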

10) Latitude (φ, in rad) is obtained by converting the latitude in degrees (L) to radians.

$$\varphi =\frac{L\pi }{180}$$
(16)

11) Solar declination (δ, in rad) varies throughout the year due to the tilt of the Earth’s axis relative to its orbit around the Sun. It can be approximated using the Spencer formula68.

$$\delta =0.006918-0.399912\cos \tau +0.070257\sin \tau -0.006758\cos 2\tau +0.000907\sin 2\tau -0.002697\cos 3\tau +0.00148\sin 3\tau $$
(17)

Where τ is the day angle (in radians), which depends on the day of the year (J).

$$\tau =\frac{2\pi (J-1)}{365}$$
(18)

12) Relative distance Earth to Sun (d) throughout the year, as influenced by the Earth’s elliptical orbit around the Sun and the tilt of its axis.

$$d=1+0.033\cos \left(\frac{2\pi }{365}J\right)$$
(19)

13) Sunset hour angle (Ψ, in rad) determines the timing of sunset and is computed from the latitude (φ, in rad) of the location and the solar declination (δ, in rad).

$$\Psi =\arccos (-\tan \varphi \tan \delta )$$
(20)

14) Extraterrestrial radiation (Ra, in MJ m−2d−1) refers to the amount of solar radiation received at the top of the Earth’s atmosphere under ideal conditions, assuming there are no atmospheric factors such as clouds, haze, or air pollution to attenuate the solar radiation. The computation of the extraterrestrial radiation is found using the relative distance from Earth to the Sun (d), the sunset hour angle (Ψ, in rad), the latitude (φ, in rad), and the solar declination (δ, in rad) previously computed.

$${R}_{a}=37.586d(\Psi \sin \varphi \sin \delta +\cos \varphi \cos \delta \sin \Psi )$$
(21)

15) Maximum daylight hours (DL) represents the theoretical maximum duration of daylight at a specific location for a given day given the sunset hour angle (Ψ).

$$DL=\frac{24}{\pi }\Psi $$
(22)
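
Steps 10 to 15 can be chained as in the following Python sketch, which is illustrative rather than the authors' code. It uses the Spencer coefficients as published (0.399912 for the cos τ term) with the day angle τ = 2π(J − 1)/365. The function name and the clamping of the arccos argument are assumptions added here so that grid cells with polar day or night do not raise an error.

```python
import math


def solar_geometry(lat_deg: float, day_of_year: int) -> dict:
    """Return declination, Earth-Sun distance, sunset hour angle, Ra and DL."""
    phi = math.radians(lat_deg)                       # Eq. (16)
    tau = 2.0 * math.pi * (day_of_year - 1) / 365.0   # Eq. (18), day angle
    delta = (0.006918
             - 0.399912 * math.cos(tau) + 0.070257 * math.sin(tau)
             - 0.006758 * math.cos(2 * tau) + 0.000907 * math.sin(2 * tau)
             - 0.002697 * math.cos(3 * tau) + 0.00148 * math.sin(3 * tau))  # Eq. (17)
    d = 1.0 + 0.033 * math.cos(2.0 * math.pi * day_of_year / 365.0)         # Eq. (19)
    # Assumption (not in the text): clamp to [-1, 1] for polar day/night.
    x = max(-1.0, min(1.0, -math.tan(phi) * math.tan(delta)))
    psi = math.acos(x)                                # Eq. (20)
    ra = 37.586 * d * (psi * math.sin(phi) * math.sin(delta)
                       + math.cos(phi) * math.cos(delta) * math.sin(psi))   # Eq. (21)
    dl = 24.0 / math.pi * psi                         # Eq. (22)
    return {"delta": delta, "d": d, "psi": psi, "ra": ra, "dl": dl}
```

At 50° N around the June solstice (J = 172), this sketch gives a declination of about 0.41 rad, roughly 16 hours of daylight and an extraterrestrial radiation of about 42 MJ m−2d−1.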

16) Short-wave radiation (Rs, in MJ m−2d−1) gives the solar radiation received at the Earth’s surface given sunshine hours per day (SD), maximum daylight hours (DL) and extraterrestrial radiation (Ra).

$${R}_{s}=\left(0.25+0.5\frac{SD}{DL}\right){R}_{a}$$
(23)

17) Net incoming short-wave radiation (Rns, in MJ m−2d−1) represents the solar radiation absorbed by the Earth’s surface depending on short-wave radiation (Rs), contributing to surface heating and various environmental processes. Assuming an albedo coefficient (α) of 0.23 for a reference crop, the net incoming radiation is defined as follows67.

$${R}_{ns}=0.77{R}_{s}$$
(24)

18) Net outgoing long-wave radiation (Rnl, in MJ m−2d−1) is the energy emitted by the Earth’s surface in the form of long-wave infrared radiation, primarily due to its temperature. Given sunshine hours per day (SD), maximum daylight hours (DL), the vapor pressure at dew point (ed), and maximum and minimum temperature (Tmax and Tmin), the net outgoing long-wave radiation is computed as follows.

$${R}_{nl}=4.903\times 1{0}^{-9}\left(0.1+0.9\frac{SD}{DL}\right)\left(0.34-0.139\sqrt{{e}_{d}}\right)\frac{{(273.16+{T}_{\max })}^{4}+{(273.16+{T}_{\min })}^{4}}{2}$$
(25)

19) Net radiation flux at surface (Rn, in MJ m−2d−1) gives the balance between incoming solar radiation absorbed by the Earth’s surface (Rns) and outgoing long-wave radiation emitted by the surface (Rnl).

$${R}_{n}={R}_{ns}-{R}_{nl}$$
(26)
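
The radiation balance of Eqs. (23) to (26) can be sketched as follows. This is a minimal illustration with a hypothetical function name, assuming Ra in MJ m−2d−1, SD and DL in hours, ed in kPa and temperatures in °C.

```python
import math


def net_radiation(ra: float, sd: float, dl: float, e_d: float,
                  t_max: float, t_min: float) -> dict:
    """Short-wave, net short-wave, net long-wave and net radiation."""
    sun_frac = sd / dl                                 # relative sunshine duration
    rs = (0.25 + 0.5 * sun_frac) * ra                  # Eq. (23)
    rns = 0.77 * rs                                    # Eq. (24), albedo 0.23
    mean_t4 = ((273.16 + t_max) ** 4 + (273.16 + t_min) ** 4) / 2.0
    rnl = (4.903e-9 * (0.1 + 0.9 * sun_frac)
           * (0.34 - 0.139 * math.sqrt(e_d)) * mean_t4)  # Eq. (25)
    rn = rns - rnl                                     # Eq. (26)
    return {"rs": rs, "rns": rns, "rnl": rnl, "rn": rn}
```

With the illustrative inputs Ra = 40, SD = DL = 12, ed = 1.0, Tmax = 25 and Tmin = 15, the sketch returns Rs = 30, Rns = 23.1, Rnl ≈ 7.3 and Rn ≈ 15.8 MJ m−2d−1.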

Wrapping up the inputs we have computed in the previous steps, we can define an aerodynamic (ETar) and radiation (ETra) term according to the Penman-Monteith combination equation51. While the computation of soil heat flux (G) holds significance at the monthly level, its relevance diminishes considerably on a larger spatial scale, raising the potential for inaccuracies in its estimation63. Therefore, in our study, we opt to omit the computation of the soil heat flux (G) altogether, effectively setting it to zero (G = 0).

$$E{T}_{ar}=\frac{\gamma }{\vartheta \,+\,{\gamma }^{\star }}\cdot \frac{900}{{T}_{a}\,+\,273}\cdot {U}_{2}\cdot ({e}_{a}\,-\,{e}_{d})$$
(27)
$$E{T}_{ra}=\frac{\vartheta }{\vartheta \,+\,{\gamma }^{\star }}\cdot ({R}_{n}\,-\,G)\cdot \frac{1}{\lambda }$$
(28)

The output is a 0.5° × 0.5° set of grids indicating the yearly measure of ET0 in millimeters per day from 1500 to 2000.

20) Reference evapotranspiration (ET0, in mm per day).

$$E{T}_{0}=E{T}_{ar}+E{T}_{ra}$$
(29)
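
Eqs. (27) to (29) combine into a single function, as in the Python sketch below. The function and argument names are hypothetical; γ, γ*, ϑ, Ta, U2, ea, ed, Rn and λ are taken as given from the preceding steps, and G defaults to zero as stated in the text.

```python
def reference_evapotranspiration(gamma: float, gamma_star: float, slope: float,
                                 t_a: float, u2: float, e_a: float, e_d: float,
                                 r_n: float, lam: float, g: float = 0.0) -> float:
    """Reference evapotranspiration ET0 (mm per day)."""
    # Eq. (27): aerodynamic term
    et_ar = gamma / (slope + gamma_star) * 900.0 / (t_a + 273.0) * u2 * (e_a - e_d)
    # Eq. (28): radiation term, with soil heat flux G = 0 by default
    et_ra = slope / (slope + gamma_star) * (r_n - g) / lam
    # Eq. (29): sum of both terms
    return et_ar + et_ra
```

For example, the purely illustrative inputs γ = 0.066, γ* = 0.1, ϑ = 0.2, Ta = 27, U2 = 2, ea = 2.0, ed = 1.0, Rn = 12.25 and λ = 2.45 give an aerodynamic term of 1.32 and a radiation term of about 3.33, hence ET0 ≈ 4.65 mm per day.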

Soil pH and carbon content

In line with the methodology outlined by Ramankutty et al.3, we designate soil pH and carbon content within the topsoil, where the majority of crop roots reside, as pivotal factors in determining soil suitability for agricultural purposes.

Primarily, soil pH, a fundamental parameter, profoundly influences soil fertility and nutrient availability, thereby directly impacting crop productivity. Lower pH levels, indicative of soil acidity, can impede the accessibility of essential plant nutrients while facilitating the uptake of detrimental elements such as aluminum, manganese, and hydrogen, along with the leaching of vital nutrients like calcium, magnesium, sodium, and potassium69. Conversely, elevated soil pH, characteristic of alkaline soils, hampers crop growth and yield by altering nutrient availability and disrupting natural microbial communities. The presence of sodium bicarbonate (NaHCO3) and sodium carbonate (Na2CO3) exacerbates these challenges, inducing nutritional stress and iron deficiency in plants and ultimately diminishing crop yields21.

Secondly, soil organic carbon assumes a pivotal role in determining soil functionality and ecological services. Its presence significantly enhances various physical and chemical properties of soil, including infiltration capacity, water-holding ability, and nutrient availability for crops20. Low levels of organic carbon, however, may indicate nutrient deficiencies, rendering the soil unsuitable for cultivation. Conversely, excessively high organic matter content can exacerbate water retention, leading to waterlogging conditions in poorly drained soils, thereby rendering them unsuitable for optimal crop growth3.

To assess topsoil suitability conditions, we utilized data from the Harmonized World Soil Database59, which provides measurements of pH (in H2O −log(H+)) and carbon content (in kg C m-2) at depths ranging from 0 to 30 cm.

Computation of agricultural suitability measure

In the following, we show how the building blocks are combined in order to obtain a yearly index.

Model calibration

In line with the methodology outlined by Ramankutty et al.3, the determination of our agricultural suitability index relies on functions denoted as f(GDD), f(AI), f(C), and f(pH), each detailed below. These functions are established by fitting curves to capture the relationship between actual agricultural suitability and the four parameters: GDD, AI, C, and pH. Essentially, each functional form represents the probability of suitability as a function of these input parameters.

To avoid dependence on historical endogenous cropland cover maps, we opted to construct the actual agricultural suitability by averaging the suitability indices of major crops historically present in the region. In pre-industrial Europe, wheat, rye, barley, and oat were the predominant crops, with rice, maize, and potatoes becoming significant only in later centuries in certain regions36.

To align with our most recent observations, we collected the suitability indices of wheat, rye, barley, and oat from the FAO GAEZ data portal and extracted them to our 0.5° × 0.5° grid covering the European landscape for the period 1971-2000. To simulate historical conditions accurately, these measures were constructed under rain-fed conditions, low input levels, and without CO2 fertilization. Subsequently, the suitability indices of different crops were averaged and scaled over the grid to yield a normalized index of actual agricultural suitability ranging from 0 to 1.

For the GDD and AI, we employed simple sigmoidal curves, terminating the curve fitting when reaching the first peak. This approach aligns with prior research indicating that a combination of growing season length and moisture availability effectively delineates the cold and dry boundaries of agricultural land19. While GDD indicates sufficient warmth for cultivation, AI accounts for plant-available moisture. The functional forms are calibrated as follows:

$$f(GDD)=\frac{1}{1+{e}^{a(b-GD{D}_{2000})}}$$
(30)

Where GDD2000 represents the average GDD measure over the period 1971-2000 to match the FAO suitability measure. The fitting yields a = 0.0079 and b = 982.75.

$$f(AI)=\frac{1}{1+{e}^{a(b-A{I}_{2000})}}$$
(31)

Where AI2000 denotes the average AI measure over the period 1971-2000 to align with the FAO suitability measure. The fitting gives a = 22.95 and b = 0.237.
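
The two fitted curves in Eqs. (30) and (31) share one functional form, as the minimal Python sketch below shows. The function names are hypothetical and the parameter values are those reported above.

```python
import math


def sigmoid(x: float, a: float, b: float) -> float:
    """Common form of Eqs. (30)-(31): 1 / (1 + exp(a * (b - x)))."""
    return 1.0 / (1.0 + math.exp(a * (b - x)))


def f_gdd(gdd: float) -> float:
    """Eq. (30) with the fitted a = 0.0079 and b = 982.75."""
    return sigmoid(gdd, 0.0079, 982.75)


def f_ai(ai: float) -> float:
    """Eq. (31) with the fitted a = 22.95 and b = 0.237."""
    return sigmoid(ai, 22.95, 0.237)
```

In this form, b is the input value at which the function equals 0.5 and a controls the steepness of the transition: `f_gdd(982.75)` and `f_ai(0.237)` both return 0.5, while `f_gdd(2000)` is above 0.99.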

The choice of a double sigmoidal curve to model the relationship between carbon content (C) and agricultural suitability is underpinned by key agricultural and environmental considerations. Soil carbon content, serving as a measure of the total organic content within the soil, plays a pivotal role in determining nutrient availability for crop growth. A low soil carbon content implies limited nutrient availability, thereby restricting optimal plant development and agricultural productivity. Conversely, excessively high soil carbon content levels often indicate waterlogging, particularly prevalent in wetland areas. In such instances, soil drainage becomes imperative for cultivation, necessitating significant investment in land preparation3. Hence, the functional form is calibrated as follows:

$$f(C)=\frac{a}{1+{e}^{b(c-C)}}\times \frac{a}{1+{e}^{d(e-C)}}$$
(32)

The fitting gives a = 0.99, b = 1.7, c = 2.345, d = −0.458 and e = 14.347.
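
A minimal Python sketch of Eq. (32) with the fitted parameters is given below; the function name and the renamed parameter variables (`c0`, `e0`) are illustrative choices, not part of the original method description.

```python
import math


def f_carbon(c_soil: float, a: float = 0.99, b: float = 1.7, c0: float = 2.345,
             d: float = -0.458, e0: float = 14.347) -> float:
    """Eq. (32): product of a rising and a falling sigmoid in soil carbon (kg C m-2)."""
    rising = a / (1.0 + math.exp(b * (c0 - c_soil)))    # penalizes low carbon
    falling = a / (1.0 + math.exp(d * (e0 - c_soil)))   # penalizes very high carbon
    return rising * falling
```

Because d is negative, the second factor decreases with carbon content, so the product is low for both nutrient-poor and very carbon-rich (typically waterlogged) soils and close to 1 in between; for example, `f_carbon(8.0)` is about 0.93.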

The representation of soil pH (pH) involves a composite of fitted lines, each delineating distinct pH ranges and their respective effects on soil suitability for cultivation. Soil pH levels significantly impact soil fertility and nutrient availability, thus influencing agricultural productivity. Extremely low pH levels characterize overly acidic soils, while excessively high pH levels denote soil alkalinity, both of which are unsuitable for cultivation. As soil pH gradually increases from toxic conditions towards the optimal range, nutrient availability improves, consequently enhancing the probability of successful cultivation until reaching an optimal plateau, as depicted by the first fitted line. Soil with pH values between roughly 6.5 and 7.1 represents an ideal range for cultivation, as indicated by the second, flat line. However, as soil pH continues to rise beyond the optimal range, nutrient availability diminishes under alkaline conditions, leading to decreased suitability for cultivation, as depicted by the third fitted line3. Specifically, the function f(pH) is defined as follows:

$$f(pH)=\max \left(0,\left\{\begin{array}{l}-1.12\,+\,0.325pH\quad {\rm{if}}\quad pH < 6.53\\ 1\quad {\rm{if}}\quad 6.53\le pH\le 7.09\\ 5.194\,-\,0.591pH\quad {\rm{if}}\quad pH > 7.09\end{array}\right.\right)$$
(33)
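
The piecewise definition in Eq. (33) translates directly into code. The following Python sketch uses a hypothetical function name and the coefficients given above.

```python
def f_ph(ph: float) -> float:
    """Eq. (33): piecewise-linear pH suitability, floored at zero."""
    if ph < 6.53:
        value = -1.12 + 0.325 * ph    # rising line for acidic soils
    elif ph <= 7.09:
        value = 1.0                   # optimal plateau
    else:
        value = 5.194 - 0.591 * ph    # falling line for alkaline soils
    return max(0.0, value)
```

For instance, `f_ph(7.0)` returns 1.0, `f_ph(5.0)` returns about 0.505, and `f_ph(3.0)` and `f_ph(9.0)` both return 0 because the linear segments are negative there.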

Figure 2 gives a visual representation of the different fitting curves. It is important to note that, in contrast to the study by Ramankutty et al.3, which employs a soil moisture index derived from the ratio of actual evapotranspiration (ETa) to potential evapotranspiration (PET), we have opted to utilize the AI as a proxy for soil moisture. This decision allows us to directly incorporate precipitation as a crucial factor in assessing agricultural land suitability and overcomes the limitation on historical temporal data for inputs necessary to compute the rate of ETa.

Fig. 2

Calibration of the model using the FAO suitability index, growing degree days (GDD), the aridity index (AI), soil potential of hydrogen, pH (in H2O −log(H+)) and carbon content, C (kg C/m2). Each point on the x-axis represents observations averaged over bins, and values on the y-axis correspond to the best suitability value reached by each bin. For each calibration, the selected points satisfy the following conditions on the other three variables: 4 < Csoil < 10, AI > 0.5, GDD > 1300, 6 < pHsoil < 8. The lines represent the different fitting curves: f(GDD), f(AI), f(C), f(pH).

Combining the building blocks

By incorporating time-varying parameters denoted as AIt and GDDt with soil suitability conditions C and pH, our agricultural suitability metric is constructed on an annual basis, combining the four foundational components described previously:

$${{\rm{Suit}}}_{t}=f({AI}_{t})\,\times \,f(GD{D}_{t})\,\times \,f(C)\,\times \,f(pH)$$
(34)

As each of the four functions produces values between 0 and 1, their product always lies between 0 and 1 by construction. In practical terms, this computation yields a spatial grid at a resolution of 0.5° × 0.5°, defined as the yearly probability for a given piece of land to be suitable for cultivation based on four distinct climatic and soil factors. It deliberately does not account for the potential agricultural output of a particular piece of land, which varied across time and space based on a multitude of factors, for example institutional factors such as land-tenure systems, the type of crop rotation, the availability of new-world crops (which only started to diffuse later), the availability of horses as draft animals, or agricultural technology. In short, we try to abstract, as far as possible, from everything endogenous to human activity - which, we hope, makes the index appealing for many research applications.
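
For a single grid cell, Eq. (34) amounts to multiplying two time-varying factors by two static soil factors. The Python sketch below illustrates this; the function name and the dictionary-based layout are assumptions made for the example, and the four inputs are the already evaluated values of f(AIt), f(GDDt), f(C) and f(pH).

```python
def yearly_suitability(f_ai_by_year: dict, f_gdd_by_year: dict,
                       f_c: float, f_ph: float) -> dict:
    """Eq. (34) for one grid cell: climate factors vary by year, soil factors do not."""
    return {
        year: f_ai_by_year[year] * f_gdd_by_year[year] * f_c * f_ph
        for year in f_ai_by_year
    }
```

With illustrative values f(AI) = 0.8, f(GDD) = 0.5, f(C) = 0.9 and f(pH) = 1.0, the suitability for that year is 0.36. Because the factors are multiplied, a single factor close to zero is enough to render the cell unsuitable.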

Figure 3 presents the time series of the agricultural suitability index, averaged across Europe, accompanied by a 25-year moving average. The first time series underscores the notable impact of temperature and precipitation fluctuations on the suitability index. The moving average, on the other hand, elucidates overarching trends more distinctly. It delineates the peak of the Little Ice Age in the 16th and 17th centuries, succeeded by nearly a century of increase. Notably, around the time of the French Revolution, the index embarks on a century-long decline. From 1900 onward, a sustained ascent becomes evident, driven by the well-documented rise in average temperatures expounded upon in the contemporary climate change literature.

Fig. 3

The agricultural suitability index for Europe visualized over time using yearly variation (upper panel) and a 25-year moving average (lower panel).

Figures 4 and 5 visually depict the spatial variability of different climate indices and the suitability index for selected years marked by “extreme events”. In the year 1669 AD (Fig. 4), Europe experienced one of the lowest mean levels of precipitation, totaling 622.62 millimeters for the year, whereas 1775 AD (Fig. 5) represented the most suitable year with a mean suitability index of 0.55 across Europe. Figures 6 and 7 illustrate the temperature and precipitation time series respectively, where these events are discernible. In 1695 AD, the lowest temperature level over Europe was estimated with a daily mean of 5.37 °C. In 1720 AD, the highest level of precipitation was observed, with an average of 808.65 millimeters over Europe. The lowest amount of yearly precipitation occurred in 1686 AD with an average of 620.8 millimeters. Regarding agricultural suitability, 1902 AD represented the least suitable year, with a mean suitability index of 0.434 across the continent. Finally, the highest temperature in Europe was estimated for 1989 AD, with a daily mean of 8.67 °C. While these snapshots represent only a few among numerous maps, the diverse nature of these shocks underscores the significance of their spatial dimension and their interaction with other climate and soil variables.

Fig. 4

Case study 1 (1669 AD). The left panels show the mean precipitation18, mean temperature16,17, and our measure of agricultural suitability for the year 1669, where we register one of the lowest levels of precipitation across Europe (662.62 mm over the year). Z-score values (right panels) for each grid i have been computed using the standard formula: Z-scorex,i,t = (xi,t − μx,i)/σx,i where μx,i and σx,i are the mean and standard deviation of variable x in grid i over the 1500-2000 AD period, respectively. xi,t is either yearly mean temperature, precipitation, or mean agricultural suitability for grid i at time t.

Fig. 5

Case study 2 (1775 AD). The left panels show the mean precipitation18, mean temperature16,17, and our measure of agricultural suitability for the year 1775, where we register the highest overall mean of agricultural suitability across Europe (0.55). Z-score values (right panels) for each grid i have been computed using the standard formula: Z-scorex,i,t = (xi,t − μx,i)/σx,i where μx,i and σx,i are the mean and standard deviation of variable x in grid i over the 1500-2000 AD period, respectively. xi,t is either yearly mean temperature, precipitation, or mean agricultural suitability for grid i at time t.

Fig. 6

Temperature time series for Europe using yearly observations (upper panel) and a 25-year moving average (lower panel).

Fig. 7

Precipitation time series for Europe using yearly observations (upper panel) and a 25-year moving average (lower panel).

In 1669 (Fig. 4), temperatures in Western Europe were up to 1.20 standard deviations above their mean, while temperatures in the remaining part of the continent remained relatively stable. This temperature increase could be interpreted as having a positive impact on crop suitability, particularly evident through the GDD measure. However, Central Europe experienced extremely low levels of precipitation, with some regions recording deviations of up to -5 standard deviations from the usual levels. Two notable observations can be made regarding the change in agricultural suitability for that year. Firstly, the combination of relatively high temperatures and extremely low precipitation significantly reduced soil moisture available for plants through the AI, rendering most areas unsuitable for cultivation, with some areas deviating by up to -22 standard deviations from the usual values. Secondly, the increase in temperature in the arid part of the study area (Southern Europe and Northern Africa) was counterbalanced by higher levels of precipitation, providing sufficient moisture to compensate for the temperature increase and thereby enhancing the agricultural suitability of the area.

In our second case study (Fig. 5), dated to the year 1775 AD, we observe the highest suitability index (0.55). While Southern Europe and Northern Africa experienced a decline in suitability due to the interplay of higher temperatures and reduced precipitation, Eastern Europe witnessed a notable uptick in suitability, primarily attributed to a rise in temperature. This case study underscores an important insight: regions traditionally deemed too cold for sustained cultivation, especially those hovering around the 5 °C threshold for crop growth, stand to gain the most from temperature increases. Such temperature shifts, when coupled with appropriate moisture levels and soil conditions, can turn previously unsuitable areas into viable cultivation zones. A pertinent example is Northern Europe, where land suitability rose markedly, reaching values up to 4.50 standard deviations above the norm, spurred by elevated temperatures peaking at 1.80 standard deviations above the mean level. This illustrates how changes in temperature, particularly in regions near the colder boundary for cultivation, can profoundly influence crop suitability, opening new avenues for agricultural productivity in areas once deemed largely unsuitable.
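
The standardized deviations quoted in both case studies follow the Z-score definition given in the captions of Figs. 4 and 5. A minimal Python sketch for the time series of one grid cell is shown below. The function name is hypothetical, and the use of the population standard deviation is an assumption, since the text does not state which estimator was used.

```python
from statistics import fmean, pstdev


def z_scores(values: list) -> list:
    """Z-score of each yearly value relative to the cell's own mean and SD."""
    mu = fmean(values)
    sigma = pstdev(values)  # assumption: population standard deviation
    if sigma == 0:
        raise ValueError("series is constant; Z-scores are undefined")
    return [(v - mu) / sigma for v in values]
```

Applied to the full 1500-2000 series of a cell, the returned list has one Z-score per year, so a value of −5 for precipitation in 1669 means that the cell was five of its own standard deviations drier than its long-run mean.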

The importance of changes in climate conditions is also noteworthy and varies depending on soil conditions. Areas with different soil suitability conditions will be impacted differently by climate shocks, with areas having less suitable soil being more vulnerable to significant climate shocks. Conversely, areas with relatively high soil suitability, while negatively impacted by the shock, could potentially still sustain cultivation due to their favorable soil conditions. Thus, understanding the interplay between climate and soil conditions is crucial for predicting the resilience of agricultural systems to climate change.

While more sophisticated tools for measuring crop suitability over time have been developed for the 20th century, particularly the FAO GAEZ grids available for periods from 1960 onwards50, our index remains valuable for studies requiring consistent, uninterrupted suitability measurements spanning multiple centuries, thereby avoiding structural discontinuities in the dataset.

Data Records

The agricultural suitability index presented in this paper is available on the Harvard Dataverse70 (https://doi.org/10.7910/DVN/ECWMZS).

Users of the statistical programming language R have the option to directly load the data through the environmentalhist package (https://github.com/axlehner/environmentalhist). The dataset provides a reconstruction of agricultural suitability for the European landscape ranging from 25° W to 40° E and 35° N to 70° N at a 0.5° × 0.5° resolution for the 1500–2000 period in GeoTIFF (.tif) format:

suit.tif

Geospatial raster dataset with 501 bands representing the annual average agricultural suitability index for the period from 1500 to 2000. The index ranges from 0 (unsuitable for cultivation) to 1 (highly suitable for cultivation).
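
To select a given year from the 501 bands, a small helper such as the following Python sketch may be convenient. It assumes that the bands are stored in chronological order with band 1 corresponding to the year 1500; this ordering is an assumption and should be checked against the file metadata. The function name is hypothetical.

```python
FIRST_YEAR = 1500
LAST_YEAR = 2000


def band_for_year(year: int) -> int:
    """Return the 1-based band index for a year, assuming band 1 is the year 1500."""
    if not FIRST_YEAR <= year <= LAST_YEAR:
        raise ValueError(f"year must lie between {FIRST_YEAR} and {LAST_YEAR}")
    return year - FIRST_YEAR + 1
```

The returned index can be passed to raster readers that use 1-based band numbering, as GDAL does; for example, `band_for_year(1775)` returns 276.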

Technical Validation

Validating our agricultural suitability index presents challenges due to its pioneering nature and the lack of comparable historical data. As the first to develop a time-varying index reaching back beyond the mid-20th century, we lack benchmarks to technically validate for the earlier time periods. Additionally, agricultural output data linked to specific geolocations is sparse prior to the 20th century, further complicating validation efforts. However, we believe that technical validation only on periods in the second half of the 20th century is sufficient to demonstrate the robustness of our index for several reasons. First, the index is defensive in the sense that the time-varying component is induced only by the temperature and precipitation datasets16,17,18, which have undergone rigorous validation themselves. Second, our simple surface energy and water balance model - as proposed by Ramankutty et al.3 - relies on relatively weak assumptions and standard functional forms and thus should have almost equal applicability across several centuries. Additionally, the methodology remains widely embraced across many academic fields.

We proceed to validate our index for post-World War II periods (1961-1990 and 1971-2000), the furthest extent to which we can compare our index with other existing agricultural suitability measures. This analysis involves comparing our index to the well-established FAO GAEZ product and the static Ramankutty et al.3 index itself. First, we analyze the cross-sectional correlation and see the goodness of fit. Second, we spatially compare the differences across all grid cells. Lastly, a histogram shows that these errors are roughly symmetrically distributed.

To enable accurate comparison, all indices have been extracted to our 0.5° resolution grid covering the European landscape, specifically ranging from 25° W to 40° E and 35° N to 70° N. Since the FAO GAEZ suitability index uses climate data from 1971-2000 AD and the index built by Ramankutty et al.3 uses data from 1961-1990 AD, we average our index over these two time spans to facilitate temporal comparison.

Figure 8 (left panel) illustrates a fairly strong correlation between the agricultural suitability index derived from FAO data and our measure of agricultural suitability, with an R-squared of almost 60%. Several factors contribute to this observation. Primarily, our index employs a focused approach to assessing agricultural suitability, relying on four main parameters to delineate favorable conditions. In contrast, the FAO index integrates numerous factors, including soil quality, water supply systems, and crop-specific soil suitability ratings, which contributes to its higher level of detail and reliability. Given that we do not aim to fit the FAO index ex-ante and that its functional forms are quite different, we believe the observed correlation is very reassuring.

Fig. 8

The left panel shows the correlation between the suitability index built in this study (Suit) with data averaged over the period 1971 - 2000 and a suitability index from FAO. The right panel shows the correlation between the suitability index built in this study (Suit) with data averaged over the period 1961 - 1990 and the suitability index from Ramankutty et al.3. The red dashed line represents the 45° line, and the solid red line represents the linear model y = βx. The coefficient and R2 from the linear regression are shown in the top left corner.

Figure 8 (right panel) exhibits the correlation with the index developed by Ramankutty et al.3. The achieved R-squared of over 70% is remarkably good, and it can be seen in the scatterplot that the observations line up fairly well along the 45-degree line. Our index tends to identify slightly more suitable areas. This discrepancy arises due to several differences in our methodology. Firstly, we rely on different and more recent weather and soil data, as well as incorporating a composite index of agricultural suitability from FAO instead of historical cropland cover maps for parameter calibration. Consequently, these differences impact the definition of functional forms in our index. Secondly, the limited global coverage of our temperature and precipitation data over the European landscape results in geographically more constrained observations for model training thus driving some of the observed difference. Our index is, therefore, more optimized for the European continent, whereas Ramankutty et al.3 covers the entire globe.

Figure 9 illustrates the spatial disparities between our index and the two benchmark indices, calculated as follows: difference = Suiti − Xi, where Suiti represents our measure of agricultural suitability and Xi denotes either the composite measure of agricultural suitability defined by the FAO GAEZ (left panels) or the measure of agricultural suitability by Ramankutty et al.3 (right panels) for grid i. Consequently, negative values indicate areas identified as more suitable by the other dataset, while positive values imply that our measure indicates grids with higher suitability conditions for cultivation.

Fig. 9

Illustration of the spatial difference between suitability indices. The top panels show our measure of agricultural suitability averaged over the period 1971-2000 (as in FAO) and the period 1961-1990 (as in Ramankutty et al.3) to allow consistent temporal comparison with the two benchmark indices. The middle panels represent the composite measure of agricultural suitability from FAO GAEZ and the measure of agricultural suitability by Ramankutty et al.3. The lower panels show the spatial difference between suitability indices computed as follows: Difference = Suiti − Xi, where Suiti represents our measure of agricultural suitability (Suit1971−2000 for FAO and Suit1961−1990 for Ramankutty et al.3) and Xi denotes the benchmark indices for grid i. Consequently, negative values indicate areas identified as more suitable by the other dataset, while positive values imply that our measure indicates grids with higher suitability conditions for cultivation.

Looking at the FAO difference, we observe differences in the arid south and cold north boundaries of Europe for the FAO index, whereas our index shows slightly more suitable areas in southern Europe and along the East coast of the Adriatic Sea, most likely due to our simplified soil suitability conditions. Comparing spatial differences with Ramankutty et al.3, we still observe differences in the arid south but also note what appears to be more generally suitable areas in central Europe. As climate is unlikely to differ substantially enough in these areas to account for such differences, the variations likely result from differences in datasets on soil suitability (carbon content and soil pH), as well as induced changes in model calibration.

As a final step, we observe that these differences are minor and distributed symmetrically (Fig. 10). However, a slight shift towards the positive spectrum suggests that our index tends to identify higher suitability conditions on average in Europe for this time period, particularly when compared to the index developed by Ramankutty et al.3. It is important to emphasize that our index is not designed to compete with the precision of modern indices built by the FAO GAEZ.

Fig. 10

The histogram shows the distribution of the differences for all 9100 grids as shown in the lower panel of Figure 9. Difference = Suiti − Xi, where Suiti represents our measure of agricultural suitability, and Xi denotes either the composite measure of agricultural suitability defined by the FAO GAEZ (FAO) or the measure of agricultural suitability by Ramankutty et al.3 for grid cell i. Consequently, negative values indicate areas identified as more suitable by the other dataset, while positive values imply that our measure indicates grids with higher suitability conditions for cultivation.

Usage Notes

The precision of our agricultural suitability index is inherently tied to the quality of its input data. Historical climate data predating the twentieth century are limited, often requiring reconstruction from documentary evidence and natural proxies, each with inherent limitations and potential biases34. Consequently, any calibration errors in the reconstruction of historical temperature and precipitation data may propagate through our model.

Our index currently offers robust estimates for the European climate due to the availability of historical data for this region. However, the scarcity of global historical data limits the model’s applicability to different climates, particularly in more arid and humid areas of the world. This could result in less reliable estimates for some regions outside of our study area.

Regarding other climatic variables essential for computing evapotranspiration rates, such as wind speed, relative humidity, or potential sunshine hours, time-varying historical data are nonexistent. Our model therefore assumes that these variables remained relatively stable over past centuries. While this assumption simplifies the water balance model, it means that variations in moisture availability are primarily attributed to changes in temperature and precipitation. Major changes in the other variables, for instance in sunshine hours influenced by volcanic activity, could therefore introduce measurement errors.

Additionally, the use of a 5 °C threshold in calculating growing degree days means that areas where temperatures hover around this threshold are particularly sensitive to temperature fluctuations. Researchers should interpret these results carefully, distinguishing between meaningful changes in weather conditions and potential noise.

To minimize potential errors and noise, we recommend researchers extract rasters for the specific years of interest and then use zonal averages via buffers when measuring suitability for a particular location, such as a city. This approach helps average out possible noise and measurement errors. For average suitability over larger areas, such as countries, zonal means within the relevant polygons for the given year are sufficient. For studies spanning longer periods, we suggest averaging the grid data over these years before taking zonal statistics.
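
The recommended order of operations, averaging over years first and taking zonal statistics second, can be sketched in plain Python as below. The function names, the nested-list grid layout and the use of `None` for missing cells are assumptions made for this illustration; in practice a raster library would be used.

```python
def mean_over_years(grids_by_year: dict, years: list) -> list:
    """Cell-wise mean over the selected years; None marks a missing cell."""
    first = grids_by_year[years[0]]
    n_rows, n_cols = len(first), len(first[0])
    averaged = []
    for r in range(n_rows):
        row = []
        for c in range(n_cols):
            vals = [grids_by_year[y][r][c] for y in years
                    if grids_by_year[y][r][c] is not None]
            row.append(sum(vals) / len(vals) if vals else None)
        averaged.append(row)
    return averaged


def zonal_mean(grid: list, cells: list) -> float:
    """Mean of the grid over the (row, col) cells of a zone, skipping missing cells."""
    vals = [grid[r][c] for r, c in cells if grid[r][c] is not None]
    if not vals:
        raise ValueError("zone contains no valid cells")
    return sum(vals) / len(vals)
```

Here `cells` stands for the grid cells that fall inside a buffer around a city or inside a country polygon; deriving that list from actual geometries is left to the GIS tooling of choice.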

If researchers are interested in agricultural suitability only in the second half of the 20th century, we recommend relying on the much more detailed products provided by the FAO cited in this text and also used for technical validation. If, however, a research project covers periods before and after the turn of the 20th century, we suggest relying on our index throughout to ensure consistency. The year 2000 is the temporal limit imposed by the historical spatial temperature and precipitation datasets.

The presented model could benefit tremendously from advances in paleoclimatology. More extensive and finer-grained paleoclimatic data extending farther back in time and geographically beyond Europe would allow its scope to be extended both spatially and temporally. Arguably the most important element to achieve this would be new and expanded datasets containing geolocalized tree-ring records.

Although the model is calibrated for common European crops, combining seasonal measurements with crop-specific water, temperature, and soil requirements could yield further insights into the spatial distribution of crops and their interaction with climate. Adjusting the model for specific crop types would not only allow an understanding of their distribution but could, for example, also be informative for modern efforts to reintroduce certain indigenous crops. These potential improvements are avenues for future research. Collaborations across disciplines could significantly enhance the utility of climate data inputs, addressing diverse research needs.

Our time-varying suitability index also has the potential for being used as an input for other models dealing with changes in Europe’s historical landscape - especially when it comes to the growing interest in studying the usage and depletion of natural resources leading up to the Industrial Revolution. For example, studies examining historical deforestation patterns typically use agricultural suitability as an important input71. These datasets can potentially increase the accuracy and precision of their estimates of historical forest coverage by incorporating a time-varying suitability index.