Introduction

Given the value of natural ecosystems for local biodiversity, human wellbeing, and climate change mitigation, we are witnessing the emergence of conservation and restoration movements across the globe1. Yet, for restoration and sustainable ecosystem management efforts to be successful, projects must consider which vegetation will be best suited to those regions under current and future environmental conditions. The terrestrial Earth surface can be divided into fourteen broad biome types, which differentiate distinct combinations of species across the globe2,3,4. While these groupings were originally designed to categorize vegetation, past work suggests these delineations also perform exceptionally well characterizing the distributions of other groups and trophic levels5. These biome classifications have proven fundamental to our understanding of terrestrial ecosystems, increasing confidence in global carbon sink projections6, as well as local ecosystem management decisions7. Furthermore, these biome categories place broad constraints on which types of vegetation are likely to survive in a given location and are therefore directly relevant to restoration and sustainable ecosystem management.

At the global scale, our contemporary biome maps are based primarily on expert opinion8 or remote sensing9. There is considerable interest in developing predictive models of the Earth’s terrestrial biomes, as these models could, in turn, be used to predict the future distribution of terrestrial biomes under climate change scenarios, enabling climate-informed species selection for restoration and sustainable ecosystem management10. Past efforts to do so have used two general approaches: dynamic global vegetation models (DGVMs) and biome-climate envelope models (BEMs). DGVMs apply a first-principles understanding of ecology to simulate and project terrestrial carbon balance11. To do so, these models simulate a limited number of vegetation types that can be used to reflect contemporary and future patterns of the Earth’s biome types. Despite increasing accuracy across model generations and the representation of biological feedback mechanisms to reflect complex system dynamics, the use of plant functional types and the incorporation of complex ecological and physical processes remain key limitations of DGVMs12. In contrast, BEMs are statistical models, often applying machine-learning techniques. These models do not attempt to represent the biophysical mechanisms that are likely to govern vegetation change, and instead leverage massive amounts of data to develop top-down predictive models of vegetation type13,14. When applied at regional scales, BEM classification can be extremely accurate (81–93% overall accuracy15,16). Machine-learning BEMs can both accurately capture the global distribution of biomes and forecast their future distribution. Previous studies incorporated plant traits17 and plant functional forms18 in BEMs to map the current distribution of biomes and forecast global biome shifts.
While these studies provide valuable insights into the structural changes in ecosystems that climate change may promote, their approach is limited by the availability of plant species occurrence data and reliant on the assumptions of trait or growth form models.

Here, we calibrate probabilistic machine learning models to classify the Earth’s fourteen biomes at the global scale using soil and climate drivers and assess their accuracy. By doing so, we identify the biome-climate envelopes (BCEs), the climate-defined multidimensional abiotic space that governs the distribution of the Earth’s biomes18,19. We then apply this model to global climate change scenarios to predict Earth’s BCEs in the future under the RCP 4.5 and 8.5 scenarios, which represent a conservative and the currently most likely climate future, respectively20. Using this approach, we assess the vulnerability of Earth’s terrestrial BCEs to future global change. Based on previous studies14,17,18, we expect to find major shifts in BCEs due to climate change. We hypothesize that boreal and tundra BCEs are likely to decrease in extent and shift to higher latitudes as the climate gets warmer and drier21,22, while BCEs of dry ecosystems may expand in the future17,18,22. The models presented here do not reflect changes in vegetation characteristics under future climate conditions, as uncertainty about ecological feedbacks and migration rates precludes our capacity to predict how climate variation would impact these vegetation dynamics. However, this work identifies the regions where changes in climate conditions might make the local environments unsuitable for the vegetation that presently characterizes those regions.

Results

Our machine learning models described the contemporary distribution of BCEs across the globe with high accuracy (Fig. 1). The overall out-of-sample classification accuracy, assessed using an independent validation dataset, was 90% (kappa coefficient 0.89). This present-day BCE classification accuracy was consistent across all 14 biomes, ranging between 83 and 94% (Fig. 1, Table S3). We additionally characterized areas for which our global model predicted multiple feasible BCEs (at least two biomes with a probability > 10%), showing that the current BCE classification is uncertain in 9.6% of the global land surface area, with the majority of these multiple-BCE regions occurring at the transition zones between biomes.

Fig. 1

Global distribution of biome-climate envelopes (BCEs). (a) Observed contemporary distribution of Earth’s 14 terrestrial BCE types, generated using the RESOLVE Ecoregions 2017 dataset from the Earth Engine Data Catalog23 based on Dinerstein et al.8. (b) Predicted consensus BCE estimate based on random forest ensemble model. Shaded regions reflect uncertainty, where more than one BCE is predicted to occur with greater than 10% probability. (c) Confusion matrix of model classification accuracy. The diagonal represents the number of correctly classified validation points per BCE, out of 2500 total out-of-sample classifications per BCE. Off diagonal are the number of misclassifications, and which BCE was predicted instead.

Using IPCC climate projections over the coming decades, we predicted changes of BCEs in the future for two different climate scenarios. For both the RCP 4.5 and 8.5 climate-change scenarios, our models predict substantial changes in the global distribution and relative abundance of BCEs until 2080 (Fig. 2, Tables S1 and S2). By representing the abiotic context of biomes, these predicted changes in BCEs reflect the vulnerability of these regions to changes in the composition of the flora and fauna as a result of climate change. Even under the conservative RCP 4.5 scenario, the relative area of BCEs currently supporting montane grass- and shrublands, boreal forest, and tundra is predicted to decrease until 2080 by 15, 20, and 29% respectively, which equals an absolute reduction of 84, 284, and 239 million hectares (Fig. 3, Table S1). BCEs of habitats which are permanently or periodically inundated, such as mangroves and flooded grasslands and savannas, show reductions of 2 and 60%, respectively. The BCEs of dry ecosystems show strong increases in their relative area, with an increase of 23% (116 million hectares) for tropical dry forests and 4% (95 million hectares) for deserts and xeric shrublands (Fig. 3).

Fig. 2

Model consensus biome-climate envelope (BCE) estimate. (a) under contemporary climate, (b) under 2080 climate under RCP 4.5 and (c) under 2080 climate under RCP 8.5. Shaded regions reflect uncertainty, where more than one BCE type is predicted to occur with greater than 10% probability. Grey areas indicate ice-covered surfaces and lakes.

Fig. 3

Changes in individual biome-climate envelope areas as a function of time. From 2013 to 2080 (a) under RCP 4.5 and (b) under RCP 8.5. Grey areas indicate ice-covered surfaces and lakes.

At the global scale, our model projects that 11.4% of the land surface (1.5 billion hectares) has a high vulnerability (> 90% probability) of changing to a different BCE under the conservative RCP 4.5 climate scenario and 16.7% under the RCP 8.5 scenario (Fig. 4). An equally large region becomes highly uncertain, with the potential to exist within multiple BCE states. Specifically, the area of land classified as multiple possible BCE states (at least two BCEs with > 10% occurrence probability) increases from 9.6% under contemporary conditions to 16.1% (2.1 billion hectares) under RCP 4.5 and 19.1% (2.4 billion hectares) under RCP 8.5. It is unclear whether this increasing vulnerability is driven by an increasing area of novel climate conditions that are projected under a changing climate24. Ultimately, both measures of BCE change—those predicted to change to a new classification, and those that become highly uncertain—highlight the increasing vulnerability of terrestrial biomes to future changes in climate states.

Fig. 4

Probability that the contemporary consensus biome-climate envelope changes by 2080. (a) under RCP 4.5 and (b) under RCP 8.5 climate change scenarios. Grey areas indicate ice-covered surfaces and lakes.

Mapping the extent of model extrapolation vs. interpolation shows that the vast majority of future BCEs fall within current climate space (Figs. S1 and S2), especially at northern latitudes, where we predict the greatest potential for BCE turnover (Fig. 4). However, climate change will cause novel climates to emerge, and these environments are fundamentally outside of our training dataset. Our model predicts very little BCE turnover in these areas, likely in part due to the nature of our machine learning approach25, and therefore may be overly conservative. The Sahara Desert and Arabian Peninsula are the regions most distant from the current climate space (Fig. S1). According to our model, these areas will become hotter deserts than ever recorded. Assessing the extent of extrapolation across the full multivariate environmental covariate space highlights that more than 90% of pixels of future BCEs fall within current climate space, with the highest proportion of extrapolation in the tropics (Fig. S2). Thus, predictions for the ~10% of areas where novel combinations of environmental variables are expected to emerge should be interpreted with caution, as model uncertainty is inherently higher in these regions.

Finally, we assessed forecast sensitivity to the contemporary climate window on which our prediction model is trained. Using a leading 20-year window, as done here, may be overly conservative, as a substantial amount of climate change has already occurred. If biomes have yet to shift in response to changes in BCEs, then this may lead to an underestimation of potential BCE turnover in the future as our training climate would essentially have some amount of climate change “baked in”. To address this possibility, we re-trained models using a 34-year climate window, the longest possible range available for the CHELSA climate variables used in this study, and then we regenerated the BCE forecasts. Under this scenario we estimate that 14% and 21% of the land surface is likely to undergo a change in consensus BCE under RCP 4.5 and 8.5, respectively (Fig. S3). This finding indicates that, as expected, our forecasts are fundamentally sensitive to the amount of climate change that occurs in the future, and that forecasts presented here are likely conservative.

Discussion

The distribution of the Earth’s biomes places fundamental constraints on the global carbon cycle and biodiversity5,6. We show that soil and climate variables can account for 90% of current BCE distributions, providing an accurate classification of Earth’s present-day biomes. By applying this model to future climate scenarios given by the IPCC, we reveal major shifts in Earth’s BCEs in line with predictions from past studies14,17,18. Global greenhouse gas emissions are currently tracking the RCP 8.5 scenario. Under this scenario, we find that 17% of the terrestrial biosphere (2.2 billion hectares) is vulnerable to a significant shift in environmental conditions that could threaten the present vegetation in the coming decades, potentially rearranging the distribution of Earth’s biodiversity. Moreover, there is considerable uncertainty about the expected trajectory of up to 19% of terrestrial surface (2.4 billion hectares), as the dominant BCE is becoming increasingly uncertain. Changes in the climate-envelopes that govern these biomes will have profound implications for biodiversity conservation, reforestation, and land management.

Aligning with previous work14,17,18, our model suggests that climate change will especially affect the extent and distribution of tundra, boreal and dryland BCEs. Driven by changes in climate, 787 million hectares of tundra and boreal forests will be under threat by 2080, as these biomes are likely to be encroached upon by correspondingly lower latitude BCEs. These ecosystems store massive ecosystem carbon reserves, especially within peatland and permafrost soils26, that could be vulnerable to such striking environmental change as soils warm and dry27. For example, Wood Buffalo National Park, Canada’s largest national park and currently a boreal forest BCE, is forecasted to have the temperate grassland, savanna and shrubland BCE of Theodore Roosevelt National Park, 1700 km south in North Dakota (United States), by 2080. Meanwhile, dryland BCEs such as tropical dry forests, xeric shrublands and deserts, and mediterranean forests, woodlands and scrub are expected to substantially increase in abundance by 2080. For example, Austin, Texas (United States), currently a temperate savanna, grassland and shrubland BCE, is forecasted to have the BCE of Big Bend National Park, an extreme desert environment, by 2080. BCEs currently supporting tropical grasslands are projected to expand by 5% under RCP 8.5 at the expense of tropical moist forest BCEs, mostly in the southeastern part of the Amazon. This projection is consistent with other BEM predictions17,18 and with analyses observing decreased resilience and grass encroachment in this area28,29. In other places, the tropical moist forest BCE is expected to expand by 3%, encroaching on tropical grassland BCEs in central Africa and northern South America17. With the prediction that 11–17% of the terrestrial surface is vulnerable to a shift in BCEs, our assessment falls between the high18 and low estimates14,17 of other studies.
Our spatial modeling makes clear that, even under the most optimistic climate change scenarios, the world should prepare for a spatial restructuring of the Earth’s BCEs on an alarmingly short time scale. More extreme scenarios suggest fundamentally new climate envelopes will emerge across the planet (Figs. S1 and S2).

Despite these striking changes in environmental conditions, the extent to which flora and fauna respond to these future climate pressures is a key outstanding question, and one that will ultimately govern if, when, and where future biomes eventually align with these BCE projections. Turnover of an ecosystem from one type to another can take decades to centuries—if at all—even for radical changes in climate30. Anthropogenic disturbances such as CO2 fertilization and nutrient deposition have the potential to buffer ecosystems and minimize climate-driven changes in plant composition31,32. The deterministic regionalization of biomes that our model is built on does not reflect regional and continent-specific phenomena such as evolutionary processes and migration timings33. Furthermore, complex ecological feedback mechanisms, such as disturbance regimes, may have continent-specific implications33,34 and introduce uncertainty about biome occurrences35. Because our models do not account for these potential differences in biome–environment relationships across continents, the uncertainty associated with our future predictions may be higher. Furthermore, we lack information on potential long-term changes in soil characteristics, which could alter BCE predictions and ecosystem dynamics36. Thus, we caution that these BCE projections should not be interpreted as space-for-time projections of future vegetative composition in the year 2080. Lags in ecosystem turnover, which can be particularly strong for long-lived species, like trees, will undoubtedly create persistent disconnects between biome-climate envelopes and biome patterns30,37. This phenomenon can lead to a mismatch between present-day climate and biota, promoting the emergence of novel biomes24. 
By adopting a probabilistic approach and focusing on the abiotic component of biome classification (see also19), we outline which plant communities and which regions are likely to experience novel environmental pressures over the coming century.  This information may be particularly relevant in regions that are presently degraded, where the sustainable restoration or management of ecosystems may not need to overcome past vegetation legacies.

Ultimately, the extent to which contemporary relationships between environment and vegetation climate envelopes will hold in the future remains uncertain. Our model predictions are inherently constrained by the environmental variables included, and changes in other unmodeled factors could alter the climate-biome relationships we identify. For example, interactions between light availability and temperature are known to constrain vegetation changes at northern latitudes38. Furthermore, rising CO2 is expected to increase plant water use efficiency, in turn allowing woody vegetation to persist in drier climates than observed today. On the one hand, this would support the use of more mechanistic models, such as DGVMs, to generate future vegetation predictions, as these models explicitly represent how the relationships between some features of the environment and vegetation are likely to change. On the other hand, the limited capacity of DGVMs to capture the contemporary relationships between the environment and vegetation suggests that a combination of approaches is likely to provide the most valuable avenues for future research. Furthermore, the development of precipitation and water stress variables that account for the effect of atmospheric CO2 concentration may allow models fit to contemporary data to overcome some of these limitations. Nevertheless, we argue the data-driven envelope models presented here are an important complement to mechanistic models of terrestrial vegetation.

Although the extent to which time lags, hysteresis, region-specific ecological feedback mechanisms, and changes in the relationships between the environment and vegetation will lead to a disconnect between future climate and future biomes remains unknown, our work complements other assessments14,17,18 in presenting a set of clear, testable, baseline predictions for future biome-climate envelopes. Our predictions are limited to the environmental variables and processes our model includes, and the accuracy of our projections is contingent on the accuracy of the RCP climate projections and on their consistency with modern biome delineations. As new data become available, Earth-system modelers will be able to assess where these projections succeed and fail and use this information to generate updated models and forecasts in the future. This forecast-analysis-cycle approach is foundational to developing a forecasting capacity across disparate scientific disciplines39. Given the severity of global environmental change and the urgency to understand climate change, we cannot ignore the need for actionable, present-day information to guide global restoration efforts. By identifying areas that are vulnerable to novel climate pressures, this work can help land managers, conservationists, and restoration practitioners make critical decisions about long-term management activities and objectives, and serve as a baseline guide for proactive and adaptive ecosystem management planning40,41.

Given the speed of global environmental change, understanding the current and future state of the Earth’s major ecosystem types is an urgent question in ecology. This is especially true in the context of ecosystem restoration, where projects urgently need information about which environmental conditions are likely to characterize their site in the coming decades. Here, we apply a machine-learning bioclimatic envelope approach to classify the current distribution of Earth’s biomes, showing that this approach performs well and complements other contemporary vegetation models. When forecasted into the future, even the most optimistic climate change scenario predicts a fundamental rearrangement of the Earth’s major biome-climate envelopes, subjecting local biota to substantial and novel environmental pressures in those areas. Where future climates will not support current vegetation states, restoration and conservation actions could seek to enhance the ability of ecosystems to track climatic changes and enable the movement of species to more suitable locations. Our research suggests humanity should prepare for a realignment of the Earth’s biome-climate envelopes, with direct and alarming implications for climate-smart restoration, conservation and ecosystem management.

Methods

Contemporary biome-climate envelope global distribution data

We used the RESOLVE global biome product8 (Fig. 1a) as training data for our classification model, using regional classification systems as a baseline for biome-climate envelope (BCE) boundaries. This original biome dataset was first established in 200142 based on 100 years of biogeographic scholarship, including extensive field investigations. Here, we use the revised version from 20178, which incorporated recent advances in biogeographic scholarship and has been widely used since then43,44. Importantly, this biome classification represents the on-the-ground distribution of regional biotas and was constructed using on-the-ground observations and expert knowledge, rather than generalizations of biome-environment relationships. This is critical, since our model is trained on environmental variables, and a product generated from those environmental variables would result in a circular analysis.

Covariate layers

All environmental covariate layers used to train machine learning models are presented in Table S4. We created a bioclimatic composite of 25 covariate layers, including climate (temporal period 1979–2013)45 and soil (limited to the topsoil; 15 cm)46. We did not include topography as a covariate because future projections for topographic changes are unavailable (although topography may not change given the relatively short time period of the study, the climatic variables it represents will change in the future47). All covariate layers were resampled and reprojected to a unified pixel grid in EPSG:4326 (WGS84) at a resolution of 30 arcsec (approximately 1 km2 at the equator). Due to limited coverage of several covariate layers, Antarctica was not included in the analyses.

Geospatial modelling of biome-climate envelopes using contemporary environmental data

Biome-climate envelope (BCE) classification was performed using an ensemble random forest machine learning approach (see Fig. 5 for an overview schematic). All model training and geospatial analyses were performed in Google Earth Engine48. Before conducting the full ensemble classification analysis, we first tuned random forest hyperparameters. To do so, we randomly sampled 2500 point locations within each biome, resulting in a total of 35,000 training points. We then tested 11 random forest models, varying the number of variables per split while keeping the other hyperparameters (number of trees and minimum leaf population) constant. To select the best performing model, the performance of each random forest model was tested using an independent validation dataset. Hyperparameters of the final random forest model included 100 trees, five variables per split and a minimum leaf population of 1.

Fig. 5

Schematic overview of the modelling process. We used an ensemble random forest machine learning approach to predict the current biome-climate envelope (BCE) distribution and future BCE distributions under RCP 4.5 and RCP 8.5.

Once hyperparameters were selected, we generated 1000 independent training datasets, each with a different random seed. Each training dataset was composed of 2500 random points per biome, for a total of 35,000 training observations per dataset. We then trained one random forest model on each of these training datasets, yielding an ensemble of 1000 models. Therefore, across the entire ensemble, we trained against 35 million observations. We then used the model ensemble to generate BCE occurrence probabilities at each pixel. The occurrence probability of a particular BCE within a pixel was calculated as the number of random forest models in the ensemble that predicted that BCE for the pixel, divided by 1000. Consensus BCE classifications were made by selecting the BCE with the highest occurrence probability within each pixel. Finally, the model was validated against another, independent sampling of the RESOLVE global biome product, composed of 2500 random points per BCE, for a total of 35,000 validation observations.
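
The vote-counting step described above can be illustrated with a minimal sketch. This is not the Google Earth Engine workflow used in the study; the function names, the example votes, and the alphabetical tie-break rule are assumptions introduced only for illustration.

```python
from collections import Counter


def occurrence_probabilities(votes):
    """Fraction of ensemble members that predict each BCE for one pixel."""
    n_models = len(votes)
    counts = Counter(votes)
    return {biome: count / n_models for biome, count in counts.items()}


def consensus(probabilities):
    """BCE with the highest occurrence probability.

    Ties are broken alphabetically here; this rule is an assumption,
    since the text does not state how ties were handled.
    """
    return max(sorted(probabilities), key=lambda biome: probabilities[biome])


# Hypothetical pixel: one class vote from each of 1000 ensemble members.
votes = (
    ["boreal forest"] * 620
    + ["tundra"] * 300
    + ["temperate grassland"] * 80
)

probs = occurrence_probabilities(votes)
label = consensus(probs)
print(probs)
print(label)
```

In this made-up pixel, two BCEs exceed the 10% threshold used elsewhere in the paper, so the pixel would also be flagged as uncertain even though its consensus class is well defined.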

Interpolation—extrapolation

To assess the extent of extrapolation when predicting under future climate scenarios, we assessed which climate variables under RCP 4.5 and 8.5 are fundamentally outside of the range of our training dataset (i.e. contemporary climate conditions on Earth). For each pixel under RCP 4.5 and 8.5, we report the fraction of climate covariates that fall inside or outside the training climate variable range (Fig. S1). In addition, to assess the extent of extrapolation throughout the full multivariate environmental covariate space, we used a principal component analysis (PCA)-based approach, following the method of van den Hoogen et al.49. We created convex hulls for each of the bivariate combinations of the first 7 principal components (which collectively covered more than 90% of the sample space variation). Using the coordinates of the convex hulls, we classified whether each pixel falls within or outside each of the convex hulls. For both the RCP 4.5 and RCP 8.5 scenarios in 2080, more than 90% of the world’s pixels fall within the entire set of 21 PCA convex hull spaces computed from our sampled data, with most of the extrapolated pixels existing in tropical regions (Fig. S2).
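
As a rough illustration of the univariate range check, the sketch below computes, for a single pixel, the fraction of covariates that fall outside the range seen in training. The covariate values and function names are hypothetical. The sketch also confirms that 7 principal components yield 21 bivariate combinations, but it does not implement the convex hull test itself.

```python
from itertools import combinations


def training_ranges(training_rows):
    """Per-covariate (min, max) over the training sample."""
    columns = list(zip(*training_rows))
    return [(min(column), max(column)) for column in columns]


def fraction_outside(pixel, ranges):
    """Fraction of covariates in `pixel` outside the training range."""
    outside = sum(
        1 for value, (low, high) in zip(pixel, ranges)
        if value < low or value > high
    )
    return outside / len(ranges)


# Hypothetical training sample: three points, three covariates each
# (e.g. mean temperature, annual precipitation, soil pH).
training_rows = [
    (-5.0, 300.0, 6.1),
    (12.0, 900.0, 5.4),
    (27.0, 2500.0, 7.2),
]
ranges = training_ranges(training_rows)

# Hypothetical future pixel: hotter and drier than anything in training.
future_pixel = (31.0, 150.0, 6.0)
frac = fraction_outside(future_pixel, ranges)
print(frac)

# Bivariate combinations of the first 7 principal components.
pc_pairs = list(combinations(range(1, 8), 2))
print(len(pc_pairs))
```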

Evaluating model accuracy

We generated a confusion matrix to assess model accuracy. We report on a per-biome basis the relationship between actual data and the result of the model. Overall accuracy is computed by dividing the sum of the diagonal elements by the total number of reference pixels. The kappa coefficient was also computed to quantify the difference between the agreement of the classifier compared to the expected agreement by chance between reference and classified data50.
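
The two accuracy metrics can be sketched from a confusion matrix as follows. The 3-class matrix is invented for illustration and is not data from the study; rows are taken to be reference classes and columns predicted classes, which is an assumed orientation.

```python
def overall_accuracy(matrix):
    """Sum of the diagonal divided by the total number of reference points."""
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return correct / total


def cohens_kappa(matrix):
    """Agreement beyond chance: (p_o - p_e) / (1 - p_e)."""
    n_classes = len(matrix)
    total = sum(sum(row) for row in matrix)
    observed = overall_accuracy(matrix)
    row_totals = [sum(row) for row in matrix]
    col_totals = [
        sum(matrix[i][j] for i in range(n_classes)) for j in range(n_classes)
    ]
    # Chance agreement from the marginal totals of both axes.
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / total**2
    return (observed - expected) / (1 - expected)


# Hypothetical 3-class confusion matrix (rows: reference, columns: predicted).
matrix = [
    [90, 5, 5],
    [10, 80, 10],
    [0, 10, 90],
]

accuracy = overall_accuracy(matrix)
kappa = cohens_kappa(matrix)
print(accuracy, kappa)

# With equal reference counts per class, chance agreement is 1 / n_classes.
kappa_balanced_14 = (0.90 - 1 / 14) / (1 - 1 / 14)
print(round(kappa_balanced_14, 2))
```

Because the validation set contains the same number of points for each of the 14 BCEs, chance agreement reduces to 1/14, so an overall accuracy of 90% corresponds to a kappa of about 0.89, in line with the value reported in the Results.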

Forecasting BCE distributions under future climate scenarios

To predict potential future distributions of the Earth’s BCEs, we re-generated predictions from the original 1000-member model ensemble, updating the 19 bioclimatic variables with the averaged outputs of three general circulation models (GCMs). We chose three models that do not represent feedbacks with the terrestrial carbon cycle, as including such feedbacks would make our analysis circular. We chose the ACCESS1.0, CMCC-CM, and MIROC5 models from the Coupled Model Intercomparison Project Phase 5 (CMIP5) based on the guidelines provided by Sanderson et al.51. These models are available for the climate data we use from CHELSA and represent a diverse set of Earth system model assumptions51,52. We used the CMIP5 model outputs under the RCP 4.5 and 8.5 scenarios as input. We focus on RCP 4.5 and RCP 8.5 because the former represents a moderate scenario whereas the latter represents a pessimistic scenario that assumes continued growth in emissions53 and is the scenario the world is currently tracking most closely20. We assumed soil variables to remain constant during all future forecasts given the relatively short time period considered in this study (see also13,14,17). To match the resolution of the other covariate layers in our composite, future climate layers were downscaled to a resolution of 30 arcsec, using the current conditions from the CHELSA dataset as a reference45. Each layer was computed for 2041–2060 and 2061–2080 (hereinafter referred to as 2080). Our results and discussion focus on the latter time period.

Mapping consensus BCEs, classification uncertainty, and BCE turnover

We mapped the consensus BCE estimate (i.e. the BCE with the highest occurrence probability within a given grid cell) for contemporary conditions as well as for forecasts under the RCP 4.5 and 8.5 climate change scenarios. Wherever two or more BCEs had an occurrence probability of > 10%, we shaded the region in light gray to visualize classification uncertainty. To understand how BCEs may change in the future, we mapped the probability that a given pixel changes its contemporary consensus estimate under RCP 4.5 and 8.5 (Fig. 4).
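
The uncertainty flag and the turnover probability can be sketched per pixel as below. The probability of change is computed here as one minus the future occurrence probability of the contemporary consensus BCE. That definition is an assumption: it is one plausible reading of the text, not a formula confirmed by the source. All names and values are hypothetical.

```python
def is_uncertain(probabilities, threshold=0.10):
    """True when at least two BCEs exceed the occurrence-probability threshold."""
    return sum(1 for p in probabilities.values() if p > threshold) >= 2


def change_probability(contemporary_consensus, future_probabilities):
    """Assumed definition: share of future ensemble votes that differ from
    the contemporary consensus BCE."""
    return 1.0 - future_probabilities.get(contemporary_consensus, 0.0)


# Hypothetical occurrence probabilities for one pixel.
contemporary = {"boreal forest": 0.93, "tundra": 0.07}
future = {
    "boreal forest": 0.06,
    "temperate grassland": 0.82,
    "temperate broadleaf forest": 0.12,
}

contemporary_consensus = max(contemporary, key=contemporary.get)
p_change = change_probability(contemporary_consensus, future)

print(is_uncertain(contemporary), is_uncertain(future))
print(round(p_change, 2))
# A pixel counts as highly vulnerable when the change probability exceeds 90%.
print(p_change > 0.90)
```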

Assessing model sensitivity to contemporary climate estimates

Our classification models are calibrated to a leading 20-year climate window; however, a substantial amount of climate change has already occurred. It is possible that the present-day BCE distribution does not yet reflect climate-driven changes that are still to occur. If this is the case, it would cause our forecasts to underestimate the amount of BCE turnover in the future. To assess this possibility, we retrained our models using a 34-year leading climate window, which essentially made our training climate “cooler” and the projected future climate more different from the training climate. All analyses were performed using Google Earth Engine and Visual Studio Code version 1.102.1; all maps and plots were produced with R version 4.4.2 and QGIS version 3.42.1.