Background & Summary

The European continent is rich in natural and semi-natural habitats that host diverse species of flora and fauna and provide a wide range of ecosystem services. However, these habitats are under major pressure due to climate change, pollution, biological invasions, rapid urbanisation, agricultural expansion, as well as intensification in some areas and abandonment in others, which threaten the extent and quality of these habitats. The European Environment Agency (EEA)’s latest assessment1 on the State of Nature in Europe reveals an alarming decline in Europe’s biodiversity, with most protected species and habitats lacking adequate conservation.

Despite these challenges, habitat assessments are largely based on expert judgment rather than field data1, leading to uncertainties in evaluating their true conservation status. Additionally, the exact extent of habitats remains unknown, particularly outside protected areas (e.g. in the context of the EU Habitats Directive assessment within and outside Natura 2000 sites), posing additional challenges to biodiversity conservation.

To monitor these pressing environmental pressures effectively, it is imperative to acquire accurate and comprehensive knowledge of the distribution of habitats at high spatial and thematic resolutions across Europe.

In Europe, the European Nature Information System (EUNIS)2,3,4,5 is the habitat classification framework designed for a comprehensive coverage of habitat types across the continent. EUNIS is particularly suited for large-scale mapping, including remote sensing-based “wall-to-wall” approaches. This also distinguishes it from widely used maps of Potential Natural Vegetation6,7, which represent theoretical landscapes not affected by humans rather than current habitat distributions8. EUNIS Habitat Classification is a hierarchical system with multiple nested levels, each offering increasing levels of detail and specificity in describing habitat types4. The classification system allows users to navigate from broader habitat categories to more specific habitat types (from level 1 to level 6). Level 1 distinguishes the major habitat formations such as wetlands, grasslands, forests, etc. To align with the most recent consistently revised classification9,10, this study focuses on Level 3 which provides detailed descriptions for terrestrial habitats and has consistent coverage across Europe.

Producing habitat maps requires accurate in-situ data covering a diverse set of habitats in Europe11. The compilation of the European Vegetation Archive12 (EVA) and advances in classification expert systems4,13 have enabled large-scale classification of European habitats using in-situ vegetation data. These systems assign individual vegetation plots to established classification frameworks such as EUNIS, providing ground truth data for habitat modelling and mapping at large scales. However, while in-situ data provide valuable reference points, they are often spatially limited and time-consuming to collect.

Remote sensing offers a complementary approach by enabling large-scale, high-resolution habitat mapping across extensive areas, including inaccessible sites. In addition to mapping habitat extent, remote sensing provides key environmental descriptors, such as vegetation indices (e.g., NDVI), surface moisture, canopy structure, and seasonal phenology, which can enhance habitat classification and potentially improve predictive models14.

Recent studies showcase the potential of integrating remote sensing variables15 with in-situ data for habitat modelling using knowledge-based classifiers16, data-driven machine learning approaches17,18 and hierarchical approaches19. However, a vast majority of studies focused on fine-scale mapping at regional scale, particularly within protected areas20,21. Land cover mapping has been extensively developed at global and continental scales22,23,24,25, but these products remain too broad for ecological applications that require habitats or vegetation types. Reviews of habitat mapping with remote sensing26,27 have highlighted that existing studies target specific formations or biomes, including forests18,28,29,30,31, wetlands32,33, grasslands16,34,35, coastal dunes36 and arid landscapes37. Individual suitability maps for most terrestrial EUNIS habitats at level 3 have been developed previously38,39. Yet, integrating them into a single map proved challenging. Despite recent methodological advancements40,41, a comprehensive, continental-scale map of EUNIS habitats has yet to be developed. To this end, there is a growing need to integrate different datasets and leverage the scalability and flexibility of Machine Learning (ML) methods for large-scale mapping of multiple habitats.

Here we develop and present high spatial and thematic resolution predictions of EUNIS (level 3) habitats across Europe. We provide habitat distribution maps at 100-m resolution by harnessing high-resolution and ecologically relevant remote sensing variables, validate these maps using three independent datasets, and make them available to the community via public repositories.

Methods

Habitat modelling predicts each habitat class at specific locations given environmental predictors. In this study, we used the EUNIS habitat nomenclature4,10, focusing on terrestrial habitats at level 3. This includes over 250 distinct habitat classes within nine broader formations. To handle the discrete nature of habitat classes, we employed a classification approach to accurately assign habitat types to specific locations based on predictor variables available as gridded raster data at the European extent.

In practice, we built a set of multi-class machine learning models, where each model classifies data into one of multiple habitat classes. This contrasts with independent binary classifiers38, which would require training separate models for each habitat class. The use of such classifiers could potentially lead to inconsistencies and loss of contextual relationships among classes. Joint modelling in multi-class models implicitly accounts for associations between multiple habitats, allowing less prevalent classes to borrow statistical power from more common ones.

The presence of multiple habitats from different EUNIS level 1 formations within the same spatial unit can create mosaics, introduce ambiguity and potentially reduce model accuracy. To address this, we leveraged the hierarchical structure of habitat classes by training separate multi-class models at EUNIS level 3, each restricted to habitats within a single level 1 group, ensuring that each model focuses only on a subset of ecologically related habitats and aligning with the structure of the EUNIS system.
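The grouping of level 3 classes under their level 1 formation can be illustrated with a minimal sketch. It assumes that the formation is given by the leading letters of the EUNIS code (e.g. `T` for `T17`, `MA` for marine codes); the function names and the plot labels are hypothetical and only serve the illustration.

```python
from collections import defaultdict

def level1_formation(eunis_code: str) -> str:
    """Return the leading letters of a EUNIS code, e.g. 'T17' -> 'T', 'MA22' -> 'MA'."""
    prefix = ""
    for ch in eunis_code:
        if not ch.isalpha():
            break
        prefix += ch
    return prefix

def split_by_formation(plots):
    """Group (plot_id, level3_code) pairs into one training set per formation."""
    groups = defaultdict(list)
    for plot_id, code in plots:
        groups[level1_formation(code)].append((plot_id, code))
    return dict(groups)

# Hypothetical plot labels, for illustration only.
plots = [(1, "T17"), (2, "T19"), (3, "R22"), (4, "MA22"), (5, "R1A")]
training_sets = split_by_formation(plots)
```

Each entry of `training_sets` would then feed one multi-class model restricted to a single formation.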

In the following, we detail the data sources (habitat in-situ information and environmental predictors), the modelling and prediction strategies, the validation and finally the pipeline for producing spatially contiguous maps integrating all habitat classes (i.e., the wall-to-wall map).

Data

Study area

The study extent is defined by the EEA39 region. This includes the 27 EU member states, three EFTA countries (Iceland, Liechtenstein, and Norway), and nine additional collaborating countries: Albania, Bosnia and Herzegovina, Kosovo, Montenegro, North Macedonia, Serbia, Switzerland, Türkiye, and the United Kingdom. This extent aligns with the Corine land cover mask used further down for integrating habitat predictions into a wall-to-wall map.

Habitat plots

Vegetation plots from the European Vegetation Archive (EVA)12,42 spanning the period 1990–2021 served as ground truth data for training and testing the model across Europe. Each vegetation plot record was translated to the EUNIS typology at level 3 based on its species composition, using the EUNIS-ESy expert system10. The EUNIS classification includes nine revised habitat formations: MA (Marine habitats), N (Coastal habitats), P (Inland surface waters), Q (Wetlands), R (Grasslands), S (Shrublands - heathlands, scrub, and tundra), T (Forests), U (Sparsely vegetated habitats), and V (Vegetated man-made habitats).

An EVA vegetation plot typically contains a full list of co-occurring vascular plant species, often also a list of co-occurring bryophytes and lichens, estimates of cover-abundance of each species and various additional information on vegetation structure and layering. The dataset included vegetation plots assigned to the target habitat formations: saltmarshes (MA2), coastal habitats (N), wetlands (Q), grasslands (R), shrublands (S), forests (T), sparsely vegetated habitats (U), and man-made habitats (V). We focused on the terrestrial realm and therefore excluded other marine habitats and inland waters, as their classification is not based on vegetation and consequently, they cannot be classified by a vegetation-based expert system.

Vegetation plots with no cover-abundance information for individual species were excluded. Further, plots smaller than 1 m², larger than 1000 m², without geographical coordinates and with reported uncertainty of the coordinates larger than 100 m were also excluded. The resulting dataset43 contained a total of 597,819 georeferenced plots, heterogeneously distributed across Europe (Fig. 1, Table 1).

Fig. 1 Distribution and density (log-scaled, 100 km × 100 km grid) of vegetation plots from the European Vegetation Archive (EVA) used in this study.

Table 1 Number of vegetation plots for each EUNIS level 1 habitat formation.

Habitat datasets for validation

To evaluate the quality of the habitat maps, we used two habitat occurrence datasets: a hold-out of habitat observations from the Netherlands (NL) and the French Forest Inventory (IFN). The IFN dataset was previously used to test the classification of vegetation plots into habitat types in the EUNIS-ESy2. Both datasets contain geolocated habitat observations that were not used for training the habitat distribution models.

The NL dataset contains 49,512 vegetation plots (Fig. 2) from the Landelijke Vegetatie Databank (LVD)44,45, sampled between 2010 and 2022 and classified into EUNIS level 3 classes following the same methodology as the EVA dataset.

Fig. 2 Distribution of habitat observations used for validation and number of observations in each EUNIS formation.

The French Forest Inventory (Inventaire Forestier National, IFN)46 is a nationwide program monitoring French forests annually on a systematic grid of 2000 m² plots. On each plot, the habitat at the centre is identified using ecoregion-specific identification keys, and additional habitats may be noted. These observations are linked to national typologies (HABREF), as well as EUNIS Level 3. Habitat data are currently available through DataIFN only for ecological regions with published identification keys (Grand-Est, Vosges, Jura, southern Alps). We used 21,252 plots surveyed between 2013 and 2021 (Fig. 2).

Additionally, we used the MAES/EUNIS habitat map for Austria (AT) (2021)47, a fine-scale 10-m resolution raster (~8 million classified grid cells) compilation of biotope mapping data from Austrian federal states, harmonised to EUNIS Level 3 classes. It provides full national coverage across habitat types.

Environmental predictors

To train the models on the vegetation plots, we built a comprehensive database of environmental predictors at the highest possible spatial resolution, which are ecologically meaningful to predict habitats across Europe14,15. These variables had to be available at least at 1 km resolution within our study area. We selected a set of the least correlated environmental variables, including climate, topography, hydrography, geology, and soil (Table 2). These data were complemented by remote sensing (RS) products (Table 3) describing vegetation structure, phenology and productivity parameters and landscape composition, capturing functional and structural properties relevant to EUNIS level 3 habitats.

Table 2 List of environmental predictors used for habitat modelling and mapping at 100 m resolution.
Table 3 Remote sensing products used for habitat modelling and mapping at 100 m resolution.

Phenology and productivity metrics from the Plant Phenology Index (PPI) summarize seasonal vegetation dynamics and photosynthetic activity. These are particularly effective in distinguishing habitats whose definitions depend on growing-season timing, length, amplitude and productivity. For instance, dry grasslands green up earlier and senesce faster due to soil moisture limitation than mesic grasslands, which have longer, more productive growing seasons48. Similarly, annual croplands display abrupt growth and senescence peaks associated with sowing and harvest49. Deciduous and evergreen forests differ in their phenological amplitude, with deciduous forests exhibiting a broad-amplitude seasonal cycle of leaf loss and regrowth, while evergreen forests maintain a narrow-amplitude cycle with year-round canopy50.

Vegetation structure was captured by several complementary metrics, including the Leaf Area Index (LAI) for foliage density and vertical layering, canopy cover for horizontal tree crown extent, and canopy height to capture forest stature/successional stage, which together allow differentiation between open shrublands, tall-herb communities, sparse versus dense forests, and young plantations versus old-growth stands51,52,53.

Hydrological regimes were represented by inundation seasonality, essential for separating wetland and riparian habitats54.

Finally, land-cover composition from ESA WorldCover55 summarized proportions of surrounding classes, critical for context-dependent habitats whose definition depends on adjacency or mosaics, such as dune slacks surrounded by coastal dunes, agroforestry systems embedded in croplands, or heathlands interspersed with grasslands. Direct use of raw multispectral imagery (e.g., Landsat, Sentinel-2) was not considered, since the higher-level biophysical and land-cover products employed here are already derived from these missions and provide ecologically validated, interpretable indicators.

All RS datasets were harmonized to a 100 m grid: continuous variables were resampled using bilinear interpolation, while categorical land-cover classes were aggregated as proportions within each 100-m cell. Remote sensing predictors spanned different periods depending on availability of EO products. To reduce interannual variability and short-term noise, we averaged them over their respective ranges to capture stable environmental regimes.
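The aggregation of categorical land cover into per-cell proportions can be sketched as follows. This is a minimal pure-Python illustration, assuming a finer categorical grid (for instance 10 m cells) that nests exactly into the coarser cells; the function name and the class codes are hypothetical.

```python
from collections import Counter

def class_proportions(grid, factor):
    """Aggregate a fine categorical grid into per-block class proportions.

    grid   : list of rows of land-cover class codes (fine resolution)
    factor : number of fine cells per coarse cell along each axis
             (10 for 10 m -> 100 m)
    Returns a list of rows of dicts {class_code: proportion}.
    """
    n_rows, n_cols = len(grid), len(grid[0])
    out = []
    for r0 in range(0, n_rows, factor):
        out_row = []
        for c0 in range(0, n_cols, factor):
            cells = [grid[r][c]
                     for r in range(r0, min(r0 + factor, n_rows))
                     for c in range(c0, min(c0 + factor, n_cols))]
            counts = Counter(cells)
            out_row.append({k: v / len(cells) for k, v in counts.items()})
        out.append(out_row)
    return out

# Toy 4 x 4 grid aggregated by a factor of 2 (class codes are illustrative).
fine = [
    [10, 10, 30, 30],
    [10, 30, 30, 30],
    [40, 40, 10, 10],
    [40, 40, 10, 30],
]
coarse = class_proportions(fine, 2)
```

Each coarse cell then carries one proportion per land-cover class, which can be used as a set of continuous predictors.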

Ensemble machine learning framework

Overview

To address uncertainties arising from model choices and data sampling, we employed an ensemble modelling approach56,57.

First, the uncertainty arising from model choices stems from different algorithms with varying functional forms58,59. For instance, decision-tree approaches create trees with different depths, capturing interactions between variables whereas neural networks represent smooth, continuous responses, with wide architectures for recurring patterns and deep architectures for hierarchical representations60,61. To encompass the diversity of models, we created an ensemble of algorithms from different families of tree-based models as well as neural networks which meet a minimum performance requirement. Since decision trees excel with structured tabular data and neural networks with intricate feature interactions, combining them creates a more generalized and robust model, leveraging the strengths of each approach for improved predictive performance.

Second, we considered the uncertainty arising from different training data. Employing spatial block cross-validation41,62, we trained the model with 20% of the observations hidden at each iteration. This process was repeated, generating an ensemble of classifiers with access to distinct samples, thus accounting for data sampling uncertainty.

Selected ML algorithms

To achieve the best modelling performance, we employed well-known machine learning techniques, each with its advantages and disadvantages, including bagging models, boosting models, and neural networks (Supplementary Table 2).

Bagging models, also known as bootstrap aggregating models, enhance predictive accuracy and alleviate overfitting by training multiple individual estimator models on various subsets of both the training data and predictor variables. This approach harnesses collective knowledge to improve results. A notable example of bagging is the Random Forest (RF) algorithm63, which employs classification or regression trees as base estimators.

Boosting models progressively train weak learners to create a robust learner by assigning greater emphasis to incorrectly classified instances. This iterative process leads to improved overall predictive performance.

  • XGBoost64: This optimized gradient boosting algorithm merges tree-based models with regularization techniques, resulting in highly accurate and efficient predictions.

  • CatBoost65: Tailored for categorical variable handling, CatBoost employs gradient-based strategies, ordered boosting, and innovative encoding methods to enhance accuracy and manage categorical features effectively.

  • LightGBM66: A specialized form of boosting, LightGBM employs a gradient-based decision tree algorithm that optimizes training speed through leaf-wise growth and histogram-based optimizations, all while maintaining strong predictive performance.

Neural networks60 are computational models that feature interconnected nodes, or “neurons”, organised in layers. These networks learn to extract meaningful features by adjusting connection weights and biases during training. In this study, we used several architectures of fully connected neural networks (Multi-Layer Perceptron, MLP): a single shallow hidden layer, a single wide hidden layer, two hidden layers and three hidden layers (Supplementary Table 2).

Dealing with habitat class imbalance

Class imbalance arises from differences in habitat prevalence, either due to uneven sampling effort or restricted extent in the case of rare habitats. To address this, we evaluated imbalance correction strategies that modify the optimization objective rather than under-sample frequent habitats or oversample rare habitats from the training data. The choice of method depends on the machine learning framework.

For tree-based algorithms (RF, XGBoost, CatBoost and LightGBM), we evaluated class weighting, a technique that assigns to each class a weight inversely proportional to its frequency to balance its relative importance in the overall optimization. In RF this is achieved via weights in the splitting criterion, while in boosting algorithms (XGBoost, CatBoost, LightGBM) it corresponds to a weighted version of the multi-class log loss.
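As an illustration of inverse-frequency class weighting, the sketch below uses the common normalisation n_samples / (n_classes × class_count), which is the same heuristic as the "balanced" class-weight mode in scikit-learn. Whether this exact normalisation was used in the study is an assumption; the labels are hypothetical.

```python
from collections import Counter

def inverse_frequency_weights(labels):
    """Weight each class by n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n, k = len(labels), len(counts)
    return {cls: n / (k * c) for cls, c in counts.items()}

# Hypothetical, strongly imbalanced grassland labels.
labels = ["R22"] * 80 + ["R1A"] * 15 + ["R35"] * 5
weights = inverse_frequency_weights(labels)
```

With this normalisation the weights of all samples still sum to the number of samples, so the overall scale of the loss is preserved while rare classes gain influence.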

For neural networks, the baseline categorical cross-entropy loss can be extended with class weights, yielding the Weighted Categorical Cross-Entropy (WCE). WCE follows the same principle as the weighted log loss in boosting algorithms (XGBoost, CatBoost, LightGBM) and class weighting in RF, namely to increase the influence of rare classes on the optimization objective. While the underlying formulations differ across model families, the methods are functionally comparable as imbalance correction strategies67.

Focal loss68 (FL) is a modification of the standard Cross Entropy loss that specifically targets imbalanced classification problems. It introduces a focusing parameter “gamma” to down-weight the contribution of well-classified examples, putting more emphasis on hard, misclassified examples. This helps to alleviate the dominating effect of the majority class and enables the model to focus more on the minority class instances during model training. Label Distribution Aware Margin69 (LDAM) loss addresses class imbalance by assigning distinct margins to classes based on their distribution characteristics. These margins represent class boundary separation and control intra-class and inter-class distinctions. LDAM loss aims to penalize misclassifications of minority class examples more, encouraging the model to better account for underrepresented classes. Both FL and LDAM can be combined with class weights to further account for imbalance.
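The effect of the focusing parameter can be seen in a minimal per-sample sketch of the focal loss, FL = -w_t (1 - p_t)^gamma log(p_t), where p_t is the predicted probability of the true class. The default gamma = 2 and the example probabilities are illustrative assumptions, not values reported in this study; setting gamma = 0 recovers the (weighted) cross-entropy.

```python
import math

def focal_loss(probs, target, gamma=2.0, class_weights=None):
    """Focal loss for one sample: -w_t * (1 - p_t)**gamma * log(p_t).

    probs         : predicted class probabilities (sum to 1)
    target        : index of the true class
    gamma         : focusing parameter; gamma = 0 gives (weighted) cross-entropy
    class_weights : optional per-class weights
    """
    p_t = max(probs[target], 1e-12)  # guard against log(0)
    w_t = 1.0 if class_weights is None else class_weights[target]
    return -w_t * (1.0 - p_t) ** gamma * math.log(p_t)

easy = focal_loss([0.9, 0.05, 0.05], target=0)             # well-classified sample
hard = focal_loss([0.2, 0.5, 0.3], target=0)               # misclassified sample
ce_easy = focal_loss([0.9, 0.05, 0.05], target=0, gamma=0.0)  # plain cross-entropy
```

The well-classified sample is down-weighted by a factor (1 - 0.9)^2 = 0.01 relative to cross-entropy, whereas the misclassified sample keeps most of its loss.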

To ensure that imbalance correction improved performance, each algorithm was also evaluated against a baseline mode without correction. The strategy retained for each model and habitat type (Supplementary Table 3) reflects the optimal choice relative to this baseline.

Ensemble model training

Overview

For each habitat type at level 1, we trained an ensemble of multi-class models to predict the most likely EUNIS level 3 class within the target formation (i.e., level 1) following the steps depicted in Fig. 3a:

  1. Create a dataset containing observations from the classes (EUNIS level 3) within each formation (EUNIS level 1).

  2. Generate a spatial block partition of the selected observations for cross-validation.

  3. Train an ensemble multi-class model for all classes within each formation; this step encompasses feature pre-processing and hyperparameter tuning.

  4. Select the best algorithm(s) in each family (bagging, boosting, neural networks) as well as the imbalance correction strategy based on overall cross-validation predictive performance.

Fig. 3 Ensemble multi-class modelling framework.

In the following, we provide more details about the input data, pre-processing (feature pre-processing, data partitioning) and hyperparameter tuning steps.

Input data

For training the models, the dataset consisted of the geolocated level 3 habitat observations from the EVA dataset and the abiotic and RS products predictors extracted at the highest spatial resolution available for the observations’ locations. Similarly, for each external evaluation dataset (NL, AT, IFN), the dataset consisted of the map of habitat observations annotated at level 3 and the abiotic and the RS predictor maps of the validation areas.

Preprocessing steps

Spatial Block-CV partitioning

We established a spatial block partition of the annotated dataset for cross-validation, as follows:

  1. Grid Division: The study’s spatial domain was divided into a grid composed of cells measuring 100 km × 100 km each. Several grid sizes (10 km, 20 km, 50 km, 100 km, 200 km, 500 km) were tested to determine the optimal granularity.

  2. Class Frequency Computation: Within each grid cell, we calculated the frequencies of different habitat classes.

  3. Cell Block Allocation: The grid cells were then partitioned into five distinct spatial blocks, ensuring that every habitat class is adequately represented within each block. This was accomplished using the IterativeStratification70 module found in the Python scikit-multilearn package70. This algorithm processes cells in order from the rarest to the most frequent labels and assigns them to the block that best preserves the global class distribution, thereby ensuring that all habitat classes are represented in each block.

  4. Observation Assignment: Every individual habitat observation was assigned to the spatial block corresponding to the cell within which it resides.

  5. Balanced Observations: We verified that the number of observations and habitat classes was reasonably balanced across the five blocks. Imbalance was defined as cases where one or more blocks contained disproportionately fewer observations. In such cases, the grid cell size was decreased (e.g., from 100 km to 50 km or 10 km) to generate more cells and redistribute observations, and the procedure was repeated from step 1. The final grid size retained for the analyses was 100 km, which provided the best compromise between spatial independence and balanced class representation.

This spatial block partitioning methodology serves two crucial purposes. First, it amalgamates observations from proximate locations into the same partitions. This prevents potential overestimation of predictive performance stemming from data leakage induced by spatial autocorrelation. Second, in instances where multiple neighbouring habitats are observed and co-occurring in the same location, this technique exposes the model to a comprehensive array of potential responses for the same predictors. As a result, the model’s probabilities reflect the uncertainty inherent in the dataset.
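The partitioning steps above can be sketched in a few lines. The allocation below is a simplified greedy stand-in for the IterativeStratification procedure, not a reimplementation of it: cells holding the rarest classes are placed first, each in the block that currently has the fewest plots of that class. Function names, coordinates (metres in a projected reference system) and class codes are hypothetical.

```python
from collections import Counter, defaultdict

def cell_id(x_m, y_m, size_m=100_000):
    """Index of the square grid cell containing a projected coordinate (metres)."""
    return (int(x_m // size_m), int(y_m // size_m))

def assign_blocks(observations, n_blocks=5, size_m=100_000):
    """Greedy allocation of grid cells to spatial blocks.

    observations : list of (x_m, y_m, habitat_class)
    Returns {cell: block index}.
    """
    per_cell = defaultdict(Counter)
    for x, y, cls in observations:
        per_cell[cell_id(x, y, size_m)][cls] += 1
    totals = Counter()
    for counts in per_cell.values():
        totals.update(counts)

    def rarest_class(cell):
        # Rarest class present in the cell, by global frequency.
        return min(per_cell[cell], key=lambda k: (totals[k], k))

    block_counts = [Counter() for _ in range(n_blocks)]
    cell_to_block = {}
    # Process cells from the rarest to the most frequent class.
    for cell in sorted(per_cell, key=lambda c: (totals[rarest_class(c)], c)):
        cls = rarest_class(cell)
        best = min(range(n_blocks),
                   key=lambda b: (block_counts[b][cls], sum(block_counts[b].values())))
        cell_to_block[cell] = best
        block_counts[best].update(per_cell[cell])
    return cell_to_block

# Hypothetical plots; the first two fall in the same 100 km cell.
obs = [(50_000, 50_000, "T17"), (60_000, 40_000, "T17"),
       (150_000, 50_000, "T19"), (250_000, 50_000, "T17"),
       (350_000, 50_000, "T19"), (450_000, 50_000, "T17")]
blocks = assign_blocks(obs)
fold_of = [blocks[cell_id(x, y)] for x, y, _ in obs]
```

Because observations inherit the block of their cell, plots that are close to each other always end up in the same fold, which is what limits leakage through spatial autocorrelation.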

Feature pre-processing

We employed tailored feature pre-processing procedures customised to the nature of the features and the specific demands of the machine learning algorithms (Supplementary Table 1). These pre-processing steps were seamlessly integrated within a unified pipeline that encompasses both algorithm training and prediction tasks.

Hyperparameter tuning

For each individual machine learning algorithm, specific hyperparameters can be configured to control model complexity and optimization settings. To select the optimal hyperparameters for each modelling task (formation), a distinct 10% subset of the training data was set aside for hyperparameter tuning, with a focus on achieving high adjusted balanced accuracy as the objective. Our hyperparameter tuning process unfolds as follows:

  1. We kept problem-specific parameters (such as objective function and metrics) at their default values, which are tailored for multi-class classification. These parameters are solely defined by the output variable type (here discrete classes).

  2. Initially, a fixed set of architectures was generated, and we explored the best optimizer configurations without incorporating regularisation. For algorithms that are iteratively optimised, like boosting models and neural networks, we implemented early stopping callbacks to cease training when validation performance begins to decline, thereby avoiding overfitting.

  3. Subsequently, we fine-tuned regularisation parameters to manage model complexity, leveraging a hold-out validation dataset. In the case of Multi-Layer Perceptrons (MLPs), given their computational complexity, we resorted to a grid search with a constrained array of configurations. We applied temperature scaling to MLPs to enhance the calibration of probabilities, especially when imbalance correction techniques are applied. This step was not required for bagging and boosting algorithms.

  4. Bayesian optimization techniques were employed for bagging and boosting algorithms to navigate the hyperparameter space and identify optimal hyperparameter settings. This Bayesian hyperparameter tuning, facilitated by the optuna package71, relies on a probabilistic surrogate model of the objective function together with an acquisition function, which estimates the potential improvement in the objective for different hyperparameter combinations. The algorithm iteratively assesses various hyperparameter sets, updating the surrogate model and the explored hyperparameter space accordingly.

  5. Finally, we leveraged the performance outcomes on the test set to make informed selections of the best-performing algorithm(s) within each algorithm family.
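Temperature scaling, mentioned in step 3, divides the network logits by a single scalar T before the softmax, so the predicted class is unchanged while over-confident probabilities are softened. The sketch below fits T by a simple grid search on validation negative log-likelihood; how T was actually fitted in this study is not stated, so the search strategy, function names and logits are assumptions made for illustration.

```python
import math

def softmax(logits, temperature=1.0):
    """Softmax of logits divided by a temperature T > 0."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def nll(logit_rows, targets, temperature):
    """Mean negative log-likelihood of the true classes at a given temperature."""
    losses = [-math.log(softmax(z, temperature)[t])
              for z, t in zip(logit_rows, targets)]
    return sum(losses) / len(losses)

def fit_temperature(logit_rows, targets, candidates):
    """Pick the candidate temperature with the lowest validation NLL."""
    return min(candidates, key=lambda t: nll(logit_rows, targets, t))

# Hypothetical validation logits from an over-confident network.
val_logits = [[8.0, 0.0, 0.0], [7.0, 1.0, 0.0], [0.0, 9.0, 0.0], [6.0, 0.0, 1.0]]
val_targets = [0, 1, 1, 0]                   # the second sample is confidently wrong
grid = [0.5 + 0.25 * i for i in range(39)]   # 0.5, 0.75, ..., 10.0
T = fit_temperature(val_logits, val_targets, grid)
calibrated = softmax(val_logits[1], T)
```

A fitted temperature above 1 indicates that the raw probabilities were over-confident on the validation data.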

This tuning procedure leads to the selection of the most suitable models based on their predictive performance (Supplementary Table 3).
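The tuning objective, adjusted balanced accuracy, can be sketched as follows. The definition used here (mean per-class recall, rescaled so that chance level maps to 0 and a perfect score to 1) is the one implemented by scikit-learn's `balanced_accuracy_score` with `adjusted=True`; that the study used exactly this definition is an assumption. The sketch assumes at least two classes are present.

```python
from collections import defaultdict

def adjusted_balanced_accuracy(y_true, y_pred):
    """Mean per-class recall, rescaled so that chance maps to 0 and perfect to 1."""
    hits, totals = defaultdict(int), defaultdict(int)
    for t, p in zip(y_true, y_pred):
        totals[t] += 1
        hits[t] += int(t == p)
    recalls = [hits[c] / totals[c] for c in totals]
    balanced = sum(recalls) / len(recalls)
    chance = 1.0 / len(recalls)
    return (balanced - chance) / (1.0 - chance)

# Toy imbalanced problem: always predicting the majority class scores 0.
y_true = ["a"] * 8 + ["b"] * 2
always_a = ["a"] * 10
score_majority = adjusted_balanced_accuracy(y_true, always_a)
score_perfect = adjusted_balanced_accuracy(y_true, y_true)
```

Unlike plain accuracy (0.8 for the majority-class predictor in this toy case), the adjusted score does not reward ignoring rare classes.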

Ensemble forecasting and uncertainty

Upon completion of the training process, we obtained a collection of trained machine learning algorithms alongside their associated feature pre-processing pipelines for every cross-validation fold. The ensemble model is a simple voting classifier, combining predictions from all individual models and assigning weights based on their respective rankings in terms of predictive performance (Fig. 3b).
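A rank-weighted voting scheme of this kind can be sketched as below. It assumes soft voting (a weighted average of class probabilities) and linear rank weights, where the best of n members receives weight n and the worst weight 1 before normalisation; both choices, as well as the scores and probabilities, are illustrative assumptions.

```python
def rank_weights(scores):
    """Turn performance scores into normalised rank weights (best model weighs most)."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])  # worst -> best
    raw = [0.0] * len(scores)
    for rank, i in enumerate(order, start=1):
        raw[i] = float(rank)
    total = sum(raw)
    return [r / total for r in raw]

def weighted_vote(prob_rows, weights):
    """Weighted average of the class-probability vectors of all ensemble members."""
    n_classes = len(prob_rows[0])
    return [sum(w * p[k] for w, p in zip(weights, prob_rows))
            for k in range(n_classes)]

# Hypothetical cross-validation scores and per-pixel predictions of three members.
cv_scores = [0.62, 0.70, 0.55]
member_probs = [[0.6, 0.3, 0.1], [0.2, 0.7, 0.1], [0.5, 0.4, 0.1]]
w = rank_weights(cv_scores)
ensemble_probs = weighted_vote(member_probs, w)
```

Because the weights sum to 1 and each member outputs a probability vector, the ensemble output is itself a valid probability vector.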

From the collective predictions of the ensemble, we extracted various metrics of uncertainty:

  • Model Uncertainty: The variability in predictions among the constituent models within the ensemble indicates sensitivity to the choice of model.

  • Data Sampling Uncertainty: The diversity in predictions across different folds within the ensemble underscores the sensitivity to variations in data sampling.

This uncertainty can be quantified at the level of individual pixels through different metrics, including:

  • Confidence Scores: These scores represent the probability associated with the most probable class. Higher scores indicate more confidence in the chosen class, but the number of classes affects this measure. For example, a 30% confidence probability in a problem with 200 classes carries different implications than the same probability in a problem with just 3 classes.

  • Committee Averaging Scores: These scores were calculated based on the proportion of voters (model × fold) that have predicted each class as the most likely. This measurement offers insights into the level of consensus or disagreement among models:

    • CA = 1: unanimous agreement among all models on prediction of the most likely habitat.

    • 0 < CA < 1: varying degrees of disagreement among models.

    • CA ~ 0: Pronounced disagreement among models.
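Both metrics are straightforward to compute per pixel, as in the minimal sketch below. The votes and class codes are hypothetical, and the ratio to a uniform guess is only one possible way (an assumption, not a product of this study) to make confidence scores comparable across problems with different numbers of classes.

```python
from collections import Counter

def committee_averaging(top_classes):
    """Share of voters (model x fold) that predict each class as most likely."""
    counts = Counter(top_classes)
    return {cls: n / len(top_classes) for cls, n in counts.items()}

def confidence_vs_chance(confidence, n_classes):
    """Ratio between the top-class probability and a uniform guess (1 / n_classes)."""
    return confidence * n_classes

# Hypothetical votes of 2 models x 5 folds for one pixel.
votes = ["T17"] * 7 + ["T19"] * 2 + ["T1B"]
ca = committee_averaging(votes)

# The same confidence is far above chance with 200 classes, barely with 3.
ratio_200 = confidence_vs_chance(0.4, 200)
ratio_3 = confidence_vs_chance(0.4, 3)
```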

Habitat mapping workflow

The ensemble multi-class habitat models combined with a set of decision rules were used to generate wall-to-wall habitat maps for Europe. The workflow proceeds in four steps depicted in Fig. 4. First, ensemble models predict habitat class probabilities (Step 1). Second, regional filtering rules constrain predictions to habitats that occur in their biogeographic regions (Step 2). Third, land-cover filtering rules refine probabilities by enforcing compatibility between habitats and their associated land-cover classes (Step 3). Finally, land cover-based priority rules are applied to generate the final categorical habitat map (Step 4). These steps produce three complementary habitat mapping products (continuous probabilities for each habitat class, top 3 categorical maps with confidence and the final wall-to-wall map).

Fig. 4 Workflow to produce European habitat maps.

Step 1: Ensemble models

We used the ensemble model to predict the probabilities of each EUNIS level 3 class for each EUNIS 1 formation. Figure 5 shows an example map of habitat probability for the class of Fagus forests on non-acid soils (T17).

Fig. 5 Habitat probability map for the habitat class T17 Fagus Forest on non-acid soils.

Step 2: Regional filtering rules

Location is an intrinsic part of the EUNIS habitat classification. Some habitats are, by definition, associated with specific biogeographic regions (e.g., Macaronesian heathy forest). Although the climate can be a good approximator of biogeographic regions, we applied a post-hoc filtering to select the most likely habitat only amongst those which occur in the biogeographic region of the prediction pixel. This step also allowed us to account for the habitats’ range of occurrence (e.g., Carpathian travertine fens).

For inland habitats, we used the ecoregions of the world72. For coastal habitats, we used the official EEA coastline delineation with an inland depth of 5 km away from the coastline. Using these layers and vegetation plot data from the EVA, we computed a matrix of association between each EUNIS level 3 class and the ecoregion/coastline, which was then used to generate regional masks for each modelled class.

EUNIS class probabilities and ranking (output product n°1) were generated by multiplying together the data cubes of the class-wise regional masks (step 2) and their model predicted probabilities (step 1). This output can be used to generate maps of the most likely habitat class at level 3 for each formation or aggregated at level 2. Figure 6 illustrates that for Heathlands, Scrub and Tundra (S) habitat classes, annotated at level 2, whereas Figs. 7 and 8 illustrate that for broadleaved and coniferous forests, respectively, at level 3. These outputs are not filtered by land use and are therefore less sensitive to the minimum mapping unit of the land-cover/land-use maps.
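The regional filtering can be sketched for a single pixel as follows. The sketch assumes a binary (presence/absence) association between classes and regions derived from plot records; whether the study applied a frequency threshold is not stated. Function names, region names and the pairing of class codes with regions are hypothetical.

```python
def regional_masks(plot_records):
    """Binary association between habitat classes and regions, from plot co-occurrence.

    plot_records : list of (habitat_class, region) pairs
    Returns {habitat_class: set of regions where the class was recorded}.
    """
    assoc = {}
    for cls, region in plot_records:
        assoc.setdefault(cls, set()).add(region)
    return assoc

def apply_regional_mask(probs, region, assoc):
    """Multiply class probabilities by a 0/1 mask for the pixel's region."""
    return {cls: p * (1.0 if region in assoc.get(cls, set()) else 0.0)
            for cls, p in probs.items()}

# Hypothetical plots and one pixel prediction.
records = [("T17", "Alps"), ("T17", "Carpathians"), ("T27", "Macaronesia")]
assoc = regional_masks(records)
pixel = apply_regional_mask({"T17": 0.55, "T27": 0.45}, "Alps", assoc)
```

Applied to full rasters, the same multiplication is carried out between the stack of class-wise masks and the stack of predicted probabilities.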

Fig. 6

Dominant heathlands, scrub and tundra habitat map with colour legend at EUNIS level 2.

Fig. 7

Dominant broadleaved forest habitat map at EUNIS level 3.

Fig. 8

Dominant coniferous forest habitat map at EUNIS level 3.

Step 3: Land cover filtering rules

Here, we used crosswalks to select the land cover classes associated with each EUNIS habitat class, and from these we generated land cover masks. The three most likely (top 3) EUNIS classes and their confidence scores (output product n°2) within each EUNIS level 1 formation were obtained by multiplying the data cubes of the class-wise land cover masks (step 3) by the EUNIS class probabilities (output product n°1). After rescaling, we selected the top 3 classes and their corresponding probabilities, which serve as confidence scores. Only classes with non-zero probabilities were kept; in some cases the top 2 and top 3 classes were therefore undefined. Figure 9 shows an example map of the most likely habitat (top 1) for vegetated man-made habitats, accompanied by its confidence map. This step refines the probabilities before the final wall-to-wall mapping (Step 4).
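The masking, rescaling and top-3 selection can be sketched as below. Several details are assumptions made for illustration: the rescaling is taken to be a renormalisation so that the surviving classes sum to one, confidence is stored as a percentage, and undefined ranks are written with the NODATA values listed in the technical specifications. The function name `top_k` is hypothetical.

```python
import numpy as np

NODATA_CLASS = 65535  # NODATA value of the habitat class maps
NODATA_CONF = 255     # NODATA value of the confidence maps

def top_k(probs, lc_masks, k=3):
    """Return the k most likely classes and confidences per pixel.

    probs, lc_masks: arrays of shape (n_classes, rows, cols), n_classes >= k.
    """
    filtered = probs * lc_masks
    total = filtered.sum(axis=0, keepdims=True)
    # Assumed rescaling: renormalise the surviving probabilities to sum to 1.
    rescaled = np.divide(filtered, total,
                         out=np.zeros_like(filtered), where=total > 0)
    # Rank classes by decreasing probability and keep the first k.
    order = np.argsort(-rescaled, axis=0, kind="stable")[:k]
    conf = np.take_along_axis(rescaled, order, axis=0)
    # Ranks with zero probability are left undefined.
    classes = np.where(conf > 0, order, NODATA_CLASS).astype(np.uint16)
    conf_pct = np.where(conf > 0, np.rint(conf * 100),
                        NODATA_CONF).astype(np.uint8)
    return classes, conf_pct

# Toy example: 4 classes on a 1 x 2 pixel grid.
probs = np.array([[[0.5, 0.5]], [[0.25, 0.2]], [[0.2, 0.2]], [[0.25, 0.1]]])
lc_masks = np.array([[[1, 0]], [[1, 0]], [[0, 1]], [[1, 0]]])
classes, conf_pct = top_k(probs, lc_masks)
```

In the second pixel only one class is compatible with the land cover, so the top 2 and top 3 entries are undefined, as described above.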

Fig. 9

Dominant vegetated man-made habitat class (EUNIS level 3) and confidence map.

Step 4: Wall-to-wall mapping

Finally, we applied land cover-based priority rules (output product n°4) to determine the prevailing EUNIS level 1 formation at each pixel and thereby map the final habitat class at EUNIS level 3 (output product n°3). Non-vegetated land cover classes were assigned broad habitat categories: Urban for artificial areas, and Inland water, further split into water courses, lakes and reservoirs, transitional waters and sea/ocean.
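The logic of a priority rule can be sketched as a simple lookup. The rules, land cover labels and the fallback to the next formation when a class is undefined are hypothetical and chosen only for illustration; the actual rules are those documented in the crosswalk worksheet.

```python
# Hypothetical priority rules: land cover class -> EUNIS level 1 formations,
# ordered from highest to lowest priority.
PRIORITY = {
    "broad-leaved forest": ["T", "S"],
    "natural grasslands": ["R", "S"],
}

# Hypothetical broad categories for non-vegetated land cover classes.
BROAD = {
    "continuous urban fabric": "Urban",
    "water courses": "Water course",
}

def final_class(land_cover, top1_by_formation):
    """Pick the final habitat label for one pixel.

    top1_by_formation maps a formation code to its top-1 level 3 class,
    or to None when no class is defined for that formation at the pixel.
    """
    if land_cover in BROAD:
        return BROAD[land_cover]
    # Assumed behaviour: fall back to the next formation if undefined.
    for formation in PRIORITY.get(land_cover, []):
        code = top1_by_formation.get(formation)
        if code is not None:
            return code
    return None

forest_pixel = final_class("broad-leaved forest", {"T": "T17", "S": "S82"})
grass_pixel = final_class("natural grasslands", {"R": None, "S": "S82"})
river_pixel = final_class("water courses", {})
```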

Steps 1 and 2 were run only once at the European scale for each EUNIS level 1 formation.

Steps 3 and 4 required the definition of crosswalk rules tailored to the underlying land cover product. We provide a worksheet summarising the crosswalk rules for Corine Land Cover, which was used as a mask for producing the final habitat maps. The spatial resolution of the final layer and the habitat extents are thus controlled by the chosen land cover product. Users can substitute their preferred land cover layer to refine the spatial resolution of the final product.

Output products

All the following maps have been produced at 100 m resolution across Europe:

  1. Continuous map of all EUNIS level 3 class probabilities.

  2. Categorical map of the top 3 most likely habitats and continuous map of their confidence scores at level 3 within each formation.

  3. Categorical wall-to-wall map of the top 3 EUNIS habitats at level 3 across all formations (previewed in Fig. 10), with a legend and a QGIS style file.

    Fig. 10

    Wall-to-wall habitat map - colour coded at level 2 for visibility.

  4. An Excel sheet summarising the crosswalk and priority rules from Corine to EUNIS habitats.

Data Records

The dataset is available at Zenodo73 and is released under a Creative Commons Attribution 4.0 International (CC-BY 4.0) license, allowing reuse with attribution. It provides wall-to-wall habitat maps for Europe at 100 m resolution, classified to EUNIS level 3 habitats and covering terrestrial, freshwater, and coastal realms. All data are distributed in GeoTIFF format, complemented by CSV legends and style files for visualization.

Technical specifications

  • Spatial resolution: 100 m

  • Spatial extent (xmin, xmax, ymin, ymax; in metres): 900000, 7400000, 900000, 5500000

  • Coordinate Reference System: ETRS89-LAEA Europe (EPSG:3035)

  • Data type: unsigned integer (16-bit for habitat class maps, 8-bit for confidence maps)

  • NODATA: 65535 for habitat class maps, 255 for confidence maps
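For orientation, the grid dimensions implied by these specifications can be worked out as follows. The sketch assumes that the extent is given in the order xmin, xmax, ymin, ymax and that rows are counted from the northern edge; both are assumptions, and the helper `pixel_centre` is hypothetical.

```python
# Assumed reading of the extent: xmin, xmax, ymin, ymax in metres (EPSG:3035).
xmin, xmax, ymin, ymax = 900000.0, 7400000.0, 900000.0, 5500000.0
res = 100.0  # pixel size in metres

# Number of columns and rows of the raster grid.
n_cols = int((xmax - xmin) / res)
n_rows = int((ymax - ymin) / res)

def pixel_centre(row, col):
    """Map coordinates of a pixel centre, with row 0 at the northern edge."""
    x = xmin + (col + 0.5) * res
    y = ymax - (row + 0.5) * res
    return x, y

print(n_cols, n_rows)      # 65000 46000
print(pixel_centre(0, 0))  # (900050.0, 5499950.0)
```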

Folder structure

The dataset corresponds directly to the four output products described above:

  1. Continuous map of all EUNIS Level 3 class probabilities

    • Provided in the folder habitat_probability/

    • One GeoTIFF per EUNIS Level 1 formation (e.g., MA2.tif, N.tif, R.tif, T.tif), each containing stacked probability bands for all Level 3 habitats within the formation.

  2. Categorical maps of the top-3 most likely habitats and their confidence scores (per formation)

    • Provided in the folder topk_habitats/

    • [code_level_1]_topk.tif: top-1, top-2, and top-3 predicted classes (Bands 1–3)

    • [code_level_1]_topk_confidence.tif: corresponding confidence scores (Bands 1–3)

    • [code_level_1]_legend.csv: legend mapping integer codes to habitat classes.

  3. Categorical map of the top-3 EUNIS habitats across all formations

    • Provided in the folder wall_to_wall/

    • eunis_dominant.tif: top-1 most likely habitat class

    • eunis_top2.tif: top-2 most likely habitat class

    • eunis_top3.tif: top-3 most likely habitat class

    • Additional files: eunis_legend_detailed.csv (full class legend), eunis_legend_style.qml (QGIS style file).

  4. Crosswalk and priority rules from Corine Land Cover to EUNIS habitats

    • Provided as documented_cross_walks.csv (spreadsheet with crosswalk and priority rules).
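Once a class raster has been read into an array, the integer codes can be decoded with the legend and the NODATA value excluded, for example to tabulate habitat area. The sketch below uses an in-memory tile and legend; the legend column names (`value`, `code`) and the integer-to-class mapping are hypothetical and should be checked against the distributed CSV files.

```python
import csv
import io

import numpy as np

NODATA = 65535  # NODATA value of the habitat class maps

# Hypothetical legend content; the real column names may differ.
legend_csv = "value,code\n1,T17\n2,R1N\n3,S82\n"
legend = {int(row["value"]): row["code"]
          for row in csv.DictReader(io.StringIO(legend_csv))}

# Toy 2 x 2 tile standing in for a window of a class raster.
tile = np.array([[1, 2],
                 [65535, 1]], dtype=np.uint16)

# Ignore NODATA pixels, then count pixels per class.
valid = tile != NODATA
values, counts = np.unique(tile[valid], return_counts=True)

# At 100 m resolution each pixel covers 1 ha.
area_ha = {legend[int(v)]: int(c) for v, c in zip(values, counts)}
```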

Technical Validation

Quantitative evaluation

Habitat maps were cross-validated at the European scale on the EVA dataset and evaluated by comparison with independent datasets from the Netherlands (NL), Austria (AT) and France (IFN). We evaluated the predictive quality in terms of recall (proportion of instances of a given habitat class correctly classified), precision (proportion of instances predicted as a given habitat class that truly belong to that class), and F1-score (harmonic mean of recall and precision) for each EUNIS level 3 class. In multi-class settings, recall and precision often exhibit a trade-off because improving recall (capturing more true positives) can lead to an increase in false positives, lowering precision, while tightening classification to improve precision may exclude some true positives, reducing recall.
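The three class-wise metrics defined above can be computed from paired reference and predicted labels as in the following minimal sketch. The function name and the toy labels `A` and `B` are illustrative and do not correspond to the evaluation code or data used in this study.

```python
from collections import Counter

def class_metrics(y_true, y_pred):
    """Return {class: (precision, recall, f1)} for paired label lists."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # predicted class gains a false positive
            fn[truth] += 1  # reference class gains a false negative
    metrics = {}
    for c in sorted(set(y_true) | set(y_pred)):
        precision = tp[c] / (tp[c] + fp[c]) if tp[c] + fp[c] else 0.0
        recall = tp[c] / (tp[c] + fn[c]) if tp[c] + fn[c] else 0.0
        total = precision + recall
        f1 = 2 * precision * recall / total if total else 0.0
        metrics[c] = (precision, recall, f1)
    return metrics

# Toy example with two hypothetical classes.
y_true = ["A", "A", "A", "B"]
y_pred = ["A", "A", "B", "B"]
scores = class_metrics(y_true, y_pred)
```

In this toy case class A has perfect precision but misses one instance, while class B captures its only instance at the cost of one false positive, which illustrates the trade-off described above.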

Table 4 summarizes the distribution of these class-wise metrics (mean and standard deviation) within each EUNIS level 1 formation. Detailed performances for EUNIS level 3 classes across the validation datasets, as well as cross-validation performances, are also provided in the Supplementary Materials for each formation including: Saltmarshes (Table S4), Coastal habitats (Table S5), Wetlands (Table S6), Grasslands (Table S7), Scrub and Tundra (Table S8), Forests (Table S9) and Sparsely vegetated habitats (Table S10).

Table 4 F1-score, precision and recall from the ensemble forecasting models for the EVA and the three validation datasets: Netherlands (NL), Austria (AT) and the French Forest Inventory (IFN).

The spatial-block cross-validation (EVA) results show strong predictive performance, with most classes achieving very good F1-scores. Prediction quality varies with the number of habitat classes within the formation. Formations with fewer classes, such as saltmarshes, or those strongly shaped by abiotic elements (e.g., soil type, landscape structure), such as coastal, wetland, and sparsely vegetated habitats, exhibit consistently high recall and precision. In contrast, grasslands, shrublands, and forests obtained variable predictive scores across classes, reflecting their structural complexity and compositional diversity.

In sparsely vegetated habitats, precision and recall were well balanced. However, across the other formations, distinct trade-offs emerged. Grasslands, shrublands, and forests exhibited higher precision, reflecting more conservative models, whereas coastal and wetland habitats showed higher recall, indicating a tendency toward over-prediction.

When evaluated on external datasets, the habitat maps performed well, albeit with slightly lower scores than in cross-validation, except for coastal and salt marsh habitats. Similar trade-offs across formations persisted, with the largest performance drop observed for the Austrian dataset, likely due to its higher resolution (10 m) compared to the model’s 100 m resolution, which makes fine-scale matching more challenging.

In summary, the maps achieved strong predictive performance, with F1-scores ranging from 0.61 to 0.94 in spatial cross-validation and from 0.33 to 0.95 in external validation datasets. Low performance was often associated with habitats of limited spatial extent, which motivated the ecoregion filtering step. In such cases, accuracy improved when considering the top three predictions rather than only the most likely class.
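The effect of considering the top three predictions can be quantified with a top-k accuracy, sketched below. The function name and the toy labels are assumptions for illustration; a reference observation counts as correct if its class appears among the first k ranked predictions.

```python
def top_k_accuracy(y_true, ranked_preds, k):
    """Share of observations whose true class is among the first k predictions."""
    hits = sum(truth in ranks[:k] for truth, ranks in zip(y_true, ranked_preds))
    return hits / len(y_true)

# Toy example with hypothetical classes and per-observation rankings.
y_true = ["A", "B", "C", "A"]
ranked_preds = [
    ["A", "B", "C"],
    ["A", "B", "C"],
    ["A", "B", "C"],
    ["B", "C", "A"],
]

top1 = top_k_accuracy(y_true, ranked_preds, 1)
top3 = top_k_accuracy(y_true, ranked_preds, 3)
```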

Known limitations

Validation scope

We selected the validation areas based on the availability of independent datasets covering different biogeographic regions: Alpine and Continental (Austria), Atlantic (Netherlands), and Mediterranean, Alpine, and Continental (IFN). However, several regions could not be assessed due to a lack of independent data.

European habitat coverage

While we tried to be as exhaustive as possible in covering all habitat types in Europe, our map covers only the subset of EUNIS habitats that were classified by an expert system from plant community composition. Some of the missing habitats are difficult to define from vegetation plot data (e.g., caves, glaciers) or, in the case of anthropogenic habitats (e.g., tree plantations), hard to distinguish from semi-natural habitats. Habitats with a one-to-one association to a particular land cover class (e.g., glaciers, plantations) were incorporated in Step 4. Additionally, for statistical reasons, habitats with fewer than 5 occurrences in the curated EVA database were discarded. Moreover, freshwater habitats were not mapped due to ongoing revisions of the classification system and a lack of water quality predictors. Although some existing remote sensing products describe water body turbidity and trophic state, they are only available for a few large water bodies, and we found little overlap with freshwater vegetation plots. Finally, because the chosen land use map (Corine Land Cover) has a minimum mapping unit of 1 ha, linear habitats (e.g., hedgerows, small streams) and other habitats of small extent could not be mapped.

Missing predictors

Some habitat classes with low validation scores were humid soil habitats (R53, R65) or habitats affected by land use pressures, such as grazing (R1N) and agricultural abandonment (S82). Future improvements should integrate predictors for soil moisture (beyond the included inundation occurrence), land use history, and human footprint. Additionally, certain low-performing classes, particularly those found at habitat edges such as woodland fringes (R53), could benefit from incorporating spatial landscape structure using deep convolutional neural networks with satellite imagery. LiDAR data could also help to distinguish classes with high structural complexity, whereas seasonal remote sensing indicators could better capture the phenology of habitats.