Introduction

Soils are the foundation of terrestrial ecosystems and represent an important habitat for organisms across the tree of life1. Soils also contribute to a wide range of ecosystem services including food production, climate regulation, as well as cultural and educational services2. These ecosystem services are directly supported by ecosystem functions, such as nutrient cycling, water holding capacity, and primary production3,4,5. In recent years, research has increasingly focused on ecosystem multifunctionality, recognizing that ecosystem services depend on multiple functions simultaneously, often assessed through measurable functional proxies (e.g., enzymatic activities, soil respiration)6,7,8. Previous research has shown that greater soil biodiversity is associated with higher levels of multiple functions, such as nutrient cycling and primary production9,10,11.

Microorganisms are ubiquitous, sensitive, and quick to react to environmental stress, reflecting cumulative impacts12, and potentially constituting an important indicator of changes in ecosystem multifunctionality13. This is especially important as sequencing is becoming a cheaper and highly standardized technology, allowing direct comparison of data across different studies and national surveys14. However, assessments of the capacity of the soil microbiome to explain ecosystem multifunctionality at larger scales are urgently needed to support the monitoring and conservation of soils.

Unfortunately, there are still important uncertainties about the capacity of the soil microbiome to predict multifunctionality across large spatial scales. First, despite the enormous diversity of soil biota (particularly microbial diversity), the focus is still often placed on a narrow subset of organisms. For instance, mycorrhizal fungi and nitrogen-fixing bacteria are among the few microbial functional groups currently assessed in Europe as potential indicators of ecosystem multifunctionality15,16. Second, despite growing interest in microbial indicators of ecosystem multifunctionality, current studies do not specifically address how multiple environmental contexts, such as land use, soil texture, and climate, influence the relationship between the taxonomic identity of soil organisms and their capacity to predict multifunctionality17. Finally, although overall microbial diversity (i.e., taxa richness) has been shown to positively influence multifunctionality in global and national contexts, similar large-scale analyses linking microbial community structure to multifunctionality are lacking for Europe11,18,19. This gap is especially relevant considering that less than 40% of European soils are currently categorized as healthy, despite the continent’s long history of land management and its significant role in global food production and biodiversity20. These knowledge gaps hinder our ability to evaluate how the soil microbiome relates to specific ecosystem functions across diverse contexts (i.e., soil types, climatic regions, or land uses).

Here, we use a harmonized pan-European observational field survey (named LUCAS Soil) to quantify the contribution of the soil microbiome in explaining ecosystem multifunctionality. We analyze 484 fresh soil samples collected across diverse environmental conditions, using categorical groupings to test the effects of land use, climatic region, soil texture, and soil pH, as these factors have previously been shown to structure soil microbial communities and ecosystem multifunctionality21. Using metabarcoding, we obtain site-level information on the relative abundance of bacteria and fungi. We assess ecosystem multifunctionality by integrating several functional proxies that capture key functions linked to broader ecosystem services. Specifically, we measure soil aggregate stability (flood regulation, habitat for organisms), net primary productivity (food and fiber/timber provision), and basal respiration (climate regulation). In addition, we quantify enzymatic activities including N-acetyl-glucosaminidase (nitrogen cycling), phosphatase (phosphorus cycling), and xylosidase (carbon cycling), all of which are involved in organic matter decomposition. Together, these indicators provide a multidimensional picture of ecosystem functioning, which we summarize as ecosystem multifunctionality.

To assess how well the soil microbiome explains multifunctionality, we first apply Random Forest models to identify the most important predictors of multifunctionality, considering a wide set of abiotic (climate and soil properties) and biotic (microbial community composition) variables. To quantify the relative contributions of these predictor groups, we perform variance partitioning analysis, allowing us to attribute the unique and shared variance in multifunctionality to climate, soil properties, and the soil microbiome. Finally, we employ structural equation models (SEMs) to test for the direct and indirect effects of climate, soil properties, and the soil microbiome on ecosystem multifunctionality. We hypothesize that soils from less-disturbed environments (e.g., woodlands) support higher multifunctionality due to more favorable conditions for microbial activity. We also hypothesize that the contribution of microbial taxa to multifunctionality is context-dependent, varying across land use, climatic region, pH, and soil texture.

Results and discussion

We analyzed 484 soils across Europe to investigate how ecosystem multifunctionality varies with land use, climatic region, soil texture, and pH, and to assess the relative contribution of the soil microbiome, compared with soil properties and climate, as an indicator of multifunctionality. For each soil, we measured six functional proxies: soil basal respiration, xylosidase, N-acetylglucosaminidase, and phosphatase activities, soil aggregation, and primary productivity (Table S1). Consistent with our first hypothesis, we found the highest multifunctionality values in less-disturbed grassland and woodland soils (average multifunctionality value = 0.33 ± 0.09, n = 92, and 0.31 ± 0.10, n = 165, respectively), loam soils (0.31 ± 0.09, n = 179), acidic soils (0.30 ± 0.09, n = 290), and soils originating from temperate humid locations (0.31 ± 0.08, n = 249). In contrast, cropland soils (0.25 ± 0.07, n = 227), alkaline soils (0.25 ± 0.09, n = 131), and soils originating from drier regions (0.25 ± 0.10, n = 122) showed the lowest multifunctionality values (Fig. 1, Table S3). Multifunctionality in cropland soils was 24.2% lower than in grasslands and 19.4% lower than in woodlands. We also found that a combination of soil properties (i.e., total nitrogen, organic carbon, and microbial biomass), together with the relative abundance of a few taxa, was significantly associated with multifunctionality (Fig. 2). More importantly, the soil microbiome explained part of the variation in multifunctionality across soils. Specifically, the community composition of bacterial and fungal modules (i.e., co-occurring OTUs) explained between 2.27% and 14.08% of unique variance (Fig. 3, Table S4). Soil properties accounted for 12.2–31.4% of the unique variance in multifunctionality, whereas climate explained very little (0–1%), suggesting that climate influences multifunctionality mainly through indirect effects. At the level of individual functional proxies, the soil microbiome explained more unique variance than soil properties or climate for primary productivity and enzymatic activities (Figures S4–S7), while soil properties explained most of the variation in soil aggregation and basal respiration (Figures S8–S9).
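
The relative differences between land uses quoted above can be reproduced from the reported group means. The following is a minimal sketch of that arithmetic; the function name `percent_lower` is illustrative and not taken from the study.

```python
def percent_lower(value: float, reference: float) -> float:
    """Return how much lower `value` is than `reference`, in percent."""
    return (reference - value) / reference * 100.0


# Mean multifunctionality per land use, as reported in the text.
cropland, grassland, woodland = 0.25, 0.33, 0.31

cropland_vs_grassland = round(percent_lower(cropland, grassland), 1)
cropland_vs_woodland = round(percent_lower(cropland, woodland), 1)

print(cropland_vs_grassland)  # 24.2
print(cropland_vs_woodland)   # 19.4
```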

Fig. 1: Ecosystem multifunctionality across soil grouping categories.

Results are shown for land-use types (A), climatic regions (B), soil texture types (C), and soil pH (D). Chi-squared (χ²) statistic, eta-squared (η²), and significance (p value) are indicated following a Kruskal-Wallis non-parametric test. Different letters represent significant differences among groups (p value < 0.05). In the boxplots, the central line indicates the median, the box limits correspond to the 25th (Q1) and 75th (Q3) percentiles (interquartile range), and the whiskers extend to 1.5 × IQR. Points beyond the whiskers represent outliers.
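
To make the statistics in this legend concrete, the sketch below computes a Kruskal-Wallis H statistic and a rank-based eta-squared in plain Python. It rests on assumptions: ties are averaged but no tie correction is applied to H, and the effect size uses the common formula η² = (H − k + 1) / (n − k), which may differ from the exact variant used for Fig. 1. The toy data are invented for illustration.

```python
from itertools import chain


def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def kruskal_h(groups):
    """Kruskal-Wallis H statistic (no tie correction; an assumption)."""
    pooled = list(chain.from_iterable(groups))
    n = len(pooled)
    ranks = average_ranks(pooled)
    total, start = 0.0, 0
    for g in groups:
        rank_sum = sum(ranks[start:start + len(g)])
        total += rank_sum ** 2 / len(g)
        start += len(g)
    return 12.0 / (n * (n + 1)) * total - 3 * (n + 1)


def eta_squared(h, k, n):
    """Rank-based effect size for k groups and n observations."""
    return (h - k + 1) / (n - k)


# Invented toy values standing in for three land-use groups.
toy_groups = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
h = kruskal_h(toy_groups)
eta2 = eta_squared(h, k=3, n=9)
```

For real analyses, `scipy.stats.kruskal` returns the tie-corrected statistic together with a p value.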

Fig. 2: Random Forest results indicating predictor importance of ecosystem multifunctionality.

Results are shown for different land uses (A1–A3), soil texture classes (B1–B3), pH classes (C1–C3), and climatic regions (D1–D3). Importance is shown for each predictor as the increase in mean-squared error (%). For each model, the ten most important predictors are displayed. Model performance is reported as cross-validated R2 (CV R2) and out-of-bag error (OOB). Asterisks indicate significance after a non-parametric permutation test with 1000 permutations: **, p value < 0.010; *, p value < 0.050.
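
The idea behind an importance score based on the increase in mean-squared error can be illustrated with a generic permutation scheme: shuffle one predictor, re-predict, and record how much the error grows. This is only a sketch under stated assumptions. It uses a fixed stand-in `predict` function and the full dataset, whereas Random Forest implementations typically compute this measure on out-of-bag samples and may scale it differently. All names and data below are hypothetical.

```python
import random


def mse(y_true, y_pred):
    """Mean-squared error between two equally long sequences."""
    return sum((a - b) ** 2 for a, b in zip(y_true, y_pred)) / len(y_true)


def permutation_importance(predict, rows, y, n_repeats=20, seed=0):
    """Mean percent increase in MSE after shuffling each predictor column."""
    rng = random.Random(seed)
    base = mse(y, [predict(r) for r in rows])
    importances = []
    for j in range(len(rows[0])):
        increases = []
        for _ in range(n_repeats):
            column = [r[j] for r in rows]
            rng.shuffle(column)
            shuffled = [r[:j] + [v] + r[j + 1:] for r, v in zip(rows, column)]
            increases.append(mse(y, [predict(r) for r in shuffled]) - base)
        importances.append(100.0 * (sum(increases) / n_repeats) / base)
    return importances


# Hypothetical data: the response depends on predictor 0 only,
# plus a small alternating residual so the baseline MSE is not zero.
rows = [[float(i), float((i * 7) % 5)] for i in range(10)]
y = [2.0 * r[0] + (0.5 if i % 2 == 0 else -0.5) for i, r in enumerate(rows)]


def toy_model(row):
    """Stand-in for a fitted model; it ignores predictor 1."""
    return 2.0 * row[0]


importance = permutation_importance(toy_model, rows, y)
```

In this toy case the ignored predictor receives an importance of exactly zero, while shuffling the informative predictor inflates the error.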

Fig. 3: Results from variance partitioning analysis showing major drivers of variation in ecosystem multifunctionality in our study.

Results are grouped by land use (A), climatic region (B), soil texture type (C), and pH (D). The number of soils included in each group is indicated above each stacked bar plot as “n”. Different colors indicate unique variance explained by climate (blue), soil properties (yellow), soil microbiome (red), and shared variance (gray). Individual values for each category are available in Table S3.
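
The unique and shared fractions shown in this figure can be understood through nested models: the unique contribution of one predictor group is the drop in explained variance when that group is removed from the full model. The sketch below assumes this standard scheme (often applied to adjusted R²) and uses invented R² values; it does not reproduce the study's numbers, and it lumps all joint fractions into a single shared term, which can be negative in practice.

```python
def partition_variance(r2_full, r2_without):
    """Split explained variance into unique and shared parts.

    r2_full    -- R² of the model containing every predictor group
    r2_without -- maps each group name to the R² of the model omitting it
    """
    unique = {group: r2_full - r2 for group, r2 in r2_without.items()}
    shared = r2_full - sum(unique.values())
    return unique, shared


# Hypothetical R² values for illustration only.
unique, shared = partition_variance(
    r2_full=0.50,
    r2_without={"climate": 0.49, "soil": 0.30, "microbiome": 0.40},
)

for group, fraction in unique.items():
    print(f"{group}: {fraction:.2f}")
print(f"shared: {shared:.2f}")
```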

Context-dependency of soil microbiome contributions to multifunctionality

Our findings indicate that the potential of the soil microbiome to act as an indicator of ecosystem multifunctionality is strongly context-dependent, varying with soil characteristics and environmental conditions. We observed that in croplands, the unique variance in multifunctionality explained by the soil microbiome (13.94%) was comparable to that explained by soil properties (13.92%), suggesting that microbial composition in these intensively managed systems is an important factor explaining ecosystem performance (Fig. 3). A similar pattern was evident in soils from temperate dry climates, in neutral pH conditions (pH between 6.5 and 7.3), and in clay-rich soils, where the predictive power of the microbiome closely matched that of soil properties (Fig. 3, Table S4). These results suggest that under certain soil and environmental configurations, the microbiome may serve as a reliable indicator for assessing ecosystem functioning. In contrast, in soils from other contexts, such as those originating from woodlands and continental climatic regions, the unique contribution of the microbiome to explaining multifunctionality was noticeably lower than that of soil properties (Fig. 3). Our results contrast with previous studies showing that multifunctionality in intensively managed ecosystems (e.g., croplands) depends primarily on external inputs (e.g., fertilizers), as we demonstrate that these systems also rely on their soil microbiome (Gossner et al., 2016). We also suggest that the utility of the soil microbiome as an indicator of multifunctionality is not universal but depends on the interaction between microbial communities and their surrounding environment, which is in line with previous research showing that soil microbial diversity depends on the interplay between soil properties and climate21,22.

At the level of individual functions, the soil microbiome explained substantial unique variance (>25%) for xylosidase activity in dry and clay-rich soils, phosphatase activity in neutral and clay-rich soils, and N-acetylglucosaminidase activity in neutral soils (Figures S4–S6). The soil microbiome was also the main predictor of primary productivity, particularly in alkaline and sandy soils (Figure S7). On the other hand, soil properties (e.g., organic carbon) were the main predictors of basal respiration. Future studies should address whether the freeze–thaw of soil samples might have disrupted microbial community signals and made respiration more strongly dependent on substrate availability than on community composition.

We hypothesized that microbial indicator taxa of multifunctionality vary across soil types, and our results confirmed the hypothesis. Random forest analyses revealed that the contribution of microbial groups to ecosystem multifunctionality is context-dependent and varies across land uses and environmental gradients (Fig. 2). In croplands, multiple microbial modules (particularly bacterial modules 3, 5, 7, and 9, and fungal modules 13 and 1) emerged as strong predictors of multifunctionality. Taxonomically, most bacterial modules were dominated by Proteobacteria, representing 30–35% of OTUs in nearly all modules (Figure S10, Supplementary dataset 1). Module 7, however, was distinct in being dominated by Actinobacteria, especially the genus Gaiella, a pattern also observed to a lesser extent in module 3. Both modules 3 and 7 also contained numerous OTUs affiliated with Sphingomonas, Nocardioides, Solirubrobacter, and Streptomyces (Actinobacteria). Module 1, together with modules 2 and 5, included a high proportion of Acidobacteria. Module 5 also contained many OTUs within Bacteroidetes, particularly taxa from the genus Flavobacterium (Figure S10). Fungal modules were largely dominated by Ascomycota, which accounted for 48–61% of OTUs across modules (Figure S11). Modules 1, 3, and 5 contained many OTUs associated with the genus Mortierella, while modules 5 and 13 included members of Glomeromycota (arbuscular mycorrhizal fungi). In woodlands, fungal module 5, enriched in Mortierella and symbiotic fungi such as Archaeorhizomyces and members of Glomeromycota, emerged among the top predictors of multifunctionality for certain soil categories (Fig. 2). Notably, Glomeromycota have repeatedly been shown to support plant growth and contribute to ecosystem multifunctionality23,24, although they represented a small proportion of all fungal sequences in our dataset (≈1.00%). Similarly, previous research at the regional scale has identified Actinobacteria and members of Mortierella as hubs in microbial co-occurrence networks25,26. Finally, we acknowledge that all taxa identified here as potential indicators of multifunctionality are strict aerobes, which is consistent with sequencing bulk soil from the top 20 cm where oxygen is abundant. Future studies should extend to deeper layers, where community composition may shift and alternative microbial indicators could emerge27.

Taken together, our results open the path towards the development of biomarkers where the abundance of these taxa could be monitored through qPCR assays, integrated into soil health indices, or used as features in predictive models28. Given their sensitivity to land-use and soil conditions, these taxa may serve as early-warning markers for shifts in ecosystem multifunctionality.

To explore how environmental context shapes the relationships between microbial communities, soil properties, climate, and ecosystem multifunctionality, we employed multi-group structural equation models (SEMs). These models tested direct contributions of climate, soil properties, and the microbiome to multifunctionality, as well as indirect effects of climate and soil mediated through the microbiome (Fig. 4, Figure S3, Supplementary dataset 2). For each environmental classification (land use, climatic region, soil texture, and pH class), we compared unconstrained models (allowing regression paths to vary among groups) with constrained models that assumed equal relationships across groups. Unconstrained models consistently outperformed constrained ones based on likelihood ratio tests, supporting the idea that the effects of microbial, soil, and climatic factors on multifunctionality are context dependent (Table S5). Among the classification schemes tested, grouping by land use yielded the best model fit, as indicated by the lowest AIC in chi-squared comparisons (Fig. 4). This, together with previous research, suggests that land use is a primary organizing factor in determining how environmental drivers influence multifunctionality29.
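
The model comparisons described above rely on two standard quantities: the likelihood ratio statistic between a constrained and an unconstrained model, and the Akaike Information Criterion. A minimal sketch of both is given below. The log-likelihoods and parameter counts are hypothetical placeholders, not values from the study, and the p value (obtained by comparing the statistic with a chi-squared distribution whose degrees of freedom equal the difference in free parameters) is deliberately left out.

```python
def aic(log_likelihood: float, n_params: int) -> float:
    """Akaike Information Criterion: 2k - 2 ln(L). Lower is better."""
    return 2 * n_params - 2 * log_likelihood


def lrt_statistic(loglik_constrained: float, loglik_unconstrained: float) -> float:
    """Likelihood ratio statistic for two nested models."""
    return 2 * (loglik_unconstrained - loglik_constrained)


# Hypothetical fits: the unconstrained multi-group model frees more paths.
constrained = {"loglik": -120.0, "n_params": 10}
unconstrained = {"loglik": -100.0, "n_params": 22}

statistic = lrt_statistic(constrained["loglik"], unconstrained["loglik"])
df = unconstrained["n_params"] - constrained["n_params"]

print(statistic, df)                                  # 40.0 12
print(aic(constrained["loglik"], constrained["n_params"]))      # 260.0
print(aic(unconstrained["loglik"], unconstrained["n_params"]))  # 244.0
```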

Fig. 4: Summary of multi-group structural equation models (SEMs) showing direct and indirect effects of climate, soil properties, and microbial community composition on multifunctionality.

Multi-group SEMs were built for land use categories (A), climatic regions (B), soil types (C), and pH classes (D). For each SEM, colors (red, green, blue) indicate soil grouping categories. Numbers adjacent to arrows indicate standardized path coefficients across all variables in each category. Predictor variables included in each box and a priori model are available in Figure S3. Model fit parameters are indicated as chi-squared (χ²), degrees of freedom (d.f.), p value of two-sided test (P), Akaike Information Criterion (AIC), and root mean squared error of approximation (RMSEA). Full details including individual path coefficients are available in Supplementary Dataset 2.

SEMs also confirmed that while the direct effects of climate, soil properties, and the microbiome were relatively balanced in croplands and grasslands, woodlands exhibited a high influence of soil properties on multifunctionality (Fig. 4). This may reflect the greater influence of accumulated organic matter and nutrient pools in woodland soils, together with microclimatic buffering by tree cover30. These factors strengthen the role of soil properties in regulating ecosystem multifunctionality in woodlands, while reducing the relative contribution of climate and microbial community composition.

SEMs showed that the direct effect of climate (air temperature and precipitation) on ecosystem multifunctionality was strongest in soils from dry regions, where water limitation potentially amplifies both direct and microbiome-mediated climate effects (Fig. 4). In soils originating from continental regions, soil properties also played a dominant direct role, exceeding their influence in other regions. Analyses by soil texture showed that in loam soils, the microbiome had a greater direct influence on multifunctionality than soil properties, a pattern not observed in clay or sandy soils, where soil properties were more dominant. Loamy soils occupy the middle of the soil texture spectrum, reflecting a balanced composition of sand, silt, and clay31. Such conditions likely support diverse and active microbial communities, enabling them to directly regulate multiple ecosystem processes. Finally, pH classification revealed that in alkaline (pH > 7.3) and neutral (pH 6.5–7.3) soils, the microbiome and soil properties contributed equally to multifunctionality (Fig. 4). In contrast, in acidic soils (pH < 6.5), multifunctionality was driven primarily by soil properties, with the microbiome playing a minor role. We argue that acidic soils impose stronger physiological constraints on microorganisms, enhancing the relative importance of soil chemical properties such as pH and nutrient availability in regulating ecosystem multifunctionality32. Taken together, these findings highlight the highly context-specific nature of soil functioning and underscore the need to consider environmental background when interpreting the role of the soil microbiome. Controlled manipulative experiments are needed to confirm the proposed mechanisms (regarding the influence of loam texture, woodland organic matter, or acidic soils on the relationship between the soil microbiome and multifunctionality). Laboratory or field studies altering soil texture, pH, or organic inputs could directly test microbial contributions to multifunctionality.

European-level assessment of the link between microbiomes and functions

Here, we assess ecosystem multifunctionality across Europe by simultaneously comparing multiple environmental factors, including land use (croplands, grasslands, and woodlands), climatic regions (continental, temperate dry, and temperate humid), soil texture (clay, loam, and sand), and soil pH classes (acidic, neutral, and alkaline soils). Previous studies have modeled multifunctionality in grasslands and croplands, showing that agricultural practices can pose risks to ecosystem functions by creating trade-offs between biodiversity and functions such as productivity and nutrient cycling, with pesticide use and fertilization emerging as key drivers33,34. Similarly, a European-scale study covering 94 soils in 13 countries found that interactions between land use and climatic zones drive ecosystem functions in croplands and grasslands35. More recently, Sünnemann and collaborators (2023) found a Europe-wide decline in ecosystem multifunctionality under rising temperatures and dry conditions, worsened by fertilizer and pesticide application. Our study adds primary productivity as an additional ecosystem function, thereby moving beyond soil-focused analyses toward a broader ecosystem perspective of multifunctionality. We argue that future studies should incorporate direct measurements of actual biomass, such as kilograms of dry weight per hectare, because these are likely to provide more accurate estimates than satellite-derived proxies36. Remote sensing has been used in similar studies and captures vegetation greenness or productivity indices at coarse resolution, while field-based biomass measurements directly reflect the material available for carbon storage, reducing uncertainty and improving the precision of multifunctionality assessments37,38. Finally, ecosystem multifunctionality relies on the number of functions integrated7. In this study, six ecosystem functions were measured across all 484 sites, including primary productivity, which is a key function. Future studies should also incorporate direct measurements of soil gas emissions (e.g., CO₂, CH₄, N₂O), as these are critical processes linking microbial activity to climate regulation39,40.

Considering that habitat conversion to croplands will likely increase worldwide to produce food for a growing population, our results stress that the capacity of ecosystems to supply multiple functions simultaneously could be compromised if conversion to croplands is preferred. In line with this, Jeanneret and collaborators (2021) estimated that conversion to arable lands negatively impacts biodiversity of vascular plants and arthropods across Europe. Similarly, conversion to cropland causes homogenization of microbial communities41,42. A related study found that potentially pathogenic fungi were particularly abundant in European croplands compared to grasslands and woodlands, whereas the richness of beneficial taxa (e.g., mycorrhizal fungi) decreased43. Our results add that the conversion of natural ecosystems to croplands affects not only soil biodiversity but also multifunctionality. Moreover, we show that the contribution of the soil microbiome to multifunctionality is context-dependent. We acknowledge that comparing multifunctionality across highly disturbed ecosystems (croplands) and natural or semi-natural ecosystems (woodlands or grasslands) is challenging, since these are very different ecosystem types with different disturbance regimes and the dynamics of each ecosystem might differentially impact multifunctionality. For example, woodlands are stable ecosystems dominated by trees where primary productivity is rather constant over time, whereas croplands are subjected to crop rotations and grasslands are often dominated by annual herbs and forbs, with consequences for organic carbon stocks and productivity44. Taken together, we argue that despite being part of the same landscape, soils from different land uses, textures, climatic regions, and pH classes are associated with distinct microbial predictors of multifunctionality. This indicates that monitoring should adopt context-specific microbial indicators rather than a uniform approach.

Our continental analysis provides evidence that ecosystem multifunctionality is shaped by both environmental context and biotic composition, with land use, soil properties, climate, and the soil microbiome each contributing to ecosystem multifunctionality. Multifunctionality was highest in grasslands and woodlands, in loam-textured and acidic soils (pH < 6.5), and in temperate humid regions. The best predictors of ecosystem multifunctionality overall were microbial biomass and nitrogen content. Random forest and variance partitioning analyses further revealed that the soil microbiome (i.e., co-occurring microbial OTUs) explains a substantial and context-dependent share of multifunctionality. The soil microbiome was the main factor contributing to variation in enzymatic activities and primary productivity, while soil basal respiration and soil aggregation were mostly explained by soil properties. We also found that modules dominated by Actinobacteria (e.g., Gaiella, Sphingomonas), Acidobacteria, and Mortierella frequently emerged as strong predictors of ecosystem multifunctionality. Our use of multi-group structural equation models further highlights that the strength and nature of relationships among climate, soil properties, the microbiome, and multifunctionality differ across environmental classifications. Among all groupings, land use emerged as the most influential structuring variable. Collectively, these findings underscore the need for context-aware approaches in using microbial indicators for soil monitoring. As land-use change intensifies globally, understanding how local conditions modulate microbiome-function relationships is essential to designing effective, scalable soil health strategies and sustaining ecosystem services under environmental change.

Methods

Field survey and soil sampling

Our study was built upon the EU Statistical Office’s Land Use and Coverage Area Frame Survey (LUCAS) Soil, the largest pan-European scheme for assessing soil characteristics in relation to land cover and land use45,46. The sampling approach used a composite strategy, where the final sample at each location consisted of five combined topsoil (0–20 cm) subsamples. The initial subsample was obtained precisely at the coordinates of the designated LUCAS point, while the four additional subsamples were collected 2 m from the central point, aligning with the cardinal directions (North, East, South, and West). To minimize the impact of seasonality and temporal variation, all samples were collected over the shortest possible timeframe during spring/summer 2018. For this study, a total of 484 soil samples were collected, comprising 165 from woodlands, 92 from grasslands, and 227 from croplands. Based on the Köppen–Geiger climate classification, 113 samples originated from sites in a continental climate, 122 from temperate dry regions, and 249 from temperate humid regions (Figure S1).

Soil edaphic factors and climatic variables

Soil samples were used to determine a range of edaphic factors following standard procedures46. Total phosphorus content (mg kg−1) was measured by the ISO 11263:1994 protocol following phosphorus solubilization in a sodium hydrogen carbonate solution. Total nitrogen content (g kg−1) was measured following the ISO 11261:1995 protocol. Extractable potassium content (mg kg−1) was determined with atomic absorption spectrophotometry47. Soil pH was measured according to the ISO 10390:1994 standard using 0.01 M CaCl₂ as the extractant, while the proportions of silt, clay, and sand were determined using laser diffraction particle size analysis following ISO 13320:2009. Based on these measurements, samples were further grouped by pH and texture. For soil texture classification, we used the USDA system integrated within the soil texture wizard48. We classified 290 soils as acidic (pH < 6.5), 63 as neutral (pH 6.5–7.3), and 131 as alkaline (pH > 7.3). Regarding texture, 129 soils were identified as clay, 179 as loam, and 176 as sand.
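
The pH grouping described above amounts to a simple threshold rule, sketched below. The function name is illustrative. The thresholds are those stated in the text, with both boundary values (6.5 and 7.3) assigned to the neutral class as the stated ranges imply.

```python
def ph_class(ph: float) -> str:
    """Assign a soil to the pH class used for grouping."""
    if ph < 6.5:
        return "acidic"
    if ph <= 7.3:
        return "neutral"
    return "alkaline"


# Class sizes reported in the text; together they cover all samples.
class_sizes = {"acidic": 290, "neutral": 63, "alkaline": 131}
total_samples = sum(class_sizes.values())

print(ph_class(5.2), ph_class(6.5), ph_class(7.3), ph_class(8.0))
print(total_samples)  # 484
```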

Soil organic carbon was measured following the ISO 10694:1995 protocol. Briefly, the total carbon content in each sample was determined using an elemental analyzer following dry combustion and corrected for carbonate content (ISO 10693:1994), resulting in soil organic carbon (g kg−1). Soil bulk density was measured following the ISO 11272:2017 procedure49. Soil bulk density was expressed as the dry weight of soil divided by its volume (g cm−3), which includes the volume of soil particles and the pores among them50. Soil microbial biomass (expressed as μg Cmic g−1 soil dry weight) was estimated as a proxy using substrate-induced respiration, based on the respiration response to glucose addition measured directly after potential basal respiration on the same soil samples. Climatic variables, including monthly mean air temperature and precipitation (averaged values over the 1970–2000 period), were obtained for each sampling location from the WorldClim database (worldclim.org).

Soil microbial community composition

Soil samples (n = 484) were analyzed for bacterial and fungal biodiversity using DNA metabarcoding. DNA was extracted using the Qiagen DNeasy PowerSoil HTP 96 Kit, with three 0.2 g aliquots per sample pooled post-extraction51. DNA quality and quantity were assessed using the Qubit™ 1X dsDNA HS Assay Kit. PCR amplification was performed in triplicate using 5 × HOT FIREPol® Blend Master Mix (Solis BioDyne, Tartu, Estonia) in 25 μl volume. The PCR conditions for bacterial amplification included 55 °C annealing temperature, 26 cycles, and 1.5 ng of DNA template in 1 µl, following an optimized protocol derived from the Earth Microbiome Project (https://earthmicrobiome.org/protocols-and-standards/16s) to minimize PCR bias while ensuring sufficient yield for sequencing. For fungal amplification, the PCR conditions included an annealing temperature of 55 °C, 30 cycles, and 1.5 ng of DNA template in 1 µl. Amplified DNA was tagged with multiplex identifier tags, pooled, and verified on a TBE 1% agarose gel. Primer sets for barcode amplification were 515F (GTGYCAGCMGCCGCGGTAA) and 926R (GGCCGYCAATTYMTTTRAGTTT), targeting the V4-V5 hypervariable region of the bacterial 16S rRNA gene52,53, and ITS9mun (GTACACACCGCCCGTCG) and ITS4ngsUni (CGCCTSCSCTTANTDATATGC) for the fungal ITS region54. Sequencing was performed using the Illumina MiSeq platform in 2 × 300 paired-end mode for bacterial amplicons and the PacBio Sequel II platform for fungal amplicons.

The Illumina and PacBio amplicon data (for bacteria and fungi, respectively) were demultiplexed using LotuS2, and paired-end reads were assembled using FLASH 1.2.10 (refs. 55,56). For bacteria, zero-radius operational taxonomic units (zOTUs) were generated using the UPARSE algorithm (usearch version 10.0.024; ref. 57). The process involved merging paired-end reads, trimming off 16S primer sequences, quality filtering, and denoising. For fungi, 98%-OTUs were obtained following the VSEARCH algorithm58. Our datasets, consisting of 484 soils, included 79,593 zOTUs (bacteria) and 35,152 98%-OTUs (fungi). The bacterial dataset was rarefied at 40,000 sequences per sample, and the fungal dataset was rarefied at 1000 sequences per sample. Taxonomy was assigned to the bacterial dataset using the Ribosomal Database Project v16 (ref. 59) and to the fungal dataset using MegaBLAST searches against the UNITE 9.1 database60.
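
Rarefaction to a fixed depth, as mentioned above, means drawing the same number of reads at random and without replacement from every sample so that richness and relative abundances are comparable. The sketch below illustrates the principle on a single hypothetical sample. The OTU names and counts are invented, and returning `None` for samples shallower than the target depth is an assumption about how such samples would be handled.

```python
import random
from collections import Counter


def rarefy(counts, depth, seed=0):
    """Subsample `depth` reads without replacement from {OTU: read count}."""
    if sum(counts.values()) < depth:
        return None  # too shallow for the chosen depth (assumed to be dropped)
    # Expand the table into one entry per read, then draw at random.
    pool = [otu for otu, n in counts.items() for _ in range(n)]
    rng = random.Random(seed)
    return dict(Counter(rng.sample(pool, depth)))


# Hypothetical fungal sample with 2,500 reads, rarefied to 1,000.
sample = {"OTU_a": 1500, "OTU_b": 700, "OTU_c": 290, "OTU_d": 10}
rarefied = rarefy(sample, depth=1000)
too_shallow = rarefy({"OTU_a": 400}, depth=1000)

print(sum(rarefied.values()))  # 1000
print(too_shallow)             # None
```

Rare OTUs such as `OTU_d` may vanish after subsampling, which is why the chosen depth affects observed richness.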

To identify groups of co-occurring microbial taxa, we used the WGCNA (Weighted Gene Co-expression Network Analysis) R package61. OTU tables containing relative abundances were first formatted with samples as rows and OTUs as columns. We assessed the data for missingness using the goodSamplesGenes function, followed by hierarchical clustering (hclust) to detect potential sample outliers. A soft-thresholding power was selected based on scale-free topology and mean connectivity criteria (pickSoftThreshold, with blockSize = 1000). An adjacency matrix was then constructed using this threshold and transformed into a Topological Overlap Matrix (TOM) to compute dissimilarity between OTUs. Taxa were clustered using hierarchical clustering, and modules were defined via the cutreeDynamic function, with a minimum module size of 15 for fungi and 100 for bacteria. Module eigengenes were calculated (moduleEigengenes function) and used as indicators of microbial community composition. Finally, OTUs were taxonomically annotated and assigned to their corresponding modules for interpretation. This network-based approach allowed us to reduce community data dimensionality and link modules to environmental variables and multifunctionality.
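
The two matrix steps at the core of this workflow, soft-thresholded adjacency and topological overlap, can be sketched in plain Python. This is an illustration under assumptions rather than the WGCNA implementation: it uses an unsigned network (absolute Pearson correlation raised to the power β), the standard unsigned topological overlap formula, and three invented OTU abundance profiles. The soft-thresholding power of 6 is arbitrary here, whereas the study selected it from scale-free topology criteria. Columns with zero variance would need to be removed first.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equally long profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def adjacency(profiles, power):
    """Unsigned soft-threshold adjacency: |cor|^power, with 1 on the diagonal."""
    p = len(profiles)
    return [[1.0 if i == j else abs(pearson(profiles[i], profiles[j])) ** power
             for j in range(p)] for i in range(p)]


def tom_dissimilarity(adj):
    """1 - topological overlap, the distance used for hierarchical clustering."""
    p = len(adj)
    # Connectivity of each node, excluding its self-connection.
    k = [sum(adj[i][u] for u in range(p) if u != i) for i in range(p)]
    diss = [[0.0] * p for _ in range(p)]
    for i in range(p):
        for j in range(p):
            if i == j:
                continue
            shared = sum(adj[i][u] * adj[u][j] for u in range(p) if u not in (i, j))
            tom = (shared + adj[i][j]) / (min(k[i], k[j]) + 1 - adj[i][j])
            diss[i][j] = 1 - tom
    return diss


# Invented relative abundances of three OTUs across five samples.
otu_profiles = [
    [1.0, 2.0, 3.0, 4.0, 5.0],
    [2.0, 4.0, 6.0, 8.0, 10.5],  # tracks the first OTU closely
    [5.0, 1.0, 4.0, 2.0, 3.0],   # largely unrelated
]
adj = adjacency(otu_profiles, power=6)
diss = tom_dissimilarity(adj)
```

Co-occurring OTUs end up with a small dissimilarity and would be grouped into the same module by the subsequent clustering step.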

Ecosystem multifunctionality

We assessed six functional proxies across 484 soils (Table S1): soil basal respiration, three enzymatic activities (N-acetylglucosaminidase, phosphatase, and xylosidase), net primary productivity, and soil aggregate stability. These proxies capture different ecosystem processes and associated services. Soil basal respiration reflects microbial activity and relates to climate regulation. N-acetylglucosaminidase breaks down chitin and complex carbohydrates, informing nitrogen cycling and organic matter decomposition. Phosphatase releases inorganic phosphate from organic compounds, a key process in phosphorus cycling. Xylosidase cleaves xylose from hemicellulose, contributing to plant litter decomposition and carbon cycling. Net primary productivity reflects primary production and food provision, while soil aggregate stability represents soil structure, linked to erosion regulation, flood regulation, and habitat provision. Our selection of functions was based on biotic or abiotic processes that can be measured as a rate or directly contribute to ecosystem services7.

Potential soil basal respiration (μL O2 g−1 soil dry weight h−1) was measured using an O2-microcompensation apparatus: respiration was determined at 20 °C using the pressure difference between a sealed chamber containing the sample and a control chamber at atmospheric pressure. For this, 5–7 g of thawed soil, which had been previously acclimated for five days at 4 °C and sieved at 2 mm, was used. Measurements of potential basal respiration lasted from 22 to 42 hours, depending on the time taken for respiration rates to reach detectable levels. A timeframe of 5–7 consecutive measurement hours, in which respiration was stable, was used to calculate the average potential basal respiration of the sample62,63.

The activities of the soil enzymes N-acetylglucosaminidase, xylosidase, and phosphatase were measured based on 4-methylumbelliferone (MUF)-coupled substrates63. Briefly, fresh soil samples (250 mg each) were suspended in 50 μM acetate buffer (pH 5), sonicated to disrupt soil aggregates, and incubated at 25 °C for 60 min in 96-well microplates containing substrates, MUF dilutions for quenching and extinction coefficients, and controls for substrate and soil suspensions. Reactions were stopped with 2 M NaOH, and fluorescence was measured in eight technical replicates using an Infinite 200 PRO instrument (Tecan Group, Männedorf, Switzerland). Enzymatic activities were quantified as substrate turnover rates, expressed as nmol of MUF per gram of dry soil per hour.

Net primary productivity (NPP) was estimated at each sampling location using Moderate Resolution Imaging Spectroradiometer (MODIS) imagery from NASA’s Terra and Aqua satellites, specifically through the MOD17 PSN/NPP algorithm. NPP was calculated as the sum of eight-day net photosynthesis (PSN) products based on the Fraction of Absorbed Photosynthetically Active Radiation (FAPAR). FAPAR, which indicates the solar radiation absorbed by plants, is derived using variables such as the maximum radiation conversion efficiency, ground temperature, and vapor pressure64. The NPP data for 2018, aligning with LUCAS sampling locations, were averaged across ±6 months surrounding the sampling times at a 500 m resolution, and expressed as g C m−2 yr−1.

Soil aggregation was determined from the percentage of water-stable aggregates (WSA) following the wet-sieving method63,65. Briefly, 4 g of fresh soil (FM) were placed into 0.25-mm sieves and allowed to rewet by capillarity for 5 min. Samples were then wet-sieved for 3 min and dried overnight at 70 °C, after which dry matter (the WSA term in Eq. 1) was measured. Coarse matter (CM) was measured after another night at 70 °C. Finally, the percentage of water-stable aggregates was calculated as follows:

$$\%\,\mathrm{WSA}=\left(\frac{\mathrm{WSA}-\mathrm{CM}}{\mathrm{FM}-\mathrm{CM}}\right)\times 100 \tag{1}$$
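As a worked illustration of Eq. 1, the following Python sketch computes the percentage of water-stable aggregates from the three masses. The function name and the example masses are hypothetical and are not taken from the study data.

```python
def percent_wsa(wsa, cm, fm):
    """Eq. 1: %WSA = (WSA - CM) / (FM - CM) * 100, all masses in grams."""
    if fm <= cm:
        raise ValueError("FM must exceed CM for the ratio to be defined")
    return (wsa - cm) / (fm - cm) * 100.0

# Hypothetical masses: 4.0 g soil, 2.5 g retained after wet sieving,
# 0.5 g coarse matter.
example = percent_wsa(wsa=2.5, cm=0.5, fm=4.0)
```

With these illustrative masses, 2.0 g of the 3.5 g of non-coarse material remains stable, which corresponds to roughly 57% water-stable aggregates.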

To quantify ecosystem multifunctionality, we calculated four distinct indices based on the six measured functions. The first index was a simple average, where all functions were standardized to a 0–1 scale and then averaged, assuming equal contribution of each function. The second index applied a z-score transformation, standardizing each function using the formula z = (X − µ)/σ, where X is the function value, µ is the mean, and σ is the standard deviation6,66. This approach facilitates direct comparability among functions by expressing values in terms of standard deviations from the mean. The third index was a weighted average, which accounted for the fact that three of the six functions were enzymatic activities representing similar processes. In this index, each enzymatic function was assigned a relative weight of 0.33, while non-redundant functions retained full weight. In addition, we calculated a multiple-threshold index by assessing the number of functions (0–6) performing above predefined thresholds (25%, 50%, and 75% of the maximum observed value). We also analyzed each function individually to provide detailed insights into how predictors influenced both single functions and overall multifunctionality. Because the calculated indices were highly correlated (Figure S2), we selected the weighted index for downstream analyses (see next section). This index best reflects ecosystem multifunctionality, as it corrects for the overrepresentation of enzymatic activities (3 out of the 6 measured proxies).
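The four indices can be sketched in Python as follows. This is a minimal illustration under stated assumptions rather than the code used in the study: the weighted index is assumed to be computed on 0–1 standardized values and normalized by the sum of weights, the z-score uses the sample standard deviation, and the function names and the three-sample toy data are hypothetical.

```python
from statistics import mean, stdev

ENZYMES = {"nag", "phosphatase", "xylosidase"}

def minmax(values):
    """Standardize one function to the 0-1 range across samples."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]

def zscores(values):
    """z = (X - mean) / sd, using the sample standard deviation (assumption)."""
    mu, sigma = mean(values), stdev(values)
    return [(v - mu) / sigma for v in values]

def average_index(functions, transform=minmax):
    """Unweighted mean of the transformed functions, per sample."""
    scaled = [transform(v) for v in functions.values()]
    return [mean(sample) for sample in zip(*scaled)]

def weighted_index(functions, enzyme_weight=1 / 3):
    """Weighted mean: each enzyme counts one third, other functions count 1."""
    weights = [enzyme_weight if name in ENZYMES else 1.0 for name in functions]
    scaled = [minmax(v) for v in functions.values()]
    total = sum(weights)
    return [sum(w * x for w, x in zip(weights, sample)) / total
            for sample in zip(*scaled)]

def threshold_index(functions, threshold):
    """Number of functions above threshold * maximum observed value."""
    cutoffs = [threshold * max(v) for v in functions.values()]
    return [sum(x > c for x, c in zip(sample, cutoffs))
            for sample in zip(*functions.values())]

# Hypothetical measurements for three soil samples.
toy = {
    "respiration": [1.0, 2.0, 3.0],
    "nag": [10.0, 30.0, 20.0],
    "phosphatase": [5.0, 15.0, 10.0],
    "xylosidase": [2.0, 6.0, 4.0],
    "npp": [300.0, 600.0, 900.0],
    "aggregates": [20.0, 40.0, 60.0],
}
```

In the toy data, the second sample is high in all three enzymes and the third is high in the three other functions. The simple average scores them equally (0.75 each), whereas the weighted index separates them (0.625 versus 0.875), which illustrates how down-weighting the enzymes reduces their combined influence.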

Statistical analyses

All statistical analyses were conducted in R (version 4.2.1)67. We explored the relative contribution of the soil microbiome, soil properties, and climate in explaining patterns of multifunctionality by means of random forest analyses, variance partitioning, and structural equation models (SEMs). We first investigated how multifunctionality and individual soil functions varied across different environmental groupings. Specifically, we categorized our 484 soil samples based on land use (cropland = 227, grassland = 92, woodland = 165), climatic region (continental = 113, temperate dry = 122, temperate humid = 249), soil texture (clay = 129, loam = 179, sand = 176), and pH class (acidic = 290, neutral = 63, alkaline = 131). We then used Kruskal-Wallis tests followed by pairwise Dunn’s tests (Bonferroni correction) to assess group differences. For each test, we reported the chi-squared statistic, p value, and eta-squared as a measure of effect size. Since overlaps among soil groupings could confound interpretations, we quantified the percentage of overlap between categories. With few exceptions, overlap was limited. However, woodland, continental, and sandy soils largely coincided with acidic soils (Table S2). This indicates that patterns attributed to these groupings may partly reflect their strong association with soil pH.
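For readers unfamiliar with the test statistic and effect size, the sketch below computes the Kruskal-Wallis H statistic (with tie correction) and an eta-squared estimate in plain Python. The eta-squared formula used here, (H − k + 1)/(n − k), is one common definition and is an assumption, since the text does not state which variant was applied. The p value (from a chi-squared distribution with k − 1 degrees of freedom) and the Dunn's post hoc tests are omitted, and the example groups are hypothetical.

```python
def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    tie_sizes = []
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # positions i..j are 0-based, ranks are 1-based
        for pos in range(i, j + 1):
            ranks[order[pos]] = avg
        tie_sizes.append(j - i + 1)
        i = j + 1
    return ranks, tie_sizes

def kruskal_wallis(groups):
    """Return (H, eta_squared) for a list of groups of observations."""
    pooled = [v for g in groups for v in g]
    n, k = len(pooled), len(groups)
    ranks, tie_sizes = average_ranks(pooled)
    h, start = 0.0, 0
    for g in groups:
        rank_sum = sum(ranks[start:start + len(g)])
        h += rank_sum ** 2 / len(g)
        start += len(g)
    h = 12.0 / (n * (n + 1)) * h - 3.0 * (n + 1)
    # Tie correction; equals 1 when all values are distinct.
    h /= 1.0 - sum(t ** 3 - t for t in tie_sizes) / (n ** 3 - n)
    eta_squared = (h - k + 1) / (n - k)
    return h, eta_squared

# Hypothetical multifunctionality values for three land-use groups.
h_stat, eta_sq = kruskal_wallis([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
```

For these fully separated toy groups, H is 7.2 and eta-squared is about 0.87, indicating that group membership accounts for most of the rank variation.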

To identify key predictors of multifunctionality (weighted index), we employed Random Forest (RF) regression using the rfPermute package68 with 1000 trees. Models were trained on standardized data and evaluated using out-of-bag (OOB) R² and 10-fold cross-validation (repeated 3 times). Variable importance was assessed based on the increase in mean squared error when each predictor (microbial modules, soil properties, climate) was permuted.
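The idea behind permutation-based importance can be shown with a short Python sketch. It does not reproduce rfPermute, which works from the out-of-bag error of a fitted random forest and adds permutation p values. Here a fixed stand-in prediction function replaces the forest, and all names and data are hypothetical.

```python
import random

def mse(observed, predicted):
    """Mean squared error between observed and predicted values."""
    return sum((a - b) ** 2 for a, b in zip(observed, predicted)) / len(observed)

def permutation_importance(predict, rows, y, column, repeats=20, seed=1):
    """Mean increase in MSE after shuffling one predictor column."""
    rng = random.Random(seed)
    baseline = mse(y, [predict(r) for r in rows])
    increases = []
    for _ in range(repeats):
        shuffled = [r[column] for r in rows]
        rng.shuffle(shuffled)
        # Rebuild the rows with only the chosen column permuted.
        permuted = [r[:column] + [v] + r[column + 1:]
                    for r, v in zip(rows, shuffled)]
        increases.append(mse(y, [predict(r) for r in permuted]) - baseline)
    return sum(increases) / repeats

def toy_model(row):
    """Stand-in for a fitted model: uses predictor 0 and ignores predictor 1."""
    return 2.0 * row[0]

# Hypothetical data: eight samples, two predictors.
rows = [[float(i), float((i * 3) % 8)] for i in range(8)]
y = [2.0 * r[0] for r in rows]

importance_used = permutation_importance(toy_model, rows, y, column=0)
importance_ignored = permutation_importance(toy_model, rows, y, column=1)
```

Shuffling the predictor that the model relies on degrades its predictions and yields a positive importance, whereas shuffling the ignored predictor leaves the error unchanged and yields an importance of zero.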

Next, we partitioned the variance in multifunctionality using adjusted R² to determine the unique and shared contributions of three predictor groups: climate (precipitation, temperature), edaphic factors (excluding total nitrogen due to collinearity with SOC), and microbial composition (bacterial and fungal modules). Unique and shared contributions were calculated through nested linear models.
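The nested-model logic can be sketched in Python as follows. This is an illustration under assumptions, not the analysis code: the unique fraction of a group is taken as the adjusted R² of the full model minus that of the model without the group, all shared fractions are pooled into a single remainder, ordinary least squares is solved with a small hand-written routine, and the toy data (one predictor per group) are hypothetical.

```python
def solve(a, b):
    """Solve a linear system by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x

def adjusted_r2(y, predictors):
    """Adjusted R-squared of an OLS fit with intercept."""
    n, p = len(y), len(predictors)
    rows = [[1.0] + [col[i] for col in predictors] for i in range(n)]
    xtx = [[sum(r[a] * r[b] for r in rows) for b in range(p + 1)]
           for a in range(p + 1)]
    xty = [sum(r[a] * yi for r, yi in zip(rows, y)) for a in range(p + 1)]
    beta = solve(xtx, xty)
    fitted = [sum(b * v for b, v in zip(beta, r)) for r in rows]
    y_bar = sum(y) / n
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_tot = sum((yi - y_bar) ** 2 for yi in y)
    return 1.0 - (ss_res / ss_tot) * (n - 1) / (n - p - 1)

def partition(y, groups):
    """Full adjusted R2, unique fraction per group, and pooled shared part."""
    full = adjusted_r2(y, [c for cols in groups.values() for c in cols])
    unique = {}
    for name in groups:
        others = [c for other, cols in groups.items() if other != name
                  for c in cols]
        unique[name] = full - adjusted_r2(y, others)
    shared = full - sum(unique.values())
    return full, unique, shared

# Hypothetical data: the soil predictor is correlated with the climate one.
groups = {
    "climate": [[-1, -1, -1, -1, 1, 1, 1, 1]],
    "soil": [[-2, -2, 0, 0, 0, 0, 2, 2]],
    "microbes": [[-1, 1, -1, 1, -1, 1, -1, 1]],
}
multifunctionality = [-5, -1, -1, -1, 1, 1, 1, 5]
full, unique, shared = partition(multifunctionality, groups)
```

Because the toy climate and soil predictors are correlated, part of the explained variance cannot be attributed to either group alone: the full model reaches an adjusted R² of 0.75, the unique fractions are 0.05 (climate), 0.15 (soil), and 0.15 (microbes), and the remaining 0.40 is shared.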

Finally, to disentangle direct and indirect effects, we used multi-group Structural Equation Models (SEMs) with the lavaan package69. Predictors were grouped into climate, soil properties, and microbial variables (see Figure S3 for a priori model). We fit SEMs and estimated standardized path coefficients. A constrained model was also fit to test whether relationships differed significantly across groups. Direct effects on multifunctionality and indirect effects on the microbiome were summarized by group and category.

Reporting summary

Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.