Crowdsourced biodiversity monitoring fills gaps in global plant trait mapping

Lusk, Daniel; Wolf, Sophie; Svidzinska, Daria; Dormann, Carsten F.; Kattge, Jens; Bruelheide, Helge; Sabatini, Francesco Maria; Damasceno, Gabriella; Moreno Martínez, Álvaro; Violle, Cyrille; Hending, Daniel; Hähn, Georg J. A.; Tabeni, Solana; Phartyal, Shyam; Gonçalves, Fernando; Kreft, Holger; Schmidt, Marco; Chen, Han; Güler, Behlül; Dolezal, Jiri; Pielech, Remigiusz; Guido, Anaclara; Dwyer, Ciara; Napoleone, Francesca; Willie, Jacob; Gasper, André Luís; Macía, Manuel J.; Chytry, Milan; Lenoir, Jonathan; Thakur, Dinesh; Dengler, Jürgen; Świerszcz, Sebastian; Altman, Jan; Mucina, Ladislav; Nerlekar, Ashish N.; Kakinuma, Kaoru; Rawat, Pravin; Stančić, Zvjezdana; Testolin, Riccardo; Hatim, Mohamed Z.; Rodrigues, Flávio; Homeier, Jürgen; Marques, Marcia C. M.; McCarthy, James K.; El-Sheikh, M. A.; Korznikov, Kirill; Gerberding, Kilian; Kattenborn, Teja

doi:10.1038/s41467-026-68996-y

Download PDF

Article
Open access
Published: 30 January 2026

Crowdsourced biodiversity monitoring fills gaps in global plant trait mapping

Nature Communications volume 17, Article number: 1203 (2026) Cite this article

9671 Accesses
2 Citations
204 Altmetric
Metrics details

Subjects

Abstract

Plant functional traits are fundamental to ecosystem dynamics and Earth system processes, but their global characterization is limited by available field surveys and trait measurements. Recent expansions in biodiversity data aggregation—including vegetation surveys, citizen science observations, and trait measurements—offer new opportunities to overcome these constraints. Here we demonstrate that combining these diverse data sources with high-resolution Earth observation data enables accurate modeling of key plant traits at up to 1 km² resolution. Our approach achieves correlations up to 0.63 (15 of 31 traits exceeding 0.50) and improved spatial transferability, effectively bridging gaps in under-sampled regions. By capturing a broad range of traits with high spatial coverage, these maps can enhance understanding of plant community properties and ecosystem functioning, while serving as tools for modeling global biogeochemical processes and informing conservation efforts. Our framework highlights the power of crowdsourced biodiversity data in addressing longstanding extrapolation challenges in global plant trait modeling, with continued advancements in data collection and remote sensing poised to further refine trait-based understanding of the biosphere.

Leveraging remote sensing and crowd-sourced biodiversity data for enhanced plant functional trait mapping

Article Open access 21 April 2026

CoRRE Trait Data: A dataset of 17 categorical and continuous traits for 4079 grassland species worldwide

Article Open access 18 July 2024

Citizen science plant observations encode global trait patterns

Article 20 October 2022

Introduction

The diversity and distribution of plants shape ecosystem function and influence global biogeochemical cycles^1,2,3. Since Humboldt’s early explorations of plant geography in 1807, ecologists have sought to understand how plant properties govern species’ success across environmental gradients and their role in Earth system processes^4,5,6,7. Despite significant advances, a comprehensive global understanding of plant functional traits and their implications for ecosystem resilience remains incomplete^8,9.

Plant functional traits—measurable characteristics influencing growth, reproduction, and survival—are central to ecological strategies and ecosystem modeling^10,11,12,13. Traits such as leaf area, wood density, and seed mass reflect trade-offs in resource use, stress tolerance, and competition, shaping plant community assembly¹⁴. Integrating these traits into global vegetation and Earth system models is crucial for refining projections of energy, carbon, and water cycles^15,16. However, producing spatially explicit, high-resolution trait maps requires extensive in situ trait measurements—a challenge that remains largely unmet.

Earth observation technologies provide valuable insights into vegetation properties, offering continuous, high-resolution data on surface reflectance, vegetation structure, climate, and soil conditions^{17,18,19,20,21,22}. Yet, their utility for plant trait modeling remains constrained by limited in situ trait observations for calibration and validation. While in situ trait measurements are accurate, they are geographically sparse and labor-intensive²³. Global initiatives such as TRY and BIEN have compiled extensive trait databases^24,25,26, but they often lack the spatial coverage necessary for coherently mapping plant functional composition at fine scales. Consequently, satellite-based trait extrapolations carry significant uncertainties^27,28.

Crowdsourced ecological data offer a promising avenue to address these gaps. Expert-led vegetation surveys, such as those aggregated by the sPlot database, document species co-occurrence and abundance around the planet^29,30,31. However, many regions remain underrepresented (Fig. 1, SCI). Meanwhile, crowdsourced biodiversity databases like the Global Biodiversity Information Facility (GBIF), which aggregates records from citizen science platforms such as iNaturalist and Pl@ntNet alongside many other sources, have contributed over half a billion plant observations (Fig. 1, CIT), vastly expanding global species distribution datasets^32,33,34. Though these data exhibit spatial and taxonomic biases, they hold great potential for plant trait inference. Recent work by Wolf et al.²⁸ showed that combining citizen science occurrences with trait databases can yield community-weighted mean trait estimates closely aligned with those from vegetation plot data.

**Fig. 1: Density maps of citizen science observations (CIT) and vegetation surveys (SCI) after filtering.**

Each data source—trait measurements, vegetation surveys, and citizen science observations—has strengths and limitations. Here, we present a data integration framework that combines these sources with Earth observation to generate continuous, high-resolution (up to 1 km²) global maps of 31 key plant traits. Our approach integrates environmental data on surface reflectance, climate, soil properties, and vegetation structure with species observations from GBIF, expert vegetation surveys from sPlot, and trait records from TRY to improve spatial coverage and accuracy while quantifying uncertainty and spatial transferability (Fig. 2).

**Fig. 2: Workflow for modeling plant functional traits using citizen science observations, vegetation surveys, trait measurement data, and Earth observation.**

In this work, we predict community-weighted mean trait values globally using three data subsets: (1) scientific vegetation surveys alone (SCI), (2) citizen science observations alone (CIT), and (3) a combined approach (COMB). Using spatial cross-validation with independent vegetation survey data, we assess model performance and geographic generalizability across scales from 1 to 222 km². This multi-resolution modeling strategy allows us to explore resolution-performance trade-offs and identify the optimal spatial scales for specific traits. Our approach achieves unprecedented accuracy for many traits and, to our knowledge, maps several traits at high resolution for the first time. By integrating diverse data sources, we produce the most precise large-scale trait maps to date, significantly improving spatial coverage and predictive reliability. These advances provide a robust foundation for refining ecosystem models and predicting global vegetation dynamics with greater confidence.

Results

Density and extent of citizen science and survey data

The three trait data subsets—vegetation surveys (SCI), citizen science observations (CIT), and their combination (COMB)—exhibited distinct patterns of observation density and geographic coverage. Prior to matching with trait measurements from the TRY trait database, CIT contained 339,971,350 observations of 314,217 species, while SCI consisted of 2,534,183 plots recording 52,942,365 observations of 91,603 species. After matching with TRY, CIT retained 89% of observations (n = 303,288,097) and 28% of species (n = 88,802), while SCI retained 84% of plot-level relative abundance (n = 45,037,375 observations) and 43% of species (n = 39,286). Following spatial aggregation for matching with environmental predictors, CIT (and therefore COMB) displayed markedly greater spatial coverage than SCI alone, though both subsets showed spatial clustering in Europe, North America, Japan, and Australia (Fig. 1).

Global trait maps and spatial transferability

We modeled the global distributions of 31 plant traits (Table 1) across five spatial resolutions: 1 km² (~0.01^∘), 22 km² (~0.2^∘), 55 km² (~0.5^∘), 111 km² (~1^∘), and 222 km² (~2^∘) (Fig. 3). To account for differences in observation density and geographic coverage, models were generated for each of the three trait data subsets: SCI, CIT, and COMB. We used gradient-boosted decision trees, which are well-suited for capturing complex, nonlinear relationships, with environmental predictors aggregated at each corresponding spatial resolution³⁵. Rather than deriving coarser-resolution maps from higher-resolution outputs, we trained separate models at each scale to prevent biases from simple upscaling. This approach yielded a total of 465 models (31 traits × 5 resolutions × 3 trait data subsets).

Fig. 3: Trait maps based on combined citizen science and vegetation survey data (COMB) at 1 km2 resolution. — **Fig. 3: Trait maps based on combined citizen science and vegetation survey data (COMB) at 1 km² resolution.**

Table 1 Plant functional traits used for modeling and trait map generation

Full size table

Final trait maps, along with each map’s corresponding coefficient of variation (COV) and area of applicability (AOA) masks, were generated as GeoTIFF rasters. COMB maps at 1 km² resolution can be viewed online via an interactive map viewer (https://global-traits.projects.earthengine.app/view/global-traits) or downloaded directly (see “Data availability”). Predictions were made for all pixels where sufficient predictor data were present with the exclusion of permanent water bodies. In the interest of transparency and reproducibility for future work, all trait maps are made available. Given the range of model performance, map users are encouraged to carefully consider their specific use cases alongside the provided model performance and uncertainty indicators (COV and AOA). Plant functional type-specific maps are also available for download. Citizen science observations, with their significantly greater geographic extent and sample size, have the potential to enhance vegetation survey data by allowing models to predict with greater confidence in data-deficient regions. To evaluate this, we calculated two metrics for each trait map and model during spatial cross-validation: area of applicability³⁶ and coefficient of variation (see “Methods”) (Fig. 4). For this portion of the analysis, 1 km² maps were primarily considered to ensure comparison at the finest grain size available, though area of applicability was also compared across resolutions.

Fig. 4: Spatial transferability metrics for traits with r ≥ 0.5 at 1 km2 resolution. — **Fig. 4: Spatial transferability metrics for traits with r ≥ 0.5 at 1 km² resolution.**

The applicability of models to new domains, or their AOA, is determined by how similar the new region’s predictors are to those used during the model’s training³⁶. Incorporating citizen science data alongside vegetation surveys increased the AOA for all traits at all resolutions, with an average gain of 2.43 percentage points. The largest improvement was 9.22 for wood fiber length at 1 km² resolution, while the smallest was 0.04 for leaf width at 222 km² resolution. In particular, trait maps at finer resolutions benefited more on average from the inclusion of citizen science trait data, while the benefits became smaller at coarser resolutions (Fig. 4d).

The coefficient of variation (COV) is another indicator of model uncertainty derived by comparing the predictions made by each model during cross-validation. For all models, the COV was consistently lower on average for COMB maps than for SCI maps. In particular, COMB maps showed better generalization across all biomes, including regions where observations are especially sparse for both citizen science observations and vegetation surveys, such as desert, alpine, and wetland regions.

Model performance across trait data subsets at 1 km² resolution

To assess predictive accuracy at high spatial resolution, we compared model performance across the three trait data subsets (SCI, CIT, and COMB) at 1 km² resolution, using spatial cross-validation with independent vegetation survey community-weighted mean (CWM) trait data not employed in model training (Fig. 5a–c). Model performance for all traits across all trait data subsets and resolutions can be found in Table S2.

**Fig. 5: Model performance across trait data subsets.**

SCI and COMB models consistently outperformed CIT models in terms of correlation coefficient (r) and normalized root mean squared error (nRMSE). COMB models maintained similar r and nRMSE values to those trained solely on SCI data, with an average increase in r by 0.12 compared to CIT models. SCI models marginally outperformed COMB models on average, but COMB models consistently demonstrated lower mean coefficients of variation (Fig. 4a, b and Table S2). At 1 km² resolution, SCI models had r ≥ 0.5 for 16 out of 31 traits, COMB models for 15 traits, and CIT models for 6 traits. The traits with the highest predictive accuracy across both COMB and SCI models included leaf N (area), specific leaf area (SLA), rooting depth, stem specific density (SSD), stem conduit diameter, and leaf area. Model prediction error (nRMSE) across biomes (Fig. 5c) was generally lowest for SCI models and only slightly higher for COMB models, while CIT models consistently exhibited higher error across most biome types. Wetland regions displayed a wide range in error for all models. Notably, COMB models displayed robust predictive accuracy across biomes, often outperforming SCI models slightly in data-sparse regions (Fig. S3).

Map accuracy of combined trait data across spatial scales

To evaluate how spatial resolution affected map accuracy, we compared Pearson’s r for COMB models across five resolutions (1–222 km²), where training data were aggregated at each resolution prior to model training to avoid simple upscaling effects (Fig. 5d). Accuracy generally improved with increasing spatial aggregation, with most traits (18 of 31) showing peak performance at 22 km² or 55 km² resolution. Beyond this point, accuracy tended to plateau or slightly decline, suggesting diminishing predictability at coarser scales for many traits. Similar trends were observed for SCI and CIT models (Table S.2).

Comparison with existing trait map products

To benchmark model performance, we compared COMB and CIT predictions with existing global trait maps generated through model-based extrapolations from sparse trait data, though using differing methodologies (Table 2). Three widely studied traits—specific leaf area (SLA), leaf N (mass), and leaf N (area)—were selected due to their broad overlap with prior trait mapping efforts.

Table 2 Comparison of trait map performance with existing products

Full size table

At all resolutions and across all three traits, COMB predictions from this study exhibited stronger correspondence with independent, held-out vegetation survey data than previous trait maps. COMB models consistently produced the highest Pearson correlation coefficients (r), with values as high as 0.68 at 22 km² resolution for leaf N (area). Even at coarser scales, COMB models retained high predictive accuracy, with average r values exceeding those of earlier studies. Notably, citizen science-only models (CIT), which employed no vegetation survey data during training, also demonstrated competitive performance, often outperforming previously published trait maps and achieving the second-highest correlations for most trait-resolution combinations.

Importance of environmental predictors

We assessed the relative importance of each Earth observation predictor to determine which predictor datasets contributed most to model performance (Fig. S5). The predictor datasets included WorldClim bioclimatic variables (n = 6), MODIS land surface reflectance features (n = 72), SoilGrids soil properties (n = 61), VODCA vegetation optical depth features (n = 9), and global canopy height data (n = 2)^{19,20,21,22,37}.

For models trained with the combined dataset (COMB), MODIS surface reflectance emerged as the most influential predictor set overall (mean importance = 0.083), followed by WorldClim bioclimatic variables (0.051), SoilGrids soil properties (0.042), canopy height (0.008), and vegetation optical depth (0.006). Although MODIS, WorldClim, and SoilGrids contributed more strongly to trait predictions on average, the importance of individual predictors varied widely across traits. Notably, the highest single predictor importance was observed for a WorldClim variable (0.139, stem conduit density), followed by MODIS (0.091, specific leaf area), SoilGrids (0.025, rooting depth), canopy height (0.023, plant height), and vegetation optical depth (0.013, leaf N [mass]).

Discussion

Integrating structured vegetation surveys and opportunistic citizen science observations provides a means to expand spatial coverage, reduce extrapolation error, and improve trait predictions at a global scale. While citizen science data inherently contains noise and bias due to unstructured and opportunistic sampling^38,39,40,41, its broad geographic distribution and taxonomic coverage complement the systematic yet spatially limited nature of expert vegetation surveys. Models trained on a combination of these data sources (COMB) achieved predictive power comparable to those trained solely on vegetation surveys (SCI; Fig. 5), while also benefiting from markedly improved spatial transferability (Fig. 4).

Predictive performance was similar between SCI and COMB models, likely due to the down-weighting of citizen science samples in the COMB subset aimed at encouraging the models to prioritize the more structured and reliable patterns from vegetation surveys. In fact, when examined across biomes, COMB models tended to moderately outperform SCI models (Fig. S3). Because R² can be biased by differences in trait variance across biomes, however, we emphasize nRMSE as a more reliable indicator of comparative model error in these cases. The inclusion of citizen science observations also resulted in a consistent reduction in the coefficient of variation and an expanded area of applicability on average and across biomes. This reduction in uncertainty indicates that, while vegetation surveys offer higher-quality trait data, citizen science observations can enhance spatial generalizability and predictive confidence, particularly in regions with limited survey coverage. These benefits were especially pronounced in remote and under-sampled regions, such as alpine/boreal, tropical, and wetland zones, where sparse vegetation surveys limit trait mapping efforts^25,29. Overall, sampling effort significantly influenced model performance across biomes, with COMB models exhibiting reduced variance, lower prediction error, and improved accuracy in regions where citizen science observations substantially exceeded vegetation survey coverage. Given the sensitivity of these ecosystems to climate change, permafrost dynamics, treeline shifts, and associated greening and browning trends, improving the spatial robustness of trait models is increasingly important⁴².

The benefits of integrating multiple data sources were most pronounced at finer spatial resolutions (e.g., 1 km²), where environmental conditions are more homogeneous and local-scale species assemblages are better captured. At these scales, the broader species occurrence data improved spatial coverage, expanding the area of applicability and enabling models to generalize to a wider range of environments. However, at coarser resolutions, the distinction between SCI and COMB models diminished, as spatial aggregation reduced variance in the predictor and trait spaces.

Model accuracy, however, followed a slightly different pattern (Fig. 5d). Some traits—such as leaf area, seed mass, and stem density—exhibited higher predictive accuracy at coarser resolutions, while others plateaued or declined beyond 55 km² resolution. This trend suggests that certain traits may align more strongly with large-scale climatic gradients, while others are driven by localized environmental and ecological factors, as well as land use and local management. These findings emphasize the need for trait-specific assessments of spatial scaling effects to refine predictions across different functional trait dimensions.

Trait maps generated using the combined data approach (COMB) exhibited higher accuracy compared to previous global trait mapping efforts when validated against independent community-weighted mean (CWM) trait data (Table 2). This improvement is likely due to the complementary nature of the input data sources, allowing COMB to leverage the strengths of both vegetation surveys and citizen science observations while mitigating their individual limitations. Notably, models trained exclusively on citizen science data (CIT), which never incorporated vegetation survey data, also outperformed most previously published trait maps. These findings support those of Wolf et al.²⁸, reinforcing the value of crowdsourced species observations in trait mapping. Despite the biases inherent in opportunistic citizen science data, the sheer volume and geographic coverage of these datasets appear to encode meaningful ecological patterns.

These improvements also likely reflect differing trait representation and underlying methodological assumptions and goals. For example, several previous trait mapping efforts have relied primarily on remote sensing proxies or broad-scale climate-trait correlations, which effectively capture dominant vegetation patterns but may not fully represent the trait variation present across plant communities. A key distinction of our approach is its ability to aggregate species-level trait data from multiple sources, allowing for a direct estimation of community trait composition rather than relying on aggregated plant functional type (PFT) classifications. Trait products that integrate discrete land cover or PFT information—particularly those derived from top-of-canopy remote sensing—have been shown to better reflect the traits of the most dominant functional groups within a given region rather than CWM traits²⁷. Here, it was our aim to capture traits across the full vertical structure of plant communities, a perspective particularly relevant for applications in ecosystem process modeling, functional diversity assessments, and dynamic vegetation models^1,12,16.

To model each trait, 150 Earth observation and environmental predictors representing approximately 19 billion observations were utilized, prompting critical consideration of whether such a highly data-intensive approach is justified. Given the scale of these inputs, we calculated predictor importance using feature permutation to quantify the relative influence of a given predictor or predictor set on prediction quality (Fig. S5).

MODIS surface reflectance was, on average, the most influential predictor set overall, followed by bioclimatic variables and soil properties, while canopy height and vegetation optical depth played a smaller role. These results align with prior research demonstrating that many functional traits are primarily structured along climate and, to a lesser extent, soil gradients, while their fine-scale variability may be better captured through optical remote sensing^1,2,43,44. Of course, such trends depend heavily on scale and the trait in question, and individual trait models exhibited distinct responses to different predictor types (Fig. S6). Stem conduit density, for example, was most strongly influenced by a WorldClim bioclimatic variable, specific leaf area by MODIS reflectance, and rooting depth by SoilGrids. Further, canopy height and vegetation optical depth unsurprisingly contributed most to modeling structural traits like plant height and leaf width. These results suggest that no single environmental predictor is universally superior at predicting all plant functional traits and affirm that varied predictor data may be necessary to begin to encode the complexity of the biosphere and its functional diversity.

Despite the advantages of integrating multiple crowdsourced datasets, sampling biases remain a challenge. Vegetation surveys, citizen science data, and trait databases each exhibit geographic clustering and taxonomic gaps due to accessibility constraints, observer behavior, and historical research focus in the global north^38,39,45. In African savannas, for example, grasses—despite being dominant across vast regions—are underrepresented in citizen science observations due to the difficulty of taking meaningful photographs of them, specialized identification requirements, and morphological similarities that complicate species-level classification. Likewise, vegetation surveys often target specific communities of interest rather than providing representative coverage at broader spatial scales. Addressing these biases will require targeted efforts to improve taxonomic and seasonal coverage in trait-mapping initiatives, and the rapid expansion of citizen science projects and increased collaboration among scientists present valuable opportunities to close these gaps in the near future^28,29.

The influence of sampling density on predicted trait values is particularly evident in regions with sharp differences in observation coverage. In Portugal, for example, where CIT observation density is high, but SCI surveys are sparse, trait patterns differ noticeably from those in neighboring Spain, where fewer CIT observations are available. This discrepancy likely stems from the inclusion of multiple country-wide forest inventories from Portugal in the GBIF database, which may introduce a taxonomic bias in favor of woody plants. The resulting contrast in sampling density and methodology across the Iberian Peninsula’s unique environmental conditions may, in turn, confound model prediction, as indicated by the high coefficients of variation observed within Portugal compared to the relatively low variance in Spain and western France. These findings highlight the need for systematic bias corrections⁴¹, such as spatial filtering or weighting schemes, to reduce observation-driven artifacts in trait predictions while also emphasizing the benefits of more comprehensive vegetation surveys.

Expanding citizen science and vegetation survey datasets enhances our understanding of plant community composition, but trait mapping remains constrained by the limited availability of in situ trait measurements. Most species observed in this study lacked corresponding trait records in the TRY database, with only 28% of citizen science observations and 43% of vegetation survey species records successfully matched (though ~85% of all observations still fell within these species). This “Hutchinsonian shortfall” is by no means limited to this study, however, and despite ongoing database expansion and advanced gap-filling techniques, additional strategies are needed to bridge these gaps^25,46,47. Computer vision, for instance, presents a promising opportunity to leverage the vast number of citizen science photographs to infer certain missing traits⁴⁸, not only improving trait coverage but also complementing traditional trait measurement efforts.

Even when species observations are successfully matched to trait databases, the “naive” approach of assigning species-mean trait values overlooks intraspecific trait variation (hereafter referred to as “intraspecific variation") across environmental gradients. While intraspecific variation may contribute less to trait patterns at large spatial scales, it remains a key factor in functional diversity assessments^24,49,50,51. Importantly, the relevance of intraspecific variation varies by trait; for example, plant height exhibits substantial within-species variability, whereas specific leaf area (SLA) tends to be more stable⁵². Approaches such as incorporating in situ trait measurements or local PFT conditions within a given spatial radius may help capture intraspecific variation, though they often limit the number of species included per grid cell^44,53. Future trait-mapping efforts should explore integrating intraspecific variation-aware modeling approaches to improve ecological realism in global trait predictions.

The integration of diverse data sources in trait mapping provides a powerful framework for capturing plant functional diversity. While our approach emphasizes methodological innovation, these advances enable critical ecological applications, such as directly addressing data gaps in trait-based biome classification⁵⁴ and providing the spatially continuous, high-resolution trait maps essential for next-generation process-based vegetation models⁵⁵. As ecological research increasingly demands more traits, higher accuracy, and finer spatial resolution to understand ecosystem responses to global change, such methodological developments become foundational to expanding the horizons of trait-based ecology. Nevertheless, several avenues remain for advancing these methodologies. One major advantage of our approach is its adaptability; beyond linking species observations to mean trait values, it allows for the estimation of trait distributions, including quantiles, ranges, and variance. This flexibility could also be leveraged to better represent the variability within specific plant functional types (PFTs), particularly for groups such as grasses and shrubs, where within-PFT variation is often masked by broad community-level averaging^27,56. Additionally, multi-label inference, an approach that enables models to predict multiple correlated traits simultaneously, could enhance predictive accuracy by leveraging known relationships among traits^57,58,59,60. More extensive refinement of trait-specific predictor selection could also improve model efficiency by emphasizing the most relevant environmental variables for each trait. Trait predictions might be further improved by constraining species observations to those characteristic of specific vegetation formations, or PFTs, to ensure that trait-environment relationships are inferred within ecologically coherent units². Beyond static trait predictions, future research could also focus on modeling key functional diversity (FD) and biodiversity metrics instead of solely community-weighted mean (CWM) traits. Understanding the global distribution of FD at high resolution may shed more light on the debated role of FD in ecosystem resilience and resistance to environmental change^30,61,62,63.

Future advancements in remote sensing technologies offer significant opportunities for refining trait models. The increasing availability and planned expansion of hyperspectral imaging, satellite-derived canopy fluorescence, and high-resolution structural metrics will provide new avenues for detecting fine-scale trait variability^64,65,66. Moreover, incorporating temporally resolved environmental predictors—such as ECMWF Reanalysis in combination with the high temporal resolution of MODIS and other remote sensing missions—could enable the development of multi-temporal trait maps, providing dynamic snapshots of how trait distributions respond to environmental fluctuations^67,68. By leveraging these time-series data, future trait models could better capture phenological shifts, disturbance responses, and other temporal dynamics that influence plant functional traits across different ecosystems. However, while environmental predictors may include high temporal resolution, the more restricted temporal character of trait measurements will likely remain a limiting factor.

This study presents a generalizable framework that integrates crowdsourced biodiversity data with high-resolution Earth observation to tackle longstanding challenges of extrapolation due to data sparseness in global plant trait mapping. By combining expert vegetation surveys, citizen science-driven biodiversity observations, and trait measurement databases, our approach maximizes spatial coverage, trait representation, and predictive accuracy while mitigating individual data source limitations. However, these advancements were only made possible through the collective efforts of the scientific community and the broader public, facilitated by the growing framework of open, accessible, and reusable data. This collaborative, data-driven paradigm represents a fundamental shift in how we approach long-standing challenges in Earth system science. As curated expert data collections as well as crowdsourced biodiversity and trait databases continue to expand and data-sharing initiatives grow, future work should focus on improving bias correction, capturing trait variability, and integrating temporal dynamics to further refine functional trait predictions. By fostering continued collaboration between researchers, institutions, and citizen scientists, we can move toward a more comprehensive and globally representative understanding of plant functional diversity.

Methods

Trait data from the TRY plant trait database

Data for 31 plant functional traits linked to key ecological processes (Table 1) and measured across 74,245 species were retrieved from the gap-filled TRY Plant Trait Database, version 5 (www.try-db.org)^{14,25,69,70,71,72}. The TRY database is an aggregation of trait measurements from individual plants, often with multiple trait measurements per species. Despite providing robust species mean trait values, the collection remains sparse and heterogeneous across environments and plant life stages in some areas, and so gap-filling utilizing Bayesian hierarchical probabilistic matrix factorization was implemented by the maintainers to expand its coverage^47,69. Given the wide variation in functional trait distributions, trait data were transformed using the Yeo–Johnson power transformation to standardize the distributions prior to model training⁷³. Power transformation was performed using the sklearn Python package (v1.4.2).

Citizen science plant observations

Vascular plant species observations were retrieved from the Global Biodiversity Information Facility (GBIF; www.gbif.org) on 10 April 2024. Initial download parameters selected only observations that were members of Tracheophyta, included a geospatial reference, were marked as “present”, and were non-cultivated. Prior to filtering, the GBIF data included 339,971,350 observations of 314,217 plant species from 12,645 datasets across 113 countries⁷⁴. Included in the GBIF data were nearly 100 million observations from popular citizen science initiatives, including more than 31 million observations from the iNaturalist Research-grade Observations dataset and over 24 million observations from Observation.org and Pl@ntNet combined^32,33,75. Importantly, GBIF aggregates observations from more than just popular citizen science applications, but given their predominant presence in the database as well as the large variability among the other data sources, we chose to refer to the totality of GBIF as “citizen science” for ease of readability and accessibility. Before spatial aggregation (see details below), GBIF observations were randomly subsampled at each spatial resolution so that each resulting grid cell contained no fewer than 10 and no more than 500 observations. The lower threshold of 10 observations per cell was chosen as a heuristic compromise to ensure sufficient data density while maintaining broad spatial coverage. We acknowledge this trade-off and note that optimizing such thresholds remains an open area for future research.

Vegetation survey data

Vegetation survey data were retrieved from sPlot, a collation of worldwide vegetation plots including plot-level species relative coverage²⁹. sPlot version 4.0 includes 52,942,365 occurrences of 91,603 species across 2,534,183 plots in 147 countries. Though GBIF observations cover a much larger geographic extent (Fig. 1), sPlot surveys have the advantage of incorporating a more strategic sampling design, providing true absence information, and documenting species relative cover. Therefore, after matching with traits from TRY, sPlot can provide community-level trait statistics, such as community-weighted mean, median, and percentiles. This positions sPlot as a preferred alternative to GBIF in areas of overlapping or missing spatial coverage, as well as a benchmark against which trait extrapolation products can be compared.

Assigning trait values to species observations and surveys

Following the methodology described by Wolf et al.²⁸, we calculated species-level means for the selected traits and assigned them to all matching species observations in both GBIF and sPlot using species name as the key matching criteria. PFT were not used in this assignment process, and were only considered when training models on observations from specific PFT combinations (not used in this study). Although a simple species-level mean does not capture intraspecific variation, it avoids the additional biases introduced by the spatially uneven availability of geo-referenced trait data and remains a practical strategy given current global data limitations; nonetheless, we acknowledge that intraspecific variation plays a crucial role in trait ecology and should be incorporated where feasible in future work. Due to inconsistencies in species name formatting, all species identifiers in each dataset were first truncated to the first two words containing their primary binomial names and matched in a case-insensitive manner. Not all species present in GBIF and sPlot were present in TRY, and so some species were excluded (see “Results”), though similar matching efforts have shown that the species that remained still likely represent plant communities^29,76. For sPlot observations, plot-level community-weighted mean trait values were calculated by multiplying the relative cover of each species by the mean trait from TRY and averaging the resulting weighted values for each plot.

Spatial aggregation of derived trait values

In order to match the derived GBIF and sPlot trait values with gridded Earth observation data, we spatially aggregated both trait subsets. GBIF observation locations and sPlot plot locations were first projected from geographic coordinates into Equal-Area Scalable Earth (EASE) Grids, version 2.0, Global Cylindrical (EPSG: 6933). The GBIF individual trait values and sPlot plot-level community-weighted mean (CWM) trait values were each aggregated into separate raster grids at five spatial resolutions: 1, 22, 55, 111, and 222 km² using the mean. Because different plant traits exhibit spatial variation at different scales, with some traits responding primarily to local environmental conditions while others are driven by broad-scale climatic gradients^77,78, we generated trait training data at multiple resolutions to optimize model performance and assess scale-dependent ecological patterns. This approach enables identification of the optimal spatial scale for different functional traits and supports diverse research applications ranging from fine-scale ecosystem modeling to broad-scale biogeographic analyses. As a result, the GBIF grids represented frequency-weighted mean trait values, which have been shown to approximate CWM traits²⁸, while the sPlot grids contained mean CWM values. Correlations between the sparse sPlot and GBIF grids are presented in Table S3. For the combined trait data subset (COMB), GBIF and sPlot aggregations were merged, with sPlot values taking precedence in grid cells where trait means from both datasets were present. All spatial manipulation was performed using the Python libraries of pandas (v2.2.2), dask (v2024.9.0), xarray (v2024.9.0), and rasterio (v1.4.1).

Gridded Earth observation data

Climate data have been shown to largely drive the global variation in plant traits, and previous studies have successfully predicted plant traits via the environment^{44,53,79,80,81,82}. More specifically, temperature influences photosynthesis by integrating energy availability from solar radiation. Plant productivity, reliant on photosynthesis, is also tied to water availability, making precipitation dynamics a crucial indicator of plant functional configuration⁸³. Bioclimatic variables were extracted from the WorldClim database with a resolution of \({0}^{\circ }\,{10}^{{\prime} }{0}^{{\prime\prime} }\)²². The following bioclimatic variables were selected: annual mean temperature (BIO1), temperature seasonality (BIO4), temperature annual range (BIO7), annual precipitation (BIO12), precipitation of wettest month (BIO13), precipitation of driest month (BIO14), and precipitation seasonality (BIO15). To calculate precipitation annual range (BIO13–14) we subtracted precipitation of driest month (BIO14) from precipitation of wettest month (BIO13) and later discarded them, using only BIO13–14 to explain precipitation variability. As mentioned above, BIO1 and BIO12 are known predictors of plant traits. Likewise, climate variables associated with range and seasonality have also been shown to correlate with plant traits. Therefore, we specifically chose the additional four climate variables (BIO4, BIO7, BIO13–14, BIO15) to capture the annual variations in the site-specific climatic conditions.

Given that plant canopies are configured to interact with light, their optical reflectance properties can inform their functional trait configuration^44,84. When considering the prospect of perpetually updated plant trait models, remote sensing data is especially appealing given its high spatial and temporal availability. For this study, surface reflectances in the visible and infrared spectra were collected from the MODIS MOD09GA v061 (MODIS/Terra Surface Reflectance Daily L2G Global 1 km and 500 m SIN Grid) dataset¹⁹. MODIS was selected due to its more reliable and continuous large-scale coverage when compared to other optical remote sensing missions like Landsat and Sentinel-2. Bands 1–5 (620–670 nm, 841–876 nm, 459–479 nm, 545–565 nm, and 1230–1250 nm, respectively) were used, as well as an NDVI band, which was calculated using bands 1 (red) and 2 (near-infrared). In order to preserve as much spatial and temporal information as possible, surface reflectances for each band were obtained at 1 km² resolution and temporally aggregated into 12 monthly averages spanning 20 years from March 2000 through March 2020, yielding 72 total predictors (12 × 5). These operations were performed using Google Earth Engine.

Because soil properties influence nutrient and water availability-key factors in plant resource acquisition and trait composition—we included high-resolution soil data under the assumption that it could help explain some of the trait variance⁴⁷. Mean soil properties were obtained from the SoilGrids 2.0 dataset—global soil property predictions at six standard depth intervals at a spatial resolution of 250 meters—produced and hosted by the International Soil Reference and Information Centre (ISRIC)²¹. All available soil properties at all depth intervals were used, including pH, soil organic carbon content, bulk density, coarse fragments content, sand content, silt content, clay content, cation exchange capacity, total nitrogen, soil organic carbon density, and soil organic carbon stock.

It is important to note that the predictor datasets used during model training for SoilGrids predictions include MODIS optical observations (middle and near-infrared bands) as well as WorldClim 2.0 bioclimatic variables, and therefore, some collinearity may exist between the predictors used in this study.

Vegetation optical depth (VOD) measures the opacity of vegetation to microwave radiation, indicating the amount and water content of plant biomass across landscapes. In addition to plant water content, VOD can inform ecosystem properties relevant to plant traits such as vegetation density, biomass, and environmental conditions^85,86. Further, VOD can discriminate between vegetation types, a reliable predictor of plant traits^{2,27,44,53,87}. Global, multi-sensor measurements of VOD have been made available as part of the global long-term microwave Vegetation Optical Depth Climate Archive (VODCA)²⁰. We obtained all available measurements in the C, Ku, and X bands from all available years (C-band: 2002–2018; Ku-band: 1987–2017; X-band: 1997–2018) at a native spatial resolution of 0.25^∘ and upsampled all measurements to 1 km² using bilinear interpolation. Bilinear interpolation was utilized over nearest-neighbor upsampling (NN) as NN tended to result in the imprinting of the 0.25^∘ grid cells during model extrapolation. Three temporal aggregations were then computed across each band time series: mean and the 5th and 95th percentiles.

Canopy height not only has an obvious direct relationship with community-level plant height, but has also been shown to be correlated with above-ground traits like stem diameter, stem specific density, and seed mass, as well as below-ground and other hydraulic traits such as stem conductivity and potential root length and mass^57,88,89,90. Additionally, taller plants directly affect the competitive landscape by reducing light availability to below-canopy communities as well as demanding more water and nutrients than smaller individuals due to increased transpiration and photosynthesis^91,92. To represent global canopy height, we utilized the global canopy height and canopy height standard deviation products developed by Lang et al.³⁷ due to their uniquely high spatial resolution compared to other canopy height models.

All predictor datasets, with the exception of VOD, were obtained at resolutions similar to or finer than 1 km², and so each dataset was first reprojected to EASE (EPSG: 6933) for standardized, area-consistent global referencing and then spatially resampled using weighted mean resampling to 1 km² (~0.01^∘), 22 km² (~0.2^∘), 55 km² (~0.5^∘), 111 km² (~1^∘), and 222 km² (~2^∘) grids. As mentioned above, upsampling was required for VOD observations in order to be matched with other predictors and trait grids at 1 km² and 22 km² resolutions. Lastly, water and urban land cover pixels were masked from all predictor maps using ESA WorldCover 10m v100⁹³.

Model training and validation

Prior to model training, environmental predictors and GBIF and sPlot trait data (labels) were matched by pixel coordinates and converted to tabular format. Though our machine-learning approach could tolerate missing values, pixels containing fewer than 60% of the predictors were excluded from the final training sets. Overall, the three subsets of trait data were matched with environmental predictors: vegetation survey trait community-weighted means (CWM) only (SCI), citizen science trait means only (CIT), and a combination of both, where vegetation survey CWMs were preferred when both were present (COMB; Table 3). This approach was repeated to generate training data for models at each resolution.

Table 3 Trait data subsets used for model training

Full size table

After matching by pixel coordinates, models were trained for all traits across each trait data subset using LightGBM gradient-boosting decision trees (Python library v4.5.0)⁹⁴. LightGBM utilizes a suite of boosted decision trees and is capable of capturing complex, nonlinear relationships among high-dimensional tabular data while avoiding overfitting and remaining resource-efficient³⁵. Additionally, LightGBM is capable of reasonably handling missing values by ignoring them during splitting, then allocating them to whichever side reduces loss the most. This was necessary as not all predictors had complete spatial overlap, and so, while observations that had more than 40% missing predictor values were filtered prior to training, some missing predictor values persisted depending on the spatial coverage of each particular trait.

When validating machine learning-based geospatial models, spatial autocorrelation must be considered^95,96. Traditional validation methods such as random K-fold or leave-one-out cross-validation (CV) tend to produce overly optimistic results because they do not account for spatial clustering in training data or discrepancies between training and extrapolation feature spaces^95,97. While probability sampling and design-based inference may be preferable for reference datasets with sufficiently broad and evenly-distributed spatial coverage, spatial K-fold cross-validation (hereafter referred to as “spatial cross-validation") has been shown to be a robust alternative for sparse, spatially clustered datasets⁹⁸. Because spatial cross-validation partitions data into spatially independent folds and evaluates model performance across distinct geographic regions, it provides a geographically explicit and transferability-aware validation framework. When combined with supplementary quality indicators—such as the coefficient of variation and area of applicability (AOA), discussed below—spatial cross-validation provides a more reliable assessment of model generalizability^36,95,99.

To construct spatial cross-validation folds, we first determined the spatial autocorrelation range of sPlot community-weighted mean (CWM) trait values by computing spherical semivariograms at a 1 km² resolution using the pykrige Python library (v1.7.2). We then overlaid a hexagonal grid with cells sized according to this range and randomly assigned each cell to one of five folds (Fig. S1). All trait observations from both sPlot and GBIF were assigned a fold ID based on their respective hexagons. To ensure balanced fold distributions, we ran 100 simulations of random fold assignments and selected the final configuration based on minimizing dissimilarity between folds. Dissimilarity was quantified using Kolmogorov-Smirnov tests, with the chosen fold assignment maximizing the mean p-value, thereby minimizing the likelihood of systematic differences between folds.

Model training and validation were performed using the leave-one-out (LOO) method, in which predictor-trait pairs from a single fold were set aside for model validation while fitting was performed on the remaining data, and the process was repeated for each remaining fold. All trait model variants (SCI, CIT, and COMB) were validated using vegetation survey community-weighted mean trait values from the held-out fold to ensure validation integrity and spatial independence of the validation data. Importantly, spatial cross-validation was not used for model selection or hyperparameter tuning; all model configurations were fixed prior to validation. As such, and consistent with best practices in the machine learning literature, a separate test set was not required^100,101.

Due to the significant imbalance in sample size between GBIF and sPlot pixels after spatial aggregation, samples in the combined trait data subset (COMB) were weighted based on their origin. sPlot vegetation surveys, which represent community-weighted means, were deemed more reliable than citizen science observations. Consequently, weights were assigned to GBIF observations inversely to their proportion relative to sPlot observations in the combined training set. Additionally, feature pruning was performed during model training based on automated sensitivity analysis via permutation feature importance. Features were dynamically removed during model ensemble creation if their removal improved model performance. Because final models were actually an ensemble of child models, not all child models necessarily utilized the same pruned feature set. Finally, models used to produce final trait prediction maps were trained on all available environmental predictor data.

The retrieval of predictor importance, or the weight given to a particular feature during model training, is an important aspect of model interpretability. Feature and dataset importance were calculated using the permutation importance method, in which the values of each feature or set of features (dataset) are randomly shuffled, and model performance is measured¹⁰². The permutation importance is quantified as the performance difference when prediction is performed on data containing the shuffled feature. Because we implemented spatial cross-validation, feature importance was defined as the average feature importance across all CV folds.

Model performance and spatial transferability

Citizen science trait values, given their opportunistic provenance, are noisy and biased³⁹. As mentioned above, we validated all model runs against spatially independent sPlot community-weighted mean trait values that were not used during model training for each of the three trait data subsets (CIT, SCI, and COMB). Model performance was assessed using average normalized root mean squared error (nRMSE) and Pearson’s correlation coefficient r based on the combined observed versus predicted values of all spatial cross-validation runs. Following the standard practice in global trait mapping studies, we used Pearson’s correlation coefficient as the primary metric for comparison rather than R²^27,28. To further assess model robustness, we calculated the pixel-wise coefficient of variation (COV) by generating predictions from each cross-validation model, stacking the results, and computing the COV for each pixel along the vertical stack as the ratio of the standard deviation to the mean prediction. All the above statistics were also assessed across biomes. Biomes were first retrieved from the Terrestrial Ecoregions of the World map and then consolidated from 14 biomes into 7 biome categories¹⁰³. The biome-to-biome-category mapping can be found in the supplementary information (Table S4).

While spatial cross-validation is useful for minimizing spatial dependence on model error estimation, it may not be sufficient for explaining the transferability of model extrapolations in areas where predictor data vary significantly from the reference data^40,97,104. In particular, we were interested in assessing whether the abundance and coverage of citizen science observations impart greater spatial transferability to citizen science-based models (COMB) compared to using less-abundant vegetation surveys alone. To address this, we applied the methods developed by Meyer & Pebesma³⁶ for determining the dissimilarity index (DI) between predictor data used during model training ("train”) and unseen predictor data used for extrapolation ("new”), and thus the area of applicability (AOA) of the final trait maps. Briefly, dissimilarity in the predictor space was calculated by first computing the average minimum distance between observations in each cross-validation fold in “train” (i.e., the minimum distances between points in one fold from points in all other folds), followed by calculating the minimum distances between observations in the “new” data from the training data. The DI was then determined as “new” distances divided by the mean “train” distance. In line with existing literature, we defined the DI threshold as 1.5 × IQR_DI + 75th percentile_DI of the cross-validated training data, with values below the threshold being within the AOA and values above the threshold (outliers) being outside it. DI and AOA were computed for all models at all resolutions³⁶. To quantify whether the inclusion of GBIF data increased the AOA of trait models, we compared the AOA of all models across all training sets and calculated the change in overall extent.

It should be noted that the methods described by Meyer & Pebesma³⁶ do not address the handling of missing values. Because LightGBM can tolerate missing values and we did not do any data imputation prior to training, we had several options when calculating AOA: a) drop all observations in “new” and “train” that contained any missing features (and therefore misrepresent the actual spatial coverage of the training data as well as reduce feature space variance present in DI calculation compared to actual reference data used in training), likely resulting in an overly pessimistic AOA; b) drop features in both the “new” and “train” data that contained any missing values, likely resulting in an overly optimistic AOA as feature space complexity is reduced; or c) a “middle ground” approach of imputing missing values and assuming that the resulting predictor space is still a robust representation of the true reference data. To retain as much of the original signature of the true reference data when calculating dissimilarity, as well as to ensure that final AOA maps matched the geographic extent of the predictions, we chose option “c”. However, we recognize that, given the novelty of this method, there is likely room for improvement. Missing value imputation was performed using the NaNImputer method from the Python “verstack” library (v4.1.4), which fits LightGBM regression models for each feature to fill missing values¹⁰⁵.

We generated continuous trait maps for all traits at all resolutions for the SCI and COMB trait subsets as stacked rasters containing the inference data, COV, and AOA mask. The rasters were written as cloud-optimized GeoTIFFs in EPSG:6933, and the inference and COV bands were quantized to signed 16-bit integer format.

To compare our trait maps with existing products, we used sPlot gridded CWMs as a benchmark, an established method for comparing global trait maps^27,28. Because SCI and COMB models were trained using sPlot data—a distinguishing feature of this study—we used spatial cross-validation to prevent overlap and spatial autocorrelation between training and validation datasets.

Trait maps from nine existing products covering three foliar traits-specific leaf area, leaf nitrogen per area (leaf N [area]), and leaf nitrogen per mass (leaf N [mass])-were selected and spatially aligned with sPlot CWMs at the nearest common resolution (1, 22, 55, 111, or 222 km²). Most products were originally created at a single native resolution, but to enable performance comparisons across scales, they were downsampled to coarser resolutions when necessary. Upsampling to finer resolutions was avoided to prevent unnecessary data manipulation. The final trait values were transformed using a Yeo–Johnson power transform with the lambda values determined in the original transformation of the sPlot gridded CWMs. Pearson’s correlation coefficient r between the transformed trait products and matched sPlot CWMs was then used to describe agreement. The specific foliar traits were selected due to their general commonality between the selected previous studies.

Reporting summary

Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.

Data availability

The Earth observation data used in this study are publicly available through Google Earth Engine (https://earthengine.google.com) or the Google Earth Engine Community Catalog (https://gee-community-catalog.org/). GBIF species occurrence data are available via the dataset citation provided in the ref. ⁷⁵. The TRY gap filled trait data used in this study are available under restricted access due to the data sharing policies of contributing datasets within TRY. Access can be requested through the TRY Plant Trait Database (https://www.try-db.org) following their standard data request procedure. The sPlot vegetation survey data used in this study are available under restricted access to protect the interests of data contributors. Access can be requested by contacting the sPlot consortium through the German Centre for Integrative Biodiversity Research (iDiv) via G.D. or through the sPlot website (https://www.idiv.de/splot). After request processing, data are provided under a data use agreement. The global trait maps generated in this study have been deposited in Zenodo¹⁰⁶. An interactive map viewer is available at https://global-traits.projects.earthengine.app/view/global-traits, and additional study resources can be found at https://planttraits.earth. Users of these maps should consult the coefficient of variation and area of applicability layers, as well as the model performance metrics provided in the raster metadata and Table S2, noting that performance varies across traits and biomes (Figs. 4b, 5c, S3 and Table S4). Source data underlying the figures are available at https://doi.org/10.5281/zenodo.18108765.

Code availability

The code used to process data, train models, and generate trait maps in this study is available at https://github.com/dluks/cit-sci-trait-maps and archived on Zenodo at https://doi.org/10.5281/zenodo.18269445.

References

Bruelheide, H. et al. Global trait-environment relationships of plant communities. Nat. Ecol. Evol. 2, 1906–1917 (2018).
Article PubMed Google Scholar
Kambach, S. et al. Climate-trait relationships exhibit strong habitat specificity in plant communities across Europe. Nat. Commun. 14, 712 (2023).
Article ADS CAS PubMed PubMed Central Google Scholar
Laughlin, D. C. et al. Root traits explain plant species distributions along climatic gradients yet challenge the nature of ecological trade-offs. Nat. Ecol. Evol. 5, 1123–1134 (2021).
Article PubMed Google Scholar
Grime, J. P. Trait convergence and trait divergence in herbaceous plant communities: mechanisms and consequences. J. Veg. Sci. 17, 255–260 (2006).
Article Google Scholar
Reich, P. B. & Oleksyn, J. Global patterns of plant leaf N and P in relation to temperature and latitude. Proc. Natl. Acad. Sci. USA 101, 11001–11006 (2004).
Article ADS CAS PubMed PubMed Central Google Scholar
von Humboldt, A. & Bonpland, A. Ideen zu einer Geographie der Pflanzen nebst einem Naturgemälde der Tropenländer: auf Beobachtungen und Messungen gegründet, welche vom 10ten Grade nördlicher bis zum 10ten Grade südlicher Breite, in den Jahren 1799, 1800, 1801, 1802 und 1803 angestellt worden sind (Cotta, 1807).
Wright, I. J. et al. The worldwide leaf economics spectrum. Nature 428, 821–827 (2004).
Article ADS CAS PubMed Google Scholar
Chapin III, F. S. et al. Consequences of changing biodiversity. Nature 405, 234–242 (2000).
Article Google Scholar
Running, S. W. Ecosystem disturbance, carbon, and climate. Science 321, 652–653 (2008).
Article CAS PubMed Google Scholar
Enquist, B. J. et al. Scaling from traits to ecosystems: developing a general trait driver theory via integrating trait-based and metabolic scaling theories. In (eds Pawar, S., Woodward, G. & Dell, A. I.) Advances in Ecological Research, Vol. 52, Ch 9, 249–318 (Academic Press, 2015).
Isaac, M. E., Gagliardi, S., Ordoñez, J. C. & Sauvadet, M. Shade tree trait diversity and functions in agroforestry systems: a review of which traits matter. J. Appl. Ecol. 61, 1159–1173 (2024).
Article Google Scholar
Sakschewski, B. et al. Resilience of Amazon forests emerges from plant trait diversity. Nat. Clim. Change 6, 1032–1036 (2016).
Article ADS Google Scholar
Lavorel, S. & Garnier, E. Predicting changes in community composition and ecosystem functioning from plant traits: revisiting the Holy Grail. Funct. Ecol. 16, 545–556 (2002).
Article Google Scholar
Díaz, S. et al. The global spectrum of plant form and function: enhanced species-level trait dataset. Sci. Data 9, 755 (2022).
Article PubMed PubMed Central Google Scholar
Fisher, R. A. et al. Vegetation demographics in Earth system models: a review of progress and priorities. Glob. Change Biol. 24, 35–54 (2018).
Article ADS Google Scholar
Scheiter, S., Langan, L. & Higgins, S. I. Next-generation dynamic global vegetation models: learning from community ecology. N. Phytol. 198, 957–969 (2013).
Article Google Scholar
Lausch, A. et al. Linking Earth observation and taxonomic, structural and functional biodiversity: local to ecosystem perspectives. Ecol. Indic. 70, 317–339 (2016).
Article Google Scholar
Homolová, L., Malenovský, Z., Clevers, J. G. P. W., García-Santos, G. & Schaepman, M. E. Review of optical-based remote sensing for plant trait mapping. Ecol. Complex. 15, 1–16 (2013).
Article Google Scholar
Vermote, E. & Wolfe, R. MODIS/Terra Surface Reflectance Daily L2G Global 1km and 500m SIN Grid V061. https://lpdaac.usgs.gov/products/mod09gav061/ (2021).
Moesinger, L. et al. The global long-term microwave vegetation optical depth climate archive (VODCA). Earth Syst. Sci. Data 12, 177–196 (2020).
Article ADS Google Scholar
Poggio, L. et al. SoilGrids 2.0: producing soil information for the globe with quantified spatial uncertainty. SOIL 7, 217–240 (2021).
Article ADS CAS Google Scholar
Fick, S. E. & Hijmans, R. J. WorldClim 2: new 1-km spatial resolution climate surfaces for global land areas. Int. J. Climatol. 37, 4302–4315 (2017).
Article Google Scholar
Cornelissen, J. H. et al. A handbook of protocols for standardised and easy measurement of plant functional traits worldwide. Aust. J. Bot. 51, 335–380 (2003).
Article Google Scholar
Kattge, J. et al. TRY - a global database of plant traits. Glob. Change Biol. 17, 2905–2935 (2011).
Article ADS Google Scholar
Kattge, J. et al. TRY plant trait database - enhanced coverage and open access. Glob. Change Biol. 26, 119–188 (2020).
Article ADS Google Scholar
Enquist, B. J., Condit, R., Peet, R. K., Schildhauer, M. & Thiers, B. M. Cyberinfrastructure for an integrated botanical information network to investigate the ecological impacts of global climate change on plant biodiversity. Preprint at https://peerj.com/preprints/2615v2 (2016).
Dechant, B. et al. Intercomparison of global foliar trait maps reveals fundamental differences and limitations of upscaling approaches. Remote Sens Environ. 311, 114276 (2024).
Wolf, S. et al. Citizen science plant observations encode global trait patterns. Nat. Ecol. Evol. 6, 1850–1859 (2022).
Article PubMed Google Scholar
Bruelheide, H. et al. sPlot - A new tool for global vegetation analyses. J. Veg. Sci. 30, 161–186 (2019).
Article Google Scholar
Hähn, G. J. A. et al. Global decoupling of functional and phylogenetic diversity in plant communities. Nat. Ecol. Evol. 9, 237–248 (2025).
Sabatini, F. M. et al. sPlotOpen - An environmentally balanced, open-access, global dataset of vegetation plots. Glob. Ecol. Biogeogr. 30, 1740–1764 (2021).
Article Google Scholar
Affouard, A. et al. Pl@ntNet automatically identified occurrences https://www.gbif.org/dataset/14d5676a-2c54-4f94-9023-1e8dcd822aa0 (2023).
INaturalist Contributors. iNaturalist Research-grade Observations https://www.gbif.org/dataset/50c9509d-22c7-4a22-a47d-8c48425ef4a7 (2024).
Telenius, A. Biodiversity information goes public: GBIF at your service. Nord. J. Bot. 29, 378–381 (2011).
Article Google Scholar
Cai, L. et al. Global models and predictions of plant diversity based on advanced machine learning techniques. N. Phytol. 237, 1432–1445 (2023).
Article Google Scholar
Meyer, H. & Pebesma, E. Predicting into unknown space? Estimating the area of applicability of spatial prediction models. Methods Ecol. Evol. 12, 1620–1633 (2021).
Article Google Scholar
Lang, N., Jetz, W., Schindler, K. & Wegner, J. D. A high-resolution canopy height model of the Earth. Nat. Ecol. Evol. 7, 1778–1789 (2023).
Article PubMed PubMed Central Google Scholar
Beck, J., Ballesteros-Mejia, L., Nagel, P. & Kitching, I. J. Online solutions and the ‘Wallacean shortfall’: what does GBIF contribute to our knowledge of species’ ranges? Divers. Distrib. 19, 1043–1050 (2013).
Article Google Scholar
Beck, J., Böller, M., Erhardt, A. & Schwanghart, W. Spatial bias in the GBIF database and its effect on modeling species’ geographic distributions. Ecol. Inform. 19, 10–15 (2014).
Article Google Scholar
Meyer, C., Weigelt, P. & Kreft, H. Multidimensional biases, gaps and uncertainties in global plant occurrence information. Ecol. Lett. 19, 992–1006 (2016).
Article PubMed Google Scholar
Di Cecco, G. J. et al. Observing the observers: how participants contribute data to inaturalist and implications for biodiversity science. BioScience 71, 1179–1188 (2021).
Article Google Scholar
Ernakovich, J. G. et al. Predicted responses of arctic and alpine ecosystems to altered seasonality under climate change. Glob. Change Biol. 20, 3256–3269 (2014).
Article ADS Google Scholar
Joswig, J. S. et al. Climatic and soil factors explain the two-dimensional spectrum of global plant trait variation. Nat. Ecol. Evol. 6, 36–50 (2022).
Article PubMed Google Scholar
Moreno-Martínez, A. et al. A methodology to derive global maps of leaf traits using remote sensing and climate data. Remote Sens. Environ. 218, 69–88 (2018).
Article ADS Google Scholar
Valdez, J. et al. Strategies for advancing inclusive biodiversity research through equitable practices and collective responsibility. Conserv. Biol. 38, e14325 (2024).
Article PubMed PubMed Central Google Scholar
Hortal, J. et al. Seven shortfalls that beset large-scale knowledge of biodiversity. Annu. Rev. Ecol. Evol. Syst. 46, 523–549 (2015).
Article Google Scholar
Joswig, J. S. et al. Imputing missing data in plant traits: a guide to improve gap filling. Glob. Ecol. Biogeogr. 32 1395–1408 (2023).
Schiller, C., Schmidtlein, S., Boonman, C., Moreno-Martínez, A. & Kattenborn, T. Deep learning and citizen science enable automated plant trait predictions from photographs. Sci. Rep. 11, 16395 (2021).
Article ADS CAS PubMed PubMed Central Google Scholar
Carlucci, M. B., Debastiani, V. J., Pillar, V. D. & Duarte, L. D. S. Between- and within-species trait variability and the assembly of sapling communities in forest patches. J. Veg. Sci. 26, 21–31 (2015).
Article Google Scholar
Albert, C. H. Intraspecific trait variability matters. J. Veg. Sci. 26, 7–8 (2015).
Article Google Scholar
Albert, C. H., Grassein, F., Schurr, F. M., Vieilledent, G. & Violle, C. When and how should intraspecific variability be considered in trait-based plant ecology? Perspect. Plant Ecol. Evol. Syst. 13, 217–225 (2011).
Article Google Scholar
Seifert, A., Violle, C., Chalmandrier, L. & Wardle, D. A. A global meta analysis of the relative extent of intraspecific trait variation in plant communities. Ecol. Lett. 18, i–ii, 1285–1419 (2015).
Google Scholar
Butler, E. E. et al. Mapping local and global variability in plant trait distributions. Proc. Natl. Acad. Sci. USA 114, E10937–E10946 (2017).
Article ADS CAS PubMed PubMed Central Google Scholar
Scheiter, S., Wolf, S. & Kattenborn, T. Crowd-sourced trait data can be used to delimit global biomes. Biogeosciences 21, 4909–4926 (2024).
Article ADS Google Scholar
Langan, L. et al. Prediction in trait-based ecology: global simulations of specific leaf area using a trait-based dynamic vegetation model. Preprint at https://doi.org/10.1101/2025.04.14.648688 (2025).
Anderegg, L. D. L. et al. Representing plant diversity in land models: an evolutionary approach to make “Functional Types” more functional. Glob. Change Biol. 28, 2541–2554 (2022).
Article ADS CAS Google Scholar
Díaz, S. et al. The global spectrum of plant form and function. Nature 529, 167–171 (2016).
Article ADS PubMed Google Scholar
Li, J. & Prentice, I. C. Global patterns of plant functional traits and their relationships to climate. Commun. Biol. 7, 1–14 (2024).
CAS Google Scholar
Poggiato, G. et al. Predicting combinations of community mean traits using joint modelling. Glob. Ecol. Biogeogr. 32, 1409–1422 (2023).
Article Google Scholar
Yang, Y. et al. Quantifying leaf-trait covariation and its controls across climates and biomes. N. Phytol. 221, 155–168 (2019).
Article CAS Google Scholar
Mori, A. S., Furukawa, T. & Sasaki, T. Response diversity determines the resilience of ecosystems to environmental change. Biol. Rev. 88, 349–364 (2013).
Article PubMed Google Scholar
Oliver, T. H. et al. Biodiversity and resilience of ecosystem functions. Trends Ecol. Evol. 30, 673–684 (2015).
Article PubMed Google Scholar
Smith, T. & Boers, N. Reliability of vegetation resilience estimates depends on biomass density. Nat. Ecol. Evol. 7, 1799–1808 (2023).
Article PubMed PubMed Central Google Scholar
Moreno, J. F. et al. Very high spectral resolution imaging spectroscopy: The Fluorescence Explorer (FLEX) mission. In Proc. 2016 IEEE International Geoscience and Remote Sensing Symposium (IGARSS), 264–267 (IEEE, 2016).
Nieke, J. et al. The Copernicus hyperspectral imaging mission for the environment (CHIME): an overview of its mission, system and planning status. In Proc. Sensors, Systems, and Next-Generation Satellites XXVII, Vol 12729, 21–40 (SPIE, 2023).
Ustin, S. L. & Middleton, E. M. Current and near-term earth-observing environmental satellites, their missions, characteristics, instruments, and applications. Sensors 24, 3488 (2024).
Article ADS PubMed PubMed Central Google Scholar
Hersbach, H. et al. The ERA5 global reanalysis. Q. J. R. Meteorol. Soc. 146, 1999–2049 (2020).
Article ADS Google Scholar
Karger, D. N. et al. Climatologies at high resolution for the Earth’s land surface areas. Sci. Data 4, 170122 (2017).
Article PubMed PubMed Central Google Scholar
Schrodt, F. et al. BHPMF - a hierarchical Bayesian approach to gap-filling and trait prediction for macroecology and functional biogeography. Glob. Ecol. Biogeogr. 24, 1510–1521 (2015).
Article Google Scholar
Grime, J. P. The C-S-R model of primary plant strategies - origins, implications and tests. In (eds Gottlieb, L. D. & Jain, S. K.) Plant Evolutionary Biology, 371–393 (Springer, 1988).
Kong, D. et al. Nonlinearity of root trait relationships and the root economics spectrum. Nat. Commun. 10, 2203 (2019).
Article ADS PubMed PubMed Central Google Scholar
Kramer-Walter, K. R. et al. Root traits are multidimensional: specific root length is independent from root tissue density and the plant economic spectrum. J. Ecol. 104, 1299–1310 (2016).
Article Google Scholar
Yeo, I. & Johnson, R. A. A new family of power transformations to improve normality or symmetry. Biometrika 87, 954–959 (2000).
Article MathSciNet Google Scholar
GBIF.Org User. Occurrence Download https://www.gbif.org/occurrence/download/0136703-240321170329656. 24330515338 (2024).
Observation.Org. Observation.org, Nature data from around the World https://www.gbif.org/dataset/8a863029-f435-446a-821e-275f4f641165 (2024).
Borgy, B. et al. Sensitivity of community-level trait-environment relationships to data representativeness: a test for functional biogeography. Glob. Ecol. Biogeogr. 26, 729–739 (2017).
Article Google Scholar
Carmona, C. P., Bello, F. D., Mason, N. W. H. & Lepš, J. Traits without borders: integrating functional diversity across scales. Trends Ecol. Evol. 31, 382–394 (2016).
Article PubMed Google Scholar
Funk, J. L. et al. Revisiting the Holy Grail: using plant functional traits to understand ecological processes. Biol. Rev. 92, 1156–1173 (2017).
Article PubMed Google Scholar
Boonman, C. C. F. et al. Assessing the reliability of predicted plant trait distributions at the global scale. Glob. Ecol. Biogeogr. 29, 1034–1051 (2020).
Article PubMed PubMed Central Google Scholar
Madani, N. et al. Future global productivity will be affected by plant trait response to climate. Sci. Rep. 8, 2870 (2018).
Article ADS PubMed PubMed Central Google Scholar
Vallicrosa, H. et al. Global maps and factors driving forest foliar elemental composition: the importance of evolutionary history. N. Phytol. 233, 169–181 (2022).
Article CAS Google Scholar
van Bodegom, P. M., Douma, J. C. & Verheijen, L. M. A fully traits-based approach to modeling global vegetation distribution. Proc. Natl. Acad. Sci. USA 111, 13733–13738 (2014).
Article ADS PubMed PubMed Central Google Scholar
Wright, I. J. et al. Modulation of leaf economic traits and trait relationships by climate. Glob. Ecol. Biogeogr. 14, 411–421 (2005).
Article Google Scholar
Kattenborn, T. & Schmidtlein, S. Radiative transfer modelling reveals why canopy reflectance follows function. Sci. Rep. 9, 6541 (2019).
Article ADS PubMed PubMed Central Google Scholar
Frappart, F. et al. Global monitoring of the vegetation dynamics from the vegetation optical depth (VOD): a review. Remote Sens. 12, 2915 (2020).
Article ADS Google Scholar
Tian, F. et al. Remote sensing of vegetation dynamics in drylands: evaluating vegetation optical depth (VOD) using AVHRR NDVI and in situ green biomass data over West African Sahel. Remote Sens. Environ. 177, 265–276 (2016).
Article ADS Google Scholar
Jones, M. O., Jones, L. A., Kimball, J. S. & McDonald, K. C. Satellite passive microwave remote sensing for monitoring global land surface phenology. Remote Sens. Environ. 115, 1102–1114 (2011).
Article ADS Google Scholar
Niklas, K. J. Size-dependent allometry of tree height, diameter and trunk-taper. Ann. Bot. 75, 217–227 (1995).
Article Google Scholar
Liu, H. et al. Hydraulic traits are coordinated with maximum plant height at the global scale. Sci. Adv. 5, eaav1332 (2019).
Article ADS PubMed PubMed Central Google Scholar
Weigelt, A. et al. An integrated framework of plant form and function: the belowground perspective. N. Phytol. 232, 42–59 (2021).
Article Google Scholar
Niinemets, l A review of light interception in plant stands from leaf to canopy in different plant functional types and in species with varying shade tolerance. Ecol. Res. 25, 693–714 (2010).
Article Google Scholar
Niklas, K. J. Maximum plant height and the biophysical factors that limit it. Tree Physiol. 27, 433–440 (2007).
Article PubMed Google Scholar
Zanaga, D. et al. ESA WorldCover 10 m 2021 v200. Zenodo https://doi.org/10.5281/zenodo.7254221 (2022).
Ke, G. et al. LightGBM: a highly efficient gradient boosting decision tree. InProc. Advances in Neural Information Processing Systems, Vol 30 (Curran Associates, Inc., 2017).
Ploton, P. et al. Spatial validation reveals poor predictive performance of large-scale ecological mapping models. Nat. Commun. 11, 4540 (2020).
Article ADS CAS PubMed PubMed Central Google Scholar
Kattenborn, T. et al. Spatially autocorrelated training and validation samples inflate performance assessment of convolutional neural networks. ISPRS Open J. Photogramm. Remote Sens. 5, 100018 (2022).
Article Google Scholar
Meyer, H. & Pebesma, E. Machine learning-based global maps of ecological variables and the challenge of assessing them. Nat. Commun. 13, 2208 (2022).
Article ADS CAS PubMed PubMed Central Google Scholar
Wadoux, A. M.-C., Heuvelink, G. B., De Bruin, S. & Brus, D. J. Spatial cross-validation is not the right way to evaluate map accuracy. Ecol. Model. 457, 109692 (2021).
Article Google Scholar
Pohjankukka, J., Pahikkala, T., Nevalainen, P. & Heikkonen, J. Estimating the prediction performance of spatial models via spatial k-fold cross validation. Int. J. Geogr. Inf. Sci. 31, 2001–2019 (2017).
Article Google Scholar
Chollet, F. & Chollet, F. Deep Learning with Python, Second Edition (Simon and Schuster, 2021).
Géron, A. Hands-On Machine Learning with Scikit-Learn, Keras, and TensorFlow (O’Reilly Media, Inc., 2022).
Altmann, A., Toloşi, L., Sander, O. & Lengauer, T. Permutation importance: a corrected feature importance measure. Bioinformatics 26, 1340–1347 (2010).
Article CAS PubMed Google Scholar
Olson, D. M. et al. Terrestrial ecoregions of the world: a new map of life on Earth: a new global map of terrestrial ecoregions provides an innovative tool for conserving biodiversity. BioScience 51, 933–938 (2001).
Article Google Scholar
Ludwig, M., Moreno-Martinez, A., Hölzel, N., Pebesma, E. & Meyer, H. Assessing and improving the transferability of current global spatial prediction models. Glob. Ecol. Biogeogr. 32, 356–368 (2023).
Article Google Scholar
Zherebtsov, D. https://github.com/danilzherebtsov/verstack (2023).
Lusk, D., Wolf, S., Svidzinska, D. & Kattenborn, T. Global plant trait maps based on crowdsourced biodiversity monitoring and Earth observation - 1 km - All PFTs. Zenodo https://doi.org/10.5281/zenodo.14646322 (2025).

Download references

Acknowledgements

This study was funded by the German Research Foundation (DFG) within the framework of BigPlantSens (Assessing the Synergies of Big Data and Deep Learning for the Remote Sensing of Plant Species; project no. 444524904) and PANOPS (Revealing Earth’s plant functional diversity with citizen science; project no. 504978936). D.L. and T.K. thank the European Space Agency for funding the “FORTRACK” project via the ESA CLIMATE SPACE: Climate-Biodiversity studies. The study is supported by the TRY initiative on plant traits (http://www.try-db.org) and the sPlot consortium (http://www.idiv.de/splot). The TRY initiative and database are hosted, developed, and maintained by J.K. and G. Boenisch (Max Planck Institute for Biogeochemistry, Jena, Germany), currently supported by Future Earth/bioDISCOVERY and the German Centre for Integrative Biodiversity Research Halle-Jena-Leipzig (iDiv, DFG-FZT 118, 202548816). sPlot is a strategic project of iDiv and is supported by the German Research Foundation (DFG-FZT 118, 202548816). F.M.S. gratefully acknowledges the support of the Italian Ministry of University and Research, under the Maria Levi Montalcini programme. S.W. was funded by the German National Research Data Infrastructure for Biodiversity, NFDI4Biodiversity, a DFG project, project no. 442032008 and by the European Space Agency Climate Change Initiative (ESA-CCI) Tipping Elements SIRENE project (contract no. 4000146954/24/I-LR). CFD acknowledges funding by the German Research Foundation (DFG) through CRC 1537 Ecosense and EXC 3127 Future Forests. F.G.’s surveying efforts were funded by the Swiss National Science Foundation Postdoctoral Fellowships (TMPFP2_217531). F.R. acknowledges the support of Coordenação de Aperfeiçoamento de Pessoal de Nível Superior (CAPES) for postdoctoral fellowships (N^∘88887.974741/2024-00). J.A., J. Dolezal, and K. Korznikov were supported by the research grant 25-15727S of the Czech Science Foundation and long-term research development project No. RVO 67985939 of the Institute of Botany of the Czech Academy of Sciences. This work would not have been possible without the contributions of ecologists and vegetation surveyors who diligently sampled field plots, curated datasets, and shared them through accessible databases. We also acknowledge the vital role of citizen scientists engaged in platforms such as iNaturalist and Pl@ntNet, whose time, observations, and local knowledge have been crucial in assembling high-quality, research-ready data.

Funding

Open Access funding enabled and organized by Projekt DEAL.

Author information

Authors and Affiliations

Chair of Sensor-based Geoinformatics (geosense), University of Freiburg, Freiburg, Germany
Daniel Lusk, Kilian Gerberding & Teja Kattenborn
Remote Sensing Centre for Earth System Research, Leipzig University, Leipzig, Germany
Sophie Wolf & Daria Svidzinska
Department of Biometry and Environmental System Analysis, University of Freiburg, Freiburg, Germany
Carsten F. Dormann
German Centre for Integrative Biodiversity Research (iDiv) Halle-Jena-Leipzig, Leipzig, Germany
Jens Kattge, Helge Bruelheide & Gabriella Damasceno
Max Planck Institute for Biogeochemistry, Jena, Germany
Jens Kattge
Institute of Biology/Geobotany and Botanical Garden, Martin Luther University Halle-Wittenberg, Halle, Germany
Helge Bruelheide & Gabriella Damasceno
BIOME Lab, Department of Biological, Geological and Environmental Sciences (BiGeA), Alma Mater Studiorum University of Bologna, Bologna, Italy
Francesco Maria Sabatini, Georg J. A. Hähn & Riccardo Testolin
Image Signal Processing Group, Image Processing Laboratory (IPL), University of Valencia, Paterna, Spain
Álvaro Moreno Martínez
CEFE, CNRS, EPHE, IRD, University of Montpellier, Montpellier, France
Cyrille Violle
Department of Biology, University of Oxford, Oxford, UK
Daniel Hending
Instituto Argentino de Investigaciones de las Zonas Áridas, CONICET, Mendoza, Argentina
Solana Tabeni
Department of Forestry, Mizoram University, Aizawl, India
Shyam Phartyal
Department of Evolutionary Biology and Environmental Studies, University of Zurich, Zurich, Switzerland
Fernando Gonçalves
Biodiversity, Macroecology & Biogeography, University of Göttingen, Göttingen, Germany
Holger Kreft
Palmengarten der Stadt Frankfurt am Main, Frankfurt am Main, Germany
Marco Schmidt
College of Grassland Science, Inner Mongolia Agricultural University, Hohhot, China
Han Chen
Institute for Global Change Biology, and School for Environment and Sustainability, University of Michigan, Ann Arbor, MI, USA
Han Chen
Biology Education, Dokuz Eylül University, Buca, Izmir, Turkey
Behlül Güler
Institute of Botany of the Czech Academy of Sciences, Trebon, Czechia
Jiri Dolezal, Dinesh Thakur, Jan Altman & Kirill Korznikov
Faculty of Science, University of South Bohemia, České Budějovice, Czechia
Jiri Dolezal
Institute of Botany, Faculty of Biology, Jagiellonian University, Kraków, Poland
Remigiusz Pielech
Instituto de Ecología y Ciencias Ambientales, Facultad de Ciencias, Universidad de la República, Montevideo, Uruguay
Anaclara Guido
Centre for Environmental and Climate Science, Lund University, Lund, Sweden
Ciara Dwyer
Sapienza University of Rome, Rome, Italy
Francesca Napoleone
Centre for Research and Conservation, Royal Zoological Society of Antwerp, Antwerp, Belgium
Jacob Willie
Universidade Regional de Blumenau, Blumenau, Brazil
André Luís Gasper
Departamento de Biología (Botánica), Universidad Autónoma de Madrid, Madrid, Spain
Manuel J. Macía
Centro de Investigación en Biodiversidad y Cambio Global (CIBC-UAM), Universidad Autónoma de Madrid, Madrid, Spain
Manuel J. Macía
Department of Botany and Zoology, Faculty of Science, Masaryk University, Brno, Czechia
Milan Chytry
UMR CNRS 7058 “Ecologie et Dynamique des Systèmes Anthropisés” (EDYSAN), Université de Picardie Jules Verne, Amiens, France
Jonathan Lenoir
Institute of Natural Resource Sciences (IUNR), Zurich University of Applied Sciences (ZHAW), Wädenswil, Switzerland
Jürgen Dengler
Institute of Agroecology and Plant Production, Wrocław University of Environmental and Life Sciences, Wrocław, Poland
Sebastian Świerszcz
Faculty of Forestry and Wood Sciences, Czech University of Life Sciences Prague, Prague, Czechia
Jan Altman
Iluka Chair in Vegetation Science and Biogeography, Harry Butler Institute, Murdoch University, Perth, WA, Australia
Ladislav Mucina
Department of Geography and Environmental Studies, Stellenbosch University, Stellenbosch, South Africa
Ladislav Mucina
Department of Plant Biology, Michigan State University, East Lansing, MI, USA
Ashish N. Nerlekar
Program in Ecology, Evolution, and Behavior, Michigan State University, East Lansing, MI, USA
Ashish N. Nerlekar
Korea Advanced Institute of Science and Technology, Daejeon, South Korea
Kaoru Kakinuma
ICFRE- Himalayan Forest Research Institute, Shimla, Himachal Pradesh, India
Pravin Rawat
Faculty of Geotechnical Engineering, University of Zagreb, Zagreb, Croatia
Zvjezdana Stančić
Plant Ecology and Nature Conservation Group, Environmental Sciences Department, Wageningen University and Research, Wageningen, The Netherlands
Mohamed Z. Hatim
Plant Ecology Laboratory (LABEV), UNEMAT, Nova Xavantina, MT, Brazil
Flávio Rodrigues
Faculty of Resource Management, HAWK University of Applied Sciences and Arts, Göttingen, Germany
Jürgen Homeier
Departamento de Botânica, SCB, Universidade Federal do Paraná, Curitiba, Brazil
Marcia C. M. Marques
Manaaki Whenua - Landcare Research, Lincoln, New Zealand
James K. McCarthy
Botany and Microbiology Department, College of Science, King Saud University, Riyadh, Saudi Arabia
M. A. El-Sheikh

Authors

Daniel Lusk
View author publications
Search author on:PubMed Google Scholar
Sophie Wolf
View author publications
Search author on:PubMed Google Scholar
Daria Svidzinska
View author publications
Search author on:PubMed Google Scholar
Carsten F. Dormann
View author publications
Search author on:PubMed Google Scholar
Jens Kattge
View author publications
Search author on:PubMed Google Scholar
Helge Bruelheide
View author publications
Search author on:PubMed Google Scholar
Francesco Maria Sabatini
View author publications
Search author on:PubMed Google Scholar
Gabriella Damasceno
View author publications
Search author on:PubMed Google Scholar
Álvaro Moreno Martínez
View author publications
Search author on:PubMed Google Scholar
Cyrille Violle
View author publications
Search author on:PubMed Google Scholar
Daniel Hending
View author publications
Search author on:PubMed Google Scholar
Georg J. A. Hähn
View author publications
Search author on:PubMed Google Scholar
Solana Tabeni
View author publications
Search author on:PubMed Google Scholar
Shyam Phartyal
View author publications
Search author on:PubMed Google Scholar
Fernando Gonçalves
View author publications
Search author on:PubMed Google Scholar
Holger Kreft
View author publications
Search author on:PubMed Google Scholar
Marco Schmidt
View author publications
Search author on:PubMed Google Scholar
Han Chen
View author publications
Search author on:PubMed Google Scholar
Behlül Güler
View author publications
Search author on:PubMed Google Scholar
Jiri Dolezal
View author publications
Search author on:PubMed Google Scholar
Remigiusz Pielech
View author publications
Search author on:PubMed Google Scholar
Anaclara Guido
View author publications
Search author on:PubMed Google Scholar
Ciara Dwyer
View author publications
Search author on:PubMed Google Scholar
Francesca Napoleone
View author publications
Search author on:PubMed Google Scholar
Jacob Willie
View author publications
Search author on:PubMed Google Scholar
André Luís Gasper
View author publications
Search author on:PubMed Google Scholar
Manuel J. Macía
View author publications
Search author on:PubMed Google Scholar
Milan Chytry
View author publications
Search author on:PubMed Google Scholar
Jonathan Lenoir
View author publications
Search author on:PubMed Google Scholar
Dinesh Thakur
View author publications
Search author on:PubMed Google Scholar
Jürgen Dengler
View author publications
Search author on:PubMed Google Scholar
Sebastian Świerszcz
View author publications
Search author on:PubMed Google Scholar
Jan Altman
View author publications
Search author on:PubMed Google Scholar
Ladislav Mucina
View author publications
Search author on:PubMed Google Scholar
Ashish N. Nerlekar
View author publications
Search author on:PubMed Google Scholar
Kaoru Kakinuma
View author publications
Search author on:PubMed Google Scholar
Pravin Rawat
View author publications
Search author on:PubMed Google Scholar
Zvjezdana Stančić
View author publications
Search author on:PubMed Google Scholar
Riccardo Testolin
View author publications
Search author on:PubMed Google Scholar
Mohamed Z. Hatim
View author publications
Search author on:PubMed Google Scholar
Flávio Rodrigues
View author publications
Search author on:PubMed Google Scholar
Jürgen Homeier
View author publications
Search author on:PubMed Google Scholar
Marcia C. M. Marques
View author publications
Search author on:PubMed Google Scholar
James K. McCarthy
View author publications
Search author on:PubMed Google Scholar
M. A. El-Sheikh
View author publications
Search author on:PubMed Google Scholar
Kirill Korznikov
View author publications
Search author on:PubMed Google Scholar
Kilian Gerberding
View author publications
Search author on:PubMed Google Scholar
Teja Kattenborn
View author publications
Search author on:PubMed Google Scholar

Contributions

D.L. and T.K. conceived the study. D.L. designed the methodology and led the analysis. S.W., D.S., and K.G. contributed to the development of analytical tools and statistical modeling. Data collection and processing were carried out by C.V., D.H., G.J.A.H., S.T., S.P., F.G., H.K., M.S., H.C., B.G., J. Dolezal, R.P., A.G., C.D., F.N., J.W., A.L.G., M.J.M., M.C., J.L., D.T., J. Dengler, S.Ś., J.A., L.M., A.N.N., K. Kakinuma, P.R., Z.S., R.T., M.Z.H., F.R., J.H., M.C.M.M., J.K.M., M.A.E., and K. Korznikov, who also provided critical feedback on data quality and interpretation. Manuscript writing was led by D.L., with substantial contributions from T.K., S.W., D.S., C.F.D., J.K., H.B., F.M.S., G.D., and A.M.M., and all authors reviewed and edited the manuscript.

Corresponding author

Correspondence to Daniel Lusk.

Ethics declarations

Competing interests

The authors declare no competing interests.

Peer review

Peer review information

Nature Communications thanks the anonymous reviewers for their contribution to the peer review of this work. A peer review file is available.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary information

Supplementary Information (download PDF )

Reporting Summary (download PDF )

Transparent Peer Review File (download PDF )

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

Reprints and permissions

About this article

Cite this article

Lusk, D., Wolf, S., Svidzinska, D. et al. Crowdsourced biodiversity monitoring fills gaps in global plant trait mapping. Nat Commun 17, 1203 (2026). https://doi.org/10.1038/s41467-026-68996-y

Download citation

Received: 17 September 2025
Accepted: 20 January 2026
Published: 30 January 2026
Version of record: 30 January 2026
DOI: https://doi.org/10.1038/s41467-026-68996-y

Subjects

Abstract

Similar content being viewed by others

Leveraging remote sensing and crowd-sourced biodiversity data for enhanced plant functional trait mapping

CoRRE Trait Data: A dataset of 17 categorical and continuous traits for 4079 grassland species worldwide

Citizen science plant observations encode global trait patterns

Introduction

Results

Density and extent of citizen science and survey data

Global trait maps and spatial transferability

Model performance across trait data subsets at 1 km2 resolution

Map accuracy of combined trait data across spatial scales

Comparison with existing trait map products

Importance of environmental predictors

Discussion

Methods

Trait data from the TRY plant trait database

Citizen science plant observations

Vegetation survey data

Assigning trait values to species observations and surveys

Spatial aggregation of derived trait values

Gridded Earth observation data

Model training and validation

Model performance and spatial transferability

Reporting summary

Data availability

Code availability

References

Acknowledgements

Funding

Author information

Authors and Affiliations

Contributions

Corresponding author

Ethics declarations

Competing interests

Peer review

Peer review information

Additional information

Supplementary information

Supplementary Information (download PDF )

Reporting Summary (download PDF )

Transparent Peer Review File (download PDF )

Rights and permissions

About this article

Cite this article

Share this article

Search

Quick links

Model performance across trait data subsets at 1 km² resolution