Table 3 Trait data subsets used for model training

From: Crowdsourced biodiversity monitoring fills gaps in global plant trait mapping

| Trait data subset | Source | Trait aggregation type | Training data size after spatial aggregation |
| --- | --- | --- | --- |
| Vegetation surveys (SCI) | sPlot | Community-weighted mean | 430,213 |
| Citizen-science observations (CIT) | GBIF | Frequency-weighted mean | 2,392,987 |
| Combined (COMB) | sPlot + GBIF | Community-weighted mean (preferred) + frequency-weighted mean | 2,646,876 |

  1. Models for all traits were trained for each trait data subset at multiple resolutions and evaluated against held-out, spatially independent community-weighted mean traits using spatial K-fold cross-validation. In the case of the COMB trait data subset, SCI and CIT were merged with a preference for SCI values when both SCI and CIT data were present.
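
The merge rule described in the note can be illustrated with a minimal sketch. It assumes, purely for illustration, that each subset is held as a mapping from a grid-cell index to one aggregated trait value. The names `Cell`, `merge_prefer_sci`, and the sample values are hypothetical and not taken from the source, and this is not necessarily how the authors implemented the merge.

```python
from typing import Dict, Tuple

# Hypothetical grid-cell index (row, col); the source does not specify a layout.
Cell = Tuple[int, int]


def merge_prefer_sci(sci: Dict[Cell, float], cit: Dict[Cell, float]) -> Dict[Cell, float]:
    """Merge per-cell trait values, keeping the SCI value where both exist."""
    merged = dict(cit)   # start from the CIT (frequency-weighted) values
    merged.update(sci)   # SCI (community-weighted) values overwrite shared cells
    return merged


# Made-up example values: cell (0, 1) is present in both subsets.
sci = {(0, 0): 1.2, (0, 1): 0.8}
cit = {(0, 1): 0.5, (1, 1): 2.0}
comb = merge_prefer_sci(sci, cit)
```

Under this rule a cell covered by both subsets contributes a single value to the merged result, which is consistent with the COMB training size in the table being smaller than the sum of the SCI and CIT sizes.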