Background & Summary

Owing to their shared evolutionary history, all living organisms possess a common biochemistry of 28 naturally occurring chemical elements1,2. These elements play essential roles in core biological processes, including the storage of genetic information, metabolic regulation, mechanical support, and protective mechanisms3,4,5. Despite this universal elemental composition, the relative proportions of these elements, referred to as stoichiometry, vary both within and among species3,6. Such stoichiometric variation underlies key functional traits related to resource uptake, assimilation, storage, and release7,8,9, reflecting the evolution of diverse life-history strategies shaped by organismal morphology and function. Furthermore, stoichiometric variation may reflect ecological adaptations to the environmental conditions in which different organisms have evolved3,4,10,11.

The framework of ecological stoichiometry3 has driven extensive research into how organisms acquire, store, and transfer nutrients essential for growth and reproduction across diverse environments. Despite significant advancements, progress has been hindered by the limited availability and synthesis of elemental content data for plants and animals across broad spatial and taxonomic scales, as well as by insufficient integration between ecological and evolutionary mechanisms (but see refs. 6,12). A comprehensive understanding of the spatiotemporal patterns and underlying mechanisms governing stoichiometric diversity is therefore crucial. Beyond explaining observed patterns in organismal elemental content, such insights would enhance our ability to predict how organisms will respond to ongoing and future environmental changes, particularly under intensifying global change drivers.

To address this critical gap, we introduce StoichLife, the first global dataset for biogeographical and macroevolutionary patterns in organismal stoichiometry, developed within the sBIOMAPS working group at iDiv in Germany13. StoichLife provides openly accessible data without restrictions, though we kindly ask users to acknowledge this paper when using the database. The database aims to advance research in ecological stoichiometry, functional biogeography, macroecology, and macroevolution. Additionally, StoichLife remains open to integrating new data and future updates.

Methods

Data compilation

We developed the StoichLife template structure, which contains data pertaining to elemental content and their ratios (%C, %N, %P, C:N, C:P, and N:P) alongside body size measurements, and information regarding sampling locality, country, and taxonomic affiliation (i.e., phylum, class, order, family, species or morphospecies). This template was distributed among prospective data contributors actively engaged in sampling and analyzing plant and animal elemental content across diverse regions worldwide (between 2014 and 2022). As part of this effort, we engaged 24 researchers who contributed 50 datasets on animals (both invertebrates and vertebrates), including 15 previously published datasets14,15,16,17,18,19,20,21,22,23,24,25,26,27,28, one dataset from two sources29,30, and 34 unpublished datasets (details of collection outlined in Supplementary Text 1; refs. 31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49,50,51,52,53,54,55). Additionally, we incorporated data from six large databases used in prior studies, comprising one zooplankton dataset56, 20 aquatic animal datasets57, five datasets for both animals and plants10, three datasets for coral reef macroalgae (Pangaea database58), 80 datasets for green leaves59, and 35 datasets for plants60. Datasets were included based on the following criteria: (i) elemental analyses were conducted on individual organisms under natural conditions, excluding those subjected to experimental manipulations such as nutrient enrichment; (ii) for animals, analyses were performed on whole-body (bulk) tissue, while for plants, stoichiometry was primarily assessed through leaf or shoot elemental composition; and (iii) georeferenced coordinates of sampling sites were available to facilitate spatial analyses.

Data search

To complement the datasets provided by contributors, we conducted a systematic literature review to identify ecological stoichiometry studies published before 2021. Using Clarivate Analytics’ Web of Science Core database, we employed the following search terms: “nutrient content” OR “nutrient composition” OR “elemental content” OR “elemental composition” OR “chemical composition” OR “nitrogen content” OR “nitrogen composition” OR “phosphorus content” OR “phosphorus composition” OR “percent nitrogen” OR “percent phosphorus” OR “N:P” OR “nitrogen-to-phosphorus” OR “ecological stoichiometry”.
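
As a minimal sketch, the boolean query above can be assembled programmatically so that the term list stays easy to audit. The term list is copied from the text; the helper name `build_query` is hypothetical, and the exact field tags used in Web of Science are not given in the source and are therefore omitted.

```python
# Search terms as listed in the text.
SEARCH_TERMS = [
    "nutrient content", "nutrient composition",
    "elemental content", "elemental composition", "chemical composition",
    "nitrogen content", "nitrogen composition",
    "phosphorus content", "phosphorus composition",
    "percent nitrogen", "percent phosphorus",
    "N:P", "nitrogen-to-phosphorus", "ecological stoichiometry",
]


def build_query(terms):
    """Quote each term and join the terms with OR (hypothetical helper)."""
    return " OR ".join(f'"{term}"' for term in terms)


query = build_query(SEARCH_TERMS)
print(query)
```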

This search yielded a total of 2,620 papers, which were further filtered to exclude: (i) microbial data, which typically represent analyses of entire microbial communities rather than individual cells; (ii) studies involving laboratory or field experiments; and (iii) literature reviews and opinion papers. Following this refinement, 110 eligible papers remained. After thoroughly examining these papers, we narrowed the selection down to 33 papers that met our criteria8,41,61,62,63,64,65,66,67,68,69,70,71,72,73,74,75,76,77,78,79,80,81,82,83,84,85,86,87,88,89,90,91. In cases where archived datasets lacked essential details (e.g., species-mean resolution), we contacted authors to request raw data. Additionally, we cross-referenced all identified datasets against Andrieux et al.92, the most comprehensive synthesis study on animal stoichiometry to date. However, no additional datasets meeting our criteria were found through this process.

StoichLife consolidates 227 datasets, comprising an unprecedented 28,049 individual records from both published and unpublished sources13. At the time of data compilation (2022/09/19), 14% of datasets (n = 31) and 26% of records (n = 7,361) were unpublished. StoichLife contains data on plant and animal elemental content (%C, %N, and %P; Fig. 1) from terrestrial, freshwater, and marine realms. Spanning a broad geographical extent from 68°S (Antarctica) to 81°N (Arctic) and from 160°W (Hawaii, USA) to 177°E (New Zealand), StoichLife encompasses a diverse array of taxonomic groups, including plants and animals, across realms (Fig. 1). Despite its inherent limitations, the StoichLife dataset is the most comprehensive compilation of animal and plant data from terrestrial, freshwater, and marine realms across the globe to date (Figs. 1 and 2).
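
The dataset counts reported in the Methods can be tallied as a quick arithmetic check. The sketch below uses only numbers stated in the text; it assumes, as the text does not say explicitly, that each of the 33 papers from the literature search counts as one dataset. Under that assumption the sources sum to the reported 227 datasets.

```python
# Dataset counts by source, as stated in the Methods text.
contributed = {"previously_published": 15, "two_sources": 1, "unpublished": 34}
large_databases = {
    "zooplankton": 1,
    "aquatic_animals": 20,
    "animals_and_plants": 5,
    "coral_reef_macroalgae": 3,
    "green_leaves": 80,
    "plants": 35,
}
# Assumption: one dataset per paper retained from the literature search.
literature_search = 33

n_contributed = sum(contributed.values())
n_databases = sum(large_databases.values())
total_datasets = n_contributed + n_databases + literature_search

# Shares reported as unpublished at the time of compilation.
pct_unpublished_datasets = 100 * 31 / total_datasets
pct_unpublished_records = 100 * 7_361 / 28_049

print(n_contributed, n_databases, literature_search, total_datasets)
print(round(pct_unpublished_datasets), round(pct_unpublished_records))
```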

Fig. 1

Workflow and structure of the StoichLife dataset. The database includes 227 plant and animal elemental content datasets across terrestrial, freshwater, and marine realms. StoichLife comprises 38 variables, including environmental descriptors related to temperature, solar radiation, and environmental N and P availability. These were compiled from the literature using bioinformatic approaches to allow for reproducibility, consistency, and efficiency. Each data type was checked and validated using the R programming language (see Methods - Data processing section).

Fig. 2

Distribution of sampling locations in the StoichLife dataset. The histogram describes the number of records within bin sizes of 4° × 6° (latitude and longitude, respectively).
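
For readers who wish to reproduce a comparable map summary, the following is a minimal sketch of counting records in 4° × 6° (latitude × longitude) cells. The grid origin at 90°S, 180°W and the function names are assumptions, and the coordinates are hypothetical; the authors' own plotting code is not described in the text.

```python
import math
from collections import Counter

LAT_BIN, LON_BIN = 4.0, 6.0  # degrees, as in the Fig. 2 caption
N_ROWS = int(180 / LAT_BIN)  # 45 latitude bands
N_COLS = int(360 / LON_BIN)  # 60 longitude bands


def bin_index(lat, lon):
    """Return (row, col) of the grid cell containing a point.

    Assumes the grid starts at 90 S, 180 W; points on the upper edges
    (90 N or 180 E) are assigned to the last row or column.
    """
    row = min(math.floor((lat + 90.0) / LAT_BIN), N_ROWS - 1)
    col = min(math.floor((lon + 180.0) / LON_BIN), N_COLS - 1)
    return row, col


def count_per_bin(coords):
    """Count records per grid cell from an iterable of (lat, lon) pairs."""
    return Counter(bin_index(lat, lon) for lat, lon in coords)


# Hypothetical coordinates for illustration only.
sample = [(51.3, 12.4), (50.9, 12.1), (-36.8, 174.7), (21.3, -157.8)]
counts = count_per_bin(sample)
print(counts)
```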

Data processing

Dataset checking, cleaning, and formatting

Three distinct data types were processed: quantitative, taxonomic, and spatial. These data underwent rigorous validation and quality assurance procedures using the R software93. The quantitative data include elements such as individual %C and element ratios (e.g., C:N) and individual body mass measurements (dry mass). Elemental content values were verified to represent the percentage of each element in dry body mass, while elemental ratios were checked to ensure they accurately reflected both mass and molar ratios.
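
The distinction between mass and molar ratios can be illustrated with a short sketch. It converts elemental percentages of dry mass into mass and molar ratios using standard atomic masses; the function names and the example organism are hypothetical, and this is not the authors' R code.

```python
# Standard atomic masses (g/mol).
ATOMIC_MASS = {"C": 12.011, "N": 14.007, "P": 30.974}


def mass_ratio(pct_a, pct_b):
    """Ratio of two elemental contents, both given as % of dry mass."""
    return pct_a / pct_b


def molar_ratio(pct_a, pct_b, element_a, element_b):
    """Molar ratio: each percentage is divided by its atomic mass first."""
    moles_a = pct_a / ATOMIC_MASS[element_a]
    moles_b = pct_b / ATOMIC_MASS[element_b]
    return moles_a / moles_b


# Hypothetical organism: 45% C, 9% N, and 1% P of dry mass.
cn_mass = mass_ratio(45.0, 9.0)
cn_molar = molar_ratio(45.0, 9.0, "C", "N")
np_molar = molar_ratio(9.0, 1.0, "N", "P")
print(round(cn_mass, 2), round(cn_molar, 2), round(np_molar, 2))
```

Because carbon is lighter than nitrogen, the molar C:N ratio is larger than the mass C:N ratio by a factor of 14.007/12.011, so mixing the two conventions within one column would bias comparisons.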

The taxonomic data encompass classifications ranging from species and morphospecies to higher taxonomic ranks, including families, orders, classes, phyla, and kingdoms. Data validation involved both automated and manual inspection to correct spelling errors, complete missing taxonomic information where feasible, address ambiguously identified morphospecies (e.g., “Geophilidae,” “Psychodidae sp.1”), and ensure the accuracy of currently accepted names. In cases where two distinct morphospecies with identical names appeared in different datasets (e.g., “Psychodidae sp.1”), we assigned unique identifiers to distinguish them (e.g., “Psychodidae sp.1_A” and “Psychodidae sp.1_B” in the two datasets). Taxonomic affiliations were validated using the Global Biodiversity Information Facility (GBIF), Integrated Taxonomic Information System (ITIS), and Catalogue of Life (COL) databases via the taxadb package94 (version 0.1.5) in R. Additionally, plant taxonomy was verified using the Plants of the World Online website (https://powo.science.kew.org/; Fig. 1).
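
The morphospecies disambiguation step can be sketched as follows. The rule for recognizing a morphospecies (a name ending in "sp." plus an optional number) and the function names are assumptions made for illustration; the text only specifies the resulting "_A" and "_B" suffixes.

```python
import re
from collections import defaultdict
from string import ascii_uppercase


def is_morphospecies(name):
    """Assumed heuristic: the name ends in 'sp.' with an optional number."""
    return re.search(r"\bsp\.\s*\d*$", name) is not None


def disambiguate(records):
    """Append a dataset-specific suffix to morphospecies names shared by
    several datasets. `records` is a list of (dataset_id, taxon_name).
    Assumes at most 26 datasets share one morphospecies name.
    """
    datasets_by_name = defaultdict(list)
    for dataset_id, name in records:
        if dataset_id not in datasets_by_name[name]:
            datasets_by_name[name].append(dataset_id)

    labels = []
    for dataset_id, name in records:
        datasets = datasets_by_name[name]
        if is_morphospecies(name) and len(datasets) > 1:
            suffix = ascii_uppercase[datasets.index(dataset_id)]
            labels.append(f"{name}_{suffix}")
        else:
            labels.append(name)  # true species names are left untouched
    return labels


example = [
    ("dataset_1", "Psychodidae sp.1"),
    ("dataset_2", "Psychodidae sp.1"),
    ("dataset_1", "Salmo salar"),
    ("dataset_2", "Salmo salar"),
]
print(disambiguate(example))
```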

To standardize taxonomic names across datasets, we applied a set of harmonization criteria. First, when missing taxonomic information was identified and multiple synonyms were available, we retained the accepted name based on established taxonomic databases (GBIF, ITIS, and COL). Second, taxonomic synonyms found between original publications and taxonomic databases were standardized to their accepted names. Third, in cases where taxonomic information in original sources aligned with taxonomic information from GBIF, ITIS, or COL, we prioritized GBIF as the primary reference unless substantial discrepancies were found. Finally, when inconsistencies remained unresolved after validation (e.g., differing taxonomic information between the original publication and taxonomic sources), we deferred to the taxonomic information provided by the original publications or data contributors. The StoichLife dataset preserves both the initial taxonomic classifications provided by contributors and the revised taxonomy to maintain transparency and facilitate future updates.
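
One way to express part of these harmonization criteria in code is a simple priority lookup with a fallback to the original name. This is a sketch under stated assumptions: the ordering of ITIS before COL is not given in the text, the "substantial discrepancies" exception is not modeled, and the lookup results are supplied by hand rather than queried from any database.

```python
# GBIF first, as stated in the text; ITIS before COL is an assumption.
PRIORITY = ("GBIF", "ITIS", "COL")


def harmonize(original_name, accepted_by_source):
    """Pick an accepted name by source priority.

    `accepted_by_source` maps a source to its accepted name, or to None
    when that source has no match. When no source resolves the name, the
    name given by the original publication or contributor is retained.
    """
    for source in PRIORITY:
        accepted = accepted_by_source.get(source)
        if accepted:
            return accepted, source
    return original_name, "original"


# Hypothetical lookup results for illustration only.
print(harmonize("Genus oldname", {"GBIF": "Genus newname", "COL": "Genus other"}))
print(harmonize("Genus unresolved", {"GBIF": None, "ITIS": None, "COL": None}))
```

Returning the source alongside the name keeps a record of which rule produced the revised taxonomy, in line with the stated aim of preserving both the original and the revised classifications.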

The spatial data include latitude and longitude coordinates of sampling locations. These coordinates underwent extensive validation through visual inspection, where they were plotted onto a global map to identify and correct any spatial errors. Common discrepancies, such as marine data mistakenly recorded as inland or vice versa, were rectified using geographical details provided in the original publications (Fig. 1). These steps ensured that spatial data were accurate and georeferenced correctly, enhancing their utility for ecological and biogeographical analyses.

StoichLife comprises 38 variables (columns) and 28,049 individual records (rows; Supplementary Table 1). Each record corresponds to at least one measurement of an elemental content or ratio (i.e., %C, %N, %P, C:N, C:P, and N:P) taken at the level of an individual organism. Records in the StoichLife dataset are distributed across 1,120 locations, each associated with both latitude and longitude coordinates (n = 23,290 records; Fig. 2; Supplementary Table 1).

In addition, 3,616 records contain only latitudinal information, while 1,143 records lack spatial information entirely. The dataset spans terrestrial (n = 16,832), freshwater (n = 8,935), and marine (n = 2,282) realms. Most records originate from the northern hemisphere (n = 19,268) compared to the southern hemisphere (n = 7,638; Fig. 2). Geographically, the dataset exhibits extensive coverage in Europe, the Americas, East Asia, and Eastern Australia, with notable gaps observed in Africa (excluding South Africa), the Middle East, Central and Southeast Asia, Western Australia, and Russia (Fig. 2). In the marine realm, Oceania (Western Pacific) and the Eastern Indian Ocean are underrepresented.
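
The record counts reported above are internally consistent, which the following sketch verifies using only numbers from the text. The variable names are illustrative.

```python
TOTAL_RECORDS = 28_049

# Counts as reported in the text.
spatial = {"lat_and_lon": 23_290, "lat_only": 3_616, "no_coordinates": 1_143}
realms = {"terrestrial": 16_832, "freshwater": 8_935, "marine": 2_282}
hemispheres = {"northern": 19_268, "southern": 7_638}

# Every record falls into exactly one spatial category and one realm.
assert sum(spatial.values()) == TOTAL_RECORDS
assert sum(realms.values()) == TOTAL_RECORDS

# A hemisphere can only be assigned to records that have a latitude.
records_with_latitude = spatial["lat_and_lon"] + spatial["lat_only"]
assert sum(hemispheres.values()) == records_with_latitude

realm_share = {k: round(100 * v / TOTAL_RECORDS, 1) for k, v in realms.items()}
print(realm_share)
```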

The revised taxonomy within StoichLife comprises 5,876 species (65.4%) and morphospecies (34.6%), spanning 837 families, 208 orders, 50 classes, 16 phyla, and 2 kingdoms. Animalia is the most extensively documented kingdom (n = 19,664 entries), followed by Plantae (n = 8,385 entries; Fig. 3; Supplementary Table 1). The percentage of taxa identified at the species level is higher in plants (92.4% of all plant taxa) than in animals (27.7% of all animal taxa). Two major classes, Insecta (Animalia) and Magnoliopsida (Plantae), account for approximately 44% of all entries in StoichLife, with 7,205 individual records for Insecta and 5,130 for Magnoliopsida (Fig. 3). Of the two, Magnoliopsida has the higher number of species or morphospecies (n = 2,420), followed by Insecta (n = 1,366) (Fig. 3).

Fig. 3

Number of observations within taxonomic classes. Classes are sorted by decreasing number of observations. Numbers beside bars indicate the number of species or morphospecies within each class. Number of observations is given in the thousands. Classes with <10 observations are not shown (n = 13). Species or morphospecies for which the class was not identified are not shown (n = 600).

However, 30 out of 51 classes contain 50 or fewer individual records. The dataset includes 10,322 individual animal records (1,570 species or morphospecies) and 6,510 individual plant records (3,048 species or morphospecies) from the terrestrial realm. In freshwater habitats, there are 7,882 individual animal records (751 species or morphospecies) and 1,053 individual plant records (206 species or morphospecies). Meanwhile, marine data contain 1,460 individual animal records (133 species or morphospecies) and 822 individual plant records (172 species or morphospecies; primarily algae). Given that certain species inhabit several environments due to their life cycles, habitat classifications were preserved as indicated in the original sources. For example, terrestrial insects with aquatic larval stages were classified according to their primary feeding environment, distinguishing between aquatic larvae and terrestrial adults.

The number of individual records varies by element and stoichiometric ratio, with %N (n = 25,652) being the most common, followed by %C (n = 18,558), C:N (n = 18,373), %P (n = 14,457), N:P (n = 12,244), and C:P (n = 7,092). Only 7,091 individual records (25.3% of all entries) contain values for all three elements (i.e., %C, %N, and %P), with the majority belonging to animals (n = 5,635; 20.1% of all records) rather than plants (n = 1,456; 5.2% of all records). Overall, StoichLife provides a wide range of elemental values and their ratios (Figs. 4 and 5; Supplementary Table 1): %C [0.6–78.1], %N [0.06–19.5], %P [0.004–8.9], C:N molar [2.07–223.3], C:P molar [7.4–9154.7], and N:P molar [0.2–789.4].

Fig. 4

Distribution of elemental content for the 16 phyla included in the database. Figures on top of each boxplot indicate the number of observations. Species or morphospecies for which the phylum was not identified are not shown (n = 7).

Fig. 5

Distribution of elemental ratios for the 16 phyla included in the database. Figures on top of each boxplot indicate the number of observations. Species or morphospecies for which the phylum was not identified are not shown (n = 7). For the sake of visual clarity, values above 100, 3000, and 400 were removed for C:N (n = 8), C:P (n = 17), and N:P (n = 5), respectively.

These values exhibit substantial variations across taxonomic and trophic groups (i.e., phyla; Figs. 4 and 5) and realms (i.e., freshwater, marine, and terrestrial; Fig. 6).

Fig. 6

Distribution of elemental content across realms and trophic groups. For the sake of visual clarity, values above 100, 3000, and 400 were removed for C:N (n = 8), C:P (n = 17), and N:P (n = 5), respectively.

Body mass data, measured as whole-organismal dry mass in g, were recorded for 9,942 individual animals. The two most represented taxa were Arthropoda (n = 6,552) and Chordata (n = 2,868; Fig. 7), with smaller contributions from Mollusca (n = 149), Annelida (n = 132), Nematoda (n = 62), Platyhelminthes (n = 58), Chaetognatha (n = 48), Cnidaria (n = 35), Acanthocephala (n = 27), and Ctenophora (n = 11). Terrestrial animals accounted for 3,885 individual records, while 5,347 originated from freshwater and 710 from marine realms. Body dry mass values ranged from <0.001 g (copepod nauplii) to over 800 g (Salmo salar Linnaeus, 1758; Supplementary Table 1), demonstrating substantial variation across taxa (Fig. 7).

Fig. 7

Distribution of dry body mass across animal phyla. Masses are displayed as body dry masses (g) plotted on a log10 axis. The boxplots within the violin plot show the median, upper, and lower quartile per phylum.

To facilitate investigations into environmental drivers of organism elemental content, StoichLife integrates information on environmental factors such as air temperature and solar radiation, as well as environmental nitrogen and phosphorus availability. Mean annual air temperature at 10 m above ground or sea surface (T10M; °C; 0.5° × 0.5° resolution) and solar radiation data (ALLSKY_SFC_SW_DWN: All Sky Insolation Incident on a Horizontal Surface; W/m²; 1° × 1° resolution) for each sampling site were extracted from the National Aeronautics and Space Administration Prediction of Worldwide Energy Resources project (NASA POWER; https://power.larc.nasa.gov). Both temperature and solar data have temporal coverage from 1981 to 2022/09.

Global nitrogen (N) availability at each sampling site, represented by inorganic N deposition (kg N/km²/year; resolution of 2° × 2.5°), was retrieved from a published database95 covering the period from 1984 to 2016. Soil phosphorus (P) data (labile P; g P/m²; resolution of 0.5° × 0.5°) were extracted from the Oak Ridge National Laboratory Distributed Active Archive Center for Biogeochemical Dynamics (ORNL DAAC; https://daac.ornl.gov). While these soil P data lack specific temporal coverage, they are nominally representative of pre-industrial conditions ca. 1850. Additionally, marine total P data (sea surface P measurement; micromoles P/kg; resolution of 1° × 1°) were sourced from the National Center for Atmospheric Research World Ocean Atlas (NCAR WOA; https://climatedataguide.ucar.edu; WOA13), covering the period from 1955 to 2012. To ensure consistency across datasets, we computed the average values for each of these four environmental factors across the respective temporal windows. This approach was chosen for several reasons: (1) our primary objective was to examine spatial rather than temporal variations in organismal elemental content; (2) samples were collected at different times by various researchers, and specific sampling dates were often unavailable; and (3) the environmental data were measured or modeled over different timeframes, making direct temporal alignment impractical.
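
The temporal averaging described above amounts to taking, for each site and variable, the mean over all available values within that variable's temporal window. A minimal sketch follows; the yearly data layout, the function name `site_mean`, and the example values are hypothetical, and no data are downloaded from NASA POWER or any other provider.

```python
from statistics import mean


def site_mean(series, start_year, end_year):
    """Average the values of one variable at one site over a window.

    `series` maps year -> value (None for missing years). Returns None
    when no value is available within the window.
    """
    values = [
        value
        for year, value in series.items()
        if start_year <= year <= end_year and value is not None
    ]
    return mean(values) if values else None


# Hypothetical yearly mean air temperatures (deg C) at one site.
t10m = {1981: 8.9, 1982: 9.1, 1983: None, 1984: 9.4}
print(site_mean(t10m, 1981, 2022))
```

Collapsing each series to a single value per site gives up temporal resolution, which matches the stated focus on spatial rather than temporal variation.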

Data Record

The dataset is provided as an Excel file at Dryad13, which includes key information such as elemental content in % of dry mass, spatial coordinates, the biological level of organization (e.g., organ- or individual-level measurement, mean per species or population), taxonomic information (e.g., from species or morphospecies to kingdom), realm identity, and the ontogenetic stage when possible (e.g., larvae, juvenile, adult, seed, sprout). Additionally, the dataset incorporates any other relevant information provided by contributors. A metadata file ‘StoichLife_metadata.xlsx’ accompanies the dataset, detailing column headings, measurement units, and descriptions of numerical variables. This file also includes a comprehensive list of data sources used in compiling the database, ensuring full transparency and reproducibility.

Technical Validation

Each dataset underwent thorough inspection using R (version 4.0.0, 2020-04-24)93 to ensure data integrity and consistency. We generated histograms and estimated value ranges for each element and elemental ratio to detect extreme values, potential outliers, or measurement errors. In most cases, extreme values were retained unless they were deemed implausible based on clear indications of measurement errors. Only four values were excluded due to implausibility: one C value exceeding 80%, one N value below 0.01%, and two P values exceeding 20%. Importantly, original datasets with missing information were not excluded from the StoichLife dataset. Some datasets contain missing data for specific variables, such as sampling location (specifically longitude), body mass, and trophic group (Supplementary Table 1). These gaps are retained to preserve data availability and allow future users to apply imputation or additional validation as needed.
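
Users who wish to repeat a comparable screening on their own additions could start from a simple range check such as the sketch below. The bounds are inferred from the four excluded values described above and are not thresholds published by the authors; the record layout and function name are hypothetical.

```python
# Illustrative plausibility bounds (% of dry mass), inferred from the four
# excluded values in the text. They are assumptions, not published rules.
BOUNDS = {"C": (0.0, 80.0), "N": (0.01, 100.0), "P": (0.0, 20.0)}


def flag_implausible(record):
    """Return the elements whose values fall outside the assumed bounds.

    Missing values (None or absent keys) are skipped, mirroring the choice
    to retain records with gaps rather than exclude them.
    """
    flags = []
    for element, (low, high) in BOUNDS.items():
        value = record.get(element)
        if value is None:
            continue
        if value < low or value > high:
            flags.append(element)
    return flags


# Hypothetical records for illustration only.
print(flag_implausible({"C": 82.0, "N": 9.0, "P": None}))
print(flag_implausible({"C": 45.0, "N": 0.005, "P": 25.0}))
print(flag_implausible({"C": 78.1, "N": 19.5, "P": 8.9}))
```

Flagged values would still need manual review, since the text indicates that extreme values were retained unless there were clear signs of measurement error.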