Background & Summary

In the context of intensifying global environmental change, terrestrial C cycling processes are being substantially altered1,2. Key environmental drivers, including elevated atmospheric CO2 concentrations, enhanced nutrient deposition, and climate warming, significantly influence vegetation growth, functional traits, and ecosystem C sequestration capacity3,4,5,6,7,8. As the dominant component of terrestrial C storage, forest ecosystems play a pivotal role in regulating global C balance and mitigating climate change9,10,11. Leaves, as the primary photosynthetic organs in forest plants, assimilate C through CO2 fixation, while N and P are essential for supporting fundamental metabolic processes, including chlorophyll synthesis, protein formation, and energy transfer—all critical for photosynthetic function12,13,14. The tightly coupled relationships between leaf C, N, and P reflect not only plant adaptation strategies to environmental variability15,16 but also directly modulate photosynthetic performance. These stoichiometric relationships serve as vital ecological indicators, connecting C and nutrient cycling processes while enabling predictions of nutrient use efficiency and ecosystem functioning17,18,19,20. Importantly, under elevated atmospheric CO2 concentrations, ecosystem productivity and C storage capacity become increasingly constrained by N and P availability, particularly through their limiting effects on plant nutrient acquisition and utilization efficiency21,22,23,24. Despite this critical relationship, systematic long-term measurements of leaf C-N-P stoichiometry across forest ecosystems remain notably limited. This significant data gap hinders both (1) the mechanistic understanding of nutrient dynamics and (2) quantitative assessment of forest C cycling processes and their feedback to climate change. 
Therefore, establishing a standardized, temporally explicit dataset of leaf C-N-P concentrations from representative forests is imperative—both for elucidating plant functional responses to global change3,6,25 and for parameterizing terrestrial ecosystem models with ecologically realistic nutrient constraints26,27,28.

China’s forest ecosystems constitute a vital component of the global C cycle, characterized by extensive spatial distribution, high biodiversity, and significant C storage potential, which are critical for maintaining regional ecological security29,30. However, considerable uncertainties persist in estimates of forest C sequestration and ecosystem productivity in China, partly because of the limited availability of dynamic data on leaf nutrient concentrations and the lack of systematic quantification of nutrient limitations31,32. In terrestrial ecosystems, C assimilation rates and net primary productivity (NPP) are fundamentally constrained by nutrient availability, with N and P playing particularly critical roles. C assimilation efficiency is enhanced only when the stoichiometric nutrient requirements of plants are met33,34. As the essential component of RuBisCO and the electron transport system, leaf N directly determines the maximum carboxylation rate and photosynthetic nitrogen use efficiency. Meanwhile, leaf P governs ATP production and membrane stability, thereby controlling photosynthetic phosphorus use efficiency while simultaneously modulating the sensitivity of photosynthetic response to N availability35,36. In the context of global change, comprehensive long-term data on the variations of C, N, and P concentrations in forest plant leaves across China’s diverse ecosystems remain limited.

To address these fundamental data gaps, we compiled the Leaf Carbon-Nitrogen-Phosphorus Concentrations in China’s Forests (CNP−China) dataset through standardized monitoring conducted by the Chinese Ecosystem Research Network (CERN) across four observation years (2005, 2010, 2015, and 2020). The dataset covers 52 dominant tree species from China’s tropical, subtropical, and temperate forest ecosystems, with the leaf C, N, and P concentrations (LeafCNP) data as the central component, supplemented by three auxiliary tables: Plot Information, Species Information, and Soil Properties (Fig. 1). The Plot Information file provides geographic and environmental details for each forest plot; the Species Information file offers taxonomic and identification information for each species; and the Soil Properties file contains corresponding soil data, including total N, P, and organic matter concentrations. All samples were collected and analyzed following CERN protocols to ensure cross-year and cross-site comparability. The dataset employs a relational structure with unique Site ID, Plot ID, and Species ID identifiers for integration across tables. Quality control procedures, including unit standardization, logical error correction, and outlier removal, were implemented before finalizing the data in structured CSV format to facilitate data sharing, reanalysis, and model integration.

Fig. 1

Composition structure of the CNP–China dataset. Overview of the dataset’s structure, including one main data table (LeafCNP) and three supplementary files: Plot Information, Species Information, and Soil Properties.

The CNP−China dataset provides essential support for studying the spatio-temporal variation of forest stoichiometry, nutrient adaptation strategies, and plant responses to global environmental change. Its standardized structure and scalable design facilitate fundamental research on ecological stoichiometry and biogeographic patterns, significantly advancing our understanding of ecosystem-level C-N-P interactions. Furthermore, CNP−China can serve as input parameters or validation data in ecosystem C cycle models, supporting the optimization and improvement of carbon-nutrient coupling simulations. By sharing this dataset, we aim to support global C cycle research in several areas: (1) exploring the relationship between forest ecosystem C sink and nutrient limitations; (2) assessing how leaf–soil C:N:P ratios regulate forest productivity; (3) optimizing parameter settings for C-N-P interactions in ecosystem models; and (4) providing decision support for forest C sink management in the context of climate change. CNP−China can be integrated with eddy covariance data and ecosystem models to enhance understanding of C sink capacity and feedback in China’s forests and beyond.

Methods

Site selection

The CNP−China dataset is derived from long-term monitoring data collected at 10 forest ecosystem sites of CERN. These sites span a wide latitudinal gradient, from tropical and subtropical to temperate regions of China (Fig. 2), ensuring strong representativeness and systematic coverage37,38. These sites—Xishuangbanna Forest (BNF), Ailao Mountain Forest (ALF), Dinghu Mountain Forest (DHF), Heshan Forest (HSF), Huitong Forest (HTF with two plots: HTF01 and HTF02), Gongga Mountain Forest (GGF), Shennongjia Forest (SNF), Maoxian Forest (MXF), Beijing Forest (BJF), and Changbai Mountain Forest (CBF)—collectively represent China’s major forest ecosystems, including tropical monsoon rainforests, subtropical evergreen broadleaf forests, montane mixed forests, and temperate deciduous broadleaf forests. The 11 permanently established plots (one per site except HTF) are maintained under strict protection to ensure minimal anthropogenic disturbance, thereby guaranteeing data continuity and ecological validity. This systematic spatial design captures the variability in foliar nutrient characteristics across China’s diverse climatic and vegetation gradients, providing a robust representation of national-scale forest stoichiometric patterns. The CNP–China dataset compiles leaf and soil nutrient measurements from these forest plots, including both publicly released datasets and unpublished data obtained with permission from the corresponding management institutions. Details of dataset availability and citation are provided in the Data Records section.

Fig. 2

Spatial distribution of forest observation plots in the CNP–China dataset. Black dots show 11 long-term forest plots across 10 CERN sites spanning China’s major forest biomes. Green shading indicates national forest cover. Site abbreviations are defined in the main text.

Species identification and taxonomic standardization

From 2005 to 2020, CERN conducted long-term monitoring of dominant tree species in typical forest plots across China. The selected species typically have high importance values within their communities, with significant ecological functions and regional representativeness, making them suitable for reflecting vegetation characteristics and ecological processes across different forest types. During field surveys and sample collection, the dominant species were identified by experienced local researchers, who recorded the Chinese names of the species.

To ensure consistency across different years and plots, we performed systematic organization and standardization of species nomenclature. Chinese names were verified mainly using the online platforms of iPlant (https://www.iplant.cn/) and Flora of China (http://www.efloras.org/flora_page.aspx?flora_id=2), ensuring the accuracy of the original records. In cases of name confusion or unclear expressions, we prioritized correcting Chinese names to their authoritative scientific names. The Latin names were standardized using The World Flora Online (WFO) Plant List (https://wfoplantlist.org/), the internationally recognized authority for plant taxonomic verification. For taxonomically disputed species or those with revised nomenclature, we consistently applied the most current and widely recognized scientific names to maintain the dataset’s taxonomic rigor and scholarly credibility. Additionally, the Species Information file includes the Chinese name, leaf type (e.g., needle or broad), and leaf phenology type (evergreen or deciduous) for each species, providing ecological context and facilitating classification analyses and trait-based research.

Leaf sampling and nutrient concentration analysis

From 2005 to 2020, CERN surveyed dominant tree species in typical forest ecosystems across China at five-year intervals to obtain long-term, repeated plant nutrient data. Because leaf element concentrations (especially C, N, and P) are relatively stable between late July and early August each year, sampling was concentrated in this window to ensure data comparability and representativeness. Within each forest plot, three to five mature, healthy individuals of each dominant tree species were selected for leaf sampling. To represent the average nutritional status of the tree canopy, the crown of each tree was divided into three vertical layers (upper, middle, and lower), and leaves were collected proportionally from each layer. These samples were then combined to form a composite leaf sample for each individual tree. The fresh weight of each composite sample was approximately 500 g, which adequately represented the overall leaf characteristics of a single plant. Owing to logistical constraints or field conditions, the actual number of replicates occasionally varied across years, sites, or species. The selection of dominant species in each sampling year was guided by the concurrent forest inventory. As species dominance naturally shifts over time due to ecological succession or environmental changes, some species were only sampled in specific years. All collected leaf samples were placed in sealed bags with unique identification labels on-site and transported to the laboratory under refrigerated conditions for subsequent preprocessing and nutrient analysis.

Sample preparation followed a standardized procedure. Fresh leaves were first thoroughly rinsed with clean water to remove surface dust and contaminants. To inactivate enzymes and prevent post-harvest metabolic activity, the samples were briefly dried at 105 °C. The drying temperature was then reduced to 65 °C, and the samples were oven-dried to a constant weight to ensure complete moisture removal and chemical stability. Once dried, the leaf samples were ground and thoroughly mixed to obtain a homogeneous powder, which was used for subsequent nutrient analysis. The total C and N concentrations of the leaves were measured using an elemental analyzer, while the total P concentration was determined using an inductively coupled plasma optical emission spectrometer (ICP-OES). All element concentrations were expressed on a dry-weight basis in g kg−1 to ensure data consistency and comparability.

Soil collection and nutrient analysis

Soil samples were collected between August and November of each observation year to minimize the influence of extreme weather and ensure soil conditions remained relatively stable. Within each typical forest plot, the area was first divided into several appropriately sized sampling units, from which a subset was randomly selected as the actual sampling units. Surface soil samples were generally collected from at least three replicate units per plot and year, with the exact number varying depending on local conditions. Each sampling unit was treated as an independent replicate. Within the sampling unit, 10 to 20 sampling points were established based on a “W” or “S” shaped pattern. Surface soil samples (0–20 cm) were collected using a soil auger or corer, and samples from all points within a unit were thoroughly mixed to create a composite sample. To meet analytical requirements, the total mass of each composite sample was no less than 500 grams. After collection, visible plant residues, large stones, and other impurities were manually removed. Samples were then air-dried under ventilated conditions or oven-dried at low temperatures until reaching a constant weight. Dried samples were thoroughly ground and passed through a 0.2 mm sieve to obtain a homogeneous fine powder for subsequent laboratory analyses. Soil total nitrogen (N) concentration was determined using either the Kjeldahl method or an elemental analyzer. Among these, the Kjeldahl method was more commonly applied, in accordance with CERN’s routine soil monitoring practices. Total phosphorus (P) concentration was measured after digestion with sulfuric and perchloric acids, followed by molybdenum–antimony colorimetric analysis. Soil organic matter concentration was measured using the potassium dichromate oxidation-heating method. All analyses were performed following standardized protocols to ensure the accuracy and comparability of the results.

Data processing

During data processing (Fig. 3), we implemented a systematic workflow to ensure data quality and consistency. We first collected and integrated the raw data from 11 typical forest plots, encompassing leaf and soil nutrient measurements across four sampling periods (2005, 2010, 2015, and 2020). To ensure consistency and comparability across years, plots, and experimental batches, all measurement units were standardized and corrected. Our quality control protocol involved multiple validation steps. Initially, we identified and removed records with obvious entry errors (e.g., abnormal formats), inconsistent unit labels, or logical inconsistencies (e.g., negative concentration values). For repeated measurements, we applied statistical filtering techniques, excluding outliers that exceeded the threshold of mean ± 3 × standard deviation (SD) to improve data robustness while retaining natural variability. All processed data were subsequently organized into structured CSV files to optimize accessibility, readability, reusability, and interoperability for downstream analyses and collaborative research. This structured data curation process ensures that the dataset is systematically organized, transparent, and suitable for open access and reuse.
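The logical check and replicate-level filtering described above can be illustrated with a minimal Python sketch. It assumes pandas is available. The column names (`Plot ID`, `Species ID`, `Year`, `Leaf_N_g_kg`), the grouping level, and the example values are hypothetical; they are not taken from the released files or from the processing scripts actually used for the dataset.

```python
import pandas as pd

def filter_replicates(df, value_col, group_cols, k=3.0):
    """Remove negative values, then replicates outside mean ± k*SD of their group."""
    # Step 1: logical check, concentrations cannot be negative.
    df = df[df[value_col] >= 0].copy()
    # Step 2: per-group mean and sample SD (ddof=1), broadcast back to each row.
    grouped = df.groupby(group_cols)[value_col]
    mean = grouped.transform("mean")
    sd = grouped.transform("std")
    # Keep rows within k SDs; single-value groups (SD is NaN) are kept unchanged.
    within = (df[value_col] - mean).abs() <= k * sd
    return df[within | sd.isna()].reset_index(drop=True)

# Hypothetical replicate values (g kg-1): eleven plausible readings,
# one suspiciously high reading, and one impossible negative entry.
values = [19.0, 20.0, 21.0] * 3 + [20.0, 20.0, 60.0, -1.0]
raw = pd.DataFrame({
    "Plot ID": "PLOT_A",      # hypothetical identifier
    "Species ID": "SP001",    # hypothetical identifier
    "Year": 2010,
    "Leaf_N_g_kg": values,
})

clean = filter_replicates(raw, "Leaf_N_g_kg", ["Plot ID", "Species ID", "Year"])
print(len(raw), "->", len(clean))
```

One property is worth keeping in mind when applying such a rule: with the sample SD, no value can lie more than (n − 1)/√n SDs from its group mean, so a 3 SD criterion can only exclude a value when the group contains at least 11 observations. The grouping level chosen here is an assumption and would need to match the one used in the original processing.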

Fig. 3

Data processing workflow for the CNP–China dataset. Field observations from 11 forest plots and four time points were standardized through data integration, format harmonization, and quality control procedures, resulting in a curated dataset for analysis.

Data Records

The full dataset described in this Data Descriptor, titled “Leaf Carbon-Nitrogen-Phosphorus Concentrations in China’s Forests (CNP−China)”, has been archived in the Zenodo repository39 and is available at https://doi.org/10.5281/zenodo.15335705. The CNP−China dataset includes nutrient measurements of canopy tree leaves and surface soils collected in 2005, 2010, 2015, and 2020 from 11 typical forest ecosystem plots, along with associated plot and species information. The data are presented in standardized CSV format and organized into four interconnected files: leaf carbon, nitrogen, and phosphorus concentrations (LeafCNP); basic plot information (Plot Information); dominant tree species information (Species Information); and surface soil nutrient properties (Soil Properties).

The primary data table, LeafCNP, documents the concentrations of total C, N, and P in leaf samples from dominant tree species across multiple forest plots and observation years. This core component characterizes the temporal dynamics of tree species’ nutrient status within diverse forest ecosystems. Each entry represents a composite leaf sample and includes metadata on site, plot, species, sampling year, and sample code, together with the measured C, N, and P concentrations.

To enhance the ecological interpretability and extend the application potential of the dataset, three supplementary data tables are provided. The Plot Information table records the geographic location (latitude, longitude, and elevation), ecosystem type, and climatic variables (long-term mean annual temperature and precipitation) for each typical forest plot. These plot-level metadata support the environmental interpretation of leaf and soil nutrient patterns and are particularly valuable for cross-regional comparisons and climate gradient studies. The Species Information table provides taxonomic details for the dominant tree species observed at each plot, including family, genus, Latin species name, and Chinese common name. It also records key functional traits, namely leaf type (needle or broad) and leaf phenology (evergreen or deciduous), supporting trait-based comparative analyses. The Soil Properties table documents surface soil concentrations of organic matter, total N, and total P for each plot, offering context for assessing the relationships between plant nutrient status and soil nutrient availability. Detailed descriptions of all data fields are provided in Tables 1–4.

The four data tables are relationally linked via the site code (Site ID) and plot code (Plot ID). Additionally, the LeafCNP and Species Information tables are cross-referenced using the species code (Species ID). Field measurements were obtained from 10 long-term forest observation sites affiliated with CERN. Among these, five sites (BNF, DHF, ALF, HTF, and CBF) have made their original datasets publicly available through the National Ecosystem Science Data Center. These datasets have been assigned DOIs and are cited in the References section40,41,42,43,44,45,46,47,48,49. The remaining five sites (HSF, GGF, SNF, MXF, and BJF) have not released their raw data via public repositories; however, the measurements were conducted following standardized CERN monitoring protocols. Reuse and integration of these data into the CNP–China dataset are compliant with CERN’s official data policy. This dataset therefore represents the first comprehensive and openly accessible release of these data under a permanent DOI.
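The relational links described above can be sketched with pandas as follows. Only the key names `Site ID`, `Plot ID`, and `Species ID` come from the text; every other column name, the identifier formats, and all values are hypothetical placeholders, and the sketch assumes that the Plot Information and Species Information tables hold one row per plot and per species, respectively.

```python
import pandas as pd

# Hypothetical stand-ins for the LeafCNP, Plot Information and
# Species Information tables (in practice these would be read from the CSV files).
leaf = pd.DataFrame({
    "Site ID": ["SITE_A", "SITE_A", "SITE_B"],
    "Plot ID": ["SITE_A01", "SITE_A01", "SITE_B01"],
    "Species ID": ["SP001", "SP002", "SP003"],
    "Year": [2010, 2010, 2010],
    "Leaf_N_g_kg": [15.2, 12.8, 21.4],
})
plots = pd.DataFrame({
    "Site ID": ["SITE_A", "SITE_B"],
    "Plot ID": ["SITE_A01", "SITE_B01"],
    "MAT_C": [21.0, 3.5],  # hypothetical mean annual temperature column
})
species = pd.DataFrame({
    "Species ID": ["SP001", "SP002", "SP003"],
    "Leaf type": ["broad", "needle", "broad"],
    "Leaf phenology": ["evergreen", "evergreen", "deciduous"],
})

# Left joins keep every leaf record; validate= raises an error if a key that is
# assumed to be unique in the lookup table turns out to be duplicated.
merged = (
    leaf
    .merge(plots, on=["Site ID", "Plot ID"], how="left", validate="many_to_one")
    .merge(species, on="Species ID", how="left", validate="many_to_one")
)

# Leaf records whose species code found no match in the species table.
unmatched = int(merged["Leaf type"].isna().sum())
print(merged)
print("unmatched species codes:", unmatched)
```

Using left joins with a uniqueness check is one conservative choice: it preserves the number of leaf records and surfaces key problems early rather than silently duplicating rows.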

Table 1 Leaf nutrient concentration.
Table 2 Plot information.
Table 3 Species information.
Table 4 Soil properties.

The dataset architecture is designed for optimal extensibility, facilitating future updates and cross-scale integration analyses. It supports diverse research applications, including studies on plant functional traits, ecological and geographic patterns, and biogeochemical cycling. Specifically, it provides an invaluable empirical foundation for forest C stock assessment, carbon–nutrient coupling analysis, and the validation of C cycle models. This comprehensive dataset offers essential support for advancing forest ecosystem research in the context of accelerating global environmental change.

Technical Validation

To ensure the accuracy and reliability of the dataset, a series of technical validation procedures and quality control measures were implemented across data collection, sample analysis, and data processing. All leaf and soil samples were collected from long-term monitoring plots within CERN following standardized protocols and operational procedures. Sample collection, preprocessing, and laboratory analyses were conducted by professionally trained technicians in strict adherence to CERN guidelines. Analytical instruments were regularly maintained and calibrated to ensure measurement precision and consistency, minimizing potential errors from manual handling and processing. During data entry, preliminary manual checks were performed to identify and correct anomalies. The aggregated data then underwent standardized processing and programmatic validation, including unit harmonization and reasonable range checks for each variable. For replicated measurements of leaf and soil samples, outliers were detected and removed based on the “mean ± 3 standard deviations” criterion. Moreover, to preserve the authenticity of the original data, missing values were not imputed, thereby avoiding biases introduced by model-based assumptions. All final data were structured with clearly defined fields and standardized coding, ensuring reproducibility and facilitating downstream reuse.

To further evaluate the spatial and temporal consistency of the dataset, we compared the leaf C, N, and P concentrations of the same dominant tree species across different years and locations. The observed variations remained within ecologically reasonable ranges, and no systematic biases were identified. Cross-validation with independent datasets from the published literature confirmed the consistency of both concentration magnitudes and distribution patterns, further supporting the reliability and scientific robustness of the dataset.

Usage Notes

The CNP−China dataset provides foundational data for investigating spatio-temporal variations in leaf C, N, and P stoichiometry across China’s forest ecosystems. Users should consider the following points when using the dataset:

Temporal coverage

The dataset includes observations from four time points (2005, 2010, 2015, and 2020), capturing trends in leaf C, N, and P concentrations of dominant tree species across typical forest ecosystems in China over a 15-year period. Users should note that some plots lack observations for certain years; sample size and spatial distribution should therefore be considered carefully when selecting data or applying weighting procedures.
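Before any temporal analysis, it may help to tabulate how many records each plot contributes in each observation year. The following is a minimal sketch under stated assumptions: the column names `Plot ID`, `Year`, and `Species ID`, the plot codes, and the records themselves are hypothetical and only illustrate the idea.

```python
import pandas as pd

YEARS = [2005, 2010, 2015, 2020]  # the four observation years of the dataset

# Hypothetical leaf records: PLOT_A was sampled in all four years,
# PLOT_B has no record for 2015.
leaf = pd.DataFrame({
    "Plot ID": ["PLOT_A"] * 5 + ["PLOT_B"] * 3,
    "Year": [2005, 2005, 2010, 2015, 2020, 2005, 2010, 2020],
    "Species ID": ["SP001", "SP002", "SP001", "SP001", "SP001",
                   "SP003", "SP003", "SP003"],
})

# Number of records per plot and year; absent combinations become 0.
coverage = (
    leaf.groupby(["Plot ID", "Year"])
    .size()
    .unstack("Year", fill_value=0)
    .reindex(columns=YEARS, fill_value=0)
)

# Plots with at least one record in every observation year.
has_all_years = (coverage > 0).all(axis=1)
complete_plots = coverage.loc[has_all_years].index.tolist()

print(coverage)
print("plots with full temporal coverage:", complete_plots)
```

The same table can be built per species instead of per plot by changing the grouping keys, which is useful because some species were sampled only in specific years.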

Spatial distribution

Sampling sites are primarily located in tropical, subtropical, and temperate regions of eastern China, with relatively limited representation of western and alpine areas. Users should be cautious when extrapolating findings beyond the dataset’s spatial coverage.

Species selection bias

The dataset focuses on dominant tree species within the canopy layer at each plot and does not include comprehensive coverage of all plant functional types, such as shrubs and herbaceous species. For studies targeting whole-ecosystem nutrient budgets, users are encouraged to supplement this dataset with additional sources.

Data consistency

All sampling and measurements were conducted according to standardized protocols established by CERN, ensuring comparability across different years and plots. Manual verification and outlier removal have been performed; outliers in replicate measurements were identified using a mean ± 3 SD criterion to reduce technical errors while preserving natural variation. However, minor measurement errors may remain. Users are encouraged to apply additional filtering or standardization as needed for their specific research objectives.

Recommended applications

The CNP−China dataset can be applied in a variety of research contexts, including analyses of plant nutrient strategies, investigations of eco-geographical patterns, and studies on carbon–nutrient coupling mechanisms. It can also serve as input or validation data for terrestrial ecosystem models. To enhance the robustness and breadth of research outcomes, we recommend integrating CNP−China with complementary data sources, such as remote sensing products, meteorological observations, eddy covariance data, and ecosystem process models.
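As one example of the carbon-nutrient coupling analyses mentioned above, stoichiometric ratios can be derived directly from concentrations expressed in g kg−1. The sketch below is illustrative only: the function names and the example concentrations are hypothetical, and the atomic masses are standard values rounded to three decimals. Mass-based ratios are simple quotients, whereas molar ratios first convert each concentration to mol kg−1.

```python
# Standard atomic masses in g mol-1, rounded to three decimals.
ATOMIC_MASS = {"C": 12.011, "N": 14.007, "P": 30.974}

def mass_ratio(num_g_kg, den_g_kg):
    """Mass-based ratio of two concentrations given in the same unit."""
    return num_g_kg / den_g_kg

def molar_ratio(num_g_kg, den_g_kg, num_elem, den_elem):
    """Molar ratio: convert each g kg-1 value to mol kg-1 before dividing."""
    num_mol = num_g_kg / ATOMIC_MASS[num_elem]
    den_mol = den_g_kg / ATOMIC_MASS[den_elem]
    return num_mol / den_mol

# Hypothetical leaf concentrations in g kg-1 (not values from the dataset).
leaf_c, leaf_n, leaf_p = 480.0, 20.0, 1.5

cn_mass = mass_ratio(leaf_c, leaf_n)
np_mass = mass_ratio(leaf_n, leaf_p)
np_molar = molar_ratio(leaf_n, leaf_p, "N", "P")

print(f"C:N (mass)  = {cn_mass:.1f}")
print(f"N:P (mass)  = {np_mass:.1f}")
print(f"N:P (molar) = {np_molar:.1f}")
```

Because mass-based and molar N:P ratios differ by a factor of about 2.2, it is worth stating which basis is used whenever ratios derived from the dataset are compared with published values.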

Data citation

When using this dataset in research or publications, please cite it according to the provided citation format and explicitly acknowledge its full name: “Leaf Carbon-Nitrogen-Phosphorus Concentrations in China’s Forests (CNP−China)”.