Introduction

Accurate quantification of canopy height and above-ground carbon (AGC) is essential for understanding the carbon cycle and its role in climate regulation, biodiversity conservation, and ecosystem services1. While forests have long been recognized as major carbon sinks, their capacity is declining in some regions due to climate change and increasing disturbance frequency (e.g., in the tropics)2, elevating the importance of scattered trees outside forests in croplands, grasslands, and urban areas for their role in carbon storage3,4,5,6. Comprehensive AGC mapping across all land uses is therefore essential for designing climate policies, verifying carbon offset projects, and achieving global climate targets like the Paris Agreement7.

Recent 30 m resolution AGC products8,9 over China provide AGC estimates exclusively for forests, relying on direct AGC estimation from local observations using Landsat8 and Sentinel optical imagery9. This approach restricts model adaptability in non-forest regions, as it fails to capture key structural variations in non-forest vegetation, such as tree-height heterogeneity, patch geometry and canopy density. Consequently, the AGC of scattered trees in non-forest ecosystems, such as croplands, grasslands, and urban green spaces, remains more uncertain than that of forests10.

Here, we present high-resolution canopy height maps (at 10 m resolution) and AGC maps (at 30 m resolution) for all land cover types across China from 2019 to 2023, developed using a hybrid framework that combines time-series deep learning (U-Net11,12,13) with machine learning (Random Forest, RF14) and uses multi-sensor information from Sentinel-1 (SAR)15 and Sentinel-2 (optical-NIR-SWIR-TIR)16. The U-Net11,12 model, a specialized convolutional neural network (CNN) architecture, is particularly effective in capturing spatial hierarchies, segmenting complex land cover features, and identifying tree structures across various landscapes, including forests, croplands, grasslands, and urban areas17. This capability is especially valuable for large-scale canopy structure mapping across diverse ecosystems12,13.

Unlike studies that directly estimate AGC from remote sensing data8,9,18,19, our framework first derives canopy height from LiDAR-based Global Ecosystem Dynamics Investigation (GEDI) reference data20,21 using a deep learning approach. The predicted canopy height, a physically meaningful and spatially consistent structural metric, is then used to derive tree cover ratio and subsequently converted into AGC estimates. This design enhances accuracy by leveraging the extensive spatial and temporal coverage of GEDI canopy height observations across both forest and non-forest regions. It allows a more faithful representation of the structural characteristics of scattered trees in croplands, grasslands, and urban green areas, which conventional AGC models often ignore or represent poorly because field biomass observations there are scarce. In the second stage, the structural metrics derived from the deep learning model are combined with additional environmental predictors, including elevation, slope, and geographic coordinates, within a machine learning framework to estimate AGC. This integration allows the model to capture both canopy structural properties and ecological gradients without relying on forest masks or region-specific allometric equations. Collectively, our framework yields a robust, full-coverage AGC map for China, providing new insights into tree carbon storage across diverse land-cover types, including non-forests, and supporting regional carbon accounting and climate-mitigation planning.

This two-step deep learning and machine learning framework has been successfully demonstrated in previous studies across Europe3,12,13, confirming the reliability and transferability of the methodology. In contrast, studies that rely solely on machine learning models such as RF to predict canopy height and then estimate AGC are typically restricted to forested areas22, as RF cannot inherently differentiate forest from non-forest regions and typically relies on predefined forest masks23,24.

A time series deep learning framework based on the U-Net11,12 model was first developed to estimate maximum canopy height at 10 m resolution across China from 2019 to 2023. The model was trained using GEDI RH98 data as the reference maximum canopy height21, with high-resolution Sentinel-115, Sentinel-216, digital elevation, and slope data25 as inputs. A total of 1,156,170,559 GEDI footprints were used, covering all land cover types and biomes26 across China (Supplementary Figs. 1–3). The modeled canopy height map was externally validated against heights from 614,156 unmanned aerial vehicle (UAV) LiDAR data points collected between 2019 and 2023 across China27, covering both forested areas and non-forest areas, the latter defined by a tree cover ratio28 in 2019 below 30%29,30 (Supplementary Figs. 4a, b, 5a, b). Subsequently, we trained an RF14 model using 4788 observed biomass data8,9 from field sites in 2019 (Supplementary Fig 4c). These field sites span both forest and non-forest ecosystems29,30 (Supplementary Fig 5c, d), and the trained model was used to estimate AGC from 2019 to 2023. The RF model incorporates geospatial coordinates, slope, elevation, and canopy-related metrics predicted by the U-Net model, including maximum canopy height, average canopy height, and tree coverage ratio. The integration of these canopy structural variables enables AGC predictions across various land cover types. In this study, we focus only on AGC in 2019. The detailed model structure and algorithms are provided in the Methods section, Supplementary Figs 6–8 and Supplementary Tables 1-2.
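
As a rough illustration of how 10 m canopy heights could be summarised into the three canopy metrics passed to the RF model at 30 m, the sketch below aggregates one 3 × 3 block of height pixels. The 3 × 3 aggregation window, the 5 m tree threshold and the function name `canopy_metrics` are assumptions made for this example, not the exact procedure used in this study.

```python
def canopy_metrics(block, tree_threshold_m=5.0):
    """Summarise one 3 x 3 block of 10 m canopy heights (one 30 m cell).

    tree_threshold_m is an assumed cut-off for counting a pixel as tree cover.
    """
    heights = [h for row in block for h in row]
    tree_pixels = [h for h in heights if h >= tree_threshold_m]
    return {
        "max_height_m": max(heights),
        "mean_height_m": sum(heights) / len(heights),
        "tree_cover_ratio": len(tree_pixels) / len(heights),
    }


# Hypothetical canopy heights (m) for a cell with scattered trees.
block = [
    [0.0, 2.0, 6.0],
    [0.0, 8.0, 12.0],
    [1.0, 7.0, 9.0],
]
metrics = canopy_metrics(block)
print(metrics)
```

In this hypothetical cell, five of the nine pixels reach the threshold, so the cell would carry a tree cover ratio of about 0.56 together with its maximum and mean height.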

Results

Our canopy height maps demonstrated good predictive performance, achieving an R² of 0.74 and a mean absolute error (MAE) of 1.77 m in out-of-sample testing (Supplementary Fig 9b). To further validate the predictions, we conducted an external assessment using a high-resolution UAV LiDAR scanning dataset27,31. In this independent evaluation, the model achieved an R² of 0.72 and an MAE of 2.39 m (Supplementary Fig 9c), confirming its ability to accurately capture canopy structure in complex forested landscapes. However, our model exhibits a mean bias of −0.8 m (Supplementary Fig 9c), suggesting a tendency to underestimate canopy height. Notably, for canopy heights exceeding 30 m, errors increase dramatically (Supplementary Fig 10a), a common problem for data-driven methods caused by the relatively small share of tall trees in the training dataset12,13,32. Residual analysis shows stable performance up to 35° slope or 2500 m elevation (Supplementary Fig 11), with slight overestimation in steeper or higher classes, likely attributable to terrain-induced shadowing effects in Sentinel-2 imagery or shifts in dominant tree species over mountainous terrain13,33,34.
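
For reference, the three accuracy statistics reported above can be computed as in the following minimal sketch. It assumes that R² denotes the coefficient of determination and that bias is the mean of predicted minus observed height; the sample values are invented for illustration only.

```python
def r_squared(obs, pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = sum(obs) / len(obs)
    ss_res = sum((o - p) ** 2 for o, p in zip(obs, pred))
    ss_tot = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - ss_res / ss_tot


def mean_absolute_error(obs, pred):
    """Average magnitude of the prediction error."""
    return sum(abs(p - o) for o, p in zip(obs, pred)) / len(obs)


def mean_bias(obs, pred):
    """Mean of (predicted - observed); negative values mean underestimation."""
    return sum(p - o for o, p in zip(obs, pred)) / len(obs)


# Invented reference and predicted canopy heights (m).
observed = [10.0, 20.0, 30.0, 40.0]
predicted = [12.0, 18.0, 27.0, 35.0]

r2 = r_squared(observed, predicted)
mae = mean_absolute_error(observed, predicted)
bias = mean_bias(observed, predicted)
print(r2, mae, bias)
```

With this sign convention, a negative bias such as the −0.8 m reported above corresponds to predictions that are, on average, lower than the reference heights.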

Our model effectively captures canopy height variations across different land use types. Fig 1a presents the predicted 10 m resolution canopy height map over China in 2019, highlighting the spatial distribution of vegetation height across different ecosystems. Our results reveal that taller forests are primarily concentrated in southern and northeastern China, while lower vegetation is more prevalent in arid and agricultural regions. To further illustrate the prediction of our model across diverse landscapes, Fig. 1b–g provide detailed visual comparisons between satellite imagery and predicted canopy height in various land covers, including forests (b, g), urban areas (c), croplands (d), plains (e), and mountainous terrains (f).

Fig. 1: Predicted canopy height map over China and regional comparisons across different land covers in 2019.

The figure illustrates the predicted canopy height map of China at a 10 m resolution for the year 2019 using the GEDI-Sentinel based U-Net canopy height model. Plot a shows the nationwide spatial distribution of vegetation height. Plots b–g provide detailed comparisons between satellite imagery and predicted canopy height across different land cover types: b trees in the forests of Heilongjiang (Northeast), c trees in urban Shanghai (East), d trees on croplands in Sichuan (Southwest), e trees on plains in Inner Mongolia (North), f trees in mountainous areas in Chongqing (Southwest), and g trees in the forests of Hainan (South). These comparisons demonstrate the model’s robustness in capturing vegetation structure across various landscapes, including dense forests, croplands, plains, urban areas, and mountainous terrain.

To further evaluate the model’s ability to capture canopy structure across diverse environments, including sparsely forested and transitional zones as well as areas below the conventional forest definition, we generated a tree-line map (Fig. 2). Tree lines, defined as the uppermost boundary where trees can grow, were mapped from our high-resolution canopy height map. The resulting tree-line map reveals the elevational limits of tree growth across mountainous regions in western China35, with notably high tree lines (between 4900 m and 5000 m) in the southeastern Tibetan Plateau (Fig. 2g).
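
One simple way to read a tree line off a canopy height map is to take, within a local neighbourhood, the highest elevation at which a pixel still carries trees. The sketch below illustrates this idea only; the 5 m tree threshold, the neighbourhood definition and the function name `tree_line_altitude` are assumptions for the example, while the 3000 m mask follows the description of Fig. 2.

```python
def tree_line_altitude(pixels, min_tree_height_m=5.0, min_elevation_m=3000.0):
    """Return the highest elevation (m) with tree cover in a neighbourhood.

    pixels: iterable of (elevation_m, canopy_height_m) pairs.
    Returns None when no trees are present or the tree line lies below
    min_elevation_m (such cells are masked out).
    """
    tree_elevations = [
        elevation for elevation, height in pixels if height >= min_tree_height_m
    ]
    if not tree_elevations:
        return None
    top = max(tree_elevations)
    return top if top >= min_elevation_m else None


# Hypothetical mountain transect: trees thin out above roughly 4950 m.
mountain = [(4200.0, 12.0), (4600.0, 7.0), (4950.0, 5.5), (5100.0, 0.4)]
# Hypothetical lowland neighbourhood, masked because it lies below 3000 m.
lowland = [(1200.0, 15.0), (1800.0, 6.0)]

print(tree_line_altitude(mountain), tree_line_altitude(lowland))
```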

Fig. 2: Altitude of the tree lines in China.

The figure presents the altitudinal distribution of tree lines across China based on high-resolution canopy height maps. a shows the spatial distribution of tree line altitudes across China for elevation over 3000 m. Plots b–g provide examples of tree line distributions in various provinces: b Yunnan, c Sichuan, d Gansu, e Qinghai, f Xinjiang and g Tibet. Note that tree lines at elevations below 3000 m have been masked out.

For AGB density estimation, our RF model exhibited good predictive capacity, with a cross-validated R² of 0.67 and an MAE of 37.71 Mg ha-1 (Supplementary Fig 12). For AGB densities below 100 Mg ha-1, the median error is less than 10 Mg ha-1 (Supplementary Fig 12, 13). This suggests that the AGB model generalizes effectively across varying biomass conditions, capturing the spatial heterogeneity of AGB from canopy height. However, it tends to substantially underestimate AGB densities exceeding 350 Mg ha-1, primarily because of insufficient training data within this range (Supplementary Fig 13). Further validation shows consistent model performance across both forest and non-forest regions, with low bias in each case (Supplementary Fig 14), confirming the transferability of our model beyond forests. To convert AGB density to AGC density, we applied a biomass-to-carbon conversion factor of 0.5, which is widely recommended for China’s forests8,36.
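
The conversion from AGB density to AGC density, and from a density to the carbon stored in a single map cell, reduces to the arithmetic sketched below. The 0.5 factor is taken from the text; treating every 30 m cell as exactly 0.09 ha is a simplifying assumption of this example (true cell area depends on the map projection), and the input value is illustrative.

```python
CARBON_FRACTION = 0.5  # biomass-to-carbon conversion factor stated in the text


def agc_density(agb_density_mg_ha):
    """Convert AGB density (Mg ha-1) to AGC density (Mg C ha-1)."""
    return agb_density_mg_ha * CARBON_FRACTION


def cell_stock_mg_c(agb_density_mg_ha, cell_size_m=30.0):
    """Carbon stored in one square map cell, assuming a planar cell area."""
    cell_area_ha = (cell_size_m ** 2) / 10_000.0  # 1 ha = 10,000 m^2
    return agc_density(agb_density_mg_ha) * cell_area_ha


# Illustrative AGB density that maps onto the median forest AGC density.
density = agc_density(98.6)
stock = cell_stock_mg_c(98.6)
print(density, stock)
```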

Ecosystem AGC density across different land cover types37,38 exhibited large spatial variations at the national scale (Fig. 3), with high-density carbon storage in southwestern and northeastern forests, and lower values in agricultural and arid regions (Fig. 3d). Forests (Fig. 3a) dominate the total AGC stock, with the highest values observed in Yunnan (1.61 Pg C), Heilongjiang (1.50 Pg C), and Sichuan (1.32 Pg C), where dense forests or complex mountain ecosystems contribute notably to biomass storage. AGC of scattered trees on grassland (Fig. 3b) shows a lower but notable contribution, with the highest values observed in Sichuan (0.36 Pg C), Yunnan (0.24 Pg C), and Inner Mongolia (0.18 Pg C), where extensive grassland ecosystems play an important role in carbon sequestration. AGC of trees dispersed within cropland (Fig. 3c), indicating the presence of hedgerows, shelterbelts or agroforestry systems, also exhibits spatial heterogeneity, with the highest values observed in Sichuan (0.27 Pg C) and Guangxi (0.16 Pg C) provinces. Urban tree AGC (Fig. 3e), which includes city parks, street trees and woody vegetation in residential green spaces, highlights carbon stored in trees within urbanized regions, showing relatively low but non-negligible contributions, especially to biodiversity and climate resilience. Fig 3f highlights the variation in average AGC density across different land covers using different land cover masks (GLC_FCS30D37,38, GLC1039, ESA40, HILDA+41,42, HILDA+ with JRC forest mask43). Forests consistently exhibit the highest AGC density, with a median value of 49.3 Mg C ha-1, ranging from 46.58 to 56.18 Mg C ha-1 (refs. 9,18,43,44). In comparison, scattered trees in croplands have a median AGC density of 19.50 Mg C ha-1, while urban and grassland tree cover shows lower median values of 11.18 Mg C ha-1 and 9.82 Mg C ha-1, respectively. The provinces and regions of China are presented in Supplementary Fig 15.

Fig. 3: Above-ground carbon distributions over different land covers in China in 2019.

The figure shows the spatial distribution of AGC and AGC density across various land cover types and provinces in China. Plots a–c and e depict AGC distribution across different ecosystems, with plot a representing AGC stored in forests, b in grasslands, c in croplands, and e in urban tree areas. Plot d shows the AGC density across different provinces. Plot f provides a violin plot illustrating the distribution of AGC density by land cover type, revealing that forests have the highest median AGC density, followed by croplands, urban areas, and grasslands. Shaded areas in the violin plot denote the 95% confidence interval.

Our results demonstrate significant variations in AGC across ecosystems (Fig. 4a). We estimate a total AGC of 17.38 ± 0.55 Pg C in China across all land covers. Forests remain the dominant AGC reservoir, storing the majority of China’s AGC9,18 (67.1% to 79.2% of the total; Supplementary Fig 16). Scattered trees in non-forest ecosystems, such as grasslands, croplands, urban areas, and other land covers like shrublands, contribute approximately 20.8% to 32.9% of the total AGC (Supplementary Fig 16). Beyond total ecosystem contributions, our analysis further quantifies the AGC stored in low woody vegetation in China, defined as vegetation shorter than 5 m. Low trees contribute a substantially larger share of total tree AGC in croplands (19.50%) than in other land covers (Fig. 4b). In contrast, low trees in forests, located mainly near the forest edges, contribute only 0.43% of the total forest AGC (Fig. 4b).

Fig. 4: Above-ground carbon distributions over different land covers in China in 2019.

The figure illustrates the distribution of AGC across different land cover types and the contribution of low vegetation (vegetation height <5 m) to AGC based on various land use masks. Plot a shows the total AGC (in Pg C) for different land covers, including forests, grasslands, croplands, urban areas, and others, as classified by GLC_FCS30D37,38, GLC1039, ESA40, HILDA+41,42, HILDA+ with JRC forest mask43. Plot b highlights the proportion of AGC contributed by low vegetation across land covers.

Our analysis reveals that AGC density at forest edges is lower than the average forest AGC (Fig. 5), with variations across provinces (Fig. 5a) and distance to the edge (Fig. 5b). The forest mask applied here was from GLC_FCS30D37,38 in 2019. Nationwide, the AGC density within the 100 m forest edge zone is only 66% of the interior forest AGC density (Fig. 5b), and AGC density declines progressively with decreasing distance to the forest edge44,45,46. In tropical forests (Fig. 5a), such as those in Hainan, Guangxi, Guangdong, and Yunnan, AGC density within 100 m of the forest edge is 15–30% lower than the average forest AGC density in these provinces46. A similar pattern is observed in the North China Plain, where forests already have low AGC densities. In contrast, northeastern forests experience a more pronounced AGC density reduction of 40–50% at the edges, indicating stronger edge effects in these regions.
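
The edge comparison can be pictured with the following sketch, which computes each forest pixel's distance to the nearest non-forest pixel and then compares mean AGC density inside and beyond a 100 m edge zone. It is a simplified illustration under stated assumptions: distances are counted in 4-connected pixel steps of 30 m, the function names are hypothetical, and the toy transect is invented so that the ratio comes out close to the nationwide 66%.

```python
from collections import deque


def distance_to_edge_m(forest_mask, pixel_size_m=30.0):
    """Breadth-first distance (m) from each pixel to the nearest non-forest pixel."""
    rows, cols = len(forest_mask), len(forest_mask[0])
    steps = [[None] * cols for _ in range(rows)]
    queue = deque()
    for r in range(rows):
        for c in range(cols):
            if not forest_mask[r][c]:
                steps[r][c] = 0
                queue.append((r, c))
    while queue:
        r, c = queue.popleft()
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if 0 <= nr < rows and 0 <= nc < cols and steps[nr][nc] is None:
                steps[nr][nc] = steps[r][c] + 1
                queue.append((nr, nc))
    return [[None if s is None else s * pixel_size_m for s in row] for row in steps]


def edge_to_interior_ratio(forest_mask, agc_density, edge_width_m=100.0):
    """Mean AGC density within the edge zone divided by the interior mean."""
    dist = distance_to_edge_m(forest_mask)
    edge, interior = [], []
    for r, row in enumerate(forest_mask):
        for c, is_forest in enumerate(row):
            if not is_forest or dist[r][c] is None:
                continue
            target = edge if dist[r][c] <= edge_width_m else interior
            target.append(agc_density[r][c])
    return (sum(edge) / len(edge)) / (sum(interior) / len(interior))


# Toy transect: non-forest at both ends, ten forest pixels in between.
mask = [[0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0]]
density = [[0, 30, 40, 50, 60, 60, 60, 60, 50, 40, 30, 0]]  # Mg C ha-1
ratio = edge_to_interior_ratio(mask, density)
print(round(ratio, 2))
```

The sketch assumes that both an edge zone and an interior exist; a forest patch narrower than twice the edge width would have no interior pixels and would need separate handling.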

Fig. 5: AGC density over different land covers, and AGC density change due to edge effect.

The figure examines the impact of forest edges on AGC density across different provinces in China. Plot a presents the AGC density ratio within the 100 m edge of forests across provinces, showing variations in carbon storage at the forest margins. Plot b illustrates the AGC density at different distances from forest edges, showing a clear decline in AGC density as proximity to the forest edge increases. The forest mask applied here was from GLC_FCS30D37,38 in 2019.

Discussion

Forest as the dominant reservoir of AGC in China

Nationally, trees taller than 5 m occupy 22.25% of China’s land area (213.54 million ha) within forested regions37,38 in 2019, and an additional 7.60% (72.96 million ha) outside forests (Supplementary Fig 17c). This proportion of forested area with trees exceeding 5 m in height is slightly lower (by 0.6 percentage points) than the 22.85% reported by the World Bank47,48, likely due to differences in forest classification criteria or the effects of forest fragmentation49: smaller trees, particularly those near forest edges, were excluded in our product, whereas the FAO definition used by the World Bank includes such areas if trees are capable of reaching a height of 5 m in situ29. Provinces such as Yunnan (22.32 million ha), Heilongjiang (20.97 million ha), Sichuan (19.61 million ha) and Inner Mongolia (14.50 million ha) exhibit substantial forested areas (Supplementary Fig 17a), while substantial tree cover exists outside forests in provinces like Sichuan (10.80 million ha), Guangxi (7.45 million ha), and Yunnan (6.61 million ha) (Supplementary Fig 17b).

Forests remain the dominant AGC reservoir, contributing 67.1–79.2 % (11.66–13.77 Pg C) of China’s total AGC, with median AGC density around 49.3 Mg C ha-1 (46.6–56.2 Mg C ha-1), which is consistent with previous studies9,18,50,51. This finding reinforces their role as the primary carbon sink.
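
The percentage ranges quoted above follow directly from the reported stocks, as the short check below shows. It uses only the figures given in the text (17.38 Pg C in total and 11.66–13.77 Pg C in forests) and treats the non-forest share as the remainder; the variable names are illustrative.

```python
TOTAL_AGC_PG = 17.38                 # total AGC across all land covers (Pg C)
FOREST_AGC_RANGE_PG = (11.66, 13.77)  # forest AGC across land cover masks (Pg C)


def share_percent(part, total):
    """Share of `part` in `total`, in percent."""
    return 100.0 * part / total


forest_share = tuple(
    round(share_percent(value, TOTAL_AGC_PG), 1) for value in FOREST_AGC_RANGE_PG
)
# The non-forest share is the complement, so its bounds swap order.
non_forest_share = tuple(round(100.0 - s, 1) for s in reversed(forest_share))

print(forest_share)      # forest share of total AGC (%)
print(non_forest_share)  # non-forest share of total AGC (%)
```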

Significance of trees outside forests

Beyond forests, our results show that scattered or clustered trees in non-forest ecosystems contribute 20.8–32.9% of China’s total AGC (Supplementary Fig 16). This proportion is notably higher than in Europe, where trees outside forests contribute, on average, only 2% of the total AGC, with regional variations ranging from 0.6 to 12.2%3. The higher contribution in China may be attributed to widespread agroforestry and tree-based agricultural systems, particularly in the south, where trees are extensively planted in croplands and along field boundaries52,53. Additionally, in northern China, large-scale shelterbelt programs, such as the Three-North Shelter Forest Program54, contribute considerably to tree cover and AGC in cropland and grassland areas. These findings highlight that trees outside forests, long neglected in national carbon inventories, play an indispensable role in China’s carbon balance.

Among non-forest types, croplands exhibit the highest AGC density (median of 19.50 Mg C ha-1, consistent with previous studies55,56), representing a substantial but often overlooked carbon pool. Trees interspersed within grasslands also contribute to carbon storage, although with lower AGC densities than croplands and forests. Urban green spaces, despite their limited spatial extent, store 11.18 Mg C ha-1 (median), comparable to values reported for European and East Asian cities57,58,59, but lower than values reported for tropical regions with denser canopy cover60. These findings highlight the significant yet underrecognized role of trees outside forests in carbon sequestration and climate adaptation. Trees in agricultural and urban landscapes mitigate heat stress through evaporative cooling, reduce wind erosion, and buffer hydrological extremes, thereby improving local microclimates and ecosystem resilience61,62,63. Expanding and maintaining such tree cover offers co-benefits for carbon storage, temperature regulation, and sustainable land management, consistent with previous studies highlighting their multifunctional role in climate mitigation and adaptation55,61,62,63,64.

Our AGC map also reveals patterns that are not captured by forest-only datasets. Notably, Sichuan Province and other southern regions exhibit agroforestry carbon hotspots, highlighting opportunities for targeted agroecological practices that integrate tree-based systems with agricultural land use (Fig. 3c).

Fine-scale patterns of AGC distribution

Our fine-scale analysis reveals that forest fragmentation imposes a notable carbon cost, with AGC density declining sharply toward edges44,45,46 (66% of interior values within 100 m, Fig. 5). This pronounced edge effect is not merely a localized phenomenon but a major driver of biome-wide carbon deficits, as elevated tree mortality and altered microclimates in fragments persistently reduce carbon stocks65,66,67. Accordingly, conservation actions that maintain or increase core forest area (strict protection of large, intact blocks, curbing new edge creation such as through road setbacks, and restoring structural connectivity) are among the most effective ways to increase carbon stocks by reducing edge-induced carbon deficits.

In northeast China, where edge effects reduce AGC density by 40–50%, prioritizing the conservation of intact forests and the reconnection of fragmented landscapes is likely to yield larger and more durable carbon gains than simply expanding plantations in small or isolated patches, a critical distinction that area-based carbon metrics often overlook.

Plantation expansion alone is an imperfect substitute for conserving or restoring natural forests68. Natural forests typically store more carbon per unit area, recover carbon more rapidly after disturbance, and are more resilient to drought, windthrow, fire, and pests than even-aged monoculture plantations of comparable age69,70,71. Mixed-species, longer-rotation, and structurally diverse plantings can narrow (but rarely erase) this gap, especially when sited to reduce edge exposure and embedded within connected landscapes. Thus, a carbon-efficient conservation strategy is to (i) prevent fragmentation of high-biomass natural forests; (ii) expand core area and buffers around existing remnants; (iii) use connectivity-focused restoration (riparian corridors, stepping-stone woodlands); and (iv) where plantations are needed, favor diverse, longer-rotation designs that minimize new edge length.

Beyond forests, complementary investments in trees outside forests, such as agroforestry, shelterbelts, and urban green infrastructure, can enhance landscape-scale carbon while delivering co-benefits for climate adaptation and livelihoods, aligning with the UN Decade on Ecosystem Restoration’s emphasis on “quality” restoration72.

Policy pathways to protect and enhance non-forest carbon reservoirs

This study underscores the need for integrated land-use and climate policies that explicitly recognize the carbon value of trees across all ecosystems, not only in forests but also in agricultural, grassland, and urban landscapes. This aligns with recent international calls for nature-based solutions and land-use synergies that maximize co-benefits for carbon sequestration, biodiversity, and human well-being73.

Non-forest trees collectively represent a substantial and spatially dispersed carbon pool. Yet, they remain largely invisible in conventional carbon accounting frameworks. Incorporating these systems, such as agroforestry, shelterbelts, roadside vegetation, and urban green infrastructure, into national greenhouse gas inventories would improve the completeness and spatial realism of China’s land carbon monitoring.

From a conservation perspective, maintaining and enhancing these non-forest carbon reservoirs requires policies that minimize disturbance, regulate tree removal, and promote regeneration in managed landscapes. For example, agroforestry systems and shelterbelts are vulnerable to clearing for agricultural intensification, while urban green spaces can face degradation from land-use change and heat stress. Protecting these small but collectively important tree networks can buffer climate extremes, mitigate erosion, and sustain local carbon sinks under increasing anthropogenic and climatic pressures.

Policy mechanisms could include (i) integrating agroforestry and field-boundary trees into agricultural carbon crediting schemes; (ii) strengthening incentives for urban greening, especially in rapidly urbanizing regions where per-capita green cover is declining; and (iii) establishing long-term monitoring of trees outside forests using remote sensing and AI-based mapping to capture their dynamic contribution to national carbon budgets.

Methodological implications

This study demonstrates the feasibility of high-resolution, observation-driven AGC mapping that bridges the forest–non-forest divide. Deep-learning approaches trained on GEDI-derived canopy structure can be transferred to other regions, providing consistent, data-driven assessments of carbon dynamics across different land uses.

Future work should focus on integrating dynamic disturbance information, such as deforestation, agricultural expansion, and urbanization, into non-forest carbon accounting to better track both carbon losses and post-disturbance recovery. Collecting more field observations in non-forest systems is also critical to refine model calibration and address the current underrepresentation of scattered trees, hedgerows, and urban vegetation in training datasets.

Regarding the AGB-to-AGC conversion, our study used a standard factor of 0.5 (refs. 8,36), which is widely recommended for Chinese forests. However, this ratio may vary across non-forest ecosystems because trees in croplands, grasslands, or urban environments often differ in species composition, wood density, and growth form from those in natural forests. Future work should develop ecosystem-specific conversion factors, e.g., by combining field-based carbon fraction measurements with structural and species data, to improve accuracy in non-forest AGC estimation.

Despite these advances, several challenges remain. One limitation of our model is its tendency to underestimate canopy height and AGC in areas with exceptionally tall canopies (>30 m) or high biomass (>350 Mg ha-1), likely due to the limited representation of large-tree observations in the training data. Uncertainties also increase on steep terrain (>35°) and at high elevations, where Sentinel-2 imagery suffers from terrain-induced shadowing and the dominant tree species shift. Addressing these issues through topographic correction, improved data fusion, and targeted field sampling could enhance model robustness across complex landscapes.

Ultimately, protecting and enhancing tree cover across all landscapes, whether in dense forests or as scattered trees outside forests, represents one of the most cost-effective and scalable pathways for achieving China’s carbon neutrality goals while simultaneously supporting climate resilience, biodiversity, and ecosystem services.

Methods

Data collection

We acquired satellite imagery data through Google Earth Engine (GEE)74, utilizing Sentinel-1 (S1) and Sentinel-2 (S2) datasets from the "COPERNICUS/S1_GRD" and "COPERNICUS/S2_SR" collections, respectively. The S1 dataset provides C-band Synthetic Aperture Radar (SAR) imagery, which enables all-weather, cloud-penetrating observations, making it particularly useful for forest monitoring and land surface analysis15. We used the VV and VH polarization bands, representing vertical–vertical and vertical–horizontal signal returns. The S2 dataset, obtained from the Level-2A product, includes multispectral surface reflectance (SR) imagery across 13 spectral bands: B2-4, B5-8A, B11-12, and RGB. This dataset is processed into SR format, which corrects for atmospheric distortions and ensures a more accurate representation of land surface conditions16. The multispectral capabilities of Sentinel-2 provide critical data for vegetation monitoring, land-use classification, and environmental assessments.

Both the S1 and S2 datasets offer a spatial resolution of up to 10 m and span the period from 2019 to 2023. To minimize interference from cloud cover and snow, we applied quality filters based on the MSK_CLDPRB (cloud probability) and MSK_SNWPRB (snow probability) layers. Additionally, while most Sentinel-2 bands used here are at 10 m resolution, B5, B6, B7, and B8A have a native resolution of 20 m and were resampled to 10 m by assigning their values to the finer grid. The annual median composite images from S1 and S2 were extracted using EPSG:4490 as the coordinate reference system (CRS). The scripts used for downloading and processing the S1 and S2 datasets are publicly available at https://code.earthengine.google.com/fe92ee8c92a6cddc0e11246a2cc7ee37 and https://code.earthengine.google.com/191e2195b1527816869a85636cbc45e1, respectively. The aggregated S1 and S2 data over China totaled ~15 TB, partitioned into 10 × 10 km tiles (377,364 images) for efficient handling.
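
The logic of a quality-filtered annual median composite can be sketched for a single pixel and band as follows. This is a plain-Python illustration rather than the Earth Engine script linked above; the probability thresholds (20%) and the function name `annual_median` are assumptions for the example, since the cut-offs actually applied are not restated here.

```python
from statistics import median


def annual_median(observations, max_cloud_prob=20, max_snow_prob=20):
    """Median reflectance of one pixel over a year, skipping cloudy or snowy dates.

    observations: list of (reflectance, cloud_prob_percent, snow_prob_percent).
    The thresholds are hypothetical values chosen for this sketch.
    """
    valid = [
        value
        for value, cloud_prob, snow_prob in observations
        if cloud_prob <= max_cloud_prob and snow_prob <= max_snow_prob
    ]
    return median(valid) if valid else None


# Invented acquisitions for one pixel: two are contaminated and get dropped.
series = [
    (0.12, 5, 0),
    (0.45, 90, 0),   # cloudy
    (0.10, 2, 0),
    (0.60, 0, 80),   # snow-covered
    (0.14, 10, 1),
]
composite = annual_median(series)
print(composite)
```

Using the median rather than the mean keeps the composite robust to residual bright outliers that slip through the probability masks.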

The elevation and slope datasets used in this study were derived from the Shuttle Radar Topography Mission (SRTM) digital elevation model (DEM) provided by the CGIAR Consortium for Spatial Information (“CGIAR/SRTM90_V4” collection on GEE) at a native spatial resolution of 90 m25. The elevation data represents the absolute land-surface height above sea level, while the slope layer was calculated using the ee.Terrain.slope function. Both layers were resampled to 10 m to align with the Sentinel datasets. The GEE scripts used for generating and exporting the elevation and slope data are publicly available at: https://code.earthengine.google.com/9cdd86b4a21d8e61f504a38a1f62b159.
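
For readers unfamiliar with how a slope layer follows from a DEM, the sketch below estimates slope at an interior cell from central differences of neighbouring elevations. It is a generic finite-difference illustration, not a reproduction of the ee.Terrain.slope implementation, and it assumes square cells with a uniform 90 m spacing in metres.

```python
import math


def slope_degrees(dem, row, col, cell_size_m=90.0):
    """Slope (degrees) at an interior DEM cell using central differences."""
    dz_dx = (dem[row][col + 1] - dem[row][col - 1]) / (2.0 * cell_size_m)
    dz_dy = (dem[row + 1][col] - dem[row - 1][col]) / (2.0 * cell_size_m)
    return math.degrees(math.atan(math.hypot(dz_dx, dz_dy)))


# Toy DEM (m): a plane rising 90 m per 90 m cell toward the east.
tilted = [
    [0.0, 90.0, 180.0],
    [0.0, 90.0, 180.0],
    [0.0, 90.0, 180.0],
]
flat = [[500.0] * 3 for _ in range(3)]

print(slope_degrees(tilted, 1, 1), slope_degrees(flat, 1, 1))
```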

The GEDI canopy height dataset was obtained through GEE from the collection “LARSE/GEDI/GEDI02_A_002_MONTHLY”, which is GEDI's Level 2A Geolocated Elevation and Height Metrics Product. This dataset covers all of China (Supplementary Fig 1), with data available from 2019 onwards. We obtained the data from 2019 to 2023 and selected the RH98 value as the maximum canopy height metric. Several filters were applied to improve data quality. Following earlier work12, we used only nighttime acquisitions (solar elevation <0°) to reduce noise from solar radiation. To address GEDI’s uncertainty in steep terrain75, we restricted footprints to areas with a slope of no more than 10 degrees. We also excluded locations within 25 m of forest edges, based on the focal.max() operator on GEE and the “ESA/WorldCover/v200” tree cover class40, to limit the impact of geolocation errors33. For built-up areas, RH98 values were replaced with ground-return heights to ensure only vegetation height was considered. Although GEDI footprints have a native diameter of 25 m, the data were resampled to 10 m to match the resolution of the S1 and S2 images. The RH98 value of each GEDI footprint was assigned to all 10 m pixels whose centers lay within the footprint boundary, while pixels without GEDI coverage were marked NA (stored as 0 in our label raster) and excluded from the loss. The code for downloading GEDI canopy height data is available at https://code.earthengine.google.com/74509554a0ece937999f3fdb374110d4. The final dataset comprised 1,156,170,559 footprints, which were well distributed across major biomes26 in China. After weighting footprint counts by biome, all vegetation zones retained sufficient samples for model training. Although broadleaf forests, namely temperate broadleaf and mixed forests (TBMF) and tropical and subtropical moist broadleaf forests (TSMBF), account for a smaller share of GEDI samples, their large biome extent ensures adequate data representation (Supplementary Fig 2, 3).
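
Two of the steps described above lend themselves to a compact sketch: the footprint quality filter (night-time acquisition and gentle slope) and the exclusion of pixels without GEDI coverage from the training loss. The dictionary keys, function names and the use of an L1 (absolute error) loss are assumptions made for illustration; the loss function used for training is not specified in this section.

```python
NO_DATA = 0.0  # label value marking pixels without GEDI coverage


def keep_footprint(footprint, max_slope_deg=10.0):
    """Keep night-time footprints on slopes of at most max_slope_deg.

    `footprint` is a dict with illustrative keys 'solar_elevation' (degrees)
    and 'slope_deg' (degrees).
    """
    return (
        footprint["solar_elevation"] < 0.0
        and footprint["slope_deg"] <= max_slope_deg
    )


def masked_l1_loss(predicted, labels):
    """Mean absolute error over labelled pixels only (assumed L1 loss)."""
    pairs = [(p, y) for p, y in zip(predicted, labels) if y != NO_DATA]
    if not pairs:
        return 0.0
    return sum(abs(p - y) for p, y in pairs) / len(pairs)


# Invented footprints: only the first passes both filters.
footprints = [
    {"rh98": 18.2, "solar_elevation": -12.0, "slope_deg": 4.0},
    {"rh98": 21.5, "solar_elevation": 35.0, "slope_deg": 3.0},   # daytime
    {"rh98": 9.7, "solar_elevation": -5.0, "slope_deg": 22.0},   # too steep
]
kept = [fp for fp in footprints if keep_footprint(fp)]

# Flattened tile: two labelled pixels, two without GEDI coverage.
loss = masked_l1_loss([3.0, 10.0, 20.0, 7.0], [0.0, 12.0, 17.0, 0.0])
print(len(kept), loss)
```

Because GEDI samples the surface sparsely, most pixels in a tile carry no label, and masking them keeps the loss from being dominated by spurious zero-height targets.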

For external validation of our canopy height model, the UAV LiDAR data27 (2019–2023) comprising 614,156 points (Supplementary Fig 4a, b) across China were used, processed into 30 × 30 m grids to estimate maximum canopy height (details can be found in prior work27). Among all UAV samples, approximately 32.6% were located in non-forest areas (tree cover ratio <30%29,30), while 67.4% were from forested regions (Supplementary Fig 5a, b).

The AGB dataset used in this study covers eight vegetation zones across China8,9 (Supplementary Fig 4c), comprising 4788 in situ biomass measurements collected nationwide in 2019. These plots have a diameter of 30 m and span a latitude range of 21°N to 50°N, covering a diverse range of tree types, including needleleaf and broadleaf trees. Among these field plots, approximately 11.4% were located in non-forest areas (tree cover ratio <30%29,30), while 88.6% were from forested regions (Supplementary Fig 5c, d). Each plot has decimeter-level positioning accuracy and covers an area greater than 400 m². The AGB at the tree level (DBH > 5 cm) was calculated using allometric equations derived from previous studies on biomass estimation in China76,77.

Model description

Originally designed for biomedical image segmentation, the U-Net framework11 has demonstrated remarkable adaptability in remote sensing applications. Its hallmark U-shaped architecture (Supplementary Fig 6) enables multi-scale image analysis through hierarchical feature extraction, effectively capturing spatial relationships between adjacent pixels even under data scarcity. This capability proves critical for delineating intricate land cover patterns, including forests, croplands, grasslands and urban areas17, particularly in nationwide canopy structure detection across heterogeneous landscapes12.

Model training, validation and testing - U-Net model

We trained a U-Net based model11,12 using canopy height measurements derived from GEDI lidar data, integrating S1 and S2 satellite imagery with digital elevation and slope data to generate continuous spatiotemporal canopy height estimates. Supplementary Fig 7 outlines the procedure for training, validating, and testing our canopy height models. Satellite data and reference heights were geospatially aligned and partitioned into 10 × 10 km tiles (1000 × 1000 pixels), yielding 306,113 tiles across China. Each tile contains both GEDI and satellite data. These tiles were then randomly split into three sets: 75% were assigned to the training set, 5% to the validation set for monitoring the training progress of the U-Net models (with training stopping once the validation loss converged), and the remaining 20% were reserved as the out-of-sample testing set to evaluate the final models' performance. It is important to note that the GEDI footprints encompass all land cover types, meaning that the out-of-sample testing performance reflects the overall model accuracy across diverse land cover classes.
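The random 75/5/20 partition of tiles can be sketched as follows. This is a minimal illustration, assuming tiles are referenced by arbitrary identifiers and that a fixed seed is used for reproducibility; the function name and the seed are not taken from our pipeline.

```python
import random


def split_tiles(tile_ids, train_frac=0.75, val_frac=0.05, seed=0):
    """Randomly split tile identifiers into training, validation and test sets."""
    ids = list(tile_ids)
    random.Random(seed).shuffle(ids)  # seeded shuffle (seed value is an assumption)
    n_train = round(train_frac * len(ids))
    n_val = round(val_frac * len(ids))
    train = ids[:n_train]
    val = ids[n_train:n_train + n_val]
    test = ids[n_train + n_val:]      # remaining tiles (about 20%) form the test set
    return train, val, test
```

Splitting at the tile level rather than the footprint level keeps all footprints of a tile in the same set, so neighbouring footprints are not shared between training and testing.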

During the training phase of the model, we processed 50 tiles at a time. From each 1000 × 1000 pixel tile, a sub-tile of 256 × 256 pixels was randomly extracted. This random cropping was implemented to enhance the model's exposure to diverse data scenarios78, improve its ability to generalize across different spatial features, and increase the model robustness12. The training data consisted of co-registered 256 × 256 sub-tiles extracted from stacked Sentinel-1/2, elevation, and slope imagery together with the corresponding GEDI label raster. The U-Net ingested the full sub-tile as input, extracting multi-scale features from the complete spatial context. The loss was computed only for pixels with valid GEDI labels, while unlabeled pixels (without GEDI coverage) contributed indirectly through the convolutional receptive fields and skip connections, allowing the network to learn contextual relationships across the entire sub-tile. After training and testing the GEDI-Sentinel based U-Net model, we performed an external validation using UAV lidar observations27 to assess the model's performance specifically in forest regions. To achieve this, we first generated canopy height maps for the period 2019–2023 using annual Sentinel-1 and Sentinel-2 inputs from the same timeframe. Given that UAV LiDAR data were processed into 30 m-diameter plots, we extracted canopy height values by identifying the corresponding location of each UAV observation within our maps. For each observation point, we searched within a 30 m-diameter circular area and selected the maximum canopy height value in that region. This extracted height was then compared with the UAV-measured canopy height recorded at the same geolocation and year of observation to perform external validation of model accuracy. The model’s performance was evaluated using Mean Absolute Error (MAE), Root Mean Squared Error (RMSE), and Coefficient of Determination (R²).
The detailed model structure and model parameters can be found in Supplementary Table 1 and Supplementary Table 2.
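Two steps of the training procedure, random cropping and computing the loss only on labeled pixels, can be illustrated with a minimal pure-Python sketch. It assumes rasters are stored as nested lists and uses an L1 loss for illustration; the loss function actually used is given in the Supplementary Tables and is not asserted here. The value 0 marks pixels without GEDI coverage, as in the label raster described above.

```python
import random


def random_crop(bands, labels, size, rng):
    """Cut the same size x size window from every input band and the label raster."""
    rows, cols = len(labels), len(labels[0])
    r0 = rng.randrange(rows - size + 1)  # top-left corner, chosen uniformly
    c0 = rng.randrange(cols - size + 1)

    def window(raster):
        return [row[c0:c0 + size] for row in raster[r0:r0 + size]]

    return [window(band) for band in bands], window(labels)


def masked_l1_loss(pred, labels, nodata=0):
    """Mean absolute error over labeled pixels only (L1 is an assumed choice)."""
    total, count = 0.0, 0
    for pred_row, label_row in zip(pred, labels):
        for p, y in zip(pred_row, label_row):
            if y != nodata:              # skip pixels without GEDI coverage
                total += abs(p - y)
                count += 1
    return total / count if count else 0.0  # 0.0 when no pixel is labeled
```

In this sketch a crop without any labeled pixel yields a loss of 0.0; in practice such crops could equally be skipped or resampled.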

The evaluation of our canopy height model demonstrated good predictive performance. In out-of-sample testing (Supplementary Fig 9a), the model achieved a mean absolute error (MAE) of 1.01 m and a root mean squared error (RMSE) of 2.16 m. To mitigate the influence of low vegetation, we applied a Gaussianization process to normalize the data distribution79, after which the model achieved an MAE of 1.77 m and an RMSE of 2.34 m (Supplementary Fig 9b). In external validation using UAV canopy height observations27 (Supplementary Fig 9c, d), the model maintained robust performance, achieving an MAE of 2.31–2.39 m and an RMSE of 3.48–3.98 m.
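For reference, the three evaluation metrics can be computed as in the following minimal sketch. It assumes R² is defined as one minus the ratio of residual to total sum of squares; other definitions (for example the squared Pearson correlation) would give different values.

```python
import math


def mae(obs, pred):
    """Mean absolute error."""
    return sum(abs(o - p) for o, p in zip(obs, pred)) / len(obs)


def rmse(obs, pred):
    """Root mean squared error."""
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(obs, pred)) / len(obs))


def r2(obs, pred):
    """Coefficient of determination, assumed here to be 1 - SS_res / SS_tot."""
    mean_obs = sum(obs) / len(obs)
    ss_res = sum((o - p) ** 2 for o, p in zip(obs, pred))
    ss_tot = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - ss_res / ss_tot
```

Because RMSE squares the residuals, it penalizes large errors more strongly than MAE, which is why the two are reported together.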

In comparison, two widely used global models32,80 exhibited lower predictive performance (Supplementary Fig 9e, f). Lang’s model80 had a higher MAE of 5.30 m and RMSE of 6.91 m, indicating a weaker ability to capture local canopy height variations in China. The model of Pauls32 also performed less effectively than ours, with an MAE of 4.44 m and an RMSE of 5.80 m. Although our U-Net model has a better overall performance than these two global models, it exhibits the most significant underestimation for canopy heights exceeding 30 m (Supplementary Fig 10a), suggesting a strong height saturation effect or a lack of training samples for very tall forests. This underestimation is particularly prevalent in Southwest China (Supplementary Fig 10b), where tall forests are more common. Across all three models, overestimations were observed in areas with steep slopes (>35°), likely due to terrain-induced errors in GEDI reference data and the snow or shadowing effects in Sentinel-2 imagery13,33,34. Notably, Lang’s model showed the strongest overestimation bias under high-slope conditions (Supplementary Fig 11).

Model training, validation and testing - RF model

AGB was modeled using RF14,81 regressors trained on observed AGB data8,9. Three U-Net derived predictors were generated: (1) a 30 m resolution maximum canopy height map, representing the highest value among 10 m U-Net outputs within a 30 m pixel, (2) a 100 m resolution mean canopy height map, calculated as the mean of 10 m U-Net outputs within a 100 m pixel, and (3) a 100 m resolution tree cover ratio map, indicating the proportion of 10 m pixels with canopy height >5 m within a 1-hectare area. The 100 m resolution was chosen to align with AGB observations, which are measured in Mg ha-1. Additionally, geographic coordinates, elevation, and slope were incorporated as covariates to capture bioclimatic gradients and forest-type variability. This approach leverages RF’s ability to model nonlinear relationships 14,82 while capitalizing on U-Net’s accuracy in canopy structural prediction across all land covers, effectively overcoming the limitations of traditional allometric scaling methods12.
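The derivation of the three U-Net based predictors from a 10 m canopy height grid can be sketched as below. This is a minimal illustration assuming a nested-list raster whose dimensions are multiples of the aggregation factors; the function names are hypothetical and edge handling in our actual processing is not represented.

```python
def block_reduce(grid, factor, reducer):
    """Aggregate non-overlapping factor x factor blocks of a 2-D grid."""
    rows, cols = len(grid), len(grid[0])
    out = []
    for r in range(0, rows - rows % factor, factor):
        out_row = []
        for c in range(0, cols - cols % factor, factor):
            values = [grid[i][j]
                      for i in range(r, r + factor)
                      for j in range(c, c + factor)]
            out_row.append(reducer(values))
        out.append(out_row)
    return out


def mean(values):
    return sum(values) / len(values)


def tree_cover_ratio(values, min_height=5.0):
    """Share of 10 m pixels with canopy height above 5 m."""
    return sum(1 for v in values if v > min_height) / len(values)


def build_predictors(height_10m):
    """Return the three canopy-structure predictors used by the RF model."""
    return {
        "max_height_30m": block_reduce(height_10m, 3, max),     # 3 x 3 pixels = 30 m
        "mean_height_100m": block_reduce(height_10m, 10, mean), # 10 x 10 pixels = 100 m
        "tree_cover_100m": block_reduce(height_10m, 10, tree_cover_ratio),
    }
```

A 100 m block contains exactly one hundred 10 m pixels, so the tree cover ratio is directly the fraction of a hectare covered by trees taller than 5 m.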

The workflow for training, tuning, and evaluating the above-ground biomass (AGB) model is shown in Supplementary Fig 8. In the original dataset, the AGB density values were absent for bare soils. To address this, we randomly selected a subset of bare soil, water bodies, and built-up areas equivalent to 20% of the total original AGB dataset size and assigned these locations an AGB density of zero. This approach helps to fill gaps in the data by providing reasonable AGB density estimates for lower heights. The full dataset was then split into two portions: 80% as the dataset for model training and parameter tuning, and 20% as out-of-sample testing dataset.

For tuning, we applied a grid search approach82,83 with 10-fold cross-validation84 to optimize the hyperparameters: the number of random features considered at each split (“mtry”), and the number of trees (“ntree”)14,81. We varied “mtry” from 3 to 6 and “ntree” from 300 to 1000. Each parameter combination was evaluated using R² and RMSE metrics calculated from 10-fold cross-validation, and the configuration with the lowest RMSE was chosen as the final set of hyperparameters. The final model was then evaluated on the out-of-sample testing dataset to assess predictive performance. The model achieves an MAE of 37.71 Mg ha-1 (Supplementary Fig 12) and maintains stable performance without substantial errors when AGB is below 350 Mg ha-1 (Supplementary Fig 13). Our RF model had a median error of −3.82 Mg ha-1 in the AGB density class of 0–50 Mg ha-1 (interquartile range: −30.61 to 0 Mg ha-1), −7.41 Mg ha-1 in the AGB density class of 50–100 Mg ha-1 (interquartile range: −29.38 to 6.79 Mg ha-1), and 15.51 Mg ha-1 in the AGB density class of 100–150 Mg ha-1 (interquartile range: −3.35 to 28.06 Mg ha-1) (Supplementary Fig 13). Within the out-of-sample testing dataset, we further stratified the samples into two groups (forest and non-forest) based on the tree cover ratio28 at each observation location (threshold = 30%, following FAO forest definition29,30). Separate evaluations were conducted for each group to assess the model’s generalization across ecosystems. The results demonstrate robust and balanced performance across both domains, with R² = 0.60, MAE = 43.17 Mg ha-1, and RMSE = 61.97 Mg ha-1 for forest sites, and R² = 0.74, MAE = 7.58 Mg ha-1, and RMSE = 20.80 Mg ha-1 for non-forest sites (Supplementary Fig 14). These outcomes confirm that the model effectively captures AGB variability not only in dense forests but also in scattered and heterogeneous non-forest environments.
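The tuning logic (an exhaustive grid over “mtry” and “ntree”, scored by cross-validated RMSE) can be sketched generically as follows. This is a minimal illustration, not our tuning script: `cv_rmse` is a placeholder callable that would train and score an RF model across the folds, and the step between successive “ntree” values is left to the caller because it is not specified above.

```python
from itertools import product


def kfold_indices(n_samples, k=10):
    """Yield (train_idx, val_idx) pairs for k contiguous folds.

    Samples are assumed to have been shuffled beforehand.
    """
    sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in sizes:
        val = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        yield train, val
        start += size


def grid_search(mtry_values, ntree_values, cv_rmse):
    """Return the hyperparameter pair with the lowest cross-validated RMSE.

    cv_rmse(mtry, ntree) is a user-supplied placeholder that returns the
    mean RMSE over the folds for one parameter combination.
    """
    best = None
    for mtry, ntree in product(mtry_values, ntree_values):
        score = cv_rmse(mtry, ntree)
        if best is None or score < best[0]:
            best = (score, mtry, ntree)
    return {"rmse": best[0], "mtry": best[1], "ntree": best[2]}
```

A call such as `grid_search(range(3, 7), range(300, 1001, 100), cv_rmse)` would cover the ranges stated above, with the step of 100 trees being an illustrative assumption.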

We also compared our 2019 AGC density map with two China-specific AGC density products from 2019 (Yang et al.9 and Chen et al.19) and one global AGC density product (Santoro et al.85), all of which focus on forest regions. Yang’s product has a spatial resolution of 30 m, matching our own map's resolution, while Chen’s map is at a coarser resolution of 1 km, and Santoro’s global map is at 100 m resolution. Our AGC density map showed a similar spatial distribution pattern to Yang et al.’s product (Supplementary Fig 18b, c).

To provide an independent evaluation, we assessed the accuracy of these products using the same out-of-sample testing dataset employed for our model validation (Supplementary Fig 19). This comparison quantified the performance of each product against observed AGC values from the out-of-sample dataset. Among the three, Yang’s product achieved a low mean absolute error (MAE = 17.49 Mg C ha-1 or 34.98 Mg ha-1) and a moderate bias (5.04 Mg C ha-1 or 10.08 Mg ha-1), which is comparable to our product in overall accuracy but with a larger systematic bias. In contrast, the Santoro and Chen products showed higher errors and larger biases with vertical striping in their scatterplots (Supplementary Fig 19b, c), primarily due to the resolution mismatch and geolocation uncertainty. Their coarser spatial resolutions (100 m and 1 km, respectively) reduce sensitivity to fine-scale forest heterogeneity, as the extracted value at the location of each out-of-sample observation represents an area-averaged AGC over a much larger grid cell, resulting in discretized patterns and inflated apparent errors.

For a detailed comparison of spatial consistency among products, we first resampled our AGC density map to match the spatial resolution of each respective comparison product. We then randomly selected 10,000 locations within forested areas37,38 (Supplementary Fig 18a). AGC density values from both our map and the other products were extracted at these selected points and analyzed using scatterplots. Overall, our AGC density map aligns well with regionally focused models for China (Supplementary Fig 20a) but differs markedly from Santoro’s global map (Supplementary Fig 20b), likely because the global product inadequately represents forest AGC density within China.

To compute total AGC from the AGC density map, we multiplied the AGC density value (Mg C ha-1) of each 30 × 30 m pixel by the pixel area (900 m², i.e., 0.09 ha). To estimate the total uncertainty of our AGC across China, we adopted a Monte Carlo approach86. First, we assigned pixel-level uncertainty as half8,36 of the median error associated with each AGB density class (Supplementary Fig 13). For each pixel, the corresponding error was used as the standard deviation to generate random Gaussian noise. This noise was then added to the original AGC density values, creating simulated AGC realizations. We performed 100 independent Monte Carlo simulations, each time recalculating the total AGC across the entire region. Negative AGC values resulting from noise addition were set to zero. Finally, the uncertainty in total AGC was quantified by calculating the standard deviation across the 100 simulated total AGC values, providing a robust estimate of uncertainty expressed in Pg C. For estimating AGC density within each land cover type, we first applied the land cover mask before performing this calculation.
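The AGC accounting and the Monte Carlo uncertainty estimate can be illustrated with a minimal pure-Python sketch. It assumes a flat list of per-pixel densities, takes the absolute value of the pixel-level error as the standard deviation (median errors can be negative), and uses the sample standard deviation across simulations; these choices and the function names are illustrative assumptions rather than a description of our production code.

```python
import random
import statistics

PIXEL_AREA_HA = 900 / 10_000  # a 30 x 30 m pixel covers 900 m^2 = 0.09 ha


def total_agc(density):
    """Total AGC in Mg C from per-pixel densities in Mg C ha-1."""
    return sum(d * PIXEL_AREA_HA for d in density)


def monte_carlo_total(density, sigma, n_sim=100, seed=0):
    """Return the mean and standard deviation of simulated total AGC (Mg C).

    sigma holds the per-pixel error used as the Gaussian standard deviation.
    """
    rng = random.Random(seed)  # fixed seed is an assumption for reproducibility
    totals = []
    for _ in range(n_sim):
        noisy = [max(0.0, d + rng.gauss(0.0, abs(s)))  # negative values set to zero
                 for d, s in zip(density, sigma)]
        totals.append(total_agc(noisy))
    return statistics.mean(totals), statistics.stdev(totals)
```

Dividing the returned values by 10⁹ converts Mg C to Pg C. Note that this sketch draws independent noise for each pixel, so spatially correlated errors would produce a larger spread than it reports.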