Abstract
Vegetation vertical structure refers to the 3D distribution of vegetation aboveground biomass. Vegetation vertical structure of tropical forests influences other ecological and environmental variables that are essential for the functioning of the ecosystems. Integrating over 5.9 million Global Ecosystem Dynamics Investigation (GEDI) LiDAR (Light Detection and Ranging) footprints, multispectral, and synthetic aperture radar (SAR) imagery, we built five national maps at 25 m resolution of five forest structural metrics for Colombia, South America, for the year 2020. We mapped canopy height, the height of half the cumulative returned energy from GEDI (RH50), total canopy cover, foliage height diversity, and total plant area index. The resulting maps tended to have the highest errors in the Amazon and Andean regions. Total cover had the highest relative error. Interrelationship curves between forest structural metrics of GEDI footprints are maintained across mapped metrics, indicating that the predictive models preserve structural relationships observed in GEDI data. Due to the medium-high spatial resolution and national coverage of the forest structural maps presented in this work, these maps will be useful for evaluating and mapping other ecological variables and conservation priorities in Colombia.
Background & Summary
Three-dimensional vegetation structure (or vegetation vertical structure) refers to the distribution of plant biomass from the ground to the top of the canopy1,2,3. Vegetation vertical structure is one of the Essential Biodiversity Variables, a set of biological variables designed to monitor biodiversity changes in response to the current global environmental crisis4. Vegetation vertical structure influences hydrological cycles5,6,7, climatic regulation8,9,10, primary productivity11,12,13,14, nutrient fluxes15,16, habitat quality17,18,19,20, and biodiversity21,22,23. The most consistent and low-cost method to study vegetation vertical structure over large extents consists of using LiDAR (Light Detection and Ranging) sensors to estimate metrics that describe 3D vegetation structure, due to LiDAR’s ability to penetrate canopies and measure the sub-canopy distribution of vegetation24,25,26,27,28,29.
The NASA Global Ecosystem Dynamics Investigation (GEDI) LiDAR was designed to study vegetation vertical structure near-globally between approximately 51.6 degrees north and south latitude. It acquired data from April 2019 to March 2023, then was paused for 13 months30,31, and began reacquiring data in April 2024. GEDI uses eight laser beams which measure forest structure within ~25 m footprints. Along track, these footprints are spaced by 60 m, with 600 m spacing between beams32. Although GEDI tends to acquire fewer high-quality footprints in the tropics due to the geometrical characteristics of its orbit and persistent cloud cover3,30, never have there been so many detailed measurements of forest vertical structure in tropical ecosystems, the most diverse terrestrial areas on the planet33 and where the highest rates of natural habitat loss occur34.
GEDI footprints have limitations for spatially continuous mapping because these footprint-level products represent samples of the land area, leaving most of the land surface without observations. GEDI is capable of discontinuously sampling only ~4% of the land surface every two years30,31. Consequently, some research groups have integrated GEDI footprints with wall-to-wall multispectral data to enable the spatial prediction of GEDI information for consistent gridded maps of vegetation structure metrics and aboveground biomass. These predictions include a canopy height map at 30 m over the GEDI domain using Landsat predictors and RH95 as an indicator of height35, a global map of canopy height at 10 m using energy level RH98 and Sentinel-2 predictors36, maps of mean and standard deviation of canopy height at 1 km using the energy level RH10037, global maps of relative height metrics at 100 m, 200 m, 500 m, and 1000 m spatial resolutions integrating GEDI and ICESat2 (Ice, Cloud, and Land Elevation Satellite 2)38, and gridded mean aboveground biomass density at 1 km based on the canopy heights generated by GEDI39. This type of work modeled canopy height but did not map metrics related to the distribution of biomass between the ground and the canopy height. Burns et al.3 developed and published annual global maps from 2019 to 2023 of 26 GEDI structural metrics related to the entire vertical vegetation profile at coarse spatial resolutions (1 km, 6 km, and 12 km), gridding the aggregated footprint values3. There are limited published maps of vegetation structure variables generated by GEDI predictions or interpolation that describe the entire vertical vegetation profile with a detailed resolution (≤30 m) for large regions or countries. Those that have been published show great promise for enhancing our understanding of forest structure gradients and species habitat relationships40.
The objective of this research is to describe the construction of, and make available, five maps of metrics of forest vertical structure (Table 1) with relatively high spatial resolution (25 m) for the year 2020 in Colombia, one of the most biodiverse countries on the planet. Colombia includes vegetation types that range from dry to moist to rain forest at altitudes from sea level to above ~5000 m. The maps were constructed by developing predictions for each metric of forest vertical structure using a set of 82 remote sensing predictors (temporal metrics) that included data from multispectral (Sentinel-2) and synthetic aperture radar (SAR) (Sentinel-1 and ALOS-PALSAR) sensors. The inclusion of the two SAR sensors allowed the use of regions of the electromagnetic spectrum that have been related to leaf density (Sentinel-1 C-band)41,42,43,44 and forest height (ALOS-PALSAR L-band)45,46,47, increasing the number of possible predictors and potentially reducing the error of the models. Each of these five national maps of forest structure was formed by a mosaic of regional maps corresponding to the five natural regions into which Colombia is divided. We did this to reduce errors in model predictions related to contrasting environmental conditions among regions, taking advantage of the relative uniformity within regions.
Methods
Study area
Colombia’s mainland territory covers an area of ~1.142 million km² in the northwestern corner of South America. Colombia is categorized as a megadiverse country since it holds record-high counts of several taxa (e.g., birds, mammals, amphibians, butterflies, freshwater fish, orchids, vascular plants), ecosystems, types of vegetation, and types of forests48. Colombian environmental authorities divide the country into five primary natural regions: Andean, Caribbean, Amazon, Chocó (Pacific), and Orinoquía (Fig. 1). It was estimated, in 2020, that 52.1% of Colombia is covered by forests distributed as follows: 64.8% in the Amazon, 17.2% in the Andes, 7.7% in Chocó, 5.5% in the Caribbean, and 4.8% in Orinoquía49. The Amazon is dominated by Tropical Moist-Forest, the Chocó by Tropical Rain-Forest, the Caribbean and Orinoquía by Tropical Dry-Forest, and the Andes presents mosaics of Tropical Dry-Forest, Tropical Moist-Forest, and Tropical Rain-Forest, separated by small distances in some areas due to the high environmental variability generated by the branching of the Andes Mountain range into three mountain ranges (Western, Central, and Eastern Mountain ranges)50.
GEDI response variables
We downloaded all the L2A and L2B granule data of GEDI (version 2.1) for the Colombian territory corresponding to the years 2019, 2020, and 2021 to build regional datasets of the five metrics (Table 1). High-quality footprints were then selected using the comprehensive filtering process published by Burns et al.3. First, we selected quality shots that suitably estimated ground elevation and vegetation structure metrics. Selection criteria included minimal surface water, minimal urban cover, leaf-on vegetation status, vegetation structure metrics within expected ranges, and ground elevation agreement with a reference DEM, among others. Then, we linked the filtered L2A, L2B, and L4A datasets by shot number. Finally, we used a dictionary of local outlier granules produced by the University of Maryland to exclude orbit segments that were identified as local outliers, typically associated with low clouds. This quality-filtering procedure resulted in 5,720,940 high-quality footprints for the Amazon region, 5,620,920 for the Andean region, 5,584,260 for the Caribbean region, 5,630,300 for Orinoquía, and 1,105,860 for Chocó.
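The linking step can be illustrated with a minimal Python sketch. It assumes that each product has already been filtered and loaded as a list of records carrying a `shot_number` key. The remaining field names (`rh98`, `cover`, `agbd`) and all values are illustrative placeholders, not the tables actually used in this work.

```python
def link_by_shot_number(l2a, l2b, l4a):
    """Inner-join three lists of footprint records on 'shot_number'.

    Only shots present in all three (already filtered) products are kept.
    """
    l2b_by_shot = {rec["shot_number"]: rec for rec in l2b}
    l4a_by_shot = {rec["shot_number"]: rec for rec in l4a}
    linked = []
    for rec in l2a:
        shot = rec["shot_number"]
        if shot in l2b_by_shot and shot in l4a_by_shot:
            merged = dict(rec)                 # copy the L2A fields
            merged.update(l2b_by_shot[shot])   # add the L2B fields
            merged.update(l4a_by_shot[shot])   # add the L4A fields
            linked.append(merged)
    return linked


# Hypothetical records; field names other than shot_number are illustrative.
l2a = [{"shot_number": 101, "rh98": 24.5}, {"shot_number": 102, "rh98": 31.0}]
l2b = [{"shot_number": 101, "cover": 0.82}, {"shot_number": 103, "cover": 0.40}]
l4a = [{"shot_number": 101, "agbd": 150.0}, {"shot_number": 102, "agbd": 210.0}]

linked = link_by_shot_number(l2a, l2b, l4a)
print(linked)
```

Because the join is an inner join, a shot that fails the quality filter in any one product is dropped from the linked dataset.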
SAR and multispectral predictors
We constructed 76 mosaics of temporal and textural metrics using the pixel values of all imagery of Sentinel-1 (C-band SAR data) and Sentinel-2 (multispectral data) available between 1 January 2019 and 31 December 2021 in Google Earth Engine (GEE)51. By using all imagery of these three years in all our calculations, one year before and one year after 2020, we maximized the use of data for robust estimation, i.e., reduced error and uncertainty. The temporal metrics were average (X) and standard deviation (SD)21,41,42 while the textural metrics were sum average (SAVG) and difference variance (DVAR)52. These four metrics represented the central tendency (X and SAVG) and the data dispersion (SD and DVAR), generating balance among predictors. Textural metrics were calculated in neighborhoods of 3 × 3 pixels using the glcmTexture and map functions of GEE, which allowed us to estimate texture in each image of the temporal collection and later obtain an average. A description of the temporal and textural metrics is found in Table 2.
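As an illustration of the two temporal metrics, the following Python sketch computes the per-pixel average and standard deviation over a small time series of images. It is not the GEE workflow used here: the nested-list image layout, the use of `None` for masked observations, and the choice of the population standard deviation are all assumptions of the sketch.

```python
import statistics


def temporal_metrics(stack):
    """Return per-pixel (mean, sd) grids from a time series of 2D grids.

    `stack` is a list of images; each image is a list of rows. Masked
    observations (e.g., clouds) are stored as None and skipped.
    """
    n_rows, n_cols = len(stack[0]), len(stack[0][0])
    mean_grid = [[None] * n_cols for _ in range(n_rows)]
    sd_grid = [[None] * n_cols for _ in range(n_rows)]
    for r in range(n_rows):
        for c in range(n_cols):
            series = [img[r][c] for img in stack if img[r][c] is not None]
            if series:
                mean_grid[r][c] = statistics.fmean(series)
                # Population SD is an assumption of this sketch.
                sd_grid[r][c] = statistics.pstdev(series)
    return mean_grid, sd_grid


# Three hypothetical acquisitions of a 2 x 2 pixel area.
stack = [
    [[0.2, 0.4], [None, 0.1]],
    [[0.4, 0.4], [0.3, 0.1]],
    [[0.6, 0.4], [0.5, None]],
]
mean_grid, sd_grid = temporal_metrics(stack)
print(mean_grid)
print(sd_grid)
```

Pixels with no valid observation in the whole period stay `None` in both output grids.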
To develop the Sentinel-1 mosaics, the Sentinel-1 SAR GRD (C-band Synthetic Aperture Radar Ground Range Detected) product data sets53 were processed by applying an angular-based radiometric slope correction using a backscatter coefficient gamma nought, in addition to the calibration and ortho-correction of these data sets54. To develop the Sentinel-2 mosaics, we initially created image mosaics from Sentinel-2 Level-2A surface reflectance products; however, because images were processed in tiles and bidirectional reflectance distribution function (BRDF) adjustments had not been applied, noticeable artifacts due to surface anisotropy and tile boundaries were present. To overcome this limitation, we applied a normalization approach based on the method outlined in Potapov et al. (2012) to Sentinel-2 Level-1C top-of-atmosphere (TOA) imagery and constructed mosaics from these normalized images55. The approach reduces artifacts caused by surface anisotropy and variations in the viewing and solar geometries that remain in Sentinel-2 Level-2A products, resulting in mosaics with more consistent reflectance across scenes and acquisition dates. However, because the procedure adjusts TOA reflectance rather than performing full atmospheric correction, it does not provide true surface reflectance as other physics-based methods do. This method uses MODIS BRDF-adjusted reflectance as the normalization target. Here, we used a 10-year median of MODIS land surface reflectance bands, filtered to include only good-quality observations as indicated by the QA bands. We first selected relatively clear pixels from each image by using the scene classification map from the corresponding Sentinel-2 Level-2A SR product, which is developed by ESA and effectively removes most clouds and cloud shadows from L1C (Top-of-Atmosphere) and L2A (Surface Reflectance) imagery56.
Next, the mean bias between MODIS and Sentinel-2 reflectance was calculated and used to adjust Sentinel-2 TOA reflectance, excluding pixels with large reflectance differences. To account for surface anisotropy, a linear regression between reflectance bias and distance from the center of each Sentinel-2 scene was applied to each spectral band independently. Table 3 shows the corresponding bands between the Sentinel-2 MSI and MODIS sensors used in the normalization process; however, there are no direct MODIS equivalents for the Sentinel-2 red edge bands. We generated synthetic MODIS red edge bands by modeling Sentinel-2 red edge bands as linear combinations of the MODIS red and near-infrared (NIR) bands. To do this, we convolved known surface reflectance spectra from the ECOSTRESS spectral library57,58 with the spectral response functions (SRFs) of the Sentinel-2 red edge bands and the MODIS red and NIR bands (SRFs obtained from the Pyspectral Python library59). The simulated reflectance values from the Sentinel-2 red edge bands served as dependent variables, while MODIS red and NIR band reflectances were used as independent variables.
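The synthetic red edge step can be sketched as an ordinary least-squares fit. The Python example below is a minimal illustration only: it assumes a two-coefficient combination without an intercept, and the reflectance values are invented rather than taken from the ECOSTRESS library. The function name `fit_two_band_combination` is hypothetical.

```python
def fit_two_band_combination(red, nir, target):
    """Least-squares fit of target = a * red + b * nir (no intercept).

    Solves the 2 x 2 normal equations with Cramer's rule.
    """
    s_rr = sum(r * r for r in red)
    s_nn = sum(n * n for n in nir)
    s_rn = sum(r * n for r, n in zip(red, nir))
    s_rt = sum(r * t for r, t in zip(red, target))
    s_nt = sum(n * t for n, t in zip(nir, target))
    det = s_rr * s_nn - s_rn * s_rn
    if det == 0:
        raise ValueError("red and nir are collinear; coefficients not unique")
    a = (s_rt * s_nn - s_nt * s_rn) / det
    b = (s_nt * s_rr - s_rt * s_rn) / det
    return a, b


# Invented band-averaged reflectances standing in for simulated spectra.
modis_red = [0.05, 0.08, 0.12, 0.20, 0.30]
modis_nir = [0.45, 0.40, 0.35, 0.30, 0.32]
# Simulated Sentinel-2 red edge values, built here from known coefficients.
s2_red_edge = [0.7 * r + 0.3 * n for r, n in zip(modis_red, modis_nir)]

a, b = fit_two_band_combination(modis_red, modis_nir, s2_red_edge)
# Apply the fitted coefficients to produce a synthetic MODIS red edge band.
synthetic_red_edge = [a * r + b * n for r, n in zip(modis_red, modis_nir)]
print(a, b)
```

With real spectra the fit would not be exact, so the residuals would indicate how well a red and NIR combination can stand in for each red edge band.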
We also constructed six mosaics from ALOS-PALSAR data using a variation of the methodology described above for Sentinel. We first obtained two metrics, one for each of the two polarizations of the ALOS-2-PALSAR data: the average of the years 2019, 2020, and 2021 using the GEE product 25 m PALSAR/PALSAR-2 mosaic47, since this is a one-date annual product created by mosaicking imagery from PALSAR/PALSAR-2. We then obtained four textural metrics from these multi-year means, SAVG and DVAR for each polarization, estimated in neighborhoods of 3 × 3 pixels. A summary of each backscatter coefficient, band, and index used to build the 82 mosaics for the Sentinel-1, Sentinel-2, and ALOS-2-PALSAR data is shown in Table 4, and the scripts used to build these mosaics are available in the section Code availability.
Prediction and mapping
To construct maps of the five GEDI metrics that describe the structure of Colombian forests for the year 2020 (Table 1), we first built maps for each natural region for each GEDI metric and then mosaicked these regional maps to create final national maps. We used this mapping approach because each natural region tends to have some similarity in forest types and environmental conditions (e.g., climate, topography, altitude), which allowed us to control sources of error in spatial modeling60,61. Other approaches typically applied in remote sensing modeling of large areas, such as mapping throughout the entire study area62 or mapping across regular grids that cover the study area35, could combine different forest types and environmental conditions, increasing modeling errors, given the highly heterogeneous characteristics of the Colombian territory.
Each regional map was constructed using the numerical values of each GEDI metric as the response variable, the associated values of the temporal and textural SAR and multispectral metrics as predictors, and the Random Forest algorithm (RF)63,64. Although in most regions we identified more than 5 million high-quality GEDI footprints, we randomly subsampled 1,200,000 of these footprints for each regional model. This is the approximate maximum number of observations that our high-performance computing system could process for RF modeling with 82 predictors. The Chocó region did not require any subsampling as we identified 1,105,860 high-quality footprints there. We then tuned RF hyperparameters, including the number of variables randomly sampled as candidates at each split and the minimum size of terminal nodes. Once the best regional model was identified, the regional map for each GEDI metric was built based on the 82 mosaics of the SAR and multispectral predictors mentioned previously. We used the R packages “randomForest”64 and “caret”65 for the RF modeling, “Boruta”66 to apply the Boruta algorithm for feature selection, and “raster” for mapping67.
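The subsampling rule can be sketched in a few lines of Python. This is an illustration rather than the R code used for the models: the function name, the fixed seed, and the stand-in footprint identifiers are assumptions, and only the cap of 1,200,000 comes from the text.

```python
import random

MAX_FOOTPRINTS = 1_200_000  # per-region cap reported in the text


def subsample_footprints(footprint_ids, max_n=MAX_FOOTPRINTS, seed=42):
    """Randomly keep at most `max_n` footprints; smaller sets are kept whole."""
    if len(footprint_ids) <= max_n:
        return list(footprint_ids)
    rng = random.Random(seed)  # fixed seed so the draw is repeatable
    return rng.sample(footprint_ids, max_n)


# Small stand-in example: 10 footprints with a cap of 4, then a cap of 20.
ids = list(range(10))
subset = subsample_footprints(ids, max_n=4)
small = subsample_footprints(ids, max_n=20)
print(subset, small)
```

A region with fewer footprints than the cap, as in Chocó, passes through unchanged.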
Data Records
Maps of Colombian forest vertical structure for the year 2020 (Fig. 2) are available to download in GeoTIFF format from Zenodo68: https://zenodo.org/records/15493516. These maps are also accessible in Google Earth Engine through the links below, which are organized according to a tile shapefile in which each map is split into eleven tiles. Tile numbering starts at one at the top left and runs from left to right, top to bottom. The shapefile consists of four rows and three columns, but note that the top row has only two tiles because the upper right tile does not contain any forest pixels in Colombia.
Tile shapefile
CH (canopy height)
COVER (canopy cover)
FHD_PAI (foliage height diversity calculated from plant area index)
PAI (plant area index)
RH50 (height at which 50% of lidar energy is returned)
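The tile layout described above can be expressed as a small lookup. The Python sketch below reflects our reading of that description (four rows, three columns, no tile in the upper right cell); the attribute table of the tile shapefile remains the authoritative source, and the function name is hypothetical.

```python
N_ROWS, N_COLS = 4, 3
MISSING = (0, 2)  # upper right cell has no tile


def tile_number(row, col):
    """Return the 1-based tile number for a 0-based (row, col) grid cell.

    Tiles are numbered left to right, top to bottom, skipping the empty
    upper right cell. Returns None for that cell.
    """
    if not (0 <= row < N_ROWS and 0 <= col < N_COLS):
        raise ValueError("cell outside the 4 x 3 tile grid")
    if (row, col) == MISSING:
        return None
    index = row * N_COLS + col + 1
    # Every cell after the missing one shifts down by one.
    return index - 1 if (row, col) > MISSING else index


layout = [[tile_number(r, c) for c in range(N_COLS)] for r in range(N_ROWS)]
print(layout)
```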
Data Overview
The five national maps of forest vertical structure for Colombia presented in this publication correspond to the year 2020 (Fig. 2), use the coordinate reference system EPSG:4326, have a spatial resolution of 25 m, and are stored with the data type Float32. Forest areas were identified by masking out all areas with <70% tree cover based on the Hansen Global Forest Change database v1.12 (2000–2024)34. The map of canopy height (CH) is in meters, the map of total cover (COVER) in percentage of cover, the map of foliage height diversity (FHD) in the FHD index, the map of total plant area index (PAI) in the PAI index, and the map of the height of half the accumulated energy (RH50) in meters. The details of how map units were calculated are described in Table 1.
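The forest mask can be sketched as a per-pixel threshold. The Python example below is illustrative only: the grids are invented, the two rasters are assumed to be already aligned, and `None` stands in for the no-data value.

```python
def mask_non_forest(metric, tree_cover, min_cover=70.0, nodata=None):
    """Keep metric values only where tree cover is at least `min_cover` percent."""
    return [
        [m if tc is not None and tc >= min_cover else nodata
         for m, tc in zip(metric_row, cover_row)]
        for metric_row, cover_row in zip(metric, tree_cover)
    ]


# Hypothetical 2 x 2 canopy height (m) and tree cover (%) grids.
canopy_height = [[22.5, 8.0], [30.1, 15.4]]
tree_cover = [[85, 40], [70, 69]]

masked = mask_non_forest(canopy_height, tree_cover)
print(masked)  # [[22.5, None], [30.1, None]]
```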
Technical Validation
We implemented three types of validation: 1) cross-validation using sample data (VSD), 2) validation using external data (VED), and 3) validation testing the interrelationship curves of the forest structural variables between footprint data and predicted data (VRC). VSD refers to error estimates based on two relative metrics that allow comparisons between different units, RAE (Relative Absolute Error) and RRSE (Root Relative Squared Error), and two absolute metrics that show the magnitude of the error, MAE (Mean Absolute Error) and RMSE (Root Mean Squared Error). These four error metrics were calculated by partitioning the sample data (GEDI footprints): 70% of the footprints were used for building the maps and 30% were used for testing the resulting maps. These validations using sample data were estimated in each regional map for each of the five forest structural metrics, applying 5000 resamples of the testing data to estimate value ranges. We found error differences among the natural regions; the Amazon and Andean regions tended to present the highest RAE and RRSE values (Fig. 3), with maximum RMSE magnitudes of ~5.8 m for CH, ~0.25 for COVER, ~0.42 for FHD, 1.69 m²/m² for PAI, and ~5.4 m for RH50 (Table 5).
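For reference, the four error metrics can be computed as in the Python sketch below. It uses the standard definitions of RAE and RRSE (error relative to a predictor that always returns the observed mean), which we assume match those applied here. The bootstrap with replacement and the invented values are likewise assumptions of the sketch.

```python
import math
import random


def error_metrics(observed, predicted):
    """Return MAE, RMSE, RAE, and RRSE for paired observed/predicted values."""
    n = len(observed)
    mean_obs = sum(observed) / n
    abs_err = sum(abs(p - o) for o, p in zip(observed, predicted))
    sq_err = sum((p - o) ** 2 for o, p in zip(observed, predicted))
    abs_dev = sum(abs(o - mean_obs) for o in observed)
    sq_dev = sum((o - mean_obs) ** 2 for o in observed)
    return {
        "MAE": abs_err / n,
        "RMSE": math.sqrt(sq_err / n),
        "RAE": abs_err / abs_dev,
        "RRSE": math.sqrt(sq_err / sq_dev),
    }


def bootstrap_ranges(observed, predicted, n_resamples=5000, seed=0):
    """Resample pairs with replacement; return (min, max) per metric."""
    rng = random.Random(seed)
    n = len(observed)
    values = {"MAE": [], "RMSE": [], "RAE": [], "RRSE": []}
    for _ in range(n_resamples):
        idx = [rng.randrange(n) for _ in range(n)]
        obs = [observed[i] for i in idx]
        if len(set(obs)) < 2:
            continue  # relative metrics are undefined without variation
        pred = [predicted[i] for i in idx]
        for name, value in error_metrics(obs, pred).items():
            values[name].append(value)
    return {name: (min(v), max(v)) for name, v in values.items()}


# Hypothetical canopy heights (m) from test footprints and from the map.
observed = [10.0, 20.0, 30.0, 40.0]
predicted = [12.0, 18.0, 33.0, 37.0]

metrics = error_metrics(observed, predicted)
ranges = bootstrap_ranges(observed, predicted, n_resamples=200)
print(metrics)
print(ranges)
```

RAE and RRSE are unitless, which is what allows CH in meters to be compared with COVER or FHD on the same scale.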
VED refers to error estimates using GEDI footprints simulated from 578 km² of discrete ALS-LiDAR across the Chocó natural region. A description of this data set can be found in Fagua et al.21. We followed the approach described by Hancock et al.69 to simulate GEDI footprints using the ALS-LiDAR data and to estimate values of CH, RH50, FHD, and COVER. We note that PAI could not be simulated due to the lack of some parameters necessary for its estimation. The process to simulate the GEDI footprints first consisted of noise removal using the Statistical Outlier Removal method of the R package lasR70. Next, we established a grid with the same resolution as the vertical structure maps. We then identified the centroid of each raster cell that was contained within one of the LiDAR tiles to derive a simulated GEDI footprint and its corresponding CH, RH50, FHD, and COVER values, using the Rgedisimulator tool of the R package rGEDI71. Finally, we estimated the same error metrics described above, RAE, RRSE, MAE, and RMSE, by comparing 5000 resamples of simulated CH, RH50, FHD, and COVER values with the corresponding values from the resulting maps in the Chocó. Simulated GEDI footprints were selected randomly using a spatial filter of 200 m. Parameters and scripts of the GEDI simulation using ALS-LiDAR can be found at the GitHub site for this manuscript (see Code availability). We found higher errors for the VED validation compared with the VSD validation in the Chocó (Table 6). This was expected since error estimates from ALS-LiDAR can be considered field validation24,35,72,73, which usually results in higher errors compared with errors estimated by cross-validation with reserved sample data.
Finally, VRC evaluates the extent to which interrelationship curves between the sample data (our five metrics of GEDI footprints that describe forest structure) are preserved in the predicted data (mapped pixel data)21. Forest structure metrics covary (see Footprint values of Fig. 4); it is therefore important to evaluate whether our independent models preserve the interrelationships of forest metrics as observed by GEDI. We consider this approach a useful complement to typical procedures because it indicates to users that even though the metrics were modeled independently, the predicted values reproduce observed relationships between structure metrics that may be important for forest ecology and conservation. We randomly selected 5000 footprints and 5000 pixels in the resulting maps to compare the interrelationship curves among the metrics. These 5000 pixels of predicted data did not coincide with the locations of the footprints. We observe that interrelationship curves and their parameters between the variable pairs of the footprints were maintained in the mapped pixel data (Fig. 4 and Table 7).
Usage Notes
Since the five maps correspond to Essential Biodiversity Variables, have moderately high spatial resolution (25 m), and provide coverage throughout the continental territory of Colombia, they can be used to monitor and map the state of biodiversity and other environmental variables across the country. Previous works show that similar forest structural metrics have allowed precise mapping of tree alpha diversity, carbon content, and forest degradation, among others21,39,74,75. We note that the regional approach used to create the maps accounts for the natural environmental variation of Colombia’s forests, in addition to reducing errors, and thereby provides more representative maps compared to global estimates that are calculated without regional distinctions or are developed at lower spatial resolutions. Another point to highlight is that our maps were made for the forest areas of Colombia for the year 2020, using sample data for forested areas only. Forest areas were identified based on the Hansen Global Forest Change database v1.12 (2000–2024)34. By focusing on the forest cover type, we sought to reduce uncertainty for forest-specific applications, such as mapping of forest diversity, carbon stock estimation, or forest degradation. These five national maps of forest structural metrics were formed by mosaicking the regional maps using the average of the values in the transition zones. Although this method is commonly used in this type of analysis, unrepresentative values might be found in transition zones.
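The averaging in transition zones can be illustrated with a minimal Python sketch. It assumes that the regional grids are already aligned to a common national extent and that cells outside a region are stored as `None`; the region names and values are invented.

```python
def mosaic_mean(regional_grids):
    """Mosaic aligned regional grids, averaging where regions overlap.

    Each grid covers the full output extent; cells outside a region are None.
    """
    n_rows, n_cols = len(regional_grids[0]), len(regional_grids[0][0])
    mosaic = [[None] * n_cols for _ in range(n_rows)]
    for r in range(n_rows):
        for c in range(n_cols):
            values = [g[r][c] for g in regional_grids if g[r][c] is not None]
            if values:
                mosaic[r][c] = sum(values) / len(values)
    return mosaic


# Two hypothetical regional canopy height grids sharing the middle column.
andes = [[20.0, 24.0, None], [18.0, 22.0, None]]
amazon = [[None, 28.0, 30.0], [None, 26.0, 32.0]]

national = mosaic_mean([andes, amazon])
print(national)  # [[20.0, 26.0, 30.0], [18.0, 24.0, 32.0]]
```

In the shared column the mosaic value lies between the two regional predictions, which is why values in transition zones may differ from either regional model.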
We note the error estimates of our CH maps in the Amazon and Andean regions, where errors were highest, are similar to the error estimates of an existing global CH map35 while the error estimates in other regions, such as Caribbean and Orinoquía, were lower than reported in such maps (Table 5). This, combined with the reported validations, indicates our maps are appropriate for forest assessments and related applications in Colombia.
Data availability
Resulting maps of this research are publicly accessible on Zenodo: https://zenodo.org/records/15493516.
Code availability
The code is publicly accessible on GitHub76: https://github.com/CamiloFaguaUNAL/Forest_Structure_Colombia.
References
McElhinny, C., Gibbons, P., Brack, C. & Bauhus, J. Forest and woodland stand structural complexity: Its definition and measurement. For. Ecol. Manage. 218, 1–24 (2005).
Hall, F. G. et al. Characterizing 3D vegetation structure from space: Mission requirements. Remote Sens. Environ. 115, 2753–2775 (2011).
Burns, P., Hakkenberg, C. R. & Goetz, S. J. Multi-resolution gridded maps of vegetation structure from GEDI. Sci. Data 11, 881 (2024).
Pereira, H. M. et al. Essential Biodiversity Variables. Science 339, 277–278 (2013).
Pérez-Suárez, M., Arredondo-Moreno, J. T., Huber-Sannwald, E. & Serna-Pérez, A. Forest structure, species traits and rain characteristics influences on horizontal and vertical rainfall partitioning in a semiarid pine-oak forest from Central Mexico. Ecohydrology 7, 532–543 (2014).
Aron, P. G., Poulsen, C. J., Fiorella, R. P. & Matheny, A. M. Stable Water Isotopes Reveal Effects of Intermediate Disturbance and Canopy Structure on Forest Water Cycling. J. Geophys. Res. 124, 2958–2975 (2019).
Sun, J. et al. Effects of forest structure on hydrological processes in China. J. Hydrol. 561, 187–199 (2018).
Thom, D. & Keeton, W. S. Stand structure drives disparities in carbon storage in northern hardwood-conifer forests. For. Ecol. Manage. 442, 10–20 (2019).
Foley, J. A. et al. Amazonia revealed: forest degradation and loss of ecosystem goods and services in the Amazon Basin. Front. Ecol. Environ. 5, 25–32 (2007).
Frey, S. J. K. et al. Spatial models reveal the microclimatic buffering capacity of old-growth forests. Sci. Adv. 2, e1501392 (2016).
Gough, C. M., Atkins, J. W., Fahey, R. T. & Hardiman, B. S. High rates of primary production in structurally complex forests. Ecology 100, e02864 (2019).
Clark, D. B., Olivas, P. C., Oberbauer, S. F., Clark, D. A. & Ryan, M. G. First direct landscape-scale measurement of tropical rain forest Leaf Area Index, a key driver of global primary productivity. Ecol. Lett. 11, 163–172 (2008).
Coops, N. C., Hermosilla, T., Hilker, T. & Black, T. A. Linking stand architecture with canopy reflectance to estimate vertical patterns of light-use efficiency. Remote Sens. Environ. 194, 322–330 (2017).
Liu, X. et al. Enhancing ecosystem productivity and stability with increasing canopy structural complexity in global forests. Sci. Adv. 10, eadl1947 (2024).
Asner, G. P. et al. High-resolution mapping of forest carbon stocks in the Colombian Amazon. Biogeosciences 9, 2683–2696 (2012).
Meyer, V. et al. Forest degradation and biomass loss along the Choco region of Colombia. Carbon Balance Manag. 14 (2019).
Sanchez-Daz, B. et al. Modeling of the vertical structure of shade trees in cacao agroforestry systems. Theor. Appl. Ecol. 28–37, https://doi.org/10.25750/1995-4301-2023-1-028-037 (2023).
Basham, E. W. et al. Large, old trees define the vertical, horizontal, and seasonal distributions of a poison frog. Oecologia 199, 257–269 (2022).
Li, S., Hou, Z. Y., Ge, J. P. & Wang, T. M. Assessing the effects of large herbivores on the three-dimensional structure of temperate forests using terrestrial laser scanning. For. Ecol. Manage. 507 (2022).
Coops, N. C. et al. A forest structure habitat index based on airborne laser scanning data. Ecol. Indic. 67, 346–357 (2016).
Fagua, J. C. et al. Mapping tree diversity in the tropical forest region of Chocó-Colombia. Environ. Res. Lett. 16, 54024 (2021).
Marselis, S. M. et al. Evaluating the potential of full-waveform lidar for mapping pan-tropical tree species richness. Glob. Ecol. Biogeogr. n/a (2020).
Feng, G., Zhang, J., Girardello, M., Pellissier, V. & Svenning, J. C. Forest canopy height co-determines taxonomic and functional richness, but not functional dispersion of mammals and birds globally. Glob. Ecol. Biogeogr. 29, 1350–1359 (2020).
Drake, J. B. et al. Estimation of tropical forest structural characteristics using large-footprint lidar. Remote Sens. Environ. 79, 305–319 (2002).
Dubayah, R. O. et al. Estimation of tropical forest height and biomass dynamics using lidar remote sensing at La Selva, Costa Rica. J. Geophys. Res. 115 (2010).
Hancock, S., Disney, M., Muller, J.-P., Lewis, P. & Foster, M. A threshold insensitive method for locating the forest canopy top with waveform lidar. Remote Sens. Environ. 115, 3286–3297 (2011).
Asner, G. P. et al. A universal airborne LiDAR approach for tropical forest carbon mapping. Oecologia 168, 1147–1160 (2012).
Coops, N. C. et al. Modelling lidar-derived estimates of forest attributes over space and time: A review of approaches and future trends. Remote Sens. Environ. 260, 112477 (2021).
Tompalski, P. et al. Estimating Changes in Forest Attributes and Enhancing Growth Projections: a Review of Existing Approaches and Future Directions Using Airborne 3D Point Cloud Data. Curr. For. Reports 7, 1–24 (2021).
Dubayah, R. et al. The Global Ecosystem Dynamics Investigation: High-resolution laser ranging of the Earth’s forests and topography. Sci. Remote Sens. 1, 100002 (2020).
GEDI-team. GEDI Ecosystem Lidar. Available at: https://gedi.umd.edu/ (2024).
Eegholm, B. et al. Global Ecosystem Dynamics Investigation (GEDI) instrument alignment and test. in Proc.SPIE 11103, 1110308 (2019).
Primack, R. B. & Corlett, R. T. Tropical Rain Forests: An Ecological and Biogeographical Comparison. (Blackwell Publishing, 2009).
Hansen, M. C. et al. High-Resolution Global Maps of 21st-Century Forest Cover Change. Science 342, 850–853 (2013).
Potapov, P. et al. Mapping global forest canopy height through integration of GEDI and Landsat data. Remote Sens. Environ. 253, 112165 (2021).
Lang, N., Jetz, W., Schindler, K. & Wegner, J. D. A high-resolution canopy height model of the Earth, https://doi.org/10.48550/ARXIV.2204.08322 (2022).
Dubayah, R. O. et al. GEDI L3 Gridded Land Surface Metrics, Version 2, https://doi.org/10.3334/ORNLDAAC/1952 (2021).
Saatchi, S. S. & Favrichon, S. Global Vegetation Height Metrics from GEDI and ICESat2. https://doi.org/10.3334/ORNLDAAC/2294 (2024).
Dubayah, R. et al. GEDI launches a new era of biomass inference from space. Environ. Res. Lett. 17, 95001 (2022).
Vogeler, J. C. et al. Evaluating GEDI data fusions for continuous characterizations of forest wildlife habitat. Front. Remote Sens. 4 (2023).
Fagua, J. C. & Jantz, P. Mapping Tropical Dry Forest Gradients in an Andean Region with High Environmental Variability. Ecol. Indic. 168, 112744 (2024).
Fagua, J. C., Rodríguez-Buriticá, S. & Jantz, P. Advancing High-Resolution Land Cover Mapping in Colombia: The Importance of a Locally Appropriate Legend. Remote Sensing 15 (2023).
Stendardi, L. et al. Exploiting Time Series of Sentinel-1 and Sentinel-2 Imagery to Detect Meadow Phenology in Mountain Regions. Remote Sensing 11 (2019).
Vreugdenhil, M. et al. Sensitivity of Sentinel-1 Backscatter to Vegetation Dynamics: An Austrian Case Study. Remote Sensing 10 (2018).
Qin, Y. et al. Annual dynamics of forest areas in South America during 2007–2010 at 50-m spatial resolution. Remote Sens. Environ. 201, 73–87 (2017).
Fagua, J. C., Jantz, P., Rodriguez-Buritica, S., Laura, D. & Goetz, S. J. Integrating LiDAR, Multispectral and SAR Data to Estimate and Map Canopy Height in Tropical Forests. Remote Sens. 11(1), 20 (2019).
Shimada, M. et al. New global forest/non-forest maps from ALOS PALSAR data (2007–2010). Remote Sens. Environ. 155, 13–31 (2014).
IDEAM, I. de H. M. y E. A., INVERMAR, I. de I. M. y C. J. B. V. de A., IIAP, I. de I. A. del P. & IAvH, I. H. Informe del Estado del Ambiente y de los Recursos Naturales Renovables. (IDEAM, 2016).
IDEAM, I. de H. M. y E. A. Resultados del monitoreo deforestación año 2020-2021. (2022).
Etter, A. et al. Ecosistemas colombianos: amenazas y riesgos. (Pontificia Universidad Javeriana, 2020).
Gorelick, N. et al. Google Earth Engine: Planetary-scale geospatial analysis for everyone. Remote Sens. Environ. 202, 18–27 (2017).
Haralick, R. M., Shanmugam, K. & Dinstein, I. Textural Features for Image Classification. IEEE Trans. Syst. Man. Cybern. SMC-3, 610–621 (1973).
ESA, E. S. A. Sentinel-1 SAR GRD: C-band Synthetic Aperture Radar Ground Range Detected, log scaling. Earth Engine Data Catalog Available at: https://developers.google.com/earth-engine/datasets/catalog/COPERNICUS_S1_GRD (2022).
Vollrath, A., Mullissa, A. & Reiche, J. Angular-Based Radiometric Slope Correction for Sentinel-1 on Google Earth Engine. Remote Sens. 12 (2020).
Potapov, P. V. et al. Quantifying forest cover loss in Democratic Republic of the Congo, 2000–2010, with Landsat ETM+ data. Remote Sens. Environ. 122, 106–116 (2012).
Pasquarella, V. J., Brown, C. F., Czerwinski, W. & Rucklidge, W. J. Comprehensive quality assessment of optical satellite imagery using weakly supervised video learning. in 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW) 2125–2135, https://doi.org/10.1109/CVPRW59228.2023.00206 (2023).
Baldridge, A. M., Hook, S. J., Grove, C. I. & Rivera, G. The ASTER spectral library version 2.0. Remote Sens. Environ. 113, 711–715 (2009).
Meerdink, S. K., Hook, S. J., Roberts, D. A. & Abbott, E. A. The ECOSTRESS spectral library version 1.0. Remote Sens. Environ. 230, 111196 (2019).
Dybbroe, A. et al. Satellite Sensor Relative Spectral Response data, https://doi.org/10.5281/zenodo.14008148 (2024).
Wang, J. et al. Enhancing Land Cover Mapping in Mixed Vegetation Regions Using Remote Sensing Evapotranspiration. IEEE Trans. Geosci. Remote Sens. 62 (2024).
Tsendbazar, N. et al. Towards operational validation of annual global land cover maps. Remote Sens. Environ. 266, 112686 (2021).
Venter, Z. S. & Sydenham, M. A. K. Continental-Scale Land Cover Mapping at 10 m Resolution Over Europe (ELC10). Remote Sens. 13 (2021).
Breiman, L. Random forests. Mach. Learn. 45, 5–32 (2001).
Liaw, A. Package ‘randomForest’: Breiman and Cutler’s Random Forests for Classification and Regression. (2018).
Kuhn, M. et al. Package ‘caret’: Classification and Regression Training. Available at: https://github.com/topepo/caret/ (2022).
Kursa, M. B. & Rudnicki, W. R. Feature Selection with the Boruta Package. J. Stat. Softw. 36, 1–13 (2010).
Hijmans, R. et al. Package ‘raster’. (r-project.org, 2016).
Fagua, J. C. & Jantz, P. Maps of forest vertical structure for Colombia, a megadiverse country. Zenodo https://doi.org/10.5281/zenodo.15493516 (2025).
Hancock, S. et al. The GEDI Simulator: A Large-Footprint Waveform Lidar Simulator for Calibration and Validation of Spaceborne Missions. Earth Space Sci. 6, 294–310 (2019).
Roussel, J.-R. lasR: Fast and Pipeable Airborne LiDAR Data Tools. (2025).
Silva, C. A. rGEDI: NASA’s Global Ecosystem Dynamics Investigation (GEDI) Data Visualization and Processing. (r-project.org, 2021).
Mascaro, J. et al. Controls over aboveground forest carbon density on Barro Colorado Island, Panama. Biogeosciences 8, 1615–1629 (2011).
Meyer, V. et al. Detecting tropical forest biomass dynamics from repeated airborne lidar measurements. Biogeosciences 10, 5421–5438 (2013).
Torresani, M. et al. LiDAR GEDI derived tree canopy height heterogeneity reveals patterns of biodiversity in forest ecosystems. Ecol. Inform. 76, 102082 (2023).
Liang, M., Duncanson, L., Silva, J. A. & Sedano, F. Quantifying aboveground biomass dynamics from charcoal degradation in Mozambique using GEDI Lidar and Landsat. Remote Sens. Environ. 284, 113367 (2023).
Fagua, J. C. Code for generating maps (rasters at 25 m of spatial resolution) of forest vertical structure for Colombia (South America) from GEDI spaceborne LiDAR. GitHub. Available at: https://github.com/CamiloFaguaUNAL/Forest_Structure_Colombia (2025).
Acknowledgements
We acknowledge the Departamento de Biología of Universidad Nacional de Colombia (Sede Bogotá D.C.) and the School of Informatics, Computing, and Cyber Systems at Northern Arizona University for providing access to high-performance computing resources. Support for J.C.F. was provided by Universidad Nacional de Colombia, Sede Bogotá; Proyecto HERMES 66218 and Semillero de investigación 2971. Support for P.J. was provided by NASA Group on Earth Observations Work Program, Grant #80NSSC18K0338.
Author information
Contributions
Conceptualization, J.C.F., P.J. and S.J.G.; Methodology, J.C.F., P.J., P.B., S.M.J. and J.B.J.; Formal analysis, J.C.F. and P.J.; Investigation, J.C.F., P.J. and S.J.G.; Primary writing, review and editing, J.C.F. and P.J. All authors have reviewed, edited, and agreed to the submitted version of the manuscript.
Ethics declarations
Competing interests
The authors declare no competing interests.
Additional information
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Camilo Fagua, J., Jantz, P., Burns, P. et al. Maps of forest vertical structure for Colombia, a megadiverse country. Sci Data 13, 1 (2026). https://doi.org/10.1038/s41597-025-06297-7
DOI: https://doi.org/10.1038/s41597-025-06297-7






