Abstract
Biodiversity is essential for maintaining ecosystem functionality, ensuring food security, regulating climate, and sustaining human well-being. Yet, in neglected biomes, biodiversity is often lost before many species can even be identified or studied. In Brazil, the Cerrado stands out as the most floristically diverse savanna on the planet, but it remains critically underexplored. To address this gap, the National Forest Inventory (IFN) carried out an unprecedented, systematic sampling across the Brazilian Cerrado. Using 1803 inventoried vegetation plots, we offer a first biome-wide assessment of tree diversity across the Brazilian Cerrado. We mapped tree species richness and alpha-diversity at 0.1-degree resolution and investigated their environmental drivers. Using only spatial location, stratified by vegetation formation, our LOESS-based model provides the most spatially detailed assessment of tree diversity across the Brazilian Cerrado to date, explaining approximately 47% of the observed variation in tree species richness. The south-western and central-western regions of the Brazilian Cerrado exhibited the highest tree diversity. Tree species richness was positively associated with broad-scale precipitation, annual temperature range, and soil clay content, whereas it declined with increasing fire frequency, soil bulk density, and soil aluminium concentration. Our results reveal marked spatial patterns and key environmental drivers of tree diversity across the Brazilian Cerrado, providing a valuable foundation for future biodiversity assessments and evidence-based conservation planning throughout the biome.
Introduction
Biodiversity is fundamental in maintaining ecosystem functions, supporting food security, regulating climate, and safeguarding human well-being1,2. However, accelerating anthropogenic pressures, such as habitat destruction, climate instability, and the intensification of land use, continue to erode biodiversity at an alarming pace3,4. Crucially, the speed and scale of these human-driven impacts now outstrip the pace at which science can document, understand, and respond to biodiversity loss5. As a result, species vanish undocumented, ecological interactions are disrupted, and opportunities to understand and conserve unique biotas are irreversibly lost6. Large-scale, nationally coordinated efforts to sample and monitor biodiversity and its habitats are urgently needed to close this widening gap7,8,9. Such initiatives are key to advancing conservation strategies and addressing critical knowledge gaps in underrepresented non-forest ecosystems like the South American savannas.
South America’s Cerrado is the most floristically diverse savanna on Earth10 and encompasses roughly 2 million km² in central Brazil, in addition to small and discontinuous extensions in eastern Paraguay (over 8000 km²) and Bolivia (about 30000 km²)11. In Brazil, the Cerrado also serves as the nation’s “Cradle of Waters”, supplying runoff to three of South America’s major river basins12. Although the Cerrado harbours an estimated 10500 vascular plant species, including 1605 tree taxa, just 30 species (<2%) account for half of all individual trees13,14. The Brazilian Cerrado has already lost nearly 50% of its native vegetation to agricultural conversion and other land-use changes, leading to projections that nearly 500 species may lose over 80% of their habitat, crossing critical thresholds for extinction risk15. Conservation efforts remain insufficient: strictly protected areas cover only 8% of the original extent16, far below the thresholds needed to secure species persistence in this highly biodiverse savanna. In this context, targeted biodiversity-mapping efforts can pinpoint high-priority regions where expanding protection would most effectively strengthen the overall conservation capacity of the Brazilian Cerrado.
Tree α-diversity, here quantified by Fisher’s α from standardized inventory plots, denotes the local diversity of tree species and arises from the combined influence of climate, soil properties, and disturbance regimes17,18. In the Cerrado, these drivers interact with the fire regime to generate a fine-scale mosaic of vegetation types, from open grasslands to dense woodlands and riparian forests, each supporting distinct assemblages of tree species along a diversity gradient19. The Brazilian Cerrado comprises seven biogeographic districts that differ markedly in climatic conditions, habitat loss, and species composition20,21. Generally, tree diversity and richness increase from nutrient-poor, frequently burned savannas toward more fertile, less fire-prone woodlands and gallery forests21,22. The variation in these ecological factors across the Cerrado and its effect on the floristic composition of vegetation types have been studied primarily at small spatial scales, mostly at individual sites14,23. Studies by Oliveira-Filho & Ratter19 and Ribeiro & Walter19 scaled these local studies up to the entire Brazilian Cerrado, but using a floristic composition approach.
A Spatial distribution of vegetation plots across the Brazilian Cerrado. Green points represent plots located in the Forest Formation, and yellow points represent plots in savanna formation. The inset map at the top-left highlights a zoomed area with high sampling density. B Number of sampled plots in each vegetation type, savannas and forest formations. Total number of plots: 556 forest, 1247 savanna. C Frequency distribution of the number of individuals sampled per plot.
In complex tropical biomes, robust, high-volume, and spatially well-distributed sampling is essential for elucidating diversity patterns and ecosystem geography18,24. Despite numerous botanical inventories in the Cerrado, critical gaps persist at the large scale13,25: What is the total tree species richness, and how is diversity spatially structured? To address these questions, we compiled a unique, biome-wide dataset of 1803 botanically inventoried vegetation plots from national inventory efforts, enabling the first spatially explicit assessment of tree α-diversity alongside species richness across the entire Brazilian Cerrado (Fig. 1). We generated maps of tree species richness at 0.1° resolution (~11 km), using only plot coordinates stratified by major vegetation types (savannas and forests) and then summarised by the floristic biogeographic districts of the Brazilian Cerrado20. We tested potential drivers of the observed patterns in tree species richness in savannas and forests, including climatic variables, soil variables, anthropogenic impact, and fire frequency. Our model, based on the first large-scale dataset of vegetation plots spanning the entire Brazilian Cerrado, provides the most comprehensive and fine-scale portrait of tree species richness in this biodiversity hotspot.
Results
Tree species richness and alpha diversity in sampled plots
Tree density (N ha–1) exhibited high variability, averaging 266.7 ± 267.5 (mean ± sd) individuals per hectare. Plots in forest formation reached a mean density of 334.0 ± 319.0, whereas those in savanna formation averaged 237.0 ± 235.0 individuals per hectare (Fig. 2A, B). Tree alpha-diversity, measured as Fisher’s alpha, averaged 9.3 ± 6.6 across all plots, with a moderately right-skewed distribution (Fig. 2C). When analysed by vegetation formation, forest formation showed higher mean values (10.8 ± 7.8) than savanna formation (8.6 ± 5.8) (Fig. 2D). Tree species richness per hectare (Sha) followed a similar pattern, with a mean of 28.1 ± 17.8 species per hectare across all plots. Richness was greater in forest formation (33.1 ± 20.1) compared to savanna formation (25.9 ± 16.2).
A Tree density (Nha, individuals per hectare); B tree species richness (number of species per hectare); C tree alpha-diversity measured as Fisher’s alpha index; D tree density per hectare by vegetation formations and physiognomies (individuals per hectare); E tree species richness per hectare by vegetation formations and physiognomies; and F tree alpha-diversity measured as Fisher’s alpha index by vegetation formations and physiognomies. Colours indicate vegetation formation: green for forest formations and yellow for savanna formations. Black dashed vertical lines represent the overall mean value for each variable.
When disaggregated by physiognomy, clear differences in tree alpha-diversity, richness, and density emerged (Fig. 2D–F). Dry Forest and Woodland savannas exhibited the highest tree alpha-diversity, with a mean alpha-diversity of 10.9 ± 7.8, tree species richness (Sha) of 33.1 ± 20.2, and average tree density of 319.0 ± 255.0 individuals per hectare. Riparian Forests also supported high diversity levels, with a mean tree alpha-diversity of 10.1 ± 7.1, species richness (Sha) of 32.8 ± 19.7, and the highest density recorded across groups (491.0 ± 689.0 ind./ha), albeit with substantial variability. In contrast, the cerrado sensu stricto, a typical savanna, presented lower diversity metrics: alpha-diversity averaged 8.6 ± 5.8, species richness 25.9 ± 16.3, and density 238.0 ± 235.0 ind./ha. Palm-dominated vegetation showed the lowest values of tree species richness (17.6 ± 7.0) and tree density (77.5 ± 39.7), yet maintained a moderate alpha-diversity of 9.0 ± 5.5.
Spatial patterns of tree species richness
We used tree species richness per hectare as a summary diversity metric due to its high correlation with other diversity indices (Supplementary Fig. 1). Tree species richness, defined as the number of species per hectare, ranged from approximately 10 to 90 across the Cerrado region, with an average of 36.3 species/ha. Predictions showed moderate skewness, with higher frequencies of low to intermediate richness values (Fig. 3C). Vegetation formation explained a substantial portion of this variation, with forest formations presenting higher predicted richness (mean = 33.1 species/ha) compared to savanna formations (mean = 25.9 species/ha) (Fig. 3C).
A Spatial distribution of predicted tree species richness (species per hectare), interpolated using a LOESS model across the Brazilian Cerrado. B Relationship between observed and predicted values of tree species richness per hectare (Sha), coloured by vegetation formation. The dashed line indicates the 1:1 relationship. C Frequency distribution of predicted tree species richness by vegetation formation. Dashed vertical lines represent the mean richness for each formation (green = forest, yellow = savanna). D Tree species richness per hectare (Sha) across biogeographic districts of the Brazilian Cerrado. Boxplots represent the median, interquartile range, and dispersion of values within each district: Central-west (CW), Central (CE), External North (ExN), North-east (NE), North-west (NW), South-east (SE), South-west (SW), and South (S).
The spatial model based on a LOESS smoother (span = 0.2) captured clear spatial gradients in predicted tree species richness across the Cerrado (Fig. 3A). Areas of highest tree species richness were concentrated in the central and western portions of the Cerrado. Tree species richness varied markedly across biogeographic districts in the Cerrado (Fig. 3D). The Central-west (CW) district exhibited the highest tree species richness, followed by the South-west (SW), South-east (SE), and Central (CE) districts. In contrast, the External North (ExN) and North-east (NE) districts had the lowest values, indicating reduced diversity in these northern savanna-dominated regions (Fig. 3D).
A Mean annual precipitation (1986–2020), B fire frequency (annual fire events, 1985–2020), C annual temperature range (1985–2020), D soil bulk density (0–30 cm), E soil aluminium concentration (cmol/kg), F soil clay content (%), and G latitude (decimal degrees). Each point represents a plot classified as forest (green) or savanna (yellow) formation. Red lines represent quantile regression models for the 50th percentile (solid) and the 95th percentile (dashed), capturing both median and upper-bound trends in species richness.
Model performance was reasonable (R² = 0.47), with a strong positive relationship between predicted and observed tree species richness per hectare (Fig. 3B). Residuals from the LOESS model showed no significant spatial autocorrelation (Moran’s I = 0.002, p > 0.05), indicating that most of the spatial signal was captured by the model. The residual error map (Supplementary Fig. 2a) shows the spatial distribution of model residuals, calculated as the difference between observed and predicted values of tree species richness (Sha). Residuals from the combined spatial model had a mean close to zero and differed little between vegetation formations (Supplementary Fig. 2b).
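The residual check described above can be illustrated with a minimal Python sketch of Moran’s I. The text does not state which spatial weights were used, so the binary distance-band weights (plots within `max_dist` of each other count as neighbours) and the function name `morans_i` are assumptions made for illustration only; the significance test (for example by permutation) is omitted.

```python
import math


def morans_i(coords, values, max_dist):
    """Moran's I with binary distance-band weights (assumed weighting scheme).

    coords: list of (x, y) plot positions; values: one residual per plot.
    Two distinct plots are neighbours (weight 1) if they lie within max_dist.
    """
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    cross = 0.0        # sum of w_ij * dev_i * dev_j
    weight_sum = 0.0   # sum of all w_ij
    for i in range(n):
        for j in range(n):
            if i != j and math.dist(coords[i], coords[j]) <= max_dist:
                cross += dev[i] * dev[j]
                weight_sum += 1.0
    if weight_sum == 0:
        raise ValueError("no neighbouring plots within max_dist")
    return (n / weight_sum) * (cross / sum(d * d for d in dev))


# Hypothetical residuals at four plots on a line, one unit apart.
line = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (3.0, 0.0)]
clustered = morans_i(line, [1.0, 2.0, 3.0, 4.0], max_dist=1.0)     # positive
alternating = morans_i(line, [1.0, -1.0, 1.0, -1.0], max_dist=1.0)  # negative
```

Values near zero, as reported for the model residuals, indicate that neighbouring plots do not share systematically similar errors.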
The standard error for Fisher’s alpha was mostly low (Supplementary Fig. 3a and Supplementary Fig. 3c) and consistent across vegetation formations and biogeographic districts (Supplementary Fig. 3b and Supplementary Fig. 3d), although standard errors were higher in the South district, the under-sampled portion of the Cerrado.
The effects of abiotic factors on tree diversity and richness
We used quantile regression to assess the effects of abiotic variables on tree species richness, following the approach of Steege17. Tree species richness per hectare was significantly influenced by a combination of climatic, fire, and edaphic factors across the sampled vegetation plots (Table S1). Among the climatic predictors, both mean annual precipitation (τ = 0.5, p = 0.032) and annual temperature range (τ = 0.5, p = 0.0016) were positively associated with tree species richness. Fire frequency had a significant negative effect on tree species richness (τ = 0.5, p = 0.0024).
Soil conditions were also critical in shaping tree richness patterns. Increased soil bulk density was negatively associated with species richness (τ = 0.5, p < 0.01). Similarly, soil aluminium concentration exhibited a negative relationship (τ = 0.5, p = 0.001). In contrast, soil clay content had a positive effect (τ = 0.5, p < 0.01). Lastly, latitude had a negative association with tree species richness (τ = 0.5, p < 0.01).
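To make the τ notation concrete, the following Python sketch fits a single-predictor quantile regression line by minimising the pinball (check) loss. It is not the authors’ implementation, which is not described beyond the reference to quantile regression; the brute-force search over lines through pairs of points is a simplification that works for small samples because an optimal line always passes through at least two observations. The names `pinball_loss` and `quantile_line` and the toy data are hypothetical, and no p-values are computed.

```python
def pinball_loss(residual, tau):
    """Check loss: under-predictions cost tau, over-predictions cost 1 - tau."""
    return tau * residual if residual >= 0 else (tau - 1) * residual


def quantile_line(x, y, tau=0.5):
    """Return (intercept, slope) of the tau-th quantile regression line.

    Brute force over all lines through two observations (O(n^3) overall),
    which is adequate for a small illustrative sample only.
    """
    best = None
    for i in range(len(x)):
        for j in range(i + 1, len(x)):
            if x[i] == x[j]:
                continue  # vertical line, not a valid regression fit
            slope = (y[j] - y[i]) / (x[j] - x[i])
            intercept = y[i] - slope * x[i]
            loss = sum(
                pinball_loss(yk - (intercept + slope * xk), tau)
                for xk, yk in zip(x, y)
            )
            if best is None or loss < best[0]:
                best = (loss, intercept, slope)
    if best is None:
        raise ValueError("need at least two observations with distinct x")
    return best[1], best[2]


# Toy data: richness rises linearly with a predictor, plus one outlying plot.
predictor = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
richness = [1.0, 3.0, 5.0, 50.0, 9.0, 11.0, 13.0]
median_fit = quantile_line(predictor, richness, tau=0.5)
upper_fit = quantile_line(predictor, richness, tau=0.95)
```

With τ = 0.5 the fit follows the median trend and is insensitive to the outlying plot, whereas τ = 0.95 traces the upper bound of richness, matching the solid and dashed lines described for the driver figure.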
Discussion
This study presents the first comprehensive, biome-wide assessment of tree species richness and alpha-diversity across the Brazilian Cerrado, leveraging an extensive dataset of 1803 standardized inventory plots. By integrating ground-based data with spatial modelling, we provide a portrait of tree species richness patterns and identify key environmental drivers, offering a robust approach for using data from national inventory programmes to inform conservation.
Our findings reveal substantial variation in tree species richness and alpha-diversity across the Brazilian Cerrado, structured by vegetation formation and physiognomy. As expected, and in line with previous localized studies21,26, forest formations (woodland savannas, dry forests, and riparian forests) exhibited significantly higher tree diversity values than savanna formations. These denser and often wetter environments function as tree diversity hotspots and important refugia that sustain higher tree diversity within the broader savanna matrix. In contrast, savanna formations (cerrado sensu stricto), shaped by frequent disturbances and nutrient-poor soils, exhibited lower richness and alpha-diversity26. It is important to note, however, that the absolute values of richness and density reported here may appear lower than those found in other Cerrado-specific inventories14,23,27,28, probably because of the adoption of a stricter inclusion criterion, diameter at breast height (DBH ≥ 10 cm), as opposed to the more commonly used diameter at soil height (DSH > 5 cm)29,30. This methodological difference likely excluded a considerable number of individuals, especially in more open and fire-prone vegetation, where trees often remain below 10 cm DBH. Such variation underscores a key challenge in biome-wide biodiversity assessments: achieving methodological consistency across heterogeneous vegetation types. However, some studies that applied the same methodological approach in the Bolivian Cerrado reported similar tree species richness per hectare in forest formations31,32,33. Despite this limitation, we were able to identify a consistent large-scale pattern in tree diversity and to assess the influence of environmental gradients on tree community structure. Therefore, while absolute values should be interpreted with caution, the relative differences across formations and physiognomies remain ecologically meaningful and informative for conservation planning.
The predicted map shows a non-uniform distribution of tree species richness across the sampled region of the Cerrado. A distinct hotspot of higher richness (approximately 50 species per hectare) is evident in the central-western portion and in the south-western portion, within the transitional zone. The spatial patterns identified, with higher richness concentrated in the central and western portions of the Cerrado, are consistent with biogeographic patterns previously suggested for the Brazilian Cerrado24. The spatial gradients in tree species richness observed across the Cerrado are characterized by a core region of high diversity and transitional patterns toward the Amazonian border. It is well established that savanna trees in the Cerrado–Amazon transition may exhibit greater basal area or biomass than similar formations located in the core region of the Cerrado34,35,36. Our findings reveal a comparable pattern for tree species diversity. This is consistent with recent evidence showing that both the central Cerrado and transitional regions, including the Bolivian Cerrado, exhibit similar patterns of tree species richness28. It is therefore essential to account for both the central Cerrado and its transition zones when delineating conservation units.
In fact, the core Cerrado exhibits relatively high species richness due to its proximity to the centre of species dispersal, whereas more peripheral regions tend to be poorer in species despite the influence of adjacent biomes37, with the exception of the southern Amazonian transition area34. In the South-west, at the transition zone with the Amazon rainforest and in the contiguous Bolivian Cerrado, a belt of high richness is evident due to the proximity to the hyperdiverse Amazonian forests17,34. In these contact zones, which are ecotonal regions of ecological tension, the interchange of Amazonian species contributes to increased richness and shapes species composition38,39,40. In addition, trees in these humid and hyperdynamic environments may grow faster34 and, as a consequence, surpass the minimum diameter threshold (DBH ≥ 10 cm) sooner, thereby increasing apparent richness through a higher inclusion rate of individuals. Conversely, at the northeastern edge, approaching the Caatinga, species richness declines sharply, likely reflecting both increased climatic seasonality and a genuine decrease in tree species richness. However, it is also plausible that this pattern partly results from fewer individuals meeting the inclusion criteria, due to stunted growth under more xeric conditions41,42. In this sense, our results reflect natural patterns driven by seasonality and precipitation, as well as methodological aspects that favour sampling primarily in moist forest formations. Thus, both ecological gradients and sampling thresholds contribute to the observed richness patterns.
The positive association we observed between species richness and both mean annual precipitation and annual temperature range aligns with general patterns in tropical ecosystems where water availability and energy input are key diversity drivers15. This reinforces the sensitivity of Cerrado tree communities to climatic conditions43. Furthermore, the significant negative impact of fire frequency on richness underscores the role of disturbance in shaping these ecosystems44,45,46. While fire is a natural component of the Cerrado, our results suggest that high fire frequency, potentially exacerbated by human activities, can suppress tree diversity, likely by favouring a smaller subset of fire-tolerant species and hindering the establishment of less resistant ones47,48,49,50.
Soil conditions proved equally critical. The negative relationships of richness with soil bulk density and aluminium concentration, together with the positive relationship with clay content, point towards the importance of soil physical structure and chemistry. In general, soil bulk density has strong effects on forest structure, and clay content on biodiversity51. Lower bulk density likely facilitates better root penetration and water infiltration52, while the association with higher aluminium concentration (lower pH, i.e. more acidic soils) is consistent with patterns in many high-diversity, nutrient-poor tropical systems, including specific Cerrado formations where diverse flora thrives on dystrophic soils53,54,55. In addition, clayey soils are associated with flatter areas and generally have higher nutrient contents56,57. Consequently, clay-rich soils can support greater tree diversity, likely due to their enhanced water and nutrient retention capacity58.
The pronounced negative association between richness and latitude observed in our plots reflects a well-known global biodiversity gradient. However, interpreting this within the Cerrado requires considering the biome’s internal heterogeneity. Recent Brazilian Cerrado-wide analyses13, utilizing hundreds of plots, provide a more comprehensive picture of spatial diversity patterns and hyperdominance. Their work reveals extreme hyperdominance, where fewer than 2% of species comprise half of all individuals, a pattern mirroring Amazonia. While our study focused on richness drivers within specific plots, our findings contribute to understanding the factors that structure these broader patterns. For instance, the environmental factors we identified likely influence the distribution and abundance of both hyperdominant and rare species across the Brazilian Cerrado.
The critical conservation status of the Cerrado, with estimates of massive tree loss since 1985 and potential extinction threats for hundreds of species due to deforestation (exceeding that of the Amazon in recent years), lends urgency to understanding diversity drivers13. Our identification of specific climatic, fire, and edaphic factors influencing richness provides valuable information for targeted conservation strategies. Managing fire regimes, protecting areas with favourable soil conditions, and considering the impacts of climate change on precipitation patterns are crucial steps. Furthermore, understanding the factors promoting richness in specific locations can help prioritize areas to safeguard the substantial number of rare species and potentially undiscovered diversity within the biome13.
In conclusion, our biome-wide analysis reveals clear spatial patterns and environmental controls on tree diversity in the Brazilian Cerrado. The central Cerrado and the Amazonian transition zone exhibit the highest levels of tree diversity, a pattern strongly associated with precipitation, temperature range, fire frequency, and soil texture properties. These findings underscore the urgent need for expanded and strategically placed conservation efforts, informed by spatially explicit data, to safeguard the unique tree flora of the Brazilian savanna against ongoing anthropogenic pressures. By establishing a robust baseline and identifying key diversity drivers, this work provides an essential foundation for future monitoring, research, and evidence-based conservation planning in the Brazilian Cerrado.
Methods
Tree-inventory data and sampling design
Tree-inventory data were obtained from the Brazilian National Forest Inventory (NFI; IFN in Portuguese59), following its standardized protocol for the Brazilian Cerrado. Sampling points were laid out in a regular grid (Grade Nacional de Pontos Amostrais, GNPA) at 20 km intervals across the entire Brazilian portion of the Cerrado biome (hereafter referred to as the Cerrado), not including the Bolivian and Paraguayan extensions. At each GNPA point, we installed a “Maltese-cross” cluster of four rectangular subplots (20 × 50 m; 1000 m² each), oriented toward the four cardinal directions. Within each subplot, all free-standing woody individuals with diameter at breast height (DBH, 1.30 m) ≥ 10 cm were measured (DBH, total height) and identified to the lowest taxonomic level possible (species, genus, or family). Field sampling was carried out between 2018 and 2020. We retained only plots located within native vegetation (savanna or forest formations), as classified in the field. We classified woodland savanna, riparian forest, and dry forest as forest formations, whereas the cerrado sensu stricto, considered typical savanna (including cerrado típico, cerrado ralo, cerrado rupestre, and cerrado denso), was classified as a savanna formation. In addition, babaçuais (palm-dominated vegetation formed primarily by Attalea speciosa) and veredas (palm swamp vegetation dominated by Mauritia flexuosa) were also considered savanna formations, following the physiognomic classification of Ribeiro and Walter19. The raw database comprised 3030 plots; we removed those in non-native vegetation or affected by access impediments. To ensure taxonomic and ecological consistency, we excluded any plot where more than 15% of stems lacked species-level identification and retained only subplots located within native forest and savanna vegetation. Additionally, exotic species were excluded based on consultations with the REFLORA60 Virtual Herbarium database.
After applying these filters, 1803 plots remained for all subsequent calculations and spatial mapping.
Calculating tree diversity and species richness
Observed species richness (S) was first tallied as the total number of tree species recorded in each plot. Because the effective sampling area (Aᵢ, in hectares) could vary, due to access impediments or partial subplots, we standardized richness to a common area. For each plot, tree alpha-diversity was expressed as Fisher’s alpha, a diversity measure theoretically insensitive to sample size, obtained by iteratively solving α = S/ln(1 + N/α), with N as the total number of individuals and S as the total number of morpho-species per plot61. Species richness per hectare (Sha) was then estimated as Sha = α × ln(1 + Nha/α), where Nha is the tree density in individuals per hectare1. We also calculated Hill numbers (q = 0, 1, 2)62 and rarefied richness for 100 individuals using the iNEXT package63. However, due to high correlations among these metrics, we retained Fisher’s alpha and tree species richness per hectare (Sha) as the primary diversity indicators for subsequent analyses.
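The two formulas above can be sketched in a few lines of Python. This is only an illustration, not the authors’ code (they report using R with the vegan package): the bisection solver, the function names `fishers_alpha` and `richness_per_hectare`, and the example plot values are all assumptions.

```python
import math


def fishers_alpha(n_species, n_individuals, tol=1e-10):
    """Solve S = alpha * ln(1 + N / alpha) for alpha by bisection.

    The left-hand side increases with alpha from 0 towards N, so a finite
    positive root exists only when 0 < S < N.
    """
    if n_species < 1 or n_individuals <= n_species:
        raise ValueError("need 1 <= S < N for a finite Fisher's alpha")

    def gap(alpha):
        return alpha * math.log1p(n_individuals / alpha) - n_species

    low, high = 1e-9, 1.0
    while gap(high) < 0:          # widen the bracket until it holds the root
        high *= 2.0
    while high - low > tol * high:
        mid = 0.5 * (low + high)
        if gap(mid) < 0:
            low = mid
        else:
            high = mid
    return 0.5 * (low + high)


def richness_per_hectare(alpha, density_per_ha):
    """Expected richness at a given density: Sha = alpha * ln(1 + Nha / alpha)."""
    return alpha * math.log1p(density_per_ha / alpha)


# Hypothetical plot: 20 species among 100 stems, scaled to 250 stems per hectare.
alpha = fishers_alpha(n_species=20, n_individuals=100)
s_ha = richness_per_hectare(alpha, density_per_ha=250.0)
```

Because α is estimated from the observed S and N and then reused with the per-hectare density, plots with different effective areas are placed on a common one-hectare footing.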
We calculated diversity and density metrics for each plot according to the physiognomies observed in the field within the broader savanna and forest formations. We report our results both by the two major vegetation formations (savanna and forest) and by physiognomy. Palm-dominated vegetation (babaçuais, dominated by Attalea speciosa, and palm swamps) and the cerrado sensu stricto were considered savanna formations, whereas riparian forest, dry forest, and woodland savanna were considered forest formations.
All calculations were performed in R Software (v4.3.1) using base functions and the vegan package for α-diversity64. This two-step approach, raw richness standardized by area, plus Fisher’s α-based extrapolation, ensures comparability of diversity across plots with different sampling extents and stem densities.
Modelling diversity and richness patterns
The spatial predictions of tree alpha-diversity and tree species richness for the Cerrado vegetation were mapped at a resolution of 0.1 degree (~11 × 11 km), based on the original vegetation extent of the Brazilian Cerrado, stratified into the two major vegetation types (savanna and forest formations) without considering detailed vegetation physiognomies.
For our spatial interpolations we used loess regression, using only longitude, latitude, and their interaction as independent variables and tree alpha-diversity and species richness as the dependent variables17,63. We used a span of 0.2 for all loess regressions, a 2nd degree polynomial, and no extrapolation17.
For each of the two categories (forest formations and savanna formations), we constructed a separate spatial interpolation model of tree alpha-diversity and tree species richness across the Cerrado. For example, for tree species richness, we made a single spatial interpolation for all plots located in savannas. This interpolation was then used to predict tree species richness for each grid cell, using savanna vegetation class boundaries extracted from the TerraBrasilis platform, developed by the Brazilian National Institute for Space Research (INPE) under the Brazilian Biomes Monitoring Program65. The same procedure was applied to all plots established in forest formations (Supplementary Fig. 4).
Whereas the TerraBrasilis map is based on major vegetation types, the vegetation type of each plot was determined independently of this map, based on the field observations of those who established the plot65. These classifications followed the physiognomic criteria of the classical Brazilian vegetation system, distinguishing savanna and forest formations according to Ribeiro & Walter19. Consequently, it is possible that a plot classified by observers as savanna is located in a grid cell classified as forest on the map. Regardless, such a plot was used in the savanna spatial model, as the field observations are considered to be correct. As we allowed no extrapolation, pixels too far from the plots were not given a value. As a second-degree polynomial may produce upward and downward exaggerations, values higher than the observed maximum in the data were set to the maximum value and those lower than the minimum were set to the minimum value.
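As a rough illustration of the interpolation and clamping steps, the Python sketch below fits a local second-degree polynomial in longitude and latitude (including their interaction) with tricube weights, using the nearest fraction `span` of plots, and then limits a prediction to the observed range. It is a simplified stand-in for a LOESS fit, not the authors’ code: predictor scaling and the no-extrapolation rule are omitted, and the names `solve_linear`, `loess_predict`, and `clamp_to_observed` are hypothetical.

```python
import math


def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular local fit: too few distinct neighbours")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        rest = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - rest) / m[i][i]
    return x


def loess_predict(lon0, lat0, plots, span=0.2):
    """Local quadratic estimate at (lon0, lat0); plots = [(lon, lat, value)]."""
    q = max(8, math.ceil(span * len(plots)))   # neighbours in the local window
    nearest = sorted(
        plots, key=lambda p: math.hypot(p[0] - lon0, p[1] - lat0)
    )[:q]
    bandwidth = math.hypot(nearest[-1][0] - lon0, nearest[-1][1] - lat0)
    if bandwidth == 0:
        raise ValueError("all neighbours coincide with the target location")
    ata = [[0.0] * 6 for _ in range(6)]   # weighted normal equations
    aty = [0.0] * 6
    for lon, lat, value in nearest:
        dx, dy = lon - lon0, lat - lat0
        u = math.hypot(dx, dy) / bandwidth
        w = (1 - u ** 3) ** 3                      # tricube weight
        row = [1.0, dx, dy, dx * dx, dx * dy, dy * dy]
        for i in range(6):
            aty[i] += w * row[i] * value
            for j in range(6):
                ata[i][j] += w * row[i] * row[j]
    # Coordinates are centred on the target, so the intercept is the estimate.
    return solve_linear(ata, aty)[0]


def clamp_to_observed(prediction, observed):
    """Limit a prediction to the observed minimum and maximum."""
    return min(max(prediction, min(observed)), max(observed))
```

In use, one such model would be fitted per formation (savanna plots only, then forest plots only), and each grid-cell estimate would be passed through `clamp_to_observed` with the plot values of that formation.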
Lastly, to provide a clear guideline on which biogeographical districts harbour greater tree species richness, we summarized the predicted values according to the biogeographic districts of the Brazilian Cerrado as proposed by Françoso et al. (2019), allowing for improved visualization.
Environmental, climatic, and anthropogenic drivers of diversity
To investigate the drivers of tree diversity patterns across forest and savanna formations, we evaluated a set of climatic, edaphic, and anthropogenic variables. Prior to model fitting, all numerical predictors were tested for multicollinearity. Because the soil dataset contained a large number of highly correlated indicators, we first performed a preliminary Pearson correlation analysis exclusively among the edaphic variables to reduce redundancy before combining them with the other predictors. After this initial filtering step, all remaining predictors were evaluated using Pearson’s correlation, and highly correlated variables (r > 0.7) were excluded to avoid redundancy (Supplementary Fig. 5), resulting in the selection of 12 variables (Supplementary Fig. 6).
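A correlation filter of this kind can be sketched as follows. The text does not say which member of a correlated pair was dropped, so the greedy rule used here (keep a predictor only if its absolute correlation with every previously kept predictor is at most 0.7) is an assumption, as are the function names and the toy values.

```python
import math


def pearson_r(a, b):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)   # undefined for constant inputs


def drop_correlated(predictors, threshold=0.7):
    """Greedy filter: keep predictors in the order given, skipping any whose
    |r| with an already kept predictor exceeds the threshold (assumed rule)."""
    kept = []
    for name, values in predictors.items():
        if all(abs(pearson_r(values, predictors[k])) <= threshold for k in kept):
            kept.append(name)
    return kept


# Hypothetical values at five plots.
candidates = {
    "annual_precipitation": [1.0, 2.0, 3.0, 4.0, 5.0],
    "wet_season_precipitation": [2.0, 4.0, 6.0, 8.0, 10.5],  # nearly collinear
    "clay_content": [3.0, 1.0, 4.0, 1.0, 5.0],
}
selected = drop_correlated(candidates)
```

Because the rule is order-dependent, listing the ecologically preferred variables first determines which member of each correlated pair survives.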
Climatic predictors were selected based on their ecological relevance to tree community structure. We extracted temperature-related variables from the ERA566 reanalysis dataset provided by the Copernicus Climate Data Store and precipitation data from the CHIRPS dataset (Climate Hazards Group InfraRed Precipitation with Station data)67. Variables considered included mean annual temperature, temperature seasonality, and temperature annual range (from ERA5), along with annual precipitation and precipitation seasonality (from CHIRPS). In addition, we calculated the number of months with less than 100 mm of rainfall as a proxy for intra-annual drought intensity.
Soil data were obtained from two main sources. Soil bulk density (BDOD, g cm⁻³) was extracted from the SoilGrids database68 at 250 m resolution, averaged across the upper 0–30 cm soil layer. The other edaphic variables (organic matter, organic carbon, pH in water and KCl, nitrogen, sodium, magnesium, calcium, potassium, aluminium, exchangeable hydrogen, aluminium saturation, sum of bases, base saturation, cation exchange capacity, clay, silt, and sand) were obtained from a harmonized, model-based global soil dataset69, which integrates multiple soil profiles and environmental covariates across continents. We performed a preliminary Pearson correlation analysis exclusively among the edaphic variables to reduce redundancy and to identify the variables that best summarized the major soil properties before model fitting. Based on this soil correlation matrix (Supplementary Fig. 7), we selected a subset of soil predictors for subsequent analyses: bulk density (BDOD), aluminium concentration (Al), cation exchange capacity (CEC), pH, clay content, total carbon, and base saturation. Only after this filtering step were these soil variables combined with the climatic and anthropogenic predictors for the full correlation assessment.
Anthropogenic pressures were evaluated at both the landscape and plot levels. We used land cover data from PRODES (Projeto de Monitoramento do Desmatamento na Amazônia Legal por Satélite), developed by the Brazilian National Institute for Space Research (INPE)70. The PRODES dataset, accessed via the TerraBrasilis platform (https://terrabrasilis.dpi.inpe.br/), provides annual deforestation maps at 30 m spatial resolution, allowing consistent monitoring of forest loss over time. Using this dataset, we derived historical metrics of forest cover and land-use history. For each vegetation plot, we delineated three buffer zones with radii of 1000 m. Within each buffer, we calculated total native vegetation cover in 2020, including grassland, savanna, and forest formations.
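The buffer calculation can be pictured as counting raster cells around a plot. The sketch below is a simplified illustration under stated assumptions: a single circular buffer, a tiny hypothetical grid with coarse 500 m cells rather than 30 m pixels, text labels instead of numeric class codes, and a cell counted when its centre lies inside the buffer. None of these details are taken from the authors' scripts.

```python
from math import hypot

# Classes treated as native vegetation, following the text.
NATIVE_CLASSES = {"forest", "savanna", "grassland"}


def native_cover_fraction(grid, cell_size_m, plot_xy, radius_m):
    """Share of cells whose centre lies within radius_m of plot_xy
    and whose label is a native vegetation class."""
    inside = 0
    native = 0
    for r, row in enumerate(grid):
        for c, label in enumerate(row):
            x = (c + 0.5) * cell_size_m  # cell-centre coordinates
            y = (r + 0.5) * cell_size_m
            if hypot(x - plot_xy[0], y - plot_xy[1]) <= radius_m:
                inside += 1
                if label in NATIVE_CLASSES:
                    native += 1
    return native / inside if inside else float("nan")


# Hypothetical 4 x 4 land-cover grid, plot at the grid centre.
grid = [
    ["pasture", "forest", "forest", "pasture"],
    ["savanna", "savanna", "crop", "crop"],
    ["savanna", "grassland", "crop", "crop"],
    ["forest", "forest", "pasture", "pasture"],
]
cover = native_cover_fraction(grid, 500.0, (1000.0, 1000.0), 1000.0)
print(round(cover, 3))
```

In this toy grid the four corner cells fall outside the 1000 m radius, so 12 cells are evaluated and 7 of them are native.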
Land-use history was characterized over a 35-year period (1985–2020) from MapBiomas71 using 1000 m buffers around each vegetation plot. Three metrics were derived:
Land-use duration
The average number of years in which each pixel was classified as non-native vegetation. We defined non-native vegetation as any pixel not classified as forest formation, savanna formation, or grassland in MapBiomas. This variable measures the persistence of anthropogenic land use in areas where native Cerrado vegetation has been replaced by human-modified land uses.
Land-use change frequency
The average number of land cover transitions per pixel over the time series.
Fire frequency
The total number of years a pixel was recorded as burned, based on MODIS fire detection data (500 m resolution) extracted for a 5000 m buffer. Only one fire event per year per pixel was counted, regardless of the number of detections. Fire frequency represents the cumulative number of fire years before the sampling date of each vegetation plot (either 2018 or 2020).
These metrics were computed on the Google Earth Engine platform using custom scripts applied to MapBiomas71 and MODIS72 products.
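The three metrics reduce to simple counts over per-pixel time series. The following sketch shows that logic in plain Python; it is not the Google Earth Engine code used in the study, and the function names, class labels, and five-year toy series are hypothetical.

```python
# Classes treated as native vegetation, following the text.
NATIVE = {"forest", "savanna", "grassland"}


def land_use_duration(pixel_series):
    """Mean number of years per pixel classified as non-native."""
    totals = [sum(1 for c in s if c not in NATIVE) for s in pixel_series]
    return sum(totals) / len(totals)


def change_frequency(pixel_series):
    """Mean number of class transitions between consecutive years."""
    totals = [sum(1 for a, b in zip(s, s[1:]) if a != b) for s in pixel_series]
    return sum(totals) / len(totals)


def fire_years(detections_by_year):
    """Years with at least one detection for one pixel; repeated
    detections within a year count once."""
    return sum(1 for n in detections_by_year if n > 0)


# Two hypothetical pixels observed over five years.
pixels = [
    ["savanna", "savanna", "pasture", "pasture", "pasture"],
    ["forest", "crop", "forest", "crop", "crop"],
]
duration = land_use_duration(pixels)   # (3 + 3) / 2
transitions = change_frequency(pixels)  # (1 + 3) / 2
burned = fire_years([0, 3, 1, 0, 2])    # detections per year for one pixel
print(duration, transitions, burned)
```

Here both pixels spend three years under non-native cover, but the second changes class three times, which is the distinction the duration and change-frequency metrics are meant to capture.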
Testing the model fit and data analysis
We calculated the percentage of variation in tree species richness explained by the combined spatial models by regressing the observed values on the predicted values in a simple linear regression. We assessed the goodness of prediction by mapping the standard error of the LOESS regression and examining it by region and forest type. To further assess the validity of the model predictions, we tested for spatial autocorrelation in the residuals using the function Moran.I() in the ape package and mapped the residuals to check for any remaining spatial signal73,74. A histogram of all values was constructed for each variable, as well as a boxplot by region and forest type. For cross-validation, the map fitted without a given plot was used to predict the species richness/ha of that left-out plot; this prediction can be considered an unbiased estimate of the quality of the resulting map.
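Both diagnostics can be written out explicitly. The sketch below computes the R² of an observed-versus-predicted regression and Moran's I with inverse-distance weights. It is a plain-Python illustration, not a reimplementation of ape's Moran.I(): the weighting scheme, function names, and toy residuals are assumptions, and no significance test is included.

```python
from math import hypot


def r_squared(observed, predicted):
    """R^2 of a simple linear regression of observed on predicted,
    which equals the squared Pearson correlation."""
    n = len(observed)
    mo, mp = sum(observed) / n, sum(predicted) / n
    sop = sum((o - mo) * (p - mp) for o, p in zip(observed, predicted))
    soo = sum((o - mo) ** 2 for o in observed)
    spp = sum((p - mp) ** 2 for p in predicted)
    return sop * sop / (soo * spp)


def morans_i(values, coords):
    """Moran's I with inverse-distance weights (assumed scheme)."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    weighted_sum = 0.0
    weight_total = 0.0
    for i in range(n):
        for j in range(n):
            if i != j:
                d = hypot(coords[i][0] - coords[j][0], coords[i][1] - coords[j][1])
                w = 1.0 / d
                weight_total += w
                weighted_sum += w * dev[i] * dev[j]
    return (n / weight_total) * weighted_sum / sum(d * d for d in dev)


# Four hypothetical plots along a transect.
coords = [(0, 0), (1, 0), (2, 0), (3, 0)]
i_clustered = morans_i([2, 1, -1, -2], coords)    # similar residuals are neighbours
i_alternating = morans_i([1, -1, 1, -1], coords)  # neighbours differ
expected_null = -1 / (len(coords) - 1)            # expectation without autocorrelation
print(i_clustered, expected_null, i_alternating)
```

Values of I above the null expectation of −1/(n − 1) indicate that nearby residuals resemble each other, values below it indicate that they alternate; residuals close to the expectation suggest the spatial model has absorbed most of the spatial signal.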
We modelled the effects of climatic variables, anthropogenic pressures, and large-scale soil properties on species richness in forest and savanna formations, which had the highest number of sampled plots in the Cerrado. We analysed environmental, soil, and anthropogenic effects only for tree species richness, for two main reasons: (1) species richness is more straightforward to interpret than Fisher’s alpha, and (2) species richness is strongly correlated with Fisher’s alpha and other diversity metrics. We applied quantile regression for a more robust estimation, as it is less sensitive to outliers49,50. We used tau = 0.5 to estimate the median relationship and tau = 0.9 to estimate its upper bound, the latter representing the maximum potential value of species richness for a given predictor. This approach has previously been effective in revealing limiting factors in tropical ecosystems17. All analyses were conducted in the R programming environment using mostly custom scripts75.
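The role of tau can be illustrated with a small, self-contained sketch. The analyses were run in R, so the Python code below is not the authors' implementation; the function names, the brute-force fitting strategy, and the wedge-shaped toy data are assumptions chosen for clarity. It relies on the known property that, for a line with an intercept and one predictor, an optimal quantile-regression fit passes through at least two data points.

```python
def pinball_loss(residual, tau):
    """Check loss: tau * r for r >= 0 and (tau - 1) * r for r < 0."""
    return residual * (tau - (residual < 0))


def fit_quantile_line(x, y, tau):
    """Exact one-predictor quantile regression for small samples.
    Every pair of points with distinct x defines a candidate line;
    the candidate with the lowest total check loss is returned."""
    best = None
    n = len(x)
    for i in range(n):
        for j in range(i + 1, n):
            if x[i] == x[j]:
                continue
            slope = (y[j] - y[i]) / (x[j] - x[i])
            intercept = y[i] - slope * x[i]
            loss = sum(
                pinball_loss(yk - (intercept + slope * xk), tau)
                for xk, yk in zip(x, y)
            )
            if best is None or loss < best[0]:
                best = (loss, intercept, slope)
    return best[1], best[2]


# Hypothetical wedge: richness is capped at 2 * x but often much lower.
x = [1, 1, 2, 2, 3, 3, 4, 4]
y = [1, 2, 1, 4, 1, 6, 1, 8]
upper_intercept, upper_slope = fit_quantile_line(x, y, 0.9)
median_intercept, median_slope = fit_quantile_line(x, y, 0.5)
print(upper_intercept, upper_slope)
```

With tau = 0.9 the fitted line follows the upper edge of the wedge, which is the sense in which a high quantile describes a limiting factor; an ordinary mean regression would instead run through the middle of the scatter. The brute-force search scales cubically with sample size and is meant only for demonstration.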
Statistics and reproducibility
All tests were carried out using all plots. All tests and data are available in the online supplementary material, so the analyses can be reproduced.
Data availability
All raw tree inventory data used in this study are publicly available through the National Forest Information System (SNIF) and can be downloaded at: https://snif.florestal.gov.br/pt-br/paineis-interativos/dados-botanicos-ifn-panorama. Processed datasets used in the analyses, as well as all derived model outputs, will be made publicly available in a Zenodo repository (https://doi.org/10.5281/zenodo.17910260). The code used is also available in a Zenodo repository (https://doi.org/10.5281/zenodo.17911171).
References
Díaz, S., Fargione, J., Chapin, F. S. III & Tilman, D. Biodiversity loss threatens human well-being. PLOS Biol. 4, e277 (2006).
Zhang, R., Tian, D., Wang, J. & Niu, S. Critical role of multidimensional biodiversity in contributing to ecosystem sustainability under global change. Geogr. Sustainability 4, 232–243 (2023).
Cardinale, B. J. et al. Biodiversity loss and its impact on humanity. Nature 486, 59–67 (2012).
Díaz, S. et al. Pervasive human-driven decline of life on Earth points to the need for transformative change. Science 366, eaax3100 (2019).
Keck, F. et al. The global human impact on biodiversity. Nature 641, 395–400 (2025).
Scheffers, B. R. et al. The broad footprint of climate change from genes to biomes to people. Science 354, aaf7671 (2016).
Ribeiro, R. et al. Advances in Brazil’s National Forest Inventory. Res. Ideas Outcomes 10, e138413 (2024).
Johnson, C. N. et al. Biodiversity losses and conservation responses in the Anthropocene. Science 356, 270–275 (2017).
Proença, V. et al. Global biodiversity monitoring: From data sources to Essential Biodiversity Variables. Biol. Conserv. 213, 256–263 (2017).
Brummitt, N., Araújo, A. C. & Harris, T. Areas of plant diversity—What do we know? Plants People Planet 3, 33–44 (2021).
Daniel, V., Munhoz, C., Proença, C. & Campos, Y. Sabanas del Cerrado en Bolivia: delimitación, síntesis terminológica y sus características fisionómicas [Cerrado grasslands and savanna in Bolivia: delimitation, terminology and physiognomies]. Kempffiana 12, 47–80 (2016).
Vourlitis, G. L. et al. Tree growth responses to climate variation in upland and seasonally flooded forests and woodlands of the Cerrado-Pantanal transition of Brazil. For. Ecol. Manag. 505, 119917 (2022).
Alvarez, F. et al. Tree species hyperdominance and rarity in the South American Cerrado. Commun. Biol. 8, 695 (2025).
Ratter, J. A., Bridgewater, S. & Ribeiro, J. F. Biodiversity Patterns of the Woody Vegetation of the Brazilian Cerrado. in Neotropical Savannas and Seasonally Dry Forests (CRC Press, 2006).
Strassburg, B. B. N. et al. Moment of truth for the Cerrado hotspot. Nat. Ecol. Evol. 1, 1–3 (2017).
Françoso, R. D. et al. Habitat loss and the effectiveness of protected areas in the Cerrado Biodiversity Hotspot. Nat. Conserv. 13, 35–40 (2015).
ter Steege, H. et al. Mapping density, diversity and species-richness of the Amazon tree flora. Commun. Biol. 6, 1–14 (2023).
ter Steege, H. et al. A spatial model of tree α-diversity and tree density for the Amazon. Biodivers. Conserv. 12, 2255–2277 (2003).
Ribeiro, J. & Walter, B. As principais fitofisionomias do bioma Cerrado. in 151–212 (2008).
Françoso, R. D. et al. Delimiting floristic biogeographic districts in the Cerrado and assessing their conservation status. Biodivers. Conserv. 29, 1477–1500 (2020).
Bueno, M. L. et al. The environmental triangle of the Cerrado Domain: Ecological factors driving shifts in tree species composition between forests and savannas. J. Ecol. 106, 2109–2120 (2018).
Cordeiro, N. G. et al. The role of environmental filters in Brazilian savanna vegetation dynamics. For. Ecol. Manag. 500, 119645 (2021).
Bridgewater, S., Ratter, J. A. & Felipe Ribeiro, J. Biogeographic patterns, β-diversity and dominance in the cerrado biome of Brazil. Biodivers. Conserv. 13, 2295–2317 (2004).
Hortal, J. et al. Seven shortfalls that beset large-scale knowledge of biodiversity. Annu. Rev. Ecol. Evolut. Syst. 46, 523–549 (2015).
Guilherme, F. A. G., Júnior, A. F., Pereira, F. C., Silva, G. E. & Maciel, E. A. Disturbances and environmental gradients influence the dynamics of individuals and basal area in the Cerrado complex. Trees People 9, 100298 (2022).
Oliveira-Filho, A. T. & Ratter, J. A. Vegetation Physiognomies and Woody Flora of the Cerrado Biome. in The Cerrados of Brazil: Ecology and Natural History of a Neotropical Savanna (eds Oliveira, P. S. & Marquis, R. J.) 91–120 (Columbia University Press, 2002).
de Miranda, S. do C. et al. Regional variations in biomass distribution in Brazilian Savanna Woodland. Biotropica 46, 125–138 (2014).
Morandi, P. S. et al. Tree diversity and above-ground biomass in the South America Cerrado biome and their conservation implications. Biodivers. Conserv. 29, 1519–1536 (2020).
Felfili, J. M. et al. Análise comparativa da florística e fitossociologia da vegetação arbórea do cerrado sensu stricto na Chapada Pratinha, DF - Brasil. Acta Bot. Bras. 6, 27–46 (1992).
Moro, M. & Martins, F. Métodos de levantamento do componente arbóreo-arbustivo. in 174–212 (2011).
Killeen, T. J., Jardim, A., Mamani, F. & Rojas, N. Diversity, composition and structure of a tropical semideciduous forest in the Chiquitanía region of Santa Cruz, Bolivia. J. Trop. Ecol. 14, 803–827 (1998).
An Overview of the Plant Diversity, Biogeography and Conservation of Neotropical Savannas and Seasonally Dry Forests. in Neotropical Savannas and Seasonally Dry Forests (eds Pennington, R. T., Lewis, G. P. & Ratter, J. A.) (CRC Press, 2006).
Killeen, T. J. et al. The Chiquitano Dry Forest, the Transition between Humid and Dry Forest in Eastern Lowland Bolivia. in Neotropical Savannas and Seasonally Dry Forests (CRC Press, 2006).
Marimon, B. S. et al. Disequilibrium and hyperdynamic tree turnover at the forest–cerrado transition zone in southern Amazonia. Plant Ecol. Diversity 7, 281–292 (2014).
Marimon Junior, B. H. & Haridasan, M. Comparação da vegetação arbórea e características edáficas de um cerradão e um cerrado sensu stricto em áreas adjacentes sobre solo distrófico no leste de Mato Grosso, Brasil. Acta Bot. Bras. 19, 913–926 (2005).
Kunz, S. H., Ivanauskas, N. M. & Martins, S. V. Estrutura fitossociológica de uma área de cerradão em Canarana, Estado do Mato Grosso, Brasil. Acta Sci. Biol. Sci. 31, 255–261 (2009). https://doi.org/10.4025/actascibiolsci.v31i3.1625
Castro, A. A. J. F., Martins, F. R., Tamashiro, J. Y. & Shepherd, G. J. How Rich is the Flora of Brazilian Cerrados? Ann. Mo. Bot. Gard. 86, 192–224 (1999).
Maracahipes Santos, L. et al. Diversity, floristic composition, and structure of the woody vegetation of the Cerrado in the Cerrado–Amazon transition zone in Mato Grosso, Brazil. Braz. J. Bot. 38, 877–887 (2015).
Méio, B. B. et al. Influência da flora das florestas Amazônica e Atlântica na vegetação do cerrado sensu stricto. Braz. J. Bot. 26, 437–444 (2003).
Maciel, E. A., Oliveira-Filho, A. T. & Eisenlohr, P. V. Prioritizing rare tree species of the Cerrado-Amazon ecotone: warnings and insights emerging from a comprehensive transitional zone of South America. Nat. Conserv. 14, 74–82 (2016).
de Lima, R. B. et al. Giants of the Amazon: How does environmental variation drive the diversity patterns of large trees? Glob. Change Biol. 29, 4861–4879 (2023).
Olivares, I., Svenning, J.-C., van Bodegom, P. M. & Balslev, H. Effects of warming and drought on the vegetation and plant diversity in the Amazon basin. Bot. Rev. 81, 42–69 (2015).
Hofmann, G. S. et al. The Brazilian Cerrado is becoming hotter and drier. Glob. Change Biol. 27, 4060–4073 (2021).
das Graças Costa, A. et al. Influence of fire on woody vegetation of savanna and forest formations in the Cerrado biome. J. For. Res. 34, 1207–1216 (2023).
Hoffmann, W. A. Fire and population dynamics of woody plants in a neotropical savanna: Matrix model projections. Ecology 80, 1354–1369 (1999).
Hoffmann, W. A. et al. Tree topkill, not mortality, governs the dynamics of savanna–forest boundaries under frequent fire in central Brazil. Ecology 90, 1326–1337 (2009).
Dantas, V. de L., Batalha, M. A., França, H. & Pausas, J. G. Resource availability shapes fire-filtered savannas. J. Vegetation Sci. 26, 395–403 (2015).
Hoffmann, W. A. et al. Ecological thresholds at the savanna-forest boundary: how plant traits, resources and fire govern the distribution of tropical biomes. Ecol. Lett. 15, 759–768 (2012).
Simon, M. F. & Pennington, T. Evidence for Adaptation to Fire Regimes in the Tropical Savannas of the Brazilian Cerrado. Int. J. Plant Sci. 173, 711–723 (2012).
Maracahipes, L. et al. How to live in contrasting habitats? Acquisitive and conservative strategies emerge at inter- and intraspecific levels in savanna and forest woody plants. Perspect. Plant Ecol., Evolut. Syst. 34, 17–25 (2018).
Giles, A. L. et al. Simple ecological indicators benchmark regeneration success of Amazonian forests. Commun. Earth Environ. 5, 780 (2024).
Emilio, T. et al. Soil physical conditions limit palm and tree basal area in Amazonian forests. Plant Ecol. Diversity 7, 215–229 (2014).
Lira-Martins, D. et al. Soil properties and geomorphic processes influence vegetation composition, structure, and function in the Cerrado Domain. Plant Soil 476, 549–588 (2022).
Abrahão, A. et al. Soil types select for plants with matching nutrient-acquisition and -use traits in hyperdiverse and severely nutrient-impoverished campos rupestres and cerrado in Central Brazil. J. Ecol. 107, 1302–1316 (2019).
Haridasan, M. Nutritional adaptations of native plants of the cerrado biome in acid soils. Braz. J. Plant Physiol. 20, 183–195 (2008).
Laurance, S. G. W. et al. Influence of soils and topography on Amazonian tree diversity: a landscape-scale study. J. Vegetation Sci. 21, 96–106 (2010).
Quesada, C. A. et al. Variations in chemical and physical properties of Amazon forest soils in relation to their genesis. Biogeosciences 7, 1515–1541 (2010).
Costa, F. R. C., Schietti, J., Stark, S. C. & Smith, M. N. The other side of tropical forest drought: do shallow water table regions of Amazonia act as large-scale hydrological refugia from drought? New Phytol. 237, 714–733 (2023).
Inventário Florestal Nacional. Serviço Florestal Brasileiro https://www.gov.br/florestal/pt-br/assuntos/ifn/ifn.
Fisher, R. A., Corbet, A. S. & Williams, C. B. The Relation Between the Number of Species and the Number of Individuals in a Random Sample of an Animal Population. J. Anim. Ecol. 12, 42–58 (1943).
Chao, A. et al. Rarefaction and extrapolation with Hill numbers: a framework for sampling and estimation in species diversity studies. Ecol. Monogr. 84, 45–67 (2014).
Hsieh, T. C., Ma, K. H. & Chao, A. iNEXT: an R package for rarefaction and extrapolation of species diversity (Hill numbers). Methods Ecol. Evolut. 7, 1451–1456 (2016).
Oksanen, J., Kindt, R. & Legendre, P. The Vegan Package: Community Ecology Package 2007. (2013).
Downloads – Terrabrasilis. https://terrabrasilis.dpi.inpe.br/downloads/.
Hersbach, H. et al. The ERA5 global reanalysis. Q. J. R. Meteorol. Soc. 146, 1999–2049 (2020).
Funk, C. et al. The climate hazards infrared precipitation with stations—a new environmental record for monitoring extremes. Sci. Data 2, 150066 (2015).
Poggio, L. et al. SoilGrids 2.0: producing soil information for the globe with quantified spatial uncertainty. SOIL 7, 217–240 (2021).
Arruda, D. M., Fernandes-Filho, E. I., Solar, R. R. C. & Schaefer, C. E. G. R. Combining climatic and soil properties better predicts covers of Brazilian biomes. Sci. Nat. 104, 32 (2017).
PRODES — Coordenação-Geral de Observação da Terra. http://www.obt.inpe.br/OBT/assuntos/programas/amazonia/prodes.
Souza, C. M. et al. Reconstructing three decades of land use and land cover changes in Brazilian biomes with landsat archive and earth engine. Remote Sens. 12, 2735 (2020).
Giglio, L., Randerson, J. T. & van der Werf, G. R. Analysis of daily, monthly, and annual burned area using the fourth-generation global fire emissions database (GFED4). J. Geophys. Res. Biogeosci. 118, 317–328 (2013).
Paradis, E., Claude, J. & Strimmer, K. APE: Analyses of Phylogenetics and Evolution in R language. Bioinformatics 20, 289–290 (2004).
Gaspard, G., Kim, D. & Chun, Y. Residual spatial autocorrelation in macroecological and biogeographical modeling: A review. J. Ecol. Environ. 43, 19 (2019).
R: The R Project for Statistical Computing. https://www.r-project.org/.
Acknowledgements
This study was conducted by the Brazilian Forest Service (SFB/MMA) within the framework of the project ‘Forest Information for Conservation-Oriented Management and Valuation of Brazil’s Forest Resources by the Public and Private Sectors – ATN/SX-19186-BR’, funded by the Forest Investment Program (FIP). The project was implemented by the Inter-American Development Bank (IDB), with financial execution by the Inter-American Institute for Cooperation on Agriculture (IICA) and technical implementation by the Brazilian Forest Service (SFB). A.L.G. is supported by CNPq (384994/2025-3) and thanks CNPq (402140/2024-9). L.D.B.F. is supported by a CNPq Productivity Fellowship (Process no. 306247/2025-9).
Author information
Contributions
A.L.G. conceived and designed the study. A.L.G., L.B.F., R.F., and S.B.R. performed the formal data analysis. H.J.S., L.C.M., R.S.R., G.S.C.P., M.R.T., E.D., B.G., R.C., R.A.L., A.L.T., C.R.A.J., and D.F.C.A. contributed with resources and data curation regarding the Brazilian National Forest Inventory. A.L.G. wrote the original draft of the manuscript. All authors contributed to the interpretation of results and critically reviewed and edited the final manuscript.
Ethics declarations
Competing interests
The authors declare no competing interests.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.
About this article
Cite this article
Giles, A.L., Faria, L.D.B., Souza, H.J.d. et al. Mapping and revealing the tree biodiversity of the Brazilian Cerrado through biome-wide sampling efforts. npj biodivers 5, 9 (2026). https://doi.org/10.1038/s44185-026-00122-z
DOI: https://doi.org/10.1038/s44185-026-00122-z






