Introduction

The Gran Chaco ecoregion is the second-largest forest in South America behind Amazon rainforest, the largest dry forest in the region. It covers 1,066,000 km2 throughout subtropical South America (eastern Bolivia, southwestern Brazil, western Paraguay, and central-northern Argentina)1,2. This region is one of the few vast areas with forests and savannas forming a transition between tropical and temperate areas in the world, and it is one of the most threatened subtropical regions by deforestation3. This is presumably one of the most diverse extratropical ecoregions of South America, but its floristic diversity was still not adequately analyzed, especially regarding to the number and proportion of endemics species and its threats. It is still lacking an exhaustive inventory of the native flora, although several initiatives are ongoing by means of regional floras in Argentina (for example, the preliminary notes and chapters of the “Chacoan flora”4, and the last volume of5), Paraguay6, and Brazil7.

Although in appearance, the Gran Chaco seems to be a region with relatively few endemic taxa, when compared with adjacent ecoregions such as Atlantic Forests8, more recent works have suggested that its diversity is underestimated. For instance, in Paraguay, according to a recent census9 at least 20% of the total endemic taxa of the country are restricted to Gran Chaco and its surrounding areas. Several Chacoan sectors still remain to be undersampled10 or, at least, they have not been investigated. In spite of these multiple gaps in flora documentation in the Gran Chaco, other South American ecoregions, such as the Atlantic Forests and Pampas, have been historically well explored by botanists and they have now extensive inventories of species11.

The Gran Chaco ecoregion can be divided into different districts or subregions, according to different authors: Humid Chaco, Dry Chaco, and Sierra Chaco10,12) which differ in climatic regime, vegetation and soils. Humid Chaco comprises the eastern region of the Gran Chaco, exhibiting comparatively higher amount annual of precipitations and more regularly distributed in the year than Dry Chaco, located in the western area of the ecoregion. However, Sierra Chaco, located at southwestern extreme and small sectors of the western area of the ecoregion, exhibits a gradient of precipitations and temperatures according to altitude differences1. These differences determine the presence of different types of vegetation and different floristic composition of the formations, as well as different levels of endemism. Consequently, Dry Chaco and Sierra Chaco are the richest subregions with endemic or highly restricted species10. A fourth subregion, the Arid Chaco has been also recognized by several authors, becoming the driest area of the Dry Chaco13.

The Fabaceae (Leguminosae) is one of the most diversified families in the Gran Chaco. There are 300 species and 20% of their infraspecific and specific taxa are endemic. Furthermore, ca. 60% of the species are related to some endemic lineages10. This means that this plant family is a good indicator or “proxy” to understand the ecological basis of the current distribution of the flora in this ecoregion. It is possible because of the ecological dominance of the family in the Gran Chaco ecoregion and the omnipresence and high diversity of its genera in this area10. All these facts imply that this family is potentially useful to identify Chacoan hotspots of biodiversity.

Objectives such as improving regionalization and the detection of biodiversity hotspots can be accomplished by means of methodologies based on the distribution of the biota. These approaches include, for example, Areas of Endemism (AE), Centers of Endemism (CE) and hierarchical clustering methods, such as Parsimony Analysis of Endemism (PAE) or Unweighted Pair Group Method with Arithmetic Mean (UPGMA)14. Although all of these methods are generally used in biogeography, hierarchical methods are more focused in classification of areas, but they frequently do not reflect the presence of ecotones15.

The former method, known as Areas of Endemism (AEs) considers areas inhabited by taxa with congruent patterns of distribution16. They are assumed to generate similar responses to different environmental factors, converging to similar ranges of distribution. These areas are expected to exhibit overlapping, since different factors are affecting in different ways different components of the biota. Consequently, it is common to find partial overlapped and nested patterns of AEs15. Another approach to study regionalization and species richness is the delimitation of Centers of Endemism (CEs), which are geographic areas where endemic species are common to a whole center of an area or region17. The definition of centers of endemism could be useful for regionalization purposes, as well as for planning conservation18. It could help identify priority areas to preserve, especially when funds and available territory are restricted19.

The objectives of this work were: (1) to identify areas and centers of endemism in the Gran Chaco ecoregion, focusing in the Fabaceae family; (2) to identify the taxa that define these areas and centers; (3) to determine the environmental variables that explain their existence of these areas; (4) to identify implications and applications of this information for further research in conservation and biogeography at regional scale. The main goal was to detect potential biodiversity hotspots considering that Fabaceae is an adequate proxy for the South American biogeographic regionalization and for conservation purposes.

Materials and methods

Plant database

Firstly, we built a database with specific and infraspecific endemic and restricted taxa for Fabaceae from the Gran Chaco ecoregion. The criteria for including taxa were based on our previous classification from the Chacoan flora10. The database included the specific and infraspecific taxa with the range of “Chacoan-endemic” and having “Chacoan lineage” or “Chacoan/Seasonally Dry Tropical Forests (SDTF) lineage”. We assigned the status of “endemic” to those taxa which were registered exclusively within the boundaries of the Gran Chaco.

We also analyzed the lineage, based on criteria of20 with modifications of10: the taxa attributed to “Chacoan-lineage” were those registered in the Gran Chaco as well as in the surrounding ecoregions grouped in the Chaco Domain, such as Monte, Pampa, and Espinal. The taxa exhibiting Chacoan/SDTF lineage were recorded either in ecotones between Chacoan and SDTF formations or in mixed Chaco/SDTF formations. The SDTF concept has been explained and mapped in detail by us in previous works10,21,22). For our studies about modeling and ecological niches, the dataset was improved by means of removing localities of the same taxa if they were distant up to 5 km in the R Environment, except those taxa with less than 10 occurrences.

We registered the localities of occurrence, based on the information that we previously compiled10 but improved by adding new occurrences. For that objective, we visited herbaria with representative specimens of the Chacoan flora (BAB, CGMS, CPAP, CTES, MO, NY, SI), compiled data from the databases of specimens from the taxa occurrences by administrative divisions in Flora del Cono Sur23. In addition, we also included information from field observations and recently collected specimens in several field trips across the region in recent years (Argentinean Chaco: 2021, 2022; Brazilian Chaco: 2004–2020). We complemented the information with data from online herbarium databases, such as TROPICOS (https://tropicos.org/3E) or Species Link (https://splink.cria.org.br/3E). However, we adopted these sources only when the specimen’s images were available or determinations were reliable. All the plant material that we used complies with guidelines of our institutions and with the national and international laws, as well is explained in the respective section.

Study area

The background area included the Gran Chaco ecoregion, with the boundaries previously proposed by us10 and surrounding locations, such as the Espinal, Pampas, Monte, Mesopotamic Savannas, Alto Paraná Atlantic Forests, Araucaria Humid Forests, Inter-Andean Dry Valleys and Seasonally Dry Tropical Forest ecoregions10,12. Therefore, this area included the entire range of many species that are highly restricted to the Gran Chaco but occurring in the adjacent ecoregions. Our background was based in a stack delimitated by 70°W and 50°W and 16°S and 41°S. The study area was divided in 500 grid cells of 1° × 1°, which was the basis of main analyses of endemism, although for previous screening of the Analysis of Endemism we also mapped grid cells of 1.25° × 1.25° and 0.75° × 0.75°.

In addition, to taking into account the possible subsampling of certain areas of the Gran Chaco ecoregion, we mapped all other records of native flora from the GBIF database, assuming a subsample of 10,000 accessions for the most diverse plant families in the region (Asteraceae, Fabaceae and Poaceae). Later, we calculated the number of records per grid cell, and ranged them according to the deciles 0, 0.1, 0.9 and 1; the grid cells were classified in three categories: (1) not sampled; (2) scarcely sampled; (3) adequately sampled. We used the raster package24 for reading, writing and manipulating spatial data.

Determination of areas of endemism (AE)

We estimated the Areas of Endemism (AEs) by means of the Endemicity Analysis using the method of Szumik and Goloboff16) in the NDM/VNDM software25. We evaluated different grid cell sizes (0.75° × 0.75°, 1° × 1°, 1.25° × 1.25°, and 1.5° × 1.5°). The grid cell sizes were based on previous works (e.g26,27,28,29). , for the Southern Cone South American region.

We ran analyses with different radius size filling (10, 25, and 50% of cell size). We assumed that these filling rates are adequate, considering, on the one hand, that some areas of Gran Chaco seemed to be scarcely sampled and, on the other hand, that there is a strong environmental gradient in western areas of the region. The conditions of the analyses in NDM/VNDM were the following: swapping a cell at a time, 100 repetitions, saving sets with score higher than 2, discarding superfluous sets as they are found, and replacing a set improved during swapping. We saved the sets with two or more endemic species. Later, we estimated the consensus areas, considering a relaxed consensus rule (30%). We also registered the taxa that contributed to the scores of the AEs. AEs are indicated in the NDM/VNDM software.

Determination of centers of endemism (CE)

The centers of endemism represent a different way to delimitate areas of high conservation and diversity values to the AEs. They are mainly areas where endemic species are frequently occurring14 and they are not focused strictly in the biogeographical history. We defined the centers of endemism in the Gran Chaco region by means of two approaches: the calculation of endemicity indexes and a cluster analysis based on similarity matrices. These approaches have been used in recent works of endemic floras14,30). Firstly, we built a matrix of presence/absence of taxa in the area of study, considering a grid cell size of 1° × 1°. We used QGIS 3.1431 to join the attributes of the matrices and the maps of the region.

In the first case (endemicity indexes), we calculated the Endemic Richness (ER), the Relative Endemic Richness (RER) and the Weighted Endemic Richness (WER) indexes18. ER was calculated as the sum of endemic taxa per grid cell, while considered the “range-restricted” taxa to calculate RER as those occurring in maximally 3 grid cells; WER was calculated in two steps: (1) calculation the inverse of the cells occupied by each species (1/number of grid cells where species occurs in); (2) sum of weighted scores per grid cell.

In the second case (cluster analysis) we built a matrix of similarity, based on the similarity coefficient of Jaccard, which has been largely used in ecology community studies32,33) and performed a cluster analysis. We used the functions dist and hcluster in the R Environment34 to build the dissimilarity matrix and to run the cluster analyses, respectively. We selected the UPGMA clustering algorithm instead of PAE, since its performance hass been better to identify CEs of flora in recent works14.

For the delimitation and identification of the centers of endemism, we considered them as “cores” of ampler biogeographic regions, based in the occurrence of endemic taxa. Consequently, we performed two cluster analyses: (1) including all endemic taxa, to infer the presence of some regionalization pattern; (2) including a limited number of endemic taxa, assuming the number of taxa that maximizes the number of endemic taxa, but preserving as close as possible to the clustering pattern found in 1).

The representative taxa of the CEs were determined assuming the following criteria: (1) in CEs determined by means UPGMA, we considered those taxa occupying more than 50% of the total cells in each CE; (2) in CEs determined by means the Endemic Richness indexes, we considered the entire set of taxa occurring in these cells.

Occurrences and distribution models

We developed distributional models in order to mitigate range under-commission in poorly sampled areas and to explore the bioclimatic variables that are influencing the occurrence of the endemic taxa. The primary criterion for selecting taxa was that they contributed to the definition of the endemism areas detected by NDM-VNDM. However, in many cases, the taxa performing these areas had insufficient occurrences to generate reliable models. In this case, we alternatively selected species contributing to the Centers of Endemism detected by other methods with an adequate number of localities giving reliable models.

For selected taxa contributing to AEs partially or not overlapping, we developed models of current distribution by means of the MAXENT algorithm35, to explore the areas with suitable environmental conditions for growing endemic Chacoan taxa. In order to prevent methodological restrictions and biases, we only analyzed the endemic and restricted taxa of legumes having more than 5 occurrences.

We initially evaluated the entire set of bioclimatic variables from WORDCLIM 2.1 version36 at a resolution of 30 s. For this aim, we made a correlation analysis by means of the calculation of the Pearson correlation coefficient and their significance levels in the ecospat package of the R environment37. We considered a threshold of 0.75 and significant differences between them.

Finally, we used a set of 6 uncorrelated bioclimatic variables of that were previously considered to influence in the current distribution of the studied taxa. We compared them with the models of the entire set of bioclimatic variables and a mix of bioclimatic and edaphic variables from the SoilGrids project (<https://SoilGrids250m 2.0>), but they were similar to the models of the set of 6 uncorrelated variables. This set of 6 variables was chosen as the set of explanatory variables to run the final models. We ran MAXENT in the ENMEval package38, which allows selecting the best model varying the regularization multipliers and the feature classes.

In that package, we evaluated three regularization multipliers: “linear”, “linear + quadratic” and “linear + quadratic + hinge” and three regularization multipliers (1 to 3), excepting in those taxa with less than 10 occurrences; in this case, we used only two features for the evaluation (“linear” and “linear + quadratic”). We partitioned the occurrences using the “Spatial Block” method. The background consisted in 10,000 occurrences that were randomly sampled from the raster stack. We selected the partition algorithm “maxent.jar”. Once the models were run, we selected the best of them considering the lowest Akaike’s information criterion corrected for small samples (AICc). We mapped the logarithmic output and explored the contribution of each variable to the selected models, considering the values after permutation. After the generation of the distribution models, we performed a stack in DIVA–GIS software by summing the models in each endemism area.

Overlap analysis of ecological niches

Once we obtained the AEs, CEs and the taxa contributing to them, we analyzed the overlap degree and the similarity of the ecological niches of the taxa. It allowed us to define and compare the environmental space that each one of the endemic taxa is occupying in the Gran Chaco, and to find common patterns to help answering questions regarding the environmental conditions delimitating areas with endemisms.

We used the Principal Component Analysis (PCA) method, according to39 by means of the ‘ecospat’ package in the R Environment40to compare the ecological niches of paired taxa contributing to the areas of endemism. We firstly created a grid with occurrence densities along the environmental gradients in each representative taxa of AEs, based on the occurrence data of each taxon, its background (performing a circular buffer area of 400 km around each occurrence) and the background of the stack including the 6 uncorrelated environmental variables in the studied area (this background was extracted for the two first axes of the PCA-environment of the 6-Bioclimatic variables stack).

The two-dimensional environmental space (using all the 6 bioclimatic variables) was then projected onto a 100 × 100 PCA grid of cells restricted by the minimum and maximum PCA values in the background. Then, we calculated the Schoener’s D index of niche overlap and run the similarity test between pairs of taxa41, from the density occurrences of each pair of taxa contributing to the respective AE. For this purpose, we applied 1000 replications to perform the results. The circle of correlation of variables was also analyzed. The categories of the niche overlap were based on42.

In order to quantify the comparison between ecological niches, we calculated the niche expansion, stability and unfilling using the ecospat.niche.dyn.index function in ecospat43. Although this approach is more frequently used in the case of biological invasions, it is feasible to adopt it for comparing native niches of different species44. Niche stability is the proportion of the niche of one species overlapping with a second one; niche unfilling is the proportion of the native species non overlapping with the invader or former species; and niche expansion is the proportion of the invader species non-overlapping with the native.

Results

Plant database and study area

We assembled data to reach 1,957 localities of the endemic and restricted taxa of Fabaceae whose taxonomic identity and coordinates were adequately checked. Our assessment recovered a checklist with 78 taxa which were endemic, highly restricted and/or exhibited Chacoan lineage but present in the Gran Chaco ecoregion (Table 1). The analyzed specimens were listed in the Suppl. Files (Suppl. Files: Appendix). The area of study comprised the Gran Chaco ecoregion and surrounding areas, which was divided for main analyses in 1° × 1° grid cells (Fig. 1a). According to our registers, Dry Chaco resulted the most undersampled subregion (Fig. 1b).

Table 1 Specific and infraspecific taxa of Fabaceae endemic or highly restricted in the Gran Chaco ecoregion and their lineage and endemic status. SDTF: seasonally dry Tropical forests. Endemic are mainly of Chacoan lineage, excepting if it is indicating another.
Fig. 1
figure 1

(A) Study area and grids. (B) Grids indicating quality data of sampling, according to the calculated quartiles.

Areas of endemism

The NDM/VNDM analysis of endemism assuming a grid cell size of 0.75° × 0.75° showed scarce AEs and consensus AEs, considering different filling size. The analysis of the AEs assuming the maximum size (1.5° × 1.5°) recovered the maximum number of AEs and consensus AEs (Table 2), but the circumscription of cells was extremely large, with cells including excessive environmental heterogeneity, and it resulted inconvenient for an adequate biogeographic interpretation.

Table 2 Consensus areas of Endemism of Fabaceae in the Gran Chaco ecoregion.

The analysis of endemism analysis assuming a grid cell size of 1° × 1° and filling 25% of cells recovered 20 areas of endemism throughout the Gran Chaco ecoregion. These areas were grouped in five consensus areas, two in Dry Chaco, two in Humid Chaco, and one in Sierra Chaco (Table 2; Fig. 2). The grid cell size considering a filling of 10 and 50% showed a little higher number of consensus AEs than 25%, but some of them were mainly overlapped and included the same taxa. The taxa contributing to perform each AEs in this grid size belonged mainly to Caesalpinioideae and Papilionoideae families, excepting for Bauhinia hagenbeckii Harms in the Upper Paraguay River Basin, which belongs to Cercidoideae (Table 3; Suppl. Files: Table 1).

Fig. 2
figure 2

Selected Areas of Endemism in the Gran Chaco Ecoregion based on a grid cell size of 1° × 1°. (A) Sierra Chaco; (B) Upper Paraguay River Basin; (C) Interior Dry Chaco; (D) Low Paraguay and Paraná River Basins. Grey thin lines delimitate the three subregions of the Gran Chaco ecoregion.

Table 3 Areas of Endemism in the Gran Chaco based on the Fabaceae diversity (grid cell size: 1° × 1°, cell fill 0.25 × 0.25).

The analysis of endemism assuming a grid cell size of 1.25° × 1.25° and filling 25% of cells recovered 35 areas of endemism throughout the Gran Chaco ecoregion. These areas were grouped in 7 consensus areas (Table 2), three in interior Dry Chaco (but two including ecotones with the Humid Chaco), three in Humid Chaco, one in Sierra Chaco and one in the ecotone of Dry and Sierra Chaco. For this reason, the only different AE identified with this configuration of cells was the ecotone beween Dry and Sierra Chaco (Suppl. Files: Fig. 1).

Centers of endemism

The centers of endemism (CEs) ranged from 5 to 10, and 7 to 30, respectively. The UPGMA and the ER showed the highest number of cells and CEs, while the WER showed the lowest values in both parameters (Table 4).

Table 4 Number of centers of Endemism of Fabaceae in the Gran Chaco ecoregion.

The UPGMA hierarchical clustering including all cells with at least one taxa showed a clustering with the following groups, which should be potential biogeographic areas: (1) Southern Paraguay Grasslands; (2) Upper Paraguay River Basin; (2) Marginal Dry Seasonally Forest; (3) Peripheral Gran Chaco; (4) Dry Chaco foothills and Dry Valleys; (5) Dry Chaco core”; (6) Humid Chaco “core”; (7) Chaco–Espinal ecotone; (8) Dry/Sierra Chaco ecotone (Fig. 3; Table 5).

Fig. 3
figure 3

Cluster Analysis (CA) and mapping of cells by means UPGMA of the Gran Chaco Ecoregion based on endemic and highly restricted Fabaceae, calculated with Jaccard similarity index. Cells lacking presences were discarded. Grey thin lines delimitate the three subregions of the Gran Chaco ecoregion. Numbers in terminals of the CA indicate the number of the grid cell.

Table 5 Characterization of the centers of endemisms in the Gran Chaco based on the Fabaceae diversity, detected by means UPGMA hierarchical analyses using Jaccard similarity index.

The UPGMA analysis including only the cells summing 12 or more taxa showed 5 clusters (Centers of Endemism) distributed in 30 cells: (1) Upper Paraguay River Basin; (2) Dry/Sierra Chaco ecotone; (3) Interior Dry Chaco; (4) Low Paraguay–Paraná River Basin; 6) Peripheral Dry Chaco (Table 5; Fig. 4).

Fig. 4
figure 4

Cluster Analysis and mapping of Centers of Endemism (CEs) by means Unweighted Pair Group Method with Arithmetic Mean (UPGMA) of the Gran Chaco ecoregion based on endemic and highly restricted Fabaceae, calculated with the Jaccard similarity index. This analysis includes only cells with nine or more occurring taxa. Grey thin lines delimitate the three subregions of the Gran Chaco ecoregion.

The endemic richness indexes showed different numbers of centers of endemism. The Endemic Richness Index showed the highest number of cells and CEs (17 and 10, respectively) and the Relative Endemic Richness showed the lowest (7 and 5, respectively). The three indexes showed cells with possible CEs in the same geographic areas, and they were mainly contained in the CEs (clusters) determined by UPGMA analyses (Fig. 5).

Fig. 5
figure 5

Centers of Endemism in the Gran Chaco ecoregion, based on different Endemicity Richness indexes in grid cell size of 1° × 1°. Left to right: (A) Endemic Richness; (B) Weighted Endemic Richness; (C) Restricted Endemic Richness. Grey thin lines delimitate the three subregions of the Gran Chaco ecoregion.

Distribution Models

For modelling purposes, we selected endemic Fabaceae with a number of occurrences adequate to generate reliable models. Additionally, in some cases, we included other endemic Fabaceae taxa from the region (Table 6). After evaluating the correlation between all the bioclimatic variables (Suppl. Files: Table 2), we selected six uncorrelated variables related to precipitation and temperature: Isothermality (Bio3), Maximum Temperature of Warmest Month (Bio5), Temperature Annual Range (Bio7), Mean Temperature of Coldest Quarter (Bio11), Annual Precipitation (Bio12), Precipitation Seasonality (Bio15) (Fig. 6). The models with the best performance were, in general terms, those with 1–2 regularization multipliers and Linear feature or Linear–Quadratic feature combination. (Table 7; Suppl. Files: Table 3).

Table 6 Selected taxa to model the distribution with MAXENT contributing to the main areas of Endemism and distinctive Center of Endemism detected in the Gran Chaco ecoregion.
Fig. 6
figure 6

Bivariate analyses (under the diagonal) and Pearson correlation indexes (over the diagonal) of the bioclimatic variables selected for the Chacoan endemic Legumes.

Table 7 Evaluation of the performance of MAXENT models for Chacoan Fabaceae endemic taxa.

We extracted the models and explanatory variables from ENMEval (Fig. 7A–E and F; Table 8; Suppl. Files: Figures 2, 3, 4 and 5). The models of species (Arachis batizocoi Krapov. & W. C. Greg., Chaetocalyx chacoensis Vanni, Lophocarpinia aculeatifolia (Burkart) Burkart, Mimosa castanoclada Barneby & Fortunato, Piptadeniopsis lomentifera Burkart, Neltuma nuda (Schinini) C. E. Hughes & G. P. Lewis, Senegalia emilioana (Fortunato & Ciald.) Seigler & Ebinger) and the species-richest area from the stack of the Interior Dry Chaco showed the main occurrence probability in western Paraguay and adjacent regions (Fig. 7A; Suppl. Files: Fig. 2). The species models (Arachis correntina (Burkart) Krapov. & W. C. Greg., Galactia longifolia Benth. ex Hoehne and Mimosa pseudopetiolaris Barneby) and the stack of the Low Paraguay and Paraná River Basins showed the highest probability of occurrence and richness in the Humid Chaco (Fig. 7B; Suppl. Files: Fig. 3D–F). The models of the Upper Paraguay River Basin (Arachis lignosa (Chodat & Hassl.) Krapov. & W.C.Greg., Bauhinia hagenbeckii and Neltuma rubriflora (Hassl.) C. E. Hughes & G. P. Lewis) and their stack concentrated the highest occurrence probability in that region, but also showed a potential region in eastern Bolivia (Fig. 7C; Suppl. Files: Fig. 3A–C); those from the Dry–Sierra Chaco ecotone (Adesmia cordobensis Burkart, Crotalaria chaco-serranensis H. G. Bach & Fortunato, Neltuma flexuosa (DC.) C. E. Hughes & G. P. Lewis, Neltuma pugionata (Burkart) C.E. Hughes and G.P. Lewis and Senna subulata (Griseb.) H. S. Irwin & Barneby) were more probable to occur in the western extreme area of the Gran Chaco ecoregion (Fig. 7D; Suppl. Files: Fig. 4A–D). Finally, the species from the Sierra Chaco (Dalea elegans Gillies ex Hook., Galactia glaucophylla Harms) showed their highest probability of occurrence in the southern extreme of this subregion (Fig. 7E; Suppl. File: Fig. 4E–F).

Fig. 7
figure 7

Maps of species richness based on stack of MAXENT distribution models of the taxa contributing to the Areas of Endemism in the Gran Chaco ecoregion. (A) Interior Dry Chaco; (B) Low Paraguay–Paraná River Basin; (C) Upper Paraguay River Basin; (D) Dry/Sierra Chaco ecotone; (E) Sierra Chaco; (F) regionalization of the Gran Chaco ecoregion (Brown: Sierra Chaco; yellow: Dry Chaco; green: Humid Chaco).

Table 8 Main explanatory bioclimatic variables of models (in percentage) for the representative species in endemism areas from the Gran Chaco ecoregion.

Different environmental variables explained the distribution models (Table 8). In general terms, the variables related to the temperatures in summer (Maximum Temperature of Warmest Month (BIO5) explained high percentages of the models of three of the species representative of Interior Dry Chaco. Distribution of other species of this Center of Endemism were explained by the Annual Precipitation (BIO12) and the Temperature of the Coldest Quarter (BIO11). In the case of the representative species of the Dry/Sierra Chaco ecotone, Isothermality (BIO3) was a variable explaining a significant percentage of the models, together with Annual Precipitation (BIO12) and Precipitation Seasonality (BIO15). The models of the representative species of the regions of Low Paraguay/Paraná centers of endemism were explained by Isothermality and Maximum Temperature of the Warmest Month. The models of the species contributing for Upper Paraguay Basin (Arachis lignosa, Bauhinia hagenbeckii, Neltuma rubrifora) were explained by the Mean Temperature of Coldest Quarter and Precipitation Seasonality. Finally, the models of the Sierra Chaco, such as Dalea elegans and Galactia glaucophylla,representative species were mainly explained by the Temperature in the Warmest Month and Annual Precipitation.

Overlap analysis of ecological niches

The analysis of the ecological niche overlap (Table 9) showed that the Schoener’s D index was mainly moderate to high in the pairs of taxa contributing to the Interior Dry Chaco AE, which ranged mainly from 0.25 to 0.63, excepting for Arachis batizocoi., which exhibited no overlapping with the remainder taxa. Similarly, the overlap niche between taxa of the Dry–Sierra Chaco ecotone ranged from 0.19 to 0.42. In the remainder Areas of Endemism, the niche overlap was variable, ranging from no overlapping in highly restricted taxa (for example, Mimosa pseudopetiolaris in Low Paraguay–Paraná River Basins) to high niche overlap between Neltuma rubriflora and Arachis lignosa (Upper Paraguay Basin) and moderate in the case of Arachis correntina and Galactia longifolia (Low Paraguay and Paraná Basins). In Sierra Chaco, the niche overlap was variable, ranging from 0.09 to 0.53.

Table 9 Schoener’s index (D) of niche overlap and equivalency test between the main taxa contributing to the detected areas of Endemism (bold letters represent statistically significant values p = 0.05).

The equivalency index was significant in some pairs of Interior Dry Chaco, with moderate to high overlap in the case of Lophocarpinia aculeatifolia, Senegalia emilioana and Mimosa castanoclada, and almost null overlap in the case of pairs involving Arachis batizocoi. In Dry/Sierra Chaco ecotone, the equivalency indexes of pairs Adesmia cordobensis were significant and moderate (Table 9).

The kernel density of the taxa occurrences within each Area of Endemism on the PCA-environmental exhibited similar distribution pattern. They differed between the areas (Figs. 8, 9, 10, 11 and 12). The niches of Interior Dry Chaco were ample, with a relatively high overlap between its taxa (Fig. 8a, b) and the kernel density was concentrated in the positive area of the two first axes in the biplot. The niches of Lower Paraguay and Paraná River Basins were distinctly more concentrated in the positive area of the axis 1 and the negative area of the axis 2; they showed a high overlap between Arachis correntina and Galactia longifolia, but almost no overlap with Mimosa pseudopetiolaris (Fig. 9; Table 8). The niches of the taxa in Sierra Chaco were reduced and concentrated in the negative area of axes 1 and 2, and they showed moderate to high overlap and stability in the majority of taxa pairs (Fig. 10; Table 8). The niches were more restricted in the Upper Paraguay River Basin, in which the densities were more concentrated in the positive area of the Axis 1 and near the null value of the Axes 2; Arachis lignosa and Neltuma rubriflora showed a high overlap and stability (Fig. 11; Table 8). The niches from Arid Sierra Chaco ecotone were ample and concentrated in the negative area of both axes, similarly to the representative taxa of Sierra Chaco, but their niches seemed more restricted (Fig. 12; Table 8).

Fig. 8
figure 8

Kernel density occurrences based on grid in the PCA-environmental space (axes 1 and 2) on the representative taxa of Fabaceae in the Gran Chaco ecoregion from Interior Dry Chaco.

Fig. 9
figure 9

Kernel density occurrences based on grid in the PCA-environmental space (axes 1 and 2) on the representative taxa of Fabaceae in the Gran Chaco ecoregion from Low Paraguay and Paraná River Basins.

Fig. 10
figure 10

Kernel density occurrences based on grid in the PCA-environmental space (axes 1 and 2) on the representative taxa of Fabaceae in the Gran Chaco ecoregion from Sierra Chaco.

Fig. 11
figure 11

Kernel density occurrences based on grid in the PCA-environmental space (axes 1 and 2) on the representative taxa of Fabaceae in the Gran Chaco ecoregion from Upper Paraguay River Basin.

Fig. 12
figure 12

Kernel density occurrences based on grid in the PCA-environmental space (axes 1 and 2) on the representative taxa of Fabaceae in the Gran Chaco ecoregion from Dry–Sierra Chaco ecotone.

The quantification of the niche expansion, stability and unfilling showed high stability between pairs of taxa of the endemism areas, and they were coincident with the D niche overlap (Table 10; Suppl. Files: Table 4). The second quartile was ca. 0.5 in almost all of the endemism areas, excepting for Interior Dry Chaco (0.18) and Low Paraguay Basins. The latter also showed a lowest value of third quartile (0.51), comparing with the other areas, where this quartile was higher than 0.8. The presence of extreme values of 0.00 and 1.00 for any of the three indicators was concordant with taxa exhibiting a restricted distribution. The comparison between pairs of taxa from different areas showed that, in general terms, resulted in null or very low values of niche stability (Suppl. Files: Table 4).

Table 10 Niche expansion, stability and unfilling in pairs of endemic and restricted taxa of the Gran Chaco ecoregion.

Discussion

Areas of endemism

We found different Chacoan Areas and Centers of Endemism. The consensus areas of endemism can be attributed to different factors: presence of microhabitats generated by orography (Sierra Chaco), boundaries between ecoregions (Dry/Sierra Chaco ecotone) or harsh extreme environmental conditions (in the case of Interior Dry Chaco). In contrast to our current detection of several Areas of Endemism, other previous works detected only a single significant area of endemism in Argentina comprising the Gran Chaco plains28.

Although our study is limited to one family of plants, we also could demonstrate in recent works that Fabaceae is a good indicator to define ecoregions in tropical and subtropical forest areas in the world45,46,47). In fact, we detected areas largely known as potential Chacoan subregions, such as Arid Chaco48. It suggests that the study of the endemism distribution at fine scale can adequately identify areas with common patterns of historical biogeography, and, possibly also, potential centers of speciation for the biota.

Some of the Endemism Areas in Gran Chaco showed partial overlapping and floristic differences, varying its species, especially in the westernmost and easternmost extremes of the ecoregion. It is interesting to point out that the overlapping patterns suggested that different environments are in the same cells29. This fact suggests that NDM/VNDM outperforms the hierarchical methods, which are commonly used in biogeography. At regional scale, the 1° × 1° grid cell size shows to be adequate to detect AEs, but larger cells (1.25° × 1.25°) still recovered new Areas of Endemism. We observed that larger cell sizes (more than 1.25° × 1.25°) might be not adequate for Areas of Endemism analyses, given the complexity of the Gran Chaco in the western region, where it is possible to find mosaic of environments, ecotones and pronounced gradients of rainfall and temperatures. On the other hand, we found that grid cells smaller than 1° × 1° recovered very restricted Areas of Endemism, possibly because of the presence of several undersampling sectors.

Centers of endemism

We identified centers of endemism based on different endemic richness indexes and the Jaccard similarity distances. However, UPGMA based on cells including at least 12 taxa were the most informative. It allowed us to discover some nexus between the richest cells, the Areas of Endemism, and potential biogeographic divisions. In addition, UPGMA allowed us to detect the largest number of Centers of Endemism and rich-endemic cells among all methods here used. Although the Centers of Endemism are not directly related to biogeographical units and its detection is more focused to implement conservation policies49, recovering cells with CEs by means of UPGMA could give more biogeographic information than indexes by means of identifying areas with common endemism.

Previous authors14 pointed out that the identification of Centers of Endemism, when these analyses are carried out with a larger analysis constrained by biogeographic region, would ensure that these centers are their “core areas”. It would lessen the possibility of conflicts between both types of biogeographic units. The cluster analysis by means of UPGMA gave us adequate information to infer that Centers of Endemism are related to some biogeographic units; although some of them are relatively known in the past (for example, the Sierra Chaco or Dry Chaco10), other can be only visualized by our analyses; they would represent a starting point for new proposals of regionalization. It is a challenging topic to develop in further studies integrating more key families of the native Chacoan flora.

In the Gran Chaco ecoregion, we detected some cells configuring small Centers of Endemism. They were from Paraguayan Dry Chaco, the Brazilian–Paraguayan border of Humid Chaco, and the southern portion of Sierra Chaco. These areas would be “hotspots” of biodiversity. At present, neither the Gran Chaco nor larger parts of it are recognized as global “hotspot” areas50. However, there are some examples of recently recognized hotspots, such as the North American Coastal Plains51 where some assumptions such as the climate and environmental homogeneity and the erroneous climax vegetation were revised. The Gran Chaco could be also subject to similar fallacies, but more probably, undersampling and lack of public registers of biodiversity could be influencing that classification. Some potential areas of Argentine Chaco appear also as Centers of Endemism in our analyses and could be candidates for hotspots, such as the “Impenetrable” area52, near the Paraguayan border, as well as the Dry/Sierra Chaco ecotones. These regions clearly require intensive efforts to document their biota.

Biogeographic issues

The cluster analysis revealed biogeographic patterns that are similar to the Analysis of Endemism. The analysis of all grid cells showed areas related to particular ecosystems described in the literature. The Peripheral Dry Chaco mostly coincided with the sandy area in northern Paraguay and adjacent Bolivia, including a mosaic of Chacoan and SDTF plant species. The Marginal Seasonally Dry Forest area was characterized by the dominance of several species that we previously identified as Seasonally Dry Tropical Forests and Cerrado elements, although they also were endemic or highly restricted to the region10,53). Coincidently, it has been observed that the formations of northern Paraguayan Chaco represent transitions from the Gran Chaco to the Chiquitania and Pantanal formations54. The Dry Chaco foothills and Dry Valleys are also dominated by a blend of SDTF and Chacoan elements. The Chaco-Espinal ecotone exhibited species of Chacoan lineage that were found in both ecoregions, such as Neltuma affinis (Spreng.) C.E. Hughes & G.P. Lewis. The Dry/Sierra Chaco Ecotone comprised an ample region with a mix of species exhibiting Chacoan lineages but mainly exclusive of this sector of the ecoregion.

Regarding the detection of endemism areas, previous studies have revealed that in the Gran Chaco plains of Argentina a large Area of Endemism it has been detected; it comprised the Dry and Humid Chaco, and certain parts of the Sierra Chaco28. This result was based on a large and extensive database of animals and plants. In our particular case, exclusively using the endemic and restricted Fabaceae we could retrieve different Endemism Areas; some of them encompassed areas within each subregion of Gran Chaco while others also encompassed ecotones between them.

For instance, the Area of Endemism that we called “Sierra Chaco” comprised a set of cells almost entirely included in the Sierra Chaco subregion10 in its southernmost portion. The taxa contributing to this Area of Endemism are highly restricted, with reduced ecological niches but a relatively high degree of overlapping between them. The variables explaining better the distribution of their endemisms are related to summer temperatures, which are lower than the surrounding areas, and the annual precipitations, which are significantly higher than the adjacent areas. These are direct effects of altitude. The ecological distinction can be explained by the presence of intricate environmental heterogeneity leading to high biological diversity in Andes and surrounding areas, which is partly explaining by tectonics and by its perpendicularity to the atmospheric circulation patterns55. In mountainous environments, such as Sierra Chaco, speciation processes (natural selection, genetic drift, related to isolation of populations) could explain the notable diversity related to surrounding areas56.

In our study, even the most restrictive techniques recovered some Centers of Endemism in the Sierra Chaco area. Interestingly, variability in altitude and topography and the consequent climatic and edaphic variations in small areas of this region determined the existence of the recently recognized “Comechingones” phytogeographic province57. Since it consisted of cold temperate grasslands having a different lineage than the Chacoan ones58,59), its biota has been not considered for the present study, unless the common species with it. In this work, we assessed the flora reaching up to 1,700 m a.s.l., in concordance with previous criteria of upper limit of Sierra Chaco10,60).

In Dry Chaco, we retrieved several Areas of Endemism from the analyses based on the 1°×1° and, in one case, on the 1.25°×1.25° cells. The consensus area for all these areas in the interior Dry Chaco is the central western Paraguay plain, but extending in adjacent Bolivia and Argentina. This area was clearly defined here by several xerophytic, endemic Fabaceae, some of them belonging to monotypic genera, such as Mimozyganthus Burkart and Lophocarpinia Burkart, which were largely known as endemisms of the Gran Chaco61,62). The variables explaining the potential distribution of the endemic taxa are related to the summer temperatures and, secondarily, to the summer rainfall. Consequently, these taxa seem to be well-adapted to extreme conditions of high temperature and low precipitation of Interior Dry Chaco. Some of their ecological niches are convergent and the niche overlap was moderate to high, configuring a large area of potential speciation for Legumes in the Gran Chaco ecoregion.

In the Humid Chaco (eastern portion of the Gran Chaco ecoregion), the analysis of endemism areas detected at least two main consensus areas: Upper Paraguay River Basin, and Low Paraguay–Paraná River Basins. The first one comprised the Brazilian Chaco and the adjacent Paraguayan Chaco, and consisted of a mix of Seasonally Dry Tropical Forests, Chacoan formations and remnants of Cerrado10. The existence of ecotones between these ecoregions, as well as the presence of specialized habitats (such as the calcareous outcrops on the left margin of the Paraguay River) could explain the concentration of Fabaceae endemisms.

In this area, our analyses show that the potential distribution of contributing species inferred by the predictive models is mostly configured by seasonal rainfall variability and the winter temperature. According to these variables, this area represents a thermal and rainfall transition between the ecoregions of Atlantic Paraná Rain Forests and Humid Chaco12. This subtropical strip is transitional between the first one, with precipitations mainly in all seasons, and the second one, with precipitations more concentrated in summer. The main taxa contributing to this Area of Endemism, such as Arachis lignosa, Bauhinia hagenbeckii and Neltuma rubriflora have been reported to grow in mosaics of SDTF and Chacoan formations in recent works including field data63,64,65;66,67). Especially A. lignosa and N. rubriflora appear to share similar and convergent ecological niches, becoming good indicators for the presence of this endemism area. In our study, 79 specific taxa of Fabaceae are endemic or highly restricted in the Gran Chaco ecoregion, and, notably, 16% of them, in Brazil, are restricted to the Chacoan formations strongly threatened by increasing deforestation.

The Low Paraguay–Paraná River Basins Area of Endemism is also constituted by a mosaic of SDTF and Chacoan formations. It is another transition between the Paranaense and Humid Chaco ecoregions, with thermal transitions; in fact, summer temperatures are milder than other areas of the Gran Chaco ecoregion. It could explain the presence of many grasslands species, such as Arachis correntina or Galactia longifolia. Additionally, it includes some topographic accidents in southern Paraguay53,63). This diversity of environments might explain the occurrence of endemic or rare taxa. Particularly, the outcrops of Tobatí, as well as other minor hills, have restricted relicts of cerrado and grasslands with unique or almost so species, belonging to the genus Mimosa63,68) or Aeschynomene L69. These outcrops preserve highly restricted species, such as Mimosa pseudopetiolaris Barneby68. Other representative taxa are growing outside of these outcrops, such as some species of Desmodium Desv70,71.

We detected another Area of Endemism in the transition between Dry and Sierra Chaco, but only considering an analysis with large cells (1.25° × 1.25°). This Area was identified by us as “Dry/Sierra Chaco ecotone”. It was found as a different Center of Endemism (CEs) in the hierarchical analyses. This area comprised mosaics of extremely dry plains72, salt lakes, and dunes, mixed with hills, where the climate is cooler and humid. The taxa that contributed to this AE are typical of these different environments and intermediate areas73, and their ecological niches are delimited by temperature and precipitation variables, such as minimum extreme temperature in winter. This AE has a correspondence with some cells representing CEs throughout the area.

This ecotone is partially located in the previously recognized unit of vegetation called “Chaco Árido”48,72). , which mainly involves a transition between forests of Aspidosperma quebracho–blanco Schltdl. and steppes13. Some of the endemic taxa contributing to the Area of Endemism are Neltuma flexuosa and N. pugionata, accompanied by other endemic Fabaceae, such Mimozyganthus carinatus (Griseb.) Burkart. This unit of vegetation seems to be situated in a mosaic with formations of Sierra Chaco, where other taxa, such as Adesmia cordobensis are also contributors to performing the CEs. The presence of floristic elements of Fabaceae in this area, that are common to Sierra Chaco and Arid and Dry Chaco, was previously found by other authors60). In addition, it is interesting to point out that this Area of Endemism includes the presence of ecotones between arid and humid ecoregions, such as Monte, Yungas, Puna and Gran Chaco. In the adjacent western area, previous authors have found several Areas of Endemism analyzing the flora of southern Central Andes, which was explained by the diversity in altitude and rainfalls27.

Conservation issues

Since the Gran Chaco ecoregion is suffering one of the highest deforestation rates in the world from the 2000’decade74, all information contributing to elucidate priority areas of conservation is urgently demanded. Recently, some works have established priority areas based in the presence of endemic vertebrates75. These areas coincided with our detected sectors in the interior Dry Chaco AE/CEs, the Low Paraguay–Paraná River Basins AE/CE, and the Sierra Chaco and Dry/Sierra Chaco Ecotone Areas of Endemism. Interestingly76, some authors have observed that endemic richness of plants and vertebrates are generally correlated. Therefore, further integration of information from endemic taxa could have significant implications for delimitation of new protected areas and extensions of those already established or, even better, the consolidation of buffer areas surrounding reserves and parks.

It is especially interesting that the Interior Dry Chaco Area of Endemism represented the most visible area of endemism and included centers rich in endemics. It coincided with the areas of the Gran Chaco that has suffered the highest rates of forest loss from 200077,78).

Our studies about ecological niches of indicator taxa allowed us to infer that delimitation of areas and centers of endemism in which their taxa exhibited possible ecological niche divergence requires more detailed studies to conservation policies. It is because of their biota seems to be ecologically complex; it might be related to the presence of microhabitats, ecotones or mosaics of habitat. The most visible case of this work is the Sierra Chaco area/center of endemism, whose complex topography might be causing specialized habitats58 via ecological isolation, climatic and edaphic variations and the ingression of Andean elements of the flora10).

The northwestern area of the Gran Chaco requires an intensive sampling and complementary studies of the biota to improve the regionalization proposals, because of the complexity generated not only by the topography, but also by the presence of mosaics of different vegetal formations79. They are related to slope orientation and altitude triggering great differences in annual precipitation and temperatures. In our analyses, the northern portions of Sierra Chaco were detected only at large-scale analyses in both, AE and CE methodologies. For conservation purposes, an exhaustive sampling in this area is imperative, in order to generate adequate plans to preserve the biological diversity.

There are currently few areas with Chaco formations in conservation units; this increases the risk of extinction of endemic species. Indeed, the natural resources of the Gran Chaco areas along its extension are used by humans for different purposes, such as fishing, crafts, religious cults, construction, medicinal uses which strengthens their relationship with the Chaco formations. Public policies are urgently needed in areas with CEs, especially in some cells of the Chaco such as the Paraguayan Dry Chaco, the Brazil-Paraguay border of the Humid Chaco and the southern portion of the Chaco mountains.