Introduction

Intraspecific genetic diversity is a dynamic outcome of long-term evolutionary and demographic processes, often linked to climatic fluctuations1. Consequently, genetic diversity levels are not homogeneous across the geographic distribution of species, often displaying pockets of high diversity surrounded by landscapes of lower and more homogeneous diversity2. One hypothesis seeking to explain the observed genetic patterns is the niche centroid hypothesis, proposing that populations living within their optimal niche conditions tend to exhibit higher abundance and fitness-related attributes, consequently accumulating higher genetic diversity over time3,4,5,6. Under this hypothesis, the niche is defined as a multidimensional hypervolume, whose dimensions are the fundamental variables essential for species’ long-term persistence7. The niche has an internal structure such that its centroid represents theoretically optimal conditions for species survival, progressively decreasing to suboptimal at its periphery, and reaching zero probability of survival outside it7. Under this paradigm, the relative distance from the niche centroid can be linked to fitness-related attributes and genetic diversity levels, with recent studies reporting higher genetic diversity closer to the niche centroid (central populations) compared to less suitable conditions (peripheral populations)4,6,8. However, considering that conditions and species ranges are not stable over time, populations presently regarded as central might have been peripheral in the past, and vice versa. In this context, populations may hold higher genetic diversity today due to their historical exposure to long-term optimal conditions (i.e., climatic refugia), contrary to where bottlenecks and founder effects linked to range expansions might have occurred9,10. Such imprints reflecting past distribution changes are particularly evident in the genetic diversity patterns of low-dispersive cold and temperate species11,12. 
Integrating the effects of both historical and contemporary climatic influences into a unified theoretical model could provide an unprecedented understanding of how genetic diversity is distributed across species and regions13,14.

Despite the fundamental role of intraspecific genetic diversity in driving evolutionary potential and climate adaptability15,16, global estimates remain limited17,18, especially for marine species. Existing studies predominantly focus on fish19, while other ecologically critical groups, such as brown macroalgae, remain largely overlooked. These low-dispersive ecosystem-structuring species can form productive marine forests, providing essential habitats and invaluable ecological services to numerous associated species20,21,22. Recognizing their ecological importance, international conventions have long prioritized marine forests as conservation targets (e.g., Barcelona Convention 1975, Convention on Biological Diversity 2010, OSPAR Convention 2021), with a growing emphasis on the conservation of their genetic diversity23. Studies on the genetic diversity of marine forest species remain fragmented, often limited to case-by-case approaches (e.g.,11,24,25) or lacking a biogeographic perspective26, leaving significant gaps in our understanding of their evolutionary and ecological dynamics.

Here, we address this gap by developing a unified model based on the niche centroid hypothesis to explore how marine forest genetic diversity relates to environmental conditions, considering both contemporary conditions and those of the Last Glacial Maximum (LGM, ~20,000 years ago), a period of extreme climatic cooling that drove extensive distribution shifts, leaving strong imprints on the genetic diversity of cold and temperate marine forest species11,12,27,28. We hypothesize that genetic diversity peaks at the niche centroid, where conditions are optimal, and declines towards the periphery (Fig. 1). To test this hypothesis, we used genetic diversity data for 29 brown macroalgae species and a set of six environmental variables under both time periods to identify the drivers of genetic diversity and predict its patterns. We then applied the developed model to the available distribution range maps of 280 marine forest species29 and predicted their global biogeographical patterns of genetic diversity. By mapping hotspots of high genetic diversity and revealing long-term climate drivers, we offer insights into the patterns and processes mediating marine forest genetic diversity across all oceans.

Fig. 1: Conceptual illustration of the methodology.

A Inferring the environmental niche of a hypothetical species and its centroid from its contemporary distribution. The environmental distance of each sampled population from the niche centroid is calculated. B Species distribution under Last Glacial Maximum (LGM) and contemporary conditions, highlighting range shifts over time. We hypothesize that genetic diversity decreases with increasing environmental distance from the niche centroid. The environmental distances of each sampled population from the niche centroid are estimated under both LGM and contemporary conditions.

Results

Genetic diversity data collection

The literature review initially identified 1956 potential studies, of which 64 reported genetic diversity estimates for brown macroalgae. Removing duplicate species and regions resulted in a final dataset comprising 40 studies, which collectively covered 29 brown macroalgae species and a total of 669 populations collected at 583 sites (Fig. 2; Supplementary Data 1; Supplementary Data 2). Most species had either cold (6 species) or temperate (22 species) thermal affinities, with genetic data available for only one tropical species (Fig. 2; Supplementary Data 3). The most frequently reported diversity measures were allelic richness (28 studies, 25 species, 463 populations) and expected heterozygosity (36 studies, 27 species, 534 populations), measured using microsatellite (32 studies), mitochondrial DNA (mtDNA; 2 studies) and single nucleotide polymorphism (SNP; 7 studies) genetic markers (Supplementary Data 2).

Fig. 2: Genetic diversity dataset.

Map depicting the locations of marine forest sampled populations. The species, number of sampled populations, genetic index, genetic marker, number of studies and temperature affiliation inferred from existing species’ distribution range maps are shown. Species were characterized as cold when the 95th percentile of temperature along their range was below 15 °C, temperate when between 15 and 25 °C and tropical above 25 °C. Abbreviations stand for allelic richness (Ar), expected heterozygosity (He), microsatellites (Mic), mitochondrial DNA (mtDNA) and single nucleotide polymorphisms (SNP).

Genetic diversity modelling

A unified machine learning model grounded in niche centroid theory, which hypothesizes reduced genetic diversity away from the niche centroid (Fig. 1), was developed to predict genetic diversity measured as allelic richness (Ar) and expected heterozygosity (He). The models performed well both in cross-validation with independent test data and in final predictions (Supplementary Table 1). The final model explained 78% of the deviance with a correlation coefficient of 0.883 for allelic richness and 77.5% of the deviance with a correlation coefficient of 0.880 for expected heterozygosity (Fig. 3). Overall, models underpredicted genetic diversity by an average of 8.2% for allelic richness and 22.4% for expected heterozygosity when compared to observed data (Fig. 3). Including the distances from the niche centroid under LGM and contemporary conditions improved on the null model (only study, species and genetic marker as predictors), explaining an additional 27.2% of genetic variability for allelic richness and 22% for expected heterozygosity (Supplementary Table 1). Distance to the niche centroid during the LGM contributed more than twice as much as distance under contemporary conditions (Supplementary Table 2), suggesting that past climate conditions have a stronger impact on observed levels of genetic diversity than contemporary conditions.

Fig. 3: Genetic diversity model fit.

Observed and predicted genetic diversity model fit for A allelic richness and B expected heterozygosity. The deviance explained and correlation coefficient are presented for each model. Red dashed lines represent a 1:1 model fit.

Predicting global marine forest genetic diversity

We transferred the developed model to the distribution range maps of 280 cold and temperate marine forest species (Supplementary Data 3) and mapped the potential genetic diversity patterns per species (See Data Availability Statement). Stacking and averaging the individual predictions of genetic diversity provided a global overview of diversity hotspots, with patterns of allelic richness and expected heterozygosity largely overlapping (correlation coefficient 0.953, p-value < 0.001). Hotspots of high genetic diversity for both metrics were found in regions of the north-eastern Pacific, north-eastern Atlantic and the south coast of Australia (Fig. 4). High genetic diversity was also found in regions along the coasts of the Mediterranean Sea, New Zealand, and Japan (Fig. 4). Regions of low genetic diversity were found worldwide, with the largest areas located in the high latitudes of the Arctic, such as along the coasts of Canada and Russia (Fig. 4).

Fig. 4: Biogeographic patterns of marine forest genetic diversity.

Genetic diversity was estimated as the average A allelic richness and B expected heterozygosity per cell (0.08° resolution) for 280 cold and temperate marine forest species of brown macroalgae.

Lastly, genetic diversity had a positive and significant relationship with species richness (correlation coefficient 0.797, p-value < 0.001 for Ar; correlation coefficient 0.945, p-value < 0.001 for He; Fig. 5). Regions with the highest genetic diversity overlapped with those of the highest species richness worldwide, and vice versa (Fig. 5). Notably, there were no regions with high species richness and low genetic diversity (Fig. 5).

Fig. 5: Congruence between marine forest genetic and species diversity.

A Global estimates of species richness for 280 marine forest species. B Colour scale for the classification of cells based on genetic and species diversity quantile combinations. Global maps depicting quantile combinations of genetic and species diversity and their regression for C, D allelic richness and E, F expected heterozygosity, respectively.

Discussion

By coupling empirical genetic data with a machine learning model, we show that the genetic diversity of brown macroalgae marine forests aligns with the niche centroid hypothesis4 and reveal that present diversity is shaped by past niches. Past climatic conditions (during the Last Glacial Maximum) shaped the present biogeography of genetic diversity of marine forests more than contemporary climate, with regions of past climatic optimum holding higher genetic diversity for multiple species. This global analysis supports case-by-case studies focusing on long-term persistence11,24,30,31,32 and evolutionary theory1,10. The development of a unified predictive model allowed the reconstruction of genetic patterns for multiple species, mapping for the first time the global hotspots of temperate and cold-water marine forest genetic diversity. The predicted biogeographical patterns for allelic richness and expected heterozygosity were consistent and matched the biogeography of species richness33, highlighting how persistence over geological time scales can be a common driver of diversity for genes and species10,34.

The genetic diversity models performed well when tested against independent datasets through cross-validation, with a final deviance explained of 0.78 and a significant Pearson’s correlation coefficient of 0.883 for allelic richness, and of 0.775 and 0.880, respectively, for expected heterozygosity. Despite the high model performance and linear trend, predictions of genetic diversity were underestimated by 8.2% for allelic richness and 22.4% for expected heterozygosity when compared to observed data. Study and species-specific aspects (null model) were crucial components of genetic diversity, explaining up to 56% of genetic variance. Indeed, the sampling design, the number of sampled populations, the geographic extent of the study, unique species traits, the selection of genetic markers and their respective resolution, as well as laboratory conditions can influence the estimates of genetic diversity35,36,37,38. Yet, the inclusion of the niche centroid distances explained an additional 27.2% of genetic variability for allelic richness and 22% for expected heterozygosity. Specifically, LGM climate conditions emerge as a key driver shaping the genetic diversity of marine forest species, with contemporary dynamics playing a secondary role and higher diversity predicted in regions of climatic optimum. These regions represent climatic refugia of long-term persistence, as conditions were optimal in the past and populations continue to persist there today10. High genetic diversity in long-term climatic refugia has been previously hypothesized for individual species11,24,30,31,32, but here we provide a global unified hypothesis that applies to multiple cold and temperate marine forest species and regions. Further, by disentangling the role of LGM and contemporary climate conditions, we build upon previous global studies that used simplistic latitudinal models to predict diversity for different ecological groups17,19,26. 
Such simplistic models are unable to unravel the potential underlying drivers of genetic diversity that may be operating at the latitudinal level39. Our findings of primarily LGM conditions shaping diversity are in line with global studies for terrestrial mammals18 but contrast those on terrestrial plant and animal diversity, which found contemporary conditions being the main driver of diversity8. The contrasting results among different ecological groups and between terrestrial and marine environments could potentially be attributed to variations in ecological traits (e.g., dispersal), life history characteristics (e.g., reproduction) and distinct biogeographic histories8, as well as different levels of exposure to anthropogenic pressures that may have influenced genetic diversity17. These differences further emphasize the importance of accounting for both historical and contemporary climatic influences when estimating genetic diversity13,14. Nevertheless, species have complex biogeographic and evolutionary histories and additional drivers may further contribute to the genetic diversity patterns observed today40. For instance, although not considered in the models, oceanographic connectivity can influence the genetic diversity patterns of marine forest species globally41.

The prediction of genetic diversity patterns for 280 cold and temperate marine forest species permitted mapping the global genetic diversity hotspots. Overall, model estimates were consistent, and biogeographical predictions of allelic richness matched those of expected heterozygosity. High-diversity hotspots were mainly predicted at mid-latitudes of the north-eastern Pacific, north-eastern Atlantic and southern Australia, in line with previous studies2,42,43,44 and corroborating expectations of LGM climate refugia regions for marine forest species9,10. Indeed, for species whose distributions extended from mid to high latitudes, the highest genetic diversity was commonly predicted at lower latitudes (see individual maps in Supplementary Information). Yet, regional hotspots of genetic diversity were also predicted in high-latitude regions, such as the coasts of Norway, Iceland and the Aleutian Islands. Note that species with exclusively high-latitude or polar distributions (e.g., Laminaria solidungula) are expected to have high diversity at polar latitudes. Further, recent studies have challenged the notion of generalized low diversity in high-latitude populations due to bottlenecks in post-LGM recolonizations. Instead, insights based on genetic diversity data and hindcasts of LGM ice coverage indicate the potential existence of regional high-latitude periglacial refugia that could have supported the long-term persistence of marine forest species42,44. Such refugia would have contributed to increased genetic diversity, as predicted by our models and confirmed for multiple Arctic species42,44. Some marine forest taxa also display unexpectedly high genetic diversity in polar regions due to admixture during post-glacial expansions from multiple genetically distinct sources45,46. In contrast, at lower tropical latitudes, LGM effects are hypothesized to have been less impactful on the distribution of species. 
However, the lack of intraspecific genetic diversity data for tropical marine forest species (literature review; Supplementary Data 1) precludes any robust conclusions on potential genetic diversity imprints. Several studies have highlighted a significant underestimation of species diversity of dominant tropical brown seaweeds47,48, but it is unclear whether species richness translates to genetic diversity in tropical latitudes. At least for marine fishes, patterns are not that clear, with mitochondrial genetic diversity being highest near the equator, while nuclear genetic diversity did not follow a geographic pattern19,49.

For cold and temperate species, the predictions of genetic diversity largely overlapped the previously inferred species richness patterns33, suggesting common underlying drivers shaping both genetic and species diversity. Long-term persistence over geological times in combination with high speciation rates could have contributed to the congruence of genetic and species diversity10,34. Such spatial covariation between genetic and species diversity has been previously demonstrated for many different ecological groups26, suggesting that eco-evolutionary processes can act simultaneously on genetic and species diversity. The stronger correlation of species richness with expected heterozygosity than with allelic richness might reflect the latter being more affected by recent bottlenecks. Specifically for marine forests, the highest genetic diversity was consistently predicted in regions of high species richness. The opposite pattern of high species richness and low genetic diversity was not predicted.

While our study provides key insights into the global genetic diversity patterns of marine forest species, it is subject to several limitations that should be acknowledged. These limitations arise from both data-related and conceptual aspects. First, the genetic data used to develop the predictive models were collected from existing literature, carrying potential spatial and species biases26. For example, the oversampling of certain regions (e.g., northeastern Atlantic compared to South America; Fig. 2) may have influenced diversity predictions37,38. Additionally, genetic diversity data were also unbalanced between species with temperate (22 species) and cold distributions (only 6 species), highlighting the need for more comprehensive sampling, both geographically and taxonomically. Another limitation may lie in the resolution of the genetic markers used, with future uses of more advanced genomic techniques potentially providing a more comprehensive understanding of genetic diversity and the evolutionary processes shaping it. Further, genetic diversity estimates are not directly comparable among studies, due to different study designs, methodological approaches, and research scopes50, impacting predictions, as indicated by the substantial part of genetic variability explained by study-related aspects. Specifically, our model is unable to predict the effects of such study-related aspects, with diversity estimates inferred exclusively from species-specific niche centroid distances. Additional uncertainties may arise from the quality of environmental data for LGM conditions (e.g., nutrient availability estimates). However, temperature, the primary environmental driver defining marine forest niches33, has shown a good agreement for LGM estimates when tested against independent data51. Lastly, the environmental niche of each species was captured from available distribution range maps that have been produced with species distribution modelling. 
It is important to note that these maps may not fully represent the entire species niche, given the inherent limitations of this approach29,33.

The global relationships found here are especially remarkable considering all the uncertainties mentioned above. Despite those shortfalls, our study establishes new baseline information on the global genetic diversity patterns of marine forest species, which is provided openly and in high resolution (0.08°). These insights can be used in multiple applications for sustainability52 and can be particularly beneficial for countries that may lack the resources, capacity or funding required to conduct extensive genetic analyses. Specifically, the identification of genetic diversity hotspots can guide spatial conservation and prioritization, in alignment with the Post-2020 Global Framework for Biodiversity53. Conservation of rich gene pools may contribute to the preservation of genes responsible for species adaptation and evolution9, potentially increasing populations’ resilience under climate change15,16. In this scope, our results can guide stakeholders and managers in the implementation of marine protected areas (MPAs) to mitigate anthropogenic threats (e.g., degradation, overexploitation) on populations with high genetic diversity54,55. These MPAs could be integrated into networks that maximize connectivity, thus promoting gene flow and further enhancing population resilience56,57, a mandate often overlooked in marine spatial planning but integral to the Post-2020 Global Framework for Biodiversity53,58. The geographical variation in genetic patterns revealed disproportionate roles of individual countries in the preservation of rich genetic pools. Notably, countries such as the United States, Canada, United Kingdom, Ireland, Norway and Australia emerged as key custodians of major genetic diversity hotspots. As such, they bear an increased responsibility for conserving the rich genetic pools of multiple marine forest species. 
Moreover, our baseline estimates of genetic diversity patterns can be coupled with range shift projections to anticipate future climate change impacts on the genetic diversity pools of marine forest species42. Lastly, our results can also be used to inform restoration actions, by promoting the restoration of genetically rich populations while maintaining their genetic traits through the selection of both non-divergent and similarly diverse populations as donors, thereby increasing the likelihood of restoration success59,60. Yet, our results also show the challenge in recovering high diversity by restoration: once ancient persistent populations that evolved in past climate optima are lost, the reconstruction of their previous levels of evolutionary potential is not easily feasible within the desirable time scales of human generations.

In conclusion, this study highlights the importance of long-term population persistence in creating biodiversity hotspots by showing how past climatic optima can be the main drivers of present biodiversity patterns, more than present population conditions. Furthermore, our estimates provide a global overview of the biogeographic genetic diversity patterns of marine forest species. These estimates are also timely, as they improve the information available for planning the allocation of conservation resources to safeguard biodiversity in the face of climate change.

Methods

Intraspecific genetic data

A literature review was conducted in the “Web of Science” platform (accessed in July 2023) to compile intraspecific genetic diversity data for species of brown macroalgae, regardless of marine forest formation traits. Combinations of the terms “genetic diversity”, “genetic differentiation” or “phylogeography” with “algae”, “macroalgae”, “seaweed” or “heterokontophyta” were used to retrieve a list of potential studies. As initial selection criteria, studies were retained when reporting at least one genetic diversity metric (e.g., allelic richness, expected heterozygosity, observed heterozygosity, etc.) of wild, native populations, specifying the species name, geographic location (longitude and latitude), sample size and genetic marker used (e.g., microsatellite, mitochondrial DNA, etc.).

Considering that regional studies are not expected to enable the detection of genetic and climatic variability (e.g., LGM effects), further filtering retained studies reporting genetic data spanning at least two marine ecoregions61 (Supplementary Data 1). Additionally, when multiple studies reported genetic diversity for the same species, the study covering the largest geographic area was retained (i.e., spanning a larger number of marine ecoregions61). When multiple studies reported genetic diversity for the same species, but sampling spanned different ecoregions, all genetic data were retained. When studies reporting genetic diversity for the same species spanned the same ecoregions but reported different genetic markers and/or genetic diversity metrics, all genetic data were retained. When multiple studies reported genetic diversity for the same species, ecoregions, genetic diversity metrics and genetic markers, the study with more sampling sites (n) was retained. A list of the reviewed studies, with justifications for their selection or rejection, is shown in Supplementary Data 1. Lastly, the genetic diversity dataset of Macrocystis pyrifera along its global distribution32 was divided into two subsets in light of recent findings of extreme genetic divergence between the two morphs of M. pyrifera and M. integrifolia62,63. Accordingly, in the northern hemisphere, M. integrifolia was considered to occur from Stillwater Cove, USA (36.56°N, 121.94°W) northwards and M. pyrifera southwards62,64. In the southern hemisphere, M. integrifolia was considered to occur from La Boca, Chile (33.90°S, 71.83°W) northwards and M. pyrifera southwards62,64. M. integrifolia distribution was considered only in the Americas, contrary to M. pyrifera which is distributed across the Antarctic and subantarctic islands, including New Zealand, Australia and South Africa64. 
Note that the taxonomic names used in the original studies were retained here for clarity, to allow easy tracking of the data sources, despite taxonomical updates (e.g., Cystoseira tamariscifolia updated to Ericaria selaginoides), because the goal was to describe global biogeographic patterns and not species-specific information.
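
The latitudinal rule used to split the Macrocystis dataset can be written as a short function. The following is only an illustrative Python sketch (the original analyses were performed in R); the function name and the `in_americas` flag are hypothetical, and how populations exactly at the two boundary sites are treated is an assumption.

```python
# Latitude thresholds are taken from the text; everything else is illustrative.
STILLWATER_COVE_LAT = 36.56   # Stillwater Cove, USA (degrees north)
LA_BOCA_LAT = -33.90          # La Boca, Chile (degrees south, as negative latitude)


def assign_macrocystis_morph(latitude, in_americas):
    """Return the morph label for a sampled Macrocystis population.

    `in_americas` is a hypothetical flag supplied by the caller; outside the
    Americas only M. pyrifera is considered. Boundary sites are assigned to
    M. integrifolia here (assumption).
    """
    if not in_americas:
        return "M. pyrifera"
    if latitude >= 0:
        # Northern hemisphere: integrifolia from Stillwater Cove northwards.
        if latitude >= STILLWATER_COVE_LAT:
            return "M. integrifolia"
        return "M. pyrifera"
    # Southern hemisphere: integrifolia from La Boca northwards.
    if latitude >= LA_BOCA_LAT:
        return "M. integrifolia"
    return "M. pyrifera"
```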

Environmental data

A set of environmental variables was selected as potential predictors defining the niche of marine forest species based on their biological relevance33 and their availability for contemporary and LGM climatic conditions. Namely, the variables of long-term temperature (minimum and maximum), nitrate, phosphate, salinity, and sea ice coverage were extracted from Bio-ORACLE V2 for contemporary conditions65,66. The same variables for LGM conditions were accessed from previously developed datasets found in two publications32,51.

Capturing environmental niche and centroid

Since species distributions are shaped by their environmental niches—the range of suitable conditions for their survival—we used available contemporary distribution range maps29 to estimate the environmental niche and its centroid (Fig. 1) for each species with genetic data (29 species in total; see results). To this end, we extracted environmental data for the six pre-selected environmental predictors across the species distribution range and performed a Principal Component Analysis. This approach allowed us to address potential collinearity among predictors and define the species’ niche. The niche centroid (NC), representing optimal conditions for the species, was calculated as the mean value of each environmental predictor across the species range. Although species distributions may change over time, we assumed the niche centroid to be stable due to niche conservatism67, reflecting the species’ fundamental environmental preferences.

To explore the relationship between genetic diversity and environmental conditions, we calculated the Euclidean distance between the niche centroid and the local environmental conditions of each population for which genetic data were compiled. These distances were computed for both contemporary environmental conditions and those during the LGM, allowing us to evaluate how the sampled populations were positioned relative to their niche centroid in both time periods (Fig. 1).
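
The centroid and distance calculation can be illustrated with a minimal Python sketch (the original analyses were performed in R). It assumes that predictors are standardized by their standard deviation across the species range before distances are taken; this stands in for the Principal Component Analysis step and is not necessarily the scaling used in the study. The function names and the two-variable toy values are hypothetical.

```python
import math


def niche_centroid(range_cells):
    """Mean of each environmental predictor across the species range."""
    n = len(range_cells)
    n_vars = len(range_cells[0])
    return [sum(cell[j] for cell in range_cells) / n for j in range(n_vars)]


def predictor_sds(range_cells, centroid):
    """Sample standard deviation of each predictor across the range."""
    n = len(range_cells)
    return [
        math.sqrt(sum((cell[j] - centroid[j]) ** 2 for cell in range_cells) / (n - 1))
        for j in range(len(centroid))
    ]


def centroid_distance(site, centroid, sds):
    """Euclidean distance from the centroid in standardized predictor space
    (assumption: standardization replaces the PCA step of the study)."""
    return math.sqrt(sum(((x - c) / s) ** 2 for x, c, s in zip(site, centroid, sds)))


# Toy range with two predictors (e.g., maximum temperature, salinity).
range_cells = [[10.0, 30.0], [12.0, 32.0], [14.0, 34.0]]
centroid = niche_centroid(range_cells)
sds = predictor_sds(range_cells, centroid)

# The same population evaluated under two sets of conditions (hypothetical values).
distance_contemporary = centroid_distance([12.0, 32.0], centroid, sds)
distance_lgm = centroid_distance([14.0, 34.0], centroid, sds)
```

In this toy case the population sits at the centroid under contemporary conditions and away from it under LGM conditions; under the niche conservatism assumption the same centroid is used for both periods.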

This process was repeated for all species for which genetic data was available, resulting in a comprehensive matrix that includes information on the study ID, species name, coordinates of the sampled population, genetic marker, genetic diversity metric (e.g., allelic richness) and its measure, and the distances between populations and the niche centroid under contemporary and LGM conditions (Supplementary Data 2).

Modelling genetic diversity

The machine learning algorithm Boosted Regression Trees (BRT) was chosen to model the genetic diversity (allelic richness and expected heterozygosity) of marine forest species because of its ability to handle nonlinear relationships and its high predictive performance68. Models were fitted with a “Gaussian” distribution, relating genetic diversity estimates to five predictors, namely, the distance from the niche centroid under contemporary and past LGM conditions, and, as random effect variables, the study, the species and the genetic marker used. Monotonicity constraints, based on the expected outcomes of each predictor effect on genetic diversity, were also implemented to reduce overfitting and improve spatial transferability and interpretability69. Specifically, we implemented a negative monotonicity effect for niche centroid distances under contemporary and LGM conditions and an arbitrary effect for the study, species and genetic markers. Model hyperparameters were optimized through tenfold cross-validation33,68. In this process, the “grid search” approach trained all hyperparameter combinations of number of trees (from 50 to 500, by step of 50), tree complexity (2 to 5, by step of 1), learning rate (from 0.1 to 1, by step of 0.1) and the fraction of training set observations (0.25 to 1, by step of 0.25). The model’s predictive performance was evaluated with deviance explained in one independent fold withheld at a time. Optimal hyperparameters were selected as those producing the highest deviance explained in cross-validation. The relative contribution of each predictor to the model was also computed70. A null model using the study, the species and the genetic marker as predictor variables was developed to allow estimating the relative contribution of distance from the niche centroid under contemporary and LGM conditions on the model’s predictive performance.
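
The size of the hyperparameter grid and the evaluation metric can be made concrete with a short Python sketch (the original analyses were performed in R). The variable and function names are hypothetical, the deviance explained is written for a Gaussian model (one minus the ratio of residual to null sums of squares), and the round-robin fold assignment is a simplification; the model fitting itself is not shown.

```python
from itertools import product

# Hyperparameter ranges as described in the text.
n_trees = list(range(50, 501, 50))               # 50 to 500, by step of 50
tree_complexity = list(range(2, 6))              # 2 to 5, by step of 1
learning_rate = [i / 10 for i in range(1, 11)]   # 0.1 to 1, by step of 0.1
bag_fraction = [i / 4 for i in range(1, 5)]      # 0.25 to 1, by step of 0.25

# Every combination evaluated by the grid search.
grid = list(product(n_trees, tree_complexity, learning_rate, bag_fraction))


def deviance_explained(observed, predicted):
    """Proportion of deviance explained for a Gaussian model."""
    mean_obs = sum(observed) / len(observed)
    residual = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    null = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - residual / null


def fold_ids(n_obs, k=10):
    """Assign each observation to one of k folds (round-robin, illustrative)."""
    return [i % k for i in range(n_obs)]
```

With these ranges the grid holds 10 × 4 × 10 × 4 = 1600 combinations, each of which would be fitted once per withheld fold.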

Predicting global marine forest genetic diversity

The developed BRT model was transferred to available distribution range maps of marine forest species29 to predict genetic diversity (allelic richness and expected heterozygosity) patterns beyond the species for which genetic data were available. We restricted the model application to cold and temperate brown macroalgae species because (1) genetic data were mostly available for cold and temperate species, with data reported only for one tropical species (Sargassum liebmannii, Fig. 2), and (2) LGM effects are expected to be less profound on tropical species2,10,12. Therefore, a subset of the available marine forest distribution range maps29 was selected to exclude tropical species based on the temperature range of species distributions. The temperature ranges for each species were inferred by overlaying their distribution with the long-term average temperature from 2010–202065. Species were defined as cold when the 95th percentile of temperature across their distribution was below 15 °C, as temperate between 15 and 25 °C, and as tropical above 25 °C (Supplementary Data 3)71.
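
The thermal classification can be sketched as follows in Python (the original analyses were performed in R). The function name and toy temperatures are hypothetical, the percentile uses linear interpolation as implemented in the standard library, and assigning values of exactly 15 °C or 25 °C to the temperate class is an assumption, since the text does not specify the boundary cases.

```python
from statistics import quantiles


def thermal_affinity(range_temperatures):
    """Classify a species from the temperatures (°C) found across its range."""
    # quantiles(..., n=100) returns 99 cut points; index 94 is the 95th percentile.
    p95 = quantiles(range_temperatures, n=100, method="inclusive")[94]
    if p95 < 15.0:
        return "cold"
    if p95 <= 25.0:   # boundary handling is an assumption
        return "temperate"
    return "tropical"


# Toy examples with hypothetical long-term average temperatures per range cell.
cold_species = thermal_affinity([5, 6, 7, 8, 9, 10])
temperate_species = thermal_affinity([12, 14, 16, 18, 20, 22])
tropical_species = thermal_affinity([24, 26, 28, 29, 30, 31])
```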

Genetic diversity was predicted with the previously produced BRT model, using the distances from the niche centroid as predictors and by fixing the study, the species and genetic marker effects to zero, in line with common practices for mixed effect modelling72. Distribution maps reflecting the potential patterns of genetic diversity were developed per species. Global hotspots of diversity were inferred by summing individual species predictions and estimating the average diversity per pixel33,73. The correlation between the predicted genetic diversity as allelic richness and expected heterozygosity was tested through a two-sided Pearson correlation test. Further, species richness per pixel was estimated from available distribution range maps29 and its correlation with genetic diversity was tested through a two-sided Pearson correlation test. Lastly, a global classification based simultaneously on species richness and genetic diversity was produced by dividing species richness and genetic diversity into quartiles, forming 16 categories of quartile combinations19,74.
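
The stacking, correlation and quartile classification steps can be illustrated with a minimal Python sketch (the original analyses were performed in R). It assumes that each species layer is a mapping from cell identifier to predicted diversity covering only the cells within that species' range, and that the average is taken over the species present in a cell. All names and toy values are hypothetical, and the significance test of the correlation is not shown.

```python
import math
from bisect import bisect_left
from statistics import quantiles


def stack_layers(species_layers):
    """Sum per-species predictions and average them per cell.

    Returns (mean diversity per cell, species richness per cell).
    """
    totals, richness = {}, {}
    for layer in species_layers:
        for cell, value in layer.items():
            totals[cell] = totals.get(cell, 0.0) + value
            richness[cell] = richness.get(cell, 0) + 1
    mean_diversity = {cell: totals[cell] / richness[cell] for cell in totals}
    return mean_diversity, richness


def pearson(x, y):
    """Pearson correlation coefficient between two equally long sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def quartile_classes(mean_diversity, richness):
    """Label each cell with (richness quartile, diversity quartile), each 0 to 3,
    giving 16 possible combinations."""
    cells = sorted(mean_diversity)
    diversity_cuts = quantiles([mean_diversity[c] for c in cells], n=4)
    richness_cuts = quantiles([richness[c] for c in cells], n=4)
    return {
        c: (bisect_left(richness_cuts, richness[c]),
            bisect_left(diversity_cuts, mean_diversity[c]))
        for c in cells
    }


# Toy predictions for three hypothetical species over four cells.
layers = [
    {"a": 2.0, "b": 4.0},
    {"b": 6.0, "c": 1.0},
    {"b": 5.0, "c": 3.0, "d": 8.0},
]
mean_diversity, richness = stack_layers(layers)
classes = quartile_classes(mean_diversity, richness)
```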

All analyses were performed in R75. The data, the R code and the produced predicted genetic diversity layers are openly available in a permanent repository76.

Reporting summary

Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.