Introduction

Pollen grains—the male gametophytes of seed plants—exhibit astonishingly intricate structures within microscopic dimensions, captivating scientists and artists for centuries1,2,3,4,5. The outer wall (exine) of pollen grains provides a barrier against chemical damage and encases the male genetic material, with areas of reduced exine forming apertures that are the starting points for pollen tube growth and thereby pollination6. While pollen grains are designed to protect and transfer male genetic material, the morphological variation in both the apertures and the wall makes them among the most diversified microstructures in nature (Fig. 1)3,4,7. This diversity is thought to provide varying harmomegathic stress systems – the folding of pollen grains to accommodate cellular volume changes due to hydration variations (Wodehouse effect)2 – and is believed to play a role in the efficiency of interactions with pollinators and in pollen-stigma interactions6,8,9. The developmental origins of pollen morphological diversity (disparity) have been explored in numerous taxa with key findings being the influence of minor variations in microspore meiosis on pollen symmetry and aperture position, and the role of self-assembly and phase separation in the ultrastructure of the exine10,11,12,13.

Fig. 1: Pollen diversity within angiosperms.

Representative scanning electron microscope (SEM) pollen images from species listed in Supplementary Table 1, which provides the image sources. Micrographs were selected to illustrate the typical pollen morphology reported for each species, based on visual comparison of multiple pollen grains during image selection. Taxa are numbered and ordered according to a linearised sequence following APG IV classification. (1-2): ANA grade; (3-5): Magnoliids; (6): Chloranthales; (7-10): Other monocots; (11-12): Commelinids; (13): Ceratophyllales; (14-17): Other eudicots; (18-19): Other superrosids; (20-28): Rosids; (29-33): Other superasterids; (34-40): Asterids. Scale bars: 10 µm (1-7, 9-10, 12-19, 21, 22, 24, 26, 28-32, 34, 36-40); 20 µm (11, 20, 23, 25, 27, 33); 30 µm (8); 150 µm (35). Ordinal-level chronograms are shown (bottom), subsampled from our full time-calibrated phylogeny. Coloured bars above the chronogram indicate the major angiosperm lineages (ANA grade, Magnoliids, Monocots, and Eudicots). OthE, other eudicots; OthM, other monocots; OtSA, other superasterids; OtSR, other superrosids.

Understanding how the diversity of organismal forms arises is a crucial topic in evolutionary biology14,15, and the study of angiosperm pollen trait evolution within a broad evolutionary context has a particularly rich history1,3,16,17. Pollen traits played a pivotal role in the morphological diagnosis, through cladistic analysis, of the eudicots (i.e., ‘the tricolpate clade’)—a monophyletic group comprising approximately 75% of extant angiosperm species—distinguished by the synapomorphy of equatorially tricolpate pollen18,19. This was later regarded as a key innovation in early angiosperm evolution11 and the monophyly of the eudicots was soon confirmed using molecular systematics. Meanwhile, owing to its widespread distribution, taxonomic distinctiveness, abundance, and exceptional preservation potential in sedimentary rocks, pollen has been extensively employed in tracing past vegetation and climate changes on Earth7,20,21,22. However, despite this importance, the large-scale patterns of angiosperm pollen diversification remain poorly quantified and understood. This lack of a quantitative framework limits—and may even distort—our understanding of angiosperm radiation, particularly regarding deep evolutionary processes.

Morphological diversity is one of the most intuitive ways to quantify morphological variations within clades and has contributed substantially to documenting the dynamics of evolutionary history—sometimes referred to as the “kinetics” of biological diversity23. In a paradigm rooted in zoology24,25,26,27, early stages in the evolution of a group are expected to feature less constrained genetic and developmental systems coupled with abundant ecological opportunities, leading to a swift peak in disparity. Lupia’s28 quantitative analysis of fossil pollen morphologies during North America’s Cretaceous Period and Palaeocene Epoch demonstrated a rapid rise in Early Cretaceous pollen disparity, often cited as a typical example supporting this model. Pollen disparity in the Asterales—the largest order of angiosperms—further corroborates this pattern29. A plausible evolutionary explanation for this model is that the early success of angiosperms was intricately linked to their capacity for rapid reproduction30,31,32, with selection favouring short-lived pollen grains capable of quick germination32. This process likely drove a series of morphological changes, particularly in the aperture structure. Interestingly, palynological observations of extant early-divergent angiosperms confirm the presence of relatively diverse aperture designs (Fig. 1)16,33.

The early maximum disparity model faces scrutiny from several disparity studies, which suggest that morphological radiations may occur at any point in a clade’s history, with their appearance contingent on the temporal and taxonomic scales of the analysis34,35. Phenotypic disparity research across animals, plants and fungi indicates sporadic growth patterns, portraying early pulse events like the Cambrian explosion as mere snapshots of fossilisable traits, inadequate for explaining overall phenotypic disparity36,37,38. Since disparity studies have predominantly focused on animals, and considering the distinctive evolutionary trajectories of animals and plants (sometimes likened to “dancing to a different beat”39), it remains unclear whether similar disparity patterns exist in plants. Beyond pollen, studies by Oyston et al.40 have shown that most plant clades reached their morphological diversity zenith soon after diverging; angiosperm flowers achieved their disparity peak in the Early Cretaceous41, whereas wood functional disparity only began its rapid ascent in the Mid Cretaceous42.

Extensive and reliable fossil pollen records would undoubtedly be the best data sources for quantifying angiosperm pollen disparity. However, finding reliable angiosperm fossils from the Early Cretaceous and earlier is highly challenging, and taxonomic uncertainty in many microfossils limits their integration into phylogenetic frameworks, thereby constraining their use in macroevolutionary analyses21,22,43,44,45,46,47. For instance, the most reliable crown age for angiosperms is currently established using Barremian–Aptian tricolpate pollen, but this type of pollen should define the crown age of eudicots, thus underestimating the true age of crown angiosperms48,49. Similarly, the prevailing tentative candidates for pre-Cretaceous angiosperm fossils also derive from pollen, dated to the Middle Triassic, though the reproductive structure producing this pollen remains unknown50. Moreover, earlier fossil pollen works, such as those by Doyle51 and Lupia28, were constrained by a limited understanding of evolutionary relationships among groups and were restricted in geographical and temporal extent. Given these challenges, there is a clear need to reevaluate angiosperm pollen disparity in a modern phylogenetic context. By integrating time-calibrated phylogenetic trees, we situate the pollen morphospace within a broader evolutionary context, providing insights into morphological evolution and enhancing our understanding of angiosperm radiation.

Here, we compile a genus-level dataset of extant pollen morphologies and a time-calibrated phylogeny to explore the large-scale evolutionary dynamics of angiosperm pollen, including morphospace occupation, disparity, and rates of morphological evolution. To examine how different structural domains contribute to overall pollen disparity, we defined three analytical subsets: pollen grain shape (shape), aperture pattern (aperture), and exine structure and ornamentation (exine). These subsets were each analysed independently, alongside the full matrix, to provide a modular framework for interpreting pollen disparity. In addition, we examine how pollen disparity correlates with pollination type, genome size, and taxonomic diversity to identify potential relationships between these parameters. This study aims to elucidate the evolutionary mode and tempo of pollen disparity, assessing its compatibility with the early maximal disparity model and offering palynological insights into the evolutionary dynamics of angiosperm diversification.

Results

We developed a discrete-coded matrix of 20 pollen characters specifically designed to capture the fundamental variations in pollen morphology, with genera as the terminal units (Supplementary Data 1 and Supplementary Table 2). These characters are grouped into three major morphological categories: shape (characters 1–6), aperture (characters 7–13), and exine (characters 14–20; referring exclusively to the stratified layers of the pollen wall). These groupings are consistently used throughout the results and are defined in Supplementary Table 2, which also provides detailed descriptions of all 20 pollen characters and their discrete states used in this study. We ensured the proportion of missing data was kept below 5% (including inapplicable codes) to mitigate the adverse effects of missing data, and polymorphic states accounted for approximately 6.2% of the genus-level codings. For the selected 1,880 genera, we reconstructed a maximum clade credibility (MCC) tree that encompasses all 64 orders and 434 recognised families of angiosperms (Fig. 1). The topology is broadly congruent with recent plastome-based phylogenetic frameworks52,53, although a recent nuclear phylogenomic study50 suggests some alternative relationships, particularly within the rosid clade. Our topology supports relationships [Chloranthales [magnoliids [monocots [Ceratophyllales + eudicots]]]] within the mesangiosperms (Fig. 1). All orders and families were recovered as monophyletic in our phylogeny, following the exclusion of rogue taxa to stabilise tree topology.
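To make the handling of missing, inapplicable and polymorphic codings concrete, the following minimal sketch computes a pairwise distance between two coded genera. It is a simplified mean character difference over comparable characters, not necessarily the distance used in this study. The function name, the use of `None` for missing or inapplicable codes, the rule that polymorphic codings sharing any state count as identical, and the toy codings are all assumptions made for illustration.

```python
def character_distance(a, b):
    """Mean character difference between two coded taxa.

    Each taxon is a list with one entry per character: None for a missing
    or inapplicable coding, otherwise a set of observed states (a set with
    more than one state represents a polymorphic coding).
    """
    comparable = 0
    differing = 0
    for state_a, state_b in zip(a, b):
        if state_a is None or state_b is None:
            continue  # skip characters that cannot be compared
        comparable += 1
        if not (state_a & state_b):
            differing += 1  # no shared state: count one difference
    if comparable == 0:
        return None  # distance is undefined for this pair
    return differing / comparable


# Hypothetical codings for three genera and five characters.
genus_a = [{0}, {1}, {2}, None, {0}]
genus_b = [{0}, {0, 1}, {1}, {3}, {0}]
genus_c = [{1}, {0}, {1}, {3}, None]

d_ab = character_distance(genus_a, genus_b)
d_ac = character_distance(genus_a, genus_c)
d_bc = character_distance(genus_b, genus_c)
```

Dividing by the number of comparable characters, rather than by the full character count, keeps sparsely coded genera from appearing artificially similar to everything else, which is one reason to keep the proportion of missing data low.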

Current consensus maintains that fossil-calibrated molecular dating analyses cannot reliably resolve the crown age of angiosperms, though they typically employ maximum age constraints on the crown node of angiosperms based on prior temporal boundaries, ultimately producing alternative chronological frameworks (young vs. old angiosperm time trees)54,55,56. Our dated trees show closer congruence with the plastid phylogenomic angiosperm (PPA) tree of Li et al.52 than with other recent dating studies, likely reflecting shared time-calibration strategies despite distinct molecular datasets (Fig. 1 and Supplementary Fig. 1). Temporal discrepancies among our three dated trees primarily stem from differential maximum age constraints on the crown node of angiosperms, with significant stem age variations observed in the three orders of the ANA grade and five major mesangiosperm clades (spanning Early Cretaceous to Early Triassic) (Supplementary Fig. 1). Notably, our dated trees align more closely with the ‘young tree’ scenario at ordinal and familial levels, in contrast to ‘old tree’ scenarios that place most ordinal diversifications in the Triassic, a timeframe that conflicts with current paleobotanical evidence22,44.

We carried out both Non-metric Multidimensional Scaling (NMDS) and Principal Coordinates Analysis (PCoA), with NMDS proving more effective for visualising and interpreting morphospace in this context (Fig. 2). The two NMDS axes showed strong correlation with the first two axes of the PCoA (Spearman’s r > 0.84, P < 0.001), accounting for 21% and 14% of the variance, respectively. Moreover, in our assessment of the reliability of ordination-based metrics, NMDS-based Mean Pairwise Distance (MPD) showed strong concordance with values derived from the raw distance matrix (Pearson’s r = 0.957, P < 0.001), indicating minimal distortion. By contrast, MPD based on PCoA coordinates was less strongly correlated (r = 0.737, P = 0.0063) (Supplementary Fig. 2). Correlation tests among the distance matrices of the subsets revealed that shape and aperture were weakly correlated (Mantel’s r = 0.285, P = 0.0099), whereas the correlations between exine and the other two components were very weak (r = 0.104-0.136, both P = 0.0099).

Fig. 2: The angiosperm pollen morphospace.

a Non-metric multidimensional scaling (NMDS) morphospace of 1880 angiosperm genera based on 20 discrete pollen characters. b Disparity (sum of variances) from 1000 bootstraps of the fourteen angiosperm groups using NMDS and PCoA ordinations. Original sample sizes for each group were: ANA (n = 12), CHLO (n = 4), MAG (n = 12), MONO (n = 215), EUD (n = 1601); OthM (n = 114), COMM (n = 101), OthE (n = 193), OtSR (n = 30), FAB (n = 687), MAV (n = 173), OtSA (n = 195), CAM (n = 142), LAM (n = 181). Boxplots show the median, interquartile range, and 1.5 × IQR whiskers; outliers are shown as points. c Angiosperm pollen morphospace (tips in red; reconstructed internal nodes in blue) shown against 10,000 points in a theoretical morphospace (grey). ANA, ANA grade; CHLO, Chloranthales; MAG, magnoliids; MONO, monocots; EUD, eudicots; COMM, commelinids; CAM, campanulids; FAB, fabids; LAM, lamiids; MAV, malvids; OthE, other eudicots; OthM, other monocots; OtSA, other superasterids; OtSR, other superrosids. Source data are provided as a Source Data file.

Pollen morphospace occupation

The NMDS and PCoA ordinations revealed that the angiosperm pollen morphospace forms a continuous, cloud-like distribution (Fig. 2a). In these ordinations, Axis 1 showed the strongest association with polarity (shape) and the presence/absence of endoaperture (aperture), while Axis 2 was associated most strongly with tectum sculpture (exine) and the presence/absence of supratectal elements (exine), highlighting the importance of these four traits in characterising pollen disparity (Supplementary Fig. 3 and Supplementary Table 2). Although these axes are based on distance rather than real biological variables, correlations with the original pollen traits allow us to infer the gradients that each axis represents (Supplementary Fig. 3). Based on the sum of variances (SOV) across NMDS axes, the empirical morphospace of all extant genera encompasses approximately 49.0% of the theoretical morphospace’s total variance (Fig. 2c). When measured using the sum of ranges (SOR) across axes, the occupied proportion is substantially larger, reaching ~89.6% (Fig. 2c). This occupied region is concentrated toward the negative end of Axis 1, which covaries most strongly with traits such as pollen size (shape) and the presence/absence of tectum (exine). Notably, when restricting the empirical morphospace to the central 95% of genera (excluding extreme outliers), the occupied space contracts to only ~32% of the theoretical variance and ~51% of the range (Fig. 2c).
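The contrast between the variance-based and range-based occupation figures can be illustrated with a toy calculation. The sketch below computes the sum of variances and the sum of ranges across ordination axes and expresses the empirical values as fractions of a theoretical point cloud. The coordinates are invented for illustration and the function names are hypothetical; sample variance is assumed.

```python
from statistics import variance


def sum_of_variances(points):
    """Sum of per-axis sample variances of a set of ordination coordinates."""
    return sum(variance(axis) for axis in zip(*points))


def sum_of_ranges(points):
    """Sum of per-axis ranges (max minus min) of the same coordinates."""
    return sum(max(axis) - min(axis) for axis in zip(*points))


# Invented two-axis coordinates: a theoretical cloud and a narrower empirical one.
theoretical = [(-2.0, -2.0), (-2.0, 2.0), (2.0, -2.0), (2.0, 2.0), (0.0, 0.0)]
empirical = [(-1.0, -1.0), (-1.0, 1.0), (1.0, -1.0), (1.0, 1.0), (0.0, 0.0)]

sov_fraction = sum_of_variances(empirical) / sum_of_variances(theoretical)
sor_fraction = sum_of_ranges(empirical) / sum_of_ranges(theoretical)
```

In this toy case the empirical cloud spans half of the theoretical range on each axis but holds only a quarter of the variance, because variance scales with the square of the spread. Ranges are also set entirely by the most extreme points, which is consistent with the range-based proportion falling sharply once outlying genera are excluded.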

Under a within-clades analysis, eudicots as a whole occupy the largest morphospace area (SOR) among angiosperms, while within eudicots, fabids represent one of the largest subclades in terms of SOR across all four dimensions: overall disparity, shape, aperture, and exine (Supplementary Fig. 2). The sole exception is the exine subset, where eudicots have a similar range value to monocots in the NMDS results, and a smaller range in the PCoA results. Within the eudicots, fabids, the other superasterids, and lamiids occupy the largest areas (Supplementary Fig. 2). Moreover, these three subgroups achieve the highest disparity levels in the exine, aperture, and shape subsets. Sum of Variances (SOV), Mean Pairwise Distance (MPD), and Mean Distance to Centroid (MDC) displayed similar size trends. In terms of overall pollen disparity, monocots and the other superasterids exhibit the highest disparity levels among angiosperms and eudicots, respectively. In shape, magnoliids and the other superasterids stand out, while for apertures, it is monocots and the other superasterids. In the exine subset, monocots and fabids show the highest disparity. Details of the comparative results of these indices are shown in Fig. 2b and Supplementary Figs. 2, 4–7.
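The two distance-based indices named above can be sketched as follows. This is a minimal illustration on invented coordinates, assuming Euclidean distances in ordination space; the function names and the example clade are hypothetical.

```python
from itertools import combinations
from math import dist


def mean_pairwise_distance(points):
    """MPD: average Euclidean distance over all unordered pairs of points."""
    pairs = list(combinations(points, 2))
    return sum(dist(p, q) for p, q in pairs) / len(pairs)


def mean_distance_to_centroid(points):
    """MDC: average Euclidean distance from each point to the group centroid."""
    n = len(points)
    centroid = tuple(sum(axis) / n for axis in zip(*points))
    return sum(dist(p, centroid) for p in points) / n


# Invented coordinates for a clade of four genera on two ordination axes.
clade = [(0.0, 0.0), (3.0, 0.0), (0.0, 4.0), (3.0, 4.0)]
mpd = mean_pairwise_distance(clade)
mdc = mean_distance_to_centroid(clade)
```

Both indices average over taxa, so they describe how dispersed a clade is internally rather than how far its extremes reach. That is why a clade can score highly on them without occupying the largest range.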

A k-means clustering analysis divided the pollen morphospace into four clusters, with eudicots forming exclusive clusters for shape and aperture subsets, while the distribution was even in the exine pattern among major groups (Supplementary Fig. 8). Disparity differs significantly among angiosperm groups (Kruskal-Wallis test: χ2 = 4669.6, df = 4, P < 0.001) and eudicot groups (χ2 = 6149.2, df = 6, P < 0.001) (Fig. 2). Using the subgroup omission method, we found that commelinids, the other monocots, and fabids made the highest contributions to overall angiosperm pollen disparity, while the ANA grade, Chloranthales, and the other superasterids made the lowest contributions (Supplementary Fig. 5). The PERMANOVA analyses revealed a statistically significant separation in pollen morphospace occupation among the major angiosperm groups and among the major eudicot groups (see Supplementary Data 2 for details), indicating that most subgroups possess distinct trait combinations. Notably, morphospace separation was more pronounced between eudicots and other groups, whereas the ANA grade and Chloranthales were not clearly distinguished from the other non-eudicot groups (Supplementary Data 2). Compared with the NMDS results, PCoA showed greater discriminating power: all eudicot groups exhibited a clear separation in their morphospace occupation (Supplementary Figs. 6, 7 and Supplementary Data 2).

The observed patterns of morphospace occupation are shaped by a detectable phylogenetic structure. A Mantel test revealed a moderate but significant correlation between the pollen morphological distance matrix and the phylogenetic distance matrix derived from the time-calibrated tree (Mantel’s r = 0.397, P = 0.0099), indicating that evolutionary time contributes to overall pollen disparity. Phylogenetic signal tests using Blomberg’s K further supported this view: Axis 1 of both NMDS and PCoA ordinations exhibited statistically significant signals (K = 0.10 and 1.03, respectively; P < 0.001) (Supplementary Fig. 9).
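A Mantel test of this kind can be sketched with a simple permutation scheme: correlate the upper triangles of the two distance matrices, then repeatedly relabel the taxa of one matrix to build a null distribution. The code below is an illustration under stated assumptions, not the implementation used here. The function names, the one-sided p-value formula (hits + 1) / (permutations + 1), the default of 100 permutations and the toy matrices are all assumptions.

```python
import random
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def upper_triangle(matrix, order):
    """Upper-triangle distances after relabelling taxa according to order."""
    n = len(order)
    return [matrix[order[i]][order[j]] for i in range(n) for j in range(i + 1, n)]


def mantel(morph, phylo, permutations=100, seed=0):
    """Return (observed r, one-sided permutation p-value)."""
    rng = random.Random(seed)
    identity = list(range(len(morph)))
    phylo_vec = upper_triangle(phylo, identity)
    observed = pearson(upper_triangle(morph, identity), phylo_vec)
    hits = 0
    for _ in range(permutations):
        order = identity[:]
        rng.shuffle(order)  # relabel taxa in the morphological matrix only
        if pearson(upper_triangle(morph, order), phylo_vec) >= observed:
            hits += 1
    return observed, (hits + 1) / (permutations + 1)


# Toy data: five taxa placed on a line, once for each kind of distance.
phylo_pos = [0, 1, 2, 5, 9]
morph_pos = [0, 1, 3, 4, 8]
phylo = [[abs(a - b) for b in phylo_pos] for a in phylo_pos]
morph = [[abs(a - b) for b in morph_pos] for a in morph_pos]

observed_r, p_value = mantel(morph, phylo)
```

Under this formula the smallest attainable p-value with 100 permutations is 1/101, or about 0.0099. That matches the reported value, although reading it this way is an inference rather than something stated in the text.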

The C1 convergence analysis for all genus pairs revealed a median C1 value of 0.231, indicating a moderate reduction in ancestral morphological divergence. The C1 distribution (Supplementary Fig. 10) exhibited a bimodal pattern with a primary peak at low C1 values (near zero), reflecting widespread morphological divergence, and a distinct secondary peak at C1 = 1.0, representing extreme morphological convergence. In total, 203,760 genus pairs (5.77% of all pairs) demonstrated high convergence (C1 ≥ 0.90). The taxonomic rank analysis of these highly convergent pairs (Supplementary Fig. 10) showed the majority occurred at G1 (70.1%) and G2 (17.6%) ranks, whereas only a small fraction was observed at other ranks (Family: 6.1%; Order: 5.1%; Subclass: 1.1%), indicating that high morphological convergence arises among taxonomically distant lineages. Here, G1 and G2 ranks represent two informal taxonomic ranks between order and subclass, with specific divisions detailed in the Methods section.
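For orientation, the C1 index can be sketched as the proportion of the maximum past distance between two lineages that has been closed at the tips. The snippet assumes the commonly used definition C1 = 1 - D_tip / D_max; the exact procedure followed in this study is given in its Methods. The function name and the example distances are invented for illustration.

```python
from statistics import median


def c1(tip_distance, max_ancestral_distance):
    """C1 = 1 - D_tip / D_max for one pair of lineages.

    D_tip is the morphological distance between the two tips; D_max is the
    largest distance reached between the two lineages at any point since
    their common ancestor, so D_max >= D_tip is assumed.
    """
    if max_ancestral_distance <= 0:
        raise ValueError("D_max must be positive")
    return 1 - tip_distance / max_ancestral_distance


# Invented (D_tip, D_max) values for four genus pairs.
pairs = [(0.0, 2.0), (1.5, 2.0), (2.0, 2.0), (0.1, 2.0)]
values = [c1(d_tip, d_max) for d_tip, d_max in pairs]

median_c1 = median(values)
share_high = sum(v >= 0.90 for v in values) / len(values)
```

A value of 1 means the tips are morphologically identical despite past divergence. A value of 0 means the pair is currently as far apart as it has ever been.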

Rates of morphological evolution

Rates of morphological character change were quite low (< 0.1 character state changes/Ma) until the Neogene, but increased markedly thereafter, reaching 0.21 character state changes/Ma in the Neogene and 0.58 character state changes/Ma in the Quaternary (Fig. 3 and Supplementary Fig. 11). In tests of branch rates, significant character state changes were detected in 599 branches, of which the 150 internal branches are marked in Fig. 3. In tests of morphological evolution rates based on overall pollen traits and the aperture subset, the top fifteen fastest-evolving branches for each trait set are marked in Fig. 3. Three notable branches, those leading to the eudicots, superrosids, and asterids, appear in both rankings (see Fig. 3 and Supplementary Table 3). Among these, the branch leading to eudicots showed unambiguous state changes from heteropolar and mono-aperturate grains to isopolar and tricolpate grains, while superrosids and asterids displayed the emergence of endoapertures (Fig. 3 and Supplementary Table 3). Meanwhile, the branch leading to the Nitrogen-fixing Clade exhibited the highest rate of pollen morphological change of all branches (1.66 character state changes/Ma) (Fig. 3 and Supplementary Table 3).

Fig. 3: Evaluation of branch rates of morphological evolution within time-calibrated phylogeny.

Coloured dots represent the fifteen branches with the fastest rates of morphological change in overall pollen morphology (red) and aperture specifically (yellow) during the Early Cretaceous or earlier. J, Jurassic; EC, Early Cretaceous; LC, Late Cretaceous; Pal, Paleogene. One hundred and fifty internal branches with significant morphological shifts are labelled with black dots. ANA, ANA grade; CHLO, Chloranthales; MAG, magnoliids; OthE, other eudicots; OthM, other monocots; OtSA, other superasterids; OtSR, other superrosids. Source data are provided as a Source Data file.

In rate tests for individual traits, tectum sculpture, size, outline in polar view (shape), supratectal element presence/absence, and endoaperture presence/absence (aperture) were the five fastest-evolving traits, with tectum sculpture evolving at a rate approximately four times that of the second fastest, size (1.74 versus 0.45 character state changes/Ma) (Supplementary Fig. 11). Meanwhile, the five slowest-evolving traits were basic shape (shape), exine presence/absence (exine), macro-ornamentation type (shape), tectum presence/absence, and infratectum type (exine) (Supplementary Fig. 11). Apart from the presence/absence of endoaperture, aperture traits generally exhibited moderate evolutionary rates. Further analysis confirmed that rates of morphological evolution across the three blocks were fastest for the exine, followed by aperture and shape (Supplementary Fig. 11).

Pollen disparity through time

We analysed time slices from our dated tree covering three possible crown age scenarios for angiosperms: Late Triassic, Early Jurassic, and Early Cretaceous to the present. The results indicate that, under these different crown age hypotheses, variations in pollen disparity over time primarily pertain to the initial expansion of pollen morphospace during the establishment of the clade (Fig. 4 and Supplementary Fig. 12). However, a consistent pattern emerges: under all scenarios we see a rapid increase in disparity during the Albian–Cenomanian stages of the Cretaceous (an interval of ~20 million years), dominated by changes in aperture patterns. This surge is followed by a prolonged period of stability, or very slow growth, during the Late Cretaceous, then a steady increase beginning at the Cretaceous–Paleogene transition, contributed by all three subsets of characters (aperture, exine and shape), reaching a peak by the Oligocene (Fig. 4). This evolutionary pattern remains consistent across 100 random subsamples, each with a sample size of 1000, and proves stable under varying rarefaction levels and different size-dependent indices, underscoring the robustness of the dataset (Supplementary Fig. 12).

Fig. 4: Angiosperm pollen disparity (sum of variances, SOV) through time.

a Disparity Through Time (DTT) for crown angiosperms under a Jurassic crown age (187 Ma), based on NMDS ordinations of MORD distance matrices. The black line shows the median SOV across 1000 bootstrap replicates, and the grey ribbon indicates the 5th–95th percentile envelope. The upper-left inset shows corresponding DTT curves for the three subsets (shape, aperture and exine). b DTT curves for three alternative dating scenarios for the angiosperm crown: Cretaceous crown (KC, 139 Ma), Jurassic crown (JC, 187 Ma) and Triassic crown (TC, 248 Ma), each based on 1000 bootstrap replicates. c Robustness of the DTT pattern to taxon sampling, based on 100 random subsamples of the morphological matrix (n = 1000 genera per subsample). Each grey line represents the median SOV curve through time for one subsampled dataset. EJ, MJ and LJ denote the Early, Middle and Late Jurassic; EC and LC the Early and Late Cretaceous; Pal, Eo and Ol the Palaeocene, Eocene and Oligocene; Mio the Miocene. OthE, other eudicots; OtSA, other superasterids; OtSR, other superrosids. Source data are provided as a Source Data file.

By plotting the disparity through time (DTT) for each of the three subsets of characters, we can observe the varying contributions of each subset to the overall disparity at different stages of angiosperm evolution (Fig. 4). The three subsets exhibit distinct patterns of disparity accumulation, with the disparity in aperture patterns, particularly, experiencing an approximately fivefold increase across the Albian–Cenomanian boundary (Fig. 4).
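The bootstrap summary behind a disparity-through-time curve can be sketched as follows: for each time slice, resample the lineages present with replacement, recompute the sum of variances, and report the median together with the 5th and 95th percentiles. This is an illustration only. How lineages are assigned to slices is not shown, and the slice names, coordinates, function names and simple index-based percentile rule are all assumptions.

```python
import random
from statistics import median, variance


def sum_of_variances(points):
    """Sum of per-axis sample variances of a set of ordination coordinates."""
    return sum(variance(axis) for axis in zip(*points))


def bootstrap_sov(points, replicates=1000, seed=1):
    """Median and 5th-95th percentile envelope of bootstrapped SOV."""
    rng = random.Random(seed)
    values = []
    for _ in range(replicates):
        resample = [rng.choice(points) for _ in points]  # sample with replacement
        values.append(sum_of_variances(resample))
    values.sort()
    lower = values[int(0.05 * (replicates - 1))]
    upper = values[int(0.95 * (replicates - 1))]
    return median(values), lower, upper


# Invented coordinates of the lineages present in two hypothetical time slices.
slices = {
    "older slice": [(0.0, 0.0), (0.1, 0.1), (-0.1, 0.1), (0.1, -0.1), (-0.2, 0.0)],
    "younger slice": [(-2.0, -1.0), (2.0, 1.0), (0.0, 2.0), (1.0, -2.0), (-1.0, 0.0), (0.0, 0.0)],
}

curve = {name: bootstrap_sov(points) for name, points in slices.items()}
```

Plotting the three values per slice against slice age gives a median line with an envelope. Running the same procedure on coordinates from each character subset gives subset-specific curves.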

Correlations

Our findings reveal that pollen disparity is not correlated with taxonomic diversity in angiosperms (Mantel’s r = -0.01, P > 0.5; Supplementary Fig. 13). However, we observed weak but significant correlations with genome size (Mantel’s r = 0.104, P < 0.001; Supplementary Fig. 13). Although the correlation between pollination mode and the primary two axes of the NMDS and PCoA ordination was weak (Pearson’s r = 0.1–0.24, P < 0.001), statistically significant separations were observed among most pairs of pollination modes (e.g., abiotic versus biotic, wind versus insect) (Supplementary Data 3). Notably, the separation between insect and vertebrate pollination was not statistically significant, as indicated in Supplementary Figs. 14, 15 and Supplementary Data 3.

Discussion

Extant angiosperm pollen morphospace

In characterising and visualising the disparity of extant angiosperm pollen, we demonstrate that clades occupy morphospace unevenly. Notably, there is a clear separation between eudicot and non-eudicot angiosperms (Supplementary Figs. 4, 8). Within eudicots, differentiation in pollen morphospace among the main groups is apparent, particularly concerning shape and apertures (Supplementary Fig. 4 and Supplementary Data 2). Monocots exhibit the highest disparity across multiple size indices (SOV, MPD, MDC), even though they do not occupy the largest area of the morphospace (SOR). This implies that monocot pollen is highly dispersed within a comparatively narrow region of morphospace, suggesting extensive within-clade variation constrained to a narrower domain of morphotypes. In contrast, fabids, though less internally dispersed, occupy one of the largest morphospace areas among eudicots, especially in the exine and aperture subsets (Supplementary Fig. 4). The pattern of pollen disparity contrasts with floral disparity patterns observed in López-Martínez et al.41, where monocots exhibited low levels of floral disparity, and fabids showed only moderate disparity within eudicots. In animal taxa, early-diverging groups have often been found to exhibit higher levels of disparity than their more derived counterparts15,24,25, a phylogenetic pattern also observed in the morphospace of angiosperm flowers41,57. However, our results show that monocots display the highest overall disparity, with magnoliids exhibiting the greatest variability in shape, while the other superasterids have the highest disparity among major eudicot groups (Fig. 2 and Supplementary Fig. 4). This suggests that the pattern of higher disparity in early-diverging groups is not consistently observed in pollen, contrasting with findings from a floral perspective.
These findings reinforce the idea that disparity patterns are highly context-dependent and shaped by both evolutionary history and morphological domain15,34,35. Interestingly, while the other superasterids rank highest in eudicot-level disparity, their contribution to total angiosperm disparity remains limited (Fig. 2 and Supplementary Figs. 4, 5). This decoupling arises because overall disparity is sensitive not just to internal trait variation, but also to the distinctiveness of occupied regions in morphospace58. Conversely, clades like commelinids and malvids, which occupy relatively peripheral regions, have stronger impacts on overall disparity even with moderate pollen disparity (Fig. 2 and Supplementary Figs. 4, 5). Moreover, our analysis indicates that genome size and pollination types have weak explanatory power for pollen disparity, with the exact mechanisms of influence remaining unknown (Supplementary Fig. 13). In addition, pollen disparity is completely decoupled from taxonomic diversity, consistent with previous disparity studies28,40 (Supplementary Fig. 13).

In the theoretical morphospace, extant angiosperm pollen occupies a substantial region, consistent with observations by Mander59 that pollen character combinations in extant Neotropical angiosperm species have explored much of the raw morphospace. This underscores the immense inherent evolutionary potential of angiosperm pollen to realise diverse forms. However, this broad occupation is not uniform, with the bulk of angiosperm pollen diversity clustering within a more restricted (yet still extensive) portion of the theoretical morphospace (Fig. 2), suggesting that realised forms predominantly inhabit a constrained region of the morphological landscape. The significant Mantel correlation (Mantel’s r = 0.397, P = 0.0099) indicates that evolutionary time contributes to the overall divergence of pollen. However, the generally low Blomberg’s K values across the ordination axes suggest limited phylogenetic conservatism in the multivariate structure of pollen morphology (Supplementary Fig. 9). Indeed, our C1 convergence analysis highlights selection pressures as a potent evolutionary driver within pollen morphospace. The observed median C1 of 0.231 reflects a general tendency for pollen lineages to partially close ancestral morphological distances. More importantly, the prominent peak at C1 = 1.0 (Supplementary Fig. 10) reveals numerous instances of complete morphological convergence. The enrichment of these highly convergent pairs (C1 ≥ 0.90) at higher taxonomic ranks (70.1% at G1 and 17.6% at G2; Supplementary Fig. 10), rather than within closely related groups, strongly suggests that they represent independent evolutionary events among distantly related lineages. Conversely, the relative scarcity of such pairs across the six major angiosperm lineages (1.1% at subclass) may indicate deeper developmental constraints at these broader phylogenetic scales, particularly spanning the fundamental monocot–eudicot divide (Supplementary Fig. 10).
These findings, which demonstrate vast evolutionary potential coupled with strong selective guidance toward specific forms, are consistent with a study from a developmental perspective emphasising a key role for natural selection in shaping pollen morphology60.

Two main pulses in angiosperm pollen disparity

Our research focuses on pollen, a highly diversified plant structure, providing a microscopic lens into the macroevolutionary history of angiosperms. We demonstrate that angiosperm pollen disparity reached a peak following two main pulses of diversification in the Mid Cretaceous and the Paleogene periods (Fig. 4). The Albian–Cenomanian stages, spanning approximately 20 million years, marked the most rapid period of growth in disparity, primarily associated with changes in aperture patterns, which elevated pollen disparity to a temporary plateau. This period of rapid evolution and subsequent stability, seen by analysing modern-day pollen traits, corresponds to the broad patterns of character acquisition revealed by the Barremian to Cenomanian fossil record (summarised by Traverse61), including the appearance of novel aperture types, exine structures and grain sizes and shapes that resulted in a broad range of angiosperm pollen morphologies by the beginning of the Late Cretaceous. This pattern is also largely consistent with trends in fossil pollen disparity shown by Lupia28, with the main increase in disparity occurring between the Aptian and the Cenomanian. A subsequent though smaller rise in disparity was related to continued innovation in triaperturate grains, including the appearance of extinct groups of triprojectate pollen (i.e., Aquilapollenites) and triporate pollen, including morphotypes with complex internal aperture structures (Normapolles pollen)28.

The Paleogene increase in disparity revealed by our analysis (Fig. 4) coincides with the reorganisation of plant communities following the Cretaceous–Paleogene mass extinction (discussed below in section ‘Emergence of Environmental Opportunities’), and the appearance through the Paleogene of pollen types that can be more confidently assigned to extant families and genera. To date there are no studies of fossil pollen disparity to allow for direct comparisons with the extant-taxon results presented here. However, studies from the Neotropics62,63 and southern North America64,65 have shown a rise in palynological diversity from the mid-Palaeocene through to a peak in the Middle Eocene, largely driven by the appearance of monoaperturate, porate and colporate pollen types that differ from each other in terms of aperture number and structure, exine structure and ornamentation, and grain size and shape61,64,66, which is consistent with the rise in disparity shown in our results. There is limited evidence for continued morphological innovation in the fossil pollen record beyond the Paleogene, with similar morphologies carried through into the Neogene and Quaternary61, again consistent with our disparity through time analysis.

Interestingly, this two-phase increase in pollen disparity coincides with surges in angiosperm leaf vein density, with rapid increases recorded in the Albian-Cenomanian and Maastrichtian-Palaeocene67. Feild et al.67 linked these sharp increases in leaf vein density to the first appearance of angiosperm trees in the Mid Cretaceous, and ecological expansion and radiation into new environments in the latest Cretaceous, including wet megathermal forests. It is not currently clear whether variations in pollen disparity and leaf vein density shared similar drivers, or whether innovations in one plant organ enabled ecological expansion, which in turn drove morphological evolution in other parts of the plant bauplan, but linking patterns of morphological evolution and disparity across plant organs is a key topic for future research.

Throughout the Cretaceous, pollen disparity increased more than elevenfold, and by the Cenomanian, it had reached about 57% of its later maximum value in the Oligocene (Fig. 4). Under the hypothesis of an Early Cretaceous origin of crown angiosperms, the Mid Cretaceous increase in disparity occurred during a phase of angiosperm radiation (125–90 Ma68). However, it lagged behind the radiation of major mesangiosperm clades and did not reach maximum disparity, thus not aligning with the early-burst model of adaptive radiation. Overall, this pattern is better interpreted as an episodic increase in disparity, a perspective more compatible with the hypothesis that crown angiosperms originated in the Cretaceous, the Jurassic or even earlier. This raises a significant issue: prior to these pulses, pollen disparity remained low for an extended period, making it challenging to distinguish fossil morphologies among the earliest angiosperm taxa, which could obscure our understanding of early angiosperm evolution. Under this interpretation, the seemingly sudden appearance of Cretaceous angiosperms in a “modern form”—Darwin’s “abominable mystery”—is more likely a manifestation of pulsed increases in disparity.

This increasing trend in pollen disparity contrasts with recent findings in Asterales pollen and angiosperm flowers29,41, which both follow the commonly held view of early maximal disparity seen in many animal groups. The early maximal disparity model also reflects an uneven pace of evolutionary change, which does not necessarily conflict with episodic increases in disparity. These differences in conclusions from studies on Asterales pollen primarily depend on taxonomic scale; when their observed early bursts are nested within higher-level angiosperm groups, different evolutionary patterns emerge. Moreover, reproductive cells fortified with sporopollenin have been found throughout land plant history69, and the emergence of sporopollenin is considered a key adaptation to terrestrial life70. Flowers, by contrast, are a more recent innovation. Taken together, these contrasting patterns of disparity reflect how evolutionary insights are shaped by temporal and taxonomic scales, as well as the focus of observation.

Rapid pollen character state changes

During the Mid Cretaceous and Paleogene periods, the overall rates of angiosperm pollen morphological evolution remained generally stable and low over time, indicating that the observed pulsed increases in disparity relate more to evolutionary mode than to tempo (Fig. 3). When morphospace is constrained, traits tend to exhibit frequent reversals and convergences71. Based on existing pollen morphology datasets, this phenomenon is captured at the tips of the time tree, leading to apparent rapid morphological evolution in later stages (Fig. 3). Notably, some rapid pollen morphological shifts occur at crucial evolutionary nodes associated with the formation of the most successful angiosperm lineages: for example, eudicots (increase in aperture number from one to three), superrosids (appearance of endoaperture), and asterids (appearance of endoaperture). The latter two correspond with the observed Mid Cretaceous pulse period (Fig. 3 and Supplementary Table 3). The sequential emergence of tricolpate pollen (based on Fischer’s rule) and endoapertures reflects that increases in disparity were gradually achieved by building upon morphological foundations established closer to the divergence point of each lineage. These innovations may have reshaped the harmomegathic systems of pollen during early angiosperm evolution, conferring greater survival advantages at the clade level and leading to significant ecological and evolutionary success.

Emergence of environmental opportunities

Since the 1970s, the formation of the pollen wall has been recognised as a product of both gene expression and physical forces4,72,73, highlighting the impact of the external environment on pollen morphology. The sporopollenin wall is a stress-mechanical device constructed within a given volume. Its vast range of variation, particularly the diversity of aperture structures, can readily be interpreted as highly adaptive8,9,72,73,74. An intriguing example is that, as angiosperm seagrasses returned to the marine environment, their pollen evolved into an extraordinary filiform, exineless form, and some genes involved in the biosynthesis and modification of exine were lost75. This shift is considered the most severe habitat transition in angiosperm history, resulting in the emergence of extremely specialised pollen forms75.

The Mid Cretaceous Period experienced significant physical environmental changes, including sea-level rise, continental break-up, and greenhouse effects, which greatly affected the growth and spread of angiosperms76,77. This expansion is also observed in the pollen record: initially, angiosperm pollen was very scarce in the fossil record during the Berriasian, but, starting in the Aptian–Cenomanian stages, fossils suggest angiosperms gradually spread from low to mid-high latitudes and became an important component of the ecosystems they inhabited20,28. The Cretaceous–Paleogene mass extinction (K–Pg event) had destructive impacts, leading to significant global vegetation turnover and the decline of non-flowering plants78,79. Based on studies of pollen and leaf fossils, Carvalho et al.78 found that the K–Pg event altered forest structures, leading to the formation of Neotropical rainforests and allowing the emergence of more diverse plant habits and growth forms. The increase in pollen disparity during the Paleogene corresponds to this period when angiosperms radiated into various environments and gained ecological advantages79. It also coincides with the rapid rise in pollinating insect family richness79,80. Interestingly, during the morphospace expansion in the Paleogene, the most rapid shifts in pollen morphology predominantly occurred within the orders Alismatales, Ericales and Malpighiales (Supplementary Table 3). The Alismatales illustrate transitions from terrestrial to aquatic and even marine environments81, while the Malpighiales and Ericales are strongly represented in lowland tropical rainforests and have been associated with the development of neotropical rainforests82,83.

Environmental opportunities control the availability of ecospace, thereby influencing morphological evolution84,85. If adaptive morphological radiations are associated with unfilled ecospace, then increases in disparity after mass extinctions are expected because large amounts of ecospace previously occupied by other plants were vacated14,85,86. This is consistent with our pollen results (Fig. 4). Such rebounds in disparity after extinction events have also been observed in the fossils of many animal groups, such as brachiopods, crinoids, and ammonites87,88.

The K–Pg event’s impact on angiosperm extinction was mainly at the species level, with less effect at higher hierarchical levels (families and orders)89,90. Therefore, the expected impact on pollen disparity would be relatively limited89,90. However, our findings go beyond this, as pollen disparity reached new peaks during the Paleogene. Genomic and developmental innovations during this period may have played important roles in breaking earlier pollen morphological constraints. The numerous whole-genome duplication (WGD) events clustered around the Cretaceous–Paleogene boundary are potential contributors91,92. Angiosperm WGDs are considered related to environmental conditions and morphological innovations92,93, and fossil evidence of unreduced pollen (gametes retaining somatic chromosome numbers) or morphologically abnormal pollen found near several mass extinction events is not rare94,95. Indeed, WGDs may be merely one of numerous intrinsic innovations, and the association between WGD and pollen evolution still requires further confirmation. Two outstanding questions around the impact of the K-Pg event are particularly noteworthy: (i) The Mid Cretaceous pulsed increase in pollen disparity was dramatic, but the associated environmental changes were not as destructive as the K–Pg event; could this be attributed to fewer genetic and developmental constraints in earlier phases of angiosperm evolution? (ii) If this mass extinction event had not occurred, would angiosperm pollen morphospace have exhibited relative stasis after the initial rapid morphological evolution?

Summary

Based on extant pollen data and time-calibrated trees, we have identified a pulsed evolutionary pattern in angiosperm pollen disparity. However, our approach faces two main limitations. First, by focusing solely on extant angiosperm taxa, we only capture the net accumulation of pollen disparity, whereas incorporating fossil pollen data—particularly from taxa that have left no direct extant descendants—would likely yield a more complete picture. Second, our pollen coding strategy targets the most fundamental characteristics of extant angiosperm pollen, thereby overlooking finer-scale evolutionary details that could be both informative and engaging. Nevertheless, we detect robust evolutionary signal in the dynamics of angiosperm pollen disparity over time, corroborated by the fossil record28,96,97. We infer that this reflects the true history of early divergences leading to dissimilar pollen forms. A significant consequence of this evolutionary pattern is that pollen disparity likely remained low before the major pulses, complicating the identification of unequivocal angiosperm pollen fossils from the Early Cretaceous and earlier. This difficulty significantly obscures our understanding of the ‘morphological age’ (referring to the time when the ancestral lineage of an evolutionary branch can be identified in the fossil record) of angiosperms.

We found that the periods of rapid expansion in pollen disparity corresponded to significant environmental changes and were synchronous with changes in leaf vein density. This suggests that key environmental opportunities played an important role in enhancing pollen disparity. A possible explanation is that environmental changes increased the evolutionary potential of genomes (e.g., WGDs), supplying a wealth of evolutionary novelties. Some of these novelties were selected and fixed due to their survival and reproductive advantages in the diverse ecological spaces opened up by environmental changes, leading to higher taxonomic diversity. This observation is consistent with the significant taxonomic value of pollen at the family, order, and higher levels in angiosperms3,16,17,33,98.

The evolution of phenotypic disparity across the entire plant kingdom is considered to exhibit episodic increases38, as also concluded for the reproductive complexity of land plants99. Some evidence suggests that groups showing early bursts of disparity are often subclades of higher groups that exhibit episodic increases34,35,36,37,38. Overall, our understanding of disparity at different higher taxonomic levels is continually developing. Although angiosperms are a relatively young higher taxonomic group, they have exhibited pulsed evolution in pollen disparity over a relatively short geological timescale, which is inseparable from their remarkable intrinsic innovation.

Methods

Phylogenetic reconstruction

Given the significance of pollen traits in angiosperm systematics, our goal was to balance the representation of diverse pollen morphologies with comprehensive phylogenetic coverage. Consequently, we sampled 1926 genera, including 1880 angiosperm genera across 434 families and 64 orders, and 46 gymnosperm genera as outgroup taxa. We chose to use genera as terminal units, a choice consistent with standard practice in both palynology and macroevolutionary studies43,100. Genera are widely regarded as evolutionarily ‘real’ units, often reflecting shared evolutionary trajectories and coherent morphological patterns among their constituent species100,101. Genus-level coding reduces species-level sampling bias while still capturing meaningful proxies for ecological and morphological diversity100. This level of sampling is also well-suited to palynological datasets, owing to the relatively conservative nature of pollen variation within genera (and sometimes within families).

Recent molecular supermatrix studies have shown that most currently accepted angiosperm genera are now well circumscribed and generally supported as monophyletic102,103. Given the macroevolutionary scale of our study, potential non-monophyly of some genera is unlikely to affect our main conclusions. Nevertheless, to ensure taxonomic consistency, we cross-validated the status of all sampled genera against global databases, including Plants of the World Online (POWO; accessed May 2021) and the Catalogue of Life (COL; accessed May 2021). Genera with ambiguous or unresolved taxonomic placements were excluded from subsequent analyses. Although genomic data could provide a more robust angiosperm backbone, we opted for a gene-sampling and composite taxon approach to ensure thorough coverage of all recognised families and their representative infra-familial branches (predominantly subfamilies). While recent phylogenomic studies, such as Zuntini et al.56, have proposed alternative ordinal relationships based on nuclear datasets, particularly within rosids, we followed the ordinal and major clade definitions of APG IV104, which remain widely accepted and facilitate consistency across comparative analyses (Supplementary Data 1). Infra-familial subdivision is mainly based on the Angiosperm Phylogeny Website (http://www.mobot.org/MOBOT/research/APweb), covering 111 families. The composite terminal strategy allows for the supplementation of terminal unit sequences with sequences from the same genus, significantly increasing the genetic marker coverage for each genus. Our sampling approach offers several advantages: (i) extended sampling at the subfamilial level using genera as terminals enhances our capture of morphological variations; (ii) the composite taxon approach greatly improves phylogenetic analysis accuracy by reducing missing data105; (iii) this method does not compromise higher-level relationships, as it does not focus on the evolutionary history within genera.

For the sampled genera, we extensively searched GenBank for commonly analysed genetic regions across all three plant genomes, including plastid genes (atpB, matK, ndhF, rbcL, rpoC2, and rps4), mitochondrial genes (atp1, matR, nad5, and rps3), and nuclear ribosomal genes (18S and 26S). Accession numbers and associated taxonomic information for all genera included in the final dataset are provided in Supplementary Data 4. The 12-locus dataset was meticulously curated through iterative screening of individual loci and concatenated matrices to identify and address any rogue taxa, ensuring that the sequences of the composite terminals originated only from monophyletic genera. After manually inspecting initial trees, we concatenated the loci and used RAxML to infer a maximum likelihood (ML) tree. We then applied penalised likelihood, as implemented in the programme treePL106, to the topology from the RAxML analysis for molecular dating, utilising 56 fossil calibration points described in Li et al.52 and setting a smoothing value of 0.01. We further constrained our estimates by specifying dates (139.4 Ma and 209 Ma) for the crown age of angiosperms to explore potential effects52,107.

Morphological dataset

Pollen character scoring was based on multiple sources, including experimental observations, the PalDat database (https://www.paldat.org), and the palynological literature (Supplementary Data 5). All records were reassessed and recoded according to the coding strategies outlined here (Supplementary Table 2 and Supplementary Data 1).

To effectively characterise the myriad features of pollen, palynologists have developed a rich array of complex terminology. Punt et al.108 documented 588 pollen terms, recommending 349 for use—a striking figure reflecting the minute structural details of pollen. In paleopalynology, “pollen type” is commonly employed to describe pollen grains, proving valuable in identifying isolated, dispersed fossil pollen. However, this approach is less suited to our study. Phylogenetic comparative methods prioritise shared traits, whereas incorporating excessive features introduces excessive missing data and complicates the weighting of numerous morphological variables. In line with recommendations of Wortley et al.98 and Punt et al.108, we standardised terminology based on widely accepted glossaries (Supplementary Table 2). When original definitions were inconsistent or outdated (e.g., due to LM-only historical usage), we recoded character states based on contemporary interpretations supported by SEM images and recent palynological reviews. Descriptions of character states were extracted directly from published pollen micrographs, supplemented by taxonomic comments when necessary. This direct visual scoring approach avoids reliance on over-summarised database records or ambiguous trait labels.

Following Wortley et al.98, we prioritised pollen characters that are (i) taxonomically widespread, (ii) variable at the ordinal or intra-familial level, and (iii) observable under LM and/or SEM with minimal ambiguity. Ultimately, we selected 20 discrete pollen traits that reflect essential aspects of pollen variations. Complex characters were decomposed into more basic and consistently scorable components to minimise subjectivity. We acknowledge that categorisation of characters is inherently constrained by both technical and conceptual factors109. Trait observability depends not only on the maturity of the pollen grain but also on the mode of observation (e.g., LM vs SEM vs TEM), sample preparation (e.g., acetolysis, hydration state), and subjective interpretation98,109.

We used discrete characters, rather than continuous characters or geometric morphometric approaches, to capture pollen morphology, because this allowed us to incorporate the full range of morphological features, including aperture type and exine structure, in one coherent framework40. Classical and geometric morphometrics (e.g., length/width ratios, elliptic Fourier) are powerful within relatively homogeneous morphospaces but lose traction as disparity broadens across higher taxa40. Discrete characters, by contrast, allow comparison across the full breadth of pollen morphologies and capture biologically and phylogenetically informative features (aperture architecture, exine stratification, polarity) that are difficult to reduce to continuous metrics. The flexibility of discrete coding—despite its coarser granularity—remains advantageous for capturing the full extent of disparity in large-scale evolutionary analyses15,40.

To represent within-genus variation, we used polymorphic coding and retained all observed states without modification (e.g., 0&1&2). As standard in discrete-character matrices, our use of polymorphism excludes intraspecific variation attributable to ontogeny, geography, or plasticity110,111. Rare or ambiguous states were conservatively coded as missing. This approach avoids imposing unverifiable frequency assumptions (as in “democratic coding”) while preserving observed variation. Finally, the matrix explicitly distinguishes ‘not applicable’ from missing entries, providing a consistent basis for downstream morphospace and disparity analyses. We further segmented the data into three subsets: (1) shape (characters 1–6); (2) aperture (characters 7–13); and (3) exine (characters 14–20) (Supplementary Table 2).

Distance matrix and ordination

We followed the “preordination ancestral state estimation” (pre-OASE) approach proposed by Lloyd112. This method first reconstructs ancestral character states for the nodes of the phylogeny using maximum likelihood estimation, then calculates morphological distances to generate pairwise distance matrices, and finally ordinates the distances to produce a morphospace. Pairwise distances were computed with a metric that accommodates polymorphisms (maximum observable rescaled distance, MORD), treating differences conservatively (minimum difference) so that a polymorphic state is compared to the closest observed counterpart. To assess the robustness of the matrices under different distance metrics and character dependency values112,113, we evaluated nine distance matrices of pollen disparity using various metrics: raw Euclidean distance (RED), Wills’ generalised Euclidean distance (GED), Gower’s coefficient (GC), maximum observable rescaled distance (MORD), and alternative alpha values (0, 0.5, and 1), which represent the impact level of secondary characters on the distance114. The stability of these matrices was tested using a Mantel test with 99 permutations. After excluding the least correlated GED metrics, the pairwise correlation coefficient (r) ranged from 0.828 to 0.998. We ultimately chose the MORD distance metric with an alpha value of 0.5 as our main distance matrix, which is an extension of the GC and is effective in handling polymorphic trait states113,115. Given the high correlations among alternative metrics, our downstream results are expected to be robust to the choice of distance measure. All these functions were implemented in Claddis ver. 0.63113 in R116.
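The minimum-difference treatment of polymorphic states can be illustrated with a minimal Python sketch. It is not the Claddis implementation: it assumes that every character is unordered and equally weighted, it ignores the alpha weighting of secondary characters, and the cell tokens `?` and `-` for missing and inapplicable entries are hypothetical conventions chosen for the example.

```python
from typing import FrozenSet, List, Optional

# None stands for a missing or inapplicable cell (assumed tokens: '?' and '-').
State = Optional[FrozenSet[int]]


def parse_state(token: str) -> State:
    """Parse a matrix cell such as '1', '0&1&2', '?' or '-'."""
    if token in ("?", "-"):
        return None
    return frozenset(int(s) for s in token.split("&"))


def min_difference(a: FrozenSet[int], b: FrozenSet[int]) -> int:
    """Unordered character: 0 if the two cells share any state, otherwise 1."""
    return 0 if a & b else 1


def mord(taxon_a: List[State], taxon_b: List[State]) -> Optional[float]:
    """Observed differences divided by the maximum observable difference.

    Only characters scored in both taxa are compared. For unordered
    characters the maximum difference per character is 1, so the
    denominator is simply the number of comparable characters.
    """
    total = 0
    comparable = 0
    for a, b in zip(taxon_a, taxon_b):
        if a is None or b is None:
            continue
        total += min_difference(a, b)
        comparable += 1
    if comparable == 0:
        return None  # no shared information, distance undefined
    return total / comparable


genus_a = [parse_state(t) for t in ["0", "0&1", "2", "?"]]
genus_b = [parse_state(t) for t in ["0", "1", "1", "3"]]
distance = mord(genus_a, genus_b)
```

In this toy comparison three characters are comparable and only the third differs, so the distance is one third; the polymorphic cell `0&1` matches state `1` at no cost.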

We then applied two ordination methods to the distance matrices: non-metric multidimensional scaling (NMDS) using the vegan (v2.6) package’s metaMDS function117, and principal coordinate analysis (PCoA) using the ape (v5.6) package’s pcoa function118. For both the main pollen character dataset and its subsets, our NMDS ordinations with k = 2 yielded stress values below 0.2, effectively balancing the preservation of the variation structure in our data with dimensionality reduction for intuitive visualisation. We did not include the correction of principal coordinate axes (PCos) in our disparity pipeline, which is considered not to affect the stability of the results, but would result in a significant reduction in the variance explained by the first few PCos.

Characterising disparity

We used four key metrics related to size—Sum of Variances (SOV), Mean Pairwise Distance (MPD), Mean Distance to Centroid (MDC), and Sum of Ranges (SOR)—to measure and describe the within-clade morphological disparity within our dataset. SOV, as the primary focus, alongside MPD, provides insights into the overall extent of the explored area. MDC highlights how data points cluster around their central location, whereas SOR examines the breadth of the distribution within the morphospace. Given SOR’s sensitivity to sample size, we adjusted this metric using rarefaction prior to comparison. In our rarefaction process, n species were randomly selected from each group without replacement, where n equals the number of species in the smallest group minus one87. To assess the reliability of ordination-based metrics, we also compared MPD values derived from the raw distance matrix—recommended for discrete character data by Gerber119—with those from NMDS and PCoA ordinations.
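As an illustration of how these four metrics differ, the following Python sketch computes them for a set of points in an ordination space. It is a minimal sketch rather than the pipeline used here: Euclidean distances, sample variances (n − 1 denominator), and the helper name `rarefied` are assumptions made for the example.

```python
import math
import random
from itertools import combinations
from statistics import mean, variance


def sum_of_variances(points):
    """SOV: sum of the sample variance along each ordination axis."""
    return sum(variance(axis) for axis in zip(*points))


def sum_of_ranges(points):
    """SOR: sum of (max - min) along each ordination axis."""
    return sum(max(axis) - min(axis) for axis in zip(*points))


def mean_pairwise_distance(points):
    """MPD: mean Euclidean distance over all pairs of points."""
    return mean(math.dist(p, q) for p, q in combinations(points, 2))


def mean_distance_to_centroid(points):
    """MDC: mean Euclidean distance from each point to the centroid."""
    centroid = [mean(axis) for axis in zip(*points)]
    return mean(math.dist(p, centroid) for p in points)


def rarefied(metric, points, n, replicates=100, seed=0):
    """Average a metric over random subsamples of size n (no replacement)."""
    rng = random.Random(seed)
    return mean(metric(rng.sample(points, n)) for _ in range(replicates))


# Four taxa at the corners of a square in a two-axis morphospace.
square = [(0.0, 0.0), (2.0, 0.0), (0.0, 2.0), (2.0, 2.0)]
sov = sum_of_variances(square)
sor = sum_of_ranges(square)
mpd = mean_pairwise_distance(square)
mdc = mean_distance_to_centroid(square)
sor_rarefied = rarefied(sum_of_ranges, square, n=3)
```

Range-based metrics such as SOR can only grow as points are added, which is why a rarefied value is the fairer basis for comparing groups of unequal size.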

To facilitate comparisons of pollen disparity within angiosperms, we adopted two informal grouping schemes (see Supplementary Data 1 for details). We treated angiosperms as a single subclass (Magnoliidae) without using the superorder rank of Chase and Reveal120, as it combines a few extremely large superorders (e.g., Lilianae, Rosanae, Asteranae) with many small superorders, which may be less suitable for our comparative analyses.

G1. Angiosperms were divided into six major grades/clades: the ANA grade, magnoliids, Chloranthales, Ceratophyllales, monocots, and eudicots (Fig. 1). The ANA grade comprises the three earliest diverging angiosperm orders (Amborellales, Nymphaeales, and Austrobaileyales). The subsequent five groups represent the five major clades of mesangiosperms.

G2. G2 was established to further subdivide monocots and eudicots, thus retaining the ANA grade, magnoliids, Chloranthales, and Ceratophyllales. For monocots, we subdivided them into commelinids and other monocots. Within eudicots, we adopted the informal clade names superrosids and superasterids of APG IV104: superrosids include fabids, malvids, and other superrosids (Saxifragales and Vitales), whereas superasterids include campanulids, lamiids, and other superasterids (Berberidopsidales, Caryophyllales, Cornales, Ericales, and Santalales). The five early-diverging eudicot orders (Ranunculales, Proteales, Trochodendrales, Buxales, and Gunnerales), together with Dilleniales, were grouped as other eudicots. In total, G2 comprises thirteen groups: ANA grade (ANA), magnoliids (MAG), Chloranthales (CHLO), Ceratophyllales (CER), commelinids (COMM), other monocots (OthM), other eudicots (OthE), fabids (FAB), malvids (MAV), other superrosids (OtSR), campanulids (CAM), lamiids (LAM), and other superasterids (OtSA).

To decipher the significance of each axis in morphospace, we aligned ordination axes with original variables using pseudo-factor loadings. We employed K-means clustering to examine morphospace occupation and assessed the phylogenetic signal using Blomberg’s K121, verifying statistical significance through permutation tests with 1000 iterations. In addition, we performed a Mantel test between the pollen morphological distance matrix and the phylogenetic distance matrix (measured in millions of years) across all pairs of genus tips, in order to evaluate the extent to which evolutionary time contributes to divergence in pollen morphology. We also employed a subgroup omission approach to evaluate each subgroup’s contribution to the overall disparity58. Specifically, we recalculated the disparity after excluding a given subgroup and then compared this value to the total disparity. The absolute difference between these two values is proportional to the subgroup’s contribution.
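The logic of the Mantel test can be sketched in a few lines of Python. This is a generic permutation version given for illustration only, not the R implementation used in our analyses; the function names, the default number of permutations, and the one-sided P value are assumptions of the sketch.

```python
import random
from statistics import mean


def upper_triangle(matrix, order):
    """Upper-triangle entries after relabelling rows and columns by `order`."""
    n = len(order)
    return [matrix[order[i]][order[j]] for i in range(n) for j in range(i + 1, n)]


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def mantel(d1, d2, permutations=999, seed=0):
    """Return (r, one-sided P) for two square distance matrices.

    Taxon labels of the second matrix are shuffled jointly across rows
    and columns, so the dependence structure within each matrix is kept.
    """
    rng = random.Random(seed)
    identity = list(range(len(d1)))
    x = upper_triangle(d1, identity)
    observed = pearson(x, upper_triangle(d2, identity))
    hits = 0
    for _ in range(permutations):
        order = identity[:]
        rng.shuffle(order)
        if pearson(x, upper_triangle(d2, order)) >= observed:
            hits += 1
    # The observed value counts as one of the permutations.
    return observed, (hits + 1) / (permutations + 1)


# Toy matrices: morphological distance versus a rescaled phylogenetic distance.
morph = [[0, 1, 4, 5], [1, 0, 3, 6], [4, 3, 0, 2], [5, 6, 2, 0]]
phylo = [[2 * v for v in row] for row in morph]
r, p = mantel(morph, phylo, permutations=199)
```

Because the observed statistic is counted among the permutations, the smallest attainable P value is 1/(permutations + 1).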

Theoretical morphospace

Beyond empirical data, we introduced 10,000 theoretical morphotypes to illustrate possible combinations of angiosperm pollen forms based on the empirical pollen morphological matrix (Supplementary Data 1). These taxa were generated by randomly assembling observed states across the 20 discrete traits described in Supplementary Table 2, drawing on simulation strategies developed in earlier studies36,41. To ensure internal consistency and biological plausibility, we preserved the hierarchical structure of the dataset: if a primary character was coded as missing or absent, its dependent secondary characters were also coded as missing. Because polymorphic codings would inflate the number of theoretical combinations, we treated them as missing data in the simulations. The proportion of missing characters in our theoretical matrix was adjusted to match that of the empirical dataset, ensuring comparability between observed and theoretical morphotypes. The resulting simulated data were then integrated with the empirical data for distance matrix construction and subsequent ordination analyses.
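The random assembly of theoretical morphotypes can be sketched as follows. The sketch is illustrative and rests on stated assumptions: the state lists, the dependency map, the use of state 0 for "absent", and a single uniform missing-data rate are all hypothetical simplifications rather than the settings used for the 20 characters of Supplementary Table 2.

```python
import random


def simulate_morphotypes(observed_states, dependencies, n, missing_rate,
                         absent_state=0, seed=0):
    """Assemble n theoretical morphotypes from observed character states.

    observed_states: one list of observed states per character.
    dependencies: maps the index of a secondary character to the index of
        its primary character; list primaries before their dependents.
    missing_rate: probability that any cell is set to missing (None).
    """
    rng = random.Random(seed)
    morphotypes = []
    for _ in range(n):
        # Draw one observed state per character, independently.
        row = [rng.choice(states) for states in observed_states]
        # Introduce missing cells at the requested rate.
        row = [None if rng.random() < missing_rate else s for s in row]
        # Preserve the hierarchy: a secondary character is missing whenever
        # its primary character is missing or coded as absent.
        for secondary, primary in dependencies.items():
            if row[primary] is None or row[primary] == absent_state:
                row[secondary] = None
        morphotypes.append(row)
    return morphotypes


# Toy example: character 1 (e.g. an aperture detail) depends on character 0.
states = [[0, 1], [0, 1, 2], [0, 1]]
theoretical = simulate_morphotypes(states, {1: 0}, n=200, missing_rate=0.1)
```

Note that enforcing the hierarchy adds missing cells beyond the nominal rate, so matching the empirical proportion of missing data would require tuning `missing_rate` downwards.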

Morphological convergence

We quantified morphological convergence in angiosperm pollen using the C1 statistic of Stayton122, which measures the proportion of maximum ancestral divergence that has been closed by subsequent evolution between two lineages. We employed the MORD distance matrix (alpha value of 0.5) and calculated C1 for all 3,530,641 unique genus pairs (= (1880 tips – 1)²) in our angiosperm pollen dataset. Genus pairs with C1 ≥ 0.90 were considered highly convergent, and the lowest shared taxonomic rank was determined for these pairs to assess the taxonomic distance between convergent lineages112,123. It is important to note that a C1 value of 1.0 does not necessarily imply that genera are identical across all traits, but rather that their subsequent evolution has maximally reduced prior divergence since their most recent common ancestor. Notably, instead of adopting the superorder rank, we introduced two informal taxonomical ranks (G1 and G2) to capture the major group structure of angiosperms.
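The C1 statistic can be written as C1 = 1 − Dtip/Dmax, where Dtip is the distance between the two tips and Dmax is the largest distance reached between the two lineages since their common ancestor. The Python sketch below illustrates the calculation under simplifying assumptions: each lineage is given as a list of reconstructed nodes ending in the tip, Dmax is taken over every cross-lineage pair of nodes, and a one-dimensional trait with absolute difference stands in for the MORD distances.

```python
def c1(lineage_a, lineage_b, distance):
    """C1 = 1 - D_tip / D_max for two lineages.

    lineage_a, lineage_b: node values ordered from just after the most
        recent common ancestor to the tip (the tip is the last element).
    distance: function returning the morphological distance of two nodes.
    """
    d_tip = distance(lineage_a[-1], lineage_b[-1])
    d_max = max(distance(a, b) for a in lineage_a for b in lineage_b)
    if d_max == 0:
        return 0.0  # the lineages never diverged, nothing to close
    return 1.0 - d_tip / d_max


def abs_diff(x, y):
    """Stand-in distance for a single continuous trait."""
    return abs(x - y)


# Lineages diverge to -3 and +3, then partly return towards each other.
partial = c1([-1.0, -3.0, -1.0], [1.0, 3.0, 1.0], abs_diff)
# Lineages diverge and then reach the same tip value.
complete = c1([-1.0, -3.0, 0.0], [1.0, 3.0, 0.0], abs_diff)
```

In the first case two thirds of the maximum divergence has been closed; in the second the tips coincide and C1 equals 1.0, although the nodes along the two lineages differ.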

Rate test of character evolution

We analysed variations in rates of character evolution across the phylogenetic tree by conducting discrete-character rate tests using the likelihood-ratio test method, as outlined by Lloyd et al.124. This method involves probability models that consider branch lengths and the completeness of character data to calculate detailed rates of character change (λ). To enhance the statistical reliability of our results, we employed the Benjamini and Hochberg false-discovery rate method with a stringent significance threshold of 0.01. This step helps manage multiple comparisons, bolstering our confidence in the findings. These analyses were carried out for all 20 characters and three character subsets, consistently using Claddis ver. 0.63113.
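The Benjamini and Hochberg step-up procedure can be sketched as follows. This is a generic Python illustration with the 0.01 threshold as its default, not the code used in our rate tests; the function name is hypothetical.

```python
def benjamini_hochberg(p_values, alpha=0.01):
    """Return a list of booleans marking which tests are rejected.

    P values are ranked in ascending order. The largest rank k with
    p_(k) <= alpha * k / m is found, and all tests ranked k or lower
    are rejected, where m is the number of tests.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    cutoff_rank = 0
    for rank, index in enumerate(order, start=1):
        if p_values[index] <= alpha * rank / m:
            cutoff_rank = rank
    rejected = [False] * m
    for rank, index in enumerate(order, start=1):
        if rank <= cutoff_rank:
            rejected[index] = True
    return rejected


# Four branch-level tests: thresholds are 0.0025, 0.005, 0.0075 and 0.01.
flags = benjamini_hochberg([0.001, 0.004, 0.02, 0.5])
# Step-up behaviour: the smaller P value fails its own threshold (0.005)
# but is still rejected because the larger one passes at rank 2 (0.01).
step_up = benjamini_hochberg([0.006, 0.009])
```

The second example shows why the procedure is called step-up: rejection is decided by the largest rank that passes, not test by test.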

Disparity through time

We employed the ‘gradual splits’ model developed by Guillerme and Cooper125 for time slicing, which probabilistically selects ancestors or descendants for each bootstrap replicate, based on their relative distances to designated time slices. This approach helps sample data from NMDS and PCoA ordinations at various points in the evolutionary timeline of angiosperms. Disparity was calculated at intervals of five million years using 1000 bootstrapped replicates to define the disparity estimates and associated confidence intervals. The analysis was conducted under three scenarios representing different dating estimates for the crown age of flowering plants: Cretaceous (139 Ma), Jurassic (187 Ma), and Triassic (248 Ma) crowns. In addition, to mitigate potential sampling biases, we performed 100 random subsamples (n = 1000) of the morphological matrix for DTT analysis, focusing only on monocots and eudicots due to their predominance in the dataset. The DTT results under different rarefaction levels were also examined.
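The combination of probabilistic time slicing and bootstrapping can be sketched in Python. This is a simplified illustration, not the implementation used here: it assumes that the probability of sampling the descendant grows linearly with the position of the slice along the branch, it uses sum of variances as the disparity metric, and the branch tuples and function names are hypothetical.

```python
import random
from statistics import median, variance


def sample_slice(branches, slice_age, rng):
    """Pick the ancestor or descendant of every branch crossing slice_age.

    branches: tuples (ancestor_age, descendant_age, ancestor_point,
        descendant_point), with ages in Ma (ancestor_age >= descendant_age).
    """
    sampled = []
    for anc_age, desc_age, anc_point, desc_point in branches:
        if not desc_age <= slice_age <= anc_age:
            continue
        length = anc_age - desc_age
        # Assumed rule: the closer the slice is to the descendant,
        # the more likely the descendant is chosen.
        p_desc = 1.0 if length == 0 else (anc_age - slice_age) / length
        sampled.append(desc_point if rng.random() < p_desc else anc_point)
    return sampled


def sum_of_variances(points):
    return sum(variance(axis) for axis in zip(*points))


def bootstrap_disparity(branches, slice_age, replicates=1000, seed=0):
    """Median and 95% interval of disparity at one time slice.

    Needs at least two branches crossing the slice.
    """
    rng = random.Random(seed)
    values = []
    for _ in range(replicates):
        points = sample_slice(branches, slice_age, rng)
        resampled = [rng.choice(points) for _ in points]  # with replacement
        values.append(sum_of_variances(resampled))
    values.sort()
    low = values[round(0.025 * (replicates - 1))]
    high = values[round(0.975 * (replicates - 1))]
    return median(values), low, high


# Three toy branches crossing a slice at 100 Ma in a two-axis morphospace.
toy = [
    (120.0, 90.0, (0.0, 0.0), (1.0, 0.5)),
    (110.0, 80.0, (0.2, 0.1), (-1.0, 1.5)),
    (130.0, 95.0, (0.1, -0.2), (0.5, -2.0)),
]
mid, low, high = bootstrap_disparity(toy, slice_age=100.0, replicates=200)
```

Repeating the call for slices spaced five million years apart would give one disparity estimate and interval per time bin.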

Correlations

We compiled species diversity information for all genera included in our pollen disparity dataset from the Catalogue of Life126. Genome size information was obtained from the DNA C-values database127, using the average value per genus, yielding 770 genera of angiosperms representing 57 orders and 255 families (Supplementary Data 1). Pollination types were obtained from Stephens et al.128, with 758 genera overlapping with our dataset (Supplementary Data 1). After excluding genera with polymorphic states (treated as missing data), the final coding comprised 708 genera (62 orders, 403 families) for the abiotic versus biotic categories and 659 genera (60 orders, 386 families) for the wind, water, insect, and vertebrate categories. Although the coding scheme used by Stephens et al.128 was primarily constructed at the species level, the exclusion of polymorphic states in our analyses minimises potential biases, and our focus remained exclusively on extant genera.

We obtained pairwise distance matrices for the main pollen disparity dataset as well as for the subsets pertaining to each taxon pair. We also calculated pairwise differences in taxonomic diversity and genome size for every taxon pair. We then evaluated the correlation between pollen disparity and each of these variables using phylogenetic Mantel tests as implemented in the ‘EvolQG’ R package129. We additionally calculated lineages through time (LTT) to estimate the temporal patterns of genus richness depicted in the phylogeny using the ‘ape’ R package (Supplementary Fig. 16). For pollination type, we used Pearson’s correlation coefficient (r) to assess the correlation between the ordination axes of pollen disparity and the categorical pollination coding, and we conducted a PERMANOVA test to compare the morphospace occupied by different pollination modes117.
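The logic of a matrix correlation test can be sketched with an ordinary Mantel test, shown below. This is a simplified stand-in: the phylogenetic Mantel test in ‘EvolQG’ additionally takes the tree into account when generating the null distribution, which this sketch does not attempt. The function names and toy matrices are assumptions, and both matrices are assumed to be symmetric with non-constant off-diagonal entries.

```python
import math
import random


def _upper_triangle(matrix, order):
    """Off-diagonal entries after relabelling rows and columns by `order`."""
    n = len(order)
    return [matrix[order[i]][order[j]] for i in range(n) for j in range(i + 1, n)]


def _pearson(x, y):
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def mantel_test(a, b, n_perm=999, seed=0):
    """Return (observed r, one-sided permutation p-value)."""
    rng = random.Random(seed)
    identity = list(range(len(a)))
    y = _upper_triangle(b, identity)
    r_obs = _pearson(_upper_triangle(a, identity), y)
    hits = 0
    for _ in range(n_perm):
        perm = identity[:]
        rng.shuffle(perm)  # permute taxa labels of one matrix only
        if _pearson(_upper_triangle(a, perm), y) >= r_obs:
            hits += 1
    return r_obs, (hits + 1) / (n_perm + 1)


# Toy data: morphological distances and absolute differences in a
# second variable that is exactly proportional to them.
coords = [0.0, 1.0, 3.0, 7.0, 12.0]
pollen_dist = [[abs(p - q) for q in coords] for p in coords]
trait_diff = [[2.0 * d for d in row] for row in pollen_dist]
r_value, p_value = mantel_test(pollen_dist, trait_diff)
```

Permuting rows and columns together preserves the dependence among entries that share a taxon, which is why matrix correlations are tested by permutation rather than with a standard significance test for r.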

Reporting summary

Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.