Abstract
Developing robust phylogenies and comprehensive taxonomies for big plant genera is crucial for unlocking plant-derived solutions to global sustainability challenges. Artemisia, a big genus comprising ~500 species, holds immense medicinal and ecological importance. Despite decades of effort, establishing a comprehensive phylogeny and taxonomy for global Artemisia has remained a formidable challenge. Here, we reconstruct the most comprehensive phylogeny of global Artemisia to date (394 species) using a gigamatrix approach. We also analyze evolutionary patterns of 20 morphological characters of Artemisia worldwide to evaluate their taxonomic utility. Based on these findings, we propose a global taxonomy for Artemisia, recognizing 24 sections in 8 subgenera, and placing 99.4% of accepted species (502/505). This study provides a robust framework to advance understanding of the evolution and ecology of Artemisia, and to promote the sustainable utilization of its rich resources. It also offers an exemplary case for taxonomic research on big genera in the genomic era.
Introduction
Big plant genera, often defined by a threshold of ≥ 500 species1,2, account for ca. 25% of flowering plant species2. Despite their critical role in shaping plant diversity, global-scale phylogenetic and taxonomic research on them lags significantly behind that on other genera, owing to their inherent complexity1,2. Their phylogeny and taxonomy are uniquely challenging: beyond the sheer number of species, they exhibit extensive (often intercontinental) distributions and frequent rapid radiations (e.g., Astragalus3), which collectively complicate phylogenetic resolution and taxonomic completeness1,2,4. The necessary studies have long been deemed too daunting, so global research on big genera is lacking and most existing studies are regional (e.g., Myrcia5, Cyperus6). In the past decades, new methods, technologies, and international collaboration have helped to address these challenges, as evidenced by progress on Solanum7 and Begonia8, though the process remains time-intensive. Some big genera are primarily distributed in tropical regions (e.g., Ipomoea4, Myrcia5), whereas others are in temperate regions (e.g., Centaurea2). While significant progress has been made in resolving a few tropical big genera (e.g., Ipomoea4), research on temperate ones remains scarce yet presents distinct opportunities. Originating predominantly during mid-Cenozoic geo-climatic upheavals, such as global cooling9, grassland expansion10, and desertification11, temperate genera often exhibit accelerated diversification rates12 and the acquisition of novel morphological or physiological adaptations to new niches (e.g., cold and/or aridity)12,13. These Cenozoic-forged evolutionary innovations position the temperate big genera as unparalleled systems for studying adaptive evolution under environmental upheavals. Their distinct metabolites also reveal untapped potential for bioprospecting15: artemisinin from Artemisia annua14, for example, has replaced tropical Cinchona-derived quinine as the dominant antimalarial in the past decades. Consequently, completing the phylogeny and taxonomy of temperate big genera becomes imperative for both evolutionary theory and resource utilization.
Some pioneering attempts have been made to resolve the phylogenies of species-rich taxa using Sanger sequencing data16. However, resolution in these phylogenies is restricted by the limited evolutionary signal in such data. Recent advances in high-throughput DNA sequencing have facilitated the resolution of complex phylogenetic relationships by utilizing extensive evolutionary signals across entire genomes17. Balancing taxonomic coverage against genomic depth to obtain a comprehensive and robust phylogeny remains a major challenge. The construction of a gigamatrix18,19, which integrates high-throughput DNA sequencing data (some species and many sequences) with Sanger sequencing data (numerous species and few sequences), offers a promising strategy to address this gap20.
Artemisia, commonly known as wormwood, mugwort, and sagebrush, is a big genus in the family Asteraceae. It comprises over 500 species that are predominantly distributed in the northern temperate regions21,22,23,24. These species are important in medicine, phytochemistry, and ecological restoration24 (Fig. 1). As the most famous example, the discovery of artemisinin, a sesquiterpene lactone first extracted from A. annua14 (Fig. 1g) and the most effective antimalarial drug to date25, was recognized with the Nobel Prize in 2015. This milestone triggered a research boom on the chemistry of Artemisia species, with over one thousand papers published annually between 2015 and 2019 (Supplementary Table 1). Recent studies have expanded its therapeutic potential to tuberculosis26, polycystic ovarian syndrome27, and allergen-specific immunotherapy via pollen-derived proteins28. Ecologically, species like A. ordosica and A. halodendron play critical roles in desertification control across Asia29 (Fig. 1e). A high-resolution global phylogenetic framework and a comprehensive infrageneric taxonomy are crucial for understanding and leveraging this scientifically and economically important genus. However, a robust global phylogeny and a complete taxonomic hypothesis for each species of Artemisia remain missing, even after decades of effort (see Supplementary Data 1).
a–d Morphology of Artemisia verlotiorum: a Plants; b Synflorescence; c Leaf; d Capitulum, floret and achene, scale bar = 1 mm. e–h Economically significant species: e Artemisia oxycephala (ecological restoration); f A. absinthium (medicinal); g A. annua (medicinal); h A. chinensis (ornamental). i Global sampling localities, overlaid on species richness (Global Compositae Checklist91; WGSRPD Level 4 ecoregions92); darker shades of green indicate regions with more species. Sampling point coordinates are listed in Supplementary Data 2.
The existing infrageneric taxonomy of Artemisia relies predominantly on the morphology of the capitula and leaves and on life form21,22,23,30. These morphological characters, however, are prone to repeated evolution, often leading to conflicts with molecular phylogenies or to unreasonable taxonomic treatments30,31. For example, evidence from cytology and molecular phylogenies has revealed that the homogamous-discoid capitulum evolved independently31,32. Yet all species bearing this state were once classified into a single genus, Seriphidium, based solely on their similar capitulum morphology33. Nevertheless, morphological characters remain crucial for taxonomy, identification and evolutionary studies, as they show fascinating diversity and are easy to observe and use34,35,36. Morphological variation in numerous taxa, however, remains far from comprehensively studied34,36, highlighting a major gap in our understanding of their evolutionary trajectories and taxonomic utility. Therefore, it is essential to critically evaluate the morphological characters already employed, actively develop new ones, and identify reliable diagnostic characters for each recognized lineage. Regrettably, this has not yet been done for Artemisia on a global scale. Additionally, while some taxonomists37,38 acknowledged the potential of micromorphological characters in Artemisia infrageneric taxonomy, comprehensive studies on them have been scarce38. A critical challenge lies in the tiny size of Artemisia capitula (mostly < 5 mm in diameter), compounded by the abundance of minute florets within them (often > 20 per capitulum) and their microscopic internal structures (e.g., anthers)21,22,23.
In this work, we reconstruct a global phylogeny for Artemisia covering 78% of its species (394 out of 505) using a gigamatrix approach that integrates 202 low-copy nuclear genes and two ribosomal DNA markers. We also reconstruct phylogenomic frameworks for Artemisia from low-copy nuclear genes and from complete plastomes, each covering 59% of its species, followed by a comparative analysis to clarify the pattern of cytonuclear discordance in Artemisia. Furthermore, we analyze 20 morphological characters that are traditionally or potentially significant for its infrageneric taxonomy, using herbarium specimens and our field collections across its main distribution ranges (Fig. 1), and infer their evolutionary trajectories. Based on these findings, we propose a global taxonomy for Artemisia, including 8 subgenera, 24 sections, and 502 out of the 505 accepted species with taxonomic placements. This study provides a comprehensive framework for advancing our understanding of the evolution, ecology, and sustainable use of Artemisia, and introduces a case study on how to tackle taxonomic challenges in big genera in the genomic era.
Results and discussion
An expanded phylogenetic framework incorporating 78% of Artemisia species
We obtained genome-skimming data for 314 species of Artemisia and its allies (298 Artemisia species), including 96 newly sequenced Artemisia species not sampled in Jiao et al.31 (Supplementary Data 2). From these data, we assembled three nuclear datasets: 1) 202 nuclear low-copy genes (NLC dataset); 2) the internal transcribed spacer (ITS) and external transcribed spacer (ETS) regions of nuclear ribosomal DNA (NR dataset); and 3) the concatenated gigamatrix (GM dataset) comprising the NLC dataset and the NR dataset. The final NLC dataset spans 226,916 bp and comprises 314 species (298 ingroups + 16 outgroups), the NR dataset is 916 bp long and covers 414 species (394 ingroups + 20 outgroups), and the GM dataset totals 227,832 bp and covers 414 species (394 ingroups + 20 outgroups). Of the 414 species, 76% (314/414) had both NLC and NR data, while 24% (100/414) were represented solely by NR data. The substantial disparity in data volume between the NLC and NR datasets precluded multi-species coalescent analysis (ASTRAL) of the gigamatrix, owing to computational infeasibility and statistical limitations19. To ensure methodological consistency in phylogenetic analysis, we present concatenated maximum likelihood (ML) trees throughout the main text.
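The gigamatrix described above can be illustrated with a minimal sketch. It assumes that each dataset is held as a dictionary mapping a taxon name to its aligned sequence, and that taxa lacking one data type are padded with a missing-data symbol. The function name, the `?` symbol, and the toy sequences are hypothetical and are not taken from the study's pipeline; only the alignment lengths and species counts in the final two lines come from the text.

```python
def build_gigamatrix(nlc, nr, missing="?"):
    """Concatenate two alignments (dicts: taxon -> aligned sequence).

    Taxa absent from one alignment are padded with the missing-data symbol,
    so every row of the result has the same length.
    """
    nlc_len = len(next(iter(nlc.values())))
    nr_len = len(next(iter(nr.values())))
    # Each input must itself be a proper alignment (equal row lengths).
    assert all(len(seq) == nlc_len for seq in nlc.values())
    assert all(len(seq) == nr_len for seq in nr.values())

    gigamatrix = {}
    for taxon in sorted(set(nlc) | set(nr)):
        gigamatrix[taxon] = (nlc.get(taxon, missing * nlc_len)
                             + nr.get(taxon, missing * nr_len))
    return gigamatrix


# Hypothetical toy data: sp_C has ribosomal (NR) data only.
toy_nlc = {"sp_A": "ACGTAC", "sp_B": "ACGTTC"}
toy_nr = {"sp_A": "GG", "sp_B": "GA", "sp_C": "GG"}
toy_gm = build_gigamatrix(toy_nlc, toy_nr)

# Bookkeeping with the sizes reported in the text.
gm_length = 226_916 + 916   # NLC + NR alignment lengths (bp)
gm_species = 314 + 100      # species with NLC and NR data + NR-only species
```

In this toy case the NR-only taxon receives a block of missing data for the nuclear low-copy partition, which mirrors how the 100 NR-only species are represented in a matrix dominated by the 202-gene partition.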
Using maximum likelihood (ML) analysis of the GM dataset, we reconstructed a fully resolved phylogeny of Artemisia (Fig. 2 and Supplementary Fig. 1). Eight clade nodes and 24 subclade nodes showed strong bootstrap support (BS > 95%), and this tree was completely concordant with the NLC-derived ML tree (Fig. 2, Supplementary Figs. 1 and 2). A topological comparison of all the nuclear phylogenetic trees is shown in Supplementary Fig. 3. Compared with the latest phylogeny (ML tree from the SNP dataset; Supplementary Fig. 3a)31, the ML trees from the GM and NLC datasets increased the species coverage of the Artemisia phylogeny from 41% (205/505 species) to 78% (394/505 species, GM dataset; Supplementary Figs. 1 and 3e) and to 59% (298/505 species, NLC dataset; Supplementary Figs. 2 and 3c), respectively. These robust phylogenies enabled the delimitation of 24 strongly supported subclades in 8 clades (Fig. 2 and Supplementary Fig. 3), corresponding to 24 sections in 8 subgenera, respectively (detailed in the taxonomy section below and Supplementary Note 1). Our framework resolved evolutionary relationships across 78% of Artemisia species with high resolution. Among the 24 subclades, we newly identified 10, revised 11, and confirmed 3 previously established subclades (all BS > 95%). All eight clades showed strong support (BS > 95%), consistent with prior work31. A comparison of phylogenies between this study and the previous work31 is summarized in Supplementary Fig. 3. Because this study involves taxonomic treatment, and phylogeny is a crucial foundation for taxonomy, the terms “subgenera” and “sections” are used in the main text to denote the clades and subclades of Artemisia, respectively, as identified by the GM-derived and NLC-derived ML trees (Fig. 2 and Supplementary Fig. 2).
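The coverage figures quoted above follow directly from the species counts. As a quick check, the short sketch below recomputes them against the 505 accepted species; the dictionary keys and the function name are illustrative labels only.

```python
TOTAL_ACCEPTED = 505  # accepted Artemisia species, as stated in the text

# Artemisia species sampled in each phylogeny, as stated in the text.
sampled = {"previous SNP tree": 205, "NLC": 298, "GM": 394}


def coverage_percent(n_sampled, total=TOTAL_ACCEPTED):
    """Species coverage as a percentage, rounded to the nearest integer."""
    return round(100 * n_sampled / total)


coverage = {name: coverage_percent(n) for name, n in sampled.items()}
```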
Concatenated maximum likelihood (ML) tree of 414 species (394 Artemisia + 20 allied species) based on the gigamatrix (GM) dataset. Branches with bootstrap support ≥ 75% are in bold. Colored rings indicate subgenera (inner) and sections (outer); branch colors correspond to subgenera. Outer images depict floret morphology for each section in Artemisia and allied species. The full phylogeny with branch length and detailed support values is provided in Supplementary Fig. 1. Scale bar = 2 mm.
Some species historically classified within Artemisia subg. Artemisia on the basis of a few morphological characters had unresolved phylogenetic positions in the previous phylogeny, which was based solely on the NR dataset23,39 (Supplementary Fig. 4). Our analyses of the GM dataset robustly resolved these species within four distinct subclades (Auratae, Salsoloides, Norvegica, and Laciniatae) of A. subg. Dracunculus (Fig. 2, and Supplementary Figs. 1 and 3), a placement further corroborated by the morphological characters they share with these subclades (Supplementary Note 1). For example, A. phaeolepis, a species previously placed in A. subg. Artemisia23 and unresolved in the ML tree from the NR dataset (Supplementary Fig. 4), is now confidently assigned to the Laciniatae subclade of A. subg. Dracunculus (Fig. 2; Supplementary Note 1). A. magellanica, a Patagonian species previously assigned to A. subg. Artemisia40, was resolved within A. subg. Pectinatae in the ML trees of the NLC and GM datasets (Fig. 2, and Supplementary Figs. 1 and 2), and its morphology is congruent with that of other members of A. subg. Pectinatae (Supplementary Note 1).
Regarding the circumscription of Artemisia, our results supported the inclusion of eight small genera (Crossostephium, Filifolium, Kaschgaria, Mausolea, Neopallasia, Picrothamnus, Sphaeromeria and Turaniphytum) whose placement had previously been discussed (BS = 100%; Fig. 2, and Supplementary Fig. 1; see Supplementary Data 1 and 2). Our expanded sampling further revealed that Ajaniopsis, a genus endemic to the Tibet Plateau, was nested in Artemisia with strong support41 (BS = 100%; Supplementary Fig. 1). Furthermore, we classified it into A. subg. Absinthium sect. Argyrophyllae based on the ML trees of the NLC and GM datasets (Fig. 2, and Supplementary Figs. 1 and 2). Its unique morphological characters (apically pilose florets, 5- or 6-ribbed achenes, and corymbous synflorescences) may be the result of convergent evolution, as these characters also occur in other species of Artemisia that are both phylogenetically and geographically distant41, such as A. albicans (subg. Tridentatae, western North America) and A. glacialis (subg. Absinthium, European Alps) (Fig. 2 and Supplementary Fig. 1).
The placement of Artemisiella is the only remaining problem for the circumscription of Artemisia31. The ML trees derived from both the GM and NLC datasets strongly supported a sister relationship between Artemisia and the clade consisting of Artemisiella and Phaeostigma (BS = 100%; Supplementary Fig. 3c and e). In contrast, the ASTRAL tree based on the NLC dataset nested Artemisiella within Artemisia, yet with weak support (local posterior probability, LPP = 0.61; Supplementary Figs. 3b and 5). Our alternative hypothesis tests [Shimodaira-Hasegawa (SH) test42, Kishino-Hasegawa (KH) test43, and approximately unbiased (AU) test44] based on the NLC dataset did not reject the hypothesis of its inclusion in Artemisia (Supplementary Table 2). Given that Artemisiella can be clearly distinguished from Artemisia by morphological characters (e.g., leaves oblong, 3-pinnatisect, with 8–13 pairs of lateral lobes)45, we tentatively treated it as a separate genus.
Cytonuclear discordance and possible hybridization within Artemisia
We assembled complete plastomes for 314 species (298 Artemisia + 16 allied species, plastome dataset) from our genome-skimming data to assess cytonuclear discordance and possible hybridization in Artemisia. This 113,792 bp dataset (80 coding regions, inverted repeats excluded) yielded a robust phylogeny (78% of nodes with BS > 95%; Supplementary Fig. 6), revealing extensive cytonuclear discordance in Artemisia (Fig. 3a; full topologies in Supplementary Figs. 2, 4, and 6–8). The plastome phylogeny showed that Artemisia was not monophyletic, with the Chrysanthemum-Ajania-Elachanthemum clade and the Artemisiella clade nested within it (BS = 82% and 96%; Fig. 3a and Supplementary Fig. 6). Moreover, seven of the eight clades (subgenera) were polyphyletic, the exception being A. subg. Pacifica (Fig. 3a), which has only four species. This cytonuclear discordance suggested the possibility of rapid evolutionary radiations or hybridization within specific lineages of Artemisia. To identify potential causes, we quantified genealogical concordance46, conducted polytomy tests47, and performed phylogenetic network analyses48 based on the NLC dataset (see “Methods”). We measured phylogenomic genealogical discordance using gene concordance factors (gCF) derived from the NLC dataset (Supplementary Fig. 2). The gCF of a node is the percentage of gene trees that agree with that node; low gCF values (< 5%) may arise from insufficient information (such as short branches) or from genuine conflicting signals46. The ML tree of the NLC dataset exhibited high bootstrap support (mean value = 89.50%; Supplementary Fig. 2) and low gCF (mean value = 3.66%; Supplementary Fig. 2), implying conflicting signals among genes. The polytomy tests retained the null hypothesis of zero branch lengths (indicating unresolved relationships or polytomies rather than bifurcating nodes) for all inter-subgeneric nodes (p > 0.1; Supplementary Fig. 5), except the divergence node of A. subg. Pacifica (p = 0; Supplementary Fig. 5). Meanwhile, 79% of the nodes within subclades (sections) and among species could not rule out the polytomy hypothesis (p > 0.1; Supplementary Fig. 5). These results extended beyond the known radiated subgenera (A. subg. Dracunculus, 81% of nodes p > 0.1; A. subg. Seriphidium, 87% of nodes p > 0.1)39 to the third-biggest subgenus, A. subg. Artemisia (91% of nodes p > 0.1). By applying phylogenomic data, we preliminarily ruled out the possibility of unresolved relationships in polytomy tests, and inferred that the evolutionary history of Artemisia might have been persistently accompanied by incomplete lineage sorting (ILS) resulting from rapid diversification, which could be a cause of low gCF. Rapid diversification has thereby complicated phylogenetic resolution at the subgeneric and species levels39,49,50, necessitating phylogenomic data for robust inference of subgeneric and species-level relationships in big genera.
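To make the gene concordance factor concrete, the following is a simplified sketch rather than the procedure used in the study. It assumes that every gene tree is rooted, contains all taxa (so every tree is decisive for the branch), and is written as nested tuples of tip names. Published gCF implementations work on unrooted splits and handle missing taxa, which this toy version does not. The function names and the four toy gene trees are hypothetical.

```python
def clades(tree):
    """Return all clades (frozensets of tip names) in a nested-tuple tree."""
    found = set()

    def walk(node):
        if isinstance(node, str):          # a tip
            return frozenset([node])
        tips = frozenset().union(*(walk(child) for child in node))
        found.add(tips)                    # an internal node defines a clade
        return tips

    walk(tree)
    return found


def gcf(target_clade, gene_trees):
    """Percentage of gene trees that contain the target clade."""
    target = frozenset(target_clade)
    hits = sum(target in clades(tree) for tree in gene_trees)
    return 100.0 * hits / len(gene_trees)


# Hypothetical gene trees for four taxa; three of the four group A with B.
toy_gene_trees = [
    (("A", "B"), ("C", "D")),
    (("A", "C"), ("B", "D")),
    (("A", "B"), ("C", "D")),
    ((("A", "B"), "C"), "D"),
]
gcf_value = gcf({"A", "B"}, toy_gene_trees)
```

A branch can therefore have high bootstrap support in a concatenated analysis while only a small fraction of individual gene trees recover it, which is the pattern reported above (mean BS = 89.50%, mean gCF = 3.66%).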
a Tanglegram comparing maximum likelihood (ML) phylogenies from 202 low-copy nuclear genes (NLC dataset, left) and plastome dataset (right), with connecting lines matching sections (color-coded by subgenus). The full phylogeny with branch length and detailed support values is provided in Supplementary Figs. 2 and 6. b Split network based on uncorrected p-distances NLC dataset. c, d Optimal phylogenetic networks reconstructed from two different 22-taxa NLC datasets with maximum number of reticulations (hmax) = 4. Curved branches denote hybrid nodes annotated with inheritance probabilities; branch colors correspond to subgenera.
Hybridization within Artemisia has long been hypothesized51,52,53, yet has lacked rigorous global-scale testing. Given the computational limitations of PhyloNet v.3.6.9, we analyzed two simplified 22-species datasets (20 Artemisia species representing all subgenera + 2 outgroups including Artemisiella; Fig. 3c and d). The results revealed widespread hybridization during the early evolution of Artemisia (Fig. 3c, d, and Supplementary Figs. 9 and 10), and plausibly explained the close phylogenetic relationships observed between species from different subgenera in the plastome tree (Fig. 3a and Supplementary Fig. 6). For example, A. subg. Dracunculus exhibited obvious reticulate evolution in the split network, especially with its close relatives A. subg. Pectinatae and A. subg. Absinthium (Fig. 3b). The plastome phylogeny showed that two species, A. hedinii (A. subg. Pectinatae) and A. blepharolepis (A. subg. Absinthium), were embedded within A. subg. Dracunculus (BS = 100%; Fig. 3a). PhyloNet analysis also indicated that members of A. subg. Dracunculus were involved in the hybrid origin of members of A. subg. Pectinatae (γ = 0.3; Fig. 3c), and that its ancestral lineages participated in the hybrid origin of A. blepharolepis (γ = 0.49; Fig. 3c). Morphologically, A. blepharolepis can be distinguished from other species of A. subg. Absinthium by having 5–8 leaf segment pairs (vs. < 4 pairs in other species of A. subg. Absinthium)23. Interestingly, this character closely resembled that of members of the Laciniatae clade (A. subg. Dracunculus), its sister in the plastome tree (Fig. 4). Additionally, A. pectinata (A. subg. Pectinatae) has been observed to share sterile disk florets with some lineages of A. subg. Dracunculus23, providing additional evidence for their potential hybrid origin.
Schematic maximum likelihood (ML) tree topology based on the gigamatrix (GM) dataset shown on the left, with branch colors indicating subgenera; selected macromorphological characters (e.g., leaf type) and their states depicted on the right and bottom. Detailed character state definitions are provided in Supplementary Data 3.
Chloroplast capture likely contributed to cytonuclear discordance in Artemisia. Species of A. subg. Pectinatae were scattered across the plastome tree, with five sampled species nested within five distinct subgenera (Fig. 3a). Conversely, nuclear phylogeny strongly supported their monophyly (BS = 100%; Fig. 3a), despite minimal gene tree concordance (gCF = 0.99%; Supplementary Fig. 2). This conflict—nuclear cohesion versus chloroplast dispersion—aligned with chloroplast capture dynamics54. These five species and their plastome-implied relatives showed sympatric/parapatric distributions (Supplementary Figs. 6 and 11), providing spatial evidence for the possible chloroplast capture.
In summary, our results show that hybridization and chloroplast capture have occurred within specific Artemisia lineages. These events lead to cytonuclear discordance, highlighting the limitations of relying solely on maternally inherited plastome data to reconstruct its evolutionary history. In particular, chloroplast capture events can mislead phylogenetic inference, further complicating the interpretation of organismal relationships55. Consequently, the current taxonomy of Artemisia relies primarily on nuclear-based phylogenies, which provide a more robust framework for resolving relationships56. Nevertheless, we recognize that the plastome remains an essential component of plant genomes. Comparative phylogenetic analyses that incorporate plastome data can yield valuable insights into the evolutionary history of Artemisia, and thus complement the nuclear-based phylogenies57.
Evolutionary trajectories of morphological characters
In total, we studied 20 morphological characters (13 macro- and 7 micromorphological) that can potentially be used for Artemisia infrageneric taxonomy by reconstructing their ancestral states (Figs. 4, 5, and Supplementary Figs. 12–14) and testing their phylogenetic signals (see details in “Methods”; Supplementary Data 3–5).
Schematic maximum likelihood (ML) tree topology based on the gigamatrix (GM) dataset shown on the left, with branch colors indicating subgenera; seven micromorphological characters and their states depicted on the right and bottom. Detailed character state definitions are provided in Supplementary Data 5.
Thirteen macromorphological characters, including life form, plant height, synflorescence, capitulum, and leaf morphology (Supplementary Tables 3 and 4), were analyzed based on the ML tree derived from the GM dataset (Fig. 4 and Supplementary Fig. 12). Four of them (capitulum type, leaf type, leaf segment width and leaf area) exhibited strong (λ ≥ 0.9 or K ≥ 1.5) or moderate (0.9 > λ ≥ 0.7 or 1.5 > K ≥ 1) phylogenetic signals (Fig. 4 and Supplementary Fig. 12, Supplementary Data 3), and thus could be useful in delineating subgenera and sections within Artemisia. The other characters (mostly vegetative) showed weak (λ < 0.7 or K < 1) phylogenetic signals (Supplementary Data 3) and are therefore not ideal for infrageneric taxonomy.
Notably, aside from capitulum type, which was previously employed in infrageneric taxonomy21,22,23 (λ = 0.93, K = 1.99; Fig. 4 and Supplementary Fig. 12, Supplementary Data 1 and 3), leaf type, a character introduced in this study, was identified as a key infrageneric taxonomic character due to its strong phylogenetic signal (λ = 0.99, K = 1.50; Supplementary Fig. 12, Supplementary Table 3, Supplementary Data 3). For example, with the help of leaf type, we can further identify two new sections in the recently established subgenus A. subg. Pectinatae31. Species of sect. Pectinatae have Type 2 leaves (1-pinnatisect multiple-lobed, small), while species of sect. Hedinianae have Type 11 leaves (2-pectinately pinnatisect, large) (Fig. 4, Supplementary Table 3). Besides, leaf type is more readily observable than capitulum type (Fig. 4), which usually requires dissection under a stereomicroscope for identification. Other leaf morphological characters, like leaf segment width and number of leaf segment pairs, are often used for the subgeneric taxonomy and species identification in Artemisia (Supplementary Data 1). Our results showed that, except for the leaf segment width (λ = 0.73, K = 1.05; Supplementary Fig. 12, and Supplementary Data 3) and leaf area (λ = 0.81, K = 1.10; Supplementary Fig. 12, and Supplementary Data 3) which exhibit moderate phylogenetic signals, most others showed weak phylogenetic signals (λ < 0.7 or K < 1; Supplementary Data 3) and are therefore not useful for infrageneric taxonomy.
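The signal categories used above can be expressed as a small classifier. The sketch below takes the strong and moderate thresholds for Pagel's λ and Blomberg's K from the text and, as an assumption, checks the strong rule first and treats everything that is neither strong nor moderate as weak. The function name is illustrative, and the values are the four reported for macromorphological characters.

```python
def signal_strength(lam, k):
    """Classify phylogenetic signal from Pagel's lambda and Blomberg's K.

    Assumption: the 'strong' rule takes precedence over 'moderate', and
    'weak' is whatever remains.
    """
    if lam >= 0.9 or k >= 1.5:
        return "strong"
    if lam >= 0.7 or k >= 1.0:
        return "moderate"
    return "weak"


# (lambda, K) values reported in the text.
reported = {
    "capitulum type": (0.93, 1.99),
    "leaf type": (0.99, 1.50),
    "leaf segment width": (0.73, 1.05),
    "leaf area": (0.81, 1.10),
}
classes = {name: signal_strength(lam, k) for name, (lam, k) in reported.items()}
```

Under these rules capitulum type and leaf type are classified as strong, while leaf segment width and leaf area are moderate, matching the description in the text.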
We also studied seven micromorphological characters in florets across 200 Artemisia species based on the pruned GM-derived ML tree, including the corolla shape and style morphology of both marginal and disk florets, along with the shape of the anther’s apical appendage, thecal base, and collar in disk florets (Fig. 5, and Supplementary Figs. 13 and 14; Supplementary Table 5; Supplementary Data 5). Although some taxonomists37,38 realized the taxonomic potential of these characters, to our knowledge, none have been systematically studied38,58. This study initiated a global investigation, addressing a knowledge gap.
Of the seven micromorphological characters, four (corolla shape and style morphology of both marginal and disk florets) displayed strong phylogenetic signals and could assist in delimiting subgenera and sections (Fig. 5): corolla shape of disk floret (λ = 0.99, K = 0.97), corolla shape of marginal floret (λ = 0.99, K = 3.22), style morphology of disk floret (λ = 0.99, K = 4.41), and style morphology of marginal floret (λ = 0.99, K = 2.50) (all in Supplementary Data 5). While anther-related characters exhibited weak phylogenetic signals (λ < 0.7 or K < 1; Supplementary Data 5), our observations revealed stable interspecific variation in anther apical appendage and thecal base morphology (Supplementary Figs. 13 and 14). Thus, it would be beneficial to expand sampling and test the value of these characters in species delimitation.
Our study also revealed some interesting evolutionary trajectories of these morphological characters. For example, the homogamous-discoid capitulum (Seriphidium type), a character state in Artemisia (Fig. 4 and Supplementary Fig. 12), was once used to circumscribe the genus Seriphidium33. Our inferred phylogeny (Fig. 2) and morphological analysis (Fig. 4) revealed that it evolved independently at least three times, in Asia, North America and North Africa (Fig. 4 and Supplementary Fig. 12), where species with this state inhabit arid zones21,22,23. It seems that Artemisia lineages in these geographically isolated regions are repeatedly playing the same tape of life59.
Together, these results indicate that morphology, as the most direct evidence in taxonomic research, remains worthy of in-depth investigation even in the genomic era34,35,36.
The comprehensive sectional taxonomy of Artemisia
Following the principles of monophyly and identifiability35, we proposed a complete sectional taxonomy of Artemisia, identified diagnostic morphological characters for each subgenus and section, and developed an identification key to all of them (Supplementary Note 1), based on the phylogenies from genomic data (Fig. 2) and the analysis of morphological characters (Figs. 4 and 5). This update included the establishment of 10 new sections and revised circumscriptions for 11 existing ones (Supplementary Note 1). Species that were not sampled in the GM-derived ML tree were assigned to a subgenus and section using the diagnostic morphological characters revealed in the current study. In total, 502 of 505 accepted Artemisia species were classified into 8 subgenera and 24 sections; 394 species were assigned based on both molecular and morphological evidence, while 105 species were assigned solely on diagnostic morphological characters (Supplementary Data 6). Three species (Artemisia avarica, A. dipsacea, A. galinae) remained unplaced due to insufficient morphological and molecular evidence, primarily stemming from ambiguous protologues and specimen inaccessibility. Most Artemisia species now have testable hypotheses for their subgeneric and sectional placements (Supplementary Data 6). The differences between the present taxonomy and previous taxonomies (mostly regional) are summarized in Supplementary Data 6.
The current taxonomic framework decomposes this big and complex genus (505 species) into 24 smaller sections (mean 19 species/section; range 1–119). Most sections are confined to a single country or region, enabling local taxonomists to handle them within a practical timeframe. Additionally, since each section has distinct morphological characters (Supplementary Note 1), this framework will not only aid in identification but also encourage local taxonomists—especially those without access to molecular facilities—to revise the taxonomy of Artemisia in their own regions based on morphology. Data obtained from these local revisions could then be continuously integrated back into the present framework, progressively enhancing species and geographic coverage toward a comprehensive monograph. The present taxonomic framework thus facilitates cross-continental collaboration and data integration, exemplifying a generalizable workflow for updating phylogenies and taxonomies of big genera in the genomic era (Supplementary Fig. 15). Unplaced species and those unsampled in molecular phylogeny warrant further sampling, sequencing, and integration. Notably, our single-representative-per-species approach does not address species delimitations in Artemisia, and thus, this species treatment herein remains provisional, consistent with the inherently dynamic nature of taxonomic studies.
Summary
Global-scale phylogenetic and taxonomic studies on big plant genera have been metaphorically termed a “black hole” due to their inherent complexity4,35. Our integrative analysis of genomic and morphological data resolves the long-standing phylogenetic and taxonomic complexities in the big genus Artemisia, establishing a phylogenetically robust and comprehensive taxonomic framework that proposes testable hypotheses for nearly all recognized species. Our study, along with other large-scale investigations of tropical big genera (e.g., Ipomoea4), demonstrates that actively adopting new technologies, exploring novel morphological characters, and fostering global collaboration can tackle these long-standing challenges, and even accelerate a process that might otherwise take decades or even centuries. Our study provides a robust baseline for future studies of this ecologically and economically important big genus, such as local revisions, bioprospecting, conservation, and comparative evolutionary analysis. Future research could prioritize local revisions within this framework, enhanced by increasing sampling density and integrating genomic data with fine-scale morphological, geographical and ecological data. This approach is crucial for clarifying species boundaries in Artemisia, particularly within notoriously difficult groups such as the A. vulgaris complex. This work serves as an exemplary case for taxonomic research on big genera in the genomic era.
Methods
Taxon sampling, DNA extraction, PCR reaction and sequencing
We sampled 394 Artemisia species spanning all eight subgenera of Artemisia currently recognized31. This sampling included all genera now classified within Artemisia but formerly placed in segregate genera (specifically Ajaniopsis, Crossostephium, Filifolium, Kaschgaria, Mausolea, Neopallasia, Picrothamnus, Sphaeromeria, and Turaniphytum) based on the most recent phylogeny31 (Supplementary Data 2). We also sampled 20 species from closely related genera within the tribe Anthemideae as outgroups based on the recent phylogeny of Anthemideae60,61, including Artemisiella, Chrysanthemum, Ajania, Phaeostigma, Elachanthemum, Stilpnolepis, Nipponanthemum, Tanacetum, Brachanthemum, Hippolytia, and Allardia. Each species was represented by a single sample. Of the 414 species (ingroups + outgroups) analysed, 314 species had genome-skimming data (96 newly sequenced species), while the remaining 100 species had only nuclear ribosomal DNA data (ITS, ETS). Of these 100, 18 were generated in this study and 82 were retrieved from GenBank (Supplementary Data 2). From the genome-skimming data, four datasets were generated: 1) low-copy nuclear gene sequences (NLC dataset), 2) nuclear ribosomal DNA sequences (NR dataset), 3) the concatenated gigamatrix (GM dataset) comprising the concatenated NLC dataset and the NR dataset, and 4) complete plastomes (plastome dataset). The sampling details for each subgenus of Artemisia are as follows: 1) A. subg. Dracunculus, 94 of 122 species sampled, with genomic data for 73 species; 2) A. subg. Pectinatae, six of 10 species sampled, with genomic data for five species; 3) A. subg. Pacifica, all four species sampled, with genomic data for all four; 4) A. subg. Ponticae, 17 of 23 species sampled, with genomic data for 14 species; 5) A. subg. Seriphidium, 96 of 128 species sampled, with genomic data for 65 species; 6) A. subg. Absinthium, 50 of 63 species sampled, with genomic data for 39 species; 7) A. subg. Tridentatae, all 41 species sampled, with genomic data for 29 species; and 8) A. subg. Artemisia, 86 of 111 species sampled, with genomic data for 69 species. Supplementary Data 2 provides detailed sampling information.
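As a quick consistency check on the per-subgenus sampling figures listed above, the counts can be tallied as in the following minimal sketch. The table layout and the function name sampling_totals are illustrative assumptions; only the numbers are taken from the text.

```python
# (sampled, accepted, with_genomic_data) per subgenus, copied from the text above.
# The dictionary layout and function name are illustrative, not from the study.
SAMPLING = {
    "Dracunculus": (94, 122, 73),
    "Pectinatae": (6, 10, 5),
    "Pacifica": (4, 4, 4),
    "Ponticae": (17, 23, 14),
    "Seriphidium": (96, 128, 65),
    "Absinthium": (50, 63, 39),
    "Tridentatae": (41, 41, 29),
    "Artemisia": (86, 111, 69),
}


def sampling_totals(table):
    """Sum sampled, accepted, and genome-skimmed species across subgenera."""
    sampled = sum(row[0] for row in table.values())
    accepted = sum(row[1] for row in table.values())
    genomic = sum(row[2] for row in table.values())
    return sampled, accepted, genomic


if __name__ == "__main__":
    sampled, accepted, genomic = sampling_totals(SAMPLING)
    print(f"sampled {sampled} of {accepted} placed species; {genomic} with genomic data")
    for name, (s, a, g) in SAMPLING.items():
        print(f"A. subg. {name}: {100 * s / a:.0f}% of species sampled")
```

The totals agree with the figures reported elsewhere in the text: 394 sampled Artemisia species, 502 species placed in subgenera, and 298 Artemisia species with genome-skimming data.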
Total genomic DNA was extracted using the TIANGEN plant genomic DNA extraction kit (TIANGEN Biotech, Beijing, China) following the manufacturer’s protocol. Total DNA extracted from silica-gel-dried leaves was sheared into ~350 bp fragments to build 350 bp insert libraries, and unsheared DNA from herbarium specimens was used to construct 150 bp insert libraries. The DNA libraries were constructed using the NexteraXT DNA Library Preparation Kit (Illumina, Shanghai, China) and were sequenced on the Illumina HiSeq X Ten platform (Illumina, Shanghai, China). We obtained ~3 Gb of paired-end data for each sample. The average read length was 150 bp for silica-gel-dried leaves and 100 bp for herbarium specimens.
Transcriptome assembly and low-copy nuclear gene selection
We first selected low-copy nuclear genes from transcriptome sequencing data of nine diploid Artemisia species representing all eight subgenera and one outgroup (Supplementary Data 2). Ploidy was cytologically confirmed (Supplementary Method 1; Supplementary Fig. 16). Then we extracted the target nuclear genes from the genome-skimming data. RNA was isolated from leaves and/or floral buds using the ZR Plant RNA MiniPrep kit (Zymo Research, Orange, CA, USA) following the manufacturer’s protocol. Library preparation utilized the PureLink RNA Mini Kit (Invitrogen, Carlsbad, CA, USA), and sequencing was performed on the Illumina HiSeq 2500 platform, employing 100–150 bp paired-end reads with a 6 Gb throughput. Raw RNA-seq reads were cleaned using SeqyClean v1.10.0962 to trim poly-A/T tails and terminal nucleotides with a 10 bp sliding window (Phred score ≤10). The filtered reads were de novo assembled with Trinity v2013-11-1063, yielding an average of 145,547 transcripts per sample. To estimate ortholog counts, we applied Yang and Smith’s pipeline64 using the rooted ingroups (RT) method. We ran MarkerMiner v1.265 using Artemisia annua as a reference and with a minimum transcript length of 400 bp. Following Vargas et al66., we used GoldFinder (https://bitbucket.org/oscarvargash/goldfinder)66 to sub-select the best markers from the MarkerMiner v1.265 results, and identified 253 single-copy nuclear genes as target nuclear gene markers with an average length of 1692 base pairs (bp) per marker.
Obtaining the target markers from genome skimming data
Following Vargas et al66., we used the GoldFinder-selected markers as references to retrieve target markers from the genome-skimming data. First, we used SeqyClean v1.10.0962 to clean the genome-skimming data using the same parameters as for the transcriptome data. Then, we used the script bbmap.sh (https://sourceforge.net/projects/bbmap/) to filter out reads from chloroplast, mitochondrial, and ribosomal DNA by mapping the genome-skimming data to the references using default parameters, retaining only reads from the nuclear genome. The reference sequences used were the chloroplast genome of Artemisia frigida (GenBank accession: NC020607), the mitochondrial genome of Chrysanthemum indicum (GenBank accession: MH716014), and the ribosomal DNA of Eschweilera congestiflora (GenBank accession: JN222324, JN222317). For each of the 253 target low-copy nuclear markers, the longest sequence in the marker matrices was chosen as the reference using the script longest_seq_fasta.py66. The genome-skimming data were aligned to the reference sequences, and consensus sequences were generated using the script sam2consensus.py (https://github.com/edgardomortiz/sam2consensus). The script baits_file_organizer.py was used to separate the sequences of each sample into single files, and the homologous genes from all samples were then merged into a matrix using cat_fastas_per_gene.py66. Multiple sequence alignments were performed using prank_wrapper.py64, followed by alignment trimming with phyutility_wrapper.py64. Given the shallow sequencing depth of the 3 Gb genome-skimming data, and to ensure species coverage and phylogenetic information in the final markers, only 202 of the 253 markers (those with 50% species coverage and matrix length > 150 bp) were retained. These were concatenated using concatenate_matrices.py64 for downstream phylogenetic analysis.
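The marker-retention rule described above (species coverage and matrix length thresholds) can be illustrated with the following minimal sketch. It is not the pipeline script used in the study. The function names are hypothetical, and two details are assumptions: that the 50% threshold is read as "at least 50% of sampled species", and that a species counts as covered when its sequence contains at least one character other than a gap, "?" or "N".

```python
def species_coverage(alignment, n_species):
    """Fraction of the n_species sampled that have at least one called base.

    alignment maps species name -> aligned sequence for one marker.
    Sequences made only of gaps, '?' or 'N' are treated as absent (assumption).
    """
    present = sum(1 for seq in alignment.values() if seq.strip("-?Nn"))
    return present / n_species


def keep_marker(alignment, n_species, min_coverage=0.5, min_length=150):
    """Apply the retention rule: coverage >= min_coverage and length > min_length."""
    length = max((len(seq) for seq in alignment.values()), default=0)
    return species_coverage(alignment, n_species) >= min_coverage and length > min_length


if __name__ == "__main__":
    # Toy marker: 2 of 4 species have data, alignment length 151 bp.
    toy = {"sp1": "A" * 151, "sp2": "C" * 151, "sp3": "-" * 151}
    print(keep_marker(toy, n_species=4))
```

In practice such a filter would be applied to each of the 253 marker matrices before concatenation.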
Plastome assembly and annotation
We de novo assembled complete plastomes for 314 species (298 Artemisia + 16 allied species) using GetOrganelle v1.7.667 and recommended parameters (Supplementary Data 2). Assembled plastomes were annotated using GeSeq v2.0368 (CHLOROBOX; https://chlorobox.mpimp-golm.mpg.de/index.html), with the Artemisia frigida plastome (GenBank accession: NC020607) as a reference. tRNA annotation was performed using tRNAscan-SE v2.0.73369. Annotation results were imported into Geneious v. 11.0.470, where the amino acid sequences of protein-coding regions were validated and annotations adjusted accordingly. Based on the adjusted annotations, coding and non-coding regions were extracted using a Python script get_annotated_regions_from_gb.py (https://github.com/Kinggerm/PersonalUtilities/blob/master/). Each region was aligned using MAFFT v7.223471 and adjusted manually using BioEdit v.7.0.5.372. This generated three datasets: 1) a concatenated matrix of the entire plastome (including all coding and non-coding regions); 2) a CDS matrix concatenated from all protein-coding regions; 3) a non-CDS matrix concatenated from all non-coding regions (Supplementary Data 2).
Ribosomal DNA sequence assembly and acquisition
We de novo assembled nuclear ribosomal DNA (ITS and ETS sequences) of 314 Artemisia species from genome-skimming data using GetOrganelle v1.7.667 with the recommended parameters (Supplementary Data 2). Additionally, we generated ITS/ETS sequences for 18 Artemisia species following the protocol of Jiao et al58. (detailed in Supplementary Method 2) and retrieved those of 82 congeneric species from GenBank, yielding a final NR dataset of 414 species (394 Artemisia + 20 allied species).
Gigamatrix approach
We generated a gigamatrix (GM dataset) of 414 species by concatenating the NLC dataset (202 nuclear low-copy genes from 314 species via genome-skimming data) and the NR dataset (ITS and ETS sequences for 414 species, including 314 from genome-skimming, 18 newly generated in this study, and 82 from GenBank; see Supplementary Data 2). The gigamatrix approach offers two key advantages: (1) its higher-level relationships align closely with those in phylogenomic trees, and (2) it enables accurate placement of some species using only ITS and ETS sequences19. However, several limitations should be noted: (1) Data heterogeneity: different evolutionary rates between the NR data and the NLC data may bias branch length estimation, necessitating partitioned models for correction. (2) Low support for certain taxa: taxa relying solely on a few markers (e.g., ITS/ETS) may have unstable placements and weak support due to their high proportion of missing data. (3) Constraints on species-tree methods: non-random distribution of missing data (e.g., incomplete NLC coverage across taxa) may reduce the reliability of multispecies coalescent methods (e.g., ASTRAL)19. Thus, we used IQ-TREE v.2.0.673 to build maximum likelihood (ML) trees for the gigamatrix, treating each nuclear gene and each ribosomal DNA marker as a separate partition. Substitution models were selected via the corrected Akaike information criterion (AICc) calculated using ModelFinder in IQ-TREE.
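The concatenation step behind a gigamatrix can be sketched as follows. This is an illustration under stated assumptions, not the script used in the study: the function name build_gigamatrix and the toy locus names are hypothetical, taxa lacking a locus are padded with "?" (a common convention for missing data), and partitions are written in the RAxML-style "DNA, name = start-end" format that IQ-TREE accepts.

```python
def build_gigamatrix(loci):
    """Concatenate per-locus alignments into one supermatrix.

    loci maps locus name -> {taxon: aligned sequence}. Taxa missing from a
    locus (e.g. species with only ITS/ETS) are padded with '?'.
    Returns (matrix, partitions): matrix maps taxon -> concatenated sequence,
    partitions is a list of RAxML-style partition lines, one per locus.
    """
    taxa = sorted({taxon for aln in loci.values() for taxon in aln})
    pieces = {taxon: [] for taxon in taxa}
    partitions = []
    start = 1
    for name, aln in loci.items():
        lengths = {len(seq) for seq in aln.values()}
        if len(lengths) != 1:
            raise ValueError(f"locus {name} is not aligned to a single length")
        length = lengths.pop()
        for taxon in taxa:
            pieces[taxon].append(aln.get(taxon, "?" * length))
        partitions.append(f"DNA, {name} = {start}-{start + length - 1}")
        start += length
    matrix = {taxon: "".join(parts) for taxon, parts in pieces.items()}
    return matrix, partitions


if __name__ == "__main__":
    toy = {
        "NLC_gene1": {"sp1": "ACGT", "sp2": "ACGA"},           # genome-skimmed taxa only
        "ITS": {"sp1": "TT", "sp2": "TC", "sp3": "TA"},        # sp3 has rDNA only
    }
    matrix, partitions = build_gigamatrix(toy)
    for taxon, seq in matrix.items():
        print(taxon, seq)
    print("\n".join(partitions))
```

In the toy example, sp3 stands in for a species represented only by ribosomal DNA: its low-copy nuclear columns are entirely missing, which is the situation the limitations above refer to.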
Tree topology tests
The topologies generated from plastome dataset and NLC dataset (inferred via maximum likelihood and coalescent methods) revealed incongruent phylogenetic relationships between Artemisia and its allies. To find an optimal tree from these genomic datasets, we compared the likelihood values of alternative hypotheses against those of the unconstrained ML tree using three statistical tests generated in IQ-TREE v.2.0.673: Shimodaira-Hasegawa (SH) test42, Kishino-Hasegawa (KH) test43, and the approximately unbiased (AU) test44. Constrained trees were constructed in Mesquite v3.6174 by enforcing monophyly on target taxa (with others as polytomies), representing three scenarios: 1) monophyly of Artemisia; 2) Artemisia + Artemisiella as a clade; and 3) Artemisia + Artemisiella + (Chrysanthemum-Ajania-Elachanthemum) as a clade. These constrained topologies were then optimized in IQ-TREE under the GTR + I + G model (partitioned by DNA regions) and used to test alternative hypotheses of tree topology.
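The constrained topologies described above enforce monophyly on a target set of taxa and leave all other relationships unresolved. The study built these trees in Mesquite; the sketch below only illustrates what such a constraint looks like when written as a Newick string. The function name constraint_newick and the tip labels are hypothetical.

```python
def constraint_newick(all_taxa, clade):
    """Return a Newick constraint tree: `clade` monophyletic, the rest a polytomy.

    Both the constrained clade and the remaining taxa are left unresolved,
    so only the monophyly of `clade` is enforced.
    """
    inside = sorted(set(clade))
    outside = sorted(set(all_taxa) - set(clade))
    if len(inside) < 2 or not outside:
        raise ValueError("need at least two constrained taxa and one taxon outside")
    unknown = set(clade) - set(all_taxa)
    if unknown:
        raise ValueError(f"constrained taxa not in the dataset: {sorted(unknown)}")
    return "((" + ",".join(inside) + ")," + ",".join(outside) + ");"


if __name__ == "__main__":
    # Placeholder tip labels; scenario 1 would list every Artemisia tip in `clade`.
    taxa = ["Artemisia_a", "Artemisia_b", "Artemisiella_x", "Chrysanthemum_y"]
    print(constraint_newick(taxa, ["Artemisia_a", "Artemisia_b"]))
```

The other two scenarios would follow the same pattern, with the Artemisiella tips and then the Chrysanthemum-Ajania-Elachanthemum tips added to the constrained set.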
Analysis of genomic data and causes of cytonuclear discordance
We built two datasets for phylogenetic reconstruction: 1) NLC dataset; 2) plastome dataset. For the NLC dataset, phylogenetic analysis was performed using two methods. First, we used IQ-TREE v.2.0.673 to build a ML tree for the concatenated matrix. Substitution models were selected via the corrected Akaike information criterion (AICc) calculated by ModelFinder75 in IQ-TREE. After building the ML tree, we measured phylogenomic discordance using gene concordance factors (gCF) and site concordance factors (sCF) calculated in IQ-TREE46. For each branch of the tree, gCF is the percentage of gene trees that contain that branch, and sCF is the percentage of informative sites that support it over alternative topologies. Low gCF values may arise from insufficient information (e.g., short branches) or from genuine conflicting signals. Similarly, low sCF values (~30%) suggest a lack of phylogenetic information in the loci46. Second, we applied the multispecies coalescent method with ASTRAL-III v.5.7.476. Gene trees for each single-gene matrix were generated using IQ-TREE, and branches with bootstrap support (BS) ≤ 20% were collapsed using the ‘nw_ed’ function in Newick Utilities v1.677 before the trees were used as input for ASTRAL-III. Polytomy tests were conducted in ASTRAL-III on the ASTRAL-III topology (option -t 10)47. The polytomy test evaluates the null hypothesis that a branch of a species tree is a polytomy given a set of gene trees, which helps identify polytomies while accounting for incomplete lineage sorting (ILS). A p-value ≥ 0.1 indicates that the null hypothesis cannot be rejected. We chose this conservative p-value because phylogenetic analyses with <1000 genes may lack resolution to distinguish short branches from multiple nodes47,78.
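To make the gene concordance factor concrete, the following minimal sketch computes, for each internal branch of a reference tree, the percentage of gene trees that contain the same branch. It is a simplification and not IQ-TREE's implementation: it assumes every gene tree contains exactly the same taxa (IQ-TREE instead counts only gene trees that are decisive for a branch), and it reads only plain Newick strings with tip names, without branch lengths or support values. All function names are hypothetical.

```python
def clades(newick):
    """Set of clades (frozensets of tip names) in a plain Newick string."""
    stack, found, token = [], set(), ""
    for ch in newick:
        if ch not in "(),;":
            token += ch
            continue
        name, token = token.strip(), ""
        if name:
            stack[-1].add(name)          # tip belongs to the innermost open clade
        if ch == "(":
            stack.append(set())
        elif ch == ")":
            done = stack.pop()
            found.add(frozenset(done))
            if stack:
                stack[-1] |= done        # tips also belong to the parent clade
    return found


def splits(newick):
    """Non-trivial bipartitions, each stored as the side lacking a fixed anchor tip."""
    all_clades = clades(newick)
    taxa = max(all_clades, key=len)
    anchor = min(taxa)
    result = set()
    for clade in all_clades:
        side = clade if anchor not in clade else taxa - clade
        if 1 < len(side) < len(taxa) - 1:
            result.add(frozenset(side))
    return result


def gene_concordance(reference, gene_trees):
    """Percentage of gene trees containing each internal branch of the reference."""
    gene_splits = [splits(tree) for tree in gene_trees]
    return {
        branch: 100.0 * sum(branch in gs for gs in gene_splits) / len(gene_splits)
        for branch in splits(reference)
    }


if __name__ == "__main__":
    reference = "((A,B),(C,D),E);"
    genes = ["((A,B),(C,D),E);", "((A,B),(C,E),D);", "((A,C),(B,D),E);"]
    for branch, gcf in gene_concordance(reference, genes).items():
        print(sorted(branch), round(gcf, 1))
```

Storing each bipartition by the side that excludes a fixed anchor tip makes the comparison independent of where each gene tree happens to be rooted.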
For the plastome dataset, since all coding and non-coding regions are treated as a single heritable unit, no separate gene trees were inferred. We used IQ-TREE to build a ML tree for the plastome dataset, and only the site concordance factor (sCF) was calculated.
Phylogenetic network estimation
To explore whether gene flow contributed to cytonuclear discordance in Artemisia, we inferred and simulated phylogenetic networks using the maximum pseudo-likelihood approach, accounting for incomplete lineage sorting (ILS) and gene flow. Phylogenetic networks were reconstructed from gene trees using the InferNetwork_MPL command in PhyloNet v.3.6.948. The cytonuclear discordance patterns (Fig. 3a) suggested hybridization events among Artemisia subgenera. Given the computational limitations of PhyloNet, we constructed a 22-species dataset, comprising 20 Artemisia species (representing all eight subgenera) and two outgroups (Artemisiella+Chrysanthemum/Elachanthemum; Fig. 3c and d). To enhance taxon representativeness, two independent 22-species datasets were generated for PhyloNet analysis. For each dataset, we performed 5 network searches allowing 0 to 6 reticulations, with 10 replicates per search. To determine the optimal reticulation number, we used InferNetwork_MP to calculate the minimizing deep coalescences (MDC) score for different reticulation counts. The optimal model was identified as the reticulation number corresponding to the lowest MDC value and the steepest decline in the MDC curve79.
Evolutionary analysis of macro- and micro-morphological characters
We investigated a total of 20 morphological characters potentially useful for Artemisia infrageneric taxonomy, of which 13 are macromorphological (Supplementary Table 4) and seven are micromorphological (Supplementary Table 5). For macromorphological characters, in addition to those previously used for subgeneric taxonomy31, such as life form, capitulum type, synflorescence, leaf shape, and leaf size, we also investigated others, including plant height, capitulum diameter, leaf area, leaf length-width ratio, leaf segment length and leaf segment width (Supplementary Table 4, Supplementary Data 3). We introduced a character “leaf types” defined by the following seven specific leaf morphological features: leaf shape, number of leaf segment pairs, leaf size, leaf area, leaf length-width ratio, leaf segment length, and leaf segment width (Fig. 4 and Supplementary Fig. 12, Supplementary Table 3). We gathered data on these macromorphological characters for all 394 sampled Artemisia species and 6 outgroup species in the GM-derived ML tree (Fig. 2) through field observations and specimen examinations. Each character was measured in three individuals using ImageJ software (https://imagej.nih.gov/ij/), and the average value was calculated. Data were verified against literature (e.g., floras and protologues; Supplementary Data 1). For leaf size, a character reflecting overall leaf dimensions, we followed Cain80 to derive a rough estimate: leaf size was approximated by measuring length and width, then calculating length × width × 3/4. In contrast, leaf area was defined as the precise one-side or projected area of an individual leaf81. Of the 13 macromorphological characters, eight are quantitative (plant height, capitulum diameter, number of leaf segment pairs, leaf size, leaf area, leaf length-width ratio, leaf segment length, and leaf segment width; Supplementary Data 4).
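The leaf size approximation and the averaging over three individuals described above can be written out as a short sketch. The function names and the example measurements are illustrative assumptions; only the length × width × 3/4 formula and the use of three individuals come from the text.

```python
def leaf_size(length, width):
    """Rough leaf size estimate following the text: length x width x 3/4."""
    return length * width * 3 / 4


def mean_leaf_size(measurements):
    """Average the estimate over individuals; measurements is a list of (length, width)."""
    sizes = [leaf_size(length, width) for length, width in measurements]
    return sum(sizes) / len(sizes)


if __name__ == "__main__":
    # Hypothetical measurements (cm) for three individuals of one species.
    individuals = [(4.0, 2.0), (5.0, 2.0), (6.0, 2.0)]
    print(mean_leaf_size(individuals))  # mean of 6.0, 7.5 and 9.0 cm^2
```

Unlike the measured leaf area, this estimate depends only on the two linear dimensions, so deeply dissected leaves can have a leaf size well above their actual projected area.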
To facilitate subsequent classification of character states, we discretized these eight continuous characters using a Gaussian Mixture Model (GMM)82. The optimal number of discrete states was determined by minimizing the Bayesian Information Criterion (BIC) in conjunction with the distribution of character states on the phylogenetic tree. The scripts used in this analysis are available on Figshare (https://doi.org/10.6084/m9.figshare.28164335). Ancestral state reconstruction for the 13 characters was performed using the maximum likelihood method in the “APE” package83 of RASP v4.284 based on the GM-derived ML tree (Fig. 2). Phylogenetic signals of these characters were tested using Blomberg’s K85 and Pagel’s λ86 in R v3.6.187. Phylogenetic signal denotes the tendency for closely related species to exhibit greater phenotypic resemblance than expected under random sampling from a phylogenetic tree. A strong signal is indicated when Pagel’s λ approaches 1 or Blomberg’s K exceeds 1.
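The idea of discretizing a continuous character with a Gaussian mixture and comparing candidate numbers of states by BIC can be illustrated with the minimal pure-Python sketch below. It is not the script deposited on Figshare. It fits a one-dimensional mixture by expectation-maximization with a deterministic quantile-based start, and all function names, the variance floor, and the toy data are assumptions made for illustration.

```python
import math


def _log_parts(x, weights, means, variances):
    """log(weight * Normal(x | mean, variance)) for each mixture component."""
    return [math.log(w) - 0.5 * math.log(2 * math.pi * v) - 0.5 * (x - m) ** 2 / v
            for w, m, v in zip(weights, means, variances)]


def _logsumexp(values):
    top = max(values)
    return top + math.log(sum(math.exp(v - top) for v in values))


def fit_gmm_1d(values, k, n_iter=200, var_floor=1e-6):
    """Fit a k-component 1-D Gaussian mixture by EM; return (weights, means, variances, bic)."""
    xs = sorted(values)
    n = len(xs)
    grand_mean = sum(xs) / n
    grand_var = max(sum((x - grand_mean) ** 2 for x in xs) / n, var_floor)
    # Deterministic start: means at evenly spaced quantiles, shared variance.
    weights = [1.0 / k] * k
    means = [xs[int((j + 0.5) * n / k)] for j in range(k)]
    variances = [grand_var] * k
    for _ in range(n_iter):
        # E-step: responsibility of each component for each observation.
        resp = []
        for x in xs:
            parts = _log_parts(x, weights, means, variances)
            total = _logsumexp(parts)
            resp.append([math.exp(p - total) for p in parts])
        # M-step: update weights, means and variances from the responsibilities.
        for j in range(k):
            nj = max(sum(r[j] for r in resp), 1e-12)
            weights[j] = nj / n
            means[j] = sum(r[j] * x for r, x in zip(resp, xs)) / nj
            spread = sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, xs)) / nj
            variances[j] = max(spread, var_floor)
    loglik = sum(_logsumexp(_log_parts(x, weights, means, variances)) for x in xs)
    n_params = 3 * k - 1  # k means, k variances, k - 1 free weights
    bic = n_params * math.log(n) - 2 * loglik
    return weights, means, variances, bic


def assign_states(values, weights, means, variances):
    """Discrete state of each value = index of its most probable component."""
    states = []
    for x in values:
        parts = _log_parts(x, weights, means, variances)
        states.append(parts.index(max(parts)))
    return states


if __name__ == "__main__":
    # Toy character with two well-separated size classes (hypothetical values).
    data = [0.8 + 0.02 * i for i in range(20)] + [9.7 + 0.03 * i for i in range(20)]
    for k in (1, 2):
        print(k, round(fit_gmm_1d(data, k)[3], 1))
```

A lower BIC indicates a better trade-off between fit and number of parameters. As the text notes, the final number of states was also informed by how the states are distributed on the phylogeny, which a purely statistical criterion like this does not capture.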
We studied seven micromorphological characters of florets across all sampled Artemisia species and selected outgroup species (Supplementary Table 5), using our field collections and herbarium specimens deposited in PE, LE and MW. Sampling encompassed all eight subgenera and 24 sections (Supplementary Data 2 and 5), with three individuals examined per species. Capitula were processed through: (1) 24-hour FAA fixation; (2) ultrasonic cleaning (100 Hz, 5 min); (3) 5% NaOH treatment (2 hr). After rinsing, capitula were mounted in Hoyer’s solution and imaged using a Leica DM5000B microscope. We examined the corolla and style of both marginal and disk florets, and the anther of disk florets. Terminology follows Roque et al88. and Grossi et al89. Due to material limitations and experimental constraints, micromorphological data were unavailable for some species in the GM-derived ML tree (Fig. 2). These species were excluded from the GM-based ML tree. Based on this pruned tree [200 Artemisia species and 4 outgroup species (Chrysanthemum + Ajania); Supplementary Data 5], ancestral states and phylogenetic signals of the seven micromorphological characters were reconstructed and estimated using the methods described above.
Compilation of Artemisia species with taxonomic positions
We collected all accepted species names of Artemisia primarily from four major databases: Plants of the World Online (https://powo.science.kew.org/); Global Compositae Checklist (https://www.compositae.org); World Flora Online (https://www.worldfloraonline.org); and Catalogue of Life (https://www.catalogueoflife.org), all retrieved on 29 September 2024. These names were cross-referenced with floras of the main distribution ranges of the genus (Supplementary Data 1) and our original research on herbarium specimens. We checked and standardized the species names following the International Code of Nomenclature for algae, fungi, and plants90. Based on the phylogenetic and morphological evidence (Fig. 2, Supplementary Figs. 12 and 13), we assigned each species to a subgenus and section within the proposed taxonomic framework (see Supplementary Data 6). Species with insufficient morphological and molecular evidence were treated as unplaced and listed separately. For conflicting taxonomic treatments, decisions were made based on our morphological assessments in the field and herbaria, along with the latest phylogeny established here (Fig. 2).
Reporting summary
Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.
Data availability
All data generated or analysed in this study are included in this published article and/or its supplementary materials. Sequencing data generated in this study have been deposited in GenBank under BioProject PRJNA909040. All sequences used in the phylogenetic analyses are available in GenBank, with accession numbers listed in Supplementary Data 2. Morphological character data generated in this study are provided in Supplementary Data 3–5. The sectional taxonomy of Artemisia with an identification key to all the sections is provided in Supplementary Note 1. The accepted species list of Artemisia with subgeneric and sectional positions (including supporting evidence), and comparisons with databases and previous taxonomies, are provided in Supplementary Data 6. Phylogenies and datasets generated during and/or analysed in this study have been deposited in the figshare data repository (https://doi.org/10.6084/m9.figshare.28164335).
Code availability
We used publicly available software packages and scripts as described in the Methods. The scripts used for discretization of continuous morphological characters are available on Figshare (https://doi.org/10.6084/m9.figshare.28164335).
References
Frodin, D. G. History and concepts of big plant genera. Taxon 53, 753–776 (2004).
Moonlight, P. W. et al. Twenty years of big plant genera. Proc. Biol. Sci. 291, 20240702 (2024).
Folk, R. A. et al. Anatomy of a mega-radiation: biogeography and niche evolution in Astragalus. Am. J. Bot. 111, e16299 (2024).
Munoz-Rodriguez, P. et al. A taxonomic monograph of Ipomoea integrated across phylogenetic scales. Nat. Plants 5, 1136–1144 (2019).
Santos, M. F., Lucas, E.dA. & Sano, P. T. A taxonomic monograph of Myrcia sect. Sympodiomyrcia (Myrteae, Myrtaceae). Phytotaxa 380, 1 (2018).
Griffiths, M., Ralimanana, H., Rakotonasolo, F. & Larridon, I. A monograph of the African and Madagascan species of Cyperus sect. Incurvi (Cyperaceae). Kew Bull. 77, 819–850 (2022).
Knapp, S., Bohs, L., Nee, M. & Spooner, D. M. Solanaceae-a model for linking genomics with biodiversity. Comp. Funct. Genomics 5, 285–291 (2004).
Moonlight, P. W. et al. Dividing and conquering the fastest–growing genus: towards a natural sectional classification of the mega–diverse genus Begonia (Begoniaceae). Taxon 67, 267–323 (2018).
Liu, Z. et al. Global cooling during the Eocene-Oligocene climate transition. Science 323, 1187–1190 (2009).
Palazzesi, L., Hidalgo, O., Barreda, V. D., Forest, F. & Höhna, S. The rise of grasslands is linked to atmospheric CO2 decline in the late Palaeogene. Nat. Commun. 13, 293 (2022).
Guo, Z. T. et al. Onset of Asian desertification by 22 Myr ago inferred from loess deposits in China. Nature 416, 159–163 (2002).
Dimitrov, D. et al. Diversification of flowering plants in space and time. Nat. Commun. 14, 7609 (2023).
Stebbins, G. L. Flowering Plants: Evolution above the Species Level, (The Belknap Press of Harvard University Press, 1974).
Tu, Y. The discovery of artemisinin (qinghaosu) and gifts from Chinese medicine. Nat. Med. 17, 1217–1220 (2011).
Saslis-Lagoudakis, C. H. et al. Phylogenies reveal predictive power of traditional medicine in bioprospecting. Proc. Natl. Acad. Sci. USA 109, 15835–15840 (2012).
Faurby, S., Eiserhardt, W. L., Baker, W. J. & Svenning, J. C. An all-evidence species-level supertree for the palms (Arecaceae). Mol. Phylogenet. Evol. 100, 57–69 (2016).
Zuntini, A. R. et al. Phylogenomics and the rise of the angiosperms. Nature 629, 843–850 (2024).
Talavera, G., Lukhtanov, V., Pierce, N. E. & Vila, R. DNA Barcodes combined with multilocus data of representative taxa can generate reliable higher-level phylogenies. Syst. Biol. 71, 382–395 (2022).
Portik, D. M. et al. Redefining possible: combining phylogenomic and supersparse data in frogs. Mol. Biol. Evol. 40, msad109 (2023).
Perez-Escobar, O. A. et al. The origin and speciation of orchids. New Phytol. 242, 700–716 (2024).
Poljakov, P. P. Artemisia L. In Flora of the U.S.S.R., Vol. 26, (ed. Shishkin, B. K. & Bobrov, E. G.) 425–631 (Nauka, Leningrad 1961).
Shultz, L. M. Artemisia. In Flora of North America (ed. Flora of North America Editorial Committee) 503–534 (Oxford University Press, New York, 2006).
Ling, Y. R., Humphries, C. J. & Gilbert, M. G. Artemisia L. In Flora of China, Vol. 20 (eds Wu, Z. Y., Raven, P. H. & Hong, D. Y.) 1151–1259 (Science Press, Missouri Botanical Garden Press, Beijing, St. Louis, 2011).
Vallès, J. et al. Biology, genome evolution, biotechnological issues and research including applied perspectives in Artemisia (Asteraceae). Adv. Botanical Res. 60, 349–419 (2011).
Wang, J. et al. Artemisinin, the magic drug discovered from traditional Chinese medicine. Engineering 5, 32–39 (2019).
Zheng, H. et al. Inhibitors of Mycobacterium tuberculosis DosRST signaling and persistence. Nat. Chem. Biol. 13, 218–225 (2017).
Liu, Y. et al. Artemisinins ameliorate polycystic ovarian syndrome by mediating LONP1-CYP11A1 interaction. Science 384, eadk5382 (2024).
Ma, T. et al. Prevalence and risk factors for allergic rhinitis in adults and children living in different grassland regions of Inner Mongolia. Allergy 75, 234–239 (2020).
Zhou, W., Li, C., Wang, S., Ren, Z. & Stringer, L. C. Effects of vegetation restoration on soil properties and vegetation attributes in the arid and semi-arid regions of China. J. Environ. Manag. 343, 118186 (2023).
Sanz, M. et al. Molecular phylogeny and evolution of floral characters of Artemisia and allies (Anthemideae, Asteraceae): evidence from nrDNA ETS and ITS sequences. Taxon 57, 66–78 (2008).
Jiao, B. et al. Phylogenomics and morphological evolution of the mega-diverse genus Artemisia (Asteraceae: Anthemideae): implications for its circumscription and infrageneric taxonomy. Ann. Bot. 131, 867–883 (2023).
McArthur, E., Pope, C. & Freeman, D. Chromosomal studies of subgenus Tridentatae of Artemisia: evidence for autopolyploidy. Am. J. Bot. 68, 589–605 (1981).
Ling, Y. R., Humphries, C. J. & Gilbert, M. G. Seriphidium (Besser ex Lessing) Fourreau. In Flora of China Vol. 20 (eds Wu, Z.Y., Raven, P.H. & Hong, D.Y.) 737–747 (Science Press, Missouri Botanical Garden Press, Beijing, St. Louis, 2011).
Stuessy, T. F. Plant Taxonomy: The Systematic Evaluation of Comparative Data (Columbia University Press, 2009).
Muñoz-Rodríguez, P. et al. The challenges of classifying big genera such as Ipomoea. Taxon 72, 1201–1215 (2023).
Wheeler, Q. et al. Species, Science and Society: The Role of Systematic Biology (Routledge, 2023).
Poljakov, P. P. Materials for the systematics of the genus Artemisia. Trudy Inst. Bot. (Alma - Ata) 11, 134–178 (1961).
Park, M. S., Hong, K. N., Eom, J. A. & Chung, G. Y. Taxonomic study on the capitulum morphology of Korean Artemisia (Compositae). Korean J. Plant Taxon. 40, 27–42 (2010).
Malik, S. et al. Phylogeny and biogeography of Artemisia subgenus Seriphidium (Asteraceae: Anthemideae). Taxon 66, 934–952 (2017).
Pellicer, J. et al. Origin and evolution of the South American endemic Artemisia species (Asteraceae): evidence from molecular phylogeny, ribosomal DNA and genome size data. Aust. J. Bot. 58, 605–616 (2010).
Wei, M. et al. Merging Ajaniopsis (Asteraceae), an endangered genus endemic to the Tibet Plateau, into Artemisia: Implications for systematics and conservation. Taxon 74, 417–434 (2025).
Shimodaira, H. & Hasegawa, M. Multiple comparisons of log-likelihoods with applications to phylogenetic inference. Mol. Biol. Evol. 16, 1114–1116 (1999).
Kishino, H. & Hasegawa, M. Evaluation of the maximum likelihood estimate of the evolutionary tree topologies from DNA sequence data, and the branching order in Hominoidea. J. Mol. Evol. 29, 170–179 (1989).
Shimodaira, H. An approximately unbiased test of phylogenetic tree selection. Syst. Biol. 51, 492–508 (2002).
Ghafoor, A. Artemisiella, a new genus of Compositae based on Artemisia stracheyi Hook.f. & Thoms. ex Clarke. Candollea 47, 635–643 (1992).
Minh, B. Q., Hahn, M. W. & Lanfear, R. New methods to calculate concordance factors for phylogenomic datasets. Mol. Biol. Evol. 37, 2727–2733 (2020).
Sayyari, E. & Mirarab, S. Testing for polytomies in phylogenetic species trees using quartet frequencies. Genes 9, 132 (2018).
Than, C., Ruths, D. & Nakhleh, L. PhyloNet: a software package for analyzing and reconstructing reticulate evolutionary relationships. BMC Bioinf 9, 322 (2008).
Gagnon, E. et al. Phylogenomic discordance suggests polytomies along the backbone of the large genus Solanum. Am. J. Bot. 109, 580–601 (2022).
Riggins, C. W. & Seigler, D. S. The genus Artemisia (Asteraceae: Anthemideae) at a continental crossroads: molecular insights into migrations, disjunctions, and reticulations among old and new world species from a Beringian perspective. Mol. Phylogenet. Evol. 64, 471–490 (2012).
Freeman, D. C., Turner, W. A., McArthur, E. D. & Graham, J. H. Characterization of a Narrow Hybrid Zone between 2 Subspecies of Big Sagebrush (Artemisia tridentata, Asteraceae). Am. J. Bot. 78, 805–815 (1991).
Tkach, N. V., Hoffmann, M. H., Roser, M., Korobkov, A. A. & von Hagen, K. B. Parallel evolutionary patterns in multiple lineages of arctic Artemisia L. (Asteraceae). Evolution 62, 184–198 (2008).
Garcia, S. et al. Molecular phylogenetic approach to Western North America endemic Artemisia and Allies (Asteraceae): untangling the sagebrushes. Am. J. Bot. 98, 638–653 (2011).
Soltis, D. E. & Kuzoff, R. K. Discordance between nuclear and chloroplast phylogenies in the Heuchera group (Saxifragaceae). Evolution 49, 727–742 (1995).
Rieseberg, L. H. & Soltis, D. E. Phylogenetic consequences of cytoplasmic gene flow in plants. Evol. Trends Plants 5, 65–84 (1991).
Stull, G. W., Pham, K. K., Soltis, P. S. & Soltis, D. E. Deep reticulation: the long legacy of hybridization in vascular plant evolution. Plant J. 114, 743–766 (2023).
Li, H.-T. et al. Origin of angiosperms and the puzzle of the Jurassic gap. Nat. Plants 5, 461–470 (2019).
Jiao, B. et al. Artemisia baxoiensis (Asteraceae: Anthemideae), a distinctive new species from Xizang, China. Syst. Bot. 44, 424–432 (2019).
Blount, Z. D., Lenski, R. E. & Losos, J. B. Contingency and determinism in evolution: replaying life’s tape. Science 362, eaam5979 (2018).
Watson, L. E., Bates, P. L., Evans, T. M., Unwin, M. M. & Estes, J. R. Molecular phylogeny of Subtribe Artemisiinae (Asteraceae), including Artemisia and its allied and segregate genera. BMC Evol. Biol. 2, 17 (2002).
Criado-Ruiz, D. et al. A phylogenomic approach to disentangling the evolution of the large and diverse daisy tribe Anthemideae (Asteraceae). J. Syst. Evol. 63, 282–306 (2025).
Zhbannikov, I. Y., Hunter, S. S., Foster, J. A. & Settles, M. L. SeqyClean: A Pipeline for High-throughput Sequence Data Preprocessing. In 8th ACM International Conference on Bioinformatics, Computational Biology, and Health Informatics (ACM-BCB) 407-416 (Boston, MA, 2017).
Grabherr, M. G. et al. Full-length transcriptome assembly from RNA-Seq data without a reference genome. Nat. Biotechnol. 29, 644–652 (2011).
Yang, Y. & Smith, S. A. Orthology inference in nonmodel organisms using transcriptomes and low-coverage genomes: Improving accuracy and matrix occupancy for phylogenomics. Mol. Biol. Evol. 31, 3081–3092 (2014).
Chamala, S. et al. MarkerMiner 1.0: A new application for phylogenetic marker development using angiosperm transcriptomes. Appl. Plant Sci. 3, apps–1400115 (2015).
Vargas, O. M., Heuertz, M., Smith, S. A. & Dick, C. W. Target sequence capture in the Brazil nut family (Lecythidaceae): Marker selection and in silico capture from genome skimming data. Mol. Phylogenet. Evol. 135, 98–104 (2019).
Jin, J.-J. et al. GetOrganelle: a fast and versatile toolkit for accurate de novo assembly of organelle genomes. Genome Biol. 21, 241 (2020).
Tillich, M. et al. GeSeq - versatile and accurate annotation of organelle genomes. Nucleic Acids Res 45, W6–W11 (2017).
Chan, P. P., Lin, B. Y., Mak, A. J. & Lowe, T. M. tRNAscan-SE 2.0: improved detection and functional classification of transfer RNA genes. Nucleic Acids Res 49, 9077–9096 (2021).
Kearse, M. et al. Geneious Basic: An integrated and extendable desktop software platform for the organization and analysis of sequence data. Bioinformatics 28, 1647–1649 (2012).
Katoh, K. & Standley, D. M. MAFFT Multiple Sequence Alignment Software Version 7: Improvements in Performance and Usability. Mol. Biol. Evol. 30, 772–780 (2013).
Hall, T. BioEdit: A user-friendly biological sequence alignment editor and analysis program for windows 95/98/NT. Nucleic Acids Symp. Ser. (Oxf.) 41, 95–98 (1999).
Minh, B. Q. et al. IQ-TREE 2: New models and efficient methods for phylogenetic inference in the genomic era. Mol. Biol. Evol. 37, 1530–1534 (2020).
Maddison, W. P. & Maddison, D. R. Mesquite: a modular system for evolutionary analysis. Version 3.61 (2023).
Kalyaanamoorthy, S., Minh, B. Q., Wong, T. K. F., von Haeseler, A. & Jermiin, L. S. ModelFinder: fast model selection for accurate phylogenetic estimates. Nat. Methods 14, 587–589 (2017).
Mirarab, S. et al. ASTRAL: genome-scale coalescent-based species tree estimation. Bioinformatics 30, i541–i548 (2014).
Junier, T. & Zdobnov, E. M. The Newick utilities: high-throughput phylogenetic tree processing in the Unix shell. Bioinformatics 26, 1669–1670 (2010).
Rose, J. P., Toledo, C. A. P., Lemmon, E. M., Lemmon, A. R. & Sytsma, K. J. Out of sight, out of mind: widespread nuclear and plastid-nuclear discordance in the flowering plant genus Polemonium (Polemoniaceae) suggests widespread historical gene flow despite limited nuclear signal. Syst. Biol. 70, 162–180 (2021).
Cao, Z., Liu, X., Ogilvie, H. A., Yan, Z. & Nakhleh, L. Practical aspects of phylogenetic network analysis using PhyloNet. bioRxiv (2019).
Cain, S. & Castro, G.M.D.O. Manual of Vegetation Analysis (Harper, New York, 1959).
Perez-Harguindeguy, N. et al. New handbook for standardised measurement of plant functional traits worldwide. Aust. J. Bot. 61, 167–234 (2013).
Tiburtini, M., Scrucca, L. & Peruzzi, L. A guide to plant morphometrics using Gaussian mixture models. bioRxiv, https://doi.org/10.1101/2024.04.21.590472 (2024).
Paradis, E., Claude, J. & Strimmer, K. APE: Analyses of phylogenetics and evolution in R language. Bioinformatics 20, 289–290 (2004).
Yu, Y., Harris, A. J., Blair, C. & He, X. RASP (reconstruct ancestral state in phylogenies): a tool for historical biogeography. Mol. Phylogenet. Evol. 87, 46–49 (2015).
Blomberg, S. P., Garland, T. & Ives, A. R. Testing for phylogenetic signal in comparative data: behavioral traits are more labile. Evolution 57, 717–745 (2003).
Pagel, M. Inferring the historical patterns of biological evolution. Nature 401, 877–884 (1999).
R Core Team. R: A language and environment for statistical computing (R Foundation for Statistical Computing, Vienna, Austria, 2019).
Roque, N., Keil, D. J. & Susanna, A. Illustrated glossary of Compositae. In Systematics, evolution and biogeography of the Compositae (eds Funk, V., Stuessy, T. & Bayer, R.) 781–806 (International Association Plant Taxonomy, Vienna, 2009).
Grossi, M. A. et al. Providing tools for the reassessment of Eupatorieae (Asteraceae): Comparative and statistical analysis of reproductive characters in South American taxa. Perspect. Plant Ecol. Evol. Syst. 46, 125566 (2020).
Turland, N. J. et al. International Code of Nomenclature for algae, fungi, and plants (Shenzhen Code) adopted by the Nineteenth International Botanical Congress Shenzhen, China, July 2017 (Koeltz Botanical Books, Glashütten, 2018).
Compositae Working Group (C. W. G.). Global Compositae Database. Version 1.7 (2024).
Brummitt, N., Araújo, A. C. & Harris, T. Areas of plant diversity—What do we know? Plants People Planet 3, 33–44 (2021).
Acknowledgements
We thank the curators of A, B, BC, BCN, BR, E, GH, HAW, K, KUN, KYO, LE, MHA, MO, MW, P, PE, PTBG, TI and US, who granted us access to their collections. We also thank Dr. Zheng Li from Miami University for his assistance in polishing the language. This research was partly supported by the National Natural Science Foundation of China (Grant Nos. 32270229, 31870179 and 31570204 to T.G.G., and 32361133549 to W.W.), the Sino-Africa Joint Research Center (Grant No. SAJC201614 to T.G.G.), the National Plant Specimen Resource Center (Grant No. E0117G1001 to W.W.), and the Tencent Foundation and Shan Shui Conservation Center (to M.W.).
Author information
Contributions
T.G.G. and B.H.J. conceived the study; B.H.J., T.G.G., M.W., G.H.N., Y.F.L., C.C. and J.Y.Z. collected and analyzed the samples; M.W., G.Y.H., J.H.S., J.Y.Z. and G.H.N. provided pictures; B.H.J. and X.Y.C. drew the figures and tables; T.G.G., J.V., F.J., D.V., F.V., A.S.E., A.P.S., A.S., A.L., X.J.G., G.K. and W.W. contributed botanical knowledge; B.H.J. and T.G.G. wrote the paper with input from all authors.
Ethics declarations
Competing interests
The authors declare no competing interests.
Peer review
Peer review information
Nature Communications thanks Benedikt Kuhnhäuser, Chance Riggins and the other anonymous reviewer(s) for their contribution to the peer review of this work. A peer review file is available.
Additional information
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Jiao, B., Wei, M., Niu, G. et al. Global phylogeny and taxonomy of Artemisia. Nat Commun 16, 8648 (2025). https://doi.org/10.1038/s41467-025-64039-0
DOI: https://doi.org/10.1038/s41467-025-64039-0