Genome of the early spider-orchid Ophrys sphegodes provides insights into sexual deception and pollinator adaptation

Russo, Alessia; Alessandrini, Mattia; El Baidouri, Moaine; Frei, Daniel; Galise, Teresa Rosa; Gaidusch, Lara; Oertel, Hannah F.; Garcia Morales, Sara E.; Potente, Giacomo; Tian, Qin; Smetanin, Dmitry; Bertrand, Joris A. M.; Onstein, Renske E.; Panaud, Olivier; Frey, Jürg E.; Cozzolino, Salvatore; Wicker, Thomas; Xu, Shuqing; Grossniklaus, Ueli; Schlüter, Philipp M.

doi:10.1038/s41467-024-50622-4

Article
Open access
Published: 26 July 2024

Nature Communications volume 15, Article number: 6308 (2024)

Abstract

Pollinator-driven evolution of floral traits is thought to be a major driver of angiosperm speciation and diversification. Ophrys orchids mimic female insects to lure male pollinators into pseudocopulation. This strategy, called sexual deception, is species-specific, thereby providing strong premating reproductive isolation. Identifying the genomic architecture underlying pollinator adaptation and speciation may shed light on the mechanisms of angiosperm diversification. Here, we report the 5.2 Gb chromosome-scale genome sequence of Ophrys sphegodes. We find evidence for transposable element expansion that preceded the radiation of the O. sphegodes group, and for gene duplication having contributed to the evolution of chemical mimicry. We report a highly differentiated genomic candidate region for pollinator-mediated evolution on chromosome 2. The Ophrys genome will prove useful for investigations into the repeated evolution of sexual deception, pollinator adaptation and the genomic architectures that facilitate evolutionary radiations.

Introduction

Understanding the genetic mechanisms of adaptation to pollinators is a central question in plant evolutionary biology. The prominent role of pollinators in flower evolution is related to their dual function: enabling sexual reproduction and imposing selection on floral traits^1,2. This is particularly evident in plant species with a specialised pollination mechanism associated with floral traits that evolved to attract one or a few functionally alike pollinator species^3,4. Therefore, pollinator-mediated evolution of floral traits is considered a major force driving angiosperm diversity by contributing to their radiation^5,6. It has been hypothesised that radiations are facilitated by entering a new ecological niche with little or no competition from similar species (ecological opportunity), and the genetic potential allowing the necessary adaptations to evolve (genetic variation), ultimately leading to rapid bursts of speciation^7,8,9.

Orchids of the Euro-Mediterranean genus Ophrys ensure reproduction through sexual deception. Specifically, they mimic the olfactory, visual and tactile signals of females of their pollinating insects to entice conspecific males to pseudocopulate with the flower, leading to pollination (Fig. 1a, b). Among the flower traits adapted to pollinators, olfactory signals are pivotal to specific pollinator attraction^10,11, with selection by pollinators¹² leading to strong odour differentiation among closely related species¹³. Additional adaptations to pollinators likely involve flower labellum colour, which matches pollinator body coloration, and floral morphology that optimises pollen transfer¹³. At the same time, conspicuous UV-reflective patterns and odour compounds not primarily required for sexual attraction are highly variable between plants and likely aid male pollinators in memorising and avoiding plants, thereby increasing outcrossing rates^13,14,15. This extreme specialisation of Ophrys floral traits makes the plant-insect interaction highly species-specific, and the lack of shared pollinators between species leads to strong premating reproductive isolation^12,13. Notwithstanding some uncertainty about the number of species and the extent of pollinator sharing^16,17, at the local population level extreme pollinator specialisation is evident in Ophrys species^12,13,18. The genus Ophrys is of relatively recent origin (c. 4.9 Ma crown age), with the earliest-diverged Ophrys lineage likely having been wasp-pollinated^19,20. Two independent Ophrys lineages later experienced a pollinator shift to Andrena bee pollinators, preceding a burst of speciation in the O. sphegodes group within the last million years¹⁹. This resulted in a species radiation with one of the highest reported diversification rates among angiosperms¹⁹. 
While several facets of Ophrys speciation are relatively well understood^12,13,19, the genomic features allowing its adaptation to diverse pollinators are largely unknown. Our ability to elucidate how genome architecture has contributed to Ophrys diversification and adaptation to pollinators has hitherto been hampered by the lack of a reference genome.

**Fig. 1: Pollination and genome of *Ophrys sphegodes*.**

Here, we present the chromosome-level genome assembly of Ophrys sphegodes, a key representative of the genus and the adaptive radiation, to address questions on the genetic mechanisms underlying such rapid pollinator-driven speciation.

Results and discussion

Chromosome-level genome assembly

The orchid Ophrys sphegodes Mill. has a karyotype of 2n = 2x = 36 ^12,21,22. The haploid genome size was estimated to be 4.83 Gb by flow cytometry, and heterozygosity was estimated at 1.28% via k-mer analysis (Supplementary Figs. 1 and 2, Supplementary Table 1). To assemble the genome, we generated a total of 409 Gb data on the Nanopore PromethION platform (Supplementary Table 2, Supplementary Fig. 3). Additionally, whole-genome Illumina sequencing data (WGS; 268 Gb, Supplementary Table 3) and Hi-C chromatin conformation capture libraries (Illumina, Supplementary Table 3) were produced to perform polishing and anchoring of scaffolds, respectively. We used Miniasm assembly²³ to generate a total of 11,148 contigs in 6.4 Gb, with an N50 value of 754 kb (Supplementary Table 4, Supplementary Fig. 4). We removed under-collapsed heterozygous contigs (1.2 Gb) and used Hi-C data to anchor the scaffolds into 18 pseudomolecules corresponding to the 18 chromosomes expected for the haploid genome (Fig. 1c, Supplementary Fig. 5a; Supplementary Table 5). The final assembled genome size was 5.2 Gb, in line with the estimated genome size, with a scaffold N50 of 218 Mb, L50 of 10 and L75 of 17 (Supplementary Table 4). Overall, 97.8% of raw Illumina WGS reads and 98.1% of an independent Pacific Biosciences (PacBio) dataset (not used for assembly; Supplementary Table 6) could be mapped to the assembly, suggesting that our assembly contains the complete genetic information. Gene region completeness was assessed using BUSCO (Benchmarking Universal Single-Copy Orthologs)²⁴: 1453 out of 1612 (91.1%) conserved core land plant genes were found in our assembly, of which 1361 (84.9%) were complete (Supplementary Fig. 6a). 
We confidently annotated a total of 42,549 protein-coding genes, of which 90.0% had RNA-seq support or functional annotation (see Supplementary Tables 7, 8, Supplementary Note 1-2 and Supplementary Data 1–4 for genes of interest), which included information on Gene Ontology (GO) terms, protein domain information, putative pathways and enzyme function. Cytosine methylation was inferred from Nanopore sequencing data (Fig. 1c, Supplementary Figs. 7, 8).
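
The contiguity statistics quoted above (N50, L50, L75) can be illustrated with a minimal sketch. The function name `nx_lx` and the toy lengths are hypothetical and are not taken from the assembly pipeline; the sketch only shows the usual definition, in which Nx is the length of the shortest sequence in the smallest set of longest sequences that together cover a fraction x of the assembly, and Lx is the number of sequences in that set.

```python
def nx_lx(lengths, fraction=0.5):
    """Return (Nx, Lx) for a list of contig or scaffold lengths.

    Nx: length of the sequence at which the running total of the
        length-sorted sequences first reaches `fraction` of the assembly.
    Lx: number of sequences needed to reach that fraction.
    """
    ordered = sorted(lengths, reverse=True)
    target = fraction * sum(ordered)
    running = 0
    for count, length in enumerate(ordered, start=1):
        running += length
        if running >= target:
            return length, count
    raise ValueError("lengths must be non-empty")


# Hypothetical toy assembly (arbitrary units), not the O. sphegodes data.
toy = [300, 250, 200, 150, 100]
n50, l50 = nx_lx(toy)          # half of 1000 is reached with the 2nd sequence
n75, l75 = nx_lx(toy, 0.75)    # three quarters are reached with the 3rd
print(n50, l50, n75, l75)
```

Applied to real scaffold lengths, a low L50 relative to the chromosome number is what one would expect from a chromosome-scale assembly.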

A burst in transposon activity preceded adaptive radiation

We manually characterised transposable elements (TEs) in the genome following a protocol²⁵ for TE classification that characterises TEs through a combination of homology-based prediction and manual inspection to find structural motifs and define sequence boundaries. To this end, we created a species-specific database of TEs for this orchid genome, containing a total of 436 sequences specific to O. sphegodes. Using this database, we identified a total of 4.05 Gb of repetitive elements, occupying 78% of the O. sphegodes genome (Fig. 1c, Supplementary Table 9). Overall, O. sphegodes represents the largest orchid genome assembled to date and exhibits the highest abundance of long terminal repeat (LTR) elements (75% of the genome), more than Vanilla planifolia (10%)²⁶, Apostasia shenzhenica (17%)²⁷, Dendrobium catenatum (40%)²⁷, Phalaenopsis equestris (44%)²⁷, Gastrodia elata (55%)²⁸, Cymbidium sinense (55%)²⁹, and similar to Platanthera guangdongensis and P. zijinensis (73% and 72%³⁰; Supplementary Table 10). TE activity is known to influence genome size variation in eukaryotes³¹, and LTR/Gypsy and LTR/Copia in particular have previously been shown to correlate with genome size expansion in orchids^32,33. We conducted an analysis of LTR insertion age based on the idea that both LTR sequences of a TE are identical at the time of insertion, but will diverge over time as mutations accumulate. In O. sphegodes, analysis of recent LTR insertions showed that LTR activity had an initial increase at around 3 Ma ago, to reach its maximum at around 1.3 to 0.8 Ma (Fig. 1d). During this period, the Mediterranean Basin experienced climatic oscillations with glacial/interglacial periods^34,35. It is conceivable that such environmental disturbances^36,37,38 might have led to bursts of TE proliferation in O. sphegodes, thus inflating its genome size.
Interestingly, the peak of LTR element insertions precedes or overlaps with the radiation of the most species-rich Ophrys lineage, including the O. sphegodes species group, less than 1.0 Ma ago¹⁹. The ability to cope with genome size changes has allowed angiosperms to successfully diversify³⁹ and TEs have played an important role in enhancing angiosperm evolution⁴⁰ through their effects upon gene expression³⁷, as well as gene duplications and genomic rearrangements^41,42. Since TEs often carry transcription factor binding sites, TE expansion can rewire existing metabolic networks and facilitate the evolution of new compounds (or mixtures), as was shown for the evolution of nicotine biosynthesis in tobacco⁴³. Thus, TE bursts can contribute to the generation of intraspecific genetic and metabolic diversity⁴⁴. Since changes in pollinator-attractive hydrocarbon compounds are suspected to be involved early in speciation in the O. sphegodes lineage¹³, it is tempting to speculate that TE bursts may have provided this lineage with the genetic capacity to adapt to a new pollinator niche (i.e., Andrena bees), thereby facilitating the adaptive radiation of the O. sphegodes group.

Genome evolution through chromosome fusions in the Ophrys lineage

To understand the evolutionary history of the Ophrys lineage, we constructed a phylogenomic tree and estimated divergence times across O. sphegodes and 20 other plant species with fully sequenced genomes, based on single-copy orthologues. Ophrys diverged from Platanthera (both from Orchidoideae subtribe Orchidinae), the most closely related orchid with a fully sequenced genome^30,45,46, approximately 22.42 Ma ago with a 95% confidence interval (CI) of 20.77–23.95 Ma, and Orchidoideae separated from other orchids around 54.49 Ma (CI 53.00–56.15 Ma; Fig. 2a, Supplementary Fig. 9). Our analysis further suggests that Orchidaceae separated from the common ancestor of Asparagales approximately 99.96 Ma ago (CI 98.66–105.66 Ma) and places monocot/eudicot divergence around 147.51 Ma (CI 139.87–154.45 Ma), in line with previous studies^47,48. Our age estimates for orchids, while in line with results by Kim et al.⁴⁹, are younger when compared with previous orchid genome studies^27,30.

**Fig. 2: Evolutionary relationships among orchid genomes.**

To track chromosome evolution of O. sphegodes, we compared it with the most closely related sequenced orchid genomes³⁰, focusing on the comparison with Platanthera zijinensis (Fig. 2b). The two genera differ in their chromosome numbers, with karyotype organisation of Platanthera (n = 21)⁵⁰ reflecting the ancestral and Ophrys (n = 18) the derived state^45,51. Overall, most chromosomes maintained their structure between Platanthera and Ophrys, but some major rearrangements are apparent, particularly with regard to chromosome fusions. Chromosome (chr) 4 in O. sphegodes appears to be the product of a fusion between chr 7 and part of chr 4 in Platanthera, and Ophrys chr 10 derives from a fusion between Platanthera chr 15 and part of chr 14. Moreover, Platanthera chr 8 has no homologous chromosome in Ophrys, and significant parts of Platanthera chr 3, 4, 11, 14, 20 and 21 lack syntenic regions in O. sphegodes (Fig. 2b and Supplementary Fig. 10). Taken together, these findings are consistent with a reduction in chromosome number via fusions in the Ophrys lineage.

Expanded gene families include genes involved in flower development

Sexual deception is not restricted to the Euro-Mediterranean genus Ophrys, but is a worldwide phenomenon. It has originated several times independently and there are many examples among Australian orchids, whereas only a few non-orchid cases, such as the South African daisy Gorteria diffusa⁵², are known. Although specific pollinator interactions mediated by floral chemistry are a common theme in sexual deception^53,54, it remains unknown why this pollination strategy occurs predominantly among orchids and what allowed its repeated evolution in this family. To gain insights into the genomic basis of sexual deception in Ophrys, we first identified orthologous gene families using OrthoFinder⁵⁵. We identified a total of 495,819 genes in 26,709 orthogroups among the 21 species, of which 3054 are shared among all species. A total of 1351 families containing 4537 genes were unique to O. sphegodes. We then identified expanded and contracted gene families using CAFE⁵⁶ (Fig. 2a). In O. sphegodes, 3712 gene families underwent an expansion, whereas 756 underwent a contraction. This is the highest level of gene family expansion reported among orchids to date, followed by V. planifolia (+3248) and D. catenatum (+1817). Of those gene families, 291 and 59 exhibited significant (p-value ≤ 0.01) expansions and contractions, respectively. Among the significantly expanded gene families were genes encoding transcription factors (TFs) involved in plant reproduction and flower development (but also other processes, e.g., stress responses), such as MADS-domain TFs (55 genes encoding type I and 6 genes of type II MADS; Supplementary Figs. 11, 12), MYB TFs (73 genes, of which 32 MYB-P; Supplementary Fig. 13), LATERAL ORGAN BOUNDARIES (LOB) domain TFs (10 genes), C2C2-GATA TFs (12 genes), WRKY TFs (11 genes) involved for instance in trichome development⁵⁷, SERINE CARBOXYPEPTIDASE-LIKE-1 (SCPL-1) proteins (15 genes) controlling anthocyanin acylation⁵⁸, TCP TFs (14 genes) regulating flavonoid biosynthesis and floral symmetry^59,60, and YABBY TFs (6 genes) involved in establishing adaxial-abaxial polarity⁶¹. The configuration of MADS-box genes putatively involved in perianth specification in Ophrys appears similar to other orchids (Supplementary Fig. 12), including the related Orchis italica⁶², suggesting that the stark difference in their flowers likely results from the action of downstream genes. Secondly, disease resistance-related and stress response genes were found, such as NAC TFs (30 genes), glutathione S-transferases (28 genes) and plant PLEIOTROPIC DRUG RESISTANCE (PDR) proteins (10 genes). Finally, two previously identified putative candidate gene families for floral odour production and anthocyanin biosynthesis⁶³ showed significant expansion too, namely fatty acyl-CoA reductases (FARs; Fig. 3) and chalcone synthases (CHSs; Supplementary Fig. 14), also involved in defence response (29 and 14 genes, respectively).

**Fig. 3: Summary of the putative hydrocarbon biosynthesis pathway in *O. sphegodes*.**

Plant adaptation to pollinators in Ophrys may involve the local duplication of candidate genes

The key component for pollinator attraction in Ophrys species is the chemical mimicry of the pollinator female’s sex pheromone, the composition of which has previously been characterised for O. sphegodes s.l.^10,64. Alkene hydrocarbons are especially important, as different proportions of (Z)−12-alkenes, (Z)−9-alkenes and (Z)−7-alkenes are the major odour differences between O. sphegodes and the closely related O. exaltata, responsible for attracting different pollinators^65,66. Thus, genes involved in hydrocarbon biosynthesis are likely important for pollinator attraction⁶³. We annotated previously identified candidate genes⁶³ in the genome of O. sphegodes (Fig. 3, Supplementary Figs. 14 and 15, Supplementary Table 7, Supplementary Note 1 and Supplementary Data 1). Among these, structural annotation of stearoyl-ACP desaturases (SADs) showed that the key genes, SAD1 and SAD2 (SAD2-type), are duplicated in tandem in a single cluster on chromosome 4 (283.17–283.30 Mb; containing four copies; Fig. 4d), whereas the house-keeping desaturase SAD3 ⁶⁷ resides in one copy in chromosome 5 (280.45–280.47 Mb) and SAD4 (SAD5-type) is also present as a single copy (scaffold 75: 1.04–1.11 Mb; Supplementary Table 7 and Supplementary Data 1). For other O. exaltata SAD5-type genes, we identified four full-length copies in the O. sphegodes genome (at least two of them on the same scaffold 210), although none of these appeared to be a functional copy (Supplementary Table 7 and Supplementary Data 1). This is in line with previous findings that O. sphegodes only expresses functional SAD2-type alleles, while functional SAD5-type alleles are expressed in O. exaltata^66,67. It is also consistent with gene expression patterns revealed by RNA-seq data (Supplementary Fig. 16). Phylogenetic analysis confirms the presence of separate SAD gene lineages corresponding to SAD3, SAD2-type (SAD1/2/7/8), and SAD5-type (SAD4/5/6/9/10, an incomplete SAD11) genes (Fig. 4a). 
Both SAD2- and SAD5-type lineages appear as single copies in Platanthera, suggesting they underwent recent gene duplication events. Of note, SAD2 homologues were only present in orchid genomes from the subfamily Orchidoideae.

**Fig. 4: Phylogenetic analyses of three gene families showing gene duplications.**

Like SADs, membrane-bound fatty acid desaturases (FADs)⁶⁸, which contribute to fatty acid desaturation and, potentially, alkene production⁶⁹, were also found duplicated in the O. sphegodes genome (Fig. 3). We found a phylogenetic lineage containing four FAD gene copies, of which at least three reside on chromosome 1 (368.58–368.63 Mb; Fig. 4e), a second lineage of two copies clustered on scaffold 33, while one gene, FAD2, was not duplicated. Both lineages showing gene duplications are present as single copies in the Platanthera genome (Fig. 4b).

Fatty acyl-CoA reductase (FAR) homologues show gene duplications in Ophrys. FARs likely catalyse the conversion of fatty acyl-CoA to primary alcohols and different FARs produce fatty alcohols with different acyl chain lengths⁷⁰. Fifteen out of the 16 FAR homologues found in the O. sphegodes genome form a phylogenetic Orchidoideae gene clade together with a single Platanthera sequence. Of these, 14 genes are distributed over only three scaffolds (Fig. 4c; chr 5: 1; scaffold 279: 4; scaffold 133: 4; scaffold 578: 6). Six of these clustered FAR genes reside in one region of 109 kb on scaffold 578 (Fig. 4f). The close vicinity of these copies in a unique clade suggests a recent duplication event in the O. sphegodes lineage (Fig. 4c). Gene duplication plays a crucial role in shaping the evolutionary landscape of genomes, as it provides the main raw material for the evolution of new genes⁷¹. Single or tandem gene duplications are also involved in the origin of many plant genes⁷². Often, retention of duplicate genes occurs non-randomly, as changes in the concentration of gene products can have a selective advantage for the organism⁷³. These genes may provide Ophrys with an opportunity to respond to selection by pollinators, e.g., through positive dosage effects or neofunctionalisation and pseudogenisation of the less effective variants⁷⁴.

Population genomic analyses reveal putative barrier loci under pollinator-driven selection

Closely related Ophrys species provide plausible examples of pollinator-driven speciation^13,18. We therefore investigated the genetic differentiation between O. sphegodes and three other closely related, sympatric co-flowering species, O. exaltata, O. garganica and O. incubacea in Gargano, Southern Italy^12,13. These four species are pollinated by sexual deception of different solitary bee males: Andrena nigroaenea, Colletes cunicularius, A. pilipes and A. morio, respectively⁷⁵. These Ophrys species show variation in floral traits, ranging from labellum coloration (markedly blacker in O. garganica and O. incubacea¹³), to different floral odours mimicking their pollinators’ sex pheromones, which make a major contribution to reproductive isolation^12,13,66,76. We used Genotyping-By-Sequencing (GBS) data¹³ to investigate the genetic differentiation between species and scanned the genome for signatures of selection. Chord between-population genetic distance⁷⁷ was calculated per 1 Mb window to identify the most similar/dissimilar species at a given chromosomal region in the O. sphegodes genome. This analysis revealed that segregating polymorphisms between species are distributed across the genome, and that overall genetic (dis)similarity between O. sphegodes and the three other species is roughly equal (Fig. 5a, Supplementary Fig. 17), as may be expected in a species radiation. Yet, cumulatively, the four species were clearly separable in a principal coordinate analysis (PCoA; Fig. 5b; Supplementary Fig. 18 for individual chromosomes). These findings are in line with previous population analyses, suggesting that many polymorphisms in the genome are shared among all species¹³, whilst few barrier loci may separate them. Furthermore, global F_ST outlier analysis (FDR < 0.01) of a GBS dataset from 126 individuals¹³ revealed a highly differentiated region in chr 2 (333–352 Mb; Figs. 1c and 5c; Supplementary Fig. 19).
Interestingly, in contrast to the genome-wide pattern, O. sphegodes and O. exaltata in particular are clearly separated in the highly differentiated ~20 Mb region on chr 2, as seen by PCoA of this region (Fig. 5d). In an independent F_ST analysis based on RNA-seq data, this genomic region was confirmed as being differentiated between these species, although this dataset offers sparser sampling of the genome and yielded a somewhat larger interval (327–358 Mb, 31 Mb in length; 0.6% of the genome; Supplementary Fig. 19 and Supplementary Data 5). Whether this pattern of differentiation is due to divergent selection and suppression of effective recombination via, e.g., divergence hitchhiking or an inversion, is currently unknown. Since the differentiated region of chr 2 did not contain any a priori candidate genes for hydrocarbon or pigment biosynthesis (Supplementary Table 7 and Supplementary Data 1, 2), we performed a GO enrichment analysis (Supplementary Data 6). This revealed term GO:1903415 (‘flavonoid transport from endoplasmic reticulum to plant-type vacuole’) as the most significantly enriched ‘biological process’ category. Three genes (Osph2G68830, Osph2G68850 and Osph2G68960) with this term are homologues of Arabidopsis TT9 (AT3G28430), a peripheral membrane protein necessary for vacuole development and the accumulation of anthocyanins in the vacuole⁷⁸. Additionally, we screened genes in this region for (i) elevated F_ST between O. exaltata and O. sphegodes, (ii) annotated transcription factors, (iii) genes with excess amino acid change (i.e., excess non-synonymous nucleotide diversity), and (iv) differential floral labellum gene expression between the two species (Supplementary Fig. 19). Only 7 genes were found in more than one screen (Supplementary Fig. 19d, Supplementary Data 7), three of which had useful annotation information that revealed one AP2 and one B3-ARF TF (Osph2G66470 and Osph2G63210, respectively) as well as one ubiquitin conjugating enzyme (Osph2G63100). While it remains to be established whether any of these genes are directly linked to differences between the two Ophrys species, currently it seems more likely that regulatory rather than biosynthetic genes contribute to this genomic region of differentiation. Overall, using the metaphor of genomic islands of speciation⁷⁹, these results suggest that adaptation involving individual loci (or small genomic segments), rather than entire genomes, characterises these reproductively isolated sympatric Ophrys species. The overall high level of allele sharing and the genomic mosaic pattern of species relationships (without long contiguous stretches of similar relationships) cannot easily be explained by gene flow after a secondary contact of species that are separated by strong floral isolation¹². An alternative explanation could be that the high level of segregating polymorphisms is due to shared ancestry and large effective population sizes, suggesting that the species are in an early stage of genomic divergence^11,12,65,79. Our population genomic analyses identified candidate regions potentially under pollinator-driven selection, thus calling for future research into the roles and functions of the genes in these regions.
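
The windowed chord-distance scan described in this section can be sketched as follows. This is only an illustration under stated assumptions: it uses one common formulation of the Cavalli-Sforza and Edwards chord distance for biallelic loci, which may differ in detail from the implementation behind ref. 77, and the function names, the input layout (position, allele frequency in each of two populations) and the example values are all hypothetical.

```python
import math
from collections import defaultdict


def chord_distance(p1, p2):
    """Chord distance at one biallelic locus.

    p1, p2: frequency of the same allele in population 1 and 2.
    Assumed formulation: (2 / pi) * sqrt(2 * (1 - sum_i sqrt(x_i * y_i))).
    """
    cos_theta = math.sqrt(p1 * p2) + math.sqrt((1 - p1) * (1 - p2))
    cos_theta = min(1.0, cos_theta)  # guard against rounding slightly above 1
    return (2 / math.pi) * math.sqrt(2 * (1 - cos_theta))


def windowed_chord(snps, window=1_000_000):
    """Average per-locus chord distance in non-overlapping windows.

    snps: iterable of (position_bp, freq_pop1, freq_pop2) on one chromosome.
    Returns {window_start_bp: mean chord distance}.
    """
    sums = defaultdict(lambda: [0.0, 0])
    for pos, p1, p2 in snps:
        start = (pos // window) * window
        sums[start][0] += chord_distance(p1, p2)
        sums[start][1] += 1
    return {start: total / n for start, (total, n) in sorted(sums.items())}


# Hypothetical SNPs: one fixed difference, two with identical frequencies.
example = [(100, 1.0, 0.0), (500_000, 0.5, 0.5), (1_200_000, 0.2, 0.2)]
per_window = windowed_chord(example)
print(per_window)
```

Comparing such per-window values between O. sphegodes and each of the other species would indicate which species is most similar in a given region, which is the kind of comparison summarised in Fig. 5a.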

**Fig. 5: GBS-based population genetic analysis between four sympatric *Ophrys* species.**

The chromosome-level genome sequence of O. sphegodes, a plant with high pollinator specificity, provides important insights into the evolution of plant adaptation to pollinators and its role in species diversification. It seems that, ecologically, the recent adaptive radiation of this group was fuelled by the availability of pollinator niches and, genetically, a burst in TE activity and rampant gene duplication provided the genetic raw material for pollinator-mediated selection to act upon. These mechanisms may also provide a blueprint for other angiosperm radiations, as floral trait-mediated reproductive isolation may often be underlain by a simple genetic architecture². More broadly, this case study supports the idea that the generation of genomic diversity, as genomic potential in the form of genome duplication, hybridisation or TE activity, often precedes adaptive radiation.

Methods

Sample preparation

The single O. sphegodes individual for the genome assembly (accession SPH_8) was selected among several samples previously collected in Capoiale, Gargano area, Southern Italy¹², and grown in a pot at the Dept. of Biology, University of Naples Federico II, under natural light at ambient temperature. Fresh young leaves were harvested, snap-frozen, and high-molecular-weight genomic DNA was isolated with an SDS lysis buffer supplemented with β-mercaptoethanol, followed by purification via phenol/chloroform extraction and carboxylated magnetic beads⁸⁰.

Genome size estimation

Nuclear DNA content and thus genome size was estimated via flow cytometry. Since genome size estimation via flow cytometry from orchid leaves is challenging due to DNA endoreduplication⁸¹, we used pollinia (carrying haploid pollen) for the analysis. We followed a previously published protocol⁸² with slight modifications. Briefly, a pair of pollinia was crushed in a 1.5 ml tube with Otto I buffer using a clean pestle, transferred to a Petri dish, and co-chopped with 2 × 2 cm tissue of a reference leaf (Solanum lycopersicum cultivar ‘Stupicke polni tyckove rane’; tomato 1C = 0.98 pg, as measured in ref. ⁸³). The suspension was filtered, mixed with Otto II buffer, and stained with propidium iodide in the dark at 4 °C for 1 hour. At least 10,000 nuclei were analysed on a Cytoflex S (Beckman Coulter) flow cytometer. The average fluorescence value (Mean PerCP-A, see Supplementary Fig. 1) of reference 2C nuclei was 4,207,840; the average fluorescence for O. sphegodes 1C nuclei was 10,506,736 (Supplementary Table 1). Thus, following a published formula⁸⁴, and converting picograms of DNA to number of nucleotide pairs⁸⁵, we estimated a haploid genome size of 4.83 Gb (average of three measurements, see Supplementary Table 1).
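
As a rough check of the arithmetic, the calculation can be sketched from the mean fluorescence values quoted above. The sketch assumes that the published formula⁸⁴ is the usual ratio of sample to reference fluorescence scaled by the reference DNA content, and that the conversion⁸⁵ is the widely used 1 pg ≈ 0.978 × 10⁹ bp; the function and constant names are hypothetical.

```python
# Assumed standard conversion: 1 pg of DNA corresponds to about 0.978e9 bp.
PG_TO_BP = 0.978e9


def haploid_genome_size_bp(sample_1c_fluor, ref_2c_fluor, ref_1c_pg):
    """Estimate the 1C genome size (bp) of a sample from flow cytometry peaks.

    sample_1c_fluor: mean fluorescence of the sample's 1C (pollen) nuclei.
    ref_2c_fluor:    mean fluorescence of the reference's 2C (leaf) nuclei.
    ref_1c_pg:       1C DNA content of the reference in picograms.
    """
    ref_2c_pg = 2 * ref_1c_pg  # the reference peak is 2C
    sample_1c_pg = ref_2c_pg * sample_1c_fluor / ref_2c_fluor
    return sample_1c_pg * PG_TO_BP


size_bp = haploid_genome_size_bp(
    sample_1c_fluor=10_506_736,  # O. sphegodes 1C nuclei
    ref_2c_fluor=4_207_840,      # tomato 2C nuclei
    ref_1c_pg=0.98,              # tomato 1C DNA content
)
print(f"{size_bp / 1e9:.2f} Gb")
```

Under these assumptions the quoted mean values give roughly 4.79 Gb, close to the reported 4.83 Gb; the small difference would be expected if the reported figure averages three separately computed measurements.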

Nanopore library construction, sequencing, and genome assembly

DNA isolated from accession SPH_8 was used to prepare two Illumina and eight ONT libraries (Supplementary Table 2), following the general guidelines provided by Oxford Nanopore Technologies® for the 1D Genomic DNA by Ligation (SQK-LSK109) protocol, with modifications proposed by New England Biolabs® (NEB)⁸⁰. Six ONT libraries were sequenced on a PromethION PTC0031 platform (Oxford Nanopore Technologies), and two on MinION Mk1B. All sequencing was performed at Agroscope Wädenswil, Switzerland. We used Miniasm v0.3 and Minimap2 v2.17²³ to assemble the initial contigs. Pilon v1.23⁸⁶ and Racon v1.4.3⁸⁷ were used to correct indels and mis-assemblies with Illumina reads (Supplementary Table 3 and Supplementary Method 1). To deal with the heterozygosity of the O. sphegodes genome, we used Redundans v0.11⁸⁸ to remove under-collapsed contigs and obtain a final assembled genome size of 5.2 Gb (see Supplementary Method 1). Raw Nanopore sequencing data were subsequently re-basecalled with Guppy v5.0.11, and Nanopolish v0.13.3 was used to detect 5-methylcytosine (see Supplementary Method 2).

Hi-C library preparation and assembly scaffolding

A total of 0.5 g of leaf tissue was fixed in nuclei isolation buffer and 36% formaldehyde, followed by cell lysis, chromatin digestion with the enzyme HindIII-HF (New England Biolabs, Ipswich, MA, USA, R3104L), re-ligation, DNA re-extraction and library preparation^89,90 as detailed in Supplementary Method 1. Four Hi-C libraries were sequenced on an Illumina NovaSeq 6000 SP FlowCell. To assemble the chromosomes, raw Hi-C data were mapped against the genome assembly using the ArimaGenomics mapping script (https://github.com/ArimaGenomics/mapping_pipeline/blob/master/01_mapping_arima.sh). Subsequently, Salsa v2.3^91,92 was used to build scaffolds (total scaffolds 2511, N50 = 4.8 Mb); Juicebox Assembly Tools (JBAT)⁹³ and the 3D-DNA pipeline v-180114⁹⁴ were used to order scaffolds to chromosomes.

Gene and non-coding RNA annotation

Gene annotation was carried out via an integrative approach that included de novo gene prediction, homology-based prediction and transcriptome-based prediction. AUGUSTUS v3.4.0 was used as part of the BRAKER2 v2.1.6 pipeline⁹⁵ and trained with RNA-seq data²⁰ from floral tissues to predict coding regions in the repeat-masked genome. ProtHint v2.6.0 (https://github.com/gatech-genemark/ProtHint) was used to generate protein hints using the protein database liliopsida_odb10-v.2020-09-10 from OrthoDB10, to score intron intervals, start and stop codons from ultra-conserved proteins of the monocot lineage. This extrinsic evidence was given to GeneMark-EX v4.64⁹⁶ to self-train, and improve prediction accuracy of AUGUSTUS. Proteomes from P. equestris²⁷, C. goeringii⁹⁷, O. sativa ssp. japonica cultivar Nipponbare IRGSP-1.0⁹⁸, and A. officinalis Aspof.V1⁹⁹ were downloaded and used as input for GeMoMa v1.8¹⁰⁰ for homology-based prediction. The transcriptome of O. sphegodes²⁰ was mapped against the genome using PASA v2.5.1 (Program to Assemble Spliced Alignment; https://github.com/PASApipeline/PASApipeline/blob/master/docs/index.asciidoc). Valid transcript alignments were clustered based on genome mapping location and assembled into gene structures. PASA assemblies were incorporated into gene predictions to correct exon boundaries, add UTRs, and update gene structures. AHRD v3.3.3 (https://github.com/groupschoof/AHRD) was used to filter genes based on SwissProt, trEMBL and TAIR10 databases, and candidate genes for pollinator attraction from previous studies were manually included. Functional annotation was carried out using TRAPID v2.0¹⁰¹ for Gene Ontology (GO) terms, InterPro-Scan, KEGG, and iTAK Classifier¹⁰² for transcription factors and protein kinases. Genes for tRNA and rRNA were identified using tRNAscan-SE v2.0.9¹⁰³ and Barrnap v0.9 (https://github.com/tseemann/barrnap; see Supplementary Method 3 for details).

Repetitive element annotation

We used a published approach²⁵ to classify and manually annotate repetitive elements in the genome. Putative TEs identified with RepeatModeler v2.0.1 were blasted against the TRansposable Elements Platform (TREP) database (v.2019, https://trep-db.uzh.ch/). Sequences with strong hits to a known family were selected (plus their flanking regions) and blasted against the O. sphegodes genome to identify other copies belonging to the same TE family. Multiple sequence alignments between those clusters of sequences were performed to create consensus TE sequences, and to search for structural motifs that could help classify them. Sequences with no strong hits to a known family were blasted against PTREP (Protein TRansposable Elements Platform) or Dfam (https://dfam.org) or examined for the presence of structural motifs to identify TEs in the genome. All TE sequences were classified according to a three-letter code²⁵.

Analysis of LTR divergence and insertion time

LTR_harvest v2.0.0¹⁰⁴ was used to identify LTRs with complete structure (-xdrop 37 -motif tgca -motifmis 1 -minlenltr 100 -maxlenltr 3000 -mintsd 2), expected to be detectable for young LTR copies. The insertion time was estimated based on the sequence divergence between the two LTRs of the same element. The 5′ and 3′ LTRs of each full-length paralogue were aligned using MAFFT¹⁰⁵, and the divergence between them was estimated under the Kimura two-parameter model using the distmat program implemented in the EMBOSS package v6.6.0 (https://emboss.sourceforge.net/). LTR insertion age was estimated using a substitution rate of 1.3 × 10⁻⁸ substitutions per site per year¹⁰⁶.
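
As an illustration of the dating step, the sketch below computes a Kimura two-parameter distance K between two aligned LTR sequences and converts it to an insertion age. It assumes the commonly used relation T = K / (2r), in which the factor 2 accounts for both LTRs accumulating substitutions independently since insertion; this relation is not spelled out in the text above. The function names are hypothetical, and the code is not the distmat implementation used in the study.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def k2p_distance(seq1, seq2):
    """Kimura two-parameter distance between two aligned sequences.

    Columns containing gaps or ambiguous bases are skipped.
    """
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    transitions = transversions = compared = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue
        compared += 1
        if a == b:
            continue
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1
        else:
            transversions += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    p = transitions / compared    # proportion of transitions
    q = transversions / compared  # proportion of transversions
    return -0.5 * math.log((1 - 2 * p - q) * math.sqrt(1 - 2 * q))


def insertion_time_years(distance, rate=1.3e-8):
    """Insertion age in years, assuming T = K / (2 * r)."""
    return distance / (2 * rate)


# Toy 100-bp alignment (hypothetical): two transitions, one transversion.
ltr5 = "A" * 100
ltr3 = "GG" + "C" + "A" * 97
k = k2p_distance(ltr5, ltr3)
print(k, insertion_time_years(k))
```

With the rate given above, a divergence of 0.026 substitutions per site would correspond to an insertion roughly one million years ago under this relation.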

Phylogenomic analysis and estimation of divergence time

The protein-coding regions of 21 plant species with a sequenced genome (Supplementary Tables 11 and 12) were used for phylogenomic analysis. Orthologue groups were identified using OrthoFinder v2.5.4⁵⁵, using MAFFT for multiple sequence alignment and FastTree¹⁰⁷ for building the trees. Among 526,955 genes from the 21 species, 495,819 (94.1%) were assigned to 26,709 orthologue groups. The concatenated supermatrix of 34 single-copy nuclear genes was used to infer phylogenetic relationships using the GTR-GAMMA model with 1000 bootstrap replicates in RAxML v8.2.11¹⁰⁸. The same supermatrix and corresponding concatenated ML tree were used for dating analysis. We estimated divergence times using the penalised likelihood method implemented in treePL v1.0¹⁰⁹. We considered 83.78 Ma and 114.92 Ma as the minimum- and maximum-age calibrations of the stem age of orchids⁴⁹. We used an optimal smoothing parameter determined by the “random subsample and replicate” cross-validation method to accommodate rate heterogeneity. A total of 1000 bootstrap replicates with the topology fixed to the concatenated ML tree and branch lengths allowed to vary were also generated using RAxML for calculating the confidence intervals of age estimates. Results from the dating of the bootstrapped trees were then summarised and visualised on the concatenated ML tree using TreeAnnotator (part of the BEAST2 v2.7.0 package¹¹⁰).

We used CAFE v5.0.0⁵⁶ to identify gene family contractions and expansions, using the reconstructed species tree. Only gene families in which the among-species difference in size was less than 100 were kept. First, nucleotide sequences of the previously identified single-copy orthologues were translated into amino acids with TranslatorX v1.1 within the MitoPhAST-master v3.0 package¹¹¹, and then given as input to MUSCLE v5.1.0¹¹² for multiple sequence alignment (MSA). The resulting MSAs were trimmed with trimAL v1.2¹¹³ (parameter -automated1 to select optimal cut-off based on alignment similarity) to remove poorly aligned regions or spurious sequences, and the best-fit model for each tree was computed with ModelTest-NG v0.1.7¹¹⁴ (parameters -f ef -h f -s 3 -tr phyml). The final trees were built with phyml v3.3.20220408 (parameters -d nt -c 4 -m AIC -f modelfreq -o modelparam) (https://github.com/stephaneguindon/phyml) using the parameters from ModelTest-NG and 1001 bootstrap replicates.
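
The size filter applied before the CAFE analysis can be sketched as follows. This minimal Python example assumes that the "among-species difference in size" is the largest minus the smallest gene count of a family across species, which is an interpretation rather than a detail confirmed by the text; the function name, family identifiers and counts are all hypothetical.

```python
def filter_families(counts, max_spread=100):
    """Keep gene families whose size range across species is below max_spread.

    counts maps a family identifier to a dict of {species: gene count}.
    """
    kept = {}
    for family, per_species in counts.items():
        sizes = list(per_species.values())
        # Assumed criterion: largest minus smallest family size.
        if max(sizes) - min(sizes) < max_spread:
            kept[family] = per_species
    return kept


# Hypothetical gene counts for three families in three species.
toy_counts = {
    "OG0000001": {"sp_a": 250, "sp_b": 12, "sp_c": 30},
    "OG0000002": {"sp_a": 5, "sp_b": 7, "sp_c": 4},
    "OG0000003": {"sp_a": 0, "sp_b": 99, "sp_c": 40},
}
print(sorted(filter_families(toy_counts)))
```

Families with a very large size range are commonly excluded in this way because they can prevent the birth-death model from being fitted reliably.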

Comparative genomic analyses

Synteny analyses between O. sphegodes and P. guangdongensis and P. zijinensis were performed using the MCScan¹¹⁵ functions of the JCVI utility libraries v1.2.9 (https://github.com/tanghaibao/jcvi). First, jcvi.compara.catalog ortholog (--min_size 5 -dist 35) used LAST v2.34¹¹⁶ to identify syntenic blocks between the O. sphegodes genome and another genome (see also Supplementary Table 13). Second, jcvi.compara.synteny screen (--minspan 30 -simple) was used to create simplified versions of the anchor files containing the syntenic blocks. Finally, linear synteny plots were made with jcvi.graphics.karyotype and dot-plots with jcvi.graphics.dotplot.

Population genetic analyses

Previously published GBS data from 126 individuals¹³ and RNA-seq data from floral tissues of 32 accessions²⁰ were used to analyse the genetic variation between Ophrys species. Both datasets were mapped against the genome using BWA-MEM2 v2.2.1¹¹⁷ with default parameters, and variants (SNPs and small InDels; Supplementary Tables 14 and 15) were identified using freebayes v1.0.2 (https://github.com/freebayes/freebayes) with minimum depth 2 (-C 2). Samtools v1.10¹¹⁸ was used to remove duplicated reads prior to variant calling. Raw variants (1,680,249, of which 1,621,433 were biallelic GBS markers) were filtered for depth and frequency, and pairwise genetic distances (comparison matrix in Supplementary Table 16) were calculated between individuals using SPA v0.1 (https://peb.uni-hohenheim.de/spa) as detailed in Supplementary Method 4. Biallelic SNPs were used with BayeScan v2.1¹¹⁹ to identify candidate loci under selection in a “global” manner, i.e., treating each species as a subpopulation, and in a “pairwise” fashion, comparing two species at a time. Chord distance⁷⁷,¹²⁰ was calculated and PCoA was carried out genome-wide and per chromosome. A detailed description of population genetic methods can be found in Supplementary Method 4 and, for analyses of chromosome 2, in Supplementary Method 5.
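
For orientation, the chord distance of Cavalli-Sforza and Edwards can be written, in the multi-locus form given by Takezaki and Nei, as D_c = (2 / (π r)) Σ_j sqrt(2 (1 − Σ_i sqrt(x_ij y_ij))), where x_ij and y_ij are the frequencies of allele i at locus j in the two groups being compared and r is the number of loci. The Python sketch below implements this formula for two sets of allele frequencies. It is an illustration only: the function name and input layout are hypothetical, and the actual calculations in this study follow Supplementary Method 4.

```python
import math


def chord_distance(freqs_x, freqs_y):
    """Chord distance between two groups.

    freqs_x and freqs_y are lists with one entry per locus; each entry is a
    list of allele frequencies (summing to 1) in the same allele order.
    """
    if len(freqs_x) != len(freqs_y) or not freqs_x:
        raise ValueError("need the same, non-zero number of loci in both groups")
    total = 0.0
    for x, y in zip(freqs_x, freqs_y):
        overlap = sum(math.sqrt(xi * yi) for xi, yi in zip(x, y))
        # max() guards against tiny negative values from rounding error.
        total += math.sqrt(2 * max(0.0, 1 - overlap))
    return 2 / (math.pi * len(freqs_x)) * total


# Hypothetical biallelic frequencies at three loci in two groups.
group_a = [[0.9, 0.1], [0.5, 0.5], [1.0, 0.0]]
group_b = [[0.2, 0.8], [0.5, 0.5], [0.0, 1.0]]
print(chord_distance(group_a, group_b))
```

The distance is zero for identical frequencies and reaches its maximum of 2√2/π when the two groups share no alleles at any locus.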

Reporting summary

Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.

Data availability

The genome assembly and raw sequencing data generated for this study, including ONT data, PacBio data, Illumina WGS and Hi-C data, were submitted to NCBI under BioProject number PRJNA994461. The Whole Genome Shotgun project was deposited at DDBJ/ENA/GenBank under the accession JBANGT000000000. The version described in this paper is version JBANGT010000000. The RNA-seq data used for genome annotation and expression analysis can be found on NCBI under accession PRJNA574279²⁰. The GBS data used for population genetic analysis can be found on NCBI under accession PRJNA257331¹³. Annotation data, including protein-coding gene annotation, transposable element database and annotation, non-coding RNA, as well as the alternative haplotig fasta file can be found on figshare [https://doi.org/10.6084/m9.figshare.25398166]. Transposable element sequences were also deposited in the TREP database [https://trep-db.uzh.ch/]. Source data are provided with this paper.

References

Waser, N. M. Pollination, angiosperm speciation, and the nature of species boundaries. Oikos 82, 198–201 (1998).
Schlüter, P. M. The magic of flowers or: speciation genes and where to find them. Am. J. Bot. 105, 1957–1961 (2018).
Armbruster, W. S. Floral specialization and angiosperm diversity: phenotypic divergence, fitness trade-offs and realized pollination accuracy. Ann. Bot. Plants 6, plu003 (2014).
Gómez, J. M. & Zamora, R. Generalization vs. specialization in the pollination system of Hormathophylla spinosa (Cruciferae). Ecology 80, 796–805 (1999).
Van der Niet, T. & Johnson, S. D. Phylogenetic evidence for pollinator-driven diversification of angiosperms. Trends Ecol. Evol. 27, 353–361 (2012).
Grant, V. Pollination systems as isolating mechanisms in angiosperms. Evolution 3, 82–97 (1949).
Bouchenak-Khelladi, Y. et al. On the complexity of triggering evolutionary radiations. N. Phytol. 207, 313–326 (2015).
Seehausen, O. Process and pattern in cichlid radiations - inferences for understanding unusually high rates of evolutionary diversification. N. Phytol. 207, 304–312 (2015).
Nürk, N. M. et al. Diversification in evolutionary arenas—Assessment and synthesis. Ecol. Evol. 10, 6163–6182 (2020).
Schiestl, F. P. et al. Sex pheromone mimicry in the early spider orchid (Ophrys sphegodes): patterns of hydrocarbons as the key mechanism for pollination by sexual deception. J. Comp. Physiol A 186, 567–574 (2000).
Mant, J. et al. Cuticular hydrocarbons as sex pheromone of the bee Colletes cunicularis and the key to its mimicry by the sexually deceptive orchid, Ophrys exaltata. J. Chem. Ecol. 31, 1765–1787 (2005).
Xu, S. et al. Floral isolation is the main reproductive barrier among closely related sexually deceptive orchids. Evolution 65, 2606–2620 (2011).
Sedeek, K. E. M. et al. Genic rather than genome-wide differences between sexually deceptive Ophrys orchids with different pollinators. Mol. Ecol. 23, 6192–6205 (2014).
Stejskal, K. et al. Functional significance of labellum pattern variation in a sexually deceptive orchid (Ophrys heldreichii): evidence of individual signature learning effects. PLoS One 10, e0142971 (2015).
Ayasse, M. et al. Evolution of reproductive strategies in the sexually deceptive orchid Ophrys sphegodes: how does flower‐specific variation of odor signals influence reproductive success? Evolution 54, 1995–2006 (2000).
Vereecken, N. J. et al. Integrating past and present studies on Ophrys pollination-a comment on Bradshaw et al. Bot. J. Linn. Soc. 165, 329–335 (2011).
Bateman, R. M. et al. Species arguments: clarifying competing concepts of species delimitation in the pseudo-copulatory orchid genus Ophrys. Bot. J. Linn. Soc. 165, 336–347 (2011).
Xu, S., Schlüter, P. M. & Schiestl, F. P. Pollinator-driven speciation in sexually deceptive orchids. Int. J. Ecol. 2012, 285081 (2012).
Breitkopf, H. et al. Multiple shifts to different pollinators fuelled rapid diversification in sexually deceptive Ophrys orchids. N. Phytol. 207, 377–389 (2015).
Piñeiro Fernández, L. et al. A phylogenomic analysis of the floral transcriptomes of sexually deceptive and rewarding European orchids, Ophrys and Gymnadenia. Front. Plant Sci. 10, 1553 (2019).
Greilhuber, J. & Ehrendorfer, F. Chromosome numbers and evolution in Ophrys (Orchidaceae). Plant Syst. Evol. 124, 125–138 (1975).
Bianco, P. et al. Karyological studies of some taxa of the genus Ophrys (Orchidaceae) from Apulia (Italy). Caryologia 42, 57–63 (1989).
Li, H. Minimap and miniasm: Fast mapping and de novo assembly for noisy long sequences. Bioinformatics 32, 2103–2110 (2016).
Simão, F. A. et al. BUSCO: Assessing genome assembly and annotation completeness with single-copy orthologs. Bioinformatics 31, 3210–3212 (2015).
Wicker, T. et al. A unified classification system for eukaryotic transposable elements. Nat. Rev. Genet. 8, 973–982 (2007).
Hasing, T. et al. A phased Vanilla planifolia genome enables genetic improvement of flavour and production. Nat. Food 1, 811–819 (2020).
Zhang, G. Q. et al. The Apostasia genome and the evolution of orchids. Nature 549, 379–383 (2017).
Yuan, Y. et al. The Gastrodia elata genome provides insights into plant adaptation to heterotrophy. Nat. Commun. 9, 1625 (2018).
Yang, F. X. et al. The genome of Cymbidium sinense revealed the evolution of orchid traits. Plant Biotechnol. J. 19, 2501–2516 (2021).
Li, M. H. et al. Genomes of leafy and leafless Platanthera orchids illuminate the evolution of mycoheterotrophy. Nat. Plants 8, 373–388 (2022).
Kidwell, M. G. Transposable elements and the evolution of genome size in eukaryotes. Genetica 115, 49–63 (2002).
Chumová, Z. et al. Repeat proliferation and partial endoreplication jointly shape the patterns of genome size evolution in orchids. Plant J. 107, 511–524 (2021).
Eriksson, M. C. et al. Repeat dynamics across timescales: a perspective from sibling allotetraploid marsh orchids (Dactylorhiza majalis s.l.). Molec. Biol. Evol. 39, msac167 (2022).
Quézel, P. La flore du bassin méditerranéen: origine, mise en place, endémisme. Ecol. Mediterr. 21, 19–39 (1995).
Thompson, J. D. Plant Evolution in the Mediterranean. Insights for conservation. 2nd ed. Ch. 1 (Oxford University Press, 2020).
Chuong, E. B. et al. Regulatory activities of transposable elements: From conflicts to benefits. Nat. Rev. Genet. 18, 71–86 (2017).
Dubin, M. J. et al. Transposons: a blessing curse. Curr. Opin. Plant Biol. 42, 23–29 (2018).
Meier, J. I. et al. Cycles of fusion and fission enabled rapid parallel adaptive radiations in African cichlids. Science 381, eade2833 (2023).
Puttick, M. N. et al. Size is not everything: rates of genome size evolution, not C-value, correlate with speciation in angiosperms. Proc. R. Soc. B Biol. Sci. 282, 20152289 (2015).
Oliver, K. R. et al. Transposable elements: powerful contributors to angiosperm evolution and diversity. Genome Biol. Evol. 5, 1886–1901 (2013).
Zhao, D. et al. What makes up plant genomes: The vanishing line between transposable elements and genes. Biochim. Biophys. Acta Gene Regul. Mech. 1859, 366–380 (2016).
Balachandran, P. et al. Transposable element-mediated rearrangements are prevalent in human genomes. Nat. Commun. 13, 7115 (2022).
Xu, S. et al. Wild tobacco genomes reveal the evolution of nicotine biosynthesis. Proc. Natl Acad. Sci. USA 114, 6133–6138 (2017).
Lisch, D. How important are transposons for plant evolution? Nat. Rev. Genet. 14, 49–61 (2013).
Bateman, R. M. et al. Molecular phylogenetics and evolution of Orchidinae and selected Habenariinae (Orchidaceae). Bot. J. Linn. Soc. 142, 1–40 (2003).
Wong, D. C. J. & Peakall, R. Orchid phylotranscriptomics: the prospects of repurposing multi-tissue transcriptomes for phylogenetic analysis and beyond. Front. Plant Sci. 13, 910362 (2022).
Bell, C. D. et al. The age and diversification of the angiosperms re-revisited. Am. J. Bot. 97, 1296–1303 (2010).
Hertweck, K. L. et al. Phylogenetics, divergence times and diversification from three genomic partitions in monocots. Bot. J. Linn. Soc. 178, 375–393 (2015).
Kim, Y.-K. et al. Plastome evolution and phylogeny of Orchidaceae, with 24 new sequences. Front. Plant Sci. 11, 22 (2020).
Turco, A. et al. Contribution to the study of wild Orchidaceae, genus Platanthera L.C.M. Richard. Karyotype and C-banding analysis of two species from Italy. Caryologia 4, 11–16 (2021).
Inda, L. A. et al. Phylogenetics of tribe Orchideae (Orchidaceae: Orchidoideae) based on combined DNA matrices: inferences regarding timing of diversification and evolution of pollination syndromes. Ann. Bot. 110, 71–90 (2012).
Kellenberger, R. T. et al. Multiple gene co-options underlie the rapid evolution of sexually deceptive flowers in Gorteria diffusa. Curr. Biol. 33, 1502–1512 (2023).
Peakall, R. Pollination by sexual deception. Curr. Biol. 33, R489–R496 (2023).
Bohman, B. et al. Pollination by sexual deception - it takes chemistry to work. Curr. Opin. Plant Biol. 32, 37–46 (2016).
Emms, D. M. & Kelly, S. OrthoFinder: Phylogenetic orthology inference for comparative genomics. Genome Biol. 20, 238 (2019).
Mendes, F. K. et al. CAFE 5 models variation in evolutionary rates among gene families. Bioinformatics 36, 5516–5518 (2020).
Johnson, C. S. et al. TRANSPARENT TESTA GLABRA2, a trichome and seed coat development gene of Arabidopsis, encodes a WRKY transcription factor. Plant Cell 14, 1359–1375 (2002).
Curaba, J. et al. Identification of an SCPL gene controlling anthocyanin acylation in carrot (Daucus carota L.) root. Front. Plant Sci. 10, 1770 (2020).
Corley, S. B. et al. Floral asymmetry involves an interplay between TCP and MYB transcription factors in Antirrhinum. Proc. Natl Acad. Sci. USA 102, 5068–5073 (2005).
Li, S. & Zachgo, S. TCP3 interacts with R2R3-MYB proteins, promotes flavonoid biosynthesis and negatively regulates the auxin response in Arabidopsis thaliana. Plant J. 76, 901–913 (2013).
Chen, Y. Y. et al. Genome-wide identification of YABBY genes in Orchidaceae and their expression patterns in Phalaenopsis orchid. Genes 11, 955 (2020).
Valoroso, M. C. et al. The MADS-box genes expressed in the inflorescence of Orchis italica (Orchidaceae). PLoS One 14, e0213185 (2019).
Sedeek, K. E. M. et al. Transcriptome and proteome data reveal candidate genes for pollinator attraction in sexually deceptive orchids. PLoS One 8, e64621 (2013).
Bohman, B. et al. Identification of (Z)−8-heptadecene and n-pentadecane as electrophysiologically active compounds in Ophrys insectifera and its Argogorytes pollinator. Int. J. Mol. Sci. 21, 620 (2020).
Schlüter, P. M. et al. Stearoyl-acyl carrier protein desaturases are associated with floral isolation in sexually deceptive orchids. Proc. Natl Acad. Sci. USA 108, 5696–5701 (2011).
Xu, S. et al. The genetic basis of pollinator adaptation in a sexually deceptive orchid. PLoS Genet 8, e1002889 (2012).
Sedeek, K. E. M. et al. Amino acid change in an orchid desaturase enables mimicry of the pollinator’s sex pheromone. Curr. Biol. 26, 1505–1511 (2016).
Shanklin, J. & Cahoon, E. B. Desaturation and related modifications of fatty acids. Annu. Rev. Plant Physiol. Plant Mol. Biol. 49, 611–641 (1998).
Perera, M. A. D. N. et al. Biological origins of normal-chain hydrocarbons: a pathway model based on cuticular wax analyses of maize silks. Plant J. 64, 618–632 (2010).
Wang, M. et al. Five fatty acyl-coenzyme A reductases are involved in the biosynthesis of primary alcohols in Aegilops tauschii leaves. Front. Plant Sci. 8, 1012 (2017).
Ohno, S. Evolution by gene duplication (Springer, 1970).
Magadum, S. et al. Gene duplication as a major force in evolution. J. Genet. 92, 155–161 (2013).
Innan, H. & Kondrashov, F. The evolution of gene duplications: classifying and distinguishing between models. Nat. Rev. Genet. 11, 97–108 (2010).
Francino, M. P. An adaptive radiation model for the origin of new gene functions. Nat. Genet. 37, 573–578 (2005).
Paulus, H. & Gack, C. Pollinators as prepollinating isolation factors: evolution and speciation in Ophrys (Orchidaceae). Isr. J. Bot. 39, 43–79 (1990).
Schiestl, F. P. & Schlüter, P. M. Floral isolation, specialized pollination, and pollinator behavior in orchids. Annu. Rev. Entomol. 54, 425–446 (2009).
Cavalli-Sforza, L. L. & Edwards, A. W. F. Phylogenetic analysis models and estimation procedures. Evolution 21, 550–570 (1967).
Ichino, T. et al. GFS9/TT9 contributes to intracellular membrane trafficking and flavonoid accumulation in Arabidopsis thaliana. Plant J. 80, 410–423 (2014).
Nosil, P. & Feder, J. L. Genomic divergence during speciation: causes and consequences. Philos. Trans. R. Soc. B Biol. Sci. 367, 332–342 (2012).
Russo, A. et al. Low-input high-molecular-weight DNA extraction for long-read sequencing from plants of diverse families. Front. Plant Sci. 13, 883897 (2022).
Trávníček, P. et al. Challenges of flow-cytometric estimation of nuclear genome size in orchids, a plant group with both whole-genome and progressively partial endoreplication. Cytom. Part A 87, 958–966 (2015).
Temsch, E. M. et al. Heavy metal pollution, selection, and genome size: the species of the Žerjav study revisited with flow cytometry. J. Bot. 2010, 596542 (2010).
Doležel, J. et al. Comparison of three DNA fluorochromes for flow cytometric estimation of nuclear DNA content in plants. Physiol. Plant. 85, 625–631 (1992).
Doležel, J. et al. Estimation of nuclear DNA content in plants using flow cytometry. Nat. Protoc. 2, 2233–2244 (2007).
Doležel, J. et al. Nuclear DNA content and genome size of trout and human. Cytometry 51A, 127–128 (2003).
Walker, B. J. et al. Pilon: An integrated tool for comprehensive microbial variant detection and genome assembly improvement. PLoS One 9, e112963 (2014).
Nagarajan, N. et al. Fast and accurate de novo genome assembly from long uncorrected reads. Genome Res. 27, 737–746 (2017).
Pryszcz, L. P. & Gabaldón, T. Redundans: An assembly pipeline for highly heterozygous genomes. Nucleic Acids Res. 44, e113 (2016).
Grob, S. et al. Hi-C analysis in Arabidopsis identifies the KNOT, a structure with similarities to the flamenco locus of Drosophila. Mol. Cell 55, 678–693 (2014).
Liu, C. In situ Hi-C library preparation for plants to study their three-dimensional chromatin interactions on a genome-wide scale. Methods Mol. Biol. 1629, 155–166 (2017).
Ghurye, J. et al. Scaffolding of long read assemblies using long range contact information. BMC Genomics 18, 527 (2017).
Ghurye, J. et al. Integrating Hi-C links with assembly graphs for chromosome-scale assembly. PLOS Comput. Biol. 15, e1007273 (2019).
Durand, N. C. et al. Juicebox provides a visualization system for Hi-C contact maps with unlimited zoom. Cell Syst. 3, 99–101 (2016).
Dudchenko, O. et al. De novo assembly of the Aedes aegypti genome using Hi-C yields chromosome-length scaffolds. Science 356, 92–95 (2017).
Hoff, K. J. et al. Whole-genome annotation with BRAKER. Methods Mol. Biol. 1962, 65–95 (2019).
Brůna, T. et al. GeneMark-EP+: eukaryotic gene prediction with self-training in the space of genes and proteins. NAR Genomics Bioinforma. 2, lqaa026 (2020).
Chung, O. et al. A chromosome-scale genome assembly and annotation of the spring orchid (Cymbidium goeringii). Mol. Ecol. Resour. 22, 1168–1177 (2021).
Kawahara, Y. et al. Improvement of the Oryza sativa Nipponbare reference genome using next generation sequence and optical map data. Rice 6, 3–10 (2013).
Harkess, A. et al. The asparagus genome sheds light on the origin and evolution of a young Y chromosome. Nat. Commun. 8, 1279 (2017).
Keilwagen, J. et al. Using intron position conservation for homology-based gene prediction. Nucleic Acids Res. 44, e89–e89 (2016).
Bucchini, F. et al. TRAPID 2.0: a web application for taxonomic and functional analysis of de novo transcriptomes. Nucleic Acids Res. 49, e101–e101 (2021).
Zheng, Y. et al. iTAK: A program for genome-wide prediction and classification of plant transcription factors, transcriptional regulators, and protein kinases. Mol. Plant 9, 1667–1670 (2016).
Chan, P. P. et al. tRNAscan-SE 2.0: improved detection and functional classification of transfer RNA genes. Nucleic Acids Res. 49, 9077–9096 (2021).
Ellinghaus, D. et al. LTRharvest, an efficient and flexible software for de novo detection of LTR retrotransposons. BMC Bioinforma. 9, 18 (2008).
Katoh, K. et al. MAFFT: a novel method for rapid multiple sequence alignment based on fast Fourier transform. Nucleic Acids Res. 30, 3059–3066 (2002).
Ma, J. & Bennetzen, J. L. Rapid recent growth and divergence of rice nuclear genomes. Proc. Natl Acad. Sci. USA 101, 12404–12410 (2004).
Price, M. N. et al. FastTree 2-Approximately maximum-likelihood trees for large alignments. PLoS One 5, e9490 (2010).
Stamatakis, A. RAxML version 8: a tool for phylogenetic analysis and post-analysis of large phylogenies. Bioinformatics 30, 1312–1313 (2014).
Smith, S. A. & O’Meara, B. C. treePL: divergence time estimation using penalized likelihood for large phylogenies. Bioinformatics 28, 2689–2690 (2012).
Bouckaert, R. et al. BEAST 2: A software platform for Bayesian evolutionary analysis. PLoS Comput. Biol. 10, e1003537 (2014).
Tan, M. H. et al. MitoPhAST, a new automated mitogenomic phylogeny tool in the post-genomic era with a case study of 89 decapod mitogenomes including eight new freshwater crayfish mitogenomes. Mol. Phylogenet. Evol. 85, 180–188 (2015).
Edgar, R. C. MUSCLE: A multiple sequence alignment method with reduced time and space complexity. BMC Bioinforma. 5, 113 (2004).
Capella-Gutiérrez, S. et al. trimAl: a tool for automated alignment trimming in large-scale phylogenetic analyses. Bioinformatics 25, 1972–1973 (2009).
Darriba, D. et al. ModelTest-NG: A new and scalable tool for the selection of DNA and protein evolutionary models. Mol. Biol. Evol. 37, 291–294 (2020).
Tang, H. et al. Unraveling ancient hexaploidy through multiply-aligned angiosperm gene maps. Genome Res. 18, 1944–1954 (2008).
Kiełbasa, S. M. et al. Adaptive seeds tame genomic sequence comparison. Genome Res. 21, 487 (2011).
Li, H. Aligning sequence reads, clone sequences and assembly contigs with BWA-MEM. Preprint at https://arxiv.org/abs/1303.3997 (2013).
Danecek, P. et al. Twelve years of SAMtools and BCFtools. Gigascience 10, gia008 (2021).
Fischer, M. C. et al. Enhanced AFLP genome scans detect local adaptation in high-altitude populations of a small rodent (Microtus arvalis). Mol. Ecol. 20, 1450–1462 (2011).
Takezaki, N. & Nei, M. Genetic distances and reconstruction of phylogenetic trees from microsatellite DNA. Genetics 144, 389–399 (1996).

Acknowledgements

We are grateful to Martin Mascher and Chang Liu for fruitful discussions about the project, to Yangzi Wang and Jhih-Sheng Liu for assistance with analyses, and to Noa Schwabe for kindly providing pictures of Ophrys and Andrena. This work was supported by the University of Zurich, a Ph.D. project funded by the University Research Priority Program “Evolution in Action: from Genomes to Ecosystems” (to U.G. and P.M.S.), the Swiss National Science Foundation (SNSF; grant 31003A_155943 to P.M.S. and grant IZLRZ3_163885 to U.G.), the German Research Foundation (DFG; project 446145319 to P.M.S.) and the 2022 PRIN Program (to S.C.). This work is set within the framework of the “Laboratoires d’Excellences (LABEX)” TULIP (ANR-10-LABX-41) and of the “École Universitaire de Recherche (EUR)” TULIP-GS (ANR-18-EURE-0019). This work made use of infrastructure services provided by the Science IT team of the University of Zurich (www.s3it.uzh.ch) and was supported via bwHPC by the High Performance and Cloud Computing Group at the Zentrum für Datenverarbeitung of the University of Tübingen, state of Baden-Württemberg, and the DFG (grant INST 37/935-1 FUGG).

Funding

Open Access funding enabled and organized by Projekt DEAL.

Author information

Authors and Affiliations

Department of Plant Evolutionary Biology, Institute of Biology, University of Hohenheim, Stuttgart, Germany
Alessia Russo, Mattia Alessandrini, Lara Gaidusch, Hannah F. Oertel, Sara E. Garcia Morales & Philipp M. Schlüter
Department of Plant and Microbial Biology and Zürich-Basel Plant Science Centre, University of Zurich, Zürich, Switzerland
Alessia Russo, Dmitry Smetanin, Thomas Wicker & Ueli Grossniklaus
Department of Systematic and Evolutionary Botany and Zürich-Basel Plant Science Centre, University of Zurich, Zürich, Switzerland
Alessia Russo, Giacomo Potente & Philipp M. Schlüter
Université Perpignan Via Domitia, Laboratoire Génome et Développement des Plantes, UMR5096, Perpignan, France
Moaine El Baidouri, Joris A. M. Bertrand & Olivier Panaud
CNRS, Laboratoire Génome et Développement des Plantes, UMR5096, Perpignan, France
Moaine El Baidouri, Joris A. M. Bertrand & Olivier Panaud
EMR269 MANGO, Institut de Recherche pour le Développement, Perpignan, France
Moaine El Baidouri, Joris A. M. Bertrand & Olivier Panaud
Department of Methods Development and Analytics, Agroscope, Wädenswil, Switzerland
Daniel Frei & Jürg E. Frey
Department of Biology, University of Naples Federico II, Naples, Italy
Teresa Rosa Galise & Salvatore Cozzolino
Naturalis Biodiversity Centre, Leiden, The Netherlands
Qin Tian & Renske E. Onstein
Germplasm Bank of Wild Species, Kunming Institute of Botany, Chinese Academy of Sciences, Kunming, China
Qin Tian
German Centre for Integrative Biodiversity Research (iDiv) Halle – Jena – Leipzig, Leipzig, Germany
Renske E. Onstein
Institute of Organismic and Molecular Evolution, University of Mainz, Mainz, Germany
Shuqing Xu

Authors

Alessia Russo
Mattia Alessandrini
Moaine El Baidouri
Daniel Frei
Teresa Rosa Galise
Lara Gaidusch
Hannah F. Oertel
Sara E. Garcia Morales
Giacomo Potente
Qin Tian
Dmitry Smetanin
Joris A. M. Bertrand
Renske E. Onstein
Olivier Panaud
Jürg E. Frey
Salvatore Cozzolino
Thomas Wicker
Shuqing Xu
Ueli Grossniklaus
Philipp M. Schlüter

Contributions

A.R., P.M.S. and U.G. conceived and designed this study. A.R., P.M.S. and S.C. provided Ophrys samples. A.R. performed genome size estimation, gDNA extraction and the Hi-C experiment. A.R. and D.S. conducted the initial genome assembly. A.R. improved the assembly and performed annotation analysis. D.F. performed Nanopore library preparation and sequencing. A.R. and M.A. performed the JBAT visualisation to obtain the final chromosome scaffolds. A.R., T.R.G., L.G. and T.W. conducted the transposable element annotation. M.E.B. conducted the LTR divergence time analysis. H.F.O. performed the methylation analysis. S.X., Q.T., R.E.O. and S.E.G.M. carried out the phylogenetic and gene family analyses. G.P. conducted the synteny analysis. A.R. and P.M.S. performed the population genomic analyses. M.A. performed differential expression and GO analysis. A.R. and M.A. performed figure curation. A.R. wrote the original draft of the manuscript with input from P.M.S., S.C., S.X. and U.G. All authors discussed and approved the final manuscript. J.E.F., J.B., S.C., O.P. and U.G. provided critical resources for this study. P.M.S. and U.G. supervised the project. P.M.S., U.G., S.C. and J.E.F. acquired funding.

Corresponding authors

Correspondence to Alessia Russo or Philipp M. Schlüter.

Ethics declarations

Competing interests

The authors declare no competing interests.

Peer review

Peer review information

Nature Communications thanks Rod Peakall and the other, anonymous, reviewer(s) for their contribution to the peer review of this work. A peer review file is available.

Additional information

Publisher’s note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary information

Supplementary Information

Peer Review File

Description of Additional Supplementary Files

Supplementary Data 1

Supplementary Data 2

Supplementary Data 3

Supplementary Data 4

Supplementary Data 5

Supplementary Data 6

Supplementary Data 7

Reporting Summary

Source data

Source Data

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

About this article

Cite this article

Russo, A., Alessandrini, M., El Baidouri, M. et al. Genome of the early spider-orchid Ophrys sphegodes provides insights into sexual deception and pollinator adaptation. Nat Commun 15, 6308 (2024). https://doi.org/10.1038/s41467-024-50622-4

Received: 06 November 2023
Accepted: 17 July 2024
Published: 26 July 2024
DOI: https://doi.org/10.1038/s41467-024-50622-4
