Introduction

In most sexually reproducing organisms, an individual’s sex is determined by sex chromosomes1. Aside from their role in sex determination, sex chromosomes are distinct from autosomes in other ways that have important consequences for their evolutionary fates. Sex chromosomes typically evolve from a pair of homologous autosomes on which a sex-determining locus emerges. Recombination becomes suppressed around the locus, expands along the chromosome, and degeneration ensues for the non-recombining region, which is found only in the heterogametic sex (the Y and W in XY and ZW systems, respectively); sometimes the sex-limited chromosome is lost completely2. This process leads to sex-linked genes being expressed and transmitted differently to those on autosomes (Fig. 1a). The patterns of evolution that result from this are influenced by a complex interaction of evolutionary processes3. Understanding how sex chromosomes and autosomes evolve differently, and the processes that drive these differences, can contribute to a more comprehensive understanding of how genomes evolve.

Fig. 1: Chromosome dynamics under different reproductive systems.

a In classic heterogametic systems, the presence of a sex-limited chromosome (e.g., the Y in XX/XY systems) results in the X being haploid in males and transmitted less frequently than autosomes. b In systems with paternal genome elimination (PGE), the males are also haploid for the X but eliminate paternally inherited chromosomes during spermatogenesis, so that every sperm contains the maternal haploid genome and the X and autosomes are transmitted equally. c In systems with PGE, sex is determined after fertilisation, establishing female (XX) and male (X0) karyotypes. Note that in fungus gnats, sperm have two duplicate copies of the maternally inherited X, and fertilised embryos are therefore XXX. Male and female embryos, therefore, lose 2 and 1 Xs, respectively. Figure created in BioRender (http://BioRender.com/ajxef9e).

Genomes are shaped by both selective and neutral processes, as well as the rate, effect, and dominance of new mutations4. Where recessive alleles are hemizygous (i.e., X- or Z-linked in the heterogametic sex), they are always exposed to selection. Mutations should therefore be swiftly purged if deleterious or invade and become fixed more quickly if beneficial. X- or Z-linked genes are indeed often found to diverge faster between species (i.e., have higher dN/dS) than autosomal genes5,6,7,8,9,10,11,12, known as the ‘faster-X’ effect3,13. Conversely, because sex chromosomes are only transmitted to half of offspring by the heterogametic sex and thus have a smaller effective population size (Ne) than the autosomes (NeX/NeA = ¾), selective constraints on many new mutations should be relaxed on the sex chromosomes, allowing a greater number of slightly deleterious alleles to behave neutrally and fix through drift (for the sake of brevity, we will refer to these opposing forces as ‘selection’ and ‘drift’).
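The ¾ expectation can be illustrated with a short calculation. The sketch below uses the standard textbook expressions for effective population size with Nm breeding males and Nf breeding females, assuming random mating and Poisson-distributed offspring numbers; these formulas and the function names are illustrative assumptions and are not taken from the analyses in this study.

```python
from fractions import Fraction

def ne_autosome(n_males, n_females):
    """Autosomal Ne under regular Mendelian inheritance: 4*Nm*Nf / (Nm + Nf)."""
    return Fraction(4 * n_males * n_females, n_males + n_females)

def ne_x(n_males, n_females):
    """X-linked Ne with male heterogamety: 9*Nm*Nf / (4*Nm + 2*Nf)."""
    return Fraction(9 * n_males * n_females, 4 * n_males + 2 * n_females)

def x_to_a_ratio(n_males, n_females):
    """Expected NeX/NeA for a given number of breeding males and females."""
    return ne_x(n_males, n_females) / ne_autosome(n_males, n_females)

# Equal numbers of breeding males and females give the classic 3/4.
print(x_to_a_ratio(500, 500))
# A female-biased breeding population shifts the ratio away from 3/4.
print(x_to_a_ratio(100, 900))
```

With an equal breeding sex ratio the function returns 3/4, whereas a skew in the number of breeding males or females moves the ratio away from that value, which is one reason the ratio is hard to predict in systems with regular inheritance.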

Faster-X evolution may therefore be driven by a combination of positive selection on recessive beneficial mutations (adaptive faster-X) and stronger drift on weakly deleterious mutations (nonadaptive faster-X)3. Although numerous studies across a range of taxa have found evidence for accelerated rates of X/Z chromosome evolution compared to autosomes, support in some lineages is mixed9,14,15,16 and the precise causes can be difficult to disentangle. Some studies find support for drift as the predominant driver9,12,17, others, adaptive evolution6,10,18. A key limiting factor in systems with regular chromosome inheritance is that these two effects are naturally confounded: X-linked genes are simultaneously hemizygous in males and have a smaller Ne than do autosomal genes. As a result, it is often unclear what drives the different patterns of genetic diversity we see on sex chromosomes and autosomes.

Dark-winged fungus gnats (Diptera: Sciaridae) and gall midges (Diptera: Cecidomyiidae) are two distantly related families of flies that have independently evolved an unusual form of haplodiploidy known as germline paternal genome elimination (PGE)19. They have X0 sex determination, but all paternally inherited homologues are eliminated during the first meiotic division of spermatogenesis. Males therefore inherit but do not transmit their fathers’ genes, instead passing on only the maternally inherited genome, including an X chromosome, to every offspring (Fig. 1b). The result of this is that both X chromosomes and autosomes are transmitted in a haploid state through males, and thus NeX should equal NeA20. However, elimination of X chromosomes in somatic cells during embryogenesis, which establishes the sex of an individual, produces X0 males and XX females (Fig. 1c). As such, the X is haploid in males but diploid in females, while the autosomes are diploid in both sexes. Thus, whereas in other systems the lower Ne of the X arises from its difference in ploidy relative to the autosomes, under PGE this difference in ploidy does not lower NeX, because transmission of the chromosomes is non-Mendelian.

PGE, therefore, negates the effects of X and autosomal differences in Ne when it comes to comparisons between sex chromosomes and autosomes. Moreover, this uniform transmission of the genome makes the system invariant to many processes that affect the NeX/NeA ratio, such as sex-differences in variance in reproductive success21, demography22, and fluctuations in population size23. Recombination rates and mutation rates, which usually differ between the X and autosomes due to differences in transmission24,25, should also be equal under PGE. Many of these factors can differ between populations of closely related species26, which may explain why faster-X results from Drosophila are difficult to interpret and often seem contradictory5,14,18,27. Under PGE, hemizygosity of the X should be the sole driver of differences in autosome and X chromosome evolution20. This may be particularly useful when investigating, for example, patterns of neutral genetic diversity, which are influenced by multiple processes22,28,29, as well as whether dominant versus recessive favourable mutations are more prevalent, which is difficult to estimate a priori3,30.

These PGE systems have a further idiosyncrasy in that some species have unorthodox reproductive biology associated with unusual genomic architecture. Some fungus gnats and gall midges have genetically distinct male-producing and female-producing females, a phenomenon known as ‘monogenic’ reproduction31. Female producers are heterozygous for large inversions that are X-linked (X’) in sciarids and autosomal (A’) in cecidomyiids32,33, and these appear to be recently evolved in at least one case33. Other species in the families are ‘digenic’, do not have inversions, and produce mixed-sex broods. Therefore, in addition to being a good model for studying faster-X, these systems provide opportunities to explore the consequences of recent recombination suppression on both X chromosomes and autosomes.

Here, we investigate patterns of evolution in fungus gnats and gall midges and find that, in contrast to most previous studies on systems with regular Mendelian inheritance, X chromosomes consistently diverge more slowly than autosomes under PGE. In some of these species, we use population resequencing and gene expression data to examine polymorphism, selection, and the contribution of differentially expressed genes. We find that, although X-linked genes evolve more adaptively than autosomal genes, the slower-X divergence we observe arises because purifying selection on weakly deleterious mutations is stronger than positive selection on beneficial mutations. We find that this purifying selection is likely driven by a combination of hemizygous expression of the X in males, supplemented by higher overall expression of X-linked compared to autosomal genes in both sexes. We also find that recent loss of recombination has the potential to drastically reduce the levels of neutral diversity within recombination-suppressed regions. Our results demonstrate the value of using systems with unusual genetics in understanding complex evolutionary processes.

Results

Effective population size, patterns of neutral diversity, and unusual genomic architecture

Under PGE, autosomes and X chromosomes should have equal effective population sizes (NeX/NeA = 1)20. One method of testing this prediction is by comparing neutral diversity, i.e., diversity at sites that should be evolving in the absence of selection, such as fourfold degenerate sites17. Many factors contribute to driving different patterns of neutral diversity on autosomes and X chromosomes, such as variance in male reproductive success and X/A differences in recombination and mutation rates21,24,25. PGE systems should be invariant to such processes because they rely on differences in autosome and X transmission. We used population resequencing data to estimate neutral diversity (as a proxy for NeX/NeA) in the species Bradysia (formerly Sciara) coprophila, Lycoriella ingenua (both fungus gnats), and Mayetiola destructor (a gall midge), by calculating nucleotide diversity (θ) at fourfold degenerate sites. For L. ingenua and M. destructor, θXA was 0.77 and 0.86, respectively. A value lower than 1 could be caused by background selection removing synonymous polymorphism at linked nonsynonymous sites34, which is likely to be higher on the X due to haploid selection.
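The X-to-autosome diversity ratio reported above can be illustrated with a minimal sketch. It assumes biallelic fourfold degenerate sites with known allele counts and uses a simple count-based estimator of per-site nucleotide diversity; this is a stand-in for illustration and not the genotype-likelihood approach implemented in ANGSD. All function names and the toy counts are hypothetical.

```python
def site_pi(alt_count, n_chrom):
    """Unbiased per-site nucleotide diversity for a biallelic site."""
    if n_chrom < 2:
        raise ValueError("need at least two sampled chromosomes")
    p = alt_count / n_chrom
    return 2.0 * p * (1.0 - p) * n_chrom / (n_chrom - 1)

def mean_pi(sites):
    """Mean diversity over (alt_count, n_chrom) tuples; invariant sites must be included."""
    return sum(site_pi(alt, n) for alt, n in sites) / len(sites)

def diversity_ratio(x_sites, a_sites):
    """Ratio of mean X-linked to mean autosomal diversity."""
    return mean_pi(x_sites) / mean_pi(a_sites)

# Toy data (hypothetical): four fourfold sites per compartment, four sampled chromosomes.
x_sites = [(1, 4), (0, 4), (0, 4), (0, 4)]
a_sites = [(2, 4), (0, 4), (0, 4), (0, 4)]
print(diversity_ratio(x_sites, a_sites))
```

Including invariant sites in the denominator matters: dropping them would inflate both estimates and could bias the ratio if callable site density differs between the X and autosomes.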

Surprisingly, we found θXA to be only 0.28 for B. coprophila. In this species, half of females (the female-producers) are heterozygous for a large inversion-based supergene that comprises ~ 80% of the X chromosome and formed < 0.5 mya33 (Fig. 2a). One consequence of this is an overall reduction in the rate of recombination for the X: in addition to male meiosis being achiasmatic35, most of the X does not recombine in half of females. This should enhance the effects of linked selection, including background selection, hitchhiking, and selective interference, which in turn should reduce polymorphism34. To examine whether lower θXA in B. coprophila is likely to be a consequence of these processes, we performed a sliding window analysis of heterozygosity along the X in wild-caught XX and X’X females. We found strongly reduced heterozygosity in the region of recombination suppression in B. coprophila, suggesting that lower recombination is indeed responsible for purging neutral diversity. In contrast, L. ingenua, in which females are always XX, did not show a pronounced reduction (Fig. 2b). While this unusual genomic architecture may affect X versus autosomal evolution in B. coprophila, this is unlikely to be true of L. ingenua and other species without X-linked supergenes.

Fig. 2: Consequences of recombination suppression on the X chromosome in Bradysia coprophila.

a Schematic of the X chromosomes of B. coprophila. Female-producing females are heterozygous for a large X-linked supergene composed of paracentric inversions on the long arm of the chromosome. b The supergene region is visible as raised heterozygosity in X’X females, and causes reduced diversity in XX females. In L. ingenua, a species which does not have a supergene, females are all XX and heterozygosity is similar between the autosomes and X chromosome.

X chromosomes diverge more slowly under PGE

We sought to explore the consequences of PGE for rates of X chromosome divergence. To estimate whether X chromosomes evolve faster than autosomes, studies often examine whether X-linked genes accumulate more nonsynonymous substitutions than do autosomal genes by measuring the scaled rate of divergence, i.e., dN/dS3. We calculated dN/dS from single-copy orthologs between 8 species pairs within the Dipteran infraorder Bibionomorpha: 4 pairs of fungus gnats (Sciaridae), 3 of gall midges (Cecidomyiidae), and one of fever flies (Bibionidae). Fever flies are a non-PGE outgroup with male heterogamety and differentiated X and Y chromosomes36, thus serving as a control (Fig. 3a). Surprisingly, in all PGE pairs studied, we found significantly lower dN/dS for the X, indicating slower divergence of X-linked genes relative to autosomal genes (Fig. 3b and Table 1). This result places PGE in stark contrast with the faster-X divergence seen in many classic reproductive systems. For the non-PGE outgroup, dN/dS was around 20% greater on the X chromosome compared to autosomes, in-line with the faster-X divergence seen in many comparisons of sex chromosomes and autosomes in other systems5,6,7,8,9,10,11,12.
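As a rough illustration of the X versus autosome comparison, the sketch below computes the Mann-Whitney U statistic for two lists of per-gene dN/dS values together with their medians. It deliberately stops short of a P-value; in practice a library routine such as scipy.stats.mannwhitneyu would normally be used. The function names and the toy values are hypothetical and are not data from this study.

```python
from statistics import median

def mann_whitney_u(x, y):
    """U statistic for sample x versus sample y; ties count as one half."""
    u = 0.0
    for xi in x:
        for yi in y:
            if xi > yi:
                u += 1.0
            elif xi == yi:
                u += 0.5
    return u

def summarise(x_dnds, a_dnds):
    """Medians plus the probability that a random X-linked gene exceeds a random autosomal gene."""
    u = mann_whitney_u(x_dnds, a_dnds)
    return {
        "median_x": median(x_dnds),
        "median_a": median(a_dnds),
        "prob_x_greater": u / (len(x_dnds) * len(a_dnds)),
    }

# Toy per-gene dN/dS values (hypothetical).
x_dnds = [0.05, 0.08, 0.10]
a_dnds = [0.09, 0.12, 0.15, 0.20]
result = summarise(x_dnds, a_dnds)
print(result)
```

A value of prob_x_greater below 0.5 corresponds to the slower-X pattern, and a value above 0.5 to faster-X.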

Fig. 3: Rates of divergence between species pairs within the PGE families Sciaridae and Cecidomyiidae and the non-PGE family Bibionidae.

a BUSCO-based phylogenetic tree showing all species analysed (see Supplementary Fig. 1 for separate X and autosome trees). Values at branches represent branch length; bold values at nodes represent ultrafast bootstrap support (%). b Comparisons of dN/dS for autosomal (grey) and X-linked (blue) genes between each species pair. For a version with medians emphasised, see Supplementary Fig. 2. Numbers on x-axes show the numbers of autosomal and X-linked single-copy orthologs tested. Asterisks represent significance levels for two-sided Mann-Whitney-Wilcoxon tests between the X and autosomes (**P < 0.01, ***P < 0.001; see Supplementary Table 1 for full statistical results). Box plots show median (central line), IQR (25th and 75th percentiles, box limits); whiskers extend to 1.5 x IQR. Outliers not shown. Images were obtained from https://www.inaturalist.org/, credit: Sciaridae (Odontosciara nigra), J Gallagher; Cecidomyiidae (Contarinia pseudotsugae), G San Martin; Bibionidae (Dilophus febrilis), P Le Mao.

Table 1 Species pairs analysed in this study, genetic distance (DXY) between species within each pair, median dN/dS for autosomal (A) and X-linked genes, and numbers of single-copy orthologs between species within each pair from which dN/dS was calculated

Purifying selection slows rates of X chromosome divergence despite adaptive evolution

One limitation of dN/dS as a measure of faster-X evolution is that, although it is informative about rates of sequence divergence, it cannot distinguish between adaptive and non-adaptive causes of divergence3. For example, positive, hemizygous selection on the X can increase the rate of non-synonymous substitutions by fixing recessive beneficial mutations (e.g., monarch butterflies10), but relaxed purifying selection due to the smaller effective population size of the X (in XX/XY systems without PGE), which allows more weakly deleterious mutations to fix through drift, also contributes to sequence divergence (e.g., in Timema stick insects37). As such, faster-X evolution does not always imply faster-X adaptive evolution, as studies of the latter generally require evidence of a higher fixation rate of new beneficial mutations. Since X chromosomes in PGE systems are diverging more slowly, this implies that purifying selection dominates over positive selection, potentially reflecting a distribution of fitness effects biased away from beneficial recessive mutations. However, factors such as sex-specific mutation rates and differences in the distribution of dominance and selection coefficients of new mutations could also contribute towards the patterns we observe24.

We used population resequencing data in two fungus gnat species, B. coprophila and L. ingenua, and one gall midge, M. destructor, to examine the impact of different types of selection on polymorphism. Notably, our estimates of nonsynonymous (pN) and scaled rates (pN/pS) of polymorphism were significantly lower on the X chromosome compared to the autosomes in all three species (Fig. 4a, b, full statistical results in Supplementary Table 2), indicating stronger purifying selection on X-linked genes. In B. coprophila and M. destructor, pS was also lower on the X, which could be explained by associated background selection. Strangely, we found pS to be higher on the X in L. ingenua (Fig. 4c). Higher pS can be caused by a higher mutation rate or a lower recombination rate (and reduced background selection), but because of equal transmission across the genome we do not expect these to differ between the X and autosomes under PGE. However, it is possible that particular genomic architecture on the autosomes could affect pS, similar to how the X’ supergene in B. coprophila affects neutral diversity through recombination suppression. We checked heterozygosity along the genome in the resequenced individuals of this species and did not find evidence of recombination suppression on the autosomes. However, some individuals had runs of homozygosity that may result from cross-overs where parents were siblings, which likely explains this finding. These runs may also reflect a degree of inbreeding in the population that we sampled, which could explain why both pN and pS were lower for L. ingenua compared to the other two species (Supplementary Fig. 2).

Fig. 4: Polymorphism and selection on the X chromosomes (blue) versus autosomes (grey) in the fungus gnats Bradysia coprophila and Lycoriella ingenua and the gall midge Mayetiola destructor.

Scaled rates of a polymorphism (pN/pS), b nonsynonymous polymorphism (pN), and c synonymous polymorphism (pS) are shown. d Per-gene α. The result shown for B. coprophila is after removing polymorphisms with frequencies lower than 0.2 (see Supplementary Notes 1). e Aggregate α. Error bars in e represent 95% confidence intervals obtained by parametric bootstrapping. Asterisks represent significance levels for two-sided Mann-Whitney-Wilcoxon tests between the X and autosomes (*P < 0.05, **P < 0.01, ***P < 0.001, NS = not significant). Box plots show median (central line), IQR (25th and 75th percentiles, box limits); whiskers extend to 1.5 x IQR. Outliers in a–c are not shown. See Supplementary Table 2 for full statistical results, including numbers of genes in each comparison.

Although purifying selection appears to be predominant on the X, this does not preclude the X from evolving more adaptively relative to the autosomes, because haploid selection should cause recessive beneficial mutations to fix more quickly. To examine the contribution of adaptive evolution, we combined polymorphism and divergence in B. coprophila, L. ingenua, and M. destructor to calculate α, the proportion of substitutions driven by positive selection38. We calculated α via two methods: (i) separately for each gene, and (ii) from summed variants across the X and autosomes (i.e., aggregated α) to increase power to detect signatures of selection. Per-gene α was significantly higher for the X relative to autosomes in L. ingenua and M. destructor, suggesting faster adaptation on the X. This was originally not the case for B. coprophila, where α was lower on the X. Since an excess of segregating weakly deleterious mutations can reduce α values39, we removed low-frequency polymorphisms in a manner similar to an asymptotic McDonald-Kreitman test40 prior to calculating α (Supplementary Notes 1), after which we were able to recover positive α for the X (Fig. 4d). For aggregated α, the X was higher than the autosomes in L. ingenua and M. destructor, but not significantly so. Only in B. coprophila was there a significant difference, but this species again had lower α on the X. We hypothesise that this may be due to recombination suppression and the reduced Ne of the B. coprophila X, resulting in less efficient positive selection as well as less efficient purging of the segregating weakly deleterious variants that drive down α. Together, these findings suggest that the X may still be adapting faster under PGE, but that in B. coprophila, recombination suppression may reduce the adaptability of this chromosome. Overall, our results suggest that the predominance of purifying selection on X-linked genes causes the slower-X divergence we observe under PGE.
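The distinction between per-gene and aggregated α can be made concrete with a short sketch. It assumes the standard McDonald-Kreitman-based estimator, α = 1 − (Ds × Pn) / (Dn × Ps), where Dn and Ds are nonsynonymous and synonymous substitutions and Pn and Ps are nonsynonymous and synonymous polymorphisms. The function names, dictionary keys, and toy counts are hypothetical, and the frequency filtering described in Supplementary Notes 1 is not reproduced here.

```python
def alpha(dn, ds, pn, ps):
    """MK-based alpha; returns None when the estimate is undefined (Dn or Ps is zero)."""
    if dn == 0 or ps == 0:
        return None
    return 1.0 - (ds * pn) / (dn * ps)

def per_gene_alpha(genes):
    """Alpha computed separately for each gene."""
    return [alpha(g["dn"], g["ds"], g["pn"], g["ps"]) for g in genes]

def aggregate_alpha(genes):
    """Alpha computed after summing counts across genes, which increases power."""
    dn = sum(g["dn"] for g in genes)
    ds = sum(g["ds"] for g in genes)
    pn = sum(g["pn"] for g in genes)
    ps = sum(g["ps"] for g in genes)
    return alpha(dn, ds, pn, ps)

# Toy counts for two genes (hypothetical).
genes = [
    {"dn": 10, "ds": 20, "pn": 2, "ps": 8},
    {"dn": 5, "ds": 10, "pn": 3, "ps": 6},
]
print(per_gene_alpha(genes))
print(aggregate_alpha(genes))
```

Per-gene estimates are undefined for genes without nonsynonymous substitutions or synonymous polymorphisms, which is one reason summing counts across genes can be preferable when data per gene are sparse.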

Sex-biased gene expression and hemizygous selection

Since genes are only likely exposed to selection if they are expressed, and the X is only hemizygous in males, the effect of selection on genes should increase with male-biased expression. Moreover, because sex chromosomes spend disproportionately more time in one sex, they should accumulate more genes with sex-biased expression than autosomes41,42. Accordingly, X and Z chromosomes are often found to have an excess of female-biased (feminisation) or male-biased genes (masculinisation), respectively43,44,45,46, a pattern which tends to be driven primarily by sexually dimorphic expression in reproductive tissues47.

We analysed differential gene expression (DGE) in bodies and reproductive tissues of B. coprophila and L. ingenua, and we did not find a consistent distribution of sex-biased genes: for genes expressed in the body, the X appeared to have an excess of male-biased genes and a dearth of female-biased genes relative to the autosomes in L. ingenua and B. coprophila, and we found the opposite pattern for reproductive tissues (residuals in Supplementary Table 3). In flies, the X is generally found to harbour an excess of female-biased genes and a deficit of male-biased genes, and strong sex-biased gene expression is largely limited to gonadal tissue48. Interestingly, we found extreme patterns of sex-biased gene expression that were not limited to the gonads, with 72–80% of genes showing sex-biased expression in the body and 73–81% in the reproductive tissues (Fig. 5a). Such extreme patterns of sex-biased gene expression are often associated with sexual dimorphism49 or resolution of intragenomic conflicts50,51. Extreme sex-biased expression also appears to be associated with PGE in scale insects52, which could hint at an association with a history of intragenomic conflict. We extended our DGE analysis to three other species: the fungus gnat B. odoriphaga and the gall midges M. destructor and Sitodiplosis mosellana, and found similarly extreme patterns (Fig. 5b), further suggesting that these results could be associated with PGE rather than being an idiosyncrasy of the sciarid clade. Previous work suggests that PGE creates strongly asymmetric conditions for male- and female-beneficial variants to invade53, which may explain an accumulation of genes with strongly male- and female-biased expression. The inconsistent patterns of sex-biased gene content that we found on the X and autosomes could be indicative of turnover in sex-biased gene content, but a causal link requires further exploration.

Fig. 5: Differential gene expression on the X chromosomes and autosomes under PGE.

a Proportions of genes with strongly sex-biased (> 90% in one sex), sex-biased (> 70% in one sex), and unbiased (30–70% in one sex) expression on the autosomes (A) and X chromosome in the body and somatic reproductive tissues of the fungus gnats Bradysia coprophila and Lycoriella ingenua (3 RNAseq replicates each). b Same as above for Bradysia odoriphaga (3 replicates) and the gall midges Sitodiplosis mosellana (2 replicates) and Mayetiola destructor (1 replicate). Asterisks represent significant differences in sex-biased composition using two-sided χ2 tests (***P < 0.001, NS = not significant). Significance test results are not shown for S. mosellana or M. destructor as the comparisons were based on fewer than 3 biological replicates. See Supplementary Table 3 for all residuals and P-values.
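The expression categories in the caption above can be sketched as a simple classifier. The sketch assumes that the percentage refers to the male share of a gene's summed male and female expression, computed from normalised mean expression per sex; how the share was derived in practice is an assumption here, and the function name and category labels are hypothetical.

```python
def classify_sex_bias(male_expr, female_expr):
    """Assign a sex-bias category from the share of expression found in males."""
    total = male_expr + female_expr
    if total <= 0:
        return "not expressed"
    male_share = male_expr / total
    if male_share > 0.9:
        return "strongly male-biased"
    if male_share > 0.7:
        return "male-biased"
    if male_share < 0.1:
        return "strongly female-biased"
    if male_share < 0.3:
        return "female-biased"
    # Between 30% and 70% of expression in either sex.
    return "unbiased"

# Toy expression values (hypothetical).
for male, female in [(95, 5), (80, 20), (50, 50), (20, 80), (5, 95)]:
    print(male, female, classify_sex_bias(male, female))
```

Genes with no detectable expression in either sex are kept in a separate category so that they do not contribute to the unbiased class.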

We calculated dN/dS and pN/pS for male-biased, female-biased, and unbiased genes in B. coprophila and L. ingenua to determine whether the slower-X divergence we observed is primarily driven by haploid selection in males (full statistical results in Supplementary Table 4). We found that genes with sex-biased expression generally evolve more quickly than those with unbiased expression, in line with findings from previous studies on other organisms7,10,54. Consistent with haploid selection in males, male-biased X-linked genes had lower dN/dS than male-biased autosomal genes (Fig. 6a). In B. coprophila and L. ingenua, this result was not limited to gonadal tissues, which was surprising given that sexually dimorphic tissues are often the most rapidly evolving traits55. Similarly, Heliconius shows no particular correlation between faster-evolving Z-linked genes and expression in the ovaries16. Surprisingly, we found a similar pattern for female-biased genes (Fig. 6a). Such a finding could be due to background selection from selection on male-biased genes, but it may also suggest that some effect other than haploid selection is driving slower-X divergence.

Fig. 6: Evolution of sex-biased body- and gonad-expressed genes in Bradysia coprophila and Lycoriella ingenua.

Genes are designated as body- or gonad-expressed if they are predominantly expressed in one tissue (i.e., > 50% of a gene’s expression). Note that we excluded germline (testes and ovaries) from reproductive tissues as germline and somatic karyotypes differ (see Methods). a Scaled rates of divergence (dN/dS) and b scaled rates of polymorphism (pN/pS) for autosomal (grey) and X-linked (blue) genes. MB = male-biased, UB = unbiased, FB = female-biased. Asterisks represent significance levels for two-sided Mann-Whitney-Wilcoxon tests between the X and autosomes (*P < 0.05, **P < 0.01, ***P < 0.001, NS = not significant). Box plots show median (central line), IQR (25th and 75th percentiles, box limits); whiskers extend to 1.5 x IQR. Outliers not shown. See Supplementary Table 4 for full statistical test results, including numbers of genes in each comparison.

Due to haploid selection, we also expect the patterns of pN/pS observed above to apply primarily to genes with male-biased expression. However, with the exception of male-biased body-expressed genes in L. ingenua, we did not recover significant differences in pN/pS between the X and autosomes. In some cases, this may be because relatively few genes were being tested (Fig. 6b and Supplementary Table 4).

Differences in X and autosome expression may contribute to slower X divergence

We investigated differences in expression levels between autosomal and X-linked genes, which could influence the exposure of alleles to selection: genes with higher expression are likely to have a larger phenotypic effect and should, therefore, be under stronger purifying selection56,57,58. For example, in the pea aphid system, in which parthenogenetic sex determination also predicts that NeX/NeA should be approximately equal59,60, X-linked genes have much lower expression than autosomal genes and show correspondingly higher dN/dS7.

Across all species in which we assayed gene expression levels, we found that X-linked genes were generally more highly expressed than autosomal genes in both sexes (Fig. 7). Only in L. ingenua was the difference in autosomal versus X-linked expression not significant in every tissue. That we found similar patterns of autosomal and X expression in both sexes confirms previous findings of dosage compensation in fungus gnats61,62, and suggests that the X may also be dosage balanced in gall midges. Moreover, the fact that we find higher expression of the X may reflect possible dosage overcompensation, a theoretically predicted but rarely observed phenomenon in sex chromosome evolution63. The X chromosomes of fungus gnats are evolutionarily independent from those of gall midges: the former comprise Muller Elements A and E, while the latter are mainly C, D, and F64. That both families show higher expression of X-linked genes therefore suggests a potential evolutionary explanation for this pattern, which remains to be explored.

Fig. 7: Expression (Log mean counts) of autosomal (grey) versus X-linked (blue) genes.

X chromosomes generally show higher expression than autosomes in the fungus gnats Bradysia coprophila, Lycoriella ingenua, B. odoriphaga, and the gall midges Sitodiplosis mosellana and Mayetiola destructor. Asterisks represent significance levels for two-sided Mann-Whitney-Wilcoxon tests, Bonferroni-corrected for multiple tests, between the X and autosomes (*P < 0.05, ***P < 0.001, NS = not significant). Box plots show median (central line), IQR (25th and 75th percentiles, box limits); whiskers extend to 1.5 x IQR. Outliers not shown. See Supplementary Table 5 for all residuals and P-values, including numbers of genes in each comparison.

Discussion

Most species in which evolutionary rates of autosomes and sex chromosomes have been compared show faster rates of divergence for X- or Z-linked genes, and this is often attributed to faster adaptation or stronger drift acting on sex-linked genes. However, results are often mixed, and the causes of faster divergence are a challenge to disentangle. We found that under PGE in fungus gnats and gall midges, where NeX/NeA = 1, X chromosomes diverge more slowly than do autosomes. The slower-X divergence appears to be driven by stronger purifying selection of X-linked genes, resulting from a combination of haploid selection in males and purifying selection on the more highly expressed X chromosome.

Interestingly, the unusual genomic architecture of B. coprophila, where inversions suppress recombination, also results in drastically reduced diversity on the X. While this could theoretically contribute to the slower-X divergence we see, it is unlikely to explain our observations across PGE species for two reasons. Firstly, L. ingenua and many other fungus gnats lack inversions on their X chromosomes yet show similar patterns of X versus autosomal divergence. Secondly, gall midges can also be monogenic with much smaller associated inversions32, but these are autosomal and should, therefore, have the opposite effect to that which we see.

More broadly, our findings suggest that the consistent observations of higher dN/dS for sex chromosomes relative to autosomes5,6,7,8,9,10,11,12 may be driven primarily by the smaller effective population size of the X chromosomes, and, therefore, that faster divergence of X chromosomes is primarily a result of more weakly deleterious mutations fixing through drift rather than beneficial mutations through selection. Thus, while evidence suggests that X chromosomes often indeed adapt faster than autosomes, faster-X divergence is not necessarily adaptive. Overall, our findings demonstrate how species with unusual inheritance mechanisms are uniquely placed to provide insights into complex questions surrounding sex chromosome evolution.

Methods

Genome assembly, annotation, and sex-linkage assignment for species pairs

We used publicly available genomes for A. aphidimyza, B. coprophila, B. odoriphaga, C. nasturtii, D. febrilis, L. ingenua, M. destructor, P. flavipes, and S. mosellana (see Supplementary Table 6 for all accessions). For all remaining species, we generated de novo genome assemblies using WGS Illumina data. For D. femoratus, C. rumicis, and M. hordei, we used publicly available data. We collected and generated WGS data for L. agraria, B. pectoralis, B. confinis, and B. desolata. B. confinis, B. desolata, and B. pectoralis were collected from Cerová Vrchovina forest, Slovakia. L. agraria was collected from Wytham Woods, Oxford, UK. All collected specimens were identified from genital clasper morphology and amplification and Sanger sequencing of the COI barcode sequence using the primer pair LCO1490/HCO219865.

Genomic DNA was extracted from collected flies using a protocol modified to obtain high-molecular-weight DNA with low-input material, developed by C Laumer at the Wellcome Sanger Institute (dx.doi.org/10.17504/protocols.io.bypypvpw). Quantification and quality control were performed with Qubit and Nanodrop (ThermoFisher). Samples were sequenced for 150 bp paired-end Illumina reads on the NovaSeq 6000 platform (Supplementary Table 7 for per-sample coverage). Raw reads were trimmed with fastp v0.12.466. We generated de novo short read assemblies for B. confinis, B. desolata, and B. pectoralis using SPAdes v3.14.167 and used Blobtools v168 to identify and remove contaminant (non-metazoan), low-coverage (< 5x), and short (< 500 bp) contigs. Assembly quality was assessed with BUSCO v569 (Supplementary Fig. 4). We constructed a phylogenetic tree to determine species pairs using BUSCO Phylogenomics (github.com/jamiemcg/BUSCO_phylogenomics) and IQTree v2.2.570, using FigTree v1.4.471 to plot the trees.

For B. coprophila, B. odoriphaga, and C. nasturtii, RefSeq gene predictions were publicly available (Supplementary Table 6). For all other species, we generated de novo predictions. The genomes were soft-masked with Red v272 and annotated with BRAKER273,74,75,76,77 using the OrthoDB v10 Diptera protein sequences78. The L. ingenua genome annotation was generated at a later date using BRAKER379 and the OrthoDB v11 Dipteran protein sequences, along with all RNAseq libraries generated in this study (see below), which were mapped to the genome with HISAT80.

We identified autosomal and X-linked genes based on read depth from the heterogametic sex (males), as autosomal sequences should have diploid (2n) coverage while X-linked sequences should have haploid (1n) coverage. Where male reads were available for both members of a species pair, we aligned them with BWA-MEM81 and used per-scaffold read depth, computed with SAMtools v1.1482, to identify autosomal and X-linked sequences. For 3 species pairs (B. coprophila/B. odoriphaga, L. ingenua/L. agraria, M. destructor/M. hordei), we inferred X-linkage from only one species since male reads were not available for both. For one species pair (C. nasturtii/C. rumicis), only female reads were available, so we inferred X-linkage based on alignments, using Minimap v2.17-r94183, to S. mosellana, which was the closest chromosome-level outgroup for which sex linkage information was available (for more information on sex-linkage assignments see Supplementary Notes 2).
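The depth-based logic above can be illustrated with a minimal sketch. It assumes per-scaffold mean depths from a male library have already been computed, and it takes the genome-wide median depth as the autosomal (2n) baseline. The function name, the example depths, and the 0.75 cut-off are hypothetical and are not the thresholds used in this study.

```python
from statistics import median

def classify_scaffolds(depth_by_scaffold, ratio_cutoff=0.75):
    """Label scaffolds 'X' or 'A' from mean read depth in a male library.

    depth_by_scaffold: dict mapping scaffold name -> mean depth.
    ratio_cutoff: hypothetical threshold on depth relative to the median;
    0.75 lies midway between the expected 0.5 (X) and 1.0 (autosome).
    """
    if not depth_by_scaffold:
        return {}
    # Assumption: most scaffolds are autosomal, so the median depth
    # approximates diploid (2n) coverage.
    baseline = median(depth_by_scaffold.values())
    if baseline <= 0:
        raise ValueError("median depth must be positive")
    labels = {}
    for name, depth in depth_by_scaffold.items():
        labels[name] = "X" if depth / baseline < ratio_cutoff else "A"
    return labels

# Made-up depths: three scaffolds near 40x, two near 20x.
depths = {"scaf1": 40.2, "scaf2": 39.1, "scaf3": 20.5,
          "scaf4": 41.0, "scaf5": 19.8}
labels = classify_scaffolds(depths)
```

In this toy input, the two scaffolds at roughly half the median depth are labelled X-linked. A real analysis would also need to handle length-weighting, repeats, and scaffolds with intermediate depth.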

Analysis of neutral diversity

We calculated nucleotide diversity (θ) at fourfold degenerate (i.e., putatively neutral) sites to test our prediction that under PGE, NeX/NeA should equal 1. To this end, we used alignments of resequencing datasets for B. coprophila, L. ingenua, and M. destructor (described below), and the population genomic tool ANGSD v0.94084 to calculate θ for each scaffold.

We analysed θ across sliding windows of the X chromosome in B. coprophila and L. ingenua individuals using resequencing datasets (see below). We called genotypes at all sites with GATK-485,86,87 using the output mode EMIT_ALL_CONFIDENT_SITES with the option --include-non-variant-sites when genotyping VCFs, in order to call all (variant and invariant) genotypes at every site. The scripts parseVCF.py and popgenWindows.py (https://github.com/simonhmartin/genomics_general) were used to calculate θ in 100 kb sliding windows across the X in XX and X’X individuals.

Analysis of between-species divergence

To calculate dN/dS, we used alignments of WGS data of one species to the genome of another. For each pair, the species with the better-quality (more contiguous) genome was chosen as the ingroup to ensure better mapping rates and reduce mapping times (Table 1; see Supplementary Table 8 for assembly statistics). Outgroups were aligned to their respective ingroup species with BWA-MEM81 using default parameters. PicardTools (http://broadinstitute.github.io/picard/) was used to sort alignments, mark and remove duplicates, and add read groups. Variants were called using the GATK-4 best practices pipeline85,86,87, and were filtered for Quality by Depth > 2, Fisher Strand-bias < 60, and Mapping Quality > 40. Variant annotations were generated using SnpEff v5.1d88 and SnpSift v5.1d89. The script partitioncds.py90 was used to annotate degeneracy for all genic sites. Finally, a custom R script91 was used to calculate scaled divergence (dN/dS), i.e., the ratio between non-synonymous variants per non-synonymous site (dN) and synonymous variants per synonymous site (dS). We calculated dN/dS only for single-copy orthologs within each species pair, which we identified by running Orthofinder v1.4.292 on the longest predicted isoforms of the predicted genes.
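As an illustration of the ratio defined above, the following sketch computes dN/dS for one gene from variant and site counts. It is not the custom R script used in this study. The function names and example counts are hypothetical, and no correction for multiple substitutions at a site is applied.

```python
def per_site_rate(n_variants, n_sites):
    """Variants per site for one class of sites."""
    if n_sites <= 0:
        raise ValueError("number of sites must be positive")
    return n_variants / n_sites

def dn_ds(nonsyn_variants, nonsyn_sites, syn_variants, syn_sites):
    """Return dN/dS, or None when dS is zero and the ratio is undefined."""
    dn = per_site_rate(nonsyn_variants, nonsyn_sites)
    ds = per_site_rate(syn_variants, syn_sites)
    if ds == 0:
        return None
    return dn / ds

# Made-up gene: 12 non-synonymous variants over 3000 non-synonymous sites,
# 30 synonymous variants over 1000 synonymous sites.
ratio = dn_ds(12, 3000, 30, 1000)
```

Returning None for genes with no synonymous variants is one possible convention; such genes could equally be excluded before the calculation.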

We calculated DXY between species by aligning whole genomes with Minimap v2.17-r94183 with the option -x asm20 to allow for 20% sequence divergence and --secondary=no to avoid secondary alignments. We also repeated this for the X and autosomes separately. DXY was calculated as 1 – sequence identity (i.e., the number of mismatches divided by the total length of aligned sequence).
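The DXY definition above reduces to a single ratio pooled over all alignment blocks. The sketch below assumes each block has already been summarised as a (mismatches, aligned length) pair. How these values are extracted from the aligner output, and how gaps are counted, are not specified here and are left as assumptions. The function name and example values are hypothetical.

```python
def dxy_from_blocks(blocks):
    """Compute DXY as total mismatches / total aligned length.

    blocks: iterable of (n_mismatches, aligned_length) tuples,
    one per alignment block.
    """
    blocks = list(blocks)
    total_mismatches = sum(mm for mm, _ in blocks)
    total_length = sum(length for _, length in blocks)
    if total_length == 0:
        raise ValueError("no aligned sequence")
    return total_mismatches / total_length

# Made-up blocks: 50 and 30 mismatches over 1000 aligned bases each.
dxy = dxy_from_blocks([(50, 1000), (30, 1000)])
```

Pooling counts before dividing weights each block by its length, so long alignments contribute more than short ones. Running the same function on X-linked and autosomal blocks separately gives the per-compartment estimates.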

Analysis of within-species polymorphism and selection

To estimate scaled rates of polymorphism (i.e., pN/pS), we utilised population resequencing data for B. coprophila, L. ingenua, and M. destructor. We collected 11 B. coprophila females and 5 L. ingenua XX females for resequencing. In taxonomic literature, B. coprophila is traditionally named B. tilicola, but we use the former name here to maintain consistency with evolutionary literature. Females of B. coprophila were collected from houseplants in different apartments in Edinburgh, as well as from the Royal Botanic Gardens, Edinburgh, UK. Because B. coprophila females can be either X’X (heterozygous for an X-linked supergene) or XX, we identified XX and X’X females based on distributions of heterozygosity using a sliding window analysis (Supplementary Notes 3). We identified 3 individuals as having the XX genotype and the remaining 8 as having the X’X genotype. Since we were interested in polymorphisms on the standard X chromosome, we analysed only the 3 XX individuals. We collected L. ingenua from a single mushroom straw log obtained from Mycobee Mushroom Farm, North Berwick, from which thousands of individuals were emerging. These individuals may, therefore, have descended from relatively few founders, which may explain why we observed lower polymorphism in this species (see main text). DNA extractions and sequencing were performed as described above, for ~20x coverage per individual. For M. destructor, we used 5 publicly available resequencing libraries from female individuals93. We calculated pN/pS using the same method described above for dN/dS but aligning resequencing libraries to their respective genomes and combining VCFs with GATK-4 CombineGVCFs prior to genotyping.

To assess the extent of positive (adaptive) selection on the X versus the autosomes, we calculated α as one minus the Neutrality Index (α = 1 – NI)38. A higher α value indicates a higher proportion of fixed non-synonymous differences that are driven by positive selection. We calculated α both per gene and aggregated within a chromosome (i.e., by summing dN, dS, pN, and pS across autosomal and X-linked genes prior to calculating α). Because some genes contain very few variants, the latter method arguably allows for a more powerful test of positive selection38. Many of the per-gene α values were negative, as was aggregate α for the B. coprophila X and L. ingenua X and autosomes. Since an excess of weakly deleterious segregating variants can cause negative α, we applied a sliding cut-off of both synonymous and non-synonymous polymorphisms, akin to an asymptotic McDonald-Kreitman test40. We provide all α estimates in Supplementary Notes 1, and present original estimates in the main text, except for B. coprophila per-gene α where we present the estimate after removing polymorphisms with frequencies < 0.2.
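The per-gene and aggregate calculations can be sketched as follows, using the standard definition NI = (pN/pS)/(dN/dS), so that α = 1 – (dS × pN)/(dN × pS). The sketch treats dN, dS, pN, and pS as counts of variants per gene. The function names and example values are hypothetical, and the frequency cut-off described above is not implemented.

```python
def alpha(dn, ds, pn, ps):
    """Return alpha = 1 - NI, or None when NI is undefined.

    NI = (pn / ps) / (dn / ds), which equals (ds * pn) / (dn * ps).
    """
    if dn == 0 or ps == 0:
        return None
    return 1 - (ds * pn) / (dn * ps)

def aggregate_alpha(genes):
    """Sum counts across genes first, then compute a single alpha.

    genes: iterable of (dn, ds, pn, ps) tuples, one per gene.
    """
    genes = list(genes)
    dn = sum(g[0] for g in genes)
    ds = sum(g[1] for g in genes)
    pn = sum(g[2] for g in genes)
    ps = sum(g[3] for g in genes)
    return alpha(dn, ds, pn, ps)

# Two made-up genes, given as (dn, ds, pn, ps).
genes = [(10, 20, 4, 16), (2, 10, 3, 5)]
per_gene = [alpha(*g) for g in genes]   # [0.5, -2.0]
pooled = aggregate_alpha(genes)         # 1 - 210/252, about 0.167
```

The second gene shows how a few segregating non-synonymous variants in a gene with little divergence produce a strongly negative per-gene α, whereas the pooled estimate is less sensitive to any single gene.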

Differential gene expression

We mass-reared lab cultures of B. coprophila94 and L. ingenua and performed dissections to separate gonadal tissue from the remainder of the body (Supplementary Fig. 5). Germline (i.e., testes or ovaries) and somatic reproductive tissues were separated and sequenced separately, since male Sciaridae germ cells are XX95. Adult flies were anaesthetised on ice, dissected in 1 × phosphate buffered saline on a glass slide under a Leica EZ4 microscope, and flash-frozen at −70 °C. 20–60 individuals were pooled per replicate, and 3 replicates were collected for each sex; for B. coprophila, we used male-producing (XX) females, which lack the female-limited X’ supergene. RNA was extracted using a modified version of the PureLink RNA purification kit (ThermoFisher) with a TRIzol solubilisation step. All samples were quantified and quality-checked using the Qubit and Nanodrop (ThermoFisher) and were sequenced for 30 M 150 bp paired-end reads on the NovaSeq 6000 platform, for 9 G data per sample.

To analyse sex-biased gene expression and classify genes as differentially expressed, we used Kallisto v0.46.196. Indices were generated from the B. coprophila33 and L. ingenua gene predictions, against which RNAseq libraries were quantified. Data quality control (PCA and comparing transcripts per million (TPM) distributions, Supplementary Fig. 6) and quantile-normalisation of transcript counts were performed in R. Sex-biased expression was assessed using the specificity metric (SPM) for male versus female expression by dividing the squared mean counts for each gene in females by the sum of squared mean male and female counts97, such that male-limited genes have an SPM of 0 and female-limited genes an SPM of 1. Genes with SPM < 0.3 were assigned as male-biased and SPM > 0.7 as female-biased, corresponding to a difference of 1.5x in expression between the sexes. Genes with SPM < 0.1 and > 0.9 were assigned as strongly male-biased and strongly female-biased, respectively, though when analysing dN/dS and pN/pS of sex-biased genes we combined strongly-biased and biased genes into one category. Prior to calculating SPM, we excluded genes with normalised counts < 4 to ensure that our results were not driven by potentially unreliable assignment of bias in genes with very low expression. Because sexually dimorphic reproductive tissues generally contribute disproportionately to sex-biased gene expression, we categorised genes into those that were predominantly (> 50%) expressed in the body (i.e., non-sexually dimorphic tissues) and those predominantly expressed in the somatic reproductive tissues (i.e., sexually dimorphic tissues). For analyses of dN/dS and pN/pS for differentially expressed genes, we only included somatic gonadal tissue (i.e., excluding testes and germline) since germ cells are XX in both sexes and should thus not be subject to haploid selection.
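The SPM calculation and the bias categories described above can be sketched as follows. The function names are hypothetical, and the low-expression filter and quantile-normalisation are assumed to have been applied beforehand.

```python
def spm_female(mean_male, mean_female):
    """Squared mean female counts over the sum of squared male and female means.

    Returns 0 for male-limited genes, 1 for female-limited genes,
    and None when the gene is not expressed in either sex.
    """
    denom = mean_male ** 2 + mean_female ** 2
    if denom == 0:
        return None
    return mean_female ** 2 / denom

def classify_bias(spm):
    """Assign a bias category using the thresholds given in the text."""
    if spm is None:
        return "unexpressed"
    if spm < 0.1:
        return "strongly male-biased"
    if spm < 0.3:
        return "male-biased"
    if spm > 0.9:
        return "strongly female-biased"
    if spm > 0.7:
        return "female-biased"
    return "unbiased"

# Made-up (male, female) mean counts.
examples = {
    "equal": classify_bias(spm_female(10, 10)),         # SPM = 0.5
    "female_2x": classify_bias(spm_female(10, 20)),     # SPM = 0.8
    "male_2x": classify_bias(spm_female(20, 10)),       # SPM = 0.2
    "female_limited": classify_bias(spm_female(0, 5)),  # SPM = 1.0
}
```

Because SPM is computed from squared means, the 0.7 threshold corresponds to a female-to-male expression ratio of √(0.7/0.3) ≈ 1.53, which matches the approximately 1.5x difference stated above.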

We also performed a complementary differential gene expression analysis on counts with DESeq298, which is more conservative in defining sex-biased genes compared to using SPM. We used an adjusted P-value cut-off of < 0.05 to define expression as significant and a Log2 fold-change of > 0.6 to categorise genes as sex-biased, again corresponding to a difference in expression between the sexes of approximately 1.5x. DESeq2 assigned fewer genes as sex-biased compared to SPM, but there were no major inconsistencies between the two methods (i.e., no genes were categorised as male-biased by one method but female-biased by the other, Supplementary Table 9).

Statistical analyses

Differences in dN/dS, pN/pS, and α between X chromosomes and autosomes were tested for significance with Mann-Whitney-Wilcoxon tests. Sex-biased composition of chromosomes was assessed using Chi-squared (χ2) tests. To test for significant differences in α between the X and autosomes, we used a permutation test framework10,12,99. We calculated point estimates of α for the X and autosomes and used the absolute value of the difference between the two estimates as the permutation test statistic. From the two gene sets being tested, we randomly sampled individual genes to create two permuted sets of equal size to the true sets, without replacement. We then calculated absolute differences in α between the two sampled gene sets for 10,000 permutations and constructed a distribution of differences in point estimates for α. These distributions were then compared to the true value, and P-values were calculated as the proportion of times the observed value was smaller than the values in the permuted distribution.
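The permutation procedure can be sketched as below. This is an illustrative implementation under stated assumptions rather than the code used in this study: genes are represented as (dN, dS, pN, pS) count tuples, α is computed on summed counts, and the pooled gene list is shuffled and split into sets matching the original sizes. The function names, example counts, and fixed seed are hypothetical.

```python
import random

def aggregate_alpha(genes):
    """alpha = 1 - (dS * pN) / (dN * pS) on counts summed across genes."""
    dn = sum(g[0] for g in genes)
    ds = sum(g[1] for g in genes)
    pn = sum(g[2] for g in genes)
    ps = sum(g[3] for g in genes)
    return 1 - (ds * pn) / (dn * ps)

def permutation_test(x_genes, a_genes, n_perm=10_000, seed=0):
    """Return (observed |alpha_X - alpha_A|, permutation P-value)."""
    rng = random.Random(seed)
    observed = abs(aggregate_alpha(x_genes) - aggregate_alpha(a_genes))
    pooled = list(x_genes) + list(a_genes)
    n_x = len(x_genes)
    larger = 0
    for _ in range(n_perm):
        # Shuffling and splitting samples genes without replacement.
        rng.shuffle(pooled)
        diff = abs(aggregate_alpha(pooled[:n_x]) - aggregate_alpha(pooled[n_x:]))
        # Count permutations in which the observed value is the smaller one.
        if diff > observed:
            larger += 1
    return observed, larger / n_perm

# Made-up X-linked and autosomal genes, given as (dn, ds, pn, ps).
x_genes = [(12, 20, 3, 15), (9, 18, 2, 12), (15, 25, 4, 20)]
a_genes = [(5, 20, 6, 14), (4, 22, 5, 15), (6, 25, 7, 18), (3, 19, 4, 12)]
observed, p_value = permutation_test(x_genes, a_genes, n_perm=1000)
```

The sketch assumes every permuted set has non-zero summed dN and pS. With real data, sets containing few genes may need a guard against division by zero.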

Reporting summary

Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.