Abstract
Chromosomal organization is sufficiently evolutionarily stable that large syntenic blocks of genes can be recognized even between species as distantly related as mammals and puffer fish (450 million years (Myr) of divergence)1,2,3,4,5,6,7. In Diptera, the gene content of the X chromosome and the autosomes is well conserved: in Drosophila more than 95% of the genes have remained on the same chromosome arm in the 12 sequenced species (63 Myr of divergence, traversing 400 Myr of evolution)2,4,6, and the same linkage groups are clearly recognizable in mosquito genomes (260 Myr of divergence)3,5,7. Here we investigate the conservation of Y-linked gene content among the 12 sequenced Drosophila species. We found that only a quarter of the Drosophila melanogaster Y-linked genes (3 out of 12) are Y-linked in all sequenced species, and that most of them (7 out of 12) were acquired less than 63 Myr ago. Hence, whereas the organization of other Drosophila chromosomes traces back to the common ancestor with mosquitoes, the gene content of the D. melanogaster Y chromosome is much younger. Gene losses are known to have an important role in the evolution of Y chromosomes8,9,10, and we indeed found two such cases. However, the rate of gene gain in the Drosophila Y chromosomes investigated is 10.9 times higher than the rate of gene loss (95% confidence interval: 2.3–52.5), indicating a clear tendency of the Y chromosomes to increase in gene content. In contrast with the mammalian Y chromosome, gene gains have a prominent role in the evolution of the Drosophila Y chromosome.
Main
Even in sequenced species little is known about the Y chromosomes, because their heterochromatic state precludes sequence assembly into large and easily studied scaffolds; instead, short Y-linked scaffolds must be individually identified11,12. In most Drosophila species the Y chromosome is essential for male fertility13, and genetic data have identified between six and ten Y-linked factors required for this function14,15. The paucity of genes and its heterochromatic state suggested that, like the mammalian Y chromosome16, the Drosophila Y chromosome might be largely a degenerated X chromosome. The conservation of the fertility function in rather distant species fits well with the known conservation of the gene content of Drosophila chromosomal arms6,17. Therefore sex-chromosome evolutionary theory8,9, well-known patterns of chromosome evolution in Drosophila, and conservation of biological function all suggest that the Drosophila Y chromosome ought to be a degenerated X chromosome, with a few remaining and well-conserved genes. However, the 12 genes identified on the D. melanogaster Y chromosome were all acquired through gene duplications from the autosomes, rather than being a relic subset of the X-linked genes18,19,20,21,22. Furthermore, a Y-chromosome–autosome fusion in the D. pseudoobscura lineage made the ancestral Y chromosome into part of an autosome, and a new Y chromosome arose23. Both findings suggest that Drosophila Y chromosomes are labile and raise the question of how well conserved their gene content is.
The recent sequencing of 10 further Drosophila genomes24 allows a detailed study of this question. We first identified the putative orthologues of the 12 known D. melanogaster Y-linked genes18,19,20,21,22 in the remaining species (see Methods). Owing to the low coverage of the Y chromosome11 and its abundance of repetitive sequences, the sequences of almost all Y-linked genes have large gaps and sequencing errors, and different exons of the same gene are scattered in several scaffolds19,20 (Supplementary Fig. 1). These problems were corrected by direct sequencing of the products from polymerase chain reaction with reverse transcription (RT–PCR) and rapid amplification of complementary DNA ends (RACE) (see Methods) for all genes. We sequenced ∼150 kilobases (kb), and the average gene has one-third of its sequence generated de novo (Supplementary Table 1). Notably, we could not find the orthologue of the Pp1-Y1 gene in D. mojavensis or the orthologue of Ppr-Y in D. grimshawi, even among the raw sequencing traces. Synteny analysis strongly suggests that the Pp1-Y1 loss is real; degenerate PCR with a primer pair that amplifies Ppr-Y in a broad range of species confirmed its loss in D. grimshawi (Supplementary Discussion).
Molecular evolutionary analysis, revealing a substantial excess of synonymous over nonsynonymous changes in protein-coding genes, strongly indicates that all of these Y-linked genes are functional (Supplementary Table 2). Orthology was confirmed by phylogenetic analysis of all genes (Supplementary Fig. 2). We then tested their Y-linkage by PCR in males and females. Notably, many of the genes are not Y-linked in several species (Supplementary Fig. 3 and Table 1). The results of D. pseudoobscura and D. persimilis are expected, given the known Y-autosome fusion that occurred in this lineage23. The other linkage changes (Table 1) can be caused by individual movements of genes from the Y chromosome to the other chromosomes or vice versa. Movement direction was unambiguously ascertained by synteny analysis even in the kl-5 gene, with the data indicating two independent transfers to the Y chromosome (Fig. 1 and Supplementary Fig. 4). Using synteny (Supplementary Figs 4–8) and the known phylogenetic relationships among the sequenced species24, we could infer the direction and time of the gene movements, as shown in Fig. 2. Intron positions were conserved in all cases, which rules out retrotransposition and suggests a DNA-based mechanism for the gene movements (Supplementary Discussion). Most or all extant genes were acquired individually by the Y chromosome (as opposed to resulting from large segmental duplications), because they are not adjacent to each other at their original autosomal locations (Supplementary Figs 4–8 and Supplementary Table 3).
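The logic of the male/female PCR test can be sketched as a small decision rule. This is an illustrative sketch only: the function name, the `Linkage` categories and the handling of failed or contradictory reactions are assumptions made here, not the authors' protocol, and the example gene label is hypothetical.

```python
from enum import Enum


class Linkage(Enum):
    Y_LINKED = "Y-linked"
    NOT_Y_LINKED = "autosomal or X-linked"
    INCONCLUSIVE = "inconclusive"


def classify_linkage(amplifies_in_males: bool, amplifies_in_females: bool) -> Linkage:
    """Interpret a PCR assay run on male and female genomic DNA.

    Assumption: a product in males only indicates Y-linkage; a product in
    both sexes indicates an autosomal or X-linked location; anything else
    (no product, or product in females only) is treated as a failed or
    contradictory assay.
    """
    if amplifies_in_males and not amplifies_in_females:
        return Linkage.Y_LINKED
    if amplifies_in_males and amplifies_in_females:
        return Linkage.NOT_Y_LINKED
    return Linkage.INCONCLUSIVE


# Hypothetical assay result, for illustration only.
print("gene_A:", classify_linkage(True, False).value)
```

Treating a female-only product as inconclusive rather than as evidence of linkage is a conservative choice: it flags the reaction for repetition instead of forcing a call.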
Figure 1: a, b, The kl-5 gene is Y-linked in all examined Drosophila species except D. willistoni (and D. pseudoobscura and D. persimilis), which might suggest a Y-chromosome-to-autosome transfer in the D. willistoni lineage. However, the conserved synteny between D. willistoni and Anopheles gambiae (a) shows that the autosomal D. willistoni location is ancestral (thick lines in b). Hence, there were two independent transfers of kl-5 to the Y chromosome (arrows in b). Note that the Drosophila CG3330 gene has no orthologue in Anopheles. See Supplementary Fig. 4 for the remaining species.
Figure 2: Gene gains (red arrows) and losses (blue arrows) were inferred by synteny. For changes that occurred before the split of the Drosophila and Sophophora subgenera (genes kl-2, kl-3, ORY, PRY and Ppr-Y; dashed arrows) there is no close outgroup for inferring the direction (gain versus loss) through synteny. However, all five genes are autosomal or X-linked in Anopheles, which suggests that they were acquired by the Y chromosome between 260 (that is, the Drosophila–Anopheles divergence time3,5) and 63 Myr ago.
It is clear from Fig. 2 that the gene content of the Drosophila Y chromosome is highly variable: among the 12 known Y-linked genes of D. melanogaster, only three (kl-2, kl-3 and ORY) are Y-linked in all sequenced species (we ignored the special case of the Y-chromosome–autosome fusion in the D. pseudoobscura lineage because the changes that happened there were not caused by individual gene gain and loss). All other genes (75% of the total) moved onto or off the Y chromosome at least once, or were lost. This contrasts sharply with the remainder of the genome, where it was found that 514 genes out of ∼13,000 (4% of the total) moved to different chromosome arms in the same set of species6, and may suggest that there is increased gene movement to and from the Y chromosome, as has been observed for the X chromosome25,26,27. However, the rate of gene movements in the Y chromosome is smaller than the rate of similarly sized chromosome arms (Supplementary Discussion), and thus increased gene movement does not seem to be the main cause of the low conservation of Y-linked gene content.
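The two proportions contrasted above follow directly from the counts in the text. The short calculation below only restates that arithmetic; the variable names are chosen here for illustration, and 13,000 is the approximate genome-wide gene count quoted above.

```python
# Counts quoted in the text.
y_genes_total = 12         # known Y-linked genes of D. melanogaster
y_genes_conserved = 3      # Y-linked in all sequenced species (kl-2, kl-3, ORY)
moved_elsewhere = 514      # genes that changed chromosome arm in the rest of the genome
genes_elsewhere = 13_000   # approximate number of genes surveyed (given as ~13,000)

# Fraction of genes whose linkage changed (or that were lost) in each compartment.
frac_y = (y_genes_total - y_genes_conserved) / y_genes_total
frac_elsewhere = moved_elsewhere / genes_elsewhere

print(f"Y chromosome: {frac_y:.0%} of genes not conserved")
print(f"other chromosome arms: {frac_elsewhere:.1%} of genes moved")
```

These are per-gene fractions, not rates of movement per chromosome, which is why the text goes on to compare the Y chromosome with similarly sized chromosome arms before drawing a conclusion.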
The contrast between the Y chromosome and the other chromosomes seems to reflect their different evolutionary histories: whereas in the ancestor of all sequenced species the large chromosome arms had thousands of genes, the Y chromosome had a very low number of genes (we know of five: kl-2, kl-3, Ppr-Y, PRY and ORY; Fig. 2). This, coupled with a small number of gene movements in both genomic compartments, would produce the present pattern of low conservation in the Y chromosome and high conservation in the other chromosomes. A possible caveat to this conclusion is that we do not know the full gene content of the Drosophila Y chromosome22. However, the low conservation of linkage we found should hold for the full gene set of the D. melanogaster Y chromosome, because the discovery of the 12 known Y-linked genes did not use any information from the other species (their genomic sequences were not even available at that time). Hence it is safe to conclude that most of the D. melanogaster Y-linked genes are recent acquisitions. In contrast, the mammalian Y chromosome mostly contains relic subsets of the X-linked genes, and variation in the Y-linked gene content among species reflects differential loss of these relic genes and some gene acquisitions28,29. In Drosophila no such relic genes have been found, and variation arises mainly from a continuing process of gene acquisition.
Figure 2 suggests that there are more gene gains than losses in the Y chromosome lineages examined, but these inferences were drawn using genes ascertained in D. melanogaster, raising a concern about bias. For example, D. virilis probably harbours Y-linked genes that were either acquired after its ancestor split from the D. melanogaster lineage, or were lost in the D. melanogaster lineage, and such genes would not be detected in the present study. Indeed, direct search in the D. virilis genome identified at least two Y-linked genes not shared with D. melanogaster (A.B.C. and A.G.C., unpublished data). Given the ascertainment issue, only the rate of gene gain can be estimated in the D. melanogaster lineage branches of the phylogeny, and only the rate of gene loss can be estimated in the other branches (Supplementary Fig. 9). This procedure produces an estimate of the raw rate of gene gain by the Y chromosome of 0.1113 genes per Myr (7 gains in 63 Myr), whereas the raw rate of gene loss is 0.0073 genes per Myr (2 losses in 275 Myr). After correcting for an ascertainment bias in the loss rate (Supplementary Methods), and under the assumption that the rates of gene gain and gene loss are homogeneous across the lineages, we found that the rate of gene gain is 10.9 times higher than the rate of gene loss (P = 0.003 under the null hypothesis of equal gain and loss rates), which strongly suggests that the gene content of the Y chromosome has indeed increased.
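The raw rates quoted above are simple ratios of event counts to branch time, and can be reproduced as follows. This sketch uses the rounded times given in the text, so the gain rate comes out as 0.1111 rather than the quoted 0.1113; the small difference presumably reflects an unrounded divergence time, which is an assumption made here. The ascertainment correction that yields the reported 10.9-fold ratio is described only in the Supplementary Methods and is not attempted.

```python
# Event counts and branch times quoted in the text (times are rounded).
GAINS = 7            # gene gains on the D. melanogaster lineage branches
GAIN_TIME_MYR = 63   # branch time over which gains are detectable
LOSSES = 2           # gene losses on the remaining branches
LOSS_TIME_MYR = 275  # branch time over which losses are detectable


def raw_rate(events: int, time_myr: float) -> float:
    """Events per million years, without any ascertainment correction."""
    return events / time_myr


gain_rate = raw_rate(GAINS, GAIN_TIME_MYR)
loss_rate = raw_rate(LOSSES, LOSS_TIME_MYR)
raw_ratio = gain_rate / loss_rate

print(f"raw gain rate: {gain_rate:.4f} genes per Myr")
print(f"raw loss rate: {loss_rate:.4f} genes per Myr")
print(f"uncorrected gain/loss ratio: {raw_ratio:.1f}")
```

The uncorrected ratio is not the 10.9 reported in the text, which is obtained only after the loss rate has been corrected for ascertainment.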
To explore more fully the consequences of the ascertainment bias of gene content, we performed simulations of gene gain and loss using the observed phylogeny and branch lengths, and made inferences of gene loss conditional on observing the same genes in D. melanogaster (identical to the true ascertainment). Approximate Bayesian estimates of the posterior densities of the rates of gene gain and loss were obtained by a rejection-sampling procedure for 1,000 runs (Supplementary Methods). All 1,000 runs had a gene gain rate exceeding the gene loss rate across the phylogeny (Fig. 3 and Supplementary Fig. 11). Thus both the simulations and the analytical result provide strong evidence that the Y chromosome lineages examined have experienced a net gain in gene number. The origin of the Drosophila Y chromosome remains a controversial issue9,23; if one assumes that it arose from the degeneration of the X chromosome, then only more recently did gene gains become important, after all of its ancestral genes (shared with the X chromosome) had been lost.
Figure 3: A Bayesian rejection sampling procedure was applied (see text) to yield 1,000 estimates of the rates of gene gain and loss conditional on the observed gains and losses of genes on the Y chromosome, and conditional on the genes being observed in D. melanogaster (matching the actual ascertainment of Y genes used in this study). The average net gain rate (gain rate minus loss rate) is 0.130 genes per Myr, and all 1,000 simulations had a higher rate of gene gain than loss (range of net gain rate: 0.035 to 0.352).
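The general idea of rejection sampling for approximate Bayesian estimation can be illustrated with a deliberately simplified model. The sketch below is not the procedure of the Supplementary Methods: it assumes independent Poisson counts of gains and losses over the total branch times quoted in the text, uniform priors with arbitrary upper bounds, and no simulation on the phylogeny or conditioning on ascertainment in D. melanogaster. All function names, parameters and prior limits are hypothetical, and its output should not be expected to match the values reported for Fig. 3.

```python
import math
import random


def poisson(rng: random.Random, mean: float) -> int:
    """Draw a Poisson variate by Knuth's multiplication method (adequate for small means)."""
    limit = math.exp(-mean)
    count, product = 0, rng.random()
    while product > limit:
        count += 1
        product *= rng.random()
    return count


def abc_rejection(n_accept=1000, obs_gains=7, obs_losses=2,
                  gain_time=63.0, loss_time=275.0,
                  max_gain_rate=0.5, max_loss_rate=0.05, seed=1):
    """Collect (gain_rate, loss_rate) draws whose simulated counts equal the observed counts.

    Assumptions (not from the source): uniform priors on [0, max_*_rate] and
    Poisson-distributed event counts with mean rate * time.
    """
    rng = random.Random(seed)
    accepted = []
    while len(accepted) < n_accept:
        gain_rate = rng.uniform(0.0, max_gain_rate)
        loss_rate = rng.uniform(0.0, max_loss_rate)
        # Reject as early as possible: simulate losses only if gains match.
        if poisson(rng, gain_rate * gain_time) != obs_gains:
            continue
        if poisson(rng, loss_rate * loss_time) != obs_losses:
            continue
        accepted.append((gain_rate, loss_rate))
    return accepted


# A smaller number of accepted draws than the 1,000 in the text, to keep the demo quick.
samples = abc_rejection(n_accept=200)
mean_net = sum(g - l for g, l in samples) / len(samples)
frac_gain_higher = sum(g > l for g, l in samples) / len(samples)
print(f"mean net gain rate: {mean_net:.3f} genes per Myr")
print(f"fraction of draws with gain > loss: {frac_gain_higher:.2f}")
```

Requiring an exact match between simulated and observed counts is practical here only because the counts are small; with richer summary statistics a tolerance would normally replace the equality test.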
Given the restrictive characteristics of the Y chromosome (for example, its heterochromatic state) it is puzzling that genes moved there. Several hypotheses, ranging from neutrality to positive selection, could explain this, but our data do not definitively support any one model (Supplementary Discussion). The Y-linked gene Suppressor of Stellate, which is a recent acquisition in the D. melanogaster lineage, may be a case of positive selection30 (we excluded it because it is multi-copy and RNA-encoding). Whatever its cause, the finding that the Y chromosome has gained genes has interesting consequences. A chromosome that on average has gained genes and yet has few of them must be relatively young. Further Diptera genome sequences may shed light on this issue. But the data in hand already strongly support the conclusion that the gene content of the Drosophila Y chromosome is younger than that of the other chromosomes, and that gene acquisitions have had a prominent role in its evolution.
Methods Summary
Genomic sequences
We used the WGS3 assembly of D. melanogaster (accession AABU00000000), the TIGR assembly of D. pseudoobscura (accession AAFS01000000) and the CAF1 assemblies for all other species (available at http://rana.lbl.gov/drosophila/caf1.html). Full details of the strains used, sequencing and assembly strategies are described in ref. 24.
Search of orthologues of D. melanogaster Y-linked genes
We searched for these genes with TblastN20, using as queries the protein sequences of the D. melanogaster Y-linked genes18,19,20,21,22 and as databases the genomes of the remaining species. Orthology was confirmed by phylogenetic analysis (Supplementary Fig. 2). Supplementary Table 1 shows the accession numbers of the finished CDS sequences.
Molecular biology methods
DNA and RNA were extracted from the same strains used for the genome sequencing24. RNA and DNA extractions, PCR and RT–PCR were performed using standard protocols19,20. 3′ RACE and 5′ RACE were performed with the Invitrogen Gene Racer Kit following the instructions of the manufacturer, using testis or whole body total RNA (in the case of D. grimshawi) as templates. DNA sequencing was done at Macrogen (Korea) and the Cornell DNA sequencing core facility.
Accession codes
Data deposits
Nucleotide sequence accession numbers are listed in the Supplementary Information.
References
Aparicio, S. et al. Whole-genome shotgun assembly and analysis of the genome of Fugu rubripes . Science 297, 1301–1310 (2002)
Beverley, S. M. & Wilson, A. C. Molecular evolution in Drosophila and the higher diptera II. A time scale for fly evolution. J. Mol. Evol. 21, 1–13 (1984)
Zdobnov, E. M. et al. Comparative genome and proteome analysis of Anopheles gambiae and Drosophila melanogaster . Science 298, 149–159 (2002)
Tamura, K., Subramanian, S. & Kumar, S. Temporal patterns of fruit fly (Drosophila) evolution revealed by mutation clocks. Mol. Biol. Evol. 21, 36–44 (2004)
Yeates, D. K. & Wiegmann, B. M. The Evolutionary Biology of Flies 35 (Columbia Univ. Press, 2005)
Bhutkar, A., Russo, S. M., Smith, T. F. & Gelbart, W. M. Genome-scale analysis of positionally relocated genes. Genome Res. 17, 1880–1887 (2007)
Nene, V. et al. Genome sequence of Aedes aegypti, a major arbovirus vector. Science 316, 1718–1723 (2007)
Rice, W. R. Evolution of the Y sex chromosome in animals. Bioscience 46, 331–343 (1996)
Charlesworth, B. & Charlesworth, D. The degeneration of Y chromosomes. Phil. Trans. R. Soc. Lond. B 355, 1563–1572 (2000)
Bachtrog, D., Hom, E., Wong, K. M., Maside, X. & de Jong, P. Genomic degradation of a young Y chromosome in Drosophila miranda . Genome Biol. 9, R30 (2008)
Carvalho, A. B. et al. Y chromosome and other heterochromatic sequences of the Drosophila melanogaster genome: how far can we go? Genetica 117, 227–237 (2003)
Hoskins, R. A. et al. Sequence finishing and mapping of Drosophila melanogaster heterochromatin. Science 316, 1625–1628 (2007)
Ashburner, M., Golic, K. G. & Hawley, R. S. Drosophila: a Laboratory Handbook 2nd edn 607–639 (Cold Spring Harbour Laboratory Press, 2005)
Kennison, J. A. The genetic and cytological organization of the Y chromosome of Drosophila melanogaster . Genetics 98, 529–548 (1981)
Hackstein, J. H. & Hochstenbach, R. The elusive fertility genes of Drosophila: the ultimate haven for selfish genetic elements. Trends Genet. 11, 195–200 (1995)
Skaletsky, H. et al. The male-specific region of the human Y chromosome is a mosaic of discrete sequence classes. Nature 423, 825–837 (2003)
Sturtevant, A. H. & Novitski, E. The homologies of the chromosome elements in the genus Drosophila . Genetics 26, 517–541 (1941)
Gepner, J. & Hays, T. S. A fertility region on the Y chromosome of Drosophila melanogaster encodes a dynein microtubule motor. Proc. Natl Acad. Sci. USA 90, 11132–11136 (1993)
Carvalho, A. B., Lazzaro, B. P. & Clark, A. G. Y chromosomal fertility factors kl-2 and kl-3 of Drosophila melanogaster encode dynein heavy chain polypeptides. Proc. Natl Acad. Sci. USA 97, 13239–13244 (2000)
Carvalho, A. B., Dobo, B. A., Vibranovski, M. D. & Clark, A. G. Identification of five new genes on the Y chromosome of Drosophila melanogaster . Proc. Natl Acad. Sci. USA 98, 13225–13230 (2001)
Carvalho, A. B. & Clark, A. G. Birth of a new gene on the Drosophila Y chromosome. The 44th Annual Drosophila Research Conference, Philadelphia, USA. Abstract 318C, page 113 (The Genetics Society of America, 2003)
Vibranovski, M. D., Koerich, L. B. & Carvalho, A. B. Two new Y-linked genes in Drosophila melanogaster . Genetics 179, 2325–2327 (2008)
Carvalho, A. B. & Clark, A. G. Y chromosome of D. pseudoobscura is not homologous to the ancestral Drosophila Y. Science 307, 108–110 (2005)
Drosophila 12 Genomes Consortium Evolution of genes and genomes on the Drosophila phylogeny. Nature 450, 203–218 (2007)
Betran, E., Thornton, K. & Long, M. Retroposed new genes out of the X in Drosophila . Genome Res. 12, 1854–1859 (2002)
Emerson, J. J., Kaessmann, H., Betran, E. & Long, M. Extensive gene traffic on the mammalian X chromosome. Science 303, 537–540 (2004)
Sturgill, D., Zhang, Y., Parisi, M. & Oliver, B. Demasculinization of X chromosomes in the Drosophila genus. Nature 450, 238–241 (2007)
Graves, J. A. Sex chromosome specialization and degeneration in mammals. Cell 124, 901–914 (2006)
Murphy, W. J. et al. Novel gene acquisition on carnivore Y chromosomes. PLoS Genet. 2, e43 (2006)
Hurst, L. D. Is Stellate a relict meiotic driver? Genetics 130, 229–230 (1992)
Acknowledgements
We thank S. Kumar, P. O’Grady, T. Markow, A. J. Bhutkar, S. C. Vaz, E. Betran, A. A. Peixoto, P. H. Krieger and P. Paiva for comments on the manuscript and/or for sharing their unpublished results. We also thank T. Pinhão, A. Bastos and F. Krsticevic for help with the experiments, K. Krishnamoorthy for statistical advice and M. Fetchko for help with GenBank submission. This work was supported by Conselho Nacional de Desenvolvimento Científico e Tecnológico-CNPq, Coordenação de Aperfeiçoamento do Pessoal de Ensino Superior-CAPES, FAPERJ, FIC-NIH grant TW007604-02 (A.B.C.) and NIH grant GM64590 (A.G.C.).
Supplementary information
Supplementary Information
This file contains Supplementary Discussions, Supplementary Methods, Supplementary Figures 1-11 with Legends, Supplementary Tables 1-5, Supplementary References and Supplementary Notes. (PDF 2733 kb)
Cite this article
Koerich, L., Wang, X., Clark, A. et al. Low conservation of gene content in the Drosophila Y chromosome. Nature 456, 949–951 (2008). https://doi.org/10.1038/nature07463
DOI: https://doi.org/10.1038/nature07463