Main

What types of genetic change bring about speciation is one of the most basic questions in biology. Speciation is a fundamental outcome of life. Given metabolism, reproduction, mutation, heredity, and the spatial–temporal subdivision of the environment and of individuals into populations, new species form through time, increasing biodiversity. The evolution of this biodiversity can only be fully explained if we identify the heritable underpinnings of species formation and the forces responsible for their origins.

Here, we review how recent advances in molecular and genomic techniques are helping to achieve a greater understanding of the genetics of speciation. For the purpose of this review, we focus on technical advances rather than theoretical concepts, which are discussed extensively elsewhere1. We define speciation for sexually reproducing organisms as the transformation of within-population variation into taxonomic differences through the evolution of inherent barriers to gene flow. This definition is not universally accepted, but it remains the most commonly used by students of speciation and is of the greatest utility to dissecting the genetics of the process1. We discuss whether and how molecular techniques are helping to discern the genetic bases and evolutionary origins of barriers that contribute to population divergence. We present some recent discoveries from laboratory and field studies that apply molecular and genomics techniques to the speciation question. Our examples are chosen to illustrate some of the breadth of approaches that are used to tackle this exciting question.

We begin by framing the problem from a historical perspective, tracing how the development of ever-more-sophisticated methods has led to finer dissection of the genetic origins of species down to the level of the individual loci and eventually nucleotides. Technical and statistical advances are also extending laboratory-based discovery to natural populations, allowing researchers to investigate barriers to gene flow, genomic interactions and the genetic permeability of species boundaries in hybrid zones where differentiated taxa overlap and interbreed. Whole-genome surveys can provide representative snapshots of differentiation across the entire genomes of model genetic systems. Ultimately this progress is leading to large-scale comparative genomic analysis of entire taxonomic groups (model and emerging-model systems alike) from which general patterns and rules might emerge.

We conclude by discussing what the new methods have revealed, and potentially will reveal, about the genetics of speciation. We argue that the new technology is not necessarily providing results that are inherently different from those of earlier studies. Rather, the new methodology is adding detail by accelerating the rate and ease of data collection on a genome-wide scale. We contend that this aspect of technology will have the greatest immediate impact: it will allow us to move from isolated case studies to compilations of results for representative groups to answer relative frequency questions about processes and factors that contribute to speciation. For example, how often do natural selection, sexual selection, genomic conflict and ecological interactions drive divergence? What proportion of genetic change is regulatory versus functional? How important is chromosomal change in speciation? How often does speciation occur in the face of periodic or regular gene flow, generating mosaic patterns of genomic differentiation? It is not clear that the new methodology will lead to a major shift in our thinking about the ways in which speciation can occur. We propose that this limitation is not technical, but mostly rests in the imagination of students in the field.

Classical approaches

Descriptive and conceptual. Although he entitled his famous book On the Origin of Species, Darwin2 had only vague insights into the speciation process itself, viewing it as a later stage in a continuum from adaptive divergence among 'varieties' within species. The Russian geneticist Theodosius Dobzhansky and the German systematist Ernst Mayr instilled broader excitement in the field of speciation genetics with three conceptual advances in the 1930s and 1940s. First, Dobzhansky3 and Mayr4 compiled lists of traits that prevent gene flow between species, such as habitat divergence and hybrid sterility (here, we refer to them collectively as barriers). They noted that some of these barriers prevent the formation of hybrid offspring, whereas others prevent the success and propagation of hybrid offspring once formed. Second, Dobzhansky and Mayr argued that species can be defined in the context of these traits. This second advance identified a means for the study of speciation genetics: the genetics of barriers acting to reduce gene flow in nature indicates the genetics of speciation itself. Third, they studied these barriers directly: Mayr defined which ones operate in nature and Dobzhansky determined their genetic basis in the laboratory through controlled crosses (Box 1). The implications of this last endeavour were profound, demonstrating that barriers are traits that can be genetically mapped and that species boundaries can be quantified genetically.

Despite the dramatic advances in molecular and genomic techniques, the 'old-school' approaches of Dobzhansky and Mayr for studying the genetics of speciation still apply today5 (Box 1). Since that time, similar laboratory and field studies have become plentiful enough for elegant meta-analyses of broader taxonomic data sets to be carried out. These meta-analyses have found, unsurprisingly, that barriers between long-diverged species are typically much stronger or more effective than barriers between very recently diverged species (for example, see Refs 6–11). Therefore, genetic divergence is associated with the accumulation of more (and/or stronger) barriers, irrespective of how genetic divergence is measured.

Echoing the differences in Dobzhansky's and Mayr's approaches, a slight divide has nonetheless emerged within the speciation community. At one extreme are the researchers who investigate barriers to gene flow using laboratory crosses of well-established model species, often focusing on easily scored phenotypes such as hybrid sterility or inviability. They have been successful in identifying genes that contribute to these traits (see below). However, the specific traits or genes identified might not directly reduce or have previously restricted gene flow between the focal species in nature (indeed, they often do not, as several species studied in this way do not occur together in nature). Moreover, these genes could contribute to reproductive isolation in the laboratory, but if they arose after gene flow had essentially ceased between species in nature, they would not represent 'speciation' loci in the strict sense. Nonetheless, these studies have provided valuable genetic and evolutionary insights that can be applied to naturally hybridizing but less tractable species, and have identified the rate of accumulation of alleles that cause hybrid incompatibilities in isolated species. At the other extreme are naturalists who study very recently diverged, often hybridizing populations. Although their studies are directly applicable to natural divergence in its early stages, the populations might be ephemeral and never actually speciate: such taxa could eventually fuse (for example, see Ref. 12). So, an unproven assumption of eventual speciation underlies these studies. The authors of this review were trained at opposite ends of this divide. We argue that complementary insights are yielded by these two approaches, and genetic and genomic techniques have facilitated progress in both arenas.

Laboratory-crossable species. Genetic crosses of laboratory-amenable organisms have been a touchstone for dissecting the genetics of speciation since the early 1930s, providing information on the minimum number of genes and the relative contributions of different chromosomes (especially the sex chromosomes) to barriers to gene flow between taxa. Although the basic protocol for gene mapping remains the same, several technical and methodological advances have increased the pace of discovery since Dobzhansky's13 initial study of hybrid male sterility in Drosophila pseudoobscura 'races'. These advances include improvements in molecular genotyping (for example, see Refs 14, 15; also see below), statistical advances in detecting association between markers and traits (for example, see Refs 16, 17) and improved genetic cross methodologies (for example, see Ref. 18). Repeated backcrosses have been used to introgress small and well-defined segments from one species into another to determine the effects of single loci. For example, in a tour-de-force, True et al.19 inserted transposable P-elements into 87 positions in the genome of Drosophila mauritiana, and introgressed each of these segments into Drosophila simulans by backcrossing for 15 generations to determine the genetic positions of hybrid-sterility-conferring loci. Even now, a decade later, this represents an impressive accomplishment.
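The logic of isolating a small segment by repeated backcrossing can be illustrated with a short calculation. The sketch below is not taken from True et al.; it assumes Mendelian segregation, no selection, and loci that are unlinked to the marker retained in each generation, and the function name is hypothetical. Under those assumptions the expected donor fraction at unlinked loci halves with every backcross, so that only the region linked to the selected marker is expected to persist.

```python
from fractions import Fraction


def expected_donor_fraction(n_backcrosses: int) -> Fraction:
    """Expected donor-genome fraction at loci unlinked to the selected marker.

    Assumes an F1 hybrid (n_backcrosses = 0, half donor) that is then
    backcrossed to the recipient species n_backcrosses times, with
    Mendelian segregation and no selection acting on the unlinked loci.
    """
    if n_backcrosses < 0:
        raise ValueError("n_backcrosses must be non-negative")
    return Fraction(1, 2) ** (n_backcrosses + 1)


if __name__ == "__main__":
    for n in (0, 1, 5, 15):
        frac = expected_donor_fraction(n)
        print(f"after {n:2d} backcrosses: {frac} (about {float(frac):.2e})")
```

Whether the initial interspecific cross is counted among the backcross generations changes the exponent by one, but not the qualitative conclusion.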

Data from nature. One can only learn so much by studying model organisms in the laboratory. One problem is that a reproductive barrier between geographically isolated populations that is identified on the basis of laboratory crosses might not be an effective barrier to gene flow in the field if and when these populations encounter each other. Surveys of natural hybrid zones and/or transplant studies in the field are therefore needed to complement laboratory-based studies to establish the significance and strength of specific barriers in nature. Another problem is the lack of taxonomic representation. Successful mapping studies have, to date, been mainly limited to model systems such as Drosophila species, where detailed linkage maps and tools such as deletions, inbred lines, genomic sequences and transformation systems are available. These model systems might not be representative of the types of trait or genetic architecture involved in speciation across taxonomic groups.

In this regard, analysis of natural populations can broaden our surveys by providing entries for identifying candidate speciation genes in non-model genetic organisms20,21,22. In hybrid zones, genomic regions that are relatively impermeable to introgression (admixture) and display enhanced differentiation probably contain genes that generate barriers between populations23,24,25,26,27,28. In addition, QTL analysis in natural hybrid zones can look for correlations of markers with phenotypic traits that distinguish taxa, several of which could be associated with barriers29,30. Therefore, hybrid zones not only provide valuable genetic information on the natural history of speciation and its causes, but can also be used to help to move from broad-scale characterization of genomic architecture to the specific genes responsible for barriers20,31. In the past decade, we have witnessed the transformation of many non-model genetic organisms, such as Heliconius butterflies and Gasterosteus sticklebacks, into 'emerging' model systems that are of particular interest because they integrate complementary information on speciation from the field and laboratory to provide a fuller understanding of the process.

As with genetic crosses, genetic studies of samples from natural populations have greatly benefited from technical advances in the sensitivity, speed, types and numbers of marker that can be scored. Sequence-based studies of multiple nuclear loci are now standard and combinations of different types of marker are being used to investigate different aspects of the problem of gene flow32,33 (see also Luikart et al.34 and Box 2 for a description of the types of molecular genetic marker used in population genomics studies). Analyses of hybrid zones have yielded estimates of the number of genes that contribute to barriers, insights into how the balance between selection and migration shapes gene frequency clines, and assessments of the relative importance of inherent genomic incompatibilities, population demography and ecology in maintaining the genetic integrity of taxa20,21,22,30,35,36,37,38,39,40,41,42,43,44,45. In addition, an increasing (although still limited) number of case studies provide evidence for divergence-with-gene-flow speciation (for example, see Refs 33, 46–49) and related phenomena, including sympatric speciation through genic mechanisms that do not involve changes in ploidy number50,51,52, and hybrid speciation and the creative role of introgression in divergence1,53,54,55. Many of the case studies cited are not universally accepted1. Nonetheless, the modern approaches we discuss below can be and have been used to bolster the evidence for these controversial hypotheses in particular cases, such as sympatric speciation.

Modern approaches

Recreation or dissolution of speciation events. Genetic studies of speciation often suffer from the problem that they are investigating a process that is either complete or nearly so. One innovative yet relatively underused empirical solution to this problem involves the experimental recreation of new species and hybrid zones in nature. Applying molecular techniques to the traditional experimental hybrid approach has been fruitful in studies of hybrid speciation. Using Helianthus sunflower species, Rieseberg and colleagues56,57,58 were able to recreate the complex phenotypes of ancient hybrid species from early generation synthetic hybrids. Consistent with expectations, the same combinations of parental chromosomal segments required to generate extreme phenotypes in synthetic hybrids are those found in the natural hybrids.

Experimental hybridization has also proved useful for reconstituting possible sequential phenotypic steps from ancestral to derived states in the evolution of barriers between taxa, such as in the study of flower colour and shape in monkeyflowers59,60 and wing patterns in Heliconius butterflies61. These experiments are natural extensions of some of the classical work that selected for reduced gene flow in maize62 and Drosophila species63,64. In a recent example, Leu and Murray65 identified the genetic basis for altered mate preference by selection for increased assortative mating between two yeast strains. Although this study did not yield information on particular examples from nature, it provided a hypothesis for one physiological means of assortative mating in yeasts.

Yeast have also been used in the experimental 'de-evolution' of a species. In an elegant study that examined the effect of genome rearrangements on hybrid sterility, Delneri et al.66 experimentally reconfigured the Saccharomyces cerevisiae genome to make it collinear with that of its relative Saccharomyces mikatae. Hybrids of these two species normally produce only inviable spores, whereas hybrids of the experimentally manipulated S. cerevisiae with S. mikatae showed some (but incomplete) spore viability. This experiment showed that the genome rearrangement contributes directly to hybrid spore inviability, but also demonstrated that genic effects must contribute.

Despite past imperfections in design, Rieseberg and colleagues47,67 concluded from a review of the collective experimental hybrid literature that most traits that differentiate species seem to be under selection in the wild, that hybrid fitness tends to be contingent on both hybrid genotype and the habitat into which they are placed, that intrinsic isolating factors are not necessarily more stable and irreversible than extrinsic, ecologically related barriers, and that hybrid incompatibilities could be quickly purged in both experimental and natural hybrid zones. These conclusions demonstrate that the coupling of experimental hybridization with genomics holds great promise for understanding the speciation process.

Empirical and statistical analyses for introgressive hybridization. The new molecular approaches are having a major impact on resolving the genetic architecture and permeability of species boundaries. In the past decade, we have witnessed a shift in surveys of natural populations from studies that are based on mitochondrial sequences to studies of multiple nuclear loci to high-throughput scans of entire genomes for detecting introgression and regions of higher differentiation ('islands of speciation') between diverging taxa. It is now possible to use microarray hybridization techniques on samples from natural populations to rapidly paint a broad view of genomic differentiation. Follow-up sequence analysis is used to finely map the boundaries of introgressed versus diverged regions and to quantify the extent of differentiation for these genes.

Genetic data alone do not answer the question of whether taxa are undergoing or have undergone introgressive hybridization unless the results are viewed in the context of an appropriate evolutionary model for rigorous statistical testing. Concomitant with the advances in technology, more sophisticated analytical methods are also being developed to discern whether shared variation is due to introgression from a related species or represents the persistence of ancestral polymorphism. These methods can be categorized as being based on either summary population genetic statistics or phylogenetic gene-tree-building approaches. Gene-tree approaches can reveal alleles that show discordant phylogenetic patterns (paraphyly) that could indicate introgression68,69,70,71, but could also be explained by incomplete lineage sorting. Gene trees have also been used to distinguish historical processes through nested clade analysis72,73. However, because it is difficult to devise powerful statistical tests of competing hypotheses for gene-tree-based methodologies such as nested clade analysis74, one can incorrectly accept a best-fit model for a process that never occurred. So, gene trees could find their most practical application as a first, qualitative approach to identify loci that might have introgressed in the past or to identify taxa of possible hybrid origin.

Variation among loci in summary FST values of interpopulation differentiation75,76 and their multi-allele GST extension77 has also been used to test for introgression (for example, see Refs 26, 78). The basis for the test is that divergent selection pressures on a trait can produce large between-population allele frequency differences for genetic markers that encode or are linked to the trait, and these differences are detectable by high, outlier FST values for the markers79. The test has been applied in an analogous manner to interspecific comparisons. In this case, outlier FST values point to genomic regions that might be relatively impermeable to introgression between taxa, presumably owing to association with a barrier. Although informative as a descriptive measure, FST comparisons have their shortcomings. First, they do not fully use the genealogical information inherent in DNA sequences. Second, FST values can be affected by historical and stochastic population processes that are unrelated to introgression, one of the most problematic being differences in mutation rate among genes or genomic regions. When testing for introgression, FST and GST values should therefore be adjusted by appropriate estimates of mutation rate that are based on either levels of intraspecific polymorphism or interspecific divergence derived from outgroup comparisons. Modified approaches such as the analysis of molecular variance80 and the use of GST values81 have been developed to ameliorate some of these problems. Accompanying biogeographical information can also strengthen the case for differential introgression; for example, through comparisons of cline width in hybrid zones among loci24. However, new generation methods for DNA sequence data that are based on the coalescent theory, such as the Wang–Wakeley–Hey test of shared polymorphism to fixed differences48,82, the linkage disequilibrium test of gene flow83, the isolation with migration model devised by Nielsen and Wakeley84 and its multilocus extension85, and the relative node depth approach23,86,87, provide more sophisticated approaches to test for possible introgression (Box 3).
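The outlier logic behind FST scans can be made concrete with a minimal sketch. It is an illustration only and does not reproduce the procedure of any study cited here. It assumes biallelic markers, two equally weighted populations, known allele frequencies and no sample-size correction. The locus names, the frequencies and the cut-off of ten times the median are all hypothetical; published scans typically compare observed values against a simulated neutral distribution instead.

```python
from statistics import median


def fst_biallelic(p1: float, p2: float) -> float:
    """FST for one biallelic locus in two equally weighted populations.

    Uses the heterozygosity form FST = (HT - HS) / HT, where HT is the
    expected heterozygosity of the pooled population and HS is the mean
    expected heterozygosity within populations.
    """
    p_bar = (p1 + p2) / 2
    h_t = 2 * p_bar * (1 - p_bar)
    if h_t == 0:
        return 0.0  # locus is fixed for the same allele in both populations
    h_s = (2 * p1 * (1 - p1) + 2 * p2 * (1 - p2)) / 2
    return (h_t - h_s) / h_t


def flag_outliers(freqs: dict, factor: float = 10.0) -> list:
    """Return loci whose FST exceeds `factor` times the median FST.

    The median-based cut-off is an arbitrary, illustrative choice.
    """
    fst = {locus: fst_biallelic(p1, p2) for locus, (p1, p2) in freqs.items()}
    cutoff = factor * median(fst.values())
    return sorted(locus for locus, value in fst.items() if value > cutoff)


# Hypothetical allele frequencies (population 1, population 2) per locus.
EXAMPLE_FREQS = {
    "locA": (0.50, 0.55),
    "locB": (0.30, 0.35),
    "locC": (0.10, 0.95),
    "locD": (0.60, 0.62),
    "locE": (0.45, 0.40),
}

if __name__ == "__main__":
    print(flag_outliers(EXAMPLE_FREQS))
```

In this toy data set only locC shows strongly divergent allele frequencies. As noted above, such a flag is descriptive: it does not by itself separate restricted introgression from differences in mutation rate or demographic history.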

Recent analyses suggest that many evolutionarily related, geographically overlapping taxa exchange genes through introgressive hybridization1,55,88,89. Moreover, these taxa commonly possess mosaic (composite) genome structures88. One particularly relevant finding is that, in many instances, genomic regions that are relatively impermeable to introgression have been associated with low recombination rates, such as those caused by chromosomal rearrangement. Studies in sunflowers30, the D. pseudoobscura subgroup83,90,91,92 and Rhagoletis pomonella23,93 found evidence for greater introgression (lower divergence) in collinear segments of the genome than in inverted segments. Other regions of restricted recombination, such as pericentromeric regions or translocated/inverted regions, have also been implicated in divergence between races and species (for example, see Refs 27, 28, 94). The continued application of new molecular approaches coupled with powerful statistical analyses promises to lead to even greater insights into the relationship between genome structure and the persistence of species despite gene flow.

Use of whole-genome sequence assemblies. Whole-genome sequence assemblies have been completed for a number of eukaryotes, including several closely related taxa that can then be studied in a comparative framework. Coupling whole-genome sequences with functional studies can yield important insights into the genetic changes that underlie speciation. One recent example involves comparisons among three distantly related yeasts. Roughly 100 million years ago, a yeast ancestor experienced a whole-genome duplication95. Subsequently, alternative copies of the duplicated loci, including several essential genes, were reciprocally lost in different yeast lineages96. If two taxa were to hybridize, then for every reciprocally deleted gene, 25% of the resulting hybrid's spores would lack a functional copy of the gene96, providing a simple mechanism for hybrid dysfunction97. Although this study fails to identify a particular speciation event that is facilitated by this process, it is possible that such reciprocal gene loss contributed to at least some barriers to gene flow between yeast species.
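The 25% figure follows from simple segregation and can be checked by enumeration. The sketch below is a worked illustration, not the analysis of the cited study. It assumes that the two duplicate loci of each gene are unlinked and assort independently, that each gene is essential, and that no other incompatibilities affect spore viability; the function name is hypothetical.

```python
from fractions import Fraction
from itertools import product


def viable_spore_fraction(n_genes: int) -> Fraction:
    """Expected fraction of hybrid spores with a working copy of every gene.

    For each reciprocally lost gene, species A keeps the copy at locus 1
    and has lost the copy at locus 2, whereas species B has done the
    opposite. A haploid spore inherits each locus from A or B with equal
    probability (independent assortment is assumed).
    """
    if n_genes < 0:
        raise ValueError("n_genes must be non-negative")
    # Enumerate the four equally likely (locus 1 source, locus 2 source) pairs.
    spores = list(product("AB", repeat=2))
    # A spore is functional if locus 1 came from A or locus 2 came from B.
    functional = sum(1 for l1, l2 in spores if l1 == "A" or l2 == "B")
    per_gene = Fraction(functional, len(spores))  # 3/4 per gene
    return per_gene ** n_genes


if __name__ == "__main__":
    for n in (1, 2, 5, 10):
        ok = viable_spore_fraction(n)
        print(f"{n:2d} reciprocally lost genes: {float(1 - ok):.1%} of spores affected")
```

Under these assumptions the unaffected fraction falls as (3/4)^n with the number n of reciprocally lost essential genes, so a modest number of such losses could account for substantial spore inviability.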

The completion and assembly of several eukaryotic genome sequences has also been a boon for speciation research in several indirect ways. With respect to genetic mapping studies, researchers can now select markers at any location in the genome to help pinpoint genes that contribute to barriers (for example, see Ref. 98). Similarly, whole-genome sequences allow for the construction of microarrays and other genetic tools to assess the expression of all known and inferred transcripts for differences between species or races and strains (for example, see Ref. 99). We discuss these advances in the next section.

High-throughput approaches for genotyping or expression analysis. Methods that allow for high-throughput genotyping have dramatically increased the speed and precision with which one can localize genes that confer barriers between species. Until recently, the bulk of molecular genotyping used in genetic mapping relied on electrophoretic separation of PCR products or proteins. Capillary-based approaches have also been available, but these are still rather limited in throughput capability. Several microarray-based marker methods for scoring SNPs have been developed that can allow one to genotype literally thousands of markers simultaneously from one individual with a small quantity of DNA (for example, see Ref. 100). Although most of these array-based genotyping methods require a priori sequence information, a few, such as DArT101, do not.

High-throughput approaches can also be used to look for divergence between species or races. Most studies have examined genetic divergence either across the genome using the single genome sequence assemblies for each species (for example, see Ref. 102) or using multiple individuals of each species but typically only 30 or fewer loci (for example, see Refs 46, 103–107). The former gives a genome-wide view but lacks the ability to distinguish divergence from polymorphism within species, whereas the opposite is true for the latter approach. With newly available genomic tools, one can get a glimpse of divergence between species using multiple individuals in a high-throughput format. A recent example is the study by Turner et al.28, in which samples of genomic DNA from seven strains of each of two hybridizing Anopheles gambiae mosquito races were hybridized to oligonucleotide microarrays. The researchers identified three regions of the genome that bore significant differentiation between the two races, suggesting that genes responsible for ecological and behavioural differentiation are likely to be located there. These results were generally (albeit not completely) consistent with other studies that use single marker scoring approaches27.

Oligonucleotide and cDNA microarrays are also useful for rapidly assessing differences in gene expression for thousands of loci simultaneously; these tools might provide insights into divergent or disrupted genes that could be associated with barriers. Gene expression often evolves quickly between related species108,109, and disruptions in transcriptional regulation could contribute to reduced fitness in sterile or inviable hybrids110. Disruptions of gene expression (defined as levels that are higher or lower than in both parental strains) have been documented in interspecies hybrids (for example, see Refs 111–113), and have at least sometimes been associated with sterility in hybrids (for example, see Refs 114, 115). If disruptions of gene expression are the cause (and not the consequence) of some hybrid incompatibilities, then high-throughput 'reverse-genetics' approaches have the potential to quickly identify candidate genes or candidate pathways that contribute to speciation. That said, the challenge then is to determine whether the association is through causation. If hybrids have severely underdeveloped gonads, then sterility and underexpression of gonad-specific transcripts would both result, but the underexpression says nothing of the underlying causation.
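The definition of disrupted expression used above (hybrid levels higher or lower than in both parental strains) can be written as a simple classification rule. The sketch below is illustrative only. It assumes a single normalized expression value per gene and genotype, whereas real analyses rely on replicated measurements and statistical tests; the function name, the tolerance parameter and the example values are hypothetical.

```python
def classify_expression(hybrid: float, parent_a: float, parent_b: float,
                        tolerance: float = 0.0) -> str:
    """Classify hybrid expression of one gene relative to both parents.

    Returns "over" if the hybrid value exceeds both parental values by
    more than `tolerance`, "under" if it falls below both by more than
    `tolerance`, and "within" otherwise.
    """
    low, high = min(parent_a, parent_b), max(parent_a, parent_b)
    if hybrid > high + tolerance:
        return "over"
    if hybrid < low - tolerance:
        return "under"
    return "within"


if __name__ == "__main__":
    # Hypothetical normalized values: (hybrid, parent A, parent B).
    genes = {
        "gene1": (2.0, 5.0, 8.0),
        "gene2": (6.5, 5.0, 8.0),
        "gene3": (9.5, 5.0, 8.0),
    }
    for name, (hyb, pa, pb) in genes.items():
        print(name, classify_expression(hyb, pa, pb))
```

A gene flagged as "under" or "over" is a candidate only. As the gonad example shows, misexpression can be a consequence of hybrid dysfunction and not its cause.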

Direct gene manipulations or assays. For all candidate genes, the final standard of proof for causality rests on direct genetic manipulations. One can insert the candidate gene of species A into species B to demonstrate that a barrier is formed or disrupted by this insertion. Recent advances in technologies for gene manipulations can be broadly categorized as transposon-based, reverse-genetics approaches (for example, replacement by homologous recombination) or transgenic116. These approaches have been applied to several genes that confer barriers between species. For example, hybrids of D. simulans and D. mauritiana are sterile, and one contributor might be the putative hybrid sterility gene Odysseus (Ods). This gene was initially identified by traditional mapping and introgression approaches117,118,119. When full-length Ods cDNA from each of the two species was injected into a fertile hybrid line of these two species, a strong and statistically significant effect was seen that depended on the parent species from which the inserted alleles came120. This result provides a molecular confirmation of the effect of alleles at this locus on hybrid fertility. However, this effect was only observed when the alleles were inserted into introgression lines and not when inserted into pure D. simulans. As such, although Ods probably contributes to hybrid sterility, insertion of a foreign allele alone is not sufficient to cause it.

The above example could have suffered an additional complication if the native copies of Ods in the hybrid genome altered the phenotypic effect of the inserted copy. In a similar study, Greenberg et al.121 investigated the effect of the desaturase2 (desat2) gene on various adaptive differences between two D. melanogaster populations. They used the elegant gene-replacement technique of Rong and Golic122, which, unlike many transgenic methods, leaves only a single (transgenic) copy of the target gene of interest per genome. Using this technique on desat2, the authors found differences in cold tolerance and starvation susceptibility between geographical alleles. Although these elegant manipulations showed much potential, their result was questioned by a subsequent study using much larger sample sizes and greater numbers of replicates123.

Another possibility is to knock out the function of a particular candidate gene, and then show by complementation that it conferred the barrier of interest. Presgraves et al.124 used overlapping chromosomal deficiencies from D. melanogaster to map a hybrid inviability gene to a particular cytological region. They then tested investigator-generated loss-of-function mutations at 12 loci that span the region, assessing individual D. melanogaster mutations for their ability to uncover hybrid lethality when heterozygous with the D. simulans wild-type allele. They found that mutant alleles at only one locus failed to complement the D. simulans hybrid lethal factor, and thereby confirmed their candidate hybrid inviability locus125.
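The reasoning behind mapping with overlapping deficiencies can be sketched as set arithmetic. The example below is a schematic illustration and does not use data from Presgraves et al.; the deficiency names, locus names and outcomes are hypothetical. It assumes a single causal locus and error-free scoring: the locus must lie within every deficiency that uncovers hybrid lethality and outside every deficiency that does not.

```python
def candidate_loci(deficiencies: dict, uncovers_lethality: dict) -> set:
    """Narrow down candidate loci from overlapping deficiencies.

    `deficiencies` maps a deficiency name to the set of loci it deletes;
    `uncovers_lethality` maps the same names to True if hybrids carrying
    that deficiency die. Assumes one causal locus and no scoring errors.
    """
    lethal = [set(loci) for name, loci in deficiencies.items()
              if uncovers_lethality[name]]
    viable = [set(loci) for name, loci in deficiencies.items()
              if not uncovers_lethality[name]]
    if not lethal:
        return set()
    shared = set.intersection(*lethal)   # deleted by every lethal deficiency
    excluded = set().union(*viable)      # deleted by any viable deficiency
    return shared - excluded


# Hypothetical deficiencies spanning loci g1..g12.
DEFICIENCIES = {
    "Df1": {f"g{i}" for i in range(1, 7)},    # g1 to g6
    "Df2": {f"g{i}" for i in range(4, 10)},   # g4 to g9
    "Df3": {f"g{i}" for i in range(6, 13)},   # g6 to g12
}
OUTCOMES = {"Df1": True, "Df2": True, "Df3": False}

if __name__ == "__main__":
    print(sorted(candidate_loci(DEFICIENCIES, OUTCOMES)))
```

In this toy case the candidates narrow to g4 and g5, which would then be tested individually with loss-of-function mutations in a complementation test.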

For emerging model systems in which such tests are more difficult, in vitro tests of gene function can add support to a proposed candidate gene's involvement in a barrier that separates two species. Such studies have been done using interpopulation hybrids of Tigriopus californicus copepods, which showed reduced performance in several fitness-related traits relative to their parents. Rawson and Burton126 proposed that this hybrid breakdown was associated with co-adaptation between cytochrome c and cytochrome c oxidase. Using in vitro assays of enzyme activity, they observed that the cytochrome c variants isolated from two different populations each had significantly higher activity with the cytochrome c oxidase derived from their respective source populations, providing a mechanistic explanation for the observed hybrid fitness reduction. In a subsequent study, Harrison and Burton127 used site-directed mutagenesis to construct cytochrome c variants and showed that interpopulation hybrid breakdown can be attributed to a single, naturally occurring amino-acid substitution. Alas, the story is more complicated, as F2 hybrid offspring do not show consistently higher fitness when cytochrome c genotype matches maternal mtDNA-type in a constant 20°C environment128,129. Nonetheless, seeing how potential barriers can be mapped to a single amino acid in non-model systems is impressive and promising.

Synthesis, advances and prospects

The application of molecular genetic and genomics techniques to the study of speciation has led to significant progress in two areas. First, technical advances have facilitated the mapping and characterization of specific genes that are responsible for barriers to gene flow. Second, more extensive and sensitive genetic and statistical surveys of natural populations have allowed inferences to be made about the speciation process and the nature of species boundaries. Most, although not all, of this progress involves improvements in scale and speed, but we have moved into a phase in which we now have several 'barrier genes' in hand and potentially many more to come soon (Box 4). Collectively, these loci include genes with both housekeeping and regulatory functions, and frequently seem to be targets of natural selection1,130,131. Nevertheless, we could be surprised in the future as more barriers are mapped and prove to involve mechanisms such as meiotic drive or genomic conflict to a greater extent than is currently appreciated.

We are also gaining a clearer picture of the genetic architecture of species barriers through the analysis of naturally occurring and experimentally generated hybrids. These studies have confirmed many traditional views about speciation, such as the role that geography often plays in population divergence. They have also yielded evidence supporting contentious processes, especially in animals, related to divergence-with-gene-flow speciation. For example, many more potential cases of sympatric speciation have been proposed50,51,52.

As a result of these discoveries, questions are now shifting from whether specific types of gene and types of process occur to what their relative frequency and importance for speciation are. Genomics is therefore having a major impact on speciation research, not through a fundamental paradigm shift in theory, but by providing rigorous confirmations of hypotheses and facilitating a methodological transition to meta-analysis. The new techniques are providing the means for compiling extensive catalogues of compelling case studies from representative groups to address this next generation of questions about frequency. Additionally, the techniques can be used in combination for a more complete understanding of specific cases. For example, genetic mapping initially identified the Ods gene that causes hybrid sterility117,118, its effect was confirmed by gene manipulation120, and possible downstream targets or consequences of its allelic replacement were described through microarrays and real-time PCR115,132. These types of combinatorial approach bring us closer to understanding the molecular mechanisms that underlie speciation.

As we continue to move into the era of 'comparative speciation genomics' it is important to recognize both the strengths and limitations of genomic technologies. Unquestionably, the pace and scale of data acquisition will accelerate, owing in large part to advances in sequencing technology. It is therefore not unreasonable to think that in the next decade a single-investigator grant could propose to obtain and compare whole-genome sequences for multiple species or individuals. Annotations of genes will also improve such that the functional consequences of any observed differences could be determined computationally, and candidate genes for a particular trait could be identified almost instantaneously. In this regard, it is important not to neglect non-model organisms and to develop comparative genomics for both emerging and model systems to link field and laboratory-based studies of speciation genetics.

The 'genomics revolution' is not the be-all and end-all for the study of speciation, however, and the new purely sequence-based and/or expression-based approaches to studying speciation merely generate hypotheses to be tested. Another liability is that our current capacity to acquire data has exceeded our analytical ability to interpret the results, especially with regard to microarray experiments. Consequently, many barely interpreted data sets that are based on overly simplistic models appear in the literature. The assumption is often that with enough data, irrespective of how poorly analysed, someone will eventually 'divine' their true significance and meaning. For now, the new genomic tools are most reliably used to generate hypotheses, with careful and precise old-school reductionist bench work still needed to follow up and test these hypotheses.

A second problem is establishing the chronology of the genetic and phenotypic changes that lead to speciation. This problem pertains to determining both the sequence in which mutations in the same and different genes arose to generate a particular barrier and the order in which different barriers arose during a speciation event. In certain instances of divergence-with-gene-flow speciation it might be possible to discern the order in which changes evolved between taxa, based on levels of neutral genetic divergence at sequences linked to the genes involved. Genes that restricted gene flow early in a speciation event will tend to show greater differentiation for linked sequences than those that arose later. For other speciation events, however, the best we might be able to do is to infer the possible order of change, based on information from experimental hybridization studies. In addition, large-scale comparative meta-analyses of taxa in different temporally ordered stages of divergence could reveal trends in which certain types of barrier appear more often than not early in the speciation process relative to others (for example, see Refs 6–11).
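
As a rough illustration of the ordering logic described above, the following Python sketch ranks candidate barrier loci by the neutral divergence of their linked sequences, treating greater linked divergence as a sign of an earlier restriction of gene flow. The function name, the locus names and the divergence values are all hypothetical and are not drawn from any study. The sketch also assumes that linked divergence is directly comparable among loci, which a real analysis could not take for granted given variation in mutation rate, recombination and ancestral polymorphism.

```python
from typing import Dict, List


def rank_barrier_loci(linked_divergence: Dict[str, float]) -> List[str]:
    """Order loci from putatively earliest to latest barrier.

    Assumption (illustrative only): a locus whose linked neutral
    sequences are more diverged restricted gene flow earlier.
    """
    for name, value in linked_divergence.items():
        if value < 0:
            raise ValueError(f"negative divergence for {name}: {value}")
    # Sort locus names by linked divergence, largest first.
    return sorted(
        linked_divergence,
        key=lambda name: linked_divergence[name],
        reverse=True,
    )


# Hypothetical per-site divergence values for sequences linked to
# three candidate barrier loci (invented for this example).
example = {"locus_A": 0.012, "locus_B": 0.031, "locus_C": 0.004}
order = rank_barrier_loci(example)
print(order)
```

A ranking of this kind is only a hypothesis about chronology, to be checked against evidence from experimental hybridization studies where these are available.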

In conclusion, the technical advances of the past decade have facilitated and accelerated speciation research. However, contrary to the hype, the new genomics approaches have not led to major conceptual insights into how we think new species form. Nor do we, the authors, foresee any theoretical breakthrough in the near future that would be driven by technology alone. The new methods are the tools that help to test and shape imaginative new hypotheses — the real 'engines' of progress. We now know much more about the genetic changes that make new species and the processes that are responsible for their evolution. One day we might even reach consensus on the nagging question of what exactly constitutes a species.