Abstract
The evolutionary histories of adaptive radiations can be marked by dramatic demographic fluctuations. However, the demographic histories of ecologically-linked co-diversifying lineages remain understudied. The Laurentian Great Lakes provide a unique system of two such radiations that are dispersed across depth gradients with a predator-prey relationship. We show that the North American Coregonus species complex (“ciscoes”) radiated rapidly prior to the Last Glacial Maximum (80–90 kya), a globally warm period, followed by rapid expansion in population size. Similar patterns of demographic expansion were observed in the predator species, Lake Charr (Salvelinus namaycush), following a brief time lag, which we hypothesize was driven by predator-prey dynamics. Diversification of prey into deep water created ecological opportunities for the predators, facilitating their demographic expansion, which is consistent with an upward adaptive radiation cascade. This study provides a new timeline and environmental context for the origin of the Laurentian Great Lakes fish fauna, and firmly establishes this system as a driver of ecological diversification and rapid speciation through cyclical glaciation.
Introduction
The rapid evolution of phenotypically and ecologically diverse species from a common ancestor, known as evolutionary radiation, is often fueled by adaptation to ecological opportunities that include colonization of new habitats and extinction of interspecific competitors1,2,3. There are cases in aquatic systems where multiple sympatric lineages have each radiated into several species or morphs. Species within these sympatric lineages can be linked through spatial and trophic interactions such as overlapping habitat preferences and predator-prey dynamics4. Several famous examples of predator-driven prey diversification have been documented5,6. Conversely, it has also been theorized that rapid diversification of prey species would likely facilitate similar diversification in their predators, triggering a so-called “upward adaptive radiation cascade”, although few empirical examples of such instances have been documented7. Additionally, there is a scarcity of studies that examine the historical demographic patterns of rapid co-diversification between ecologically-linked lineages8. Investigating multiple sympatric lineages in the early stages of diversification can provide a richer understanding of the mechanistic drivers of speciation by characterizing the impact of the interactions among demographic processes, genetic change, and ecological differences9.
A well-suited system for exploring the evolutionary consequences of demographic shifts is provided by the native salmonids of the Laurentian Great Lakes (hereafter, Great Lakes), which have diversified geologically recently in conjunction with changes in their environment10. Historically endemic to the Northern Hemisphere, salmonid populations were influenced by Pleistocene glaciation cycles resulting in periods of isolation followed by population expansion, diversification, and range overlap11,12. One lineage that experienced such periods of isolation and population expansion is the genus Coregonus (Salmonidae: Coregoninae; “ciscoes”). Representatives of this most species-rich genus within Salmonidae are dispersed across cold freshwater habitats of Asia, Europe, and North America and are associated with a history of taxonomic uncertainty13,14,15. In North America, members of the Coregonus artedi species complex exhibit a vast range of ecological, phenotypic, and functional diversity accompanied by little-to-no mitochondrial genetic differentiation among species16,17,18, likely resulting from their recent, rapid diversification in lacustrine habitats. Depth gradients within these habitats allow for diversification within the species complex through niche partitioning within a lake system19. These patterns are exemplified in Lake Superior Coregonus species: Coregonus artedi favors epipelagic strata (5–90 m), while benthopelagic-oriented species such as C. hoyi and C. kiyi prefer deeper hypolimnetic zones of 55–90 m and >90 m, respectively20,21,22 (Fig. 1a). Lake Superior coregonines exhibit adaptive genetic differences across depth gradients, such as spectral tuning of the visual signal transduction protein rhodopsin23,24.
Fig. 1. a Depth profile of Lake Superior showing bathymetric distributions of Coregonus artedi, C. hoyi, C. kiyi, and two morphs of Salvelinus namaycush (lean and siscowet). Artwork provided by Joseph R. Tomelleri. b Maximum likelihood phylogenetic tree with branch support calculated through 1,000 ultrafast bootstrap replicates. PCA of genetic variation in (c) C. artedi, C. hoyi, C. kiyi, C. nigripinnis and (d) S. namaycush lean and siscowet derived from single nucleotide polymorphisms with no missing data. Source data are available at https://doi.org/10.5061/dryad.n02v6wx59.
One of the primary native predators of Coregonus is Lake Charr (Salvelinus namaycush). Like the coregonines, S. namaycush is comprised of several ecologically, genetically, and phenotypically divergent morphs (in this case formally recognized as sub-specific ‘strains’). These morphs are distributed across depth gradients similar to Coregonus25,26,27,28 (Fig. 1a). Within Lake Superior, the “lean” morph is associated with a shallower habitat preference (<80 m), whereas the “siscowet” morph has a deeper preferred habitat (>80 m)26,29,30 (Fig. 1a). The deeper-oriented siscowet S. namaycush morph is more abundant and has been shown to often feed on deepwater coregonines, such as C. hoyi and C. kiyi29,30,31. Depth gradients also facilitate presumably adaptive genetic and physiological divergence of siscowet from its shallow-water ancestors, such as morph-specific immune responses32 and greater foraging capability in low light conditions33.
The Great Lakes geologic history has been volatile, which would be reflected in the demographic history of fishes in the region. Fluctuations in population size through time are shaped by ecological and evolutionary forces, with the signatures of these influences catalogued in the genomes of extant organisms. Coalescent-based methods can utilize these signatures for estimating changes in effective population size (Ne) over geologic time using one or more genomes. Notably, the pairwise sequentially Markovian coalescent model (PSMC)34 and sequential Markov coalescent + plenty of unlabeled samples (SMC++)35 are powerful tools for inferring Ne of populations using whole-genome sequence data36,37,38. Understanding how Ne changes through time can help resolve the evolutionary history of species and show how organisms have responded to shifting biotic and abiotic factors in their environment39. Moreover, responses to shifting environments by different species can help predict how current and future environmental changes will affect demography. Coalescent-based approaches can also provide a temporal context to speciation events, including radiations that evolve rapidly through the exploitation of ecological opportunities. PSMC and other sequential Markovian coalescent (SMC) models have been used to infer divergence times between species and among intraspecific ecotypes34,40,41,42,43. The divergence between lineages can be inferred by identifying the point at which divergent Ne trajectories coalesce into a shared population history43. When coupled with environmental data, these approaches can contextualize the evolutionary origins of biodiversity by showing how populations responded to environmental forces during the time of divergence.
Here we used genome-wide coalescent-based tools to explore the parallel adaptive radiations of the C. artedi species complex and the S. namaycush morphs. Because no genome assemblies are available for any of the Coregonus study species, we generated a high-quality de novo assembly of the C. artedi genome and combined coalescent-based tools with population-level resequencing data to estimate the timing of population size changes in both predator and prey species throughout the Pleistocene. Ne paleohistories for three Coregonus species (C. artedi, C. hoyi, C. kiyi) and two morphs of S. namaycush (lean and siscowet) from Lake Superior and one Coregonus species from Lake Nipigon (C. nigripinnis) were estimated to reconstruct demographic histories in the context of environmental change. Previous studies proposed that the C. artedi species complex radiated shortly before44 or following the Last Glacial Maximum (LGM)45,46. We used genome-wide data to test these hypotheses and evaluate whether diversification was stimulated by pre-LGM Great Lakes lentic (still water) habitats. Furthermore, comparing Ne trajectories throughout the Pleistocene among Great Lakes salmonid lineages provides insights into the possible co-evolution of a native predator-prey complex, both having likely undergone adaptive radiations in concert. Ne trajectories provide information on the temporal origins of demographic independence among members of the C. artedi species complex. Moreover, the evolutionary success of the primary predator species, S. namaycush, appears to be contingent on the diversification and abundance of the prey species. Taken together, we identify the geologic context of evolutionary diversification within these lineages, with pre-LGM divergence reflecting previously unoccupied niche space in the precursors to the Great Lakes basins of today.
Results
Coregonus artedi reference genome assembly
The reference genome assembly for C. artedi was generated from DNA extracted from muscle tissue of an adult female collected in Lake Huron. A total of 11,393,306 Oxford Nanopore sequences (~28x coverage) that passed quality filtering were generated, resulting in a read N50 of 21,275 base pairs (bp). Illumina (San Diego, CA, USA) DNA sequencing produced an additional 119.9 Gbp (~40x coverage) of paired-end 150 bp sequence data. An initial raw genome assembly from Flye was 3,037,857,332 bp in length contained within 29,378 contigs. Hi-C scaffolding of the assembled contigs produced well-defined genomic blocks that were then incorporated into a linkage map47. Scaffolding of the genome with Chromonomer incorporated ~98% of the polished assembly into 38 linkage groups (chromosomes). The final assembly contained 38 scaffolds and 1,680 unincorporated contigs comprising a total length of 2,491,861,367 bp with a scaffold N50 of 52,236,198 bp and 97.1% complete BUSCOs (Table 1). Repetitive elements comprised a large proportion of the assembly (70.17%) totaling 1.748 Gbp (Supplementary Table 1). Retroelements and DNA transposons comprised 16.53% and 16.28% of the assembly, respectively (Supplementary Table 1). A large percentage of repetitive sequences in the assembly were uncategorized (37.34% of the genome). The dominant repeat families for retroelements and DNA transposons were L2/CR1/REX and Tc1-IS630-Pogo, respectively (Supplementary Table 1).
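The read N50 and scaffold N50 values reported above summarize length distributions. As a minimal sketch of how such a statistic is commonly computed (the function name and the toy lengths are illustrative assumptions, not the authors' pipeline or data):

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    N50 is the length L such that sequences of length >= L together
    contain at least half of all bases.
    """
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length
    raise ValueError("lengths must be non-empty")


# Toy lengths (hypothetical, not taken from the C. artedi assembly).
toy_lengths = [100, 80, 60, 40, 20]
print(n50(toy_lengths))
```

Sorting from longest to shortest and accumulating until half of the total is reached means that a few long scaffolds dominate the statistic, which is why chromosome-scale scaffolding raises N50 so sharply relative to the initial contig assembly.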
Coregonus species population genomic structure
DNA was extracted from fourteen samples: four each of C. artedi, C. hoyi, and C. kiyi from Lake Superior and two samples of C. nigripinnis from Lake Nipigon. Short-read genome sequencing of these samples produced 11.63–15.38x coverage (Supplementary Table 2). The proportion of reads that were mapped to the reference genome was consistently high across all individuals (99.32–99.89%) with a proper pair alignment range of 94.93–96.1% (Supplementary Table 2). A total of 15,331,196 SNPs were identified. The first PC axis of SNP data (~13% of the variation) separated C. nigripinnis from C. artedi and C. kiyi, with C. hoyi occupying an intermediate position along this axis. PC2 (9% of the variation) separated C. kiyi from C. hoyi and C. artedi, with C. nigripinnis being intermediate (Fig. 1c). The third PC axis (8% of the variation) separated C. hoyi from the other species. Additional PC axes revealed only individual-level variation (Supplementary Fig. 1). The SNP matrix used for phylogenetic analysis contained 7,008,815 sites after filtering, 60% of which were parsimony informative. TVM + F + ASC + R2 was selected as the best-fit substitution model based on BIC scores. The resulting phylogenetic tree had 100% bootstrap support for all but one branch (Fig. 1b). C. artedi, C. kiyi, and C. nigripinnis were resolved as monophyletic groups, while C. hoyi formed a paraphyletic grade. The longest internal branch in the phylogeny separated C. nigripinnis from the Lake Superior species, consistent with the PCA results.
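The PC axes and "% of the variation" values above come from a principal component analysis of a genotype matrix. The sketch below illustrates the general technique with NumPy on a small made-up matrix of genotypes coded 0/1/2; the function name, the coding, and the toy data are assumptions for illustration and do not reproduce the software or data used in the study.

```python
import numpy as np


def pca(genotypes):
    """PCA of an individuals x SNPs genotype matrix via SVD.

    Returns per-individual PC scores and the fraction of variance
    explained by each axis.
    """
    X = np.asarray(genotypes, dtype=float)
    X = X - X.mean(axis=0)                 # center each SNP column
    U, S, Vt = np.linalg.svd(X, full_matrices=False)
    scores = U * S                         # coordinates of individuals on each PC
    explained = S**2 / np.sum(S**2)        # proportion of variance per PC
    return scores, explained


# Hypothetical toy data: rows 0-2 and rows 3-5 are two differentiated groups.
toy = np.array([
    [0, 0, 0, 2, 2, 1],
    [0, 1, 0, 2, 2, 1],
    [0, 0, 1, 2, 1, 1],
    [2, 2, 2, 0, 0, 1],
    [2, 1, 2, 0, 0, 1],
    [2, 2, 1, 0, 1, 1],
])
scores, explained = pca(toy)
print(np.round(explained[:3], 3))
print(np.round(scores[:, 0], 3))
```

In this toy case the two groups fall on opposite sides of PC1. The sign of each axis is arbitrary, so only the relative positions of individuals are meaningful.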
Demographic history of Coregonus species
Demographic reconstructions using PSMC (Fig. 2, Supplementary Fig. 2a,c,e,g) showed that the four species of Coregonus had nearly identical Ne trajectories from three million years ago (mya) to approximately 80 thousand years ago (kya). All Ne trajectories declined steadily while remaining in tight correspondence, with multiple plateaus occurring through time. Between 80 and 90 kya, the Ne trajectories diverged, after which all species reached their lowest Ne estimates during the LGM (19.5–33 kya). At this point in time, C. artedi had the lowest estimated Ne of any species (Supplementary Fig. 2a). Following the LGM, all species showed a massive, rapid population expansion (Fig. 2).
Fig. 2. a Mean surface-air temperature of the continental Northern Hemisphere through time, relative to present day. Temperature data from De Boer, Lourens, and Van De Wal102. b Estimates of effective population size through time via PSMC analysis for four species of Coregonus (C. artedi, C. hoyi, C. kiyi, C. nigripinnis) and Salvelinus namaycush (lean and siscowet). Dark grey bar indicates the time range of the Last Glacial Maximum (19–33 kya). Light grey bars indicate time points (10.5, 13, 15.5, 17.5 kya) of deglaciation stages (c) of the Laurentian Great Lakes basin. Shapefiles for glacial margins as well as river and lake boundaries obtained from Dalton et al.73 and https://www.naturalearthdata.com, respectively. Source data are available at https://doi.org/10.5061/dryad.n02v6wx59 (ref. 103).
Estimates of effective population size through time with SMC++ (Supplementary Fig. 3) began around four mya with consistent Ne trajectories for all species until 60 kya. There was a downward trend in Ne until all species reached their lowest Ne between 30 and 50 kya. The dates for minimum Ne from SMC++ were consistently older than the lowest Ne point inferred from PSMC (Supplementary Fig. 2b,d,f,h, Supplementary Fig. 3). Following the lowest point in Ne, SMC++ curves showed rapid Ne increases in all species with peaks appearing at distinct time intervals across species. After the Ne peak seen in the C. artedi samples, a split in bootstrap trajectories was apparent that was maintained until the most recent estimates. The most contemporary estimates available through SMC++ (~100 years ago) showed C. artedi having the greatest Ne of ~150,000, followed by C. hoyi at ~100,000 and C. kiyi and C. nigripinnis with the lowest Ne of ~60,000. Mutation rate sensitivity analyses indicated that decreasing the mutations/site/generation to the lower bound of the confidence intervals for salmonid species presented by Bergeron et al.48 introduced multiple peaks in Ne estimates between 30–500 kya (Supplementary Fig. 4).
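One reason inferred dates and Ne values are sensitive to the assumed mutation rate is that sequentially Markovian coalescent output is expressed in mutation-scaled units and must be rescaled. The sketch below uses the rescaling described in the PSMC documentation (N0 = θ0 / (4μs), time in years = 2·N0·t·g, Ne = N0·λ, with s the bin size). All numeric inputs here (θ0, the scaled times, the two mutation rates, and the 6-year generation time) are hypothetical placeholders rather than values from this study, and the example shows only the rescaling effect, not any change in curve shape.

```python
def scale_psmc(theta0, t_scaled, lambda_scaled, mu, gen_years, bin_size=100):
    """Convert PSMC-style scaled output to years and effective population size.

    theta0        : scaled mutation rate per bin (4 * N0 * mu * bin_size)
    t_scaled      : time points in units of 2 * N0 generations
    lambda_scaled : relative population sizes (Ne / N0)
    mu            : assumed mutation rate per site per generation
    gen_years     : assumed generation time in years
    """
    n0 = theta0 / (4 * mu * bin_size)
    years = [2 * n0 * t * gen_years for t in t_scaled]
    ne = [n0 * lam for lam in lambda_scaled]
    return years, ne


# Hypothetical inputs for illustration only.
theta0 = 0.02
t_scaled = [0.1, 1.0]
lambda_scaled = [2.0, 0.5]

for mu in (8e-9, 4e-9):
    years, ne = scale_psmc(theta0, t_scaled, lambda_scaled, mu=mu, gen_years=6)
    print(mu, [round(y) for y in years], [round(n) for n in ne])
```

Halving the assumed mutation rate doubles N0, so every inferred date becomes twice as old and every Ne estimate twice as large. This is why dating statements are best read alongside the mutation rate and generation time used.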
Demographic history of Salvelinus namaycush
To understand population structure and estimate effective population sizes through time within the predator species, three lean and three siscowet S. namaycush samples were sequenced with Illumina short-reads. The samples ranged from 16.30–20.45x average alignment depth against the S. namaycush reference genome49 (Supplementary Table 3). The percentage of total reads that aligned to the reference ranged from 97.07–97.39% with a proper pair alignment range of 92.68–93.97% (Supplementary Table 3). Variant calling for the two morphs resulted in 3,534,068 SNPs. Lean and siscowet morphs were strongly separated along PC1 of the SNP data (Fig. 1d, Supplementary Fig. 5).
Estimates of effective population size for the lean and siscowet morphs of S. namaycush derived from PSMC began at approximately 2.5 mya, similar to patterns among Coregonus species (Fig. 2, Supplementary Fig. 6a,c). From 2.5 mya to 200 kya both morphs showed a steady decline in Ne with strong overlap between trajectories, and Ne then reached its lowest value (~5,000) during the LGM. The Ne curve using a generation time of 16 years (based on Hansen et al.50) and the same salmonid mutation rate as employed for Coregonus above (as in Crête-Lafrenière et al.51) reached its lowest point coincident with the LGM (Fig. 2, Supplementary Fig. 6). After the LGM, both S. namaycush morphs displayed a rapid expansion in Ne that followed the expansion of Coregonus species with a short time lag (Fig. 2b).
Demographic analyses for the lean and siscowet morphs of S. namaycush with SMC++ showed consistent trajectories up to the LGM (Supplementary Fig. 6b,d, Supplementary Fig. 8). The estimates for both morphs began just before 2 mya and continued with a stepped decline to their lowest point, which was lower than that of any Coregonus species. Following this population decline, Ne increased rapidly, coincident with the expansion of Coregonus populations. The lean morph of S. namaycush reached a maximum Ne (~200,000) at 25 kya before steadily declining until the most recent estimate (Ne ~50,000). Siscowet reached its maximum Ne (~150,000) between 10 and 20 kya and then steadily declined until the most recent estimate (Ne ~50,000).
Discussion
The C. artedi reference genome and 14 resequenced genomes from four species of Great Lakes Coregonus were used to explore population structure and species delimitation. Analyses of more than fifteen million polymorphic SNPs support differentiation of C. artedi, C. hoyi, C. kiyi, and C. nigripinnis, consistent with previous studies23,52. The two representatives of C. nigripinnis from Lake Nipigon were separated along the first PC from the rest of the Lake Superior populations, indicative of distinct population structure likely influenced by two drivers: (1) differing lineage contributions to populations in Lake Nipigon and Lake Superior, which occur at the contact zone of two glacial lineages of Coregonus spp. as inferred by Turgeon and Bernatchez45 and Favé and Turgeon53, and (2) drift from geographic separation between Lake Nipigon and Lake Superior populations (Fig. 1b). Within Lake Superior, C. artedi, C. hoyi, and C. kiyi form distinct, non-overlapping clusters along PC2. Thus, variation in genome-wide SNPs among these species mirrors their differentiation in diet19,23, spawning time and depth54,55, and visual adaptation24.
Genomic data provide a valuable resource for reconstructing effective population size through time and can inform the timing of diversification. Utilizing two methods that rely on sequentially Markovian coalescent models, we estimated effective population size through time. The overall shape of the curves was similar across the two methods, and the timing of the lowest Ne estimates was largely similar between PSMC and SMC++, with the nadir from the latter method coinciding with the start of the LGM (Supplementary Fig. 2, Supplementary Fig. 3). Such differences between methods are not uncommon, as these methods differ in accuracy across time scales. SMC++ is considered more robust at recent time scales (up to 10 kya), while PSMC is more accurate at deeper time scales (>20 kya)56,57. A low point in Ne for Coregonus species during the LGM suggests that suitable environmental conditions during this time were restricted while populations inhabited glacial refugia. Two glacial refugia, Mississippian and Atlantic Coastal, have been proposed for Coregonus species, with habitats that ranged from lotic to offshore58. Restrictions in habitat area would have constricted Coregonus population size while in these environments, reflected in the low estimates of Ne found in PSMC and SMC++ analyses.
Our results imply that the four Coregonus species sampled had a shared Ne trajectory back to ~80 kya, predating the LGM by nearly 50 thousand years. This shared trajectory of demographic history indicates that the C. artedi species complex was likely represented by a single ancestral species which diversified approximately 80 kya. We caution that estimates of divergence time through a visual examination of Ne trajectories could be confounded by other factors, such as post-divergence gene flow. Demographic independence prior to the LGM suggests that divergence occurred in the precursors to the Great Lakes, which were carved by multiple glacial cycles during the Pleistocene (past 2.4 million years)59. The timing of diversification and environmental context shown here contradicts previous hypotheses that divergence of the C. artedi species complex took place after the LGM44,45,46,60,61. The reconstructions of historical demography prompt two possible scenarios of divergence. In the first scenario, the common ancestor of the C. artedi complex diverged into independent, genetically distinct lineages with similar phenotypes and ecologies. These lineages were then isolated into the Atlantic and Mississippian glacial refugia during the LGM. Following the colonization of the Great Lakes after glacial retreat, secondary contact of glacial lineages drove speciation through niche specialization and character displacement, as seen in ecomorphotypes of Lake Whitefish (Coregonus clupeaformis)62,63. A second plausible scenario is that the C. artedi complex progenitor specialized prior to the LGM due to the depth gradients present in the precursors to the Great Lakes. When confined to glacial refugia, deepwater species such as C. hoyi and C. kiyi would not have been adapted to these shallower environments and would have had lower Ne during that time, as seen in Fig. 2. Additionally, natural selection on standing genetic variation in the form of rare alleles conferring shallow-water phenotypes, and/or cross-species gene flow, could have played a role in minimizing habitat-phenotype mismatches during this time.
Salvelinus namaycush genome resequencing data provided a unique opportunity to investigate how a predator is affected by prey diversification and expansion. Analyses of genomic variation indicate strong genetic differentiation between lean and siscowet morphs of S. namaycush (Fig. 1d). These whole-genome patterns of variation reinforce differentiation of morphs previously discovered in mitochondrial12, microsatellite64, transcriptomic65, and RAD-seq data66.
Our demographic reconstructions show that S. namaycush effective population size was lower than Coregonus species during the LGM, but S. namaycush populations expanded rapidly following the population expansion and diversification of Coregonus species. The critical nature of Coregonus species as a primary food source for S. namaycush67 plausibly signifies a causal, evolutionary response of predator diversification following the expansion of prey into newly formed habitats following the LGM. Under a scenario of newly formed habitats, a predator would not likely expand into a new niche without a prey base to sustain it. More broadly, the demographic cycles observed in Ne over evolutionary time are analogous (or perhaps even mechanistically ‘homologous’) to classic predator-prey population size oscillations over ecological time scales68. Similar to lags observed in Lotka-Volterra dynamics, we observed a slight lag in the expansion of S. namaycush populations following Coregonus population expansions. We hypothesize that S. namaycush diversified into morphs such as the siscowet to follow the newly-abundant deepwater Coregonus species (such as C. kiyi and C. hoyi) forage base, which is supported by contemporary diet studies29,30,31.
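The lag invoked in the Lotka-Volterra analogy can be illustrated with a short simulation of the classic two-species model, dx/dt = αx − βxy and dy/dt = δxy − γy. This is only a sketch: the parameter values and starting abundances are arbitrary textbook-style choices, the integrator is a simple fourth-order Runge-Kutta written for this example, and nothing here is fitted to the Ne trajectories in this study.

```python
def lotka_volterra(prey0, pred0, alpha, beta, delta, gamma, dt, steps):
    """Integrate the Lotka-Volterra equations with a fixed-step RK4 scheme."""

    def deriv(x, y):
        # Prey grow and are eaten; predators grow by eating prey and die off.
        return alpha * x - beta * x * y, delta * x * y - gamma * y

    prey, pred = [prey0], [pred0]
    x, y = prey0, pred0
    for _ in range(steps):
        k1x, k1y = deriv(x, y)
        k2x, k2y = deriv(x + 0.5 * dt * k1x, y + 0.5 * dt * k1y)
        k3x, k3y = deriv(x + 0.5 * dt * k2x, y + 0.5 * dt * k2y)
        k4x, k4y = deriv(x + dt * k3x, y + dt * k3y)
        x += dt * (k1x + 2 * k2x + 2 * k3x + k4x) / 6
        y += dt * (k1y + 2 * k2y + 2 * k3y + k4y) / 6
        prey.append(x)
        pred.append(y)
    return prey, pred


def first_peak(series):
    """Index of the first local maximum, or None if there is none."""
    for i in range(1, len(series) - 1):
        if series[i - 1] < series[i] >= series[i + 1]:
            return i
    return None


# Arbitrary illustrative parameters (not estimated from any data).
dt = 0.001
prey, pred = lotka_volterra(10.0, 5.0, alpha=1.0, beta=0.1,
                            delta=0.075, gamma=1.5, dt=dt, steps=15000)
prey_peak, pred_peak = first_peak(prey), first_peak(pred)
print("prey peaks at t =", round(prey_peak * dt, 2))
print("predator peaks at t =", round(pred_peak * dt, 2))
```

In this model prey abundance stops rising once predators exceed α/β, while predators keep increasing until prey fall back below γ/δ, so the predator peak necessarily trails the prey peak. The analogy to Ne trajectories over evolutionary time remains a hypothesis, as stated above.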
Historical demography has been analyzed in both predator and prey radiations independently, yet co-evolutionary interactions of paired radiations inferred through coalescent approaches are rare. In one example, a study of a predator-prey-scavenger system used historical demography analysis of mitochondrial data to show that mammalian predators exhibited stable populations during prey population expansion69. Our results yielded different demographic patterns in the Great Lakes, where population size fluctuations are correlated across trophic levels. This may be inherently linked to the processes involved in radiation cascades. In the presence of diversifying prey, the predator species can evolve to (1) become generalists, or (2) become specialized to favor a distinct prey phenotype7. Thus, the second option, in tandem with habitat distinctiveness in prey, may lead to an upward adaptive radiation cascade through divergent selection in predator populations. In a parallel scenario, trophic interactions between the predator and prey prior to diversification could drive disruptive selection in the prey species, thereby promoting speciation in the prey complex70. However, diversification in prey would then provide the niche space for predator species to radiate. In either case, the diversification of the prey species is a prerequisite for predator radiation, as suggested here for native Great Lakes salmonid species. Taken together, this system clearly provides unique opportunities to explore the evolutionary patterns and processes of parallel radiations in evolutionarily young species complexes.
Here we demonstrate that the North American C. artedi species complex is geologically young, yet older than previously hypothesized, and that the species likely diverged around Marine Isotope Stage 5 (MIS5; 71 to 130 kya). MIS5 was characterized by warmer climate and minimal Laurentide Ice Sheet extent, similar to the Holocene, the current interglacial period71,72. The Great Lakes basins were formed by glacial erosion during at least two dozen successive Pleistocene glaciations59. During MIS5, the Great Lakes likely had similar depths to the Holocene, as there was only one intervening glaciation that further eroded the basins. Thus, the Great Lakes during MIS5 likely had diverse lentic habitats that propelled Coregonus diversification into evolutionarily distinct lineages, perhaps with accompanying phenotypic differentiation to match environmental gradients. Following the diversification of Coregonus around 80 kya, effective population size remained low or decreased. During this time, the climate was cool and the Laurentide Ice Sheet advanced and at times filled the Great Lakes basin with ice or silt-laden ice-sheet meltwater, setting an environmental stage where conditions for population expansion would be sub-optimal.
The LGM confined Coregonus populations into isolated glacial refugia until the Laurentide Ice Sheet began to recede out of the Great Lakes basins around 17 kya73. Subsequent migration out of these refugial habitats into the newly reforming Great Lakes would have created scenarios of secondary contact between glacial lineages and abundant unoccupied niche space, conditions which can facilitate adaptive radiations74,75,76. In these vast deepwater environments, Coregonus progenitors were able to rapidly colonize and diversify into the remarkable diversity that was present prior to human-driven environmental degradation, population collapse, and loss of coregonine biodiversity during the 20th century.
The burst of rapid diversification within the C. artedi species complex occurred within the past 80 thousand years, equating to ~13,000 generations, a rate that would be ranked among the fastest in fishes estimated to date (e.g., Hench et al.)77. This is also supported by previous studies that have shown low levels of genetic divergence among these species23,52,78. The rapid diversification of the C. artedi complex is comparable to the speed of radiation and genetic divergence found in cichlid species in the African Rift Lakes79. Future studies should focus on the mechanistic drivers of this rapid diversification, such as further studies of genes under apparent natural selection that could shape the ecology of this group. Our findings substantiate the notion that the formation of deepwater habitats in the glacially eroded precursors to the Great Lakes drove diversification in this species complex and that, following migration out of glacial refugia, populations expanded rapidly.
The use of coalescent-based models with whole-genome data can help illuminate the timing and dynamics of diversification. We have shown that Coregonus species are likely a young adaptive radiation that underwent a rapid expansion in effective population size during MIS5 (~80 kya), a warm period when the Great Lakes had similar habitats to the Holocene. Moreover, we show that the subsequent population size expansion of multiple Salvelinus namaycush morphs closely follows the diversification of Coregonus, its primary prey species, suggesting linked eco-evolutionary dynamics. This work provides essential information to understand the timing and mechanisms of diversification in the commercially and ecologically important Coregonus and Salvelinus complexes. Finally, this study supports the hypothesis that habitat availability can promote diversification, which in turn suggests ongoing restoration efforts of Great Lakes ecosystems will yield important benefits for the conservation and management of native biodiversity.
Methods
Coregonus artedi genome: sample collection and DNA extraction
Eggs from Coregonus artedi were collected from Les Cheneaux Island, Lake Huron in 2015 and incubated in McDonald jars until hatching at the Great Lakes Science Center (Ann Arbor, MI). The fish were then moved to circular tanks, increasing in diameter from 1.22 to 3.05 m as the fish grew. Tank temperature was kept at 7–9 °C. A single mature female was euthanized with a lethal dose of buffered MS-222 and tissues (muscle, liver, eye, gut, and gills) were dissected immediately and preserved on dry ice. High molecular weight DNA was extracted from muscle using a Qiagen (Germantown, MD, USA) Genomic-tip 500/G kit and associated buffers following the manufacturer’s protocol. Aliquots of extracted DNA were taken and Circulomics Short Read Eliminator (SRE) and SRE XL kits were used for size selection to preserve DNA fragments longer than 10 kb and 25 kb, respectively.
Genome sequencing and assembly
Raw and size-selected genomic DNA was sequenced on fifteen MinION flow cells (version R9.4; Oxford Nanopore Technologies, Inc., Oxford, UK; hereafter ONT) using ligation sequencing kits (SQK-LSK-109). ONT long-reads were basecalled with high accuracy using Guppy v4.2.3+f90bd04. ONT reads with quality score <7 and/or length <1000 bp were discarded. Frozen muscle tissue was also sent to Novogene Corporation Inc. (Sacramento, CA, USA) for Illumina (San Diego, CA, USA) PE150 sequencing with the NovaSeq 6000 platform. Adapter trimming and quality filtering of the Illumina sequence data were performed with Trim Galore! v.0.6.0 (Krueger F. Trim-Galore!, http://www.bioinformatics.babraham.ac.uk/projects/trim_galore/) with a Phred quality score threshold of 20. To detect chromatin interactions, an additional muscle sample was sent to Phase Genomics for Hi-C library preparation and Illumina NovaSeq sequencing.
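The read filter described above (discard ONT reads with quality score <7 and/or length <1000 bp) can be sketched as a simple predicate. The tool the authors used for this step is not stated here, so the following is an assumption-laden illustration: it assumes Phred+33 encoded quality strings and defines the per-read quality as the Phred value of the mean error probability, which is one common convention among several.

```python
import math


def mean_phred(quality_string, offset=33):
    """Mean read quality computed from per-base error probabilities.

    Assumes Phred+33 encoding (an assumption; adjust offset if needed).
    """
    probs = [10 ** (-(ord(char) - offset) / 10) for char in quality_string]
    return -10 * math.log10(sum(probs) / len(probs))


def keep_read(sequence, quality_string, min_q=7.0, min_len=1000):
    """Return True if the read passes both the length and quality thresholds."""
    # Length is checked first so empty reads never reach mean_phred().
    return len(sequence) >= min_len and mean_phred(quality_string) >= min_q


# Hypothetical reads: 'I' encodes Q40 and '$' encodes Q3 under Phred+33.
print(keep_read("A" * 1200, "I" * 1200))   # long, high quality
print(keep_read("A" * 500, "I" * 500))     # too short
print(keep_read("A" * 1200, "$" * 1200))   # too low quality
```

Averaging error probabilities rather than Phred scores gives more weight to low-quality bases, so a read with a few very poor stretches is penalized more heavily than a plain arithmetic mean of scores would suggest.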
An initial genome assembly was produced with Flye v2.8-b1674 (ref. 80) using the genome size parameter (“-g”) set to 2.6 Gb and the maximum overlap distance (“-m”) set to 10 kb. Two iterations of internal polishing with Flye were run using the iteration parameter (“-i 2”). Following the initial assembly, Illumina short reads were used for polishing. Illumina short reads were mapped to the initial genome assembly with bwa v0.7.17-r1198 (ref. 81) to create a sequence alignment, which was passed to Pilon v1.23 (ref. 82) for two rounds of polishing. Purge Haplotigs v1.1.2 (ref. 83) was used to identify and remove haplotigs and deep coverage repeats. Kraken2 v.2.0.8-beta (ref. 84) was used to identify and remove possible contamination (e.g., microbiome sequences) using a custom database comprised of the Atlantic salmon (Salmo salar) genome “ICSASG_v2”85 (GCF_000233375.1) and sequences from archaea, bacteria, plasmid, viral, human, fungi, plant, protozoa and the “nr” and “nt” databases from NCBI.
We took a two-step approach to scaffold the draft assembly: we first used Hi-C to identify well-defined blocks of three-dimensional structure, and then integrated these blocks into an existing C. artedi linkage map. A Hi-C contact heatmap was created with Juicer v1.686. Using the aligned Hi-C reads, misjoins were corrected with 3D-DNA v1809287 with the settings “--editor-coarse-resolution 100000 -i 50000 -q 20 --editor-coarse-stringency 20”. Contact maps were visualized with Juicebox assembly tools88, and the remaining misassemblies were manually corrected. The output was passed to an additional run of 3D-DNA to finalize the manual edits and to place gaps filled with 150 N’s between scaffolds. The Hi-C scaffolded assembly was then integrated with the female C. artedi linkage map from Blumstein et al.47 using Chromonomer v1.1389. A file containing contigs and gaps was generated with a custom Python script ‘fasta_to_simple_agp.py’. The paired-end markers from the linkage map were then mapped to the Flye assembly version with bwa v0.7.17-r119881. The linkage map markers, assembly gap file, and sequence alignment file were used as inputs for Chromonomer to integrate Flye contigs into the linkage map, resulting in a merged Hi-C and linkage map-scaffolded assembly. Genome completeness was assessed for both the initial Flye assembly and the final assembly with BUSCO v5.1.290 using the actinopterygii_odb10 database.
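To illustrate how a file of contigs and gaps might be derived from a scaffolded sequence, the following Python sketch splits a sequence at runs of N. It is not the authors' ‘fasta_to_simple_agp.py’ script; the function name, the tuple layout, and the choice of 1-based inclusive coordinates are assumptions made for illustration only.

```python
import re

def split_on_gaps(seq, min_gap=1):
    """Split a scaffold sequence into contig and gap segments.

    Returns a list of (kind, start, end) tuples using 1-based inclusive
    coordinates, where kind is "contig" or "gap". A gap is any run of at
    least `min_gap` N characters (case-insensitive).
    """
    parts = []
    pos = 0  # 0-based index of the first base not yet assigned to a segment
    for match in re.finditer(r"[Nn]{%d,}" % min_gap, seq):
        if match.start() > pos:
            parts.append(("contig", pos + 1, match.start()))
        parts.append(("gap", match.start() + 1, match.end()))
        pos = match.end()
    if pos < len(seq):
        parts.append(("contig", pos + 1, len(seq)))
    return parts
```

For a toy scaffold made of two 4 bp contigs joined by 150 N’s, this yields a contig at 1–4, a gap at 5–154, and a contig at 155–158.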
Repetitive elements in the assembly were first characterized with RepeatModeler v2.0.191 with the “-LTRStruct” flag enabled. The repeat library was checked for coding sequences with blastx v2.9.0+ 92 against the uniprot_sprot database93, which was filtered for transposable elements. Repeat library sequences with a protein match were removed. The protein-filtered repeat library was fed into RepeatMasker v4.1.194, which was run in two passes: the first generated a gff file from the predicted repeats, and the second incorporated the “vertebrata” database from Dfam95. The results from the two passes were combined to produce a final gff containing positional information for repetitive sequences.
Coregonus species sampling and genome resequencing
Samples of C. artedi, C. hoyi, and C. kiyi were collected from multiple sites in Lake Superior between May and August of 2015 and C. nigripinnis from Lake Nipigon in September 2018 (Supplementary Table 1). For the Lake Superior fish, multiple gear types were used for sample collection across a depth gradient, with a 15.24 m midwater trawl used for the C. artedi samples and an 11.89 m benthic trawl for the C. hoyi and C. kiyi samples. The C. nigripinnis samples were collected with multi-panel gill nets of various mesh sizes. Samples were identified to species following Koelz54 and Eshenroder et al.15, euthanized by pithing, and dissected to obtain muscle tissues for DNA extraction with the Purelink Genomic DNA Kit (Invitrogen, Inc; Carlsbad, CA, USA). DNA was extracted from four individuals of each species except C. nigripinnis (n = 2). DNA was sent for Illumina PE150 genomic resequencing on the NovaSeq 6000 platform performed by Novogene (Davis, CA, USA). Illumina-specific sequencing adapters were trimmed from the reads using TrimGalore! v0.6.6 (Krueger F. Trim-Galore!, accessible at http://www.bioinformatics.babraham.ac.uk/projects/trim_galore/) with a Phred quality score cutoff of 30 (-q 30) and the default error rate (-e 0.1).
Population genomic structure
Genome resequencing data for individuals of each species of Coregonus were mapped to the genome assembly using bwa ‘mem’ v0.7.1781. Variant calling and filtering across individual samples were completed with BCFtools v1.1596 ‘mpileup’ (-C50) and ‘call’ using default parameters. VCFtools v0.1.1797 was used to retain only bi-allelic SNPs with minor allele frequency > 0.05 and no missing data. Visualization of population genomic structure was completed in R v4.3.0 by first converting the VCF to a genlight object with vcfR v1.10.0, followed by principal component analysis (PCA) with the ‘glPca’ function from adegenet v.2.1.398. In addition, true singletons were removed using VCFtools and maximum likelihood phylogenetic analyses were performed using IQ-Tree v2.2.2.699. The optimal substitution model was selected using ModelFinder v2.2.2.6100, allowing only models that included an ascertainment bias correction because the alignment contained only variable sites. Branch support was assessed using 1,000 ultrafast bootstrap replicates101. Additional optimization of the bootstrap trees using nearest neighbor interchange (-bnni) was performed to minimize the potential impacts of model violations due to concatenating genome-wide markers. The resulting unrooted phylogeny was visualized using FigTree v.1.4.4 (http://tree.bio.ed.ac.uk/software/figtree/).
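The SNP filtering criteria above (bi-allelic, minor allele frequency > 0.05, no missing data) can be expressed as a short Python sketch. This is an illustration of the logic only, not the VCFtools implementation; the function name and the genotype encoding (alternate-allele counts per diploid sample, with None for missing calls) are assumptions.

```python
def passes_filters(genotypes, n_alleles=2, min_maf=0.05):
    """Decide whether a site is retained.

    genotypes: per-sample counts of the alternate allele (0, 1 or 2 for a
               diploid), or None where the genotype is missing (assumed encoding).
    n_alleles: number of alleles observed at the site (reference included).
    """
    if n_alleles != 2:                      # keep bi-allelic sites only
        return False
    if any(g is None for g in genotypes):   # no missing data allowed
        return False
    alt_freq = sum(genotypes) / (2 * len(genotypes))
    maf = min(alt_freq, 1 - alt_freq)
    # Strict inequality follows the wording of the text ("> 0.05")
    return maf > min_maf
```

With the 14 resequenced Coregonus individuals (28 chromosomes), a variant carried on a single chromosome has a frequency of 1/28 ≈ 0.036 and would be excluded under this threshold, whereas one carried on two chromosomes (2/28 ≈ 0.071) would be retained.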
Coregonus demographic analysis with PSMC
For demographic history analyses, the resequenced Coregonus spp. were mapped to the repeat-masked version of the genome and variants were called in the same manner as for the population structure analyses. PSMC requires consensus genome fastq files for each sample, which were produced with the vcfutils.pl v0.1.1797 vcf2fq command with a minimum depth of a third of the average coverage and a maximum depth of two times the average coverage. PSMC v0.6.5-r6734 was run for 25 iterations using the parameters -t5 -r5 -p “4 + 20*2 + 6*4 + 4”, with t being the upper limit of time to the most recent common ancestor, r being the ratio of the mutation rate to the recombination rate (θ/ρ), and p specifying the pattern of free atomic time intervals. Variance in the PSMC estimates was assessed with 100 bootstrap replicates, generated by splitting the consensus genome sequence of each individual into 5 Mb blocks, resampling the blocks with replacement, and re-estimating Ne.
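As a worked check on the settings above, the following Python sketch counts the atomic time intervals and free parameters implied by a PSMC -p pattern, and computes the depth bounds used for consensus calling. It assumes the usual reading of the pattern syntax, in which a term "n*m" denotes n parameters each spanning m atomic intervals and a bare number "k" denotes one parameter spanning k intervals. The function names and the example coverage value are hypothetical.

```python
def parse_psmc_pattern(pattern):
    """Return (n_atomic_intervals, n_free_parameters) for a PSMC -p pattern."""
    n_intervals = 0
    n_params = 0
    for term in pattern.replace(" ", "").split("+"):
        if "*" in term:
            repeats, width = (int(x) for x in term.split("*"))
        else:
            repeats, width = 1, int(term)
        n_intervals += repeats * width
        n_params += repeats
    return n_intervals, n_params

def depth_bounds(mean_coverage):
    """Minimum (one third of mean) and maximum (twice mean) depth for consensus calling."""
    return mean_coverage / 3, 2 * mean_coverage

print(parse_psmc_pattern("4 + 20*2 + 6*4 + 4"))  # (72, 28)
print(depth_bounds(15))                           # (5.0, 30)
```

Under this reading, the pattern used here defines 72 atomic time intervals grouped into 28 free parameters.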
Plotting of PSMC output requires estimates for mutation rate and generation time. A nuclear genome mutation rate of 7.26 e−09 mutations/site/generation was selected following Crête-Lafrenière et al.51, a rate that falls within the confidence interval for S. salar presented by Bergeron et al.48. PSMC plots were also generated with the lower (2.50 e−09) and upper (8.23 e−09) bounds of the confidence interval from Bergeron et al.48 to evaluate the effects that mutation rate had on the demographic analyses. Generation time was calculated by using age-specific elements in a life table (Appendix I) and a final generation time of six years was applied to all samples. The parameters were input into the PSMC utility ‘psmc_plot.pl’ v0.6.5-r67 and applied to the combined PSMC bootstrap results for each sample with the ‘-R’ flag enabled to output the data as text files. The final output data were plotted with a custom R script.
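The role of mutation rate and generation time in scaling can be made concrete with a small Python sketch. It assumes the scaling relations given in the PSMC documentation: N0 = θ0 / (4μs), time in years = 2·N0·t·g, and Ne = N0·λ, where s is the bin size (100 bp by default), t and λ are the scaled time and relative population size reported by PSMC, and g is the generation time. The function name and the input values are hypothetical and are not taken from the study's results.

```python
def scale_psmc(theta0, t_k, lambda_k, mu=7.26e-9, gen_time=6.0, bin_size=100):
    """Convert PSMC's scaled output to years and effective population size.

    theta0   : per-bin scaled mutation rate estimated by PSMC
    t_k      : scaled times of the interval boundaries (units of 2*N0 generations)
    lambda_k : relative population sizes for each interval
    mu       : mutation rate per site per generation
    gen_time : generation time in years
    """
    n0 = theta0 / (4 * mu * bin_size)
    years = [2 * n0 * t * gen_time for t in t_k]
    ne = [n0 * lam for lam in lambda_k]
    return years, ne

# Hypothetical inputs chosen so that N0 is about 10,000
theta0 = 4 * 7.26e-9 * 100 * 10_000
years, ne = scale_psmc(theta0, t_k=[0.5], lambda_k=[2.0])
```

Because N0 is inversely proportional to μ, using the lower bound of the mutation rate (2.50e−09) instead of 7.26e−09 shifts both the time axis and the Ne axis upward by the same factor, which is why the plots were regenerated across the confidence interval.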
Coregonus demographic analysis with SMC++
To examine cisco demographic histories at more recent time-scales, we employed SMC++ v1.15.235. Unlike PSMC, SMC++ requires multiple sampled genomes per population or species35. VCF files from mapping to the repeat-masked genome were merged for each species using BCFtools v.0.1.1996, then filtered to retain only biallelic SNPs (-m2 -M2 -v snps). The merged VCF files were converted to SMC++ input format using the ‘vcf2smc’ subcommand. This command was run separately for each of the 38 Coregonus linkage groups and for every possible “distinguished lineage” within each species. Demographic histories for each species were then reconstructed using the ‘estimate’ subcommand, with 50 EM iterations, polarization error set to 0.5 (indicating an unknown ancestral allele), and 40 spline knots. The minimum model time point was set to 10 generations and the maximum model time point was set to 10,000 generations. However, we note that specification of the maximum model time point does not appear to work as intended; instead, the software seems to use a heuristic approach to automatically calculate model time points. As with PSMC, we used a series of mutation rates to evaluate their impacts on Ne estimates, with 7.26e−09 mutations/site/generation, the rate with the most support51, used as the final rate. To assess confidence in the inferred demographic history, we performed 100 non-parametric bootstrap replicates using a custom script to resample the original VCF files in 5 Mb blocks. For each bootstrap replicate, SMC++ was run using the parameters described above. Results were plotted using a custom R script with a generation time of 6 years.
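A generic version of the block bootstrap described above can be sketched in Python as follows. This is not the authors' custom script: it only shows how a genome might be divided into 5 Mb blocks and how blocks could be drawn with replacement to define one replicate. The function name, the linkage-group names, and the lengths are hypothetical, and extracting the corresponding VCF records is left out.

```python
import random

def block_bootstrap(chrom_lengths, block_size=5_000_000, seed=None):
    """Draw one bootstrap replicate of genomic blocks.

    chrom_lengths: dict mapping sequence name to length in bp.
    Returns a list of (name, start, end) half-open intervals, sampled with
    replacement, with as many blocks as the original genome contains.
    """
    rng = random.Random(seed)
    blocks = []
    for name, length in chrom_lengths.items():
        for start in range(0, length, block_size):
            blocks.append((name, start, min(start + block_size, length)))
    return [rng.choice(blocks) for _ in blocks]

# Hypothetical linkage-group lengths, for illustration only
replicate = block_bootstrap({"LG1": 12_000_000, "LG2": 7_500_000}, seed=1)
```

Repeating the draw 100 times with different seeds would give 100 replicate block sets, each of which can then be analysed with the same settings as the original data.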
Salvelinus namaycush population structure and demographic analysis
Historical demographic analysis of S. namaycush generally followed the Coregonus analysis. S. namaycush of two different morphs from Lake Superior, three lean and three siscowet individuals, were sequenced to ≥15x coverage with Illumina PE150 genomic sequencing on the NovaSeq 6000 platform performed by Novogene (Davis, CA, USA). S. namaycush sequence data were processed with the same trimming and quality filtering steps as the Coregonus data. Reads were mapped to a repeat-masked version of the S. namaycush reference genome, “SaNama_1.0”49, using bwa v0.7.1781, and SNPs were called with BCFtools v.0.1.1996. Population structure analyses were completed with the same techniques as for the Coregonus species. Demographic history estimation was then completed using PSMC and SMC++ as in the Coregonus analyses, except for the generation time. S. namaycush was assigned a generation time of 16 years based on previous work showing that they reach sexual maturity at ages exceeding 15 years in Canadian lakes50. A series of generation times (6, 8, 10, 12, 14, 16, 18, 20 years) was also applied when plotting the results of PSMC to estimate the effect of different generation times on trajectories of Ne.
Reporting summary
Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.
Data availability
The genome assembly and all sequence data are available under NCBI BioProjects PRJNA1062807 (Coregonus spp.) and PRJNA1077361 (Salvelinus namaycush). Source data for the figures can be found at https://doi.org/10.5061/dryad.n02v6wx59.
Code availability
Code for data analysis is available at https://github.com/KrabbenhoftLab/Coregonus_demography.
References
Simpson G. G. Tempo and mode in evolution. Columbia University Press (1944).
Simpson, G. G. The Baldwin effect. Evolution 7, 110–117 (1953).
Schluter D. The ecology of adaptive radiation. OUP Oxford (2000).
Brodersen, J., Howeth, J. G. & Post, D. M. Emergence of a novel prey life history promotes contemporary sympatric diversification in a top predator. Nat. Commun. 6, 1–9 (2015).
Reznick, D. & Endler, J. A. The Impact of Predation on Life History Evolution in Trinidadian Guppies (Poecilia reticulata). Evolution 36, 160–177 (1982).
Langerhans, R. B., Layman, C. A., Shokrollahi, A. M. & DeWitt, T. J. Predator‐driven phenotypic diversification in Gambusia affinis. Evolution 58, 2305–2318 (2004).
Brodersen, J., Post, D. M. & Seehausen, O. Upward adaptive radiation cascades: predator diversification induced by prey diversification. Trends Ecol. Evol. 33, 59–70 (2018).
Manthey, J. D., Girón, J. C. & Hruska, J. P. Impact of host demography and evolutionary history on endosymbiont molecular evolution: A test in carpenter ants (genus Camponotus) and their Blochmannia endosymbionts. Ecol. Evol. 12, e9026 (2022).
Harvey, M. G., Singhal, S. & Rabosky, D. L. Beyond reproductive isolation: Demographic controls on the speciation process. Annu. Rev. Ecol. Evol Syst. 50, 75–95 (2019).
Salisbury, S. & Ruzzante, D. Genetic causes and consequences of sympatric morph divergence in Salmonidae: a search for mechanisms. Annu. Rev. Anim. Biosci. 10, 81–106 (2022).
Waples, R. S., Pess, G. R. & Beechie, T. Evolutionary history of Pacific salmon in dynamic environments. Evolut. Appl. 1, 189–206 (2008).
Wilson, C. C. & Mandrak, N. E. History and evolution of lake trout in Shield lakes: past and future challenges. In: Boreal Shield Watersheds: Lake Trout Ecosystems in a Changing Environment, 21–35 (2004).
Svärdson, G. The Coregonid Problem. VI. The Palaearctic species and their Intergrades. Ann. Rep. Drottningholm 38, 267–356 (1957).
Stott, W. & Todd, T. N. Genetic markers and the coregonid problem. Adv. Limnol. 60, 3–23 (2007).
Eshenroder R. et al. Ciscoes of the Laurentian Great Lakes and Lake Nipigon. Ann Arbor, Mich[online] Available from http://www.glfc.org/pubs/misc/Ciscoes_of_the_Laurentian_Great_Lakes_and_Lake_Nipigon.pdf [accessed 26 January 2017], (2016).
Snyder, T. P., Larsen, R. D. & Bowen, S. H. Mitochondrial DNA diversity among Lake Superior and inland lake ciscoes (Coregonus artedi and C. hoyi). Can. J. Fish. Aquat. Sci. 49, 1902–1907 (1992).
Reed, K. M., Dorschner, M. O., Todd, T. N. & Phillips, R. B. Sequence analysis of the mitochondrial DNA control region of ciscoes (genus Coregonus): taxonomic implications for the Great Lakes species flock. Mol. Ecol. 7, 1091–1096 (1998).
Turgeon, J. & Bernatchez, L. Mitochondrial DNA phylogeography of lake cisco (Coregonus artedi): evidence supporting extensive secondary contacts between two glacial races. Mol. Ecol. 10, 987–1001 (2001).
Rosinski, C. L., Vinson, M. R. & Yule, D. L. Niche partitioning among native ciscoes and nonnative rainbow smelt in Lake Superior. Trans. Am. Fish. Soc. 149, 184–203 (2020).
Dryer, W. R. Bathymetric distribution of fish in the Apostle Islands region, Lake Superior. Trans. Am. Fish. Soc. 95, 248–259 (1966).
Selgeby J. H., & Hoff M. H. Seasonal bathymetric distributions of 16 fishes in Lake Superior, 1958–1975 (1996).
Gorman, O. T., Yule, D. L. & Stockwell, J. D. Habitat use by fishes of Lake Superior. I. Diel patterns of habitat use in nearshore and offshore waters of the Apostle Islands region. Aquat. Ecosyst. Health Manag. 15, 333–354 (2012).
Bernal, M. A. et al. Concordant patterns of morphological, stable isotope, and genetic variation in a recent ecological radiation (Salmonidae: Coregonus spp.). Mol. Ecol. 31, 4495–4509 (2022).
Eaton, K. M., Bernal, M. A., Backenstose, N. J., Yule, D. L. & Krabbenhoft, T. J. Nanopore amplicon sequencing reveals molecular convergence and local adaptation of rhodopsin in Great Lakes salmonids. Genome Biol. Evol. 13, evaa237 (2021).
Krueger, C. C. & Ihssen, P. E. Review of genetics of lake trout in the Great Lakes: history, molecular genetics, physiology, strain comparisons, and restoration management. J. Gt. Lakes Res. 21, 348–363 (1995).
Bronte, C. R. et al. Fish community change in Lake Superior, 1970–2000. Can. J. Fish. Aquat. Sci. 60, 1552–1574 (2003).
Zimmerman, M. S. & Krueger, C. C. An ecosystem perspective on re-establishing native deepwater fishes in the Laurentian Great Lakes. North Am. J. Fish. Manag. 29, 1352–1371 (2009).
Muir, A. et al. Ecomorphological diversity of lake trout at Isle Royale, Lake Superior. Trans. Am. Fish. Soc. 143, 972–987 (2014).
Harvey, C. J. & Kitchell, J. F. A stable isotope evaluation of the structure and spatial heterogeneity of a Lake Superior food web. Can. J. Fish. Aquat. Sci. 57, 1395–1403 (2000).
Kitchell, J. F. et al. Sustainability of the Lake Superior fish community: interactions in a food web context. Ecosystems 3, 545–560 (2000).
Conner, D. J., Bronte, C. R., Selgeby, J. H. & Collins, H. L. Food of salmonine predators in Lake Superior, 1981-87. Great Lakes Fishery Commission Technical Report, I–19 (1993).
Baillie, S. M., Hemstock, R. R., Muir, A. M., Krueger, C. C. & Bentzen, P. Small-scale intraspecific patterns of adaptive immunogenetic polymorphisms and neutral variation in Lake Superior lake trout. Immunogenetics 70, 53–66 (2018).
Keyler, T. D., Hrabik, T. R., Mensinger, A. F., Rogers, L. S. & Gorman, O. T. Effect of light intensity and substrate type on siscowet lake trout (Salvelinus namaycush siscowet) predation on deepwater sculpin (Myoxocephalus thompsonii). Hydrobiologia 840, 77–88 (2019).
Li, H. & Durbin, R. Inference of human population history from individual whole-genome sequences. Nature 475, 493–496 (2011).
Terhorst, J., Kamm, J. A. & Song, Y. S. Robust and scalable inference of population history from hundreds of unphased whole genomes. Nat. Genet. 49, 303–309 (2017).
Nadachowska‐Brzyska, K., Burri, R., Smeds, L. & Ellegren, H. PSMC analysis of effective population sizes in molecular ecology and its application to black‐and‐white Ficedula flycatchers. Mol. Ecol. 25, 1058–1072 (2016).
Prada, C. et al. Empty niches after extinctions increase population sizes of modern corals. Curr. Biol. 26, 3190–3194 (2016).
Barth, J. M., Damerau, M., Matschiner, M., Jentoft, S. & Hanel, R. Genomic differentiation and demographic histories of Atlantic and Indo-Pacific yellowfin tuna (Thunnus albacares) populations. Genome Biol. Evol. 9, 1084–1098 (2017).
Lucena-Perez, M. et al. Genomic patterns in the widespread Eurasian lynx shaped by Late Quaternary climatic fluctuations and anthropogenic impacts. Mol. Ecol. 29, 812–828 (2020).
Freedman, A. H. et al. Genome sequencing highlights the dynamic early history of dogs. PLoS Genet. 10, e1004016 (2014).
Lamichhaney, S. et al. Evolution of Darwin’s finches and their beaks revealed by genome sequencing. Nature 518, 371–375 (2015).
Schiffels, S. & Durbin, R. Inferring human population size and separation history from multiple genome sequences. Nat. Genet. 46, 919–925 (2014).
Prado-Martinez, J. et al. Great ape genetic diversity and population history. Nature 499, 471–475 (2013).
Bailey, R. M. & Smith, G. R. Origin and geography of the fish fauna of the Laurentian Great Lakes basin. Can. J. Fish. Aquat. Sci. 38, 1539–1561 (1981).
Turgeon, J. & Bernatchez, L. Reticulate evolution and phenotypic diversity in North American ciscoes, Coregonus ssp. (Teleostei: Salmonidae): implications for the conservation of an evolutionary legacy. Conserv. Genet. 4, 67–81 (2003).
Eshenroder, R. L. & Jacobson, P. C. Speciation in Cisco with Emphasis on Secondary Contacts, Plasticity, and Hybridization. Trans. Am. Fish. Soc. 149, 721–740 (2020).
Blumstein, D. M. et al. Comparative genomic analyses and a novel linkage map for Cisco (Coregonus artedi) provide insights into chromosomal evolution and rediploidization across salmonids. G3: Genes, Genomes, Genet. 10, 2863–2878 (2020).
Bergeron L. A. et al. Evolution of the germline mutation rate across vertebrates. Nature, 1-7 (2023).
Smith, S. R. et al. A chromosome‐anchored genome assembly for Lake Trout (Salvelinus namaycush). Mol. Ecol. Resour. 22, 679–694 (2022).
Hansen, M. J. et al. Age, Growth, Survival, and Maturity of Lake Trout Morphotypes in Lake Mistassini, Quebec. Trans. Am. Fish. Soc. 141, 1492–1503 (2012).
Crête-Lafrenière, A., Weir, L. K. & Bernatchez, L. Framing the Salmonidae family phylogenetic portrait: a more complete picture from increased taxon sampling. PloS one 7, e46662 (2012).
Ackiss, A. S., Larson, W. A. & Stott, W. Genotyping‐by‐sequencing illuminates high levels of divergence among sympatric forms of coregonines in the Laurentian Great Lakes. Evolut. Appl. 13, 1037–1054 (2020).
Favé, M.-J. & Turgeon, J. Patterns of genetic diversity in Great Lakes bloaters (Coregonus hoyi) with a view to future reintroduction in Lake Ontario. Conserv. Genet. 9, 281–293 (2008).
Koelz W. Coregonid fishes of the Great Lakes. US Government Printing Office (1929).
Smith G., & Todd T. N. Evolution of species flocks of fishes in north temperate lakes. (1984).
Dong, F. et al. Population genomic, climatic and anthropogenic evidence suggest the role of human forces in endangerment of green peafowl (Pavo muticus). Proc. R. Soc. B 288, 20210073 (2021).
Lu C. W., Yao C. T., & Hung C. M. Domestication obscures genomic estimates of population history. Molecular ecology, (2021).
Schmidt R. E. Zoogeography of the northern Appalachians. The zoogeography of North American freshwater fishes, 137–159 (1986).
Larson, G. & Schaetzl, R. Origin and evolution of the Great Lakes. J. Gt. Lakes Res. 27, 518–546 (2001).
Bernatchez, L. & Wilson, C. C. Comparative phylogeography of Nearctic and Palearctic fishes. Mol. Ecol. 7, 431–452 (1998).
Eshenroder, R. L., Breckenridge, A. J. & Jacobson, P. C. Reconciling zoogeography and genetics: Origins of deepwater Cisco Coregonus artedi (sensu lato) in the Great Lakes. Trans. Am. Fish. Soc. 153, 23–38 (2024).
Rougeux, C., Bernatchez, L. & Gagnaire, P.-A. Modeling the multiple facets of speciation-with-gene-flow toward inferring the divergence history of lake whitefish species pairs (Coregonus clupeaformis). Genome Biol. Evol. 9, 2057–2074 (2017).
Mérot C. et al. Genome assembly, structural variants, and genetic differentiation between lake whitefish young species pairs (Coregonus sp.) with long and short reads. Mol. Ecol. (2022).
Page, K. S., Scribner, K. T. & Burnham-Curtis, M. Genetic diversity of wild and hatchery lake trout populations: relevance for management and restoration in the Great Lakes. Trans. Am. Fish. Soc. 133, 674–691 (2004).
Goetz, F. et al. A genetic basis for the phenotypic differentiation between siscowet and lean lake trout (Salvelinus namaycush). Mol. Ecol. 19, 176–196 (2010).
Euclide, P. T., Jasonowicz, A., Sitar, S. P., Fischer, G. & Goetz, F. W. Further evidence from common garden rearing experiments of heritable traits separating lean and siscowet lake charr (Salvelinus namaycush) ecotypes. Mol. Ecol. 31, 3432–3450 (2022).
Ray, B. A. et al. Diet and Prey Selection by Lake Superior Lake Trout during Spring, 1986–2001. J. Gt. Lakes Res. 33, 104–113 (2007).
Lotka A. J. Elements of physical biology. Williams & Wilkins (1925).
Perrig, P. L., Fountain, E. D., Lambertucci, S. A. & Pauli, J. N. Demography of avian scavengers after Pleistocene megafaunal extinction. Sci. Rep. 9, 1–9 (2019).
Pontarp, M. Ecological opportunity and upward prey-predator radiation cascades. Sci. Rep. 10, 1–9 (2020).
Batchelor, C. L. et al. The configuration of Northern Hemisphere ice sheets through the Quaternary. Nat. Commun. 10, 1–10 (2019).
Otto-Bliesner, B. L. et al. How warm was the last interglacial? New model–data comparisons. Philos. Trans. R. Soc. A: Math., Phys. Eng. Sci. 371, 20130097 (2013).
Dalton, A. S. et al. Deglaciation of the north American ice sheet complex in calendar years based on a comprehensive database of chronological data: NADI-1. Quat. Sci. Rev. 321, 108345 (2023).
Losos, J. B. & Schluter, D. Analysis of an evolutionary species–area relationship. Nature 408, 847–850 (2000).
Seehausen, O. African cichlid fish: a model system in adaptive radiation research. Proc. R. Soc. B: Biol. Sci. 273, 1987–1998 (2006).
Marques, D. A., Meier, J. I. & Seehausen, O. A combinatorial view on speciation and adaptive radiation. Trends Ecol. Evol. 34, 531–544 (2019).
Hench, K., Helmkampf, M., McMillan, W. O. & Puebla, O. Rapid radiation in a highly diverse marine environment. Proc. Natl Acad. Sci. 119, e2020457119 (2022).
Lachance, H., Ackiss, A. S., Larson, W. A., Vinson, M. R. & Stockwell, J. D. Genomics reveals identity, phenology and population demographics of larval ciscoes (Coregonus artedi, C. hoyi, and C. kiyi) in the Apostle Islands, Lake Superior. J. Gt. Lakes Res. 47, 1849–1857 (2021).
Samonte, I. E. et al. Gene flow between species of Lake Victoria haplochromine fishes. Mol. Biol. Evol. 24, 2069–2080 (2007).
Kolmogorov, M., Yuan, J., Lin, Y. & Pevzner, P. A. Assembly of long, error-prone reads using repeat graphs. Nat. Biotechnol. 37, 540–546 (2019).
Li, H. & Durbin, R. Fast and accurate long-read alignment with Burrows–Wheeler transform. Bioinformatics 26, 589–595 (2010).
Walker, B. J. et al. Pilon: an integrated tool for comprehensive microbial variant detection and genome assembly improvement. PloS one 9, e112963 (2014).
Roach, M. J., Schmidt, S. A. & Borneman, A. R. Purge Haplotigs: allelic contig reassignment for third-gen diploid genome assemblies. BMC Bioinforma. 19, 1–10 (2018).
Wood, D. E., Lu, J. & Langmead, B. Improved metagenomic analysis with Kraken 2. Genome Biol. 20, 1–13 (2019).
Lien, S. et al. The Atlantic salmon genome provides insights into rediploidization. Nature 533, 200–205 (2016).
Durand, N. C. et al. Juicer provides a one-click system for analyzing loop-resolution Hi-C experiments. Cell Syst. 3, 95–98 (2016).
Dudchenko, O. et al. De novo assembly of the Aedes aegypti genome using Hi-C yields chromosome-length scaffolds. Science 356, 92–95 (2017).
Durand, N. C. et al. Juicebox provides a visualization system for Hi-C contact maps with unlimited zoom. Cell Syst. 3, 99–101 (2016).
Catchen, J., Amores, A. & Bassham, S. Chromonomer: a tool set for repairing and enhancing assembled genomes through integration of genetic maps and conserved synteny. G3: Genes Genomes Genet. 10, 4115–4128 (2020).
Seppey M., Manni M., Zdobnov E. M. BUSCO: assessing genome assembly and annotation completeness. In: Gene prediction. Springer (2019).
Flynn, J. M. et al. RepeatModeler2 for automated genomic discovery of transposable element families. Proc. Natl Acad. Sci. 117, 9451–9457 (2020).
Camacho, C. et al. BLAST+: architecture and applications. BMC Bioinforma. 10, 421 (2009).
UniProt: the universal protein knowledgebase in 2023. Nucleic Acids Res. 51, D523–D531 (2023).
Smit A., Hubley R., Green P. RepeatMasker Open-4.0. 2013–2015 (2015).
Storer, J., Hubley, R., Rosen, J., Wheeler, T. J. & Smit, A. F. The Dfam community resource of transposable element families, sequence models, and genome annotations. Mob. DNA 12, 1–14 (2021).
Danecek, P. et al. Twelve years of SAMtools and BCFtools. GigaScience 10, giab008 (2021).
Danecek, P. et al. The variant call format and VCFtools. Bioinformatics 27, 2156–2158 (2011).
Jombart, T. adegenet: a R package for the multivariate analysis of genetic markers. Bioinformatics 24, 1403–1405 (2008).
Minh, B. Q. et al. IQ-TREE 2: New Models and Efficient Methods for Phylogenetic Inference in the Genomic Era. Mol. Biol. Evolution 37, 1530–1534 (2020).
Wang, H.-C., Minh, B. Q., Susko, E. & Roger, A. J. Modeling Site Heterogeneity with Posterior Mean Site Frequency Profiles Accelerates Accurate Phylogenomic Estimation. Syst. Biol. 67, 216–235 (2017).
Hoang, D. T., Chernomor, O., von Haeseler, A., Minh, B. Q. & Vinh, L. S. UFBoot2: Improving the Ultrafast Bootstrap Approximation. Mol. Biol. Evolution 35, 518–522 (2017).
De Boer, B., Lourens, L. J. & Van De Wal, R. S. Persistent 400,000-year variability of Antarctic ice volume and the carbon cycle is revealed throughout the Plio-Pleistocene. Nat. Commun. 5, 1–8 (2014).
Backenstose N. J. C. et al. Data-Origin of the Laurentian Great Lakes fish fauna through upward adaptive radiation cascade prior to the Last Glacial Maximum. Dryad https://doi.org/10.5061/dryad.n02v6wx59 (2024).
Acknowledgements
We thank the U.S. Geological Survey Research Vessel Kiyi Captain Joe Walters, First Mate Keith Peterson, and Engineer Charles Carrier. We would also like to thank the Red Cliff Band of the Lake Superior Chippewa Nation and members of OMNDMNRF (Anders Nyman, David Niskanen, and Robyn Avis, Eric Berglund and David Montgomery) for providing samples. This study was supported by the Great Lakes Fishery Commission (Awards #2018-KRA-44073 and #2021-KRA-440960 to T.J.K) and the University at Buffalo College of Arts and Sciences (Dean’s Fellowship awarded to N.J.C.B.). Any use of trade, product, or firm names is for descriptive purposes only and does not imply endorsement by the U.S. Government. All sampling and handling of fish were carried out in accordance with guidelines for the care and use of fishes by the American Fisheries Society (Jenkins et al., 2014). We would also like to thank Joseph R. Tomelleri for giving permission to use his artwork in the manuscript.
Author information
Contributions
T.J.K., N.J.C.B., D.L.Y., A.S.A., W.S., V.A.A., and L.B. conceived the study. D.L.Y. and W.S. provided samples. N.J.C.B., D.J.M., C.A.O., M.A.B., D.L.Y., E.N., and T.J.K. generated data and conducted analyses. N.J.C.B. and E.N. assembled and scaffolded the Coregonus artedi genome. E.K.T. provided context and interpretation of geological data. N.J.C.B. and T.J.K. wrote the initial draft, and all co-authors edited the manuscript.
Ethics declarations
Competing interests
The authors declare no competing interests.
Peer review
Peer review information
Communications Biology thanks Daniel Estévez-Barcia and the other, anonymous, reviewer(s) for their contribution to the peer review of this work. Primary Handling Editor: Christina Karlsson Rosenthal. A peer review file is available.
Additional information
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Backenstose, N.J.C., MacGuigan, D.J., Osborne, C.A. et al. Origin of the Laurentian Great Lakes fish fauna through upward adaptive radiation cascade prior to the Last Glacial Maximum. Commun Biol 7, 978 (2024). https://doi.org/10.1038/s42003-024-06503-z