Balancing selection maintains intraspecific diversity in a deep-sea fish

Hoelzel, A. Rus; Garza, John Carlos; Clemento, Anthony; Gkafas, Georgios A.; Steeds, Natasha; Gaither, Michelle; Peachment, Harry; Regnier, Thomas; Gibb, Fiona

doi:10.1038/s41437-025-00813-6

Article
Open access
Published: 27 November 2025

Heredity volume 135, pages 13–22 (2026)

Abstract

Segregating alleles in natural populations can be driven to fixation or loss by genetic drift or directional selection, or may be maintained in a polymorphic state by balancing selection. Balancing selection in a panmictic population is theoretically well established, but not widely understood at the molecular level. In this study, we focus on the evolutionary processes affecting non-synonymous variants at eight functionally relevant loci (based on candidate SNP genotyping) in a deep-sea fish species (Coryphaenoides rupestris) that lives across habitat zones ranging from ~200 m to ~2000 m depth. At each of these loci, one allele is predominant in the deeper water. Across a shallower depth range, we find that minor allele frequencies show a highly significant increase or decline progressively across five defined age categories. At single depths below a threshold depth, the deep-water allele declines in frequency with age. Together, these data indicate segregation to different depths, either shallow or deep, and balancing selection to retain variants needed for each depth range. This is supported by signals for long-term balancing selection at these loci (based on published genomic data). We discuss alternative interpretations and conclude that balancing selection maintaining ecotype diversity is the best supported mechanism.

Introduction

Genetic diversity is the raw material that allows species to adapt to a changing environment (see Hoelzel et al. 2019; Des Roches et al. 2021). Intraspecific differentiation among populations can be maintained by genetic drift and selection when populations are isolated by geographic distance or some environmental feature that restricts gene flow. When populations are parapatric or sympatric and maintain differentiation, there is often post-zygotic isolation (e.g. following allopatric isolation and re-convergence) or some factor that promotes assortative mating (e.g. Class and Dingemanse 2022). The latter can be associated with phenology, as in the classic examples of Anthoxanthum odoratum in contaminated or uncontaminated soil (differential flowering times; Antonovics 2006) and the hawthorn fly (Rhagoletis pomonella) parasitizing hawthorn berries or apples (different host life cycle; Feder et al. 1988). However, it is also possible for selection to maintain alternative forms, strategies, behaviours and genotypes by balancing or disruptive selection (see van Rijssel et al. 2018; Bitarello et al. 2023). Balancing selection maintains genetic variability in populations by various mechanisms, including frequency dependent selection and heterozygote advantage (Fisher 1923; Charlesworth 2006; Zeng et al. 2021).

For example, polymorphism sustained by balancing selection has been suggested previously for species in the family Centrarchidae (sunfish). The male bluegill sunfish (Lepomis macrochirus) can adopt a form that mimics females to achieve stealthy matings and reproductive success, but a given male either adopts this strategy or retains the typical male phenotype, not both, which suggests some type of frequency-dependent selection (Dominey 1980). Both the bluegill sunfish (Ehlinger and Wilson 1988) and the pumpkinseed sunfish (Lepomis gibbosus; Parsons and Robinson 2007) also show morphological and behavioural specialisations associated with foraging in different habitats in the same lake. However, we know of no genomic studies for these species that attempt to identify the relevant loci underlying the variation in these traits. The mechanisms are better understood in some lepidopteran systems, for example involving balancing selection at the cortex locus controlling wing patterning in butterflies (e.g. Nadeau et al. 2016; Van’t Hof et al. 2016; VanKuren et al. 2019; Wang et al. 2022). In general, while there are various historical examples of balancing selection at the molecular level, such as major histocompatibility loci and sickle cell anaemia (see Hendrick 2007), it was not thought to be common until recently, with the further development of genomics (e.g. Bitarello et al. 2023). At the genomic level, balancing selection affects not just the relevant functional mutation, but also neutral sites in the flanking regions in linkage disequilibrium, maintaining diversity at these sites as well (see Charlesworth 2006). The duration over which balancing selection is maintained is often a consequence of the type of selection involved (e.g. frequency dependence may persist longer than heterosis driven by ephemeral pathogen exposure), and this affects the footprint of balancing selection that can be detected in the genome (Charlesworth 2006).

The deep-sea environment is partitioned with depth by environmental gradients and boundaries associated with factors such as light penetration, hydrostatic pressure, circulation patterns (e.g. Godin et al. 2024) and biological community (known to drive evolutionary change; Gaither et al. 2016). Some species live across a broad depth range, including the roundnose grenadier, Coryphaenoides rupestris (Macrouridae), which is found between ~200 and 2000 m in depth (Cohen et al. 1990). Juveniles are thought to be largely pelagic in mid water, and later segregate by depth as adults (Bergstad 1990; Bergstad and Gordon 1994). The species is listed as critically endangered by the IUCN, and Delaval et al. (2018) have found low levels of population genetic structure. Studies on adult C. rupestris feeding behaviour show both benthic and pelagic prey, suggesting vertical migration (e.g. Bergstad et al. 2010). Diel vertical migrations (DVM) are common in the oceans, where some species migrate to shallower depths at night to feed and return to deeper water in the day to avoid predation (see Bandara et al. 2021). For example, this behaviour has been documented in the deep sea for the blackspot seabream (Pagellus bogaraveo) based on active and passive acoustic telemetry (Afonso et al. 2014). However, a study concluded for C. rupestris that ‘although the importance of pelagic prey was found, a diel vertical migration pattern could not be confirmed’ (Høie 2017). It is possible that pelagic prey are taken at the deeper range of the prey’s diel migrations. Both the blackspot seabream (Afonso et al. 2014) and another deep-sea species (Hexanchus griseus; Coffey et al. 2020) differ from C. rupestris in that they were found only in shallower waters (down to about 700 m). The latter species also showed very consistent habitat depth during each daytime and nighttime period, suggesting that daytime sampling may give consistent results for this and other deep-water species. 
It is of course also possible that the inference would not extend across species.

A study based on 60 whole genome sequences from C. rupestris samples collected along the habitat depth gradient (Gaither et al. 2018) found that individuals captured at depths of 1800 m or greater had fixed genetic differences compared to those from shallower depths at a set of loci associated especially with membranes, morphogenesis and muscle function. Gaither et al. (2018) found that at 750 m to 1500 m depth, genotypes at the identified loci were variable, but all were fixed for a particular allele at 1800 m, implying that the transition was somewhere between 1500 m and 1800 m. The Gaither et al. (2018) study provided the basis for an analysis of those specific loci found to correlate strongly with habitat depth. At least during the daytime periods when samples were collected, the fish in the deepest water habitat appear to remain there based on consistent genotypes. Fidelity to a particular depth range was also suggested from extensive catch per unit effort data (Gaither et al. 2018). Fish captured at different depths also had significant phenotypic differences, consistent with ‘ecotype’ differentiation (Steeds et al. 2024).

Our objective in the current study was to discover the mechanism sustaining ecotype diversity, given that Gaither et al. (2018) showed no differentiation between shallow and deep water ecotypes at neutral genetic markers (and so unlikely to be incipient speciation). To provide the power to test our hypotheses, we increased the sample size per habitat depth (from an average of 15 to an average of 36), the number of depth zones sampled (from 4 to 8) and included age estimates on all 290 newly genotyped fish. Including age estimates allows us to test the hypothesis that the frequency of alleles at loci associated with habitat depth varies with age class, which could be consistent with a pattern promoted by balancing selection retaining multiple phenotypes. If individuals with different genotypes at these loci segregate to depth habitats that suit their genotypes, then any increased mortality risk may only be associated with habitat availability or suitability (which could lead to frequency dependence). An unstable equilibrium (see Prout 1968; Zeng et al. 2021) may be expected if the individuals are mobile among habitats, and not excluded from one habitat or the other, only more successful in the ‘best fit’ habitat. If this was sustained over time, there should be evidence for long-term balancing selection. We sampled individually aged fish across a broad depth range and genotyped known non-synonymous polymorphisms associated with habitat depth to investigate the potential mechanisms for the retention of ecotype diversity in this species.

Materials and methods

Specimens of Coryphaenoides rupestris were collected from eight depths ranging from 750 m to 1900 m (Table S1). The species is widely distributed along the North Atlantic shelf margins and at the mid-Atlantic Ridge (see Gordon et al. 1992; Priede et al. 2013). Samples were obtained during trawling surveys on the west of the Scotland shelf edge into the Rockall trough over a small geographic range and short period of time (5–14 September 2015). Samples of muscle tissue or fin clips (N = 290) were collected as soon after capture as possible and preserved in 20% dimethyl sulfoxide saturated with salt or 95% ethanol before transfer to long-term cold storage at −20 °C. Details of specific sample sets are provided in Table S1. Isolation of DNA was done with a standard phenol/chloroform method. Otoliths (calcified structures from the fish inner ear) were extracted, and the left sagittal otolith was used for age estimation. The otolith was embedded in resin blocks and thinly sectioned transversely through the core for inspection using a binocular microscope. Ages of deep-sea fish are notoriously difficult to determine from the otolith alone, due to long lifespan and periods of slow growth typical of deep-sea fish, so we estimated age by three different methods: (1) independent increment estimates based on otolith ring counts made by three different experienced age readers, (2) pre-anal fin length (PAFL) measurements, given that age estimates showed a strong correlation with the PAFL with a slope close to 1 (Figs. S1, S2) and (3) otolith weights, with the weight range corrected so that it was comparable to the age range categories (Fig. S3). We then took the average to assign the final age estimates. Age estimate variation in comparison with the average is shown in Fig. S4. 
The average age estimates were further divided into five age classes, 0–5 (N = 9), 5–10 (N = 13), 10–15 (N = 30), 15–20 (N = 43), and >20 (N = 39) when a subsample from 750 to 1500 m depth was used (see below). To balance sample size per category, the categories were changed to 5–10 (N = 13), 10–15 (N = 35), 15–19 (N = 47), 20–24 (N = 48) and >25 (N = 13) when the deeper depth ranges (1600 m, 1700–1900 m) were analysed, and >20 used for 1600 m as a fourth category (sample size insufficient for five categories; 5–10: N = 7, 10–15: N = 19, 15–20: N = 10, >20: N = 21).

The loci investigated here were chosen based on the study by Gaither et al. (2018), who sequenced 60 genomes of C. rupestris across a depth gradient (at 750 m, 1000 m, 1500 m and 1800 m) from the same geographic location and during a short period of time (within 2 km and over 2 days, from a region west of the Hebrides). Genome-wide association analysis found strong outlier regions associated with habitat depth including non-synonymous changes in relevant coding loci. We chose 10 SNPs from eight depth-associated loci, and an additional 10 SNPs from three loci that showed no association with depth as controls (Table 1). The eight depth-associated loci represented the eight loci showing the strongest signal from the Manhattan plot analysis in the original study, and included five of the loci discussed in detail in Gaither et al. (2018) together with three additional loci. The choice was governed in part by loci for which primers could be designed to amplify reliably and multiplex well. The three control loci were chosen to be out of linkage disequilibrium with the depth-associated loci, and again determined in part by the logistics of the multiplex protocol. Note that SNPs within a given locus may be affected by linkage disequilibrium, though there is still the potential for differential selection at these sites.

Table 1 Statistical tests assessing genotype frequency differences by depth (less than 1700 m compared to 1700 m and deeper) and allele frequencies at 1250 m comparing 14 years or younger to older than 14 years.

To obtain genotype data, sequences from the target loci containing SNP variation associated with depth and the controls were extracted from the 60 genome sequences of Gaither et al. (2018). Consensus sequences were compiled using Sequencher v5.1.1 (Gene Codes Corp.) and target SNP variation identified. Primer3 v4.1.0 was used with default settings and a target length of 90–143 bp to design primers flanking the target SNPs. The assays were then validated by sequencing the loci using the GT-seq (genotyping in 1000s by sequencing; Campbell et al. 2015) method, with the modifications described by Baetscher et al. (2018).

Initially, a test sample of 96 individuals was run on a MiSeq (Illumina Inc.) with a 2 × 75 bp paired-end sequencing protocol. This first run was used to assess the relative read-depths among loci, resulting in the dilution of some loci in the GT-seq primer multiplex. We then sequenced these loci in a total of 290 samples using the same protocol. These same 290 individuals were aged from their otoliths. Raw reads were automatically de-multiplexed by the MiSeq (Illumina Inc.) Analysis Software using the individual-specific index barcodes. Paired-end reads were combined using FLASH (min overlap of 4 and max overlap of 50; Magoč and Salzberg 2011). Merged reads were then mapped to the compiled consensus sequences for the target loci using BWA-MEM v0.7.17-r1188 (Li and Durbin 2009). Mapped reads were converted from Sequence Alignment/Map (SAM) files to BAM files with SAMtools v1.13 (Li et al. 2009). Variable sites were identified using FreeBayes v1.3.6 (--haplotype-length 0 -kwVa --no-mnps --no-complex; Garrison and Marth 2012); the positions of all SNPs for each locus were recorded in a VCF file. Raw genotypes were extracted directly from the VCF using vcfR v1.15.0 (Knaus and Grunwald 2017).

Allele frequencies were calculated for each locus in each age and depth category to investigate trends with age for different depths. Significance of changes in allele frequency and tests against Hardy-Weinberg expectations were evaluated using Chi-squared tests, and corrections for multiple testing used the Bonferroni method. Trends were assessed by linear regression. Power was high for the Chi-squared tests using the full dataset assuming an effect size of 0.5 and an alpha of 0.05 (test power = 1.0). For comparisons at a single depth by age using the Fisher exact test (14 or younger: N = up to 20, older than 14: N = up to 21; see Table 1) the power was greater than 0.70.
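The Hardy-Weinberg comparison and Bonferroni threshold described above can be sketched as follows. This is a minimal illustration and not the authors' code: the function names and genotype counts are hypothetical, and the choice of 20 tests (one per genotyped SNP) is an assumption about how the correction was applied.

```python
import math


def hwe_chi_square(n_aa, n_ab, n_bb):
    """Chi-squared test of genotype counts against Hardy-Weinberg expectations.

    Returns (chi2, p_value) for 1 degree of freedom
    (3 genotype classes - 1 - 1 estimated allele frequency).
    """
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1.0 - p
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    # Upper tail of the chi-squared distribution with 1 df: erfc(sqrt(x / 2))
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold under the Bonferroni correction."""
    return alpha / n_tests


# Hypothetical genotype counts (AA, AB, BB) for one SNP; not data from the study.
chi2, p_value = hwe_chi_square(40, 20, 40)
threshold = bonferroni_threshold(0.05, 20)  # assumed: 20 SNPs tested
print(chi2, p_value, p_value < threshold)
```

With small expected counts an exact test would be preferable to the chi-squared approximation used in this sketch.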

To investigate evidence for long-term balancing selection, we used the 60 sequenced genomes (30 below 1500 m and 30 above 1500 m) from Gaither et al. (2018). These sequences provided extended sequence data around the relevant SNPs, but the individuals were not aged and so these samples were only used for the long-term balancing selection analyses. We used three methods, BetaScan2 (Siewert and Voight 2020), Tajima’s D (Tajima 1989) and MLHKA (Maximum Likelihood Hudson-Kreitman-Aguadé test; Wright and Charlesworth 2004). Each of these methods uses the statistical analysis of polymorphism data to detect balancing selection. They differ in that BetaScan2 uses both polymorphism and substitution data to detect balancing selection, Tajima’s D compares the number of segregating sites to the average number of nucleotide differences, and MLHKA compares polymorphism within to divergence between species. The VCF file was first filtered to remove sites that were fixed, with option mac set at 1 using VCFtools (Danecek et al. 2011). For BetaScan2, the VCF file was converted to allele count file format and then to betascan format using glactools (Renaud 2018). We calculated the Beta 1 score for each contig separately through a sliding window of 1 Kb. Tajima’s D values were calculated in VCFtools (Danecek et al. 2011) also using a 1 Kb window size. Outlier values for Beta 1 were considered those in the upper 5% of Betascan1* scores, while Tajima’s D outliers were those above the 95th percentile (after Grace et al. 2021). The significance of the Betascan1* score and Tajima’s D values in that context was assessed using Z-scores. For each position, a Z-score was calculated as Z = (x − μ)/σ, where x is the observed value, μ is the mean value and σ is the standard deviation. Z-scores with a value greater than 7.72 and 2.42 (two-tailed p < 0.05) for Beta scores and Tajima’s D, respectively, were considered significant. We also tested for strong Tajima’s D values around the SNP of interest using a t-test in a 5 Kb window (2.5 Kb before and after the SNP) for both control and depth associated loci.
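To make the Tajima’s D comparison concrete, the statistic for a single window can be sketched as below, following the standard formulation of Tajima (1989). This illustrates the statistic only; it is not the VCFtools implementation used in the study, and the function name, its inputs and the toy allele counts are assumptions for illustration.

```python
import math


def tajimas_d(allele_counts, n):
    """Tajima's D for one window.

    allele_counts: for each site, the number of sampled chromosomes carrying
                   one of the two alleles (sites with 0 or n are ignored).
    n:             number of sampled chromosomes.
    Returns None when the statistic is undefined.
    """
    seg = [k for k in allele_counts if 0 < k < n]
    s = len(seg)
    if s == 0:
        return None
    # Average number of pairwise differences, summed over segregating sites
    pi = sum(2.0 * k * (n - k) / (n * (n - 1)) for k in seg)
    a1 = sum(1.0 / i for i in range(1, n))
    a2 = sum(1.0 / (i * i) for i in range(1, n))
    b1 = (n + 1) / (3.0 * (n - 1))
    b2 = 2.0 * (n * n + n + 3) / (9.0 * n * (n - 1))
    c1 = b1 - 1.0 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / (a1 * a1)
    e1 = c1 / a1
    e2 = c2 / (a1 * a1 + a2)
    variance = e1 * s + e2 * s * (s - 1)
    if variance <= 0:
        return None
    # Difference between pi and Watterson's estimator (S / a1), standardised
    return (pi - s / a1) / math.sqrt(variance)


# Toy windows with 10 chromosomes and 10 segregating sites (hypothetical data).
print(tajimas_d([5] * 10, 10))  # intermediate-frequency alleles give D > 0
print(tajimas_d([1] * 10, 10))  # an excess of rare alleles gives D < 0
```

Positive values indicate an excess of intermediate-frequency variants, which is the pattern expected around a site under long-term balancing selection.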

For the Hudson-Kreitman-Aguadé (HKA) test we used the C. brevibarbis genome (Gaither et al. in prep.) for comparisons among species. We used PSMC software (Liu and Hansen 2016) to calculate the time of divergence between C. rupestris and C. brevibarbis. We trialled different lengths of MCMC chains (100,000–1,000,000) and chose 300,000 (ML values for 100 K: −93.8255, 200 K: −91.2247, 300 K: −88.9939, 500 K: −91.5076, 750 K: −92.5436, 1 M: −93.8255). Segregating sites, pairwise differences, and theta were calculated in DnaSP v.6.0 (Rozas et al. 2017). We set up a model with the depth associated loci under selection and the control loci evolving neutrally. We compared against a model where all loci were considered to be under neutrality. Significance was assessed by likelihood ratio test between the two models using the chi-square statistic (df = number of loci under selection).

Results

We compared the proportion of homozygous genotypes at seven depth categories (combining 750 m and 1000 m) and found an inflection point after 1600 m (average homozygosity change from 750 to 1600 m is 0.527, the average from 1600 to 1900 m is 0.068; Fig. S5). We therefore compared genotype frequencies across two depth zones (750–1600 m and 1700–1900 m) for all loci (Table 1). The division at 1600 m is based on empirical data but is also consistent with theoretical expectations drawn from species distributions in deep water (e.g. many species become intolerant to hydrostatic pressure between 1000 m and 2000 m; see Brown and Thatje 2014). The difference in genotype frequencies between depth zones was highly significant for all of the loci previously identified by Gaither et al. (2018) as associated with depth. In contrast, there were no significant differences between zones at any of the three control loci, confirming their utility for this comparison (Table 1). These differences are illustrated in Fig. 1 for each of the SNPs found to be associated with habitat depth (including three SNPs in OBSL1). Both alleles are segregating for both depth zones at each locus, but one allele clearly dominates in deeper water. The significance of these differences is shown in Table 1. Note that sample size varies somewhat among loci due to differential success with amplification and sequencing. For these larger sample sizes compared to Gaither et al. (2018) the ‘depth’ allele is no longer fixed in the deepest habitats, but it is still clearly the most common allele at depth.

**Fig. 1: Depth associated allele frequencies.**

We then considered the relationship between genotype and age, comparing individuals from the eight depth categories (from 750 to 1900 m) with their estimated age. For our sample set there was no clear linear trend between the age of fish captured and depth (R² = 0.053), though the youngest fish (less than age 5) were found only in shallower water (Fig. S6), as reported in Gaither et al. (2018). However, in making comparisons with genotype, we controlled for depth due to the clear association between genotype and depth extremes (Table 1, Fig. 1). When we restricted the depth range to only 1250 m (reflecting a relatively large and broad representation across ages) and compared fish older vs younger than 14 years, all but one depth-associated locus showed significant differentiation (p < 0.05) beyond the Bonferroni threshold (see Table 1). The older fish had the lower frequency of the ‘depth’ allele at each locus (allele most common in water deeper than 1600 m), such that genotypes were becoming a better match for the ‘shallow’ water habitat with age (Table 1). We then extended the depth range to 750–1500 m (for sufficient sample size per age class) and tracked the minor allele frequency (MAF) among each of five age classes (Fig. 2). In some cases, MAF tracked downward with increasing age (R² = 0.79; F = 80.98, p < 0.0001; Fig. 2a) and some tracked upward (R² = 0.63; F = 39.1, p < 0.0001; Fig. 2b). None of the control SNPs showed a significant pattern, either at 1250 m (Table 1) or across the five age classes (R² = 0.08; F = 0.61; p = 0.43; Fig. 2c). The lack of significant regression in the controls is not due to a mix of some loci increasing and some decreasing (loci nominally increasing: F = 2.47, p = 0.127; decreasing: F = 1.37, p = 0.257). We also assessed frequency variation of the depth allele across the five age classes for fish at 750–1500 m, 1600 m and for the range 1700–1900 m (Fig. 3). At 750–1500 m (F = 64.5, p < 0.0001; Fig. 3a) and 1600 m (F = 23.27; p < 0.0001; Fig. 3b), the depth allele decreased with increasing age class for all 10 SNPs that had been associated with habitat depth variation. In the deepest water (1700–1900 m), the depth allele frequency was high and relatively stable across all ages (F = 3.2; p = 0.08; Fig. 3c).
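The R² and F values reported for these trends come from ordinary least-squares regression of allele frequency on age class. A minimal sketch of that calculation is given below; the function name and the five frequencies are hypothetical and are not values from the study.

```python
import math


def linear_regression(x, y):
    """Ordinary least-squares fit of y on x.

    Returns (slope, intercept, r_squared, f_statistic), where the F statistic
    has 1 and len(x) - 2 degrees of freedom.
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_total = sum((yi - mean_y) ** 2 for yi in y)
    ss_resid = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    r_squared = 1.0 - ss_resid / ss_total
    if ss_resid == 0:
        f_statistic = math.inf  # perfect fit
    else:
        f_statistic = (ss_total - ss_resid) / (ss_resid / (n - 2))
    return slope, intercept, r_squared, f_statistic


# Hypothetical minor allele frequencies for five age classes coded 1 to 5.
age_class = [1, 2, 3, 4, 5]
maf = [0.40, 0.35, 0.33, 0.25, 0.22]
slope, intercept, r_squared, f_statistic = linear_regression(age_class, maf)
print(slope, r_squared, f_statistic)
```

A negative slope corresponds to the downward-tracking pattern and a positive slope to the upward-tracking pattern.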

**Fig. 2: Allele frequencies associated with age.**

**Fig. 3: Depth allele frequency trends.**

We tested for potential deviations from the Hardy-Weinberg equilibrium (HWE) either side of the putative depth threshold separating the two groups (750–1600 m compared to 1700–1900 m). There were no significant deviations from HWE expectations within these depth ranges for any loci (Table S2). However, when all samples were included there were significant deviations (heterozygote deficiency) for four of the loci. This was most likely associated with a Wahlund effect, due to the strong allele frequency difference between the two depth ranges at those loci. In Gaither et al. (2018), loci showing an association with depth based on Manhattan plot analyses demonstrated elevated linkage disequilibrium (R²) compared to neutral loci, suggesting that they evolve as a haplotype. The strongest linkage disequilibrium signal was for the non-synonymous SNPs used in this study. Here we assessed how often, for a given locus, a genotype homozygous for either the depth or the alternative allele was also homozygous for the corresponding allele at the other loci. This happened 82.4% of the time, and was significantly more frequent than expected by chance (χ² = 1088, p < 0.00001), consistent with the loci evolving together.
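The Wahlund effect invoked here can be illustrated numerically: pooling two groups that are each in HWE but differ in allele frequency produces fewer heterozygotes than expected from the pooled allele frequency. The sketch below uses hypothetical sample sizes and allele frequencies, not values from the study.

```python
def pooled_heterozygosity(subpops):
    """Observed and expected heterozygosity after pooling subpopulations.

    subpops: list of (sample_size, allele_frequency) pairs, each assumed to be
             in Hardy-Weinberg equilibrium internally.
    Returns (observed, expected), where expected is computed from the pooled
    allele frequency as if the pooled sample were a single random-mating unit.
    """
    total = sum(n for n, _ in subpops)
    observed = sum(n * 2.0 * p * (1.0 - p) for n, p in subpops) / total
    p_pooled = sum(n * p for n, p in subpops) / total
    expected = 2.0 * p_pooled * (1.0 - p_pooled)
    return observed, expected


# Hypothetical 'shallow' and 'deep' groups with different depth-allele frequencies.
observed, expected = pooled_heterozygosity([(100, 0.3), (100, 0.9)])
print(observed, expected)  # observed is lower than expected
```

For two equally sized groups the deficit equals twice the variance in allele frequency between them, so it vanishes when the groups share the same frequency.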

Evidence for long-term balancing selection is presented in Table 2 and Fig. 4. Beta 1 scores ranged from −5.22 to 18.44. The top 5% of scores were above 7.74. All control loci had score values well below the threshold, while all loci associated with depth were elevated or above 7.74, with the exception of adgalt2 (Fig. 4). Tajima’s D showed a similar pattern for the 1 Kb window analysis (Fig. 4). Tajima’s D values around the SNP of interest showed significance for all depth associated loci, but none of the control loci (Table 2). For MLHKA the log-likelihood value was −107.525 for the neutral model and −88.9939 for a model of 8 loci under selection. To test for significance, we used the likelihood ratio test (twice the difference in log likelihood between the models) and the chi-squared distribution. The ratio was significant (χ² = 37.06, p < 0.001, df = 8), indicating that the eight depth associated loci fit a balancing selection model better than the neutral model.
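The likelihood ratio calculation can be checked with a short sketch. It assumes the neutral-model log likelihood is −107.525, the reading under which the reported χ² = 37.06 is reproduced; the function names are illustrative, and the closed-form tail probability applies only to even degrees of freedom.

```python
import math


def chi2_sf_even_df(x, df):
    """Upper-tail probability of a chi-squared variable with even df.

    Uses the closed form exp(-x/2) * sum_{i=0}^{df/2 - 1} (x/2)^i / i!.
    """
    if df <= 0 or df % 2 != 0:
        raise ValueError("closed form requires a positive, even df")
    half = x / 2.0
    term = 1.0
    total = 1.0
    for i in range(1, df // 2):
        term *= half / i
        total += term
    return math.exp(-half) * total


def likelihood_ratio_test(lnl_null, lnl_alt, df):
    """Return (statistic, p_value) for nested models compared by log likelihood."""
    statistic = 2.0 * (lnl_alt - lnl_null)
    return statistic, chi2_sf_even_df(statistic, df)


# Log likelihoods as read from the text; df = 8 loci under selection.
statistic, p_value = likelihood_ratio_test(-107.525, -88.9939, 8)
print(round(statistic, 2), p_value)
```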

**Fig. 4: Evidence for balancing selection.**

Table 2 Results for Tajima’s D test using 5 Kb window around relevant SNP (data from Gaither et al. 2018). Loci associated with depth highlighted in yellow.

Discussion

The non-synonymous variants identified in Gaither et al. (2018) again showed a strong correlation between habitat depth and allele frequency in the deep-sea fish, C. rupestris. All of the loci identified earlier showed a highly significant pattern, while none of the control loci did (Table 1). Based on our larger sample size and broader sampling range, we found that the transition between selection regimes seems to occur between 1600 m and 1700 m depth (Fig. S5), and while one allele is most common in deeper water (1700–1900 m), both alleles were present throughout the depth range. The loci that showed a strong association with depth also showed allele frequency variation correlated with age when controlling for depth (Figs. 2, 3 and Table 1). None of the control loci showed either pattern or association. There was no clear correlation between age and depth for adult fish (Fig. S6), instead a broad age range was found at all depths.

The loci that correlate with both depth and age are associated with membranes, development and muscle contraction. All of these functions are potentially associated with adaptive needs at different depths. For example, ROCK1 is understood to promote the formation of migrosomes (vesicles generated during cell migration), which are essential for embryonic organ development during morphogenesis in zebrafish (Jiang et al. 2019). Steeds et al. (2024) show that C. rupestris and three other species with similar depth distributions have morphological variation associated with different depths (especially associated with gape size, maximum width, swim bladder weight, and body elongation in C. rupestris). This is consistent with a large proportion of the loci putatively under selection associated with habitat depth in C. rupestris being involved in development or morphogenesis (Gaither et al. 2018). ROCK1 is also within a genomic region correlated with adult migration timing to freshwater in Pacific salmon species (Thompson et al. 2020; Willis et al. 2020). However, this association is shared with the neighbouring locus, GREB1L, an oestrogen-responsive gene, and structural variation in the intergenic region between them. It is not yet clear what role ROCK1 plays in migration timing, if any (as opposed to the association being due to linkage disequilibrium with the intergenic region or GREB1L). EGRF1 also plays an important role in development and has been implicated in the growth of tumours (Fromm et al. 2008). EGRF1 showed a particularly strong association with depth (Fig. 1). Other depth-associated loci are involved in membrane function (Table S3), including CAC1E which provides voltage regulated calcium channels involved in functions such as muscle contraction. OBSL1 also plays a role in muscle contraction (Geisler et al. 2007). Information on the function of all eight loci is provided in Table S3.

Figure S6 shows that there is no clear association between habitat depth and fish age, except for a lack of fish younger than 5 years in water deeper than 1250 m. Fish older than 5 years were found across all depths (Fig. S6). Although individual fish segregate to a particular depth as adults, the specific depth they segregate to is not associated with their age. This means that depth segregation by age cannot fully explain the progressive increase or decrease in MAF, especially when measured at a restricted depth or depth range. The broadest age range was found at 1250 m (Table 2, Fig. S6), which permitted a comparison between young and old fish at a single depth, showing significant differentiation between young and old fish at that depth (Table 1). When the depth range was restricted to 750–1500 m, there was a consistent decrease or increase in MAF (which could be either the depth or shallow related allele) with increasing age class for all depth-related loci (Fig. 2). This could happen in an unstable equilibrium if both ecotypes benefit from segregating to an appropriate habitat depth, but the capacity of each habitat varies over time or individuals do not show strict fidelity to a given depth. The fact that the frequency of the depth allele decreased with age at all depths below a threshold depth of 1700 m could be interpreted in several ways. It could suggest that fish with the depth allele migrate to deeper water increasingly as they age. However, in that case we would expect to see the depth allele increasing with age in the deeper water. Instead, there is a non-significant trend for it to decrease with age there, and all ages show high frequency for the depth allele at 1700–1900 m. It is possible that further sampling in deeper water would show the depth allele increasing, though it is more likely that the alleles are near or at their upper limit (fixation) at that depth. Alternatively, older fish in shallower water may die younger if they have the depth allele. This may be unexpected in a highly mobile species such as C. rupestris that can adjust to different habitat depths; however, a system involving antagonistic pleiotropy is possible. Under this scenario the depth allele would be beneficial or neutral in shallower water when the fish is young, and detrimental in shallower water when they are old. It would seem to be beneficial at all ages in the deepest water.

Allele frequency could also change with age in a sample if there is a sampling bias, which may be associated with different ages being sampled at different depths, or with diel patterns of movement. Neither is supported by the data for this study. DVM could affect the observed pattern (if there was movement during the day) but that would be more likely to disrupt rather than create the clear associations between genotype, age and depth. Further, there was a significant correlation between allele frequency and age class when three different controls for depth were applied (Table 1, Figs. 2, 3), and fish at a given depth were sampled at various times of day. Another potentially relevant factor is that the proportion of large fish (and therefore old fish) was seen to increase as a result of decreased fishing pressure in the Rockall Trough, particularly in the shallower depths (Mindel et al. 2018). Since allele frequencies at these loci varied with age, this raises the possibility that fishing could have impacted allele frequencies. Although close linkage (SNPs within the same locus) may be expected to show correlated patterns of change with age, this was not always the case (see Fig. 2), which could be due to either drift or selection.

Although our measurements of the pattern of allele frequency across habitat depths may have been distorted by these factors, the fact that two alleles are retained in the breeding population at each of these loci suggests that some type of selection is promoting their retention. A progressive change in allele frequency with age in this long-lived species may suggest balancing selection by frequency dependence. In a recent study based on the very large Biobank sample of human genomes (276,406 participants from the UK), Long and Zhang (2023) tested the hypothesis that pleiotropic mutations that promote reproduction but cause aging in humans are favoured by natural selection. They compared age cohorts born between 1940 and 1965 and found alleles at relevant loci with the expected antagonistic pleiotropy characteristics (improved reproduction but shorter lifespan) that showed a positive association with cohort age. For example, the T allele at chromosome position 6p25.3, associated with expression at IRF4, was ‘associated with a younger age at first sexual intercourse and an increased risk of mortality and shows a rise in the frequency of the T allele over 25 years’ (Long and Zhang 2023).

Our data suggest balancing selection retaining diversity, but probably not by overdominance, which might be expected to lead to an equilibrium with stable allele frequencies. The observed dynamic allele frequencies, trending both up and down, also seem incompatible with underdominance (disruptive selection). The data instead seem most consistent with balancing selection and an unstable equilibrium, based on occupation of heterogeneous habitat and the differential segregation of individuals with different genotypes. The potential for free movement across habitats, together with improved reproduction in the best-suited habitat, could explain the dynamics. Each allele could be favoured in the right environment, and so the ‘depth’ allele may increase or decrease, depending on the relative frequency of ecotypes and habitat availability. If individuals were instead restricted to a specific depth range, then the allele frequency changes (see Figs. 2 and 3) would suggest early mortality for genotypes in the ‘wrong’ habitat, and the degree of change may suggest a strong effect. However, our precision in detecting allele frequency may be impacted by noise generated from individual movements among habitats and the chance composition of catches at a given depth. Therefore, the direction of the trend (up or down) may be more informative than the exact allele frequencies. The fact that allele frequencies trend up or down is most consistent with balancing selection associated with segregation among habitats, and differential reproductive success rather than high mortality. There is some indication that C. rupestris is a regional and seasonal spawner (Bergstad 1990), and potential breeding swarms have been reported at 1500 m (Neat 2017). If fish from different depths gather together to mate, this would be consistent with the lack of differentiation found at neutral loci (Gaither et al. 2018), and the persistence of polymorphisms in a single panmictic population.
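The expectations invoked above for overdominance and underdominance can be made concrete with the textbook single-locus viability selection recursion. The Python sketch below is a generic illustration under assumed, hypothetical fitness values (`s`, `t` and the underdominant fitness of 0.8 are not estimates from this study), and it models neither habitat heterogeneity nor frequency dependence. With heterozygote advantage and fitnesses 1 − s, 1 and 1 − t for AA, Aa and aa, the frequency of A converges to t/(s + t) from either direction; with heterozygote disadvantage the interior equilibrium is unstable and one allele is lost.

```python
def next_freq(p, w_AA, w_Aa, w_aa):
    """One generation of viability selection on the frequency p of allele A."""
    q = 1.0 - p
    mean_w = p * p * w_AA + 2 * p * q * w_Aa + q * q * w_aa
    return (p * p * w_AA + p * q * w_Aa) / mean_w


def iterate(p0, w_AA, w_Aa, w_aa, generations):
    """Apply the recursion repeatedly and return the final frequency of A."""
    p = p0
    for _ in range(generations):
        p = next_freq(p, w_AA, w_Aa, w_aa)
    return p


# Overdominance: hypothetical selection coefficients against each homozygote.
s, t = 0.1, 0.3
expected = t / (s + t)  # predicted stable equilibrium frequency of A
from_low = iterate(0.05, 1 - s, 1.0, 1 - t, 2000)
from_high = iterate(0.95, 1 - s, 1.0, 1 - t, 2000)

# Underdominance: heterozygote least fit (hypothetical fitness 0.8).
under_low = iterate(0.4, 1.0, 0.8, 1.0, 2000)
under_high = iterate(0.6, 1.0, 0.8, 1.0, 2000)

print(f"overdominance: predicted {expected:.3f}, "
      f"reached {from_low:.3f} and {from_high:.3f}")
print(f"underdominance: started at 0.4 -> {under_low:.3f}, "
      f"started at 0.6 -> {under_high:.3f}")
```

Neither outcome, a fixed equilibrium frequency or loss of one allele, resembles frequencies that trend both up and down across age classes while both alleles persist.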

We find that for loci associated with deep-water habitat, MAF changes across age classes, and that this holds even when examining fish collected from a single or restricted habitat depth (Table 1, Fig. 2). The persistence of variation at these loci is likely related to variation in their spatial and temporal frequencies, and to balancing selection retaining both alleles. These loci also showed evidence of long-term balancing selection based on polymorphism data (Table 2, Fig. 4), and none of the control loci showed evidence for this. For a single species to exploit habitat across a range where habitat characteristics (associated with light, pressure, prey resources, etc.) vary extensively, it is likely necessary to maintain polymorphism that allows individuals to be best adapted to one or another of the different depth habitats. This could be associated with life stage (e.g. differential segregation for juveniles and adults), but there is little evidence for that in this case (see Fig. S6). Eventually this may lead to segregation by habitat and assortative mating, but there is no evidence that this has happened yet in this species (Fst values for 44,650 neutral loci comparing samples from different habitat depths were not significantly different from zero; see Gaither et al. 2018). Instead, it seems that the polymorphism is maintained within a reproductive population. Bolnick et al. (2003) review many examples of conspecific individual specialisation and the potential evolutionary mechanisms, including directional, balancing and disruptive selection. Bitarello et al. (2023) provide a timeline from the first mention of balancing selection by Fisher (1923) to modern studies using genomic data to detect signals of balancing selection. Here we identify some of the molecular mechanisms that appear to underlie the polymorphism supporting individual differences associated with divergent habitat in the deep sea.
Of the various potential mechanisms, the data suggest frequency-dependent balancing selection as the most likely, impacting loci with functions relevant to development and morphology.

Data availability

Genotype data for this study can be found at Dryad (https://doi.org/10.5061/dryad.34tmpg4xx).

References

Afonso P, McGinty N, Graça G, Fontes J, Inácio M, Totland A et al. (2014) Vertical migrations of a deep-sea fish and its prey. PLoS ONE 9:e97884
Antonovics J (2006) Evolution in closely adjacent plant populations X: long-term persistence of prereproductive isolation at a mine boundary. Heredity 97:33–37
Baetscher DS, Clemento AJ, Ng TC, Anderson EC, Garza JC (2018) Microhaplotypes provide increased power from short-read DNA sequences for relationship inference. Mol Ecol Resour 18:296–305
Bandara K, Varpe Ø, Wijewardene L, Tverberg V, Eiane K (2021) Two hundred years of zooplankton vertical migration research. Biol Rev 96:1547–1589
Bergstad OA (1990) Distribution, population structure, growth and reproduction of the roundnose grenadier Coryphaenoides rupestris (Pisces: Macrouridae) in the deep waters of the Skagerrak. Mar Biol 107:25–39
Bergstad OA, Gjelsvik G, Schander C, Høines ÅS (2010) Feeding ecology of Coryphaenoides rupestris from the Mid-Atlantic Ridge. PLoS ONE 5:e10453
Bergstad OA, Gordon JDM (1994) Deep-water ichthyoplankton of the Skagerrak with special reference to Coryphaenoides rupestris Gunnerus, 1765 (Pisces: Macrouridae) and Argentina silus (Ascanius, 1775) (Pisces, Argentinidae). Sarsia 79:33–43
Bitarello BD, Brandt DYC, Meyer D, Andrés AM (2023) Inferring balancing selection from genome-scale data. Genome Biol Evol 15: evad032
Bolnick DI, Svanbäck R, Fordyce JA, Yang LH, Davis JM, Hulsey CD et al. (2003) The ecology of individuals: incidence and implications of individual specialization. Am Nat 161:1–28
Brown A, Thatje S (2014) Explaining bathymetric diversity patterns in marine benthic invertebrates and demersal fishes: physiological contributions to adaptation of life at depth. Biol Rev Camb Philos Soc 89:406–426
Campbell NR, Harmon SA, Narum SR (2015) Genotyping-in-Thousands by sequencing (GT-seq): a cost effective SNP genotyping method based on custom amplicon sequencing. Mol Ecol Resour 15:855–867
Charlesworth D (2006) Balancing selection and its effects on sequences in nearby genome regions. PLoS Genet 2:e64
Class B, Dingemanse NJ (2022) A variance partitioning perspective of assortative mating: Proximate mechanisms and evolutionary implications. J Evol Biol 35:483–490
Coffey DM, Royer MA, Meyer CG, Holland KN (2020) Diel patterns in swimming behavior of a vertically migrating deepwater shark, the bluntnose sixgill (Hexanchus griseus). PLoS ONE 15:e0228253
Cohen DM, Inada T, Iwamoto T, Scialabba N (1990) FAO species catalogue. Vol. 10. Gadiform fishes of the world (Order Gadiformes). An annotated and illustrated catalogue of cods, hakes, grenadiers and other gadiform fishes known to date. FAO Fisheries Synopses 125. FAO, Rome
Danecek P, Auton A, Abecasis G, Albers CA, Banks E, DePristo MA, Handsaker RE, Lunter G, Marth GT, Sherry ST, McVean G, Durbin R, 1000 Genomes Project Analysis Group (2011) The variant call format and VCFtools. Bioinformatics 27:2156–2158
Delaval A, Dahle G, Knutsen H, Devine J, Salvanes AGV (2018) Norwegian fjords contain sub-populations of roundnose grenadier Coryphaenoides rupestris, a deep-water fish. Mar Ecol Prog Ser 586:181–192
Des Roches S, Pendleton LH, Shapiro B, Palkovacs EP (2021) Conserving intraspecific variation for nature’s contributions to people. Nat Ecol Evol 5:574–582
Dominey WJ (1980) Female mimicry in male blue-gill sunfish—a genetic polymorphism? Nature 284:546–548
Ehlinger TJ, Sloan Wilson D (1988) Complex foraging polymorphism in bluegill sunfish. Proc Natl Acad Sci 85:1878–1882
Feder JL, Chilcote CA, Bush GL (1988) Genetic differentiation between sympatric host races of the apple maggot fly Rhagoletis pomonella. Nature 336:61–64
Fisher RA (1923) On the dominance ratio. Proc R Soc 42:321–341
Fromm JA, Johnson SAS, Johnson DL (2008) Epidermal growth factor receptor 1 (EGFR1) and its variant EGFRvIII regulate TATA-binding protein expression through distinct pathways. Mol Cell Biol 28:6483–6495
Gaither MR, Gkafas GA, de Jong M, Sarigol F, Neat F, Regnier T et al. (2018) Genomics of habitat choice and adaptive evolution in a deep-sea fish. Nat Ecol Evol 2:680–687
Gaither MR, Violi B, Gray HWI, Neat F, Drazen JC, Grubbs RD et al. (2016) Depth as a driver of evolution in the deep sea: Insights from grenadiers (Gadiformes: Macrouridae) of the genus Coryphaenoides. Mol Phylogenet Evol 104:73–82
Garrison E, Marth G (2012) Haplotype-based variant detection from short-read sequencing. https://doi.org/10.48550/ARXIV.1207.3907
Geisler SB, Robinson D, Hauringa M, Raeker MO, Borisov AB, Westfall MV et al. (2007) Obscurin-Like 1, OBSL1, is a novel cytoskeletal protein related to obscurin. Genomics 89:521–531
Godin OA, Tan TW, Joseph JE, Walters MW (2024) Observation of exceptionally strong near‑bottom flows over the Atlantis II Seamounts in the northwest Atlantic. Sci Reps 14: 10308
Gordon JDM, Bergstad OA (1992) Species composition of demersal fish in the Rockall Trough, north-eastern Atlantic, as determined by different trawls. J Mar Biol Assoc UK 72:213–230
Grace CA, Forrester S, Silva VC, Carvalho KSS, Kilford H, Chew YP, James S, Costa DL, Mottram JC, Costa CCHN, Jeffares DC (2021) Candidates for balancing selection in Leishmania donovani complex parasites. Genome Biol Evol 13: evab265
Hedrick PW (2007) Balancing selection. Current Biol 17:R230–R231
Hoelzel AR, Bruford MW, Fleischer RC (2019) Conservation of adaptive potential and functional diversity. Cons Gen 20:1–5
Høie JS (2017) Comparative feeding ecology of roundnose grenadier (Coryphaenoides rupestris) in Norwegian fjords. Master’s Thesis, Department of Biology, University of Bergen
Jiang D, Jiang Z, Lu D, Wang X, Liang H, Zhang J et al. (2019) Migrasomes provide regional cues for organ morphogenesis during zebrafish gastrulation. Nat Cell Biol 21:966–977
Knaus BJ, Grünwald NJ (2017) vcfR: a package to manipulate and visualize variant call format data in R. Mol Ecol Resour 17:44–53
Li H, Durbin R (2009) Fast and accurate short read alignment with Burrows–Wheeler transform. Bioinformatics 25:1754–1760
Li H, Handsaker B, Wysoker A, Fennell T, Ruan J, Homer N, Marth G, Abecasis G, Durbin R (2009) The sequence alignment/map format and SAMtools. Bioinformatics 25:2078–2079
Liu S, Hansen MM (2016) PSMC (pairwise sequentially Markovian coalescent) analysis of RAD (restriction site associated DNA) sequencing data. Mol Ecol Resour 17(4):631–641. https://doi.org/10.1111/1755-0998.12606
Long E, Zhang J (2023) Evidence for the role of selection for reproductively advantageous alleles in human aging. Sci Adv 9: eadh4990
Magoč T, Salzberg SL (2011) FLASH: fast length adjustment of short reads to improve genome assemblies. Bioinformatics 27:2957–2963
Mindel BL, Neat FC, Webb TJ, Blanchard JL (2018) Size-based indicators show depth-dependent change over time in the deep sea. ICES J Mar Sci 75:113–121
Nadeau NJ, Pardo-Diaz C, Whibley A, Supple MA, Saenko SV, Wallbank RWR et al. (2016) The gene cortex controls mimicry and crypsis in butterflies and moths. Nature 534:106–110
Neat FC (2017) Aggregating behaviour, social interactions and possible spawning in the deep-water fish Coryphaenoides rupestris. J Fish Biol 91:975–980
Parsons KJ, Robinson BW (2007) Foraging performance of diet-induced morphotypes in pumpkinseed sunfish (Lepomis gibbosus) favours resource polymorphism. J Evol Biol 20:673–684
Priede IG, Billett DSM, Brierley AS, Hoelzel AR, Inall M, Miller PI et al. (2013) The ecosystem of the Mid-Atlantic Ridge at the sub-polar front and Charlie–Gibbs Fracture Zone; ECO-MAR project strategy and description of the sampling programme 2007–2010. Deep-Sea Res II 98:220–230
Prout T (1968) Sufficient conditions for multiple niche polymorphism. Am Nat 102:493–496
Renaud G (2018) glactools: a command-line toolset for the management of genotype likelihoods and allele counts. Bioinformatics 34:1398–1400
van Rijssel JC, Moser FN, Frei D, Seehausen O (2018) Prevalence of disruptive selection predicts extent of species differentiation in Lake Victoria cichlids. Proc R Soc B 285:20172630
Rozas J, Ferrer-Mata A, Sánchez-DelBarrio JC, Guirao-Rico S, Librado P, Ramos-Onsins SE, Sánchez-Gracia A (2017) DnaSP 6: DNA sequence polymorphism analysis of large datasets. Mol Biol Evol 34:3299–3302
Siewert KM, Voight BF (2020) BetaScan2: standardized statistics to detect balancing selection utilizing substitution data. Genome Biol Evol 12:3873–3877
Steeds N, Zulqurnain Z, Regnier T, Gibb F, Stirling D, Hoelzel AR (2024) Intraspecific phenotypic differentiation by habitat depth in deep demersal fish species. Front Mar Sci 11: 1437952
Tajima F (1989) Statistical method for testing the neutral mutation hypothesis by DNA polymorphism. Genetics 123:585–595
Thompson NF, Anderson EC, Clemento AJ, Campbell MA, Pearse DE, Hearsey JW et al. (2020) A complex phenotype in salmon controlled by a simple change in migratory timing. Science 370:609–613
VanKuren NW, Massardo D, Nallu S, Kronforst MR (2019) Butterfly mimicry polymorphisms highlight phylogenetic limits of gene reuse in the evolution of diverse adaptations. Mol Biol Evol 36:2842–2853
van’t Hof AE, Campagne P, Rigden DJ, Yung CJ, Lingley J, Quail MA et al. (2016) The industrial melanism mutation in British peppered moths is a transposable element. Nature 534:102–105
Wang S, Teng D, Li X, Peiwen Y, Da W, Zhang Y et al. (2022) The evolution and diversification of oakleaf butterflies. Cell 185:3138–3152
Willis SC, Hess JE, Fryer JK, Whiteaker JM, Brun C, Gerstenberger R et al. (2020) Steelhead (Oncorhynchus mykiss) lineages and sexes show variable patterns of association of adult migration timing and age-at-maturity traits with two genomic regions. Evol Appl 13:2836–2856
Wright SI, Charlesworth B (2004) The HKA test revisited: a maximum likelihood ratio test of the standard neutral model. Genetics 168:1071–1076
Zeng K, Charlesworth B, Hobolth A (2021) Studying models of balancing selection using phase-type theory. Genetics 218: iyab055

Acknowledgements

We thank Marine Directorate colleagues and the crew of MRV Scotia for their help in the collection of fish samples. We thank the technical support staff in Durham for help with access to facilities. We acknowledge the use of the Hamilton HPC core at Durham and thank the associated technical staff.

Author information

Authors and Affiliations

Department of Biosciences, South Road, Durham University, Durham, UK
A. Rus Hoelzel, Georgios A. Gkafas, Natasha Steeds & Harry Peachment
NOAA Southwest Fisheries Science Center and the University of California Santa Cruz, Santa Cruz, CA, USA
John Carlos Garza & Anthony Clemento
Department of Ichthyology and Aquatic Environment, University of Thessaly, Volos, Greece
Georgios A. Gkafas
Department of Biology, University of Central Florida, Orlando, FL, USA
Michelle Gaither
Marine Directorate, Aberdeen, UK
Thomas Regnier & Fiona Gibb

Authors

A. Rus Hoelzel
John Carlos Garza
Anthony Clemento
Georgios A. Gkafas
Natasha Steeds
Michelle Gaither
Harry Peachment
Thomas Regnier
Fiona Gibb

Contributions

ARH conceived and conceptualised the study. MG, TR and FG acquired samples and their metadata, and ARH, NS and MG undertook DNA extraction. NS, TR and FG undertook analyses on morphometrics and age. JCG, AC and GAG generated sequence data and undertook bioinformatics. HP, GAG and ARH undertook further analyses. ARH wrote the paper with review from all other authors.

Corresponding author

Correspondence to A. Rus Hoelzel.

Ethics declarations

Competing interests

The authors declare no competing interests.

Research ethics statement

This study was conducted in accordance with the local legislation and institutional requirements. It was based on opportunistic sampling from deep water surveys. Fish were brought aboard deceased and sampled postmortem.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Associate editor: Sebastián Ramos-Onsins.

Supplementary information

Supplement (DOCX)

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

About this article

Cite this article

Hoelzel, A.R., Garza, J.C., Clemento, A. et al. Balancing selection maintains intraspecific diversity in a deep-sea fish. Heredity 135, 13–22 (2026). https://doi.org/10.1038/s41437-025-00813-6

Received: 15 July 2025
Revised: 06 November 2025
Accepted: 07 November 2025
Published: 27 November 2025
Version of record: 27 November 2025
Issue date: January 2026
DOI: https://doi.org/10.1038/s41437-025-00813-6